^{1,2}, Lynda Gentchev^{3}, Amitabha S. Basu^{4}, Rafael E. Jimenez^{3}, Kamel Boussaid^{4}, and Abhi S. Gholap^{4}

The same collection of paraffin blocks, containing 100 cases each of breast, lung, and colon cancer, was used as the primary source material for the manual TMA, the algorithmic TMA, and the robotic TMA. Because these paraffin blocks contained fairly large tissue specimens, averaging ≥4 cm^{2}, there were at least 400 potential coring sites in each block, so both the manual TMA and the algorithmic TMA could easily be obtained from the same block. We first carried out the manual TMA and then the algorithmic TMA from the first paraffin block. We obtained the robotic TMA from the second paraffin block, since the algorithmic TMA had already directed the areas to be cored from the first block.

Virtual slides of all three categories of TMA were independently assessed for epithelial cell (cancer cell) percentages, both algorithmically and manually by two raters (both pathologists). Estimates of epithelial cell (cancer cell) percentages of selected FOVs on the whole slides, as well as of the TMA cores produced within all three categories, were also independently determined by these two raters and compared with the algorithmic measurements. The strength of these correlations was also determined.

**Statistical methods**

Intraclass correlations (ICCs) were calculated to assess the level of agreement among measurements of epithelial percentages. To calculate the ICC, intercept-only random effects models were fit to the epithelial percentage with random effect variance components for tissue sample, rater, and error variance. The ICC was calculated as the variance component for tissue sample divided by the sum of the three variance components. Confidence intervals were calculated using the delta method.
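The ICC ratio described above can be illustrated with a minimal sketch. It is not the authors' implementation: instead of fitting a random effects model, it assumes a balanced, fully crossed design (every rater scores every tissue sample exactly once) and estimates the three variance components with the classical ANOVA method of moments. The function name `icc_from_ratings` and the input layout are hypothetical, and the delta-method confidence intervals are not shown.

```python
def icc_from_ratings(ratings):
    """Agreement ICC from a balanced samples-by-raters table.

    ratings[i][j] is the epithelial percentage of tissue sample i as
    scored by rater j (assumed layout; needs >= 2 samples and >= 2 raters).
    """
    n = len(ratings)        # number of tissue samples
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    # Mean squares of the two-way layout without replication.
    ms_sample = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_rater = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_error = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Method-of-moments variance components; negative estimates are set to 0.
    var_sample = max((ms_sample - ms_error) / k, 0.0)
    var_rater = max((ms_rater - ms_error) / n, 0.0)
    var_error = ms_error

    total = var_sample + var_rater + var_error
    # ICC = tissue-sample variance / sum of the three variance components.
    return var_sample / total if total > 0 else float("nan")


# Illustrative values only: three samples scored by two raters.
example = [[10.0, 20.0], [50.0, 60.0], [90.0, 100.0]]
print(round(icc_from_ratings(example), 4))
```

In this made-up example the second rater scores every sample 10 points higher than the first, so all disagreement is attributed to the rater component and the ICC falls slightly below 1.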

For tests of the statistical significance of comparisons of epithelial cell percentages in different samples, Student's *t*-test was applied.

**Software availability and accessibility**

We provide abbreviated code for the algorithms and a web link to a demonstration of them.

**Detailed code of the algorithms.** Since the full source code of the algorithms is too lengthy to provide in print, an abbreviated synoptic code that gives the essence of the source code is provided (Supplementary Materials; “Abbreviated synoptic code” section).

**Weblink to the algorithms.** A demonstration of the ERAs, including a way to upload one's own photomicrographs or scanned images for analysis with the ERAs, is available at the web site http://www.pathxchange.org. Details on accessing this site are provided (Supplementary Materials; “Demonstration of web site link” section).

**Results**

Our FOV algorithms could invariably divide a whole virtual slide into a grid of FOVs. The number of FOVs needed to analyze the whole slide ranged from 200 to 600 (average 400), depending on the size of the tissue specimen. Scanning each slide into a virtual slide took ~15 min, and gridding each slide into FOVs and analyzing each FOV algorithmically took an additional 15 min (Figure 1A; Supplementary Table S1).
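The gridding step can be pictured with a small sketch. It only illustrates tiling a rectangle into square FOVs and is not the published algorithm: the function name `grid_fovs`, the pixel-based coordinates, and the FOV edge length are all assumptions made for the example.

```python
def grid_fovs(width, height, fov_size):
    """Tile a width x height image into FOVs of at most fov_size x fov_size.

    Returns a list of (x, y, w, h) tuples in row-major order. Tiles on the
    right and bottom edges are clipped to the image bounds. All units are
    pixels (an assumption of this sketch).
    """
    if fov_size <= 0:
        raise ValueError("fov_size must be positive")
    tiles = []
    for y in range(0, height, fov_size):
        for x in range(0, width, fov_size):
            w = min(fov_size, width - x)
            h = min(fov_size, height - y)
            tiles.append((x, y, w, h))
    return tiles


# Hypothetical dimensions: a 20,000 x 20,000 pixel slide with 1,000-pixel FOVs.
fovs = grid_fovs(20_000, 20_000, 1_000)
print(len(fovs))  # 400
```

With these made-up dimensions the grid has 20 columns and 20 rows, which happens to match the average of 400 FOVs per slide reported above.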

Image acquisition by either the Aperio ScanScope T2 System or the iSCAN System produced equivalent results, with uniformly sharp, high-contrast images. For approximately 10% of the images, mask removal and contrast enhancement improved image quality (Supplementary Table S3). The ERAs (Supplementary Tables S2, S4, S5, S6, and S7) applied to each FOV were successful in recognizing epithelium, filtering out stroma, and determining the epithelial percentage, and therefore the cancer cell density, of that FOV.

For ~5% of the acquired images, the quality was below the standard necessary for the algorithms to interpret them. These images had to be discarded. Another 5% of the FOVs contained only a very small amount of tissue compared with the size of the given square (<10% tissue); these squares also had to be discarded because they falsely elevated the epithelial percentage (Supplementary Table S7).

After analyzing all the FOVs on a given slide, the algorithms ranked the FOVs on the basis of epithelial percentage and selected the 3–5 FOVs with the highest epithelial percentage (Figure 1B; Figure 2, row 1). FOVs with comparatively low epithelial percentage were identified but not selected (Figure 2, row 2). The algorithm-based determinations of locations of those FOVs exhibiting the highest epithelial percentages were the same every time the algorithm was run and therefore showed no inter-observer, intra-observer, or fatigue variability.
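The discard-then-rank logic described above can be sketched as follows. This is a hedged illustration rather than the authors' code: the `FOV` record, its field names, and the function `select_top_fovs` are hypothetical, the 10% tissue threshold and the choice of up to 5 FOVs are taken from the text, and the image-quality check is omitted. Ties are broken by grid index, which is one simple way to make the selection identical on every run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FOV:
    index: int              # position of the FOV in the grid (hypothetical field)
    tissue_fraction: float  # fraction of the square covered by tissue, 0..1
    epithelial_pct: float   # epithelial percentage reported for the FOV


def select_top_fovs(fovs, top_n=5, min_tissue_fraction=0.10):
    """Drop FOVs with too little tissue, then return the top_n by epithelial %."""
    usable = [f for f in fovs if f.tissue_fraction >= min_tissue_fraction]
    # Sort by descending epithelial percentage; break ties by grid index so
    # the result is deterministic.
    ranked = sorted(usable, key=lambda f: (-f.epithelial_pct, f.index))
    return ranked[:top_n]


# Made-up values for illustration only.
fovs = [
    FOV(0, 0.95, 62.0),
    FOV(1, 0.05, 99.0),  # almost empty square: discarded despite a high percentage
    FOV(2, 0.80, 88.0),
    FOV(3, 0.90, 88.0),
    FOV(4, 0.70, 15.0),
]
print([f.index for f in select_top_fovs(fovs, top_n=3)])  # [2, 3, 0]
```

Filtering before ranking matters here: the nearly empty square at index 1 would otherwise top the ranking because of its falsely elevated epithelial percentage.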