Results and Discussion
CellProfiler's main window allows the user to point and click to do most tasks, including the design of a new assay. The software uses the concept of a pipeline, which is a series of modules. Each module performs a specific task on the image or on identified objects (Figure 1A). A typical pipeline consists of loading the images, correcting for uneven illumination, identifying the objects, and then taking measurements on those objects. Modules can easily be added, removed, or rearranged within a pipeline. The resulting measurements can be viewed by (i) using CellProfiler's built-in data viewing and plotting tools; (ii) exporting in a tab-delimited spreadsheet format that can be opened in programs like Microsoft® Excel® and OpenOffice.org Calc; (iii) exporting in a format that can be imported into a database like Oracle® or MySQL® (MySQL, Cupertino, CA, USA); or (iv) opening in MATLAB® (Mathworks, Natick, MA, USA). An analysis can be run on a single image, on a group of images, or on thousands of images by using a computing cluster.
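The pipeline concept described above can be illustrated with a minimal Python sketch. This is not CellProfiler's code or API: the module names, the dictionary-based workspace, the tiny hard-coded image, and the threshold value are all assumptions made for illustration, and the illumination correction and object identification steps are deliberately crude stand-ins for the published algorithms.

```python
from typing import Callable, Dict, List

# Hypothetical design: a "workspace" dict is handed from module to module.
Workspace = Dict[str, object]
Module = Callable[[Workspace], Workspace]


def load_image(ws: Workspace) -> Workspace:
    # Stand-in for reading a file: a tiny grayscale image with a
    # background that brightens from left to right.
    ws["image"] = [
        [10, 10, 12, 12, 14, 14],
        [10, 90, 92, 12, 14, 14],
        [10, 90, 92, 12, 94, 14],
        [10, 10, 12, 12, 14, 14],
    ]
    return ws


def correct_illumination(ws: Workspace) -> Workspace:
    # Crude background estimate (an assumption): the minimum of each column.
    img = ws["image"]
    col_min = [min(col) for col in zip(*img)]
    ws["corrected"] = [[v - m for v, m in zip(row, col_min)] for row in img]
    return ws


def identify_objects(ws: Workspace, threshold: int = 40) -> Workspace:
    # Threshold, then label 4-connected foreground regions by flood fill.
    img = ws.get("corrected", ws["image"])
    rows, cols = len(img), len(img[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if img[r][c] > threshold and labels[r][c] == 0:
                next_label += 1
                labels[r][c] = next_label
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] > threshold
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            stack.append((ny, nx))
    ws["labels"] = labels
    return ws


def measure_objects(ws: Workspace) -> Workspace:
    # One measurement per object: area in pixels.
    areas: Dict[int, int] = {}
    for row in ws["labels"]:
        for label in row:
            if label:
                areas[label] = areas.get(label, 0) + 1
    ws["areas"] = areas
    return ws


def run_pipeline(modules: List[Module]) -> Workspace:
    # Run the modules in order, passing the workspace along.
    ws: Workspace = {}
    for module in modules:
        ws = module(ws)
    return ws


pipeline = [load_image, correct_illumination, identify_objects, measure_objects]
result = run_pipeline(pipeline)
print(result["areas"])
```

Because the pipeline in this sketch is an ordinary list, adding, removing, or reordering modules amounts to editing that list. For example, dropping `correct_illumination` still runs, because `identify_objects` falls back to the uncorrected image.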
CellProfiler bridges the gap between powerful computational methods and their practical application in the biological laboratory. Computer scientists can prototype new computational methods and contribute them to the project, and then biologists can easily use these new additions in their work. Further, the functionality of existing modules can be enhanced by researchers with some programming experience, because the code is open-source, well-documented, and in a language that is relatively easy to understand. While most users will download the completely free, compiled version of CellProfiler, the CellProfiler Developer's version requires the software package MATLAB and its image processing toolbox.
As described in the manual, available at www.cellprofiler.org/linked_files/CellProfilerManual.pdf, CellProfiler already contains advanced object identification algorithms from the literature (4,9,10,11,12,13,14,15,16) and is open to adding new algorithms as described above. In object identification modules, users can rapidly select the best solution for their application using a Test Mode to see the results of various methods. In the following examples, we show the identification of objects by CellProfiler and select measurements for each. Note that the full spectrum of measurements, including many not often measured by biologists (17,18,19), can be recorded for each identified object, including location within the image, size, shape, color intensity, texture (smoothness), correlation between colors, and number of neighbors. Moreover, each broad category contains many different specific measurements. For example, size includes area, perimeter, and major/minor axis length, and shape includes eccentricity (elongation), solidity, form factor, and 32 other shape-related measures.
Yeast Colonies
Counting colonies on agar plates and classifying them by size, color, or shape is tedious, time-consuming, and subjective. While complete systems for automated colony counting exist, they are more expensive and less flexible than using a digital camera or off-the-shelf flatbed scanner to acquire images for analysis by CellProfiler. The cost of this solution can be less than $100. Furthermore, the algorithms in CellProfiler are accurate and adaptable, and unusual features of colonies, which commercial software and even the human eye cannot detect, can be measured (e.g., certain measures of texture and shape). After the initial analysis strategy has been established, plates can be analyzed automatically in large batches.
Here we show an example of yeast colonies (Saccharomyces cerevisiae) that were analyzed by CellProfiler (Figure 1B). In this analysis, the plates are automatically cropped to remove the edges, and individual colonies are identified, even when clumped (Figure 1C). Measurement modules then calculate measurements of interest for each individual colony. Any of the available measurements can then be used to classify the colonies, for example, by size (Figure 1D), by color (Figure 1E), or by a combination of measurements, such as size and color. In this example, the apparent correlation between size-classified (Figure 1D) and color-classified (Figure 1E) yeast colonies is verified by a scatter plot of these two measurements (Figure 1F). Each class of colonies can be analyzed separately to allow the researcher to focus on classes of interest. This makes it possible to address questions such as “Are the red colonies larger than white colonies?” or “Do the larger colonies have more irregularly shaped borders?” In this example, the colonies all display a smooth round phenotype, but the colony shape and texture of yeast strains with unusual morphology could be quantified using these methods.
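The classification step can be sketched in Python once per-colony measurements are available. The sketch below is not CellProfiler's implementation. The colony values are invented for illustration and say nothing about the data in Figure 1, and the field names, the area cutoff of 100 pixels, and the redness cutoff of 0.5 are all assumptions. It classifies each colony by size and by color, compares mean area between the color classes, and computes a Pearson correlation as a numerical counterpart to a scatter plot of the two measurements.

```python
from math import sqrt

# Invented per-colony measurements (assumption): area in pixels and the
# fraction of the colony's intensity that lies in the red channel.
colonies = [
    {"area": 120, "redness": 0.82},
    {"area": 135, "redness": 0.78},
    {"area": 60, "redness": 0.20},
    {"area": 55, "redness": 0.15},
    {"area": 140, "redness": 0.90},
    {"area": 70, "redness": 0.25},
]

AREA_CUTOFF = 100      # hypothetical boundary between "small" and "large"
REDNESS_CUTOFF = 0.5   # hypothetical boundary between "white" and "red"

# Classify every colony by size and by color.
for colony in colonies:
    colony["size_class"] = "large" if colony["area"] >= AREA_CUTOFF else "small"
    colony["color_class"] = "red" if colony["redness"] >= REDNESS_CUTOFF else "white"


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# "Are the red colonies larger than white colonies?"
mean_area_by_color = {
    color: mean([c["area"] for c in colonies if c["color_class"] == color])
    for color in ("red", "white")
}

# Numerical check of the size/color relationship.
r = pearson([c["area"] for c in colonies], [c["redness"] for c in colonies])

print(mean_area_by_color)
print(f"Pearson r between area and redness: {r:.2f}")
```

Analyzing each class separately, as described above, then amounts to filtering the list on `size_class` or `color_class` before computing further statistics.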