2Center for Applied Proteomics and Molecular Medicine, George Mason University, Manassas, VA, USA
Reverse phase protein microarrays (RPMA) are widely used to measure a large number of protein analytes from a small clinical sample. Data normalization for this technology is problematic for complex samples contaminated with blood and non-cellular proteins. To address this need, we adapted gene microarray algorithms to RPMA processing and analysis, tailored to the study set, to compare seven normalization analytes across sample sets including cell lines, tissues subjected to laser capture microdissection, and blood-contaminated tissues. Specific normalization analytes were found to be advantageous for particular classes of sample sets; ssDNA was found to be optimal for samples contaminated with blood.
Reverse phase protein microarrays (RPMA) are designed for quantitative, multiplexed analysis of proteins, and their post-translationally modified forms, from a limited amount of sample. To correct for sample-to-sample variability due to the number of cells in each lysate and the presence of extracellular proteins or red blood cells, a normalization method is required that is independent of these potentially confounding parameters. We adopted a gene microarray algorithm for use with RPMA to optimize the proteomic data normalization process and developed a systematic approach to RPMA processing and analysis, tailored to the study set. Our approach capitalizes on the gene microarray algorithms geNorm and NormFinder to identify the normalization parameter with the lowest variability across a proteomic sample set. Seven analytes (ssDNA, glyceraldehyde 3-phosphate dehydrogenase, α/β-tubulin, mitochondrial ribosomal protein L11, ribosomal protein L13a, β-actin, and total protein) were compared across sample sets including cell lines, tissues subjected to laser capture microdissection, and blood-contaminated tissues. We examined normalization parameters to correct for red blood cell content. We show that single-stranded DNA (ssDNA) is proportional to total non-red blood cell content and is a suitable RPMA normalization parameter. Simple modifications to RPMA processing allow flexibility in using ssDNA- or protein-based normalization molecules.
Reverse phase protein microarray (RPMA) is a quantitative, multiplexed array of heterogeneous mixtures of cellular proteins derived from cells, serum, or body fluids (1-3). Levels of proteins or post-translationally modified proteins are detected by probing the array with specific, validated antibodies directed against target proteins (4). Similar to oligonucleotide microarrays, RPMA data analysis begins with image analysis and spot finding using software that generates raw pixel intensity values for each array spot. The pixel intensity is directly proportional to the amount of protein per spot. RPMA technology provides quantitative information regarding the state of cellular signaling cascades derived from cellular samples or from proteins shed into body fluids (3).
Ideally, protein analyte levels should reflect the tissue's physiologic state at the time of procurement. Tissue samples are highly heterogeneous with regard to cellular and extracellular elements, biological state, disease state, originating organ, and level of contamination by blood. Because of this heterogeneity and the unknown contribution of cellular and extracellular components, including blood, data normalization methods should be used (5, 6). RPMA data normalization corrects spot intensity values by means of a reference factor (e.g., total protein): non-specific staining is first subtracted, and the spot values are then divided by the reference factor, allowing data within and between RPMA data sets to be compared directly.
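The two-step correction described above (background subtraction followed by division by a reference factor) can be sketched in a few lines of Python. This is a minimal illustration only: the function name `normalize_spots`, the argument layout, and the choice to clip negative background-subtracted values to zero are assumptions made for the example, not details taken from the RPMA workflow itself.

```python
def normalize_spots(raw, background, reference):
    """Return background-subtracted spot intensities divided by a reference factor.

    raw        -- raw pixel intensity per spot (analyte antibody stain)
    background -- non-specific staining per spot (e.g., a negative-control array)
    reference  -- reference factor per spot (e.g., total protein or ssDNA signal)
    All three are equal-length sequences of numbers (hypothetical layout).
    """
    if not (len(raw) == len(background) == len(reference)):
        raise ValueError("raw, background, and reference must have equal length")

    normalized = []
    for r, b, ref in zip(raw, background, reference):
        if ref <= 0:
            raise ValueError("reference factor must be positive for every spot")
        # Assumption: a spot cannot carry negative specific signal, so clip at zero.
        specific = max(r - b, 0.0)
        normalized.append(specific / ref)
    return normalized


# Two spots with different raw intensities but the same signal per unit reference.
print(normalize_spots([1200.0, 800.0], [200.0, 200.0], [500.0, 300.0]))
```

Dividing by a per-spot reference places samples that contain different amounts of material on a common scale, which is what allows comparison within and between arrays.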
Quality control metrics and normalization algorithms have been extensively evaluated for gene microarrays (minimum information about a microarray experiment; MIAME) (7-10), but RPMA analysis is not a direct recapitulation of gene microarray analysis. RPMAs differ from gene microarrays because RPMAs are printed with heterogeneous protein mixtures, and can have an unknown contaminating serum or extracellular protein component. Furthermore, RPMAs are routinely detected with a single-wavelength fluorescent or chromogenic marker rather than dual fluorescent markers (2, 3).
Reverse phase protein microarrays present a unique, complex normalization issue. Tissue samples of equal volume can contain a different number of target cells mixed with different levels of blood, extracellular matrix, or extracellular fluid. Thus, samples with equal total protein can contain vastly different numbers of target cells (Supplementary Figure S1). Therefore, when RPMA is applied to complex tissue samples procured by laser capture microdissection (LCM; 11) or from whole tissue specimens, or to blood or bone marrow aspirates, neither the total protein in each sample nor the volume of each sample may be an accurate reference factor for normalization, because a variable amount of non-cell-derived protein is present in each sample.
Based on this need, we have developed additional reference factors for normalization in an attempt to normalize each sample more accurately to its cellular content as a common scale. We have chosen several protein reference analytes that are derived from different cellular compartments. In particular, we have developed and verified single-stranded (denatured) DNA content as a normalization factor that correlates with total cell number. We have also adapted an algorithm used for gene microarrays to RPMA analytes in order to determine the normalization analyte that gives the greatest reduction in RPMA sample-to-sample variability.
Erythrocytes (red blood cells, RBCs) are devoid of a nucleus; therefore, cellular DNA content, which is proportional to the total nucleated cell content of the sample, is a potential new normalization molecule for blood-contaminated samples. Our multi-analyte RPMA normalization process capitalizes on quality metrics proposed for gene microarrays while providing a means to determine the optimal normalization analytes for each RPMA study set.
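As an illustration of how a candidate normalization analyte can be ranked by variability, the sketch below computes the pairwise-variation stability measure M in the form published for geNorm with reference genes: for each candidate, the standard deviation of the log2 ratio against every other candidate is taken across samples, and M is the mean of those standard deviations (lower M means more stable). This is an assumption-laden sketch and not necessarily the exact adaptation applied to RPMA data here. The function name, the input layout, and the analyte labels are hypothetical, and the model-based NormFinder approach is not shown.

```python
import math
import statistics


def genorm_stability(signals):
    """Compute a geNorm-style stability measure M for each candidate analyte.

    signals -- dict mapping analyte name to a list of positive intensities,
               one value per sample, in the same sample order for every analyte
               (hypothetical layout).
    Returns a dict mapping analyte name to M; lower M indicates lower variability.
    """
    names = list(signals)
    if len(names) < 2:
        raise ValueError("at least two candidate analytes are required")

    m_values = {}
    for j in names:
        pairwise_sd = []
        for k in names:
            if k == j:
                continue
            # Log2 ratio of candidate j to candidate k in each sample.
            log_ratios = [math.log2(a / b) for a, b in zip(signals[j], signals[k])]
            # Pairwise variation: sample standard deviation across samples.
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[j] = statistics.mean(pairwise_sd)
    return m_values


# Synthetic intensities for three hypothetical analytes across three samples.
example = {
    "analyte_A": [1.0, 2.0, 4.0],
    "analyte_B": [2.0, 4.0, 8.0],  # tracks analyte_A exactly
    "analyte_C": [1.0, 4.0, 2.0],  # varies independently
}
for name, m in sorted(genorm_stability(example).items(), key=lambda item: item[1]):
    print(name, round(m, 3))
```

In this synthetic example the two analytes that vary together receive the lowest M, and the independently varying analyte receives the highest, so it would be the first candidate excluded.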
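The reasoning behind DNA-based normalization can be made concrete with a toy model. In the sketch below, every number is synthetic and every name is hypothetical: each nucleated cell is assumed to contribute fixed amounts of analyte, protein, and DNA, while each RBC contributes protein but no DNA. The model deliberately ignores nucleated blood cells such as leukocytes, as well as extracellular protein, so it illustrates the argument only and does not simulate real tissue.

```python
def simulate_sample(nucleated_cells, red_blood_cells,
                    analyte_per_cell=1.0, protein_per_cell=10.0,
                    protein_per_rbc=3.0, dna_per_cell=2.0):
    """Return an analyte signal normalized two ways for a toy sample.

    All per-cell quantities are arbitrary illustrative units (assumptions).
    """
    analyte = nucleated_cells * analyte_per_cell
    # RBCs add protein to the lysate...
    total_protein = (nucleated_cells * protein_per_cell
                     + red_blood_cells * protein_per_rbc)
    # ...but no DNA, because they lack a nucleus.
    ssdna = nucleated_cells * dna_per_cell
    return {
        "by_total_protein": analyte / total_protein,
        "by_ssdna": analyte / ssdna,
    }


# Same number of nucleated cells, with and without blood contamination.
clean = simulate_sample(nucleated_cells=1000, red_blood_cells=0)
bloody = simulate_sample(nucleated_cells=1000, red_blood_cells=5000)
print("total protein:", clean["by_total_protein"], bloody["by_total_protein"])
print("ssDNA:        ", clean["by_ssdna"], bloody["by_ssdna"])
```

Under these assumptions, the value normalized to total protein falls as RBC content rises even though the per-cell analyte level is unchanged, whereas the value normalized to ssDNA stays constant.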
We demonstrated the utility of these normalization methods in several tissue types, including blood-contaminated metastatic tumor samples, by evaluating the following reference molecules: (i) total protein, (ii) β-actin, (iii) single-stranded DNA (ssDNA), (iv) glyceraldehyde 3-phosphate dehydrogenase (GAPDH), (v) α/β-tubulin (microtubule subunits), (vi) mitochondrial ribosomal protein L11 (MRPL11), and (vii) ribosomal protein L13a (RPL13a) (Supplementary Table 1). Additionally, we created RPMA Analysis Suite (RAS), a dedicated macro tool (VBA Excel macro) for RPMA data reduction that we designed to maintain data reduction steps while permitting flexibility in array design and normalization options.