Of course, not all proteins are as stable and abundant as collagen, and not all samples are preserved in conditions that reduce potential environmental damage. As such, even though researchers have been studying ancient proteins for more than 20 years, successes have been limited to analyzing only a small number of abundant proteins. But this could be changing with the rise of new instruments and protein preparation methods.
In a paper published last year in the Journal of Proteome Research (2), Cappellini and colleagues analyzed 126 unique proteins from a 43,000-year-old woolly mammoth. “We showed it was possible to go from a very few common proteins, something boring like collagen, to something more interesting like albumin,” explains Cappellini. Albumin is a protein that helps transport other biomolecules through the bloodstream of a mammal.
A typical proteomics workflow consists of three separate steps: sample preparation, mass spectrometry, and data analysis. First, protein samples are purified and separated into smaller fractions using chromatography methods that include liquid and gas-based separation. This step can in many ways be thought of as a “pre-screen”: it separates the components of the complex mixture to improve mass analysis, and it provides some initial information on protein composition before more directed analysis. Next, the protein fractions are analyzed by mass spectrometry. Here, the proteins and peptides are first ionized, and the resulting ions are then introduced into an instrument that records their mass-to-charge ratios. Finally, the mass spectra obtained from the mass spectrometer are compared with a database of known peptide masses to identify the protein components within the sample.
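The final database-matching step can be illustrated with a minimal Python sketch. It is a deliberately simplified version of peptide mass matching, not the software used in the studies discussed here: it digests each database protein in silico with trypsin-like rules, computes monoisotopic peptide masses, and reports observed masses that fall within a tolerance. The function names, the toy sequence, and the 10 ppm tolerance are assumptions chosen for illustration; real search engines also score fragment-ion spectra and account for chemical modifications.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per peptide for the free termini


def tryptic_peptides(protein):
    """Split a sequence after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        if is_last or (aa in "KR" and protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_masses(observed, database, tol_ppm=10.0):
    """Return (observed mass, protein name, peptide) for masses within tolerance."""
    hits = []
    for mass in observed:
        for name, sequence in database.items():
            for pep in tryptic_peptides(sequence):
                theoretical = peptide_mass(pep)
                if abs(mass - theoretical) / theoretical * 1e6 <= tol_ppm:
                    hits.append((mass, name, pep))
    return hits


# Hypothetical toy database and observed masses, for illustration only.
toy_database = {"toy_protein": "GAKPSRVTK"}
print(match_masses([346.2216, 500.0], toy_database))
```

Because the match depends entirely on which sequences are in the database, a peptide that differs from every database entry by even one residue will go unidentified in this scheme.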
One key to Cappellini's success in analyzing such a wide range of ancient proteins from the mammoth was to limit loss during sample preparation. In modern protein studies, more cells can be grown or experiments repeated, but ancient samples are always limited. Cappellini's team therefore had to modify several steps in the traditional proteomics workflow, such as eliminating precipitation steps during sample preparation to reduce the loss of cross-linked proteins.
“This is a very exciting time to get into ancient protein [analysis] because the technological developments that are coming out on a yearly basis are opening up a lot of possibilities,” says Cappellini.
Techniques age quickly
Altering traditional workflows is not the only concern for researchers working with ancient samples; contamination is a constant risk. In ancient protein analysis, keratin, a structural protein that makes up human skin, hair, and nails, is a common culprit.
To combat keratin contamination, most proteomics labs follow strict protocols. For example, Corthals’ lab fractionates peptides using liquid chromatography rather than gel electrophoresis. In gel electrophoresis, proteins must be removed from a lysis solution, resolved on an electrophoresis gel, and then returned to a solution prior to MS analysis. By contrast, liquid chromatography-based fractionation keeps proteins in a single solution, reducing the possibility of contamination.
Downstream, the data analysis software used to identify peptides and proteins from MS data has greatly improved over the years. With the expansion of spectral databases as more proteomics studies are completed, as well as new algorithms developed by mathematical biologists to extend mass analysis, peptide identification programs are more accurate and make fewer false-positive identifications. “That's not trivial,” says Corthals. “Those people are the unsung heroes because no one wants to hear about the programming.”
But the expansion of databases with new spectra and the growth of MS analysis tools does raise a quandary for those working with ancient proteins: which spectral databases do you search when examining ancient samples? For the woolly mammoth study, Cappellini and colleagues searched against protein data from its modern relative, the elephant. A search against a high-quality mammoth genome could potentially also have identified new isoforms or similar enzymes with different functions, but that genome was published in 2008 (one of the first examples of sequencing ancient DNA) and was only sequenced to 0.7-fold coverage (3).
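To give a rough sense of what 0.7-fold coverage means, the short Python sketch below applies the standard idealized estimate that, if sequencing reads land uniformly at random, the expected fraction of a genome covered at least once is 1 - e^(-c) for c-fold coverage. The uniform-placement assumption and the function name are simplifications introduced here for illustration; they are not part of the mammoth study, and real ancient DNA data are less even than this model assumes.

```python
import math


def expected_fraction_covered(fold_coverage):
    """Idealized expected fraction of bases covered by at least one read.

    Assumes reads are placed uniformly at random (a Poisson approximation).
    """
    return 1.0 - math.exp(-fold_coverage)


print(round(expected_fraction_covered(0.7), 3))  # prints 0.503
```

Under these assumptions, roughly half of the genome would have no read at all, which helps explain why such a draft is a poor reference for predicting complete protein sequences.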