With its improved ability to capture sets of pre-specified genes from highly divergent taxa, Naylor envisions that his method will be especially useful for comparative biochemistry and physiology studies. “Imagine you've got 55 genes in a pathway that's associated with cancer, and you can make baits for all of those 55 elements associated with a particular disease condition or a particular morphogenetic pathway… and you've got a candidate gene pathway, say from the human or zebrafish. We can make probes for all of the elements in that candidate gene pathway and interrogate for that set across taxa.”
Challenges for the Future
For most researchers, the major advantage of these three new targeted enrichment methods over de novo whole genome sequencing (WGS) boils down to cost. While the expense of WGS has dropped considerably in recent years, it is still far less expensive to sequence large numbers of samples with targeted enrichment, especially when it comes to storing and analyzing the much larger, and more diverse, volumes of data generated with WGS.
“There are so many gene families, so many duplications, so many elements of unknown function,” notes Naylor. In essence, using targeted enrichment upfront reduces much of the bioinformatic filtering of sequences that needs to be carried out downstream.
Still, data analysis remains a worry, as the amount of sequence information generated with targeted resequencing is nearly beyond what present bioinformatics methods can process efficiently. “Collecting data is not the issue; analyzing the data is the biggest problem that we have. …I would imagine in the next two to three years we really see a number of interesting and provocative and hopefully very helpful analytical methods that come on the scene and allow us to analyze the data we have collected,” says Faircloth.
Another interesting challenge for the future, according to both Faircloth and Naylor, is how to handle paralogs during sequence analysis. At present, sequences that appear to be paralogs are explicitly eliminated. “We're probably throwing out information that could be massively useful to us, but we're throwing it out because there's really no good mechanism to deal with it,” explains Faircloth.
In the end, though, these new methods are generating fresh excitement among phylogeneticists around the globe. “Everyone I think has the sense, even those who aren't savvy to bioinformatics or even next-gen sequencing, there's a change happening. Our goal is really just to facilitate science and to help people produce the best possible datasets,” says Lemmon.
For Faircloth, targeted resequencing has created a unique and exciting opportunity when it comes to understanding organisms and their evolutionary relationships. “What we can do now is amazing. It really allows us to work with all of these species that for the longest time have constrained our ability to understand the deepest relationships, or the shallowest relationships, or how different taxa spread across a phylogenetic tree differ in population genetic parameters. We can now work with all of these taxa and we're really not limited in terms of data collection and that is a fantastically powerful development.”