The Earth BioGenome Project aims to sequence all eukaryotic species

Written by Janelle Weaver

An unfunded project with an ambitious goal promises to shed light on the evolutionary history of higher organisms and enhance conservation efforts. But is the $4.8 billion project well-conceived?

Since the Human Genome Project was completed in 2003, large-scale genome sequencing efforts have proliferated. For example, Genome 10K was launched in 2009 to sequence the genomes of at least one individual from each vertebrate genus, approximately 10,000 genomes. Two years later, the i5K was unveiled as an initiative to sequence the genomes of 5,000 arthropod species. The B10K Project, announced in 2015, plans to generate representative draft genome sequences from all extant bird species within five years. The list goes on and on.

But a recently announced project has a more ambitious goal: to sequence all eukaryotic species on Earth. On February 23rd, the Earth BioGenome Project was officially announced at BioGenomics2017, the Global Biodiversity Genomics Conference held at the Smithsonian National Museum of Natural History in Washington, D.C. As reported in Science the next day, the first step of the project would be to sequence in great detail the DNA of a member of each eukaryotic family (about 9,000 in all) to create reference genomes on par with or better than the reference human genome. Next would come sequencing, to a lesser degree, of a species from each of the 150,000 to 200,000 genera. Finally, the participants would obtain rough genomes of the 1.5 million remaining known eukaryotic species.
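To give a sense of the scale implied by the three stages, the counts quoted above can be tallied in a few lines of Python. This is only a back-of-the-envelope sketch: the phase labels and the `total_genomes` helper are illustrative names, and it assumes the three counts do not overlap, which the project description does not spell out.

```python
# Approximate counts per stage, as quoted in the article: (label, low, high).
# Assumption: the three stages cover disjoint sets of species.
PHASES = [
    ("family-level reference genomes", 9_000, 9_000),
    ("genus-level genomes", 150_000, 200_000),
    ("rough genomes of remaining species", 1_500_000, 1_500_000),
]


def total_genomes(phases):
    """Return the (low, high) range of genomes summed across all stages."""
    low = sum(lo for _, lo, _ in phases)
    high = sum(hi for _, _, hi in phases)
    return low, high


low, high = total_genomes(PHASES)
print(f"Roughly {low:,} to {high:,} genomes in total")
```

Under these assumptions the final stage accounts for the overwhelming majority of the genomes, even though the first stage demands the highest quality per genome.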

“I think the project is a good idea and will make an important contribution,” said Luke Thompson, a research associate at the National Oceanic and Atmospheric Administration and manager of the Earth Microbiome Project, which was founded in 2010 as a massively collaborative effort to characterize microbial life on this planet. “Numerous insights on the history and evolution of life on Earth are sure to follow from this work.”

“A Good Idea”

According to Thompson, the first stage of the project in particular would be very valuable. “This would provide immediate insight to the evolutionary history of higher organisms, improve taxonomic classification, and provide genomic templates for sequencing individual genera and species,” he said. “Speaking from a microbial perspective, which is my area of expertise, such an effort will provide a foundation for studies of co-evolution and symbiosis between microorganisms and higher organisms, including insights into the endosymbiosis events which enabled the fantastic radiation of eukaryotic diversity.”

The 10-year project, which is currently unfunded, would cost an estimated $4 to $5 billion to complete. As such, some scientists have argued that it would not be a wise use of money and might take away funds from research endeavors focused on other important goals such as improving human health. “The number one challenge is getting buy-in from the scientific community,” said John Kress, a research botanist and curator at the Smithsonian National Museum of Natural History and co-organizer of the Earth BioGenome Project. “This effort will enhance what they do as biologists and conservationists and technologists and will not take funding away from their major projects, but actually add funding to what they want to do.”

In addition, Kress will have to work with project co-organizers Harris Lewin, an expert in mammalian comparative and functional genomics at the University of California, Davis, and Gene Robinson, an evolutionary biologist at the University of Illinois at Urbana-Champaign, to tackle the second biggest challenge: acquiring the funding to get the project off the ground. “Along with that is convincing other funding agencies and research agencies that this thing has legs and this will help float a lot of boats,” Kress said.

If the project is funded, the co-organizers will have to overcome many more hurdles. Although they would leave the bulk of the analysis to other scientists using the open-access data, the trio plans to collaborate with other genome sequencing projects to develop shared analytic tools and standards to ensure high-quality genome sequences. “We can only be successful if it is a community effort,” Kress said.

Fraught with Difficulty

Even with the help of others, organizing a project of this scope would be very challenging. Managing the metadata in particular would be critical, Thompson said. “For each species, we will need to have photographs or micrographs, common and scientific names, body measurements, location and date of collection, and as many other parameters as possible,” he said. “These metadata are critical to interpreting the genome sequences.”
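As a rough illustration of what such a per-species metadata record might look like, here is a minimal Python sketch. The `SpecimenRecord` class, its field names, and the `missing_fields` check are all hypothetical, chosen only to mirror the items Thompson lists; they are not part of any schema published by the project.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import Optional


@dataclass
class SpecimenRecord:
    """Hypothetical metadata record for one sequenced specimen."""
    scientific_name: str
    common_name: Optional[str] = None
    collection_date: Optional[date] = None
    latitude: Optional[float] = None   # collection site, decimal degrees
    longitude: Optional[float] = None
    body_measurements_mm: dict = field(default_factory=dict)
    image_files: list = field(default_factory=list)  # photographs or micrographs

    def missing_fields(self):
        """Return the names of fields that are still unset or empty."""
        def is_empty(value):
            return value is None or value == {} or value == []
        return [f.name for f in fields(self) if is_empty(getattr(self, f.name))]


record = SpecimenRecord(
    scientific_name="Apis mellifera",
    common_name="western honey bee",
)
print(record.missing_fields())
```

A completeness check of this kind is one simple way to flag records whose genome sequences would be hard to interpret later.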

Moreover, obtaining the samples and extracting high-quality DNA would be very hard, said Rob Knight, a biologist at the University of California, San Diego, and co-founder of the Earth Microbiome Project. He noted a number of other challenges, including sample tracking and quality control on such a large scale, building a structured phenotype database that allows interpretation of the data to answer scientific and conservation questions, the cost of the computation required to do all the assemblies, and the cost of building sequencing libraries as well as storing and transmitting the data.

“It’s a good idea but faces many hard logistical challenges and will cost much more than anticipated,” Knight said. Although sequencing costs have already declined dramatically, he recommended that the full-scale project wait until long-read, haplotype-resolved sequencing can be achieved at a reasonable price.

“A much more plausible project would produce fewer, better genomes that cover the tree of life evenly, then figure out which species or populations to target next. A resource that collected fewer genomes but invested far more in chemical and phenotypic characterization of the resulting organisms would have a greater impact,” he said. “And do we really need every beetle species more than we need entire populations of threatened species that are more evolutionarily unique, or dense coverage of disease vectors and their wild relatives?”

According to Knight, there’s no doubt the proposed resource would be useful, but he questions whether it’s the best allocation of scarce research resources. “However, sequencing the whole human genome provided many benefits and scientific insights relative to, for example, just sequencing the genes known to be medically important at the time—an alternative and much cheaper approach that was widely advocated,” he said. “And the same may be true of sequencing the whole species catalog.”