Each human is profoundly unique, yet, as the Human Genome Project has shown, we typically share the same 20,000-25,000 genes. To better understand where our individuality comes from, scientists have built detailed genetic maps of 1,092 individuals from 14 different populations worldwide, uncovering rare nucleotide changes and unique genetic signatures among the populations. These maps will help explain our uniqueness, and they will also help us understand patterns associated with the development of disease. The results of the first phase of the 1000 Genomes Project were published this week in Nature (1), and the nucleotide database is freely available online.
The project follows the completion of the HapMap, which cataloged millions of the most common single nucleotide polymorphisms (SNPs) that vary between people. “If the Human Genome Project provides a coarse map and the HapMap shows the highways, then the 1000 Genomes Project looks at the small alleys and side streets, all the details,” said Fuli Yu, assistant professor at the Human Genome Sequencing Center at Baylor College of Medicine, and co-author on the study.
The team combined two approaches: low-coverage whole-genome sequencing, which gives a rough estimate of the sequence of the entire genome, and deeper exome sequencing of just the regions that code for proteins. The researchers then applied algorithms to identify candidate variants and assess their quality, followed by statistical methods to infer the final genotypes. The final product is a validated haplotype map of 38 million SNPs, 1.4 million short insertions and deletions, and more than 14,000 larger deletions.
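To give a feel for why statistical inference is needed when coverage is low, here is a minimal sketch of genotype inference at a single site. It is an illustration under simplifying assumptions, not the consortium's actual pipeline: the function name `genotype_posteriors`, the 1% per-read error rate, and the Hardy-Weinberg prior built from an assumed population allele frequency are all hypothetical choices made for this example.

```python
from math import comb


def genotype_posteriors(ref_reads, alt_reads, alt_freq, error_rate=0.01):
    """Posterior probabilities of genotypes 0/0, 0/1 and 1/1 at one biallelic site.

    Hypothetical illustration: a binomial read model combined with a
    Hardy-Weinberg prior. Real pipelines also use base qualities and
    haplotype information from neighbouring sites.
    """
    n = ref_reads + alt_reads
    # Probability that a single read shows the alternate allele, per genotype.
    p_alt = {"0/0": error_rate, "0/1": 0.5, "1/1": 1.0 - error_rate}
    # Hardy-Weinberg prior from the assumed population allele frequency.
    q = alt_freq
    prior = {"0/0": (1 - q) ** 2, "0/1": 2 * q * (1 - q), "1/1": q ** 2}
    # Prior times the binomial likelihood of the observed read counts.
    unnormalised = {
        g: prior[g]
        * comb(n, alt_reads)
        * p_alt[g] ** alt_reads
        * (1 - p_alt[g]) ** ref_reads
        for g in p_alt
    }
    total = sum(unnormalised.values())
    return {g: value / total for g, value in unnormalised.items()}


# Four reads at a site: two match the reference, two carry the variant.
posteriors = genotype_posteriors(ref_reads=2, alt_reads=2, alt_freq=0.1)
most_likely = max(posteriors, key=posteriors.get)
print(most_likely, posteriors)
```

With only a handful of reads per site, the prior carries real weight, which is one reason low-coverage data from many individuals are analyzed jointly rather than one genome at a time.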
Each rare genetic change most likely arose fairly recently, which is why each is largely confined to people living in a fairly small area. The 14 populations studied in the project included Finnish people, people with Mexican ancestry in Los Angeles, and Han Chinese in Beijing. The changes can have many functional consequences, such as producing truncated, defective proteins or altering gene expression. “Many well-known human disease alleles differ in frequency across populations, although not all do. The alleles that cause sickle cell anemia are one of the best known examples,” said Gonçalo Abecasis of the University of Michigan, a co-author on the study. This work also provides further evidence that low-frequency and rare variants in genes can be associated with complex diseases such as cardiovascular disease or cancer.
Because all of the project’s data are publicly available, researchers and physicians can compare DNA from patients with genetic disorders, or data from cancer genome projects, with data from the study. As the project expands, this ability will increase: in the final phase, the teams expect to sequence another 1,500 individuals from 12 other populations. Looking further ahead, Abecasis predicts: “I do expect that we'll be able to sequence millions of people. First in research studies, and gradually to guide clinical care too. A challenge in the next round of studies will be to connect genetic variation to outcomes in each person.”
1. The 1000 Genomes Project Consortium. 2012. An integrated map of genetic variation from 1,092 human genomes. Nature 491(7422):56-65.