Expanded human reference genome reflects global genetic diversity

Written by Caitlin Killen

Sequences from newly mapped genomes could allow the human reference genome to represent a wider subset of the population.

In findings presented at the Annual Meeting of the American Society of Human Genetics (Houston, TX, USA, 15-19 October), a collaborative project led by researchers at the University of California, San Francisco (CA, USA) has unveiled an amended human reference genome that captures the diversity of the world’s population.

The human reference genome is a product of the Human Genome Project. It serves as the template against which other genomes are compared, and has been a vital tool in genome-wide association studies, gene sequencing and the understanding of disease variation.

Some areas of the genome are highly conserved, with little variation between individuals, so the reference genome is well suited to studying these sections. However, when studying individuals from different ethnic backgrounds, the reference genome may introduce bias because it is a poorer match for their genomes.

“As integral as it has been to the scientific community, the current reference genome does not represent the genetic diversity found in different human populations. It is limiting because the reference genome is constructed with DNA from a few people, and over 70% of its sequences comes from a single donor,” explained Karen Wong, a graduate student who presented the findings.

To address this issue, the team sequenced 300 genomes from males and females representing diverse populations around the world. A total of 220 whole-genome de novo assemblies were generated and used to identify non-reference unique insertions (NUIs) – sequences that were not found in the reference genome.

Recurrent NUIs were integrated into the chromosomal assemblies of the reference genome, allowing them to be annotated based on genomic location. The team then demonstrated that previously unmapped reads from the Human Genome Project could be aligned to the new NUI-containing reference.

The team plan to continue sequencing genomes that capture global diversity, whilst evaluating the most effective way to present a reference genome that incorporates their newly sequenced reads.

“It is important to consider what information should be represented in the human reference genome, and what doesn’t need to be included, to make sure it is helpful for a variety of uses but avoids overcomplexity,” concluded Wong.

Check out the rest of our coverage from ASHG 2019 here.