This article is part of the supplement: Beyond the Genome: The true gene count, human evolution and disease genomics

Selected oral presentation

Genome sequencing and analysis of admixed genomes of African and Mexican ancestry: implications for personal ancestry reconstruction and multi-ethnic medical genomics

Francisco M De La Vega1*, Katarzyna Bryc2, Jeremiah D Degehnardt2, Shaila Musharoff2, Jeffrey M Kidd3, Vrunda Seth4, Sarah Stanley1, Abra Brisbin2, Alon Keinan2, Andrew Clark2 and Carlos D Bustamante2,3

  • * Corresponding author: Francisco M De La Vega

Author Affiliations

1 Genetic Systems R&D, Life Technologies, Foster City, CA 94404, USA

2 Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14853, USA

3 Department of Genetics, Stanford University, Stanford, CA 94305, USA

4 Genetic Systems R&D, Life Technologies, Beverly, MA 01915, USA

Genome Biology 2010, 11(Suppl 1):O4  doi:10.1186/gb-2010-11-s1-o4

The electronic version of this article is the complete one and can be found online at: http://genomebiology.com/2010/11/S1/O4


Published: 11 October 2010

© 2010 De La Vega et al; licensee BioMed Central Ltd.

Background

Understanding the contribution of rare and common genetic variants to disease susceptibility will probably require multi- and trans-ethnic sequencing studies that compare the genomes of many individuals with and without a particular disease. Accounting for population stratification at fine scales, in terms of both genomic and geographic location, will be important because rare alleles are likely to show more population stratification than common ones. Here, we present results from the sequencing, assembly and genomic analysis of two genomes from HapMap Phase 3. The donor individuals are of Mexican and African ancestry and represent the first 'admixed' genomes to be sequenced to high coverage.

Materials and methods

DNA was obtained from the Coriell Institute. We sequenced 2 × 25 bp and 2 × 50 bp mate-paired libraries with the Applied Biosystems SOLiD™ System. The total average haploid coverage achieved was ~20×. Mapping and SNP calling were performed with the SOLiD analysis pipeline. We developed methods to estimate 'admixture breakpoints' along the admixed genomes.
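As an illustration of what an 'admixture breakpoint' is, the following minimal sketch assumes that each marker along a chromosome has already been assigned a local ancestry label, and reports the intervals in which the label switches. It is not the method developed by the authors, which is not described here; the function name `ancestry_breakpoints`, the labels and the positions are all hypothetical.

```python
from typing import List, Tuple

Breakpoint = Tuple[int, int, str, str]


def ancestry_breakpoints(positions: List[int], labels: List[str]) -> List[Breakpoint]:
    """Return one (left_pos, right_pos, left_label, right_label) tuple
    for every switch in local ancestry between adjacent markers.

    Assumes `positions` is sorted along a single haplotype and that
    `labels` holds a pre-computed ancestry call for each position.
    """
    if len(positions) != len(labels):
        raise ValueError("positions and labels must have the same length")

    breakpoints: List[Breakpoint] = []
    for i in range(1, len(labels)):
        if labels[i] != labels[i - 1]:
            # The true breakpoint lies somewhere between these two markers.
            breakpoints.append(
                (positions[i - 1], positions[i], labels[i - 1], labels[i])
            )
    return breakpoints


# Hypothetical toy data: six markers on one haplotype.
positions = [100, 200, 300, 400, 500, 600]
labels = ["AFR", "AFR", "EUR", "EUR", "EUR", "AFR"]
switches = ancestry_breakpoints(positions, labels)
```

In this toy setting each breakpoint is localized only to the interval between two adjacent markers, so denser markers give narrower intervals.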

Results

We used the distribution of admixture breakpoints to infer the personal admixture history of the samples, and patterns of genomic diversity to reconstruct the demographic history of European, African and Native American continental populations. We compared the distribution of functional and putatively neutral genetic variation among 12 sequenced genomes and found that differences in demographic history might account for statistically significant differences in the distributions of synonymous versus benign, possibly damaging, and probably damaging non-synonymous coding variants. Finally, we used the admixed genome data together with data from the 1000 Genomes Project to quantify the relative proportions of private, rare and common functional and neutral alleles within and among populations.

Conclusions

Genomic sequencing provides finer resolution of admixture breakpoints based on allele frequency estimates from the HapMap and 1000 Genomes Projects. Our results suggest that there are many new variants to be discovered in populations of African and American descent, some of potential disease relevance.