Open Access Research

High resolution discovery and confirmation of copy number variants in 90 Yoruba Nigerians

Hajime Matsuzaki, Pei-Hua Wang, Jing Hu, Rich Rava and Glenn K Fu*

Author Affiliations

Affymetrix, Inc., 3420 Central Expressway, Santa Clara, CA 95051, USA

Genome Biology 2009, 10:R125  doi:10.1186/gb-2009-10-11-r125

Published: 9 November 2009

Abstract

Background

Copy number variants (CNVs) account for a large proportion of genetic variation in the genome. The initial discoveries of long (> 100 kb) CNVs in normal healthy individuals were made on BAC arrays and low resolution oligonucleotide arrays. Subsequent studies that used higher resolution microarrays and SNP genotyping arrays detected the presence of large numbers of CNVs that are < 100 kb, with median lengths of approximately 10 kb. More recently, whole genome sequencing of individuals has revealed an abundance of shorter CNVs with lengths < 1 kb.

Results

We used custom high-density oligonucleotide arrays for whole-genome scans at approximately 200-bp resolution, then followed up with a localized CNV typing array at resolutions as fine as 10 bp. The typing array served two purposes: to confirm regions from the initial genome scans, and to detect sample-level events at shorter CNV regions identified in recent whole-genome sequencing studies. We surveyed 90 Yoruba Nigerians from the HapMap Project and uncovered approximately 2,700 potentially novel CNVs, not previously reported in the literature, with a median length of approximately 3 kb. We generated sample-level event calls in the 90 Yoruba at nearly 9,000 regions. These included approximately 2,500 regions with a median length of only about 200 bp, which represent the union of CNVs independently discovered through whole-genome sequencing of two individuals of Western European descent. Event frequencies were noticeably higher at shorter regions (< 1 kb) than at longer CNVs (> 1 kb).

Conclusions

As new, shorter CNVs are discovered through whole-genome sequencing, high-resolution microarrays offer a cost-effective way to detect events at these regions in large numbers of individuals and so gain biological insights beyond the initial discovery.