This article is part of a special issue on exome sequencing.

Open Access Research

Comparison of solution-based exome capture methods for next generation sequencing

Anna-Maija Sulonen1,2, Pekka Ellonen1, Henrikki Almusa1, Maija Lepistö1, Samuli Eldfors1, Sari Hannula1, Timo Miettinen1, Henna Tyynismaa3, Perttu Salo1,2, Caroline Heckman1, Heikki Joensuu4, Taneli Raivio5,6, Anu Suomalainen3 and Janna Saarela1*

Author affiliations

1 Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Biomedicum Helsinki 2U, Tukholmankatu 8, 00290 Helsinki, Finland

2 Unit of Public Health Genomics, National Institute for Health and Welfare, Biomedicum Helsinki, Haartmaninkatu 8, 00290 Helsinki, Finland

3 Research Programs Unit, Molecular Neurology, Biomedicum-Helsinki, University of Helsinki, Haartmaninkatu 8, 00290 Helsinki, Finland

4 Department of Oncology, Helsinki University Central Hospital (HUCH), Haartmaninkatu 4, 00290 Helsinki, Finland

5 Institute of Biomedicine, Department of Physiology, University of Helsinki, Haartmaninkatu 8, 00290 Helsinki, Finland

6 Children's Hospital, Helsinki University Central Hospital (HUCH), Stenbäckinkatu 11, 00290 Helsinki, Finland

Citation and License

Genome Biology 2011, 12:R94  doi:10.1186/gb-2011-12-9-r94

Published: 28 September 2011

Abstract

Background

Techniques enabling targeted re-sequencing of the protein-coding sequences of the human genome on next-generation sequencing instruments are of great interest. We conducted a systematic comparison of the solution-based exome capture kits provided by Agilent and Roche NimbleGen. A control DNA sample was captured with all four capture methods and prepared for Illumina GAII sequencing. Sequence data from additional samples prepared with the same protocols were also used in the comparison.

Results

We developed a bioinformatics pipeline for quality control, short read alignment, variant identification and annotation of the sequence data. In our analysis, a larger percentage of the high-quality reads from the NimbleGen captures than from the Agilent captures aligned to the capture target regions. High GC content of the target sequence was associated with poor capture success in all exome enrichment methods. Comparison of mean allele balances for heterozygous variants indicated that, in all methods, heterozygous variant positions within the target regions tended to have more reference bases than variant bases. Genotype concordance with genotypes derived from SNP arrays was virtually the same for all methods. A minimum of 11× coverage was required to make a heterozygote genotype call with 99% accuracy when compared to common SNPs on genome-wide association arrays.
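
As an illustration of two of the measures named above, the following minimal Python sketch computes a mean allele balance for heterozygous sites and the lowest coverage threshold at which genotype calls reach a target concordance with array genotypes. It is not the pipeline used in this study. The function names, the input layout and the cumulative definition of accuracy (concordance among all calls with coverage at or above the threshold) are assumptions made only for this example.

```python
def allele_balance(ref_reads, alt_reads):
    """Fraction of reads at one site that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("site has no coverage")
    return ref_reads / total


def mean_allele_balance(het_sites):
    """Mean reference-allele fraction over (ref_reads, alt_reads) pairs.

    A value above 0.5 means reference bases outnumber variant bases
    on average at the heterozygous sites.
    """
    balances = [allele_balance(ref, alt) for ref, alt in het_sites]
    if not balances:
        raise ValueError("no heterozygous sites given")
    return sum(balances) / len(balances)


def min_coverage_for_accuracy(calls, target=0.99):
    """Smallest coverage threshold whose calls reach the target concordance.

    calls: iterable of (coverage, is_concordant) pairs, where
    is_concordant says whether the sequencing genotype matched the array.
    Assumption: accuracy is measured over all calls with coverage at or
    above the threshold. Returns None if no threshold reaches the target.
    """
    calls = list(calls)
    for threshold in sorted({cov for cov, _ in calls}):
        kept = [ok for cov, ok in calls if cov >= threshold]
        if sum(kept) / len(kept) >= target:
            return threshold
    return None


if __name__ == "__main__":
    # Hypothetical read counts (reference, variant) at three heterozygous sites.
    sites = [(6, 4), (5, 5), (7, 3)]
    print(mean_allele_balance(sites))

    # Hypothetical (coverage, matches array genotype) observations.
    observations = [(5, False), (8, True), (11, True), (20, True)]
    print(min_coverage_for_accuracy(observations))
```

A per-coverage definition of accuracy (concordance among calls at exactly a given depth) would be an equally plausible reading and would change only the filter inside the loop.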

Conclusions

Libraries captured with NimbleGen kits aligned more accurately to the target regions. The updated NimbleGen kit most efficiently covered the exome with a minimum coverage of 20×, yet none of the kits captured all the Consensus Coding Sequence annotated exons.
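
The notion of covering the exome with a minimum coverage of 20× can be made concrete as the fraction of target bases whose read depth is at least 20. The short Python sketch below shows that calculation under stated assumptions: the function name and the flat list of per-base depths are hypothetical and are not taken from the study's pipeline.

```python
def fraction_covered(depths, min_depth=20):
    """Fraction of target bases whose read depth is at least min_depth.

    depths: per-base read depths over the capture target (hypothetical input).
    """
    depths = list(depths)
    if not depths:
        raise ValueError("no target bases given")
    return sum(d >= min_depth for d in depths) / len(depths)


if __name__ == "__main__":
    # Hypothetical depths for four target bases; two reach 20x.
    print(fraction_covered([0, 10, 20, 35]))
```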