This article is part of a special issue on exome sequencing.

Research

Comparison of solution-based exome capture methods for next generation sequencing

Anna-Maija Sulonen (1,2), Pekka Ellonen (1), Henrikki Almusa (1), Maija Lepistö (1), Samuli Eldfors (1), Sari Hannula (1), Timo Miettinen (1), Henna Tyynismaa (3), Perttu Salo (1,2), Caroline Heckman (1), Heikki Joensuu (4), Taneli Raivio (5,6), Anu Suomalainen (3) and Janna Saarela (1)*

Author Affiliations

1 Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Biomedicum Helsinki 2U, Tukholmankatu 8, 00290 Helsinki, Finland

2 Unit of Public Health Genomics, National Institute for Health and Welfare, Biomedicum Helsinki, Haartmaninkatu 8, 00290 Helsinki, Finland

3 Research Programs Unit, Molecular Neurology, Biomedicum-Helsinki, University of Helsinki, Haartmaninkatu 8, 00290 Helsinki, Finland

4 Department of Oncology, Helsinki University Central Hospital (HUCH), Haartmaninkatu 4, 00290 Helsinki, Finland

5 Institute of Biomedicine, Department of Physiology, University of Helsinki, Haartmaninkatu 8, 00290 Helsinki, Finland

6 Children's Hospital, Helsinki University Central Hospital (HUCH), Stenbäckinkatu 11, 00290 Helsinki, Finland

Genome Biology 2011, 12:R94  doi:10.1186/gb-2011-12-9-r94

Published: 28 September 2011

Additional files

Additional file 1:

Comparison of the probe designs of the exome capture kits against the CCDS exon annotation, the UCSC exon annotation and each other. (a) Numbers of CCDS exon regions, common target regions outside the CCDS annotation and regions covered individually by the Agilent SureSelect and Agilent SureSelect 50 Mb kits. SureSelect has a single region outside the SureSelect 50 Mb design. (b) The same as (a) for the NimbleGen SeqCap and NimbleGen SeqCap v2.0 kits. (c-f) The same as Figures 1a and 1b and Additional files 1a and 1b, respectively, but with the UCSC exon annotation in place of the CCDS annotation. Regions of interest are defined as merged genomic positions, regardless of their strandedness, which overlap with the kit in question. The sizes of the spheres are proportional to the number of targeted regions in the kit. The total number of targeted regions is given under the name flag of each sphere.

Format: TIFF; size: 1.2 MB
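The region definition used for Additional file 1 (merged genomic positions, strand ignored, overlapping a kit's targets) can be illustrated with a minimal Python sketch. The function names, the 0-based half-open coordinate convention and the choice to merge abutting intervals are assumptions made for illustration; they are not taken from the article's pipeline.

```python
from collections import defaultdict


def merge_regions(regions):
    """Merge overlapping or abutting intervals per chromosome, ignoring strand.

    `regions` is an iterable of (chrom, start, end, strand) tuples with
    0-based, half-open coordinates (an assumed convention).
    Returns {chrom: [(start, end), ...]} sorted by start position.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, _strand in regions:
        # Strand is deliberately discarded before merging.
        by_chrom[chrom].append((start, end))

    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Overlaps or abuts the previous interval: extend it.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [tuple(iv) for iv in out]
    return merged


def overlaps_any(region, targets):
    """Return True if the (start, end) region overlaps any target interval."""
    start, end = region
    return any(t_start < end and start < t_end for t_start, t_end in targets)


if __name__ == "__main__":
    exons = [
        ("chr1", 100, 200, "+"),
        ("chr1", 150, 300, "-"),
        ("chr1", 400, 500, "+"),
        ("chr2", 10, 20, "-"),
    ]
    print(merge_regions(exons))
```

Counting the merged regions that pass `overlaps_any` against a kit's target list would give a per-kit number of regions of interest under these assumptions.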

Additional file 2:

Capture details of the CCDS exons for Agilent SureSelect in the control I sample. Tabular file listing CCDS annotated exons, their targeting in the Agilent SureSelect kit and mean sequencing coverage for the control I sample.

Format: CSV; size: 19 MB

Additional file 3:

Capture details of the CCDS exons for Agilent SureSelect 50 Mb in the control I sample. Tabular file listing CCDS annotated exons, their targeting in the Agilent SureSelect 50 Mb kit and mean sequencing coverage for the control I sample.

Format: CSV; size: 19 MB

Additional file 4:

Capture details of the CCDS exons for NimbleGen SeqCap in the control I sample. Tabular file listing CCDS annotated exons, their targeting in the NimbleGen SeqCap kit and mean sequencing coverage for the control I sample.

Format: CSV; size: 19 MB

Additional file 5:

Capture details of the CCDS exons for NimbleGen SeqCap v2.0 in the control I sample. Tabular file listing CCDS annotated exons, their targeting in the NimbleGen SeqCap v2.0 kit and mean sequencing coverage for the control I sample.

Format: CSV; size: 19.1 MB

Additional file 6:

Comparison of poorly captured targets between the exome capture kits. Table presenting comparisons between the regions of the common target with poor capture success in one kit (mean sequencing coverage 0×) and reasonable capture success in another kit (mean sequencing coverage ≥ 10×).

Format: PDF; size: 58 KB
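The selection rule behind Additional file 6 (mean coverage 0× in one kit, at least 10× in another) can be sketched as a simple filter. The data layout, the function name and the kit labels below are hypothetical; only the two thresholds come from the description above.

```python
def discordant_targets(coverage, good_threshold=10.0):
    """List (region, poor_kit, good_kit) triples for discordant capture.

    `coverage` maps a region identifier to {kit_name: mean_coverage}
    (an assumed layout). A region is reported when one kit has a mean
    coverage of exactly 0 and another kit reaches `good_threshold`.
    """
    hits = []
    for region, by_kit in coverage.items():
        for poor_kit, poor_cov in by_kit.items():
            if poor_cov != 0:
                continue
            for good_kit, good_cov in by_kit.items():
                if good_kit != poor_kit and good_cov >= good_threshold:
                    hits.append((region, poor_kit, good_kit))
    return hits


if __name__ == "__main__":
    # Made-up values for illustration only, not data from the study.
    example = {
        "region_1": {"kit_A": 0.0, "kit_B": 12.5},
        "region_2": {"kit_A": 5.0, "kit_B": 30.0},
        "region_3": {"kit_A": 0.0, "kit_B": 9.9},
    }
    print(discordant_targets(example))
```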

Additional file 7:

Mean allele balances for heterozygous single nucleotide variants. (a-e) Allele balances are given for the heterozygous SNVs in the whole genome (a), in each exome capture method's own CTR (b), in each exome capture method's own CTR and the flanking 100 bp (c), in the CCDS annotated exon regions (d) and in the common regions targeted for capture in all the methods (e) for different minimum sequencing coverages. In (a, b), allele balances are given for the control I sample (bars without outline) and for the mean values from the 26 additional exome samples (bars with thick outline). The ideal allele balance of 0.5 is indicated with a red line. Regions including mostly non-targeted base pairs, as in the whole genome and CTR + flanking regions, had a mean allele balance closer to 0.5 than the regions with only targeted base pairs. Additionally, the allele balance shifted away from 0.5 with increasing minimum sequencing depth.

Format: TIFF; size: 1.2 MB
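As a rough illustration of the quantity plotted in Additional file 7, the sketch below computes a mean allele balance over heterozygous SNVs at a given minimum coverage. It assumes that allele balance is the fraction of reads supporting the reference allele and that coverage is the sum of reference and alternate read counts; the article's exact definition is not stated in this listing, and the function name is hypothetical.

```python
def mean_allele_balance(snvs, min_coverage):
    """Mean allele balance over heterozygous SNVs with enough coverage.

    `snvs` is an iterable of (ref_reads, alt_reads) pairs, one per
    heterozygous SNV. Allele balance is taken here as
    ref_reads / (ref_reads + alt_reads), which is an assumption.
    Returns None when no SNV reaches `min_coverage`.
    """
    balances = [
        ref / (ref + alt)
        for ref, alt in snvs
        if ref + alt >= min_coverage and ref + alt > 0
    ]
    if not balances:
        return None
    return sum(balances) / len(balances)


if __name__ == "__main__":
    # Made-up read counts for illustration only.
    calls = [(10, 10), (30, 10), (2, 1)]
    for threshold in (1, 20, 100):
        print(threshold, mean_allele_balance(calls, threshold))
```

Evaluating the function over a range of thresholds mirrors the "different minimum sequencing coverages" axis described for the figure.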

Additional file 8:

Correlation of VCP genotype calls from Agilent SureSelect- and NimbleGen SeqCap-captured (a) and SureSelect 50 Mb- and SeqCap v2.0-captured (b) sequenced genotypes to the Illumina Human660W-Quad v1 SNP chip genotypes with exact sequencing coverages. Correlations for heterozygous, reference homozygous and variant homozygous SNPs (according to the chip genotype call) are presented in separate graphs, though graphs lying near 100% correlation cannot be visualized. The x-axis represents the exact coverage of the sequenced SNPs.

Format: TIFF; size: 1.5 MB

Additional file 9:

Correlation of SAMtools' pileup genotype calls (a, b) and GATK genotype calls (c, d) from Agilent SureSelect- and NimbleGen SeqCap-captured and SureSelect 50 Mb- and SeqCap v2.0-captured sequenced genotypes to the Illumina Human660W-Quad v1 SNP chip genotypes. The SAMtools' pileup genotype calls correlated less well with the chip genotypes than the genotypes accommodated with the quality ratios. Correlations for heterozygous, reference homozygous and variant homozygous SNPs (according to the chip genotype call) are presented in separate graphs, though graphs lying near 100% correlation cannot be visualized. The x-axis represents the cumulative coverage of the sequenced SNPs.

Format: TIFF; size: 2.4 MB

Additional file 10:

Examined disorders of the Finnish disease heritage, their mutation loci and the sequencing coverage of the control I sample at these loci.

Format: PDF; size: 83 KB

Additional file 11:

Sample preparation workflows for sample preparation I (a) and sample preparation II (b). (a) Orange boxes represent the protocol provided by Agilent for the SureSelect Human All Exon capture kit, and green boxes the protocol for the SeqCap EZ Exome capture kit by NimbleGen. Protocol simplifications and equalizations were made in the sample preparation, and are represented as blue boxes and arrows. (b) Similarly for sample preparation II, orange boxes refer to Agilent SureSelect 50 Mb and green boxes to NimbleGen SeqCap v2.0. Not all the steps of the provided protocols are represented.

Format: TIFF; size: 1.5 MB