Open Access Method

Biased estimates of clonal evolution and subclonal heterogeneity can arise from PCR duplicates in deep sequencing experiments

Erin N Smith [1,2,3], Kristen Jepsen [1,2,4], Mahdieh Khosroheidari [1,4], Laura Z Rassenti [1], Matteo D’Antonio [2], Emanuela M Ghia [1], Dennis A Carson [1,5], Catriona HM Jamieson [1,5,6], Thomas J Kipps [1,5] and Kelly A Frazer [1,2,3,4]*

Author Affiliations

1 Moores UCSD Cancer Center, University of California San Diego, 9500 Gilman Drive, La Jolla 92093, CA, USA

2 Department of Pediatrics and Rady Children’s Hospital, University of California San Diego, 9500 Gilman Drive, La Jolla 92093, CA, USA

3 Clinical and Translational Research Institute, University of California San Diego, 9500 Gilman Drive, La Jolla 92093, CA, USA

4 Institute for Genomic Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla 92093, CA, USA

5 Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla 92093, CA, USA

6 Stem Cell Program, University of California San Diego, La Jolla, CA, USA

Genome Biology 2014, 15:420  doi:10.1186/s13059-014-0420-4

Published: 7 August 2014

Abstract

Accurate allele frequencies are important for measuring subclonal heterogeneity and clonal evolution. Deep-targeted sequencing data can contain PCR duplicates, inflating perceived read depth. Here we adapted the Illumina TruSeq Custom Amplicon kit to include single molecule tagging (SMT) and show that SMT-identified duplicates arise from PCR. We demonstrate that retention of PCR duplicate reads can imply clonal evolution when none exists, while their removal effectively controls the false positive rate. Additionally, PCR duplicates alter estimates of subclonal heterogeneity in tumor samples. Our method simplifies PCR duplicate identification and emphasizes their removal in studies of tumor heterogeneity and clonal evolution.
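
The effect of PCR duplicates on allele frequency can be illustrated with a minimal sketch. It is not the authors' pipeline: the read tuples, tag sequences, function names, and the majority-vote consensus rule are all assumptions made for illustration. Each toy read carries an amplicon name, a single molecule tag, and the allele observed at one site. Reads that share an amplicon and tag are treated as copies of one original molecule and collapsed to a single observation.

```python
from collections import Counter, defaultdict


def allele_frequency(alleles, alt):
    """Fraction of observations that carry the alternate allele."""
    if not alleles:
        return 0.0
    return sum(a == alt for a in alleles) / len(alleles)


def collapse_by_tag(reads):
    """Collapse reads sharing (amplicon, tag) into one consensus allele each.

    Hypothetical helper: the consensus is a simple majority vote within each
    tag family, which is an assumption, not the method described in the paper.
    """
    families = defaultdict(list)
    for amplicon, tag, allele in reads:
        families[(amplicon, tag)].append(allele)
    return [Counter(alleles).most_common(1)[0][0] for alleles in families.values()]


# Invented toy data: one variant-bearing molecule (tag AAC) was amplified
# four times, while three reference molecules were each read once.
reads = [
    ("amp1", "AAC", "T"),
    ("amp1", "AAC", "T"),
    ("amp1", "AAC", "T"),
    ("amp1", "AAC", "T"),
    ("amp1", "GTA", "C"),
    ("amp1", "CCG", "C"),
    ("amp1", "TGT", "C"),
]

# Keeping duplicates: 4 of 7 reads carry T.
raw_af = allele_frequency([allele for _, _, allele in reads], "T")

# After collapsing: 1 of 4 unique molecules carries T.
dedup_af = allele_frequency(collapse_by_tag(reads), "T")

print(f"with duplicates: {raw_af:.3f}, after collapsing: {dedup_af:.3f}")
```

In this toy case the apparent read depth drops from 7 to 4 once duplicates are collapsed, and the variant allele frequency falls from about 0.57 to 0.25. A shift of that kind between two samples could be mistaken for clonal evolution if duplicates were retained.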