Method (Open Access)

Corset: enabling differential gene expression analysis for de novo assembled transcriptomes

Nadia M Davidson(1) and Alicia Oshlack(1,2)*

Author Affiliations

1 Murdoch Childrens Research Institute, Royal Children’s Hospital, Flemington Road, Parkville 3052, Melbourne, VIC, Australia

2 Department of Genetics, University of Melbourne, Melbourne, VIC, Australia

Genome Biology 2014, 15:410  doi:10.1186/s13059-014-0410-6

Published: 26 July 2014

Abstract

Next-generation sequencing has made it possible to perform differential gene expression studies in non-model organisms. For these studies, the need for a reference genome is circumvented by performing de novo assembly on the RNA-seq data. However, transcriptome assembly produces a multitude of contigs, which must be clustered into genes prior to differential gene expression detection. Here we present Corset, a method that hierarchically clusters contigs using shared reads and expression, then summarizes read counts per cluster, ready for statistical testing. Using a range of metrics, we demonstrate that Corset outperforms alternative methods. Corset is available from https://code.google.com/p/corset-project/.
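
To make the idea of clustering contigs by shared reads more concrete, the following is a minimal Python sketch and not Corset's actual implementation. It assumes a simple distance between two clusters, one minus the fraction of reads they share relative to the smaller cluster, and an illustrative merge threshold of 0.5; both are assumptions made for this example. The expression-based criterion mentioned in the abstract is omitted entirely, and the function names `cluster_contigs` and `summarize_counts` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cluster_contigs(read_hits, threshold=0.5):
    """Greedy agglomerative clustering of contigs by shared reads.

    read_hits: dict mapping a read id to the set of contig ids it aligns to.
    threshold: illustrative cut-off (an assumption); merging stops once the
               closest pair of clusters is at least this far apart.
    Returns a dict mapping frozenset(contig ids) -> set(read ids).
    """
    # Start with one cluster per contig, holding the reads that hit it.
    reads_by_cluster = defaultdict(set)
    for read, contigs in read_hits.items():
        for contig in contigs:
            reads_by_cluster[frozenset([contig])].add(read)
    clusters = dict(reads_by_cluster)

    def distance(a, b):
        # Assumed distance: 1 - shared reads / reads in the smaller cluster.
        shared = len(clusters[a] & clusters[b])
        return 1.0 - shared / min(len(clusters[a]), len(clusters[b]))

    while len(clusters) > 1:
        # Sort the keys so that ties are broken deterministically.
        keys = sorted(clusters, key=sorted)
        a, b = min(combinations(keys, 2), key=lambda pair: distance(*pair))
        if distance(a, b) >= threshold:
            break
        # Merge the closest pair; their reads are pooled as a set union.
        clusters[a | b] = clusters.pop(a) | clusters.pop(b)
    return clusters


def summarize_counts(clusters):
    """Count each read once per cluster, even if it hits several contigs."""
    return {members: len(reads) for members, reads in clusters.items()}


# Toy input: reads r1 and r2 map to both c1 and c2, so those contigs merge.
read_hits = {
    "r1": {"c1", "c2"},
    "r2": {"c1", "c2"},
    "r3": {"c1"},
    "r4": {"c3"},
    "r5": {"c3"},
    "r6": {"c2", "c3"},
}
counts = summarize_counts(cluster_contigs(read_hits))
for members, n in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(members), n)
# ['c1', 'c2'] 4
# ['c3'] 3
```

In this toy input, c1 and c2 share two of their three reads (distance 1/3) and are merged, while c3 shares only one read with the merged cluster (distance 2/3) and stays separate. The resulting per-cluster counts are the kind of table that would then be passed to a statistical test for differential expression.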