Open Access Highly Accessed Method

Modeling non-uniformity in short-read rates in RNA-Seq data

Jun Li1, Hui Jiang1,2 and Wing Hung Wong1,3*

Author Affiliations

1 Department of Statistics, Stanford University, Sequoia Hall, 390 Serra Mall, Stanford, CA 94305, USA

2 Stanford Genome Technology Center, 855 California Ave, Palo Alto, CA 94304, USA

3 Department of Health Research and Policy, Stanford University, 259 Campus Drive, Redwood Building, Stanford, CA 94305, USA

Genome Biology 2010, 11:R50  doi:10.1186/gb-2010-11-5-r50

Published: 11 May 2010

Abstract

After mapping, RNA-Seq data can be summarized by a sequence of read counts, which are commonly modeled as Poisson variables with a constant rate along each transcript. This constant-rate assumption actually fits the data poorly. We suggest using variable rates for different positions, and propose two models to predict these rates based on local sequences. These models explain more than 50% of the variation and can lead to improved estimates of gene and isoform expression for both Illumina and Applied Biosystems data.
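
The contrast drawn in the abstract, between a constant Poisson rate per transcript and position-specific rates driven by local sequence, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the log-linear form of the rate, the `(offset, nucleotide)` weight encoding, and the use of Poisson deviance to quantify "variation explained" are all assumptions made here for illustration only.

```python
import math


def poisson_deviance(counts, rates):
    """Poisson deviance 2 * sum(y*log(y/mu) - (y - mu)), taking 0*log(0) as 0."""
    total = 0.0
    for y, mu in zip(counts, rates):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total


def constant_rates(counts):
    """Constant-rate baseline: every position gets the mean count."""
    mean = sum(counts) / len(counts)
    return [mean] * len(counts)


def loglinear_rate(log_base_rate, window, weights):
    """Hypothetical sequence-dependent rate.

    rate = exp(log_base_rate + sum of weights[(offset, nucleotide)])
    where `window` is the local sequence around a position and missing
    weights default to 0 (no effect).
    """
    score = log_base_rate
    for offset, nucleotide in enumerate(window):
        score += weights.get((offset, nucleotide), 0.0)
    return math.exp(score)


def fraction_deviance_explained(counts, variable_rates):
    """Share of the constant-rate deviance removed by the variable rates."""
    null_deviance = poisson_deviance(counts, constant_rates(counts))
    model_deviance = poisson_deviance(counts, variable_rates)
    return 1.0 - model_deviance / null_deviance


if __name__ == "__main__":
    # Toy read counts at six positions and made-up local sequence windows.
    counts = [2, 10, 4, 1, 9, 5]
    windows = ["AC", "GG", "AT", "TT", "GC", "CA"]
    # Illustrative weights only; a real model would estimate them from data.
    weights = {(0, "G"): 0.8, (0, "T"): -1.0, (0, "A"): -0.4}
    log_base = math.log(sum(counts) / len(counts))
    rates = [loglinear_rate(log_base, w, weights) for w in windows]
    print(fraction_deviance_explained(counts, rates))
```

A value near 0 means the variable rates do no better than the constant-rate baseline, and a value near 1 means they reproduce the observed counts almost exactly.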