Open Access Method

FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data

Andrea Sboner1,2, Lukas Habegger1, Dorothee Pflueger3, Stephane Terry3, David Z Chen1, Joel S Rozowsky2, Ashutosh K Tewari4, Naoki Kitabayashi3, Benjamin J Moss3, Mark S Chee5, Francesca Demichelis3,6, Mark A Rubin3* and Mark B Gerstein1,2,7*

Author affiliations

1 Program in Computational Biology and Bioinformatics, Yale University, 300 George Street, New Haven, CT 06511, USA

2 Molecular Biophysics and Biochemistry Department, Yale University, 260 Whitney Avenue, New Haven, CT 06520, USA

3 Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, 1300 York Avenue, New York, NY 10065, USA

4 Department of Urology, Weill Cornell Medical College, 525 East 68th Street, New York, NY 10065, USA

5 Prognosys Biosciences, Inc., 505 Coast Blvd South, La Jolla, CA 92037, USA

6 Institute for Computational Biomedicine, Weill Cornell Medical College, 1305 York Avenue, New York, NY 10065, USA

7 Department of Computer Science, Yale University, 51 Prospect Street, New Haven, CT 06511, USA

Citation and License

Genome Biology 2010, 11:R104  doi:10.1186/gb-2010-11-10-r104

Published: 21 October 2010

Abstract

We have developed FusionSeq to identify fusion transcripts from paired-end RNA-sequencing data. FusionSeq includes filters to remove spurious candidate fusions caused by artifacts, such as misalignment or random pairing of transcript fragments, and it ranks the remaining candidates according to several statistics. It also has a module to identify the exact sequences at breakpoint junctions. FusionSeq detected known and novel fusions in a specially sequenced calibration data set that included eight cancers with and without known rearrangements.
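
The general idea of collecting, filtering, and ranking fusion candidates from paired-end reads can be illustrated with a minimal sketch. This is not FusionSeq's code, and it does not reproduce its filters or ranking statistics. It assumes each read pair has already been reduced to the two genes its ends align to, and it uses a hypothetical `min_support` threshold as a stand-in for artifact filtering and the raw count of supporting pairs as a stand-in for a ranking score. The function name `rank_fusion_candidates`, the gene labels, and the toy read pairs are all invented for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Assumption: each read pair is summarized as (gene hit by end 1, gene hit by end 2).
ReadPair = Tuple[str, str]
Candidate = Tuple[Tuple[str, str], int]


def rank_fusion_candidates(pairs: Iterable[ReadPair],
                           min_support: int = 2) -> List[Candidate]:
    """Count read pairs whose ends map to different genes, drop weakly
    supported gene pairs, and rank the rest by supporting-pair count."""
    counts: Counter = Counter()
    for gene1, gene2 in pairs:
        if gene1 == gene2:
            continue  # both ends in the same gene: not evidence of a fusion
        # Treat (A, B) and (B, A) as the same candidate.
        counts[tuple(sorted((gene1, gene2)))] += 1

    # Hypothetical filter: candidates seen only once may be random pairing.
    kept = [(key, n) for key, n in counts.items() if n >= min_support]
    # Rank by descending support, breaking ties alphabetically.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept


# Toy, made-up read pairs (not data from the study).
pairs = [
    ("GENE_A", "GENE_B"), ("GENE_B", "GENE_A"), ("GENE_A", "GENE_B"),
    ("GENE_C", "GENE_C"), ("GENE_D", "GENE_E"),
]
candidates = rank_fusion_candidates(pairs)
```

In this toy input, only the GENE_A/GENE_B pair is supported by more than one read pair, so it is the only candidate that survives the threshold. A real pipeline would replace the simple count threshold with filters targeted at specific artifacts, such as the misalignment and random pairing mentioned above.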