Open Access Research

Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

Shaolin Wang1, Eric Peatman1, Jason Abernathy1, Geoff Waldbieser2, Erika Lindquist3, Paul Richardson3, Susan Lucas3, Mei Wang3, Ping Li1, Jyothi Thimmapuram4, Lei Liu4, Deepika Vullaganti4, Huseyin Kucuktas1, Christopher Murdock2, Brian C Small2, Melanie Wilson5, Hong Liu1, Yanliang Jiang1, Yoona Lee1, Fei Chen1, Jianguo Lu1, Wenqi Wang1, Peng Xu1, Benjaporn Somridhivej1, Puttharat Baoprasertkul1, Jonas Quilang1, Zhenxia Sha1, Baolong Bao1, Yaping Wang1, Qun Wang1, Tomokazu Takano1, Samiran Nandi1, Shikai Liu1, Lilian Wong1, Ludmilla Kaltenboeck1, Sylvie Quiniou2, Eva Bengten5, Norman Miller5, John Trant6, Daniel Rokhsar3,7, Zhanjiang Liu1* and the Catfish Genome Consortium

Author affiliations

1 The Fish Molecular Genetics and Biotechnology Laboratory, Department of Fisheries and Allied Aquacultures and Program of Cell and Molecular Biosciences, Aquatic Genomics Unit, 203 Swingle Hall, Auburn University, Auburn, AL 36849, USA

2 USDA, ARS, Catfish Genetics Research Unit, 141 Experiment Station Road, Stoneville, Mississippi 38776, USA

3 DOE Joint Genome Institute, Genomic Technologies Department, 2800 Mitchell Drive Bldg 400-462, Walnut Creek, CA 94598, USA

4 The WM Keck Center for Comparative and Functional Genomics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA

5 Department of Microbiology, University of Mississippi Medical Center, 2500 North State Street, Jackson, MS 39216, USA

6 Center of Marine Biotechnology, University of Maryland Biotechnology Institute, 701 East Pratt Street, Baltimore, MD 21202, USA

7 Department of Molecular and Cell Biology, University of California, Berkeley, 142 Life Sciences Addition #3200, Berkeley, CA 94720, USA

Citation and License

Genome Biology 2010, 11:R8  doi:10.1186/gb-2010-11-1-r8

Published: 22 January 2010

Abstract

Background

Through the Community Sequencing Program, the catfish research community and the Department of Energy's Joint Genome Institute collaborated on a catfish EST sequencing project. Prior to this project, only limited catfish EST resources were available for SNP identification.

Results

A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarity to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs were identified, of which approximately 48,000 are high-quality SNPs, defined as those found in contigs of at least four sequences in which the minor allele is present in at least two sequences. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis.
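
The high-quality SNP criterion stated above (a contig of at least four sequences, with the minor allele present in at least two of them) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name, the constants, and the input representation (the bases observed at one aligned contig position) are hypothetical, and treating the second most frequent base as the minor allele is an assumption made here for illustration.

```python
from collections import Counter

# Thresholds taken from the criterion stated in the abstract.
MIN_CONTIG_DEPTH = 4   # contig must contain at least four sequences
MIN_MINOR_ALLELE = 2   # minor allele must be seen in at least two sequences


def is_high_quality_snp(column_bases, contig_depth):
    """Return True if one aligned contig position passes the filter.

    column_bases: iterable of bases observed at this position, one per EST
                  covering it (hypothetical input format).
    contig_depth: number of sequences assembled into the contig.
    """
    if contig_depth < MIN_CONTIG_DEPTH:
        return False
    # Count only unambiguous bases; gaps and N calls are ignored.
    counts = Counter(b for b in (c.upper() for c in column_bases) if b in "ACGT")
    if len(counts) < 2:
        return False  # no variation at this position
    # Assumption: the minor allele is the second most frequent base.
    (_, _major_count), (_, minor_count) = counts.most_common(2)
    return minor_count >= MIN_MINOR_ALLELE


# Usage: two ESTs carry A and two carry G in a four-sequence contig.
print(is_high_quality_snp("AAGG", contig_depth=4))
```

Requiring the minor allele in two or more sequences guards against calling a single sequencing error a SNP, which is the usual motivation for this kind of threshold.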

Conclusions

This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide a powerful means for evaluating ancient and recent gene duplications, and for developing high-density microarrays in catfish. The inter- and intra-specific SNPs identified from the assembly of the complete catfish EST dataset will greatly benefit the catfish introgression breeding program and whole genome association studies.