Software

PyCogent: a toolkit for making sense from sequence

Rob Knight1*, Peter Maxwell2, Amanda Birmingham3, Jason Carnes4, J Gregory Caporaso5, Brett C Easton2, Michael Eaton6, Micah Hamady7, Helen Lindsay2, Zongzhi Liu1, Catherine Lozupone1, Daniel McDonald7, Michael Robeson8, Raymond Sammut2, Sandra Smit1, Matthew J Wakefield2,9, Jeremy Widmann1, Shandy Wikman1, Stephanie Wilson7, Hua Ying2 and Gavin A Huttley2*

Author Affiliations

1 Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado, USA

2 Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia

3 Thermo Fisher Scientific, Lafayette, Colorado, USA

4 Seattle Biomedical Research Institute, Seattle, Washington, USA

5 Department of Biochemistry and Molecular Genetics, University of Colorado Health Sciences Center, Aurora, Colorado, USA

6 Science Applications International Corporation, Englewood, Colorado, USA

7 Department of Computer Science, University of Colorado, Boulder, Colorado, USA

8 Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, USA

9 Walter and Eliza Hall Institute, Melbourne, Victoria, Australia

Genome Biology 2007, 8:R171  doi:10.1186/gb-2007-8-8-r171

Published: 21 August 2007


We have implemented in Python the COmparative GENomic Toolkit (PyCogent), a fully integrated and thoroughly tested framework for performing novel probabilistic analyses of biological sequences, devising workflows, and generating publication-quality graphics. PyCogent includes connectors to remote databases, built-in generalized probabilistic techniques for working with biological sequences, and controllers for third-party applications. The toolkit takes advantage of parallel architectures, runs on a range of hardware and operating systems, and is available under the General Public License.