Method
A standard variation file format for human genome sequences
1 Omicia, 2200 Powell Street, Suite 525, Emeryville, CA 94608, USA
2 Department of Human Genetics and Eccles Institute of Human Genetics, 15 North 2030 East, University of Utah, Salt Lake City, UT 84108, USA
3 Royal Society of Chemistry, Thomas Graham House, Cambridge, CB4 0WF, UK
4 EMBL Outstation - Hinxton, European Bioinformatics Institute, Wellcome Trust, Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
5 Department of Biology, Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA
6 Ontario Institute for Cancer Research, 101 College St, Suite 800, Toronto, ON M5G0A3, Canada
7 Department of Biomedical Informatics, Health Sciences Education Building, Suite 5700, 26 South 2000 East, University of Utah, Salt Lake City, UT 84112, USA
Genome Biology 2010, 11:R88 doi:10.1186/gb-2010-11-8-r88
Published: 26 August 2010

Abstract
Here we describe the Genome Variation Format (GVF) and the 10Gen dataset. GVF, an extension of Generic Feature Format version 3 (GFF3), is a simple tab-delimited format for DNA variant files that uses the Sequence Ontology to describe genome variation data. The 10Gen dataset, comprising ten human genomes in GVF format, is freely available for community analysis from the Sequence Ontology website and as an Amazon Elastic Block Store (EBS) snapshot for use in Amazon's EC2 cloud computing environment.
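
Because GVF extends GFF3, a GVF record can be read as an ordinary GFF3 line: nine tab-delimited columns, with the last column holding semicolon-separated tag=value attributes. The following is a minimal Python sketch of that idea, not a reference parser. The function name `parse_gvf_line` and the example record (including its source name, coordinates, and ID) are hypothetical and chosen only for illustration; the `Reference_seq` and `Variant_seq` attribute tags are assumed here to follow the GVF convention for recording alleles.

```python
# The nine tab-delimited columns defined by GFF3, which GVF reuses.
GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gvf_line(line):
    """Parse one GFF3-style feature line into a dictionary (illustrative helper)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(GFF3_COLUMNS):
        raise ValueError(f"expected 9 tab-delimited columns, got {len(fields)}")

    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates in GFF3 are 1-based integers.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])

    # Column 9 holds tag=value pairs separated by semicolons;
    # a tag may carry several comma-separated values.
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = value.split(",")
    record["attributes"] = attributes
    return record


# Hypothetical record: a single-nucleotide variant described with a
# Sequence Ontology term ("SNV") in the type column.
example = "chr1\texample_source\tSNV\t12345\t12345\t.\t+\t.\tID=var1;Reference_seq=A;Variant_seq=G"
record = parse_gvf_line(example)
print(record["type"], record["start"], record["attributes"]["Variant_seq"])
```

Keeping attribute values as lists reflects that a tag can hold more than one value, for example two variant alleles at a heterozygous site.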



