Table 2 |
||
|
The pragmas defined by GVF, in addition to those already defined by GFF3 (gff-version, sequence-region, feature-ontology, attribute-ontology, source-ontology, species, genome-build) |
||
|
Pragma |
Allowed tags |
Description |
|
|
||
|
file-version |
Comment |
This pragma specifies the version of a specific file. What exactly the version means is left undefined, but the tag is provided for the case when an individual's variants are described in GVF and then, at a later date, changes to the data or the software require an update to the file. An increment of the file-version could signify such a change. Any numeric value of file-version is allowed
|
file-date |
Comment |
The file-date pragma records the date on which the file was created. The value must be given in the ISO 8601 date format YYYY-MM-DD
|
individual-id |
Dbxref, Gender, Population, Comment |
This pragma provides details about the individual whose variants are described in the file |
|
##individual-id Dbxref = Coriell:NA18507;Gender = male;Ethnicity = Yoruba; Comment = Yoruba from Ibadan |
||
|
source-method |
Seqid, Source, Type, Dbxref, Comment |
This pragma provides details about the algorithms or methodologies used to generate data for a given source in the file. This is used, for example, to document how a particular type of variant was called. A typical use would be to provide a Dbxref link to a journal article describing the software used for calling the variant data with the given source tag
|
##source-method Seqid = chr1;Source = MAQ;Type = SNV;Dbxref = PMID:18714091;Comment = MAQ SNV calls; |
||
|
attribute-method |
Seqid, Source, Type, Attribute, Dbxref, Comment |
This pragma provides details about the algorithms or methodologies used for a given attribute tag in the file. This is used to document how a particular type of attribute value (for example, Genotype or Variant_effect) was calculated
|
##attribute-method Source = SOLiD;Type = SNV;Attribute = Genotype;Comment = Genotype is reported here as determined in the original study |
||
|
technology-platform |
Seqid, Source, Type, Read_length, Read_type, Read_pair_span, Platform_class, Platform_name, Average_coverage, Comment, Dbxref
This pragma provides details about the technologies (for example, sequencing or DNA microarray) used to generate the primary data
|
##technology-platform Seqid = chr1;Source = AFFY_SNP_6;Type = SNV;Dbxref = URI:http://www.affymetrix.com;Platform_class = SNP_Array;Platform_name = Affymetrix Human SNP Array 6.0;
||
|
data-source |
Seqid, Source, Type, Dbxref, Data_type, Comment
This pragma provides details about the source data for the variants contained in this file. This could be links to the actual sequence reads in a trace archive, or links to a variant file in another format that has been converted to GVF
|
##data-source Source = MAQ;Type = SNV;Dbxref = SRA:SRA008175;Data_type = DNA sequence;Comment = NCBI Short Read Archive http://www.ncbi.nlm.nih.gov/Traces/sra;
||
|
phenotype-description |
Ontology, Term, Comment |
A description of the phenotype of the individual. This pragma can contain ontology-constrained terms, a free-text description of the individual's phenotype, or both
|
##phenotype-description Ontology = http://www.human-phenotype-ontology.org/human-phenotype-ontology.obo.gz;Term = acute myeloid leukemia;Comment = AML relapse;
||
|
ploidy |
Ontology, Term, Comment |
This pragma defines the ploidy for a given genome. This pragma can contain either ontology-constrained terms or a free-text description of the individual's ploidy. It is suggested that ontology-constrained terms use a subtype of the term PATO:0001374, which includes haploid, diploid, polyploid, triploid and so on
|
##ploidy chr22 1 49691432 diploid |
||
|
##ploidy chrY 1 57772954 haploid |
||
|
|
||
|
The pragmas defined by GVF may refer to the entire file or may limit their scope by use of tag-value pairs. For example, if a pragma applies only to SNVs that were called by Gigabayes on chromosome 13, then the tags Seqid = chr13;Source = Gigabayes;Type = SNV would indicate the scope. The Dbxref tag within a GVF pragma takes values of the form 'DBTAG:ID' and provides a reference for the information given by the pragma, whether that is the location of sequence files or a link to a paper describing a method. Tags beginning with uppercase letters are reserved for future use within the GFF/GVF specification, but applications are free to provide additional tags beginning with lowercase letters.
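The scoping rule, the 'DBTAG:ID' form of Dbxref and the uppercase/lowercase tag convention described above can be illustrated with a short Python sketch. This is not a reference parser: the function names (parse_pragma, in_scope, split_dbxref, is_custom_tag) are hypothetical, and the sketch assumes that pairs are separated by ';', that tag and value are separated by the first '=', and that values contain no ';'. It tolerates the spaces around '=' that appear in the Table 2 examples, but it does not handle the positional form used in the ploidy examples.

```python
# Hypothetical helpers; names and parsing assumptions are illustrative only.
SCOPE_TAGS = ("Seqid", "Source", "Type")


def parse_pragma(line):
    """Split a '##name tag=value;...' line into (name, {tag: value})."""
    if not line.startswith("##"):
        raise ValueError("pragma lines start with '##'")
    name, _, body = line[2:].strip().partition(" ")
    tags = {}
    for pair in body.split(";"):
        if not pair.strip():
            continue  # tolerate a trailing ';'
        tag, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"not a tag-value pair: {pair!r}")
        tags[tag.strip()] = value.strip()
    return name, tags


def in_scope(tags, seqid, source, feature_type):
    """True if the pragma applies to a feature with these column values.

    Scope tags that are absent place no restriction, so a pragma with
    no scope tags applies to the entire file.
    """
    feature = {"Seqid": seqid, "Source": source, "Type": feature_type}
    return all(tags[t] == feature[t] for t in SCOPE_TAGS if t in tags)


def split_dbxref(value):
    """Split 'DBTAG:ID' at the first ':' so that IDs may contain colons."""
    dbtag, _, identifier = value.partition(":")
    return dbtag, identifier


def is_custom_tag(tag):
    """True for application-defined tags, which begin with a lowercase letter."""
    return tag[:1].islower()
```

Applied to the source-method example in Table 2, parse_pragma returns the pragma name together with a dictionary of its tags, and in_scope can then be used to decide whether that pragma describes a given feature line.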
||
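The file-date row of Table 2 requires the ISO 8601 form YYYY-MM-DD. A minimal Python sketch of one way to check such a value follows; the function name parse_file_date is hypothetical, and the strict four-two-two digit pattern is an assumption made here to exclude other ISO 8601 spellings.

```python
import re
from datetime import date

# Assumption: exactly four, two and two digits separated by hyphens.
_DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_file_date(value):
    """Return a datetime.date for a YYYY-MM-DD string, else raise ValueError."""
    match = _DATE_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"file-date is not in YYYY-MM-DD form: {value!r}")
    year, month, day = map(int, match.groups())
    # date() itself raises ValueError for impossible dates such as month 13.
    return date(year, month, day)
```

The pattern check rejects values in other layouts, and constructing the date object rejects values that match the layout but are not real calendar dates.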
|
Reese et al. Genome Biology 2010 11:R88 doi:10.1186/gb-2010-11-8-r88 |
||