Table 1

Data collection description: summary of the data sources

| Data type | Description | Representation |
|---|---|---|
| Gene expression | Expression data from oligonucleotide arrays for 13,566 genes across 55 mouse tissues (Zhang et al. [21]) | Median-subtracted, arcsinh intensity measurements |
| Gene expression | Expression data from Affymetrix arrays for 18,208 genes across 61 mouse tissues (Su et al. [44]) | gcRMA-condensed intensity measurements |
| Gene expression | Tag counts at quality 0.99 cut-off from 139 SAGE libraries for 16,726 genes [45] | Average and total tag counts |
| Sequence patterns | Protein sequence pattern annotations from Pfam-A (release 19) for 15,569 genes with 3,133 protein families [46] | Binary annotation patterns |
| Sequence patterns | Protein sequence pattern annotations from InterPro (release 12.1) for 16,965 genes with 5,404 sequence patterns [47] | Binary annotation patterns |
| Protein interactions | Protein-protein interactions from OPHID for 7,125 genes [28] (downloaded on 20 April 2006) | Binary interaction patterns and shortest path between genes |
| Phenotypes | Phenotype annotations from MGI for 3,439 genes with 33 phenotypes [48] (downloaded on 21 February 2006 from [49]) | Binary annotation patterns |
| Conservation profile | Conservation pattern from Ensembl (v38) for 15,939 genes across 18 species [50] | Binary conservation patterns and conservation scores |
| Conservation profile | Conservation pattern from Inparanoid (v4.0) for 15,703 genes across 21 species [51] | Binary conservation patterns and Inparanoid scores |
| Disease associations | Disease associations from OMIM for 1,938 genes to 2,488 diseases/phenotypes [52,53] (downloaded on 6 June 2006 from [54]) | Binary annotation patterns |

gcRMA, robust multi-array analysis with background adjustment for GC content of probes; OMIM, Online Mendelian Inheritance in Man; OPHID, Online Predicted Human Interaction Database; SAGE, serial analysis of gene expression.

Peña-Castillo et al. Genome Biology 2008 9(Suppl 1):S2   doi:10.1186/gb-2008-9-s1-s2
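
Two of the representations named in Table 1 can be illustrated with a minimal sketch. It is not the authors' pipeline, and it rests on assumptions the table does not confirm: that "median-subtracted, arcsinh" means an arcsinh transform followed by subtraction of each gene's median across tissues, and that a "binary annotation pattern" is a 0/1 vector per gene over the full set of annotation terms. The function names and the gene and term identifiers are hypothetical.

```python
import math
from statistics import median


def arcsinh_median_subtract(intensities):
    """Map gene -> raw intensities across tissues to transformed values.

    Assumption: arcsinh is applied first, then the per-gene median of the
    transformed values is subtracted. The table does not state the order
    or the axis of the median.
    """
    result = {}
    for gene, values in intensities.items():
        transformed = [math.asinh(v) for v in values]
        centre = median(transformed)
        result[gene] = [t - centre for t in transformed]
    return result


def binary_patterns(annotations):
    """Map gene -> set of annotation terms to 0/1 vectors over all terms.

    Returns the sorted term list (the column order) and a dict
    gene -> list of 0/1 flags, one flag per term.
    """
    terms = sorted(set().union(*annotations.values()))
    matrix = {
        gene: [1 if term in gene_terms else 0 for term in terms]
        for gene, gene_terms in annotations.items()
    }
    return terms, matrix


# Hypothetical toy inputs, not values from the data sets in Table 1.
expression = arcsinh_median_subtract({"geneA": [0.0, 10.0, 100.0]})
terms, patterns = binary_patterns(
    {"geneA": {"famX"}, "geneB": {"famX", "famY"}}
)
```

In this sketch the same `binary_patterns` helper would cover any source whose representation is listed as binary annotation patterns, since each reduces to a gene-by-term membership matrix under the stated assumption.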