Genomic DNA k-mer spectra: models and modalities
* Corresponding author: Benny Chor benny@cs.tau.ac.il
1 School of Computer Science, Tel Aviv University, Klausner St, Ramat-Aviv, Tel-Aviv 39040, Israel
2 School of Physics and Astronomy, Tel Aviv University, Klausner St, Ramat-Aviv, Tel-Aviv 39040, Israel
3 European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
Genome Biology 2009, 10:R108 doi:10.1186/gb-2009-10-10-r108
Published: 8 October 2009
Additional files
Additional data file 1:
Top: 9-mer spectra of human chromosomes 1, 6, and 20 (left to right). Bottom: 11-mer spectra of the same chromosomes (left to right). All six spectra are multimodal.
Format: PDF Size: 57KB
Additional data file 2:
Whole genome spectra of human (top) and opossum (bottom); 9-mers (left) and 11-mers (right). All four spectra are multimodal.
Format: PDF Size: 61KB
Additional data file 3:
From top-left: Escherichia coli, Aeropyrum pernix, zebrafish (Danio rerio), pufferfish (Tetraodon nigroviridis), Arabidopsis thaliana, bee (Apis mellifera), nematode (Caenorhabditis elegans), yeast (Saccharomyces cerevisiae), and sea squirt (Ciona savignyi). All spectra are unimodal.
Format: JPEG Size: 374KB
Additional data file 4:
Chicken (k = 10), platypus (Ornithorhynchus anatinus, an egg laying mammal; k = 10), frog (k = 11), and lizard (k = 11). All four k-mer spectra are multimodal.
Format: JPEG Size: 104KB
Additional data file 5:
Each plot is partitioned into 8-mers that contain the CpG dimer (colored green) and those that do not (colored blue). The CpG-containing 8-mers form the left-most part of the multimodal spectra for (a) human and (b) chicken. In the two non-tetrapod species, (c) C. elegans (nematode) and (d) pufferfish, there is no such effect.
Format: JPEG Size: 934KB
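The partition described for Additional data file 5 can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `kmer_counts`, `spectrum`, and `split_by_cpg` are hypothetical, only one strand is counted, and k-mers that never occur in the sequence are left out of the histogram. All of these are simplifying assumptions.

```python
from collections import Counter

DNA = set("ACGT")


def kmer_counts(seq, k):
    """Count overlapping k-mers on one strand, skipping any window
    that contains a non-ACGT symbol (for example N)."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= DNA:
            counts[kmer] += 1
    return counts


def spectrum(counts):
    """Map each abundance to the number of distinct k-mers having it.
    k-mers with zero occurrences are not represented (assumption)."""
    return Counter(counts.values())


def split_by_cpg(counts):
    """Return two spectra: k-mers containing the CG dimer, and the rest."""
    with_cpg = {m: c for m, c in counts.items() if "CG" in m}
    without_cpg = {m: c for m, c in counts.items() if "CG" not in m}
    return spectrum(with_cpg), spectrum(without_cpg)


if __name__ == "__main__":
    toy = "ACGACGTT"  # toy sequence, not real genomic data
    green, blue = split_by_cpg(kmer_counts(toy, 3))
    print(dict(green), dict(blue))
```

On real genomes the two returned histograms would correspond to the green and blue parts of each plot. Whether the CpG-containing part sits at the low-abundance end depends on the species, as the caption notes.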
Additional data file 6:
Whole genome (k = 10), all introns (k = 10), all 3' UTRs (k = 10), all exons (k = 9), all 5' UTRs (k = 8), all 600-base-long promoters (k = 6), all 1,000-base-long promoters (k = 7), all 5,000-base-long promoters (k = 7).
Format: JPEG Size: 260KB
Additional data file 7:
Additional information includes taxonomical classification, usable length, percentage C+G content, CpG suppression (measured by ρCG) and whether the k-mer spectrum is observed to have unimodal or multimodal behavior. It also specifies web sites from which sequences were downloaded [22-27], and where different functional genomic regions were downloaded from [28-30].
Format: PDF Size: 59KB
