Table 1

Taxonomic distribution of genomic datasets used in this study

Set                      Taxonomic group  No. of species  No. of sequences

Fully sequenced genomes  Archaea                      19            42,079
                         Bacteria                    160           477,069
                         Eukarya                      19           221,950
                         Total                       198           741,098

Partial genomes          Protists                     17            43,550
                         Viridiplantae                76           221,896
                         Fungi                        27            62,528
                         Arthropods                   17            22,528
                         Nematodes                    31            95,341
                         Lophotrochozoa                4            10,365
                         Deuterostomes                21            90,243
                         Total                       193           546,451

Peregrín-Álvarez and Parkinson Genome Biology 2007 8:R238   doi:10.1186/gb-2007-8-11-r238
