Figure 2.

A histogram of gene clusters observed in exactly N of 13 H. influenzae strains, compared with the number of genes expected under the supragenome model (trained on all 13 strains). Over 1,400 genes were observed in all 13 strains, indicating a common core set of genes. Distributed genes appear in variable numbers of strains, from 1 to 12. Overall, the model fits the data well, though it underestimated the number of genes observed in exactly one strain and overestimated the number observed in exactly two strains.
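
The observed bars in a histogram like this can be tallied from a presence/absence table of gene clusters. The following is a minimal sketch of that tally only, not the authors' pipeline and not the supragenome model itself; the function name `strain_count_histogram`, the cluster and strain labels, and the toy data are all hypothetical.

```python
from collections import Counter


def strain_count_histogram(presence, n_strains):
    """Count gene clusters observed in exactly N strains, for N = 1..n_strains.

    `presence` maps a cluster id to the set of strains in which that
    cluster was observed (hypothetical input format).
    """
    counts = Counter(len(strains) for strains in presence.values())
    # Index 0 holds clusters seen in exactly 1 strain; the last entry
    # holds clusters seen in every strain (the core set).
    return [counts.get(n, 0) for n in range(1, n_strains + 1)]


# Hypothetical toy data with 3 strains (the figure uses 13).
toy = {
    "clusterA": {"s1", "s2", "s3"},
    "clusterB": {"s1"},
    "clusterC": {"s1", "s2", "s3"},
    "clusterD": {"s2", "s3"},
}

hist = strain_count_histogram(toy, 3)
print(hist)  # [1, 1, 2]
```

In this toy case the last entry counts the core clusters (present in every strain), and the earlier entries count the distributed clusters.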

Hogg et al. Genome Biology 2007 8:R103   doi:10.1186/gb-2007-8-6-r103