Additional file 1.

Figure S1. Distribution of unclassified genera in 22 habitats. Sequences that could not be classified at an RDP confidence threshold of 0.5 were assigned to unclassified genera. Unclassified reads account for a relatively small proportion of the total reads in the majority of the samples.

Figure S2. Phylum profiling of 22 human habitats. The average relative abundance of phyla in each habitat was measured as the fraction of total 16S rRNA gene sequences. Each color represents a phylum. (A) Firmicutes, Actinobacteria, and Proteobacteria are the major phyla identified in the human body. (B) Phyla accounting for <0.5% of the total phyla are shown. Preterm baby stool in this dataset contains no phyla below the 0.5% threshold, so no data are plotted for it. The total fractions of the phyla <0.5% in this figure are listed on top of the plot.

Figure S3. Accumulation curves at the genus level. The only difference between Figure S3A and Figure 1 is that all the samples were rarefied to 1,000 reads in Figure S3A. The accumulation curves exhibit similar patterns in both figures. Figure S3B shows stool richness at different sequencing depths. Sixty stool samples with >9,000 reads were rarefied to 1,000, 3,000, 6,000, and 9,000 reads. Both deep sequencing and a large number of subjects are required to detect all the possible taxa.

Figure S4. The association of sequencing depth and sample frequency. The x-axis shows the rank abundance of each genus and the y-axis shows the number of subjects who share the genus. Sixty stool samples with >9,000 reads were rarefied to 1,000, 3,000, 6,000, and 9,000 reads. The points showing the abundance of each genus at different depths are linked by line segments. With increased sequencing depth, the number of subjects who share the same genus, including the minor genera, increases.

Figure S5. The relative abundances of taxa in each habitat and dispersal among subjects. Dispersal of a given genus is indicated by the sample prevalence of that genus on the x-axis. The average relative abundance (mean ± SE) of each genus is indicated on the y-axis. The most abundant genera generally have the highest prevalence. However, low-abundance genera can also be ubiquitous, and high-abundance genera can be confined to a limited number of subjects. Also see Figure 4.

Figure S6. The correlation of abundant taxa between their dominant habitats and less dominant sites. The abundances of Bacteroides from stool, Streptococcus from throat, and Lactobacillus from posterior fornix were compared with their abundances in the rest of the habitats. Taxon abundance is not correlated between major habitats (oral, skin, vaginal, stool), but it shows moderate correlation within the oral and vaginal sites.

Figure S7. Dynamics of Lactobacillus in vaginal habitats between visits. The relative abundance of Lactobacillus changes greatly between two visits. Vaginosis-related genera (Gardnerella, Prevotella, Atopobium) are over-represented after Lactobacillus loses its dominance. These subjects were asymptomatic and met the criteria of the HMP study.

Figure S8. Dynamics of Moraxella in anterior nares between visits. The relative abundance of Moraxella (colored orange) varies from 0% to 70% between visits.

Figure S9. The relative abundances of Bacteroidetes and Firmicutes in HMP stool samples and twin study stool samples. The relative abundances of the two major phyla, Bacteroidetes and Firmicutes, are plotted. The relative abundance of Bacteroidetes in HMP stool samples is significantly higher than in the obese and lean groups of the twin studies (P < 0.001).

Figure S10. Cluster analysis of the HMP dataset with data from other studies. (A) Clustering analysis of HMP stool and twin study stool samples. Hierarchical clustering was performed using Bray-Curtis dissimilarity and complete linkage. Red labels represent the HMP samples and blue labels represent twin study samples. (B) Clustering analysis of HMP saliva and Chinese saliva samples. Red: HMP samples; green: healthy controls among the Chinese saliva samples; blue: Chinese saliva samples from subjects with dental caries. The majority of the samples cluster by project rather than by health status.
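
The clustering named in the Figure S10 caption (Bray-Curtis dissimilarity with complete linkage) can be illustrated with a minimal sketch. This is not the authors' pipeline, which the caption does not describe; the function names, sample labels, and read counts below are hypothetical and chosen only to show how the two steps fit together.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def complete_linkage(profiles):
    """Agglomerative clustering with complete linkage.

    profiles: dict mapping a sample label to a list of per-genus counts.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of sample labels.
    """
    # Pairwise Bray-Curtis dissimilarities between individual samples.
    dist = {
        frozenset((p, q)): bray_curtis(profiles[p], profiles[q])
        for p, q in combinations(profiles, 2)
    }

    def linkage(pair):
        # Complete linkage: the distance between two clusters is the
        # largest dissimilarity between any pair of their members.
        c1, c2 = pair
        return max(dist[frozenset((p, q))] for p in c1 for q in c2)

    clusters = [(label,) for label in profiles]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters that are closest under complete linkage.
        c1, c2 = min(combinations(clusters, 2), key=linkage)
        merges.append((c1, c2, linkage((c1, c2))))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
    return merges


# Hypothetical read counts for three genera in four samples (assumed data).
toy_profiles = {
    "hmp_1": [70, 20, 10],
    "hmp_2": [65, 25, 10],
    "twin_1": [20, 70, 10],
    "twin_2": [25, 65, 10],
}

merges = complete_linkage(toy_profiles)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

In this toy input the two "hmp" samples and the two "twin" samples each pair up before the two pairs are joined, which mirrors the by-project grouping the caption reports. For real data, an established library routine would be a more practical choice than this hand-written loop.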

Zhou et al. Genome Biology 2013 14:R1   doi:10.1186/gb-2013-14-1-r1