Open Access Highly Accessed Research

A metagenomic study of diet-dependent interaction between gut microbiota and host in infants reveals differences in immune response

Scott Schwartz (1,2), Iddo Friedberg (3,9), Ivan V Ivanov (4,5), Laurie A Davidson (1,4), Jennifer S Goldsby (4), David B Dahl (2), Damir Herman (6), Mei Wang (7), Sharon M Donovan (7) and Robert S Chapkin (1,3,8)*

Author Affiliations

1 Training Program in Biostatistics, Bioinformatics, Nutrition and Cancer, Texas A&M University, 155 Ireland Street, College Station, TX 77843, USA

2 Department of Statistics, Texas A&M University, 155 Ireland Street, College Station, TX 77843, USA

3 Department of Microbiology, Miami University, 700 East High St, Oxford, OH 45056, USA

4 Program in Integrative Nutrition and Complex Diseases, Texas A&M University, College Station, TX 77843, USA

5 Veterinary Physiology and Pharmacology, Texas A&M University, College Station, TX 77843, USA

6 Division of Hematology and Oncology, Winthrop P Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, 4301 W. Markham St, Little Rock, AR 72205, USA

7 Department of Food Science and Human Nutrition, 905 S. Goodwin Avenue, University of Illinois, Urbana, IL 61801, USA

8 Department of Microbial and Molecular Pathogenesis, Texas A&M Health Science Center, 8441 State Hwy 47, Bryan, TX 77807, USA

9 Computer Science and Software Engineering, Miami University, 700 East High St, Oxford, OH 45056, USA

Genome Biology 2012, 13:r32 doi:10.1186/gb-2012-13-4-r32

Published: 30 April 2012

Additional files

Additional file 1:

Figure S1. Overview of the analysis pipeline. (a) Stool samples were obtained from six breast-fed and six formula-fed infants. (b) Gut microbial DNA and host gut-epithelial mRNA were isolated; the DNA was sequenced and the mRNA was hybridized to microarrays. (c) Microbial DNA sequences were analyzed for functional content and taxa using MG-RAST and PhymmBL; gut-epithelial mRNA was analyzed for eukaryotic gene function using microarrays. (d) Significant correlations between gut-epithelium mRNA expression and metagenomic DNA frequency were determined using canonical correlation analysis (CCA) repeated on subsets of the host gene expression data.

Format: TIFF Size: 155KB

Open Data

Additional file 2:

Figure S2. MA-plot of the original log2-transformed raw CodeLink microarray data. Upper panel: for each probe, the x-axis shows the mean of the breast-fed (BF) and formula-fed (FF) group averages, and the y-axis shows the difference between the two group averages. The color bar shows the count density of the plotted data. BF samples exhibited systematically higher gene expression than FF samples. Lower panel: the same data after loess normalization, in which the data were adjusted by the loess fit (blue line) shown in the upper panel. This normalization corrected the systematic increase in BF gene expression relative to FF gene expression seen in the upper panel.

Format: TIFF Size: 286KB

Open Data
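
The MA-plot and loess adjustment described for Figure S2 can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the BF minus FF sign convention, the `frac` span, and the choice to subtract the fitted trend from the difference values are all assumptions made for illustration. The local linear fit with tricube weights is a simplified stand-in for a full loess routine.

```python
def ma_values(bf_mean, ff_mean):
    """Per-probe MA values from log2 group means.

    A = mean of the two group averages (x-axis of the MA-plot).
    M = difference of the two group averages (y-axis); BF minus FF
    is an assumed sign convention.
    """
    a = [(b + f) / 2.0 for b, f in zip(bf_mean, ff_mean)]
    m = [b - f for b, f in zip(bf_mean, ff_mean)]
    return a, m


def loess_fit(x, y, frac=0.5):
    """Simplified loess: local linear regression with tricube weights.

    Returns the fitted value at every x. `frac` is the fraction of
    points used as the local neighbourhood (an assumed default).
    """
    n = len(x)
    k = min(n, max(3, int(round(frac * n))))
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = max(dists[k - 1], 1e-12)  # bandwidth: distance to k-th neighbour
        w = [max(0.0, 1.0 - (abs(xi - x0) / h) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalize_m(bf_mean, ff_mean, frac=0.5):
    """Subtract the loess trend from M so that it is centred on zero."""
    a, m = ma_values(bf_mean, ff_mean)
    trend = loess_fit(a, m, frac)
    return [mi - ti for mi, ti in zip(m, trend)]


# Toy data (hypothetical): BF is systematically higher than FF.
ff_toy = [0.5 * i for i in range(20)]
bf_toy = [f + 1.0 for f in ff_toy]
corrected_toy = normalize_m(bf_toy, ff_toy)
```

In this toy case the systematic BF offset is absorbed by the fitted trend, so the corrected differences scatter around zero, which is the effect the lower panel of Figure S2 describes.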

Additional file 3:

Table S1. Host GO enrichment analysis.

Format: DOC Size: 78KB

Open Data

Additional file 4:

Figure S3. Phyla distribution using 16S rRNA analysis (top) and PhymmBL classification of all reads (bottom). X-axis: sample number (samples 1 to 6 are BF, 7 to 12 are FF). Y-axis: percentage of total assigned reads. See Additional file 5 for the number of assigned reads.

Format: TIFF Size: 8.1MB

Open Data

Additional file 5:

Table S2. Counts of mapped microbiome sequences.

Format: DOC Size: 39KB

Open Data

Additional file 6:

Figure S4. Example of canonical correlations of random gene sets, analogous to the random gene set shown in Figure 4. One thousand random gene sets were sampled and analyzed; the first 5 are shown.

Format: PDF Size: 2.9MB

Open Data
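
The random gene-set sampling mentioned for Figure S4 can be sketched as follows. This is an illustrative Python sketch, not the authors' procedure. The gene names, the set size of 3 (chosen to match the "gene triples" mentioned elsewhere in these files), the seed, and the empirical p-value helper are all assumptions. The score attached to each set, such as a canonical correlation, would come from a separate analysis that is not shown here.

```python
import random


def sample_gene_sets(gene_pool, set_size=3, n_sets=1000, seed=0):
    """Draw n_sets random gene sets, without replacement within a set."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [tuple(rng.sample(gene_pool, set_size)) for _ in range(n_sets)]


def empirical_p_value(observed, null_scores):
    """Share of random-set scores at least as large as the observed score.

    The +1 in numerator and denominator keeps the estimate above zero.
    """
    hits = sum(1 for s in null_scores if s >= observed)
    return (hits + 1) / (len(null_scores) + 1)


# Hypothetical gene identifiers, for illustration only.
gene_pool = ["gene_%d" % i for i in range(100)]
random_sets = sample_gene_sets(gene_pool)
first_five = random_sets[:5]  # mirrors "the first 5 are shown"
```

Comparing the score of a curated gene set against the scores of the random sets is one common way to judge whether the curated set stands out; whether the authors used exactly this comparison is not stated here.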

Additional file 7:

Figure S5. Example of the best performing genes in random gene sets, analogous to the random gene set shown in Figure 5. One thousand random gene sets were sampled and analyzed; the first 5 are shown.

Format: PDF Size: 2.3MB

Open Data

Additional file 8:

Data set 1. Discrete sets of biomarkers (genes) known to be involved in intestinal biology (459).

Format: CSV Size: 13KB

Open Data

Additional file 9:

Data set 2. Discrete sets of biomarkers (genes) known to be involved in immunity and defense (660).

Format: CSV Size: 19KB

Open Data

Additional file 10:

Table S3. Breakdown of sequencing depth in terms of average number of reads across samples mapped to SEED categories.

Format: DOC Size: 36KB

Open Data

Additional file 11:

Supplemental protocol. Canonical correlation calculations.

Format: PDF Size: 97KB

Open Data

Additional file 12:

Figure S6. Principal components analysis (PCA) of the virulence characteristics combined with all host gene triples. Top panel: host intestinal biology genes. Middle panel: immunity and defense genes. Bottom panel: random genes. Each plot shows the proportion of variation explained by the first and second principal components together versus the proportion explained by the second principal component alone. The analyses characterize the lower dimensional structure underlying the data. When combined with the virulence characteristics, the immunity and defense genes (middle panel) generally exhibit a simpler latent structure than the other gene sets (top and bottom panels), as judged by the slight northeast shift in the point cloud. The latent structure identified by PCA need not reflect a relationship between the virulence characteristics and the host genes. If it does, the immunity and defense genes are slightly more promising as a set for future canonical correlation analysis (CCA) aimed at uncovering simple and strong relationships between the metagenomic and host transcriptome data. In this way, PCA may be used as a screening device to identify promising gene triples for CCA.

Format: TIFF Size: 350KB

Open Data
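
The two quantities plotted in Figure S6 (variation explained by the first two principal components together, and by the second alone) can be computed as in the minimal Python sketch below. It is not the authors' implementation. The input layout (one row per sample, with columns holding the virulence characteristics plus one host gene triple), the use of centered but unscaled columns, and the power-iteration solver are assumptions made to keep the example self-contained.

```python
import math


def covariance(rows):
    """Sample covariance matrix of the columns (rows are samples)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1)
             for j in range(p)] for i in range(p)]


def top_eigen(mat, iters=1000):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix."""
    p = len(mat)
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary starting vector
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-14:  # matrix is numerically zero
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(a * b for a, b in zip(v, mv)), v  # Rayleigh quotient


def explained_by_first_two(rows):
    """Proportions of total variance explained by PC1 and by PC2."""
    cov = covariance(rows)
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))
    if total <= 0:
        raise ValueError("data have no variance")
    lam1, v1 = top_eigen(cov)
    # Deflation: remove the first component, then find the next one.
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(p)]
                for i in range(p)]
    lam2, _ = top_eigen(deflated)
    return lam1 / total, lam2 / total


# Toy data (hypothetical): 4 samples, 2 variables.
toy_rows = [[1.0, 0.0], [-1.0, 0.0], [0.0, 0.5], [0.0, -0.5]]
pc1, pc2 = explained_by_first_two(toy_rows)
point = (pc1 + pc2, pc2)  # one point in a Figure S6 style scatter
```

Running `explained_by_first_two` once per gene triple would give one point per triple, so a cloud shifted toward higher values corresponds to the "simpler latent structure" the caption describes.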