Open Access Research

The genomic landscape shaped by selection on transposable elements across 18 mouse strains

Christoffer Nellåker1*, Thomas M Keane2, Binnaz Yalcin3, Kim Wong2, Avigail Agam1,3, T Grant Belgard1,4,5, Jonathan Flint3, David J Adams2, Wayne N Frankel6 and Chris P Ponting1,2*

Author Affiliations

1 MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, South Parks Road, Oxford, OX1 3PT, UK

2 The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1HH, UK

3 Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, OX3 7BN, UK

4 University of California, Los Angeles, California, 90095, USA

5 National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA

6 The Jackson Laboratory, Bar Harbor, Maine 04609, USA

Genome Biology 2012, 13:R45  doi:10.1186/gb-2012-13-6-r45

Published: 15 June 2012

Additional files

Additional file 1:

Supplementary Table 1. Identifiers for mice sequenced in this study.

Format: DOC Size: 40KB

Open Data

Additional file 2:

Supplementary Table 2. RepBase sequences used as probes for identification of TEVs.

Format: XLS Size: 45KB

Additional file 3:

Supplementary Figure 1. Proportions of TEVs along different lineages of the phylogeny shown in Figure 1. (a, b) Proportions of TEV classes in the TE superfamilies and ERV subfamilies, respectively, on the 129S1/SvImJ and 129P2/OlaHsd lineages. (c, d) Proportions of TEV classes in the TE superfamilies and ERV subfamilies, respectively, on the A/J and BALB/cJ lineages. (e, f) Proportions of TEV classes in the TE superfamilies and ERV subfamilies, respectively, on the NOD/ShiLtJ and AKR/J lineages. Total numbers of predicted TEVs occurring between neighboring branch nodes are indicated below the x-axis. (g, h) Proportions of TEV classes in the TE superfamilies and ERV subfamilies, respectively, that are private to each strain. Numbers of TEVs called as private to each strain are indicated above the plots.

Format: PDF Size: 252KB

Additional file 4:

Supplementary Table 3. Summary of validation results. Percentages in parentheses denote the false negative rate estimated from concordance between 129P2/OlaHsd, 129S1/SvImJ and 129S5/SvEvBrd strains.

Format: DOC Size: 30KB

Additional file 5:

Supplementary Table 4. PCR validation results for B6- TEV true positives.

Format: XLS Size: 44KB

Additional file 6:

Supplementary Figure 2. Representative PCR gel images for one ERV (chr9:98,366,615-98,366,616), one LINE (chr10:23,570,601-23,570,602), and one SINE (chr1:162,157,648-162,157,649). PCR was carried out across eight strains: A/J, AKR/J, BALB/cJ, C3H/HeJ, C57BL/6J, CBA/J, DBA/2J and LP/J. Hyperladder II was used as the size marker.

Format: PPT Size: 380KB

Additional file 7:

Supplementary Figure 3. Gene annotation biases for TEV occurrence. Gene annotations that are significantly enriched (green shades) or depleted (red shades) for TEV insertions in exons, introns, 5 kb flanking regions or intergenic regions, after accounting for GC content, chromosome and lengths, and after correcting for multiple testing. (a, b) Gene annotations are from either the Gene Ontology (slim set) (a) or the Mouse Genome Informatics phenotypes associated with gene disruptions (b). SINE TEVs show a pattern of enrichments and depletions that is the complement of the patterns observed for LINE and ERV TEVs.

Format: PDF Size: 207KB

Additional file 8:

Supplementary Table 5. Intronic TEVs associated with differential gene expression across strains. TEVs found to be associated with differential expression, assessed against a background of all TEVs with associated expression data and an equal sampling of genes with no TEV insertions.

Format: XLS Size: 69KB

Additional file 9:

Supplementary Table 6. Intronic TEVs with associated genes' brain RNA-Seq expression values. Expression levels of the associated Ensembl genes with and without TEVs differ significantly, as determined by ANOVA of log expression values with an FDR < 0.05.

Format: XLS Size: 551KB

Additional file 10:

Supplementary Table 7. Intergenic TEVs with associated upstream and downstream genes' brain RNA-Seq expression values. Expression levels of the associated Ensembl genes with and without TEVs differ significantly, as determined by ANOVA of log expression values with an FDR < 0.05.

Format: XLS Size: 1.2MB

Additional file 11:

Supplementary Figure 4. Flow chart outlining how structural variants and B6+ TEV calls were classified according to various superfamily classes and whether they were full-length.

Format: PPT Size: 186KB

Additional file 12:

Supplementary Figure 5. Schematic overview of the bootstrapping sampling method used to generate the high confidence list of gene expression changes associated with TEVs.

Format: PNG Size: 819KB

Additional file 13:

Supplementary file 1. Tab-delimited file listing all TEVs with their strain distribution patterns. B6+ TEVs are denoted by the default '1' if the strain matches the reference and by 'DEL' if the TEV is called as absent in a strain. B6- TEVs are denoted by the default '0' if the strain matches the reference and by 'INS' if the TEV is called as inserted in a strain. We conservatively estimate there to be 103,798 TEVs; the files classify 110,930 TEVs because some TEVs are annotated as different subfamilies in different strains (for example, LINE versus LINE_frag).

Format: TGZ Size: 1.2MB
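
The genotype coding described for Additional file 13 ('1'/'DEL' for B6+ TEVs, '0'/'INS' for B6- TEVs) can be turned into a simple presence/absence call per strain. The following is a minimal Python sketch under stated assumptions: the column layout (a header row, one leading annotation column, then one column per strain), the function names and the example rows are all hypothetical and are not taken from the actual file.

```python
import csv
import io

# Genotype codes as described for Additional file 13.
PRESENT_CODES = {"1", "INS"}  # TE present: B6+ default, or B6- insertion call
ABSENT_CODES = {"0", "DEL"}   # TE absent: B6- default, or B6+ deletion call


def te_present(code):
    """Return True/False for a known code, or None for anything else."""
    code = code.strip()
    if code in PRESENT_CODES:
        return True
    if code in ABSENT_CODES:
        return False
    return None


def strain_distribution(handle, n_annotation_cols=1):
    """Yield (annotation, {strain: True/False/None}) for each data row.

    ASSUMPTION: the first row is a header naming one column per strain,
    preceded by n_annotation_cols annotation columns. Adjust this to the
    real layout of the file before use.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    strains = header[n_annotation_cols:]
    for row in reader:
        annotation = tuple(row[:n_annotation_cols])
        codes = row[n_annotation_cols:]
        yield annotation, {s: te_present(c) for s, c in zip(strains, codes)}


# Made-up rows for illustration only (not real data from the study).
EXAMPLE = (
    "tev_id\tA/J\tAKR/J\n"
    "b6plus_example\t1\tDEL\n"
    "b6minus_example\t0\tINS\n"
)

rows = list(strain_distribution(io.StringIO(EXAMPLE)))
for annotation, calls in rows:
    print(annotation[0], calls)
```

Mapping both coding schemes onto a single presence/absence value makes B6+ and B6- TEVs directly comparable across strains; unrecognised codes are returned as None rather than guessed.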