Open Access Research

Genomic diversity of the human intestinal parasite Entamoeba histolytica

Gareth D Weedall1, C Graham Clark2, Pia Koldkjaer1, Suzanne Kay1, Iris Bruchhaus3, Egbert Tannich3, Steve Paterson1 and Neil Hall1*

Author Affiliations

1 Institute of Integrative Biology, University of Liverpool, Biosciences building, Crown Street, Liverpool L69 7AH, UK

2 Pathogen Molecular Biology Department, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK

3 Bernhard Nocht Institute for Tropical Medicine, Bernhard-Nocht-Str. 74, 20359 Hamburg, Germany

Genome Biology 2012, 13:R38  doi:10.1186/gb-2012-13-5-r38

Published: 25 May 2012

Additional files

Additional file 1:

Coverage of genes in sequenced strains. Histograms of the proportion of each E. histolytica gene that is covered. The proportions of genes with 0 to 10%, 11 to 20%, 21 to 30%, 31 to 40%, 41 to 50%, 51 to 60%, 61 to 70%, 71 to 80%, 81 to 90% and 91 to 100% of their sequence covered are plotted. Most genes are well covered (91 to 100% of their length) by the sequence libraries.

Format: PDF Size: 5KB

Additional file 2:

Table of single nucleotide variants in E. histolytica strains relative to the HM-1:IMSS reference genome. SNP calls relative to the HM-1:IMSS reference genome.

Format: XLS Size: 15.9MB

Additional file 3:

Additional analyses carried out to test the effects of SNP calling parameters, to verify called SNPs and to simulate the four-haplotypes test. These analyses test the effects of varying the SNP calling parameters on the number of SNPs called in the Rahman strain, verify the SNPs called in the Rahman strain by comparison with an independently generated dataset, and simulate the four-haplotypes test under models of varying recombination and gene conversion.

Format: DOCX Size: 226KB

Additional file 4:

Table of high-quality candidate marker SNPs. Candidate markers are defined as SNPs that vary between strains but are homozygous within each strain, occur in coding sequences and have a base called in every strain sequenced.

Format: XLS Size: 1.6MB
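As an illustration only, the marker definition given for Additional file 4 can be written as a simple filter. This is a minimal sketch, not the authors' pipeline: the function name `is_candidate_marker`, the genotype representation (a dict mapping strain name to a tuple of called alleles, or `None` where no base was called) and the example calls are all assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation: alleles called at one site in one strain,
# or None when no base could be called in that strain.
Genotype = Optional[Tuple[str, ...]]


def is_candidate_marker(genotypes: Dict[str, Genotype],
                        in_coding_sequence: bool) -> bool:
    """Apply the four criteria stated for Additional file 4 to one SNP site."""
    # Criterion: the SNP occurs in a coding sequence.
    if not in_coding_sequence:
        return False
    # Criterion: a base is called in every strain sequenced.
    if any(call is None for call in genotypes.values()):
        return False
    # Criterion: homozygous within each strain (one distinct allele per strain).
    if any(len(set(call)) != 1 for call in genotypes.values()):
        return False
    # Criterion: varies between strains (more than one allele overall).
    alleles = {call[0] for call in genotypes.values()}
    return len(alleles) > 1


# Made-up calls for two strains named in the article; not real data.
example_site = {"HM-1:IMSS": ("A", "A"), "Rahman": ("G", "G")}
print(is_candidate_marker(example_site, in_coding_sequence=True))
```

Checking the "no call" condition before the homozygosity condition keeps the later steps from having to handle `None` values.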

Additional file 5:

Table of polymorphic genes in E. histolytica strains relative to the HM-1:IMSS reference genome. Counts of homozygous nonsynonymous and synonymous SNPs in polymorphic genes.

Format: XLS Size: 1MB

Additional file 6:

Table of putative nonsense mutations in E. histolytica genes. Table of SNPs causing putative nonsense mutations in E. histolytica strains. The table includes both homozygous and heterozygous SNPs.

Format: XLS Size: 108KB

Additional file 7:

Highly polymorphic genes in E. histolytica. Table of genes with five or more homozygous nonsynonymous SNPs in one or more strains, with details of protein features.

Format: XLS Size: 30KB

Additional file 8:

Genes showing deep coverage in one or more E. histolytica strains. Table of genes (with RPKM values) with more than two-fold greater depth of coverage than the average of HM-1A and HM-1B in one or more strains, representing putative high copy number genes in that strain.

Format: XLS Size: 246KB

Additional file 9:

Genes putatively missing from one or more E. histolytica strains. Table of genes (with RPKM values) for which RPKM < 1 and this value is < 50% of the average of HM-1A and HM-1B, representing putatively missing genes in that strain.

Format: XLS Size: 91KB
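The coverage thresholds described for Additional files 8 and 9 can be sketched as a small classifier. This is a hedged illustration rather than the authors' code: the function names `rpkm` and `classify_gene` and the labels they return are hypothetical, and the RPKM helper uses the standard definition (reads per kilobase of gene per million mapped reads), which is assumed here to match the values in the tables.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def classify_gene(rpkm_strain: float, rpkm_hm1a: float, rpkm_hm1b: float) -> str:
    """Label one gene in one strain using the stated thresholds.

    The reference level is the average RPKM of the HM-1A and HM-1B libraries.
    """
    reference = (rpkm_hm1a + rpkm_hm1b) / 2.0
    # Additional file 9 criterion: RPKM < 1 and < 50% of the reference average.
    if rpkm_strain < 1.0 and rpkm_strain < 0.5 * reference:
        return "putatively_missing"
    # Additional file 8 criterion: more than two-fold the reference average.
    if rpkm_strain > 2.0 * reference:
        return "putative_high_copy"
    return "unremarkable"


# Made-up values for illustration; not taken from the additional files.
print(classify_gene(rpkm_strain=0.2, rpkm_hm1a=10.0, rpkm_hm1b=12.0))
print(classify_gene(rpkm_strain=50.0, rpkm_hm1a=10.0, rpkm_hm1b=12.0))
```

Requiring both conditions for a putatively missing gene means a gene with low coverage in the HM-1A and HM-1B libraries as well is not flagged simply because its RPKM is below 1.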