Open Access Method

MetaMerge: scaling up genome-scale metabolic reconstructions with application to Mycobacterium tuberculosis

Leonid Chindelevitch1,2, Sarah Stanley3,4, Deborah Hung3,4, Aviv Regev4,5,6 and Bonnie Berger1,2,4*

Author Affiliations

1 Department of Mathematics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA

2 Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32 Vassar Street, Cambridge, MA 02139, USA

3 Department of Molecular Biology, Massachusetts General Hospital, Simches Research Center, 185 Cambridge Street, Boston, MA 02114, USA

4 Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA

5 Department of Biology, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA

6 Howard Hughes Medical Institute, 4000 Jones Bridge Road, Chevy Chase, MD 20815, USA

Genome Biology 2012, 13:r6  doi:10.1186/gb-2012-13-1-r6

Published: 31 January 2012

Additional files

Additional file 1:

Reactions in models 1 and 2 differing by at most one metabolite. There are 29 pairs of reactions (each pair consisting of one reaction from model 1 and one reaction from model 2) that differ by at most one metabolite. These reactions are given in the representation used in the original models. Each pair is followed by a line of dashes.

Format: DOC Size: 19KB
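
The comparison behind this file can be illustrated with a minimal sketch. It assumes each reaction is reduced to the set of its metabolite names and reads "differ by at most one metabolite" as a symmetric difference of at most one element. Both choices are assumptions made for illustration; the criterion used by MetaMerge may also take stoichiometry or direction into account. The reaction contents below are hypothetical.

```python
def metabolite_difference(reaction_a, reaction_b):
    """Return the metabolites that appear in exactly one of the two reactions."""
    return set(reaction_a) ^ set(reaction_b)


def differs_by_at_most_one(reaction_a, reaction_b):
    """True if the two metabolite sets differ by zero or one metabolite."""
    return len(metabolite_difference(reaction_a, reaction_b)) <= 1


# Hypothetical example: the same reaction written with and without a proton.
reaction_model_1 = {"atp", "glc", "adp", "g6p"}
reaction_model_2 = {"atp", "glc", "adp", "g6p", "h"}

print(differs_by_at_most_one(reaction_model_1, reaction_model_2))
print(metabolite_difference(reaction_model_1, reaction_model_2))
```

Under this reading, a pair in which one metabolite is replaced by another would have a symmetric difference of two and would need a looser threshold.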

Additional file 2:

Enzymes predicted to have similar metabolic impact to that of known drug targets. There are 29 enzymes listed for ethambutol and 2 enzymes listed for isoniazid. Each enzyme is given with its gene ID and name, the set of reactions it is essential for in the initial models (uppercase metabolite names for [13], lowercase for [14]), its function, and its category. If a reaction for which an enzyme is essential is common to both models, only one model is chosen to represent it.

Format: XLS Size: 40KB

Additional file 3:

A sample Python session yielding a combined M. tuberculosis model. This file contains the step-by-step transcript of a Python session culminating in the creation of a combined M. tuberculosis model from the two original models and the production of an SBML file containing the combined model.

Format: TXT Size: 11KB

Additional file 4:

The full code of the MetaMerge algorithm implemented in Python. The code is divided into 14 modules, each of which contains multiple functions, as follows:

ClassDefinitions.py: the code for defining all the formats used by MetaMerge internally.
FeatureMatching.py: the code for matching species and reactions based on the available features.
FeaturePreparation.py: the code for extracting metabolite and reaction features from text.
GeneProcessing.py: the code for processing gene information.
MatchProcessing.py: the code for processing the metabolite and reaction matching matrices.
MetaboliteMatching.py: the code for generating and processing closely matching metabolites.
MetaMerge.py: an initializer for the other modules required by the MetaMerge algorithm.
MetaMergeCore.py: the user interface of MetaMerge for preparing the matching matrices.
ModelParsing.py: the user interface for parsing a metabolic model in Excel or SBML format.
NetworkMerging.py: the code for merging two networks based on their matching matrices.
OutputProcessing.py: the code for processing the output of the MetaMerge algorithm.
ReactionMatching.py: the code for generating and processing closely matching reactions.
Unrelated.py: the code for analyzing a metabolic network, not directly related to MetaMerge.
Utilities.py: the code of miscellaneous auxiliary functions used by the MetaMerge algorithm.

Additionally, the zipped directory contains a shelve file called Mappings, with KEGG and BioCyc identifiers for the metabolites in both M. tuberculosis models extracted by Jeremy Zucker, and the cleaned-up and extended Excel files Mycobacterium tuberculosis 1.xls and Mycobacterium tuberculosis 2.xls for models 1 [13] and 2 [14], respectively, containing additional annotation contributed by Marina Druz.

Format: ZIP Size: 334KB

Additional file 5:

A list of errors corrected in the original M. tuberculosis models. This file contains a list of errors detected in the original M. tuberculosis models [13,14] and corrected in the Excel files contained in Additional file 4. Most of these are typographical errors, but some are due to inconsistent notations in different parts of the original Excel files.

Format: TXT Size: 2KB

Additional file 6:

Model 1 (Beste et al. [13]) in SBML. The model contains 873 reactions and 753 metabolites. Each reaction is annotated with lower and upper bounds on its flux, the EC numbers for the enzymes catalyzing it, the Boolean expression containing the genes it requires, the name and chemical equation of the reaction, and the pathway to which it belongs, whenever these are known. Each metabolite is annotated with its abbreviation, official name, molecular formula, IUPAC name, CAS number, and BioCyc and KEGG database identifiers, whenever these are known.

Format: XML Size: 22.8MB

Additional file 7:

Model 2 (Jamshidi and Palsson [14]) in SBML. The model contains 937 reactions and 825 metabolites. Each reaction is annotated with its confidence score, the proteins needed to catalyze it, the Boolean expression containing the genes it requires, the name and chemical equation of the reaction, and the subsystem to which it belongs, whenever these are known. Each metabolite is annotated with its abbreviation, official name, molecular formula and charge, IUPAC name, CAS number, and BioCyc and KEGG database identifiers, whenever these are known.

Format: XML Size: 1.5MB
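
The reaction and metabolite counts quoted for the SBML models can be checked with a short script. The sketch below uses only the Python standard library and counts `reaction` and `species` elements regardless of the SBML namespace. It assumes that each metabolite is stored as exactly one `species` element, which may not hold if a model lists the same metabolite in several compartments. The toy document is hypothetical and stands in for the real files.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def count_reactions_and_species(source):
    """Count <reaction> and <species> elements in an SBML document.

    `source` is a file path or an open file object.
    """
    counts = Counter(local_name(el.tag) for el in ET.parse(source).iter())
    return counts["reaction"], counts["species"]


# Toy SBML document (hypothetical, not taken from the additional files).
TOY_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="A" compartment="c"/>
      <species id="B" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R1">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""

n_reactions, n_species = count_reactions_and_species(io.StringIO(TOY_SBML))
print(n_reactions, n_species)  # prints: 1 2
```

To check a downloaded model, pass the path of the SBML file instead of the `io.StringIO` object. The whole tree is loaded into memory, which should be acceptable for files of the sizes listed here.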

Additional file 8:

The combined M. tuberculosis model in SBML. The model contains 1,400 reactions and 1,017 metabolites. Each reaction is annotated with the corresponding information from the reactions in the original models that it corresponds to. Each metabolite is similarly annotated with the corresponding information from the metabolites in the original models that it corresponds to. When a reaction or metabolite in the combined model represents two or more reactions or metabolites from the same original model, their annotations are separated by 'or', whereas annotations that come from different models are separated by 'OR'.

Format: XML Size: 23.7MB
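
The 'or'/'OR' convention described above lends itself to a two-level split. The sketch below assumes that the separators appear surrounded by single spaces and that annotation values never contain " or " themselves. Neither assumption is confirmed by the file description, and the sample annotation string is hypothetical.

```python
def split_combined_annotation(annotation):
    """Split a combined annotation into per-model groups of values.

    The outer split on ' OR ' separates annotations from different original
    models; the inner split on ' or ' separates annotations from the same
    original model. Both splits are case-sensitive.
    """
    groups = annotation.split(" OR ")
    return [[value.strip() for value in group.split(" or ")] for group in groups]


# Hypothetical annotation: two names from one model, one name from the other.
example = "glucose or D-glucose OR glc-D"
print(split_combined_annotation(example))
# prints: [['glucose', 'D-glucose'], ['glc-D']]
```

The description does not say which model each group belongs to, so the function returns the groups in the order they appear and leaves that mapping to the caller.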

Additional file 9:

The MONGOOSE toolbox. The MONGOOSE (MetabOlic Network Growth OptimizatiOn Solved Exactly) toolbox [33] is a software suite we have developed, which gives certifiably correct results quickly and efficiently and is able to handle the largest metabolic model currently reconstructed. Its main features are the use of exact rational arithmetic, which avoids the risk of erroneous results due to rounding errors, as well as its ability to compress the metabolic network in order to speed up subsequent computations. This file describes in detail the algorithms underlying MONGOOSE [33].

Format: PDF Size: 154KB
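
The motivation for exact rational arithmetic can be seen in a toy steady-state check. The sketch below is not MONGOOSE code. It uses Python's standard `fractions` module to show how a metabolite balance that is exactly zero can appear non-zero in floating point. The coefficients and fluxes are hypothetical.

```python
from fractions import Fraction


def net_production(coefficients, fluxes):
    """Net production of one metabolite: sum of coefficient * flux."""
    return sum(c * v for c, v in zip(coefficients, fluxes))


# One metabolite, produced by reaction 1 (coefficient 3) and consumed by
# reaction 2 (coefficient -1). Fluxes 1/10 and 3/10 balance exactly.
coefficients = [3, -1]
float_fluxes = [0.1, 0.3]
exact_fluxes = [Fraction(1, 10), Fraction(3, 10)]

float_balance = net_production(coefficients, float_fluxes)
exact_balance = net_production(coefficients, exact_fluxes)

print(float_balance == 0)  # False: rounding leaves a tiny residual
print(exact_balance == 0)  # True: rational arithmetic is exact
```

In floating point such residuals are usually handled with a tolerance, and the choice of tolerance can change which reactions appear blocked or essential. Exact arithmetic removes that choice at the cost of slower computation.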
