Table 1

Gold standard sets

| Organism | GSTD | References | No. of ORFs | No. of ORF pairs | Comments/details |
|---|---|---|---|---|---|
| S. cerevisiae | P | [13, 22] | 732 | 3,400 | Derived from MIPS [42] complex catalog. We excluded ribosomal proteins to avoid bias towards extreme codon usage similarity of their genes |
| S. cerevisiae | N | [13, 22] | 2,760 | 1,442,691 | Pairs of proteins that are not localized in the same cell compartment. We excluded ribosomal proteins |
| P. falciparum | P | [43] | 352 | 7,689 | Protein pairs within the same KEGG [19] pathway |
| P. falciparum | N | [43] | 354 | 27,367 | Protein pairs with KEGG information, excluding pairs in the gold standard positive set |
| E. coli | P | [44] | 2,196 | 7,063 | Pull-down assay using a His-tagged ORF library |
| E. coli | N | - | 3,703 | 4,437,833 | We compiled a set of protein pairs that are not in the gold standard positive set, given that at least one protein from each pair is copurified with an associate protein by Arifuzzaman et al. [44] |

Each set comprises only ORFs that could be associated with their genomic sequences using the names that were provided in the original references. Self-interactions were excluded from both the training and the testing process. GSTD, gold standard dataset; N, negative; P, positive.
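
As an illustration of the pathway-based construction described for P. falciparum, the following minimal sketch builds positive pairs from proteins that share a pathway and negative pairs from the remaining annotated proteins, never generating self-pairs. The function name `build_gold_standard`, the `pathways` input format, and the toy ORF names are assumptions made for this example. It is not the procedure used to produce the counts in the table, which may involve further filtering.

```python
from itertools import combinations


def build_gold_standard(pathways):
    """Return (positives, negatives) as sets of unordered ORF pairs.

    pathways: dict mapping a pathway id to an iterable of ORF names
    (a hypothetical input format chosen for this sketch).
    """
    positives = set()
    annotated = set()
    for members in pathways.values():
        members = sorted(set(members))
        annotated.update(members)
        # combinations() over a sorted list yields each pair once as (a, b)
        # with a < b, so self-pairs such as (a, a) never appear.
        positives.update(combinations(members, 2))

    # Candidate negatives: every pair of annotated ORFs not seen as a positive.
    all_pairs = set(combinations(sorted(annotated), 2))
    negatives = all_pairs - positives
    return positives, negatives


# Toy input: two pathways that share one ORF.
toy_pathways = {"pw1": ["A", "B", "C"], "pw2": ["C", "D"]}
toy_positives, toy_negatives = build_gold_standard(toy_pathways)
```

Storing each pair as a sorted tuple keeps (A, B) and (B, A) from being counted twice, so the positive and negative sets are disjoint by construction.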

Najafabadi and Salavati Genome Biology 2008 9:R87   doi:10.1186/gb-2008-9-5-r87
