Table 1

A comparison of the number of hits obtained by hmmsearch against the pufferfish database before and after the THoR process

| Domain name (SMART name) | N(SMART) | N(THoR) | N(THoR) - N(SMART) |
|---|---|---|---|
| 14-3-3 homologs (14_3_3) | 9 | 9 | 0 |
| Domains in Ataxins and HMG-containing proteins (AXH) | 6 | 6 | 0 |
| Breast cancer carboxy-terminal domain (BRCT) | 31 | 39 | 8 |
| Bromo domain (BROMO) | 89 | 89 | 0 |
| Bulb-type mannose-specific lectins (B_lectin) | 1 | 2 | 1 |
| Chromatin organization modifier domain (CHROMO) | 62 | 69 | 7 |
| Calpain-like thiol protease family (CysPc) | 31 | 32 | 1 |
| Tandem repeat (DM15) | 6 | 6 | 0 |
| Endothelin (END) | 5 | 5 | 0 |
| Exonuclease (EXOIII) | 10 | 12 | 2 |
| Receptor for Ubiquitination Targets (FBOX) | 34 | 45 | 11 |
| Formin homology 2 domain (FH2) | 20 | 35 | 15 |
| Fibronectin type 1 domain (FN1) | 49 | 49 | 0 |
| High mobility group (HMG) | 82 | 84 | 2 |
| Homeodomain (HOX) | 319 | 323 | 4 |
| Protein kinase C-related kinase homology region 1 homologs (HR1) | 19 | 19 | 0 |
| Short calmodulin-binding motif containing conserved Ile and Gln residues (IQ) | 228 | 226 | -2 |
| Kyprides, Ouzounis, Woese motif (KOW) | 12 | 12 | 0 |
| Kringle (KR) | 33 | 34 | 1 |
| Zinc-binding domain present in Lin-11, Isl-1, Mec-3 (LIM) | 204 | 214 | 10 |
| Pleckstrin homology (PH) | 373 | 436 | 63 |
| Zinc finger (PHD) | 216 | 303 | 87 |
| Phosphoinositide 3-kinase, region postulated to contain C2 domain (PI3K_C2) | 10 | 12 | 2 |
| Motif in proteasome subunits, Int-6, Nip-1 and TRIP-15 (PINT) | 16 | 17 | 1 |
| Phosphatidylinositol phosphate kinases (PIPKc) | 14 | 15 | 1 |
| Domain found in a protein subunit of human RNase MRP and RNase P ribonucleoprotein complexes and archaeal proteins (POP4) | 1 | 1 | 0 |
| Domain found in Plexins, Semaphorins and Integrins (PSI) | 116 | 119 | 3 |
| Domain with conserved PWWP motif (PWWP) | 27 | 29 | 2 |
| Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases (RhoGEF) | 99 | 111 | 12 |
| Src homology 2 domains (SH2) | 142 | 153 | 11 |
| Src homology 3 domains (SH3) | 358 | 373 | 15 |
| Staphylococcal nuclease homologs (SNc) | 3 | 6 | 3 |
| Domain in short gastrulation protein and chordin (SOG) | 3 | 3 | 0 |
| snRNP Sm proteins (Sm) | 18 | 18 | 0 |
| TopoisomeraseII (TOP2c) | 3 | 3 | 0 |
| Tetratricopeptide repeats (TPR) | 573 | 552 | -21 |
| Tudor domain (TUDOR) | 25 | 44 | 19 |
| Domain present in VPS-27, Hrs and STAM (VHS) | 15 | 14 | -1 |

N(SMART) is the number of domains found in the predicted set of pufferfish proteins using hmmsearch with SMART thresholds. N(THoR) is the number of domains found in pufferfish using hmmsearch with SMART thresholds and the alignment created by THoR. N(THoR) - N(SMART) is the difference between the THoR results and the SMART results. The SMART domain families COLIPASE, ChW, CheW, Galanin, IL10, IL2, LIGANc, POLIIIc, POX and REC were used as negative controls for the benchmarking. None of these domains was expected to yield positive hits in the pufferfish database, because they are prokaryote-specific or mammal-specific; indeed, THoR detected no pufferfish homologs. The domains AAA and WD40 were each searched by THoR with only one round of PSI-BLAST, because both families are known to contain many members and a full search of five rounds would have taken an unnecessarily long time to complete. They are not shown because they encountered memory-allocation errors with hmmbuild and their search iterations did not complete.
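
The difference column can be recomputed directly from the two count columns. The following is a minimal sketch, not part of the authors' pipeline: it copies a handful of rows from Table 1 and derives N(THoR) - N(SMART) for each. The helper names `differences` and `split_by_sign` are hypothetical, chosen only for illustration.

```python
# A few rows copied from Table 1: (SMART name, N(SMART), N(THoR)).
# Only a subset is listed here to keep the sketch short.
rows = [
    ("14_3_3", 9, 9),
    ("BRCT", 31, 39),
    ("IQ", 228, 226),
    ("PH", 373, 436),
    ("PHD", 216, 303),
    ("TPR", 573, 552),
]


def differences(rows):
    """Return {SMART name: N(THoR) - N(SMART)} for each row (hypothetical helper)."""
    return {name: n_thor - n_smart for name, n_smart, n_thor in rows}


def split_by_sign(diffs):
    """Group domain names by whether THoR gained, lost, or kept hits (hypothetical helper)."""
    gained = sorted(name for name, d in diffs.items() if d > 0)
    lost = sorted(name for name, d in diffs.items() if d < 0)
    unchanged = sorted(name for name, d in diffs.items() if d == 0)
    return gained, lost, unchanged


diffs = differences(rows)
gained, lost, unchanged = split_by_sign(diffs)

for name, d in diffs.items():
    print(f"{name:8s} {d:+d}")
```

A negative value, as for IQ and TPR, means that the search using the THoR alignment found fewer domains than the search using the original SMART alignment.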

Dickens and Ponting Genome Biology 2003 4:R52 doi:10.1186/gb-2003-4-8-r52