Genome Biology 2:reviews0001.1-0001.9. Additional data file 1

Classification of publicly available nitrilase superfamily sequences into 13 branches. National Center for Biotechnology Information (NCBI) links are provided for 176 sequences in 13 branches. Twelve of the branches resulted from BLASTp searches of the non-redundant (nr) database (version 2.1.2, [17]), using the first sequence in each branch as the prototype for that branch. A BLAST run with any of these prototypes detects more sequences than are shown in its branch. Poorly scoring sequences were generally found to be more closely related to another prototype, judged by E-value and the presence of signature sequences, and were assigned to branches such that no sequence belongs to more than one branch. Sequence definition lines are as provided by the host databases; definition lines generated by automatic annotation or annotation jamborees are therefore not always consistent with the branch designations provided here. The poorest scoring sequence in branch 1, the cyanide hydratase from Fusarium lateritium [30], has an artifactually poor E-value because the database sequence is not as published. Branch 13 contains outlier nitrilase-related sequences that cannot be readily classified as members of the other 12 branches and do not contain an additional fused domain.
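The assignment rule described above (each sequence goes to the branch whose prototype scores it best, so that no sequence is listed twice) can be sketched in Python. This is an illustration only, not the procedure used to build this file: it compares E-values alone, whereas the original assignment also took signature sequences into account. The helper names parse_evalue, parse_hit and assign_branches, the branch labels and the sample hit lines are all made up for the example.

```python
import re

def parse_evalue(text):
    """Convert a BLAST summary E-value to float.

    BLAST 2.x one-line summaries shorten values such as 1e-174 to
    "e-174", which float() cannot read without a leading digit.
    """
    text = text.strip()
    if text.startswith("e"):
        text = "1" + text
    return float(text)

# Identifier, free-text description, then bit score and E-value at line end.
HIT_LINE = re.compile(r"^(\S+)\s+(.*?)\s+(\d+)\s+(\S+)\s*$")

def parse_hit(line):
    """Return (identifier, bit_score, e_value), or None for non-hit lines."""
    m = HIT_LINE.match(line)
    if m is None:
        return None
    ident, _desc, score, evalue = m.groups()
    return ident, int(score), parse_evalue(evalue)

def assign_branches(hits_by_branch):
    """Map each identifier to the single branch with its lowest E-value.

    Assumption for this sketch: E-value alone decides, with the bit
    score as a tie-breaker (useful when two E-values are both 0.0).
    """
    best = {}
    for branch, lines in hits_by_branch.items():
        for line in lines:
            hit = parse_hit(line)
            if hit is None:
                continue
            ident, score, evalue = hit
            key = (evalue, -score)
            if ident not in best or key < best[ident][0]:
                best[ident] = (key, branch)
    return {ident: branch for ident, (_key, branch) in best.items()}

# Made-up hit lines: SEQ2 is detected by both prototypes.
example = {
    "Branch A": [
        "xx|SEQ1|  made-up protein one   300  e-100",
        "xx|SEQ2|  made-up protein two    60  2e-09",
    ],
    "Branch B": [
        "xx|SEQ2|  made-up protein two   150  3e-40",
    ],
}
assignments = assign_branches(example)
```

In this toy input SEQ2 is found by both prototypes but is kept only under Branch B, where its E-value is lower, which mirrors the one-branch-per-sequence rule stated above.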

BLASTP 2.1.2 [Nov-13-2000]


Reference:
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schäffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.

Branch 1 - Nitrilase


Query= (346 letters)

Database: nr 594,795 sequences; 188,047,558 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|P32961|NRL1_ARATH  NITRILASE 1 >gi|99738|pir||S22398 nitr...   689  0.0
pir||T49147  nitrilase 1 - Arabidopsis thaliana >gi|1389699|...   687  0.0
sp|P32962|NRL2_ARATH  NITRILASE 2 >gi|322548|pir||S31969 nit...   613  e-174
emb|CAA68934.3|  (Y07648) nitrilase 2 [Arabidopsis thaliana]      611  e-174
gb|AAB05220.1|  (U38845) nitrilase 2 [Arabidopsis thaliana]       610  e-174
sp|P46010|NRL3_ARATH  NITRILASE 3 >gi|11266291|pir||T49148 n...   587  e-167
sp|P46011|NRL4_ARATH  NITRILASE 4 >gi|508737|gb|AAA19628.1| ...   439  e-122
pir||T03739  nitrilase (EC 3.5.5.1) 4B - common tobacco >gi|...   426  e-118
sp|Q42965|NRL4_TOBAC  NITRILASE 4 >gi|7435979|pir||T03736 ni...   425  e-118
dbj|BAA77679.1|  (AB027054) nitrilase-like protein [Oryza sa...   423  e-117
pir||T27679  probable nitrilase (EC 3.5.5.1) ZK1058.6 - Caen...   307  1e-82
ref|NP_012102.1|  nitrilase; Nit1p [Saccharomyces cerevisiae...   193  2e-48
emb|CAA46923.1|  (X66132) ORF-1 [Saccharomyces cerevisiae]        192  4e-48
pir||S77025  nitrilase (EC 3.5.5.1) - Synechocystis sp. (str...   172  7e-42
dbj|BAA11653.1|  (D82961) cyanide degrading enzyme [Pseudomo...   166  5e-40
emb|CAC18136.1|  (AL451011) conserved hypothetical protein [...   165  6e-40
sp|Q03217|NRL2_RHORH  ALIPHATIC NITRILASE >gi|322251|pir||A4...   164  1e-39
pir||JC4212  nitrilase (EC 3.5.5.1) - Comamonas testosteroni...   159  4e-38
dbj|BAA90460.1|  (AB028892) nitrilase [Bacillus sp. OxB-1]        154  1e-36
sp|Q02068|NRL1_RHORH  ALIPHATIC NITRILASE >gi|322250|pir||A4...   150  3e-35
sp|P10045|NRLB_KLEPO  NITRILASE, BROMOXYNIL-SPECIFIC >gi|789...   140  3e-32
sp|P32964|CYHY_GLOSO  CYANIDE HYDRATASE (FORMAMIDE HYDROLYAS...   128  1e-28
sp|P20960|NRLA_ALCFA  NITRILASE, ARYLACETONE-SPECIFIC (ARYLA...   125  6e-28
gb|AAF66098.1|AF192405_1  (AF192405) putative cyanide hydrat...   118  9e-26
sp|P32963|CYHY_GIBBA  CYANIDE HYDRATASE (FORMAMIDE HYDROLYAS...    69  8e-11
 
 

Branch 2 - Aliphatic Amidase

Query= (346 letters)

Database: nr 594,795 sequences; 188,047,558 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|P11436|ALAM_PSEAE  ALIPHATIC AMIDASE >gi|77606|pir||A2674...   684  0.0
gb|AAA25697.1|  (M27612) alphatic amidase (EC 3.5.1.4) [Pseu...   681  0.0
pir||H83222  aliphatic amidase PA3366 [imported] - Pseudomon...   681  0.0
gb|AAF14257.1|  (AF136599) aliphatic amidase [Bacillus stear...   581  e-165
gb|AAF69000.1|AF257487_1  (AF257487) amidase [Bacillus sp. B...   579  e-164
sp|Q01360|ALAM_RHOER  ALIPHATIC AMIDASE (WIDE SPECTRUM AMIDA...   565  e-160
pir||F64556  aliphatic amidase - Helicobacter pylori (strain...   527  e-149
emb|CAA72932.1|  (Y12252) AmiE protein, aliphatic amidase [H...   524  e-148
pir||B71951  aliphatic amidase - Helicobacter pylori (strain...   524  e-148
pir||F64674  aliphatic amidase - Helicobacter pylori (strain...   189  3e-47
pir||C71842  aliphatic amidase - Helicobacter pylori (strain...   184  1e-45
 

Branch 3 - N-terminal Amidase
 

Query= (457 letters)

Database: nr 594,795 sequences; 188,047,558 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

ref|NP_012596.1|  52-kDa amidase specific for N-terminal asp...   850  0.0
pir||T39937  deamidase - fission yeast (Schizosaccharomyces ...   116  6e-25
 

Branch 4 - Biotinidase
Query= (523 letters)

Database: nr 594,795 sequences; 188,047,558 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|P43251|BTD_HUMAN  BIOTINIDASE PRECURSOR                       1093  0.0
ref|NP_000051.1|  biotinidase precursor [Homo sapiens] >gi|1...  1093  0.0
ref|NP_004657.1|  vanin 1; Vannin 1 [Homo sapiens] >gi|41280...   386  e-106
gb|AAF21453.1|U39664_1  (U39664) Tiff66 [Homo sapiens]            385  e-106
ref|XP_004354.1|  vanin 1 [Homo sapiens] >gi|4572585|emb|CAB...   385  e-106
ref|NP_035834.1|  vanin 1; pantetheinase [Mus musculus] >gi|...   382  e-105
gb|AAF21452.1|U39663_1  (U39663) TIFF66 [Canis familiaris]        378  e-104
ref|NP_036109.1|  vanin 3 [Mus musculus] >gi|6102996|emb|CAB...   373  e-102
ref|NP_060869.1|  VNN3 protein [Homo sapiens] >gi|7160973|em...   370  e-101
ref|XP_004352.1|  vanin 2 [Homo sapiens] >gi|4572586|emb|CAB...   359  5e-98
dbj|BAA82525.1|  (D89974) glycosylphosphatidyl inositol-anch...   358  8e-98
ref|NP_004656.1|  vanin 2; Vannin 2 [Homo sapiens] >gi|41280...   355  9e-97
ref|XP_004353.1|  VNN3 protein [Homo sapiens]                     220  4e-56
emb|CAB77020.1|  (AJ276261) vanin-like protein [Drosophila m...   162  7e-39
gb|AAG30008.1|  (AF281333) biotinidase fragment 2 [Oncorhync...   151  2e-35
gb|AAF46130.1|  (AE003436) CG3599 gene product [Drosophila m...   148  2e-34
gb|AAF46129.1|  (AE003436) vanin-like gene product [Drosophi...   126  5e-28
gb|AAG30007.1|  (AF281332) biotinidase fragment 1 [Oncorhync...    94  3e-18
 
 

Branch 5 - Beta-ureidopropionase
Query= (393 letters)

Database: nr 594,795 sequences; 188,047,558 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|Q03248|BUP_RAT  BETA-UREIDOPROPIONASE (BETA-ALANINE SYNTH...   813  0.0
ref|NP_057411.1|  beta-ureidopropionase [Homo sapiens] >gi|6...   684  0.0
ref|XP_009883.1|  beta-ureidopropionase [Homo sapiens]            684  0.0
gb|AAF06739.1|  (AF169560) beta-ureidopropionase [Homo sapiens]   679  0.0
gb|AAF54141.1|  (AE003676) CG3027 gene product [Drosophila m...   502  e-141
pir||T16068  hypothetical protein F13H8.7 - Caenorhabditis e...   487  e-136
dbj|BAB09868.1|  (AB008268) beta-ureidopropionase [Arabidops...   475  e-133
 
 

Branch 6 - Carbamylase
Query= (304 letters)

Database: nr 594,795 sequences; 188,047,558 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

pir||JW0082  N-carbamyl-D-amino acid amidohydrolase (EC 3.5....   629  e-179
emb|CAA62550.1|  (X91070) D-N-alpha-carbamilase [Agrobacteri...   612  e-174
gb|AAB47607.1|  (U59376) N-carbamoyl-D-amino acid amidohydro...   610  e-174
pir||JW0083  N-carbamyl-D-amino acid amidohydrolase (EC 3.5....   367  e-101
 
 

Branch 7 - Prokaryotic NAD+ Synthetase
Query= (300 letters)

Database: nr 576,719 sequences; 181,542,687 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|Q9X0Y0|NAE2_THEMA  PROBABLE GLUTAMINE-DEPENDENT NAD(+) SY...   607  e-173
sp|O67091|NADE_AQUAE  PROBABLE GLUTAMINE-DEPENDENT NAD(+) SY...   238  8e-62
sp|P74292|NADE_SYNY3  PROBABLE GLUTAMINE-DEPENDENT NAD(+) SY...   183  2e-45
gb|AAF84763.1|AE004015_6  (AE004015) NH3-dependent NAD synth...   170  2e-41
sp|Q03638|NADE_RHOCA  GLUTAMINE-DEPENDENT NAD(+) SYNTHETASE ...   158  8e-38
emb|CAB38325.1|  (Y17736) NAD(+) synthase (glutamine-hydroly...   129  4e-29
sp|P71911|NADE_MYCTU  GLUTAMINE-DEPENDENT NAD(+) SYNTHETASE ...   103  3e-21
pir||D70680  hypothetical protein Rv2438c - Mycobacterium tu...   103  3e-21
 

Branch 8 - Eukaryotic NAD+ Synthetase

Query= (360 letters)

Database: nr 576,719 sequences; 181,542,687 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

ref|NP_011941.1|  Yhr074wp >gi|731675|sp|P38795|NADE_YEAST P...   751  0.0
dbj|BAB14034.1|  (AK022436) unnamed protein product [Homo sa...   457  e-128
ref|NP_060631.1|  hypothetical protein FLJ10631 >gi|7022784|...   457  e-128
sp|O74940|NADE_SCHPO  PUTATIVE GLUTAMINE-DEPENDENT NAD(+) SY...   442  e-123
gb|AAG12939.1|AC073944_2  (AC073944) hypothetical protein; 1...   393  e-108
sp|Q9VYA0|NADE_DROME  PUTATIVE GLUTAMINE-DEPENDENT NAD(+) SY...   388  e-107
pir||T19420  hypothetical protein C24F3.4 - Caenorhabditis e...   280  1e-74
 
 

Branch 9 - Apolipoprotein N-acyltransferase
Query= (511 letters)

Database: nr 594,795 sequences; 188,047,558 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

sp|Q9ZI86|LNT_PSEAE  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...   770  0.0
pir||A82261  apolipoprotein N-acyltransferase VC0958 [import...   241  1e-62
gb|AAD09824.1|  (AF116773) apolipoprotein N-acyltransferase ...   238  9e-62
dbj|BAA35308.1|  (D90705) Apolipoprotein n-acyltransferase (...   233  4e-60
sp|P23930|LNT_ECOLI  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...   232  7e-60
sp|P44626|LNT_HAEIN  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...   222  9e-57
pir||H81166  apolipoprotein N-acyltransferase, probable NMB0...   155  1e-36
pir||E81938  probable apolipoprotein N-acyltransferase NMA09...   153  5e-36
sp|Q9ZDG3|LNT_RICPR  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...   137  3e-31
sp|Q52910|LNT_RHIME  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...   120  4e-26
emb|CAB95793.1|  (AL359949) putative lipoprotein N-acyltrans...    96  8e-19
sp|O67000|LNT_AQUAE  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...    94  3e-18
sp|P74055|LNT_SYNY3  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...    91  2e-17
sp|O83279|LNT_TREPA  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...    85  2e-15
pir||G71327  probable apolipoprotein N-acyltransferase (cutE...    81  3e-14
pir||E70129  acid-inducible protein (act206) homolog - Lyme ...    81  3e-14
pir||B75407  acid tolerance protein Act206-related protein -...    80  7e-14
pir||H72359  hypothetical protein TM0573 - Thermotoga mariti...    76  1e-12
emb|CAB11299.1|  (Z98604) hypothetical protein MLCB2052.01 [...    67  4e-10
pir||B70945  hypothetical protein Rv2051c - Mycobacterium tu...    67  5e-10
emb|CAC15462.1|  (AJ294477) putative polyprenol-phosphate-ma...    66  8e-10
emb|CAC14382.1|  (AL445963) putative transferase [Streptomyc...    66  8e-10
sp|O84539|LNT_CHLTR  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...    56  9e-07
pir||B81662  apolipoprotein N-acetyltransferase, probable TC...    54  4e-06
pir||D71965  apolipoprotein n-acyltransferase - Helicobacter...    50  6e-05
sp|Q9Z7Q1|LNT_CHLPN  APOLIPOPROTEIN N-ACYLTRANSFERASE (ALP N...    49  1e-04
pir||D64542  apolipoprotein N-acyltransferase - Helicobacter...    49  1e-04
pir||B81313  probable integral membrane protein Cj1095 [impo...    47  4e-04
 
 

Branch 10 - Nit and NitFhit
Query= (327 letters)

Database: nr 594,795 sequences; 188,047,558 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

ref|NP_005591.1|  nitrilase 1 [Homo sapiens] >gi|3228666|gb|...   677  0.0
gb|AAC40184.1|  (AF069985) nitrilase homolog 1 [Mus musculus]     575  e-163
ref|NP_036179.1|  nitrilase 1 [Mus musculus] >gi|3228668|gb|...   574  e-163
gb|AAF87104.1|AF284575_1  (AF284575) Nit protein 1 [Xenopus ...   372  e-102
gb|AAC39137.1|  (AF069989) nitrilase and fragile histidine t...   275  4e-73
pir||T43198  nitrilase/Fhit protein - Caenorhabditis elegans...   274  1e-72
emb|CAB78004.1|  (AL161512) nitrilase 1 like protein [Arabid...   248  5e-65
pir||T40601  putitive nitrilase homolog - fission yeast  (Sc...   219  3e-56
pir||B69109  N-carbamoyl-D-amino acid amidohydrolase - Metha...   207  1e-52
ref|NP_012409.1|  nitrilase superfamily member; Nit2p [Sacch...   196  2e-49
pir||E83086  conserved hypothetical protein PA4475 [imported...   193  2e-48
sp|P55175|Y601_SYNY3  HYPOTHETICAL 30.2 KD PROTEIN SLL0601 >...   188  9e-47
ref|NP_064587.1|  Nit protein 2 [Homo sapiens] >gi|11433227|...   185  6e-46
dbj|BAA32602.1|  (AB017194) hypothetical protein [Plectonema...   184  2e-45
gb|AAF87102.1|AF284573_1  (AF284573) Nit protein 2 [Mus musc...   182  4e-45
sp|Q10166|YAUB_SCHPO  HYPOTHETICAL 35.7 KD PROTEIN C26A3.11 ...   169  4e-41
pir||F82325  conserved hypothetical protein VC0421 [imported...   168  9e-41
ref|NP_013455.1|  nitrilase superfamily member; Nit3p [Sacch...   160  2e-38
pir||B81199  nitrilase NMB0441 [imported] - Neisseria mening...   154  2e-36
pir||E81834  conserved hypothetical protein NMA2044 [importe...   153  3e-36
pir||T48563  hypothetical protein F14F18.210 - Arabidopsis t...   153  3e-36
gb|AAF54370.1|  (AE003682) CG8132 gene product [Drosophila m...   146  4e-34
pir||T36488  probable hydrolase - Streptomyces coelicolor >g...   127  1e-28
sp|P77192|YBEM_ECOLI  HYPOTHETICAL 20.5 KDA PROTEIN IN CRCB-...   119  7e-26
 
 

Branch 11 - NB11
 

Query= (292 letters)

Database: nr 584,562 sequences; 183,936,037 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

pir||F71901  hypothetical protein jhp0694 - Helicobacter pyl...   598  e-170
pir||E64614  beta-alanine synthetase homolog - Helicobacter ...   592  e-168
emb|CAB73204.1|  (AL139076) putative hydrolase [Campylobacte...   355  3e-97
gb|AAF85242.1|AE004053_5  (AE004053) beta-alanine synthetase...   321  5e-87
emb|CAB45873.1|  (Y19104) beta-alanine synthase [Lycopersico...   219  4e-56
gb|AAG03682.1|AE004467_4  (AE004467) probable hydratase [Pse...   211  7e-54
gb|AAD15597.1|  (AC006232) putative nitrilase [Arabidopsis t...   201  1e-50
pir||F75263  probable hydrolase - Deinococcus radiodurans (s...   187  1e-46
ref|NP_048426.1|  contains ATP/GTP-binding site motif A; sim...   174  1e-42
gb|AAD19716.1|  (AF124349) hydrolase [Zymomonas mobilis]          174  1e-42
pir||T34905  probable hydrolase - Streptomyces coelicolor >g...   166  2e-40
pir||T28684  hypothetical protein - Streptomyces coelicolor ...   164  1e-39
pir||T41662  probable nitrilase - fission yeast (Schizosacch...   128  9e-29
 

Branch 12 - NB12
 

Query= (428 letters)

Database: nr 594,795 sequences; 188,047,558 total letters

                                                                   Score     E
Sequences producing significant alignments:                        (bits)  Value

pir||JC4601  hypothetical 48.2k protein - Sphingomonas yanoi...   868  0.0
dbj|BAB04808.1|  (AP001510) unknown conserved protein [Bacil...   322  4e-87
sp|P54608|YHCX_BACSU  HYPOTHETICAL 60.2 KD PROTEIN IN CSPB-G...   322  7e-87
 
 

Branch 13 - Non-fused Outliers
emb|CAC12333.1|  (AL445066) nitrilase related protein [Therm...
ref|NP_068956.1|  conserved hypothetical protein [Archaeoglo...
pir||G83608  probable hydratase PA0293 [imported] - Pseudomo...
pir||A72454  probable nitrilase APE2277 - Aeropyrum pernix (...
dbj|BAB04766.1|  (AP001510) unknown conserved protein [Bacil...
pir||H82556  beta-alanine synthetase XF2443 [imported] - Xyl...
pir||E69863  conserved hypothetical protein ykrU - Bacillus ...
pir||B83387  hypothetical protein PA2074 [imported] - Pseudo...
sp|Q11146|Y480_MYCTU  HYPOTHETICAL 35.2 KDA PROTEIN RV0480C ...
gb|AAD19716.1|  (AF124349) hydrolase [Zymomonas mobilis]       
sp|P55176|YPQQ_PSEFL  HYPOTHETICAL 31.2 KD PROTEIN IN PQQF 5...
pir||E64614  beta-alanine synthetase homolog - Helicobacter ...
pir||F71901  hypothetical protein jhp0694 - Helicobacter pyl...
gb|AAD15597.1|  (AC006232) putative nitrilase [Arabidopsis t...
sp|P55178|YAG5_STALU  HYPOTHETICAL PROTEIN IN AGR OPERON (OR...
gb|AAC38297.1|  (AF012132) unknown [Staphylococcus epidermidis]
sp|P55177|YAG5_STAAU  HYPOTHETICAL 29.8 KD PROTEIN IN AGR OP...
emb|CAB76877.1|  (AL159139) putative hydrolase [Streptomyces...
pir||G71949  hypothetical protein jhp0294 - Helicobacter pyl...
pir||E64558  conserved hypothetical protein HP0309 - Helicob...
pir||A70310  conserved hypothetical protein aq_103 - Aquifex...
pir||C71109  hypothetical protein PH0642 - Pyrococcus horiko...
pir||H83195  conserved hypothetical protein PA3598 [imported...
pir||B81369  probable hydrolase Cj0947c [imported] - Campylo...
gb|AAG19366.1|  (AE005031) Vng0936c [Halobacterium sp. NRC-1]  
pir||C75051  hydrolase related PAB1449 - Pyrococcus abyssi (...