Correspondence
The DNA-binding region of RAG 1 is not a homeodomain
Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892-4470, USA
Genome Biology 2002, 3:interactions1004.1-1004.4
doi:10.1186/gb-2002-3-8-interactions1004
Subject areas: Bioinformatics, Biochemistry and structural biology, Development, Immunology
The electronic version of this article is the complete one and can be found online at: http://genomebiology.com/2002/3/8/interactions/1004.1
© 2002 BioMed Central Ltd
One of the goals of functional annotation is to catalog information that would be of value in guiding experimental design and analysis. Even in cases for which sequence similarity can be detected reliably, however, functional annotations found in public databases are often incorrect [1,2]. Here, we discuss a case in which a functional assignment was made to RAG1, a protein catalogued as a homeodomain protein in the Online Mendelian Inheritance in Man (OMIM) database [3]. A more in-depth bioinformatic analysis shows this assignment to be incorrect. The known biochemical functions of RAG1 are as an integrase and recombinase [4], functions that are not consistent with those of other homeodomain proteins [5].
Comparison of RAG 1 with homeodomains
The homeodomain is a DNA-binding domain found in many eukaryotic transcription factors and is characterized by a highly stringent sequence signature [6,7]. Structural studies on homeodomain family members have revealed that these proteins adopt almost superimposable structures, all consisting of a three-helical bundle with an amino-terminal extension and all exhibiting a similar mode of DNA binding [8,9,10]. Several positions within the homeodomain region that are involved in DNA recognition or stabilization of the structure are conserved across species. The DNA-binding region of RAG1 is also highly conserved (Figure 1a). This region shows 20% sequence identity with the homeodomain of the Engrailed protein. The DNA-binding domain of RAG1 could not, however, be aligned to a dataset of 129 human homeodomain sequences [11], as RAG1 lacks the evolutionarily conserved amino-acid residues that define the homeodomain family. In addition, the sequence motif in the homeodomain DNA-recognition helix (48-WF-x-N-x-R-53, where x is any amino acid) is absent from the RAG1 DNA-binding region.
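The motif check described above can be sketched with a regular expression. This is only an illustration: the function name and the two input strings are hypothetical toy examples, not sequences taken from this article or from a database. Positions are reported relative to the input string, so they correspond to homeodomain numbering (48-53) only if the input is an ungapped homeodomain sequence starting at residue 1.

```python
import re
from typing import List, Tuple

# W-F-x-N-x-R, where "." stands for "x" (any single residue).
# Assumes the input is an ungapped, one-letter-code protein sequence.
HELIX3_MOTIF = re.compile(r"WF.N.R")


def find_recognition_motif(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched residues) for each motif hit."""
    hits = []
    for match in HELIX3_MOTIF.finditer(sequence.upper()):
        hits.append((match.start() + 1, match.group()))
    return hits


# Hypothetical toy inputs, for illustration only.
with_motif = "KIWFQNKRAKIK"
without_motif = "AAGGRPRAA"

print(find_recognition_motif(with_motif))
print(find_recognition_motif(without_motif))
```

An empty result for a candidate DNA-binding region would agree with the motif being absent, but a pattern scan of this kind is not a substitute for the alignment-based comparison described in the text.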
Experimental evidence that RAG1 does not belong to the homeodomain family comes from the observation that a mutant RAG1 protein containing a conserved homeodomain motif failed to bind DNA and was non-functional in in vitro recombination assays [12].
Comparison of RAG 1 with Tc3 transposase and Hin invertase
Caenorhabditis elegans Tc3 is a member of the Tc1/mariner family of transposable elements found in species ranging from fungi to humans [13]. The X-ray structure of Tc3 transposase revealed that the specific DNA-binding region contains three α helices comprising the helix-turn-helix motif [14]. Sequence alignments between the DNA-binding regions of RAG1 and Tc3 show patches of high sequence conservation, especially in the amino-terminal region. The absence from RAG1 of the DNA-contacting residues of Tc3, however, indicates significant divergence in the RAG1 DNA-recognition site. Automated fold prediction using the University of California, Los Angeles and the Department of Energy cooperative (UCLA-DOE) Fold Recognition Server [15] identified the DNA-binding domain of Tc3 transposase of C. elegans as the candidate whose fold is most likely to represent the RAG1 family.
Hin recombinase belongs to a family of bacterial DNA invertases that catalyze a site-specific recombination reaction. The Hin DNA-binding domain shares distinct sequence similarities with RAG1, and there is a striking similarity between the Hin recognition sequence and the RAG1 nonamer site. At the sequence level, the invariant 138-GGRPR-142 motif in the amino-terminal arm of the Hin DNA-binding domain is conserved in RAG1. This motif is positioned in the minor groove of the DNA-recognition sequence and provides critical DNA contacts. The sequence similarity between RAG1 and Hin recombinase extends through the DNA-binding region, with a total of 13 residues absolutely conserved in this region. Structural conservation of the DNA-binding domain of RAG1 and Hin recombinase is illustrated by the observation that a RAG1 hybrid protein containing the homologous DNA-binding region of Hin recombinase is functional in in vitro recombination assays [12].
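Counts of absolutely conserved residues and percent-identity figures such as those quoted above are read off an alignment. The following is a minimal sketch of that bookkeeping, assuming the sequences have already been aligned by some other tool and use "-" for gaps. The two aligned strings are hypothetical toy inputs, not the RAG1 or Hin sequences, and the identity convention used here (matches divided by columns in which neither sequence has a gap) is only one of several in common use.

```python
from typing import List

GAP = "-"


def conserved_columns(aligned: List[str]) -> List[int]:
    """Return 0-based column indices where all sequences share one non-gap residue."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {seq[i] for seq in aligned}
        if len(column) == 1 and GAP not in column:
            conserved.append(i)
    return conserved


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment, for illustration only.
seq_a = "GGRPRAK-LT"
seq_b = "GGRPRSKQLS"

print(conserved_columns([seq_a, seq_b]))
print(round(percent_identity(seq_a, seq_b), 1))
```

Because the denominator convention changes the reported percentage, any comparison with a published identity value would need to use the same convention and the same alignment.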
The helix-turn-helix motif
In the DNA-binding domains of Hin recombinase, Tc3 transposase, and Engrailed, the first and the second helices lie almost anti-parallel to each other, with a turn between the second and the third helices. In all cases, the recognition helix fits into the major groove of the DNA. Although the essential features of the helix-turn-helix motifs are very similar, these proteins do not all dock on the DNA in the same fashion (Figure 1b,1c,1d). In the X-ray structure of the Engrailed homeodomain-DNA complex, several residues in the exposed hydrophilic face of helix 3 establish specific contacts with the last four base pairs of the recognition sequence, whereas the residues in the amino-terminal arm of the protein contact the first two base pairs of the recognition sequence. Compared to Tc3 transposase and Hin recombinase, the helices and the loops are longer in the homeodomain structure; only helix 3 is inserted in the major groove, and the residues in the center of this relatively longer helix provide DNA contacts. In the Hin-recombinase structure, the α-helical core, along with extensions at both the amino and carboxyl termini, participates in DNA recognition. The eight-residue carboxy-terminal tail of Hin recombinase is inserted in the minor groove of the DNA-recognition site. Wrap-around of the DNA-binding site by the carboxy-terminal extension has not been observed in Tc3 or homeodomain structures. In contrast to the other structures, both the second and third helices of Tc3 transposase participate in DNA recognition by binding to the major groove. The six residues preceding the first helix in Tc3 adopt a conformation different from that seen in the longer amino terminus of the Hin recombinase and the Engrailed homeodomain.
Genomic perspective
The ability of the RAG1 proteins to catalyze both the formation of hybrid joints and transposition highlights the similarities between the mechanism of site-specific rearrangement by V(D)J recombination and certain transposition/retroviral integration reactions. The occurrence of RAG proteins in jawed vertebrates and the conservation of domain architecture and function from prokaryotes suggest that the RAG1 proteins might have been horizontally transferred into the eukaryotic genome by a transposon.
The question that may be posed here is what the relevance of the current observation is, and whether the functional mis-assignment is of great importance. During vertebrate lymphocyte development, RAG1 mediates the somatic assembly of antigen receptors, which involves DNA-bond breakage and strand-transfer reactions, reminiscent of transposition reactions in bacteria. Homeodomain proteins play a fundamental role in diverse cellular processes by transcriptional regulation of downstream target genes. RAG1 has been identified only in jawed vertebrates, whereas homeodomain proteins are highly conserved from yeast to human. The evolution and biological functions of RAG1 and homeodomain proteins are markedly different, and one cannot substitute for the other.
Unfortunately, the initial mis-classification by Spanopoulou et al. [12] has led to experimental results being interpreted in the context of RAG1 being a homeodomain protein. Specifically, Villa et al. [16] have interpreted the biochemical effects of mutations leading to Omenn Syndrome as having to do with changes in homeodomain structure, despite statements linking the observed defects to low levels of V(D)J recombination. In addition, Aidinis et al. [17] proposed models of interaction of a RAG1 'homeodomain' with the chromatin proteins HMG1 and HMG2. In the study by Aidinis et al. [17], the experiments were designed under the assumption that RAG1 was a homeodomain protein, which led the authors to extend their interpretation of the results, incorrectly, to the involvement of a homeodomain structure in V(D)J recombination. The model proposed by this group has therefore been made in the wrong biological context.
The incorrect assignment of RAG1 as a homeodomain protein has colored the interpretation of experimental results. This is emblematic of the larger problem of annotation-error propagation misguiding experimental discovery. Often, there may be little or no similarity between a sequence of interest and those in the public databases, meaning that it would be very difficult (if not impossible) to determine any degree of relatedness on the basis of sequence alone. Even in cases where homology can be detected reliably, the annotations currently found in the public databases are often incorrect. The considerable effect of processes such as alternative splicing [18] and the ability of proteins to perform markedly different functions depending on their cellular localization and compartmentalization [19], coupled with the number of annotation errors currently in the public databases, all help to re-emphasize the importance of database curation and experimental validation in maintaining the purity and utility of these public resources.
Figure 1.