Figure 1.
A putative example of RNA editing as revealed by comparison of cDNA and genomic DNA
sequences. (a) Gene models for CG18314 based on the sequences of two DGCr1 full-length cDNA clones (GH15292.c, GH08370.c) that
differ at their 5' and 3' termini. Although the cDNAs have alternative 5' and 3' UTRs
and are alternatively spliced, they share the same protein-coding potential (shown
in blue). CG18314 encodes a G-protein-coupled receptor of the rhodopsin family, containing a seven-transmembrane
protein domain (7tm_1; the red bar shows the extent of the domain) with similarity
to β2-adrenergic receptors of mouse (X15643, E value = 9e-23) and human (M15169, E value = 8e-22). The hatched region marks a 310-bp portion of the cDNA sequences with A-to-G nucleotide
variation. (b) Sequence alignment of this 310-bp portion of the genomic sequence with two cDNA and three
EST sequences (GH14918, GH14553, HL02270). Shown in yellow are codons with A-to-G
nucleotide variation. Above the genomic nucleotide sequence is its translated amino-acid
sequence starting at amino acid 224 of the protein. Comparing the cDNA nucleotide
sequence to the genomic sequence identifies 10 A-to-G nucleotide variations. Two are
silent, seven result in amino-acid changes, and one alters the stop codon, allowing
two additional amino acids to be encoded. The amino acids that are affected are shown
below the nucleotide sequence (red letters in a gray circle). Two of the amino-acid
changes (N224S and S229G) map to the conserved seven-transmembrane protein domain.
The Anopheles gambiae genomic draft contains sequence encoding this protein (gi|21299606|gb|EAA11751.1|
(AAAB01008960) agCP5433) which is highly conserved at the amino-acid sequence level
(E = e-168) and also encodes N and S at these sites. To sample additional transcripts
of this gene, we performed gene-specific RT-PCR to amplify the region shown in (b).
From a total of 64 independent transcripts we confirmed the 10 cases of editing diagrammed
above and identified 15 new sites of A-to-G nucleotide variation. A list of these
putative editing sites showing the resulting amino-acid change and the number of times
this change was observed, given in parentheses, is as follows: N224D (2), N224S (12),
L225L (9), N227S (1), S229G (9), H230R (1), M231V (1), L236L (16), A239A (1), P246P
(2), E254G (1), I272I (1), I275M (1), I281V (1), S286G (1), K306R (16), K308R (5),
K308G (8), Q312Q (1), A313A (1), L315L (31), I316V (52), *323W (44) and S324G (4).
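
The list above can be tallied mechanically. The following is a minimal Python sketch, not part of the original analysis. It parses each entry of the form "reference residue, position, observed residue (count)" and groups the entries into silent, missense and stop-codon changes. The names `CAPTION_LIST`, `parse_changes` and `classify` are hypothetical and introduced only for illustration.

```python
import re
from collections import Counter

# Entries copied from the caption: residue change followed by the number
# of times it was observed among the sampled transcripts.
CAPTION_LIST = """
N224D (2), N224S (12), L225L (9), N227S (1), S229G (9), H230R (1),
M231V (1), L236L (16), A239A (1), P246P (2), E254G (1), I272I (1),
I275M (1), I281V (1), S286G (1), K306R (16), K308R (5), K308G (8),
Q312Q (1), A313A (1), L315L (31), I316V (52), *323W (44), S324G (4)
"""

# One entry: reference residue, position, observed residue, (count).
ENTRY = re.compile(r"([A-Z*])(\d+)([A-Z*])\s*\((\d+)\)")


def parse_changes(text):
    """Return a list of (ref, position, alt, count) tuples."""
    return [(ref, int(pos), alt, int(n)) for ref, pos, alt, n in ENTRY.findall(text)]


def classify(ref, alt):
    """Label a residue change as silent, stop-loss or missense."""
    if ref == alt:
        return "silent"
    if ref == "*":
        return "stop-loss"
    return "missense"


changes = parse_changes(CAPTION_LIST)
by_class = Counter(classify(ref, alt) for ref, _, alt, _ in changes)
most_observed = max(changes, key=lambda c: c[3])

for kind, n in sorted(by_class.items()):
    print(f"{kind}: {n} listed changes")
print("most frequently observed:", most_observed)
```

The tally counts listed amino-acid changes, not nucleotide sites, so its totals need not equal the number of edited sites reported in the caption.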
Stapleton et al. Genome Biology 2002 3:research0080.1 doi:10.1186/gb-2002-3-12-research0080
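
The comparison described in the caption, in which cDNA codons are checked against the genomic codons for A-to-G differences and each difference is labeled as silent, amino-acid-changing or stop-codon-altering, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline. It assumes two uppercase, gap-free, in-frame sequences of equal length. The function name `a_to_g_changes` and the three toy codons are hypothetical and are not the actual CG18314 sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def a_to_g_changes(genomic, cdna, first_residue=1):
    """List codons whose only genomic-to-cDNA differences are A-to-G.

    Both inputs are assumed to be uppercase, in frame and the same length.
    Returns (label, kind) tuples, e.g. ("N224S", "missense").
    """
    if len(genomic) != len(cdna) or len(genomic) % 3:
        raise ValueError("sequences must be equal length and a multiple of 3")
    results = []
    for i in range(0, len(genomic), 3):
        g_codon, c_codon = genomic[i:i + 3], cdna[i:i + 3]
        diffs = [(g, c) for g, c in zip(g_codon, c_codon) if g != c]
        # Skip identical codons and codons with any non-A-to-G difference.
        if not diffs or any(d != ("A", "G") for d in diffs):
            continue
        ref, alt = CODON_TABLE[g_codon], CODON_TABLE[c_codon]
        if ref == alt:
            kind = "silent"
        elif ref == "*":
            kind = "stop-loss"
        else:
            kind = "missense"
        results.append((f"{ref}{first_residue + i // 3}{alt}", kind))
    return results


# Toy input (hypothetical codons): AAT->AGT, CTA->CTG, TAG->TGG.
toy = a_to_g_changes("AATCTATAG", "AGTCTGTGG", first_residue=224)
print(toy)
```

Working codon by codon means that two A-to-G changes in the same codon are reported as a single amino-acid change, which matches how the caption lists changes by residue.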