|
Figure 2.
Many human alternative isoforms in SWISS-PROT derive from PTC+ mRNAs.
(a) We analyzed each of the human SWISS-PROT entries containing a VARSPLIC line in its
feature table, using this information to assemble protein isoform sequences. Ambiguous
VARSPLIC entries led us to discard five entries from our analysis at this point.
(b) We next identified cDNA/mRNA sequences corresponding to each protein isoform assembled
from SWISS-PROT. BLAST was used to align each protein isoform sequence to translated
cDNA/mRNA sequences in GenBank and RefSeq, filtering to retain only high-confidence
matches. To obtain the coding sequence of each mRNA/cDNA sequence, we used LocusLink
to map each to the correct human genomic contig sequence from the NCBI human genome
build 30. We referred to the CDS feature of each GenBank or RefSeq cDNA/mRNA record
to identify stop codon locations.
(c) We used the SPIDEY mRNA-to-genomic DNA alignment program to determine the gene structure
of each mRNA/cDNA isoform sequence. After generating these gene structures, we could
determine the PTC+ status of each isoform on the basis of stop codon location relative to exon-exon junctions. If the
termination codon was more than 50 nucleotides upstream of the final intron,
the transcript was deemed PTC+ and designated a candidate target of NMD, according to the model of mammalian PTC recognition.
(d) Each putative PTC+ isoform was manually inspected for errors in gene structure prediction. These errors
include false exon predictions due to poly(A) tails and cDNA/mRNA sequence not seen
in the corresponding genomic sequence.
Hillman et al. Genome Biology 2004 5:R8 doi:10.1186/gb-2004-5-2-r8 |
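The classification rule in panel (c) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the gene structure has already been reduced to a list of exon lengths in transcript order and that the stop codon position is given in 0-based mRNA coordinates. The function name `is_ptc_plus`, its parameters, and the choice to measure the distance from the last base of the stop codon to the final exon-exon junction are all assumptions made for illustration; the caption does not specify which base of the codon the 50-nucleotide distance is measured from.

```python
from typing import Sequence

# Distance cutoff stated in the caption (nucleotides).
NMD_THRESHOLD_NT = 50


def is_ptc_plus(
    exon_lengths: Sequence[int],
    stop_codon_start: int,
    threshold: int = NMD_THRESHOLD_NT,
) -> bool:
    """Return True if the stop codon lies more than `threshold` nt
    upstream of the final exon-exon junction.

    exon_lengths     -- exon lengths in transcript (5' to 3') order
    stop_codon_start -- 0-based mRNA coordinate of the first base of
                        the stop codon (hypothetical input convention)
    """
    # A single-exon transcript has no intron, so the rule cannot apply.
    if len(exon_lengths) < 2:
        return False

    # In mRNA coordinates, the final junction sits at the end of the
    # penultimate exon, i.e. the summed length of all exons but the last.
    last_junction = sum(exon_lengths[:-1])

    # Assumption: measure from the base just past the stop codon.
    stop_codon_end = stop_codon_start + 3

    # Negative distances mean the stop codon is in the last exon.
    distance_upstream = last_junction - stop_codon_end
    return distance_upstream > threshold


if __name__ == "__main__":
    exons = [200, 150, 300]  # made-up exon lengths; final junction at 350
    print(is_ptc_plus(exons, stop_codon_start=100))  # stop far upstream
    print(is_ptc_plus(exons, stop_codon_start=400))  # stop in last exon
```

Under these assumptions, a stop codon in the last exon, or within 50 nucleotides of the final junction, is not flagged. Candidates flagged this way would still need the manual inspection described in panel (d).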