Categories of data models. Each Apollo data model either falls into one of two broad categories or is a descriptive auxiliary class (for example, Comment). The two categories are: (a) a location on a sequence; and (b) a sequence. The corresponding Java superclasses are Range and AbstractSequence, respectively. The inheritance hierarchy from these two central classes is shown here, with some minor classes and relationships omitted to simplify the description. Each class or interface is drawn as a rectangle, and interface names carry the suffix 'I' (an interface specifies the methods that an implementing class is required to provide). A line ending in an open-headed arrow indicates a superclass-subclass relationship, and a dotted line connects an interface to a class that implements it. Thus, both GenomicRange and SeqFeature are subclasses of the base class Range (which implements the RangeI interface), and in turn FeatureSet, GenericAnnotation, FeaturePair, and AssemblyFeature are specializations of SeqFeature. Similarly, an SRSSequence is a subclass of AbstractLazySequence, which in turn is a subclass of AbstractSequence. Each subclass inherits all the methods of its parent class and may extend the model's behavior either by adding new methods or by overriding inherited ones. In addition to inheritance, the connecting lines depict other types of relationships, with the line terminus indicating the potential cardinality. Thus, a GenericAnnotationSet must have at least one, and possibly more, pieces of Evidence associated with it (drawn as a single crossbar to indicate 'at least one' and a triangular tripod to represent 'many'); it must have exactly one Identifier, which maintains synonyms and database cross-references (a single crossbar indicates 'one and only one'); and it may have zero or more optional Comments (drawn as a triangular tripod to represent 'many' and a single circle to indicate 'none').
Thus, a FeatureSet is itself a SeqFeature and is also composed of one or more component SeqFeatures. This enables a Transcript to be composed of a set of Exons, or an alignment to be composed of a set of high-scoring pairs. Likewise, a CurationSet (across two different species) may contain component CurationSets (for the individual species) to enable comparative analysis.
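The composite relationship described above can be illustrated with a minimal Java sketch. It is not the Apollo source code: the class names RangeI, Range, SeqFeature, and FeatureSet are taken from the caption, but every field, constructor, and method shown here (including componentLength and the inclusive coordinate convention) is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-ins for the classes named in the caption.
// All method names and signatures are assumptions, not the Apollo API.
interface RangeI {
    int getStart();
    int getEnd();
    int length();
}

// Base class for anything that is a location on a sequence.
class Range implements RangeI {
    private final int start;
    private final int end;

    Range(int start, int end) {
        if (start > end) {
            throw new IllegalArgumentException("start must not exceed end");
        }
        this.start = start;
        this.end = end;
    }

    public int getStart() { return start; }
    public int getEnd() { return end; }
    // Assumes inclusive coordinates on both ends.
    public int length() { return end - start + 1; }
}

// A typed feature located on a sequence (for example, an exon).
class SeqFeature extends Range {
    private final String type;

    SeqFeature(String type, int start, int end) {
        super(start, end);
        this.type = type;
    }

    String getType() { return type; }
}

// A FeatureSet is itself a SeqFeature and holds one or more SeqFeatures.
// Its own span is derived from the extent of its components.
class FeatureSet extends SeqFeature {
    private final List<SeqFeature> features;

    FeatureSet(String type, List<? extends SeqFeature> features) {
        super(type, minStart(features), maxEnd(features));
        this.features = Collections.unmodifiableList(new ArrayList<>(features));
    }

    private static int minStart(List<? extends SeqFeature> fs) {
        if (fs.isEmpty()) {
            throw new IllegalArgumentException("a FeatureSet needs at least one component");
        }
        int min = Integer.MAX_VALUE;
        for (SeqFeature f : fs) min = Math.min(min, f.getStart());
        return min;
    }

    private static int maxEnd(List<? extends SeqFeature> fs) {
        int max = Integer.MIN_VALUE;
        for (SeqFeature f : fs) max = Math.max(max, f.getEnd());
        return max;
    }

    List<SeqFeature> getFeatures() { return features; }

    // Sum of the component lengths (for a transcript, its spliced length).
    int componentLength() {
        int total = 0;
        for (SeqFeature f : features) total += f.length();
        return total;
    }
}

class ApolloModelSketch {
    public static void main(String[] args) {
        SeqFeature exon1 = new SeqFeature("exon", 100, 200);
        SeqFeature exon2 = new SeqFeature("exon", 300, 450);
        FeatureSet transcript = new FeatureSet("transcript", Arrays.asList(exon1, exon2));
        // Because a FeatureSet is a SeqFeature, sets can nest inside sets.
        FeatureSet gene = new FeatureSet("gene", Arrays.asList(transcript));
        System.out.println(transcript.length());          // 351
        System.out.println(transcript.componentLength()); // 252
        System.out.println(gene.getStart() + ".." + gene.getEnd()); // 100..450
    }
}
```

Because FeatureSet extends SeqFeature, code written against SeqFeature or RangeI can handle a single exon, a transcript, or a nested set uniformly, which is the property the caption attributes to Transcripts, alignments, and CurationSets.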
Lewis et al. Genome Biology 2002 3:research0082.1 doi:10.1186/gb-2002-3-12-research0082