Table 1

Hierarchy of assembly data types

| Data type | Description |
|---|---|
| Scaffold (100 kb to 10 Mb) | Layout of potentially nonoverlapping contigs based on mate-pair information, ideally spanning entire chromosomes or replicons |
| Contig (5 kb to 500 kb) | Layout of overlapping reads with a consensus sequence |
| Mate-pair (2 kb to 100 kb) | Pair of end-sequenced reads with a known orientation and separation |
| Read (0.5 kb to 1.0 kb) | Base-calls and quality scores assigned to a chromatogram |
| Chromatogram (4 × 10,000 time points) | Signal data from a sequencing reaction of a physical piece of DNA |

Each type is composed of instances of the next lower level type. Typical sizes are listed in parentheses. bp, base pairs; kb, kilobases; Mb, megabases.

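The containment hierarchy in Table 1 can be sketched as a set of nested record types. The following is a minimal illustration only, not the data model of any particular assembler: all class names, field names, and the helper `reads_in_scaffold` are hypothetical. It also assumes, following the descriptions in the table, that a contig stores its reads directly and that a scaffold stores the mate-pairs that link its contigs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chromatogram:
    # Assumed layout: four signal traces (one per base), one value per time point.
    traces: List[List[float]]


@dataclass
class Read:
    bases: str                  # base-calls
    qualities: List[int]        # one quality score per base-call
    chromatogram: Chromatogram  # signal data the base-calls were derived from


@dataclass
class MatePair:
    forward: Read
    reverse: Read
    separation: int             # expected separation between the two reads, in bp


@dataclass
class Contig:
    consensus: str
    reads: List[Read] = field(default_factory=list)


@dataclass
class Scaffold:
    contigs: List[Contig] = field(default_factory=list)
    mate_pairs: List[MatePair] = field(default_factory=list)


def reads_in_scaffold(scaffold: Scaffold) -> int:
    """Count the reads placed in all contigs of a scaffold."""
    return sum(len(contig.reads) for contig in scaffold.contigs)


def make_read(bases: str) -> Read:
    """Build a toy read with flat quality scores and an empty signal trace."""
    empty = Chromatogram(traces=[[], [], [], []])
    return Read(bases=bases, qualities=[30] * len(bases), chromatogram=empty)
```

A scaffold built from two contigs, joined by one mate-pair whose reads fall in different contigs, would then be assembled from the bottom up: reads first, then contigs, then the scaffold that orders them.
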
Schatz et al. Genome Biology 2007 8:R34 doi:10.1186/gb-2007-8-3-r34