Table 1. Comparison of assembly statistics

| Dataset | Assembler | #ctgs/scfs | Good Ctgs/scfs | Total aln (Mbp) | Slt | Hvy | Ch | Size @ 10 Mbp | #@ 10 Mbp | Max ctg size | Err per Mbp |
|---|---|---|---|---|---|---|---|---|---|---|---|
| mockE | SOAPdenovo | 63,014 | 99.3% | 51 | 167 | 131 | 1 | 28,208 | 195 | 249,819 | 5.9 |
| mockE | SOAPdenovo_MA | 63,107 | 99.3% | 51 | 166 | 131 | 1 | 28,208 | 195 | 249,819 | 5.8 |
| mockE | Velvet | 12,381 | 96.0% | 41 | 269 | 106 | 2 | 46,122 | 128 | 183,815 | 9.2 |
| mockE | Velvet_MA | 12,830 | 96.2% | 41 | 256 | 100 | 2 | 42,269 | 137 | 179,673 | 8.7 |
| mockE | MetaVelvet | 23,323 | 96.7% | 49 | 474 | 160 | 5 | 62,131 | 93 | 367,458 | 13.0 |
| mockE | MetaVelvet_MA | 22,772 | 96.8% | 49 | 462 | 156 | 4 | 62,138 | 91 | 367,458 | 12.7 |
| mockE | Meta-IDBA | 22,064 | 95.3% | 47 | 362 | 151 | 3 | 26,141 | 223 | 249,069 | 11.0 |
| mockE | Meta-IDBA_MA | 22,032 | 95.4% | 47 | 362 | 151 | 3 | 26,141 | 223 | 249,069 | 11.0 |
| mockS | SOAPdenovo | 45,251 | 98.8% | 28 | 135 | 99 | 0 | 5,672 | 626 | 186,064 | 8.4 |
| mockS | SOAPdenovo_MA | 44,928 | 98.8% | 28 | 135 | 98 | 0 | 5,672 | 626 | 186,064 | 8.3 |
| mockS | Velvet | 20,981 | 95.6% | 28 | 498 | 127 | 1 | 6,134 | 770 | 119,120 | 22.4 |
| mockS | Velvet_MA | 21,050 | 95.8% | 28 | 485 | 115 | 1 | 6,060 | 775 | 119,120 | 21.5 |
| mockS | MetaVelvet | 19,649 | 94.5% | 28 | 518 | 158 | 2 | 13,028 | 351 | 217,330 | 24.2 |
| mockS | MetaVelvet_MA | 20,551 | 95.3% | 28 | 517 | 143 | 3 | 6,685 | 622 | 217,330 | 20.1 |
| mockS | Meta-IDBA | 4,573 | 92.3% | 18 | 101 | 83 | 0 | 13,150 | 368 | 119,604 | 10.2 |
| mockS | Meta-IDBA_MA | 4,559 | 92.5% | 18 | 101 | 83 | 0 | 13,150 | 368 | 119,604 | 10.2 |
| HMP | SOAPdenovo | 39,028 | 89.9% | 11 | 1,138 | 2,686 | 0 | 9,881 | 514 | 116,204 | 347.6 |
| HMP | SOAPdenovo_MA | 35,230 | 89.1% | 11 | 1,138 | 2,618 | 0 | 11,359 | 426 | 238,051 | 341.5 |
| HMP | Meta-IDBA | 25,861 | 88.9% | 7 | 718 | 2,102 | 0 | 4,215 | 1,144 | 59,188 | 402.8 |
| HMP | Meta-IDBA_MA | 25,698 | 88.7% | 7 | 710 | 2,087 | 0 | 4,215 | 1,144 | 59,188 | 399.6 |
| HMPscf | SOAPdenovo | 31,673 | 99.9% | 11 | - | - | 10 | 9,906 | 510 | 116,181 | 0.9 |
| HMPscf | SOAPdenovo_MA | 27,231 | 99.9% | 11 | - | - | 10 | 11,359 | 426 | 238,051 | 0.9 |
| HMPscf | Meta-IDBA | 20,352 | 99.9% | 7 | - | - | 10 | 4,946 | 939 | 59,188 | 1.4 |
| HMPscf | Meta-IDBA_MA | 22,886 | 99.9% | 7 | - | - | 9 | 22,304 | 238 | 66,401 | 1.3 |

Datasets: mockE (mock Even), mockS (mock Staggered), HMP (Tongue dorsum, contig-level analysis), and HMPscf (Tongue dorsum, scaffold-level analysis). All analyses other than HMPscf were done at the contig level; where necessary, contigs were extracted from scaffolds by splitting at three consecutive Ns. The suffix _MA marks results produced by running MetAMOS on the contigs produced by the corresponding assembler. #ctgs/scfs: total number of contigs/scaffolds in the assembly. Good Ctgs/scfs: fraction of contigs/scaffolds that mapped to the reference genomes without errors. For the HMP dataset (Tongue dorsum contigs), alignments were made only to a small set of genomes estimated by the HMP project to match the genomes in this sample. For the HMPscf dataset, good scaffolds are those without chimeric errors. Total aln (Mbp): total amount of sequence that can be aligned to the reference genomes. Slt: slight mis-assemblies, identified by alignments that cover 80% or more of the aligned contig in a single match. Hvy: heavy mis-assemblies, identified by alignments that cover less than 80% of the aligned contig in a single match or that have two or more matches to a single reference. Ch: chimeras, that is, contigs with matches to two distinct reference genomes. Neither heavy mis-assemblies nor chimeras count towards reference coverage. Size @ 10 Mbp: the size of the largest contig c such that the summed size of all contigs larger than c is more than 10 Mbp (similar to the commonly used N50 size). #@ 10 Mbp: smallest number of contigs whose cumulative size adds up to more than 10 Mbp. Max ctg size: size of the largest contig in the assembly. Err per Mbp: average number of errors per Mbp. Numbers in bold represent the best value for the specific dataset.
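
The legend's definitions of contig extraction, Size @ 10 Mbp, and #@ 10 Mbp can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `split_scaffold` and `size_stats_at` are hypothetical, and two readings of the legend are assumed: a scaffold is split at any run of three or more Ns, and "larger than c" is taken inclusively (contigs at least as large as c), as is usual for N50-style statistics.

```python
import re


def split_scaffold(scaffold, min_gap=3):
    """Split a scaffold into contigs at runs of min_gap or more Ns.

    Assumption: "three consecutive Ns" is read as "three or more";
    shorter runs of N stay inside the contig.
    """
    gap = re.compile(rf"[Nn]{{{min_gap},}}")
    return [part for part in gap.split(scaffold) if part]


def size_stats_at(lengths, threshold=10_000_000):
    """Return (size, count) at the given cumulative-size threshold.

    Contigs are visited from largest to smallest. `count` is the
    smallest number of contigs whose cumulative size exceeds the
    threshold, and `size` is the length of the last contig needed.
    Returns None if the assembly never exceeds the threshold.
    """
    total = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        total += length
        if total > threshold:
            return length, count
    return None


if __name__ == "__main__":
    # Toy scaffold: the run of two Ns is kept, longer runs are split.
    contigs = split_scaffold("ACGTNNNACGNNTTNNNNNGG")
    print(contigs)  # ['ACGT', 'ACGNNTT', 'GG']

    # Toy lengths with a threshold of 8 instead of 10 Mbp.
    print(size_stats_at([1, 2, 3, 4, 5], threshold=8))  # (4, 2)
```

With real data, `lengths` would hold the contig lengths of one assembly and the default threshold of 10,000,000 bp corresponds to the 10 Mbp used in the table. Unlike N50, the threshold is a fixed amount of sequence rather than half the assembly size, so values are comparable across assemblies of different total size.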

Treangen et al. Genome Biology 2013 14:R2 doi:10.1186/gb-2013-14-1-r2