Table 1 |
||||||
|
Progress in crop genome sequencing |
||||||
|
Species (common name) |
Genome size |
Ploidy |
Sequence strategy |
Publication date |
Assembly features |
Reference |
|
|
||||||
|
Oryza sativa (rice) |
389 Mb |
2n = 2x = 24 |
BAC physical map, Sanger sequencing |
Aug 2005 |
Essentially complete chromosome arm coverage |
[2] |
|
|
||||||
|
Populus trichocarpa (black cottonwood) |
550 Mb |
2n = 2x = 38 |
BAC physical map, WGS, Sanger sequencing |
Sep 2006 |
2,447 scaffolds^c containing 410 Mb, 82% of sequence genetically anchored |
[3] |
|
|
||||||
|
Vitis vinifera (pinot noir grape) |
475 Mb |
2n = 2x = 38 |
WGS, Sanger sequencing |
Sep 2007 |
3,514 supercontigs^c containing 487 Mb, 69% of sequence genetically anchored |
[5] |
|
|
||||||
|
Sorghum bicolor (sorghum) |
700 Mb |
2n = 2x = 20 |
WGS, Sanger sequencing |
Jan 2009 |
229 scaffolds containing 97% of the genome, 88% of sequence genetically anchored |
[6] |
|
|
||||||
|
Zea mays (maize) |
2,300 Mb |
2n = 2x = 20, one WGD^a allotetraploid |
BAC physical map, BAC sequence 4-6x deep |
Nov 2009 |
2,048 Mb in 125,325 contigs^b forming 61,161 scaffolds |
[4] |
|
|
||||||
|
Glycine max (soybean) |
1,115 Mb |
Two WGD 2n = 2x = 40 allopolyploid |
WGS, Sanger sequencing |
Jan 2010 |
397 scaffolds containing 85% of the genome, 98% of sequence genetically anchored |
[7] |
|
|
||||||
|
Malus × domestica (apple) |
750 Mb |
One WGD 2n = 2x = 34 |
WGS, Sanger, Roche 454 |
Oct 2010 |
1,629 metacontigs^c containing 80% of the genome, 71% of sequence genetically anchored |
[10] |
|
|
||||||
|
Theobroma cacao (cacao) |
430 Mb |
2n = 2x = 20 |
WGS, Sanger, Illumina, Roche 454 |
Dec 2010 |
524 scaffolds containing 80% of the genome, 67% of sequence genetically anchored |
[11] |
|
|
||||||
|
Fragaria vesca (woodland strawberry) |
240 Mb |
2n = 2x = 14 |
WGS, Roche 454, Illumina, SOLiD |
Dec 2010 |
272 scaffolds containing 95% of the genome, 94% of sequence genetically anchored |
[13] |
|
|
||||||
|
Phoenix dactylifera (date palm) |
658 Mb |
2n = 2x = 36 |
WGS, Illumina |
June 2011 |
57,277 scaffolds containing 60% of the genome |
[12] |
|
|
||||||
|
Solanum tuberosum (potato) |
844 Mb |
2n = 4x = 48 |
Double monoploid DM and diploid RH, WGS, Illumina, Roche 454 |
July 2011 |
443 superscaffolds containing 78% of the genome, 86% of the assembly genetically anchored |
[14] |
|
|
||||||
|
Brassica rapa (Chinese cabbage) |
485 Mb |
Three WGD 2n = 2x = 20 |
WGS, Illumina, BAC end Sanger sequencing |
Aug 2011 |
288 Mb in scaffolds, 90% of the assembly genetically anchored |
[15] |
|
|
||||||
|
Medicago truncatula (alfalfa relative) |
375 Mb |
WGD 2n = 2x = 16 |
BAC physical map, Sanger, Illumina |
Dec 2011 |
8 pseudomolecules containing 70% of the genome, 100% in optical map |
[16] |
|
|
||||||
|
Manihot esculenta (cassava) |
770 Mb |
2n = 2x = 36 |
WGS, Roche 454, BAC end Sanger sequencing |
Jan 2012 |
12,977 scaffolds containing 80% of the genome |
[19] |
|
|
||||||
|
Cajanus cajan (pigeonpea) |
833 Mb |
2n = 2x = 22 |
WGS, Illumina |
Jan 2012 |
137,542 scaffolds containing 73% of the genome |
[20] |
|
|
||||||
|
Setaria italica (foxtail millet) |
500 Mb |
2n = 2x = 18 |
WGS, Sanger, Illumina, BAC end sequence |
May 2012 |
597 scaffolds containing 80% of the genome, 99% of the assembly genetically anchored |
[21] |
|
|
||||||
|
Solanum lycopersicum (tomato) |
900 Mb |
2n = 2x = 24 |
WGS, Roche 454, Illumina and SOLiD, BAC end Sanger sequencing |
May 2012 |
91 scaffolds containing 85% of the genome, 99% of the assembly genetically anchored |
[17] |
|
|
||||||
|
Cucumis melo (melon) |
312 Mb |
Three WGD 2n = 2x = 24 |
WGS, Roche 454, BAC end sequencing |
July 2012 |
1,584 scaffolds containing 83% of the genome, 88% of the assembly genetically anchored |
[22] |
|
|
||||||
|
Musa acuminata (Cavendish banana) |
523 Mb |
2n = 2x = 22 |
WGS, Roche 454, Sanger, Illumina |
Aug 2012 |
24,425 contigs containing 90% of the genome, 70% of the assembly genetically anchored |
[33] |
|
|
||||||
|
Citrus sinensis (Valencia sweet orange) |
367 Mb |
2n = 2x = 18 |
Dihaploid WGS, Illumina |
Jan 2013 |
4,811 scaffolds containing 82% of the genome, 73% of the assembly genetically anchored |
[23] |
|
|
||||||
|
Gossypium raimondii (D genome cotton) |
880 Mb |
2n = 2x = 26 |
WGS, Illumina |
Aug 2012 |
4,715 scaffolds containing 85% of the genome, 73% of the assembly genetically anchored |
[24] |
|
|
||||||
|
Hordeum vulgare (barley) |
5,100 Mb |
2n = 2x = 14 |
WGS, Illumina, BAC physical map, BAC sequence (Roche 454, Illumina) |
Nov 2012 |
Physical map (4.98 Gb), BAC sequence (1.13 Gb), WGS assemblies (1.9 Gb); integrated by physical map and syntenic order |
[26] |
|
|
||||||
|
Triticum aestivum (bread wheat) |
17,000 Mb |
2n = 6x = 42 allopolyploid |
WGS, Roche 454 |
Nov 2012 |
Orthologous group assembly, 437 Mb |
[27] |
|
|
||||||
|
Gossypium raimondii (D genome cotton); G. hirsutum (upland cotton) |
880 Mb |
2n = 2x = 26; AtDt allopolyploid |
WGS, Sanger, Roche 454, Illumina; Illumina |
Dec 2012 |
1,084 scaffolds containing 86% of the genome, 98% anchored and oriented to genetic map; 82x coverage |
[25] |
|
|
||||||
|
Cicer arietinum (chickpea) |
738 Mb |
2n = 2x = 16 |
WGS, Illumina, BAC end sequence |
Jan 2013 |
7,163 scaffolds containing 64% of the genome |
[31] |
|
|
||||||
|
Phyllostachys heterocycla (bamboo) |
2,000 Mb |
2n = 2x = 48 |
WGS, Illumina, BAC end sequence |
Apr 2013 |
80% of the 2.05 Gb assembly maps to 5,499 scaffolds of less than 62 kb |
[34] |
|
|
||||||
|
Picea abies (Norway spruce) |
20,000 Mb |
2n = 2x = 24 |
Fosmid pools with both haploid (megagametophyte) and diploid WGS |
May 2013 |
Merged assembly 12.0 Gb, with 4.3 Gb in ≥10 kb scaffolds |
[42] |
|
|
||||||
|
Pinus taeda (loblolly pine) |
24,000 Mb |
2n = 2x = 24 |
WGS single haploid megagametophyte assembly |
In progress |
||
|
|
||||||
|
Miscanthus sp. (elephant grass) |
1,500 Mb |
One WGD, diploid progenitors 2n = 2x = 38 |
WGS |
In progress |
||
|
|
||||||
|
Elaeis guineensis, Elaeis oleifera (oil palm) |
1,890 Mb |
2n = 2x = 32 commercial F1 hybrids |
WGS, BAC physical maps |
In progress |
||
|
|
||||||
|
Saccharum officinarum × S. spontaneum (sugar cane) |
>15,000 Mb |
Diploid progenitors x = 10, 2n = 80; x = 8, 2n = 40-128 |
WGS |
In progress |
||
|
|
||||||
|
^a WGD: alloploids have a whole-genome duplication in their recent lineage.
^b A contig is an unambiguous linear assembly of sequences with no physical gaps in coverage, but which can contain errors.
^c The terms supercontig, scaffold and metacontig are used interchangeably to describe a set of contigs that are linked by a known physical distance but that contain sequence gaps. These scaffolds are usually created using mate-pair reads and BAC end sequences.
^d Pseudomolecule is a term applied to a chromosome-scale assembly of contigs and scaffolds that is anchored to a long-range framework using genetic markers and other chromosome features, including cytogenetic features and deletions. |
||||||
|
Bevan and Uauy Genome Biology 2013 14:206 doi:10.1186/gb-2013-14-6-206 |
||||||