Figure 2.

Crossbow workflow. Previously copied and pre-processed read files are downloaded to the cluster, decompressed and aligned using many parallel instances of Bowtie. Hadoop then bins and sorts the alignments according to primary and secondary keys. Sorted alignments falling into each reference partition are then submitted to parallel instances of SOAPsnp. The final output is a stream of SNP calls made by SOAPsnp.
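The binning and sorting step can be illustrated with a minimal single-process sketch. It is not Crossbow's or Hadoop's actual code. It assumes, for illustration only, that the primary key is a (chromosome, partition index) pair and the secondary key is the alignment offset; the `Alignment` record, the `PARTITION_SIZE` value, and the function names are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical alignment record; the fields are assumptions for illustration."""
    chrom: str
    offset: int
    read_id: str


# Assumed partition width in bases; the caption does not specify a value.
PARTITION_SIZE = 1_000_000


def primary_key(aln, partition_size=PARTITION_SIZE):
    """Assumed primary key: the reference partition the alignment falls into."""
    return (aln.chrom, aln.offset // partition_size)


def bin_and_sort(alignments, partition_size=PARTITION_SIZE):
    """Group alignments by primary key, then sort each bin by offset (secondary key)."""
    bins = defaultdict(list)
    for aln in alignments:
        bins[primary_key(aln, partition_size)].append(aln)
    for key in bins:
        bins[key].sort(key=lambda a: a.offset)
    # Return partitions in key order so downstream processing is deterministic.
    return dict(sorted(bins.items()))


if __name__ == "__main__":
    alignments = [
        Alignment("chr1", 1_500_000, "r1"),
        Alignment("chr1", 200, "r2"),
        Alignment("chr2", 50, "r3"),
        Alignment("chr1", 100, "r4"),
    ]
    for partition, alns in bin_and_sort(alignments).items():
        print(partition, [a.read_id for a in alns])
```

In this sketch each returned bin plays the role of one reference partition, which could then be handed to an independent SNP-calling task.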

Langmead et al. Genome Biology 2009 10:R134 doi:10.1186/gb-2009-10-11-r134