Figure 2.

PAR-CLIP data viewer. From top to bottom, both panels show conservation scores (branch lengths of seven-mers as described by Friedman et al. [11] and the widely used phyloP [47] and phastCons [48] scores, all computed for the 46-way vertebrate multiple alignment obtained from the UCSC Genome Browser [49]), the read coverage in each experiment, and the genomic sequence of the cluster. Below the sequence, SNP positions according to the 1000 Genomes Project are indicated in red (here, there is only one, in (a)), and the actual sequencing reads are shown as black bars for each of the experiments. Mismatches are color-coded as in the genomic sequence above (in both clusters, there are only T-to-C conversions). Different sequences that have been mapped to a cluster can be distinguished by the distinct start or end positions of the corresponding bars or by distinct mismatches. The height of each bar is proportional to the corresponding read count. For clarity, if a sequence is observed more than 15 times in an experiment, the corresponding bar is not heightened further and the read count is indicated in white. Ensembl genes and transcripts are shown below the reads (here, these are present only in (a)), together with PAR-CLIP clusters in yellow and seed site assignments in blue. (a) An experimentally validated target site of hsa-miR-15 in the 3' UTR of DMTF1. This cluster illustrates the characteristic features of many valid target sites (see main text). Interestingly, there is also a known SNP (red box) in proximity to the seed site. (b) An intergenic cluster (that is, there are no Ensembl genes or transcripts at this locus) that does not have these characteristics. Additionally, it contains neither a miRNA seed site nor any overrepresented seven-mer according to PARma. The validated cluster has Cscore and MAscore > 0.9, whereas for the intergenic cluster both scores are 0.

