|
Figure 5.
Inclusion of evolutionary information greatly increases the specificity and selectivity
of CRM searches based on binding-site clustering. The effects of integrating comparative
data into searches for binding site clusters were assessed by counting the number
of (a) true positive, (b) negative and (c) novel CRMs recovered at the different site density cutoffs plotted on the x-axis. The positives used here include the 15 positive pCRMs from Table 2 and 10 additional
positive CRMs from the literature (see text), all of which have identifiably orthologous
sequence in D. pseudoobscura, while the negatives included only the 14 non-functional pCRMs for which orthologous
sequence in D. pseudoobscura could be found. The solid line in each panel shows the results without the use of
D. pseudoobscura; the dashed line shows the results with D. pseudoobscura. Searches displayed were performed using the aligned sites constraint (see Materials
and methods). Comparable results were obtained for the aligned + preserved sites constraint.
The number of false positives does not decrease strictly monotonically as the
binding site cutoff increases. This stems from the cluster-merging behavior of CIS-ANALYST:
lowering the minimum number of sites sometimes leads CIS-ANALYST to append a
lower-density cluster to an adjacent higher-density one, producing a single
cluster with more sites but lower site density. This can actually increase the number
of conserved sites necessary to reach the conservation threshold (see Materials and
methods).
Berman et al. Genome Biology 2004 5:R61 doi:10.1186/gb-2004-5-9-r61 |
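
The cluster-merging effect described in the caption can be illustrated with a small numerical sketch. This is not CIS-ANALYST's code: the `Cluster` type, the `merge` helper, the example coordinates, and in particular the rule that a fixed fraction of sites (here 0.5) must be conserved are all hypothetical stand-ins, since the actual conservation threshold is defined in Materials and methods and is not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """A hypothetical binding-site cluster on a 0-based, half-open interval."""
    start: int    # first base of the cluster (bp)
    end: int      # one past the last base (bp)
    n_sites: int  # number of predicted binding sites in the interval

    @property
    def density(self) -> float:
        """Site density in sites per kilobase."""
        return 1000.0 * self.n_sites / (self.end - self.start)


def merge(a: Cluster, b: Cluster) -> Cluster:
    """Join two adjacent clusters into one spanning both intervals."""
    return Cluster(min(a.start, b.start), max(a.end, b.end),
                   a.n_sites + b.n_sites)


def required_conserved(c: Cluster, fraction: float = 0.5) -> int:
    """Assumed rule: a fixed fraction of the cluster's sites must be conserved."""
    return math.ceil(fraction * c.n_sites)


# A dense cluster next to a sparser one (made-up coordinates and counts).
dense = Cluster(start=0, end=700, n_sites=14)
sparse = Cluster(start=700, end=1700, n_sites=10)
merged = merge(dense, sparse)

for name, c in [("dense", dense), ("sparse", sparse), ("merged", merged)]:
    print(f"{name}: {c.n_sites} sites, {c.density:.1f} sites/kb, "
          f"needs {required_conserved(c)} conserved sites")
```

Under these assumptions the merged cluster has more sites than the dense cluster alone but a lower density, and it needs more conserved sites to pass, which is one way a looser site cutoff can change whether a region is retained.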