Table 4

Known and novel predicted regulatory elements, obtained when applying FastCompare to D. melanogaster and D. pseudoobscura

| Sequence | Rank | D_ATG | W_ATG | Orientation | U/C | TRANSFAC | Comments |
|---|---|---|---|---|---|---|---|
| (a) Known regulatory elements | | | | | | | |
| AACAGCTG | 1 | 373 | [0;1800] | - | 1.64 | - | Known AP-4/MyoD site |
| ATTTGCATA | 3 | 882 | [100;2000] | - | 3.20 | Oct-1 | Known (mammalian) Oct-1 site |
| CACGTGC | 5 | 825.5 | - | - | 1.02 | Myc/Max, PHO4, USF | Known Myc/Max site |
| ATTTATGC | 6 | 866 | - | - | 3.52 | CdxA | Known CdxA site |
| TGACGTCA | 9 | 825 | - | - | 2.36 | CREB | Known CREB site |
| TGATAAG | 11 | 760.5 | [0;1100] | - | 2.53 | GATA | Known GATA site, carbohydrate metabolism (p < 10^-5) |
| TATCGATA | 12 | 168 | [0;1900] | - | 5.39 | - | Known DRE site |
| TTTATGGC | 14 | 978.5 | - | - | 2.82 | Abd-B | Known Abd-B site |
| TAATTGA | 24 | 907 | [0;1900] | - | 2.58 | Ubx, Athb-1 | Known Antp site |
| GAGAGAG | 26 | 705.5 | - | ← (p < 10^-4) | 1.87 | - | Known GAGA site, morphogenesis (p < 10^-23) |
| CAGGTGC | 33 | 1020.5 | - | - | 0.83 | Sn | Known Snail site |
| TGACTCA | 46 | 911 | [100;2000] | - | 1.89 | AP-1, GCN4 | Known AP-1 site |
| ATCAATCA | 51 | 967 | [0;1900] | - | 1.72 | Pbx-1 | Known Pbx-1 site |
| AAGGTCA | 93 | 1015.5 | [400;1900] | - | 1.16 | HNF-4, ER | Known HRE |
| AACATGTG | 105 | 994 | [100;2000] | - | 1.62 | - | Known Twist site |
| GTAAACA | 147 | 813 | [0;1200] | - | 2.54 | Freac, SRY | Known DAF-16 site in C. elegans |
| (b) Novel predicted regulatory elements | | | | | | | |
| ACACACAC | 2 | 922.5 | - | → (p < 10^-12) | 1.97 | - | Unknown site, embryonic development (p < 10^-9) |
| CAAGGAG | 13 | 1091 | [200;2000] | ← (p < 10^-8) | 0.84 | - | Unknown site |
| GCACACAC | 29 | 886 | - | - | 1.80 | - | Unknown site, histogenesis (p < 10^-5) |
| CAAGTTCA | 30 | 920 | [0;1900] | - | 1.23 | - | Unknown site |
| TAATTAA | 31 | 871 | [500;2000] | - | 3.07 | Ftz | Unknown palindromic homeodomain-like site |
| CAACAACA | 42 | 968.5 | [200;2000] | - | 1.22 | - | Unknown site, regulation of transcription (p < 10^-5) |
| TGGCGCC | 48 | 951 | - | - | 0.84 | - | Unknown palindromic site |
| CCTGTTGC | 111 | 653 | [0;1800] | - | 0.90 | - | Unknown site |
| GTGTGACC | 112 | 296 | [0;1900] | → (p < 10^-5) | 2.22 | - | Unknown site |
| CAGGTAG | 143 | 924.5 | [0;1700] | - | 0.94 | - | Unknown site, cell fate commitment (p < 10^-8) |
| CACACGCA | 145 | 968.5 | - | - | 1.49 | - | Unknown site, cellular morphogenesis (p < 10^-5) |
| GTCAACAA | 169 | 904 | - | - | 1.48 | - | Unknown site, similar to DAF-16 |
| AAATGGCG | 205 | 592 | - | - | 1.54 | - | Unknown site |
| TTGACCCA | 239 | 860 | [0;1700] | - | 1.60 | - | Unknown site |
| TGACACAC | 273 | 860 | - | - | 1.83 | - | Unknown site |
| TGTCAAC | 281 | 999 | [100;1900] | | 1.55 | - | Unknown site |

(a) For each known regulatory element, we show the best k-mer, its rank within the set of 469 highest-scoring k-mers, the median distance to ATG (D_ATG, for occurrences upstream of genes within the conserved set), the optimal window (W_ATG), the orientation bias, the corrected ratio of upstream/coding bias (U/C), the total (up-regulated/down-regulated) number of microarray conditions in which the k-mer was found (see Method), TRANSFAC matches, and the best GO enrichment. (b) Novel predicted regulatory elements. The k-mers shown here were selected from the list of 469 highest-scoring k-mers based on their short median distance to ATG, short optimal window, significant orientation bias, strong over-representation ratio (U/C), presence in upstream regions of over- or underexpressed genes in several microarray conditions, palindromicity, or resemblance to known sites in other species.

Elemento and Tavazoie, Genome Biology 2005, 6:R18. doi:10.1186/gb-2005-6-2-r18
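
The caption names palindromicity as one of the criteria used to select novel k-mers. As a rough illustration only, the sketch below tests whether a k-mer equals its own reverse complement. It assumes that "palindromic" means reverse-complement identity, which the table does not state explicitly. The function names (`reverse_complement`, `is_palindromic`, `longest_palindromic_core`) are hypothetical and are not part of FastCompare. The "core" helper is included because the two 7-mers labelled palindromic in the table (TAATTAA and TGGCGCC) are of odd length and so cannot be exact reverse-complement palindromes, although each contains a palindromic 6-mer.

```python
# Minimal sketch: reverse-complement palindromicity of DNA k-mers.
# Assumption: "palindromic" means the sequence equals its reverse complement.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence is identical to its reverse complement."""
    return seq.upper() == reverse_complement(seq)


def longest_palindromic_core(seq: str, min_len: int = 4) -> str:
    """Return the longest substring (at least min_len long) that equals its
    reverse complement, or an empty string if there is none."""
    seq = seq.upper()
    for length in range(len(seq), min_len - 1, -1):
        for start in range(len(seq) - length + 1):
            candidate = seq[start:start + length]
            if is_palindromic(candidate):
                return candidate
    return ""


if __name__ == "__main__":
    # A few k-mers taken from the table above.
    for kmer in ["TGACGTCA", "TATCGATA", "TAATTAA", "TGGCGCC", "CAAGGAG"]:
        print(kmer, is_palindromic(kmer), longest_palindromic_core(kmer) or "-")
```

Under this definition the known CREB and DRE sites (TGACGTCA, TATCGATA) are full palindromes, while TAATTAA and TGGCGCC are palindromic only over a 6-mer core, and CAAGGAG has no palindromic core of length 4 or more.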