Table 2

Indel density for annotation features (across all 44 ENCODE regions)


| Feature | Indels (n) | Indels (bp) | Rate (number per 100 kb) | 99% CI (number) | Rate (bp per 100 kb) | 99% CI (bp) | Feature length (kb) |
|---|---|---|---|---|---|---|---|
| Manual | 2,186 | 6,504 | 14.6 | 11.7 to 18.2 | 43.4 | 34.4 to 54.7 | 14,998 |
| Random | 2,300 | 6,506 | 15.3 | 13.6 to 17.3 | 43.4 | 37.5 to 50.2 | 15,000 |
| Overall | 4,486 | 13,010 | 15.0 | 13.4 to 16.7 | 43.4 | 38.3 to 49.1 | 29,998 |
| RNA transcription | | | | | | | |
|   CDS | 5 | 5 | 0.7 | 0.1 to 8.6 | 0.7 | 0.1 to 8.6 | 675 |
|   TSS | 2 | 2 | 3.3 | | 3.3 | | 61 |
|   RACEfrags | 9 | 28 | 2.1 | 0.8 to 5.4 | 6.6 | 1.3 to 33.9 | 425 |
|   TARs/transfrags | 37 | 78 | 5.8 | 3.5 to 9.6 | 12.3 | 6.8 to 22.3 | 634 |
|   Pseudo-exons | 9 | 26 | 6.6 | 2.6 to 16.6 | 19.1 | 5.8 to 63.3 | 136 |
|   3' UTR | 48 | 103 | 11.0 | 7.2 to 16.7 | 23.6 | 13.5 to 41.3 | 436 |
|   5' UTR | 7 | 32 | 6.0 | 1.6 to 22.3 | 27.4 | 3.8 to 198.7 | 117 |
|   TUF | 53 | 160 | 12.2 | 7.8 to 19.2 | 36.9 | 20.2 to 67.6 | 433 |
| Open chromatin | | | | | | | |
|   FAIRE-sites | 106 | 327 | 7.7 | 5.6 to 10.6 | 23.8 | 15.5 to 36.7 | 1,372 |
|   DHS (NHGRI) | 19 | 61 | 6.1 | 3.3 to 11.3 | 19.7 | 8.3 to 46.9 | 310 |
|   DHS (Regulome) | 43 | 135 | 8.6 | 5.3 to 14.0 | 27.0 | 13.4 to 54.4 | 499 |
| DNA-protein interaction/transcript regulation | | | | | | | |
|   HisPolTAF | 141 | 348 | 13.1 | 10.0 to 17.2 | 32.4 | 22.5 to 46.5 | 1,076 |
|   Seq_specific (all motifs) | 131 | 420 | 11.2 | 8.3 to 15.0 | 35.8 | 23.1 to 55.3 | 1,174 |
|   SeqSp (sequence specific factors) | 54 | 225 | 10.2 | 6.2 to 16.7 | 42.5 | 20.1 to 89.5 | 530 |
| Ancestral repeats | 532 | 1,592 | 7.9 | 6.7 to 9.2 | 26.5 | 21.7 to 32.5 | 5,998 |
| Evolutionary constraint | | | | | | | |
|   MCS strict | 19 | 31 | 2.5 | 1.3 to 5.1 | 4.1 | 1.6 to 10.4 | 748 |
|   MCS moderate | 78 | 170 | 5.1 | 3.5 to 7.6 | 11.2 | 6.8 to 18.5 | 1,515 |
|   MCS loose | 356 | 960 | 9.8 | 8.2 to 11.7 | 26.4 | 20.9 to 33.4 | 3,637 |
| Cell cycle | | | | | | | |
|   EarlyRepSeg | 1,124 | 2,989 | 16.4 | 13.8 to 19.4 | 43.5 | 33.3 to 56.9 | 6,868 |
|   MidRepSeg | 1,190 | 3,352 | 15.4 | 13.5 to 17.5 | 43.2 | 35.3 to 53.0 | 7,751 |
|   LateRepSeg | 1,110 | 3,345 | 13.9 | 12.1 to 15.9 | 41.9 | 32.9 to 53.3 | 7,991 |

bp, base pairs; CDS, coding sequence; CI, confidence interval; DHS, DNase hypersensitive sites; ENCODE, Encyclopedia of DNA Elements; FAIRE, formaldehyde-assisted isolation of regulatory elements; kb, kilobases; MCS, multi-species conserved sequence; NHGRI, National Human Genome Research Institute; transfrag, transcribed fragment; RACEfrag, rapid amplification of cDNA ends fragment; TAR, transcriptionally active region; TSS, transcription start site; TUF, transcripts of unknown function; UTR, untranslated region.
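The point estimates in the two rate columns appear to follow directly from the counts and the feature length: count (or affected bp) divided by feature length in kb, scaled to 100 kb. The following is a minimal sketch of that arithmetic for a few rows copied from the table. The function name `rate_per_100kb` and the `rows` structure are illustrative assumptions, not part of the original analysis, and the method used for the 99% confidence intervals is not stated in the table, so it is not reproduced here.

```python
def rate_per_100kb(count, feature_length_kb):
    """Return events (indel count or affected bp) per 100 kb of feature."""
    return 100.0 * count / feature_length_kb


# (feature, indel count n, indel bp, feature length in kb), copied from Table 2.
rows = [
    ("Manual", 2186, 6504, 14998),
    ("Random", 2300, 6506, 15000),
    ("Overall", 4486, 13010, 29998),
    ("CDS", 5, 5, 675),
    ("MCS strict", 19, 31, 748),
]

# Map each feature to (number per 100 kb, bp per 100 kb), rounded to one
# decimal place as in the table.
rates = {
    name: (
        round(rate_per_100kb(n, length_kb), 1),
        round(rate_per_100kb(bp, length_kb), 1),
    )
    for name, n, bp, length_kb in rows
}

for name, (n_rate, bp_rate) in rates.items():
    print(f"{name}: {n_rate} indels per 100 kb, {bp_rate} bp per 100 kb")
```

For these rows the computed values match the tabulated rates (for example, 14.6 and 43.4 for the Manual regions). Only the point estimates are checked here.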

Clark et al. Genome Biology 2007 8:R180   doi:10.1186/gb-2007-8-9-r180
