Additional file 3.

Distribution of domain gain events according to the position of the domain insertion and the number of exons gained, for the set of high-confidence domain gains and the set of medium-confidence domain gains. (a) The distribution of characteristics of domains from the high-confidence set of domain gains, identical to that in Figure 2. (b) The distribution of characteristics of domains from the set of medium-confidence domain gains. In total there are 330 high-confidence domain gain events and 849 medium-confidence domain gains (of which 19 gains have ambiguous position and are not shown in the graph). The flowchart in Additional file 1 shows the procedure used to create these two sets of domain gains. The distribution of domain gains in the medium-confidence set (b) is similar to that in the high-confidence set; the main difference is a larger number of middle domain gains. We believe that this is largely due to false domain gain calls that arise when some proteins in the TreeFam families lack Pfam annotations for domains that are actually present in these proteins.

Format: EPS Size: 2.6MB

Buljan et al. Genome Biology 2010 11:R74   doi:10.1186/gb-2010-11-7-r74