Figure 4.

Specifying the phylo-HMM for analysis of VWF. The meanings of the substitution model arguments (lines 1 to 3) are as follows: ordered_param, the rate parameter will be split into bins whose values are ordered from small to large; distribution, the statistical distribution by which parameter values are determined; and recode_gaps, whether gap characters are set to 'N'. The substitution model is then turned into a likelihood function (line 5) by providing a phylogenetic tree, specifying that the Γ distribution is split into two bins, and using the sites_independent argument to indicate that rate classes are autocorrelated along the sequence. We finish the definition of the Γ rate heterogeneity distribution by fixing the bin probabilities (bprobs) at their default values (line 6), which are equal. The remaining statements provide the alignment data to the likelihood function, optimize it, and extract the posterior probabilities of each site belonging to each rate class (lines 7 to 9). The slow rate class is automatically assigned the name bin0, and its probabilities are extracted by slicing the array (line 10). HMM, hidden Markov model; VWF, von Willebrand Factor.
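
The final steps of the caption (per-site posterior probabilities for each rate class, then slicing out the slow class) can be illustrated with a minimal, library-independent sketch. This is not the PyCogent code shown in the figure: the function name posterior_bin_probs, the stay_prob parameter, and the per-site likelihood values are all hypothetical, chosen only to show how a two-bin HMM with equal bin probabilities and autocorrelated rate classes yields a posterior table whose first column corresponds to bin0.

```python
def posterior_bin_probs(site_likelihoods, bprobs, stay_prob):
    """Forward-backward posteriors for a toy rate-class HMM.

    site_likelihoods: one tuple per site, giving the likelihood of that
        site under each rate bin (hypothetical values, not real data).
    bprobs: prior probability of each bin (equal in the caption's setup).
    stay_prob: probability that adjacent sites share a bin (an assumed
        stand-in for the autocorrelation of rate classes).
    """
    n_bins = len(bprobs)
    switch = (1.0 - stay_prob) / (n_bins - 1)
    trans = [[stay_prob if i == j else switch for j in range(n_bins)]
             for i in range(n_bins)]

    # Forward pass, rescaled at each site to avoid numerical underflow.
    prev = [bprobs[k] * site_likelihoods[0][k] for k in range(n_bins)]
    total = sum(prev)
    prev = [p / total for p in prev]
    forward = [prev]
    for lik in site_likelihoods[1:]:
        cur = [lik[j] * sum(prev[i] * trans[i][j] for i in range(n_bins))
               for j in range(n_bins)]
        total = sum(cur)
        prev = [c / total for c in cur]
        forward.append(prev)

    # Backward pass, also rescaled; built from the last site to the first.
    backward = [[1.0] * n_bins]
    for lik in reversed(site_likelihoods[1:]):
        nxt = backward[0]
        cur = [sum(trans[i][j] * lik[j] * nxt[j] for j in range(n_bins))
               for i in range(n_bins)]
        total = sum(cur)
        backward.insert(0, [c / total for c in cur])

    # Combine and normalise so each site's bin probabilities sum to one.
    posteriors = []
    for f, b in zip(forward, backward):
        joint = [fi * bi for fi, bi in zip(f, b)]
        total = sum(joint)
        posteriors.append([x / total for x in joint])
    return posteriors


# Hypothetical likelihoods for four sites under (slow bin, fast bin).
toy_likelihoods = [(0.9, 0.1), (0.8, 0.2), (0.2, 0.8), (0.1, 0.9)]
posteriors = posterior_bin_probs(toy_likelihoods, bprobs=[0.5, 0.5],
                                 stay_prob=0.9)

# Slice out the first column: the posterior of the slow class ("bin0").
bin0 = [row[0] for row in posteriors]
```

In this sketch the column slice plays the role of the array slicing described for line 10 of the figure; a real analysis would take the per-site likelihoods from the fitted likelihood function rather than from hand-written numbers.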

Knight et al. Genome Biology 2007 8:R171   doi:10.1186/gb-2007-8-8-r171