Figure 2.
An example of confounding and a stratified analysis of environmental and genetic factors.
Here we assume two populations (for example, races), groups A and B. G1 and G2 represent
dichotomous genotype classes at a candidate gene locus (here one of the classes represents
two genotypes for simplification, as would be the case for a dominant model), and
E1 and E2 represent two strata of an environmental factor. (a) We assume that the probability (P) of trait D depends only on E, so that the risk
of D given E1 is 10%, versus 1% given E2. In group A, the frequencies of G1, G2, E1
and E2 are each 50%, whereas in group B, the frequencies of G1 and E1 are each 10% and
the frequencies of G2 and E2 are each 90%. Then, within group A, the prevalence of D
is 5.5% whereas in group B the prevalence is 1.9%; hence, a racial difference exists
in the prevalence of D. (b) We next consider the prevalence of D within strata defined by G and E. First, we
assume G and E are frequency-independent within each group. In this case, the frequency
difference in D between groups A and B persists within strata defined by G, but not
within strata defined by E. Thus, the environmental factor E can completely explain
the racial difference between groups A and B, but the genetic factor G cannot. Next,
consider the case where G and E are completely correlated in frequency within groups.
In this case, analysis stratified on G or E eliminates the prevalence difference between
groups A and B, and it is impossible to determine which is the functional cause of
the racial difference. More important, consider the situation where factor E was not
measured. Then for the first scenario (G and E independent within group), analysis
stratified on G yields the correct interpretation that G does not contribute to the
racial difference; for the second scenario (G and E fully correlated), however, analysis
stratified on G would lead to the incorrect conclusion that G is the cause of the
racial difference. P(D|G1) denotes the probability of disease given that an individual
has genotype G1, and similarly for G2, E1 and E2.
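
The arithmetic behind panels (a) and (b) can be checked with a short sketch. It uses only the risks and frequencies stated in the caption; the function names `prevalence` and `risk_given_g` are hypothetical, introduced here for illustration, and the two scenarios are modeled in the simplest way consistent with the text (full independence of G and E within a group, or G1 always co-occurring with E1).

```python
# Risk of trait D depends only on the environmental stratum (values from the caption).
RISK_GIVEN_E = {"E1": 0.10, "E2": 0.01}

# Frequency of E1 within each group; the caption gives G1 the same frequency.
FREQ_E1 = {"A": 0.50, "B": 0.10}


def prevalence(group):
    """Overall prevalence of D in a group, averaging the risk over E."""
    p_e1 = FREQ_E1[group]
    return p_e1 * RISK_GIVEN_E["E1"] + (1 - p_e1) * RISK_GIVEN_E["E2"]


def risk_given_g(group, genotype, correlated):
    """P(D | G) within a group under one of the two scenarios.

    correlated=False: G and E are independent within the group, so conditioning
    on G leaves the distribution of E unchanged and P(D | G) equals the group
    prevalence.
    correlated=True: G1 always occurs with E1 and G2 with E2, so P(D | G)
    equals the risk of the matching E stratum.
    """
    if correlated:
        return RISK_GIVEN_E["E1" if genotype == "G1" else "E2"]
    return prevalence(group)


for group in ("A", "B"):
    print(f"group {group}: prevalence = {prevalence(group):.3f}")
    for correlated in (False, True):
        label = "correlated" if correlated else "independent"
        for genotype in ("G1", "G2"):
            risk = risk_given_g(group, genotype, correlated)
            print(f"  {label:11s} P(D|{genotype}) = {risk:.3f}")
```

Under independence, the G-stratified risks still differ between groups A and B, whereas under full correlation they coincide, which is why stratifying on G alone cannot distinguish the two scenarios when E is unmeasured.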
Risch et al. Genome Biology 2002 3:comment2007.1-comment2007.12 doi:10.1186/gb-2002-3-7-comment2007