Interactions versus Confounding
Lead Author(s): Jeff Martin, MD
Confounding vs Interaction
How do confounding and interaction differ?
Confounding: An extraneous or nuisance pathway that an investigator hopes to prevent or rule out
Confounding is an extraneous pathway that we want to rule out or preclude.
Interaction:
- A more detailed description of the relationship between the exposure and disease
- A richer description of the biologic or behavioral system under study
- A finding to be reported, not a bias to be eliminated
Interaction, however, when present, is a more detailed description of the biological or behavioral system under study.
When interaction is present, the issue of confounding becomes irrelevant.
- It is not extraneous but rather a richer description of the system.
- When present, it is not a bias we are seeking to eliminate but rather a new finding we should report.
Reporting or Ignoring Interactions
When to report or ignore interaction is not clear-cut, and we can give no absolute rules for it. The decision is a clinical, statistical, and practical one.
Clinical: Is the magnitude of stratum-specific differences substantively (clinically) important?
* By clinical, we mean that we have to look at the magnitude of the stratum-specific differences. Differences so small as to be of very little relevance from a clinical or biologic perspective are not worth reporting. In contrast, very large differences are telling us something clinically important, and we should want to report them.
Statistical: What do the p value and confidence intervals from the test of homogeneity show?
- There are inherent limitations in the power of the test of homogeneity.
- Only relatively large effect sizes or large sample sizes can achieve p < 0.05.
- One approach is to report interaction when p falls below a threshold of 0.10 to 0.20, provided the magnitude of the differences is large enough.
- However, the meaning of the p value is no different than in other contexts.
* By statistical, we mean that we need to look at the p value and confidence intervals, but what p value should we use? Tests of homogeneity have inherently limited statistical power: only a relatively large difference between stratum-specific estimates, or a large sample size, can achieve a p value below 0.05. Hence, it may be worthwhile to use a higher threshold, not for declaring interaction statistically significant but for deciding when to report stratum-specific estimates rather than pooling them. It should be emphasized that we are not condoning a different cutoff for statistical significance in tests of interaction, as if they were fundamentally different from any other hypothesis test. Their p values are interpreted just like any other p value.
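To make the point about power concrete, here is a minimal sketch that uses only the Python standard library. It assumes a simple Wald-type test of homogeneity on the log risk ratio scale, which is one of several possible tests and not necessarily the one used in this course. The counts and the function names (`log_rr_and_var`, `homogeneity_p`) are hypothetical, chosen only for illustration.

```python
import math

def log_rr_and_var(a, n1, c, n0):
    """Log risk ratio and its approximate variance.

    a / n1: cases / total among the exposed
    c / n0: cases / total among the unexposed
    """
    log_rr = math.log((a / n1) / (c / n0))
    var = 1 / a - 1 / n1 + 1 / c - 1 / n0
    return log_rr, var

def homogeneity_p(stratum1, stratum2):
    """Wald-type test (1 df) that two stratum-specific log risk ratios are equal."""
    l1, v1 = log_rr_and_var(*stratum1)
    l2, v2 = log_rr_and_var(*stratum2)
    chi2 = (l1 - l2) ** 2 / (v1 + v2)
    # Upper tail of a chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts: RR = 3.0 in stratum 1 and RR = 1.5 in stratum 2
small = [(30, 100, 10, 100), (15, 100, 10, 100)]
# Same risks and risk ratios, but ten times as many subjects
large = [tuple(10 * x for x in s) for s in small]

p_small = homogeneity_p(*small)
p_large = homogeneity_p(*large)
print(f"100 per group:  p = {p_small:.3f}")
print(f"1000 per group: p = {p_large:.5f}")
```

With these made-up counts, the stratum-specific risk ratios are identical in both scenarios, yet only the larger study yields a p value below 0.05. The smaller study is the kind of case where a reporting threshold of 0.10 to 0.20 could matter.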
Practical: How complicated is the story? If it is not too complicated to report stratum-specific estimates, it is often more revealing to report potential interaction than to ignore it.
* Finally, from a practical perspective, the question is how complicated it is to report stratum-specific estimates individually instead of a single number that applies to all strata. If there are 10 different strata to report on, this could make for a complicated message. On the other hand, if there are just two strata, it is probably worthwhile to report the interaction rather than ignore it.
Guidelines for Reporting vs Ignoring Interactions
The table below looks at several examples to give a feeling for when we should report, rather than ignore, interaction. You'll see that this is an art form and requires consideration of both clinical and statistical significance.
Let's say we are looking at the association between a given exposure and a given disease, and we then have to consider a potential effect modifier that has two levels: present and absent.
- The first two columns show the stratum-specific measures of association (in this example we are using risk ratios)
- The third column (red background) is the p value for the test of homogeneity (or heterogeneity)
- Final column contains our recommendation in terms of whether to declare or ignore interaction
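As a rough illustration of how the first two columns of such a table could be produced, the sketch below computes a risk ratio and an approximate 95% confidence interval for each level of the effect modifier. The counts, the stratum labels, and the function name `risk_ratio_ci` are hypothetical and are not taken from the table. The interval uses the usual large-sample approximation on the log scale.

```python
import math

def risk_ratio_ci(a, n1, c, n0, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale.

    a / n1: cases / total among the exposed
    c / n0: cases / total among the unexposed
    """
    rr = (a / n1) / (c / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    return rr, rr * math.exp(-z * se), rr * math.exp(z * se)

# Hypothetical counts for each level of the potential effect modifier:
# (exposed cases, exposed total, unexposed cases, unexposed total)
strata = {
    "modifier present": (40, 200, 10, 200),
    "modifier absent": (24, 200, 20, 200),
}

rows = {name: risk_ratio_ci(*counts) for name, counts in strata.items()}
for name, (rr, lo, hi) in rows.items():
    print(f"{name}: RR = {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Reporting the interval next to each stratum-specific estimate keeps the magnitude of the difference and its precision in view together, which is what the clinical and statistical criteria above ask us to weigh.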
No Report - Chance As a Cause of Interaction
Whenever the stratum-specific estimates differ, does that mean interaction is present, so that we should not adjust for the third variable but should instead declare interaction and report all the stratum-specific estimates?
The example of spermicide use and Down Syndrome looks at the association between these two variables and the influence of age on this association. There is a reasonable difference in the ORs between the two strata, but look at the sample sizes.
- This could get messy, especially with multiple potential effect modifiers and multiple levels for each effect modifier.
- We could have dozens of different measures of association to report.
- Some of the cells are rather small and therefore we know these numbers are not very statistically precise.
- Somehow, we need to take this possibility of random variation (i.e., sampling error or chance) into account when we assess the presence of interaction.
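To see how small cells affect precision, the following sketch computes an odds ratio with Woolf's log-based 95% confidence interval for two sets of counts that share the same odds ratio. The counts are hypothetical and are not the data from the spermicide example. The function name `odds_ratio_ci` is likewise an illustrative choice.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with Woolf's (log-based) approximate 95% CI.

    a, b: exposed cases, exposed controls
    c, d: unexposed cases, unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, odds_ratio * math.exp(-z * se), odds_ratio * math.exp(z * se)

# Hypothetical stratum with a very small cell (3 exposed cases)
small_cells = odds_ratio_ci(3, 100, 9, 1000)
# Same odds ratio, every cell ten times larger
large_cells = odds_ratio_ci(30, 1000, 90, 10000)

for label, (or_, lo, hi) in [("small cells", small_cells),
                             ("large cells", large_cells)]:
    print(f"{label}: OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

In this made-up example the point estimate is the same in both cases, but the interval from the small-cell stratum is far wider and includes 1. An apparent difference between strata built on such sparse cells may therefore reflect nothing more than sampling error.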