What If: Effect Modification

Wed Aug 16 2023

Series: Working Through 'What If?'

A null average causal effect in a population does not imply a null average causal effect in every subpopulation. Imagine, for instance, that yelling at people has, on average, no effect on their cancer risk. If we stratify by age, however, we might find that yelling at people increases cancer risk in young people and decreases it in old people. Age, therefore, is an effect modifier.

An important definition: if a factor V modifies the effect of a treatment X on an outcome Y, and the treatment effect goes in opposite directions in different strata of V, then we have qualitative effect modification.

Stratification

A stratified analysis is the simplest way to measure effect modification. You stratify the population into subpopulations of interest, and then estimate the average causal effect in each subpopulation.
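As a minimal sketch of a stratified analysis, the snippet below computes a risk ratio within each stratum of V. The counts are invented for illustration (they are not the textbook's data), the names `counts` and `stratum_risk_ratios` are hypothetical, and it assumes treatment is exchangeable within each stratum so that the associational risk ratio can be read as a causal one.

```python
# Hypothetical counts, invented for illustration:
# (V, treated) -> (number of events, number of people)
counts = {
    (1, 1): (4, 10),
    (1, 0): (2, 10),
    (0, 1): (5, 20),
    (0, 0): (10, 20),
}


def risk(events, total):
    """Proportion of people in a cell who experienced the outcome."""
    return events / total


def stratum_risk_ratios(counts):
    """Risk ratio (treated vs. untreated) within each stratum of V."""
    strata = sorted({v for v, _ in counts})
    return {
        v: risk(*counts[(v, 1)]) / risk(*counts[(v, 0)])
        for v in strata
    }


rr_by_v = stratum_risk_ratios(counts)
for v, rr in rr_by_v.items():
    print(f"V={v}: risk ratio = {rr:.2f}")
```

With these made-up counts the two strata point in opposite directions (a risk ratio above 1 in one stratum and below 1 in the other), which is what qualitative effect modification looks like in a stratified table.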

Stratification as a Tool for Adjustment

Stratification is not only used to measure effect modification. It is also routinely used as an alternative to standardization and IP weighting (see the last few chapters) to adjust for some factor L. Under this method, we get $n$ estimates of the average causal effect, one for each of the $n$ strata of L. Under conditional exchangeability given L, the associational measure within each stratum equals the causal measure in that stratum. However, stratification on its own does not provide a single "average" causal effect for the entire population.

When using stratification for adjustment, you must also compute a separate estimate for every combination of the variables required for conditional exchangeability. Standardization and IP weighting do not have this restriction: after stratifying by V, you can use standardization or IP weighting to adjust for L within each stratum of V. This lets us handle conditional exchangeability and effect modification separately.

Stratification necessarily results in multiple estimates of the average causal effect (one for each stratum) if used as a tool for adjustment. In some cases, this is not desirable, so we should fall back on standardization or IP weighting.
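To make the contrast concrete, here is a rough sketch of standardization producing a single risk ratio where stratification would produce one per level of L. The counts and function names are hypothetical, and the sketch assumes conditional exchangeability given a single binary L.

```python
# Hypothetical counts, invented for illustration:
# (L, treated) -> (number of events, number of people)
counts = {
    (0, 1): (2, 4),
    (0, 0): (2, 8),
    (1, 1): (3, 6),
    (1, 0): (1, 2),
}


def standardized_risk(counts, treated):
    """Sum over levels of L of Pr[L = level] * Pr[Y = 1 | treated, L = level]."""
    n = sum(total for _, total in counts.values())
    levels = sorted({level for level, _ in counts})
    result = 0.0
    for level in levels:
        n_level = counts[(level, 0)][1] + counts[(level, 1)][1]
        events, total = counts[(level, treated)]
        result += (n_level / n) * (events / total)
    return result


def crude_risk(counts, treated):
    """Unadjusted risk in one treatment group, ignoring L."""
    events = sum(e for (_, a), (e, _) in counts.items() if a == treated)
    total = sum(t for (_, a), (_, t) in counts.items() if a == treated)
    return events / total


standardized_rr = standardized_risk(counts, 1) / standardized_risk(counts, 0)
crude_rr = crude_risk(counts, 1) / crude_risk(counts, 0)
print(f"crude RR = {crude_rr:.3f}, standardized RR = {standardized_rr:.3f}")
```

In this toy table L is associated with both treatment and outcome, so the crude and standardized risk ratios differ. The standardized one is a single number for the whole population, which is exactly what a purely stratified analysis does not give you.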

Transportability

How can we generalize the results of a study to a target population? This is the question of transportability. In general, we can consider our result transportable if the following hold in both the study population and the target population:

The distribution of effect modifiers is the same
The versions of treatment are the same
The levels of interference are the same. Interference is when the treatment of one unit affects the outcome of another unit.
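The first condition can be illustrated with a small sketch. Below, the stratum-specific risks are held fixed and only the share of people with V = 1 changes between a study population and a target population. All numbers and names (`risks`, `population_risk_ratio`) are invented for illustration, and V is assumed to be the only effect modifier.

```python
# Hypothetical stratum-specific risks, invented for illustration:
# risks[v] = (risk if untreated, risk if treated) within stratum V = v
risks = {1: (0.1, 0.2), 0: (0.4, 0.2)}


def population_risk_ratio(p_v1, risks):
    """Marginal risk ratio when a fraction p_v1 of the population has V = 1."""
    p = {1: p_v1, 0: 1.0 - p_v1}
    risk_untreated = sum(p[v] * risks[v][0] for v in risks)
    risk_treated = sum(p[v] * risks[v][1] for v in risks)
    return risk_treated / risk_untreated


rr_study = population_risk_ratio(0.5, risks)   # V = 1 in half of the study population
rr_target = population_risk_ratio(0.9, risks)  # V = 1 far more common in the target
print(f"study RR = {rr_study:.2f}, target RR = {rr_target:.2f}")
```

Even though treatment does exactly the same thing within each stratum in both populations, the population-level risk ratio is below 1 in the study population and above 1 in the target, so the study result would not transport.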

Surrogate vs Causal Effect Modifications

A surrogate effect modification occurs when the effect of a treatment on an outcome varies across levels of a factor V, but V itself is not what causes the variation; V is merely a proxy for some other factor that does. For example, if we look at the effect of screaming at someone on their cancer risk by age, we might find an effect at age 20 but none at age 70. This is not because age is a true causal effect modifier, but because age is a proxy for hearing loss, which renders the treatment ineffective.

In general, the term “effect modification by V” does not imply that V is a causal effect modifier.

Matching

Matching is another method of adjusting for variables L. The concept of matching is very straightforward: for each unit in the treatment group, we find a unit in the control group that has the same value (or values) of L. This creates a pseudo-population wherein the distribution of L is the same between the treatment and control groups. We can then estimate the average causal effect in this pseudo-population.
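A minimal sketch of exact one-to-one matching on a single variable L might look like the following. The units are invented for illustration, `match_exact` is a hypothetical helper rather than a library function, and matching here is done without replacement, taking the first available untreated unit.

```python
from collections import defaultdict

# Hypothetical units, invented for illustration: (treated, L, Y)
units = [
    (1, 0, 1), (1, 1, 0), (1, 1, 1),
    (0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0), (0, 0, 0),
]


def match_exact(units):
    """Pair each treated unit with one unused untreated unit sharing its L."""
    pool = defaultdict(list)
    for unit in units:
        if unit[0] == 0:
            pool[unit[1]].append(unit)
    pairs = []
    for unit in units:
        if unit[0] == 1 and pool[unit[1]]:
            pairs.append((unit, pool[unit[1]].pop(0)))
    return pairs  # treated units with no available match are dropped


def risk(group):
    """Proportion of units in the group with Y = 1."""
    return sum(y for _, _, y in group) / len(group)


pairs = match_exact(units)
matched_treated = [t for t, _ in pairs]
matched_untreated = [u for _, u in pairs]
matched_rr = risk(matched_treated) / risk(matched_untreated)
print(f"{len(pairs)} pairs, matched RR = {matched_rr:.2f}")
```

By construction the matched treated and matched untreated groups have the same distribution of L. Note that the matched population mirrors whichever group you started from (here, the treated), so the effect estimated in it need not equal the effect in the whole population.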

Many Different Measures

The textbook computes several different causal risk ratios on the same dataset. We’ll borrow the results from its table to discuss their implications.

0.8: The average causal effect in the entire population, as calculated by standardization (or IP weighting)
2.0 and 0.5: The average causal effect in the subpopulations of V, as calculated by stratification
1.0: The average causal effect in the pseudo-population created by matching

Which of these measures is the “true” average causal effect? The answer is that they are all true, but they are all true in different ways.

In the dataset, we have qualitative effect modification (as we can see from the second bullet point). Treatment doubles the risk of cancer in the subpopulation of V, but halves the risk of cancer in the subpopulation of not-V.

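However, the overall causal effect is beneficial (0.8). The population risk ratio is not a simple average of 2.0 and 0.5: it is a ratio of marginal risks, and each marginal risk is an average of the stratum-specific risks weighted by the size of each stratum. It falls below 1 here because the not-V stratum, where treatment is protective, accounts for most of the risk under no treatment, through some combination of its size and its baseline risk.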
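One way to see how the stratum-specific and population risk ratios fit together is the identity sketched below. The notation is an assumption of this sketch rather than something taken from the table: $\Pr[Y^{x}=1]$ is the risk if everyone received treatment level $x$, and V is treated as binary. The population risk ratio is a weighted average of the stratum-specific risk ratios, where each stratum's weight is its share of the risk under no treatment:

$$
\frac{\Pr[Y^{x=1}=1]}{\Pr[Y^{x=0}=1]}
= \sum_{v} w_v \, \frac{\Pr[Y^{x=1}=1 \mid V=v]}{\Pr[Y^{x=0}=1 \mid V=v]},
\qquad
w_v = \frac{\Pr[V=v]\,\Pr[Y^{x=0}=1 \mid V=v]}{\Pr[Y^{x=0}=1]}
$$

Plugging in the stratum-specific risk ratios of 2.0 and 0.5 and the population risk ratio of 0.8, with $w$ the weight on the stratum where treatment doubles risk:

$$
2.0\,w + 0.5\,(1 - w) = 0.8 \quad\Longrightarrow\quad w = 0.2
$$

So, under this identity, the stratum in which treatment is harmful carries only 20% of the weight, and the stratum in which it is protective carries the remaining 80%.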

The point of all this is that you should be diligent in describing your target population, just as you must be diligent in describing your intervention (as discussed in the last post). This is yet another difficulty with observational studies: in a randomized controlled trial, your population is well-defined almost by definition. In an observational study, this is certainly not the case.