Causality
Brian Caffo, Jeff Leek, Roger Peng
@bcaffo
www.bcaffo.com
Correlation isn’t causation, but then what is causation? (XKCD/552)
How do we define causality?
David Hume
http://en.wikipedia.org/wiki/David_Hume
“We may define a cause to be an object followed by another, and where all the objects, similar to the first, are followed by objects similar to the second. Or, in other words, where, if the first object had not been, the second never had existed.”
For further discussion and references on the history of causality, see http://plato.stanford.edu/entries/causation-counterfactual/
The causal effect of a treatment on a subject is the difference between:
the outcome under the treatment the subject actually received and
the outcome that would have occurred had she or he received the other treatment
Consider a treatment
Counterfactual
The average causal effect is the average of the subject-specific causal effects
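A minimal sketch of these definitions, assuming a simulated population in which both potential outcomes are known for every subject (something real data never provides). The function names, the baseline distribution, and the effect size of 1.5 are hypothetical choices made for illustration only.

```python
import random
from statistics import mean


def simulate_potential_outcomes(n, seed=0):
    """Return (y_control, y_treated) pairs for n hypothetical subjects."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n):
        y_control = rng.gauss(10.0, 2.0)   # outcome without the treatment
        effect = rng.gauss(1.5, 0.5)       # assumed subject-specific causal effect
        subjects.append((y_control, y_control + effect))
    return subjects


def average_causal_effect(subjects):
    """Average of the subject-specific causal effects."""
    return mean(y_treated - y_control for y_control, y_treated in subjects)


subjects = simulate_potential_outcomes(1000)
ace = average_causal_effect(subjects)
print(f"average causal effect: {ace:.2f}")
```

In a real study only one entry of each pair is observed, so the subject-specific differences cannot be computed directly; the sketch only makes the target quantity concrete.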
Some thoughts
This way of thinking about causal inference requires an assignable treatment or intervention
We can’t observe counterfactuals; we only get to observe one state of nature
We can, with assumptions and careful study design, make inferences about average causal effects
Causal thinking is essential for understanding the causal implications of our data analyses and the assumptions behind them
Think about these study designs in the light of counterfactuals
Crossover trials
Crossover trials: Give a subject a treatment, then after a suitable washout period, give the other
Consider a study of chronic migraines; give subjects one relief medication, washout period, then another
Why would this not work for an ad campaign, or a weight loss study?
How does it actualize counterfactual thinking?
http://bit.ly/1T39kyx
Natural experiments
http://bit.ly/1AUj1t9
http://bit.ly/1cDE08b
Matching (aka finding doppelgangers)
http://bit.ly/1ARH7Fh
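A rough sketch of the doppelganger idea, assuming a single observed covariate (a hypothetical "age") that drives both who gets treated and the outcome. Each treated subject is paired with the untreated subject closest in age, and the paired differences are averaged. The data-generating numbers, including the true effect of 3.0, are invented for illustration.

```python
import random
from statistics import mean

rng = random.Random(7)
true_effect = 3.0  # assumed constant treatment effect in this simulation

subjects = []
for _ in range(2000):
    age = rng.uniform(20, 60)
    p_treat = (age - 10) / 60            # older subjects are more likely treated
    treated = rng.random() < p_treat
    outcome = 0.5 * age + (true_effect if treated else 0.0) + rng.gauss(0, 1)
    subjects.append((age, treated, outcome))

treated_group = [(a, y) for a, t, y in subjects if t]
control_group = [(a, y) for a, t, y in subjects if not t]

# Naive comparison: confounded, because the treated group is older.
naive = mean(y for _, y in treated_group) - mean(y for _, y in control_group)


def nearest_control(age):
    """Find the untreated subject closest in age (matching with replacement)."""
    return min(control_group, key=lambda c: abs(c[0] - age))


# Matched comparison: each treated subject versus their doppelganger.
matched = mean(y - nearest_control(a)[1] for a, y in treated_group)

print(f"naive: {naive:.2f}, matched: {matched:.2f}, truth: {true_effect}")
```

Matching only balances the covariates that were measured and matched on; an unmeasured confounder would still bias the matched comparison.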
Randomization is our most effective tool for estimating average causal effects
(we have an entire lecture on randomization)
Randomization
Randomization, with high probability, makes the treated and untreated groups directly comparable
Because the groups are comparable, the difference in their average outcomes estimates the average causal effect
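A small simulation of this claim, assuming hypothetical potential outcomes with a constant treatment effect of 2.0. It compares a coin-flip assignment with a self-selected assignment in which subjects with higher baseline outcomes take the treatment. All names and numbers are illustrative assumptions.

```python
import random
from statistics import mean

rng = random.Random(42)
n = 20000
true_effect = 2.0  # assumed constant causal effect

# Both potential outcomes for every subject (known only because we simulate).
y_control = [rng.gauss(50.0, 10.0) for _ in range(n)]
y_treated = [y + true_effect for y in y_control]


def difference_in_means(assignment):
    """Observed treated mean minus observed control mean."""
    treated = [y_treated[i] for i in range(n) if assignment[i]]
    control = [y_control[i] for i in range(n) if not assignment[i]]
    return mean(treated) - mean(control)


# Randomized: a fair coin decides, independent of the potential outcomes.
randomized = [rng.random() < 0.5 for _ in range(n)]
estimate = difference_in_means(randomized)

# Self-selected: subjects with high baseline outcomes choose treatment.
self_selected = [y > 50.0 for y in y_control]
biased_estimate = difference_in_means(self_selected)

print(f"randomized: {estimate:.2f}, self-selected: {biased_estimate:.2f}")
```

Under randomization the two groups have similar baseline outcomes, so the difference in means lands near the true effect; under self-selection the groups differ before treatment and the same calculation is badly off.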
http://skillsprojects.files.wordpress.com/2009/10/like_for_like.jpg
What else can we do?
Some examples
Instrumental variables
oral contraceptives
ovarian cancer
Reimbursement rates
Modeling
Modeling comes with a lot of assumptions
The best-known modeling techniques in causal inference use propensity scores
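One way propensity scores can be used is inverse probability weighting, sketched below under strong assumptions: a single discrete covariate (a hypothetical "severity" level) captures all confounding, and the propensity score is estimated as the observed fraction treated at each level. In practice the score is usually estimated with a model such as logistic regression; this simplified version is an illustration, not a recommended implementation.

```python
import random
from collections import defaultdict
from statistics import mean

rng = random.Random(11)
true_effect = 1.0                           # assumed constant treatment effect
treat_prob = {0: 0.2, 1: 0.5, 2: 0.8}       # sicker subjects are treated more often

data = []
for _ in range(30000):
    severity = rng.choice([0, 1, 2])
    t = 1 if rng.random() < treat_prob[severity] else 0
    y = 5.0 - 2.0 * severity + true_effect * t + rng.gauss(0, 1)
    data.append((severity, t, y))

# Naive comparison: treated subjects are sicker, so treatment looks harmful.
naive = mean(y for _, t, y in data if t == 1) - mean(y for _, t, y in data if t == 0)

# Estimated propensity score: fraction treated at each severity level.
counts = defaultdict(lambda: [0, 0])        # severity -> [n, n_treated]
for s, t, _ in data:
    counts[s][0] += 1
    counts[s][1] += t
propensity = {s: n_treated / n for s, (n, n_treated) in counts.items()}

# Inverse probability weighting with normalized weights.
w_treated = sum(t / propensity[s] for s, t, _ in data)
w_control = sum((1 - t) / (1 - propensity[s]) for s, t, _ in data)
mean_treated = sum(t * y / propensity[s] for s, t, y in data) / w_treated
mean_control = sum((1 - t) * y / (1 - propensity[s]) for s, t, y in data) / w_control
ipw_estimate = mean_treated - mean_control

print(f"naive: {naive:.2f}, weighted: {ipw_estimate:.2f}, truth: {true_effect}")
```

The weighted estimate is only as good as the assumption that the covariates in the propensity model account for all confounding, which is exactly the kind of modeling assumption flagged above.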