Causality

Brian Caffo, Jeff Leek, Roger Peng

@bcaffo

www.bcaffo.com

Correlation isn’t causation, but then what is causation? (XKCD/552)

How do we define causality?

David Hume

http://en.wikipedia.org/wiki/David_Hume

“We may define a cause to be an object followed by another, and where all the objects, similar to the first, are followed by objects similar to the second. Or, in other words, where, if the first object had not been, the second never had existed.”

For further discussion and references on the history of causality, see http://plato.stanford.edu/entries/causation-counterfactual/

The causal effect of a treatment on a subject is the difference between two outcomes:

the outcome under the treatment the subject actually received, and

the outcome that would have occurred had she or he received the other treatment

Consider a treatment

Counterfactual

The average causal effect is the average of the subject-specific causal effects
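
In symbols, one common way to write this uses potential-outcomes notation. The notation is an assumption here, since the slides do not fix any symbols: Y_i(1) is subject i's outcome under treatment, Y_i(0) is the same subject's outcome under the other treatment, and n is the number of subjects.

```latex
\[
\tau_i = Y_i(1) - Y_i(0),
\qquad
\text{ACE} = \frac{1}{n}\sum_{i=1}^{n} \tau_i
           = \frac{1}{n}\sum_{i=1}^{n} \bigl( Y_i(1) - Y_i(0) \bigr).
\]
```

For any given subject only one of Y_i(1) and Y_i(0) is observed, so the individual effect \tau_i cannot be computed directly from data.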

Some thoughts

This way of thinking about causal inference requires an assignable treatment or intervention

We can’t observe counterfactuals; we only get to observe one state of nature

We can, with assumptions and careful study design, make inferences about average causal effects

Causal thinking is essential for understanding the causal implications of our data analyses and the assumptions associated with them
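
The points above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the covariate `severity`, the outcome model, the constant effect of -3, and the assignment rule are all hypothetical. In a simulation both potential outcomes can be generated, which is never possible with real data; the observed data keep only one of them per subject.

```python
import random
from statistics import mean

def simulate_subjects(n=20000, seed=1):
    """Generate both potential outcomes per subject (possible only in a simulation)."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n):
        severity = rng.gauss(0, 1)                 # hypothetical baseline covariate
        y0 = 10 + 2 * severity + rng.gauss(0, 1)   # outcome if untreated
        y1 = y0 - 3                                # outcome if treated (assumed effect: -3)
        # Confounded assignment: more severe subjects are more likely to be treated
        treated = rng.random() < (0.8 if severity > 0 else 0.2)
        observed = y1 if treated else y0           # only one potential outcome is seen
        subjects.append({"y0": y0, "y1": y1, "treated": treated, "observed": observed})
    return subjects

subjects = simulate_subjects()

# Average causal effect: needs both potential outcomes, so it is unavailable in practice
true_ace = mean(s["y1"] - s["y0"] for s in subjects)

# Naive comparison of observed group means, which is all the raw data offer
naive = (mean(s["observed"] for s in subjects if s["treated"])
         - mean(s["observed"] for s in subjects if not s["treated"]))

print(f"true ACE: {true_ace:.2f}, naive difference in observed means: {naive:.2f}")
```

Because treatment here depends on severity, the naive difference is pulled away from the true average causal effect. That gap is what assumptions and careful study design are meant to close.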

Think about these study designs in the light of counterfactuals

Crossover trials

Give a subject one treatment, then, after a suitable washout period, give the other

Consider a study of chronic migraines; give subjects one relief medication, washout period, then another

Why would this not work for an ad campaign, or a weight loss study?

How does it actualize counterfactual thinking?

http://bit.ly/1T39kyx
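
One way to see the counterfactual logic of a crossover design is a sketch in which each subject acts as his or her own control. Everything below is hypothetical: the pain scale, the effect size, and above all the assumption that the washout removes any carryover and that the period itself has no effect. Under those assumptions, the second period stands in for the unobservable counterfactual.

```python
import random
from statistics import mean

rng = random.Random(5)
TRUE_EFFECT = -1.0  # assumed effect of drug B relative to drug A (illustrative only)

paired_differences = []
for _ in range(5000):
    baseline = rng.gauss(5, 2)                               # subject-specific pain level
    pain_on_a = baseline + rng.gauss(0, 0.5)                 # period on drug A
    pain_on_b = baseline + TRUE_EFFECT + rng.gauss(0, 0.5)   # period on drug B, after washout
    paired_differences.append(pain_on_b - pain_on_a)         # subject baseline cancels out

crossover_estimate = mean(paired_differences)
print(f"estimated effect of B vs A: {crossover_estimate:.2f}")
```

If the first treatment permanently changes the subject, as an ad campaign or weight loss might, the second period no longer approximates the counterfactual and the subtraction above stops being meaningful.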

Natural experiments

http://bit.ly/1AUj1t9

http://bit.ly/1cDE08b

Matching (aka finding doppelgängers)

http://bit.ly/1ARH7Fh
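
A minimal sketch of the doppelgänger idea, assuming a single observed confounder (a hypothetical `age` variable) and a constant treatment effect chosen for the illustration. Each treated subject is paired with the untreated subject closest in age, and the matched partner's outcome stands in for the treated subject's counterfactual. This is nearest-neighbor matching with replacement, one of several matching schemes.

```python
import bisect
import random
from statistics import mean

rng = random.Random(7)
TRUE_EFFECT = 2.0  # assumed constant effect, chosen for this illustration

treated, controls = [], []
for _ in range(4000):
    age = rng.uniform(20, 70)                     # hypothetical confounder
    is_treated = rng.random() < (age - 10) / 70   # older subjects are treated more often
    outcome = 0.1 * age + (TRUE_EFFECT if is_treated else 0.0) + rng.gauss(0, 1)
    (treated if is_treated else controls).append((age, outcome))

controls.sort()                                   # sort by age for nearest-neighbor lookup
control_ages = [age for age, _ in controls]

def nearest_control_outcome(age):
    """Outcome of the untreated subject closest in age (matching with replacement)."""
    i = bisect.bisect_left(control_ages, age)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(controls)]
    best = min(candidates, key=lambda j: abs(control_ages[j] - age))
    return controls[best][1]

matched_estimate = mean(y - nearest_control_outcome(age) for age, y in treated)
naive_estimate = mean(y for _, y in treated) - mean(y for _, y in controls)

print(f"matched: {matched_estimate:.2f}, naive: {naive_estimate:.2f}")
```

Matching only balances the variables that are matched on; an unmeasured confounder would remain unbalanced between the pairs.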

Randomization is our most effective tool for estimating average causal effects

(we have an entire lecture on randomization)

Randomization

Randomization, with high probability, makes the treated and untreated groups directly comparable

It can be shown that comparing the randomized groups then yields valid inferences about average causal effects

http://skillsprojects.files.wordpress.com/2009/10/like_for_like.jpg
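
A small simulation can make the randomization claim concrete. It is a sketch only: the covariate `severity`, the outcome model, and the effect sizes are invented. Treatment is assigned by a fair coin flip, so it is independent of every subject characteristic, and the difference in observed group means should land close to the average causal effect.

```python
import random
from statistics import mean

rng = random.Random(42)

treated_outcomes, control_outcomes, individual_effects = [], [], []
for _ in range(20000):
    severity = rng.gauss(0, 1)                 # hypothetical baseline covariate
    y0 = 10 + 2 * severity + rng.gauss(0, 1)   # potential outcome if untreated
    y1 = y0 - 3 + 0.5 * severity               # potential outcome if treated (effect varies)
    individual_effects.append(y1 - y0)         # known only because this is a simulation
    if rng.random() < 0.5:                     # coin-flip assignment, unrelated to severity
        treated_outcomes.append(y1)
    else:
        control_outcomes.append(y0)

true_ace = mean(individual_effects)
randomized_estimate = mean(treated_outcomes) - mean(control_outcomes)

print(f"true ACE: {true_ace:.2f}, randomized estimate: {randomized_estimate:.2f}")
```

The estimate uses only one observed outcome per subject, yet it approximates an average that is defined in terms of both potential outcomes.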

What else can we do?

Some examples

Instrumental variables

[Diagram labels: reimbursement rates, oral contraceptives, ovarian cancer]
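
An instrumental variable influences the treatment but affects the outcome only through the treatment. The sketch below uses generic, hypothetical names (`z` for a binary instrument, `d` for treatment, `u` for an unobserved confounder, `y` for the outcome) and an invented data-generating model with a constant effect; it does not model the example named on the slide. It computes the simple Wald ratio: the instrument's effect on the outcome divided by its effect on the treatment.

```python
import random
from statistics import mean

rng = random.Random(3)
TRUE_EFFECT = 1.5  # assumed constant treatment effect (illustrative)

rows = []
for _ in range(50000):
    z = 1 if rng.random() < 0.5 else 0                     # instrument, varies as if at random
    u = rng.gauss(0, 1)                                    # unobserved confounder
    d = 1 if (1.5 * z + u + rng.gauss(0, 1)) > 0.75 else 0 # treatment depends on z and u
    y = TRUE_EFFECT * d + 2 * u + rng.gauss(0, 1)          # z affects y only through d
    rows.append((z, d, y))

def group_mean(column, z_value):
    """Mean of one column (1 = treatment, 2 = outcome) among rows with the given z."""
    return mean(r[column] for r in rows if r[0] == z_value)

# Wald estimator: effect of z on y divided by effect of z on d
wald_estimate = ((group_mean(2, 1) - group_mean(2, 0))
                 / (group_mean(1, 1) - group_mean(1, 0)))

# Naive treated-vs-untreated comparison, biased by the unobserved confounder
naive_estimate = (mean(r[2] for r in rows if r[1] == 1)
                  - mean(r[2] for r in rows if r[1] == 0))

print(f"Wald: {wald_estimate:.2f}, naive: {naive_estimate:.2f}")
```

The ratio is trustworthy only if the instrument really has no path to the outcome other than through the treatment, which cannot be verified from the data alone.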

Modeling

Modeling comes with a lot of assumptions

The best-known modeling techniques in causal inference use propensity scores
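
A propensity score is the probability of receiving treatment given observed covariates. The following sketch uses it for inverse probability weighting, which is one of several ways propensity scores are applied. It assumes a single binary confounder (a hypothetical `smoker` flag) so that the score can be estimated as the treated fraction within each covariate level; with more covariates a model such as logistic regression is commonly used instead. The data-generating model and effect size are invented.

```python
import random
from statistics import mean

rng = random.Random(11)
TRUE_EFFECT = -2.0  # assumed constant treatment effect (illustrative)

rows = []
for _ in range(40000):
    smoker = 1 if rng.random() < 0.4 else 0                       # hypothetical confounder
    treated = 1 if rng.random() < (0.7 if smoker else 0.3) else 0 # smokers treated more often
    outcome = 5 + 3 * smoker + TRUE_EFFECT * treated + rng.gauss(0, 1)
    rows.append((smoker, treated, outcome))

# Estimated propensity score: fraction treated within each level of the covariate
propensity = {
    level: mean(t for s, t, _ in rows if s == level)
    for level in (0, 1)
}

# Weight each subject by the inverse probability of the treatment actually received
n = len(rows)
weighted_treated_mean = sum(t * y / propensity[s] for s, t, y in rows) / n
weighted_control_mean = sum((1 - t) * y / (1 - propensity[s]) for s, t, y in rows) / n
ipw_estimate = weighted_treated_mean - weighted_control_mean

naive_estimate = (mean(y for _, t, y in rows if t == 1)
                  - mean(y for _, t, y in rows if t == 0))

print(f"IPW: {ipw_estimate:.2f}, naive: {naive_estimate:.2f}")
```

The weighting corrects only for the covariates that enter the propensity score, which is one of the assumptions that modeling brings with it.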