Lecture 2
Cause and Effect
DATA 8
Summer 2017
Slides created by Ani Adhikari, John DeNero, and Sam Lau
Announcements
Causality
Really?
npr.org (report on a study in heart.bmj.com)
Observation
The first question
Is there any relation between chocolate consumption and heart disease?
“any relation”
An answer
Some data:
“Among those in the top tier of chocolate consumption, 12 percent developed or died of cardiovascular disease during the study, compared to 17.4 percent of those who didn’t eat chocolate.”
- Howard LeWine of Harvard Health Blog, reported by npr.org
(in my opinion)
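A quick way to see the size of the gap in the quoted figures is to compare the two percentages directly. The sketch below uses only the two numbers from the quote; the variable names are illustrative and not part of the study.

```python
# Percentages taken from the quote above (not raw study data).
top_tier_rate = 12.0       # % of top-tier chocolate eaters with cardiovascular disease
no_chocolate_rate = 17.4   # % of people who did not eat chocolate

# Gap in percentage points between the two groups.
absolute_difference = no_chocolate_rate - top_tier_rate

# Rate in the chocolate group relative to the rate in the no-chocolate group.
relative_rate = top_tier_rate / no_chocolate_rate

print(f"Difference: {absolute_difference:.1f} percentage points")
print(f"Relative rate: {relative_rate:.2f}")
```

A gap like this shows an association between the two groups. On its own it says nothing about whether chocolate is the cause.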
The next question
Does chocolate consumption lead to a reduction in heart disease?
This question is often harder to answer.
“[The study] doesn’t prove a cause-and-effect relationship between chocolate and reduced risk of heart disease and stroke.”
- JoAnn Manson, chief of Preventive Medicine at Brigham and Women’s Hospital, Boston
Miasmas, miasmatism, miasmatists
John Snow, 1813-1858
Comparison
Snow’s “Grand Experiment”
“… there is no difference whatever in the houses or the people receiving the supply of the two Water Companies, or in any of the physical conditions with which they are surrounded …”
Snow’s table
Supply Area | Number of houses | Cholera deaths | Deaths per 10,000 houses |
S&V | 40,046 | 1,263 | 315 |
Lambeth | 26,107 | 98 | 37 |
Rest of London | 256,423 | 1,422 | 59 |
Key to establishing causality
If the treatment and control groups are similar apart from the treatment, then differences between the outcomes in the two groups can be ascribed to the treatment.
Trouble
If the treatment and control groups have systematic differences other than the treatment, then it might be difficult to identify causality.
Such differences are often present in observational studies.
When they lead researchers astray, they are called confounding factors.
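A small simulation can show how a confounding factor produces an association with no causal effect behind it. Everything in this sketch is hypothetical: the hidden "health-conscious" trait, the probabilities, and the names are assumptions made up for illustration, not data from any study. The treatment has no effect on the outcome here, yet the two groups still differ.

```python
import random

rng = random.Random(8)  # fixed seed so the run is repeatable

n = 20_000
treated_outcomes = []
control_outcomes = []

for _ in range(n):
    # Hypothetical hidden trait that influences BOTH treatment and outcome.
    health_conscious = rng.random() < 0.5

    # Health-conscious people are more likely to take the treatment ...
    takes_treatment = rng.random() < (0.8 if health_conscious else 0.2)

    # ... and less likely to get the disease, regardless of treatment.
    gets_disease = rng.random() < (0.1 if health_conscious else 0.3)

    if takes_treatment:
        treated_outcomes.append(gets_disease)
    else:
        control_outcomes.append(gets_disease)

treated_rate = sum(treated_outcomes) / len(treated_outcomes)
control_rate = sum(control_outcomes) / len(control_outcomes)

print(f"Disease rate, treatment group: {treated_rate:.3f}")
print(f"Disease rate, control group:   {control_rate:.3f}")
```

The treatment group shows less disease only because it contains more health-conscious people. That is a systematic difference between the groups other than the treatment.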
Randomize!
Careful ...
Regardless of what the dictionary says, in probability theory:
Random ≠ Haphazard
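Random assignment means a chance mechanism decides who gets the treatment, not the researcher's judgment or convenience. The sketch below shows one way to do it with Python's standard `random` module: shuffle the subjects and split the list in half. The function name `randomize` and the subject labels are hypothetical.

```python
import random

def randomize(subjects, seed=None):
    """Split subjects into treatment and control groups by chance.

    Every subject has the same chance of landing in either group,
    so the groups should be similar apart from the treatment.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject labels, for illustration only.
subjects = [f"subject_{i}" for i in range(10)]
treatment, control = randomize(subjects, seed=0)

print("Treatment:", treatment)
print("Control:  ", control)
```

Picking subjects "haphazardly" by hand does not give this guarantee, because the picker's choices can introduce systematic differences between the groups.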