Lecture 2
Cause and Effect
DATA 8
Fall 2019
Announcements
A Link
Guardian UK
A Stronger Link?
npr.org (report on a study in heart.bmj.com)
Observation
The first question
Is there any relation between chocolate consumption and heart disease?
An answer
Some data:
“Among those in the top tier of chocolate consumption, 12 percent developed or died of cardiovascular disease during the study, compared to 17.4 percent of those who didn’t eat chocolate.”
- Howard LeWine of Harvard Health Blog, reported by npr.org
(in my opinion)
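The two percentages in the quote can be compared directly. The sketch below is only arithmetic on the quoted figures; the variable names are illustrative, and the result describes an association, not a cause.

```python
# Proportions quoted above, written as fractions of each group.
risk_top_tier = 0.12        # top tier of chocolate consumption
risk_no_chocolate = 0.174   # those who didn't eat chocolate

# Absolute difference between the groups, in percentage points.
difference_points = (risk_no_chocolate - risk_top_tier) * 100

# Ratio of the two proportions; a value below 1 means a lower rate
# among the top-tier chocolate eaters.
risk_ratio = risk_top_tier / risk_no_chocolate

print(f"Difference: {difference_points:.1f} percentage points")
print(f"Ratio: {risk_ratio:.2f}")
```

The gap is about 5.4 percentage points, a ratio of roughly 0.69. Neither number says why the two groups differ.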
The next question
Does chocolate consumption lead to a reduction in heart disease?
This question is often harder to answer.
“[The study] doesn’t prove a cause-and-effect relationship between chocolate and reduced risk of heart disease and stroke.”
- JoAnn Manson, chief of Preventive Medicine at Brigham and Women’s Hospital, Boston
Association
London, early 1850s
Illustration from Punch (1852).
Miasmas, miasmatism, miasmatists
John Snow, 1813-1858
Causation
Comparison
Snow’s “Grand Experiment”
“… there is no difference whatever in the houses or the people receiving the supply of the two Water Companies, or in any of the physical conditions with which they are surrounded …”
Snow’s table
Supply Area | Number of houses | Cholera deaths | Deaths per 10,000 houses |
S&V | 40,046 | 1,263 | 315 |
Lambeth | 26,107 | 98 | 37 |
Rest of London | 256,423 | 1,422 | 59 |
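The last column is the death count scaled to a common base of 10,000 houses, which makes areas of different sizes comparable. Here is a minimal sketch that recomputes it from the counts in the table; the dictionary and function names are illustrative. Recomputing gives about 55 for the rest of London, while the table lists 59, so the sketch reports the plain arithmetic rather than reproducing that entry.

```python
# Counts copied from Snow's table: (number of houses, cholera deaths).
snow_table = {
    "S&V": (40046, 1263),
    "Lambeth": (26107, 98),
    "Rest of London": (256423, 1422),
}

def deaths_per_10000_houses(houses, deaths):
    """Return the cholera death rate per 10,000 houses."""
    return deaths / houses * 10_000

rates = {
    area: deaths_per_10000_houses(houses, deaths)
    for area, (houses, deaths) in snow_table.items()
}

for area, rate in rates.items():
    print(f"{area}: {rate:.1f} deaths per 10,000 houses")

# How many times higher the S&V rate is than the Lambeth rate.
sv_to_lambeth = rates["S&V"] / rates["Lambeth"]
print(f"S&V rate is about {sv_to_lambeth:.1f} times the Lambeth rate")
```

The comparison that matters is S&V against Lambeth: the rate in houses supplied by S&V is more than eight times the rate in houses supplied by Lambeth.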
Key to establishing causality
If the treatment and control groups are similar apart from the treatment, then differences between the outcomes in the two groups can be ascribed to the treatment.
Confounding
Trouble
If the treatment and control groups have systematic differences other than the treatment, then it might be difficult to identify causality.
Such differences are often present in observational studies.
When they lead researchers astray, they are called confounding factors.
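A small simulation can show how a confounding factor produces an association without any causal effect. Everything in this sketch is hypothetical: the hidden "healthy lifestyle" factor, the probabilities, and the function names are made up for illustration. The treatment is built to have no effect on the disease at all.

```python
import random

def simulate(n=50_000, seed=0):
    """Simulate an observational study where a hidden factor drives both
    who takes the treatment and who gets the disease.
    All probabilities are made-up values; the treatment has NO effect."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        healthy = rng.random() < 0.5           # hidden factor
        p_treated = 0.8 if healthy else 0.2    # depends on the hidden factor
        p_disease = 0.1 if healthy else 0.3    # depends ONLY on the hidden factor
        treated = rng.random() < p_treated
        disease = rng.random() < p_disease
        people.append((healthy, treated, disease))
    return people

def disease_rate(rows):
    """Fraction of rows in which the disease occurred."""
    return sum(disease for _, _, disease in rows) / len(rows)

people = simulate()
treated = [p for p in people if p[1]]
untreated = [p for p in people if not p[1]]

# Naive comparison: treated vs. untreated, ignoring the hidden factor.
print(f"Treated:   {disease_rate(treated):.3f}")
print(f"Untreated: {disease_rate(untreated):.3f}")

# Comparison within groups that are alike on the hidden factor.
gaps = {}
for healthy in (True, False):
    t = [p for p in treated if p[0] == healthy]
    u = [p for p in untreated if p[0] == healthy]
    gaps[healthy] = disease_rate(t) - disease_rate(u)
    print(f"healthy={healthy}: treated minus untreated = {gaps[healthy]:+.3f}")
```

In the naive comparison the treated group shows a clearly lower disease rate, even though the treatment does nothing. Once the groups are alike on the hidden factor, the gap nearly disappears.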
Example of Confounding
Randomize!
Careful ...
Regardless of what the dictionary says, in probability theory
Random ≠ Haphazard
Credit: xkcd
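One way to randomize is to let a chance mechanism, not a person, decide who goes into which group. Below is a minimal sketch using Python's standard `random` module; the `randomize` function, the participant labels, and the seed are illustrative assumptions, not part of the lecture.

```python
import random

def randomize(participants, seed=None):
    """Split participants into treatment and control groups of (nearly)
    equal size by shuffling, so that chance alone decides who goes where."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy, so the input list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant labels.
participants = [f"person_{i}" for i in range(10)]
treatment, control = randomize(participants, seed=8)

print("Treatment:", treatment)
print("Control:  ", control)
```

Because the split comes from a shuffle with known chances, every participant is equally likely to land in either group. Picking people "at random" by hand is haphazard, and the chances are unknown.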