Conditional Probability
Slides developed by Mine Çetinkaya-Rundel of OpenIntro
The slides may be copied, edited, and/or shared via the CC BY-SA license
Some images may be included under fair use guidelines (educational purposes)
Relapse
Researchers randomly assigned 72 chronic users of cocaine to one of three groups: desipramine (antidepressant), lithium (standard treatment for cocaine) and placebo. Results of the study are summarized below.

              relapse   no relapse   total
desipramine      10         14         24
lithium          18          6         24
placebo          20          4         24
total            48         24         72
Marginal probability
What is the probability that a patient relapsed?
P(relapsed) = 48 / 72 ≈ 0.67
Joint probability
What is the probability that a patient received the antidepressant (desipramine) and relapsed?
P(relapsed and desipramine) = 10 / 72 ≈ 0.14
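The two calculations above can be reproduced from the study counts. This is a minimal sketch, assuming 24 patients per treatment group (the denominator used in the conditional probabilities on these slides); the dictionary and variable names are illustrative only.

```python
from fractions import Fraction

# Relapse counts per treatment, as implied by the probabilities on these slides.
relapse = {"desipramine": 10, "lithium": 18, "placebo": 20}
# Assumption: 24 patients in each group (72 patients in total).
group_size = {"desipramine": 24, "lithium": 24, "placebo": 24}

total = sum(group_size.values())

# Marginal probability: ignore the treatment and count every relapse.
p_relapse = Fraction(sum(relapse.values()), total)

# Joint probability: relapsed AND received desipramine, out of all patients.
p_relapse_and_desipramine = Fraction(relapse["desipramine"], total)

print(f"P(relapsed) = {float(p_relapse):.2f}")
print(f"P(relapsed and desipramine) = {float(p_relapse_and_desipramine):.2f}")
```

Both probabilities share the denominator 72: marginal and joint probabilities are always taken over the whole sample.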
Conditional probability
The conditional probability of the outcome of interest A given condition B is calculated as
P(A | B) = P(A and B) / P(B)
Conditional probability (cont.)
If we know that a patient received the antidepressant (desipramine), what is the probability that they relapsed?
P(relapse | desipramine) = 10 / 24 ≈ 0.42
P(relapse | lithium) = 18 / 24 ≈ 0.75
P(relapse | placebo) = 20 / 24 ≈ 0.83
Conditional probability (cont.)
If we know that a patient relapsed, what is the probability that they received the antidepressant (desipramine)?
P(desipramine | relapse) = 10 / 48 ≈ 0.21
P(lithium | relapse) = 18 / 48 ≈ 0.38
P(placebo | relapse) = 20 / 48 ≈ 0.42
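The difference between the two directions of conditioning is only the denominator: the size of the treatment group, or the number of patients who relapsed. A minimal sketch, again assuming 24 patients per group; the function names are hypothetical.

```python
# Relapse counts per treatment, as implied by the probabilities on these slides.
relapse = {"desipramine": 10, "lithium": 18, "placebo": 20}
# Assumption: 24 patients in each group.
group_size = {"desipramine": 24, "lithium": 24, "placebo": 24}


def p_relapse_given(treatment):
    """P(relapse | treatment): restrict attention to one treatment group."""
    return relapse[treatment] / group_size[treatment]


def p_treatment_given_relapse(treatment):
    """P(treatment | relapse): restrict attention to patients who relapsed."""
    return relapse[treatment] / sum(relapse.values())


for t in relapse:
    print(t, p_relapse_given(t), p_treatment_given_relapse(t))
```

Swapping the condition changes the answer: P(relapse | desipramine) is 10 / 24, while P(desipramine | relapse) is 10 / 48.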
General multiplication rule
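In its standard form, the rule rearranges the conditional probability formula: P(A and B) = P(A | B) × P(B). As a check against the relapse study, assuming 24 of the 72 patients received desipramine (the denominator used in the conditional probabilities earlier), the joint probability computed this way agrees with the direct count of 10 / 72:

```latex
\begin{aligned}
P(\text{relapse and desipramine})
  &= P(\text{relapse} \mid \text{desipramine}) \times P(\text{desipramine}) \\
  &= \frac{10}{24} \times \frac{24}{72} = \frac{10}{72} \approx 0.14
\end{aligned}
```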
Independence and conditional probabilities
Consider the following (hypothetical) distribution of gender and major of students in an introductory statistics class:
60 / 100 = 0.6.
30 / 50 = 0.6.
Independence and conditional probabilities (cont.)
In general, if P(A | B) = P(A), then the events A and B are said to be independent.
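The definition can be turned into a simple check: compare a conditional probability with the corresponding marginal. The sketch below uses a hypothetical helper, `is_independent`, with a small tolerance for floating-point comparison. It is applied first to the two values from the classroom example (30 / 50 and 60 / 100) and then to the relapse study.

```python
import math


def is_independent(p_a_given_b, p_a, tol=1e-9):
    """Return True when P(A | B) equals P(A) up to a small tolerance."""
    return math.isclose(p_a_given_b, p_a, abs_tol=tol)


# Classroom example: conditional 30/50 versus marginal 60/100.
print(is_independent(30 / 50, 60 / 100))   # True

# Relapse study: P(relapse | desipramine) versus P(relapse).
print(is_independent(10 / 24, 48 / 72))    # False
```

With real sample data the two proportions rarely match exactly, so an exact comparison like this illustrates the definition and is not a statistical test of independence.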
Breast cancer screening
Note: These percentages are approximate, and very difficult to estimate.
Inverting probabilities
When a patient goes through breast cancer screening there are two competing claims: the patient has cancer and the patient doesn't have cancer. If a mammogram yields a positive result, what is the probability that the patient actually has cancer?
Note: Tree diagrams are useful for inverting probabilities: we are given P(+|C) and asked for P(C|+).
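The tree-diagram calculation can be sketched in a few lines. The three input rates below are assumptions: they are not stated in the text here, but were chosen to be consistent with the answer options in the practice questions that follow (for example, 0.017 × 0.78 ≈ 0.0133). Treat them as illustrative values only.

```python
# Assumed inputs (illustrative; not stated in the text of these slides).
p_cancer = 0.017               # P(C): proportion of women with breast cancer
p_pos_given_cancer = 0.78      # P(+ | C): mammogram detects an existing cancer
p_pos_given_no_cancer = 0.10   # P(+ | no C): false positive rate

# Each branch of the tree multiplies the probabilities along its path.
p_cancer_and_pos = p_cancer * p_pos_given_cancer
p_no_cancer_and_pos = (1 - p_cancer) * p_pos_given_no_cancer

# Invert: of all positive results, what share comes from the cancer branch?
p_cancer_given_pos = p_cancer_and_pos / (p_cancer_and_pos + p_no_cancer_and_pos)

print(round(p_cancer_given_pos, 2))
```

Under these assumed rates, most positive results come from the much larger group of women without cancer, which is why P(C|+) is far smaller than P(+|C).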
Practice
Suppose a woman who gets tested once and obtains a positive result wants to get tested again. In the second test, what should we assume to be the probability of this specific woman having cancer?
(a) 0.017
(b) 0.12
(c) 0.0133
(d) 0.88
Practice
What is the probability that this woman has cancer if this second mammogram also yielded a positive result?
(a) 0.0936
(b) 0.088
(c) 0.48
(d) 0.52
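One way to approach the second test is to reuse the same inversion with an updated starting probability. The sketch below assumes a detection rate of 0.78 and a false positive rate of 0.10 (illustrative values consistent with the answer options, since 0.12 × 0.78 = 0.0936 and 0.88 × 0.10 = 0.088). It also assumes the two test results are independent given the woman's true status. The function name `posterior` is hypothetical.

```python
def posterior(prior, p_pos_given_cancer, p_pos_given_no_cancer):
    """P(cancer | positive test) for a given prior probability of cancer."""
    true_pos = prior * p_pos_given_cancer
    false_pos = (1 - prior) * p_pos_given_no_cancer
    return true_pos / (true_pos + false_pos)


# Assumed test characteristics (illustrative only).
SENSITIVITY = 0.78
FALSE_POSITIVE_RATE = 0.10

# After one positive result the probability of cancer is about 0.12;
# that value becomes the prior for the second test.
prior_second_test = 0.12
p_after_second_positive = posterior(prior_second_test, SENSITIVITY, FALSE_POSITIVE_RATE)

print(round(p_after_second_positive, 2))
```

The output of one update becomes the input to the next, so each additional positive result moves the probability further from the original population rate.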
Bayes' Theorem
The conditional probability formula we have seen so far is a special case of Bayes' Theorem, which is applicable even when events have more than just two outcomes.
Bayes’ Theorem
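The theorem in its standard textbook form is given below. The symbols are assumptions introduced here for illustration: A_1, ..., A_k denote all possible outcomes of the first variable (for example, has cancer and does not have cancer), and B is the observed outcome of the second variable (for example, a positive test).

```latex
P(A_1 \mid B) =
  \frac{P(B \mid A_1)\,P(A_1)}
       {P(B \mid A_1)\,P(A_1) + P(B \mid A_2)\,P(A_2) + \cdots + P(B \mid A_k)\,P(A_k)}
```

The numerator is the single branch of the tree diagram that ends in B by way of A_1. The denominator adds up every branch that ends in B.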
Application activity: inverting probabilities
A common epidemiological model for the spread of diseases is the SIR model, where the population is partitioned into three groups: Susceptible, Infected, and Recovered. This is a reasonable model for diseases like chickenpox where a single infection usually provides immunity to subsequent infections. Sometimes these diseases can also be difficult to detect.
Imagine a population in the midst of an epidemic where 60% of the population is considered susceptible, 10% is infected, and 30% is recovered. The only test for the disease is accurate 95% of the time for susceptible individuals, 99% for infected individuals, but only 65% for recovered individuals. (Note: In this case, "accurate" means returning a negative result for susceptible and recovered individuals and a positive result for infected individuals.)
Draw a probability tree to reflect the information given above. If an individual has tested positive, what is the probability that they are actually infected?
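The probability tree for this activity has three primary branches, and the calculation it encodes can be sketched as follows. The percentages are the ones given in the activity; the variable names are illustrative.

```python
# Share of the population in each group.
prior = {"susceptible": 0.60, "infected": 0.10, "recovered": 0.30}

# Test accuracy per group. "Accurate" means a negative result for susceptible
# and recovered individuals, and a positive result for infected individuals.
accuracy = {"susceptible": 0.95, "infected": 0.99, "recovered": 0.65}

# P(+ | group): the accuracy itself for infected, one minus accuracy otherwise.
p_positive_given = {
    group: acc if group == "infected" else 1 - acc
    for group, acc in accuracy.items()
}

# Joint probabilities P(group and +), one per branch of the tree.
joint = {group: prior[group] * p_positive_given[group] for group in prior}

# Total probability of a positive test, then invert with Bayes' Theorem.
p_positive = sum(joint.values())
p_infected_given_positive = joint["infected"] / p_positive

print(round(p_positive, 3), round(p_infected_given_positive, 3))
```

The recovered group contributes slightly more positive results (0.30 × 0.35 = 0.105) than the infected group does (0.10 × 0.99 = 0.099), so fewer than half of the positive tests belong to people who are actually infected.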
Application activity: inverting probabilities (cont.)