Hypothesis testing - 2
Saket Choudhary
Introduction to Public Health Informatics
DH 302
Lecture 06 || Friday, 24th January 2025
From last lecture…
In a land of stats, so wild and vast,
Hypothesis testing had students aghast.
"Is it null? Is it not? Should we reject?"
Confusion spread, hard to correct.
Coauthored with ChatGPT
Then came the voice, "Here's the key,
State your null as plain as can be.
Assume it's true, don't let it stray,
And let your data have its say."

But oh, the p-values, they played their tricks,
"Below 0.05? It's a statistical fix!"
"Above that line? We must comply,
The null survives, we let it fly."
Why do we select the null as such?
Visualizing the p-value region
[Figure: distribution of T under H0. The two tails beyond Tα/2 and T1-α/2, each of area α/2, mark significant findings; the central region marks null findings. The p-value is the tail area beyond the observed statistic Tobs.]
P-value = Probability of sampling a test statistic at least as extreme as the observed test statistic if the null hypothesis is true
We “reject” the null hypothesis (H0) if the p-value is below the threshold (α)
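A minimal sketch of this decision rule in R. It assumes, for illustration only, that T follows a standard normal distribution under H0; the observed statistic `t_obs` is a hypothetical value, not one from the lecture.

```r
# Two-sided p-value for an observed test statistic,
# assuming T ~ N(0, 1) under H0 (an assumption for this sketch).
alpha <- 0.05
t_obs <- 2.3  # hypothetical observed test statistic

# Critical values T_{alpha/2} and T_{1 - alpha/2}
t_lower <- qnorm(alpha / 2)
t_upper <- qnorm(1 - alpha / 2)

# Probability of a statistic at least as extreme as t_obs, in either tail
p_value <- 2 * pnorm(abs(t_obs), lower.tail = FALSE)

reject <- p_value < alpha
cat("Critical values:", round(t_lower, 2), round(t_upper, 2), "\n")
cat("p-value:", round(p_value, 4), "| reject H0:", reject, "\n")
```

Rejecting when the p-value is below α is equivalent to `t_obs` falling outside the interval between the two critical values.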
Type I,II errors and Power
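[Figure: distributions of T under H0 and under HA, with critical values Tα/2 and T1-α/2. Under H0, the area in the rejection region is the false-positive rate. Under HA, the area in the rejection region is the power and the remaining area is the false-negative rate.]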
The false-positive rate is the probability of incorrectly rejecting H0.
The false-negative rate is the probability of incorrectly failing to reject H0 (when HA is true).
Power = 1 – false-negative rate = probability of correctly rejecting H0.
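These three quantities can be computed directly once both distributions are specified. The sketch below assumes T ~ N(0, 1) under H0 and T ~ N(delta, 1) under HA; the shift `delta` is a hypothetical value chosen for illustration.

```r
# False-positive rate, false-negative rate and power for a two-sided test,
# assuming T ~ N(0, 1) under H0 and T ~ N(delta, 1) under HA (assumptions).
alpha <- 0.05
delta <- 2.5  # hypothetical shift of T under HA

t_lower <- qnorm(alpha / 2)
t_upper <- qnorm(1 - alpha / 2)

# P(reject H0 | H0 true): area of the H0 curve in the rejection region
false_positive_rate <- pnorm(t_lower) + pnorm(t_upper, lower.tail = FALSE)

# P(reject H0 | HA true): area of the HA curve in the rejection region
power <- pnorm(t_lower, mean = delta) +
  pnorm(t_upper, mean = delta, lower.tail = FALSE)

# P(fail to reject H0 | HA true)
false_negative_rate <- 1 - power

cat("False-positive rate:", false_positive_rate, "\n")
cat("False-negative rate:", round(false_negative_rate, 3), "\n")
cat("Power:", round(power, 3), "\n")
```

By construction the false-positive rate equals α; power grows as `delta` moves further from zero.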
Types of error
What is a p-value?
Goodness of fit - Chi-squared test
Problem: What distribution should I fit?
Use a pseudocount of +1 in frequencies
$\chi^2 = \sum_i \dfrac{(O_i - E_i)^2}{E_i} = 5.744762$
Is 5.7 high/low/medium?
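One way to answer this is to compare the statistic with the chi-squared critical value. The sketch below assumes 5 degrees of freedom (6 bins); that number is an assumption, not stated here, chosen because it reproduces the reported p-value of about 0.33.

```r
# Is a chi-squared statistic of 5.744762 large?
chi_square_stat <- 5.744762
dof <- 5        # assumption: 6 bins - 1
alpha <- 0.05

# Value the statistic must exceed to reject H0 at level alpha
critical_value <- qchisq(1 - alpha, df = dof)

# Upper-tail probability of a statistic at least this large under H0
p_value <- pchisq(chi_square_stat, df = dof, lower.tail = FALSE)

cat("Critical value:", round(critical_value, 2), "\n")
cat("p-value:", round(p_value, 2), "\n")
cat("Exceeds critical value:", chi_square_stat > critical_value, "\n")
```

Under this assumption the statistic sits well below the critical value, so 5.7 is not unusually high.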
Example of Chi-square in R
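# observed and expected are numeric vectors of per-bin counts, defined earlier
chi_square_stat <- sum((observed - expected)^2 / expected)
dof <- length(observed) - 1  # number of bins - 1
# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(chi_square_stat, dof, lower.tail = FALSE)
alpha <- 0.05  # Significance level
if (p_value < alpha) {
  cat("Reject the null hypothesis\n")
} else {
  cat("Fail to reject the null hypothesis\n")
}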
P-value = 0.33 (>0.05)
Thus, we fail to reject the null hypothesis that the frequencies observed in Mar 2019 - Mar 2023 follow the same distribution as the Feb 2015 - Feb 2019 ones.
Another goodness of fit test - Likelihood ratio test (or G-test)
$G = 2 \sum_i O_i \ln(O_i / E_i)$
O_i = an observed count for bin i
E_i = an expected count for bin i, asserted by the null hypothesis
G approximately follows a chi-squared distribution with degrees of freedom = (number of bins - 1)
Example of G-test in R
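# observed and expected are numeric vectors of per-bin counts, defined earlier
# Bins with observed == 0 give 0 * log(0) = NaN and are dropped by na.rm = TRUE
G_stat <- 2 * sum(observed * log(observed / expected), na.rm = TRUE)
dof <- length(observed) - 1  # number of bins - 1
# Upper tail, as in the chi-squared example; the default lower tail would give 1 - p
p_value <- pchisq(G_stat, df = dof, lower.tail = FALSE)
alpha <- 0.05  # Significance level
if (p_value < alpha) {
  cat("Reject the null hypothesis\n")
} else {
  cat("Fail to reject the null hypothesis\n")
}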
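P-value = 0.59 (>0.05). If this value came from pchisq without lower.tail = FALSE, it is the lower tail and the upper-tail p-value is 1 - 0.59 = 0.41; either way it exceeds 0.05.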
Thus, we fail to reject the null hypothesis that the frequencies observed in Mar 2019 - Mar 2023 follow the same distribution as the Feb 2015 - Feb 2019 ones.
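To see the two goodness-of-fit statistics side by side, here is a small self-contained sketch. The counts are hypothetical (they are not the lecture's data) and the null hypothesis is taken to be equal expected counts in every bin.

```r
# Chi-squared and G statistics on the same hypothetical counts
observed <- c(18, 22, 25, 15, 20, 20)          # hypothetical observed counts
expected <- rep(sum(observed) / 6, times = 6)  # equal counts under H0

chi_square_stat <- sum((observed - expected)^2 / expected)
G_stat <- 2 * sum(observed * log(observed / expected))

dof <- length(observed) - 1
p_chisq <- pchisq(chi_square_stat, df = dof, lower.tail = FALSE)
p_G <- pchisq(G_stat, df = dof, lower.tail = FALSE)

cat("Chi-squared:", round(chi_square_stat, 3), "p =", round(p_chisq, 3), "\n")
cat("G:", round(G_stat, 3), "p =", round(p_G, 3), "\n")
```

When observed counts are close to expected counts, the two statistics and their p-values are nearly identical.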
Was the rare event statistically different in 4 years?
What is the probability of observing something as extreme?
Null hypothesis?
What is the probability of observing entries as small as the one in April 2020?
Assume a Poisson model
$\hat{\lambda} = \dfrac{\text{sum of observations}}{\text{number of observations}}$
P(X ≤ 3524) = ppois(q = 3524, lambda) < 1e-16 → The rare event is statistically different
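A minimal sketch of this calculation. Only the value 3524 comes from the slide; the vector `counts` is hypothetical data standing in for the other monthly observations.

```r
# Poisson tail probability for an unusually small count
counts <- c(5200, 5100, 5350, 4980, 5250, 5120)  # hypothetical monthly counts
x_obs <- 3524                                    # count examined on the slide

# Estimate of the Poisson rate: the sample mean
lambda_hat <- sum(counts) / length(counts)

# P(X <= x_obs) under Poisson(lambda_hat); ppois takes q, not x
p_lower <- ppois(q = x_obs, lambda = lambda_hat)

cat("lambda_hat:", round(lambda_hat, 1), "\n")
cat("P(X <=", x_obs, ") < 1e-16:", p_lower < 1e-16, "\n")
```

The lower tail is used because the question is about counts as small as the observed one.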
Is this event a “rare” event?
A simpler case: Are trauma-related deaths in 2020 distributed similarly to 2019?
Sum: 88463 (2019), 72503 (2020)
# Per-bin chi-squared terms, using the raw 2019 counts as the expected counts
df_wide$diff <- df_wide$`2020` - df_wide$`2019`
df_wide$chisq <- df_wide$diff^2 / df_wide$`2019`
chi_square_stat <- sum(df_wide$chisq)
dof <- 11  # 12 bins - 1
p_value <- pchisq(chi_square_stat, dof, lower.tail = FALSE)
alpha <- 0.05  # Significance level
if (p_value < alpha) {
  cat("Reject the null hypothesis\n")
} else {
  cat("Fail to reject the null hypothesis\n")
}
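Ideally, we should check if $\sum_i O_i = \sum_i E_i$, i.e. that the observed and expected totals match. Here the two yearly sums differ, so the expected counts need rescaling.
(** This was automatically true for the 2015-2019 vs 2019-2023 example as we binned the observations.)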
# Method 1
chisq <- chisq.test(x = df_wide$`2020`, p = df_wide$`2019`, rescale.p = TRUE)
> chisq$statistic
X-squared
 2738.136
> chisq$p.value
[1] 0
Column totals: the O_i (2020 counts) sum to 72503; the probabilities from 2019 sum to 1.
# Method 2
# Expected 2020 counts: 2019 proportions scaled to the 2020 total
df_wide$p_i <- df_wide$`2019` / sum(df_wide$`2019`)
df_wide$E_i <- df_wide$p_i * sum(df_wide$`2020`)
chi_square_stat <- sum((df_wide$`2020` - df_wide$E_i)^2 / df_wide$E_i)
dof <- 11
p_value <- pchisq(chi_square_stat, dof, lower.tail = FALSE)
> chi_square_stat
[1] 2738.136
> p_value
[1] 0
How is G-test (Likelihood ratio test) related to Chi-squared?
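One standard way to see the connection is a second-order Taylor expansion. The sketch below assumes the observed and expected totals match ($\sum_i O_i = \sum_i E_i$) and that each deviation is small relative to $E_i$. The symbol $\delta_i = O_i - E_i$ is introduced here for illustration, and terms of order $\delta_i^3$ are dropped.

```latex
\begin{align*}
G &= 2 \sum_i O_i \ln\frac{O_i}{E_i}
   = 2 \sum_i (E_i + \delta_i) \ln\left(1 + \frac{\delta_i}{E_i}\right) \\
  &\approx 2 \sum_i (E_i + \delta_i)
     \left(\frac{\delta_i}{E_i} - \frac{\delta_i^2}{2 E_i^2}\right) \\
  &\approx 2 \sum_i \left(\delta_i + \frac{\delta_i^2}{2 E_i}\right) \\
  &= 2 \sum_i \delta_i + \sum_i \frac{\delta_i^2}{E_i}
   = \sum_i \frac{(O_i - E_i)^2}{E_i} = \chi^2
\end{align*}
```

The last step uses $\sum_i \delta_i = 0$, which holds when the totals match. So the chi-squared statistic can be read as a quadratic approximation to G.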
Central Limit Theorem
Binomial to Normal?
Binomial to Normal
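A quick numerical check of the normal approximation to the binomial. The values of `n`, `p` and `k` are hypothetical, and the approximation uses mean np, variance np(1 - p) and a continuity correction of 0.5.

```r
# Compare an exact binomial probability with its normal approximation
n <- 100   # hypothetical number of trials
p <- 0.3   # hypothetical success probability
k <- 35    # evaluate P(X <= k)

# Exact binomial cumulative probability
exact_p <- pbinom(k, size = n, prob = p)

# Normal approximation with matching mean and standard deviation
approx_p <- pnorm(k + 0.5, mean = n * p, sd = sqrt(n * p * (1 - p)))

cat("Exact:", round(exact_p, 4), "\n")
cat("Normal approximation:", round(approx_p, 4), "\n")
```

The agreement improves as n grows and degrades when p is close to 0 or 1.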
Expectations and Variances
Exercise - Calculate the mean of the binomial random variable
Some digression
Obesity and BMI - The old paradigm
Obesity: requirement of the new definition
Obesity: the new definition
Testing for difference in mean (median) of two samples
Next: Testing for difference of means
Question: Is there a statistically significant difference in mean BMI between men and women?
What is the null hypothesis?
Null Hypothesis: The mean BMI is the same for men and women
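As a preview, this null hypothesis could be tested with a two-sample t-test. The sketch below uses simulated BMI values (hypothetical, drawn with equal means so that H0 holds by construction) and R's built-in `t.test`, which performs Welch's test by default.

```r
# Two-sample test of equal mean BMI on simulated (hypothetical) data
set.seed(302)
bmi_men <- rnorm(50, mean = 24, sd = 3)    # simulated BMI values, men
bmi_women <- rnorm(50, mean = 24, sd = 3)  # simulated BMI values, women

# Welch two-sample t-test: H0 is that the two population means are equal
res <- t.test(bmi_men, bmi_women)

alpha <- 0.05
reject <- res$p.value < alpha
cat("t statistic:", round(unname(res$statistic), 3), "\n")
cat("p-value:", round(res$p.value, 3), "| reject H0:", reject, "\n")
```

For a comparison of medians rather than means, a rank-based test such as `wilcox.test(bmi_men, bmi_women)` is a common alternative.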
Questions?