Introduction to ANOVA
Week 9 Lecture 1
aldrin depth
3.8 bottom
4.8 bottom
4.9 bottom
5.3 bottom
5.4 bottom
5.7 bottom
.
.
.
5.1 surface
5.2 surface
Exploratory analysis
Aldrin concentration (nanograms per liter) at three levels of depth
Research question
Is there a difference between the mean aldrin concentrations among the three levels?
ANOVA
ANOVA is used to assess whether the mean of the outcome variable differs across the levels of a categorical variable
H0 : The mean outcome is the same across all categories,
𝜇1 = 𝜇2 = … = 𝜇k
where 𝜇i represents the mean of the outcome for observations in category i
HA : At least one mean is different from the others
Hypotheses
HA : 𝜇B ≠ 𝜇M ≠ 𝜇S
HA : 𝜇B = 𝜇M = 𝜇S
HA : At least one mean is different
HA : At least one mean is different
HA : 𝜇B > 𝜇M > 𝜇S
𝘵 test vs. ANOVA - Purpose
𝘵 test
Compare means from two groups to see whether they are so far apart that the observed difference cannot reasonably be attributed to sampling variability
H0 : 𝜇1 = 𝜇2
ANOVA
Compare the means from two or more groups to see whether they are so far apart that the observed differences cannot all reasonably be attributed to sampling variability
H0 : 𝜇1 = 𝜇2 = … = 𝜇k
z/𝘵 test vs. ANOVA - Method
z/𝘵 test
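Compute a test statistic (a ratio): the observed difference between the sample means, minus the difference stated in H0, divided by the standard error of that difference
ANOVA
Compute a test statistic (a ratio): F = variability between groups / variability within groups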
𝘵 test vs. ANOVA
Test statistic
Does there appear to be a lot of variability within groups? How about between groups?
𝑭 distribution and p-value
Conditions/Assumptions
How do we check for normality? (Remember previous lectures)
How can we check this condition?
Degrees of freedom associated with ANOVA
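A minimal sketch of the usual one-way ANOVA degrees of freedom, for k groups and n observations in total: dfG = k - 1, dfE = n - k, dfT = n - 1. The function name `anova_df` is hypothetical, and n = 30 is an inference from the values dfG = 2 and dfE = 27 quoted later in this lecture, not a figure stated on the slides.

```python
def anova_df(n_total, k_groups):
    """Return (df_group, df_error, df_total) for a one-way ANOVA."""
    df_group = k_groups - 1          # between groups
    df_error = n_total - k_groups    # within groups
    df_total = n_total - 1
    return df_group, df_error, df_total

# Three depth levels; n_total = 30 is inferred from dfG = 2 and dfE = 27.
df_group, df_error, df_total = anova_df(n_total=30, k_groups=3)
print(df_group, df_error, df_total)  # 2 27 29
```

Note that dfG + dfE = dfT, mirroring how the sums of squares add up.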
Sum of squares between groups, SSG
Measures the variability between groups
SSG = Σ nᵢ (x̄ᵢ - x̄)², summed over the k groups
where nᵢ is each group size, x̄ᵢ is the average for each group, and x̄ is the overall (grand) mean
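The SSG calculation can be sketched in a few lines. This is an illustration only: the function name `sum_of_squares_groups` and the toy data are hypothetical and are not the aldrin measurements, since the full dataset is not listed here.

```python
from statistics import mean

def sum_of_squares_groups(groups):
    """SSG: sum over groups of n_i * (group mean - grand mean)^2."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    return sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())

# Hypothetical toy data, NOT the aldrin measurements.
toy_groups = {
    "a": [1.0, 2.0, 3.0],
    "b": [2.0, 3.0, 4.0],
    "c": [6.0, 7.0, 8.0],
}
ssg = sum_of_squares_groups(toy_groups)
print(ssg)  # 42.0
```

Here the group means are 2, 3 and 7 and the grand mean is 4, so SSG = 3(4) + 3(1) + 3(9) = 42.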
Sum of squares total, SST
Measures the total variability in the data, ignoring group membership
SST = Σ (xᵢ - x̄)², summed over all n observations
where xᵢ represents each observation in the dataset and x̄ is the overall (grand) mean
SST = (3.8 - 5.1)² + (4.8 - 5.1)² + (4.9 - 5.1)² + … + (5.2 - 5.1)²
= (-1.3)² + (-0.3)² + (-0.2)² + … + (0.1)²
= 1.69 + 0.09 + 0.04 + … + 0.01
= 54.29
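As a quick check of the arithmetic above, the squared deviations for the observations that are written out can be recomputed. The helper name `squared_deviations` is hypothetical, and because most observations are elided, this sketch does not reproduce the full total of 54.29.

```python
def squared_deviations(values, center):
    """Squared deviation of each value from `center`."""
    return [(x - center) ** 2 for x in values]

# Only the four observations written out in the expansion; the rest are elided.
shown = [3.8, 4.8, 4.9, 5.2]
grand_mean = 5.1  # grand mean as used in the expansion
terms = [round(t, 2) for t in squared_deviations(shown, grand_mean)]
print(terms)  # [1.69, 0.09, 0.04, 0.01]
```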
Sum of squares error, SSE
Measures the variability within groups:
SSE = SST - SSG
SSE = 54.29 - 16.96 = 37.33
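SSE can also be computed directly, as the squared deviations of each observation from its own group mean, and the result agrees with SST - SSG. The sketch below checks that identity on hypothetical toy data (not the aldrin measurements); the function name `anova_sums_of_squares` is an assumption for illustration.

```python
from statistics import mean

def anova_sums_of_squares(groups):
    """Return (SST, SSG, SSE), with SSE computed directly within groups."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    ssg = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    sse = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    return sst, ssg, sse

# Hypothetical toy data, NOT the aldrin measurements.
toy_groups = {"a": [1.0, 2.0, 3.0], "b": [2.0, 3.0, 4.0], "c": [6.0, 7.0, 8.0]}
sst, ssg, sse = anova_sums_of_squares(toy_groups)
print(sst, ssg, sse)                # 48.0 42.0 6.0
print(abs(sst - ssg - sse) < 1e-9)  # True

# The lecture's own figures follow the same identity.
print(round(54.29 - 16.96, 2))      # 37.33
```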
Mean squares, MSG and MSE
Each mean square is calculated as a sum of squares divided by its degrees of freedom
MSG = SSG / dfG = 16.96/2 = 8.48
MSE = SSE / dfE = 37.33/27 = 1.38
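The two divisions can be reproduced with a one-line helper. The name `mean_square` is hypothetical; the inputs are the sums of squares and degrees of freedom given in this lecture.

```python
def mean_square(sum_of_squares, df):
    """A mean square is a sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df

msg = mean_square(16.96, 2)    # between groups
mse = mean_square(37.33, 27)   # within groups (error)
print(round(msg, 2), round(mse, 2))  # 8.48 1.38
```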
Test statistic, F value
As we discussed before, the F statistic is the ratio of the between group variability to the within group variability: F = MSG / MSE
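A short sketch of the F ratio using the sums of squares and degrees of freedom from this lecture. The function name `f_statistic` is an assumption for illustration.

```python
def f_statistic(ssg, sse, df_group, df_error):
    """F = MSG / MSE, the ratio of between-group to within-group variability."""
    msg = ssg / df_group
    mse = sse / df_error
    return msg / mse

f_value = f_statistic(ssg=16.96, sse=37.33, df_group=2, df_error=27)
print(round(f_value, 2))  # 6.13
```

Dividing the already rounded mean squares, 8.48 / 1.38, gives about 6.14 instead; the small gap is rounding error, which is why the unrounded mean squares are used here.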
p-value
The p-value is the probability of observing a ratio of “between group” to “within group” variability at least as large as the one observed, if in fact the means of all groups are equal. It is calculated as the area under the F curve, with degrees of freedom dfG and dfE, above the observed F statistic.
dfG = 2; dfE = 27
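When dfG = 2, as it is here, the upper-tail area of the F distribution has a simple closed form, P(F > f) = (1 + 2f/dfE)^(-dfE/2), so the p-value can be sketched without a statistics library. This shortcut is a special case, not a general method: for other numerator degrees of freedom a library routine such as `scipy.stats.f.sf` would be needed. The function name `f_upper_tail_df1_2` is hypothetical.

```python
def f_upper_tail_df1_2(f_value, df_error):
    """P(F > f_value) for an F distribution with 2 and df_error degrees of freedom.

    Uses the closed form (1 + 2 f / df_error) ** (-df_error / 2), which is
    valid only when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f_value / df_error) ** (-df_error / 2)

f_value = (16.96 / 2) / (37.33 / 27)   # MSG / MSE from the lecture's figures
p_value = f_upper_tail_df1_2(f_value, df_error=27)
print(round(p_value, 3))  # 0.006
```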
R, Please Save Me!
Conclusion
Conclusion - in context
What is the conclusion of the hypothesis test?
The data provide convincing evidence that the average aldrin concentration
Which means differ?
Can you see any pitfalls with this approach?
Which means differ?
Based on the box plots below, which means would you expect to be significantly different?
mid-depth & surface
bottom & surface; mid-depth & surface