CMSC 320
2026
Fardina F Alam
Hypothesis Testing: Different Types of Statistical Tests
(PART03)
Topics we will cover
Parametric Tests for Means: These tests are used to compare means. They assume that your data follows a normal distribution and that you have a sufficient sample size.
Some Nonparametric Tests
Post-Hoc Analysis
Different Types of Statistical Tests
Statistical Tests
A statistical hypothesis test is a method of statistical inference used to decide whether the data at hand sufficiently support a particular hypothesis.
NOTE: In a hypothesis test, the test statistic condenses all of the sample data into a single value. For example, the Z-test computes a Z statistic, the t-test computes a t statistic, and the F-test computes an F value. The test statistic is compared to an appropriate critical value (cv), or its p-value is compared to the significance level. A decision can then be made to reject or not reject the null hypothesis.
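As a rough illustration of the note above, the sketch below computes a one-sample t statistic by hand and makes the decision both ways (critical value and p-value). The sample values, `mu0`, and `alpha` are made-up assumptions for illustration only.

```python
import numpy as np
from scipy import stats

# Hypothetical sample (illustrative values only)
sample = np.array([5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.7])
mu0 = 5.0     # hypothesized population mean (assumed)
alpha = 0.05  # significance level (assumed)

n = sample.size
# Test statistic: the whole sample collapses to one number
t_stat = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))

# Option 1: compare |t| with the two-sided critical value
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
reject_by_cv = bool(abs(t_stat) > t_crit)

# Option 2: compare the two-sided p-value with alpha
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)
reject_by_p = bool(p_value < alpha)

print(t_stat, t_crit, p_value, reject_by_cv, reject_by_p)
```

Both options always lead to the same decision; they are two views of the same comparison.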
Does knowing more help us?
Yes! If we know the standard deviation of the underlying population (or have enough data to estimate it reliably), we can use a z-test instead, which gives more accurate results.
As a common rule of thumb, z-tests are not used for sample sizes under 30 elements.
Z Test - When to use
The Z-test compares a sample mean to a population mean. It is used when:
Assumption: It assumes that the sample data is normally distributed or that the sample size is large enough for the Central Limit Theorem to apply
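A minimal sketch of a one-sample z-test, assuming the population standard deviation is known. The helper `one_sample_z_test` is a hypothetical name written for this example (it is not a SciPy function), and the heights are simulated.

```python
import numpy as np
from scipy import stats

def one_sample_z_test(sample, pop_mean, pop_std):
    """Two-sided one-sample z-test with a known population std."""
    sample = np.asarray(sample, dtype=float)
    # Standard error uses the known population std, not the sample std
    z = (sample.mean() - pop_mean) / (pop_std / np.sqrt(sample.size))
    p = 2 * stats.norm.sf(abs(z))
    return z, p

rng = np.random.default_rng(0)
# Hypothetical heights in cm (simulated, n = 50)
heights = rng.normal(loc=171.0, scale=7.0, size=50)

z, p = one_sample_z_test(heights, pop_mean=170.0, pop_std=7.0)
print(f"z = {z:.3f}, p = {p:.3f}")
```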
T-Test
T-test is a statistical test used to determine if there is a significant difference between the means of two groups. It is used when:
In general, the type of test statistic used depends on the number of samples being compared.
Additionally, samples can be paired or not paired.
One sample T-Test
Determine if the mean of a single sample is significantly different from a known or hypothesized population mean.
How to run a one-sample t test:
import numpy as np
from scipy import stats
# your_data: placeholder for a 1-D array of sample observations
stats.ttest_1samp(your_data, popmean=0.5)
>>> TtestResult(statistic=2.456308468440, pvalue=0.017628209047638, df=49)
Two Sample t-test
The two-sample t test (also referred to as the unpaired t test) is used to compare the means of two different samples.
Example: We have noticed most humans fall into one of two distinct categories–male or female. We would like to know if our sample of males is taller than our sample of females.
Can we just take the average of the two samples?
Two-Sample T-test
Q: What sort of p value would we see if men and women had different heights?
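One way to run the comparison above is `scipy.stats.ttest_ind`. The sketch below uses simulated heights; the group means, spreads, and sample sizes are assumptions, not real measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(320)
# Hypothetical heights in cm for two independent samples
group_a = rng.normal(loc=176.0, scale=7.0, size=40)
group_b = rng.normal(loc=163.0, scale=6.5, size=40)

# equal_var=False selects Welch's t-test, which does not
# assume the two groups share the same variance
result = stats.ttest_ind(group_a, group_b, equal_var=False)
print(result.statistic, result.pvalue)
```

If the two groups really have different mean heights, we would expect a small p-value here. Comparing the two averages alone would not tell us whether the gap is larger than what sampling noise can produce.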
When can we use the T-test?
Our Assumptions: For the t-test we assume:
Q: What if my data isn’t nearly normally distributed?
Paired Sample t test
The paired sample t test is used to compare the means of two related groups of samples.
Paired Sample t test: Example
The aliens monitor a bunch of humans, test them for intelligence, and then run one half of them through a machine to make them smarter. Afterwards, they want to know if their machine worked.
This would be called a paired t-test.
Null Hypothesis: ?
Alternative Hypothesis: ?
Paired Sample t test: Example
The aliens monitor a bunch of humans, test them for intelligence, and then run one half of them through a machine to make them smarter. Afterwards, they want to know if their machine worked.
Null Hypothesis: The machine did nothing
Alternative Hypothesis: The scores after the machine come from a different distribution (the machine had an effect)
Ques: The aliens get a p-value of .05. What can they conclude?
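A minimal sketch of a paired t-test with `scipy.stats.ttest_rel`, assuming each human is tested both before and after the machine (that is what makes the samples paired). All scores are simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Hypothetical intelligence scores for the same 25 humans
before = rng.normal(loc=100.0, scale=15.0, size=25)
after = before + rng.normal(loc=3.0, scale=5.0, size=25)

# Paired test: works on the per-human differences
paired = stats.ttest_rel(after, before)

# Equivalent view: one-sample t-test of the differences against 0
on_diffs = stats.ttest_1samp(after - before, popmean=0.0)

print(paired.statistic, paired.pvalue)
```

The paired test is the same as asking whether the mean of the per-human differences is zero.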
Recap: Tests so Far
Multiple Groups
The Aliens decide to kidnap humans to study, but we don’t know what humans eat! We have five different food mixes we want to try. We split the humans up into five groups and feed each group a different mix, and then measure how much the humans grow over the next few years.
Ques: How do Aliens know if the mixes have different effects?
Anova (Analysis of Variance) Test
ANOVA is a powerful statistical test for comparing the means of three or more groups to determine if there are significant differences among them.
We use an ANOVA test.
Notes: In t-tests and z-tests, we typically compare means of two groups using individual datasets or assess the mean of a single group against a known value. ANOVA evaluates differences in means across three or more groups as a whole, considering both within-group and between-group variability.
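For the five food mixes, a one-way ANOVA can be sketched with `scipy.stats.f_oneway`. The growth values, group means, and group sizes below are simulated assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Hypothetical growth (cm) for five food mixes, 12 humans per mix
group_means = [10.0, 10.5, 12.0, 9.5, 11.0]
groups = [rng.normal(loc=m, scale=2.0, size=12) for m in group_means]

# F statistic: between-group variability relative to within-group variability
f_stat, p_value = stats.f_oneway(*groups)
print(f_stat, p_value)
```

A small p-value suggests that at least one mix has a different mean growth; it does not say which one.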
Parametric Tests (Comparing Means)
Nonparametric Tests (Comparing Medians)
Nonparametric Hypothesis Tests: Used when data do not meet the assumptions of parametric tests (e.g., normality, equal variances).
Example:
Rank-based tests → work on ranked data
Frequency-based test → work on counts in categories
Nonparametric Tests: Alternatives to Parametric Tests
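Kruskal–Wallis Test: Nonparametric alternative to one-way ANOVA (three or more independent groups)
Mann–Whitney U Test: Nonparametric alternative to the two-sample (unpaired) t test
Wilcoxon Signed-Rank Test: Nonparametric alternative to the paired sample t test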
Spearman’s Rank Correlation: Measures correlation between two variables based on ranks. Useful when data are ordinal or non-normal
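Each of the four tests listed above has a counterpart in `scipy.stats`. The sketch below calls them on simulated, skewed data; the arrays `a`, `b`, and `c` are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
# Hypothetical skewed (non-normal) data
a = rng.exponential(scale=2.0, size=30)
b = rng.exponential(scale=3.0, size=30)
c = rng.exponential(scale=4.0, size=30)

# Mann-Whitney U: two independent groups
u_res = stats.mannwhitneyu(a, b, alternative="two-sided")

# Wilcoxon signed-rank: treats a[i] and b[i] as a matched pair
w_res = stats.wilcoxon(a, b)

# Kruskal-Wallis: three or more independent groups
h_res = stats.kruskal(a, b, c)

# Spearman rank correlation between two variables
rho, rho_p = stats.spearmanr(a, b)

print(u_res.pvalue, w_res.pvalue, h_res.pvalue, rho, rho_p)
```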
Note: Nonparametric tests are mostly based on ranked data instead of raw values, making them more robust when assumptions of parametric tests are not met.
Instead of using the actual numerical values (raw scores), nonparametric tests convert the data into ranks (positions).
Example: Raw data (exam scores) → 45, 80, 60, 90, 75
Ranked data (1 = smallest, in the same order as the raw scores) → 1, 4, 2, 5, 3
So, the test looks at whether groups differ in their rank distributions, not the exact values.
Why? Because ranks are less sensitive to outliers and do not require the assumption that data are normally distributed.
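The ranking step can be checked with `scipy.stats.rankdata`, which returns the rank of each score in its original position. The second call swaps in an extreme value (an assumed outlier) to show that the ranks do not change.

```python
from scipy.stats import rankdata

scores = [45, 80, 60, 90, 75]
ranks = rankdata(scores)          # rank 1 = smallest score
print(ranks)                      # [1. 4. 2. 5. 3.]

# Replace 90 with an extreme outlier: the ranks stay the same
scores_outlier = [45, 80, 60, 900, 75]
ranks_outlier = rankdata(scores_outlier)
print(ranks_outlier)              # [1. 4. 2. 5. 3.]
```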
The Chi-squared test (Frequency-based test for categorical data)
Analyze categorical data to check for an association or relationship between two or more categorical variables.
Type: Nonparametric test (compares frequencies, not raw values)
When to use: To determine if observed frequencies differ significantly from expected frequencies in a contingency table.
Example: Is there a relationship between gender and preference for a soda brand (Yes/No)?
What about this?
We are monitoring birds from two different places on the planet, and get the following results:
Bird Type | Location A | Location B |
Grackle | 7 | 13 |
Pigeon | 2 | 7 |
Sea pigeon | 15 | 1 |
One of those big fish-beak things | 13 | 0 |
Big long bird | 22 | 0 |
Bat | 3 | 4 |
Each bird type and location falls into distinct categories, making them categorical variables suitable for analysis using methods like the chi-square test.
We want to find out if two different places on Earth have the same types of birds
Do these locations have the same underlying bird population?
Enter the Chi Square Test! A test for checking if two sets of categorical variables come from the same distribution.
Null hypothesis: ?
Alternative hypothesis: ?
Null hypothesis: The bird populations observed in Location A and Location B are the same.
Alternative hypothesis: The bird populations observed in Location A and Location B are different.
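A sketch of the chi-square test on the bird counts from the table, using `scipy.stats.chi2_contingency`. The counts come from the table; the 0.05 threshold is an assumed significance level.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: bird types (in table order); columns: Location A, Location B
observed = np.array([
    [7, 13],   # Grackle
    [2, 7],    # Pigeon
    [15, 1],   # Sea pigeon
    [13, 0],   # One of those big fish-beak things
    [22, 0],   # Big long bird
    [3, 4],    # Bat
])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.2e}")

alpha = 0.05  # assumed significance level
print("Reject the null hypothesis" if p < alpha else "Fail to reject the null hypothesis")
```

Several expected counts in this table fall below 5, where the chi-square approximation is usually considered less reliable, so the exact p-value should be read with some caution.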
Considerations: (How to decide an appropriate statistical test?)
Post Hoc Tests for ANOVA
When to use: ANOVA tells us that at least one group differs, but not which groups are different.
Post-hoc analysis identifies the specific group differences after ANOVA is significant when more than two groups are compared.
Purpose
Post Hoc Test/ Analysis
Common Post-Hoc Tests
Key Idea: Post-hoc tests show which group means differ significantly from each other.
Example
Scenario: We conducted a study to compare the test scores of students from three different schools: School A, School B, and School C.
More than 2 groups (A,B,C) → Apply ANOVA
ANOVA result: p-value < 0.05 (indicating significant overall differences among schools).
Example
Next Step: Applied Tukey's HSD post hoc test.
Interpretation from Tukey's HSD: School A and School B, as well as School A and School C, have significantly different test scores (p < 0.05). But there is no significant difference in test scores between School B and School C.
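The ANOVA-then-Tukey workflow for the three schools might look like the sketch below. The scores are simulated (the means, spread, and sample sizes are assumptions, not the study's data), and `scipy.stats.tukey_hsd` requires SciPy 1.8 or newer.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical test scores, 30 students per school
school_a = rng.normal(loc=70.0, scale=5.0, size=30)
school_b = rng.normal(loc=80.0, scale=5.0, size=30)
school_c = rng.normal(loc=81.0, scale=5.0, size=30)

# Step 1: overall ANOVA across the three schools
f_stat, p_anova = stats.f_oneway(school_a, school_b, school_c)
print(f"ANOVA p-value: {p_anova:.2e}")

# Step 2: post-hoc test, normally run only when p_anova < 0.05
tukey = stats.tukey_hsd(school_a, school_b, school_c)

# tukey.pvalue[i, j] is the p-value for group i versus group j
labels = ["A", "B", "C"]
for i in range(3):
    for j in range(i + 1, 3):
        print(f"School {labels[i]} vs School {labels[j]}: p = {tukey.pvalue[i, j]:.4f}")
```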
Summary:
Statistical tests let us reason about one or more samples and how they relate to each other and the population.
Summary: Main Steps of Hypothesis Testing