Statistical Inference!
Now with 33% more pages!
created by Ikusahime22 (r/APStudents) April 2018/2019 | Discord: Violet Laplace#0290
Common Confidence & Significance Levels
Confidence Level | Significance Level | Z* Value | When to Use |
90% | 0.10 | 1.645 | Not very serious topic / Girl Scout cookies |
95% | 0.05 | 1.96 | When in doubt, use this / Court of law |
99% | 0.01 | 2.576 | Very serious topic |
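The Z* values in the table above do not have to be memorized; they can be recomputed from the confidence level. Here is a minimal sketch using Python's standard library. The helper name `z_star` is made up for this example and is not from any textbook or calculator.

```python
from statistics import NormalDist

def z_star(confidence_level: float) -> float:
    """Two-sided critical value z* for a confidence level such as 0.95."""
    alpha = 1 - confidence_level  # significance level
    # z* leaves alpha/2 in each tail of the standard normal curve
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> alpha = {1 - level:.2f}, z* = {z_star(level):.3f}")
```

Rounded to three decimals, the results match the Z* column (1.645, 1.960, 2.576).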
Errors
Error | Definition |
Type I | Reject the H0 when it’s actually true |
Type II | Fail to reject the H0 when it’s actually false |
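One way to see what a Type I error looks like is to simulate it: run many tests where H0 really is true and count how often it gets rejected anyway. The sketch below assumes a two-sided 1-sample z test with a known σ of 1 and a true mean of 0. The function name, the sample size, the number of trials, and the seed are all arbitrary choices for illustration.

```python
import random
from statistics import NormalDist, mean

def type_i_error_rate(alpha=0.05, n=30, trials=4000, seed=1):
    """Fraction of two-sided 1-sample z tests that reject a TRUE H0: mu = 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the data really come from mu = 0, sigma = 1
        sample = [rng.gauss(0, 1) for _ in range(n)]
        # z = (x-bar - mu0) / (sigma / sqrt(n)) with mu0 = 0 and sigma = 1
        z = mean(sample) / (1 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to alpha, which is the point: alpha is the probability of a Type I error when H0 is true.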
Other Things to Remember
Thing | What’s important about it |
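Confidence Intervals | *ARE NOT PROBABILITIES! The confidence level describes the long-run success rate of the method, not the chance that one particular interval contains the parameter: “In the long run, X% of all confidence intervals constructed with this method will contain the true parameter.” |
Sx vs. σ | T-distributions are used because Sx only estimates the population standard deviation from the sample (as opposed to σ, which is the exact population value), and that estimate adds extra variability to the test statistic |
Power | Power is the probability of correctly rejecting a false H0. Formally, power = 1 - β, where β is the probability of a Type II error. You want the power of a significance test to be high. In order to increase power, increase the sample size. If you can’t, increase alpha (at the cost of a higher chance of a Type I error). |
S (Standard error of the residuals) | Estimate of the typical distance between the actual y values and the y values predicted from x (the typical size of a residual) |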
Z test vs. T test | Z test is used when σ is known. T test is used when σ is unknown. Think about normality later. |
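T-distributions | T procedures are robust (they still work reasonably well when the population is not exactly normal, especially with larger samples), but they are NOT resistant to outliers |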
Degrees of freedom (df) |
|
Margin of error and sample size |
|
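Two of the ideas in the table above are easy to check numerically: how power responds to sample size and alpha, and how large a sample a target margin of error needs. This is a rough sketch under stated assumptions: an upper-tailed 1-sample z test with known σ, and the usual z interval formulas ME = z*·σ/√n and n = (z*·σ/ME)². The function names and the example numbers (μ0 = 100, true mean 103, σ = 10) are made up for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_upper_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Power of a 1-sample z test of H0: mu = mu0 vs Ha: mu > mu0 when the true mean is mu_true."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    # How many standard errors the true mean sits above mu0
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # Probability that the test statistic exceeds the critical value
    return 1 - STD_NORMAL.cdf(z_crit - shift)

def sample_size_for_margin(margin, sigma, confidence_level):
    """Smallest n whose z-interval margin of error is at most `margin`."""
    z_star = STD_NORMAL.inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil((z_star * sigma / margin) ** 2)  # always round UP

for n, alpha in [(25, 0.05), (100, 0.05), (25, 0.10), (25, 0.01)]:
    power = power_upper_tailed_z(mu0=100, mu_true=103, sigma=10, n=n, alpha=alpha)
    print(f"n = {n:3d}, alpha = {alpha:.2f} -> power = {power:.3f}")

print("n needed for ME = 2 at 95%:", sample_size_for_margin(2, 10, 0.95))
```

In this setup a bigger n or a bigger alpha gives more power, and a smaller alpha gives less.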
List of Tests & Intervals
Method | When to Use | Conditions | Test Hypotheses | Test Conclusion | Interval Conclusion |
1-Sample Z Test / 1-Sample Z Interval | σ is known; counts are involved; one population | Random:
Independence:
Normality:
| H0: μ = some number; Ha: μ ≠, <, > some number | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that [context]” OR “..., we have significant evidence to reject the null hypothesis that [context] in favor of the alternative that [context]” | “We are [confidence level]% confident that the true mean [context] is between [lower bound] and [upper bound]” |
2-Sample Z Test / 2-Sample Z Interval | σ is known; counts are involved; two populations (comparing two means) | Random:
Independence:
Normality:
| H0: μ1 = μ2 or μ1-μ2 = 0; Ha: μ1 ≠, <, > μ2 or μ1-μ2 ≠, <, > 0 | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that [context]” OR “..., we have significant evidence to reject the null hypothesis that [context] in favor of the alternative that [context]” | “We are [confidence level]% confident that the true mean [context 1] is between [lower bound] and [upper bound] [higher/lower] than the true mean [context 2]” |
1-Proportion Z Test / 1-Proportion Z Interval | σ is known; proportions are involved; one population | Random:
Independence:
Normality:
| H0: P = some number; Ha: P ≠, <, > some number | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that [context]” OR “..., we have significant evidence to reject the null hypothesis that [context] in favor of the alternative that [context]” | “We are [confidence level]% confident that the true proportion of [context] is between [lower bound] and [upper bound]” |
2-Proportion Z Test / 2-Proportion Z Interval | σ is known; proportions are involved; two populations | Random:
Independence:
Normality:
| H0: P1 = P2 or P1-P2 = 0; Ha: P1 ≠, <, > P2 or P1-P2 ≠, <, > 0 | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that [context]” OR “..., we have significant evidence to reject the null hypothesis that [context] in favor of the alternative that [context]” | “We are [confidence level]% confident that the true proportion of [context 1] is between [lower bound] and [upper bound] [higher/lower] than the true proportion of [context 2]” |
1-Sample T Test / 1-Sample T Interval | Sx is known / σ is unknown; counts are involved; one population | Random:
Independence:
Normality:
| H0: μ = some number; Ha: μ ≠, <, > some number | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that [context]” OR “..., we have significant evidence to reject the null hypothesis that [context] in favor of the alternative that [context]” | “We are [confidence level]% confident that the true mean [context] is between [lower bound] and [upper bound]” |
Paired T Test / Paired T Interval | Sx of the differences is known / σ of the differences is unknown; counts are involved; one population, two treatments | Random:
*PAIRED T TESTS ARE NOT INDEPENDENT! (The two measurements within each pair are related.)
Normality:
| H0: μ(diff) = some number; Ha: μ(diff) ≠, <, > some number | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that [context]” OR “..., we have significant evidence to reject the null hypothesis that [context] in favor of the alternative that [context]” | “We are [confidence level]% confident that the true mean difference [context] is between [lower bound] and [upper bound]” |
2-Sample T Test / 2-Sample T Interval | Sx1 and Sx2 are known / σ1 and σ2 are unknown; counts are involved; two populations (comparing two means) | Random sampling/assignment:
Independence:
Normality:
| H0: μ1 = μ2 or μ1-μ2 = 0; Ha: μ1 ≠, <, > μ2 or μ1-μ2 ≠, <, > 0 | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that [context]” OR “..., we have significant evidence to reject the null hypothesis that [context] in favor of the alternative that [context]” | “We are [confidence level]% confident that the true mean [context 1] is between [lower bound] and [upper bound] [higher/lower] than the true mean [context 2]” |
Chi-Square: Goodness of Fit (GOF) | Extension of the 1-Proportion Z Test; one population; one row/column; is there a significant difference between the expected and observed proportions? | Random; all expected values are greater than 1; less than 20% of expected values are less than 5. *CHI-SQUARE DISTRIBUTIONS ARE NOT NORMAL! | H0: P1 = X, P2 = Y, P3 = Z…; Ha: There is a difference between the [observed] and the [expected] in at least one category. | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that there is no difference [context]” OR “..., we have significant evidence to reject the null hypothesis [context] in favor of the alternative that there is a difference in at least one category [context]” | N/A |
Chi-Square: Homogeneity | Extension of the 2-Proportion Z Test; multiple populations; multiple rows/columns; is there a significant difference in at least one proportion between the categories? | Random; all expected values are greater than 1; less than 20% of expected values are less than 5 | H0: P1 = P2 = P3…; Ha: There is a difference in at least one category between [populations]. | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that there is no difference [context]” OR “..., we have significant evidence to reject the null hypothesis [context] in favor of the alternative that there is a difference in at least one category [context]” | N/A |
Chi-Square: Association/Independence | Is there a significant association between categorical variables? | Random; all expected values are greater than 1; less than 20% of expected values are less than 5 | H0: There is no association between [variable 1] and [variable 2]. Ha: There is an association. | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that there is no association [context]” OR “..., we have significant evidence to reject the null hypothesis [context] in favor of the alternative that there is an association [context]” | N/A |
Linear Regression (Slope) T Test / Linear Regression (Slope) T Interval | Is there a linear relationship between quantitative variables? | Random; observations are independent; Linearity/Residuals:
| H0: The slope of the true line of regression between [x + context] and [y + context] is 0. Ha: The slope is ≠, <, > 0 | “Since a p-value of [p-value] is [greater/less] than an alpha of [significance level], we fail to reject the null hypothesis that the slope of the true line of regression [context] is 0” OR “..., we have significant evidence to reject the null hypothesis that the slope of the true line of regression [context] is 0 in favor of the alternative that there is [a/a positive/a negative] relationship [context]” | “We are [confidence level]% confident that the true slope of the regression line [context] is between [lower bound] and [upper bound]” Formula: b ± t*·SE_b |
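To see how one row of the table above turns into numbers and a written conclusion, here is a rough sketch of a two-sided 1-Proportion Z Test and Interval using only Python's standard library. The function name `one_proportion_z` and the data (58 successes out of 100, H0: P = 0.50) are made up for illustration. The sketch uses the usual AP convention of P0 in the standard error for the test and the sample proportion in the standard error for the interval.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, p0, confidence_level=0.95):
    """Two-sided 1-proportion z test of H0: P = p0, plus a z interval for P."""
    p_hat = successes / n
    # Test: the standard error uses the hypothesized proportion p0
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails, Ha: P != p0
    # Interval: the standard error uses the sample proportion p-hat
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return z, p_value, (p_hat - margin, p_hat + margin)

alpha = 0.05
z, p_value, (low, high) = one_proportion_z(successes=58, n=100, p0=0.50)

# Fill in the conclusion templates from the table
comparison = "less" if p_value < alpha else "greater"
decision = ("we have significant evidence to reject the null hypothesis"
            if p_value < alpha else "we fail to reject the null hypothesis")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"Since a p-value of {p_value:.4f} is {comparison} than an alpha of {alpha}, {decision}.")
print(f"We are 95% confident that the true proportion is between {low:.3f} and {high:.3f}.")
```

With these made-up numbers the p-value is larger than 0.05, so the test fails to reject H0, and the 95% interval contains 0.50, which agrees with that decision.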