Inference for Comparing Two Population Proportions
Investigating whether a proportion differs across two populations and estimating that difference
A Reminder
Additional Reminders
Examples
Example: Academic Social Media Use
Scenario: A study surveyed students about their use of social media for academic purposes. Among 92 public school students, 52.2% reported using social media for academic purposes, while among 87 private school students, 57.5% reported the same. Construct a 90% confidence interval to estimate the difference between the proportions of public and private school students who use social media for academic purposes.
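One way to carry out this interval is sketched below. It assumes the usual large-sample (Wald) interval with an unpooled standard error and a normal critical value; the function name `two_prop_ci` is hypothetical, and the course may prescribe a different tool or a slightly different method.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_ci(p1, n1, p2, n2, level):
    """Large-sample interval for p1 - p2 using the unpooled standard error."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value, e.g. about 1.645 for a 90% level.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Group 1 = public school, group 2 = private school (figures from the scenario).
low, high = two_prop_ci(0.522, 92, 0.575, 87, level=0.90)
print(f"90% CI for p_public - p_private: ({low:.3f}, {high:.3f})")
```

Because the interval is built for public minus private, a negative value corresponds to a higher rate among private school students. The interval here spans zero, so the data are compatible with no difference between the two groups.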
Example: Preference for Co-op Gaming
Example: Public Transit for Environmental Reasons
Scenario: A study compared public transportation use for environmental reasons between high school and college students. Among 74 high school students, 44.6% reported using public transportation for environmental reasons. Among 83 college students, 50.6% reported the same. Test whether college students are significantly more likely to use public transportation for environmental reasons than high school students.
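A minimal sketch of this test follows, assuming a one-sided two-proportion z-test with a pooled standard error under the null hypothesis of equal proportions. The function name `two_prop_z_test` is hypothetical, and some courses use the unpooled standard error instead, which would change the numbers slightly.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(p1, n1, p2, n2):
    """z statistic and upper-tail p-value for H0: p1 = p2 vs Ha: p1 > p2."""
    # Pooled estimate of the common proportion assumed under H0.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Group 1 = college, group 2 = high school, so Ha is p_college > p_high_school.
z, p_value = two_prop_z_test(0.506, 83, 0.446, 74)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.3f}")
```

The scenario does not state a significance level, so the p-value is left to be compared against whichever level is in use.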
Example: Coffee Sizes
Scenario: Students compared the sizes of lattes served at two university coffee shops. At Espresso Yoself, a sample of 59 lattes had an average size of 16.2 ounces with a standard deviation of 1.8 ounces. At Latte Da!, a sample of 71 lattes had an average size of 15.8 ounces with a standard deviation of 2.0 ounces. Construct a confidence interval to estimate the difference in mean latte sizes at the two coffee shops.
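The arithmetic for this interval can be sketched as follows. Two assumptions are made here that the scenario does not state: a 95% confidence level, and a normal critical value used as a large-sample stand-in for the t critical value. The helper name `welch_se_df` is hypothetical; it also reports the Welch-Satterthwaite degrees of freedom so that a t critical value can be looked up instead.

```python
from math import sqrt
from statistics import NormalDist


def welch_se_df(s1, n1, s2, n2):
    """Unpooled standard error and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, df


# Group 1 = Espresso Yoself, group 2 = Latte Da! (figures from the scenario).
se, df = welch_se_df(1.8, 59, 2.0, 71)
diff = 16.2 - 15.8

level = 0.95  # ASSUMPTION: the scenario does not give a confidence level
# ASSUMPTION: normal critical value; a t critical value with df degrees of
# freedom is slightly larger and gives a slightly wider interval.
crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
low, high = diff - crit * se, diff + crit * se
print(f"SE = {se:.3f}, Welch df = {df:.1f}")
print(f"Approximate 95% CI for the difference in means: ({low:.2f}, {high:.2f})")
```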
Example: Preference for Virtual School Events
Scenario: Two high schools conducted a survey to find out how many students prefer virtual events over in-person ones. At Cedar Valley High School, 87 students were surveyed, and about 62.1% expressed a preference for virtual events. At Summit Ridge High School, 93 students were surveyed, and about 53.8% favored virtual events. Construct a confidence interval to estimate the difference in the proportion of students at the two schools who prefer virtual events.
Example: Fitness App Effectiveness
Scenario: A fitness app developer wants to compare the average daily step counts of users for two different apps. For the Steptacular app, a random sample of 103 users showed an average of 8,520 steps per day with a standard deviation of 950 steps. For the Toe-Tally Fit app, a sample of 97 users had an average of 7,980 steps per day with a standard deviation of 1,020 steps. Test whether the average step count is significantly higher for users of Steptacular than for Toe-Tally Fit.
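A rough sketch of this test is shown below, assuming an unpooled (Welch) two-sample statistic. The p-value uses a normal approximation, which is an assumption made to keep the example within the Python standard library; with about 195 degrees of freedom, the t-based p-value is only slightly larger. The function name `welch_t` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Group 1 = Steptacular, group 2 = Toe-Tally Fit; Ha: mean1 > mean2.
t_stat, df = welch_t(8520, 950, 103, 7980, 1020, 97)

# ASSUMPTION: upper-tail p-value from the standard normal rather than the
# t distribution with df degrees of freedom.
p_approx = 1 - NormalDist().cdf(t_stat)
print(f"t = {t_stat:.2f}, Welch df = {df:.1f}, approximate p-value = {p_approx:.5f}")
```

A test statistic this far into the upper tail gives a very small p-value under either reference distribution, so the choice between the normal and t distributions does not affect the conclusion here.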
Example: Doomscrolling
Scenario: A study measured the average time spent scrolling through posts on BlueSky for two age groups. Among 78 individuals aged 16-20, the average scrolling time was 14.6 minutes with a standard deviation of 3.1 minutes. Among 82 individuals aged 21-24, the average time was 15.2 minutes with a standard deviation of 3.4 minutes. Construct a confidence interval to estimate the difference in mean scrolling time between the two groups.
Example: Curated Playlist Usage
Scenario: A survey investigated the use of curated playlists on music streaming platforms. Among 91 listeners aged 16-18, 59.3% reported using curated playlists. Among 113 listeners aged 19-24, 61.9% reported the same. Test whether the proportion of curated playlist users differs between these two age groups.
Example: Loading Times
Scenario: Researchers measured the load times of games on two gaming consoles. For PS5, 46 games had an average load time of 19.4 seconds with a standard deviation of 2.8 seconds. For Nintendo Switch, 54 games had an average load time of 20.7 seconds with a standard deviation of 3.1 seconds. Construct a confidence interval to estimate the difference in mean load times between the two consoles.
Example: Book Medium Preference
Scenario: A survey asked students whether they prefer e-books over physical books for academic use. Among 95 undergraduate students, 67.4% preferred e-books, while among 88 graduate students, 72.7% preferred e-books. Construct a confidence interval to estimate the difference in the proportion of undergraduate and graduate students who prefer e-books.
Example: Community Service
Scenario: A study compared the participation rates in community service between high school and college students. Of 124 high school students surveyed, 29% had participated in community service in the past year. Among 119 college students surveyed, 35.3% had participated. Test whether college students are significantly more likely to participate in community service than high school students.
Example: Tutoring Effectiveness
Scenario: Two online tutoring platforms were compared based on their effectiveness in improving math test scores. For BrainyBees.com, 84 students showed an average score improvement of 15.3 points with a standard deviation of 4.2 points. For QuizWhiz.com, 79 students had an average improvement of 13.8 points with a standard deviation of 3.9 points. Test whether BrainyBees leads to significantly greater average improvements than QuizWhiz.
Example: Streaming Subscriptions
Scenario: A streaming service compared subscription rates for two age groups. Among 67 individuals aged 18-20, 80.6% had active subscriptions. Among 89 individuals aged 21-24, 76.4% had subscriptions. Test whether the subscription rate is significantly higher for the younger age group.
Example: Plant-Based Burgers
Scenario: Teenagers and young adults were surveyed about their preference for plant-based burgers. Among 101 teenagers, 37.6% preferred plant-based burgers. Among 108 young adults, 45.4% expressed the same preference. Construct a confidence interval to estimate the difference between the proportions of teenagers and young adults who prefer plant-based burgers.
Example: Strength Training
Scenario: Two weightlifting programs were compared based on bench press performance. Among 53 participants in the beginner program, the average increase in bench press weight was 23.1 pounds with a standard deviation of 7.8 pounds. Among 63 participants in the advanced program, the average increase was 27.6 pounds with a standard deviation of 4.9 pounds. Test whether the advanced program increases bench press performance significantly more than the beginner program.
Inference: Where We’ve Been and Where We Are Headed
Inference On… | Covered? |
One Numerical Variable | ✔️ |
One Binary Categorical Variable | ✔️ |
Associations Between a Numerical Variable and a Binary Categorical Variable | ✔️ |
Associations Between Two Binary Categorical Variables | ✔️ |
One MultiClass Categorical Variable | We’ll Omit |
Associations Between Two MultiClass Categorical Variables | We’ll Omit |
Associations Between One Numerical Variable and One MultiClass Categorical Variable | |
Associations Between Two Numerical Variables | |
Next Time…