1 of 21

Inference for Comparing Two Population Proportions

Investigating whether a proportion differs across two populations and estimating that difference

2 of 21

A Reminder

  •  

3 of 21

Additional Reminders

  •  

4 of 21

Examples

  • As usual, the following slides contain example problems for us to complete
  • Each example involves a request to perform inference comparing proportions or means across two populations
  • You’ll need to decide…
    • What type of inference is appropriate (confidence interval or hypothesis test)
    • Whether the inference is being done to compare population means or population proportions
      • The majority of the scenarios will involve proportions, but there are a few scenarios investigating means mixed in

5 of 21

Example: Academic Social Media Use

Scenario: A study surveyed students about their use of social media for academic purposes. Among 92 public school students, 52.2% reported using social media for academic purposes, while among 87 private school students, 57.5% reported the same. Construct a 90% confidence interval to estimate the difference between the proportions of public and private school students who use social media for academic purposes.
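One way to check a hand calculation for this scenario is sketched below. It assumes the usual large-sample (unpooled) standard error for a difference in proportions and a normal critical value; the function name `two_proportion_ci` and the ordering public minus private are illustrative choices, not something the slide specifies.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(p1, n1, p2, n2, level):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Critical value that leaves (1 - level) / 2 in each tail of the normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * se
    return diff - margin, diff + margin


# Sample proportions and sizes as stated in the scenario (public, then private).
lower, upper = two_proportion_ci(0.522, 92, 0.575, 87, level=0.90)
print(f"90% CI for p_public - p_private: ({lower:.3f}, {upper:.3f})")
```

If the interval contains 0, the data are consistent with no difference between the two groups at this confidence level.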

6 of 21

Example: Preference for Co-op Gaming

  •  

7 of 21

Example: Public Transit for Environmental Reasons

Scenario: A study compared public transportation use for environmental reasons between high school and college students. Among 74 high school students, 44.6% reported using public transportation for environmental reasons. Among 83 college students, 50.6% reported the same. Test whether college students are significantly more likely to use public transportation for environmental reasons than high school students.
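A minimal sketch of the one-sided test for this scenario follows. It assumes the pooled-proportion standard error commonly used under the null hypothesis of equal proportions, and it recovers success counts by rounding the reported percentages, which is an assumption about how the percentages were produced. The function name `pooled_z_statistic` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pooled_z_statistic(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled sample proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


# Counts recovered by rounding (assumption): 50.6% of 83 and 44.6% of 74.
college_yes = round(0.506 * 83)
high_school_yes = round(0.446 * 74)

z = pooled_z_statistic(college_yes, 83, high_school_yes, 74)
# Alternative hypothesis: p_college > p_high_school, so use the upper tail.
p_value = 1 - NormalDist().cdf(z)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.3f}")
```

The p-value is then compared with the chosen significance level, which the scenario leaves to the reader.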

8 of 21

Example: Coffee Sizes

Scenario: Students compared the sizes of lattes served at two university coffee shops. At Espresso Yoself, a sample of 59 lattes had an average size of 16.2 ounces with a standard deviation of 1.8 ounces. At Latte Da!, a sample of 71 lattes had an average size of 15.8 ounces with a standard deviation of 2.0 ounces. Construct a confidence interval to estimate the difference in mean latte sizes at the two coffee shops.
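The sketch below shows how the interval for this scenario could be computed. Two assumptions are worth flagging: the slide does not state a confidence level, so 95% is used purely for illustration, and a normal critical value is used as a large-sample approximation. A t-based critical value, as the course may require, would make the interval slightly wider. The function name `two_mean_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_ci(mean1, sd1, n1, mean2, sd2, n2, level):
    """Approximate CI for mu1 - mu2 using a normal critical value (large samples)."""
    diff = mean1 - mean2
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * se
    return diff - margin, diff + margin


# Espresso Yoself first, Latte Da! second; level=0.95 is an illustrative assumption.
lower, upper = two_mean_ci(16.2, 1.8, 59, 15.8, 2.0, 71, level=0.95)
print(f"Approximate 95% CI for the difference in mean size: ({lower:.2f}, {upper:.2f}) oz")
```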

9 of 21

Example: Preference for Virtual School Events

Scenario: Two high schools conducted a survey to find out how many students prefer virtual events over in-person ones. At Cedar Valley High School, 87 students were surveyed, and about 62.1% expressed a preference for virtual events. At Summit Ridge High School, 93 students were surveyed, and about 53.8% favored virtual events. Construct a confidence interval to estimate the difference in the proportion of students at the two schools who prefer virtual events.

10 of 21

Example: Fitness App Effectiveness

Scenario: A fitness app developer wants to compare the average daily step counts of users of two different apps. For the Steptacular app, a random sample of 103 users showed an average of 8,520 steps per day with a standard deviation of 950 steps. For the Toe-Tally Fit app, a sample of 97 users had an average of 7,980 steps per day with a standard deviation of 1,020 steps. Test whether the average step count is significantly higher for users of Steptacular than for users of Toe-Tally Fit.
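A rough check of the test statistic for this scenario is sketched below. It assumes the unpooled standard error for a difference in means and, because both samples are fairly large, approximates the p-value with the normal distribution. The course may instead call for a t-distribution, which would give a slightly larger p-value. The function name `two_mean_statistic` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized difference in sample means under H0: mu1 = mu2."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se


# Steptacular first, Toe-Tally Fit second, as stated in the scenario.
stat = two_mean_statistic(8520, 950, 103, 7980, 1020, 97)
# Alternative hypothesis: mu_Steptacular > mu_ToeTallyFit (upper tail).
# Normal approximation to the p-value (assumption; a t-based value is slightly larger).
p_value = 1 - NormalDist().cdf(stat)
print(f"test statistic = {stat:.2f}, approximate one-sided p-value = {p_value:.5f}")
```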

11 of 21

Example: Doomscrolling

Scenario: A study measured the average time spent scrolling through posts on BlueSky for two age groups. Among 78 individuals aged 16-20, the average scrolling time was 14.6 minutes with a standard deviation of 3.1 minutes. Among 82 individuals aged 21-24, the average time was 15.2 minutes with a standard deviation of 3.4 minutes. Construct a confidence interval to estimate the difference in mean scrolling time between the two groups.

12 of 21

Example: Curated Playlist Usage

Scenario: A survey investigated the use of curated playlists on music streaming platforms. Among 91 listeners aged 16-18, 59.3% reported using curated playlists. Among 113 listeners aged 19-24, 61.9% reported the same. Test whether the proportion of curated playlist users differs between these two age groups.
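Because this scenario asks whether the proportions differ, with no direction stated, a two-sided test applies. The sketch below assumes a pooled standard error under the null hypothesis and recovers counts by rounding the reported percentages (an assumption). The function name `two_sided_proportion_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_proportion_test(x1, n1, x2, n2):
    """Return (z, p_value) for H0: p1 = p2 against Ha: p1 != p2."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    # Two-sided p-value: double the tail area beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Counts recovered by rounding (assumption): 59.3% of 91 and 61.9% of 113.
younger_yes = round(0.593 * 91)
older_yes = round(0.619 * 113)

z, p_value = two_sided_proportion_test(younger_yes, 91, older_yes, 113)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.3f}")
```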

13 of 21

Example: Loading Times

Scenario: Researchers measured the load times of games on two gaming consoles. For PS5, 46 games had an average load time of 19.4 seconds with a standard deviation of 2.8 seconds. For Nintendo Switch, 54 games had an average load time of 20.7 seconds with a standard deviation of 3.1 seconds. Construct a confidence interval to estimate the difference in mean load times between the two consoles.

14 of 21

Example: Book Medium Preference

Scenario: A survey asked students whether they prefer e-books over physical books for academic use. Among 95 undergraduate students, 67.4% preferred e-books, while among 88 graduate students, 72.7% preferred e-books. Construct a confidence interval to estimate the difference between the proportions of undergraduate and graduate students who prefer e-books.

15 of 21

Example: Community Service

Scenario: A study compared the participation rates in community service between high school and college students. Of 124 high school students surveyed, 29% had participated in community service in the past year. Among 119 college students surveyed, 35.3% had participated. Test whether college students are significantly more likely to participate in community service than high school students.

16 of 21

Example: Tutoring Effectiveness

Scenario: Two online tutoring platforms were compared based on their effectiveness in improving math test scores. For BrainyBees.com, 84 students showed an average score improvement of 15.3 points with a standard deviation of 4.2 points. For QuizWhiz.com, 79 students had an average improvement of 13.8 points with a standard deviation of 3.9 points. Test whether BrainyBees leads to significantly greater average improvements than QuizWhiz.

17 of 21

Example: Streaming Subscriptions

Scenario: A streaming service compared subscription rates for two age groups. Among 67 individuals aged 18-20, 80.6% had active subscriptions. Among 89 individuals aged 21-24, 76.4% had subscriptions. Test whether the subscription rate is significantly higher for the younger age group.

18 of 21

Example: Plant-Based Burgers

Scenario: Teenagers and young adults were surveyed about their preference for plant-based burgers. Among 101 teenagers, 37.6% preferred plant-based burgers. Among 108 young adults, 45.4% expressed the same preference. Construct a confidence interval to estimate the difference between the proportions of teenagers and young adults who prefer plant-based burgers.

19 of 21

Example: Strength Training

Scenario: Two weightlifting programs were compared based on bench press performance. Among 53 participants in the beginner program, the average bench press weight increased by 23.1 pounds with a standard deviation of 7.8 pounds. Among 63 participants in the advanced program, the average increase was 27.6 pounds with a standard deviation of 4.9 pounds. Test whether the advanced program increases bench press performance significantly more than the beginner program.

20 of 21

Inference: Where We’ve Been and Where We Are Headed

Inference On… | Covered?
One Numerical Variable | ✔️
One Binary Categorical Variable | ✔️
Associations Between a Numerical Variable and a Binary Categorical Variable | ✔️
Associations Between Two Binary Categorical Variables | ✔️
One MultiClass Categorical Variable | We’ll Omit
Associations Between Two MultiClass Categorical Variables | We’ll Omit
Associations Between One Numerical Variable and One MultiClass Categorical Variable |
Associations Between Two Numerical Variables |

21 of 21

Next Time…

  • What we’ll be doing…
    • Inference for Categorical Variables with More than Two Levels
  • How to prepare…
    • Read sections 11.1, 11.2, and 11.4 in our textbook
  • Homework: Complete HW 8 (Hypothesis Tests for Comparing Parameters Across Two Populations) on MyOpenMath