API CAN CODE: Data in Learners’ Lives
Lesson 6: Building a Survey for Data Collection
This work was made possible through generous support from the National Science Foundation (Award # 2141655).
Lesson 1.5 Recap
Value
The ability to extract meaningful insights
Velocity
The speed at which data is generated
Volume
The vast amount of data generated
Veracity
Accuracy, reliability, and trustworthiness of data
Variety
The diverse types and formats of data
Good Data Science Questions
Good Data Science Questions can be answered with Data!
Data can come from different places.
Collections of data, called datasets, reflect measurements from multiple places, multiple time-points, or multiple people.
Think of a data science question about the local issue you’ve come up with, or revise an existing one.
(look at these examples for inspiration)
Project - Potential Primary Source
Think of a primary source for your project.
What question would you like to answer?
Who would you like to gather data from? (Is there a particular population of interest?)
What questions should you ask? What kinds of responses do you want to get back?
Messy Data
Write down your answer to: What grade are you in?
Check with the students around you.
Discuss:
Messy Data - Recap
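One way to see why free-text answers get messy is to try cleaning them. The sketch below is only an illustration: the example responses, the word list (which assumes US high school naming such as "freshman"), and the helper name `clean_grade` are all made up for this example.

```python
import re

# Hypothetical free-text answers to "What grade are you in?"
responses = ["9", "9th", "ninth", "Grade 9", "freshman",
             " 10th grade ", "sophomore", "11"]

# Assumed mapping from words to grade numbers (US high school naming).
WORDS = {
    "ninth": 9, "tenth": 10, "eleventh": 11, "twelfth": 12,
    "freshman": 9, "sophomore": 10, "junior": 11, "senior": 12,
}

def clean_grade(text):
    """Turn one messy answer into a grade number, or None if unclear."""
    text = text.strip().lower()
    match = re.search(r"\d+", text)   # look for digits first, e.g. "9th"
    if match:
        return int(match.group())
    for word, grade in WORDS.items():  # otherwise look for a known word
        if word in text:
            return grade
    return None

cleaned = [clean_grade(r) for r in responses]
print(cleaned)  # [9, 9, 9, 9, 9, 10, 10, 11]
```

Eight different ways of writing an answer collapse into three grade numbers. Asking for a number in the first place avoids most of this cleanup.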
Sampling Designs - SRS
[Figure: Simple Random Sampling (SRS). Population: units 1 to 12. Sample: units 3, 11, and 5.]
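A simple random sample can be sketched in a few lines of Python. This is a minimal illustration, assuming a population of 12 numbered units; the function name `simple_random_sample` and the seed value are made up for this example.

```python
import random

# Hypothetical population: 12 numbered units.
population = list(range(1, 13))

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement; every unit is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(units, n)

sample = simple_random_sample(population, 3, seed=1)
print(sample)
```

Running it with a different seed (or no seed) gives a different sample of 3, which is the point: chance, not the researcher, decides who is picked.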
Sampling Designs - Systematic
[Figure: Systematic sampling. Population: units 1 to 12. Sample (Every 3rd): units 2, 5, 8, and 11.]
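Systematic sampling takes every k-th unit from an ordered list. The sketch below assumes 12 numbered units and k = 3; the function name `systematic_sample` and its `start` parameter are made up for this example.

```python
import random

# Hypothetical population: 12 numbered units in a fixed order.
population = list(range(1, 13))

def systematic_sample(units, k, start=None, seed=None):
    """Pick a starting position among the first k units, then take every k-th unit."""
    if start is None:
        start = random.Random(seed).randrange(k)  # random position 0 .. k-1
    return units[start::k]

# Starting at the second unit and taking every 3rd:
print(systematic_sample(population, 3, start=1))  # [2, 5, 8, 11]
```

Leaving `start` unset picks the starting position at random, which keeps the researcher from choosing a convenient first unit.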
Sampling Designs - Stratified
[Figure: Stratified sampling. Labels: Population, Strata, Random Sample.]
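In stratified sampling, the population is split into groups (strata) and a random sample is drawn from every group. The sketch below is an illustration only: the 18 students, the use of grade level as the stratum, and the function name `stratified_sample` are all assumptions for this example.

```python
import random

# Hypothetical population: (student_id, grade) pairs; grade is the stratum.
grades = [9] * 6 + [10] * 6 + [11] * 6
population = [(i, grade) for i, grade in enumerate(grades, start=1)]

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then randomly sample within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for key in sorted(strata):  # visit strata in a stable order
        sample.extend(rng.sample(strata[key], n_per_stratum))
    return sample

sample = stratified_sample(population, lambda u: u[1], 2, seed=1)
print(sample)
```

Every grade ends up with exactly 2 students in the sample, so no grade can be left out by chance.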
Sampling Designs - Cluster
[Figure: Cluster sampling. Labels: Population, Clusters, Sample Group (2 Clusters).]
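In cluster sampling, whole groups are chosen at random and everyone in the chosen groups is included. The sketch below assumes 15 units arranged in 5 clusters of 3 (think of 5 classrooms); the cluster names and the function name `cluster_sample` are made up for this example.

```python
import random

# Hypothetical population: 15 units in 5 clusters of 3.
clusters = {
    "A": [1, 2, 3],
    "B": [4, 5, 6],
    "C": [7, 8, 9],
    "D": [10, 11, 12],
    "E": [13, 14, 15],
}

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

chosen, sample = cluster_sample(clusters, 2, seed=1)
print(chosen, sample)
```

Notice the difference from stratified sampling: here the randomness picks the groups, not the individuals inside them.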
Some BAD Sampling Designs…
Convenience samples - based on ease of access rather than representation (like asking your friends to fill out your survey, or giving a survey to the first 100 people to attend a school sports game!).
Voluntary Response samples - participants choose for themselves whether to take part and share their opinions. Typically advertised with a poster and QR code.
What issues of representation and bias are involved?
Why do sampling designs matter?
Food Deserts and Sampling Design
Food Deserts and Sampling Design
What might bad sampling design look like if you were studying food deserts?
Racial Profiling
Google Form Survey Design
Design a Google Form to collect data on your local issue and answer your question!
Response Validation
Use the response validation tool in Google Forms to require respondents to answer in a particular format (like a number!).
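To see what a validation rule does, here is a small Python sketch of a "whole number between 6 and 12" check. It is not the Google Forms tool itself, only an illustration of the idea; the function name `is_valid_grade` and the 6 to 12 range are assumptions for this example.

```python
def is_valid_grade(answer, low=6, high=12):
    """Accept only whole numbers between low and high (hypothetical rule)."""
    answer = answer.strip()
    # Require plain digits so answers like "9th" or "ninth" are rejected.
    if not (answer.isascii() and answer.isdigit()):
        return False
    return low <= int(answer) <= high

for answer in ["9", "9th", "ninth", "13", " 10 "]:
    print(repr(answer), is_valid_grade(answer))
```

Only "9" and " 10 " pass. A rule like this stops messy answers at the door instead of leaving them to be cleaned up later.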
Sharing Your Survey
Exit Ticket
The following represent different people doing data science, such as political pollsters, students, administrators, and the manager of a movie theater. Name each of the following sampling designs:
Thanks!
apicancode@umd.edu
API Can Code is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.