API CAN CODE
Data in Learners’ Lives
Lesson 2: Data Collection and its Purpose & Impact
This work was made possible through generous support from the National Science Foundation (Award # 2141655).
Warmup
Brainstorm 5 sources of data in your life (or recall them from the last lesson!) to use for today’s lesson!
Lesson 1.1 Recap
Data Never Sleeps
Explore the Data Never Sleeps website!
Look at Data Never Sleeps 1.0. Then, go to Data Never Sleeps 12.0. How are they similar? How are they different?
Data Never Sleeps
Explore the Data Never Sleeps website!
Who might collect or care about some of these data streams?
Data Collectors
Return to your list of sources of data in your life from the warmup.
Come up with one or more “collectors” for each source of data you listed: someone who might be interested in collecting the data you create or consume.
Discussion - Data Around the World
Data and Privacy - Class Debate
Is there a privacy issue here? What is it?
TikTok accesses your location, contact info, browsing history, and other usage data and is able to share this data with other apps or websites.
Data and Privacy - Class Debate
Is there a privacy issue here? What is it?
One of your neighbors had something stolen off his porch, so he installed a Ring camera on his front door. The camera has a view of most of the street, and movement within a distance of 25 feet is usually recorded and collected by Amazon.
Discussion - Privacy in Data Collection & Consumption
Representation in Data Collection
Political polls used to be done by calling people on landlines; now, they are done by calling cell phones, from a number the recipient probably doesn’t recognize.
Who is represented in these polls? Who is not?
What makes someone more or less likely to be represented?
Representation in Data Collection
Some polls are done through social media. Common sites are Facebook and Twitter.
Who is represented in these polls?
Who isn’t? Could this lead to any problems?
Representation in Data Collection
Early facial recognition software was trained predominantly on photos of white people.
Who is represented in the training set? Who isn’t?
Could this lead to any problems?
Representation in Data Collection
Read one of these articles:
First article: What does a “false positive” mean in a criminal investigation? How can bias in representation cause issues?
Second article: Who is overrepresented in photo-matching data? What problem might this cause?
Closing Discussion
Exit Ticket
Imagine a movie production company wants to know which of two new movie ideas will sell more tickets.
They collect data by going up to people who are leaving an AMC theater near a local college campus between 1 pm and 6 pm and asking them which idea they prefer.
Who is underrepresented in this survey?
How could the issue of representation bias the company’s conclusions?
Thanks!
apicancode@umd.edu
API Can Code is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.