Making your own summary tables
Welcome to the microdata, where everyone is data!
A quick guided tour of the history of the Census data
A census taker counts household members in 1850
In this picture, a clerk uses one of Herman Hollerith's electric tabulating machines to process census data in 1890.
It’s better
Source: Census/Pew Research Center
It’s complicated
Source: Census/Pew Research Center
http://www.pewsocialtrends.org/interactives/multiracial-timeline/
Data is not the same over time: each questionnaire is informed by, and adjusted for, previous and current issues
One big job for a data journalist: understanding how researchers thought about data and surveys over time, and finding responsible ways to make sense of the data even when that's not easy
All the terms are here: https://docs.google.com/spreadsheets/d/1KMXnlx6VLAkfctCO5Y_Zm-Q7oqV6KDXeqOatKuiT9xE/edit?usp=sharing
Microdata
👆 raw microdata sample
Yuck!
What is that?!
Microdata
Summary data
Microdata allows you to make your own tables!
https://dl.dropboxusercontent.com/u/100051277/italian_barbers.pdf
Census didn’t have what you wanted? No problem: make it yourself!
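To make the idea concrete, here is a minimal Python sketch of what "make your own table" means: microdata is one record per person, and a summary table is just a count of those records by the categories you care about. The records, field names, and values below are made up for illustration; they are not real census data.

```python
from collections import Counter

# Hypothetical microdata: one dict per person. The field names and values
# are invented for illustration and are not real census records.
people = [
    {"year": 1900, "occupation": "barber", "birthplace": "Italy"},
    {"year": 1900, "occupation": "barber", "birthplace": "Germany"},
    {"year": 1900, "occupation": "farmer", "birthplace": "Italy"},
    {"year": 1910, "occupation": "barber", "birthplace": "Italy"},
    {"year": 1910, "occupation": "barber", "birthplace": "Italy"},
]


def summarize(records, occupation):
    """Count the people in one occupation by (year, birthplace)."""
    return Counter(
        (r["year"], r["birthplace"])
        for r in records
        if r["occupation"] == occupation
    )


barbers = summarize(people, "barber")
for (year, birthplace), count in sorted(barbers.items()):
    print(year, birthplace, count)
```

The published summary tables are this same kind of count, done in advance for the categories the Census chose. With microdata you pick the categories yourself.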
Microdata
IPUMS
PUMS = “The American Community Survey (ACS) Public Use Microdata Sample (PUMS) files are a set of untabulated records about individual people or housing units.” — Census
IPUMS = Integrated Public Use Microdata Series (census microdata for social and economic research) — IPUMS
Microdata
Statistical concepts
When x goes up, so does y
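This slide appears to describe a positive correlation. As a rough sketch, assuming that reading, the Pearson correlation coefficient puts a number on "when x goes up, so does y": values near +1 mean the two rise together, values near 0 mean no linear relationship. The sample numbers below are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How x and y move together, and how much each varies on its own.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("correlation is undefined when one list never changes")
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical values, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 2))
```

A high correlation does not tell you that x causes y, which is one more reason to check your analysis with experts.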
Statistical concepts
Statistics 101:
https://knightcenter.utexas.edu/blog/understanding-statistics-journalists-guide
http://journalistsresource.org/tip-sheets/research/statistics-for-journalists?utm_source=Journalist%27s%20Resource&utm_campaign=0ef5939417-10-4-11_campaign&utm_medium=email
Statistical concepts
READ MORE! AND DOUBLE-CHECK YOUR DATA ANALYSIS BY INTERVIEWING EXPERTS!
https://knightcenter.utexas.edu/blog/understanding-statistics-journalists-guide
http://journalistsresource.org/tip-sheets/foundations/math-for-journalists
IPUMS online search
Microdata exercise!
Our task: According to the BBC, 51% of all nail salon technicians are of Vietnamese descent. For your nut graf you want to find out how many Vietnamese people have worked as nail technicians over time. Let’s see how that trend developed: http://www.bbc.com/news/magazine-32544343
IPUMS online search
0
Conceptually prep your spreadsheet: Start with what you want your result table to look like and work backward.
Columns: years
Rows: birthplace
Filters: All manicurists since 1950
My ideal table
IPUMS online search
1
IPUMS online search
2
In another tab go to the data dictionary page:
https://usa.ipums.org/usa-action/variables/group
If you go to Person > Work, IPUMS will display a range of categories, along with the years and surveys in which each is available. Basically, each category represents a question related to the subject you picked.
Find the code for occupations:
OCC1950 = Occupation, 1950 basis (IPUMS harmonized occupations over time against the 1950 classification, so the codes stay comparable across census years. If you only want a snapshot in time, OCC1990 or OCC2010 might work just fine for ya).
Click on it, then go to codes. You should find a full list of professions you can search for. The closest to nail salon workers we can find is Barbers, beauticians, and manicurists (code = 740).
IPUMS online search
3
Open this page in a third tab:
https://sda.usa.ipums.org/all_usa_samples/Doc/nes.htm
Go to Alphabetical Variable List.
It shows EVERY CATEGORY (dictionary item) and you can choose to display those categories as an alphabetical list.
Find the birthplace code (bpl) and the census year code (year)
IPUMS online search
4
Let’s bring it all together and make a table!
Row: bpl for birthplace
Column: year for census year
Then for filter, type in this:
occ1950(740) year(1950,1960,1970,1980,1990,2000,2010)
That filters your table to occupation code 740 (barbers, beauticians, and manicurists) and to those specific census years.
Make sure these are checked
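If it helps to see the logic outside the IPUMS form, here is a rough Python sketch of what that request does: keep only records that pass the filter, then count them with birthplace as rows and year as columns. The records are hypothetical, the birthplace labels stand in for the numeric bpl codes, and any occupation code other than 740 is a placeholder.

```python
from collections import defaultdict

# Mirrors the filter: occ1950(740) year(1950,1960,...,2010)
YEARS = (1950, 1960, 1970, 1980, 1990, 2000, 2010)
OCC_CODE = 740  # barbers, beauticians, and manicurists

# Hypothetical person records, for illustration only.
records = [
    {"year": 1990, "occ1950": 740, "bpl": "Vietnam"},
    {"year": 2010, "occ1950": 740, "bpl": "Vietnam"},
    {"year": 2010, "occ1950": 740, "bpl": "Vietnam"},
    {"year": 2010, "occ1950": 740, "bpl": "California"},
    {"year": 2010, "occ1950": 100, "bpl": "Vietnam"},  # other occupation: dropped
    {"year": 2005, "occ1950": 740, "bpl": "Vietnam"},  # year not in filter: dropped
]


def crosstab(rows):
    """Count filtered records with birthplace as rows and year as columns."""
    table = defaultdict(lambda: dict.fromkeys(YEARS, 0))
    for r in rows:
        if r["occ1950"] == OCC_CODE and r["year"] in YEARS:
            table[r["bpl"]][r["year"]] += 1
    return table


table = crosstab(records)
print("bpl", *YEARS)
for bpl, counts in sorted(table.items()):
    print(bpl, *(counts[year] for year in YEARS))
```

This sketch counts raw records only. It does not apply survey weights, so it shows the shape of the table rather than reproducing IPUMS output.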
IPUMS online search
5
IPUMS spits out this ugly HTML table
Some explanations:
Column percentage: the share of the whole column that this value represents.
Our best guess for this percentage is 2.4%, but we’re 95% confident that the true value lies between 1.9% and 2.9%.
Number of people who said in that year that they were barbers, beauticians, and/or manicurists.
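To get a feel for where an interval like "1.9% to 2.9%" can come from, here is a minimal sketch using the textbook normal approximation for a proportion. This is an assumption for illustration: IPUMS may compute its interval differently, and the sample size below is hypothetical, chosen only so the result lands near the numbers on the slide.

```python
from math import sqrt


def column_percent(cell_count, column_total):
    """Share of a column's total that one cell represents, in percent."""
    return 100 * cell_count / column_total


def approx_ci_95(share, sample_size):
    """Approximate 95% interval for a proportion (normal approximation)."""
    standard_error = sqrt(share * (1 - share) / sample_size)
    margin = 1.96 * standard_error
    return share - margin, share + margin


share = 0.024        # the 2.4% from the slide
sample_size = 3600   # hypothetical number of people in the column
low, high = approx_ci_95(share, sample_size)
print(f"{share:.1%} (95% interval: {low:.1%} to {high:.1%})")
```

The smaller the sample behind a cell, the wider the interval gets, so treat percentages built on a handful of people with caution.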
IPUMS online search
6
You can copy and paste that ugly HTML table into a spreadsheet.
See example file:
https://docs.google.com/spreadsheets/d/1kFClouCSPVN7gSNWH8do1W26yK7V2B3uIhw1Li83qT4/edit?usp=sharing
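If copy and paste mangles the table, one alternative is to save the results page and pull the rows out with a short script. The sketch below uses only Python's standard library and a tiny made-up table; the real IPUMS page has more markup, so treat this as a starting point rather than a finished scraper.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every table cell, one list per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up table standing in for the IPUMS output.
html = """<table>
<tr><th>bpl</th><th>1990</th><th>2000</th></tr>
<tr><td>Place A</td><td>10</td><td>25</td></tr>
</table>"""

parser = TableRows()
parser.feed(html)

# Write the rows as CSV text, which any spreadsheet program can open.
buffer = io.StringIO()
csv.writer(buffer).writerows(parser.rows)
print(buffer.getvalue())
```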