1 of 14

IPUMS

Dr. Maximilian Hell

Code for America

March 2, 2020

2 of 14

The best data, shamefully underused

American Community Survey (ACS): used to be the long-form Census, now an annual survey

Questions on: educational attainment, income, language proficiency, race, migration, disability, employment, housing characteristics.

Not the most fine-grained questions...

But the samples are h-u-g-e and representative!

So: we get lots of detail, both across geography and over time

3 of 14

Census Bureau makin’ things difficult

The Problem: the actual census.gov website is terrible

Only answers certain specific questions, confusing layout

Very inaccessible for mere mortals

4 of 14

ipums.org is here to help!

  • Sign up, browse variables, select sample, create extract, download data
  • Maintained by the University of Minnesota Population Center
  • Has lots of other data: more detailed economic data (CPS), how people spend their time (ATUS), health (NHIS), higher ed (NSCG), geography (NHGIS), full-count Census data from 1940, etc.
  • Easy to use, superb documentation
  • Let’s quickly check it out
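Once an extract is downloaded, the first step is to load it. A minimal Python sketch, assuming the extract was requested in CSV format and contains only numeric codes; the sample text below is a made-up stand-in for a real file, and `load_extract` is a hypothetical helper name:

```python
import csv
import io

# Hypothetical stand-in for a downloaded CSV extract; all values are made up.
SAMPLE_EXTRACT = """YEAR,PERWT,MET2013,TRANWORK,TRANTIME
2018,120,1,10,25
2018,95,1,30,50
2018,80,2,10,15
"""

def load_extract(fileobj):
    """Read a CSV extract and convert every column to an integer code."""
    reader = csv.DictReader(fileobj)
    return [{name: int(value) for name, value in row.items()} for row in reader]

# With a real file, pass open("your_extract.csv", newline="") instead.
rows = load_extract(io.StringIO(SAMPLE_EXTRACT))
print(len(rows), "records with columns", sorted(rows[0]))
```

The codes mean nothing without the codebook that comes with the extract, so keep it next to the data.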

5 of 14

Example: Commute Times across the U.S.

Let’s look at 5 variables:

TRANTIME, the time in minutes it takes a person to get to work

TRANWORK, the mode of transportation a person takes to work

EDUC, the highest year of school a person has completed

MET2013, the metro area a person lives in

RACE, a person’s racial self-identification
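A rough Python sketch of how these variables combine into a distribution summary: a weighted median commute time per metro area and transportation mode. It assumes the extract also carries the person weight PERWT and that a TRANTIME of 0 marks people without a commute; both should be checked against the codebook. The records and the metro and mode codes are made up for illustration.

```python
from collections import defaultdict

# (MET2013, TRANWORK, TRANTIME, PERWT) records; all values are made up.
records = [
    (1, 10, 20, 100),
    (1, 10, 40, 50),
    (1, 10, 0, 70),   # assumed "not applicable" commute time, dropped below
    (1, 30, 60, 30),
    (1, 30, 45, 30),
    (2, 10, 15, 80),
]

def weighted_median(pairs):
    """Return the smallest value at which cumulative weight reaches half the total."""
    pairs = sorted(pairs)
    total = sum(weight for _, weight in pairs)
    running = 0
    for value, weight in pairs:
        running += weight
        if running >= total / 2:
            return value

# Collect (minutes, weight) pairs for each metro/mode combination.
groups = defaultdict(list)
for metro, mode, minutes, weight in records:
    if minutes > 0:
        groups[(metro, mode)].append((minutes, weight))

medians = {key: weighted_median(values) for key, values in groups.items()}
for key in sorted(medians):
    print(key, medians[key])
```

Weighting matters here: each survey respondent stands in for a different number of people, so unweighted medians describe the sample, not the population.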

6 of 14

Distribution of Commute Times by Metro Area and Mode of Transportation, in Minutes

7 of 14

Distribution of Commute Times in Las Vegas, NV by Racial Self-ID and Mode of Transportation, in Minutes

8 of 14

Distribution of Commute Times in Durham - Chapel Hill, NC by Racial Self-ID and Mode of Transportation, in Minutes

9 of 14

Distribution of Commute Times in Portland, OR by Racial Self-ID and College Degree Attainment, in Minutes

10 of 14

Trend in Mean Commute Times by College Degree Attainment, in Minutes
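A trend like this can be computed as a weighted mean per survey year and degree group. The Python sketch below is one way to do it, not the code behind the figure. It assumes a YEAR column and the person weight PERWT are in the extract, that a TRANTIME of 0 means no commute, and that an EDUC code of 10 or higher approximates "four or more years of college"; the cutoff and the records are illustrative.

```python
from collections import defaultdict

COLLEGE_MIN_EDUC = 10  # assumption: EDUC >= 10 approximates a college degree

# (YEAR, EDUC, TRANTIME, PERWT) records; all values are made up.
records = [
    (2010, 6, 20, 100),
    (2010, 6, 30, 100),
    (2010, 10, 30, 50),
    (2018, 6, 25, 100),
    (2018, 11, 40, 50),
    (2018, 10, 30, 150),
]

def mean_commute_by_year_and_degree(records):
    """Weighted mean commute time keyed by (year, has_college_degree)."""
    sums = defaultdict(lambda: [0.0, 0.0])  # [weighted minutes, total weight]
    for year, educ, minutes, weight in records:
        if minutes <= 0:
            continue  # skip people without a commute
        key = (year, educ >= COLLEGE_MIN_EDUC)
        sums[key][0] += minutes * weight
        sums[key][1] += weight
    return {key: total / weight for key, (total, weight) in sums.items()}

trend = mean_commute_by_year_and_degree(records)
for key in sorted(trend):
    print(key, trend[key])
```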

11 of 14

You can do fancier things too

12 of 14

Share of Children Earning More Than Their Parents by Race

13 of 14

Data Science Principles to Keep in Mind

Make it simpler

Always comment extensively

Audit each manipulation

Randomize

I recommend RStudio for running R code to generate the figures

And socviz.co for an intro to visualization in R
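One way to "audit each manipulation", sketched in Python: after recoding a variable, cross-tabulate the old values against the new ones and confirm no rows went missing. The EDUC codes and the cutoff of 10 for "college" are illustrative assumptions, and `recode_college` is a hypothetical helper.

```python
from collections import Counter

# Made-up EDUC codes standing in for one column of an extract.
educ_values = [6, 6, 7, 10, 11, 10, 0]

def recode_college(educ):
    """Collapse detailed EDUC codes into two groups (assumed cutoff: 10)."""
    return "college" if educ >= 10 else "no college"

recoded = [recode_college(educ) for educ in educ_values]

# Audit 1: the recode must not add or drop rows.
assert len(recoded) == len(educ_values)

# Audit 2: cross-tabulate old against new values and read the table.
crosstab = Counter(zip(educ_values, recoded))
for (old, new), count in sorted(crosstab.items()):
    print(old, new, count)
```

Reading the crosstab line by line catches the usual mistakes: an off-by-one cutoff, or a missing-value code silently landing in a real category.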

14 of 14

Questions?

hell @ codeforamerica.org