Rethinking the way we think about percentages
SRCCON 2019
Amelia McNamara
University of St. Thomas
Department of Computer and Information Sciences
PhD, statistics, UCLA
@AmeliaMN
Ryan Menezes
Los Angeles Times
Data journalist
B.S., statistics, UCLA
@ryanvmenezes
What kind of math do you apply to your work?
Let’s discuss
A home is crowded if it houses more than 1 person per room
Eight homes, all crowded
22 homes, all crowded
16,000 homes, 32% of them crowded
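To see why the three examples above are not equally trustworthy, here is a minimal sketch that puts a Wilson score interval around each crowding rate. The function name `wilson_interval` and the count of 5,120 crowded homes (32% of 16,000, assumed exact) are illustrative choices, not something from the slides.

```python
import math

def wilson_interval(crowded, homes, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if homes <= 0:
        raise ValueError("homes must be positive")
    p_hat = crowded / homes
    denom = 1 + z**2 / homes
    center = (p_hat + z**2 / (2 * homes)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / homes + z**2 / (4 * homes**2)) / denom
    )
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)

# (crowded homes, total homes); 5120 assumes the 32% figure is exact.
examples = {"8 homes": (8, 8), "22 homes": (22, 22), "16,000 homes": (5120, 16000)}
for label, (crowded, homes) in examples.items():
    low, high = wilson_interval(crowded, homes)
    print(f"{label}: {crowded / homes:.0%} crowded, "
          f"plausible range {low:.0%} to {high:.0%}")
```

The two tiny ZIPs report 100% crowding but carry wide intervals, while the 16,000-home ZIP is pinned down to within about a percentage point on either side.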
What could we do to ensure we find meaningful data points?
Let’s discuss
Finding more meaningful data points
Consider the denominator!
[Chart, built up over three slides: crowding rate in a ZIP, plotted against the number of homes in the ZIP (the denominator of p-hat), with the national crowding rate (3%) as a reference line]
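A chart like this is often read with "funnel" limits: the range of rates you would expect from chance alone if a ZIP really had the national rate. The sketch below assumes a normal approximation and a 95% band; `funnel_limits` is a hypothetical helper, and the slides do not say which limits, if any, were drawn.

```python
import math

NATIONAL_RATE = 0.03  # national crowding rate from the slides

def funnel_limits(homes, p0=NATIONAL_RATE, z=1.96):
    """Range of observed rates consistent with p0 for a ZIP of this size."""
    se = math.sqrt(p0 * (1 - p0) / homes)  # standard error shrinks with homes
    return max(0.0, p0 - z * se), min(1.0, p0 + z * se)

for homes in (8, 22, 1_000, 16_000):
    low, high = funnel_limits(homes)
    print(f"{homes:>6} homes: {low:.1%} to {high:.1%}")
```

The band is wide for small denominators and narrows toward 3% as the number of homes grows, which is what gives the plot its funnel shape.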
Most acute crowding: 90011 (22,000 homes, 42% crowded)
90006: 18,000 homes, 43% crowded
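One way to rank ZIPs so that the denominator counts is a z-score against the national rate. This is an assumption for illustration: the slides do not state which statistic put 90011 ahead of 90006, and the "8-home ZIP" label is a stand-in for the small example earlier in the deck.

```python
import math

NATIONAL_RATE = 0.03  # national crowding rate from the slides

def z_score(rate, homes, p0=NATIONAL_RATE):
    """How many standard errors a ZIP's rate sits above the national rate."""
    se = math.sqrt(p0 * (1 - p0) / homes)
    return (rate - p0) / se

# (label, homes, crowding rate); figures as given on the slides
zips = [
    ("90011", 22_000, 0.42),
    ("90006", 18_000, 0.43),
    ("8-home ZIP", 8, 1.00),  # hypothetical label for the tiny example
]

ranked = sorted(zips, key=lambda row: z_score(row[2], row[1]), reverse=True)
for name, homes, rate in ranked:
    print(f"{name}: {rate:.0%} of {homes:,} homes, z = {z_score(rate, homes):.0f}")
```

Under this measure 90011 ranks above 90006 despite its slightly lower rate, because it has more homes, and the 100%-crowded 8-home ZIP falls to the bottom.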
How do you feel about this approach? Would it work for your audience?
Let’s discuss
Going further
The Most Dangerous Equation (Howard Wainer: Picturing the Uncertain World, 2009)
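For reference, the equation in Wainer's title is de Moivre's formula for the standard error of a mean. It is written here next to its analogue for a proportion such as a crowding rate, with n as the number of homes; the notation is a conventional choice rather than something taken from the slides.

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
\qquad \text{and, for a proportion,} \qquad
\operatorname{SE}(\hat{p}) = \sqrt{\frac{p\,(1-p)}{n}}
```

Both shrink with the square root of n, so the smallest places produce the most extreme rates in either direction.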
All Maps of Parameter Estimates are Wrong (Gelman and Price, 1999)
"Unfortunately, multiply imputed maps are not suitable for presenting final results (estimated cancer rates, mean radon concentrations, etc.) to most audiences, who would likely just be confused by them. Furthermore, maps really do make convenient look-up tables (what is the cancer rate, or mean radon level, in my county?)."
Bayesian Surprise Maps
How spatial polygons shape our world - Amelia McNamara https://www.youtube.com/watch?v=wn5larsRHro
How do we make this simpler?
How to Improve Bayesian Reasoning Without Instruction: Frequency Formats (Gigerenzer and Hoffrage, 1995)
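As a small illustration of the frequency-format idea, a percentage can be restated as a count out of a round number of homes. The helper `as_frequency` and its wording are hypothetical; this is a phrasing aid, not the method from the paper.

```python
def as_frequency(rate, noun="homes", base=100):
    """Phrase a proportion as 'about k in every <base> <noun>'."""
    if not 0 <= rate <= 1:
        raise ValueError("rate must be between 0 and 1")
    count = round(rate * base)
    return f"about {count} in every {base} {noun}"

print(as_frequency(0.32))  # the 16,000-home ZIP from the slides
print(as_frequency(0.03))  # the national crowding rate
```

Stating the denominator in the sentence itself keeps the reader aware of what the rate is a rate of.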
How do we explain/implement this?
Let’s discuss