PLSC 30500: Introduction to quantitative social science

Autumn term 2024, University of Chicago

This version: 9/23/2024

Instructors:

Professor Andy Eggers (aeggers@uchicago.edu)

Teaching assistant: John Kainer (jckainer@uchicago.edu)

Class meetings:

Lecture meets 10:30-12:20 Monday and Wednesday (Pick Hall 319).
Lab meets Fridays 10:30-11:20 am (Cobb 430) and 12:30-1:20 pm (SSRB 107).

Office hours:
Andy’s office hours (Zoom or in person) are Wednesdays 3-4:45. Reserve fifteen-minute slots (at least 1.5 hours in advance) at https://calendly.com/andyeggers/office-hours. If you want to meet outside of these times, please get in touch.

Logistics

Course materials are posted in the course GitHub repository at https://github.com/UChicago-pol-methods/IntroQSS-F24. We will update the materials throughout the quarter.
We will use Canvas only for homework submissions (due at 10 am on Mondays) and some readings.
We will manage questions about the course through a private course Stack Overflow team: https://stackoverflowteams.com/c/uchicagopolmeth. We encourage you to make your questions public, as asking and answering questions will be part of your participation grade for the class. If you are asking a question about R code, try to provide a minimal working example to help others understand your problem.
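
As a rough illustration of what a minimal working example might look like, here is a sketch in R. The data frame, variable names, and the problem itself are made up for this example; the point is only that the code runs on its own, uses a tiny dataset, and states what was expected and what happened instead.

```r
# Minimal working example (hypothetical problem): a group mean comes out as NA.
# A tiny made-up data frame that reproduces the issue without any external files.
dat <- data.frame(
  group = c("a", "a", "b", "b"),
  score = c(1, 2, NA, 4)
)

# What I ran: the mean score within each group
group_means <- tapply(dat$score, dat$group, mean)
print(group_means)

# What I expected: a mean of 4 for group "b", ignoring the missing value.
# What I got instead: NA for group "b".

# Reporting the R version helps others match your setup.
print(R.version.string)
```

A post like this lets someone else paste the code, see the same result, and suggest a fix (here, probably passing na.rm = TRUE to mean).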

Course description

This course introduces skills and concepts that will help students understand and produce quantitative social science research.

On completing this course, students should:

understand basic foundations of probability and statistics that arise in common forms of statistical inference
understand challenges and common approaches to the problem of estimation
understand what statistical inference means and common approaches to producing measures of uncertainty
understand how simulation can be used to study the properties of estimators
have a foundation for using R for data science problems

This is the first course in the political science department’s quantitative methods sequence. (It is followed by Causal Inference in the winter and Linear Models in the spring.) Later courses in the sequence build on what we teach, and we will avoid spending much time on topics that we know will be covered adequately in those courses. We hope that this course will inspire students to continue in the sequence. That said, we aim for the course to be useful and enjoyable to students who take no further methods courses or who take methods courses outside of our sequence.

Course philosophy

Researchers who do quantitative social science typically need a mix of different skills, including some combination of substantive expertise (i.e. knowledge of the subject of study), knowledge of statistics, programming ability, and creativity. Assembling the skills you need takes time. One nine-week course is not enough.

Our main goal is to give you a strong foundation in the theory of statistics so that you understand the core ideas of estimation, inference, and hypothesis testing. This requires a grounding in probability theory. Our aim is that, by the end of the course, you see how probability problems (e.g. involving drawing balls from an urn) are related to statements we might make about a dataset (and the real world).

We will approach these concepts in three modes:

Mathematics (symbolic representations; theorems and proofs)
Simulations (observing regular features of random processes with “made-up data”)
Applications (analysis of real data)
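
To give a flavor of the simulation mode, here is a minimal sketch in R that connects an urn problem to the behavior of an estimator. The urn contents, sample size, number of repetitions, and seed are all assumptions chosen for illustration, not course materials. The sketch repeatedly draws balls from the urn, computes the sample share of red balls each time, and compares the simulated results with what probability theory predicts.

```r
set.seed(30500)  # arbitrary seed so the results are reproducible

# Hypothetical urn: 30 red and 70 blue balls, so the true share of red is 0.3
urn <- c(rep("red", 30), rep("blue", 70))
true_share <- mean(urn == "red")

# One "study": draw n balls with replacement and estimate the share of red
draw_estimate <- function(n) {
  balls <- sample(urn, size = n, replace = TRUE)
  mean(balls == "red")
}

# Repeat the study many times to approximate the sampling distribution
n <- 50
estimates <- replicate(5000, draw_estimate(n))

# Compare the simulation with theory
results <- c(
  mean_of_estimates = mean(estimates),   # should be close to true_share
  sd_of_estimates   = sd(estimates),     # should be close to the value below
  theoretical_se    = sqrt(true_share * (1 - true_share) / n)
)
print(round(results, 3))
```

The average of the estimates should land near the true share, and their standard deviation should land near the theoretical standard error. Changing n and rerunning shows how the spread of the estimator shrinks as the sample grows.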

While we develop your understanding of probability and how it relates to estimation and inference, we will also develop your abilities as programmers.

This course will also touch in some way on much of what you will study in future courses, so it is partly a “taster” that we hope will excite you to go further.

Computing

Students will work with data using the R statistical environment.

We will establish foundations in base R and also expose you to some key elements of the “tidyverse” approach to programming (such as ggplot2). But this is not primarily a programming course. There are other courses where you will learn more about programming and less about probability and statistics.

You will need access to a laptop to use in and out of class. Please let us know if that is an obstacle.

We recommend working in RStudio on your own laptop. You may also use RStudio Cloud (now called Posit Cloud), which is available for free online. If you don’t like RStudio, you may also use the command line to your heart’s content.

Prerequisites

Students are required to have some previous exposure to R programming and probability, as is commonly provided in one of the university’s math camps. Background in calculus is useful for some parts of the course, but will not be important for our assignments.

Materials for the 2024 math prefresher camp for political science PhD students are available here:

https://github.com/UChicago-pol-methods/polisci-math-prefresher

If you did not attend, it is probably worth reviewing these materials regardless of your background.

If you don’t have much experience with programming or think you may struggle with that aspect of the course, we recommend spending time beforehand on some of the many excellent tutorials on R.

References and Teaching Materials

We will rely mainly on Foundations of Agnostic Statistics (Aronow & Miller, 2019, CUP). Many students find this book challenging, in part because its explanations tend to be brief and it provides few examples. We will augment their approach in lectures and labs.

Aronow & Miller is available online through the library. We will use this enough that, if you get any benefit from working with a paper book, we recommend buying it.

For a different view of many of the same topics, we recommend Introduction to Probability (Blitzstein & Hwang, 2019, Taylor & Francis). It contains more words, more examples, and more visualizations to explain many of the same concepts. Like Aronow & Miller, it is available online through the library.

Assessments

Weekly problem sets 40%

Class participation 10%

Class participation will include both in-class participation and the submission of questions and answers on the class Stack Overflow.

Midterm in-class exam 20% (October 23)

Final take-home exam 30% (due Dec 10)

Collaboration and academic integrity

We encourage you to use any available resources, including classmates and large language models, to understand the material in the course. (And we hope you will ask and answer questions on our private Stack Overflow site.) But you should make sure that you are eventually able to do everything yourself.

You can collaborate with classmates on problem sets, but you should do the write-up yourself: write your own code and write your own responses to the questions. For simple coding questions, we expect many students to have similar answers. But you won’t learn to code unless you write code yourself, and you won’t learn to think and write about data analysis unless you think and write for yourself.

You may not collaborate on the final take-home exam.

Familiarize yourself with the university’s policies on academic dishonesty and plagiarism, e.g. https://studentmanual.uchicago.edu/academic-policies/academic-honesty-plagiarism/. The key idea is that you should give credit to others when you use their language and findings. If you commit plagiarism, there could be serious consequences, including failing the course and being asked to leave the university.

Accommodations

Please reach out to the instructors directly if you would like to request accommodations for the course to better facilitate your learning. Student Disability Services (disabilities.uchicago.edu) is also available to provide you resources and support, and may provide approval for specific academic accommodations. Informing us in a timely manner will help us to ensure accommodations are met and we are able to implement an appropriate assessment of your learning.

Schedule

Lectures 1.1, 1.2, 2.1, 2.2 (Sept 30, Oct 2, Oct 7, Oct 9): Probability (Aronow & Miller chapter 1)

Lectures 3.1, 3.2, 4.1, 4.2 (Oct 14, Oct 16, Oct 21, Oct 23): Summarizing distributions (Aronow & Miller chapter 2)

Midterm: Oct 23 (before Lecture 4.2)

Lectures 5.1, 5.2, 6.1, 6.2 (Oct 28, Oct 30, Nov 4, Nov 6): Estimation (Aronow & Miller chapter 3.1-3.3)

Lectures 7.1, 7.2 (Nov 11, Nov 13): Inference (Aronow & Miller chapter 3.4)

Lectures 8.1, 8.2, 9.1, 9.2 (Nov 18, Nov 20, Dec 2, Dec 4): Regression (Aronow & Miller chapter 4)

Take-home final exam due 9pm Tuesday Dec 10