1 of 43

Lecture 16: Causal inference

case studies II

Sharad Goel

Stanford University

2 of 43

Estimating causal effects from observational data

Response surface modeling
Propensity score matching
Interrupted time series
Difference-in-differences

3 of 43

Optimizing bail decisions
Jung, Concannon, Shroff, Goel, Goldstein, 2017

Judges determine whom to release and whom to detain pending trial, balancing flight risk against the burden of bail on individuals.

Goal is to design more equitable release policies.

4 of 43

The data

150,000 cases for non-violent offenses from a large urban prosecutor’s office.

For each instance, we know the judge’s decision [release or “detain”], the outcome [appear or fail-to-appear], and detailed case information.

7 of 43

Average treatment effect

What is the effect of treatment [ being detained ] on failure to appear rates?

The failure-to-appear [ FTA ] rate is 9% for those who are detained, and 15% for those who are released.

Selection bias
Those defendants who were detained are different from those who were released. [ Detained defendants are likely riskier. ]

8 of 43

Response surface modeling

Estimate potential outcomes by fitting a model to predict responses as a function of covariates and treatment.

9 of 43

Response surface modeling

Y is the outcome [ appear or fail to appear ]
X is the vector of covariates [ case details, etc. ]
Z is the treatment [ detain or release ]

11 of 43

Response surface modeling: The ignorability assumption

The ignorability assumption means that given the observed covariates X, the potential outcomes are independent of the treatment Z.

In this case, the treatment is ignorable and we don’t have to worry about selection bias.
[ We revisit this assumption shortly. ]
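Stated formally, as a sketch in standard potential-outcomes notation: the symbols Y(0), Y(1), and the name f for the response surface are assumptions here, since the slide's own equations did not survive extraction. Z=1 denotes detention and Z=0 release.

```latex
\bigl(Y(0),\, Y(1)\bigr) \;\perp\; Z \;\mid\; X
\quad\Longrightarrow\quad
\mathbb{E}\bigl[\,Y(z) \mid X = x\,\bigr]
  \;=\; \mathbb{E}\bigl[\,Y \mid X = x,\ Z = z\,\bigr]
  \;=:\; f(x, z)
```

Under this assumption, the potential outcome for treatment z can be estimated from the cases that actually received z.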

12 of 43

Response surface modeling: Estimating f

Model for Z=0 is fit on all cases in which the judge released the defendant.

Model for Z=1 is fit on all cases in which the judge detained the defendant.
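A minimal sketch of the two-model strategy, assuming a single discrete covariate and using per-stratum FTA rates as a stand-in for a real regression model. The function name fit_rate_model, the covariate values, and the toy counts are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def fit_rate_model(cases, treatment):
    """Fit a toy response surface for one treatment arm.

    cases: list of (x, z, y) tuples, where x is a hashable covariate value,
    z is 1 if detained and 0 if released, and y is 1 for failure to appear.
    Only cases with z == treatment are used for fitting.
    Returns a function mapping x to an estimated FTA probability.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for x, z, y in cases:
        if z == treatment:
            totals[x] += 1
            failures[x] += y
    arm_size = sum(totals.values())
    if arm_size == 0:
        raise ValueError("no cases with this treatment value")
    arm_rate = sum(failures.values()) / arm_size

    def predict(x):
        # Fall back to the arm-wide rate for covariate values never seen.
        if totals.get(x, 0) == 0:
            return arm_rate
        return failures[x] / totals[x]

    return predict


# Hypothetical toy data: covariate is "none" or "some" prior missed court dates.
cases = (
    [("none", 0, 0)] * 9 + [("none", 0, 1)] * 1    # released, no priors
    + [("some", 0, 0)] * 3 + [("some", 0, 1)] * 1  # released, some priors
    + [("none", 1, 0)] * 2                         # detained, no priors
    + [("some", 1, 0)] * 9 + [("some", 1, 1)] * 1  # detained, some priors
)

f_released = fit_rate_model(cases, treatment=0)  # model for Z=0
f_detained = fit_rate_model(cases, treatment=1)  # model for Z=1
```

Each model sees only its own arm of the data, so f_released("some") is an estimate of what would happen to a defendant with some priors if released, regardless of what the judge actually decided.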

13 of 43

Response surface modeling: Estimating f via more complex models

We can use more complex ML models [ regularized regression, random forest, etc. ] to estimate outcomes, but the basic strategy is the same.

16 of 43

Response surface modeling: Result

On the subset of people who were in reality detained, use the model to estimate Y_i(released).

On this subset, the counterfactual FTA rate is 17%.
[ On the set of people released, observed FTA is 15%. ]

The average treatment effect on the treated [ ATT ] is
9% - 17% = -8 percentage points
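The ATT arithmetic can be sketched as follows: average the observed outcomes of the detained, average the model's predictions of Y_i(released) for the same people, and subtract. The function name att_estimate and the toy inputs are hypothetical and are not the study's data.

```python
def att_estimate(detained_outcomes, predicted_if_released):
    """Estimate the average treatment effect on the treated.

    detained_outcomes: observed 0/1 FTA outcomes for detained defendants.
    predicted_if_released: model estimates of Pr(FTA) had each of those
    same defendants been released, in the same order.
    """
    n = len(detained_outcomes)
    if n == 0 or n != len(predicted_if_released):
        raise ValueError("inputs must be non-empty and of equal length")
    observed_rate = sum(detained_outcomes) / n
    counterfactual_rate = sum(predicted_if_released) / n
    return observed_rate - counterfactual_rate


# Hypothetical toy inputs for ten detained defendants.
observed = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [0.3, 0.1, 0.2, 0.2, 0.1, 0.3, 0.2, 0.2, 0.25, 0.15]
att = att_estimate(observed, predicted)
```

A negative value means that detention is estimated to lower the FTA rate among those who were detained.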

17 of 43

Response surface modeling: Result

On the subset of people who were in reality released, use the model to estimate Y_i(detained).

Using the model and the observed outcomes, we have estimates of both Y_i(detained) and Y_i(released) for each defendant, and can then compute the ATE.
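One way to write this down, as a sketch: the hat notation and the symbol \hat{f} for the fitted response surface are assumptions, with z=1 for detained and z=0 for released.

```latex
\hat{Y}_i(z) =
\begin{cases}
  Y_i              & \text{if } Z_i = z \quad \text{(observed outcome)} \\
  \hat{f}(X_i, z)  & \text{otherwise} \quad \text{(model estimate)}
\end{cases}
\qquad
\widehat{\mathrm{ATE}} = \frac{1}{n} \sum_{i=1}^{n}
  \bigl[\, \hat{Y}_i(1) - \hat{Y}_i(0) \,\bigr]
```

The ATT is the same average restricted to the defendants with Z_i = 1.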

19 of 43

Matching

Instead of modeling the response surface, one can attempt to find similarly situated individuals.

For each person detained, find a similar defendant who was released. Use this matched sample to estimate the unobserved potential outcomes.

We can then think of treatment as if it were randomly assigned within the matched sample.

22 of 43

Matching

How to find “similar” individuals?

Exact matching
Find individuals who share exactly the same observed traits.
[ age, sex, race, criminal history, … ]

Exact matching is typically impossible in high dimensions.
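A minimal sketch of exact matching without replacement, assuming each person is described by a tuple of discrete traits. The function name exact_match, the trait tuples (age, sex, number of prior arrests), and the ids are hypothetical.

```python
from collections import defaultdict


def exact_match(treated, controls):
    """Pair each treated unit with an unused control that has identical traits.

    treated, controls: lists of (unit_id, traits), with traits a hashable tuple.
    Returns (pairs, unmatched), where pairs is a list of
    (treated_id, control_id) and unmatched lists treated ids with no match.
    """
    pool = defaultdict(list)
    for control_id, traits in controls:
        pool[traits].append(control_id)

    pairs, unmatched = [], []
    for treated_id, traits in treated:
        candidates = pool.get(traits)
        if candidates:
            # Take the first available control and remove it from the pool.
            pairs.append((treated_id, candidates.pop(0)))
        else:
            unmatched.append(treated_id)
    return pairs, unmatched


# Hypothetical toy data: traits are (age, sex, prior arrests).
detained = [("d1", (25, "M", 0)), ("d2", (40, "F", 2)), ("d3", (31, "M", 5))]
released = [("r1", (25, "M", 0)), ("r2", (40, "F", 1)), ("r3", (25, "M", 0))]
pairs, unmatched = exact_match(detained, released)
```

Even with only three traits, two of the three detained defendants in this toy example have no exact counterpart, which illustrates why the approach breaks down as the number of traits grows.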

23 of 43

Propensity score matching

For each individual, estimate their likelihood of being treated based on all observed traits.
[ Estimate these propensity scores via a big model. ]

Match individuals on their propensities to be treated.
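A minimal sketch of the matching step, assuming the propensity scores have already been estimated by some model. It uses nearest-neighbor matching with replacement and a caliper. The function name match_on_propensity, the caliper value, and the toy scores are assumptions, and the slides do not say which matching variant was used.

```python
def match_on_propensity(treated, controls, caliper=0.05):
    """Match each treated unit to the control with the closest propensity score.

    treated, controls: lists of (unit_id, propensity_score).
    Matching is done with replacement, so one control may serve several
    treated units. Treated units whose best match is farther away than
    `caliper` are left unmatched.
    Returns a dict mapping treated_id to control_id.
    """
    matches = {}
    for treated_id, treated_score in treated:
        best_id, best_gap = None, None
        for control_id, control_score in controls:
            gap = abs(treated_score - control_score)
            if best_gap is None or gap < best_gap:
                best_id, best_gap = control_id, gap
        if best_id is not None and best_gap <= caliper:
            matches[treated_id] = best_id
    return matches


# Hypothetical propensity scores, i.e. estimated Pr(detained | traits).
detained = [("d1", 0.62), ("d2", 0.35), ("d3", 0.90)]
released = [("r1", 0.60), ("r2", 0.33), ("r3", 0.10), ("r4", 0.70)]
matches = match_on_propensity(detained, released)
```

Here d3 has no released defendant with a comparable score and is dropped, which is the usual price of enforcing a caliper: the estimate then applies only to the region where treated and untreated scores overlap.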

24 of 43

Propensity score matching: Full population

25 of 43

Propensity score matching: Matched population

26 of 43

Propensity score matching: Results

In the sample of released individuals matched to defendants who were detained, the FTA rate is 18%.
[ On the set of people released, observed FTA is 15%. ]
[ The response surface model yields an FTA rate of 17%. ]

27 of 43

Designing optimal policies: Beyond average treatment effects

29 of 43

Policy design strategy

Based on features of the case and the defendant’s criminal history, estimate the likelihood the defendant would miss trial if released.

For each defendant, estimate the potential outcome Y_i(released).

Detain the defendants deemed riskiest.
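The strategy amounts to ranking defendants by estimated risk and detaining the top of the list. A minimal sketch, assuming the risk estimates are already available. The function name detain_riskiest, the detention budget n_detain, and the toy risks are hypothetical.

```python
def detain_riskiest(risk_if_released, n_detain):
    """Return the ids of the n_detain defendants with the highest estimated risk.

    risk_if_released: dict mapping defendant id to the estimated probability
    of failing to appear if released, i.e. an estimate of Y_i(released).
    """
    if n_detain < 0:
        raise ValueError("n_detain must be non-negative")
    ranked = sorted(
        risk_if_released,
        key=lambda defendant_id: risk_if_released[defendant_id],
        reverse=True,
    )
    return set(ranked[:n_detain])


# Hypothetical risk estimates for four defendants.
risks = {"a": 0.05, "b": 0.40, "c": 0.22, "d": 0.31}
to_detain = detain_riskiest(risks, n_detain=2)
```

Varying n_detain traces out a family of policies, each trading off the number of people detained against the expected number of failures to appear.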

30 of 43

Evaluating detention policies

Number detained vs. number who fail to appear

Use the response surface model to estimate counterfactual outcomes when necessary.

If the policy suggests releasing an individual who was in reality detained, or detaining an individual who was in reality released, the model yields estimates of the unobserved potential outcomes.
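The evaluation rule just described can be sketched as follows: use the observed outcome whenever the policy agrees with the judge, and the model's estimate otherwise. The function name policy_fta_rate, the dictionary keys, and the toy cases are hypothetical.

```python
def policy_fta_rate(cases, policy_detains):
    """Estimate the FTA rate that a proposed detention policy would produce.

    cases: list of dicts with keys
        "id"                 defendant id
        "detained"           True if the judge detained the defendant
        "fta"                observed outcome, 1 = failed to appear
        "pred_fta_released"  model estimate of Pr(FTA) if released
        "pred_fta_detained"  model estimate of Pr(FTA) if detained
    policy_detains: set of ids that the policy would detain.
    """
    if not cases:
        raise ValueError("cases must be non-empty")
    total = 0.0
    for case in cases:
        policy_detain = case["id"] in policy_detains
        if policy_detain == case["detained"]:
            total += case["fta"]                # policy agrees: use observed
        elif policy_detain:
            total += case["pred_fta_detained"]  # counterfactual: detain
        else:
            total += case["pred_fta_released"]  # counterfactual: release
    return total / len(cases)


# Hypothetical toy cases; both the judge and the policy detain two people.
cases = [
    {"id": "a", "detained": False, "fta": 0,
     "pred_fta_released": 0.10, "pred_fta_detained": 0.05},
    {"id": "b", "detained": True, "fta": 0,
     "pred_fta_released": 0.40, "pred_fta_detained": 0.10},
    {"id": "c", "detained": True, "fta": 0,
     "pred_fta_released": 0.20, "pred_fta_detained": 0.08},
    {"id": "d", "detained": False, "fta": 1,
     "pred_fta_released": 0.30, "pred_fta_detained": 0.10},
]
estimated_rate = policy_fta_rate(cases, policy_detains={"b", "d"})
```

The estimate is only as trustworthy as the model's counterfactual predictions, which in turn rest on the ignorability assumption.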

31 of 43

Evaluating detention policies

34 of 43

30% fewer defendants can be detained without increasing overall flight risk.

35 of 43

Violating ignorability

Suppose we only observe a defendant’s age X, but judges observe many other factors.

Then, given two defendants of the same age, exactly one of whom is released, the released defendant is likely less risky than the detained one, violating ignorability.
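A small numeric illustration of this point, assuming that defendants of one age fall into two hidden risk groups that judges can see but the analyst cannot. Every number and name below is hypothetical and chosen only to show the direction of the bias.

```python
# Defendants of the same age, split by a factor only the judge observes.
hidden_groups = [
    # low hidden risk: mostly released
    {"n": 1000, "share_released": 0.9, "fta_if_released": 0.10},
    # high hidden risk: mostly detained
    {"n": 1000, "share_released": 0.2, "fta_if_released": 0.30},
]


def released_fta_rate(groups):
    """Expected FTA rate among released defendants, which is what the data show."""
    released = sum(g["n"] * g["share_released"] for g in groups)
    failures = sum(
        g["n"] * g["share_released"] * g["fta_if_released"] for g in groups
    )
    return failures / released


def detained_fta_rate_if_released(groups):
    """Expected FTA rate the detained would have had if released instead."""
    detained = sum(g["n"] * (1 - g["share_released"]) for g in groups)
    failures = sum(
        g["n"] * (1 - g["share_released"]) * g["fta_if_released"] for g in groups
    )
    return failures / detained


naive = released_fta_rate(hidden_groups)
truth = detained_fta_rate_if_released(hidden_groups)
```

Because the released group is dominated by low-risk defendants, naive (about 14%) understates truth (about 28%), so a model that conditions on age alone would be too optimistic about releasing those who were detained.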

36 of 43

Violating ignorability

When ignorability is violated, we can no longer estimate the potential outcome Y(t) from Pr(Y=1 | X, Z=t).

37 of 43

Violating ignorability

Ignorability is an assumption, and cannot be directly verified.

But we can examine the sensitivity of our conclusions to violations of ignorability.

38 of 43

Sensitivity analysis

Make assumptions about the magnitude and effect of selection, and then examine how estimates change.
[ Many different ways to do this. ]

40 of 43

Sensitivity analysis: Rosenbaum-Rubin sensitivity analysis

Assume there is a binary covariate u that affects both a judge’s decision [ detain or release ] and the outcome [ appear or fail to appear ] conditional on the decision.

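[ New ] ignorability assumption: given the observed covariates X and the unobserved covariate u, the potential outcomes are independent of the treatment Z.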

41 of 43

Sensitivity analysis: Rosenbaum-Rubin sensitivity analysis

Four parameters:

Probability that u = 1
The effect of u on the judge’s decision
The effect of u on the defendant’s likelihood of failing to appear if released on recognizance [ RoR’d ]
The effect of u on the defendant’s likelihood of failing to appear if detained
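A sketch of how such parameters can be turned into an adjusted estimate within a single covariate stratum. It assumes a logistic parameterization in the spirit of Rosenbaum and Rubin: Pr(Z=1 | u) = sigmoid(gamma + alpha*u) and Pr(Y(released)=1 | u) = sigmoid(beta + delta*u), with u ~ Bernoulli(q). The slides' own equations did not survive, so this parameterization, the function names, and the example inputs are all assumptions. Only three of the four parameters appear, because the sketch adjusts only the release counterfactual.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def solve_intercept(target, weight_u1, shift, lo=-30.0, hi=30.0):
    """Bisection for b with (1-w)*sigmoid(b) + w*sigmoid(b+shift) == target."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        value = (1 - weight_u1) * sigmoid(mid) + weight_u1 * sigmoid(mid + shift)
        if value < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def adjusted_fta_if_released(p_detain, fta_released_obs, q, alpha, delta):
    """Estimate Pr(FTA if released) for the detained, allowing for hidden u.

    p_detain:         observed Pr(Z=1) in the stratum
    fta_released_obs: observed FTA rate among the released in the stratum
    q:                assumed Pr(u = 1)
    alpha:            assumed effect of u on the log-odds of detention
    delta:            assumed effect of u on the log-odds of FTA if released
    """
    # Treatment model: choose gamma so the marginal detention rate matches.
    gamma = solve_intercept(p_detain, q, alpha)
    p_detain_u1 = sigmoid(gamma + alpha)
    # Bayes' rule: share of u = 1 among the released and among the detained.
    w_released = q * (1 - p_detain_u1) / (1 - p_detain)
    w_detained = q * p_detain_u1 / p_detain
    # Outcome model: choose beta so the released group's FTA rate matches.
    beta = solve_intercept(fta_released_obs, w_released, delta)
    # Reweight the outcome model by the detained group's mix of u.
    return (1 - w_detained) * sigmoid(beta) + w_detained * sigmoid(beta + delta)


# Hypothetical inputs: with alpha = 0 there is no selection on u.
no_selection = adjusted_fta_if_released(0.3, 0.15, q=0.5, alpha=0.0, delta=1.0)
with_selection = adjusted_fta_if_released(0.3, 0.15, q=0.5, alpha=1.0, delta=1.0)
```

With no selection on u, the adjusted estimate equals the observed rate among the released. When u raises both the chance of detention and the chance of failing to appear, the adjusted estimate rises above it. Sweeping the parameters over a plausible range shows how far conclusions could move.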

42 of 43

Sensitivity analysis: Rosenbaum-Rubin sensitivity analysis

43 of 43

Decision-making

Risk assessment tools estimate risk. They don’t dictate policy.

Rather than detaining individuals deemed “high” risk of flight, one might attempt to improve appearance rates via automated court reminders or by providing transportation.
[ We’re rolling out an open-source platform to do this. ]
