1 of 41

Week 10:

Quasi-Experimental Designs

Program Evaluation - PADP 8640 - Spring 2016

2 of 41

Outline

  • Bias and Comparison Groups
  • Natural Experiments
  • Difference-in-Differences
  • Regression Discontinuity Designs

3 of 41

How do we know a program “works”?

  • We are still asking how to measure accurately whether a policy achieved anything

  • Since most policies and interventions are not randomly assigned, we keep running into issues of bias…

4 of 41

How do we know a program “works”: Selection Bias

  • Selection bias: some individuals are more likely to be part of a policy or program than others

Results are potentially driven by WHO is in which group, not by WHICH group someone is in.

Internet is the king of selection bias

  • Online polls
  • Dads on vacation

5 of 41

How do we know a program “works”: Sampling bias

  • Are certain types of people more or less likely to be included in our sample?
  • If so, study is not externally valid

Examples:

  • using NBA/WNBA players to estimate average height of US population
  • Comparing graduates of an elite college to all other college graduates

6 of 41

All about comparisons...

Selection bias and sampling bias largely about our ability to compare groups

  • Are treatment and control group equal in expectation?
  • Are treatment and control group even relevant to one another?
  • Bias means that we cannot account for the way(s) in which treatment and control groups are different
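
A minimal simulation sketch of the comparison problem, under made-up assumptions: a hypothetical "motivation" trait raises outcomes and also makes people more likely to enroll, and the true program effect is set to 2.0. None of these numbers come from a real program.

```python
import random

random.seed(1)
TRUE_EFFECT = 2.0   # assumed program effect, chosen for illustration
N = 5000

def outcome(treated, motivation):
    # Motivation raises the outcome whether or not the person is treated.
    return TRUE_EFFECT * treated + 3.0 * motivation + random.gauss(0, 1)

def mean(values):
    return sum(values) / len(values)

def treated_minus_control(assign):
    """Difference in mean outcomes under a given assignment rule."""
    treated, control = [], []
    for _ in range(N):
        motivation = random.gauss(0, 1)
        t = assign(motivation)
        (treated if t else control).append(outcome(t, motivation))
    return mean(treated) - mean(control)

# Self-selection: more motivated people are more likely to enroll.
naive = treated_minus_control(lambda m: m + random.gauss(0, 1) > 0)
# Random assignment: a coin flip, unrelated to motivation.
randomized = treated_minus_control(lambda m: random.random() < 0.5)

print(f"self-selected: {naive:.2f}  randomized: {randomized:.2f}  truth: {TRUE_EFFECT}")
```

Under self-selection the two groups differ in motivation as well as treatment, so the raw gap overstates the effect. Under random assignment the groups are equal in expectation and the gap lands near the assumed effect.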

7 of 41

Quasi-Experimental Designs

  • Today, we are primarily focusing on attempts to identify and/or make a suitable control group
  • This can be a really complex exercise
  • What would be a good control group for:
    • Economic impact of Braves moving to Cobb County?
    • Assessing whether a Starbucks gold card is good for your health?
    • The economic benefit of an MPA degree?

8 of 41

Natural Experiments

  • What is the appeal of an experiment for evaluation?
  • If we cannot manipulate/assign treatment, what features of a program or policy might help approximate an experiment?

9 of 41

Natural Experiments

  • In a “natural experiment,” treatment…
    • Is exogenous
    • Is random or forced (not chosen by participants)
    • Creates groups that are “equal in expectation”
  • What does this mean?

10 of 41

Natural Experiments

  • In a “natural experiment,” treatment…
    • Is exogenous
    • Is random or forced (not chosen by participants)
    • Creates groups that are “equal in expectation”

Sources of exogenous variation:

  • Geography
  • Time
  • Participation threshold

11 of 41

Example: Students on one side or another of attendance boundary

12 of 41

Example: Changes in benefit program over time

13 of 41

Example: Eligibility Cutoff

14 of 41

Natural Experiments

  • The basic premise of a “natural experiment” is that some external factor has addressed some concern about self-selection

15 of 41

Natural Experiments

  • Potential concerns:
    • Secular trends
    • Participant actions (how “arguably exogenous” is treatment?)
    • Bandwidth choices

16 of 41

Difference-in-Differences

  • In “diff-in-diff” design, an exogenous treatment creates four distinct groups

                        Time
Policy/program     Before     After
Affected           Group A    Group C
Not affected       Group B    Group D

17 of 41

Difference-in-Differences: Example

  • Dynarski (2003)
  • Social Security Student Benefit
  • Program: Cash payment to students with deceased fathers for college tuition support
  • Exogenous intervention: Congress abruptly ended the program in 1981

18 of 41

Difference-in-Differences: Example

  • Exogenous intervention: Congress abruptly ended the program in 1981

Deceased father    Before 1981    After 1981
Yes                Group A        Group C
No                 Group B        Group D

19 of 41

Difference-in-Differences: Example

  • Exogenous intervention: Congress abruptly ended the program in 1981

Deceased father    Before 1981    After 1981
Yes                Group A        Group C
No                 Group B        Group D

First difference: Group A - Group C (before vs. after, father deceased)

20 of 41

Difference-in-Differences: Example

  • Exogenous intervention: Congress abruptly ended the program in 1981

Deceased father    Before 1981    After 1981
Yes                Group A        Group C
No                 Group B        Group D

Second difference: Group B - Group D (before vs. after, father not deceased)

21 of 41

Difference-in-Differences: Example

First difference - second difference = difference of differences
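
A small sketch of that subtraction, using hypothetical group means (illustrative only, not Dynarski's estimates). Because the program ended in 1981, "before" is the treated period here, so each difference is taken as before minus after.

```python
def diff_in_diff(before_affected, after_affected, before_unaffected, after_unaffected):
    """Return (first difference, second difference, difference of differences)."""
    first = before_affected - after_affected          # change within the affected group
    second = before_unaffected - after_unaffected     # change within the unaffected group
    return first, second, first - second

# Hypothetical group means for Groups A, C, B, D (made up for illustration).
first, second, did = diff_in_diff(0.60, 0.40, 0.50, 0.45)
print(round(first, 3), round(second, 3), round(did, 3))
```

The second difference stands in for the secular trend: whatever changed for everyone between the two periods is subtracted out of the affected group's change.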

23 of 41

Difference-in-Differences: Example

Deceased father    Before 1981    After 1981
Yes                Group A        Group C
No                 Group B        Group D

24 of 41

Difference-in-Differences: Good News!

  • Difference-in-differences model uses simple regression strategy
  • From social security benefits example:

COLL_i = β_0 + β_1 BEFORE_i + β_2 FATHERDEC_i + β_3 (BEFORE_i x FATHERDEC_i) + ε_i

Deceased father    Before 1981    After 1981
Yes                Group A        Group C
No                 Group B        Group D
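
A minimal sketch of why the regression works, on simulated data with made-up attendance rates (not the study's data). The `ols` helper is a hypothetical stand-in written for this example. With two dummies and their interaction, the interaction coefficient β_3 equals the difference of differences in the four cell means.

```python
import random

def ols(X, y):
    """Least squares via the normal equations (X'X)b = X'y, solved by
    Gauss-Jordan elimination. Adequate for tiny, well-conditioned designs."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(0)
# Hypothetical attendance rates per cell: (BEFORE, FATHERDEC) -> rate.
true_rate = {(1, 1): 0.55, (0, 1): 0.35, (1, 0): 0.50, (0, 0): 0.45}
rows, coll = [], []
for (before, fatherdec), rate in true_rate.items():
    for _ in range(500):
        rows.append([1, before, fatherdec, before * fatherdec])
        coll.append(1 if random.random() < rate else 0)

b0, b1, b2, b3 = ols(rows, coll)

def cell_mean(before, fatherdec):
    ys = [y for r, y in zip(rows, coll) if r[1] == before and r[2] == fatherdec]
    return sum(ys) / len(ys)

did = (cell_mean(1, 1) - cell_mean(0, 1)) - (cell_mean(1, 0) - cell_mean(0, 0))
print(round(b3, 4), round(did, 4))  # the interaction coefficient matches the DiD
```

In practice a statistics package would also supply standard errors. The hand-rolled solver is only there to keep the sketch self-contained.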

27 of 41

Difference-in-Differences: Example Results

H0: β = 0

                      Estimate    S.E.     t        p
Intercept             0.476***    0.019    25.22    0.000
BEFORE                0.026       0.021     1.22    0.111
FATHERDEC            -0.123       0.083    -1.48    0.070
BEFORE x FATHERDEC    0.182*      0.096     1.90    0.029

COLL_i = β_0 + β_1 BEFORE_i + β_2 FATHERDEC_i + β_3 (BEFORE_i x FATHERDEC_i) + ε_i

28 of 41

Difference-in-Differences: Example Results

                      Estimate    S.E.     t        p
Intercept             0.476***    0.019    25.22    0.000
BEFORE                0.026       0.021     1.22    0.111
FATHERDEC            -0.123       0.083    -1.48    0.070
BEFORE x FATHERDEC    0.182*      0.096     1.90    0.029

What is first difference? → y_{before, deceased} - y_{after, deceased}

What is second difference? → y_{before, non-deceased} - y_{after, non-deceased}

Difference-in-difference? → [y_{before, deceased} - y_{after, deceased}] - [y_{before, non-deceased} - y_{after, non-deceased}]
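
A worked sketch using only the four estimates in the table. It assumes BEFORE and FATHERDEC are 0/1 dummies, so the fitted value for each cell is a sum of coefficients. The function name `fitted` is just for illustration.

```python
# Coefficient estimates copied from the results table.
b0, b1, b2, b3 = 0.476, 0.026, -0.123, 0.182

def fitted(before, fatherdec):
    """Fitted college-attendance rate for one cell of the 2x2 table."""
    return b0 + b1 * before + b2 * fatherdec + b3 * before * fatherdec

change_deceased = fitted(1, 1) - fitted(0, 1)        # 0.561 - 0.353
change_non_deceased = fitted(1, 0) - fitted(0, 0)    # 0.502 - 0.476
did = change_deceased - change_non_deceased

print(round(change_deceased, 3), round(change_non_deceased, 3), round(did, 3))
```

The before/after change for the non-deceased group is just the BEFORE coefficient, and the difference between the two changes is the interaction coefficient.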

29 of 41

Difference-in-Differences (without natural experiment)

Difference-in-differences for interrupted time series data

  • While natural experiments are ideal, a diff-in-diff design can be applied to any program change with longitudinal data
  • Diff 1: Pre/post program change for program participants
  • Diff 2: Pre/post observations for non-participants (to capture secular trend)

  • Often, you’ll also want to include observable covariates.
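
One way to write this more general model, as a sketch. The names POST, PARTICIPANT, and the covariate vector X are assumptions for illustration, not taken from the slides.

```latex
Y_{it} = \beta_0 + \beta_1 \mathit{POST}_t + \beta_2 \mathit{PARTICIPANT}_i
       + \beta_3 (\mathit{POST}_t \times \mathit{PARTICIPANT}_i)
       + \gamma' X_{it} + \varepsilon_{it}
```

Here \beta_1 captures the secular trend shared by both groups, \beta_2 the baseline gap between participants and non-participants, and \beta_3 the difference-in-differences estimate after adjusting for X.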

30 of 41

Group Exercise

Group Exercise instructions

31 of 41

Regression Discontinuity Designs

  • Many programs and policies have eligibility cutoffs
    • Poverty related programs have income cutoffs
    • Eligibility for tax credits and expenditures
    • Test scores
    • Other ideas?

  • Are people on one side of the cutoff different from those on the other?
  • When? Why?

32 of 41

Regression Discontinuity Designs

  • Regression discontinuity designs use the arbitrariness of treatment cutoffs to assess causality
  • Basic assumption: people close to cutoff on either side are similar
  • Two main requirements
    • Continuous eligibility index
    • Clearly defined cutoff score
    • (sub-requirement: cutoff is enforced, measurements are not gamed)

33 of 41

Regression Discontinuity Designs

  • How would we fit a regression for this?
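
One possible answer, sketched on simulated data: fit a separate line on each side of the cutoff and compare the two fitted values at the cutoff. The income scale, the cutoff of 10 (in $1,000s), and the jump of 5 are all made-up assumptions. `fit_line` and `rd_estimate` are hypothetical helper names.

```python
import random

def fit_line(x, y):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def rd_estimate(index, outcome, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff, using separate lines on each
    side and only observations within `bandwidth` of the cutoff."""
    below = [(i - cutoff, y) for i, y in zip(index, outcome)
             if cutoff - bandwidth <= i < cutoff]
    above = [(i - cutoff, y) for i, y in zip(index, outcome)
             if cutoff <= i <= cutoff + bandwidth]
    # The index is centered at the cutoff, so each intercept is the fitted
    # outcome exactly at the cutoff.
    at_cutoff_below, _ = fit_line(*zip(*below))
    at_cutoff_above, _ = fit_line(*zip(*above))
    return at_cutoff_below - at_cutoff_above   # eligible minus ineligible

random.seed(2)
CUTOFF, TRUE_JUMP = 10.0, 5.0   # assumed values; income in $1,000s
income = [random.uniform(8.0, 12.0) for _ in range(2000)]
outcome = [50 + 2 * (i - CUTOFF) + (TRUE_JUMP if i < CUTOFF else 0) + random.gauss(0, 1)
           for i in income]

estimate = rd_estimate(income, outcome, CUTOFF, bandwidth=2.0)
print(round(estimate, 2))
```

Centering the index at the cutoff is the key design choice: it turns each side's intercept into the prediction at the threshold, so the treatment effect is read off as the gap between the two intercepts.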

34 of 41

Regression Discontinuity Designs

  • LATE (Local Average Treatment Effect) instead of ATE (Average Treatment Effect)
  • Why?
  • How could you test robustness of your result?
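
One common robustness check is to re-estimate with different bandwidths. A sketch on simulated data (cutoff, jump, and slope are assumptions chosen for illustration) compares mean outcomes just below and just above the cutoff for shrinking windows:

```python
import random

random.seed(3)
CUTOFF, TRUE_JUMP = 10.0, 5.0   # assumed values; income in $1,000s
data = []
for _ in range(4000):
    income = random.uniform(8.0, 12.0)
    y = 50 + 2 * (income - CUTOFF) + (TRUE_JUMP if income < CUTOFF else 0) + random.gauss(0, 1)
    data.append((income, y))

def mean(values):
    return sum(values) / len(values)

def mean_gap(bandwidth):
    """Mean outcome just below the cutoff minus mean outcome just above it."""
    below = [y for i, y in data if CUTOFF - bandwidth <= i < CUTOFF]
    above = [y for i, y in data if CUTOFF <= i <= CUTOFF + bandwidth]
    return mean(below) - mean(above)

gaps = {h: mean_gap(h) for h in (2.0, 1.0, 0.5, 0.25)}
for h, gap in gaps.items():
    print(f"bandwidth ±{h}: estimated jump {gap:.2f}")
```

Because the outcome rises with income in this simulation, wide windows mix the jump with the underlying trend, while narrow windows get closer to the jump but use fewer observations. That trade-off is also why the estimate is local: it describes people near the cutoff, not the whole population.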

35 of 41

General RDD framework

  • f(I_i - I_c) is a function of individual i's distance from the cutoff; in theory, this “soaks up” selection bias
  • β_2 Y_{i,t-1} is the observation from the prior time period; it adjusts for initial differences among individuals
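
One specification consistent with the two terms described above, offered as an assumption because the slide's own equation is not reproduced in this text. The treatment indicator T_i and its coefficient \beta_1 are hypothetical names.

```latex
Y_{it} = \beta_0 + \beta_1 T_i + \beta_2 Y_{i,t-1} + f(I_i - I_c) + \varepsilon_{it},
\qquad T_i = \mathbf{1}[\, I_i \text{ falls on the eligible side of } I_c \,]
```

Under this sketch, \beta_1 is the estimated jump in the outcome at the cutoff, holding the prior-period outcome and the distance from the cutoff fixed.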

36 of 41

Regression Discontinuity Designs

Look for RD designs when policy/program eligibility occurs on a continuous variable:

  • Income: what’s the difference between $9,995 and $10,005?
  • Distance: 9 miles versus 11 miles?

Think substantively: does cutoff carry meaning?

Inappropriate for non-continuous cutoffs:

  • No spectrum for single moms, college graduates, or being homeless

37 of 41

Regression Discontinuity Designs

What did your readings say about RDDs?

Why?

38 of 41

Quasi-Experimental Designs

We always ask:

  • How close is close enough?
  • How similar is similar enough?
  • How exogenous is exogenous enough?

39 of 41

Quasi-Experimental vs. Experimental

Experiments aren’t perfect either

  • Why might an experiment be inappropriate?

  • What are some potential benefits of an observational study?

40 of 41

Questions

41 of 41

Exercise

Group 1: Chairman Christopher Hart

National Transportation Safety Board

Group 2: JT Griffin

Senior VP of Pub. Policy, MADD

Group 3: Jana Simpler

Director, Governors’ Highway Safety Association

Group 4: Rob Tod

Chair, National Brewers’ Association

Gwen Ifill!