Jiyong Park

Bryan School of Business and Economics

University of North Carolina at Greensboro

jiyong.park@uncg.edu

Korea Summer Workshop on Causal Inference 2022

Boot Camp for Beginners

Potential Outcome Framework, Regression, and Matching

Session Website: https://sites.google.com/view/causal-inference2022

Potential Outcome Framework

  • A causal effect is defined as the difference between the potential outcomes with and without the treatment.
    • “What if the treatment had not been applied?”
    • Causal effect = (Actual outcome for treated if treated) – (Potential outcome for treated if not treated)

[Figure: two examples of a causal effect defined against a counterfactual: the causal effect of reading on grades, and the causal effect of adopting a dog on depression.]

Dominici, F., Bargagli-Stoffi, F.J. and Mealli, F., 2020. From controlled to undisciplined data: estimating causal effects in the era of data science using a potential outcome framework. arXiv preprint arXiv:2012.06865.

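Subject i   Treatment   Potential outcome if treated   Potential outcome if not treated   Causal effect
1           1           3                              (counterfactual)                   ATE on the Treated (ATET)
2           1           1                              (counterfactual)                   ATE on the Treated (ATET)
3           0           (counterfactual)               1                                  ATE on the Untreated (ATEU)
4           0           (counterfactual)               1                                  ATE on the Untreated (ATEU)

Main Focus: Average Treatment Effect (ATE)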

Individual treatment effect (ITE) cannot be identified by definition.
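As a minimal sketch of these definitions, the snippet below computes the ITE, ATE, ATET, and ATEU for four subjects. The data are hypothetical: both potential outcomes (`y1`, `y0`) are listed for every subject, which is never possible with real data, and the field names are illustrative only.

```python
from statistics import mean

# Hypothetical toy data: BOTH potential outcomes are known for each subject.
# In real data only one of y1 / y0 is ever observed per subject.
subjects = [
    {"i": 1, "treated": 1, "y1": 5, "y0": 1},
    {"i": 2, "treated": 1, "y1": 3, "y0": 1},
    {"i": 3, "treated": 0, "y1": 2, "y0": 1},
    {"i": 4, "treated": 0, "y1": 2, "y0": 2},
]

def ite(s):
    """Individual treatment effect: outcome if treated minus outcome if not."""
    return s["y1"] - s["y0"]

ate = mean(ite(s) for s in subjects)                          # all subjects
atet = mean(ite(s) for s in subjects if s["treated"] == 1)    # treated only
ateu = mean(ite(s) for s in subjects if s["treated"] == 0)    # untreated only

print(f"ATE={ate}, ATET={atet}, ATEU={ateu}")
```

With real data the ITE column cannot be filled in, so averages such as the ATE have to be identified through a research design rather than computed directly.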

Fundamental Problem of Causal Inference

  • In reality, we do not observe both potential outcomes; we only observe one.
    • Causal effect = (Actual outcome for treated if treated) – (Potential outcome for treated if not treated)
    • But we can only observe: (Actual outcome for treated if treated) – (Actual outcome for untreated if not treated)

[Figure: the ideal comparison is between the treatment group and its counterfactual; the actual comparison is between the treatment group and the control group.]

Subject i   Treatment   Potential outcome if treated   Potential outcome if not treated
1           1           3                              1
2           1           1                              1
3           0           2                              1
4           0           2                              1

Ignorability ≅ Exchangeability ≅ Unconfoundedness ≅ Exogeneity

[Figure annotation: the actual comparison uses the control group in place of the treatment group's counterfactual.]

[Figure (source: Dominici et al. 2020): the ideal comparison (treatment group vs. its counterfactual) and the actual comparison (treatment group vs. control group).]

Selection Bias

  • In reality, treatments are not assigned randomly. Individuals select into the treatment, often for reasons we do not observe.
  • As a result, the treatment and control groups may differ systematically and may not be comparable.

Are they comparable, except for the adoption W?

Are they comparable, except for a confounder X and the adoption W?

(Figure source: Dominici et al. 2020.)

  • Selection bias is the systematic difference between the treatment group in the absence of treatment (i.e., counterfactual) and the control group.

  • Decomposition of causal effect and selection bias
    • Observed effect of the treatment
      = (Outcome for treated if treated) – (Outcome for untreated if not treated)
      = (Outcome for treated if treated) – (Outcome for treated if not treated) + (Outcome for treated if not treated) – (Outcome for untreated if not treated)
      = [(Outcome for treated if treated) – (Outcome for treated if not treated)] + [(Outcome for treated if not treated) – (Outcome for untreated if not treated)]
      = Causal effect + Selection bias
      where Causal effect = (Outcome for treated if treated) – (Outcome for treated if not treated) and Selection bias = (Outcome for treated if not treated) – (Outcome for untreated if not treated).
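The decomposition can be checked numerically. The sketch below uses hypothetical toy values in which both potential outcomes are known (impossible in practice) and confirms that the observed difference equals the causal effect on the treated plus the selection bias. All variable names are illustrative.

```python
from statistics import mean

# Hypothetical toy data with both potential outcomes known for every subject.
treated = [1, 1, 0, 0]
y1 = [5, 3, 2, 2]   # outcome if treated
y0 = [1, 1, 1, 2]   # outcome if not treated

t_idx = [i for i, t in enumerate(treated) if t == 1]
c_idx = [i for i, t in enumerate(treated) if t == 0]

# What we can actually compute from data: treated outcomes vs. control outcomes.
observed_effect = mean(y1[i] for i in t_idx) - mean(y0[i] for i in c_idx)

# Causal effect for the treated: uses the unobservable counterfactual y0 of the treated.
causal_effect = mean(y1[i] for i in t_idx) - mean(y0[i] for i in t_idx)

# Selection bias: counterfactual of the treated vs. the control group.
selection_bias = mean(y0[i] for i in t_idx) - mean(y0[i] for i in c_idx)

print(observed_effect, causal_effect, selection_bias)
```

In this toy example the control group would have done slightly better than the treated group without treatment, so the observed difference understates the causal effect.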

Ceteris Paribus ≅ Comparable Control Group

  • From the perspective of potential outcomes, causal inference amounts to removing the selection bias.

  • How? The treatment group should be comparable to the control (untreated) group in the absence of treatment.
  • The average treatment effect (ATE) can be identified by taking advantage of research designs in which the control group is comparable to the treatment group in all respects, on average, except for the fact that the latter is treated.

[Figure: the control group is compared with the treatment group without treatment (the counterfactual).]

Selection bias = (Outcome for treated if not treated) – (Outcome for untreated if not treated)

Subject i   Treatment   Potential outcome if treated   Potential outcome if not treated
1           1           3                              1
2           1           1                              1
3           0           2                              1
4           0           2                              1

If ceteris paribus is satisfied, then exchangeability holds.

Example of the causal effect of work from home

[Figure, adapted from Park et al. (2021): employees in the airfare and hotel departments of the Shanghai call center decide whether to volunteer (self-select) to work from home. Volunteered employees form the self-selected treatment group and non-volunteered employees form the self-selected control group. Are the two groups comparable except for work-from-home?]

Park, S., Tafti, A.R. and Shmueli, G., 2021. Transporting Causal Effects Across Populations Using Structural Causal Modeling: The Example of Work-From-Home Productivity. Available at SSRN.

Bloom, N., Liang, J., Roberts, J. and Ying, Z.J., 2015. Does working from home work? Evidence from a Chinese experiment. Quarterly Journal of Economics, 130(1), pp.165-218.

[Figure, adapted from Park et al. (2021): the ideal comparison (causal effect) is between the self-selected treatment group (volunteered employees) and the counterfactual of the treatment group. The self-selected control group (non-volunteered employees) is not comparable to that counterfactual (selection bias), so comparing the two self-selected groups yields causal effect + selection bias.]

[Figure, adapted from Park et al. (2021): within the self-selected group of volunteered employees, a randomized treatment group is compared with a randomized control group (as-if counterfactual). The two are comparable, so the comparison yields the causal effect. Non-volunteered employees remain a self-selected control group.]

[Figure, adapted from Park et al. (2021): randomized treatment group vs. randomized control group (as-if counterfactual): comparable, yielding the causal effect. Randomized treatment group vs. self-selected control group (non-volunteered employees): causal effect + selection bias. Randomized control group vs. self-selected control group: selection bias.]

Gold Standard of Causal Inference: Random Assignment

Causal Hierarchy from the Perspective of Potential Outcomes

Level of Causal Inference (from the top of the hierarchy to the bottom):

  • Meta-Analysis
  • Randomized Controlled Trial
  • Quasi-Experiment
  • Instrumental Variable
  • “Designed” Regression/Matching (based on causal knowledge or theory)
  • Regression/Matching (little causal inference)
  • Model-Free Descriptive Statistics (no causal inference)

Gold Standard of Causal Inference: Random Assignment

  • Recall the law of large numbers and coin flipping.

The law of large numbers states that, as the number of independent, identically distributed random variables increases, their sample mean approaches their theoretical mean.
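A quick simulation makes the coin-flipping intuition concrete. This is only an illustrative sketch: the seed and the numbers of flips are arbitrary choices, and `heads_share` is a hypothetical helper name.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def heads_share(n_flips):
    """Share of heads in n_flips flips of a fair coin."""
    return sum(random.random() < 0.5 for _ in range(n_flips)) / n_flips

# The share of heads drifts toward the theoretical mean 0.5 as flips accumulate.
shares = {n: heads_share(n) for n in (10, 100, 10_000, 200_000)}
for n, share in shares.items():
    print(f"{n:>7} flips: share of heads = {share:.4f}")
```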

  • Relying on the law of large numbers, random assignment ensures that the subjects with various characteristics are distributed evenly across the treatment and control groups, so that they are comparable.

For each unit, flip a coin to determine the treatment, which is random.

Heads… get the treatment

Tails… do not get the treatment

The only systematic difference between the two groups is the treatment (i.e., ceteris paribus).

[Figure: a sample is split at random, 50 to the treatment group and 50 to the control group; the three panels show similar covariate summaries.]

                           Panel 1       Panel 2       Panel 3
Male : Female              0.5 : 0.5     0.5 : 0.5     0.5 : 0.5
w/ kids : w/o kids         0.4 : 0.6     0.4 : 0.6     0.4 : 0.6
Average age                31.5          32            32
Average height (cm)        169           171           170
Average blood pressure     95            105           100

Randomly assign 50 to treatment group

Randomly assign 50 to control group

Treatment Group: Use a Pill

Control Group: Use a Placebo Pill
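The balancing property of random assignment can be sketched with a small simulation. Everything here is hypothetical: the population, the covariates (`age`, `male`), and their distributions are invented for illustration and are not the numbers from the slide.

```python
import random
from statistics import fmean

random.seed(1)

# Hypothetical population with two illustrative covariates.
people = [
    {"age": random.gauss(32, 8), "male": int(random.random() < 0.5)}
    for _ in range(10_000)
]

# One coin flip per unit: heads -> treatment, tails -> control.
for p in people:
    p["treated"] = random.random() < 0.5

treat = [p for p in people if p["treated"]]
ctrl = [p for p in people if not p["treated"]]

# With enough units, covariate averages are close across the two groups,
# even though the covariates were never used in the assignment.
age_gap = fmean(p["age"] for p in treat) - fmean(p["age"] for p in ctrl)
male_gap = fmean(p["male"] for p in treat) - fmean(p["male"] for p in ctrl)

print(f"age gap = {age_gap:.3f}, male-share gap = {male_gap:.3f}")
```

The same logic applies to covariates that are not measured at all, which is what sets random assignment apart from the selection on observables strategies.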

Example of Randomized Experiments

  • Impact of computer use in the classroom on academic performance of college students (Carter et al. 2017)

Carter, S.P., Greenberg, K. and Walker, M.S., 2017. The impact of computer usage on academic performance: Evidence from a randomized trial at the United States Military Academy. Economics of Education Review, 56, pp.118-132.

[Figure from Carter et al. (2017), annotated to mark the confounders and the treatment.]

If the confounders are sufficiently randomized, their influence should be minimized.

Average Treatment Effect (ATE), with other confounders held constant due to random assignment

Selection on Observables: Regression, Matching, and Weighting

Causal Hierarchy from the Perspective of Potential Outcomes

[Figure: the causal hierarchy (Meta-Analysis; Randomized Controlled Trial; Quasi-Experiment; Instrumental Variable; “Designed” Regression/Matching; Regression/Matching; Model-Free Descriptive Statistics), annotated with two groups of strategies: Selection on Unobservables Strategies and Selection on Observables Strategies.]

How to Balance between Treatment and Control Groups

    • (1) Regression adjustment
      • Accounts for the selection bias using control variables (with a specific functional form)

    • (2) Matching
      • Matches treatment units to control units that share similar observed variables (without functional forms)

    • (3) Weighting
      • Assigns weights to the treatment and control units so as to balance the probability of being treated

 

Regression from the Perspective of Potential Outcomes

Angrist, J.D. and Pischke, J.S., 2017. Undergraduate econometrics instruction: through our classes, darkly. Journal of Economic Perspectives, 31(2), pp.125-144.

“This approach abandons the traditional regression framework in which all regressors are treated equally. The pedagogical emphasis on statistical efficiency and functional form, along with the sophomoric narrative that sets students off in search of “true models” as defined by a seemingly precise statistical fit, is ready for retirement.”

“Instead, the focus should be on the set of control variables needed to insure that the regression-estimated effect of the economic variable of interest has a causal interpretation.”

    • Traditional viewpoint of regression

“The search for a true model with a large number of explanatory variables” (p. 128)

“No pride of place to any particular set of variables” (p. 128)

    • Modern viewpoint of regression for causal inference

“Regression should be taught the way it is now most often used: as a tool to control for confounding factors.” (p. 126)

“The modern regression paradigm turns on the notion that the analyst has data on control variables that generate apples-to-apples comparisons for the variable of interest.” (p. 132)

 

“We are confident that the coefficients describe in a reasonable way the relationship between achieving and GSES [genetic endowment and socioeconomic status], TQ [teacher quality], SQ [non-teacher school quality], and PG [peer group characteristics], for this collection of 627 elementary school students.”

Rethinking Regression for Causal Inference

 

 

 

[Equations not recovered from the slides; one term is labeled “Selection bias”.]

 

 

Identification Assumption: Conditional independence
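The equations for this part did not survive in the text. As a hedged sketch of the standard textbook argument (not necessarily the slides' own notation), the block below writes the naive comparison as causal effect plus selection bias, states conditional independence, and gives the regression with controls. The symbols $Y_i(1)$, $Y_i(0)$, $D_i$, $X_i$, $\alpha$, $\tau$, $\gamma$, and $\varepsilon_i$ are assumptions introduced here for illustration.

```latex
% Observed outcome in terms of potential outcomes (D_i = 1 if treated)
\[ Y_i = D_i\,Y_i(1) + (1 - D_i)\,Y_i(0) \]

% Naive comparison = causal effect on the treated + selection bias
\[
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
  = \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{causal effect}}
  + \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}
\]

% Identification assumption: conditional independence given controls X_i
\[
\{Y_i(1),\, Y_i(0)\} \perp\!\!\!\perp D_i \mid X_i
\quad\Longrightarrow\quad
E[Y_i(0) \mid D_i = 1, X_i] = E[Y_i(0) \mid D_i = 0, X_i]
\]

% Regression with controls (special case: constant effect, linear in X_i)
\[ Y_i = \alpha + \tau D_i + X_i'\gamma + \varepsilon_i \]
```

Under conditional independence the selection-bias term vanishes within each value of $X_i$. In the constant-effect, linear special case, $\tau$ can then be read as the causal effect of $D_i$, while $\gamma$ carries no causal interpretation.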

 

Rethinking Regression for Causal Inference

1. There should be a clear distinction between causes and controls.

2. The role of control variables is to account for the selection bias.

3. Don’t interpret the coefficients of controls in a causal manner.

“A third important characteristic of the Dale and Krueger (2002) study is a clear distinction between causes and controls on the right hand side of the regressions at the heart of their study. In the modern paradigm, regressors are not all created equal. Rather, only one variable at a time is seen as having causal effects. All others are controls included in service of this focused causal agenda.” (p. 129)

“The modern regression paradigm turns on the notion that the analyst has data on control variables that generate apples-to-apples comparisons for the variable of interest.” (p. 132)

“potential outcomes conditional on controls…” (p. 132)

“It’s unlikely that the regression coefficients multiplying the controls have a causal interpretation. We don’t imagine that the controls are as good as randomly assigned and we needn’t care whether they are.” (p. 132)

“The controls have a job to do: they are the foundation for the conditional independence claim” (p. 132)

Regression is Analogous to Matching

“Regression is an automated matchmaker that produces within-group comparisons: there’s a single causal variable of interest, while other regressors measure conditions and circumstances that we would like to hold fixed when studying the effects of this cause.” (p. 130)

“By holding the control variables fixed—that is, by including them in a multivariate regression model—we hope to give the regression coefficient on the causal variable a ceteris paribus, apples-to-apples interpretation.” (p. 130)
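The idea of "holding the control fixed" can be sketched with a simulation. The data-generating process below is hypothetical (one observed confounder `x`, a true effect of 2.0 chosen for illustration). The adjusted estimate uses the Frisch-Waugh-Lovell result, which gives the same coefficient on the treatment as a regression of `y` on `d` and `x`: partial `x` out of both `d` and `y`, then regress residual on residual. The helpers `slope` and `residuals` are illustrative names.

```python
import random
from statistics import fmean

random.seed(2)
TRUE_EFFECT = 2.0  # hypothetical effect built into the simulation

rows = []
for _ in range(20_000):
    x = random.gauss(0, 1)                       # observed confounder
    d = 1 if x + random.gauss(0, 1) > 0 else 0   # self-selection depends on x
    y = TRUE_EFFECT * d + 3.0 * x + random.gauss(0, 1)
    rows.append((d, x, y))

d, x, y = (list(col) for col in zip(*rows))

def slope(u, v):
    """OLS slope from regressing v on u (with an intercept)."""
    mu, mv = fmean(u), fmean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / sum((a - mu) ** 2 for a in u)

def residuals(u, v):
    """Residuals of v after regressing it on u (with an intercept)."""
    beta, mu, mv = slope(u, v), fmean(u), fmean(v)
    return [(b - mv) - beta * (a - mu) for a, b in zip(u, v)]

# Naive comparison: treated mean minus control mean (contains selection bias).
naive = fmean(yi for di, yi in zip(d, y) if di == 1) - fmean(yi for di, yi in zip(d, y) if di == 0)

# Regression adjustment: coefficient on d after controlling for x.
adjusted = slope(residuals(x, d), residuals(x, y))

print(f"naive = {naive:.2f}, adjusted = {adjusted:.2f}")
```

The adjustment works here only because the simulated outcome is linear in `x` and `x` is the sole confounder. With a misspecified functional form or an unobserved confounder, the adjusted estimate would remain biased.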

Matching

  • Matching constructs a control group that is close to the treatment group in terms of observed variables.
  • Unlike regression, matching does not assume a functional form for the controls (matching variables) or for the selection bias.
  • “Any matching procedure to make the control and treatment more similar in the observables can be seen as a flexible functional form with adding “control variables” to an [regression] analysis framework.” (Goldfarb et al. 2022, p. 12)

Goldfarb, A., Tucker, C. and Wang, Y., 2022. Conducting research in marketing with quasi-experiments. Journal of Marketing, 86(3), pp.1-20.

  • Propensity Score Matching (PSM): Match the treated and untreated units based on the likelihood of being treated, conditional on observed covariates (i.e., propensity score).

Propensity Score Matching
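A minimal sketch of propensity score matching on simulated data follows. The setup is hypothetical: a single covariate `x`, a true effect of 2.0, a logistic propensity model fitted by plain gradient ascent, and 1-nearest-neighbor matching with replacement. In practice a dedicated matching package would be used, and other design choices (calipers, matching without replacement) are common.

```python
import math
import random
from bisect import bisect_left
from statistics import fmean

random.seed(3)
TRUE_EFFECT = 2.0  # hypothetical effect built into the simulation

units = []
for _ in range(4_000):
    x = random.gauss(0, 1)                                   # observed covariate
    d = 1 if random.random() < 1 / (1 + math.exp(-x)) else 0  # selection on x
    y = TRUE_EFFECT * d + 3.0 * x + random.gauss(0, 1)
    units.append((x, d, y))

# Step 1: estimate the propensity score P(d = 1 | x) with a logistic
# regression, fitted by gradient ascent on the average log-likelihood.
a, b = 0.0, 0.0
for _ in range(200):
    grad_a = grad_b = 0.0
    for x, d, _y in units:
        err = d - 1 / (1 + math.exp(-(a + b * x)))
        grad_a += err
        grad_b += err * x
    a += grad_a / len(units)
    b += grad_b / len(units)

def pscore(x):
    return 1 / (1 + math.exp(-(a + b * x)))

# Step 2: for each treated unit, find the control unit with the closest score.
controls = sorted((pscore(x), y) for x, d, y in units if d == 0)
control_scores = [s for s, _ in controls]

def nearest_control_outcome(score):
    j = bisect_left(control_scores, score)
    candidates = [k for k in (j - 1, j) if 0 <= k < len(controls)]
    best = min(candidates, key=lambda k: abs(control_scores[k] - score))
    return controls[best][1]

# Step 3: average the treated-minus-matched-control gaps (an ATET estimate).
atet_psm = fmean(y - nearest_control_outcome(pscore(x)) for x, d, y in units if d == 1)
naive = fmean(y for _, d, y in units if d == 1) - fmean(y for _, d, y in units if d == 0)

print(f"naive = {naive:.2f}, PSM estimate = {atet_psm:.2f}")
```

With a single covariate the score is a monotone function of `x`, so this is equivalent to matching on `x` itself. The score becomes useful when many covariates have to be summarized in one dimension.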

  • Propensity Score Stratification (less popular): Compare the treated and untreated units within each stratum based on the likelihood of being treated, conditional on observed covariates (i.e., propensity score).

Propensity Score Stratification

Comparison within each stratum
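Stratification can be sketched in a few lines, assuming a propensity score is already available. In this hypothetical simulation the true score stands in for an estimated one, the true effect is 2.0, and ten equal-sized strata are used. All of these are illustrative choices.

```python
import math
import random
from statistics import fmean

random.seed(4)
TRUE_EFFECT = 2.0  # hypothetical effect built into the simulation

units = []
for _ in range(20_000):
    x = random.gauss(0, 1)
    score = 1 / (1 + math.exp(-x))   # stand-in for an estimated propensity score
    d = 1 if random.random() < score else 0
    y = TRUE_EFFECT * d + 3.0 * x + random.gauss(0, 1)
    units.append((score, d, y))

# Cut the units into equal-sized strata by propensity score.
N_STRATA = 10
units.sort(key=lambda u: u[0])
size = len(units) // N_STRATA
strata = [units[k * size:(k + 1) * size] for k in range(N_STRATA)]

# Compare treated and untreated within each stratum, then average the
# stratum-level gaps (strata have equal size, so a simple mean suffices).
effects = []
for stratum in strata:
    treated = [y for _, d, y in stratum if d == 1]
    control = [y for _, d, y in stratum if d == 0]
    if treated and control:          # skip strata without overlap
        effects.append(fmean(treated) - fmean(control))

ate_strat = fmean(effects)
naive = fmean(y for _, d, y in units if d == 1) - fmean(y for _, d, y in units if d == 0)

print(f"naive = {naive:.2f}, stratified estimate = {ate_strat:.2f}")
```

Some imbalance remains inside each stratum, especially in the extreme ones, so the stratified estimate removes most but not all of the selection bias in this setup.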

  • Coarsened Exact Matching (CEM): Match the treated and untreated units that fall into the same bin of the coarsened data (i.e., apply an exact match on the coarsened data).

Coarsened Exact Matching
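The coarsen-then-match-exactly idea can be sketched as follows. The simulation is hypothetical: a single covariate (`age`), 5-year bins as the coarsening, and a true effect of 2.0. Real applications coarsen several covariates at once and typically rely on a dedicated CEM implementation.

```python
import random
from collections import defaultdict
from statistics import fmean

random.seed(5)
TRUE_EFFECT = 2.0  # hypothetical effect built into the simulation

units = []
for _ in range(20_000):
    age = random.uniform(20, 60)
    d = 1 if random.random() < (age - 10) / 60 else 0   # older units select in more often
    y = TRUE_EFFECT * d + 0.1 * age + random.gauss(0, 1)
    units.append((age, d, y))

def coarsen(age, width=5):
    """Map an exact age to its 5-year bin."""
    return int(age // width)

# Exact match on the coarsened value: group outcomes by (bin, treatment status).
bins = defaultdict(lambda: {0: [], 1: []})
for age, d, y in units:
    bins[coarsen(age)][d].append(y)

# Keep only bins that contain both treated and untreated units.
matched = {b: g for b, g in bins.items() if g[0] and g[1]}

# ATET estimate: weight each bin's treated-minus-control gap by its treated count.
n_treated = sum(len(g[1]) for g in matched.values())
atet_cem = sum(len(g[1]) * (fmean(g[1]) - fmean(g[0])) for g in matched.values()) / n_treated

naive = fmean(y for _, d, y in units if d == 1) - fmean(y for _, d, y in units if d == 0)
print(f"naive = {naive:.2f}, CEM estimate = {atet_cem:.2f}")
```

Finer bins reduce the imbalance left within a bin but leave more units unmatched, which is the basic trade-off when choosing the coarsening.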

Weighting

    • Whereas matching keeps only the units that share similar characteristics, weighting assigns weights to the treatment and control units so as to balance the probability of being treated (i.e., the propensity score).

[Figure: matching vs. weighting]

Inverse Probability Weighting

    • IPW aims to generate a pseudo-population in which the treatment is independent of confounders.

Sample of 100 units (60 with C = 1, 40 with C = 0):

C    X    Units   Y = 1   Y = 0
1    1    30      27      3
1    0    30      12      18
0    1    30      21      9
0    0    10      2       8

[Causal diagram over C, X, and Y.]

    • Rethinking IPW from the perspective of potential outcomes

[Figure: the sample tree (C, then X, then Y), with the counterfactual outcomes of each (C, X) cell marked as unknown (“?”).]

Selection on observables assumption: Treatment and control groups are comparable, conditional on the observed covariates (e.g., C).

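Pseudo-Population: each (C, X) cell combines its observed outcomes with the counterfactual outcomes imputed for the units in the same C group that received the other treatment.

C    X    Observed (Y = 1 / Y = 0)   Imputed counterfactual (Y = 1 / Y = 0)   Pseudo-population (Y = 1 / Y = 0)
1    1    27 / 3                     27 / 3                                   54 / 6
1    0    12 / 18                    12 / 18                                  24 / 36
0    1    21 / 9                     7 / 3                                    28 / 12
0    0    2 / 8                      6 / 24                                   8 / 32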

Replacing the untreated (treated) counterfactuals with the treated (untreated) outcomes is equivalent to weighting them by the inverse of the probability of being treated (untreated).
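The equivalence can be checked with the cell counts from this example. The sketch below treats X as the treatment, C as the observed confounder, and Y as the outcome, and weights each unit by the inverse of the probability of the treatment it actually received given C. Function names are illustrative; exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Cell counts from the example: (C, X) -> (units with Y = 1, units with Y = 0)
cells = {
    (1, 1): (27, 3),
    (1, 0): (12, 18),
    (0, 1): (21, 9),
    (0, 0): (2, 8),
}

def n(c, x):
    """Number of units in cell (C = c, X = x)."""
    return sum(cells[(c, x)])

def weight(c, x):
    """Inverse of P(X = x | C = c): the inverse probability weight."""
    return Fraction(n(c, 1) + n(c, 0), n(c, x))

def weighted_risk(x):
    """P(Y = 1 | X = x) in the weighted pseudo-population."""
    y1 = sum(weight(c, x) * cells[(c, x)][0] for c in (0, 1))
    total = sum(weight(c, x) * n(c, x) for c in (0, 1))
    return y1 / total

def sample_risk(x):
    """P(Y = 1 | X = x) in the original, unweighted sample."""
    return Fraction(sum(cells[(c, x)][0] for c in (0, 1)),
                    sum(n(c, x) for c in (0, 1)))

naive_diff = sample_risk(1) - sample_risk(0)
ipw_diff = weighted_risk(1) - weighted_risk(0)

print(f"sample: {float(naive_diff):.2f}, pseudo-population: {float(ipw_diff):.2f}")
```

For instance, the 10 untreated units with C = 0 get weight 4, turning their 2 / 8 outcomes into 8 / 32, which matches the pseudo-population counts obtained by filling in counterfactuals.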

    • In the pseudo-population, the treatment is assigned as if random, independent of the observed covariates.

[Figure: the sample tree shown alongside the pseudo-population tree.]

 

[Figure: causal diagrams over C, X, and Y for the sample and for the pseudo-population, with the comparison of Y = 1 and Y = 0 across X computed using the sample and using the pseudo-population.]

Weighting vs. Regression/Matching

[Figure: weighting compared with regression/matching.]

    • In some special cases, conditioning may not work as it causes other backdoor paths to open. In this case, weighting methods can be used.

Comparison of Regression, Matching, and Weighting

Pros

  • Regression
    • Flexible for within-group comparisons
    • Easy to account for multiple treatment groups
    • Retains all observations
  • Matching
    • Resembles an RCT design (by constructing a pseudo-population, without functional assumptions)
    • Easy to assess the balance
  • Weighting
    • Resembles an RCT design while retaining all observations
    • Easy to assess the balance
    • Works even if conditioning doesn’t work

Cons

  • Regression
    • Sensitive to the functional form
    • Difficult to assess conditional independence
  • Matching
    • [For PSM] Sensitive to the accuracy of the propensity score
    • Smaller sample (statistically less efficient & may be different from the original data)
  • Weighting
    • More sensitive to the accuracy of the propensity score

Common Limitation

  • Selection on observed covariates does not rule out potential selection on unobservables.
  • It is critical to argue convincingly how the observed covariates account for the selection on unobservables.

CAVEAT: Last Resort for Causal Inference

    • Selection on observables does not guarantee the balance of unobserved covariates.
    • Selection on observables strategies should be considered the last resort for causal inference, as it is quite challenging to account for unobserved variables by using only observed variables.

A realistic perspective for such an approach is that “we can hope to infer causality.” (Goldfarb and Tucker 2014)

  • A sensitivity test or bounding analysis for the omitted variable bias is useful (e.g., Altonji et al. 2005).
  • A causal graph might help in justifying the role of observed variables in eliminating selection bias.

Goldfarb, A. and Tucker, C.E., 2014. Conducting Research with Quasi-Experiments: A Guide for Marketers. Rotman School of Management Working Paper No. 2420920.

Altonji, J.G., Elder, T.E. and Taber, C.R., 2005. Selection on observed and unobserved variables: Assessing the effectiveness of Catholic schools. Journal of Political Economy, 113(1), pp.151-184.

Still, They Work in Service of Experimental Methods

    • Control variables, matching, or weighting are still helpful in making randomized controlled trials or quasi-experiments more rigorous. It is more common to utilize control variables, matching, or weighting in tandem with other experimental methods, rather than in isolation.

    • Ceteris Paribus in the presence of control variables, matching, or weighting
      • The control group should be comparable to the treatment group in all respects other than the control variables and the fact that they are treated.

Causal Hierarchy from the Perspective of Potential Outcomes

[Figure: the causal hierarchy (Meta-Analysis; Randomized Controlled Trial; Quasi-Experiment; Instrumental Variable; “Designed” Regression/Matching; Regression/Matching; Model-Free Descriptive Statistics), with an arrow labeled “Toward Credibility” and an annotation marking the “Last Resort for Causal Inference”.]
