Jiyong Park
Bryan School of Business and Economics
University of North Carolina at Greensboro
Potential Outcome Framework, Regression, and Matching
Korea Summer Workshop on Causal Inference 2022
Boot Camp for Beginners
Potential Outcome Framework,
Regression, and Matching
Session Website: https://sites.google.com/view/causal-inference2022
Potential Outcome Framework
Causal effect of reading on grades
Counterfactual
Dominici, F., Bargagli-Stoffi, F.J. and Mealli, F., 2020. From controlled to undisciplined data: estimating causal effects in the era of data science using a potential outcome framework. arXiv preprint arXiv:2012.06865.
Causal effect of adopting a dog on depression
Counterfactual
Potential Outcome Framework
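| Subject i | Treatment | Potential Outcome if Treated | Potential Outcome if Untreated | Causal Effect | Estimand |
| 1 | 1 | 3 | ? | ? | ATE on the Treated (ATET) |
| 2 | 1 | 1 | ? | ? | ATE on the Treated (ATET) |
| 3 | 0 | ? | 1 | ? | ATE on the Untreated (ATEU) |
| 4 | 0 | ? | 1 | ? | ATE on the Untreated (ATEU) |
? = Counterfactual (not observed)
Main Focus: Average Treatment Effect (ATE)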
Individual treatment effect (ITE) cannot be identified by definition.
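The estimands named on this slide can be written compactly in the usual potential-outcome notation. The symbols are an assumption here: $W_i$ for treatment (the deck uses W for the adoption example), and $Y_i(1)$, $Y_i(0)$ for the outcome of subject $i$ with and without treatment.

```latex
\begin{aligned}
\text{ITE}_i &= Y_i(1) - Y_i(0) \\
\text{ATE}   &= E\bigl[\,Y_i(1) - Y_i(0)\,\bigr] \\
\text{ATET}  &= E\bigl[\,Y_i(1) - Y_i(0) \mid W_i = 1\,\bigr] \\
\text{ATEU}  &= E\bigl[\,Y_i(1) - Y_i(0) \mid W_i = 0\,\bigr] \\
Y_i^{\text{obs}} &= W_i\,Y_i(1) + (1 - W_i)\,Y_i(0)
\end{aligned}
```

The last line shows why the ITE cannot be computed directly: each subject reveals only one of the two potential outcomes.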
Fundamental Problem of Causal Inference
Ideal Comparison = (Outcome for treated if treated) – (Counterfactual: outcome for treated if not treated)
Actual Comparison = (Actual outcome for treated if treated) – (Actual outcome for untreated if not treated), with the untreated serving as the Control Group
Fundamental Problem of Causal Inference
Actual Comparison = (Actual outcome for treated if treated) – (Actual outcome for untreated if not treated)
The Control Group stands in for the Counterfactual.
| Subject i | Treatment | Potential Outcome if Treated | Potential Outcome if Untreated |
| 1 | 1 | 3 | 1 |
| 2 | 1 | 1 | 1 |
| 3 | 0 | 2 | 1 |
| 4 | 0 | 2 | 1 |
Ignorability ≅ Exchangeability ≅ Unconfoundedness ≅ Exogeneity
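A minimal sketch of how the completed table can arise, assuming that under exchangeability each group's observed mean is used to fill in the other group's missing potential outcome. The variable names are illustrative, and the observed values (3 and 1 for the treated, 1 and 1 for the untreated) are taken from the table.

```python
# Observed data from the four-subject table: (treatment, observed outcome).
# Subjects 1-2 are treated, so their outcome if treated is observed;
# subjects 3-4 are untreated, so their outcome if untreated is observed.
observed = [(1, 3.0), (1, 1.0), (0, 1.0), (0, 1.0)]


def mean(values):
    return sum(values) / len(values)


treated_mean = mean([y for w, y in observed if w == 1])  # average outcome if treated
control_mean = mean([y for w, y in observed if w == 0])  # average outcome if untreated

# Under exchangeability, each group's mean stands in for the other group's
# missing potential outcome.
completed = [
    {
        "w": w,
        "y1": y if w == 1 else treated_mean,
        "y0": y if w == 0 else control_mean,
    }
    for w, y in observed
]

ate = mean([row["y1"] - row["y0"] for row in completed])
print(completed)
print("ATE under exchangeability:", ate)
```

With these numbers the filled-in values are 2 for the untreated subjects' outcome if treated and 1 for the treated subjects' outcome if untreated, and the resulting ATE equals the simple difference in group means.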
Fundamental Problem of Causal Inference
Actual Comparison = (Actual outcome for treated if treated) – (Actual outcome for untreated if not treated)
[Figure: the Ideal Comparison sets the treated against their Counterfactual; the Actual Comparison sets them against the Control Group.]
Dominici, F., Bargagli-Stoffi, F.J. and Mealli, F., 2020. From controlled to undisciplined data: estimating causal effects in the era of data science using a potential outcome framework. arXiv preprint arXiv:2012.06865.
Selection Bias
Are they comparable, except for the adoption W?
Are they comparable, except for a confounder X and the adoption W?
Dominici, F., Bargagli-Stoffi, F.J. and Mealli, F., 2020. From controlled to undisciplined data: estimating causal effects in the era of data science using a potential outcome framework. arXiv preprint arXiv:2012.06865.
Selection Bias
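Actual Comparison
= (Outcome for treated if treated) – (Outcome for untreated if not treated)
= (Outcome for treated if treated) – (Outcome for treated if not treated) + (Outcome for treated if not treated) – (Outcome for untreated if not treated)
= [(Outcome for treated if treated) – (Outcome for treated if not treated)] + [(Outcome for treated if not treated) – (Outcome for untreated if not treated)]
= Causal effect + Selection bias
Causal effect = (Outcome for treated if treated) – (Outcome for treated if not treated)
Selection bias = (Outcome for treated if not treated) – (Outcome for untreated if not treated)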
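The decomposition can be checked numerically. The sketch below uses a made-up set of six units for which both potential outcomes are written down, something that is only possible in a toy example; all values and names are hypothetical.

```python
# Hypothetical units: (treated?, outcome if treated, outcome if not treated).
# Both potential outcomes are listed only because this is a toy example;
# in real data one of the two is always missing.
units = [
    (1, 9.0, 7.0),
    (1, 8.0, 6.0),
    (1, 7.0, 5.0),
    (0, 7.0, 5.0),
    (0, 6.0, 4.0),
    (0, 5.0, 3.0),
]


def mean(values):
    return sum(values) / len(values)


treated = [u for u in units if u[0] == 1]
untreated = [u for u in units if u[0] == 0]

treated_if_treated = mean([y1 for _, y1, _ in treated])
treated_if_not = mean([y0 for _, _, y0 in treated])  # the counterfactual
untreated_if_not = mean([y0 for _, _, y0 in untreated])

actual_comparison = treated_if_treated - untreated_if_not
causal_effect = treated_if_treated - treated_if_not
selection_bias = treated_if_not - untreated_if_not

print("Actual comparison:", actual_comparison)
print("Causal effect:    ", causal_effect)
print("Selection bias:   ", selection_bias)
```

Here the treated would have had higher outcomes than the untreated even without treatment, so the actual comparison overstates the causal effect by exactly the selection bias.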
Ceteris Paribus ≅ Comparable Control Group
Selection bias = (Outcome for treated if not treated) – (Outcome for untreated if not treated)
= (Treatment Group w/o Treatment, i.e., the Counterfactual) – (Control Group)
| Subject i | Treatment | Potential Outcome if Treated | Potential Outcome if Untreated |
| 1 | 1 | 3 | 1 |
| 2 | 1 | 1 | 1 |
| 3 | 0 | 2 | 1 |
| 4 | 0 | 2 | 1 |
If Ceteris Paribus is satisfied, then exchangeability holds.
Ceteris Paribus ≅ Comparable Control Group
Example of the causal effect of work from home
Employees in the airfare and hotel departments of the Shanghai call center
Volunteer (self-select) to work from home?
Volunteered employees: self-selected treatment group
Non-volunteered employees: self-selected control group
Comparable except work-from-home?
The figure was adapted from Park et al. (2021)
Park, S., Tafti, A.R. and Shmueli, G., 2021. Transporting Causal Effects Across Populations Using Structural Causal Modeling: The Example of Work-From-Home Productivity. Available at SSRN.
Bloom, N., Liang, J., Roberts, J. and Ying, Z.J., 2015. Does working from home work? Evidence from a Chinese experiment. Quarterly Journal of Economics, 130(1), pp.165-218.
Ceteris Paribus ≅ Comparable Control Group
Example of the causal effect of work from home
Employees in the airfare and hotel departments of the Shanghai call center
Volunteer (self-select) to work from home?
Volunteered employees: self-selected treatment group
Non-volunteered employees: self-selected control group
Self-selected treatment group vs. counterfactual of treatment group: ideal comparison (Causal effect)
Counterfactual of treatment group vs. self-selected control group: not comparable (Selection bias)
Self-selected treatment group vs. self-selected control group: Causal effect + Selection bias
The figure was adapted from Park et al. (2021)
Park, S., Tafti, A.R. and Shmueli, G., 2021. Transporting Causal Effects Across Populations Using Structural Causal Modeling: The Example of Work-From-Home Productivity. Available at SSRN.
Bloom, N., Liang, J., Roberts, J. and Ying, Z.J., 2015. Does working from home work? Evidence from a Chinese experiment. Quarterly Journal of Economics, 130(1), pp.165-218.
Ceteris Paribus ≅ Comparable Control Group
Example of the causal effect of work from home
Employees in the airfare and hotel departments of the Shanghai call center
Volunteer (self-select) to work from home?
Volunteered employees (self-selected group): randomized treatment group and randomized control group (as-if counterfactual)
Non-volunteered employees: self-selected control group
Randomized treatment group vs. randomized control group: Comparable (Causal effect)
The figure was adapted from Park et al. (2021)
Park, S., Tafti, A.R. and Shmueli, G., 2021. Transporting Causal Effects Across Populations Using Structural Causal Modeling: The Example of Work-From-Home Productivity. Available at SSRN.
Bloom, N., Liang, J., Roberts, J. and Ying, Z.J., 2015. Does working from home work? Evidence from a Chinese experiment. Quarterly Journal of Economics, 130(1), pp.165-218.
Ceteris Paribus ≅ Comparable Control Group
Example of the causal effect of work from home
Employees in the airfare and hotel departments of the Shanghai call center
Volunteer (self-select) to work from home?
Volunteered employees (self-selected group): randomized treatment group and randomized control group (as-if counterfactual)
Non-volunteered employees: self-selected control group
Randomized treatment group vs. randomized control group: Comparable (Causal effect)
[Figure annotations: Causal effect + Selection bias; Selection bias; Causal effect]
The figure was adapted from Park et al. (2021)
Park, S., Tafti, A.R. and Shmueli, G., 2021. Transporting Causal Effects Across Populations Using Structural Causal Modeling: The Example of Work-From-Home Productivity. Available at SSRN.
Gold Standard of Causal Inference: Random Assignment
Causal Hierarchy from the Perspective of Potential Outcomes
Level of Causal Inference
Meta-Analysis
Randomized Controlled Trial
Quasi-Experiment
Instrumental Variable
“Designed” Regression/Matching (based on causal knowledge or theory)
Regression/Matching (little causal inference)
Model-Free Descriptive Statistics (no causal inference)
Gold Standard of Causal Inference: Random Assignment
The law of large numbers states that, as the number of independent, identically distributed random variables increases, their sample mean approaches their theoretical mean.
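A small simulation can make the statement concrete. This is only an illustrative sketch: the fair coin, the sample sizes, and the fixed seed are all arbitrary choices made here, not part of the slides.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable


def sample_mean_of_coin_flips(n):
    """Mean of n fair coin flips coded as 0/1; the theoretical mean is 0.5."""
    return sum(random.randint(0, 1) for _ in range(n)) / n


# The sample mean wanders for small n and settles near 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_coin_flips(n))
```

The same logic is what lets random assignment balance group characteristics on average when groups are large enough.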
Gold Standard of Causal Inference: Random Assignment
For each unit, flip a coin to determine treatment, so that assignment is random.
Heads… get the treatment
Tails… do not get the treatment
The only systematic difference between the two groups is the treatment (i.e., Ceteris Paribus).
Gold Standard of Causal Inference: Random Assignment
Male : Female = 0.5 : 0.5
w/ kids : w/o kids = 0.4 : 0.6
Average age = 31.5
Average heights = 169 cm
Average blood pressure = 95
etc…
Male : Female = 0.5 : 0.5
w/ kids : w/o kids = 0.4 : 0.6
Average age = 32
Average heights = 171 cm
Average blood pressure = 105
etc…
Male : Female = 0.5 : 0.5
w/ kids : w/o kids = 0.4 : 0.6
Average age = 32
Average heights = 170 cm
Average blood pressure = 100
etc…
Randomly assign 50 to treatment group
Randomly assign 50 to control group
Treatment Group: Use a Pill
Control Group: Use a Placebo Pill
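The balancing effect of random assignment can be sketched in a few lines. The population below is entirely made up (the distributions only loosely echo the averages on this slide), and shuffling followed by a 50/50 split is one simple way to randomize, assumed here for illustration.

```python
import random

random.seed(42)  # fixed seed for repeatability

# Hypothetical population of 100 people; the distributions are invented.
population = [
    {
        "male": random.random() < 0.5,
        "age": random.gauss(32, 8),
        "height_cm": random.gauss(170, 9),
    }
    for _ in range(100)
]

random.shuffle(population)  # the "coin flip" step
treatment, control = population[:50], population[50:]


def mean(group, key):
    return sum(float(person[key]) for person in group) / len(group)


# Compare covariate averages across the two randomly formed groups.
for key in ("male", "age", "height_cm"):
    print(
        f"{key:10s} treatment={mean(treatment, key):7.2f} "
        f"control={mean(control, key):7.2f}"
    )
```

The group averages will not match exactly in any single draw, but no covariate differs systematically between the groups, and the gaps tend to shrink as the groups grow.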
Example of Randomized Experiments
Carter, S.P., Greenberg, K. and Walker, M.S., 2017. The impact of computer usage on academic performance: Evidence from a randomized trial at the United States Military Academy. Economics of Education Review, 56, pp.118-132.
Example of Randomized Experiments
Carter, S.P., Greenberg, K. and Walker, M.S., 2017. The impact of computer usage on academic performance: Evidence from a randomized trial at the United States Military Academy. Economics of Education Review, 56, pp.118-132.
Confounders
Treatment
Example of Randomized Experiments
If assignment is sufficiently randomized, the influence of these confounders should be minimized.
Average Treatment Effect (ATE), with other confounders held constant due to random assignment
Carter, S.P., Greenberg, K. and Walker, M.S., 2017. The impact of computer usage on academic performance: Evidence from a randomized trial at the United States Military Academy. Economics of Education Review, 56, pp.118-132.
Selection on Observables: Regression, Matching, and Weighting
Causal Hierarchy from the Perspective of Potential Outcomes
Level of Causal Inference
Meta-Analysis
Randomized Controlled Trial
Quasi-Experiment
Instrumental Variable
“Designed” Regression/Matching (based on causal knowledge or theory)
Regression/Matching (little causal inference)
Model-Free Descriptive Statistics (no causal inference)
Selection on Unobservables Strategies
Selection on Observables Strategies
How to Balance between Treatment and Control Groups
Regression from the Perspective of Potential Outcomes
Angrist, J.D. and Pischke, J.S., 2017. Undergraduate econometrics instruction: through our classes, darkly. Journal of Economic Perspectives, 31(2), pp.125-144.
“This approach abandons the traditional regression framework in which all regressors are treated equally. The pedagogical emphasis on statistical efficiency and functional form, along with the sophomoric narrative that sets students off in search of “true models” as defined by a seemingly precise statistical fit, is ready for retirement.”
“Instead, the focus should be on the set of control variables needed to insure that the regression-estimated effect of the economic variable of interest has a causal interpretation.”
Regression from the Perspective of Potential Outcomes
“The search for a true model with a large number of explanatory variables” (p. 128)
“No pride of place to any particular set of variables” (p. 128)
“Regression should be taught the way it is now most often used: as a tool to control for confounding factors.” (p. 126)
“The modern regression paradigm turns on the notion that the analyst has data on control variables that generate apples-to-apples comparisons for the variable of interest.” (p. 132)
“We are confident that the coefficients describe in a reasonable way the relationship between achieving and GSES [genetic endowment and socioeconomic status], TQ [teacher quality], SQ [non-teacher school quality], and PG [peer group characteristics], for this collection of 627 elementary school students.”
Angrist, J.D. and Pischke, J.S., 2017. Undergraduate econometrics instruction: through our classes, darkly. Journal of Economic Perspectives, 31(2), pp.125-144.
Rethinking Regression for Causal Inference
Selection bias
Rethinking Regression for Causal Inference
Identification Assumption: Conditional independence
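In symbols, conditional independence is usually stated as below. The notation is an assumption here: $W_i$ for treatment and $X_i$ for the controls follow the deck's W and X, while $Y_i(1)$ and $Y_i(0)$ are the standard potential outcomes.

```latex
\bigl(Y_i(1),\, Y_i(0)\bigr) \;\perp\!\!\!\perp\; W_i \mid X_i
\quad\Longrightarrow\quad
E\bigl[\,Y_i(0) \mid W_i = 1,\, X_i\,\bigr] = E\bigl[\,Y_i(0) \mid W_i = 0,\, X_i\,\bigr]
```

The right-hand equality says that, among units with the same controls, the selection bias term is zero, so the treated-versus-untreated comparison within each value of $X_i$ can be read as a causal effect.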
1. There should be a clear distinction between causes and controls.
2. The role of control variables is to account for the selection bias.
3. Don’t interpret the coefficients of controls in a causal manner.
Rethinking Regression for Causal Inference
“A third important characteristic of the Dale and Krueger (2002) study is a clear distinction between causes and controls on the right hand side of the regressions at the heart of their study. In the modern paradigm, regressors are not all created equal. Rather, only one variable at a time is seen as having causal effects. All others are controls included in service of this focused causal agenda.” (p. 129)
“The modern regression paradigm turns on the notion that the analyst has data on control variables that generate apples-to-apples comparisons for the variable of interest.” (p. 132)
“potential outcomes conditional on controls…” (p. 132)
“It’s unlikely that the regression coefficients multiplying the controls have a causal interpretation. We don’t imagine that the controls are as good as randomly assigned and we needn’t care whether they are.” (p. 132)
“The controls have a job to do: they are the foundation for the conditional independence claim” (p. 132)
Angrist, J.D. and Pischke, J.S., 2017. Undergraduate econometrics instruction: through our classes, darkly. Journal of Economic Perspectives, 31(2), pp.125-144.
Regression is Analogous to Matching
“Regression is an automated matchmaker that produces within-group comparisons: there’s a single causal variable of interest, while other regressors measure conditions and circumstances that we would like to hold fixed when studying the effects of this cause.” (p. 130)
“By holding the control variables fixed—that is, by including them in a multivariate regression model—we hope to give the regression coefficient on the causal variable a ceteris paribus, apples-to-apples interpretation.” (p. 130)
Angrist, J.D. and Pischke, J.S., 2017. Undergraduate econometrics instruction: through our classes, darkly. Journal of Economic Perspectives, 31(2), pp.125-144.
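The "within-group comparisons" reading can be illustrated with a small calculation. The sketch below borrows the cell counts from the deck's inverse probability weighting example (C as the control variable, X as the treatment, Y as the outcome) and obtains the coefficient on X by demeaning within each level of C, which by the Frisch-Waugh-Lovell result equals the coefficient from regressing Y on X and C. The helper names are illustrative.

```python
# Counts from the deck's inverse probability weighting example:
# (C, X, Y) -> number of subjects.
counts = {
    (1, 1, 1): 27, (1, 1, 0): 3,
    (1, 0, 1): 12, (1, 0, 0): 18,
    (0, 1, 1): 21, (0, 1, 0): 9,
    (0, 0, 1): 2, (0, 0, 0): 8,
}


def slope(rows):
    """OLS slope of y on x for rows of (x, y, weight)."""
    n = sum(w for _, _, w in rows)
    x_bar = sum(x * w for x, _, w in rows) / n
    y_bar = sum(y * w for _, y, w in rows) / n
    sxy = sum((x - x_bar) * (y - y_bar) * w for x, y, w in rows)
    sxx = sum((x - x_bar) ** 2 * w for x, _, w in rows)
    return sxy / sxx


def group_mean(c_value, index):
    """Mean of X (index 1) or Y (index 2) among subjects with C == c_value."""
    cells = [(key[index], n) for key, n in counts.items() if key[0] == c_value]
    return sum(v * n for v, n in cells) / sum(n for _, n in cells)


# Regression of Y on X alone (no control).
naive = slope([(x, y, n) for (c, x, y), n in counts.items()])

# Regression of Y on X controlling for C: demean X and Y within each level
# of C, then regress the residuals on each other.
demeaned = [
    (x - group_mean(c, 1), y - group_mean(c, 2), n)
    for (c, x, y), n in counts.items()
]
controlled = slope(demeaned)

print(f"Without control: {naive:.2f}")
print(f"Controlling for C: {controlled:.2f}")
```

Without the control, the coefficient is just the raw difference in outcome rates between treated and untreated; with the control, it is built only from comparisons among subjects who share the same value of C.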
Matching
Goldfarb, A., Tucker, C. and Wang, Y., 2022. Conducting research in marketing with quasi-experiments. Journal of Marketing, 86(3), pp.1-20.
Matching
Propensity Score Matching
Matching
Propensity Score Stratification
Comparison within each stratum
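A minimal sketch of the within-stratum comparison, reusing the cell counts from the deck's inverse probability weighting example. It assumes a single binary covariate C, in which case the propensity score takes one value per level of C, so stratifying on C and stratifying on the propensity score coincide. Weighting strata by their sample share is one common choice, assumed here.

```python
# (stratum C, treatment X) -> (number of subjects, number with Y = 1)
cells = {
    (1, 1): (30, 27), (1, 0): (30, 12),
    (0, 1): (30, 21), (0, 0): (10, 2),
}
total = sum(n for n, _ in cells.values())

ate = 0.0
for c in (1, 0):
    n_t, y_t = cells[(c, 1)]
    n_c, y_c = cells[(c, 0)]
    within = y_t / n_t - y_c / n_c   # treated vs. untreated inside the stratum
    share = (n_t + n_c) / total      # stratum's share of the whole sample
    ate += share * within
    print(f"C={c}: difference={within:.2f}, share={share:.2f}")

print(f"Stratified estimate: {ate:.2f}")
```

Each stratum contributes its own treated-versus-untreated difference, and the overall estimate is their share-weighted average.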
Matching
Coarsened Exact Matching
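The idea can be sketched as: coarsen a covariate into bins, match exactly on the bins, discard bins that lack either treated or control units, and compare within the remaining bins. The records, the 10-year age bins, and the weighting by number of treated units are all hypothetical choices made for this illustration.

```python
from collections import defaultdict

# Hypothetical records: (treated?, age, outcome). All numbers are made up.
records = [
    (1, 23, 5.0), (1, 27, 6.0), (1, 34, 7.0), (1, 58, 9.0),
    (0, 21, 4.0), (0, 29, 4.0), (0, 36, 6.0), (0, 38, 5.0), (0, 45, 6.0),
]


def coarsen(age):
    """Coarsen age into 10-year bins (20-29 -> 20, 30-39 -> 30, ...)."""
    return (age // 10) * 10


def mean(values):
    return sum(values) / len(values)


# Group outcomes by coarsened age bin and treatment status.
strata = defaultdict(lambda: {1: [], 0: []})
for treated, age, outcome in records:
    strata[coarsen(age)][treated].append(outcome)

# Keep only bins that contain both treated and control units.
matched = {b: g for b, g in strata.items() if g[1] and g[0]}

# Effect on the matched treated units: within-bin differences,
# weighted by the number of treated units in each bin.
n_treated = sum(len(g[1]) for g in matched.values())
att = sum(
    len(g[1]) * (mean(g[1]) - mean(g[0])) for g in matched.values()
) / n_treated

print("Matched bins:", sorted(matched))
print("Estimated effect on matched treated units:", att)
```

In this toy data the 40s bin has no treated unit and the 50s bin has no control unit, so both are dropped; the estimate then describes only the units that could be matched.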
Weighting
[Figure: Matching compared with Weighting]
Inverse Probability Weighting
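Sample (100 subjects)
| C | X | Subjects | Y = 1 | Y = 0 |
| C = 1 (60) | X = 1 | 30 | 27 | 3 |
| C = 1 (60) | X = 0 | 30 | 12 | 18 |
| C = 0 (40) | X = 1 | 30 | 21 | 9 |
| C = 0 (40) | X = 0 | 10 | 2 | 8 |
[Figure: causal diagram with nodes C, X, and Y]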
Inverse Probability Weighting
[Figure: the same sample tree (C = 1: 60, C = 0: 40, split by X and then by Y), with the unobserved counterfactual branches marked "?"]
Selection on observables assumption: Treatment and control groups are comparable, conditional on the observed covariates (e.g., C)
Inverse Probability Weighting
Pseudo-Population (counts shown as Y = 1 / Y = 0)
| C | X | Observed subjects | Filled-in counterfactuals |
| C = 1 (60) | X = 1 | 27 / 3 | 27 / 3 |
| C = 0 (40) | X = 1 | 21 / 9 | 7 / 3 |
| C = 1 (60) | X = 0 | 12 / 18 | 12 / 18 |
| C = 0 (40) | X = 0 | 2 / 8 | 6 / 24 |
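Filling in the untreated units' counterfactual outcomes under treatment with the outcomes of treated units in the same stratum is equivalent to weighting each treated unit by the inverse of its probability of being treated. Symmetrically, filling in the treated units' counterfactuals with untreated outcomes is equivalent to weighting each untreated unit by the inverse of its probability of being untreated.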
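The equivalence can be checked on the slide's own counts. The sketch below estimates P(X = 1 | C) from the cell counts, weights each cell by the inverse probability of its own treatment status, and compares the outcome rates in the sample with those in the resulting pseudo-population. Function and variable names are illustrative.

```python
# Sample counts from the tree: (C, X, Y) -> number of subjects.
counts = {
    (1, 1, 1): 27, (1, 1, 0): 3,
    (1, 0, 1): 12, (1, 0, 0): 18,
    (0, 1, 1): 21, (0, 1, 0): 9,
    (0, 0, 1): 2, (0, 0, 0): 8,
}


def propensity(c):
    """P(X = 1 | C = c), estimated from the cell counts."""
    treated = sum(n for (ci, x, _), n in counts.items() if ci == c and x == 1)
    total = sum(n for (ci, _, _), n in counts.items() if ci == c)
    return treated / total


# Weight each cell by 1 / P(own treatment status | C).
pseudo = {}
for (c, x, y), n in counts.items():
    p = propensity(c)
    weight = 1 / p if x == 1 else 1 / (1 - p)
    pseudo[(c, x, y)] = n * weight


def risk(table, x):
    """P(Y = 1 | X = x) in a table of (C, X, Y) -> count."""
    ones = sum(n for (_, xi, y), n in table.items() if xi == x and y == 1)
    total = sum(n for (_, xi, _), n in table.items() if xi == x)
    return ones / total


naive = risk(counts, 1) - risk(counts, 0)
ipw = risk(pseudo, 1) - risk(pseudo, 0)

print(f"Sample difference:            {naive:.2f}")
print(f"Pseudo-population difference: {ipw:.2f}")
```

The weighted cell sizes reproduce the pseudo-population on the slide (for example, the 21 treated subjects with C = 0 and Y = 1 become 28, that is, 21 observed plus 7 filled in), and in the pseudo-population treatment is no longer associated with C.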
Inverse Probability Weighting
[Figure: the pseudo-population counts shown alongside the sample tree, as on the previous slide]
Inverse Probability Weighting
[Figure: the sample tree and the Pseudo-Population counts, as on the previous slides]
Inverse Probability Weighting
[Figure: the Pseudo-Population counts, with two causal diagrams over C, X, and Y, one labeled "Using Sample" and one labeled "Using Pseudo-Population"]
Weighting vs. Regression/Matching
Weighting
Regression/Matching
Comparison of Regression, Matching, and Weighting
| | Regression | Matching | Weighting |
| Pros | | | |
| Cons | | | |
| Common Limitation | Selection on observed covariates does not rule out potential selection on unobservables. It is critical to argue convincingly that the observed covariates account for selection on unobservables. |
CAVEAT: Last Resort for Causal Inference
A realistic perspective for such an approach is that “we can hope to infer causality” (Goldfarb and Tucker 2014).
Goldfarb, A. and Tucker, C.E., 2014. Conducting Research with Quasi-Experiments: A Guide for Marketers. Rotman School of Management Working Paper No. 2420920.
Altonji, J.G., Elder, T.E. and Taber, C.R., 2005. Selection on observed and unobserved variables: Assessing the effectiveness of Catholic schools. Journal of Political Economy, 113(1), pp.151-184.
Still, They Work in Service of Experimental Methods
Causal Hierarchy from the Perspective of Potential Outcomes
Level of Causal Inference
Meta-Analysis
Randomized Controlled Trial
Quasi-Experiment
Instrumental Variable
“Designed” Regression/Matching (based on causal knowledge or theory)
Regression/Matching (little causal inference)
Model-Free Descriptive Statistics (no causal inference)
Last Resort for Causal Inference
Toward Credibility
End of Document