Jiyong Park

Bryan School of Business and Economics

University of North Carolina at Greensboro

jiyong.park@uncg.edu

Causal Graph and Structural Causal Model

Korea Summer Workshop on Causal Inference 2022

Session Website: https://sites.google.com/view/causal-inference2022

Boot Camp for Beginners

Causal Graph and

Structural Causal Model

Causal Graph

: Directed Acyclic Graph and Bayesian Network

Causal Graph (Diagram)

    • Directed Acyclic Graph (DAG)
      • Graph: A structure made from nodes and edges
      • Directed: Direction represents a causal relationship between nodes
      • Acyclic: No directed cycles

    • Bayesian Network (Belief Network)
      • Probabilistic graphical model that represents a set of variables and their conditional dependencies

[Figure: example graph with a labeled node and edge]

Relationship Types in Causal Graph

    • (Direct) Causal Effect
    • Mediator (Chain): indirect causal effect
    • Confounder (Fork): common cause
    • Collider (Immorality): common effect

Lederer, D.J., Bell, S.C., Branson, R.D., Chalmers, J.D., Marshall, R., Maslove, D.M., Ost, D.E., Punjabi, N.M., Schatz, M., Smyth, A.R. and Stewart, P.W., 2019. Control of confounding and reporting of results in causal inference studies. Guidance for authors from editors of respiratory, sleep, and critical care journals. Annals of the American Thoracic Society, 16(1), pp.22-28.

Association in Causal Graph

    • Two nodes are associated if they share the same information.
      • Information of nodes flows in the direction of edges.
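    • Paths that carry noncausal association between X and Y (that is, paths other than the direct and indirect causal paths) are called backdoor paths; formally, a backdoor path from X to Y starts with an edge pointing into X.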

Examples of backdoor paths between X and Y

Association in Causal Graph

    • X and Y are d-separated if there is no information flow (no association) between X and Y.
      • All paths between X and Y are blocked.

    • X and Y are d-connected if there is information flow (association) between X and Y.
      • Not all paths between X and Y are blocked.

Blocking the information flow through X = conditioning on, or controlling for, X
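The d-separation definitions above can be checked mechanically on small graphs. The following is a minimal sketch in plain Python, assuming the graph is given as a dictionary from each node to its children; the function name `d_separated` and the three toy graphs are illustrative and not part of the workshop material. It uses the standard moralized-ancestral-graph test rather than enumerating paths.

```python
from itertools import combinations

def d_separated(dag, x, y, given=()):
    """Return True if x and y are d-separated by `given` in `dag`.

    `dag` maps every node to the list of its children.
    """
    given = set(given)
    parents = {v: {u for u in dag if v in dag[u]} for v in dag}

    # 1. Keep only x, y, the conditioning set, and all of their ancestors.
    keep, stack = {x, y} | given, [x, y, *given]
    while stack:
        for p in parents[stack.pop()]:
            if p not in keep:
                keep.add(p)
                stack.append(p)

    # 2. Moralize: link each node to its parents and link co-parents.
    nbrs = {v: set() for v in keep}
    for v in keep:
        for p in parents[v]:
            nbrs[v].add(p)
            nbrs[p].add(v)
        for a, b in combinations(parents[v], 2):
            nbrs[a].add(b)
            nbrs[b].add(a)

    # 3. x and y are d-separated iff every undirected path between them
    #    passes through the conditioning set.
    seen, stack = {x}, [x]
    while stack:
        v = stack.pop()
        if v == y:
            return False
        for w in nbrs[v] - seen - given:
            seen.add(w)
            stack.append(w)
    return True

# Illustrative three-node graphs (node names are assumptions)
chain = {"X": ["M"], "M": ["Y"], "Y": []}
fork = {"C": ["X", "Y"], "X": [], "Y": []}
collider = {"X": ["Z"], "Y": ["Z"], "Z": []}

for name, dag, middle in [("chain", chain, "M"), ("fork", fork, "C"),
                          ("collider", collider, "Z")]:
    print(name, d_separated(dag, "X", "Y"), d_separated(dag, "X", "Y", [middle]))
```

For the chain and the fork, X and Y are d-connected until the middle node is conditioned on; for the collider it is the other way around.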

Association in Causal Graph by Structure

    • Without conditioning, mediators and confounders create an association between nodes, but colliders do not.
    • Mediator (Chain), X → M → Y: information flows through M, so X and Y are d-connected.
    • Confounder (Fork), X ← C → Y: information flows through C, so X and Y are d-connected.
    • Collider (Immorality), X → Z ← Y: no information flows through Z, so X and Y are d-separated.

Association in Causal Graph by Structure

    • After conditioning on them (controlling for, or blocking), mediators and confounders no longer create an association between nodes, but colliders do.
    • Mediator (Chain), conditioning on M: the information flow is blocked, so X and Y are d-separated. In general, mediators should not be blocked; to estimate only the direct causal effect of X on Y, however, mediators should be blocked.
    • Confounder (Fork), conditioning on C: the information flow is blocked, so X and Y are d-separated. To estimate the direct or indirect causal effect of X on Y, confounders should be blocked.
    • Collider (Immorality), conditioning on Z: the information flow is opened, so X and Y are d-connected. To estimate the direct or indirect causal effect of X on Y, colliders should NOT be conditioned on.
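The behavior of the three structures before and after conditioning can be illustrated with simulated data. This is a minimal sketch assuming linear relationships with unit coefficients and standard normal noise (all of which are illustrative choices, not taken from the slides); "conditioning" is approximated by the partial correlation given the middle variable.

```python
import math
import random

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def partial_corr(a, b, c):
    """Correlation between a and b after linearly controlling for c."""
    rab, rac, rbc = corr(a, b), corr(a, c), corr(b, c)
    return (rab - rac * rbc) / math.sqrt((1 - rac ** 2) * (1 - rbc ** 2))

rng = random.Random(0)
n = 20000

def noise():
    return rng.gauss(0, 1)

# Mediator (chain): X -> M -> Y
x = [noise() for _ in range(n)]
m = [xi + noise() for xi in x]
y = [mi + noise() for mi in m]
chain = (corr(x, y), partial_corr(x, y, m))

# Confounder (fork): X <- C -> Y
c = [noise() for _ in range(n)]
x = [ci + noise() for ci in c]
y = [ci + noise() for ci in c]
fork = (corr(x, y), partial_corr(x, y, c))

# Collider: X -> Z <- Y
x = [noise() for _ in range(n)]
y = [noise() for _ in range(n)]
z = [xi + yi + noise() for xi, yi in zip(x, y)]
collider = (corr(x, y), partial_corr(x, y, z))

for name, (before, after) in [("chain", chain), ("fork", fork), ("collider", collider)]:
    print(f"{name:9s} unconditional={before:+.2f} conditional={after:+.2f}")
```

Under these assumptions the chain and the fork show a clear unconditional correlation that vanishes after conditioning, while the collider shows no unconditional correlation and a clearly negative one after conditioning on Z.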

Applications of Causal Graph

for Design-Based Approach

(1) Structure-Based Research Design

    • Controversies on the effect of estrogen on uterine cancer
      • Since the 1970s, many studies have documented a positive relationship between estrogen therapy and uterine cancer.
      • Some researchers raised concerns about the research designs.

“An estrogen therapy may lead to uterine bleeding, which allows to diagnose latent uterine cancers. Once the uterine bleeding is controlled for, we can identify the causal effect of the estrogen therapy on diagnosis of cancer.”

- Horwitz and Feinstein (@ Yale)

vs.

“Conditioning on the uterine bleeding could generate another noncausal association between the estrogen therapy and diagnosis of cancer.”

- Jick, Rothman, Walker (@ Harvard and Boston U.)

Robins, J.M., 2001. Data, Design, and Background Knowledge in Etiologic Inference. Epidemiology, pp.313-320.

EdX: Causal Diagrams: Draw Your Assumptions Before Your Conclusions (https://courses.edx.org/courses/course-v1:HarvardX+PH559x+3T2019)

    • Different causal diagrams and research designs for the effect of estrogen on uterine cancer

[Figure: Causal Graph 1, with nodes Estrogens, Uterine Cancer, and Diagnosis of Cancer; figure labels: Unobserved, Causal association]

[Figure: Causal Graph 2, with nodes Estrogens, Uterine Cancer, Diagnosis of Cancer, and Uterine Bleeding; figure labels: Unobserved, Causal association, Noncausal association 1]

Under Causal Graph 2, a research design that conditions on uterine bleeding is enough to identify the causal effect.

[Figure: Causal Graph 3, with nodes Estrogens, Uterine Cancer, Diagnosis of Cancer, and Uterine Bleeding; figure labels: Unobserved, Causal association, Noncausal association 1]

Under Causal Graph 3, a research design that conditions on uterine bleeding can open up another noncausal association (noncausal association 2).

In such cases (Causal Graph 3), a research design that removes the arrow from estrogens to uterine bleeding is required (e.g., using inverse probability weighting).

(2) Design of Control Variables / Conditioning Strategies

Level of Causal Inference (highest to lowest)

    • Meta-Analysis
    • Randomized Controlled Trial
    • Quasi-Experiment
    • Instrumental Variable
    • “Designed” Regression/Matching (based on causal knowledge or theory)
    • Regression/Matching (little causal inference)
    • Model-Free Descriptive Statistics (no causal inference)

Strategy labels in the figure: Selection on Unobservables Strategies; Selection on Observables Strategies

    • Example: Effect of dietary sodium intake on systolic blood pressure (Luque-Fernandez et al. 2019)

Luque-Fernandez, M.A., Schomaker, M., Redondo-Sanchez, D., Jose Sanchez Perez, M., Vaidya, A. and Schnitzer, M.E., 2019. Educational Note: Paradoxical collider effect in the analysis of non-communicable disease epidemiological data: a reproducible illustration and web application. International Journal of Epidemiology, 48(2), pp.640-653.

AGE = Age (years)

SOD = 24-hour dietary sodium intake

PRO = 24-hour excretion of urinary protein

SBP = Systolic blood pressure

[Figure: the true model and three estimated models: (a) the unconditional model, (b) conditioning on the confounder, and (c) conditioning on the confounder and the collider]
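The three estimated models can be imitated with a small simulation. This is a minimal sketch assuming, as in the cited example, that AGE is the confounder and PRO is the collider; the linear coefficients, noise levels, and sample size below are hypothetical values chosen for illustration, and the helper `ols` is a plain normal-equations solver written for this sketch.

```python
import random

def ols(y, *xs):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    cols = [[1.0] * len(y)] + [list(x) for x in xs]
    k = len(cols)
    # Augmented matrix [X'X | X'y]
    a = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(cols[i], y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for i in range(k):
        piv = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[piv] = a[piv], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]

rng = random.Random(0)
n = 5000

# Hypothetical data generation process (true effect of SOD on SBP is 1.0)
age = [rng.gauss(65, 5) for _ in range(n)]
sod = [0.05 * a + rng.gauss(0, 1) for a in age]                        # AGE -> SOD
sbp = [1.0 * s + 2.0 * a + rng.gauss(0, 1) for s, a in zip(sod, age)]  # SOD, AGE -> SBP
pro = [2.0 * b + 3.0 * s + rng.gauss(0, 1) for b, s in zip(sbp, sod)]  # SBP, SOD -> PRO

unconditional = ols(sbp, sod)[1]             # confounded by AGE
with_confounder = ols(sbp, sod, age)[1]      # backdoor path blocked
with_collider = ols(sbp, sod, age, pro)[1]   # collider path opened

print(f"unconditional:             {unconditional:.2f}")
print(f"conditioning on AGE:       {with_confounder:.2f}")
print(f"conditioning on AGE + PRO: {with_collider:.2f}")
```

With these assumed coefficients, only the model that conditions on the confounder alone recovers an estimate near the true effect; adding the collider flips the sign of the estimate.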

Tafti, A. and Shmueli, G., 2020. Beyond overall treatment effects: Leveraging covariates in randomized experiments guided by causal structure. Information Systems Research, 31(4), pp.1183-1199.

“Causal diagrams help avoid common pitfalls in deciding which subset of variables to include as controls and which variables have a posttreatment or mediating role, thereby requiring a special way of incorporation into the analysis that differs from simply being included as control variables. Without causal diagrams, it can be hard to know the researchers’ assumptions and how each measured (or unobserved) variable fits into the causal story.” (Tafti and Shmueli 2020, p. 1189)

Cinelli, C., Forney, A. and Pearl, J., 2021. A crash course in good and bad controls. Sociological Methods & Research, p.00491241221099552.

“In all cases, structural knowledge is indispensable for deciding whether a variable is a good or bad control, and graphical models provide a natural language for articulating such knowledge, as well as efficient tools for examining its logical ramifications.” (Cinelli et al. 2021, p. 15)

Examples of good controls

Examples of bad controls

(3) Communicating Identification Assumptions

  • Example: Identification assumptions of instrumental variables (IVs)

(1) The IVs should be correlated with the endogenous treatment variable (relevance).

(2) The IVs should not be correlated with the error term in the explanatory equation (an unverifiable statistical assumption).

      • Exclusion restriction: The IVs do not affect the outcome except through the treatment variable.
      • Exogeneity of IV: The IVs do not share any confounders with the outcome.

“Although these conditions are statistically similar, it is important to consider them separately in order to incorporate subject-matter knowledge in discussions about their validity and in decisions on adjustments.” (Swanson and Hernán 2013, p. 371)

Swanson, S.A. and Hernán, M.A., 2013. Commentary: how to report instrumental variable analyses (suggestions welcome). Epidemiology, 24(3), pp.370-374.
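To see what the IV assumptions buy, the following minimal sketch simulates a setting where they hold by construction: the instrument Z affects the outcome only through the treatment T and shares no confounder with the outcome, while an unobserved U confounds T and Y. The variable names, coefficients, and the simple ratio (Wald) estimator are illustrative assumptions.

```python
import random

def cov(a, b):
    """Sample covariance between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (n - 1)

rng = random.Random(0)
n = 20000

z = [rng.gauss(0, 1) for _ in range(n)]   # instrument
u = [rng.gauss(0, 1) for _ in range(n)]   # unobserved confounder
t = [0.8 * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]        # relevance: Z -> T
y = [2.0 * ti + 3.0 * ui + rng.gauss(0, 1) for ti, ui in zip(t, u)]  # true effect = 2.0

naive = cov(t, y) / cov(t, t)        # regression slope of Y on T, biased by U
iv_estimate = cov(z, y) / cov(z, t)  # ratio (Wald) IV estimator

print(f"naive slope: {naive:.2f}")
print(f"IV estimate: {iv_estimate:.2f}")
```

In this simulated setting the naive slope overstates the effect, while the IV ratio is close to the assumed true value of 2.0. If Z were also driven by U, or affected Y directly, the ratio would no longer recover the effect, which is why the two parts of assumption (2) have to be argued from subject-matter knowledge.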

(4) Transportability: From RCTs to Observational Studies

Pearl, J. and Bareinboim, E., 2014. External Validity: From Do-Calculus to Transportability Across Populations. Statistical Science, pp.579-595.

Prosperi, M., Guo, Y., Sperrin, M., Koopman, J.S., Min, J.S., He, X., Rich, S., Wang, M., Buchan, I.E. and Bian, J., 2020. Causal inference and counterfactual prediction in machine learning for actionable healthcare. Nature Machine Intelligence, 2(7), pp.369-375.

“License to transfer causal effects learned in experimental studies to a new population, in which only observational studies can be conducted” (Pearl and Bareinboim 2014, p. 579)

Structural Causal Model

Causal Inference ≅ How to Address Endogeneity

  • Various approaches in causal inference depending on how to address endogeneity

1. Research Design for Causal Inference

    • Randomized Controlled Trial
    • (Natural) Quasi-Experiment
    • Local Average Treatment Effect (LATE)

2. Selection Model (Statistical Modeling)

3. Causal Graph (Graphical Modeling)

[Figure: a treatment group with a grant and a control group without a grant; figure labels: Selection Process (Data Generation Process), Causal effect of grant!]

Structural Causal Model = Probabilistic Causal Mechanisms

Bareinboim, E., Correa, J.D., Ibeling, D. and Icard, T., 2022. On Pearl’s hierarchy and the foundations of causal inference. In Probabilistic and Causal Inference: The Works of Judea Pearl (pp. 507-556). (https://www.causalai.net/r60.pdf)

To answer questions at Layer i, one needs knowledge at Layer i or higher.

Data Generation Process

Causal Inference with SCM

“The problem of causal inference is thus to perform inferences across layers of the hierarchy (Fig. 1.2(b)) from a partial understanding of the SCM (Fig. 1.2(c)).”

(Bareinboim et al. 2022)

Definition of Causal Effect Using do-operator

[Figure: identification converts interventional distributions into conditional distributions. How to convert? do-calculus]
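As a point of reference, one common way to write a causal effect with the do-operator is shown below. This is a standard textbook formulation, not necessarily the notation used on the slide; it assumes a binary treatment T and an outcome Y.

```latex
\begin{align}
\text{ATE} &= \mathbb{E}[\,Y \mid do(T = 1)\,] - \mathbb{E}[\,Y \mid do(T = 0)\,] \\
P(y \mid do(t)) &\neq P(y \mid t) \quad \text{in general (e.g., under confounding)}
\end{align}
```

The first line is an interventional quantity; identification asks when it can be rewritten purely in terms of conditional distributions such as P(y | t, c).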

Causal Effect

Source: Brady Neal’s lecture notes

(https://www.bradyneal.com/causal-inference-course)

Revisiting Random Assignment

    • Random assignment of the treatment T acts like the do(T) operator.

Source: Brady Neal’s lecture notes

(https://www.bradyneal.com/causal-inference-course)
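The claim that random assignment acts like do(T) can be illustrated by simulating the same outcome model twice: once with treatment chosen as a function of a confounder C, and once with treatment assigned by a coin flip. This is a minimal sketch; the outcome model, the coefficients, and the true effect of 1.0 are assumptions made for illustration.

```python
import random

rng = random.Random(1)
n = 20000

def mean(values):
    return sum(values) / len(values)

def diff_in_means(t, y):
    """Mean outcome of the treated minus mean outcome of the controls."""
    treated = [yi for ti, yi in zip(t, y) if ti == 1]
    control = [yi for ti, yi in zip(t, y) if ti == 0]
    return mean(treated) - mean(control)

def simulate(randomized):
    t, y = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)                       # confounder
        if randomized:
            ti = int(rng.random() < 0.5)          # coin flip: C no longer affects T
        else:
            ti = int(c + rng.gauss(0, 1) > 0)     # self-selection: C -> T
        yi = 1.0 * ti + 2.0 * c + rng.gauss(0, 1) # true effect of T is 1.0
        t.append(ti)
        y.append(yi)
    return diff_in_means(t, y)

observational = simulate(randomized=False)
experimental = simulate(randomized=True)

print(f"observational difference in means: {observational:.2f}")
print(f"randomized difference in means:    {experimental:.2f}")
```

Randomization removes the arrow from C to T, so the simple difference in means under random assignment is close to the assumed effect, while the observational difference is inflated by the confounder.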

Identification of Causal Effect

    • Transforming interventional distributions (with do-operator) into estimable conditional distributions

[Figure: identification maps interventional distributions to conditional distributions]

Source: Brady Neal’s lecture notes

(https://www.bradyneal.com/causal-inference-course)

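    • Backdoor criterion: A set of variables satisfies the backdoor criterion if it blocks all noncausal (backdoor) paths between X and Y while leaving the causal paths open; in particular, it must not contain descendants of X.
    • Backdoor adjustment: Conditioning on a set of variables that satisfies the backdoor criterion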

Backdoor Adjustment

Identification

Source: Brady Neal’s lecture notes

(https://www.bradyneal.com/causal-inference-course)
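For a single discrete confounder C that satisfies the backdoor criterion, the backdoor adjustment can be written as P(y | do(t)) = Σ_c P(y | t, c) P(c). The sketch below applies this formula to simulated binary data; the data generation process and its probabilities are hypothetical, with a true effect of 0.3 built in.

```python
import random

rng = random.Random(2)
n = 50000

# Hypothetical binary data: C -> T, C -> Y, T -> Y
data = []
for _ in range(n):
    c = int(rng.random() < 0.5)
    t = int(rng.random() < (0.8 if c else 0.2))
    y = int(rng.random() < 0.1 + 0.3 * t + 0.5 * c)  # effect of T on P(Y=1) is 0.3
    data.append((c, t, y))

def p_y_given(t, c=None):
    """Estimate P(Y=1 | T=t) or, if c is given, P(Y=1 | T=t, C=c)."""
    rows = [y for (ci, ti, y) in data if ti == t and (c is None or ci == c)]
    return sum(rows) / len(rows)

def p_c(c):
    """Estimate P(C=c)."""
    return sum(1 for (ci, _, _) in data if ci == c) / len(data)

def p_y_do(t):
    """Backdoor adjustment: sum over c of P(Y=1 | t, c) * P(c)."""
    return sum(p_y_given(t, c) * p_c(c) for c in (0, 1))

naive = p_y_given(1) - p_y_given(0)   # conditional contrast, confounded by C
adjusted = p_y_do(1) - p_y_do(0)      # interventional contrast

print(f"naive contrast:    {naive:.2f}")
print(f"adjusted contrast: {adjusted:.2f}")
```

The adjusted contrast is close to the assumed effect of 0.3, whereas the naive conditional contrast mixes the effect of T with the backdoor path through C.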

Identification of Causal Effect Using do-calculus (with Graph)

    • Generalizing identification beyond the backdoor criterion

Source: Causal Inference under the Rubric of Structural Causal Model (Yonghan Jung)

Estimation of Causal Effect

Source: Causal Inference under the Rubric of Structural Causal Model (Yonghan Jung)

Further Reading and Advanced Topics

Bareinboim, E., Correa, J.D., Ibeling, D. and Icard, T., 2022. On Pearl’s hierarchy and the foundations of causal inference. In Probabilistic and Causal Inference: The Works of Judea Pearl (pp. 507-556). (https://www.causalai.net/r60.pdf)

Potential Outcome Framework vs. Structural Causal Model

    • Gold Standard: Random Assignment
    • Causal Inference Using Observational Data
      • (1) Identification (Is it possible to estimate a causal effect?)
        • Potential Outcome Framework: Research Design
        • Structural Causal Model: Backdoor Criterion / do-Calculus
      • (2) Estimation (How to estimate a causal effect using data?)
        • Potential Outcome Framework: Statistical/Econometrics Methods (DID, RD, Matching, IV, SC, etc.)
        • Structural Causal Model: Statistical/Computational Methods (IPW, Doubly Robust Estimators, Double ML, etc.)

Difference (1) Manipulability

    • Does gender cause the increase or decrease in employment?

[Figure: three causal diagrams with the nodes (a) Gender and Employment Rate; (b) Organizational Policy (e.g., parental leave) and Employment Rate; (c) Gender, Picture Put in Resume, and Employment Rate]

Research Design Assumes Manipulability

    • Design-based approach presumes hypothetical randomized experiments.

    • Causation can be defined and quantified only for the treatments that can be designed and manipulated.

“Without treatment definitions that specify actions to be performed on experimental units, we cannot unambiguously define causal effects of treatments.” (Rubin 1978, p. 39)

“No Causation without Manipulation” (Holland 1986, p. 959)

“Over 65 years ago, Haavelmo submitted the following complaint to the readers of Econometrica (1944, p. 14): “A design of experiments (a prescription of what the physicists call a ‘crucial experiment’) is an essential appendix to any quantitative theory. And we usually have some such experiment in mind when we construct the theories, although—unfortunately—most economists do not describe their design of experiments explicitly.”” (Angrist and Pischke 2010, p. 16)

Angrist, J.D. and Pischke, J.S., 2010. The credibility revolution in empirical economics: How better research design is taking the con out of econometrics. Journal of Economic Perspectives, 24(2), pp.3-30.

Rubin, D.B., 1978. Bayesian inference for causal effects: The role of randomization. The Annals of Statistics, pp.34-58.

Holland, P.W., 1986. Statistics and causal inference. Journal of the American Statistical Association, 81(396), pp.945-960.

Manipulability Plays No Role in Causal Graph

“Manipulability theories of causation, according to which causes earn their meaning and usefulness by transmitting change from actions to effects have had considerable intuitive appeal among scientists and philosophers [1–4]. The rise of Fisher’s RCT to the “gold standard” of experimental science further entrenched manipulability as a prerequisite for causation. In some communities, this entrenchment has turned into a dogma, cast for example in the mantra “no causation without manipulation” [5] that has led to cultural prohibition on labeling sex or race as “causes.”

Other research camps have been more tolerant to causal labels. In the structural causal model (SCM) framework, for example, manipulations are merely convenient means of interrogating nature, and causal relations enjoy independent existence, oblivious to external interventions [6–9]. In this framework, variables earn causal character through their capacity to sense and respond to changes in other variables. For example the variable “sex” earns the label “cause” by virtue of having responders such as “hormone content” or “height” which are gender dependent.” (Pearl 2018, p. 1)

Pearl, J., 2018. Does obesity shorten life? Or is it the soda? On non-manipulable causes. Journal of Causal Inference, 6(2).

Difference (2) Causal Structure/Knowledge

    • Example: Maternal status and birth outcome

(1) Graph-based approach leveraging the causal structure/knowledge (Vahratian et al. 2005)

Vahratian, A., Siega-Riz, A.M., Savitz, D.A. and Zhang, J., 2005. Maternal pre-pregnancy overweight and obesity and the risk of cesarean delivery in nulliparous women. Annals of Epidemiology, 15(7), pp.467-474.

(2) Design-based approach without imposing the causal structure/knowledge (Torche 2011)

Torche, F., 2011. The effect of maternal stress on birth outcomes: Exploiting a natural experiment. Demography, 48(4), pp.1473-1491.

“This article examines the effect of one such condition - prenatal maternal stress - on birth weight, an early outcome shown to affect cognitive, educational, and socioeconomic attainment later in life. Exploiting a major earthquake as a source of acute stress and using a difference-in-difference methodology, I find that maternal exposure to stress results in a significant decline in birth weight and an increase in the proportion of low birth weight.” (Torche 2011, p. 1473)

Treatment Group

Control Group

Policy-Based vs. Knowledge-Based Causation

Manipulability

Causal Structure/Knowledge

Source: Kosuke Imai’s Lecture Note (Harvard U)

Causal Discovery

: Identifying Causal Relationships from Data

Toward Knowledge Discovery

Theory → Evidence (Data)

Evidence (Data) → Theory

Toward Causal Knowledge Discovery (as a Graph)

    • Causal (structure) discovery is the problem of identifying causal relationships from large quantities of data through computational methods.

Data Generation Process and Causal Discovery

  • Data Generation Process: Causal Graph → Data
  • Causal Discovery: Data → Causal Graph

Ma, S. and Statnikov, A., 2017. Methods for computational causal discovery in biomedicine. Behaviormetrika, 44(1), pp.165-191.

Causal Effect Identification and Estimation

Overall Structure of Causal Discovery

(1) What assumptions are required? (+ Acyclicity for DAG)

(2) What is the Markov equivalence class?

(3) How to learn causal structures?

(4) How to test conditional independence?

Causal Markov and Faithfulness Assumptions

  • Causal Markov Assumption: Conditional on its direct causes (parents), a node is independent of all variables that are not its descendants.
  • Faithfulness Assumption: Nodes that are connected (d-connected) in the graph are probabilistically dependent; that is, the only independencies in the data are those implied by the graph.

[Figure labels: Causal Markov Assumption; Causal Markov Assumption + Faithfulness Assumption]

Violation of Faithfulness Assumption

Source: Brady Neal’s lecture notes

  • Distinct causal paths that have opposite effects could cancel out each other.
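A cancellation of this kind is easy to construct. The sketch below assumes a linear model in which the direct effect of X on Y is exactly offset by the indirect effect through M (the coefficients +1, +1, and -1 are hypothetical); X then causes Y but is uncorrelated with it, which is what a faithfulness violation looks like in data.

```python
import math
import random

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def partial_corr(a, b, c):
    """Correlation between a and b after linearly controlling for c."""
    rab, rac, rbc = corr(a, b), corr(a, c), corr(b, c)
    return (rab - rac * rbc) / math.sqrt((1 - rac ** 2) * (1 - rbc ** 2))

rng = random.Random(3)
n = 20000

x = [rng.gauss(0, 1) for _ in range(n)]
m = [1.0 * xi + rng.gauss(0, 1) for xi in x]          # X -> M (+1)
y = [-1.0 * xi + 1.0 * mi + rng.gauss(0, 1)           # X -> Y (-1), M -> Y (+1)
     for xi, mi in zip(x, m)]

marginal = corr(x, y)            # indirect (+1) and direct (-1) effects cancel
given_m = partial_corr(x, y, m)  # the direct effect reappears once M is held fixed

print(f"corr(X, Y)     = {marginal:+.2f}")
print(f"corr(X, Y | M) = {given_m:+.2f}")
```

A discovery algorithm that relies on independence tests would see X and Y as unconditionally independent here, even though the graph contains both a direct and an indirect causal path between them.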

Recall the Conditional (In-)Dependence

    • Immorality (“V” structure) is the only structure that exhibits unconditional independence but conditional dependence.
    • Mediator (Chain)
    • Confounder (Fork): Common Cause
    • Collider (Immorality): Common Effect

Markov Equivalence Class

  • Markov Equivalence Class: A set of DAGs that encode the same set of conditional independencies.

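Eberhardt, F., 2016. Introduction to the foundations of causal discovery. International Journal of Data Science and Analytics, 2(3), pp.81-91.

The “V” structures (colliders, immoralities) play a critical role: chains and forks over the same three nodes share one Markov equivalence class, whereas a collider is the only structure in its class, so its edge directions can be identified from conditional independencies.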
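A well-known graphical characterization states that two DAGs over the same nodes are Markov equivalent exactly when they have the same skeleton and the same “V” structures. The sketch below checks this for three-node examples; the edge-list representation and the function names are illustrative choices.

```python
from itertools import combinations

def skeleton(edges):
    """Undirected version of a list of directed edges (parent, child)."""
    return {frozenset(edge) for edge in edges}

def v_structures(edges):
    """Unshielded colliders a -> child <- b, where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(edges1, edges2):
    """Same skeleton and same v-structures (assumes the same node set)."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))

chain_xy = [("X", "M"), ("M", "Y")]   # X -> M -> Y
chain_yx = [("Y", "M"), ("M", "X")]   # X <- M <- Y
fork = [("M", "X"), ("M", "Y")]       # X <- M -> Y
collider = [("X", "M"), ("Y", "M")]   # X -> M <- Y

print(markov_equivalent(chain_xy, chain_yx))
print(markov_equivalent(chain_xy, fork))
print(markov_equivalent(chain_xy, collider))
```

The two chains and the fork fall into one equivalence class, while the collider is not equivalent to any of them.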

Causal Discovery Algorithms

  • Causal discovery algorithms can be classified into two types: (i) constraint-based and (ii) score-based.
    • Constraint-based algorithms are based on conditional independence constraints.
      • PC Algorithm (Peter Spirtes and Clark Glymour): assuming no unobserved confounders
      • FCI Algorithm (Fast Causal Inference): allowing for unobserved confounders
    • Score-based algorithms generate a number of candidate causal graphs, assign a score to each, and select a final graph based on the scores.
      • GES Algorithm (Greedy Equivalence Search)

Conditional Independence Tests

  • Conditional independence tests depend on the distributions of variables in Bayesian networks (causal graphs).

1) Discrete Bayesian networks (categorical variables)

2) Discrete Bayesian networks (ordered factors)

3) Gaussian Bayesian networks (continuous normal variables)

4) Non-Gaussian Bayesian networks (continuous variables)
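For the Gaussian case, a commonly used conditional independence test is based on the partial correlation with Fisher's z-transformation. The sketch below shows that test for a single conditioning variable on simulated fork data; the helper names and the simulated coefficients are assumptions, and the p-value uses the normal approximation.

```python
import math
import random

def corr(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def partial_corr(a, b, c):
    """Correlation between a and b after linearly controlling for c."""
    rab, rac, rbc = corr(a, b), corr(a, c), corr(b, c)
    return (rab - rac * rbc) / math.sqrt((1 - rac ** 2) * (1 - rbc ** 2))

def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for H0: (partial) correlation is zero.

    r: sample (partial) correlation, n: sample size,
    k: number of conditioning variables.
    """
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))

rng = random.Random(4)
n = 2000

# Fork: X <- C -> Y, so X and Y are dependent, but independent given C
c = [rng.gauss(0, 1) for _ in range(n)]
x = [ci + rng.gauss(0, 1) for ci in c]
y = [ci + rng.gauss(0, 1) for ci in c]

p_marginal = fisher_z_pvalue(corr(x, y), n, k=0)
p_given_c = fisher_z_pvalue(partial_corr(x, y, c), n, k=1)

print(f"p-value, X indep. of Y:     {p_marginal:.3g}")
print(f"p-value, X indep. of Y | C: {p_given_c:.3g}")
```

A small p-value rejects independence. Tests for categorical or non-Gaussian data need different statistics, which is why the choice of test depends on the type of Bayesian network.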

Causal Discovery Algorithms (1) PC Algorithm

  • Step 1. Start with a complete undirected graph.
  • Step 2. Eliminate edges between variables that are unconditionally independent.
  • Step 3. For each pair of variables that still share an edge, eliminate the edge if they are independent conditional on a subset of the variables adjacent to them (increasing the subset size from 1 to n).
  • Step 4. Identify “V” structures (colliders, immoralities) and orient their edges.
  • Step 5. Orient the remaining edges so that no new colliders are created (i.e., orientation propagation).

Ground Truth

Skeleton
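The steps above can be sketched compactly when an independence "oracle" is available. In this minimal illustration the oracle is d-separation in a known ground-truth DAG (a hypothetical four-node graph chosen here), so no statistical testing is involved; with real data the oracle would be replaced by a conditional independence test. Step 5 is implemented with only the simplest propagation rule, so this is a simplification of the full algorithm.

```python
from itertools import combinations

def d_separated(dag, x, y, given=()):
    """Oracle test: True if x and y are d-separated by `given` in `dag`
    (`dag` maps every node to the list of its children)."""
    given = set(given)
    parents = {v: {u for u in dag if v in dag[u]} for v in dag}
    keep, stack = {x, y} | given, [x, y, *given]
    while stack:                                   # ancestral set
        for p in parents[stack.pop()]:
            if p not in keep:
                keep.add(p)
                stack.append(p)
    nbrs = {v: set() for v in keep}
    for v in keep:                                 # moral graph
        for p in parents[v]:
            nbrs[v].add(p)
            nbrs[p].add(v)
        for a, b in combinations(parents[v], 2):
            nbrs[a].add(b)
            nbrs[b].add(a)
    seen, stack = {x}, [x]
    while stack:                                   # reachability avoiding `given`
        v = stack.pop()
        if v == y:
            return False
        for w in nbrs[v] - seen - given:
            seen.add(w)
            stack.append(w)
    return True

def pc(nodes, indep):
    """Return (skeleton adjacency, set of directed edges)."""
    adj = {v: set(nodes) - {v} for v in nodes}     # Step 1: complete graph
    sepset, size = {}, 0
    while any(len(adj[x]) - 1 >= size for x in nodes):   # Steps 2 and 3
        for x in nodes:
            for y in list(adj[x]):
                for s in combinations(sorted(adj[x] - {y}), size):
                    if indep(x, y, s):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(s)
                        break
        size += 1
    directed = set()
    for z in nodes:                                # Step 4: orient v-structures
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sepset[frozenset((x, y))]:
                directed |= {(x, z), (y, z)}
    changed = True
    while changed:                                 # Step 5: a -> b - c becomes b -> c
        changed = False
        for a, b in list(directed):
            for c in adj[b] - {a}:
                undirected = (b, c) not in directed and (c, b) not in directed
                if undirected and c not in adj[a]:
                    directed.add((b, c))
                    changed = True
    return adj, directed

# Hypothetical ground truth: X -> Z <- Y, Z -> W
truth = {"X": ["Z"], "Y": ["Z"], "Z": ["W"], "W": []}
skeleton_adj, directed_edges = pc(list(truth), lambda x, y, s: d_separated(truth, x, y, s))
print(sorted(directed_edges))
```

For this particular ground truth, the collider at Z fixes the two incoming edges, and propagation then orients Z → W, so the whole graph is recovered. In general only the Markov equivalence class is identified and some edges remain undirected.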

Causal Discovery Algorithms (2) FCI Algorithm

  • The FCI algorithm is similar to the PC algorithm, but it additionally allows for an unmeasured confounder between nodes, except in the “Y” structures.

Note that causal discovery algorithms do not necessarily provide complete causal information.

PC algorithm

FCI algorithm

[Figure: the ground truth with an unmeasured confounder, the graph after removing conditional independence, and the graph after orienting the “V” structures; the remaining edge mark can be an arrow head or tail]

When will it become an arrow tail (i.e., causal effect of X on Y)?

[Figure: the ground truth with an unmeasured confounder, the graph after removing conditional independence, and the graph after orienting the “V” structures, each with nodes A and B]

If there is an unmeasured confounder between X and Y, A (or B) and Y cannot be independent conditional on X.

Causal Discovery Algorithms (3) GES Algorithm

  • Step 1. Start with an empty graph containing no edges.
  • Step 2. Greedily add edges (dependencies) one at a time, in the orientation that most improves a fit score such as the Bayesian Information Criterion (BIC) (the lower, the better the fit).
  • Step 3. Map the resulting model to the corresponding Markov equivalence class.
  • Step 4. Repeat Steps 2 and 3 until the score can no longer be improved.
  • Step 5. Remove edges one at a time as long as doing so improves the score (e.g., decreases the BIC).
  • Step 6. Repeat Step 5 until no further edges can be removed.

Summary of Causal Discovery Algorithms

    • LiNGAM: linear, non-Gaussian, acyclic model
    • PNL: post-nonlinear causal model
    • ANM: nonlinear additive noise model
    • FCM: functional causal model

Glymour, C., Zhang, K. and Spirtes, P., 2019. Review of causal discovery methods based on graphical models. Frontiers in Genetics, 10, p.524.

Practices for Causal Discovery

“Practical causal analysis is not a matter of pressing a few buttons. There are multiple algorithms available, many of them are poorly tested, some of them are poor implementations of good algorithms, some of them are just plain poor algorithms, all of them have choices of parameters, and all of them have conditions on the data distributions and other assumptions under which they will be informative rather than misleading.” (Glymour et al. 2019, p. 11)

Glymour, C., Zhang, K. and Spirtes, P., 2019. Review of causal discovery methods based on graphical models. Frontiers in Genetics, 10, p.524.

Bridging the Two Worlds

  • One of the most frequently cited objections to structural causal models among social scientists is that the causal structures representing the phenomena of interest cannot be validated.

“In general it is easy to come up with arguments for the presence of links: as anyone who has attended an empirical economics seminar knows, the difficult part is coming up with an argument for the absence of such effects that convinces the audience.” (Imbens 2020, p. 1140)

Imbens, G.W., 2020. Potential outcome and directed acyclic graph approaches to causality: Relevance for empirical practice in economics. Journal of Economic Literature, 58(4), pp.1129-79.

Can causal discovery be a remedy for this concern?

(maybe not yet, but it will be in the near future)
