1 of 28

Causal Design Patterns

Emily Riederer

@emilyriederer

February 15, 2021

Based on emily.rbind.io/post/causal-design-patterns/

2 of 28

[Figure: grid of individuals labeled A or B]

4 of 28

Why observational causal inference?

Can’t test

  • Ethics
  • Reputational risk
  • Logistics
  • Made a mistake

Expensive to test

  • Direct costs
  • Implementation cost
  • Opportunity cost

Why wait?

  • Long term endpoints
  • More historical variants

5 of 28

Unifying themes

  • Compare potential outcomes of the observed versus the counterfactual
  • Create a counterfactual by exploiting any semi-random variation
  • Exploit variation in distribution, in assignment, and across time

6 of 28

Four strategies for causal inference

7 of 28

When we have imbalance...

When you have:

- “similar” treated and untreated individuals

- different distributions

- on few relevant dimensions

Tries to:

Rebalance to make groups more comparable

8 of 28

Stratification overview

Assumptions:

  • All common causes of treatment and outcome are accounted for
  • All observations have positive probability of treatment
  • Few variables require adjustment

Recipe:

  • Bin population by subgroups
  • Calculate average by group
  • Weight average across groups
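
The recipe above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the records, the numbers, and the function names (`naive_effect`, `stratified_effect`) are all hypothetical, and the strata are weighted by their share of the whole population.

```python
from collections import defaultdict

# Hypothetical records: (browser, saw_button, spend). All values are made up.
records = [
    ("chrome", True, 120.0), ("chrome", True, 100.0),
    ("chrome", False, 90.0), ("chrome", False, 110.0),
    ("mozilla", True, 70.0),
    ("mozilla", False, 50.0), ("mozilla", False, 60.0), ("mozilla", False, 40.0),
]


def mean(values):
    return sum(values) / len(values)


def naive_effect(rows):
    """Unadjusted difference in average outcome, treated minus untreated."""
    treated = [y for _, t, y in rows if t]
    untreated = [y for _, t, y in rows if not t]
    return mean(treated) - mean(untreated)


def stratified_effect(rows):
    """Bin by subgroup, average within each bin, then weight across bins."""
    # Step 1: bin the population by subgroup and treatment status.
    strata = defaultdict(lambda: {True: [], False: []})
    for stratum, treated, outcome in rows:
        strata[stratum][treated].append(outcome)

    total = len(rows)
    effect = 0.0
    for groups in strata.values():
        # Step 2: treated-minus-untreated average within the subgroup.
        # Needs both groups present in every subgroup (positive probability
        # of treatment); otherwise mean() divides by zero.
        diff = mean(groups[True]) - mean(groups[False])
        # Step 3: weight by the subgroup's share of the population.
        share = (len(groups[True]) + len(groups[False])) / total
        effect += share * diff
    return effect


print(naive_effect(records))       # unadjusted comparison
print(stratified_effect(records))  # adjusted for browser mix
```

In this made-up data the treated group is mostly Chrome users, who spend more, so the unadjusted comparison comes out larger than the stratified one.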

9 of 28

Stratification application

Scenario:

  • Attempt to A/B test “one-click instant checkout” on Black Friday
  • Due to a glitch, Chrome users see the button 50% of the time but Mozilla users only 30%
  • Mozilla users spend less on average

[Figure labels: Saw button; Didn’t see; Chrome; Mozilla]

10 of 28

When we have imbalance along many dimensions...

When you have:

- “similar” treated and untreated individuals

- different distributions

- on many dimensions

Tries to:

Rebalance to make groups more comparable

11 of 28

Propensity Score Weighting overview

Assumptions:

  • All common causes of treatment and outcome are accounted for
  • All observations have positive probability of treatment

Recipe:

  • Model probability of receiving treatment based on traits
  • Derive weights from predicted probabilities
  • Apply weights when calculating average outcome by group
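
A minimal sketch of the three recipe steps, under loud assumptions: the customer data is invented, the only trait is a single yes/no activity flag, and the "model" of treatment probability is just the treated share within each trait value (a real analysis would more likely fit something like a logistic regression on many traits). The weights are chosen to reweight the untreated group to resemble the treated group, which is one common choice among several.

```python
from collections import defaultdict

# Hypothetical customers: (is_active, has_phone, purchased). Values are made up.
# In this sketch, having a phone number on file is the same as being treated.
customers = (
    [(True, True, 1)] * 6 + [(True, True, 0)] * 2       # active, have phone
    + [(True, False, 1)] * 1 + [(True, False, 0)] * 1   # active, no phone
    + [(False, True, 1)] * 1 + [(False, True, 0)] * 3   # inactive, have phone
    + [(False, False, 0)] * 6                           # inactive, no phone
)


def propensity_by_trait(rows):
    """Step 1: estimate P(treated | trait) as the treated share per trait value."""
    counts = defaultdict(lambda: [0, 0])  # trait -> [n_treated, n_total]
    for trait, treated, _ in rows:
        counts[trait][0] += int(treated)
        counts[trait][1] += 1
    return {trait: n_treated / n_total
            for trait, (n_treated, n_total) in counts.items()}


def weighted_effect_on_treated(rows):
    """Steps 2 and 3: derive weights, then compare weighted group averages."""
    propensity = propensity_by_trait(rows)
    treated_outcomes = [y for _, treated, y in rows if treated]

    weighted_sum = 0.0
    weight_total = 0.0
    for trait, treated, y in rows:
        if not treated:
            e = propensity[trait]
            # Untreated units get weight e / (1 - e); treated units keep weight 1.
            # Requires 0 < e < 1 for every trait value (positivity).
            w = e / (1.0 - e)
            weighted_sum += w * y
            weight_total += w

    treated_mean = sum(treated_outcomes) / len(treated_outcomes)
    reweighted_untreated_mean = weighted_sum / weight_total
    return treated_mean - reweighted_untreated_mean


print(weighted_effect_on_treated(customers))
```

Because the untreated customers here skew inactive, the reweighting shifts weight toward the few active untreated customers before the two groups are compared.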

12 of 28

Propensity Score Weighting application

Scenario:

  • We sent a text message to all customers with a valid number and want to measure the effect on likelihood of purchase
  • Only customers for whom we lack a phone number are untreated
  • Number-less customers are also less active on average

[Figure labels: Have phone; No phone; No phone - Reweighted; axis: P(Phone|X)]

13 of 28

Propensity Score Weighting math
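
One common way to write propensity score weights, stated here as an assumption rather than as the slides' own notation. The symbols are introduced for illustration: T_i is the treatment indicator, X_i the observed traits, Y_i the outcome, and e(X_i) the propensity score.

```latex
e(X_i) = P(T_i = 1 \mid X_i)

w_i^{\mathrm{ATE}} = \frac{T_i}{e(X_i)} + \frac{1 - T_i}{1 - e(X_i)}

w_i^{\mathrm{ATT}} = T_i + (1 - T_i)\,\frac{e(X_i)}{1 - e(X_i)}

\hat{\tau} = \frac{\sum_i w_i T_i Y_i}{\sum_i w_i T_i}
           - \frac{\sum_i w_i (1 - T_i) Y_i}{\sum_i w_i (1 - T_i)}
```

The ATE weights rebalance both groups toward the full population. The ATT weights leave the treated group as-is and reweight only the untreated group to resemble it.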

18 of 28

When we have no overlap...

When you have:

- disjoint treated and untreated individuals

- separated by sharp cut-off

Tries to:

Exploit arbitrary variation in treatment assignment at cut-off to evaluate local effect

19 of 28

Regression Discontinuity overview

Assumptions:

  • Assignment rule is unknown to individuals (not gameable)
  • Outcome can be modeled as continuous w.r.t. running variable
  • Can fit a reasonably well-specified and simple model

Recipe:

  • Model outcome as function of running variable on each side
  • Evaluate models at cut-off value
  • The local treatment effect is the difference in estimates
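
The recipe above, sketched with a straight-line fit on each side of the cut-off. Everything here is illustrative: the data is synthetic and noiseless, the helper names (`fit_line`, `rd_effect`) are hypothetical, and the choice of a linear model over a window of ten days on either side is an assumption, not a recommendation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def rd_effect(points, cutoff):
    """Fit one line per side, evaluate both at the cut-off, take the difference."""
    below = [(x, y) for x, y in points if x < cutoff]    # untreated side
    above = [(x, y) for x, y in points if x >= cutoff]   # treated side
    b0, b1 = fit_line(*zip(*below))
    a0, a1 = fit_line(*zip(*above))
    return (a0 + a1 * cutoff) - (b0 + b1 * cutoff)


# Synthetic example: the outcome drifts down with the running variable and
# jumps by 0.1 for units at or beyond the cut-off of 90.
points = [(x, 0.5 - 0.002 * x + (0.1 if x >= 90 else 0.0))
          for x in range(80, 100)]

print(rd_effect(points, cutoff=90))
```

The estimate is local: it describes the jump at the cut-off only, and says nothing about units far from it.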

20 of 28

Regression Discontinuity application

Scenario:

  • Customers who have not purchased in the last 90 days are sent a coupon
  • We want to measure the effect of the coupon on the likelihood of a purchase in the next year

[Figure: x-axis: Days since last purchase; cut-off at 90]

21 of 28

Regression Discontinuity breakdown

Assumptions:

  • Known and gameable policy

Scenario:

  • We advertise “Free Shipping and Returns over $50”
  • We want to measure the effect of offering free returns on the likelihood of making a return

22 of 28

When we have pre-existing differences...

When you have:

- different baselines in comparison groups

- variation across time (pre/post)

Tries to:

Compare how the pre/post change in behavior differs across populations

23 of 28

Difference-in-Differences overview

Assumptions:

  • Decision to treat not influenced by anticipated outcome
  • But for the treatment, groups would have followed parallel trends
  • No spill-over between groups

Recipe:

  • Take the pre/post treatment difference within each group
  • Find the difference in differences between groups
  • Technically done as a fixed-effects regression
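
The first two recipe steps reduce to arithmetic on four group averages. A minimal sketch with invented visit counts and hypothetical names (`visits`, `diff_in_diff`):

```python
# Hypothetical weekly visit counts per store group and period. Values are made up.
visits = {
    ("remodel", "pre"): [100, 104, 96],
    ("remodel", "post"): [130, 126, 134],
    ("as_is", "pre"): [80, 82, 78],
    ("as_is", "post"): [90, 88, 92],
}


def mean(values):
    return sum(values) / len(values)


def diff_in_diff(data, treated, control):
    """Pre/post change in the treated group minus the change in the control group."""
    treated_change = mean(data[(treated, "post")]) - mean(data[(treated, "pre")])
    control_change = mean(data[(control, "post")]) - mean(data[(control, "pre")])
    return treated_change - control_change


print(diff_in_diff(visits, treated="remodel", control="as_is"))
```

In this two-group, two-period case the same number is the coefficient on the group-by-period interaction in a regression of the outcome on group, period, and their interaction, which is how the calculation connects to the regression form.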

24 of 28

Difference-in-Differences application

Scenario:

  • We want to understand the effect of a store remodel on the number of visits
  • It’s too expensive to randomize stores to remodel

[Figure labels: Remodel; As-is; Time of remodel]

25 of 28

Difference-in-Differences breakdown

Assumptions:

  • Decision-to-treat violated
  • Spillover violated

Scenario:

  • Store was chosen for remodeling because of anticipated traffic increase
  • Untreated store is in the same neighborhood, and customers could feasibly move between them

26 of 28

Difference-in-Differences extensions

Accounts for different time-series features

Control is a weighted average of many observations

27 of 28

Implications

28 of 28

Learn More

These resources and many more linked at emily.rbind.io/post/resource-roundup-causal/

Introduction to Causal Inference

Brady Neal