1 of 17

A General Form of Covariate Adjustment in Randomized Clinical Trials

Ting Ye – Department of Biostatistics, University of Washington

Joint work with

Marlena Bannick, Department of Biostatistics, University of Washington

Jun Shao, Department of Statistics, University of Wisconsin-Madison

Yu Du, Global Statistical Sciences, Eli Lilly and Company

Jingyi Liu, Global Statistical Sciences, Eli Lilly and Company

Yanyao Yi, Global Statistical Sciences, Eli Lilly and Company

2 of 17

Why adjust for covariates?

Design Stage

covariate-adaptive randomization

balance across baseline covariates to gain credibility and efficiency

“Balance of treatment groups with respect to one or more specific prognostic covariates can enhance the credibility of the results of the trial” – EMA (2015) Guideline

Analysis Stage

model-assisted analysis

more efficient use of data under the same minimal assumptions required by the unadjusted analysis

“Incorporating prognostic baseline covariates in the design and analysis of clinical trial data can result in a more efficient use of data to demonstrate and quantify the effects of treatment. Moreover, this can be done with minimal impact on bias or the Type I error rate.” – FDA (2023) Guidance

3 of 17

Covariate Adjustment for Unconditional Treatment Effect

 

4 of 17

Covariate Adjustment for Unconditional Treatment Effect

 

Working Model

Representative Estimator

Unadjusted

ANOVA

 

5 of 17

Covariate Adjustment for Unconditional Treatment Effect

 

Working Model

Representative Estimator

Unadjusted

ANOVA

Linear adjustment

 

* Ye, T., Shao, J., Yi, Y., and Zhao, Q. (2023). Toward better practice of covariate adjustment in analyzing randomized clinical trials. JASA.
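A minimal sketch of linear adjustment, assuming a two-arm trial and arm-specific least-squares fits whose predictions are averaged over all subjects. The function name `linear_adjusted_means` and the simulated data are hypothetical and only serve the illustration.

```python
import numpy as np

def linear_adjusted_means(y, a, x):
    """Fit OLS of y on x separately in each arm, then average the
    arm-specific predictions over ALL subjects (both arms)."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(a)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    design_all = np.column_stack([np.ones(len(y)), x])
    means = {}
    for arm in np.unique(a):
        in_arm = a == arm
        beta, *_ = np.linalg.lstsq(design_all[in_arm], y[in_arm], rcond=None)
        means[int(arm)] = float((design_all @ beta).mean())
    return means

# Hypothetical simulated trial with a true treatment effect of 2.0.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)          # simple randomization, 1:1
y = 1.0 + 2.0 * a + 1.5 * x + rng.normal(size=n)

est = linear_adjusted_means(y, a, x)
print("adjusted difference in means:", est[1] - est[0])
```

Because each arm gets its own slope, this corresponds to a working model with treatment-by-covariate interactions.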

6 of 17

Covariate Adjustment for Unconditional Treatment Effect

 

Working Model

Representative Estimator

Unadjusted

ANOVA

Linear adjustment

G-computation

G-computation using

logistic model (FDA guidance)

 

* Guo, K. and Basse, G. (2021). The generalized Oaxaca-Blinder estimator. JASA.

* Bannick, M., Shao, J., Liu, J., Du, Y., Yi, Y., and Ye, T. (2023+) A general form of covariate adjustment in randomized clinical trials. arXiv:2306.10213
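G-computation with a logistic working model can be sketched as follows: fit the model, predict every subject's outcome under each arm, and average. This is an illustration under assumed settings (binary outcome, two arms, main-effects model, a hand-written Newton-Raphson fit). The function names and the simulated data are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(design, y, n_iter=25):
    """Logistic regression coefficients via Newton-Raphson."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = expit(design @ beta)
        grad = design.T @ (y - p)
        hess = design.T @ (design * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def g_computation(y, a, x):
    """Average predicted probabilities with the arm set to 0 and to 1
    for every subject, regardless of the arm actually received."""
    n = len(y)
    beta = fit_logistic(np.column_stack([np.ones(n), a, x]), y)
    means = {}
    for arm in (0, 1):
        counterfactual = np.column_stack([np.ones(n), np.full(n, arm), x])
        means[arm] = float(expit(counterfactual @ beta).mean())
    return means

# Hypothetical simulated trial with a binary outcome.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)
y = rng.binomial(1, expit(-0.5 + 1.0 * a + 0.8 * x))

means = g_computation(y, a, x)
risk_difference = means[1] - means[0]
unadjusted = y[a == 1].mean() - y[a == 0].mean()
print("G-computation:", risk_difference, "unadjusted:", unadjusted)
```

Both numbers target the same unconditional risk difference; the adjusted one uses the covariate to reduce variability.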

7 of 17

Covariate Adjustment for Unconditional Treatment Effect

 

Working Model

Representative Estimator

Unadjusted

ANOVA

Linear adjustment

G-computation

G-computation using

logistic model (FDA guidance)

AIPW (doubly robust)

 

* Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., and Newey, W. (2017). Double/debiased/Neyman machine learning of treatment effects. American Economic Review.

* Bannick, M., Shao, J., Liu, J., Du, Y., Yi, Y., and Ye, T. (2023+) A general form of covariate adjustment in randomized clinical trials. arXiv:2306.10213

8 of 17

Covariate Adjustment for Unconditional Treatment Effect

 

Working Model

Representative Estimator

Unadjusted

ANOVA

Linear adjustment

G-computation

G-computation using

logistic model (FDA guidance)

AIPW (doubly robust)

the general form of covariate adjustment in randomized clinical trials
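For reference, one standard way to write the AIPW estimator of the mean outcome under arm $a$ in a randomized trial is sketched below. The notation is assumed rather than taken from the slides: $\hat\mu_a(X_i)$ is the working-model prediction for subject $i$ under arm $a$, $A_i$ is the assigned arm, and $n_a$ is the number of subjects in arm $a$.

```latex
\hat\theta_a^{\mathrm{AIPW}}
  = \frac{1}{n}\sum_{i=1}^{n} \hat\mu_a(X_i)
  + \frac{1}{n_a}\sum_{i:\,A_i=a} \bigl\{ Y_i - \hat\mu_a(X_i) \bigr\}
```

Under this form, a constant $\hat\mu_a$ gives the arm-$a$ sample mean, and any fit whose residuals sum to zero within arm $a$ (arm-specific least squares, or a canonical-link GLM with an arm-specific intercept) makes the second term vanish, leaving the average of predictions used by linear adjustment and G-computation.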

12 of 17

Covariate Adjustment for Unconditional Treatment Effect

 

Working Model

Representative Estimator

Unadjusted

ANOVA

Linear adjustment

G-computation

G-computation using

logistic model (FDA guidance)

AIPW (doubly robust)

All methods discussed and beyond are available in the R package

Robust Inference for Covariate Adjustment in Randomized Clinical Trials [RobinCar]

https://github.com/tye27/RobinCar

13 of 17

Simulation (G-computation can be biased)

 

Method                                                      Bias     SD
G-computation using Poisson regression                     -0.005   0.249
G-computation using NB with unknown dispersion parameter    0.159   0.263

  • Poisson regression uses the log link, which is its canonical link; hence G-computation is unbiased.

  • Negative binomial (NB) regression with an unknown dispersion parameter (estimated by maximum likelihood) is NOT a GLM with canonical link; hence G-computation can be biased.
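The canonical-link argument in the bullets above can be sketched as follows. The notation is assumed here: $\hat\mu_a(X_i)$ is the fitted mean for subject $i$ under arm $a$, $n_a$ is the number of subjects in arm $a$, and the working model includes an arm-specific intercept. For the negative binomial case the sketch assumes the common NB2 parametrization, with variance $\mu + \alpha\mu^2$ and a log link.

```latex
% Canonical link (e.g. Poisson with log link): the score equation for the
% arm-a intercept forces the residuals in arm a to sum to zero.
\sum_{i:\,A_i=a} \bigl\{ Y_i - \hat\mu_a(X_i) \bigr\} = 0
\quad\Longrightarrow\quad
\frac{1}{n}\sum_{i=1}^{n}\hat\mu_a(X_i)
  = \frac{1}{n}\sum_{i=1}^{n}\hat\mu_a(X_i)
  + \frac{1}{n_a}\sum_{i:\,A_i=a}\bigl\{ Y_i - \hat\mu_a(X_i) \bigr\}

% NB2 with log link: the residuals enter the score with weights, so the
% unweighted residuals in arm a need not sum to zero.
\sum_{i:\,A_i=a} \frac{Y_i - \hat\mu_a(X_i)}{1 + \hat\alpha\,\hat\mu_a(X_i)} = 0
```

The left side of the first implication's conclusion is the G-computation estimator and the right side is the residual-corrected (AIPW) form, so the two coincide under a canonical link. In the NB2 case the correction term generally does not vanish, and G-computation alone can retain bias from the working model.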

14 of 17

Simulation (AIPW is more general than G-computation)

 

Method                                                      Bias     SD
G-computation using Poisson regression                     -0.005   0.249
G-computation using NB with unknown dispersion parameter    0.159   0.263
AIPW using Poisson regression                              -0.005   0.249
AIPW using NB with unknown dispersion parameter            -0.005   0.253

  • AIPW is always unbiased, whether or not the working model is a GLM with canonical link.

  • Poisson regression uses the log link, which is its canonical link; hence G-computation is equal to AIPW.
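The contrast between the two estimators can be illustrated with a small Monte Carlo sketch. It does not reproduce the simulation design behind the table: the data-generating process and the deliberately wrong, fixed working model below are hypothetical, chosen only to show that the residual correction in AIPW removes the bias that G-computation inherits from the working model.

```python
import numpy as np

def working_model(x):
    # Hypothetical, deliberately misspecified prediction for arm 1:
    # intercept 0.5, whereas the data below are generated with 0.7.
    return np.exp(0.5 + 0.5 * x)

def g_comp_mean(x):
    """G-computation: average the working-model predictions only."""
    return working_model(x).mean()

def aipw_mean(y, a, x, arm=1):
    """AIPW: average prediction plus the mean residual in the given arm."""
    in_arm = a == arm
    residual = y[in_arm] - working_model(x[in_arm])
    return working_model(x).mean() + residual.mean()

rng = np.random.default_rng(2023)
# E[exp(0.7 + 0.5 X)] for X ~ N(0, 1) is exp(0.7 + 0.5**2 / 2).
true_mean = np.exp(0.7 + 0.5 ** 2 / 2)

g_est, aipw_est = [], []
for _ in range(500):
    x = rng.normal(size=400)
    a = rng.integers(0, 2, size=400)               # simple randomization
    y = rng.poisson(np.exp(0.2 + 0.5 * a + 0.5 * x))
    g_est.append(g_comp_mean(x))
    aipw_est.append(aipw_mean(y, a, x))

g_bias = float(np.mean(g_est) - true_mean)
aipw_bias = float(np.mean(aipw_est) - true_mean)
print(f"G-computation bias: {g_bias:.3f}, AIPW bias: {aipw_bias:.3f}")
```

Under randomization the arm-1 residuals are representative of the whole sample, which is why adding their mean recentres the estimate even when the working model is wrong.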

15 of 17

Simulation (CASE 1)

 

 

Working model   Method        Bias   SD     Correct SE   Correct CP (%)   Naïve SE   Naïve CP (%)

Simple Randomization
(unadjusted)    Sample Mean   0.03   3.13   3.12         94.80
GLM-logistic    AIPW          0.00   2.74   2.76         95.16
GLM-logistic    JC            0.01   2.67   2.67         94.98
Random Forest   AIPW-CF       0.02   2.73   2.74         95.06
Random Forest   JC            0.02   2.71   2.72         95.18

Stratified Permuted Block Randomization (block size of 6)
(unadjusted)    Sample Mean   0.02   2.80   2.81         95.08            3.12       97.36
GLM-logistic    AIPW          0.00   2.75   2.75         95.20            2.76       95.24
GLM-logistic    JC            0.01   2.67   2.67         95.00
Random Forest   AIPW-CF       0.02   2.71   2.72         95.04            2.74       95.68
Random Forest   JC            0.03   2.70   2.71         95.00

  • All estimators have negligible biases compared to their standard deviations.

  • Correct SEs, derived under the corresponding randomization scheme, are very close to the SDs; as a result, coverage probabilities are close to 95%.

  • Covariate adjustment achieves >20% variance reduction under simple randomization.

  • Joint calibration (JC) always shows a smaller SD than AIPW with the same working model.
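The ">20% variance reduction" claim can be checked directly from the simple-randomization SD column of the table, since variance is the square of SD. The dictionary keys below are hypothetical labels; the numbers are the reported SDs.

```python
# Reported SDs under simple randomization (CASE 1).
sd = {
    "sample_mean": 3.13,
    "aipw_glm_logistic": 2.74,
    "jc_glm_logistic": 2.67,
    "aipw_cf_random_forest": 2.73,
    "jc_random_forest": 2.71,
}

def variance_reduction(sd_adjusted, sd_unadjusted):
    """Relative reduction in variance: 1 - (SD_adj / SD_unadj)^2."""
    return 1.0 - (sd_adjusted / sd_unadjusted) ** 2

reductions = {
    name: variance_reduction(value, sd["sample_mean"])
    for name, value in sd.items()
    if name != "sample_mean"
}

for name, reduction in reductions.items():
    print(f"{name}: {100 * reduction:.1f}% variance reduction")
```

All four adjusted estimators come out between roughly 23% and 27%, consistent with the bullet above.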

16 of 17

Simulation (CASE 2)

 

 

Working model   Method        Bias   SD     Correct SE   Correct CP (%)   Naïve SE   Naïve CP (%)

Simple Randomization
(unadjusted)    Sample Mean   0.08   3.05   3.06         95.32
GLM-logistic    AIPW          0.10   2.98   2.99         95.30
GLM-logistic    JC            0.05   2.73   2.73         95.00
Random Forest   AIPW-CF       0.03   2.73   2.74         95.30
Random Forest   JC            0.03   2.72   2.71         94.98

Stratified Permuted Block Randomization (block size of 6)
(unadjusted)    Sample Mean   0.05   2.74   2.75         95.28            3.06       97.20
GLM-logistic    AIPW          0.05   2.75   2.75         95.22            2.99       96.90
GLM-logistic    JC            0.05   2.72   2.73         95.20
Random Forest   AIPW-CF       0.05   2.71   2.72         95.44            2.74       95.68
Random Forest   JC            0.04   2.70   2.71         95.32

  • All estimators have negligible biases compared to their standard deviations. The correct SE is very close to the SD, and the corresponding CP is close to 95%.

  • When the working model is poorly specified, AIPW may not improve efficiency over the sample mean; joint calibration restores the efficiency gain (guaranteed).

  • AIPW using random forest with cross-fitting shows consistent variance reduction.

  • Joint calibration performs almost identically under simple randomization (SR) and covariate-adaptive randomization (CAR), which shows its universal applicability; AIPW with GLM does not.

  • Under CAR, using the naïve SE derived under SR can be conservative (CP above 95%), which may lose efficiency.
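The stratified permuted block scheme compared with simple randomization above can be sketched as follows. This is an illustration under assumed settings: two arms, 1:1 allocation, an even block size, and subjects listed in enrollment order. The function name `stratified_permuted_block` and the example strata are hypothetical.

```python
import numpy as np

def stratified_permuted_block(strata, block_size=6, rng=None):
    """Assign arms 0/1 using permuted blocks within each stratum.

    Each block holds block_size // 2 subjects per arm in random order,
    so the arm imbalance in any stratum never exceeds block_size // 2.
    """
    rng = np.random.default_rng() if rng is None else rng
    strata = np.asarray(strata)
    assignment = np.empty(len(strata), dtype=int)
    template = np.repeat([0, 1], block_size // 2)
    for level in np.unique(strata):
        idx = np.flatnonzero(strata == level)      # enrollment order
        n_blocks = -(-len(idx) // block_size)      # ceiling division
        blocks = [rng.permutation(template) for _ in range(n_blocks)]
        assignment[idx] = np.concatenate(blocks)[: len(idx)]
    return assignment

# Hypothetical example: 200 subjects spread over 4 strata.
rng = np.random.default_rng(1)
strata = rng.integers(0, 4, size=200)
arm = stratified_permuted_block(strata, block_size=6, rng=rng)

imbalance = {
    int(s): int(abs(np.sum(arm[strata == s] == 1) - np.sum(arm[strata == s] == 0)))
    for s in np.unique(strata)
}
print(imbalance)
```

Because the assignments are balanced within strata by design, they are no longer independent across subjects, which is why an SE derived under simple randomization can overstate the variability.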

17 of 17

All methods are available in the one-stop user-friendly R package

RobinCar

https://github.com/tye27/RobinCar

Summary

  • Unadjusted, linearly adjusted, and G-computation estimators are all special cases of the AIPW estimator.
  • One asymptotic theory for the AIPW estimator fits all.
  • Joint calibration strategy achieves guaranteed efficiency gain and universal applicability simultaneously.