Modern Covariate Adjustment in Clinical Trials Using Targeted Learning
Foroogh Shamsi | Sr. Data Scientist, Novo Nordisk
November 7, 2025
Novo Nordisk®
Disclosure
The opinions expressed in these slides constitute the personal opinions of the authors and not necessarily those of Novo Nordisk
2025 Boston Pharmaceutical Statistics Symposium
Novo Nordisk | Modern Covariate Adjustment in Clinical Trials Using Targeted Learning
What is covariate adjustment in clinical trials?
What do we gain from covariate adjustment?
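The gain can be illustrated with a minimal base-R sketch. It uses plain linear regression (ANCOVA) as the simplest form of covariate adjustment, not the targeted-learning estimator discussed later, and every number in it (sample size, effect size, the BMI-like covariate) is an assumption made up for illustration.

```r
# Hypothetical simulated trial: all names and effect sizes are illustrative assumptions.
set.seed(2025)
n   <- 400
a   <- rbinom(n, 1, 0.5)                  # 1:1 randomisation
bmi <- rnorm(n, mean = 27, sd = 3)        # prognostic baseline covariate
y   <- -2 * a + 1.5 * (bmi - 27) + rnorm(n, sd = 5)

fit_unadj <- lm(y ~ a)                    # unadjusted: difference in means
fit_adj   <- lm(y ~ a + bmi)              # adjusted: ANCOVA with the baseline covariate

se_unadj <- summary(fit_unadj)$coefficients["a", "Std. Error"]
se_adj   <- summary(fit_adj)$coefficients["a", "Std. Error"]

# The adjusted analysis targets the same treatment effect with a smaller standard error
round(c(unadjusted = se_unadj, adjusted = se_adj), 3)
```

Because the covariate explains part of the outcome variance, the treatment-effect estimate becomes more precise, which translates into higher power or a smaller required sample size.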
Including covariate effects in trial model
AI/ML Model
Putting covariate adjustment to work: a real-case setup
Treatment (A = 1)
Control (A = 0)
Randomisation (1:1)
outcome
Treatment period
Design question: what should the sample size be?
Why do we need simulations for covariate-adjusted trials?
simulations
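A minimal sketch of the simulation idea, in base R only: generate many trials from an assumed data-generating process, analyse each one, and count how often the null hypothesis is rejected. The effect size, outcome SD, and number of replications below are illustrative assumptions, and the estimator is a simple linear model rather than a machine-learning-based one.

```r
set.seed(1)

# One simulated trial: returns the p-value for the treatment effect.
# effect and sd are assumed values chosen only for this illustration.
simulate_once <- function(n, effect = -2, sd = 7) {
  a <- rbinom(n, 1, 0.5)                          # 1:1 randomisation
  y <- effect * a + rnorm(n, sd = sd)             # continuous outcome
  summary(lm(y ~ a))$coefficients["a", "Pr(>|t|)"]
}

p_values  <- replicate(2000, simulate_once(n = 400))
power_hat <- mean(p_values < 0.05)                # share of trials rejecting H0

# Monte Carlo standard error of the estimated power
mc_se <- sqrt(power_hat * (1 - power_hat) / length(p_values))
c(power = power_hat, mc_se = mc_se)
```

The same loop works for any estimator, including ones with no closed-form power formula, which is the main reason to simulate when the analysis uses flexible covariate adjustment.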
Introducing carts package
Putting covariate adjustment to work: back to real-case setup
Treatment (A = 1)
Control (A = 0)
Randomisation (1:1)
outcome
Treatment period
DGP: Data Generating Process
Building the DGP from historical data: covariates and outcome
# assumes hist_data has ARM, PCHG, SEX, AGE, BMIBL
treatment_assign <- function(n) {
  data.frame(a = rbinom(n, 1, 0.5)) # 1:1 randomization
}
baseline_covar <- covar_bootstrap( # bootstrap from historical data
  hist_data,
  subset = c("SEX", "AGE", "BMIBL")
)
covariates <- treatment_assign %join% baseline_covar
covariates(5)
SEX AGE BMIBL
1 1 34 26.65929
2 1 53 26.28001
3 0 43 25.14439
4 1 26 27.44688
5 0 36 26.42317
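The general idea of bootstrapping covariates from historical data can be sketched in base R. This is not the carts implementation of covar_bootstrap; the helper bootstrap_covariates and the data frame hist_demo are hypothetical names, and the historical values are made up for illustration.

```r
# Hypothetical stand-in for hist_data; the values are made up for illustration.
set.seed(42)
hist_demo <- data.frame(
  SEX   = rbinom(50, 1, 0.5),
  AGE   = sample(25:65, 50, replace = TRUE),
  BMIBL = rnorm(50, mean = 27, sd = 2)
)

# Resample whole rows with replacement so that the joint distribution
# of the historical covariates (e.g. any AGE-BMIBL dependence) is preserved.
bootstrap_covariates <- function(data, n, subset = names(data)) {
  idx <- sample.int(nrow(data), size = n, replace = TRUE)
  out <- data[idx, subset, drop = FALSE]
  rownames(out) <- NULL
  out
}

sim_cov   <- bootstrap_covariates(hist_demo, n = 5, subset = c("SEX", "AGE", "BMIBL"))
sim_cov$a <- rbinom(nrow(sim_cov), 1, 0.5)   # add 1:1 treatment assignment
sim_cov
```

Sampling complete rows, rather than each column separately, keeps realistic covariate combinations in the simulated trial population.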
DGP → Estimation → Inference
# Fit models to historical data
# (assumes PCHG and ARM are available in hist_data as y and a, with a coded 0/1)
mod_hist <- glm(y ~ a * (AGE + BMIBL + SEX), data = hist_data)
# Define outcome model
outcome <- setargs(
  outcome_continuous,
  mean = ~ a * (AGE + BMIBL + SEX),
  par = unname(coef(mod_hist)), # Coefficients from fitted model to hist_data
  sd = sd_hist # assumed to be defined beforehand; set to match hist_data variability
)
outcome_continuous(
  data = covariates(5),
  mean = ~ a * (AGE + BMIBL + SEX),
  par = coef(mod_hist),
  sd = 7
)
y
<num>
1: -1.000605
2: -4.500758
3: 5.549962
4: 13.127839
5: -5.330944
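What a continuous outcome model with a mean formula and a parameter vector amounts to can be sketched in base R: build the design matrix from the formula, multiply by the coefficients, and add Gaussian noise. This is an assumption about the general mechanism, not a description of the internals of outcome_continuous; the data and the coefficient values in beta are hypothetical.

```r
set.seed(7)

# Hypothetical covariate data, for illustration only
dat <- data.frame(
  a     = c(0, 1, 1, 0, 1),
  AGE   = c(34, 53, 43, 26, 36),
  BMIBL = c(26.7, 26.3, 25.1, 27.4, 26.4),
  SEX   = c(1, 1, 0, 1, 0)
)

# Design matrix: intercept, main effects, and treatment-by-covariate interactions
X <- model.matrix(~ a * (AGE + BMIBL + SEX), data = dat)

# Assumed coefficients, one per column of X (in practice taken from a fitted model)
beta <- c(5, -2, -0.05, -0.1, 0.5, 0.01, -0.02, 0.3)
stopifnot(length(beta) == ncol(X))

mu <- drop(X %*% beta)                       # conditional mean for each participant
y  <- rnorm(nrow(dat), mean = mu, sd = 7)    # outcome = mean + Gaussian noise
data.frame(mu = mu, y = y)
```

The order of the coefficients must match the column order of the design matrix, which is why inspecting colnames(X) before plugging in parameters is a useful check.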
Building the DGP from historical data: Trial object and simulation
# build the trial object
trial <- Trial$new(
  covariates = covariates,
  outcome = outcome,
  info = "Two-arm parallel design"
  # exclusion = identity,
  # estimators = list(),
  # summary.args = list(),
)
# simulate some data for sanity-check
set.seed(1)
dd <- trial$simulate(n = 300)
head(dd)
id a SEX AGE BMIBL num y
<num> <int> <int> <int> <num> <num> <num>
1: 1 0 1 54 26.79364 0 -0.7514961
2: 2 0 0 43 26.43002 0 -2.9922489
3: 3 1 0 51 25.87378 0 -12.6518833
4: 4 1 1 43 25.55013 0 -7.8245090
5: 5 0 0 57 25.50099 0 -7.6016633
6: 6 1 1 35 25.08327 0 -1.6877439
tapply(dd$y, dd$a, mean) # mean outcome by arm (a=0, a=1)
0 1
0.3937212 -1.461689
Estimators: unadjusted vs TL-adjusted
trial$estimators(list(
  unadj = est_glm(),
  tl_adj = est_adj(
    covariates = c("BMIBL", "SEX", "AGE")
    # response = "y", # default: targeted::learner_glm
    # treatment = "a"
  )
))
trial$run(n = 400, R = 10000)
Slide annotations: propensity model; Unadjusted; mean → 0; ML (cross-fitting)
trial$estimates
── trial.estimates ──
Model arguments:
Estimators: unadj, tl_adj
Simulation parameters: n = 400, R = 10000
Sample data:
id a SEX AGE BMIBL num y
<num> <int> <int> <int> <num> <num> <num>
1: 1 0 1 38 33.00362 0 6.159559
2: 2 1 1 37 25.43798 0 -5.075016
3: 3 1 0 51 37.29375 0 6.961396
4: 4 0 1 33 27.06698 0 7.359042
5: 5 1 1 52 26.74374 0 -9.674040
6: 6 1 1 31 25.88076 0 3.771534
trial$estimates$estimates$unadj
Estimate Std.Err 2.5% 97.5% P-value
1 -2.965e+00 7.580e-01 -4.450e+00 -1.479e+00 9.191e-05
2 -1.721e+00 6.816e-01 -3.057e+00 -3.853e-01 1.156e-02
3 -2.107e+00 7.227e-01 -3.523e+00 -6.901e-01 3.558e-03
4 -1.937e+00 7.215e-01 -3.351e+00 -5.231e-01 7.255e-03
5 -1.792e+00 6.591e-01 -3.084e+00 -5.002e-01 6.551e-03
---
9996 -2.333792 0.693666 -3.693351 -0.974232 0.000767
9997 -2.057341 0.711686 -3.452220 -0.662462 0.003843
9998 -2.389544 0.728003 -3.816404 -0.962684 0.001030
9999 -1.894332 0.704145 -3.274431 -0.514232 0.007140
10000 -2.319107 0.762842 -3.814250 -0.823964 0.002365
Estimate Std.Err 2.5% 97.5% P-value
Mean -2.05151 0.730470 -3.48320 -0.61981 0.045164
SD 0.72592 0.035649 0.73069 0.72785 0.109053
trial$estimates$estimates$tl_adj
Estimate Std.Err 2.5% 97.5% P-value
1 -2.795e+00 6.607e-01 -4.090e+00 -1.500e+00 2.342e-05
2 -1.402e+00 6.220e-01 -2.621e+00 -1.829e-01 2.419e-02
3 -2.580e+00 6.236e-01 -3.802e+00 -1.358e+00 3.511e-05
4 -1.523e+00 6.326e-01 -2.763e+00 -2.831e-01 1.606e-02
5 -1.662e+00 5.918e-01 -2.822e+00 -5.017e-01 4.989e-03
---
9996 -2.3661669 0.6128463 -3.5673236 -1.1650101 0.0001129
9997 -1.1019418 0.6265280 -2.3299142 0.1260305 0.0786103
9998 -2.3416644 0.6407015 -3.5974163 -1.0859124 0.0002573
9999 -1.9359908 0.6249079 -3.1607878 -0.7111939 0.0019480
10000 -2.0680194 0.6171502 -3.2776115 -0.8584273 0.0008054
Estimate Std.Err 2.5% 97.5% P-value
Mean -2.05189 0.626920 -3.28063 -0.82315 0.019982
SD 0.62287 0.022511 0.62405 0.62482 0.065852
                            | Mean Estimate | SD of Estimate | Mean Std.Err |
unadjusted                  | -2.05151      | 0.72592        | 0.730470     |
TL-adjusted (BMI, Sex, Age) | -2.05189      | 0.62287        | 0.626920     |
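The precision gain in the table above can be expressed as a relative variance. The short calculation below only reuses the two "SD of Estimate" values reported on the slide; the variable names are chosen here for illustration.

```r
# SD of the 10000 simulated estimates, copied from the comparison table
sd_unadj <- 0.72592   # unadjusted
sd_adj   <- 0.62287   # TL-adjusted (BMI, Sex, Age)

rel_var            <- (sd_adj / sd_unadj)^2   # variance of adjusted relative to unadjusted
variance_reduction <- 1 - rel_var

round(c(relative_variance = rel_var, variance_reduction = variance_reduction), 3)
```

The adjusted estimator has roughly three quarters of the variance of the unadjusted one in this simulation, while the mean estimates are essentially identical, so the gain comes from precision rather than from a shift in the estimated effect.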
TL: Targeted Learning
Available learners in the “targeted” package: learner_mars, learner_glm, learner_xgboost, …, learner_sl (super_learner)
Power estimation
trial$estimate_power(
  n = 400,
  R = 10000,
  level = 0.05
)
unadj tl_adj
0.8014 0.9060
trial$summary() # default level = 0.05, alternative = "="
estimate std.err std.dev power na
unadj -2.058567 0.7309649 0.7344038 0.8014 0
tl_adj -2.052315 0.6268261 0.6302970 0.9060 0
trial$summary(level = 0.02, alternative = "<")
estimate std.err std.dev power na
unadj -2.058567 0.7309649 0.7344038 0.7770 0
tl_adj -2.052315 0.6268261 0.6302970 0.8881 0
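As a rough plausibility check, the simulated power can be compared with a normal approximation based on the mean estimate and mean standard error from the summary output. The helper approx_power is a hypothetical name introduced here; it assumes the estimator is approximately normal and, for the one-sided case, that the true effect lies in the direction of the alternative.

```r
# Normal-approximation power from an effect estimate and its standard error
approx_power <- function(estimate, std_err, level = 0.05, two_sided = TRUE) {
  z <- abs(estimate) / std_err
  if (two_sided) {
    crit <- qnorm(1 - level / 2)
    pnorm(z - crit) + pnorm(-z - crit)   # rejection in either tail
  } else {
    pnorm(z - qnorm(1 - level))          # rejection in the tail of the true effect
  }
}

# Values copied from trial$summary()
two_sided <- c(
  unadj  = approx_power(-2.058567, 0.7309649),
  tl_adj = approx_power(-2.052315, 0.6268261)
)
one_sided <- c(
  unadj  = approx_power(-2.058567, 0.7309649, level = 0.02, two_sided = FALSE),
  tl_adj = approx_power(-2.052315, 0.6268261, level = 0.02, two_sided = FALSE)
)
round(rbind(two_sided, one_sided), 3)
```

Under these assumptions the approximation lands within about one percentage point of the simulated power in all four cases, which suggests the simulation output is internally consistent.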
Sample size estimation
trial$estimate_samplesize(
  power = 0.9, # default
  estimator = trial$estimators("unadj"),
  level = 0.05
  # null = 0,
  # alternative = "=",
)
── Estimated sample-size to reach 90% power ──
n = 546 (actual estimated power≈90%)
trial$estimate_samplesize(
  power = 0.9, # default
  estimator = trial$estimators("tl_adj")
)
── Estimated sample-size to reach 90% power ──
n = 422 (actual estimated power≈90.66%)
                            | Sample size (power = 0.90) |
unadjusted                  | 546                        |
TL-adjusted (BMI, Sex, Age) | 422                        |
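The saving implied by these two sample sizes, and a rough analytic cross-check, can be computed in a few lines. The cross-check is a sketch under assumptions: it rescales the mean estimate and standard error reported at n = 400 on the power slide, assuming the standard error shrinks with the square root of n and the estimator is approximately normal. The helper approx_n is a hypothetical name.

```r
# Saving implied by the simulated sample sizes
n_sim_unadj <- 546
n_sim_adj   <- 422
saving <- 1 - n_sim_adj / n_sim_unadj      # fraction of participants saved

# Normal-approximation sample size, rescaled from a reference run at n_ref
approx_n <- function(estimate, std_err, n_ref, power = 0.9, level = 0.05) {
  z_needed <- qnorm(1 - level / 2) + qnorm(power)   # required signal-to-noise
  z_ref    <- abs(estimate) / std_err               # observed at n_ref
  ceiling(n_ref * (z_needed / z_ref)^2)
}

n_unadj <- approx_n(-2.058567, 0.7309649, n_ref = 400)
n_adj   <- approx_n(-2.052315, 0.6268261, n_ref = 400)

c(saving = round(saving, 3), approx_unadj = n_unadj, approx_adj = n_adj)
```

The simulated sample sizes correspond to a saving of roughly 23% of participants. The analytic approximations come out somewhat below the simulated values but in the same range; a gap of this size is plausible given Monte Carlo error in a simulation-based search and the simplifications in the approximation.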
Translating insights into design decisions
Sensitivity: stress-test the assumptions
Resources
Vignettes:
Packages:
THANK YOU
Contributors: Benedikt Sommer, Klaus Kähler Holst
Affiliations: Novo Nordisk
Contact: fghs@novonordisk.com, bkts@novonordisk.com, kkzh@novonordisk.com