1 of 29

Why a collabathon on R(t) estimation?

Laura White, PhD, Professor, Department of Biostatistics

Chad Milando, PhD, Research Scientist, Department of Environmental Health

bu.edu/sph | @BUSPH

2 of 29

Outline of comments (15 mins total)

  • What is R(t)?
  • Brief history of methods to estimate it (1-2 slides)
    • Wallinga & Teunis (2004); EpiEstim, the first easy-to-use software; recent explosion of methods
  • Motivations for developing new methods include
    • Reporting delays, smoothing, prediction
  • Current challenges
    • Implementation of methods: we will show that there is no standardization in how these are implemented, documented, and interpreted
    • Performance: it is hard to assess how well the methods perform (no ground truth, many dimensions of performance)

3 of 29

What is R(t)?

  • Average number of cases infected by each case "at time t"
    • We focus on instantaneous R(t): expected number of infections generated at time t by currently infectious individuals

Refs: Vegvari et al, 2021; Gostic et al, 2020; White et al, 2021

4 of 29

Brief history of estimation methods

Leo et al, 2003

5 of 29

Brief history of estimation methods

Estimation of R(t) from line list data (e.g. daily case counts)

  • Wallinga and Teunis (2004): combine the generation interval with case counts to estimate the case R(t)
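
The idea on this slide can be sketched in a few lines. The block below is a minimal illustration, not the published implementation: it assumes daily case counts and a discrete generation-interval PMF, and the function and argument names are hypothetical. Each case on day u is attributed to earlier days in proportion to their counts times the generation-interval weight, and the case R(t) is the expected number of later cases credited to one case on day t.

```python
def case_reproduction_number(incidence, gen_interval):
    """Wallinga-Teunis style case R(t) from daily counts (illustrative sketch).

    incidence[t]    : number of cases on day t
    gen_interval[s] : assumed probability that the generation interval is s days
                      (gen_interval[0] is taken to be 0)
    """
    n = len(incidence)
    max_s = len(gen_interval) - 1

    # Total weight of all possible infectors for cases observed on day u.
    pressure = [0.0] * n
    for u in range(n):
        for s in range(1, min(u, max_s) + 1):
            pressure[u] += incidence[u - s] * gen_interval[s]

    # A single case on day t is credited with its share of the cases on later days.
    r_case = []
    for t in range(n):
        total = 0.0
        for s in range(1, max_s + 1):
            u = t + s
            if u < n and pressure[u] > 0:
                total += incidence[u] * gen_interval[s] / pressure[u]
        r_case.append(total)
    return r_case
```

With counts that double daily and a generation interval of exactly one day, `case_reproduction_number([1, 2, 4, 8, 16], [0.0, 1.0])` gives 2.0 for every day except the last, which is 0.0 because its secondary cases have not been observed yet. That right-censoring at the tail is one reason the case R(t) is awkward for real-time use.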

6 of 29

Brief history of estimation methods

  • Fraser (2007) and Cori et al (2013): renewal equations to estimate instantaneous R(t)
    • Software: EpiEstim
  • Many innovations on this software and method
  • Proliferation of more methods, particularly in the last 4-5 years
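
For readers new to the renewal-equation approach, here is a minimal sketch in the spirit of Cori et al (2013). It is not the EpiEstim code. It assumes daily incidence, a discrete serial-interval PMF, a sliding window, and a Gamma prior on R(t); the function name, argument names, and default values are illustrative assumptions. The estimate is the posterior mean: (prior shape + cases in the window) divided by (1 / prior scale + total infectiousness in the window).

```python
def instantaneous_r(incidence, serial_interval, window=7,
                    prior_shape=1.0, prior_scale=5.0):
    """Posterior-mean instantaneous R(t) over a sliding window (illustrative sketch).

    incidence[t]       : number of new cases on day t
    serial_interval[s] : assumed probability of an s-day serial interval (index 0 is 0)
    Returns None for days without a full window of history.
    """
    n = len(incidence)
    max_s = len(serial_interval) - 1

    # Total infectiousness on day t: past cases weighted by the serial interval.
    infectiousness = [
        sum(incidence[t - s] * serial_interval[s]
            for s in range(1, min(t, max_s) + 1))
        for t in range(n)
    ]

    estimates = []
    for t in range(n):
        start = t - window + 1
        if start < 1:
            estimates.append(None)  # day 0 has no infectors, so skip early windows
            continue
        cases = sum(incidence[start:t + 1])
        pressure = sum(infectiousness[start:t + 1])
        shape = prior_shape + cases          # Gamma posterior shape
        rate = 1.0 / prior_scale + pressure  # Gamma posterior rate
        estimates.append(shape / rate)       # posterior mean
    return estimates
```

The window length controls the smoothing: longer windows give steadier estimates but react more slowly to changes in transmission.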

7 of 29

Brief history of estimation methods

Nash et al, 2022, Plos Digital Health

8 of 29

Motivation for new methods

  • Smoothing concerns to address noisy data
  • Reporting delays
  • Handling uncertainties in the serial interval
  • Spatial variability
  • Packages/Methods: EpiEstim, EpiFilter, EpiNow, EpiNow2, EpiLPS, APEstim, bayEStim, earlyR, Epidemia, ern, estimateR, epinowcast

9 of 29

Taking the pulse of the community

  • We conducted a survey this summer with n=32 responses
    • 78% Academic research; 17% Public health dept or agency; 4% industry
  • 26% users; 65% both user and developer; 9% neither
  • 81% have used R(t) estimation tools in the past

10 of 29

Current challenges (Implementation)

From survey respondents

  • Inconsistent input formats
  • Limited information on serial intervals
  • Can require a lot of computing power
  • Hard to learn how to use
  • Challenges with low or sporadic case counts
  • Unsure what tool to use

11 of 29

Current challenges (Performance)

From survey respondents

  • Don't always reproduce the data generating process
  • Unclear statement of limitations of methods
  • Many produce unrealistically tight confidence intervals
  • Unable to capture spatial variability
  • Variable reporting delays hard to incorporate
  • Tools are not all regularly maintained

12 of 29

What features do people want?

From survey respondents:

  • Better documentation: model details, step-by-step examples
  • Efficiency (fast run times)
  • Handling small sample sizes/inconsistent reporting
  • Accurate real-time estimates

13 of 29

Collabathon goals

  • Determine how to evaluate existing methods
  • Evaluate existing methods
  • Develop standards for inputs/outputs/documentation
  • Define areas of innovation in methods and start working on them
  • Create sustainable community to increase feedback and communication between public health and research communities

14 of 29

Format of collabathon

01 This morning: Get familiar with existing tools

    • Goals: consider 1) implementation and performance questions; 2) opportunities for innovation

02 This afternoon: Form working groups

    • Focus on 1) innovation and 2) implementation questions

03 Today-Tomorrow-Thursday: Periodic reporting on working group progress

04 Thursday: Discussion on sustainability and future community

Throughout: Keeping an eye on standards (inputs, outputs, etc.) -> potential to form a separate working group?

15 of 29

Group Exercise

Chad Milando, PhD

BUSPH

16 of 29

Outbreak simulation

  • Given an incubation distribution and transmission distribution
  • And instantaneous R(t) curve for infections
  • Generate an outbreak dataset
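
The three steps above can be sketched as a small renewal-process simulator. This is not the code behind the demo app or the `00_Simulate` script: it assumes the transmission distribution acts as a discrete generation-interval PMF, that new infections are Poisson-distributed around R(t) times the current infectiousness, and that symptom onsets are infections delayed by a draw from the incubation PMF. All function names, argument names, and defaults are hypothetical.

```python
import math
import random


def draw_poisson(mean, rng):
    """Poisson draw: Knuth's method for small means, normal approximation otherwise."""
    if mean <= 0:
        return 0
    if mean > 30:
        return max(0, round(rng.gauss(mean, math.sqrt(mean))))
    threshold, count, product = math.exp(-mean), 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_outbreak(r_t, transmission_pmf, incubation_pmf,
                      seed_infections=10, seed=1):
    """Return (daily infections, daily symptom onsets) for a given R(t) curve."""
    rng = random.Random(seed)
    n = len(r_t)
    infections = [0] * n
    onsets = [0] * n
    infections[0] = seed_infections

    for t in range(1, n):
        # Infectiousness today: past infections weighted by the transmission PMF.
        pressure = sum(infections[t - s] * transmission_pmf[s]
                       for s in range(1, min(t, len(transmission_pmf) - 1) + 1))
        infections[t] = draw_poisson(r_t[t] * pressure, rng)

    # Each infection shows symptoms after a random incubation delay (in days).
    delay_days = range(len(incubation_pmf))
    for t, count in enumerate(infections):
        for _ in range(count):
            day = t + rng.choices(delay_days, weights=incubation_pmf)[0]
            if day < n:  # onsets after the last simulated day are not observed
                onsets[day] += 1
    return infections, onsets
```

Because the R(t) curve that produced the data is known, the output can serve as a ground truth when comparing estimation packages.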

17 of 29

Demo:

https://mobslab.shinyapps.io/simulate_infection_data/

https://github.com/cmilando/RtEval/

18 of 29

R packages for R(t) estimation

| Package Name | Title | Authors |
| --- | --- | --- |
| EpiEstim | Estimate Time Varying Reproduction Numbers from Epidemic Curves | Anne Cori |
| EpiNow2 | Estimate Real-Time Case Counts and Time-Varying Epidemiological Parameters | Sam Abbott |
| EpiLPS | A Fast and Flexible Bayesian Tool for Estimating Epidemiological Parameters | Oswaldo Gressani |
| R0 | Estimation of R0 and Real-Time Reproduction Number from Epidemics | Pierre-Yves Boelle, Thomas Obadia |
| epigrowthfit | Nonlinear Mixed Effects Models of Epidemic Growth | Mikael Jagan |
| EpiILMCT | Continuous Time Distance-Based and Network-Based Individual Level Models for Epidemics | Waleed Almutiry |
| EpiILM | Spatial and Network Based Individual Level Models for Epidemics | Vineetha Warriyar K. V. |
| earlyR | Estimation of Transmissibility in the Early Stages of a Disease Outbreak | Thibaut Jombart |
| epitrix | Small Helpers and Tricks for Epidemics Analysis | Thibaut Jombart |
| ern | Effective Reproduction Number Estimation | David Champredon |
| RtEstim | Estimate the Effective Reproductive Number with Trend Filtering | Daniel J. McDonald |
| epinowcast | Flexible Hierarchical Nowcasting | Sam Abbott |
| EpiFilter | Recursive Bayesian smoother for estimating the effective reproduction number, R, from the incidence of an infectious disease in real time and retrospectively | Kris V. Parag |
| Epidemia | Modeling of Epidemics using Hierarchical Bayesian Models | James Scott |

19 of 29

Different subsets of required inputs

  • Incidence of case reports
  • Distributions for:
    • Incubation time (Ii, Ij)
    • Transmission time (Pij)
    • Generation interval (Gij)
    • Serial interval (Sij)
  • Reporting fraction
  • Priors for R(t)

Lehtinen, Sonja, Peter Ashcroft, and Sebastian Bonhoeffer. "On the relationship between serial interval, infectiousness profile and generation time." Journal of the Royal Society Interface 18, no. 174 (2021): 20200756.

20 of 29

Different methods / temporal smoothing

  • Back-calculation for early estimation periods
  • MCMC in R/C++ vs. HMC in Stan
  • Gaussian Process
  • Auto-regressive (AR1) processes
  • Delta method
  • Laplacian P-Splines
  • Lotka-Euler equation
  • Weekly vs daily
  • Random walk

21 of 29

Testing status

| Package | Status | Notes |
| --- | --- | --- |
| EpiEstim | Working | R(t) for infections lines up with “true R(t)” |
| EpiNow2 | Working | R(t) for infections lines up with “true R(t)” |
| RtEstim | Working | R(t) for infections lines up with “true R(t)” |
| EpiLPS | Working | R(t) for infections lines up with “true R(t)” |
| R0 | In progress | Offset between R(t) for infections and “true R(t)” |
| epinowcast | In progress | Need guidance |
| EpiFilter | Can’t model PMF | Not possible to input a non-parametric serial interval |
| ern | Can’t model PMF | Not possible to input a non-parametric serial interval |
| epitrix | No effective R(t) | No function for calculating time-varying R(t) |
| earlyR | No effective R(t) | No function for calculating time-varying R(t) |
| epigrowthfit | No effective R(t) | No function for calculating time-varying R(t) |
| Epidemia | Not tested | Not available for R version 4.4.0+ |
| EpiILMCT | Not tested | Individual-level network model |
| EpiILM | Not tested | Individual-level network model |

Common inputs:

  • time-series for aggregated dates of infection (starting Day 0)
  • serial interval distribution (starting with Day 0 = 0, Day 1 = …)
  • “true R(t)” used to create case data (red line / dotted line)
  • Temporal smoothing set to 1 day

22 of 29

Note: some packages contain reporting delay functions; others are shifted by the weighted mean

Testing status

  • Dotted line “true R(t)” used to create case data
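
One reading of the note about shifting by the weighted mean is sketched below: when a package has no reporting-delay model, the report-based R(t) series is moved back in time by the probability-weighted mean of the delay distribution. This is an interpretation rather than the RtEval code, and the function and argument names are hypothetical.

```python
def mean_delay(delay_pmf):
    """Probability-weighted mean delay in days; delay_pmf[d] is the weight of a d-day delay."""
    total = sum(delay_pmf)
    return sum(day * p for day, p in enumerate(delay_pmf)) / total


def shift_back(days, delay_pmf):
    """Re-date report-based estimates to approximate infection dates."""
    shift = round(mean_delay(delay_pmf))  # whole days, to stay on a daily grid
    return [day - shift for day in days]
```

For a delay PMF of `[0.0, 0.25, 0.5, 0.25]` the mean delay is 2 days, so estimates dated days 5, 6, 7 are re-dated to days 3, 4, 5. A fixed shift ignores the spread of the delay distribution, which smooths and blurs the reported curve relative to the infection curve.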

23 of 29

Breakout group questions

24 of 29

Packages to work on

Groups

  • Developer / Implementor

  • User / evaluator

  • Decision-maker

All code can be found at https://github.com/cmilando/RtEval

`00_Simulate` recreates the outbreak simulator

`02_<package>_infections.R` compares the “true” R(t) to infections data

`02_<package>_reports.R` compares the “true” R(t) to reporting data

`03_plot.R` plots the results

`04_eval.R` starts some evaluation

| Package | Status | Notes |
| --- | --- | --- |
| EpiEstim | Working | R(t) for infections lines up with “true R(t)” |
| EpiNow2 | Working | R(t) for infections lines up with “true R(t)” |
| RtEstim | Working | R(t) for infections lines up with “true R(t)” |
| EpiLPS | Working | R(t) for infections lines up with “true R(t)” |
| R0 | In progress | Offset between R(t) for infections and “true R(t)” |
| epinowcast | In progress | Need guidance |

25 of 29

Questions for developers / implementors

  • How easy is it to specify different types of input data?
    • Distributions vs non-parametric inputs
  • How can different inputs be included or removed?
    • E.g., can incubation and reporting delays be modeled stochastically? What are the defaults?
  • How is temporal smoothing specified?
    • How do we ensure control and consistency?
  • How are missing data / NA handled? Error checking?
  • Do the vignettes explain different functionality in adequate detail?
  • What functions are essential, exploratory, or optional?
  • Optional rabbit hole 🐇: why doesn’t the package recover the “true R(t)”?

Packages: EpiEstim, EpiNow2, RtEstim, EpiLPS, R0, epinowcast

26 of 29

Questions for users / evaluators

  • What is the influence of using different features / smoothing on:
    • prediction bias and precision? What metrics to use?
    • computational time for the entire pipeline? What metrics to use?
  • How good are predictions at the tail? What metrics to use?
    • Does this change if you have a different number of cases or a different incubation time?
  • How do estimates for a single day change as you get more data?
  • What happens when you introduce some random noise into estimated cases and serial interval? Are there simulations that cause the packages to give different results?
    • E.g., rapidly changing case counts in response to large-scale interventions
  • What is the impact of mis-specifying an input (i.e., user error)?
  • How is computational time impacted by problem scale?
    • # of individuals in the simulation and # of days of the simulation
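
As a starting point for the "what metrics to use?" questions above, here are a few candidate metrics for comparing an estimated R(t) series with the simulated “true R(t)”. These are common choices offered as a sketch, not a recommendation from the slides or the contents of `04_eval.R`; the function name and arguments are hypothetical.

```python
import math


def evaluate_rt(true_r, est_r, lower, upper):
    """Bias, RMSE, interval coverage, and mean interval width.

    Days where est_r is None (no estimate produced) are skipped.
    lower and upper are the bounds of each day's uncertainty interval.
    """
    rows = [(t, e, lo, hi)
            for t, e, lo, hi in zip(true_r, est_r, lower, upper)
            if e is not None]
    if not rows:
        raise ValueError("no days with an estimate to evaluate")
    n = len(rows)
    errors = [e - t for t, e, _, _ in rows]
    return {
        "bias": sum(errors) / n,                                  # signed average error
        "rmse": math.sqrt(sum(err ** 2 for err in errors) / n),   # overall accuracy
        "coverage": sum(lo <= t <= hi for t, _, lo, hi in rows) / n,  # share of days where the interval contains the truth
        "mean_width": sum(hi - lo for _, _, lo, hi in rows) / n,  # interval sharpness
    }
```

Coverage well below the nominal level (for example 0.5 for a 95% interval) would point to intervals that are too tight. Restricting the inputs to the last few days gives a tail-specific version of the same metrics.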

Packages: EpiEstim, EpiNow2, RtEstim, EpiLPS

27 of 29

Questions for decision-makers

Use the simulation tool to test out different distributions

Using only daily reports, under what conditions would your decisions change:

    • R(t) > 1 or < 1
    • R(t) slope increasing or decreasing?
    • Other?

So, change R(t) at the tail, and look at Daily Reports. What decision would you make if R(t) at the tail was increasing but reports were

28 of 29

All code can be found at https://github.com/cmilando/RtEval

29 of 29

Users/Evaluators: https://shorturl.at/iPcqw

Developers: https://shorturl.at/axkeH

Decision makers: https://shorturl.at/cqpgb
