1 of 14

Lecture 11: Turning Considerations into Probabilities

Jacob Steinhardt

Stat 165, Spring 2024

2 of 14

Motivating Question

When will the UK’s peak 7-day moving average of COVID hospitalizations occur (pre-March 2022)?

(Note: Using only info prior to Dec. 21, 2021.)

3 of 14

Last Time

Essentially used the following formula:

DateOfPeak = Dec. 21st
 + 10 days to reach case peak (2.4-day doubling time, 4.1 doublings)
 + 9 days (case peak to hospital peak)
 + 3 days (lag of 7-day average)
 = Jan. 12th
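The sum above can be checked with a few lines of date arithmetic. This is a minimal sketch using the numbers on the slide; the variable names are illustrative, and rounding the days to the case peak to a whole day is an assumption about how the "10 days" was obtained.

```python
from datetime import date, timedelta

# Inputs taken from the slide; names are illustrative.
forecast_date = date(2021, 12, 21)
doubling_time_days = 2.4
doublings_to_case_peak = 4.1

# 2.4 * 4.1 = 9.84, rounded to 10 whole days (rounding is an assumption).
days_to_case_peak = round(doubling_time_days * doublings_to_case_peak)
case_to_hospital_lag_days = 9   # case peak to hospital peak
moving_average_lag_days = 3     # lag introduced by the 7-day average

total_days = days_to_case_peak + case_to_hospital_lag_days + moving_average_lag_days
peak = forecast_date + timedelta(days=total_days)

print(f"{total_days} days after {forecast_date} -> {peak}")
```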

To create confidence intervals, have to assess two things:

How wrong could each individual term in the sum be?
How likely is it that this sum is fundamentally a “wrong” decomposition?

4 of 14

Two Sources of Error

1. What invalidating considerations could cause the forecast to be totally wrong?
2. How sensitive is my forecast to each numerical quantity, and how uncertain am I about those quantities?

Important to check both! Easy to focus on #2, but sometimes #1 dominates (or points to important follow-up questions).

5 of 14

Source 1: Invalidating Considerations

If this estimate is totally off, why is that?

I also call this “structural uncertainty”

[Brainstorming exercise]

Much more infectious than anticipated: shorter doubling time than thought, larger total number of cases
Large holiday effect
Policy change that significantly affects transmissibility (lockdown)
Scientific breakthrough (vaccines)

6 of 14

Invalidating Considerations (my list)

UK cases could be capped by herd immunity rather than hospital strain (17+ million cases instead of 6.7 million)
Doubling time really is super-fast (1.5 days instead of 2.4)
Peak happens because people self-adjust their behavior until R is close to 1, leading to a very long “peak”

7 of 14

Evaluating Considerations

Herd immunity: would add at most 2 doublings (7 million vs. 28 million), or ~5 days, to the date of the peak
Short doubling time: we were assuming ~4 doublings before. 1.5 days vs. 2.4 days per doubling = ~4 days earlier for the peak
Extended peak: assume we stay at 75% hospital capacity until enough people are infected to reach herd immunity. Need 17 million cases, or ~8.5 million confirmed cases. Works out to ~12 days extra.

How likely is each of these considerations to matter?

I subjectively gave 15% (herd immunity), 15% (doubling time), 10% (extended peak)
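The first two shifts above are simple arithmetic on the doubling time, sketched below. The extended-peak figure (~12 days) is copied from the slide rather than derived, since the slide does not spell out that calculation. Treating the three considerations as mutually exclusive when computing the leftover probability is an assumption of this sketch.

```python
import math

DOUBLING_TIME = 2.4  # days, mainline assumption from the slides

# Herd immunity: cases cap at 28 million instead of 7 million, i.e. 2 extra doublings.
herd_immunity_shift = math.log2(28 / 7) * DOUBLING_TIME

# Short doubling time: ~4 doublings, each 1.5 days instead of 2.4, so the peak is earlier.
short_doubling_shift = -4 * (DOUBLING_TIME - 1.5)

# Extended peak: taken directly from the slide, not derived here.
extended_peak_shift = 12.0

# (subjective probability, shift in days) for each consideration
considerations = {
    "herd immunity": (0.15, herd_immunity_shift),
    "short doubling time": (0.15, short_doubling_shift),
    "extended peak": (0.10, extended_peak_shift),
}

for name, (prob, shift) in considerations.items():
    print(f"{name}: p = {prob:.2f}, shift = {shift:+.1f} days")

# Assumption: the considerations are mutually exclusive.
p_none = 1 - sum(prob for prob, _ in considerations.values())
print(f"no invalidating consideration: p = {p_none:.2f}")
```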

8 of 14

Source 2: Numerical Sensitivity

Mainline forecast is based on:

N₀: cases so far (~200k)
N: total cases (~6.7M)
t: doubling time (~2.4 days)
𝚫₀: case->hospital lag (~9 days)
𝚫₁: 1-day->7-day avg (~3 days)

If N or N₀ is off by a factor of 2: changes answer by 2.4 days
If t is off by 1 day: changes answer by 4.1 days
If 𝚫₀ or 𝚫₁ is off by 1 day: changes answer by 1 day

Final formula: log₂(N / (2N₀)) · t + 𝚫₀ + 𝚫₁
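The sensitivities can be reproduced by perturbing one input at a time. This is a minimal sketch; the function name `days_to_peak` and its keyword arguments are illustrative, and the formula is read as log₂ of N divided by 2N₀, which is the reading that gives the ~4.1 doublings used earlier.

```python
import math

def days_to_peak(N=6.7e6, N0=200e3, t=2.4, delta0=9.0, delta1=3.0):
    """Days from the forecast date to the peak of the 7-day hospitalization average."""
    doublings = math.log2(N / (2 * N0))
    return doublings * t + delta0 + delta1

base = days_to_peak()
print(f"mainline: {base:.1f} days")

# Perturb one quantity at a time and report the change in the answer.
perturbations = {
    "N doubled": days_to_peak(N=2 * 6.7e6),
    "N0 doubled": days_to_peak(N0=2 * 200e3),
    "t + 1 day": days_to_peak(t=3.4),
    "delta0 + 1 day": days_to_peak(delta0=10.0),
    "delta1 + 1 day": days_to_peak(delta1=4.0),
}
for name, value in perturbations.items():
    print(f"{name}: {value - base:+.1f} days")
```

Doubling N or N₀ moves the answer by exactly one doubling time, while a one-day error in t is multiplied by the number of doublings, which is why t matters most per unit of error.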

9 of 14

Numerical Sensitivity: Quantitative Summary

Ranges are 70% CIs
How to combine into final uncertainty range?
My estimate: [-3.6, +4.9] for 70% CI (ignoring invalidating considerations)

10 of 14

Combining Structural + Numerical Uncertainty

Numerical uncertainty: 70% CI of [Jan. 8th, Jan. 17th] (after rounding)

15% chance of before Jan. 8, 15% of after Jan. 17th

Structural uncertainty:

15% chance of +5 days (>=Jan. 17th)
15% chance of -4 days (<=Jan. 8th)
10% chance of +12 days (>=Jan. 24th)

In reality, both forms of uncertainty are present. Final CIs should be wider than either individually.
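One way to see why the combined interval is wider is a small Monte Carlo simulation. This is a sketch under stated assumptions, not the method used to produce the final forecast: the numerical error is modeled as a two-sided normal whose 15th and 85th percentiles match the [-3.6, +4.9] range, the three structural scenarios are treated as mutually exclusive (leaving 60% for "none"), and structural shifts are simply added to the numerical error.

```python
import random
from statistics import NormalDist

# Numerical error: 70% CI of [-3.6, +4.9] days around the mainline estimate.
Z85 = NormalDist().inv_cdf(0.85)
SIGMA_LO = 3.6 / Z85   # scale for errors below the mainline (assumed half-normal)
SIGMA_HI = 4.9 / Z85   # scale for errors above the mainline (assumed half-normal)

# Structural scenarios from the slide: (probability, shift in days).
# Assumption: mutually exclusive, otherwise no shift.
SCENARIOS = [(0.15, 5.0), (0.15, -4.0), (0.10, 12.0)]

def sample_numerical(rng):
    magnitude = abs(rng.gauss(0.0, 1.0))
    # Equal chance of landing above or below the mainline.
    return magnitude * SIGMA_HI if rng.random() < 0.5 else -magnitude * SIGMA_LO

def sample_structural(rng):
    u = rng.random()
    cumulative = 0.0
    for prob, shift in SCENARIOS:
        cumulative += prob
        if u < cumulative:
            return shift
    return 0.0

def simulate(n=100_000, seed=0):
    rng = random.Random(seed)
    return sorted(sample_numerical(rng) + sample_structural(rng) for _ in range(n))

def percentile(sorted_samples, q):
    return sorted_samples[int(q * (len(sorted_samples) - 1))]

samples = simulate()
numerical_width = 4.9 + 3.6
combined_width = percentile(samples, 0.85) - percentile(samples, 0.15)
combined_median = percentile(samples, 0.50)

print(f"numerical-only 70% width: {numerical_width:.1f} days")
print(f"combined 70% width:       {combined_width:.1f} days")
print(f"combined median shift:    {combined_median:+.1f} days")
```

Under these assumptions the combined 70% interval is wider than the numerical-only one, and the median moves slightly later because the structural scenarios put more weight on a later peak than on an earlier one.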

11 of 14

My Subjective Final Forecast

Median of Jan. 13th
10% of Jan. 24th or later
25% of Jan. 18th or later
25% of Jan. 7th or earlier

Do you agree or disagree with this assessment?

12 of 14

The Actual Answer

13 of 14

Retrospective

Only 20k hospital patients, vs. 40k in previous wave (I had been thinking we might get 70k)
Maybe around 8.3 million infected (I had been thinking 6.7 million)
Moderately more cases than expected, but way less deadly
Probably closer to the herd immunity estimate (8.3 million confirmed infected => more like 16 million or more actually infected, close to my estimate)
My case estimate was close to correct, but for the wrong reason. But I knew my estimate was robust to this. None of the structural uncertainty showed up.

14 of 14

Commentary from Misha

(See lecture notes for several additional comments)

Misha Yagudin

The core step, which is missing from your write-ups, is getting less confused about what’s going on and assembling a world model. I usually start pretty cluelessly; for example, I was forecasting cultured meat progress last month. I spend a lot of time trying to understand how the processes might work, how the reference class might look like, and what technological limitations are.

Until I had some understanding (still limited), I wasn’t looking for considerations. But after building a world model, I developed ways to approach most questions (sometimes very structurally uncertain).