1 of 15

Lecture 9: Combining Forecasts

Jacob Steinhardt, Stat 157, Spring 2022

2 of 15

Revisiting Reference Classes

“What is the probability that Joe Biden completes his full term?”

(For simplicity, assume we were trying to forecast at the beginning of his term.)

3 of 15

Revisiting Reference Classes

  • Focus on two cases for simplicity: dying in office, and resignation/impeachment
  • To get more data, let’s look at US, Canada, UK, France, and Spain

4 of 15

Deaths in Office

  • US: 8 since 1789 (8/233 years, or 3.4%/year)
    • 0 since 1970
  • Canada: 2 since 1878, plus one who resigned due to illness, counted as half (2.5/144 = 1.7%/year)
    • 0 since 1970
  • UK: 4 since 1800 (4/222 = 1.8%/year)
    • 0 since 1970
  • France: 4 since 1800 (4/222 = 1.8%/year)
    • 1 since 1970 (1/52 = 1.9%/year)
  • Spain: 4 since 1800 (4/222 = 1.8%/year)
    • 2 since 1970 (2/52 = 3.8%/year)
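The per-year rates above are simple count-over-years ratios. A minimal sketch of that arithmetic follows. The dictionary and function names are illustrative, and the 52-year window for "since 1970" is taken from the fractions used elsewhere in the slides.

```python
# Deaths in office as (count, years observed), copied from the slide.
# Canada's 2.5 presumably counts the resignation due to illness as half an event.
deaths_total = {
    "US": (8, 233),
    "Canada": (2.5, 144),
    "UK": (4, 222),
    "France": (4, 222),
    "Spain": (4, 222),
}
deaths_since_1970 = {
    "US": (0, 52),
    "Canada": (0, 52),
    "UK": (0, 52),
    "France": (1, 52),
    "Spain": (2, 52),
}


def annual_rate(count, years):
    """Empirical per-year rate: events divided by years observed."""
    return count / years


rates_total = {c: annual_rate(*v) for c, v in deaths_total.items()}
rates_recent = {c: annual_rate(*v) for c, v in deaths_since_1970.items()}

for country in deaths_total:
    print(f"{country:7s} total {rates_total[country]:.1%}   "
          f"1970+ {rates_recent[country]:.1%}")
```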

5 of 15

Combining Reference Classes

How should we combine this information? Take average? Which time frame?

Country   Fraction (Total)   Percent (Total)   Fraction (1970+)   Percent (1970+)
US        8/233              3.4%              0/52               0%
Canada    2.5/144            1.7%              0/52               0%
UK        4/222              1.8%              0/52               0%
France    4/222              1.8%              1/52               1.9%
Spain     4/222              1.8%              2/52               3.8%

6 of 15

Country Similarities and Differences

7 of 15

First Approach: Make up Weights (“Ensembling”)

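  • Country weights: US: 5; Canada and UK: 4 each; France: 3.5; Spain: 0.5
  • Time-frame weights: 1970+: 4; Total: 1 (so each 1970+ weight in the table is 4 times the country weight)
  • Final average: 0.8%/year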

Country   Fraction (Total)   Percent (Total)   Weight   Fraction (1970+)   Percent (1970+)   Weight
US        8/233              3.4%              5        0/52               0%                20
Canada    2.5/144            1.7%              4        0/52               0%                16
UK        4/222              1.8%              4        0/52               0%                16
France    4/222              1.8%              3.5      1/52               1.9%              14
Spain     4/222              1.8%              0.5      2/52               3.8%              2
Average                      2.3%                                          0.5%
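The weighted averages in this table can be reproduced with a short script. This is a sketch under the assumption that the final figure is a weighted mean over all ten (country, time frame) cells, with each cell weighted by country weight times time-frame weight. The variable names are illustrative. Computed from the unrounded fractions, the overall figure comes out near 0.86%/year, which the slide reports as 0.8%/year.

```python
# (count, years) per country for each time frame, copied from the table.
cells = {
    "US":     {"total": (8, 233),   "recent": (0, 52)},
    "Canada": {"total": (2.5, 144), "recent": (0, 52)},
    "UK":     {"total": (4, 222),   "recent": (0, 52)},
    "France": {"total": (4, 222),   "recent": (1, 52)},
    "Spain":  {"total": (4, 222),   "recent": (2, 52)},
}
# Made-up weights from the slide.
country_weight = {"US": 5, "Canada": 4, "UK": 4, "France": 3.5, "Spain": 0.5}
period_weight = {"total": 1, "recent": 4}


def weighted_mean(pairs):
    """Weighted mean of (value, weight) pairs."""
    pairs = list(pairs)
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)


def period_average(period):
    """Country-weighted average rate within one time frame."""
    return weighted_mean(
        (cells[c][period][0] / cells[c][period][1], country_weight[c])
        for c in cells
    )


avg_total = period_average("total")    # about 2.3%
avg_recent = period_average("recent")  # about 0.5%

# Assumption: combine the two time frames using the time-frame weights.
overall = weighted_mean([
    (avg_total, period_weight["total"]),
    (avg_recent, period_weight["recent"]),
])

print(f"Total: {avg_total:.2%}, 1970+: {avg_recent:.2%}, overall: {overall:.2%}")
```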

8 of 15

A More Principled Approach

Basic approach misses some things:

  • We know “Total” should overestimate the 1970+ period
  • Some classes have larger/smaller sample size

Next: a more sophisticated approach using probability models

  • Use this to understand what factors matter, practice, then go with your gut

9 of 15

Ensembling with Probabilities

  • $\pi$: true probability
  • $\pi_j$: probability for reference class $j$
  • $X_j$: observed count for reference class $j$

10 of 15

Simplest Model: Gaussians

  • $\pi_j \sim \mathrm{Gaussian}(\pi, \sigma_j^2)$
  • What if we know $\pi_j$ is an over/underestimate?
  • $\pi_j \sim \mathrm{Gaussian}(\pi + \Delta_j, \sigma_j^2)$
    • $\Delta_j$: bias, $\sigma_j^2$: uncertainty (variance)
  • Maximum likelihood estimate for $\pi$: (on board)
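The board derivation is not in the slides. The following is a sketch of the standard result for this model, under the added assumption that the reference classes are independent given $\pi$ and that each $\Delta_j$ and $\sigma_j^2$ is known.

```latex
% Log-likelihood of the bias-shifted Gaussian model (independence assumed)
\ell(\pi) = -\sum_j \frac{(\pi_j - \pi - \Delta_j)^2}{2\sigma_j^2} + \text{const}

% Set the derivative with respect to \pi to zero
\frac{d\ell}{d\pi} = \sum_j \frac{\pi_j - \Delta_j - \pi}{\sigma_j^2} = 0

% Solve: inverse-variance weighted average of the bias-corrected estimates
\hat{\pi} = \frac{\sum_j (\pi_j - \Delta_j)/\sigma_j^2}{\sum_j 1/\sigma_j^2}
```

In words: subtract each class's bias, then weight each corrected estimate by $1/\sigma_j^2$, so that more certain reference classes count for more.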

11 of 15

What about $X_j$? (On board)

  • We have $X_j \sim \mathrm{Binomial}(\pi_j, N_j)$, where $N_j$ is the total # of “trials” (e.g. 222 years)
  • Gaussian approximation of binomial: $X_j \sim \mathrm{Gaussian}(N_j \pi_j, N_j \pi_j (1 - \pi_j))$
  • Let $\pi_j' = X_j / N_j$. Then this simplifies to $\pi_j' \sim \mathrm{Gaussian}(\pi_j, \pi_j (1 - \pi_j) / N_j)$
  • Trick: estimate the variance with the empirical variance
  • Simplify to a single Gaussian: $\pi_j' \sim \mathrm{Gaussian}(\pi + \Delta_j, \sigma_j^2 + \pi_j' (1 - \pi_j') / (N_j - 1))$
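The variance in the last bullet is straightforward to compute. Below is a minimal sketch of that expression. The function name and the example inputs are illustrative, using the Spain "Total" figures (4 deaths in 222 years) and an a priori variance of 5.0e-4.

```python
def adjusted_variance(count, years, prior_var):
    """Variance of the observed rate around pi + bias.

    Adds the a priori variance (sigma_j^2) to the empirical sampling
    variance p'(1 - p') / (N - 1), where p' = count / years.
    """
    p_hat = count / years
    sampling_var = p_hat * (1 - p_hat) / (years - 1)
    return prior_var + sampling_var


# Illustrative inputs: 4 events in 222 years, a priori variance 5.0e-4.
spain_total = adjusted_variance(4, 222, 5.0e-4)
print(f"{spain_total:.2e}")
```

With these inputs the result is about 5.8e-4. Note that a class with zero observed events gets no sampling variance from this formula, so it needs separate handling.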

12 of 15

Rule of Thumb

The adjusted standard deviation is dominated either by a priori uncertainty ($\sigma_j$) or by sample noise (roughly $\sqrt{X_j} / N_j$).

If $X_j = 0$, then we want to adjust slightly (roughly $1/N_j$, or perhaps $0.5/N_j$).
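The $\sqrt{X_j}/N_j$ figure follows from the single-Gaussian variance when the rate is small and $N_j$ is large. This is a sketch of the approximation, not a statement taken from the slides.

```latex
% Sampling variance with \pi_j' = X_j / N_j, for \pi_j' \ll 1 and N_j \gg 1
\frac{\pi_j'(1 - \pi_j')}{N_j - 1} \approx \frac{\pi_j'}{N_j} = \frac{X_j}{N_j^2}

% Hence the sample-noise standard deviation
\sqrt{\frac{X_j}{N_j^2}} = \frac{\sqrt{X_j}}{N_j}

% The adjusted standard deviation is close to whichever term is larger
\sqrt{\sigma_j^2 + \frac{X_j}{N_j^2}} \approx \max\!\left(\sigma_j, \frac{\sqrt{X_j}}{N_j}\right)
```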

13 of 15

Applying the Probabilistic Approach

Country   Fraction (Total)   Percent (Total)   Bias (Δ)   Var. (σ²)   Adjusted Var.
US        8/233              3.4%              0.016      2.5e-4      4.1e-4
Canada    2.5/144            1.7%              0.008      0.9e-4      2.3e-4
UK        4/222              1.8%              0.008      0.9e-4      1.8e-4
France    4/222              1.8%              0.008      1.1e-4      2.0e-4
Spain     4/222              1.8%              0.02       5.0e-4      5.8e-4

Country   Fraction (1970+)   Percent (1970+)   Bias (Δ)   Var. (σ²)   Adjusted Var.
US        0/52               0%                0          0.1e-4      1.9e-4
Canada    0/52               0%                0          0.3e-4      2.0e-4
UK        0/52               0%                0          0.3e-4      2.0e-4
France    1/52               1.9%              0          0.5e-4      5.8e-4
Spain     2/52               3.8%              0.02       5.0e-4      13.6e-4

Final ensemble answer (see spreadsheet): 0.64%/year
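The spreadsheet itself is not reproduced in the slides. One way to arrive at a figure of this size is an inverse-variance weighted average of the bias-corrected rates, using the Bias and Adjusted Var. columns from the table. The sketch below assumes exactly that, and also assumes the ten estimates can be treated as independent, even though each "Total" window contains the 1970+ window. The names are illustrative. It evaluates to roughly 0.64%/year.

```python
# Rows: (country, period, count, years, bias, adjusted variance),
# copied from the table above.
rows = [
    ("US",     "total", 8,   233, 0.016, 4.1e-4),
    ("Canada", "total", 2.5, 144, 0.008, 2.3e-4),
    ("UK",     "total", 4,   222, 0.008, 1.8e-4),
    ("France", "total", 4,   222, 0.008, 2.0e-4),
    ("Spain",  "total", 4,   222, 0.02,  5.8e-4),
    ("US",     "1970+", 0,   52,  0.0,   1.9e-4),
    ("Canada", "1970+", 0,   52,  0.0,   2.0e-4),
    ("UK",     "1970+", 0,   52,  0.0,   2.0e-4),
    ("France", "1970+", 1,   52,  0.0,   5.8e-4),
    ("Spain",  "1970+", 2,   52,  0.02,  13.6e-4),
]


def ensemble_estimate(rows):
    """Inverse-variance weighted mean of bias-corrected empirical rates.

    Assumption: rows are treated as independent estimates of the same pi.
    """
    numerator = 0.0
    denominator = 0.0
    for _country, _period, count, years, bias, adj_var in rows:
        corrected = count / years - bias  # remove the assumed bias
        weight = 1.0 / adj_var            # more certain rows count more
        numerator += weight * corrected
        denominator += weight
    return numerator / denominator


estimate = ensemble_estimate(rows)
print(f"{estimate:.2%} per year")
```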

14 of 15

Your Turn! Reference Classes for Resignation

  • US: 1 resignation / 233 years (1974)
  • Canada: many dissolutions of government; 1 prime minister resigned due to scandals (1873)
  • UK: many dissolutions of government; 3 were due to scandals (1855, 1857, 1924), but none involved the prime minister resigning

How would you combine this information?

15 of 15

Other Uses of Ensembling

  • External opinions: average many “forecasts” together
  • Weight by track record
  • This is how I (Jacob) form most opinions about the world!
    • Key: pick good people to “trust” in each area, and occasionally find ways to vet their track record
  • Also another reason why forecasting in groups, or examining a problem from many angles, is helpful