Lectures 7-8: Combining Forecasts
Stat 165, Spring 2025
Slides credit: Jacob Steinhardt
Warm-up Question
“How many 8.5 x 11 sheets of paper does the average tree produce?”
Histogram of Answers
Now that you’ve seen the class’s distribution, what would you guess?
Wisdom of Crowds: Ox Judging Contest (Galton, 1907)
Wisdom of Crowds
Typical error: ± 30-40 lbs
Median answer: 1207 lbs
Correct answer: 1198 lbs
About 0.75% error (off by 9 lbs)!
Ways of Combining Forecasts
What are different techniques for combining forecasts?
General name for this: “ensembling” (also used in machine learning)
Mean
Median
Unlike mean, robust to outliers
Also independent of scale (same on linear or log scale)
Disadvantage: uses data less efficiently (only cares about middle values)
Trimmed Mean
Weighted Mean
Exercise:
The trimmed mean is a special case of the weighted mean, where we assign zero weight to answers that are far from the rest.
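The four combination rules above can be compared on a small example. This is a minimal sketch, not part of the original slides: the forecast values, the `trim_fraction` parameter, and the helper names `trimmed_mean` and `weighted_mean` are all hypothetical choices made for illustration.

```python
import math
from statistics import mean, median

def trimmed_mean(values, trim_fraction=0.1):
    """Drop the lowest and highest trim_fraction of values, then average the rest."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values dropped from each end
    kept = ordered[k:len(ordered) - k] if k > 0 else ordered
    return mean(kept)

def weighted_mean(values, weights):
    """Average of values, each counted in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical forecasts (illustrative only), with one large outlier.
forecasts = [900, 1000, 1100, 1200, 100000]

plain_mean = mean(forecasts)               # pulled far upward by the outlier
plain_median = median(forecasts)           # unaffected by the outlier
trimmed = trimmed_mean(forecasts, 0.2)     # drops 900 and 100000

# Trimmed mean written as a weighted mean: zero weight on the extremes.
as_weighted = weighted_mean(forecasts, [0, 1, 1, 1, 0])

# With an odd number of forecasts, the median is the same whether we work on
# the linear scale or the log scale; the mean is not.
median_via_logs = math.exp(median([math.log(v) for v in forecasts]))
mean_via_logs = math.exp(mean([math.log(v) for v in forecasts]))  # geometric mean

print(plain_mean, plain_median, trimmed, as_weighted)
print(median_via_logs, mean_via_logs)
```

With an even number of forecasts, the median averages the two middle values, so the linear-scale and log-scale medians can differ slightly.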
Implications for Your Own Forecasting
Weighing Experts
We are changing our call for the February FOMC meeting from a 50 [basis point] hike to a 25bp hike, although we think markets should continue to place some probability on a larger-sized hike. (source, Jan 18)
Shared by an economist at Citigroup, the 3rd largest banking institution in the US.
Pricing Wednesday morning pointed to a 94.3% probability of a 0.25 percentage point hike at the central bank’s two-day meeting that concludes Feb. 1, according to CME Group data. (source, Jan 18)
The CME Group is the world's largest financial derivatives exchange. The CME FedWatch Tool uses 30-Day Fed Funds futures pricing data to estimate the probabilities of changes to the Fed rate.
Markets expect the Fed to raise rates again on February 1, 2023, probably by 0.25 percentage points…. However, there’s a reasonable chance the Fed opts for a larger 0.5 percentage point hike. (source, Jan 2)
Simon Moore is a writer at Forbes. He provides an outsourced Chief Investment Officer service to institutional investors. He previously served as Chief Investment Officer at Moola and FutureAdvisor, both consumer investment startups that were subsequently acquired by S&P 500 firms. He has published two books, is a CFA Charterholder, and was educated at Oxford and Northwestern.
How do we choose the weights?
Cadmium weighted averages
Guess | CI width | Equal weights | Precision (1/width^2) | 1/sigma (1/width) |
40 | 10 | 1 | 0.01 | 0.1 |
80 | 30 | 1 | 0.001111111111 | 0.03333333333 |
70 | 40 | 1 | 0.000625 | 0.025 |
60 | 20 | 1 | 0.0025 | 0.05 |
60 | 30 | 1 | 0.001111111111 | 0.03333333333 |
80 | 20 | 1 | 0.0025 | 0.05 |
50 | 30 | 1 | 0.001111111111 | 0.03333333333 |
70 | 20 | 1 | 0.0025 | 0.05 |
55 | 20 | 1 | 0.0025 | 0.05 |
60 | 30 | 1 | 0.001111111111 | 0.03333333333 |
Weighted average (truth = 48) | | 62.5 | 55.20775623 | 59.63636364 |
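The weighted averages in the table can be reproduced with a few lines of Python. This sketch takes the guesses and CI widths from the table and treats each CI width as a stand-in for that forecaster's sigma, which is what the table's weight columns imply. The helper name `weighted_average` is a hypothetical choice.

```python
# Guesses and CI widths copied from the table above.
guesses = [40, 80, 70, 60, 60, 80, 50, 70, 55, 60]
ci_widths = [10, 30, 40, 20, 30, 20, 30, 20, 20, 30]

def weighted_average(values, weights):
    """Sum of value * weight, divided by the total weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Three weighting schemes, matching the three weight columns of the table.
equal = weighted_average(guesses, [1] * len(guesses))
precision = weighted_average(guesses, [1 / w**2 for w in ci_widths])
inverse_width = weighted_average(guesses, [1 / w for w in ci_widths])

print(round(equal, 2), round(precision, 2), round(inverse_width, 2))
# prints: 62.5 55.21 59.64
```

In this example the precision-weighted average lands closest to the true value of 48, because the narrowest interval belongs to the lowest guess. That is a feature of this data set, not a general guarantee.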
Working in Teams: The Delphi Method
Delphi method:
Variants:
Question. Why come up with numbers individually (rather than working collaboratively the whole time)?
Asch experiments (Wikipedia)
Ensembling with Yourself
What was the total annual budget of the US government in FY2022?
Come up with at least 3 distinct approaches to Fermi estimate this.
Then, decide how to combine the estimates together.
Combining Confidence Intervals
What if instead of point estimates, we have 80% confidence intervals?
Combining Sums
What if we are predicting X + Y, and have confidence intervals for X and Y?
For 70%/80% CI, stdev is usually a decent approximation
For extreme tails (99% CI), can be more complicated.
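One possible reading of the standard-deviation remark is sketched below. It assumes that X and Y are independent, that both intervals are symmetric and share the same confidence level, and that each half-width is roughly proportional to a standard deviation. Under those assumptions the half-widths add in quadrature, like standard deviations. The function name `combine_sum_ci` and the example intervals are hypothetical, and the slides do not specify this exact procedure.

```python
import math

def combine_sum_ci(ci_x, ci_y):
    """Approximate CI for X + Y from same-level CIs for X and Y.

    Assumes independence and symmetric intervals, so that half-widths
    behave like standard deviations and combine in quadrature.
    """
    lo_x, hi_x = ci_x
    lo_y, hi_y = ci_y
    center = (lo_x + hi_x) / 2 + (lo_y + hi_y) / 2
    half_width = math.hypot((hi_x - lo_x) / 2, (hi_y - lo_y) / 2)
    return center - half_width, center + half_width

# Hypothetical 80% intervals for X and Y (illustrative only).
lo, hi = combine_sum_ci((10, 20), (30, 60))

# Naively adding endpoints gives (40, 80), which is wider than the
# quadrature result because it ignores that errors partly cancel.
naive_lo, naive_hi = 10 + 30, 20 + 60

print(lo, hi)
print(naive_lo, naive_hi)
```

For extreme tails such as a 99% interval, the shape of the distribution matters more, and this approximation should be trusted less.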
Summary