Models v Markets: Kelly criterion betting outcomes*
Model | House Control | Senate Control | House Races | Senate Races | Gov Races | TOTAL PROFIT/LOSS | Return on Bankroll
The Weekly Standard $ (35.82) $ 993.94 $ 958.12 3.99%
The Crosstab $ 830.00 $ 2,232.22 $ 3,062.22 3.33%
538 Deluxe $ 3,522.64 $ 1,029.52 $ 514.66 $ 5,066.81 3.75%
CNN $ 2,742.89 $ 879.48 $ 3,622.37 3.32%
538 Classic $ 1,017.92 $ (197.19) $ 1,296.42 $ 304.97 $ 540.29 $ 2,962.42 1.77%
Noah Rudnick $ 2,548.64 $ 2,548.64 2.90%
The Economist $ 1,000.47 $ (1,036.73) $ (36.26) -0.06%
538 Lite $ (1,317.55) $ 433.87 $ (322.37) $ (1,206.05) -0.89%
DDHQ/0ptimus $ 2,113.14 $ 473.90 $ (4,763.97) $ (68.40) $ 345.51 $ (1,899.80) -1.14%
This table shows the net profit and loss for a betting strategy on PredictIt that follows each of the models listed. The strategy is a simple Kelly criterion with $1k available to invest per contract. If the model's probability is lower than the market price, you buy NO shares, sized according to Kelly. If the model's probability is higher than the market price, you buy YES shares, again sized according to the Kelly criterion.
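The sizing rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the spreadsheet's actual formulas: the names `kelly_bet` and `settle` are hypothetical, the NO price is assumed to be one minus the YES price, and fees and bid/ask spreads are ignored. It uses the standard Kelly fraction f* = p - q/b, where p is the model's probability of winning the bet, q = 1 - p, and b is the net odds implied by the share price.

```python
def kelly_bet(model_prob, market_price, bankroll=1000.0):
    """Return (side, stake) for one binary contract.

    model_prob   -- the model's probability that YES resolves true (0..1)
    market_price -- YES share price in dollars (0..1), read as a probability
    bankroll     -- money available for this contract ($1k in the strategy above)
    """
    if not 0.0 < market_price < 1.0:
        raise ValueError("market_price must be strictly between 0 and 1")
    if model_prob > market_price:
        side, p, price = "YES", model_prob, market_price
    elif model_prob < market_price:
        # Assumption: a NO share costs 1 - YES price (no spread).
        side, p, price = "NO", 1.0 - model_prob, 1.0 - market_price
    else:
        return ("NONE", 0.0)  # model and market agree: no bet
    b = (1.0 - price) / price        # net odds: profit per dollar staked if the bet wins
    fraction = p - (1.0 - p) / b     # Kelly fraction f* = p - q/b
    return (side, bankroll * fraction)


def settle(side, stake, market_price, yes_won):
    """Profit or loss on the stake once the contract resolves (fees ignored)."""
    if side == "NONE":
        return 0.0
    price = market_price if side == "YES" else 1.0 - market_price
    won = yes_won if side == "YES" else not yes_won
    return stake * (1.0 - price) / price if won else -stake


side, stake = kelly_bet(0.70, 0.60)
print(side, round(stake, 2))                        # YES 250.0
print(round(settle(side, stake, 0.60, True), 2))    # 166.67
print(round(settle(side, stake, 0.60, False), 2))   # -250.0
```

When the model and the market agree, the stake is zero, which is why agreement produces neither profit nor loss.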
So if the model is more accurate than the market, you will tend to profit. If it's less accurate than the market, you will tend to lose money. If the two are equally accurate, or simply agree with each other, you will roughly break even.
Now, there's a lot of variance to consider! One race going a little bit this way or that way can swing your whole portfolio. This is also just a relative measure of accuracy: it compares models against markets. But you can also compare each one separately against reality, which is what the Brier scores show:
Models v Markets: Brier scores on races with betting markets*
Model | House Races | Senate Races | Gov Races
538 Deluxe | 0.0895 | 0.0530 | 0.0943
PredictIt | 0.1018 | 0.0650 | 0.0948
538 Classic | 0.1007 | 0.0839 | 0.0962
CNN | 0.1002 | 0.0667 |
The Crosstab | 0.1070 | |
538 Lite | 0.1129 | 0.0813 | 0.1086
The Economist | 0.1142 | |
DDHQ/0ptimus | 0.1273 | 0.0795 | 0.0954
Noah Rudnick | 0.1218 | |
The Weekly Standard | | 0.0738 |
Table includes all races for which there were betting markets as of 11/06 (N=89)
To calculate Brier scores, I'm using Wikipedia's definition (which is basically mean squared error).
*Results only for called races so far
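For reference, the Brier score for binary outcomes is the mean of the squared differences between each forecast probability and what actually happened (1 if the event occurred, 0 if not). A minimal sketch follows; the function name `brier_score` and the sample numbers are made up for illustration and are not taken from the spreadsheet.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    forecasts -- probabilities (0..1) that each event happens
    outcomes  -- 1 if the event happened, 0 if it did not
    """
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and the same length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Illustrative numbers only: three races, the first of which the favored side won.
print(round(brier_score([0.9, 0.6, 0.2], [1, 0, 0]), 4))  # 0.1367
```

Lower is better: a perfect forecaster scores 0, and always saying 50% scores 0.25.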
Similarly, I'm using the Kelly criterion formula from Wikipedia.
Please feel free to double-check my math! It's quite possible (maybe even likely) that I've made a formula error somewhere.
Click here for link to Spreadsheet where you can more easily compare from model to model (also includes experts)
Models v Markets: Number of 'correct calls' for resolved races with PredictIt markets
Model | House | House % | Senate | Senate % | Governor | Governor %
538 Deluxe | 80 | 89.89% | 19 | 90.48% | 23 | 88.46%
PredictIt | 79 | 88.76% | 19 | 90.48% | 22 | 84.62%
538 Classic | 79 | 88.76% | 18 | 85.71% | 22 | 84.62%
CNN | 77 | 86.52% | 18 | 85.71% | |
DDHQ/0ptimus | 76 | 85.39% | | | 23 | 88.46%
The Crosstab | 76 | 85.39% | | | |
The Economist | 75 | 84.27% | | | |
538 Lite | 74 | 83.15% | 18 | 85.71% | 22 | 84.62%
Noah Rudnick | 74 | 83.15% | | | |
The Weekly Standard | | | 19 | 90.48% | |

Models v Models: Number of 'correct calls' for all resolved races
Model | House | House % | Senate | Senate % | Governor | Governor %
538 Deluxe | 424 | 97.47% | 30 | 93.75% | 33 | 91.67%
538 Classic | 423 | 97.24% | 29 | 90.63% | 32 | 88.89%
CNN | 421 | 96.78% | 29 | 90.63% | |
DDHQ/0ptimus | 420 | 96.55% | | | 33 | 91.67%
The Crosstab | 420 | 96.55% | | | |
The Economist | 419 | 96.32% | | | |
538 Lite | 418 | 96.09% | 29 | 90.63% | 32 | 88.89%
Noah Rudnick | 418 | 96.09% | | | |
The Weekly Standard | | | 30 | 93.75% | |

Number of Ratings in House (called races with PredictIt markets only)
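Model | Safe R | Likely R | Leans R | Tossup | Leans D | Likely D | Safe D
Cook Political | 8 | 7 | 20 | 28 | 15 | 5 | 5
Inside Elections | 12 | 10 | 18 | 19 | 21 | 3 | 5
Crystal Ball | 8 | 4 | 32 | 2 | 29 | 6 | 7
PredictIt | 7 | 18 | 14 | 14 | 13 | 14 | 8
DDHQ/0ptimus | 8 | 10 | 16 | 28 | 14 | 6 | 6
The Economist | 13 | 18 | 9 | 21 | 4 | 16 | 7
538 Lite | 9 | 18 | 10 | 22 | 8 | 15 | 6
538 Classic | 9 | 19 | 8 | 17 | 10 | 14 | 11
The Crosstab | 13 | 16 | 13 | 14 | 6 | 16 | 10
538 Deluxe | 11 | 21 | 6 | 12 | 14 | 13 | 11
Noah Rudnick | 11 | 15 | 4 | 11 | 13 | 14 | 16
CNN | 14 | 21 | 4 | 11 | 15 | 10 | 13
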
% of Dem Wins per Rating Category (called races with PredictIt markets only)
Model | Safe R | Likely R | Leans R | Tossup | Leans D | Likely D | Safe D
"Perfect" performance: | 2.50% | 15.00% | 32.50% | 50.00% | 67.50% | 85.00% | 97.50%
Cook Political | 0.00% | 14.29% | 5.00% | 75.00% | 100.00% | 100.00% | 100.00%
Inside Elections | 0.00% | 10.00% | 22.22% | 73.68% | 100.00% | 100.00% | 100.00%
Crystal Ball | 0.00% | 0.00% | 21.88% | 50.00% | 93.10% | 100.00% | 100.00%
PredictIt | 0.00% | 5.56% | 21.43% | 64.29% | 100.00% | 100.00% | 100.00%
DDHQ/0ptimus | 0.00% | 0.00% | 12.50% | 71.43% | 100.00% | 100.00% | 100.00%
The Economist | 7.69% | 5.56% | 33.33% | 76.19% | 100.00% | 100.00% | 100.00%