| A | B | C | D | E | F | G | H | I | J | |
|---|---|---|---|---|---|---|---|---|---|---|
1 | ||||||||||
2 | Models v Markets: Kelly criterion betting outcomes* | |||||||||
3 | Model | House Control | Senate Control | House Races | Senate Races | Gov Races | TOTAL PROFIT/LOSS | Return on Bankroll | ||
4 | The Weekly Standard | $ (35.82) | $ 993.94 | $ 958.12 | 3.99% | |||||
5 | The Crosstab | $ 830.00 | $ 2,232.22 | $ 3,062.22 | 3.33% | |||||
6 | 538 Deluxe | $ 3,522.64 | $ 1,029.52 | $ 514.66 | $ 5,066.81 | 3.75% | ||||
7 | CNN | $ 2,742.89 | $ 879.48 | $ 3,622.37 | 3.32% | |||||
8 | 538 Classic | $ 1,017.92 | $ (197.19) | $ 1,296.42 | $ 304.97 | $ 540.29 | $ 2,962.42 | 1.77% | ||
9 | Noah Rudnick | $ 2,548.64 | $ 2,548.64 | 2.90% | ||||||
10 | The Economist | $ 1,000.47 | $ (1,036.73) | $ (36.26) | -0.06% | |||||
11 | 538 Lite | $ (1,317.55) | $ 433.87 | $ (322.37) | $ (1,206.05) | -0.89% | ||||
12 | DDHQ/0ptimus | $ 2,113.14 | $ 473.90 | $ (4,763.97) | $ (68.40) | $ 345.51 | $ (1,899.80) | -1.14% | ||
13 | ||||||||||
14 | ||||||||||
15 | This table shows the net profit and loss for a betting strategy on PredictIt that follows each of the models listed. The strategy is a simple Kelly criterion with a $1,000 bankroll per contract. If the model's probability is below the market price, you buy NO shares, sized according to Kelly. If the model's probability is above the market price, you buy YES shares, again sized according to Kelly. | |||||||||
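As a rough illustration of the sizing rule described above, here is a minimal Python sketch. It is not the spreadsheet's actual formula. The function names `kelly_bet` and `settle` are hypothetical, and the sketch assumes that a YES share costs the market price and pays $1, that a NO share costs one minus that price, and that PredictIt fees and bid/ask spreads are ignored.

```python
def kelly_bet(model_prob, market_price, bankroll=1000.0):
    """Return (side, stake) for one binary contract under the Kelly criterion.

    model_prob   -- the model's probability that the contract resolves YES
    market_price -- YES price in dollars, strictly between 0 and 1
    bankroll     -- amount available for this contract ($1k in the write-up)
    Assumption: NO costs (1 - market_price); fees and spreads are ignored.
    """
    if not 0.0 < market_price < 1.0:
        raise ValueError("market_price must be strictly between 0 and 1")
    if not 0.0 <= model_prob <= 1.0:
        raise ValueError("model_prob must be between 0 and 1")

    if model_prob > market_price:
        # Kelly fraction f = (b*p - q) / b with net odds b = (1 - c) / c,
        # which simplifies to (p - c) / (1 - c) for a YES share bought at c.
        fraction = (model_prob - market_price) / (1.0 - market_price)
        return "YES", bankroll * fraction
    if model_prob < market_price:
        # Same formula applied to the NO side, which simplifies to (c - p) / c.
        fraction = (market_price - model_prob) / market_price
        return "NO", bankroll * fraction
    return "NONE", 0.0  # model and market agree, so no bet


def settle(side, stake, market_price, yes_won):
    """Profit or loss on a stake once the contract resolves."""
    if side == "NONE" or stake == 0.0:
        return 0.0
    cost = market_price if side == "YES" else 1.0 - market_price
    shares = stake / cost
    won = yes_won if side == "YES" else not yes_won
    # Each winning share pays $1, so profit per share is (1 - cost).
    return shares * (1.0 - cost) if won else -stake
```

For example, with a model probability of 0.60 and a market price of $0.50, `kelly_bet(0.60, 0.50)` stakes about $200 on YES, and `settle` returns about +$200 if the contract resolves YES and -$200 otherwise.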
16 | ||||||||||
17 | So if the model is more accurate than the market, you will profit on average. If it's less accurate than the market, you will lose money on average. If the two are equally accurate, or simply agree with each other, you will roughly break even. | |||||||||
18 | ||||||||||
19 | Now, there's a lot of variance to consider! One race going a little bit this way or that way can swing your whole portfolio. This is also just a relative measure of accuracy - it compares models vs markets. But you can also compare each separately to reality, which is shown by the Brier scores: | |||||||||
20 | ||||||||||
21 | Models v Markets: Brier scores on races with betting markets* | |||||||||
22 | Model | House Races | Senate Races | Gov Races | ||||||
23 | 538 Deluxe | 0.0895 | 0.0530 | 0.0943 | ||||||
24 | PredictIt | 0.1018 | 0.0650 | 0.0948 | ||||||
25 | 538 Classic | 0.1007 | 0.0839 | 0.0962 | ||||||
26 | CNN | 0.1002 | 0.0667 | |||||||
27 | The Crosstab | 0.1070 | ||||||||
28 | 538 Lite | 0.1129 | 0.0813 | 0.1086 | ||||||
29 | The Economist | 0.1142 | ||||||||
30 | DDHQ/0ptimus | 0.1273 | 0.0795 | 0.0954 | ||||||
31 | Noah Rudnick | 0.1218 | ||||||||
32 | The Weekly Standard | 0.0738 | ||||||||
33 | ||||||||||
34 | Table includes all races for which there were betting markets as of 11/06 (N=89) | |||||||||
35 | ||||||||||
36 | To calculate Brier scores, I'm using Wikipedia's definition (which is basically the mean squared error between the forecast probability and the 0/1 outcome). | *Results only for called races so far | ||||||||
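To make the Brier score definition concrete, here is a small Python sketch of the binary-outcome form (the mean of squared differences between forecast probability and outcome). The function name `brier_score` and the sample numbers in the usage note are illustrative assumptions, not values taken from the spreadsheet.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and outcomes.

    forecasts -- probabilities (0 to 1) that each event happens
    outcomes  -- 1 if the event happened, 0 if it did not
    Lower is better: 0 is a perfect score, and always forecasting 50%
    scores 0.25.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("need at least one forecast")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)
```

For three made-up races with forecasts of 0.9, 0.2 and 0.5 and outcomes 1, 0 and 1, the squared errors are 0.01, 0.04 and 0.25, which average to about 0.10.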
37 | ||||||||||
38 | Similarly, I'm using the Kelly criterion formula from Wikipedia as well. | |||||||||
39 | ||||||||||
40 | Please feel free to double-check my math! It's quite possible (maybe even likely) that I've made a formula error somewhere. | |||||||||
41 | Click here for link to Spreadsheet where you can more easily compare from model to model (also includes experts) | |||||||||
42 | ||||||||||
43 | ||||||||||
44 | ||||||||||
45 | ||||||||||
46 | Models v Markets: Number of 'correct calls' for resolved races with PredictIt markets | |||||||||
47 | Model | House | House % | Senate | Senate % | Governor | Governor % | |||
48 | 538 Deluxe | 80 | 89.89% | 19 | 90.48% | 23 | 88.46% | |||
49 | PredictIt | 79 | 88.76% | 19 | 90.48% | 22 | 84.62% | |||
50 | 538 Classic | 79 | 88.76% | 18 | 85.71% | 22 | 84.62% | |||
51 | CNN | 77 | 86.52% | 18 | 85.71% | |||||
52 | DDHQ/0ptimus | 76 | 85.39% | 23 | 88.46% | |||||
53 | The Crosstab | 76 | 85.39% | |||||||
54 | The Economist | 75 | 84.27% | |||||||
55 | 538 Lite | 74 | 83.15% | 18 | 85.71% | 22 | 84.62% | |||
56 | Noah Rudnick | 74 | 83.15% | |||||||
57 | The Weekly Standard | 19 | 90.48% | |||||||
58 | ||||||||||
59 | ||||||||||
60 | ||||||||||
61 | Models v Models: Number of 'correct calls' for all resolved races | |||||||||
62 | Model | House | House % | Senate | Senate % | Governor | Governor % | |||
63 | 538 Deluxe | 424 | 97.47% | 30 | 93.75% | 33 | 91.67% | |||
64 | 538 Classic | 423 | 97.24% | 29 | 90.63% | 32 | 88.89% | |||
65 | CNN | 421 | 96.78% | 29 | 90.63% | |||||
66 | DDHQ/0ptimus | 420 | 96.55% | 33 | 91.67% | |||||
67 | The Crosstab | 420 | 96.55% | |||||||
68 | The Economist | 419 | 96.32% | |||||||
69 | 538 Lite | 418 | 96.09% | 29 | 90.63% | 32 | 88.89% | |||
70 | Noah Rudnick | 418 | 96.09% | |||||||
71 | The Weekly Standard | 30 | 93.75% | |||||||
72 | ||||||||||
73 | ||||||||||
74 | ||||||||||
75 | ||||||||||
76 | Number of Ratings in House (called races with PredictIt markets only) | |||||||||
77 | Model | Safe R | Likely R | Leans R | Tossup | Leans D | Likely D | Safe D | ||
78 | Cook Political | 8 | 7 | 20 | 28 | 15 | 5 | 5 | ||
79 | Inside Elections | 12 | 10 | 18 | 19 | 21 | 3 | 5 | ||
80 | Crystal Ball | 8 | 4 | 32 | 2 | 29 | 6 | 7 | ||
81 | PredictIt | 7 | 18 | 14 | 14 | 13 | 14 | 8 | ||
82 | DDHQ/0ptimus | 8 | 10 | 16 | 28 | 14 | 6 | 6 | ||
83 | The Economist | 13 | 18 | 9 | 21 | 4 | 16 | 7 | ||
84 | 538 Lite | 9 | 18 | 10 | 22 | 8 | 15 | 6 | ||
85 | 538 Classic | 9 | 19 | 8 | 17 | 10 | 14 | 11 | ||
86 | The Crosstab | 13 | 16 | 13 | 14 | 6 | 16 | 10 | ||
87 | 538 Deluxe | 11 | 21 | 6 | 12 | 14 | 13 | 11 | ||
88 | Noah Rudnick | 11 | 15 | 4 | 11 | 13 | 14 | 16 | ||
89 | CNN | 14 | 21 | 4 | 11 | 15 | 10 | 13 | ||
90 | ||||||||||
91 | ||||||||||
92 | % of Dem Wins per Rating Category (called races with PredictIt markets only) | |||||||||
93 | Model | Safe R | Likely R | Leans R | Tossup | Leans D | Likely D | Safe D | ||
94 | "Perfect" performance: | 2.50% | 15.00% | 32.50% | 50.00% | 67.50% | 85.00% | 97.50% | ||
95 | Cook Political | 0.00% | 14.29% | 5.00% | 75.00% | 100.00% | 100.00% | 100.00% | ||
96 | Inside Elections | 0.00% | 10.00% | 22.22% | 73.68% | 100.00% | 100.00% | 100.00% | ||
97 | Crystal Ball | 0.00% | 0.00% | 21.88% | 50.00% | 93.10% | 100.00% | 100.00% | ||
98 | PredictIt | 0.00% | 5.56% | 21.43% | 64.29% | 100.00% | 100.00% | 100.00% | ||
99 | DDHQ/0ptimus | 0.00% | 0.00% | 12.50% | 71.43% | 100.00% | 100.00% | 100.00% | ||
100 | The Economist | 7.69% | 5.56% | 33.33% | 76.19% | 100.00% | 100.00% | 100.00% | ||