Interpretability Theory of Impact Voting Results - Public Editing

s	Name	Mean	Std	Neel Nanda	Evan Hubinger	Adam Scherlis	Victoria Krakovna	Janos Kramar	Dane Sherburn	Charlie Steiner	Nick Turner	Ramana Kumar	Simeon Campos	Aron Malmborg	Jenny Nitishinskaya	Charbel-Raphael Segerie	Arthur Conmy	Curt Tigges	Mark Bissell	Claude Haiku	GPT-4o mini	Sebastian Jost
3	Auditing	6.7	1.77	9	5	4	9	6	4	7	6	6	8	8	4	5	7	7	8	7	10	7.5
7	Improving Feedback	6.5	1.86	6	8	8	4	4	9	8	6	6	8.5	7	3	3	7	8	6	8	8	6
1	Force multiplier	6.6	1.77	8	8	8	8	3	7	5	8	5	5	7	5	4	6	6	7	8	10	8
6	Threat model evidence	6.6	2.04	8.5	6	4	9	8	7	5	9	6	3.5	9	4	4	6	4	7	7	9	9
2	Better prediction	6.3	2.06	8	6	8	6	3	5	4	8	6	6	8	5	2	4	8	8	9	9	7
4	Auditing for deception	6.2	1.61	9	5	4	8	7	5	6	6	5	7	9	4	4	5	7	5	6	8	7
8	Informed oversight	5.9	1.99	8	7	4	7	4	8	4	4	5	8	7	4	3	3	8	6	8	9	5.5
12	Cultural shift 1	5.5	1.84	4	6	4	5	7	7	6	7	7	6	8	2	3	5	4	3	8	8	5
9	In the loss function	5.3	1.73	5	7	4	5	4	6	3	6	6	7.5	5	4	3	8	6	2	7	8	5
10	Norm setting	5.3	2.13	5	6	4	5	4	8	6	7	3	7	8	2	2	6	6	1	7	8	5.5
17	Forecasting discontinuities	5.5	1.84	7	5	8	6	6	7	4	5	6	8	3	3	3	2	6	5	8	7	6
14	Epistemic learned helplessness	5.4	1.64	7.5	4	8	4	6	3	3	8	5	6	4	4	5	7	5	6	7	7	4
5	Enabling coordination	5.3	2.00	6	3	4	4	4	4	9	6	3	6	8	5	3	6	5	5	8	9	3
16	Get AIs to do it	5.1	2.00	6	7	8	5	7	3	4	3	5	4	5	2	5	4	6	3	8	9	3
18	Intervening on training	5.3	2.40	7.5	8	4	6	6	8	6	5	7	4	4	3	3	1	2	2	7	9	8
11	Regulation	5.1	2.16	6.5	3	4	3	3	4	2	5	6	8	7	4	4	4	5	4	9	10	5
13	Cultural shift 2	4.9	1.70	4	6	4	3	5	8	3	6	7	5	5	3	3	4	5	3	8	7	4
15	Microscope AI	4.9	2.00	3	7	4	3	3	9	2	4	5	6.5		3	5	8	5	4	7	6	3
20	ELK	5.4	2.26		6		8			3	3	7	5	4	3	2	3	7	9	8	6	7
19	Auditing a training run	5.1	2.27	5.5	8	4	4	6	8	2	5	4	7	3	2	1	4	5	4	8	8	8
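
The Mean and Std columns appear to be the arithmetic mean and the sample standard deviation (n - 1 denominator) of each row's individual scores, with blank cells ignored. That reading is an inference from the numbers rather than something the sheet states, but the Auditing and ELK rows both reproduce under it. A minimal Python sketch of the check follows; the helper name `summarize` and the use of `None` for a blank cell are illustrative choices, not part of the sheet.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample std) of the non-blank scores in one table row."""
    # Assumption: a blank cell means "no vote" and is dropped, not counted as 0.
    present = [s for s in scores if s is not None]
    return mean(present), stdev(present)


# Individual scores copied from the "Auditing" row (no blanks).
auditing = [9, 5, 4, 9, 6, 4, 7, 6, 6, 8, 8, 4, 5, 7, 7, 8, 7, 10, 7.5]
# Scores from the "ELK" row; None marks a voter who left the cell blank.
elk = [None, 6, None, 8, None, None, 3, 3, 7, 5, 4, 3, 2, 3, 7, 9, 8, 6, 7]

for name, row in [("Auditing", auditing), ("ELK", elk)]:
    m, s = summarize(row)
    print(f"{name}: mean={m:.1f} std={s:.2f}")
# Auditing: mean=6.7 std=1.77
# ELK: mean=5.4 std=2.26
```

Because blanks are dropped rather than treated as zero, rows such as ELK and Microscope AI are averaged over fewer voters than the others, which is worth keeping in mind when comparing their means.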