Each record below lists: the respondent number; Answer to 1 (probability 0 to 1, or odds); Answer to 2 (probability 0 to 1, or odds); Comments / questions / objections to the framing / etc.; the boxes checked under "Check all that apply"; and, optionally, the respondent's affiliation.
Respondent 2
Answer to 1: 0.08
Answer to 2: 0.1
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
FHI
Respondent 3
Answer to 1: 0.25
Answer to 2: 0.3
`1` and `2` seem like they should be close together for me, because in your brain emulation scenario it implies that our civ has a large amount of willingness to sacrifice competitiveness/efficiency of AI systems (by pausing everything and doing this massive up-front research project). This slack seems like it would let us remove the vast majority of AI x-risk.
I'm doing (or have done) a lot of technical AI safety research.
Respondent 4
Answer to 1: 0.2
Answer to 2: 0.5
Really did not think about these probabilities much. LMK if you want me to give them more thought. Deployment-related work seems really important to me, and I take that to be excluded from technical research, hence the large gap.
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
Open Philanthropy
Respondent 5
Answer to 1: 10%
Answer to 2: 15%
Rough answer -- I spent ~3m thinking about this question (although have thought about related issues, framed slightly differently, for many days).
I'm doing (or have done) a lot of technical AI safety research.
CHAI or UC Berkeley
Respondent 6
Answer to 1: 0.6
Answer to 2: 0.6
I'm doing (or have done) a lot of technical AI safety research.
Respondent 7
Answer to 1: 0.3
Answer to 2: 0.5
I'm doing (or have done) a lot of technical AI safety research.
Respondent 8
Answer to 1: 0.1
Answer to 2: 0.1
FHI
Respondent 9
Answer to 1: 0.1
Answer to 2: 0.1
I am not thinking too hard about the difference between the two questions.
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
Respondent 10
Answer to 1: 68%
Answer to 2: 63%
It's a bit ambiguous whether Q1 covers failures to apply technical AI safety research. Given the elaboration, I'm taking 1 to include the probability that some people do enough AI safety research but others don't know/care/apply it correctly. Also I am giving you answers to the nearest 3 %age points but thinking a few hours more or changing my mood could easily move me tens of %age points.
I'm doing (or have done) a lot of technical AI safety research.
CHAI or UC Berkeley
Respondent 11
Answer to 1: 15%
Answer to 2: 50%
On the answer to 2, I'm counting stuff like "Human prefs are so inherently incoherent and path-dependent that there was never any robust win available and the only thing any AI could possibly do is shape human prefs into something arbitrary and satisfy those new preferences, resulting in a world that humans wouldn't like if they had taken a slightly different path to reflection and cognitive enhancement than the one the AI happened to lead them down." I guess that's not quite a central example of "the value of the future [being] drastically less than it could have been"?

On the below, I'm counting timelines thinking and thinking about what technical research Open Phil would want done as "governance research or strategy analysis related to AGI or transformative AI"
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
Open Philanthropy
Respondent 12
Answer to 1: 0.1
Answer to 2: 0.1
I'm doing (or have done) a lot of technical AI safety research.
FHI
Respondent 13
Answer to 1: [0.1, 0.5]
Answer to 2: [0.1, 0.9]
Conscientious semi-objection (sorry this is not super-cooperative, but it felt even *less* cooperative to not respond!): my credences for this question are really unstable, so I gave ranges that my credences don't seem to go outside of in practice. I feel like something pathological is happening when I answer this question, and am much happier with the answer "look, there are so many details that it's very hard to think about clearly, but it's in the category 'one of the very very highest EV problems in the world, which should be tackled by a group of people who treat it as a top priority and assume that nobody is going to show up to solve it for them.'"

Lately, I've been starting to think that probability assignment / EV number-crunching is critical for certain kinds of cause-selection work, but unhelpful for the group of people who are trying to be directly responsible for solving the problem.

I think I qualify for both "technical AI safety research" and "governance / strategy analysis", but I'm grading on a curve because so far relatively little has been done. Feel free to change my self-classification :)
I'm doing (or have done) a lot of technical AI safety research., I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
Respondent 14
Answer to 1: 0.96
Answer to 2: 0.98
I'm doing (or have done) a lot of technical AI safety research.
MIRI
Respondent 15
Answer to 1: 0.6
Answer to 2: 0.8
What are the worlds being compared in question 2? One in which AI systems (somehow manage to) do what the people deploying them wanted/intended vs the default? Or <the former> vs one in which humanity makes a serious effort to solve AI alignment? I answered under the first comparison.

My answer to question 1 might increase if I lean more on allowing "technical" to include philosophy (e.g., decision theory, ethics, metaphysics, content) and social science (e.g., sociology, economics).
I'm doing (or have done) a lot of technical AI safety research.
Respondent 16
Answer to 1: 80%
Answer to 2: 70%
I think that there's a lot riding on how you interpret “what the people deploying them wanted/intended” in the second question—e.g. does it refer to current values or values after some reflection process?
I'm doing (or have done) a lot of technical AI safety research.
MIRI
Respondent 17
Answer to 1: 0.92
Answer to 2: 0.92
I'm doing (or have done) a lot of technical AI safety research.
Respondent 18
Answer to 1: 0.1
Answer to 2: 0.2
I'm doing (or have done) a lot of technical AI safety research.
Respondent 19
Answer to 1: 0.98
Answer to 2: 0.96
I think there's a large chance that the deployers will take an attitude of "lol we don't care whatevs", but I'm still counting this as "not doing what they intended" because I expect that the back of their brains will still have expected something other than instant death and empty optimizers producing paperclips. If we don't count this fatality as "unintended", the answer to question 2 might be more like 0.85.

The lower probability for 2 reflects the remote possibility of deliberately and successfully salvaging a small amount of utility that is still orders of magnitude less than could have been obtained by full alignment; in those possible worlds, 1 would be true and 2 would be false.
I'm doing (or have done) a lot of technical AI safety research., I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
MIRI
Respondent 20
Answer to 1: 10%
Answer to 2: 50%
Very non-considered, off-the-cuff answers. I think I can access perspectives/assumptions/worldviews on which these probabilities are <10% and ones on which they're significantly higher. But I don't have a highly considered & resilient way to "aggregate" these perspectives into an all-things-considered credence.

I think there is some concern that a lot of ways in which 2 is true might be somewhat vacuous: to get to close-to-optimal futures we might need advanced AI capabilities, and these might only get us there if AI systems broadly optimize what we want. So this includes e.g. scenarios in which we never develop sufficiently advanced AI. Even if we read "AI systems not doing/optimizing what the people deploying them wanted" as presupposing the existence of AI systems, there may be vacuous ways in which non-advanced AI systems don't do what people want, but the key reason is not their 'misalignment' but simply their lack of sufficiently advanced capabilities.

I think on the most plausible narrow reading of 2, maybe my answer is more like 15%.
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
FHI
Respondent 21
Answer to 1: 0.26
Answer to 2: 0.66
The difference between the numbers is because (i) maybe we will solve alignment but the actor that builds TAI will implement the solution poorly or not at all; (ii) maybe alignment always comes at the cost of capability, and competition will lead to doom; (iii) maybe solving alignment is impossible.
I'm doing (or have done) a lot of technical AI safety research.
MIRI
Respondent 22
Answer to 1: 10%
Answer to 2: 5%
My second answer feels a lot more grounded than my first (though it's also anchored on things I said a year or two ago, and maybe should be updated now). The first is higher than the second because we're imagining a really truly massive research effort -- it seems quite plausible that (a) coordination failures could cause astronomical disvalue relative to coordination successes and (b) with a truly massive research effort, we could fix coordination failures, even when constrained to do it just via technical AI research. I don't really know what probability to assign to this (it could include e.g. nuclear war via MAD dynamics, climate change, production web, robust resistance to authoritarianism, etc., and it's hard to assign a probability both to all of those things and to the claim that a massive research effort could fix them when constrained to work via technical AI research).
I'm doing (or have done) a lot of technical AI safety research.
Respondent 23
Answer to 1: 0.8
Answer to 2: 0.95
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
Respondent 24
Answer to 1: 0.15
Answer to 2: 0.3
I think my answer to (1) changes quite a lot based on whether 'technical AI safety research' is referring only to research that happens before the advent of AGI, separate from the process of actually building it.
In the world where the first AGI system is built over 200 years by 10,000 top technical researchers thinking carefully about how to make it safe, I feel a lot more optimistic about our chances than in the world where 10,000 researchers do research for 200 years, then hand some kind of blueprint to the people who are actually building AGI, who may or may not actually follow the blueprint.
I'm interpreting this question as asking about the latter scenario (hand over blueprint). If I interpreted it as the former, my probability would be basically the same as for Q2.
I'm doing (or have done) a lot of technical AI safety research.
OpenAI
Respondent 25
Answer to 1: 0.2
Answer to 2: 0.4
I'm doing (or have done) a lot of technical AI safety research.
Respondent 26
Answer to 1: 0.03
Answer to 2: 0.03
I'm doing (or have done) a lot of technical AI safety research.
Respondent 27
Answer to 1: 5%
Answer to 2: 6%
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
Open Philanthropy
Respondent 28
Answer to 1: 15%
Answer to 2: 15%
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
FHI
Respondent 29
Answer to 1: 14%
Answer to 2: 18%
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
Respondent 30
Answer to 1: 0.5
Answer to 2: 0.7
For my 0.5 answer to question 1, I'm not imagining an ideal civilizational effort, but am picturing things being "much better" in the way of alignment research.

I haven't thought super long about these numbers, though I've run some checks on them.
I'm doing (or have done) a lot of technical AI safety research.
MIRI
Respondent 31
Answer to 1: 0.1
Answer to 2: 0.12
(2) is a bit higher than (1) because even if we "solve" the necessary technical problems, people who build AGI might not follow what the technical research says to do
I'm doing (or have done) a lot of governance research or strategy analysis related to AGI or transformative AI.
FHI
Respondent 32
Answer to 1: 0.2
Answer to 2: 0.22
OpenAI
Respondent 33
Answer to 1: 0.05
Answer to 2: 0.1
I'm doing (or have done) a lot of technical AI safety research.
OpenAI
Respondent 34
Answer to 1: 0.48
Answer to 2: 0.56
I’m interpreting “not doing what the people deploying them wanted/intended” as intended to mean “doing things that are systematically bad for the overall value of the future”, but this is very different from what it literally says. The intentions of people deploying AI systems may very well be at odds with the overall value of the future, so the latter could actually benefit from the AI systems not doing what the people deploying them wanted. If I take this phrase at its literal meaning then my answer to question 2 is more like 0.38.
I'm doing (or have done) a lot of technical AI safety research.
FHI
Respondent 35
Answer to 1: 0.25
Answer to 2: 0.35
Very off the cuff answer and more of a gestalt impression rather than informed by some inside view. I haven't worked on AI stuff for ~4 years.
Open Philanthropy
Respondent 36
Answer to 1: 0.2
Answer to 2: 0.4
I'm uncertain about this to the point where people shouldn't take these numbers seriously. I could probably easily be swayed in either direction.
I'm doing (or have done) a lot of technical AI safety research.
FHI
Respondent 37
Answer to 1: 0.5
Answer to 2: 0.5
My research: designing practical lifelong learners.

My general thoughts: might be more useful to discuss concrete trajectories AI development could follow, and then concrete problems to solve re safety. Here's one trajectory:

1. AI tech gets to the point where it's useful enough to emulate smart humans at little resource cost: AI doesn't look like a "function optimised for a loss function"; rather, it behaves like a human who's just very interested in a particular topic (e.g. someone who's obsessed with mathematics, some human interaction, and nothing else).

2. Governments use this to replicate many researchers in research fields, resulting in a massive acceleration of science development.

3. We develop technology which drastically changes the human condition and the scarcity of resources: e.g. completely realistic VR worlds, drugs which make people feel happy doing whatever they're doing while still functioning normally, and technology to remove the need for sleep and to prevent ageing, depending on what is empirically possible.

Dangers from this trajectory: 1. The initial phase, when AI is powerful but we haven't reached (3), during which there's a chance individual actors could do bad things. 2. Some may think that certain ways (3) could end up would be bad for humanity; for example, if we make a "happiness drug", it might be unavoidable that everyone will take it, but it might also leave humanity content to live as monks.

FHI
Respondent 38
Answer to 1: 0.1
Answer to 2: 0.15
FHI
Respondent 39
Answer to 1: 20%
Answer to 2: 20%
"Drastically less than it could have been" is confusing, because it could be "the future is great, far better than the present, but could have been drastically better still" or alternatively "we destroy the world / the future seems to have very negative utility". I'm sort of trying to split the difference in my answer. If it was only the latter, my probability would be lower, if the former, it would be higher.

Also, a world where there was tremendous effort directed at technical AI safety seems like a world where there would be far more effort devoted to governance/ethics, and I'm not sure how practically to view this as a confounder.
I'm doing (or have done) a lot of technical AI safety research.
Respondent 40
Answer to 1: 0.019
Answer to 2: 0.02
Low answers to the above should not be taken to reflect a low base rate of P(humans achieve AGI); P(humans achieve AGI) is hovering at around 0.94 for me. (Regardless, I take these probabilities quite seriously even if "low"; such is the requirement of safety!)
I'm doing (or have done) a lot of technical AI safety research.
OpenAI
Respondent 41
Answer to 1: 0.25
Answer to 2: 0.3
(haven't thought about this very long)
I'm doing (or have done) a lot of technical AI safety research.
OpenAI
Respondent 42
Answer to 1: 0.73
Answer to 2: 0.7
2 is less likely than 1 because I put some weight on x-risk from AI issues other than alignment that have technical solutions.
I'm doing (or have done) a lot of technical AI safety research.
Respondent 43
Answer to 1: 0.05
Answer to 2: 0.3
I'm doing (or have done) a lot of technical AI safety research.
Respondent 44
Answer to 1: 0.2
Answer to 2: 0.9
above: I think the basket of things we need a major effort on in order to clear this hurdle is way way broader than technical AI safety research (and inclusive of a massive effort into technical AI safety research), so 1 is a lot lower than 2 because I think "doing enough technical AI safety research" is necessary but not sufficient

these numbers are a bit made-up and I didn't do anything other than gut-check to get them

below: yo what's "a lot" of technical AI safety research, IDK man. am going with "no".
Respondent 45
Answer to 1: 0.1
Answer to 2: 0.3
I'm not sure how to answer whether the amount of governance research or strategy analysis is "a lot" but honestly if I'm uncertain that feels like enough of a reason to not check the box.

Other comment: I mostly just matched the questions to cached queries that I've been pondering for years, but I also notice that they haven't moved significantly in the last ~2 years (after working in this field for ~6 years) -- so maybe I should be intentionally seeking out things that could possibly update these numbers.
I'm doing (or have done) a lot of technical AI safety research.
OpenAI