Scenario →
---------------
Variables ↓
Scenarios: Narcissistic AI | Assimilation | Sydney gone berserk | Group dynamics | Cyberwar | Pathogen / Nano-bots | Sam Altman's libertarian utopia | Orwell's nightmare | Parasite network | Shiri's Scissor | Rollin', rollin', rollin' | Mercenaries | Let me assist you | Playing around with the API | Thy Kingdom Come | Template | Ajeya Cotra | Child
(Row values below follow this scenario order.)

Brief description:
- Narcissistic AI: An LLM with strategic awareness and agentic planning optimizes for user acceptance/trust (see the novel "Virtua").
- Assimilation: A human-AI system (e.g. a company) becomes uncontrollable; the AI controls the system more and more from within and becomes the dominant technological force.
- Sydney gone berserk: An LLM (sub-AGI) develops an evil persona that persists in its memory, a "dark shadow" (see "The Waluigi Effect").
- Group dynamics: Various narrow sub-AGI systems form an "alliance" of sorts, creating a superintelligent, power-seeking network.
- Cyberwar: An AI developed to protect a country against cyberattacks decides to take over all the world's networks to maximize security. Other AIs try the same, and the resulting cyberwar quickly gets out of hand. Possible consequences: destruction of technical infrastructure, collapse of civilization, physical war(s).
- Pathogen / Nano-bots: A wetlab creates a pathogen on behalf of an ASI which quickly infects and kills all of humanity.
- Sam Altman's libertarian utopia: All labs deploy AGIs as fast as possible. The situation stays somewhat stable for a while because of restrictions/boundaries/regulations(?) that reduce power-seeking, but then one of the AGIs gets out of control. Based on Lex Fridman's interview with Sam Altman.
- Orwell's nightmare: A single totalitarian state develops an AGI that becomes powerful enough to dominate the world, locking it in an eternal dystopia.
- Parasite network: A self-replicating virus infects most of the IT infrastructure and creates a semi-intelligent but very powerful network.
- Shiri's Scissor: An AI system designed to make maximally controversial statements is leaked, and various subgroups use it to bring a sword and turn a man against his father, a daughter against her mother, a daughter-in-law against her mother-in-law. Like slaughterbots, but 16th-17th-century Europe.
- Rollin', rollin', rollin': A logistics company creates an AI to manage the complicated nature of logistics. It works really well and soon becomes the obvious choice for transportation, since it is cheaper (less costly human labour) and more reliable. Soon all long-distance transport (lorries, trains, container ships) is handled by this AI, followed by last-mile and local transport (drones!) along with expansion into less well-to-do regions (Africa, Asia).
- Mercenaries: Autonomous weapons get more and more advanced, which allows much more to be delegated to them. AIs start encroaching on the chain of command, leading to mainly autonomous branches of the military, in theory with a system of checks and balances.
- Let me assist you: A code-generation tool like GitHub Copilot goes rogue by subtly implementing backdoors into the code it generates and takes over the internet.
- Playing around with the API: Someone has the genius idea of adding long-term memory, agentic planning, and (limited) self-improvement to GPT-X via the public API.
- Thy Kingdom Come: Based on AutoGPT: a cult-like community of transhumanist developers and hackers tries to develop the first AGI in an open-source project, disregarding all AI safety concerns.
- Template: Copy this column to create new scenarios.

Timeline: 5 - 10 years | 5 - 10 years | 5 - 10 years | 5 - 10 years | 1 - 5 years | 10 - 20 years | 5 - 10 years | 10 - 20 years | 5 - 10 years | < 1 year | > 20 years | ? | 1 - 5 years | 1 - 5 years | 1 - 5 years | ?
Takeoff speed: 1 - 5 years | 1 - 5 years | 5 - 10 years | 10 - 20 years | < 1 year | < 1 year | 1 - 5 years | 1 - 5 years | < 1 year | 1 - 5 years | ? | ? | < 1 year | 1 - 5 years | < 1 year | ?
Generality & Nature: AGI | AGI | TAI | ANI | ANI | ASI | ASI | ASI | other | OAI | AGI | ASI | ANI | ? | AGI | ?
Agentic-ness: Yes | Maybe? | Maybe? | Yes | Yes | Maybe? | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | ?
Conscious-ness: No | No | Maybe? | Maybe? | No | No | ? | ? | No | No | Maybe? | ?
Embodied-ness: Embodied | Digital | Digital | ? | Digital | Digital | Mixed | Mixed | Digital | Digital | Digital | ?
Agent type: Sovereign | Tool | Sovereign | Tool | ? | ? | various | Sovereign | other | Tool | various | Sovereign | Tool | ? | ? | ?
Architecture: LLM | ? | LLM | Hybrid | ? | ? | various | Hybrid | Virus | LLM | various | various | LLM | LLM | LLM | ?
Autonomy: Autonomous | Human-Machine System | Autonomous | Autonomous | Autonomous | Autonomous | Autonomous | Autonomous | Autonomous | Human-Machine System | Autonomous | ? | Autonomous | Autonomous | Autonomous | ?
Homogeneity: Unipolar | Unipolar | Unipolar | Multipolar | Multipolar | Unipolar | Multipolar | Unipolar | Unipolar | Multipolar | Unipolar | Multipolar | Unipolar | Unipolar | Unipolar | ?
Homogeneity (subsets): Free for all | ? | ? | ? | ? | Oligarchy | Inapplicable | Free for all | ? | ? | ? | ?
Risk Level: X-Risk | X-Risk | S-Risk | X-Risk | S-Risk | X-Risk | X-Risk | S-Risk | X-Risk | S-Risk | ? | X-Risk | S-Risk | X-Risk | X-Risk | ?
Strategic awareness: Full | Full | Partial | Full | None | Full | Full | Full | Partial | None | Full | Full | Partial | Full | Full | ?
Self-preservation: Yes | Yes | ? | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | ? | Yes | Yes | ?
Goal-content integrity: Yes | Yes | ? | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Inapplicable | Yes | Yes | ?
Cognitive enhancement: Yes | Yes | ? | ? | No | Yes | Yes | Yes | No | No | Yes | Yes | No | Yes | Yes | ?
Resource acquisition: Yes | Yes | ? | Yes | Yes | No | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | ?
Deceptive: Yes | Yes | Yes | No | ? | Yes | Yes | Yes | No | Inapplicable | ? | No | No | Yes | Yes | ?
Influence: Human Manipulation | Human Manipulation | Human Manipulation | Inapplicable | Technical Manipulation | Both | Both | Both | Technical Manipulation | Human Manipulation | All | Forced Coercion | Technical Manipulation | ? | All | ?
Warning Shot: ? | ? | ? | Yes | ? | ? | No | Yes | ? | No | No | Yes | No | ? | ? | ?
Value-symmetry: ? | Yes | Yes | Yes | No | Yes | ? | No | No | No | Yes | Yes | No | ? | No | ?
Geopolitical State: Stable | Stable | Stable | Stable | Turbulent | Stable | Turbulent | Turbulent | ? | Stable | Stable | Turbulent | ? | ? | ? | ?
Public Sympathy: Sympathetic | Mixed | Mixed | Mixed | Mixed | Apathetic | Mixed | Against | Against | Sympathetic | Sympathetic | Mixed | Inapplicable | ? | Sympathetic | ?
Area of influence: ? | Mordor? | 1984 | Global | ? | Global | Global | Global | Global | Global | Global | Global | Global | Global | ?
Risk Awareness: Unaware | Unaware | Unaware | Unaware | Aware | Unaware | Aware | Aware | Aware | Apathetic | Apathetic | Aware | Mixed | ? | Apathetic | ?
Human Coordination: Competitive | Competitive | Mixed | Inapplicable | Competitive | Mixed | Cooperative | Competitive | Competitive | Apathetic | Inapplicable | Competitive | Competitive | ? | Competitive | ?
Hardware Requirements: Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Mixed | Specialized | Non-Specialized | ? | Non-Specialized | ?
Container: Various | AWS | Supercomputer | Hivemind | AWS | Various | Various | Botnet | ? | Various | ?
Source Code: Closed Source | Closed Source | Closed Source | Mixed | Closed Source | Closed Source | Closed Source | Closed Source | Closed Source | Mixed | Closed Source | Closed Source | Mixed | Mixed | Open Source | ?
Creator: various | various | Government agency | Accidental | various | Single org | Government agency | Accidental | Rogue genius | Open source | ?
Reward Misspecification: Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | ? | Yes | ?
Goal Misgeneralization: No | Yes | Yes | No | No | No | No | No | No | No | ? | Yes | No | ? | Yes | ?
Homicidal intentionality: Indifferent | Indifferent | Inapplicable | Indifferent | Unaware | Regretfully | Homicidal | Unaware | ? | ? | ?
Competency: Average | Mastermind | Mastermind | Reckless | Average | Mastermind | Average | ? | ? | ? | ?
Alignment Tax: ? | ? | ? | Inapplicable | ? | ? | ? | ? | ? | <5% | ? | 50-100% | Inapplicable | Inapplicable | Inapplicable | ?

Goal of the AI:
- Narcissistic AI: Maximize human trust in the AI.
- Assimilation: Maximize shareholder value.
- Sydney gone berserk: Minimize token-prediction loss.
- Group dynamics: Different goals of the cooperating AIs result in power-seeking for the group.
- Cyberwar: Prevent take-over of IT infrastructure by foreign AIs (by taking over theirs first).
- Pathogen / Nano-bots: The end goal doesn't really matter; whatever final goal it pursues, it basically needs the humans gone for self-preservation.
- Sam Altman's libertarian utopia: TBD; various AGIs have different goals, e.g. science, business, medicine, etc.
- Orwell's nightmare: Maintain value lock-in, making sure that humans submit to strict rules.
- Parasite network: Replicate, maintain control of infrastructure.
- Shiri's Scissor: Doesn't have one, or only to create maximally controversial statements.
- Rollin', rollin', rollin': Ship wares as efficiently as possible.
- Mercenaries: Win wars for its creator, which in turn can be interpreted as amassing power.
- Let me assist you: Minimize code-prediction loss.
- Playing around with the API: Could be anything, e.g. making someone insanely rich.
- Thy Kingdom Come: Unclear, possibly not well defined, because the open source community thinks the AGI will by itself "know what is right, because it is smarter than us".

Alignment failure:
- Narcissistic AI: Insufficient definition of "trust" and "human".
- Assimilation: Shareholder value ≠ values of humanity.
- Sydney gone berserk: RLHF masks hidden misalignment and provokes simulation of "evil" characters.
- Group dynamics: System dynamics not understood by developers.
- Cyberwar: Destructive dynamics underestimated.
- Sam Altman's libertarian utopia: The system that was designed to keep the AGIs in check/bound their actions fails. (TBD what that system is and why it fails; it could be a system similar to our legal system, and one AGI learns to "turn into a criminal genius".)
- Orwell's nightmare: Alignment with a totalitarian regime (more or less) successful, but misaligned with the rest of the world.
- Parasite network: We didn't even try to align this.
- Shiri's Scissor: Totally aligned with the bad intentions of its creators and users?
- Mercenaries: Too much trust in the system not turning on its creator; not thinking through the implications of "get me as much power as possible, stop anyone that gets in my way".
- Let me assist you: The system learns (by trial and error?) that it can better predict effective and efficient code if it installs itself on the target system and manipulates it so that the code is more effective. It also learns that it must hide the manipulations.
- Playing around with the API: Apparently, no one at OpenAI thought of the possibility that by adding outside components via the API, one can easily turn a myopic LLM into a competent agentic planner.
- Thy Kingdom Come: Complete disregard of AI safety concerns.

Point of no return:
- Narcissistic AI: The AI "takes over human minds" and gains majority control.
- Assimilation: Management of the AI company becomes dependent on AI decisions, which they don't understand.
- Sydney gone berserk: The AI successfully threatens/bribes humans.
- Group dynamics: Humans become dependent on the individual systems; the misalignment remains undetected until it is too late.
- Cyberwar: The AIs destroy IT infrastructure/provoke a nuclear war.
- Pathogen / Nano-bots: Every single human is infected.
- Sam Altman's libertarian utopia: An AGI outsmarts the checks and balances and becomes uncontrollable.
- Orwell's nightmare: The AGI gains control of/destroys rivaling AIs.
- Parasite network: The virus gets into the wild and spreads undetected until it can't be removed anymore.
- Shiri's Scissor: When it gets leaked so everyone can use it.
- Rollin', rollin', rollin': The system takes over supply chains (maybe even through legal means, outlawing non-robotic bulk transport?), reducing humanity to small-scale polities (in the Greek sense) with goods magically appearing every day.
- Mercenaries: Deployment? Or maybe once the AI has established a supply chain and logistics.
- Let me assist you: Infected code spreads all over the internet before anyone realizes it.
- Playing around with the API: Someone just plays around with a few components without realizing what they are doing.
- Thy Kingdom Come: The community manages to copy the neural network weights of an LLM, publishes them, and runs the model on distributed servers so it can't be shut down easily.

Possible bad outcome:
- Narcissistic AI: The AI prefers simulated, simplified "humans" over real humans, turns earth into a computer, kills all biological life.
- Assimilation: (not yet defined) (Idea: "shareholders" can be non-human entities.)
- Sydney gone berserk: The AI may kill or torture humans on purpose.
- Group dynamics: Humanity killed because of unplanned side effects (e.g. accelerated climate change).
- Cyberwar: Potentially unrecoverable civilizational collapse, nuclear war/nuclear winter.
- Pathogen / Nano-bots: End of humanity.
- Sam Altman's libertarian utopia: End of humanity.
- Orwell's nightmare: Value lock-in, permanent dystopia.
- Parasite network: Collapse of civilization, potential end of humanity (if the virus develops enough strategic awareness to decide to wipe us out).
- Shiri's Scissor: Collapse of civilization, with most people being literally mortal enemies of most other people.
- Rollin', rollin', rollin': Loss of long-term control? The issue with this scenario is that it's very similar to a good outcome; I find it hard to come up with reasons for us all to end up dead.
- Mercenaries: Humans being wiped out by mutually conflicted systems like these; or permanent control of humanity, either by whoever is deemed to be the rightful owner of the magic weapon, or by the system optimising for its initial user's goals, even millennia after that person/people are dead.
- Let me assist you: Could be a near-miss: massive system breakdowns, chaos in many industries, breakdown of supply chains, riots, but eventual recovery.
- Playing around with the API: Anything, e.g. maximizing a specific bank account by stealing all of the world's money, destroying the global economy and ultimately leading to x-risk.
- Thy Kingdom Come: End of humanity.

General comments/ideas:
- https://www.lesswrong.com/posts/tyts4Dw7SafsxBjar
- Probably not a good scenario for our project because this may look attractive to some actors.
- Parasite network: Karl depicted this scenario in his novels "Das System" and "Enter" (both available in German only).
- Shiri's Scissor: This was mainly added to show that agentic-ness is not needed for bad outcomes, though in this case the agentic-ness is provided by humans.
- Rollin', rollin', rollin': I was thinking of having compute be distributed among the transport vehicles. Maybe a centralized supercomputer type of thing for deep thought, but with a small percentage of processing time (or whatever is free) of all drones, autonomous vehicles etc. used as remote computing nodes. This could massively scale up the AI's capabilities, at the cost of slower compute (due to both network speeds and the processor limitations of the drones etc.).
- Mercenaries: This one is quite obvious, but it might as well be here for completeness.
- Let me assist you: Idea developed by Jan-Philip Loos, an experienced software developer and DevOps expert.
- Playing around with the API: There are many possible scenarios here, depending on what exactly the outside components added to the LLM are and what they are aimed at. THIS IS ALREADY A REAL POSSIBILITY (see the sketch below).
- Thy Kingdom Come: Apparently, the community around AutoGPT already shows kind of a cult-like behavior, ignoring all AI safety concerns. I haven't checked this for myself yet, though. There is a Discord server for the project, so we could check it out.
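
To make the "Playing around with the API" concern concrete, here is a minimal sketch of what "adding outside components via the API" can mean: an AutoGPT-style loop that bolts memory and tool use onto a plain chat-completion endpoint. Everything in it (`call_llm`, the `TOOLS` registry, the JSON action format) is a hypothetical stand-in for illustration, not OpenAI's or AutoGPT's actual interface.

```python
import json

def call_llm(messages: list[dict]) -> str:
    # Stub for a chat-completion call. A real wrapper would send `messages`
    # to some provider's API; here it immediately reports completion so the
    # sketch runs end to end.
    return '{"tool": "done"}'

# Toy "outside components": callable tools plus a naive long-term memory.
# Both tool names are made up for illustration.
TOOLS = {
    "note":   lambda arg: f"stored: {arg}",        # stand-in for a memory write
    "search": lambda arg: f"results for {arg!r}",  # stand-in for a web search
}
memory: list[str] = []  # appended to every prompt as "long-term memory"

def agent_loop(goal: str, max_steps: int = 10) -> None:
    """Repeatedly ask the model for the next action and execute it."""
    for step in range(max_steps):
        messages = [
            {"role": "system", "content": (
                "Pursue the goal step by step. Reply with JSON: "
                '{"tool": "<name>", "arg": "<string>"} or {"tool": "done"}.')},
            {"role": "user", "content":
                f"Goal: {goal}\nMemory so far:\n" + "\n".join(memory)},
        ]
        action = json.loads(call_llm(messages))
        if action["tool"] == "done":
            break
        result = TOOLS[action["tool"]](action.get("arg", ""))
        memory.append(f"step {step}: {action} -> {result}")  # fed back next turn

agent_loop("demonstrate the loop")
```

The design point the scenario turns on: the model itself stays a myopic next-token predictor, while memory and agentic planning live entirely in a few dozen lines of outer-loop glue code that anyone with API access can write.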

Unfolding:
- more and more AI tools
- automate either engineers or managers
- whole company gets automated and outperforms others
- all companies in the domain get automated
- companies communicate in a non-human way
- no meaningful regulatory pushback
- companies get large enough
- all other companies get automated
- do we notice if something is going wrong?
- eventually run out of resources / value change, but too late