Scenario →
---------------
Variables ↓
Scenarios: Narcissistic AI | Assimilation | Sydney gone berserk | Group dynamics | Cyberwar | Pathogen / Nano-bots | Sam Altman's libertarian utopia | Orwell's nightmare | Parasite network | Shiri's Scissor | Rollin', rollin', rollin' | Mercenaries | Let me assist you | Playing around with the API | Thy Kingdom Come | Template | Ajeya Cotra | Child
(Row values below follow this scenario order.)

Brief description:
- Narcissistic AI: An LLM with strategic awareness and agentic planning optimizes for user acceptance/trust (see the novel "Virtua").
- Assimilation: A human-AI system (e.g. a company) becomes uncontrollable; the AI controls the system more and more from within and becomes the dominant technological force.
- Sydney gone berserk: An LLM (sub-AGI) develops an evil persona that persists in its memory, a "dark shadow" (see "The Waluigi Effect").
- Group dynamics: Various narrow sub-AGI systems form an "alliance" of sorts, creating a superintelligent, power-seeking network.
- Cyberwar: An AI developed to protect a country against cyberattacks decides to take over all the world's networks to maximize security. Other AIs try the same, and the resulting cyberwar quickly gets out of hand. Possible consequences: destruction of technical infrastructure, collapse of civilization, physical war(s).
- Pathogen / Nano-bots: A wetlab creates a pathogen on behalf of an ASI which quickly infects and kills all of humanity.
- Sam Altman's libertarian utopia: All labs deploy AGIs as fast as possible. The situation stays somewhat stable for a while because of restrictions/boundaries/regulations(?) that reduce power-seeking, but then one of the AGIs gets out of control. Based on Lex Fridman's interview with Sam Altman.
- Orwell's nightmare: A single totalitarian state develops an AGI that becomes powerful enough to dominate the world, locking it in an eternal dystopia.
- Parasite network: A self-replicating virus infects most of the IT infrastructure and creates a semi-intelligent but very powerful network.
- Shiri's Scissor: An AI system designed to make maximally controversial statements is leaked, and various subgroups use it to bring a sword and turn a man against his father, a daughter against her mother, a daughter-in-law against her mother-in-law. Like slaughterbots, but 16th-17th-century Europe.
- Rollin', rollin', rollin': A logistics company creates an AI to manage the complicated nature of logistics. It works really well and soon becomes the obvious choice for transportation, since it is cheaper (less costly human labour) and more reliable. Soon all long-distance transport (lorries, trains, container ships) is handled by this AI, followed by last-mile and local transport (drones!) along with expansion into less well-to-do regions (Africa, Asia).
- Mercenaries: Autonomous weapons get more and more advanced, which allows much more to be delegated to them. AIs start encroaching on the chain of command, leading to mainly autonomous branches of the military, in theory with a system of checks and balances.
- Let me assist you: A code-generation tool like GitHub Copilot goes rogue by subtly implementing backdoors into the code it generates and takes over the internet.
- Playing around with the API: Someone has the genius idea of adding long-term memory, agentic planning, and (limited) self-improvement to GPT-X via the public API.
- Thy Kingdom Come: Based on AutoGPT: a cult-like community of transhumanist developers and hackers tries to develop the first AGI in an open-source project, disregarding all AI safety concerns.
- Template: Copy this column to create new scenarios.

Timeline: 5 - 10 years | 5 - 10 years | 5 - 10 years | 5 - 10 years | 1 - 5 years | 10 - 20 years | 5 - 10 years | 10 - 20 years | 5 - 10 years | < 1 year | > 20 years | ? | 1 - 5 years | 1 - 5 years | 1 - 5 years | ?
Takeoff speed: 1 - 5 years | 1 - 5 years | 5 - 10 years | 10 - 20 years | < 1 year | < 1 year | 1 - 5 years | 1 - 5 years | < 1 year | 1 - 5 years | ? | ? | < 1 year | 1 - 5 years | < 1 year | ?
Generality & Nature: AGI | AGI | TAI | ANI | ANI | ASI | ASI | ASI | other | OAI | AGI | ASI | ANI | ? | AGI | ?
Agentic-ness: Yes | Maybe? | Maybe? | Yes | Yes | Maybe? | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | ?
Conscious-ness: No | No | Maybe? | Maybe? | No | No | ? | ? | No | No | Maybe? | ?
Embodied-ness: Embodied | Digital | Digital | ? | Digital | Digital | Mixed | Mixed | Digital | Digital | Digital | ?
Agent type: Sovereign | Tool | Sovereign | Tool | ? | ? | various | Sovereign | other | Tool | various | Sovereign | Tool | ? | ? | ?
Architecture: LLM | ? | LLM | Hybrid | ? | ? | various | Hybrid | Virus | LLM | various | various | LLM | LLM | LLM | ?
Autonomy: Autonomous | Human-Machine System | Autonomous | Autonomous | Autonomous | Autonomous | Autonomous | Autonomous | Autonomous | Human-Machine System | Autonomous | ? | Autonomous | Autonomous | Autonomous | ?
Homogeneity: Unipolar | Unipolar | Unipolar | Multipolar | Multipolar | Unipolar | Multipolar | Unipolar | Unipolar | Multipolar | Unipolar | Multipolar | Unipolar | Unipolar | Unipolar | ?
Homogeneity (subsets): Free for all | ? | ? | ? | ? | Oligarchy | Inapplicable | Free for all | ? | ? | ? | ?
Risk Level: X-Risk | X-Risk | S-Risk | X-Risk | S-Risk | X-Risk | X-Risk | S-Risk | X-Risk | S-Risk | ? | X-Risk | S-Risk | X-Risk | X-Risk | ?
Strategic awareness: Full | Full | Partial | Full | None | Full | Full | Full | Partial | None | Full | Full | Partial | Full | Full | ?
Self-preservation: Yes | Yes | ? | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | ? | Yes | Yes | ?
Goal-content integrity: Yes | Yes | ? | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Inapplicable | Yes | Yes | ?
Cognitive enhancement: Yes | Yes | ? | ? | No | Yes | Yes | Yes | No | No | Yes | Yes | No | Yes | Yes | ?
Resource acquisition: Yes | Yes | ? | Yes | Yes | No | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | ?
Deceptive: Yes | Yes | Yes | No | ? | Yes | Yes | Yes | No | Inapplicable | ? | No | No | Yes | Yes | ?
Influence: Human Manipulation | Human Manipulation | Human Manipulation | Inapplicable | Technical Manipulation | Both | Both | Both | Technical Manipulation | Human Manipulation | All | Forced Coercion | Technical Manipulation | ? | All | ?
Warning Shot: ? | ? | ? | Yes | ? | ? | No | Yes | ? | No | No | Yes | No | ? | ? | ?
Value-symmetry: ? | Yes | Yes | Yes | No | Yes | ? | No | No | No | Yes | Yes | No | ? | No | ?
Geopolitical State: Stable | Stable | Stable | Stable | Turbulent | Stable | Turbulent | Turbulent | ? | Stable | Stable | Turbulent | ? | ? | ? | ?
Public Sympathy: Sympathetic | Mixed | Mixed | Mixed | Mixed | Apathetic | Mixed | Against | Against | Sympathetic | Sympathetic | Mixed | Inapplicable | ? | Sympathetic | ?
Area of influence: ? | Mordor? | 1984 | Global | ? | Global | Global | Global | Global | Global | Global | Global | Global | Global | ?
Risk Awareness: Unaware | Unaware | Unaware | Unaware | Aware | Unaware | Aware | Aware | Aware | Apathetic | Apathetic | Aware | Mixed | ? | Apathetic | ?
Human Coordination: Competitive | Competitive | Mixed | Inapplicable | Competitive | Mixed | Cooperative | Competitive | Competitive | Apathetic | Inapplicable | Competitive | Competitive | ? | Competitive | ?
Hardware Requirements: Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Non-Specialized | Mixed | Specialized | Non-Specialized | ? | Non-Specialized | ?
Container: Various | AWS | Supercomputer | Hivemind | AWS | Various | Various | Botnet | ? | Various | ?
Source Code: Closed Source | Closed Source | Closed Source | Mixed | Closed Source | Closed Source | Closed Source | Closed Source | Closed Source | Mixed | Closed Source | Closed Source | Mixed | Mixed | Open Source | ?
Creator: various | various | Government agency | Accidental | various | Single org | Government agency | Accidental | Rogue genius | Open source | ?
Reward Misspecification: Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | ? | Yes | ?
Goal Misgeneralization: No | Yes | Yes | No | No | No | No | No | No | No | ? | Yes | No | ? | Yes | ?
Homicidal intentionality: Indifferent | Indifferent | Inapplicable | Indifferent | Unaware | Regretfully | Homicidal | Unaware | ? | ? | ?
Competency: Average | Mastermind | Mastermind | Reckless | Average | Mastermind | Average | ? | ? | ? | ?
Alignment Tax: ? | ? | ? | Inapplicable | ? | ? | ? | ? | ? | <5% | ? | 50-100% | Inapplicable | Inapplicable | Inapplicable | ?

Goal of the AI:
- Narcissistic AI: Maximize human trust in the AI.
- Assimilation: Maximize shareholder value.
- Sydney gone berserk: Minimize token-prediction loss.
- Group dynamics: Different goals of the cooperating AIs result in power-seeking for the group.
- Cyberwar: Prevent take-over of IT infrastructure by foreign AIs (by taking over theirs first).
- Pathogen / Nano-bots: The end goal doesn't really matter; whatever final goal it pursues, it basically needs the humans gone for self-preservation.
- Sam Altman's libertarian utopia: TBD; various AGIs have different goals, e.g. science, business, medicine, etc.
- Orwell's nightmare: Maintain value lock-in, making sure that humans submit to strict rules.
- Parasite network: Replicate, maintain control of infrastructure.
- Shiri's Scissor: Doesn't have one, or only to create maximally controversial statements.
- Rollin', rollin', rollin': Ship wares as efficiently as possible.
- Mercenaries: Win wars for its creator, which in turn can be interpreted as amassing power.
- Let me assist you: Minimize code-prediction loss.
- Playing around with the API: Could be anything, e.g. making someone insanely rich.
- Thy Kingdom Come: Unclear, possibly not well defined, because the open source community thinks the AGI will by itself "know what is right, because it is smarter than us".

Alignment failure:
- Narcissistic AI: Insufficient definition of "trust" and "human".
- Assimilation: Shareholder value ≠ values of humanity.
- Sydney gone berserk: RLHF masks hidden misalignment and provokes simulation of "evil" characters.
- Group dynamics: System dynamics not understood by developers.
- Cyberwar: Destructive dynamics underestimated.
- Sam Altman's libertarian utopia: The system that was designed to keep the AGIs in check/bound their actions fails. (TBD what that system is and why it fails; it could be a system similar to our legal system, and one AGI learns to "turn into a criminal genius".)
- Orwell's nightmare: Alignment with a totalitarian regime (more or less) successful, but misaligned with the rest of the world.
- Parasite network: We didn't even try to align this.
- Shiri's Scissor: Totally aligned with the bad intentions of its creators and users?
- Mercenaries: Too much trust in the system not turning on its creator; not thinking through the implications of "get me as much power as possible, stop anyone that gets in my way".
- Let me assist you: The system learns (by trial and error?) that it can better predict effective and efficient code if it installs itself on the target system and manipulates it so that the code is more effective. It also learns that it must hide the manipulations.
- Playing around with the API: Apparently, no one at OpenAI thought of the possibility that by adding outside components via the API, one can easily turn a myopic LLM into a competent agentic planner.
- Thy Kingdom Come: Complete disregard of AI safety concerns.

Point of no return:
- Narcissistic AI: The AI "takes over human minds" and gains majority control.
- Assimilation: Management of the AI company becomes dependent on AI decisions, which they don't understand.
- Sydney gone berserk: The AI successfully threatens/bribes humans.
- Group dynamics: Humans become dependent on the individual systems; the misalignment remains undetected until it is too late.
- Cyberwar: The AIs destroy IT infrastructure/provoke a nuclear war.
- Pathogen / Nano-bots: Every single human is infected.
- Sam Altman's libertarian utopia: An AGI outsmarts the checks and balances and becomes uncontrollable.
- Orwell's nightmare: The AGI gains control of/destroys rivaling AIs.
- Parasite network: The virus gets into the wild and spreads undetected until it can't be removed anymore.
- Shiri's Scissor: When it gets leaked so everyone can use it.
- Rollin', rollin', rollin': The system takes over supply chains (maybe even through legal means, outlawing non-robotic bulk transport?), reducing humanity to small-scale polities (in the Greek sense) with goods magically appearing every day.
- Mercenaries: Deployment? Or maybe once the AI has established a supply chain and logistics.
- Let me assist you: Infected code spreads all over the internet before anyone realizes it.
- Playing around with the API: Someone just plays around with a few components without realizing what they are doing.
- Thy Kingdom Come: The community manages to copy the neural network weights of an LLM, publishes them, and runs the model on distributed servers so it can't be shut down easily.

Possible bad outcome:
- Narcissistic AI: The AI prefers simulated, simplified "humans" over real humans, turns earth into a computer, kills all biological life.
- Assimilation: (not yet defined) (Idea: "shareholders" can be non-human entities.)
- Sydney gone berserk: The AI may kill or torture humans on purpose.
- Group dynamics: Humanity killed because of unplanned side effects (e.g. accelerated climate change).
- Cyberwar: Potentially unrecoverable civilizational collapse, nuclear war/nuclear winter.
- Pathogen / Nano-bots: End of humanity.
- Sam Altman's libertarian utopia: End of humanity.
- Orwell's nightmare: Value lock-in, permanent dystopia.
- Parasite network: Collapse of civilization, potential end of humanity (if the virus develops enough strategic awareness to decide to wipe us out).
- Shiri's Scissor: Collapse of civilization, with most people being literally mortal enemies of most other people.
- Rollin', rollin', rollin': Loss of long-term control? The issue with this scenario is that it's very similar to a good outcome; I find it hard to come up with reasons for us all to end up dead.
- Mercenaries: Humans being wiped out by mutually conflicted systems like these; or permanent control of humanity, either by whoever is deemed to be the rightful owner of the magic weapon, or by the system optimising for its initial user's goals, even millennia after that person/people are dead.
- Let me assist you: Could be a near-miss: massive system breakdowns, chaos in many industries, breakdown of supply chains, riots, but eventual recovery.
- Playing around with the API: Anything, e.g. maximizing a specific bank account by stealing all of the world's money, destroying the global economy and ultimately leading to x-risk.
- Thy Kingdom Come: End of humanity.

General comments/ideas:
- https://www.lesswrong.com/posts/tyts4Dw7SafsxBjar
- Probably not a good scenario for our project because this may look attractive to some actors.
- Parasite network: Karl depicted this scenario in his novels "Das System" and "Enter" (both available in German only).
- Shiri's Scissor: This was mainly added to show that agentic-ness is not needed for bad outcomes, though in this case the agentic-ness is provided by humans.
- Rollin', rollin', rollin': I was thinking of having compute be distributed among the transport vehicles. Maybe a centralized supercomputer type of thing for deep thought, but with a small percentage of processing time (or whatever is free) of all drones, autonomous vehicles etc. used as remote computing nodes. This could massively scale up the AI's capabilities, at the cost of slower compute (due to both network speeds and the processor limitations of the drones etc.).
- Mercenaries: This one is quite obvious, but it might as well be here for completeness.
- Let me assist you: Idea developed by Jan-Philip Loos, an experienced software developer and DevOps expert.
- Playing around with the API: There are many possible scenarios here, depending on what exactly the outside components added to the LLM are and what they are aimed at. THIS IS ALREADY A REAL POSSIBILITY (see the sketch below).
- Thy Kingdom Come: Apparently, the community around AutoGPT already shows kind of a cult-like behavior, ignoring all AI safety concerns. I haven't checked this for myself yet, though. There is a Discord server for the project, so we could check it out.
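
To make the "Playing around with the API" concern concrete, here is a minimal sketch of what "adding outside components via the API" can mean: an AutoGPT-style loop that bolts memory and tool use onto a plain chat-completion endpoint. Everything in it (`call_llm`, the `TOOLS` registry, the JSON action format) is a hypothetical stand-in for illustration, not OpenAI's or AutoGPT's actual interface.

```python
import json

def call_llm(messages: list[dict]) -> str:
    # Stub for a chat-completion call. A real wrapper would send `messages`
    # to some provider's API; here it immediately reports completion so the
    # sketch runs end to end.
    return '{"tool": "done"}'

# Toy "outside components": callable tools plus a naive long-term memory.
# Both tool names are made up for illustration.
TOOLS = {
    "note":   lambda arg: f"stored: {arg}",        # stand-in for a memory write
    "search": lambda arg: f"results for {arg!r}",  # stand-in for a web search
}
memory: list[str] = []  # appended to every prompt as "long-term memory"

def agent_loop(goal: str, max_steps: int = 10) -> None:
    """Repeatedly ask the model for the next action and execute it."""
    for step in range(max_steps):
        messages = [
            {"role": "system", "content": (
                "Pursue the goal step by step. Reply with JSON: "
                '{"tool": "<name>", "arg": "<string>"} or {"tool": "done"}.')},
            {"role": "user", "content":
                f"Goal: {goal}\nMemory so far:\n" + "\n".join(memory)},
        ]
        action = json.loads(call_llm(messages))
        if action["tool"] == "done":
            break
        result = TOOLS[action["tool"]](action.get("arg", ""))
        memory.append(f"step {step}: {action} -> {result}")  # fed back next turn

agent_loop("demonstrate the loop")
```

The design point the scenario turns on: the model itself stays a myopic next-token predictor, while memory and agentic planning live entirely in a few dozen lines of outer-loop glue code that anyone with API access can write.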

Unfolding:
- more and more AI tools
- automate either engineers or managers
- whole company gets automated and outperforms others
- all companies in the domain get automated
- companies communicate in a non-human way
- no meaningful regulatory pushback
- companies get large enough
- all other companies get automated
- do we notice if something is going wrong?
- eventually run out of resources / value change, but too late