| Category | This AIS concept: | Is a lot like: | Because: (optional) | Advantages | Disadvantages | Notes | Source |
|---|---|---|---|---|---|---|---|
| AGI | The invention of AGI | The invention of the steam engine | | | | | |
| | The invention of AGI | The invention of the printing press | | | | | |
| | The invention of AGI | Being visited by aliens | "The more I play around with LMs the more I think they're aliens. Like yes, aliens created in some weird reflected image of us. Definitely plenty of humanity creeping in there. But... an LM is so much further away from a human than I am to any existing (developed) human" - Jeffrey Ladish on Twitter | | "i used to say stuff like “aliens have landed on earth” but I don’t believe it so much anymore . . . AIs are built off of our cognitive scaffolding. they are familiar to us and have familiar successes and failures" - roon on Twitter | | Stuart Russell made the comparison in a talk in November 2013, though it may have been made earlier |
| | AI-powered military weapons | Inventing nuclear weapons | "The military advantage conferred by autonomous weapons will certainly dwarf that of chemical weapons, and they will likely be more powerful even than nukes due to their versatility and precision." - Nora Belrose | | | | |
| | Worrying about AGI now | "Worrying about overpopulation on Mars/Alpha Centauri" | "The risk is easily managed and far in the future" - Stuart Russell | | "Mars has a carrying capacity of zero. Thus, groups that are currently planning to send a handful of humans to Mars are worrying about overpopulation on Mars, which is why they are developing life-support systems." - Stuart Russell | | Andrew Ng |
| | Worrying about AGI now | "Working on a plan to move the human race to Mars with no consideration for what we might breathe, drink, or eat once we arrive." - Stuart Russell | We are currently devoting tremendous resources to AGI! | | | | Stuart Russell, Human Compatible (p.151) |
| | Taking actions now to prevent future risks from AI | Taking actions now to prevent future risks from climate change | "focus on reducing the long-term harms of climate change" does not "dangerously distract from the near-term harms from air pollution and oil spills." - Garrison Lovely | | | | |
| | Training an AGI to solve a task instead of training a narrow AI | Hiring a man with a bazooka to open a door instead of hiring a locksmith | | | | | |
| | AGI | Harnessing fire | "Controlled, it's transformative; uncontrolled, it can cause devastation." - Harry Luk | Poetic | Super abstract | | |
| | Sample inefficiency might fundamentally limit scaling | "There’s no amount of jet fuel you can add to an airplane to make it reach the moon." | "If these models can’t get anywhere close to human level performance with the data a human would see in 20,000 years, we should entertain the possibility that 2,000,000,000 years worth of data will also be insufficient." | | | | Dwarkesh Patel |
| | We might not need to fully understand brains before we create general intelligence | "We didn’t develop a full understanding of thermodynamics until a century after the steam engine was invented." | "The usual pattern in the history of technology is that invention precedes theory, and we should expect the same of intelligence." | | | | Dwarkesh Patel |
| | We might not need to fully understand brains before we get brain-computer interfaces working | We are getting brain-controlled robot arms and cochlear implants to work without understanding what the brain is doing | | | | | Stuart Russell, Human Compatible |
| | A direct reward signal (RL) might be better than simply imitating humans for training powerful AI on some tasks | "a human learning to play tennis might at first try to emulate the skills of good tennis players, but later on develop their own technique by observing what worked and what didn’t work to win matches." | | | | | Epoch |
| AGI Could Defeat All of Us | AI obsoleting humans | The combustion engine obsoleting horses | | | | | Max Tegmark, Life 3.0 |
| | Trying to get alignment right on the first try | Trying to get activation functions right on the first try | "Most neural networks for the first decades of the field used sigmoids; the idea of ReLUs wasn't discovered, validated, and popularized until decades later. What's lethal is that we do not have the Textbook From The Future telling us all the simple solutions that actually in real life just work and are robust; we're going to be doing everything with metaphorical sigmoids on the first critical try." | | Obviously a quite technical example. Best suited to a technical audience | | Eliezer Yudkowsky |
| | Insisting that "artificial superintelligences will make small easily defeated attacks against humanity, as soon as they have the ability to make weak attacks, that will surely give us plenty of warning" | Claiming that "It's ludicrous to imagine that Russia could successfully invade Ukraine - before they can invade with 3400 tanks they'd need to invade with 340, and before 340, 34. 34 tanks would be easily defeated and give Ukraine plenty of time to prepare for further escalation." | There WAS plenty of warning, and for pretty much the reasons mockingly listed. Russia COULDN'T invade infinitely fast, and needed to mass troops along the border first. | | | | Eliezer Yudkowsky |
| | One superintelligence vs the combined intelligences of humanity | Kasparov vs the World | Kasparov won. | | Kasparov also used the sneaky tactic of reading the discussion forums | "The world" didn't have time to practice together before the game, but we might get to prepare to coordinate against AGI | Yudkowsky vs Hotz debate (might not be first source) |
| | "Trying to apply AlphaGo in the real world" | "Trying to write a novel by wondering whether the first letter should be A, B, C, and so on." | AlphaGo "has no notion of abstract plan." It considers "actions occurring in a sequence from the initial state." | | | | Stuart Russell, Human Compatible (p.265) |
| | Humanity trying to survive the present time of perils | Walking along a steep precipice | | | | | Toby Ord, The Precipice |
| | Many AI researchers are concerned about an extremely bad outcome from AI | "Imagine you're about to get on an airplane and 50% of the engineers that built the airplane say there's a 10% chance that their plane might crash and kill everyone." | | | | | Aza Raskin |
| | "humanity facing down an opposed superhuman intelligence" | "a 10-year-old trying to play chess against Stockfish 15" | | | | | Eliezer Yudkowsky |
| | "humanity facing down an opposed superhuman intelligence" | "the 11th century trying to fight the 21st century" | | | | | Eliezer Yudkowsky |
| | "humanity facing down an opposed superhuman intelligence" | "Australopithecus trying to fight Homo sapiens" | | | | | Eliezer Yudkowsky |
| | Humanity trying to monitor the actions of a superintelligent AI | A novice playing a grandmaster in chess | "It is easy for a chess grandmaster to detect bad moves in a novice but very hard for a novice to detect bad moves in a grandmaster." | | | | Anthropic |
| | AI may be difficult to shut down | The internet | Widespread, useful, distributed. | | | | |
| | Humans are bad at perceiving exponential change around them and properly acting on it, and this spells trouble for AI safety. It's important for us to get ahead of the curve. | Our delayed reaction to the early days of the COVID-19 pandemic | During the early days of the pandemic, a lot of people saw the news about COVID-19 cases spreading worldwide, but the idea of stocking up on masks, or worrying too much about it, still felt insane. In some ways, this is because we're just bad at perceiving exponential change. We're bad at looking at the growth in early cases and going “huh, if this pace keeps increasing, this is going to be very, very bad”. This is a fundamental failure of our ability to identify and grasp exponential change around us, and it meant we were unprepared for much of what was to come. In the same way, a lot of us are watching all the news about AI, being a bit surprised by some advancements, but failing to notice the exponential pace at which things are happening. It's not only that things are moving fast, but that they are moving faster and faster. We're underestimating how fast the world will change around us, and we're failing to respond to this change appropriately. | It's great at paving the way for deeper conversations about how we should regulate AI, even for audiences that feel some of these concerns are unjustified at present (especially AGI). It also conveys a useful attitude for governance and policy: “we should be paying attention to this, and we should take precautions before it's too late”. It's a great conversation starter for tough audiences. | It might not work great for a US audience, given how polarized it was during the COVID-19 pandemic (I haven't used it for a US public yet). It can also feel a bit too early if people aren't impressed by current AI at all. | "I've used it very successfully several times in public, one time for a large audience in a congressional event in Chile. I think people liked the analogy and found it useful." - Agustín Covarrubias | Ezra Klein |
| | A superintelligence composed of many weaker models | A corporation, which is more capable than any of its employees individually | | | Organizations are more likely to get stuck in local minima than individuals - Eliezer Yudkowsky | | |
| | AGI | The stock market / financial system | No one can control the financial system | | Obviously some people can influence the financial system (the Fed, some investors). Also, the stock market has some positive connotations for making people rich. | | |
| Alignment Is Hard | Trusting a maximally curious AI to be aligned with human values | Trusting a curious scientist to be aligned with fruit fly values | "This rarely ends well for the fruit flies" | | | | Scott Alexander |
| | Safety constraints learned during training could be overcome during deployment | "A fear of heights is not a value that a human would usually endorse having in their terminal goals, and if it were really getting in the way of achieving their goals, they would look for ways to overcome it." | "This is an example of an agent that initially produces behavior that appears to obey a constraint, but when there is a sufficiently large conflict between this constraint and other goals, then the constraint will be overcome." | | | | Jeremy Gillen and Peter Barnett |
| | Trying to understand powerful capabilities by conducting experiments on weak models | Learning how to avoid "crashes during a Formula 1 race by practicing on a Subaru" | | | | | Dario Amodei |
| | Sharp left turn | Human evolution | | | Evolution provides no evidence for the sharp left turn | | |
| | Discontinuous capability advances from scaling | Magikarp evolving into a formidable Gyarados at level 20 | | Super fun one | It's Pokemon... It's not real. | | Peter Wildeford |
| | Inner misalignment | Humans not maximizing inclusive genetic fitness | | | | "As AI designers, we're not particularly in the role of "evolution", construed as some agent that wants to maximize fitness, because there is no such agent in real life . . . Rather, we get to choose both the optimizer—"natural selection", in terms of the analogy—and the training data—the "environment of evolutionary adaptedness". — Zack Davis In the evolutionary environment, there was no concept for inclusive genetic fitness that could be rewarded. But our values are built around concepts/abstractions that can be rewarded, which provides some optimism that we can instill those values. "I think that natural selection is a relatively weak analogy for ML training. The most important disanalogy is that we can deliberately shape ML training. Animal breeding would be a better analogy, and seems to suggest a different and much more tentative conclusion." - Paul Christiano Vanessa Kosoy responds: "As to animal breeding: For one thing, it's possible that people would learn to game the metrics and the entire thing would go completely off-rails. For another, breeding friendly humans might be much easier than aligning AI, because friendly humans already occur in the wild, whereas aligned superhuman AI does not." | |
| | Adversarial inputs | Artificial sweeteners | | | | | Joseph Bloom |
| | Adversarial inputs | Deadly mosquito bites | | | | | |
| | Correctly specifying a reward function | Trying to make responsible wishes (e.g. King Midas) | "In every story where someone is granted three wishes, the third wish is always to undo the first two wishes." | | | | Stuart Russell, Human Compatible (p.11) |
| | "Losing even a small part of the rules that make up our values could lead to results that most of us would now consider as unacceptable" | "Dialing nine out of ten phone digits correctly does not connect you to a person 90% similar to your friend" | | | | | Eliezer Yudkowsky |
| | Failing to specify some part of human values makes it less likely we will achieve human flourishing | "if your rocket design fails to account for something important, the chance it will cause the rocket to explode is much higher than the chance it will make the rocket much more fuel-efficient." | "When aiming for a narrow target (which human flourishing is), unknowns reduce your chances of success, not increase it. So, given the uncertainty we have about many relevant issues, we should be very worried indeed, even if we don't have a strong argument that one of those issues will definitely kill us." | | | | Eliezer Yudkowsky via Vanessa Kosoy |
| | We could easily build something powerful that we do not fully understand | "humans built bridges for millennia before developing mechanical engineering" | | | | "many such bridges collapsed, in now-predictable ways" | Daniel Eth |
| | We could easily build something powerful that we do not fully understand | "flight was developed before much aerodynamic theory" | | | | | Daniel Eth |
| | It is possible for humans (even/especially with good intentions) to create something that destroys everything they care about. | State formation | "An inhuman eldritch horror with immense power and strange goals and values, many antithetical to those who built it. Do not create such a thing, and destroy any that exist while you still can." | | Too political. | | Shankar Sivarajan |
| | Developing a truly robust solution to alignment on the first try | Writing a perfect set of laws on the first try | "Human civilization manages to function at all now because we can react to ways in which the current system is insufficient. We design laws, and when we discover that there's some case that those laws don't cover or that the law was flawed or that it isn't very useful anymore, we change them before things can get too bad (usually). Think of the difficulty entailed in designing a true set of laws or systemic structures that are robust to all possible adversaries (or at least self-correct in the ways we would like, without any human supervision). When we develop AI systems more powerful than us, we no longer have the ability to react to new inputs, just to hope that our solutions are that good." | | | | Jozdien |
| | Emergent capabilities in AI | If you built a car and it had surprising capabilities | "[AI] is very different from most other industries: imagine if each new model of car had some chance of spontaneously sprouting a new (and dangerous) power, like the ability to fire a rocket boost or accelerate to supersonic speeds." | | Could be considered more of a disanalogy | | Dario Amodei |
| | Specifying a human value into a reward function | Translating specific concepts like "jeong" (Korean) from one language to another | These concepts require a lot of cultural context to communicate accurately | | | | Been Kim |
| | A less intelligent being controlling a smarter being | A child controlling a mother | One of the only examples of this occurring | | | | Geoffrey Hinton |
| | Following instructions from an AI | People in the 11th century following instructions for making an air conditioner | Even if they understand every step, they will still be surprised when cold air comes out because they don't have advanced enough physics | | | | Eliezer Yudkowsky |
| | Trying to set the right initial conditions for developing AGI | Shooting an arrow into space | "Degree differences in aim add to million-lightyear-apart destinations" | | | | Aidan McLaughlin |
| CIRL | An AI learning about human preferences will not necessarily learn to be evil just because there are some people who do evil things | Criminologists do not become criminals | Learning about preferences =/= *adopting* those preferences | | | | Stuart Russell, Human Compatible (p.179) |
| Organization Policies (including Control agendas) | Responsible Scaling Policies | "Slamming on the brakes after the car has crashed" | | | | | Control AI |
| | Overlapping safety standards / defense in depth | The Swiss Cheese Model | | | | | CAIS |
| | Having safety distributed across an organization, instead of a dedicated department | Building a house to be earthquake-proof | “Companies that succeed with safety build it into the development process. It is extremely difficult to make a house earthquake-proof after building it; you work with the engineers from the start. AI is no different” | | | | Sara Hooker |
| | The idea that an AI company is not accelerating race dynamics if it is behind the frontier | The idea that Pepsi does not affect Coca-Cola's strategy, because Pepsi is behind Coca-Cola in the soft drinks market | "But it does not follow from this that Pepsi’s presence and behavior have no impact on Coca-Cola. In a world where Coca-Cola had an unchallenged global monopoly, it likely would charge higher prices, be slower to innovate, introduce fewer new products, and pay for less advertising than it does now, with Pepsi threatening to overtake it should it let its guard down." | | | "Anthropic’s leaders will note that unlike Pepsi, they’re not trying to overtake OpenAI, which should give OpenAI some latitude to slow down if it chooses to. But the presence of a competing firm surely gives OpenAI some anxiety, and might on the margin be making them go faster." | Dylan Matthews |
| | Scaling labs | Misaligned superintelligence | Misaligned incentives, power seeking, lack of transparency, deception, incorrigibility, subverting constraints | | | | Stephen Casper |
| | Consulting AI advisors when making a safety case | Regulators consulting human experts for advice on regulation | | | | | Safety Cases |
| | Making a safety case for an AI based on capability, trustworthiness, control, or deference | Making a case that a robber won't steal based on inability (robber is an infant), trustworthiness (robber does not steal), control (robber is in jail), or deference (security company gives okay) | | | | | Safety Cases |
| | Racing against China to achieve AGI | A monkey racing another to eat a poison banana | | | | | Zvi Mowshowitz |
| Cyborgism | A base model is a simulator that can produce different simulated personas | A physics engine can support different agents | | | | | |
| | Learning about LLMs | Jane Goodall learning about chimpanzees | You can learn faster by gaining an intuitive understanding of model behavior based on your first-hand experiences and following your intuitive questions | | "To future AI, we're not chimps; we're plants." - Andrew Critch (but the cyborgism analogy is to current AI, not necessarily future AI) | | |
| Deception | AIs that display good behavior in training but misaligned behavior in deployment | The Volkswagen emissions scandal | "The car manufacturer programmed their vehicles to limit emissions during tests by government regulators, but when the vehicles went back on the road, they immediately resumed spewing toxic emissions." | | | | CAIS |
| | Deceptive alignment is more likely than corrigible alignment | Baby ducks are good at imprinting on adult mother ducks but not humans | Knowing how to specify a pointer to some value (e.g. human values per corrigibility) can be a complex structural process; deception is simpler because it allows a model to infer optimal behavior without specifying a pointer to human values | | | | Evan Hubinger |
| | A deceptive model wouldn't have to contain a single explicit circuit to determine whether it is in training vs deployment (e.g. by looking for a factorization of RSA-2048) | A robot that was trained to sweep a factory floor but suddenly found itself in the jungle would behave strangely, even if it did not have a single explicit circuit that was checking "Am I in a jungle or a factory?" | | | | | Evan Hubinger |
| | We might train deception out of "young models" that think about deception on an object level but fail to train out more abstract thoughts that are also deceptive | "The fact that humans won't stab even 100 civilians in a lineup, is [not] much evidence against their willingness to drop a nuke that kills tens of thousands." | | | | | Nate Soares |
| | Trying to train a model that shares our goals | An eight-year-old trying to hire a trusted adult to care for them and run a new business | Saints, sycophants, and schemers are behaviorally indistinguishable from our limited perspective | | | | Ajeya Cotra |
| | A deceptively aligned model performing well on the training data but for the "wrong reasons" | Blaise Pascal going to church because he doesn't want to go to hell, not out of some internally-motivated piety | | | | | Evan Hubinger |
| | An AI not taking a certain action during training because it knows the action would be reinforced, impeding its goals | Humans not trying heroin because they know if they tried it, the action would be reinforced | This is a weak form of gradient hacking that does not require full knowledge of the loss landscape | | | | |
| | A model lying in its output, despite knowing the truth | Split-brain patients confabulating answers | Some part of their brain knows the truth, but the info isn't shared across the boundary | | | | McKenna Fitzgerald |
| | Thinking that SGD will struggle to turn a non-schemer into a schemer via small incremental weight changes | Thinking that evolution will fail to evolve eyes | | | | | Joe Carlsmith |
| | Model organisms of misalignment | Doing science experiments on mice | | | | | |
| Governance and Regulation | An AI regulatory agency | The Nuclear Regulatory Commission | "Since the creation of the Nuclear Regulatory Commission in 1975, there has never been a major nuclear accident in the US. And sure, this is because the NRC prevented any nuclear plants from being built in the United States at all from 1975 to 2023 . . . Still, they technically achieved their mandate." | | | | Scott Alexander |
| | An international AI regulatory agency | The International Atomic Energy Agency (IAEA) | The IAEA enforces nuclear regulations against proliferation by monitoring programs for misuse. They also allow conditional access to a nuclear fuel bank, an approach that might be relevant to AI regulation. | | "It's not clear whether safeguarding AI is sufficiently analogous to safeguarding nuclear energy to justify a focus on the IAEA" - CAIS "Some experts argue an IAEA-like agency for AI would be difficult to get support for among policymakers. That’s because it would require countries such as the U.S. and China to allow international inspectors full access to the most advanced AI labs in their jurisdictions in an effort to avoid risks that have not yet materialized." - TIME | | |
| | An international partnership between public and private sectors for AI | Gavi, the vaccine alliance | | | | | |
| | "Calling in the government on AI" | "Calling in Godzilla on Tokyo" | "You shouldn't do it because you're hoping Godzilla will carefully enact a detailed land-use reform policy. You should do it if humanity would be better off with less Tokyo. That, Godzilla can do." | | | | Eliezer Yudkowsky |
| | Banning AI development | The NIH ban on funding for protocols involving human germline alteration | Shows that there is a reasonable precedent for curtailing technology development in certain cases | | | | Stuart Russell, Human Compatible (p.155) |
| | Unilateral decisions by scaling labs to release frontier models | Drug companies putting a new drug in the water supply after minimal testing to keep up with their competitors | | | | | AI Notkilleveryoneism Memes |
| | An international joint AI research institution | CERN | | | | | |
| | Why powerful AI companies agree with the "AI Safety" people about the need for government regulation | Baptists and Bootleggers | The "baptists" are useful idiots manipulated by the "bootleggers," who are rich, powerful, and influential enough to benefit from regulation, such as by it kneecapping their competition. | People have had a century to reflect on Prohibition, and mostly agree it was a mistake. | | A pretty old idea in any community that is even libertarian-adjacent. | |
| | Keeping models/weights closed | Gun control | Only the state and its approved agents (e.g., cops, military, defense contractors) should have access to firearms. | | Some people might agree with the analogy, but draw the wrong conclusion. | | |
| | Retraining everyone whose job is automated to be a data scientist | Escaping a sinking ocean liner on a tiny lifeboat | "Data science is a very tiny lifeboat for a giant cruise ship" — Stuart Russell | | | | |
| | It should be legal for an AI to learn from copyrighted materials | "the NYT would not have a case against an aspiring journalist who honed their craft by reading through the NYT’s backlog." | | | | | CAIS |
| | It should not be legal for AI to learn from copyrighted materials | "it seems obvious that AI systems should not have the right to bear arms, even though this right is guaranteed to Americans by the Constitution." | | | | | CAIS |
| | AI development | Driving a car | "The way I see it, it's not just a choice between slamming on the brakes or hitting the gas. If you're driving down a road with unexpected twists and turns, then two things that will help you a lot are having a clear view out the windshield and an excellent steering system. In AI, this means having a clear picture of where the technology is and where it's going, and having plans in place for what to do in different scenarios." | | | | Helen Toner |
| | "We don't need to regulate AI, it's already illegal to do bad things with it" | "We don't need to regulate sarin gas, it's already illegal to kill people with it" | "Yes, but one good and easy way to stop people dying is to not let everyone have access to sarin!" | | | | Shakeel Hashim |
| | AI developers should be liable for harm committed by their models | Harms from building a car | "If I build a car that is far more dangerous than other cars, don’t do any safety testing, release it, and it ultimately leads to people getting killed, I will probably be held liable and have to pay damages, if not criminal penalties." | | | | Vox |
| | AI developers should not be liable | Harms from information found through a search engine | "If I build a search engine that (unlike Google) has as the first result for “how can I commit a mass murder” detailed instructions on how best to carry out a spree killing, and someone uses my search engine and follows the instructions, I likely won’t be held liable" | | | | Vox |
| | Compute | Uranium enrichment | "Uranium mining and enrichment and compute fabrication and training both lead to outputs that can be used for both safe and harmful purposes, require significant capital investments, and can be differentiated by quantitative measures of quality (e.g., operations per watt or the level of enrichment) . . . Each process is lengthy, difficult, expensive, and potentially amenable to monitoring." | "Despite these limitations with the analogy, it is striking that society has also safely produced fairly large quantities of nuclear power and that there have been zero instances of nuclear terrorism in the nearly 80 years since the advent of nuclear weapons technology. The political and technical regimes that govern nuclear technology likely deserve some credit for this situation. While this system is certainly imperfect—rogue states like North Korea have still managed to build up their nuclear capacity in part via illegal proliferation networks—it is nevertheless a proof of concept for an institutional design that governs a highly sought after, dual-use technology at global scale." | "while both are dual-use, it is possible to infer a narrower set of potential uses for enriched uranium, while high-capability models can be applied to a wider variety of use cases. . . The analogy is inexact as the chips are the physical location where the process occurs, rather than the material that is processed—which is data using particular algorithms. So an individual AI chip processing data, say, can be compared to an individual centrifuge enriching uranium, while a data center can be compared to an enrichment plant." "There is another significant limitation associated with the nuclear analogy; namely, while the emphasis on hardware excludability advances nonproliferation aims, there are no obvious parallels in nuclear proliferation to the release of model weights." The level of compute needed to achieve a given level of capabilities falls over time - Helen Toner | | This paper, for one |
| | AI legislation that would give a regulator enough flexibility to adapt to changing AI technology | Legislation for the FAA in the US | | | | | Yoshua Bengio |
| | Spreading safety and security efforts across development/deployment | Spreading too little butter over too much bread | | | | | Miles Brundage |
| | Trying to regulate AI with third-party human auditors | Meshing gears moving at two different speeds | "Most forms of 'operational safety' happen at computer speed" vs "Most forms of compliance involve processes that happen at 'human speed'" | | | | Jack Clark |
| Instrumental Convergence | Many different goals imply the same initial actions | In Minecraft, no matter where you want to go or what you want to build, you have to start by chopping down some trees to collect wood, mine some resources, etc. | | | This is true in Minecraft because a game designer set it up that way: it does not convey how instrumental convergence for AI is a property of the world. | | Evan Hubinger |
| | The AI does not hate you | "You’re probably not an evil ant-hater who steps on ants out of malice, but if you’re in charge of a hydroelectric green-energy project and there’s an anthill in the region to be flooded, too bad for the ants." | | | | | Stephen Hawking |
| Intelligence | "Trying to assign an IQ to machines" | "Trying to get four-legged animals to compete in a human decathlon." | "True, horses can run fast and jump high, but they have a lot of trouble with pole-vaulting and throwing the discus." | | | | Stuart Russell, Human Compatible (p.48) |
| | "Intelligence without knowledge" | "an engine without fuel." | | | | | Stuart Russell, Human Compatible (p.79) |
| | Saying that an AI could be more intelligent than humanity | Saying that Mike Tyson is stronger than my grandmother | Intelligence and strength are fuzzy concepts. We don't need to decompose them precisely for them to be useful ideas. | | | | Scott Alexander |
| | Intelligence | A cake | "the bulk of the cake is unsupervised learning, the icing on the cake is supervised learning, and the cherry on the cake is reinforcement learning (RL)." | | DeepMind researchers responded . . . with their own image of a cake topped with countless cherries, arguing that reinforcement learning with its many reward signals can reflect significant value. (Source) | | Yann LeCun |
| Intelligence explosion | Intelligence begetting further intelligence | A "strength explosion" in which you become strong enough to open a sealed box containing steroids | "You need to be very strong to get the lid off the box. But once you do, you can take the steroids." | | | | Scott Alexander |
| | An AI using its capabilities to become even more capable | Humans inventing reading and writing to record "training data" for each other | | | | | Scott Alexander |
| | An AI using its capabilities to become even more capable | Inventing iodine supplementation to "treat goiter and gain a few IQ points on a population-wide level" | | | | | Scott Alexander |
| | An AI using its capabilities to become even more capable | Human civilization | "Humans build tech which improves human capabilities which leads to more tech, etc" | | | | AI Notkilleveryoneism Memes |