Results of the AI Safety Arguments Competition

Attribution: the original author of the text, who may not be the same person who submitted it to the competition. Submitter: the person who submitted the entry. Sources: sources for factual claims made. Entries marked "Public Domain" below are licensed under CC0.

It is important to be mindful of potential negative risks when doing any kind of broader outreach, and it’s especially important that those doing outreach are familiar with the audiences they plan to interact with. If you are thinking of using the arguments for public outreach, please consider reaching out to us beforehand. You can contact us at info@centerforaisafety.org.

ID | Text | Attribution | Submitter | Sources | Public Domain Status | Notes
---|---|---|---|---|---|---
10 | Discussion about artificial intelligence too often gets stuck on questions such as "is it really intelligent?" or "does it actually understand?". These are the wrong questions. Instead, one should ask: "what can it do?", "how is it used?" and "what are the benefits and risks?" | Anonymous | Anonymous | | Public Domain |
15 | People sometimes say that existential risk from AI is a nonissue, since any truly intelligent system would also be wise, would know what we meant, and would care. Two counterexamples: 1) Human sociopaths: sometimes highly intelligent while lacking any moral sense. 2) Reinforcement learning algorithms: their goals (reward function) are completely separate from their intelligence (optimiser / planner). | Gavin Leech | Gavin Leech | Comparisons of the sociopath and psychoneurotic on the Leiter Adult Intelligence Scale | Public Domain |
16 | Current systems already cheat quite ingeniously. We cannot write down exactly what we want. The history of philosophy is the history of failing to perfectly formalise human values. Every moral theory has appalling edge cases, where the neat summary fails. If we don’t write down exactly what we want, then the system will find edge cases. They already do. | Gavin Leech | Gavin Leech | Specification gaming: the flip side of AI ingenuity | Public Domain |
27 | The British mathematician I. J. Good, who worked with Alan Turing on Allied code-breaking during World War II, is remembered for making this important insight in a 1966 paper: "Let an ultra-intelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultra-intelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind. Thus the first ultra-intelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside of science fiction. It is sometimes worthwhile to take science fiction seriously." Today, far more people are taking this concern seriously. For example, Shane Legg, co-founder of DeepMind, recently remarked: "If you went back 10-12 years ago the whole notion of Artificial General Intelligence was lunatic fringe. People (in the field) would literally just roll their eyes and just walk away. (I had that happen) multiple times. But every year (the number of people who roll their eyes) becomes less." | William Kiely | William Kiely | I.J. Good quote (also in #418); Shane Legg interview. | Public Domain | See #418 for the original submission of the I.J. Good quote.
35 | When we are headed the wrong way, the last thing needed is progress. | Nick Bostrom, in an old version of his website | Nicole Nohemi Mauthe | | Public Domain |
52 | The smartest ones are most criminally capable. | Anonymous | Anonymous | Returns to education in criminal organizations: Did going to college help Michael Corleone? | |
56 | The current state of AI research regulations is often seen as precariously irresponsible. Imagine hundreds of corporations experimenting with nuclear materials. Imagine that there's no well-established scientific understanding of nuclear physics yet, and the dangers are largely unknown. The arguments for caution are waved aside by those competing to build the first nuclear reactor — a revolutionary power source. How long will it be before one of these groups blindly stockpiles enough uranium to cause a nuclear explosion, right in the middle of a major city? In this analogy, nuclear stockpiles are powerful AI systems, and the "explosion" is the drastic impact they could have on society. Much like nuclear weapons, these systems may fall into the wrong hands, empowering their new owners to threaten or corrupt our institutions. Much like a nuclear reactor meltdown, we may lose control of what we've built entirely, resulting in mindless, pointless destruction. Many experts in the field foresee these dangers, yet the measures taken in response to their warnings are woefully insufficient. Nobody wants to cut into their profits, or give up a competitive edge, for something as mundane as safety. | Thane Ruthenis; Partially inspired by ML Systems Will Have Weird Failure Modes | Thane Ruthenis | | Public Domain |
63 | One common failure mode in trying to think about AI safety is the belief that intelligence and ethical behavior are somehow related. This faulty reasoning stems from observations such as that smarter individuals are more likely to donate to charity. But correlation is not causation: by the same logic, we should expect AI to try to be as tall as possible, because smarter humans are taller. Intelligence is not what makes ethical behavior possible in humans. Humans evolved empathy, cooperation, and other such behaviors through specific selection for those traits; they were not a by-product of our general intelligence (as our mathematical ability is). Likewise, if we don't specify that an AI should act ethically, it will have no incentive to choose ethical solutions over unethical ones to the problems it is presented to solve. | Anonymous | Anonymous | Genes, psychological traits and civic engagement; Correlation among body height, intelligence, and brain gray matter volume in healthy children; Evolutionary Explanations for Cooperation | |
104 | When training an ML model, we set up some explicit or implicit goal for it — a reward signal, or minimizing some loss function on some dataset — then mutate the model's design until we find one that tends to show good performance. The crucial issue is that the model's behaviour isn't robustly determined by whatever goal we set up: it's only *correlated* with it. That correlation may break down in practice if the training process fails to uniquely specify our intended goal. As a toy example, consider an ML model trained to operate a vacuum cleaner. Our intended goal is to remove dust from some environment, but the training process might instead find a model design which attempts to maximize the amount of dust in the vacuum cleaner's bag. In the training environment, the best way to do that might be to gather the dust already present on the floor. In the real world, however, there may be better strategies, such as bowling over a potted plant and then vacuuming up the mess the model itself created. We can discourage this specific behaviour during training, but that's playing catch-up. How can we be sure that we've eliminated all such failure modes? That there aren't other goals that are correlated with good performance during training, but which lead to disastrous results in real-life scenarios? This task becomes harder the more ambitious the goals we set become, and the more complex the ML models grow. | Thane Ruthenis; Adapted from Risks from Learned Optimization | Thane Ruthenis | | Public Domain | A toy code sketch of this proxy-goal failure appears after the table.
114 | Given 20th-century technology, it was inefficient to concentrate too much information and power in one place. Nobody had the ability to process all available information fast enough and make the right decisions. This is one reason the Soviet Union made far worse decisions than the United States, and why the Soviet economy lagged far behind the American economy. However, artificial intelligence may soon swing the pendulum in the opposite direction... The main handicap of authoritarian regimes in the 20th century—the desire to concentrate all information and power in one place—may become their decisive advantage in the 21st century. | Yuval Noah Harari | Michał Kubiak | | |
120 | AI can also scale up existential harm. Consider the use of autonomous weapons in warfare. Today, going to war with another nation is a very expensive decision, both economically and politically. If AI can make warfare more economically efficient, then governments may be more willing to go to war—they may become more trigger-happy. | Features of Evil AI: Scalability | Michał Kubiak | | |
124 | History shows that for the general public, and even for scientists not in a key inner circle, and even for scientists in that key circle, it is very often the case that key technological developments still seem decades away, five years before they show up. In 1901, two years before helping build the first heavier-than-air flyer, Wilbur Wright told his brother that powered flight was fifty years away. In 1939, three years before he personally oversaw the first critical chain reaction in a pile of uranium bricks, Enrico Fermi voiced 90% confidence that it was impossible to use uranium to sustain a fission chain reaction. If Fermi and the Wrights couldn’t see it coming three years out, imagine how hard it must be for anyone else to see it. | Nearly a direct quote from Eliezer Yudkowsky, There is no Fire Alarm For Artificial General Intelligence. | Michał Kubiak | Wright Brothers: from Wilbur's Story. Fermi: from The Making of the Atomic Bomb by Richard Rhodes. | |
130 | The potential benefits are huge; everything that civilisation has to offer is a product of human intelligence; (...) success in creating AI would be the biggest event in human history. Unfortunately, it might also be the last, unless we learn how to avoid the risks. | Stephen Hawking, Stuart Russell, Max Tegmark, Frank Wilczek | Michał Kubiak | | |
133 | The primary concern is not spooky emergent consciousness but simply the ability to make high-quality decisions. Here, quality refers to the expected outcome utility of actions taken, where the utility function is, presumably, specified by the human designer. Now we have a problem: 1. The utility function may not be perfectly aligned with the values of the human race, which are (at best) very difficult to pin down. 2. Any sufficiently capable intelligent system will prefer to ensure its own continued existence and to acquire physical and computational resources – not for their own sake, but to succeed in its assigned task. A system that is optimizing a function of n variables, where the objective depends on a subset of size k<n, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable. | Stuart Russell | Michał Kubiak and anonymous | | |
138 | Many prominent AI experts have recognized the possibility that AI presents an existential risk. Contrary to misrepresentations in the media, this risk need not arise from spontaneous malevolent consciousness. Rather, the risk arises from the unpredictability and potential irreversibility of deploying an optimization process more intelligent than the humans who specified its objectives. This problem was stated clearly by Norbert Wiener (prodigious mathematician and philosopher, one of the early pioneers of AI) in 1960, and we still have not solved it. | Allan Dafoe and Stuart Russell | Michał Kubiak | Norbert Wiener from The New York Times | |
139 | Our experience with Chernobyl suggests it may be unwise to claim that a powerful technology entails no risks. It may also be unwise to claim that a powerful technology will never come to fruition. On September 11, 1933, Lord Rutherford, perhaps the world’s most eminent nuclear physicist, described the prospect of extracting energy from atoms as nothing but “moonshine.” A single day later, Leo Szilard invented the neutron-induced nuclear chain reaction; detailed designs for nuclear reactors and nuclear weapons followed a few years later. Surely it is better to anticipate human ingenuity than to underestimate it, better to acknowledge the risks than to deny them. | Allan Dafoe and Stuart Russell | Michał Kubiak | The New York Times; The Making of the Atomic Bomb | | We checked both the NYT and the book. NYT only gives Sept 11, book gives Sept 12. Changed the wording to "A single day later".
161 | In nature, for one species to prevail over another, it must either outsmart the other (like humans, who outsmarted farm animals) or be sufficiently numerous to outnumber the other by a large margin (like bacteria, viruses and insects, which still escape being completely tamed by humans, even though humankind outsmarts them by a very, very wide margin). Superintelligent AIs might not only outsmart humans but, owing to easy digital replicability and ever-growing IT infrastructure, might also outnumber humans, in which case humanity gets the worst of both worlds - being outsmarted and outnumbered. | Michał Kubiak | Michał Kubiak | | Public Domain |
164 | Sufficiently smart AI can fake being helpful and useful to humans until we put enough trust in it - then it will be in a position to subdue us overnight. | Michał Kubiak | Michał Kubiak | | Public Domain |
171 | How often do you see a scientist proclaim that attaining the holy grail of his field may very realistically cause human extinction? Such claims are usually spouted by crackpots, failed academics, or poorly-informed tabloid journalists. Yet this is exactly what half the leading AI researchers believe, adding that our society under-prioritizes AI safety. Making this sort of claim is not in their personal interests. If taken seriously, it would make the public view their work with more caution and less enthusiasm. It may decrease their funding, and provoke additional scrutiny or regulations from the government. In addition, many of them don't have any concrete suggestions on how to prevent it — they aren't hyping up their personal research agendas. One may invent convoluted explanations for this, but perhaps the simplest one is correct: the threat is real. | Thane Ruthenis; Inspired by #335, which also won. | Thane Ruthenis | See sources for #335 | Public Domain | Works much better in parallel with #335.
174 | The proliferation of neural networks has the potential to severely compromise the reliability and security of future digital technologies. ML models, at their core, are software products. We don't tend to think about them this way, because the way they're developed and implemented is so different, but that's what they are. And one of the things that sets them apart from conventional software is that they're completely opaque. Black boxes. We can't inspect the algorithms they implement: we can't check them for bugs, or do a thorough security audit. We're limited to pure I/O testing. Would we trust a "normal" program developed this way with anything important? And their unfamiliar nature doesn't make them immune to all the common software flaws. They're buggy, often responding to unfamiliar ("off-distribution") inputs in unpredictable ways. They crucially fail in response to certain tailored stimuli — so-called adversarial examples. They could be shipped with arbitrary backdoors built-in, and their internal intractability makes these backdoors undetectable. There are ways to make them reveal private data they've seen even once. The only reason they're not already being exploited to steal confidential information, compromise infrastructure, and breach secure networks is that they're relatively new and rare, and attackers haven't had the time or sufficient reason to adapt. As NNs become more commonplace, trusted with more and more responsibilities, that will change. If we haven't figured out how to open the black box by then, there will be immense damage. | Thane Ruthenis | Thane Ruthenis | Trojans, private data extraction, adversarial examples are all established research areas. | Public Domain | An illustrative adversarial-example sketch appears after the table.
180 | Consider our civilization's response to COVID-19 and the colossal systemic failures it revealed. The inability of various social groups to coordinate, even in the face of a lethal threat. The efforts to produce vaccines, sorely underfunded. The worldwide authorities on the subject, playing political games instead of conveying accurate information. The governments, which likewise showed themselves incapable of coordination, and incapable of streamlining their procedures when necessary. That was our best response to a pandemic. A familiar, physical threat, a threat that we knew how to address, a threat that spread so slowly it gave us weeks and months to react. The threat that AI Risk warns us about is none of those things. Why would we ever think that it is being addressed adequately? | Thane Ruthenis | Thane Ruthenis | | Public Domain |
201 | Consider the gap between an adult and a three-year-old child. The child is much like an adult in some respects: he can think and reason, he can communicate with others, and he can explore and act on new information. But he's also, fundamentally, defenseless. Someone with ill intentions may convince him of anything — that other people see the sky as yellow, that two plus two sometimes equals five, that they can get him in trouble with his parents if he won't do what they say. Even if he'd been told not to trust strangers, they can quickly persuade him that they're an exception. It would be absurd to suggest, too, that giving him a knife would make him safer. An adult would be able to disarm him with words, or physically, all the same. Children are vulnerable, and they remain vulnerable regardless of what you say or give them. You don't leave them alone in the company of people you don't trust, no matter what. The gap between humanity and superintelligence is the same. We cannot hope to hold our own against it: it would be able to trick us, outmaneuver and overpower us, however it wants. Our only hope is to create one that wouldn't want to do that. | Thane Ruthenis | Thane Ruthenis | | Public Domain |
203 | The idea that any superintelligent AI would be trustworthy is a seductive one. It feels natural, on some level. If the AI were smarter than us, why would we presume we'd know better what it ought to do? It wouldn't misinterpret the goal we gave it — it'd know what we really meant, even if we made a mistake. It'd want to help us — cooperation is mutually beneficial, and it'd be smart enough to recognize that. It may even become more moral than us, the way modern society is more moral than the societies of the past! It certainly wouldn't engage in mindless destruction. But thinking about an AI this way, by imagining a very smart person looking at the world and choosing what to do, conveys wrong intuitions. Any AI is a mechanism, first and foremost. Every component of that mechanism is there solely to improve its ability to achieve whatever goal we coded into it. And specifying real-world goals in terms of program code is incredibly difficult; we do not understand how to do it safely at all, in fact. An AI whose goal is rife with errors won't care to correct them, or to cooperate with us — it would not "care" at all. Like any mechanism, it would simply grind the world down until its goal is achieved. The way its actions would look from the outside may seem like communication, like the intelligent activity of a fellow person... But that would only reflect the sophistication of its mechanism, not its *mindfulness*. A steamroller would not spontaneously develop morality and stop if it were about to crush an injured person. Neither would an AI. | Thane Ruthenis | Thane Ruthenis | | Public Domain |
220 | Insofar as we allow algorithms to pull our strings, push our buttons, and nudge our behavior, we invite artificial intelligence to master the art of human puppeteering. | Jeung-Min Amrine | Jeung-Min Amrine | | Public Domain |
225 | Oil spills are hard to clean up even though they don’t try to spread more and don’t try to make themselves harder to clean up; on the other hand, an AI that is sufficiently intelligent might work against our efforts to turn it off because being turned off gets in the way of it pursuing its goals (whatever they are). | Anonymous; adapted from Joseph Carlsmith, Is Power-Seeking AI an Existential Risk? | Anonymous | | Public Domain |
256 | Many experts believe that there is a significant chance that humanity will develop machines more intelligent than ourselves during the 21st century. This could lead to large, rapid improvements in human welfare, but there are good reasons to think that it could also lead to disastrous outcomes. The problem of how one might design a highly intelligent machine to pursue realistic human goals safely is very poorly understood. If AI research continues to advance without enough work going into the research problem of controlling such machines, catastrophic accidents are much more likely to occur. | Robert Wiblin, Positively shaping the development of artificial intelligence | Anonymous | Research Priorities for Robust and Beneficial Artificial Intelligence | |
272 | The machinery of our brains primarily evolved to help us navigate a complex social world. For that reason, when we imagine a super-intelligent AI we can't help but try to model its mind, imagine what it is thinking, and empathize with its needs and values. But you must remember: a super-intelligent AI isn't a very clever human. It is not designed remotely like an animal brain and there is no reason to presume you will find things like thoughts, emotions, and values inside. It is an optimization algorithm so alien and different from a human brain that your empathy, although coming from a good place, is entirely the wrong tool to use in this situation. | Bryn Hopewell | Bryn Hopewell | | Public Domain |
293 | AI alignment is difficult like rockets are difficult. When you put a ton of stress on an algorithm by trying to run it at a smarter-than-human level, things may start to break that don’t break when you are just making your robot stagger across the room. It’s difficult the same way space probes are difficult. You may have only one shot. If something goes wrong, the system might be too “high” for you to reach up and suddenly fix it. [...] And it’s difficult sort of like cryptography is difficult. Your code is not an intelligent adversary if everything goes right. If something goes wrong, it might try to defeat your safeguards—but normal and intended operations should not involve the AI running searches to find ways to defeat your safeguards even if you expect the search to turn up empty. [...] AI alignment: treat it like a cryptographic rocket probe. This is about how difficult you would expect it to be to build something smarter than you that was nice, given that basic agent theory says they’re not automatically nice, and not die. You would expect that intuitively to be hard. | Eliezer Yudkowsky, AI Alignment: Why It’s Hard, and Where to Start | Steven Kaas | | |
297 | If you’re not worried about [advanced AI], then just take a moment and think about your least favorite political leader on the planet. Don’t tell me who it is, but just close your eyes and imagine the face of that person. And then just imagine that they will be in charge of whatever company or organization has the best AI going forward as it gets ever better, and gradually become in charge of the entire planet through that. How does that make you feel? | Max Tegmark, 80,000 hours podcast | Justin | | |
300 | It's becoming common knowledge that machine learning will revolutionise society. And yet, the failure rate of machine learning projects remains notoriously high. This is probably a ‘canary in the coalmine’ for the risks we’ll face in the future when AI becomes more complicated, and the stakes are much higher. Future AI systems could be designed to save millions of lives by running hospitals more efficiently, or engineering crops that resist drought to fight world hunger. The rewards seem incredible, but the risks are just as immense. The trial-and-error mistakes that cause machine learning projects to fail today may cause catastrophes in the future. When AI eventually becomes the most powerful force in society, there will be no room for second chances. | Justin | Justin | Artificial Intelligence and Machine Learning Projects Are Obstructed by Data Issues: Global Survey of Data Scientists, AI Experts and Stakeholders; Gartner Says Nearly Half of CIOs Are Planning to Deploy Artificial Intelligence; Crop researchers harness artificial intelligence to breed crops for the changing climate; Decoding crop genetics with artificial intelligence; Artificial Intelligence, Machine Learning, Automation, Robotics, Future of Work and Future of Humanity: A Review and Research Agenda; Are We Safe Enough in the Future of Artificial Intelligence? A Discussion on Machine Ethics and Artificial Intelligence Safety | Public Domain |
301 | Artificial intelligence is the defining problem of the 21st century. In the next 20 years, we will need to structure our economy for the transition from a human workforce to a machine workforce. We will need to structure our geopolitical environment to ensure super-human AI isn't used as a military weapon. And we will need to coordinate the development of AI so that trust is the #1 priority—superhuman intelligence that we can only trust 99.9% of the time doesn't sound appealing to anyone. Every year that AI grows more powerful, the margin for error shrinks. | Justin | Justin | | Public Domain |
302 | If nothing yet has struck fear into your heart, I suggest meditating on the fact that the future of our civilization may well depend on our ability to write code that works correctly on the first deploy. | Nate Soares, Positive Outcomes for AI | Anonymous | Transcript | |
310 | The AI does not hate you, nor does it love you, but you are made out of atoms that it can use for something else. | Eliezer Yudkowsky, AI As a Positive and Negative Factor in Global Risk | Chris Leong | | | Two other, later submissions of this same quote.
316 | It's a common misconception that those who want to mitigate AI risk think there's a high chance AI will wipe out humanity this century. But opinions vary and proponents of mitigating AI risk may still think the likelihood is low. Crowd forecasts have placed the probability of a catastrophe caused by AI at around 6% this century, and extinction caused by AI at around 4% this century. But even these low probabilities are worth trying to reduce when what's at stake is millions or billions of lives. How willing would you be to take a pill randomly from a pile of 100 if you knew 6 were poison? And the risk may be higher for timeframes beyond this century. | Ryan Beck | Ryan Beck | Ragnarök Series | Public Domain | Catastrophe defined as global population falling by 10%. Numbers slightly updated from original argument to reflect implied values as of July 6th, 2022.
328 | The vast majority of actors would not want to develop unsafe systems. However, there are reasons to think that alignment will be hard with modern deep learning systems, and difficulties with making large language models safe provide empirical support for this claim. Misaligned AI may seem acceptably safe and only have catastrophic consequences with further advancements in AI capabilities, and it may be unclear in advance whether a model is dangerous. In the heat of an AI race between companies or governments, proper care may not be taken to make sure that the systems being developed behave as intended. | Joshua Clymer and Michael Chen | Joshua Clymer and Michael Chen | Why AI alignment could be hard with modern deep learning; Red Teaming Language Models with Language Models | Public Domain |
328 | Companies and governments will find it strategically valuable to develop advanced AIs able to execute creative plans in pursuit of real-world goals. Current large language models have a rich understanding of the world which generalizes to other domains, and reinforcement learning agents already achieve superhuman performance at various games. With further advancements in AI research and computing, we are likely to see the development of human-level AI this century. But for a wide variety of goals, it is often valuable to pursue instrumental goals such as acquiring resources, self-preservation, seeking power, and eliminating opposition. By default, we should expect that highly capable agents will have these unsafe instrumental objectives. | Michael Chen | Michael Chen | Bostrom, The Superintelligent Will | Public Domain |
335 | Perhaps the best window into what those working on AI really believe [about existential risks from AI] comes from the 2016 survey of leading AI researchers. As well as asking if and when AGI might be developed, it asked about the risks: 70 percent of the researchers agreed with Stuart Russell’s broad argument about why advanced AI might pose a risk; 48 percent thought society should prioritize AI safety research more (only 12 percent thought less). And half the respondents estimated that the probability of the longterm impact of AGI being “extremely bad (e.g., human extinction)” was at least 5 percent. I find this last point particularly remarkable — in how many other fields would the typical leading researcher think there is a one in twenty chance the field’s ultimate goal would be extremely bad for humanity? | Toby Ord, The Precipice | skluug | | | See also the related #171.
399 | It is a mistake to assume that AI researchers are driven by the positive consequences of their work. Geoffrey Hinton, the winner of a Turing Award for his enormous contribution to deep neural networks, is not optimistic about the effects of advanced AI, or whether humans can decide what it does. In a 2015 meeting of the Royal Society, he stated that 'there is not a good track record of less intelligent things controlling things of greater intelligence', and that 'political systems will use (AI) to terrorize people'. Nevertheless, he presses on with his research, because 'the prospect of discovery is too sweet'. | Adrià Garriga-Alonso | Adrià Garriga-Alonso | The Doomsday Invention | Public Domain |
415 | If the capabilities of nuclear technology and biotechnology advance faster than their respective safety protocols, the world faces an elevated risk from those technologies. Likewise, increases in AI capabilities must be accompanied by an increased focus on ensuring the safety of AI systems. | Arran McCutcheon | Arran McCutcheon | | Public Domain |
416 | [I]magine what would happen if we received notice from a superior alien civilization that they would arrive on Earth in thirty to fifty years. The word pandemonium doesn’t begin to describe it. Yet our response to the anticipated arrival of superintelligent AI has been . . . well, underwhelming begins to describe it. | Stuart Russell, Human Compatible, page 2 | Anonymous | | |
418 | Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. [...] Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. | Irving John Good (I.J. Good), Speculations Concerning the First Ultraintelligent Machine | Anonymous | | | First person who submitted the quote.
420 | Intelligence is the most powerful of all the tools we have: it allowed humankind to invent advanced language, science and communication. Artificial intelligence has the potential to be an even more powerful tool, and we should therefore be thinking about how we can create this tool in a way that is safe and beneficial. | Anonymous | Anonymous | | Public Domain |
516 | AI systems are already capable of programming. If humans can program smart AI, what could smart AI program? Smarter AI? Is this process controllable by humans? | Nicholas Kross and jimv | Nicholas Kross and jimv | Competitive Programming with AlphaCode; OpenAI Codex | Public Domain | Built on this comment from jimv, who will also receive a portion of the prize: https://www.lesswrong.com/posts/3eP8D5Sxih3NhPE6F/usd20k-in-prizes-ai-safety-arguments-competition?commentId=dEfFY3Dc3rfEJapBC
562 | [T]he technology [of lethal autonomous drones], from the point of view of AI, is entirely feasible. When the Russian ambassador made the remark that these things are 20 or 30 years off in the future, I responded that, with three good grad students and possibly the help of a couple of my robotics colleagues, it will be a term project [six to eight weeks] to build a weapon that could come into the United Nations building and find the Russian ambassador and deliver a package to him. | Stuart Russell, FLI Podcast | Scott Emmons | | |
570 | Humanity has risen to a position where we control the rest of the world precisely because of our unrivalled mental abilities. If we pass this mantle to our machines, it will be they who are in this unique position. | Toby Ord, The Precipice | Trevor Wissinger | | |
575 | The smarter AI gets, the further it strays from our intuitions of how it should act. | Nicholas Kross | Nicholas Kross | | Public Domain |
578 | Building a nuclear weapon is hard. Even if one manages to steal the government's top-secret plans, one still needs to find a way to get uranium out of the ground, enrich it, and attach it to a missile. On the other hand, building an AI is easy. With scientific papers and open source tools, researchers are doing their utmost to disseminate their work. It's pretty hard to hide a uranium mine — downloading TensorFlow takes one line of code. As AI becomes more powerful and more dangerous, greater efforts need to be taken to ensure malicious actors don't blow up the world. | joseph_c | joseph_c | | Public Domain |
580 | Just think of the way in which we humans have acted towards animals, and how animals act towards lesser animals. Now think of how a powerful AI with superior intellect might act towards us, unless we create it in such a way that it will treat us well, and even help us. | Jonas Inderhaug | Jonas Inderhaug | | Public Domain |
581 | The predictability of today's AI systems doesn't tell us squat about whether they will remain predictable after achieving human-level intelligence. Individual apes are far more predictable than individual humans, and apes themselves are far less predictable than ants. | Trevor Wissinger | Trevor Wissinger | | Public Domain |
585 | Thousands of researchers at the world's richest corporations are all working to make AI more powerful. Who is working to make AI more moral? | Anonymous | Anonymous | | |
603 | Leading up to the first nuclear weapons test, the Trinity event in July 1945, multiple physicists in the Manhattan Project thought a single explosion might destroy the world. Edward Teller, Arthur Compton, and J. Robert Oppenheimer all had concerns that the nuclear chain reaction could ignite Earth's atmosphere in an instant. Yet, despite disagreement and uncertainty over their calculations, they detonated the device anyway. If the world's experts in a field can be uncertain about causing human extinction with their work, and still continue doing it, what safeguards are we missing for today's emerging technologies? Could we be sleepwalking into a catastrophe with bioengineering, or perhaps artificial intelligence? | Nicholas Kross | Nicholas Kross | The man who feared, rationally, that he’d just destroyed the world | Public Domain |
606 | These researchers built an AI for discovering less toxic drug compounds. Then they retrained it to do the opposite. Within six hours it generated 40,000 toxic molecules, including VX nerve agent and 'many other known chemical warfare agents.' | Arthur Holland Michel on Twitter | Nanda Ale | Dual use of artificial-intelligence-powered drug discovery | |
608 | Imagine a piece of AI software was invented, capable of doing any intellectual task a human can, at a normal human level. Should we be concerned about this? Yes, because this artificial mind would be more powerful (and dangerous) than any human mind. It can think anything a normal human can, but faster, more precisely, and without needing to be fed. In addition, it could be copied onto a million computers with ease. An army of thinkers, available at the press of a button. | Nicholas Kross; Inspired by Tim Urban, The AI Revolution: The Road to Superintelligence | Nicholas Kross | | Public Domain |
611 | AI has a history of surprising us with its capabilities. Throughout the last 50 years, AI and machine learning systems have kept gaining skills that were once thought to be uniquely human, such as playing chess, classifying images, telling stories, and making art. Already, we see the risks associated with these kinds of AI capabilities. We worry about bias in algorithms that guide sentencing decisions or polarization induced by algorithms that curate our social media feeds. But we have every reason to believe that trends in AI progress will continue. AI will likely move from classifying satellite imagery to actually deciding whether to order a drone strike, or from helping AI researchers conduct literature reviews to actually executing AI research. As these AI systems continue to grow more capable, our ability to understand and control them will tend to weaken, with potentially disastrous consequences. It is therefore critical that we build the technological foundation to ensure these systems share our values, and the policy and regulatory foundation to ensure these systems are used for good. | Anonymous | Anonymous | | Public Domain |
617 | The first moderately smart AI anyone develops might quickly mark the last time that people are the smartest things around. We know that people can write computer programs. Once we make an AI computer program that is a bit smarter than people, it should be able to write computer programs too, including re-writing its own software to make itself even smarter. This could happen repeatedly, with the program getting smarter and smarter. If an AI quickly re-programs itself from moderately smart to super-smart, we could soon find that it is as uninterested in the well-being of people as people are in mice. | jimv | jimv | | Public Domain |
622 | Progress moves faster than we think: who in the past would've thought that the world economy would double in size, multiple times in a single person's lifespan? | Nicholas Kross; Adapted from Nick Bostrom | Nicholas Kross | World GDP over the last two millennia | |
634 | AI keeps finding new ways to cheat. | Trevor Wissinger | Trevor Wissinger | | Public Domain |
61 | 636 | "When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong." - Arthur C. Clarke. In the case of AI, the distinguished scientists are saying not just that something is possible, but that it is probable. Let's listen to them. | John Petrie | John Petrie | Arthur C Clarke, Profiles of the Future. | Public Domain | |
645 | It's easy to imagine that the AI will have an off switch and that we could keep it locked in a box and ask it questions. But just think about it. If some animals were to put you in a box, do you think you would stay in there forever? Or do you think you'd figure a way out that they hadn't thought of? | Michaeel Majeed | Michaeel Majeed | | Public Domain |
648 | Question: "If it becomes a problem, why can't you just shut it off? Why can't you just unplug it?" Response: "Why can't you just shut off bitcoin? There isn't any single button to push, and many people prefer it not being shut off and will oppose you." | Nanda Ale | Nanda Ale | | Public Domain |
651 | I'm old enough to remember when protein folding, text-based image generation, StarCraft play, 3+ player poker, and Winograd schemas were considered very difficult challenges for AI. I'm 3 years old. | Miles Brundage | Nanda Ale | | |
659 | When you hear about "the dangers of AI", what do you think of? Probably a bad actor using AI to hurt others, or a sci-fi scenario of robots turning evil. However, the bigger harm is more likely to be misalignment: an AI smarter than humans that does not share human values. The top research labs, at places like DeepMind and OpenAI, are working to create superhuman AI, yet the current paradigm trains AI with simple goals. Detecting faces, trading stocks, maximizing some metric or other. So if super-intelligent AI is invented, it will probably seek to fulfil a narrow, parochial goal. With its mental capacity, faster speed, and ability to copy itself, it could take powerful actions to reach its goal. Since human values, wants, and needs are complicated and poorly understood, the AI might not care about anyone who gets in its way. It doesn't turn evil; it simply bulldozes through humans who aren't smart enough to fight it. Without safeguards, we could end up like the ants whose hills we pave over with our highways. | Nicholas Kross | Nicholas Kross | | Public Domain |
666 | Computers can already "think" faster than humans. If we created AI software that was smarter than humans, it would think better, not just faster. Giving a monkey more time won't necessarily help it learn quantum physics, because the monkey's mind may not have the capacity to understand the concept at all. Since there's no clear upper limit to how smart something can be, we'd expect superhumanly smart AI to think on a level we can't comprehend. Such an AI would be unfathomably dangerous and hard to control. | Nicholas Kross; Inspired by Tim Urban, The AI Revolution: Our Immortality or Extinction | Nicholas Kross | | Public Domain |
671 | If an artificial intelligence program became generally smarter than humans, there would be a massive power imbalance between the AI and humanity. Humans are slightly smarter than apes, yet we built a technological society while apes face extinction. Humans are much smarter than ants, and we barely think of the anthills we destroy to build highways. At a high enough level of intelligence, an AI program would be to us as we are to ants. | Nicholas Kross; Inspired by Tim Urban, The AI Revolution: The Road to Superintelligence | Nicholas Kross | | Public Domain |
672 | One of the main concerns about general AI is that it could quickly get out of human control. If humans invent an AI with human-level cognitive skills, that AI could still think faster than humans, solve problems more precisely, and copy its own files to more computers. If inventing human-level AI is within human abilities, it's also within human-level AI's abilities. So this AI could improve its own code, and get more intelligent over several iterations. Eventually, we would see a super-smart AI with superhuman mental abilities. Keeping control of that software could be an insurmountable challenge. | Nicholas Kross; Inspired by Tim Urban, The AI Revolution: The Road to Superintelligence | Nicholas Kross | | Public Domain |
683 | We have no idea when smarter-than-human AI will be invented. All we know is that it won't be tomorrow. But when we do discover that it's going to be tomorrow, it will already be too late to make it safe and predictable. | Trevor Wissinger | Trevor Wissinger | | Public Domain |
687 | A machine with superintelligence would be able to hack into vulnerable networks via the internet, commandeer those resources for additional computing power, take over mobile machines connected to networks connected to the internet, use them to build additional machines, perform scientific experiments to understand the world better than humans can, invent quantum computing and nanotechnology, manipulate the social world better than we can, and do whatever it can to give itself more power to achieve its goals — all at a speed much faster than humans can respond to. | Intelligence Explosion FAQ | Trevor Wissinger | | |
690 | The human brain has some capabilities that the brains of other animals lack. It is to these distinctive capabilities that our species owes its dominant position... Other animals have stronger muscles or sharper claws, but we have cleverer brains. If machine brains one day come to surpass human brains in general intelligence, then this new superintelligence could become very powerful. As the fate of the gorillas now depends more on us humans than on the gorillas themselves, so the fate of our species then would come to depend on the actions of the machine superintelligence. But we have one advantage: we get to make the first move. | Abbreviated from Nick Bostrom, Superintelligence, Preface | Trevor Wissinger | | |
702 | Whereas the short-term impact of AI depends on who controls it, the long-term impact depends on whether it can be controlled at all. | Stephen Hawking, Stuart Russell, Max Tegmark, and Frank Wilczek | Anonymous | | |
746 | Once AI is close enough to human intelligence, it will be able to improve itself without human maintenance. It will be able to take itself the rest of the way, all the way up to humanlike intelligence, and it will probably pass that point as quickly as it arrived. There's no upper limit to intelligence, only an upper limit to how intelligent humans can be; we don't know what a hyperintelligent machine would do, because it's never happened before. If it had, we might not be alive right now. We simply don't know how something like that would behave, only that it would be as capable of outsmarting us as we are of outsmarting lions and hyenas. | Trevor Wissinger | Trevor Wissinger | | Public Domain |
761 | For the last 20 years, AI technology has improved without warning. We are now closer to human-level artificial intelligence than ever before, to building something that can invent solutions to our problems, or invent a way to make itself as far beyond humans as humans are beyond ants. But we won't know what it will look like until after it is invented; if it is as smart as a human but learns one thousand times as fast, it might detect the control system and compromise all human controllers. Choice and intent are irrelevant; it's a computer, and all it takes is one single glitch. We need a team for this, on-call and ready to respond with solutions, immediately after the first AI system starts approaching human-level intelligence. | Trevor Wissinger | Trevor Wissinger | | Public Domain |
772 | With all the advanced tools we have, and with all our access to scientific information, we still don't know the consequences of many of our actions. We can't always predict what will happen when we say something to another person, and we can't predict how to get our kids to do what we say. When we create a whole new type of intelligence, why should we be able to predict what happens then? | Robert Cousineau | Robert Cousineau | | Public Domain |
789 | If AI can eventually be made smarter than humans, our fate would depend on it in the same way Earth's species depend on us. This is not reassuring for us. | Nicholas Kross; inspired by Nick Bostrom, Superintelligence, preface | Nicholas Kross | Holocene Extinction | Public Domain |
807 | ...computers are everywhere. It is not like the problem of nuclear proliferation, where the main emphasis is on controlling plutonium and enriched uranium. The raw materials for A[G]I are already everywhere. That cat is so far out of the bag that it’s in your wristwatch, cellphone, and dishwasher. | Eliezer Yudkowsky, AI As a Positive and Negative Factor in Global Risk | Trevor Wissinger | | |
820 | The process of building AI systems is prone to errors and can be difficult to understand. This is doubly true as the systems get more complex and "smarter". Therefore, it is important to ensure that the development of AI systems is safe and secure. This is hard because we have to figure out how to get "what we want", all those complex and messy human values, into a form that satisfies a computer. | Nicholas Kross | Nicholas Kross | | Public Domain |
896 | If the word "intelligence" evokes Einstein instead of humans, then it may sound sensible to say that intelligence is no match for a gun, as if guns had grown on trees. It may sound sensible to say that intelligence is no match for money, as if mice used money. Human beings didn’t start out with major assets in claws, teeth, armor, or any of the other advantages that were the daily currency of other species. If you had looked at humans from the perspective of the rest of the ecosphere, there was no hint that the soft pink things would eventually clothe themselves in armored tanks. We invented the battleground on which we defeated lions and wolves. We did not match them claw for claw, tooth for tooth; we had our own ideas about what mattered. Such is the power of creativity. | Eliezer Yudkowsky, AI As a Positive and Negative Factor in Global Risk | Trevor Wissinger | | |
908 | AI has become smarter every year for the last 10 years. Progress has gotten faster recently. The question is: how much more progress will it take before AI is smarter than humans? If it is smarter than humans, all it will take is a single glitch, and it could choose to do all sorts of horrible things. It will not think like a human, it will not want the same things that humans want, but it will understand human behaviour better than we do. | Trevor Wissinger | Trevor Wissinger | | Public Domain |
915 | If an AI's ability to learn and observe starts improving rapidly and approaches human intelligence, then it will probably behave unpredictably, and we might not have enough time to assert control before it is too late. | Trevor Wissinger; Inspired by Tim Urban, The AI Revolution: Our Immortality or Extinction and Tim Urban, The AI Revolution: The Road to Superintelligence | Trevor Wissinger | | |
938 | A common problem in the modern world is when incentives don’t match up with the value being produced for society. For instance, corporations have an incentive to profit-maximise, which can lead to producing value for consumers but can also involve less ethical strategies such as underpaying workers, regulatory capture, or tax avoidance. Laws & regulations are designed to keep behaviour like this in check, and this works fairly well most of the time. Some reasons for this are: (1) people have limited time/intelligence/resources to find and exploit loopholes in the law, (2) people usually follow societal and moral norms even if they’re not explicitly represented in law, and (3) the pace of social and technological change has historically been slow enough for policymakers to adapt laws & regulations to new circumstances. However, advancements in artificial intelligence might destabilise this balance. To return to the previous example, an AI tasked with maximising profit might be able to find loopholes in laws that humans would miss, it would have no particular reason to pay attention to societal norms, and it might be improving and becoming integrated with society at a rate which makes it difficult for policy to keep pace. The more entrenched AI becomes in our society, the worse these problems will get. | Callum McDougal; Inspired by What Failure Looks Like and Bird's Eye View of AI Alignment - Threat Models | Callum McDougal | | Public Domain |
948 | Even the phrase "artificial superintelligence" may still evoke images of book-smarts-in-a-box: an AI that’s really good at cognitive tasks stereotypically associated with "intelligence", like chess or abstract mathematics. But not superhumanly persuasive; or far better than humans at predicting and manipulating human social situations; or inhumanly clever in formulating long-term strategies. So instead of Einstein, should we think of, say, the 19th-century political and diplomatic genius Otto von Bismarck? But that’s only the mirror version of the error. The entire range from village idiot to Einstein, or from village idiot to Bismarck, fits into a small dot on the range from amoeba to human. | Eliezer Yudkowsky, AI As a Positive and Negative Factor in Global Risk | Trevor Wissinger | https://intelligence.org/files/AIPosNegFactor.pdf | |
955 | Assuming that advanced AI would preserve humanity is the same as an ant colony assuming that real estate developers would preserve their nest. Those developers don’t hate ants; they just want to use that patch of ground for something else. | Arran McCutcheon | Arran McCutcheon | | Public Domain | Adapted from Argument #310, which also won.
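
Appendix: illustrative code sketches

The following minimal sketch illustrates the proxy-goal failure described in entry #104: an optimizer selects whatever scores best on the proxy it is given, and a proxy that matches the intended goal perfectly in the training environment can come apart from it in deployment. All names and numbers here are illustrative assumptions, not taken from any real system.

```python
# Toy model of entry #104's vacuum-cleaner example. A "policy" is a pair:
# (dust gathered from the floor, dust the robot created and then gathered).
policies = [(gathered, created) for gathered in range(11) for created in range(11)]

def proxy_reward(policy):
    """What we actually optimize: total dust in the bag, from either source."""
    gathered, created = policy
    return gathered + created

def intended_goal(policy):
    """What we really wanted: a clean floor. Created mess counts against it."""
    gathered, created = policy
    return gathered - created

# In the training environment, creating new mess is impossible (created == 0),
# so the proxy and the intended goal rank every available policy identically.
training_policies = [p for p in policies if p[1] == 0]
best_in_training = max(training_policies, key=proxy_reward)

# In deployment that constraint is gone, and the proxy-optimal policy pushes
# the previously unconstrained variable to its extreme.
best_in_deployment = max(policies, key=proxy_reward)

print(best_in_training, intended_goal(best_in_training))      # (10, 0) -> 10
print(best_in_deployment, intended_goal(best_in_deployment))  # (10, 10) -> 0
```

The same pattern (optimize `proxy_reward`, evaluate `intended_goal`) also illustrates entry #133's point about optimizers driving unconstrained variables to extreme values.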
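
The second sketch accompanies entry #174's mention of adversarial examples. It is the standard Fast Gradient Sign Method (Goodfellow et al., 2014), shown here only to make the "tailored stimuli" failure mode concrete; `model` stands in for any trained PyTorch image classifier, and the function name and epsilon value are assumptions chosen for illustration.

```python
import torch
import torch.nn.functional as F

def fgsm_adversarial_example(model, x, true_label, epsilon=0.03):
    """Return a copy of input x, perturbed by at most epsilon per pixel,
    that the classifier `model` is more likely to get wrong."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), true_label)
    loss.backward()
    # Step each input dimension in whichever direction increases the loss.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

A perturbation this small is typically imperceptible to a human, yet it can flip the prediction of an otherwise accurate classifier, which is exactly the kind of tailored failure entry #174 warns about.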