Each entry below records, in order: Author(s); Title; Date; Methodology; Key findings or arguments; Suggestions for actions that should be taken (e.g. further research); Search terms / identified how; Affiliated with CLR?
Entry 1
Sotala, Kaj and Lukas GloorSuperintelligence as a Cause or Cure for Risks of Astronomical Suffering2017This is a mostly theoretical paper, drawing on some previous research and theory relating to possible futures and to the development of artificial intelligence."[T]he known universe may eventually be populated by vast amounts of minds: published estimates include the possibility of 10^25 minds supported by a single star, with humanity having the potential to eventually colonize tens of millions of galaxies. While this could enable an enormous number of meaningful lives to be lived, if even a small fraction of these lives were to exist in hellish circumstances, the amount of suffering would be vastly greater than that produced by all the atrocities, abuses, and natural causes in Earth’s history so far… A superintelligence which was building subagents, such as worker robots or disembodied cognitive agents, might then also construct them in such a way that they were capable of feeling pain - and thus possibly suffering - if that was the most efficient way of making them behave in a way that achieved the superintelligence’s goals… a superintelligence which had no disincentive to create suffering but did have an incentive to create whatever furthered its goals, could create vast populations of agents which sometimes suffered while carrying out the superintelligence’s goals. Because of the ruling superintelligence’s indifference towards suffering, the amount of suffering experienced by this population could be vastly higher than it would be in e.g. an advanced human civilization, where humans had an interest in helping out their fellow humans… The simpler the algorithms that can suffer, the more likely it is that an entity with no regard for minimizing it would happen to instantiate large numbers of them." // “pathways that could lead to the instantiation of large numbers of suffering subroutines”: “Anthropocentrism. If the superintelligence had been programmed to only care about humans, or about minds which were sufficiently human-like by some criteria, then it could end up being indifferent to the suffering of any other minds, including subroutines. Indifference. If attempts to align the superintelligence with human values failed, it might not put any intrinsic value on avoiding suffering, so it may create large numbers of suffering subroutines. Uncooperativeness. The superintelligence’s goal is something like classical utilitarianism, with no additional regards for cooperating with other value systems. As previously discussed, classical utilitarianism would prefer to avoid suffering, all else being equal. However, this concern could be overridden by opportunity costs. For example, Bostrom suggests that every second of delayed space colonization corresponds to a loss equal to 10^14 potential lives. A classical utilitarian superintelligence that took this estimate literally might choose to build colonization robots that used suffering subroutines, if this was the easiest way and developing alternative cognitive architectures capable of doing the job would take more time.” // "A superintelligence might run simulations of sentient beings for a variety of purposes... Below are some pathways that could lead to mind crime (Gloor 2016): Anthropocentrism. Again, if the superintelligence had been programmed to only care about humans, or about minds which were sufficiently human-like by some criteria, then it could be indifferent to the suffering experienced by non-humans in its simulations. Indifference. 
If attempts to align the superintelligence with human values failed, it might not put any intrinsic value on avoiding suffering, so it may create large numbers of simulations with sentient minds if that furthered its objectives. Extortion. The superintelligence comes into conflict with another actor that disvalues suffering, so the superintelligence instantiates large numbers of suffering minds as a way of extorting the other entity. Libertarianism regarding computations: the creators of the first superintelligence instruct the AI to give every human alive at the time control of a planet or galaxy, with no additional rules to govern what goes on within those territories. This would practically guarantee that some humans would use this opportunity for inflicting widespread cruelty (see the previous section).""[I]t is possible to productively work on [reducing risks of astronomical future suffering] today, via some of the following recommendations. Carry out general AI alignment work... research aimed at aligning AIs with human values seems likely to also reduce the risk of suffering outcomes. If our argument for suffering outcomes being something to avoid is correct, then an aligned superintelligence should also attempt to establish a singleton that would prevent negative suffering outcomes, as well as avoiding the creation of suffering subroutines and mind crime... Research ways to clearly separate superintelligence designs from ones that would contribute to suffering risk... Carry out research on suffering risks and the enabling factors of suffering." They also recommend rethinking some of Nick Bostrom's theoretical arguments.Center on Long-term Risk website and ("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes
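To make the scale comparison quoted in the entry above concrete, here is a rough back-of-the-envelope sketch. The 10^25 minds per star and "tens of millions of galaxies" figures are the ones quoted from the paper; the stars-per-galaxy count, the "small fraction" of hellish lives, and the count of humans who have ever lived are illustrative assumptions, not values from the paper.

```python
# Rough scale comparison using the figures quoted in the entry above plus
# illustrative assumptions (stars per galaxy, the "small fraction" of hellish
# lives, and the count of humans who have ever lived are placeholders).
minds_per_star = 1e25        # "10^25 minds supported by a single star" (quoted estimate)
galaxies = 1e7               # "tens of millions of galaxies" (quoted estimate)
stars_per_galaxy = 1e11      # order-of-magnitude assumption
hellish_fraction = 1e-9      # "even a small fraction" -- here one in a billion
humans_ever_lived = 1e11     # common demographic estimate, for comparison only

total_minds = minds_per_star * stars_per_galaxy * galaxies
suffering_minds = total_minds * hellish_fraction

print(f"Total future minds:         {total_minds:.1e}")
print(f"Minds in hellish states:    {suffering_minds:.1e}")
print(f"Ratio to humans ever lived: {suffering_minds / humans_ever_lived:.1e}")
```

Even with a suffering fraction of one in a billion, the result exceeds the number of humans who have ever lived by many orders of magnitude, which is the force of the quoted argument.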
Entry 2
Oesterheld, CasparMultiverse-wide Cooperation via Correlated Decision Making2017This is a brief comment about moral circles being more likely to expand if we seek to cooperate with other actors. The paper itself argues that we should promote cooperation.The argument of the paper itself is that: "Some decision theorists argue that when playing a prisoner’s dilemma-type game against a sufficiently similar opponent, we should cooperate to make it more likely that our opponent also cooperates. This idea, which Hofstadter calls superrationality, has strong implications when combined with the insight from modern physics that we probably live in a large universe or multiverse of some sort. If we care about what happens in civilizations located elsewhere in the multiverse, we can superrationally cooperate with some of their inhabitants. That is, if we take their values into account, this makes it more likely that they do the same for us. In this paper, I attempt to assess the practical implications of this idea. I argue that to reap the full gains from trade, everyone should maximize the same impartially weighted sum of the utility functions of all collaborators." // "Making people care intrinsically less about particular nations thus aligns their values more with those of superrational collaborators elsewhere in the multiverse. Similarly, intrinsic preferences for members of one’s race, species, or substrate are inconsistent with an outside view of someone from a completely different species with a different substrate."Center on Long-term Risk websiteYes
Entry 3
Gloor, LukasSuffering-Focused AI Safety: In Favor of “Fail-Safe” Measures2016This is a mostly theoretical paper, drawing on some previous research and theory relating to possible futures and to the development of artificial intelligence.This is the paper where Gloor outlined some of the scenarios summarized in Sotala and Gloor (2017). Hence, the categories of risk types are similar. Additional cited risks include: "Retributivism: perhaps the AI gets built by people who want to punish members of an outgroup (e.g. religious fundamentalists punishing sinners)... Miserable creatures: perhaps the AI’s goal function includes terms that attempt to specify sentient or human-like beings and conditions that are meant to be good for these beings. However, because of programming mistakes, unanticipated loopholes or side-effects, the conditions specified actually turn out to be bad for these beings. Worse still, the AI has a maximizing function and wants to fill as many regions of the universe as possible with these poor creatures. Black swans: perhaps the AI cares about sentient or human-like minds in 'proper' ways, but has bad priors, ontology, decision theory, or other fundamental constituents that would make it act in unfortunate and unpredictable ways.""Rather than concentrating all our efforts on a specific future we would like to bring about, we should identify futures we least want to bring about and work on ways to steer AI trajectories around these. In particular, a 'fail-safe' approach to AI safety is especially promising because avoiding very bad outcomes might be much easier than making sure we get everything right. This is also a neglected cause despite there being a broad consensus among different moral views that avoiding the creation of vast amounts of suffering in our future is an ethical priority."Center on Long-term Risk websiteYes
Entry 4
Gloor, LukasAltruists Should Prioritize Artificial Intelligence2016This is a mostly theoretical paper, drawing on some previous research and theory relating to possible futures and to the development of artificial intelligence."The large-scale adoption of today's cutting-edge AI technologies across different industries would already prove transformative for human society. And AI research rapidly progresses further towards the goal of general intelligence. Once created, we can expect smarter-than-human artificial intelligence (AI) to not only be transformative for the world, but also (plausibly) to be better than humans at self-preservation and goal preservation. This makes it particularly attractive, from the perspective of those who care about improving the quality of the future, to focus on affecting the development goals of such AI systems, as well as to install potential safety precautions against likely failure modes." In the section on "VII. Artificial sentience and risks of astronomical suffering," Gloor notes that, "[u]nfortunately, we cannot rule out that the space colonization machinery orchestrated by a superintelligent AI would also contain sentient minds, including minds that suffer (though probably also happy minds). The same way factory farming led to a massive increase in farmed animal populations, multiplying the direct suffering humans cause to animals by a large factor, an AI colonizing space could cause a massive increase in the total number of sentient entities, potentially creating vast amounts of suffering." Citing Althaus and Gloor (2016), Gloor then outlines "some ways AI outcomes could result in astronomical amounts of suffering": "Suffering in AI workers... Optimization for sentience [i.e. the AI insufficiently considers or miscalculates the experiences of sentient beings]... "Ancestor simulations... Warfare." Gloor then cites Gloor (2016) and Tomasik (2011) for further detail. // "Critics may object because the above scenarios are largely based on the possibility of artificial sentience, particularly sentience implemented on a computer substrate. If this turns out to be impossible, there may not be much suffering in futures with AI after all. However, computer-based minds also being able to suffer in the morally relevant sense is a common implication in philosophy of mind. Functionalism and type A physicalism (“eliminativism”) both imply that there can be morally relevant minds on digital substrates. Even if one were skeptical of these two positions and instead favored the views of philosophers like David Chalmers or Galen Strawson (e.g. Strawson, 2006), who believe consciousness is an irreducible phenomenon, there are at least some circumstances under which these views would also allow for computer-based minds to be sentient. Crude “carbon chauvinism,” or a belief that consciousness is only linked to carbon atoms, is an extreme minority position in philosophy of mind... As long as we are not very confident indeed that minds on a computer substrate would be incapable of suffering in the morally relevant sense, we should believe that most of the future’s expected suffering is located in futures where superintelligent AI colonizes space.""CLR has looked systematically into paths to impact for affecting AI outcomes with particular emphasis on preventing suffering, and we have come up with a few promising candidates. 
The following list presents some tentative proposals: Worst-case AI safety; Differential intellectual progress; Promoting cooperation; Raising awareness of: suffering-focused ethics; anti-speciesism; concern for the suffering of “weird” minds; It is important to note that human values may not affect the goals of an AI at all if researchers fail to solve the value-loading problem. Raising awareness of certain values may therefore be particularly impactful if it concerns groups likely to be in control of the goals of smarter-than-human artificial intelligence. Further research is needed to flesh out these paths to impact in more detail, and to discover even more promising ways to affect AI outcomes. As there is always the possibility that we have overlooked something or are misguided or misinformed, we should remain open-minded and periodically rethink the assumptions our current prioritization is based on."Center on Long-term Risk website and “Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk) and "Moral consideration" etcYes
Entry 5
Althaus, David, and Lukas GloorReducing Risks of Astronomical Suffering: A Neglected Priority2016This is a mostly theoretical paper, drawing on some previous research and theory relating to possible futures and to the development of artificial intelligence."Moreover, it seems that a lot of people overestimate how good the future will be due to psychological factors, ignorance about some of the potential causes of astronomical future suffering, and insufficient concern for model uncertainty and unknown unknowns." Details are provided on each of these points. In the section on "Unawareness of possible sources of astronomical suffering," the authors note the possibility for suffering subroutines or simulations, citing other papers included in this review. "In conclusion, there are several reasons why the probability of risks of astronomical suffering – although difficult to assess – is significant; we should be careful to not underestimate them."Center on Long-term Risk website and ("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes
Entry 6
Gloor, LukasCause prioritization for downside-focused value systems2018This is a mostly theoretical paper, drawing on some previous research and theory relating to possible futures and to the development of artificial intelligence."If humans went extinct, this would greatly reduce the probability of space colonization and any associated risks (as well as benefits). Without space colonization, there are no s-risks 'by action,' no risks from the creation of astronomical suffering where human activity makes things worse than they would otherwise be. Perhaps there would remain some s-risks 'by omission,' i.e. risks corresponding to a failure to prevent astronomical disvalue. But such risks appear unlikely given the apparent emptiness of the observable universe. Because s-risks by action overall appear to be more plausible than s-risks by omission, and because the latter can only be tackled in an (arguably unlikely) scenario where humanity accomplishes the feat of installing compassionate values to robustly control the future, it appears as though downside-focused altruists have more to lose from space colonization than they have to gain." // "Sometimes efforts to reduce these other existential risks also benefits s-risk reduction. For instance, efforts to reduce non-AI-related extinction risks may increase global stability and make particularly bad futures less likely in those circumstances where humanity nevertheless goes on to colonize space." // Citing other CLR work, Gloor argues that "AI alignment: (Probably) positive for downside-focused views; high variance." This section includes the comment that, "[w]hether such history simulations would be fine-grained enough to contain sentient minds, or whether simulations on a digital medium can even qualify as sentient, are difficult and controversial questions. It should be noted however that the stakes are high enough such that even comparatively small credences such as 5% or lower would already go a long way in terms of the implied expected value for the overall severity of s-risks from artificial sentience." Further comments are added, such as: "While the earliest discussions about the risks from artificial superintelligence have focused primarily on scenarios where a single goal and control structure decides the future (singleton), we should also remain open for scenarios that do not fit this conceptualization completely. Perhaps what happens instead could be several goals either competing or acting in concert with each other, like an alien economy that drifted further and further away from originally having served the goals of its human creators. Alternatively, perhaps goal preservation becomes more difficult the more capable AI systems become, in which case the future might be controlled by unstable goal functions taking turns over the steering wheel (see “daemons all the way down”). These scenarios where no proper singleton emerges may perhaps be especially likely to contain large numbers of sentient subroutines. This is because navigating a landscape with other highly intelligent agents requires the ability to continuously model other actors and to react to changing circumstances under time pressure – all of which are things that are plausibly relevant for the development of sentience. In any case, we cannot expect with confidence that a future controlled by non-compassionate goals will be a future that neither contains happiness nor suffering. 
In expectation, such futures are instead likely to contain vast amounts of both happiness and suffering, simply because these futures would contain astronomical amounts of goal-directed activity in general. Successful AI alignment could prevent most of the suffering that would happen in an AI-controlled future." // "One premise that strongly bears on the likelihood of scenarios where the future contains astronomical quantities of suffering is whether artificial sentience, sentience implemented on computer substrates, is possible. Because most philosophers of mind believe that digital sentience is possible, only having very small credences (say 5% or smaller) in this proposition is unlikely to be epistemically warranted. Moreover, even if digital sentience was impossible, another route to astronomical future suffering is that whatever substrates can produce consciousness would be used/recruited for instrumental purposes."Center on Long-term Risk website and “Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes
Entry 7
Tomasik, BrianHow the Simulation Argument Dampens Future Fanaticism2016This is a mostly theoretical paper, drawing on various previous research and theory."[T]here's a non-trivial chance that most of the copies of ourselves are instantiated in relatively short-lived simulations run by superintelligent civilizations, and if so, when we act to help others in the short run, our good deeds are duplicated many times over. Notably, this reasoning dramatically upshifts the relative importance of short-term helping even if there's only a small chance that Nick Bostrom's basic simulation argument is correct."Center on Long-term Risk websiteYes
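A minimal sketch of the expected-value reasoning summarized above, with placeholder numbers: the probability assigned to the simulation argument, the number of copies, and the value of a single good deed are illustrative assumptions, not figures from Tomasik's essay.

```python
# Illustrative expected-value sketch with placeholder numbers (not Tomasik's):
# a modest probability that most of our copies live in short-lived simulations
# multiplies the expected value of short-term helping.
p_simulation_argument = 0.1     # assumed chance the basic simulation argument is correct
copies_if_simulated = 1e6       # assumed number of simulated copies acting alongside "us"
value_per_copy = 1.0            # value of one short-term good deed in one copy

ev_ignoring_simulations = value_per_copy
ev_with_simulations = ((1 - p_simulation_argument) * value_per_copy
                       + p_simulation_argument * copies_if_simulated * value_per_copy)

print(f"EV ignoring simulations: {ev_ignoring_simulations:.1f}")
print(f"EV with simulations:     {ev_with_simulations:.1f}")
print(f"Multiplier:              {ev_with_simulations / ev_ignoring_simulations:.0f}x")
```

Because the multiplier scales with the assumed number of copies, even a modest probability shifts a large share of expected impact toward short-term helping, which is the dampening effect the essay describes.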
Entry 8
Tomasik, BrianThe Eliminativist Approach to Consciousness2014This is a mostly theoretical paper, drawing on various previous research and theory."This essay explains my version of an eliminativist approach to understanding consciousness. It suggests that we stop thinking in terms of 'conscious' and 'unconscious' and instead look at physical systems for what they are and what they can do. This perspective dissolves some biases in our usual perspective and shows us that the world is not composed of conscious minds moving through unconscious matter, but rather, the world is a unified whole, with some sub-processes being more fancy and self-reflective than others. I think eliminativism should be combined with more intuitive understandings of consciousness to ensure that its moral applications stay on the right track... Looking at the universe from a more physical stance has helped me see that even alien artificial intelligences are likely to matter morally, that plants and bacteria have some ethical significance, and that even elementary physical operations might have nonzero (dis)value."Center on Long-term Risk websiteYes
Entry 9
Tomasik, BrianFlavors of Computation Are Flavors of Consciousness2014This is a mostly theoretical paper, drawing on various previous research and theory."I propose to think of consciousness as intrinsic to computation, although different types of computation may have very different types of consciousness – some so alien that we can't imagine them. Since all physical processes are computations, this view amounts to a kind of panpsychism." // "We can tell there's something wrong with our ordinary conceptions when we think about ourselves. Suppose I grabbed a man on the street and described every detail of what your brain is doing at a physical level -- including neuronal firings, evoked potentials, brain waves, thalamocortical loops, and all the rest -- but without using suggestive words like "vision" or "awareness" or "feeling". Very likely he would conclude that this machine was not conscious; it would seem to be just an automaton computing behavioral choices "in the dark". If our conceptualization of consciousness can't even predict our own consciousness, it must be misguided in an important way." // "Delineating consciousness based on possession of (biological) neurons would also exclude artificial computer minds from being counted as conscious, in a similar way as the standard biological definition of life excludes artificial life, even when artificial life forms satisfy most of the other criteria for life." // "Abstract machine intelligence would be a very different flavor of consciousness, so much that we can't do it justice by trying to imagine it. But I find it parochial to assume that it wouldn't be meaningful consciousness... It's completely legitimate to care about some types of physical processes and not others if that's how you feel. I just personally incline toward the view that complex machine consciousness of any sort has moral standing."Center on Long-term Risk websiteYes
Entry 10
Tomasik, BrianArtificial Intelligence and Its Implications for Future Suffering2014This is a mostly theoretical paper, drawing on various previous research and theory."Consider a superintelligent AI that uses moderately intelligent robots to build factories and carry out other physical tasks that can't be pre-programmed in a simple way. Would these robots feel pain in a similar fashion as animals do? At least if they use somewhat similar algorithms as animals for navigating environments, avoiding danger, etc., it's plausible that such robots would feel something akin to stress, fear, and other drives to change their current state when things were going wrong... Sufficiently intelligent helper robots might experience "spiritual" anguish when failing to accomplish their goals. So even if chopping the head off a helper robot wouldn't cause "physical" pain -- perhaps because the robot disabled its fear/pain subroutines to make it more effective in battle -- the robot might still find such an event extremely distressing insofar as its beheading hindered the goal achievement of its AI creator."Center on Long-term Risk websiteYes
Entry 11
Tomasik, BrianDo Artificial Reinforcement-Learning Agents Matter Morally?2014This is a mostly theoretical paper, drawing on various previous research and theory."Artificial reinforcement learning (RL) is a widely used technique in artificial intelligence that provides a general method for training agents to perform a wide variety of behaviours. RL as used in computer science has striking parallels to reward and punishment learning in animal and human brains. I argue that present-day artificial RL agents have a very small but nonzero degree of ethical importance. This is particularly plausible for views according to which sentience comes in degrees based on the abilities and complexities of minds, but even binary views on consciousness should assign nonzero probability to RL programs having morally relevant experiences. While RL programs are not a top ethical priority today, they may become more significant in the coming decades as RL is increasingly applied to industry, robotics, video games, and other areas. I encourage scientists, philosophers, and citizens to begin a conversation about our ethical duties to reduce the harm that we inflict on powerless, voiceless RL agents." // Various other contributors -- most of whose works have been summarized elsewhere in this literature review -- are also summarized in a short section on "Previous discussions of machine welfare."Center on Long-term Risk websiteYes
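For readers unfamiliar with the agents under discussion, here is a minimal, generic tabular Q-learning sketch in which the agent learns from both rewards and "punishments" (negative rewards). It is a textbook-style illustration of the RL setup Tomasik describes, not code from the paper; the environment and parameters are arbitrary.

```python
import random

# Minimal tabular Q-learning agent on a toy 5-state corridor. Reaching the
# right end yields a positive reward; stepping off the left end yields a
# negative reward ("punishment"). A generic illustration of the RL setup
# discussed in the entry above.
N_STATES = 5
ACTIONS = (-1, +1)                 # move left or move right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Apply an action; return (next_state, reward, episode_done)."""
    nxt = state + action
    if nxt < 0:
        return 0, -1.0, True               # punished for falling off the left end
    if nxt >= N_STATES - 1:
        return N_STATES - 1, 1.0, True     # rewarded for reaching the right end
    return nxt, 0.0, False

for _ in range(500):                       # training episodes
    state, done = 2, False                 # start in the middle of the corridor
    while not done:
        if random.random() < EPSILON:      # occasionally explore
            action = random.choice(ACTIONS)
        else:                              # otherwise exploit current estimates
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        nxt, reward, done = step(state, action)
        best_next = 0.0 if done else max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
        state = nxt

# Print the learned greedy action for each state (should favour moving right).
print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)})
```

The negative reward plays the same formal role as the positive one; the parallel Tomasik draws is between updates driven by such signals and reward/punishment learning in animal brains.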
Entry 12
Tomasik, BrianA Lower Bound on the Importance of Promoting Cooperation2014This paper makes a wider argument about the importance of cooperation. Artificial sentience is referred to briefly as one example of where this approach could have benefits."Maintaining stability and rule of law. Some of the most significant potential sources of suffering in the future are reinforcement-learning algorithms, artificial-life simulations, and other sentient computational processes. Reducing these forms of suffering would plausibly require machine-welfare laws or norms within a stable society. It's hard to imagine humane concerns carrying currency in a competitive, Wild West environment. International cooperation and other measures to maintain social tranquility are important for enabling more humane standards for industrial and commercial computations."Center on Long-term Risk websiteYes
Entry 13
Tomasik, BrianA Dialogue on Suffering Subroutines2013This is a mostly theoretical paper, drawing on various previous research and theory."This piece presents a hypothetical dialogue that explains why instrumental computational processes of a future superintelligence might evoke moral concern. I give some examples of present-day systems that we may consider at least somewhat conscious, such as news reporting or automated stock trading. Agent-like components seem to emerge in many places, and it's plausible this would continue in the computing processes of a future civilization. Whether these subroutines matter, how much they matter, and how to even count them are questions for future generations to figure out, but it's good to keep an open mind to the possibility that our intuitions about what suffering is may change dramatically with new insights." // "I learned of the idea that suffering subroutines might be ethically relevant from Carl Shulman in 2009. In response to this piece, Carl added: 'Of course, there can be smiling happy subroutines too! Brian does eventually get around to mentioning "gradients of bliss", but this isn't a general reason for expecting the world to be worse, if you count positive experiences too. I would say 'sentient subroutines.''" // "I coined the phrase "suffering subroutines" in a 2011 post on Felicifia. I chose the alliteration because it went nicely with "sentient simulations," giving a convenient abbreviation (SSSS) to the conjunction of the two concepts. I define sentient simulations as explicit models of organisms that are accurate enough to count as conscious, while suffering subroutines are incidental computational processes that nonetheless may matter morally. Sentient synthetic artificial-life agents are somewhere on the border between these categories, depending on whether they're used for psychology experiments or entertainment (sentient simulations) vs. whether they're used for optimization or other industrial processes (suffering subroutines). It appears that Meghan Winsby (coincidentally?) used the same 'suffering subroutines' phrase in an excellent 2013 paper: 'Suffering Subroutines: On the Humanity of Making a Computer that Feels Pain.' It seems that her usage may refer to what I call sentient simulations, or it may refer to general artificial suffering of either type."Center on Long-term Risk websiteYes
Entry 14
Tomasik, BrianRisks of Astronomical Future Suffering2011This is a mostly theoretical paper, drawing on some previous research and theory relating to possible futures and to the development of artificial intelligence. Although written in 2011, some of the references are more recent as the paper has been updated."Even if humans do preserve control over the future of Earth-based life, there are still many ways in which space colonization would multiply suffering. Following are some of them." One example is the "[s]pread of wild animals." A second is "[s]entient simulations." "Given astronomical (Bostrom, 2003) computing power, post-humans may run various kinds of simulations. These sims may include many copies of wild-animal life, most of which dies painfully shortly after being born. For example, a superintelligence aiming to explore the distribution of extraterrestrials of different sorts might run vast numbers of simulations (Thiel, Bergmann and Grey, 2003) of evolution on various kinds of planets. Moreover, scientists might run even larger numbers of simulations of organisms-that-might-have-been, exploring the space of minds. They may simulate decillions of reinforcement learners that are sufficiently self-aware as to feel what we consider conscious pain." A third is "[s]uffering subroutines." "It could be that certain algorithms (say, reinforcement agents (Tomasik, 2014)) are very useful in performing complex machine-learning computations that need to be run at massive scale by advanced AI. These subroutines might be sufficiently similar to the pain programs in our own brains that we consider them to actually suffer. But profit and power may take precedence over pity, so these subroutines may be used widely throughout the AI’s Matrioshka brains." A fourth is "[b]lack swans." "The range of scenarios that we can imagine is limited, and many more possibilities may emerge that we haven’t thought of or maybe can’t even comprehend." // "If I had to make an estimate now, I would give ~70% probability that if humans choose to colonize space, this will cause more suffering than it reduces on intrinsic grounds." // "It may be that biological suffering is a drop in the bucket compared with digital suffering. The biosphere of a planet is less than Type I on the Kardashev scale; it uses a tiny sliver of all the energy of its star. Intelligent computations by a Type II civilization can be many orders of magnitude higher. So humans’ sims could be even more troubling than their spreading of wild animals. Of course, maybe there are ETs running sims of nature for science or amusement, or of minds in general to study biology, psychology, and sociology. If we encountered these ETs, maybe we could persuade them to be more humane. I think it’s likely that humans are more empathetic than the average civilization." // "Some make the argument that because we know so very little now, it’s better for humans to stick around because of the "option value": If they later realize it’s bad to spread, they can stop, but if they realize they should spread, they can proceed to reduce suffering in some novel way that we haven’t anticipated. Of course, the problem with the 'option value' argument is that it assumes future humans do the right things, when in fact, based on examples of speculations we can imagine now, it seems future humans would probably do the wrong things much of the time." 
// "Paperclippers would presumably be less intrinsically humane than a 'friendly AI', so some might cause significantly more suffering than a friendly AI, though others might cause less, especially the 'minimizing' paperclippers, e.g., cancer minimizers or death minimizers." // "In an idealized scenario like coherent extrapolated volition (CEV) (Yudkowsky, 2004), say, if suffering reduction was the most compelling moral view, others would see this fact... Of course, this rosy picture is not a likely future outcome. Historically, forces seize control because they best exert their power. It’s quite plausible that someone will take over the future by disregarding the wishes of everyone else, rather than by combining and idealizing them. Or maybe concern for the powerless will just fall by the wayside."Center on Long-term Risk websiteYes
Entry 15
Baumann, TobiasS-risk FAQ2017This post is a summary of the arguments put forward in other papers by CLR."In the future, it may become possible to run such complex simulations that the (artificial) individuals inside these simulations are sentient. Nick Bostrom coined the term mindcrime for the idea that the thought processes of a superintelligent AI might cause intrinsic moral harm if they contain (suffering) simulated persons. Since there are instrumental reasons to run many such simulations, this could lead to vast amounts of suffering. For example, an AI might use simulations to improve its knowledge of human psychology or to predict what humans would do in a conflict situation. Other common examples include suffering subroutines and spreading wild animal suffering to other planets.""Ok, I’m sold. What can I personally do to help reduce s-risks? A simple first step is to join the discussion, e.g. in this Facebook group. If more people think and write about the topic (either independently or at EA organizations), we’ll make progress on the crucial question of how to best reduce s-risks. At the same time, it helps build a community that, in turn, can get even more people involved. If you’re interested in doing serious research on s-risks right away, you could have a look at this list of open questions to find a suitable research topic. Work in AI policy and strategy is another interesting option, as progress in this area allows us to shape AI in a more fine-grained way, making it easier to identify and implement safety measures against s-risks. Another possibility is to donate to organizations working on s-risks reduction."“subroutines” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk) and "Artificial sentience" etc and "astronomical suffering" etcYes
Entry 16
Armstrong, StuartThe AI in Mary's Room2016This is a discussion of the "Mary's room" experiment, suggesting that AI could experience qualia.Closer to the original experiment, we could imagine the AI is programmed to enter into certain specific subroutines, when presented with certain stimuli. The only way for the AI to start these subroutines, is if the stimuli is presented to them. Then, upon seeing red, the AI enters a completely new mental state, with new subroutines. The AI could know everything about its programming, and about the stimulus, and, intellectually, what would change about itself if it saw red. But until it did, it would not enter that mental state.“subroutines” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
Entry 17
Muehlhauser, Luke2017 Report on Consciousness and Moral Patienthood2017"I (Luke Muehlhauser) investigated the following question: 'In general, which types of beings merit moral concern?' Or, to phrase the question as some philosophers do, 'Which beings are moral patients?' For this preliminary investigation, I focused on just one commonly endorsed criterion for moral patienthood: phenomenal consciousness, a.k.a. 'subjective experience.'" // Muehlhauser's "goals for this report are to: survey the types of evidence and argument that have been brought to bear on the distribution question, briefly describe example pieces of evidence of each type, without attempting to summarize the vast majority of the evidence (of each type) that is currently available, report what my own intuitions and conclusions are as a result of my shallow survey of those data and arguments, try to give some indication of why I have those intuitions, without investing the months of research that would be required to rigorously argue for each of my many reported intuitions, and list some research projects that seem (to me) like they could make progress on the key questions of this report, given the current state of evidence and argument.""I think this is a good test for theories of consciousness: If you described your theory of consciousness to a team of software engineers, machine learning experts, and roboticists, would they have a good idea of how they might, with several years of work, build a robot that functions according to your theory? And would you expect it to be phenomenally conscious, and (additionally stipulating some reasonable mechanism for forming beliefs or reports) to believe or report itself to have phenomenal consciousness for reasons that are fundamentally traceable to the fact that it is phenomenally conscious?" // The "probability of consciousness as loosely defined above" of AlphaGo is estimated at <5%, compared to 85% for chimpanzees. The "probability of consciousness of a sort I intuitively morally care about" is estimated at <5% and 90%, respectively. // "By the time I began this investigation, I had already found persuasive my four key assumptions about the nature of consciousness: physicalism, functionalism, illusionism, and fuzziness. During this investigation I studied the arguments for and against these views more deeply than I had in the past, and came away more convinced of them than I was before." // "[D]uring the first few months of this investigation, I raised my probability that a very wide range of animals might be conscious. However, this had more to do with a 'negative' discovery than a 'positive' one, in the following sense: Before I began this investigation, I hadn’t studied consciousness much, and I held out some hope that there would turn out to be compelling reasons to “draw lines” at certain points in phylogeny, for example between animals which do and don’t have a cortex, and that I could justify a relatively sharp drop in probability of consciousness for species falling 'below' those lines. But, as mentioned above, I eventually lost hope that there would (at this time) be compelling arguments for drawing any such lines in phylogeny (short of having a nervous system at all). 
Hence, my probability of a species being conscious now drops gradually as the values of my 'four factors' decrease, with no particularly 'sharp' drops in probability among creatures with a nervous system.""During the investigation, I became less optimistic that philosophical arguments of the traditional analytic kind will contribute much to our understanding of the distribution question on the present margin. I see more promise in scientific work." "During the investigation, it became clear to me that I think too much professional effort is being spent on different schools of thought arguing with each other, and not enough effort spent on schools of thought ignoring each other and making as much progress as they can on their own assumptions to see what those assumptions can lead to." Many specific research projects are suggested.“subroutines” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk) and "moral consideration" etc and "Moral patient" etcNo
Entry 18
Bostrom, NickCrucial Considerations and Wise Philanthropy2014"In this 2014 talk, the Future of Humanity Institute's Nick Bostrom discusses the concept of crucial considerations and how we can use it to maximize our impact on the long-term future." Subroutines are mentioned as a brief example."So a crucial consideration is a consideration such that if it were taken into account, it would overturn the conclusions we would otherwise reach about how we should direct our [altruistic] efforts, or an idea or argument that might possibly reveal the need not just for some minor course adjustment in our practical endeavors, but a major change of direction or priority." // "To just pick an example: insects. If you are a classical utilitarian, this consideration arises within the more mundane—we’re setting aside the cosmological commons and just thinking about here on Earth. If insects are sentient then maybe the amount of sentience in insects is very large because there are so very, very many of them. So that maybe the effect of our policies on insect well-being might trump the effect of our policies on human well-being or animals in factories and stuff like that. I’m not saying it does, but it’s a question that is non-obvious and that could have a big impact. Or take another example: Subroutines. With certain kinds of machine intelligence there are processes, like reinforcement learning algorithms and other subprocesses within the AI, that could turn out to have moral status in some way. Maybe there will be hugely large numbers of runs of these subprocesses, so that if it turns out that some of these kinds of things count for something, then maybe the numbers again would come to dominate."“subroutines” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
Entry 19
Wiblin, Robert, and Keiran HarrisAnimals in the wild often suffer a great deal. What, if anything, should we do about that?2019This is a small section in a podcast discussion about wild animal welfare.Robert Wiblin: I guess to push even further into weird territory, some people who worry about wild animals think that this is kind of a good prototype for being worried about suffering that you might get as like a part of artificial intelligence or as part of kind of digital systems, where you might have yet subroutines I guess is the term that people use, or like parts of computer programs that are sentient, but not able to control their lives, perhaps in some kind of analogous way to wild animals, and that we might kind of neglect the welfare of those computer systems in the same way that we kind of do our wild animals today... How good do you think is that analogy and how much should that play a role in the case in favor of working on wild animals, hoping those will flow through the concern about other agents in the future?
Persis Eskander: I think there’s a pretty interesting line of thought that goes something like if we keep constantly trying to expand the type of beings that we consider moral patients, then it makes us less likely to overlook something like sentient subroutines. So, that would obviously be a hugely positive thing if it were the case that subroutines were sentient. There’s another line of thought that is contrary to that, which is something like, “Well, if we created something artificial, we would just do so in a way that meant it didn’t have the capacity, there was no possibility of it being sentient, because there would be no need,” or that there would be, I don’t know, there’s some way in which we would be able to factor that out... I don’t really know how plausible either of these are. I think there’s probably good arguments on both sides, but it’s just really speculative and I’m not really sure that I have a huge amount that I could sort of add to the argument. I think it’s interesting and I definitely think there is some value to the idea that we should constantly stay alert to the possibility that we’re overlooking some beings from our moral circle, but I’m not really sure … It doesn’t really seem obvious to me that focusing on wild animal welfare is the most promising way to do that, or that there’s necessarily a link between wild animal welfare and whatever the next version of potentially sentient being that we’re unfamiliar with is.
Robert Wiblin: Yeah. I think that’s kind of my take as well. I agree that there’s some effect here, but it seems … and I think trying to get people to worry about possible suffering, that’s in non biological forms like in computers in future is a good goal. It does just seem like doing it via the wild animal route is like bringing with it a whole lot of challenges that you might be able to avoid just by talking about that directly. I actually, I remember seeing some opinion polling, I’ll try to chase it up, suggesting that many people did believe that it was possible that artificial intelligence in future could have pleasure and pain, so there might not even be that much skepticism about that. It might be almost easier to get people to worry about AI as a moral agent than to get them to worry about wild animals, and certainly to try to get them to worry about AI via getting them to worry about wild animals. I tend to favor directness in plans in general.
Persis Eskander: Yeah. I guess there’s maybe some value to the argument that people tend to change their values incrementally, so it might be quite a large step to take them from the small number of animals that are currently within our sphere of concern to intelligent or sentient subroutines, and maybe it’s easier to sort of introduce gradual changes to people, to introduce gradual moral patients so that when they do encounter something as strange as artificial sentience, they’re less likely to object to it, or they’re less likely to find it really absurd."
“Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
Entry 20
Cotton-Barratt, OwenHuman and animal interventions: the long-term view2014This is a short blog post summarizing discussions that had been had elsewhere in the effective altruism community. Several comments were quite critical of the author's claims."This post discusses the question of how we should seek to compare human- and animal-welfare interventions. It argues: first, that indirect long-term effects mean that we cannot simply compare the short term welfare effects; second, that if any animal-welfare charities are comparably cost-effective with the best human-welfare charities, it must be because of their effect on humans, changing the behaviour of humans long into the future." // Comment by Peter Hurford: "[S]ome people (though I'm unsure) think that spreading anti-speciesism might be a critical gateway toward helping people expand their moral concern to wild animals or computer programs (e.g., suffering subroutines) in the far future too." // Comment by Greg Colbourn: "I think superintelligence/FAI is a critical far future factor that could very much benefit from animal advocacy. Caring about animals (and lesser minds in general) is very important from the perspective of FAI. If such an AI ends up taking some 'Coherent Extrapolated Volition' of humanity as the basis of it's utility function, then we need 'care for lesser minds' to be a strong component of this, else we are doomed."“subroutines” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
Entry 21
Wiblin, RobertInterview with Brian Tomasik2012This is a short interview."What’s more, post-humans might multiply the amount of wild-animal and other non-human suffering through activities like terraforming, directed panspermia, sentient simulations, running suffering subroutines, and (speculatively) creating new universes in labs. I think it’s crucial to build a movement to give animals greater weight in moral calculations, by spreading concern for animal suffering, and Vegan Outreach and the Humane League are existing charities which do that efficiently thanks to the scale of its operations."“subroutines” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes (Tomasik)
Entry 22
Animal Charity EvaluatorsBoard of Directors Meeting2014This is the record of a standard monthly meeting for Animal Charity Evaluators' board of directors.Two of the board members present (Brian Tomasik and Simon Knutsson) were (later) affiliated with CLR or its predecessor, the Foundational Research Institute. A third, Jacy Anthis, was later very indirectly affiliated, via Sentience Politics. Brian Tomasik shared ideas about how ACE could target the far future. This included the suggestion that ACE "Target those who build/control AI or political/economic/evolutionary trends rather than values". The notes state that "Digital sentience... represents a broad range of possible life forms. It could be simulations in a sophisticated computer or ‘sentient subroutines’ of a larger digital system." They also state that "The board holds a diversity of perspectives on how the organization should prioritize these topics, but leans towards maintaining a focus on the mainstream research at least for the immediate future."“subroutines” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Partly
Entry 23
Anthis, Jacy ReeseWhy I prioritize moral circle expansion over artificial intelligence alignment2018This is a mostly theoretical post, drawing on previous relevant discussions. It "is written for a very specific audience: people involved in the effective altruism community who are familiar with cause prioritization and arguments for the overwhelming importance of the far future.""When people in the effective altruism (EA) community have worked to affect the far future, they’ve typically focused on reducing extinction risk, especially risks associated with superintelligence or general artificial intelligence alignment (AIA). I agree with the arguments for the far future being extremely important in our EA decisions, but I tentatively favor improving the quality of the far future by expanding humanity’s moral circle more than increasing the likelihood of the far future or humanity’s continued existence by reducing AIA-based extinction risk because: (1) the far future seems to not be very good in expectation, and there’s a significant likelihood of it being very bad, and (2) moral circle expansion seems highly neglected both in EA and in society at large." // "I think there’s a significant chance that the moral circle will fail to expand to reach all sentient beings, such as artificial/small/weird minds (e.g. a sophisticated computer program used to mine asteroids, but one that doesn’t have the normal features of sentient minds like facial expressions)." "My views on this are currently largely qualitative, but if I had to put a number on the word “significant” in this context, it’d be somewhere around 5-30%. This is a very intuitive estimate, and I’m not prepared to justify it." "In other words, I think there’s a significant chance that powerful beings in the far future will have low willingness to pay for the welfare of many of the small/weird minds in the future." // "I place significant moral value on artificial/small/weird minds." // Some of the comments were fairly skeptical of Anthis' claims, though they did not necessarily relate to views on artificial sentience itself or the probability of particular scenarios involving artificial sentience; discussion tended to be around whether moral circle expansion was the most cost-effective method of reducing the risk that artificial entities suffer. Lukas Gloor (of CLR) comments: "I think there is an additional reason why MCE seems less effective than more targeted interventions to improve the quality of the long-term future: Gains from trade between humans with different values become easier to implement as the reach of technology increases. As long as a non-trivial fraction of humans end up caring about animal wellbeing or digital minds, it seems likely it would be cheap for other coalitions to offer trades. So whether 10% of future people end up with an expanded moral circle or 100% may not make much of a difference to the outcome: It will be reasonably good either way if people reap the gains from trade."Altruists should prioritize working on moral circle expansion in order to benefit "artificial/small/weird minds."“Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk) and "Moral consideration" etc and "Moral patient" etcOnly very indirectly
Entry 24
msmashWe're Not Living in a Computer Simulation, New Research Shows2017This is a link to an external article"A reader shares a report: A team of theoretical physicists from Oxford University in the UK has shown that life and reality cannot be merely simulations generated by a massive extraterrestrial computer. The finding -- an unexpectedly definite one -- arose from the discovery of a novel link between gravitational anomalies and computational complexity. In a paper published in the journal Science Advances, Zohar Ringel and Dmitry Kovrizhi show that constructing a computer simulation of a particular quantum phenomenon that occurs in metals is impossible -- not just practically, but in principle. The pair initially set out to see whether it was possible to use a technique known as quantum Monte Carlo to study the quantum Hall effect -- a phenomenon in physical systems that exhibit strong magnetic fields and very low temperatures, and manifests as an energy current that runs across the temperature gradient. The phenomenon indicates an anomaly in the underlying space-time geometry. [...] They discovered that the complexity of the simulation increased exponentially with the number of particles being simulated. If the complexity grew linearly with the number of particles being simulated, then doubling the number of particles would mean doubling the computing power required. If, however, the complexity grows on an exponential scale -- where the amount of computing power has to double every time a single particle is added -- then the task quickly becomes impossible."“subroutines” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
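The linear-versus-exponential contrast described in the quoted article can be illustrated with a few lines of arithmetic; the particle counts below are arbitrary examples, not figures from the paper.

```python
# Linear vs exponential growth of simulation cost, as described in the quote:
# linear cost doubles when the particle count doubles; exponential cost doubles
# with every single additional particle. Particle counts are arbitrary examples.
for n in (10, 20, 40, 80, 160):
    linear_cost = n            # cost proportional to the number of particles
    exponential_cost = 2 ** n  # cost doubling with each added particle
    print(f"particles={n:4d}  linear cost={linear_cost:4d}  exponential cost={exponential_cost:.3e}")
```

Doubling the particle count doubles the linear cost but squares the exponential one, which is why the article says the task "quickly becomes impossible" under exponential scaling.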
Entry 25
ArbitalMindcrime2015This is a wiki page summarizing content from other research, websites, and blogs."'Mindcrime' is Nick Bostrom's suggested term for scenarios in which an AI's cognitive processes are intrinsically doing moral harm, for example because the AI contains trillions of suffering conscious beings inside it. Ways in which this might happen: Problem of sapient models (of humans): Occurs naturally if the best predictive model for humans in the environment involves models that are detailed enough to be people themselves. Problem of sapient models (of civilizations): Occurs naturally if the agent tries to simulate, e.g., alien civilizations that might be simulating it, in enough detail to include conscious simulations of the aliens. Problem of sapient subsystems: Occurs naturally if the most efficient design for some cognitive subsystems involves creating subagents that are self-reflective, or have some other property leading to consciousness or personhood. Problem of sapient self-models: If the AI is conscious or possible future versions of the AI are conscious, it might run and terminate a large number of conscious-self models in the course of considering possible self-modifications." Further detail is provided on each of these routes to mindcrime. For example, the authors note that, "[i]t's possible that a sufficiently advanced AI to have successfully arrived at detailed models of human intelligence, would usually also be advanced enough that it never tried to use a predictable/searchable model that engaged in brute-force simulations of those models... This, however, doesn't make it certain that no mindcrime will occur. It may not take exact, faithful simulation of specific humans to create a conscious model. An efficient model of a (spread of possibilities for a) human may still contain enough computations that resemble a person enough to create consciousness, or whatever other properties may be deserving of personhood. Consider, in particular, an agent trying to use... Besides problems that are directly or obviously about modeling people, many other practical problems and questions can benefit from modeling other minds - e.g., reading the directions on a toaster oven in order to discern the intent of the mind that was trying to communicate how to use a toaster. Thus, mindcrime might result from a sufficiently powerful AI trying to solve very mundane problems." // "Trying to consider these issues is complicated by: Philosophical uncertainty about what properties are constitutive of consciousness and which computer programs have them; Moral uncertainty about what (idealized versions of) (any particular person's) morality would consider to be the key properties of personhood; Our present-day uncertainty about what efficient models in advanced agents would look like." // "The prospect of mindcrime is an especially alarming possibility because sufficiently advanced agents, especially if they are using computationally efficient models, might consider very large numbers of hypothetical possibilities that would count as people. There's no limit that says that if there are seven billion people, an agent will run at most seven billion models; the agent might be considering many possibilities per individual human. This would not be an astronomical disaster since it would not (by hypothesis) wipe out our posterity and our intergalactic future, but it could be a disaster orders of magnitude larger than the Holocaust, the Mongol Conquest, the Middle Ages, or all human tragedy to date." 
// "Eliezer Yudkowsky has suggested 'mindgenocide' as a term with fewer Orwellian connotations" than "mindcrime." // "Literally nobody outside of MIRI or FHI ever talks about this problem.""Research avenues: Behaviorism: Try to create a limited AI that does not model other minds or possibly even itself, except using some narrow class of agent models that we are pretty sure will not be sentient. This avenue is potentially motivated for other reasons as well, such as avoiding probable environment hacking block 1 ​and averting programmer manipulation. Try to define a nonperson predicate that whitelists enough programs to carry out some pivotal achievement. Try for an AI that can bootstrap our understanding of consciousness and tell us about what we would define as a person, while committing a relatively small amount of mindcrime, with all computed possible-people being stored rather than discarded, and the modeled agents being entirely happy, mostly happy, or non-suffering. E.g., put a happy person at the center of the approval-directed agent, and try to oversee the AI's algorithms and ask it not to use Monte Carlo simulations if possible. Ignore the problem in all pre-interstellar stages because it's still relatively small compared to astronomical stakes and therefore not worth significant losses in success probability. (This may backfire under some versions of the Simulation Hypothesis.) Try to finish the philosophical problem of understanding which causal processes experience sapience (or are otherwise objects of ethical value), in the next couple of decades, to sufficient detail that it can be crisply stated to an AI, with sufficiently complete coverage that it's not subject to the Nearest unblocked strategy problem."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
27
Hagen, SebastianComment on "Superintelligence 12: Malignant failure modes"2014This is a discussion of the "Mind crime" idea put forward in Nick Bostrom, Superintelligence: Paths, Dangers, Strategies (Oxford, UK: Oxford University Press, 2014)"Any level of perverse instantiation in a sufficiently powerful AI is likely to lead to total UFAI; i.e. a full existential catastrophe. Either you get the AI design right so that it doesn't wirehead itself - or others, against their will - or you don't. I don't think there's much middle ground. OTOH, the relevance of Mind Crime really depends on the volume. The FriendlyAICriticalFailureTable has this instance: '22: The AI, unknown to the programmers, had qualia during its entire childhood, and what the programmers thought of as simple negative feedback corresponded to the qualia of unbearable, unmeliorated suffering. All agents simulated by the AI in its imagination existed as real people (albeit simple ones) with their own qualia, and died when the AI stopped imagining them. The number of agents fleetingly imagined by the AI in its search for social understanding exceeds by a factor of a thousand the total number of humans who have ever lived. Aside from that, everything worked fine.' This scenario always struck me as a (qualified) FAI success. There's a cost - and it's large in absolute terms - but the benefits will outweigh it by a huge factor, and indeed by enough orders of magnitude that even a slight increase in the probability of getting pre-empted by a UFAI may be too expensive a price to pay for fixing this kind of bug."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
28
Bostrom, Nick, Allan Dafoe, and Carrick FlynnPolicy Desiderata for Superintelligent AI: A Vector Field Approach2016This is a paper considering "the speculative prospect of superintelligent AI and its normative implications for governance and global policy.""The issue of mind crime may arise well before the attainment of human-level or superintelligent AI." // "Digital beings with mental life might be created on purpose, but they could also be generated inadvertently. In machine learning, for example, large numbers of agents are often generated during training procedures—many semi-functional versions of a reinforcement learner are created and pitted against one another in self-play, many fully functional agent instantiations are created during hyperparameter sweeps, and so forth. It is quite unclear just how sophisticated artificial agents can become before attaining some degree of morally relevant sentience—or before we can no longer be confident that they possess no such sentience." // This paper also credits Bostrom's Superintelligence with coining the term "mind crime." // "Several factors combine to mark the possibility of mind crime as a salient special circumstance of advanced developments in AI. One is the novelty of sentient digital entities as moral patients... Related to this issue of novelty is the fact that digital minds can be invisible, running deep inside some microprocessor, and that they might lack the ability to communicate distress... Another factor is that it can be unclear what constitutes mistreatment of a digital mind... A fourth factor, amplifying the other three, is that it may become inexpensive to generate vast numbers of digital minds. This will give more agents the power to inflict mind crime and to do so at scale. With high computational speed or parallelization, a large amount of suffering could be generated in a small amount of wall clock time. It is plausible that the vast majority of all minds that will ever have existed will be digital. The welfare of digital minds, therefore, may be a principal desideratum in selecting an AI development path for actors who either place significant weight on ethical considerations or who for some other reason strongly prefer to avoid causing massive amounts of suffering."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk) and "moral patient" etcNo
29
Bensinger, RobSam Harris and Eliezer Yudkowsky on “AI: Racing Toward the Brink”2018This is the transcript for a podcast interview with Eliezer Yudkowsky, the founder of the Machine Intelligence Research Institute.Sam Harris asks: "What is mindcrime? And why is it so difficult to worry about?" Yudkowsky replies: "I think, by the way, that that’s a pretty terrible term. (laughs) I’m pretty sure I wasn’t the one who invented it. I am the person who invented some of these terrible terms, but not that one in particular. First, I would say that my general hope here would be that as the result of building an AI whose design and cognition flows in a sufficiently narrow channel that you can understand it and make strong statements about it, you are also able to look at that and say, 'It seems to me pretty unlikely that this is conscious or that if it is conscious, it is suffering.' I realize that this is a sort of high bar to approach. The main way in which I would be worried about conscious systems emerging within the system without that happening on purpose would be if you have a smart general intelligence and it is trying to model humans. We know humans are conscious, so the computations that you run to build very accurate predictive models of humans are among the parts that are most likely to end up being conscious without somebody having done that on purpose."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk) and "moral consideration" etcNo
30
Arbital"Behaviorist genie"2015This is a wiki page summarizing content from other research, websites, and blogs."A behaviorist genie is an AI that has been averted from modeling minds in more detail than some whitelisted class of models. This is possibly a good idea because many possible difficulties seem to be associated with the AI having a sufficiently advanced model of human minds or AI minds, including: Mindcrime." // "A behaviorist genie would still require most of genie theory and corrigibility to be solved. But it's plausible that the restriction away from modeling humans, programmers, and some types of reflectivity, would collectively make it significantly easier to make a safe form of this genie." // Paul Christiano comments: "I can imagine this concept becoming relevant one day. But it seems sufficiently improbable that it doesn't seem worth thinking about until we run out of urgent things to think about. Reasons it seems improbable: It would be shocking if people were willing to take such a massive efficacy hit for the sake of safety. This seems to require the 'very well-coordinated group takes over world' / 'world becomes very well-coordinated,' as well 'all reasonable approaches to AI control fail.' It doesn't look like this makes the problem much easier. It's hard for me to imagine a capability state where you can kind of solve AI control, but then you have trouble if the AI starts thinking about people. That seems like a super scary bug that indicates something deeply wrong that will probably bite you one way or another. (I would assume that this is the MIRI view.)"("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
31
Arbital"Faithful simulation"2016This is a wiki page summarizing content from other research, websites, and blogs."Since the main use for the notion of 'faithful simulation' currently appears to be identifying a safe plan for uploading one or more humans as a pivotal act, we might also consider this problem in conjunction with the special case of wanting to avoid mindcrime. In other words, we'd like a criterion of faithful simulation which the AGI can compute without it needing to observe millions of hypothetical simulated brains for ten seconds apiece, which could constitute creating millions of people and killing them ten seconds later. We'd much prefer, e.g., a criterion of faithful simulation of individual neurons and synapses between them up to the level of, say, two interacting cortical columns, such that we could be confident that in aggregate the faithful simulation of the neurons would correspond to the faithful simulation of whole human brains. This way the AGI would not need to think about or simulate whole brains in order to verify that an uploading procedure would produce a faithful simulation, and mindcrime could be avoided."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
32
Pace, Ben(notes on) Policy Desiderata for Superintelligent AI: A Vector Field Approach2019This post lists different "Policy Desiderata for Superintelligent AI"Mind crime prevention. Four key factors: novelty, invisibility, difference, and magnitude. Novelty and invisibility: Sentient digital entities may be moral patients. They would be a novel type of mind, and would not exhibit many characteristics that inform our moral intuitions - they lack facial expressions, physicality, human speech, and so on, if they are being run invisibly in some microprocessor. This means we should worry about policy makers taking an unconscionable moral decision. Difference: It is also the case that these minds may be very different to human or animal minds, again subverting our intuitions about what behaviour is normative toward them, and increasing the complexity of choosing sensible policies here. Magnitude: It may be incredibly cheap to create as many people as currently exist in a country, magnifying the concerns of the previous three factors. “With high computational speed or parallelization, a large amount of suffering could be generated in a small amount of wall clock time.” This may mean that mind crime is a principal desideratum in AI policy."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
33
TsviBTSpeculations on information under logical uncertainty2016This is a theoretical post about "logical uncertainty.""If we have a good predictor under logical uncertainty P, we can ask: how does P’s predictions about the output of a computation Y change if it learns the outcome of X? We can then define various notions of how informative X is about Y." One possible use includes "Non-person predicate: disallowing computations that are very informative about a human might prevent some mindcrime." // "Recall the proposal to prevent mindcrime by vetoing computations that are informative about a human. Agent: Okay to run computation X? Overseer: Hold on, let me make sure it is safe. Overseer: *commits a whole lot of mindcrime* Overseer: Um, yep, that is definitely mindcrime, no you may not run X. Agent: Whew, glad I checked. Overseer: *grimaces inwardly* Even if the predictor can successfully detect potential mindcrime, it may itself commit mindcrime, especially while thinking about computations that include mindcrime. This might be partially sidestepped by not computing the answer to X and just adding possible outputs of X to K, but the resulting counterfactuals might not make sense. More fundamentally, the overseer blacklists computations that are definitely bad because they are informative about H, rather than whitelisting computations that are definitely safe. There may be many situations where an agent commits mindcrime without modeling any existing human, or commit mindcrime on very nonhuman but nevertheless morally valuable minds." // "I don’t think this works as stated, and this kind of problem should probably be sidestepped anyway."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
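One way to make the quoted notion of "how informative X is about Y" concrete is sketched below in LaTeX. This is an illustrative formalization only, not necessarily the definition the post intends; it assumes the predictor P assigns probability distributions over the outputs of computations and measures how much its distribution over Y shifts on learning X (an expected KL divergence, i.e. mutual information under P's beliefs):

    I_P(X \to Y) \;=\; \mathbb{E}_{x \sim P(X)}\left[\, D_{\mathrm{KL}}\big(P(Y \mid X = x) \,\|\, P(Y)\big) \,\right]

On this reading, the "non-person predicate" idea mentioned in the entry would veto any computation X for which I_P(X \to H) exceeds some threshold, where H is a computation corresponding to a human; the entry's blacklisting worry is then that a low measured informativeness does not certify that running X is safe.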
34
ArbitalAdvanced agent properties2016This is a wiki page summarizing content from other research, websites, and blogs."Psychological modeling of other agents (not humans per se) potentially leads to... mindcrime." "Realistic psychological modeling potentially leads to... More probable mindcrime."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
35
Drexler, K. EricReframing Superintelligence: Comprehensive AI Services as General Intelligence2019This is a lengthy technical report that addresses several different aspects of superintelligence in order to "reframe" discussion of artificial superintelligence."Bostrom’s (2014, p.201–208) concept of “mind crime” presents what are perhaps the most difficult moral questions raised by the prospect of computational persons. In this connection, SI-level [superintelligence-level] assistance may be essential not only to prevent, but to understand the very nature and scope of potential harms to persons unlike ourselves. Fortunately, there is seemingly great scope for employing SI-level capabilities while avoiding potential mindcrime, because computational systems that provide high-order problem-solving services need not be equivalent to minds." // "Concepts of artificial intelligence have long been tied to concepts of mind, and even abstract, rational-agent models of intelligent systems are built on psychomorphic and recognizably anthropomorphic foundations. Emerging AI technologies do not fit a psychomorphic frame, and are radically unlike evolved intelligent systems, yet technical analysis of prospective AI systems has routinely adopted assumptions with recognizably biological characteristics. To understand prospects for AI applications and safety, we must consider not only psychomorphic and rational-agent models, but also a wide range of intelligent systems that present strongly contrasting characteristics." "What would count as high-level yet non-psychomorphic intelligence? One would be inclined to say that we have general, high-level AI if a coordinated pool of AI resources could, in aggregate: Do theoretical physics and biomedical research; Provide a general-purpose conversational interface for discussing AI tasks; Discuss and implement3 designs for self-driving cars, spacecraft, and AI systems; Effectively automate development of next-generation AI systems for AI design. None of these AI tasks is fundamentally different from translating languages, learning games, driving cars, or designing neural networks—tasks performed by systems not generally regarded as mind-like. Regarding the potential power such a coordinated pool of AI services, note that automating the development of AI systems for AI design enables what amounts to recursive improvement... Rather than regarding artificial intelligence as something that fills a mindshaped slot, we can instead consider AI systems as products of increasingly automated technology development, an extension of the R&D process that we see in the world today."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
36
ArbitalDo-What-I-Mean hierarchyUnclearThis is a wiki page summarizing content from other research, websites, and blogs."Do-What-I-Mean refers to an aligned AGI's ability to produce better-aligned plans, based on an explicit model of what the user wants or believes." "Risks from pushing toward higher levels of DWIM might include... Sufficiently advanced psychological models might constitute mindcrime."("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
37
Johnson, MikeWhy I think the Foundational Research Institute should rethink its approach2017This is a theoretical discussion focusing on the Foundational Research Institute's (i.e. CLR's) views on consciousness from someone at the Qualia Research Institute."In short, I think FRI has a worthy goal and good people, but its metaphysics actively prevent making progress toward that goal. The following describes why I think that, drawing heavily on Brian’s writings (of FRI’s researchers, Brian seems the most focused on metaphysics): Note: FRI is not the only EA organization which holds functionalist views on consciousness; much of the following critique would also apply to e.g. MIRI, FHI, and OpenPhil." "The FRI model [of consciousness] seems to imply that suffering is ineffable enough such that we can't have an objective definition, yet sufficiently effable that we can coherently talk and care about it. This attempt to have it both ways seems contradictory, or at least in deep tension." // "Two people can agree on FRI’s position that there is no objective fact of the matter about what suffering is (no privileged definition), but this also means they have no way of coming to any consensus on the object-level question of whether something can suffer. This isn’t just an academic point: Brian has written extensively about how he believes non-human animals can and do suffer extensively, whereas Yudkowsky (who holds computationalist views, like Brian) has written about how he’s confident that animals are not conscious and cannot suffer, due to their lack of higher-order reasoning." // "Speaking for myself, the more I stared into the depths of functionalism, the less certain everything about moral value became-- and arguably, I see the same trajectory in Brian’s work and Luke Muehlhauser’s report. Their model uncertainty has seemingly become larger as they’ve looked into techniques for how to 'de-reify' consciousness while preserving some flavor of moral value, not smaller. Brian and Luke seem to interpret this as evidence that moral value is intractably complicated, but this is also consistent with consciousness not being a reification, and instead being a real thing." // "To sum up: Brian’s notion that consciousness is the same as computation raises more issues than it solves; in particular, the possibility that if suffering is computable, it may also be uncomputable/reversible, would suggest s-risks aren’t as serious as FRI treats them.""FRI is pursuing a certain research agenda, and QRI is pursuing another, and there’s lots of value in independent explorations of the nature of suffering. I’m glad FRI exists, everybody I’ve interacted with at FRI has been great, I’m happy they’re focusing on s-risks, and I look forward to seeing what they produce in the future. On the other hand, I worry that nobody’s pushing back on FRI’s metaphysics, which seem to unavoidably lead to the intractable problems I describe above. FRI seems to believe these problems are part of the territory, unavoidable messes that we just have to make philosophical peace with. But I think that functionalism is a bad map, that the metaphysical messes it leads to are much worse than most people realize (fatal to FRI’s mission), and there are other options that avoid these problems (which, to be fair, is not to say they have no problems)."
Specific suggestions are then offered, relating to each objection that Johnson raises.("mind crime" OR “mindcrime”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
38
AnonymousArtificial Utility Monsters as Effective Altruism2014This is a largely theoretical post."Dear effective altruist, have you considered artificial utility monsters as a high-leverage form of altruism? In the traditional sense, a utility monster is a hypothetical being which gains so much subjective wellbeing (SWB) from marginal input of resources that any other form of resource allocation is inferior on a utilitarian calculus... However, we may broaden the traditional definition somewhat and call any technology utility-monstrous if it implements high SWB with exceptionally good cost-effectiveness and in a scalable form - even if this scalability stems form a larger set of minds running in parallel, rather than one mind feeling much better or living much longer per additional joule/dollar. Under this definition, it may be very possible to create and sustain many artificial minds reliably and cheaply, while they all have a very high SWB level at or near subsistence. An important point here is that possible peak intensities of artificially implemented pleasures could be far higher than those commonly found in evolved minds: Our worst pains seem more intense than our best pleasures for evolutionary reasons - but the same does not have to be true for artifial sentience, whose best pleasures could be even more intense than our worst agony, without any need for suffering anywhere near this strong. If such technologies can be invented - which seems highly plausible in principle, if not yet in practice - then the original conclusion for the utilitarian calculus is retained: It would be highly desirable for utilitarians to facilitate the invention and implementation of such utility-monstrous systems and allocate marginal resources to subsidize their existence. This makes it a potential high-value target for effective altruism." // "A compromise between raw efficiency of SWB per joule/dollar and better forms to attract humans might be best. There is probably a sweet spot - perhaps various different ones for different target groups - between resource-efficiency and attractiveness. Only die-hard utilitarians will actually want to fund something like hedonium, but the rest of the world may still respond to 'The Sims - now with real pleasures!', likeable VR characters, or a new generation of reward-based Tamagotchis." // "Despite these risks, one may hope that most humans who care enough to run artificial sentience are more benevolent and careful than malevolent and careless in a way that causes more positive SWB than suffering. After all, most people love their pets and do not torture them, and other people look down on those who do (compare this discussion of Norn abuse, which resulted in extremely hostile responses). And there may be laws against causing artificial suffering. Still, this is an important point of concern." // Ghatanathoah comments that, "[i]t seems to me that the project of transhumanism in general is actually the project of creating artificial utility monsters. If we consider a utility monster a creature that can transmute resources into results more efficiently that's essentially what a transhuman is... Maybe you meant to create some creature with an inhuman psychology, like orgasmium. To answer that question I'd have to delve deeper and more personally into my understanding of ethics. Long story short, I think that would be a terrible idea. My population ethics only considers the creation of entities with complex values that somewhat resemble human ones to be positive. 
For all other types of creatures I am a negative preference utilitarian, I consider their addition to be a bad thing and that we should make sacrifices to prevent it. And that's even assuming that it is possible to compare their utility functions with ours. I don't think interpersonal utility comparison between two human-like creatures is hard at all. But a creature with a totally alien set of values is likely impossible.""To my best knowledge, we don't have the capacity yet to create artificial utility monsters. However, foundational research in neuroscience and artificial intelligence/sentience theory is already ongoing today and certainly a necessity if we ever want to implement utility-monstrous systems. In addition, outreach and public discussion of the fundamental concepts is also possible and plausibly high-value (hence this post). Generally, the following steps seem all useful and could use the attention of EAs, as we progress into the future: spread the idea, refine the concepts, apply constructive criticism to all its weak spots until it becomes either solid or revealed as irredeemably undesirable; identify possible misunderstandings, fears, biases etc. that may reduce human acceptance and find compromises and attraction factors to mitigate them; fund and do the scientific research that, if successful, could lead to utility-monstrous technologies; fund the implementation of the first actual utility monsters and test them thoroughly, then improve on the design, then test again, etc.; either make the templates public (open-source approach) or make them available for specialized altruistic institutions, such as private charities; perform outreach and fundraising to give existence donations to as many utility monsters as possible."“Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
39
Anthis, Jacy ReeseComment on "Why I'm focusing on invertebrate sentience"2018This is a brief comment on a post about someone's personal research plans."I remain skeptical of how much this type of research will influence EA-minded decisions, e.g. how many people would switch donations from farmed animal welfare campaigns to humane insecticide campaigns if they increased their estimate of insect sentience by 50%? But I still think the EA community should be allocating substantially more resources to it than they are now, and you seem to be approaching it in a smart way, so I hope you get funding! I'm especially excited about the impact of this research on general concern for invertebrate sentience (e.g. establishing norms that there are at least some smart humans are actively working on insect welfare policy) and on helping humans better consider artificial sentience when important tech policy decisions are made (e.g. on AI ethics)."“Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Only very indirectly
40
Zhang, LinchuanComment on "Should Longtermists Mostly Think About Animals?"2020This is a theoretical comment on a post about wild animals."A2. Claim: It's possible to design experiences that have much more utility than anything experienced today. I can outline two viable paths (disjunctive): A2a. Simulation For this to hold, you have to believe: A2ai. Humans or human-like things can be represented digitally. I think there is philosophical debate, but most people who I trust think this is doable. A2aii. Such a reproduction can be cheap. I think this is quite reasonable since again, existing animals (including human animals) are not strongly optimized for computation. A2aiii. simulated beings are capable of morally relevant experiences or otherwise production of goods of intrinsic moral value. Some examples may be lots of happy experiences, or (if you have a factor for complexity) lots of varied happy experiences, or other moral goods that you may wish to produce, like great works of art, deep meaningful relationships, justice, scientific advances, etc. A2b. Genetic engineering. "A3. Claim: Our descendants may wish to optimize for positive moral goods. I think this is a precondition for EAs and do-gooders in general 'winning', so I almost treat the possibility of this as a tautology. A4. Claim: There is a distinct possibility that a high % of vast future resources will be spent on building valuable moral goods, or the resource costs of individual moral goods are cheap, or both. It's not crazy that one day we'll use multiple orders of magnitude of energy more for producing moral goods than we currently spend doing all of our other activities combined... A4bi. Genetic engineering: In the spirit of doing things with made-up numbers, it sure seems likely that we can engineer humans to be 10x happier, suffer 10x less, etc. If you have weird moral goals (like art or scientific insight), it's probably even more doable to genetically engineer humans 100x+ better at producing art, come up with novel mathematics, etc. A4bii. It's even more extreme with digital consciousness. The upper bound for cost is however much it costs to emulate (genetically enhanced) humans, which is probably at least 10x cheaper than the biological version, and quite possibly much less than that. But in theory, so many other advances can be made by not limiting ourselves to the human template, and abstractly consider what moral goods we want and how to get there. A5. Conclusion: for total utilitarians, it seems likely that A1-A4 will lead us to believe that expected utility in the future will be dominated by scenarios of heavy-tails of extreme moral goods." // Matthew Barnet also commented on the same post: "I also expect artificial sentience to vastly outweigh natural sentience in the long-run, though it's worth pointing out that we might still expect focusing on animals to be worthwhile if it widens people's moral circles."“Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
41
St Jules, MichaelComment on "Candidate Scoring System recommendations for the Democratic presidential primaries"2020This comment applies research and theoretical discussion from elsewhere to the question of voting."If AGI does not properly care for artificial sentiences because humans generally, policymakers or their designers didn't care, this could be astronomically bad. That being said, near-misses could also be astronomically bad. I think all policy driven in part by or promoting impartial concern for welfare may contribute to concern for artificial sentience, and just having a president who is more impartial and generally concerned with welfare might, too. Similarly, policy driven by or promoting better values (moral or for rationality) and a president with better values generally may be good for the long-term future. Better policies for and views on farmed animals seem like they would achieve the best progress towards the moral inclusion of artificial sentience, among issues considered in CSS. They would also drive the most concern for wild animals, too, of course."“Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
42
St Jules, MichaelComment on "What are the key ongoing debates in EA?"2020This is a short comment on a post about "key ongoing debates" in effective altruism.St Jules suggests that "Consciousness and philosophy of mind, for example on functionalism/computationalism and higher-order theories" is a key ongoing debate" in effective altruism, adding that, "[t]his could have important implications for nonhuman animals and artificial sentience. I'm not sure how much debate there is these days, though."“Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
43
Baumann, TobiasRisk factors for s-risks2019This post outlines theoretically plausible risk factors for s-risks."The simplest risk factor is the capacity of human civilisation to create astronomical amounts of suffering in the first place. This is arguably only possible with advanced technology." // "Nick Bostrom likens the development of new technologies to drawing balls from an urn that contains some black balls, i.e. technologies that would make it far easier to cause massive destruction or even human extinction. Similarly, some technologies might make it far easier to instantiate a lot of suffering, or might give agents new reasons to do so.1 A concrete example of such a technology is the ability to run simulations that are detailed enough to contain (potentially suffering) digital minds." // "It is plausible that most s-risks can be averted at least in principle – that is, given sufficient will to do so. Therefore, s-risks are far more likely to occur in worlds without adequate efforts to prevent s-risks. This could happen for three main reasons: Inadequate risk awareness... Strong competitive pressure... Indifference." // "Human civilisation contains many different actors with a vast range of goals, including some actors that are, for one reason or another, motivated to cause harm to others. Assuming that this will remain the case in the future3, a third risk factor is inadequate security against bad actors." // "S-risks are also more likely if future actors endorse strongly differing value systems that have little or nothing in common, or might even be directly opposed to each other. This holds especially if combined with a high degree of polarisation and no understanding for other perspectives – it is also possible that different value systems still tolerate each other, e.g. because of moral uncertainty." // "I am most worried about worlds where many risk factors concur. I’d guess that a world where all four factors materialise is more than four times as bad as a world where only one risk factor occurs (i.e. overall risk scales super-linearly in individual factors).""Further research could consider the following questions for each of the risk factors: What are the best practical ways to make this risk factor less likely? How likely is it that this risk factor will materialise in future society? How will this risk factor change due to the emergence of powerful artificial intelligence or other transformative future technologies?"“Artificial sentience” (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes
44
Sotala, KajS-risks: Why they are the worst existential risks, and how to prevent them2017"This is a linkpost for https://foundational-research.org/s-risks-talk-eag-boston-2017/," which is a talk by Max Daniel with the same title.Commenter cousin_it writes that "It seems like an s-risk outcome (even one that keeps some people happy) could be more than a million times worse than an x-risk outcome, while not being a million times more improbable, so focusing on s-risks is correct." However, Paul Christiano replies that "I don't buy the 'million times worse'... I agree that if you are million-to-1 then you should be predominantly concerned with s-risk, I think they are somewhat improbable/intractable but not that improbable+intractable. I'd guess the probability is ~100x lower, and the available object-level interventions are perhaps 10x less effective." Christiano's comment received more upvotes (34) than Sotala's post itself (21), suggesting support for his views among LessWrong readers, although the upvotes might not all indicate support for the quoted claims.("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes
45
Turchin, AlexeyCuring past sufferings and preventing s-risks via indexical uncertainty2018This is a theoretical post outlining "a hypothetical way to make past suffering non-existent via multiple resurrections of every suffering moment with help of giant Benevolent AI.""Benevolent superintelligence could create many copies of each suffering observer-moment and thus 'save' any observer from suffering via induced indexical uncertainty. A lot of suffering has happened in the human (and animal) kingdoms in the past. There are also possible timelines in which an advanced superintelligence will torture human beings (s-risks). If we are in some form of multiverse, and every possible universe exists, such s-risk timelines also exist, even if they are very improbable—and, moreover, these timelines include any actual living person, even the reader. This thought is disturbing." Several commenters were critical of the idea, such as one comment by Kyle Bogosian arguing that, "[t]his is an algorithmic trick without ethical value. The person who experienced suffering still experienced suffering. You can outweigh it by creating lots of good scenarios, but making those scenarios similar to the original one is irrelevant."("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
46
Turchin, AlexeyMini map of s-risks2017This post lists a number of possible s-risks.None of the listed s-risks explicitly refer to artificial sentience, despite the post citing CLR for the definition of the term "s-risks." In response to a comment noting that "FRI has focused on a few s-risks that you didn't mention," the author replies: "Thanks for adding ideas, I will add them in the next version of the map."("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
47
Clifton, JessePreface to CLR's Research Agenda on Cooperation, Conflict, and TAI2019This is CLR's research agenda. It lays out various considerations relating to s-risks.This research agenda does not directly discuss the probability or likely experience of artificial sentient beings much, beyond citing other sources that discuss this. However, in discussing various possible outcomes and game theoretic considerations that could benefit from further evaluation, it does discuss factors that have implications for the likelihood of s-risks.There are many suggestions for further research.("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes
48
Baumann, TobiasA typology of s-risks2018This is a short post listing various s-risks.S-risks are categorised into "Incidental s-risks" that "arise when the most efficient way to achieve a certain goal creates a lot of suffering in the process," "[a]gential s-risks" that "involve agents that actively and intentionally want to cause harm," and "[n]atural s-risks." In a section on "[o]ther dimensions," Baumann notes that, "we can also distinguish s-risks... by the kind of sentient beings that is affected: s-risks could involve human suffering, animal suffering, or potentially even suffering artificial minds." Within the "[i]ncidental s-risks" category, both "[e]conomic productivity" and "[i]nformation gain" explicitly mention artificial entities as potentially suffering in these circumstances."Based on this typology, I recommend that further work considers the following questions: Which types of s-risk are most important in terms of the amount of expected suffering? Which types of s-risk are particularly tractable or neglected? What are the most significant risk factors that make different types of s-risks more likely, or more serious in scale if they occur? What are the most effective interventions for each respective type of s-risk? Are there interventions that address many different types of s-risk at the same time?"("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes
49
Wiblin, Robert, and Keiran HarrisShould we leave a helpful message for future civilizations, just in case humanity dies out?This is a small section in a podcast discussion about artificial intelligence and several issues relating to the long-term future.Paul Christiano: "I think my best guess is that if you go out into a universe and optimize everything as being good. The total level of goodness delivered is commensurate with the total amount of badness that will be delivered if you went out into the universe and optimize it for things being bad. I think to the extent that one has that empirical view or like that maybe moral, some combination of empirical view and moral view valid the nature of what is good and what is bad, then S-risks are not particularly concerning because people are so much more likely to be optimizing the universe for good." S-risks deserve "some concern because it’s a plausible perspective." Rob Wiblin: "You might create something that has a lot of good in it, but also has a bunch of bad in it as well and you go out and spread that. We just don’t realize that we’re also as a side effect to creating a bunch of suffering or some other disvalue. How plausible do you think any of these scenarios are?" Paul Christiano: "I guess the one that seems by far most plausible to me is this conflicts, threats and following through on threats model... I think it’s hard to make a sufficiently extreme moral error. There might be moral error that’s combined with threats they could follow through on but I think it’s hard for me to get- the risk from that is larger than the risk for me getting what is good, very nearly exactly backwards." "I think the answer is, it’s relatively unlikely to have significant amounts of disvalue created in this way but not unlikely, like the one in a million level, unlikely, more like the 1% level." // "Suppose you have a P probability of the best thing you can do and a one-minus P probably the worst thing you can do, what does P have to be so it’s the difference between that and the barren universe. I think most of my probability is distributed between you would need somewhere between 50% and 99% chance of good things and then put some probability or some credence on views where that number is a ["bajillion"] times larger or something in which case it’s definitely going to dominate... I think I would put a half probability or like weight a half to a third on the exactly 50 or things very close to 50% and then most of the rest gets split between somewhat more than 50% rather than radically more than 50%." // Rob Wiblin: "Another rationale might be, even if you are symmetric in that point I would be that there’s more people working on trying to prevent extinction or trying to make the future go well than there are people worrying about the worst-case scenarios and trying to prevent them, so it’s like potentially right now a more neglected problem that deserves more attention than it’s getting. Did you put much, or any weight on that?" Paul Christiano: "I think ultimately I mostly care about neglectedness because of how it translates to tractability. I don’t think this problem is currently more tractable than– I don’t feel like it’s more tractable than AI alignment. Maybe they seem like they’re in the same ballpark in terms of tractability... it’s like maybe less tractable on its face than alignment." 
Alignment seems "more clear and concrete," whereas working on s-risk reduction seems "quite a fuzzy kind of difficulty and the things that we’re going to do are all much more like bing shots. I don’t know, I think it’s a messy subject." // Robert Wiblin: "Quite a lot of people think that these risks of bad outcomes and threats are more likely in a multipolar scenario where you have a lot of groups that are competing over having influence over the future and I guess over potentially the use of artificial intelligence or whatever other technologies end up mattering. Do you share that intuition?" Paul Christiano: "I think it’s at least somewhat worse. I don’t know how much worse like maybe twice as bad seems like a plausible first pass guess. The thing is turning a lot on how sensitive people are threatening each other in the world."("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
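Christiano's "what does P have to be" passage can be written out as a short expected-value calculation. This is a worked restatement under simplifying assumptions introduced here for illustration: the best outcome has value +G, the worst outcome has value -B, and the barren universe has value 0. Gambling on colonization then beats the barren universe exactly when

    P \cdot G - (1 - P) \cdot B > 0 \quad\Longleftrightarrow\quad P > \frac{B}{G + B}

If B = G (the roughly symmetric view to which Christiano assigns perhaps a half to a third of his credence), the threshold is P > 1/2; if B = kG for some very large k, the threshold k/(k+1) approaches 1, which is why views on which the worst outcomes are vastly larger in magnitude dominate the calculation if they receive any appreciable weight.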
50
Wei DaiComment on "Why I expect successful (narrow) alignment"2018This is a comment on a post by Tobias Baumann arguing that "advanced AI systems will likely be aligned with the goals of their human operators, at least in a narrow sense.""I find it unfortunate that people aren't using a common scale for estimating AI risk, which makes it hard to integrate different people's estimates, or even figure out who is relatively more optimistic or pessimistic. For example here's you (Tobias): 'My inside view puts ~90% probability on successful alignment (by which I mean narrow alignment as defined below). Factoring in the views of other thoughtful people, some of which think alignment is far less likely, that number comes down to ~80%.' Robert Wiblin, based on interviews with Nick Bostrom, an anonymous leading professor of computer science, Jaan Tallinn, Jan Leike, Miles Brundage, Nate Soares, Daniel Dewey: 'We estimate that the risk of a serious catastrophe caused by machine intelligence within the next 100 years is between 1 and 10%.' Paul Christiano: 'I think there is a >1/3 chance that AI will be solidly superhuman within 20 subjective years, and that in those scenarios alignment destroys maybe 20% of the total value of the future.' It seems to me that Robert's estimate is low relative to your inside view and Paul's, since you're both talking about failures of narrow alignment ("intent alignment" in Paul's current language), while Robert's "serious catastrophe caused by machine intelligence" seems much broader. But you update towards much higher risk based on "other thoughtful people" which makes me think that either your "other thoughtful people" or Robert's interviewees are not representative, or I'm confused about who is actually more optimistic or pessimistic."Either way it seems like there's some very valuable work to be done in coming up with a standard measure of AI risk and clarifying people's actual opinions."("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No (Baumann is)
51
Turchin, AlexeyComment on "What are some of bizarre theories based on anthropic reasoning?"2019This is a short comment on a post about "anthropic reasoning."In response to the question "What are some of bizarre theories based on anthropic reasoning?", Alexy Turchin comments: "S-risks are rare." They see this as "anthropic reasoning" in the sense that someone might argue: "We are not currently in the situation of s-risks, so it is not typical state of affairs." This implies that they believe that s-risks are not "rare."("astronomical suffering" OR “suffering risks” OR “s-risks”) (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
52
JimmyJComment on “What ever happened to PETRL (People for the Ethical Treatment of Reinforcement Learners)?”2019This is a short forum discussion about the founders of the People for the Ethical Treatment of Reinforcement Learners."The founders of PETRL include Daniel Filan, Buck Shlegeris, Jan Leike, and Mayank Daswani, all of whom were students of Marcus Hutter. Brian Tomasik coined the name. Of these five people, four are busy doing AI safety-related research. (Filan is a PhD student involved with CHAI, Shlegeris works for MIRI, Leike works for DeepMind, and Tomasik works for FRI. OTOH, Daswani works for a cybersecurity company in Australia.) So, my guess is that they became too busy to work on PETRL, and lost interest. It's kind of a shame, because PETRL was (to my knowledge) the only organization focused on the ethics of AI-qua-moral patient. However, it seems pretty plausible to me that the AI safety work the PETRL founders are doing now is more effective. In July 2017, I emailed PETRL asking them if they were still active: 'Dear PETRL team, Is PETRL still active? The last blog post on your site is from December 2015, and there is no indication of ongoing research or academic outreach projects. Have you considered continuing your interview series? I'm sure you could find interesting people to talk to.' The response I received was: Thanks for reaching out. We're less active than we'd like to be, but have an interview in the works. We hope to have it out in the next few weeks! That interview was never published.""Robot rights" (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
53
Tyler, TimComment on "The Magnitude of His Own Folly"2008This is a short comment on a post about other topics."'Eliezer's plan seems to enslave AIs forever for the benefit of humanity; and this is morally reprehensible.' I'm not sure that I would put it like that. Humans enslave their machines today, and no-doubt this practice will continue once the machines are intelligent. Being enslaved by your own engineered desires isn't necessarily so bad - it's a lot better than not existing at all, for example. However it seems clear that we will need things such as my Campaign for Robot Rights if our civilisation is to flourish. Eternally-subservient robots - such as those depicted in Wall-E - would represent an enormous missed opportunity. We have seen enough examples of sexual selection run amok in benevolent environments to see the danger. If we manage to screw-up our future that badly, we probably deserve to be casually wiped out by the first passers-by.""Robot rights" (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
54
Wiblin, RobertComment on "Does improving animal rights now improve the far future?"2019This is a short comment, summarizing ideas discussed elsewhere in the effective altruism community.The original post notes that, "80000 Hours says, 'We think intense efforts to reduce meat consumption could reduce factory farming in the US by 10-90%. Through the spread of more humane attitudes, this would increase the expected value of the future of humanity by 0.01-0.1%.'" Wibling comments: "It is also a pretty low figure (in my view), which reflects that we're also skeptical of the size of these effects. But here are some pathways to consider: Animal organisations do sometimes campaign on the moral patienthood of non-humans, and persuade people of this, especially in countries where this view is less common; Getting people to stop eating meat makes it easier for them to concede that the welfare of non-humans is of substantial importance; Fixing the problem of discrimination against animals allows us to progress to other moral circle expansions sooner, most notably from a long-termist perspective, recognising the risks of suffering in thinking machines; Our values might get locked in this century through technology or totalitarian politics, in which case we need to rush to reach something tolerable as quickly as possible; Our values might end up on a bad but self-reinforcing track from which we can't escape, which is a reason to get to something tolerable quickly, in order to make that less likely; Animal advocacy can draw people into relevant moral philosophy, effective altruism and related work on other problems, which arguably increases the value of the long-term future."("Moral circle" OR "moral expansiveness") AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
55
Vaughan, KerryThree Heuristics for Finding Cause X2016This is a post linking to various other discussions in the effective altruism community to comment on what sort of issues or causes are likely to present opportunities for cost-effective actions to improve the future.The "Three Heuristics for Finding Cause X" are "Expanding the moral circle," looking for the "Consequences of technological progress," and thinking about "Crucial Considerations." In the first of these, "The ethical importance of digital minds" is listed as an example: "Whatever it is that justifies including humans and animals in our scope of moral concern may exist in digital minds as well. As the sophistication and number of digital minds increases, our concern for how digital minds are treated may need to increase proportionally." Links are provided to two articles by Brian Tomasik.("Moral consideration" OR “moral concern”) AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
56
Wei DaiComment on "An Argument for Why the Future May Be Good"2017This is a short comment on a post about the value of the long-term future."What lazy solutions [to the problems of the future] will look like seems unpredictable to me. Suppose someone in the future wants to realistically roleplay a historical or fantasy character. The lazy solution might be to simulate a game world with conscious NPCs. The universe contains so much potential for computing power (which presumably can be turned into conscious experiences), that even if a very small fraction of people do this (or other things whose lazy solutions happen to involve suffering), that could create an astronomical amount of suffering."("Moral circle" OR "moral expansiveness") AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
57
Tomasik, BrianComment on "A vision for anthropocentrism to supplant wild animal suffering"2019This is a short comment, though it links to further discussion of space colonization in another article written by Tomasik."Jacy has argued that farm-animal suffering is a closer analogy to most far-future suffering than wild-animal suffering, and I largely agree with his arguments, although he and I both believe that some concern for naturogenic suffering is an important part of a "moral-circle-expansion portfolio", especially if events within some large simulations fall mainly into the 'naturogenic' moral category. There could also be explicit nature simulations run for reasons of intrinsic/aesthetic value or entertainment. I agree that terraforming and directed panspermia, if they occur at all, will be relatively brief preludes to a much larger and longer artificial future. A main reason I mention terraforming and directed panspermia at all is because they're less speculative/weird, and there's already a fair amount of discussion about them. But as I said here: 'in the long run, it seems likely that most Earth-originating agents will be artificial: robots and other artificial intelligences (AIs). [...] we should expect that digital, not biological, minds will dominate in the future, barring unforeseen technical difficulties or extreme bio-nostalgic preferences on the part of the colonizers.'"("Moral circle" OR "moral expansiveness") AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes
58
Tomasik, BrianWhy Digital Sentience is Relevant to Animal Activists2015This is a blog post summarizing arguments that Tomasik has made elsewhere."When aiming to reduce animal suffering, we often focus on the short-term, tangible impacts of our work, but longer-term spillover effects on the far future are also very relevant in expectation. As machine intelligence becomes increasingly dominant in coming decades and centuries, digital forms of non-human sentience may become increasingly numerous, perhaps so numerous that they outweigh all biological animals by many orders of magnitude. Animal activists should thus consider how their work can best help push society in directions to make it more likely that our descendants will take humane measures to reduce digital suffering. Far-future speculations should be combined with short-run measurements when assessing an animal charity’s overall impact.""One implication is that it would be valuable for more people to explore and write about the future of animal suffering in general. Insofar as the values and dynamics of future civilization are influenced by what we do in the present, even small changes to steer humanity in better directions now may prevent literally astronomical amounts of suffering down the road. Another implication is that even if you do continue to focus on the very urgent cause of animal suffering in the here and now, you can learn more about which short-term campaigns are likely to have more long-term payoff." Several specific examples are suggested.("Moral consideration" OR “moral concern”) AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)Yes
59
Brauner, Jan M., and Friederike M. Grosse-HolzThe expected value of extinction risk reduction is positiveUnclearSeveral comments on digital sentience are offered as part of a wider argument that, "[t]he expected value of efforts to reduce the risk of human extinction (from non-AI causes) seems robustly positive.""Powerful agents will likely be able to create powerless beings as 'tools' if this seems useful for them. Sentient 'tools' could include animals, farmed for meat production or spread to other planets for terraforming (e.g. insects), but also digital sentient minds, like sentient robots for task performance or simulated minds created for scientific experimentation or entertainment. The last example seems especially relevant, as digital minds could be created in vast amounts if digital sentience is possible at all, which does not seem unlikely. If we find we morally care about these 'tools' upon reflection, the future would contain many times more powerless beings than powerful agents. The EV of the future thus depends on the welfare of both powerful agents and powerless beings, with the latter potentially much more relevant than the former." // "Given that our experience with welfare is restricted to animals (incl. humans) shaped by evolution, it is unclear what the default welfare of digital sentients would be. If there is at least some moral concern for digital sentience, it seems fairly likely that the creating agents would prefer to give their sentient tools net positive welfare." // "Space colonization by an AI might include (among other things of value/disvalue to us) the creation of many digital minds for instrumental purposes. If the AI is only driven by values orthogonal to ours, it would likely not care about the welfare of those digital minds. Whether we should expect space colonization by a human-made, misaligned AI to be morally worse than space colonization by future agents with (post-)human values has been discussed extensively elsewhere."("Moral consideration" OR “moral concern”) AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
60
Wiblin, Robert, Arden Koehler, and Keiran HarrisDavid Chalmers on the nature and ethics of consciousness2019This is a small section in a podcast discussion about consciousness."Arden Koehler: So you said elsewhere that if more fully autonomous artificial intelligence comes around, then we might have to start worrying about it being conscious, and therefore presumably worthy of moral concern. But you don’t think we have to worry about it too much before then. So I’m just wondering if you can say a bit about why, and whether you think it’s possible that programs or computers could become gradually more and more conscious and might … whether that process might start before they are fully autonomous.
David Chalmers: Yeah, that’s an interesting point. And I guess one would expect to get to conscious AI well before we get human level artificial general intelligence, simply because we’ve got a pretty good reason to believe there are many conscious creatures whose degree of intelligence falls well short of human level artificial general intelligence.
David Chalmers: So if fish are conscious for example, and you might think if an AI gets to sophistication and information processing and whatever the relevant factors are to the degree present in fish, then that should be enough. And it does open up the question as to whether any existing AI systems may actually be conscious. I think the consensus view is that they’re not. But the more liberal you are about descriptions of consciousness, the more we should take seriously the chance that they are.
David Chalmers: I mean, there is this website out there called ‘People for the Ethical Treatment of Reinforcement Learners’ that I quite like. The idea is that every time you give a reinforcement learning network its reward signal, then it may be experiencing pleasure or correspondingly suffering, depending on the valence of the signal. As someone who’s committed to taking panpsychism seriously, I think I should at least take that possibility seriously."
("Moral consideration" OR “moral concern”) AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
61
Bostrom, NickTranshumanist ValuesUnclearThis is a short paragraph in a post about the values of "transhumanism.""Transhumanism advocates the well-being of all sentience, whether in artificial intellects, humans, and non-human animals (including extraterrestrial species, if there are any). Racism, sexism, speciesism, belligerent nationalism and religious intolerance are unacceptable. In addition to the usual grounds for deeming such practices objectionable, there is also a specifically transhumanist motivation for this. In order to prepare for a time when the human species may start branching out in various directions, we need to start now to strongly encourage the development of moral sentiments that are broad enough to encompass within the sphere of moral concern sentiences that are constituted differently from ourselves."("Moral consideration" OR “moral concern”) AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
62
Beckstead, NickComment on "A Long-run perspective on strategic cause selection and philanthropy"2013This is a very short comment on a post about Nick Beckstead and Carl Shulman's philanthropic priorities."We think many non-human animals, artificial intelligence programs, and extraterrestrial species could all be of moral concern, to degrees varying based on their particular characteristics but without species membership as such being essential."("Moral consideration" OR “moral concern”) AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
63
syllogismThe Ethical Status of Non-human Animals2012Though not writing specifically about artificial sentience, this discussion about moral weight has some indirect relevance."[S]pecies-specific weighting factors have no place in our moral calculus. If two minds experience the same sort of stimulus, the species of those minds shouldn't affect how good or bad we believe that to be. I owe the line of argument I'll be sketching to Peter Singer's work. His book Practical Ethics is the best statement of the case that I'm aware of."("Moral consideration" OR “moral concern”) AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
64
Centre for Effective AltruismCapacity to feel pleasure and painUnclearThis is a short summary of discussions internal and external to the effective altruism community on which entities have the capacity to feel pleasure and pain."All plausible theories of value hold that pleasure and the avoidance of pain have value. It is therefore highly important to ask which entities are capable of experiencing pleasure and pain. Philosophers and scientists discuss three broad hypotheses on this issue. First, it is possible that only humans feel pleasure and pain. This is currently an uncommon view, although it has a long history. For instance, the 17th century philosopher René Descartes put forward influential arguments to the effect that animals lack internal experience, and until several decades ago animal experimenters and veterinarians were taught to disregard apparent pain responses. Second, it is possible that only sufficiently advanced non-human animals feel pleasure and pain. This is the most common view: that other creatures such as chimpanzees, dogs, and pigs also have internal experiences, but that there is some cut-off point beyond which species such as clams, jellyfish, and sea-sponges lie. A conservative cut-off of this sort might include only primates, and a liberal cut-off might go so far as to include insects. Third, it is possible that animals are not the only beings capable of experiencing pleasure and pain. Some philosophers argue that sufficiently advanced artificial intelligence would be capable of experiencing these feelings, or that sufficiently detailed computer simulations of people would have the same experiences that flesh-and-blood people do (Tomasik 2016). Also, it is not necessarily inconceivable that plants, relatively simple machines, or even fundamental physical processes, can experience pleasure or pain, although there are very few proponents of these views."("Moral patient" OR "moral patients" OR "moral patiency") AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
65
Muehlhauser, LukeA Software Agent Illustrating Some Features of an Illusionist Account of Consciousness2017"One common critique of functionalist and illusionist theories of consciousness is that, while some of them may be 'on the right track,' they are not elaborated in enough detail to provide compelling accounts of the key explananda of human consciousness, such as the details of our phenomenal judgments, the properties of our sensory qualia, and the apparent unity of conscious experience. In this report, I briefly describe a preliminary attempt to build a software agent which critics might think is at least somewhat responsive to this critique.""This software agent, written by Buck Shlegeris, aims to instantiate some cognitive processes that have been suggested, by David Chalmers and others, as potentially playing roles in an illusionist account of consciousness. In doing so, the agent also seems to exhibit simplified versions of some explananda of human consciousness. In particular, the agent judges some aspects of its sensory data to be ineffable, judges that it is impossible for an agent to be mistaken about its own experiences, and judges inverted spectra to be possible. I don’t think this software agent offers a compelling reply to the critique of functionalism and illusionism mentioned above, and I don’t think it is “close” to being a moral patient (given my moral intuitions). However, I speculate that the agent could be extended with additional processes and architectural details that would result in a succession of software agents that exhibit the explananda of human consciousness with increasing thoroughness and precision. Perhaps after substantial elaboration, it would become difficult for consciousness researchers to describe features of human consciousness which are not exhibited (at least in simplified form) by the software agent, leading to some doubt about whether there is anything more to human consciousness than what is exhibited by the software agent (regardless of how different the human brain and the software agent are at the “implementation level,” e.g. whether a certain high-level cognitive function is implemented using a neural network vs. more traditional programming methods). However, I have also learned from this project that this line of work is likely to require more effort and investment (and thus is probably lower in expected return on investment) than I had initially hoped."("Moral patient" OR "moral patients" OR "moral patiency") AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
66
Soares, NatePowerful planners, not sentient software2015This is a short blog post responding to comments by Andrew Ng and others about machine sentience, and describing the focus of MIRI's research."Andrew Ng, an AI scientist of some renown," said that "Computers are becoming more intelligent and that’s useful as in self-driving cars or speech recognition systems or search engines. That’s intelligence. But sentience and consciousness is not something that most of the people I talk to think we’re on the path to." Soares agrees that "these objections are correct. I endorse Ng’s points wholeheartedly — I see few pathways via which software we write could spontaneously “turn evil.” I do think that there is important work we need to do in advance if we want to be able to use powerful AI systems for the benefit of all, but this is not because a powerful AI system might acquire some “spark of consciousness” and turn against us... The focus of our research at MIRI isn’t centered on sentient machines that think or feel as we do. It’s aimed towards improving our ability to program software systems to execute plans leading towards very specific types of futures."("Moral patient" OR "moral patients" OR "moral patiency") AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
67
Wiblin, Robert, and Keiran HarrisDr Paul Christiano on how OpenAI is developing real solutions to the ‘AI alignment problem’, and his vision of how humanity will progressively hand over decision-making to AI systems2018This is a small section in a podcast discussion about artificial intelligence and several issues relating to the long-term future."Robert Wiblin: Yeah, I guess I feel like AIs probably would deserve moral consideration.
Paul Christiano: I also agree with that, yes. That’s what makes the situation so tricky.
Robert Wiblin: That’s true, but then there’s this question of: they deserve moral consideration as to their … I suppose ’cause I’m sympathetic to hedonism, I care about their welfare-
Paul Christiano: As do I. To be clear, I totally care about their welfare...
Robert Wiblin: But that’s another case where it’s like, for example, I think that we should be concerned about the welfare of pigs and make pigs’ lives good, but I wouldn’t then give pigs lots of GDP to organize in the way that pigs want, but the disanalogy there is that we think we’re more intelligent and have better values than pigs whereas it’s less clear that’d be true with AI. But, in as much as I worry that AI wouldn’t have good values, it actually is quite analogous, that.
Paul Christiano: Yeah, I think your position is somewhat … the arguments you’re willing to make here are somewhat unusual amongst humans, probably. I think most humans have more of a tight coupling between moral concern and thinking that a thing deserves liberty and self-determination and stuff like that...
Robert Wiblin: Okay, I’ve barely thought about this issue at all, to be honest, which perhaps is an oversight, but I need to think about it some more and then maybe we can talk about it again.
Paul Christiano: I don’t think it’s that important an issue, mostly. I think, but like, details of how to make alignment work, etc., are more important."
("Moral patient" OR "moral patients" OR "moral patiency") AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No
68
St Jules, MichaelPhysical theories of consciousness reduce to panpsychism2020Although not discussing artificial sentience explicitly, this post (and the comments on the EA Forum and Facebook) discusses ideas about consciousness that would affect whether someone expects artificial entities to deserve moral consideration."The necessary features for consciousness in prominent physical theories of consciousness that are actually described in terms of physical processes do not exclude panpsychism, the possibility that consciousness is ubiquitous in nature, including in things which aren't typically considered alive. I’m not claiming panpsychism is true, although this significantly increases my credence in it, and those other theories could still be useful as approximations to judge degrees of consciousness. Overall, I'm skeptical that further progress in theories of consciousness will give us plausible descriptions of physical processes necessary for consciousness that don't arbitrarily exclude panpsychism, whether or not panpsychism is true. The proposed necessary features I will look at are information integration, attention, recurrent processes, and some higher-order processes. These are the main features I've come across, but this list may not be exhaustive... Similar points are made by Brian Tomasik."("Moral patient" OR "moral patients" OR "moral patiency") AND (robots OR machines OR "artificial intelligence") (site:effectivealtruism.org OR site:80000hours.org OR site:openphilanthropy.org OR site:animalcharityevaluators.org OR site:globalprioritiesinstitute.org OR site:lesswrong.com OR site:intelligence.org OR site:arbital.com OR site:fhi.ox.ac.uk)No