Singularity Summit 2012

How We’re Predicting AI



Speaker: Stuart Armstrong

Transcriber(s): Ethan Dickinson and Jeremy Miller

Moderator: Alright, Dr. Stuart Armstrong is next. He is the James Martin research fellow at Oxford University's Future of Humanity Institute. His academic background is in advanced math and computational biochemistry, and he has designed useful methods for rapidly comparing and virtually screening medicinal compounds.

His current research centers on formal decision theory, the risks and possibilities of artificial intelligence, the mid- to long-term potential for human progress and intelligent life overall, and anthropic probability. He is particularly interested in finding decision processes that give the correct answer under conditions of extreme ignorance, ways of mapping humanity's partially defined values onto an artificial entity, and the interaction between various existential risks. Please join me in welcoming to the stage Stuart Armstrong.


Stuart Armstrong: Hi there. First I just wanted to thank the Singularity Institute for inviting me amongst all these interesting speakers. I also wanted to thank my co-author, Kaj Sotala, who's unfortunately not here today.

This talk does exactly what it says in the title. It has a look at how we're predicting artificial general intelligence, and its main conclusion is we need to increase our uncertainty, spread those error bars, add caveats to our predictions. One hint that we may need to do this is from looking at previous predictions.

This is the famous Dartmouth AI conference, and they were basically predicting AGI or near-AGI over the course of a few months' work. Nine years later, Dreyfus was basically claiming that AI had nearly reached the limit of its potential. I think it's safe to say that neither of these predictions has been entirely borne out in practice.

Here's another prediction. "AGI will be developed in 15 to 25 years." You may want to take a moment to think who predicted this. Actually various people predicted it quite a lot during this year. Oh, and also in 2011, 2010...


Stuart: ...and a variety of other dates, stretching back to way more than 25 years in the past. We ought to be able to do better than this.

This talk is going to look at AGI predictions, timelines, and philosophy, and ask the question, "What performance should we expect from our predictors?" and "What performance do we get, as far as we can tell?" I'll be making extensive use of the Singularity Institute's database of AI predictions going back to the 1950s.

What performance should we expect? This is xkcd's cartoon where fields are lined up by order of purity where everybody sneers on the less pure subjects. Quite conveniently this is also a lineup of quality of predictions. For instance, predictions in physics are a lot more solid than predictions in sociology. Let's make a little bit of space and add some economists and historians on the graph.

Why are some of these predictions so much more solid than others? It's mainly because of different methods that they have to use. Deductive logics, hard and soft versions of the scientific methods, down to the poor historians who are limited to just looking at past examples.

Where would AGI predictors fit on this graph? There's that convenient slot all the way down there on the left.


Stuart: The reason that they're down there is that they can use none of these methods, namely because there's not a single example of an AGI that we've ever built. There we're reliant on expert opinion, which is a lot less reliable than the other methods.

Another interesting thing is you might think that as you went down there, that as the uncertainties grew, that the experts would become more humble, less attached to their predictions, more willing to engage with opposing points of view. No.


Stuart: Just one small example, if we look at economists, here is Paul Krugman waxing lyrical about the Chicago economics. [slide says "... comments from Chicago economists are the product of a Dark Age of macroeconomics..." Paul Krugman] Here's a Chicago economist responding in an equally generous spirit. [slide says "... leads to the unavoidable conclusion that Krugman isn't reading real economics anymore..."]


Stuart: This matters because – first of all it tells you that there's at least one expert here being massively overconfident. Also, these are experts, they're economists, they presumably have insight, maybe wisdom, and we'd really like to go in and grasp that insight from inside their minds, but as long as their insights are pointing in completely opposite directions, we can't.

This is the cartoon explanation for overconfidence. We have a bunch of things in our heads that cause our opinions. We also have some biases and rationalization, as we saw earlier this morning, and that leads to a reasonable conclusion. Our opponents are in exactly the same situation. However, from inside, all we can see is this. [slide shows "reasonable conclusion" changing to "biased conclusion"; "biases and rationalizations" are only apparent in our opponents, not ourselves]


Stuart: What this means is that no matter how strongly we feel that our opinions are correct, we have to admit that the opinions of our opponents are just as likely to be correct as our own, unless there's some objective criteria that can sort out who has the genuine expertise. Proper objective criteria, not just our own preferred versions of it.

Fortunately, people have been doing research into what makes a good expert opinion. This is James Shanteau's graph. Everything on the left, those tasks lead to good expert performance. If they have properties on the right, that leads to poor expert performance. Amongst the three most important are whether experts agree or disagree, whether the problem is decomposable or not, and towering above all else, whether feedback is available.

When it comes to predicting AI, in most situations we're stuck with this. Just on theoretical grounds, we should expect AGI predictors to be rather poor. There's no reason why an individual predictor might not do all they can to move into the left column whenever it's possible, for instance by decomposing the problem. Unfortunately, most of the predictors don't. One of the reasons for this is that they end up solving the wrong problem. They end up solving a much easier problem, but the wrong one.

At the institute where I work, we have a distinction between predicting "grind," which is easy for certain values of "easy," and predicting "insight," which is hard. Grind is basically things that can be predicted by just assuming that enough people will work on it for long enough.

For instance, how long will it take to produce the next Michael Bay blockbuster? There you're going to have a producer, you're going to have artists, you're going to have actors, you're going to have marketers. They're going to do some work, and after a certain amount of work, the blockbuster is going to emerge at the end. We can estimate how long it will take by looking at previous movies, and estimate the expected delays in the same way.

But could anyone hazard a guess as to when someone will solve the Riemann hypothesis? This is predicting insight, which is a lot harder to do.

A lot of AGI predictions are actually predicting insight, but pretend that they're predicting grind. The most typical example that comes up again and again is various versions of "Moore's law, hence AGI," which is, by the year XXXX, computers will have some thing that's generally comparable with the human brain, floating-point operations per second, number of synapses, something along those lines, hence AGI. Moore's law is basically grind, so you can predict that, but this thing has skirted over the real difficult part of it, which is, when the computer gets this property, how are we going to get AGI from it?

Anyway, that's theory. Let's have a look at facts. This is the database, as I said the Singularity Institute database. 95 of the predictions are timeline predictions to AGI. Unfortunately, they're not in a standardized form of, "By golly, I predict we'll have human-level AGI by the year XXXX." I went into all the timeline predictions and got a median estimate for each one. It's a somewhat subjective process, I encourage you to go online, have a look at the data, maybe give your own estimates, if you want to. We also assessed the expertise of the predictor, to distinguish people who know what they're talking about from writers and journalists who presumably don't.


Stuart: This is the data. The date the prediction was made versus the predicted AGI arrival. There you have Turing's original prediction, and you can actually make out the AI winter in the middle there. There's also some predictions above, about seven predictions above 2100.

Looking at this, the first thing that strikes is that it's all over the place. There's maybe some light patterns, but take two typical predictions, there'll be 20 years' distinction between them easily. There's no sign that experts are converging on a particular prediction. There's no particular sign that experts are predicting differently from non-experts. The theory that the predictions are probably quite poor seems to be backed up by the facts.

There's two folk explanations as to why we should expect AGI predictions to be so poor. One of them is the so-called "Maes-Garreau law," which is basically people predict that AGI will happen just before they die.


Stuart: AI will happen, it'll be a rapture of the nerds, you'll be saved.


Stuart: There is no real evidence for this. Here I've plotted the prediction minus expected lifetime, versus the age of the prediction. If you predict that the AI would happen just at your expected death, you'd be on zero on that graph. These are the people who expect to die before seeing AGI, these are the ones where there's more than five years difference between their prediction and their expected lifetime. If that law was true, we'd expect to see a lot more clumping around zero than we do here.
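The check described here can be sketched in a few lines of Python. This is only an illustration of the idea, not the analysis behind the slide: the records, the field names `predicted_year` and `expected_death_year`, and the helper names are all hypothetical, and the five-year window is borrowed from the figure quoted above.

```python
def maes_garreau_gaps(predictions):
    """Predicted AGI year minus the predictor's expected year of death."""
    return [p["predicted_year"] - p["expected_death_year"] for p in predictions]


def share_near_zero(gaps, window=5):
    """Fraction of gaps that fall within +/- `window` years of zero."""
    if not gaps:
        return 0.0
    return sum(1 for g in gaps if abs(g) <= window) / len(gaps)


# Hypothetical records, invented for illustration only (not from the database).
sample = [
    {"predicted_year": 2030, "expected_death_year": 2032},
    {"predicted_year": 2045, "expected_death_year": 2020},
    {"predicted_year": 2000, "expected_death_year": 2025},
    {"predicted_year": 2100, "expected_death_year": 2050},
]

gaps = maes_garreau_gaps(sample)
print(gaps)                   # [-2, 25, -25, 50]
print(share_near_zero(gaps))  # 0.25
```

If the law held, most gaps would sit close to zero and the share would be near one. A low share on real data is the "no clumping" pattern described in the talk.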

Another folk theory is that AGI is always 15 to 25 years away. Close enough that it's worth predicting and doing work on, far enough that no one will tell you you've got it wrong any time soon. This has some evidence to it. If you plot the time to AGI from different predictors, systematically a third are in this range. If we look at experts, we see the same pattern. If we look at non-experts, we see the same pattern.
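As a rough sketch of how that proportion could be computed, assuming each prediction is reduced to a pair of years (when it was made, when AGI is predicted): the pairs below are made up for illustration, the function name is hypothetical, and the resulting number says nothing about the real database.

```python
def share_in_window(pairs, low=15, high=25):
    """Fraction of predictions whose time to AGI lies in [low, high] years.

    `pairs` holds (year_made, year_predicted) tuples.
    """
    delays = [predicted - made for made, predicted in pairs]
    in_window = [d for d in delays if low <= d <= high]
    return len(in_window) / len(delays)


# Hypothetical (year made, predicted AGI year) pairs, for illustration only.
hypothetical_pairs = [
    (1960, 1980),  # 20 years out
    (1975, 2025),  # 50 years out
    (1990, 2010),  # 20 years out
    (2005, 2100),  # 95 years out
    (2010, 2028),  # 18 years out
    (2012, 2020),  # 8 years out
]

print(share_in_window(hypothetical_pairs))  # 0.5
```

Running the same function separately on expert, non-expert, and expired predictions would give the side-by-side comparison described in the talk.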

What's interesting to me is that, given that we have so few data points, these three graphs look remarkably similar. It kind of looks as if experts and non-experts are guessing the same time delays. We can also look at failed predictions, the ones whose time has come and gone. That also looks quite similar, seeing how few data we have. This doesn't prove the experts don't have deep insights, but it's not evidence in favor of it. Just looking at those four graphs, I think I can say that there's no evidence that experts have any predictive advantage. Though remembering what I said about biases and overconfidence, that doesn't mean that your own guess is any better. Your guess is probably as good as an expert's, which is bad.


Stuart: What can we do? The first thing we can do is spread uncertainty. That always helps. So, you think AGI will happen in 2040. What you're saying is, all of these experts are wrong. You have unique insight that is better, and they're all deluded. You'd need quite an extraordinary amount of evidence if you were saying that. "Well," you say, "it's pretty likely." Well, we spread your uncertainty a bit, now there's less of them wrong. By the way, feel free to spread the uncertainty of their predictions as well, experts are always overconfident. Now we go to "it's just an approximate guess," and now you're in the median estimate. You don't stand out, you're not making extraordinarily strong claims.

Another reason to spread your uncertainty is if we look at our current best timeline predictions. These concern whole-brain emulations, uploads. That's what Robin Hanson was talking about earlier. Basically you slice up a brain, scan it, construct a model, and instantiate it on a computer. The reason that timeline predictions for whole-brain emulations are much better than others is that they're very decomposed. They've justified why this is a grind problem and we don't need to wait for insights. There's a clear assumption of scenarios, there's a certain amount of feedback, and there's multiple pathways to the same goal.

My colleague Anders Sandberg ran some Monte Carlo simulations for whole-brain emulations under three different scenarios. These are probability distributions for when whole-brain emulations might happen. As you can see, the uncertainty of this is spread over the entire century. If this is the best timeline prediction, and its uncertainty is about a hundred years, then we should at least expect that most other timeline predictions have the same kind of uncertainty.
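To illustrate why a decomposed estimate still ends up with wide error bars, here is a toy Monte Carlo sketch. It is not Sandberg's model: the stage names, the duration ranges, the uniform distributions, and the assumption that stages run one after another are all invented for this example.

```python
import random

# Hypothetical stages with (min, max) durations in years; invented for illustration.
STAGES = {
    "scanning": (5, 40),
    "modelling": (5, 40),
    "compute": (0, 30),
}


def simulate_arrival(start_year, stages, rng):
    """One sample: stages run in sequence, each duration drawn uniformly."""
    return start_year + sum(rng.uniform(lo, hi) for lo, hi in stages.values())


def run(n, start_year=2012, seed=0):
    """Draw n arrival years and return them sorted."""
    rng = random.Random(seed)
    return sorted(simulate_arrival(start_year, STAGES, rng) for _ in range(n))


def quantile(sorted_samples, q):
    """Simple empirical quantile from an already sorted list."""
    idx = min(len(sorted_samples) - 1, int(q * len(sorted_samples)))
    return sorted_samples[idx]


samples = run(10_000)
low, mid, high = (quantile(samples, q) for q in (0.1, 0.5, 0.9))
print(f"10%: {low:.0f}, median: {mid:.0f}, 90%: {high:.0f}")
```

Even with each stage bounded, the gap between the 10% and 90% quantiles spans decades, which is the qualitative point being made about spreading uncertainty.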

What can we say about AGI? Timeline predictions are pretty poor. Other types of predictions such as plans for how to build AGIs have similar types of problems. But we can actually say some stuff about AGI. Not timelines but what AGIs are going to be like. The source for this area actually comes from philosophy.

This is something very controversial and probably surprising for computer scientists in the room, because their vision of a philosopher is probably someone who says inane things like, "Gödel's theorem proves AGI is impossible!" To which the computer scientist can just respond "Well, no." "Yes." "No." "Yes." "No."


Stuart: The discussion continues in an intellectual fashion.


Stuart: But what you have to remember is philosophers are also experts, and since they're experts, they're massively overconfident. If you want to take their arguments, you need to add more caveats, more uncertainty, decompose them, and then you can generally end up with something quite reasonable.

Let's take that inane Gödel's theorem statement. We can decompose it into something a lot more reasonable and weaker, that ends up with the conclusion that there may be a problem with self-reference in AGI, "and won't you keep that under consideration?" Then generally the discussion is a lot more productive. Remember that philosophers have on occasion come up with some useful ideas. Sorry, in philosophy, useful ideas have been come up with, not always by philosophers.


Stuart: I think things like the scientific method and formal logic have their role to play in AGI.

Some examples of how you can take philosophical arguments and improve them. This is Dreyfus' original argument in 1965 that computers can't cope with ambiguity. Just add a simple caveat that using 1965 AI approaches, you get something that's both reasonable and true. This is a more recent prediction by Gozzi that I selected from the list, it makes a long, interesting point that computing isn't thinking. The main thing you can extract from this is that it's possible that AGIs may be nothing like human brains. It's possible that thinking of them as human is the wrong approach.

In preparation for this talk, I've looked at a lot of philosophical predictions of very varying quality. I think the most useful around today in my opinion is what I'm calling the "Omohundro-Yudkowsky thesis." In its strong form, this is just, "behaving dangerously is a generic behavior for high-intelligence AGIs." This is a massive oversimplification. This is the "supply and demand" model for AGIs, it makes simplifying assumptions as to what AGIs will be, but just like the supply and demand model, from simplifying assumptions you can get useful results.

Let's refine it and narrow it. When you do that, you reduce it to, "Many AGI designs have the potential for unexpected dangerous behavior," combined with the prescriptive, "AGI programmers should demonstrate to moderate skeptics that their design is safe." I don't mean that they have to go and convince Omohundro and Yudkowsky, necessarily, but you should at least be able to convince people that aren't on your programming team or your own mother that your design is safe.

You may feel, as I did at some point, that the thesis is wrong, based on your opinion. Unfortunately, as we've seen before, your opinion on its own doesn't actually bear any weight. You need objective evidence to show it wrong. There's a really easy way of showing that it's wrong. I direct your attention to the second clause again. If you are an AGI programmer and you can't convince moderate skeptics that your design is safe, and there are these quite strong arguments for potential danger out there, I feel I have to ask, why are you going ahead?

Anyway, that was the philosophical interlude. Let's just get back to the conclusions. This fits perfectly with Kahneman's point earlier, our opinions are not strong evidence. Philosophy has some useful things to say. One of the reasons that philosophy has some useful things to say is that other approaches to AGI have so many problems, like the timeline approaches. Above all else, across philosophical, across timelines, across everything concerning AGI, I beg you to increase your uncertainty.

I want to say thanks to everybody who's helped me here, and I also want to say thanks to anyone who's had the courage to go out on a limb and pen an AGI prediction and put it out in the public domain. On these websites you can find my data and methodology [ and]. Thanks.


[Q&A period begins]

Man 1: Stuart, in saying that we should increase our uncertainty, are you meaning that we should expect it to be later than commonly thought likely, or is it equally likely that we'll all be surprised in the wrong direction and the AGI will appear sooner than is commonly recognized?

Stuart: I feel we should increase our uncertainty in both directions. If there's a crucial insight that might arrive tomorrow that makes it easy, then that's also a source of uncertainty. It's not fully formalized, but my current 80 percent estimate is something like 5 to 100 years.


[next question]

Man 2: Thank you. I'm sure that many of the same experts who made AGI predictions that you looked at also made predictions about maybe domain-specific features of AI, and when those might be developed. Did you look at all of those and see whether expert prediction – because some of those we can evaluate whether they were more or less correct now. Did you look at those at all?

Stuart: What you're talking about would be decomposing the problem, and surprisingly few experts did it. I don't want to mention names, but there was far too many, "I feel it would be wrong to suppose that it would take longer than 30 years to develop AGI," or that kind of reasoning. As for ones that decomposed the problem, there the problem is the whole "Moore's law hence AGI" approach. If you have, say, voice recognition, language interpretation, translation, and say map-planning or something like that, you can predict maybe when these will arrive, and if you predict them correctly as some people do, then that's great. You need to show that modules like this, you can put them together to make a true AGI, or build from these to make an AGI. There's a component missing. Definitely someone who makes better interim predictions is more likely to make a better AGI prediction, but it's still not very solid.

[next question]

Man 3: Have you tried comparing similar predictions in other problem domains, but ones that have similar characteristics, where the problem did eventually get solved? In that case you would be able to compare how people predicted against how it actually happened.

Stuart: That would be a very good idea, and the answer is no. The main reason for this is that I don't have other convenient databases of predictions in other fields. There could be... I mean, AGI predictions there seems to be something special about their tone, often, and about the approaches. As I say, some of them, if this had been anything else but an AGI prediction, the author would obviously never have written that, but it seems that in some cases the standards are very different. So, no, to answer your question. I would like to.

[next question]

Man 4: Actually, one example of such would be astronautics predictions in the late 50s and early 60s, which were even more spectacularly right and wrong than what we're talking about.

[next question]

Man 5: I was just wondering if, in your analysis, how much you've tried to filter the results based on people's definition of what AGI might be. I guess the thing that comes to mind for me is that we're already in a place where you can argue that certain types of intelligence, like say for instance what a Watson or Google does, is already superhuman in certain forms of intelligence. It would almost seem that by the time we get to whatever it is, that it's already going to be way beyond what we understand. Have you tried to come up with some formal definitions in which you say, "people are applying this definition, versus another one"?

Stuart: No, I didn't try and filter them at all. In the process where I told you where I got a single median AGI estimate, as long as their prediction was something in the vicinity of an AGI I kept it. Mainly because there's so little data. If I start breaking it down into categories, then I'd be able to say even less than what I can with this.

You were making a second point... Yes. There seems to be... just reading some of the older predictions, I get the impression that very few people would count Watson as an AGI. I get the impression a lot of them would think, if something like Watson exists, then there must be an AGI behind it, but I get some impressions that they would not have counted Watson doing exactly what it does as an AGI. I see you're shaking your head, so I'm probably wrong on that.

Man 5: That's not what I was trying to assert at all. What I was trying to say is that, we have aspects of superhuman intelligence that are already emerging, not to say that anything about those is much like what AGI might be about. Just simply that as it's coming online, this artificial intelligence, as it arrives it's already quite distinctly at a different scale than what human beings are capable of. In terms of us even understanding what AGI might actually look like when it arrives, that we don't necessarily have a very good feel for that. People's predictions, I guess you really have to be pretty careful in terms of, do people even know what they're predicting?

Stuart: Well the answer to that is we need to increase our uncertainties. [laughs]


[next question]

Man 6: Did you put any weight into economic or competitive factors? There's a huge difference between what's possible to do, and what people put energy into doing. If you take a look at the space race in the 1960s or the Human Genome Project, or the economic factors behind electronics and Moore's law in modern semiconductor companies, the economic factors are, I think at least, larger than the technological factors.

Stuart: I didn't put any weight into those factors, because I was just looking at what other people had predicted. Some of them used economic models in their reasoning, most didn't. I think though at the moment, you can make a very strong case that what's missing from AGI is insights rather than economic desire to build one.

Moderator: Stuart Armstrong from the Future of Humanity Institute. Thank you so much for being here.

Stuart: Thank you.