Will AI end everything?
A guide to guessing
Katja Grace, AI Impacts
What are we talking about?
Losing the future
Not an abstract thing
How to guess?
(a guess)
Priors, arguments, people
(See wiki.aiimpacts.org for more argument detail)
Argument from competent bad AI agents
First pass
Two new things
1. Cognitive labor
2. Agents
Argument from competent bad AI agents
Quantified (ish)
How much cognitive labor will be working toward bad futures vs. good futures?
If more cognitive labor is working toward a bad future than a good future at some time, will the future be bad?
A lot more than human cognitive labor, eventually
So let’s ignore
2. How much cognitive labor is controlled by new AI agents?
A rough proxy:
2.a. How likely is an AI system to be an agent?
2.b. How much of its own cognition does an AI system ‘control’?
2.c. How much non-agentic AI cognitive labor will be controlled by AI systems?
2.a. How likely is a system to be an agent?
Considerations:
2.b. How much of its own cognition does an AI system ‘control’?
2.c. How much non-agentic AI cognitive labor will be controlled by AI systems?
3. What fraction of AI goals are bad?
3.A. Why is it hard to find good goals? (Part 1)
Face from thispersondoesnotexist.com
3.A. Why is it hard to find good goals? (Part 2)
4) Against: short-term goals
a) Counter: selection effects
5) Against: why expect agents to learn the world really well but values really badly?
a) Counter: ‘sharp left turn’
3.B. Why is it easy to find appealing bad goals?
3.C. Why can’t we give AI systems goals anyway?
3. What fraction of AI goals are bad?
Argument from competent bad AI agents
Quantified (ish)
How much cognitive labor will be working toward bad futures vs. good futures?
If more cognitive labor is working toward a bad future than a good future at some time, will the future be bad?
Share to bad AI agent goals: 14%
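One way to read the quantified argument is as a product of two of its numbered questions: the share of cognitive labor controlled by AI agents (question 2) times the fraction of AI goals that are bad (question 3). The sketch below is only an illustration of that reading. The multiplication rule, the function name, and the input values are assumptions, not the talk's own method or numbers, and the inputs are not meant to reproduce the 14% figure.

```python
def share_to_bad_ai_goals(agent_controlled_share: float,
                          bad_goal_fraction: float) -> float:
    """Share of all cognitive labor working toward bad AI agent goals.

    Assumes (hypothetically) that the two inputs are independent
    fractions in [0, 1] and that they combine by multiplication.
    """
    for name, value in (("agent_controlled_share", agent_controlled_share),
                        ("bad_goal_fraction", bad_goal_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return agent_controlled_share * bad_goal_fraction


# Placeholder inputs for illustration only; they are not the talk's guesses.
example = share_to_bad_ai_goals(agent_controlled_share=0.5,
                                bad_goal_fraction=0.2)
print(f"Share to bad AI agent goals: {example:.0%}")
```

Writing the guess as a function makes it easy to see how sensitive the final share is to each input: halving either fraction halves the result.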
My overall guesses:
If more cognitive labor goes to a bad future, will the future be bad?
Two scenarios: fast, slow
Considerations regarding the whole argument
How to guess?
(a guess)
Priors, arguments, people
(See wiki.aiimpacts.org for more argument detail)
Conclusions