AI accelerating AI progress: risks and mitigations
Plan
Drivers of progress
Growth of effective compute
More spending, cheaper compute, better algorithms
In the last decade (h/t Epoch):
Drivers of progress
Individual enhancements often improve performance by more than a 5X increase in training compute would (WIP).
Key takeaway: better software responsible for ~50% of recent progress.
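One way to make the "~50% of recent progress" claim concrete is to decompose effective-compute growth into physical compute and software (algorithmic efficiency), and compare their contributions in log space. The growth rates below are hypothetical round numbers chosen for illustration, not Epoch's estimates:

```python
import math

# Hypothetical annual growth factors (illustrative, not Epoch's figures).
compute_growth = 4.0    # physical training compute, x per year (assumed)
software_growth = 4.0   # algorithmic efficiency, x per year (assumed)

# Effective compute multiplies the two sources.
effective_growth = compute_growth * software_growth

# Software's share of progress, measured in log space.
software_share = math.log(software_growth) / math.log(effective_growth)
# With equal growth rates, software accounts for half of all progress.
```

If software and compute each contribute a 4X/year factor, software is responsible for log(4)/log(16) = 50% of effective-compute growth.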
Plan
AI is beginning to accelerate AI progress
Future AI will accelerate progress by more.
What changes when we have AGI?
Caveat: maybe AGI initially requires large amounts of runtime compute (e.g. best-of-N sampling).
AGI = AI that can fully automate AI R&D.
Key difference: abundant cognitive labour.
Toy example:
→ Run on its training compute, AGI performs ~40 million forward passes per second.
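The arithmetic behind this toy example can be sketched as follows. All inputs below are hypothetical values chosen to reproduce the slide's ~40 million figure; the method (cluster throughput divided by per-pass cost) is the point:

```python
# Back-of-envelope: forward passes per second if the training cluster
# is repurposed for inference. All numbers are assumed for illustration.
params = 1e12                     # model parameters (assumed)
flops_per_forward = 2 * params    # ~2 FLOP per parameter per token
cluster_flops_per_sec = 8e19      # sustained cluster throughput (assumed)

passes_per_sec = cluster_flops_per_sec / flops_per_forward
# ≈ 4e7, i.e. tens of millions of forward passes every second
```

The qualitative takeaway survives large changes to the assumed numbers: any cluster big enough to train the model can run an enormous number of copies of it.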
What is the effect of abundant cognitive labour?
Two contrasting perspectives:
We don’t know which perspective is right.
Scary possibility: rapid software* progress without additional compute.
If software progress were 10X faster than it is today:
*I use “software” to include all non-compute sources of progress: algorithms, data, prompting, scaffolding, tool-use…
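To see why a 10X faster rate is scary, note that speeding up a rate of progress compounds in the exponent. The baseline growth factor below is assumed for illustration:

```python
# What "10X faster software progress" compounds to in one year.
current_software_growth = 4.0   # x per year, assumed baseline
speedup = 10                    # software progress runs 10X faster

# 10X the rate means the annual exponent is multiplied by 10.
accelerated_growth = current_software_growth ** speedup
# 4^10 ≈ 1,000,000x of effective compute per year from software alone
```

Under this (assumed) baseline, a decade's worth of software progress would arrive every year.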
Scary possibility: rapid software* progress without additional compute.
Three objections:
Objection 1: diminishing returns to software progress
Data from ImageNet efficiency improvements (h/t Epoch)
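Whether diminishing returns kill the feedback loop can be made precise with a toy model (my sketch, not from the talk): let software level S also set the effective research labour, so S grows at a rate proportional to S^r, where r captures returns to software R&D. If r > 1, progress accelerates; if r < 1, it fizzles:

```python
# Toy feedback loop: dS/dt = S**r, integrated with a simple Euler step.
# r > 1 -> accelerating (eventually explosive) progress; r < 1 -> fizzle.
def simulate(r, steps=100, dt=0.01):
    s = 1.0  # initial software level
    for _ in range(steps):
        s += dt * s**r
    return s

explosive = simulate(1.5)  # strong returns: growth speeds up over time
fizzling = simulate(0.5)   # diminishing returns: growth slows down
```

The empirical question the ImageNet data speaks to is which side of r = 1 software R&D sits on once AI labour scales with software quality.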
Objection 2: re-training takes many months.
Objection 3: computational experiments take weeks or months
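Objections 2 and 3 are both serial-bottleneck arguments, and an Amdahl's-law-style sketch (my framing, with assumed fractions) shows their force: even near-instant cognitive labour cannot speed up the experiment-bound share of R&D time:

```python
# Amdahl-style bound: overall R&D speedup when only the "thinking"
# fraction of wall-clock time is accelerated. Fractions are assumed.
def overall_speedup(frac_experiments, cognitive_speedup):
    # frac_experiments: share of time waiting on compute-bound
    # experiments or re-training runs (not accelerated).
    frac_thinking = 1 - frac_experiments
    return 1 / (frac_experiments + frac_thinking / cognitive_speedup)

# If half of R&D time is experiment-bound, a 1000X cognitive speedup
# yields only ~2X overall.
capped = overall_speedup(0.5, 1000)
```

The objections therefore hinge on how large the experiment-bound fraction really is, and how far abundant cognitive labour could shrink it (e.g. by designing more compute-efficient experiments).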
Plan
We should proceed very cautiously around AGI
Dangerous capabilities
Alignment
AI acceleration could make it hard to proceed cautiously
→ General destabilisation.
Threat models
Plan
Early warning signs
Take protective measures.
Forecast software progress after deploying new model
Measure historical rate of software progress
Evals on AI R&D tasks
RCTs + surveys
How much will a new model accelerate software progress?
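Measuring the historical rate of software progress can be sketched as a log-linear fit: track the training compute needed to reach a fixed benchmark score over time, and read off efficiency doublings per year. The data points below are invented for illustration:

```python
import math

# Hypothetical data: compute (FLOP) needed to reach a fixed benchmark
# score in each year. Invented numbers, for illustration only.
years = [2019, 2020, 2021, 2022, 2023]
flop_needed = [1e21, 3.5e20, 1.2e20, 4e19, 1.4e19]

# Rate of software progress, in efficiency doublings per year.
total_doublings = math.log2(flop_needed[0] / flop_needed[-1])
doublings_per_year = total_doublings / (years[-1] - years[0])
```

Comparing this rate before and after a new model is deployed internally (alongside evals on AI R&D tasks, and RCTs/surveys of researcher uplift) gives an estimate of how much that model accelerates software progress.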
Early warning signs – when should labs pause?
Either:
Protective measures
What should labs do today?
This is a responsible scaling policy for the risk of AI accelerating AI progress. More.
What should labs do today?
Why now?
Is this just like the other dangerous capability evals?
Key differences:
Questions