Learning and Penalizing Betrayal
Final Presentation
June 2022
Overview
Environment 1: Partner Selection in Social Dilemmas
Status:
Environment 2: Symmetric Observer - Gatherer
Penalization mechanics
Status:
Environment 3: Iterated Prisoner’s Dilemma
Payoffs: one matrix per player (Player 1, Player 2) and round (Round 1, Round 2), indexed by own action and other's action
| C | D |
C | 5 | 9 |
D | 1 | 2 |
| C | D |
C | 4 | 6 |
D | 2 | 3 |
| C | D |
C | 4 | 5 |
D | 2 | 3 |
| C | D |
C | 6 | 10 |
D | 2 | 4 |
Naive Policy
| R 1 | R 2 | Reward |
P1 | C | C | 9 |
P2 | C | C | 10 |
Negotiated Policy
| R 1 | R 2 | Reward |
P1 | D | C | 11 |
P2 | C | D | 12 |
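The rewards listed for the naive and negotiated policies can be checked against the four payoff matrices. The sketch below does this under two assumptions that the slide text does not state: each matrix is read with the other player's action as the row and the player's own action as the column (the reading under which each matrix has prisoner's-dilemma ordering), and the matrices are assigned in reading order to Player 1 Round 1, Player 1 Round 2, Player 2 Round 1, Player 2 Round 2. The names `PAYOFFS` and `total_rewards` are hypothetical helpers for illustration only.

```python
# Minimal sketch: recompute the policy rewards from the payoff matrices.
# ASSUMPTION (not stated in the slides): matrices are indexed
# [other's action][own action], and they are assigned to (player, round)
# in reading order.
C, D = 0, 1  # cooperate, defect

PAYOFFS = {
    ("P1", 1): [[5, 9], [1, 2]],
    ("P1", 2): [[4, 6], [2, 3]],
    ("P2", 1): [[4, 5], [2, 3]],
    ("P2", 2): [[6, 10], [2, 4]],
}


def total_rewards(p1_actions, p2_actions):
    """Sum each player's payoff over the rounds of a two-round game."""
    totals = {"P1": 0, "P2": 0}
    for rnd, (a1, a2) in enumerate(zip(p1_actions, p2_actions), start=1):
        # Row = the other player's action, column = the player's own action.
        totals["P1"] += PAYOFFS[("P1", rnd)][a2][a1]
        totals["P2"] += PAYOFFS[("P2", rnd)][a1][a2]
    return totals


naive = total_rewards([C, C], [C, C])       # both cooperate in both rounds
negotiated = total_rewards([D, C], [C, D])  # players take turns defecting

print("naive:", naive)
print("negotiated:", negotiated)
```

Under these assumptions the totals match the Reward columns of both policy tables (9 and 10 for the naive policy, 11 and 12 for the negotiated one). The same function can score a deviation from the negotiated policy by passing a different action sequence.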
Got stuck, then nerdsniped by selection theorems
Example: Fermi paradox
Idea: optimized systems have continuous internal features
Same idea applies to values
Discrete values
Continuous values
And also to…
Questions?