🧠 Intelligence Beyond Commitment Devices 🗳️
Xinyuan Sun (Xyn)
Research, Flashbots ⚡🤖
Alignment and Coordination
Alignment of AI is choosing desired outcomes.
AI <> AI
AI <> human
AI + human <> AI + human
Choosing desired outcomes is coordination.
Along the way we make some over-approximations as compromises.
One of the most popular ways to implement coordination is through commitments (e.g., decision theory, mechanism design, commitment devices).
How do we study commitments? Will commitments for AIs exhibit unique properties? If so, what are they? How do we leverage/mitigate those properties?
This is Kim
Kim has a box.
This box allows him to delegate arbitrary actions, and it is common knowledge that this box executes things faithfully.
Now Kim wants to use this box to improve the efficiency of the games he plays with his frenemy Don.
Tips for nerds: this box is a commitment device. Since it is common knowledge that its commitments are credible, we call it a credible commitment device.
Prisoner’s Dilemma
Kim would want to delegate to the box before the game begins.
Now the box makes the game work, great!
Tips for nerds: the folk theorem for commitment devices states that any payoff within the convex hull of the individually rational payoff set is achievable (even for single-shot games).
Commitment Devices
Crypto-economic Commitment Devices
Enforcement means commitments can credibly (with high confidence) predicate/predict an agent’s behavior. E.g., smoking, grim-trigger in cartels, …
Common knowledge of enforcement means the increased certainty (a predicated strategy space for agents) can shift the equilibria of the game.
The computation of delegation and the implementation of common knowledge take time.
The blocktime.
Let’s call the blocktime t.
Suppose we start at time T, within blocktime t:
Transactions (commitments, denoted by Com) are sent to the commitment device.
At the end of the blocktime, the commitment device implements common knowledge of the commitments and their settlement (denoted by a function F mapping a list of all commitments COM to a commitment device state) by making a public broadcast of the results.
Tips for nerds: here I’m equating the consensus broadcast at the end of each slot with finality of the transactions. This is true only on blockchains that have single-slot finality.
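The round described above (collect commitments during the blocktime, settle them with F, broadcast the result) can be sketched in Python. This is only a toy model under stated assumptions: the class name `CommitmentDevice`, its methods, the blocktime of 12 time units, and the use of `sum` as the semantics F are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Generic, List, Optional, TypeVar

Com = TypeVar("Com")      # a commitment (transaction)
State = TypeVar("State")  # the commitment device state


@dataclass
class CommitmentDevice(Generic[Com, State]):
    """Toy model of one round: collect commitments, settle with F, broadcast."""
    F: Callable[[List[Com]], State]        # commitment semantics
    blocktime: float                       # t
    block_start: float = 0.0               # T
    pending: List[Com] = field(default_factory=list)
    public_state: Optional[State] = None   # last broadcast result

    def submit(self, com: Com, now: float) -> bool:
        """Accept a commitment only inside the current block [T, T + t)."""
        if not (self.block_start <= now < self.block_start + self.blocktime):
            return False
        self.pending.append(com)
        return True

    def end_block(self) -> State:
        """Settle all pending commitments and publish the result."""
        self.public_state = self.F(list(self.pending))
        self.pending.clear()
        self.block_start += self.blocktime
        return self.public_state


# Hypothetical usage: F simply sums integer commitments.
device = CommitmentDevice(F=sum, blocktime=12.0)
device.submit(3, now=1.0)
device.submit(4, now=11.9)
state_inside_block = device.public_state  # nothing is common knowledge yet
late = device.submit(5, now=12.0)         # outside [T, T + t), so rejected
state = device.end_block()
```

Note that `public_state` stays empty until `end_block` runs: inside the block, no participant can rely on the device for common knowledge.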
Formally, we can define crypto as a Permissionless Credible Commitment Device (PCCD) that consists of a state S, a commitment constructor CCom, commitment semantics F: [Com] -> S, and a blocktime t.
PCCDs implement coordination via common knowledge.
But for games played within the blocktime (t’ < t), a PCCD cannot coordinate or align: lacking enforcement and common knowledge, it cannot choose the desired outcome.
Also, the strategy of the mediator/auctioneer (who executes the commitment semantics) is not certain (its action space is not predicated), because execution is itself a game played in t’. Often the mediator is a set of profit-seeking or even malicious agents that could collude. E.g., algorithmic pricing, Google ad auctions, …
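To illustrate why the mediator's freedom matters, here is a small sketch in which the same set of commitments settles differently depending on the order the mediator picks. The order-sensitive semantics `settle_first_come` (a single item goes to whichever commitment is listed first) and the bids are hypothetical and not taken from the slides.

```python
from itertools import permutations
from typing import Dict, List, Tuple

Com = Tuple[str, int]  # (player, bid); a hypothetical commitment format


def settle_first_come(ordered: List[Com]) -> Dict[str, int]:
    """Hypothetical order-sensitive F: the single item goes to the first commitment."""
    winner, bid = ordered[0]
    return {winner: bid}


# Both commitments arrive within the same block, so neither "happens before" the other.
coms = [("Kim", 5), ("Don", 5)]

# Every ordering is a valid choice for the mediator, and each yields a different state.
outcomes = {order: settle_first_come(list(order)) for order in permutations(coms)}
winners = {next(iter(result)) for result in outcomes.values()}
```

Here `winners` contains both players: the outcome is decided by whoever orders the block, not by the commitments themselves.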
Problem with Prisoner’s Dilemma
Suppose now the Commitment Constructor CCom allows payments.
Tips for nerds: this is the classic Stackelberg competition scenario, where the leader or the mechanism designer gets an asymmetric payoff
Now the only payoff possible is (3,1), where Don is indifferent between using the device and not.
Suppose both Kim and Don can use the box. Now (3,1) and (1,3) are the implementable payoffs. It is not clear that this is desirable.
Cooperative Games
Let’s abstract away the mediator, the “person with the box”: call her C, and call Kim and Don A and B.
C controls the commitment semantics F, and her strategy is not certain, because C’s choice, computation, and settlement of commitments is a game played within blocktime t.
Now, will the commitment game be different? The box is a credible commitment device (a PCCD), so it feels natural to model this as a cooperative game. Specifically, suppose A, B, and C use the box to form coalitions.
v(A) = v(B) = 1, v(C) = 0
v(AB) = 2, v(AC) = v(BC) = 3
v(ABC) = 4
The core of this game (the stable allocations, from which no sub-coalition could profitably deviate) is (A: 1, B: 1, C: 2).
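The core claim can be checked directly. Since the payoffs sum to v(ABC) = 4 and coalition AC must receive at least 3, B gets at most 1; B also needs at least v(B) = 1, so B gets exactly 1. The same argument with BC gives A exactly 1, leaving 2 for C. The Python sketch below confirms this on a grid of allocations in steps of 1/4; the grid search and the helper name `in_core` are illustrative choices, not part of the original slides.

```python
from fractions import Fraction
from itertools import combinations, product

players = ("A", "B", "C")

# Characteristic function from the slide.
v = {
    frozenset("A"): 1, frozenset("B"): 1, frozenset("C"): 0,
    frozenset("AB"): 2, frozenset("AC"): 3, frozenset("BC"): 3,
    frozenset("ABC"): 4,
}


def in_core(x):
    """True if allocation x is efficient and no proper coalition can do better alone."""
    if sum(x.values()) != v[frozenset(players)]:
        return False
    for size in range(1, len(players)):
        for coalition in combinations(players, size):
            if sum(x[p] for p in coalition) < v[frozenset(coalition)]:
                return False
    return True


# Enumerate allocations (a, b, 4 - a - b) on a grid with step 1/4.
step = Fraction(1, 4)
grid = [i * step for i in range(17)]  # 0, 1/4, ..., 4
core_points = [
    (a, b, 4 - a - b)
    for a, b in product(grid, repeat=2)
    if 4 - a - b >= 0 and in_core({"A": a, "B": b, "C": 4 - a - b})
]
```

The only grid point that survives is (1, 1, 2): everything the grand coalition gains beyond A's and B's standalone values goes to C, the mediator.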
Core
The core manifests itself: e.g., Kim and Don can both use the box and bid in a first-price auction.
There is no point in using the box anymore. The box is useless.
Concrete Example - Trading
At time T, there exists some liquidity on an Automated Market Maker (AMM). A user wants to trade some assets, potentially using the AMM liquidity.
Suppose users have time preferences and want to finish their trades within one block.
Currently, on the Ethereum base layer, most users just send a swap that trades against the AMM, and their swaps get picked off by sophisticated parties (bots).
One big reason is that users cannot coordinate with each other to trade among themselves first and then settle the remainder against the AMM liquidity (Walrasian style).
This coordination game between users happens at time t’ < t, within the blocktime.
The latter is clearly a “desired outcome”: higher welfare is achieved (trades pay less in fees), and we would have “alignment” across users of the PCCD. But we cannot achieve it.
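A small numerical sketch of the welfare gap, under assumptions that are not in the slides: a constant-product pool (x * y = k) holding 1000 A and 1000 B, a hypothetical 0.3% fee taken from the input amount, two users with equal and opposite 100-unit orders, and A and B valued 1:1 (the pool's starting price) when adding up what users receive.

```python
from typing import Tuple

FEE = 0.003  # hypothetical 0.3% swap fee, taken from the input amount


def swap(reserve_in: float, reserve_out: float, amount_in: float) -> Tuple[float, float, float]:
    """Constant-product swap; returns (amount_out, new_reserve_in, new_reserve_out)."""
    effective_in = amount_in * (1 - FEE)
    amount_out = reserve_out * effective_in / (reserve_in + effective_in)
    return amount_out, reserve_in + amount_in, reserve_out - amount_out


# The pool starts balanced, so its marginal price is 1 A per B.
pool_a, pool_b = 1000.0, 1000.0

# Uncoordinated: user 1 sells 100 A, then user 2 sells 100 B, each against the AMM.
u1_gets_b, pool_a_1, pool_b_1 = swap(pool_a, pool_b, 100.0)
u2_gets_a, pool_b_2, pool_a_2 = swap(pool_b_1, pool_a_1, 100.0)
uncoordinated_total = u1_gets_b + u2_gets_a

# Coordinated: the two orders cross in full at the pool's starting price of 1:1,
# so nothing is left over to settle against the AMM and no fee is paid.
netted_u1_gets_b = 100.0
netted_u2_gets_a = 100.0
netted_total = netted_u1_gets_b + netted_u2_gets_a
```

In the uncoordinated case the first trader receives fewer than 100 units, while the second receives more than 100 because the first moved the price in their favor. Combined, they receive less than the 200 units they would get by crossing directly, since fees and price impact stay with the pool. Which user ends up first is itself decided inside the block.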
Maximal Extractable Value (MEV) games
In both the Prisoner’s Dilemma and the AMM liquidity example, the PCCD commitment game reaches undesirable outcomes because some games are played within the blocktime t. Since the “speed of commitments” is slower than the speed at which those games are played, the PCCD fails to align/coordinate.
In those games, some value/welfare is transferred undesirably (handed to the rent-seeking mafia and the monarch) or destroyed (burnt as tribute to the Moloch).
We call that value MEV, and we call those games the MEV game of a PCCD.
MEV is a big industry, with active players and adversarial incentives.
MEV games are exactly the same as the AI alignment/coordination game.
Correspondence
The study of MEV enables AI alignment and cooperative AI.
Why? At a high level:
Elaboration
One-boxing in Newcomb’s paradox is an MEV game: through the predictor’s credible commitment, a decision in the past (past in causal time, but not in MEV time) is conditioned on a decision in the future.
It’s a flashloan!
"But whoever lives by the truth comes into the light, so that it may be seen plainly that what they have done has been done in the sight of God."
The boundary of the light cone is defined by the speed of light, which is the fastest possible speed at which information can be transmitted between events. Events that are within the light cone are causally connected and can potentially influence each other, while events outside the light cone are causally disconnected and unmeasurable.
The ordering and settlement of commitments in an MEV game is not certain: there is no “happens-before” relation like the “causal ordering” of distributed systems.
MEV games are defined by the fact that they are played faster than the speed of commitments: there is maximal value to be extracted by misaligned intelligences where the light cannot reach.
MEV = aligning misaligned superior intelligence hiding beyond the commitment cone
AI alignment = aligning misaligned superior intelligence hiding beyond the commitment cone
Certainty has a limit
Most technologies for implementing certainty have a speed limit, if not a fundamental physical bound.
The technology of commitments is no exception.
Intelligence has (almost) no limit
Intelligent agents will collaborate and compete in ultra-refined high-frequency games.
Those games are not bounded (except for physical limits) by any “speed of certainty/commitment.”
There will be misalignment
Games that are played faster than certainty are MEV games, and they are almost never aligned.
Those games WILL be played by misaligned intelligent agents hiding beyond the commitment cone, because that is where the most cross-domain MEV is (correlated yet unregulated games).
MEV is a phenomenon.
Misalignment is here and concrete
Bot PvP. Hyper-financialization.
Real incentives, with billions of dollars of value on Ethereum from MEV games.
The slower intelligence becomes the stale quote (snail commitment) to be sniped by super-intelligences (commitments with more agency) for MEV.
We care about the slower intelligence.
Proof Sketch
How do we align those MEV games?
How do we align the intelligence hiding beyond the commitment cone, when it travels faster than the speed of commitment/certainty/common knowledge?
Either
i) slow down the speed of intelligence,
ii) speed up the speed of commitment,
iii) use technologies other than commitments to implement coordination that is not subject to speed limits,
or iv) bound the number of misaligned games!
How do we align fast MEV games?
First, make more games aligned via commitments.
Privacy achieves meta-game freeness: there is no unaligned game whose payoffs are correlated with the already aligned games, because privacy fixes, by over-approximation, the game that is being played.
Essentially, you are leveraging commitments of information to threaten to “dumb the game down”, so that there is not much that high-speed intelligences can do!
This is Kim
Kim has a box.
The box can mitigate coordination/alignment issues with intelligence.
With MEV, the box is useless or even harmful.
Kim is sad.
We are all Kim
So let’s work together to capture
the intelligence beyond commitments.
@sxysun1
xinyuan@flashbots.net
Backup Slides
Prisoner’s Dilemma
State S = the strategies that agents play in the PD game; a mapping from PlayerID to Strat
Commitment constructor CCom = takes in either a function from PD game strategy to PD game strategy (Strat -> Strat), or a PD game strategy (Strat)
Commitment semantics F: [Com] -> S = checks the list of commitments, applies commitments of type Strat to commitments of type Strat -> Strat, then puts the result into the mapping S
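The three components above can be sketched in Python for a two-player Prisoner's Dilemma. This is one possible reading, not the author's implementation: encoding strategies as the strings "C" and "D", tagging each commitment with its PlayerID, leaving two conditional commitments unresolved, and the helper name `mirror` are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple, Union

Strat = str                                   # "C" (cooperate) or "D" (defect)
Com = Union[Strat, Callable[[Strat], Strat]]  # what CCom can construct
State = Dict[str, Strat]                      # S: PlayerID -> Strat


def F(coms: List[Tuple[str, Com]]) -> State:
    """Apply Strat commitments to the other player's Strat -> Strat commitment."""
    # Unconditional commitments go straight into the state.
    plain: State = {pid: c for pid, c in coms if isinstance(c, str)}
    state: State = dict(plain)
    for pid, c in coms:
        if callable(c):
            # A conditional commitment responds to the other player's plain strategy.
            others = [s for other, s in plain.items() if other != pid]
            if others:
                state[pid] = c(others[0])
            # If no plain strategy exists, this sketch leaves the player unresolved.
    return state


def mirror(opponent: Strat) -> Strat:
    """Hypothetical conditional commitment: play whatever the opponent plays."""
    return opponent


cooperative = F([("Kim", mirror), ("Don", "C")])
punished = F([("Kim", mirror), ("Don", "D")])
```

With Kim committed to `mirror`, Don's defection is answered with defection, so cooperating becomes Don's better choice; this is how the box makes the game work.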
What about blocktime t?