1 of 20

Symbolic MEV Extraction

georgios@paradigm.xyz

@gakonst

2 of 20

Disclaimer: early brainstorm, I’m not a searcher, might be impractical, open-ended discussion

3 of 20

Question: Can we apply techniques used for software security in MEV extraction?

4 of 20

State of MEV Searching (for the most part)

  • Main categories:
    • Atomic / stat / top of block arbitrage
    • Backrunning / sandwiching
    • Liquidations
    • “Long tail”
  • Software Stack:
  • Geodistributed nodes
    • Fast access to state (e.g. Uniswap reserve values)
    • Strong mempool access (Eg. peer directly with AVAX validators)
  • Handwritten bot for each integration
    • “White box” simulations (E.g. native Uniswap in Rust, vs going to the EVM)
  • Reliable submission & closing (e.g. PGA / Flashbots auction)
  • Observability

5 of 20

State of MEV Searching (for the most part)

  • Main categories:
    • Atomic / stat / top of block arbitrage
    • Backrunning / sandwiching
    • Liquidations
    • “Long tail”
  • Software Stack:
  • Geodistributed nodes
    • Fast access to state (e.g. Uniswap reserve values)
    • Strong mempool access (Eg. peer directly with AVAX validators)
  • Handwritten bot for each integration
    • “White box” simulations (E.g. native Uniswap in Rust, vs going to the EVM)
  • Reliable submission & closing (e.g. PGA / Flashbots auction)
  • Observability

6 of 20

Black vs Grey vs White Box MEV Extraction

  • Black Box: No knowledge of the insides of a system
  • Grey Box: Some knowledge
  • White box: Full knowledge

In MEV extraction:

  • White box: custom code per app, maintenance as apps change, reliable
  • Black box = property-based, unclear how effective bc of no knowledge of the app
  • Grey box = property-based, extracts info about the app as it runs

7 of 20

But Ethereum is a Dark Forest

  • Not all bots are application specific
  • “Generalized” frontrunners
    • Simulate others’ transactions
    • Look at state diff + balance changes
    • If any subcall is profitable → copy + frontrun
  • Pros:
    • Can cover a wider range of opportunities / long tail mev discovered by others
  • Cons:
    • Unclear how effective it really is vs specialized
    • Relies on frontrunning / copying, does not discover new opportunities

8 of 20

Can we build a bot that reliably extracts MEV on arbitrary contracts without knowledge of application logic?

9 of 20

Property-based Testing

Testing kinds:

  • Unit testing: test pre-defined scenarios
  • Property-based testing: check that certain conditions hold over multiple inputs

Two ways to use:

  • Defensively: Ensuring your code is working as expected
  • Aggressively: Find counterargument that breaks a property

10 of 20

Property-based MEV Extraction

  1. Define properties for contract solvency: e.g. ETH/token balance does not go down
  2. Monitor for new contract bytecodes
  3. For each contract
    1. Get all its functions from the jumptable
    2. Call them with various arguments many times
    3. Find arguments which break the defined properties
  4. Submit series of transactions required to break the property
  5. ???
  6. $$$

11 of 20

Property-based MEV Extraction

  • Define properties for contract solvency: e.g. ETH/token balance does not go down
  • Monitor for new contract bytecodes
  • For each contract
    • Get all its functions from the jumptable
    • Call them with various arguments many times
    • Find arguments which break the defined properties
  • Submit series of transactions required to break the property
  • ???
  • $$$

What inputs? Random!?

12 of 20

Symbolic Execution

  1. Translate your “concrete” problem into a problem of constraints
  2. SMT solvers can find solutions to problems with constraints better than you can by randomly throwing info at something

13 of 20

Symbolic Execution

  • Concrete: x = 2, y = 3 → execute
  • Fuzzing/Property-based: assign random x,y → execute
  • Symbolic:
    • Track symbolic state via SMT queries
      • x, y, z = 2 * y
    • Track constraints
      • 2 * y == x
      • x <= y + 10
    • Send to solver (z3 / ccv3 / …)
    • Solver outputs TRUE or FALSE + input which breaks

14 of 20

Symbolic EVM Execution

  • Explore all possible states of an EVM execution
  • Used for formally verifying smart contracts
    • AND output the inputs when properties break
    • E.g. during audits
  • Examples
    • HEVM
    • Pakala
    • Mythril
    • Manticore
    • ..?
  • Maybe we can use this to make $?

15 of 20

Symbolic MEV Extraction

  • Monitor new contracts
  • Symbolically execute multiple EVM transactions on that contract → generate constraints
  • Send the constraints to Z3
  • ??
  • $$$

Black box, but much better than random because the constraints identify the structure of the problem, which z3 can solve effectively.

16 of 20

..people can get creative

17 of 20

..people can get creative

18 of 20

The Concrete vs Symbolic Searcher

“The Concrete”

THE SYMBOLIC

Rewrites solidity contracts in rust for speed

Looks for alpha

The alpha finds him

Code is 5 years old

Knows every contract inside out

Hasn’t checked the trends in weeks

Writes his properties once

Only does arbs, liqs & top of block

Maintenance takes all his time

Doesn’t know Solidity

Lets the solver do his job

Writes his own bots

19 of 20

Future: Fast Symbolic MEV Extraction

  • Rust Symbolic EVMs
  • EVM-specific SMT solvers
  • Solvers are largely heuristic based → teach & generalize the heuristics via reinforcement learning? (e.g. train an RNN vs z3)
  • Hardware acceleration: MEV ASICs?! (really SMT ASICs)
    • Probably a bad idea, z3 & others are heuristics based so if they take a long time they will never find the answer, h/w will not help

20 of 20

Thank you for your attention!

Q&A?

georgios@paradigm.xyz

@gakonst