1 of 25

Differential Testing with Foundry

@annascarroll

2 of 25

What is differential testing?

Differential testing checks that two or more implementations of equivalent code behave identically.

Compare 2+ instances of equivalent code

A, B

Provide the same test inputs

x

Ensure all meaningful behavior is identical (not just outputs)

A(x) == B(x)

Note: inputs are usually assumed to be generated (e.g. fuzzed), but this is not required
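
The A(x) == B(x) idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `popcount_reference`, `popcount_optimized`, `observe`, and `differential_test` are hypothetical and do not come from the talk. Both implementations receive the same generated inputs, and the comparison covers raised errors as well as return values.

```python
import random


def popcount_reference(x: int) -> int:
    """Readable reference: count set bits one at a time."""
    if x < 0:
        raise ValueError("negative input")
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


def popcount_optimized(x: int) -> int:
    """Candidate: clear the lowest set bit on each iteration."""
    if x < 0:
        raise ValueError("negative input")
    count = 0
    while x:
        x &= x - 1
        count += 1
    return count


def observe(fn, x):
    """Capture the observable behavior of fn(x): a result or an error type."""
    try:
        return ("ok", fn(x))
    except Exception as exc:
        return ("error", type(exc).__name__)


def differential_test(a, b, runs=1000, seed=0):
    """Feed the same generated inputs to a and b and compare their behavior."""
    rng = random.Random(seed)
    for _ in range(runs):
        x = rng.randint(-10, 2**64)  # a few negatives exercise the error path
        assert observe(a, x) == observe(b, x), f"behavior diverged for x={x}"
    return runs
```

Comparing the `observe` tuples rather than raw return values is what makes "not just outputs" concrete here: an implementation that returns where the other raises is reported as a divergence.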

3 of 25

4 of 25

Differential testing smart contracts

Candidate implementations do not need to be written in the same language, nor even compile to the same VM

Test Solidity against Python

for EVM-agnostic code like Math or Merkle Tree libraries

Test Solidity against Solidity

for rich, stateful, EVM-specific applications - emits, reverts, etc.

Test Solidity against Huff, Vyper, Edge

for personal edification

5 of 25

with Foundry…

All the ergonomics of Foundry that Solidity devs are used to
Easy combination with other testing patterns

Local tests, Fork tests
Explicit inputs, Fuzzed inputs

Cheatcode `ffi` enables testing with non-EVM implementations
Can stick to native testing patterns for EVM-only implementations
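
A reference implementation called through `ffi` is typically a small script that takes its inputs as command-line arguments and prints a hex string the Solidity test can decode. A minimal Python sketch of such a script follows, assuming a single `uint256` input and output; the function names and the choice of integer square root as the function under test are hypothetical.

```python
import math
import sys


def reference_sqrt(x: int) -> int:
    """Reference integer square root (rounded down)."""
    return math.isqrt(x)


def encode_uint256(value: int) -> str:
    """Encode one uint256 as a 0x-prefixed, 32-byte big-endian hex word."""
    if not 0 <= value < 2**256:
        raise ValueError("value does not fit in uint256")
    return "0x" + value.to_bytes(32, "big").hex()


def main(argv):
    # argv[1] is the decimal input passed by the caller
    x = int(argv[1])
    # Write the encoded result without a trailing newline
    sys.stdout.write(encode_uint256(reference_sqrt(x)))


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv)
```

On the Solidity side, the test would pass the command and its arguments to `vm.ffi` and decode the returned bytes with `abi.decode`. Note that `ffi` has to be enabled in the Foundry configuration before the cheatcode will run.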

6 of 25

EVM-Specific Use Cases

Gas Optimization
Validating Upgrades

7 of 25

Gas Optimization

Write an unoptimized reference implementation in idiomatic Solidity
Transpile the reference into a highly optimized final version
Differential test the unoptimized reference vs. the optimized final
“The Seaport Pattern”

8 of 25

Gas Optimization - Benefits

Developers can design & build the app more quickly using Solidity, then focus on optimization later
End users get a hyper-optimized contract on mainnet
Reference implementation becomes a valuable resource

Auditors can audit the reference and the final for higher confidence
Integrators & devs with less context on Solidity can understand the code via the reference
Teaching tool to learn Yul / Assembly by studying the reference & the final

Differential testing builds confidence that optimizations didn’t break functionality

9 of 25

Validating Upgrades

Write fork tests against production contracts pre-upgrade
Write fork tests against production contracts post-upgrade
Differential testing builds confidence that new code doesn’t interact badly with state on mainnet
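
The pre-upgrade vs. post-upgrade comparison can be pictured with a rough Python analogy. This is not Foundry fork-test code: `transfer_v1`, `transfer_v2`, and `replay` are hypothetical names, and a plain dictionary stands in for mainnet state. The same calls are replayed against private copies of one state snapshot, once per version of the logic, and the outcomes and final states are compared.

```python
import copy


def transfer_v1(state, sender, recipient, amount):
    """Pre-upgrade logic (hypothetical)."""
    balances = state["balances"]
    if balances.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    balances[sender] = balances.get(sender, 0) - amount
    balances[recipient] = balances.get(recipient, 0) + amount


def transfer_v2(state, sender, recipient, amount):
    """Post-upgrade logic (hypothetical): same rules, restructured."""
    balances = state["balances"]
    available = balances.get(sender, 0)
    if amount > available:
        raise ValueError("insufficient balance")
    balances[sender] = available - amount
    balances[recipient] = balances.get(recipient, 0) + amount


def replay(logic, snapshot, calls):
    """Apply calls to a private copy of the snapshot; record outcomes and final state."""
    state = copy.deepcopy(snapshot)
    outcomes = []
    for call in calls:
        try:
            logic(state, *call)
            outcomes.append("ok")
        except ValueError as exc:
            outcomes.append(f"revert: {exc}")
    return outcomes, state


snapshot = {"balances": {"alice": 100, "bob": 5}}
calls = [("alice", "bob", 30), ("bob", "carol", 50), ("carol", "alice", 0)]
assert replay(transfer_v1, snapshot, calls) == replay(transfer_v2, snapshot, calls)
```

Starting both replays from the same snapshot is the point of the exercise: a divergence then reflects the new code's interaction with existing state, not a difference in setup.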

10 of 25

Coding Patterns

“Naive” pattern
Seaport pattern
My proposed pattern >:~)

11 of 25

Naive pattern

12 of 25

Naive pattern

Probably the first thing in people's minds when they picture Differential testing
Gets clunky really fast
To make richer assertions (expectEmit, expectRevert, etc.), need to repeat the same assertion for each contract
Lots of boilerplate

13 of 25

Seaport pattern

14 of 25

15 of 25

Seaport pattern: Benefits

A lot cleaner than the Naive pattern, but still less readable than normal Foundry tests
Under the hood, the workarounds are inscrutable to most devs
Inaccessible

16 of 25

Proposed pattern

17 of 25

Proposed pattern

18 of 25

Proposed pattern: Input Equality

Explicitly defined inputs are automatically equal
To make fuzz inputs equal… side quest into Foundry fuzzer

19 of 25

Foundry Fuzzer

The Foundry fuzzer is “smart” about producing inputs
It doesn’t just throw meaningless data at tests
It adds values from the test setup to a dictionary

Any values written to storage (SSTORE)
Any values pushed to the stack (PUSH opcodes)

Dictionary data serves as input for an algorithm which produces test inputs
More bugs and edge cases can be caught this way

Algorithm that produces inputs can be made deterministic by adding a salt
If input dictionary is the same && a salt is provided, test inputs are the same
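
The relationship between dictionary, salt, and inputs can be shown with a toy model in Python. This is not Foundry's algorithm; `make_inputs` and its `dictionary_weight` parameter are invented for illustration. Each input is either picked from the dictionary or generated at random, and a fixed seed makes the whole sequence reproducible.

```python
import random


def make_inputs(dictionary, seed, runs, dictionary_weight=0.4, bits=256):
    """Toy generator: each input is a dictionary value or a random value."""
    rng = random.Random(seed)
    # Sort so the result does not depend on set iteration order
    values = sorted(dictionary)
    inputs = []
    for _ in range(runs):
        if values and rng.random() < dictionary_weight:
            inputs.append(rng.choice(values))
        else:
            inputs.append(rng.getrandbits(bits))
    return inputs


# Same dictionary and same seed: identical inputs
assert make_inputs({1, 2}, 1337, 50) == make_inputs({2, 1}, 1337, 50)
# Same seed but a different dictionary: the inputs diverge
assert make_inputs({1, 2}, 1337, 50) != make_inputs({3, 4}, 1337, 50)
```

The second assertion illustrates why the dictionary matters for input equality: two implementations that store or push different constants would feed different dictionaries to the generator, even with the same seed.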

20 of 25

Proposed pattern: Input Equality

Explicit developer-defined inputs are automatically the same
To make fuzz inputs the same…

Provide a salt
Probably exclude values pushed to stack
Maybe exclude values from storage
Make sure FUZZ_RUNS is configured as high as you want it - inputs will be exactly the same every time you run the tests

[fuzz]
seed = "1337"
include_push_bytes = false
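
The effect of a fixed seed can be illustrated outside Foundry. The following is a rough Python analogy, not the author's code: `add_reference`, `add_optimized`, and `run_suite` are hypothetical names. One ordinary property test is written once and then run separately against each implementation with the same seed, so every run sees the same input sequence.

```python
import random

MASK = 2**256 - 1


def add_reference(a, b):
    """Reference: checked addition that fails on overflow."""
    total = a + b
    if total > MASK:
        raise OverflowError("uint256 overflow")
    return total


def add_optimized(a, b):
    """Candidate: same behavior, using a wrapped sum and a comparison."""
    total = (a + b) & MASK
    if total < a:
        raise OverflowError("uint256 overflow")
    return total


def run_suite(impl, seed, runs=256):
    """Run one ordinary property test against impl; return the inputs it saw."""
    rng = random.Random(seed)
    seen = []
    for _ in range(runs):
        a, b = rng.getrandbits(256), rng.getrandbits(256)
        seen.append((a, b))
        try:
            result = impl(a, b)
        except OverflowError:
            # A failure is only acceptable when the true sum overflows
            assert a + b > MASK
        else:
            assert result == a + b
    return seen


# Two separate runs with one seed exercise both implementations on identical inputs
assert run_suite(add_reference, 1337) == run_suite(add_optimized, 1337)
```

The test body itself never mentions a second implementation, which is what keeps it as readable as a normal test; the equality of inputs comes entirely from the shared seed.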

21 of 25

Proposed pattern: Drawbacks

Excluding dictionary values makes the fuzzer a bit “dumber”

Could surface fewer bugs

Including dictionary values means the inputs might no longer be the same between the test runs

i.e. the runs no longer conform to the strict definition of Differential testing

To mitigate, could run the test suite twice with different configurations to get the best of both worlds

22 of 25

Proposed pattern: Benefits

Very readable
Quick & easy to try, even with existing test suites
Pragmatic
Approachable for all levels
Could radically reduce the barrier-to-entry for trying Differential testing!

23 of 25

Acknowledgements

Kudos to emo.eth, 0age, et al. for work on Differential testing in Seaport

Thank u evalir for answering Qs about Foundry fuzzer :)

Thank u Prestwich, Jenny Pollack, aleph_v for being awesome sounding boards <3

24 of 25

References

Differential Testing - Foundry Book
Differential testing - Wikipedia
Seaport Discussion #809 · Understanding the "DifferentialTest" test contract, particularly the "stateless" modifier
Seaport DifferentialTest.sol
Seaport FulfillAdvancedOrderCriteria.t.sol#L45
EnbangWu/differential-fuzzing (“Naive pattern”)
Paradigm Gradual Dutch Auctions differential tests (Python vs. Solidity)
Differential Test | Testing with Foundry - YouTube (Python vs. Solidity)

25 of 25

Questions?

Twitter: @annascarroll