EVM: Technical walkthrough
Transaction and Gas
- Signature is not present.
- Three transaction types: Legacy, AccessList, and EIP-1559.
- TransactTo is either empty (contract creation) or a contract address.
- Gas is introduced to limit execution; GasPrice is used to prioritize transactions (EIP-1559).
Block
There are additional fields, but they are not used in EVM execution: OmmerHash, ParentHash, State/Transaction/Receipt Root, Bloom, ExtraData, MixHash/Nonce.
BlockEnv and TxEnv can be seen as constant fields during EVM execution. Additional configuration can be found in CfgEnv, which contains ChainId and SpecId.
Beige paper: https://github.com/chronaeon/beigepaper/blob/master/beigepaper.pdf
Database interface
All Block/Transaction data is contained inside the environment struct.
EVM: Host and Interpreter
EVM Diagram
The Interpreter executes contracts and calls the Host for the information it needs, for example to call another contract.
If a revert happens, the contract call stops and all of its changes are reverted; the parent caller continues its execution. SELFDESTRUCT also stops the contract call, but its changes are kept.
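The rollback of a failed call can be pictured as a journal of undo entries. The sketch below is only an illustration under simplifying assumptions: `JournaledState`, its `u64` keys and values, and the method names are hypothetical and are not the actual types used by the EVM implementation discussed here.

```rust
use std::collections::HashMap;

/// Hypothetical journaled key-value state (illustration only).
#[derive(Default)]
pub struct JournaledState {
    storage: HashMap<u64, u64>,
    /// Each entry records a key and the value it held before the write.
    journal: Vec<(u64, Option<u64>)>,
}

impl JournaledState {
    /// Marks the start of a call; returns the journal length to roll back to.
    pub fn checkpoint(&self) -> usize {
        self.journal.len()
    }

    /// Writes a value and remembers the previous one in the journal.
    pub fn store(&mut self, key: u64, value: u64) {
        let previous = self.storage.insert(key, value);
        self.journal.push((key, previous));
    }

    /// Reads a value; missing keys read as zero.
    pub fn load(&self, key: u64) -> u64 {
        self.storage.get(&key).copied().unwrap_or(0)
    }

    /// Undoes every write made since `checkpoint`, newest first.
    pub fn revert_to(&mut self, checkpoint: usize) {
        while self.journal.len() > checkpoint {
            if let Some((key, previous)) = self.journal.pop() {
                match previous {
                    Some(value) => {
                        self.storage.insert(key, value);
                    }
                    None => {
                        self.storage.remove(&key);
                    }
                }
            }
        }
    }
}
```

The parent takes a checkpoint before the child call starts and calls `revert_to` only if the child fails, so the parent's own earlier writes survive.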
Interpreter
Interpreter contains:
Interpreter machine in code
Just look and marvel at that Rust code
OpCodes
Can be roughly separated into:
Full list here: https://github.com/wolflo/evm-opcodes and https://www.evm.codes/
CREATE And CREATE2
CREATE and CREATE2 are opcodes used to create a contract.
They deterministically derive the address where the bytecode is going to be stored. The bytecode is received as the return value of the Interpreter after the input code is executed.
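The only difference between them is how the address of the contract is derived:
- CREATE: the last 20 bytes of keccak256(rlp([sender, nonce]))
- CREATE2: the last 20 bytes of keccak256(0xff ++ sender ++ salt ++ keccak256(init_code))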
Call OpCodes
The variants of CALL differ in the call context they run with. The call context contains: Address, Caller, ApparentValue. The Address decides which account's storage SLOAD and SSTORE operate on.
DELEGATECALL was a new opcode that was a bug fix for CALLCODE which did not preserve msg.sender and msg.value. If Alice invokes Bob who does DELEGATECALL to Charlie, the msg.sender in the DELEGATECALL is Alice (whereas if CALLCODE was used the msg.sender would be Bob).
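The Alice/Bob/Charlie example can be sketched as a function that builds the child call context from the parent's. This is a rough sketch under assumptions: `CallKind` and `child_context` are hypothetical names, the value is a `u128` instead of a full 256-bit word, and STATICCALL is left out.

```rust
/// 20-byte account address (illustrative alias).
pub type Address = [u8; 20];

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CallContext {
    /// Account whose storage SLOAD and SSTORE operate on.
    pub address: Address,
    /// What the callee sees as msg.sender.
    pub caller: Address,
    /// What the callee sees as msg.value.
    pub apparent_value: u128,
}

#[derive(Debug, Clone, Copy)]
pub enum CallKind {
    Call,
    CallCode,
    DelegateCall,
}

/// Builds the context of a child call made from `parent` to the code at `target`.
pub fn child_context(
    parent: &CallContext,
    kind: CallKind,
    target: Address,
    value: u128,
) -> CallContext {
    match kind {
        // Plain CALL: run the target's code on the target's own storage.
        CallKind::Call => CallContext {
            address: target,
            caller: parent.address,
            apparent_value: value,
        },
        // CALLCODE: keep the parent's storage, but the parent becomes msg.sender.
        CallKind::CallCode => CallContext {
            address: parent.address,
            caller: parent.address,
            apparent_value: value,
        },
        // DELEGATECALL: keep the parent's storage, msg.sender and msg.value.
        CallKind::DelegateCall => CallContext {
            address: parent.address,
            caller: parent.caller,
            apparent_value: parent.apparent_value,
        },
    }
}
```

With Bob's frame as `parent` (address Bob, caller Alice), `DelegateCall` to Charlie yields caller Alice, while `CallCode` yields caller Bob, matching the description above.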
More info: https://ethereum.stackexchange.com/questions/3667/difference-between-call-callcode-and-delegatecall
Logs
Logs are a way to record that something happened while executing a smart contract. They give smart contract developers a convenient way to notify users or machines about a specific event.
A log contains:
Gas
Every Opcode is priced in terms of Gas. Every memory extension, DB load or store has some dynamic or base gas calculation.
FeeSpend represents GasUsed*GasPrice, and it is what you pay to the miner when you execute a transaction.
EIP-1559 is an improvement that introduced the BaseFee, which is taken from FeeSpend and burned (destroyed); the rest of the fee is transferred to the miner that created the block. Our GasPrice is then calculated as BaseFee+PriorityFee.
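As a worked example of this arithmetic, the following sketch splits the fee of one transaction into the burned part and the miner's part. The names `FeeSplit` and `split_fee` are hypothetical, and the sketch assumes the sender's maximum fee is high enough that GasPrice is exactly BaseFee+PriorityFee.

```rust
/// Fee breakdown for one transaction, in wei (illustrative type).
#[derive(Debug, PartialEq, Eq)]
pub struct FeeSplit {
    /// GasUsed * GasPrice, the total the sender pays.
    pub fee_spent: u128,
    /// GasUsed * BaseFee, destroyed.
    pub burned: u128,
    /// GasUsed * PriorityFee, paid to the block's miner.
    pub to_miner: u128,
}

/// Splits the fee assuming GasPrice = BaseFee + PriorityFee.
pub fn split_fee(gas_used: u64, base_fee: u128, priority_fee: u128) -> FeeSplit {
    let gas_used = u128::from(gas_used);
    let gas_price = base_fee + priority_fee;
    let fee_spent = gas_used * gas_price;
    let burned = gas_used * base_fee;
    FeeSplit {
        fee_spent,
        burned,
        to_miner: fee_spent - burned,
    }
}
```

For a 21,000 gas transfer with a base fee of 100 and a priority fee of 2, `split_fee(21_000, 100, 2)` gives a total of 2,142,000, of which 2,100,000 is burned and 42,000 goes to the miner.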
There is also a way to get a refund on gas: GasRefund decreases the gas used. It is used by SSTORE and SELFDESTRUCT (the idea was okay, but it was misused and will probably be removed in the future).
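How the refund lowers the final gas bill can be sketched in a few lines. The refund is capped at a fraction of the gas used: one half before the London fork and one fifth from London on (EIP-3529). The function name and the `max_refund_quotient` parameter are assumptions made for this illustration.

```rust
/// Returns the gas actually charged after applying the capped refund.
/// `max_refund_quotient` is 2 before London and 5 from London on; it must be non-zero.
pub fn gas_after_refund(gas_used: u64, gas_refund: u64, max_refund_quotient: u64) -> u64 {
    // The refund can never exceed gas_used / quotient.
    let cap = gas_used / max_refund_quotient;
    gas_used - gas_refund.min(cap)
}
```

With 100,000 gas used and 60,000 accumulated refund, the charge is 50,000 under the one-half cap and 80,000 under the one-fifth cap.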
Traces
Tracing is a utility for debugging and profiling contract execution. A trace contains every step of execution together with its opcode, gas used, memory, and stack.
It can be tied to the Solidity compiler output to get a full view of what is happening.
Call traces are even more useful for some use cases; they show which contracts are called.
Inspector
This is an implementation detail, but to obtain traces we need some kind of hooks that allow us to inspect the internal state at runtime.
Forge (an upcoming tool for Solidity devs) uses something similar with Sputnik to obtain traces and to apply cheatcodes that help with debugging.
It mostly hooks into the Host and into every step inside the Interpreter.
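A per-step hook can be sketched as a trait that the interpreter loop calls before each opcode. Everything here is a toy written for illustration: the `Inspector` trait below has a single hypothetical `step` method, and `run` understands only PUSH1 (0x60), ADD (0x01) and STOP (0x00) on a `u64` stack, charging 3 gas for PUSH1 and ADD.

```rust
/// Hypothetical hook interface: called before every executed opcode.
pub trait Inspector {
    fn step(&mut self, pc: usize, opcode: u8, gas_remaining: u64);
}

/// Records (pc, opcode, gas remaining) for each step.
#[derive(Default)]
pub struct StepTracer {
    pub steps: Vec<(usize, u8, u64)>,
}

impl Inspector for StepTracer {
    fn step(&mut self, pc: usize, opcode: u8, gas_remaining: u64) {
        self.steps.push((pc, opcode, gas_remaining));
    }
}

/// Toy interpreter loop; returns the final stack.
pub fn run(code: &[u8], gas_limit: u64, inspector: &mut dyn Inspector) -> Vec<u64> {
    let mut stack: Vec<u64> = Vec::new();
    let mut gas = gas_limit;
    let mut pc = 0;
    while pc < code.len() {
        let opcode = code[pc];
        // The hook sees the state before the opcode runs.
        inspector.step(pc, opcode, gas);
        match opcode {
            0x60 => {
                // PUSH1: push the next byte.
                if gas < 3 || pc + 1 >= code.len() {
                    break;
                }
                gas -= 3;
                stack.push(u64::from(code[pc + 1]));
                pc += 2;
            }
            0x01 => {
                // ADD: pop two items, push their wrapping sum.
                if gas < 3 || stack.len() < 2 {
                    break;
                }
                gas -= 3;
                let a = stack.pop().unwrap();
                let b = stack.pop().unwrap();
                stack.push(a.wrapping_add(b));
                pc += 1;
            }
            // STOP (0x00) or any opcode this toy does not support.
            _ => break,
        }
    }
    stack
}
```

Because the hook is a trait object, the same loop can drive a tracer, a profiler, or a cheatcode handler without changing the interpreter.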
Interpreter code exploration
Host
Host contains:
Subroutine (State and reverts)
It contains:
Host Trait
Precompile Name | Address | Type |
Secp256k1::ecrecovery | 0x00…01 | Curve signature recovery |
sha256 | 0x00…02 | Hash |
ripemd160 | 0x00…03 | Hash |
Identity | 0x00…04 | Utility |
bigModExp | 0x00…05 | Math |
Bn128::add | 0x00…06 | Curve |
Bn128::mul | 0x00…07 | Curve |
Bn128::pair | 0x00…08 | Curve |
Blake2 | 0x00…09 | Hash |
More info: https://docs.klaytn.com/smart-contract/precompiled-contracts
Host code exploration
Hard Forks
More on it here: https://ethereum.org/en/history/
Optimizations
Use u64 for gas calculations, although the spec defines gas as U256: spending a U256 amount of gas is not something that is going to happen. For comparison, the current Ethereum block gas limit is 30M.
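A u64 gas counter can be sketched as below. The `Gas` struct and its methods are hypothetical names for illustration; the point is only that a checked u64 addition is enough to detect running out of gas.

```rust
/// Minimal u64 gas counter (illustrative).
pub struct Gas {
    limit: u64,
    used: u64,
}

impl Gas {
    pub fn new(limit: u64) -> Self {
        Gas { limit, used: 0 }
    }

    /// Charges `cost`; returns false (and charges nothing) when out of gas.
    pub fn record_cost(&mut self, cost: u64) -> bool {
        match self.used.checked_add(cost) {
            Some(total) if total <= self.limit => {
                self.used = total;
                true
            }
            // Either the sum overflowed u64 or it exceeded the limit.
            _ => false,
        }
    }

    pub fn remaining(&self) -> u64 {
        self.limit - self.used
    }
}
```

A cost that would overflow u64 is simply treated as out of gas, which is safe because no real limit comes anywhere near that range.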
Use u64 for memory calculations as well; U256 does not make sense there. There is no hard limit on the memory used, but you pay gas for every 32-byte word you use, and that acts as a soft limit. Memory is usually specified as offset+size, and you pay for memory up to the `max(offset+size)` reached.
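The soft limit comes from the memory cost formula, which charges 3 gas per 32-byte word plus a quadratic term of words*words/512. The sketch below computes it with u64 arithmetic; the function names are hypothetical, and overflow is reported as `None` so the caller can treat it as out of gas.

```rust
/// Number of 32-byte words needed to cover offset+size (0 when size is 0).
pub fn words_for(offset: u64, size: u64) -> Option<u64> {
    if size == 0 {
        return Some(0);
    }
    let end = offset.checked_add(size)?;
    // Round up to whole words.
    Some(end / 32 + u64::from(end % 32 != 0))
}

/// Total gas cost of having `words` words of memory.
pub fn memory_cost(words: u64) -> Option<u64> {
    let linear = words.checked_mul(3)?;
    let quadratic = words.checked_mul(words)? / 512;
    linear.checked_add(quadratic)
}

/// Gas to pay for growing memory from `current_words` to cover offset+size.
pub fn expansion_cost(current_words: u64, offset: u64, size: u64) -> Option<u64> {
    let needed = words_for(offset, size)?;
    if needed <= current_words {
        // Already paid for: only growth past the previous maximum costs gas.
        return Some(0);
    }
    Some(memory_cost(needed)? - memory_cost(current_words)?)
}
```

Growing empty memory to 1024 bytes (32 words) costs 3*32 + 32*32/512 = 98 gas, and touching memory below the current maximum costs nothing extra.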
Ethereum uses big-endian encoding, and all PUSH values are in big-endian format. This can be slow on most machines, which are little-endian and work natively on u64 items. So in the EVM a stack item is basically a U256 stored as [u64;4] (a list of four u64 numbers), and we always convert values back and forth.
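The back-and-forth conversion can be sketched with the standard library alone. The function names are hypothetical, and the limb order (least significant u64 first) is an assumption chosen for this illustration.

```rust
/// Converts 32 big-endian bytes into four u64 limbs, least significant limb first.
pub fn be_bytes_to_limbs(bytes: [u8; 32]) -> [u64; 4] {
    let mut limbs = [0u64; 4];
    for (i, chunk) in bytes.chunks_exact(8).enumerate() {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        // The first 8 bytes are the most significant, so they go in the last limb.
        limbs[3 - i] = u64::from_be_bytes(buf);
    }
    limbs
}

/// Converts four u64 limbs (least significant first) back to 32 big-endian bytes.
pub fn limbs_to_be_bytes(limbs: [u64; 4]) -> [u8; 32] {
    let mut bytes = [0u8; 32];
    for (i, limb) in limbs.iter().rev().enumerate() {
        bytes[i * 8..(i + 1) * 8].copy_from_slice(&limb.to_be_bytes());
    }
    bytes
}
```

Arithmetic then runs on the native u64 limbs, and the byte swap is paid only at the edges, for example when reading a PUSH immediate or writing a word to memory.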
Q&A