Ethereum Basics
Joseph Bonneau
2017 Summer School on Cyber & Computer Security
Technion
Talks on Ethereum today
Morning:
- Basic overview
- hands-on demo
Afternoon:
- Security issues
- Proof-of-stake
The road to Ethereum
“Smart contracts” conceptualized by Szabo in 1994
A smart contract is a computerized transaction protocol that executes the terms of a contract. The general objectives are to satisfy common contractual conditions (such as payment terms, liens, confidentiality, and even enforcement), minimize exceptions both malicious and accidental, and minimize the need for trusted intermediaries. Related economic goals include lowering fraud loss, arbitrations and enforcement costs, and other transaction costs.
-Nick Szabo “The Idea of Smart Contracts”
A “dumb contract” example: pay for a hash pre-image
Alice will reveal to Bob a value x such that SHA-256(x) = 0x2a...
In exchange, Bob will pay US$10.
If Alice does not reveal by July 1, 2017, then she will pay a penalty of US$1 per day that she is late, up to US$100.
Signed:
Traditional contracts vs. smart contracts
| Traditional | Smart |
specification | Natural language, “legalese” | Code |
assent | Signatures | Digital signatures |
dispute resolution | Judges, arbitrators | Decentralized platform |
nullification | By judges, politicians | ???? |
punishment | Fines, imprisonment | Monetary, requires deposits |
Recall: Bitcoin has a simple scripting language
"tx_out":[
{
"value":"10.12287097",
"scriptPubKey":"OP_DUP OP_HASH160 69e...3d42e OP_EQUALVERIFY OP_CHECKSIG"
},
...
]
“Spending” requires specifying half of a script
OP_DUP
OP_HASH160
<69e02e18...>
OP_EQUALVERIFY
OP_CHECKSIG
<30440220...>
<0467d2c9...>
scriptSig
scriptPubKey
TO VERIFY: Concatenated script must execute completely with no errors
Bitcoin script execution example
<sig> <pubKey> OP_DUP OP_HASH160 <pubKeyHash?> OP_EQUALVERIFY OP_CHECKSIG
<sig>
✓
<pubKey>
<pubKey>
<pubKeyHash?>
<pubKeyHash>
true
Bitcoin scripting language (“Script”)
Design goals
image via Jessie St. Amand
I am not impressed
Some useful contracts can be done in Bitcoin
Extending Bitcoin functionality
Bitcoin script left developers wanting more
By adding a few opcodes to Bitcoin script, we could support:
Namecoin was the first fork of Bitcoin
Goal: distributed naming, similar functionality to DNS
3 new opcodes:
Namecoin introduces three new opcodes
NAME_NEW: H(r, “jbonneau”)
NAME_FIRST_UPDATE: r, “jbonneau”, {"ip" : "68.178.254.235"}
NAME_UPDATE: r, “jbonneau”, {"ip6" : "2001:4860:0:1001::68"}
12 block delay (frontrunning)
Namecoin introduces new global state
NAME_NEW y
NAME_FIRST_UPDATE jbonneau,r; 68...
NAME_UPDATE jbonneau, 2001:...
google → 172.217.18.110 [owner: Kg]
reddit → 151.101.65.140 [owner: Kr]
google → 172.217.18.110 [owner: Kg]
reddit → 151.101.65.140 [owner: Kr]
y → {pending} [owner: Kj]
google → 172.217.18.110 [owner: Kg]
reddit → 151.101.65.140 [owner: Kr]
jbonneau → 68.178.254.235 [owner: Kj]
google → 172.217.18.110 [owner: Kg]
reddit → 151.101.65.140 [owner: Kr]
jbonneau → 2001:... [owner: Kj]
Namecoin introduces new fees, incentives
Several requirements for new functionality
new addition | purpose | in Namecoin | in Futurecoin |
Global state | Track app-specific data | name → value map | list of markets, bets in each market |
Opcodes | Express updates to global state | NAME_NEW etc. | OPEN_MARKET etc. |
Fees | Limit computation & edits to global state | Registration fees to limit squatting, maintenance fees | transaction fees per open market, exchange |
Bitcoin left these three implicit
new addition | purpose | in Bitcoin |
Global state | Track app-specific data | UTXO set |
Opcodes | Express updates to global state | transactions |
Fees | Limit computation & reads/writes to global state | not required |
Bitcoin scripts only succeed/fail. No side effects on global state
Miners can produce blocks which are very costly to verify
This state is implicit
Replicated State Machines
Replicated state machines are the classic abstraction
“Blockchain” is an ordered list of inputs w/consensus
consensus info | input |
nonce=0x456... | A→B 17 signed(Alice) |
consensus info | input |
nonce=0x123... | B→C 11 signed(Bob) |
consensus info | input |
∅ | ∅ |
“State” is really just a compression of history
Inputs: Ø
Outputs: 25→Alice
Inputs: 1[0]
Outputs: 17→Bob, 8→Alice
SIGNED(Alice)
time
is this valid?
Inefficient: Scan blockchain to check for validity
Inputs: 2[0]
Outputs: 8→Carol, 7→Bob
SIGNED(Bob)
Inputs: 2[1]
Outputs: 6→David, 2→Alice
SIGNED(Alice)
1
2
3
4
Efficient: track UTXO set
{1[0]: 25, A}
{2[0]: 17, B; 2[1]: 8, A}
{2[1]: 8, A; 3[0]: 8, C, 3[1]: 7,B}
Blockchains may include explicit state commitments
consensus info | input | state commitment |
nonce=0x456... | A→B 17 signed(Alice) | {A: 33, B:17} |
consensus info | input | state commitment |
nonce=0x123... | B→C 11 signed(Bob) | {A: 33, B:6, C: 11} |
consensus info | input | state commitment |
∅ | ∅ | s = {A: 50} |
Explicit state offers many advantages
consensus info | input | state commitment |
nonce=0x123... | B→C 11 signed(Bob) | {A: 33, B:6, C: 11} |
Ethereum: A universal RSM
To get Turing-completeness:
Include arbitrary programs
Interpret programs
Universality brings on classic OS problems
Ethereum in one slide
address | code | storage | balance | nonce |
from | sig, n | to | data | value | startgas | gasprice |
may affect the state of any address
The Ethereum blockchain structure
prev
nonce
difficulty
miner
extra
height
time
state root
transaction root
receipt root
address | code | storage | balance | nonce |
The Ethereum blockchain structure
prev
nonce
difficulty
miner
extra
height
time
state root
transaction root
receipt root
from | sig, n | to | data | value | startgas | gasprice |
The Ethereum blockchain structure
prev
nonce
difficulty
miner
extra
height
time
state root
transaction root
receipt root
final state | gas used | log output | log bloom |
Ethereum addresses can be accounts or contracts
address | code | storage | balance | nonce |
| account | contract |
address | H(pub_key) | H(creator, nonce) |
code | ∅ | EVM code |
storage | ∅ | Merkle storage root |
balance | ETH balance | |
nonce | #transaction sent | |
Volatile fields
Note: no UTXOs in Ethereum
Three* types of transaction in Ethereum
type | from | sig, n | to | data | value | startgas | gasprice |
| |||||||
create | creator | sig, n | ∅ | code | start_bal | 53000 | ? |
Three* types of transaction in Ethereum
type | from | sig, n | to | data | value | startgas | gasprice |
| |||||||
create | creator | sig, n | ∅ | code | start_bal | 53000 | ? |
| |||||||
send | sender | sig, n | receiver | ∅ | transfer_val | 30000 | ? |
Three* types of transaction in Ethereum
type | from | sig, n | to | data | value | startgas | gasprice |
| |||||||
create | creator | sig, n | ∅ | code | start_bal | 53000 | ? |
| |||||||
send | sender | sig, n | receiver | ∅ | transfer_val | 30000 | ? |
| |||||||
call | caller | sig, n | contract | f, args | transfer_val | ? | ? |
Example:
NameCoin in Ethereum
Ethereum code written in Solidity, compiled to EVM
Ethereum VM Bytecode
Stack Language
LLL
Viper, Serpent
Solidity
Functional, macros,
looks like scheme
Types, invariants, looks like Javascript
Looks like python
Looks like Forth.
Defined in Yellowpaper
Ethereum
Virtual
Machine
EVM is stack-based, like BTC script
Features
PUSH1 0
CALLDATALOAD
SLOAD
NOT
PUSH1 9
JUMPI
STOP
JUMPDEST
PUSH1 32
CALLDATALOAD
PUSH1 0
CALLDATALOAD
SSTORE
EVM memory model offers a lot of space
Storage: {0,1}256→{0,1}256 map (persistent)
Memory: {0,1}256→{0,1}256 map (volatile between tx)
Storage in Ethereum is very expensive. Limiting memory use is critical
EVM provides basic API for I/O
Input:
Output:
Subtleties to contract calls
Efficient map commitments
How can Ethereum commit to so much state?
Recall: storage is a {0,1}256→{0,1}256 map
Requirements:
Key insight:
Optimize for efficient storage of zeroized values
Non-solution: use a binary tree, paths encode address
Alice
Bob
Carol
David
001 | Alice |
100 | Bob |
110 | Carol |
111 | David |
0
0
0
0
0
0
0
1
1
1
1
1
1
1
Problem:
Requires O(2n) computation
Optimization #1: pre-compute all sizes of empty tree
Empty[2]
Empty[1]
Alice
Bob
Empty[1]
Carol
David
001 | Alice |
100 | Bob |
110 | Carol |
111 | David |
0
0
0
0
0
0
1
1
1
1
1
1
This almost works!
Proofs are length O(lg n)
Optimization #2: shrink all single subtrees; store prefix
“01”, Alice
“0”, Bob
“”, Carol
“”, David
001 | Alice |
100 | Bob |
110 | Carol |
111 | David |
0
0
0
1
1
1
This works!
Proofs are length O(lg k)
“kv node”
“diverge node”
Ethereum’s “Merkle PATRICIA tree” has a few quirks
Gas and transaction limits
Ethereum is like Ryanair: pay to board, then keep paying
Gas in Ethereum is a necessary evil
Observe:
Bitcoin also employs a (crude) means to pay for resources consumed
Every operation has a fixed gas cost
| opcodes | gas cost |
Basic operations | ADD, MUL, PUSH, JUMP | 2-10 |
Storage read | SLOAD | 200 |
Storage write | SSTORE | 5000 |
Storage write (from zero) | SSTORE | 20000 |
Storage zeroize | SSTORE | -10000 |
Contract call | CALL, CODECALL, etc. | 700 |
Transaction overhead | n/a | 21000 |
Contract creation | n/a | 32000 |
Contract destruction | SELFDESTRUCT | -19000 |
Gas metering is complex
Out-of-gas exceptions are bad news
Callers can choose how much gas to send
A:
function a():
assert(msg.gas == 100);
x = B.b.gas(10)()
return x + “ World!”
B:
function b():
assert msg.gas == 10
y = C.c.gas(5)()
assert(y == 0);
// out of gas
return “Hello”
C:
function c():
assert(msg.gas == 5);
while (true) {
Loop
}
return “Bonjour”
Out of gas
“Hello”
100
10
5
“Hello World!”
Gas today is typically 5⨉10-9 ether = 9⨉10-7 USD
| opcodes | gas cost | USD cost [Aug ‘17] |
Basic operations | ADD, MUL, PUSH, JUMP | 2-10 | ~10-5 |
Storage read | SLOAD | 200 | 0.0003 |
Storage write | SSTORE | 5000 | 0.008 |
Storage write (from zero) | SSTORE | 20000 | 0.032 |
Storage zeroize | SSTORE | -10000 | -0.016 |
Contract call | CALL, CODECALL, etc. | 700 | 0.0011 |
Transaction overhead | n/a | 21000 | 0.033 |
Contract creation | n/a | 32000 | 0.051 |
Contract destruction | SELFDESTRUCT | -19000 | -0.031 |
Economics of gas are similar to transaction fees
Solidity
Solidity should look familiar
Solidity types
Clever implementation of maps in Solidity
mapping(string => uint256) balances;
Alice | 15 |
Bob | 15 |
Joe | 100 |
0
2256
H(“balances”|”Bob”)
H(“balances”|”Joe”)
H(“balances”|”Alice”)
15
15
100
Polite contracts call throw on errors
uint8 numCandidates;
uint32 votingFee;
mapping(address => bool) hasVoted;
mapping(uint8 => uint32) numVotes;
/// Cast a vote for a designated candidate
function castVote(uint8 candidate) {
if (msg.value < votingFee)
return;
if (hasVoted[msg.sender])
throw;
hasVoted[msg.sender] = true;
numVotes[candidate] += 1;
}
Throw ensures no effects persisted except gas consumption
Solidity gotchas (many more tomorrow)
Solidity and EVM may outgrow Ethereum itself
- Enterprise Ethereum Alliance, still in infancy (Announced Feb 28)
-Goal: support EVM, Solidity and tools for private blockchains
- maintain compatibility with Ethereum network
Don’t like Solidity? Write your own language!
Ethereum VM Bytecode
Stack Language
Lower-Level Language
Viper
Solidity
Typed, looks like JS
Untyped, looks like python
Looks like Forth.
Defined in Yellowpaper
Your work here
Typed, looks like Go
Bamboo
Looks like Norse poetry
Ethereum project
Ethereum is “run” by the Ethereum Foundation
Compatible “alt-clients” exist (e.g. Parity, Consensys)
Ethereum blockchain is different than Bitcoins
| Ethereum | Bitcoin |
Target time between blocks | 14.5 seconds | 10 minutes |
Proof of work | Equihash | SHA-2562 |
Stale block rewards | Uncle rewards | none |
Hard Forks are planned in Ethereum
release | date |
Frontier | July 2015 |
Homestead | March 2016 |
DAO hard fork | July 2016 |
Metropolis | 2017? |
Serenity | ?? |
Ethereum has been chasing Bitcoin
Source: coinmarketcap.com
Price has been a wild ride recently
Research challenges
Ethereum makes data public by default
Verifying consistency of Ethereum implementations
Verifying correctness of Ethereum contracts
Can you spot the bug?
Scaling is limited as nodes verify all contracts
Proof-of-stake to replace proof-of-work?
https://medium.com/@VitalikButerin/safety-under-dynamic-validator-sets-ef0c3bbdf9f6
Explore more!
Explore the blockchain: https://etherscan.io
State of the Dapps: https://dapps.ethercasts.com/
More this week!
Verifying consistency of Ethereum implementations
Bitcoin script instructions
256 opcodes total (15 disabled, 75 reserved)
Side note: Namecoin got the incentives badly wrong
% of all Namecoin registrations
#Names claimed
An empirical study of Namecoin and lessons for decentralized namespace design
Harry Kalodner, Miles Carlsten, Paul Ellenbogen, Joseph Bonneau and Arvind Narayanan. WEIS 2015
Three levels of contract call in Ethereum
Original message:
Results of a call to C:
| CALL | CALLCODE | DELEGATE CALL |
msg.sender | A | B | |
value | x’≤ B.balance | ||
data | (as specified) | ||
startgas | s’ ≤ gas remaining | ||
storage updated | C | B | |
from | sig, n | to | data | value | startgas | gasprice |
A | sig | B | d | x | S | g |
Modifiers ease repetitive safety checks
address public owner;
uint public electionEnd;
modifier onlyBy(address _account){
require(msg.sender == _account);
_;
}
modifier onlyAfter(uint _block) {
require(block.blocknumber >= _block);
_;
}
function endElection()
onlyBy(owner) onlyAfter(electionEnd){
Built-in support for calling other contracts
- Contract member variables if public, automatically defines a “getter”
- Modifiers payable, constant, returns(), also modifiers can be user defined
- Macros / Internal Functions internal modifier -> does not require a “message”
- Type conversions int(x), uint256(x), bool(x)
- Structs, arrays, mappings, memory vs storage
array: int[2] x; hashmap mapping ( int[2] => int );
- Throwing exceptions throw; // exceptions contain no data
- Units (currency: “eth”, “wei”, etc.) 3 * (2 eth)
Remember:
Smart contracts code is fixed forever.
Calls required to update functionality