Correct-by-construction Casper:�� Binary consensus→Sharding
ETHcc
March 8th, 2018
Vlad Zamfir :P
Outline
Part 1: Introduction
Introduction to Consensus Pt. 1
Consensus protocols allow nodes on a distributed network to make consistent decisions given mutually exclusive possibilities.
For example, if we are “coming to consensus on a bit”, then the goal of a consensus protocol is to have nodes all output “0” or all output “1.”
Introduction to Consensus Pt. 2
Instead of a boring 0/1, we can also decide on values for:
Introduction to Consensus Pt. 3
We want our consensus protocol to have safety and liveness.
Safety: if (protocol-following) nodes do decide on values,� then they will decide on values that do not conflict.
Liveness: any (protocol-following) node will eventually make a decision� that accepts or rejects any given value.
Introduction to Consensus Pt. 4
Consensus is tricky because messages can be delayed, reordered or lost. We describe consensus protocols in terms of the adverse conditions their safety and liveness can survive.
A consensus protocol is called “asynchronously safe” if it has consensus safety regardless of timing assumptions.��It is called “Byzantine fault tolerant” for t faults if it survives t nodes behaving arbitrarily rather than following the protocol.
Introduction to Consensus Pt. 5
FLP impossibility showed that it’s impossible for a deterministic protocol to be guaranteed to be safe and live in asynchrony, in the context of any faults.
We will present a family of “correct-by-construction” deterministic, asynchronously safe, Byzantine fault tolerant consensus protocols!
They are not necessarily live in asynchrony (FLP), but they are live in a partially synchronous network.
Correct-by-construction protocol design
“Traditional” approach to protocol design
1) Formally specify the protocol
2) Define properties that the protocol must satisfy (in the case of consensus, something related to safety and liveness)
3) Prove that the protocol has the given properties
The “Correct-by-Construction” Way
1) Formally but only partially specify the protocol
2) Define properties and proofs that protocol must satisfy
3) Derive more of the protocol in a way that is proven to satisfy them
The goal is to make proving the protocol’s correctness almost trivial.
Traditional: Protocol first, analysis second�CBC: Analysis first, protocol second
CBC Casper Safety Proof
CBC Casper Safety Proof�Part 1
CBC Casper Safety Prerequisites
Before we start our safety proof, we do need to define a few things:
The protocol has:
They describe how individual �(protocol-following) nodes can evolve
CBC Casper Safety Prerequisites
Before we start our safety proof, we do need to define a few things:
We write σ → σ’ to denote the existence of a �protocol state transition from σ to σ’��We insist that we have the properties:
CBC Casper Safety Proof Pt. 1
Before we start our safety proof, we do need to define a few things:
The protocol has:
A function called “the estimator”, E, mapping �protocol states Σ to possible values of the consensus, C
E : Σ → C
CBC Casper Safety Proof Pt. 1
Before we start our safety proof, we do need to define a few things:
In the binary consensus,�the estimator maps protocol states to {0, 1}.
In the blockchain consensus it is the “fork choice rule”, �mapping protocol states to blockchains.
CBC Casper Safety Proof Pt. 1
Before we start our safety proof, we do need to define a few things:
The protocol has:
A type system for reasoning about consensus values.
In the “types-as-formulas” tradition, we take �types to be terms in a logic for describing consensus values.
CBC Casper Safety Proof Pt. 1
Before we start our safety proof, we do need to define a few things:
For a proposition p (a proposition about values of the consensus),�we will write c ⇒ p for consensus value c, if c satisfies p
So we can write E(σ) ⇒ p, �when the estimator on state σ entails proposition p
CBC Casper Safety Proof Pt. 1
Before we start our safety proof, we do need to define a few things:
Estimate safety:
A proposition p is safe at protocol state σ, �if it holds for the estimator for all future protocol states σ’.
S(p, s) ⇔ for all σ’. σ→σ’ ⇒ ( E(σ’) ⇒ p )
CBC Casper Safety Proof Pt. 1
Before we start our safety proof, we do need to define a few things:
Estimate safety:
A proposition p is safe at protocol state σ, �if it holds for the estimator for all future protocol states σ’.
S(p, s) ⇔ for all σ’. σ→σ’ ⇒ ( E(σ’) ⇒ p )
CBC Casper Safety Proof Pt. 1
Lemma 1: σ → σ’ and S(p, σ) ⇒ S(p, σ’) (forward safety)
Lemma 2: σ → σ’ and S(p, σ’) ⇒ ¬S(¬p,σ) (backwards consistency)
2
2
CBC Casper Safety Proof Pt. 1
Theorem (distributed consistency, aka consensus safety)
σ1→σ’, σ2 → σ’ then S(p, σ1) ⇒ ¬S(¬p, σ2)
From S(p, σ1) and lemma 1, we have S(p, σ’)�From S(p, σ’) and lemma 2, we have ¬S(¬p, σ2)
Therefore S(p, σ1) ⇒ ¬S(¬p, σ2)
CBC Casper Safety Proof Pt. 1
This theorem can be interpreted as consensus safety:
We have S(p, σ1) ⇒ ¬S(¬p, σ2), which is equivalent to�⇔ ¬S(p, σ1) ∨ ¬S(¬p, σ2)�⇔ ¬(S(p, σ1) ∧ S(¬p, σ2))��Which says it’s impossible for �p to be safe at σ1 and ¬p to be safe at σ2�(for any σ1 and σ2 with a common future state)
CBC Casper Safety Proof Pt. 1
This theorem can be interpreted as consensus safety:
Estimate safety is a local property:�An estimate is safe if holds for all future protocol states� (from this current protocol state).
Consensus safety is a distributed property: �Protocol-following nodes make mutually-consistent decisions.
CBC Casper Safety Proof Pt. 1
This theorem can be interpreted as consensus safety:
Estimate safety has the “distributed consistency” property �for any collection states with a common future, �S(p,σ) ⇒ ¬S(¬p, σ’)�⇔ ¬(S(p,σ1) ∧ S(¬p, σ2))��As long as nodes have common future protocol states,�Then we can take advantage of our theorem for �decisions that “correspond only to safe properties”.
CBC Casper Safety Proof Pt. 1
This theorem can be interpreted as consensus safety:
By assuming that a protocol-following node at state σ only makes �decisions that correspond only to safe properties at state σ.
I.e. a node at state σ is allowed to make a decision on:
CBC Casper Safety Proof Pt. 1
This theorem can be interpreted as consensus safety:
So if protocol-following nodes:
Then we have the property that our “distributed consistency” �property is equivalent to consensus safety.��
CBC Casper Safety Proof Pt. 1
This theorem can be interpreted as consensus safety:
The cbc safety proof (part 1) therefore shows:
Nodes who make decisions on “safe estimates” and �have common future protocol states have consensus safety
CBC Casper Safety Proof�Part 2
CBC Casper Safety Proof Pt. 2
The first part of the safety proof assumes that nodes have a common future protocol state.
The second part will guarantee that protocol-following nodes have common future states (so long as there are not too many Byzantine faults).
Non-triviality and Consensus failure
But first, we will observe that Byzantine faults can cause consensus failure in any non-trivial consensus protocol.
Non-triviality and Consensus failure
We will say that the protocol is non-trivial if there exists any protocol state that isn’t safe on p or on ¬p, there exists a transition to a state σ1 and to a state σ2 such that p is safe at σ1 and ¬p is safe at σ2.
This means that a node at state σ1 and a node at state σ2 do not have a common future protocol state. But they must be able to each get to either of these states.
Non-triviality and Consensus failure
Nodes therefore cannot independently follow any state transitions available to them, but must instead transition through the protocol as a function of interactions with other nodes.
Non-triviality and Consensus failure
We are going to assume that:
Non-triviality and Consensus failure
Equivocation faults are fundamentally enough to cause consensus failure.
“Equivocation” is behaviour exhibited by a named node that can’t be produced by a single execution of the protocol, but can be produced by more than one protocol execution.
Non-triviality and Consensus failure
Equivocation faults are fundamentally enough to cause consensus failure.
If protocol-following nodes are to make consistent decisions, they must be able to synchronize based on signed messages from the consensus forming nodes.
�However, if each of these nodes are equivocating, they might be running completely independent versions of the protocol for each of our nodes.
Non-triviality and Consensus failure
Equivocation faults are fundamentally enough to cause consensus failure.
If we might be running through protocol states independently, then why should we be able to maintain consensus safety?
�We might be able to have consensus safety if only some proper subset of consensus forming nodes are equivocating.�
Non-triviality and Consensus failure
Equivocation faults are fundamentally enough to cause consensus failure.��The question for us, then, is how do we make sure that nodes have a common future protocol state as long as there is less than some number of equivocations?
Well first, let’s be sure that we can count equivocations!
CBC Casper Safety Proof Pt. 2
We now insist that messages attest that the sender was at a particular protocol state at the time of sending:
�Check if there exists a single protocol execution through each�the protocol states observed in the sender’s messages.
CBC Casper Safety Proof Pt. 2
From a set of messages from many senders, we can detect equivocation from each sender, and check how many of them equivocated
If the validators have weight, then we can calculate the �total weight of equivocating validators
CBC Casper Safety Proof Pt. 2
Tying the knot:�
Let protocol states be sets of messages that exhibit�not more than some number (weight) of Byzantine faults
CBC Casper Safety Proof Pt. 2
Tying the knot:
�Receiving (or making) a message that says “I’m at this state” �represents a protocol state transition
�Receiving a message m at protocol state σ means moving �to state σ + m, if it doesn’t exhibit more than t faults.�
CBC Casper Safety Proof Pt. 2
Tying the knot:��Any message is an equivocation w.r.t. some other message � (that may actually never exist).��So receiving m at σ means no longer having a �common future protocol state with any set of messages σ’ �such that σ + σ’ doesn't exhibit more than t faults, �but σ + σ’ + m does �
CBC Casper Safety Proof Pt. 2
Tying the knot:�
Two nodes have a common future state if there is not�more than t Byzantine faults in their union!
CBC Casper Safety Proof Pt. 2
Tying the knot:
σ1 and σ2 have a common future σ1 + σ2 ... �if and only if σ1 + σ2 exhibits not more than t Byzantine faults
So: not more than t Byzantine faults �⇒ nodes only visit states with common futures
CBC Casper Safety Proof Pt. 2
Tying the knot:
So:
nodes exhibit not more than t Byzantine faults �⇒ nodes only visit states with common futures�⇒ consensus safety
CBC Casper Safety Proof Pt. 2
Tying the knot:
And:
consensus failure �⇒ nodes do not have a common future�⇒ nodes are at states whose union exhibit � more than t faults
Which is what was wanted!
Part 2: Specifications
Casper the Friendly Binary Consensus
For each of these protocols:
We will define:
�Because that’s all we need to satisfy the cbc framework
Casper the Friendly Binary Consensus
Casper the Friendly Binary Consensus
Protocol messages:�The set of protocol messages M is parametric in a set of validator names, V, which are the names of consensus forming nodes.
Protocol messages in FBC have three parts:
Casper the Friendly Binary Consensus
Definitions:
Dependency: The dependencies of a message m are all of the messages in the justification of that message m (or in their justifications, or in their justifications).
Latest message: The latest message from a validator v in some protocol state σ is the message m with the properties:
Casper the Friendly Binary Consensus
Binary estimator:
Before we define the estimator for our binary consensus protocol,�let us define the score of an estimate e in some protocol state σ:
�The score for estimate e in protocol state σ is the sum of the weights of validators whose latest messages have an estimate e.
Casper the Friendly Binary Consensus
Binary estimator:
Now, we say that, for any protocol state, the estimator returns the estimate with the highest score.
For example, if more validators (by weight) have latest messages with the estimate “1”, then the estimator will return “1.”
Casper the Friendly Binary Consensus
Binary estimator:
If there is a tie, the estimator returns:
Any of these are fine!
Casper the Friendly Integer Consensus (with median)
Casper the Friendly Integer Consensus
Protocol messages:
Just as binary, protocol messages have three parts:
Casper the Friendly Integer Consensus
Integer Estimator:
Instead of choosing the estimate with the highest score, as in the binary consensus, we choose the median of the integers given in latest message…
�...the median is weighted by the validators weights.
Casper the Friendly GHOST
Casper the Friendly Ghost
Protocol messages:
Blockchain protocol messages also have 3 parts:
You can think of the estimate as the previous block pointer!
Casper the Friendly Ghost
GHOST estimator:
Also only based on latest messages
Casper the Friendly Ghost
GHOST estimator:
Also only based on latest messages
Safety Oracle
A Simple Safety Oracle: “Clique Oracle”
The safety proof ensures that nodes that make decisions on safe estimates make consensus-safe decisions.
But we haven’t said how we might determine whether a proposition is safe at a given protocol state!
A Simple Safety Oracle: “Clique Oracle”
While we don’t need to detect safety to enjoy consensus safety on implicit decisions,
if we ever want to report or use our decisions we still want to be able to detect safe estimates.
We call our safety detectors “safety oracles”.
A Simple Safety Oracle: “Clique Oracle”
An ideal safety oracle OIdeal returns “True” on a given estimate p and state s�⇔ an ideal adversary fails to change a node’s estimate
But the ideal adversary can be computationally quite demanding, so we will describe one oracle that can be used to detect safe estimates for some protocols.
An oracle is safe if it has the property that O(p, σ) => OIdeal(p, σ)
A Simple Safety Oracle: “Clique Oracle”
The “clique oracle” will work for protocols with the property that: ��If a majority of latest messages from non-equivocating validators in a protocol state σ agree with an estimate e, then the estimator also entails e: E(σ) => e
A Simple Safety Oracle: “Clique Oracle”
So we can define “agreement” between estimates 𝒆 and 𝒆’ for each protocol:
A Simple Safety Oracle: “Clique Oracle”
Given some protocol state σ, an “𝒆-clique” in σ is a set of validators where:
A Simple Safety Oracle: “Clique Oracle”
These validators in an 𝒆-clique cannot “convince” each other to disagree with 𝒆.
If they have more than half of the weight, then neither can other validators…
(who have less than half of the weight)
...at least assuming that there aren’t any equivocating nodes in the 𝒆-clique.
A Simple Safety Oracle: “Clique Oracle”
If:
Then:
A Simple Safety Oracle: “Clique Oracle”
The estimator will agree with the largest majority of latest messages.�So at least enough equivocation is required to create a bigger weight majority than�available in the clique: We.
�The minority has Wt - We weight
So we need at least f faults, by weight, such that Wt - We + f > We
⇔ f > 2We - Wt
Subjective Fault Tolerance Thresholds
Subjective Fault Tolerance Thresholds
One node can run protocol states as sets of messages exhibiting less than t1 faults, while another runs a protocol with the same messages but only when they exhibit less than t2 faults
Two nodes have a common future state (guaranteeing our consensus safety)�if the union of their views has less than min(t1,t2) faults
Liveness
Liveness
Consensus protocols have “liveness” if nodes are guaranteed to eventually make a decision]
FLP Impossibility states that it’s impossible to be both safe and live in an asynchronous network. (For deterministic protocols without cryptography)
All of our discussion so far has been without any timing assumptions (hence asynchronous safety)
Liveness
Indeed, we have said nothing about when nodes should make and send messages to each other
Instead, we’ve been speaking about arbitrary additions to arbitrary sets of protocol messages
This means that we can’t give a liveness proof (even in synchrony/partial synchrony) for the current protocol specification
Liveness
However, as we have seen in observations, there are many protocol states with safety.
And nodes can coordinate timeouts in order to produce a DAG of messages corresponding to these protocol states, in a synchronous network (and eventually in a partially synchronous network).
This means that we could have liveness in synchrony/partial synchrony, if we additionally recommended a message passing DAG, like “a blockchain”.
Overhead and Performance
Casper the Friendly Sharded Blockchain
�(with a hierarchical binary sharding and �fork-choice enforced atomicity of cross-shard blocks)
Casper the Friendly Sharded Blockchain
The safety proof says nothing about which protocol states nodes actually are required to eventually visit.
It therefore will allow for nodes to never see all blocks from all shards, while still providing the distributed consistency/consensus safety property!
Casper the Friendly Sharded Blockchain
So we are going to follow the same process as normal,
but instead of a bit, an integer, or a blockchain;
this time with a sharded data structure
Sharded Block Structure
Protocol messages:
Very similar to the blockchain, but:
Sharded Block Structure
Protocol messages:
Merge blocks:
blocks on the merged shards
Sharded Block Structure
Protocol messages:
Sharded fork choice rule
The fork choice rule returns a blockchain for every shard
Fork-choice rule enforced atomicity of merge blocks
The fork choice rule enforced atomicity of merge blocks:
The merge block is in the fork choice of all of the shards being merged�Or it’s in the fork choice of none.
Fork-choice enforced atomicity of merge block safety
Fork-choice enforced �plausible atomicity of safe merge blocks
This is the weakest atomicity definition in this model:
Binary hierarchical sharded fork choice rule
We’re only going to allow some shards to have merge blocks�
Namely the ones that are connected by an edge, here:
Binary hierarchical sharded fork choice rule
Shards that are lower in the hierarchy follow their parent
I.e. the fork choice of the child is a function of the fork choice of the parent��But not visa versa
This allows nodes to stay on a single branch of the tree
Binary hierarchical sharded fork choice rule
Shards that are lower in the hierarchy follow their parent
They start their fork choice at the most recent merge block in the fork choice of their parent
Binary hierarchical sharded fork choice rule
Shards that are lower in the hierarchy follow their parent
The fork choice ignores or orphans any merge blocks that were orphaned by its parent’s fork choice
Binary hierarchical sharded fork choice rule
Shards that are lower in the hierarchy follow their parent
The parent can orphan blocks from the child shard
Binary hierarchical sharded fork choice rule
Shards that are lower in the hierarchy follow their parent
But the parent must only include merge blocks that are later than the previous merge blocks, for the child shard
Binary hierarchical sharded fork choice rule
So we are able to define a fork choice rule that guarantees the atomicity of cross-shard blocks, in this binary hierarchical sharding!
Binary hierarchical sharded fork choice rule
We will assign validators uniformly across every branch of the tree, so each only needs to do the fork choice rule for N shards out of 2N - 1
Part 3: Conclusion and Future work
Future Work
Research on the cbc-process:
Future Work
Research on sharding:
Future Work
Conclusion
The correct-by-construction framework for generating consensus protocols is pretty cool…
We were able to define lots of consensus protocols (including sharding!) without doing almost any extra analysis on a case-by-case basis!
Conclusion
We can explore lots of different liveness strategies, overhead/performance tradeoffs,
All without thinking about the maintenance of our Byzantine fault tolerant asynchronous safety.
Conclusion
This flexibility is useful for iterating fast, and progress has been accelerating!
The framework has been laid out for you to contribute!
Don’t be shy :)
Check out the docs, the repo, and help us!
Read the papers:
Contribute to the codebase: tinyurl.com/cbc-casper-code
Join the gitter and make yourself known: tinyurl.com/cbc-casper-chat
�Ask me about how you can contribute to cbc-casper research!
Thank you for listening�Thank you ETHcc! <3