1 of 126

Correct-by-construction Casper:�� Binary consensus→Sharding

ETHcc

March 8^th, 2018

Vlad Zamfir :P

2 of 126

Outline

Part 1

Introduction to consensus
Correct-by-construction protocol design
CBC Casper safety proof

Part 2

Specification of binary, integer, and blockchain protocols
Safety oracles
Liveness
Subjective fault tolerance thresholds
Performance and Overhead
Sharding!

Part 3

Conclusion and future work
How to get involved!

3 of 126

Part 1: Introduction

4 of 126

Introduction to Consensus Pt. 1

Consensus protocols allow nodes on a distributed network to make consistent decisions given mutually exclusive possibilities.

For example, if we are “coming to consensus on a bit”, then the goal of a consensus protocol is to have nodes all output “0” or all output “1.”

5 of 126

Introduction to Consensus Pt. 2

Instead of a boring 0/1, we can also decide on values for:

Integers!
Blockchains!!
Sharded blockchains!!!

6 of 126

Introduction to Consensus Pt. 3

We want our consensus protocol to have safety and liveness.

Safety: if (protocol-following) nodes do decide on values,� then they will decide on values that do not conflict.

Liveness: any (protocol-following) node will eventually make a decision� that accepts or rejects any given value.

7 of 126

Introduction to Consensus Pt. 4

Consensus is tricky because messages can be delayed, reordered or lost. We describe consensus protocols in terms of the adverse conditions their safety and liveness can survive.

A consensus protocol is called “asynchronously safe” if it has consensus safety regardless of timing assumptions.��It is called “Byzantine fault tolerant” for t faults if it survives t nodes behaving arbitrarily rather than following the protocol.

8 of 126

Introduction to Consensus Pt. 5

FLP impossibility showed that it’s impossible for a deterministic protocol to be guaranteed to be safe and live in asynchrony, in the context of any faults.

We will present a family of “correct-by-construction” deterministic, asynchronously safe, Byzantine fault tolerant consensus protocols!

They are not necessarily live in asynchrony (FLP), but they are live in a partially synchronous network.

9 of 126

Correct-by-construction protocol design

10 of 126

“Traditional” approach to protocol design

1) Formally specify the protocol

2) Define properties that the protocol must satisfy (in the case of consensus, something related to safety and liveness)

3) Prove that the protocol has the given properties

11 of 126

The “Correct-by-Construction” Way

1) Formally but only partially specify the protocol

2) Define properties and proofs that protocol must satisfy

3) Derive more of the protocol in a way that is proven to satisfy them

12 of 126

The goal is to make proving the protocol’s correctness almost trivial.

13 of 126

Traditional: Protocol first, analysis second�CBC: Analysis first, protocol second

14 of 126

CBC Casper Safety Proof

15 of 126

CBC Casper Safety Proof�Part 1

16 of 126

CBC Casper Safety Prerequisites

Before we start our safety proof, we do need to define a few things:

The protocol has:

Protocol states, Σ
Protocol state transitions, T

They describe how individual �(protocol-following) nodes can evolve

17 of 126

CBC Casper Safety Prerequisites

Before we start our safety proof, we do need to define a few things:

We write σ → σ’ to denote the existence of a �protocol state transition from σ to σ’��We insist that we have the properties:

σ → σ for all protocol states σ (identity)
If σ → σ’ and σ’ → σ’’ then σ → σ’’ (transitivity)

18 of 126

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

The protocol has:

A function called “the estimator”, E, mapping �protocol states Σ to possible values of the consensus, C

E : Σ → C

19 of 126

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

In the binary consensus,�the estimator maps protocol states to {0, 1}.

In the blockchain consensus it is the “fork choice rule”, �mapping protocol states to blockchains.

20 of 126

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

The protocol has:

A type system for reasoning about consensus values.

In the “types-as-formulas” tradition, we take �types to be terms in a logic for describing consensus values.

21 of 126

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

For a proposition p (a proposition about values of the consensus),�we will write c ⇒ p for consensus value c, if c satisfies p

So we can write E(σ) ⇒ p, �when the estimator on state σ entails proposition p

22 of 126

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

Estimate safety:

A proposition p is safe at protocol state σ, �if it holds for the estimator for all future protocol states σ’.

S(p, s) ⇔ for all σ’. σ→σ’ ⇒ ( E(σ’) ⇒ p )

23 of 126

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

Estimate safety:

A proposition p is safe at protocol state σ, �if it holds for the estimator for all future protocol states σ’.

S(p, s) ⇔ for all σ’. σ→σ’ ⇒ ( E(σ’) ⇒ p )

24 of 126

CBC Casper Safety Proof Pt. 1

Lemma 1: σ → σ’ and S(p, σ) ⇒ S(p, σ’) (forward safety)

Lemma 2: σ → σ’ and S(p, σ’) ⇒ ¬S(¬p,σ) (backwards consistency)

25 of 126

CBC Casper Safety Proof Pt. 1

Theorem (distributed consistency, aka consensus safety)

σ₁→σ’, σ₂ → σ’ then S(p, σ₁) ⇒ ¬S(¬p, σ₂)

From S(p, σ₁) and lemma 1, we have S(p, σ’)�From S(p, σ’) and lemma 2, we have ¬S(¬p, σ₂)

Therefore S(p, σ₁) ⇒ ¬S(¬p, σ₂)

26 of 126

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

We have S(p, σ₁) ⇒ ¬S(¬p, σ₂), which is equivalent to�⇔ ¬S(p, σ₁) ∨ ¬S(¬p, σ₂)�⇔ ¬(S(p, σ₁) ∧ S(¬p, σ₂))��Which says it’s impossible for �p to be safe at σ₁ and ¬p to be safe at σ₂�(for any σ₁ and σ₂ with a common future state)

27 of 126

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

Estimate safety is a local property:�An estimate is safe if holds for all future protocol states� (from this current protocol state).

Consensus safety is a distributed property: �Protocol-following nodes make mutually-consistent decisions.

28 of 126

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

Estimate safety has the “distributed consistency” property �for any collection states with a common future, �S(p,σ) ⇒ ¬S(¬p, σ’)�⇔ ¬(S(p,σ₁) ∧ S(¬p, σ₂))��As long as nodes have common future protocol states,�Then we can take advantage of our theorem for �decisions that “correspond only to safe properties”.

29 of 126

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

By assuming that a protocol-following node at state σ only makes �decisions that correspond only to safe properties at state σ.

I.e. a node at state σ is allowed to make a decision on:

a proposition p if S(p, σ)
a value c if c ⇒ p implies S(p, σ)�

30 of 126

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

So if protocol-following nodes:

make decisions at state σ that correspond � only to properties that have estimate safety at σ
have a common future state

Then we have the property that our “distributed consistency” �property is equivalent to consensus safety.��

31 of 126

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

The cbc safety proof (part 1) therefore shows:

Nodes who make decisions on “safe estimates” and �have common future protocol states have consensus safety

32 of 126

CBC Casper Safety Proof�Part 2

33 of 126

CBC Casper Safety Proof Pt. 2

The first part of the safety proof assumes that nodes have a common future protocol state.

The second part will guarantee that protocol-following nodes have common future states (so long as there are not too many Byzantine faults).

34 of 126

Non-triviality and Consensus failure

But first, we will observe that Byzantine faults can cause consensus failure in any non-trivial consensus protocol.

35 of 126

Non-triviality and Consensus failure

We will say that the protocol is non-trivial if there exists any protocol state that isn’t safe on p or on ¬p, there exists a transition to a state σ₁ and to a state σ₂ such that p is safe at σ₁ and ¬p is safe at σ₂.

This means that a node at state σ₁ and a node at state σ₂ do not have a common future protocol state. But they must be able to each get to either of these states.

36 of 126

Non-triviality and Consensus failure

Nodes therefore cannot independently follow any state transitions available to them, but must instead transition through the protocol as a function of interactions with other nodes.

37 of 126

Non-triviality and Consensus failure

We are going to assume that:

Some nodes are “named” as consensus-forming nodes
Protocol following nodes only experience state transitions when receiving signed messages from these named nodes.

38 of 126

Non-triviality and Consensus failure

Equivocation faults are fundamentally enough to cause consensus failure.

“Equivocation” is behaviour exhibited by a named node that can’t be produced by a single execution of the protocol, but can be produced by more than one protocol execution.

39 of 126

Non-triviality and Consensus failure

Equivocation faults are fundamentally enough to cause consensus failure.

If protocol-following nodes are to make consistent decisions, they must be able to synchronize based on signed messages from the consensus forming nodes.

�However, if each of these nodes are equivocating, they might be running completely independent versions of the protocol for each of our nodes.

40 of 126

Non-triviality and Consensus failure

Equivocation faults are fundamentally enough to cause consensus failure.

If we might be running through protocol states independently, then why should we be able to maintain consensus safety?

�We might be able to have consensus safety if only some proper subset of consensus forming nodes are equivocating.�

41 of 126

Non-triviality and Consensus failure

Equivocation faults are fundamentally enough to cause consensus failure.��The question for us, then, is how do we make sure that nodes have a common future protocol state as long as there is less than some number of equivocations?

Well first, let’s be sure that we can count equivocations!

42 of 126

CBC Casper Safety Proof Pt. 2

We now insist that messages attest that the sender was at a particular protocol state at the time of sending:

�Check if there exists a single protocol execution through each�the protocol states observed in the sender’s messages.

43 of 126

CBC Casper Safety Proof Pt. 2

From a set of messages from many senders, we can detect equivocation from each sender, and check how many of them equivocated

If the validators have weight, then we can calculate the �total weight of equivocating validators

44 of 126

CBC Casper Safety Proof Pt. 2

Tying the knot:�

Let protocol states be sets of messages that exhibit�not more than some number (weight) of Byzantine faults

45 of 126

CBC Casper Safety Proof Pt. 2

Tying the knot:

�Receiving (or making) a message that says “I’m at this state” �represents a protocol state transition

�Receiving a message m at protocol state σ means moving �to state σ + m, if it doesn’t exhibit more than t faults.�

47 of 126

CBC Casper Safety Proof Pt. 2

Tying the knot:��Any message is an equivocation w.r.t. some other message � (that may actually never exist).��So receiving m at σ means no longer having a �common future protocol state with any set of messages σ’ �such that σ + σ’ doesn't exhibit more than t faults, �but σ + σ’ + m does �

48 of 126

CBC Casper Safety Proof Pt. 2

Tying the knot:�

Two nodes have a common future state if there is not�more than t Byzantine faults in their union!

49 of 126

CBC Casper Safety Proof Pt. 2

Tying the knot:

σ₁and σ₂have a common future σ₁+ σ₂... _�if and only if σ₁+ σ₂exhibits not more than t Byzantine faults

So: not more than t Byzantine faults �⇒ nodes only visit states with common futures

50 of 126

CBC Casper Safety Proof Pt. 2

Tying the knot:

So:

nodes exhibit not more than t Byzantine faults �⇒ nodes only visit states with common futures�⇒ consensus safety

51 of 126

CBC Casper Safety Proof Pt. 2

Tying the knot:

And:

consensus failure �⇒ nodes do not have a common future�⇒ nodes are at states whose union exhibit � more than t faults

Which is what was wanted!

52 of 126

Part 2: Specifications

53 of 126

Casper the Friendly Binary Consensus

For each of these protocols:

We will define:

protocol messages

So protocol states and their transitions are defined � (sets of messages with no more than t faults)

an estimator

So that we can relate protocol states to values of the consensus

�Because that’s all we need to satisfy the cbc framework

54 of 126

Casper the Friendly Binary Consensus

55 of 126

Casper the Friendly Binary Consensus

Protocol messages:�The set of protocol messages M is parametric in a set of validator names, V, which are the names of consensus forming nodes.

Protocol messages in FBC have three parts:

estimate (0 or 1)
sender (a validator name)
justification (a protocol state)

(i.e. a set of protocol messages).

56 of 126

Casper the Friendly Binary Consensus

Definitions:

Dependency: The dependencies of a message m are all of the messages in the justification of that message m (or in their justifications, or in their justifications).

Latest message: The latest message from a validator v in some protocol state σ is the message m with the properties:

m has v as a sender and
m is not in the dependency of any other m’ from v in σ.

57 of 126

Casper the Friendly Binary Consensus

Binary estimator:

Before we define the estimator for our binary consensus protocol,�let us define the score of an estimate e in some protocol state σ:

�The score for estimate e in protocol state σ is the sum of the weights of validators whose latest messages have an estimate e.

58 of 126

Casper the Friendly Binary Consensus

Binary estimator:

Now, we say that, for any protocol state, the estimator returns the estimate with the highest score.

For example, if more validators (by weight) have latest messages with the estimate “1”, then the estimator will return “1.”

59 of 126

Casper the Friendly Binary Consensus

Binary estimator:

If there is a tie, the estimator returns:

An exception, or
The proposition “0 or 1”, or
A non-deterministic choice of 0 or 1
Always returns 0, or
Always returns 1

Any of these are fine!

64 of 126

Casper the Friendly Integer Consensus (with median)

65 of 126

Casper the Friendly Integer Consensus

Protocol messages:

Just as binary, protocol messages have three parts:

an estimate (an integer),
a sender (a validator name), and
a justification (a protocol state)

(which itself is a set of protocol messages.)

66 of 126

Casper the Friendly Integer Consensus

Integer Estimator:

Instead of choosing the estimate with the highest score, as in the binary consensus, we choose the median of the integers given in latest message…

�...the median is weighted by the validators weights.

69 of 126

Casper the Friendly GHOST

70 of 126

Casper the Friendly Ghost

Protocol messages:

Blockchain protocol messages also have 3 parts:

an estimate (a block),
a sender (a validator name), and
a justification (a protocol state)

(which itself is a set of protocol messages)

You can think of the estimate as the previous block pointer!

71 of 126

Casper the Friendly Ghost

GHOST estimator:

Also only based on latest messages

72 of 126

Casper the Friendly Ghost

GHOST estimator:

Also only based on latest messages

76 of 126

Safety Oracle

77 of 126