1 of 69

Correct-by-construction Casper:�Understanding Casper Protocol States

RChain Developer’s Retreat

April 18th, 2018

Vlad Zamfir :P

2 of 69

Outline

  • Introduction to consensus
    • Safety, liveness, and non-triviality
  • Safety proof pt. 1
    • Estimate safety, common futures, and consensus safety
  • Safety proof pt. 2
    • Protocol states as sets of messages
    • Equivocations and lost states
  • Liveness, stuck-free-ness, and non-triviality
    • FLP impossibility
    • FLP impossibility in CBC Casper

3 of 69

Part 1: Introduction

4 of 69

Introduction to Consensus Pt. 1

Consensus protocols allow nodes on a distributed network to make consistent decisions out of mutually exclusive possibilities.

For example, if we are “coming to consensus on a bit”, then the goal of a consensus protocol is to have nodes all output “0” or all output “1.”

5 of 69

Introduction to Consensus Pt. 2

Instead of a boring 0/1, we can also decide on values for:

  1. Integers!
  2. Blockchains!!
  3. Sharded blockchains!!!

6 of 69

Introduction to Consensus Pt. 3

We want our consensus protocol to have safety and liveness.

Safety: if (protocol-following) nodes do decide on values,� then they will decide on values that do not conflict.

Liveness: any (protocol-following) node will eventually make a decision� that accepts or rejects any given value.

7 of 69

Introduction to Consensus Pt. 4

A non-trivial protocol is a protocol where it nodes must coordinate to make decisions.

A trivial protocol has predetermined outcomes. ��E.g. all nodes immediately decided on “1” without talking to any other nodes.

For our consensus protocols to be useful, they must be non-trivial.

8 of 69

Introduction to Consensus Pt. 4

For some initial state 𝞂0 , and decided states 𝞂1 ,𝞂2

9 of 69

Introduction to Consensus Pt. 4

Consensus is tricky because messages can be delayed, reordered or lost. We describe consensus protocols in terms of the adverse conditions their safety and liveness can survive.

A consensus protocol is called “asynchronously safe” if it has consensus safety regardless of timing assumptions.��It is called “Byzantine fault tolerant” for t faults if it survives t nodes behaving arbitrarily, rather than following the protocol.

10 of 69

CBC Casper Safety Proof�Part 1

11 of 69

CBC Casper Safety Prerequisites

Before we start our safety proof, we do need to define a few things:

The protocol has:

  • Protocol states, Σ
  • Protocol state transitions, T

They describe how individual �(protocol-following) nodes can evolve

12 of 69

CBC Casper Safety Prerequisites

Before we start our safety proof, we do need to define a few things:

We write σ → σ’ to denote the existence of a �protocol state transition from σ to σ’��We insist that we have transitivity:

  • If σ → σ’ and σ’ → σ’’ then σ → σ’’ (transitivity)
  • σ → σ for all σ (identity)

13 of 69

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

The protocol has:

A function called “the estimator”, E, mapping �protocol states Σ to possible values of the consensus, C

E : Σ → C

E : Σ → P

14 of 69

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

In the binary consensus,�the estimator maps protocol states to {0, 1}.

In the blockchain consensus it is the “fork choice rule”, �mapping protocol states to blockchains.

15 of 69

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

The protocol has:

A type system for reasoning about consensus values.

In the “types-as-formulas” tradition, we take �types to be terms in a logic for describing consensus values.

16 of 69

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

For a proposition p (a proposition about values of the consensus),�we will write c ⇒ p for consensus value c, if c satisfies p

So we can write E(σ) ⇒ p, �when the estimator on state σ entails proposition p

17 of 69

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

Estimate safety:

A proposition p is safe at protocol state σ, �if it holds for the estimator for all future protocol states σ’.

S(p, σ) := E(σ) ⇒ p, and � for all σ’. σ→σ’ ⇒ ( E(σ’) ⇒ p )

18 of 69

CBC Casper Safety Proof Pt. 1

Before we start our safety proof, we do need to define a few things:

Estimate safety:

A proposition p is safe at protocol state σ, �if it holds for the estimator for all future protocol states σ’.

S(p, σ) := E(σ) ⇒ p, and � for all σ’. σ→σ’ ⇒ ( E(σ’) ⇒ p )

19 of 69

CBC Casper Safety Proof Pt. 1

Lemma 1: σ → σ’ and S(p, σ) ⇒ S(p, σ’) (forward safety)

Lemma 2: σ → σ’ and S(p, σ’) ⇒ ¬S(¬p,σ) (backwards consistency)

2

2

20 of 69

CBC Casper Safety Proof Pt. 1

Theorem (distributed consistency, aka consensus safety)

σ1→σ’, σ2 → σ’ then S(p, σ1) ⇒ ¬S(¬p, σ2)

From S(p, σ1) and lemma 1, we have S(p, σ’)�From S(p, σ’) and lemma 2, we have ¬S(¬p, σ2)

Therefore S(p, σ1) ⇒ ¬S(¬p, σ2)

21 of 69

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

We have S(p, σ1) ⇒ ¬S(¬p, σ2), which is equivalent to�⇔ ¬S(p, σ1) ∨ ¬S(¬p, σ2)�⇔ ¬(S(p, σ1) ∧ S(¬p, σ2))��Which says it’s impossible for �p to be safe at σ1 and ¬p to be safe at σ2�(for any σ1 and σ2 with a common future state)

22 of 69

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

Estimate safety is a local property:�An estimate is safe if holds for all future protocol states� (from this current protocol state).

Consensus safety is a distributed property: �Protocol-following nodes make mutually-consistent decisions.

23 of 69

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

Estimate safety has the “distributed consistency” property �for any collection states with a common future, �S(p,σ) ⇒ ¬S(¬p, σ’)�⇔ ¬(S(p,σ1) ∧ S(¬p, σ2))��As long as nodes have common future protocol states,�Then we can take advantage of our theorem for �decisions that “correspond only to safe properties”.

24 of 69

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

By assuming that a protocol-following node at state σ only makes �decisions that correspond only to safe properties at state σ.

I.e. a node at state σ is allowed to make a decision on:

  • a proposition p if S(p, σ)
  • a value c if c ⇒ p implies S(p, σ)

25 of 69

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

So if protocol-following nodes:

  • make decisions at state σ that correspond � only to properties that have estimate safety at σ
  • have a common future state

Then we have the property that our “distributed consistency” �property is equivalent to consensus safety.��

26 of 69

CBC Casper Safety Proof Pt. 1

This theorem can be interpreted as consensus safety:

The cbc safety proof (part 1) therefore shows:

Nodes who make decisions on “safe estimates” and �have common future protocol states have consensus safety

27 of 69

CBC Casper Safety Proof�Part 2

28 of 69

CBC Casper Safety Proof Pt. 2

The first part of the safety proof assumes that nodes have a common future protocol state.

The second part will guarantee that protocol-following nodes have common future states (so long as there are not too many Byzantine faults).

29 of 69

Non-triviality and Consensus failure

But first, we will observe that Byzantine faults can cause consensus failure in any non-trivial consensus protocol.

30 of 69

Non-triviality and Consensus failure

We will say that the protocol is non-trivial if there exists any protocol state that isn’t safe on p or on ¬p, there exists a transition to a state σ1 and to a state σ2 such that p is safe at σ1 and ¬p is safe at σ2.

This means that a node at state σ1 and a node at state σ2 do not have a common future protocol state. But they must be able to each get to either of these states.

31 of 69

Non-triviality and Consensus failure

Nodes therefore cannot independently follow any state transitions available to them, but must instead transition through the protocol as a function of interactions with other nodes.

32 of 69

Non-triviality and Consensus failure

We are going to assume that:

  • Some nodes are “named” as consensus-forming nodes
  • Protocol following nodes only experience state transitions when receiving signed messages from these named nodes.

33 of 69

Non-triviality and Consensus failure

Equivocation faults are fundamentally enough to cause consensus failure.

“Equivocation” is behaviour exhibited by a named node that can’t be produced by a single execution of the protocol, but can be produced for by more than one protocol execution.

34 of 69

Non-triviality and Consensus failure

Equivocation faults are fundamentally enough to cause consensus failure.

If protocol-following nodes are to make consistent decisions, they must be able to synchronize based on signed messages from the consensus forming nodes.

�However, if each of these nodes are equivocating, they might be running completely independent versions of the protocol for each of our nodes.

35 of 69

Non-triviality and Consensus failure

Equivocation faults are fundamentally enough to cause consensus failure.

If we might be running through protocol states independently, then why should we be able to maintain consensus safety?

�We might be able to have consensus safety if only some proper subset of consensus forming nodes are equivocating.�

36 of 69

Non-triviality and Consensus failure

Equivocation faults are fundamentally enough to cause consensus failure.��The question for us, then, is how do we make sure that nodes have a common future protocol state as long as there is less than some number of equivocations?

Well first, let’s be sure that we can count equivocations!

37 of 69

CBC Casper Safety Proof Pt. 2

We now insist that messages attest that the sender was at a particular protocol state at the time of sending:

�M = V x Σ

Check if there exists a single protocol execution through each�of the protocol states observed in the sender’s messages.

V(m) is the validator who created the message

S(m) is the protocol state attested to in the message

38 of 69

CBC Casper Safety Proof Pt. 2

From a set of messages from many senders, we can detect equivocation from each sender, and check how many of them equivocated

Eq(m1,m2) := m1≠ m2 and V(m1) = V(m2) and

S(m1) ↛ S(m2) and S(m2) ↛ S(m1)

39 of 69

CBC Casper Safety Proof Pt. 2

If the validators have weight, then we can calculate the �total weight of equivocating validators:

F(s) = 𝛴 W(v), � for all v : ∃m1ϵs, ∃m2ϵs V(m1) = v and Eq(m1,m2)

40 of 69

CBC Casper Safety Proof Pt. 2

Tying the knot:�

Let protocol states be sets of messages that exhibit�not more than some number (weight) of Byzantine faults

�Σ = {sϵ P(M) s.t. F(s) ≤ t} �M = V x Σ�I = {{}} (initial state)

41 of 69

CBC Casper Safety Proof Pt. 2

Tying the knot:

�Receiving (or making) a message that says “I’m at this state” �represents a protocol state transition

�Receiving a message m at protocol state σ means moving �to state σ U m, if it doesn’t exhibit more than t faults.�

42 of 69

43 of 69

CBC Casper Safety Proof Pt. 2

Tying the knot:��Any message is an equivocation w.r.t. some other message � (that may actually never exist).��So receiving m at σ means longer having a �common future protocol state with any set of messages σ’ �such that σ U σ’ doesn't exhibit more than t faults, �but σ U σ’ U m does

44 of 69

CBC Casper Safety Proof Pt. 2

Tying the knot:�

Two nodes have a common future state if there is not�more than t Byzantine faults in their union!

45 of 69

CBC Casper Safety Proof Pt. 2

Tying the knot:

σ1 and σ2 have a common future σ1 U σ2 ... if and only if σ1 U σ2 exhibits not more than t Byzantine faults

So: not more than t Byzantine faults �⇒ nodes only visit states with common futures

46 of 69

CBC Casper Safety Proof Pt. 2

Tying the knot:

So:

nodes exhibit not more than t Byzantine faults �⇒ nodes only visit states with common futures�⇒ consensus safety

47 of 69

CBC Casper Safety Proof Pt. 2

Tying the knot:

And:

consensus failure �⇒ nodes do not have a common future�⇒ nodes are at states whose union exhibit � more than t faults

Which is what was wanted!

48 of 69

Lost States

Lost states are states with with the newly-transitioned-to protocol state does not share a common future with, while the previous protocol state did.

Each transition that corresponds to receiving a message from a non-byzantine validator leads to a non-empty set of lost states. �

This corresponds to states where this new message that resulted in a transition corresponds to the tth equivocation, thus making no common future exist (as this hypothetical common future would have t faults, and thus is not a protocol state).

49 of 69

Lost States

Lost states are states with with the newly-transitioned-to protocol state does not share a common future with, while the previous protocol state did.

If σA ~ σB denotes “having a common future”

LOST(σ, σ’) = {σ’’ : σ ~ σ’’ and σ’ ~/ σ’’}

50 of 69

Lost States

Each transition that corresponds to receiving a message from a non-byzantine validator leads to a non-empty set of lost states. �

LOST(σ, σ U {m}) = {σ’’ : σ ~ σ’’ and σ U m ~/ σ’’}

= {σ’’ : F(σ U σ’’) ≤ t and F(σ U m U σ’’) > t}

= {σ’ U {m} : F(σ’ U σ) = t and F(σ’ U σ U m) = t + 1}

51 of 69

Liveness, Stuck-free-ness, and Non-triviality

52 of 69

Higher-order definitions

EVENTUALLY(P, σ) ⇔ P(σ) ∨ ∀ σ’ . σ → σ’ : EVENTUALLY(p, σ’)

POSSIBLY(P, σ) ⇔ P(σ) ∨ ∃ σ’ . σ → σ’ : POSSIBLY(P, σ’)

53 of 69

Liveness

LIVE(p, σ) ⇔ EVENTUALLY(S(p,x) ⋁ S(¬p,x), σ)

54 of 69

Stuck and Stuck-free-ness

STUCK(p, σ) ⇔ ¬POSSIBLY(S(p,x), σ) ∧ ¬POSSIBLY(S(¬p,x), σ)

¬STUCK(p, σ) ⇔ POSSIBLY(S(p,x), σ) POSSIBLY(S(¬p,x), σ)

55 of 69

Locked on p

L(p, σ) ⇔ EVENTUALLY(S(p,x), σ)

56 of 69

Bivalence

BIVALENT(p, σ) ⇔ ∃σ'. σ σ' : S(p, σ') ∧ ∃σ''. σ σ'' : S(¬p, σ'')

57 of 69

FLP Impossibility

FLP impossibility showed that it’s impossible for a deterministic protocol to be guaranteed to be safe and live in asynchrony, in the context of any faults.

As CBC Casper fits this description, we will be able to use the previous definitions to show a proof of FLP for all CBC Casper protocols.

That is, CBC Casper protocols are not necessarily live in asynchrony!

58 of 69

FLP Impossibility for CBC

  1. Assume that we have a bivalent state with no future bivalent states
  2. Pick two transitions such that:
    1. They aren’t showing equivocations, individually
    2. They aren’t an equivocation, mutually
    3. One is locked on p, and one is locked on not p
  3. Show a contradiction
  4. Therefore, there must be a future bivalent state

59 of 69

What is cool about the cbc protocol states?

  • We can think about liveness totally independently of safety!
  • We can talk about safety and FLP without talking about the estimator!
  • Extra-protocol fault tolerance threshold!

60 of 69

Check out the docs, the repo, and help us!

Read the papers:

Contribute to the codebase: tinyurl.com/cbc-casper-code

Join the gitter and make yourself known: tinyurl.com/cbc-casper-chat

�Ask me about how you can contribute to cbc-casper research!

61 of 69

Thank you for listening�Thank you EDCON Toronto! <3

62 of 69

63 of 69

64 of 69

65 of 69

66 of 69

67 of 69

68 of 69

69 of 69