1 of 52

09 September 2021, ICTAC

Formal Design, Implementation and Verification of Blockchain Languages using K

Grigore Rosu

Professor of CS, University of Illinois at Urbana

President & CEO, Runtime Verification Inc.

2 of 52

What is ?

2

3 of 52

Ideal Language Framework Vision

3

Deductive program verifier

Parser

Interpreter

Compiler

(semantic) Debugger

Symbolic execution

Model checker

Formal Language Definition

(Syntax and Semantics)

4 of 52

Current State-of-the-Art�- Sharp Contrast to Ideal Vision -

4

C

Java

JavaScript

Solidity

Ethereum VM

Interpreter

Symbolic Execution

Compiler

Model Checker

Deductive Verifier

Separate tools, by separate teams, little to no code shared

5 of 52

Current State-of-the-Art�- Sharp Contrast to Ideal Vision -

5

C

Java

JavaScript

Solidity

Ethereum VM

Interpreter

Symbolic Execution

Compiler

Model Checker

Deductive Verifier

The story of the PL/FM community.

Maintenance hell

(L * T systems).

Uneconomical.

Wasted talent!

L

T

6 of 52

How It Should Be

6

C

Java

JavaScript

Solidity

Ethereum VM

Interpreter

Symbolic Execution

Compiler

Model Checker

Deductive Verifier

Ideal Language Framework

L

T

7 of 52

K Framework�http://kframework.org

  • We tried various semantic styles, for >15y and >100 top-tier conference and journal papers:
    • Small-/big-step SOS; Evaluation contexts; Abstract machines (CC, CK, CEK, SECD, …); Chemical abstract machine; Axiomatic; Continuations; Denotational;…
  • But each of the above had limitations
    • Especially related to modularity, notation, verification
  • K framework initially engineered: keep advantages and avoid limitations of various semantic styles
    • Then theory came

7

8 of 52

Complete K Definition of KernelC

8

9 of 52

Complete K Definition of KernelC

9

Syntax declared using annotated BNF

10 of 52

Complete K Definition of KernelC

10

Configuration given as a nested cell structure.

Leaves can be sets, multisets, lists, maps, or syntax

11 of 52

Complete K Definition of KernelC

11

Semantic rules given contextually

rule

<k> X = V => V …</k>

<env>… X |-> (_ => V) …</env>

Semantic rules given contextually

rule

k: X = V => V …

env: … X |-> (_ => V) …

12 of 52

K Scales

Several large languages were recently defined in K:

  • JavaScript ES5: by Park etal [PLDI’15]
    • Passes existing conformance test suite (2872 programs)
    • Found (confirmed) bugs in Chrome, IE, Firefox, Safari
  • Java 1.4: by Bogdanas etal [POPL’15]
  • x86: by Dasgupta etal [PLDI’19]
  • C11: Ellison etal [POPL’12, PLDI’15]
    • 192 different types of undefined behavior
    • 10,000+ program tests (gcc torture tests, obfuscated C, …)
    • Commercialized by startup (Runtime Verification, Inc.)

+ EVM[CSF’18], Solidity, IELE[FM’19], WASM, Michelson, …

13 of 52

K Configuration and Definition of C

13

120 Cells!

… plus ~5000 rules …

14 of 52

Ideal Language Framework Vision [K]

14

Deductive program verifier

Parser

Interpreter

Compiler

(semantic) Debugger

Symbolic execution

Model checker

Formal Language Definition

(Syntax and Semantics)

K -> LLVM -> x86

15 of 52

1. Translate K def to LLVM�2. Compile LLVM to x86

Code (6-int-overflow.c)

RV-Match: Commercial tool

  • Instance of K concrete execution with C lang
  • Automatic debugger for subtle bugs other tools can't find, with no false positives
  • Seamless integration with unit tests, build infrastructure, and continuous integration
  • Platform for analyzing programs, boosting standards compliance and assurance

Concrete Execution Backend

16 of 52

RV-Match on Toyota ITC Benchmark�- Comparison with Other Analysis Tools -

Toyota Benchmark

ISSRE ’15 (1276 tests)

RV-Match

GrammaTech

CodeSonar

MathWorks

Code Prover

MathWorks

Bug Finder

GCC

Clang

DR

FPR

PM

DR

FPR

PM

DR

FPR

PM

DR

FPR

PM

DR

FPR

PM

DR

FPR

PM

Static memory

100

100

100

100

100

100

97

100

98

97

100

98

0

100

0

15

100

39

Dynamic memory

94

100

97

89

100

94

92

95

93

90

100

95

0

100

0

0

100

0

Stack-related

100

100

100

0

100

0

60

70

65

15

85

36

0

100

0

0

100

0

Numerical

96

100

98

48

100

69

55

99

74

41

100

64

12

100

35

11

100

33

Resource management

93

100

96

61

100

78

20

90

42

55

100

74

6

100

25

3

100

18

Pointer-related

98

100

99

52

96

71

69

93

80

69

100

83

9

100

30

13

100

36

Concurrency

67

100

82

70

77

73

0

100

0

0

100

0

0

100

0

0

100

0

Inappropriate code

0

100

0

46

99

67

1

97

10

28

94

51

2

100

13

0

100

0

Miscellaneous

63

100

79

69

100

83

83

100

91

69

100

83

11

100

34

11

100

34

AVERAGE (Unweighted)

79

100

89

59

97

76

53

94

71

52

98

71

4

100

20

6

100

24

AVERAGE (Weighted)

82

100

91

68

98

82

53

95

71

62

99

78

5

100

22

7

100

26

 

[CAV’16]

17 of 52

From RV-Match to Blockchain

  • RV-Match currently commercialized within

  • Same K technology used for defining / executing blockchain languages: EVM, IELE, WASM, Michelson,

17

18 of 52

Ideal Language Framework Vision [K]

18

Deductive program verifier

Parser

Interpreter

Compiler

(semantic) Debugger

Symbolic execution

Model checker

Formal Language Definition

(Syntax and Semantics)

19 of 52

State-of-the-Art

  • Redefine the language using a different semantic approach (Hoare/separation/dynamic logic)
  • Language specific, non-executable, error-prone

19

20 of 52

What We Want

  • Use directly the trusted

language model/semantics!

  • Language-independent proof system
    • Takes operational semantics as axioms
    • Derives reachability properties
    • Sound and relatively complete for all languages!

20

Formal Language Definition

(Syntax and Semantics)

Deductive program verifier

Symbolic execution

21 of 52

Matching μ-Logic

[…, LICS’13, RTA’15, OOPSLA’16, FSCD’16, LMCS’17, LICS’19, ICFP’20]

16 proof rules only!

Simple proof checker (200 LOC, vs Coq’s 30000 LOC)!

22 of 52

Expressiveness of Matching μ-Logic

23 of 52

K = (Best Effort) Implementation of ML

  • Reachability logic implemented in K, generically

23

EVM

Solidity

Michelson

WASM

  • Evaluated it with the existing semantics of C, Java, JavaScript, EVM, and several tricky programs
  • Morale:
    • Performance is comparable with language-specific provers!

24 of 52

for the Blockchain

24

25 of 52

K Framework Vision

25

Deductive program verifier

Parser

Interpreter

Compiler

(semantic) Debugger

Symbolic execution

Model checker

Formal Language Definition

(Syntax and Semantics)

Blockchain Language Definition

26 of 52

KEVM: Semantics of the Ethereum Virtual Machine (EVM) in K

Complete semantics of EVM in K

    • https://github.com/kframework/evm-semantics
    • Passes 60,000+ tests of C++ reference implementation
    • Faster (!) than EVM simulators, like ethereumjs (Truffle)
    • Used as canonical EVM spec (replacing Yellow Paper)

26

[CSL’18]

27 of 52

1) Formal documentation: http://jellopaper.org

27

What Can We Do with KEVM?

28 of 52

What Can We Do with KEVM?

2) Generate and deploy correct-by-construction EVM client/simulator/emulator

Firefly tool: KEVM to run, analyze and monitor tests

Cardano testnet: mantis executing KEVM

28

29 of 52

What Can We Do with KEVM?

3) Formally verify Ethereum smart contracts

RV does that commercially. Won first Ethereum Security grant to verify Casper, then were hired to formalize Beacon Chain (Serenity) and verify ETH1 -> ETH2 deposit contract

29

[ FSE’18 ]

30 of 52

Smart Contract Verification Workflow

ERC20 Informal Business Logic

Formalize

Refine

1

rule

transfer(T, V) => true

caller: F

account:

id: F

balance: BF => BF - V

account:

id: T

balance: BT => BT + V

log: . => Transfer(F,T,V)

requires

0 <= V /\

V <= BF /\

BT + V <= MAXVALUE

ERC20-K formal executable high-level spec

2

[transfer]

callData: #abiCallData("transfer", #address(TO_ID), #uint256(VALUE))

gas: {GASCAP} => _

refund: _ => _

requires:

andBool 0 <=Int TO_ID andBool TO_ID <Int (2 ^Int 160)

andBool 0 <=Int VALUE andBool VALUE <Int (2 ^Int 256)

andBool 0 <=Int BAL_FROM andBool BAL_FROM <Int (2 ^Int 256)

andBool 0 <=Int BAL_TO andBool BAL_TO <Int (2 ^Int 256)

[transfer-success]

k: #execute => (RETURN RET_ADDR:Int 32 ~> _)

localMem: .Map => ( .Map[ RET_ADDR := #asByteStackInWidth(1, 32) ] _:Map )

log: _:List ( .List => ListItem(#abiEventLog(ACCT_ID, "Transfer", #indexed(#address(CALLER_ID

……….

………

…….

ERC20-EVM formal executable low-level spec that contains all EVM details

3

31 of 52

Notable Contracts� We’ve Verified

  • Alchemix, Element
  • ETH2.0 Deposit
  • GnosisSafe
  • Ethereum Casper FFG
  • Uniswap
  • DappSys DSToken ERC20
  • Bihu KEY token

31

32 of 52

K Semantics of Other�Blockchain Languages

  • WASM (web assembly) – collaboration with the Ethereum Foundation, Web3, Emurgo, Elrond
  • Michelson – collaboration with Tezos; K interpreter comparable at performance with their OCAML reference implementation
  • Solidity, Vyper – with the Ethereum Foundation
  • TEAL + AlgoClarity – collaboration with Algorand
  • Plutus (functional) – collaboration with IOHK

32

33 of 52

Modelling and Verification of�Blockchain Protocols

  • Matching logic, rewriting and K can also be used to formally specify and verify consensus protocols, random number generators, etc.
  • Done or ongoing:
    • Casper FFG, Casper CBC (Ethereum Foundation)
    • RANDAO (Ethereum Foundation)
    • Algorand (Algorand)
    • ETH 2.0 / Beacon Chain (Ethereum Foundation)
    • Platon (Platon Networks)
  • Several others planned or in discussions

33

34 of 52

Taking K to the Next Level

  • Proof objects
  • Universal blockchain technology

34

35 of 52

Proof Object Generation

35

Deductive program verifier

Parser

Interpreter

Compiler

(semantic) Debugger

Symbolic execution

Model checker

Formal Language Definition

(Syntax and Semantics)

36 of 52

Proof Object Generation

  • Each of the K tools is a best-effort implementation of proof search in Matching µ-Logic:

  • New Haskell backend of K will explicitly generate proof objects for verification tasks

36

16 proof rules only!

Simple proof checker (<200 LOC)!

In contrast, Coq has about 45 proof rules, and its proof checker has 30000+ lines of OCAML

37 of 52

Proof Object Generation

  • No need to trust the (complex) K implementation, nor any company (including Runtime Verification)
    • It is known that program verifiers / tools can have bugs in spite of best efforts, bug finders and company prestige
  • Proof objects act as 3rd-party checkable correctness certificates on the blockchain, in a proof-carrying code style (proofs can be stored offchain, or snarked)
  • In combination with domain-specific languages for requirements specifications, this will offer the highest level of software assurance known to man

37

38 of 52

Ultimate Goal����� a Universal Blockchain Technology

38

39 of 52

Goals

K-powered blockchain: blockchain of truth

    • Smart contracts in any programming language formalized in K
    • Or better, smart contracts directly as executable K specifications
    • Computation = proof: builtin proof checker will ensure valid computations and correctness claims on the blockchain

39

40 of 52

K – A Universal Blockchain Language

  • We want to be able to write (provably correct) smart contracts in any programming language.
  • All you need is a K-powered blockchain!

40

K language semantics will be stored on blockchain. Fast LLVM backend of K can be used as execution engine / VM.

Vyper

V1.2

Solidity

V0.2.21

Plutus

V2.1

41 of 52

Semantics-Based Compilation

41

Deductive program verifier

Parser

Interpreter

Compiler

(semantic) Debugger

Symbolic execution

Model checker

Formal Language Definition

(Syntax and Semantics)

42 of 52

Semantics-Based Compilation (SBC)

Goals

    • Execution of P in L equivalent to executing LP in a start configuration
    • LP should be “as simple as possible”, only capturing exactly the dynamics of L necessary to execute program P

Program P in Language L

Semantics-Based Compilation

𝕂 Semantics of Language L

𝕂 Semantics of Language LP

43 of 52

K – A Universal Blockchain Language

  • K-powered blockchain enables (provably correct) smart contracts in any programming language!

43

Vyper

V1.2

Solidity

V0.2.21

Plutus

V2.1

1. Write contract P in any language, say L (unique address)

2. SBC[L] your P into LP ; verify P (or LP) with K prover

44 of 52

K – A Universal Blockchain Language

  • When all the projected K tools will be completed, K will provide everything we need to
    • Design new smart contract languages, add them in the blockchain and start using them right away, with auto-generated correct(!) implementations and tools
    • Same for virtual machines(!) and consensus protocols(!)
  • Everything will be either a trusted formal specification or generated automatically from one
  • Proof objects will serve as correctness certificates
  • Perfect. No compromise! … moreover …

44

45 of 52

a Ultimate Smart Contract Language

45

46 of 52

K as a Smart Contract Language

  • Smart contracts implement transactions
    • Often using poorly designed and thus insecure languages, compilers and interpreters / VMs

K also implements transactions, directly!

    • Indeed, each K rule instance is a transaction
  • Each smart contract (Solidity, EVM, …) requires a formal specification in order to be verified

K formal specifications are already executable!

    • And indeed, they are validated by heavy testing

46

Hm, then why not write my smart contracts directly and only as K executable specifications?

47 of 52

Example: ERC20 Token in Solidity�- Snippet -

47

48 of 52

Example: ERC20 Compiled to EVM�- Snippet -

48

Bytecode:

Opcodes:

  • Unreadable
  • Slow: ~25ms to execute (ganache)
  • Untrusted compiler, so it needs to be formally verified to be trusted
    • We formally verify it using KEVM against the following K specification:

49 of 52

K Specification of ERC20�- Snippet, Sugared -

49

rule transfer(To, V) => true

caller: From

account: id: From balance: BalanceFrom => BalanceFrom - V

account: id: To balance: BalanceTo => BalanceTo + V

log: . => Transfer(From, To, V)

requires 0 <= V <= BalanceFrom /\ BalanceTo + V <= MAXVALUE

  • Formal, yet understandable by non-experts
  • Executable, thus testable (for increased confidence)
  • Fast: ~2ms to execute with LLVM backend of K
  • No compiler required, correct-by-construction
  • Use K as programming language for smart contracts!

(needed: gas model for K)

50 of 52

Blockchain of Truth

50

51 of 52

Blockchain of Truth (BoT)

  • There is only one known way that can be used to assert, with absolute certainty, correctness:

  • We envision a blockchain of truth, whose goal is to store and certify mathematical facts
    • Many correctness claims are mathematical facts
      • Formal verification, trusted computation, etc.
    • Trust reduces to a simple, <200 line program:

51

mathematical proof

proof checker

52 of 52

Correct-By-Construction and Fast�is Possible!

52

Deductive program verifier

Parser

Interpreter

Compiler

(semantic) Debugger

Symbolic execution

Model checker

Formal Language Definition

(Syntax and Semantics)