1 of 37

Virtual Machines

2 of 37

The MITScript Project

Parser (Formal Grammars and Parsing Theory, Phase 1)

Interpreter (Program Semantics, Phase 2)

Virtual Machine

    • Garbage Collector (Memory Management, Phase 3)
    • Execution Organization (Syntax-Directed Translation, Phase 4)

Code Generation and Optimization (Efficiency, Phase 5)

3 of 37

  • Language (“What”)?
    • Syntax
    • Semantics
  • Virtual Machine (“How”)
    • Instruction Set (single operations)
    • Execution Organization (code, stack, heap)
    • Managed Runtime (data type representation, and garbage collection)
  • Machine Code
    • Instruction Set
    • Execution Organization (Segments, Stack, Pages)

Virtual Machine

Language

High-Level

VM

Machine Code

Low-Level

VM

4 of 37

What?

f = fun(y) {

x = y + 2;

};

f(1);

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]

}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

5 of 37

Livecode – show pycache

6 of 37

Recursive Interpreter

 

 

 

 

class Binop : public Expr {

enum BINOP { PLUS, SUB, MUL, DIV};

BINOP op;

Expr *left, *right;

};

int eval_expr(Frame* f, Expr *e) { ... }

int eval_binop(Frame *f, Binop *e) {

switch(e->op) {

case PLUS: return eval_plus(e);

case SUB : return eval_sub(e);

case MUL : return eval_mul(e);

case DIV : return eval_div(e);

}

}

int eval_plus(Frame *f, Binop *e) {

int n_1 = eval_expr(f, e->left);

int n_2 = eval_expr(f, e->right);

return n_1 + n_2;

}

7 of 37

Why?

  • Semantics as recursive inference rules

  • “Easy” to map directly to a recursive interpreter
    • Unstructured control flow can be a little messy, but implementation directly matches semantics

  • Very inefficient!
    • Interpreter is continuously traversing a large datastructure
    • Deep recursive calls have high overheads
    • A simple statement like x = 2+2 may involve thousands of instructions

8 of 37

Why?

In C:

//assuming y is in register r1

//puts the result in x in register r2.

mov $r2, 2

add $r2 $r1 //y + 2

In a dynamic language:

Check what type of value y is.

Get the integer value from the value object of y

Allocate a value object for 2.

Check the type of the value object for 2

Get the integer value from the value object of 2

Allocate a new integer value to compute y + 2

Make x point to that integer value.

x = y + 2

9 of 37

Translation

fun(y) {

x = y + 2;

};

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

function __function_1 (y) : {

%x = alloca int64_t

%0 = %y

%1 = 2

%2 = add %0 %1

store %x %2

%3 = 0

return %3

}

__function_1 :

sub 8 %rsp

mov %rsp %rbp

mov (%rbp, 2, 8) %rax

mov 2 %rcx

add %rcx %rax

mov %rax (%rbp)

mov 0 %rax

ret

High-Level

VM

Low-Level

VM

Machine Code

Language

10 of 37

Virtual Machine: Why?

  • Faster to interpret than the original AST

  • Provide a layer of abstraction between the code and hardware: portable!
    • Opportunity to apply machine-independent analysis and optimizations once
    • E.g., type check elimination, dead code elimination, algebraic simplifications

  • Easier to compile (to machine code) relative to AST

11 of 37

MITScript Virtual Machine

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

f = fun(y) {

x = y + 2;

};

f(1);

12 of 37

Organization (Code, Stack, and Heaps)

  • Code
    • Instructions
    • Functions
    • Metadata

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

f = fun(y) {

x = y + 2;

};

f(1);

13 of 37

Organization (Code, Stack, and Heaps)

  • Stacks
    • Locals
    • Instruction pointer
    • Operand Stack
  • Heaps
    • Associative Array
  • Globals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

14 of 37

Python VM (Illustrative)

Op Stack

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

Heap

Globals

f

Locals

Stack

Frame

15 of 37

Python VM (Illustrative)

Op Stack

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

0

Globals

f

Locals

0

Stack

Frame

Heap

16 of 37

Python VM (Illustrative)

Op Stack

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

1

Globals

f

Locals

Stack

Frame

Heap

0

17 of 37

Python VM (Illustrative)

Op Stack

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

2

Globals

f

Locals

Stack

Frame

Heap

0

18 of 37

Python VM (Illustrative)

Op Stack

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

3

Closure { f : }

Globals

f

Stack

Frame

Heap

0

19 of 37

Python VM (Illustrative)

Globals

f

Op Stack

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

4

Stack

Frame

Heap

Closure { f : }

0

20 of 37

Python VM (Illustrative)

Globals

f

Op Stack

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

5

Stack

Frame

Heap

Closure { f : }

0

21 of 37

Python VM (Illustrative)

Globals

f

Op Stack

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

6

Stack

Frame

Heap

0

Closure { f : }

1

22 of 37

Python VM (Illustrative)

Globals

f

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

7

Op Stack

Locals

y :

x : Uninit

IP

0

Op Stack

Stack

Frame

Heap

0

Closure { f : }

Stack

Frame

1

23 of 37

Python VM (Illustrative)

Globals

f

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

7

Op Stack

Locals

y :

x : Uninit

IP

1

Op Stack

Stack

Frame

Heap

0

Closure { f : }

Stack

Frame

1

24 of 37

Python VM (Illustrative)

Globals

f

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

7

Op Stack

Locals

y :

x : Uninit

IP

2

Stack

Frame

Op Stack

Stack

Frame

Heap

0

Closure { f : }

1

2

25 of 37

Python VM (Illustrative)

Globals

f

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

7

Op Stack

Locals

y :

x : Uninit

IP

3

Op Stack

Stack

Frame

Stack

Frame

Heap

0

Closure { f : }

1

2

3

26 of 37

Python VM (Illustrative)

Globals

f

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

7

Op Stack

Locals

y :

x :

IP

4

Op Stack

1

2

3

Stack

Frame

Stack

Frame

Heap

0

Closure { f : }

27 of 37

Python VM (Illustrative)

Globals

f

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

7

Op Stack

Locals

y :

x :

IP

5

Op Stack

None

Stack

Frame

Stack

Frame

Heap

0

Closure { f : }

1

2

3

28 of 37

Python VM (Illustrative)

Globals

f

Op Stack

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

7

Stack

Frame

Heap

0

Closure { f : }

1

2

3

None

29 of 37

Python VM (Illustrative)

Globals

f

Op Stack

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

8

Stack

Frame

Heap

0

Closure { f : }

1

2

3

30 of 37

Python VM (Illustrative)

Globals

f

Op Stack

Locals

function {

functions = [ ],

constants = [None, 0, 1],

names = [f],

instructions = [

load_const 1

load_func 0

alloc_closure

store_global 0

load_global 0

load_const 2

call 1

pop

load_const 0

return

]}

function {

local_vars = [y, x],

constants = [None, 2],

instructions = [

load_local 0

load_const 1

add

store_local 1

load_const 0

return

]

}

IP

9

Stack

Frame

Heap

0

Closure { f : }

None

1

2

3

31 of 37

General Characteristics

  • More explicit addressing of storage
    • globals, locals, with corresponding operations to access each.

  • Simple instructions perform 1 operation at a time�
  • Explicit control flow through call/return and jumps (we’ll see next time)

32 of 37

Virtual Machine

// Description: push a constant onto the operand stack

// Operand 0: index of constant in enclosing function's list of constants

// Mnemonic: load_const i

// Stack: S => S :: f.constants()[i]

LoadConst,

// Description: push a function onto the operand stack

// Operand 0: index of function in enclosing function's list of functions

// Mnemonic: load_func i

// Stack: S => S :: f.functions()[i]

LoadFunc,

// Description: load value of local variable and push onto operand stack

// Operand 0: index of local variable to read

// Mnemonic: load_local i

// S => S :: value_of(f.local_vars[i])

LoadLocal,

33 of 37

Virtual Machine

// Description: store top of operand stack into local variable

// Operand 0: index of local variable to store into

// Operand 1: value to store

// Mnemonic: store_local i

// Stack: S :: operand 1 ==> S

StoreLocal,

// Description: implements binary operation (as given in the semantics of Assignment #2)

// Operand 0: N/A

// Operand 1: right value

// Operand 2: left value

// Result: value of the operation as specified by the semantics of Assignment #2 // Mnemonic: sub/mul/div

// Stack: S:: operand 2 :: operand 1 => S :: op(operand 2, operand 1)

Add,

34 of 37

Virtual Machine: Why?

  • Faster to interpret than the original AST

  • Provide a layer of abstraction between the code and hardware: portable!
    • Opportunity to apply machine-independent analysis and optimizations once
    • E.g., type check elimination, dead code elimination, algebraic simplifications

  • Easier to compile (to machine code) relative to AST

35 of 37

Language Implementations

Interpreted

Interpreted

Swift

LLVM

EVM

Java

Python VM

Ethereum

Java VM

x86

ARM

SPARC

PowerPC

Language

High-Level

VM

Machine Code

C/C++ (Clang)

Low-Level

VM

JavaScript

C/C++ (gcc)

GIMPLE (HL)

GIMPLE (LL, SSA)

MIPS

LLVM (MC)

Direct

Translation

Python

SIL

LLVM (MC)

LLVM

36 of 37

“Java is an attractive solution for set top boxes. Java byte code can run on any microprocessor. Intel x86 processors dominate desktop computers, but embedded systems like set top boxes may use any number of different processors. Choice of processor in embedded systems is dependent on factors such as cost, performance and footprint. For these reasons, platform neutrality is crucial to success.”

37 of 37

Language Implementations

Interpreted

Interpreted

Swift

LLVM

EVM

Java

Python VM

Ethereum

Java VM

x86

ARM

SPARC

PowerPC

Language

High-Level

VM

Machine Code

C/C++ (Clang)

Low-Level

VM

JavaScript

C/C++ (gcc)

GIMPLE (HL)

GIMPLE (LL, SSA)

MIPS

LLVM (MC)

Direct

Translation

Python

SIL

LLVM (MC)

LLVM