Fast arithmetic for dynamic languages
Jaroslav Sevcik, Benedikt Meurer, Georg Neis
jarin@google.com, bmeurer@google.com, neis@google.com
V8 compiler engineers
Confidential + Proprietary
Confidential + Proprietary
Plan
Confidential + Proprietary
Encoding of numbers in V8 (x64)
Small integers are encoded into “tagged pointers”.
Integers:
Reference:
E.g., integer 42 encoded as 0x0000002a00000000,� pointer 0x12345678 encoded as 0x12345679
Non-integer numbers live �on the heap.
0000:0000
int32
high 63-bits of pointer
1
Pointer to heap num
1
<number header>
1
Float64 payload
Heap
Confidential + Proprietary
NaN tagging (JavaScriptCore x64)
Integers, doubles and pointers encoded in 64 bits:
Integers:
� Double
� Reference
In addition, false, true, undefined and null have special encodings (6, 7, 10, 2).
int32
FFFF:0000
48 bit pointer
0000
Payload
0001-FFFE
Double bit-pattern encoded by incrementing by 2^48, so that integer and references overlap with float64 NaN aspace.
Confidential + Proprietary
Addition in spec (semantics)
Confidential + Proprietary
Addition in spec (semantics)
Confidential + Proprietary
Addition in spec (semantics)
Confidential + Proprietary
Addition in spec (semantics)
Confidential + Proprietary
Addition in spec (semantics)
Confidential + Proprietary
Arithmetic in V8’s interpreter
Call to a hand-written assembly code (aka stub) for the fast cases:
Other objects handled fall back to a generic C++ implementation.
Confidential + Proprietary
Type feedback
Binary operation stubs also collect type feedback and store it on functions.
function f(a, b){� var c = a + a;� return b + c;�}
Function info
Feedback vector
...
...
StackCheck
Ldar a0
Add a0, [2]
Star r0
Ldar r0
Add a1, [3]
Return
2: unitialized
3: unitialized
Confidential + Proprietary
Type feedback
Binary operation stubs also collect type feedback and store it on functions.
function f(a, b){� var c = a + a;� return b + c;�}
f(1, 2);
Function info
Feedback vector
...
...
2: int x int -> int
3: int x int -> int
StackCheck
Ldar a0
Add a0, [2]
Star r0
Ldar r0
Add a1, [3]
Return
Confidential + Proprietary
Type feedback
Binary operation stubs also collect type feedback and store it on functions.
function f(a, b){� var c = a + a;� return b + c;�}
f(1, 2)�f(1, 0.1)
Function info
Feedback vector
...
...
2: int x int -> int
3: num x int -> num
StackCheck
Ldar a0
Add a0, [2]
Star r0
Ldar r0
Add a1, [3]
Return
Confidential + Proprietary
Optimizing compiler
Uses type feedback to insert dynamic checks.
Generally, checks inserted early in the compiler pipeline.
After the check, compiler uses the type as the static type.
If the check fails, deoptimize.
Confidential + Proprietary
Compile with types
Check inputs, add them up, check for overflow, and return the result.
function f(a, b){� var c = a + a;� return b + c;�}
f(1, 2);�f(1, 2);
StackCheck
Ldar a0
Add a0, [2]
Star r0
Ldar r0
Add a1, [3]
Return
Optimized code
...
movq rax,[rbp+0x18]
test al,0x1
jnz 115
movq rbx,rax
addq rbx,rax
jo 120
movq rax,[rbp+0x10]
test al,0x1
jnz 125
addq rbx,rax
jo 130
movq rax,rbx
movq rsp,rbp
pop rbp
ret 0x18
...
Feedback vector
...
...
2: int x int -> int
3: int x int -> int
Confidential + Proprietary
Compile with types
Check inputs, add them up, check for overflow, and return the result.
function f(a, b){� var c = a + a;� return b + c;�}�// Optimize�f(1, 2);�f(1, 0.1); // Deopt!
StackCheck
Ldar a0
Add a0, [2]
Star r0
Ldar r0
Add a1, [3]
Return
Optimized code
...
movq rax,[rbp+0x18]
test al,0x1
jnz 115
movq rbx,rax
addq rbx,rax
jo 120
movq rax,[rbp+0x10]
test al,0x1
jnz 125
addq rbx,rax
jo 130
movq rax,rbx
movq rsp,rbp
pop rbp
ret 0x18
...
Deoptimizer
Feedback vector
...
...
2: int x int -> int
3: int x int -> int
Confidential + Proprietary
Deoptimization
Check inputs, add them up, check for overflow, and return the result.
function f(a, b){� var c = a + a;� return b + c;�}�// Optimize�f(1, 2);�f(1, 0.1); // Deopt!
Feedback vector
...
...
2: int x int -> int
3: int x int -> int
-> StackCheck
Ldar a0
Add a0, [2]
Star r0
Ldar r0
Add a1, [3]
Return
Deoptimizer
Confidential + Proprietary
Deoptimization
Check inputs, add them up, check for overflow, and return the result.
function f(a, b){� var c = a + a;� return b + c;�}�// Optimize�f(1, 2);�f(1, 0.1);
Feedback vector
...
...
2: int x int -> int
3: num x int -> num
StackCheck
Ldar a0
Add a0, [2]
Star r0
Ldar r0
Add a1, [3]
Return
Deoptimizer
Type feedback is updated when executing addition again.
Confidential + Proprietary
Re-compile with types
Check inputs, add them up, convert, check for overflow, and return the result.
function f(a, b){� var c = a + a;� return b + c;�}
f(1, 0.1)�f(1, 0.1)
Feedback vector
...
...
2: int x int -> int
3: num x int -> num
StackCheck
Ldar a0
Add a0, [2]
Star r0
Ldar r0
Add a1, [3]
Return
Optimized code
...
movq rax,[rbp+0x18]�test al,0x1�jnz 179 (0x120096c04373)�shrq rax, 32�movq rbx,rax�addl rbx,rax�jo 332 (0x120096c0440c)�movq rax,[rbp+0x10]�test al,0x1�jz 91 (0x120096c0431b)�movq r10,[r13+0x50]�cmpq [rax-0x1],r10�vmovsd xmm0,[rax+0x7]�jnz 337 (0x120096c04411)�jmp 107 (0x120096c0432b)�movq r10,rax�shrq r10, 32�vxorpd xmm0,xmm0,xmm0�vcvtlsi2sd xmm0,xmm0,r10�vxorpd xmm1,xmm1,xmm1�vcvtlsi2sd xmm1,xmm1,rbx�vaddsd xmm0,xmm1,xmm0�movq rbx,[r13+0x7ff3f0]�movq rax,rbx�addq rax,0x10�cmpq rax,[r13+0x7ff3f8]�ja 254 (0x120096c043be)�movq [r13+0x7ff3f0],rax�incq rbx�movq r10,[r13+0x50]�movq [rbx-0x1],r10�vmovsd [rbx+0x7],xmm0�movq rax,rbx�movq rsp,rbp�pop rbp�ret 0x18�...
Deoptimizer
Confidential + Proprietary
Deoptimization pitfalls
Deoptimization is expensive!
�We expect to reach steady state.
Not always guaranteed, compiler must be careful.
If no feedback gets updated, we can run into deoptimization loop:
Function keeps getting deoptimized and optimized again until we reach deoptimization limit and the function is never optimized again.
Confidential + Proprietary
Deoptimization loop example
function f(x, y) { // First run with x=1, y = undefined.� if (x) y = x + 1; // Integer feedback for +� // Type feedback suggests y is integer here and used� // as integer later.� // Crankshaft will try to store y in int32… � // … and deoptimize if y is not int32.� if (x) y = y + 2; // Integer feedback for +� return y;�}
Confidential + Proprietary
Deoptimization loop example
function f(x, y) { // First run with x=1, then x=0.� if (x) y = x + 1; // Integer feedback for +� // Deoptimizes here if y not int32!� if (x) y = y + 2; // Integer feedback for +� return y;�}
�f(1, undefined); // Prime the feedback, then optimize.�// Crankshaft keeps deoptimizing, eventually disables opt.�for (var i = 0; i < 1000000; i++) f(0, undefined);
Confidential + Proprietary
Can we do better?
Javascript code (Octane crypto.js)� var l = a[i]&0x3fff, h = a[i++]>>14, m = xh*l+h*xl;� l = ((m&0x3fff)<<14 …
Asm.js code (Octane zlib.js)
e=...|0;K=...|0; ...�S=e+K+c+b+f+E+D+m+o+g+a+C+B+A+z+y+R|0
No overflow checks needed, bitwise ops only observe low 32 bits!
Confidential + Proprietary
Truncations
What is a truncation?
In (x + y)|0, we only care about the lowest 32 bits of the result of the sum. �If x, y are int52, we only care about lowest 32 bits of x and y.
Expression +a[i] does not distinguish between a[i] = undefined and a[i] = NaN.�Reading holes can produce NaN! (Instead of undefined.)
For c ? x : y, we do not distinguish between c = 1 and c = true.
Confidential + Proprietary
Truncation propagation
Propagate truncations from uses to definitions.
At definition, take least upper bound of all uses.
Choose input truncations based on the use truncations.
Confidential + Proprietary
Example 1
var a = (x|0)+(y|0)*(z&0xff)|0
ToInt32
ToInt32
&
*
+
ToInt32
Confidential + Proprietary
Example 1 (type upper bounds)
var a = (x|0)+(y|0)*(z&0xff)|0
ToInt32
ToInt32
&
*
+
ToInt32
Type: ..-547608329985
Type: 0-255
Type: ..-549755813632
Type: I32
Type: I32
Confidential + Proprietary
Example 1 (truncation propagation)
var a = (x|0)+(y|0)*(z&0xff)|0
ToInt32
ToInt32
&
*
+
ToInt32
Type: ..-547608329985
Type: 0-255
Type: ..-549755813632
Word32
None
None
None
None
Type: I32
Type: I32
Confidential + Proprietary
Example 1 (truncation propagation)
var a = (x|0)+(y|0)*(z&0xff)|0
ToInt32
ToInt32
&
*
ToInt32
Type: ..-547608329985
Type: 0-255
Type: ..-549755813632
Word32
Word32
Word32
None
None
Type: I32
Type: I32
Confidential + Proprietary
Example 1 (truncation propagation)
var a = (x|0)+(y|0)*(z&0xff)|0
ToInt32
ToInt32
&
* I32
ToInt32
Type: ..-547608329985
Type: 0-255
Type: ..-549755813632
Word32
Word32
Word32
Word32
Word32
Type: I32
Type: I32
No overflow checks necessary, all operations in int32!
Confidential + Proprietary
Example 2
var a = (y|0)*(z&0xff); f(a); a = (x|0)+(a|0)|0
ToInt32
ToInt32
&
*
+
ToInt32
Call
Confidential + Proprietary
Example 2
var a = (y|0)*(z&0xff); f(a); a = (x|0)+(a|0)|0
ToInt32
ToInt32
&
* F64
ToInt32
Call
Word32
Word32
Word32
Any
Float64
Float64
Confidential + Proprietary
Example 2
var a = (y|0)*(z&0xff); f(a); a = (x|0)+(a|0)|0
ToInt32
ToInt32
&
* F64
ToInt32
Call
I32 -> F64
I32 -> F64
F64 -> W32
Confidential + Proprietary
Other uses of truncations
The truncation machinery can be used for other optimizations:
Propagation only in V8’s Turbofan compiler, mostly tuned for asm.js-like workloads.
Still some challenges to balance type feedback and truncations.
Confidential + Proprietary
Challenges
Currently, truncation analysis runs first, type feedback does not affect the truncation decision.
Smarter choice of “worse” representations if profitable.
Confidential + Proprietary
Future directions
It is possible that JavaScript could get arbitrary precision integers.
Is it preferable to have (U)Int64 or BigNum in the language?
Does SIMD make sense for JavaScript (or other dynamic languages)?
Confidential + Proprietary