JSON sucks, let’s talk binary!
David Bruant
bdxio - October 16th 2015 - ENSEIRB-MATMECA
Overview
4) Protocol Buffer (protobuf)
5) Other equivalent protocols
6) With infinite resources...
(too long) Introduction
Two devs walk into a bar...
In this very classroom (almost)
Riddle
… so? how many?
(this slide left blank intentionally)
(hat tip Janet Switcher for this joke)
Riddle - answer
Riddle 2 - biased choice
Representing information efficiently
Motivation
PERFORMANCE
Motivation
Why does JSON suck
My position
(aside) The code we write and the wire
The code and the wire
01001001001101010101111010101
Machine 1
Machine 2
JSON.stringify(obj) => send
receive => JSON.parse(str)
encode(obj) => send
receive => decode(buf)
(30 bytes)
(8 bytes)
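The size gap between the two routes can be sketched as follows. The sample object and the fixed two-float layout are assumptions made for illustration; the slides do not show the actual payload behind the 30-byte and 8-byte figures.

```javascript
// Hypothetical sample object (not taken from the slides).
var obj = { lng: -0.605, lat: 44.806 };

// Text route: JSON string, then its UTF-8 bytes.
var jsonBytes = new TextEncoder().encode(JSON.stringify(obj));

// Binary route: two 32-bit floats, little-endian, no field names on the wire.
function encode(o) {
  var view = new DataView(new ArrayBuffer(8));
  view.setFloat32(0, o.lng, true);
  view.setFloat32(4, o.lat, true);
  return new Uint8Array(view.buffer);
}

function decode(bytes) {
  var view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { lng: view.getFloat32(0, true), lat: view.getFloat32(4, true) };
}

var binBytes = encode(obj);
console.log("JSON bytes:", jsonBytes.length, "binary bytes:", binBytes.length);
console.log(decode(binBytes));
```

The binary version only works because both machines agree on the layout beforehand, and 32-bit floats round the values slightly.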
Protocol Buffer
Protocol buffer - Origins
Protocol buffer - Because binary protocols are hard
https://en.wikipedia.org/wiki/IPv4#Header
Protocol buffer - Introduction
// message description
message measure {
float lng = 1;
float lat = 2;
repeated int32 values = 4 [packed=true];
}
function measureEncode(obj){
var b = new Buffer(...)
// read obj, write in b
return b;
}
function measureDecode(b){
var obj = {};
// read b, write in obj
return obj;
}
(protobuf wire protocol)
Protocol Buffer - Introduction
Protocol buffer tricks - Binary 101
0 => 0 | ... | 9 => 9 |
a => 10 | b => 11 | c => 12 |
d => 13 | e => 14 | f => 15 |
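A quick way to check the digit table above in a JavaScript console, using only built-in number conversions. The byte value is chosen for illustration.

```javascript
var byte = 0xac; // two hex digits: a => 10, c => 12

var bits = byte.toString(2).padStart(8, "0"); // binary, padded to 8 bits
var hex = byte.toString(16);                  // back to hexadecimal
var fromHex = parseInt("ac", 16);             // hex string to number

console.log(bits);    // "10101100"
console.log(hex);     // "ac"
console.log(fromHex); // 172 (10 * 16 + 12)
```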
Protocol buffer tricks - varint
Bytes on the wire | 7-bit groups (most significant first) | Value |
00000001 | 0000001 | 1 |
10101100 00000010 | 0000010 0101100 | 300 |
10101100 10000000 00010001 | 0010001 0000000 0101100 | 278572 |
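The varint rows above can be reproduced with a minimal sketch. The function names are hypothetical, and the code assumes non-negative integers small enough to be exact in a JavaScript number.

```javascript
// Encode a non-negative integer as a varint:
// 7 bits per byte, least significant group first,
// high bit set on every byte except the last.
function encodeVarint(n) {
  var bytes = [];
  while (n > 127) {
    bytes.push((n % 128) + 128); // low 7 bits + continuation bit
    n = Math.floor(n / 128);
  }
  bytes.push(n); // last byte, continuation bit clear
  return bytes;
}

// Decode a varint found at the start of an array of bytes.
function decodeVarint(bytes) {
  var value = 0;
  var factor = 1;
  for (var i = 0; i < bytes.length; i++) {
    value += (bytes[i] & 0x7f) * factor;
    factor *= 128;
    if ((bytes[i] & 0x80) === 0) break; // no continuation bit: done
  }
  return value;
}

console.log(encodeVarint(300));                // [ 172, 2 ], i.e. AC 02
console.log(decodeVarint([0xac, 0x80, 0x11])); // 278572
```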
Protocol buffer tricks - encoding a list
Protocol buffer tricks - types
Type | Meaning |
0 | varint |
1 | 64-bit |
2 | Length-delimited |
... | (other types) |
https://developers.google.com/protocol-buffers/docs/encoding#structure
First byte of a field is
(field_number << 3) | wire_type
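The formula above as a small sketch, with hypothetical helper names. It assumes field numbers up to 15, so the tag fits in a single byte; larger field numbers make the tag itself a multi-byte varint.

```javascript
// Build the tag byte from a field number and a wire type.
function makeTag(fieldNumber, wireType) {
  return (fieldNumber << 3) | wireType;
}

// Split a tag byte back into its two parts.
function readTag(tag) {
  return { fieldNumber: tag >>> 3, wireType: tag & 0x07 };
}

console.log(makeTag(4, 2).toString(16)); // "22"
console.log(readTag(0x22));              // { fieldNumber: 4, wireType: 2 }
```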
Protocol buffer tricks - example
message Test4 {
repeated int32 d = 4 [packed=true];
}
encodeTest4({d: [3, 270, 86942]}) :
22 // tag (field number 4, wire type 2)
06 // payload size (6 bytes)
03 // first element (varint 3)
8E 02 // second element (varint 270)
9E A7 05 // third element (varint 86942)
https://developers.google.com/protocol-buffers/docs/encoding#packed-repeated-fields
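A hand-written sketch of what an encoder for Test4 might do. It is not the code a protobuf compiler generates, and it assumes non-negative values and a payload shorter than 128 bytes.

```javascript
// Varint: 7 bits per byte, least significant group first.
function encodeVarint(n) {
  var bytes = [];
  while (n > 127) {
    bytes.push((n % 128) + 128);
    n = Math.floor(n / 128);
  }
  bytes.push(n);
  return bytes;
}

// Packed repeated field: tag, payload size, then the varints back to back.
function encodeTest4(obj) {
  var payload = [];
  obj.d.forEach(function (v) {
    payload = payload.concat(encodeVarint(v));
  });
  var tag = (4 << 3) | 2; // field number 4, wire type 2 (length-delimited)
  return [tag].concat(encodeVarint(payload.length), payload);
}

var hex = encodeTest4({ d: [3, 270, 86942] })
  .map(function (b) { return b.toString(16).padStart(2, "0").toUpperCase(); })
  .join(" ");

console.log(hex); // "22 06 03 8E 02 9E A7 05"
```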
Protocol buffer - limitations
Protocol buffer - limitations - date
Protocol buffer - limitations - delta encoding
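For reference, delta encoding stores each value as the difference from the previous one, which keeps the numbers small when values change slowly. The sketch below is a generic illustration with hypothetical function names and sample data. It would have to run in application code before and after the protobuf encoding step.

```javascript
// Replace each value with its difference from the previous value.
function deltaEncode(values) {
  var prev = 0;
  return values.map(function (v) {
    var d = v - prev;
    prev = v;
    return d;
  });
}

// Rebuild the original values with a running sum.
function deltaDecode(deltas) {
  var acc = 0;
  return deltas.map(function (d) {
    acc += d;
    return acc;
  });
}

var values = [1000, 1002, 1005, 1004]; // hypothetical measurements
var deltas = deltaEncode(values);

console.log(deltas);              // [ 1000, 2, 3, -1 ]
console.log(deltaDecode(deltas)); // [ 1000, 1002, 1005, 1004 ]
```

Deltas can be negative, and a negative int32 takes 10 bytes as a varint, so a field carrying deltas would probably be declared sint32 (zigzag encoding) rather than int32.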
Protobuf on the web
Other equivalent ideas
With infinite resources...
Making protobuf easier/better
Can we do better than protobuf and the others?
// message description
message measure {
float lng = 1;
float lat = 2;
repeated int32 values = 4 [packed=true];
}
function measureEncode(obj){
var b = new Buffer(...)
// read obj, write in b
return b;
}
function measureDecode(b){
var obj = {};
// read b, write in obj
return obj;
}
(protobuf wire protocol)
Can we do better than protobuf and the others?
Constraints
- backward compat
- forward compat
- lazy random read
- ....
function encode(obj){
var b = new Buffer(...)
// read obj, write in b
return b;
}
function decode(b){
var obj = {};
// read b, write in obj
return obj;
}
create wire protocol
Message description
- language (string)
- Date
- some encoding (like delta-encoding, zip)
- ….
Nobody cares about this!
Announcing Protocol Bruant
… no, just kidding :-)
Thanks!