HTTP/2 is here, let’s optimize!
or, why (some of) yesterday's best practices are today's HTTP/2 anti-patterns.
+Ilya Grigorik
@igrigorik
RFC 7540
“9% of all Firefox (M36) HTTP transactions are happening over HTTP/2. There are actually more HTTP/2 connections made than SPDY ones. This is well exercised technology.”
Feb 18, 2015 - Patrick McManus, Mozilla
New TLS + NPN/ALPN connections in Chrome:
May 26, 2015 - Chrome telemetry
HTTP/2 is here.
HTTP/2 is well tested.
HTTP/2 is replacing SPDY.
See RFC7540 (HTTP/2) and RFC7541 (HPACK). HTTP/2 has already surpassed SPDY in adoption, and Chrome will deprecate SPDY (and NPN) in early 2016.
HTTP/2 in ~5 slides...
the what, why, and how behind the new protocol.
"HTTP/2 is a protocol designed for low-latency transport of content over the World Wide Web"
Latency vs Bandwidth impact on Page Load Time
“To speed up the Internet at large, we should look for more ways to bring down RTT. What if we could reduce cross-atlantic RTTs from 150 ms to 100 ms? This would have a larger effect on the speed of the internet than increasing a user’s bandwidth from 3.9 Mbps to 10 Mbps or even 1 Gbps.” - Mike Belshe
Single-digit % perf improvement beyond 5 Mbps
Linear improvement in page load time!
HTTP/2 in one slide…
HTTP/2 binary framing 101
Basic data flow in HTTP 2.0...
Streams are multiplexed because frames can be interleaved
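Every HTTP/2 frame starts with the same fixed 9-byte header, which is what lets a receiver demultiplex interleaved frames back onto their streams. A minimal sketch of decoding that header (field layout per RFC 7540, Section 4.1; the example frame is made up):

```python
def parse_frame_header(data):
    """Decode the fixed 9-byte HTTP/2 frame header (RFC 7540, Section 4.1):
    24-bit payload length, 8-bit type, 8-bit flags, then one reserved bit
    plus a 31-bit stream identifier."""
    assert len(data) >= 9
    length = int.from_bytes(data[0:3], "big")
    frame_type, flags = data[3], data[4]
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF  # drop reserved bit
    return length, frame_type, flags, stream_id

# A DATA frame (type 0x0) with END_STREAM (flag 0x1) on stream 5, 16-byte payload:
header = (16).to_bytes(3, "big") + bytes([0x0, 0x1]) + (5).to_bytes(4, "big")
```

Because the stream ID travels in every frame header, frames from different streams can be freely interleaved on one connection and reassembled on arrival.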
HPACK header compression
For a deep(er) dive on HTTP/2 protocol, grab the free book at the O’Reilly booth, or…
Read it online (free):
hpbn.co/http2
Optimizing (web) application delivery
let’s take a quick tour of our delivery pipeline...
Application
HTTP
TCP
Link layer
(Ethernet, WiFi, LTE…)
UDP
RRC and radio delays, energy consumption, ...
All things DNS (and QUIC :))
Handshakes, goodput, packet loss, ...
Resource fetch, execution and processing, ….
Parallelism, prioritization, protocol overhead, ...
If/when lower layers fail, we’re forced to “optimize” at the application layer...
Application
HTTP
TCP
Link layer
UDP
Reduce DNS lookups
Unresolved names block requests
✓ HTTP/1.x
✓ HTTP/2
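The cost is easy to make visible: resolution is a blocking step before any connection can open. A tiny timing sketch (the hostname here is just a local placeholder so it runs without network access):

```python
import socket
import time

def dns_lookup_ms(host):
    """Time a blocking getaddrinfo() call; until it completes, every
    request to that origin is stalled behind the lookup."""
    start = time.perf_counter()
    socket.getaddrinfo(host, 80)
    return (time.perf_counter() - start) * 1000.0

elapsed = dns_lookup_ms("localhost")  # resolves locally; real lookups cost far more
```

Fewer distinct hostnames on a page means fewer of these stalls, under both protocol versions.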
Application
HTTP
TCP
Link layer
UDP
Reuse TCP connections
Connections are expensive - handshake latency, resource overhead, ...
✓ HTTP/1.x
✓ HTTP/2
Single connection!
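The win is concrete even with plain HTTP/1.1 keep-alive: pay the handshake once, then reuse the socket. A self-contained sketch using only the standard library (the local test server and paths are made up for the demo):

```python
import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # required for keep-alive responses

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One client connection, two requests: the TCP (and TLS, if any) handshake
# is paid once, and both responses ride the same socket.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
statuses = []
for path in ("/a", "/b"):
    conn.request("GET", path)
    resp = conn.getresponse()
    statuses.append(resp.status)
    resp.read()  # drain the body so the connection can be reused
conn.close()
server.shutdown()
```

HTTP/2 takes this to its logical end: one connection per origin, with all requests multiplexed over it.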
Application
HTTP
TCP
Link layer
UDP
Use a Content Delivery Network
Page rendering is latency-bound (most of the time)
- lower roundtrip times are critical to optimize asset delivery
✓ HTTP/1.x
✓ HTTP/2
Application
HTTP
TCP
Link layer
UDP
Minimize number of HTTP redirects
Each redirect restarts the fetch process - cross-origin redirects are the worst case: DNS, TCP, new HTTP request
✓ HTTP/1.x
✓ HTTP/2
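To see the restart cost, follow redirects by hand: each hop below deliberately opens a fresh connection, mirroring the cross-origin worst case from the slide. The local server and `/old` → `/new` paths are invented for the demo:

```python
import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/old":
            self.send_response(301)
            self.send_header("Location", "/new")
            self.end_headers()
        else:
            self.send_response(200)
            self.end_headers()

    def log_message(self, *args):
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

def fetch(path, max_hops=5):
    """Follow redirects manually; every hop opens a brand-new connection,
    so each redirect repeats the full connection setup cost."""
    hops = 0
    while True:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        conn.close()
        if resp.status in (301, 302, 307, 308) and hops < max_hops:
            path = resp.getheader("Location")
            hops += 1
            continue
        return resp.status, hops

status, hops = fetch("/old")
server.shutdown()
```

Each hop counted here is a full round trip (or more) added before the real resource starts downloading.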
Application
HTTP
TCP
Link layer
UDP
Eliminate unnecessary request bytes
Unnecessary metadata (e.g. headers) add up quickly
- 100+ requests, with a few KB of headers each… hundreds of KB!
✓ HTTP/1.x
✓ HTTP/2
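The arithmetic behind that bullet, as a back-of-envelope sketch (the 2 KB-per-request figure is an assumption for cookie-heavy headers, not a measurement):

```python
# With HTTP/1.x every request repeats its headers verbatim on the wire.
num_requests = 100          # typical page: 100+ requests
header_bytes_each = 2 * 1024  # assumed ~2 KB of (cookie-laden) request headers
total_kb = num_requests * header_bytes_each // 1024
print("~%d KB of uncompressed header overhead per page load" % total_kb)
```

HPACK attacks exactly this waste by delta-encoding repeated headers across requests on the same connection.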
Application
HTTP
TCP
Link layer
UDP
Compress assets during transfer
Bytes are slow and expensive to transfer... - GZIP offers 40-80% savings on most assets - easy win.
✓ HTTP/1.x
✓ HTTP/2
HPACK helps...
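Why GZIP wins so big on web assets: markup is extremely repetitive, which is exactly what DEFLATE exploits. A quick stdlib demonstration (the HTML snippet is fabricated):

```python
import gzip

# Repetitive markup, standing in for a typical HTML/CSS/JS payload:
html = b"<li class='item'><a href='/p/1'>Product one</a></li>\n" * 200
compressed = gzip.compress(html)
savings = 1 - len(compressed) / len(html)
print("%d -> %d bytes (%.0f%% saved)" % (len(html), len(compressed), savings * 100))
```

Real pages land in the slide's 40-80% range; highly repetitive content (like this demo) does even better.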
Application
HTTP
TCP
Link layer
UDP
Cache resources on the client
Redundant data transfers are… redundant!
- Cache-Control and ETag headers on each resource are a must.
✓ HTTP/1.x
✓ HTTP/2
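The ETag half of that in miniature: a strong validator plus a conditional-request check, so a revalidation costs headers only, never body bytes. This is an illustrative sketch (helper names and the CSS payload are made up), not any particular server's implementation:

```python
import hashlib

def make_etag(body):
    """Strong validator derived from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(body, if_none_match=None):
    """Return (status, payload): 304 with an empty body when the client's
    cached copy is still valid, 200 with the full body otherwise."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, b""   # cache hit: zero body bytes on the wire
    return 200, body      # full transfer, served with ETag + Cache-Control

css = b"body { color: #333; }"
first = respond(css)                        # cold cache: full transfer
revalidated = respond(css, make_etag(css))  # warm cache: headers only
```

Pair the ETag with a sensible `Cache-Control` max-age so the client can often skip even the revalidation round trip.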
Application
HTTP
TCP
Link layer
UDP
Eliminate unnecessary resources
Fetch what you need, bytes are expensive - aggressive prefetching is expensive for both client and server.
✓ HTTP/1.x
✓ HTTP/2
Evergreen performance best-practices
Application
HTTP/1.x
TCP
Link layer
UDP
High protocol overhead
Head-of-line blocking
Limited parallelism
Parallelism is limited by number of connections
~6 parallel downloads per origin
No problem… domain shard all the things!
Duplicate (spurious) data packets lead to poor connection “goodput”
What’s the optimal number of shards?
Trick question, the answer depends on device + network + network weather + page architecture.
Most sites abuse sharding, and hurt themselves… causing congestion, retransmissions, etc.
Congestion control is a bigger problem than you might think. Especially in emerging markets where many users are bandwidth+latency limited.
“One great metric around that which I enjoy is the fraction of connections created that carry just a single HTTP transaction (and thus make that transaction bear all the overhead). For HTTP/1 74% of our active connections carry just a single transaction - persistent connections just aren't as helpful as we all want. But in HTTP/2 that number plummets to 25%. That's a huge win for overhead reduction.”
Patrick McManus, Mozilla.
Report card: domain sharding
Introduced as a workaround for the lack of multiplexing in HTTP/1. UAs open ~6 connections per origin (6 parallel downloads), and sharding allows us to raise this number to… any number.
HTTP/1.x: domain sharding is abused, consider limiting use to two shards.
HTTP/2: unnecessary. Negative impact on performance. Eliminate.
Removing domain sharding for HTTP/2
HTTP/2 can coalesce connections on your behalf, if...
Using this technique you can get the best of both worlds with minimal work.
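One of the coalescing preconditions (RFC 7540, Section 9.1.1) is that the shard hostnames resolve to the same server. A sketch of checking just that DNS half; a real client additionally requires the TLS certificate to cover both hostnames, which this snippet does not verify:

```python
import socket

def may_coalesce(host_a, host_b, port=443):
    """DNS half of the HTTP/2 connection-coalescing check: the two hosts
    must resolve to overlapping IP addresses. The TLS-certificate check
    (cert valid for both names) is the other half, omitted here."""
    ips = lambda host: {info[4][0] for info in socket.getaddrinfo(host, port)}
    return bool(ips(host_a) & ips(host_b))
```

When both conditions hold, the browser funnels requests for both shards over one HTTP/2 connection, so legacy shards stop hurting HTTP/2 clients.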
“Reduce requests” → concatenate all the things!
Head-of-line blocking is expensive; instead of fetching N small assets, let's fetch fewer but bigger assets... with the same content!
Yes, we’ve always had caching… But, we were never able to optimize for “churn” because small requests were too expensive. This is no longer the case.
* churn: the ratio of bytes (cached vs. new) we have to fetch when pushing an update.
Report card: concatenated assets
Introduced as a workaround for head-of-line blocking in HTTP/1. Concatenation allows us to fetch “multiple files” within one request.
HTTP/2: avoid*. Ship small granular resources and optimize caching policies.
HTTP/1.x: use carefully and optimize for invalidation costs (churn).
* Significant wins in compression are the only case where it might be useful.
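The churn cost is easy to quantify: with granular files, only the changed module is re-fetched; a concatenated bundle has a single cache key, so one changed line invalidates every byte. A sketch with invented module names and sizes:

```python
# One module changes between deploys. Compare the bytes each strategy
# forces returning visitors to re-download. (Sizes are made up.)
modules = {"app.js": 40_000, "vendor.js": 150_000, "analytics.js": 8_000}
changed = {"app.js"}

granular_refetch = sum(size for name, size in modules.items() if name in changed)
bundle_refetch = sum(modules.values())  # single ETag covers the whole bundle
```

Here the bundle forces ~5x the transfer for the same one-module update; HTTP/2's cheap requests make the granular strategy viable.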
“Reduce Requests” → inline all the things!
Head-of-line blocking is expensive; instead of fetching granular resources, just embed them inside others… to avoid requests.
Inlining: because you’ll also need…
Server: “You asked for /product/123, but you’ll need app.js, product-photo-1.jpg, as well… I promise to deliver these to you. That is, unless you decline or cancel.”
GET /product/123
/product/123
/app.js
/product-photo-1.jpg
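In practice, several HTTP/2 servers (Apache mod_http2, H2O, nghttpx, Jetty) can initiate pushes from `Link: rel=preload` response headers, so the application only declares what the page will need. A minimal sketch of emitting those hints; the header-list shape and asset paths are illustrative, not a specific framework's API:

```python
def add_push_hints(headers, assets):
    """Append 'Link: <url>; rel=preload' response headers, which
    push-aware HTTP/2 servers commonly treat as push triggers."""
    for asset in assets:
        headers.append(("Link", "<%s>; rel=preload" % asset))
    return headers

headers = add_push_hints([("Content-Type", "text/html")],
                         ["/app.js", "/product-photo-1.jpg"])
```

The server then pushes those assets alongside the HTML response, unless the client declines or cancels the pushed streams.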
Inlining is server push. Except, server push has nicer properties:
Per-stream flow control + server push
GET /product-photo.jpg
/product-photo.jpg
/product-photo.jpg
WINDOW_UPDATE
I want image geometry and preview, and I’ll fetch the rest later...
Report card: resource inlining
Introduced as a workaround for head-of-line blocking in HTTP/1. Inlining is a latency optimization that eliminates a full request roundtrip and reduces the "number of requests" in HTTP/1.
HTTP/2: replace with server push. The most naive strategy is still strictly better.
HTTP/1.x: use carefully and pay close attention to caching implications.
* Significant wins in compression are the only case where it might be useful.
HTTP/2 server push instead of inlining...
Implementation and status
Jetty’s “smart push” is a great strategy...
Lots of room for experimentation + innovation!
You... need to think about prioritization
with HTTP/2 the browser is relying on the server to deliver data in an optimal way -- this is critical.
Prioritization is key to optimized rendering...
With HTTP/1.1, browsers "prioritize" resources by holding a priority queue on the client and making educated guesses about how best to use the available TCP connections… which delays request dispatch.
GET index.html
GET style.css
GET hero.jpg
GET other.jpg
High
Priority
Low
Priority
time
Delay dispatch, because we don’t want to “waste” bandwidth on images, we might have more critical resources, etc...
Prioritization is key to optimized rendering...
With HTTP/2 browsers prioritize requests based on type/context, and immediately dispatch the request as soon as the resource is discovered. The priority is communicated to the server as weights + dependencies.
GET index.html
GET style.css
GET hero.jpg
GET other.jpg
High
Priority
Low
Priority
time
Requests are initiated as soon as each resource is discovered
Responsibility is on the server to deliver the bytes in the right order!
Stream prioritization in HTTP/2...
* Prioritization is an advisory hint to the server; it does not provide strict delivery semantics.
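On the wire, those weights and dependencies travel in 5-byte PRIORITY payloads. A sketch of building one per RFC 7540, Section 6.3 (the stream IDs and weight in the example are arbitrary):

```python
def priority_payload(dep_stream, weight, exclusive=False):
    """Build the 5-byte PRIORITY frame payload (RFC 7540, Section 6.3):
    an E bit plus a 31-bit stream dependency, then the weight stored as
    (weight - 1), giving the protocol's 1-256 weight range."""
    assert 1 <= weight <= 256 and 0 <= dep_stream < 2 ** 31
    dep = dep_stream | (0x80000000 if exclusive else 0)
    return dep.to_bytes(4, "big") + bytes([weight - 1])

# e.g. a CSS stream depending on the HTML stream (id 1) with a high weight:
css_priority = priority_payload(dep_stream=1, weight=220)
```

The dependency tree plus weights tells the server how to apportion bandwidth among concurrent streams; a good server honors it, which is why server choice matters so much.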
HTTP/2 prioritization in Firefox...
With HTTP/2 the browser relies on the server to deliver the response data in an optimal way.
It’s not just the number of bytes, or requests per second, but the order in which bytes are delivered. Test your HTTP/2 server very carefully.
| Best practice                       | HTTP/1.x            | HTTP/2                   |
|-------------------------------------|---------------------|--------------------------|
| Reduce DNS lookups                  | ✓                   | ✓                        |
| Reuse TCP connections               | ✓                   | ✓                        |
| Use a Content Delivery Network      | ✓                   | ✓                        |
| Minimize number of HTTP redirects   | ✓                   | ✓                        |
| Eliminate unnecessary request bytes | ✓                   | ✓                        |
| Compress assets during transfer     | ✓                   | ✓                        |
| Cache resources on the client       | ✓                   | ✓                        |
| Eliminate unnecessary resources     | ✓                   | ✓                        |
| Apply domain sharding               | Revisit (max 2)     | Remove                   |
| Concatenate resources               | Carefully (caching) | Remove (compression)     |
| Inline resources                    | Carefully (caching) | Remove, use server push  |
Pick your HTTP/2 server carefully!
… we need better test suites and benchmarks.
+Ilya Grigorik
@igrigorik
Learn more...
hpbn.co/http2
Slides...
bit.ly/http2-opt