Caching Doesn’t Improve Mobile Web Performance*
Jamshed Vesuna† Colin Scott†
Michael Buettner∆ Michael Piatek∆
Arvind Krishnamurthy* Scott Shenker†‡
†UC Berkeley ∆Google *University of Washington ‡ICSI
Special thanks to our shepherd Dan Tsafrir
*Much
Flywheel NSDI’15 Results
Increasing the cache hit ratio of the Flywheel proxy from 22% to 32% resulted in only a
1-2% reduction in median mobile page load time
Goal:
Understand the effects of caching on mobile web performance
Outline
Background - Loading a Web Page
Background - Critical Path
Critical Path: the longest chain of dependent browser tasks
Fetch Delay = Network Delay
Render Delay = Computational Delay
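As a rough illustration of the critical-path definition above, the sketch below finds the longest chain of dependent tasks in a small dependency graph. The task names, durations, and dependencies are hypothetical and are not taken from any measured page load.

```python
from functools import lru_cache

# Hypothetical page-load tasks: name -> (duration in ms, tasks it depends on).
# Names and durations are made up for illustration.
TASKS = {
    "fetch_html": (100, []),
    "parse_html": (30, ["fetch_html"]),
    "fetch_css": (80, ["parse_html"]),
    "fetch_js": (120, ["parse_html"]),
    "eval_js": (60, ["fetch_js"]),
    "render": (40, ["fetch_css", "eval_js"]),
}


def critical_path(tasks):
    """Return (total duration, task list) of the longest dependency chain."""

    @lru_cache(maxsize=None)
    def longest_ending_at(name):
        duration, deps = tasks[name]
        best_time, best_chain = 0, ()
        for dep in deps:
            dep_time, dep_chain = longest_ending_at(dep)
            if dep_time > best_time:
                best_time, best_chain = dep_time, dep_chain
        return best_time + duration, best_chain + (name,)

    total, chain = max(longest_ending_at(name) for name in tasks)
    return total, list(chain)


total_ms, chain = critical_path(TASKS)
print(total_ms, chain)
```

In this toy graph the fetch tasks play the role of network delay and the parse, eval, and render tasks play the role of computational delay; only tasks on the returned chain contribute to the load time.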
Background - Page Load Time (PLT)
Performance Model - Estimating PLT
C - computational delays
N - network delays
K - fraction of objects on the critical path that are cacheable
X - cache hit ratio (out of all objects)
f(X) - overlap of C and N on the critical path
EPLT[X] = C + N·(1 − K·X) − f(X)
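The model can be written as a small function. This is a minimal sketch, assuming the overlap f(X) is supplied by the caller as a plain number (defaulting to 0); the function name, parameter names, and the delay values are illustrative and not from the talk.

```python
def estimated_plt(c, n, k, x, overlap=0.0):
    """EPLT[X] = C + N*(1 - K*X) - f(X).

    c: computational delay on the critical path
    n: network delay on the critical path
    k: fraction of critical-path objects that are cacheable
    x: cache hit ratio, between 0 and 1
    overlap: value of f(X), supplied by the caller (assumed 0 by default)
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("cache hit ratio must be in [0, 1]")
    return c + n * (1.0 - k * x) - overlap


# Hypothetical delays in milliseconds, for illustration only.
no_cache = estimated_plt(c=2000, n=3000, k=0.2, x=0.0)
perfect_cache = estimated_plt(c=2000, n=3000, k=0.2, x=1.0)
print(no_cache, perfect_cache)
```

Only the network term shrinks as the hit ratio grows; the computational term C is untouched by caching.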
Performance Model - Building an Intuition
Ignoring the overlap term f(X): EPLT[X] ≤ C + N·(1 − K·X)
Performance Model - Fitting K
In practice, K ~ 0.2 = ⅕*
EPLT [max] ≤ C + ⅘N
*Demystifying Page Load Performance with WProf. NSDI ’13
Prediction: Upper Bound on Caching Benefits
C:N ~ ⅔ for mobile devices
PLTo = EPLT [0] ≤ C+N = 5/2 C
EPLT [max] ≤ 11/5 C
Reduction in PLT: (PLTo − EPLT[max]) / PLTo
≤ 3/25 (12% with a perfect cache!)
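The arithmetic on this slide can be checked with exact fractions. The sketch below normalizes C to 1 and ignores the overlap term, as the simplified model does; the function name `max_plt_reduction` is an illustrative choice, not part of the authors' tooling.

```python
from fractions import Fraction


def max_plt_reduction(c_to_n, k):
    """Relative PLT reduction with a perfect cache (X = 1).

    Ignores the overlap term f(X), as the slide's simplified model does.
    """
    c = Fraction(1)               # normalize C to 1
    n = c / Fraction(c_to_n)      # N expressed in units of C
    plt_original = c + n          # EPLT[0] <= C + N
    plt_perfect = c + n * (1 - Fraction(k))  # EPLT[max] <= C + (1 - K) * N
    return (plt_original - plt_perfect) / plt_original


mobile = max_plt_reduction(Fraction(2, 3), Fraction(1, 5))
print(mobile, float(mobile))  # 3/25 0.12
```

With C:N = ⅔ and K = ⅕ this gives N = 3/2 C, PLTo = 5/2 C, and EPLT[max] = 11/5 C, matching the values above.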
Prediction: Desktop Benefits from Caching
C:N ~ ⅙ for fast desktop devices
PLTo = EPLT [0] ≤ C+N = 7 C
EPLT [max] ≤ 21/5 C
Reduction in PLT: (PLTo − EPLT[max]) / PLTo
≤ 2/5 (40% with a perfect cache!)
Explanation: C is Small for Desktop
C:N ~ ⅕ for 2GHz CPU*
*Demystifying Page Load Performance with WProf. NSDI ’13
Explanation: C is Larger for Mobile
C:N ~ ⅔ for 1GHz CPU
Measurement Methodology
Workload Characteristics
Increasing Cache Hits - Flywheel Result
Increased cache hit ratio from 20% to 30%
→ 1-2% reduction in page load time
Desktop vs Mobile, Perfect Cache
Reduction Defined As:
(Original PLT - PLT with a perfect cache) / (Original PLT)
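The reduction metric defined above can be computed per page and then summarized by its median. This is a minimal sketch; the sample values are hypothetical and are not the measured data behind the reported medians.

```python
from statistics import median


def plt_reduction(original_plt, cached_plt):
    """(Original PLT - PLT with a perfect cache) / (Original PLT)."""
    if original_plt <= 0:
        raise ValueError("original PLT must be positive")
    return (original_plt - cached_plt) / original_plt


# Hypothetical (original, perfect-cache) PLT pairs in seconds, one per page.
samples = [(4.0, 3.0), (2.0, 1.8), (5.0, 4.0)]
reductions = [plt_reduction(orig, cached) for orig, cached in samples]
print(median(reductions))
```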
Median reduction in PLT for 3.2 GHz desktop is 34%
Median reduction in PLT for mobile is 13%
Isolating the Bottleneck Resource
With a constrained CPU, results are similar to mobile
With constrained RAM, results are similar to desktop
CPU is the key difference, not RAM
Slower CPUs Show Reduced Improvements
As CPU is throttled, caching has a reduced impact on PLT
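The simplified model (overlap ignored, K = 0.2) shows the same qualitative trend. With a perfect cache the relative reduction is K·N / (C + N) = K / (1 + C/N), which shrinks as computation grows relative to network delay. The ratios below are hypothetical stand-ins for increasing CPU throttling, not measured values.

```python
def max_reduction(c_over_n, k=0.2):
    """Simplified-model bound K*N / (C + N), written in terms of C/N."""
    return k / (1.0 + c_over_n)


# Hypothetical C:N ratios; a larger ratio stands in for a more throttled CPU.
ratios = [0.25, 0.5, 1.0, 2.0]
bounds = [max_reduction(r) for r in ratios]
for r, b in zip(ratios, bounds):
    print(f"C:N = {r:.2f} -> bound on PLT reduction = {b:.1%}")
```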
Caching Benefits are Limited by Slow CPUs
*Assumption: “All else being equal” (including b/w)
→ Mobile devices benefit less from web caching
Implications
*If you only care about end user latency
Conclusion
jamshed.vesuna@gmail.com cs@cs.berkeley.edu
This Presentation: https://goo.gl/plH4HE
PLT Analysis: https://github.com/colin-scott/page_load_time
Open Source Tools: https://github.com/JamshedVesuna/telemetry
*Demystifying Page Load Performance with WProf. NSDI ’13
Backup Slides
Data Validation
Sanity Checks: https://github.com/colin-scott/page_load_time/tree/master/telemetry/sanity_checks
Bandwidth vs Latency
* Flywheel data
Lots of Related Work
Known Limitations - PLT
Device Specs
Known Limitations - WPR