Eric Burnett, Oct 2019
One Minute Presubmits
London Build Meetup Talk
Proprietary + Confidential
The Space
Presubmits...
Goal:
60 seconds
Latency vs Happiness
The Problem
Builder-centric world
Storage-centric world
Digression: Directories
Logically, a nested mapping from
path -> [attributes, contents]
Does not have to be on a single disk anywhere.
Could be a flat ‘manifest’:
/src/lib/a/a.txt -> [775, “aabbccdd1122”]
/src/lib/a/b.txt -> [775, “12341234aabb”]
Or a Merkle tree (pictured on the slide).
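To make the two representations concrete, here is a minimal sketch that folds a flat manifest into a Merkle tree. The node layout (a JSON object with "files" and "dirs"), the use of SHA-256, and the names `digest` and `build_merkle_tree` are assumptions for illustration, not the format used by any particular system.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Content address: hex SHA-256 of the bytes (an assumed hash choice)."""
    return hashlib.sha256(data).hexdigest()


def build_merkle_tree(manifest: dict, store: dict) -> str:
    """Fold a flat {path: (mode, blob_digest)} manifest into a Merkle tree.

    Every directory node is serialized and saved in `store` under its own
    digest. Returns the digest of the root node.
    """
    files, subdirs = {}, {}
    for path, (mode, blob) in manifest.items():
        # Split off the first path component; the rest belongs to a subtree.
        head, _, rest = path.strip("/").partition("/")
        if rest:
            subdirs.setdefault(head, {})[rest] = (mode, blob)
        else:
            files[head] = [mode, blob]
    node = {
        "files": files,
        # A directory is referenced by the digest of its own node.
        "dirs": {name: build_merkle_tree(sub, store)
                 for name, sub in subdirs.items()},
    }
    encoded = json.dumps(node, sort_keys=True).encode()
    node_digest = digest(encoded)
    store[node_digest] = encoded
    return node_digest


# The two entries from the slide, as a flat manifest.
example_manifest = {
    "/src/lib/a/a.txt": (0o775, "aabbccdd1122"),
    "/src/lib/a/b.txt": (0o775, "12341234aabb"),
}
example_store = {}
example_root = build_merkle_tree(example_manifest, example_store)
```

Because each directory is named by the digest of its contents, an unchanged subtree keeps the same digest across revisions and never has to be re-uploaded or re-walked.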
Remote Execution principles
Move work off-host: Work moved off-host can be parallelized onto the right number of right-sized workers.
Reuse prior results: Small, well-defined units of work are cacheable, and cached results don’t need to be executed again.
Copy as little as possible: Data is only needed in two places: where it’s produced, and where it’s used. Every intermediate copy is unnecessary overhead. And file data is cacheable too!
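The "reuse prior results" principle can be sketched as a cache keyed by a digest of the command and its inputs. This is a simplified illustration, not the wire format of any Remote Execution API; `ActionCache`, `action_key`, and the `execute` callback are hypothetical names.

```python
import hashlib
import json


class ActionCache:
    """Maps digest(command + input root digest) -> result (a sketch)."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # how many times work was actually executed

    @staticmethod
    def action_key(command, input_root_digest):
        # The key covers everything that can influence the result.
        payload = json.dumps(
            {"command": list(command), "inputs": input_root_digest},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()

    def run(self, command, input_root_digest, execute):
        """Return the cached result, calling `execute` only on a miss."""
        key = self.action_key(command, input_root_digest)
        if key not in self._results:
            self.executions += 1
            self._results[key] = execute(command, input_root_digest)
        return self._results[key]
```

The smaller and better defined the unit of work, the more often the same key recurs and the more executions are skipped.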
Structure of a “Build”
Source Fetch
Dependency Fetch
Transforms
Build/Test
Results
Chart: Presubmit time, before Remote Execution
Source Fetch: 1%
Dependency Fetch: 7%
Transforms: 3%
Build / Test: 89%
Results: 0.5%
Chart: Presubmit time, with Remote Execution
Source Fetch: 6%
Dependency Fetch: 54%
Transforms: 24%
Build / Test: 12%
Results: 4%
Chart: Builder utilization (CPU, Network, Disk, Memory) across the build phases
Phase labels: Source Fetch 6%, Dependency Fetch 54%, Transforms 24%, Build / Test 12%, Results 4%
The Solution
Move everything off-host
New world
Storage-centric principles
Operate on metadata: Where possible, remove the data and operate only on metadata. Data can be fetched from storage where/when it’s needed.
Cacheable operations: Avoid executing the same process on the same data twice. Carve up data operations into small enough chunks that cache hits are common.
Minimal builder: Prefer to do heavy lifting in trusted services, and use builders to orchestrate - execute scripts, small tools and remote builds. Individual builders should not need significant resources, and lost builders should not be a significant setback.
Build phases - goals
Source Fetch
Dependency Fetch
Transforms
Build/Test
Results
Source Fetch
Build a virtual directory - Merkle tree or manifest - of the repository at the desired ref.
New data copied into storage (not shown)
Reference the virtual directory for downstream work.
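A rough sketch of this step, using the flat-manifest form of a virtual directory: only blobs that storage has not seen are copied, and the resulting root digest is recorded under a repo@ref key. The function name `sync_ref`, the in-memory dicts standing in for content-addressed and key-value storage, and the manifest encoding are all assumptions.

```python
import hashlib


def sync_ref(repo, ref, files, cas, ref_map):
    """Record a snapshot {path: bytes} of `repo` at `ref` as a virtual directory.

    `cas` stands in for content-addressed storage, `ref_map` for the
    key-value store holding repo@ref -> root digest.
    Returns (root_digest, number_of_new_file_blobs_uploaded).
    """
    manifest = {}
    uploaded = 0
    for path in sorted(files):
        data = files[path]
        blob_digest = hashlib.sha256(data).hexdigest()
        if blob_digest not in cas:  # copy only data that storage lacks
            cas[blob_digest] = data
            uploaded += 1
        manifest[path] = blob_digest
    # The manifest is itself stored by digest; that digest is the root.
    manifest_bytes = "\n".join(
        f"{path} {d}" for path, d in sorted(manifest.items())
    ).encode()
    root = hashlib.sha256(manifest_bytes).hexdigest()
    cas.setdefault(root, manifest_bytes)
    ref_map[f"{repo}@{ref}"] = root
    return root, uploaded
```

Downstream phases then need only the root digest; syncing a second ref that changes one file uploads one new blob.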
Dependency Fetch
Build a virtual directory of each dependency at the desired ref.
At build time, stitch these together with the top-level source into a single virtual directory containing all input files at the appropriate locations.
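Stitching is a metadata-only operation, which the following sketch tries to show with flat manifests. The function name `stitch`, the (mount point, manifest) input shape, and the conflict policy are assumptions made for the example.

```python
def stitch(mounts):
    """Combine virtual directories into one manifest.

    `mounts` is a list of (mount_point, manifest) pairs, where each manifest
    maps a relative path to a blob digest. No file contents are touched.
    Raises ValueError if two inputs disagree about the same path.
    """
    combined = {}
    for mount_point, manifest in mounts:
        prefix = mount_point.strip("/")
        for rel_path, blob_digest in manifest.items():
            # Join prefix and path, skipping an empty prefix for the top level.
            full = "/".join(p for p in (prefix, rel_path.strip("/")) if p)
            if combined.get(full, blob_digest) != blob_digest:
                raise ValueError(f"conflicting contents at {full}")
            combined[full] = blob_digest
    return combined


# Top-level source at the root, one dependency under a subdirectory.
example = stitch([
    ("", {"main.c": "digest-of-main"}),
    ("third_party/zlib", {"zlib.h": "digest-of-zlib-h"}),
])
```

With Merkle trees instead of manifests, the same idea amounts to inserting a dependency's root digest as a child of the appropriate directory node.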
Transforms
Strategically rewrite the virtual directory as needed.
Applying a patch: point additions/removals/replacements of a few specific files.
Applying a transform: shard the work and apply it recursively, caching based on
<tool, rule, blob, path?> -> blob
or subtrees of the same.
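The per-file case of this cache can be sketched as follows. The key here is (tool, rule, blob) with the optional path left out, which assumes the transform's output does not depend on where the file lives; caching whole subtrees is not shown. `cached_transform` and `apply_fn` are hypothetical names.

```python
import hashlib


def cached_transform(tool_id, rule, manifest, cas, cache, apply_fn):
    """Apply a file-level transform to every entry of a {path: digest} manifest.

    `cache` maps (tool_id, rule, input_digest) -> output_digest, so identical
    content is transformed once no matter how many paths or builds refer to it.
    Returns (new_manifest, number_of_transforms_actually_executed).
    """
    out = {}
    executed = 0
    for path, blob_digest in manifest.items():
        key = (tool_id, rule, blob_digest)  # path omitted: assumed irrelevant
        if key not in cache:
            data = apply_fn(cas[blob_digest])
            new_digest = hashlib.sha256(data).hexdigest()
            cas[new_digest] = data
            cache[key] = new_digest
            executed += 1
        out[path] = cache[key]
    return out, executed
```

Including the tool identity in the key means a new tool version naturally misses the cache, while an unchanged tool over unchanged files costs only lookups.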
Build/Test
Execute the build tool on top of the virtual directory.
The build tool must be virtualization-aware: either take the virtual directory as input, or run on a FUSE filesystem with logic to get metadata (digests) from it and to insert remote outputs into it.
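What "virtualization-aware" might mean in practice can be sketched as a small interface: digests are answered from metadata, contents are fetched only on demand, and remote outputs are added by reference. The class `VirtualDirectory` and its method names are hypothetical; a FUSE-based setup would expose equivalent operations through the filesystem instead.

```python
class VirtualDirectory:
    """A directory backed by a {path: digest} manifest and remote storage."""

    def __init__(self, manifest, cas):
        self._manifest = dict(manifest)
        self._cas = cas      # stands in for remote content-addressed storage
        self.fetches = 0     # counts how often file contents were pulled

    def digest_of(self, path):
        """Metadata-only lookup: no file contents are transferred."""
        return self._manifest[path]

    def read(self, path):
        """Fetch contents from storage, only when something really needs them."""
        self.fetches += 1
        return self._cas[self._manifest[path]]

    def add_remote_output(self, path, digest):
        """Record an output that already lives in remote storage."""
        self._manifest[path] = digest
```

A build tool that only needs digests to form cache keys and schedule remote actions can run an entire build through `digest_of` and `add_remote_output` without the builder downloading a single file.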
Results
Persist handles to remote files instead of the file contents themselves.
May also persist whole directories, if desired: e.g. state of the file tree after every step and at the end of the build.
Optionally, extend the lifetime/durability of these remote files for persistence.
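As an illustration of persisting handles rather than contents, the sketch below writes a small JSON record of output digests and resolves a file only when someone asks for it. The record layout and the names `make_result_record` and `resolve_output` are assumptions.

```python
import json


def make_result_record(build_id, outputs):
    """Serialize handles to remote outputs.

    `outputs` maps path -> (digest, size_bytes); the files themselves stay
    in remote storage and are never copied into the record.
    """
    return json.dumps({
        "build_id": build_id,
        "outputs": {
            path: {"digest": d, "size_bytes": size}
            for path, (d, size) in sorted(outputs.items())
        },
    }, sort_keys=True)


def resolve_output(record, path, cas):
    """Fetch one output's contents on demand via its stored handle."""
    handle = json.loads(record)["outputs"][path]
    return cas[handle["digest"]]
```

The record stays a few hundred bytes regardless of artifact size; whether the referenced blobs outlive the build depends on the storage lifetime policy, which is outside this sketch.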
New world
Chart: Presubmit time, after remoting everything
Source Fetch: 6%
Dependency Fetch: 6%
Transforms: 11%
Build / Test: 75%
Results: 1%
Additional benefits
Requirements
1. Durable content-addressed and key-value storage options
2. ‘Syncer’ to fetch repos, transform into virtual directories, and populate repo@ref -> root map
3. Logic (service or builder-side) for resolving dependency tree and stitching the relevant virtual directories together for a build
4. Logic (service or builder-side) for applying any necessary source->source transforms
5. Virtual-directory-aware build tool
Deployment
1. Start at the repo: remove successively more tasks from builder, but still copy all files down for legacy phases
OR
2. Start at the build: virtualize fetches, virtualize outputs, virtualize inputs. Expand to cover pre-build transforms, and push upwards.
Current Work
Thank You
Additional Content