How Mozilla uses Python to Build and Ship Firefox

Hi!

I’m excited to be here today to tell you how Mozilla uses Python to build and ship Firefox.

How do we get from this…

To this

How do we turn source code into builds and updates for users’ devices?

Who am I?

Chris AtLee

catlee@mozilla.com

I’m the Manager of the Release Engineering team at Mozilla.

Our team is primarily responsible for building and shipping Firefox for Desktop and Android

We’re a team of 13 people, and we’re quite distributed around the world, but we have 3 people in the Toronto area, and 6 Canadians on our team!

some numbers

4 GB in version control (hg) for Firefox; 494k commits

Over 200 commits per day; around 7,000 per month

1,750 build & test tasks per commit

approximately 15 cpu days of automation per commit

20 releases per month. Each release can have over 7,000 individual tasks and generates 50-100GB of artifacts!

All powered by Python!

Building a modern web browser is hard. We have approximately 15 million lines of code in the product. We need to build and test desktop and mobile applications on a diverse set of platforms: Linux, macOS, Windows, Android.

Services like Travis CI, CircleCI, and Jenkins don’t scale to that level of complexity.

Goal of this talk:

understanding pipelines & python

Understand how Mozilla uses Python to solve a very diverse set of problems, and at a large scale.

And also understand how our build & release pipeline works.

This is a very basic overview of the build/release pipeline.

We start with a developer committing code to our hg repositories.

From there we schedule builds and tests to run on Taskcluster.

For releases, we also need to generate updates for these builds, and then need to sign the builds and updates.

Finally, we publish the new release to the world!

Scratch the surface of any of these components, and you’ll find Python just underneath.

treeherder

One of the first steps in the process is committing code to Mercurial.

Treeherder is our tool to display the results of commits to our various repositories.

This isn’t even the full view.

Shows what changed on the left, and the various platform builds & tests on the right.

Green means success, orange means test failure, red generally means build failure.

Backend runs on Django!

https://treeherder.mozilla.org

https://github.com/mozilla/treeherder

taskcluster

mostly not python, sorry :(

https://docs.taskcluster.net/docs

Cluster of services that provides a CI and release platform to Mozilla.

The Queue maintains the list of tasks. Tasks are represented by JSON objects specifying which commands to run, what dependencies they have, etc.

Provisioners create new cloud resources to handle increases in load (e.g. in AWS)

Workers claim tasks from the queue, and run on the AWS instances, or on fixed hardware (e.g. Mac minis, moonshots). They execute the tasks and then upload the resulting logs and artifacts, for example the builds, or screenshots from test failures or crash stacks.

For our hg repos, we have a service called mozilla-taskcluster which watches for new changes to the repos. It then triggers “decision tasks”, which run `mach taskgraph`.
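To make the queue/worker relationship concrete, here is a minimal in-memory sketch. It is not the Taskcluster API: `Task`, `ToyQueue`, `claim`, `resolve` and `run_worker` are hypothetical names invented for illustration. It only shows the idea that a worker claims a task once everything the task depends on has completed.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    # Hypothetical, simplified task definition (real definitions are JSON objects).
    task_id: str
    command: list
    dependencies: list = field(default_factory=list)


class ToyQueue:
    """In-memory stand-in for a task queue; not the Taskcluster API."""

    def __init__(self, tasks):
        self.pending = {t.task_id: t for t in tasks}
        self.completed = set()

    def claim(self):
        """Return a pending task whose dependencies have all completed, or None."""
        for task in list(self.pending.values()):
            if all(dep in self.completed for dep in task.dependencies):
                del self.pending[task.task_id]
                return task
        return None

    def resolve(self, task):
        """Mark a claimed task as completed."""
        self.completed.add(task.task_id)


def run_worker(queue):
    """Claim tasks until none are runnable; return the order they ran in."""
    order = []
    while True:
        task = queue.claim()
        if task is None:
            break
        # A real worker would execute task.command here and upload logs/artifacts.
        order.append(task.task_id)
        queue.resolve(task)
    return order


order = run_worker(ToyQueue([
    Task("sign", ["sign"], dependencies=["build"]),
    Task("build", ["make"]),
    Task("publish", ["publish"], dependencies=["sign"]),
]))
print(order)
```

Even though "sign" is listed first, it cannot be claimed until "build" has been resolved.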

Photo: https://commons.wikimedia.org/wiki/File:Galileo_supercomputer_rack.jpg

mach taskgraph

it’s complicated

This is a graph (a DAG) of tasks for a recent Firefox release.

There are approximately 5k discrete tasks here. Graphviz… struggles a bit to make sense of this.

TL;DR: there are a lot of moving pieces here. For each platform we’re releasing, we repeat the same pattern: building Firefox itself; creating localized versions of it (we now have an en-CA locale!); creating complete and partial updates for each version; signing all the binaries; and finally publishing. For desktop, we have 5 platforms, 100 locales, and we create partial updates going back 3 or 4 versions. That means we create about 500 individual installers, and 2000 partial updates per release.
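The arithmetic behind those numbers, as a quick check. The variable names are made up for illustration, and 4 previous versions is assumed here (the upper end of "3 or 4").

```python
platforms = 5
locales = 100
previous_versions = 4  # assumption: upper end of "3 or 4 versions"

# One installer per (platform, locale) pair.
installers = platforms * locales

# One partial update per installer for each previous version we update from.
partial_updates = installers * previous_versions

print(installers, partial_updates)
```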

mach is a frontend CLI to the build system, written in Python.

Among other things, it wraps the build tool (mozbuild, which is written in Python), as well as providing other common utilities, like taskgraph generation.

Taskgraph generation reads in YAML configuration files from the repository and processes them with a series of Python functions (that we call transforms), and finally generates the concrete set of tasks representing what to build/test/sign/upload. Like a .travis.yml file, but hyped up on way too much halloween candy.
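A rough sketch of the transform idea, under loud assumptions: this is not the real taskgraph code. `KIND_CONFIG` stands in for a parsed YAML file, and `split_by_platform`, `add_label`, `add_signing` and `generate_tasks` are hypothetical functions. Each transform takes a stream of task descriptions and yields a modified (or expanded) stream, so a short config fans out into many concrete tasks.

```python
import copy

# Stand-in for a YAML config loaded from the repository (hypothetical fields).
KIND_CONFIG = {
    "name": "build",
    "platforms": ["linux64", "macosx64", "win64"],
    "command": ["./mach", "build"],
}


def split_by_platform(tasks):
    """Expand one generic task into one task per platform."""
    for task in tasks:
        for platform in task["platforms"]:
            new_task = copy.deepcopy(task)
            del new_task["platforms"]
            new_task["platform"] = platform
            yield new_task


def add_label(tasks):
    """Give every task a unique label."""
    for task in tasks:
        task["label"] = f"{task['name']}-{task['platform']}"
        yield task


def add_signing(tasks):
    """Emit each task, followed by a signing task that depends on it."""
    for task in tasks:
        yield task
        yield {
            "name": "sign",
            "platform": task["platform"],
            "label": f"sign-{task['platform']}",
            "dependencies": [task["label"]],
        }


def generate_tasks(config, transforms):
    """Run the config through each transform in order."""
    tasks = iter([config])
    for transform in transforms:
        tasks = transform(tasks)
    return list(tasks)


tasks = generate_tasks(KIND_CONFIG, [split_by_platform, add_label, add_signing])
labels = [t["label"] for t in tasks]
print(labels)
```

Because each transform is a generator, transforms can be chained, reordered and reused across different kinds of tasks.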

mozbuild: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/How_Mozilla_s_build_system_works

workers

Photo by Randy Fath on Unsplash

Interact with the Taskcluster Queue to claim work

Many different kinds of workers. Builds are done on Docker or bare metal workers. Some workers are implemented in Node.js, some in Go, and others in Python.

Task definitions for these types of workers allow you to specify arbitrary commands and environment variables to run, as well as what Docker containers to use. Great for developer ergonomics! Bad for security! We need to protect sensitive security tokens for publishing to S3 for example, and allowing people to run arbitrary code on the workers puts those tokens at risk.

We also have workers that operate in a restricted environment. They operate only on data passed in the task payload. We call them “scriptworkers” because the code or script that is run for the task is baked into the worker, and can’t be overridden.

We use asyncio and aiohttp to do things in parallel when possible. So we can do things like download artifacts from our upstream tasks in parallel.
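A minimal sketch of the parallel-download pattern, not the scriptworker code itself. To keep it runnable without network access or third-party packages, `fetch_artifact` is a placeholder that sleeps instead of making an HTTP request; in a real worker that is where an aiohttp request would go. The URLs and function names are hypothetical.

```python
import asyncio


async def fetch_artifact(url):
    """Placeholder download; a real worker would make an HTTP request here."""
    await asyncio.sleep(0.1)  # simulate network latency
    return f"contents of {url}".encode()


async def download_all(urls, max_concurrent=4):
    """Download all URLs concurrently, with a cap on simultaneous downloads."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(url):
        async with semaphore:
            return await fetch_artifact(url)

    # gather() runs the coroutines concurrently and keeps results in input order.
    return await asyncio.gather(*(bounded(url) for url in urls))


urls = [f"https://example.com/artifact-{i}" for i in range(4)]
results = asyncio.run(download_all(urls))
print(len(results))
```

The semaphore is a design choice for the sketch: it keeps a task with hundreds of upstream artifacts from opening hundreds of connections at once.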

We use these types of workers for security sensitive work, like getting binaries signed, or pushing them to S3 or the Google Play Store for publishing Android apps.

The output from our release taskgraph is a set of signed binary artifacts ready to be published. The scriptworkers are also responsible for publishing the updates to our S3 buckets, and to our update server.

Chain of trust

https://github.com/mozilla-releng/scriptworker

Keeping users up to date

Update server (named Balrog). Firefox regularly queries Balrog to check if there are any updates available.

Backend is 100% Python.

A very important application. Handles approximately 5k requests/second.

https://github.com/mozilla/balrog

Uses well-known modules like Flask and SQLAlchemy. Uses uWSGI.

It’s a rule engine that matches client parameters like which product, version, locale, etc. a user is on against a set of rules. The rules indicate which release of Firefox to serve next to the user.
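The rule-matching idea can be sketched like this. It is an illustration only, not Balrog's schema or code: the rule fields, the `priority` tie-break, and the mapping names are all assumptions made up for the example. A rule matches when every field it specifies equals the client's value, and the highest-priority matching rule decides which release to serve.

```python
# Hypothetical rules; fields a rule leaves out match any client value.
RULES = [
    {"priority": 100, "product": "Firefox", "channel": "nightly",
     "mapping": "Firefox-nightly-latest"},
    {"priority": 90, "product": "Firefox", "channel": "release",
     "locale": "en-CA", "mapping": "Firefox-release-en-CA"},
    {"priority": 50, "product": "Firefox", "channel": "release",
     "mapping": "Firefox-release-latest"},
]

MATCH_FIELDS = ("product", "channel", "locale", "version")


def rule_matches(rule, client):
    """A rule matches if every field it specifies equals the client's value."""
    return all(rule[f] == client.get(f) for f in MATCH_FIELDS if f in rule)


def pick_release(rules, client):
    """Return the mapping of the highest-priority matching rule, or None."""
    candidates = [r for r in rules if rule_matches(r, client)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["priority"])["mapping"]


client = {"product": "Firefox", "channel": "release",
          "locale": "en-CA", "version": "63.0"}
print(pick_release(RULES, client))
```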

updates

https://aus5.mozilla.org/update/6/Firefox/65.0a1/20181106100114/Linux_x86_64-gcc3/en-CA/nightly/Linux%204.18.0-2-amd64%20(GTK%203.24.1%2Clibpulse%2012.2.0)/ISET:SSE4_2,MEM:15790/default/default/update.xml

<updates>

<update type="minor" displayVersion="65.0a1" appVersion="65.0a1" platformVersion="65.0a1" buildID="20181109101751">

<patch type="complete" URL="https://archive.mozilla.org/pub/firefox/nightly/2018/11/2018-11-09-10-17-51-mozilla-central-l10n/firefox-65.0a1.en-CA.linux-x86_64.complete.mar" hashFunction="sha512" hashValue="a56f091a1a74a10917e6adbae6603593e89c8173d4d6a4689895ae31181e72c33e158988d12c19a8e4c52ec32a9df8df22ca891e42bf2cb7445476e45acb3d49" size="53647369"/>

</update></updates>
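As an illustration of what a client can do with that response, here is a small sketch that parses the XML above with the standard library and checks a downloaded file against the advertised size and hash. This is not how Firefox's updater is implemented; `parse_patches` and `verify_download` are hypothetical helpers.

```python
import hashlib
import xml.etree.ElementTree as ET

UPDATE_XML = """<updates>
<update type="minor" displayVersion="65.0a1" appVersion="65.0a1" platformVersion="65.0a1" buildID="20181109101751">
<patch type="complete" URL="https://archive.mozilla.org/pub/firefox/nightly/2018/11/2018-11-09-10-17-51-mozilla-central-l10n/firefox-65.0a1.en-CA.linux-x86_64.complete.mar" hashFunction="sha512" hashValue="a56f091a1a74a10917e6adbae6603593e89c8173d4d6a4689895ae31181e72c33e158988d12c19a8e4c52ec32a9df8df22ca891e42bf2cb7445476e45acb3d49" size="53647369"/>
</update></updates>"""


def parse_patches(xml_text):
    """Return one dict per <patch> element in an update response."""
    root = ET.fromstring(xml_text)
    patches = []
    for update in root.findall("update"):
        for patch in update.findall("patch"):
            patches.append({
                "build_id": update.get("buildID"),
                "type": patch.get("type"),
                "url": patch.get("URL"),
                "hash_function": patch.get("hashFunction"),
                "hash_value": patch.get("hashValue"),
                "size": int(patch.get("size")),
            })
    return patches


def verify_download(data, patch):
    """Check downloaded bytes against the size and hash from the response."""
    if len(data) != patch["size"]:
        return False
    digest = hashlib.new(patch["hash_function"], data).hexdigest()
    return digest == patch["hash_value"]


patches = parse_patches(UPDATE_XML)
print(patches[0]["type"], patches[0]["size"])
```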

Python can work with binary file formats too!

Firefox uses a custom archive format called MAR (Mozilla ARchive: https://wiki.mozilla.org/Software_Update:MAR )

We also use Python to process these MAR files to generate partial updates (binary diff between versions), as well as to sign the updates. The package to handle MAR files is called mardor, and relies on Python modules like cryptography and Construct.
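To give a flavour of binary parsing in Python, here is a small sketch that reads the start of a MAR file. It uses the standard library's `struct` module rather than Construct (which mardor uses), and it assumes the header layout as I understand it from the wiki page above: a 4-byte `MAR1` identifier followed by a big-endian 32-bit offset to the file index. `read_mar_header` is a hypothetical helper, not part of mardor, and the sample bytes are synthetic.

```python
import struct

MAR_MAGIC = b"MAR1"


def read_mar_header(data):
    """Return the index offset from the first 8 bytes of a MAR file."""
    if len(data) < 8:
        raise ValueError("too short to be a MAR file")
    # ">4sI": big-endian, 4 raw bytes, then an unsigned 32-bit integer.
    magic, index_offset = struct.unpack(">4sI", data[:8])
    if magic != MAR_MAGIC:
        raise ValueError("not a MAR file")
    return index_offset


# Synthetic header for demonstration; not a real MAR file.
sample = MAR_MAGIC + struct.pack(">I", 1024) + b"\x00" * 16
index_offset = read_mar_header(sample)
print(index_offset)
```

A declarative library like Construct earns its keep once the format grows past a couple of fields, since the same description can be used for both parsing and building files.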

https://github.com/mozilla/build-mar

https://construct.readthedocs.io/en/latest/

https://cryptography.io/en/latest/

why python?

Why have we chosen to use Python at Mozilla for our build and release pipeline?

Speed of development

Amazing ecosystem of modules available

Great community

Mature tooling:

Performance is generally not a concern. It’s good enough that we don’t spend much time optimizing. Bigger wins come from better architecture; e.g. stop processing so much data.

why not python?

Packaging and distribution are still hard. Setuptools, eggs, wheels, pip, Pipenv.

Pinning dependencies everywhere helps. Tools like pyup help, but aren’t perfect.

stuff I missed

Thanks!

Slides are at https://mzl.la/2yUhiqH

Get involved!

https://codetribute.mozilla.org/

How Mozilla uses Python to Build and Ship Firefox - PyCon Canada 2018