1 of 15

Firefox AI Runtime


2 of 15

Tarek Ziadé

Creator of the French Python User Group (Afpy)

Part of the Firefox AI/ML team at Mozilla

Author of some books about Python

Gawel and Tarek - PyCon FR 2009


3 of 15

Firefox AI Runtime Goal

Provide an inference API that runs offline, that we can use for our own internal use cases, and that we can surface to web extension developers.


4 of 15

Firefox Translations

  • Offline translation in Firefox
  • Started in 2019
  • Based on Bergamot and Marian NMT - https://browser.mt
  • RNN models trained for language pairs


5 of 15

Firefox Translations / Architecture

  • Forks a dedicated inference process
  • Runs Bergamot as WASM
  • Stores the runtime and the models in Remote Settings (~10 to 20 MB each)
  • Leverages Gemmology for fast inference

[Architecture diagram: the web page talks to the inference process, which runs bergamot.wasm with Gemmology (avx-vnni / neon i8mm) and pulls models, e.g. Eng -> FR, from Remote Settings]


6 of 15

Beyond translation

How can we support more inference tasks?

  • Describing images → image-to-text
  • Recognizing entities → named-entity recognition
  • Classifying text → text-classification / sentiment-analysis
  • Semantic search → feature-extraction
  • Summarizing → summarization
  • Text to speech → text-to-audio
  • Speech to text → automatic-speech-recognition
  • etc.

Can’t use Bergamot


7 of 15

🤗 Transformers.js

  • JavaScript port of Hugging Face’s Transformers (Python)
  • Built on top of Microsoft’s ONNX Runtime (WASM and WebGPU)
  • Enables using 1000+ models from the Hugging Face hub
  • Provides high-level APIs for pre- and post-processing of data


8 of 15

Example

import { pipeline } from '@xenova/transformers';

const captioner = await pipeline('image-to-text',
  'Xenova/vit-gpt2-image-captioning');
const url = 'https://example.com/cats.jpg';
const output = await captioner(url);
// [{ generated_text: 'a cat laying on a couch with another cat' }]

  1. Implements a set of classes per inference type
  2. Crawls the Hugging Face model hub
  3. Downloads and caches models on disk
  4. Runs an ONNX inference session using onnxruntime-web (WASM/WebGPU)


9 of 15

Transformers.js @ Firefox 133+

  • Added onnxruntime-web as a backend, like Bergamot
  • Vendored Transformers.js
  • Custom model cache in IndexedDB (cross-origin)
  • Can download models from our hub or Hugging Face’s
  • Uses the inference process too

[Architecture diagram: the web page talks to the inference process, which runs onnx.wasm alongside bergamot.wasm and downloads models from https://model-hub.mozilla.org or Remote Settings]


10 of 15

PDF.js alt-text


11 of 15

Enabling the Mozilla community

to build AI/ML features in Firefox

Do not zoom in on people’s faces, it’s scary


12 of 15

WebExtensions AI API

  • Available in Nightly
  • Preffed off in Firefox 134
  • Wraps our Firefox AI Runtime
  • Offers a high-level API to run inference in the browser with low friction
  • Enables the web dev community to experiment with inference easily
  • Gives us a way to iterate on an API design for the web


13 of 15

WebExtensions AI API

// SVE - Smallest Viable Example

// 1. Create the ML engine.
await browser.trial.ml.createEngine({ taskName: "summarization" });

// 2. Call it.
const res = await browser.trial.ml.runEngine({
  args: ["This is the text to summarize"],
});

// 3. Display the results.
console.log(res);


14 of 15

Demo. Let’s build a web extension.

The Internet has always been about cats.
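Before an extension can call `browser.trial.ml`, it has to declare and request the trial ML permission. A minimal manifest for the demo could look like the sketch below; the `"trialML"` optional permission matches Mozilla's documentation for the trial API, but treat the exact fields as something to check against the current docs, and the extension name is made up.

```json
{
  "manifest_version": 3,
  "name": "cat-captioner",
  "version": "1.0",
  "optional_permissions": ["trialML"]
}
```

The extension then calls `await browser.permissions.request({ permissions: ["trialML"] })` from a user gesture before creating an engine, so the user explicitly opts in to model downloads.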


15 of 15

Thanks!
