
1 of 102

JPEG XL

The next-generation “Alien Technology from the Future”

Last update: October 2024

The original version of this presentation was presented at ImageReady in October 2020

Slides by

Jon Sneyers

2 of 102

Prelude

Why image compression?

3 of 102

Digital images

Raster (grid, matrix, 2D array) of pixels (picture elements)
Every pixel has a color
For light-based images (shown on a screen), the RGB color model is used:

Every pixel consists of 3 subpixels: R, G, B
Many colors can be reproduced in this way

4 of 102

5 of 102

An image is worth a million words

Typical screen: 256 shades of Red, 256 shades of Green, 256 shades of Blue (8-bit color depth) → 256 x 256 x 256 = about 16.7 million different colors

HDR screens: 10-bit color so 2^30 = about 1 billion different colors

So 1 pixel requires (at least) 3 bytes
Typical screen: Full HD, which is 1920 x 1080 pixels (about 2 megapixels)
Typical camera: 12 to 20 megapixels (though there are phones with > 100 MP)
1 pixel = 3 bytes, so 1 megapixel = 3 megabytes (uncompressed)
A million words is about 6 megabytes (uncompressed)
So a screen-filling image holds about as much information as a million words (about 10 books), and one camera photo ≈ 100 books
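The arithmetic on this slide can be checked with a short sketch. The function name `uncompressed_bytes` is illustrative, and the sketch assumes 3 bytes per pixel with no padding or metadata.

```python
BYTES_PER_PIXEL = 3  # one byte each for R, G and B at 8-bit color depth


def uncompressed_bytes(width: int, height: int,
                       bytes_per_pixel: int = BYTES_PER_PIXEL) -> int:
    """Raw size of an image with no compression, padding or metadata."""
    return width * height * bytes_per_pixel


full_hd = uncompressed_bytes(1920, 1080)   # a screen-filling image
colors_8bit = 256 ** 3                     # 8 bits per channel
colors_10bit = 2 ** 30                     # 10 bits per channel

print(f"Full HD frame: {full_hd / 1e6:.1f} MB")
print(f"8-bit colors:  {colors_8bit:,}")
print(f"10-bit colors: {colors_10bit:,}")
print(f"12 MP photo:   {12_000_000 * BYTES_PER_PIXEL / 1e6:.0f} MB")
```

A Full HD frame comes out at roughly 6 MB, which is where the "million words" comparison comes from.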

6 of 102

An image is worth a million words

7 of 102

Lossless and lossy compression

Lossless compression: can reconstruct original exactly

For text: 8:1 is possible, 3:1 is easy (e.g. 1 GB worth of Wikipedia articles can be compressed to 117 MB with a lot of effort, or to 300 MB with little effort)
For (photographic) images: 2.5:1 is possible, 2:1 is easy

Lossy compression: information is lost, but (maybe) you don’t see the difference

Not done for text
For (photographic) images: 20:1 can still be “visually lossless”, 50:1 can often still look good

A 12 megapixel, 8-bit RGB image is

36 MB uncompressed
~15 MB losslessly compressed
~4 MB with high-fidelity lossy compression
~1 MB with low-fidelity lossy compression

Lossy compression makes images on the web possible
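As a quick check, the compression ratios implied by the 12 megapixel example can be derived from the sizes listed above. This is only a sketch: the sizes are the slide's ballpark figures and the dictionary keys are illustrative labels.

```python
UNCOMPRESSED_MB = 36.0  # 12 megapixels x 3 bytes per pixel

# Ballpark sizes from the slide, in megabytes.
sizes_mb = {
    "lossless": 15.0,
    "high-fidelity lossy": 4.0,
    "low-fidelity lossy": 1.0,
}

# Compression ratio = uncompressed size / compressed size.
ratios = {name: UNCOMPRESSED_MB / size for name, size in sizes_mb.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}:1")
```

The lossless figure lands between the "easy" 2:1 and the "possible" 2.5:1 quoted for photographic images.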

8 of 102

Web page weight

In 2010, median page weight was < 500 KB
In 2020, median page weight was 2 MB (source: https://httparchive.org/reports/page-weight)

27 KB for HTML
73 KB for CSS (usually mostly cached)
132 KB for fonts (usually mostly cached)
448 KB for JavaScript (usually mostly cached)
948 KB for images

9 of 102

It’s not just web pages

Photography

Google Photos alone stored over 4 trillion photos in 2020 (> 500 per human on the planet)
~2 trillion new photos per year = ~10 million TB (in JPEG) = ~300,000 hard drives (32 TB each)

Game assets
Satellite images
Medical images
...

10 of 102

TL;DR: Compression (still) matters!

11 of 102

JPEG

Standard since 1992, developed between 1986 and 1992
Created by the Joint Photographic Experts Group
JPEG was there at the right time:

First widely used graphical web browser: Mosaic in 1993
Digital cameras: in 1991 Kodak released a 1.3 megapixel camera that cost $13,000; since 2003, digital photography dominates

JPEG has lasted an exceptionally long time

For comparison: Google was founded in 1998, Facebook in 2004, Instagram in 2010

JPEG is an exceptionally well-designed image codec

12 of 102

“JPEG is Alien Technology from the Future”

– Tim Terriberry

13 of 102

Why do we need something new?

JPEG is (finally) hitting its limits:

Unfortunately we are stuck with the de facto JPEG standard, which is more limited than the full original JPEG specification
JPEG can only do 8-bit color depth, which is not enough for HDR

A careful implementation of JPEG like jpegli can squeeze out ~10-bit precision from the de facto format, but that requires updating applications

JPEG cannot do lossless
JPEG is not great for non-photographic images
JPEG cannot do alpha transparency
JPEG cannot do animation or multiple layers
Better compression is possible and desirable

14 of 102

15 of 102

“JPEG XL is Next-generation Alien Technology from the Future”

– Somebody from the future (I hope)

16 of 102

JPEG XL features

What can the JPEG XL codec do?

17 of 102

What we preserved from JPEG

18 of 102

What we preserved from JPEG

8x8 DCT
Progressive decoding
4:4:4 with optional chroma subsampling
Existing JPEG data can be represented as-is in JPEG XL: lossless recompression! (~20% smaller thanks to better entropy coding)

19 of 102

What we added

VarDCT (4x4, 32x32, 8x16, ...)
Better progressive
XYB color space
Adaptive quantization
Patches, splines, noise
Restoration filters
Better entropy coding
Lossless
High bit depth (up to 32-bit)
Alpha, depth, thermal, …
Multi-frame, layers
Parallel / cropped decode

20 of 102

JPEG

21 of 102

JPEG XL

22 of 102

More powerful, more complicated, but not slower

Decode can be faster than JPEG on multi-core
Encode can be faster than JPEG on multi-core
No special-purpose hardware needed
Can of course always do slower / even better encoding (we left a lot of “alien artifacts” in the bitstream for future encoders to take advantage of – it’s a very expressive bitstream)

23 of 102

Key features

Royalty-free (like WebP and AVIF, unlike HEIC)

24 of 102

Key features

Legacy-friendly

(unique feature)

Lossless JPEG recompression

25 of 102

Key features

Progressive decode, responsive by design (unlike any video-derived image format)

The usual progressive like in JPEG, but better
Saliency progression: e.g. send face detail first, background detail later
Reordering: e.g. send middle data first, rest later
Progressive DC, for auto-LQIP

26 of 102

27 of 102

Progressive decode

28 of 102

Key features

Optimized for high fidelity (unlike any video-derived format)
For still images, we can (usually) afford visually lossless, no need to go down to the low bitrates needed for video

29 of 102

Generation loss resilience

30 of 102

Fully automatic encoding

JPEG encoders: you still need to decide: is quality 70 ok on this image? Is 4:2:0 ok?
Some images look fine at q_50:420 while others look artifacted at q_85:444
Encoder settings are technical parameters, not a visual target!

human has to sit here to control the encoder

31 of 102

Fully automatic encoding

JPEG XL encoder: you specify a visual target

e.g. perceptual distance of 1 JND unit (just noticeable difference)

no human needed to control the encoder
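In practice, a visual target is passed to the reference encoder as a distance. The sketch below only builds the command line and does not run it. It assumes the `cjxl` tool from libjxl is installed and uses its `-d` (distance) and `-e` (effort) options. The helper name `cjxl_command` and the file names are hypothetical.

```python
import shlex


def cjxl_command(src: str, dst: str,
                 distance: float = 1.0, effort: int = 7) -> list[str]:
    """Build a cjxl invocation that targets a perceptual distance.

    distance=1.0 corresponds to roughly 1 JND; effort trades encode
    time for compression. Only the argument list is built here.
    """
    return ["cjxl", src, dst, "-d", str(distance), "-e", str(effort)]


cmd = cjxl_command("photo.png", "photo.jxl", distance=1.0, effort=7)
print(shlex.join(cmd))
# To actually encode: subprocess.run(cmd, check=True)
```

The same distance value is meant to give similar perceived quality across different images, which is why no per-image tuning is needed.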

32 of 102

Encoder consistency

Standard deviation of image quality (mean opinion scores across many images) for a given encoder setting is relatively low

so no surprises
as opposed to other modern codecs which are more inconsistent than JPEG

less consistent

more difference in quality for the same settings

more consistent

less difference in quality for the same settings

33 of 102

Subjective test results

Overall: ~10% better (smaller for same visual quality) than AVIF when comparing at same encode effort, ~20% if aligning by p5 quality rather than p50

34 of 102

Overall results hide some variation

Overall: ~10% better (smaller for same visual quality) than AVIF, ~20% better than WebP, ~30% better than JPEG

Diagrams (lossy non-photo): ~30% worse than AVIF, ~10% worse than WebP, ~40% better than JPEG

Landscape / nature: ~30% better than AVIF, ~30% better than WebP, ~30% better than JPEG

Portraits: ~20% better than AVIF, ~25% better than WebP, ~30% better than JPEG

35 of 102

← better compression

36 of 102

← better compression

37 of 102

[Chart legend: compression relative to the baseline, unoptimized JPEG (libjpeg-turbo, 4:4:4), for JPEG XL (libjxl e6), optimized JPEG (mozjpeg, 4:2:0), WebP (libwebp m4, 4:2:0) and AVIF (libaom s6, s7 and s8, 4:4:4). Quality levels shown: very low quality (q20), low quality (q30), medium quality (q50), high quality (q75), visually lossless (q95).]

38 of 102

Trade-offs between speed and compression

Codec	Enc speed (Mpx/s, single core)	MT (can use multi-threading effectively?)	Saving compared to unoptimized JPEG at medium-high quality	Saving compared to unoptimized JPEG at visually lossless quality
JPEG (libjpeg-turbo)	100	no	0%	0%
JPEG (mozjpeg)	5	no	15%	0%
WebP	6	no	20%	N/A
HEIC (x265)	2	yes, but	30%	25%
AVIF s9	5	yes, but	15%	10%
AVIF s8	2	yes, but	25%	20%
AVIF s6	1	yes, but	35%	25%
AVIF s0	0.01	yes, but	40%	35%
JPEG XL e3	10	yes	30%	40%
JPEG XL e6	5	yes	40%	45%

(ballpark numbers for a typical CPU and photographic image, actual numbers depend on the specific hardware and image)

39 of 102

JPEG XL features

Lossy and lossless

Lossless up to 32-bit per component (float32, float16, bfloat16, uint24, …)

Colorspace signaling and handling is part of the codestream/decoder (designed for HDR, wide gamut)
Supports extra channels: alpha, CMYK, spot colors, selection masks, depth maps, thermal, etc.
Layered images (alpha blending is specified and done by decoder)
Animation, image sequences, multi-page

Can be used across the workflow, from capture/authoring to web delivery

40 of 102

Objective results: HDR

41 of 102

Objective results: HDR

42 of 102

Lossless: best compression

Summary		Size difference vs. JPEG XL (-e 9 -E 3 -I 1)
Type	Image category	WebP	WebP v2	FLIF	AVIF	PNG	Effort 9	Effort 7	Effort 3	BMF (-s)	EMMA
Digital 2d Art	Art	16.8%	10.2%	5.8%	71.3%	47.7%	3.3%	8.3%	29.5%	-3.8%	-11.6%
Digital 2d Art	FanArt	19.2%	10.2%	8.7%	51.1%	52.9%	2.4%	9.7%	38.3%	-5.0%	-13.3%
Comics / B&W	Manga	21.1%	15.9%	6.0%	140.3%	53.9%	2.5%	10.5%	32.6%	-2.1%	-11.0%
Pixel Art	PixelArt	12.7%	14.2%	32.4%	106.4%	46.1%	5.0%	18.8%	274.9%	-10.0%	-0.5%
Emoji / Icons	Emoji	11.4%	6.0%	12.0%	56.0%	50.0%	3.2%	9.6%	25.6%	-7.1%	-4.7%
Digital 2d Art	Pixiv	15.4%	10.2%	9.7%	38.0%	50.7%	2.6%	6.6%	21.7%	-7.8%	-11.1%
Comics	ComixArt	20.4%	12.6%	6.0%	95.7%	58.3%	4.4%	12.2%	41.4%	-1.4%	-12.7%
Game Assets	GameSets	10.6%	6.8%	9.2%	79.2%	45.2%	6.6%	19.7%	70.2%	-10.9%	-15.8%
Game Screenshots	GameScr	13.1%	7.1%	5.8%	40.2%	41.3%	3.3%	7.0%	28.5%	-6.2%	-13.9%
Photo	ITAP	10.5%	5.2%	2.2%	47.6%	40.6%	2.8%	5.2%	19.6%	-6.6%	-10.9%
Photo	CLIC	11.9%	8.0%	4.0%	23.2%	41.1%	0.9%	2.6%	7.2%	-5.4%	-6.3%
Photo	LPCB	13.8%	11.0%	5.3%	33.8%	33.4%	1.2%	2.2%	12.9%	-5.0%	-8.1%
Digital 3d Art	LowPoly	16.1%	5.3%	3.9%	46.7%	43.8%	2.0%	10.1%	36.5%	-8.5%	-15.6%
Fractal Art	Fractals	10.9%	7.6%	3.7%	47.6%	38.8%	5.6%	10.9%	36.6%	-0.2%	-11.3%
Average % larger than jxl		14.6%	9.3%	8.2%	62.7%	46.0%	3.3%	9.5%	48.3%	-5.7%	-10.5%

43 of 102

Lossless image compression

[Chart: lossless encoders plotted by file size (smaller files) against encode speed (faster to encode), including PNG and libjxl from min effort to max effort.]

JPEG XL is Pareto best

44 of 102

45 of 102

46 of 102

JPEG XL development

Who created JPEG XL and how?

47 of 102

48 of 102

A collaborative effort

Google

Jyrki Alakuijala, Sami Boukortt, Martin Bruse, Iulia-Maria Comsa, Alex Deymo, Moritz Firsching, Thomas Fischbacher, Sebastian Gomez, Renata Khasanova, Evgenii Kliuchnikov, Jeffrey Lim, Robert Obryk, Krzysztof Potempa, Alexander Rhatushnyak, Zoltan Szabadka, Lode Vandevenne, Luca Versari, Jan Wassenberg

Cloudinary

Jon Sneyers (Eric Portis, Tal Lev-Ami, Colin Bendell, ...)

Other JPEG experts

Touradj Ebrahimi, Walt Husak, Andy Kuzma, Fernando Pereira, Antonio Pinheiro, Thomas Richter, Peter Schelkens, Osamu Watanabe, …

49 of 102

A brief history of JPEG XL: my background story

Back in 2011, together with Pieter Wuille, I got interested in image compression and started a pet project called JiF or JPiF
In October 2015, I released FLIF, the Free Lossless Image Format
In February 2016, I started working for Cloudinary
In September 2016, I presented FLIF at the IEEE ICIP conference, at the occasion of the Image Compression Grand Challenge organized by JPEG (“best lossless codec”)
End of 2017, JPEG started drafting a Call for Proposals for JPEG XL
In 2018, I made FUIF, the Free Universal Image Format, based on FLIF, and submitted it to the JPEG XL CfP

50 of 102

A brief history of JPEG XL: the Google compression team

In 2010, a Paris-based Google team created lossy WebP as single-frame WebM (VP8)
In 2012, the Zurich-based compression team added alpha and a lossless mode
They went on doing other things: Brotli (2013), WOFF 2.0, ZopfliPNG, ...
In 2016, they got interested in lossy image compression and created the Butteraugli perceptual metric and the Guetzli JPEG encoder
Brunsli (lossless JPEG recompression) was originally designed to save storage for Google Photos
In 2017, they released the first version of Pik
In 2018 they submitted Pik to the JPEG XL CfP

51 of 102

It took almost two years to combine Pik and FUIF

Early 2018: JPEG call for proposals for next-gen image codec
Oct 2018: 7 proposals submitted, including Pik and FUIF
Jan 2019: Combination of Pik+FUIF selected
Two years of shared development, many improvements made along the way

52 of 102

History

Oct 2018: 7 submissions to JPEG XL CfP investigated
Jan 2019: Combination of Pik+FUIF selected
April 2019: New Project (NP) started
July 2019: Committee Draft (CD)

53 of 102

History (2)

Jan 2020: Draft International Standard (DIS)
July 2020: DIS approved, with 181 ballot comments (!)
October 2020: all DIS ballot comments addressed
November 2020: final validation experiments

54 of 102

History (3)

December 2020: bitstream frozen
January 2021: Part 1 (codestream) FDIS submitted
April 2021: Part 2 (file format) FDIS submitted; Draft Amendment to Part 1 (Profile & Levels)

55 of 102

First editions published

18181-1 (JPEG XL core codestream): published in March 2022
18181-2 (JPEG XL file format): published in Oct 2021
18181-3 (JPEG XL conformance testing): published in Oct 2022
18181-4 (JPEG XL reference software): published in August 2022
Then started working on 2nd editions (corrections and clarifications)

56 of 102

Current status

18181-1 (JPEG XL core codestream): 2nd edition published July 2024
18181-2 (JPEG XL file format): 2nd edition published June 2024
18181-3 (JPEG XL conformance testing): 2nd edition FDIS submitted July 2024
18181-4 (JPEG XL reference software): 2nd edition not yet initiated, will be done when libjxl 1.0 is released

57 of 102

libjxl — reference implementation of JPEG XL

https://github.com/libjxl/libjxl

BSD 3-Clause license (permissive FOSS)

Development opened to external contributions in May 2021

Production-ready reference implementation

Royalty-free, free & open source, efficient and robust

58 of 102

libjxl releases

☑ Version 0.1: Format release candidate (Nov 14th, 2020)

☑ Version 0.2: Format release (December 24th, 2020)

No more backwards-incompatible changes after this (bitstreams created with encoder 0.2 remain decodable by future decoders)

☑ Version 0.3: Stable libjxl decode API (January 29th, 2021)

☐ Version 0.4: Full decoder implemented (“the missing release”)

No more forwards-incompatible changes after this (decoder version 0.4 can decode bitstreams from future encoders)

☑ Version 0.5: Encoder improvements (August 2nd, 2021)

☑ Version 0.6: API improvements (October 4th, 2021)

☑ Version 0.7: Semi-final API (September 21st, 2022)

☑ Version 0.8: fast lossless, API tweaks (January 18th, 2023)

☑ Version 0.9: color management in decoder (December 22nd, 2023)

☑ Version 0.10: streaming encoding, significant speedups (February 21st, 2024)

☑ Version 0.11: no more aborts, gain map API (September 13th, 2024)

☐ Version 1.0: Fully finalized encode+decode API (Q3 2024?)

59 of 102

fjxl — fast lossless jxl encoder (now libjxl e1)

60 of 102

Independent implementations

J40 (C): https://github.com/lifthrasiir/j40
jxlatte (Java): https://github.com/Traneptora/jxlatte
jxl-oxide (Rust): https://github.com/tirr-c/jxl-oxide
hydrium (C): https://github.com/Traneptora/hydrium
Ongoing: a hardware implementation based on libjxl-tiny

Conformance testing: https://libjxl.github.io/bench/

61 of 102

Industry support is growing

62 of 102

Adoption

Browsers:

Chrome: was supported behind a flag since April 7th, 2021 (tracking bug)

Removal announced on Oct 31, 2022; removed at M110

Firefox: initial implementation started end of April 2021 (tracking bug)

Mozilla standards position: neutral

Safari: supported since version 17 (tracking bug)
Browsers with jxl enabled by default: Pale Moon, Basilisk, Waterfox, LibreWolf, Thorium

Image libraries: ImageMagick, libvips, Imlib2, ffmpeg

OS-level / toolkits: All Apple OSes (iOS, macOS, …), GDK-pixbuf, Qt, gThumb

Image authoring: GIMP, Krita, darktable, Adobe Camera Raw, Affinity Photo

Image viewers: XnView, ImageGlass, gwenview, digiKam, KolourPaint, KPhotoAlbum, LXImage-Qt, qimgv, qView, nomacs, VookiImageViewer, PhotoQt, ...

63 of 102

“There is not enough interest from the entire ecosystem to continue experimenting with JPEG XL.”

We want JPEG XL support in all browsers!

We already support JPEG XL!

We will ship JPEG XL support if there is a Rust implementation.

etc etc

64 of 102

Let’s dive deeper

Details for the image compression geeks

65 of 102

JPEG XL

coding tools

VarDCT
Better progressive
XYB color space
Adaptive quant
Patches, splines, noise
Restoration filters
Better entropy coding
Lossless
High bit depth
Alpha, depth, ...
Multi-frame, layers
Parallel/cropped decode

Variable block sizes:

Square: DCT8x8, DCT16x16, DCT32x32, DCT64x64, DCT128x128, DCT256x256
Rectangular: DCT8x16, DCT16x8, DCT16x32, DCT32x16, DCT8x32, DCT32x8, DCT32x64, DCT64x32, DCT64x128, DCT128x64, DCT128x256, DCT256x128
Subdivided 8x8: DCT4x4, DCT2x2, DCT4x8, DCT8x4
“Corner” 8x8 DCT: AFV1 - AFV4
“Hornuss” transform (8x8, not DCT)

66 of 102

+ the giant ones (up to 256x256), not currently used in the encoder

JPEG

WebP

67 of 102

68 of 102

69 of 102

DC always first
Progressive DC: send DC as (progressive) frame
Progressive alpha / extra channels
Progressive lossless
Group reordering, e.g. middle-out
Arbitrary AC coefficient reordering
Passes can update coefficients, allowing saliency passes

70 of 102

Progressive decode

71 of 102

Human visual system based color space

72 of 102

Human visual system based color space
XYB:

Y: luma (L+M; black-white)
X: difference L-M (red-green)
B: S (blue; typically stored as B-Y so blue-yellow)
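The structure of XYB described above can be sketched in a few lines. This is a simplified illustration, not the normative transform: the mixing weights in `MIX` and the offset `BIAS` are rounded, assumed values chosen so each row sums to 1, and the function name is hypothetical. Inputs are assumed to be linear RGB in the range 0 to 1.

```python
# Assumed, rounded LMS mixing weights (each row sums to 1). These are
# illustrative placeholders, not the constants from the standard.
MIX = (
    (0.30, 0.622, 0.078),   # L (long-wavelength cone response)
    (0.23, 0.692, 0.078),   # M (medium-wavelength cone response)
    (0.243, 0.205, 0.552),  # S (short-wavelength cone response)
)
BIAS = 0.0038  # assumed small offset added before the cube root


def linear_rgb_to_xyb(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Map linear RGB to an XYB-like triple: X = L-M, Y = L+M, B = S."""
    l, m, s = (w[0] * r + w[1] * g + w[2] * b + BIAS for w in MIX)
    # Cube root models the compressive (non-linear) response of the eye.
    l, m, s = (v ** (1 / 3) - BIAS ** (1 / 3) for v in (l, m, s))
    x = (l - m) / 2   # red-green difference
    y = (l + m) / 2   # luma
    return x, y, s    # B is S here; a codec may store B-Y instead


gray = linear_rgb_to_xyb(0.5, 0.5, 0.5)
red = linear_rgb_to_xyb(1.0, 0.0, 0.0)
green = linear_rgb_to_xyb(0.0, 1.0, 0.0)
print(gray, red, green)
```

For neutral gray the L and M responses are equal, so X is zero. Red pushes X positive and green pushes it negative, which is the red-green axis named on the slide.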

73 of 102

Adaptive quantization: adjust quality locally as needed

74 of 102

A frame can be a “reference frame” and ‘Patches’ from it can be used in other frames
Allows encoding repeating elements only once and reusing them many times

E.g. text: store individual letters in a spritesheet reference frame
Can use (quantized) lossless compression for crisp letters

Very effective for mixed image contents

E.g. screenshots: use lossy for photo, lossless patches for text

75 of 102

Splines can be represented as centripetal Catmull-Rom curves with varying thickness and color, encoded compactly
E.g. to encode individual hairs (which don’t compress well with DCT)
Decoder is defined but (as you might expect) we don’t have an encoder yet that effectively uses this feature

76 of 102

Synthetic noise generation, intensity-modulated
Useful to replicate capture noise/grain or to make surfaces look more realistic
cjxl --photon_noise_iso=3200

77 of 102

Two kinds of smoothing filters to reduce blockiness and DCT artifacts
Gabor-like transform: encoder slightly sharpens before encoding, decoder slightly blurs after decoding
Adaptive edge-preserving filter with signalled strength

78 of 102

JPEG entropy coding: Huffman prefix coding
JPEG XL entropy coding:

prefix coding option for fast/hardware encode
ANS with optional LZ77, configurable hybrid symbol encoding

JPEG context model: none
JPEG XL context model:

Meta-Adaptive context modelling (from FLIF)
Context clustering (so MA tree can be a DAG)

79 of 102

State-of-the-art lossless

Combining the best ideas from FLIF, lossless WebP
Weighted self-correcting predictor (plus simpler predictors)
Predictor adjustable per context
Strong MA context modelling

Used for DC, alpha, signalling of block type selection, adaptive quantization field, edge-preserving filter strength, chroma from luma factors, etc.

80 of 102

JPEG XL: Modular mode

JPEG XL has two modes: VarDCT and Modular

Modular is used inside VarDCT mode to encode DC, quantweights, CfL weights, filter weights, patches, etc

Modular combines the best of

FLIF / FUIF
Lossless Pik
Lossless WebP
new stuff

81 of 102

Meta-adaptive context model; predictor per context

82 of 102

Meta-adaptive context model; predictor per context

Visualization of predictor choice

83 of 102

Prediction, Entropy coding and Context modeling

Prediction: don’t encode absolute pixel values, but subtract a predicted value so the numbers get closer to zero
Entropy coding: use shorter encodings for “more likely” numbers (typically small numbers)

Huffman prefix codes: e.g. use 0 to represent 0, 101 for -1, 100 for 1, etc
ANS: encode histogram first, then do something that can use less than 1 bit per symbol (better approximating the Shannon entropy)
lz77 distance codes: keep rolling window of past symbols, if something repeats, encode “take N symbols starting D symbols ago”

Context modeling: use different encodings in different contexts
Meta-adaptive context modeling: can signal the context model itself
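A small sketch of why prediction helps entropy coding: it compares the Shannon entropy of raw pixel values with the entropy of residuals after a Left (W) predictor. The helper names are illustrative, and the entropy figure is an idealized lower bound for a single static histogram, not what any particular codec achieves.

```python
import math
from collections import Counter


def shannon_bits_per_symbol(symbols: list[int]) -> float:
    """Ideal average code length for a static histogram of the symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def left_residuals(row: list[int]) -> list[int]:
    """Subtract the Left (W) neighbor from each value; first pixel predicts 0."""
    residuals = []
    previous = 0
    for value in row:
        residuals.append(value - previous)
        previous = value
    return residuals


row = list(range(100, 164))  # a smooth gradient with 64 distinct values
raw_bits = shannon_bits_per_symbol(row)
residual_bits = shannon_bits_per_symbol(left_residuals(row))

print(f"raw values: {raw_bits:.3f} bits/symbol")
print(f"residuals:  {residual_bits:.3f} bits/symbol")
```

The raw gradient needs 6 bits per symbol because every value is different, while the residuals are almost all 1 and cost a small fraction of a bit each. A Huffman code cannot go below 1 bit per symbol, which is where ANS has the advantage.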

84 of 102

Comparison

Codec	Prediction	Entropy coding	Context modeling
TIFF	None	RLE	/
JPEG	Left (DC) or none (AC)	Huffman	/
PNG	can choose per row	Huffman + lz77	/
WebP	can choose per block	Huffman + 2D-lz77	per block and per channel
FFV1	fixed (gradient)	AC	fixed (quantized local differences)
FLIF	can choose per channel	AC	MA trees
JPEG XL	can choose per context (predictor + offset + multiplier)	ANS (or Huffman) + 2D-lz77	MA trees specify context and predictor; more MA properties

85 of 102

Predictors

Predictor	PNG	Lossless WebP	FFV1	FLIF	JPEG XL
Zero (none)
Left (W), Top (N)
LeftLeft (WW)
Avg(Left, Top)
Paeth
Select
Gradient				(non-interlaced)
TopLeft (NW), TopRight (NE)
Avg(Left, TopLeft)
Avg(Top, TopLeft)
Avg(Top, TopRight)
Avg(Left, TopLeft, Top, TopRight)
Avg(Top, Avg(Left, TopRight))
Weighted sum of W, WW, N, NN, NE, NEE
Predictors involving Bottom				(interlaced)
Self-correcting (Weighted) Predictor
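Three of the predictors in the table can be written out as a sketch. Paeth follows the PNG definition. Gradient is shown in its clamped form, W + N - NW limited to the range spanned by W and N. The rounding used in `average` (floor division) is an assumption for illustration.

```python
def gradient(w: int, n: int, nw: int) -> int:
    """Clamped gradient: W + N - NW, limited to [min(W, N), max(W, N)]."""
    low, high = min(w, n), max(w, n)
    return max(low, min(high, w + n - nw))


def paeth(w: int, n: int, nw: int) -> int:
    """PNG Paeth predictor: pick the neighbor closest to W + N - NW."""
    estimate = w + n - nw
    dist_w = abs(estimate - w)
    dist_n = abs(estimate - n)
    dist_nw = abs(estimate - nw)
    if dist_w <= dist_n and dist_w <= dist_nw:
        return w
    if dist_n <= dist_nw:
        return n
    return nw


def average(a: int, b: int) -> int:
    """Average of two neighbors (rounding down is an assumption here)."""
    return (a + b) // 2


print(gradient(10, 20, 5), gradient(10, 20, 15))
print(paeth(10, 20, 5), paeth(10, 20, 15))
print(average(10, 21))
```

Both Gradient and Paeth start from the same plane-fit estimate. Gradient clamps it, while Paeth snaps it to one of the three neighbors.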

86 of 102

AvgAll

West

North

NorthWest

NorthEast

Gradient

AvgN+NW

AvgW+NW

87 of 102

AvgAll

West

North

NorthWest

NorthEast

Gradient

AvgN+NW

AvgW+NW

88 of 102

Context properties

Property	FFV1	FLIF	JPEG XL
Previous channel(s) pixel value(s)
Previous channel(s) error
Absolute value of the above
Gradient
Left-TopLeft (W-NW)	(quantized)
TopLeft-Top (NW-N)	(quantized)
Top-TopRight (N-NE)	(quantized)
TopTop-Top (NN-N)	(quantized, ‘large’ model)
LeftLeft-Left (WW-W)	(quantized, ‘large’ model)
Properties involving Bottom		(interlaced mode)
Left Gradient error
Group index
x and y position
abs(N), abs(W)
Weighted Predictor max-amplitude-error

89 of 102

Minimum bits per symbol (best-case)

Huffman prefix codes: 1 bpp is the minimum (assuming 1 symbol per pixel)
lz77: (cost of signalling max-length match) / (max-length) ≈ 0.001 bpp in PNG/zlib (for a fully black image)
FLIF’s AC: max chance is 4095/4096, which corresponds to ~0.0004 bpp
JPEG XL: singleton histogram is asymptotically 0 bpp
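The FLIF and singleton-histogram figures follow from the ideal cost of coding a symbol with probability p, which is -log2(p) bits. A minimal check (the function name is illustrative):

```python
import math


def bits_for_probability(p: float) -> float:
    """Ideal cost in bits of coding one symbol that has probability p."""
    return -math.log2(p)


flif_floor = bits_for_probability(4095 / 4096)  # FLIF's maximum chance
singleton = bits_for_probability(1.0)           # only one possible symbol

print(f"FLIF best case: {flif_floor:.6f} bits per symbol")
print(f"singleton histogram: {abs(singleton):.1f} bits per symbol")
```

With a singleton histogram the symbol is certain, so each additional pixel costs nothing and only the fixed cost of signalling the histogram remains.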

90 of 102

Header overhead: how many bytes is a 1-pixel image?

Format	white #FFFFFF	black #000000	gray #7F7F7F	yellow #FFFF00	transparent #00000000	semitransparent #1337BABE	Smallest valid file (any size)
PNG	67	67	67	69	68	70	67
GIF	35	35	43	35	37	/	35
JPEG	160	160	159	288	/	/	107
WebP	38	34	38	36	34	38	26
FLIF	15	14	15	18	14	20	14
JPEG XL	19	13	20	20	17	25	12
AVIF	314	307	304	314	500	507	298
HEIC	540	540	575	582	871	950	386

91 of 102

Interlude: JPEG XL Art

What is it and how does it work?

92 of 102

JPEG XL

AVIF

117 byte jxl file:

117 byte avif file:

“Hi there! I am an AVIF image, with the brands avif, mif1, miaf, MA1A, encoded with libavif, and you should handle me as a picture, and my dimensions are —”

93 of 102

JPEG XL Art

Challenge: make a tiny valid bitstream that produces an interesting image
Approach:

Hand-craft an MA tree using various predictors and offsets
Let all residuals be zero
Result: singleton histogram so image itself is asymptotically 0 bpp

Can add modular transforms to spice things up: reversible color transforms (RCT), Squeeze
Can use multiple layers or make animations too
Splines can be used
Other approaches could be used using other coding tools of jxl (patches, VarDCT mode, …) — hasn’t been done yet

94 of 102

JPEG XL

117 byte jxl file:

Width 1000
Height 1000
RCT 1
if y > 400
  if x > 500
    if WGH > 0
      - AvgN+NW + 2
      - AvgN+NW - 2
    if WGH > 0
      - AvgN+NE + 2
      - AvgN+NE - 2
  if c > 1
    if y > 0
      if N > -10
        - N - 1
        - N + 0
      - Set 300
    if c > 0
      if y > 0
        - N + 1
        - Set -510
      if y > 0
        if y > 220
          if x > 499
            if x > 500
              - NW + 0
              if y > 280
                if N > -100
                  - N - 10
                  - Set 0
                if N > 300
                  - Set 200
                  - N + 10
            - NE + 0
          if x > 500
            - NE + 0
            - NW + 0
        if W > -100
          if W > 300
            - Set 200
            if x > 960
              - W + 10
              if x > 800
                - W - 10
                if x > 628
                  - W + 10
                  if x > 340
                    - W - 10
                    if x > 175
                      - W + 10
                      if x > 0
                        - W - 10
                        - Set 255
          - Set 0
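To show how a tree like this produces pixels when all residuals are zero, here is a toy interpreter. It is a sketch under several assumptions: it handles a single channel, only the properties x, y, N and W, and only the Set, N and W predictors. The edge rules (W falls back to N on the left edge, N falls back to W on the top row) are also assumed. The function names and the small example tree are hypothetical, and this is not the libjxl implementation.

```python
def parse(lines):
    """Parse preorder 'if P > V' and '- PRED +/- K' lines into a nested tree."""
    tokens = iter(lines)

    def node():
        parts = next(tokens).split()
        if parts[0] == "if":
            prop, value = parts[1], int(parts[3])
            yes = node()  # first child: condition is true
            no = node()   # second child: condition is false
            return ("if", prop, value, yes, no)
        if parts[1] == "Set":
            return ("leaf", "Set", int(parts[2]))
        sign = 1 if parts[2] == "+" else -1
        return ("leaf", parts[1], sign * int(parts[3]))

    return node()


def render(tree, width, height):
    """Decode an image where every residual is zero: pixel = prediction."""
    img = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Assumed edge handling for missing neighbors.
            w = img[y][x - 1] if x > 0 else (img[y - 1][x] if y > 0 else 0)
            n = img[y - 1][x] if y > 0 else w
            props = {"x": x, "y": y, "W": w, "N": n}
            current = tree
            while current[0] == "if":
                _, prop, value, yes, no = current
                current = yes if props[prop] > value else no
            _, predictor, offset = current
            base = 0 if predictor == "Set" else props[predictor]
            img[y][x] = base + offset
    return img


toy_tree = parse([
    "if y > 0",
    "  - N + 1",
    "  if x > 0",
    "    - W + 2",
    "    - Set 10",
])
image = render(toy_tree, width=4, height=3)
for row in image:
    print(row)
```

The first row is seeded by Set and grows through W + 2, and every later row adds 1 to the row above. The whole image is described by the tree alone, which is why such files can be so small.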

95 of 102

Online JXL Art tool

https://jxl-art.surma.technology/

JXL Art gallery

https://jpegxl.info/art

96 of 102

(end of interlude on JPEG XL Art)

97 of 102

Up to 32-bit per channel lossless (float or int)
Fully ready for HDR & wide gamut
Not just for end-user image delivery, also for authoring workflows

image by Masakazu Matsumoto

98 of 102

Extra channels supported natively

No need to do e.g. alpha at the container level
Can be subsampled (e.g. 1:8 for depth maps)
Have their own bit depth
Can do cropped or progressive decode with extra channels

Alpha (e.g. add alpha to existing JPEG image!)
Depth map
Thermal
Spot colors
The K of CMYK
Selection masks
Unknown but optional (can ignore)
Unknown and not optional (need to update decoder)

99 of 102

Multi-frame: for animation or photo bursts

Can subtract previous frame
Frame can be smaller than the canvas
Blend and dispose modes like in APNG/GIF/AWebP
Patches can be used for moving sprites
Zero-delay frames are allowed
A real video codec will work better though (no real inter prediction)

Composite still image: multiple layers, overlaid by the decoder

E.g. add lossless text or logo overlay over a lossy photo

100 of 102

Everything is stored in independent groups of 256x256 pixels

In modular mode, the group size can be adjusted to 128x128, 512x512 or 1024x1024
Group border artifacts are avoided: restoration filters (Gaborish, EPF) are still applied across group edges

Allows parallel decode for larger images
Also allows efficient cropped (ROI) decode

E.g. for zooming in on gigapixel images
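A sketch of why independent groups make cropped decode cheap: only the groups that intersect the region of interest need to be decoded. The row-major group numbering and the function name are assumptions for illustration, not the bitstream's actual group ordering.

```python
GROUP_SIZE = 256  # default group dimension in pixels


def groups_for_roi(x0: int, y0: int, x1: int, y1: int,
                   image_width: int, group: int = GROUP_SIZE) -> list[int]:
    """Indices of the groups covering the region [x0, x1) x [y0, y1).

    Groups are numbered row by row from the top-left (an assumption).
    """
    groups_per_row = -(-image_width // group)  # ceiling division
    columns = range(x0 // group, (x1 - 1) // group + 1)
    rows = range(y0 // group, (y1 - 1) // group + 1)
    return [r * groups_per_row + c for r in rows for c in columns]


# A 100x100 crop that straddles a group corner in a 1000-pixel-wide image.
needed = groups_for_roi(200, 200, 300, 300, image_width=1000)
print(needed)
```

In a 1000x1000 image there are 16 groups, and this crop touches only 4 of them. The same independence lets separate threads decode different groups in parallel.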

101 of 102

More info: jpegxl.info
JPEG XL Discord

jon@cloudinary.com

Twitter/Bluesky: @jonsneyers

Thank You.