1 of 102

JPEG XL

The next-generation “Alien Technology from the Future”

Last update: October 2024

The original version of this presentation was presented at ImageReady in October 2020

Slides by

Jon Sneyers

&


Prelude

Why image compression?


Digital images

  • Raster (grid, matrix, 2D array) of pixels (picture elements)
  • Every pixel has a color
  • For light-based images (shown on a screen), the RGB color model is used:
    • Every pixel consists of 3 subpixels: R, G, B
    • Many colors can be reproduced in this way


An image is worth a million words

  • Typical screen: 256 shades of Red, 256 shades of Green, 256 shades of Blue (8-bit color depth) → 256 x 256 x 256 = about 16 million different colors
    • HDR screens: 10-bit color so 2^30 = about 1 billion different colors
  • So 1 pixel requires (at least) 3 bytes
  • Typical screen: Full HD, which is 1920 x 1080 pixels (about 2 megapixels)
  • Typical camera: 12 to 20 megapixels (though there are phones with > 100 MP)
  • 1 pixel = 3 bytes, so 1 megapixel = 3 megabytes (uncompressed)
  • A million words is about 6 megabytes (uncompressed)
  • So a screen-filling image takes about the same amount of information as a million words (which is about 10 books), 1 camera photo ≈ 100 books
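The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the helper name `raw_size_bytes` is made up for illustration, it assumes 3 channels stored in whole bytes without padding, and 4000 x 3000 is just one possible shape for a 12 megapixel photo.

```python
def raw_size_bytes(width, height, channels=3, bits_per_channel=8):
    """Uncompressed size in bytes, assuming no padding (hypothetical helper)."""
    bytes_per_sample = (bits_per_channel + 7) // 8  # round up to whole bytes
    return width * height * channels * bytes_per_sample

colors_8bit = 256 ** 3    # distinct colors at 8 bits per channel
colors_10bit = 2 ** 30    # distinct colors at 10 bits per channel
full_hd = raw_size_bytes(1920, 1080)
camera_12mp = raw_size_bytes(4000, 3000)  # assumed 12 MP dimensions

print(colors_8bit, colors_10bit)
print(full_hd / 1e6, "MB for a Full HD screen")
print(camera_12mp / 1e6, "MB for a 12 megapixel photo")
```

A Full HD frame comes out at roughly 6 MB, which is the "million words" figure above.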


Lossless and lossy compression

  • Lossless compression: can reconstruct original exactly
    • For text: 8:1 is possible, 3:1 is easy (e.g. 1 GB worth of Wikipedia articles can be compressed to 117 MB with a lot of effort, or to 300 MB with little effort)
    • For (photographic) images: 2.5:1 is possible, 2:1 is easy
  • Lossy compression: information is lost, but (maybe) you don’t see the difference
    • Not done for text
    • For (photographic) images: 20:1 can still be “visually lossless”, 50:1 can often still look good
  • A 12 megapixel, 8-bit RGB image is
    • 36 MB uncompressed
    • ~15 MB losslessly compressed
    • ~4 MB with high-fidelity lossy compression
    • ~1 MB with low-fidelity lossy compression
  • Lossy compression makes images on the web possible
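The compression ratios implied by the 12 megapixel example can be worked out directly. A small sketch, using only the ballpark sizes listed above (the dictionary keys are labels chosen here for illustration):

```python
RAW_MB = 12 * 3  # 12 megapixels x 3 bytes per pixel = 36 MB

# Ballpark compressed sizes from the list above, in MB.
sizes_mb = {
    "lossless": 15,
    "high-fidelity lossy": 4,
    "low-fidelity lossy": 1,
}

ratios = {name: RAW_MB / size for name, size in sizes_mb.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}:1")
```

These are rough numbers for one typical photo; the achievable ratio depends heavily on the image content.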


Web page weight

  • In 2010, median page weight was < 500 KB
  • In 2020, median page weight was 2 MB (source: https://httparchive.org/reports/page-weight)
    • 27 KB for HTML
    • 73 KB for CSS (usually mostly cached)
    • 132 KB for fonts (usually mostly cached)
    • 448 KB for JavaScript (usually mostly cached)
    • 948 KB for images


It’s not just web pages

  • Photography
    • Google Photos alone stored over 4 trillion photos in 2020 (> 500 per human on the planet)
    • ~2 trillion new photos per year = ~10 million TB (in JPEG) = ~300,000 hard drives (32 TB each)
  • Game assets
  • Satellite images
  • Medical images
  • ...
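The photo storage figures can be sanity-checked as follows. A rough sketch: the world population of 7.8 billion is an assumption added here, and 1 TB is taken as one million MB.

```python
photos_stored = 4e12      # Google Photos, 2020
world_population = 7.8e9  # assumption, roughly the 2020 figure
photos_per_year = 2e12
total_tb = 10e6           # ~10 million TB per year
drive_tb = 32

photos_per_human = photos_stored / world_population
avg_photo_mb = total_tb * 1e6 / photos_per_year  # 1 TB = 1e6 MB
drives_needed = total_tb / drive_tb

print(round(photos_per_human), "photos per human")
print(avg_photo_mb, "MB per photo on average")
print(drives_needed, "drives of 32 TB")
```

So the estimate corresponds to an average JPEG of about 5 MB and a little over 300,000 drives per year.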


TL;DR: Compression (still) matters!


JPEG

  • Standard since 1992, developed between 1986 and 1992
  • Created by the Joint Photographic Experts Group
  • JPEG was there at the right time:
    • First graphical web browser: Mosaic in 1993
    • Digital cameras: in 1991 Kodak released a 1.3 megapixel camera that cost $13,000; since 2003, digital photography dominates
  • JPEG has lasted an exceptionally long time
    • For comparison: Google was founded in 1998, Facebook in 2004, Instagram in 2010
  • JPEG is an exceptionally well-designed image codec


“JPEG is Alien Technology from the Future”

– Tim Terriberry


Why do we need something new?

  • JPEG is (finally) hitting its limits:
    • Unfortunately we are stuck with the de facto JPEG standard, which is more limited than the full original JPEG specification
    • JPEG can only do 8-bit color depth, which is not enough for HDR
      • A careful implementation of JPEG like jpegli can squeeze out ~10-bit precision from the de facto format, but that requires updating applications
    • JPEG cannot do lossless
    • JPEG is not great for non-photographic images
    • JPEG cannot do alpha transparency
    • JPEG cannot do animation or multiple layers
    • Better compression is possible and desirable


“JPEG XL is Next-generation Alien Technology from the Future”

– Somebody from the future (I hope)


JPEG XL features

What can the JPEG XL codec do?


What we preserved from JPEG

  • 8x8 DCT
  • Progressive decoding
  • 4:4:4 with optional chroma subsampling
  • Existing JPEG data can be represented as-is in JPEG XL: lossless recompression! (~20% smaller thanks to better entropy coding)


What we added

  • VarDCT (4x4, 32x32, 8x16, ...)
  • Better progressive
  • XYB color space
  • Adaptive quantization
  • Patches, splines, noise
  • Restoration filters
  • Better entropy coding
  • Lossless
  • High bit depth (up to 32-bit)
  • Alpha, depth, thermal, …
  • Multi-frame, layers
  • Parallel / cropped decode

JPEG

JPEG XL

More powerful, more complicated, but not slower

  • Decode can be faster than JPEG on multi-core
  • Encode can be faster than JPEG on multi-core
  • No special-purpose hardware needed
  • Can of course always do slower / even better encoding (we left a lot of “alien artifacts” in the bitstream for future encoders to take advantage of – it’s a very expressive bitstream)


Key features

  • Royalty-free (like WebP and AVIF, unlike HEIC)

  • Legacy-friendly: lossless JPEG recompression (unique feature)

  • Progressive decode, responsive by design (unlike any video-derived image format)
    • The usual progressive like in JPEG, but better
    • Saliency progression: e.g. send face detail first, background detail later
    • Reordering: e.g. send middle data first, rest later
    • Progressive DC, for auto-LQIP


Progressive decode

Key features

  • Optimized for high fidelity (unlike any video-derived format)
  • For still images, we can (usually) afford visually lossless, no need to go down to the low bitrates needed for video


Generation loss resilience


Fully automatic encoding

  • JPEG encoders: you still need to decide: is quality 70 ok on this image? is 4:2:0 ok?
  • Some images look fine at q_50:420 while others look artifacted at q_85:444
  • Encoder settings are technical parameters, not a visual target!

human has to sit here to control the encoder

Fully automatic encoding

  • JPEG XL encoder: you specify a visual target
    • e.g. perceptual distance of 1 JND unit (just noticeable difference)


no human needed to control the encoder
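With the reference encoder, a visual target like this is passed as a distance on the command line. Below is a minimal sketch that only builds a `cjxl` command line and does not run it. The helper name `cjxl_command` and the file names are made up for illustration; `-d` (distance) and `-e` (effort) are the cjxl options assumed here.

```python
def cjxl_command(src, dst, distance=1.0, effort=7):
    """Build a cjxl command line that targets a perceptual distance.

    distance is the visual target in JND units (0 would be lossless);
    effort trades encode time for density.
    """
    return ["cjxl", src, dst, "-d", str(distance), "-e", str(effort)]

cmd = cjxl_command("photo.png", "photo.jxl")  # hypothetical file names
print(" ".join(cmd))
# To actually encode, one could pass cmd to subprocess.run(cmd, check=True).
```

The same command works unchanged for every image, which is the point of specifying a visual target instead of a quality number.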


Encoder consistency

  • Standard deviation of image quality (mean opinion scores across many images) for a given encoder setting is relatively low
    • so no surprises
    • as opposed to other modern codecs which are more inconsistent than JPEG

[Chart annotations: less consistent = more difference in quality for the same settings; more consistent = less difference in quality for the same settings]

Subjective test results

Overall: ~10% better (smaller for same visual quality) than AVIF when comparing at same encode effort, ~20% if aligning by p5 quality rather than p50


Overall results hide some variation

Overall: ~10% better (smaller for same visual quality) than AVIF, ~20% better than WebP, ~30% better than JPEG

Diagrams (lossy non-photo): ~30% worse than AVIF, ~10% worse than WebP, ~40% better than JPEG

Landscape / nature: ~30% better than AVIF, ~30% better than WebP, ~30% better than JPEG

Portraits: ~20% better than AVIF, ~25% better than WebP, ~30% better than JPEG

[Charts: compression compared across quality levels, where left means better compression. Codecs shown: JPEG XL (libjxl e6); unoptimized JPEG (libjpeg-turbo, 4:4:4) as the baseline; optimized JPEG (mozjpeg, 4:2:0); AVIF (libaom s6, s7, s8, 4:4:4); WebP (libwebp m4, 4:2:0). Quality levels: very low quality (q20), low quality (q30), medium quality (q50), high quality (q75), visually lossless (q95).]

Trade-offs between speed and compression

Codec | Enc speed (Mpx/s, single core) | MT (can use multi-threading effectively?) | Saving compared to unoptimized JPEG at medium-high quality | Saving compared to unoptimized JPEG at visually lossless quality
JPEG (libjpeg-turbo) | 100 | no | 0% | 0%
JPEG (mozjpeg) | 5 | no | 15% | 0%
WebP | 6 | no | 20% | N/A
HEIC (x265) | 2 | yes, but | 30% | 25%
AVIF s9 | 5 | yes, but | 15% | 10%
AVIF s8 | 2 | yes, but | 25% | 20%
AVIF s6 | 1 | yes, but | 35% | 25%
AVIF s0 | 0.01 | yes, but | 40% | 35%
JPEG XL e3 | 10 | yes | 30% | 40%
JPEG XL e6 | 5 | yes | 40% | 45%

(ballpark numbers for a typical CPU and photographic image; actual numbers depend on the specific hardware and image)


JPEG XL features

  • Lossy and lossless
    • Lossless up to 32-bit per component (float32, float16, bfloat16, uint24, …)
  • Colorspace signaling and handling is part of the codestream/decoder (designed for HDR, wide gamut)
  • Supports extra channels: alpha, CMYK, spot colors, selection masks, depth maps, thermal, etc.
  • Layered images (alpha blending is specified and done by decoder)
  • Animation, image sequences, multi-page

  • Can be used across the workflow, from capture/authoring to web delivery


Objective results: HDR


Lossless: best compression

Summary: size difference vs. JPEG XL (-e 9 -E 3 -I 1)

Image category | WebP | WebP v2 | FLIF | AVIF | PNG | Effort 9 | Effort 7 | Effort 3 | BMF (-s) | EMMA
Digital 2d Art | 16.8% | 10.2% | 5.8% | 71.3% | 47.7% | 3.3% | 8.3% | 29.5% | -3.8% | -11.6%
Digital 2d Art | 19.2% | 10.2% | 8.7% | 51.1% | 52.9% | 2.4% | 9.7% | 38.3% | -5.0% | -13.3%
Comics / B&W | 21.1% | 15.9% | 6.0% | 140.3% | 53.9% | 2.5% | 10.5% | 32.6% | -2.1% | -11.0%
Pixel Art | 12.7% | 14.2% | 32.4% | 106.4% | 46.1% | 5.0% | 18.8% | 274.9% | -10.0% | -0.5%
Emoji / Icons | 11.4% | 6.0% | 12.0% | 56.0% | 50.0% | 3.2% | 9.6% | 25.6% | -7.1% | -4.7%
Digital 2d Art | 15.4% | 10.2% | 9.7% | 38.0% | 50.7% | 2.6% | 6.6% | 21.7% | -7.8% | -11.1%
Comics | 20.4% | 12.6% | 6.0% | 95.7% | 58.3% | 4.4% | 12.2% | 41.4% | -1.4% | -12.7%
Game Assets | 10.6% | 6.8% | 9.2% | 79.2% | 45.2% | 6.6% | 19.7% | 70.2% | -10.9% | -15.8%
Game Screenshots | 13.1% | 7.1% | 5.8% | 40.2% | 41.3% | 3.3% | 7.0% | 28.5% | -6.2% | -13.9%
Photo | 10.5% | 5.2% | 2.2% | 47.6% | 40.6% | 2.8% | 5.2% | 19.6% | -6.6% | -10.9%
Photo | 11.9% | 8.0% | 4.0% | 23.2% | 41.1% | 0.9% | 2.6% | 7.2% | -5.4% | -6.3%
Photo | 13.8% | 11.0% | 5.3% | 33.8% | 33.4% | 1.2% | 2.2% | 12.9% | -5.0% | -8.1%
Digital 3d Art | 16.1% | 5.3% | 3.9% | 46.7% | 43.8% | 2.0% | 10.1% | 36.5% | -8.5% | -15.6%
Fractal Art | 10.9% | 7.6% | 3.7% | 47.6% | 38.8% | 5.6% | 10.9% | 36.6% | -0.2% | -11.3%
Average % larger than jxl | 14.6% | 9.3% | 8.2% | 62.7% | 46.0% | 3.3% | 9.5% | 48.3% | -5.7% | -10.5%

Lossless image compression

[Chart: file size against encode speed, with arrows pointing toward smaller files and faster encoding. libjxl spans the range from min effort to max effort; PNG is shown for comparison. JPEG XL is Pareto best.]

JPEG XL development

Who created JPEG XL and how?


A collaborative effort

Google

Jyrki Alakuijala, Sami Boukortt, Martin Bruse, Iulia-Maria Comsa, Alex Deymo, Moritz Firsching, Thomas Fischbacher, Sebastian Gomez, Renata Khasanova, Evgenii Kliuchnikov, Jeffrey Lim, Robert Obryk, Krzysztof Potempa, Alexander Rhatushnyak, Zoltan Szabadka, Lode Vandevenne, Luca Versari, Jan Wassenberg

Cloudinary

Jon Sneyers (Eric Portis, Tal Lev-Ami, Colin Bendell, ...)

Other JPEG experts

Touradj Ebrahimi, Walt Husak, Andy Kuzma, Fernando Pereira, Antonio Pinheiro, Thomas Richter, Peter Schelkens, Osamu Watanabe, …


A brief history of JPEG XL: my background story

  • Back in 2011, together with Pieter Wuille, I got interested in image compression and started a pet project called JiF or JPiF
  • In October 2015, I released FLIF, the Free Lossless Image Format
  • In February 2016, I started working for Cloudinary
  • In September 2016, I presented FLIF at the IEEE ICIP conference, on the occasion of the Image Compression Grand Challenge organized by JPEG (“best lossless codec”)
  • End of 2017, JPEG started drafting a Call for Proposals for JPEG XL
  • In 2018, I made FUIF, the Free Universal Image Format, based on FLIF, and submitted it to the JPEG XL CfP


A brief history of JPEG XL: the Google compression team

  • In 2010, a Paris-based Google team created lossy WebP as single-frame WebM (VP8)
  • In 2012, the Zurich-based compression team added alpha and a lossless mode
  • They went on doing other things: Brotli (2013), WOFF 2.0, ZopfliPNG, ...
  • In 2016, they got interested in lossy image compression and created the Butteraugli perceptual metric and the Guetzli JPEG encoder
  • Brunsli (lossless JPEG recompression) was originally designed to save storage for Google Photos
  • In 2017, they released the first version of Pik
  • In 2018 they submitted Pik to the JPEG XL CfP


It took almost two years to combine Pik and FUIF

  • Early 2018: JPEG call for proposals for next-gen image codec
  • Oct 2018: 7 proposals submitted, including Pik and FUIF
  • Jan 2019: Combination of Pik+FUIF selected
  • Two years of shared development, many improvements made along the way


History

  • Oct 2018: 7 submissions to JPEG XL CfP investigated
  • Jan 2019: Combination of Pik+FUIF selected
  • April 2019: New Project (NP) started
  • July 2019: Committee Draft (CD)


History (2)

  • Jan 2020: Draft International Standard (DIS)
  • July 2020: DIS approved, with 181 ballot comments (!)
  • October 2020: all DIS ballot comments addressed
  • November 2020: final validation experiments


History (3)

  • December 2020: bitstream frozen
  • January 2021: Part 1 (codestream) FDIS submitted
  • April 2021: Part 2 (file format) FDIS submitted; Draft Amendment to Part 1 (Profile & Levels)

First editions published

  • 18181-1 (JPEG XL core codestream): published in March 2022
  • 18181-2 (JPEG XL file format): published in Oct 2021
  • 18181-3 (JPEG XL conformance testing): published in Oct 2022
  • 18181-4 (JPEG XL reference software): published in August 2022
  • Then started working on 2nd editions (corrections and clarifications)


Current status

  • 18181-1 (JPEG XL core codestream): 2nd edition published July 2024
  • 18181-2 (JPEG XL file format): 2nd edition published June 2024
  • 18181-3 (JPEG XL conformance testing): 2nd edition FDIS submitted July 2024
  • 18181-4 (JPEG XL reference software): 2nd edition not yet initiated, will be done when libjxl 1.0 is released

libjxl — reference implementation of JPEG XL

https://github.com/libjxl/libjxl

BSD-3 license (permissive FOSS)

Development opened to external contributions in May 2021

Production-ready reference implementation

Royalty-free, free & open source, efficient and robust


libjxl releases

Version 0.1: Format release candidate (Nov 14th, 2020)

Version 0.2: Format release (December 24th, 2020)

No more backwards-incompatible changes after this (bitstreams created with encoder 0.2 remain decodable by future decoders)

Version 0.3: Stable libjxl decode API (January 29th, 2021)

Version 0.4: Full decoder implemented (“the missing release”)

No more forwards-incompatible changes after this (decoder version 0.4 can decode bitstreams from future encoders)

Version 0.5: Encoder improvements (August 2nd, 2021)

Version 0.6: API improvements (October 4th, 2021)

Version 0.7: Semi-final API (September 21st, 2022)

Version 0.8: fast lossless, API tweaks (January 18th, 2023)

Version 0.9: color management in decoder (December 22nd, 2023)

Version 0.10: streaming encoding, significant speedups (February 21st, 2024)

Version 0.11: no more aborts, gain map API (September 13th, 2024)

Version 1.0: Fully finalized encode+decode API (Q3 2024?)


fjxl — fast lossless jxl encoder (now libjxl e1)


Independent implementations

Conformance testing: https://libjxl.github.io/bench/


Industry support is growing


Adoption

Browsers:

  • Chrome: was supported behind a flag since April 7th, 2021 (tracking bug)
    • Removal announced on Oct 31, 2022; removed at M110
  • Firefox: initial implementation started end of April 2021 (tracking bug)
    • Mozilla standards position: neutral
  • Safari: supported since version 17 (tracking bug)
  • Browsers with jxl enabled by default: Pale Moon, Basilisk, Waterfox, LibreWolf, Thorium

Image libraries: ImageMagick, libvips, Imlib2, ffmpeg

OS-level / toolkits: All Apple OSes (iOS, macOS, …), GDK-pixbuf, Qt, gThumb

Image authoring: GIMP, Krita, darktable, Adobe Camera Raw, Affinity Photo

Image viewers: XnView, ImageGlass, gwenview, digiKam, KolourPaint, KPhotoAlbum, LXImage-Qt, qimgv, qView, nomacs, VookiImageViewer, PhotoQt, ...


“There is not enough interest from the entire ecosystem to continue experimenting with JPEG XL.”

We want JPEG XL support in all browsers!

We already support JPEG XL!

We will ship JPEG XL support if there is a Rust implementation.

etc etc


Let’s dive deeper

Details for the image compression geeks

JPEG XL coding tools

  • VarDCT
  • Better progressive
  • XYB color space
  • Adaptive quant
  • Patches, splines, noise
  • Restoration filters
  • Better entropy coding
  • Lossless
  • High bit depth
  • Alpha, depth, ...
  • Multi-frame, layers
  • Parallel/cropped decode

  • Variable block sizes:
    • Square: DCT8x8, DCT16x16, DCT32x32, DCT64x64, DCT128x128, DCT256x256
    • Rectangular: DCT8x16, DCT16x8, DCT16x32, DCT32x16, DCT8x32, DCT32x8, DCT32x64, DCT64x32, DCT64x128, DCT128x64, DCT128x256, DCT256x128
    • Subdivided 8x8: DCT4x4, DCT2x2, DCT4x8, DCT8x4
    • “Corner” 8x8 DCT: AFV1 - AFV4
    • “Hornuss” transform (8x8, not DCT)
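All the DCT variants above share the same building block. The sketch below is a plain, unoptimized orthonormal DCT-II applied separably, so it also handles rectangular blocks. It illustrates the transform itself and is not how libjxl implements or scales it; the function names are chosen here for illustration.

```python
import math

def dct_1d(v):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """Separable 2D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

flat = [[100] * 8 for _ in range(8)]      # a perfectly flat 8x8 block
coeffs = dct_2d(flat)
wide = dct_2d([[1] * 16 for _ in range(8)])  # an 8-row by 16-column block

print(round(coeffs[0][0], 6))  # all the energy ends up in the DC coefficient
```

A flat block needs only one nonzero coefficient regardless of block size, which is why large blocks pay off in smooth regions while small blocks are better near fine detail.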

[Figure: block sizes in JPEG, WebP and JPEG XL. JPEG XL also has the giant ones (up to 256x256), not currently used in the encoder.]

JPEG XL coding tools: better progressive

  • DC always first
  • Progressive DC: send DC as (progressive) frame
  • Progressive alpha / extra channels
  • Progressive lossless
  • Group reordering, e.g. middle-out
  • Arbitrary AC coefficient reordering
  • Passes can update coefficients, allowing saliency passes
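As an illustration of group reordering, the sketch below sorts a grid of groups so that the center comes first. It is only a toy ordering by distance from the grid center; the function name is made up and the ordering libjxl actually produces may differ.

```python
def middle_out_order(cols, rows):
    """Order group coordinates by squared distance from the grid center."""
    cx, cy = (cols - 1) / 2, (rows - 1) / 2
    groups = [(c, r) for r in range(rows) for c in range(cols)]
    return sorted(groups, key=lambda g: (g[0] - cx) ** 2 + (g[1] - cy) ** 2)

order = middle_out_order(3, 3)
print(order)
```

A decoder that receives groups in this order can show the middle of the image in full detail before the edges arrive.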

JPEG XL coding tools: XYB color space

  • Human visual system based color space
  • XYB:
    • Y: luma (L+M; black-white)
    • X: difference L-M (red-green)
    • B: S (blue; typically stored as B-Y so blue-yellow)
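The opponent-channel idea behind XYB can be sketched as follows. This assumes the inputs are already the nonlinearly encoded L, M and S cone responses; the conversion from RGB and the nonlinearity are omitted, and the factor 1/2 is an assumption of this sketch rather than something stated on the slide.

```python
def lms_to_xyb(l, m, s):
    """Map cone responses to opponent channels (simplified sketch)."""
    x = (l - m) / 2  # red-green difference
    y = (l + m) / 2  # luma-like sum
    b = s            # blue; often stored as b - y (blue-yellow)
    return x, y, b

def xyb_to_lms(x, y, b):
    """Inverse of lms_to_xyb."""
    return y + x, y - x, b

xyb = lms_to_xyb(0.75, 0.25, 0.5)
print(xyb)
print(xyb_to_lms(*xyb))
```

Because X is usually small and slowly varying, it can be quantized more coarsely than Y without a visible difference.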

JPEG XL coding tools: adaptive quantization

  • Adaptive quantization: adjust quality locally as needed
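A toy illustration of the idea: the same coefficients are quantized with a per-block multiplier, so that a block needing more precision gets a finer step. The function names, the base step and the multipliers are all invented for this sketch; the actual quantization in JPEG XL is more involved.

```python
def quantize_block(coeffs, base_step, multiplier):
    """Quantize one block; a larger multiplier means a finer step."""
    step = base_step / multiplier
    return [round(c / step) for c in coeffs], step

def dequantize_block(quantized, step):
    return [q * step for q in quantized]

coeffs = [9.2, 3.4, -1.2]
q_coarse, step_coarse = quantize_block(coeffs, base_step=4.0, multiplier=1.0)
q_fine, step_fine = quantize_block(coeffs, base_step=4.0, multiplier=2.0)

def max_error(quantized, step):
    restored = dequantize_block(quantized, step)
    return max(abs(a - b) for a, b in zip(coeffs, restored))

print(q_coarse, max_error(q_coarse, step_coarse))
print(q_fine, max_error(q_fine, step_fine))
```

The encoder spends extra bits only in the blocks where the coarse step would be visible.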

JPEG XL coding tools: patches

  • A frame can be a “reference frame” and ‘Patches’ from it can be used in other frames
  • Allows encoding repeating elements only once and reusing them many times
    • E.g. text: store individual letters in a spritesheet reference frame
    • Can use (quantized) lossless compression for crisp letters
  • Very effective for mixed image contents
    • E.g. screenshots: use lossy for photo, lossless patches for text

JPEG XL coding tools: splines

  • Splines can be represented as centripetal Catmull-Rom curves with varying thickness and color, encoded compactly
  • E.g. to encode individual hairs (which doesn’t compress well with DCT)
  • Decoder is defined but (as you might expect) we don’t have an encoder yet that effectively uses this feature

JPEG XL coding tools: noise

  • Synthetic noise generation, intensity-modulated
  • Useful to replicate capture noise/grain or to make surfaces look more realistic
  • cjxl --photon_noise_iso=3200

JPEG XL coding tools: restoration filters

  • Two kinds of smoothing filters to reduce blockiness and DCT artifacts
  • Gabor-like transform: encoder slightly sharpens before encoding, decoder slightly blurs after decoding
  • Adaptive edge-preserving filter with signalled strength

JPEG XL coding tools: better entropy coding

  • JPEG entropy coding: Huffman prefix coding
  • JPEG XL entropy coding:
    • prefix coding option for fast/hardware encode
    • ANS with optional LZ77, configurable hybrid symbol encoding
  • JPEG context model: none
  • JPEG XL context model:
    • Meta-Adaptive context modelling (from FLIF)
    • Context clustering (so MA tree can be a DAG)

JPEG XL coding tools: lossless

  • State-of-the-art lossless
    • Combining the best ideas from FLIF, lossless WebP
    • Weighted self-correcting predictor (plus simpler predictors)
    • Predictor adjustable per context
    • Strong MA context modelling
  • Used for DC, alpha, signalling of block type selection, adaptive quantization field, edge-preserving filter strength, chroma from luma factors, etc.


JPEG XL: Modular mode

  • JPEG XL has two modes: VarDCT and Modular
    • Modular is used inside VarDCT mode to encode DC, quantweights, CfL weights, filter weights, patches, etc
  • Modular combines the best of
    • FLIF / FUIF
    • Lossless Pik
    • Lossless WebP
    • new stuff


  • Meta-adaptive context model; predictor per context

Visualization of predictor choice


Prediction, Entropy coding and Context modeling

  • Prediction: don’t encode absolute pixel values, but subtract a predicted value so the numbers get closer to zero
  • Entropy coding: use shorter encodings for “more likely” numbers (typically small numbers)
    • Huffman prefix codes: e.g. use 0 to represent 0, 101 for -1, 100 for 1, etc
    • ANS: encode histogram first, then do something that can use less than 1 bit per symbol (better approximating the Shannon entropy)
    • lz77 distance codes: keep rolling window of past symbols, if something repeats, encode “take N symbols starting D symbols ago”
  • Context modeling: use different encodings in different contexts
  • Meta-adaptive context modeling: can signal the context model itself
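The first two steps can be seen on a tiny example. The sketch below applies a simple "West" predictor to one row of pixels and compares the Shannon entropy of the raw values with that of the residuals. The row and the function names are made up for illustration; a real codec combines this with context modeling and an actual entropy coder.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(symbols):
    """Shannon entropy of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def west_residuals(row):
    """Each pixel minus its left neighbor; the first pixel is kept as-is."""
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def reconstruct(residuals):
    """Undo west_residuals by accumulating the differences."""
    row = [residuals[0]]
    for r in residuals[1:]:
        row.append(row[-1] + r)
    return row

row = [10, 12, 14, 16, 18, 20, 22, 24]  # a smooth gradient
residuals = west_residuals(row)

print(residuals)
print(entropy_bits_per_symbol(row), entropy_bits_per_symbol(residuals))
```

Every raw value is different, so the row costs 3 bits per symbol, while the residuals are almost all identical and cost well under 1 bit per symbol with an ideal coder.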

Comparison

Codec | Prediction | Entropy coding | Context modeling
TIFF | None | RLE | /
JPEG | Left (DC) or none (AC) | Huffman | /
PNG | can choose per row | Huffman + lz77 | /
WebP | can choose per block | Huffman + 2D-lz77 | per block and per channel
FFV1 | fixed (gradient) | AC | fixed (quantized local differences)
FLIF | can choose per channel | AC | MA trees
JPEG XL | can choose per context (predictor + offset + multiplier) | ANS (or Huffman) + 2D-lz77 | MA trees specify context and predictor; more MA properties

Predictors

[Table: which predictors are available in PNG, Lossless WebP, FFV1, FLIF and JPEG XL. Predictors listed:]

  • Zero (none)
  • Left (W), Top (N)
  • LeftLeft (WW)
  • Avg(Left, Top)
  • Paeth
  • Select
  • Gradient (non-interlaced)
  • TopLeft (NW), TopRight (NE)
  • Avg(Left, TopLeft)
  • Avg(Top, TopLeft)
  • Avg(Top, TopRight)
  • Avg(Left, TopLeft, Top, TopRight)
  • Avg(Top, Avg(Left, TopRight))
  • Weighted sum of W, WW, N, NN, NE, NEE
  • Predictors involving Bottom (interlaced)
  • Self-correcting (Weighted) Predictor
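A few of these predictors written out, as a sketch. Paeth follows the PNG definition; the gradient predictor is assumed here to be the clamped variant (W + N - NW limited to the range of W and N); the averaging uses floor division, which is a simplification. The function names are chosen for this example.

```python
def clamp(v, lo, hi):
    return max(lo, min(hi, v))

def predict_gradient(w, n, nw):
    """Clamped gradient: W + N - NW, limited to the range spanned by W and N."""
    return clamp(w + n - nw, min(w, n), max(w, n))

def predict_paeth(w, n, nw):
    """Paeth (as in PNG): pick the neighbor closest to W + N - NW."""
    p = w + n - nw
    dw, dn, dnw = abs(p - w), abs(p - n), abs(p - nw)
    if dw <= dn and dw <= dnw:
        return w
    if dn <= dnw:
        return n
    return nw

def predict_avg(a, b):
    """Average of two neighbors, rounded down."""
    return (a + b) // 2

print(predict_gradient(10, 20, 15), predict_gradient(10, 20, 5))
print(predict_paeth(10, 20, 15), predict_paeth(10, 20, 5))
print(predict_avg(10, 21))
```

Each predictor takes only already-decoded neighbors as input, so the decoder can compute the same prediction and add the residual back.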

[Figure: visualization of the predictors AvgAll, West, North, NorthWest, NorthEast, Gradient, AvgN+NW and AvgW+NW]

Context properties

[Table: which context properties are available in FFV1, FLIF and JPEG XL. Properties listed:]

  • Previous channel(s) pixel value(s)
  • Previous channel(s) error
  • Absolute value of the above
  • Gradient
  • Left-TopLeft (W-NW) (quantized)
  • TopLeft-Top (NW-N) (quantized)
  • Top-TopRight (N-NE) (quantized)
  • TopTop-Top (NN-N) (quantized, ‘large’ model)
  • LeftLeft-Left (WW-W) (quantized, ‘large’ model)
  • Properties involving Bottom (interlaced mode)
  • Left Gradient error
  • Group index
  • x and y position
  • abs(N), abs(W)
  • Weighted Predictor max-amplitude-error


Minimum bits per symbol (best-case)

  • Huffman prefix codes: 1 bpp is the minimum (assuming 1 symbol per pixel)
  • lz77: (cost of signalling max-length match) / (max-length), ~0.001 bpp in PNG/zlib (for a fully black image)
  • FLIF’s AC: max chance is 4095/4096, which corresponds to ~0.0004 bpp
  • JPEG XL: singleton histogram is asymptotically 0 bpp
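The FLIF figure follows from the ideal code length of a symbol with probability p, which is -log2(p) bits. A short check (the variable names are chosen here for illustration):

```python
import math

def bits_for_probability(p):
    """Ideal code length in bits for a symbol of probability p."""
    return -math.log2(p)

flif_best_case = bits_for_probability(4095 / 4096)
huffman_best_case = 1.0  # a prefix code spends at least one whole bit per symbol

print(flif_best_case)
print(huffman_best_case / flif_best_case)
```

So in the best case the arithmetic coder is a few thousand times cheaper per symbol than a prefix code, and a singleton histogram removes even that remaining cost.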

Header overhead: how many bytes is a 1-pixel image?

Format | white #FFFFFF | black #000000 | gray #7F7F7F | yellow #FFFF00 | transparent #00000000 | semitransparent #1337BABE | Smallest valid file (any size)
PNG | 67 | 67 | 67 | 69 | 68 | 70 | 67
GIF | 35 | 35 | 43 | 35 | 37 | / | 35
JPEG | 160 | 160 | 159 | 288 | / | / | 107
WebP | 38 | 34 | 38 | 36 | 34 | 38 | 26
FLIF | 15 | 14 | 15 | 18 | 14 | 20 | 14
JPEG XL | 19 | 13 | 20 | 20 | 17 | 25 | 12
AVIF | 314 | 307 | 304 | 314 | 500 | 507 | 298
HEIC | 540 | 540 | 575 | 582 | 871 | 950 | 386


Interlude: JPEG XL Art

What is it and how does it work?

JPEG XL, 117 byte jxl file: [image]

AVIF, 117 byte avif file: “Hi there! I am an AVIF image, with the brands avif, mif1, miaf, MA1A, encoded with libavif, and you should handle me as a picture, and my dimensions are …”

JPEG XL Art

  • Challenge:�make a tiny valid bitstream that produces an interesting image
  • Approach:
    • Hand-craft an MA tree using various predictors and offsets
    • Let all residuals be zero
    • Result: singleton histogram so image itself is asymptotically 0 bpp
  • Can add modular transforms to spice things up: reversible color transforms (RCT), Squeeze
  • Can use multiple layers or make animations too
  • Splines can be used
  • Other approaches could be used using other coding tools of jxl (patches, VarDCT mode, …) — hasn’t been done yet
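The "all residuals zero" approach can be mimicked with a toy interpreter. This is a heavily simplified sketch: one channel only, no clamping to a bit depth, no transforms, and only the properties x, y, N, W and the predictors Set, N, W. The nested-tuple tree format and the edge handling are assumptions made for this example, not the actual tool's input format.

```python
def render(tree, width, height):
    """Render an image from a decision tree, with every residual equal to zero."""
    img = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Assumed edge handling: missing neighbors fall back to the other one, then 0.
            w = img[y][x - 1] if x > 0 else (img[y - 1][x] if y > 0 else 0)
            n = img[y - 1][x] if y > 0 else w
            props = {"x": x, "y": y, "N": n, "W": w}
            node = tree
            while node[0] == "if":
                _, prop, threshold, yes, no = node
                node = yes if props[prop] > threshold else no
            _, predictor, offset = node
            base = {"Set": 0, "N": n, "W": w}[predictor]
            img[y][x] = base + offset  # prediction plus offset, residual is zero
    return img

# if y > 0: N + 1, else: Set 0  (a vertical gradient)
tree = ("if", "y", 0, ("leaf", "N", 1), ("leaf", "Set", 0))
image = render(tree, 4, 3)
for line in image:
    print(line)
```

The whole picture is described by the tree alone, which is why such files stay tiny no matter how many pixels they cover.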

JPEG XL, 117 byte jxl file:

Width 1000
Height 1000
RCT 1
if y > 400
  if x > 500
    if WGH > 0
      - AvgN+NW + 2
      - AvgN+NW - 2
    if WGH > 0
      - AvgN+NE + 2
      - AvgN+NE - 2
  if c > 1
    if y > 0
      if N > -10
        - N - 1
        - N + 0
      - Set 300
    if c > 0
      if y > 0
        - N + 1
        - Set -510
      if y > 0
        if y > 220
          if x > 499
            if x > 500
              - NW + 0
              if y > 280
                if N > -100
                  - N - 10
                  - Set 0
                if N > 300
                  - Set 200
                  - N + 10
            - NE + 0
          if x > 500
            - NE + 0
            - NW + 0
        if W > -100
          if W > 300
            - Set 200
            if x > 960
              - W + 10
              if x > 800
                - W - 10
                if x > 628
                  - W + 10
                  if x > 340
                    - W - 10
                    if x > 175
                      - W + 10
                      if x > 0
                        - W - 10
                        - Set 255
          - Set 0

Online JXL Art tool

(end of interlude on JPEG XL Art)

JPEG XL coding tools: high bit depth

  • Up to 32-bit per channel lossless (float or int)
  • Fully ready for HDR & wide gamut
  • Not just for end-user image delivery, also for authoring workflows

JPEG XL coding tools: alpha, depth, ...

  • Extra channels supported natively
    • No need to do e.g. alpha at the container level
    • Can be subsampled (e.g. 1:8 for depth maps)
    • Have their own bit depth
    • Can do cropped or progressive decode with extra channels
  • Alpha (e.g. add alpha to existing JPEG image!)
  • Depth map
  • Thermal
  • Spot colors
  • The K of CMYK
  • Selection masks
  • Unknown but optional (can ignore)
  • Unknown and not optional (need to update decoder)

JPEG XL coding tools: multi-frame, layers

  • Multi-frame: for animation or photo bursts
    • Can subtract previous frame
    • Frame can be smaller than the canvas
    • Blend and dispose modes like in APNG/GIF/AWebP
    • Patches can be used for moving sprites
    • Zero-delay frames are allowed
    • A real video codec will work better though (no real inter prediction)
  • Composite still image: multiple layers, overlaid by the decoder
    • E.g. add lossless text or logo overlay over a lossy photo

JPEG XL coding tools: parallel / cropped decode

  • Everything is stored in independent groups of 256x256 pixels
    • In modular mode, the group size can be adjusted to 128x128, 512x512 or 1024x1024
    • Group border artifacts are avoided: restoration filters (Gaborish, EPF) are still applied across group edges
  • Allows parallel decode for larger images
  • Also allows efficient cropped (ROI) decode
    • E.g. for zooming in on gigapixel images
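To see why independent groups make cropped decode cheap, the sketch below lists which groups overlap a region of interest. The function name and the example coordinates are made up; the default group size of 256 comes from the slide.

```python
def groups_for_roi(x0, y0, width, height, group_size=256):
    """Return (column, row) indices of the groups that overlap a crop rectangle."""
    first_col, last_col = x0 // group_size, (x0 + width - 1) // group_size
    first_row, last_row = y0 // group_size, (y0 + height - 1) // group_size
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]

needed = groups_for_roi(300, 100, 300, 200)
print(needed)
```

Only the listed groups have to be decoded for that crop, and since they are independent they can also be handed to separate threads.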

More info: jpegxl.info, JPEG XL Discord

jon@cloudinary.com

Twitter/Bluesky: @jonsneyers

Thank You.
