1 of 15

Variational Bayesian Quantization

Yibo Yang*, Robert Bamler*, Stephan Mandt

(*equal contribution)

University of California, Irvine

International Conference on Machine Learning • June 13, 2020

Our paper was previously titled Variable-bitrate Neural Compression via Bayesian Arithmetic Coding. We changed the title to Variational Bayesian Quantization based on reviewer feedback.

2 of 15

Data is abundant

Latent variable models

[Figure: a latent variable model. A latent vector z = (0.123, −0.987, 0.151, …) is drawn from the prior, z ~ p(z), and mapped through a generative network g to data x ~ p(x|z).]
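To make the generative process concrete, here is a minimal sampling sketch. The Gaussian prior, the tanh decoder g, the dimensions, and the noise level are placeholder assumptions for illustration, not the models used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent variable model: z ~ p(z) = N(0, I), x ~ p(x | z) = N(g(z), 0.1^2 I).
# The decoder g is an arbitrary fixed nonlinearity standing in for a trained
# generative network.
latent_dim, data_dim = 3, 8
W = rng.normal(size=(data_dim, latent_dim))

def g(z):
    """Decoder g: maps a latent vector to the data-space mean."""
    return np.tanh(W @ z)

z = rng.normal(size=latent_dim)              # z ~ p(z)
x = g(z) + 0.1 * rng.normal(size=data_dim)   # x ~ p(x | z)
print("latent z :", np.round(z, 3))
print("sample x :", np.round(x, 3))
```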


3 of 15

Latent variable models + compression

Data Compression: infer a latent representation z* from data x; the receiver reconstructs x’ = g(z*).

Model Compression: compress the latent variables z of a generative model g fit to N observations x.

Example: Bayesian word embedding model [Barkan 2017], fit to word co-occurrence counts:

  counts   queen  woman  girl  boy  man  ...
  queen        0      3     2    0    1
  woman        0      0     7    2    5
  ...
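As an executable caricature of the data-compression panel (infer z*, then reconstruct x’ = g(z*)), the sketch below uses a linear-Gaussian model; this hypothetical stand-in is chosen only because its posterior is available in closed form, whereas the paper's experiments use VAEs and Bayesian word embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear-Gaussian latent variable model: z ~ N(0, I), x = W z + noise.
latent_dim, data_dim, noise_std = 3, 8, 0.1
W = rng.normal(size=(data_dim, latent_dim))

z_true = rng.normal(size=latent_dim)
x = W @ z_true + noise_std * rng.normal(size=data_dim)

# "infer": the exact Gaussian posterior p(z | x) = N(mu, Sigma)
Sigma = np.linalg.inv(np.eye(latent_dim) + W.T @ W / noise_std**2)
mu = Sigma @ W.T @ x / noise_std**2    # z* = posterior mean (to be quantized and transmitted)

x_reconstructed = W @ mu               # x' = g(z*); here g is just the linear map W
print("reconstruction error:", np.linalg.norm(x - x_reconstructed))
```

The diagonal of Sigma is the per-coordinate posterior uncertainty that the talk exploits later.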


4 of 15

Quantizing continuous latent variables

[Figure: quantization of a latent vector, z* = (0.123, −0.987, 0.151, …) → (0.13, −0.98, 0.13, …).]
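The rounding operation sketched above is the naive baseline: snap every coordinate to the same fixed grid, regardless of how uncertain the model is about it. A one-line version (the grid step 0.01 is an arbitrary choice here):

```python
import numpy as np

z_star = np.array([0.123, -0.987, 0.151])
step = 0.01                              # same grid spacing for every coordinate
z_hat = step * np.round(z_star / step)   # -> approximately [0.12, -0.99, 0.15]
print(z_hat, "max error:", np.max(np.abs(z_hat - z_star)))
```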

Current neural compression methods:

  • Optimize a rate-distortion objective end-to-end;
  • Embed quantization into training, approximately:
      • Straight-through estimator [Bengio et al., 2013]
      • Stochastic binarization [Toderici et al., 2016]
      • Soft-to-hard VQ [Agustsson et al., 2017]
      • Adding uniform noise [Ballé et al., 2017] (see the numerical check after this list)

  • Require tailoring the training procedure, or even the generative model itself, to quantization;
  • Most require retraining a new model for each rate-distortion trade-off.
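As a quick numerical check of why the uniform-noise relaxation works as a training-time stand-in for rounding (a generic illustration, not code from the cited papers): for a smooth variable on a fine grid, the rounding error is approximately uniform over half a grid cell in each direction, so replacing rounding with added uniform noise matches its error statistics while remaining differentiable. The grid step and sample size below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
delta = 0.1                                        # quantization step (arbitrary for the demo)
z = rng.normal(size=100_000)

rounding_error = delta * np.round(z / delta) - z                 # error of hard quantization
noise_proxy = delta * rng.uniform(-0.5, 0.5, size=z.shape)       # differentiable proxy [Ballé et al., 2017]

print("rounding error std:", rounding_error.std())   # both close to delta / sqrt(12) ~= 0.029
print("uniform noise std :", noise_proxy.std())
```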


5 of 15

Contributions

A new algorithm, Variational Bayesian Quantization (VBQ), for compressing latent representations in a large class of generative models. VBQ:

  1. Operates completely after training:
    • plug-and-play compression for pre-trained models;
    • works for both data compression and model compression;
    • separates the data modeling (and model training) task from quantization and compression;
  2. Performs variable-bitrate compression with a single model, outperforming JPEG with a single standard variational autoencoder (VAE);
  3. Exploits posterior uncertainty for compression; the only other work we know of that does so is bits-back coding for lossless compression [Wallace, 1990; Hinton and Van Camp, 1993].

What’s the best way to quantize latent variables in a given generative model?


6 of 15

Data Compression: When Probabilities Matter

Don’t transmit what you can predict.

Better generative probabilistic models lead to better compression rates:

minimal bitrate = −log₂ p_model(message)

E[bitrate] ⩾ Crossentropy(data || model)

Classical Example: Arithmetic Coding
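To make "minimal bitrate = −log₂ p_model(message)" concrete, here is a toy arithmetic encoder: it narrows an interval by each symbol's probability and emits just enough bits to identify a point inside the final interval. Exact rational arithmetic keeps the sketch short, unlike the renormalized integer arithmetic of practical coders; the alphabet and probabilities in the demo are made up.

```python
import math
from fractions import Fraction

def arithmetic_encode(message, probs):
    """Toy arithmetic coder: code length lands within ~2 bits of -log2 P(message)."""
    symbols = sorted(probs)
    low, width = Fraction(0), Fraction(1)
    for s in message:
        cum = sum((probs[t] for t in symbols if t < s), Fraction(0))
        low += width * cum                  # narrow the interval to the symbol's slice
        width *= probs[s]
    n_bits = math.ceil(-math.log2(width)) + 1   # enough bits to land inside [low, low + width)
    code = math.ceil(low * 2**n_bits)           # binary expansion of a point in the interval
    return format(code, f"0{n_bits}b")

probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
message = "abcaab"
codeword = arithmetic_encode(message, probs)
ideal = -math.log2(math.prod(probs[s] for s in message))
print(f"codeword {codeword} ({len(codeword)} bits), ideal rate {ideal:.1f} bits")
```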


7 of 15

Lossy Data Compression: When Uncertainty Matters

Don’t transmit what you can predict.

Better generative probabilistic models lead to better compression rates:

minimal bitrate = −log₂ p_model(message)

E[bitrate] ⩾ Crossentropy(data || model)

Don’t transmit what you’re not sure about.

Better estimates of posterior uncertainties allow for more efficient use of bandwidth.
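To make "don't transmit what you're not sure about" precise, here is a standard decomposition (not specific to the paper): if a latent coordinate has Gaussian posterior q(z | x) = N(μ, σ²) and the receiver gets a quantized value ẑ, the posterior-expected squared error splits into an irreducible part and a quantization part,

```latex
\[
  \mathbb{E}_{q(z \mid x)}\!\left[(z-\hat z)^2\right]
    \;=\; \underbrace{\sigma^2}_{\text{irreducible}}
      \;+\; \underbrace{(\mu-\hat z)^2}_{\text{quantization error}},
  \qquad q(z \mid x) = \mathcal{N}(\mu, \sigma^2).
\]
```

Spending more bits on ẑ only shrinks the second term, so coordinates with large posterior variance gain little from precise codes and can be quantized coarsely, freeing bandwidth for coordinates the model is confident about.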


8 of 15

What’s the Population of Rome?

© David Iliff, CC BY-SA 2.5

  • In the year 500 AD: 100,000
  • On April 30, 2018: 2,879,728


9 of 15

Variational Bayesian Quantization

Now: continuous observation

(Reminder: Classical Arithmetic Coding)
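The slide's animation is not reproduced here. As a rough, unofficial sketch of the idea suggested by the titles, combine the two ingredients above: place candidate reconstruction points where arithmetic coding over the prior would put k-bit codewords (prior quantiles of dyadic intervals), and pick, per latent coordinate, the candidate that trades code length against mismatch with that coordinate's posterior. The specific grid, objective, and constants below are our assumptions for illustration; see the paper for the actual VBQ algorithm.

```python
import math
from statistics import NormalDist

PRIOR = NormalDist(0.0, 1.0)   # standard Gaussian prior over one latent coordinate

def vbq_like_quantize(mu, sigma, lam, max_bits=8):
    """Pick a reconstruction point and code length for a coordinate with
    posterior N(mu, sigma^2). Simplified sketch, not the paper's exact method."""
    best = (math.inf, 0.0, 0)                                   # (objective, reconstruction, bits)
    for k in range(max_bits + 1):
        for i in range(2**k):
            z_hat = PRIOR.inv_cdf((i + 0.5) / 2**k)             # k-bit grid point under the prior CDF
            distortion = (z_hat - mu)**2 / (2 * sigma**2)       # posterior neg. log-density (up to a constant)
            objective = distortion + lam * k                    # rate-distortion trade-off; lam is a free knob
            if objective < best[0]:
                best = (objective, z_hat, k)
    return best[1], best[2]

# Confident coordinates get long codes; uncertain ones get short codes.
for sigma in (0.02, 0.5):
    z_hat, bits = vbq_like_quantize(mu=0.8, sigma=sigma, lam=0.5)
    print(f"posterior std {sigma}: send {bits} bits, reconstruct {z_hat:.3f}")
```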


10 of 15

Real Example: Compressing Neural Word Embeddings

[Figure: word-analogy probe “king” − “man” + “woman” ≈ “queen”; compression results for VBQ (proposed) on a Bayesian word embedding model with 10⁷ parameters.]
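One common way to probe how much lossy compression hurts such embeddings is the analogy test shown in the figure. The sketch below is a generic evaluation recipe; the vectors are random placeholders, not trained or compressed embeddings from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder embedding table; in the experiment these would be the (compressed)
# means of the Bayesian word embedding model.
vocab = ["king", "man", "woman", "queen", "girl", "boy"]
emb = {w: rng.normal(size=50) for w in vocab}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

query = emb["king"] - emb["man"] + emb["woman"]   # should land near "queen" for good embeddings
candidates = [w for w in vocab if w not in {"king", "man", "woman"}]
best = max(candidates, key=lambda w: cosine(query, emb[w]))
print("analogy completion:", best)   # meaningless for random vectors; the point is the procedure
```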


11 of 15

Data Compression With Deep Probabilistic Models

Results for Image Compression:


12 of 15

original


13 of 15

JPEG @ 0.24 BPP


14 of 15

ours @ 0.24 BPP


15 of 15

Conclusion

  • Goal: quantizing latent variables in post-processing
  • Solution: new algorithm, Variational Bayesian Quantization (VBQ), for quantizing latent representations in a wide class of latent variable models:
    • takes posterior uncertainty into account;
    • modular, plug-and-play compression of data and model;
    • separates modeling from compression;
    • variable-rate compression with a single model.
  • Consequences
    • model compression: improved lossy compression of Bayesian word embeddings, outperforming all baselines that use uniform quantization;
    • data compression: lossy image compression by quantizing a single Gaussian VAE, outperforming all baselines including JPEG;
    • potentially offers a new way of evaluating/comparing latent variable models.
