Variational Bayesian Quantization
Yibo Yang*, Robert Bamler*, Stephan Mandt
(*equal contribution)
University of California, Irvine
International Conference on Machine Learning • June 13, 2020
Our paper was previously titled Variable-bitrate Neural Compression via Bayesian Arithmetic Coding. We changed the title to Variational Bayesian Quantization based on reviewer feedback.
Data is abundant
Latent variable models
[Figure: generative latent variable model — a generator g maps a latent code z ~ p(z) to data x ~ p(x|z); example latent vector z = (0.123, -0.987, 0.151, …)]
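To make the notation concrete, here is a toy instance of such a model; the generator, shapes, and noise level are illustrative assumptions, not the models used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy latent variable model in the slide's notation:
#   z ~ p(z) = N(0, I),   x ~ p(x|z) = N(g(z), 0.1^2 I)
W = rng.normal(size=(3, 5))              # hypothetical generator weights

def g(z):
    return np.tanh(z @ W)                # hypothetical one-layer generator

z = rng.normal(size=3)                   # sample a latent code z ~ p(z)
x = g(z) + 0.1 * rng.normal(size=5)      # sample an observation x ~ p(x|z)
print(z, x)
```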
Latent variable models + compression
[Figure: two uses of latent variable models. Data compression: infer z* from data x, transmit it, and reconstruct x’ = g(z*). Model compression: generative model g with latents z and N observations x (plate notation).]
Example: Bayesian word embedding model [Barkan 2017], fit to word co-occurrence counts:
counts | queen | woman | girl | boy | man | …
queen  |     0 |     3 |     2 |   0 |   1 |
woman  |     0 |     0 |     7 |   2 |   5 |
…      |       |       |       |     |     |
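A minimal sketch of the data-compression use above: infer a latent z* for given data x, transmit a quantized z*, and reconstruct x’ = g(z*). The generator and the inference-by-gradient-descent shortcut below are illustrative assumptions, not the paper's models.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(3, 5))              # hypothetical generator weights

def g(z):
    return np.tanh(z @ W)                # hypothetical generator

x = g(rng.normal(size=3)) + 0.05 * rng.normal(size=5)   # "observed" data

# Infer z* by minimizing squared reconstruction error
# (a crude stand-in for proper posterior inference).
z = np.zeros(3)
for _ in range(500):
    residual = g(z) - x
    grad = 2 * ((1 - np.tanh(z @ W) ** 2) * residual) @ W.T
    z -= 0.05 * grad
z_star = z

x_prime = g(np.round(z_star, 2))         # transmit a (coarsely) quantized z*, reconstruct
print(np.abs(x - x_prime).max())         # reconstruction error after the round trip
```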
Quantizing continuous latent variables
Quantize z* = (0.123, -0.987, 0.151, …)  →  (0.13, -0.98, 0.13, …)
Current neural compression methods:
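A common baseline in this family is to round every latent dimension to the same fixed grid; the sketch below shows such uniform rounding (the grid spacing is a hypothetical choice, stated as context from the field rather than from the slide itself).

```python
import numpy as np

def uniform_quantize(z, grid=0.01):
    """Round every latent dimension to the same fixed grid (hypothetical spacing).

    Every dimension gets the same precision, regardless of how certain or
    uncertain the model is about it."""
    return np.round(np.asarray(z, dtype=float) / grid) * grid

print(uniform_quantize([0.123, -0.987, 0.151]))   # -> [ 0.12 -0.99  0.15]
```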
Contributions
What’s the best way to quantize latent variables in a given generative model?
Our answer: Variational Bayesian Quantization (VBQ), a new algorithm for compressing latent representations in a large class of generative models.
Data Compression: When Probabilities Matter
Don’t transmit what you can predict.
Better generative probabilistic models lead to better compression rates:
minimal bitrate = −log₂ p_model(message)
E[bitrate] ⩾ Crossentropy(data || model)
Classical Example: Arithmetic Coding
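A toy arithmetic coder makes the −log₂ p_model(message) bound concrete: each symbol narrows an interval of [0, 1) in proportion to its model probability, and the final interval width equals p_model(message). This sketch keeps exact fractions instead of emitting bits, so it illustrates the principle rather than a practical codec.

```python
import math
from fractions import Fraction

def arithmetic_encode(message, probs):
    """Narrow the interval [0, 1) symbol by symbol; any number inside the
    final interval identifies the message.  `probs` maps each symbol to its
    (exact Fraction) probability; sub-intervals follow dict order."""
    low, width = Fraction(0), Fraction(1)
    for s in message:
        cum = Fraction(0)
        for sym, p in probs.items():
            if sym == s:
                low, width = low + width * cum, width * p
                break
            cum += p
    return low, width

probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, width = arithmetic_encode("abac", probs)
# The interval width equals p_model(message), so the ideal code length is
# -log2(width) = -log2 p_model(message)  (6 bits for "abac").
print(float(low), float(width), -math.log2(width))
```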
Lossy Data Compression: When Uncertainty Matters
Don’t transmit what you can predict.
Better generative probabilistic models lead to better compression rates:
minimal bitrate = −log₂ p_model(message)
E[bitrate] ⩾ Crossentropy(data || model)
Don’t transmit what you’re not sure about.
Better estimates of posterior uncertainties allow for more efficient use of bandwidth.
[Yang, Bamler & Mandt, Variational Bayesian Quantization, ICML 2020]
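To illustrate the bandwidth claim (my own toy illustration, not the paper's algorithm): give each latent dimension a grid whose spacing scales with its posterior standard deviation, so that well-determined dimensions receive fine grids and many bits, while uncertain dimensions receive coarse grids and few bits.

```python
import numpy as np

def quantize_by_uncertainty(mu, sigma, base_grid=1.0, tol=0.5):
    """Toy rule: refine the grid for a dimension (by halving it) until the
    spacing drops below tol * sigma.  Narrow posteriors therefore get fine
    grids (many bits), wide posteriors get coarse grids (few bits).
    A hypothetical helper for illustration, not the VBQ algorithm itself."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    bits = np.maximum(0, np.ceil(np.log2(base_grid / (tol * sigma)))).astype(int)
    grid = base_grid / 2.0 ** bits
    return np.round(mu / grid) * grid, bits

z_hat, bits = quantize_by_uncertainty(mu=[0.123, -0.987, 0.151],
                                      sigma=[0.004, 0.03, 0.2])
print(z_hat)   # fine values where the posterior is narrow, coarse where it is wide
print(bits)    # bits spent per dimension: most for sigma=0.004, fewest for sigma=0.2
```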
What’s the Population of Rome?
[Photo of Rome © David Iliff, CC BY-SA 2.5]
100,000
2,879,728
Variational Bayesian Quantization
Now: continuous observation
(Reminder: Classical Arithmetic Coding)
[Yang, Bamler & Mandt, Variational Bayesian Quantization, ICML 2020]
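The slides themselves do not spell the algorithm out; below is a rough, single-dimension sketch of how binary code words of varying length can define a non-uniform quantization grid through a prior's CDF. It is an illustration under my own assumptions (candidate grid, rate estimate, Gaussian prior), not necessarily the paper's exact optimization.

```python
import numpy as np
from scipy.stats import norm

def quantize_1d(mu, lam, prior=norm(0, 1), max_bits=12):
    """Pick the binary code word minimizing  (code length) + lam * (error)^2.

    A length-l code word selects one of 2**l dyadic points xi in (0, 1) and
    decodes to prior.ppf(xi), so longer codes reach a finer, prior-shaped grid.
    Toy illustration only; the paper's actual objective and search may differ."""
    best = (np.inf, None, None)
    for l in range(1, max_bits + 1):
        xi = (np.arange(2 ** l) + 0.5) / 2 ** l     # dyadic interval midpoints
        z_hat = prior.ppf(xi)                       # decode through the prior CDF
        cost = l + lam * (z_hat - mu) ** 2          # rate + weighted distortion
        k = int(np.argmin(cost))
        if cost[k] < best[0]:
            best = (float(cost[k]), l, float(z_hat[k]))
    return best   # (cost, bits used, reconstructed value)

print(quantize_1d(mu=0.123, lam=4.0))     # small lam  -> few bits, coarse value
print(quantize_1d(mu=0.123, lam=400.0))   # large lam  -> more bits, finer value
```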
Real Example: Compressing Neural Word Embeddings
[Figure: word-analogy test — "king" − "man" + "woman" = ?  (expected answer: "queen")]
[Yang, Bamler & Mandt, Variational Bayesian Quantization, ICML 2020]
[Plot: compression results for VBQ (proposed) on a Bayesian word embedding model with ~10⁷ parameters]
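The analogy test above is usually scored by a nearest-neighbour search over the (possibly quantized) embedding vectors; a minimal sketch, assuming an `embeddings` dict mapping words to vectors (a hypothetical data structure, not the paper's evaluation code).

```python
import numpy as np

def analogy(embeddings, a, b, c):
    """Return the word whose vector is closest (cosine) to vec(a) - vec(b) + vec(c),
    excluding the three query words themselves."""
    query = embeddings[a] - embeddings[b] + embeddings[c]
    query = query / np.linalg.norm(query)
    best_word, best_sim = None, -np.inf
    for word, vec in embeddings.items():
        if word in (a, b, c):
            continue
        sim = float(vec @ query) / np.linalg.norm(vec)
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word

# e.g. analogy(quantized_embeddings, "king", "man", "woman")  -> ideally "queen"
```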
Data Compression With Deep Probabilistic Models
Results for Image Compression:
[Yang, Bamler & Mandt, Variational Bayesian Quantization, ICML 2020]
[Image: original]
[Image: JPEG @ 0.24 BPP]
[Image: ours @ 0.24 BPP]
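BPP is bits per pixel, the rate at which the two reconstructions above are matched; a small helper (hypothetical names) for computing BPP from a compressed file size, together with PSNR as the usual paired distortion measure.

```python
import numpy as np

def bits_per_pixel(compressed_bytes, height, width):
    """Rate of a compressed image: total bits divided by number of pixels."""
    return 8.0 * compressed_bytes / (height * width)

def psnr(original, reconstruction, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two images of identical shape."""
    err = np.asarray(original, float) - np.asarray(reconstruction, float)
    return 10.0 * np.log10(max_val ** 2 / np.mean(err ** 2))

# e.g. a 768 x 512 image compressed to about 11.8 kB is roughly 0.24 BPP:
print(bits_per_pixel(11_800, 512, 768))   # ~ 0.24
```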
Conclusion
Our paper was previously titled Variable-bitrate Neural Compression via Bayesian Arithmetic Coding. We changed the title to Variational Bayesian Quantization based on reviewer feedback.