Local Exchangeability
Trevor Campbell Saifuddin Syed Chiao-Yu Yang Michael Jordan Tamara Broderick� University of British Columbia University of California, Berkeley MIT
Exchangeability
The order in which data are observed is often unimportant
aether, black hole, quantum, dark matter, ...
quantum, aether, black hole, dark matter, ...
e.g. document topics
data sequence is exchangeable
iff
de Finetti: data are exchangeable iff they are conditionally i.i.d.
quantum
aether
black hole
dark matter
...
Exchangeability
Rainfall estimation� [Guestrin+ 05]
real data isn’t usually exchangeable due to the availability of covariates
Modelling text over time � [Blei+ 06]
quantum
black�hole
dark matter
luminiferous
aether
1700s
1900s
quantum
black
hole
dark matter
luminiferous �aether
1700s
1900s
Learning control system dynamics �[Deisenroth+ 15]
but intuitively, data with similar covariates are “nearly exchangeable”
This work
A new theory of local exchangeability for covariate-dependent data�
GP regression �[Rasmussen+ 06]
Kernel BP feature allocation � [Ren+ 11]
DDP temporal clustering �[MacEachern 99]
Dynamic topic modelling �[Blei+ 06]
History
“we must take up the case where we still encounter ‘analogies’ among the events under consideration, but without their attaining the limiting case of exchangeability.” [de Finetti 38]
Bruno de Finetti
this is an 80-year-old problem
Setup
stochastic process with index set
pseudometric: distance considers only spatial location�observations at the same location are exchangeable
observation index
observation spatial location covariate
heights of uniformly random people
sequence of data
pseudometric: indices aren’t informative
data are exchangeable
Gaussian process regression
pseudometric captures proximity of covariates
separable, no isolated points
Exchangeability [de Finetti]
can swap indices of all observations
Defn: for any finite set of covariates , permutation
the distribution is invariant to swapping any finite collection of observations
Theorem (sketch) [de Finetti]: There is a unique random distribution s.t.
Partial exchangeability [de Finetti 38, Lauritzen 74, Diaconis+ 78, etc]
can only swap within these groups
Defn: for any finite set of covariates , permutation
as long as
the distribution is invariant to swapping observations with identical covariates
Theorem (sketch) [de Finetti]: There is a unique stochastic process of distributions s.t.
Partial exchangeability [de Finetti 38, Lauritzen 74, Diaconis+ 78, etc]
problem 1: this doesn’t address our problem of “nearby” covariates
��
successfully captures related populations in BNP (e.g. HDP mixture), but:
problem 2: theory requires infinite data in each equivalence class of covariates
problem 3: can be arbitrarily complex, not so useful for modelling
Local exchangeability
Key Idea: add a bit of (total variation) wiggle room to exchangeability�
Defn: is -locally exchangeable if for any finite set of � covariates and permutation
Why total variation?
Others have drawbacks:
or are equivalent (Hellinger)
Is this a generalization?
Yes: generalizes (partial) exchangeability
Let be -locally exchangeable under pseudometric
so
Why:
Does it hold in nontrivial cases?
Yes: if is independently generated from latent measure process
is smooth enough on average
relativity
electron
luminiferous aether
black hole
i.i.d. papers
i.i.d. papers
i.i.d. papers
i.i.d. papers
1700
1800
2000
1900
time
topic�probability
1
e.g. Dynamic Topic Model
0
Proposition: is -locally exchangeable if and
Gaussian process example
e.g.: show stationary GP regression models are locally exchangeable
{
de Finetti representation
Main Theorem (de Finetti representation)� is an -locally exchangeable process iff
2) for any finite set of covariates , permutation
1) it is independently generated from a latent measure process
Regularity
Is the underlying measure for an f-locally exchangeable process always continuous?
No:
relativity
electron
luminiferous aether
black hole
1700
1800
2000
1900
time
topic�probability
1
e.g. Dynamic Topic Model
0
...but this is OK�many BNP processes are intuitively locally exchangeable�but aren’t sample-continuous
Regularity
Is the underlying measure for an f-locally exchangeable process always continuous?
This is f-locally exchangeable with
Euclidean distance and
0
1
No:
Are there cases with more regularity?
Yes: the faster decays as , the smoother the process
Theorem: Suppose as . Then:
is constant�is exchangeable
is weak-continuous & weak-sense stationary�is stationary
(can’t say anything)
Sufficiency
If is exchangeable, then the indices contain no information about
empirical measure
bounded function
is independent of given
(“ is sufficient”)
Approximate sufficiency & binned data
Theorem: Suppose we partition into bins and form the binned observations . Then
we often don’t observe exact indices, but rather “binned” / “coarse” versions
Local Exch: the binned empirical measures are “approximately sufficient” for
Insight: equivalent to forming empirical measures in each bin
see the paper for a more powerful sufficiency result
Conclusion
A new theory of local exchangeability for covariate-dependent data:�
http://arxiv.org/abs/1906.09507
Proof sketch
Classical de Finetti [see Aldous 85]
Local de Finetti
select a dense countable subset of
do essentially the same thing�
Conclusion
A new theory of local exchangeability for covariate-dependent data:�
http://arxiv.org/abs/1906.09507