Robust data encodings for quantum classifiers
QTML 2019
Ryan LaRose, Brian Coyle
arXiv:work-in-progress
Motivation: Data representation (encoding) is critical
“Science is representation learning by humans. Deep learning is representation learning by machines.”
-- Lex Fridman, Journal of Academic Twitter.
How classical data is encoded in a quantum state is crucial for learning.
There is a tradeoff between robustness and learnability with quantum encodings of classical data.
Outline
Classification problems in machine learning
Input: Feature vectors and labels
Output: “Intelligent” machine that correctly classifies all feature vectors
and can make new (correct) predictions on data it was not trained on.
Quantum classifiers
Input: Feature vectors and labels
Output: “Intelligent” quantum machine that correctly classifies all feature vectors
and can make new (correct) predictions on data it was not trained on.
Common models for quantum classification
Theme:
Model for a (binary) quantum classifier
We consider the model commonly discussed in the literature:
Encoding data in a quantum state
Problem: Given a feature vector, encode it in a quantum state on n qubits.
(1) Complete wavefunction encoding
Takes exp(n) time
(2) QRAM
Takes infinite time (not possible)
(3) “Quantum data”
Denies that the encoding problem exists
Data encodings
Basis encoding for binary data: $x = (x_1, \dots, x_n) \mapsto |x_1 x_2 \cdots x_n\rangle$,
where each $x_i \in \{0, 1\}$.
Data encodings
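Amplitude (wavefunction) encoding for arbitrary data: $x = (x_1, \dots, x_N) \mapsto \frac{1}{\|x\|_2} \sum_{i=1}^{N} x_i |i\rangle$,
where each $x_i \in \mathbb{R}$.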
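The two encodings above can be sketched with NumPy state vectors. This is only an illustration under assumptions: the function names `basis_encode` and `amplitude_encode` are hypothetical, the bit string is read with the first feature as the most significant bit, and the amplitude vector is zero-padded to a power-of-two length.

```python
import numpy as np

def basis_encode(bits):
    """Return the computational basis state |b_1 b_2 ... b_n> as a vector of length 2**n."""
    bits = [int(b) for b in bits]
    if any(b not in (0, 1) for b in bits):
        raise ValueError("basis encoding needs binary features")
    # Read the bit string as a binary number (first feature = most significant bit).
    index = int("".join(str(b) for b in bits), 2) if bits else 0
    state = np.zeros(2 ** len(bits), dtype=complex)
    state[index] = 1.0
    return state

def amplitude_encode(x):
    """Return sum_i x_i |i> / ||x||_2, zero-padded to the next power-of-two dimension."""
    x = np.asarray(x, dtype=complex)
    norm = np.linalg.norm(x)
    if norm == 0:
        raise ValueError("cannot encode the zero vector")
    dim = 1 << int(np.ceil(np.log2(len(x))))
    state = np.zeros(dim, dtype=complex)
    state[: len(x)] = x / norm
    return state
```

Basis encoding uses one qubit per binary feature, while amplitude encoding packs N features into about log2(N) qubits. The cost of preparing the amplitude-encoded state on hardware is not modeled here.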
Data encodings
We can consider parameterizations of features.
Schuld and Killoran (Phys. Rev. Lett. 122, 040504) define the tensor product angle encoding
which encodes one feature per qubit.
Data encodings
We can also define a dense angle encoding
which encodes two features per qubit.
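A minimal sketch of the two angle encodings, assuming the common forms cos(x_i)|0> + sin(x_i)|1> for the tensor product angle encoding and cos(pi a)|0> + exp(2 pi i b) sin(pi a)|1> for each feature pair (a, b) in the dense angle encoding. The scale factors pi and 2 pi and the function names are assumptions for illustration, not taken from the slides.

```python
import numpy as np
from functools import reduce

def angle_encode(x):
    """One feature per qubit: tensor product of cos(x_i)|0> + sin(x_i)|1>."""
    qubits = [np.array([np.cos(xi), np.sin(xi)], dtype=complex) for xi in x]
    return reduce(np.kron, qubits)

def dense_angle_encode(x):
    """Two features per qubit: the second feature of each pair sets a relative phase."""
    if len(x) % 2 != 0:
        raise ValueError("dense angle encoding needs an even number of features")
    qubits = [
        np.array([np.cos(np.pi * a), np.exp(2j * np.pi * b) * np.sin(np.pi * a)])
        for a, b in zip(x[0::2], x[1::2])
    ]
    return reduce(np.kron, qubits)
```

With four features, `angle_encode` returns a 16-dimensional state (four qubits) and `dense_angle_encode` returns a 4-dimensional state (two qubits).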
Data encodings
Such parameterizations can be generalized to any L2 functions:
These functions directly determine the learnable decision boundaries of the model. In particular, the decision boundary can be found from
For a single qubit classifier, this becomes
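To make the link between the encoding functions and the decision boundary concrete, here is a hedged sketch of a single-qubit classifier. It assumes the encoded state is f(x)|0> + g(x)|1>, that an optional unitary is applied before a Z-basis measurement, and that the label is 0 when the probability of outcome 0 is at least 1/2. The particular f and g are the dense-angle-style functions assumed for illustration.

```python
import numpy as np

def encode(x, f, g):
    """Single-qubit state f(x)|0> + g(x)|1>, renormalized defensively."""
    psi = np.array([f(x), g(x)], dtype=complex)
    return psi / np.linalg.norm(psi)

def prob_zero(psi, unitary=None):
    """Probability of measuring 0 after an optional unitary."""
    if unitary is not None:
        psi = unitary @ psi
    return float(abs(psi[0]) ** 2)

def predict(psi, unitary=None):
    """Assumed decision rule: label 0 if P(0) >= 1/2, otherwise label 1."""
    return 0 if prob_zero(psi, unitary) >= 0.5 else 1

# Assumed dense-angle-style encoding functions for two features x = (x1, x2).
def f(x):
    return np.cos(np.pi * x[0])

def g(x):
    return np.exp(2j * np.pi * x[1]) * np.sin(np.pi * x[0])

HADAMARD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
```

With no unitary, the boundary P(0) = 1/2 depends only on x1 (here cos^2(pi x1) = 1/2). With `HADAMARD` applied first, the phase feature x2 also moves points across the boundary, which illustrates how f, g and the circuit together shape what can be learned.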
Learnability of encodings
Single qubit encoding of two features
Dense angle encoding
Wavefunction encoding
Data encodings
From a hardness/advantage perspective, it's a good idea to encode data with circuits that are hard to simulate classically.
Havlíček et al., Supervised learning with quantum-enhanced feature spaces, Nature, vol. 567, pp. 209-212 (2019)
Key properties of data encodings
Robustness definition
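Let $\hat{y}[\rho_x]$ denote the label predicted by the quantum classifier for data point $x$, where $\rho_x$ is the encoded state.
Let $\mathcal{E}$ denote a noise channel.
We say that the classifier is robust to the noise channel $\mathcal{E}$ if and only if $\hat{y}[\mathcal{E}(\rho_x)] = \hat{y}[\rho_x]$.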
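The definition can be checked numerically for a finite set of encoded states. The sketch below assumes a single-qubit classifier that reads the label from the Z-basis probability of outcome 0 and a noise channel given by Kraus operators. The helper names (`predict`, `apply_channel`, `is_robust`) are hypothetical.

```python
import numpy as np

def predict(rho):
    """Assumed decision rule: label 0 if the population of |0> is >= 1/2."""
    return 0 if np.real(rho[0, 0]) >= 0.5 else 1

def apply_channel(kraus_ops, rho):
    """Operator-sum form: rho -> sum_k K_k rho K_k^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

def is_robust(kraus_ops, states):
    """True if the noisy prediction equals the noiseless one for every state."""
    return all(predict(apply_channel(kraus_ops, rho)) == predict(rho) for rho in states)

def bit_flip_kraus(p):
    """Bit-flip channel: apply X with probability p."""
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    return [np.sqrt(1 - p) * np.eye(2, dtype=complex), np.sqrt(p) * X]

# Two encoded states with opposite labels.
states = [np.diag([1.0, 0.0]).astype(complex), np.diag([0.0, 1.0]).astype(complex)]
```

For these two states, `is_robust(bit_flip_kraus(0.2), states)` holds, while a flip probability of 0.8 swaps both labels and breaks robustness.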
Noise in quantum systems
Noise occurs due to interactions between a principal quantum system and its environment.
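Physically, $\mathcal{E}(\rho) = \mathrm{Tr}_{\mathrm{env}}\left[ U (\rho \otimes \rho_{\mathrm{env}}) U^\dagger \right]$.
We often use the equivalent, more convenient operator-sum representation $\mathcal{E}(\rho) = \sum_k K_k \rho K_k^\dagger$,
where $\sum_k K_k^\dagger K_k = I$.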
Common models for noise
Depolarizing noise:
Dephasing noise:
Common models for noise
Pauli noise:
Amplitude damping:
Measurement noise:
where $p(k \mid l)$ is the probability of getting outcome $k$ given input $l$.
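The noise models listed above can be written as Kraus operators. The parameterizations below are common textbook conventions and are assumptions here, since the slide formulas are not available: depolarizing as rho -> (1 - p) rho + p I/2, dephasing as rho -> (1 - p) rho + p Z rho Z, and measurement noise as a column-stochastic confusion matrix acting on outcome probabilities.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_kraus(p_x, p_y, p_z):
    """Pauli channel: apply X, Y, Z with probabilities p_x, p_y, p_z."""
    p_i = max(1 - p_x - p_y - p_z, 0.0)
    return [np.sqrt(p_i) * I2, np.sqrt(p_x) * X, np.sqrt(p_y) * Y, np.sqrt(p_z) * Z]

def depolarizing_kraus(p):
    """rho -> (1 - p) rho + p I/2, i.e. a Pauli channel with p_x = p_y = p_z = p/4."""
    return pauli_kraus(p / 4, p / 4, p / 4)

def dephasing_kraus(p):
    """rho -> (1 - p) rho + p Z rho Z."""
    return pauli_kraus(0.0, 0.0, p)

def amplitude_damping_kraus(p):
    """Decay from |1> to |0> with probability p."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - p)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(p)], [0, 0]], dtype=complex)
    return [K0, K1]

def apply_channel(kraus_ops, rho):
    """Operator-sum form: rho -> sum_k K_k rho K_k^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

def is_trace_preserving(kraus_ops):
    """Check the completeness relation sum_k K_k^dagger K_k = I."""
    total = sum(K.conj().T @ K for K in kraus_ops)
    return np.allclose(total, np.eye(total.shape[0]))

def noisy_outcome_probs(confusion, probs):
    """Measurement noise: confusion[k, l] is the probability of reading k given ideal outcome l."""
    return np.asarray(confusion) @ np.asarray(probs)
```

Each Kraus set satisfies the completeness relation, which `is_trace_preserving` verifies numerically.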
Two regimes
This characterizes (mostly) properties of the model.
This characterizes (mostly) properties of the data encoding.
Robustness to Pauli channels
Result 1: The classifier is robust to Pauli noise
if
Proof:
If , then
If , then
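A numerical sketch of why a condition on the Pauli probabilities is enough, assuming a single-qubit Pauli channel acting just before a Z-basis measurement. X and Y swap the outcomes 0 and 1 while I and Z leave them alone, so with q = p_x + p_y the noisy probability of outcome 0 is (1 - 2q) P(0) + q. Under this assumption the label is preserved whenever q < 1/2. The exact condition stated on the slide is not reproduced here.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_channel(rho, p_x, p_y, p_z):
    """Apply X, Y, Z with probabilities p_x, p_y, p_z and the identity otherwise."""
    p_i = 1 - p_x - p_y - p_z
    return (p_i * rho + p_x * (X @ rho @ X)
            + p_y * (Y @ rho @ Y) + p_z * (Z @ rho @ Z))

def prob_zero(rho):
    return float(np.real(rho[0, 0]))

def predict(rho):
    """Assumed decision rule: label 0 if P(0) >= 1/2."""
    return 0 if prob_zero(rho) >= 0.5 else 1

def random_pure_state(rng):
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

def check_pauli_robustness(p_x, p_y, p_z, trials=500, seed=0):
    """Return (label flips, largest deviation from the formula (1 - 2q) P(0) + q)."""
    rng = np.random.default_rng(seed)
    q = p_x + p_y
    flips, worst = 0, 0.0
    for _ in range(trials):
        rho = random_pure_state(rng)
        noisy = pauli_channel(rho, p_x, p_y, p_z)
        worst = max(worst, abs(prob_zero(noisy) - ((1 - 2 * q) * prob_zero(rho) + q)))
        flips += int(predict(noisy) != predict(rho))
    return flips, worst
```

Calling `check_pauli_robustness(0.2, 0.1, 0.3)` uses q = 0.3 and should report no label flips. Choosing p_x + p_y above 1/2 makes flips appear.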
Robustness to Pauli channels
For the wavefunction encoding
Robustness to Pauli channels
For the dense angle encoding
Robustness to Pauli channels
Corollary 1: Suppose the classification scheme is modified to measure in the X basis, i.e.
Then, the classifier is robust if
Corollary 2: Suppose the classification scheme is modified to measure in the Y basis, i.e.
Then, the classifier is robust if
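The basis dependence in the two corollaries can be explored with Bloch components. A Pauli channel rescales each Bloch component by a factor set by the two Pauli operators that anticommute with the measured one. The sketch below computes these factors numerically. Reading the robustness condition as "the factor stays positive" is an assumption of this illustration, not a statement taken from the slides.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_channel(rho, p_x, p_y, p_z):
    """Apply X, Y, Z with probabilities p_x, p_y, p_z and the identity otherwise."""
    p_i = 1 - p_x - p_y - p_z
    return (p_i * rho + p_x * (X @ rho @ X)
            + p_y * (Y @ rho @ Y) + p_z * (Z @ rho @ Z))

def scaling_factor(pauli, p_x, p_y, p_z):
    """Factor by which the channel rescales the Bloch component along `pauli`."""
    rho = (I2 + pauli) / 2  # +1 eigenstate of the measured Pauli operator
    noisy = pauli_channel(rho, p_x, p_y, p_z)
    return float(np.real(np.trace(noisy @ pauli)))

p_x, p_y, p_z = 0.2, 0.1, 0.3
factors = {
    "X basis": scaling_factor(X, p_x, p_y, p_z),  # equals 1 - 2 (p_y + p_z)
    "Y basis": scaling_factor(Y, p_x, p_y, p_z),  # equals 1 - 2 (p_x + p_z)
    "Z basis": scaling_factor(Z, p_x, p_y, p_z),  # equals 1 - 2 (p_x + p_y)
}
```

A positive factor keeps the sign of the measured Bloch component, so the predicted label does not change. A factor of zero or below erases or flips it.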
Unconditional robustness to dephasing
Result 2: The classifier is unconditionally robust to dephasing noise.
Proof:
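One way to see the dephasing result, sketched under the assumption that dephasing acts as rho -> (1 - p) rho + p Z rho Z immediately before a Z-basis measurement: the channel only shrinks the off-diagonal elements, so the measured populations and therefore the predicted label are untouched for any p.

```python
import numpy as np

Z = np.diag([1.0, -1.0]).astype(complex)

def dephase(rho, p):
    """Assumed dephasing channel: apply Z with probability p."""
    return (1 - p) * rho + p * (Z @ rho @ Z)

def random_pure_state(rng):
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi.conj())

rng = np.random.default_rng(1)
rho = random_pure_state(rng)
noisy = dephase(rho, 0.37)

# Populations (diagonal) are unchanged; coherences are scaled by (1 - 2p).
populations_unchanged = np.allclose(np.diag(noisy), np.diag(rho))
coherence_scaled = np.isclose(noisy[0, 1], (1 - 2 * 0.37) * rho[0, 1])
```

Both checks hold for any state and any p in [0, 1], which matches the "unconditional" wording of Result 2 under the stated assumption.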
Unconditional robustness to depolarizing noise
Result 3: The classifier is unconditionally robust to global depolarizing noise (at any point in the circuit).
General statement:
Intuition for single qubit case:
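With $\Pi_0 = |0\rangle\langle 0|$ and $\mathcal{E}$ the depolarizing channel with probability $p < 1$:
If $\mathrm{Tr}[\Pi_0 \rho] \ge 1/2$, then $\mathrm{Tr}[\Pi_0 \mathcal{E}(\rho)] \ge 1/2$.
If $\mathrm{Tr}[\Pi_0 \rho] < 1/2$, then $\mathrm{Tr}[\Pi_0 \mathcal{E}(\rho)] < 1/2$.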
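The "at any point in the circuit" claim can be illustrated numerically. Assuming global depolarizing noise of the form rho -> (1 - p) rho + p I/d, any later unitary maps I/d to itself, so the probability of outcome 0 on the measured qubit becomes (1 - p) P(0) + p/2 wherever the noise occurs. That value stays on the same side of 1/2 for p < 1. The two-qubit setup and helper names below are hypothetical.

```python
import numpy as np

def global_depolarize(rho, p):
    """Assumed global depolarizing channel on a d-dimensional system."""
    d = rho.shape[0]
    return (1 - p) * rho + p * np.eye(d) / d

def prob_first_qubit_zero(rho):
    """Probability of outcome 0 on the first (most significant) qubit."""
    d = rho.shape[0]
    return float(np.real(np.trace(rho[: d // 2, : d // 2])))

def random_unitary(d, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return q

rng = np.random.default_rng(2)
d, p = 4, 0.6
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi = psi / np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())
U = random_unitary(d, rng)

# Noise before the unitary versus no noise at all.
clean = prob_first_qubit_zero(U @ rho @ U.conj().T)
noisy = prob_first_qubit_zero(U @ global_depolarize(rho, p) @ U.conj().T)
```

Here `noisy` equals `(1 - p) * clean + p / 2`, so `noisy - 1/2` has the same sign as `clean - 1/2`.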
Amplitude damping channel
Result 4: A data encoding
is robust to amplitude damping noise iff
for all feature vectors.
Can this be achieved? i.e., do there exist such functions f and g?
Yes.
Amplitude damping channel
Proof of Result 4: For the noisy classifier,
Suppose
Because and , we certainly have
That is, noisy classification of features labelled 0 is robust.
Amplitude damping channel
Proof of Result 4: For the noisy classifier,
Suppose
We require
Using resolution of the identity we arrive at
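The two cases of the proof can be traced numerically. This sketch makes several assumptions: a single-qubit encoding f(x)|0> + g(x)|1>, amplitude damping with decay probability p acting right before a Z-basis measurement, and no intermediate unitary. Under these assumptions the noisy probability of outcome 0 is |f|^2 + p |g|^2. Points labelled 0 therefore stay labelled 0, and points labelled 1 stay labelled 1 only if |f|^2 < (1 - 2p) / (2 (1 - p)), which follows from |f|^2 + |g|^2 = 1. The threshold is derived here for illustration and may be written differently on the slides.

```python
import numpy as np

def amplitude_damp(rho, p):
    """Amplitude damping with decay probability p (assumed Kraus form)."""
    K0 = np.array([[1, 0], [0, np.sqrt(1 - p)]], dtype=complex)
    K1 = np.array([[0, np.sqrt(p)], [0, 0]], dtype=complex)
    return K0 @ rho @ K0.conj().T + K1 @ rho @ K1.conj().T

def predict(rho):
    """Assumed decision rule: label 0 if P(0) >= 1/2."""
    return 0 if np.real(rho[0, 0]) >= 0.5 else 1

def state_with_f_squared(f_sq):
    """Pure state with |f|^2 = f_sq and real, non-negative amplitudes."""
    psi = np.array([np.sqrt(f_sq), np.sqrt(1 - f_sq)], dtype=complex)
    return np.outer(psi, psi.conj())

def label_one_threshold(p):
    """Largest |f|^2 (exclusive) for which a label-1 point keeps its label."""
    return (1 - 2 * p) / (2 * (1 - p))

p = 0.2
threshold = label_one_threshold(p)           # 0.375 for p = 0.2
safe = state_with_f_squared(0.30)            # below the threshold
unsafe = state_with_f_squared(0.45)          # labelled 1 without noise, above the threshold
```

With p = 0.2, the state with |f|^2 = 0.30 keeps label 1 after damping, while the state with |f|^2 = 0.45 is pushed across the boundary to label 0.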
Amplitude damping channel
Wavefunction encoding with amplitude damping noise
Amplitude damping channel
Dense angle encoding with amplitude damping noise
Amplitude damping channel
What is a robust encoding?
Can we always achieve robustness?
Yes.
Theorem: For any noisy quantum classifier with a trace-preserving quantum operation, there exists an encoding such that the noisy classifications are robust.
Proof: Schauder’s theorem => trace-preserving quantum channels have at least one fixed point. Fixed points => existence of a robust encoding.
Schauder’s theorem (informal): Any continuous map from a convex, compact subset of a Hilbert space into itself has a fixed point.
But at the cost of learnability.
Examples:
Bit-flip channel has fixed points [insert fixed points]
Phase flip channel has fixed points [insert fixed points]
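Fixed points of a channel can be found numerically as eigenvectors with eigenvalue 1 of its superoperator matrix. The sketch below is an illustration with standard single-qubit channels and does not fill in the placeholder on the slide. It assumes the bit-flip channel rho -> (1 - p) rho + p X rho X and the phase-flip channel rho -> (1 - p) rho + p Z rho Z, and uses row-major vectorization so the superoperator is the sum of K kron conj(K).

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def flip_kraus(pauli, p):
    """Kraus operators for 'apply `pauli` with probability p'."""
    return [np.sqrt(1 - p) * I2, np.sqrt(p) * pauli]

def apply_channel(kraus_ops, rho):
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

def superoperator(kraus_ops):
    """Matrix S with vec(E(rho)) = S vec(rho) for row-major vectorization."""
    return sum(np.kron(K, K.conj()) for K in kraus_ops)

def fixed_point_dimension(kraus_ops):
    """Number of eigenvalues of the superoperator equal to 1."""
    eigenvalues = np.linalg.eigvals(superoperator(kraus_ops))
    return int(np.sum(np.isclose(eigenvalues, 1.0)))

def is_fixed(kraus_ops, rho):
    return np.allclose(apply_channel(kraus_ops, rho), rho)

plus = np.full((2, 2), 0.5, dtype=complex)       # |+><+|
zero = np.diag([1.0, 0.0]).astype(complex)       # |0><0|
bit_flip = flip_kraus(X, 0.3)
phase_flip = flip_kraus(Z, 0.3)
```

Under these assumptions `|+><+|` is fixed by the bit-flip channel and `|0><0|` by the phase-flip channel, and each channel has a two-dimensional space of fixed operators. An encoding confined to a single fixed state is robust but carries no information about the data, in line with the point that robustness comes at the cost of learnability.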
Conclusions
Continued work:
Future directions:
Thank you for your attention.