Learning with Errors and PIR
History
4s0 + 3s1 + 2s3 = 3
2s0 + 6s1 + 3s3 = 4
7s0 + 8s1 + 1s3 = 5
…
All are integers (ℤ)
How easy it is to find s0,s1,s3?
It’s easy! We can solve it using Gaussian elimination
Losing some information - Attempt 1
4s0 + 3s1 + 2s3 = 3 (mod 100)
2s0 + 6s1 + 3s3 = 4 (mod 100)
7s0 + 8s1 + 1s3 = 5 (mod 100)��Does this make solving hard ?
How about operating over ring `modulo q`. For ex, if q = 100 then representation of 120 will be 120 mod 100 = 20.
Think of this as chopping off the head to lose some information!
No, gaussian elimination works as well on rings
Losing some information - Attempt 2
4s0 + 3s1 + 2s3 + 1 = 3
2s0 + 6s1 + 3s3 - 1 = 4
7s0 + 8s1 + 1s3 + 1 = 5�
Does this make solving hard?
Let’s try losing information on least significant bits of the equations by adding some noise to them!
Still No. We can solve this using least square minimisation.
Losing some information - Attempt 3
Let’s try adding noise and doing operations over field modulus q.
4s0 + 3s1 + 2s3 + 1 = 3 (mod 100)
2s0 + 6s1 + 3s3 - 1 = 4 (mod 100)
7s0 + 8s1 + 1s3 + 1 = 5 (mod 100)
Is it hard to solve?
Yes! Turns out it’s really hard to solve.
(Search) LWE problem - Noisy modular equations are very hard to solve!
(Search) LWE
Given m samples (b0, a0) ∈ ℤqx ℤnq, (b1, a1), …, (b_m-1, a_m-1) such that
bi = <ai, s> + ei
Where s ∈ ℤnq and ei is sampled randomly from range [-B,..,+B] and B is small.
Finding s is hard!
Using matrices
However, be careful!
Be careful when selecting n, q, B, and m. Otherwise, there are algorithms that can solve for `s` easily.
Just to give a flavor of n,q,B, and m values
For ex, n = 2^10, m = 2^13, q = 2^32, σ = 6.4 is safe.
What’s Sigma?
Usually we replace sampling e from range [-B…+B] with sampling from gaussian distribution with mean (μ) = 0 and standard deviation (σ).
Lattice estimator
https://github.com/malb/lattice-estimator
Ok! Hard problem! So what?
Turns out we can build very useful crypto out of this!
But we need something more
(Decisional) LWE
You cannot distinguish which one is which. If you can then you can solve (Search) LWE, which is supposed for be hard in the first place!
Let’s build Secret key encryption!
Secret-key encryption - Attempt 1
Idea: Let’s represent message m0 as vector of size m where each element is modulus q. To encrypt m0 we set B = As + e + m0.
To encrypt m0: Set ciphertext c = (B = As + e + m0, A)
Would this work?
Secret-key encryption - Attempt 1
To decrypt c = (B, A), given s. We can try
Recall B = As + e + m0
m’ = B - As = m0 + e
We nearly succeeded! If there was a way to eliminate e
Secret-key encryption - Attempt 2
Idea: Error is in least significant bits. How about scaling m0 by some factor before adding it to encryption. And during decryption scale m’ by inverse of the factor?
Why scaling?
By scaling by some factor, we make sure that `m` is in higher bits and does not gets jumbled up with error bits.
Set message m0 as a vector of size m with each element modulo p (not q) where p < q.
To encrypt m0:
set scaling factor Δ as q/p and calculate m1 = Δ m0
Output c = (B = As + e + m1, A)
Why mod p and not q ?
Because you are scaling message by some factor and storing it in MSB
Credits:https://www.zama.ai/post/tfhe-deep-dive-part-1
To decrypt c = (B, A) given s:
Calculate m0 = (B - As) / Δ = (Δm0 + e) / Δ
And notice (m1 + e) / Δ = m0
Hurray! We succeeded. Now you know how Secret key regev encryption works!
Additive homomorphism
Our encryption scheme is additive homomorphic. This means
If c1 is encryption of m1 under secret key s
And
c2 is encryption of m2 under secret key s
Then
c1 + c2 is encryption of m1 + m2 under secret key s
Additive homomorphism
Why so?
Let c1 = (B1 = A1s + e1 + Δm1, A1) AND c2 = (B2 = A2s + e2 + Δm2, A2)
Then c1 + c2 = c3 = ((A1 + A2)s + (e1 + e2) + Δ(m1 + m2), (A1 + A2))
Decrypting c3 given s:
= ((A1 + A2)s + (e1 + e2) + Δ(m1 + m2) - (A1 + A2)s) / Δ
= ((e1 + e2) + Δ(m1 + m2)) / Δ
= m1 + m2
How about adding many ciphertexts?
Would decrypting c = c1 + c2 + … + c10000 will still give result into m = m1 + m2…?
It depends on values of n,q,p,m
For example,
((e1 + e2) + Δ(m1 + m2)) / Δ = m1 + m2 only if |e1 + e2| < ½ q/p.
Credits:https://www.zama.ai/post/tfhe-deep-dive-part-1
Plaintext Inner Product
Let k be a vector of size m with each element modulo p.
kT =
Given ciphertext c encrypting message m0, can I transform c to c’ such that c’ encrypts inner product <k, m1> ?
Plaintext Inner Product
c = (As + e + Δm0, A)
Decrypting c`, given s =
(kT (As) + kTe + kT Δm - kT (A) s) / Δ
= (kTe + kT Δm0) / Δ
= kTm0 (if |kTe| < ½ q/p)
kT c = c` = (kT (As) + kTe + kT Δm, kT (A))
Plaintext Inner Product
c’ =
,
We have all the tools to build Private information retrieval
So let’s build one!
PIR
Let’s define our database as a vector of N rows with each entry modulo p.
And let’s modify it to matrix of dimension √N by √N
dB =
[Fake] PIR
To query a value at specific row r and col c in plaintext
Create a vector hot vector qu of size √N that is 0 at all index, except at index = c.
qu =
[Fake] PIR
Compute: res = dB x qu
res consists of all rows in column c, from which you can retrieve row r
Hurray! FakePIR works. That’s it folks.
Not quite yet! Because we want
PIR
Idea: Build query qu the same way, but encrypt it using our secret key encryption scheme.
Let’s set m = √N and choose n, p, q, σ such that encryption of vector qu is valid
AND
Plaintext inner product with vector of size m results into correct decryption
PIR
Let c_qu = (As + e + Δ qu, A)
Compute: res = dB x c_qu
res = dB c_qu = (dB As + dB e + dB Δqu, dB A)
Decrypt res, given s :
= (db As + dB e + dB Δqu) - (dB A)s) / Δ
= (dB e + dB Δqu) / Δ
= dB qu
Hurray! We get the same result as when qu was in plaintext, but this time with didn’t reveal any information!
Found this cool?
Send me a dm!
Twitter: Janmajaya_mall
Telegram: @janmajayamall
Try out the python tutorial: https://github.com/Janmajayamall/pir-demo