1 of 74

Part 2: Differential Privacy in the Context of Information Sharing

Privacy-Aware Sequential Learning.

2 of 74

Motivation: Social Sequential Learning

People make decisions after observing others:

  • Vaccination & public health behavior
  • Consumers reading reviews before purchasing
  • Investors following market actions
  • Workers adopting AI tools based on peers

Public actions transmit information

→ This creates information cascades & herd behavior

3 of 74

Motivation: Privacy Leakage

The public report can reveal sensitive information:

  • A vaccination choice reveals health signals
  • An investment move reveals beliefs or private information
  • A yes/no adoption decision reveals valuation
  • Reporting to a registry reveals sensitive status

→ Public observability = privacy leakage

4 of 74

Key Questions and Insights

  • How do privacy concerns affect sequential learning?
  • Will learning still occur under privacy constraints?
  • What is the speed of learning with privacy?

Surprisingly, privacy considerations can in some cases lead to faster learning.

5 of 74

Sequential learning model

6 of 74

Binary Signal and Information Cascade

Information Cascade: Individuals ignore their private information and imitate prior, public actions (Bikhchandani et al., 1992)

-1

+1
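A minimal sketch of how a cascade forms with binary signals, in the spirit of Bikhchandani et al. (1992). The function name `simulate_cascade`, the parameter `p` (probability that a signal matches the true state), and the tie-breaking rule (follow your own signal) are assumptions for illustration, not taken from the slides.

```python
import random


def simulate_cascade(theta, p, n_agents, rng):
    """Simulate sequential binary actions; return (actions, cascade_start)."""
    # d = (# revealed +1 signals) - (# revealed -1 signals) in the public history
    d = 0
    actions = []
    cascade_start = None
    for n in range(n_agents):
        # the private signal matches the true state theta with probability p
        signal = theta if rng.random() < p else -theta
        if abs(d) >= 2:
            # public evidence outweighs any single signal: imitate the herd
            action = 1 if d > 0 else -1
            if cascade_start is None:
                cascade_start = n
        else:
            # own signal is decisive (ties are broken in favor of the signal),
            # so the action reveals the signal to later agents
            action = signal
            d += signal
        actions.append(action)
    return actions, cascade_start


if __name__ == "__main__":
    actions, start = simulate_cascade(theta=1, p=0.7, n_agents=20, rng=random.Random(0))
    print(actions, start)
```

Once the cascade starts, actions stop carrying information, so a cascade on the wrong action is never corrected in this non-private model.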

7 of 74

Gaussian Signal and Learning Efficiency

 

 

Asymptotic learning is inefficient for Gaussian private signals!

8 of 74

Metric Differential Privacy

Continuous/Gaussian signals

No information flow

 

 

 

No privacy
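For reference, the standard form of metric differential privacy on the real line can be written as follows. The symbols are assumed notation rather than the slide's own: \mathcal{M} is the randomized reporting strategy, s and s' are private signals, a is the reported action, and \varepsilon is the privacy budget.

```latex
\[
\Pr\bigl[\mathcal{M}(s) = a\bigr] \;\le\; e^{\varepsilon\,|s - s'|}\,\Pr\bigl[\mathcal{M}(s') = a\bigr]
\qquad \text{for all signals } s, s' \in \mathbb{R} \text{ and actions } a \in \{-1, +1\}.
\]
```

As \varepsilon \to 0 the report cannot depend on the signal (no information flow); as \varepsilon \to \infty the constraint disappears (no privacy).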

9 of 74

Best Response Under Privacy Constraints

  • Agent n selects a reporting strategy to maximize their expected utility, subject to a fixed privacy budget.

10 of 74

Privacy Impact on Learning

  • Privacy leads to randomized behavior
  • Noisier actions may hurt learning
  • Reduced risk of information cascades

Information traps get less sticky when you introduce “noise”. Sequential learning under privacy can be faster and more efficient!

11 of 74

The Binary Model

12 of 74

The Binary Model

[Figure: action sequence before and after a cascade (+1, -1, -1, -1, -1)]

13 of 74

The Binary Model: Information Traps

 

14 of 74

The Binary Model: Nonmonotonic Pattern

15 of 74

The Binary Model

[Figure axes: information cascade threshold; signal accuracy]

16 of 74

The Gaussian Model

17 of 74

Randomized Response and Global Sensitivity

  • Randomized Response does not depend on the private signal!

It flips the action with a fixed probability, independent of what the signal is
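A minimal sketch of binary randomized response. The flip probability 1 / (1 + e^ε) is the standard choice that satisfies ε-differential privacy for a binary report; the function names are hypothetical. Note that the signal never enters the calculation.

```python
import math
import random


def flip_probability(epsilon):
    """Fixed flip probability for binary randomized response under budget epsilon."""
    return 1.0 / (1.0 + math.exp(epsilon))


def randomized_response(action, epsilon, rng):
    """Report the intended action (+1 or -1), flipped with a signal-independent probability."""
    return -action if rng.random() < flip_probability(epsilon) else action


if __name__ == "__main__":
    rng = random.Random(0)
    print([randomized_response(+1, epsilon=1.0, rng=rng) for _ in range(10)])
```

The ratio between keeping and flipping is exactly e^ε, which is where the privacy guarantee comes from.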

18 of 74

Global Sensitivity

  • Warning: For differentially private sequential learning with binary states and Gaussian signals, asymptotic learning will not occur under a randomized response strategy with any positive privacy budget.

Randomized Response does not depend on the private signal!

[Figure: signal distribution with the decision threshold; action label +1]

19 of 74

Smooth Randomized Response

[Figure: decision thresholds under smooth randomized response]
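A sketch of one way a signal-dependent ("smooth") randomized response can work. The slide's formula did not survive extraction, so the flip probability used here, 0.5 · exp(−ε · |signal − threshold|), is an assumption: it decays with distance from the decision threshold and satisfies the metric privacy bound exp(ε · |s − s'|). All function names are hypothetical.

```python
import math
import random


def smooth_flip_probability(signal, threshold, epsilon):
    """Assumed form: flipping is most likely (1/2) at the threshold and decays away from it."""
    return 0.5 * math.exp(-epsilon * abs(signal - threshold))


def prob_report(action, signal, threshold, epsilon):
    """Probability that the reported action equals `action` given the private signal."""
    intended = 1 if signal > threshold else -1
    u = smooth_flip_probability(signal, threshold, epsilon)
    return 1.0 - u if action == intended else u


def smooth_randomized_response(signal, threshold, epsilon, rng):
    """Sample a report: the intended action, flipped with a signal-dependent probability."""
    intended = 1 if signal > threshold else -1
    flip = rng.random() < smooth_flip_probability(signal, threshold, epsilon)
    return -intended if flip else intended


if __name__ == "__main__":
    rng = random.Random(0)
    for s in (-2.0, -0.1, 0.1, 2.0):
        print(s, prob_report(+1, s, threshold=0.0, epsilon=1.0),
              smooth_randomized_response(s, 0.0, 1.0, rng))
```

Unlike plain randomized response, signals far from the threshold are reported almost truthfully, so less noise is spent where the action is not in doubt.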

20 of 74

The Gaussian Model

21 of 74

Learning Rate

22 of 74

Time to First Correct Action

Finite!

23 of 74

Total Number of Incorrect Actions

Finite!

24 of 74

The Gaussian Model

25 of 74

Mechanism Behind Acceleration

    • False state → more contradictory signals → more noise
    • True state → fewer contradictory signals → less noise
    • Reporting becomes state-asymmetric
    • Asymmetry improves state distinguishability
    • Actions become more informative → faster learning
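The asymmetry in the list above can be illustrated numerically. This sketch assumes signals drawn from N(θ, σ²) with θ ∈ {−1, +1} and an assumed flip probability of 0.5 · exp(−ε · |signal − threshold|); neither the formula nor the function names come from the slides. A threshold of −1.5 stands for a public belief that favors +1.

```python
import math


def smooth_flip_probability(signal, threshold, epsilon):
    """Assumed signal-dependent flip probability, largest near the threshold."""
    return 0.5 * math.exp(-epsilon * abs(signal - threshold))


def expected_flip_probability(theta, threshold, epsilon, sigma=1.0, n_grid=4001):
    """Riemann-sum approximation of E[flip probability] for signal ~ N(theta, sigma^2)."""
    lo = theta - 8.0 * sigma
    step = 16.0 * sigma / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        s = lo + k * step
        density = math.exp(-0.5 * ((s - theta) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += density * smooth_flip_probability(s, threshold, epsilon) * step
    return total


if __name__ == "__main__":
    threshold = -1.5  # public belief favors +1
    print("herd is right (theta=+1):", expected_flip_probability(+1, threshold, 1.0))
    print("herd is wrong (theta=-1):", expected_flip_probability(-1, threshold, 1.0))
```

When the herd is wrong, signals land closer to the threshold and are flipped more often, so the reported actions are distributed differently under the two states, which is what later agents can exploit.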

26 of 74

Order-Optimal Asymptotic Learning with Heterogeneous Privacy Budgets

27 of 74

Order-Optimal Asymptotic Learning with Heterogeneous Privacy Budgets

28 of 74

Learning Rate Bound

29 of 74

Learning Rate Bound

30 of 74

Learning Efficiency

31 of 74

32 of 74

Part 2: Differential Privacy in the Context of Information Sharing

Differentially Private Distributed Estimation and Inference.

33 of 74

A recent challenge in power grids

  • Problems of exposing sensitive information about households' energy consumption
    • health problems
    • patterns of usage
    • security risks (e.g., theft)

Average consumption?

Can we efficiently learn the average consumption while preserving the privacy of the households?
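One simple baseline for this question, sketched under stated assumptions: each household clips its reading to a known range [0, max_kwh] and adds Laplace noise with scale max_kwh / ε before reporting, and the aggregator averages the noisy reports. This is the textbook Laplace mechanism applied locally, not the distributed algorithm from the talk; `max_kwh` and the function names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_report(reading_kwh, max_kwh, epsilon, rng):
    """Clip the reading to [0, max_kwh], then add noise calibrated to that range."""
    clipped = min(max(reading_kwh, 0.0), max_kwh)
    return clipped + laplace_noise(max_kwh / epsilon, rng)


def private_average(readings_kwh, max_kwh, epsilon, rng):
    """Average of the households' noisy reports."""
    reports = [private_report(r, max_kwh, epsilon, rng) for r in readings_kwh]
    return sum(reports) / len(reports)


if __name__ == "__main__":
    rng = random.Random(0)
    readings = [rng.uniform(5.0, 30.0) for _ in range(5000)]
    print("true mean:   ", sum(readings) / len(readings))
    print("private mean:", private_average(readings, max_kwh=50.0, epsilon=1.0, rng=rng))
```

The noise in each report is large, but it averages out across households, so the error of the mean shrinks as the number of households grows.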

34 of 74

Learning about the effectiveness of a treatment

[Figure: four sources labeled "Patient data"]

Privacy regulations regarding patient data

  • HIPAA (USA)
  • GDPR (EU)

Can we efficiently learn if the new treatment is effective while preserving privacy of the patients?

35 of 74

Privacy is important in both contexts!

Average consumption?

1st part of the talk: Distributed estimation and learning of the sufficient statistics of exponential family variables

Papachristou, Marios, and M. Amin Rahimian. "Differentially Private Distributed Estimation and Learning." IISE Transactions, 2024

2nd part of the talk: Non-Bayesian social learning under privacy constraints

Papachristou, Marios, and M. Amin Rahimian. "Differentially Private Distributed Inference." Under review, 2024

[Figure: four sources labeled "Patient data"]

Distributed Estimation

(Non-Bayesian) Social Learning

36 of 74

Some related works in Distributed Estimation

37 of 74

Some related works in Social Learning

38 of 74

Social learning relies on information flow. It offers an interesting context to study privacy.

  • How can individuals exchange information to learn from each other despite their privacy needs and security concerns?
  • How should one limit the information requirement of a non-Bayesian iterative update rule to guarantee privacy and still ensure consensus and asymptotic learning for the agents?

39 of 74

Section 0: Preliminaries

Graph structure, learning task

40 of 74

Graph structure

41 of 74

42 of 74

DP Protections

Signal DP

Network DP

43 of 74

Section 1: Private Distributed Estimation and Learning

Learning the sufficient statistics of exponential family variables

44 of 74

Minimum Variance Unbiased Estimation

45 of 74

Online Learning

46 of 74

Minimum Variance Unbiased Estimation

Algorithm
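The algorithm on this slide did not survive extraction. As a rough illustration only, the sketch below shows generic consensus averaging in which each agent perturbs its private signal once with Laplace noise and then repeatedly averages with its neighbors. The ring network, the weights, and all names are assumptions, not the authors' method.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def ring_weights(n):
    """Doubly stochastic weights on a ring (n >= 3): 1/3 on self and on each neighbor."""
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            weights[i][j % n] += 1.0 / 3.0
    return weights


def private_consensus(signals, sensitivity, epsilon, n_rounds, rng):
    """Return (final estimates, mean of the noisy signals)."""
    n = len(signals)
    weights = ring_weights(n)
    # assumed signal-level protection: noise is added once, before any exchange
    x = [s + laplace_noise(sensitivity / epsilon, rng) for s in signals]
    noisy_mean = sum(x) / n
    for _ in range(n_rounds):
        # each agent replaces its estimate with a weighted average over its neighborhood
        x = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x, noisy_mean


if __name__ == "__main__":
    estimates, noisy_mean = private_consensus(
        [3.0, 5.0, 4.0, 6.0, 2.0, 4.0], sensitivity=1.0, epsilon=1.0,
        n_rounds=200, rng=random.Random(0))
    print(estimates, noisy_mean)
```

Because the weights are doubly stochastic, every agent converges to the average of the noisy signals, which is an unbiased estimate of the true average since the Laplace noise has mean zero.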

 

47 of 74

Minimum Variance Unbiased Estimation

Algorithm

 

48 of 74

Online Learning

Algorithm (Signal DP)

 

49 of 74

Online Learning

Algorithm (Network DP)

 

50 of 74

51 of 74

Section 2: Private Non-Bayesian Distributed Social Learning

Distributed maximum likelihood estimation and online learning

52 of 74

Distributed Maximum Likelihood Estimation

53 of 74

Online Learning

54 of 74

Non-private Distributed MLE Benchmark

 

55 of 74

Non-private Online Learning Benchmark

 

56 of 74

DP Protections

Multiplicative Noise

Caveat

Algorithms become non-deterministic, i.e., they can return the wrong answer with non-zero probability!
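A small sketch of the caveat. Adding noise to a log-likelihood is the same as multiplying the likelihood by a random factor, so the hypothesis with the largest noisy log-likelihood need not be the true maximizer. The Laplace noise with scale sensitivity / ε, the example values, and the function names are assumptions for illustration; no overall privacy accounting is claimed here.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_argmax(log_likelihoods, sensitivity, epsilon, rng):
    """Index of the hypothesis with the largest noise-perturbed log-likelihood."""
    scale = sensitivity / epsilon
    noisy = [ll + laplace_noise(scale, rng) for ll in log_likelihoods]
    return max(range(len(noisy)), key=lambda k: noisy[k])


def error_rate(log_likelihoods, sensitivity, epsilon, n_trials, rng):
    """Fraction of trials in which the noisy maximizer differs from the true one."""
    best = max(range(len(log_likelihoods)), key=lambda k: log_likelihoods[k])
    wrong = sum(noisy_argmax(log_likelihoods, sensitivity, epsilon, rng) != best
                for _ in range(n_trials))
    return wrong / n_trials


if __name__ == "__main__":
    lls = [-10.0, -10.5, -14.0]  # hypothetical log-likelihoods of three hypotheses
    for eps in (0.1, 1.0, 10.0):
        print(eps, error_rate(lls, sensitivity=1.0, epsilon=eps, n_trials=5000,
                              rng=random.Random(0)))
```

A tighter privacy budget means more noise and a higher chance of returning the wrong hypothesis, which is why the aggregation step needs explicit error control.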

 

57 of 74

Private Distributed MLE – Updates within rounds

58 of 74

Aggregations

AM/GM Aggregation

Double Thresholding

59 of 74

AM/GM Aggregation

Theorem (Informal) [Papachristou, R, 2024]

 

Global sensitivity of log-likelihood

Max log-likelihood value

Min abs value of identifiability condition

60 of 74

Double Threshold Aggregation

Theorem (Informal) [Papachristou, R, 2024]

 

Global sensitivity of log-likelihood

Max log-likelihood value

Min abs value of identifiability condition

61 of 74

Private Online Learning Algorithm

62 of 74

Online Learning

Theorem (Informal) [Papachristou, R, 2024]

Max sum of variances of KL divergence

Global sensitivity of log-likelihood

Sum of variances of the number of signals

Min abs value of identifiability condition

 

63 of 74

Differentially Private Distributed Inference
