1 of 47

False-name Attacks under Uncertainty: In Combinatorial Auctions, Voting Games, and Collaborative ML

Ph.D. Thesis Committee, Technion, September 2023

Yotam Gafni

Technion, Israel

Advised by:

Moshe Tennenholtz

Ron Lavi

2 of 47

What are False-name Attacks?

  • A false-name (Sybil) attack: a single strategic agent participates in a mechanism under multiple fake identities in order to change the outcome in its favor.

3 of 47

Too ominous to deal with?

Many negative results in different settings:

  • No combinatorial auction is Sybil-proof and Pareto-efficient [Yokoo, Sakurai & Matsubara 2004]
  • A Sybil-proof combinatorial auction with m items is O(1/m) efficient [Iwasaki et al. 2010]
  • A Sybil-proof mechanism for multi-level marketing has o(1) revenue [Emek et al. 2011]
  • Exponentially bad efficiency in resource-sharing games [Mazorra & Della Penna 2023]
  • In false-name-proof voting with m alternatives, even if all voters vote for alternative a, it can win with probability only O(1/m) [Conitzer 2008]

4 of 47

Approaches to Mitigation in Combinatorial Auctions

  • Limited valuation classes
    • With submodular++ valuations, a Sybil-proof mechanism exists (namely, VCG) [Nisan & Ronen 2006]
    • Approximately submodular valuations give approximate efficiency guarantees for VCG [Alkalay-Houlihan & Vetta 2014]
    • For XOS valuations, every pure Nash equilibrium of selling each item via a second-price auction with no over-bidding guarantees at least half of the optimal welfare, and such an equilibrium exists [Christodoulou, Kovacs & Schapira 2016]

(Taken from Michal Feldman’s slides)

5 of 47

A New Approach: Uncertainty

  • All the negative results we saw are in the full-information setting.
  • What happens if the agents face uncertainty?
  • We will see an example where an attack succeeds only under certain circumstances.
  • But to understand the example, we first need to recall combinatorial auctions.

6 of 47

Reminder: VCG & Combinatorial Auctions
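As a refresher (standard definitions): with reported valuations $v_i$ over bundles, VCG chooses the welfare-maximizing allocation and charges each agent the externality it imposes on the others:

$$a^* \in \arg\max_a \sum_i v_i(a_i), \qquad p_i = \max_{a} \sum_{j \ne i} v_j(a_j) \;-\; \sum_{j \ne i} v_j(a^*_j).$$

With single identities, this makes truthful bidding a dominant strategy.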

 

7 of 47

Example: Two Items and Two Bidders

v1({b}) = v1({a}) = 0, v1({a,b}) = 15

v2({b}) = 10, v2({a}) = 10, v2({a,b}) = 10

VCG allocates 𝒂𝟏 = {a,b}, 𝒂𝟐 = {}, and agent 1 (the couple) pays 10.
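As a worked check of the payment: without agent 1, the best the others can get is 10, while under the chosen allocation they get 0, so

$$p_1 = 10 - 0 = 10.$$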


8 of 47

A False-name Attack for the Example

  • Can agent 2 do better with a false-name attack? It adds a Sybil identity, agent 3, and bids:
  • b2({b}) = 0, b2({a}) = b2({a,b}) = 15
  • b3({a}) = 0, b3({b}) = b3({a,b}) = 15
  • VCG now allocates 𝒂𝟏 = {}, 𝒂𝟐 = {𝒂}, 𝒂𝟑 = {𝒃}, and agents 2 and 3 pay nothing
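The payments vanish because, with the Sybil present, removing either identity costs the others nothing:

$$p_2 = \underbrace{\max\{v_1(\{a,b\}),\, b_3(\{a,b\})\}}_{=15} \;-\; \underbrace{b_3(\{b\})}_{=15} = 0,$$

and symmetrically $p_3 = 0$. Agent 2 thus wins both items for free: utility 10 instead of 0.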


9 of 47

Is uncertainty the answer to the example?

  • What we showed is that under false-name attacks, VCG is no longer DSIC
  • But a false-name attack's success depends on the other bidders' valuations
  • What if the attacker is uncertain about the other bidders?

In the left-hand scenario the attacker's utility is −20; in the right-hand scenario it is 10.
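A reconstruction of the missing figure (assuming the left-hand scenario has $v_1(\{a,b\}) = 30$, with ties broken in the Sybils' favor): each Sybil identity then pays $30 - 15 = 15$, so the attacker pays 30 in total for items worth 10 to it, for utility $10 - 30 = -20$. In the right-hand scenario ($v_1(\{a,b\}) = 15$, as above), the Sybils pay nothing and the attacker nets 10.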


10 of 47

This Calls for a Bayesian Analysis… [Gafni, Lavi & Tennenholtz 2020]
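In outline (a hedged reconstruction, based on the conclusion slide): each agent's type is drawn from a commonly known prior, and the question is whether truthful single-identity bidding is a Bayes-Nash equilibrium, i.e., whether for every type $v_i$ and every (possibly Sybil) deviation $\sigma$,

$$\mathbb{E}_{v_{-i}}\big[u_i(\text{truthful}\,;\, v_{-i})\big] \;\ge\; \mathbb{E}_{v_{-i}}\big[u_i(\sigma\,;\, v_{-i})\big].$$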

 

11 of 47

Results…

 

12 of 47

What If There is No Bayesian Prior? [Gafni & Tennenholtz 2023]

Agent 1 \ Agent 2 | Action A | Action B | Action C
Action A          | 4, 8     | 3, 9     | 6, 17
Action B          | 2, 6     | 10, 5    | 9, 5
Action C          | 5, 10    | 3, 7     | 5, 9

13 of 47

Let’s apply ’Safety Level’ to the First-price Auction.

The worst-case performance of any under-bid (including bidding the value exactly) is 0, attained when the bidder does not win the item.

The worst-case performance of any over-bid is negative, attained when the bidder wins the item and pays more than its value.
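Formally (a standard worst-case calculation): a bidder with value $v$ bidding $b$ earns $v - b$ upon winning and 0 otherwise, so its safety level is

$$\min_{\text{others' bids}} u(b) = \begin{cases} 0, & b \le v \quad (\text{worst case: lose}),\\ v - b < 0, & b > v \quad (\text{worst case: win and overpay}). \end{cases}$$

All under-bids thus share the same safety level, so safety alone does not single out a strategy; this motivates the refined notion on the next slide.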

14 of 47

A Refined Safety Notion

 

Agent 1 \ Agent 2 | Action A | Action B | Action C
Action A          | 4, 8     | 3, 9     | 6, 17
Action B          | 2, 6     | 10, 5    | 9, 5
Action C          | 5, 10    | 3, 7     | 5, 9

  • How does it apply to the first-price auction?

15 of 47

Discrete First-price Auction with DSL

16 of 47

Discrete First-price Auction with DSL (Cont.)

17 of 47

Ok… and how about VCG under Sybil Attacks?

 

18 of 47

Exact-bidding Strategies can come in many forms

[Figure: several exact-bidding Sybil profiles representing the same underlying valuation, e.g., a total value of 30 reported by a single identity or split across two or three identities (such as 20 + 10 or 20 + 6 + 4)]

19 of 47

An Attacker Can Benefit From an Exact-Bidding Attack

[Figure: a VCG instance in which an exact-bidding Sybil attack strictly benefits the attacker]

20 of 47

21 of 47

Conclusion

  • We saw two models of VCG under uncertainty:
  • If the attacker knows a distribution over the agents’ types, truthfulness is a BNE in many cases.
  • If the attacker doesn’t know a distribution, it is prudent to follow a DSL strategy.
  • And given DSL attackers, we are still guaranteed optimal welfare!

Future Work?

 

Bids:

  • 3 rooms, $5,000 total
  • 8 rooms, $10,000 total
  • 1 room, $2,000 total

22 of 47

Collaborative ML: A Growing Interest…

23 of 47

Building shared models – a simple use case

  • Two real estate marketplaces want to predict house prices
  • Users randomly go to each, so the joint data should better predict
  • They want to run linear regression of the price vs. square ft
  • They want to keep updating the model as more data is added
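A minimal sketch of this use case, with made-up numbers (not data from the thesis): the two marketplaces pool their (sqft, price) samples and refit a shared least-squares line as new data arrives.

```python
import numpy as np

# Hypothetical (sqft, price) samples from the two marketplaces.
data_1 = np.array([[800, 210_000], [1200, 300_000], [1500, 360_000]])
data_2 = np.array([[900, 240_000], [1100, 275_000], [2000, 470_000]])

def fit_line(points):
    """Least-squares fit of price ~ a * sqft + b on pooled samples."""
    X = np.column_stack([points[:, 0], np.ones(len(points))])
    a, b = np.linalg.lstsq(X, points[:, 1], rcond=None)[0]
    return a, b

pooled = np.vstack([data_1, data_2])
a, b = fit_line(pooled)
print(f"shared model: price ~= {a:.1f} * sqft + {b:,.0f}")

# As more data arrives, the shared model is simply refit on the growing pool.
new_batch = np.array([[1700, 400_000]])
a, b = fit_line(np.vstack([pooled, new_batch]))
```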

24 of 47

Great… Where is the problem?

  • There are works that study strategic data sharing:
    • Disincentives to exert effort in collecting data [Blum et al. ’21]
    • Privacy concerns [Dwork et al. ’06, …]
    • Mutually beneficial information design in competitive markets [Gradwohl & Tennenholtz ’21, ’22]
    • Partitioning into coalitions for shared model building [Donahue & Kleinberg ’21]

Other works study how competition changes the nature of individual learning (without sharing):

    • Designing models to take over a submarket [Ben-Porat & Tennenholtz ’19]
    • Trading off bias and variance when learning under competition [Feng et al. ’22]

We discuss a different security risk, which we call exclusivity attacks:

  • Exclusivity attack: by sending distorted data, a firm can both learn the best model fit and mislead the other firm (learning the best model remains its top priority)
  • This attack is understudied [Shoham & Tennenholtz 2005] [Kantarcioglu & Jiang 2013]

25 of 47

One-shot vs long-term

  • Data collaboration is usually a long-term project with ongoing updates
  • + Risk: an attacker has the option to submit multiple updates and learn more
    • Simple example: [Kantarcioglu & Jiang 2013] show that the average cannot be learned in one shot, but with two samples it can be.
    • Sybil attacks make blocking this hard
  • + Benefit: an attacker must consider the future consequences of corrupting the data

26 of 47

Main Question:

  • Given a communication protocol and a learning algorithm, where all agents report truthfully and accept the model estimate, can an attacker deviate successfully with an exclusivity attack?

Main Results Preview:

  • Continuous Protocol: d-LR is vulnerable, via triangulation attacks (shown later).

  • Periodic Protocol: d-LR and k-Center are not vulnerable.

27 of 47

A successful attack with one-shot data sharing


28 of 47

A failed attack with one-shot data sharing

  • Max
    • Agent 1 has the number 10.
    • Agent 2 has the number 20.
    • If they share truthfully, they both know the max is 20.
    • If agent 1 sends 25 instead…
    • Agent 2 now thinks the max is 25.
    • Can agent 1 know the true max for sure? No: the output is 25 whether agent 2 holds 11 or 24, so agent 1 learns only that agent 2’s number is at most 25.

29 of 47

Formal model – Continuous Protocol

  • In the continuous protocol, the current model estimate is published after every single data update.

30 of 47

Running Example – Continuous Protocol

Revisiting max in the continuous protocol.

Here 90 < y < x.

Agent 1 is strategic. Agent 2 is truthful.

31 of 47

Observed history, Strategy and Vulnerability


 

32 of 47

Observed history, Strategy and Vulnerability (Cont.)

  • We consider deviations from truthfulness, i.e., all other agents act truthfully.

 

 

33 of 47

Linear regression

Challenge intuition: how can the attacker ‘reverse’ the effects of the fake points it has already submitted? (See the toy sketch below.)
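A toy sketch with hypothetical numbers: a single fake point shifts the least-squares fit, and since submitted points cannot be retracted, the attacker can only try to counteract them with further points.

```python
import numpy as np

def fit_line(points):
    """Least-squares fit y ~ a * x + b."""
    X = np.column_stack([points[:, 0], np.ones(len(points))])
    return np.linalg.lstsq(X, points[:, 1], rcond=None)[0]

# Honest data lies roughly on y = x.
true_pts = np.array([[1.0, 1.0], [2.0, 2.1], [3.0, 2.9]])
fake = np.array([[4.0, 10.0]])  # distorted point injected by the attacker

print("fit on true data:   ", fit_line(true_pts))
print("fit with fake point:", fit_line(np.vstack([true_pts, fake])))

# Submitted points cannot be deleted: to 'reverse' the distortion, the
# attacker can only add further points that drag the pooled fit back.
```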

34 of 47

A sneak attack

  • What if the new update, fitted by itself, yields the same line as the last published output?

35 of 47

Triangulation attacks


36 of 47

High-level construction of triangulation for LR

37 of 47

Example of a triangulation attack (1-LR)

[Figure: the 1-LR fit under truth vs. under the triangulation attack]

38 of 47

Example of a triangulation attack (2-LR)

39 of 47

The periodic communication model

  • In the periodic protocol, the model estimate is published only after batches of updates, rather than after every single one.

40 of 47

In the periodic model, LR is not vulnerable

41 of 47

Take-away messages:

  • The choice of protocol matters for security against exclusivity attacks, and security may require sacrificing high availability
  • The number of Sybils an attacker controls matters for the security of the system, and we derive exact bounds
  • Implications for federated learning (obscuring the learned parameters)

42 of 47

Bounds on Power Indices in Weighted Voting Games

“The number of delegates ought not to be in exact proportion to the number of inhabitants, because the influence and power of those states whose delegates are numerous, will be greater [even relative to their proportion] when compared to the influence and power of the other states...”

(The Anti-Federalist Papers)

43 of 47

A model of weighted voting games…

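As a standard formulation (stated here for completeness): a weighted voting game has $n$ players with weights $w_1, \dots, w_n$ and a quota $q$; a coalition $S$ wins ($v(S) = 1$) iff $\sum_{i \in S} w_i \ge q$. The Shapley-Shubik power index of player $i$ is its probability of being pivotal in a uniformly random ordering of the players:

$$\varphi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n-|S|-1)!}{n!}\,\big[v(S \cup \{i\}) - v(S)\big].$$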

44 of 47

Power and proportion may not coincide…

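A standard illustration (not necessarily the slide’s original example): take weights $(50, 49, 1)$ with quota $q = 51$. Players 2 and 3 are pivotal in exactly the same coalitions, so

$$\varphi = \big(\tfrac{2}{3}, \tfrac{1}{6}, \tfrac{1}{6}\big), \qquad \text{while the weight proportions are } (0.50,\, 0.49,\, 0.01):$$

player 3’s power exceeds its weight proportion by a factor of about 16, while player 2’s falls well short of its proportion.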

45 of 47

Power and proportion may not coincide II…


46 of 47

Power and proportion may not coincide III…


47 of 47

Summary of the bounds we prove

We believe (and experiments support this) that the Shapley-Shubik upper bound of 2 holds not only against proportion, but against any type of false-name split.

The implication would be that splitting is not too harmful for the big players (as a whole).