1 of 47

False-name Attacks under Uncertainty: In Combinatorial Auctions, Voting Games, and Collaborative ML

Ph.D. Thesis Committee, Technion, September 2023

Yotam Gafni

Technion, Israel

Advised by:

Moshe Tennenholtz

Ron Lavi

2 of 47

What are False-name Attacks?

  • A false-name (Sybil) attack: a single strategic agent participates in a mechanism under multiple fake identities in order to change the outcome in its favor.

3 of 47

Too ominous to deal with?

Many negative results in different settings:

  • No combinatorial auction is Sybil-proof and Pareto-efficient [Yokoo, Sakurai & Matsubara 2004]
  • A Sybil-proof combinatorial auction with m items is O(1/m) efficient [Iwasaki et al. 2010]
  • A Sybil-proof mechanism for multi-level marketing has o(1) revenue [Emek et al. 2011]
  • Exponentially bad efficiency in resource-sharing games [Mazorra & Della Penna 2023]
  • In false-name-proof voting with m alternatives, even if all voters vote for alternative a, it can win with probability only O(1/m) [Conitzer 2008]

4 of 47

Approaches to Mitigation in Combinatorial Auctions

  • Limited valuation classes
    • With submodular++ valuations, a Sybil-proof mechanism exists (namely, VCG) [Nisan & Ronen 2006]
    • Approximately submodular valuations give approximate efficiency guarantees for VCG [Alkalay-Houlihan & Vetta 2014]
    • For XOS valuations, every pure Nash equilibrium of selling each item via a second-price auction with no over-bidding guarantees at least half of the optimal welfare, and such an equilibrium exists [Christodoulou, Kovacs & Schapira 2016]

(Taken from Michal Feldman’s slides)

5 of 47

A New Approach: Uncertainty

  • All the negative results we saw are in the full-information setting.
  • What happens if the agents face uncertainty?
  • We will see an example where an attack succeeds only under certain circumstances.
  • But to understand the example, we first need to recall combinatorial auctions.

6 of 47

Reminder: VCG & Combinatorial Auctions
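As a refresher (standard definitions): with reported valuations $v_i$ over bundles, VCG chooses the welfare-maximizing allocation and charges each agent the externality it imposes on the others:

$$a^* \in \arg\max_a \sum_i v_i(a_i), \qquad p_i = \max_{a} \sum_{j \ne i} v_j(a_j) \;-\; \sum_{j \ne i} v_j(a^*_j).$$

With single identities, this makes truthful bidding a dominant strategy.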

 

7 of 47

Example: Two Items and Two Bidders

v1({b}) = v1({a}) = 0, v1({a,b}) = 15

v2({b}) = 10, v2({a}) = 10, v2({a,b}) = 10

VCG allocates 𝒂𝟏 = {a,b}, 𝒂𝟐 = {}, and agent 1 (the couple) pays 10.
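As a worked check of the payment: without agent 1, the best the others can get is 10, while under the chosen allocation they get 0, so

$$p_1 = 10 - 0 = 10.$$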


8 of 47

A False-name Attack for the Example

  • Can agent 2 do better with a false-name attack? It adds a Sybil identity, agent 3, and bids:
  • b2({b}) = 0, b2({a}) = b2({a,b}) = 15
  • b3({a}) = 0, b3({b}) = b3({a,b}) = 15
  • VCG now allocates 𝒂𝟏 = {}, 𝒂𝟐 = {𝒂}, 𝒂𝟑 = {𝒃}, and agents 2 and 3 pay nothing
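The payments vanish because, with the Sybil present, removing either identity costs the others nothing:

$$p_2 = \underbrace{\max\{v_1(\{a,b\}),\, b_3(\{a,b\})\}}_{=15} \;-\; \underbrace{b_3(\{b\})}_{=15} = 0,$$

and symmetrically $p_3 = 0$. Agent 2 thus wins both items for free: utility 10 instead of 0.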


9 of 47

Is uncertainty the answer to the example?

  • What we showed is that under false-name attacks, VCG is no longer DSIC
  • But a false-name attack's success depends on the other bidders' valuations
  • What if the attacker is uncertain about the other bidders?

In the left-hand scenario the attacker's utility is −20; in the right-hand scenario it is 10.
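A reconstruction of the missing figure (assuming the left-hand scenario has $v_1(\{a,b\}) = 30$, with ties broken in the Sybils' favor): each Sybil identity then pays $30 - 15 = 15$, so the attacker pays 30 in total for items worth 10 to it, for utility $10 - 30 = -20$. In the right-hand scenario ($v_1(\{a,b\}) = 15$, as above), the Sybils pay nothing and the attacker nets 10.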


10 of 47

This Calls for a Bayesian Analysis… [Gafni, Lavi & Tennenholtz 2020]
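In outline (a hedged reconstruction, based on the conclusion slide): each agent's type is drawn from a commonly known prior, and the question is whether truthful single-identity bidding is a Bayes-Nash equilibrium, i.e., whether for every type $v_i$ and every (possibly Sybil) deviation $\sigma$,

$$\mathbb{E}_{v_{-i}}\big[u_i(\text{truthful}\,;\, v_{-i})\big] \;\ge\; \mathbb{E}_{v_{-i}}\big[u_i(\sigma\,;\, v_{-i})\big].$$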

 

11 of 47

Results…

 

12 of 47

What If There is No Bayesian Prior? [Gafni & Tennenholtz 2023]

Agent 1 \ Agent 2 | Action A | Action B | Action C
Action A          | 4, 8     | 3, 9     | 6, 17
Action B          | 2, 6     | 10, 5    | 9, 5
Action C          | 5, 10    | 3, 7     | 5, 9

13 of 47

Let’s apply ’Safety Level’ to the First-price Auction.

The worst-case performance of any under-bid (including bidding the value exactly) is 0, attained when the bidder does not win the item.

The worst-case performance of any over-bid is negative, attained when the bidder wins the item and pays more than its value.
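Formally (a standard worst-case calculation): a bidder with value $v$ bidding $b$ earns $v - b$ upon winning and 0 otherwise, so its safety level is

$$\min_{\text{others' bids}} u(b) = \begin{cases} 0, & b \le v \quad (\text{worst case: lose}),\\ v - b < 0, & b > v \quad (\text{worst case: win and overpay}). \end{cases}$$

All under-bids thus share the same safety level, so safety alone does not single out a strategy; this motivates the refined notion on the next slide.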

14 of 47

A Refined Safety Notion

 

Agent 1 \ Agent 2 | Action A | Action B | Action C
Action A          | 4, 8     | 3, 9     | 6, 17
Action B          | 2, 6     | 10, 5    | 9, 5
Action C          | 5, 10    | 3, 7     | 5, 9

  • How does it apply to the first-price auction?

15 of 47

Discrete First-price Auction with DSL

16 of 47

Discrete First-price Auction with DSL (Cont.)

17 of 47

Ok… and how about VCG under Sybil Attacks?

 

18 of 47

Exact-bidding Strategies can come in many forms

[Figure: several exact-bidding Sybil profiles representing the same underlying valuation, e.g., a total value of 30 reported by a single identity or split across two or three identities (such as 20 + 10 or 20 + 6 + 4)]

19 of 47

An Attacker Can Benefit From an Exact-Bidding Attack

[Figure: a VCG instance in which an exact-bidding Sybil attack strictly benefits the attacker]

20 of 47

21 of 47

Conclusion

  • We saw two models of VCG under uncertainty:
  • If the attacker knows a distribution over the agents’ types, truthfulness is a BNE in many cases.
  • If the attacker doesn’t know a distribution, it is prudent to follow a DSL strategy.
  • And given DSL attackers, we are still guaranteed optimal welfare!

Future Work?

 

Bids:

  • 3 rooms, $5,000 total
  • 8 rooms, $10,000 total
  • 1 room, $2,000 total

22 of 47

Collaborative ML: A Growing Interest…

23 of 47

Building shared models – a simple use case

  • Two real estate marketplaces want to predict house prices
  • Users randomly go to each, so the joint data should better predict
  • They want to run linear regression of the price vs. square ft
  • They want to keep updating the model as more data is added
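A minimal sketch of this use case, with made-up numbers (not data from the thesis): the two marketplaces pool their (sqft, price) samples and refit a shared least-squares line as new data arrives.

```python
import numpy as np

# Hypothetical (sqft, price) samples from the two marketplaces.
data_1 = np.array([[800, 210_000], [1200, 300_000], [1500, 360_000]])
data_2 = np.array([[900, 240_000], [1100, 275_000], [2000, 470_000]])

def fit_line(points):
    """Least-squares fit of price ~ a * sqft + b on pooled samples."""
    X = np.column_stack([points[:, 0], np.ones(len(points))])
    a, b = np.linalg.lstsq(X, points[:, 1], rcond=None)[0]
    return a, b

pooled = np.vstack([data_1, data_2])
a, b = fit_line(pooled)
print(f"shared model: price ~= {a:.1f} * sqft + {b:,.0f}")

# As more data arrives, the shared model is simply refit on the growing pool.
new_batch = np.array([[1700, 400_000]])
a, b = fit_line(np.vstack([pooled, new_batch]))
```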

24 of 47

Great… Where is the problem?

  • There are works that study strategic data sharing:
    • Disincentives to exert effort in collecting data [Blum et al. ’21]
    • Privacy concerns [Dwork et al. ’06, …]
    • Mutually beneficial information design in competitive markets [Gradwohl & Tennenholtz ’21, ’22]
    • Partitioning into coalitions for shared model building [Donahue & Kleinberg ’21]

Other works study how competition changes the nature of individual learning (without sharing):

    • Designing models to take over a submarket [Ben-Porat & Tennenholtz ’19]
    • Trading off bias and variance when learning under competition [Feng et al. ’22]

We discuss a different security risk, which we call exclusivity attacks:

  • Exclusivity attack: by sending distorted data, a firm can both learn the best model fit and mislead the other firm (learning the best model remains its top priority)
  • This attack is understudied [Shoham & Tennenholtz 2005] [Kantarcioglu & Jiang 2013]

25 of 47

One-shot vs long-term

  • Data collaboration is usually a long-term project with ongoing updates
  • + Risk: an attacker has the option to submit multiple updates and learn more
    • Simple example: [Kantarcioglu & Jiang 2013] show that the average cannot be learned in one shot, but with two samples it can be.
    • Sybil attacks make blocking this hard
  • + Benefit: an attacker must consider the future consequences of corrupting the data

26 of 47

Main Question:

  • Given a communication protocol and a learning algorithm, where all agents report truthfully and accept the model estimate, can an attacker deviate successfully with an exclusivity attack?

Main Results Preview:

  • Continuous Protocol: d-LR is vulnerable, via triangulation attacks (shown later).

  • Periodic Protocol: d-LR and k-Center are not vulnerable.

27 of 47

A successful attack with one-shot data sharing


28 of 47

A failed attack with one-shot data sharing

  • Max
    • Agent 1 has the number 10.
    • Agent 2 has the number 20.
    • If they share truthfully, they both know the max is 20.
    • If agent 1 sends 25 instead…
    • Agent 2 now thinks the max is 25.
    • Can agent 1 know the true max for sure? No: the output is 25 whether agent 2 holds 11 or 24, so agent 1 learns only that agent 2’s number is at most 25.

29 of 47

Formal model – Continuous Protocol

  • In the continuous protocol, the current model estimate is published after every single data update.

30 of 47

Running Example – Continuous Protocol

Revisiting max in the continuous protocol.

Here 90 < y < x.

Agent 1 is strategic. Agent 2 is truthful.

31 of 47

Observed history, Strategy and Vulnerability


 

32 of 47

Observed history, Strategy and Vulnerability (Cont.)

  • We consider deviations from truthfulness, i.e., all other agents act truthfully.

 

 

33 of 47

Linear regression

Challenge intuition: how can the attacker ‘reverse’ the effects of the fake points it has already submitted? (See the toy sketch below.)
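A toy sketch with hypothetical numbers: a single fake point shifts the least-squares fit, and since submitted points cannot be retracted, the attacker can only try to counteract them with further points.

```python
import numpy as np

def fit_line(points):
    """Least-squares fit y ~ a * x + b."""
    X = np.column_stack([points[:, 0], np.ones(len(points))])
    return np.linalg.lstsq(X, points[:, 1], rcond=None)[0]

# Honest data lies roughly on y = x.
true_pts = np.array([[1.0, 1.0], [2.0, 2.1], [3.0, 2.9]])
fake = np.array([[4.0, 10.0]])  # distorted point injected by the attacker

print("fit on true data:   ", fit_line(true_pts))
print("fit with fake point:", fit_line(np.vstack([true_pts, fake])))

# Submitted points cannot be deleted: to 'reverse' the distortion, the
# attacker can only add further points that drag the pooled fit back.
```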

34 of 47

A sneak attack

  • What if the new update, fitted by itself, yields the same line as the last published output?

35 of 47

Triangulation attacks


36 of 47

High-level construction of triangulation for LR

37 of 47

Example of a triangulation attack (1-LR)

[Figure: the 1-LR fit under truth vs. under the triangulation attack]

38 of 47

Example of a triangulation attack (2-LR)

39 of 47

The periodic communication model

  • In the periodic protocol, the model estimate is published only after batches of updates, rather than after every single one.

40 of 47

In the periodic model, LR is not vulnerable

41 of 47

Take-away messages:

  • The choice of protocol matters for security against exclusivity attacks, and security may require sacrificing high availability
  • The number of Sybils an attacker controls matters for the security of the system, and we derive exact bounds
  • Implications for federated learning (obscuring the learned parameters)

42 of 47

Bounds on Power Indices in Weighted Voting Games

“The number of delegates ought not to be in exact proportion to the number of inhabitants, because the influence and power of those states whose delegates are numerous, will be greater [even relative to their proportion] when compared to the influence and power of the other states...”

(The Anti-Federalist Papers)

43 of 47

A model of weighted voting games…

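As a standard formulation (stated here for completeness): a weighted voting game has $n$ players with weights $w_1, \dots, w_n$ and a quota $q$; a coalition $S$ wins ($v(S) = 1$) iff $\sum_{i \in S} w_i \ge q$. The Shapley-Shubik power index of player $i$ is its probability of being pivotal in a uniformly random ordering of the players:

$$\varphi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n-|S|-1)!}{n!}\,\big[v(S \cup \{i\}) - v(S)\big].$$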

44 of 47

Power and proportion may not coincide…

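A standard illustration (not necessarily the slide’s original example): take weights $(50, 49, 1)$ with quota $q = 51$. Players 2 and 3 are pivotal in exactly the same coalitions, so

$$\varphi = \big(\tfrac{2}{3}, \tfrac{1}{6}, \tfrac{1}{6}\big), \qquad \text{while the weight proportions are } (0.50,\, 0.49,\, 0.01):$$

player 3’s power exceeds its weight proportion by a factor of about 16, while player 2’s falls well short of its proportion.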

45 of 47

Power and proportion may not coincide II…


46 of 47

Power and proportion may not coincide III…


47 of 47

Summary of the bounds we prove

We believe (and experiments support this) that the Shapley-Shubik upper bound of 2 holds not only against proportion, but against any type of false-name split.

The implication would be that splitting is not too harmful for the big players (as a whole).