1 of 54

Recommender Systems with Fairness Considerations and Strategic Agent Dynamics

Krishna Acharya
PhD Dissertation Defense (March 03, 2026)

Dr. Jacob Abernethy

Dr. Vidya Muthukumar

Dr. Aaron Roth

Dr. Kai Wang

Dr. Juba Ziani (Advisor)

Committee Members

Krishna Acharya

Speaker


Recommender systems are everywhere


Preliminaries: Users, Items, Recommendation Model


Recommender models: A timeline


Thesis Statement


“This dissertation studies three challenges in recommender systems: ensuring fair performance across heterogeneous user populations, characterizing how strategic content producers shape the item catalog, and understanding how the shift to LLM-based semantic recommendation reopens these challenges in a fundamentally new setting”


Overview


P1) User Fairness

    • Oracle Efficient Algorithms for Groupwise Regret [ICLR 24]
    • Improving Minimax Group Fairness in Sequential Recommendation [ECIR 25]

P2) Strategic creators & item catalog evolution

    • Producers Equilibria and Dynamics in Engagement-Driven Recommender Systems [TMLR 25]

P3) LLM-based recommendation

    • GLoSS: Generative Language Models with Semantic Search for Sequential Recommendation [OARS@KDD25]

P4) Conclusion & Future directions


Oracle Efficient Algorithms for Groupwise Regret


Krishna Acharya, Eshwar Ram Arunachaleswaran,

Sampath Kannan, Aaron Roth, Juba Ziani, ICLR 2024

P1) User Fairness


Recap: Online learning

[Figure: prediction rounds t = 1, 2, …, 7]


Online learning with groupwise regret

[Figure: rounds t = 1, …, 7; each user belongs to overlapping groups by Age {Old, Young} and Race {White, Black}]


Prior work: Sublinear regret but computationally intractable



Snapshot of our algorithm


[Diagram: group experts (old, young, white, always-active) each produce a prediction (Pred-young, Pred-white, Pred-agnostic); AdaNormalHedge aggregates them into the final prediction]

Update the internal state of only the active groups' experts

(here: young, white, always-active)
 


Experiments


  • Regret comparison
    • Our algorithm: online ridge regression as group experts + AdaNormalHedge
    • Benchmark: online ridge regression [Azoury&Warmuth’01]

  • Datasets: Census-income, medical-costs

[Plot: regret of Our Algorithm vs. the benchmark on both datasets]


Improving Minimax Group Fairness in Sequential Recommendation


Krishna Acharya, David Wardrope, Timos Korres,

Aleksandr Petrov, Anders Uhrenholt, ECIR 2025

P1) User Fairness


Task: Sequential Recommendation


Given: Sequence of items a user has viewed

Predict: most likely next item.


Model: Self Attentive Sequential recommendation (SASRec)


Transformer model for sequential recommendation

SASRec: Self-Attentive Sequential Recommendation [Kang & McAuley’18]


User Fairness in Recommendation


  • Data & algorithms 🡪 unfair user outcomes
    • Models are highly accurate on aggregate
    • But perform badly on some user segments:
      • Popularity bias
      • Cold-start users

[Plot: item popularity distribution; head vs. tail items]


Group fairness

Users segmented on

    • Demographic features: sex, race, …
    • Functional characteristics: #views, #buys

Equalize metrics across groups.

    • Ideal: reduce the loss of the disadvantaged group.
    • In practice, algorithms can instead inflate the loss of the advantaged group.

  • Problems:
    • Intersectionality: users belong to multiple groups
    • Legally prohibited fields


Minimax group fairness

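In symbols (notation mine, not the slide's): minimax group fairness picks the model whose worst-case group loss is smallest,

```latex
\min_{\theta}\; \max_{g \in \mathcal{G}}\; \mathcal{L}_g(\theta),
\qquad
\mathcal{L}_g(\theta) = \mathbb{E}_{(x,y)\sim P_g}\big[\ell(f_\theta(x),\, y)\big],
```

where \(\mathcal{G}\) is the set of user groups and \(P_g\) is the data distribution of group \(g\).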


Distributionally Robust Optimization (DRO)

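In symbols (again my notation): DRO minimizes the worst-case expected loss over an uncertainty set \(\mathcal{U}(P)\) of distributions around the training distribution \(P\),

```latex
\min_{\theta}\; \sup_{Q \in \mathcal{U}(P)}\; \mathbb{E}_{z \sim Q}\big[\ell(\theta; z)\big].
```

Taking \(\mathcal{U}(P)\) to be the set of per-group distributions recovers minimax group fairness.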


Existing DRO approaches for recommendation



Limitations of group based DRO methods


GroupDRO & Streaming DRO have major limitations:

  1. Need group membership during training
    • Users cannot belong to multiple groups; does not scale to intersecting groups
  2. Performance drop observed under group imbalance


Conditional Value at Risk (CVaR) DRO

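A minimal NumPy sketch of the CVaR objective in its Rockafellar-Uryasev form (the function name and the empirical-quantile shortcut are mine; in training, this term replaces the average loss over the batch):

```python
import numpy as np

def cvar_loss(losses, alpha=0.1):
    """CVaR_alpha of per-example losses via the Rockafellar-Uryasev form:
        min_eta  eta + E[(loss - eta)_+] / alpha.
    For an empirical sample the minimizing eta is the (1 - alpha) quantile,
    so CVaR reduces to the mean of the worst alpha-fraction of losses."""
    losses = np.asarray(losses, dtype=float)
    eta = np.quantile(losses, 1.0 - alpha)
    return eta + np.mean(np.maximum(losses - eta, 0.0)) / alpha
```

Because the objective only looks at the tail of the loss distribution, it needs no group labels at all: the "groups" it protects are implicitly the worst-off alpha-fraction of users.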


Experiments

Normalized Discounted Cumulative Gain

Leave-one-out split
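Under the leave-one-out split there is exactly one held-out relevant item per user, so both metrics have a closed form per user; a sketch (function and argument names are mine):

```python
import math

def hit_and_ndcg_at_k(ranked_item_ids, target_item, k=5):
    """Leave-one-out evaluation: one held-out target per user, so
    HR@k is a hit indicator and NDCG@k reduces to 1 / log2(rank + 1)."""
    topk = list(ranked_item_ids)[:k]
    if target_item not in topk:
        return 0.0, 0.0
    rank = topk.index(target_item) + 1   # 1-indexed position in the top-k
    return 1.0, 1.0 / math.log2(rank + 1)
```

Per-user values are then averaged over all users to get the reported HR@5 and NDCG@5.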


User groups

  1. Ratio of popular items in the user's history
     Gpop = {niche, diverse, popular}
  2. User interaction length
     Gseq = {short, medium, long}

We experiment across thresholds and the resulting group splits.


Single-group setting: DRO is effective, CVaR DRO best


  • CVaR DRO obtains the highest NDCG across all groups & on aggregate
  • Even for highly imbalanced group splits (Gpop1080):
    • CVaR remains best; GroupDRO & Streaming DRO perform similarly to standard training.

[Table: per-group NDCG; standard training shown for comparison]


Multi-group setting: CVaR DRO shines


  1. Group-based (GDRO, SDRO)
    • Highly sensitive to the choice of "atoms" for the DRO loss, which is impossible to know before training.
  2. CVaR DRO
    • Group-agnostic & outperforms on aggregate and on 5/6 groups.

[Tables: popularity-based groups; sequence-length groups]


Takeaways

  1. Standard training (ERM) has poor performance on user segments
  2. Group-based DRO methods:
    • Groups needed upfront; cannot scale to intersecting groups
    • Performance degrades with imbalance
  3. CVaR DRO doesn't suffer from the above; it outperforms groupwise and on aggregate.


Producer equilibria and dynamics in engagement-driven recommender systems


Krishna Acharya, Varun Vangala, Jingyan Wang, Juba Ziani, TMLR 2025

P2) Strategic creators

29 of 54

Content creation game amongst producers

29

Users

Embedding space

How to maximize user engagement?

Producers

Alice

Bob


Modelling producer competition

[Equation: probability of user k seeing producer i's content, defined via relevance scores]


Content serving rules

  • Softmax

  • Top-K Softmax
    • Filter the top-K scores & softmax

  • Greedy
    • Keep only the top score

  • Luce/Linear rule

  • Round-robin/random
    • Independent of producer content

Lower temperature T and smaller top-K make serving greedier.


Result: Producer strategy at Nash eq supported on basis vectors



Structure of Equilibria & Producer specialization

  1. Producer specialization 🡪 catalog diversity
    • Each producer concentrates on a distinct content niche
    • Yields a heterogeneous catalog
  2. No specialization 🡪 catalog collapse
    • Every producer concentrates on the same popular content
    • Yields a homogeneous catalog

Equilibria for serving rules

    • Linear (Luce) rule: specialization
    • Round-robin: no specialization
    • Softmax: depends on temperature and top-K (experiments)


Experiments

  1. User embeddings:
    • ML100K, Amazon [Hou et al. '23]
  2. Producers: best-response dynamics
  3. At equilibrium, measure:
    • Catalog diversity
    • Producer utility
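A toy version of the best-response loop, restricted to basis-vector strategies as the equilibrium result suggests. The softmax serving rule and all names here are illustrative, not the paper's implementation.

```python
import numpy as np

def best_response_dynamics(users, n_producers, T=0.5, iters=50, seed=0):
    """Producer i plays basis vector e_{choice[i]}; its utility is its total
    softmax exposure probability summed over all users. Producers take turns
    best-responding until no one has a profitable deviation (a Nash eq.)."""
    rng = np.random.default_rng(seed)
    n_users, d = users.shape
    choice = rng.integers(d, size=n_producers)   # random initial strategies

    def utility(i, c):
        cols = [c if j == i else choice[j] for j in range(n_producers)]
        rel = users[:, cols]                     # (n_users, n_producers) scores
        p = np.exp(rel / T)
        return (p[:, i] / p.sum(axis=1)).sum()   # i's exposure, summed over users

    for _ in range(iters):
        moved = False
        for i in range(n_producers):
            best = max(range(d), key=lambda c: utility(i, c))
            if utility(i, best) > utility(i, choice[i]) + 1e-12:
                choice[i], moved = best, True
        if not moved:                            # no profitable deviation
            return choice
    return choice
```

Measuring how many distinct basis vectors appear in `choice` gives a crude catalog-diversity statistic at convergence.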


Result: Greedier serving leads to catalog diversity

[Plot: catalog diversity increases as serving gets greedier]


Result: Producer utility increases with greedier serving

[Plots: producer utility under round-robin, linear, full softmax, top-20 and top-10 softmax serving; utility rises as serving gets greedier]


GLoSS: Generative Language Models with Semantic Search for Sequential Recommendation


Krishna Acharya, Aleksandr V. Petrov, Juba Ziani

Presented at OARS@KDD2025

P3) LLM-based recommendation


Task: Sequential Recommendation


Given: Sequence of items a user has viewed

Predict: most likely next item.


Identifier (ID) based sequential recommenders

✅ Pros

    • Fast training:
      • Small models (5-10M parameters)
    • Fast inference:
      • Items stored in a nearest-neighbor (NN) index 🡪 fast relevance scoring

❌ Cons

  1. Learn embeddings for each item
    • Embeddings do not generalize across surfaces
  2. Cold-start
    • Retrain for new items
    • Lower performance for new users


LLM based recommendation


  1. How to incorporate catalog knowledge: in-context (IC), RAG, finetuning?
  2. What generation strategy to use?
  3. How many candidate texts to generate?
  4. How do we ground back to the item catalog?

Generate the next item's title using an LLM.


GLoSS Architecture

  1. Verbalize the user's item history into text using item metadata
  2. Finetune LLaMA-3 with QLoRA
  3. Generate candidate texts for the next likely item
  4. Retrieve closely matching items from the item catalog


Candidate text generation

  1. Deterministic: beam search decoding
  2. Sampling: temperature, top-K softmax ❌
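Deterministic decoding can be illustrated with a minimal beam search over an abstract next-token scorer; this toy stands in for LLM decoding, and the names are mine.

```python
import math

def beam_search(next_token_logprobs, start, beam_width=3, max_len=5, eos="</s>"):
    """Minimal deterministic beam search: keep the beam_width highest
    log-probability partial sequences at each step. `next_token_logprobs`
    maps a sequence (tuple of tokens) to a {token: logprob} dict."""
    beams = [((start,), 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, lp in beams:
            if seq[-1] == eos:
                candidates.append((seq, lp))        # finished beams carry over
                continue
            for tok, tok_lp in next_token_logprobs(seq).items():
                candidates.append((seq + (tok,), lp + tok_lp))
        beams = sorted(candidates, key=lambda x: -x[1])[:beam_width]
    return beams
```

Returning the surviving beams (rather than only the top one) is what supplies multiple candidate texts to ground against the catalog.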


Retrieving the closest matching items

  1. Sparse: keyword-overlap based (TF-IDF, BM25)
  2. Dense/semantic: embedding based (E5, Qwen-embedder)
    • E5-small: 33M params, d=384
    • E5-base: 110M params, d=768
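Grounding works by embedding both the generated candidates and the catalog titles, then ranking items by cosine similarity. A self-contained sketch with a bag-of-words stand-in for the E5 embedder; all names are illustrative.

```python
import numpy as np

def embed(text, vocab):
    """Stand-in embedder (L2-normalized bag-of-words over a fixed vocabulary);
    GLoSS uses dense E5 text embeddings here."""
    v = np.array([float(text.lower().split().count(w)) for w in vocab])
    n = np.linalg.norm(v)
    return v / n if n else v

def ground_to_catalog(generated_texts, catalog_titles, top_n=5):
    """Ground free-form generated candidates back to real catalog items
    by cosine similarity in embedding space (dense/semantic retrieval)."""
    vocab = sorted({w for t in catalog_titles + generated_texts
                    for w in t.lower().split()})
    C = np.stack([embed(t, vocab) for t in catalog_titles])
    Q = np.stack([embed(g, vocab) for g in generated_texts])
    best = (C @ Q.T).max(axis=1)          # best cosine match per catalog item
    order = np.argsort(-best)[:top_n]
    return [catalog_titles[i] for i in order]
```

With a semantic embedder in place of the bag-of-words stand-in, a generated title that paraphrases a catalog item (sharing no keywords) can still retrieve it, which sparse TF-IDF/BM25 matching cannot do.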


Experiments

Metrics:

  • Hit Rate / Recall@5
  • Normalized Discounted Cumulative Gain (NDCG)@5


GLoSS vs ID-based models


  • Outperforms all ID-based baselines on
    • Recall (+52%) & NDCG (+42%)


GLoSS vs LLM-based benchmarks


  • Higher Recall than all LLM-based baselines.
  • Competitive NDCG metrics.


Dense retrieval greatly improves metrics


  • Dense retrieval outperforms sparse in catalog grounding.
    • +12% gains in NDCG
    • +3% gains in Recall


Strong metrics across user interaction lengths

[Plots: Recall and NDCG for short, medium, and long user sequences]


Takeaways

  1. High-quality text generation:
    • 4-bit quantized, LoRA-tuned LLaMA models; paged attention
  2. SOTA:
    • Beats all ID-based benchmarks on R@5, NDCG@5
    • Also outperforms LLM-based models on R@5
  3. Grounding generated text:
    • Semantic search significantly improves ranking and retrieval metrics
  4. Strong metrics across sequence lengths:
    • Short-, medium-, and long-sequence users all obtain high metrics.


Future directions


New risks from economically motivated producers

  1. Shilling attack: a seller introduces fake users to boost its item's visibility
    • LLMs are biased toward the latest tokens 🡪 recency-based shilling
  2. Semantic rewrite: a seller manipulates its item's metadata to increase visibility
    • Hard to detect; rewrites mimic natural descriptions


Publications


Part of this talk:�User Fairness

    • Oracle Efficient Algorithms for Groupwise Regret [ICLR 24]
    • Improving Minimax Group Fairness in Sequential Recommendation [ECIR 25]

Competition & Item Diversity

    • Producers Equilibria and Dynamics in Engagement-Driven Recommender Systems [TMLR 25]

LLM-based recommendation

    • GLoSS: Generative Language Models with Semantic Search for Sequential Recommendation [OARS workshop KDD25]

Not part of this talk:

Algorithmic Fairness

    • Wealth Dynamics Over Generations: Analysis and Interventions [SaTML 23]

Game theory, online learning

    • Last-iterate Convergence for Symmetric, General-sum, 2×2 Games Under The Exponential Weights Dynamic [ALT 26]
    • One Shot Inverse Reinforcement Learning for Stochastic Linear Bandits [UAI 24]

Differential Privacy

    • Personalized Differential Privacy for Ridge Regression [NRL 25]


Thanks to all my co-authors!


Dr. Ashwin Pananjady

Dr. Aaron Roth

Dr. Juba Ziani

Dr. Sampath Kannan

Dr. Eshwar Ram Arunachaleswaran

Dr. Aleksandr V.  Petrov

Lokranjan Lakshmikanthan

Dr. Anders Kirk Uhrenholt

Dr. David Wardrope

Timos Korres

Varun Vangala

Dr. Jingyan Wang

Dr. Franziska Boenisch

Rakshit Naidu

Dr. Vidya Muthukumar

Jim James

Etash Guha

Guanghui Wang


Thank you, committee!

Dr. Jacob Abernethy

Dr. Vidya Muthukumar

Dr. Aaron Roth

Dr. Kai Wang

Dr. Juba Ziani (Advisor)

Committee Members


Questions


P1) User Fairness

    • Oracle Efficient Algorithms for Groupwise Regret [ICLR 24]
    • Improving Minimax Group Fairness in Sequential Recommendation [ECIR 25]

P2) Strategic creators & item catalog evolution

    • Producers Equilibria and Dynamics in Engagement-Driven Recommender Systems [TMLR 25]

P3) LLM-based recommendation

    • GLoSS: Generative Language Models with Semantic Search for Sequential Recommendation [OARS workshop KDD25]�