2 of 19

Love

Christmas

Hype

Country

Artist/Genre/Style (25%)

Cunningham, Sally Jo; Bainbridge, David; Falconer, Annette. "More of an art than a science": Supporting the creation of playlists and mixes. ISMIR 2006

Event or Activity (25%)

Romance (19%)

Mood (16%)

Other (2.6%)

3 of 19

Playlist names are not necessarily bound to a category

Total freedom of expression for the user
How to automatically handle them? (e.g. cold-start recommendation)

“Housewarming Party”

“Country summer”

“Spring awakening”

(real playlist names from the Million Playlist Dataset)

“Marina and the diamonds”

“Dancing in the kitchen”

4 of 19

What are the best tracks for this playlist title? 🍋

https://www.playlistnameai.com/ideas/happy-playlist-names

5 of 19

Problem statement

Goal: Generate a playlist using only its title
Impact:

Cold-start recommendation
Automatic playlist generation

Challenges:

Decontextualized natural language
No dependency on a set lexicon
High variability in playlist titles

Million Playlist Dataset

Initially released by Spotify for the 2018 RecSys Challenge
1 million playlists
2+ million tracks

Ching-Wei Chen, Paul Lamere, Markus Schedl, and Hamed Zamani. RecSys Challenge 2018: Automatic Music Playlist Continuation. RecSys 2018

6 of 19

Related work

Text & NLP Approaches

Playlist creation & categories (Cunningham et al., 2006)
Classification with topic models (Fields et al., 2010)
Songs as token sequences (McFee & Lanckriet, 2011)
Emotion-based playlist recs (Nair et al., 2021)
Metadata & knowledge graphs (Gabbolini & Bridge, 2023)

Large Language Models in Music Recs

Playlist title generation (Doh et al., 2021, Kim et al., 2023)
Text2Playlist (Delcluze et al., 2025)
Text2Tracks (Palumbo, 2025) & TALKPLAY (Doh et al., 2025)

Gap: Limited focus on playlist titles as input

Playlist Titles in Recommender Systems

RecSys Challenge 2018: cold-start with titles

Monti et al., 2018 – RNN embeddings
Faggioli et al., 2018 – similarity matrices
Kim et al., 2018 – LSTM with n-grams

7 of 19

Approach

#clustering #fine-tuning #language-models #semantic-similarity

Phase 1

Fine-Tune a Language Model

Phase 2

Predict with Semantic Similarity

8 of 19

Main idea

words occurring often together in sentences → close in the embedding space

titles occurring often together in playlists → close in the embedding space

9 of 19

Phase 1: Fine-tune a Language Model

Sentence-BERT embeddings
Semantic clustering with K-Means, k = 15
Remove miscellaneous clusters
Fine-tuning all-MiniLM-L6-v2

Cross-entropy loss (best), Triplet loss
15 epochs (early stopping), learning rate 2e-5
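
A minimal sketch of how cluster assignments could be turned into fine-tuning data, assuming each title already has a K-Means cluster id. The helper names `drop_misc` and `make_triplets`, the toy titles, and the choice of which cluster counts as miscellaneous are illustrative assumptions, not the authors' code. The filtered (title, cluster) pairs could feed a classification objective with cross-entropy loss, while the triplets could feed a triplet loss.

```python
import random
from collections import defaultdict

def drop_misc(titles, labels, misc_clusters):
    """Keep only (title, cluster) pairs whose cluster is not flagged as miscellaneous."""
    return [(t, c) for t, c in zip(titles, labels) if c not in misc_clusters]

def make_triplets(pairs, seed=0):
    """Build (anchor, positive, negative) title triplets.

    The positive shares the anchor's cluster; the negative comes from
    any other cluster. Anchors without a valid positive are skipped.
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for title, cluster in pairs:
        by_cluster[cluster].append(title)
    triplets = []
    for title, cluster in pairs:
        positives = [t for t in by_cluster[cluster] if t != title]
        negatives = [t for c, ts in by_cluster.items() if c != cluster for t in ts]
        if positives and negatives:
            triplets.append((title, rng.choice(positives), rng.choice(negatives)))
    return triplets

# Toy data: cluster 2 plays the role of a "miscellaneous" cluster (assumption).
titles = ["xmas songs", "christmas", "gym", "workout", "asdf"]
labels = [0, 0, 1, 1, 2]
pairs = drop_misc(titles, labels, misc_clusters={2})
triplets = make_triplets(pairs)
```

In a real pipeline the labels would come from clustering Sentence-BERT embeddings of the MPD titles rather than from a hand-written list.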

MPD

Pretrained SBERT

Playlist embeddings

Clustering

Fine-tuning

Fine-tuned language model

10 of 19

Phase 2: Predict with Semantic Similarity

Relevant playlists retrieved using nearest-neighbor search from Gensim
Voting system: Count track frequency in top-50 neighbors, return the most frequent tracks
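
The retrieval and voting steps can be sketched as follows. This is a brute-force illustration in plain Python, not the Gensim-based implementation mentioned above; the function names, the toy two-dimensional embeddings, and the rule that each neighbor playlist casts at most one vote per track are assumptions.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend(query_vec, playlists, k=50, n=10):
    """Return the n most voted tracks among the k playlists closest to query_vec.

    playlists is a list of (title_embedding, track_ids) tuples.
    """
    ranked = sorted(playlists, key=lambda p: cosine(query_vec, p[0]), reverse=True)
    votes = Counter()
    for _, tracks in ranked[:k]:
        votes.update(set(tracks))  # assumption: one vote per playlist per track
    return [track for track, _ in votes.most_common(n)]

# Toy corpus: the first two playlists are close to the query, the third is not.
corpus = [
    ([1.0, 0.0], ["a", "b"]),
    ([0.9, 0.1], ["a", "c"]),
    ([0.0, 1.0], ["z"]),
]
top = recommend([1.0, 0.05], corpus, k=2, n=2)
```

With one million playlists, an indexed nearest-neighbor search would be preferable to sorting the whole corpus for every query.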

Fine-tuned language model

New Playlist Title

MPD

Embedding representation

K most relevant playlists

Ranking based on cosine similarity

N recommended tracks

Voting mechanism

11 of 19

Evaluation

#quantitative #qualitative

12 of 19

Quantitative evaluation

Method	R-Precision	NDCG
Monti et al. (Only title)	0.0837	0.1260
Faggioli et al. (Only title)	0.1093	0.2451
Kim et al. (Only title)	0.0760	0.1866
Pre-trained	0.1570	0.2731
Fine-tuned (cross-entropy)	0.1556	0.2825
Fine-tuned (triplet loss)	0.1285	0.2297
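
For readers unfamiliar with the two metrics in the table, here is a textbook sketch with binary relevance. The official RecSys Challenge 2018 definitions may differ in details, so these functions are an assumption-laden illustration and will not necessarily reproduce the numbers above. Recommendations are assumed to contain no duplicates.

```python
import math

def r_precision(recommended, relevant):
    """Fraction of the first R recommendations that are relevant, with R = len(relevant)."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    top_r = recommended[:len(relevant)]
    return len(set(top_r) & relevant) / len(relevant)

def ndcg(recommended, relevant):
    """Binary-relevance NDCG: DCG of the ranking divided by the DCG of an ideal ranking."""
    relevant = set(relevant)
    dcg = sum(1.0 / math.log2(i + 2)
              for i, track in enumerate(recommended) if track in relevant)
    ideal_hits = min(len(relevant), len(recommended))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0

recs = ["t1", "t9", "t2", "t8"]   # ranked recommendations (toy data)
truth = ["t1", "t2"]              # held-out tracks of the playlist (toy data)
rp = r_precision(recs, truth)
score = ndcg(recs, truth)
```

In the toy example only one of the first two recommendations is a held-out track, so R-Precision is 0.5, while NDCG still rewards the second hit found at rank 3.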

from RecSys Challenge 2018

our method

13 of 19

Qualitative evaluation

Rock Classics

▶ My Heart Will Go On - Céline Dion

▶ Highway to Hell - AC/DC

▶ Smells Like Teen Spirit - Nirvana

▶ It’s My Life - Bon Jovi

14 of 19

Qualitative evaluation

On a subset of 22 selected playlists
Human Judgment of Playlist Relevance
Qualitative score = number of valid tracks / total recommended tracks

Method	Quality@10	Quality@66
Pre-trained	0.7376	0.7231
Fine-tuned (cross-entropy)	0.7789	0.7719
Fine-tuned (triplet loss)	0.7533	0.7461

15 of 19

Predict with LLMs

#prompt #zero-shot #few-shot

16 of 19

LLM Generation

You are an expert in music playlist generation.

Your task is to generate the continuation of a playlist given only its title and five example songs with their artists.

Important:

• You have to select only songs released before October 2017.

• Propose a COMPLETE playlist consisting of exactly 10 songs.

…

• All recommended songs must be UNIQUE and must not repeat any of the five example songs provided.

Playlist Title: "{playlist_title}"

Examples:

(1) {"song": "{song1}", "artist": "{artist1}"}

(2) . . .

Output format (strict):

[

{"song": "<title>", "artist": "<artist>"},

...

]

Answer ONLY with the JSON list exactly as specified above. Do not output anything else.

Setting a persona

Give precise instructions

Include N samples

Ask for a specific output format

approximation to match MPD content
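
A small sketch of the plumbing around such a prompt: formatting the example songs and validating the strict JSON reply. It does not call any LLM API. The helper names `build_examples` and `parse_playlist`, the case-insensitive duplicate check, and the decision to raise on a wrong song count are assumptions made for illustration.

```python
import json

def build_examples(songs):
    """Format (song, artist) seeds as the numbered JSON lines used in the prompt."""
    return "\n".join(
        f"({i}) " + json.dumps({"song": s, "artist": a}, ensure_ascii=False)
        for i, (s, a) in enumerate(songs, start=1)
    )

def parse_playlist(raw, seeds, expected=10):
    """Parse a model reply into unique (song, artist) pairs not present in seeds.

    Raises ValueError if the reply is not a JSON list of {"song", "artist"}
    objects or if it does not yield exactly `expected` new songs.
    """
    items = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(items, list):
        raise ValueError("reply is not a JSON list")
    seen = {(s.lower(), a.lower()) for s, a in seeds}
    result = []
    for item in items:
        if not isinstance(item, dict) or "song" not in item or "artist" not in item:
            raise ValueError("malformed entry in reply")
        key = (str(item["song"]).lower(), str(item["artist"]).lower())
        if key not in seen:  # skip repeats of seeds and of earlier entries
            seen.add(key)
            result.append((item["song"], item["artist"]))
    if len(result) != expected:
        raise ValueError(f"expected {expected} new songs, got {len(result)}")
    return result

seeds = [("Hey Jude", "The Beatles")]
examples_block = build_examples(seeds)
reply = '[{"song": "A", "artist": "X"}, {"song": "B", "artist": "Y"}]'
songs = parse_playlist(reply, seeds, expected=2)
```

Rejecting malformed replies outright keeps the evaluation honest: a reply that cannot be parsed is counted as a failure rather than silently repaired.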

17 of 19

Evaluation of the LLM

Metric	GPT-4o (0-shot)	GPT-4o (5-shot)	Our Method (FT-C)
Precision@10	0.0636	0.1227	0.1793
Recall@10	0.0073	0.0197	0.0382
MRR@10	0.1636	0.2505	0.3254
R-Precision@10	0.0157	0.0338	0.0496
NDCG@10	0.1900	0.3249	0.3740
Qualitative Score	0.7175	0.7953	0.7789
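
The first three rows of the table use standard top-k retrieval metrics. A minimal sketch of their usual definitions follows, assuming binary relevance and duplicate-free recommendations; the exact evaluation code behind the table is not shown in these slides, so the function names and toy data are illustrative.

```python
def precision_at_k(recommended, relevant, k=10):
    """Share of the top-k recommendations that are relevant."""
    relevant = set(relevant)
    return sum(track in relevant for track in recommended[:k]) / k

def recall_at_k(recommended, relevant, k=10):
    """Share of the relevant tracks that appear in the top-k recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return sum(track in relevant for track in recommended[:k]) / len(relevant)

def reciprocal_rank_at_k(recommended, relevant, k=10):
    """1 / rank of the first relevant track in the top-k, or 0.0 if there is none."""
    relevant = set(relevant)
    for rank, track in enumerate(recommended[:k], start=1):
        if track in relevant:
            return 1.0 / rank
    return 0.0

recs = ["x", "t2", "y", "t1"]          # ranked recommendations (toy data)
truth = ["t1", "t2", "t3", "t4"]       # held-out tracks (toy data)
p = precision_at_k(recs, truth, k=4)
r = recall_at_k(recs, truth, k=4)
rr = reciprocal_rank_at_k(recs, truth, k=4)
```

MRR@10 is the mean of the reciprocal rank over all test playlists. Recall@10 cannot exceed 10 divided by the number of held-out tracks, so long playlists keep its absolute value small.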

Also tried Llama and Zephyr (results in the paper)
GPT-4o improves from 0-shot → 5-shot, but still falls short on retrieval metrics.

18 of 19

Conclusion

Contributions

Pipeline for playlist generation from titles only
Outperformed state-of-the-art title-based methods
First assessment of prompt-based LLMs for title-based playlist generation

Future Work

Mitigate popularity bias → integrate diversity & novelty metrics
Playlist continuation (use first N tracks as seeds)
Hybrid systems → title embeddings + collaborative filtering

19 of 19

Thank you!

This presentation

bit.ly/lmrec-2025

WE HAVE A DEMO!

playlist-recommendation.tools.eurecom.fr

SPOT #21 - Thu, Sept 25 - Poster Session

Repo: bit.ly/lm-recommender-2025

Mail: pasquale.lisena@eurecom.fr