1 of 105

Knowledge-based

Music Recommendation

Models, Algorithm and Exploratory Search

Michel BUFFA Reviewer

Mounia LALMAS Reviewer

Gaël RICHARD Examiner

Tommaso DI NOIA Examiner

Pietro MICHIARDI Examiner

Benoit HUET Thesis Director

Raphaël TRONCY Thesis Co-Director

Thesis Committee

PhD Candidate

Pasquale Lisena

11 October 2019

2 of 105

1. Music

in particular Classical Music

2. Knowledge Graphs

as part of Semantic Web technologies

3. ML techniques

applied to Music KG

in particular for recommendation

What’s my thesis about?

3 of 105

Why Classical Music?

4 of 105

CLASSICAL

POPULAR

VS

5 of 105

M. Lasar (2011). Digging into Pandora’s Music Genome with musicologist Nolan Gasser. https://arstechnica.com/tech-policy/2011/01/digging-into-pandoras-music-genome-with-musicologist-nolan-gasser/

When it comes to classical music, on the other hand, it's much more about the composition itself, because even though the interpretation can vary in various subtle ways.

CLASSICAL

POPULAR

VS

For pop music the experience of the music is really defined by the recording.

6 of 105

CLASSICAL

POPULAR

VS

Track-based

Work-based

70 years of history

A thousand years: from Gregorian chant to a work written last Tuesday

Songs

Multi-movement works

Major, minor

Polyphonic, homophonic, monophonic

7 of 105

M. Schedl (2015) Towards Personalizing Classical Music Recommendations. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), Atlantic City, NJ, USA, pp. 1366-1367. https://doi.org/10.1109/ICDMW.2015.8

“Fans of classical music are underrepresented on social media and music streaming platforms.”

  • Less data
  • Less detailed metadata
  • Less involved in research

Music recommendation research

Classical music recommendation research

8 of 105

Data

Metadata

Data which describes other data

composer

composition date

genre

performer

key

derivation type

1801

9 of 105

Title, opus, movement

Who is the composer?

Who is the performer?

online music approach

Track as “atomic unit”

10 of 105

music archives approach

Work as “aggregation unit”

  • genre
  • date
  • author
  • title(s)

  • publication
  • performance
  • recordings
  • books
  • ...

11 of 105

Research Questions

Which model best represents these rich data for final users and music scholars?

What strategies to adopt for building a music Knowledge Graph?

How to make these data accessible to researchers and developers?

How can graph-based algorithms support music recommender systems?

What information can be extracted from editorial playlists?

Is Graph representation also suitable for music content?

RQ1

RQ2

RQ3

RQ4

RQ5

RQ6

12 of 105

Roadmap

  1. Music Model & Vocabularies
  2. Data Conversion
  3. Web APIs for KG
  • Embeddings and Similarity
  • Playlists and Weights
  • Learning MIDI Embeddings

Building a Music graph

Exploiting the Music knowledge

PART I

PART II

13 of 105

Improve music description to foster music exchange and reuse

Travel to the heart of the musical archives in France’s greatest institutions

Connect sources, multiply usage, enrich user experience

14 of 105

Building a Music graph

PART I

15 of 105

What is a Knowledge Graph?

It is a specific kind of knowledge base which is:

  • a directed graph: connections between nodes are first-class citizens
  • semantic: the meaning of the connections is part of the data itself
  • smart: allows graph-computing techniques and algorithms
  • alive: easy to extend, access, reuse

Semantic Web technologies realize graphs in which nodes and properties linking them are identified by URIs.

16 of 105

17 of 105

I.A Music Model & Vocabularies

Building a Music graph

Which model to represent this richness?

musicologists

libraries

musical museums

conservatories

radios

concert halls

RQ1

18 of 105

  • One of the first examples of describing music using Semantic Web technologies
  • Extends FRBR, Timeline Ontology, Event Ontology
  • Uses vocabularies for Keys, Musical Instrument (by MusicBrainz), Genres (DBpedia)

Y. Raimond, S. Abdallah, M. Sandler, and F. Giasson (2007). The Music Ontology.

In 8th International Conference on Music Information Retrieval (ISMIR). 417–422

State of the art

The Music Ontology

Building a Music graph Music Model & Vocabularies

I.A

19 of 105

The DOREMUS Model

  • Relies on Linked Data and Semantic Web principles
    • everything is a URI
    • RDF model
  • Music specific extension of FRBRoo
  • Event-based pattern: the knowledge is represented in modules (triangles) which describe events that give birth to work/expression

FRBR

museum information

bibliographic records

P. Choffé and F. Leresche (2016). DOREMUS: connecting sources, enriching catalogues and user experience. In 24th IFLA World Library and Information Congress.

20 of 105

F14

Work

F22

Expression

M2

Opus Statement

F28

Expression

Creation

R3 is realized in

E7

Activity

5

1

“Sonate pour violoncelle et piano no 1”@fr

“Sonates" , "Sonata in F"

Ludwig van Beethoven

Ludwig von Beethoven

composer

compositeur@fr compositore@it

R17 created

R19 created a realization of

U17 has opus statement

U12 has genre

P102 has title

U31 had function of type

P14 carried out by

P9 consists of

P4 has time span

1796

Sonata

sonata@it , sonate@fr , klaviersonate@de

M42 Performed

Expression

Creation

M43 Performed

Expression

Berlin

P4 has time span

1796

P7 took place at

F24 Publication Expression

F30 Publication Event

P4 has time span

1797

P7 took place at

Vienna

U4 had princeps publication

U54 is performed expression of

P165 incorporates

1770

1827

P98

born

P100

died

U11 has key

F Major

F Dur@de , Fa majeur@fr,

Fa maggiore@it , Fa mayor@es

M6

Casting

M23

Casting Detail

U13 has casting

1

U30

quantity

U2 foresees mop

Piano

Pianoforte@it Fortepian@pl

M23

Casting Detail

1

U30

quantity

U2 foresees mop

Cello

Violoncello@it Violoncelle@fr

F15

Complex

Work

F19 Publication Work

M44

Performed

Work

U5 had premiere

 U38 has descriptive expression

R10 has member

21 of 105

Controlled Vocabularies

“Sax”@en

“Saxophone”@en

“Saxofone”@pt

“Sassofono”@it

“Saxophone”@fr

Alternate labels

Alternate languages

“English term is preferred globally”

Notes

“Woodwinds”@en

“Legni”@it

Hierarchy

“Baritone Saxophone”@en

22 of 105

Controlled Vocabularies

GENRES

Diabolo

IAML

Itema3

Redomi

RAMEAU

Medium of performance

MIMO

Itema3

IAML

Diabolo

RAMEAU

Redomi

Musical keys

Modes

Catalogues

Derivation types

Functions

more available at

http://data.doremus.org/vocabularies

23 families of vocabularies · 11,000+ concepts · 610 links between terms

INTERLINKED

INTERLINKED

P. Lisena et al. (2018). Controlled Vocabularies for Music Metadata.

In 19th International Conference on Music Information Retrieval (ISMIR). Paris, France.

23 of 105

These and additional competency questions have been collected by experts from our partner institutions and used as requirements and validation for the model.

https://github.com/DOREMUS-ANR/knowledge-base/tree/master/query-examples

P. Lisena et al. (2017) Modeling the Complexity of Music Metadata in Semantic Graphs for Exploration and Discovery. In (ISMIR’17) 4th International Workshop on Digital Libraries for Musicology (DLfM’17), Shanghai, China.

Which works have been composed by Mozart when he was younger than 10?

How many works have been composed and performed for the 1st time in the same city?

Which composers had the chance to direct their own work in a performance during the last decade?

24 of 105

Which chamber music works have been composed in the 19th century by Scandinavian composers?

Edvard Grieg

1843 - 1907

Work

Genre

>1800 AND <=1900

CHAMBER MUSIC

Composition date

?

composed by

nationality

part of

SCANDINAVIA

25 of 105

I.B Data Conversion

Building a Music graph

26 of 105

Music archives have

very detailed knowledge

PROBLEMS

  • Multiple formats
    • sometimes complex parsing is required
  • No possible interoperability
  • Need for discovering overlapping knowledge
  • Information codified as free text
    • different practices in codifying the same information (“Op. 27 n. 2” - “Op. 27 no 2”)
    • wrong fields, typos, wrong punctuation
  • Not always publicly accessible

Building a Music graph Data Conversion

I.B

Ryszard Kruk S. and McDaniel B. (2009). Goals of Semantic Digital Libraries.

27 of 105

Source datasets

Works

62 550 | XML

Scores

9 154 | XML

Concerts

340 609 | XML

Discs

9 500 | XML

Works

6 846 | UNIMARC

Scores

30 319 | UNIMARC

Concerts

5 164 | XML

Discs

8 602 | XML

Works

135 940 | INTERMARC

Scores

89 184 | INTERMARC

(3 different XML sources)

28 of 105

001 FRBNF139081882FR

100 $313891295$w.0..b.....$aBeethoven$mLudwig van$d1770-1827

144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur

LANG TITLE MOP OPUS KEY

MARC FILE

29 of 105

001 FRBNF139081882FR

100 $313891295$w.0..b.....$aBeethoven$mLudwig van$d1770-1827

144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur

MARC FILE

NUM SUB

30 of 105

marc2rdf

MARC PARSER

FREE TEXT INTERPRETER

STRING 2 VOCABULARY

MARC

files

vocabularies

1st performance in Moscow, December 29, 1956,

by Mstislav Rostropovich on cello and A. Dedukhin on piano

mapping rules

RDF

graph

What strategies to adopt for building a music Knowledge Graph?

RQ2

31 of 105

INTERMARC

marc2rdf

UNIMARC

EUTERPE XML

ITEMA3 XML

euterpe

converter

itema3

converter

GRAPH BNF

GRAPH PHILHARMONIE

GRAPH EUTERPE

GRAPH ITEMA3

diabolo converter

DIABOLO XML

GRAPH DIABOLO

STRING 2 VOCABULARY

32 of 105

What is in the Knowledge Graph?

  • 89,872 persons (composers, performers, …)
  • 18,075 corporate bodies (orchestras, choruses, publishers, …)
  • 357,451 musical works
    • 16k components
    • 4k derived works
  • 193,412 concerts and studio recordings
  • 469,131 performed works
  • 3,833 foreseen concerts
  • 31,296 publications
  • 48,006 scores

33 of 105

I.C Web APIs for KG

Building a Music graph

Pëtr Il'ič Čajkovskij

Pyotr Ilyich Tchaikovsky

Пётр Ильич Чайковский

GALLERY OF COMPOSERS

Antonio Vivaldi

Ludwig van Beethoven

Johann Sebastian Bach

Jean Sébastien Bach [FR]

34 of 105

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT * WHERE {
  ?composer a foaf:Person ;
            foaf:name ?name ;
            foaf:depiction ?img .
}

Building a Music graph Web APIs for KG

I.C

35 of 105

SPARQL result

JSON format

"bindings": [{
  "composer": { "type": "uri", "value": "http://data.doremus.org/artist/0b9d963c-bfd7-337d-b6c3-c874f5e62125" },
  "name": { "type": "literal", "value": "Petr Ilʹič Čajkovskij" },
  "img": { "type": "uri", "value": "http://.../Pyotr_Ilyich_Tchaikovsky.jpg" }
}, {
  "composer": { "type": "uri", "value": "http://data.doremus.org/artist/0b9d963c-bfd7-337d-b6c3-c874f5e62125" },
  "name": { "type": "literal", "value": "Piotr Ilitch Tchaikovski" },
  "img": { "type": "uri", "value": "http://.../Pyotr_Ilyich_Tchaikovsky.jpg" }
}, {
  "composer": { "type": "uri", "value": "http://data.doremus.org/artist/b34f92ab-ad86-361b-a8b8-5c3a4db784d0" },
  "name": { "type": "literal", "value": "Antonio Vivaldi" },
  "img": { "type": "uri", "value": "http://.../Antonio_Vivaldi.jpg" }
}, ...

In the first two bindings the composer URI and the image are the SAME; only the name is DIFFERENT.

How to make these data accessible to researchers and developers?

RQ3

36 of 105

[{
  "id": "http://data.doremus.org/artist/0b9d963c...",
  "name": ["Petr Ilʹič Čajkovskij", "Piotr Ilitch Tchaikovski"],
  "image": "http://.../Pyotr_Ilyich_Tchaikovsky.jpg"
}, {
  "id": "http://data.doremus.org/artist/b34f92ab...",
  "name": "Antonio Vivaldi",
  "image": "http://.../Antonio_Vivaldi.jpg"
}]

2 names

1 picture

37 of 105

skip irrelevant metadata

reducing and parsing

merging “rows”

mapping to different structures
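The reshaping steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the SPARQL Transformer implementation: the function name `merge_bindings`, its parameters and the `rename` mapping are assumptions made for this example, and the input mirrors the SPARQL JSON bindings shown earlier.

```python
def merge_bindings(bindings, id_var="composer", rename=None):
    """Collapse SPARQL JSON bindings that share the same id into one object."""
    rename = rename or {}
    merged = {}
    for row in bindings:
        # Keep only the values, skipping "type" and other metadata.
        flat = {var: cell["value"] for var, cell in row.items()}
        key = flat.pop(id_var)
        obj = merged.setdefault(key, {"id": key})
        for var, value in flat.items():
            name = rename.get(var, var)
            if name not in obj:
                obj[name] = value
            elif isinstance(obj[name], list):
                if value not in obj[name]:
                    obj[name].append(value)
            elif obj[name] != value:
                # Second distinct value: turn the field into a list.
                obj[name] = [obj[name], value]
    return list(merged.values())


TCHAIKOVSKY = "http://data.doremus.org/artist/0b9d963c-bfd7-337d-b6c3-c874f5e62125"
VIVALDI = "http://data.doremus.org/artist/b34f92ab-ad86-361b-a8b8-5c3a4db784d0"

bindings = [
    {"composer": {"type": "uri", "value": TCHAIKOVSKY},
     "name": {"type": "literal", "value": "Petr Ilʹič Čajkovskij"},
     "img": {"type": "uri", "value": "http://.../Pyotr_Ilyich_Tchaikovsky.jpg"}},
    {"composer": {"type": "uri", "value": TCHAIKOVSKY},
     "name": {"type": "literal", "value": "Piotr Ilitch Tchaikovski"},
     "img": {"type": "uri", "value": "http://.../Pyotr_Ilyich_Tchaikovsky.jpg"}},
    {"composer": {"type": "uri", "value": VIVALDI},
     "name": {"type": "literal", "value": "Antonio Vivaldi"},
     "img": {"type": "uri", "value": "http://.../Antonio_Vivaldi.jpg"}},
]

result = merge_bindings(bindings, rename={"img": "image"})
```

With this input, `result` holds two objects: one composer with two names and one picture, and one composer with a single name.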

Booth et al. (2019) Toward Easier RDF. In W3C Workshop on Web Standardization for Graph Data.

38 of 105

SPARQL Transformer

{
  "proto": {
    "id": "?composer",
    "name": "$foaf:name$required",
    "image": "$foaf:depiction$required"
  },
  "$where": ["?composer a ecrm:E21_Person"],
  "$limit": 100
}

  • JS and Python library
  • A JSON-based syntax
    • template + query
  • Integration in grlc.io for Web API development

Lisena P. et al. (2019). Easy Web API Development with SPARQL Transformer. In ISWC’19.

39 of 105

SPARQL Transformer

QUERIES*                     n. objects (original)   n. objects (transformed)
1.Born_in_Berlin             1132                    573
2.German_musicians           290                     257
3.Musicians_born_in_Berlin   172                     109
4.Soccer_players             78                      70
5.Games                      1020                    981

Evaluation #1: Queries’ results

Evaluation #2: User Survey

55 subjects

Used in

Overhead < 0.1 seconds

40 of 105

Exploiting the

Music knowledge

PART II

41 of 105

discover new music

improve their streaming music experience

background for their activities

FINAL USERS

MUSIC EXPERTS

playlist producing

help for concert programming

automatic radio broadcasting

How can graph-based algorithms support

music recommender systems?

RQ4

42 of 105

Antonio Vivaldi

Autumn. I Allegro

Tomaso Albinoni

Symphony n. 3

NEXT

SEED

TARGET

How to find it?

43 of 105

State of the art

Item neighborhood mapping

S. Oramas, V. C. Ostuni, T. Di Noia, X. Serra, and E. Di Sciascio. Sound and music recommendation with knowledge graphs. ACM Trans. Intell. Syst. Technol. 8, 2, Article 21 (October 2016), 21 pages.

http://dx.doi.org/10.1145/2926718

The vector representation of item i is computed on its neighborhood of length l.

The more entities/properties two items share at a certain distance, the more similar those items can be considered.

44 of 105

State of the art

MusicLynx

Alo Allik, Florian Thalmann, and Mark Sandler. 2018. MusicLynx: Exploring Music Through Artist Similarity Graphs.

In The Web Conference 2018. Demo Track, pp 167-170.

https://doi.org/10.1145/3184558.3186970

  • Access to different knowledge sources
  • Maximum Degree Weighted (MDW): links to very large categories (e.g. Living People) are discouraged with respect to more significant ones.

45 of 105

II.A Embeddings

and Similarity

Exploiting the

Music knowledge

46 of 105

Word Embeddings (e.g. word2vec)

corpus of documents -> vectors that represent the semantic distribution of words in the text

Graph Embeddings (e.g. node2vec)

set of random walks -> vectors that represent the semantic distribution of entities in the graph

Exploiting the Music knowledge Embeddings and Similarity

II.A

Main idea: nodes that occur in similar contexts (neighborhood of nodes in a graph) are more similar, and will be closer in the vector space.

Aditya Grover and Jure Leskovec. node2vec: Scalable Feature Learning for Networks.

In 22nd ACM SIGKDD, 2016.
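To make the notion of "context" concrete, here is a minimal sketch of how random walks turn a graph into sequences of nodes. It is a simplification: node2vec biases its walks with two parameters (p and q), while this sketch draws neighbours uniformly. The function name, the default values and the toy edge list (three instrument codes linked to their parent families) are assumptions for illustration.

```python
import random


def random_walks(edges, walk_length=5, walks_per_node=2, seed=42):
    """Generate uniform random walks over an undirected edge list."""
    rng = random.Random(seed)
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    walks = []
    for start in sorted(neighbours):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                # Step to a uniformly chosen neighbour of the current node.
                walk.append(rng.choice(neighbours[walk[-1]]))
            walks.append(walk)
    return walks


# Toy hierarchy: oboe and clarinet under woodwinds, cello under strings.
edges = [("wob", "w"), ("wcl", "w"), ("svc", "s")]
walks = random_walks(edges)
```

Each walk plays the role of a sentence: feeding the walks to a word2vec-style model yields one vector per node, and nodes that appear in similar walks end up close together.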

47 of 105

Some problems:

  • Our dataset was constantly growing
  • The number of nodes is huge
  • Different purposes for recommendation:
    • radio broadcasting
    • concert programming
    • final users

computationally expensive and time-consuming

(multiple runs of node2vec on a huge amount of data)

48 of 105

Solution

Compute embeddings at the level of simple features

period of time

musical key

medium of performance

genre

...

49 of 105

Example: MoP

vocabulary:iaml/mop/wob

vocabulary:iaml/mop/wcl

Is the clarinet more similar to the oboe or to the cello?

vocabulary:iaml/mop/svc

50 of 105

Example: MoP

vocabulary:iaml/mop/wob

vocabulary:iaml/mop/wcl

vocabulary:iaml/mop/w

vocabulary:iaml/mop/s

vocabulary:iaml/mop/svc

Graph of vocabularies

51 of 105

Example: MoP

vocabulary:iaml/mop/wob

vocabulary:iaml/mop/wcl

vocabulary:iaml/mop/svc

Casting Detail

Casting Detail

Casting Detail

Casting

862 times

Casting

1213 times

Graph of usage

52 of 105

SPARQL

endpoint

subgraph

(edgelist)

selection of

interesting properties

(e.g. skos:broader)

vectors

embedding

NODE2VEC

s    1.34  0.98  0.20
w    1.44  1.21  0.31
svc  0.14  1.31  1.48
wcl -1.2   1.90  0.85
wob -0.83  2.32  1.03

Pasquale Lisena et al. Controlled Vocabularies for Music Metadata.

19th International Society for Music Information Retrieval Conference (ISMIR), Paris, France, September 2018.

53 of 105

Example: MoP

vocabulary:iaml/mop/wob

vocabulary:iaml/mop/wcl

Is the clarinet more similar to the oboe or to the cello?

vocabulary:iaml/mop/svc

0.506

0.562
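One common way to compare two embedding vectors is cosine similarity; the sketch below assumes that measure, which the slides do not name. It uses the three-dimensional example vectors from the pipeline slide, so its output illustrates the ordering only and is not expected to reproduce the two scores shown above, which come from the full embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Illustrative 3-dimensional vectors for cello, clarinet and oboe.
vectors = {
    "svc": [0.14, 1.31, 1.48],
    "wcl": [-1.2, 1.90, 0.85],
    "wob": [-0.83, 2.32, 1.03],
}

clarinet_oboe = cosine_similarity(vectors["wcl"], vectors["wob"])
clarinet_cello = cosine_similarity(vectors["wcl"], vectors["svc"])
```

On these toy vectors the clarinet comes out closer to the oboe than to the cello.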

54 of 105

VECTOR SPACE OF MoPs

ethnic chordophones

ethnic flutes

percussions

brass

orchestra

woodwinds

orchestra

strings

rare strings

55 of 105

VECTOR SPACE OF GENRES

56 of 105

Combine embeddings at the level of complex features

artists

works

playlists

57 of 105

Example: Artists

58 of 105

MOP

embeddings:

MOP

GENRE

KEY

Artist’s features:

BIRTH DATE

DEATH DATE

CASTING

WORKS

GENRE WORKS

KEYS WORKS

PLAYED

MOP

-0.02 0.01 0.01 0.00 -0.01 -0.02 0.01 0.00 -0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 -0.00 0.00 0.00 0.07 -0.03 0.07 -0.02 -0.01 0.19 0.02 0.69 -0.19 -0.14 0.08 0.03 0.03 0.00 0.08 null null null null null -0.06 0.07 0.02 -0.03 0.00

Artist vector

BIRTH

PLACE

DEATH PLACE

FUNCTION

FUNCT

GeoNames

GeoNames

Time

DIMENSIONALITY REDUCTION (PCA)

Time

AVG

AVG

AVG

AVG

AVG

Some data are unknown or not applicable

null null null null null

59 of 105

Example: Artists

P. Lisena, R. Troncy (2017). Combining music specific embeddings for computing artist similarity.

In 18th International Society for Music Information Retrieval Conference (ISMIR), Late-breaking & demo track.

percentage of missing

dimensions in artist 2

with respect to artist 1

60 of 105

Do all the properties have the same importance?

61 of 105

II.B Playlists and Weights

Exploiting the

Music knowledge

What information can be extracted from editorial playlists?

RQ5

Idea: experts apply implicit (“unaware”) rules when creating a playlist.

62 of 105

  • Use playlists to give a weight to the influence of each dimension.
  • No GOLD STANDARD available, creation of a “silver” one

Radio France Playlists

(50)

Spotify Playlists

(65)

ITEMA3 Concerts (624)

Philharmonie Concerts (186)

  • Radio France Web Radio (7 channels)
  • Realised by experts
  • Classical section of Spotify app
  • Realised by Spotify staff
  • Real concerts that took place in Paris (studio + concert hall)

Exploiting the Music knowledge Playlists and Weights

II.B

63 of 105

variance within < variance between

HOMOGENEOUS

good for recommendation

variance within > variance between

INHOMOGENEOUS

bad for recommendation

Weights: F-test statistic = variance between / variance within
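The ratio can be sketched as a one-way ANOVA F statistic computed for one dimension at a time, with each playlist treated as a group. This is a minimal sketch under that assumption: the function name and the toy data (composition years of the works in two playlists) are invented for illustration, and how the resulting values are turned into final weights is not shown here.

```python
def f_statistic(groups):
    """One-way ANOVA F: between-group variance over within-group variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical composition years of the works in two playlists.
homogeneous = [[1700, 1710, 1720], [1850, 1860, 1870]]
inhomogeneous = [[1700, 1860, 1720], [1850, 1710, 1870]]

f_homogeneous = f_statistic(homogeneous)
f_inhomogeneous = f_statistic(inhomogeneous)
```

A dimension on which playlists are internally homogeneous gets a large F value and is a good candidate for a high weight; a dimension on which items vary as much inside a playlist as across playlists gets a value below 1.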

64 of 105

evaluation

  • 7 music experts from partner institutions
  • given the seed:
    • put bad items in the trash
    • sort according to preference

65 of 105

evaluation

The study of variance helps us to identify which dimensions should be promoted for better recommendations.

66 of 105

Under experimentation in

live.philharmoniedeparis.fr

67 of 105

The role of titles: Title2Rec

Exploiting the

Music knowledge

“Relax Driving”

Johannes Brahms

Symphony n.3

“Beach Party”

Luis Fonsi

Despacito

68 of 105

Title2Rec: training

Content

(id of tracks)

Playlists

SEQUENTIAL EMBEDDINGS

(word2vec)

CLUSTERS

of playlists

Titles

DOCUMENTS

fastText

MODEL

69 of 105

Title2Rec: training

yy :) christmas litmas guardians christmas christmas holiday christmas christmas the good stuff. xmas himym christmas pop xmas country happy holidays holidays christmas christmas hits 25 just cause stay christmas tis the season 🎄 christmas 🎄 christmas oldbutgold christmas christmas vibes christmas strong christmas winter wonderland christmas time december 15 xmas christmas christmas pop flight christmas deep christmas vibes christmas oldies work in progress christmas christmas playlist christmas music josh 🎄 christmas blah christmas & chill depression secret christmas christmas & chill christmas love :) christmas elite :) christmas special songs christmas christmas christmas jams jessica its lit classy pump up graduation at the moment .... christmas christmas christmas music good old days christmas mix christmas music 80s rock christmas 2015 xmas christmas christmas christmas christmas vibes 2017 songs christmas vibes!! christmas music holidays christmas 2016! christmas christmas club music summer 2015 christmasssss christmassss christmas christmas christmas christmas!! christmas christmas feels christmas christmas(:: christmas playlist great christmas playlist christmas & chill christmas christmas trap blast from the past christmas 2016 classics grad christmas christmas christmas christmas yessss christmas christmas rihanna christmas christmas songs christmas 2016!!!!! good vibes christmas christmas songs christmas christmas christmas favorites christmas christmas 2016 🎄 christmas last christmas christmas all my friends christmas christmas !! chirstmas the weeknd christmas 2015 christmas christmas lyrical party music wake up happy vibes 🎄 christmas calm country winter christmas christmas christmas pop christmas af ❄ christmas️ feel good :)) christmas christmas af christmas jams moana christmas merry christmas! 
christmas playlist christmas christmas silly love songs christmas </3 school 🎄 christmas christmas music christmas christmas music 🎄 christmas x-mas christmas bops christmas beachin' dance jamz christmas new wave its christmas christmas 🎄 christmas indie 2 christmas 1980 christmas jams christmas 2015 sunrise christmas christmas playlist christmas jams christmas white ella chirstmas sleep :))))) christmas random christmas dance christmas christmas december; christmas christmas favs christmas old christmas songs ~holidaze~ christmas christmas music xmas christmas holidays december christmas christmas christmas baby wedding music tis the season christmas relax holidays!! 🎅 🏼 christmas christmas christmas december '15 christmas!! christmas new songs christmas christmas

70 of 105

Title2Rec: predicting

Given a new title:

  • find the most similar titles among the known ones
  • propose the most popular tracks among those titles
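The two prediction steps can be sketched as follows. Note the substitution: Title2Rec compares titles through a fastText model trained on clusters of playlists, whereas this sketch uses plain token overlap (Jaccard similarity) so that it stays self-contained. The function names, parameters, playlists and track ids are all hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Token overlap between two titles (stand-in for embedding similarity)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def recommend(new_title, playlists, top_titles=2, top_tracks=3):
    """Pick the most similar known titles, then their most popular tracks."""
    ranked = sorted(playlists, key=lambda p: jaccard(new_title, p["title"]), reverse=True)
    counts = Counter(t for p in ranked[:top_titles] for t in p["tracks"])
    return [track for track, _ in counts.most_common(top_tracks)]


playlists = [
    {"title": "christmas vibes", "tracks": ["t1", "t2"]},
    {"title": "christmas & chill", "tracks": ["t1", "t3"]},
    {"title": "beach party", "tracks": ["t9"]},
]

best = recommend("christmas songs", playlists, top_tracks=1)
```

Here the two Christmas playlists are the closest titles, and the track they share is proposed first.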

Evaluated on Spotify’s Million Playlist Dataset

in the context of the RecSys Challenge 2018

in the challenge:

#37 out of 112

#13 out of 31

71 of 105

II.C Learning

MIDI Embeddings

Exploiting the

Music knowledge

Is Graph representation also suitable for music content?

RQ6

72 of 105

MIDI2vec

Apply graph technologies to MIDI

  • Transform the MIDI stream into a graph

  • Apply node2vec for learning graph embeddings
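The first step can be pictured as producing an edge list in which the MIDI file, its groups of notes and their properties are all nodes. The sketch below is an assumption about what such a conversion could look like, not the MIDI2vec code: the function name, the node naming scheme (`tempo:`, `pitch:`, ...) and the already-parsed input are hypothetical.

```python
def midi_to_edges(midi_id, tempo, time_signature, programs, note_groups):
    """Build an edge list linking a MIDI node to its features and note groups."""
    edges = [(midi_id, f"tempo:{tempo}"), (midi_id, f"timesig:{time_signature}")]
    edges += [(midi_id, f"program:{p}") for p in programs]
    for group in note_groups:
        pitches = "-".join(str(p) for p in sorted(group["pitches"]))
        group_node = f"group:{pitches}:{group['duration']}:{group['velocity']}"
        edges.append((midi_id, group_node))
        # Each group of simultaneous notes points to its own properties.
        edges += [(group_node, f"pitch:{p}") for p in group["pitches"]]
        edges.append((group_node, f"duration:{group['duration']}"))
        edges.append((group_node, f"velocity:{group['velocity']}"))
    return edges


# One C major chord played on program 0 (hypothetical, already parsed values).
edges = midi_to_edges(
    "midi:1", tempo=120, time_signature="4/4", programs=[0],
    note_groups=[{"pitches": [60, 64, 67], "duration": 1.0, "velocity": 80}],
)
```

Two files that share tempo, instruments or chords become connected through the same feature nodes, which is what lets a graph embedding algorithm place them close together.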

Exploiting the Music knowledge Learning MIDI Embeddings

II.C

MIDI

Group of Notes

Pitch

Duration

Program

Time Signature

Tempo

Velocity

+

+

+

+

+

73 of 105

Experiment: genre and metadata prediction

Dataset 1: SLAC

250 MIDI, balanced on 5/10 genres

Accuracies on cross-fold validation:

Dataset 2: MuseData

438 MIDI, unbalanced, linked to DOREMUS

Accuracies on cross-fold validation:

Baseline: McKay et al (2010)

74 of 105

Exploratory Search Engine

overture.doremus.org

Emotion Detection

data.doremus.org/emotion

75 of 105

Which model best represents these rich data for final users and music scholars?

DOREMUS model and Vocabularies

What strategies to adopt for building a music Knowledge Graph?

marc2rdf and other converters

result: the DOREMUS Knowledge Graph

How to make these data accessible to researchers and developers?

SPARQL Transformer reshapes and merges the results for easy use

RQ1

RQ2

RQ3

Main contributions

76 of 105

How can graph-based algorithms support music recommender systems?

Embedding approach with generation and recombination of partial vectors

What information can be extracted from editorial playlists?

A study of editorial playlists, for weighting a recommender system

Title2Rec: recommend music by the title of the playlist

Is Graph representation also suitable for music content?

MIDI2vec: learning MIDI graph embeddings

RQ4

RQ5

RQ6

Main contributions

77 of 105

Future Work (1/2)

Short Term

  • Studies on simplifications of the ontology (schema.org)
  • Domain-based NLP for text-field information extraction

Long Term

  • Strategies for modeling librarian information: representing meta-information on a 2nd level (RDF*)

Modeling and accessing a KG

78 of 105

Future Work (2/2)

Short term

  • Split the dataset by historical period: more precise training, faster performance
  • Title2Rec + similarity-based recommender system: application for editors
  • Experiment with MIDI embeddings on larger datasets

Long term

  • Gold standard dataset of classical music playlists
  • Combining our strategy with more traditional ones (CF)
  • MIDI ontology: extend and use in MIDI2vec

Knowledge-aware Recommender system

79 of 105

Publications

Conference

Poster&Demo

Journal

Tutorial

Workshop

EKAW'16

ISWC'16

EKAW'16

2016

ISWC'17 X2

ISMIR'17

K-CAP'17

DLfM'17

2017

ISWC'18 X2

ISMIR'18

ISMIR'18

BIBLIOTHEK - Forschung und Praxis

ESWC'18

RecSys'18

TheWebConf'18

2018

ISWC'19

2019

PC Member

ISWC’18 P&D, SAAM’18, DLfM’18, ISWC’19 P&D, K-CAP’19

as sub-reviewer: KAARS’18, TheWebConf’19

Student

Supervision

2 Master Thesis supervisions

10 Semester Projects supervisions

Lecturer for WebInt and Aalto BootCamp

Talks

Des Catalogues au Web des Données

  • BnF, Paris

Classical Music and Knowledge Graphs

  • Semantic Web course, PoliTo
  • WAI meeting, VU Amsterdam
  • Research seminar, Deezer, Paris

80 of 105

References (1/2)

  • M. Schedl (2015) Towards Personalizing Classical Music Recommendations. 2015 IEEE International Conference on Data Mining Workshop (ICDMW), Atlantic City, NJ, USA, pp. 1366-1367.
  • Y. Raimond, S. Abdallah, M. Sandler, and F. Giasson (2007). The Music Ontology. In 8th International Conference on Music Information Retrieval (ISMIR). 417–422
  • P. Choffé and F. Leresche (2016). DOREMUS: connecting sources, enriching catalogues and user experience. In 24th IFLA World Library and Information Congress.
  • P. Lisena et al. (2018). Controlled Vocabularies for Music Metadata. In 19th International Conference on Music Information Retrieval (ISMIR). Paris, France.
  • P. Lisena et al. (2017) Modeling the Complexity of Music Metadata in Semantic Graphs for Exploration and Discovery. In 4th International Workshop on Digital Libraries for Musicology (DLfM’17), Shanghai, China.
  • Booth et al. (2019) Toward Easier RDF. In W3C Workshop on Web Standardization for Graph Data.
  • Lisena P. et al. (2019). Easy Web API Development with SPARQL Transformer. In ISWC’19.
  • S. Oramas, V. C. Ostuni, T. Di Noia, X. Serra, and E. Di Sciascio. Sound and music recommendation with knowledge graphs. ACM Trans. Intell. Syst. Technol. 8, 2, Article 21 (October 2016), 21 pages.

81 of 105

References (2/2)

  • Alo Allik, Florian Thalmann, and Mark Sandler (2018). MusicLynx: Exploring Music Through Artist Similarity Graphs. In The Web Conference 2018. Demo Track, pp 167-170.
  • Aditya Grover and Jure Leskovec. (2016) node2vec: Scalable Feature Learning for Networks. In 22nd ACM SIGKDD.
  • McKay, C., Burgoyne, J., Hockman, J., Smith, J. B. L., Vigliensoni, G., and Fujinaga, I. (2010). Evaluating the Genre Classification Performance of Lyrical Features Relative to Audio, Symbolic and Cultural Features. In ISMIR 2010, Utrecht, The Netherlands.
  • Meroño-Peñuela, A., Hoekstra, R., Gangemi, A., Bloem, P., de Valk, R., Stringer, B., Janssen, B., de Boer, V.,Allik, A., Schlobach, S., et al. (2017). The MIDI Linked Data Cloud. In ISWC 2017, Vienna, Austria.
  • Huang, A. and Wu, R. (2016). Deep Learning for Music. Computing Research Repository (CoRR), https://arxiv.org/abs/1606.04930 .
  • Peter Knees and Markus Schedl (2013). A survey of music similarity and recommendation from music context data. ACM Trans. Multimedia Comput. Commun. Appl. 10, 1, Article 2 (December 2013), 21 pages.
  • Palumbo, Rizzo, Troncy. (2017) entity2rec: Learning user-item relatedness from knowledge graphs for top-N item recommendation. In RECSYS 2017, Como, Italy.

82 of 105

???

Popularity bias

Listeners of Justin Bieber: ~19M

83 of 105

84 of 105

The building blocks

Work-Expression-Event

F14

Work

F22

Expression

F28

Expression

Creation

R3 is realized in

R17 created

R19 created a realization of

85 of 105

The composition event

F14

Work

F22

Expression

F28

Expression

Creation

E7

Activity

“composer”@en

U31 had function of type

P14 carried out by

P9 consists of

P4 has time span

1796

86 of 105

M2

Opus Statement

5

1

“Sonate pour violoncelle et piano no 1”@fr

“Sonates" , "Sonata in F"

U17 has opus statement

U12 has genre

U70 has title

Sonata

sonata@it , sonate@fr , klaviersonate@de

U11 has key

F Major

F Dur@de , Fa majeur@fr,

Fa maggiore@it , Fa mayor@es

F14

Work

F22

Expression

F28

Expression

Creation

U42 has opus number

U43 has opus subnumber

87 of 105

M42 Performed

Expression

Creation

M43 Performed

Expression

M44

Performed

Work

F14

Work

F22

Expression

F28

Express.

Creation

R3 is realized in

R17 created

R19 created a realization of

The performance

U54 is performed expression of

Royal Albert Hall

P4 has time-span

2013-04-09

M28

Individual Performance

P14 carried out by

Ruby Hughes

U1 used mop

Soprano

P9 consists of

M28

Individual Performance

BBC Symphony Orchestra

U1 used mop

Orchestra

Orchestra@it Orchestre@fr

Osmo Vänskä

P7 took place at

M28

Individual Performance

P14 carried out by

U31 had function of type

Conductor

Direttore d’orchestra@it Chef d’orchestre@fr

U31 had function of type

Performer

Interprete@it Interprète@fr

88 of 105

Controlled Vocabularies Why?

Different languages: “Do dièse majeur”@fr, “Do diesis maggiore”@it, “C sharp major”@en

Synonyms: “sonate”@fr, “sonatine”@fr, “sonate d'église”@fr

Disambiguation: Ludwig van Beethoven, Johann van Beethoven

Description: “composer”@en, “A composer is a person who creates or writes music (source: Wikipedia)”@en

89 of 105

marc2rdf

MARC PARSER

  • Parsing of the file
  • Interpretation of the fields
  • Graph generation

MARC files

mapping rules

Building a Music graph Data Conversion

I.B

90 of 105

MAPPING RULES

Source field: 144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur

UNIT OF INFORMATION: F22 Expression: Opus Number

PATH: F22 Self-Contained Expression, U17 has opus statement, M2 Opus Statement [U42 has opus number, M12 Opus Number] + [U43 has opus subnumber, M13 Opus Subnumber]

INTERMARC BNF: TUM: 144 $p, chain of digits; TUM: 144 $p, chain of digits before the comma

TRANSFER RULE: Remove the abbreviation “Op.” before the number

EXAMPLE: 144 $pOp. 352 --> M12 = 352; 144 $pOp. 27, no 2 --> M12 = 27, M13 = 2
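The opus mapping rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated on the slide, not the marc2rdf code: the function name `parse_opus` is hypothetical, and reading the subnumber as the digits after the comma is an assumption inferred from the second example.

```python
import re
from typing import Optional, Tuple


def parse_opus(subfield_p: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (opus number, opus subnumber) from the content of 144 $p.

    Hypothetical helper: it follows the slide's transfer rule, it is not
    taken from the marc2rdf implementation.
    """
    # Transfer rule: remove the abbreviation "Op." before the number.
    text = re.sub(r"^\s*op\.?\s*", "", subfield_p, flags=re.IGNORECASE)
    # M12 is the chain of digits before the comma (or the only chain of digits).
    before_comma, _, after_comma = text.partition(",")
    number = re.search(r"\d+", before_comma)
    # Assumption: M13 is the chain of digits after the comma, if any.
    subnumber = re.search(r"\d+", after_comma)
    return (
        number.group() if number else None,
        subnumber.group() if subnumber else None,
    )


opus_single = parse_opus("Op. 352")
opus_with_sub = parse_opus("Op. 27, no 2")
```

With the two examples from the slide, `opus_single` holds M12 = 352 with no subnumber, and `opus_with_sub` holds M12 = 27 and M13 = 2.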

91 of 105

marc2rdf

MARC PARSER

FREE TEXT INTERPRETER

MARC files

vocabularies

1st performance in Moscow, December 29, 1956,

by Mstislav Rostropovich on cello and A. Dedukhin on piano

  • Extracting info from the text through empirical rules
  • Disambiguation for vocabularies terms and artists

92 of 105

marc2rdf

MARC PARSER

FREE TEXT INTERPRETER

STRING 2 VOCABULARY

  • Replace labels with URIs from controlled vocabularies

MARC files

vocabularies

“Violoncelle”@fr

93 of 105

STRING 2 VOCABULARY

  • Match against a family of vocabularies

“Soprano”@it

MIMO IAML DIABOLO ITEMA3 REDOMI RAMEAU

GENRE

“C Major”@en

GENRE

vocabulary:key/c

KEY

vocabulary:key/c

  • 2 passes
    • Exact label + language
    • Exact label, any language
  • Correction of editorial mistakes
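The two matching passes can be sketched as follows. This is a minimal illustration, not the String2Vocabulary implementation: the in-memory `VOCABULARY` table, the `vocabulary:key/f` URI, the "Do majeur"/"Do maggiore" labels and the case-insensitive comparison are all assumptions, and the correction of editorial mistakes is not covered.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical miniature vocabulary: concept URI -> list of (label, language).
VOCABULARY: Dict[str, List[Tuple[str, str]]] = {
    "vocabulary:key/c": [("C Major", "en"), ("Do majeur", "fr"), ("Do maggiore", "it")],
    "vocabulary:key/f": [("F Major", "en"), ("Fa majeur", "fr"), ("Fa maggiore", "it")],
}


def normalize(label: str) -> str:
    # Assumption: labels are compared case-insensitively, ignoring extra spaces.
    return " ".join(label.lower().split())


def string2vocabulary(
    label: str,
    lang: str,
    vocabulary: Dict[str, List[Tuple[str, str]]] = VOCABULARY,
) -> Optional[str]:
    """Return the URI of the concept matching the label, or None."""
    wanted = normalize(label)
    # Pass 1: exact label in the declared language.
    for uri, labels in vocabulary.items():
        if any(normalize(text) == wanted and tag == lang for text, tag in labels):
            return uri
    # Pass 2: exact label in any language.
    for uri, labels in vocabulary.items():
        if any(normalize(text) == wanted for text, _ in labels):
            return uri
    return None


same_language = string2vocabulary("C Major", "en")
wrong_language_tag = string2vocabulary("Fa maggiore", "fr")
```

The second call only succeeds in the second pass: the label is Italian but tagged as French, which is the kind of mislabelled input the any-language pass is meant to recover.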

94 of 105

Query Object

{
  "proto": {
    "id": "?composer",
    "name": "$foaf:name$required",
    "image": "$foaf:depiction$required"
  },
  "$where": ["?composer a ecrm:E21_Person"],
  "$limit": 100
}

Define at the same time:

  • the query
  • the requested structure

PROTOTYPE DEFINITION

  • write the resulting shape directly
  • define the replacements
  • options and filters

$-PROPERTIES

map SPARQL features

$where $values $limit $distinct $orderby $groupby $having $filter $prefixes

KEY CONCEPT
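How a query object could be turned into a SPARQL query is sketched below. It is a simplified illustration under stated assumptions, not the actual parser: the function name `build_sparql` is hypothetical, only flat prototypes with `$where` and `$limit` are handled, prefixes are left out, and wrapping non-required properties in OPTIONAL is an assumption.

```python
from typing import Any, Dict, List, Tuple


def build_sparql(query: Dict[str, Any]) -> Tuple[str, Dict[str, str]]:
    """Return (SPARQL string, template mapping each output key to a variable)."""
    proto = query["proto"]
    subject = proto["id"]  # the "id" keyword declares the subject
    where = query.get("$where", [])
    if isinstance(where, str):  # "$where" may be a single string or a list
        where = [where]
    patterns: List[str] = [pattern + " ." for pattern in where]
    template: Dict[str, str] = {"id": subject}
    variables: List[str] = [subject]
    for key, value in proto.items():
        if key == "id":
            continue
        variable = f"?v{len(variables)}"
        # "$foaf:name$required" -> predicate "foaf:name", options ["required"]
        predicate, *options = value.lstrip("$").split("$")
        triple = f"{subject} {predicate} {variable} ."
        if "required" not in options:
            # Assumption: properties not marked as required become OPTIONAL.
            triple = f"OPTIONAL {{ {triple} }}"
        patterns.append(triple)
        template[key] = variable
        variables.append(variable)
    sparql = (
        "SELECT DISTINCT " + " ".join(variables)
        + " WHERE { " + " ".join(patterns) + " }"
    )
    if "$limit" in query:
        sparql += f" LIMIT {query['$limit']}"
    return sparql, template


composer_query = {
    "proto": {
        "id": "?composer",
        "name": "$foaf:name$required",
        "image": "$foaf:depiction$required",
    },
    "$where": ["?composer a ecrm:E21_Person"],
    "$limit": 100,
}
sparql, template = build_sparql(composer_query)
```

The returned template keeps the shape written in the prototype, with each `$`-property replaced by the variable that will carry its value.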

95 of 105

Query Object

"proto": {
  "id": "?work",
  "name": "$rdfs:label$required",
  "image": "$foaf:depiction$required",
  "author": {
    "id": "?author",
    "name": "$rdfs:label$required"
  }
},
"$where": "?work dbo:author ?author"

The “id” keyword declares the subject of the other statements, at different levels:

?work rdfs:label ?name .
?work foaf:depiction ?image .
?author rdfs:label ?name .

96 of 105

Implementation

JSON query

PARSER

SELECT DISTINCT ?work ?v1 ?v2
WHERE {
  ?work a dbo:Work .
  ?work dbo:museum dbr:Louvre .
  ?work rdfs:label ?v1 .
  ?work foaf:depiction ?v2 .
}
LIMIT 100

{"id": "?work","name": "?v1","image": "?v2"}

SPARQL Query

PROTOTYPE

QUERY PERFORMER

SHAPER

SPARQL results (JSON)

JSON output

SPARQL endpoint
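The shaper step can be sketched as below: it takes the prototype with variables in place of values and fills it from the SPARQL results in the standard SPARQL JSON results format. The function name `shape`, the merging of rows that share the same id, the list-valued properties and the example URIs are assumptions made for this illustration.

```python
from typing import Any, Dict, List


def shape(template: Dict[str, str], sparql_json: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Fill the prototype template with the bindings of a SPARQL JSON result."""
    merged: Dict[str, Dict[str, Any]] = {}
    id_variable = template["id"].lstrip("?")
    for binding in sparql_json["results"]["bindings"]:
        anchor = binding[id_variable]["value"]
        # Assumption: rows sharing the same id are merged into one object.
        obj = merged.setdefault(anchor, {"id": anchor})
        for key, variable in template.items():
            if key == "id":
                continue
            cell = binding.get(variable.lstrip("?"))
            if cell is None:  # unbound variable in this row
                continue
            values = obj.setdefault(key, [])
            if cell["value"] not in values:
                values.append(cell["value"])
    return list(merged.values())


# Hypothetical result with two rows for the same work (labels in two languages).
example_template = {"id": "?work", "name": "?v1", "image": "?v2"}
example_results = {
    "head": {"vars": ["work", "v1", "v2"]},
    "results": {"bindings": [
        {"work": {"type": "uri", "value": "http://example.org/work/1"},
         "v1": {"type": "literal", "xml:lang": "en", "value": "St. John the Baptist"},
         "v2": {"type": "uri", "value": "http://example.org/work/1.jpg"}},
        {"work": {"type": "uri", "value": "http://example.org/work/1"},
         "v1": {"type": "literal", "xml:lang": "it", "value": "San Giovanni Battista"},
         "v2": {"type": "uri", "value": "http://example.org/work/1.jpg"}},
    ]},
}
shaped = shape(example_template, example_results)
```

Merging on the id is what turns one row per label into a single object with several names.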

97 of 105

JSON-LD VERSION

Query Object

{
  "@context": "http://schema.org/",
  "@graph": {
    "@id": "?work",
    "@type": "Painting",
    "name": "$rdfs:label$required",
    "image": "$foaf:depiction$required"
  },
  "$where": ["?work a dbo:Work", "?work dbo:museum dbr:Louvre"],
  "$limit": 100
}

JSON Output

{
  "@context": "http://schema.org/",
  "@graph": [
    {
      "@id": "http://.../St._John_the_Baptist_(Leonardo)",
      "@type": "Painting",
      "name": [
        {"@language": "en", "@value": "St. John the Baptist (Leonardo)"},
        {"@language": "it", "@value": "San Giovanni Battista (Leonardo)"}
      ],
      "image": "http://..._C2RMF_retouched.jpg"
    },
    ...
  ]
}

98 of 105

Exploiting the Music knowledge Playlists and Weights

II.B

99 of 105

F test statistic = variance between / variance within
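As a worked illustration of this ratio, the sketch below computes the one-way ANOVA F statistic for one feature measured in several groups. The function name and the sample values are hypothetical; how the statistic is applied to playlists is not shown on this slide.

```python
from typing import Sequence


def f_statistic(groups: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA F: between-group variance over within-group variance."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares around each group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values of one feature in three groups.
example_f = f_statistic([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

For these values the between-group variance is 26 / 2 = 13 and the within-group variance is 6 / 6 = 1, so F = 13: a large value means the feature separates the groups far more than it varies inside them.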

100 of 105

Title2Rec: training

word2vec

track vectors

pw2v

mean of tracks in each playlist

k-means

preprocessing

titles of playlist

concatenate

k documents

titles not involved

just titles involved

titles and embs involved

fastText

t2r model

k clusters

track sequences

MPD

101 of 105

Title2Rec: predicting

fastText

t2r model

fastText

cosine similarity

known playlists

new playlist

title

title

title vector

for each playlist

title vector

of the new playlist

P

most similar playlists

recommendations

most popular tracks in P
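The prediction step can be sketched as follows, assuming the title vectors have already been computed (on the slide they come from the fastText t2r model, which is not reproduced here). The function names, the dictionary layout and the toy two-dimensional vectors are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if one of them is null)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(
    title_vector: Sequence[float],
    known_playlists: List[Dict],
    p: int = 2,
    n_tracks: int = 3,
) -> List[str]:
    """Recommend the most popular tracks among the P most similar playlists."""
    # Rank known playlists by similarity of their title vector to the new one.
    ranked = sorted(
        known_playlists,
        key=lambda playlist: cosine(title_vector, playlist["vector"]),
        reverse=True,
    )
    # Count how often each track occurs in the P most similar playlists.
    counts = Counter(track for playlist in ranked[:p] for track in playlist["tracks"])
    return [track for track, _ in counts.most_common(n_tracks)]


# Hypothetical toy data: title vectors are 2-dimensional for readability.
known = [
    {"vector": [1.0, 0.0], "tracks": ["track_a", "track_b"]},
    {"vector": [0.9, 0.1], "tracks": ["track_b", "track_c"]},
    {"vector": [0.0, 1.0], "tracks": ["track_z"]},
]
top_track = recommend([1.0, 0.0], known, p=2, n_tracks=1)
```

Here `track_b` is recommended first because it appears in both of the two most similar playlists, while `track_z` belongs to a dissimilar playlist and is never considered.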

102 of 105

Experiment: metadata prediction

Dataset 1: SLAC

250 MIDI, balanced on 5/10 classes

Accuracies on cross-fold validation:

Dataset 2: MuseData

438 MIDI, unbalanced, linked to DOREMUS

Accuracies on cross-fold validation:

Exploiting the Music knowledge Learning MIDI Embeddings

II.C

103 of 105

Experiment: metadata prediction

Dataset 1: SLAC

Subclasses of the same class

Hardcore Rap and Pop Rap

104 of 105

Experiment: metadata prediction

Dataset 2: MuseData

105 of 105

Experiment: metadata prediction

Dataset 2: MuseData

Exploiting the Music knowledge Learning MIDI Embeddings

II.C