DOREMUS
a Graph of Interlinked Musical Works
Pasquale Lisena
EURECOM, France
M. Achichi, P. Lisena, K. Todorov, R. Troncy, J. Delahousse
2
Which works did Mozart compose before the age of 10?
How many works were composed and first performed in the same city?
Which composers had the chance to direct their own work in a performance during the last decade?
3
Music knowledge graph
metadata about artists, works, performances, scores
Tools for converting and interlinking
used for building the knowledge graph
open-source, reusable
4
Music is complex
5
M. Lasar (2011). Digging into Pandora’s Music Genome with musicologist Nolan Gasser. https://arstechnica.com/tech-policy/2011/01/digging-into-pandoras-music-genome-with-musicologist-nolan-gasser/
When it comes to classical music, on the other hand, it's much more about the composition itself, because even though the interpretation can vary in various subtle ways [...]
CLASSICAL
POP
VS
For pop music the experience of the music is really defined by the recording.
6
POP | CLASSICAL |
Track-based | Work-based |
60 years of history | A thousand years, from Gregorian chant to a work written last Tuesday |
Songs | Multi-movement works |
Major, minor | Polyphonic, homophonic, monophonic |
7
8
Music archives have
very detailed knowledge
PROBLEMS
APPROACH
Semantic Web!
9
Improve music description to foster music exchange and reuse
Travel to the heart of the musical archives in France’s greatest institutions
Connect sources, multiply usage, enrich user experience
10
Building the DOREMUS graph
DATA MODELING
DATA CONVERSION: marc2rdf, string2vocabulary, ...custom converters
DATA LINKING: legato
LINK VALIDATION
11
The DOREMUS Model
FRBR
museum information
bibliographic records
DATA MODELING
Choffé, Pierre, and Françoise Leresche. DOREMUS: connecting sources, enriching catalogues and user experience. In 24th IFLA World Library and Information Congress. 2016.
12
The building blocks
Work-Expression-Event
F28 Expression Creation -> R19 created a realization of -> F14 Work
F28 Expression Creation -> R17 created -> F22 Expression
F14 Work -> R3 is realized in -> F22 Expression
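The Work-Expression-Event pattern can be sketched with plain tuples. This is only an illustration: the identifiers `creation/1`, `work/1` and `expression/1` and the helper names are hypothetical, and the class and property labels are copied from the slide rather than taken from the ontology files.

```python
# Each triple is (subject, predicate, object); identifiers are hypothetical.
TRIPLES = [
    ("creation/1", "rdf:type", "F28 Expression Creation"),
    ("work/1", "rdf:type", "F14 Work"),
    ("expression/1", "rdf:type", "F22 Expression"),
    ("creation/1", "R19 created a realization of", "work/1"),
    ("creation/1", "R17 created", "expression/1"),
    ("work/1", "R3 is realized in", "expression/1"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` through `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def creation_is_consistent(creation, triples=TRIPLES):
    """Check that each expression created by the event realizes the work the event is about."""
    works = objects(creation, "R19 created a realization of", triples)
    expressions = objects(creation, "R17 created", triples)
    return all(
        e in objects(w, "R3 is realized in", triples)
        for w in works
        for e in expressions
    )


print(objects("work/1", "R3 is realized in"))
print(creation_is_consistent("creation/1"))
```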
13
F28 Expression Creation
  R17 created -> F22 Expression
  R19 created a realization of -> F14 Work
  P4 has time span -> 1796
  P9 consists of -> E7 Activity
    P14 carried out by -> Ludwig van Beethoven (also "Ludwig von Beethoven"); P98 born 1770, P100 died 1827
    U31 had function of type -> composer (compositeur@fr, compositore@it)
F14 Work
  R3 is realized in -> F22 Expression
F22 Expression
  P102 has title -> “Sonate pour violoncelle et piano no 1”@fr, “Sonates”, “Sonata in F”
  U17 has opus statement -> M2 Opus Statement (number 5, subnumber 1)
  U12 has genre -> Sonata (sonata@it, sonate@fr, klaviersonate@de)
  U11 has key -> F Major (F Dur@de, Fa majeur@fr, Fa maggiore@it, Fa mayor@es)
  U13 has casting -> M6 Casting
    M23 Casting Detail: U30 quantity 1, U2 foresees mop Piano (Pianoforte@it, Fortepian@pl)
    M23 Casting Detail: U30 quantity 1, U2 foresees mop Cello (Violoncello@it, Violoncelle@fr)
M42 Performed Expression Creation
  P4 has time span -> 1796
  P7 took place at -> Berlin
M43 Performed Expression
  U54 is performed expression of -> F22 Expression
F30 Publication Event
  P4 has time span -> 1797
  P7 took place at -> Vienna
F24 Publication Expression
  P165 incorporates -> F22 Expression
Also in the figure: F15 Complex Work, F19 Publication Work, M44 Performed Work, with the properties R10 has member, U4 had princeps publication, U5 had premiere, U38 has descriptive expression
14
F22 Expression
  U13 has casting -> M6 Casting
    M23 Casting Detail: U30 quantity 1, U2 foresees mop Piano (Pianoforte@it, Fortepian@pl)
    M23 Casting Detail: U30 quantity 1, U2 foresees mop Cello (Violoncello@it, Violoncelle@fr)
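A casting can be pictured as a small data structure: one M6 Casting holding several M23 Casting Details, each with a medium of performance and a quantity. The sketch below is illustrative only; the class and method names are hypothetical and not part of any DOREMUS tool.

```python
from dataclasses import dataclass, field


@dataclass
class CastingDetail:
    """One M23 Casting Detail."""
    mop: str            # U2 foresees mop (medium of performance)
    quantity: int = 1   # U30 quantity


@dataclass
class Casting:
    """One M6 Casting, reached from an F22 Expression through U13 has casting."""
    details: list = field(default_factory=list)

    def total_performers(self):
        """Sum the quantities of all casting details."""
        return sum(d.quantity for d in self.details)

    def summary(self):
        """Render the casting as 'Piano (1), Cello (1)'."""
        return ", ".join(f"{d.mop} ({d.quantity})" for d in self.details)


# The cello sonata from the slide: one piano and one cello.
sonata_casting = Casting([CastingDetail("Piano"), CastingDetail("Cello")])
print(sonata_casting.summary(), sonata_casting.total_performers())
```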
Controlled Vocabularies for Music Metadata
GENRES (interlinked): Diabolo, IAML, Itema3, Redomi, RAMEAU
Medium of performance (interlinked): MIMO, Itema3, IAML, Diabolo, RAMEAU, Redomi
Musical keys, Modes, Catalogues, Derivation types, Functions
more available at http://data.doremus.org/vocabularies
23 families of vocabularies · 11,000+ concepts · 610 links between terms
published at ISMIR 2018
16
Dealing with different formats
Works: INTERMARC
Scores: INTERMARC
Discs: INTERMARC
Works: UNIMARC
Scores: UNIMARC
Performances: XML
Works - Recordings - Scores
3 different XML sources
A pre-digital archive format in Radio France
DATA CONVERSION
Source datasets
17
Works | Scores | Concerts | Discs | Format |
62 550 | 9 154 | 340 609 | 9 500 | XML |
6 846 | 30 319 | 5 164 | 8 602 | UNIMARC (works, scores), XML (concerts, discs) |
135 940 | 89 184 | - | - | INTERMARC |
Source datasets
18
DATASET
Works
Scores
Concerts
Discs
Classic work
Jazz improvisation
Ethnic/World/Traditional music
19
MARC FILE
001 FRBNF139081882FR
100 $313891295$w.0..b.....$aBeethoven$mLudwig van$d1770-1827
144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur
Subfields of 144: $w LANG (fre), $a TITLE, $b MOP, $p OPUS, $t KEY
“MARC must die” (Roy Tennant, 2002)
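Reading the displayed record line can be sketched in a few lines of Python. This is a minimal sketch for the textual display shown on the slide, not the marc2rdf parser: `parse_field` and `LABELS` are hypothetical names, and the label mapping for field 144 is the one annotated on the slide. Real MARC exchange files use a binary layout and would normally be read with a dedicated library.

```python
# Labels for the subfields of field 144, as annotated on the slide.
LABELS = {"w": "LANG", "a": "TITLE", "b": "MOP", "p": "OPUS", "t": "KEY"}


def parse_field(line):
    """Split '144 $aSonates$bPiano' into ('144', [('a', 'Sonates'), ('b', 'Piano')]).

    Control fields without '$' (such as 001) yield an empty subfield list.
    """
    tag, _, rest = line.partition(" ")
    subfields = [(chunk[0], chunk[1:]) for chunk in rest.split("$")[1:] if chunk]
    return tag, subfields


tag, subfields = parse_field(
    "144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur"
)
labelled = {LABELS[code]: value for code, value in subfields if code in LABELS}
print(tag, labelled)
```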
20
marc2rdf
MARC PARSER
MARC
files
mapping rules
21
144 $w....b.fre.$aSonates$bPiano$pOp. 27, no 2$tDo dièse mineur
UNIT OF INFORMATION: F22 Expression: Opus Number
PATH: F22 Self-Contained Expression U17 has opus statement M2 Opus Statement [U42 has opus number M12 Opus Number] + [U43 has opus subnumber M13 Opus Subnumber]
INTERMARC BNF:
  TUM : 144 $p, chain of digits
  TUM : 144 $p, chain of digits before the comma
TRANSFER RULE: Remove the abbreviation “Op.” before the number
EXAMPLE:
  144 $pOp. 352 --> M12 = 352
  144 $pOp. 27, no 2 --> M12 = 27, M13 = 2
MAPPING RULES
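The opus rule above can be sketched as a small function. It is an illustration of the rule as written on the slide, not the marc2rdf implementation; `parse_opus` is a hypothetical name, and treating the digits after the comma as the subnumber is read off the second example.

```python
import re


def parse_opus(subfield_p):
    """Return (opus_number, opus_subnumber) from the text of 144 $p."""
    text = re.sub(r"^\s*Op\.\s*", "", subfield_p)   # remove the "Op." abbreviation
    head, _, tail = text.partition(",")              # split at the first comma
    number = re.search(r"\d+", head)                 # digits before the comma -> M12
    subnumber = re.search(r"\d+", tail)              # digits after the comma -> M13
    return (
        number.group() if number else None,
        subnumber.group() if subnumber else None,
    )


print(parse_opus("Op. 352"))
print(parse_opus("Op. 27, no 2"))
```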
22
marc2rdf
MARC PARSER
FREE TEXT INTERPRETER
MARC
files
vocabularies
“1st performance in Moscow, December 29, 1956, by Mstislav Rostropovich on cello and A. Dedukhin on piano”
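A note like this one can be broken into place, date and performers with regular expressions. The sketch below only covers the exact phrasing of the example note and is not the interpreter used in marc2rdf; the pattern and the function name `interpret` are assumptions made for illustration.

```python
import re

# Pattern tailored to the example note; other phrasings would need other patterns.
NOTE = re.compile(
    r"1st performance in (?P<place>[^,]+), "
    r"(?P<date>[A-Z][a-z]+ \d{1,2}, \d{4}),\s+"
    r"by (?P<performers>.+)"
)
PERFORMER = re.compile(r"(?P<name>.+?) on (?P<mop>\w+)")


def interpret(note):
    """Extract place, date and (name, medium of performance) pairs, or None."""
    match = NOTE.search(" ".join(note.split()))   # collapse line breaks and extra spaces
    if match is None:
        return None
    performers = []
    for part in match["performers"].split(" and "):
        found = PERFORMER.fullmatch(part.strip())
        if found:
            performers.append((found["name"], found["mop"]))
    return {"place": match["place"], "date": match["date"], "performers": performers}


result = interpret(
    "1st performance in Moscow, December 29, 1956,\n"
    "by Mstislav Rostropovich on cello and A. Dedukhin on piano"
)
print(result)
```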
23
marc2rdf
MARC PARSER
FREE TEXT INTERPRETER
STRING 2 VOCABULARY
MARC
files
vocabularies
“Violoncelle”@fr
24
STRING 2 VOCABULARY
“Soprano”@it
MIMO IAML DIABOLO ITEMA3 REDOMI RAMEAU
GENRE
“C Major”@en
GENRE
vocabulary:key/c
KEY
vocabulary:key/c
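Replacing a literal with a vocabulary concept amounts to a lookup over normalized labels. The sketch below is a simplified illustration, not the string2vocabulary code: the labels come from the earlier slides, while the concept identifiers are hypothetical and merely shaped like the `vocabulary:key/c` shown above. Language tags are ignored here.

```python
import unicodedata

# Labels taken from the slides; concept identifiers are hypothetical.
VOCABULARY = {
    "vocabulary:mop/cello": ["Cello", "Violoncello", "Violoncelle"],
    "vocabulary:mop/piano": ["Piano", "Pianoforte", "Fortepian"],
    "vocabulary:key/f": ["F Major", "F Dur", "Fa majeur", "Fa maggiore", "Fa mayor"],
}


def normalize(label):
    """Lowercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.casefold().split())


# Reverse index from normalized label to concept identifier.
INDEX = {
    normalize(label): concept
    for concept, labels in VOCABULARY.items()
    for label in labels
}


def string_to_concept(literal):
    """Return the concept whose label matches the literal, or None."""
    return INDEX.get(normalize(literal))


print(string_to_concept("Violoncelle"))
```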
25
INTERMARC -> marc2rdf -> GRAPH BNF
UNIMARC -> marc2rdf -> GRAPH PHILHARMONIE
EUTERPE XML -> euterpe converter -> GRAPH EUTERPE
ITEMA3 XML -> itema3 converter -> GRAPH ITEMA3
DIABOLO XML -> diabolo converter -> GRAPH DIABOLO
STRING 2 VOCABULARY
26
GRAPH BNF
GRAPH PHILHARMONIE
http://data.doremus.org/expression/d72301f0-0aba-3ba6-93e5-c4efbee9c6ea
“Quasi una fantasia”
COMPOSER: Beethoven
ORDER NUM: 14
OPUS: 27 n. 2
GENRE: sonata
CASTING: piano
KEY: C sharp minor
1st PUB: ?
PREMIERE: ?
sameAs
http://data.doremus.org/expression/37932fbc-fef3-3edb-9fae-1eec9b4be01d
“Sonata quasi una fantasia”
COMPOSER: Beethoven
ORDER NUM: 14
OPUS: 27, no 2
GENRE: sonata, romantic music
CASTING: piano (1)
KEY: C sharp minor
1st PUB: 1802, Vienna
PREMIERE: ?
27
DATA LINKING
Challenges
On the left: Beethoven.
On the right: (the same) Beethoven.
28
First Linking
Composer + Catalogue
Wolfgang Amadeus Mozart
Eine kleine Nachtmusik K 525
Wolfgang Amadeus Mozart
Serenade No. 13 in G major KV 525
sameAs
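The "Composer + Catalogue" idea can be sketched as a join on a normalized key. This is only an illustration under stated assumptions: the record identifiers are hypothetical, the function names are made up, and the pattern only handles Köchel numbers written as "K" or "KV", as in the Mozart example.

```python
import re


def catalogue_key(composer, title):
    """Build a (composer, catalogue number) key, or None when no number is found."""
    match = re.search(r"\bKV?\.?\s*(\d+)\b", title)
    if match is None:
        return None
    return (composer.casefold(), "K" + match.group(1))


def link(records_a, records_b):
    """Emit sameAs links between records sharing composer and catalogue number."""
    index = {}
    for uri, composer, title in records_b:
        key = catalogue_key(composer, title)
        if key is not None:
            index.setdefault(key, []).append(uri)
    return [
        (uri, "sameAs", other)
        for uri, composer, title in records_a
        for other in index.get(catalogue_key(composer, title), [])
    ]


# Hypothetical identifiers; composer and titles are the ones on the slide.
graph_a = [("a/1", "Wolfgang Amadeus Mozart", "Eine kleine Nachtmusik K 525")]
graph_b = [("b/1", "Wolfgang Amadeus Mozart", "Serenade No. 13 in G major KV 525")]
print(link(graph_a, graph_b))
```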
29
Legato
New linking system
Existing data linking systems were not satisfactory
30
* Works to be compared are grouped by composer
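Grouping by composer keeps the number of comparisons small: a work is only compared with works by the same composer. The sketch below illustrates that grouping step with a generic title similarity from the standard library. It is not Legato's matching algorithm; the similarity measure, the threshold and all identifiers are assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def candidate_links(source, target, threshold=0.8):
    """Compare titles only inside the same composer group; return (uri, uri, score)."""
    blocks = defaultdict(list)
    for uri, composer, title in target:
        blocks[composer].append((uri, title))

    links = []
    for uri, composer, title in source:
        for other_uri, other_title in blocks.get(composer, []):
            score = SequenceMatcher(
                None, title.casefold(), other_title.casefold()
            ).ratio()
            if score >= threshold:
                links.append((uri, other_uri, round(score, 2)))
    return links


# Hypothetical identifiers; the Beethoven titles are those of the sonata example.
source = [("a/1", "Beethoven", "Quasi una fantasia")]
target = [
    ("b/1", "Beethoven", "Sonata quasi una fantasia"),
    ("b/2", "Mozart", "Quasi una fantasia"),   # same title, other composer: never compared
]
links = candidate_links(source, target)
print(links)
```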
31
32
Legato performance at the OAEI 2017 campaign
DOREMUS: Heterogeneities Task, False Positive Trap
SPIMBENCH: sandbox, mainbox
33
LINK VALIDATION
SINGLE LINK: confidence score + experts’ validation
TRIANGLE: certain links
MISSING LINK: inference if experts’ validation
CONFLICT: remove with experts’ check
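The four link patterns can be detected by looking at connected groups of linked records. The sketch below assumes exactly three source graphs and reads the patterns as follows, which is an interpretation rather than something stated on the slide: a TRIANGLE has three records with all three links, a MISSING LINK has three records with only two links, a SINGLE LINK has two records, and a CONFLICT contains two records from the same graph. Identifiers such as `bnf/1` are hypothetical.

```python
from collections import defaultdict


def graph_of(uri):
    """Hypothetical convention: the graph name is the part before the slash."""
    return uri.split("/")[0]


def classify(links):
    """Group linked records into connected components and name each pattern."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, result = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:                       # depth-first walk over the links
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component

        edges = sum(len(neighbours[n]) for n in component) // 2
        graphs = [graph_of(n) for n in component]
        if len(set(graphs)) < len(graphs):
            pattern = "CONFLICT"
        elif len(component) == 2:
            pattern = "SINGLE LINK"
        elif edges == 3:
            pattern = "TRIANGLE"
        else:
            pattern = "MISSING LINK"
        result.append((sorted(component), pattern))
    return result


example = [
    ("bnf/1", "phil/1"), ("phil/1", "rf/1"), ("bnf/1", "rf/1"),
    ("bnf/2", "phil/2"), ("phil/2", "rf/2"),
    ("bnf/3", "phil/3"),
    ("bnf/4", "phil/4"), ("bnf/4", "phil/5"),
]
patterns = [pattern for _, pattern in classify(example)]
print(patterns)
```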
34
What is in the Knowledge Graph?
89.872 persons (composers, performers, …)
18.075 corporate bodies (orchestras, choruses, publishers, …)
357.451 musical works (16k components, 4k derived works)
193.412 concerts and studio recordings
469.131 performed works
3.833 foreseen concerts
31.296 publications
48.006 scores
35
Future Work
Applications
DOREMUS CHATBOT
DOREMUS website
THIS PRESENTATION
pasquale.lisena@eurecom.fr
37
Persons
9.269 | euterpe |
1.503 | diabolo |
9.040 | itema3 |
8.419 | philharmonie |
19.881 | bnf |
54.675 | bnf bib |
291.421 in the whole graph
89.872 active*
* with 1 or more compositions, performances, dedications, ...
1.479 | dedicatees |
529 | subjects |
21.626 | composers |
7.830 | conductors |
3.583 | performers |
13.242 | text authors |
38
Corporate Bodies
45.743 in the whole graph
18.075 active*
* with 1 or more compositions, performances, dedications, ...
1.001 | euterpe |
0 | diabolo |
39 | itema3 |
1.603 | philharmonie |
855 | bnf |
14.657 | bnf bib |
6 | dedicatees |
7 | subjects |
517 | orchestras + ensembles |
192 | choruses |
6.099 | publishers |
2.194 | producers |
39
Works
f15 | f14 | f22 | |
- | 10.587 | 10.587 | euterpe |
9.343 | 12.344 | 12.344 | diabolo |
- | 15.016 | 15.016 | itema3 |
5.762 | 14.527 | 14.875 | philharmonie |
135.749 | 134.973 | 134.973 | bnf |
245.069 | 223.357 | 279.641 | bnf bib |
420.733 expressions (including movements)
357.451 complex works
40
Works
16.132 | components* |
4.619 | arrangements |
293 | transcriptions |
43 | orchestration |
4.884 | total of derivations |
* movements, parts, acts, selections (extraits) ...
420.733 expressions (including movements)
357.451 complex works
41
Performances
193.065 concerts (performances)
5.702 converted from specific records
469.131 interpretations of 288.298 distinct works
f31 | m43 | |
2.294 | 2.294 | diabolo |
2.296 | 12.602 | itema3 |
7.107 | 47.119 | philharmonie |
14.115 | 15.221 | bnf |
165.225 | 387.519 | bnf bib |
42
Foreseen Concerts
3.833 concerts
13.520 interpretations of 10.759 distinct works
m26 | f25 > f22 | |
3.833 | 13.520 | euterpe |
17 artistic seasons
281 cycles
33 festivals
43
Recordings
397.597 recordings
15.267 supports
f26 | f4 | f3 | |
2.296 | 2.842 | - | itema3 |
3.406 | 11.681 | - | philharmonie |
392.020 | 744 | 199.339 | bnf bib |
198.693 publications
44
Scores
31.296 publications
48.006 scores
44.668 distinct works
f24 | f24 > f22 | |
31.296 | 48.006 | bnf bib |