1 of 7

We’ll never get a unified biographical data model

Eetu Mäkelä

Semantic Computing Research Group (SeCo)

Department of Computer Science

eetu.makela@aalto.fi - http://seco.cs.aalto.fi/u/jiemakel/

This presentation:

http://j.mp/bdm-dh

2 of 7

Differing Targets of Modeling

  • Text itself
    • Writer’s point of view, style, focus etc.
  • Life of the person
    • Life events, accomplishments, occupations, relations etc.
  • In between
    • Sentence level semantic annotation
      • “He sadly passed away in 1962 because of severe cerebrovascular disease” <- frame: Death, time: in 1962
      • “While in Paris, he frequented lots of cafes” <- visitor: he, place visited: cafes
      • “He was disliked by his peers” <- disliker: peers, object of disaffection: he
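
As a rough sketch of what sentence-level annotation could look like as data, the three examples above can be written as role-to-filler mappings. The `FrameAnnotation` class and the `fillers_in_sentence` helper are hypothetical names used for illustration only. The slide names a frame only for the first sentence, so the others are left without one.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FrameAnnotation:
    """Hypothetical container for one annotated sentence."""
    sentence: str
    roles: Dict[str, str]        # role name -> text filling that role
    frame: Optional[str] = None  # only given for the first slide example


annotations = [
    FrameAnnotation(
        "He sadly passed away in 1962 because of severe cerebrovascular disease",
        {"time": "in 1962"},
        frame="Death",
    ),
    FrameAnnotation(
        "While in Paris, he frequented lots of cafes",
        {"visitor": "he", "place visited": "cafes"},
    ),
    FrameAnnotation(
        "He was disliked by his peers",
        {"disliker": "peers", "object of disaffection": "he"},
    ),
]


def fillers_in_sentence(annotation: FrameAnnotation) -> bool:
    """Check that every role filler occurs in the sentence, ignoring case."""
    text = annotation.sentence.lower()
    return all(filler.lower() in text for filler in annotation.roles.values())
```

The annotation stays tied to the wording of the text, which is what places it between modeling the text and modeling the life.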

3 of 7

Differing Viewpoints

  • Attribute (birthdate on a person) vs event (birth event with person as participant)
  • For events, transitional (start of marriage, end of marriage) vs attributional (marriage as event)
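
The two viewpoints above can be sketched in code. This is a minimal illustration, not a proposed schema: the class names, field names, people, and dates are all made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PersonWithAttributes:
    """Attribute view: the birthdate is a plain property of the person."""
    name: str
    birth_date: Optional[str] = None


@dataclass
class Event:
    """Event view: the event is its own record with the person as participant."""
    kind: str
    participants: List[str]
    start: Optional[str] = None
    end: Optional[str] = None


# Attribute vs event: the same (invented) fact in both shapes.
attribute_view = PersonWithAttributes(name="Person A", birth_date="1900-01-01")
birth_event = Event("Birth", ["Person A"], start="1900-01-01", end="1900-01-01")

# Attributional: the marriage is one event spanning its whole duration.
marriage = Event("Marriage", ["Person A", "Person B"], start="1925", end="1940")

# Transitional: two point events mark the start and the end of the marriage.
marriage_start = Event("MarriageStart", ["Person A", "Person B"], start="1925", end="1925")
marriage_end = Event("MarriageEnd", ["Person A", "Person B"], start="1940", end="1940")
```

Both shapes carry the same dates. They differ in what a query has to traverse to reach them.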

4 of 7

Differing Information Depths

  • Simple attribute vs structured event
  • For attributes, qualified (occupation: doctor, 1910-1935, source: ODNB) vs unqualified (occupation: doctor)
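
A small sketch of the qualified vs unqualified distinction, using the occupation example above. `QualifiedValue` and `strip_qualifiers` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class QualifiedValue:
    """An attribute value plus optional qualifiers (validity period, source)."""
    value: str
    start_year: Optional[int] = None
    end_year: Optional[int] = None
    source: Optional[str] = None


# Unqualified: just the value.
unqualified = {"occupation": "doctor"}

# Qualified: the same value with its period and source.
qualified = {"occupation": QualifiedValue("doctor", 1910, 1935, "ODNB")}


def strip_qualifiers(record: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a qualified record to an unqualified one by dropping qualifiers."""
    return {
        key: value.value if isinstance(value, QualifiedValue) else value
        for key, value in record.items()
    }
```

Going from qualified to unqualified is mechanical but lossy. The reverse direction needs information the unqualified record does not contain.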

5 of 7

Differing Choices

  • Target of modeling
  • Attribute vs event
  • For attributes, qualified vs unqualified
  • For attributes, distinct property (officiant: John) vs qualified general property (participant: John, role: officiant)
  • For events, transitional vs attributional

What to choose depends mainly on the needs of the use case. None of the choices is intrinsically superior.
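
The distinct-property vs qualified-general-property choice can be illustrated with the officiant example. This is a sketch under assumptions: the "Wedding" event type and the dictionary layout are invented here, and `to_general` and `to_distinct` are hypothetical helpers.

```python
from typing import Any, Dict, List

# Distinct property: the role is encoded in the property name.
distinct = {"type": "Wedding", "officiant": "John"}

# Qualified general property: one generic property, with the role as a qualifier.
general = {
    "type": "Wedding",
    "participants": [{"participant": "John", "role": "officiant"}],
}


def to_general(event: Dict[str, Any], role_properties: List[str]) -> Dict[str, Any]:
    """Fold the listed role-specific properties into a generic participants list."""
    result = {k: v for k, v in event.items() if k not in role_properties}
    result["participants"] = [
        {"participant": event[prop], "role": prop}
        for prop in role_properties
        if prop in event
    ]
    return result


def to_distinct(event: Dict[str, Any]) -> Dict[str, Any]:
    """Unfold a participants list into one property per role.

    If two participants share a role, the later one overwrites the earlier.
    """
    result = {k: v for k, v in event.items() if k != "participants"}
    for entry in event.get("participants", []):
        result[entry["role"]] = entry["participant"]
    return result
```

In this simple case the two forms convert into each other without loss, which is consistent with neither being intrinsically superior.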

6 of 7

Lack of common models

Intuition: We currently lack common models because each project so far has answered the preceding questions with a slightly different combination of choices

Solution: Enumerate the different modeling possibilities and organize best-practice patterns and vocabularies under them. Projects could then mix and match a combination that suits their needs while still maintaining as much interoperability as possible

In addition, rules can be created to map between the models
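
As a hedged illustration of what such mapping rules might look like, the sketch below turns an attribute into an event and pairs transitional events into an attributional one. The record layout, the event type names, and both function names are assumptions made for this example, not an existing vocabulary.

```python
from typing import Any, Dict, List


def attribute_to_event(person: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Rule: a birth_date attribute becomes a Birth event with the person as participant."""
    events = []
    if "birth_date" in person:
        events.append({
            "type": "Birth",
            "participants": [person["name"]],
            "date": person["birth_date"],
        })
    return events


def transitional_to_attributional(
    events: List[Dict[str, Any]], start_type: str, end_type: str, span_type: str
) -> List[Dict[str, Any]]:
    """Rule: pair a start and an end event with the same participants into one span.

    Assumes events are in chronological order. Unmatched starts or ends are ignored.
    """
    open_starts: Dict[frozenset, Dict[str, Any]] = {}
    spans = []
    for event in events:
        key = frozenset(event["participants"])
        if event["type"] == start_type:
            open_starts[key] = event
        elif event["type"] == end_type and key in open_starts:
            start = open_starts.pop(key)
            spans.append({
                "type": span_type,
                "participants": start["participants"],
                "start": start["date"],
                "end": event["date"],
            })
    return spans


# Invented example data.
person = {"name": "Person A", "birth_date": "1900-01-01"}
transitions = [
    {"type": "MarriageStart", "participants": ["Person A", "Person B"], "date": "1925"},
    {"type": "MarriageEnd", "participants": ["Person B", "Person A"], "date": "1940"},
]
spans = transitional_to_attributional(
    transitions, "MarriageStart", "MarriageEnd", "Marriage"
)
```

Rules like these only run cleanly from the richer model to the simpler one, or between models of equal depth.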

7 of 7

Conclusion: We’ll never get a unified biographical data model - but maybe we can get three or four of them that are mapped to each other