Albert Meroño-Peñuela
Interfacing Human-Machine Intelligence with Cultural AI
Albert Meroño Peñuela, PhD
Machine intelligence is not enough
How can we combine Human and Machine Intelligence?
By “endowing machines with a deeper understanding of the world”
“Some significant fraction of the knowledge that a robust system is likely to draw on is external, cultural knowledge that is symbolically represented” [Marcus arXiv 2020]
CULTURE
Cultural AI is crucial for Hybrid Intelligence (CARE)
Real-world domains: Culture embodies challenging human activities
AI as a co-author as paramount example
HYBRID INTELLIGENCE
My research on Interfacing Human-Machine Intelligence with Cultural AI
Research Highlights
Knowledge Graph Construction
Text Generation
Interfaces
Social Querying
Interactive Learning & Reasoning
Knowledge Graph Completion
KGC from structured data
Scalable and intelligent KG construction algorithms for
CEDAR KG (660M triples)
[Meroño et al. SWJ 2015]
MIDI KG (10B triples)
[Meroño et al. ISWC 2017]
CLARIAH KG (865M triples)
[Hoekstra, Meroño et al. JWS 2018]
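As a rough illustration of what knowledge graph construction from structured data involves, the sketch below turns the rows of a small table into subject-predicate-object triples. The namespace, column names, and sample rows are hypothetical and are not taken from the CEDAR, MIDI, or CLARIAH graphs; the actual pipelines are described in the cited papers.

```python
import csv
import io

# Hypothetical namespace and sample table, used only for illustration.
BASE = "http://example.org/census/"
SAMPLE_CSV = "municipality,year,population\nTownA,1899,1000\nTownB,1899,2000\n"


def rows_to_triples(csv_text):
    """Turn every cell of every CSV row into a (subject, predicate, object) triple."""
    triples = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        # One subject per row: each row is treated as a single observation.
        subject = f"{BASE}observation/{i}"
        for column, value in row.items():
            # One predicate per column header.
            predicate = f"{BASE}def/{column}"
            triples.append((subject, predicate, value))
    return triples


if __name__ == "__main__":
    for triple in rows_to_triples(SAMPLE_CSV):
        print(triple)
```

Minting one subject per row and one predicate per column is the simplest possible mapping; a real pipeline would also reuse shared vocabularies and link values to existing entities.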
Ethics by design: privacy reasoning
[Casellas, Nieto, Meroño et al. AAAI 2010]
COLLABORATIVE HI
RESPONSIBLE HI
Efficient semantic list management
[Meroño et al. ISWC 2019] [Daga, Meroño et al. QuWeDa 2019]
What list models have socially emerged from the Web?
COLLABORATIVE HI
Efficient semantic list management
What is the impact of these models in query performance?
ADAPTIVE HI
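To make the question about list models and query performance more concrete, the sketch below encodes the same list in two common RDF styles, a linked list (rdf:first / rdf:rest) and a sequence container (rdf:_1, rdf:_2, ...), and counts the lookups needed to reach the n-th item. This is an in-memory analogy under simplifying assumptions, not the benchmark of the cited papers; the node names and helper functions are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def as_linked_list(items, prefix="_:n"):
    """rdf:List style: each node holds one item (rdf:first) and a link to the next node (rdf:rest)."""
    triples = []
    for i, item in enumerate(items):
        node = f"{prefix}{i}"
        nxt = f"{prefix}{i + 1}" if i + 1 < len(items) else RDF + "nil"
        triples.append((node, RDF + "first", item))
        triples.append((node, RDF + "rest", nxt))
    return triples


def as_sequence(items, node="_:seq"):
    """rdf:Seq style: a single node with numbered membership properties."""
    return [(node, f"{RDF}_{i}", item) for i, item in enumerate(items, start=1)]


def nth_in_linked_list(triples, head, n):
    """Return (item, lookups) for the n-th item (0-based) by following rdf:rest links."""
    index = {(s, p): o for s, p, o in triples}
    node, lookups = head, 0
    for _ in range(n):
        node = index[(node, RDF + "rest")]
        lookups += 1
    return index[(node, RDF + "first")], lookups + 1


def nth_in_sequence(triples, node, n):
    """Return (item, lookups) for the n-th item (0-based) with one direct lookup."""
    index = {(s, p): o for s, p, o in triples}
    return index[(node, f"{RDF}_{n + 1}")], 1


if __name__ == "__main__":
    notes = ["C4", "E4", "G4", "C5"]
    print(nth_in_linked_list(as_linked_list(notes), "_:n0", 3))
    print(nth_in_sequence(as_sequence(notes), "_:seq", 3))
```

In this toy setting the linked list needs a number of lookups that grows with the position of the item, while the sequence needs one; how real triple stores behave depends on the engine and the query.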
Deterministic multimodal KG completion
[Meroño et al. SAAM 2018]
Leveraging melody and text for entity linking:
Graph embeddings link music to high-level features
[Lisena, Meroño et al. TISMIR 2020]
“adagio”
Symbolic music distributional hypothesis:
Can we relate groups of notes in similar contexts with similar meanings?
ADAPTIVE HI
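The distributional question above can be illustrated with a toy example: represent each group of notes by the counts of its neighbouring groups, then compare those context vectors. This is a simplified co-occurrence stand-in, not the graph-embedding method of the cited paper; the chord sequence, the window size, and the function names are assumptions made for the sketch.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sequence, window=1):
    """Map each token to a Counter of the tokens seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for i, token in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[token][sequence[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical progression; each token is a group of simultaneous notes.
    progression = ["C-E-G", "F-A-C", "G-B-D", "C-E-G", "D-F-A", "G-B-D", "C-E-G"]
    vectors = context_vectors(progression)
    print(cosine(vectors["F-A-C"], vectors["D-F-A"]))
    print(cosine(vectors["F-A-C"], vectors["C-E-G"]))
```

In the toy progression, "F-A-C" and "D-F-A" occur between the same neighbours, so their context vectors are closer to each other than either is to "C-E-G".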
MIDI2vec scales to Web size
[Lisena, Meroño et al. TISMIR 2020]
SLAC dataset (250 MIDI files, high level features)
MuseData (438 MIDI files)
EchoNest: scaling up to Web size
ADAPTIVE HI
Web-scale music mashups
[Meerwaldt, Meroño et al. WHiSe 2017]
COLLABORATIVE HI
Song 1
Song 2
Mashup
Music can be learned by example...
Music as a language model
[Wilschut, Wijtsma, Meroño et al. 2020] (in progress)
ADAPTIVE HI
… but it is best learned socially & interactively
Reasoning
Learning
Mutation
[Miras, Meroño et al. 2020] (in progress)
ADAPTIVE HI
Evolutionary algorithm
Social feedback
COLLABORATIVE HI
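The loop of mutation, evolutionary selection, and social feedback named on this slide could look roughly like the sketch below. It is only a schematic under stated assumptions: melodies are lists of MIDI pitch numbers, and `simulated_feedback` is a hypothetical stand-in for ratings that would come from listeners; the method of the work in progress is not described in the slides.

```python
import random


def mutate(melody, rng, step=2):
    """Copy the melody and shift one randomly chosen pitch by up to `step` semitones."""
    child = list(melody)
    i = rng.randrange(len(child))
    child[i] += rng.randint(-step, step)
    return child


def simulated_feedback(melody):
    """Hypothetical stand-in for listener ratings: smaller melodic leaps score higher."""
    return -sum(abs(a - b) for a, b in zip(melody, melody[1:]))


def evolve(population, feedback, generations=20, seed=0):
    """Mutate every melody, then keep the best-rated half of parents plus offspring."""
    rng = random.Random(seed)
    for _ in range(generations):
        offspring = [mutate(m, rng) for m in population]
        ranked = sorted(population + offspring, key=feedback, reverse=True)
        population = ranked[:len(population)]
    return population


if __name__ == "__main__":
    start = [[60, 72, 55, 67], [60, 48, 70, 62]]
    for melody in evolve(start, simulated_feedback):
        print(melody, simulated_feedback(melody))
```

Because parents compete with their offspring, the best rating in the population never decreases from one generation to the next; replacing `simulated_feedback` with real ratings is where the social part would enter.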
Human-machine querying
[Meroño et al. ESWC 2016, ISWC 2017] [Lisena, Meroño et al. ISWC 2019]
body
query info
access
parameters
FAIRquery
COLLABORATIVE HI
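The labels on this slide (query info, parameters, body, access) suggest a stored query that is exposed as a Web API operation. The sketch below shows one way such a description could be derived from a query file. The `#+` comment lines and the `?_` parameter prefix are assumptions modelled loosely on the conventions of the cited query-centric API work; the query, the example.org property, and the function names are hypothetical.

```python
import re

# Hypothetical stored query: comment lines carry metadata, ?_composer is an API parameter.
QUERY = """#+ summary: Songs by a given composer
#+ method: GET
SELECT ?song WHERE {
  ?song <http://example.org/composer> ?_composer .
}
"""


def describe_query(name, text):
    """Split a stored query into API path, metadata (query info), parameters, and body."""
    info = dict(re.findall(r"^#\+\s*(\w+):\s*(.+)$", text, flags=re.MULTILINE))
    parameters = sorted(set(re.findall(r"\?_(\w+)", text)))
    body = "\n".join(line for line in text.splitlines() if not line.startswith("#+"))
    return {"path": f"/{name}", "info": info, "parameters": parameters, "body": body}


def bind(body, values):
    """Replace parameter placeholders with quoted literals (no escaping: sketch only)."""
    for key, value in values.items():
        pattern = rf"\?_{re.escape(key)}\b"
        body = re.sub(pattern, lambda match, v=value: f'"{v}"', body)
    return body


if __name__ == "__main__":
    api = describe_query("songs_by_composer", QUERY)
    print(api["path"], api["info"], api["parameters"])
    print(bind(api["body"], {"composer": "Bach"}))
```

A human writes and curates the query once; machines then call the resulting operation with parameter values, without needing to know the query language.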
Explanations from provenance graphs
Social, selective explanations
[Hoover et al. arXiv 2019]
EXPLAINABLE HI
COLLABORATIVE HI
[Groth & Moreau 2013]
[Hoekstra & Groth PAW 2014]
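A minimal sketch of how an explanation could be read off a provenance graph: starting from a result, follow PROV-style relations backwards and verbalise each one, stopping after a fixed number of hops so that the explanation stays selective. The relation names follow the PROV vocabulary cited above; the example statements, the sentence templates, and the hop limit are assumptions made for illustration.

```python
# Hypothetical provenance statements as (subject, relation, object) triples.
PROV = [
    ("chart", "wasGeneratedBy", "plotting"),
    ("plotting", "used", "clean_table"),
    ("clean_table", "wasGeneratedBy", "cleaning"),
    ("cleaning", "used", "raw_table"),
    ("cleaning", "wasAssociatedWith", "curator"),
]

# Sentence templates for the relations this sketch knows how to verbalise.
TEMPLATES = {
    "wasGeneratedBy": "{s} was generated by {o}",
    "used": "{s} used {o}",
    "wasAssociatedWith": "{s} was carried out by {o}",
}


def explain(node, statements, max_hops=2):
    """Follow provenance relations backwards from `node`, at most `max_hops` edges deep."""
    lines = []
    frontier = [node]
    for _ in range(max_hops):
        next_frontier = []
        for current in frontier:
            for s, p, o in statements:
                if s == current and p in TEMPLATES:
                    lines.append(TEMPLATES[p].format(s=s, o=o))
                    next_frontier.append(o)
        frontier = next_frontier
    return lines


if __name__ == "__main__":
    for line in explain("chart", PROV, max_hops=2):
        print(line)
```

Raising `max_hops` yields a longer and more complete account; keeping it small gives the short, selective kind of explanation a human listener is more likely to want.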
AI as a co-author
[Radford et al. OpenAI 2019]
Generating explanations
Explanation: “An explanation for the absence of effective and general methods for ontology matching is that we are unable to match domain ontologies on all of the instances, and however, we have already performed several tests on ontologies with several hundred classes and over a million instances and found that they do not match most features of the domain ontology.”
Explanation: “A simple explanation for the difficulty of solving large scale ontology reasoning is that we tend to solve small problems by imposing very big and complex rules. We often end up with very large portions of ontologies that cannot be represented using standard reasoners.”
Explanation: “Machine learning techniques are used for the task of entity linking because it is a challenging task for the user. Therefore, we propose a novel method that is scalable to large knowledge bases with a high number of facts and a high accuracy.”
[Meroño et al. ESWC 2020] (under review)
EXPLAINABLE HI
Generating hypotheses and definitions
Hypothesis: “In this paper, our hypothesis is that a new ontology can be derived from an existing one. The key idea is to combine ontology-based and knowledge-based approaches.”
Definition: “The Semantic Web is a rich and evolving web of interconnected resources with many different types of content, such as web pages, blogs, videos, music plays, and so on.”
[Meroño et al. ESWC 2020] (under review)
EXPLAINABLE HI
Hybrid papers: Human drives, machine writes
ADAPTIVE HI
[Meroño et al. ESWC 2020]
(under review)
Inspired by
[van Harmelen & ten Teije JWE 2019]
ADAPTIVE HI
RESPONSIBLE HI
COLLABORATIVE HI
EXPLAINABLE HI
Human? Crowds? Human in the loop?
Conclusions
COLLABORATIVE HI
ADAPTIVE HI
Thank you
Mentors
Students (esp. Rick Meerwaldt, Nina Wilschut, Stefan Wijtsma)
Cited Contributions (1/2)
[Casellas, Nieto, Meroño et al. AAAI 2010] Casellas, N., Nieto, J-E., Meroño, A., Roig, A., Torralba, S., Reyes de los Mozos, M., Casanovas, P. “Ontological Semantics for Data Privacy Compliance: the NEURONA Ontology”, AAAI Spring Symposium Series Technical Reports (Intelligent Information Privacy Management), Stanford 23rd-25th of March 2010.
[Daga, Meroño et al. QuWeDa 2019] Enrico Daga, Albert Meroño-Peñuela, Enrico Motta. “Modelling and Querying Lists in RDF. A Pragmatic Study”. In: 3rd Workshop on Querying and Benchmarking the Web of Data (QuWeDa 2019), ISWC 2019, 18th International Semantic Web Conference (2019).
[Hoekstra, Meroño et al. JWS 2018] Rinke Hoekstra, Albert Meroño-Peñuela, Auke Rijpma, Richard Zijdeman, Ashkan Ashkpour, Kathrin Dentler, Ivo Zandhuis, Laurens Rietveld. “The dataLegend Ecosystem for Historical Statistics”. Journal of Web Semantics: Science, Services and Agents on the World Wide Web, volume 50, pp. 49-61 (2018).
[Lisena, Meroño et al. ISWC 2019] Pasquale Lisena, Albert Meroño-Peñuela, Tobias Kuhn, Raphaël Troncy. “Easy Web API Development with SPARQL Transformer”. In: The Semantic Web – ISWC 2019, 18th International Semantic Web Conference. Lecture Notes in Computer Science, vol 11779, pp. 454-470 (2019).
[Lisena, Meroño et al. TISMIR 2020] Pasquale Lisena, Albert Meroño-Peñuela, Raphaël Troncy. MIDI2vec: Learning MIDI Embeddings for Reliable Prediction of Symbolic Music Metadata. Transactions of the International Society for Music Information Retrieval (TISMIR) (2020)
[Meerwaldt, Meroño et al. WHiSe 2017] Rick Meerwaldt, Albert Meroño-Peñuela, Stefan Schlobach. “Mixing Music as Linked Data: SPARQL-based MIDI Mashups”. In: Proceedings of the 2nd Workshop on Humanities in the SEmantic web (WHiSe 2017). ISWC 2017, October 22nd, Vienna, Austria (2017).
[Meroño et al. SWJ 2015] Albert Meroño-Peñuela, Ashkan Ashkpour, Christophe Guéret, Stefan Schlobach. “CEDAR: The Dutch Historical Censuses as Linked Open Data”. Semantic Web — Interoperability, Usability, Applicability, 8(2), pp. 297–310. IOS Press (2015).
[Meroño et al. ISWC 2017] Albert Meroño-Peñuela, Rinke Hoekstra, Aldo Gangemi, Peter Bloem, Reinier de Valk, Bas Stringer, Berit Janssen, Victor de Boer, Alo Allik, Stefan Schlobach, Kevin Page. “The MIDI Linked Data Cloud”. In: The Semantic Web – ISWC 2017, 16th International Semantic Web Conference. Lecture Notes in Computer Science, vol 10587, pp. 156-164 (2017).
Cited Contributions (2/2)
[Meroño et al. ISWC 2019] Albert Meroño-Peñuela, Enrico Daga. “List.MID: A MIDI-Based Benchmark for Evaluating RDF Lists”. In: The Semantic Web – ISWC 2019, 18th International Semantic Web Conference. Lecture Notes in Computer Science, vol 11779, pp. 246-260 (2019).
[Meroño et al. SAAM 2018] Albert Meroño-Peñuela, Reinier de Valk, Enrico Daga, Marilena Daquino, Anna Kent-Muller. “The Semantic Web MIDI Tape: An Interface for Interlinking MIDI and Context Metadata”. In: Workshop on Semantic Applications for Audio and Music, ISWC 2018. 9th October 2018, Monterey, California, USA (2018).
[Meroño et al. ESWC 2016] Albert Meroño-Peñuela, Rinke Hoekstra. “grlc Makes GitHub Taste Like Linked Data APIs”. The Semantic Web – ESWC 2016 Satellite Events, Heraklion, Crete, Greece, May 29 – June 2, 2016, Revised Selected Papers. LNCS 9989, pp. 342-353 (2016)
[Meroño et al. ISWC 2017] Albert Meroño-Peñuela, Rinke Hoekstra. “Automatic Query-centric API for Routine Access to Linked Data”. In: The Semantic Web – ISWC 2017, 16th International Semantic Web Conference. Lecture Notes in Computer Science, vol 10587, pp. 334-339 (2017)
[Meroño et al. ESWC 2020] Albert Meroño-Peñuela, Dayana Spagnuelo, GPT-2. “Is a Transformer Your Next Semantic Co-Author? Generating Semantic Web Paper Snippets with GPT-2”. In: The Semantic Web – ESWC 2020 Satellite Events, posters & demos (2020) (under review)
References
[Gil DSJ 2017] Gil, Yolanda. "Thoughtful artificial intelligence: Forging a new partnership for data science and scientific discovery." Data Science 1, no. 1-2 (2017): 119-129.
[Groth & Moreau 2013] Paul Groth, Luc Moreau. “PROV-Overview: An Overview of the PROV Family of Documents”. W3C Working Group Note 30 April 2013 https://www.w3.org/TR/prov-overview/
[van Harmelen & ten Teije JWE 2019] Frank van Harmelen, Annette ten Teije. “A Boxology of Design Patterns for Hybrid Learning and Reasoning Systems”. Journal of Web Engineering, 18(1-3), pp. 97-124 (2019).
[Hoekstra & Groth PAW 2014] Hoekstra, Rinke, and Paul Groth. "PROV-O-Viz-understanding the role of activities in provenance." In International Provenance and Annotation Workshop, pp. 215-220. Springer, Cham, 2014.
[Hoover et al. arXiv 2019] Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann. “exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models”. Computation and Language (cs.CL); Machine Learning (cs.LG). arXiv:1910.05276 [cs.CL] (2019)
[Marcus arXiv 2020] Gary Marcus. “The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence”. Artificial Intelligence (cs.AI); Machine Learning (cs.LG). arXiv:2002.06177 [cs.AI] (2020)
[Marcus & Davis 2019] Marcus, Gary, and Ernest Davis. Rebooting AI: building artificial intelligence we can trust. Pantheon, 2019.
[Radford et al. OpenAI 2019] Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. "Language models are unsupervised multitask learners." OpenAI Blog 1, no. 8 (2019): 9.
Abstract. Many aspects of human intelligence, such as language, legal reasoning, and music understanding, are informed by rich models of our surrounding cultural world. However, current AI systems are mainly concerned with achieving high performance in narrowly designed tasks, and typically ignore these cultural models of the world. Cultural AI is the understanding of human culture and ethical values by machines, but also the empowerment of humans to address, understand and advance culture by using AI. How can we make AI systems aware of human culture and values? How does cultural knowledge impact practices in knowledge engineering, reasoning, databases and the Web? Will we be content with robots that can prepare our meals, clean the streets, and take care of our elderly? Or, beyond these, will we rather seek inspiring conversations on the fairness of Galileo’s trial, the spectrum of musical emotions, or the authenticity of historical documents? In this talk, I will share the results of my research on Cultural AI through systems that combine knowledge graph construction, knowledge graph completion, and a number of diverse interfaces using interactive learning and reasoning, social querying, and text generation. I will discuss how these can contribute to the Hybrid Intelligence program as interfaces for human-machine intelligence, and as enablers of a more Collaborative, Adaptive, Responsible and Explainable AI.