PhD position available in Montpellier, France (2019-2022)
(semantic web, ontology alignment and property graphs)
PhD position in Semantic Web area: ontology alignment and property graphs
Ontology alignment in agronomy and biodiversity by building and leveraging ontology-based background resources based on property graphs.
Employer: University of Montpellier
School: Doctoral School I2S, PhD in Informatics
When: Fall 2020
Duration: 36 months
Where: LIRMM, Montpellier, France
Collaboration: ANR D2KAB project (www.d2kab.org), AgroPortal project (http://agroportal.lirmm.fr)
Semantic web, AI, ontology alignment, background knowledge, property graphs, NoSQL.
Semantic web technologies (OWL, RDF, SPARQL, triplestore, Linked data), Property Graphs (neo4j), Rust.
This PhD project is part of D2KAB (www.d2kab.org), an ANR project started in June 2019 whose primary objective is to create a framework to turn agronomy and biodiversity data into knowledge that is semantically described, interoperable, actionable, and open, and to investigate scientific methods and tools to exploit this knowledge for applications in science and agriculture. Agronomy/agriculture and biodiversity (ag & biodiv) face several major societal, economic, and environmental challenges that a semantic data science approach will help address. We shall provide the means (ontologies and linked open data) for ag & biodiv to embrace the semantic Web to produce and exploit FAIR data. The D2KAB project brings together a unique multidisciplinary consortium of 12 partners to achieve this objective. Each of the project's driving scenarios (food packaging, agro-agri linked data, wheat phenotype, ecosystems & plant biogeography) will have a significant impact and produce concrete outcomes for ag & biodiv scientific communities and socio-economic actors in agriculture.
One challenge is the overlap between ontologies or vocabularies. Ontology alignment has been explicitly identified as a critical issue by all our partners developing or working with ontologies. Within D2KAB's WP2, we will build an ontology alignment framework that covers the whole ontology alignment life cycle: extraction, generation, validation, community-based evaluation, storage, and retrieval of the mappings.
During this PhD project, our goal is to advance the state of the art in ontology alignment research [1] using background knowledge (BK) approaches and experimenting in agronomy and biodiversity. We will use AgroPortal's [2] mapping repository (produced during D2KAB) as a background knowledge resource to improve state-of-the-art ontology alignment algorithms.
In the latest Ontology Alignment Evaluation Initiative campaigns (OAEI, http://oaei.ontologymatching.org), machine-learning-based, BK-based (or context-based [3]) approaches now obtain the best results, but they are only applicable when relevant and clean knowledge sources are available. A PhD was recently completed at LIRMM (A. Annane, supervised by Z. Bellahsène and C. Jonquet) in which we investigated the use of biomedical ontology mappings to build efficient BK [6]. However, the theoretical results obtained during OAEI benchmarking are hardly transferable to the reality of heterogeneous ontologies and user needs. In addition, the complete absence of background knowledge resources in domains other than biomedicine (e.g., the Anatomy and LargeBio tracks) prevents the results from being reproduced in other domains.
Considering two ontologies to align, the background knowledge resources will be other related ontologies merged together in a single graph in which mapping paths will be identified.
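To make the idea of mapping paths concrete, here is a minimal Python sketch, not the project's actual implementation. It assumes mappings are plain pairs of concept identifiers prefixed by their ontology name; the identifiers (`SRC`, `TGT`, `BK1`, `BK2`), the function names, and the `max_hops` limit are all hypothetical and chosen for illustration only.

```python
from collections import defaultdict, deque


def build_graph(mappings):
    """Merge concept-to-concept mappings into one undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in mappings:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def ontology_of(concept):
    """Return the ontology prefix of an identifier such as 'BK1:triticum'."""
    return concept.split(":", 1)[0]


def mapping_paths(graph, start, target_ontology, max_hops=3):
    """Breadth-first search for simple paths from `start` to any concept
    of `target_ontology`, using at most `max_hops` mappings."""
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if len(path) > 1 and ontology_of(node) == target_ontology:
            paths.append(path)  # reached the target ontology: stop extending
            continue
        if len(path) - 1 >= max_hops:
            continue
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in path:  # keep paths free of cycles
                queue.append(path + [neighbour])
    return paths


# Hypothetical mappings: SRC and TGT are the two ontologies to align,
# BK1 and BK2 are background ontologies merged into the same graph.
mappings = [
    ("SRC:wheat", "BK1:triticum"),
    ("BK1:triticum", "TGT:common_wheat"),
    ("SRC:wheat", "BK2:wheat_plant"),
    ("BK2:wheat_plant", "BK1:triticum"),
]
graph = build_graph(mappings)
for path in mapping_paths(graph, "SRC:wheat", "TGT"):
    print(" -> ".join(path))
```

Each returned path suggests a candidate alignment between its first and last concept; how such candidates are then selected or validated is left open here.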
Recently, the semantic Web and property graph communities have shown strong interest in combining their methodologies and technologies, so as to address some of the key semantic Web objectives (e.g., data and knowledge sharing) with the help of some of the key property graph features (e.g., scalability). This was the topic of a specific workshop held by W3C in March 2019 on Web Standardization for Graph Data (https://www.w3.org/Data/events/data-ws-2019/), and new language prototypes are being proposed to address this need [7-8].
Within this PhD project, our approach is to adopt a graph-based mapping repository (using NoSQL property graphs) to facilitate the exploitation of concept-to-concept mapping paths to identify and select new ontology alignments. Our hypothesis is that graph databases, being particularly well suited to path-related queries, will help us advance state-of-the-art performance and address issues such as scalability [4] and the use of in-memory architectures [5]. Our preliminary work on biomedical ontologies [6] has shown promising results, including very good results at OAEI 2017. The scientific challenge is now to extend this work and demonstrate its portability to the real world (outside of the OAEI benchmark) and to new domains (ag & biodiv). Eventually, we will offer our community-curated ontology mapping repository (also co-developed within D2KAB) as a resource for future OAEI campaigns and evaluate our results within this context.
We are looking for a motivated young researcher with experience in semantic web technologies and graph databases. The candidate should match most of the following criteria:
- High motivation for scientific research
- Knowledge of semantic web technologies, especially OWL/RDF/SPARQL
- Knowledge of NoSQL graph databases (e.g., Neo4j)
- Excellent technical and development skills to conduct experiments with real-world and benchmark data
- Excellent spoken and written English
- Autonomy and initiative: ability to make technical decisions within the project and justify them
- Excellent writing skills as reports, documentation, and technical notes will always be necessary
- Basic knowledge of French with the objective to learn the language during the contract
Application by email to Clement Jonquet (jonquet@lirmm.fr) and Arnaud Castelltort (castelltort@lirmm.fr)
Documents required are (include everything in one single PDF file):
- a curriculum vitae describing your education and experience;
- a motivation letter describing your interest in the position and the matches with the expected profile;
- a link to your master's thesis or relevant related publications;
- copies of your transcripts of records (master, bachelor);
- names and contact details of referees.
No application by email will be accepted, but for more information about this position, please contact Clement Jonquet (jonquet+d2kab-phd1@lirmm.fr) and Arnaud Castelltort (castelltort+d2kab-phd1@lirmm.fr). Please avoid attachments; if you would like to share a document, include a link instead. Remote and face-to-face interviews will be organized.
The successful candidate will hold a scholarship from the French Ministry of Higher Education, Research and Innovation (1600€ net per month) for a three-year period. Social security and benefits are included. The scholarship can be complemented with teaching activities.
[1] Jérôme Euzenat and Pavel Shvaiko. Ontology Matching (second edition). Springer, 2013.
[2] Jonquet, C., Toulet, A., Arnaud, E., Aubin, S., Yeumo, E. D., Emonet, V., ... & Larmande, P. (2018). AgroPortal: A vocabulary and ontology repository for agronomy. Computers and Electronics in Agriculture, 144, 126-143.
[3] Angela Locoro, Jérôme David, and Jérôme Euzenat. Context-based matching: design of a flexible framework and experiment. Journal on Data Semantics, 3(1):25–46, 2014.
[4] A. Castelltort and T. Martin. Handling scalable approximate queries over NoSQL graph databases: Cypherf and the Fuzzy4S framework. Fuzzy Sets and Systems 348: 21-49 (2018).
[5] A. Castelltort and A. Laurent. Exploiting NoSQL Graph Databases and in Memory Architectures for Extracting Graph Structural Data Summaries. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 25(1): 81-110 (2017).
[6] Annane, A., Bellahsene, Z., Azouaou, F., & Jonquet, C. (2018). Building an effective and efficient background knowledge resource to enhance ontology matching. Journal of Web Semantics. In press, 2018.
[7] Renzo Angles, Harsh Thakkar, and Dominik Tomaszuk. RDF and Property Graphs Interoperability: Status and Issues. Alberto Mendelzon Workshop on Foundations of Data Management, Asunción, Paraguay, June 3–7, 2019.
[8] Olaf Hartig. Foundations to Query Labeled Property Graphs using SPARQL⋆. 1st International Workshop on Approaches for Making Data Interoperable, co-located with 15th Semantics Conference (SEMANTiCS 2019), Karlsruhe, Germany, September 9, 2019.