Moshe Koppel : Digital Tools for Historical Corpora: Automating Source Criticism, Ur-text Reconstruction and More

Registration is required for this session of the PSL e-philologie programme seminar (co-organized by the ENC, the ENS, the EHESS and the EPHE).
Where: Salle Delisle, ENC, 65 rue Richelieu, metro Pyramides.
When: 5 December, 5 p.m.
More information: https://ephilolog.hypotheses.org/594

Abstract:
The availability of large digitized historical corpora presents myriad challenges and opportunities. This talk will give an overview of some of the key problems such corpora pose, including the need for text preprocessing (correcting transcription errors, punctuating raw text, expanding abbreviations, and morphological tagging), authorship analysis, identification of cross-references and parallel texts, and stemma and ur-text reconstruction. We will focus in depth on two especially interesting challenges: source analysis of multi-author documents (including Hebrew biblical books) and ur-text reconstruction from multiple noisy textual witnesses.

Bio:
Moshe Koppel is a professor of computer science at Bar-Ilan University in Israel and chief scientist of DICTA, the aim of which is to apply methods of computational linguistics to a large historical corpus of Hebrew and Jewish literature. Much of his research has focused on text-related applications of machine learning, especially authorship attribution. He has published academic papers in leading journals in computer science, mathematics, linguistics, economics, law, political science and other disciplines.
