1 of 58

Computational Approaches to Textual Scholarship:

The ARTFL Project's French Digital Collections

Clovis Gladstone

The ARTFL Project

The Textual Optics Lab

2 of 58

The Origins of ARTFL

  • In compiling the Trésor de la Langue Française, the French government assembled a computer corpus of 150 million words beginning in 1957.
  • In 1981, the Centre National de la Recherche Scientifique partnered with the University of Chicago to form American and French Research on the Treasury of the French Language (ARTFL), making the corpus available to the North American scholarly community.
  • ARTFL-FRANTEXT, the main ARTFL database, has since grown from 1,600 to 3,500 documents and over 214 million words.

3 of 58

Access: Subscription/Consortium Model

  • Over 300 subscribing institutions across the world
  • Broad access at low cost:
    • $500 per annum for Ph.D. granting institutions, $250 for others
  • Unlimited access across entire IP range -- all students, faculty and staff
  • Fees have never been raised

4 of 58

Implications of the Subscription/Consortium Model

Our objectives took shape in the context of the consortium model. They fall into three basic categories, all aimed at providing a set of services to the humanistic scholarly community:

1. Collection development: quantity and quality.

2. Software development (PhiloLogic): improving the means of access to the collection.

3. Inquiry: participating in digital humanities scholarship and research.

5 of 58

Collection Development

We have integrated into our main ARTFL-FRANTEXT database over 1,000 new works, and, through a series of collaborations and in-house efforts, we have also created a number of stand-alone text databases, dictionaries, and other resources both for subscribers and public access.

ARTFL's collection grows in three ways:

  • In-house data entry
  • Collaborative projects
  • Freely available online editions of suitable quality

6 of 58

ARTFL's French digital collections

Public collections:

  • ARTFL-Encyclopédie
  • Newberry Library's French Revolutionary Collection
  • Très Grande Bibliothèque
  • The Montaigne Project
  • Tout Voltaire
  • Les Archives Parlementaires
  • Revolutionary Laws: 1788-1799
  • Oeuvres complètes de Robespierre
  • The DVLF
  • The Bibliothèque Bleue
  • and more…

Subscription collections:

  • ARTFL-Frantext
  • Dictionnaire d'Autrefois
  • Journal de Trévoux
  • Dictionnaire de Bayle
  • Études critiques du 19ème siècle
  • Corpus de philosophie en langue française
  • Novels of the Nineteenth Century
  • Textes de Français Ancien (TFA)
  • French Poetry
  • and more…

7 of 58

The ARTFL Project’s Databases

To support all the databases ARTFL maintains (over 200, of which 50 are French), we have had to develop a range of digital tools for textual analysis:

  • Databases running under PhiloLogic: for sophisticated search and retrieval
  • Databases running under custom SQL-based applications: dictionary applications, sequence alignment, and other specialized applications.
  • DVLF (aggregation of French dictionaries) running under an SQL-based application
  • TextPAIR: automatic detection of text-reuses
  • TopoLogic: topic-modeling browser for discourse analysis
  • The Intertextual Hub: a platform combining most of our digital tools

8 of 58

Why so many different options?

  • Importance of choosing the right tool for the job
  • For simple applications: using a relational database engine (SQL engines) is an obvious choice
  • For search and retrieval: use an inverted index stored in a key/value store:
    • One key for each unique word
    • Corresponding value contains all occurrences of that word: document where it occurs, position of each occurrence in document
  • For data-mining applications: develop dedicated tools to fit scholarly use cases
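
The inverted index described above can be sketched in a few lines. This is a minimal illustration, assuming whitespace tokenization and an in-memory dictionary standing in for the key/value store; the function name and the toy documents are hypothetical and are not taken from PhiloLogic.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each unique word to every (doc_id, position) where it occurs.

    `documents` is a dict of doc_id -> text. Tokenization is a plain
    lowercase whitespace split, which is an assumption of this sketch.
    """
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word].append((doc_id, position))
    return index


# Toy documents invented for illustration.
documents = {
    "doc1": "la loi est la volonté générale",
    "doc2": "la volonté du peuple",
}
index = build_inverted_index(documents)
print(index["volonté"])  # every occurrence: document and word position
```

Looking up a word is then a single key access, whatever the size of the collection, which is why this layout suits search and retrieval.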

9 of 58

Software Development

ARTFL has developed its own open-source search and retrieval software, PhiloLogic, a flexible, fast, and reliable text analysis system. The development of PhiloLogic was guided by two principal ideas:

  • Support traditional text analysis (philology) at scale: in particular, PhiloLogic excels at providing concordances over thousands of texts.
  • Ease of use: it is more efficient to browse large samples of occurrences than to spend time formulating highly complex queries.

10 of 58

PhiloLogic: our corpus query engine

Open source full-text search and analysis system which has its roots in the tradition of philology.

  • PhiloLogic was developed in the early days of the internet: in the early 1990s, ARTFL was among the first to deploy a search engine backed by a storage solution
  • Concordance retrieval is at the heart of PhiloLogic’s design:

=> retrieve all occurrences of a word

=> use of an inverted index for word retrieval

  • Metadata query support came later with the development of SQL technologies

=> metadata queries in PhiloLogic work as a filter:

search for word X in documents Y = search for word X, then filter out all results that don’t satisfy condition Y
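
The "search, then filter" logic can be sketched as follows. This is only an illustration under stated assumptions: it scans the texts directly instead of using an inverted index, the `concordance` function and its parameters are hypothetical, and the toy texts are invented strings, not quotations.

```python
def concordance(word, texts, metadata, context=3, **conditions):
    """Return keyword-in-context hits for `word`, filtered by metadata.

    Step 1 retrieves every occurrence of the word; step 2 discards hits
    whose document metadata does not satisfy all `conditions`.
    """
    hits = []
    for doc_id, text in texts.items():
        tokens = text.split()
        for pos, token in enumerate(tokens):
            if token.lower() == word:
                hits.append((doc_id, pos, tokens))

    results = []
    for doc_id, pos, tokens in hits:
        doc_meta = metadata.get(doc_id, {})
        if all(doc_meta.get(k) == v for k, v in conditions.items()):
            left = " ".join(tokens[max(0, pos - context):pos])
            right = " ".join(tokens[pos + 1:pos + 1 + context])
            results.append((doc_id, left, tokens[pos], right))
    return results


texts = {
    "d1": "la liberté est un droit",
    "d2": "sans liberté point de vertu",
}
metadata = {"d1": {"author": "Rousseau"}, "d2": {"author": "Robespierre"}}
print(concordance("liberté", texts, metadata, context=2, author="Rousseau"))
```

Without any keyword condition, the same call returns the hits from every document.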

11 of 58

PhiloLogic: scale and intuitiveness

PhiloLogic is meant to provide concordances over tens of thousands of texts.

Its design is guided by three principles:

  • Intuitive: easy to use
  • Interactive: navigate between close and distant reading perspectives
  • Iterative: ability to continually refine a set of queries and develop a line of thought

12 of 58

PhiloLogic as a work environment

22 of 58

The limits of concordances in the Age of Big Data

Concordances are a powerful investigative tool, but they have some clear limitations:

  • User-initiated search
  • Limited order of generalization
  • Information overload problem: what to do with 150,000 hits?

"There are only about 30,000 days in a human life -- at a book a day, it would take 30 lifetimes to read a million books and our research libraries contain more than ten times that number. Only machines can read through the 400,000 books already publicly available for free download from the Open Content Alliance."

Gregory Crane

23 of 58

Navigating from Afar: Data-Mining

  • Data-Mining is about detecting patterns in large amounts of data
  • Our work has focused on leveraging recurring patterns in text to measure similarity between passages of varying length:
    • Thematic similarity (vector space): authors writing about the same topic
    • Text reuse (sequence alignment): find reuses of any given passage across large amounts of texts.
    • Allusion detection: find allusions to one author in another author's text.

24 of 58

Navigating from Afar: Vector Space Similarity

  • Widely used search model (Lucene, Solr): search term(s) as vectors compared to documents as vectors.
  • Can be used to measure the similarity between documents: creates links across potentially very different texts
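
As a rough illustration of the vector space idea, the sketch below represents each text as a vector of raw word counts and compares two texts by the cosine of the angle between their vectors. Raw counts and whitespace tokenization are assumptions made for brevity; search engines typically apply term weighting on top of this.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts seen as word-count vectors."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    # Dot product over the words the two texts share.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity("la volonté générale", "la volonté du peuple"))
```

A score near 1 means the two texts use much the same vocabulary in similar proportions; a score of 0 means they share no words at all.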

27 of 58

Sequence Alignment

A general technique for identifying regions of similarity shared by two strings or sequences.

Closely related to the longest common substring problem.

Applications in many domains, including:

  • bioinformatics: detection of similar DNA sequences;
  • plagiarism detection in texts and computer code;
  • collation of text or manuscript traditions.
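
The simplest exact form of the problem, finding the longest run of consecutive words shared by two texts, can be sketched with dynamic programming. This is a textbook illustration only: unlike practical alignment tools, it tolerates no insertions, deletions, or spelling variation, and the function name is hypothetical.

```python
def longest_common_run(a, b):
    """Return the longest contiguous run of tokens found in both lists."""
    best_len, best_end = 0, 0
    # prev[j] = length of the common run ending at a[i-2] and b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


source = "tout habitant est soldat mais seulement".split()
reuse = "là tout habitant est soldat quand".split()
print(longest_common_run(source, reuse))
```

The running time grows with the product of the two lengths, which is one reason large-scale comparison relies on cheaper representations.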

28 of 58

Advantages of Sequence Alignment

  • Respects text order of documents (not bag of words).
  • Does not require pre-identified blocks for comparison.
  • Not confused by extraneous similarities (topic, subject, etc).
  • Core functions are language independent.
  • Spans variations in similar passages reflecting:
    • insertions and deletions
    • spelling and orthographic variations
    • OCR and other errors
    • other variations

29 of 58

Our sequence alignment software: TextPAIR

TextPAIR inherits its basic algorithm from PhiloLine developed at the ARTFL Project: https://code.google.com/archive/p/text-pair/

  • Identify regions of similarity shared by two strings or sequences
  • The algorithm for TextPAIR was chosen for its low computational complexity: it is designed to be fast
  • The model represents each document as shingles of n-grams (overlapping word sequences)
  • It is highly parallelized for scalability and speed.

https://github.com/ARTFL-Project/text-pair

30 of 58

How TextPAIR works

N-grams are blocks of N words, usually constructed after various preprocessing steps. These n-grams are generated with some overlap from one to the next:

The cloud-capped towers, the gorgeous palaces,
The solemn temples, the great globe itself—
Yea, all which it inherit—shall dissolve
(Shakespeare, The Tempest, Act 4, Scene 1, ca. 1611)

cloud_capped_towers, capped_towers_gorgeous, towers_gorgeous_palaces, gorgeous_palaces_solemn, palaces_solemn_temples, solemn_temples_globe, temples_globe_itself, globe_itself_yea, itself_yea_inherit, yea_inherit_shall, inherit_shall_dissolve

31 of 58

Ngram Creation Parameters

  • Size of n-grams: smaller n-grams increase resolution as well as the number of possible matches; trigrams (groups of 3 tokens) are usually preferred and are the default.
  • Function word and other filtering: to reduce the number of n-grams.
  • Normalization: accent flattening, stemming, lemmatization, modernizing.
  • Tokenization rules (vary by language): how to split running text into lists of words.
  • Gap allowed within n-grams: allows for more flexible matching.
  • Word order within n-grams: some languages are not bound by a particular word order.
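
A few of these parameters (n-gram size, word filtering, accent flattening, tokenization) can be illustrated with a short sketch. This is not TextPAIR's code: the `make_ngrams` function is hypothetical, and the stopword list is an assumption chosen so that the Tempest passage from the previous slide yields the trigrams shown there.

```python
import re
import unicodedata


def make_ngrams(text, n=3, stopwords=frozenset(), flatten_accents=True):
    """Build overlapping word n-grams after simple preprocessing."""
    text = text.lower()
    if flatten_accents:
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFKD", text)
        text = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Tokenize on runs of letters; punctuation and hyphens act as separators.
    tokens = [t for t in re.findall(r"[^\W\d_]+", text) if t not in stopwords]
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


passage = (
    "The cloud-capped towers, the gorgeous palaces, "
    "The solemn temples, the great globe itself— "
    "Yea, all which it inherit—shall dissolve"
)
# Assumed stopword list for this example only.
stopwords = frozenset({"the", "great", "all", "which", "it"})
trigrams = make_ngrams(passage, n=3, stopwords=stopwords)
print(", ".join(trigrams))
```

Each n-gram shares n-1 words with its neighbour, so a single variant word in a reused passage disturbs only the few n-grams that contain it.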

32 of 58

Matching parameters

There are many parameters, but the main ones are:

  • Number of common n-grams between two passages required for a match: defaults to 4
  • Size of initial matching window: when we have a matching n-gram, how far ahead do we look?
  • Gap: allows for matching across variants, insertions, etc.
  • Flexible gap: should the gap grow as the matching passage gets bigger? Increased flexibility may increase the number of uninteresting matches.
  • Should passages in close vicinity in both documents be merged? How big can the gap be?

33 of 58

How the alignments are found...

Pairwise comparison between two documents:

  • Generate n-grams for each document
  • Identify common ngrams (relatively rare) between each document pair
  • Enough common ngrams to make it worthwhile checking?
  • Anchor match at common ngram in document order
  • Continue comparison while there are matching ngrams
  • End of match determined by lack of matching ngrams within a configurable window of ngrams (e.g., no matching ngrams within a 15-ngram window)
  • Optionally merge contiguous passages if within a certain distance
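
The steps above can be sketched as a greedy pairwise matcher. This is a simplified illustration and not TextPAIR's implementation: it anchors only at the first occurrence of an n-gram in the target, skips the optional merging step, and uses the hypothetical parameter names `min_common` and `window`.

```python
from collections import defaultdict


def find_alignments(source, target, min_common=4, window=15):
    """Greedily align two lists of n-grams given in document order."""
    # Index every position at which each n-gram occurs in the target.
    positions = defaultdict(list)
    for j, gram in enumerate(target):
        positions[gram].append(j)

    # Cheap pre-check: enough shared n-grams to be worth comparing?
    if len(set(source) & set(target)) < min_common:
        return []

    alignments = []
    i = 0
    while i < len(source):
        anchors = positions.get(source[i])
        if not anchors:
            i += 1
            continue
        # Anchor the match at the first target occurrence (a simplification).
        start_i, start_j = i, anchors[0]
        end_i, end_j = start_i, start_j
        common = 1
        cursor = i + 1
        # Extend while a shared n-gram appears within the window in both texts.
        while cursor < len(source) and cursor - end_i <= window:
            nxt = next(
                (p for p in positions.get(source[cursor], ())
                 if end_j < p <= end_j + window),
                None,
            )
            if nxt is not None:
                end_i, end_j = cursor, nxt
                common += 1
            cursor += 1
        if common >= min_common:
            alignments.append(((start_i, end_i), (start_j, end_j)))
        i = end_i + 1
    return alignments


# Single letters stand in for n-grams; "q" is an insertion in the target.
source = ["x1", "a", "b", "c", "d", "e", "x2"]
target = ["y1", "y2", "a", "b", "q", "c", "d", "e", "y3"]
print(find_alignments(source, target, min_common=4, window=3))
```

In this toy run the match spans source positions 1 to 5 and target positions 2 to 7, so the inserted "q" is absorbed by the window and does not break the alignment.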

34 of 58

Example of alignment with moderate additions

Rousseau, Discours sur l'inégalité (1755)

...désir général qui porte un séxe à s'unir à l'autre ; le moral est ce qui détermine ce désir et le fixe sur un seul objet exclusivement, ou qui du moins lui donne pour cet objet préferé un plus grand dégré d'énergie...

Analyse des observations des tribunaux d'appel… (1801)

...désir général qui porte un sexe vers l’autre., appartient uniquement à l’ordre physique de la nature : mais le choix, la préférence ; l’amour qui détermine ce désir, et le fixe sur un seul objet, ou qui, du moins, lui donne, sur l’objet préféré, un plus d XXVJ DISCOURS grand degré d’énergie...

36 of 58

Example of alignment with more significant additions

Rousseau, Emile (1762)

...je puis sentir ce que c'est qu'ordre, beauté, vertu, je puis contempler l'univers, m'élever à la main qui le gouverne, je puis aimer le bien, le faire, et je me comparerois aux bêtes? Âme abjecte, c'est ta triste philosophie qui te rend semblable à elles ; ou plustôt tu veux en vain t'avilir ; ton génie dépose contre tes principes, ton coeur bienfaisant dément ta doctrine, et l'abus même de tes facultés prouve leur excellence en depit de toi...

Bertrand, Jean-Baptiste-Auguste, Étude philosophique sur l'homme (1876)

...je puis sentir ce que c’est qu’ordre, beauté, » vertu ; je puis contempler l’univers, m’élever à la main » qui le gouverne ; je puis aimer le bien, le faire, et je me » comparerais aux bêtes ! » Content de la place où Diou l’a mis, il ne voit rien de meilleur que son espèce. Il avoue que cette réflexion l’enorgueillit moins qu’elle ne le touche, parce que cet état n’est point de son choix. Mais, avant cette réflexion, quelle apostrophe ne fait-il pas au philosophe qui se confond avec la brute ! « Ame abjecte, lui dit-il, » c’est la triste philosophie qui te rend semblable aux bêtes ; » ou plutôt tu veux en vain t’avilir: ton génie dépose contre » tes principes. Ton coeur bienfaisant dément ta doctrine, » et l’abus même de tes facultés prouve leur excellence en » dépit de toi...

38 of 58

Example of alignment with problematic OCR

Cantiques de St. Eustache, martyr, et du mauvais riche (1700)

...écoutes, Prends les routes, Qui conduisent à bon port: Ce glouton vient de t'apprendre, Qu'il faut rendre Un grand compte après la mort. Fuis de ce richard le vice D'avarice, Donne aux pauvres largement; Fuis les excès de la bouche, Et ne touche A tes mets que sobrement. Fais grand cas de tes misères Salutaires, Ainsi que Lazare a fait...

Durand, Laurent, Cantiques de l'âme dévote (1816)

...écoute s’, 'Prends les route »; ……. -"i^Qui^eondiiisent^à --bon portr • ■§• ■■■'■' Ce glouton vient déU%ip prendre Qtfïil faut rendre *" ■■■' -Un’'-grand to.mpte aprèéslâ-moïK*'.'■" gt;Fuis de ce Richard 1e^vice 1 '■■- ■ - - ^D’avarice ;] ; ■ ?'■'• … Donne aux p ouvres largeojieat-f-^- »- •"- ' &« E E'AME DêrorE, a ? 3' Fuis les excès delà- bouche , \ Et ne touche A-lés’mets ;que sobrement ; Fais grand cas’Se les-misères, Salutaires ,. Ainsi que Lazare a fait...

40 of 58

Navigating the alignments

Detecting the alignment is just the first step in the design of an intertextual tool. We designed a web interface that can:

  • Search across all alignments using metadata fields and word searches within passages
  • Filter out banal passages: currently based on high n-gram frequency (work in progress)
  • Highlight differences between source and reuse
  • Provide a timeline of reuses across the entire database, or across a subset selected with the query tools
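
Highlighting the differences between a source and its reuse can be sketched with Python's standard `difflib` module. This is one possible approach, not necessarily the one used in the web interface; the `diff_passages` function is hypothetical, and the strings are shortened fragments adapted from the Rousseau example.

```python
import difflib


def diff_passages(source, reuse):
    """List word-level edits that turn the source passage into the reuse."""
    a, b = source.split(), reuse.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    # Keep only the changed spans: 'replace', 'delete', or 'insert'.
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


source = "le fixe sur un seul objet exclusivement"
reuse = "le fixe sur un seul objet"
for tag, removed, added in diff_passages(source, reuse):
    print(tag, removed, added)
```

Each tuple names the kind of change, the words taken out of the source, and the words introduced in the reuse, which is enough to drive colour highlighting in an interface.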

41 of 58

Comparing Frantext to French Revolutionary Pamphlets

Reuses of Rousseau's Social Contract in Revolutionary Pamphlets…

42 of 58

Highlight additions and deletions in reuses

Reuses of Rousseau's Social Contract in Laurent François, Etudes sur l'histoire de l'humanité (1855)

43 of 58

Reuses of Rousseau in 19th cent. France

44 of 58

Limitations of the text-reuse approach: very little Rousseau in Robespierre?

We expected significant borrowings from Rousseau by Robespierre.

Robespierre had a deep affinity for the ideas of Rousseau.

“The popular image persists of Robespierre quoting passages from the Social Contract while simultaneously ordering executions” -- David Williams, 2012

Across all the texts we have by Robespierre, primarily speeches and reports, we found only 3 direct reuses of Rousseau.

45 of 58

Rousseau, Jean-Jacques, Considérations sur le gouvernement de Pologne

Pologne une véritable milice exactement comme elle est établie en Suisse où tout habitant est soldat, mais seulement quand il faut l'être? La servitude établie en Pologne ne permet pas

Tant qu'ils ne sortent pas du lieu de leur demeure, peu ou point détournés de leurs travaux, ils n'ont aucune paye, mais sitot qu'ils marchent en campagne , ils ont le pain de munition et sont à la solde de l'état, et il n'est permis à personne d'envoyer un

Robespierre, Maximilien Discours sur l'organisation des Gardes nationales / ● 1790

et qu’ils n’aient point d’autre force pour combattre les ennemis du dehors, ce Là tout habitant est soldat , mais seulement quand il faut l’être y pour me servir de l’expression de J. J. Rousseau

Tant qu’ils ne sortent point de leurs demeures , -peu ou point détournés de leurs travaux , ils n’ont aucune paie ; mais sitôt qu’ils marchent en campagne ^ ils sont à la odde de l’état.

46 of 58

Rousseau, Jean-Jacques, 1712-1778. ● EMILE OU DE L'EDUCATION. TOME SECOND. ●

vengés de leurs tyrans après la mort, n'est-il pas clair que cela mettroit ceux-ci fort à leur aise, & les délivrerait du soin d'apaiser ces malheureux? Il est donc faux que cette doctrine ne fût pas nuisible; elle ne seroit donc pas la vérité.Philosophe, tes lois morales sont fort belles; mais montre-m'en, de grâce, la sanction. Cesse un moment de battre la campagne, & dis-moi nettement ce que tu mets à la place du Poul-Serrho.] Bon jeune homme, soyez sincère & vrai. sans orgueil; sachez être ignorant: vous ne tromperez ni vous ni les autres. Si jamais vos talents cultivés vous mettent en état de parler aux hommes, ne leur parlez jamais que selon

Robespierre, Maximilien, 1758-1794 ● Oeuvres complètes, Volume 08 ● 1910

En Morale... Notre divine Constitution a bien détruit tous les ancienis abus auxquels elle a pu atteindre : mais elle a élevé un monument à J.J. Roustseau ; or, ce Moraliste nous a dit en termes précis, lorsqu'il a parlé du fameux pont de Pontu Sérou des Indiens: « Philosophe, tes loix morales sont fort belles ; mais de g'race montre m'en la sanction, et dis-moi nettement ce que tu mets à la place de ce pont... •» Rousseau pensoit donc comme mon ami, qu'aucune institution humaine, qu'aucun moyen national, ni bayonnettes, ni canons, ni décrets, ni motions, ne sanctionnoient suMsamment la morale du peuple,

Points to limits of direct reuse approach...

47 of 58

Allusion detection… is it feasible?

Given the limitations of the text-reuse approach, would it be possible to devise a method that could answer the following question: where is Rousseau in Robespierre?

Much work has been done on discourse analysis:

  • Critical theory (Spitzer, Barthes, Foucault, Pêcheux...)
  • In traditional linguistics, psychology, neuroscience...
  • In Natural Language Processing (e.g. word vectors, sentiment analysis...) and Machine-Learning (e.g. topic modeling, supervised classification...)

Our approach is far more naïve, but also meant to be more intuitive:

  • Importance of certain categories of words in discourse, especially nouns
  • Concepts anchor a text in a specific intellectual tradition
  • A particular set of words used in a small context is NOT random
  • Similar groups of words used in two texts indicate similarity (NOT equivalence) of thought
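
One possible way to turn these intuitions into a computation is to cut each text into small windows of content words and compare the windows by vocabulary overlap. The sketch below is an assumption-laden illustration, not the method used at ARTFL: the slides do not specify the measure, so Jaccard overlap is swapped in, a stopword list stands in for part-of-speech filtering, and all names and sample words are invented.

```python
def word_windows(tokens, size, stopwords):
    """Split the content words of a text into consecutive sets of `size`."""
    content = [t for t in tokens if t not in stopwords]
    return [set(content[i:i + size]) for i in range(0, len(content), size)]


def jaccard(a, b):
    """Share of words two windows have in common, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_contexts(source, target, size=10, threshold=0.3, stopwords=frozenset()):
    """Return (source_window, target_window, score) for overlapping windows."""
    pairs = []
    for i, sw in enumerate(word_windows(source, size, stopwords)):
        for j, tw in enumerate(word_windows(target, size, stopwords)):
            score = jaccard(sw, tw)
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


source = "peuple souverain volonté générale loi".split()
target = "la volonté générale du peuple fait la loi".split()
stopwords = frozenset({"la", "du", "fait"})
print(similar_contexts(source, target, size=5, stopwords=stopwords))
```

Because word order is ignored, two passages can score highly without sharing a single phrase, which is what distinguishes this from the text-reuse approach.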

48 of 58

Rousseau, Du contrat social, compared with Robespierre (June 16, 1793)

50 of 58

The Intertextual Hub: text-reuse and discourse analysis

Goal: build a platform to understand how ideas circulate within a cultural system:

  • Prototype funded by the NEH, focused on late 18th-century France
  • NSF proposal to build a multi-language version with researchers from the social sciences

The Hub is built around a series of text collections, each with its own internal organisation:

  • French Revolutionary Collection: over 25,000 pamphlets from the French Revolutionary period
  • Archives Parlementaires: parliamentary debates during the French Revolution
  • ARTFL-Frantext 18th-century collection
  • Over 5,000 texts of political economy
  • Journaux de Marat

All collections are connected to one another through:

  • Intertextual links (text-reuses)
  • Discourse analysis techniques which allow cross-collection exploration

51 of 58

Topic Modeling as a discourse analysis tool

Preliminary assumption: each text is a combination of several topics.

Built by analysing word co-occurrences across an entire corpus.

Results in:

  • A list of co-occurring words forming topics
  • Documents are represented as a mixture of these topics

Topic-modeling as an intertextual tool: explore shared topics across many thousands of texts

55 of 58

Topic Modeling as a discourse analysis tool

Searching for "religion, culte, prêtre, église, dieu, fanatisme, moral, autel, clergé, divinité" in 7,500 18th-century texts

56 of 58

Directed reading in close-reading: highlight reuses in text

Better understand how a text came to be: example of the anonymous pamphlet Le Pot au noir from 1788.

Its virulent anticlericalism has led some scholars to attribute it to d'Holbach.

Others have wondered about the Voltairian tone of the work.

57 of 58

Future directions

We continue to provide access to new resources by developing new digital collections.

We are exploring new ways of navigating the hundreds of thousands of texts that we provide access to:

  • Continued work on PhiloLogic:
    • Provide more analytical tools based on concordances
    • Improve the user interface for easier discovery
  • Continued work on intertextuality:
    • Move beyond reuses with allusion detection
    • Visualize text reuse in large-scale corpora

58 of 58

Thank you!
