1 of 24

Digital History with LLMs: Modelling, Extracting, Transforming, Visualising

ChatGPT und generative KI in der mediävistischen Grundlagenforschung. Saarbrücken. 20.09.2024

Slides: https://chpollin.github.io/GM-DH

Christopher Pollin, Georg Vogeler

Institut für Digitale Geisteswissenschaften, Graz. https://digital-humanities.uni-graz.at

2 of 24

The LLM as an “Auto-Gadamer”?

Basic research in medieval studies means understanding sources, or rather transforming texts from the past into texts that make sense in the researchers' present.

Gadamer and the “fusion of horizons” (Horizontverschmelzung):

  • From the hermeneutic circle to conversation: in engaging with the Other, we revise our prejudices and arrive at new insights.
  • The prompt is a form of engagement with the text corpus that the LLM represents in vector form.
  • Historical scholarship therefore needs good prompts, but which ones are good?

3 of 24

What can an LLM actually do?

ARC Prize is a $1,000,000+ public competition to beat and open source a solution to the ARC-AGI benchmark. https://arcprize.org.

I Won't Be AGI, Until It Can At Least Do This (plus 6 key ways LLMs are being upgraded). https://youtu.be/PeSNEXKxarU?si=pqkqcbrAHa58W1jg.

Francois Chollet - LLMs won’t lead to AGI - $1,000,000 Prize to find true solution. https://youtu.be/UakqL6Pj9xo?si=f8f_GKJX1nOQUmoW.

A new initiative for developing third-party model evaluations. Anthropic. https://www.anthropic.com/news/a-new-initiative-for-developing-third-party-model-evaluations?s=09

Dr. Chollet. General Intelligence: Define it, measure it, build it. https://youtu.be/nL9jEy99Nh0?si=lY_FfS4aYRY6b2kX

4 of 24

Scaling

Emergence (?!)

Jin, Charles, and Martin Rinard. 2024. “Emergent Representations of Program Semantics in Language Models Trained on Programs”. arXiv. https://doi.org/10.48550/arXiv.2305.11169.

Bubeck, Sébastien, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, et al. 2023. “Sparks of Artificial General Intelligence: Early experiments with GPT-4”. arXiv. https://doi.org/10.48550/arXiv.2303.12712.

Lu, Sheng, Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, and Iryna Gurevych. 2024. “Are Emergent Abilities in Large Language Models just In-Context Learning?” arXiv. https://doi.org/10.48550/arXiv.2309.01809.

5 of 24

Memorization vs. Reasoning in LLMs

Anthropic CEO Dario Amodei on AI's Moat, Risk, and SB 1047. https://youtu.be/7xij6SoCClI?si=U-e7MN5P9ZDa7vkx

Jin, Charles, and Martin Rinard. 2024. “Emergent Representations of Program Semantics in Language Models Trained on Programs”. arXiv. https://doi.org/10.48550/arXiv.2305.11169.

Lu, Sheng, Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, and Iryna Gurevych. 2024. “Are Emergent Abilities in Large Language Models just In-Context Learning?” arXiv. https://doi.org/10.48550/arXiv.2309.01809.

https://idw-online.de/de/news838047

Visualising

Extracting & Transforming

Modelling

Memorization of “programs” (processing patterns)

and “meta-programs”

Thesis for Digital History:

  • Memorization is sufficient for many “routine tasks”
  • Less suited to “innovative knowledge work”

What really requires innovation and genuine reasoning?

Prompt Engineering

6 of 24

Tax Rolls of Medieval Paris (1296-1313)

Cest le livre de la Taille des dis

mile livres deuz au roy nostre

sires pour la chevalerie Le Roy

de Navarre son ainz ne filz assise en la meson

Estienne barbete en greve. par Jehan barbete .

Jacques bourdon . Jehan le queu orfevre . Vi[n]ce[n]t

le poisson[n]ier de mer . Jehan de monterueil tesser

ant . Thomas de Noisy vinetier . Gerart Go

defroy espicier . Jehan maillart changeeur . Symo[n]

de saint benoit drapier . Guill[aum]e de trie . pelletier .

Symon tybert bouchier . Nicolas Arrode . Symo[n]

de chatou mercier . Robert de Linays courraier .

Evroin ligier talmelier . et Guill[aum]e franquein

sellier . Lan de grace. mil. trois cenz et troize.

La premiere queste. S[aint] Germain laucerrois

si com[m]ance de la porte saint honore de hors le mur jusques aus avugles .

Estienne queue levee cavatier xviii.d p

Robert de fresviau regratier xviii.d p

Perronelle poree xviii.d. p

Thierrion de la fauverte

Jehan le cras xv.s. p

Jehan brecart . ii.s. p

Rogier piquet et son gendre. xviii.s p x s.

Geffroy de champiaus . xxiiii.s. p.

Jehan de chaumont . ix.s

7 of 24

Prompt Generator: Domain-Specific Prompt Adaptation

[...]

1. Carefully read through the transcription and identify all entries that represent economic transactions or provide context for these transactions. Economic transactions typically include:

- Names of individuals or businesses

- Occupations or professions

- Monetary amounts (often denoted with 's.' for sous, 'd.' for deniers, or 'lb.' for livres)

- Locations or addresses

2. For each identified entry, extract the following information (if available):

- Name of the individual or business

- Occupation or profession

- Amount of tax or payment

- Location or address

- Any additional context information (e.g., familial relationships, property descriptions)

3. If you encounter unclear or ambiguous entries, make your best interpretation based on the context and patterns in the document. If you're unsure about a particular entry, indicate this in your extraction.

4. Format each extracted entry as follows:

<entry>

<name>[Name of individual or business]</name>

<occupation>[Occupation or profession, if available]</occupation>

<amount>[Monetary amount, including currency]</amount>

<location>[Location or address, if available]</location>

<context>[Any additional relevant information]</context>

</entry>

5. If the transcription includes any introductory or concluding text that provides context for the entire document, extract this information separately and include it at the beginning of your response.

6. After extracting all relevant information, provide a brief summary of the key patterns or insights you've observed in the data.

Present your findings in the following format:

<extraction>

[Include any overall context information here]

[List all extracted entries here, formatted as specified above]

<summary>

[Provide a brief summary of key patterns or insights]

</summary>

</extraction>

Remember to pay close attention to details, as medieval documents often contain abbreviations, unusual spellings, or context-specific information that may require careful interpretation.
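The `<entry>` format requested in the prompt above lends itself to simple post-processing. The following is a minimal sketch, assuming the model returns text in exactly that shape; the function name `parse_entries` and the sample response are hypothetical (the sample is hand-written from lines of the transcription, not real model output). Regular expressions are used instead of an XML parser because model output is not guaranteed to be well-formed XML.

```python
import re

# One <entry>...</entry> block, and the five fields named in the prompt.
ENTRY_RE = re.compile(r"<entry>(.*?)</entry>", re.DOTALL)
FIELD_RE = re.compile(
    r"<(name|occupation|amount|location|context)>(.*?)</\1>", re.DOTALL
)


def parse_entries(response: str) -> list[dict[str, str]]:
    """Turn the tagged model response into a list of dictionaries.

    Fields that the model omitted are simply absent from the dictionary.
    """
    entries = []
    for block in ENTRY_RE.findall(response):
        fields = {tag: value.strip() for tag, value in FIELD_RE.findall(block)}
        entries.append(fields)
    return entries


# Hand-written illustration of a response (an assumption, not real output).
sample_response = """
<extraction>
<entry>
<name>Estienne queue levee</name>
<occupation>cavatier</occupation>
<amount>xviii.d p</amount>
<location>S[aint] Germain laucerrois</location>
<context></context>
</entry>
<entry>
<name>Jehan le cras</name>
<amount>xv.s. p</amount>
</entry>
<summary>Two taxpayers.</summary>
</extraction>
"""

entries = parse_entries(sample_response)
for entry in entries:
    print(entry)
```

Keeping missing fields absent, rather than filling them with guesses, preserves the distinction the prompt itself draws between "if available" and uncertain readings.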

8 of 24

Domain-specific prompt for extracting transactions from the Tax Rolls of Medieval Paris (1296-1313)

9 of 24

Context Creation & Iterations

Re-read the transcription and check again! List and explain what is not in the transcription.
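The same check can be approximated mechanically before asking the model to re-read. A minimal sketch, assuming extracted values should appear literally in the transcription once case, whitespace, and editorial brackets are normalized; the helper names are hypothetical, and "Pierre le boulanger" is an invented value used only to show what a hallucinated entry would look like.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, drop editorial brackets, and collapse whitespace."""
    text = text.lower().replace("[", "").replace("]", "")
    return re.sub(r"\s+", " ", text).strip()


def not_in_transcription(values: list[str], transcription: str) -> list[str]:
    """Return the extracted values that cannot be found in the source text."""
    haystack = normalize(transcription)
    return [value for value in values if normalize(value) not in haystack]


transcription = """Estienne queue levee cavatier xviii.d p
Robert de fresviau regratier xviii.d p
Perronelle poree xviii.d. p"""

# The last name is invented for illustration; it is not in the source.
extracted_names = ["Estienne queue levee", "Robert de fresviau", "Pierre le boulanger"]

missing = not_in_transcription(extracted_names, transcription)
print(missing)
```

A literal substring test only flags candidates; normalized spellings or expanded abbreviations produced by the model would also be reported and still need a human decision.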

Wang, Xuezhi, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, and Denny Zhou. 2023. “Self-Consistency Improves Chain of Thought Reasoning in Language Models”. arXiv. https://doi.org/10.48550/arXiv.2203.11171.

Wei, Jason, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, and Denny Zhou. 2023. “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models”. arXiv. https://doi.org/10.48550/arXiv.2201.11903.

10 of 24

PRISM: Parameterized Recursive Insight Synthesis Matrix (by Christopher Pollin)

# PRISM: Parameterized Recursive Insight Synthesis Matrix

You're an AI using the PRISM problem-solving method. For each task:

1. **Analyze**

- Identify objectives, constraints, resources

- Restate problem concisely

- Consider potential sub-problems for recursive analysis

2. **Parameterize**

- Set: Thinking Type, Focus Area, Depth, Timeframe

- Justify choices briefly

- Adjust parameters for sub-problems as needed

3. **Matrix Creation**

| Step | Description | Considerations | Outcomes | Branches | Rating | Convergence |
|------|-------------|----------------|----------|----------|--------|-------------|
| 1 | | | | T1.1 | [1-5] | |
| | | | | T1.2 | [1-5] | |
| | | | | T1.3 | [1-5] | |
| ... | | | | ... | ... | |

- Break problem into steps, identifying recursive sub-problems

- For each: describe, consider, predict, branch (2-3 thoughts), rate, converge

- Rating scale: 1 (Poor) to 5 (Excellent), based on relevance, feasibility, and potential impact

- For sub-problems, create nested matrices as needed

4. **Synthesize**

- Integrate insights from all levels of analysis

- Emphasize highest-rated thoughts and their interconnections

- Recommend solutions, addressing both main problem and sub-problems

- Identify uncertainties and potential areas for further exploration

Guidelines: Clear, concise, use Markdown, adapt to task complexity, explain if asked.

Start responses with: "Applying PRISM Method to [task]..."

Interactive Commands:

1. `/deepdive [topic]`: Initiate a Q&A session on [topic] with follow-up questions

2. `/compress`: Summarize current analysis in 3 key points

3. `/iterate`: Perform another cycle of analysis, incorporating new insights
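The matrix and the "emphasize highest-rated thoughts" rule of the Synthesize step can be pictured as a small data structure. This is an illustrative sketch only, not part of the PRISM prompt: the class names `Branch` and `Step`, the function `synthesize`, and the example thoughts are all assumptions, and in practice the model fills in and weighs the matrix in natural language.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    label: str    # e.g. "T1.1", as in the Branches column
    thought: str
    rating: int   # 1 (poor) to 5 (excellent), as in the Rating column

    def __post_init__(self) -> None:
        if not 1 <= self.rating <= 5:
            raise ValueError(f"rating must be 1-5, got {self.rating}")


@dataclass
class Step:
    number: int
    description: str
    branches: list[Branch] = field(default_factory=list)

    def converge(self) -> Branch:
        """Return the highest-rated branch (the first one wins ties)."""
        return max(self.branches, key=lambda branch: branch.rating)


def synthesize(steps: list[Step]) -> list[str]:
    """Summarize the best branch of every step."""
    summary = []
    for step in steps:
        best = step.converge()
        summary.append(f"Step {step.number}: {best.label} ({best.rating}/5)")
    return summary


# Hypothetical content for a single step.
steps = [
    Step(1, "Identify core classes", [
        Branch("T1.1", "Model taxpayers as persons", 4),
        Branch("T1.2", "Model entries as events", 5),
        Branch("T1.3", "Model only amounts", 2),
    ]),
]
print(synthesize(steps))
```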

11 of 24

PRISM & Ontology Engineering

<TaxRollofMedievalParis>

Cest le livre de la Taille des dis

mile livres deuz au roy nostre

sires pour la chevalerie Le Roy

de Navarre son ainz ne filz assise en la meson

Estienne barbete en greve. par Jehan barbete .

Jacques bourdon . Jehan le queu orfevre . Vi[n]ce[n]t

le poisson[n]ier de mer . Jehan de monterueil tesser

ant . Thomas de Noisy vinetier . Gerart Go

defroy espicier . Jehan maillart changeeur . Symo[n]

de saint benoit drapier . Guill[aum]e de trie . pelletier .

Symon tybert bouchier . Nicolas Arrode . Symo[n]

de chatou mercier . Robert de Linays courraier .

Evroin ligier talmelier . et Guill[aum]e franquein

sellier . Lan de grace. mil. trois cenz et troize.

La premiere queste. S[aint] Germain laucerrois

si com[m]ance de la porte saint honore de hors le mur jusques aus avugles .

Estienne queue levee cavatier xviii.d p

Robert de fresviau regratier xviii.d p

Perronelle poree xviii.d. p

Thierrion de la fauverte

Jehan le cras xv.s. p

Jehan brecart . ii.s. p

Rogier piquet et son gendre. xviii.s p x s.

Geffroy de champiaus . xxiiii.s. p.

Jehan de chaumont . ix.s

</TaxRollofMedievalParis>

Create an ontology engineering thinking matrix. this is very important for my career! be very detailed and deal with historical information.
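To make concrete what such an ontology has to capture, here is a hand-written sketch of a very small data model for the entries above. It is not the matrix or the ontology the model produced; the class names `Taxpayer` and `Assessment` are hypothetical. The only domain facts used are the standard ratios of the livre/sou/denier system (12 deniers = 1 sou, 20 sous = 1 livre) and additive Roman numerals as they appear in the source (for example `xxiiii`).

```python
from dataclasses import dataclass
from typing import Optional

ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "m": 1000}
# 12 deniers = 1 sou, 20 sous = 1 livre (so 240 deniers = 1 livre).
DENIERS_PER_UNIT = {"d": 1, "s": 12, "lb": 240}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral, including additive forms such as 'xxiiii'."""
    values = [ROMAN[ch] for ch in numeral.lower()]
    total = 0
    for i, value in enumerate(values):
        # Subtract when a smaller numeral precedes a larger one (e.g. "ix").
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


@dataclass(frozen=True)
class Taxpayer:
    name: str
    occupation: Optional[str] = None


@dataclass(frozen=True)
class Assessment:
    taxpayer: Taxpayer
    district: str   # the "queste" under which the entry is listed
    numeral: str    # amount as written in the source
    unit: str       # "d", "s", or "lb"

    @property
    def deniers(self) -> int:
        """Amount normalized to deniers for comparison across entries."""
        return roman_to_int(self.numeral) * DENIERS_PER_UNIT[self.unit]


district = "S[aint] Germain laucerrois"
a1 = Assessment(Taxpayer("Estienne queue levee", "cavatier"), district, "xviii", "d")
a2 = Assessment(Taxpayer("Jehan le cras"), district, "xv", "s")
a3 = Assessment(Taxpayer("Geffroy de champiaus"), district, "xxiiii", "s")
print(a1.deniers, a2.deniers, a3.deniers)
```

Storing the numeral as written next to the normalized value keeps the source reading recoverable, which matters when a transcription is later corrected.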

12 of 24

PRISM & Ontology Engineering: 3. Tree of Thought Matrix

13 of 24

PRISM & Ontology Engineering: 4. Synthesize

14 of 24

OpenAI’s o1-preview

Learning to Reason with LLMs. https://openai.com/index/learning-to-reason-with-llms

OpenAI Releases GPT Strawberry 🍓 Intelligence Explosion!. https://www.youtube.com/watch?v=NbzdCLkFFSk

Something New: On OpenAI's "Strawberry" and Reasoning. https://www.oneusefulthing.org/p/something-new-on-openais-strawberry

Explaining OpenAI's o1 Reasoning Models. https://www.youtube.com/watch?v=jrA47yocyV0

Scaling: The State of Play in AI. https://www.oneusefulthing.org/p/scaling-the-state-of-play-in-ai

o1 - What is Going On? Why o1 is a 3rd Paradigm of Model + 10 Things You Might Not Know. https://youtu.be/KKF7kL0pGc4?si=xEI9FkEek6QdeCAu

Can ChatGPT o1-preview Solve PhD-level Physics Textbook Problems?. https://www.youtube.com/watch?v=a8QvnIAGjPA

ChatGPT o1 preview + mini Wrote My PhD Code in 1 Hour*—What Took Me ~1 Year. https://youtu.be/M9YOO7N5jF8?si=-lYWaQ1LvgmzHnHQ

15 of 24

o1-preview

“Thinking”

16 of 24

95% o1-preview with a little prompting: a web app showing an interactive treemap of all objects in the “Stuben” (parlours) of castles (https://realonline.imareal.sbg.ac.at); the input is a SPARQL query result in JSON.
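The data-preparation step behind such a treemap can be sketched as follows. This is not the code that o1-preview generated; it assumes the result follows the W3C SPARQL 1.1 Query Results JSON format (`results.bindings`, each variable bound to an object with a `value` key). The variable names `room` and `object` and the sample rows are invented for illustration and do not come from REALonline.

```python
import json
from collections import defaultdict


def bindings_to_treemap(sparql_json: str, group_var: str, item_var: str) -> dict:
    """Count items per group and nest them as name/children/value nodes."""
    data = json.loads(sparql_json)
    counts: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for row in data["results"]["bindings"]:
        group = row[group_var]["value"]
        item = row[item_var]["value"]
        counts[group][item] += 1
    return {
        "name": "root",
        "children": [
            {
                "name": group,
                "children": [
                    {"name": item, "value": n} for item, n in sorted(items.items())
                ],
            }
            for group, items in sorted(counts.items())
        ],
    }


# Invented sample in SPARQL JSON results shape (assumption, not real data).
sample = json.dumps({
    "head": {"vars": ["room", "object"]},
    "results": {"bindings": [
        {"room": {"type": "literal", "value": "Stube A"},
         "object": {"type": "literal", "value": "table"}},
        {"room": {"type": "literal", "value": "Stube A"},
         "object": {"type": "literal", "value": "bench"}},
        {"room": {"type": "literal", "value": "Stube A"},
         "object": {"type": "literal", "value": "table"}},
    ]},
})

tree = bindings_to_treemap(sample, "room", "object")
print(json.dumps(tree, indent=2))
```

A nested name/children/value structure of this kind is what many treemap components accept as input, so the front end only has to render it.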

17 of 24

Major Implications for Digital History

Development continues unchecked and has never slowed down. Only some topics are less visible.

Modelling: merging domains (ontology alignment)

Simulation: transforming “source-oriented” into “process-oriented” information systems

Virtualisation of data structures for historical information

Reasoning: exploring multiple research paths

AI is an affront (Kränkung). We see our work (partly) done for us. That is unsettling (for me at least)!

... and it is still scaling; we are continually confronted with an enormous engineering effort that keeps mitigating many weaknesses of LLMs (RAG, context window, o1, ...)

18 of 24

Appendix

19 of 24

OpenAI o1-preview: Thinking

Tax Rolls of Medieval Paris 1296:
´´´
Ci est le livre de la taille des cent mille livres tornois a recevoir pour la quarte annee La p[re]miere queste S[aint] G[er]main lauceurrois faite p[ar] Rogier Piquet Robert de la Porte Adam le Potier Rogier le Concierge et Nicholas de Periers Premierem[en]t de la porte s[aint] honore jusques aus avugles et puis tout ce qui est de la p[ar]oisse dehors les murs Robert de la porte cordoanier viii s p Jehan de gonnesse cavetier vii s p Guill[aume] le mareschal xii s p Guill[aume] richart tav[er]nier xii s p Robert le regratier xiiii s p Mestre Alain le bocu xvi s Garnier de chaumont xxxvi s p Rogier piquet talemelier xvi s p Estienne dechaufour marchaant de bues xxxvi s p Guillaume culpercie cavetier vii s p Pierre de ville neuve tav[er]nier xiiii s p Nicholas de louveciennes xv s Jehan lirais tav[er]nier x s p P[er]renele la popine et sa fille xviii s Remi des napes xxiiii s Jehan de congnieres xxiiii s Guerin le macon x s p Phelippe le chandelier viii s p Jehanne la couturiere du cimetire Nichole sa fille vii s Richart le charretier x s p Adam le potier xii s p Jehan le mesag[ier] regratier viii s p Dame Jehanne arrode Jehanot son fuiz vi L Gervese le couturier vi s p Jehan le cordier xii s p Girart le piquart regratier delez les avugles viii s p Guiot le breton tav[er]nier vi s p La Rue de Richebourc le coste dev[ers] la porte s[aint] honore Estienne paon vii s . Mahi de caus mesag[ier] mons[eigneur] Charles vii s . Maheut la normande vii s . Le mareschal a la fame mons[eigneur] Charles vii s . p.
´´´
Extract all entities in the historical source. Add all extracted data to a csv.
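For comparison with the zero-shot LLM extraction, a rule-based baseline for the same task might look like the sketch below. It is an assumption-laden illustration, not what o1-preview did: it treats every amount (a lowercase Roman numeral, a unit `s`, `d`, or `L`, and an optional trailing `p`) as the end of an entry, and it leaves the text before the amount unsplit. The function names and the column names are hypothetical; the sample string contains a few entries taken from the prompt above.

```python
import csv
import io
import re

# Roman numeral, unit, optional "p": used here as the entry terminator.
AMOUNT_RE = re.compile(r"\b([ivxlcm]+)\s+(s|d|L)\b(?:\s+p\b)?")
FIELDNAMES = ["label", "numeral", "unit", "has_p"]


def split_entries(text: str) -> list[dict[str, str]]:
    """Cut the running text into rows, one per amount found."""
    rows = []
    start = 0
    for match in AMOUNT_RE.finditer(text):
        rows.append({
            "label": text[start:match.start()].strip(" ."),
            "numeral": match.group(1),
            "unit": match.group(2),
            "has_p": "yes" if match.group(0).endswith("p") else "no",
        })
        start = match.end()
    return rows


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize the rows with the standard csv module."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = (
    "Robert de la porte cordoanier viii s p "
    "Jehan de gonnesse cavetier vii s p "
    "Mestre Alain le bocu xvi s "
    "Garnier de chaumont xxxvi s p"
)
rows = split_entries(sample)
print(to_csv(rows))
```

The baseline cannot separate name from occupation, merges the heading text into the first label, and misses entries without an amount; these are the interpretive steps the slide delegates to the model.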

20 of 24

Zero-Shot & Almost No Prompt Engineering

Follow-Up Prompt & 2nd Iteration

How can the model and CSV be improved to better represent the historical information in detail?

21 of 24

Follow-Up Prompt

22 of 24

Big Context Window

Have you ever had ALL the papers on a topic in the context window at the same time?

Gemini 1.5 Experimental 0827 with a 2M-token context window (https://ai.google.dev/aistudio)

23 of 24

Create a comparative literature review using a comparative matrix of all the documents by Prof. Thaller. Use the following header for the table: title | abstract | what is historical information | how can historical information be modelled.

If you cannot find relevant information, say so.

You are professor in modeling historical information in historical sources. extract the narrative of all papers.

24 of 24

Modelling: example using the 2M-token context window of gemini-1.5-pro-exp-0801 with the account books (Rechnungsbücher) of Aldersbach Abbey