1 of 11

Ethnographic Models:

Parametrizing Simulations from Ethnographic Data Using Generative AI

Michael Fischer

Human Relations Area Files at Yale University

This material is based upon work supported by the National Science Foundation under Grant Number 2024286. HDNS-I: Infrastructure for Knowledge Linkages from Ethnography of World Societies. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Link to this presentation at:

https://ethnolab.hraf.net

2 of 11

Intro

Given the venue and the brief time allowed, rather than report on the research Dr. Sridhar Ravula and I are doing at HRAF on parameterizing simulations using Large Language Models (LLMs), I’ll present some first steps that researchers can apply to their own work, safe from prying eyes if need be.

I will briefly compare initial steps towards a simulation of how marriages are arranged among the Cayapa of Ecuador, using two LLMs: one large, public, and expensive, and one small, private, and free.


3 of 11

AI is not intelligent

  • Artificial Intelligence is not intelligence as people know it.
  • Large Language Models (LLMs) seem ‘intelligent’ because of their capacity to organize, transform, and respond to human language.
  • LLMs are formidable models of language use as found in millions of books and documents over the past few hundred years.
  • LLMs have no access to the original texts they were trained on. What they hold instead are immensely detailed models, ranging from a few million to nearly a trillion parameters.
  • Interacting with an LLM is more or less interacting with millions of authors. If you wrote something more than five years ago, this likely includes you.


4 of 11

Not for quick or automatic solutions

  • Like most software, LLMs can quickly help reach about 90% of a solution.
  • The other 10% takes months to years, if there is a solution.
  • Of course 90% of a solution is a great head start, and for those of us who do not have personal research assistants LLMs can help quite a lot.
  • Like most prior AI that has attempted to perform actual human tasks, LLMs depend on human intelligence to a great extent.
  • Working with an LLM is kind of like a conversation with yourself - all responses by the LLM are initiated by you. And you are likely an author.
  • The LLM largely works by predicting continuations of your language productions, one token (roughly a word or word fragment) at a time.
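
To make the last point concrete, here is a toy sketch of next-word prediction. It is not how an LLM is built: it assumes a simple table of word-pair counts over a made-up sentence in place of a neural network with billions of parameters, and the names `train_bigrams` and `continue_text` are illustrative only. The generation loop nevertheless has the same shape: look at what has been said so far, pick a likely next word, append it, and repeat.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count which word follows which in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def continue_text(follows, prompt, max_words=5):
    """Extend a non-empty prompt one word at a time with the most frequent follower."""
    words = prompt.lower().split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        # most_common(1) returns a one-element list of (word, count)
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Made-up training text, for illustration only.
corpus = "the model predicts the next word and the next word follows the prompt"
table = train_bigrams(corpus)
print(continue_text(table, "the model", max_words=3))
```

Everything the toy can say comes from its training text and from the prompt you give it, which is the sense in which the conversation is initiated by you.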

5 of 11

Example: Identifying elements for a model:

Cayapa Marriage

  • Gemini 2.5 (gemini.google.com – some free versions there as well)
  • Initially we’ll look at using Google’s Gemini 2.5 LLM. This is a paid service, and one has to assume that any data uploaded to Gemini is not secure.
  • Using HRAF I generated a copyright ‘fair use’ sample of text on marriage by searching for OCM identifiers relating to Marriage (580-589) for the Cayapa (OWC: SD06) and saving the results as a text file.
  • The resulting text file can be accessed at sd06_all_marriage, and you are free to use it for your own purposes.
  • The following slides will cover dialogs with LLMs based on this content.

6 of 11

Gemini 2.5

First contact:

the following text is about marriage among the Cayapa people. List what we can infer from the text, including deontic transitions and deontic themes.: [text file submitted see sd06_all_marriage]

Transcript for Gemini

(63 pages)

Gemini:

Here's a breakdown of the inferences we can make from the provided text about Cayapa marriage, including deontic themes and transitions:

**I. General Social Context:**

* **Kinship & Descent:**
    * Bilateral descent: Kinship is traced through both the mother's and father's lines. (sd06-000-000182, sd06-004-001289)
    * Kindred-based: The main kin group is the kindred (relatives on both sides).
    * Shifting Kinship Terminology: There's evidence of a shift in kinship terminology between 1909 and 1960, possibly related to tribal fission and changes in residence. (sd06-004-001195)
    * Iroquois Cousin Terminology: Cross-cousins are equated and parallel cousins are distinguished. (sd06-004-001253)
* **Residence:**
    * Primarily bilocal: Couples tend to live near either the husband's or the wife's family. (sd06-000-000182, sd06-004-001289)
    * Some neolocality: Some couples establish a new, independent residence. (sd06-000-000182, sd06-004-001289)
* **Social Organization & Governance:**
    * Self-governing: The Cayapa largely manage their own affairs. ………..

7 of 11

Gemini Example: Outcomes in a nutshell

Fischer: the following text is about marriage among the Cayapa people. List what we can infer from the text, including deontic transitions and deontic themes.: [text file submitted see sd06_all_marriage]

Fischer: elaborate on Marriage disputes: Conflicts arise over marriage arrangements, choice of partners, adherence to rules.

Fischer: elaborate on Daughter's Age: The mother may feel her daughter is too young. (sd06-004-004412)

Fischer: elaborate on each point in the original analysis

Fischer: give some examples of Violation of norms to punishment:

Fischer: are there any candidate cultural models that have sufficient detail that would benefit from simulation?

Fischer: design a simulation using production rules, where each rule is of the form [Condition,Outcome]. Condition can be a specific event, a state, or during the duration of an action. Outcome is an modal operator, Obliged|Permitted or the negation of these, followed by designation of an action or state. Devise other forms if needed. Do this for the simulation of: Marriage Arrangement Process (with Reluctance & Negotiation)

Fischer: develop an interpreter written in prolog that can execute the production rules listed before.

Fischer: please finish the code

Fischer: show the code for "find_eligible_partner/2"

Fischer: Now redo the simulation in Python instead of Prolog.
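
The rule format requested in the prompts above, [Condition, Outcome] with a modal operator in the outcome, can be sketched in a few lines of Python. This is not the code Gemini produced, and the rules listed are placeholders invented for illustration rather than findings from the Cayapa texts; the names `Rule`, `deontic_status`, and the spelled-out negations `NotObliged` and `NotPermitted` are likewise assumptions.

```python
from dataclasses import dataclass

# Obliged | Permitted and their negations, as named in the prompt.
MODALS = {"Obliged", "Permitted", "NotObliged", "NotPermitted"}


@dataclass(frozen=True)
class Rule:
    condition: str  # an event or state that must currently hold
    modal: str      # one of MODALS
    action: str     # the action or state the modal applies to

    def __post_init__(self):
        if self.modal not in MODALS:
            raise ValueError(f"unknown modal operator: {self.modal}")


def deontic_status(rules, facts):
    """Return {action: modal} for every rule whose condition is in facts.

    Later rules override earlier ones for the same action.
    """
    status = {}
    for rule in rules:
        if rule.condition in facts:
            status[rule.action] = rule.modal
    return status


# Placeholder rules: NOT taken from the transcript or the ethnography.
RULES = [
    Rule("proposal_made", "Obliged", "family_considers_proposal"),
    Rule("proposal_made", "Permitted", "family_negotiates"),
    Rule("family_reluctant", "NotObliged", "family_accepts_proposal"),
    Rule("agreement_reached", "Permitted", "wedding_is_planned"),
]

facts = {"proposal_made", "family_reluctant"}
for action, modal in deontic_status(RULES, facts).items():
    print(f"{modal}: {action}")
```

A fuller interpreter would also let outcomes add or retract facts so that rules can chain, which is where most of the remaining 10% of the work tends to sit.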

8 of 11

Small LLM: Gemma3 (4b)

the following text is about marriage among the Cayapa people. List important specific inferences from the text, especially deontic transitions and deontic themes.

[Text here … see sd06_all_marriage]

Transcript for Gemma3

(19 pages)

Gemma3:

**Overall Themes & Context:**

* **Cayapa Tribe Culture:** The text describes the culture of the Cayapa tribe, a group of indigenous people likely in the region of Ecuador (referenced in some segments).

* **Marriage Customs:** The core of the text revolves around their marriage customs, particularly the “boda” – a celebration involving feasts, dancing, and often, rituals relating to the deceased.

* **Traditional Law & Social Control:** The Cayapa have strict rules regarding consanguineal marriage (marriage between relatives) and divorce. These laws are enforced by tribal officials.

* **Social Dynamics:** The text highlights the complexities of social control, the role of women in the community, and the potential for deviation from established norms.

**Key Observations & Summaries by Section:**

* **Sections 1-4 (Focus on General Culture & Marriage):** These sections establish the basic cultural context. We learn that "boda" celebrations are common, encompassing both feasts and rituals related to the dead. There's a strong emphasis on family ties and the prohibition of marriage between relatives. The existence of divorce is acknowledged, but highly regulated and often leading to exile.

* **Sections 5-8 (The Woman’s Case – A Specific Event):** ……………

9 of 11

Gemma3 Example: Outcomes in a nutshell

Fischer: the following text is about marriage among the Cayapa people. List important specific inferences from the text, especially deontic transitions and deontic themes. [Text appears here … see sd06_all_marriage]

Fischer: create a detailed list of deontic transitions in the text, and the agent roles relating to these transitions.

Fischer: analyze the reasoning behind the transitions.

Fischer: produce a prolog program that states the conclusions, conditions and rules above. The purpose will be to develop a simulation based on production rules.

Fischer: Expand the program with more rules and facts.

Fischer: add some facts and rules relating to arrangement of marriages

Fischer: Introduce variables to track the relationships between families and the state of the marriage proposals.
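
The last prompt asks for variables that track relationships between families and the state of marriage proposals. The dialog with Gemma3 was carried out in Prolog; the following is only a rough Python analogue of the idea, with state names, transitions, and family labels that are hypothetical and not drawn from the Cayapa material.

```python
from dataclasses import dataclass, field

# Hypothetical proposal states and the moves allowed between them.
TRANSITIONS = {
    "proposed": {"negotiating", "declined"},
    "negotiating": {"agreed", "declined"},
    "agreed": set(),
    "declined": set(),
}


@dataclass
class Proposal:
    from_family: str
    to_family: str
    state: str = "proposed"
    history: list = field(default_factory=list)

    def move_to(self, new_state):
        """Advance the proposal, refusing moves the table does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.history.append(self.state)
        self.state = new_state


def family_relations(proposals):
    """Summarise how each pair of families currently stands."""
    return {(p.from_family, p.to_family): p.state for p in proposals}


p = Proposal("family_a", "family_b")
p.move_to("negotiating")
p.move_to("agreed")
print(family_relations([p]))
```

Keeping the allowed transitions in one table makes it easy to compare what the model proposes against what the ethnographic text actually supports.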

10 of 11

Discussion

  1. The large and small models are comparable. Both will help you rapidly reach around 90% of your task.
  2. With both, you have to exercise critical judgement.
  3. The larger one is a bit more forthcoming and the small model more terse, but both will do the same job if you direct them to.
  4. The larger model may be slightly more accurate and write better simulations, though I don’t usually find the difference compelling.
  5. The small model will run on just about any recent laptop or desktop with 8 GB of RAM, at a speed adequate for assisting with scholarship.
  6. The small model is private: your data will go no further than your own computer.
  7. The smaller models will even run, if a bit slowly, on minimal hardware such as a Raspberry Pi …
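
For readers who want to try the private route, here is a minimal sketch of sending a prompt to a model running on your own machine. The slides do not say which runtime hosted Gemma3; this example assumes Ollama serving its HTTP API on the default local port with the `gemma3:4b` model already pulled. The helper names and the `path` argument are illustrative.

```python
import json
import urllib.request

# Ollama's default local endpoint (assumes Ollama is the runtime in use).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_prompt(instruction, source_text):
    """Put the instruction first, then the ethnographic text."""
    return f"{instruction}\n\n{source_text}"


def build_request(prompt, model="gemma3:4b"):
    """Prepare a non-streaming generate request for the local server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="gemma3:4b"):
    """Send the prompt to localhost and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]


def analyse_file(path, instruction):
    """Read a local text file and ask the local model about it."""
    with open(path, encoding="utf-8") as handle:
        return ask_local_model(build_prompt(instruction, handle.read()))
```

Calling `analyse_file` with the path to wherever you saved the sd06_all_marriage text and the instruction from slide 8 should reproduce the first step of the Gemma3 dialog. The request goes only to `localhost`, which is the practical meaning of point 6.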

11 of 11

Thanks!

Michael Fischer

Human Relations Area Files

Yale University

mike@hraf.net

https://ethnolab.hraf.net
