Language Modeling is fundamental to NLP
Models: BERT, GPT-2, RoBERTa, T5, …
Language Model: "I love to go ___" → "hiking"
LM Pretraining → Target Tasks: Translation ("me gustaría ir de excursión"), Sentiment, Assistants, …
Ecological Fallacy
Individual observations that are part of a group are treated as independent.
Robinson, 1950 (American Sociological Association)
Motivation: Ecological Fallacy in Language Modeling
I spend my weekends hiking.
I love the serenity of the mountains.
I could watch anime all day…!!
Hiking is the best
Yeah, right -_-
Did you watch Haikyuu!!
Input Text Sequences.
Computes loss on independent text sequences.
Text sequences written by the same author (part of a group).
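The contrast between the two views can be sketched in a few lines of Python. The author IDs, timestamps, and the helper names `independent_sequences` and `group_by_author` are assumptions made for illustration (the slide does not say which message came from which author); only the message texts come from the slide.

```python
from collections import defaultdict

# Hypothetical author IDs and timestamps; only the texts come from the slide.
messages = [
    {"author": "user_a", "time": 1, "text": "I spend my weekends hiking."},
    {"author": "user_b", "time": 2, "text": "I could watch anime all day…!!"},
    {"author": "user_a", "time": 3, "text": "I love the serenity of the mountains."},
    {"author": "user_b", "time": 4, "text": "Yeah, right -_-"},
    {"author": "user_a", "time": 5, "text": "Hiking is the best"},
    {"author": "user_b", "time": 6, "text": "Did you watch Haikyuu!!"},
]

def independent_sequences(msgs):
    """Standard LM view: every message is its own training sequence."""
    return [[m["text"]] for m in msgs]

def group_by_author(msgs):
    """Author-aware view: one temporally ordered list of texts per author."""
    grouped = defaultdict(list)
    for m in sorted(msgs, key=lambda m: m["time"]):
        grouped[m["author"]].append(m["text"])
    return dict(grouped)

print(len(independent_sequences(messages)))  # number of unrelated sequences
for author, texts in group_by_author(messages).items():
    print(author, texts)
```

In the first view a loss would be computed on each message in isolation. In the second, each author's messages stay together in time order, so a model could in principle condition on what the same person wrote earlier.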
Motivation: Ecological Fallacy in Language Modeling
Large Human Language Models
Oral Session on June 18, 9:00-10:30am (queued at the end)
Human Language Modeling (HuLM): User State
(Washington Outsider, 2014)
Human states are somewhat stable but also change over time.
Commitment. Maybe anxious about new beginnings.
Carefree. Living in the moment.
Condition on a dynamic user state
Latent variable capturing the distribution of human states over time through the user’s language
Soni et al., 2022
Human Language Modeling (HuLM): Problem Definition
HuLM Paper Code
Soni et al., 2022
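One way to write the task down, offered as a sketch rather than the authors' exact statement: assume $W_t$ is the $t$-th temporally ordered token sequence written by a user, with tokens $w_{t,1}, \dots, w_{t,n}$, and $U_{1:t-1}$ is the user state derived from that user's earlier language. These symbol names are assumptions chosen to match the user-state notation used elsewhere in this deck.

```latex
% Standard language modeling: each sequence W is modeled on its own
\Pr(W) = \prod_{i=1}^{n} \Pr\left(w_i \mid w_{1:i-1}\right)

% Human language modeling: sequence W_t is additionally conditioned on the user state
\Pr\left(W_t \mid U_{t-1}\right) = \prod_{i=1}^{n} \Pr\left(w_{t,i} \mid w_{t,1:i-1},\, U_{1:t-1}\right)
```

The only difference between the two lines is the extra conditioning term, which is what ties a sequence back to the person who wrote it.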
HaRT: Human-aware Recurrent Transformers
[Figure: HaRT architecture. A 12-layer Transformer (Layer 1 to Layer 12) reads temporally ordered input user messages, takes the previous user state $U_{i-1}$ as input, and produces the next user state $U_i$.]
Insert Layer, User-State Based Self-Attention: $Q = W_{QU}^{\top} [H^{(1)}; U_{i-1}]$
Extract Layer, User State Recurrence: $U_i = \tanh(W_U U_{i-1} + W_H H^{(11)})$
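The two formulas on this slide, the user-state based query and the user state recurrence, can be illustrated with a small pure-Python sketch. This is not the HaRT implementation: the helper names (`matvec`, `mean_pool`, `update_user_state`, `user_state_queries`, `rand_matrix`), the random weights, the zero initial state, and the choice to mean-pool token hidden states into one vector before the recurrence are all assumptions made to keep the example self-contained.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mean_pool(H):
    """Average token hidden states (rows of H) into one vector (an assumption)."""
    return [sum(col) / len(H) for col in zip(*H)]

def update_user_state(U_prev, H_extract, W_U, W_H):
    """U_i = tanh(W_U U_{i-1} + W_H h), with h the pooled extract-layer states."""
    h = mean_pool(H_extract)
    return [math.tanh(a + b)
            for a, b in zip(matvec(W_U, U_prev), matvec(W_H, h))]

def user_state_queries(H_prev_layer, U_prev, W_Q_T):
    """One query per token: W_Q_T applied to the concatenation [h_token; U_prev]."""
    return [matvec(W_Q_T, h + U_prev) for h in H_prev_layer]

def rand_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights or hidden states."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d, n_tokens = 4, 3
W_U, W_H = rand_matrix(d, d, rng), rand_matrix(d, d, rng)
W_Q_T = rand_matrix(d, 2 * d, rng)  # maps [h; U] (size 2d) back to size d

U = [0.0] * d  # initial user state, assumed to be zero here
for block in range(2):  # two temporally ordered blocks of messages
    H1 = rand_matrix(n_tokens, d, rng)    # stand-in for layer-1 hidden states
    Q = user_state_queries(H1, U, W_Q_T)  # queries see the previous user state
    H11 = rand_matrix(n_tokens, d, rng)   # stand-in for layer-11 hidden states
    U = update_user_state(U, H11, W_U, W_H)
    print(block, [round(u, 3) for u in U])
```

The loop shows the recurrence: the state produced after one block of a user's messages is what the queries of the next block are conditioned on.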
Soni et al., 2022
Personality
Stance Detection
Colab
Selected Further Reading
Personalized Language Models
Personalized, Application-Focused, and Debiasing Models
Large Human Language Models: A Need and the Challenges
Human Context for/in Dialog Agents
What is Human Context for Dialog Systems?
Personality
I am going to ___.
Modes of communication
Occupation
Demographics
Large Human Language Models
Soni et al., 2024
Human-Level Agent Modeling
PARRY (Kenneth Colby 1972)
Speaker-Addressee Model (Li et al., 2017)
PersonalityChat (Lotfi et al., 2024)
Dialog Agents Understanding the Human
You Impress Me: Dialogue Generation via Mutual Persona Perception, Liu et al., ACL 2020
Psychological Metrics
Human-Centered Metrics for Dialog System Evaluation, Giorgi et al., arXiv 2023
Agents
Dialogues
Turns
Dialog System
Selected Further Readings
Colab
Ethical Considerations
Responsible release strategy.
Careful with profiling and stereotyping.
Unintended harms.
Malicious exploitation, and targeted content without users' consent.
Laws and policies for user privacy and data consent.
More representationally diverse, covering a wider world population.
Motivation: Ecological Fallacy in Language Modeling
Text sequences written by many authors.
Large Human Language Models
Universal Author?