1 of 56

From Deep Agents to Inverse Generative Social Science: On ways to leverage artificial intelligence and vast computational power to reformulate modeling of social phenomena

Ivan Garibay, Ph.D.

University of Central Florida

Talk Prepared for Alphabet Talk Series

Virtual – October 27th, 2020

2 of 56

Agenda

  • Deep Agent
    • Social Sim DARPA grant
    • Desiderata
    • Canvas/team science
  • Inverse Generative Social Science
    • What is it and why
    • Critique of Generative SS
    • Evolutionary Model Discovery
    • Anasazi (Plos One), Schelling segregation (in prep)
  • Social Media Modeling
    • Multi-Action Cascade Model of conversation (CMOT journal)
    • Response Prioritization under information overload (Nature Sci. Rep.)
  • Conclusion
    • Future of Computational Social Science
    • Thanks, collaborators, supporters

Dr. Ivan Garibay, Complex Adaptive Systems Laboratory -https://www.cs.ucf.edu/~garibay/


3 of 56

Deep Agent

A Framework for Information Spread and Evolution in Social Networks


Computational Social Science 2019, Santa Fe.

4 of 56

  • 4 years, ~$65M (we just completed year 3)
  • Goal: Develop technologies for high-fidelity simulation of online social behavior (the spread and evolution of online information) while rigorously testing and measuring simulation accuracy
  • 7 teams compete in challenges: University of Central Florida (UCF), Urbana-Champaign, USC, Virginia Tech, Stanford, Carnegie Mellon, Duke, Notre Dame, Illinois, Rutgers, USF, others
  • UCF received two grants ($12.5M), one of them Deep Agent (Garibay, PI)


5 of 56

Deep Agent: A Framework for Information Spread and Evolution in Social Networks

PI: Ivan Garibay ($6.2M)

Panels: Research Objectives; Objectives, Approach; Impact, Outcomes, Achievements; Representative Figure

  • The Deep Agent Framework combines massively parallel computing, analytics of large datasets, and machine learning to help model designers mix and match sub-models in a semi-automated way, exploring, testing, and validating not one but tens of thousands of models against not a single real-world phenomenon but a large set of target behaviors
  • Agent Zero: Rational, Emotional, Social behaviors + AI/ML
  • Objectives: Develop technologies for high-fidelity simulation of online social behavior (the spread and evolution of online information) while rigorously testing and measuring simulation accuracy
  • Developed two complementary state-of-the-art information diffusion models:
    • Multi-Action Cascade Model
    • Multiplexity-Based Model
  • Successfully demonstrated effectiveness
    • 2nd place (ahead of USC, Stanford, Carnegie Mellon, UVA, USF, Rutgers, others)
    • Real-world data validation: Twitter, YouTube, Reddit, GitHub, Telegram
    • Scenarios: Venezuela (Maduro vs. Guaido), White Helmets of Syria, cyber-crime, cyber warfare, software collaboration
  • 50+ publications; contributed ideas towards Inverse Generative Social Science

6 of 56

Deep Agent: Desiderata

  1. Human-interpretable results: preserve the “glass-box” property of agent-based modeling and seek ontological correspondence
  2. Cognitive architecture for decision making: ortho-rational (emotion), social connectivity, and bounded rational subcomponents (Epstein’s Agent Zero)
  3. Make computational modeling of human behavior more a science than an art via vast exploration of alternative rule systems (universes) in which the social phenomenon under study emerges
  4. Uncover strong causal relationships so that the behavior of the “virtual human” can explain something about the behavior of “real humans”


7 of 56

Social Science is Hard

“Imagine how much harder physics would be if electrons could think.”

- Murray Gell-Mann

“Imagine how much harder physics would be if electrons had feelings.”

- Richard Feynman

“I can calculate the motion of heavenly bodies, but not the madness of people.”

- Isaac Newton

(c) Ivan Garibay - Complex Adaptive Systems Laboratory


8 of 56

Generative Social Sciences... and Two Critiques

Complex systems approach to modeling social phenomena


9 of 56

Generative Social Sciences (GSS)


[Diagram, from micro to macro:]

  • Micromotives (causal mechanisms): generators, gen (set of agent rules)
  • Non-linear, stochastic, dynamical process resulting from agent interactions
  • Macrobehavior (observed emergent phenomena): generated social dynamics, P

10 of 56

First Area for Improvement: Identify generators in a semi-automated, systemic way

  • Critique: Generator design is an ad-hoc process that requires enormous expertise and insight into the problem (more art than science). It is biased by modeler choices.
  • Improvement: Find generative rules in a semi-automated, robust manner by systematically testing a comprehensive set of plausible hypotheses from social science theory and from experimental data.
  • Technology: search over massive search spaces


11 of 56

Second Area for Improvement: Strengthen “weak” causal inference

  • Critique: Finding a generator for a social phenomenon means that we have discovered one way, of potentially infinitely many ways, to generate that phenomenon. This tells us very little about the “actual causal mechanisms” at play.
  • Improvement: Find not one, but a comprehensive set of plausible generators for a given social phenomenon. Analyze this set of generators to postulate “stronger” causal relationships.
  • Technology: search over massive search spaces, tools for characterizing the generator set


12 of 56

Inverse Generative Social Sciences

1st Inverse Generative Social Science Workshop, Washington DC, January 23-25, 2020. Organizers: Josh Epstein, Ivan Garibay, William Rand, Rob Axtell.


13 of 56

Answer to critique 1: Find generative rules in a semi-automated, robust manner

  • What are plausible generators?
  • How are generators represented in the search space?
  • Universal micro-behavior representation language (e.g., Agent Zero rules), or ad-hoc rules for each case?
  • What algorithm to use to perform the search?
  • How to measure error?


14 of 56

Weak Causal Inference

[Figure: Ground Truth vs. Model (= ?); labels: micro, Macro, growth]

15 of 56

Answer to critique 2: Discover stronger causal relationships

[Figure: Ground Truth compared against a set of candidate Models]

  • ML Alg. 1: Minimize error. Search the space of all plausible generators; stop after all optima are found.
  • ML Alg. 2: Characterize the generator set. Is it nearly homogeneous or heterogeneous? Factor salience (e.g., do 100% of generators contain factor X?) and factor importance (contribution towards fitness).
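A minimal sketch of the "characterize the generator set" step, assuming each discovered generator is summarized as a set of factor names plus an error score. The factor names echo the Anasazi case study later in the talk; the error values, the function names, and the simple mean-error gap used as an importance proxy are illustrative assumptions, not the method used in the papers.

```python
from statistics import mean

# Hypothetical generator set: (factors present, RMSE against ground truth; lower is better).
generators = [
    ({"quality", "social_presence"}, 0.8),
    ({"quality", "distance"}, 0.9),
    ({"quality", "social_presence", "dryness"}, 0.7),
    ({"distance", "dryness"}, 1.6),
]

def factor_salience(generators):
    """Fraction of generators in the set that contain each factor."""
    factors = set().union(*(present for present, _ in generators))
    total = len(generators)
    return {f: sum(f in present for present, _ in generators) / total
            for f in sorted(factors)}

def mean_error_gap(generators, factor):
    """Mean RMSE without the factor minus mean RMSE with it (positive: the factor helps)."""
    with_f = [err for present, err in generators if factor in present]
    without_f = [err for present, err in generators if factor not in present]
    if not with_f or not without_f:
        return None  # factor is in all generators or in none, so no contrast exists
    return mean(without_f) - mean(with_f)

salience = factor_salience(generators)
gap_quality = mean_error_gap(generators, "quality")
print(salience, gap_quality)
```

A salience of 1.0 would correspond to the slide's "100% of generators contain factor X" case; the gap is only a crude stand-in for a proper importance measure.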

16 of 56

Inverse Generative Social Sciences

  • Game Theory: Given the players' rules for decision making, calculate the outcome of the game

  • Mechanism Design Theory: Given the desired outcome of the game, what are the rules?

  • GSS: Micro behavioral rules generate observable macro phenomena - “If you didn’t grow it, you didn’t explain it” (Epstein, 1999)

  • iGSS: Given an observable macro phenomenon, what micro-behavior(s) could generate it?
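The forward/inverse contrast can be illustrated with a toy sketch. Everything here is hypothetical: the three candidate rules, the deterministic growth loop (a deliberately tiny stand-in for an agent-based model), and the tolerance. The point is only the shape of the inverse problem: run every candidate generator forward, compare against the observed macro series, and keep all that fit.

```python
import math

# Hypothetical candidate micro-rules: each maps the current state to its next increment.
CANDIDATE_RULES = {
    "constant": lambda x: 5.0,
    "proportional": lambda x: 0.3 * x,
    "logistic": lambda x: 0.3 * x * (1 - x / 100.0),
}

def grow(rule, x0=10.0, steps=30):
    """Forward (generative) direction: apply the micro-rule and return the macro series."""
    series = [x0]
    for _ in range(steps):
        series.append(series[-1] + rule(series[-1]))
    return series

def rmse(a, b):
    """Root mean squared error between two equally long series."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))

def inverse_search(target, tolerance=1.0):
    """Inverse direction: score every candidate rule against the observed macro series."""
    errors = {name: rmse(grow(rule), target) for name, rule in CANDIDATE_RULES.items()}
    ranked = sorted(errors.items(), key=lambda kv: kv[1])
    plausible = [name for name, err in ranked if err <= tolerance]
    return errors, plausible

# Pretend the "observed" macro pattern was produced by the logistic rule.
observed = grow(CANDIDATE_RULES["logistic"])
errors, plausible = inverse_search(observed)
print(plausible)
```

Returning every rule within tolerance, rather than only the single best, mirrors the talk's emphasis on a set of plausible generators.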


17 of 56

Interest: To develop a computational methodology to correctly identify and characterize the causes of emergent behaviors in complex social systems in a fully human-interpretable way


18 of 56

A First Simple Implementation: Factors

How to identify candidate factors methodically?


19 of 56


Human-interpretable causal factors are selected from social science: cognitive, social, behavioral, moral, economic, psychological candidate theories

20 of 56

No need to find a perfect model, just the right ingredients to bootstrap search


Agent-Based Modeling Canvas Output: Factors/Grammar


21 of 56

Scaffolding for Interdisciplinary Modeling


Agent-Based Modeling Canvas


22 of 56

A First Simple Implementation: Evolutionary Model Discovery

Searching the space of potential theories that explain a complex social phenomenon and conducting factor importance analysis to identify stronger causal relationships


23 of 56


24 of 56

Evolutionary Model Discovery, part 1: Genetic Programming

  • Select a single sub-model to explore
  • Search space representation consists of “factors” (from Canvas) on a factor tree (Paul Davis)
  • GP native tree representation
  • Subcomponents (Canvas hypotheses) originate from
    • Social Theories
    • Empirical Data (analytics, characterization of data, clustering, etc.)


Area 1: Find generative rules in a semi-automated, robust manner
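One way to picture the tree representation is the sketch below. It is not a full genetic programming system (there is no crossover, selection, or fitness loop), and the factor primitives, plot attributes, and operators are hypothetical. It only shows how factors can sit on a tree that scores candidate choices, and how the set of factors present in a rule can be read back off the tree.

```python
import random

# Hypothetical factor primitives: each scores a candidate farm plot (a dict of attributes).
FACTORS = {
    "quality": lambda plot: plot["quality"],
    "social_presence": lambda plot: plot["neighbors"],
    "distance": lambda plot: -plot["distance"],
}
OPERATORS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b}

def random_tree(rng, depth=2):
    """Grow a random rule: leaves are factor names, internal nodes are (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(sorted(FACTORS))
    op = rng.choice(sorted(OPERATORS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, plot):
    """Score one plot under a rule tree."""
    if isinstance(tree, str):
        return FACTORS[tree](plot)
    op, left, right = tree
    return OPERATORS[op](evaluate(left, plot), evaluate(right, plot))

def factors_in(tree):
    """Set of factors present in a rule; later importance analysis consumes this."""
    if isinstance(tree, str):
        return {tree}
    return factors_in(tree[1]) | factors_in(tree[2])

def choose_plot(tree, plots):
    """An agent picks the highest-scoring plot under its rule."""
    return max(plots, key=lambda p: evaluate(tree, p))

rule = ("+", "quality", "social_presence")
plots = [
    {"quality": 0.9, "neighbors": 0, "distance": 5},
    {"quality": 0.6, "neighbors": 2, "distance": 1},
]
chosen = choose_plot(rule, plots)
sampled = random_tree(random.Random(0))
```

Because every leaf is a named, human-readable factor, any rule the search returns can still be read as a sentence about agent behavior.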

25 of 56

Evolutionary Model Discovery, part 2: Random Forest Regressor

  • Set of all plausible generators: approximated by best individuals of n (parameter) Genetic Programming runs
  • Pick one causal feature: “factor importance”
  • Pick an algorithm: random forest regressor analysis
    • Trained on factor presence to predict GP fitness data
  • Use existing “feature” importance techniques: Gini importance, permutation accuracy importance, joint contribution


Area 2: discover stronger causal relationships
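Permutation importance on factor-presence data can be sketched in plain Python. Note the substitution: to stay dependency-free, this sketch uses a lookup table of mean fitness per presence pattern as the surrogate model, not a random forest regressor. The data, the three-factor layout, and all function names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean

def fit_lookup(X, y):
    """Stand-in surrogate (NOT a random forest): mean fitness per presence pattern."""
    groups = defaultdict(list)
    for row, target in zip(X, y):
        groups[tuple(row)].append(target)
    table = {pattern: mean(values) for pattern, values in groups.items()}
    default = mean(y)
    return lambda row: table.get(tuple(row), default)

def mse(model, X, y):
    """Mean squared error of the surrogate on (X, y)."""
    return mean([(model(row) - target) ** 2 for row, target in zip(X, y)])

def permutation_importance(model, X, y, rng, repeats=20):
    """Average increase in MSE when one factor's presence column is shuffled."""
    base = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        gains = []
        for _ in range(repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + (c,) + row[j + 1:] for row, c in zip(X, column)]
            gains.append(mse(model, shuffled, y) - base)
        importances.append(mean(gains))
    return importances

# Hypothetical data: presence (0/1) of three factors in each GP individual, and its RMSE.
X = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 10
y = [10.0 - 5.0 * a - 1.0 * b for a, b, c in X]

model = fit_lookup(X, y)
importance = permutation_importance(model, X, y, random.Random(0))
print(importance)
```

In this toy data the third factor never affects fitness, so shuffling it changes nothing, while the first factor dominates.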

26 of 56

Case Study 1: Socio-Agricultural Behavior of the ancestral Pueblo

What socio-agricultural factors might have led to the sudden demise of a flourishing ancient civilization?


27 of 56


28 of 56

Artificial Anasazi

  • Archaeological agent-based model of the ancestral Kayenta Pueblo occupying the Long House Valley, Arizona

  • One of the most studied agent-based models
    • Originally developed by (Dean et al., 2000) (Axtell et al., 2002)
    • Further calibration efforts by (Janssen, 2009) (Stonedahl, 2010)
    • Available on OpenABM: https://www.openabm.org/model/2222/version/2/view and in the NetLogo model library (Stonedahl & Wilensky, 2010)

  • The ancestral Pueblo occupied the Long House Valley, Arizona from around 1800 BC until an exodus around 1300 AD

  • The model simulates the period between 800 AD and 1350 AD

29 of 56

Focus on sub-model for farm plot selection

30 of 56

Hypothesized factors for farm plot selection

Full information

Family inherited information

Nearest-neighbor information

Best performers information

Comparison of quality

Comparison of dryness

Comparison of yield

Water availability

Comparison of distance (orig.)

Homophily by age

Homophily by agricultural productivity

Social presence

Fleeing/migration

31 of 56


32 of 56

[Figure: factor labels: Comparison of quality; Social presence; Comparison of distance; Fleeing/migration; Comparison of yield; Water availability; Homophily by age; Homophily by agricultural productivity; Comparison of dryness]

33 of 56


34 of 56

Optimal presence scores for factors with highest importance. P-values of one-tailed Mann-Whitney U tests for the alternative hypothesis RMSE for presence A < RMSE for presence B (null hypothesis: RMSE for presence A = RMSE for presence B) at α = 0.05. Green cells indicate support for the alternative hypothesis.
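For readers unfamiliar with the test named in the caption, here is a minimal sketch of a one-tailed Mann-Whitney U test using the normal approximation with tie and continuity corrections. The function name and the sample RMSE values are hypothetical, and for small samples an exact test from a statistics library would be preferable.

```python
import math
from collections import Counter

def mann_whitney_less(a, b):
    """One-tailed test of H1: values in a tend to be smaller than values in b.

    Returns (U, p). U counts pairs where a's value exceeds b's (ties count one half),
    so a small U supports H1. The p-value uses the normal approximation.
    """
    n1, n2 = len(a), len(b)
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    n = n1 + n2
    # Tie correction: sum of t^3 - t over groups of equal values in the pooled sample.
    ties = sum(t ** 3 - t for t in Counter(list(a) + list(b)).values())
    sigma = math.sqrt(n1 * n2 / 12.0 * ((n + 1) - ties / (n * (n - 1))))
    if sigma == 0:
        return u, 1.0  # all values identical: no evidence either way
    z = (u + 0.5 - n1 * n2 / 2.0) / sigma  # +0.5 is the continuity correction
    p = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return u, p

# Hypothetical RMSE samples for two presence levels of one factor.
rmse_presence_a = [1, 2, 3, 4, 5, 6, 7, 8]
rmse_presence_b = [9, 10, 11, 12, 13, 14, 15, 16]
u_stat, p_value = mann_whitney_less(rmse_presence_a, rmse_presence_b)
print(u_stat, p_value, p_value < 0.05)
```

A p-value below α = 0.05 would correspond to a green cell in the table described by the caption.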

35 of 56

100 Runs of Artificial Anasazi with farm plot selection rules inferred through Evolutionary Model Discovery with randomized parameter initialization

36 of 56

Human Interpretable Mechanistic Explanation

“Upon failure of a farm plot, the ancestral Pueblo households of the Long House valley, were likely to consider the whole valley in search of new land to farm on, preferring areas that indicated higher soil quality, higher social presence, and farming further away from areas where farm plots failed previously.”


37 of 56

Socio-Agricultural Behavior of the ancestral Pueblo

  • What socio-agricultural factors might have led to the sudden demise of a flourishing ancient civilization?
  • Farm selection was more “intelligent” than originally modeled: “move to next closest location to failed farm plot and try again”
  • Ancestral Pueblo preferred farming plot locations with (actual causal factors):
    • Higher soil quality
    • Higher social presence
    • Further away from areas where farm plots failed previously


Gunaratne, C. and Garibay, I. (2017a). Agent-based modeling for causal exploration of social systems. In Proceedings of the Computational Social Sciences Conference, Santa Fe, New Mexico, USA. ACM.

Gunaratne, C. and Garibay, I. (2017b). Alternate social theory discovery using genetic programming: towards better understanding the artificial anasazi. In Proceedings of the Genetic and Evolutionary Computation Conference, pages 115–122. ACM

Gunaratne, C. and Garibay, I. (2017c). Evolutionary model discovery of causal factors behind the socio-agricultural behavior of the ancestral Pueblo, submitted to PLOS One. arXiv preprint arXiv:1802.00435, https://arxiv.org/abs/1802.00435

38 of 56

Social Media Modeling

Misinformation and disinformation, polarization, radicalization, conspiracies, manipulation and other maladies that riddle social media platforms


39 of 56


2016: this needed an explanation

40 of 56


From the TwitterVerse to Dialogue Assessment

Information Environment: types of information operation strategies

[Figure: Narrative vs. Counter Narrative, axis label t. Strategy labels: dismiss, distort, dismay, distract; build, blog, bridge; engage, excite, enhance, explain]

Bots/trolls are force multipliers that spread the narrative and attack the counter narrative

From Netanomics briefing

41 of 56

Multi Action Cascade Model (MACM)

Diffusion of information agent-based model using information theoretical concepts


International Conference on Computational Social Science 2019, Amsterdam.

42 of 56

Multi Action Cascade Model (MACM)

  • Improves on the existing state of the art in information diffusion: imitation and innovation dynamics
  • Social media action taxonomy: Create, Vote, Post, Follow
  • Links: uses information theory (relative transfer entropy) to identify the influencer network
  • Nodes: cognitive capacity limits information spread (cognitive overload algorithm)
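To give a feel for the information-theoretic link measure, the sketch below estimates plain lag-1 transfer entropy between two discrete activity series. It is not the relative transfer entropy variant named on the slide, and the function name, the one-step lag, and the synthetic series are assumptions made for illustration.

```python
import math
import random
from collections import Counter

def transfer_entropy(source, target):
    """Lag-1 transfer entropy (in bits) from source to target for discrete series.

    TE = sum over (y_next, y_now, x_now) of
         p(y_next, y_now, x_now) * log2( p(y_next | y_now, x_now) / p(y_next | y_now) )
    """
    triples = Counter(zip(target[1:], target[:-1], source[:-1]))
    now_pairs = Counter(zip(target[:-1], source[:-1]))    # (y_now, x_now)
    own_pairs = Counter(zip(target[1:], target[:-1]))     # (y_next, y_now)
    own_single = Counter(target[:-1])                     # y_now
    n = len(target) - 1
    te = 0.0
    for (y_next, y_now, x_now), count in triples.items():
        p_joint = count / n
        p_given_both = count / now_pairs[(y_now, x_now)]
        p_given_own = own_pairs[(y_next, y_now)] / own_single[y_now]
        te += p_joint * math.log2(p_given_both / p_given_own)
    return te

# Synthetic example: user y repeats whatever user x did one step earlier.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(2000)]
y = [0] + x[:-1]

te_x_to_y = transfer_entropy(x, y)
te_y_to_x = transfer_entropy(y, x)
print(te_x_to_y, te_y_to_x)
```

The asymmetry between the two directions is what would let such a measure suggest a directed influence link from x to y.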


43 of 56


44 of 56

Case Study 2: Prioritization of Responses Under Information Overload on Online Social Media

What drives compulsive information sharing by highly active social media users?


45 of 56

Working Memory (Miller, 1956) (Baddeley, 2012) (Cowan, 2008)

  • Information chunking
  • Capacity: magical number 7±2, or 4
  • Attention span
  • Information overload

Theory of Extended Self (Belk, 2013) (Clowes, 2017)

  • Re-embodiment
  • Manifestation via objects
  • Reliance

Extended Working Memory (Gunaratne & Garibay, 2019)

  • Users typically receive more messages than they can respond to
  • Messages are stored in information scaffolds such as notification lists for recall

46 of 56

Loss of Attention Under Information Overload

[Figure: Message Overflow; label Mt; steps 1 to 3]

  • Received more messages due to increased neighbor activity
  • User experiences overload due to message overflow of the actionable information queue
  • Information lost due to overload
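The overflow mechanism can be sketched as a bounded queue. The class name, the fixed capacity, and the choice to drop the oldest messages first are assumptions made for illustration; the slide only states that information is lost when the actionable queue overflows.

```python
from collections import deque

class ActionableQueue:
    """Bounded message queue; the capacity stands in for limited attention."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.messages = deque()
        self.lost = 0  # running count of messages dropped due to overload

    def receive(self, batch):
        """Add incoming messages, then drop the oldest ones beyond capacity (assumed policy)."""
        self.messages.extend(batch)
        while len(self.messages) > self.capacity:
            self.messages.popleft()
            self.lost += 1

queue = ActionableQueue(capacity=3)
queue.receive(["m1", "m2"])          # normal activity: everything fits
queue.receive(["m3", "m4", "m5"])    # burst from neighbors: overflow
print(list(queue.messages), queue.lost)
```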

47 of 56

Hypothesized Factors: Response Prioritization Under Information Overload

  • Recency (most recent messages are more likely to be processed, e.g., retweeted)
  • Conversation popularity (cascade users)
  • Conversation size (cascade responses)
  • Initiator's popularity
  • User common interests (common conversations)
  • User reciprocity (the individual has responded to another individual)
  • URL domain popularity mentioned in post
  • URL domain familiarity
  • Information expertise
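A minimal sketch of how such factors could rank a user's queued messages, assuming recency is the reciprocal of elapsed time (as on the following figure slide). The weighted linear sum, the weights, the message fields, and the subset of three factors are all hypothetical; the study itself searches over rule structures rather than fixing one.

```python
def recency(now, received_at):
    """Reciprocal of elapsed time, so recent messages score higher."""
    elapsed = now - received_at
    return 1.0 / max(elapsed, 1e-9)  # clamp to avoid division by zero

def prioritize(messages, now, weights):
    """Sort messages by a weighted sum of factor scores, highest first."""
    def score(msg):
        features = {
            "recency": recency(now, msg["received_at"]),
            "initiator_popularity": msg["initiator_popularity"],
            "reciprocity": 1.0 if msg["reciprocal"] else 0.0,
        }
        return sum(weights.get(name, 0.0) * value for name, value in features.items())
    return sorted(messages, key=score, reverse=True)

# Hypothetical queue and weights.
messages = [
    {"id": "a", "received_at": 0, "initiator_popularity": 0.9, "reciprocal": False},
    {"id": "b", "received_at": 9, "initiator_popularity": 0.1, "reciprocal": True},
]
weights = {"recency": 1.0, "initiator_popularity": -0.5, "reciprocity": 0.5}
ordered_ids = [m["id"] for m in prioritize(messages, now=10, weights=weights)]
print(ordered_ids)
```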

48 of 56

[Figure: factor labels: Recency (reciprocal of the amount of time that had passed since the message was originally ...); Conversation popularity (cascade users); Conversation size (cascade responses); Initiator's popularity; User common interests (common conversations); User reciprocity (the individual has responded to another individual); URL domain popularity mentioned in post; URL domain familiarity; Information expertise]

49 of 56

[Figure: same factor labels as on slide 48]

50 of 56

Human Interpretable Mechanistic Explanation

“Users experiencing information overload on social media prioritize responses mainly by the recency with which they had been received, but also are more likely to respond to messages on conversations initiated by globally less popular users, and messages from individuals whom they have less in common with and yet have a history of responding to.”


51 of 56

Prioritization of Responses Under Information Overload on Online Social Media

  • What drives compulsive information sharing by highly active social media users?
  • Social media users experiencing information overload prioritize responses by (actual causal factors):
    • recently received messages
    • conversations by globally less popular users, friendship paradox (Hodas et al, 2014)
    • unfamiliar/novel messages
    • messages from reciprocal relationships

Gunaratne, C., Baral, N., Rand, W., Garibay, I., Jayalath, C., & Senevirathna, C. (2019). A Theory of Extended Working Memory and its Role in Online Conversation Dynamics. arXiv preprint arXiv:1910.09686.

Gunaratne, C., Senevirathna, C., Jayalath, C., Baral, N., Rand, W., & Garibay, I. A Multi-Action Cascade Model of Conversation. 5th International Conference on Computational Social Science, Amsterdam, NL


Algorithmic Curation

Multi-Action Cascade Model

Transfer Entropy is used to estimate probability of acting

52 of 56


UCF

UCF: 2nd place at DARPA Challenge

KEY: ours is fully human-explainable

Networks Science and Statistical Learning

Deep Agent: Inverse Generative

Agent-Based Model + ML

Deep Learning

53 of 56

Conclusions

  • Inverse generative social science is a new sub-discipline of computational social science, with successful studies of the Anasazi, Schelling segregation, social media information (Garibay), and alcoholism (Epstein)
  • First methods are emerging that improve agent-based modeling:
    • A) a systematic way to build human-interpretable models using data and theory
    • B) stronger causal inferences
  • Given enough data (macro social patterns) and better, more general agent architectures, it is possible to discover what causal mechanisms actually drive human behavior
  • Ethical considerations arise around manipulating social systems (similar to social media manipulation)


54 of 56

Collaborators


55 of 56

Sponsors


D5AI

56 of 56

Graduate Research Assistant Position Open: Fall 2019 or Spring 2020 (Orlando, Florida)


Thanks