Evaluating Artificial Social Intelligence in an Urban Search and Rescue Task Environment
AAAI Fall Symposium Series: Theory of Mind for Teams
4-5 Nov 2021
Jared Freeman¹, Lixiao Huang², Matt Wood¹, Stephen J. Cauffman²
¹Aptima Inc., ²Arizona State University
freeman@aptima.com, lixiao.huang@asu.edu, mwood@aptima.com, scauffma@asu.edu
This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001119C0130. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Defense Advanced Research Projects Agency.
Overview
Training, Talk, Technology, and Theory of Mind
Training
Talk
Technology
ToM
DARPA ASIST
Artificial Social Intelligence for Successful Teams
ASIST & Team Tasks
The ASIST USAR Team Task Environment
Bird’s-eye view of three participants
Zoom
Picture in Picture
1st person view
Building layout and player locations
The ASIST Task from the Participant’s View
[Figure: participant's screen with three numbered regions (1-3). Labels: Info Map, Marker block legend, Minecraft world]
The ASIST Experimental Design
Trial maps (within-group): SaturnA and SaturnB, across Trial 1 and Trial 2
Manipulation | Condition | Teams |
Shared mental model manipulation (between-group) | Condition 1: Team planning | 32 teams |
Shared mental model manipulation (between-group) | Condition 2: No planning (math control task) | 32 teams |
Sample data for study 3 | *No planning + human advisor | 4 teams |
Experimental Design
Artificial Social Intelligence
ASI Evaluation & Findings
Metric ID: Function | ASI Agent & Human Observer infer / predict | Measure |
M1: Prediction of effects of future interventions | Team score (3x per trial at fixed times) | Normalized RMSE |
M3: Inference of member mental model / knowledge | Given map information (3x) | Mean accuracy |
M6: Inference of member mental model / knowledge (conflicting knowledge) | Given marker block meanings (3x) | Mean accuracy |
M7: Prediction of action given member beliefs (Sally-Anne) | Room entry in response to another participant’s marker block (many per trial) | Mean accuracy |
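The two measures in the table above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the study. The slides do not say how RMSE was normalized, so dividing by the range of the observed values is an assumption here, and the function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def normalized_rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """RMSE divided by the range of the observed values.

    Range normalization is an assumption; the source does not specify
    the normalizer used for M1.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    value_range = max(observed) - min(observed)
    if value_range == 0:
        raise ValueError("observed values have zero range; cannot normalize")
    return math.sqrt(mse) / value_range


def mean_accuracy(predicted: Sequence[object], observed: Sequence[object]) -> float:
    """Fraction of predictions that exactly match the observed labels."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical values for illustration only; not data from the study.
predicted_scores = [400.0, 500.0, 650.0]   # e.g. M1 team-score predictions
observed_scores = [420.0, 480.0, 620.0]
print(normalized_rmse(predicted_scores, observed_scores))

predicted_entries = ["enter", "skip", "enter", "enter"]   # e.g. M7 room-entry predictions
observed_entries = ["enter", "skip", "skip", "enter"]
print(mean_accuracy(predicted_entries, observed_entries))
```

Lower normalized RMSE means better score prediction (M1), while higher mean accuracy means better inference or prediction (M3, M6, M7), so the two measures point in opposite directions when compared side by side.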
Findings
Accuracy of ASI agents (yellow) & human observers (blue) on four tests of social intelligence.
Accuracy of human observers (triangle) and artificial agents (circles) on four tests of social intelligence.
Finding
Accuracy predicting final score (M1), 3x per trial
Percent accuracy for inferring marker block semantics (M6), 3x per trial
Future Research
Support the claim that | With quantitative measurements of |
Social science constructs drive | Analytic agent use, influence, effect |
Design of ASI MToM/T to enable | MToM/T Existence, Inference, Prediction |
ASI interventions on | Intervention (non)existence, Compliance, Explanations, Perceived Utility of ASI, Trust in ASI |
Team processes that improve | Synchronization, Error Reduction, Resilience, Coordinative Comms |
Mission effects | Mission score (weighted to team tasks) |
Goal
Training
Talk
Technology
ToM
Acknowledgement
Contact:
Jared Freeman <freeman@aptima.com>