18: Human-AI Collaboration Part 2
Juho Kim & Jean Young Song
Human-AI Interaction KAIST Fall 2020 | kixlab.org/courses/human-ai
Administrative Notes
Previously on CS492F...
Today’s Learning Objectives
After today’s class, you should be able to...
Reflection on the last in-class activity
Toward Human-AI Teamwork
Theories of Teamwork
Hoffman, Guy, and Cynthia Breazeal. "Collaboration in human-robot teams." AIAA 1st Intelligent Systems Technical Conference. 2004.
Conditions for Successful Teamwork (1/2)
Individual plan as part of a joint plan
“collaborative plans do not reduce to the sum of the individual plans, but consist of an interplay of actions that can only be understood as part of the joint activity”
Commitment to partner’s action
“Just as with committing to her own actions, x would—in that case—not adopt other goals inconsistent with y’s acting, would monitor y’s success, might request y to do it or help y if need be.”
Commitment to mutual belief
“If one participant reaches the conclusion that the common goal (or a subgoal) is achieved, unachievable or irrelevant, it becomes this participant’s goal to inform the other team members of this conclusion.”
Conditions for Successful Teamwork (2/2)
Commitment to mutual support
“In [a Shared Cooperative Activity] each agent is committed to supporting the efforts of the other to play her role in the joint activity [. . . ] there must be at least some cooperatively relevant circumstances in which [the participant] would be prepared to provide the necessary help.”
Grounding
“the sum of [...] mutual, common, or joint knowledge, beliefs, or suppositions”
Dialog, gaze, demonstration
Joint closure
“participants in a joint action trying to establish the mutual belief that they have succeeded well enough for current purposes”
Scenario: Moving a heavy sofa to the door
Imagine a human-robot team moving a heavy sofa together to the door. How should the following be supported for successful teamwork?
Reflection: How is your human team doing?
AXIS: Generating Explanations at Scale with Human-AI Interaction
AXIS: Generating Explanations at Scale with Learnersourcing and Machine Learning.
Joseph Williams, Juho Kim, Anna Rafferty, Samuel Maldonado, Krzysztof Z. Gajos, Walter Lasecki, Neil Heffernan.
Best Paper Honorable Mention. Learning at Scale 2016.
AXIS: Generating explanations at scale with learnersourcing + ML
[Diagram: explanations Exp 1, Exp 2, Exp 3, …, Exp N feed into AXIS, which assigns each a selection probability (e.g., 0.36, 0.34, 0.01, 0.04).]
Online Education: Lots of problems, not many quality explanations
Problem
Explanation
Frontend: Learner interface
Solve the problem & submit an answer (e.g., 15/56).
See a system-picked explanation.
Rate how helpful the explanation was.
Write a self-explanation.
The total number of cookies in the jar is 8.
Since there are 5 chocolate cookies, the probability that Chris gets a chocolate cookie is 5/8.
Since Chris removed 1 cookie from the jar and did not replace it, there are now 7 cookies in the jar.
So, the probability that Chris gets an oatmeal cookie from the jar is 3/7.
5/8 × 3/7 = 15/56.
So, the probability of Chris getting a chocolate cookie on the first draw and an oatmeal cookie on the second draw is 15/56. Type in 15/56.
Backend: Explanation selection policy
Filter an explanation using simple heuristics.
Length of a response
Learner’s history / knowledge level
…
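The filtering step above can be sketched as a simple gate. This is a hypothetical illustration, not the paper's exact rules: `min_length` is an assumed threshold, and answer correctness stands in for the learner's history/knowledge level.

```python
# Hypothetical sketch of AXIS-style heuristic filtering (assumed details):
# a self-explanation becomes a candidate only if the learner answered
# correctly (a crude proxy for knowledge level) and the text is long enough.
def passes_filter(explanation: str, answered_correctly: bool,
                  min_length: int = 50) -> bool:
    if not answered_correctly:                 # learner's history / knowledge
        return False
    if len(explanation.strip()) < min_length:  # length of the response
        return False
    return True
```

An explanation that fails the gate is simply never added to the candidate pool, so low-effort or likely-wrong explanations are cheap to discard before any learner sees them.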
Frontend: Learnersourcing interface
Get a problem
Submit an answer
See a system-picked explanation
Rate the helpfulness of the explanation
Write a self-explanation
Rating learnersourced explanation
Backend: Explanation selection policy
 | Exp 1 | Exp 2 | Exp 3 | … | Exp N-1 | Exp N
Ratings | 6.75 | 2.5 | 9.0 | … | 4.5 |
Policy for next learner | 16% | 7% | 20% | … | 10% |
Multi-armed bandit formulation
Bandits enable dynamic experimentation
https://cxl.com/blog/ab-testing-guide/
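Unlike a fixed A/B split, a bandit reallocates traffic toward better arms as ratings arrive. A minimal Thompson-sampling sketch, assuming each rating is collapsed to a binary helpful / not-helpful signal (the hidden helpfulness rates below are made up for illustration):

```python
import random

class Arm:
    """One explanation, with a Beta posterior over its helpfulness rate."""
    def __init__(self):
        self.alpha, self.beta = 1, 1          # uniform Beta(1, 1) prior

    def update(self, helpful: bool):
        if helpful:
            self.alpha += 1
        else:
            self.beta += 1

def thompson_pick(arms):
    # Sample once from each arm's posterior and show the argmax arm.
    return max(range(len(arms)),
               key=lambda i: random.betavariate(arms[i].alpha, arms[i].beta))

random.seed(0)
true_rates = [0.2, 0.5, 0.8]                  # hidden helpfulness (made up)
arms = [Arm() for _ in true_rates]
shown = [0] * len(arms)
for _ in range(2000):
    i = thompson_pick(arms)
    shown[i] += 1
    arms[i].update(random.random() < true_rates[i])
print(shown)   # most of the traffic ends up on arm 2, the best explanation
```

The "dynamic experimentation" the slide refers to is visible in `shown`: early pulls are spread out (exploration), but as posteriors sharpen, the best arm absorbs most of the traffic (exploitation).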
Backend: Explanation selection policy
Add the new explanation as a new arm.
 | Exp 1 | Exp 2 | Exp 3 | … | Exp N-1 | Exp N
Ratings | 6.75 | 2.5 | 9.0 | … | 4.5 |
Policy for next learner | 16% | 7% | 20% | … | 10% |
Backend: Explanation selection policy
Update the policy with the new explanation.
 | Exp 1 | Exp 2 | Exp 3 | … | Exp N-1 | Exp N
Ratings | 6.75 | 2.5 | 9.0 | … | 4.5 | 9.5
Policy for next learner | 14% | 5% | 18% | … | 8% | 23%
Backend: Explanation selection policy
Update the policy with the new rating.
 | Exp 1 | Exp 2 | Exp 3 | … | Exp N-1 | Exp N
Ratings | 6.75 | 2.5 | 9.0 | … | 3.9 | 9.5
Policy for next learner | 14% | 5% | 19% | … | 6% | 24%
Backend: Explanation selection policy
For a new learner, pick an explanation using the policy.
 | Exp 1 | Exp 2 | Exp 3 | … | Exp N-1 | Exp N
Ratings | 6.75 | 2.5 | 9.0 | … | 4.5 | 9.5
Policy for next learner | 14% | 5% | 18% | … | 8% | 23%
Backend: Explanation selection policy
Filter the explanation
Add the explanation as a new arm
For each new rating, update posterior
For each new learner, pick an explanation using Thompson sampling
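Put together, the four backend steps might look like the following sketch. This is hedged: the real AXIS posterior model, rating scale, and filter thresholds differ from these assumptions, and the Beta update that treats a 0-10 rating as helpful/unhelpful votes is an illustration, not the paper's model.

```python
import random

class ExplanationPool:
    """Sketch of the AXIS backend loop; details are assumed, not verbatim."""
    def __init__(self, min_length: int = 50):
        self.min_length = min_length          # assumed filter threshold
        self.arms = {}                        # explanation -> [alpha, beta]

    def add(self, explanation: str) -> bool:
        """Filter with a simple length heuristic, then add as a new arm."""
        if len(explanation.strip()) < self.min_length:
            return False
        self.arms[explanation] = [1, 1]       # uniform Beta(1, 1) prior
        return True

    def rate(self, explanation: str, rating: int):
        """Update the posterior: a 0-10 rating counted as `rating` helpful
        and `10 - rating` unhelpful votes (an assumed mapping)."""
        posterior = self.arms[explanation]
        posterior[0] += rating
        posterior[1] += 10 - rating

    def pick(self) -> str:
        """Thompson sampling: sample each arm's posterior, show the argmax."""
        return max(self.arms,
                   key=lambda e: random.betavariate(*self.arms[e]))
```

Highly rated explanations accumulate posterior mass and get shown more often, while newly added arms still get a chance because every `pick` samples from each arm's full posterior rather than greedily taking the current best mean.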
Results: AXIS explanations = instructor explanations > no explanation
Measured by accuracy increase in solving new problems
Human-AI Interaction in AXIS
Human: solves problems, provides explanations, rates explanations
AI: updates the policy using bandits + Thompson sampling
Interaction: better-quality explanations, better learning
DreamTeam
DreamTeam system
Relevance and Responses
Discussion
Human-AI Co-Creation
What is co-creation?
ACTIVITY: Analyze human-AI co-creation