1 of 39

18: Human-AI Collaboration Part 2

Juho Kim & Jean Young Song

Human-AI Interaction KAIST Fall 2020 | kixlab.org/courses/human-ai

2 of 39

Administrative Notes

  • Project feedback meetings on 11/24 (Tue).
    • Schedule is announced on Campuswire.
  • Milestone #2 for your project is due 11/24 (Tue).

3 of 39

Previously on CS492F...

4 of 39

Today’s Learning Objectives

After today’s class, you should be able to...

  • Understand how human-AI teamwork should be designed and supported.
  • Analyze a human-AI co-creation example and discuss its strengths, weaknesses, and implications.
  • Identify conditions for successful (or not successful) human-AI collaboration.

5 of 39

Reflection on the last in-class activity

6 of 39

Toward Human-AI Teamwork

7 of 39

Theories of Teamwork

  • “...teamwork, implying a sense of partnership that occurs when agents work ‘jointly with’ others rather than ‘acting upon’ others.”
  • Joint Intention Theory shows that for joint action to emerge, “teammates must communicate to maintain a set of shared beliefs and to coordinate their actions towards the shared plan. In addition, they must demonstrate commitment to doing their own part, to the others doing theirs, to providing mutual support, and finally—to a mutual belief as to the state of the task.”

Hoffman, Guy, and Cynthia Breazeal. "Collaboration in human-robot teams." AIAA 1st Intelligent Systems Technical Conference. 2004.

8 of 39

Conditions for Successful Teamwork (1/2)

Individual plan as part of a joint plan

“collaborative plans do not reduce to the sum of the individual plans, but consist of an interplay of actions that can only be understood as part of the joint activity”

Commitment to partner’s action

“Just as with committing to her own actions, x would—in that case—not adopt other goals inconsistent with y’s acting, would monitor y’s success, might request y to do it or help y if need be.”

Commitment to mutual belief

“If one participant reaches the conclusion that the common goal (or a subgoal) is achieved, unachievable or irrelevant, it becomes this participant’s goal to inform the other team members of this conclusion.”

Hoffman, Guy, and Cynthia Breazeal. "Collaboration in human-robot teams." AIAA 1st Intelligent Systems Technical Conference. 2004.

9 of 39

Conditions for Successful Teamwork (2/2)

Commitment to mutual support

“In [a Shared Cooperative Activity] each agent is committed to supporting the efforts of the other to play her role in the joint activity [. . . ] there must be at least some cooperatively relevant circumstances in which [the participant] would be prepared to provide the necessary help.”

Grounding

“the sum of [...] mutual, common, or joint knowledge, beliefs, or suppositions”

Dialog, gaze, demonstration

Joint closure

“participants in a joint action trying to establish the mutual belief that they have succeeded well enough for current purposes”

Hoffman, Guy, and Cynthia Breazeal. "Collaboration in human-robot teams." AIAA 1st Intelligent Systems Technical Conference. 2004.

10 of 39

Scenario: Moving a heavy sofa to the door

Imagine a human-robot team moving a heavy sofa together to the door. How should the following be supported for successful teamwork?

  • Individual plan as part of a joint plan
  • Commitment to partner’s action
  • Commitment to mutual belief
  • Commitment to mutual support
  • Grounding
  • Joint closure

11 of 39

Reflection: How is your human team doing?

  • Individual plan as part of a joint plan
  • Commitment to partner’s action
  • Commitment to mutual belief
  • Commitment to mutual support
  • Grounding
  • Joint closure

12 of 39

AXIS: Generating Explanations at Scale with Human-AI Interaction

  • Human-side: learnersourcing
  • AI-side: multi-armed bandits

AXIS: Generating Explanations at Scale with Learnersourcing and Machine Learning.

Joseph Williams, Juho Kim, Anna Rafferty, Samuel Maldonado, Krzysztof Z. Gajos, Walter Lasecki, Neil Heffernan.

Best Paper Honorable Mention. Learning at Scale 2016.

13 of 39

AXIS: Generating explanations at scale with learnersourcing + ML


[Figure: AXIS maintains a pool of explanations (Exp 1, Exp 2, Exp 3, …, Exp N) and assigns each a selection probability, e.g., 0.36, 0.34, 0.01, 0.04.]

14 of 39

Online Education: Lots of problems, not many quality explanations

  • Quality explanations are costly to generate.
  • There is no single best explanation for all learners.
  • The instructor might not be the best at explaining.

[Figure: a problem paired with its explanation]

15 of 39

Frontend: Learner interface

Solve the problem & submit an answer.

16 of 39

Frontend: Learner interface

See a system-picked explanation.

17 of 39

Frontend: Learner interface

Rate how helpful the explanation was.

18 of 39

Frontend: Learner interface

Write a self-explanation.

“The total number of cookies in the jar is 8. Since there are 5 chocolate cookies, the probability that Chris gets a chocolate cookie is 5/8. Since Chris removed 1 cookie from the jar and did not replace it or put it back, there are now 7 cookies in the jar. So, the probability that Chris gets an oatmeal cookie from the jar is 3/7. 5/8 × 3/7 = 15/56. So, the probability of Chris getting a chocolate cookie on the first draw, and an oatmeal cookie on the second draw, is 15/56. Type in 15/56.”

19 of 39

Backend: Explanation selection policy

Filter an explanation using simple heuristics.

“The total number of cookies in the jar is 8. Since there are 5 chocolate cookies, the probability that Chris gets a chocolate cookie is 5/8. Since Chris removed 1 cookie from the jar and did not replace it or put it back, there are now 7 cookies in the jar. So, the probability that Chris gets an oatmeal cookie from the jar is 3/7. 5/8 × 3/7 = 15/56. So, the probability of Chris getting a chocolate cookie on the first draw, and an oatmeal cookie on the second draw, is 15/56. Type in 15/56.”

Length of a response

Learner’s history / knowledge level
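The "simple heuristics" filter described above might look like the following sketch. The length thresholds and the use of answer correctness as a proxy for the learner's history / knowledge level are assumptions for illustration, not the actual AXIS rules:

```python
# Hypothetical sketch of a heuristic filter for learnersourced explanations.
# Thresholds and the correctness proxy are assumptions, not AXIS internals.

MIN_LENGTH = 40    # assumed: very short responses rarely explain anything
MAX_LENGTH = 2000  # assumed: extremely long responses are likely noise

def passes_filter(explanation: str, learner_solved_correctly: bool) -> bool:
    """Keep an explanation only if its length is plausible and the
    learner's history suggests sufficient knowledge of the problem."""
    text = explanation.strip()
    if not (MIN_LENGTH <= len(text) <= MAX_LENGTH):
        return False
    # Assumed proxy for "learner's history / knowledge level".
    return learner_solved_correctly
```

Cheap filters like this matter because every surviving explanation becomes a bandit arm that will consume real learners' attention.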

20 of 39

Frontend: �Learnersourcing interface

Get a problem

Submit an answer

See a system-picked explanation

Rate the helpfulness of the explanation

Write a self-explanation

21 of 39

Rating learnersourced explanation

22 of 39

Backend: Explanation selection policy

Arm       Rating   Policy for next learner
Exp 1     6.75     16%
Exp 2     2.5       7%
Exp 3     9.0      20%
…
Exp N-1   4.5      10%
Exp N     (not yet rated)

Multi-armed bandit formulation

23 of 39

Bandits enable dynamic experimentation

https://cxl.com/blog/ab-testing-guide/

24 of 39

Backend: Explanation selection policy

  • Repeatedly select an action
    • Explanation to show to current user
  • Learn effectiveness of actions after observation (observed reward)
    • Optimize for learners’ ratings of helpfulness of an explanation

[Figure: arms Exp 1, Exp 2, Exp 3, …, Exp N-1, Exp N in a multi-armed bandit formulation]

25 of 39

Backend: Explanation selection policy

  • Classic RL formulation
    • Exploitation: present effective explanations
    • Exploration: experiment with explanations to collect evidence
  • Thompson sampling
    • Provides a dynamic policy for choosing explanations and updates the policy based on reward observations
    • Provides an interpretable representation of the system’s beliefs at any time

[Figure: arms Exp 1, Exp 2, Exp 3, …, Exp N-1, Exp N in a multi-armed bandit formulation]
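The Thompson-sampling policy sketched above can be illustrated as follows. This is a minimal sketch, not the actual AXIS implementation: the 0-10 rating scale, the Beta posterior per arm, and the fractional-reward update are all assumptions:

```python
import random

class ThompsonBandit:
    """Each explanation is an arm with a Beta posterior over its (normalized)
    helpfulness; picking an arm samples from every posterior and exploits
    the highest sample, which balances exploration and exploitation."""

    def __init__(self):
        self.posteriors = {}  # arm name -> [alpha, beta]

    def add_arm(self, arm):
        self.posteriors[arm] = [1.0, 1.0]  # uniform Beta(1, 1) prior

    def pick(self):
        # Sample a plausible mean reward from each arm's posterior,
        # then pick the arm whose sample is highest.
        samples = {arm: random.betavariate(a, b)
                   for arm, (a, b) in self.posteriors.items()}
        return max(samples, key=samples.get)

    def update(self, arm, rating, scale=10.0):
        # Treat an assumed 0-10 helpfulness rating as a fractional
        # Bernoulli reward when updating the Beta posterior.
        r = rating / scale
        self.posteriors[arm][0] += r
        self.posteriors[arm][1] += 1.0 - r
```

Arms with consistently high ratings develop posteriors concentrated near 1, so they are sampled highest (exploited) most of the time, while uncertain arms still occasionally win a sample (exploration).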

26 of 39

Backend: Explanation selection policy

Arm       Rating   Policy for next learner
Exp 1     6.75     16%
Exp 2     2.5       7%
Exp 3     9.0      20%
…
Exp N-1   4.5      10%
Exp N     (new arm, not yet rated)

Add the new explanation as a new arm.

27 of 39

Backend: Explanation selection policy

Arm       Rating   Policy for next learner
Exp 1     6.75     14%
Exp 2     2.5       5%
Exp 3     9.0      18%
…
Exp N-1   4.5       8%
Exp N     9.5      23%

Update the policy with the new explanation.

28 of 39

Backend: Explanation selection policy

Arm       Rating   Policy for next learner
Exp 1     6.75     14%
Exp 2     2.5       5%
Exp 3     9.0      19%
…
Exp N-1   3.9       6%
Exp N     9.5      24%

Update the policy with the new rating.

29 of 39

Backend: Explanation selection policy

Arm       Rating   Policy for next learner
Exp 1     6.75     14%
Exp 2     2.5       5%
Exp 3     9.0      18%
…
Exp N-1   4.5       8%
Exp N     9.5      23%

For a new learner, pick an explanation using the policy.

30 of 39

Backend: Explanation selection policy

  • Filter the explanation.
  • Add the explanation as a new arm.
  • For each new rating, update the posterior.
  • For each new learner, pick an explanation using Thompson sampling.
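The full backend loop, filter, add arm, update posterior, and pick via Thompson sampling, can be sketched in a few lines. All names, thresholds, and the 0-10 rating scale are hypothetical illustrations, not the AXIS codebase:

```python
import random

posteriors = {}  # explanation text -> [alpha, beta] of a Beta posterior

def on_new_explanation(text):
    """Filter the explanation; if it survives, add it as a new arm."""
    if 40 <= len(text) <= 2000:        # assumed length heuristic
        posteriors[text] = [1.0, 1.0]  # uniform prior for the new arm

def on_rating(explanation, rating, scale=10.0):
    """For each new rating, update that arm's posterior."""
    r = rating / scale                 # assumed 0-10 helpfulness scale
    posteriors[explanation][0] += r
    posteriors[explanation][1] += 1.0 - r

def pick_for_learner():
    """For each new learner, pick an explanation via Thompson sampling."""
    samples = {e: random.betavariate(a, b) for e, (a, b) in posteriors.items()}
    return max(samples, key=samples.get)
```

The loop never stops experimenting: every shown explanation both serves the current learner and refines the policy for the next one.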

31 of 39

AXIS = Instructor > No explanation

Accuracy increase in solving new problems

32 of 39

Human-AI Interaction in AXIS

[Diagram: the human solves problems, provides explanations, and rates explanations; the AI updates its selection policy using bandits + Thompson sampling; through this interaction, explanation quality and learning both improve.]

33 of 39

DreamTeam

34 of 39

DreamTeam system

  • Identifies optimal team structures for a team from observable feedback
  • Multi-armed bandits with temporal constraints
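The "temporal constraints" idea can be illustrated with a hedged sketch: a Thompson-sampling bandit over team structures that is only allowed to switch structure at fixed interval boundaries, so the team is not reorganized mid-task. All names, the Beta-posterior update, and the [0, 1] reward scale are assumptions, not DreamTeam's actual design:

```python
import random

def run(structures, rounds, switch_every, reward_fn, rng=random.Random(0)):
    """Bandit over team structures with a temporal constraint: the chosen
    structure is held fixed for `switch_every` rounds before the bandit
    may reconsider."""
    post = {s: [1.0, 1.0] for s in structures}  # Beta posterior per structure
    current, history = None, []
    for t in range(rounds):
        if t % switch_every == 0:  # only switch at interval boundaries
            samples = {s: rng.betavariate(a, b) for s, (a, b) in post.items()}
            current = max(samples, key=samples.get)
        r = reward_fn(current)     # observed team feedback in [0, 1]
        post[current][0] += r
        post[current][1] += 1.0 - r
        history.append(current)
    return history, post
```

The constraint trades some statistical efficiency for stability: the team gets long enough stretches under one structure to actually work, which is the insight students highlighted below.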

35 of 39

Relevance and Responses

  • Relevance
    • Human-AI collaboration (AI decides, human performs): will teams want control?
    • Many insights into teamwork
  • Student responses
    • Neat application of multi-armed bandits to a practical social problem
    • Time constraints made a big difference ⇒ good insight
    • Is it practically easy to transition to new team structures?
    • Could the experiment be run in a more externally valid setting?

36 of 39

Discussion

  • What would be the future of (AI-mediated) teamwork? What other things can AI do to support teamwork?
  • Other ideas for applying multi-armed bandits to social problems?
  • Would you be willing to use DreamTeam with your own teams? Why or why not?

37 of 39

Human-AI Co-Creation

38 of 39

What is co-creation?

  • Co-creation: Two (or more) agents work together in a creative task.
  • Creative tasks: writing, music, ideation, drawing, dance, fashion design, …
  • What are some common struggles humans have in creative tasks?
  • How might AI be able to help?

39 of 39

ACTIVITY: Analyze human-AI co-creation

  • Let’s analyze an existing human-AI co-creation application!
  • In a team of 4, for 25 minutes
  • 3-min presentation from each team after the activity

yellkey.com/push