1 of 34

1

Gamifying Facial Emotion Recognition for Both Human Training and Machine Learning Data Collection

Yeonsun Yang1

Ahyeon Shin1

Nayoung Kim1

Huidam Woo1

John Joon Young Chung2

Jean Y. Song1

2

Midjourney

1

2 of 34

2

Facial Emotion Recognition in the Real World

Motivation

Spontaneous facial expressions are diverse, subjective, and ambiguous.

😀

Happy

😭

Sad

An example of in-the-wild FER datasets

3 of 34

3

The Impact of FER on Interactions

FER is important for both human-human interactions and human-machine interactions.

Motivation

# Family

# Workplace

# Law Enforcement

4 of 34

4

Who Needs FER Training?

Motivation

Clinical Populations

Professionals

General Populations

5 of 34

5

Training Interfaces to Enhance Human FER

Related Work

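[Figure: training interface mockup with a learner, an image counter (Image 1/20), emotion options (Happy, Sad, Disgust, Neutral), feedback text ("The inner corners of the eyebrows and angled downward"), and a NEXT button]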

Micro Expressions Training Tool (METT)

Ekman et al., 2003

6 of 34

6

Training Interfaces to Enhance Human FER

Related Work

Micro Expressions Training Tool (METT)

Ekman et al., 2003

Sign-based explanation of emotion with action units

Facial expression images taken in controlled environments with action units

Limitations

  • Limited to sign-based training
  • Limited to self-administered training
  • Tedious and repetitive sessions

7 of 34

7

Labeling Interfaces to Enhance Machine FER

Related Work

[Figure: labeling interface mockup with an image counter (Image 1/20), emotion options (Happy, Sad, Disgust, Neutral), and a NEXT button]

AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild

Mollahosseini et al., IEEE Transactions on Affective Computing (2017)

“Judgement-based approach”

Interprets facial expression based on how it is universally and heuristically perceived by a large common population
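
A judgment-based label can be illustrated with a small sketch: given several annotators' answers for one image, take the most common answer as the label and report how many annotators agreed. The function name `judgment_based_label` and the tie rule (no label on a tie) are assumptions made for illustration; they are not taken from AffectNet or from this talk.

```python
from collections import Counter


def judgment_based_label(responses):
    """Return (label, agreement) for one image.

    responses: list of emotion names chosen by different annotators.
    label is the most common answer, or None when the top answers tie
    (tie handling is an assumption of this sketch).
    agreement is the share of annotators who gave the top answer.
    """
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(responses)
    top = max(counts.values())
    winners = [label for label, c in counts.items() if c == top]
    label = winners[0] if len(winners) == 1 else None
    return label, top / len(responses)


label, agreement = judgment_based_label(["Happy", "Happy", "Neutral", "Happy", "Sad"])
print(label, agreement)
```

The agreement value makes the subjectivity of the image visible instead of hiding it behind a single label.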

8 of 34

8

Labeling Interfaces to Enhance Machine FER

Related Work

AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild

Mollahosseini et al., IEEE Transactions on Affective Computing (2017)

Limitations

  • Large labeling error and bias
  • Limited to single-choice format
  • Tedious and repetitive sessions

9 of 34

9

Research Goal: Building an Integrated Interface

Approach

Simultaneously addressing limitations

FER Training

Data Collection

10 of 34

10

Research Goal: Building an Integrated Interface

Approach

Effectively addressing the challenges identified in current interfaces within a single application requires:

(1) Engaging and motivating groups of general populations

(2) Group learning through interaction with each other

(3) Aggregating all socially-agreed emotional judgments collected from user groups

[Figure: chart of # of Responses by Emotion Categories]

11 of 34

11

Research Goal: Building an Integrated Interface

Approach

“Gamification”

[Figure: chart of # of Responses by Emotion Categories]

12 of 34

12

Design Probing

Approach

Based on iterative design probes (N=9):

[DG1] Enable diverse layers of interactions to support learning socially agreed-upon interpretations of emotions

  • Observational learning, real-time personalized feedback, and reflection

[DG2] Minimize the difficulty and effort required for the labeling actions during the game

  • Breaking labeling actions into smaller units of work (i.e., binary labeling)

[DG3] Provide game rules and elements that are easy to learn

  • Using a mainstream game plot
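
The binary labeling idea in [DG2] can be sketched as follows: one multi-class question per image is split into one yes/no question per emotion, and the "yes" answers are tallied into per-category response counts. This is a minimal illustration, not the game's implementation; the emotion list and the names `binary_questions` and `tally_yes` are hypothetical.

```python
# Hypothetical category list; the game's actual categories may differ.
EMOTIONS = ["Happy", "Sad", "Disgust", "Neutral"]


def binary_questions(image_id, emotions=EMOTIONS):
    """Split one multi-class labeling task into one yes/no question per emotion."""
    return [(image_id, emotion) for emotion in emotions]


def tally_yes(answers, emotions=EMOTIONS):
    """Count "yes" answers per image and emotion.

    answers: iterable of (image_id, emotion, said_yes) tuples,
    where every emotion is expected to be one of `emotions`.
    Returns {image_id: {emotion: number_of_yes_answers}}.
    """
    counts = {}
    for image_id, emotion, said_yes in answers:
        per_image = counts.setdefault(image_id, {e: 0 for e in emotions})
        if said_yes:
            per_image[emotion] += 1
    return counts


answers = [
    ("img1", "Happy", True),
    ("img1", "Happy", True),
    ("img1", "Sad", False),
    ("img1", "Neutral", True),
]
print(binary_questions("img1"))
print(tally_yes(answers))
```

Because each answer is a single yes or no, an image can collect support for several emotions at once, which a single-choice format cannot express.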

13 of 34

13

Design Probing

Approach

“Mafia Game Plot”

  • Active interactions among users, such as observation, debating, and voting
  • Useful for redesigning to incorporate suggestions and guidelines from literature
  • Requiring little time to adapt to the system due to its familiar rules

14 of 34

14

Find the Bot! : Strategies

Approach

Find the Bot! incorporates a wide range of suggestions and guidelines from gamification, education, and crowdsourcing, and motivates players through a combination of game elements.

15 of 34

15

Find the Bot! : Gameplay

Approach

[Figure: game screen showing Player1 through Player5, Round 1, HAPPY]

19 of 34

19

Find the Bot! : Gameplay

Approach

Game stages

  1. Labeling
  2. Skimming
  3. Pointing out
  4. Voting
  5. Last defense
  6. Advice
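
One way to picture the stages is as a simple state machine that steps through them in order. The sketch below assumes the stages run strictly in the order they are listed on the slide and that a round ends after Advice; both are assumptions, and `Stage`, `next_stage`, and `run_round` are hypothetical names.

```python
from enum import Enum


class Stage(Enum):
    LABELING = "Labeling"
    SKIMMING = "Skimming"
    POINTING_OUT = "Pointing out"
    VOTING = "Voting"
    LAST_DEFENSE = "Last defense"
    ADVICE = "Advice"


# Assumed order: the order in which the slide lists the stages.
STAGE_ORDER = list(Stage)


def next_stage(stage):
    """Return the stage that follows `stage`, or None after the last stage."""
    i = STAGE_ORDER.index(stage)
    return STAGE_ORDER[i + 1] if i + 1 < len(STAGE_ORDER) else None


def run_round():
    """Walk through one round and return the names of the visited stages."""
    visited = []
    stage = STAGE_ORDER[0]
    while stage is not None:
        visited.append(stage.value)
        stage = next_stage(stage)
    return visited


print(run_round())
```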

25 of 34

25

Experiment

Evaluation Setup

Ground-truth measures* (N=275)

  • Assessing FER scores (0 to 64 points)
  • Setting the cutoff for classifying participants into the low FER group (40 points)
  • Providing a basis for judgment-based scoring

*Japanese and Caucasian Facial Expressions of Emotion (JACFEE) and Neutral Faces (JACNeuF)

Matsumoto et al. (1988)

User study (N=59)

  • Classifying 22 participants as the low FER group and 37 as the ordinary group based on their pre-survey FER scores
  • Randomly dividing the low FER group into a learner group (n=11) and a control group (n=11)
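
The grouping procedure can be sketched in a few lines: participants are split at the 40-point cutoff, and the low FER group is then randomly halved into learners and controls. The slides do not say whether a score of exactly 40 counts as low, so the strict `<` comparison is an assumption, as are the function name `assign_groups` and the fixed random seed.

```python
import random

CUTOFF = 40  # cutoff stated on the slide (FER scores range from 0 to 64)


def assign_groups(scores, cutoff=CUTOFF, seed=0):
    """Split participants into learner, control, and ordinary groups.

    scores: {participant_id: pre-survey FER score}.
    Assumption: scores strictly below the cutoff count as low FER.
    """
    low = sorted(p for p, s in scores.items() if s < cutoff)
    ordinary = sorted(p for p, s in scores.items() if s >= cutoff)
    rng = random.Random(seed)  # seeded so the split is reproducible
    rng.shuffle(low)
    half = len(low) // 2
    return {"learner": low[:half], "control": low[half:], "ordinary": ordinary}


# Made-up scores that reproduce the group sizes from the slide.
scores = {f"p{i:02d}": 30 for i in range(22)}
scores.update({f"p{i:02d}": 50 for i in range(22, 59)})
groups = assign_groups(scores)
print({name: len(members) for name, members in groups.items()})
```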

27 of 34

27

Experiment

Evaluation Setup

Groups: Learner group (N=11), Player group (N=37), Control group (N=11)

Two 90-minute lab sessions

A 90-minute lab session

Playing Find the Bot!

  1. Pre-test to evaluate their FER scores
  2. Playing Find the Bot! with a pre-matched team consisting of one learner and three players
  3. Post-test to assess improvements in their FER scores
  4. Survey on user experience (GEQ, SUS, customized questions)

Plain labeling task

  1. Pre-test to evaluate their FER scores
  2. Labeling 200 in-the-wild facial expression images
  3. Post-test to assess improvements in their FER scores

29 of 34

29

Experiment

Evaluation of Game Design (for All Players)

  • Analysis of game log data showed rich game interactions and a progression in emotional assumptions
  • Analysis of the post-survey (GEQ, SUS, customized questions) showed that the game design was motivating and engaging

30 of 34

30

Experiment

Improvement on Judgment-based FER (Learner vs. Control)

  • A larger increase in inter-rater reliability between pre- and post-test responses in the learner group (k=0.078) than in the control group (k=0.02)
  • A trend in which only the learner group’s responses shifted towards judgment-based answers
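
The slides do not name the reliability statistic behind the k values. As one possibility, the sketch below computes Fleiss' kappa, a common inter-rater reliability measure for several raters and categorical answers. Treating k as Fleiss' kappa is an assumption, and the input layout (one row per image, one column per emotion) is chosen for illustration.

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a table of rating counts.

    table[i][j] = number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(table)
    n_raters = sum(table[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in table):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_cats = len(table[0])

    # Share of all ratings that fall into each category.
    total = n_items * n_raters
    p_cat = [sum(row[j] for row in table) / total for j in range(n_cats)]

    # Observed agreement per item, then averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    observed = sum(per_item) / n_items

    # Agreement expected by chance.
    expected = sum(p * p for p in p_cat)
    if expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Two images, three raters, two categories, full agreement on each image.
print(fleiss_kappa([[3, 0], [0, 3]]))
```

Kappa is 1 for perfect agreement and close to 0 when agreement is no better than chance, so a pre-to-post increase indicates that responses became more consistent across raters.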

31 of 34

31

Experiment

Increase on Social Agreement of Collected Labels

  • The Gini coefficients of the label distributions are heavily skewed towards 1
  • The Gini coefficients of responses are more skewed in the post-test than in the pre-test
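
For reference, the Gini coefficient of a label distribution can be computed from the response counts per emotion category. The sketch below uses the standard formula over sorted values; how the study normalized the coefficient is not stated on the slides, so the function `gini` and the example counts are illustrative assumptions.

```python
def gini(counts):
    """Gini coefficient of non-negative response counts per category.

    0 means responses are spread evenly over the categories; the value
    grows as responses concentrate on fewer categories.
    """
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        raise ValueError("need at least one category and one response")
    xs = sorted(counts)
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, with i = 1..n over sorted x
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


print(gini([5, 5, 5, 5]))   # responses spread evenly
print(gini([0, 0, 0, 20]))  # all responses on one category
```

With this unnormalized form, the maximum for n categories is (n - 1) / n, reached when every response falls on one category; multiplying by n / (n - 1) is one common way to rescale the maximum to 1.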

32 of 34

32

Discussion

Findings from the Study

Generalizability

  • The task can be answered in binary form (e.g., yes or no) and does not require open-ended responses.
  • The task can be broken down into the smallest units of work.
  • The task is simple enough to allow users to make judgments within a few seconds without a second thought.
  • The task embraces subjective responses, but the expected responses should have a high agreement rate.

33 of 34

33

Discussion

Findings from the Study

Guidelines for Games with a Purpose

  • Identification of two important considerations in designing game interfaces that successfully engage and motivate participants:

  • Benefits of consistent and attractive UI design
  • Usefulness of utilizing mainstream games

34 of 34

34

Find the Bot!

Project page: https://github.com/diag-dgist/FindtheBot
