1 of 42

Addressing and Visualizing Misalignments in Human Task-Solving Trajectories

Sejin Kim GIST AI Graduate School

2024. 12. 17

2 of 42

Outline

- Motivation

- Previous Works

1. O2ARC

2. ARCLE

- Misalignments in O2ARC Trajectories

- Trajectory Visualization

- Future Plan

3 of 42

Motivation

4 of 42

ARC (Abstraction and Reasoning Corpus)

5 of 42

Abstraction & Reasoning in Solving ARC Tasks

Abstraction

Reasoning

O2ARC Task-342

6 of 42

Research Goal

Developing a Human-like RL Agent through Conceptual Abstraction and Reasoning

7 of 42

Research Goal

Developing a Human-like RL Agent through Conceptual Abstraction and Reasoning

Analysis of LLMs’ Capabilities on Abstraction and Reasoning ⇦ Presented by Woochang Sim on Nov. 7

LLMs' abstraction and reasoning capabilities grounded in LoTH

8 of 42

Research Goal

Developing a Human-like RL Agent through Conceptual Abstraction and Reasoning

Analysis of LLMs’ Capabilities on Abstraction and Reasoning

LLMs' abstraction and reasoning capabilities grounded in LoTH

Learning Conceptual Abstraction based on User Intention ⇦ Today’s topic

Analysis of the differences between O2ARC Trajectories and user intentions
Importance of intention-level learning of task-solving trajectories

9 of 42

Research Goal

Developing a Human-like RL Agent through Conceptual Abstraction and Reasoning

Analysis of LLMs’ Capabilities on Abstraction and Reasoning

LLMs' abstraction and reasoning capabilities grounded in LoTH

Learning Conceptual Abstraction based on User Intention

Analysis of the differences between O2ARC Trajectories and user intentions
Importance of intention-level learning of task-solving trajectories

Developing an RL Agent Capable of Human-like System-2 Reasoning ⇦ Future plan

10 of 42

Previous Works

11 of 42

Object-Oriented ARC (O2ARC 3.0)

- Tool for humans to solve ARC tasks

- Around 13k trajectories collected

IJCAI 2024 Demo

12 of 42

Motivation

Lack of ARC Task-Solving Trajectory Data

ARC tasks consist of only a few input/output grid pairs
Data on the reasoning process to infer the output grid from the input grid is required
Collecting sufficient solution data from humans is too costly
Current AI models struggle to augment human-level solution data

13 of 42

Goal

Encouraging Voluntary User Participation

Gamified interface and competitive components (i.e., score, leaderboard)
Approximately 13k trajectories have been collected

14 of 42

Trajectories Collected by O2ARC

Trajectories Containing Rich Information

Grid-related information (grid size, color of each pixel)
Object-related information (location, pixels of the object, pixels of the background)
Action-related information (type of action, target pixels of the action)
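The three kinds of information listed above could be held in a record like the following. This is only an illustrative sketch: the class names, field names, and the "top-left corner" convention are assumptions made for this example and are not the actual O2ARC logging schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Grid = List[List[int]]   # each cell holds a color index
Pixel = Tuple[int, int]  # (row, column)


@dataclass
class ObjectInfo:
    # Object-related information (hypothetical field names).
    location: Pixel          # assumed convention: top-left corner of the object
    pixels: List[Pixel]      # cells that belong to the object
    background: List[Pixel]  # cells treated as background


@dataclass
class Step:
    grid: Grid               # grid-related information: color of each pixel
    objects: List[ObjectInfo]
    action_type: str         # hypothetical label such as "move" or "color"
    target_pixels: List[Pixel]

    @property
    def grid_size(self) -> Tuple[int, int]:
        # (height, width) derived from the stored grid
        return (len(self.grid), len(self.grid[0]) if self.grid else 0)


@dataclass
class Trajectory:
    task_id: str
    steps: List[Step] = field(default_factory=list)


# Toy usage: one step that recolors a single cell of a 2x3 grid.
obj = ObjectInfo(location=(0, 0), pixels=[(0, 0)], background=[(0, 1), (0, 2)])
step = Step(grid=[[1, 0, 0], [0, 0, 0]], objects=[obj],
            action_type="color", target_pixels=[(0, 0)])
trajectory = Trajectory(task_id="toy-task", steps=[step])
```

Keeping the grid, the objects, and the action in one step record makes it straightforward to replay a trajectory or to compare the recorded actions against the user's intention later.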

15 of 42

Trajectories Collected by O2ARC

Trajectories Containing Efficient Processes

Collects object-centric trajectories by providing object-based actions
Encourages competition to gather data consisting of a minimal number of actions

16 of 42

ARC Learning Environment (ARCLE)

- RL Env for AI to solve ARC tasks

- PPO handles a single task

CoLLAs 2024

17 of 42

Motivation

Lack of an RL Environment for Training on O2ARC Trajectories

RL agents learn through interaction with an RL environment
The ARC benchmark has not been explored using RL
An environment is needed to train on O2ARC trajectories

18 of 42

Goal

Providing an RL Environment for Learning ARC Task-Solving Processes

The MDP formulation of O2ARC Trajectories
Normalization of the object-based actions provided by O2ARC 3.0
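As a rough illustration of what an MDP formulation of a grid-editing trajectory might look like, the toy sketch below treats the grid as the state and an (operation, selection, argument) triple as the action. The operation names, the reward rule, and the `step` function are assumptions made for this example; they are not the ARCLE API.

```python
from typing import FrozenSet, Tuple

Grid = Tuple[Tuple[int, ...], ...]          # immutable grid of color indices
Pixel = Tuple[int, int]                     # (row, column)
Action = Tuple[str, FrozenSet[Pixel], int]  # (operation, selected pixels, argument)


def step(state: Grid, action: Action, target: Grid):
    """Deterministic transition: returns (next_state, reward, done)."""
    op, selection, arg = action
    if op == "submit":
        # Assumed sparse reward: 1 only if the submitted grid matches the target.
        return state, (1.0 if state == target else 0.0), True
    if op == "color":
        # Paint every selected pixel with the color given in `arg`.
        nxt = tuple(
            tuple(arg if (r, c) in selection else v for c, v in enumerate(row))
            for r, row in enumerate(state)
        )
        return nxt, 0.0, False
    raise ValueError(f"unknown operation: {op}")


# Toy episode: color the top row, then submit.
start = ((0, 0), (0, 0))
target = ((3, 3), (0, 0))
top_row = frozenset({(0, 0), (0, 1)})
s1, r1, d1 = step(start, ("color", top_row, 3), target)
s2, r2, d2 = step(s1, ("submit", frozenset(), 0), target)
```

Under this view a recorded human trajectory is simply a sequence of such (state, action) pairs, which is the form an RL agent can learn from.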

19 of 42

Training a PPO Agent through ARCLE

Performance of Agents with Various Policies

The agent conditioned on previous actions outperformed the agent that was not
The agent is learning the process of solving the given ARC tasks
The agent that emphasized only the meaning of color changes showed less improvement

20 of 42

Misalignments in O2ARC Trajectories

21 of 42

Motivation

The PPO Agent Trained through ARCLE Does Not Learn Well on a Single ARC Task

We hypothesize that this discrepancy stems from intention-trajectory misalignment
O2ARC actions consist of both pixel-level actions and object-level actions
Some user intentions do not correspond one-to-one with actions in the trajectories

22 of 42

Goal

Identify Misalignments with Activity Theory

Users

Tasks

Tools

23 of 42

Goal

Categorize Misalignments into 3 Cases

24 of 42

Goal

Quantify Misalignments in Various Ways

25 of 42

Activity Theory

Human Activities through Interactions between Users, Tasks, and Tools

User (Subject) refers to the individual or group performing the activity to achieve a goal
Task (Object) is the goal of the activity, or the overall activity itself
Tool (Instrument) is the tool or means used by the subject to effectively achieve the object

Users

Tasks

Tools

26 of 42

Misalignments via Activity Theory

Misalignments between Trajectories and Intentions Explained by Activity Theory

Misalignments between Tools and Tasks (Functional Inadequacies in Tools) occur when tools lack functionality to support the tasks

Users

Tasks

Tools

27 of 42

Misalignments via Activity Theory

Misalignments between Trajectories and Intentions Explained by Activity Theory

Misalignments between Users and Tools (User Unfamiliarity with Tools) result from limited tool proficiency

Users

Tasks

Tools

28 of 42

Misalignments via Activity Theory

Misalignments between Trajectories and Intentions Explained by Activity Theory

Misalignments between Users and Tasks (Cognitive Dissonance in Users) stem from misunderstandings of the task

Users

Tasks

Tools

29 of 42

Example Case: O2ARC Task-49

30 of 42

Ideal Trajectories

O2ARC Task-49

31 of 42

Functional Inadequacies in Tools

O2ARC Task-49

32 of 42

User Unfamiliarity with Tools

O2ARC Task-49

33 of 42

Cognitive Dissonance in Users

O2ARC Task-49

34 of 42

Trajectory Visualization

A node represents a state

Radius: in-degree
Color: remark

An edge represents an action

Thickness: frequency
Color: type
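The node and edge attributes above can be derived from raw trajectories roughly as follows. This is a minimal sketch, assuming each trajectory is a list of (state, action, next_state) triples with hashable states and that in-degree counts distinct incoming edges; `build_graph` and the toy state names are hypothetical and not part of the O2ARC tooling.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple

Transition = Tuple[Hashable, str, Hashable]  # (state, action, next_state)


def build_graph(trajectories: List[List[Transition]]):
    """Return (in_degree per node, frequency per edge) for a set of trajectories."""
    edge_freq: Counter = Counter()
    for traj in trajectories:
        for s, a, s2 in traj:
            edge_freq[(s, a, s2)] += 1  # edge thickness: how often the edge is taken

    in_degree: Counter = Counter()
    nodes = set()
    for s, _a, s2 in edge_freq:
        nodes.update((s, s2))
        in_degree[s2] += 1              # node radius: number of distinct incoming edges

    return {n: in_degree[n] for n in nodes}, dict(edge_freq)


# Toy trajectories from three users solving the same task.
t1 = [("s0", "move", "s1"), ("s1", "color", "goal")]
t2 = [("s0", "move", "s1"), ("s1", "flip", "goal")]
t3 = [("s0", "color", "s2"), ("s2", "move", "goal")]
radii, thickness = build_graph([t1, t2, t3])
avg_in_degree = sum(radii.values()) / len(radii)
```

In this toy data the goal state collects three distinct incoming edges, so it would be drawn with the largest radius, while the shared first move `s0 -> s1` would be drawn as the thickest edge.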

O2ARC Task-49

Task ID: 23b5c85d

https://o2arc.com/task/23b5c85d

35 of 42

Misalignment Analysis

36 of 42

Action-Level Analysis

Tasks with high and low average in-degrees exhibit different patterns

High: Users apply object-level actions that align with their intentions
Low: Users rely on irregular pixel-level actions, so misalignments occur frequently

Object-level Actions

Pixel-level Actions

37 of 42

Intention-Level Analysis

Some actions reflecting users' common intentions should be added to O2ARC

Most intentions are aligned (91.11%), so O2ARC trajectories are reliable for strategy analysis
Functional Inadequacy has the highest actions-to-intentions ratio (6.64)
This highlights the need to improve O2ARC's action set
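An actions-to-intentions ratio of this kind can be computed per category once each intention has been annotated with the number of actions the user needed to carry it out. The sketch below assumes that annotation format; the function name, category labels, and numbers are toy values for illustration and are not the figures reported above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def actions_per_intention(segments: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """segments: one (category, number_of_actions) entry per annotated intention."""
    actions = defaultdict(int)
    intentions = defaultdict(int)
    for category, n_actions in segments:
        actions[category] += n_actions
        intentions[category] += 1
    # Ratio = total actions spent / number of intentions, per category.
    return {c: actions[c] / intentions[c] for c in intentions}


# Toy annotations: aligned intentions take one or two actions, while an
# intention the tool cannot express directly takes many pixel-level actions.
toy_segments = [
    ("aligned", 1),
    ("aligned", 2),
    ("functional_inadequacy", 6),
    ("functional_inadequacy", 8),
]
ratios = actions_per_intention(toy_segments)
```

A ratio near 1 means one action per intention; a high ratio indicates that a single intention had to be decomposed into many actions, which is the signature of a missing tool function.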

38 of 42

Trajectory-Level Analysis

Trajectory analysis highlights user strategies, tool limits, and misalignment causes

35% of trajectories show structured strategies, especially in simple tasks, aiding AI training
Over 30% of misalignments stem from tool limits in repetitive tasks like pixel edits
User Unfamiliarity and Cognitive Dissonance rarely overlap (only 3.45%)

39 of 42

Task-Level Analysis

KDE plot represents the distribution of misalignments across tasks

User Unfamiliarity occurs frequently at the early stage with low intention proportions
Functional Inadequacy occurs consistently throughout
O2ARC requires improvements in its action set and better explanations for beginners

40 of 42

Future Plans

41 of 42

Future Plans

General human intention detection via O2ARC Tasks
Reward modeling based on O2ARC trajectories
Program synthesis with intention-aligned trajectories

42 of 42

Q&A session