Learning and Leveraging Conventions in the Design of an Adaptive Haptic Shared Control for Steering a Semi-Automated Vehicle
Vahid Izadi and Amirhossein Ghasemi
Email: ah.Ghasemi@uncc.edu
Website: https://coefs.uncc.edu/aghasem1
Spring 2022
Motivation
Autonomous ground vehicles hold great potential for both military and commercial applications.
Faults Happen
When Is Semi-Autonomy Helpful?
Detect-Detect
Miss-Detect
Detect-Miss
Miss-Miss
Current Transfer Control Approaches
Input Mixing Shared Control:
Shortcomings: No Direct Connection between Human and Automation
Haptic Shared Control: also referred to as physical human-robot interaction (pHRI), this is a sharing scheme that involves physical coupling between a human partner and a co-robot.
Input Switching Control:
Shortcomings:
Haptic Shared Control
Human Driver
Automation System
PD-control
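A minimal sketch of the physical coupling, assuming both the human driver and the automation system act on the same steering wheel as PD (spring-damper) controllers pulling toward their own desired angle. The wheel model, gains, and function name are hypothetical and chosen only for illustration.

```python
def simulate_haptic_shared_control(theta_human, theta_auto,
                                   k_human=2.0, k_auto=1.0,
                                   d_human=0.5, d_auto=0.5,
                                   inertia=0.1, damping=0.2,
                                   dt=0.001, steps=20000):
    """Return the final steering angle when two PD torques share one wheel.

    All parameter values are illustrative assumptions, not identified from data.
    """
    theta, omega = 0.0, 0.0  # wheel angle [rad] and angular rate [rad/s]
    for _ in range(steps):
        # Each agent applies a PD torque toward its own desired angle.
        tau_human = k_human * (theta_human - theta) - d_human * omega
        tau_auto = k_auto * (theta_auto - theta) - d_auto * omega
        # Wheel dynamics: inertia * accel = sum of torques - viscous damping.
        accel = (tau_human + tau_auto - damping * omega) / inertia
        omega += accel * dt  # semi-implicit Euler step
        theta += omega * dt
    return theta


if __name__ == "__main__":
    # The wheel settles at the stiffness-weighted average of the two targets.
    print(simulate_haptic_shared_control(theta_human=0.3, theta_auto=0.0))
```

In this sketch the agent with the higher proportional gain pulls the equilibrium angle closer to its own target, which is one simple way to read a leader/follower role.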
Adding Automation Can Cause Conflicts
Detect-Detect
Miss-Detect
Detect-Miss
Miss-Miss
Research Goals
Intention (Right or Left) and Role Negotiations (leader/follower) can lead to successful collaboration.
Conventions can narrow the set of possible equilibria to a subset (or norms) to which the team might more naturally gravitate. Current control transfer mechanisms are built on fixed conventions.
Goal: Develop the principles of convention formation in a haptic shared control framework to determine optimal handover strategies in uncertain circumstances.
Challenge: In a multi-agent repeated game context, there might be multiple strategies for solving a problem (several equilibria), with some preferable to others.
Goal 1: Create a modular structure that separates partner-specific conventions from task-dependent representations.
Goal 2: Characterize the map from the space of conventions to outcomes in driver-automation collaboration.
Goal 3: Develop techniques to influence and guide the agents (automation and driver) to reach a desirable shared convention.
Research Goals
Goal 1: Creating a modular structure that separates partner-specific conventions from task-dependent representations.
Let’s define a cost function for each agent
Goal 2: Creating a map from the space of conventions to outcomes in multiagent and driver-automation collaboration
Nash strategy
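A minimal sketch of why several equilibria can coexist, assuming a two-action intention game (steer left or right) with hypothetical cost values. It enumerates the pure-strategy Nash equilibria, that is, the action pairs from which neither agent can lower its own cost by deviating alone.

```python
from itertools import product

ACTIONS = ("left", "right")

# Hypothetical costs (lower is better), keyed by (human_action, automation_action).
# Each value is (human_cost, automation_cost).
COSTS = {
    ("left", "left"): (1.0, 2.0),
    ("left", "right"): (5.0, 5.0),
    ("right", "left"): (5.0, 5.0),
    ("right", "right"): (2.0, 1.0),
}


def pure_nash_equilibria(costs, actions):
    """List action pairs where no agent gains by deviating unilaterally."""
    equilibria = []
    for human, auto in product(actions, actions):
        human_cost, auto_cost = costs[(human, auto)]
        # Human cannot do better by switching while automation stays put.
        human_ok = all(human_cost <= costs[(h, auto)][0] for h in actions)
        # Automation cannot do better by switching while human stays put.
        auto_ok = all(auto_cost <= costs[(human, a)][1] for a in actions)
        if human_ok and auto_ok:
            equilibria.append((human, auto))
    return equilibria


if __name__ == "__main__":
    print(pure_nash_equilibria(COSTS, ACTIONS))
```

With these illustrative numbers both agreeing on left and both agreeing on right are equilibria, and each agent prefers a different one; a convention is what selects between them.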
Zero-sum vs. Non-zero Sum Games
Zero-sum: one agent wins and the other loses.
Non-zero-sum: one agent wins and the other loses, or both win.
Overall Goal: Minimize the combined cost
Challenge: Minimizing the combined cost can favor one agent more than the other.
Cooperative-Competitive Game
Possible outcomes: one agent wins and the other loses, or both win.
Zero-sum part:
Nonzero-sum part:
This co-co concept models a situation where one agent pays/receives an incentive to implement a strategy that minimizes the combined cost function. The co-co game requires communication between the agents.
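A minimal sketch of the split, assuming the standard co-co decomposition in which each agent's cost is the sum of a shared cooperative part and an antisymmetric zero-sum part. The function and variable names are hypothetical.

```python
def coco_decompose(cost_human, cost_automation):
    """Split two agents' costs into cooperative and competitive parts.

    cooperative: the average cost, identical for both agents.
    competitive: the zero-sum part; added for the human and
                 subtracted for the automation.
    """
    cooperative = 0.5 * (cost_human + cost_automation)
    competitive = 0.5 * (cost_human - cost_automation)
    return cooperative, competitive


def coco_recompose(cooperative, competitive):
    """Recover the individual costs from the two parts."""
    return cooperative + competitive, cooperative - competitive


if __name__ == "__main__":
    coop, comp = coco_decompose(3.0, 1.0)
    print(coop, comp)                  # cooperative and competitive parts
    print(coco_recompose(coop, comp))  # original pair of costs
```

The competitive parts of the two agents always sum to zero, so minimizing the combined cost acts only on the cooperative part, while the competitive part indicates which agent is being favored.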
Examples of Convention Surfaces
Panels: Active Safety, Neutral, Assistive Behavior
Plotted quantities: Cooperative Cost Value, Competitive Cost Value
Three sample convention-based paradigms
Neutral Behavior
Selfless Behavior
Selfish Behavior
Legend: dashed line, automation’s preferred path; dotted line, human’s preferred path; solid line, vehicle path
Goal 3: Develop techniques to influence and guide the agents to reach shared conventions.
RL-MPC for Policy Search
Block diagram labels: Reinforcement Learning for High-Level Policy; High-Level Decision Variables; Haptic Shared Control; MPCs; Dynamics; Actions; References; States; Observations
Quantities shown: automation’s preferred path; human’s preferred path; vehicle path; differential torque (fight); human’s weight for being selfish; automation’s weight for being selfish; cost values for the cooperative part; cost values for the competitive part
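A toy sketch of the two-level idea, under strong assumptions: the high-level learner is reduced to an epsilon-greedy bandit over a few candidate values of the automation's selfishness weight, and the MPC layer is replaced by a one-step stand-in that blends the two preferred paths. The reward, the candidate weights, and every name below are hypothetical.

```python
import random


def low_level_outcome(w_human, w_auto, y_human=1.0, y_auto=0.0):
    """Stand-in for the MPC layer (assumption): blend the preferred paths
    by the selfishness weights and return (cooperative, competitive) costs."""
    y_vehicle = (w_human * y_human + w_auto * y_auto) / (w_human + w_auto)
    cost_human = (y_vehicle - y_human) ** 2
    cost_auto = (y_vehicle - y_auto) ** 2
    cooperative = 0.5 * (cost_human + cost_auto)
    competitive = 0.5 * (cost_human - cost_auto)
    return cooperative, competitive


def search_auto_weight(candidates, w_human=1.0, episodes=200,
                       epsilon=0.1, seed=0):
    """Epsilon-greedy search for the automation weight with the best reward."""
    rng = random.Random(seed)
    value, count = {}, {}
    for episode in range(episodes):
        if episode < len(candidates):
            w_auto = candidates[episode]        # try every candidate once
        elif rng.random() < epsilon:
            w_auto = rng.choice(candidates)     # explore
        else:
            w_auto = max(value, key=value.get)  # exploit current best
        cooperative, competitive = low_level_outcome(w_human, w_auto)
        # Hypothetical reward: low combined cost and low imbalance.
        reward = -(cooperative + abs(competitive))
        count[w_auto] = count.get(w_auto, 0) + 1
        old = value.get(w_auto, 0.0)
        value[w_auto] = old + (reward - old) / count[w_auto]  # running mean
    return max(value, key=value.get)


if __name__ == "__main__":
    print(search_auto_weight([0.25, 0.5, 1.0, 2.0, 4.0]))
```

Because this stand-in is deterministic, the search simply recovers the weight that balances the two agents; a realistic setup would replace `low_level_outcome` with closed-loop MPC rollouts and a learned policy over states.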
Future Work
Goal 1: Learning Conventions
Goal 2: Design an Adaptable and Personalized Automation
Goal 3: Transfer of Convention Knowledge to adapt to New Users and Coordinate on New Tasks
Goal 4: Test and Validation of Convention Formation
Diagram panels, in order:
1. Tasks, Human Users, Automation System
2. Tasks, Human Users, Adaptable Automation
3. New Tasks, New Human, Adaptable Automation
4. New Tasks, New Human, Automation System
Thank You!