Towards Learning-based Approximately Optimal Control in (Constrained) Decentralized Dynamic Teams
Vijay Subramanian
ECE Division, EECS Department, University of Michigan (UM), Ann Arbor
Joint work with Nouman Khan (UM) (CDC 2023, INFOCOM 2024), Hsu Kao (JP Morgan) (AISTATS 2022),
Ujwal Dinesha, Subrahmanyam Arunachalam, Dheeraj Narasimha, and Srinivas Shakkottai (TAMU) (INFOCOM 2024)
April 8, 2024
SNAPP Seminar
Acknowledgments
Decision-Making in Multi-Agent Systems/Networks
UAV Swarms
Connected autonomous driving
Smart grids
Drone Delivery
Warehouse Robots
Background: Sequential Decision Making and Information States
Challenges for cooperative multi-agent problems
Solutions for cooperative multi-agent problems
Multi-agent Constrained Dec-POMDP
Decision Process
[Figure: decision process diagram, with labels "Local observation" and "Local action"]
Decision Process
Optimization Problem
Policy-Profile Space
Objective and Constraint Functions
Optimization Problem
Is the Lagrangian formulation useful?
Optimization Problem: Comments
Optimization Problem
Key Assumptions
Strong Duality Result
Result 1: Strong Duality
Result on Existence of Saddle-Point
Result 2: Existence of Saddle-Point
Main Idea 1 (A Suitable Minimax Theorem)
Proposition 1: A Suitable Minimax Theorem
W-S/Borkar Topology
Analysis Steps
Summary of General Results
Wireless Downlink Video-Streaming
Problem Setting
Problem Statement
Goal
System Model I
System Model II
Decision Problem
Policy-Profile Space
Objective and Constraint Functions
Decision Problem
Optimization Problem
Lagrangian formulation is useful: it makes immediate costs separable!
Key Assumptions for Scalable Multi-Agent Control
Issue: No common information:
Assumption 1: Factorized Transition Law.
Assumption 2: Additively separable immediate costs.
Nested information for each ED:
Knowing BTS & ED action, ED state evolves independently
Lagrangian relaxation separates immediate costs
Result 3
Optimal Control for a Single Transmitter-Receiver Problem
Learning-based Solution
Numerical Results I - Parameters
Numerical Results II
Extensions to dynamic games?
Thank You
Background for Wireless Video Streaming
Summary of Wireless Video Streaming