Neural Pontryagin Optimal Controller for
Lossy Energy Storage with Nonlinear Efficiency
Outline
Background and Motivation
Introduction
Method
Simulation Results
Conclusion and Future Work
Motivation
Motivation-Battery Control Challenge
Motivation-Current Methods’ Challenge
Dynamic Programming / Model Predictive Control (MPC):
Model-Free RL:
Model-Based RL:
Introduction-Overall Framework
In this work, we propose:
Novel Framework: Neural-PMP integrates Pontryagin's Maximum Principle (PMP) with neural-network-learned dynamics
New Algorithm: a gradient-based method that solves the PMP conditions efficiently
Improved Performance:
Introduction
Problem Formulation
Battery arbitrage as an optimal control problem: Objectives
(1) Charging cost
(2) Penalty for excessive (dis)charging amount u_t:
(3) Penalty term acting as a soft constraint to prevent exceeding battery limits
Battery arbitrage constraints on state and controls
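A minimal sketch of how the three objective terms might be combined. The linear energy cost, the quadratic forms of both penalties, the weights `alpha` and `beta`, the limits, and the lossless state update are all illustrative assumptions, not the formulation used in this work.

```python
def arbitrage_cost(prices, controls, soc0, alpha=0.1, beta=10.0,
                   soc_min=0.0, soc_max=1.0):
    """Total cost of a control sequence (every functional form here is assumed)."""
    soc = soc0
    total = 0.0
    for p, u in zip(prices, controls):
        total += p * u                 # (1) charging cost; u < 0 earns revenue
        total += alpha * u ** 2        # (2) penalty on large (dis)charging amounts
        soc = soc + u                  # lossless update, for illustration only
        over = max(0.0, soc - soc_max)
        under = max(0.0, soc_min - soc)
        total += beta * (over ** 2 + under ** 2)   # (3) soft battery-limit penalty
    return total
```

Usage: `arbitrage_cost([1.0, 2.0], [0.5, -0.5], soc0=0.0)` charges at the low price and discharges at the high one, so the total is negative (a net profit).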
Problem Formulation: Nonlinear Efficiency
Real batteries incur efficiency losses.
Efficiency is nonlinear and battery-specific due to electrochemical properties.
Examples: sigmoid, piecewise linear, quadratic forms
The battery's charging efficiency function is not known explicitly to the user or the controller, which motivates using a neural network to approximate the dynamics
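As an illustration of the sigmoid example, the sketch below defines a hypothetical efficiency curve and a lossy state-of-charge update. The curve shape, the parameters `eta_min`, `eta_max`, and `k`, and the charge/discharge convention are assumptions chosen for readability, not a measured battery model.

```python
import math

def sigmoid_efficiency(u, eta_min=0.80, eta_max=0.95, k=4.0):
    """Hypothetical efficiency that falls from eta_max toward eta_min as |u| grows."""
    s = 1.0 / (1.0 + math.exp(-k * abs(u)))     # lies in [0.5, 1)
    return eta_max - (eta_max - eta_min) * (2.0 * s - 1.0)

def lossy_step(soc, u):
    """One step of assumed lossy dynamics: u > 0 charges, u < 0 discharges."""
    eta = sigmoid_efficiency(u)
    if u >= 0:
        return soc + eta * u      # only a fraction of purchased energy is stored
    return soc + u / eta          # delivering |u| drains more than |u| from storage
```

In practice the controller does not see `sigmoid_efficiency`; it only observes transitions such as `(soc, u, lossy_step(soc, u))`, which is what the learned model is fitted to.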
PMP Conditions
In classical optimal control theory, PMP conditions provide the necessary conditions for finding the optimal control actions:
Here H(·) is the Hamiltonian and λ is the costate. The optimality conditions are:
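For reference, one standard discrete-time statement of these conditions is sketched below for a cost-minimization problem. The stage cost c_t, terminal cost \Phi, dynamics f, and horizon T are notation assumed here and may differ from the notation used in this work; the stationarity condition is written for the unconstrained case.

```latex
\begin{aligned}
H_t(x_t, u_t, \lambda_{t+1}) &= c_t(x_t, u_t) + \lambda_{t+1}^{\top} f(x_t, u_t), \\
x_{t+1} &= \frac{\partial H_t}{\partial \lambda_{t+1}} = f(x_t, u_t), \\
\lambda_t &= \frac{\partial H_t}{\partial x_t}, \qquad
\lambda_T = \frac{\partial \Phi}{\partial x_T}, \\
0 &= \frac{\partial H_t}{\partial u_t}, \qquad t = 0, \dots, T-1 .
\end{aligned}
```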
PMP with Neural Dynamics: Learning
Key step: Train NN-parameterized battery dynamics, and plug it into PMP conditions (Line 5 in Algorithm)
This is achievable because battery measurements are always available!
Pipeline: Battery Measurements → NN Surrogate Dynamics → PMP Conditions for Optimal Control → Optimal Control Sequences
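The learning stage of this pipeline can be sketched as fitting a small network to one-step transitions (x_t, u_t, x_{t+1}). The example below is a pure-Python toy under stated assumptions: `true_step` is a made-up battery used only to generate synthetic measurements, and the one-hidden-layer architecture, learning rate, and residual parameterization are illustrative choices, not the authors' model.

```python
import math
import random

def true_step(x, u):
    """Stand-in for the unknown battery; used only to generate synthetic data."""
    eta = 0.95 - 0.1 * abs(u)                     # assumed efficiency curve
    return x + (eta * u if u >= 0 else u / eta)

def init_params(hidden=8, seed=0):
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)],
        "b1": [0.0] * hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(hidden)],
        "b2": 0.0,
    }

def predict(params, x, u):
    """Surrogate f_theta(x, u) = x + MLP(x, u); returns prediction and hidden units."""
    h = [math.tanh(w[0] * x + w[1] * u + b)
         for w, b in zip(params["W1"], params["b1"])]
    out = x + sum(w * hj for w, hj in zip(params["W2"], h)) + params["b2"]
    return out, h

def sgd_step(params, x, u, target, lr=0.05):
    """One stochastic gradient step on the squared one-step prediction error."""
    pred, h = predict(params, x, u)
    err = pred - target                           # d(0.5 * err^2) / d(pred)
    for j, hj in enumerate(h):
        grad_pre = err * params["W2"][j] * (1.0 - hj * hj)   # back through tanh
        params["W2"][j] -= lr * err * hj
        params["W1"][j][0] -= lr * grad_pre * x
        params["W1"][j][1] -= lr * grad_pre * u
        params["b1"][j] -= lr * grad_pre
    params["b2"] -= lr * err

def mse(params, data):
    return sum((predict(params, x, u)[0] - y) ** 2 for x, u, y in data) / len(data)

# Synthetic "battery measurements": random states and actions with observed next state.
rng = random.Random(1)
data = []
for _ in range(200):
    x, u = rng.uniform(0.0, 1.0), rng.uniform(-0.3, 0.3)
    data.append((x, u, true_step(x, u)))

params = init_params()
loss_before = mse(params, data)
for epoch in range(50):
    for x, u, y in data:
        sgd_step(params, x, u, y)
loss_after = mse(params, data)
```

Comparing `loss_before` with `loss_after` gives a quick check that the surrogate has picked up the transition structure before it is plugged into the PMP conditions.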
PMP with Neural Dynamics: Control
Once the dynamics model is learned, take gradient steps to iteratively optimize the control sequence u
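A hedged sketch of such a gradient loop: roll the model forward, sweep the costate backward, and step each control along -∂H/∂u. Here `dynamics` is a smooth hand-written stand-in for the learned network, derivatives use finite differences instead of automatic differentiation, and the cost terms, step size, and control bound `u_max` are assumptions made for illustration.

```python
def dynamics(x, u):
    """Stand-in for the learned surrogate f_theta(x, u); the form is assumed."""
    return x + (0.95 - 0.1 * u * u) * u

def limit_penalty(x, beta=20.0):
    """Soft constraint keeping the state within [0, 1]."""
    over, under = max(0.0, x - 1.0), max(0.0, -x)
    return beta * (over * over + under * under)

def stage_cost(p, x, u, alpha=0.5):
    return p * u + alpha * u * u + limit_penalty(x)

def hamiltonian(p, x, u, lam_next):
    return stage_cost(p, x, u) + lam_next * dynamics(x, u)

def ddx(fn, z, eps=1e-5):
    """Central finite difference, standing in for automatic differentiation."""
    return (fn(z + eps) - fn(z - eps)) / (2.0 * eps)

def pmp_gradient(prices, x0, controls):
    """Forward rollout, backward costate sweep, then dH_t/du_t for every step."""
    xs = [x0]
    for u in controls:                           # state equation
        xs.append(dynamics(xs[-1], u))
    lam = ddx(limit_penalty, xs[-1])             # terminal costate from terminal penalty
    grads = [0.0] * len(controls)
    for t in reversed(range(len(controls))):
        p, x, u = prices[t], xs[t], controls[t]
        grads[t] = ddx(lambda uu: hamiltonian(p, x, uu, lam), u)
        lam = ddx(lambda xx: hamiltonian(p, xx, u, lam), x)   # costate equation
    return grads

def optimize_controls(prices, x0, steps=1000, lr=0.005, u_max=0.3):
    """Projected gradient descent on the control sequence."""
    controls = [0.0] * len(prices)
    for _ in range(steps):
        grads = pmp_gradient(prices, x0, controls)
        controls = [min(u_max, max(-u_max, u - lr * g))
                    for u, g in zip(controls, grads)]
    return controls
```

Usage: `optimize_controls([1.0, 1.0, 3.0, 3.0], x0=0.2)` returns one control per time step; with these toy settings the sequence charges during the two cheap periods and discharges during the two expensive ones.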
Neural PMP’s Properties
Simulation Setups
Benchmarked Algorithms:
i) a Linear-Convex solver, which first uses linear regression to fit linearized system dynamics and then solves the MPC problem with the CVXPY solver;
ii) the state-of-the-art model-free RL algorithm Proximal Policy Optimization (PPO);
iii) a model-based random-shooting MPC (RS-MPC) controller.
We test in both single-battery and multiple-battery settings.
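For readers unfamiliar with the random-shooting baseline, a generic sketch follows: sample candidate control sequences, roll each through a dynamics model, and keep the cheapest. This is the textbook idea only, not the benchmark implementation used here; the function names, the uniform sampling, `n_samples`, and `u_max` are assumptions.

```python
import random

def rollout_cost(prices, x0, controls, step, cost):
    """Accumulate stage costs while rolling the model forward."""
    x, total = x0, 0.0
    for p, u in zip(prices, controls):
        total += cost(p, x, u)
        x = step(x, u)
    return total

def random_shooting(prices, x0, step, cost, n_samples=500, u_max=0.3, seed=0):
    """Return the lowest-cost sequence among uniformly sampled candidates."""
    rng = random.Random(seed)
    best_seq, best_cost = None, float("inf")
    for _ in range(n_samples):
        seq = [rng.uniform(-u_max, u_max) for _ in prices]
        c = rollout_cost(prices, x0, seq, step, cost)
        if c < best_cost:
            best_seq, best_cost = seq, c
    return best_seq, best_cost
```

In a receding-horizon loop only the first element of `best_seq` would be applied before re-planning at the next step.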
Simulation Results-Single Battery
Observations:
Simulation Results-Multiple Batteries
Conclusion and Future Work
Neural-PMP integrates PMP with NN dynamics for optimal battery control
Advantages:
Future work:
Thank you!