Safe control under input limits with neural control barrier functions
Simin Liu
Dolan Lab, Intelligent Control Lab
1
Outline
2
Why CBFs?
3
CBFs cover the expansive “set invariance” class of safety problems
4
State space
Safe set
Set invariance: keep state inside safe set for all time
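In symbols, a minimal statement of set invariance. The notation is an assumption, not taken from the slides: x(t) is the state trajectory and \mathcal{S} is the safe set.

```latex
x(0) \in \mathcal{S} \;\Longrightarrow\; x(t) \in \mathcal{S} \quad \text{for all } t \ge 0
```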
CBFs cover the expansive “set invariance” class of safety problems
5
Safe cobot-human interaction
Bipedal locomotion
Safe trajectory planning for AVs
CBFs can offer provable, flexible safe control
Provable: has mathematical guarantees of safety
Flexible: acts as “safety layer” on top of any other policy
6
How do CBFs work?
7
CBFs are a “danger index”
8
𝜙 = 0: at safe set boundary
Safe set
System provably safe if CBF never becomes positive
9
State space
Safe set
CBF gives affine “input safety constraint”
11
Note: constraint also state-dependent
Example of constraint set in control space
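A sketch of where the affine constraint comes from. The slide's own formula is not in this text, so this uses the standard form under assumptions: control-affine dynamics, a differentiable 𝜙 with safe set {𝜙 ≤ 0}, and a hypothetical gain \alpha > 0.

```latex
% Assumed control-affine dynamics
\dot{x} = f(x) + g(x)\,u

% Time derivative of \phi along the dynamics: affine in u
\dot{\phi}(x,u) = \nabla\phi(x)^\top f(x) + \nabla\phi(x)^\top g(x)\,u

% Requiring \dot{\phi} \le -\alpha\,\phi gives a half-space constraint on u
\underbrace{\nabla\phi(x)^\top g(x)}_{a(x)^\top}\, u
\;\le\;
\underbrace{-\alpha\,\phi(x) - \nabla\phi(x)^\top f(x)}_{b(x)}
```

Both a(x) and b(x) change with the state, which is why the set of safe inputs looks different at different states.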
CBF gives affine “input safety constraint”
12
𝜙
State axes
Safe set
Boundary (0 level set)
Control spaces at different states
Set of safe inputs
“Input safety constraint” can be implemented as top layer in hierarchical controller
13
Hierarchical controller
Safety layer: CBF quadratic program
Any nominal controller
x
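A minimal sketch of the safety layer, assuming the input safety constraint has already been written as a single half-space a·u ≤ b and that input limits are ignored. With one constraint the quadratic program has a closed-form solution (projection onto the half-space); a real implementation would typically call a QP solver. The names `safety_filter`, `u_nom`, `a`, and `b` are illustrative, not from the slides.

```python
import numpy as np

def safety_filter(u_nom, a, b):
    """Return the input closest to u_nom (Euclidean norm) with a @ u <= b.

    Solves  min_u ||u - u_nom||^2  s.t.  a @ u <= b  in closed form.
    Assumes a single constraint and no input limits.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    a = np.asarray(a, dtype=float)
    if not np.any(a):
        raise ValueError("a must be nonzero for the constraint to act on u")
    violation = a @ u_nom - b
    if violation <= 0.0:
        # The nominal input is already safe: pass it through unchanged.
        return u_nom
    # Otherwise move u_nom the minimum distance onto the constraint boundary.
    return u_nom - violation * a / (a @ a)
```

For example, `safety_filter([1.0, 0.0], [1.0, 0.0], 0.5)` moves the nominal input back to the constraint boundary, while an already-safe nominal input is returned as is.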
CBFs sound great! So, what’s the catch?
14
If a CBF is constructed in a limit-blind way, we’ll run into issues later
15
Example: limit-blind CBF for balancing cartpole
16
Implicitly defined by a function to keep negative
For limit-blind CBF, sometimes no feasible input satisfies safety constraint…
17
Control spaces at different states
Set of safe inputs
Input limit set
Sets don’t intersect!
𝜙
State axes
Safe set
Boundary (0 level set)
In practice, can only max out limits
The best we can do is max out the input limits, violating the safety constraint as little as possible.
18
Set of safe inputs
Input limit set
“As safe as possible” input
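A small sketch of the “as safe as possible” input, assuming the safety constraint is a·u ≤ b and the input limits form a box u_min ≤ u ≤ u_max. Under those assumptions the input that minimizes a·u sits at a corner of the box, which is what maxing out the limits means. The function names are hypothetical.

```python
import numpy as np

def safest_input(a, u_min, u_max):
    """Minimize a @ u over the box [u_min, u_max].

    Each coordinate goes to its lower limit where a_i > 0 and to its
    upper limit otherwise, so the result is a corner of the box.
    """
    a = np.asarray(a, dtype=float)
    u_min = np.asarray(u_min, dtype=float)
    u_max = np.asarray(u_max, dtype=float)
    return np.where(a > 0, u_min, u_max)

def has_feasible_safe_input(a, b, u_min, u_max):
    """True if some input within the limits satisfies a @ u <= b."""
    a = np.asarray(a, dtype=float)
    return bool(a @ safest_input(a, u_min, u_max) <= b)
```

If `has_feasible_safe_input` returns False, the set of safe inputs and the input limit set do not intersect, and `safest_input` is the least-violating choice.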
Example: limit-blind CBF for balancing cartpole
19
Our aim: find limit-friendly CBF
20
State space
Safe set
Stricter safe set
Example: balance cartpole
22
Q: Where on the boundary are there irrecoverable states?
Safe set for limit-blind CBF
Example: balance cartpole
24
But the exact shape of a non-saturating safe set is still hard to guess…
?
?
Finding limit-friendly CBF = finding CBF that obeys complex design constraint
25
State space
Safe set
Our key ideas:
26
Learner
Critic
Examples of saturation
Updated CBF
Unlike previous synthesis methods, ours scales to nonlinear, high-dimensional systems
*See references 1-9.
27
Nonlinear systems
Polynomial systems
Euler-Lagrange systems
Simple nonlinear systems
Recap
28
Roadmap
29
Posing the min-max optimization
30
Loss function measures “how unsafe” at state x
31
Design constraint → loss function
Set of safe inputs
Input limit set
is the most safe input
Satisfying design constraint is equivalent to min-max over loss
32
Q: How to interpret this?
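The formulas from these slides are not in this text. One plausible way to write the loss and the min-max is sketched here. The symbols are assumptions: \phi_\theta is the CBF with parameters \theta, \mathcal{U} is the input limit set, and \partial\mathcal{S}_\theta is the boundary {\phi_\theta = 0}.

```latex
% Loss at a boundary state x: the smallest achievable rate of change of \phi_\theta,
% attained by the most safe input u^*(x)
L(x;\theta) = \min_{u \in \mathcal{U}} \dot{\phi}_\theta(x,u),
\qquad
u^*(x) = \arg\min_{u \in \mathcal{U}} \dot{\phi}_\theta(x,u)

% Design constraint: every boundary state has a safe input within the limits
\max_{x \in \partial\mathcal{S}_\theta} L(x;\theta) \le 0

% Synthesis posed as a min-max over parameters and boundary states
\min_{\theta} \; \max_{x \in \partial\mathcal{S}_\theta} \; L(x;\theta)
```

Read this way, the inner max finds the boundary state where saturation is worst, and the outer min adjusts the CBF to reduce that worst case.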
Parametrizing the CBF to optimize over
33
We choose a neural CBF, enabling generic, scalable synthesis
34
We design a neural CBF that is stricter than the limit-blind CBF
Q: how would we modify 𝜙 to shrink its safe set?
35
State space
Stricter safe set
We design a neural CBF that is stricter than the limit-blind CBF
Q: how would we modify 𝜙 to shrink its safe set?
Add a positive function to 𝜙.
36
State space
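A sketch of the “add a positive function” idea, assuming a softplus output on a tiny tanh network. The architecture, the two-dimensional state, and the example `phi_blind` are all hypothetical; the slides do not specify them. Because the added term is positive, the new 𝜙 is never below the limit-blind one, so its safe set {𝜙 ≤ 0} can only shrink.

```python
import numpy as np

def softplus(z):
    # log(1 + exp(z)), computed stably; positive for any finite z.
    return np.logaddexp(0.0, z)

def phi_blind(x):
    # Hypothetical limit-blind CBF: keep |angle| below 0.5 rad.
    # x = [angle, angular velocity]
    return x[0] ** 2 - 0.5 ** 2

def init_params(rng, n_in=2, n_hidden=8):
    # Small random weights for a one-hidden-layer network.
    W1 = 0.5 * rng.normal(size=(n_hidden, n_in))
    b1 = np.zeros(n_hidden)
    w2 = 0.5 * rng.normal(size=n_hidden)
    b2 = 0.0
    return W1, b1, w2, b2

def phi_neural(x, params):
    # Limit-blind CBF plus a positive learned correction.
    W1, b1, w2, b2 = params
    hidden = np.tanh(W1 @ x + b1)
    return phi_blind(x) + softplus(w2 @ hidden + b2)
```

Any state with `phi_neural(x, params) <= 0` also has `phi_blind(x) <= 0`, so the learned safe set stays inside the original one regardless of the network weights.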
We design a neural CBF that is stricter than the limit-blind CBF
37
State space
Designing an optimization algorithm
38
Optimize min-max using learner-critic framework
39
Learner
Critic
Learner and critic both use gradient descent
40
Learner
Critic
Example of saturation
Updated CBF
with PGD
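A toy sketch of the learner-critic alternation, not the talk's implementation. Assumptions: a double integrator (p' = v, v' = u) with |u| ≤ U_MAX, a one-parameter CBF 𝜙 = p + θ·v, and boundary states indexed by v in [-V_MAX, V_MAX]. The critic uses projected gradient ascent to find the boundary state with the worst saturation; the learner takes a gradient descent step on θ at that state. It illustrates only the alternation, not the neural parametrization.

```python
import numpy as np

U_MAX = 1.0  # input limit |u| <= U_MAX (hypothetical value)
V_MAX = 2.0  # boundary states considered have |v| <= V_MAX (hypothetical value)

def loss(v, theta):
    # Boundary state is (p, v) with p = -theta * v, so phi = 0 there.
    # phi_dot = v + theta * u, and the most safe input is u = -U_MAX.
    return v - theta * U_MAX

def critic(theta, v0=0.0, steps=50, lr=0.1):
    # Projected gradient ascent on the loss over boundary states.
    v = v0
    for _ in range(steps):
        grad_v = 1.0  # d loss / d v for this toy loss
        v = float(np.clip(v + lr * grad_v, -V_MAX, V_MAX))  # step, then project
    return v

def learner(theta, v_bad, lr=0.2):
    # One gradient descent step on theta at the critic's counterexample.
    grad_theta = -U_MAX  # d loss / d theta, the same at every v_bad here
    return theta - lr * grad_theta

def train(theta=0.5, max_iters=100):
    for iteration in range(max_iters):
        v_bad = critic(theta)
        if loss(v_bad, theta) <= 0.0:
            return theta, iteration  # no saturating boundary state found
        theta = learner(theta, v_bad)
    return theta, max_iters
```

Calling `train()` alternates the two players until the critic can no longer find a boundary state with positive loss, which in this toy happens once θ reaches V_MAX / U_MAX.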
Techniques not covered:
But feel free to ask afterwards!
41
Improves efficiency
Let’s see some examples.
42
Learner-critic walkthrough for cartpole
43
Iteration 0
Learner
Critic
Learner-critic walkthrough for cartpole
44
Iteration 0, critic’s turn
Learner
Critic
Learner-critic walkthrough for cartpole
45
Iteration 0, learner’s turn
Learner
Critic
Learner-critic walkthrough for cartpole
46
Iteration 1, critic’s turn
Learner
Critic
Learner-critic walkthrough for cartpole
47
Iteration 1, learner’s turn
Learner
Critic
Learner-critic walkthrough for cartpole
48
Iteration 2, critic’s turn
Learner
Critic
Learner-critic walkthrough for cartpole
49
Iteration 2, learner’s turn
Learner
Critic
After a while…
50
Learner-critic walkthrough for cartpole
51
Final iteration
After:
Iteration 0
Observing our learned safe controller in action
52
A harder example now.
53
Balance pendulum on a quadcopter
*See reference 11 for video credits.
54
Balance pendulum on a quadcopter
Q: At which states do you expect the worst saturation?
*See reference 12 for figure credit.
55
Given: nonlinear system with 10D state and 4D limited input
It’s much harder to reason about how to shrink this safe set!
Good news: we don’t have to. Just learn it.
56
Our method far outperforms a non-neural baseline
*See references 6, 10.
57
Method | M1 (%) | M2 (%) |
Baseline (non-neural CBF)* | 78.7 | 49.5-79.5 |
Ours | 99.0 | 97.0-98.9 |
States with infeasible safe input
M1: % boundary states with feasible safe input
M2: % forward invariant rollouts
Why does our method outperform?
58
Pend pitch
Pend pitch vel
Safe set diagram
Our learned safe set
Baseline safe set
Limitations + future work
59
Final recap
60
Acknowledgments
I’m grateful to my advisors, Prof. Dolan and Prof. Liu, as well as the other members of my committee, Prof. Held and Jaskaran Grover.
Also thanks to the members of the Dolan Lab and ICL, especially Qin Lin, Tianhao Wei, Ravi Pandya, and also Prof. Andrea Bajcsy, Kate Shih, Ashwin Khadke, Arpit Agarwal.
61
Questions?
62
References
63