1 of 63

Safe control under input limits with neural control barrier functions

Simin Liu

Dolan Lab, Intelligent Control Lab


2 of 63

Outline

  1. Why control barrier functions (CBFs)?

  2. How do they work and why do input limits make them hard to construct?

  3. How can we leverage tools from machine learning (neural networks, gradient-based training) to tackle this problem?


3 of 63

Why CBFs?


4 of 63

CBFs cover the expansive “set invariance” class of safety problems


[Figure: a safe set within the state space]

Set invariance: keep state inside safe set for all time

5 of 63

CBFs cover the expansive “set invariance” class of safety problems


Safe cobot-human interaction

Bipedal locomotion

Safe trajectory planning for AVs

6 of 63

CBFs can offer provable, flexible safe control

Provable: has mathematical guarantees of safety 

Flexible: acts as “safety layer” on top of any other policy


7 of 63

How do CBFs work?


8 of 63

CBFs are a “danger index”

  • CBFs map each state to a scalar measure of danger


𝜙 = 0: at safe set boundary (𝜙 < 0: inside the safe set)

9 of 63

System provably safe if CBF never becomes positive

  • A safe controller will decrease the CBF if it ever reaches 0



10 of 63

System provably safe if CBF never becomes positive

  • A safe controller will decrease the CBF if it ever reaches 0
  • → Constraint on what inputs a safe controller can provide at boundary states
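The equations stripped from this slide can be reconstructed in their standard form. A hedged sketch, assuming control-affine dynamics $\dot{x} = f(x) + g(x)u$, the sign convention $\phi \le 0$ inside the safe set, and an extended class-$\mathcal{K}$ function $\alpha$:

```latex
\dot{\phi}(x,u) \;=\; \nabla \phi(x)^{\top}\bigl(f(x) + g(x)\,u\bigr) \;\le\; -\alpha\bigl(\phi(x)\bigr)
```

At the boundary ($\phi = 0$) this forces $\dot{\phi} \le 0$, i.e. the controller must decrease the CBF; note the left-hand side is affine in $u$, which is why later slides describe the constraint as affine.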



11 of 63

CBF gives affine “input safety constraint”


Note: constraint also state-dependent

 

 

Example of constraint set in control space

12 of 63

CBF gives affine “input safety constraint”

[Figure: 𝜙 plotted over the state axes, showing the safe set and its boundary (the 0 level set), plus control spaces at different states, each with its set of safe inputs]

13 of 63

“Input safety constraint” can be implemented as top layer in hierarchical controller

  • This safety layer can operate on top of any controller, modifying its inputs minimally to comply with the safety constraint
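With a single affine constraint, the quadratic program in this safety layer has a simple closed form. A minimal sketch (the names `cbf_safety_filter`, `a`, `b` are illustrative; input limits are deliberately ignored here, which is exactly the feasibility gap the rest of the talk addresses):

```python
import numpy as np

def cbf_safety_filter(u_nom, a, b):
    """Minimally modify u_nom to satisfy the affine CBF constraint a @ u <= b.

    The QP  min_u ||u - u_nom||^2  s.t.  a @ u <= b  is a Euclidean
    projection onto a half-space, so it has a closed-form solution.
    (Illustrative sketch; input limits are not enforced.)
    """
    a, u_nom = np.asarray(a, float), np.asarray(u_nom, float)
    violation = a @ u_nom - b
    if violation <= 0:          # nominal input already safe: pass through
        return u_nom
    # Project onto the constraint boundary {u : a @ u = b}
    return u_nom - (violation / (a @ a)) * a

# Nominal input violates the constraint u1 + u2 <= 1, so it gets projected.
u_safe = cbf_safety_filter(u_nom=[1.0, 1.0], a=[1.0, 1.0], b=1.0)
```

Because the objective penalizes distance from the nominal input, the filter leaves a safe nominal input untouched and otherwise moves it the minimum distance onto the constraint boundary, matching the "modifying its inputs minimally" behavior above.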

[Diagram: hierarchical controller in which the state x feeds any nominal controller, whose output passes through the safety layer (a CBF quadratic program)]

14 of 63

CBFs sound great! So, what’s the catch?


15 of 63

If the CBF is constructed in a limit-blind way, we’ll run into issues later

  • Where did the CBF even come from?

  • In the absence of limits, CBF constructed from safety spec using known formula


16 of 63

Example: limit-blind CBF for balancing cartpole


 

 

 

Implicitly defined by a function to keep negative
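The formula itself did not survive extraction. As a hedged illustration (in the style of the safe set algorithm of reference 1, not necessarily the exact formula from the slide), a limit-blind CBF for the spec $|\theta| \le \theta_{\max}$ adds a derivative term so the input appears in $\dot{\phi}$:

```latex
s(x) = \theta^{2} - \theta_{\max}^{2}, \qquad \phi(x) = s(x) + k\,\dot{s}(x), \quad k > 0
```

Keeping $\phi \le 0$ keeps $s \le 0$, but nothing in this construction accounts for how much force the cart can actually apply.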

17 of 63

For limit-blind CBF, sometimes no feasible input satisfies safety constraint…


[Figure: control spaces at different states, showing the set of safe inputs and the input limit set; at some states the sets don’t intersect! (𝜙 over the state axes, safe set, boundary = 0 level set)]

18 of 63

In practice, can only max out limits

The best we can do is max out the limits, violating the safety constraint as little as possible.


 

 

 

[Figure: set of safe inputs, input limit set, and the “as safe as possible” input]

19 of 63

Example: limit-blind CBF for balancing cartpole


20 of 63

Our aim: find limit-friendly CBF

  • It is valid to employ any CBF stricter than the original limit-blind CBF

  • In most cases, there exists a limit-friendly CBF that is stricter
  • Q: Why?


[Figure: a stricter safe set nested inside the original safe set in the state space]

21 of 63

Our aim: find limit-friendly CBF

  • It is valid to employ any CBF stricter than the original limit-blind CBF

  • In most cases, there exists a limit-friendly CBF that is stricter
  • Q: Why? A: It can exclude irrecoverable states
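A toy illustration of an irrecoverable state, using a double integrator braking before a wall rather than the cartpole from the slide (`irrecoverable` and `u_max` are illustrative names):

```python
# A state can sit inside the safe set yet be irrecoverable: no input within
# the limits avoids leaving the set. Toy double integrator p'' = u with
# |u| <= u_max and safe set {p <= 1}. Braking at full u_max from speed v
# takes v**2 / (2 * u_max) meters, so a state with
# p + v**2 / (2 * u_max) > 1 is irrecoverable even though p <= 1.

def irrecoverable(p, v, u_max=1.0):
    if v <= 0:
        return False            # moving away from the wall: recoverable
    return p + v * v / (2.0 * u_max) > 1.0

inside_safe_set = 0.9 <= 1.0                 # p = 0.9 is "safe"...
doomed = irrecoverable(p=0.9, v=1.0)         # ...but cannot stop in time
```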



22 of 63

Example: balance cartpole


Q: Where on the boundary are there irrecoverable states?

 

 

 

Safe set for limit-blind CBF

23 of 63

Example: balance cartpole


Q: Where on the boundary are there irrecoverable states?

 

 

Safe set for limit-blind CBF

 

24 of 63

Example: balance cartpole


But the exact shape of a non-saturating safe set is still hard to guess…

 

 


 

25 of 63

Finding limit-friendly CBF = finding CBF that obeys complex design constraint



26 of 63

Our key ideas:

  • Train generic neural CBF to satisfy design constraint!
  • Pose as min-max optimization
  • Optimize using efficient learner-critic algorithm


[Diagram: the learner passes an updated CBF to the critic; the critic returns examples of saturation]

27 of 63

Unlike previous synthesis methods, ours scales to nonlinear, high-dimensional systems

  • Synthesizing limit-friendly CBF is a hard problem, and the more general the system, the harder it is

  • Previous works consider subclasses of nonlinear systems*

*See references 1-9.


[Figure: nested system classes, with polynomial, Euler-Lagrange, and simple nonlinear systems as subclasses of general nonlinear systems]

28 of 63

Recap

  • CBFs promise provable safety, but they’re hard to construct given input limits
  • Input limits pose a tough constraint on the CBF
  • Our idea: train a neural CBF to satisfy the constraint, using a learner-critic algorithm
  • Our synthesis method is generic, scalable, automatic


29 of 63

Roadmap

  • Posing synthesis as min-max optimization
    • Our choice of loss function
    • Design of parametric (neural) CBF

  • Using learner-critic optimization algorithm


30 of 63

Posing the min-max optimization


31 of 63

Loss function measures “how unsafe” at state x


Design constraint → loss function

[Figure: the set of safe inputs and the input limit set; the minimizer of the loss is the safest input]

32 of 63

Satisfying design constraint is equivalent to min-max over loss


Q: How to interpret this?
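A hedged reconstruction of the stripped formula (assuming $\theta$ denotes the CBF parameters and $L$ the loss from the previous slide):

```latex
\min_{\theta} \; \max_{x} \; L\bigl(x;\, \phi_{\theta}\bigr)
```

One interpretation: the inner max hunts for the worst (most saturated) state, and the outer min reshapes the CBF at that state; if the loss is defined so that non-positive values mean the safety constraint is feasible, the design constraint holds exactly when the min-max value is non-positive.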

33 of 63

Parametrizing the CBF to optimize over


34 of 63

We choose a neural CBF, enabling generic, scalable synthesis

  • Generic:
    • Can express wide range of nonlinear functions
  • Scalable:
    • Can be efficiently trained on large inputs (high-dimensional systems)


35 of 63

We design a neural CBF that is stricter than the limit-blind CBF

Q: how would we modify 𝜙 to shrink its safe set?


[Figure: state space with a stricter safe set]

36 of 63

We design a neural CBF that is stricter than the limit-blind CBF

Q: how would we modify 𝜙 to shrink its safe set?

Add a positive function to 𝜙.
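A minimal sketch of this parametrization (`phi_theta` and the stand-in `net` are illustrative names; any neural network could play the role of `net`):

```python
def phi_theta(x, phi_0, net):
    """Candidate CBF: the limit-blind CBF plus a learned non-negative term.

    Squaring the network output guarantees the added term is >= 0, so
    {x : phi_theta(x) <= 0} is a subset of {x : phi_0(x) <= 0}: the learned
    safe set can only be stricter than the limit-blind one.
    """
    return phi_0(x) + net(x) ** 2

# Toy check in 1D: phi_0(x) = |x| - 1 has safe set [-1, 1]; the added
# term pushes borderline states like x = 0.9 out of the safe set.
phi_0 = lambda x: abs(x) - 1.0
net = lambda x: 0.5 * x            # stand-in "network"
```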



37 of 63

We design a neural CBF that is stricter than the limit-blind CBF



38 of 63

Designing an optimization algorithm


39 of 63

Optimize min-max using learner-critic framework



40 of 63

Learner and critic both use gradient descent


[Diagram: the learner updates the CBF by gradient descent; the critic searches for examples of saturation with PGD]
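The mechanics of the loop can be sketched on a toy min-max problem (purely illustrative: the loss, learning rates, and fixed iteration count are assumptions, and real CBF training would use a neural 𝜙 and batches of counterexamples):

```python
import numpy as np

# Toy min-max  min_theta max_{x in [-1, 1]} L(x, theta), standing in for
# CBF synthesis: x plays the role of a state where the safety constraint
# may be infeasible, theta the CBF parameters. Here L(x, theta) = x^2 - theta.

def loss(x, theta):
    return x ** 2 - theta

def dloss_dx(x, theta):
    return 2.0 * x

def dloss_dtheta(x, theta):
    return -1.0

theta, lr_learner, lr_critic = 0.0, 0.1, 0.3
for _ in range(100):
    # Critic: projected gradient ASCENT on x to find a worst-case state.
    x = np.random.uniform(-1, 1)
    for _ in range(20):
        x = np.clip(x + lr_critic * dloss_dx(x, theta), -1.0, 1.0)
    # Learner: gradient DESCENT on theta at the critic's counterexample.
    theta -= lr_learner * dloss_dtheta(x, theta)

# max_x L = 1 - theta, so descent drives the worst-case loss negative,
# i.e. the toy "design constraint" eventually holds everywhere.
worst_case = loss(1.0, theta)
```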

41 of 63

Techniques not covered:

  • Simple trick to get a differentiable objective
  • Details of critic’s batch optimization and learner’s batch update
  • “Warm-start” technique that boosts critic efficiency
  • Regularization term that encourages a larger safe set

But feel free to ask afterwards!



42 of 63

Let’s see some examples.


43 of 63

Learner-critic walkthrough for cartpole

Iteration 0

44 of 63

Learner-critic walkthrough for cartpole

Iteration 0, critic’s turn

45 of 63

Learner-critic walkthrough for cartpole

Iteration 0, learner’s turn

46 of 63

Learner-critic walkthrough for cartpole

Iteration 1, critic’s turn

47 of 63

Learner-critic walkthrough for cartpole

Iteration 1, learner’s turn

48 of 63

Learner-critic walkthrough for cartpole

Iteration 2, critic’s turn

49 of 63

Learner-critic walkthrough for cartpole

Iteration 2, learner’s turn

50 of 63

After a while…


51 of 63

Learner-critic walkthrough for cartpole


Final iteration

After:

  • ~200 iterations
  • 15 minutes of training
  • 10k cumulative counterexamples

Iteration 0

52 of 63

Observing our learned safe controller in action


53 of 63

A harder example now.


54 of 63

Balance pendulum on a quadcopter

*See reference 11 for video credits.


55 of 63

Balance pendulum on a quadcopter

Q: which states do you expect the worst saturation at?

*See reference 12 for figure credit.


Given: a nonlinear system with 10D state and 4D limited input

 

56 of 63

It’s much harder to reason about how to shrink this safe set! Good news: we don’t have to. Just learn it.


57 of 63

Our method far outperforms a non-neural baseline

*See references 6, 10.

                              M1       M2
Baseline (non-neural CBF)*    78.7     49.5-79.5
Ours                          99.0     97.0-98.9

M1: % boundary states with feasible safe input
M2: % forward-invariant rollouts

58 of 63

Why does our method outperform?

[Figure: safe set diagram over pendulum pitch vs. pendulum pitch velocity, comparing our learned safe set with the baseline safe set]

59 of 63

Limitations + future work

  • Learning required a state transformation first
    • Maybe unnecessary with sinusoidal NN?
  • Assumed known, deterministic dynamics
    • Extend to learning robust non-saturating CBF?


60 of 63

Final recap

  • CBFs are hard to synthesize under input limits
  • Neural CBF representation + efficient training algorithm = generic, scalable, automatic synthesis
  • Addressing this problem makes CBFs more practically useful!


61 of 63

Acknowledgments

I’m grateful to my advisors, Prof. Dolan and Prof. Liu, as well as the other members of my committee, Prof. Held and Jaskaran Grover.

Also thanks to the members of the Dolan Lab and ICL, especially Qin Lin, Tianhao Wei, Ravi Pandya, and also Prof. Andrea Bajcsy, Kate Shih, Ashwin Khadke, Arpit Agarwal.


62 of 63

Questions?


63 of 63

References

  1. C. Liu and M. Tomizuka. Control in a safe set: Addressing safety in human-robot interactions. In Dynamic Systems and Control Conference, volume 46209, page V003T42A003. American Society of Mechanical Engineers, 2014.
  2. W. Zhao, T. He, and C. Liu. Model-free safe control for zero-violation reinforcement learning. In 5th Annual Conference on Robot Learning, 2021.
  3. A. D. Ames, J. W. Grizzle, and P. Tabuada. Control barrier function based quadratic programs with application to adaptive cruise control. In 53rd IEEE Conference on Decision and Control, pages 6271–6278. IEEE, 2014.
  4. Y. Lyu, W. Luo, and J. M. Dolan. Probabilistic safety-assured adaptive merging control for autonomous vehicles. In 2021 IEEE International Conference on Robotics and Automation (ICRA), pages 10764–10770. IEEE, 2021.
  5. W. S. Cortez and D. V. Dimarogonas. Safe-by-design control for euler-lagrange systems. arXiv preprint arXiv:2009.03767, 2020.
  6. T. Wei and C. Liu. Safe control with neural network dynamic models. arXiv preprint arXiv:2110.01110, 2021.
  7. A. Clark. Verification and synthesis of control barrier functions. arXiv preprint arXiv:2104.14001, 2021.
  8. S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin. Hamilton-jacobi reachability: A brief overview and recent advances. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pages 2242–2253. IEEE, 2017.
  9. M. Chen, S. Herbert, and C. J. Tomlin. Fast reachable set approximations via state decoupling disturbances. arXiv preprint arXiv:1603.05205, 2016.
  10. W. Zhao, T. He, and C. Liu. Model-free safe control for zero-violation reinforcement learning. In 5th Annual Conference on Robot Learning, 2021.
  11. M. Hehn and R. D’Andrea. A flying inverted pendulum. In 2011 IEEE International Conference on Robotics and Automation, pages 763–770. IEEE, 2011.
  12. R. Figueroa, A. Faust, P. Cruz, L. Tapia, and R. Fierro. Reinforcement learning for balancing a flying inverted pendulum. In Proceedings of the 11th World Congress on Intelligent Control and Automation, pages 1787–1793. IEEE, 2014.
