1 of 63

Safe control under input limits with neural control barrier functions

Simin Liu

Dolan Lab, Intelligent Control Lab


2 of 63

Outline

  1. Why control barrier functions (CBFs)?

  2. How do they work and why do input limits make them hard to construct?

  3. How can we leverage tools from machine learning (neural networks, gradient-based training) to tackle this problem?


3 of 63

Why CBFs?


4 of 63

CBFs cover the expansive “set invariance” class of safety problems


[Figure: a safe set within the state space]

Set invariance: keep state inside safe set for all time

5 of 63

CBFs cover the expansive “set invariance” class of safety problems


Safe cobot-human interaction

Bipedal locomotion

Safe trajectory planning for AVs

6 of 63

CBFs can offer provable, flexible safe control

Provable: has mathematical guarantees of safety 

Flexible: acts as “safety layer” on top of any other policy


7 of 63

How do CBFs work?


8 of 63

CBFs are a “danger index”

  • CBFs map each state to a scalar measure of danger


𝜙 = 0: at safe set boundary (𝜙 < 0: inside the safe set)

9 of 63

System provably safe if CBF never becomes positive

  • A safe controller will decrease the CBF if it ever reaches 0



10 of 63

System provably safe if CBF never becomes positive

  • A safe controller will decrease the CBF if it ever reaches 0
  • → Constraint on what inputs a safe controller can provide at boundary states
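The equations stripped from this slide can be reconstructed in their standard form. A hedged sketch, assuming control-affine dynamics $\dot{x} = f(x) + g(x)u$, the sign convention $\phi \le 0$ inside the safe set, and an extended class-$\mathcal{K}$ function $\alpha$:

```latex
\dot{\phi}(x,u) \;=\; \nabla \phi(x)^{\top}\bigl(f(x) + g(x)\,u\bigr) \;\le\; -\alpha\bigl(\phi(x)\bigr)
```

At the boundary ($\phi = 0$) this forces $\dot{\phi} \le 0$, i.e. the controller must decrease the CBF; note the left-hand side is affine in $u$, which is why later slides describe the constraint as affine.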



11 of 63

CBF gives affine “input safety constraint”


Note: constraint also state-dependent

 

 

Example of constraint set in control space

12 of 63

CBF gives affine “input safety constraint”

[Figure: 𝜙 plotted over the state axes, showing the safe set and its boundary (the 0 level set), plus control spaces at different states, each with its set of safe inputs]

13 of 63

“Input safety constraint” can be implemented as top layer in hierarchical controller

  • This safety layer can operate on top of any controller, modifying its inputs minimally to comply with the safety constraint
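With a single affine constraint, the quadratic program in this safety layer has a simple closed form. A minimal sketch (the names `cbf_safety_filter`, `a`, `b` are illustrative; input limits are deliberately ignored here, which is exactly the feasibility gap the rest of the talk addresses):

```python
import numpy as np

def cbf_safety_filter(u_nom, a, b):
    """Minimally modify u_nom to satisfy the affine CBF constraint a @ u <= b.

    The QP  min_u ||u - u_nom||^2  s.t.  a @ u <= b  is a Euclidean
    projection onto a half-space, so it has a closed-form solution.
    (Illustrative sketch; input limits are not enforced.)
    """
    a, u_nom = np.asarray(a, float), np.asarray(u_nom, float)
    violation = a @ u_nom - b
    if violation <= 0:          # nominal input already safe: pass through
        return u_nom
    # Project onto the constraint boundary {u : a @ u = b}
    return u_nom - (violation / (a @ a)) * a

# Nominal input violates the constraint u1 + u2 <= 1, so it gets projected.
u_safe = cbf_safety_filter(u_nom=[1.0, 1.0], a=[1.0, 1.0], b=1.0)
```

Because the objective penalizes distance from the nominal input, the filter leaves a safe nominal input untouched and otherwise moves it the minimum distance onto the constraint boundary, matching the "modifying its inputs minimally" behavior above.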

[Diagram: hierarchical controller in which the state x feeds any nominal controller, whose output passes through the safety layer (a CBF quadratic program)]

14 of 63

CBFs sound great! So, what’s the catch?


15 of 63

If the CBF is constructed in a limit-blind way, we’ll run into issues later

  • Where did the CBF even come from?

  • In the absence of limits, CBF constructed from safety spec using known formula


16 of 63

Example: limit-blind CBF for balancing cartpole


 

 

 

Implicitly defined by a function to keep negative
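The formula itself did not survive extraction. As a hedged illustration (in the style of the safe set algorithm of reference 1, not necessarily the exact formula from the slide), a limit-blind CBF for the spec $|\theta| \le \theta_{\max}$ adds a derivative term so the input appears in $\dot{\phi}$:

```latex
s(x) = \theta^{2} - \theta_{\max}^{2}, \qquad \phi(x) = s(x) + k\,\dot{s}(x), \quad k > 0
```

Keeping $\phi \le 0$ keeps $s \le 0$, but nothing in this construction accounts for how much force the cart can actually apply.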

17 of 63

For limit-blind CBF, sometimes no feasible input satisfies safety constraint…


[Figure: control spaces at different states, showing the set of safe inputs and the input limit set; at some states the sets don’t intersect! (𝜙 over the state axes, safe set, boundary = 0 level set)]

18 of 63

In practice, can only max out limits

The best we can do is max out the limits, violating the safety constraint as little as possible.


 

 

 

[Figure: set of safe inputs, input limit set, and the “as safe as possible” input]

19 of 63

Example: limit-blind CBF for balancing cartpole


20 of 63

Our aim: find limit-friendly CBF

  • It is valid to employ any CBF stricter than the original limit-blind CBF

  • In most cases, there exists a limit-friendly CBF that is stricter
  • Q: Why?


[Figure: a stricter safe set nested inside the original safe set in the state space]

21 of 63

Our aim: find limit-friendly CBF

  • It is valid to employ any CBF stricter than the original limit-blind CBF

  • In most cases, there exists a limit-friendly CBF that is stricter
  • Q: Why? A: It can exclude irrecoverable states
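A toy illustration of an irrecoverable state, using a double integrator braking before a wall rather than the cartpole from the slide (`irrecoverable` and `u_max` are illustrative names):

```python
# A state can sit inside the safe set yet be irrecoverable: no input within
# the limits avoids leaving the set. Toy double integrator p'' = u with
# |u| <= u_max and safe set {p <= 1}. Braking at full u_max from speed v
# takes v**2 / (2 * u_max) meters, so a state with
# p + v**2 / (2 * u_max) > 1 is irrecoverable even though p <= 1.

def irrecoverable(p, v, u_max=1.0):
    if v <= 0:
        return False            # moving away from the wall: recoverable
    return p + v * v / (2.0 * u_max) > 1.0

inside_safe_set = 0.9 <= 1.0                 # p = 0.9 is "safe"...
doomed = irrecoverable(p=0.9, v=1.0)         # ...but cannot stop in time
```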



22 of 63

Example: balance cartpole


Q: Where on the boundary are there irrecoverable states?

 

 

 

Safe set for limit-blind CBF

23 of 63

Example: balance cartpole


Q: Where on the boundary are there irrecoverable states?

 

 

Safe set for limit-blind CBF

 

24 of 63

Example: balance cartpole


But the exact shape of a non-saturating safe set is still hard to guess…

 

 


 

25 of 63

Finding limit-friendly CBF = finding CBF that obeys complex design constraint



26 of 63

Our key ideas:

  • Train generic neural CBF to satisfy design constraint!
  • Pose as min-max optimization
  • Optimize using efficient learner-critic algorithm


[Diagram: the learner passes an updated CBF to the critic; the critic returns examples of saturation]

27 of 63

Unlike previous synthesis methods, ours scales to nonlinear, high-dimensional systems

  • Synthesizing limit-friendly CBF is a hard problem, and the more general the system, the harder it is

  • Previous works consider subclasses of nonlinear systems*

*See references 1-9.


[Figure: nested system classes, with polynomial, Euler-Lagrange, and simple nonlinear systems as subclasses of general nonlinear systems]

28 of 63

Recap

  • CBFs promise provable safety, but they’re hard to construct given input limits
  • Input limits pose a tough constraint on the CBF
  • Our idea: train a neural CBF to satisfy the constraint, using a learner-critic algorithm
  • Our synthesis method is generic, scalable, automatic


29 of 63

Roadmap

  • Posing synthesis as min-max optimization
    • Our choice of loss function
    • Design of parametric (neural) CBF

  • Using learner-critic optimization algorithm


30 of 63

Posing the min-max optimization


31 of 63

Loss function measures “how unsafe” at state x


Design constraint → loss function

[Figure: the set of safe inputs and the input limit set; the minimizer of the loss is the safest input]

32 of 63

Satisfying design constraint is equivalent to min-max over loss


Q: How to interpret this?
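A hedged reconstruction of the stripped formula (assuming $\theta$ denotes the CBF parameters and $L$ the loss from the previous slide):

```latex
\min_{\theta} \; \max_{x} \; L\bigl(x;\, \phi_{\theta}\bigr)
```

One interpretation: the inner max hunts for the worst (most saturated) state, and the outer min reshapes the CBF at that state; if the loss is defined so that non-positive values mean the safety constraint is feasible, the design constraint holds exactly when the min-max value is non-positive.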

33 of 63

Parametrizing the CBF to optimize over


34 of 63

We choose a neural CBF, enabling generic, scalable synthesis

  • Generic:
    • Can express wide range of nonlinear functions
  • Scalable:
    • Can be efficiently trained on large inputs (high-dimensional systems)


35 of 63

We design a neural CBF that is stricter than the limit-blind CBF

Q: how would we modify 𝜙 to shrink its safe set?


[Figure: state space with a stricter safe set]

36 of 63

We design a neural CBF that is stricter than the limit-blind CBF

Q: how would we modify 𝜙 to shrink its safe set?

Add a positive function to 𝜙.
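A minimal sketch of this parametrization (`phi_theta` and the stand-in `net` are illustrative names; any neural network could play the role of `net`):

```python
def phi_theta(x, phi_0, net):
    """Candidate CBF: the limit-blind CBF plus a learned non-negative term.

    Squaring the network output guarantees the added term is >= 0, so
    {x : phi_theta(x) <= 0} is a subset of {x : phi_0(x) <= 0}: the learned
    safe set can only be stricter than the limit-blind one.
    """
    return phi_0(x) + net(x) ** 2

# Toy check in 1D: phi_0(x) = |x| - 1 has safe set [-1, 1]; the added
# term pushes borderline states like x = 0.9 out of the safe set.
phi_0 = lambda x: abs(x) - 1.0
net = lambda x: 0.5 * x            # stand-in "network"
```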



37 of 63

We design a neural CBF that is stricter than the limit-blind CBF



38 of 63

Designing an optimization algorithm


39 of 63

Optimize min-max using learner-critic framework



40 of 63

Learner and critic both use gradient descent


[Diagram: the learner updates the CBF by gradient descent; the critic searches for examples of saturation with PGD]
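The mechanics of the loop can be sketched on a toy min-max problem (purely illustrative: the loss, learning rates, and fixed iteration count are assumptions, and real CBF training would use a neural 𝜙 and batches of counterexamples):

```python
import numpy as np

# Toy min-max  min_theta max_{x in [-1, 1]} L(x, theta), standing in for
# CBF synthesis: x plays the role of a state where the safety constraint
# may be infeasible, theta the CBF parameters. Here L(x, theta) = x^2 - theta.

def loss(x, theta):
    return x ** 2 - theta

def dloss_dx(x, theta):
    return 2.0 * x

def dloss_dtheta(x, theta):
    return -1.0

theta, lr_learner, lr_critic = 0.0, 0.1, 0.3
for _ in range(100):
    # Critic: projected gradient ASCENT on x to find a worst-case state.
    x = np.random.uniform(-1, 1)
    for _ in range(20):
        x = np.clip(x + lr_critic * dloss_dx(x, theta), -1.0, 1.0)
    # Learner: gradient DESCENT on theta at the critic's counterexample.
    theta -= lr_learner * dloss_dtheta(x, theta)

# max_x L = 1 - theta, so descent drives the worst-case loss negative,
# i.e. the toy "design constraint" eventually holds everywhere.
worst_case = loss(1.0, theta)
```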

41 of 63

Techniques not covered:

  • Simple trick to get a differentiable objective
  • Details of critic’s batch optimization and learner’s batch update
  • “Warm-start” technique that boosts critic efficiency
  • Regularization term that encourages a larger safe set

But feel free to ask afterwards!



42 of 63

Let’s see some examples.


43 of 63

Learner-critic walkthrough for cartpole

Iteration 0

44 of 63

Learner-critic walkthrough for cartpole

Iteration 0, critic’s turn

45 of 63

Learner-critic walkthrough for cartpole

Iteration 0, learner’s turn

46 of 63

Learner-critic walkthrough for cartpole

Iteration 1, critic’s turn

47 of 63

Learner-critic walkthrough for cartpole

Iteration 1, learner’s turn

48 of 63

Learner-critic walkthrough for cartpole

Iteration 2, critic’s turn

49 of 63

Learner-critic walkthrough for cartpole

Iteration 2, learner’s turn

50 of 63

After a while…


51 of 63

Learner-critic walkthrough for cartpole


Final iteration

After:

  • ~200 iterations
  • 15 minutes of training
  • 10k cumulative counterexamples

Iteration 0

52 of 63

Observing our learned safe controller in action


53 of 63

A harder example now.


54 of 63

Balance pendulum on a quadcopter

*See reference 11 for video credits.


55 of 63

Balance pendulum on a quadcopter

Q: which states do you expect the worst saturation at?

*See reference 12 for figure credit.


Given: a nonlinear system with 10D state and 4D limited input

 

56 of 63

It’s much harder to reason about how to shrink this safe set! Good news: we don’t have to. Just learn it.


57 of 63

Our method far outperforms a non-neural baseline

*See references 6, 10.

                              M1       M2
Baseline (non-neural CBF)*    78.7     49.5-79.5
Ours                          99.0     97.0-98.9

M1: % boundary states with feasible safe input
M2: % forward-invariant rollouts

58 of 63

Why does our method outperform?

[Figure: safe set diagram over pendulum pitch vs. pendulum pitch velocity, comparing our learned safe set with the baseline safe set]

59 of 63

Limitations + future work

  • Learning required a state transformation first
    • Maybe unnecessary with sinusoidal NN?
  • Assumed known, deterministic dynamics
    • Extend to learning robust non-saturating CBF?


60 of 63

Final recap

  • CBFs are hard to synthesize under input limits
  • Neural CBF representation + efficient training algorithm = generic, scalable, automatic synthesis
  • Addressing this problem makes CBFs more practically useful!


61 of 63

Acknowledgments

I’m grateful to my advisors, Prof. Dolan and Prof. Liu, as well as the other members of my committee, Prof. Held and Jaskaran Grover.

Also thanks to the members of the Dolan Lab and ICL, especially Qin Lin, Tianhao Wei, Ravi Pandya, and also Prof. Andrea Bajcsy, Kate Shih, Ashwin Khadke, Arpit Agarwal.


62 of 63

Questions?


63 of 63

References

  1. C. Liu and M. Tomizuka. Control in a safe set: Addressing safety in human-robot interactions. In Dynamic Systems and Control Conference, volume 46209, page V003T42A003. American Society of Mechanical Engineers, 2014.
  2. W. Zhao, T. He, and C. Liu. Model-free safe control for zero-violation reinforcement learning. In 5th Annual Conference on Robot Learning, 2021.
  3. A. D. Ames, J. W. Grizzle, and P. Tabuada. Control barrier function based quadratic programs with application to adaptive cruise control. In 53rd IEEE Conference on Decision and Control, pages 6271–6278. IEEE, 2014.
  4. Y. Lyu, W. Luo, and J. M. Dolan. Probabilistic safety-assured adaptive merging control for autonomous vehicles. In 2021 IEEE International Conference on Robotics and Automation (ICRA), pages 10764–10770. IEEE, 2021.
  5. W. S. Cortez and D. V. Dimarogonas. Safe-by-design control for euler-lagrange systems. arXiv preprint arXiv:2009.03767, 2020.
  6. T. Wei and C. Liu. Safe control with neural network dynamic models. arXiv preprint arXiv:2110.01110, 2021.
  7. A. Clark. Verification and synthesis of control barrier functions. arXiv preprint arXiv:2104.14001, 2021.
  8. S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin. Hamilton-jacobi reachability: A brief overview and recent advances. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pages 2242–2253. IEEE, 2017.
  9. M. Chen, S. Herbert, and C. J. Tomlin. Fast reachable set approximations via state decoupling disturbances. arXiv preprint arXiv:1603.05205, 2016.
  10. W. Zhao, T. He, and C. Liu. Model-free safe control for zero-violation reinforcement learning. In 5th Annual Conference on Robot Learning, 2021.
  11. M. Hehn and R. D’Andrea. A flying inverted pendulum. In 2011 IEEE International Conference on Robotics and Automation, pages 763–770. IEEE, 2011.
  12. R. Figueroa, A. Faust, P. Cruz, L. Tapia, and R. Fierro. Reinforcement learning for balancing a flying inverted pendulum. In Proceedings of the 11th World Congress on Intelligent Control and Automation, pages 1787–1793. IEEE, 2014.
