An Efficient Framework for Modular Autonomous Vehicle Risk Assessment (MAVRA)

Robert Moss

Stanford University

(collaborating with Shubh Gupta, Marc R. Schlichting, Kyu-Young Kim, Anthony Corso, Grace X. Gao, and Mykel J. Kochenderfer)

ITSC Workshop on Safety Validation of Connected and Automated Vehicles

October 7, 2022

Motivation: Real-world setting

Image credit: https://www.governing.com/community/vehicles-still-firmly-in-control-of-city-streets

Problem: Estimate risk of autonomous vehicle policies in realistic environments.

Motivation: Realistic simulation

Image credit: https://carla.org/img/posts/2021-11-07/intersection.gif

Problem: Estimate risk of autonomous vehicle policies in high-fidelity simulators.

Motivation: Efficient assessment

Problem: Efficiently estimate risk of autonomous vehicle policies in high-fidelity simulators (faster than real-time simulation).

Motivation

Image credits: https://carlachallenge.org/challenge/nhtsa/

“Too often we see solid tool chains but no tangible test strategies.” Christof Ebert and Michael Weyrich. "Validation of Autonomous Systems." IEEE Software, 2019.

Objective

Develop a strategy for autonomous vehicle risk assessment as a modular framework:

Modular simulated components: Allow for different levels of model fidelity based on availability in simulation

Scenario-based validation: Condition on stressing cases

Efficient algorithms for scenario search and falsification: Plug in different search algorithms based on computational availability

Well-defined cost metric: Estimate the cost of a failure given limitations of the simulator

Unified risk metrics: Calculate risk to compare against other AV policies (agnostic of chosen cost metric)

Modular framework: Risk assessment (MAVRA)

Scenario-based assessment

2D low-fidelity case

Perform scenario-based testing, or “use-case” testing.

Used in government agencies and industry^[1,2,3]

Initial work^[4] demonstrated the framework on a 2D driving simulator: AutomotiveSimulator.jl^[5]

We investigate stressing scenarios inspired by the U.S. NHTSA pre-crash typology^[6]
All scenarios were run exhaustively, which is feasible because computation in 2D is inexpensive

[1] Z. Zhong, et al. “A Survey on Scenario-Based Testing for Automated Driving Systems in High-Fidelity Simulation.” arXiv:2112.00964, 2021.

[2] P. Junietz, et al. "Evaluation of Different Approaches to Address Safety Validation of Automated Driving." IEEE ITSC, 2018.

[3] C. Ebert and M. Weyrich. “Validation of Autonomous Systems”, IEEE Software, 2019.

[4] R. Moss, et al. “Autonomous Vehicle Risk Assessment.” Tech. Report, Stanford Center for AI Safety, 2021.

[5] https://github.com/sisl/AutomotiveSimulator.jl

[6] W. G. Najm, J. D. Smith, and M. Yanagisawa. "Pre-crash Scenario Typology for Crash Avoidance Research." No. DOT-VNTSC-NHTSA-06-02. United States National Highway Traffic Safety Administration, 2007.

Scenario-based assessment

3D high-fidelity case

Modularity in the framework allows for different driving simulators (demonstrated on the open-source CARLA simulator^[1])

We use the scenario runner^[2] to sample scenarios inspired by the NHTSA pre-crash typology^[3], along with weather parameters.
When scaling to high-fidelity 3D simulators, intelligent sampling is required (instead of exhaustive search)

[1] https://carla.org/

[2] https://github.com/carla-simulator/scenario_runner

[3] W. G. Najm, J. D. Smith, and M. Yanagisawa. "Pre-crash Scenario Typology for Crash Avoidance Research." No. DOT-VNTSC-NHTSA-06-02. United States National Highway Traffic Safety Administration, 2007.

Scenario-based assessment

Weather models

Weather parameters can also be sampled as part of the “scenario”

Using models of weather characteristics from different geographical locations
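
As a rough illustration of sampling weather as part of a scenario, the sketch below draws a scenario type together with weather parameters from a per-location model. The location names, parameter names, numeric ranges, and the choice of a triangular distribution are all hypothetical placeholders, not the models used in the talk.

```python
import random

# Hypothetical per-location weather models: each parameter is a
# (low, mode, high) triple for a triangular distribution on a 0-100 scale.
# The locations and numbers are illustrative, not taken from the talk.
WEATHER_MODELS = {
    "coastal": {
        "cloudiness": (20, 60, 100),
        "precipitation": (0, 20, 80),
        "fog_density": (0, 30, 90),
    },
    "desert": {
        "cloudiness": (0, 10, 50),
        "precipitation": (0, 0, 20),
        "fog_density": (0, 0, 10),
    },
}


def sample_weather(location, rng):
    """Draw one weather configuration for the given location."""
    model = WEATHER_MODELS[location]
    return {
        name: rng.triangular(low, high, mode)
        for name, (low, mode, high) in model.items()
    }


def sample_scenario(scenario_ids, locations, rng):
    """Sample a scenario type together with its weather parameters."""
    location = rng.choice(locations)
    return {
        "scenario_id": rng.choice(scenario_ids),
        "location": location,
        "weather": sample_weather(location, rng),
    }


rng = random.Random(0)
scenario = sample_scenario(["scenario_a", "scenario_b"], list(WEATHER_MODELS), rng)
print(scenario)
```

Keeping the weather model separate from the scenario sampler means a different geographical model can be swapped in without touching the rest of the pipeline.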

Scenario-based assessment

AV-specific stressing scenarios (see [1])

[1] F. M. Favarò, et al. “Examining Accident Reports Involving Autonomous Vehicles in California.” PLOS One, 2017.

AV policies

Previous work^[1] used the intelligent driver model (IDM) in a 2D simulated environment
Current work focuses on image-based neural network policies in the high-fidelity simulator CARLA^[2]

We use open-source policies that competed in the CARLA Autonomous Driving Challenge, namely the World-on-Rails^[3] and NEAT^[4] policies.

Video credit: https://github.com/autonomousvision/neat

[1] R. Moss, et al. “Autonomous Vehicle Risk Assessment.” Tech. Report, Stanford Center for AI Safety, 2021.

[2] https://carla.org/

[3] D. Chen, V. Koltun, and P. Krähenbühl. "Learning to Drive from a World on Rails", International Conference on Computer Vision (ICCV), 2021.

[4] K. Chitta, A. Prakash, and A. Geiger. "NEAT: Neural Attention Fields for End-to-End Autonomous Driving", International Conference on Computer Vision (ICCV), 2021.

Observation models

Sensors

The sensor models depend on the choice of AV policy

World-on-Rails^[1] and the NEAT^[2] policies use multiple cameras and GNSS for localization

We apply adversarial noise disturbances to the sensor outputs (e.g., camera saturation and static noise)
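
A minimal sketch of the two camera disturbances named above, assuming a grayscale image stored as nested lists of intensities in [0, 1]. The function names, the gain-and-clip model of saturation, and the Gaussian model of static are assumptions for illustration; the framework's actual disturbance models may differ.

```python
import random


def saturate(image, gain):
    """Simulate camera saturation: scale intensities and clip to [0, 1]."""
    return [[min(1.0, max(0.0, pixel * gain)) for pixel in row] for row in image]


def add_static(image, sigma, rng):
    """Add zero-mean Gaussian static to each pixel and clip to [0, 1]."""
    return [
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in image
    ]


def disturb(image, gain, sigma, rng):
    """Apply saturation followed by static noise."""
    return add_static(saturate(image, gain), sigma, rng)


rng = random.Random(0)
# A tiny 2x3 grayscale "image" with intensities in [0, 1].
image = [[0.1, 0.5, 0.9], [0.2, 0.6, 1.0]]
noisy = disturb(image, gain=1.5, sigma=0.05, rng=rng)
print(noisy)
```

Here `gain` and `sigma` are the disturbance parameters a falsification algorithm would search over.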

[1] D. Chen, V. Koltun, and P. Krähenbühl. "Learning to Drive from a World on Rails", International Conference on Computer Vision (ICCV), 2021.

[2] K. Chitta, A. Prakash, and A. Geiger. "NEAT: Neural Attention Fields for End-to-End Autonomous Driving", International Conference on Computer Vision (ICCV), 2021.

Failure metric

We define failure as a collision with another agent or object.

Failure could be re-defined as lane violations, hard braking, swerving events, etc.

Cost metric

For collision failures

Impact speed: Useful proxy for collision severity^[1]

Strictly non-negative
Often easy to obtain in low- and high-fidelity simulators

delta-V (δv): Industry often uses delta-V

Defined as the difference in speed between the time of collision and the subsequent time step
Shown to be a good measure of crash severity^[1]; used as a proxy for impact force (when recreating crashes)
Potential (unforeseen) issues with negative delta-V (e.g., vehicle speeds into pedestrian)

Collision force: Normal impulse (i.e., change in momentum) of the collision

The value we are trying to approximate using the metrics above
Recommended to use if available in simulation (often requires more sophisticated simulated physics engines)

Parameterized cost model: Combines multiple inputs

Could include severity of impact, involved participants (other vehicles or pedestrians), environment surroundings (urban vs. rural).
Hierarchical cost that outputs a combined measure based on fatality/fatality likelihood, bodily injury expenses, and property damage.

[1] D. C. Richards, "Relationship between Speed and Risk of Fatal Injury: Pedestrians and Car Occupants", Department for Transport: London, 2010.

Objective: cost of zero should equal “no failure” using what’s available in simulation
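
The first two cost metrics can be sketched from a logged speed trace. This assumes a list of ego speeds (m/s) per time step and the index of the collision step (`None` when there is no collision); both names are hypothetical. The sign convention for delta-V (speed at collision minus speed one step later) is one reading of the definition above.

```python
def impact_speed(speeds, collision_step):
    """Speed (m/s) of the ego vehicle at the collision step; 0.0 if no collision."""
    if collision_step is None:
        return 0.0
    return abs(speeds[collision_step])


def delta_v(speeds, collision_step):
    """Speed at the collision step minus the speed one step later; 0.0 if no collision.

    The value is negative when the vehicle is faster after the collision,
    which is the problematic case noted for this metric.
    """
    if collision_step is None:
        return 0.0
    if collision_step + 1 >= len(speeds):
        raise ValueError("delta-V needs a time step after the collision")
    return speeds[collision_step] - speeds[collision_step + 1]


# Illustrative trace: the vehicle collides at step 2 and slows sharply afterwards.
speeds = [10.0, 9.0, 8.0, 2.0, 0.0]
print(impact_speed(speeds, 2), delta_v(speeds, 2))
print(impact_speed(speeds, None), delta_v(speeds, None))
```

Both functions return exactly zero when no collision occurred, in line with the stated objective that a cost of zero means no failure.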

Risk metric

Adopting risk metrics used in the financial industry and robotics community^[1]
Using a distribution of costs Z (where z > 0 indicates “failure”)

Mean cost:

Value at risk (VaR):

Conditional value at risk (CVaR):

Worst case cost:
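
For reference, the standard textbook definitions of these four metrics are sketched below for a cost random variable $Z$ and a risk level $\alpha \in (0, 1)$. The notation is an assumption and may differ from the talk's; the conditional-expectation form of CVaR holds for continuous cost distributions.

```latex
\begin{align*}
\text{Mean cost:} \quad & \mathbb{E}[Z] \\
\text{Value at risk:} \quad & \operatorname{VaR}_\alpha(Z) = \min\{\, z \mid P(Z \le z) \ge \alpha \,\} \\
\text{Conditional value at risk:} \quad & \operatorname{CVaR}_\alpha(Z) = \mathbb{E}\left[\, Z \mid Z \ge \operatorname{VaR}_\alpha(Z) \,\right] \\
\text{Worst case cost:} \quad & \max Z
\end{align*}
```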

Using conditional value at risk (CVaR) provides a well-defined risk metric to compute relative risk between AV policies
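
A minimal sketch of how the four risk metrics might be estimated from sampled costs, assuming equally weighted samples and the empirical-quantile convention shown in the comments (other conventions exist). The function name and the example costs are illustrative only.

```python
import math


def risk_metrics(costs, alpha):
    """Empirical mean, VaR, CVaR, and worst case of a list of sampled costs."""
    if not costs or not 0.0 < alpha < 1.0:
        raise ValueError("need at least one cost and alpha in (0, 1)")
    ordered = sorted(costs)
    n = len(ordered)
    # Empirical VaR: smallest sample z with at least a fraction alpha of samples <= z.
    var = ordered[math.ceil(alpha * n) - 1]
    # Empirical CVaR: mean of the samples at or above the VaR.
    tail = [z for z in ordered if z >= var]
    return {
        "mean": sum(ordered) / n,
        "var": var,
        "cvar": sum(tail) / len(tail),
        "worst": ordered[-1],
    }


# Illustrative costs: five non-failures (cost 0) and three failures.
costs = [0.0, 0.0, 0.0, 0.0, 0.0, 2.0, 4.0, 6.0]
print(risk_metrics(costs, alpha=0.75))
```

Computing the same metrics for two policies under the same scenario distribution gives a relative comparison that does not depend on which cost metric was chosen.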

[1] A. Majumdar and M. Pavone. "How should a robot assess risk? Towards an axiomatic theory of risk in robotics." Robotics Research, 2020.

Validation Search

Scenario-level search / sampling

With a fast simulator and/or enough parallel compute:

Perform exhaustive or Monte Carlo search over scenarios

With limited compute and expensive simulators:

A more intelligent scenario sampling scheme is required

We estimate risk using tree-based importance sampling

Using ongoing work from the Stanford NAV Lab^[1]

Idea: focus on important areas that impact the CVaR estimate
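
To illustrate the general idea of concentrating samples where they affect the tail, the sketch below uses plain importance sampling over a discrete set of scenarios with a self-normalized weighted CVaR estimate. It is not the tree-based method cited above; the scenario names, probabilities, and stand-in cost function are hypothetical.

```python
import random


def weighted_cvar(costs, weights, alpha):
    """Self-normalized weighted estimate of CVaR at level alpha."""
    total = sum(weights)
    pairs = sorted(zip(costs, weights))
    # Weighted VaR: smallest cost whose cumulative normalized weight reaches alpha.
    cumulative = 0.0
    var = pairs[-1][0]
    for cost, weight in pairs:
        cumulative += weight / total
        if cumulative >= alpha:
            var = cost
            break
    tail = [(c, w) for c, w in pairs if c >= var]
    return sum(c * w for c, w in tail) / sum(w for _, w in tail)


def importance_sample(nominal, proposal, cost_fn, n, rng):
    """Sample scenarios from the proposal and weight each by nominal / proposal."""
    scenarios = list(nominal)
    draws = rng.choices(scenarios, weights=[proposal[s] for s in scenarios], k=n)
    costs = [cost_fn(s, rng) for s in draws]
    weights = [nominal[s] / proposal[s] for s in draws]
    return costs, weights


def toy_cost(scenario, rng):
    """Stand-in for an expensive simulation: only stressing scenarios incur cost."""
    return rng.uniform(1.0, 10.0) if scenario == "stressing" else 0.0


nominal = {"benign": 0.95, "stressing": 0.05}   # how often scenarios occur
proposal = {"benign": 0.5, "stressing": 0.5}    # oversample the stressing case
rng = random.Random(0)
costs, weights = importance_sample(nominal, proposal, toy_cost, n=1000, rng=rng)
print(weighted_cvar(costs, weights, alpha=0.95))
```

The weights correct for the oversampling, so rare stressing scenarios get many simulation runs without being over-counted in the risk estimate.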

[1] S. Gupta, et al. "Tree-based importance sampling for risk estimation." Work-in-Progress, 2022.

Validation Search

Step-level search (episode)

Given a scenario, apply sensor noise disturbances at each time step when running the scenario.
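
The step-level loop can be sketched with a toy one-dimensional episode: at every time step a disturbance is sampled and added to the observation before the policy acts. The vehicle dynamics, the braking policy, and all numeric values are hypothetical stand-ins for a real simulator and AV policy.

```python
import random


def rollout(noise_sigma, rng, steps=50, dt=0.1):
    """Run one episode of a toy 1D scenario with per-step observation noise.

    Returns the impact speed (0.0 if no collision) and the disturbances applied.
    """
    gap, speed = 30.0, 10.0  # distance to a stopped obstacle (m), ego speed (m/s)
    disturbances = []
    for _ in range(steps):
        noise = rng.gauss(0.0, noise_sigma)  # disturbance applied at this time step
        disturbances.append(noise)
        observed_gap = gap + noise  # the policy only sees the noisy observation
        # Toy policy: brake hard when the observed gap looks small.
        accel = -8.0 if observed_gap < 12.0 else 0.0
        speed = max(0.0, speed + accel * dt)
        gap -= speed * dt
        if gap <= 0.0:
            return speed, disturbances  # collision: cost is the impact speed
    return 0.0, disturbances


rng = random.Random(0)
costs = [rollout(noise_sigma=5.0, rng=rng)[0] for _ in range(200)]
failures = sum(1 for c in costs if c > 0.0)
print(f"{failures} collisions in {len(costs)} episodes")
```

Returning the disturbance sequence along with the cost lets a search algorithm replay or refine the episodes that led to failure.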

Existing algorithms can be used for the step-level falsification (i.e., adversarial sensor noise disturbances):

Monte Carlo noise
Adaptive stress testing^[1]
Cross-entropy method^[2]
Rapidly-exploring random trees (RRTs)^[3]
(see other black-box falsification methods in the survey by A. Corso, et al.^[4])
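
As one example from the list above, a basic cross-entropy method can be sketched as follows. It searches over a fixed-length disturbance vector with an independent Gaussian per dimension; the function name, hyperparameters, and the toy cost function (a stand-in for a simulation rollout) are assumptions made for illustration.

```python
import random
import statistics


def cross_entropy_search(cost_fn, dim, rng, iterations=20, population=50,
                         elite_frac=0.2, init_std=2.0):
    """Search for a disturbance vector with high cost using the cross-entropy method."""
    mean = [0.0] * dim
    std = [init_std] * dim
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        samples = [
            [rng.gauss(m, s) for m, s in zip(mean, std)] for _ in range(population)
        ]
        # Keep the highest-cost samples as the elite set.
        elites = sorted(samples, key=cost_fn, reverse=True)[:n_elite]
        # Refit the sampling distribution to the elites.
        columns = list(zip(*elites))
        mean = [statistics.fmean(col) for col in columns]
        std = [statistics.pstdev(col) + 1e-6 for col in columns]
    return mean


# Toy cost that peaks at a known disturbance, so the result is easy to check.
target = [1.0, -0.5]


def toy_cost(x):
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, target))


best = cross_entropy_search(toy_cost, dim=2, rng=random.Random(0))
print(best)
```

In a validation setting, `cost_fn` would run the scenario with the candidate disturbances and return the resulting cost, which makes each evaluation as expensive as one simulation.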

[1] R. Lee, et al. "Adaptive Stress Testing: Finding Likely Failure Events with Reinforcement Learning," Journal of Artificial Intelligence Research, 2020.

[2] R. Y. Rubinstein and D. P. Kroese. "The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte Carlo Simulation and Machine Learning." Springer Science and Business Media, 2013.

[3] S. M. Lavalle. "Rapidly-Exploring Random Trees: A New Tool for Path Planning." Iowa State University, Tech. Report, 1998.

[4] A. Corso, et al. “A Survey of Algorithms for Black-Box Safety Validation of Cyber-Physical Systems”, Journal of Artificial Intelligence Research, 2021.

Preliminary results

Tree-IS in 2D case

In the 2D driving simulator, S. Gupta et al.^[1] showed that tree-based importance sampling converges (among other examples)

[1] S. Gupta, et al. "Tree-based importance sampling for risk estimation." Work-in-Progress, 2022.

Preliminary results

Tree-IS in 3D CARLA case

In the 3D driving simulator, tree-IS for scenario selection shows a small benefit in risk metric convergence

(Note: showing NEAT agent when using Monte Carlo scenario selection vs. tree importance sampling)

Preliminary results

World-on-Rails vs. NEAT (relative risk)

Now comparing risk given different AV policies (World-on-Rails and NEAT)

Example cases

CARLA

Non-failure

Failure

Recap and conclusions

Developed a unified strategy for autonomous vehicle risk assessment and validation (MAVRA)

Modular by design to adapt to in-simulation restrictions

Proposed the use of risk metrics from the financial and robotics communities

Use of conditional value at risk (CVaR) to estimate relative tail-risk of AV policies

Plug-and-play intelligent scenario search and step-level falsification methods

Can research and develop better scenario search and falsification methods in low-fidelity simulation, then scale up to high-fidelity
Could be used as feedback for training AV policies (minimize risk)

Reminder that this approach is one aspect of autonomous vehicle safety assessment (see Transferring Aviation Safety Lessons to the Road^[1])
Framework is open-sourced at https://github.com/sisl/AutonomousRiskFramework.jl
Submitting journal article to IEEE Transactions on Intelligent Transportation Systems

[1] M. J. Kochenderfer and R. J. Moss, “Transferring Aviation Safety Lessons to the Road.” Automated Road Transportation Symposium, 2021. [http://web.stanford.edu/~mossr/pdf/kochenderfer-arts21.pdf]

Thank you!

Questions?

mossr@cs.stanford.edu

Robert Moss