Double Counterfactual Regret Minimization for Generating Safety-Critical Scenario of Autonomous Driving
Abstract
1. Introduction
- A novel structure for scenario generation. The structure depends only on the size of the dataset, not on its type, and scales easily: the more data that are available, the more valid scenarios can be generated.
- Data analysis using real traffic trajectories. We extracted real traffic trajectories from the highD dataset for use in the scenario architecture, covering common real-world maneuvers such as long- and short-distance lane changes and straight-line driving at different speeds.
- Inspired by Kuhn poker, we introduce game theory into autonomous driving scenario generation by constructing a virtual agent that mimics the human regret-matching process. This approach generates anthropomorphic (human-like) strategies, which in turn produce the desired scenarios.
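The regret-matching step named in the last contribution can be illustrated with a minimal sketch. It assumes a single decision point with a fixed list of actions (for instance, candidate trajectories); the function name and the example numbers are illustrative and do not come from the paper.

```python
from typing import List


def regret_matching(cumulative_regret: List[float]) -> List[float]:
    """Turn cumulative regrets into a mixed strategy over actions."""
    # Only actions the agent "regrets not having played" get probability mass.
    positive = [max(r, 0.0) for r in cumulative_regret]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    # No action has positive regret: fall back to the uniform strategy.
    n = len(cumulative_regret)
    return [1.0 / n] * n


# Hypothetical regrets for three candidate actions.
print(regret_matching([3.0, -1.0, 1.0]))  # [0.75, 0.0, 0.25]
```

Actions with non-positive regret receive zero probability, so repeated play concentrates on the choices that would have paid off better in hindsight.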
2. Problem Formulation
2.1. Scenario Description
2.2. Model
- the set of players;
- the set of sequences of each player i;
- the set of payoff functions, one for each player i;
- the set of constraints under which each player i can realize an action.
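As a rough illustration, the four components listed above could be held in one structure. This is a sketch under assumptions: the class name, field names, the encoding of a sequence as a tuple of trajectory indices, and the toy payoff and constraint are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: a sequence is a tuple of trajectory indices over time.
ActionSeq = Tuple[int, ...]
Profile = Dict[str, ActionSeq]  # one chosen sequence per player


@dataclass
class ScenarioGame:
    players: List[str]
    sequences: Dict[str, List[ActionSeq]]
    payoffs: Dict[str, Callable[[Profile], float]]
    constraints: Dict[str, Callable[[ActionSeq], bool]]

    def feasible(self, player: str) -> List[ActionSeq]:
        """Sequences of `player` that satisfy that player's constraint."""
        allowed = self.constraints[player]
        return [s for s in self.sequences[player] if allowed(s)]


# Toy zero-sum instance; payoff and constraint are invented for illustration.
def attacker_payoff(profile: Profile) -> float:
    return 1.0 if profile["attacker"] == profile["defender"] else 0.0


game = ScenarioGame(
    players=["attacker", "defender"],
    sequences={"attacker": [(0,), (1,), (2,)], "defender": [(0,), (1,)]},
    payoffs={
        "attacker": attacker_payoff,
        "defender": lambda profile: -attacker_payoff(profile),
    },
    constraints={
        "attacker": lambda seq: seq[0] < 2,  # pretend trajectory 2 is unrealizable
        "defender": lambda seq: True,
    },
)
print(game.feasible("attacker"))  # [(0,), (1,)]
```

Keeping the constraints as predicates separate from the sequence sets makes it easy to filter out actions a player cannot realize before any strategy is computed.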
3. Methodology
3.1. Nash Equilibrium
3.2. Double Counterfactual Regret Minimization
Algorithm 1 D-CFR Algorithm.
Function WalkTree(,,,)
  if then
    return
  end if
  if then
    Sample outcome with probability
    return WalkTree(,,,)
  end if
  ← information set of
  ← 0
  ← match regret of I
  for do
    if then
      ←
      ← WalkTree
      ←
      u ←
    else
      ←
      ← WalkTree
      ←
      u ←
    end if
    if then
      for do
        ←
        ←
      end for
    end if
  end for
  return u
end Function

Function D-CFR
  for do
    for do
      WalkTree
    end for
    for do
      for do
        WalkTree
      end for
    end for
  end for
end Function
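Because the listing above has lost its symbols, the following sketch shows the general shape of a recursive tree walk with regret matching. It is plain (single-agent-loop) counterfactual regret minimization on Kuhn poker, the game that inspired the method, and it is not the authors' D-CFR: the nested loops over agents A and B are not reproduced, chance is handled by enumerating all six deals rather than by sampling, and every name in the code is illustrative.

```python
from itertools import permutations

ACTIONS = "pb"  # p = pass (check/fold), b = bet (bet/call)


class Node:
    """Regret and strategy accumulators for one information set."""

    def __init__(self):
        self.regret_sum = [0.0, 0.0]
        self.strategy_sum = [0.0, 0.0]

    def strategy(self, own_reach):
        # Regret matching: play in proportion to positive cumulative regret.
        positive = [max(r, 0.0) for r in self.regret_sum]
        total = sum(positive)
        strat = [p / total for p in positive] if total > 0 else [0.5, 0.5]
        for a in range(2):
            self.strategy_sum[a] += own_reach * strat[a]
        return strat

    def average_strategy(self):
        total = sum(self.strategy_sum)
        return [s / total for s in self.strategy_sum] if total > 0 else [0.5, 0.5]


nodes = {}  # information set (own card + betting history) -> Node


def cfr(cards, history, reach0, reach1):
    """Expected utility of `history` for the player about to act."""
    player = len(history) % 2
    if len(history) > 1:  # terminal checks
        higher = cards[player] > cards[1 - player]
        if history[-1] == "p":
            # "pp" is a showdown for 1 chip; otherwise the opponent folded.
            return (1.0 if higher else -1.0) if history == "pp" else 1.0
        if history[-2:] == "bb":
            return 2.0 if higher else -2.0  # showdown after bet and call

    node = nodes.setdefault(str(cards[player]) + history, Node())
    strat = node.strategy(reach0 if player == 0 else reach1)
    util = [0.0, 0.0]
    node_util = 0.0
    for a in range(2):
        nxt = history + ACTIONS[a]
        # The child value is from the opponent's view, hence the sign flip.
        if player == 0:
            util[a] = -cfr(cards, nxt, reach0 * strat[a], reach1)
        else:
            util[a] = -cfr(cards, nxt, reach0, reach1 * strat[a])
        node_util += strat[a] * util[a]

    # Counterfactual regret is weighted by the opponent's reach probability.
    opp_reach = reach1 if player == 0 else reach0
    for a in range(2):
        node.regret_sum[a] += opp_reach * (util[a] - node_util)
    return node_util


def train(iterations):
    """Run CFR over all six deals per iteration; return the mean root value."""
    deals = list(permutations([1, 2, 3], 2))
    total = 0.0
    for _ in range(iterations):
        for cards in deals:
            total += cfr(cards, "", 1.0, 1.0)
    return total / (iterations * len(deals))


mean_value = train(2000)
print("mean value for the first player:", round(mean_value, 4))
for info_set in sorted(nodes):
    print(info_set, [round(p, 3) for p in nodes[info_set].average_strategy()])
```

With enough iterations the mean value is expected to approach -1/18, the known value of Kuhn poker for the first player. The average strategies also show the behaviour one would anticipate: the second player folds the lowest card and calls with the highest card when facing a bet.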
4. Testing and Case Study
4.1. Setup
4.2. Result Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Appendix A.1. Risk Field Theory
Appendix A.2. Counterfactual Regret Minimization (CFR)

| Parameter | Description |
|---|---|
| | Set of attackers |
| | Set of defenders |
| | Set of natural trajectories |
| | Set of time series |
| | Number of time series |
| | Number of attackers |
| | Number of defenders |
| | Number of natural trajectories |
| | Types of road for testing |

| Role | Size | Action |
|---|---|---|
| Attacker camp | 1 | The attacking player’s camp |
| Defender camp | 1 | The defending player’s camp |
| Attacker | | Adopt an attack strategy to attack the defender |
| Defender | | Keeping the defender safe |
| Action | | Natural trajectories for deploying attacks |

| Role | Size | Action |
|---|---|---|
| Attacker | | Adopt an attack strategy to attack the defender |
| Defender | | Keeping the defender safe |
| Action | | Natural trajectories for the time series |

| Factor | Value/Type | Description |
|---|---|---|
| | 30 | Frames in the scenario |
| | Compact vehicle | Types of traffic participants |
| | 2 | Number of traffic participants |
| | 40 | Number of natural trajectories |
| | Straight highway | Types of road for testing |
| | 3.94 | Ego vehicle length |
| | 1.92 | Ego vehicle width |
| | 3.94 | Traffic participant length |
| | 1.92 | Traffic participant width |
| | [20, 30] | Number of iterations performed by agent A |
| | [80, 120] | Number of iterations performed by agent B |
| | [−50, 50] | The range of adjustable time series |
| | 2 | Lane where the ego vehicle is located |
| | 1, 3 | Lane where the traffic participants are located |
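For reproducing the setup, the values in the table above could be gathered into a single configuration object. The sketch below copies the values as listed; the class name and field names are hypothetical because the factor symbols are missing from the table, no units are assumed for the vehicle dimensions, and the lane check is an inference from the listed lanes rather than a stated rule.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ScenarioSetup:
    """Values copied from the setup table; field names are hypothetical."""

    frames: int = 30
    participant_type: str = "compact vehicle"
    num_participants: int = 2
    num_natural_trajectories: int = 40
    road_type: str = "straight highway"
    ego_length: float = 3.94
    ego_width: float = 1.92
    participant_length: float = 3.94
    participant_width: float = 1.92
    iterations_a: Tuple[int, int] = (20, 30)
    iterations_b: Tuple[int, int] = (80, 120)
    time_shift_range: Tuple[int, int] = (-50, 50)
    ego_lane: int = 2
    participant_lanes: Tuple[int, ...] = (1, 3)

    def validate(self) -> None:
        """Basic sanity checks on the configured ranges and lanes."""
        for low, high in (self.iterations_a, self.iterations_b, self.time_shift_range):
            if low > high:
                raise ValueError("range bounds must be ordered")
        # Assumption: traffic participants do not start in the ego lane.
        if self.ego_lane in self.participant_lanes:
            raise ValueError("ego lane overlaps a participant lane")


setup = ScenarioSetup()
setup.validate()
print(setup.frames, setup.iterations_a, setup.participant_lanes)
```

A frozen dataclass keeps the setup immutable during a run, which helps when the same configuration is shared by both agents.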

| Iteration (A) | Iteration (B) | Running Time (s) | Best Fitness 2 |
|---|---|---|---|
| 10 | 1 | 183.44 | 7025 |
| | 50 | 1492.88 | 4,228,302 |
| | 100 | 3422.10 | 149,897,307 |
| | 150 | 4905.49 | 150,025,045 |
| | 200 | 5729.71 | 150,055,946 |
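The trade-off in the table above can be made explicit with a little arithmetic. The sketch below assumes that the first value in each row after the first is the Iteration (B) count, with Iteration (A) held at 10 throughout; the numbers are copied from the table and the function name is illustrative.

```python
# (iteration_b, running_time_s, best_fitness), copied from the results table.
rows = [
    (1, 183.44, 7025),
    (50, 1492.88, 4_228_302),
    (100, 3422.10, 149_897_307),
    (150, 4905.49, 150_025_045),
    (200, 5729.71, 150_055_946),
]


def relative_gains(rows):
    """Relative change in running time and best fitness between adjacent rows."""
    gains = []
    for (b0, t0, f0), (b1, t1, f1) in zip(rows, rows[1:]):
        gains.append((b0, b1, (t1 - t0) / t0, (f1 - f0) / f0))
    return gains


for b0, b1, time_gain, fitness_gain in relative_gains(rows):
    print(f"{b0:>3} -> {b1:<3}  time +{time_gain:.1%}  fitness +{fitness_gain:.3%}")
```

Under that reading, best fitness rises by more than a factor of 30 between 50 and 100 iterations, then changes by less than 0.1% per step, while running time keeps growing. This suggests the fitness has largely saturated by about 100 iterations of agent B.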
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Y.; Sun, P.; Shuai, L.; Zhang, D. Double Counterfactual Regret Minimization for Generating Safety-Critical Scenario of Autonomous Driving. Electronics 2024, 13, 4303. https://doi.org/10.3390/electronics13214303