Deep Reinforcement Learning for Traffic Light Timing Optimization
Abstract
1. Introduction
2. Related Work
2.1. Traditional Methods
2.2. RL-Based Methods
3. Method
3.1. Problem Statement
3.2. Agent Design
3.2.1. Intersection State
3.2.2. Agent Actions
3.2.3. Reward
3.3. Effective-Pressure with Double Dueling Deep Q-Network for Traffic Light Control
Algorithm 1 EP-D3QN for traffic light control.
Input: intersections' state o
Output: action a
Initialize: the parameters of the main network and the target network, the discount factor, the target-network update rate, the replay buffer D, and the thresholds
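The double/dueling Q-learning update at the heart of Algorithm 1 can be illustrated with a small tabular toy. This is a sketch only: the environment, state/action sizes, learning rate, and update schedule below are illustrative assumptions, not the paper's EP-D3QN settings (which use neural networks over effective-pressure states). What it does show is the double-Q target rule used by such agents: the main network selects the next action, while the target network evaluates it.

```python
import random
from collections import deque

import numpy as np

# Illustrative sizes and hyperparameters (not the paper's settings).
N_STATES, N_ACTIONS = 5, 2
GAMMA, ALPHA = 0.9, 0.1
TARGET_UPDATE_EVERY = 20

q_main = np.zeros((N_STATES, N_ACTIONS))
q_target = q_main.copy()
replay = deque(maxlen=1000)

def double_q_target(reward, next_state, done):
    # Double Q-learning: the main network picks the greedy action,
    # the (periodically synced) target network evaluates it.
    if done:
        return reward
    best_action = int(np.argmax(q_main[next_state]))
    return reward + GAMMA * q_target[next_state, best_action]

def train_step(batch_size=8):
    batch = random.sample(replay, min(batch_size, len(replay)))
    for s, a, r, s2, done in batch:
        td_target = double_q_target(r, s2, done)
        q_main[s, a] += ALPHA * (td_target - q_main[s, a])

# Toy chain environment: action 1 moves right toward state 4,
# which yields a reward of 1 and ends the episode.
random.seed(0)
for step in range(2000):
    s = random.randrange(N_STATES - 1)
    a = random.randrange(N_ACTIONS)
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    done = s2 == N_STATES - 1
    r = 1.0 if done else 0.0
    replay.append((s, a, r, s2, done))
    train_step()
    if step % TARGET_UPDATE_EVERY == 0:
        q_target[:] = q_main  # periodic hard update of the target network

# The learned policy should prefer action 1 (move right) in states 0-3.
print(np.argmax(q_main, axis=1)[:4])
```

In EP-D3QN the tabular lookup would be replaced by a dueling network (separate value and advantage streams) over the effective-pressure state, and uniform sampling by prioritized experience replay; the target computation is unchanged.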
4. Experiment and Analysis
4.1. Experimental Setup
4.2. Evaluation Metrics
4.3. Compared Methods
- Fixed-time Traffic Light Control (FT). FT [2] is the most widely used traffic light control method. Each intersection sets a fixed sequence of signal phases, and the duration of traffic lights is also fixed.
- Self-organizing traffic lights (SOTL). In SOTL [5], whether the traffic light phase is switched depends on the observed traffic conditions and rules defined in advance. Compared with FT, SOTL is much more flexible.
- MaxPressure (MP). MP [6] introduces the concept of pressure, defined as the difference between the number of vehicles on a phase's incoming lanes and the number of vehicles on its outgoing lanes. At each time step, the controller computes the pressure of every phase and activates the phase with the maximum pressure.
- Double dueling Deep Q-network (3DQN). 3DQN [11] incorporates multiple optimization elements to improve traffic light control, such as a dueling network, a target network, double Q-learning, and prioritized experience replay.
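The MaxPressure baseline described above can be sketched in a few lines. The phase/lane layout and vehicle counts here are made-up examples; a real controller would read lane counts from detectors or a simulator such as SUMO [15].

```python
# Each phase lists the (incoming_lane, outgoing_lane) movements it allows.
phases = {
    "NS_through": [("N_in", "S_out"), ("S_in", "N_out")],
    "EW_through": [("E_in", "W_out"), ("W_in", "E_out")],
}

# Vehicle counts per lane (illustrative numbers).
queue = {
    "N_in": 8, "S_in": 5, "E_in": 2, "W_in": 3,
    "N_out": 1, "S_out": 2, "E_out": 0, "W_out": 4,
}

def pressure(phase):
    # Pressure of a phase: sum over its movements of
    # (vehicles on the incoming lane - vehicles on the outgoing lane).
    return sum(queue[i] - queue[o] for i, o in phases[phase])

# MaxPressure activates the phase with the largest pressure.
best = max(phases, key=pressure)
print(best, pressure(best))  # NS_through, pressure (8-2)+(5-1) = 10
```

Because the rule is purely greedy over current counts, MP needs no training, which is why it serves as a strong non-learning baseline for the RL methods above.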
4.4. Result and Analysis
4.4.1. Light Traffic Flow Scenario
4.4.2. Heavy Traffic Flow Scenario
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Noaeen, M.; Naik, A.; Goodman, L.; Crebo, J.; Abrar, T.; Abad, Z.S.H.; Bazzan, A.L.; Far, B. Reinforcement learning in urban network traffic signal control: A systematic literature review. Expert Syst. Appl. 2022, 199, 116830. [Google Scholar] [CrossRef]
- Li, L.; Wen, D. Parallel systems for traffic control: A rethinking. IEEE Trans. Intell. Transp. Syst. 2016, 17, 1179–1182. [Google Scholar] [CrossRef]
- Robertson, D.I.; Bretherton, R.D. Optimizing networks of traffic signals in real-time SCOOT method. IEEE Trans. Veh. Technol. 1991, 40, 11–15. [Google Scholar] [CrossRef]
- Sims, A. The Sydney coordinated adaptive traffic (SCAT) system philosophy and benefits. IEEE Trans. Veh. Technol. 1980, 29, 130–137. [Google Scholar] [CrossRef]
- Cools, S.B.; Gershenson, C.; D’Hooghe, B. Self-organizing traffic lights: A realistic simulation. In Advances in Applied Self-Organizing Systems; Springer: London, UK, 2013; pp. 45–55. [Google Scholar]
- Varaiya, P. Max pressure control of a network of signalized intersections. Transp. Res. Part C Emerg. Technol. 2013, 36, 177–195. [Google Scholar] [CrossRef]
- Haydari, A.; Yilmaz, Y. Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey. IEEE Trans. Intell. Transp. Syst. 2022, 23, 11–32. [Google Scholar] [CrossRef]
- Li, L.; Lv, Y.; Wang, F.Y. Traffic Signal Timing via Deep Reinforcement Learning. IEEE-CAA J. Autom. Sin. 2016, 3, 247–254. [Google Scholar]
- Mousavi, S.; Schukat, M.; Howley, E. Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning. IET Intell. Transp. Syst. 2017, 11, 417–423. [Google Scholar] [CrossRef]
- Genders, W.; Razavi, S. Policy Analysis of Adaptive Traffic Signal Control Using Reinforcement Learning. J. Comput. Civ. Eng. 2020, 34, 19–46. [Google Scholar] [CrossRef]
- Liang, X.; Du, X.; Wang, G.; Han, Z. A deep reinforcement learning network for traffic light cycle control. IEEE Trans. Veh. Technol. 2019, 68, 1243–1253. [Google Scholar] [CrossRef]
- Wei, H.; Zheng, G.; Yao, H.; Li, Z. IntelliLight: A Reinforcement Learning Approach for Intelligent Traffic Light Control. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, London, UK, 19–23 August 2018; pp. 2496–2505. [Google Scholar]
- Wei, H.; Chen, C.; Zheng, G.; Wu, K.; Li, Z. PressLight: Learning Max Pressure Control to Coordinate Traffic Signals in Arterial Network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 1290–1298. [Google Scholar]
- Chen, C.; Wei, H.; Xu, N.; Zheng, G.; Yang, M.; Xiong, Y.; Xu, K.; Li, Z. Toward a Thousand Lights: Decentralized Deep Reinforcement Learning for Large-Scale Traffic Signal Control. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI’19), Honolulu, HI, USA, 27 January–1 February 2019; pp. 3414–3421. [Google Scholar]
- Krajzewicz, D.; Erdmann, J.; Behrisch, M.; Bieker, L. Recent development and applications of sumo simulation of urban mobility. Int. J. Adv. Syst. Meas. 2012, 5, 128–138. [Google Scholar]
- Wu, Q.; Zhang, L.; Shen, J.; Lu, L.; Du, B.; Wu, J. Efficient pressure: Improving efficiency for signalized intersections. arXiv 2021, arXiv:2112.02336. [Google Scholar]
- Zhang, L.; Wu, Q.; Jun, S.; Lu, L.; Du, B.; Wu, J. Expression might be enough: Representing pressure and demand for reinforcement learning based traffic signal control. In Proceedings of the 39th International Conference on Machine Learning, Baltimore, MD, USA, 17–23 July 2022; pp. 26645–26654. [Google Scholar]
- Shabestary, S.M.A.; Abdulhai, B. Deep learning vs. discrete reinforcement learning for adaptive traffic signal control. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018; pp. 286–293. [Google Scholar]
- Zeng, J.; Hu, J.; Zhang, Y. Adaptive Traffic Signal Control with Deep Recurrent Q-learning. In Proceedings of the 2018 IEEE Intelligent Vehicles Symposium (IV), Changshu, China, 26–30 June 2018; pp. 1215–1220. [Google Scholar]
- Chen, P.; Zhu, Z.; Lu, G. An Adaptive Control Method for Arterial Signal Coordination Based on Deep Reinforcement Learning. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; pp. 3553–3558. [Google Scholar]
- Van Hasselt, H.; Guez, A.; Silver, D. Deep reinforcement learning with double q-learning. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI’15), Austin, TX, USA, 25–30 January 2015; pp. 2094–2100. [Google Scholar]
- Wang, Z.; Schaul, T.; Hessel, M.; van Hasselt, H.; Lanctot, M.; de Freitas, N. Dueling network architectures for deep reinforcement learning. In Proceedings of the 33rd International Conference on Machine Learning (ICML’16), New York, NY, USA, 19–24 June 2016; pp. 1995–2003. [Google Scholar]
- Schaul, T.; Quan, J.; Antonoglou, I.; Silver, D. Prioritized experience replay. In Proceedings of the 4th International Conference on Learning Representations (ICLR’16), San Juan, PR, USA, 2–4 May 2016. [Google Scholar]
- Xu, M.; Wu, J.; Huang, L.; Zhou, R.; Wang, T.; Hu, D. Network-wide traffic signal control based on the discovery of critical nodes and deep reinforcement learning. J. Intell. Transport. Syst. 2020, 24, 1–10. [Google Scholar] [CrossRef]
- Shashi, F.I.; Md Sultan, S.; Khatun, A.; Sultana, T.; Alam, T. A Study on Deep Reinforcement Learning Based Traffic Signal Control for Mitigating Traffic Congestion. In Proceedings of the 2021 IEEE 3rd Eurasia Conference on Biomedical Engineering, Healthcare and Sustainability (ECBIOS), Tainan, Taiwan, 28–30 May 2021; pp. 288–291. [Google Scholar]
- Wei, H.; Zheng, G.; Gayah, V.; Li, Z. Recent Advances in Reinforcement Learning for Traffic Signal Control: A Survey of Models and Evaluation. ACM SIGKDD Explor. Newsl. 2022, 22, 12–18. [Google Scholar] [CrossRef]
- Liu, J.; Qin, S.; Luo, Y.; Wang, Y.; Yang, S. Intelligent Traffic Light Control by Exploring Strategies in an Optimised Space of Deep Q-Learning. IEEE Trans. Veh. Technol. 2022, 71, 5960–5970. [Google Scholar] [CrossRef]
Table: Results in the light traffic flow scenario (AQL = average queue length; AWT = average waiting time; ATT = average travel time).

Algorithm | Reward | AQL | AWT | ATT |
---|---|---|---|---|
FT | 0.005 | 0.555 | 7.729 | 27.538 |
MaxPressure | 0.172 | 0.504 | 7.240 | 15.631 |
SOTL | 4.410 | 0.386 | 5.341 | 9.331 |
3DQN | 4.229 | 0.351 | 6.535 | 6.775 |
EP-D3QN | 6.332 | 0.322 | 3.911 | 4.982 |
Table: Results in the heavy traffic flow scenario (AQL = average queue length; AWT = average waiting time; ATT = average travel time).

Algorithm | Reward | AQL | AWT | ATT |
---|---|---|---|---|
FT | 0.009 | 3.130 | 6.648 | 16.030 |
MaxPressure | 0.007 | 2.740 | 9.626 | 6.377 |
SOTL | 0.058 | 2.243 | 6.452 | 10.420 |
3DQN | 9.588 | 2.703 | 6.232 | 7.532 |
EP-D3QN | 11.658 | 1.385 | 4.110 | 3.519 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Wang, B.; He, Z.; Sheng, J.; Chen, Y. Deep Reinforcement Learning for Traffic Light Timing Optimization. Processes 2022, 10, 2458. https://doi.org/10.3390/pr10112458