A Traffic-Aware Federated Imitation Learning Framework for Motion Control at Unsignalized Intersections with Internet of Vehicles
Abstract
1. Introduction
- (1) Isolation: To balance motion control performance against privacy preservation, a local center is needed to assist motion control optimization, which means that some privacy-sensitive data are delivered to that local center. Privacy requirements, however, prohibit exposing data to cloud nodes or other peer nodes, which creates data isolation among intersections;
- (2) Heterogeneity: Because vehicles are not uniformly distributed in space, intersections in different areas carry different traffic flows; one distinguishing characteristic is the flow rate. Since the experience data generated at each intersection reflect its own traffic flow, the RL models trained on these data differ in their capability for motion control optimization. Conventional model parameter averaging therefore cannot meet the performance requirements of different intersections;
- (3) Scalability: As the number of IoV-enabled unsignalized intersections grows, the volume of data generated by vehicles increases. Because of the high computation and communication costs incurred, any centralized learning-based algorithm may struggle to handle such data.
- The TAFI-MC framework is proposed to cooperatively optimize motion control across multiple isolated unsignalized intersections. The framework consists of three parts: vehicle interactors, edge trainers, and one cloud aggregator;
- TAFI-MC integrates an IL algorithm to obtain a safety-oriented motion control policy, training the model on experience generated from a set of collision avoidance rules;
- A loss-aware experience selection strategy is designed that reduces communication overhead at the cost of extra computation. Based on a reference loss, each interactor generates new experiences and decides whether to upload them.
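The upload decision in the last contribution can be illustrated with a minimal Python sketch. It assumes that an interactor uploads an experience only when its loss under the current model exceeds the reference loss; the text above states only that the decision depends on the reference loss, so this threshold rule, the `Experience` layout, and the names `select_for_upload` and `squared_error` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical experience layout: (state features, expert action).
Experience = Tuple[Sequence[float], float]


def select_for_upload(
    experiences: List[Experience],
    loss_fn: Callable[[Experience], float],
    reference_loss: float,
) -> List[Experience]:
    """Keep only experiences whose loss exceeds the reference loss.

    Assumption: a high loss marks an experience the current model has not
    learned yet, so it is worth the communication cost of an upload.
    """
    return [e for e in experiences if loss_fn(e) > reference_loss]


def squared_error(policy_weight: float) -> Callable[[Experience], float]:
    """Toy loss for a one-parameter linear policy (illustration only)."""
    def loss(experience: Experience) -> float:
        state, expert_action = experience
        predicted_action = policy_weight * state[0]
        return (predicted_action - expert_action) ** 2
    return loss


batch = [([1.0], 1.0), ([2.0], 2.5), ([3.0], 6.0)]
kept = select_for_upload(batch, squared_error(1.0), reference_loss=0.2)
```

The extra computation is the per-experience loss evaluation on the vehicle; the saving is every experience that is filtered out before transmission.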
2. System Architecture
3. Federated Imitation Learning Framework for Motion Control
3.1. Traffic-Aware Federated Learning
- (1) Model Distribution: A set of edge trainers (ETs) at intersections participates in FL training. The cloud aggregator (CA) distributes the global model to the ETs, and ET n trains it on local data to obtain a new local model. Communication rounds are indexed by r.
- (2) Experience Upload: To improve motion control performance, each vehicle should take other vehicles’ states into account during inference. Training on individual vehicles, however, is inefficient because each vehicle collects too few samples and the samples are non-uniformly distributed. Under this setting, centralized training with distributed execution is essential [26]. Each vehicle therefore interacts with the environment, i.e., other vehicles, and generates a large amount of experience data to upload. The corresponding ET uses these data to train the local model.
- (3) FL Model Training: The third step trains the model on the local data uploaded by vehicles. Each selected ET stores its own local experience data, and d is the size of the entire data set across the selected ETs. The goal of FL is to minimize the loss function over these data.
- (4) Upload Updated Model: The fourth step uploads the local model from each ET to the CA. Because the communication overhead exceeds the computing overhead [27], the model can be compressed before it is uploaded to the CA to reduce communication overhead.
- (5) Weighted Aggregation: After the ETs upload their models, the fifth step produces a new global model by computing a weighted sum of all received models. The new global model is used in the next training iteration. Federated Averaging (FedAVG), which is commonly used in FL, increases the proportion of local computing and decreases mini-batch sizes: each ET performs multiple local updates before the aggregation step in the CA. The models are aggregated by weighted averaging, with weights determined by the traffic flow of each intersection: the weight of an ET is the ratio of the traffic flow at its intersection to F, the sum of the traffic flows over all intersections.
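The weighted aggregation in step (5) can be sketched in a few lines of Python. This is a minimal illustration, assuming each local model is a flat list of parameters and each ET reports a single scalar traffic flow; the function name `traffic_aware_aggregate` and the ET identifiers are hypothetical and not taken from the paper.

```python
from typing import Dict, List


def traffic_aware_aggregate(
    local_models: Dict[str, List[float]],
    traffic_flows: Dict[str, float],
) -> List[float]:
    """Average flat parameter vectors, weighting each ET by flow / total flow."""
    if not local_models:
        raise ValueError("no local models received")
    # F: sum of the traffic flows over all participating intersections.
    total_flow = sum(traffic_flows[name] for name in local_models)
    if total_flow <= 0:
        raise ValueError("total traffic flow must be positive")
    size = len(next(iter(local_models.values())))
    global_model = [0.0] * size
    for name, params in local_models.items():
        if len(params) != size:
            raise ValueError("parameter vectors must have equal length")
        weight = traffic_flows[name] / total_flow
        for i, value in enumerate(params):
            global_model[i] += weight * value
    return global_model


# Two ETs: the busier intersection (300 vs. 100 vehicles per unit time)
# contributes three quarters of the new global model.
models = {"et_a": [1.0, 0.0], "et_b": [3.0, 4.0]}
flows = {"et_a": 300.0, "et_b": 100.0}
new_global = traffic_aware_aggregate(models, flows)
```

With equal traffic flows the weights reduce to a plain average over ETs, so the traffic-aware weighting differs from uniform averaging only when the intersections carry different flows.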
3.2. Imitation Learning for Motion Control
Algorithm 1: Imitation learning for motion control.
3.3. Collision Avoidance Rules
4. Loss-Aware Experience Selection Strategy
5. Results and Discussion
5.1. Simulation Settings
5.2. Metric
5.3. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
FL | Federated Learning |
IoV | Internet of Vehicles |
IL | Imitation Learning |
References
- Chen, Y.; Lu, C.; Chu, W. A Cooperative Driving Strategy Based on Velocity Prediction for Connected Vehicles With Robust Path-Following Control. IEEE Internet Things J. 2020, 7, 3822–3832.
- Xu, Y.; Li, D.; Xi, Y. A Game-Based Adaptive Traffic Signal Control Policy Using the Vehicle to Infrastructure (V2I). IEEE Trans. Veh. Technol. 2019, 68, 9425–9437.
- Khayatian, M.; Mehrabian, M.; Andert, E.; Dedinsky, R.; Choudhary, S.; Lou, Y.; Shrivastava, A. A Survey on Intersection Management of Connected Autonomous Vehicles. ACM Trans. Cyber-Phys. Syst. 2020, 4, 48:1–48:27.
- Stryszowski, M.; Longo, S.; Velenis, E.; Forostovsky, G. A framework for self-enforced interaction between connected vehicles: Intersection negotiation. IEEE Trans. Intell. Transp. Syst. 2020, 22, 6716–6725.
- Perronnet, F.; Buisson, J.; Lombard, A.; Abbas-Turki, A.; Ahmane, M.; Moudni, A.E. Deadlock Prevention of Self-Driving Vehicles in a Network of Intersections. IEEE Trans. Intell. Transp. Syst. 2019, 20, 4219–4233.
- Guan, Y.; Ren, Y.; Li, S.E.; Sun, Q.; Luo, L.; Li, K. Centralized cooperation for connected and automated vehicles at intersections by proximal policy optimization. IEEE Trans. Veh. Technol. 2020, 69, 12597–12608.
- Wu, T.; Jiang, M.; Zhang, L. Cooperative Multiagent Deep Deterministic Policy Gradient (CoMADDPG) for Intelligent Connected Transportation with Unsignalized Intersection. Math. Probl. Eng. 2020, 2020, 1820527.
- Jiang, M.; Wu, T.; Wang, Z.; Gong, Y.; Zhang, L.; Liu, R.P. A Multi-intersection Vehicular Cooperative Control based on End-Edge-Cloud Computing. arXiv 2020, arXiv:2012.00500.
- Guo, W.; Tian, W.; Ye, Y.; Xu, L.; Wu, K. Cloud Resource Scheduling with Deep Reinforcement Learning and Imitation Learning. IEEE Internet Things J. 2020, 8, 3576–3586.
- Huo, Y.; Tao, Q.; Hu, J. Cooperative Control for Multi-Intersection Traffic Signal Based on Deep Reinforcement Learning and Imitation Learning. IEEE Access 2020, 8, 199573–199585.
- Yang, B.; Zhang, J.; Shi, H. Interactive-Imitation-Based Distributed Coordination Scheme for Smart Manufacturing. IEEE Trans. Ind. Inform. 2020, 17, 3599–3608.
- Riedmaier, S.; Ponn, T.; Ludwig, D.; Schick, B.; Diermeyer, F. Survey on scenario-based safety assessment of automated vehicles. IEEE Access 2020, 8, 87456–87477.
- Tesla. Tesla AI Day. 2021. Available online: https://www.youtube.com/watch?v=j0z4FweCy4M&ab_channel=Tesla (accessed on 3 December 2021).
- Luo, P.; Yu, F.R.; Chen, J.; Li, J.; Leung, V.C. A Novel Adaptive Gradient Compression Scheme: Reducing the Communication Overhead for Distributed Deep Learning in the Internet of Things. IEEE Internet Things J. 2021, 8, 11476–11486.
- Shi, S.; Wang, Q.; Chu, X.; Li, B.; Qin, Y.; Liu, R.; Zhao, X. Communication-efficient distributed deep learning with merged gradient sparsification on GPUs. In Proceedings of the IEEE INFOCOM 2020-IEEE Conference on Computer Communications, Toronto, ON, Canada, 6–9 July 2020; pp. 406–415.
- Sattler, F.; Wiedemann, S.; Müller, K.R.; Samek, W. Robust and communication-efficient federated learning from non-iid data. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 3400–3413.
- Schaul, T.; Quan, J.; Antonoglou, I.; Silver, D. Prioritized Experience Replay. In Proceedings of the 4th International Conference on Learning Representations, ICLR 2016, San Juan, PR, USA, 2–4 May 2016; pp. 1–21.
- Messaoud, S.; Bradai, A.; Ahmed, O.B.; Quang, P.T.A.; Atri, M.; Hossain, M.S. Deep federated q-learning-based network slicing for industrial iot. IEEE Trans. Ind. Inform. 2020, 17, 5572–5582.
- Mun, H.; Lee, Y. Internet Traffic Classification with Federated Learning. Electronics 2021, 10, 27.
- Li, Z.; Liu, J.; Hao, J.; Wang, H.; Xian, M. CrowdSFL: A Secure Crowd Computing Framework Based on Blockchain and Federated Learning. Electronics 2020, 9, 773.
- Yu, Z.; Hu, J.; Min, G.; Wang, Z.; Miao, W.; Li, S. Privacy-Preserving Federated Deep Learning for Cooperative Hierarchical Caching in Fog Computing. IEEE Internet Things J. 2021, 1–10.
- Chen, Z.; Liao, W.; Hua, K.; Lu, C.; Yu, W. Towards asynchronous federated learning for heterogeneous edge-powered internet of things. Digit. Commun. Netw. 2021, 7, 317–326.
- Zhao, N.; Wu, H.; Yu, F.R.; Wang, L.; Zhang, W.; Leung, V.C. Deep Reinforcement Learning-Based Latency Minimization in Edge Intelligence over Vehicular Networks. IEEE Internet Things J. 2021.
- Lim, W.Y.B.; Huang, J.; Xiong, Z.; Kang, J.; Niyato, D.; Hua, X.S.; Leung, C.; Miao, C. Towards federated learning in uav-enabled internet of vehicles: A multi-dimensional contract-matching approach. IEEE Trans. Intell. Transp. Syst. 2021, 22, 5140–5154.
- Bian, Y.; Li, S.E.; Ren, W.; Wang, J.; Li, K.; Liu, H.X. Cooperation of multiple connected vehicles at unsignalized intersections: Distributed observation, optimization, and control. IEEE Trans. Ind. Electron. 2019, 67, 10744–10754.
- Konečný, J.; McMahan, H.B.; Yu, F.X.; Richtárik, P.; Suresh, A.T.; Bacon, D. Federated Learning: Strategies for Improving Communication Efficiency. arXiv 2016, arXiv:1610.05492.
- McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS 2017, Fort Lauderdale, FL, USA, 20–22 April 2017; Volume 54, pp. 1273–1282.
- Zhang, Y.; Malikopoulos, A.A.; Cassandras, C.G. Decentralized optimal control for connected automated vehicles at intersections including left and right turns. In Proceedings of the 56th IEEE Annual Conference on Decision and Control, CDC 2017, Melbourne, Australia, 12–15 December 2017; pp. 4428–4433.
- Katriniok, A.; Kojchev, S.; Lefeber, E.; Nijmeijer, H. Distributed scenario model predictive control for driver aided intersection crossing. In Proceedings of the 2018 European Control Conference (ECC), Limassol, Cyprus, 12–15 June 2018; pp. 1746–1752.
Parameter | Value |
---|---|
Simulator | |
Lane length (m) | 150 |
Vehicle size (m) | 2 |
Velocity (m/s) | |
Initial velocity (m/s) | 10 |
Acceleration (m/s²) | |
Discrete-time step T (s) | |
Safety Value | |
Space normalization factor | 10 |
Space exponential factor | 10 |
Time linear factor | |
Time exponential factor | 2 |
Acceleration exponential factor | |
Acceleration exponential factor | 12 |
Acceleration linear factor | |
Safety value upper bound | 20 |
Safety value lower bound | |
Conversion factor | 3 |
Fusion factor | |
Weighting factor | |
Vehicle Selection | |
Number of the closest vehicle n | 5 |
Parameter | Value |
---|---|
Discount factor | 0.8 |
Batch Size B | 48 |
Soft update factor | 0.99 |
Episode | 50 |
Learning rate | 0.001 → 0 |
Optimizer | Adam |
Network Architecture | |
Dense layer #1 (units) | 64 |
Dense layer #2 (units) | 64 |
Dense layer #3 (units) | 1 |
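The training settings in the table above can be collected into a small configuration object, sketched here in Python. The values are copied from the table; the class and field names are hypothetical, and the linear per-episode decay used for the "0.001 → 0" learning rate is an assumption, since the table does not state the shape of the schedule.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters from the table; field names are illustrative."""
    discount_factor: float = 0.8
    batch_size: int = 48
    soft_update_factor: float = 0.99
    episodes: int = 50
    initial_learning_rate: float = 0.001
    final_learning_rate: float = 0.0
    optimizer: str = "Adam"
    dense_units: Tuple[int, ...] = (64, 64, 1)


def learning_rate(cfg: TrainingConfig, episode: int) -> float:
    """Learning rate for a zero-based episode index.

    Assumption: the rate decays linearly from the initial to the final
    value over the episodes; the source gives only the two endpoints.
    """
    if not 0 <= episode < cfg.episodes:
        raise ValueError("episode index out of range")
    if cfg.episodes == 1:
        return cfg.initial_learning_rate
    fraction = episode / (cfg.episodes - 1)
    span = cfg.final_learning_rate - cfg.initial_learning_rate
    return cfg.initial_learning_rate + fraction * span


cfg = TrainingConfig()
schedule = [learning_rate(cfg, e) for e in range(cfg.episodes)]
```

Any other monotone schedule with the same endpoints would be equally consistent with the table.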
Discard Factor | | 1% | 2% | 5% | 10% |
---|---|---|---|---|---|
Collision Rate | Model | 36% | 39% | 44% | 42% |
Collision Rate | Model+Rule | 0% | 0% | 0% | 0% |
Average Jerk | Model | 21.55 | 13.08 | 11.11 | 23.85 |
Average Jerk | Model+Rule | 137.15 | 113.23 | 108.83 | 130.83 |
Average Velocity | Model | 12.22 | 12.06 | 12.06 | 12.24 |
Average Velocity | Model+Rule | 12.21 | 12.21 | 12.19 | 12.26 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wu, T.; Jiang, M.; Han, Y.; Yuan, Z.; Li, X.; Zhang, L. A Traffic-Aware Federated Imitation Learning Framework for Motion Control at Unsignalized Intersections with Internet of Vehicles. Electronics 2021, 10, 3050. https://doi.org/10.3390/electronics10243050