Robust Long-Term Vehicle Trajectory Prediction Using Link Projection and a Situation-Aware Transformer
Abstract
1. Introduction
- We propose a long-term vehicle trajectory prediction method for complex urban roads. The proposed approach is robust to error accumulation and capable of capturing the driver’s driving pattern. To validate its effectiveness, we conduct comprehensive evaluations and performance comparisons using real-world vehicle trajectories.
- We propose an enhanced Transformer model to precisely forecast the long-term trajectory of a vehicle. In particular, to capture changes in the driver’s driving pattern in response to external factors (e.g., traffic control devices), we propose to add an encoder network to the Transformer model that abstracts the situation in the vicinity of the driver.
- To ensure that the predicted trajectory lies on (i.e., does not deviate from) the actual road, we propose a link projection scheme that projects each prediction onto the link geometry.
- We propose a new performance evaluation metric tailored to vehicle trajectory prediction, called the area between curves (ABC). It measures the similarity of the predicted trajectory to the ground-truth trajectory, gauging the agreement between the two patterns.
2. Related Work
3. Proposed Idea
3.1. Situation-Aware Transformer
- Encoder block reads the past vehicle trajectory, whose length is called the look-back window size. This block encodes the recent trajectory of the target vehicle to extract the driving pattern. The encoded pattern is passed to the following decoder block to generate the future trajectory.
- Situation encoder block is the block newly attached to the Transformer model, whose goal is to understand the road situation in the vicinity of the target driver. In this study, the existence of intersections and traffic control devices is considered as the situation, along with other factors that can affect the driving pattern; for example, because the dataset we collected consists of bus trajectories, the locations of bus stops are also used as an input to the situation encoder block. This entails preselecting the elements that influence the driver’s driving pattern, and the information is conveyed to the situation encoder block in vector form, with each element assigned a fixed position within the vector that reflects its presence. Considering the heading direction of the target vehicle, the existence of a traffic light, traffic enforcement camera, intersection, and bus stop within a range of R meters is binary encoded and then passed to the situation encoder block. If there are multiple instances of the same element, the nearest one is processed first. In this study, a multi-layer perceptron (MLP) is used to construct the situation encoder block.
- Decoder block receives (i) the abstract representation of the situation from the situation encoder block and (ii) the encoded trajectory of the vehicle from the encoder block, and then generates the future trajectory (i.e., the location/position of the target vehicle). The attention layer at the bottom of the decoder block concatenates the encoded driving pattern and the recognized situation. In contrast to the RNN and its variants (e.g., LSTM and GRU), the Transformer model inherently lacks the concept of sequence or order, making it challenging to preserve temporal relationships when processing time-series data in parallel. Therefore, position encoding is used in our study to preserve the spatiotemporal correlations among the input data. The combined information reflects both the driver’s recent driving pattern and the surrounding road situation, and it is used to predict a high-precision future trajectory.
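The situation encoding described above can be sketched as follows. The element set (traffic light, enforcement camera, intersection, bus stop) follows the text, while the function name, the coordinate convention, and the radius value are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

# Fixed slot order for the situation vector; the element set follows the paper,
# the names and the default radius are illustrative.
ELEMENTS = ("traffic_light", "enforcement_camera", "intersection", "bus_stop")

def encode_situation(vehicle_pos, element_positions, radius_m=100.0):
    """Binary-encode which road elements lie within `radius_m` of the vehicle.

    element_positions: dict mapping element name -> list of (x, y) positions.
    Returns a length-4 {0, 1} vector, one fixed slot per element type.
    """
    v = np.zeros(len(ELEMENTS), dtype=np.float32)
    for i, name in enumerate(ELEMENTS):
        for pos in element_positions.get(name, []):
            if np.hypot(pos[0] - vehicle_pos[0], pos[1] - vehicle_pos[1]) <= radius_m:
                v[i] = 1.0  # presence bit; the nearest instance suffices
                break
    return v

vec = encode_situation((0.0, 0.0),
                       {"traffic_light": [(30.0, 40.0)],   # 50 m away -> 1
                        "bus_stop": [(300.0, 0.0)]})       # 300 m away -> 0
print(vec)  # [1. 0. 0. 0.]
```

The resulting vector would then be fed to the MLP-based situation encoder block.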
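The position encoding mentioned above is, in the vanilla Transformer, the standard sinusoidal scheme; a minimal sketch, assuming that scheme is the one used here:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Standard sinusoidal position encoding from the original Transformer."""
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1) time indices
    i = np.arange(d_model)[None, :]            # (1, d_model) channel indices
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])       # even channels: sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])       # odd channels: cosine
    return pe

pe = positional_encoding(seq_len=8, d_model=16)
print(pe.shape)  # (8, 16)
```

Each encoding row is added to the corresponding input embedding so that the otherwise order-agnostic attention layers can recover the temporal order of the trajectory samples.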
3.2. Link Projection
Algorithm 1: Link Projection-Based N-Step Trajectory Predictions |
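A minimal sketch of the idea behind Algorithm 1, under assumed data layouts: the link geometry is a polyline of vertices, and `predict_next` is a hypothetical stand-in for the trained model. Each predicted point is snapped to the nearest point on the link before being fed back, which is what keeps the N-step rollout from drifting off the road:

```python
import numpy as np

def project_to_link(point, link):
    """Snap a point onto the nearest position on a link polyline (K, 2)."""
    p = np.asarray(point, dtype=float)
    best, best_d2 = None, np.inf
    for a, b in zip(link[:-1], link[1:]):
        ab = b - a
        # Clamp the scalar projection so the result stays on the segment.
        t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
        q = a + t * ab
        d2 = np.dot(p - q, p - q)
        if d2 < best_d2:
            best, best_d2 = q, d2
    return best

def rollout(history, link, predict_next, n_steps):
    """N-step prediction: predict one step, project onto the link, feed back."""
    history = list(history)
    out = []
    for _ in range(n_steps):
        raw = predict_next(history)           # model's raw next position
        snapped = project_to_link(raw, link)  # corrected onto the road
        out.append(snapped)
        history.append(snapped)
    return np.array(out)

link = np.array([[0.0, 0.0], [10.0, 0.0]])
# Toy stand-in for the trained model: extrapolate the last step with drift.
drifty = lambda h: np.asarray(h[-1]) + np.array([1.0, 0.5])
preds = rollout([np.array([0.0, 0.0])], link, drifty, n_steps=3)
print(preds)  # [[1. 0.] [2. 0.] [3. 0.]]
```

Without the projection step the toy predictor would accumulate 0.5 of lateral error per step; with it, every fed-back position lies on the link, so the error cannot compound.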
3.3. Overall Procedure
4. Evaluation
4.1. Dataset
4.2. Evaluation Method
- RMSE is one of the representative standard statistical metrics indicating the difference between predicted and actual values, defined as RMSE = sqrt((1/n) * sum_{i=1..n} (y_i − ŷ_i)^2), where y_i and ŷ_i denote the actual and predicted values, respectively.
- MAE is another statistical metric for evaluating the difference between predicted and actual values, defined as MAE = (1/n) * sum_{i=1..n} |y_i − ŷ_i|. Because it does not square the errors, MAE is less affected by outliers than RMSE, making it a robust evaluation metric, particularly for coordinate data whose errors appear in small decimal places.
- ABC (Area Between Curves) is a path difference measurement technique proposed in this paper. Path similarity cannot be fully captured by pointwise distance-based metrics; in the trajectory prediction problem, however, it is important to produce a trajectory that overlaps with the ground truth as much as possible so that the predicted and actual driving patterns agree. The proposed ABC draws one curve from the ground truth and another from the predictions, and then measures the area of the closed region formed by the two curves, which can be computed by counting the number of pixels belonging to the closed area.
Algorithm 2: Area Between Curves (ABC) |
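The three metrics above can be sketched in a few lines, assuming both trajectories are (N, 2) arrays sampled at matching time steps. For ABC, the shoelace formula below is a continuous stand-in for the paper's pixel-counting step (Algorithm 2) and is valid when the two curves do not cross each other:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error over all coordinate components."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error over all coordinate components."""
    return float(np.mean(np.abs(y_true - y_pred)))

def abc(gt, pred):
    """Area of the closed region between two polylines (shoelace formula).

    The paper rasterizes the region and counts pixels; the shoelace area is
    the continuous analogue, assuming the curves do not intersect.
    """
    poly = np.vstack([gt, pred[::-1]])  # close gt with the reversed prediction
    x, y = poly[:, 0], poly[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

gt = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
pred = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
print(rmse(gt, pred), mae(gt, pred), abc(gt, pred))
```

In this toy case the prediction deviates by 1.0 at a single step: RMSE and MAE average that deviation away over all coordinates, while ABC reports the full enclosed triangle area (1.0), which is why it better reflects pattern agreement along the whole path.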
4.3. Evaluation and Comparison Results
- Vanilla LSTM: It is well known for effectively learning long-term patterns in sequential data with a relatively simple architecture. The learned information is stored in the cell state, and information is added or removed through gates: each LSTM unit incorporates a cell state and three gates (input, forget, and output), effectively managing the flow of information while maintaining the essential temporal relationships within the network. The model is widely utilized in various applications, including natural language processing, speech recognition, and time-series prediction.
- 1D-CNN: It can efficiently capture temporal patterns, offering strong performance in predicting future positions and analyzing traffic behaviors, and it enables in-depth time-series analysis without the need for complex feature extraction processes. In our experiment, the 1D-CNN model was constructed with 64 convolutional filters; the output was flattened and processed by dense layers, culminating in a final prediction layer with two outputs to forecast the coordinates.
- ConvLSTM: It integrates spatial and temporal features to learn vehicle trajectories, offering high accuracy in predicting future positions and recognizing behavior patterns. This architecture captures subtle spatio-temporal correlations even in complex road environments and traffic flows, significantly enhancing the reliability of vehicle trajectory analysis. In this study, the ConvLSTM model was structured with a ConvLSTM2D layer utilizing 64 filters, which is followed by a sequence of flattening and dense layers, culminating in the final layer that predicts two distinct features.
- Vanilla Transformer (Vanilla TF): With its attention mechanism, the Transformer model can effectively capture the temporal correlation among consecutive features in a vehicle trajectory, excelling in predicting future vehicle locations and recognizing complex traffic patterns. This approach can outperform traditional sequence learning methods by capturing deeper temporal dependencies, offering significant advantages in the analysis of vehicular movements. Following recent studies, the Transformer model was constructed using a series of normalization and attention operations, beginning with layer normalization and multi-head attention for processing inputs. This configuration is complemented by feed-forward networks comprising Conv1D layers for further transformation, applying self-attention across multiple heads to effectively capture dependencies without the constraints of sequence alignment.
- Situation-Aware Transformer (SAT): The Transformer-based model proposed in this study is enhanced to adapt to dynamic road situations and to precisely predict vehicle trajectories by considering both the learned driving pattern and the surrounding situation. The model is based on the encoder-decoder architecture of the vanilla Transformer, enhanced with an additional encoder for understanding the surrounding road situation, such as traffic control devices, intersections, and bus stops.
- Situation-Aware Transformer with Link Projection (SATLP): The proposed SAT model followed by the link projection operation, which corrects the prediction error.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Table 1. RMSE comparison by scenario (lower is better).

| Scenario | LSTM | 1D-CNN | ConvLSTM | Vanilla TF | SAT | SATLP |
|---|---|---|---|---|---|---|
| Intersection | 0.0539 | 0.1060 | 0.0525 | 0.0344 | 0.0443 | 0.0315 |
| Straight Lane | 0.1433 | 0.1216 | 0.0553 | 0.2367 | 0.0710 | 0.0791 |
| Curve Lane | 0.0753 | 0.1773 | 0.2196 | 0.0606 | 0.0676 | 0.0588 |
| Entire Trajectory | 0.1302 | 0.1010 | 0.0878 | 0.2046 | 0.0748 | 0.0701 |

Table 2. MAE comparison by scenario (lower is better).

| Scenario | LSTM | 1D-CNN | ConvLSTM | Vanilla TF | SAT | SATLP |
|---|---|---|---|---|---|---|
| Intersection | 0.0368 | 0.0896 | 0.0389 | 0.0312 | 0.0374 | 0.0271 |
| Straight Lane | 0.1063 | 0.0853 | 0.0401 | 0.1721 | 0.0524 | 0.0589 |
| Curve Lane | 0.0646 | 0.1416 | 0.1954 | 0.0526 | 0.0542 | 0.0450 |
| Entire Trajectory | 0.0988 | 0.0708 | 0.0597 | 0.1427 | 0.0617 | 0.0569 |

Table 3. ABC comparison by scenario, in pixels (lower is better).

| Scenario | LSTM | 1D-CNN | ConvLSTM | Vanilla TF | SAT | SATLP |
|---|---|---|---|---|---|---|
| Intersection | 562,145 | 960,666 | 979,646 | 918,796 | 617,366 | 524,391 |
| Straight Lane | 108,663 | 136,888 | 169,647 | 315,519 | 66,964 | 10,734 |
| Curve Lane | 1,308,515 | 1,462,938 | 1,228,900 | 702,028 | 486,265 | 233,317 |
| Entire Trajectory | 42,530 | 51,494 | 93,393 | 114,687 | 32,384 | 9804 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kim, M.; Kwak, B.I.; Hou, J.-U.; Kim, T. Robust Long-Term Vehicle Trajectory Prediction Using Link Projection and a Situation-Aware Transformer. Sensors 2024, 24, 2398. https://doi.org/10.3390/s24082398