1. Introduction
With continuous improvements in the standard of living, consumer demand for cars has been steadily increasing, leading to a rapid rise in the number of motor vehicles. Statistics show that, in 2023, the number of motor vehicles in China exceeded 400 million, an increase of nearly 25 million over the previous year [1]. While automobiles have made travel more convenient, they have also caused problems such as traffic congestion and safety hazards, posing a significant threat to people's lives and property. With the advent of intelligent connected vehicles and big data, the development of autonomous driving and intelligent driver-assistance technologies provides solutions for improving road traffic safety and alleviating congestion. Among these technologies, vehicle-trajectory prediction is a key component of autonomous driving and intelligent driver assistance. By predicting the future trajectory of a vehicle and assessing driving risks with traffic-conflict or safety techniques, potential dangers can be identified in real time. This enables corrective actions by the driver, helping to prevent traffic accidents. However, accurately predicting the future trajectory of a vehicle in a dynamic and complex traffic environment presents numerous challenges [2,3].
Early methods for vehicle-trajectory prediction were mostly based on physical models, using vehicle kinematics to relate steering angle, acceleration, and other factors to the vehicle's motion trends [4,5,6]. For example, Zyner et al. [7] predicted vehicle-driving behavior using parameters such as vehicle speed, position, and heading angle; Xin et al. [8] addressed issues such as sensor limitations and obstacle occlusion by using a deep-learning model with single-vehicle motion-state information as the input for trajectory prediction. However, methods based on physical models have significant limitations in practice, as they are only reliable for short-term trajectory prediction. Later, researchers applied Markov models [9,10] and Kalman filtering [11,12] to trajectory prediction, achieving promising results. For instance, Nishiwaki et al. [9] used a hidden Markov model trained on vehicle longitudinal speed, lateral position, and dynamic characteristics to generate vehicle trajectories. Xiao et al. [13] proposed an interactive multi-model trajectory prediction method that fuses a motion model and a maneuver model, combining the constant-turn-rate-and-acceleration (CTRA) motion model with an unscented Kalman filter to predict uncertain future vehicle trajectories.
Nonetheless, these methods also suffer from decreased accuracy in long-term prediction. Consequently, scholars began exploring other approaches to vehicle-trajectory prediction. Xu et al. [14] proposed an extended dynamic model to represent evasive lane-changing behavior, which generated more accurate lane-change trajectories than the quintic-polynomial lane-change model. Xie et al. [15] predicted vehicle trajectories based on behavior recognition and curvature constraints, using lane curvature as a constraint to select the optimal trajectory.
With the development of machine learning and deep learning, data-driven trajectory prediction has gained increasing attention [16,17,18,19]. Given large datasets, Dai et al. [20] considered spatial interactions between vehicles and temporal relationships within trajectory time series, proposing and validating a spatiotemporal LSTM-based trajectory prediction model. Wen et al. [21] proposed a generative-adversarial-network (GAN)-based vehicle lane-change trajectory prediction model, which significantly improved prediction accuracy compared with traditional models. Yang et al. [22] proposed a lane-changing trajectory prediction model based on temporal convolutional networks with an attention mechanism, which predicts lane-change trajectories with high accuracy. Xue et al. [23] addressed the limited consideration of traffic-environment influences on vehicle maneuvers in existing models and developed a machine-learning-based integrated lane-change prediction model that incorporates traffic context, making it applicable to various traffic environments. Guo et al. [24] developed a transformer-based lane-change prediction model using large-scale real trajectory data collected from connected vehicles.
To solve more complex prediction problems, trajectory prediction methods based on hybrid models are increasingly used. Deo et al. [25] proposed an LSTM-based method that learns the dynamics of the predicted vehicle's motion through an encoder, captures the interdependence of surrounding targets' movements through a convolutional social pooling layer, and combines the two in a decoder that outputs a multimodal predictive probability distribution for the target vehicle. To explain the influence of historical trajectories and adjacent vehicles on the target vehicle, Lin et al. [26] built on [25] and proposed STA-LSTM, an LSTM model with a spatiotemporal attention mechanism for vehicle-trajectory prediction; the spatiotemporal attention weights it provides enhance its interpretability. To obtain multidimensional data and extract effective information, research based on graph convolutional networks (GCNs) and attention mechanisms is also growing. Liang et al. [27] used LaneGCN and ActorNet to extract map features and target motion features and then used a fusion network composed of four types of interactions to generate multimodal trajectory predictions. Messaoud et al. [28] proposed a model that combines grid-based trajectory encoding with LSTM and multi-head self-attention.
Another data-driven approach learns vehicles' motion behavior from large amounts of trajectory data, accurately identifies the driving intention, and uses it to improve trajectory prediction accuracy [29]. Ji et al. [30] designed a driving-intention recognition and vehicle-trajectory prediction model based on an LSTM network, introducing a mixture-density network to represent the vehicle's future position. Dai et al. [20] considered the spatiotemporal interaction between vehicles and embedded spatial interaction into the model, proposing a spatiotemporal LSTM-based trajectory prediction model. Shi et al. [31] proposed a trajectory prediction algorithm that combines car-following and lane-changing, including a lane-changing-intention prediction module, a trajectory encoder, and a trajectory decoder; however, because multiple modules influence the model's predictions, its overall trajectory prediction accuracy is not significantly improved.
In summary, early research focused on kinematic models and polynomial trajectory fitting, which cannot adapt to complex, changing scenarios or meet the demands of real-time prediction. With advances in data collection and processing technologies, many researchers have used machine-learning methods to predict vehicle trajectories from large-scale trajectory datasets. Numerous studies have shown that vehicle trajectories are influenced by driving intention and driving style: even in the same driving environment, drivers with different styles can produce significantly different trajectories. Therefore, considering the increasingly complex road traffic environment, this paper comprehensively explores the relationship between driving style, driving intention, and vehicle-trajectory prediction. The study aims to develop a precise, effective, and adaptive vehicle-trajectory prediction model, with significant implications for vehicle risk assessment, driving decision making, and autonomous driving.
The remainder of this study is structured as follows: Section 2 introduces the trajectory prediction model and the research schematic overview. Section 3 presents the vehicle-trajectory datasets and the extracted trajectory data. Section 4 describes the construction of the trajectory prediction model, including the model input data, evaluation metrics, and parameter settings. Section 5 presents the model's prediction results, compares them with those of other models, and analyzes and discusses the experimental findings. Finally, conclusions and future perspectives are presented in Section 6.
2. Overview of Prediction Models
2.1. Recurrent Neural Network
In traditional feedforward neural networks there are no connections between states at successive time steps, so the output of the current state does not influence subsequent states. This lack of continuity and accumulated learning makes such networks inadequate for modeling time-series data. The recurrent neural network (RNN) addressed this limitation. An RNN can accept variable-length sequences as input and possesses memory, allowing it to capture temporal dependencies within sequences. RNNs have found wide application in fields such as natural language processing and time-series analysis.
An RNN is a chain-like neural network consisting of three layers: an input layer, a hidden layer, and an output layer. Data enters through the input layer, undergoes repeated iterations within the hidden layer, and is finally emitted through the output layer. The basic structure of an RNN is illustrated in Figure 1. The distinctive feature of RNNs is their ability to carry information from one time step to the next, accumulating information over time through recurrent connections within the hidden layer. This chain-like structure enables an RNN to handle sequential data, such as vehicle-trajectory data, effectively. However, during backpropagation through long sequences, an RNN may suffer from vanishing or exploding gradients, making it difficult to learn from long historical sequences and ultimately impairing its learning capability [32,33].
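As a minimal illustration of this recurrence (not from the paper — the dimensions and random weights below are placeholders), a single NumPy RNN step reuses the previous hidden state at every time step:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrent step: the new hidden state mixes the current
    input with the previous hidden state, giving the network memory."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8                         # e.g. 4 trajectory features
W_xh = rng.normal(size=(n_in, n_hid)) * 0.1
W_hh = rng.normal(size=(n_hid, n_hid)) * 0.1
b_h = np.zeros(n_hid)

h = np.zeros(n_hid)
sequence = rng.normal(size=(10, n_in))     # 10 time steps of input
for x_t in sequence:                       # hidden state carries over
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```

Because each `h` depends on all earlier inputs through repeated multiplication by `W_hh`, backpropagation through many steps compounds this matrix, which is exactly where the vanishing and exploding gradients mentioned above arise.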
2.2. Long Short-Term Memory Network
As an improved version of the RNN, the long short-term memory (LSTM) network retains the RNN's ability to handle time-series problems while effectively addressing vanishing and exploding gradients [34,35]. An LSTM network is composed of multiple cells, each responsible for maintaining and updating the cell state. The structure of a single LSTM unit, illustrated in Figure 2, includes three critical gates: the forget gate, the input gate, and the output gate. These gates interact with the cell state and, by introducing nonlinearities, enable the network to extract and learn relevant information from long data sequences more efficiently.
The forget gate determines which information from the previous cell state should be discarded, the input gate decides what new information should be stored in the cell state, and the output gate controls the output of the current cell state. This gate mechanism allows LSTM networks to selectively retain or discard information over long periods, making them particularly effective for learning dependencies in long-term sequences.
The forget gate controls whether information in the cell state should be forgotten, allowing the network to better handle long-term dependencies. By passing the current input $x_t$ and the hidden state from the previous time step $h_{t-1}$ through an activation function, the forget gate regulates the extent to which the current input and the previous hidden state are forgotten, mitigating the influence of earlier information on subsequent information. Information is transmitted through the forget gate as follows:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right)$$

In this context, $\sigma$ represents the sigmoid activation function, $W_f$ is the weight matrix for the forget gate, and $b_f$ is the bias term.
The input gate controls how the current input updates the cell state, thereby affecting the network's memory and learning capabilities. It consists of two main components:
Sigmoid Layer: This layer filters the input to select the values that should contribute to the state update, applying the sigmoid activation function to determine which parts of the current input and previous hidden state will influence the new state.
Tanh Layer: This layer uses the tanh activation function to compute a new candidate value $\tilde{C}_t$ for the cell state. The tanh function scales its input to the range between −1 and 1, allowing the network to add or remove information in a controlled manner.
The results of the sigmoid and tanh layers are then multiplied and, together with the forget-gated previous cell state, form the new cell state for the current time step. This enables the input gate to flexibly filter out irrelevant information from the current input while incorporating relevant historical information. The cell state is updated through the input gate as follows:

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)$$
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$

In this context, $W_C$ and $W_i$ are the weight matrices associated with the candidate state and the input gate, respectively. Similarly, $b_C$ and $b_i$ are the bias terms corresponding to the candidate state and the input gate.
The output gate determines the final output from the cell state and consists of both a sigmoid layer and a tanh layer. The process works as follows:
Sigmoid layer calculation: The current input $x_t$ and the previous hidden state $h_{t-1}$ are passed through a sigmoid activation function, which decides the extent to which the cell state is expressed in the output. The sigmoid function generates an output-gate value $o_t$, ranging between 0 and 1, indicating how much of the cell state will contribute to the final output.
Tanh layer calculation: Simultaneously, the tanh activation function is applied to the cell state $C_t$, constraining it within the range of −1 to 1.
Final output calculation: The final hidden state $h_t$ of the LSTM is obtained by multiplying the output-gate value $o_t$ by $\tanh(C_t)$. This operation selectively filters the cell state to produce the final output, balancing the contribution of the cell's memory and the current input.
The process of the output gate can be summarized as follows:

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$$
$$h_t = o_t * \tanh(C_t)$$

In this context, $W_o$ is the weight matrix associated with the output gate, and $b_o$ is the bias term for the output gate.
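The three gates can be collected into a single step function. The NumPy sketch below is illustrative, not the paper's implementation; the stacked-weight layout `W` (forget, input, candidate, and output transforms side by side) is an implementation choice of ours:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (n_hid + n_in, 4*n_hid) and stacks
    the forget, input, candidate, and output transforms; b stacks
    their biases."""
    z = np.concatenate([h_prev, x_t]) @ W + b
    n = h_prev.size
    f = sigmoid(z[0 * n:1 * n])        # forget gate  f_t
    i = sigmoid(z[1 * n:2 * n])        # input gate   i_t
    c_tilde = np.tanh(z[2 * n:3 * n])  # candidate    ~C_t
    o = sigmoid(z[3 * n:4 * n])        # output gate  o_t
    c = f * c_prev + i * c_tilde       # new cell state   C_t
    h = o * np.tanh(c)                 # new hidden state h_t
    return h, c
```

In practice these weights are learned during training; Keras' `LSTM` layer performs the same computation internally.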
2.3. Research Schematic Overview
The preceding analysis shows that the LSTM has strong information-mining and deep-representation capabilities for temporal problems and significant advantages for vehicle-trajectory prediction. Therefore, this study builds on the LSTM network, comprehensively considers the influence of driving style and driving intention on the driving trajectory, and uses both, together with historical trajectory data, as inputs to construct an LSTM-based vehicle-trajectory prediction method. The research flowchart of the proposed framework is shown in Figure 3.
4. Trajectory Prediction Model
4.1. Model Input Data
In this study, an LSTM network based on the Keras framework was constructed on the TensorFlow deep-learning platform. The model input is a three-dimensional tensor of (number of samples, number of time steps, number of features). Key feature parameters that effectively describe changes in the vehicle's trajectory, such as the vehicle's lateral and longitudinal coordinates, speed, and acceleration, were selected as inputs for model training and prediction.
Since driving decisions and execution during the driving process depend on driving intentions, which in turn influence the vehicle’s future trajectory, driving intentions were also converted into the one-hot encoded format and included as feature parameters in the model. Specifically, the encoding is as follows: (1,0,0) represents a left-lane change, (0,1,0) represents car-following behavior, and (0,0,1) represents a right-lane change.
In this context, $X = [x_{T-N+1}, \ldots, x_{T-1}, x_T]$ represents the input to the model, and $x_T$ denotes the input at time step $T$; the parameter $N$ is the length of the input historical trajectory. Each frame is $x_T = [X_T, Y_T, v_{xT}, v_{yT}, a_{xT}, a_{yT}, I_T]$, where $X_T$ and $Y_T$ represent the longitudinal and lateral positions at time $T$, $v_{xT}$ and $v_{yT}$ the longitudinal and lateral velocities, $a_{xT}$ and $a_{yT}$ the longitudinal and lateral accelerations, and $I_T$ the driving intention at time $T$.
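As a hedged illustration of assembling one input frame with the one-hot intention code, the helper below is hypothetical — the function name and label strings are ours, not the paper's:

```python
import numpy as np

# One-hot intention codes as described in the text:
# (1,0,0) left-lane change, (0,1,0) car following, (0,0,1) right-lane change.
INTENTIONS = {
    "left_lane_change":  (1, 0, 0),
    "car_following":     (0, 1, 0),
    "right_lane_change": (0, 0, 1),
}

def make_frame(x, y, vx, vy, ax, ay, intention):
    """Return one 9-dimensional feature vector for a single time step:
    position, velocity, acceleration, plus the one-hot intention."""
    return np.array([x, y, vx, vy, ax, ay, *INTENTIONS[intention]],
                    dtype=float)
```

Stacking $N$ such frames per sample yields the (samples, time steps, features) tensor the LSTM expects.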
Additionally, to explore the impact of driving style on the prediction of future vehicle trajectories, driving style was included as a new feature parameter in the LSTM network, alongside historical trajectory data and driving intentions, as part of the model's input. For clustering driving styles, this study selected three feature parameters (the lateral-velocity variation coefficient, the acceleration impact coefficient, and the collision time coefficient) and used the k-means clustering algorithm to divide driving styles into three categories: conservative, normal, and aggressive. Before being input into the model, the three driving styles were converted into one-hot encoded format, where (1,0,0) represents a conservative driving style, (0,1,0) a normal driving style, and (0,0,1) an aggressive driving style.
In this context, $S_T$ represents the driving style at time $T$.
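The style-clustering step can be sketched as follows. The minimal Lloyd's k-means below stands in for whichever implementation the authors used, and the ordering of the three feature columns is only assumed to match the list above:

```python
import numpy as np

def kmeans(X, k=3, iters=100, seed=0):
    """Minimal Lloyd's k-means: returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # assign every point to its nearest center (squared distance)
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(1)
        # move each non-empty cluster's center to the mean of its points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def encode_styles(features):
    """Cluster (n_drivers, 3) style features -- assumed columns: lateral-
    velocity variation coef., acceleration impact coef., collision time
    coef. -- into 3 groups and one-hot encode each label, mirroring the
    (1,0,0)/(0,1,0)/(0,0,1) codes above."""
    return np.eye(3)[kmeans(features, k=3)]
```

Which cluster ends up labeled "conservative", "normal", or "aggressive" must be decided afterwards by inspecting the cluster centers, since k-means labels are arbitrary.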
The trajectory sample data were divided into testing and training sets in a 1:4 ratio, and the LSTM model was trained on the constructed time-series data. Based on the research team's previous work on the optimal recognition window length for driving intention [37], historical trajectory data with a duration of 2 s were selected as the model input. The future trajectory was predicted using a sliding time window, with each slide corresponding to 1 frame of trajectory data (a step size of 1).
The model predicted the next frame of trajectory data based on 2 s of historical trajectory data. The predicted trajectory data from this step was then combined with the previous historical trajectory data, excluding the first frame, to form a new input. This iterative process continued to predict trajectory data for future time steps.
This method allows the model to continuously update its predictions as new data becomes available, making it possible to generate a more accurate trajectory prediction over time by leveraging recent historical information and driving style influences.
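The iterative sliding-window scheme described above can be sketched as below; `iterative_predict` and the callable `model` interface are illustrative stand-ins for the trained LSTM's `predict` call:

```python
import numpy as np

def iterative_predict(model, history, n_future):
    """Roll the model forward one frame at a time.

    history  : (n_hist, n_features) array, e.g. 2 s of trajectory frames
    model    : callable mapping a (1, n_hist, n_features) window to the
               next (n_features,) frame (stands in for model.predict)
    n_future : number of future frames to generate
    """
    window = history.copy()
    preds = []
    for _ in range(n_future):
        nxt = model(window[None, ...])         # predict the next frame
        preds.append(nxt)
        window = np.vstack([window[1:], nxt])  # slide: drop oldest frame
    return np.array(preds)
```

Because each predicted frame feeds the next prediction, errors accumulate over the horizon, which is why long-term accuracy degrades faster than short-term accuracy.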
4.2. Model Evaluation Metrics
To accurately assess the performance of the LSTM model in trajectory prediction, the mean absolute error (MAE) and root-mean-square error (RMSE) are used as evaluation metrics. The MAE measures the average magnitude of the prediction errors, regardless of their direction; it is a linear score in which all individual differences are weighted equally. The RMSE is the square root of the average of the squared differences between predicted and actual values; unlike the MAE, it penalizes larger errors more heavily because the differences are squared, making it more sensitive to outliers. The MAE and RMSE, computed separately for the longitudinal and lateral coordinates, are expressed in Equations (13) and (14):

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \qquad (13)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (14)$$

In these equations, $n$ represents the number of samples, $y_i$ denotes the actual values, and $\hat{y}_i$ denotes the predicted values (with an analogous pair for the other coordinate). The smaller the MAE and RMSE values, the closer the model's predictions are to the true values, indicating better predictive performance.
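The two metrics are a direct NumPy transcription of their standard definitions:

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error: average unsigned deviation."""
    return np.mean(np.abs(np.asarray(actual) - np.asarray(predicted)))

def rmse(actual, predicted):
    """Root-mean-square error: penalizes large errors more heavily."""
    return np.sqrt(np.mean((np.asarray(actual) - np.asarray(predicted)) ** 2))
```

For trajectory evaluation these would be applied separately to the longitudinal and lateral coordinate series.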
4.3. Model Parameter Settings
The Adam algorithm is a widely used optimization algorithm in deep learning. Compared with traditional gradient descent methods, Adam adapts the learning rate using first- and second-moment estimates of the gradient, adjusting the step size per parameter and thereby accelerating convergence and improving training. Based on the characteristics of the vehicle's driving trajectory, Adam is used as the optimizer, with the root-mean-square error (RMSE) as the loss function for model training and optimization. An Adam update consists of the following steps: estimating the momentum of the gradient (first-moment estimate), estimating the momentum of the squared gradient (second-moment estimate), applying bias correction, and updating the model parameters. Through these steps, Adam dynamically adjusts each parameter's learning rate during training, effectively balancing convergence speed and stability and improving training efficiency and performance. Additionally, the selection of hyperparameters significantly influences the network's performance, convergence speed, and generalization ability. The key hyperparameters in the model are set as follows:
Learning rate: In an LSTM network, the learning rate is a critical parameter that controls the step size of parameter updates during training. To ensure good learning performance, an adaptive-decay learning rate is used: the initial learning rate is set to 0.001, and if the training loss does not improve over 10 epochs, the learning rate is multiplied by a decay factor of 0.2. This strategy helps ensure stable convergence of the model.
Training epochs: A training epoch refers to the process of passing all the training data through the network once and performing a single parameter update. The number of training epochs affects the model’s ability to learn data features. With too few epochs, the model may not fully learn the data’s features, leading to underfitting. Conversely, too many epochs can cause the model to overfit the training data. Considering the number of trajectory samples used in this study, the maximum number of iterations is set to 300 epochs.
The above parameter settings are intended to optimize the model's learning process, balancing adequate training against overfitting and thereby yielding a robust, generalizable model.
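The Adam update steps and the adaptive-decay learning-rate policy described above can be sketched in plain NumPy/Python. This is an illustrative reimplementation under the stated settings (lr 0.001, factor 0.2, patience 10), not the Keras internals:

```python
import numpy as np

def adam_update(theta, grad, m, v, t, lr=0.001,
                beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: first-moment estimate, second-moment estimate,
    bias correction, then the parameter update (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad       # first moment of gradient
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment of gradient
    m_hat = m / (1 - beta1 ** t)             # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

class PlateauDecay:
    """Adaptive decay: multiply the learning rate by `factor` after
    `patience` epochs without loss improvement (mirrors the behavior
    of Keras' ReduceLROnPlateau callback)."""
    def __init__(self, lr=0.001, factor=0.2, patience=10):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best, self.wait = float("inf"), 0
    def on_epoch_end(self, loss):
        if loss < self.best:
            self.best, self.wait = loss, 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.lr *= self.factor
                self.wait = 0
        return self.lr
```

In the actual model these mechanics are supplied by the Adam optimizer and the learning-rate callback of the Keras/TensorFlow stack rather than hand-written code.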
6. Conclusions
The accurate prediction of vehicle-driving trajectories can support driving risk assessment and assistive driving systems. In this study, an LSTM network based on the Keras framework was constructed on the TensorFlow deep-learning platform. The impact of driving intention on trajectory prediction was considered by incorporating it, along with historical trajectory data, as input to the LSTM network. Additionally, to study the influence of driving style on the prediction results, the lateral and longitudinal MAE and RMSE metrics were used to evaluate the model's predictive performance, and a comparative study of different models was conducted. The following conclusions were drawn:
(1) A comparison of the MAE and RMSE values shows that the LSTM-based vehicle-trajectory prediction model developed in this study achieves high prediction performance; the predicted trajectory is essentially consistent with the actual trajectory.
(2) When driving style was added as a new parameter to the LSTM model, trajectory predictions that considered driving style proved more accurate, indicating that driving style has a significant impact on driving behavior and vehicle trajectories. However, as the prediction horizon increases, accuracy decreases, with noticeable errors emerging after 5 s.
(3) Compared with the kinematics-based CTRA model, the LSTM-based model achieves better predictive performance over long prediction horizons. Compared with the spatiotemporal LSTM trajectory prediction model, the model proposed in this study has significant advantages over short prediction horizons (less than 3 s); however, as the horizon grows, the prediction errors of both models increase rapidly.
The trajectory prediction developed in this study can enhance drivers' ability to judge and respond to danger. By predicting the driving trajectory and analyzing the motion state and positions of the target vehicle and surrounding vehicles over the prediction horizon, possible collision risks can be assessed, helping drivers improve driving safety and comfort. Trajectory prediction also benefits the optimization of advanced driver-assistance systems and promotes the development of higher-level autonomous driving technology. However, the model in this study was trained on highway driving trajectories, and its long-term predictive performance still needs improvement. Future research could improve predictive performance by further improving the accuracy of the raw data and exploring more complex models, such as other deep-learning models (e.g., GRU) or ensemble learning methods (e.g., gradient-boosted trees). The model's application scope could also be extended to scenarios such as urban roads, intersections, and ramps.