Article

Bus Schedule Time Prediction Based on LSTM-SVR Model

Zhili Ge, Linbo Yang, Jiayao Li, Yuan Chen and Yingying Xu
1 School of Mathematics and Information Science, Nanjing Normal University of Special Education, Nanjing 210048, China
2 School of Information Science and Engineering, Southeast University, Nanjing 211189, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(22), 3589; https://doi.org/10.3390/math12223589
Submission received: 10 October 2024 / Revised: 10 November 2024 / Accepted: 14 November 2024 / Published: 16 November 2024

Abstract

With the acceleration of urbanization, urban bus scheduling systems are facing unprecedented challenges. Traditional bus scheduling provides the original schedule time and the planned time of arrival at the destination, where the schedule time is the departure time of the bus. However, various factors encountered during the drive result in significant differences in the driving time of the bus. To ensure timely arrivals, the bus scheduling system has to rely on manual adjustments to optimize the schedule time to determine the actual departure time. In order to reduce the scheduling cost and align the schedule time closer to the actual departure time, this paper proposes a dynamic scheduling model, LSTM-SVR, which leverages the advantages of LSTM in capturing the time series features and the ability of SVR in dealing with nonlinear problems, especially its generalization ability in small datasets. Firstly, LSTM is used to efficiently capture features of multidimensional time series data and convert them into one-dimensional effective feature outputs. Secondly, SVR is used to train the nonlinear relationship between these one-dimensional features and the target variables. Thirdly, the one-dimensional time series features extracted from the test set are put into the generated nonlinear model for prediction to obtain the predicted schedule time. Finally, we validate the model using real data from an urban bus scheduling system. The experimental results show that the proposed hybrid LSTM-SVR model outperforms LSTM-BOA, SVR-BOA, and BiLSTM-SOA models in the accuracy of predicting bus schedule time, thus confirming the effectiveness and superior prediction performance of the model.
MSC:
68T07; 90B20; 90C90

1. Introduction

The bus is one of the most significant modes of transportation in modern cities. The urbanization of recent decades has led to a rapid expansion of the urban population, which places great pressure on bus systems. Traditional bus operation modes have struggled to keep pace with the growing traffic demand, resulting in traffic congestion and long waiting times for passengers. Therefore, optimizing the operation of public transportation to improve operational efficiency and service quality has become an urgent problem that public transportation companies need to solve. Current approaches to optimizing public transportation operations mainly focus on predicting bus arrival time, bus driving time, and bus departure time, that is, the bus schedule time.

1.1. Related Work

To address this problem, studies on traditional methods have been presented in the literature [1,2,3]. Tirachini et al. proposed a multivariate regression model for bus arrival time prediction based on factors affecting bus travel in an urban network (e.g., delays caused by traffic signals). The method predicts time accurately, but the model is complicated to build and sensitive to outliers [1]. Suwardo et al. proposed an Autoregressive Integrated Moving Average (ARIMA) time series model for predicting bus driving time. A suitable time series model was selected by minimizing the Mean Absolute Relative Error (MARE) and the Mean Absolute Percentage Prediction Error (MAPPE). However, the ARIMA model is constrained in its ability to process complex nonlinear patterns and requires a stationary dataset [2]. Shang et al. utilized the inverse variance function to study bus scheduling, but the method faces challenges such as incomplete consideration of the factors influencing bus operations, strong data dependence, and limited applicability to specific scenarios [3]. Many of these traditional methods involve complicated calculations and are highly sensitive to the data, which can limit prediction accuracy.
With the development of big data, intelligent algorithms based on Machine Learning (ML) are playing an increasingly important role in various fields [4,5], including innovations in bus systems. Kviesis et al. compared Support Vector Regression (SVR) with other regression models for bus arrival time prediction, showing that SVR outperformed the other models [6]. Alam et al. predicted the occurrence of arrival time irregularities by mining GPS coordinates of transit buses provided by the Toronto Transit Commission (TTC) along with hourly weather data and feeding these data into ML models that they developed. They found that their Long Short-Term Memory Recurrent Neural Network (LSTM) model demonstrated the best prediction accuracy [7]. Chu et al. proposed a new deep learning model for bus arrival time, called the Deep Encoder Cross Network (DECN), to improve estimated time of arrival (ETA) prediction based on multiple non-distance-based factors, such as weather, road speed and congestion, and traffic composition [8]. He et al. developed an LSTM model for accurately predicting driving time across different segments of a bus route. However, the model still lacks flexibility and maturity, and many of its parameters were not tuned [9]. Jargalsaikhan et al. investigated ML methods for predicting bus driving time; concretely, they employed three regression methods on bus travel data: linear regression (LR), SVR, and an Artificial Neural Network (ANN). The performances of these ML methods were estimated and compared using conventional measures such as mean absolute error and root mean squared error [10]. Dunne et al. aimed to validate ML methods recently shown to be effective in the literature on new bus datasets from Dublin and Genoa. The analysis of the results revealed some interesting insights into bus networks, highlighting that the accuracy of the predictions is strongly related to the standard deviation of the overall driving time [11]. Wai et al. presented a novel ML modeling approach that combines model-generated dwell and travel time (or run time) predictions to produce predicted bus departure times [12].
Although single methods are favored for their simplicity, training efficiency, and real-time responsiveness, they have limitations in terms of generalization ability, resistance to noise, and dependence on feature engineering. In practice, multiple learning methods can be flexibly combined to capitalize on their respective advantages according to specific needs and available resources.
In order to overcome the shortcomings of single methods, hybrid intelligence algorithms have emerged, which realize complementary and enhanced performance by integrating the advantages of multiple algorithms. Liu et al. proposed a novel hybrid neural network, ANN-LSTM, based on spatio-temporal feature vectors for integrated prediction, which predicts long-distance arrival from the temporal feature dimension and short-distance arrival from the spatial feature dimension. The results demonstrated that the algorithm achieves high accuracy in bus arrival prediction, and outperforms single neural network models in both accuracy and arrival time prediction [13]. Hashi et al. proposed a new robust model that forecasts the arrival time of buses by using SVM and the Kalman filtering algorithm [14]. Zhou et al. proposed a Variational Mode Decomposition (VMD)-based LSTM model for predicting future bus arrival times over multiple time scales. This method reduced computational complexity while better capturing similar bus link operating speed patterns compared with the LSTM model [15]. Similar related work was presented in [16,17,18,19]. Liu et al. proposed a Kalman filter–LSTM model to predict bus driving time and analyzed the characteristics of driving time using historic Automatic Vehicle Location (AVL) data. The experimental results proved that the model can better predict driving time and provided the operator with an effective and optimized scheduling plan based on the punctuality of the bus during specific operating hours [20]. Jiang et al. proposed an LSTM model to predict bus driving time, considering incomplete data. To improve the model performance in terms of accuracy and efficiency, a Genetic Algorithm (GA) was developed and applied to optimize the hyperparameters of the LSTM model [21]. Zhou et al. introduced a hybrid Improved Seagull Optimization Algorithm (ISOA)–LSTM model, which was applied to the prediction analysis of actual transit operation data by direction and time period. In addition, this study explored a Bidirectional LSTM (BiLSTM) model for predicting bus operation status using both historical and future data. The proposed model with an attention mechanism exhibited high prediction accuracy, broadening its practical applicability [22]. Xie et al. proposed a Deep Reinforcement Learning-based dynamic bus Timetable Scheduling method with Bidirectional Constraints (DRL-TSBC), using a Deep Q-Network (DQN) to determine, every minute, whether buses depart in both directions. The experimental results demonstrated that the model has significant advantages in reducing passenger waiting times and adapting to changes in passenger flow [23].
These hybrid intelligent algorithms have achieved significant results in optimizing bus arrival times, driving times, and schedule times. However, few studies have integrated multiple models to investigate dynamic schedule times. Current bus scheduling systems operate with a predetermined departure time and an estimated arrival time at the destination, based on historical data. However, various factors can introduce significant variability in driving times along routes. To ensure on-time arrivals, scheduling centers often invest resources in manually adjusting schedule times to minimize deviations from departure times, which increases operational costs. Accurately predicting schedule times to ensure on-time arrivals is highly significant for reducing labor costs and improving operational efficiency.

1.2. Our Contribution

To solve the above problem, this paper proposes a novel hybrid intelligent algorithm, named LSTM-SVR. The LSTM algorithm, as an excellent representative of temporal recurrent neural networks, plays a central role in this paper. It accurately captures the time-dependent and sequential patterns in real-time bus schedule data, reveals the regularity and dynamic changes of bus operation, and provides a basis for subsequent schedule time prediction. Faced with the complex time series features of bus scheduling data, LSTM can automatically extract key features and optimize the feature extraction process through its unique gating mechanism. SVR, as a regression model based on SVM, effectively improves prediction accuracy by mapping the nonlinear problem to a high-dimensional space for linear processing through the kernel trick. The significant advantage of SVR lies in its powerful ability to deal with nonlinear problems and its excellent generalization performance on small sample datasets. The hybrid model fully utilizes the advantages of LSTM in capturing the time-dependent and sequential patterns in bus scheduling data, and of SVR in dealing with nonlinear regression problems. The main contributions of this paper are as follows:
  • The model proposed in this paper is an innovative two-stage hybrid LSTM-SVR prediction framework. In the first stage, we fully utilize LSTM’s expertise in time series feature extraction to extract one-dimensional key time series features. In the second stage, using SVR, we combine these features with actual driving time to build an accurate nonlinear prediction model. The main advantage of this approach is its ability to effectively avoid overfitting, which is a particularly common problem in LSTM models, especially when the training data are limited. SVR excels in dealing with nonlinear problems and demonstrates excellent generalization capabilities even on small datasets. By integrating the advantages of LSTM and SVR, the hybrid model not only improves the prediction accuracy, but also is more comprehensive and efficient, which is especially suitable for applications in public transportation scheduling.
  • A numerical experiment was conducted using real data from an urban bus scheduling system to validate the effectiveness and stability of the proposed model. Results show that this model surpasses the LSTM-BOA, SVR-BOA, and BiLSTM-SOA models in prediction accuracy and stability, offering a scientific foundation for bus companies and managers to improve operational efficiency.
The rest of this paper is organized as follows. In Section 2, we provide a detailed problem description including the workflow of the bus scheduling system, the related dataset, and the process of the proposed hybrid model. In Section 3, we give some numerical experiments and verify the effectiveness of the proposed model. Finally, in Section 4, we highlight the key strengths of this research and its potential for generalization, along with a comprehensive summary of the study’s core contributions.

2. Materials and Models

In this section, we mainly introduce the problem description and the preparation of the dataset and models used in the subsequent numerical experiments.

2.1. Problem Description

In the daily operation of a bus scheduling system, the scheduled time for departure from the origin station and the expected arrival time at the destination are preset. Ideally, if no unexpected events occur during the journey, the actual departure time of the bus will coincide with the scheduled time. However, in reality, various circumstances such as weather changes, traffic congestion, or traffic accidents can impact the bus trip. These disruptions may lead to variations in the actual departure time of the current shift, driving time of the previous shift, cycle times, the rest time between two consecutive shifts of the bus, and the interval time between this bus and the previous bus. Such factors can cause discrepancies between the planned and actual arrival times. To ensure timely arrivals, bus companies often rely on the scheduling system to adjust schedule times based on real-time conditions.
In order to reduce these adjustments and improve operational efficiency, this paper utilizes real data collected from bus systems and applies deep learning techniques to develop a dynamic scheduling approach. This dynamic scheduling method reduces manual scheduling, optimizes the scheduling process, and aims to align the schedule time as closely as possible with the actual departure time. The specific workflow is illustrated in Figure 1.

2.2. Dataset

In order to verify the reliability and validity of the hybrid LSTM-SVR model, the relevant data used in this paper were obtained from a representative (long distance, numerous stops, and high passenger flow) route in an urban bus scheduling system. The dataset includes the cycle times (times), driving time of the previous shift (minutes), rest time between two consecutive shifts of the bus (minutes), interval time between this bus and the previous bus (minutes), actual departure time of the current shift (hour: minute: second), and actual driving time of the current shift (minutes). Table 1 presents the variables utilized in the subsequent analysis, with input variables directly obtained from the dataset. Taking the data from the departure station in winter as an example, the selected dataset contains 6838 samples. To facilitate the training and validation of our LSTM-SVR model, the dataset is divided, with 80% allocated for model training and 20% for testing. The main purpose of splitting the dataset into a training set and a test set is to evaluate the generalization ability of the model. The training set is used to train the model so that it can extract features from the input data and optimize its parameters. The test set is used to evaluate the performance of the model on unseen data and to determine whether the model generalizes well without overfitting. Studies in [24,25] have shown that the 80%:20% training/test split generally yields optimal classification performance and balanced generalization. This ratio allows models to learn effectively with the support of sufficient training data, while maintaining reliable test data for performance evaluation, making it a widely recommended approach. As a result, the training set comprises 5470 samples, while the testing set contains 1368 samples. This ratio has been chosen to guarantee that the model can be trained on a sufficiently large dataset, while also allowing for robust performance assessment on unseen data.
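As a minimal illustration of this split, the 80%/20% division can be reproduced with scikit-learn; the DataFrame below is a synthetic stand-in for the real dataset, the column names are hypothetical, and shuffle=False (a chronological split) is an assumption, since the paper does not state whether the samples are shuffled.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the winter departure-station dataset (6838 samples);
# column names loosely follow Table 1 and are hypothetical.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cycle_times": rng.integers(1, 10, 6838),
    "prev_driving_time": rng.normal(55, 8, 6838),
    "rest_time": rng.normal(12, 4, 6838),
    "interval_time": rng.normal(10, 3, 6838),
    "departure_minutes": rng.uniform(300, 1380, 6838),   # departure time encoded as minutes
    "actual_driving_time": rng.normal(55, 8, 6838),
})

X = df.drop(columns="actual_driving_time")
y = df["actual_driving_time"]

# 80%/20% split as in the paper; shuffle=False is an assumption.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
print(len(X_train), len(X_test))   # 5470 and 1368
```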

2.3. Models

LSTM and SVR are two prevalent intelligent algorithms in machine learning. LSTM excels in extracting features from time series data, whereas SVR specializes in modeling deeper nonlinear relationships from these extracted features. It is noted that SVR can avoid overfitting while maintaining robust predictive performance on unseen data. Consequently, this paper primarily focuses on the combination of LSTM and SVR. In the following, we provide a concise introduction to these two models.

2.3.1. LSTM

LSTM is an improved RNN structure, proposed by Hochreiter and Schmidhuber [26], specifically designed to address the vanishing and exploding gradients encountered by traditional RNNs when processing long sequence data. Its key innovation lies in overcoming the difficulty traditional RNNs have in capturing long-term dependencies, which makes LSTM a powerful tool for processing time series data. As illustrated in [4], the structure of LSTM is shown in Figure 2.
In Figure 2, $f_t$ denotes the forget gate, which keeps the cell state up to date, manages which existing information should be retained or forgotten, and ensures that only relevant information is stored. $i_t$ denotes the input gate, which updates the cell state and decides what information to include in the current cell. $o_t$ denotes the output gate, which decides what information is eventually output to the next layer or as the output of the current LSTM cell. The network's current input is represented by $x_t$, while $h_{t-1}$ signifies the output from LSTM's hidden layer at the previous time step. Similarly, $C_{t-1}$ refers to the cell state at the previous time step. LSTM's outputs at time $t$ are denoted by $h_t$ and $C_t$.
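For completeness, the standard LSTM update equations corresponding to Figure 2, following the formulation in [26] (where $\sigma$ is the logistic sigmoid, $\odot$ denotes element-wise multiplication, and the $W$ matrices and $b$ vectors are learned parameters), are:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f), \qquad i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C), \qquad C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o), \qquad h_t = o_t \odot \tanh(C_t)$$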
In this paper, the actual driving time (minutes) of a bus is affected by several relevant time variables, including cycle times (times), driving time of the previous shift (minutes), rest time between two consecutive shifts of the bus (minutes), interval time between the current bus and the previous one (minutes), and actual departure time of the current shift (hour: minute: second). There is a significant temporal dependence among these variables. Through its intrinsic memory cells, the LSTM model is able to extract key time series features from these time series data and uncover potential relationships between the input variables and the target variables.
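The paper does not detail how the five input variables are arranged into LSTM input tensors; one common choice, shown below purely as an assumption, is a sliding window that stacks the previous T observations into a (samples, time steps, features) array.

```python
import numpy as np

def make_windows(data: np.ndarray, targets: np.ndarray, T: int):
    """Stack the previous T rows of `data` as one LSTM sample.

    data:    array of shape (n, 5) with the five input variables of Table 1
    targets: array of shape (n,) with the actual driving time
    Returns X of shape (n - T, T, 5) and y of shape (n - T,).
    """
    X = np.stack([data[i:i + T] for i in range(len(data) - T)])
    y = targets[T:]
    return X, y

# Toy example with a hypothetical window length of 8 shifts.
data = np.random.rand(100, 5)
targets = np.random.rand(100)
X, y = make_windows(data, targets, T=8)
print(X.shape, y.shape)   # (92, 8, 5) (92,)
```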

2.3.2. SVR

SVR is a regression model based on SVM. The goal of SVR is to construct a function that not only fits the given training data but also maintains the model’s simplicity. The model corresponding to SVR can be expressed as:
$$f(x) = w^{T}x + b$$
where $w$ and $b$ are the model parameters and $x$ is the input vector. This function defines a linear regression model, aiming to minimize the discrepancy between the predicted and actual data. When dealing with a nonlinear regression problem, SVR can achieve linear regression in high-dimensional space by introducing kernel functions (e.g., the Gaussian radial basis kernel, polynomial kernel, etc.), which map the data from a low-dimensional space to a high-dimensional space [27,28]. The use of kernel functions significantly expands the applicability of SVR, enabling it to effectively handle complex nonlinear data relationships.
This paper demonstrates the effectiveness of SVR in accurately predicting target variables. SVR’s nonlinear modeling capability, excellent generalization performance, and adaptability to complex data structures provide a robust foundation for predictive modeling. The application of SVR ensures that the model is able to handle complex input features and achieve accurate prediction of the target variables.
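As a brief sketch, an SVR with a Gaussian (RBF) kernel can be fitted with scikit-learn as follows; the feature array, driving times, and parameter values are illustrative placeholders, not the data or the tuned settings of this paper.

```python
import numpy as np
from sklearn.svm import SVR

# Illustrative data: one-dimensional features and driving times in minutes.
rng = np.random.default_rng(0)
features = rng.random((200, 1))
driving_time = 50 + 10 * features.ravel() ** 2 + rng.normal(0, 1, 200)  # nonlinear target

# The Gaussian (RBF) kernel implicitly maps the inputs to a high-dimensional space;
# C, epsilon, and gamma are illustrative values, not the tuned parameters of the paper.
model = SVR(kernel="rbf", C=10.0, epsilon=0.5, gamma="scale")
model.fit(features, driving_time)

print(model.predict(features[:5]))
```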

2.3.3. LSTM-SVR

LSTM specializes in handling time series data, effectively capturing long-term dependencies and complex time series features in the data, while SVR excels in handling nonlinear regression problems. By combining LSTM and SVR, we can effectively solve the problems of long-term dependencies, nonlinear relationships, data noise, and the model’s generalization ability in time series data. The combination of the two models not only improves the accuracy of model prediction, but also enhances the adaptability to the complex data environment, which provides a strong support for better prediction of bus schedule times. The workflow of the LSTM-SVR model is divided into the following four steps, which are shown in Figure 3.
Step 1: Input dataset. The input dataset includes cycle times, driving time of the previous shifts, rest time between two consecutive shifts of the bus, interval time between the current bus and the previous one, actual departure time of different shifts, and actual driving time of current shift.
Step 2: Data preprocessing. We use the multiple interpolation strategy of the Random Forest algorithm and the method of forward padding to address missing data. For outlier data, the Inter-Quartile Range (IQR) method is utilized for calibration and correction. Finally, using standard score (Z-score) standardization to eliminate different scales of variables, we normalize the data for the subsequent analysis and prediction.
Step 3: LSTM-SVR hybrid model. In LSTM, we choose the Bayesian Optimization Algorithm (BOA), while the Grid Search algorithm (GS), BOA, Genetic Algorithm (GA), and Random Search Optimization (RSO) are used to optimize SVR. The model is constructed in two stages. In the first stage, the following variables from the dataset are selected as independent variables: cycle times, driving time of the previous shift, rest time between two consecutive shifts of the bus, interval time between the current bus and the previous bus, and actual departure time of the current shift. The actual driving time of the current shift serves as the dependent variable. These variables are used to train the LSTM model, from which one-dimensional time series features are extracted. In the second stage, the extracted features are employed as the independent variable, while the actual driving time of different shifts serves as the dependent variable. These are then used to train the SVR model, resulting in a nonlinear relationship between the extracted features and the driving time of different bus shifts. A simplified sketch of this two-stage procedure is given below.
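In this sketch, the layer size, sequence length T, and placeholder arrays are illustrative assumptions and do not reproduce the exact architecture and tuned hyperparameters of Section 3.3.

```python
import numpy as np
from tensorflow import keras
from sklearn.svm import SVR

# Placeholder arrays: N samples, T time steps, 5 input variables (Table 1).
N, T, n_features = 1000, 8, 5
X = np.random.rand(N, T, n_features)       # scaled input sequences (illustrative)
y = np.random.rand(N)                      # scaled actual driving time (illustrative)

# Stage 1: train an LSTM whose one-dimensional output serves as the extracted
# time series feature (the dependent variable is the actual driving time).
inputs = keras.Input(shape=(T, n_features))
hidden = keras.layers.LSTM(64)(inputs)                 # layer size is illustrative
feature = keras.layers.Dense(1, name="feature")(hidden)
lstm_model = keras.Model(inputs, feature)
lstm_model.compile(optimizer="adam", loss="mse")
lstm_model.fit(X, y, epochs=10, batch_size=36, verbose=0)

train_features = lstm_model.predict(X, verbose=0)      # shape (N, 1)

# Stage 2: fit SVR on the extracted features versus the actual driving time.
svr = SVR(kernel="rbf")
svr.fit(train_features, y)

# Prediction on new sequences follows the same path: LSTM features, then SVR.
pred_driving_time = svr.predict(lstm_model.predict(X[:5], verbose=0))
```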
Step 4: Output the predicted schedule time. The extracted time series features from the test set are put into the nonlinear model generated in Step 3 to make predictions; i.e., the predicted driving time is obtained. The predicted schedule time is obtained by subtracting the predicted driving time from the planned arrival time.
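As a small worked example of Step 4 (the planned arrival time and predicted driving time below are illustrative values), the schedule time is a simple time subtraction:

```python
from datetime import datetime, timedelta

planned_arrival = datetime.strptime("08:10:00", "%H:%M:%S")
predicted_driving_minutes = 54.42   # illustrative LSTM-SVR output (minutes)

# Predicted schedule time = planned arrival time - predicted driving time.
predicted_schedule = planned_arrival - timedelta(minutes=predicted_driving_minutes)
print(predicted_schedule.strftime("%H:%M:%S"))   # 07:15:34
```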

3. Experiments

This section introduces four aspects: experimental setup, evaluation metrics, experimental design, and result analysis. In Section 3.1, the data related to the models are preprocessed, including missing data processing, outlier data processing, and data normalization. In Section 3.2, the evaluation metrics are used to assess the validity of the proposed model. Section 3.3 presents the parameter selection required for the LSTM-SVR model. Finally, we provide the specific experimental results in Section 3.4.

3.1. Experimental Setup

In this subsection, the data used in the model are preprocessed for further experimental analysis.
  • Missing data processing
Missing data are common in datasets. The issue often originates from technical faults or human errors during data collection, resulting in incomplete records. In addition, changes in the external environment or observation conditions may also cause missing data. In data analysis, it is crucial to choose appropriate imputation methods to ensure data quality and model reliability. Random Forest and K-Nearest Neighbor are popular approaches for imputation [29]. K-Nearest Neighbor relies on local data similarity, while Random Forest combines global and local patterns through decision trees, offering better adaptability and robustness, especially with complex, nonlinear data [30,31]. Therefore, a multiple imputation strategy based on the Random Forest algorithm is employed to handle the missing data; in this paper, the IterativeImputer function from the sklearn library is used to implement multiple imputation. For the actual departure time of the current shift, which has a temporal order and similarity between adjacent time points, the forward filling method is applied, i.e., the valid value from the previous time point is used to fill the current missing value.
  • Outlier data processing
The outlier data are processed to enhance the quality of the dataset while reducing the impact of anomalies [32]. Given that indicators such as actual driving time and actual departure time reflect the continuity and efficiency of vehicle operation, respectively, and these indicators exhibit skewed distributions, the IQR method is employed to quantitatively identify and correct anomalous data points.
  • Data normalization
To ensure fair and accurate model assessment, the Z-score standardization method [33] is used to standardize data with different scales, eliminating the influence of varying magnitudes among variables. A condensed code sketch of these preprocessing steps is given after this list.
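The sketch below uses a toy DataFrame with hypothetical column names and illustrative imputer settings; the IQR correction is shown here as clipping to the IQR fences, which is one possible implementation of the calibration described above.

```python
import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import StandardScaler

# Toy frame standing in for the real dataset; NaN/None mark missing records.
df = pd.DataFrame({
    "cycle_times":       [3, 4, np.nan, 5, 4, 6],
    "prev_driving_time": [52.0, np.nan, 58.0, 61.0, 55.0, 90.0],
    "rest_time":         [10.0, 12.0, 11.0, np.nan, 9.0, 13.0],
    "interval_time":     [8.0, 9.0, 10.0, 9.0, np.nan, 8.0],
    "departure_time":    ["06:40:00", None, "07:10:00", "07:25:00", "07:40:00", "07:55:00"],
})

# 1. Missing data: Random-Forest-driven multiple imputation for numeric columns,
#    forward filling for the departure-time column.
num_cols = ["cycle_times", "prev_driving_time", "rest_time", "interval_time"]
imputer = IterativeImputer(estimator=RandomForestRegressor(n_estimators=50), random_state=0)
df[num_cols] = imputer.fit_transform(df[num_cols])
df["departure_time"] = df["departure_time"].ffill()

# 2. Outliers: clip skewed indicators to the IQR fences (illustrative correction).
q1, q3 = df["prev_driving_time"].quantile([0.25, 0.75])
iqr = q3 - q1
df["prev_driving_time"] = df["prev_driving_time"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# 3. Normalization: Z-score standardization of the numeric inputs.
df[num_cols] = StandardScaler().fit_transform(df[num_cols])
print(df)
```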
All numerical experiments in this paper were conducted on an Intel Core i5-12450H CPU at 2.00 GHz with 16 GB of physical RAM. The PC was running Python version 3.11 on a 64-bit Windows 11 Home Chinese Edition operating system. We mainly used the following Python libraries:
  • NumPy: A numerical computation library for efficient processing of multi-dimensional arrays and matrices, providing a rich set of mathematical functions suitable for large-scale data operations.
  • Pandas: A powerful data analysis and processing library that provides a flexible data structure (e.g., DataFrame) for easy data loading, cleaning, and transformation.
  • Matplotlib: A commonly used plotting library for creating various static, dynamic, and interactive charts and graphs, enabling data visualization.
  • Scikit-learn: A machine learning library that provides a wide range of algorithms and tools for data preprocessing, model training, evaluation, and parameter optimization.
  • TensorFlow and Keras: Frameworks used for building and training LSTM models to capture complex patterns in time series data.
  • DEAP: An evolutionary computation library mainly used to implement optimization algorithms such as the genetic algorithm, facilitating the solution of complex optimization problems.
  • Scikit-optimize (skopt): A hyperparameter tuning library based on Bayesian optimization, used to automatically find the optimal parameter combinations for models.

3.2. Evaluation Metrics

The predicted bus schedule time is obtained by subtracting the predicted driving time from the planned arrival time. To evaluate the deviation between the driving time predicted by the model and the actual driving time, four evaluation metrics are chosen: Mean Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) [34].
MSE reflects the difference between predicted and actual values as follows:
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(x_i - y_i)^2$$
where $n$ is the number of samples, $x_i$ is the predicted value of sample $i$, and $y_i$ is the actual value of sample $i$.
RMSE is the square root of the MSE, commonly used to assess the overall prediction error of the model as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - y_i)^2}$$
MAE is the average of the absolute differences between the predicted and actual values:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left| x_i - y_i \right|$$
MAPE is the average percentage of the absolute error relative to the actual values, given as follows:
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left| \frac{x_i - y_i}{x_i} \right| \times 100\%$$
The evaluation metrics work together to provide a comprehensive understanding of the model’s prediction performance. MSE and RMSE focus on the overall prediction accuracy of the model, MAE provides robustness analysis, and MAPE is used to assess the relative error of the model. By utilizing these evaluation metrics, the model’s effectiveness in predicting bus driving time is validated, ultimately confirming its reliability in predicting bus schedule times.
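For reference, the four metrics can be computed directly with NumPy and scikit-learn; the arrays below are illustrative values, and the MAPE denominator follows the formula above.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

y_true = np.array([47.0, 49.0, 51.0, 56.0])   # actual driving times (illustrative, minutes)
y_pred = np.array([44.3, 47.5, 51.3, 56.9])   # predicted driving times (illustrative, minutes)

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_true, y_pred)
# MAPE with the predicted value in the denominator, as in the formula above.
mape = np.mean(np.abs((y_pred - y_true) / y_pred)) * 100

print(f"MSE={mse:.4f}, RMSE={rmse:.4f}, MAE={mae:.4f}, MAPE={mape:.2f}%")
```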

3.3. Experimental Design

To assess the performance of the LSTM-SVR hybrid model, this paper employs the LSTM baseline model for comparative analysis.
Parameter selection is a key factor affecting the predictive performance of the model. In this paper, the BOA is used to optimize the parameters of the LSTM model. Compared to traditional optimization algorithms in [35], the BOA is utilized as the hyperparameter optimization method for the LSTM model due to its significant advantages when dealing with this type of problem, including high efficiency, global search capability, adaptivity, and robustness.
For the SVR model, the Gaussian kernel function is used for parameter tuning to enhance the model’s predictive performance and ensure its generalization ability. Additionally, four optimization algorithms, GS, BOA, GA, and RSO, are employed to tune the regularization parameters of the SVR model. These four optimization algorithms are applied to the LSTM-SVR model, resulting in the designations LSTM-SVR-GS, LSTM-SVR-BOA, LSTM-SVR-GA, and LSTM-SVR-RSO, respectively.
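As an illustrative sketch of this tuning step (the search ranges, placeholder data, and settings below are assumptions rather than the values used in the paper), GS and BOA for the SVR parameters C and gamma could be set up as follows:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV
from skopt import BayesSearchCV                  # scikit-optimize

rng = np.random.default_rng(0)
X = rng.random((200, 1))                         # placeholder LSTM features
y = 50 + 10 * X.ravel() + rng.normal(0, 1, 200)  # placeholder driving times

# Grid Search (GS) over a small, illustrative grid.
gs = GridSearchCV(SVR(kernel="rbf"),
                  {"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.1, 1]},
                  cv=5, scoring="neg_mean_squared_error")
gs.fit(X, y)

# Bayesian optimization (BOA) over continuous, log-uniform ranges.
boa = BayesSearchCV(SVR(kernel="rbf"),
                    {"C": (1e-2, 1e2, "log-uniform"),
                     "gamma": (1e-3, 1e1, "log-uniform")},
                    n_iter=25, cv=5, scoring="neg_mean_squared_error", random_state=0)
boa.fit(X, y)

print(gs.best_params_, dict(boa.best_params_))
```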
The structure of the LSTM model comprises an input layer, two hidden layers, a Batch Normalization (BN) layer, a dropout layer, a dense layer, and an output layer. The input layer defines the dimensionality and size of the input data, and it receives the input sequences. The two hidden layers contain 108 and 56 LSTM neurons, respectively, which are capable of capturing the temporal dependencies inherent in the sequence. The BN layer is located between the two hidden layers and standardizes the output of the previous layer. This standardization can stabilize the input distribution, accelerate the training process, and improve the model’s stability and generalization ability. The dropout layer is located after the batch normalization layer with a dropout rate of 13%. The dense layer, placed after the second hidden layer, contains 32 neurons and employs the Rectified Linear Unit (ReLU) activation function for additional feature processing. Finally, predicted results or selected features are generated by the output layer. In order to enhance the effectiveness of the LSTM, the Adam optimizer is employed. Compared to traditional optimizers, the Adam optimizer stands out due to its advantages of adaptive learning rate adjustment, fast convergence, adaptability to sparse gradients, reduced risk of overfitting, and enhanced stability during the training process. The empirical results show that Adam works well in practice and is superior to Stochastic Gradient Descent and Root Mean Square Propagation optimization methods when it comes to prediction accuracy [36]. The model is trained for a maximum of 50 rounds, with a batch size of 36; i.e., 36 samples are used to compute the gradient during each round of weight update. Table 2 shows the detailed setting of the above parameters. To prevent overfitting during training, early stopping is used to monitor the training process. At the end of each training round, the model assesses its performance on the validation set and calculates the validation loss. If the validation loss does not improve for 10 consecutive rounds, training will automatically terminate.
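Under the structure and hyperparameters listed above, a Keras sketch of this network could look as follows; the sequence length T and the placeholder training arrays are assumptions, not values reported in the paper.

```python
import numpy as np
from tensorflow import keras

T, n_features = 8, 5        # the sequence length T is an assumption (not reported)

model = keras.Sequential([
    keras.Input(shape=(T, n_features)),
    keras.layers.LSTM(108, return_sequences=True),   # first hidden layer
    keras.layers.BatchNormalization(),               # BN between the two hidden layers
    keras.layers.Dropout(0.13),                      # 13% dropout
    keras.layers.LSTM(56),                           # second hidden layer
    keras.layers.Dense(32, activation="relu"),       # dense layer
    keras.layers.Dense(1),                           # output layer
])
model.compile(optimizer="adam", loss="mse")

# Early stopping: terminate if the validation loss does not improve for 10 rounds.
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                           restore_best_weights=True)

# Placeholder arrays standing in for the prepared training data.
X_train = np.random.rand(500, T, n_features)
y_train = np.random.rand(500)
model.fit(X_train, y_train, validation_split=0.1, epochs=50,
          batch_size=36, callbacks=[early_stop], verbose=0)
```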

3.4. Results Analysis

Figure 4 shows the results of the loss function values for all models during training and validation. The BOA is used to optimize the parameters of the LSTM model (LSTM-BOA). The proposed LSTM-SVR models exhibit a faster decrease in the loss function compared with the LSTM-BOA model, and achieve lower loss values within the same training time or training rounds. The proposed models can achieve more accurate prediction results.
To further demonstrate the effectiveness of the LSTM-SVR model, in addition to comparing it with LSTM-BOA, we also compare these two models with the SVR-BOA model and the BiLSTM-SOA model from the literature [22]. Four evaluation metrics are reported: MSE, RMSE, MAE, and MAPE. As shown in Table 3, the values of MSE, RMSE, MAE, and MAPE for the LSTM-SVR series of models are lower than those of the LSTM-BOA, BiLSTM-SOA, and SVR-BOA models. Notably, the MAPE decreased by nearly 12 percentage points, indicating that the combination of LSTM and SVR can effectively enhance the predictive ability of the model. Additionally, the results of the four LSTM-SVR models in Table 3 are relatively stable, further showing the stability of the proposed model. Figure 5 visualizes the relevant results.
Based on the above results, the driving time for the summer weekday departure station and the holiday departure station can be similarly predicted. The driving time prediction results for the summer weekday departure station, winter weekday departure station, and holiday departure station using the LSTM-SVR-RSO model are shown in Table 4. Table 4 shows that the absolute error ϵ between the predicted and actual driving times is within a specific range, indicating that the proposed model demonstrates good predictive performance.
To clearly demonstrate the prediction performance of our algorithm, we compare the proposed LSTM-SVR-RSO with LSTM-BOA, BiLSTM-SOA, and SVR-BOA, respectively. Due to limited space, the results for winter workdays are shown as an example. In Figure 6, the vertical coordinate Error represents the absolute error between the predicted and actual driving times for each algorithm, while the horizontal coordinate Index indicates the index of each departure time. As shown in Figure 6, the proposed algorithm is superior to the other algorithms and exhibits greater stability.
The predicted schedule time is obtained by subtracting the predicted driving time from the arrival time (the actual arrival time used in the experiment), as shown in Table 5. The numerical results in Table 5 show that in most cases (the bolded data), the predicted schedule time is closer to the actual departure time than the originally planned departure time. However, there are still some exceptions for specific time periods. It is verified that these time periods are related to the occurrence of serious abnormal conditions and unexpected situations that require human intervention. This shows that the model was not trained on sufficient data covering these unexpected situations, highlighting a potential area for further research and improvement of our algorithm.

4. Conclusions

This paper investigates a hybrid prediction model combining LSTM and SVR. The hybrid model was designed to overcome the main limitation of LSTM, specifically its tendency to overfit due to scarce training data, and to address the SVR’s difficulties in effectively capturing long-term dependencies in time series data. This makes the model a vital tool for public transportation companies and managers.
The hybrid model effectively leverages the capabilities of LSTM in capturing time series features, while also utilizing SVR’s strengths in addressing nonlinear problems, particularly its good generalization ability with small sample data. The integration of these two models maximizes their respective advantages, leading to improved prediction performance compared to the use of either method. Consequently, by comparing with the LSTM-BOA, SVR-BOA, and BiLSTM-SOA models, the LSTM-SVR model proposed in this paper can more accurately predict bus scheduling times, reduce scheduling costs, improve the operational efficiency of bus companies, and improve the reliability of bus scheduling systems.
The proposed LSTM-SVR hybrid model was trained on data from a public bus scheduling system. However, the concepts and methodologies can be applied to other fields that require analyzing time series data and predicting future trends, such as weather forecasting, stock price prediction, and tourism demand prediction. The model's strengths in time series analysis and nonlinear modeling are critical to accurately predicting such trends.
Despite its significant advantages, including a marked improvement in prediction accuracy, the model still has certain limitations. The algorithm proposed in this paper demonstrates insufficient predictive accuracy when handling unexpected situations. Therefore, enhancing the model’s ability to manage such anomalies would help improve its prediction accuracy. In addition, hybrid models may require extended training time and substantial computational resources, so migrating the model to a more powerful computing platform can improve its computational performance.

Author Contributions

Conceptualization, Y.X.; methodology, L.Y.; supervision, Z.G.; software and validation, L.Y.; writing—original draft preparation, Z.G., J.L. and Y.C.; writing—review and editing, Z.G., L.Y. and Y.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grants No. 120081) and Qing Lan Project.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Tirachini, A. Estimation of Travel Time and the Benefits of Upgrading the Fare Payment Technology in Urban Bus Services. Transp. Res. Part C Emerg. Technol. 2013, 30, 239–256. [Google Scholar] [CrossRef]
  2. Suwardo, W.; Napiah, M.; Kamaruddin, I. ARIMA Models for Bus Travel Time Prediction. J. Inst. Eng. Malays. 2010, 71, 49–58. Available online: http://scholars.utp.edu.my/id/eprint/5860 (accessed on 14 September 2024).
  3. Shang, H.; Liu, Y.; Huang, H.; Guo, R. Vehicle Scheduling Optimization Considering the Passenger Waiting Cost. J. Adv. Transp. 2019, 2019, 4212631. [Google Scholar] [CrossRef]
  4. Pan, H.; Tang, Y.; Wang, G. A Stock Index Futures Price Prediction Approach Based on the MULTI-GARCH-LSTM Mixed Model. Mathematics 2024, 12, 1677. [Google Scholar] [CrossRef]
  5. Ma, W.; Hong, Y.; Song, Y. On Stock Volatility Forecasting under Mixed-Frequency Data Based on Hybrid RR-MIDAS and CNN-LSTM Models. Mathematics 2024, 12, 1538. [Google Scholar] [CrossRef]
  6. Kviesis, A.; Zacepins, A.; Komasilovs, V.; Munizaga, M. Bus Arrival Time Prediction with Limited Data Set Using Regression Models. In Proceedings of the 4th International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS), Funchal, Madeira, Portugal, 16–18 March 2018. [Google Scholar] [CrossRef]
  7. Alam, O.; Kush, A.; Emami, A.; Pouladzadeh, P. Predicting Irregularities in Arrival Times for Transit Buses with Recurrent Neural Networks Using GPS Coordinates and Weather Data. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 7813–7826. [Google Scholar] [CrossRef]
  8. Chu, K.F.; Lam, A.Y.; Tsoi, K.H.; Huang, Z.; Loo, B.P. Deep Encoder Cross Network for Estimated Time of Arrival. IEEE Access 2023, 11, 76095–76107. [Google Scholar] [CrossRef]
  9. He, P.; Jiang, G.; Lam, S.K.; Tang, D. Travel-Time Prediction of Bus Journey with Multiple Bus Trips. IEEE Trans. Intell. Transp. Syst. 2019, 20, 4192–4205. [Google Scholar] [CrossRef]
  10. Jargalsaikhan, N.; Matsuyama, K. An Investigation of Machine Learning Methods for Prediction Bus Travel Time of Mongolian Public Transportation. Int. Workshop Adv. Imaging Technol. (IWAIT) 2020, 11515, 325–329. [Google Scholar] [CrossRef]
  11. Dunne, L.; Rocco Di Torrepadula, F.; Di Martino, S.; McArdle, G.; Nardone, D. Bus Journey Time Prediction with Machine Learning: An Empirical Experience in Two Cities. In International Symposium on Web and Wireless Geographical Information Systems (W2GIS); Springer: Cham, Switzerland, 2023. [Google Scholar] [CrossRef]
  12. Wai, B.; Zhou, W. Designing and Implementing Real-Time Bus Time Predictions using Artificial Intelligence. Transp. Res. Rec. 2020, 2674, 636–648. [Google Scholar] [CrossRef]
  13. Liu, H.; Xu, H.; Yan, Y.; Cai, Z.; Sun, T.; Li, W. Bus Arrival Time Prediction Based on LSTM and Spatial-Temporal Feature Vector. IEEE Access 2020, 8, 11917–11929. [Google Scholar] [CrossRef]
  14. Hashi, A.O.; Hashim, S.Z.M.; Anwar, T.; Ahmed, A. A Robust Hybrid Model Based on Kalman-SVM for Bus Arrival Time Prediction. In Emerging Trends in Intelligent Computing and Informatics (IRICT), Advances in Intelligent Systems and Computing; Springer: Cham, Switzerland, 2019. [Google Scholar] [CrossRef]
  15. Zhou, T.; Wu, W.; Peng, L.; Zhang, M.; Li, Z.; Xiong, Y.; Bai, Y. Evaluation of Urban Bus Service Reliability on Variable Time Horizons Using a Hybrid Deep Learning Method. Reliab. Eng. Syst. Saf. 2022, 217, 108090. [Google Scholar] [CrossRef]
  16. Leong, S.H.; Lam, C.T.; Ng, B.K. Bus Arrival Time Prediction for Short-Distance Bus Stops with Real-Time Online Information. In Proceedings of the 2021 IEEE 21st International Conference on Communication Technology (ICCT), Tianjin, China, 13–16 October 2021. [Google Scholar] [CrossRef]
  17. Luo, X.; Li, D.; Yang, Y.; Zhang, S. Spatiotemporal Traffic Flow Prediction with KNN and LSTM. J. Adv. Transp. 2019, 2019, 4145353. [Google Scholar] [CrossRef]
  18. Zhang, Y.; Zhu, J.; Zhang, J. Short-Term Passenger Flow Forecasting Based on Phase Space Reconstruction and LSTM. J. Inst. Eng. Malays. 2017, 482, 679–688. [Google Scholar] [CrossRef]
  19. Petersen, N.C.; Rodrigues, F.; Pereira, F.C. Multi-Output Bus Travel Time Prediction with Convolutional LSTM Neural Network. Expert Syst. Appl. 2019, 120, 426–435. [Google Scholar] [CrossRef]
  20. Liu, Y.; Zhang, H.; Jia, J.; Shi, B.; Wang, W. Understanding Urban Bus Travel Time: Statistical Analysis and a Deep Learning Prediction. Int. J. Mod. Phys. B 2023, 37, 2350034. [Google Scholar] [CrossRef]
  21. Jiang, R.; Hu, D.; Sun, Q.; Wu, X. Predicting bus travel time with hybrid incomplete data—A deep learning approach. Promet-Traffic Transp. 2022, 34, 673–685. [Google Scholar] [CrossRef]
  22. Zhou, B.; Zhou, D.D.; Sun, J.; Ni, X.Y. Bus Arrival Time Prediction Model Based on Bidirectional Long Short-Term Memory Network. J. Transp. Syst. Eng. Inf. Technol. 2023, 23, 148–160. [Google Scholar] [CrossRef]
  23. Xie, J.; Lin, Z.; Yin, J.; Lai, Z.; Wang, X.; Chen, X. Deep Reinforcement Learning Based Dynamic Bus Timetable Scheduling with Bidirectional Constraints. In Proceedings of the Ninth International Conference on Big Data and Social Computing (BDSC), Harbin, China, 8–10 August 2024. [Google Scholar] [CrossRef]
  24. Gholamy, A.; Kreinovich, V.; Kosheleva, O. Why 70/30 or 80/20 Relation between Training and Testing Sets: A Pedagogical Explanation. Int. J. Intell. Technol. Appl. Stat. 2018, 11, 105–111. [Google Scholar] [CrossRef]
  25. Bichri, H.; Chergui, A.; Hain, M. Investigating the Impact of Train/Test Split Ratio on the Performance of Pre-Trained Models with Custom Datasets. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 527–530. [Google Scholar] [CrossRef]
  26. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  27. Bach, F.R.; Lanckriet, G.R.G.; Jordan, M.I. Multiple Kernel Learning, Conic Duality, and the SMO Algorithm. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML), Banff, AB, Canada, 4–8 July 2004. [Google Scholar] [CrossRef]
  28. Lanckriet, G.R.G.; Cristianini, N.; Bartlett, P. Learning the Kernel Matrix with Semidefinite Programming. J. Mach. Learn. Res. 2004, 5, 27–72. [Google Scholar] [CrossRef]
  29. Adnan, F.A.; Jamaludin, K.R.; Wan Muhamad, W.Z.A.; Miskon, S. A Review of the Current Publication Trends on Missing Data Imputation over Three Decades: Direction and Future Research. Neural Comput. Appl. 2022, 34, 18325–18340. [Google Scholar] [CrossRef]
  30. Stekhoven, D.J.; Bühlmann, P. MissForest—Non-Parametric Missing Value Imputation for Mixed-Type Data. Bioinformatics 2012, 28, 112–118. [Google Scholar] [CrossRef] [PubMed]
  31. Shah, A.D.; Bartlett, J.W.; Carpenter, J.; Nicholas, O.; Hemingway, H. Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study. Am. J. Epidemiol. 2014, 179, 764–774. [Google Scholar] [CrossRef]
  32. Perez, H.; Tah, J.H.M. Improving the Accuracy of Convolutional Neural Networks by Identifying and Removing Outlier Images in Datasets Using T-SNE. Mathematics 2020, 8, 662. [Google Scholar] [CrossRef]
  33. Lane, D.M.; Scott, D.; Hebl, M.; Guerra, R.; Osherson, D.; Zimmer, H. Introduction to Statistics: An Interactive E-Book, 1st ed.; University of Houston: Houston, TX, USA, 2013; Available online: https://www.onlinestatbook.com (accessed on 26 September 2024).
  34. Chicco, D.; Warrens, M.J.; Jurman, G. The Coefficient of Determination R-Squared Is More Informative than SMAPE, MAE, MAPE, MSE and RMSE in Regression Analysis Evaluation. PeerJ Comput. Sci. 2021, 7, e623. [Google Scholar] [CrossRef]
  35. Victoria, A.H.; Maragatham, G. Automatic Tuning of Hyperparameters Using Bayesian Optimization. Evol. Syst. 2021, 12, 217–223. [Google Scholar] [CrossRef]
  36. Chang, Z.H.; Yang, Z.; Chen, W.B. Effective Adam-Optimized LSTM Neural Network for Electricity Price Forecasting. In Proceedings of the IEEE International Conference on Software Engineering and Service Sciences (ICSESS), Beijing, China, 23–25 November 2018. [Google Scholar] [CrossRef]
Figure 1. The workflow of a bus scheduling system.
Figure 2. The structure of LSTM.
Figure 3. Flowchart of LSTM-SVR model.
Figure 4. Results of loss function values for all models during training and validation.
Figure 5. Results of MAPE of different models.
Figure 6. Results of prediction error of different models.
Table 1. Description of data variables.

Variable Type | Variable Name | Explanation of Variables (Units)
Input variable | Cycle times | Number of bus cycles (times)
 | Driving time of previous shift | Travel time of previous bus trip (hour: minute: second)
 | Rest time between two consecutive shifts of bus | Interval between two adjacent departures of same bus (minutes)
 | Interval time between current bus and previous bus | Interval between departure time of current bus and previous bus (minutes)
 | Actual departure time of current shift | Actual departure time of buses on current trip (hour: minute: second)
 | Actual driving time | Actual driving time of buses (hour: minute: second)
Intermediate output variable | One-dimensional time series features | One-dimensional time series features extracted by LSTM
Final output variable | Predicted schedule time | Predicted bus schedule time (hour: minute: second) output by SVR model
Table 2. Parameter list of LSTM structure.

Parameters | Parameter's Explanation | Parameter Value
Network layer | Number of network layers in the model | 4
Number of neurons in the input layer | Number of input features | 5
Number of neurons in the output layer | Output value, i.e., the extracted features | 1
Number of neurons in the hidden layer | Related to the performance of the network | 108, 56
Activation function | Activation function of the output layer | ReLU
Optimizer | Model optimizer | Adam
Rounds | Number of training times | 50
Batch normalization | Improves model generalization capabilities | True
Dropout ratio | Dropout stochastic inactivation rate | 13%
Table 3. Results of different models' error values.

Model | MSE | RMSE | MAE | MAPE
LSTM-BOA | 0.3765 | 0.6136 | 0.4314 | 18.715
BiLSTM-SOA | 0.3604 | 0.6004 | 0.4193 | 18.4578
SVR-BOA | 0.3877 | 0.6227 | 0.4301 | 15.0201
LSTM-SVR-GS | 0.3088 | 0.5557 | 0.3967 | 6.8614
LSTM-SVR-BOA | 0.3059 | 0.5531 | 0.3913 | 6.7531
LSTM-SVR-GA | 0.3080 | 0.5550 | 0.3983 | 6.9460
LSTM-SVR-RSO | 0.3054 | 0.5526 | 0.3914 | 6.7926
Table 4. Predicted versus actual driving times for a given day in different time periods (partial). All values are time values in minutes; ϵ is the absolute error.

Summer Weekdays | | | Winter Weekdays | | | Holiday | |
Actual | Predicted | ϵ | Actual | Predicted | ϵ | Actual | Predicted | ϵ
47.00 | 44.30 | 2.70 | 49.00 | 54.42 | 5.42 | 45.00 | 46.49 | 1.49
49.00 | 47.47 | 1.53 | 55.00 | 61.24 | 6.24 | 44.00 | 46.73 | 2.73
51.00 | 51.31 | 0.31 | 62.00 | 59.59 | 2.41 | 42.00 | 49.85 | 7.85
56.00 | 56.87 | 0.87 | 61.00 | 59.93 | 1.07 | 46.00 | 50.57 | 4.57
57.00 | 59.45 | 2.45 | 63.00 | 58.78 | 4.22 | 48.00 | 50.63 | 2.63
61.00 | 58.33 | 2.67 | 54.00 | 52.93 | 1.07 | 49.00 | 50.01 | 1.01
59.00 | 55.80 | 3.20 | 56.00 | 59.19 | 3.19 | 51.00 | 50.62 | 0.38
56.00 | 59.51 | 3.51 | 50.00 | 53.94 | 3.94 | 55.00 | 47.78 | 7.22
60.00 | 56.72 | 3.28 | 57.00 | 55.37 | 1.63 | 52.00 | 55.38 | 3.38
55.00 | 56.53 | 1.53 | 49.00 | 55.55 | 6.55 | 55.00 | 52.94 | 2.06
63.00 | 57.43 | 5.57 | 59.00 | 58.71 | 0.29 | 50.00 | 52.44 | 2.44
67.00 | 60.55 | 6.45 | 62.00 | 67.97 | 5.97 | 51.00 | 54.44 | 3.44
72.00 | 64.84 | 7.16 | 65.00 | 74.72 | 9.72 | 59.00 | 53.43 | 5.57
70.00 | 65.43 | 4.57 | 71.00 | 61.51 | 9.49 | 67.00 | 56.78 | 10.22
58.00 | 54.92 | 3.08 | 76.00 | 68.85 | 7.15 | 75.00 | 55.20 | 19.80
64.00 | 51.31 | 12.69 | 66.00 | 67.35 | 1.35 | 60.00 | 62.64 | 2.64
58.00 | 49.09 | 8.91 | 65.00 | 55.98 | 9.02 | 47.00 | 62.66 | 15.66
45.00 | 43.73 | 1.27 | 62.00 | 57.38 | 4.62 | 65.00 | 60.71 | 4.29
46.00 | 40.79 | 5.21 | 57.00 | 53.30 | 3.70 | 55.00 | 48.28 | 6.72
39.00 | 38.84 | 0.16 | 51.00 | 50.16 | 0.84 | 41.00 | 51.68 | 10.68
Table 5. Results of schedule time and planned departure time (partial).

Summer Weekdays | | | Winter Weekdays | | | Holiday | |
Planned | Predicted | Actual | Planned | Predicted | Actual | Planned | Predicted | Actual
6:41:00 | 6:40:42 | 06:38:00 | 8:10:00 | 8:07:25 | 08:05:00 | 6:48:00 | 6:54:31 | 06:56:00
7:40:00 | 7:39:32 | 07:38:00 | 8:16:00 | 8:23:04 | 08:22:00 | 7:14:00 | 7:17:16 | 07:20:00
7:46:00 | 7:47:41 | 07:48:00 | 9:00:00 | 8:56:13 | 08:52:00 | 8:03:00 | 8:06:09 | 08:14:00
8:04:00 | 8:05:03 | 08:08:00 | 9:09:00 | 9:10:04 | 09:09:00 | 8:12:00 | 8:16:26 | 08:21:00
8:25:00 | 8:26:08 | 08:27:00 | 9:45:00 | 9:46:16 | 09:45:00 | 8:30:00 | 8:32:22 | 08:35:00
8:52:00 | 8:57:33 | 09:00:00 | 13:22:00 | 13:21:38 | 13:20:00 | 8:38:00 | 8:40:59 | 08:42:00
10:12:00 | 10:12:40 | 10:10:00 | 13:30:00 | 13:33:27 | 13:40:00 | 9:10:00 | 9:08:23 | 09:08:00
10:21:00 | 10:33:12 | 10:30:00 | 13:58:00 | 13:50:11 | 14:00:00 | 10:14:00 | 10:09:13 | 10:02:00
11:15:00 | 11:18:30 | 11:22:00 | 14:18:00 | 14:20:17 | 14:20:00 | 11:46:00 | 11:48:37 | 11:52:00
11:24:00 | 11:59:17 | 11:56:00 | 16:26:00 | 16:26:02 | 16:32:00 | 12:26:00 | 12:26:04 | 12:24:00
12:20:00 | 12:28:28 | 12:30:00 | 16:32:00 | 16:29:25 | 16:38:00 | 12:36:00 | 12:26:33 | 12:29:00
12:40:00 | 12:49:34 | 12:44:00 | 16:50:00 | 16:50:17 | 17:00:00 | 12:48:00 | 12:40:34 | 12:44:00
16:16:00 | 16:16:13 | 16:05:00 | 16:56:00 | 17:31:29 | 17:22:00 | 13:14:00 | 13:09:34 | 13:04:00
16:32:00 | 16:32:09 | 16:25:00 | 17:20:00 | 17:35:09 | 17:28:00 | 14:04:00 | 13:54:13 | 13:44:00
16:50:00 | 16:49:34 | 16:45:00 | 17:39:00 | 17:44:39 | 17:46:00 | 14:29:00 | 14:19:48 | 14:00:00
16:56:00 | 16:55:05 | 16:52:00 | 18:14:00 | 18:09:01 | 18:00:00 | 17:46:00 | 17:52:22 | 17:55:00
21:12:00 | 21:10:33 | 20:50:00 | 18:30:00 | 18:22:37 | 18:18:00 | 18:48:00 | 18:36:20 | 18:52:00
21:40:00 | 21:46:16 | 21:45:00 | 18:40:00 | 18:31:42 | 18:28:00 | 19:00:00 | 19:04:17 | 19:00:00
22:44:00 | 22:35:13 | 22:30:00 | 19:10:00 | 19:10:50 | 19:10:00 | 20:30:00 | 20:26:43 | 20:20:00
23:00:00 | 23:00:00 | 23:00:00 | 19:20:00 | 19:25:01 | 19:20:00 | 22:12:00 | 21:54:19 | 22:05:00
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
