Article

Neural Network-Based Hybrid Forecasting Models for Time-Varying Passenger Flow of Intercity High-Speed Railways

School of Railway Tracks and Transportation, Wuyi University, Jiangmen 529020, China
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(23), 4554; https://doi.org/10.3390/math10234554
Submission received: 7 November 2022 / Revised: 24 November 2022 / Accepted: 26 November 2022 / Published: 1 December 2022
(This article belongs to the Special Issue Mathematical Optimization in Transportation Engineering)

Abstract
Time-varying passenger flow serves as the input data in the optimization design of intercity high-speed railway transportation products, where it plays an important role. It is therefore necessary to predict the origin-destination (O-D) passenger flow at different times of the day in combination with its stable time-varying characteristics. In this paper, three neural network-based hybrid forecasting models are designed and compared, named Variational Mode Decomposition-Multilayer Perceptron (VMD-MLP), Variational Mode Decomposition-Gated Recurrent Unit Neural Network (VMD-GRU), and Variational Mode Decomposition-Bidirectional Long Short-Term Memory Neural Network (VMD-Bi-LSTM). First, the time-varying characteristics of passenger travel demand under different time granularities are analyzed and extracted by the VMD method. Second, three neural network prediction models are constructed to predict the passenger flow sequence after VMD decomposition and reconstruction. Experimental analysis is performed on the Guangzhou–Zhuhai intercity high-speed railway in China, and the passenger flow at different time periods of the day under different time granularities is predicted. The following results were found: (i) The number of hidden neurons and the number of iterations of the hybrid forecasting model have a great impact on the prediction accuracy. The error of the VMD-MLP model fluctuates less and it performs more smoothly than both the VMD-GRU model and the VMD-Bi-LSTM model. (ii) The VMD-MLP, VMD-GRU, and VMD-Bi-LSTM models can basically reduce the MAPE error to less than 10%. With the increase of time granularity, RMSE and MAE errors tend to gradually increase, while the MAPE error tends to gradually decrease. (iii) For passenger flow under a smaller time granularity, the prediction accuracy of the VMD-MLP model is higher, while for passenger flow under a larger time granularity, the prediction accuracy of the VMD-GRU and VMD-Bi-LSTM models is higher. (iv) The proposed neural network-based hybrid models outperform the existing models, and the hybrid models perform better than the single models.

1. Introduction

The rapid development of China’s intercity high-speed railway not only alleviates the pressure of urban traffic, but also realizes the high-speed connection between adjacent cities. The high-speed and high-density transportation organization of the intercity high-speed railway gives passengers more choices of departure time. The passenger travel demand for the intercity high-speed railway at different departure periods in one day shows obvious fluctuation characteristics, which have a certain stability in a short period of time. Such a demand can be called the time-varying passenger flow. Therefore, this stable fluctuation law of passenger flow can be used to predict the time-varying passenger flow of the intercity high-speed railway in the short term and provide decision support for the optimization design of intercity high-speed railway transportation products.
In recent years, the time-varying passenger flow of the intercity high-speed railway has been used as input data in the optimization of timetables, line planning, and differential pricing by various researchers [1,2,3,4,5]. The prediction accuracy of passenger demand with different departure times in one day directly impacts the optimization solutions for high-speed railway operation management. Therefore, predicting the time-varying passenger flow of the intercity high-speed railway becomes an important task.
Traditional railway passenger flow forecasting is mainly aimed at daily, monthly, and yearly passenger flow, and research on the time-varying passenger flow of the intercity high-speed railway is lacking [6,7,8]. Although some short-term methods were developed to predict passenger flow at time granularities of 5 min, 15 min, and half an hour [9,10], they mainly focused on the prediction of subway passenger flow, and there are considerable differences between subways and high-speed railways.
Considering the above research gaps, the objective of this paper is to develop methods for the intercity high-speed railway to predict passenger flow under different time granularities in one day. Prediction methods have been developed for decades and are usually divided into parametric and non-parametric approaches. Traditional parametric methods [11,12,13] are simple and fast but less accurate for nonlinear real-time data. Non-parametric methods have been widely researched in recent years, among which, neural networks are hot topics as they are more suitable for passenger flow prediction with non-linear and more complex data [14,15]. Considering that single prediction methods always have some defects, some hybrid methods are proposed to further improve the prediction accuracy, especially the neural network-based hybrid forecasting method [15,16].
Based on the above analysis, the research ideas of this paper are as follows: passenger travel demand is analyzed based on historical passenger flow, with an operating time of one day divided into time periods with different time granularities of 1 h, 3 h, 5 h, and 16 h, and the origin-destination (O-D) passenger flow time series under different time granularities are obtained. Then, the non-stationary passenger flow time series is decomposed into a stationary passenger flow time series by using the Variational Mode Decomposition (VMD), to extract the time fluctuation characteristics of passenger flow. Finally, three neural network-based hybrid prediction models are proposed, named VMD-MLP, VMD-GRU, and VMD-Bi-LSTM, to predict the reconstructed passenger flow time series and obtain the O-D passenger travel demand at different times of the day in the future.
The novelty of the research is the proposal of three neural network-based hybrid prediction models for the intercity high-speed railway. The designed methods not only use the stable fluctuation law of passenger flow but also predict the O-D passenger flow under different time granularities in one day, which have not been sufficiently studied for the intercity high-speed railway in the current literature.
The main contributions of this paper are as follows: (1) the fluctuation characteristics of the O-D passenger flow of the intercity high-speed railway under different time granularities in one day are analyzed, and (2) based on the time fluctuation characteristics of passenger flow, three neural network-based hybrid prediction models, VMD-MLP, VMD-GRU, and VMD-Bi-LSTM, are designed to predict the O-D passenger flow of the intercity high-speed railway under different time granularities and different periods in one day.
This paper is arranged as follows: Section 2 summarizes the literature; Section 3 analyzes the characteristics of passenger travel demand and proposes a method of passenger flow time series decomposition; in Section 4, three neural network-based hybrid prediction models are designed; Section 5 presents the experimental analysis; and Section 6 concludes the paper.

2. Related Literature Review

The research methods of railway passenger flow forecasting can be divided into three categories: parametric, non-parametric, and hybrid forecasting models. Parametric forecasting models mainly include the Time Series Model, Historical Average Model, and the Autoregressive Integrated Moving Average Model [13,17]. Parametric models have been applied in passenger flow forecasting, but they are sensitive to the linear relationship between variables and cannot capture the nonlinear relationships in time series, so their application range in passenger flow forecasting is limited. Milenković, Libor, and Melichar [18] used the Autoregressive Integrated Moving Average method to predict the passenger flow of railway stations in the short term at the granularity of a month.
Non-parametric forecasting models mainly include the Support Vector Machine, Neural Network, Decision Tree, etc. Non-parametric forecasting models are flexible, can fit a large number of different function forms, and have more advantages in terms of their training ability and performance than parametric prediction models. Zhang, Cheng, and Gao [19] proposed a prediction method based on multi-layer LSTM for short-term prediction of railway station passenger flow with a time granularity of 0.5 h. This method combined multiple traffic data and feature selection based on Spearman correlation and had a high prediction accuracy. Jing and Yin [20] used a BP neural network to predict the short-term passenger flow of intercity high-speed railway stations at the granularity of one hour. They also adopted an improved step-updating method to prevent the error function from falling into a local minimum and oscillation during the weight-updating process, which showed that this model had good performance. Peng, Bai, and Wu [21] predicted the short-term passenger flow of railway stations with the time granularity of one hour and proposed an improved LSTM model, which had higher accuracy than the traditional BPNN model and the LSTM model without improvement.
Toque et al. [22] proposed the Gated Recurrent Unit Neural Network Model (GRU) to predict the station passenger flow of various rail transit modes in the Paris Business District in the short term, with a time granularity of 15 min. Compared with the Random Forest Model and LSTM Model, it had greater accuracy. Jérémy, Gérald, and Stéphane [23] proposed the Bayesian model to predict the passenger flow of subway stations in the short term with a time granularity of 10 min. Marc, Peter, and Dieter [24] proposed the DLR four-step model to predict the O-D passenger flow of an airport in the long term, with a time granularity of one year. Ulrich and Bozana [25] proposed the GVAR model to predict the passenger flow of an airport in the short term, with a time granularity of one month. Chan, Yash, and Sameer [26] proposed the Neural Granger Causality model to also predict the passenger flow of an airport in the short term, with a time granularity of one month.
At present, most passenger flow forecasting studies use hybrid models to predict passenger flow. There are different ways of building hybrid models. One approach is to combine two or more parametric or non-parametric models, and the other is to combine a passenger flow sequence decomposition model with parametric or non-parametric models to leverage the advantages of each model and achieve a higher prediction accuracy.
(1) Passenger flow time series decomposition models combined with parametric or non-parametric forecasting models
Jiang, Zhang, and Chen [27] used a hybrid model, EEMD-GSVM, which decomposed the time series into several intrinsic mode functions using EEMD, and then used the particle swarm-calibrated GSVM to predict the daily O-D passenger flow of an intercity high-speed railway in the short term. Zhao and Mi [28] proposed a hybrid model, SSA-WPDCNN-SVR, to predict the daily O-D passenger flow of an intercity high-speed railway in the short term. First, the SSA method was used to decompose the passenger flow time series into a main series and several sub-series, then the Convolutional Neural Network Model was used to predict the main series, while the SVR model was used to predict the subseries.
Sun et al. [29] used the hybrid models VMD-SARIMA-MLP and VMD-SARIMA-LSTM, in which the SARIMA model was used to predict the periodic components, the LSTM was used to learn and predict the deterministic components, and the MLP network was used to predict the volatile components, to predict the short-term daily passenger flow of subway stations. Xin et al. [30] proposed a wave-LSTM hybrid model to predict the short-term passenger flow of subway stations with 15 min as the time granularity. Wei and Chen [31] proposed a hybrid model, EMD-BPN, to predict the short-term passenger flow of subway stations, with 15 min as the time granularity. Jin et al. [32] used a hybrid model, VMD-ARMA/KELM-KELM, combining Variational Mode Decomposition (VMD), the Autoregressive Moving Average Model (ARMA), and the Kernel Extreme Learning Machine (KELM), to predict the short-term passenger flow of airlines with a monthly time granularity.
It is noted that Empirical Mode Decomposition (EMD) and Variational Mode Decomposition (VMD) are usually used as decomposition techniques to improve the prediction accuracy, which were proposed by Huang et al. in 1998 [33] and by Dragomiretskiy and Zosso in 2014 [34], respectively. Some researchers [35,36,37] also performed comparison studies and found that the performances of VMD were better than EMD, which was partly because EMD had some defects, such as mode mixing and an endpoint effect, while VMD could ensure decomposition optimality in the sense of minimum sum of modes’ bandwidths.
(2) Combinations of parametric and/or non-parametric forecasting models
Wen et al. [38] used a hybrid forecasting model, SARIMA-TrAdaboost, which decomposed the time series into linear and non-linear time series and predicted the short-term daily passenger flow of Beijing–Shanghai high-speed railway stations by embedding a random forest model. Glisovic, Milenkovi, and Bojovi [39] proposed two hybrid forecasting models, SARIMA-GA-ANN and SARIMA-ANN, to predict the short-term passenger flow of railway stations with a monthly time granularity. Zhu and Zhou [8] used the panel vector autoregression (PVAR) and neural network (NN) hybrid PVAR-NN prediction methods to predict yearly passenger flow in the railway system.
Jérémy, Stéphane, and Gérald [40] used a hybrid forecasting model, Bayesian and Gaussian methods, to predict the short-term passenger flow of subway stations, with 2 min as the time granularity. Ma, Guo, and Ma [41] proposed GCN-Bi-LSTM to predict the short-term passenger flow of subway stations with 5 min as the time granularity. Zhang, Chen, and Shen [42] proposed a hybrid forecasting model, CB-LSTM, to predict the short-term passenger flow of subway stations, with 1 to 30 min as the time granularity. Chen et al. [15] developed an EMD-LSTM hybrid forecasting model for predicting short-term metro-inbound passenger flow, with 15 min as the time granularity.
Robert, Yingqi, and Suzilah [43] used the hybrid forecasting model ADL-TVP-VAR to predict the passenger flow of an airport in the medium and long term with a time granularity of one year. Kim and Shin [44] used the hybrid forecasting model NAF-k-fold to predict the passenger flow of an airport in the short term with a time granularity of one month. Rodrigo [45] used the hybrid model MLEM to train multiple neural networks to predict the passenger flow of an airport in the long term with a time granularity of one month. Wai et al. [46] used the hybrid model ARIMA-SRIMA to predict the passenger flow of an airport in the short and long term with a time granularity of one month.
In conclusion, the prediction accuracy of hybrid models is higher than that of single models, and hence hybrid prediction methods have been widely used in recent years. Railway passenger flow forecasting has mainly addressed short-term daily or monthly passenger flow, and only a few scholars have studied hourly passenger flow forecasting of railway stations. However, there is a lack of research on O-D passenger flow forecasting for different departure periods under different time granularities in one day. Therefore, this paper uses the time distribution characteristics of intercity high-speed railway passenger flow to design neural network-based hybrid prediction models, that is, the Variational Mode Decomposition-Multilayer Perceptron (VMD-MLP) model, the Variational Mode Decomposition-Gated Recurrent Unit Neural Network (VMD-GRU) model, and the Variational Mode Decomposition-Bidirectional Long Short-Term Memory Neural Network (VMD-Bi-LSTM) model, to predict the O-D passenger flow of an intercity high-speed railway at different times of one day under different time granularities.

3. Data Analysis and Decomposition

3.1. Time-Varying Passenger Flow

The passenger flow along the intercity high-speed railway is huge, the train departure frequency is high, and the passenger travel demand shows obvious time-varying characteristics, which means that the passenger flow at different departure periods in one day has obvious fluctuations and reflects certain periodic characteristics. According to the historical ticket sale data of the Guangzhou–Zhuhai intercity high-speed railway in China, the time-varying characteristics of passenger flow were analyzed. The passenger flow with time-varying characteristics can be called time-varying passenger flow.
Statistical analysis of the passenger flow in the dimensions of the departure period and departure week can obtain the distributions of O-D passenger flow under different time granularities on each day of a week. The passenger demand before 7:00 and after 23:00 is small, so only the passenger flow between 7:00 and 23:00 was studied. For convenience, the time granularity is denoted as Δt, with Δt = 1, 3, 5, 16 (in hours). This paper adopts time granularity Δt to divide the operation time [7:00, 23:00] into several time periods: [7, 7 + Δt] is period 1, [7 + Δt, 7 + 2Δt] is period 2, and so on. The passenger flow distributions from Guangzhou South station to Zhongshan North station under different time granularities are shown in Figure 1.
As can be seen from Figure 1:
  • The fluctuation characteristics of passenger flow every day in a week have a certain periodicity.
  • The fluctuation characteristics of the passenger flow at different departure times every day have a certain tendency.
  • The fluctuation characteristics of passenger flow under different time granularities are obviously different. The smaller the time granularity is, the more detailed and more complex the time-varying characteristics of the passenger flow will be.
The actual passenger flow in the $m$-th period of day $n$ is recorded as $x_n^m$, and the predicted passenger flow is recorded as $\hat{x}_n^m$, $n = 1, 2, \ldots, N$, $m = 1, 2, 3, \ldots$. According to the time-varying characteristics of passenger flow in Figure 1, for an O-D pair, the passenger flows in the same period of the first 14 days are used to predict the passenger flow in the same period of the 15th day, which means that $x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m$ are used to predict $\hat{x}_n^m$, as shown in Figure 2.
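As a minimal illustration of this 14-day sliding window, the following Python sketch builds the input/target pairs for one O-D pair and one period; the helper name build_samples is hypothetical and not part of the authors' code.

```python
import numpy as np

def build_samples(series, window=14):
    """series: x_1^m, ..., x_N^m for one O-D pair and one period.
    Returns inputs [x_{n-14}^m, ..., x_{n-1}^m] and targets x_n^m."""
    X, y = [], []
    for n in range(window, len(series)):
        X.append(series[n - window:n])  # the previous 14 days
        y.append(series[n])             # the day to be predicted
    return np.array(X), np.array(y)
```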

3.2. Variational Mode Decomposition Model

It can be seen from Figure 1 that the original passenger flow time series is non-stationary, which affects the accuracy of the passenger flow prediction. To better predict the passenger flow and improve the prediction performance of the model, it is necessary to process the data first. In this paper, Variational Mode Decomposition (VMD) is used to process the original passenger flow time series, and $x_n^m$ is decomposed into multiple relatively stable sub-sequences with different frequency scales, $x_{n,1}^m, x_{n,2}^m, \ldots, x_{n,k}^m$, which are the Intrinsic Mode Functions (IMFs). The core idea of VMD is to construct and solve a variational problem. The decomposition process is as follows [34].
  • Each sub-sequence, $x_{n,k}^m$, is processed by the Hilbert transform:
    $$\left( \delta(t) + \frac{j}{\pi t} \right) * x_{n,k}^m(t)$$
    where $\delta(t)$ represents the Dirac distribution function, which is the pulse function, and $*$ represents the convolution operation.
  • The transformed $x_{n,k}^m$ is multiplied by the exponential term $e^{-j\omega_k t}$, tuned to the estimated center frequency, so that the spectrum of each mode is shifted to baseband:
    $$\left[ \left( \delta(t) + \frac{j}{\pi t} \right) * x_{n,k}^m(t) \right] e^{-j\omega_k t}$$
  • The bandwidth of each $x_{n,k}^m$ can be obtained by Gaussian smooth estimation of the demodulated signal, which is the $L^2$ norm of the gradient, and the constrained variational problem is expressed as:
    $$\min_{\{x_{n,k}^m\},\{\omega_k\}} \left\{ \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * x_{n,k}^m(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_k x_{n,k}^m = x_n^m$$
    where $\omega_k$ is the center frequency corresponding to each mode, $k$ is the number of mode components, $x_n^m$ is the original passenger flow time series, and $\partial_t$ denotes taking the partial derivative with respect to $t$.
  • To find the optimal solution of the constrained variational problem in step 3, the problem can be converted into an unconstrained variational problem. A quadratic penalty factor, $\alpha$, and a Lagrange multiplier, $\lambda(t)$, are introduced, and an augmented Lagrangian expression of the following form is constructed:
    $$L\left( \{x_{n,k}^m\}, \{\omega_k\}, \lambda \right) = \alpha \sum_k \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * x_{n,k}^m(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| x_n^m(t) - \sum_k x_{n,k}^m(t) \right\|_2^2 + \left\langle \lambda(t),\; x_n^m(t) - \sum_k x_{n,k}^m(t) \right\rangle$$
  • To solve Equation (4), the saddle point of the above expression can be obtained by the alternating direction method of multipliers in VMD, updating $x_{n,k}^m$, $\omega_k$, and $\lambda$ to find the optimal solution of the constrained variational problem. The update rules are as follows:
    $$\hat{x}_{n,k}^{m,\,c+1}(\omega) = \frac{\hat{x}_n^m(\omega) - \sum_{i \neq k} \hat{x}_{n,i}^{m,\,c}(\omega) + \dfrac{\hat{\lambda}^c(\omega)}{2}}{1 + 2\alpha\left( \omega - \omega_k^c \right)^2}$$
    $$\omega_k^{c+1} = \frac{\int_0^{\infty} \omega \left| \hat{x}_{n,k}^{m,\,c+1}(\omega) \right|^2 \mathrm{d}\omega}{\int_0^{\infty} \left| \hat{x}_{n,k}^{m,\,c+1}(\omega) \right|^2 \mathrm{d}\omega}$$
    $$\hat{\lambda}^{c+1}(\omega) = \hat{\lambda}^c(\omega) + \tau \left[ \hat{x}_n^m(\omega) - \sum_k \hat{x}_{n,k}^{m,\,c+1}(\omega) \right]$$
    where $\hat{x}_n^m(\omega)$, $\hat{x}_{n,k}^m(\omega)$, and $\hat{\lambda}(\omega)$ are the Fourier transforms of $x_n^m(t)$, $x_{n,k}^m(t)$, and $\lambda(t)$, respectively, $\omega$ is the frequency, $c$ is the number of iterations, and $\tau$ is the update parameter of the Lagrange multiplier.
  • Repeating step 5, the judgment condition for stopping the loop iteration is:
    $$\sum_k \frac{\left\| \hat{x}_{n,k}^{m,\,c+1} - \hat{x}_{n,k}^{m,\,c} \right\|_2^2}{\left\| \hat{x}_{n,k}^{m,\,c} \right\|_2^2} < \varepsilon$$
After VMD decomposition, the original passenger flow time series, $x_n^m$, is decomposed into $k$ IMF components, $x_{n,1}^m, x_{n,2}^m, \ldots, x_{n,k}^m$, and the $k$ IMF components obtained in each time period are reconstructed separately. There are several ways of reconstruction, such as component addition, weighted component addition, or adding the high-correlation and low-correlation components separately. This paper adopts the method of component addition, i.e., $x_n^m = x_{n,1}^m + x_{n,2}^m + \cdots + x_{n,k}^m$, which gives the reconstructed passenger flow time series $x_n^m$; then the sequence $x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m$ is used as the input of the forecasting model.
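For illustration, the decomposition-and-reconstruction step can be sketched in Python as follows. The sketch assumes the third-party vmdpy package (whose VMD function takes the signal, the penalty factor alpha, the noise tolerance tau, the number of modes K, a DC flag, an initialization flag, and the tolerance, and returns the modes); the authors do not state which implementation they used, and the helper name is hypothetical.

```python
import numpy as np
from vmdpy import VMD  # assumed third-party VMD implementation

def decompose_and_reconstruct(series, K, alpha, tau=0.0, tol=1e-7):
    """Decompose one passenger flow series into K IMFs and reconstruct it
    by component addition, as described above."""
    f = np.asarray(series, dtype=float)
    # DC=0: no DC mode imposed; init=1: initialization of the center frequencies
    u, u_hat, omega = VMD(f, alpha, tau, K, 0, 1, tol)
    imfs = u                          # the K IMF components (IMF_1, ..., IMF_K)
    reconstructed = imfs.sum(axis=0)  # component addition x_n^m = sum_k x_{n,k}^m
    return imfs, reconstructed
```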

4. Neural Network-Based Hybrid Forecasting Models

The hybrid model can integrate the advantages of multiple models and improve the accuracy of passenger flow forecasting. Based on this, this paper combined the VMD method with a single passenger flow forecasting method, and designed hybrid models to predict the time-varying passenger flow of an intercity high-speed railway, which are the VMD-MLP model, the VMD-GRU model, and the VMD-Bi-LSTM model.

4.1. Hybrid Model Design

The hybrid forecasting models mainly include two steps, as follows.
Step 1: Data preprocessing. First, the data were cleaned and screened for repeated values, abnormal values, and missing values, and then the appropriate hyperparameters were selected based on the decomposition principle of VMD. The original passenger flow time series $x_n^m$ was decomposed into $k$ IMF eigenmode functions to weaken the noise interference on the forecasting models and improve the stability of the input time series, and it was finally reconstructed into a new passenger flow time series $x_n^m$.
Step 2: Passenger flow forecasting. The reconstructed time series $x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m$ was fed into the neural network forecasting model for prediction. First, the training time step, prediction time step, number of training samples, and the division into a training set and a test set were determined. Second, the neural network forecasting model was constructed based on the deep learning framework Keras, and appropriate network hyperparameters were selected, such as the number of hidden neurons and the number of iterations. The output value is $\hat{x}_n^m$. Finally, the predicted value was compared with the real value to calculate the prediction error. Three commonly used prediction errors were adopted, namely the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE), as shown in Formulas (9)–(11), where $s$ is the number of samples in the dataset and $M$ is the number of time periods.
$$RMSE = \frac{1}{M} \sum_{m=1}^{M} \sqrt{\frac{\sum_{n=1}^{s} \left| x_n^m - \hat{x}_n^m \right|^2}{s}}$$
$$MAE = \frac{1}{M} \sum_{m=1}^{M} \frac{1}{s} \sum_{n=1}^{s} \left| x_n^m - \hat{x}_n^m \right|$$
$$MAPE = \frac{1}{M} \sum_{m=1}^{M} \frac{1}{s} \sum_{n=1}^{s} \left| \frac{x_n^m - \hat{x}_n^m}{x_n^m} \right|$$
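A direct NumPy implementation of Formulas (9)–(11) might look like the following sketch; the arrays are assumed to be arranged with one column per time period.

```python
import numpy as np

def prediction_errors(actual, predicted):
    """actual, predicted: arrays of shape (s, M), i.e., s samples and M time periods.
    Returns (RMSE, MAE, MAPE) averaged over the M periods as in Formulas (9)-(11)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    rmse = np.mean(np.sqrt(np.mean(err ** 2, axis=0)))      # Formula (9)
    mae = np.mean(np.mean(np.abs(err), axis=0))             # Formula (10)
    mape = np.mean(np.mean(np.abs(err / actual), axis=0))   # Formula (11)
    return rmse, mae, mape
```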

4.2. Introduction of Neural Network Forecasting Model

4.2.1. Multi-Layer Perceptron Model

The Multi-Layer Perceptron (MLP) is a classic feedforward artificial neural network. To overcome the limitations of the single-layer perceptron, a hidden layer is added between the input layer and the output layer, turning the single-layer perceptron into a multi-layer perceptron in which the input signal propagates forward through the network layer by layer [47]. The MLP model structure used in this paper is shown in Figure 3.
The input vector $x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m$ propagates through the neurons of the input layer of the MLP to all neurons in the hidden layer; the input and output of the $h$-th hidden neuron are $y_h$ and $Y_h$, respectively. The output of the output layer is the predicted value $\hat{x}_n^m$.
$$y_h = \sum_{i=1}^{14} W_{ih}\, x_{n-i}^m$$
$$Y_h = f(y_h)$$
$$\hat{x}_n^m = f\left( \sum_{h=1}^{H} Y_h W_h \right)$$
where $i$ denotes the $i$-th neuron of the input layer, $h$ denotes the $h$-th neuron of the hidden layer, $W_{ih}$ represents the weight between the $i$-th neuron of the input layer and the $h$-th neuron of the hidden layer, and $W_h$ represents the weight between the $h$-th neuron of the hidden layer and the output layer.
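The forward pass defined by these equations can be sketched in a few lines of NumPy (illustrative only; the actual models in this paper were built with the Keras framework, and the activation f and the weight arrays below are placeholders).

```python
import numpy as np

def mlp_forward(x_window, W_ih, W_h, f=np.tanh):
    """x_window: the 14 lagged flows [x_{n-14}^m, ..., x_{n-1}^m].
    W_ih: (14, H) input-to-hidden weights; W_h: (H,) hidden-to-output weights."""
    y_h = x_window @ W_ih          # hidden-layer inputs y_h
    Y_h = f(y_h)                   # hidden-layer outputs Y_h = f(y_h)
    x_hat = f(Y_h @ W_h)           # predicted value x_hat_n^m
    return x_hat
```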

4.2.2. Gated Recurrent Unit Neural Network Model

The Gated Recurrent Unit Neural Network (GRU) is an improved version of the Recurrent Neural Network. The GRU model has two gates, a reset gate and an update gate. The reset gate controls how much past information is forgotten, and the update gate both forgets old information and selects what to remember [47], as shown in Figure 4. The GRU model uses a simpler gate mechanism, which makes learning and prediction faster.
In Figure 4, $t$ is the iteration time, $H_{t-1}$ is the hidden state at the previous time, $H_t$ is the current output, $\sigma$ and $\tanh$ are activation functions, $r_t$ is the reset gate, $z_t$ is the update gate, and $\tilde{H}_t$ is the hidden layer information, as follows:
$$\sigma(t) = \frac{1}{1 + e^{-t}}$$
$$\tanh(t) = \frac{e^{t} - e^{-t}}{e^{t} + e^{-t}}$$
The output of the function $\sigma(t)$ does not consider the information learned at the previous time. The $\tanh(t)$ function compresses the previously learned information to stabilize its value. The calculation process of the GRU model includes the following key steps:
  • Calculation of the reset gate, $r_t$. Here $x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m$ are the inputs of the current cell. The reset gate controls how much information from the previous cell needs to be forgotten. It reads $H_{t-1}$ and $x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m$, and the process can be expressed as:
    $$r_t = \sigma\left( W_r \cdot \left[ H_{t-1}, x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m \right] + b_r \right)$$
    where $W_r$ is the weight matrix of the reset gate and $b_r$ is the bias.
  • Calculation of the update gate, $z_t$. The update gate determines which information from the previous cell is to be discarded and which new information is to be added to the current cell of the GRU model, reducing the risk of vanishing gradients. This process can be expressed as:
    $$z_t = \sigma\left( W_z \cdot \left[ H_{t-1}, x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m \right] + b_z \right)$$
    where $W_z$ is the weight matrix of the update gate and $b_z$ is the bias.
  • Calculation of the hidden layer information, $\tilde{H}_t$. The information of the cell passing through the reset gate, $r_t$, together with the inputs $x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m$, is processed by the tanh function, and the output is the hidden layer information. The process can be expressed as:
    $$\tilde{H}_t = \tanh\left( W_{\tilde{H}} \cdot \left[ r_t \odot H_{t-1}, x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m \right] + b_H \right)$$
    where $W_{\tilde{H}}$ is the weight matrix and $b_H$ is the bias.
  • Calculation of the output, $H_t$. The previous hidden state $H_{t-1}$ is weighted by $(1 - z_t)$, the candidate state $\tilde{H}_t$ is weighted by $z_t$, and the two are added to obtain $H_t$. Specifically, this can be expressed as:
    $$H_t = (1 - z_t) \odot H_{t-1} + z_t \odot \tilde{H}_t$$
The above four steps form the loop body of the GRU model. The number of iterations is usually determined experimentally. The output $H_t$ obtained after completing the given number of iterations is the final predicted value, $\hat{x}_n^m$.
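To make the gate equations concrete, a single GRU step can be written directly in NumPy as below; the weights and biases are illustrative placeholders, and the actual models in this paper use the Keras GRU layer.

```python
import numpy as np

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(H_prev, x_window, W_r, W_z, W_H, b_r, b_z, b_H):
    """One GRU update following the reset/update-gate equations above.
    H_prev: previous hidden state; x_window: inputs x_{n-14}^m ... x_{n-1}^m."""
    concat = np.concatenate([H_prev, x_window])
    r_t = sigma(W_r @ concat + b_r)                 # reset gate
    z_t = sigma(W_z @ concat + b_z)                 # update gate
    concat_r = np.concatenate([r_t * H_prev, x_window])
    H_tilde = np.tanh(W_H @ concat_r + b_H)         # candidate hidden state
    return (1.0 - z_t) * H_prev + z_t * H_tilde     # new hidden state H_t
```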

4.2.3. Bi-Directional Long Short-Term Memory Model

The Bi-Directional Long Short-Term Memory Model (Bi-LSTM) is a model designed based on the LSTM model. This model can obtain the data feature information in both directions of the hidden layer in the calculation process, which helps to improve the prediction accuracy. The structure of the Bi-LSTM model is shown in Figure 5, which contains six weight matrices, named $W_1$–$W_6$. The forward layer performs forward calculation from time 1 to time $t$ to obtain and save data at each time, while the backward layer reverses the calculation from time $t$ to time 1 to obtain and save the data at each time. $h_t$ is the final output value.
The operation formulas of the Bi-LSTM model are as follows, where $\overrightarrow{s}_t$ and $\overleftarrow{s}_t$ denote the forward and backward hidden states, respectively:
$$\overrightarrow{s}_t = f\left( W_1 \left[ x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m \right] + W_2\, \overrightarrow{s}_{t-1} \right)$$
$$\overleftarrow{s}_t = f\left( W_3 \left[ x_{n-14}^m, x_{n-13}^m, \ldots, x_{n-1}^m \right] + W_5\, \overleftarrow{s}_{t+1} \right)$$
$$h_t = g\left( W_4\, \overrightarrow{s}_t + W_6\, \overleftarrow{s}_t \right)$$
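A sketch of the corresponding Keras model is given below, following the settings reported later in Section 5.2 (a 14-step input window, one bidirectional hidden layer, a single-step output, and the Adam optimizer). The mean squared error is used here in place of the RMSE loss mentioned in Section 5.2, and the builder function name is hypothetical.

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dense

def build_bilstm(hidden_neurons=64, window=14):
    """One bidirectional LSTM hidden layer over the 14-day input window."""
    model = Sequential([
        Input(shape=(window, 1)),            # 14 lagged flows, one feature
        Bidirectional(LSTM(hidden_neurons)),
        Dense(1),                            # single-step prediction x_hat_n^m
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Usage (X: (samples, 14, 1) windows of the reconstructed series, y: (samples,)):
# model = build_bilstm(); model.fit(X, y, epochs=50, verbose=0)
```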

5. Experimental Analysis

Based on the historical ticket sale data of the Guangzhou–Zhuhai intercity high-speed railway in China from June 2013 to June 2016, five typical O-Ds were selected for experimental analysis: Guangzhou South railway station to Zhongshan North railway station (D1), Guangzhou South railway station to Zhuhai railway station (D2), Guangzhou South railway station to Xiaolan railway station (D3), Guangzhou South railway station to Zhuhai North railway station (D4), and Zhongshan North railway station to Zhuhai North railway station (D5). Data during national legal holidays were deleted and a total of 1047 days of historical ticket sale data remained. The first 80% (838 days) of the dataset was selected as the training set, and the last 20% (209 days) was selected as the test set.
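The chronological 80/20 split described above can be written as a short sketch; here series stands for one cleaned O-D passenger flow series of 1047 days.

```python
split = round(0.8 * 1047)                 # 838 training days, 209 test days
train, test = series[:split], series[split:]
```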

5.1. Decomposition Results of VMD Model

The VMD method was used to decompose the non-stationary original passenger flow time series $\{x_n^m\}$ into stable sub-sequences $\{x_{n,1}^m\}, \{x_{n,2}^m\}, \ldots, \{x_{n,k}^m\}$ for the five selected O-Ds. When the number of modes, $k$, is small, some important information in the original passenger flow time series $\{x_n^m\}$ will be filtered out, affecting the accuracy of the subsequent prediction. However, when the number of modes, $k$, is large, it will lead to repeated modes or extra noise, and excessive decomposition will lead to overlapping signal components. After many experiments, the number of modes, $k$, of the VMD model was set to 10, and the output sub-sequences were $\{x_{n,1}^m\}, \{x_{n,2}^m\}, \ldots, \{x_{n,10}^m\}$, that is, $IMF_1, IMF_2, \ldots, IMF_{10}$. The bandwidth (penalty) parameter $\alpha$ was 7000, the noise tolerance, $\tau$, was 0, no DC component was imposed, and the control error constant, tol, was $e^{-7}$. The VMD model decomposed the four passenger flow time series under different time granularities for each O-D. The IMF components are listed from low to high frequency. The decomposition results of the passenger flow time series $\{x_1^1, x_2^1, \ldots, x_{1047}^1\}$ from Guangzhou South to Zhuhai in time period 1 are shown in Figure 6.
The low-frequency component IMF1 mainly reflects the overall change trend of the original passenger flow sequence. Several peaks in IMF2 were caused by fluctuations around the holiday periods that were excluded from the data. The middle-frequency components IMF3 to IMF6 exhibit certain periodic characteristics. The high-frequency components IMF7 to IMF10 mainly reflect the remaining fluctuation characteristics of the original passenger flow sequence, without obvious periodicity.
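Under these settings, the decomposition of one series reduces to a single call of the sketch from Section 3.2 (again assuming the vmdpy package; the e−7 tolerance is interpreted here as 1e-7).

```python
from vmdpy import VMD  # assumed third-party VMD implementation

K, alpha, tau, tol = 10, 7000, 0.0, 1e-7
# series: {x_1^1, ..., x_1047^1} for Guangzhou South-Zhuhai, period 1
u, u_hat, omega = VMD(series, alpha, tau, K, 0, 1, tol)
# u: the 10 IMF components; omega: their center frequencies
```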

5.2. Prediction Results of Hybrid Model

To accurately predict the passenger flow time series, for the datasets of different O-Ds, the ranges of the number of iterations, the optimizer, the number of hidden neurons, and the number of hidden layers were adjusted appropriately during training to obtain better prediction results. For the GRU and Bi-LSTM models, the parameters were set as follows: the time step was 14, the prediction step was 1, the number of iterations was in the interval [10, 100], the learning rate was 0.01, the loss function was RMSE, the optimizer was the Adam optimizer, and the number of hidden layers was 1. For the MLP model, the numbers of neurons in the input layer and the output layer were 14 and 1, respectively; the other hyperparameter settings were the same as for GRU and Bi-LSTM. There are many methods to determine the number of hidden neurons in neural networks, such as the empirical method, repeated experiment method, growth method, and genetic algorithm [47]. In this paper, the empirical method was used: the number of neurons in the hidden layer was set to $2^i$, where $i$ is a positive integer.
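The tuning procedure implied by these settings can be sketched as a small grid search; the hidden-layer sizes and iteration counts below mirror those reported in Tables 1–4 (64/128/256 neurons and 10/50/100 iterations), build_bilstm and prediction_errors are the sketches introduced earlier, and X_train, y_train, X_test, and y_test are assumed to be prepared beforehand.

```python
import itertools

best = None
for nhn, ni in itertools.product([64, 128, 256], [10, 50, 100]):
    model = build_bilstm(hidden_neurons=nhn)        # or an MLP / GRU variant
    model.fit(X_train, y_train, epochs=ni, verbose=0)
    y_pred = model.predict(X_test, verbose=0)
    rmse, mae, mape = prediction_errors(y_test.reshape(-1, 1), y_pred)
    if best is None or mape < best[0]:
        best = (mape, nhn, ni)                      # keep the setting with minimum MAPE
```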
Based on the above parameter settings, the hybrid models VMD-MLP, VMD-GRU, and VMD-Bi-LSTM were used to predict the time-varying passenger flow of the Guangzhou–Zhuhai intercity high-speed railway. The comparison between the prediction value (the prediction value corresponding to the hybrid model with the smallest MAPE error was selected) and the actual value from Guangzhou South station to Zhongshan North station under different time granularities every day in a week is shown in Figure 7.
In Figure 7, the distribution characteristics of the passenger flow prediction values in each period are consistent with the actual values overall, which shows that the hybrid models can better fit the time-varying characteristics of passenger travel demand and are suitable for predicting the time-varying passenger flow of the intercity high-speed railway.
Table 1, Table 2, Table 3 and Table 4 show the prediction errors of the three hybrid models corresponding to different parameters at the four time granularities for each O-D pair, where the minimum prediction error of each model for each O-D pair is underlined, NHN represents the number of hidden neurons, and NI represents the number of iterations. For convenience, the final hyperparameter settings of the three forecasting models under different time granularities, chosen where MAPE reached its minimum, are shown in Table 5. Figure 8 shows a comparison of the minimum RMSE, MAE, and MAPE prediction errors of the three hybrid models for each O-D pair under different time granularities.
The following findings were obtained based on the above comparison results from Table 1, Table 2, Table 3, Table 4 and Table 5 and Figure 8.
  • When the number of hidden neurons was constant, the error tended to decrease as the number of iterations increased, and when the number of iterations was constant, the error also tended to decrease as the number of hidden neurons increased. However, once the two exceeded a certain amount, the error increased, which indicates that more hidden neurons and more iterations are not always better. Too few hidden neurons or too few iterations led to underfitting, while too many hidden neurons or too many iterations led to overfitting and increased the training time.
  • The minimum MAPE error of the three hybrid models can generally be controlled within 10%, which indicates that the prediction accuracy of the three hybrid models is high. The MAPE prediction error of O-D D2 under time granularity Δt = 1 was slightly higher. This can be attributed to the complexity of the time-varying characteristics of the passenger flow of O-D D2 under this time granularity, which reduced the applicability of the model.
  • With the increase of time granularity, RMSE and MAE errors tended to gradually increase, while MAPE error tended to gradually decrease. The reasons are as follows: (i) The larger the time granularity was, the larger the passenger flow in each time period was, according to Equations (9) and (10), and then the larger the corresponding RMSE and MAE errors were. (ii) The smaller the time granularity was, the more fully the time-varying characteristics of passenger flow were reflected, and some irregular fluctuation characteristics easily occurred, which was not convenient for the extraction of hybrid prediction models, resulting in the decrease of the prediction accuracy, that is, the increase of the MAPE prediction error.
  • Under different parameters, the errors of the VMD-GRU and VMD-Bi-LSTM models fluctuated greatly, and the errors of the VMD-MLP model changed relatively smoothly. For passenger flow under a smaller time granularity, the prediction error of the VMD-MLP model was smaller, while for the passenger flow under a larger time granularity, the prediction errors of the VMD-GRU and VMD-Bi-LSTM models were smaller. This shows that the applicability of the models was different under different time granularities.
To measure the prediction performance, seven baselines (VMD-LSTM, VMD-ARIMA, MLP, GRU, Bi-LSTM, LSTM, and ARIMA) were adopted to compare with the proposed models, and the results for O-D D1 and Δ t = 1 are shown in Table 6. From Table 6, the prediction accuracy of the VMD-MLP, VMD-GRU, and VMD-Bi-LSTM models was obviously higher than the baselines and the hybrid models performed better than the single models. Hence, the proposed neural network-based hybrid models outperformed the existing models for predicting the time-varying passenger flow of an intercity high-speed railway.

6. Conclusions

In this paper, three neural network-based hybrid models: VMD-MLP, VMD-GRU, and VMD-Bi-LSTM, were designed to predict the time-varying passenger flow of an intercity high-speed railway. According to the historical ticket sale data of the Guangzhou–Zhuhai intercity high-speed railway in China, the time-varying characteristics of passenger travel demand were analyzed. The non-stationary passenger flow time series was decomposed into several stationary passenger flow time series by the VMD method, and three neural network forecasting models, MLP, GRU, and Bi-LSTM, were then used to predict the decomposed time series to obtain the passenger travel demand of O-D pairs at different time periods of the day under different time granularities. After experimental analysis, the following conclusions were drawn.
  • The number of hidden neurons and the number of iterations of the neural network had a great impact on the prediction error of the hybrid models. Within a certain value range, with the increase of the number of hidden neurons and the number of iterations, the error tended to decrease, but when it increased to a certain extent, the error tended to increase. It is necessary to calibrate the parameters in combination with the change of prediction errors.
  • The prediction accuracies of the three hybrid models were high, and the MAPE error could generally be controlled within 10%. With the increase of time granularity, RMSE and MAE errors tended to gradually increase, while the MAPE error tended to gradually decrease.
  • In the optimization design of the model parameters, the errors of the VMD-MLP model fluctuated less, and it performed more smoothly than the VMD-GRU and VMD-Bi-LSTM models. The VMD-MLP model had a higher accuracy in passenger flow forecasting under a smaller time granularity, and the VMD-GRU and VMD-Bi-LSTM models had a higher accuracy in passenger flow forecasting under a larger time granularity.
  • The proposed neural network-based hybrid models outperformed the existing models for predicting the time-varying passenger flow of an intercity high-speed railway. The hybrid models performed better than the single models.
In this study, only historical time series data were used to predict passenger flow; in a real-world setting, multi-source data could be incorporated to improve the prediction accuracy by fusing the spatial and temporal features of passenger flow. In addition, only neural network-based hybrid models were proposed, while other prediction models, such as deep learning approaches with more flexible architectures, also exist. In the future, we plan to design a deep learning approach that considers multi-source data, such as weather, spatial structure, and temporal features, to predict the time-varying passenger flow of the high-speed railway or a large-scale metro system.

Author Contributions

Conceptualization, H.S.; data curation, H.S. and S.P.; formal analysis, H.S. and S.P.; funding acquisition, H.S.; investigation, H.S. and S.P.; methodology, H.S. and S.P.; resources, H.S.; validation, H.S. and S.P.; writing—original draft, H.S.; writing—review and editing, H.S., S.P., S.M. and K.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the High-Level Personnel Research Start Foundation of Wuyi University (2017RC51) and the Science and Technology Planning Project of Jiangmen (2022030100030002348).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kaspi, M.; Raviv, T. Service-oriented line planning and timetabling for passenger trains. Oper. Res. Manag. Sci. 2013, 47, 295–311. [Google Scholar] [CrossRef]
  2. Niu, H.; Zhou, X.; Gao, R. Train scheduling for minimizing passenger waiting time with time-dependent demand and skip-stop patterns: Nonlinear integer programming models with linear constraints. Transp. Res. Part B 2015, 76, 117–135. [Google Scholar] [CrossRef]
  3. Su, H.Y.; Tao, W.C.; Hu, X.L. A line planning approach for high-speed rail networks with time-dependent demand and capacity constraints. Math. Probl. Eng. 2019, 2019, 7509586. [Google Scholar] [CrossRef]
  4. Xu, G.M.; Yang, H.; Liu, W.; Shi, F. Itinerary choice and advance ticket booking for high-speed-railway network services. Transp. Res. Part C 2018, 95, 82–104. [Google Scholar] [CrossRef]
  5. Su, H.Y.; Peng, S.T.; Deng, L.B.; Xu, W.X.; Zeng, Q.F. Optimal differential pricing for intercity high-speed Railway services with time-dependent demand and passenger choice behaviors under capacity constraints. Math. Probl. Eng. 2021, 2021, 8420206. [Google Scholar] [CrossRef]
  6. Tsai, T.H.; Lee, C.K.; Wei, C.H. Neural network based temporal feature models for short-term railway passenger demand forecasting. Expert Syst. Appl. Int. J. 2009, 36, 3728–3736. [Google Scholar] [CrossRef]
  7. Li, H.J.; Zhang, Y.Z.; Zhu, C.F. Forecasting of railway passenger flow based on Grey model and monthly proportional coefficient. In Proceedings of the 2012 IEEE Symposium on Robotics and Applications (ISRA), Kuala Lumpur, Malaysia, 3–5 June 2012; Volume 18, pp. 23–26. [Google Scholar]
  8. Zhu, R.Q.; Zhou, H.Y. Railway passenger flow forecast based on hybrid PVAR-NN Model. In Proceedings of the 2020 IEEE 5th International Conference on Intelligent Transportation Engineering (ICITE), Beijing, China, 11–13 September 2020; pp. 190–194. [Google Scholar]
  9. Tang, L.; Yang, Z.; Cabrera, J.; Jian, M.; Kwok, L.T. Forecasting short-term passenger flow: An empirical study on Shenzhen metro. IEEE Trans. Intell. Transp. Syst. 2018, 99, 3613–3622. [Google Scholar] [CrossRef]
  10. Li, L.; Wang, Y.; Zhong, G.; Zhang, J.; Ran, B. Short-to-medium term passenger flow forecasting for metro stations using a hybrid model. Transp. Eng. 2017, 22, 1937–1945. [Google Scholar] [CrossRef]
  11. Smith, B.L.; Demetsky, M.J. Traffic flow forecasting: Comparison of modeling approaches. J. Transp. Eng. 1997, 123, 261–266. [Google Scholar] [CrossRef]
  12. Williams, B.M.; Durvasula, P.K.; Brown, D.E. Urban freeway traffic flow prediction: Application of seasonal autoregressive integrated moving average and exponential smoothing models. Transp. Res. Record. J. Transp. Res. Board 1998, 1644, 132–141. [Google Scholar] [CrossRef]
  13. Williams, B.M.; Hoel, L.A. Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: Theoretical basis and empirical results. J. Transp. Eng.-ASCE (Am. Soc. Civ. Eng.) 2003, 129, 664–672. [Google Scholar] [CrossRef] [Green Version]
  14. Jia, R.; Li, Z.; Xia, Y.; Zhu, J.; Ma, N.; Chai, H.; Liu, Z. Urban road traffic condition forecasting based on sparse ride-hailing service data. IET Intell. Transp. Syst. 2020, 14, 668–674. [Google Scholar] [CrossRef]
  15. Chen, Q.C.; Wen, D.; Li, X.Q.; Chen, D.J.; Lv, H.X.; Zhang, J.; Gao, P. Empirical mode decomposition based long short-term memory neural network forecasting model for the short-term metro passenger flow. PLoS ONE 2019, 14, e0222365. [Google Scholar] [CrossRef] [PubMed]
  16. Sun, Y.K.; Cao, Y.; Zhou, M.J.; Wen, T.; Li, P.; Roberts, C. A hybrid method for life prediction of railway relays based on Multi-Layer Decomposition and RBFNN. IEEE Access 2019, 7, 44761–44770. [Google Scholar] [CrossRef]
  17. Smith, B.L.; Williams, B.M.; Oswald, K.R. Comparison of parametric and nonparametric models for traffic flow forecasting. Transp. Res. Part C Emerg. Technol. 2002, 10, 303–321. [Google Scholar] [CrossRef]
  18. Milenkovi, M.; Libor, S.; Melichar, V.; Nebojša, B.; Zoran, A.V. SARIMA modeling approach for railway passenger flow forecasting. Transport 2016, 7, 1–8. [Google Scholar] [CrossRef] [Green Version]
  19. Zhang, Z.; Cheng, W.; Gao, Y. Passenger flow forecast of rail station based on multi-source data and long short term memory network. IEEE Access 2020, 8, 28475–28483. [Google Scholar] [CrossRef]
  20. Jing, Z.C.; Yin, X.L. Neural network-based prediction model for passenger flow in a large passenger station: An exploratory study. IEEE Access 2020, 8, 36876–36884. [Google Scholar] [CrossRef]
  21. Peng, K.B.; Bai, W.; Wu, L.Y. Passenger flow forecast of railway station based on improved LSTM. In Proceedings of the 2020 2nd International Conference on Advances in Computer Technology, Information Science and Communications (CTISC), Suzhou, China, 20–22 March 2020; pp. 166–170. [Google Scholar]
  22. Toque, F.; Come, E.; Oukhellou, L.; Trepanier, M. Short-term multi-step ahead forecasting of railway passenger flows during special events with machine learning methods. Open Sci. 2018, 9, 1–16. [Google Scholar]
  23. Jérémy, R.; Gérald, G.; Stéphane, B. A dynamic Bayesian network approach to forecast short-term urban rail passenger flows with incomplete data. Transp. Res. Procedia 2017, 26, 53–61. [Google Scholar]
  24. Marc, C.G.; Peter, B.; Dieter, W. A new direct demand model of long-term forecasting air passengers and air transport movements at German airports. J. Air Transp. Manag. 2018, 71, 140–152. [Google Scholar]
  25. Ulrich, G.; Bozana, Z. Forecasting air passenger numbers with a GVAR model. Ann. Tour. Res. 2021, 89, 103252. [Google Scholar]
  26. Chan, L.L.; Yash, G.; Sameer, A. Air passenger forecasting using Neural Granger causal google trend queries. J. Air Transp. Manag. 2021, 95, 102083. [Google Scholar]
  27. Jiang, X.S.; Zhang, L.; Chen, X.Q. Short-term forecasting of high-speed rail demand: A hybrid approach combining ensemble empirical mode decomposition and gray support vector machine with real-world applications in China. Transp. Res. Part C 2014, 44, 110–127. [Google Scholar] [CrossRef]
  28. Zhao, S.; Mi, X.W. A novel hybrid model for short-term high-speed railway passenger demand forecasting. IEEE Access 2019, 7, 175681–175692. [Google Scholar] [CrossRef]
  29. Sun, S.L.; Yang, D.C.; Guo, J.E.; Wang, S.Y. AdaEnsemble learning approach for metro passenger flow forecasting. Comput. Sci. 2020, 07575, 1–21. [Google Scholar]
  30. Xin, Y.; Xue, Q.C.; Yang, X.X.; Yin, H.D.; Qu, Y.C.; Li, X.; Wu, J.J. A novel prediction model for the inbound passenger flow of urban rail transit. Inf. Sci. 2021, 566, 347–363. [Google Scholar]
  31. Wei, Y.; Chen, M.C. Forecasting the short-term metro passenger flow with empirical mode decomposition and neural networks. Transp. Res. Part C 2012, 21, 148–162. [Google Scholar] [CrossRef]
  32. Jin, F.; Li, Y.W.; Sun, S.L.; Li, H.T. Forecasting air passenger demand with a new hybrid ensemble approach. J. Air Transp. Manag. 2020, 83, 1–18. [Google Scholar] [CrossRef]
  33. Huang, N.; Shen, Z.; Long, S.; Wu, M.L.C. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Math. Phys. Eng. Sci. 1998, 454, 903–995. [Google Scholar] [CrossRef]
  34. Dragomiretskiy, K.; Zosso, D. Variational Mode Decomposition. IEEE Trans. Signal Process. 2014, 62, 531–544. [Google Scholar] [CrossRef]
  35. Mohanty, S.; Gupta, K.; Raju, K. Comparative study between VMD and EMD in bearing fault diagnosis. In Proceedings of the 9th IEEE International Conference on Industrial and Information Systems (ICIIS2014), Gwalior, India, 15–17 December 2014; pp. 1–6. [Google Scholar]
  36. Yue, Y.; Sun, G.; Cai, Y.; Chen, R.; Wang, X.; Zhang, S. Comparison of performances of variational mode decomposition and empirical mode decomposition. In Proceedings of the 2016 3rd International Conference on Energy Science and Applied Technology, Jaipur, India, 21–24 September 2016; pp. 469–476. [Google Scholar]
  37. Wardana, A. A comparative study of EMD, EWT and VMD for detecting the oscillation in control loop. In Proceedings of the 2016 International Seminar on Application for Technology of Information and Communication (ISemantic), Semarang, Indonesia, 5–6 August 2016; pp. 1907–1910. [Google Scholar]
  38. Wen, K.Y.; Zhao, G.T.; He, B.S.; He, B.S.; Zhang, H.S. A decomposition-based forecasting method with transfer learning for railway short-term passenger flow in holidays. Expert Syst. Appl. 2022, 189, 116102. [Google Scholar] [CrossRef]
  39. Glisovic, N.; Milenkovi, M.; Bojovi, N. Comparison of SARIMA-GA-ANN and SARIMA-ANN for prediction of the railway passenger flows. In Proceedings of the 4th International Symposium and 26th National Conference on Operational Research, Chania, Greece, 4 June 2015; Volume 6, pp. 1–5. [Google Scholar]
  40. Jérémy, R.; Stéphane, B.; Gérald, G. Dynamic bayesian networks with gaussian mixture models for short-term passenger flow forecasting. In Proceedings of the 2017 12th International Conference on Intelligent Systems and Knowledge Engineering (ISKE), Nanjing, China, 24–26 November 2017; pp. 1–8. [Google Scholar]
  41. Ma, D.L.; Guo, Y.T.; Ma, S.Z. Short-term subway passenger flow prediction based on GCN-Bi-LSTM. IOP Conf. Ser. Earth Environ. Sci. 2021, 693, 012005. [Google Scholar] [CrossRef]
  42. Zhang, J.L.; Chen, F.; Shen, Q. Cluster-based LSTM Network for short-term passenger flow forecasting in urban rail transit. IEEE Access 2019, 7, 147653–147671. [Google Scholar] [CrossRef]
  43. Robert, F.; Yingqi, W.; Suzilah, I. Evaluating the forecasting performance of econometric models of air passenger traffic flows using multiple error measures. Int. J. Forecast. 2011, 27, 902–922. [Google Scholar]
  44. Kim, S.; Shin, D.H. Forecasting short-term air passenger demand using big data from search engine queries. Autom. Constr. 2016, 70, 98–108. [Google Scholar] [CrossRef]
  45. Rodrigo, A.S. Forecasting air passengers at São Paulo International Airport using a mixture of local experts model. J. Air Transp. Manag. 2013, 26, 35–39. [Google Scholar]
  46. Wai, H.K.T.; Hatice, O.B.; Andrew, G.; Hamish, G. Forecasting of Hong Kong airport’s passenger throughput. Tour. Manag. 2014, 42, 62–76. [Google Scholar]
  47. Haykin, S. Neural Networks and Learning Machines; Pearson Press: London, UK, 2008. [Google Scholar]
Figure 1. Passenger flow distribution. (a) Passenger flow distribution under time granularity Δt = 1, (b) Passenger flow distribution under time granularity Δt = 3, (c) Passenger flow distribution under time granularity Δt = 5, (d) Passenger flow distribution under time granularity Δt = 16.
Figure 2. Passenger flow time series prediction process.
Figure 3. MLP structure.
Figure 4. GRU structure.
Figure 5. Bi-LSTM structure.
Figure 6. VMD decomposition results of Guangzhou South railway station–Zhuhai railway station (period 1). (a) VMD decomposition results under time granularity Δt = 1, (b) VMD decomposition results under time granularity Δt = 3, (c) VMD decomposition results under time granularity Δt = 5, (d) VMD decomposition results under time granularity Δt = 16.
Figure 7. Comparison between the prediction value and the actual value under different time granularities. (a) Prediction results of model VMD-MLP under time granularity Δt = 1, (b) Prediction results of model VMD-MLP under time granularity Δt = 3, (c) Prediction results of model VMD-MLP under time granularity Δt = 5, (d) Prediction results of model VMD-Bi-LSTM under time granularity Δt = 16.
Figure 8. Comparison of the minimum values of the RMSE, MAE, and MAPE errors. (a) The minimum RMSE prediction error of each model for each O-D pair under different time granularities, (b) The minimum MAE prediction error of each model for each O-D pair under different time granularities, (c) The minimum MAPE prediction error of each model for each O-D pair under different time granularities.
Table 1. Prediction error of 1 h (RMSE/MAE/MAPE).
Model | NHN | NI | D1 | D2 | D3 | D4 | D5
VMD-MLP | 64 | 10 | 18.7/14.6/0.10 | 67.1/53.7/0.21 | 22.3/17.6/0.13 | 5.5/4.3/0.05 | 2.8/2.2/0.02
VMD-MLP | 64 | 50 | 21.3/16.5/0.09 | 60.3/48.2/0.25 | 19.5/15.7/0.12 | 5.6/4.4/0.05 | 1.9/1.6/0.01
VMD-MLP | 64 | 100 | 17.2/13.6/0.11 | 47.0/35.8/0.16 | 22.3/17.8/0.14 | 5.0/3.9/0.04 | 3.6/2.9/0.04
VMD-MLP | 128 | 10 | 20.2/16.0/0.11 | 52.1/41.0/0.19 | 14.4/11.2/0.08 | 5.3/4.2/0.05 | 8.5/8.2/0.08
VMD-MLP | 128 | 50 | 22.6/17.8/0.09 | 54.7/43.2/0.22 | 14.8/11.7/0.09 | 5.0/4.0/0.05 | 1.9/1.5/0.01
VMD-MLP | 128 | 100 | 19.4/15.6/0.13 | 45.4/34.6/0.16 | 20.0/16.0/0.12 | 5.0/4.0/0.05 | 3.6/3.2/0.04
VMD-MLP | 256 | 10 | 19.4/15.5/0.12 | 54.6/43.0/0.22 | 16.0/12.8/0.10 | 5.6/4.3/0.05 | 4.8/3.9/0.05
VMD-MLP | 256 | 50 | 17.8/13.6/0.09 | 48.0/37.4/0.18 | 15.5/12.1/0.09 | 5.5/4.4/0.06 | 2.5/2.1/0.02
VMD-MLP | 256 | 100 | 17.5/13.7/0.11 | 55.8/43.9/0.21 | 32.6/28.2/0.22 | 5.3/4.3/0.05 | 4.9/4.0/0.06
VMD-GRU | 64 | 10 | 75.7/72.6/0.64 | 141.3/132.6/0.74 | 52.0/49.6/0.39 | 6.3/4.9/0.07 | 6.0/6.0/0.08
VMD-GRU | 64 | 50 | 32.5/26.3/0.17 | 66.4/53.3/0.28 | 104.3/102.6/0.81 | 13.1/12.0/0.15 | 3.9/3.9/0.05
VMD-GRU | 64 | 100 | 21.5/17.3/0.13 | 73.0/59.0/0.31 | 39.3/36.1/0.28 | 12.3/10.7/0.13 | 2.3/2.0/0.02
VMD-GRU | 128 | 10 | 25.6/20.4/0.14 | 146.2/131.3/0.68 | 28.3/24.1/0.19 | 26.3/25.6/0.32 | 2.4/2.1/0.03
VMD-GRU | 128 | 50 | 31.4/27.4/0.27 | 69.6/55.0/0.27 | 49.0/40.3/0.32 | 5.9/4.7/0.06 | 6.3/6.2/0.08
VMD-GRU | 128 | 100 | 25.1/19.9/0.15 | 77.6/59.7/0.31 | 36.9/27.3/0.21 | 15.0/12.9/0.15 | 1.6/1.3/0.01
VMD-GRU | 256 | 10 | 19.2/14.8/0.10 | 113.2/97.4/0.53 | 22.6/19.6/0.15 | 20.1/19.4/0.25 | 1.5/1.3/0.01
VMD-GRU | 256 | 50 | 24.2/19.3/0.13 | 140.2/130.7/0.62 | 22.1/18.0/0.14 | 9.7/8.4/0.11 | 1.7/1.4/0.02
VMD-GRU | 256 | 100 | 24.1/18.6/0.10 | 73.8/59.0/0.30 | 29.8/25.3/0.19 | 8.9/7.2/0.09 | 6.0/5.4/0.07
VMD-Bi-LSTM | 64 | 10 | 19.1/14.7/0.11 | 51.1/39.4/0.20 | 13.4/10.7/0.08 | 6.2/4.8/0.06 | 1.9/1.5/0.02
VMD-Bi-LSTM | 64 | 50 | 22.0/16.3/0.12 | 57.2/44.3/0.22 | 7.4/3.9/0.03 | 6.1/4.7/0.06 | 1.1/0.9/0.01
VMD-Bi-LSTM | 64 | 100 | 24.4/16.3/0.13 | 69.5/53.6/0.31 | 13.3/4.2/0.03 | 6.1/4.7/0.06 | 4.8/3.0/0.04
VMD-Bi-LSTM | 128 | 10 | 19.7/14.9/0.11 | 59.2/44.9/0.29 | 11.9/9.0/0.07 | 5.9/4.6/0.06 | 7.4/6.6/0.07
VMD-Bi-LSTM | 128 | 50 | 22.3/17.1/0.14 | 64.9/49.6/0.22 | 10.2/7.3/0.05 | 6.3/5.0/0.06 | 2.2/1.8/0.02
VMD-Bi-LSTM | 128 | 100 | 24.0/18.8/0.15 | 77.6/53.9/0.23 | 4.6/3.0/0.02 | 6.6/5.1/0.06 | 5.8/3.1/0.04
VMD-Bi-LSTM | 256 | 10 | 19.8/15.1/0.11 | 56.1/43.2/0.21 | 14.7/10.7/0.08 | 6.8/5.4/0.07 | 5.5/2.8/0.04
VMD-Bi-LSTM | 256 | 50 | 22.4/17.3/0.15 | 67.1/52.5/0.24 | 9.5/5.8/0.04 | 5.9/4.6/0.06 | 3.9/2.9/0.04
VMD-Bi-LSTM | 256 | 100 | 24.9/19.7/0.18 | 62.1/48.4/0.23 | 13.5/8.8/0.06 | 6.2/4.8/0.06 | 9.9/5.8/0.08
Note: The minimum prediction error of each model for each O-D pair is underlined.
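The RMSE/MAE/MAPE triplets reported in Tables 1–4 follow the standard definitions of these error measures. The sketch below shows one way to compute them with NumPy; it is a generic illustration with hypothetical flow values, not the authors' evaluation code.

```python
# Standard error measures underlying the RMSE/MAE/MAPE columns (illustrative).
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred):
    # Expressed as a fraction (e.g., 0.09), matching the tables.
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))

y_true = np.array([120.0, 150.0, 130.0])   # hypothetical observed passenger flows
y_pred = np.array([110.0, 160.0, 128.0])   # hypothetical predicted passenger flows
print(rmse(y_true, y_pred), mae(y_true, y_pred), mape(y_true, y_pred))
```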
Table 2. Prediction error of 3 h (RMSE/MAE/MAPE).

| Model | NHN | NI | D1 | D2 | D3 | D4 | D5 |
|---|---|---|---|---|---|---|---|
| VMD-MLP | 64 | 10 | 21.1/15.8/0.02 | 170.6/130.3/0.11 | 31.5/24.0/0.03 | 14.8/4.6/0.09 | 2.0/1.5/0.03 |
| | 64 | 50 | 20.7/15.4/0.02 | 151.1/113.2/0.11 | 36.8/29.3/0.04 | 14.7/4.2/0.08 | 1.5/1.1/0.01 |
| | 64 | 100 | 21.3/16.9/0.02 | 153.5/115.8/0.11 | 34.0/26.6/0.03 | 14.7/4.2/0.08 | 1.8/1.5/0.02 |
| | 128 | 10 | 20.6/16.2/0.02 | 183.3/144.0/0.12 | 31.2/23.6/0.03 | 14.9/4.5/0.09 | 2.0/1.6/0.05 |
| | 128 | 50 | 20.5/15.7/0.02 | 148.2/111.4/0.11 | 37.1/29.7/0.04 | 14.7/4.2/0.08 | 1.5/1.1/0.01 |
| | 128 | 100 | 20.9/16.5/0.02 | 152.6/113.9/0.10 | 33.5/26.2/0.03 | 14.7/4.2/0.08 | 1.8/1.4/0.02 |
| | 256 | 10 | 22.7/18.5/0.02 | 170.8/130.7/0.11 | 34.8/27.4/0.03 | 14.8/4.8/0.09 | 2.2/1.8/0.10 |
| | 256 | 50 | 20.9/16.4/0.02 | 149.8/113.0/0.11 | 37.1/29.7/0.04 | 14.7/4.2/0.09 | 1.5/1.1/0.01 |
| | 256 | 100 | 22.8/16.9/0.02 | 153.5/114.8/0.11 | 34.1/26.8/0.03 | 14.7/4.2/0.08 | 1.5/1.1/0.01 |
| VMD-GRU | 64 | 10 | 30.8/25.9/0.03 | 507.0/476.3/0.42 | 50.5/40.1/0.05 | 15.1/6.8/0.13 | 5.0/4.7/0.04 |
| | 64 | 50 | 26.2/21.0/0.03 | 216.0/174.9/0.15 | 36.4/28.1/0.03 | 15.2/7.5/0.10 | 4.2/3.9/0.03 |
| | 64 | 100 | 37.5/29.1/0.04 | 321.8/260.8/0.23 | 35.8/28.2/0.04 | 14.8/5.0/0.10 | 2.2/1.8/0.02 |
| | 128 | 10 | 27.4/24.6/0.03 | 162.0/122.8/0.14 | 55.4/47.9/0.06 | 15.1/6.5/0.13 | 6.3/5.9/0.06 |
| | 128 | 50 | 30.0/15.7/0.02 | 270.6/215.6/0.18 | 92.7/86.2/0.12 | 15.7/8.6/0.17 | 6.9/6.6/0.07 |
| | 128 | 100 | 66.2/53.4/0.08 | 516.3/442.7/0.49 | 40.6/32.2/0.04 | 7.1/4.5/0.40 | 4.9/4.5/0.03 |
| | 256 | 10 | 23.5/18.2/0.02 | 175.4/134.8/0.14 | 51.3/41.4/0.05 | 16.4/9.4/0.19 | 2.2/1.8/0.01 |
| | 256 | 50 | 30.6/25.1/0.03 | 235.7/179.8/0.15 | 47.4/37.8/0.05 | 15.3/7.8/0.15 | 2.6/2.1/0.01 |
| | 256 | 100 | 39.6/33.1/0.04 | 237.3/189.6/0.20 | 43.7/34.6/0.04 | 14.8/5.0/0.10 | 3.5/3.5/0.02 |
| VMD-Bi-LSTM | 64 | 10 | 24.5/18.8/0.02 | 165.7/126.8/0.11 | 30.9/23.8/0.03 | 14.7/4.2/0.08 | 2.1/1.7/0.01 |
| | 64 | 50 | 23.3/18.4/0.02 | 216.9/156.9/0.16 | 37.9/30.0/0.04 | 14.7/4.2/0.08 | 1.4/1.0/0.01 |
| | 64 | 100 | 23.8/18.7/0.02 | 212.6/155.2/0.16 | 35.3/28.3/0.04 | 14.7/4.2/0.08 | 1.4/1.0/0.01 |
| | 128 | 10 | 25.3/20.1/0.02 | 176.8/137.6/0.12 | 30.3/23.5/0.03 | 14.7/4.2/0.08 | 1.7/1.3/0.01 |
| | 128 | 50 | 23.4/18.2/0.02 | 195.0/146.5/0.15 | 36.0/27.6/0.03 | 14.7/4.2/0.08 | 1.4/1.1/0.01 |
| | 128 | 100 | 23.9/18.5/0.02 | 205.8/150.0/0.15 | 35.9/28.0/0.04 | 14.7/4.2/0.08 | 1.3/1.0/0.01 |
| | 256 | 10 | 26.8/21.0/0.03 | 176.5/134.9/0.12 | 30.5/23.5/0.03 | 14.7/4.2/0.08 | 2.0/1.6/0.01 |
| | 256 | 50 | 22.4/17.5/0.02 | 222.0/162.4/0.15 | 36.9/28.2/0.04 | 14.7/4.2/0.08 | 1.3/1.0/0.01 |
| | 256 | 100 | 22.6/17.2/0.02 | 218.3/154.4/0.14 | 37.8/28.1/0.04 | 14.7/4.2/0.08 | 1.3/0.9/0.01 |

Note: The minimum prediction error of each model for each O-D pair is underlined.
Table 3. Prediction error of 5 h (RMSE/MAE/MAPE).

| Model | NHN | NI | D1 | D2 | D3 | D4 | D5 |
|---|---|---|---|---|---|---|---|
| VMD-MLP | 64 | 10 | 38.0/30.9/0.02 | 242.1/187.6/0.09 | 35.2/29.2/0.02 | 20.2/14.0/0.04 | 1.4/1.1/0.04 |
| | 64 | 50 | 26.8/20.6/0.01 | 205.7/150.2/0.07 | 26.0/20.1/0.01 | 21.2/14.0/0.04 | 1.2/0.8/0.04 |
| | 64 | 100 | 29.3/22.6/0.01 | 255.3/200.7/0.09 | 27.0/21.2/0.01 | 21.6/14.0/0.04 | 1.0/0.8/0.03 |
| | 128 | 10 | 42.3/35.0/0.02 | 251.8/197.6/0.09 | 35.8/29.8/0.02 | 20.2/13.7/0.04 | 1.3/1.0/0.04 |
| | 128 | 50 | 26.8/20.6/0.01 | 204.7/148.8/0.07 | 27.1/21.2/0.01 | 21.2/13.6/0.04 | 1.0/0.8/0.04 |
| | 128 | 100 | 29.4/22.7/0.01 | 264.3/209.8/0.10 | 27.2/21.4/0.01 | 21.6/14.2/0.04 | 1.1/0.8/0.03 |
| | 256 | 10 | 40.5/33.4/0.02 | 262.7/209.5/0.10 | 34.1/28.0/0.02 | 20.6/14.1/0.04 | 1.5/1.2/0.06 |
| | 256 | 50 | 27.0/20.8/0.01 | 205.5/151.2/0.07 | 28.4/22.5/0.01 | 21.3/21.3/0.04 | 1.3/0.7/0.04 |
| | 256 | 100 | 34.1/27.2/0.02 | 260.2/205.1/0.10 | 31.9/25.2/0.01 | 21.4/13.8/0.04 | 1.1/0.8/0.03 |
| VMD-GRU | 64 | 10 | 91.5/83.3/0.06 | 527.6/462.7/0.21 | 99.6/86.3/0.07 | 23.4/17.6/0.07 | 5.4/5.1/0.12 |
| | 64 | 50 | 100.0/93.5/0.07 | 240.9/184.2/0.08 | 37.5/31.5/0.02 | 22.1/17.5/0.07 | 3.5/2.1/0.11 |
| | 64 | 100 | 48.5/39.0/0.02 | 246.0/184.8/0.08 | 83.6/71.8/0.05 | 24.9/19.8/0.09 | 4.8/4.5/0.14 |
| | 128 | 10 | 84.9/77.9/0.05 | 817.4/778.7/0.03 | 36.5/30.1/0.02 | 34.0/28.5/0.12 | 10.0/9.8/0.17 |
| | 128 | 50 | 54.1/44.2/0.03 | 271.8/204.0/0.09 | 38.0/28.4/0.02 | 44.5/41.0/0.20 | 2.0/1.3/0.06 |
| | 128 | 100 | 128/118./0.09 | 503.7/423.2/0.21 | 77.3/67.7/0.05 | 20.9/14.2/0.05 | 4.7/4.3/0.11 |
| | 256 | 10 | 69.4/56.1/0.04 | 654.9/612.2/0.28 | 125.6/121.6/0.09 | 22.7/15.8/0.05 | 10.0/9.0/0.15 |
| | 256 | 50 | 101.0/88.8/0.06 | 282.1/209.9/0.10 | 44.4/37.3/0.03 | 37.2/32.7/0.14 | 7.1/6.6/0.13 |
| | 256 | 100 | 48.1/39.0/0.02 | 452.2/364.0/0.17 | 77.5/69.6/0.05 | 36.8/32.6/0.15 | 2.4/2.0/0.10 |
| VMD-Bi-LSTM | 64 | 10 | 31.8/24.5/0.01 | 226.6/166.5/0.08 | 31.9/26.0/0.02 | 18.6/12.5/0.04 | 1.3/1.0/0.05 |
| | 64 | 50 | 40.9/30.7/0.02 | 330.8/225.3/0.10 | 28.6/22.4/0.01 | 18.6/12.9/0.04 | 1.2/0.9/0.06 |
| | 64 | 100 | 40.6/30.3/0.02 | 337.1/210.0/0.09 | 29.4/22.9/0.01 | 20.6/13.4/0.04 | 1.3/1.0/0.05 |
| | 128 | 10 | 32.0/24.0/0.01 | 231.0/169.2/0.10 | 31.9/26.0/0.02 | 18.7/12.8/0.04 | 1.2/1.0/0.05 |
| | 128 | 50 | 37.4/28.2/0.02 | 348.7/227.7/0.10 | 31.6/25.4/0.02 | 18.5/12.2/0.04 | 1.0/0.8/0.05 |
| | 128 | 100 | 39.7/29.1/0.02 | 322.4/222.1/0.10 | 32.3/24.8/0.01 | 23.2/13.6/0.04 | 1.1/0.8/0.05 |
| | 256 | 10 | 31.7/24.3/0.01 | 238.1/179.0/0.08 | 32.3/26.2/0.02 | 18.8/12.7/0.04 | 1.2/0.9/0.05 |
| | 256 | 50 | 36.7/27.3/0.02 | 304.8/217.5/0.10 | 29.6/23.3/0.01 | 18.2/11.8/0.04 | 1.0/0.8/0.05 |
| | 256 | 100 | 36.7/27.8/0.02 | 307.5/215.0/0.10 | 28.7/22.6/0.01 | 19.4/12.4/0.04 | 1.1/0.9/0.05 |

Note: The minimum prediction error of each model for each O-D pair is underlined.
Table 4. Prediction error of 16 h (RMSE/MAE/MAPE).

| Model | NHN | NI | D1 | D2 | D3 | D4 | D5 |
|---|---|---|---|---|---|---|---|
| VMD-MLP | 64 | 10 | 310.0/242.0/0.06 | 711.4/532.4/0.07 | 295.6/217.4/0.01 | 45.6/31.9/0.01 | 13.0/9.7/0.10 |
| | 64 | 50 | 310.0/242.0/0.06 | 713.1/534.9/0.07 | 298.0/219.7/0.01 | 45.6/31.8/0.01 | 13.0/9.7/0.10 |
| | 64 | 100 | 310.0/242.0/0.06 | 713.1/534.8/0.07 | 298.0/219.7/0.01 | 45.6/31.8/0.01 | 13.0/9.7/0.10 |
| | 128 | 10 | 310.0/242.0/0.06 | 710.4/531.1/0.07 | 295.9/217.7/0.01 | 45.6/31.9/0.01 | 13.0/9.6/0.10 |
| | 128 | 50 | 310.0/242.0/0.06 | 712.1/534.5/0.07 | 297.2/219.0/0.01 | 45.6/31.9/0.01 | 13.0/9.7/0.10 |
| | 128 | 100 | 310.0/242.0/0.06 | 712.1/534.4/0.07 | 297.2/219.0/0.01 | 45.6/31.9/0.01 | 13.0/9.7/0.10 |
| | 256 | 10 | 310.0/242.0/0.06 | 709.1/530.0/0.07 | 295.4/217.2/0.01 | 45.6/31.9/0.01 | 13.0/9.7/0.10 |
| | 256 | 50 | 310.0/242.0/0.06 | 710.5/534.0/0.07 | 296.2/218.1/0.01 | 45.6/31.9/0.01 | 13.0/9.7/0.09 |
| | 256 | 100 | 310.0/242.0/0.06 | 710.5/534.0/0.07 | 296.2/218.1/0.01 | 45.6/31.9/0.01 | 13.0/9.7/0.09 |
| VMD-GRU | 64 | 10 | 608.0/388.0/0.10 | 885.4/717.9/0.09 | 1005.4/898.8/0.02 | 62.5/38.7/0.02 | 14.5/11.1/0.15 |
| | 64 | 50 | 2405.9/2308.0/0.58 | 3327.0/3015.7/0.41 | 875.0/738.8/0.01 | 433.3/361.1/0.15 | 147.5/122.1/0.34 |
| | 64 | 100 | 31131.9/31056.3/7.89 | 11591.1/10296.1/1.40 | 2827.3/2474.7/0.06 | 1072.3/1041.3/0.48 | 244.0/238.9/0.44 |
| | 128 | 10 | 866.5/770.3/0.19 | 3555.2/3478.1/0.46 | 1354.5/1289.5/0.15 | 749.7/743.0/0.35 | 73.9/71.3/0.21 |
| | 128 | 50 | 13535.1/13396.5/3.44 | 4694.1/4311.4/0.57 | 5953.5/5933.7/0.15 | 404.7/384.3/0.18 | 281.7/274.6/0.68 |
| | 128 | 100 | 5907.2/5718.6/1.43 | 15964.9/15562.6/2.10 | 10365.2/10166.7/0.26 | 287.1/203.6/0.10 | 115.8/107.5/0.31 |
| | 256 | 10 | 1199.1/825.0/0.21 | 4870.4/4813.4/0.63 | 3032.8/2032.1/0.08 | 73.6/53.3/0.23 | 18.4/14.6/0.17 |
| | 256 | 50 | 991.0/790.4/0.21 | 18295.5/18103.8/2.41 | 1747.7/1431.1/0.03 | 601.8/503.4/0.25 | 74.4/69.2/0.23 |
| | 256 | 100 | 82387.7/82249.7/21.0 | 49438.6/49291.2/6.58 | 8795.4/8444.0/0.22 | 2914.0/2871.3/0.13 | 988.2/987.1/10.35 |
| VMD-Bi-LSTM | 64 | 10 | 246.8/182.0/0.04 | 605.5/437.3/0.05 | 278.9/193.3/0.04 | 44.1/30.5/0.01 | 11.8/8.5/0.11 |
| | 64 | 50 | 970.9/513.9/0.13 | 10400.8/4135.7/0.54 | 2638.5/703.7/0.17 | 67.9/43.9/0.02 | 16.0/12.2/0.09 |
| | 64 | 100 | 335.5/193.0/0.04 | 7510.5/6636.9/0.88 | 270.6/183.3/0.04 | 259.1/174.8/0.14 | 11.5/8.4/0.07 |
| | 128 | 10 | 245.2/183.8/0.04 | 756.8/603.2/0.07 | 292.2/196.8/0.04 | 43.1/29.3/0.01 | 12.8/9.6/0.13 |
| | 128 | 50 | 246.3/186.6/0.04 | 712.7/531.1/0.07 | 11781.0/8718.8/2.25 | 215.6/179.8/0.08 | 8.9/5.8/0.06 |
| | 128 | 100 | 2337.8/1346.6/0.33 | 6339.7/5467.7/0.72 | 310.8/209.2/0.05 | 340.6/290.4/0.14 | 193.2/179.2/0.17 |
| | 256 | 10 | 246.0/180.5/0.04 | 620.0/447.7/0.05 | 300.1/197.1/0.05 | 44.8/29.3/0.01 | 10.8/8.7/0.11 |
| | 256 | 50 | 287.6/211.6/0.05 | 8333.9/8180.7/1.10 | 1923.3/1650.0/0.40 | 210.3/176.9/0.08 | 31.5/22.9/0.14 |
| | 256 | 100 | 1399.3/1290.9/0.32 | 8333.9/8180.7/1.10 | 302.1/197.1/0.05 | 1421.4/1349.0/0.65 | 473.5/467.7/0.50 |

Note: The minimum prediction error of each model for each O-D pair is underlined.
Table 5. The final tuning of hyperparameters (NHN/NI) of the three forecasting models under different time granularities.

| Model | Δt | D1 | D2 | D3 | D4 | D5 |
|---|---|---|---|---|---|---|
| VMD-MLP | 1 | 64/50 | 64/100 | 128/10 | 64/100 | 64/50 |
| | 3 | 64/10 | 128/100 | 64/10 | 64/100 | 64/50 |
| | 5 | 64/50 | 64/50 | 256/50 | 64/10 | 128/100 |
| | 16 | 64/10 | 64/10 | 64/10 | 64/10 | 64/10 |
| VMD-GRU | 1 | 256/10 | 128/50 | 256/50 | 128/50 | 128/100 |
| | 3 | 128/50 | 128/10 | 64/50 | 64/50 | 256/10 |
| | 5 | 256/100 | 128/10 | 64/50 | 128/100 | 128/50 |
| | 16 | 64/10 | 64/10 | 64/50 | 64/10 | 64/10 |
| VMD-Bi-LSTM | 1 | 64/10 | 64/10 | 128/100 | 64/50 | 64/50 |
| | 3 | 64/10 | 64/10 | 64/10 | 64/10 | 64/10 |
| | 5 | 64/10 | 64/10 | 64/50 | 64/10 | 64/10 |
| | 16 | 64/10 | 64/10 | 64/10 | 64/10 | 128/50 |
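Table 5 reports, for each model, O-D pair, and time granularity, the hyperparameter combination (NHN/NI) selected from the grids {64, 128, 256} and {10, 50, 100} explored in Tables 1–4. One simple way to perform such a search is sketched below; the helper name build_model and the data arrays are hypothetical placeholders, and RMSE is used as the selection criterion purely for illustration.

```python
# Illustrative grid search over NHN (hidden neurons) and NI (training iterations/epochs).
import itertools
import numpy as np

NHN_GRID = (64, 128, 256)
NI_GRID = (10, 50, 100)

def tune(build_model, x_train, y_train, x_val, y_val):
    best = None
    for nhn, ni in itertools.product(NHN_GRID, NI_GRID):
        model = build_model(nhn)                      # e.g., an MLP/GRU/Bi-LSTM builder
        model.fit(x_train, y_train, epochs=ni, verbose=0)
        pred = model.predict(x_val, verbose=0).ravel()
        err = float(np.sqrt(np.mean((y_val - pred) ** 2)))   # validation RMSE
        if best is None or err < best[0]:
            best = (err, nhn, ni)
    return best  # (best RMSE, selected NHN, selected NI)
```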
Table 6. Comparison of the proposed models and baselines.

| Model | RMSE | MAE | MAPE |
|---|---|---|---|
| VMD-MLP | 17.2 | 13.6 | 0.09 |
| VMD-GRU | 19.2 | 14.8 | 0.10 |
| VMD-Bi-LSTM | 19.1 | 14.7 | 0.11 |
| VMD-LSTM | 20.1 | 15.6 | 0.12 |
| VMD-ARIMA | 21.2 | 15.4 | 0.15 |
| MLP | 25.5 | 19.1 | 0.17 |
| GRU | 26.9 | 20.2 | 0.18 |
| Bi-LSTM | 34 | 21.1 | 0.18 |
| LSTM | 31.0 | 24.2 | 0.22 |
| ARIMA | 33.2 | 22.9 | 0.20 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
