1. Introduction
Forecasting traffic flows of vessels has been recognized as a challenging task in the maritime intelligent transportation system, since it could be affected by various complex factors [
1,
2]. Accurate and timely traffic information is of significant importance to both maritime managers and individual vessels, as it not only helps the former to better conduct port planning [
3,
4], alleviate congestion [
5,
6], mitigate emission of GHG (greenhouse gases) and improve public security [
7,
8], but also enables the latter to better operate the ship navigation system and plan a route [
9,
10], so as to avoid collisions and reduce the potential cost due to late arrival [
11].
Various solutions have been studied for forecasting maritime traffic flow. Wang et al. [
12] developed a vessel flow prediction model using a back propagation (BP) neural network, in which training and data sampling were conducted based on a residual analysis. Haiyan and Youzhen [
13] came up with a hybrid scheme for vessel flow forecasting, which integrated an RBF (radial basis function) neural network, grey forecasting and auto-regression into a framework of support vector regression (SVR). The experimental results based on a real dataset demonstrated that the combined method significantly outperformed each individual. Li et al. [
14] conceived a hybrid method centering around a robust support vector regression (RSVR) model. In that method, a chaotic cloud simulated annealing genetic algorithm was first developed to optimize the parameters of RSVR, then the kernel principal component analysis (KPCA) algorithm was utilized to finalize the input vectors. Afterwards, the hybrid framework was established, and the test based on a real dataset justified its advantages over others. Yang et al. [
15] developed an enhanced hybrid RBF neural network (EHRBF-NN) to forecast the traffic flow of vessels. The method was robust in that it incorporated a regression tree and a particle swarm optimization (PSO) algorithm into the RBF neural network. More specifically, the regression tree was exploited to calculate the parameters (i.e., centers and radius) of RBF, and the PSO was designated to address the issue of potential over-fitting and update the weights of the neural network. Liu et al. [
16] proposed a two-step method to predict vessel flows. Particularly, in the first step, they separated vessel traffic flow into low-rank and sparse components by adopting a non-convex model. Then in the second step, the ARIMA (auto-regressive integrated moving average) and WNN (wavelet neural network) were explored to predict the flows based on the low-rank and sparse components from the fist step. He et al. [
17] proposed an improved Kalman model for short-term vessel traffic flow prediction. In that model, a preliminary prediction was first done using a regression analysis, which was then adopted to replace the transfer equation of the Kalman filter. Xiao et al. [
11] came up with a probabilistic method for maritime traffic forecasting. The method was novel in that it took both waterway pattern knowledge and vessel motion stability into account. In particular, the waterway pattern was learnt via a lattice-based DBSCAN algorithm which also helped narrow down the problem scale, and the motion behavior was modeled through a kernel density estimation, thus, the combination of them was supposed to successfully predict the traffic flow.
Although some successes have been achieved in maritime traffic flow forecasting, most of the methods above are shallow and outdated to some extent, given the fact that the deep learning approach has been unarguably deemed as the state-of-the-art in many areas, such as computer vision, natural language processing, and transportation. The promising advantages brought by deep learning can be summarized as two aspects: (1) deep neural networks can be designed to solve end-to-end tasks, which directly takes the raw data as input without elaborating the intermediate feature extraction; (2) it usually achieves significantly higher accuracy in comparison with conventional methods. Despite the fact that numerous deep learning approaches have been applied in traffic flow or traffic condition forecasting, most of them focus on the land transportation instead of maritime. Lv et al. [
18] proposed a deep learning framework to predict the freeway traffic flow, by exploring the big data from road infrastructure sensors. At the heart of that approach was a stacked auto-encoder (SAE) model, which was adopted to learn the latent traffic flow features by inherently taking spatial and temporal correlations into account. Chen et al. [
19] implemented a typical stacked LSTM (long short-term memory) network to forecast the traffic congestion, which was applied to a realistic navigation map. The results demonstrated that the proposed approach was able to effectively learn the hidden patterns of traffic flows. Cui et al. [
20] proposed a deep stacked bidirectional and unidirectional LSTM (SBU-LSTM) network to predict the traffic speed, which considered both forward and backward dependencies of the traffic data. More specifically, in that approach, a bidirectional LSTM (BDLSM) layer was developed to learn the spatial features and bidirectional temporal relationships by exploiting the historical data, as such it was able to grasp the dynamic nature of the traffic. Polson and Sokolov [
21] came up with a deep learning architecture to predict the short-term traffic flow. Particularly, that architecture integrated a linear model with
regularization and a branch of tanh layers, the first layer of which was designed to identify the spatio-temporal relations among predictors and other layers. As a consequence, it was able to capture the sharp nonlinear patterns of the traffic flow. Yu et al. [
22] designed a spatio-temporal deep neural network to predict the short-term traffic speeds. Enlightened by the research outcomes of motion detection in the field of computer vision, they first segmented the road network into grids, then the traffic information inside each grid was input to a convolutional neural network (CNN) to learn the spatial relationship. After that, the output of which was further taken as the input of a LSTM network, to capture the temporal relationship. Zhang et al. [
23] proposed a deep spatio-temporal residual network (ST-ResNet) to forecast the citywide crowd inflow and outflow. In that approach, a residual neural network framework was first adopted to model the temporal closeness, period, and trend properties of the crowd flows, then a sequence of residual convolutional units were designed to model the spatial relationships of crowd flows, and lastly, the output of those residual neural networks were aggregated as the preceding layer to the final output layer of real crowd flows. Guo et al. [
24] presented a framework for predictor fusion to forecast the short-term traffic condition. In the framework, three strategies were employed to evaluate the fusion performance, i.e., average fusion, weighted fusion and kNN fusion. Experimental results based on real dataset verified the significant advantages of the fusion method over the stand-alone ones. Liu and Chen [
25] developed a novel prediction model for passenger flow using deep learning approach. The core of the model was the combination of a pre-trained unsupervised SAE with a supervised deep neural network, which was supposed to well extract the hierarchical features, so that the passenger flow for any periods from Monday to Sunday could be successfully forecast. Ke et al. [
26] devised a deep learning approach to predict the road traffic congestion. The approach was distinguished in that it utilized visual signals to learn the traffic flow, and moving object detection and moving speed calculation were done by integrating a CNN and a Gaussian mixture model (GMM).
Obviously, the applications of deep learning in land transportation have been investigated far more extensively than that of maritime. On one hand, large-scale data for the land transportation is not that hard to obtain compared with maritime transportation, which is the foundation of applying the deep learning techniques. On the other hand, unlike the vessels in the sea, the mobilities of cars or people on land are mostly restricted to streets or rails, therefore, the traffic pattern for which is comparatively easy to be revealed. Moreover, we would like to note that, a variety of research with respects to the deep learning in maritime have been conducted, however most of them focus on vessel type identification [
27], trajectory prediction or reconstruction [
28], anomaly detection [
29,
30] and collision avoidance [
31,
32,
33]. The problem of how to adopt deep learning to forecast the traffic flows for maritime has not yet been studied.
Therefore, in this paper, centering around the deep learning approach, we propose three different solutions to forecast the inflow and outflow of vessels within a given marine region. More specifically, the three solutions are featured by a CNN, a LSTM network, and the integration of a bidirectional LSTM network with a CNN (BDLSTM-CNN), respectively. In particular, with respects to the BDLSTM-CNN based solution, each input will be first fed into the convolutional layers, then they will go through the forward layer and backward layer of a bidirectional LSTM network. As such, this hybrid solution is supposed to coherently learn the spatial and temporal dependencies pertaining to the vessel flows. Moreover, to apply the three deep learning based solutions into practice, we divide the given marine areas into M×N grids. Then all the solutions are implemented to forecast the inflow and outflow for each grid. Experimental results based on the real AIS data for a given area in Singapore show that the three deep learning based solutions significantly outperform the conventional method, in which the hybrid solution BDLSTM-CNN achieves the best performance.
The remainder of the paper is organized as follows.
Section 2 introduces the preliminaries first and then elaborates the structures and logic of the three deep learning based solutions.
Section 3 presents the comprehensive experimental results and analysis.
Section 4 concludes the paper and states the future works.
3. Experimentation and Evaluation
In this section, we conduct experimentation in different settings to test the proposed solutions, and demonstrate their advantages over the baseline. Particularly, we first introduce the data processing and experimental settings, then we compare and evaluate the three deep learning based solutions. Finally, we compare the three solutions with a conventional method, i.e., support vector regression (SVR).
3.1. Data Processing and Experimentation Setup
We use the AIS (Automatic Identification System) data of maritime vessels to perform the forecasting task, which is an automatic tracking system that uses transponders on ships to generate trajectory-related information [
30]. The AIS data for a vessel contains many useful attributes regarding its movement and mobility, which can be obtained from
https://www.marinetraffic.com. Here we mainly exploit the information of vessel ID, coordinates (i.e., longitude and latitude) and time stamp. The testing filed we selected is an rectangle marine area southeast to Singapore, which is shown as the red rectangle in
Figure 5. The location of the left upper point is (
,
), and that of the right lower point is (
,
). In all the experimentation we conducted, we uniformly divided this area into
grids (i.e.,
), which means that both inflow matrix
and outflow matrix
have a size of
.
Then we process and analyze the AIS data, and identify all the vessels entering and leaving those grids accordingly. Particularly, the AIS data we collected for the given area lasts about 31 days, from 01-10-2013 to 31-10-2013. We set the duration for each time step as 5 min. Then we have 8525 samples for each grid (with some data missed), which include the amount of vessels entering and leaving the given grid. We divide them into training dataset and testing dataset according to the time order, i.e., from 01-10-2013 to 25-10-2013, and from 26-10-2013 to 31-10-2013, which include 6305 and 2220 samples, respectively. In addition, we implement all the experimentation using pytorch, in a laptop with Intel i7 CPU, 8G RAM, and Nvidia GTX 1060.
3.2. Error Performance for the Deep Learning Based Solutions
We set
and
, respectively, and conduct experimentation according to the above configurations. We use two types of errors to measure the performance of the proposed solutions, i.e., mean absolute error (MAE) and root mean square error (RMSE), since they are the two most important metrics for regression problems. We also use their normalized forms to further evaluate the performance, i.e., NMAE and NRMSE. All the results for the three deep learning based solutions are shown in
Figure 6 and
Figure 7. Before looking into them, we would like to note that the errors in those figures are calculated by considering inflow and outflow together. Additionally, all the results below are obtained based on the testing data.
From
Figure 6a,b we can see that, as the training iteration increases, the MAE for testing data of the three deep learning based solutions drop quickly. With respect to both
and
, all the solutions seem to converge after about 85 iterations. Pertaining to the two different
settings, the BDLSTM-CNN based solution always achieves the lowest MAE, which are around 1.11 and 1.14, respectively. In contrast, the LSTM based solution achieves the second lowest MAE of 1.15 and 1.25, and the CNN based solution achieves the third lowest MAE of 1.29 and 1.35. It makes sense that the BDLSTM-CNN based hybrid solution engenders the best results as it coherently explores the spatial and temporal dependencies regarding the vessel flows, by integrating both CNN and LSTM. Moreover, with respect to the temporal expression, it exploits the bidirectional LSTM to take advantage of both past and future information, which is supposed to get better results than the unidirectional LSTM. Comparing the CNN based solution and the LSTM based solution, we can observe that, the latter presents a better performance, which is probably due to that the temporal relationship is much relatively important in comparison with the spatial relationship in this case. Although that the former solution takes every five preceding flow matrix to generate the succeeding one, and the convolution layer also has the potential to predict the movement of vessels from the spatial perspective, the structure of CNN itself does not have a scheme to capture the temporal correlation. This also might be able to explain the oscillation in the curve of CNN based solution, because it dose not have any mechanism to capture the seasonality pattern in the vessel flows. On the other hand, there are only slight oscillations for LSTM based solution and BDLSTM-CNN based solution, which justifies their favorable capability of handling the seasonality in sequence. Comparing
Figure 6a,b, we can observe that, the performance of the three solutions slightly deteriorates for
. It is reasonable as one can always make a more accurate forecast for a near future than for a far future. However, the hybrid solution still achieves the lowest error. Additionally, we would like to highlight that, the difference between the LSTM based solution and the hybrid solution is larger for
than that of
. It happened because in the case of
, the bidirectional LSTM may bring more useful information from the backward layer, in comparison with the solo forward layer in the LSTM based solution.
From
Figure 6c,d, we can see that they share similar pattern with
Figure 6a,b. This is normal because they are simply the normalized version of the latter, which can be regarded as a kind of error rate. The difference in the curve shape comes from the fact that we only show part of errors in
Figure 6a,b, in order to highlight the region of interests. Looking into
Figure 6c,d, we can observe that, the three solutions achieve error rates (in terms of MAE) of around 24.5%, 22.5% and 22.0%, respectively, for
, and around 26%, 24% and 22.5%, respectively, for
, with the hybrid solution being the lowest. The error rates in terms of MAE would be decreased about 1.5%, 1.5% and 0.5%. So it can be concluded that the results of three deep learning based solutions considerably are better than the traditional approach expressed in MAE, with the performance of the BDLSTM-CNN based hybrid solution being the smallest.
Likewise, we also use the RMSE (root mean square error) to further evaluate the proposed solutions, and the results are depicted in
Figure 7, which share similar pattern with that of
Figure 6. From
Figure 7a,b, we can observe that, the three solutions achieve a RMSE of around 2.01, 1.75 and 1.68 for
, respectively, and around 2.18, 1.88 and 1.75 for
, respectively, with the hybrid solution being the lowest for both cases. From
Figure 7c,d, we can see that, the three solutions achieve error rates (in terms of RMSE) of around 24%, 21% and 20%, respectively, for
, and around 26%, 23.5% and 21%, respectively, for
, with the hybrid solution being the lowest for both cases. Similarly, the error rates in terms of RMSE will be lowered by 2%, 2.5% and 1%. Consequently, it can be found that the outcomes of three deep learning based solutions noticeably outperformed than the traditional approach in terms of RMSE, with the performance of the BDLSTM-CNN based hybrid solution being the lowest.
Combining the performance of both MAE and RMSE in
Figure 6 and
Figure 7, it can be possibly proving that the BDLSTM-CNN based hybrid solution outperforms the CNN based solution and the LSTM based solution. However, considering that, (1) all the solutions only adopt five historical data to predict a new one, (2) all the errors are derived based on integrating inflow and outflow together, we believe that the performance of all the solutions are sufficiently satisfactory, although superiority and inferiority exist among them.
3.3. Breakdown Performance for the Hybrid Solution
Since the BDLSTM-CNN based solution has the best overall performance, we look into this hybrid solution and analyze its prediction capability in a breakdown perspective. To this end, we plot the curves of the average forecasting value of the inflow and outflow for the whole given region, and the forecasting value of inflow and outflow for a given grid, respectively. We would like to note that, all the results below are obtained based on the testing data.
We first plot the above forecasting values against the ground truth for
in
Figure 8. From
Figure 8a we can see that, the ground truth value of the average by considering inflow and outflow together, changes dramatically as time goes on. However, the BDLSTM-CNN based solution can well capture those trends, such as the sharp changes at time step 100, 1300, 1700 and 2200, respectively, because this hybrid solution combines the advantages of the capability of CNN and BDLSTM, to learn the spatial and temporal features in a unified way. Nevertheless, we observe some imperfect tracking, such as time step 250, which does not catch the crest. However, considering that the ground truth is the average of both inflow and outflow for the whole given area, some minor errors are tolerable. We also plot the forecasting curves of a single grid for inflow and outflow in
Figure 8b,c, respectively, by taking grid (6,6) as an example. From
Figure 8b, we observe that, the hybrid solution can catch most of the crests and troughs for the inflow curve, no matter how sharp or smooth they are, such as the crest at time step 400 and the trough at time step 1250. An unsatisfactory forecasting is found as well at time step 1800, where a sharp crest is missed. From
Figure 8c, we observe that the hybrid solution can also successfully follow both sharp and smooth crests and troughs for the outflow curve, such as the crest at time step 1050, and trough at time step 1450. Although the hybrid solution mismatched the ground truth at some points, such as time step 1800, most of the failures are tolerable.
We then continue to plot the forecasting values against the ground truth for
in
Figure 9. From
Figure 9a we can see that, the hybrid solution is still able to basically track the ground truth value, although slight deterioration is observed in comparison with the performance in
Figure 8a. This can be explained by the fact that, a near future is comparatively easier to be forecast than a distant future. Despite the slight deterioration in the average forecasting, the hybrid solution still shows competitive performance for inflow and outflow forecasting pertaining to grid (6,6), which mostly well captures all the sharp or smooth crests and troughs.
Combining the results in both
Figure 8 and
Figure 9, we can conclude that the hybrid solution has strong capability of forecasting the inflow and the outflow of vessels. Though the inflow and outflow changed dramatically, the trend can be well captured by the hybrid solution.
3.4. Comparison with the Conventional Method
In this subsection, we compare the performance of the three deep learning based solutions with a conventional method, namely the support vector regression (SVR) based approach. In particular, we use the python sklearn package to implement this function, and utilize the same training dataset to optimize its parameters, and then do the evaluation using the same testing dataset, the results of which are recorded in
Table 1. Since the normalized errors reflect a kind of error rate in forecasting, we mainly concentrate on the normalized forms of the two errors, i.e., NMAE and NRMSE.
From
Table 1 we can see that, the error rates of the SVR method are around 51% for both measurements of MAE and RMSE, almost twice as much as the three deep learning based solutions. The inferiority comes from the fact that, the SVR method does not have a sophisticated scheme to learn the underlying spatial dependency, or the long-term temporal dependency. In contrast, CNN has a convolution layer, and LSTM or BDLSTM has a memory and gate to handle the two situations [
42,
43], respectively. On the other hand, the BDLSTM-CNN based hybrid solution always achieves the lowest error rates of 20% to 22.5%, which implies a forecasting accuracy of 77.5% to 80%. Considering that we only use five historical data to predict a new one, and both inflow and outflow of vessels change dramatically, the performance achieved by the hybrid solution is sufficiently satisfactory.
4. Conclusions and Future Work
In this paper, we propose three deep learning based solutions to forecast the flow of maritime vessels. To apply the deep learning approach, we first divide the given marine area into grids, then we predict both the inflow and outflow of vessels for each grid. In particular, the three solutions are characterized by a CNN, a LSTM, and the integration of BDLSTM and CNN, respectively. When testing them with the real AIS data of vessels, the hybrid solution based on BDLSTM-CNN achieves the best performance in terms of mean absolute error (MAE) and root mean square error (RMSE), both error rates will be decreased by 1–2% compared with other methods, as it is able to coherently learn the spatial and temporal representations for both the inflow and outflow. On the other hand, when further comparing them with a conventional approach, the three deep learning based solutions significantly outperform the SVR method. We would like to note that, is only an empirical value for the given area in this paper, and it can vary with different scenarios. Moreover, the number of columns does not need to be the same with the rows.
However, the proposed approach still needs to be improved further and tested more extensively. In future, we will work on the following directions: (1) we will consider an attention model in the BDLSTM-CNN based solution to further improve the performance; (2) we will explore more relevant features, such as weather, date, kinematics and kinetics of ship, variable speeds of ship movement, collision avoidance maneuvers; (3) we will use more than five data points to predict a new one, and also forecast the flow longer-time away, such as = ; (4) we will investigate multi-agent based methods and reinforcement learning based methods to solve the route planning for maritime vessel flows; (5) we will try to apply integral of multiplied by absolute error (ITAE) and integral of time multiplied by square error (ITSE) performance criteria to assess the quality of vessel traffic forecasting.