Article

DAPNet: A Dual-Attention Parallel Network for the Prediction of Ship Fuel Consumption Based on Multi-Source Data

1 Navigation College, Dalian Maritime University, Dalian 116026, China
2 Collaborative Innovation Center of Maritime Big Data and Shipping Artificial General Intelligence, Dalian 116026, China
3 The Research Institute for Socionetwork Strategies, Kansai University, Suita 564-8680, Japan
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2024, 12(11), 1945; https://doi.org/10.3390/jmse12111945
Submission received: 28 September 2024 / Revised: 23 October 2024 / Accepted: 26 October 2024 / Published: 31 October 2024
(This article belongs to the Special Issue Performance and Emission Characteristics of Marine Engines)

Abstract

The precise prediction of ship fuel consumption (SFC) not only enhances energy efficiency for the benefit of shipping enterprises but also provides a quantitative foundation for carbon emission reduction and ecological environment protection. However, SFC-related data exhibit typical multi-source and heterogeneous characteristics, which lead to several methodological issues (e.g., feature alignment and feature fusion) in SFC prediction. Therefore, this paper proposes a dual-attention parallel network named DAPNet to solve these issues. Firstly, we design a parallel network structure containing two kinds of networks for time-series analysis tasks, long short-term memory (LSTM) and an improved temporal convolutional network (TCN), so that data from different sources can be fed into the network best suited to them. Secondly, a local attention mechanism is included in each individual parallel network to improve feature alignment across different-scale training data. Finally, global attention is employed for the fusion of all parallel networks, which enriches the representation features and simultaneously enhances SFC prediction performance. In experiments, DAPNet is compared with 10 methods, including baseline and attention models. The comparison results show that DAPNet and several of its variants obtain the highest accuracy in SFC prediction.

1. Introduction

The shipping industry plays an indispensable role in global trade logistics, with over 80% of international trade reliant on maritime transport [1]. With the continuous growth of shipping trade, the fuel consumption and carbon emissions of ships have attracted strong attention in recent years [2]. From the perspective of environmental protection, reducing ship fuel consumption (SFC) leads to energy savings and emission reductions. From the perspective of shipping enterprise operations, the cost associated with SFC constitutes a substantial proportion of total operational expenses [3]. Consequently, reducing SFC plays a pivotal role in many fields, such as shipping management, route optimization, and path planning [4]. There is also an urgent need for accurate and efficient SFC prediction, which can provide insights and decision support for researchers and managers [5].
Classically, SFC has been predicted using time-series analysis and multivariate regression, which usually model the effects of explanatory factors on fuel consumption [6]. Fan et al. used theoretical formulas and computational fluid dynamics analysis methods to construct a ship energy efficiency model under the influence of calm water resistance [7]. Yang et al. considered the effect of ocean currents on ship speed and reduced SFC via speed optimization [8]. However, a gap remains between prediction precision and actual requirements, especially in dynamic and diverse environments [9]. Treating SFC prediction as a time-series analysis problem, Zhu et al. employed classic methods such as support vector regression (SVR) and artificial neural networks (ANNs) to train models for SFC prediction; a comparative analysis revealed that effective feature fusion with ANNs could improve the precision of SFC prediction [10]. Further analysis suggested that autoregressive models have limitations in identifying model parameters and capturing the nonlinear characteristics of sequences [11]. The multiple linear regression method has also been employed to establish an SFC prediction model, and its effectiveness was validated using real-world data from bulk carriers [12].
In recent years, the feature fusion of multi-source data has emerged as both a challenging and essential process, particularly when applying traditional machine learning methods to SFC prediction. Han et al. proposed several approaches for the fusion of multi-source data and employed the long short-term memory (LSTM) method for SFC prediction [13]. A comparative analysis revealed that effective feature fusion could improve the precision of SFC prediction. Furthermore, several advanced models, e.g., bi-directional LSTM (BiLSTM) and LSTM with a self-attention mechanism (LA), have begun to be used for SFC prediction and obtain better prediction results [14,15]. However, limitations remain, as the methods in existing studies still require well-processed data, e.g., data alignment, association, and fusion [16].
To address the above issues, this paper proposes a dual-attention parallel network named DAPNet. Firstly, DAPNet introduces a parallel network structure to process multi-source data, recasting the multivariate regression task as a parallel process over independent networks. Secondly, multi-source data also exhibit heterogeneous features, which require an appropriate network for each source. In DAPNet, two kinds of networks, LSTM and an improved temporal convolutional network (TCN), are applied to feature extraction, where the LSTM focuses on capturing temporal correlations and the improved TCN focuses on the dependent relationships among different variables [17,18]. Finally, DAPNet includes dual attention for feature alignment and feature fusion. Local attention is included in each individual parallel network to enhance temporal feature extraction from individual factors in the same dimension. Global attention is included for the feature fusion of all parallel networks so as to enrich the dynamic diversity of representation features. In the experiments, a case study is conducted on a ferry used as the reference ship, which is equipped with numerous sensors for the collection of multi-source data covering both internal and environmental factors. DAPNet is compared with 10 methods, including traditional and advanced RNN/LSTM models and state-of-the-art attention models. The experimental results reveal that DAPNet and several of its variants obtain a lower training loss and a higher prediction accuracy than the other comparison models, with DAPNet achieving minimum values of 0.0065 and 0.0108 for the mean absolute error (MAE) and root mean square error (RMSE) in SFC prediction. The proposed DAPNet makes three main contributions: an end-to-end framework with a parallel network, local attention, and global attention, which respectively address multivariable regression, multi-source feature alignment, and heterogeneous feature fusion.
The structure of this paper is organized as follows. In Section 2, we provide a detailed introduction to the proposed DAPNet. A data description of the case ship and comparative analyses between baseline models and DAPNet are presented in Section 3. Finally, in Section 4, the conclusions are summarized and future prospects are outlined.

2. Methodology Design of DAPNet

2.1. Overview of DAPNet for SFC Prediction

DAPNet, depicted in Figure 1, is based on an end-to-end framework that encompasses variable selection, data alignment, feature extraction, feature alignment, feature fusion, and final prediction. DAPNet features a parallel structure comprising LSTM and an improved TCN, each tailored to specific variable inputs and data alignment. The characteristics of each variable determine whether the LSTM or the improved TCN is selected as its candidate network. In each parallel network, local attention is included for feature extraction and alignment. Global attention is used for the feature fusion of all parallel networks. The representation features are then used for final prediction through a fully connected neural network (FCNN). Detailed descriptions of the parallel network and the dual-attention mechanism are provided in the following sections.

2.2. Design of Parallel Network

2.2.1. Variable Selection and Separation

Numerous sensors are equipped on the ship to measure the ship navigation status, such as speed and pitch. As shown in Figure 2, these types of data exhibit continuous, contextual, and sequential characteristics. However, variance and volatility remain in the distribution of these variables, with most exhibiting a time-series trend. Therefore, this paper designs a parallel network based on LSTM to accept such variables.
Several additional sensors are equipped on the ship to measure the external status and environmental factors, such as trim, draft, head wind, and cross wind. These data also follow time-series patterns but exhibit high variance and significant volatility, as shown in Figure 3. Notably, environmental factors simultaneously reveal spatio-temporal features. Consequently, this paper designs a parallel network based on a TCN to accept such variables.
In addition to the aforementioned measured data used as the model input, the distribution of the actual SFC data is presented in Figure 4.

2.2.2. Data Alignment Based on Framing

The differing sampling frequencies of multi-source sensors result in heterogeneous data that require synchronization in the temporal domain. According to an existing study [13], framing is still one of the most efficient approaches for aligning data of different scales. In this paper, we also incorporate a mobile overlapping frame approach to address the challenges posed by heterogeneous SFC data. The specific process of the mobile overlapping frame can be represented as follows:
$m = \frac{t_e^k - t_s^k - l}{(1 - r) \times l}$ (1)

$D_i^k = \{ d_t^k \mid t \in [\, t_{i,s}^k, \; t_{i,e}^k \,] \}$ (2)

$t_{i,s}^k = t_s^k + (i - 1) \times (1 - r) \times l$ (3)

$t_{i,e}^k = t_s^k + (i - i \times r + r) \times l$ (4)

where $m$ represents the number of frames; $l$ and $r$ denote the frame length and overlap ratio, respectively; $t_s^k$ and $t_e^k$ represent the start and end points in time for the $k$-th sensor; and $D_i^k$ signifies the collection of data $d_t^k$ for the $i$-th frame within the $k$-th sensor, where $t_{i,s}^k$ and $t_{i,e}^k$ correspondingly indicate the temporal start and end points of the $i$-th frame of the $k$-th sensor. Data from sensors with varying sampling frequencies can be aligned after framing. For instance, Figure 5 illustrates the data acquisition frequencies of head wind and port draft over the same time duration, along with their respective framing procedures.
Figure 5 additionally illustrates the frame length and overlap ratio. Fine-tuning the frame length and overlap ratio can significantly benefit the performance of the prediction model. Following multiple experiments, the frame length and overlap ratio were set to 30 s and 60%, respectively. Ultimately, the input matrix is constructed by extracting the mean value of the data within each frame as a statistical feature, as shown by the green dots in Figure 5:
$avg(D_i^k) = \frac{\sum_{t \in [\, t_{i,s}^k, \; t_{i,e}^k \,]} d_t^k}{m_i^k}$ (5)

where $avg(D_i^k)$ denotes the average value of the data within $D_i^k$, and $m_i^k$ represents the number of data points in $D_i^k$. After framing and the extraction of statistical features, the data structure of the SFC variables $X_{avg}$ is as described in Equation (6), where $q$ represents the total number of SFC features contained in $X_{avg}$:

$X_{avg} = \begin{bmatrix} avg(D_1^1) & \cdots & avg(D_1^k) & \cdots & avg(D_1^q) \\ \vdots & & \vdots & & \vdots \\ avg(D_i^1) & \cdots & avg(D_i^k) & \cdots & avg(D_i^q) \\ \vdots & & \vdots & & \vdots \\ avg(D_m^1) & \cdots & avg(D_m^k) & \cdots & avg(D_m^q) \end{bmatrix}$ (6)
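To make the framing concrete, the following is a minimal NumPy sketch of the mobile overlapping frame alignment in Equations (1)–(6); the function name `frame_sensor` and the synthetic sensor streams are illustrative assumptions, not code from the paper.

```python
# Sketch of mobile overlapping frame alignment; assumes each sensor stream
# is given as (timestamps, values) arrays over a shared time span.
import numpy as np

def frame_sensor(timestamps, values, t_start, t_end, l=30.0, r=0.6):
    """Average one sensor's samples inside each overlapping frame.

    l is the frame length (s) and r the overlap ratio, matching the
    30 s / 60% setting reported in the paper.
    """
    timestamps = np.asarray(timestamps, dtype=float)
    values = np.asarray(values, dtype=float)
    step = (1.0 - r) * l                       # hop between frame starts
    m = int((t_end - t_start - l) / step) + 1  # number of complete frames (Eq. (1))
    feats = []
    for i in range(1, m + 1):
        ti_s = t_start + (i - 1) * step        # Equation (3)
        ti_e = t_start + (i - i * r + r) * l   # Equation (4)
        mask = (timestamps >= ti_s) & (timestamps <= ti_e)
        feats.append(values[mask].mean() if mask.any() else np.nan)  # Equation (5)
    return np.array(feats)

# Usage: two sensors sampled at different rates become columns of X_avg (Eq. (6)).
t1 = np.arange(0.0, 300.0, 1.0)      # 1 Hz sensor (e.g., head wind)
t2 = np.arange(0.0, 300.0, 5.0)      # 0.2 Hz sensor (e.g., port draft)
x1 = frame_sensor(t1, np.sin(t1), 0.0, 300.0)
x2 = frame_sensor(t2, np.cos(t2), 0.0, 300.0)
X_avg = np.stack([x1, x2], axis=1)   # aligned m x q input matrix
```

After framing, both streams yield the same number of frame means, so they stack into one aligned matrix regardless of their raw sampling rates.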

2.2.3. Design of Parallel Network Based on LSTM

The basic LSTM was proposed in [19] and has been widely adopted for time-series regression and contextual prediction tasks. In most situations, variables with the same dimension can be directly input into an LSTM. In this paper, however, the selected variables are sampled at different frequencies. Accordingly, DAPNet provides a parallel network of LSTMs that trains on each variable separately, as shown in Figure 6. Furthermore, the detailed internal structure of the LSTM is shown in Figure 7.
The notation $L_i$ denotes the number of cells in the $i$-th LSTM network; each individual LSTM network can use a different number of cells for its variable. The final hidden state $h_i^t$ of a single network can be calculated as follows:

$h_i^t = \mathrm{LSTM}_i(h_i^{t-1}, c_i^t)$ (7)

where $h_i^{t-1}$ and $c_i^t$ denote the previous hidden state and the current cell state, $h_i^t$ is the output of the LSTM part in DAPNet, and $i$ indexes the $i$-th LSTM in the parallel network structure. In DAPNet, variants such as the gated recurrent unit (GRU) [20], BiLSTM [21], and LSTM embedded with the quaternion ship domain (QSD-LSTM) [22] can replace the LSTM and can also be mixed within the parallel network.
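The parallel LSTM branch can be sketched in PyTorch as follows; each univariate input receives its own LSTM with its own hidden size $L_i$. The class name `ParallelLSTM` and the layer sizes are illustrative assumptions.

```python
# A minimal sketch of the parallel LSTM branch (Figure 6, Equation (7)).
import torch
import torch.nn as nn

class ParallelLSTM(nn.Module):
    def __init__(self, hidden_sizes):
        super().__init__()
        # One single-feature LSTM per variable; L_i may differ per branch.
        self.branches = nn.ModuleList(
            [nn.LSTM(input_size=1, hidden_size=h, batch_first=True) for h in hidden_sizes]
        )

    def forward(self, xs):
        # xs: list of tensors, each (batch, time_steps, 1), one per variable.
        outputs = []
        for lstm, x in zip(self.branches, xs):
            out, _ = lstm(x)        # out: (batch, time_steps, L_i)
            outputs.append(out)     # full hidden-state sequence h_i^1 .. h_i^t
        return outputs

# Usage: speed and the two pitch signals as three independent branches.
net = ParallelLSTM(hidden_sizes=[16, 16, 16])
xs = [torch.randn(8, 16, 1) for _ in range(3)]  # batch of 8, time step 16
hs = net(xs)
```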

2.2.4. Design of Parallel Network Based on TCN

Convolutional neural networks (CNNs) exhibit inherent attributes conducive to enhancing nonlinear fitting capabilities, whereas SFC data are fundamentally characterized by temporal features. As an advanced variant of the CNN tailored to spatio-temporal modeling challenges, the TCN holds considerable potential to further enhance model performance [23]. In this context, the TCN is incorporated into the proposed DAPNet as another type of parallel network. Unlike the LSTM, the TCN can accept multivariate inputs provided they share the same dimension. Consequently, both the classic TCN and an improved TCN are included, as shown in Figure 8. Figure 8 depicts the encoder structure of the improved TCN, which uses an m × k convolutional kernel.
For multivariate inputs, the hidden states $(h_{11}^t, h_{21}^t)^T$ of the first layer can be calculated as follows:

$(h_{11}^t, h_{21}^t)^T = w_1 * \left( \begin{bmatrix} x_1^{t-2} & x_1^{t-1} & x_1^t \\ x_2^{t-2} & x_2^{t-1} & x_2^t \end{bmatrix}, \begin{bmatrix} x_2^{t-2} & x_2^{t-1} & x_2^t \\ x_3^{t-2} & x_3^{t-1} & x_3^t \end{bmatrix} \right)^T$ (8)

where $w_1$ denotes the convolutional kernel of the first hidden layer. The improved TCN incorporates a temporal pyramid structure in the initial convolutional processing and retains the same dilated convolutional processing as a traditional TCN. The parallel network architecture of the TCN is illustrated in Figure 9.
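A TCN branch built from causal dilated residual blocks (Figure 9) might look as follows; the (kernel size, dilation) pairs follow the optimal DAPNet setting reported later in Table 3, but the module names and the single skip convolution are illustrative assumptions rather than the authors' exact implementation.

```python
# A minimal sketch of a causal dilated TCN branch with residual blocks.
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    def __init__(self, c_in, c_out, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation   # left padding keeps causality
        self.conv = nn.Conv1d(c_in, c_out, kernel_size, dilation=dilation)

    def forward(self, x):                          # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))
        return self.conv(x)

class ResBlock(nn.Module):
    def __init__(self, c_in, c_out, kernel_size, dilation):
        super().__init__()
        self.net = nn.Sequential(
            CausalConv1d(c_in, c_out, kernel_size, dilation), nn.ReLU(),
            CausalConv1d(c_out, c_out, kernel_size, dilation), nn.ReLU(),
        )
        # 1x1 skip convolution reconciles channel counts for the residual sum.
        self.skip = nn.Conv1d(c_in, c_out, 1) if c_in != c_out else nn.Identity()

    def forward(self, x):
        return torch.relu(self.net(x) + self.skip(x))

# Usage: five variables (trim, drafts, winds); (kernel, d) per Table 3's DAPNet row.
x = torch.randn(8, 5, 16)  # (batch, variables, time)
blocks = nn.Sequential(ResBlock(5, 32, 4, 16), ResBlock(32, 32, 8, 8), ResBlock(32, 32, 16, 4))
h = blocks(x)              # (8, 32, 16): temporal features per time step
```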

2.3. Design of Dual Attention

Since the attention mechanism was introduced in [24], it has been widely applied across various networks (e.g., CNN and LSTM) and research fields (e.g., trajectory prediction and action recognition) [25,26,27,28]. In most attention models, a set of query, key, and value vectors is obtained from the inputs, with a key–value pair used to reassign the weight of each value based on a compatibility function. In current studies, the attention mechanism has become an indispensable component and has been extended with several new functions, such as channel and spatial attention and local and global attention [29]. In DAPNet, local attention is incorporated within each parallel network to enhance feature extraction, while global attention is applied across all parallel networks to enrich feature fusion.

2.3.1. Local Attention for Feature Extraction and Alignment

According to Figure 6 and Figure 8, both the LSTM and the TCN output final hidden states from the input layer through multiple hidden layers. The LSTM can extract features from both long-term and short-term memories, while the TCN captures temporal–spatial features based on dilated convolutional layers. Gradient vanishing remains an issue, although it manifests differently in the LSTM and the TCN. In DAPNet, an attention mechanism is designed into each individual parallel network so that the final hidden states of the independent LSTM or TCN can be recalculated and reallocated to obtain new scores based on local attention (see Figure 10).
Figure 10a shows the local attention in the LSTM. For each LSTM cell, the hidden state h i is transformed into ( q i , k i , v i ) , as follows:
$q_i = w_q^L h_i$ (9)

$k_i = w_k^L h_i$ (10)

$v_i = w_v^L h_i$ (11)

where $w_q^L$, $w_k^L$, and $w_v^L$ denote the transformation weights of the query, key, and value in local attention. Then, the relevance weights $a_i$ can be calculated by

$a_i = \mathrm{softmax}(s(q_i, k_i)) = \frac{\exp(s(q_i, k_i))}{\sum_{j=1}^{t} \exp(s(q_i, k_j))}$ (12)

where $s(q_i, k_i)$ denotes the scaled dot-product of the query and key:

$s(q_i, k_i) = \frac{q_i \cdot k_i}{\sqrt{D_k}}$ (13)

where $D_k$ denotes the key dimension. After calculating the relevance among all inputs via softmax, the new hidden state $h_i'$ can be obtained by the weighted summation of the attention scores as follows:

$h_i' = \sum_{j=1}^{t} a_j v_j$ (14)
Figure 10b shows the local attention in the TCN, and the new hidden state h i can similarly be obtained based on the calculation methods corresponding to Equations (9)–(14).
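A minimal sketch of the local attention of Equations (9)–(14) as standard scaled dot-product attention over one branch's hidden states is given below; the class name `LocalAttention` is an assumption, and the sketch computes a full attention map rather than only the diagonal scores written in Equation (12).

```python
# Scaled dot-product attention applied within one parallel branch.
import math
import torch
import torch.nn as nn

class LocalAttention(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        # w_q^L, w_k^L, w_v^L as linear maps (Equations (9)-(11))
        self.wq = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.wk = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.wv = nn.Linear(hidden_dim, hidden_dim, bias=False)

    def forward(self, h):                  # h: (batch, time_steps, hidden_dim)
        q, k, v = self.wq(h), self.wk(h), self.wv(h)
        scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))  # Equation (13)
        a = torch.softmax(scores, dim=-1)  # Equation (12)
        return a @ v                       # Equation (14): reweighted hidden states

# Usage on the output of one LSTM branch:
attn = LocalAttention(hidden_dim=16)
h = torch.randn(8, 16, 16)
h_new = attn(h)                            # same shape, attention-refined
```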

2.3.2. Global Attention for Feature Fusion

Following feature extraction, DAPNet designs a global attention for the fusion of all outputs from the parallel networks so that the diversity of local features can be enriched to generate representative features. Figure 11 shows the global attention in DAPNet. Global attention recalculates and reallocates all the representation hidden states $(h_1^t, \ldots, h_n^t)$ obtained by local attention from $x^t$:
$h_i^t = \mathrm{LocalAtt}_i(x^t)$ (15)

where $\mathrm{LocalAtt}_i(\cdot)$ denotes the local attention of the $i$-th parallel network. For each $h_i^t$, the triple $(q_t^i, k_t^i, v_t^i)$ is calculated as follows:

$q_t^i = w_q^G h_i^t$ (16)

$k_t^i = w_k^G h_i^t$ (17)

$v_t^i = w_v^G h_i^t$ (18)

where $w_q^G$, $w_k^G$, and $w_v^G$ denote the transformation weights of the query, key, and value in global attention. Then, the relevance weights $a_t^i$ can be calculated by

$a_t^i = \mathrm{softmax}(s(q_t^i, k_t^i)) = \frac{\exp(s(q_t^i, k_t^i))}{\sum_{j=1}^{n} \exp(s(q_t^j, k_t^j))}$ (19)

where $n$ denotes the number of parallel networks. The fused global feature $\hat{h}_t$ is obtained from the summation of all combinations of $a_t^i v_t^i$ as follows:

$\hat{h}_t = \mathrm{GlobalAtt}(h_1^t, \ldots, h_n^t) = \sum_{i=1}^{n} a_t^i v_t^i$ (20)

For the final output of DAPNet, an FCNN maps the fused feature to the prediction:

$y_t = \mathrm{FCNN}(\hat{h}_t)$ (21)
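Global attention and the FCNN head of Equations (15)–(21) can be sketched as follows, assuming all branch features share one dimension (in practice a projection would reconcile differing $L_i$); the class name `GlobalAttention` and the two-layer head are illustrative assumptions.

```python
# Fusion of n branch features by global attention, followed by the FCNN head.
import math
import torch
import torch.nn as nn

class GlobalAttention(nn.Module):
    def __init__(self, feat_dim):
        super().__init__()
        self.wq = nn.Linear(feat_dim, feat_dim, bias=False)  # w_q^G
        self.wk = nn.Linear(feat_dim, feat_dim, bias=False)  # w_k^G
        self.wv = nn.Linear(feat_dim, feat_dim, bias=False)  # w_v^G
        self.head = nn.Sequential(                            # FCNN of Equation (21)
            nn.Linear(feat_dim, 32), nn.ReLU(), nn.Linear(32, 1)
        )

    def forward(self, branch_feats):
        # branch_feats: list of n tensors (batch, feat_dim), one per parallel network.
        h = torch.stack(branch_feats, dim=1)              # (batch, n, feat_dim)
        q, k, v = self.wq(h), self.wk(h), self.wv(h)
        scores = (q * k).sum(-1) / math.sqrt(h.size(-1))  # s(q_t^i, k_t^i) per branch
        a = torch.softmax(scores, dim=1)                  # Equation (19), over branches
        fused = (a.unsqueeze(-1) * v).sum(dim=1)          # Equation (20)
        return self.head(fused)                           # Equation (21): predicted SFC

# Usage: fuse, e.g., n = 4 branch features of dimension 16.
ga = GlobalAttention(feat_dim=16)
feats = [torch.randn(8, 16) for _ in range(4)]
y = ga(feats)                                             # (8, 1)
```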

3. Experiments and Comparisons

3.1. Variable Definition and Data Explanation

The case ship in this study is a ferry that operates on the route from Tórshavn to Suduroy Island, as shown in Figure 12 [10]. In Figure 12, the solid line and the dotted line represent the route under normal sailing conditions ($R_N$) and the route under adverse sailing conditions ($R_A$), respectively. The datasets corresponding to the multi-source variables described in Figure 2, Figure 3 and Figure 4 were collected by the multi-source sensors installed on the ship, as listed in Table 1.
To ensure high-quality data for more effective model predictions, additional preparation is required after processing the multi-source heterogeneous SFC data and before feeding them into the model. Firstly, min-max normalization is employed to bring all indicators to the same scale. Secondly, to further enhance the quality and quantity of the SFC input data, a sliding time window is employed for the subsequent framing of the data. The final step is data separation. The SFC dataset after framing comprises 52 complete voyages, totaling approximately 20,116 data points, and is categorized into two datasets. The first includes 15,022 data points from routes under normal sailing conditions and is further divided into training, validation, and test sets with proportions of 70%, 10%, and 20%, respectively. The second includes 5094 data points from routes under adverse sailing conditions and is designated for validation experiments assessing the generality and effectiveness of the proposed model.
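The preparation steps above can be sketched as follows; the min-max scaler, the window builder for the 16-step inputs used later, and the chronological 70/10/20 split are shown on synthetic stand-in arrays, not the actual ferry dataset.

```python
# A minimal sketch of the data preparation: scaling, windowing, and splitting.
import numpy as np

def min_max(x):
    """Scale each column of x to [0, 1]."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo)

def make_windows(X, y, time_steps=16):
    """Build (samples, time_steps, features) sequences via a sliding window."""
    xs = np.stack([X[i:i + time_steps] for i in range(len(X) - time_steps)])
    ys = y[time_steps:]
    return xs, ys

# Stand-in arrays with the normal-route size (15,022 points, 8 variables).
X = min_max(np.random.rand(15022, 8))
y = np.random.rand(15022, 1)
Xw, yw = make_windows(X, y)

# Chronological 70% / 10% / 20% split into train / validation / test sets.
n = len(Xw)
i, j = int(n * 0.7), int(n * 0.8)
train, valid, test = (Xw[:i], yw[:i]), (Xw[i:j], yw[i:j]), (Xw[j:], yw[j:])
```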

3.2. Parameters and Criteria

3.2.1. Parameter Settings

In this experiment, the training processes for all SFC prediction models are consistent. The key training parameters are as follows: the initial learning rate is 0.01, the loss function is the mean squared error (MSE), the initial number of epochs is 10,000, and training is halted when the loss has not decreased for 50 epochs. The batch size is 256, and the optimizer is Adam. In addition, the time step is set to 16 for all models, the number of hidden layers is set to 1 for models with an RNN structure, and the filter size is set to 3 for models with a TCN structure. Parameters not explicitly mentioned here are set to their default values. The above parameters were determined through extensive experimentation, and all models were trained for 20 runs with the best-performing models retained.
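The stated training setup (Adam at learning rate 0.01, MSE loss, batch size 256, early stopping with patience 50 within a 10,000-epoch budget) corresponds to a loop like the following; the toy model and random tensors are stand-ins so the sketch runs, not the actual DAPNet and SFC data.

```python
# A minimal sketch of the training loop with early stopping.
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins; in practice these are DAPNet and the prepared SFC sets.
model = torch.nn.Sequential(torch.nn.Linear(8, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))
train_dl = DataLoader(TensorDataset(torch.randn(1024, 8), torch.randn(1024, 1)), batch_size=256)
val_dl = DataLoader(TensorDataset(torch.randn(256, 8), torch.randn(256, 1)), batch_size=256)

opt = torch.optim.Adam(model.parameters(), lr=0.01)  # initial learning rate 0.01
loss_fn = torch.nn.MSELoss()                         # MSE loss
best_val, patience, bad_epochs = float("inf"), 50, 0

for epoch in range(10000):                           # initial epoch budget
    model.train()
    for xb, yb in train_dl:                          # batches of size 256
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        val = sum(loss_fn(model(xb), yb).item() for xb, yb in val_dl)
    if val < best_val:
        best_val, bad_epochs = val, 0                # improvement: reset patience
    else:
        bad_epochs += 1
        if bad_epochs >= patience:                   # halt after 50 stagnant epochs
            break
```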
This paper compares the performance of DAPNet with ten distinct prediction models, encompassing traditional and advanced RNN/LSTM models and state-of-the-art attention models. Because the numerous parameters involved can form many combinations, it is impractical to list all of them and their impact on predictive results. Therefore, this paper lists the representative parameter settings and adjustment ranges for each model, as shown in Table 2.
The specific parameter adjustment methods for each model are as follows. For the RNN, LSTM, BiGRU, BiLSTM, RA, BA, and SA models, the adjustment range {128, 64, 32, 16, 8} denotes the number of neurons in the hidden layer. For the TCN and TA, the kernel size and d take the values 2 and 1 in the first Res Block, 4 and 2 in the second Res Block, and 8 and 4 in the third Res Block. The adjustment ranges of TLA and DAPNet combine the LSTM- and TCN-related parameters above. The complete training process of DAPNet is detailed in Algorithm 1.
Algorithm 1 Training Process of DAPNet in SFC Prediction Tasks
Input: X = [SPE, PP, SP, TR, PD, SD, HW, CW]; // fuel-consumption-related variables
Output: y = SFC; // the predicted fuel consumption
Initialization: all the parameters U.
for each epoch do
  h_k^(i) = f_k(X_k); // process each feature group X_k through its network (k = LSTM, TCN, improved TCN) to obtain feature representations h_i
  h_i^t = LocalAtt_i(x^t); // apply local attention to h_i to enhance the feature representations
  ĥ_t = GlobalAtt(h_1^t, ..., h_n^t); // fuse the branch features using global attention to obtain the global feature ĥ_t
  y_t = FCNN(ĥ_t); // predict fuel consumption y_t through the fully connected layer
  Update(U); // update the parameters U of DAPNet by minimizing the loss
end for
return the trained DAPNet and the predicted fuel consumption y_t
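Putting Algorithm 1 together, a forward pass could be wired as below, reusing the illustrative `ParallelLSTM`, `ResBlock`, `LocalAttention`, and `GlobalAttention` classes sketched earlier; the branch counts and dimensions are assumptions, not the authors' exact configuration.

```python
# A minimal end-to-end sketch of the DAPNet forward pass in Algorithm 1.
import torch
import torch.nn as nn

class DAPNetSketch(nn.Module):
    def __init__(self, n_lstm_vars=3, n_tcn_vars=5, hidden=16):
        super().__init__()
        self.lstms = ParallelLSTM([hidden] * n_lstm_vars)           # LSTM branches
        self.tcn = nn.Sequential(ResBlock(n_tcn_vars, hidden, 4, 16),
                                 ResBlock(hidden, hidden, 8, 8),
                                 ResBlock(hidden, hidden, 16, 4))    # TCN branch
        # One local attention per branch (n_lstm_vars LSTMs + 1 TCN).
        self.local = nn.ModuleList([LocalAttention(hidden) for _ in range(n_lstm_vars + 1)])
        self.fuse = GlobalAttention(hidden)                          # global attention + FCNN

    def forward(self, lstm_xs, tcn_x):
        hs = self.lstms(lstm_xs)                                     # per-variable LSTM features
        hs.append(self.tcn(tcn_x).transpose(1, 2))                   # TCN -> (batch, time, hidden)
        # Local attention per branch, keeping the last time step as the branch feature.
        feats = [att(h)[:, -1, :] for att, h in zip(self.local, hs)]
        return self.fuse(feats)                                      # fused feature -> predicted SFC

model = DAPNetSketch()
y = model([torch.randn(8, 16, 1) for _ in range(3)], torch.randn(8, 5, 16))  # (8, 1)
```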

3.2.2. Evaluation Criteria

To validate the effectiveness of DAPNet, we employ MAE and RMSE to measure the disparity between actual and predicted values. MAE and RMSE are commonly used standards in the literature for assessing the performance of predictive models and are defined as follows:
$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$ (22)

$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$ (23)

where $n$ represents the number of test samples, and $y_i$ and $\hat{y}_i$ denote the actual and predicted values, respectively.
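These two criteria translate directly into code; the arrays below are placeholders for illustration.

```python
# Evaluation criteria of Equations (22) and (23).
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error (Equation (22))."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root mean square error (Equation (23))."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Usage with placeholder arrays:
y_true, y_pred = np.array([0.21, 0.18, 0.25]), np.array([0.20, 0.19, 0.23])
print(mae(y_true, y_pred), rmse(y_true, y_pred))
```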

3.3. Results and Analyses

During the experimental stage, parameter tuning was conducted first, and the optimal parameters for each model were selected based on the MAE and RMSE results. It is worth emphasizing that the TLA and DAPNet models employed a controlled-variable approach during parameter tuning: both models first optimized the parameters of the LSTM component, and the TCN component was then tuned with each model's optimal LSTM parameters fixed, yielding distinct optimal parameter combinations for the two models. The outcomes are illustrated in Figure 13 and Figure 14.
In Figure 13, the parameter HLD refers to the Hidden Layer Dimension, whereas in Figure 14, the parameter Blocks represents the Number of Residual Blocks. Figure 13 illustrates the parameter tuning process for models such as RNN, LSTM, BiGRU, BiLSTM, RA, BA, SA, TLA, and DAPNet, where the evaluation criteria are specified as MAE and RMSE, respectively. It is noteworthy that TLA and DAPNet adjusted only the parameters in the LSTM component, leaving other parameters as default. Additionally, the optimal parameters for the aforementioned models are provided in Table 3. From Figure 13, it is evident that DAPNet exhibits the best predictive performance. Figure 14 showcases the parameter tuning process for the models, including TCN, TA, TLA, and DAPNet. TLA and DAPNet performed TCN parameter tuning based on the optimal parameters obtained for the LSTM component in Figure 13. Similarly, the optimal parameters for these models are presented in Table 3. Notably, according to Figure 14, DAPNet continues to demonstrate optimal performance. To obtain the optimal model in terms of both accuracy and stability in predicting SFC, Table 3 also presents the MAE and RMSE values of each model under the aforementioned optimal parameters. The results of all comparative models presented in Table 3 are the means and standard deviations (mean ± standard deviation) obtained from 20 independent runs, ensuring a reduction in randomness. Furthermore, the comparisons of different SFC prediction models are displayed in Figure 15 to compare the predictive performance of SFC more clearly. The error bars in Figure 15 represent the standard deviation across 20 experimental runs.
As evident from Figure 15 and Table 3, DAPNet emerges as the model with the best SFC predictive performance, striking a balanced trade-off between accuracy and stability. The average MAE and RMSE values over 20 experimental runs of DAPNet are 0.0073 and 0.0121, respectively, with minimum values reaching 0.0065 and 0.0108. The results indicate that DAPNet improves the MAE and RMSE for SFC prediction by 23.96% and 17.50%, respectively, compared to classic and state-of-the-art models. Conversely, the LSTM and BiLSTM demonstrate commendable stability, but their predictive performance is subpar, attributable to the inability of such simple networks to effectively extract the relevant features associated with SFC. Further analysis reveals that the attention-equipped RNN, LSTM, and BiLSTM models (RA, BA, and SA) outperform the plain RNN, LSTM, BiGRU, and BiLSTM structures. Similarly, the attention-based TCN outperforms the plain TCN. Attention mechanisms therefore contribute significantly to SFC predictive accuracy. Moreover, combining the LSTM and TCN models through attention further improves accuracy: TLA and DAPNet exhibit lower MAE and RMSE values than the other models in Table 3. The predictive performance of DAPNet is further enhanced compared to TLA, showing improvements in MAE, RMSE, and their corresponding stabilities. Thus, the proposed model structure demonstrates superior suitability for SFC prediction compared to traditional hybrid model structures.
To validate the generalizability of DAPNet in the domain of SFC prediction, this paper conducts predictions on the dataset from the alternative route of the case ship outlined in Section 3.1. The prediction performance of each model is again compared under its best parameters, as presented in Table 4 and Figure 16.
It is obvious from Table 4 and Figure 16 that DAPNet consistently outperforms other models for SFC prediction. The average values of MAE and RMSE across 20 experimental runs of DAPNet can reach 0.0157 and 0.0198, respectively. These results indicate that the DAPNet proposed in this paper demonstrates strong generalizability.

3.4. Comparisons and Discussions

The ablation study aims to validate the functionality of each structural component of DAPNet. It separately examines the impact of the framing, improved TCN, LSTM, local attention, and global attention structures on DAPNet. The results presented in Table 5 are averages over 20 experimental runs; Table 5 also reports the training and test times.
As shown in Table 5, the proposed DAPNet achieves the best SFC predictive performance, with average MAE and RMSE values of 0.0073 and 0.0121 across the 20 runs. Generally, although more complex network structures result in longer prediction times, the differences in computational time among the variants remain relatively minor. "DAPNet without data alignment" removes the framing structure from the proposed DAPNet; the proposed DAPNet outperforms it because the framing structure accomplishes data alignment. "DAPNet without local attention" removes the local attention structure; the superior predictive performance of the proposed DAPNet is attributed to the feature alignment capability provided by local attention. Similarly, the proposed DAPNet performs better than "DAPNet without global attention" owing to the assistance of global attention in feature fusion. The results for "DAPNet without local/global attention" show that removing both attentions has the most significant negative impact on predictive performance. Conversely, "DAPNet using traditional TCN" replaces the improved TCN with a traditional TCN, and "DAPNet using BiLSTM" replaces the LSTM with BiLSTM; both operations reduce predictive accuracy to a certain degree. This can be attributed to the fact that the improved TCN and LSTM are better suited to extracting spatio-temporal features from SFC data, ensuring the model's generalization capability. Overall, these comparative experiments indicate that each structural component adopted in the proposed DAPNet has a positive impact on overall predictive performance, and the effectiveness and accuracy of DAPNet are reaffirmed.

4. Conclusions and Future Work

This paper constructs a dual-attention parallel network, namely DAPNet. The proposed method effectively tackles the challenges posed by the multi-source and heterogeneous characteristics of SFC-related data, ultimately enhancing overall predictive performance. At the core of DAPNet, a parallel network structure composed of LSTM and an improved TCN is designed to tackle the challenge of feature extraction. This parallel structure inherits the advantages of both the LSTM and the TCN, improving nonlinear fitting and the ability to adequately extract temporal features from the data. In each parallel network, local attention is incorporated to achieve feature alignment. The features are subsequently aggregated via global attention for feature fusion, thereby enhancing predictive performance. To evaluate the effectiveness of the proposed DAPNet, a ferry is chosen as the case ship. The results demonstrate that DAPNet outperforms both classic neural networks and newer hybrid neural networks in predictive performance. According to the ablation investigations, both local and global attention play a positive role in the prediction outcomes. DAPNet achieves minimum MAE and RMSE values of 0.0065 and 0.0108, respectively, improving the MAE and RMSE for SFC prediction by 23.96% and 17.50% compared to classic and state-of-the-art models. Furthermore, the generalization of DAPNet is validated through comparisons across different routes of the same case ship.
However, there is considerable room for scalability in terms of data sources and model architecture, and several directions for enhancement can be considered in the future. The SFC data used in this study are relatively limited in scope; key data that significantly affect SFC prediction, such as rudder, are not included in the data sources. Moreover, since the weather conditions in this study were manually recorded in the noon report, they may not accurately reflect the actual weather encountered during navigation. In the next phase of this study, we plan to incorporate additional SFC-related feature data, such as rudder, as well as reliable meteorological data corresponding to the navigation routes. This will address the issue of data quality and ensure that the conclusions drawn are more in line with established knowledge in the maritime field. The application of the proposed model to various types of ships can also be investigated to further explore the applicability of SFC prediction. Additionally, future work could extend to research areas such as the optimization of ship energy and speed based on predicted SFC.

Author Contributions

Writing—review and editing, methodology, and formal analysis, Y.Z.; data curation and writing—original draft preparation, X.L. and Y.Z.; validation, X.L. and J.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China (grant nos. 52131101, 51939001) and the Science and Technology Fund for Distinguished Young Scholars of Dalian (grant no. 2021RJ08).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The fuel-consumption-related data of the passenger ro-ro ship can be found at http://cogsys.imm.dtu.dk/propulsionmodelling/data.html (accessed on 14 February 2024). The code of DAPNet can be found at https://github.com/1101Floor/DAPNet (accessed on 14 February 2024).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

  1. Xu, L.; Yang, Z.; Chen, J.; Zou, Z.; Wang, Y. Spatial-temporal evolution characteristics and spillover effects of carbon emissions from shipping trade in EU coastal countries. Ocean Coast. Manag. 2024, 250, 107029. [Google Scholar] [CrossRef]
  2. Chen, X.; Zhou, J.; Wang, Y. Marine fuel restrictions and air pollution: A study on Chinese ports considering transboundary spillovers. Mar. Policy 2024, 163, 106136. [Google Scholar] [CrossRef]
  3. Luo, X.; Yan, R.; Xu, L.; Wang, S. Accuracy and applicability of ship’s fuel consumption prediction models: A comprehensive comparative analysis. Energy 2024, 310, 133187. [Google Scholar] [CrossRef]
  4. Zhi, L.; Zuo, Y. Collaborative Path Planning of Multiple AUVs Based on Adaptive Multi-Population PSO. J. Mar. Sci. Eng. 2024, 12, 223. [Google Scholar] [CrossRef]
  5. Ferlita, A.; Qi, Y.; Nardo, E.; El Moctar, O.; Schellin, T.E.; Ciaramella, A. A framework of a data-driven model for ship performance. Ocean Eng. 2024, 309, 118486. [Google Scholar] [CrossRef]
  6. Fan, A.; Yang, J.; Yang, L.; Wu, D.; Vladimir, N. A review of ship fuel consumption models. Ocean Eng. 2022, 264, 112405. [Google Scholar] [CrossRef]
  7. Fan, A.; Yan, X.; Bucknall, R.; Yin, Q.; Ji, S.; Liu, Y.; Song, R.; Chen, X. A novel ship energy efficiency model considering random environmental parameters. J. Mar. Eng. Technol. 2020, 19, 215–228. [Google Scholar] [CrossRef]
  8. Yang, L.; Chen, G.; Zhao, J.; Rytter, N.G.M. Ship speed optimization considering ocean currents to enhance environmental sustainability in maritime shipping. Sustainability 2020, 12, 3649. [Google Scholar] [CrossRef]
  9. Wei, N.; Yin, L.; Li, C.; Li, C.; Chan, C.; Zeng, F. Forecasting the daily natural gas consumption with an accurate white-box model. Energy 2021, 232, 121036. [Google Scholar] [CrossRef]
  10. Zhu, Y.; Zuo, Y.; Li, T. Modeling of Ship Fuel Consumption Based on Multisource and Heterogeneous Data: Case Study of Passenger Ship. J. Mar. Sci. Eng. 2021, 9, 273. [Google Scholar] [CrossRef]
  11. Yaseen, Z.M.; El-Shafie, A.; Jaafar, O.; Afan, H.A.; Sayl, K.N. Artificial intelligence based models for stream-flow forecasting 2000–2015. J. Hydrol. 2015, 530, 829–844. [Google Scholar] [CrossRef]
  12. Hajli, K.; Rönnqvist, M.; Dadouchi, C.; Audy, J.-F.; Cordeau, J.-F.; Warya, G.; Ngo, T. A fuel consumption prediction model for ships based on historical voyages and meteorological data. J. Mar. Eng. Technol. 2024, 23, 439–450. [Google Scholar] [CrossRef]
  13. Han, P.; Liu, Z.; Sun, Z.; Yan, C. A novel prediction model for ship fuel consumption considering shipping data privacy: An XGBoost-IGWO-LSTM-based personalized federated learning approach. Ocean Eng. 2024, 302, 117668. [Google Scholar] [CrossRef]
  14. Chen, Y.; Sun, B.; Xie, X.; Li, X.; Li, Y.; Zhao, Y. Short-term forecasting for ship fuel consumption based on deep learning. Ocean Eng. 2024, 301, 117398. [Google Scholar] [CrossRef]
  15. Wang, Z.; Lu, T.; Han, Y.; Zhang, C.; Zeng, X.; Li, W. Improving ship fuel consumption and carbon intensity prediction accuracy based on a long short-term memory model with self-attention mechanism. Appl. Sci. 2024, 14, 8526. [Google Scholar] [CrossRef]
  16. Ganaie, M.A.; Hu, M.; Malik, A.K.; Tanveer, M.; Suganthan, P. Ensemble deep learning: A review. Eng. Appl. Artif. Intell. 2022, 115, 105151. [Google Scholar] [CrossRef]
  17. Zhang, Y.; Wen, Q.; Wang, X. OneNet: Enhancing time series forecasting models under concept drift by online ensembling. In Proceedings of the 37th Conference on Neural Information Processing Systems, New Orleans, LA, USA, 22 September 2023. [Google Scholar]
  18. Xie, Y.; Sun, W.; Ren, M.; Chen, S.; Huang, Z.; Pan, X. Stacking ensemble learning models for daily runoff prediction using 1D and 2D CNNs. Expert Syst. Appl. 2023, 217, 119469. [Google Scholar] [CrossRef]
  19. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  20. Bao, K.; Bi, J.; Gao, M.; Sun, Y.; Zhang, X.; Zhang, W. An improved ship trajectory prediction based on AIS data using MHA-BiGRU. J. Mar. Sci. Eng. 2022, 10, 804. [Google Scholar] [CrossRef]
  21. Xiao, Y.; Li, X.; Yao, W.; Chen, J.; Hu, Y. Bidirectional data-driven trajectory prediction for intelligent maritime traffic. IEEE Trans. Intell. Transp. Syst. 2022, 24, 1773–1785. [Google Scholar] [CrossRef]
  22. Liu, R.W.; Hu, K.; Liang, M.; Li, Y.; Liu, X.; Yang, D. QSD-LSTM: Vessel trajectory prediction using long short-term memory with quaternion ship domain. Appl. Ocean Res. 2023, 136, 103592. [Google Scholar] [CrossRef]
  23. Jiang, J.; Zuo, Y.; Xiao, Y.; Zhang, W.; Li, T. STMGF-Net: A Spatiotemporal Multi-Graph Fusion Network for Vessel Trajectory Forecasting in Intelligent Maritime Navigation. IEEE Trans. Intell. Transp. Syst. 2024, 1–13. [Google Scholar] [CrossRef]
  24. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  25. Bin, Y.; Yang, Y.; Shen, F.; Xie, N.; Shen, H.T.; Li, X. Describing video with attention-based bidirectional LSTM. IEEE Trans. Cybern. 2018, 49, 2631–2641. [Google Scholar] [CrossRef] [PubMed]
  26. Yang, H.; Yuan, C.; Zhang, L.; Sun, Y.; Hu, W.; Maybank, S.J. STA-CNN: Convolutional spatial-temporal attention learning for action recognition. IEEE Trans. Image Process. 2020, 29, 5783–5793. [Google Scholar] [CrossRef]
  27. Capobianco, S.; Forti, N.; Millefiori, L.; Braca, P.; Willett, P. Recurrent encoder–decoder networks for vessel trajectory prediction with uncertainty estimation. IEEE Trans. Aerosp. Electron. Syst. 2023, 59, 2554–2565. [Google Scholar] [CrossRef]
  28. Hao, H.; Wang, Y.; Xue, S.; Xia, Y.; Zhao, J.; Shen, F. Temporal convolutional attention-based network for sequence modeling. arXiv 2020, arXiv:2002.12530. [Google Scholar] [CrossRef]
  29. Song, C.; Han, H.; Avrithis, Y. All the attention you need: Global-local, spatial-channel attention for image retrieval. In Proceedings of the 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 16 July 2022. [Google Scholar]
  30. Connor, J.T.; Martin, R.D.; Atlas, L.E. Recurrent neural networks and robust time series prediction. IEEE Trans. Neural Netw. Learn. Syst. 1994, 5, 240–254. [Google Scholar] [CrossRef]
  31. Liu, Y.; Zhang, Q.; Song, L.; Chen, Y. Attention-based recurrent neural networks for accurate short-term and long-term dissolved oxygen prediction. Comput. Electron. Agric. 2019, 165, 104964. [Google Scholar] [CrossRef]
  32. Hao, X.; Liu, Y.; Pei, L.; Li, W.; Du, Y. Atmospheric temperature prediction based on a BiLSTM-Attention model. Symmetry 2022, 14, 2470. [Google Scholar] [CrossRef]
Figure 1. Overview of DAPNet for SFC prediction. The output of DAPNet provides the predicted value of SFC. The input variables, described sequentially from left to right, are as follows: the first two inputs to the LSTM are speed and pitch, where pitch can be further divided into starboard pitch and port pitch. The subsequent inputs to the TCN are trim, draft, and wind, where draft and wind are further subdivided into the starboard draft, port draft, head wind, and cross wind, respectively. The final group of inputs to the improved TCN consists of the combined set of trim, draft, and wind.
Figure 2. Data sampling of navigation speed and ship pitch. (a) Speed; (b) Port pitch; (c) Starboard pitch.
Figure 3. Data sampling of ship trim, draft, and environmental wind. (a) Trim; (b) Port draft; (c) Starboard draft; (d) Head wind; (e) Cross wind.
Figure 4. Data sampling of ship fuel consumption.
Figure 5. The process of data framing for asynchronous frequency.
Figure 6. Parallel network of LSTM cells. The parallel network of LSTM cells consists of multiple LSTM models operating in parallel, with each LSTM network designed to capture distinct temporal characteristics of its respective input elements.
Figure 7. Internal architecture of the LSTM in detail.
Figure 8. Structure of improved TCN.
Figure 9. Parallel network of TCN blocks. The parallel network of the TCN blocks consists of multiple individual TCNs, each tasked with extracting distinct features. Each TCN contains multiple residual modules, incorporating dilated convolution, layer normalization, and ReLU activation. This design enables the parallel TCN to efficiently capture a diverse range of spatial features with enhanced richness and depth.
Figure 10. Design of local attention in parallel network of LSTM and TCN. (a) Local attention of LSTM; (b) Local attention of TCN.
Figure 11. Design of global attention in DAPNet. The global attention mechanism is employed to fuse multiple extracted features by assigning varying importance weights to each feature.
Figure 12. Sailing routes of the case ship.
Figure 13. Parameter adjustments and error comparisons of hidden layer dimension. (a) MAE comparisons of different models with HLD; (b) RMSE comparisons of different models with HLD.
Figure 14. Parameter adjustments and error comparisons of residual blocks. (a) MAE comparisons of different models with blocks; (b) RMSE comparisons of different models with blocks.
Figure 15. Comparisons of different SFC prediction methods. (a) Comparisons of MAE; (b) Comparisons of RMSE.
Figure 16. Comparisons of different SFC prediction methods in validation dataset. (a) Comparisons of MAE; (b) Comparisons of RMSE.
Table 1. Variables explanation of data collection for ferry ship.
No. | Variable | Abbreviation
1 | Ship Fuel Consumption | SFC
2 | Speed | SPE
3 | Port Pitch | PP
4 | Starboard Pitch | SP
5 | Trim | TR
6 | Port Draft | PD
7 | Starboard Draft | SD
8 | Head Wind | HW
9 | Cross Wind | CW
Table 2. Parameter setting of different models.
No. | Model | Parameter Setting and Range of Adjustments
1 | Standard RNN (RNN) [30] | Hidden Layer Dimension (HLD) = {128, 64, 32, 16, 8}
2 | Standard LSTM (LSTM) [19] | HLD = {128, 64, 32, 16, 8}
3 | BiGRU [20] | HLD = {128, 64, 32, 16, 8}
4 | BiLSTM [21] | HLD = {128, 64, 32, 16, 8}
5 | RNN-Attention (RA) [31] | HLD = {128, 64, 32, 16, 8}
6 | BiLSTM-Attention (BA) [32] | HLD = {128, 64, 32, 16, 8}
7 | Seq2Seq-Attention (SA) [27] | HLD = {128, 64, 32, 16, 8}
8 | TCN [26] | Number of Res Blocks = {1, 2, 3}; Kernel size = {2, 4, 8}; d = {1, 2, 4}
9 | TCN-Attention (TA) [28] | Number of Res Blocks = {1, 2, 3}; Kernel size = {2, 4, 8}; d = {1, 2, 4}
10 | TCN-LSTM-Attention (TLA) | Number of Res Blocks = {1, 2, 3}; Kernel size = {2, 4, 8}; d = {1, 2, 4}; HLD = {128, 64, 32, 16, 8}
11 | DAPNet | Number of Res Blocks = {1, 2, 3}; Kernel size = {2, 4, 8}; d = {1, 2, 4}; HLD = {128, 64, 32, 16, 8}
Table 3. Comparisons of prediction performance for different methods.
Methods | Optimal Parameter Values | MAE | RMSE
RNN [30] | HLD = 32 | 0.0105 ± 0.0008 | 0.0153 ± 0.0009
LSTM [19] | HLD = 16 | 0.0097 ± 0.0004 | 0.0147 ± 0.0004
BiGRU [20] | HLD = 16 | 0.0096 ± 0.0004 | 0.0158 ± 0.0009
BiLSTM [21] | HLD = 64 | 0.0094 ± 0.0003 | 0.0141 ± 0.0005
RA [31] | HLD = 16 | 0.0099 ± 0.0006 | 0.0152 ± 0.0014
BA [32] | HLD = 16 | 0.0092 ± 0.0008 | 0.0137 ± 0.0013
SA [27] | HLD = 64 | 0.0091 ± 0.0008 | 0.0138 ± 0.0013
TCN [26] | Number of Res Blocks = 2; (Kernel size, d) per block = {(2, 1), (4, 2)} | 0.0102 ± 0.0012 | 0.0155 ± 0.0021
TA [28] | Number of Res Blocks = 3; (Kernel size, d) per block = {(2, 1), (4, 2), (8, 4)} | 0.0094 ± 0.0008 | 0.0146 ± 0.0013
TLA | HLD = 64; Number of Res Blocks = 3; (Kernel size, d) per block = {(4, 16), (8, 8), (16, 4)} | 0.0092 ± 0.0007 | 0.0143 ± 0.0011
DAPNet | HLD = 16; Number of Res Blocks = 3; (Kernel size, d) per block = {(4, 16), (8, 8), (16, 4)} | 0.0073 ± 0.0003 | 0.0121 ± 0.0007
Table 4. Investigations of generalization for different methods in validation dataset.
Methods | MAE | RMSE
RNN [30] | 0.0228 ± 0.0012 | 0.0270 ± 0.0018
LSTM [19] | 0.0221 ± 0.0018 | 0.0259 ± 0.0021
BiGRU [20] | 0.0201 ± 0.0013 | 0.0229 ± 0.0016
BiLSTM [21] | 0.0196 ± 0.0014 | 0.0234 ± 0.0012
RA [31] | 0.0207 ± 0.0021 | 0.0247 ± 0.0020
BA [32] | 0.0192 ± 0.0016 | 0.0223 ± 0.0014
SA [27] | 0.0196 ± 0.0016 | 0.0224 ± 0.0015
TCN [26] | 0.0216 ± 0.0021 | 0.0255 ± 0.0028
TA [28] | 0.0195 ± 0.0013 | 0.0237 ± 0.0013
TLA | 0.0189 ± 0.0024 | 0.0227 ± 0.0024
DAPNet | 0.0157 ± 0.0012 | 0.0198 ± 0.0018
Table 5. Ablation investigations of proposed method.
Methods | MAE | RMSE | Train Time | Test Time
DAPNet without data alignment | 0.0190 | 0.0241 | 474.34 s | 2.66 s
DAPNet without local attention | 0.0094 | 0.0143 | 408.19 s | 2.05 s
DAPNet without global attention | 0.0095 | 0.0146 | 433.98 s | 3.27 s
DAPNet without local/global attention | 0.0098 | 0.0148 | 335.68 s | 2.20 s
DAPNet using traditional TCN | 0.0096 | 0.0149 | 431.80 s | 3.19 s
DAPNet using BiLSTM | 0.0097 | 0.0155 | 456.73 s | 3.84 s
Proposed DAPNet | 0.0073 | 0.0121 | 444.26 s | 3.24 s
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
