1. Introduction
In the past few years, great attention has been devoted to the efficiency of ship operation in different ocean-going conditions [1,2]. To reduce greenhouse gas emissions and fuel consumption, improvements have been made to the hull shape design [3,4] and to the power plant arrangement [5,6]. These have been complemented by weather routing systems that help ships avoid sea states in which they are forced to increase emissions and fuel consumption [7,8].
There are numerous research approaches to the prediction of ship speed, some addressing the physical mechanisms [9,10], while others rely on statistically based algorithms [11,12,13,14]. Normally, ship performance is studied through the speed–power curve of the vessel, obtained from a series of sea trials. However, when ship efficiency is examined, the focus is on fuel consumption rather than on the engine power derived from the speed–power curve. The single curve obtained from sea trials is not enough to study fuel consumption over the whole range of operating conditions encountered in the constantly changing operational schedule of a vessel. In [15], the authors investigated the problem of predicting fuel consumption and finding the best trim for a vessel in real operation, based on data measured onboard. For the fuel consumption prediction, they compared three approaches: white, black, and gray box models. The influence of trim can also be determined from numerical models to be used at the design stage [16], including the influence of biofouling on ship resistance [17], as well as the uncertainty in predictions [18].
All this generates unreliable estimation conditions [19]. Speed prediction plays a fundamental role in voyage optimization, which consists of choosing the best voyage for the vessel so as to increase energy efficiency and reduce greenhouse gas emissions in the maritime sector. An accurate forecast of in-operation ship performance is a requirement for fulfilling these goals [20]. In [21], an innovative platform was introduced to harmonize, through big data technologies, data collected from various sensors onboard. Extreme-scale processing techniques were also implemented in order to achieve operational efficiency and performance optimization. This platform was further benchmarked on a series of pilot demonstrations regarding fuel consumption prediction. In [22], an accurate regression model for the fuel consumption of the main engine, using an artificial neural network (ANN), was proposed on the basis of big data analysis, including data collection, clustering, compression, and expansion. Various numbers of hidden layers and neurons and different types of activation functions were tested in the ANN, and their effects on the accuracy and efficiency of the regression analysis were studied. The authors showed that the proposed ANN regression model predicts the fuel consumption of the main engine more accurately and efficiently than polynomial regression and support vector machines.
The prediction of ship speed in real sea conditions is both essential and challenging [9,13]. To evaluate the ship speed in the ocean, the added resistance in waves must be computed accurately. The added resistance is caused by the radiated waves together with the reflection or diffraction of incident waves. The former mostly arises from vessel motions and is well predicted by potential theory (strip theory for bow waves), which involves either a far-field technique based on the momentum conservation principle, or a near-field technique that integrates the hydrodynamic pressure on the hull surface [23]. However, predictions of the added resistance generated mostly at the vessel forepart show a large scatter depending on the adopted approach. Various techniques are available to predict the added resistance in waves, such as potential-based methods [24,25,26,27], experimental methods [28,29], and computational fluid dynamics (CFD) approaches [30,31,32,33,34].
The methodology developed and presented here is based on a neural network trained with suitable simulated data acquired through routing software that provides realistic time histories of ship performance at sea [7,8]. These navigation data simulate the data set that would be collected on board [35], and are used to predict the speed of the vessel in different operational conditions. Neural networks have been applied to a range of problems related to ship performance, such as weather forecasting [36], wind loading [37], ship maneuvering [38,39], and ship motions and loads [40].
This learning method is very convenient, since it does not require the theoretical form of the function to be stated, which makes the calculation fast enough to be carried out in real time. Here, the multilayer perceptron (MLP) is adopted, which is among the most frequently applied neural network configurations. For training, the Levenberg–Marquardt backpropagation algorithm is used, and its results are then compared with those obtained with the resilient backpropagation method. The inputs used to train the ANN for the prediction of the ship speed are the output torque of the main engine, the revolutions per minute (RPM) of the propulsion shaft, the significant wave height, the peak period of the waves, and the relative angle of wave encounter. The ultimate aim is to apply the approach on board vessels.
The structure of the paper is as follows:
Section 2 explains the source of data associated with the ship operation with regard to weather conditions;
Section 3 presents the ANN model adopted in this study;
Section 4 reports the training procedure;
Section 5 discusses the results obtained;
Section 6 presents the conclusions.
2. Ship Operational Data
Ship performance at sea can vary appreciably from calm water to sailing in waves, up to the extreme design environments, as shown in [7,8]. The principal prerequisites for a well-designed weather routing system are accurate weather predictions up to the planned arrival time, as well as the ability to reproduce the behavior of the vessel in whatever conditions it meets in the ocean, determining the motions, achievable speed, fuel consumption, and emissions.
At the design stage, the assessment of ship performance relies on models. These models may be empirical or semi-empirical, numerical, or physical in the case of tests in towing tanks [13]. Regarding the behavior of the vessel during ocean operation, assessments based on model experiments and theoretical computations may be compared with log book data. However, few studies have yet compared them with on-board monitoring data [1]. Due to economic and time constraints, any of those models can only be verified for a relatively limited number of cases, which in general differ from the large variety of conditions in which a ship may be required to operate in her lifetime.
In order to make up for this limitation, and to provide proper data to validate the proposed ANN model, realistic time histories of ship performance at sea have been simulated by route planning software. The software models the ship behavior at sea, accounting for involuntary speed reductions due to added resistance and engine overloading, and voluntary speed reductions due to undesired ship motions and the exceedance of pre-defined sea-keeping limits [41,42]; it performs a multi-objective optimization through a robust evolutionary algorithm aimed at identifying a set of Pareto-optimal routes balancing voyage duration, fuel consumption, and navigational safety. The first version of the software implemented so far [7,8] takes into consideration the variability of sea-state conditions (significant wave height, peak period, and wave direction) and sailing conditions (ship speed and course) along the route, while assuming constant loading conditions; consequently, the ANN developed in this work does not consider displacement and trim as input variables. Thus, this preliminary analysis and testing of the method partially retains that limitation.
The transatlantic route of a 24,600 t container ship between the English Channel and the Miami area [43] is considered in this paper. To provide a larger range of variability in the operational conditions, the simulated data include summer and spring voyages, in both directions (westwards and eastwards), and with different mission requirements.
3. Artificial Neural Network Model
A neural network is a computational or mathematical model that attempts to imitate either the structure or the functional features of biological neural systems. ANNs are distributed networks of adaptive, nonlinear processing elements (PEs); in other words, they are made up of an interconnected set of artificial neurons and process data using a connectionist approach to computation. In practical implementations, a PE is simply a sum of products followed by a nonlinearity (McCulloch–Pitts model). The strengths of the connections, called weights, can be adjusted so that the output of the system matches a desired output.
Usually, a neural network is an adaptive system that changes its structure on the basis of external or internal information that flows through the network during the training stage. Adaptation is the ability to change the network parameters according to a rule (usually, minimizing an error function), and it allows the network to search for the best performance. A neural network is a nonlinear statistical data modelling tool. It may be used to model complex relationships between inputs and outputs or to detect patterns in data. Distributed processing offers the benefits of reliability, redundancy, a wide division of the computations, and cooperative computing.
The MLP is the most widely applied neural network topology. Lippmann [44] is one of the most frequently recommended references on the computational capabilities of MLPs. For static pattern classification, the MLP with two hidden layers is a universal pattern classifier; that is, the discriminant functions can take any shape required by the input data clusters. Moreover, when the weights and the output classes are properly normalized, the MLP achieves the performance of the maximum a posteriori receiver, which is optimal from a classification point of view [45]. In terms of mapping abilities, the MLP is believed to be capable of approximating arbitrary functions.
MLPs are usually trained using the backpropagation method; here the Levenberg–Marquardt optimization is applied. The weights and biases are updated according to the optimization procedure, and the backpropagation technique is used to compute the Jacobian matrix of the performance function with respect to the weights and biases.
In fact, the renewed interest in neural networks was in part triggered by the advent of backpropagation. The least mean squares (LMS) algorithm presented by Bernard Widrow (1960) cannot be applied to hidden PEs, because the desired signal is not known for them. The backpropagation algorithm propagates the errors through the network and allows the hidden PEs to be adapted.
Two of the main features of an MLP are its nonlinear PEs, whose nonlinearity must be smooth (the logistic function and the hyperbolic tangent are the most widely used), and its massive interconnectivity (that is, any element of a given layer feeds every element of the following layer). The MLP is trained using error-correction learning, which means that the desired response of the network must be known. In pattern recognition this is usually the case, because the input data are labelled (that is, it is known which data belong to which experiment).
Here a single-hidden-layer MLP structure is adopted for the time-domain relation between outputs and inputs. A typical layout is shown in Figure 1. A single bias neuron $b$ is added to each of the input and hidden layers; it is summed with the inputs $p_j$ weighted by $w_j$ to form the net input $n$, which can be expressed by
$$n = \sum_{j} w_j\, p_j + b \tag{1}$$
The number of hidden neurons governs the robustness of the network design. This number was carefully selected after several training runs, performed for confirmation through a sensitivity analysis of the results.
Here two different MLP structures (feedforward ANNs) were developed in MATLAB and trained with the Levenberg–Marquardt method, and then compared with the resilient backpropagation optimization algorithm. The first MLP consists of three layers of nodes: an input layer, a hidden layer, and an output layer; the second has an input layer, three hidden layers, and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function, in this case a log-sigmoid function. In the first part of the study, the training procedure used five input nodes and one output node, as indicated before. For this first set of results only the first MLP structure (one hidden layer) was used, and a configuration with two hidden neurons was chosen after studying the convergence and generalization ability of the system. The training was performed using only the Levenberg–Marquardt method, and the adopted input variables were:
HS—significant wave height;
TP—peak period of wave;
Torque—main engine output;
RPM—revolutions of the shaft;
Alpha—relative angle of wave encounter.
The relative angle of wave encounter was reduced to the range [0, 180] degrees, since the wave impact on the resultant ship speed is assumed to be symmetric, regardless of whether the waves approach from the port side or the starboard side.
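The folding of the encounter angle described above can be illustrated with a minimal Python sketch. The function name `fold_wave_angle` and the assumption that the raw angle is given in degrees over any range (for example 0 to 360, or negative values for one side) are illustrative choices, not part of the original study.

```python
def fold_wave_angle(angle_deg: float) -> float:
    """Map a relative wave encounter angle in degrees onto [0, 180].

    Port/starboard symmetry is assumed, so an angle a and its mirror
    image 360 - a are treated as the same sea condition.
    """
    a = angle_deg % 360.0              # wrap any input into [0, 360)
    return 360.0 - a if a > 180.0 else a


# Hypothetical usage: waves 30 degrees off the bow on either side.
port_side = fold_wave_angle(330.0)
starboard_side = fold_wave_angle(30.0)
```

Both calls yield the same folded angle, so the network sees a single input value for the two mirrored situations.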
4. Neural Network Training
A log-sigmoid activation function is applied to all neurons in the hidden layer, giving the system the ability to make smooth decisions. The function is defined as
$$f(n) = \frac{1}{1 + e^{-n}} \tag{2}$$
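As an illustration, a forward pass through a single-hidden-layer MLP with log-sigmoid hidden neurons can be sketched in Python with NumPy. The five inputs and two hidden neurons follow the configuration described in the text; the linear output neuron, the random weights, and the normalized input values are assumptions made only for this sketch (the original networks were built in MATLAB).

```python
import numpy as np


def logsig(n):
    """Log-sigmoid activation: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-n))


def mlp_forward(p, W1, b1, W2, b2):
    """Forward pass of a single-hidden-layer MLP.

    p  : input vector (HS, TP, torque, RPM, alpha), assumed normalized
    W1 : hidden-layer weights, shape (n_hidden, n_inputs)
    b1 : hidden-layer biases,  shape (n_hidden,)
    W2 : output-layer weights, shape (n_outputs, n_hidden)
    b2 : output-layer biases,  shape (n_outputs,)
    """
    hidden = logsig(W1 @ p + b1)   # net input: weighted sum plus bias, then squashing
    return W2 @ hidden + b2        # linear output neuron (an assumption of this sketch)


# Hypothetical example: 5 inputs, 2 hidden neurons, 1 output (ship speed).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 5)), rng.normal(size=2)
W2, b2 = rng.normal(size=(1, 2)), rng.normal(size=1)
p = np.array([0.4, 0.6, 0.8, 0.7, 0.25])   # made-up normalized inputs
speed = mlp_forward(p, W1, b1, W2, b2)
```

With untrained random weights the output has no physical meaning; the sketch only shows how the net input and the activation function combine.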
Error-correction learning works as follows: from the network response $y_i(n)$ at PE $i$ at iteration $n$, and the desired response $d_i(n)$ for a given input pattern, an instantaneous error $e_i(n)$ is defined by
$$e_i(n) = d_i(n) - y_i(n) \tag{3}$$
Backpropagation calculates the sensitivity of a cost function with respect to every weight in the system, and updates each weight in proportion to that sensitivity [46]. The benefit of the method is that it can be applied with local information and requires only a few multiplications per weight, which is very efficient. The disadvantage is that, being a gradient descent method, it uses only local information and may therefore get stuck in local minima. Furthermore, the method is inherently noisy because a poor estimate of the gradient is used, which results in slow convergence.
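Together with backpropagation, which is a steepest-descent method, the Levenberg–Marquardt algorithm is used. It derives from Newton's method and was designed for minimizing functions that are sums of squares of nonlinear functions [47], of the form
$$E = \frac{1}{2}\sum_{k} (e_k)^2 = \frac{1}{2}\lVert \mathbf{e} \rVert^2 \tag{4}$$
where $e_k$ is the error in the $k$th pattern and $\mathbf{e}$ is the vector with components $e_k$. Provided the difference between the previous weight vector $\mathbf{w}(j)$ and the new one $\mathbf{w}(j+1)$ is small, the error vector can be expanded to first order in a Taylor series:
$$\mathbf{e}(j+1) = \mathbf{e}(j) + \mathbf{J}\,[\mathbf{w}(j+1) - \mathbf{w}(j)] \tag{5}$$
Consequently, the error function can be written as
$$E = \frac{1}{2}\left\lVert \mathbf{e}(j) + \mathbf{J}\,[\mathbf{w}(j+1) - \mathbf{w}(j)] \right\rVert^2 \tag{6}$$
Minimizing the error function with respect to the new weight vector gives
$$\mathbf{w}(j+1) = \mathbf{w}(j) - (\mathbf{J}^{T}\mathbf{J})^{-1}\mathbf{J}^{T}\mathbf{e}(j) \tag{7}$$
where
$$(\mathbf{J})_{ki} = \frac{\partial e_k}{\partial w_i}$$
is the Jacobian matrix.
The Hessian for the sum-of-squares error function is given by
$$(\mathbf{H})_{il} = \frac{\partial^2 E}{\partial w_i\,\partial w_l} = \sum_{k}\left\{ \frac{\partial e_k}{\partial w_i}\frac{\partial e_k}{\partial w_l} + e_k\,\frac{\partial^2 e_k}{\partial w_i\,\partial w_l} \right\} \tag{8}$$
Neglecting the second term in (8), the Hessian can be approximated as
$$\mathbf{H} = \mathbf{J}^{T}\mathbf{J} \tag{9}$$
The weight update requires the inverse Hessian. The approximate Hessian is relatively easy to compute, because it involves only first-order derivatives with respect to the network weights, which are readily obtained through backpropagation. Although the update formula could be applied iteratively to minimize the error function, it may produce a large step size that invalidates the linear approximation on which the method is based. In the Levenberg–Marquardt technique, the error function is minimized while the step size is kept small, so as to ensure the validity of the linear approximation. This is achieved by using a modified error function of the form
$$\tilde{E} = \frac{1}{2}\left\lVert \mathbf{e}(j) + \mathbf{J}\,[\mathbf{w}(j+1) - \mathbf{w}(j)] \right\rVert^2 + \frac{\lambda}{2}\left\lVert \mathbf{w}(j+1) - \mathbf{w}(j) \right\rVert^2 \tag{10}$$
where $\lambda$ is a parameter controlling the step size. Minimizing the modified error with respect to $\mathbf{w}(j+1)$ gives
$$\mathbf{w}(j+1) = \mathbf{w}(j) - (\mathbf{J}^{T}\mathbf{J} + \lambda\mathbf{I})^{-1}\mathbf{J}^{T}\mathbf{e}(j) \tag{11}$$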
When the scalar λ is zero, Equation (11) is exactly Newton's method using the approximate Hessian matrix. When λ is large, the formulation becomes gradient descent with a small step size. Newton's method is faster and more accurate near an error minimum, so the objective is to shift towards Newton's method as quickly as possible. Consequently, λ is reduced after every successful step (one that reduces the performance function) and is increased only when a tentative step would increase the performance function. In this way, the performance function is reduced at every iteration of the algorithm.
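The damping strategy described above can be sketched in Python with NumPy. This is a generic illustration of a Levenberg–Marquardt loop, not the MATLAB routine used in the study; the function name, the initial value of λ, the factor of 10, and the exponential curve-fitting example are all assumptions chosen for the sketch.

```python
import numpy as np


def levenberg_marquardt(residual, jacobian, w, lam=1e-2, factor=10.0, iters=50):
    """Minimize 0.5 * ||e(w)||^2 with a damped Gauss-Newton (LM) update.

    residual(w) returns the error vector e; jacobian(w) returns the matrix
    of derivatives d e_k / d w_i. lam is the damping parameter (lambda).
    """
    e = residual(w)
    cost = 0.5 * e @ e
    for _ in range(iters):
        J = jacobian(w)
        g = J.T @ e                    # gradient of the cost
        H = J.T @ J                    # approximate Hessian
        while True:
            step = np.linalg.solve(H + lam * np.eye(w.size), g)
            w_try = w - step
            e_try = residual(w_try)
            cost_try = 0.5 * e_try @ e_try
            if cost_try < cost:        # successful step: accept it, reduce lambda
                w, e, cost = w_try, e_try, cost_try
                lam /= factor
                break
            lam *= factor              # failed step: increase lambda and retry
            if lam > 1e10:             # no further improvement is possible
                return w
    return w


# Hypothetical example: recover a and b in y = a * exp(b * x) from noise-free samples.
x = np.linspace(0.0, 2.0, 20)
y = 2.0 * np.exp(-1.0 * x)
res = lambda w: w[0] * np.exp(w[1] * x) - y
jac = lambda w: np.column_stack([np.exp(w[1] * x), w[0] * x * np.exp(w[1] * x)])
w_fit = levenberg_marquardt(res, jac, np.array([1.0, 0.0]))
```

Because a step is accepted only when it lowers the cost, the cost never increases from one iteration to the next, which mirrors the behavior described in the text.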
The sigmoid transfer functions typically used by the multilayer networks in the hidden layers are often called “squashing” functions, because they compress an infinite input range into a finite output range. Sigmoid functions are characterized by the fact that their slopes must approach zero as the input gets large. This causes a problem when the steepest descent is used to train a multilayer network with sigmoid functions, because the gradient can have a very small magnitude and, therefore, cause small changes in the weights and biases, even though the weights and biases are far from their optimal values.
The purpose of the resilient backpropagation training algorithm is to eliminate these harmful effects of the magnitudes of the partial derivatives. Only the sign of the derivative determines the direction of the weight update; the magnitude of the derivative has no effect on the weight update. The size of the weight change is determined by a separate update value. The update value for each weight and bias is increased by a factor whenever the derivative of the performance function with respect to that weight has the same sign for two successive iterations. The update value is decreased by a factor whenever the derivative with respect to that weight changes sign from the previous iteration. If the derivative is zero, the update value remains the same. Whenever the weights are oscillating, the weight change is reduced. If the weight continues to change in the same direction for several iterations, the magnitude of the weight change increases. A complete description of the resilient backpropagation algorithm is given in [48]. This learning algorithm is used for comparison with the results obtained with the Levenberg–Marquardt method.
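The sign-based update rule described above can be sketched in Python with NumPy. The increase and decrease factors (1.2 and 0.5) and the bounds on the update value are commonly quoted defaults and are assumptions here, since the text only says "a factor"; the function name is likewise illustrative.

```python
import numpy as np


def rprop_step(grad, prev_grad, delta, inc=1.2, dec=0.5, d_min=1e-6, d_max=50.0):
    """One resilient-backpropagation update for a vector of weights.

    grad, prev_grad : current and previous derivatives of the performance function
    delta           : current per-weight update values (all positive)
    Returns the weight change to apply and the adapted update values.
    """
    same_sign = grad * prev_grad
    # Same sign on two successive iterations: enlarge the update value.
    delta = np.where(same_sign > 0, np.minimum(delta * inc, d_max), delta)
    # Sign change: the last step overshot, so shrink the update value.
    delta = np.where(same_sign < 0, np.maximum(delta * dec, d_min), delta)
    # Only the sign of the derivative sets the direction of the change.
    return -np.sign(grad) * delta, delta


# Hypothetical usage with three weights.
grad = np.array([1.0, -1.0, 0.5])
prev_grad = np.array([2.0, 1.0, 0.0])
change, new_delta = rprop_step(grad, prev_grad, np.array([0.1, 0.1, 0.1]))
```

In the example, the first update value grows, the second shrinks because the derivative changed sign, and the third is unchanged because the previous derivative was zero.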
5. Results of Speed Prediction
The simulated data obtained with the weather routing code utilized to verify the neural net approach were divided into four sets, representing four voyages in the North Atlantic: westward and eastward, in April and July 2001, respectively.
In each simulation run, the values of the ship speed, the output torque of the main engine, the RPM of the propulsion shaft, the significant wave height, the peak period of the waves, and the relative angle of wave encounter were estimated at consecutive control points along the route, spaced about 30 nautical miles apart, which corresponds to roughly one and a half hours at the considered ship speeds; however, the sampling rate was not kept constant. To better characterize each voyage, the respective maximum, minimum, mean, and standard deviation values of the significant wave height, peak period, and ship speed are listed in Table 1.
When training multilayer networks, the common procedure is to first split the data into three subsets:
The training set, which is used both to calculate the gradient and to update the network weights and biases;
The validation set, whose error is monitored throughout the training procedure. The validation error usually drops during the initial phase of training, as does the training set error. However, when the model starts to over-fit the data, the error on this set generally increases. The network weights and biases are saved when the validation set error is at a minimum;
The test set, whose error is not used during training but serves to compare different models. It is also useful to plot the test set error during training. If the error on this set reaches a minimum at a substantially different iteration number than the validation set error, this may indicate a poor division of the data set.
Here, the data division function is invoked automatically when the model is trained and randomly splits the data into training, validation, and testing subsets. The fractions of data placed in the training, testing, and validation sets are 0.7, 0.15, and 0.15, respectively. The sets have different numbers of data points, varying between 12,754 and 15,618 in total for each set.
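A random division with these fractions can be sketched in Python with NumPy. The function name `split_indices`, the fixed seed, and the rounding convention are assumptions of this sketch; the study relied on the automatic data division of the MATLAB toolbox, whose exact rounding may differ.

```python
import numpy as np


def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly split sample indices into training, validation, and test subsets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)            # shuffle all sample indices once
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]                # remainder goes to the test subset
    return train, val, test


# Hypothetical usage for the smallest data set size mentioned in the text.
train_idx, val_idx, test_idx = split_indices(12754)
```

The three index arrays are disjoint and together cover every sample, so no data point is used in more than one subset.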
The correlation analysis is performed using the Pearson correlation coefficient $r$, which measures the linear dependence between the model outputs and the desired outputs. By definition, the Pearson correlation coefficient between an output $x$ and a desired output $d$ is given by
$$r = \frac{\sum_{i}(x_i - \bar{x})(d_i - \bar{d})}{\sqrt{\sum_{i}(x_i - \bar{x})^2\,\sum_{i}(d_i - \bar{d})^2}} \tag{12}$$
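The Pearson correlation coefficient between model outputs and desired outputs can be computed with a few lines of Python and NumPy. The function name and the sample values below are illustrative only.

```python
import numpy as np


def pearson_r(x, d):
    """Pearson correlation coefficient between outputs x and desired outputs d."""
    x = np.asarray(x, dtype=float)
    d = np.asarray(d, dtype=float)
    xc = x - x.mean()                  # deviations from the mean output
    dc = d - d.mean()                  # deviations from the mean target
    return float((xc @ dc) / np.sqrt((xc @ xc) * (dc @ dc)))


# Hypothetical predicted and simulated ship speeds in knots.
predicted = [18.2, 19.1, 20.3, 17.6, 21.0]
simulated = [18.0, 19.4, 20.1, 17.9, 21.2]
r = pearson_r(predicted, simulated)
```

A value of r close to 1 indicates a strong linear relationship between predictions and targets, although it does not by itself rule out a constant bias.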
The $r$ values obtained for the estimation of the vessel speed and fuel oil consumption (FOC) over the training, test, and validation data sets are listed in Table 2 and Table 3, respectively.
It can be seen that very good agreement is achieved for each of the tested sets using only two hidden neurons. Figure 2 and Figure 3 show the respective regression plots, illustrating the fit between the outputs and targets for the ship speed estimation.
The labels of the y-axes of Figure 2 and Figure 3 give the linear regression equations between the predicted values and the target values, with the output as the dependent variable and the target as the independent variable. These equations express how well the ANN performed. The coefficient of the target (the slope) reveals the relationship between the outputs and the targets; for a satisfactory performance of the MLP it should be as close to unity as possible. The second term, which is a constant, is the offset that must be added to the scaled target to bring it as close as possible to the predicted output; ideally it would be zero or as small as possible.
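The regression line reported on such plots can be reproduced with a least-squares fit of the outputs against the targets. The following Python sketch uses NumPy's `polyfit`; the function name and the sample data are hypothetical.

```python
import numpy as np


def regression_line(target, output):
    """Fit output = slope * target + intercept by ordinary least squares."""
    slope, intercept = np.polyfit(target, output, 1)
    return float(slope), float(intercept)


# Hypothetical targets and outputs that follow output = 0.98 * target + 0.3 exactly.
target = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
output = 0.98 * target + 0.3
slope, intercept = regression_line(target, output)
```

A slope near one together with an intercept near zero indicates that the network outputs track the targets without systematic scaling or offset.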
The regression plots for the 01072001_east and 01072001_west voyages for the ship speed estimation are similar to those presented in Figure 2 and Figure 3, as are the regression plots for the FOC estimation; for this reason these figures are omitted here.
Figure 4 and Figure 5 show comparisons between the neural network predictions and the results derived from the simulations. A total of 1000 data points are shown for each data set. The plots show very good correspondence between the neural network approximations and the simulation results.
Table 4 and Table 5 present the results of the sensitivity analysis performed to assess the capability of the ANN to predict the ship speed and FOC, respectively, without information on the engine state (RPM and torque), using only the sea conditions as inputs. This analysis was made for two different neural network structures: one hidden layer vs. three hidden layers.
From the results presented in Table 4 and Table 5 it can be seen that the ship speed and the FOC can be predicted with satisfactory accuracy using only the sea conditions as input data, disregarding the RPM and torque of the engine. A loss of accuracy can be noted in the voyages where the ranges of HS and TP are narrower (lower standard deviation). It is also noted that, as the number of neurons increases, the model with just one hidden layer becomes less accurate on the test and validation datasets, while the opposite is observed for the training dataset. With the three-hidden-layer structure this problem is solved, and the accuracy on the test and validation datasets is almost the same as that obtained for the training set.
Table 6 presents the results of the sensitivity analysis performed to assess the capability of the ANN to predict the ship speed using the resilient backpropagation learning algorithm instead of the Levenberg–Marquardt algorithm, again without information on the engine state (RPM and torque), using only the sea conditions as inputs. This analysis was made for the same two neural network structures (one hidden layer vs. three hidden layers) as in Table 4 and Table 5.
The results obtained with the Levenberg–Marquardt method were compared with those of another training method, in this case the resilient backpropagation learning algorithm, in order to assess the influence of the training algorithm. Comparing the results presented in Table 6 with those in Table 4, it can be seen that the magnitudes of the results are similar, as is the behavior of the models when the inner structure is changed.
6. Conclusions
A procedure based on artificial neural networks has been implemented to provide estimations of ship speed and FOC from measurements of the output torque of the main engine, the RPM of the propulsion shaft, the significant wave height, the peak period of the waves, and the relative angle of wave encounter. The resulting neural network system produces accurate approximations of both variables, indicating that acceptable results can be achieved with an ANN trained using only two hidden neurons. Additionally, it was demonstrated that the ship speed and the FOC can be predicted using only information about the sea conditions as input data. The main purpose of the presented approach is its integration into a hull monitoring system. The ANN can be used as a redundant system in case of failure of a sensor, providing the expected information to the controllers, and can help in anticipating deviations of the ship speed from predefined target conditions, reducing action delays and optimizing the ship operation.
The data used to train and validate the neural network system were obtained from simulations of ship operations run with a routing program, which provided a large set of realistic navigation data that mimics the data set collected on board. Thus the neural network can be used as an alternative to the more complex system used in [7,8], or to any other weather routing system [49,50]. It is also a good option for use in onboard decision support systems, like the ones described in [35,51].