1. Introduction
Ships have been the dominant means of transporting goods for many years, and more than 80% of world trade is transported by sea [1]. The increased interest of the global community in reducing environmental pollution has led the authorities to introduce new regulations to improve vessels’ energy and operational efficiency. However, the total greenhouse gas (GHG) emissions produced by the shipping industry increased by 9.6% from 2012 to 2018 [2]; by 2050, shipping emissions are projected to increase by 90–130% from 2008 levels [3].
The International Maritime Organization (IMO), the main regulatory body for international shipping, introduced environmental regulations in the shipping sector with the adoption of the OILPOL Convention in 1954. More recently, the IMO has adopted mandatory operational and technical measures, and committed to controlling GHG emissions via technological improvements, operational performance indicators, and the use of alternative fuels [3,4].
Therefore, the IMO introduced the Energy Efficiency Design Index (EEDI) for new ship designs, which sets a minimum CO₂ emission per cargo carried, the Energy Efficiency Operational Indicator (EEOI), and the Ship Energy Efficiency Management Plan (SEEMP) for all ships, aiming to improve the operational energy efficiency of ships by using operational strategies and practices [3,5,6]. Most of the measures available are speed based, because ships’ energy efficiency is highly sensitive to speed, which therefore has a significant impact on the reduction of GHG emissions [7,8]. The monitoring of such measures does not involve the investment of new funds or incur significant costs.
In addition, the European Commission (EC) presented several initiatives to limit GHG emissions [
9], speed up decarbonization by setting the target of a climate-neutral Europe by 2050, and incorporate maritime transportation in emissions trading [
10]. Moreover, a wide range of research groups, bodies, and authorities promote new energy indicators, such as the Clean Shipping Index (CSI), the Environmental Ship Index (ESI), and Rightship’s Existing Vessel Design Index (EVDI).
The shipping industry can improve its environmental performance and meet the targets through either ship design or operation-related measures. However, as the authors of [11] argue, although devices for data monitoring have a relatively low cost, the data processing method is quite complex, particularly when the activities of a ship vary. Therefore, energy inefficiencies can occur due to the limited information about energy efficiency and the lack of timeliness with which this information is produced and provided, as [12] concluded. The performance modeling of a ship can be achieved with multiple levels of sophistication [13], such as theory-based models or data-driven models. The former were mainly developed for ship design purposes and carry significant uncertainties regarding a ship’s operational measures [14].
The shift of the shipping industry towards a more digital era has led to the collection of large amounts of data related to energy consumption (e.g., by the Kyma and Lean Marine systems). As a result, the shipping ecosystem aims to use the collected data to improve the operational efficiency of ships, whether this concerns their design or their maintenance plan. Consequently, all stakeholders are eager to make deeper use of complex machine learning methods and to develop data-driven performance models with high prediction accuracy.
This eagerness does not characterize only the shipping industry but every scientific and industrial field. A key tool to fulfill this eagerness and to develop and apply advanced machine learning techniques is the integration of computer science and statistics, as well as the theoretical foundation of artificial intelligence and data science. Thus, it is not surprising that machine learning is one of the technical domains with the fastest growth rates today resulting in the creation of new learning theories and algorithms, their application in new cases and fields, as well as the continual explosion in the accessibility of online data and low-cost processing [
15].
One such example is artificial neural networks (ANNs), which have been used to estimate the shaft power of large merchant ships via data-driven performance models [
16,
17,
18,
19]. Moreover, a Bayesian belief network (BBN) was applied to a dry-bulk ship interface with the port to quantify energy performance [
20]. The capability of ANN and multiple linear regression (MLR) was compared in [
21] to establish the relationship between fuel consumption and main engine RPM, ship speed, etc. Both ANNs and Gaussian Process Regression (GPR) were applied by [
22] to predict the fuel consumption in relation to shaft power and ship speed.
Furthermore, many products that are using machine learning algorithms and utilizing the available ship performance data have already been developed and launched to the market (e.g., BMT, GreenSteam, HITACHI, and NAPA). However, most of these models are not easily understood and their sensitivity and accuracy are not well defined or explained [
23].
The current slow pace of change has increased the pressure on regulatory bodies to intensify their efforts and improve their effectiveness, making it difficult to predict future shipping industry trends. In addition, the absence of a standardized measurement of environmental performance, due to the complexity of the calculations and the dependency on the quantity and quality of the input data, makes assessing and implementing a holistic approach a challenging and time-consuming task for humans. Hence, it is of paramount importance to combine proper data analysis with industry experience.
In this context, the present work intends to provide useful objective indices to aid the assessment of commercial ships’ environmental performance based on machine learning. The paper is organized as follows: the next two sections present the theoretical background and the proposed methodological framework, respectively, while Section 4 presents the application of the proposed methodology as well as the results and the proposed indices incorporated in a graphical tag. Finally, in Section 5, concluding remarks are given as well as some directions for further research.
2. Theoretical Background
The concern for sustainable transformation in maritime has been at the top of the agenda for many years now. However, it involves complex decisions and multiple factors that must be considered [
24]. Hence, most of the decisions that need to be made to improve the environmental performance of vessels and the general shipping industry have conflicting results. As a result, it is difficult to minimize emissions and at the same time maximize service levels [
25]. For this reason, most of the existing management decision systems focus on cost or operational performance indexes [
26].
The keen interest in environmental sustainability has led to extensive research; however, many of the recommended solutions are theoretical and impracticable. In addition, the multiple and controversial environmental initiatives available to the shipping industry do not offer clarity in making decisions and create additional administrative burden [
12]. Further, many of the current studies propose solutions that focus only on the technical side, such as the use of alternative fuels [
27], fuel life cycle calculations [
28], hull cleaning [
29], and vessel design [
30,
31].
In the existing literature, some initiatives provide indications about vessels’ performance based on environmental factors that are considered to be performance-related and others are developed as incentive schemes where environmental improvements to vessels or practices are rewarded with certifications or class notations, and consequently provide a market advantage [
9]. Some other initiatives deal with a single environmental issue or have been developed for a specific use, location, or vessel type, while others assess a broader range of environmental issues and provide an overview of vessels’ environmental performance. However, the effectiveness of these initiatives in improving environmental performance has been questioned. A comparative analysis of the CSI and the ESI suggested that there are several drawbacks in assessing environmental performance [
32]. The authors of [33] were cautious about the contribution of “private standards” to mitigating GHG emissions in shipping, due to the lack of transparency and ambition of several of the schemes analyzed.
In the literature, several studies exist regarding the modeling of vessel fuel consumption and emissions. The traditional “resistance modeling”, with the objective to estimate the vessel’s total resistance in relation to speed and external factors (e.g., wind and waves), is the theoretical foundation of ship fuel consumption [
34,
35]. However, it cannot handle complex issues, which is why alternative methods have been developed [
36,
37,
38,
39,
40]. In general, these studies confirm that the speed of a vessel is the principal factor of fuel consumption, although resistance, due to weather, also has a significant influence [
29,
40].
The approach, the complexity, and the use of raw data are critical to achieving accurate and well-understood results related to the ship’s environmental performance. By applying ANN models, the authors of [41] predicted propulsive power from the indicators that mainly affect vessel resistance (speed, wind speed, direction, temperature, etc.). Other empirical studies have applied ship data from noon reports [
29,
40,
42] or vessel positions from the Automatic Identification System (AIS) [
43]. Moreover, ref. [
44] confirmed that ANN-based fuel prediction is appropriate for analyzing the bunker fuel efficiency of a single oil tanker when noon reports are the primary source of information. Furthermore, ANN models outperform traditional models, such as polynomial regression and support vector machine (SVM) learning, in accuracy and efficiency [
45].
This paper proposes an alternative method for assessing ship environmental performance based on machine learning by using an objective and quantified approach.
3. Materials and Methods
The framework used in this paper makes extensive use of machine learning techniques to create a new composite energy efficiency index based on real ship operational data (see
Figure 1 for the simplified framework process). The actual framework combines Principal Component Analysis (PCA) and clustering techniques to acquire from real data a new combined efficiency index and aims to minimize the number of parameters characterizing the environmental performance of a certain ship.
The best scenario is to conclude with one representative artificial environmental performance index containing the total information (or as much as possible) from the data. Nevertheless, even if only one environmental performance index could summarize the information contained in the data while mixing the various meanings of information, it would still be difficult to draw useful conclusions from it. An alternative and possibly more informative scenario would be the extraction of more than one index incorporating different information (e.g., pollution level and/or pollution reason) from the data providing practical interpretations.
For acquiring appropriate indices from the data, the PCA will be used and then Cluster Analysis (CA) will be applied to create groups of ships with similar environmental performance. PCA is a renowned method that has been applied in a wide range of scientific problems, especially in industry (see, for example, [
46]), to reduce the dimensionality of the data at hand, taking into consideration the relations among variables. Moreover, PCA has been used historically to produce environmental performance-related indices in various production fields [
47,
48].
3.1. Principal Component Analysis
PCA is a mathematical technique [
49] that does not make any assumptions about the nature of the data (e.g., the distribution of the available variables). PCA uses an orthogonal transformation to convert several dependent variables into a reduced number of linearly uncorrelated variables called principal components (PCs). PCA is used for revealing the internal structure of the data in a way that best explains the variance in the data [
49]. An interesting feature of this method is that the extracted PCs may be appropriately interpreted or labeled by identifying which of the original variables contribute to each of the PCs.
Assuming that there are $p$ original variables (say $X_1, X_2, \ldots, X_p$), each of the $p$ PCs (say $Y_1, Y_2, \ldots, Y_p$) may be written as a linear combination of the original variables. Specifically, the $j$th PC can be written in the following form:

$$Y_j = a_{j1} X_1 + a_{j2} X_2 + \cdots + a_{jp} X_p = \sum_{u=1}^{p} a_{ju} X_u,$$

where $a_{ju}$ ($u = 1, 2, \ldots, p$) are appropriate weights that quantify the contribution of the $u$th original variable to the $j$th PC. The PCA model is extracted by appropriately decomposing the covariance matrix $S$ of $X = (X_1, X_2, \ldots, X_p)$ that contains the variances and the covariances of the original variables ($s_{ij}$ denotes the covariance of the $i$th and the $j$th variable). In the case of standardized variables, the PCA model is extracted by appropriately decomposing the correlation matrix $P$ of $X$ that contains the correlation coefficients among the variables (or in other words, the variances and covariances of the standardized variables). In case the population parameters (matrices) are unknown, appropriate estimators are used.
Among PCs, the first PC accounts for as much of the information present in the data as possible, and each succeeding PC in turn has the highest variance possible under the constraint that it is orthogonal to (i.e., uncorrelated with) the preceding PCs. Usually, the first two (or three) uncorrelated PCs explain the majority of the information contained in a data set. In cases, such as the case examined here, where just a small number of original variables is available, usually, the two first PCs explain most of the information in a data set. Thus, it is evident that by using PCA the problem under study may be simplified. In
Figure 2, the application of PCA in the space defined by two original variables is presented.
Moreover, since each PC is a weighted sum of the random variables $X_1, X_2, \ldots, X_p$, where each $X_i$ represents one of the original parameters, then by using the Central Limit Theorem (see, for example, [50]), it is assumed that at least the approximate distribution of each PC is Normal or, equivalently, that the distribution of the standardized PCs is approximately Standard Normal. Moreover, since the PCs are uncorrelated and are assumed to be normally distributed, the lack of correlation is equivalent to the independence of the PCs.
As a consequence, it is evident that the PCA method offers an opportunity to reform the multivariate problem to many univariate problems in the sense that the PCs are independent. This can simplify the procedure of evaluating ships’ environmental performance. Additionally, since PCA permits the interpretation of each PC, it offers an additional tool to assign qualitative meaning to quantitative data.
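The decomposition described in this subsection can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions: the function name `pca_from_correlation` and the synthetic data are hypothetical and are not taken from the paper, and the sample correlation matrix is used as the estimator of $P$.

```python
import numpy as np

def pca_from_correlation(X):
    """Return loadings, scores and explained-variance ratios for data X (n x p)."""
    X = np.asarray(X, dtype=float)
    # Standardize each column using the sample standard deviation (ddof=1).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Sample correlation matrix of the original variables.
    R = np.corrcoef(X, rowvar=False)
    # Symmetric eigendecomposition; eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs                  # column j holds the scores of the j-th PC
    explained = eigvals / eigvals.sum()   # share of total variance per PC
    return eigvecs, scores, explained

# Hypothetical data: three strongly correlated variables and one independent one.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(200, 1)) for _ in range(3)]
              + [rng.normal(size=(200, 1))])
loadings, scores, explained = pca_from_correlation(X)
```

Each column of `loadings` holds the weights $a_{ju}$ of one PC, and the sample covariance matrix of `scores` is diagonal, which reflects the fact that the PCs are uncorrelated.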
3.2. Cluster Analysis
CA or clustering refers to algorithms that aim to organize a set of items/observations into groups or clusters so that they share similar (in some manner) characteristics and differ from other observations that belong to other groups. Clustering is a key function of exploratory data analysis and a widely used method for statistical data analysis in various fields.
CA may be accomplished by several algorithms, which vary greatly in their understanding of what constitutes a cluster and of how to find clusters efficiently. Some common definitions of clusters include groupings with close spacing between cluster members, crowded regions of the data space, intervals, or certain statistical distributions. Therefore, clustering may be described as a multi-objective optimization problem. The proper clustering technique and parameter settings (including factors such as the distance function to employ, a density threshold, or the number of expected clusters) depend on the specific data set and the intended application of the findings. The task of such an analysis can be viewed as the challenge of categorizing items based on how similar they are to one another. To group objects into clusters, this similarity measure is, in most applications, based on distance functions such as the Euclidean distance, the Manhattan distance, the Minkowski distance, the cosine similarity, etc. A cluster is a homogeneous group made up of objects that are sufficiently similar to one another. CA as such is not an automatic task but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and error. It is often necessary to modify the data preprocessing and the model parameters until the result achieves the desired properties.
A graphical representation of an application of CA is presented in
Figure 3.
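The distance functions mentioned above can be illustrated with a short sketch. The two vectors are hypothetical and the helper names are assumptions made for this example; the Manhattan and Euclidean distances are obtained as the Minkowski distance of order 1 and 2, respectively.

```python
import numpy as np

def minkowski(u, v, r):
    """Minkowski distance of order r (r=1: Manhattan, r=2: Euclidean)."""
    return float(np.sum(np.abs(u - v) ** r) ** (1.0 / r))

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (a similarity, not a distance)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Two hypothetical observations described by three variables.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 6.0, 3.0])

manhattan = minkowski(u, v, 1)   # |1-4| + |2-6| + |3-3| = 7
euclidean = minkowski(u, v, 2)   # sqrt(9 + 16 + 0) = 5
similarity = cosine_similarity(u, v)
```

Because these measures react differently to scale and direction, the choice among them can change which observations end up in the same cluster.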
3.3. Available Real Data
To develop new ship environmental performance indices, data were combined from two different sources. The first data source was the EU Monitoring, Reporting, and Verification (MRV) mechanism, which collects the CO₂ emissions reports for ships above 5000 gross tonnage (regardless of the ship’s flag) operating in ports under the jurisdiction of any EU Member State. The second data source was the startup “27 Research” in Greece, which was used to extract information regarding the physical characteristics of all the ships with at least one voyage in the EU zone during 2018–2021. The final merged data set consisted of two major groups, namely, General Cargo ships with 2650 records and Container ships with 62 records. The variables recorded for each ship are described in
Table 1.
4. Analysis and Results
In this section, the data described above are analyzed using mainly the PCA and CA methods described in
Section 3 after a necessary preprocessing procedure and an initial exploratory analysis of the original data to identify possible outliers.
4.1. Data Preprocessing
Before applying the PCA method to the data, the observations in each variable $X_i$ are standardized using the following formula:

$$Z_i = \frac{X_i - \mu_i}{\sigma_i},$$

where $\mu_i$ is the mean of the $i$th variable and $\sigma_i$ is the standard deviation of the $i$th variable. Since, in most cases, the true means and variances are unknown, their unbiased sample estimates can be used.
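As a minimal sketch of this preprocessing step, the standardization can be written as follows. The function name `standardize` and the small data matrix are hypothetical; the sample mean and the standard deviation based on the unbiased sample variance (`ddof=1`) are used in place of the unknown population parameters.

```python
import numpy as np

def standardize(X):
    """Column-wise z-scores using the sample mean and sample standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)   # ddof=1: based on the unbiased sample variance
    return (X - mean) / std

# Hypothetical records: two variables measured on very different scales.
X = np.array([[100.0, 10.0],
              [200.0, 30.0],
              [300.0, 20.0]])
Z = standardize(X)
```

After this step every column has zero mean and unit sample variance, so variables measured in large units do not dominate the PCA.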
4.2. Exploratory Analysis of the Original Data
After preprocessing the data, 381 observations with missing values were removed and 2209 remained. A basic exploratory analysis was then carried out to determine the correlation between the variables and to detect possible outliers. Both analyses are crucial for the PCA, since they not only justify its use, owing to the presence of highly correlated variables, but also result in more robust PCs by removing any outliers that introduce noise into the multivariate data set.
One of the most noticeable points from correlation analysis was the positive correlation between fuel consumption (
) and the variables related to environmentally harmful emissions (
). There was also a stronger positive correlation between variables related to the construction characteristics of the vessels (
). To highlight the significant correlations, a correlation heatmap matrix reporting only the significant Pearson’s correlation coefficients among the variables is depicted in
Figure 4.
Moreover, the annual total fuel consumption for voyages is correlated with the total distance traveled, which is in turn related to the annual total time spent at sea. On the other hand, there are many correlations among the variables corresponding to the technical specifications of a ship, and some of them are strongly positive (≥0.9), such as the correlation of the weight that a ship can safely carry with (a) the width of the ship and (b) the maximum space available for cargo. The maximum space available for cargo is also strongly positively correlated with (a) the maximum length of a vessel and (b) the width of the ship.
To examine if there are any outliers in the data set, the variables
–
(ship characteristics) were examined following two different approaches, namely, a univariate graphical examination and a multivariate approach based on the Mahalanobis Distances (MDs) of the observations (see Figure 5) from the center (mean) of their joint distribution ($MD = \sqrt{(u - \bar{u})^{T} V^{-1} (u - \bar{u})}$, where $u$ is a vector of the observed values, $\bar{u}$ is the vector of the sample means of the variables, and $V$ is the variance–covariance matrix).
As we may observe in
Figure 5, most of the ships have an MD value smaller than 10, a small number of ships have MD values larger than 10, and an even smaller number have MD values larger than 20. Based on the graphical representation, an intuitive threshold for in-bound values was set equal to 10, and all observations with MD values larger than 10 were removed from further analysis because they probably correspond to falsely reported values. For example, the distance between the waterline and the keel (
) of the ship that corresponds to the upper right dot in
Figure 5 is reported to be equal to
m when the average value of this characteristic is only
m (
m), which obviously indicates a false reported value. Another observation that was removed corresponds to a ship with a maximum available space for cargo (
) of 62,400 m³, which is 3.24 standard deviations larger than the corresponding mean value of this variable (mean = 19,452 m³, standard deviation = 13,265.28 m³).
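The multivariate screening described above can be sketched as follows. The data are synthetic and the function name `mahalanobis_distances` is hypothetical; only the threshold of 10 is taken from the text, and the sample mean vector and sample covariance matrix are assumed as estimators.

```python
import numpy as np

def mahalanobis_distances(U):
    """Mahalanobis distance of every row of U from the sample mean vector."""
    U = np.asarray(U, dtype=float)
    centered = U - U.mean(axis=0)
    V = np.cov(U, rowvar=False)        # sample variance-covariance matrix
    V_inv = np.linalg.inv(V)
    # Row-wise quadratic form (u - mean)^T V^{-1} (u - mean).
    d2 = np.einsum("ij,jk,ik->i", centered, V_inv, centered)
    return np.sqrt(d2)

rng = np.random.default_rng(1)
U = rng.normal(size=(500, 3))          # hypothetical ship characteristics
U[0] = [25.0, -25.0, 25.0]             # one hypothetical misreported record

md = mahalanobis_distances(U)
keep = md <= 10                        # threshold of 10, as chosen in the text
reduced = U[keep]
```

In this toy example only the deliberately corrupted record exceeds the threshold. Note that an extreme record also inflates the estimated covariance matrix, which is one reason why the threshold is better chosen after inspecting the plotted distances.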
4.3. Application of PCA
The PCA was applied sequentially to (a) all the available data and (b) the data remaining after removing observations (ships) with an MD value larger than 10 (labeled as the reduced data set). The number of principal components that need to be kept in a PCA is usually determined with the help of the so-called scree plot. In the left plot of
Figure 6, the scree plot, created based on all data, is presented, which helps determine the number of PCs to keep. Based on the plot, one can conclude that three PCs should be preserved since the curve declines steeply and then bends when the number of PCs equals 3, which serves as an indicator of a cut-off point. It is worth mentioning that the same cut-off point is indicated in both cases.
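A scree plot is read visually, but a simple numerical companion is the cumulative share of variance explained by the leading PCs. The sketch below uses that cumulative-variance rule rather than elbow detection, so it is a related check and not the procedure used in the paper; the eigenvalues are hypothetical and the function name `n_components_for` is an assumption.

```python
import numpy as np

def n_components_for(eigenvalues, threshold=0.80):
    """Smallest number of PCs whose cumulative explained variance reaches threshold."""
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Index of the first cumulative share that is >= threshold, plus one.
    return int(np.searchsorted(cumulative, threshold) + 1), cumulative

eigs = [5.0, 2.0, 1.5, 0.6, 0.5, 0.4]   # hypothetical eigenvalues, not from the paper
k, cumulative = n_components_for(eigs)
```

With these illustrative eigenvalues the first three PCs are retained, since they are the first to exceed the 80% level.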
In
Table 2, the PCA loadings for the first three PCs using the initial standardized data set and the reduced data set are reported. The loadings of the first three PCs are similar in both cases, as shown by the small values in the columns labeled “Pairwise Differences among loadings”, which report the differences between the loadings obtained from the initial data and those obtained from the reduced data set. The first three PCs explain more than 80% of the observed variance in the original variables (80.2% and 85.1% for the two cases, respectively). Moreover, from the loadings, it is clear that almost all variables, with the exception of
,
, and
, have a positive correlation with the first PC. Variables
,
,
, and
have a negative correlation with the second PC. A negative correlation with the third PC is also found for the variables
,
,
,
, and
. These relationships can be easily depicted, for example, for the first two PCs, with the help of the biplot, as presented in the right plot of
Figure 6. The biplot presented in the right plot of
Figure 6 is created based on the reduced data set.
Interpretation of the First PCs
Large (absolute) values of PC loadings indicate that the variables have a strong effect on that principal component. From the values reported in
Table 2, it is clear that the first PC is more related to the variables
–
, which represent the ship’s physical dimensions, while the second PC is related more strongly to the variables
–
, which represent the ship’s consumption, CO₂ emissions, and operational data. The third PC seems to be strongly related to variables
–
, which represent the ship’s CO₂ emissions in different geographical regions.
For the first PC, all the large loadings are positive, meaning that ships with large physical dimensions will have large positive values in the first PC while smaller ships will have smaller values. Regarding the second PC, it is worth noting that all the large loadings (related to variables –) are negative, meaning that ships with poor environmental performance (large consumption, large emissions, etc.) will have small values in the second PC and ships with good environmental performance will have large positive values. Finally, in the case of the third PC, a contrast between the loadings representing the emissions during operation under different conditions is observed. Based on the sign of the loadings, it is clear that the third PC gives large positive scores to ships operating mostly in a Member State’s jurisdiction while it gives large negative scores to ships operating mostly outside a Member State’s jurisdiction.
The above indicates that the first three PCs can be clearly interpreted and labeled as follows:
Ship’s “Physical Dimensions” (first PC);
Ship’s “Operational Env. Efficiency” (second PC);
Ship’s “Operating Region” (third PC).
These three PCs can be considered as three independent indices that characterize ships’ size, operational environmental performance, and operating region (in terms of operating mostly inside or outside a Member State’s jurisdiction).
4.4. Definition of a Graphical Tag Based on the Three PCs (Three Indices)
Based on the three aforementioned PCs (independent indices), an appropriate graphical index (tag) for describing the environmental performance of a ship, conditional on its size and its operating region, can be defined. The template icon for the graphical index is given in
Figure 7.
The icon of the ship in
Figure 7 has two distinct zones at the main body of the ship. The lower zone can be used to depict the score of a ship in the first PC, while the upper zone can be used to represent the score of a ship in the second PC. These scores can be depicted with the aid of colors. Specifically, since all the large loadings of the first PC are positive, its score increases with the size of the ship, and the lower zone can be filled (for example, in blue) in proportion to that score. Large ships will be indicated with an almost full lower zone while small ships will be indicated with an almost empty lower zone. The upper zone, which represents the second PC, can be filled gradually with green, orange, and red to represent the environmental efficiency of a ship. Large values of the second PC indicate ships that can be considered environmentally “friendly”, which are depicted with only green in the upper zone. Conversely, the upper zone extends into red for ships that contribute significantly to pollution, i.e., ships with small values of the second PC, and intermediate cases fall between these two extremes. Finally, the third PC is assigned to the stern of the ship and indicates the operating region index of each ship.
The aforementioned procedure can be summarized in the following algorithm, which also clarifies the procedure for determining the gradual fill of the two zones and the color assigned to the stern of the ship in the image given in
Figure 7.
- Step 1.
The two zones in the tag are standardized so that their length is equal to 1.
- Step 2.
Under the assumption that the first PC is, at least approximately, normally distributed, the lower zone of a ship with a first-PC score equal to $y_1$ is filled with the color blue up to the value $\hat{F}_1(y_1)$, where $\hat{F}_1$ denotes the empirical cumulative distribution function of the first PC.
- Step 3.
Following a similar reasoning, the upper zone is filled gradually with green, orange, and red, with the colors assigned to the intervals [0, 1/3), [1/3, 2/3), and [2/3, 1], respectively.
- Step 4.
The third PC, depicted as the stern of the ship, is colored according to the following rule: If the score of the third PC (denoted as PC3 score) is smaller than the first quartile of its values in the available data, then it is filled in red. If the PC3 score is larger than the third quartile, then the stern of the ship is filled with the color green. In all other cases, the stern is filled with the color orange.
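The four steps above can be sketched as follows. This is an illustrative reading of the algorithm, not the authors’ implementation: the names `ecdf`, `zone_color`, and `ship_tag` and the PC scores are hypothetical. In particular, the upper zone is assumed to be filled up to one minus the empirical CDF of the second PC, so that large second-PC scores (good performance) stay in the green interval, as the description of the tag requires.

```python
import numpy as np

def ecdf(sample, value):
    """Empirical CDF of `sample` evaluated at `value`."""
    sample = np.asarray(sample, dtype=float)
    return float(np.mean(sample <= value))

def zone_color(level):
    # Intervals from Step 3: [0, 1/3) green, [1/3, 2/3) orange, [2/3, 1] red.
    if level < 1 / 3:
        return "green"
    if level < 2 / 3:
        return "orange"
    return "red"

def ship_tag(pc1, pc2, pc3, all_pc1, all_pc2, all_pc3):
    lower_fill = ecdf(all_pc1, pc1)              # Step 2: blue fill level in [0, 1]
    # Assumption: 1 - ECDF, so that a large PC2 score gives a short green-only bar.
    upper_fill = 1.0 - ecdf(all_pc2, pc2)
    q1, q3 = np.percentile(all_pc3, [25, 75])    # Step 4: quartiles of the PC3 scores
    if pc3 < q1:
        stern = "red"
    elif pc3 > q3:
        stern = "green"
    else:
        stern = "orange"
    return {"lower_fill": lower_fill, "upper_fill": upper_fill,
            "upper_color": zone_color(upper_fill), "stern": stern}

scores = np.linspace(-2.0, 2.0, 9)   # hypothetical PC scores of nine ships
tag = ship_tag(2.0, 2.0, 2.0, scores, scores, scores)
```

Here `upper_color` is the color reached at the top of the filled part of the upper zone. For the hypothetical ship with the largest score in all three PCs, the lower zone is full, the upper zone stays green, and the stern is green.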
Some characteristic examples of the proposed tag are given in
Figure 8. For example, in
Figure 8a, a ship with good environmental performance (upper green bar), average physical dimensions (blue bottom bar), and operating mostly inside a Member State’s jurisdiction (green rectangle) is depicted. The second tag of
Figure 8b presents a ship with good environmental performance and large physical dimensions, which also operates mostly inside a Member State’s jurisdiction. The third tag (
Figure 8c) represents a ship that differs from the first two only in the dimensions, having a size somewhere in the middle of the two previous ships. The fourth example (
Figure 8d) demonstrates a graphical index of a ship with average to poor environmental performance and small physical dimensions that operates mostly outside a Member State’s jurisdiction. The fifth tag (
Figure 8e) represents a ship with extremely poor environmental performance and large physical dimensions that operates inside a Member State’s jurisdiction. The last tag (
Figure 8f) represents a ship similar to the previous one, with the following differences: (1) its environmental performance is poor, but not as extreme as that of the previous ship, and (2) it operates both inside and outside a Member State’s jurisdiction.
From the above examples, it is clear that the proposed tag can serve as a unified index that represents the environmental impact based on carbon dioxide (CO₂) emissions adjusted to the cargo capacity, which is directly related to the physical dimensions of a ship. Thus, this graphical tag is able to distinguish large vessels with a high environmental impact from vessels with similar dimensions and low emissions. The same applies to smaller vessels as well.
4.5. Cluster Analysis Based on the Three Indices
Following the production of the three indices related to ships’ environmental impact derived by the operation time and emissions in combination with technical characteristics, such as physical dimensions, resulting in the aforementioned graphical tag, the K-Means algorithm was used to further explore the data. More specifically, the K-Means algorithm was implemented using the three indices produced by the PCA to trace and group vessels in clusters with similar characteristics in terms of size, emissions, time of operation, energy consumption, etc.
4.5.1. Choosing Optimal Number of Clusters
Determining the number, k, of clusters in a data set is one of the most crucial tasks in CA. There are several methods to achieve this, each one exploring different characteristics of the data and the clusters, which do not always conclude with the same number of clusters. In such cases, an analysis with all the possible scenarios should be carried out. The information gained by this procedure should then be combined with the knowledge of a domain expert to determine not only a statistically realistic option but also a pragmatic choice from the expert’s point of view.
In the present study, two of the most frequently used methods, namely, the silhouette score and the Gap statistic, are used to determine the optimal number of clusters k in the available data set (PCs for the General Cargo ships). Each of these statistics is calculated for a range of values of the number k of clusters. Large values or, in general, any peaks in the plots of these statistics versus k indicate that the observations in the clusters defined are well-matched with each other and well-separated from neighboring clusters.
The silhouette score [51] for a given separation, i.e., by fixing the number of clusters in the data, is defined as
$$s = \frac{b - a}{\max(a, b)},$$
where $a$ denotes the mean intra-cluster distance and $b$ denotes the mean nearest-cluster distance.
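The definition above can be illustrated with a minimal sketch that evaluates (b - a) / max(a, b) for each observation and averages the result, which is the common convention for reporting a single score per clustering. The function name `silhouette_score_np`, the use of Euclidean distance, and the toy data are assumptions made for illustration and are not taken from the study.

```python
import numpy as np

def silhouette_score_np(X, labels):
    """Mean of (b - a) / max(a, b) over all observations (Euclidean distance).

    Requires at least two distinct cluster labels.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Full pairwise Euclidean distance matrix, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # convention: a singleton cluster scores 0
        # a: mean distance to the other members of the same cluster
        # (the zero self-distance is excluded by dividing by n_own - 1).
        a = dist[i, own].sum() / (n_own - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())

# Toy data (hypothetical): two tight, well-separated groups on a line.
X_toy = np.array([[0.0], [1.0], [10.0], [11.0]])
good_score = silhouette_score_np(X_toy, np.array([0, 0, 1, 1]))
bad_score = silhouette_score_np(X_toy, np.array([0, 1, 0, 1]))
print(round(good_score, 4), round(bad_score, 4))
```

A grouping that matches the structure of the data scores close to 1, whereas a grouping that mixes the two groups scores below 0.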
The Gap statistic, on the other hand, compares, for each number $k$ of clusters, the total within-cluster variation $W_k$ (in the log scale) with its expected value, which is determined by generating a large number of reference data sets from a uniform distribution on the hypercube defined by the range of the available variables (in this case, the three PCs). For more details, the reader is referred to [
52,
53].
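The comparison described above can be sketched as follows. This is a simplified illustration rather than the procedure used in the study: it takes the k-means within-cluster sum of squares (scikit-learn's `inertia_`) as the within-cluster variation, draws uniform reference sets over the bounding box of the data, and reports only the gap values, omitting the standard-error correction discussed in the cited references. The function names, the number of reference sets, and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def log_within_dispersion(X, k, seed=0):
    """log(W_k): log of the within-cluster sum of squares after k-means."""
    model = KMeans(n_clusters=k, n_init=5, random_state=seed).fit(X)
    return np.log(model.inertia_)

def gap_statistic(X, k_values, n_refs=10, seed=0):
    """Gap(k) = mean of log(W_k) over reference sets minus log(W_k) of X."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)  # bounding hypercube of the data
    gaps = []
    for k in k_values:
        ref_logs = [
            log_within_dispersion(rng.uniform(lo, hi, size=X.shape), k, seed)
            for _ in range(n_refs)
        ]
        gaps.append(np.mean(ref_logs) - log_within_dispersion(X, k, seed))
    return np.array(gaps)

# Synthetic stand-in for the three PCs: four well-separated groups.
rng = np.random.default_rng(42)
centers = np.array([[0, 0, 0], [5, 5, 0], [5, 0, 5], [0, 5, 5]], dtype=float)
X_demo = np.vstack([c + rng.normal(scale=0.3, size=(50, 3)) for c in centers])

k_values = list(range(1, 7))
gaps = gap_statistic(X_demo, k_values)
for k, g in zip(k_values, gaps):
    print(f"k={k}: gap={g:.2f}")
```

Plotting `gaps` against `k_values` gives the kind of curve in which a peak is read as the suggested number of clusters.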
The silhouette score and the Gap statistic for the data set, computed using the k-means algorithm, are depicted in Figure 9. Both procedures present a peak at $k = 4$, indicating that there are four clusters in the data. Additionally, the silhouette score indicates that an analysis with two clusters could also be a reasonably good option. As a result, both analyses were carried out and are presented briefly next.
4.5.2. K-Means Algorithm
The k-means algorithm was applied in the data set by setting the number of clusters equal to 2 and 4. In
Figure 10, the two clusters defined by the k-means algorithm are plotted with respect to the available variables, i.e., with respect to the PCs. More specifically, in the upper left plot, the two clusters (colored with red and green) are plotted against the first and the second PCs. In the upper right plot, the same clusters are plotted against the first and the third PCs, while in the lower plot, the clusters are plotted against the second and the third PCs.
From the plots, the clustering algorithm splits the data set into two segments, mainly with respect to the size of the ships. For example, the clusters on the plane defined by the first and the second PC or on the plane defined by the first and the third PC (plots in the upper row of Figure 10) seem to be separated well in terms of physical dimensions, i.e., with respect to the horizontal axis. However, there seems to be no significant separation regarding the other two PCs (see, for example, the lower plot in Figure 10 or the plots in the upper row with respect to the vertical axis). Therefore, it seems that the k-means algorithm with $k = 2$ manages to separate the ships with respect to their physical dimensions but fails to capture any other difference regarding the other two indices.
The plots in
Figure 11 represent the clusters, colored in four different colors, identified by the k-means algorithm in the case of four clusters with respect—as in
Figure 10—to the PCs. From the plots, it is clear that the four clusters are well-separated with respect to the first two PCs (see upper left plot), namely, the “Physical Dimensions” and “Operational Env. Efficiency”. The third PC, namely, the “Operating Region”, seems to play a relatively smaller role in the separation of the clusters (see the upper right and the lower plots).
From the above analysis, it is clear that the four-cluster analysis is more informative than the two-cluster analysis. The four-cluster approach manages not only to separate the ships according to their “Physical Dimensions”, as the two-cluster analysis did, but also to exploit the information hidden in the second PC (Operational Env. Efficiency).
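The contrast between the two-cluster and the four-cluster solutions can be reproduced on synthetic data with a short sketch. The data below are hypothetical stand-ins for the PC scores, constructed so that the groups differ strongly in the first column and moderately in the second. The use of scikit-learn's `KMeans` and all parameter values are assumptions, not the settings of the study.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the PC scores (hypothetical values): the columns play
# the role of PC1, PC2, and PC3. Groups differ strongly in PC1, less in PC2.
rng = np.random.default_rng(0)
centers = np.array([[-3, -1, 0], [-3, 1, 0], [3, -1, 0], [3, 1, 0]], dtype=float)
pcs = np.vstack([c + rng.normal(scale=0.2, size=(50, 3)) for c in centers])

results = {}
for k in (2, 4):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pcs)
    results[k] = {
        "labels": model.labels_,
        "centroids": model.cluster_centers_,
        "sizes": np.bincount(model.labels_, minlength=k),
    }

for k, res in results.items():
    print(f"k={k}: sizes={res['sizes'].tolist()}")
    print(np.round(res["centroids"], 2))
```

In this toy setting, the two centroids obtained for k = 2 differ essentially only in the first column, whereas the four centroids obtained for k = 4 also differ in the second column, which mirrors the behaviour described for Figures 10 and 11.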
4.5.3. Interpretation of Clusters
The differences in the four clusters are also highlighted in
Figure 12, in which the values of the three indices (PCs) at the centroid of the four clusters identified by the k-means algorithm are presented.
The first cluster consists of 416 ships, while the corresponding numbers for clusters 2, 3, and 4 are 625, 309, and 848, respectively. The differences between the clusters were also tested using an ANOVA (assuming that PCs are independent and approximately normally distributed). The results of ANOVA confirmed, indeed, that there is a statistically significant difference between the means of the three PCs in the four clusters (p-value < 0.0001).
The four identified clusters can be briefly labeled as follows:
Cluster 1: “large, environmentally friendly ships”;
Cluster 2: “small, environmentally friendly ships”;
Cluster 3: “large, non-environmentally friendly ships”;
Cluster 4: “small, non-environmentally friendly ships”.
These are in accordance with the existing literature on the environmental sustainability in maritime shipping (see, for example, [
24,
26]).
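The ANOVA comparison of the cluster means mentioned above can be sketched with one one-way test per index. The example uses synthetic scores with known group labels in place of the k-means labels of the study, and SciPy's `f_oneway`. The group centers, the sample sizes, and the variable names are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import f_oneway

# Synthetic scores (hypothetical): four groups of 60 observations whose means
# differ in each of the three columns (PC1, PC2, PC3).
rng = np.random.default_rng(1)
centers = np.array(
    [[-3, -1, 0.5], [-3, 1, -0.5], [3, -1, 0.0], [3, 1, 0.3]], dtype=float
)
scores = np.vstack([c + rng.normal(scale=0.3, size=(60, 3)) for c in centers])
labels = np.repeat(np.arange(4), 60)  # stand-in for the cluster labels

anova = {}
for j, name in enumerate(["PC1", "PC2", "PC3"]):
    # One sample per cluster for the j-th index.
    groups = [scores[labels == c, j] for c in range(4)]
    f_stat, p_value = f_oneway(*groups)
    anova[name] = (f_stat, p_value)
    print(f"{name}: F={f_stat:.1f}, p={p_value:.2e}")
```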
As a final remark, one can notice that while the third index (Operating Region) seems to play, as already mentioned, a relatively smaller role in the separation of the clusters, there is still some information that can be extracted with respect to this index. More specifically, it seems that the small, environmentally friendly ships (Cluster 2) tend to operate exclusively inside a Member State’s jurisdiction. In addition, it is interesting to mention that while a group of large ships with poor environmental performance due to their size and design is indeed expected to be observed [
30,
31], there is also a large number of small ships with poor environmental performance (Cluster 4), which operate almost exclusively outside a Member State’s jurisdiction.
4.5.4. Further Investigation of the Characteristics of Clusters
To delve deeper into the nature and the characteristics of the identified clusters, the correlations among the PCs within each cluster were also calculated. In
Table 3, the Pearson correlation coefficients and their corresponding p-values (in parentheses) are presented for all the possible pairs of PCs in each cluster. All the correlation coefficients demonstrate a weak but significant—at a significance level of 0.05—correlation between all the PCs.
More specifically, Physical Dimensions (PC1) and Ship’s Operational Env. Efficiency (PC2) present a weak positive correlation in all clusters, meaning that the ship’s size influences positively its environmental impact. This positive correlation seems to be larger in Clusters 1 and 3—i.e., among large ships—and smaller among small ships (Clusters 2 and 4).
Ship’s Operational Env. Efficiency (PC2) and Operating Region (PC3) seem to have a weak negative correlation in all the clusters except Cluster 2, i.e., the cluster defined by the small, environmentally friendly ships. One possible explanation for this is that in the group of small, environmentally friendly ships (Cluster 2), the better the environmental performance of a ship, the more likely it is to use eco-friendly fuel and operate mostly inside a Member State’s jurisdiction. On the other hand, for the other three clusters, the observed negative correlation is quite surprising, since it implies that ships that operate mostly in areas where low-quality oil is used (i.e., outside a Member State’s jurisdiction) tend to have a better environmental performance, i.e., large values of the second index (PC2). This may be explained by the better engine specifications usually adopted by ships that make long international voyages to reduce travel costs.
Regarding the correlation between the Physical Dimensions (PC1) and the Operating Region (PC3) in each cluster, it seems that there is a weak but statistically significant negative correlation among the environmentally friendly ships (Clusters 1 and 2) and a weak but statistically significant positive correlation among the non-environmentally friendly ships (Clusters 3 and 4). This may again be explained by the better engines used by ships that make long international voyages to reduce travel costs.
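A table of within-cluster Pearson correlations with p-values, of the kind discussed above, can be assembled as in the following sketch. The data are synthetic, with a weak positive PC1 and PC2 relationship built into each hypothetical cluster. The cluster sizes, the coefficients, and the variable names are assumptions and do not reproduce the values of Table 3.

```python
import itertools
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
n_per_cluster = 500
names = ["PC1", "PC2", "PC3"]

# Two hypothetical clusters, each with a weak positive PC1-PC2 relationship.
clusters = {}
for cluster_id in (1, 2):
    pc1 = rng.normal(size=n_per_cluster)
    pc2 = 0.3 * pc1 + rng.normal(size=n_per_cluster)
    pc3 = rng.normal(size=n_per_cluster)
    clusters[cluster_id] = np.column_stack([pc1, pc2, pc3])

# Pearson coefficient and p-value for every pair of indices in every cluster.
table = {}
for cluster_id, scores in clusters.items():
    for i, j in itertools.combinations(range(3), 2):
        r, p = pearsonr(scores[:, i], scores[:, j])
        table[(cluster_id, names[i], names[j])] = (r, p)
        print(f"Cluster {cluster_id}, {names[i]}-{names[j]}: r={r:.3f} (p={p:.4f})")
```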
4.6. PCA Validation of the Proposed Indices
To validate the PCs produced by the PCA and used afterward in the CA, the 62 Container ships in the merged data set (see
Section 3.3) were used. According to Regulation (EC) No 1367/2006 of the European Parliament and of the Council of 6 September 2006, the main difference between the two categories is the weight and volume of cargo carried. As a result, the 62 Container ships can serve as a validation set to assess the quality, reliability, and consistency of the analytical findings of the PCA and the creation of the three indices (PCs).
In
Table 4, the PCs produced by this data set are presented along with the PCs from the reduced data from the General Cargo ships. Additionally, pairwise differences among the loadings are also given for comparison purposes. From the results, it is obvious that the PCs for the Container ships present similar values to those derived by the General Cargo ships and can produce three similar, in nature and behavior, indices. Therefore, it seems that the three proposed indices provide a concrete description of the environmental performance and can be used in other categories of ships. It is of note that no outlier was detected among the Container ships observations.
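The comparison of loadings between the two ship categories can be illustrated with a small sketch. It computes the loadings from the standardized variables via a singular value decomposition, aligns the arbitrary sign of each component with the reference set, and takes the pairwise differences. The simulated data, the helper names `pca_loadings`, `align_signs`, and `simulate_fleet`, and the choice of two components are assumptions. Only the validation sample size of 62 echoes the text.

```python
import numpy as np

def pca_loadings(X, n_components):
    """Loadings (eigenvectors of the correlation matrix) obtained via SVD."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize columns
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return vt[:n_components].T  # shape: (n_variables, n_components)

def align_signs(loadings, reference):
    """Flip columns so each component points the same way as the reference."""
    signs = np.sign(np.sum(loadings * reference, axis=0))
    signs[signs == 0] = 1.0
    return loadings * signs

def simulate_fleet(n, rng):
    """Hypothetical data: five variables driven by one factor, two by another."""
    f1 = rng.normal(size=(n, 1))
    f2 = rng.normal(size=(n, 1))
    block1 = f1 + 0.4 * rng.normal(size=(n, 5))
    block2 = f2 + 0.8 * rng.normal(size=(n, 2))
    return np.hstack([block1, block2])

rng = np.random.default_rng(3)
main_set = simulate_fleet(2000, rng)      # stand-in for the main data set
validation_set = simulate_fleet(62, rng)  # stand-in for the validation set

ref_loadings = pca_loadings(main_set, n_components=2)
val_loadings = align_signs(pca_loadings(validation_set, n_components=2), ref_loadings)
diff = val_loadings - ref_loadings  # pairwise differences among the loadings
print(np.round(np.hstack([ref_loadings, val_loadings, diff]), 2))
```

The sign alignment matters because the direction of a principal component is arbitrary, so without it two equivalent solutions could appear to differ by a factor of -1.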
5. Discussion and Conclusions
The increasing focus on environmental sustainability has spurred considerable research and the production of numerous theoretical solutions. The traditional “resistance modeling” has been widely accepted as the theoretical foundation of ship fuel consumption and emissions, as it serves to estimate the vessel’s total resistance in relation to speed and external influences [
34,
35]. Although this method is widely used, it is limited in its ability to address complex issues. Consequently, alternative methods have been developed to further improve the accuracy of fuel consumption predictions [
36,
37,
38,
39,
40]. These studies have generally concluded that speed is the primary factor of fuel consumption, with external conditions such as weather playing a significant secondary role [
29,
40]. Nonetheless, many current studies prioritize the technical aspects of sustainability, and the multitude of environmental initiatives available to the shipping industry can be confusing and add to administrative burdens.
This research presents a large number of environmental initiatives or indices that are currently available in the shipping industry, including instruments developed by the IMO. The framework used makes extensive use of machine learning techniques to create new composite energy efficiency indices that are based on real ship operational data. Specifically, the framework combines PCA and clustering techniques to derive, from real data, new combined efficiency indices that are easy to interpret. These indices are combined in a graphical tag to depict the environmental impact of a ship. Although a plethora of clustering and dimensionality reduction techniques could be applied in future studies, PCA appears well suited to the task of linearly transforming the variables and reducing them to composite variables.
Moreover, based on the three proposed indices, the ships are categorized into four clusters that incorporate the information of 14 operational and design variables. These clusters distinguish the vessels based on their environmental impact, physical dimensions, and operation region, thus shedding light on the specific characteristics of each cluster. For example, it was shown that small, environmentally friendly ships usually operate exclusively inside a Member State’s jurisdiction, which is a characteristic that is not met in any other group of ships. Moreover, a significant number of small ships with poor environmental performance were identified, which operate exclusively outside a Member State’s jurisdiction.
The proposed indices and the corresponding graphical tag manage, indeed, to represent the environmental footprint of a ship. These indices are incorporated in an innovative graphical tag that can serve as an environmental impact label for the ships. However, aggregated data such as those in the present data set can only serve as a snapshot of the ship’s performance. More frequently recorded observations would provide more detailed information, ensure the robustness, and secure the quality of the data. Using statistical process monitoring (SPM) systems to continuously monitor carbon emissions from businesses can have several advantages. At the industrial level, it can assist in detecting excessive emissions at an early stage and ensuring that the necessary measures can be implemented in advance to limit them. This can minimize the estimated overall cost, which includes emission-related and operational costs of the SPM program. In addition, it can help with determining whether the emissions are within the regulatory limit or at a high risk of non-compliance. It can also support monitoring and measuring the impact and associated costs of emissions on the environment in order to establish guidelines for comparing the actual with the targeted emissions. Most importantly, SPM programs can assist decision-makers in determining an acceptable emission charge [54]. A more dynamic tool will require collecting data on a more regular basis (for example, monthly), which will make it possible not only to incorporate changes in the second (Operational Env. Efficiency) and the third (Operating Region) indices and to monitor the environmental footprint of a specific ship, but also (a) to identify seasonal patterns and/or (b) to detect early the trends and changes in the shipping market that affect the performance of the ships in general. With even more frequently collected data (for example, every hour), this could also allow the real-time monitoring of the second index (Operational Env. Efficiency), which could result in a real-time decision-making tool by updating the permissible limit and alerting all the ships in a region.