1. Introduction
Seagrasses are the only submerged marine plants with an underground root and rhizome system forming beds and meadows. They play a key role in the ecosystem services of the global coastal zone in relation to nutrient biogeochemical cycling, carbon sequestration, sediment stabilization, fish refugia, and food-web structure [
1]. Although the global species diversity of seagrasses is relatively low (<60 species), understanding their distribution is particularly important for ecologists developing bioregional models, covering all oceans and climatic zones together with the respective species assemblages [
2]. Most aquatic ecosystem health assessment studies rely on seagrass species richness and distribution, since these serve as valuable bio-indicators reflecting recent environmental changes, especially the release of pollutants and eutrophication events.
Halophila minor and
Halophila ovalis act as bio-indicators for trace metal pollution and sediment accumulation [
3], and
Zostera marina is a eutrophication indicator [
4], while the genus
Cymodocea acts as a heavy-metal bioaccumulator and tolerant bioindicator of pollution [
5], rapid coastal development, and human intervention [
6]. Finally,
Posidonia oceanica meadows are directly linked to the degree of human impact, such as effluents, nutrient enrichment, organic matter, trace metals, coastal settlements, fish farms, and trawling [
7,
8,
9].
It is evident that seagrass species face significant challenges due to their high vulnerability related to a variety of anthropogenic disturbances concentrated along the coastal zone [
2], and their sensitivity to environmental changes driven by climate change [
10], ultimately leading to their global decline [
11]. In the Mediterranean bioregion, there are eight seagrass species, namely
Cymodocea nodosa,
Halophila decipiens,
Halophila stipulacea,
Posidonia oceanica,
Ruppia cirrhosa,
Ruppia maritima,
Zostera noltii, and
Zostera marina, with four of six families declining in the western Mediterranean region, implying that priority conservation policies are needed [
11,
12]. To understand the complex interrelation between oceanographic, environmental, morphodynamic, and human impact conditions and the distribution of seagrass species, modern data-driven models have been developed and implemented, following Machine Learning and Artificial Neural Networks techniques [
13,
14]. Such models are capable of exploring seagrass presence/absence dynamics and of linking seagrass families and species to the main environmental drivers determining their distribution and, ultimately, their abundance.
The exploitation of patterns between seabed habitat types and environmental parameters is particularly important for ecologists, since the abundance and distribution of seabed biological communities are directly linked to water column dynamics and quality [
15]. In parallel, seagrass presence/absence and internal body growth asymmetries provide valuable metrics for the evaluation of marine ecosystems’ ecological status [
16]. Although the interrelation between physical, chemical, biological, seabed- and human-related drivers and seagrass species distribution is strongly non-linear, and therefore difficult to identify, modern data-driven models and techniques can explore these hidden patterns. In addition, the development of systematic, freely available, diverse databases over recent decades, together with tools for Big Data fusion and aggregation, has led to a better understanding of coastal benthic processes and human impacts.
In the present study, fuzzy logic modeling was used to explore the nonlinear dynamics between the environment-ecosystem-human gradient and its impact on seagrass distribution in the Mediterranean Sea. Fuzzy inference is the process of formulating the mapping from a given set of inputs to an output using fuzzy logic. Such systems are particularly suited to modeling the relationship between variables in environments that are either ill-defined or very complex, as is the case for seagrass distribution and its relation to environmental and human drivers. A Fuzzy Inference System (FIS) is an engine that applies reasoning to compute fuzzy outputs, and involves a knowledge base which defines rules and membership functions (MFs). The system is built using a set of “if–then” rules, each of which, like any conventional rule in artificial intelligence, has the general form “If x is A, then z is C”, where A and C are linguistic values defined by fuzzy sets in the universes of discourse X and Z, respectively. The “if” and “then” parts are called the antecedent and the consequent of a rule, respectively.
The optimal model generated was further used to assess the response of the main Mediterranean seagrass species distribution patterns to changes in climate (gradual water temperature increase, sea level rise) and eutrophication. This FIS model could be used by policy-makers and public authorities responsible for the Marine Strategy Framework Directive (MSFD) implementation to understand and assess pressures and impacts on seagrass ecosystems, to link abiotic factors to the determination of the appropriate Good Environmental Status (GES), and to establish baselines and targets for biodiversity assessments in the frame of the MSFD (Descriptor 1) and the UN Sustainable Development Goals (Goal 14.2), in line with the prevailing hydrologic, oceanographic, biogeographic, and climatic conditions.
The work was conducted in the framework of the Horizon 2020 ODYSSEA Project. Some of the aims of this project include developing appropriate algorithms, aggregating data from diverse databases, revealing hidden relationships among abiotic and biological parameters, and identifying species distribution patterns and species richness trends.
2. Materials and Methods
2.1. The Seagrass Dataset
The initial dataset was the Global Distribution of Seagrasses dataset, produced by the United Nations Environment Programme—World Conservation Monitoring Centre (UNEP-WCMC) [
17,
18]. The dataset illustrates the global distribution of 73 seagrass species and has the form of a geo-referenced shapefile composed of two subsets of points and polygons, indicating the occurrences measured from 1934 to 2015 on seagrass family, genus, and species. A total of 1771 locations were found for the major Mediterranean seagrass families (Cymodoceaceae, Zosteraceae, Posidoniaceae, Hydrocharitaceae, Ruppiaceae) (
Table 1).
A limited number of problematic points were identified, based mostly on the depth zone distribution and the distance to the coast. Data were filtered on Mediterranean Sea seagrass families, and ultimately, the species. Ruppiaceae data were excluded from analysis due to their limited occurrence (<2.0%).
2.2. The Environmental Drivers Dataset
The distribution of seagrasses in the Mediterranean Sea is assumed to vary in relation to a series of physical, chemical, biological, and seabed- and human-related parameters. For each geolocation in which seagrass species are reported, hydrographic data (water temperature, salinity, currents, waves) and water quality data (nutrients, dissolved oxygen, chlorophyll-a, net primary production rates) at the surface and bottom of the water column were retrieved from the Copernicus Marine Environmental Service (CMEMS,
https://marine.copernicus.eu/) data products. These are gridded, mean-monthly oceanographic data, covering the whole Mediterranean Sea during variable periods from 1987 to 2015 (
Table 2). The only exception was the wave dataset that was reported on an hourly basis and subsequently converted into mean-monthly values.
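The conversion of hourly wave records into mean-monthly values can be sketched as follows. This is a minimal illustration in Python, not the processing chain used in the study; the function name `monthly_means` and the `(timestamp, value)` input layout are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime


def monthly_means(hourly_records):
    """Average (timestamp, value) pairs into one mean per (year, month).

    hourly_records: iterable of (datetime, float) pairs (hypothetical layout).
    Returns a dict mapping (year, month) to the mean of that month's values.
    """
    totals = defaultdict(lambda: [0.0, 0])  # (year, month) -> [sum, count]
    for timestamp, value in hourly_records:
        acc = totals[(timestamp.year, timestamp.month)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Toy wave heights in metres (illustrative values only)
records = [
    (datetime(2015, 1, 1, 0), 1.0),
    (datetime(2015, 1, 1, 1), 3.0),
    (datetime(2015, 2, 1, 0), 5.0),
]
means = monthly_means(records)
```

Averaging these monthly means over all available years would then give the climatological (multi-year) mean for each location.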
Other factors potentially affecting the distribution of seagrass species are the water depth and the substrate conditions (varying from mud to rock) which were provided by the European Marine Observation and Data Network (EMODnet,
https://www.emodnet.eu/en).
Finally, the distance of each seagrass observation point to the closest point of human influence (port, coastal city, river mouth, coastline) was computed, using the haversine distance method.
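As an illustration of the haversine method, the following Python sketch computes the great-circle distance between two points and the distance from a site to the nearest feature in a list. The Earth radius value and the helper `nearest_distance_km` are assumptions for the example; the study does not report these details.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_distance_km(site, features):
    """Distance from a (lat, lon) site to the closest (lat, lon) feature.

    Hypothetical helper: features could be ports, cities, or river mouths.
    """
    return min(haversine_km(site[0], site[1], f[0], f[1]) for f in features)
```

The same function can be applied separately to each feature type (ports, cities, river mouths, coastline points) to obtain one distance predictor per type.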
The final dataset consisted of 573 data points, representing each location (geographical coordinates) with the presence of the four main Mediterranean seagrass families, namely Zosteraceae, Hydrocharitaceae, Cymodoceaceae, and Posidoniaceae, together with the climatological (multi-year) means produced by averaging the mean-monthly values of physical, water quality, and biological parameters [
19]. Bathymetry, substrate properties, and distances of seagrass locations from areas of potential human impact (coast, cities with a population of over 50,000 inhabitants, ports, and river mouths) were also included in the dataset. These parameters are considered as potential drivers to explain marine flora distribution and the response to environmental changes. The dataset was randomly divided into two parts: 90% was defined as a training dataset, while the remaining 10% acted as a validation dataset.
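A random 90%/10% split of this kind can be sketched in Python as below. The function name `split_dataset` and the fixed seed are assumptions for reproducibility of the example; the study does not state how the random division was implemented.

```python
import random


def split_dataset(records, train_fraction=0.9, seed=0):
    """Shuffle records and split them into training and validation parts.

    The seed is an assumption made so the example is reproducible.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 573 data points, as in the final dataset
training, validation = split_dataset(range(573))
```

With 573 records, this yields 515 training and 58 validation records.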
2.3. The Fuzzy Inference System
A Mamdani-type fuzzy inference system was developed for this study, which comprised four steps (
Figure 1): fuzzification of input variables, rules construction and evaluation, aggregation of rules output, and defuzzification [
20,
21].
The FIS consisted of four input parameters, drawn from the physical, chemical, biological, seabed- and human-related drivers, and a series of fuzzy rules based on the most frequent parameter interrelations in the training set, and returned one aggregate value representing the seagrass family. The FIS should be able to respond to the following question: “Given a series of environmental driving conditions prevailing in an area, what is the seagrass family favored by these conditions?”
The first step involves taking the crisp inputs, i.e., each numerical value, and assigning them to the appropriate fuzzy sets (defined here as Very Low, Low, Medium, High, and Very High) through the selected membership function (in this case, a trapezoidal membership function). The trapezoidal membership function used to transform crisp inputs into membership values belonging to the various fuzzy sets of the FIS has the form:
$$\mu(x; a, b, c, d) = \max\left(\min\left(\frac{x-a}{b-a},\, 1,\, \frac{d-x}{d-c}\right),\, 0\right)$$
where a, b, c, and d are the membership function parameters (Table 3). Examples for bottom chlorophyll-a and surface temperature are shown in Figure 2 and Figure 3.
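The fuzzification step can be sketched in Python as follows. The function `trapmf` implements the standard trapezoidal membership function with feet a and d and shoulders b and c; the fuzzy-set parameters in `TEMPERATURE_SETS` are purely illustrative and are not the values of Table 3.

```python
def trapmf(x, a, b, c, d):
    """Standard trapezoidal membership: 0 outside [a, d], 1 on [b, c]."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


# Hypothetical fuzzy sets for surface temperature in degrees C (not Table 3)
TEMPERATURE_SETS = {
    "Low": (14.0, 15.0, 17.0, 18.0),
    "Medium": (17.0, 18.0, 20.0, 21.0),
    "High": (20.0, 21.0, 23.0, 24.0),
}


def fuzzify(x, fuzzy_sets):
    """Map a crisp value to its membership degree in every fuzzy set."""
    return {label: trapmf(x, *params) for label, params in fuzzy_sets.items()}


degrees = fuzzify(17.5, TEMPERATURE_SETS)
```

A crisp value lying between two sets, such as 17.5 °C here, belongs partly to each of them, which is what allows several rules to fire at once.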
After fuzzification, the second step involves taking the fuzzified inputs and applying them to the antecedents in a series of constructed fuzzy rules. If a given fuzzy rule has multiple antecedents, then the fuzzy operator (in this case, AND) is used to obtain a single number that represents the result of the antecedent evaluation. This number (the truth value) is then applied to the consequent membership function. The IF-THEN fuzzy rules constructed according to the most frequent interrelation of input parameters observed in the training set have the following form. Rule 1: “If the water depth is Low, AND the water temperature at the sea bottom is Low, AND the chlorophyll-a concentration at the sea bottom is Low, AND the nitrates at the sea surface are Very Low, THEN the Seagrass Family favored in these conditions is Posidoniaceae”.
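The evaluation of Rule 1 can be sketched in Python as below, assuming that AND is taken as the minimum of the antecedent memberships (the usual Mamdani choice; the product is another common option, and the operator actually used is not stated). The membership degrees and variable names are hypothetical values chosen for the example.

```python
def rule_strength(memberships, antecedents):
    """Truth value of a rule: AND over antecedents, taken as the minimum."""
    return min(memberships[variable].get(label, 0.0)
               for variable, label in antecedents)


# Rule 1 from the text, encoded as (variable, fuzzy set) pairs
RULE_1 = {
    "if": [
        ("depth", "Low"),
        ("bottom_temperature", "Low"),
        ("bottom_chl_a", "Low"),
        ("surface_nitrates", "Very Low"),
    ],
    "then": "Posidoniaceae",
}

# Hypothetical fuzzified inputs for one location (illustrative values)
memberships = {
    "depth": {"Low": 0.8, "Medium": 0.2},
    "bottom_temperature": {"Low": 0.6, "Medium": 0.4},
    "bottom_chl_a": {"Low": 1.0},
    "surface_nitrates": {"Very Low": 0.7, "Low": 0.3},
}

strength = rule_strength(memberships, RULE_1["if"])
```

Here the rule fires with strength 0.6, the weakest of its four antecedents, and this value is then applied to the Posidoniaceae output set.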
A set of similar fuzzy rules may be developed and inserted in the FIS. When feeding the FIS with more data, that is, developing and adding more rules, the model provides more accurate estimates of the favored Seagrass Family presence. In our case, 45 total rules were used to associate the most frequently appearing independent antecedents in the training sets with a rule.
The third step involves the aggregation, a process that produces an overall output by considering the membership functions of all rule consequents and combining them into a single fuzzy set.
Finally, defuzzification is the process which leads to a final output as a crisp number. The input in the defuzzification process is the aggregate fuzzy set, and the output is a single number. There are several defuzzification methods, but the most popular one is the centroid technique [
20,
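21], which is followed here. The seagrass family output is produced through a triangular membership function, of the form:
$$\mu(z; a, b, c) = \max\left(\min\left(\frac{z-a}{b-a},\, \frac{c-z}{c-b}\right),\, 0\right)$$
where a and c are the feet and b is the peak of the triangle.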
The defuzzification membership functions of the FIS are discrete, resulting in a definite seagrass family. If the output falls in the interval [0, 0.25], then the seagrass family is Zosteraceae; if [0.25, 0.50], then the seagrass family is Hydrocharitaceae; if [0.50, 0.75], then it is Cymodoceaceae; and finally, in the range [0.75, 1.00], it is Posidoniaceae (
Figure 4).
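Aggregation, centroid defuzzification, and the mapping of the crisp output to a family can be sketched in Python as follows. Several choices here are assumptions not given in the text: each output triangle is taken to span exactly its interval, rule consequents are clipped at the rule strength (minimum implication), aggregation uses the maximum, the centroid is approximated on a regular grid, and boundary values are assigned to the lower interval.

```python
def trimf(z, a, b, c):
    """Standard triangular membership with feet a, c and peak b."""
    if z <= a or z >= c:
        return 0.0
    if z <= b:
        return (z - a) / (b - a)
    return (c - z) / (c - b)


# Assumed output triangles, one per interval given in the text
OUTPUT_SETS = {
    "Zosteraceae": (0.00, 0.125, 0.25),
    "Hydrocharitaceae": (0.25, 0.375, 0.50),
    "Cymodoceaceae": (0.50, 0.625, 0.75),
    "Posidoniaceae": (0.75, 0.875, 1.00),
}


def defuzzify(strengths, n_points=1001):
    """Centroid of the max-aggregated, strength-clipped output sets.

    strengths: dict mapping family name to its highest rule strength.
    Returns None when no rule fired.
    """
    numerator = denominator = 0.0
    for i in range(n_points):
        z = i / (n_points - 1)
        mu = max((min(s, trimf(z, *OUTPUT_SETS[family]))
                  for family, s in strengths.items()), default=0.0)
        numerator += z * mu
        denominator += mu
    return numerator / denominator if denominator > 0 else None


def family_from_score(score):
    """Map the crisp output to a family using the intervals in the text."""
    for upper, family in ((0.25, "Zosteraceae"),
                          (0.50, "Hydrocharitaceae"),
                          (0.75, "Cymodoceaceae")):
        if score <= upper:  # boundary assigned to the lower interval (assumed)
            return family
    return "Posidoniaceae"


score = defuzzify({"Posidoniaceae": 0.6})
family = family_from_score(score)
```

When rules for distant families fire together, the centroid can fall in an intermediate interval, which is one reason the text suggests examining alternative defuzzification methods.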
Several experimental Fuzzy Inference Systems were developed in this study, comprising different combinations of the four input parameters and keeping the one output parameter, that is, the seagrass family, constant. The FIS was developed using the Fuzzy Logic Toolbox operating under MATLAB 7.0 (The Mathworks, Inc., Natick, MA, USA).
2.4. Evaluation Metrics
The most commonly used measure to evaluate this type of FIS model is the Classification Precision of the system, being the fraction of relevant/correctly classified instances, in terms of seagrass family presence, over the validation dataset (10% or 58 records from the dataset).
Precision is a good metric for balanced datasets, like the case studied here for seagrass family presence [18], and is defined as:
$$\text{Precision} = \frac{TP}{TP + FP}$$
where TP is the number of true (correct) assessments, and FP is the number of incorrect assessments produced by the examined FIS.
Therefore, the model returns the Mediterranean seagrass family that best suits the input environmental parameters. When the forecasted family coincides with values in the UNEP dataset, this instance is considered a true-positive. In case of an incorrect assessment, a false-positive instance is considered. For example, in Rovinj, Croatia, the FIS correctly assessed that the environmental drivers favor Zosteraceae, but in a location at the Gulf of Gabes, Tunisia, the assessment was incorrect (Zosteraceae instead of Cymodoceaceae).
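The precision calculation described above can be sketched in Python as follows. The function name and the toy lists are assumptions for illustration; only the two Rovinj and Gulf of Gabes outcomes are taken from the text, and the remaining entries are invented to complete the example.

```python
def classification_precision(predicted, observed):
    """Precision = TP / (TP + FP) over paired family assessments.

    A prediction matching the observed family counts as a true positive;
    any mismatch counts as a false positive.
    """
    true_positives = sum(p == o for p, o in zip(predicted, observed))
    false_positives = len(predicted) - true_positives
    return true_positives / (true_positives + false_positives)


# Toy validation records (first two follow the examples in the text)
predicted = ["Zosteraceae", "Zosteraceae", "Posidoniaceae", "Cymodoceaceae"]
observed = ["Zosteraceae", "Cymodoceaceae", "Posidoniaceae", "Cymodoceaceae"]
precision = classification_precision(predicted, observed)
```

Because every validation record receives exactly one family assessment, TP + FP equals the number of validation records.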
2.5. Statistics
Aiming to detect possible statistical differences between the environmental conditions prevailing in sites with different seagrass families, a two-way ANOVA was performed. In the case of a significant difference between levels (p < 0.05), a Tukey’s post-hoc test was used to show which families differed.
2.6. FIS Sensitivity Analysis and Climate Change Scenarios
Sensitivity tests using the optimum FIS were conducted to simulate the impact of climate change and increased human pressure on seagrass communities under different scenarios, thus examining the level of seagrass tolerance and their progressive species replacement. Model runs incremented each parameter by −15%, −10%, −5%, +5%, +10%, and +15% of its initial value. Then, the optimum FIS was used to assess and predict the most appropriate seagrass family favoring these environmental conditions. The developed tests were “closed”, that is, seagrass families may change among them, but their disappearance is not allowed. This is a deficiency of the presently developed model that will be resolved in a future study.
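The one-at-a-time perturbation scheme can be sketched in Python as below. The function names, the input keys, and the `predict` callable standing in for the optimum FIS are hypothetical; the toy predictor in the usage example is not the model of this study.

```python
FACTORS = (-0.15, -0.10, -0.05, 0.05, 0.10, 0.15)


def perturbation_scenarios(baseline):
    """Yield (parameter, factor, scenario), varying one parameter at a time."""
    for name in baseline:
        for factor in FACTORS:
            scenario = dict(baseline)
            scenario[name] = baseline[name] * (1 + factor)
            yield name, factor, scenario


def family_changes(baseline, predict):
    """List the perturbations under which the predicted family changes.

    predict: callable mapping an input dict to a family name (stand-in
    for the optimum FIS).
    """
    reference = predict(baseline)
    changes = []
    for name, factor, scenario in perturbation_scenarios(baseline):
        family = predict(scenario)
        if family != reference:
            changes.append((name, factor, family))
    return changes


# Toy predictor and baseline (illustrative only)
def toy_predict(inputs):
    return "Posidoniaceae" if inputs["temperature"] < 19.0 else "Cymodoceaceae"


baseline = {"depth": 20.0, "temperature": 18.0}
changes = family_changes(baseline, toy_predict)
```

Since the predictor always returns one of the four families, the sketch shares the "closed" behaviour noted in the text: a family can be replaced by another, but never disappears.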
4. Discussion
In this work, a fuzzy logic model was developed using existing datasets for the Mediterranean basin to examine the main environmental parameters affecting the distribution of seagrass families. To our knowledge, fuzzy logic models of the Mamdani type have not previously been implemented for such a complex task. In a similar study, [
13] considered a series of morphodynamic, environmental, and human impact variables while employing Machine Learning algorithms to predict the presence–absence of
P. oceanica seagrass species. Their dataset, however, was limited and rather unbalanced, biasing absence records and affecting the model’s reliability. Results showed that
P. oceanica presence/absence patterns depended on nitrates and silicates, water depth, mean sea surface temperature, salinity, and distance to river mouths. Although chlorophyll-a concentrations were not included in the analysis, the correctly classified instances reached 78.3% using the random forest classifier.
A more comprehensive study was performed by [
14,
19] detecting seagrass presence/absence and distinguishing seagrass families in the Mediterranean through supervised learning methods. In these papers, chlorophyll-a and distance to the coast appeared more relevant in explaining seagrass presence/absence, while chlorophyll-a, salinity, distance to major cities, and nutrients were found as the main drivers for detecting the presence of different seagrass families.
The improvement of the present model can be explained by (a) the simplicity of using only four imported environmental predictors compared to 217 in [
14], thus achieving higher precision when minimizing the impact of irrelevant variables; (b) predictors were aggregated to their long-term means, compared to the use of mean-monthly variables representing limited time-intervals (e.g., in [
14] Chl-a for just December 2015 was used); and (c) the FIS does not operate like a “Black-box” ML model in which the lack of transparency and accountability may lead to poor interpretability. In fuzzy systems, rules are user-constructed according to data reasoning, they are flexible, and can be modified or receive different weights, making the model easier to interpret. The [
14,
19] model tested several machine-learning classifiers and achieved the highest precision of 44.4% on seagrass family classification with the random forest algorithm. The present FIS, although simpler, achieved a precision level of 76%.
This model consisted of four main drivers: water depth, bottom chlorophyll-a concentration, surface water temperature, and nitrates. Although [
14,
19] used a different environmental driver dataset, quite similar predictors were found (bathymetry and chlorophyll-a). Moreover, in this work, water temperature was found to play a major role in determining seagrass family presence, instead of salinity in [
14]. The fact that long-term climatological means were used as predictors in this work, compared to mean-monthly values imported in the machine learning model of [
14,
19] seems responsible for this relative weight change among variables. In the Mediterranean, the variability in water temperature is more important than salinity variability among the various seagrass locations, and this was captured by the FIS. The influence of water temperature on
P. oceanica growth was also highlighted in [
22], with limited presence of this species in the warmer Levant Sea. The impact of nutrients was found to be important in all similar works [
13,
14,
19], since food availability determines seagrass growth, distribution, and metabolism [
23].
Water depth is an important parameter to predict the location of seagrass families, since it indirectly expresses changes in seabed temperature, pressure, light availability, and wave disturbance [
24]. Of the above-listed factors, seabed light availability appears to be the most influential driver in determining the colonization depth of various seagrass species [
25], since it affects the photosynthetic activity. In the Mediterranean, Cymodoceaceae, mostly
C. nodosa, was found over a broad range of depths, from shallow waters to depths of 60 m in sheltered to semi-exposed coasts, while
P. oceanica was also present down to 50 m [
26]. In the present dataset, Cymodoceaceae and Zosteraceae were found to be shallow-water families (mean depth 17 m), while Posidoniaceae was found at moderate depths (mean ± sd: 32 ± 16 m), and Hydrocharitaceae in moderate to deep water (66 ± 28 m). Tukey’s post hoc test (
p < 0.05) showed that Posidoniaceae sites differed significantly in bathymetry from Zosteraceae and Cymodoceaceae habitats. Above-substrate biomass, leaf biomass, and shoot density of both species were found to decline from shallow to greater depths in the Spanish Mediterranean Sea [
27], probably due to changes in light intensity and seabed temperature, explaining that bathymetry is an important factor capable of distinguishing seagrass families, as shown by the present FIS.
Water temperature is also a significant driver for seagrass distribution, influencing growth rates, reproductive patterns, and enzymic and metabolic functions when found within a physiological optimum. Extreme heat may enhance mortality for a number of seagrass families [
28]. The optimum FIS clarified sea water temperature preferences by family in the Mediterranean: Posidoniaceae and
Zosteraceae favor cooler temperatures (16–18 °C), Cymodoceaceae are found in moderate (19.5–21.0 °C) to warm waters (18–21.3 °C), while Hydrocharitaceae grow in warmer areas (21–23 °C). Statistical tests (Tukey’s post hoc test,
p < 0.05) illustrated that all seagrass habitats were characterized as significantly different in terms of water temperature, except for Zosteraceae and Posidoniaceae. These findings are in agreement with the species distribution patterns over the Mediterranean Sea, with warm-water species mostly being along the North African coastline and cold-water species being along the northern shores of the Aegean, Ionian, Adriatic, and Tyrrhenian Seas [
13]. Temperature preferences may determine which species will be affected under rising temperature climate scenarios. Under such conditions, the decline of one species may lead to the replacement by another [
29], following the patterns defined by the FIS. In some examples of disturbance, no recovery was observed, although this case was not examined by the FIS.
Chlorophyll-a levels were important for seagrass family distribution. Each family prefers specific and different values of chlorophyll-a (
Figure 8). Posidoniaceae and Hydrocharitaceae favor oligotrophic systems; Zosteraceae is abundant in mesotrophic environments; while Cymodoceaceae is tolerant to a wide range of chlorophyll-a concentrations. According to [
14], chlorophyll-a levels in winter months (mostly in December) are the key parameter determining seagrass presence/absence and family identification. According to [
22],
P. oceanica meadows are absent in the vicinity of river mouths due to increased turbidity and chlorophyll-a levels. During eutrophication events, seagrass suffers from reduced light conditions, increased epiphyte growth, decline in dissolved oxygen, and anaerobic organic matter decomposition, leading ultimately to sulfide stress that few species can tolerate [
30].
P. oceanica is very sensitive to sulfide stress [
31], whereas temperate species growing on terrigenic sediments, such as
C. nodosa and
Z. marina, showed only a minor response to this stress [
32].
Nutrients, mostly in the form of nitrates, also represent a key environmental driver for seagrass species abundance, since food availability controls seagrass growth, distribution, and metabolism [
33]. Several nutrient sources, mostly rivers, outflow along the Mediterranean coastline, providing the appropriate nutrient levels to seagrass sustainability. In eutrophic waters, phytoplankton is favored at the expense of seagrass production, resulting in the loss of seagrass cover at high nutrient concentrations [
32]. The effect of seagrasses on nitrogen cycling and uptake kinetics is species-dependent, with some species exhibiting increased, and others reduced, uptake affinities and rates [
32]. This supports the finding that nitrogen concentrations may act as an environmental driver for seagrass species differentiation, as produced by the optimum FIS. Nitrogen assimilation by seagrasses requires the nitrate reduction to ammonium, a process taking place with higher rates in leaves than roots. This allows the easier utilization of dissolved inorganic ammonium as the “preferred” pool of nutrients, compared to the ammonium pools in sediments and pore water [
32]. Morphological leaf variations among seagrass families, like leaf width, length, number of leaves per shoot, and so forth support the variable dissolved nitrate and ammonium uptake rates from the benthic waters, in agreement with the selection of the FIS. The main difference from study [
14] is that the present FIS identified nitrates as an important factor compared to the machine-learning model that emphasized phosphates. Seagrass beds are considered significant phosphorus sinks, with variable, species-dependent rates of uptake and assimilation.
In terms of seagrass resilience to climate change impacts, the optimum FIS indicated that: (a) Zosteraceae is highly tolerant to changes in water depth and nutrient levels, as well as to small changes in chlorophyll-a levels, but extremely sensitive to temperature changes; (b) Hydrocharitaceae is resistant to increases in water temperature, chlorophyll-a, and nutrient levels, but sensitive to bathymetric changes; (c) Cymodoceaceae meadows remain relatively stable under mild sea temperature increases, replacing Posidoniaceae under strong sea water temperature rises; and (d) Posidoniaceae appears threatened by an increase in water temperature, favors oligotrophic waters, and is found at moderate water depths.
Since the proposed model (in terms of input variables) follows the “trial and error” approach, it could be further improved through extended testing. The inclusion of taxa disappearance, as an additional output state, is also another option, advancing the FIS and making the model more realistic. However, expert knowledge is needed for such advancement. Other tests could include finding the most accurate shape and boundaries of the membership functions and examining the most appropriate defuzzification method.
5. Conclusions
This work has developed a simple but novel self-learning expert-system application, based on Mamdani fuzzy logic to predict the occurrence of Mediterranean seagrass habitats at the family level, according to the environmental conditions prevailing in an area. The system was developed utilizing diverse databases, from UNEP-WCMC for seagrass distribution and CMEMS and EMODnet for environmental conditions at each seagrass site. The optimum model receives input values from four parameters, namely water depth, sea surface temperature, surface nitrates, and bottom chlorophyll-a, and identifies seagrass family distribution patterns with fair classification precision (~76%).
The present FIS is capable of describing favorable living conditions for seagrass families across the Mediterranean Sea. Posidoniaceae, and mostly P. oceanica, prefer cool, oligotrophic waters; Zosteraceae seabeds favor a wider temperature range (16–19.5 °C) and mesotrophic waters; Cymodoceaceae, and mostly C. nodosa, appear tolerant to a broad range of living conditions, from warm, oligotrophic to moderately warm, mesotrophic areas; while Hydrocharitaceae prefer the warmer, oligotrophic parts of the Mediterranean.
The FIS can also identify the impact of environmental change on seagrass habitats, such as that induced by climate change, illustrating that (a) Hydrocharitaceae and Cymodoceaceae are the most sensitive families to increases in water depth with sea level rise, while Posidoniaceae and Zosteraceae are tolerant to the climate change drivers examined; (b) Cymodoceaceae is the family with the highest tolerance to mild (+5%) increases in sea temperature; (c) Hydrocharitaceae exhibits tolerance to higher (+10–15%) increases in sea temperature; (d) Posidoniaceae and Zosteraceae are mostly affected by temperature rise at any level; and (e) Posidoniaceae exhibits higher tolerance to a decrease in mean water temperature.
Based on these model results, it is evident that tolerance toward disturbances, as well as growth and recolonization potentials, differ among seagrass species, and distribution patterns can now be predicted based on these taxonomic tolerances. Seagrass presence requires several environmental conditions to be satisfied, and quantifying the relative importance of these conditions is the primary challenge for predicting seagrass habitat suitability. Thus, the present FIS model could potentially be used by policy-makers in order to (a) aid the design of a conservation action plan for areas where seagrass species are present; (b) explore the environmental factors and improve those needed in areas where seagrasses are absent, thus supporting ecosystem restoration; and (c) establish baselines and targets for biodiversity assessments required for MSFD implementation.