1. Introduction
Shallow coastal benthic habitats are among the most productive and valuable ecosystems on Earth [1]. The particular hydrodynamic conditions of these environments are responsible for the highly active exchange of nutrients, sediments, and biota. Their location within the euphotic zone makes them an ideal place for the growth of macroalgae, which provide favorable settlement sites for benthic communities. Nearshore benthic habitats usually form complicated patterns, so determining their spatial extent is very important for ecosystem management and protection. Moreover, precise mapping of the seafloor substratum and geomorphology is a fundamental task for marine spatial planning, especially with respect to marine protected areas (MPAs) or European Union (EU) legislative frameworks (e.g., Water Framework Directive 2000/60/EC, Habitats Directive 92/43/EEC, and Marine Strategy Framework Directive 2008/56/EC). In this study, we identified and delineated the areas occupied by valuable habitats in the southern Baltic Sea. We evaluated our methods to obtain the most reliable maps of the studied area; this is one of the goals of the ECOMAP EU BONUS project, which promotes Baltic Sea environmental assessments by opto-acoustic remote sensing, mapping, and monitoring.
The remote sensing methods used for seafloor mapping take advantage of sound propagation in marine environments. Over the last few decades, hydroacoustic methods utilizing single-beam echosounders, side-scan sonars, and multibeam echosounders (MBES) have developed rapidly [2]. The rapid growth of statistical techniques in recent years has created great potential for precise mapping. Although global maps of the world’s oceans’ bathymetry based on gravity measurements are currently available, their low resolution makes them unusable for detailed analysis in fields such as benthic habitat mapping, mapping of sediments, and underwater archeology [3].
Recently developed multibeam echosounders have allowed researchers to acquire three types of information: bathymetric data, which after processing is equivalent to interpolated digital elevation model (DEM) data; the angular dependency of the backscatter intensity of the acoustic signals from the seafloor; and the volume backscatter intensity of the water column. The spatial resolution of multibeam echosounder data, especially as applied in shallow water environments, can be compared to high-resolution LiDAR remote sensing data [4]. Up to now, only small areas of the world’s oceans (much less than 15%) have been mapped with high-resolution bathymetry [5]. The availability of MBES backscatter data is even more limited. Seafloor acoustic reflectivity can be characterized as a measure of the acoustic energy returning from the seafloor, and it reflects the properties of the seafloor [6]. Backscatter is therefore among the most useful sources of information for creating categorical maps of the seabed. The backscatter of the water column is beyond the scope of this study.
Backscatter measurements from multibeam echosounders are not yet fully supervised and standardized [6]. For a better understanding of these phenomena, it is necessary to define the characteristic properties of backscatter intensity for particular benthic habitats in different areas. Seafloor substrata can be determined based on certain acquisition, processing, and interpretation techniques, which should be specified [7,8]. Considering the abovementioned objectives, we defined the following research hypotheses: (1) different properties of backscatter intensity will allow us to distinguish habitat types in the southern Baltic Sea (the Rowy area); (2) the use of two frequencies significantly increases the amount of information gathered that will be useful for the correct classification of seafloor habitats; and (3) image processing methods, together with the application of statistical and textural analysis, will allow us to develop semi-automatic workflows to recognize and determine benthic habitats in the southern Baltic Sea.
Although multibeam echosounders were originally designed for deep-water measurements, recent models are capable of performing hydroacoustic surveys in shallow areas. Consequently, an increasing amount of research is being conducted in coastal areas (e.g., [9,10,11,12]). Nevertheless, hydroacoustic measurements in shallow water require especially careful sensor calibration, proper survey design, and experience to obtain accurate geospatial data.
Maps of benthic habitats can be created from hydroacoustic measurements using three types of analyses: manual expert interpretation of bathymetry and backscatter maps, acoustic signal parametrization, and image processing [13]. Knowledge-based expert interpretation has many disadvantages, such as lack of objectivity, high time consumption, and lack of repeatability; therefore, it is less frequently used in modern applications. Signal processing methods are usually related to unsupervised methods of classification and often work on one type of data (bathymetry or backscatter) [14]. They include, for example, angular range analysis (ARA, e.g., [15]), texture analysis [16,17], spectral analysis [18], and neural network analysis (e.g., [19,20]). The image processing approach benefits from different kinds of classification (often supervised) and allows researchers to apply many geomorphometric attributes (e.g., [21,22]). The approach presented here is based on object-based image analysis of different acoustic products: backscatter and bathymetry combined in a relational database.
4. Discussion
The use of multi-frequency multibeam echosounder data is a promising new approach in the characterization of seabed habitats. Recent research confirms that the simultaneous analysis of many frequencies leads to a better understanding of seafloor properties [62]. Although we did not use a multibeam echosounder with multispectral mode (such as the R2Sonic 2026) for our measurements, we repeated the hydroacoustic surveys with different frequencies. Similar research has been presented in [62], where the surveys were repeated with three different frequencies: 200, 400, and 600 kHz. This approach helped us to make a detailed acoustic characterization of the seabed sediments. In our case, we performed hydroacoustic research at two frequencies: 150 kHz and 400 kHz. The additional information from the ground-truth data allowed us to define the distributions of the acoustic backscatter for all the classes of habitats, which differed depending on the frequency used. All the feature selection results confirmed that attributes of both frequencies were useful to explain the variability of the analyzed data.
The Boruta feature selection algorithm has been tested in the benthic habitat mapping literature a few times, giving promising results [22,63]. Our results confirmed the usefulness of this feature selection method in habitat mapping. We recommend that, if the algorithm is run on statistics gathered from object-based image analysis, the classification be performed with the same segmentation settings.
Our study confirmed that beyond the primary features, such as backscatter and bathymetry, some secondary features were also useful, such as slope, GLCM entropy, GLCM homogeneity, and the standard deviation of bathymetry. We suggest considering such attributes in future feature selection. The list of suggested secondary features is not exhaustive and may also include, for example, spatial autocorrelation [63]; hue, saturation, and intensity [64]; angular range analysis [65]; Q-values [66]; and maximum orbital velocity [64].
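As an illustration of two of the textural features named above, the following is a minimal sketch of GLCM entropy and GLCM homogeneity for a small grid of quantized grey levels. It is not the implementation used in this study: the function names, the single horizontal offset, the symmetric counting, and the natural logarithm are assumptions made for the example, and homogeneity is taken in its common form, the sum of p(i, j) / (1 + (i - j)^2).

```python
import math
from collections import Counter


def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalized grey-level co-occurrence matrix as a dict {(i, j): probability}.

    `image` is a list of rows of integer grey levels; (dx, dy) is the pixel offset.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                if symmetric:
                    counts[(j, i)] += 1  # count the pair in both directions
    total = sum(counts.values())
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return {pair: n / total for pair, n in counts.items()}


def glcm_entropy(p):
    """Entropy: high for heterogeneous textures, zero for a uniform patch."""
    return -sum(v * math.log(v) for v in p.values() if v > 0)


def glcm_homogeneity(p):
    """Homogeneity: weights co-occurrences by closeness to the GLCM diagonal."""
    return sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())


# Toy grids (not survey data): a uniform patch and a checkerboard.
flat = [[3, 3], [3, 3]]
checker = [[0, 1], [1, 0]]
print(glcm_entropy(glcm(flat)), glcm_homogeneity(glcm(flat)))
print(glcm_entropy(glcm(checker)), glcm_homogeneity(glcm(checker)))
```

In an object-based workflow, such values would typically be computed per image object rather than per whole raster, and the result depends on the grey-level quantization and the chosen offsets.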
The scale of multiresolution segmentation is a very important setting of OBIA, which has an impact on further analysis, including the results of the classification [45]. Up to now, at least a few benthic habitat mapping studies have included the application of different scales of multiresolution segmentation [46,49]. To estimate the parameter properly, we tested scales from 1 to 20 with a step of 1, similar to the approach in [49] but over a wider range of the parameter. The scale was chosen on the basis of the best accuracy assessment of the evaluated classification methods. Although the investigation of the dependency between the accuracy and the multiresolution segmentation scale used was not the aim of this study, we tested 80 sets of the OB segmentation–classification results (20 scales × 4 classifiers). Our tests confirmed that the choice of the multiresolution segmentation scale is imperfect, and that its incorrect determination may lead to poor results in the object-based classification. Future research should take a closer look at this phenomenon and investigate the changes in accuracy depending on the scale parameter of the multiresolution segmentation.
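The 20 × 4 test grid described above can be sketched as a simple exhaustive sweep. This is only an outline under stated assumptions: `evaluate` is a hypothetical user-supplied function standing in for the whole segmentation, classification, and accuracy assessment chain, the classifier names are placeholders, and the scoring function in the demonstration is invented and does not represent any result of this study.

```python
from itertools import product


def sweep_scales(scales, classifiers, evaluate):
    """Evaluate every (scale, classifier) pair and keep the best scale per classifier.

    `evaluate(scale, classifier)` must return an accuracy score; it is a
    placeholder for segmentation + classification + accuracy assessment.
    """
    results = {(s, c): evaluate(s, c) for s, c in product(scales, classifiers)}
    best = {}
    for (scale, clf), score in results.items():
        if clf not in best or score > best[clf][1]:
            best[clf] = (scale, score)
    return results, best


# Placeholder classifier names and a made-up scoring function, for illustration only.
CLASSIFIERS = ["clf_a", "clf_b", "clf_c", "clf_d"]
TOY_PEAK = {"clf_a": 4, "clf_b": 9, "clf_c": 12, "clf_d": 17}


def toy_evaluate(scale, clf):
    # Score falls off linearly with distance from an arbitrary peak scale.
    return 1.0 - abs(scale - TOY_PEAK[clf]) / 20.0


results, best = sweep_scales(range(1, 21), CLASSIFIERS, toy_evaluate)
print(len(results))      # 80 combinations (20 scales x 4 classifiers)
print(best["clf_b"][0])  # 9, the toy peak assigned to clf_b
```

Keeping the full `results` dictionary, rather than only the best entry, is what would allow the accuracy-versus-scale dependency to be examined later.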
In this study, we applied a robust object-based methodology to a relatively small test area, characterized by diverse habitat conditions with the occurrence of unique red algae. Considering the regional conditions, there are no areas with similar characteristics within the Polish coast of the southern Baltic Sea. It should be noted that in the marine habitat mapping literature, there have been studies based on similar or smaller spatial extents, such as 0.056 km² [48] or 0.39 km² [12]. Other methods of benthic habitat mapping based on object-based image analysis were previously applied in various environments and areas, from smaller areas [48] to slightly less diverse areas within the Polish coast of the southern Baltic Sea [49] to larger areas [67]. Therefore, we expect our methodology to be scalable.
In this study, we designed a ground-truth survey to be representative of all kinds of habitats. It is necessary to keep in mind that a set of samples that is too small can lead to a misleading accuracy result [68]. Some studies have presented results of seabed mapping after analysis of similarly small but representative numbers of ground-truth samples [10,17,42,49]. In any such case, there is a possibility of errors, the sources of which have been described in detail (e.g., [69]). Despite the relatively small number of samples, we used varied methods of sampling, including Van Veen grabs and ROV video inspections within all the sites. Thus, our ground-truth survey was designed to obtain rigorous and diverse knowledge of the analyzed area.
Considering the unit of analysis, the methods of classification can be divided into pixel-based (PB) and object-based (OB) methods. The utilization of ground-truth samples allowed for further division between unsupervised and supervised techniques. The Jenks natural breaks method has been applied in habitat mapping studies several times [48,70]. In comparison with similar research, we obtained poor accuracy using this classification in this study. In our pixel-based classifications, there were visible ‘salt and pepper’ effects caused by the noise of the input data, which was evident in comparison with the OB approaches [71]. The reason for the poor accuracy may be related to the overlapping distribution of the backscatter intensity for the habitat classes described in Section 3.1.
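To make the unsupervised step concrete, the following is a minimal sketch of natural breaks classification of one-dimensional values, written as the Fisher-Jenks dynamic program that minimizes the within-class sum of squared deviations. It is not the software used in this study; the function names and the toy values are assumptions. The sketch also shows why overlapping backscatter distributions are a problem for this method: the breaks depend only on the values themselves, so classes that overlap in intensity cannot be separated.

```python
from bisect import bisect_left


def jenks_breaks(values, n_classes):
    """Upper bounds of `n_classes` classes minimizing within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and the number of values")
    # Prefix sums give the squared deviation of any slice data[i:j] in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: best cost of splitting the first j values into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = cost[k - 1][i] + ssd(i, j)
                if candidate < cost[k][j]:
                    cost[k][j] = candidate
                    split[k][j] = i
    # Backtrack from the last class to recover each class's upper bound.
    bounds, j = [], n
    for k in range(n_classes, 0, -1):
        bounds.append(data[j - 1])
        j = split[k][j]
    return bounds[::-1]


def classify(value, bounds):
    """Index of the first class whose upper bound is >= value (clamped to the last class)."""
    return min(bisect_left(bounds, value), len(bounds) - 1)


toy = [1, 2, 3, 10, 11, 12, 50, 51, 52]  # toy values, not backscatter measurements
bounds = jenks_breaks(toy, 3)
print(bounds)                # [3, 12, 52]
print(classify(11, bounds))  # 1
```

The triple loop is quadratic in the number of values per class, so for full rasters one would normally classify a sample or a histogram rather than every pixel.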
Different approaches of machine learning or decision trees have been widely used in recent predictive habitat mapping (e.g., [12,22,46,48,67,72,73]). Such approaches belong to both PB and OB techniques and supervised classification methods, executing top-down strategies: “assemble first, predict later” [13]. The OB approach of supervised classifiers has been developed over the last few years in marine habitat mapping (e.g., [12,46,48,65,67]). Many of the aforementioned studies concerned evaluations of classification methods. In particular, the random forest method seems to be a promising method for the automatic classification of benthic habitats. For example, in [65], the RF method achieved an excellent result of 94% overall accuracy and a KIA of 90%. Results with 80% overall accuracy are common in marine habitat mapping when using the random forest classifier [12,22,67].
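For readers less familiar with the two agreement measures quoted above, the following minimal sketch computes overall accuracy and the kappa index of agreement (KIA, Cohen's kappa) from a confusion matrix. With observed agreement p_o (the share of samples on the diagonal) and chance agreement p_e (the sum over classes of row share times column share), kappa is (p_o - p_e) / (1 - p_e). The function names and the example matrix are illustrative assumptions, not data from this study.

```python
def overall_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def kappa(cm):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    n = sum(sum(row) for row in cm)
    size = len(cm)
    p_o = sum(cm[i][i] for i in range(size)) / n
    row_totals = [sum(cm[i]) for i in range(size)]
    col_totals = [sum(cm[i][j] for i in range(size)) for j in range(size)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Illustrative 2-class matrix: rows are reference classes, columns are predictions.
example = [[20, 5],
           [10, 15]]
print(round(overall_accuracy(example), 3))  # 0.7
print(round(kappa(example), 3))             # 0.4
```

Because kappa discounts chance agreement, it is always lower than overall accuracy for an imperfect classification, which is why both values are usually reported together.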
The KNN classifier has been applied much less often in marine habitat mapping studies; known examples include [46,48]. In these studies, the KNN classifier separated classes with an overall accuracy from 52% to 66%. Considering the KIA value (from 0.38 to 0.43), the performance of the KNN classifier in these studies can be described as fair to moderate [46]. In our study, we obtained better accuracy using this method, but possible sources of errors should be kept in mind (see Section 3.5). We recommend continuing to evaluate this method of classification in further habitat mapping studies.
The application of two frequencies of MBES measurements is very interesting from the viewpoint of marine habitat mapping. The acoustic responses of the habitats are dependent on the frequency; therefore, distinct frequencies may reveal different attributes. With two frequencies, we have a better chance of achieving habitat discrimination. One recent study has suggested that the combination of PB and OB methods can lead to a better separation of classes, resulting in better accuracy [12]. In the aforementioned study, the application of such an approach increased the overall accuracy by 5.1% and the kappa value by 0.06 (overall accuracy 83.6%, KIA 0.78). In comparison, the combination of two OB classifiers in our study allowed us to increase the overall accuracy by 7.1% and the KIA by 0.10. Both results suggest that the combination of the best classification outcomes might be useful and promising in future marine habitat mapping studies.
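One simple way to combine two classification outcomes is a per-class override rule, sketched below. This is an assumed rule chosen for illustration, not necessarily the procedure used in this study or in [12]: each object keeps the label from a primary classifier unless a secondary classifier assigns it to a class for which the secondary classifier is considered more reliable. The function name, the class labels, and the trusted set are all hypothetical.

```python
def combine_predictions(primary, secondary, trusted_secondary_classes):
    """Merge two label sequences for the same objects.

    The secondary label wins only when it belongs to the set of classes for
    which the secondary classifier is trusted; otherwise the primary label is kept.
    """
    if len(primary) != len(secondary):
        raise ValueError("both classifiers must label the same objects")
    return [
        s if s in trusted_secondary_classes else p
        for p, s in zip(primary, secondary)
    ]


# Hypothetical labels for four image objects.
primary_labels = ["A", "B", "C", "A"]
secondary_labels = ["A", "C", "C", "B"]
combined = combine_predictions(primary_labels, secondary_labels, {"C"})
print(combined)  # ['A', 'C', 'C', 'A']
```

In practice, the trusted set would be chosen from per-class accuracies on validation samples, and the combined map should be assessed on samples that were not used to choose it, to avoid an optimistic accuracy estimate.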
5. Conclusions
In this study, we developed a robust workflow for predictive habitat mapping based on multi-frequency, multibeam echosounder data. For the first time, we recognized and distinguished six nearshore habitats of the Rowy area in the southern Baltic Sea. The identified habitats included very rare seascapes for the Polish coast of the Baltic Sea, encompassing species of red algae and boulder sites colonized by Mytilus trossulus bivalves. Future research will be conducted using the same model of multibeam echosounder device but with an acoustically calibrated option regarding the backscatter strength. Therefore, the composition of the seafloor will be represented from a physical point of view, which will create new perspectives in benthic habitat mapping, such as the ability to track spatial changes of habitats over time [42].
An important part of our workflow was the feature extraction and selection. We extracted 70 secondary features of the bathymetric and backscatter data. They included either pixel-based statistics or object-based GLCM textures. Some features were calculated based on multiscale or object-based approaches. The Boruta feature selection algorithm allowed us to choose relevant attributes, which included the following (beyond bathymetry and backscatter): slope, GLCM entropy, GLCM homogeneity, and the standard deviation of bathymetry. Our results confirmed the usefulness of the Boruta feature selection method in habitat mapping. Proper feature selection helped us to discriminate habitat classes with similar distributions of backscatter intensity. However, the list of secondary features is not yet complete. We suggest expanding it with other attributes and a multiscale approach.
We tested different aspects of image processing, such as pixel-based and object-based image analysis, unsupervised and supervised methods of classification, and habitat mapping based on single-frequency and multi-frequency multibeam echosounder (MBES) datasets. Our results demonstrated the great usefulness of object-based image analysis and supervised classifiers, such as the random forest and k-nearest neighbors algorithms. Because, in our case, each classifier performed better with respect to specific classes of habitats, we took advantage of the best results and combined them, obtaining very good agreement: 93% overall accuracy and a kappa coefficient of 0.90. We applied such a combination based on two object-based results. In our study, the application of the multi-frequency MBES dataset with the proper selection of secondary features significantly increased the accuracy of the habitat maps with respect to the single-frequency results.
Our workflow leads us to offer some additional suggestions. We recommend taking a closer look at the scale of multiresolution segmentation in object-based marine habitat mapping studies. A particularly interesting topic is how accuracy changes with the scale parameter of the multiresolution segmentation. We also recommend evaluating the k-nearest neighbors method of classification in future habitat mapping studies.
The rapid development of the hydroacoustic industry will bring about greater availability of multi-frequency, multibeam echosounder data. Our predictive habitat mapping of shallow euphotic zones opens a new scientific perspective and provides relevant data for the management of natural resources.