1. Introduction
Coastal blue carbon ecosystems, such as mangroves, salt marshes, and seagrass meadows, play a crucial role in supplying environmental functions such as providing shelter for fisheries, performing water purification, enhancing soil stability, decreasing coastal erosion, supplementing nutrients, protecting coastlines [
1], and mitigating climate change by sequestering carbon from the atmosphere [
2]. They are regarded as the most efficient ecosystems for carbon storage but are also the most rapidly disappearing ecosystems worldwide. Among them, seagrass meadows serve as habitats in subtidal and intertidal zones and continue to decline owing to environmental [
3] and climate change [
4]; episodic events, including heatwaves [
5], storms [
6], and tsunamis [
7]; and anthropogenic activities [
8]. Monitoring of seagrass meadows is essential for understanding the mechanisms of their changes and promoting the social implementation of blue carbon policy [
9].
Extensive efforts have been made to monitor seagrass meadows through in situ surveys, including diving surveys [
10] and sampling-based surveys [
11], which are accurate but laborious, costly, and time-consuming. Passive remote sensing (Quickbird-2 [
12], IKONOS [
13], Landsat 5, and Landsat 7 [
14]) and active remote sensing (side-scan sonar [
15] and airborne Lidar [
16]) have also been widely used to monitor seagrass meadows; however, acquiring frequent, high-resolution images, which would help clarify the processes driving changes in seagrass meadows and support blue carbon management, can be costly, and thick cloud cover hinders the timely acquisition of satellite images. Recently, unmanned aerial vehicles (UAVs), also known as drones, have attracted attention for application in coastal ecosystem mapping [
17]. Although UAVs are restricted by maximum range and flight time as well as by regional regulations, their flexibility, including the ability to avoid the influence of cloud cover, allows them to acquire high-quality orthophotos of small areas at high resolution and high frequency [
18].
Collection and classification of seagrass images obtained using drones remain challenging. Duffy et al. [
19] compared unsupervised and object-based image analysis (OBIA) classification methods for intertidal seagrass orthophotos. Although these traditional machine learning methods yield high accuracy, their limited generalization ability leads to unsatisfactory results when a trained model is applied to a new data set [
20]. Deep learning methods have gained popularity for addressing the generalization problem owing to their ability to extract semantic features through convolution. For example, Yamakita et al. [
21] confirmed that a deep convolutional generative adversarial network showed better results than the fully convolutional network for classifying different objects, including seagrass, by using aerial and satellite images (QuickBird). However, the deep convolutional generative adversarial network may exhibit limitations, such as collapsing generators, and is highly sensitive to hyper-parameter selections [
22]. Moniruzzaman et al. [
23] demonstrated that a faster region-based convolutional neural network yielded good results for classifying underwater video images of seagrass. However, the faster region-based convolutional neural network approach depends on a very time-consuming algorithm for hypothesizing object locations [
24]. Recently, U-Net has been applied for drone image classification because it requires fewer images and has a simple stable structure. Hobley et al. [
25] and Jeon et al. [
26] applied U-Net to classify intertidal seagrass orthophotos. Although their results were accurate for intertidal waters, applying the approach to subtidal seagrass meadows remained challenging. U-Net is also unsuitable for classifying small seagrass objects in the image [
27]. The feature pyramid network (FPN) was first introduced for object detection. It includes a top-down pathway that generates layers at different resolutions, and classification in these layers helps classify small objects in the image [
28]. As no studies have reported the application of FPN to seagrass image classification, this study aimed to evaluate its performance compared with that of U-Net.
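As a rough illustration of the top-down pathway described above, the following Python sketch (using NumPy) merges feature maps from coarse to fine by upsampling and element-wise addition. It is a simplified sketch and not the network used in this study: the function names `upsample2x` and `top_down_pathway`, the array shapes, and the random inputs are all hypothetical, and the 1x1 lateral and 3x3 smoothing convolutions of the original FPN design are omitted.

```python
import numpy as np

def upsample2x(feature):
    """Nearest-neighbour 2x upsampling of a (channels, height, width) array."""
    return feature.repeat(2, axis=1).repeat(2, axis=2)

def top_down_pathway(laterals):
    """Merge lateral feature maps from the coarsest level to the finest.

    `laterals` is ordered fine -> coarse. Every map is assumed to have the
    same channel count, and each level is half the size of the previous one.
    """
    merged = [laterals[-1]]                      # start from the coarsest map
    for lateral in reversed(laterals[:-1]):
        # Add the upsampled coarser map to the next finer lateral map.
        merged.append(lateral + upsample2x(merged[-1]))
    return merged[::-1]                          # back to fine -> coarse order

# Hypothetical 4-channel maps at three scales (32x32, 16x16, 8x8).
rng = np.random.default_rng(0)
laterals = [rng.normal(size=(4, 32 // 2**i, 32 // 2**i)) for i in range(3)]
pyramid = top_down_pathway(laterals)
print([p.shape for p in pyramid])  # [(4, 32, 32), (4, 16, 16), (4, 8, 8)]
```

Each output level carries coarse semantic information at its own resolution, which is why predictions made on the finer levels can help with small objects.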
Image preprocessing, such as resolution adjustment, normalization, and the combination of input data sets, is essential for increasing classification accuracy [
29,
30], especially for submerged seagrass images affected by sun glint and wave-induced scattering. The noise induced by sun glint and wave-induced scattering in the original drone photos might lead to the loss of information on submerged target objects. Noise in images (i.e., glint, contrast, and compression) negatively affects the model's learning of image features [
31]. Image normalization can help alleviate the brightness differences attributable to changes in lighting conditions during a long mapping mission [
32]. Color calibration helps unify image features before the images are mosaicked [
33]; however, it may not be adequate for correcting brightness differences in subtidal and intertidal seagrass images. Hence, further image normalization after mosaicking is necessary, such as by using Gaussian blur [
34], which is a widely applied smoothing technique based on the Gaussian function. Depending on the number of classification classes and the amount of data in each class of input training data, the combination of different input data sets yields different classification accuracies [
35]. To our knowledge, no existing study has assessed the effect of these factors on the classification of subtidal and intertidal seagrass images, which we aimed to address in the present study.
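The text above does not spell out how the Gaussian blur enters the normalization. One common reading, shown here only as an assumption, is background flattening: the blurred image is taken as an estimate of the slowly varying brightness and subtracted from the original. The Python sketch below (NumPy only) illustrates this on a synthetic tile; the function names, the `sigma` value, and the test image are hypothetical, and `sigma` is not equivalent to the blur radius reported in this study.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian weights truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    weights = np.exp(-0.5 * (x / sigma) ** 2)
    return weights / weights.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur of a 2-D float array, using edge padding."""
    kernel = gaussian_kernel(sigma)
    pad = len(kernel) // 2
    padded = np.pad(image, pad, mode="edge")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def flatten_brightness(image, sigma):
    """Subtract the blurred (low-frequency) background, keeping the mean level."""
    return image - gaussian_blur(image, sigma) + image.mean()

def column_range(image):
    """Spread of the column means, used here as a crude brightness-trend measure."""
    means = image.mean(axis=0)
    return means.max() - means.min()

# Synthetic single-band tile: fine texture plus a left-to-right brightness ramp.
rng = np.random.default_rng(0)
texture = 0.05 * rng.standard_normal((64, 64))
ramp = np.linspace(0.0, 1.0, 64)[np.newaxis, :]
tile = texture + ramp
flat = flatten_brightness(tile, sigma=8.0)
print(column_range(tile), column_range(flat))
```

In this synthetic case the left-to-right trend is largely removed while the fine texture is retained; whether the same holds for real orthophotos depends on the blur scale relative to the size of the seagrass patches.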
The Futtsu tidal flat in Tokyo Bay, Japan, has been used as a study site for monitoring seagrass meadows [
36]. However, few studies have reported the identification of seasonal variations and the influence of meteorological and oceanic events, including typhoon impacts, using drone images. Typhoon Hagibis, one of the largest typhoons documented in Japan since 1977 [
37], hit Tokyo Bay in October 2019; however, no typhoon struck Tokyo Bay in 2020. Therefore, during 2020, the Futtsu tidal flat was a suitable study site for examining seagrass recovery after a typhoon.
The objective of the present study was to develop a novel automated method for mapping subtidal and intertidal seagrass meadows with high accuracy by applying FPN to drone orthophotos. First, we generated ground truth polygons of subtidal and intertidal seagrass meadows. Thereafter, we formulated an FPN application protocol for mapping and evaluated its performance compared to that of U-Net-based classification. An investigation on the influence of resolution, normalization, and a combination of data sets was also performed to enhance classification accuracy. Finally, seasonal variations in seagrass meadows were identified, and the influence of typhoons was assessed.
4. Discussion
This study was conducted to assess FPN performance for classifying subtidal and intertidal seagrass orthophotos by conducting various input data set preprocessing experiments and by setting hyper-parameters; U-Net processed with the same procedures was used for comparison. Seagrass is usually submerged in subtidal and intertidal areas; thus, sun glint and wave-induced scattering pose challenges. To overcome such problems, preprocessing of input data sets, including adjustment of spatial resolution, normalization, and the combination of input data sets, together with the selection of a suitable classification model, is essential. In our study, a resolution of 2 m, a Gaussian blur radius of 901 (701 for U-Net), and the four seasons data set combination were found to be optimal, and FPN was found to outperform U-Net in the classification of subtidal and intertidal seagrass meadows.
The adjusted resolution images helped achieve better classification results than the original resolution image, possibly because of the sun glint and scattering noise in the original image. Moreover, the computing time significantly increased with increasing image resolution. For optimal resolution images, the noise in images is averaged and merged with the surrounding pixels, reducing the learning time and increasing classification efficiency. This finding is consistent with that reported previously on image classification at different resolutions [
57]. Under this condition, higher altitude drone missions and resampling of images to a lower resolution could increase efficiency. The trained model was designed to classify seagrass objects of different sizes in the resampled image. Although high-resolution drone photos were intended to yield high-resolution seagrass mapping results automatically, we sacrificed resolution for automatic high-accuracy classification, which may have led to information loss. For example, patches smaller than 2 m on the ground were averaged with the surrounding objects and not identified when images were resampled to 2 m for higher classification accuracy. Although such patches may not be essential for large area estimation in typical seagrass meadows, if small patch classification is essential in some case studies, we recommend supplementing the FPN classification results with additional unsupervised classification for small patches. Application of an appropriate Gaussian blur yields higher classification accuracy. Gaussian blur may help further average out the surrounding pixels with noise (noise-induced misclassification was reduced) and alleviate the brightness difference, increasing the texture homogeneity of seagrass and sediments and facilitating model learning (misclassification in sediments decreased). An inappropriate blur radius may cause different objects to show the same texture information or leave noise unresolved in the orthophotos (see
Table 3).
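The loss of small patches on resampling, as described above, can be illustrated with a minimal Python sketch (NumPy only) that downsamples by averaging non-overlapping blocks. This is a sketch under stated assumptions rather than the resampling procedure used in this study: the function name `block_average`, the native ground sampling distance of 0.02 m per pixel, and the synthetic tile are hypothetical.

```python
import numpy as np

def block_average(image, factor):
    """Downsample a 2-D array by averaging non-overlapping factor x factor blocks."""
    h, w = image.shape
    h, w = h - h % factor, w - w % factor        # crop so that blocks tile exactly
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

native_gsd_m = 0.02                              # hypothetical native pixel size
factor = round(2.0 / native_gsd_m)               # 100 native pixels per 2 m pixel

# Synthetic 4 m x 4 m seagrass mask with one 0.5 m x 0.5 m patch (25 x 25 pixels).
tile = np.zeros((200, 200))
tile[10:35, 10:35] = 1.0

coarse = block_average(tile, factor)
print(coarse)  # the patch contributes only 6.25% of one 2 m pixel
```

Because the patch covers a small fraction of a single 2 m pixel, its signal is diluted by the surrounding background, which is consistent with the recommendation to handle small patches with a separate classification step.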
The classification accuracy was related to the input data set combinations. The input data set of the four seasons exhibited generalization ability and presented the most accurate classification results for the new data set. Although the three seasons data set achieved the same level of accuracy when applied to the new winter data set, because of the diversity of features, the four seasons data set should be selected, as seagrass shows different texture features in different seasons (
Figure 12). Image texture could be identified and interpreted by humans but not by traditional classification methods [
19]. However, the convolutions of the FPN extract the texture information automatically, and the training and classification of the classifier rely more on texture information than on other automatically extracted information [
58,
59]. Additionally, we recommend acquiring photos at different water levels, as this may also increase the contrast between texture features.
FPN was found to demonstrate better classification accuracy for the new data set than U-Net when optimal preprocessing was conducted. The lower feature map of FPN was designed for small object detection in the image; its multi-layer structure helped efficiently classify objects of different sizes in different layers generated from the resampled image [
28]. Additionally, the ResNet used in the FPN structure circumvents the gradient explosion, allowing FPN to learn the seagrass features effectively even in deeper networks and to yield better classification results [
60]. The poorer results of U-Net might be attributable to its learning of redundant features and its neglect of small objects [
27,
61]. Nevertheless, misclassified pixels remained in areas where a large brightness gradient appeared (e.g., sun glint, see
Figure 13C) or where the texture information in non-seagrass areas was similar to that in seagrass areas (
Figure 13D). Successful classification results are often obtained in areas with relatively homogeneous texture or those surrounded by marked boundaries (
Figure 13A,B). In addition, we found no significant difference in the classification results of seagrasses in the deep and shallow subtidal zones (
Figure 14A,C), as the trained model might have learned adequate features to achieve highly accurate seagrass classification in different water depths. Moreover, the classification results showed lower accuracy for
Z. japonica (
Figure 14B) than for
Z. marina (
Figure 14A), possibly because of the insufficient training data set for
Z. japonica. In addition, beds with a homogeneous texture yielded fewer seagrass misclassifications than beds with a heterogeneous texture (
Figure 14C–E). More data augmentation and preprocessing procedures in low-accuracy areas may help improve classification accuracy.
The seagrass area decreased significantly from summer to autumn and increased significantly from winter to summer. The information gap regarding seagrass recovery after the typhoon was also filled: the seagrass area recovered by 8%. The decrease in seagrass meadows may be related to the high sea surface temperature [
62] and rapid temperature fluctuation [
63]. The light intensity causes a seagrass net photosynthesis peak in spring, which promotes seagrass growth [
64]. In addition, water depth [
65], species [
66], and meadow size [
67] affect the variation, and the recovery process may be related to the stability of seagrass. To observe the differences in spatiotemporal variation and recovery among seagrass meadows, we selected three typical areas (areas 1, 2, and 3) according to the seagrass species, water depth, and meadow size (see
Figure 15). Area 1 was occupied by large meadows in the deep subtidal zone, where the seagrass distribution was stable; area 2 was occupied by small
Z. marina meadows in the shallow subtidal zone; area 3 contained small
Z. japonica meadows in the shallow subtidal zone, where significant variations were identified. For the three selected sites, the seagrass cover increased by 1% in area 1, 58% in area 2, and 30% in area 3 from winter 2020 to winter 2021. This result is consistent with that of a previous study showing that the asynchronous local dynamics of seagrass contribute to its stability [
38].
To apply the model for classification in a new environment, such as for blue carbon assessment, annual monitoring during the peak growing season of seagrass (early summer at the study site) is necessary, and our FPN model with the four seasons data set may be applicable to each new site without further tuning [
68]. However, if the classification accuracy in a new environment is lower than expected, we recommend implementing transfer learning based on our tuned FPN model or collecting a new data set during an appropriate season. Establishment of a new FPN model using only the newly collected one-season data set with the same preprocessing procedures as our model may also be applicable for annual monitoring purposes. The results in
Table 8 indicate that the trained FPN model based on a sufficient one season data set can be applied to the same season data set for annual monitoring with high classification accuracy.
Our trained model and proposed framework will facilitate classification of areas dominated by seagrass to obtain high-accuracy seagrass mapping results, which will support the observation of seasonal variations and variations induced by unexpected events (typhoons or tsunamis) in seagrass. In addition to seagrass vegetation classification, this framework may also be applicable for classifying algae, rocks, and sediments. The training data set can be collected in cloudy or sunny weather with oblique photogrammetry and then processed with the same preprocessing procedures to obtain the trained model for different targets. The collection frequency and time depend on whether the target varies in different seasons. In addition, the easily applied framework, including photo collection and classification procedures using FPN, can be helpful for local coastal management offices or NPOs, which have limited access to hyper-spectral or multi-spectral cameras for monitoring variations in benthic targets. The low cost of the equipment and lower computational resource consumption are further benefits of applying this framework in similar places.
5. Conclusions
We established an FPN-based classification method for drone photos of subtidal and intertidal seagrass meadows in the Futtsu tidal flat of Tokyo Bay, which represents the first application of FPN for submerged seagrass classification. During model development, we considered the spatial resolution, normalization preprocessing, and suitable combination of seasonal input data sets. Using the four seasons data set with a 2 m resolution and processing with a Gaussian blur radius of 901, the FPN model achieved the highest accuracy with an OA of 0.957, precision of 0.895, recall of 0.942, F1-score of 0.918, and IoU of 0.848, outperforming the conventional U-Net-based model. Our method also overcame the difficulty of classifying submerged seagrass meadows under the influence of wave-induced scattering and sun glint. As the model demonstrates a high generalization ability, it may be applicable to a new site without further tuning. The implementation of transfer learning or training of a new FPN model using an appropriate seasonal data set in a new site may be considered an option when the accuracy of the direct application of our FPN is insufficient. Thus, our model will contribute toward blue carbon assessment of local seagrass meadows. Our model and framework may facilitate seagrass classification in new areas. Applying the model to other submerged targets (e.g., algae, rocks, and sediments) may also be feasible.
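For readers who wish to relate the accuracy figures above to one another, the following Python sketch computes the metrics from pixel counts, assuming the standard binary definitions (which are not restated in this section). It also checks that the reported F1-score and IoU are consistent with the reported precision and recall, using the identity IoU = F1 / (2 - F1). The function name `segmentation_metrics` and the pixel counts in the example are hypothetical.

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Binary (seagrass vs. non-seagrass) metrics from pixel counts.

    Standard definitions are assumed: tp, fp, fn, and tn are the numbers of
    true-positive, false-positive, false-negative, and true-negative pixels.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "OA": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "F1": 2 * precision * recall / (precision + recall),
        "IoU": tp / (tp + fp + fn),
    }

# Consistency check on the reported values: F1 follows from precision and
# recall, and IoU follows from F1 via IoU = F1 / (2 - F1).
precision, recall = 0.895, 0.942
f1 = 2 * precision * recall / (precision + recall)
iou = f1 / (2 - f1)
print(round(f1, 3), round(iou, 3))  # 0.918 0.848

# Hypothetical pixel counts, for illustration only.
example = segmentation_metrics(tp=80, fp=20, fn=20, tn=880)
print(example)
```

The recomputed values match the reported F1-score of 0.918 and IoU of 0.848 to three decimal places.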
The classification results of seagrass meadows in the Futtsu tidal flat of Tokyo Bay revealed seasonal changes in the detailed spatial distribution of the meadows. The seagrass area recovered by 8% after Typhoon Hagibis (Typhoon No. 19) in 2019. This finding indicates that the proposed model is useful for understanding detailed spatiotemporal variations in seagrass meadows, which will help local management associations in assessing blue carbon and devising effective management strategies, particularly those associations that have limited access to hyper-spectral and multi-spectral equipment.