Article

Autonomous Extraction Technology for Aquaculture Ponds in Complex Geological Environments Based on Multispectral Feature Fusion of Medium-Resolution Remote Sensing Imagery

1 School of Geography, Liaoning Normal University, Dalian 116029, China
2 Liaoning Provincial Key Laboratory of Physical Geography and Geomatics, Liaoning Normal University, Dalian 116029, China
3 Key Research Base of Humanities and Social Sciences of Ministry of Education, Institute of Marine Sustainable Development, Liaoning Normal University, Dalian 116029, China
4 State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
5 College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100049, China
6 Collaborative Innovation Center of South China Sea Studies, Nanjing University, Nanjing 210093, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Remote Sens. 2024, 16(22), 4130; https://doi.org/10.3390/rs16224130
Submission received: 7 August 2024 / Revised: 31 October 2024 / Accepted: 1 November 2024 / Published: 5 November 2024

Abstract

Coastal aquaculture plays a crucial role in global food security and the economic development of coastal regions, but it also causes environmental degradation in coastal ecosystems. Therefore, the automation, accurate extraction, and monitoring of coastal aquaculture areas are crucial for the scientific management of coastal ecological zones. This study proposes a novel deep learning- and attention-based median adaptive fusion U-Net (MAFU-Net) procedure aimed at precisely extracting individually separable aquaculture ponds (ISAPs) from medium-resolution remote sensing imagery. Initially, this study analyzes the spectral differences between aquaculture ponds and interfering objects such as saltwater fields in four typical aquaculture areas along the coast of Liaoning Province, China. It innovatively introduces a difference index for saltwater field aquaculture zones (DIAS) and integrates this index as a new band into remote sensing imagery to increase the expressiveness of features. A median augmented adaptive fusion module (MEA-FM), which adaptively selects channel receptive fields at various scales, integrates the information between channels, and captures multiscale spatial information to achieve improved extraction accuracy, is subsequently designed. Experimental and comparative results reveal that the proposed MAFU-Net method achieves an F1 score of 90.67% and an intersection over union (IoU) of 83.93% on the CHN-LN4-ISAPS-9 dataset, outperforming advanced methods such as U-Net, DeepLabV3+, SegNet, PSPNet, SKNet, UPS-Net, and SegFormer. This study’s results provide accurate data support for the scientific management of aquaculture areas, and the proposed MAFU-Net method provides an effective method for semantic segmentation tasks based on medium-resolution remote sensing images.

1. Introduction

In recent years, the rapid development of aquaculture has attracted global attention, with the Food and Agriculture Organization of the United Nations emphasizing its contribution to the global food supply, particularly in developing countries [1,2,3,4]. This growth has created job opportunities and economic benefits; however, the swift expansion of aquaculture may also pose environmental risks, such as marine pollution, ecological damage, and eutrophication, disrupting the ecological balance in affected areas [5,6,7,8].
Moreover, the expansion of coastal aquaculture ponds, including shrimp farms, fish-farming areas, and shellfish cultivation zones, has intensified the competition for land resources. This expansion often results in the destruction of natural habitats, loss of biodiversity, and degradation of coastal ecological functions [9,10]. Different types of aquaculture have varying environmental impacts; for example, compared with traditional fish farming, shrimp farms may cause more significant habitat changes [11]. Moreover, the impact of the aquaculture pond scale on the climate cannot be disregarded. A large amount of aquaculture effluent may release greenhouse gases, thereby exacerbating global warming. Furthermore, large-scale farming could alter the surface water flow patterns, affect soil moisture and microclimates, and further influence local climates [12,13,14]. Therefore, to achieve ecological protection and sustainable resource utilization, the timely and accurate understanding of the distribution and scale of coastal aquaculture is essential for the harmonious development of the environment and economy [15].
Remote sensing technology has become an effective tool for the investigation of coastal ecosystems, largely because of its wide observation range [16,17] and rich imaging spectral information [18]. It provides detailed information across a broad wavelength range and possesses high imaging efficiency and the ability for long-term data collection and continuous monitoring [19]. Currently, methods for the extraction of coastal aquaculture ponds using remote sensing technology are generally divided into visual interpretation methods, methods that combine image features with traditional machine learning, and deep learning approaches. Visual interpretation is time-consuming and relies on expert knowledge, which hinders its application on a large scale [20]. Some researchers have obtained features from remote sensing images using the Google Earth Engine (GEE) platform [21] and combined these image features with traditional machine learning methods to extract aquaculture areas. For example, Hou et al. successfully extracted aquaculture ponds by combining shape and water quality parameters via a random forest approach, effectively mitigating the interference of features such as saltwater pans [22]. Xia et al. employed a multithreshold segmentation algorithm coupled with random forest to successfully map aquaculture ponds in Shanghai [23]. Hou et al. proposed a novel raft aquaculture index combined with a decision tree to extract coastal raft aquaculture areas in their study area [24]. Leveraging the GEE platform, Duan et al. developed an aquaculture pond extraction method by integrating spectral and morphological features acquired from Landsat-8 imagery [25]. Xu et al. combined spectral indices (the NDVI, NDWI, and NDBI) derived from multispectral imagery with textural features and a random forest classifier to map various aquaculture areas in the Pearl River Delta of Guangdong Province [26]. Liu et al. utilized Landsat-8 satellite imagery, integrating the fusion object-oriented NDWI, edge feature extraction methods, and edge detection algorithms. By combining visual interpretation with edge overlap, the accurate extraction of aquaculture areas was subsequently achieved [27]. Ottinger et al. proposed a large-scale coastal aquaculture area detection method based on time series derivation, water body threshold segmentation, edge enhancement, and topographic masking [28]. Although these methods have enabled progress, the results still depend on manual feature selection and expert knowledge, which limits their generalizability, and they do not utilize spatial information fully, potentially affecting their performance in complex environments. Therefore, the need for advanced techniques to increase the efficiency and accuracy of extraction is highlighted.
In recent years, the rapid development of deep learning technology has revolutionized feature extraction techniques, particularly in the field of semantic segmentation, where it has demonstrated immense potential in various domains, such as autonomous driving, medical image analysis, and remote sensing image analysis [29,30,31]. Researchers worldwide are actively exploring its application in the extraction of coastal aquaculture areas. For example, Cui et al. developed an automated aquaculture area extraction network based on a fully convolutional network (FCN), which could effectively identify aquaculture areas [32]. Lu et al. successfully extracted aquaculture areas from the northeastern coastal waters of Fujian Province by improving the traditional spatial pooling and upsampling structure of atrous spatial pyramid pooling (ASPP), effectively solving the “adhesion” problem [33]. Zeng et al. proposed an FCN that integrated the RCSA self-attention mechanism, enabling more accurate aquaculture pond edge detail identification by incorporating the NDWI [34]. Su et al. designed the Raft-Net model structure, which is capable of accurately extracting raft aquaculture details at various scales, even in turbid water environments [35]. Dang et al. developed the ResU-Net model structure, which used Sentinel-2, ALOS-DEM, and NOAA-DEM satellite images as inputs to successfully predict and map diverse wetland types in the northeastern coastal zone of Vietnam [36]. Gao et al. proposed a semantic segmentation model named D-ResUnet, which is based on Sentinel-1 images and is capable of the precise extraction of aquaculture areas [37]. Wang et al. proposed a coastal aquaculture pond extraction method that combines spectral and spatial morphological features, integrating water indices and spatial convolution to achieve large-scale aquaculture area extraction [38]. Moreover, numerous researchers have achieved significant accomplishments in accurately extracting aquaculture ponds on the basis of high-resolution remote sensing imagery. Cheng et al. proposed a novel semantic segmentation network named hybrid dilated convolution U-Net (HDCUNet) to address the limitations of traditional pixel-level classification methods, such as the misidentification of sediments and floating debris as aquaculture areas. This approach integrates U-Net with hybrid dilated convolution (HDC), effectively expanding the receptive field and optimizing the network’s global information acquisition process. By resolving the common “gridding” artefact problem, HDCUNet significantly reduces the misclassification of bottom sediments, enabling the accurate extraction of aquaculture ponds at various scales [39]. Fu et al. designed an innovative end-to-end hierarchical cascaded network (HCNet) specifically for the identification and delineation of mariculture areas in high-spatial-resolution (HSR) satellite imagery. This network effectively captures multiscale information and progressively refines the details of the target, yielding promising extraction results [40]. Liu et al. introduced and improved the deep learning-based richer convolutional feature (RCF) network and proposed the UPS-Net structure. UPS-Net effectively reduces the degree of merging among adjacent ponds when high-resolution imagery is processed [41]. Zhang et al. 
proposed the NSCT method, which combines a segmentation network with a nonsubsampled contourlet transform (NSCT) to leverage scattering and structural characteristics for the extraction of raft aquaculture areas [42]. Ai et al. incorporated a self-attention module into the DeepLabV3+ network to construct SAMAL-Net. This network demonstrated superior performance in aquaculture pond extraction tasks involving GF-2 imagery from the Jiaozhou Bay area of Qingdao, surpassing previous techniques [43]. Fu et al. designed a TCNet network structure based on high-spatial-resolution optical images by combining Transformers and CNNs, achieving the precise extraction of aquaculture areas within the marine ranching zone of Ningde City [44]. Deng et al. designed the CANet structure, which incorporated a multiscale superpixel segmentation optimization module to emphasize edge features, effectively improving the extraction accuracy achieved for aquaculture areas in GF-2 imagery [45].
Although the abovementioned methods for the extraction of aquaculture areas from high-resolution remote sensing imagery have been successful, their high costs limit their large-scale application. On the other hand, medium-resolution imagery is particularly valuable for applications requiring large-scale monitoring, as it provides sufficient detail to identify aquaculture ponds while covering extensive areas, thus effectively reducing the data costs. Considering the balance between cost-effectiveness and the scope of application, we consider that medium-resolution remote sensing imagery is the optimal data source for the identification of coastal aquaculture ponds [46,47,48]. However, methods based on medium-resolution imagery often result in the extraction of contiguous aquaculture areas rather than multiple individually separable aquaculture ponds (ISAPs) [49] because of the influence of the image resolution and interference with ground features. This challenge is particularly prominent in the coastal aquaculture areas around the Bohai Sea in China, where the spatial similarity between saltwater fields and ISAPs makes distinguishing them in medium-resolution images very difficult [50].
To address this issue, we propose the difference index for saltwater field aquaculture areas (DIAS), which is based on the spectral information of different ground features in the study area, using medium-resolution Sentinel-2 remote sensing imagery. We also design a median augmented adaptive fusion module (MEA-FM) and an adaptive attention U-shaped network (MAFU-Net). The combination of the DIAS index and the MAFU-Net enables the more effective feature extraction of ISAPs from medium-resolution imagery and significantly reduces the impact of interfering ground features, thus achieving the accurate extraction of ISAPs in complex environments. Our main contributions are as follows.
  • To address the interference of extraneous features such as saltwater fields within some aquaculture pond areas, we analyzed the spectral differences between saltwater pans and aquaculture ponds across different seasons. This analysis led to the development of the DIAS. By incorporating the DIAS as a new band in remote sensing imagery, we constructed the CHN-LN4-ISAPs-9 dataset, which encompasses ISAPs across four coastal aquaculture areas in Liaoning Province, China.
  • A MEA-FM was designed, which is capable of adaptively selecting channel receptive fields of various scales, thoroughly mixing information between channels, and capturing multiscale spatial information.
  • A novel adaptive attention U-Net (MAFU-Net) was proposed for the segmentation of independent aquaculture ponds from medium-resolution remote sensing imagery, achieving satisfactory results compared with traditional classic segmentation networks.

2. Materials

2.1. Study Area

The coastal region of Liaoning Province, China, which has geographical advantages and a wealth of marine resources, has become a prime location for aquaculture research. Owing to its diverse marine life, including fish, shellfish, and crustaceans, Liaoning offers exceptional conditions for this industry to thrive. Its strategic proximity to the Northeast Asian economic circle further amplifies its importance, granting it significant advantages in seafood exports and international trade. Research on the aquaculture practices implemented in Liaoning provides invaluable insights into the dynamics of global commerce.
This study selected four typical aquaculture pond areas along the coast of Liaoning Province as the research focus (as shown in Figure 1), including Qingduizi Bay in Zhuanghe City (123.16°–123.61°E, 39.41°–39.68°N); the area north of Mayado Island in Pulandian District, Dalian City (122.11°–122.63°E, 39.15°–39.46°N); the coastal area of Yingkou City in the eastern part of Liaodong Bay (122.02°–122.27°E, 40.33°–40.68°N); and Changshansi Bay in Huludao City (120.30°–120.52°E, 40.22°–40.42°N). These areas have complex terrains and cover various geographical features, such as low-lying pits, rivers, embankments, saltwater fields, and aquaculture ponds [46,51,52]. In particular, aquaculture ponds are present in large contiguous areas of various shapes and sizes and are separated by embankments. Some aquaculture ponds are mixed with similar geographical features, such as saltwater fields, and the close arrangement of the ponds makes the identification of individual aquaculture ponds quite challenging. Given the important position of the aquaculture industry in the economy of Liaoning Province and the significant contribution of the aquaculture zones along the coast of Liaoning Province to the international trade of China through the export of aquatic products, the accurate extraction of ISAPs is crucial for the scientific management of the aquatic resources contained in this province. The areas selected in this study covered all types of aquaculture ponds in Liaoning Province to verify the effectiveness of the proposed method and to provide solid data support for the scientific management of the aquaculture industry in Liaoning Province.

2.2. Data Used

This study utilizes Sentinel-2 satellite remote sensing imagery from 2022. This satellite system is part of the European Space Agency’s (ESA) Earth Observation program and operates with two satellites, Sentinel-2A and Sentinel-2B. The revisiting period for a single satellite is 10 days, whereas the alternating operation of the two satellites enables the complete coverage of the equatorial region every 5 days. The Sentinel-2 satellite system is renowned for its high spatial resolution of up to 10 m and its rich spectral band information, making it highly applicable, especially for the identification of aquaculture areas. Its high spatial resolution allows the imagery to capture more detailed surface features, providing abundant information on land cover structures. Additionally, the provided spectral bands cover the range from visible light to near-infrared light, enabling the effective differentiation of various land cover types. This offers reliable data support for the extraction of ISAPs.
The multispectral imaging instrument (MSI) onboard the Sentinel-2 satellite is capable of capturing image data across 13 spectral bands, covering the spectrum from visible light to near-infrared and shortwave infrared light. These bands and their central wavelengths are shown below.
  • Visible light bands: blue (B2)—490 nm, green (B3)—560 nm, and red (B4)—665 nm;
  • Near-infrared bands: NIR (B8)—842 nm;
  • Shortwave infrared bands: SWIR1 (B11)—1610 nm and SWIR2 (B12)—2190 nm;
  • Other bands: B1—443 nm, B5—705 nm, B6—740 nm, B7—783 nm, B8A—865 nm, B9—935 nm, and B10—1375 nm.
The combination of these bands provides rich spectral information for the identification and distinguishing of different types of land features. In this study, we pay particular attention to the blue (B2), green (B3), red (B4), near-infrared (NIR, B8), shortwave infrared 1 (SWIR1, B11), and shortwave infrared 2 (SWIR2, B12) bands, as they exhibit significant spectral feature differences that are crucial in differentiating key land features such as aquaculture ponds, saltwater fields, and embankments.
This study utilized the GEE cloud platform as a tool for the downloading and preprocessing of data images to obtain Sentinel-2 imagery for the study area. On the GEE platform, preprocessing steps for the Sentinel-2 imagery were carried out, including atmospheric correction, radiometric calibration, geographic alignment, and pixel value normalization. Since the subsequent spectral analysis required distinguishing between different features in the imagery, the downloaded images comprised all 13 bands of Sentinel-2. Additionally, considering that the textural characteristics of target features may vary across different seasons, we chose to generate images from the average values for the entire year from January to December, effectively mitigating the instability associated with single-season images. The image information for each area is presented in Table 1. The acquired imagery provides reliable data support for subsequent spectral analysis and dataset creation.
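As a rough illustration of the preprocessing workflow described above, the following sketch uses the Earth Engine Python API to build an annual mean Sentinel-2 composite for one study area; the collection ID and export call are standard GEE API usage, but the region rectangle, cloud-cover filter, and export parameters are placeholder assumptions rather than the authors' exact settings.

```python
# Illustrative sketch only: annual Sentinel-2 composite via the GEE Python API.
# The region coordinates and cloud-cover threshold are placeholder assumptions.
import ee

ee.Initialize()

# Approximate bounding box of the Yingkou study area (xmin, ymin, xmax, ymax).
region = ee.Geometry.Rectangle([122.02, 40.33, 122.27, 40.68])

composite = (
    ee.ImageCollection("COPERNICUS/S2_SR")          # Level-2A surface reflectance
    .filterBounds(region)
    .filterDate("2022-01-01", "2022-12-31")
    .filter(ee.Filter.lt("CLOUDY_PIXEL_PERCENTAGE", 20))
    .mean()                                         # annual mean, as described in the text
    .clip(region)
)

task = ee.batch.Export.image.toDrive(
    image=composite.toFloat(),
    description="s2_yingkou_2022_mean",
    scale=10,
    region=region,
    maxPixels=1e13,
)
task.start()
```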

3. Methodology

3.1. Difference Index for Saltwater Field Aquaculture Areas (DIAS)

Because aquaculture ponds, saltwater fields, and embankments have similar shapes and textural characteristics, excluding the interference of saltwater fields and embankments to accurately identify aquaculture ponds is a challenging task. To address this issue, Wang et al. [53] used a maximum likelihood classifier for supervised classification purposes, followed by visual interpretation to classify eighteen types of coastal land features, including saltwater fields. Sridhar et al. [50] utilized a normalized saltwater field index to differentiate saltwater fields from aquaculture ponds. Rajitha et al. [54] employed a hybrid land cover classification method to address the inability of the NDVI method to distinguish between saltwater fields and aquaculture ponds. These methods consider the impact of saltwater fields on aquaculture ponds and extract aquaculture areas at a large scale, but they generally cannot identify individually separable aquaculture ponds (ISAPs). To maximize the differentiation between aquaculture ponds and interfering land features, on the one hand, it is necessary to utilize spectral information to increase the expressive power of the feature area. Compared with ISAPs, saltwater fields have greater salinity levels in their water bodies, which results in a significant difference in their spectral characteristics, thus facilitating their differentiation. However, the spectral reflectance levels of turbid water bodies in ISAPs are somewhat similar to those of saltwater fields, which may lead to incorrect extraction results. On the other hand, distinguishing between the embankments of independent aquaculture ponds is necessary in addressing the issue of the “agglomeration” between aquaculture ponds in the extraction results. To solve the above problems, we analyzed the spectral reflectance of ISAPs, saltwater fields, and embankments at different bands and developed the DIAS.
To conduct an in-depth analysis of the spectral differences among aquaculture ponds, saltwater fields, and embankments, spectral characteristic analysis must be performed on imagery from different seasons. This is because the varying seasonal climatic conditions, vegetation growth states, and water characteristics significantly influence the spectral reflectance. By analyzing different seasons, we can gain a more comprehensive understanding of the spectral variation patterns of these three land cover types, which will aid in the development of more refined differential indices. The study area is located in the coastal aquaculture zone of Yingkou in eastern Liaodong Bay, China. This region experiences significant interference from saltwater fields, complicating the extraction of ISAPs. Figure 2 presents maps of the study area in Yingkou for January, April, July, and October. In each image, the red areas represent spectral testing samples from the saltwater fields (55,919 pixels), the green areas denote spectral testing samples from the ISAPs (74,346 pixels), and the blue areas indicate spectral testing samples from the embankments (28,339 pixels). Figure 3 displays the spectral analysis of the study area in Yingkou for January, April, July, and October. The spectral analysis of the test samples in this area was conducted over four different months, with the results revealing significant differences in multiple bands among the ISAPs, saltwater fields, and embankments. Noticeable spectral reflectance differences were present between the saltwater fields and the ISAPs in the blue band (B2 band), green band (B3 band), red band (B4 band), and NIR band (B8 band) across different seasons. First, the spectral analysis revealed that the spectral reflectance of the ISAPs in the green band was significantly greater than that in the red band, whereas the spectral reflectance of the saltwater fields in the green band was significantly smaller than that in the red band. Second, the spectral reflectance of the ISAPs in the blue band was greater than that in the NIR band, whereas the spectral reflectance of the saltwater fields in the blue band was smaller than that in the NIR band. Therefore, we could distinguish these two types of land features on the basis of the differences among the spectral reflectance values in these four bands. Regarding differentiation between embankments and ISAPs, the spectral analysis revealed significant differences between the spectral reflectance values of the ISAPs and embankments in the red band (B4 band), NIR band (B8 band), SWIR1 band (B11 band), and SWIR2 band (B12 band). In the same spectral analysis, the spectral reflectance of the embankments in the NIR band was greater than that in the red band, and that in the SWIR1 band was greater than that in the SWIR2 band, whereas the spectral reflectance values of the ISAPs in these four bands exhibited the opposite trend.
On this basis, we propose the difference index for saltwater field aquaculture areas (DIAS):
$$\mathrm{DIAS} = \frac{\rho_{Red} + \rho_{Green} - \rho_{Blue} - \rho_{NIR}}{\rho_{Red} + \rho_{Green} + \rho_{Blue} + \rho_{NIR}} + \frac{\rho_{NIR} - \rho_{SWIR1} - \rho_{SWIR2}}{\rho_{NIR} + \rho_{SWIR1} + \rho_{SWIR2}}$$
where $\rho_{Red}$, $\rho_{Green}$, $\rho_{Blue}$, $\rho_{NIR}$, $\rho_{SWIR1}$, and $\rho_{SWIR2}$ represent the reflectance values of the respective bands. Through extensive experiments, we set the threshold at 0.05. The experimental results are shown in Figure 4, which demonstrates that the saltwater field areas are effectively distinguished.
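For illustration, a minimal NumPy sketch of the DIAS computation and the 0.05 threshold reported above is shown below; the function and argument names are assumptions, and the small epsilon added to the denominators to avoid division by zero is an implementation detail not part of the published formula.

```python
import numpy as np

def dias(red, green, blue, nir, swir1, swir2, eps=1e-6):
    """Difference index for saltwater field aquaculture areas (DIAS).

    All inputs are reflectance arrays of identical shape (e.g., Sentinel-2
    B4, B3, B2, B8, B11, B12). The eps term avoids division by zero.
    """
    term1 = (red + green - blue - nir) / (red + green + blue + nir + eps)
    term2 = (nir - swir1 - swir2) / (nir + swir1 + swir2 + eps)
    return term1 + term2

def dias_mask(red, green, blue, nir, swir1, swir2, threshold=0.05):
    """Binarize the index at the 0.05 threshold used in this study."""
    return (dias(red, green, blue, nir, swir1, swir2) > threshold).astype(np.uint8)
```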

3.2. Dataset Construction

On the basis of the above analysis, the Sentinel-2 image data were filtered such that only bands B2, B3, B4, B8, SWIR1, and SWIR2 were retained. The bands with a spatial resolution of 20 m were then resampled to 10 m using bilinear interpolation.
To express medium-resolution remote sensing imagery better in terms of spectral and textural features, and to utilize the spectral information of each band fully to enhance the distinguishability of independent aquaculture ponds from the background and other interfering objects, this study employed an index calculation method. Research has shown that images fused with specific indices can achieve greater accuracy in target object extraction than original images [55]. This study applied binarization to the normalized difference water index (NDWI) [56], the enhanced water index (EWI) [57], and the DIAS proposed in this study. These three binary indices were added as new bands and combined with the processed Sentinel-2 satellite imagery bands (B2, B3, B4, B8, B11, and B12) to form a multiband image with nine channels. This image served as the subsequent network input to assist in the extraction of ISAPs.
We utilized high-resolution imagery acquired from Google Earth for visual interpretation purposes, dividing each image area into ISAP coverage areas and background areas on the basis of its ISAP distribution. Utilizing the ArcGIS 10.8 software, we segmented the ISAPs contained in each satellite image into single-channel image files and used them as semantic segmentation labels, with the background area marked as 0 and the aquaculture area marked as 1. We subsequently performed sliding cropping on the sample areas and labeled images, setting the step size to 128 pixels and generating 2398 pairs of 256 × 256 pixel images and labels. By implementing data augmentation techniques such as rotation, horizontal flipping, vertical flipping, and diagonal mirroring, the sample size was expanded to 9592 pairs of image blocks and label blocks. The samples were randomly divided into a training set and a validation set at an 8:2 ratio, with 7673 pairs of images used for training and 1919 pairs used as the validation set, thereby optimizing the performance and robustness of the constructed model.
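A condensed sketch of the dataset assembly described above is given below: the six reflectance bands are stacked with the three binarized index bands, and the result is tiled with a 128-pixel stride into 256 × 256 patches. The function and array names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def stack_nine_band_image(b2, b3, b4, b8, b11, b12, ndwi_bin, ewi_bin, dias_bin):
    """Stack six Sentinel-2 bands and three binary index bands into a
    nine-channel, channels-first image, as in the CHN-LN4-ISAPs-9 dataset."""
    return np.stack([b2, b3, b4, b8, b11, b12, ndwi_bin, ewi_bin, dias_bin], axis=0)

def sliding_tiles(image, label, tile=256, stride=128):
    """Cut a (C, H, W) image and (H, W) label into aligned 256x256 tiles with a
    128-pixel stride, mirroring the cropping scheme described in the text."""
    _, h, w = image.shape
    for top in range(0, h - tile + 1, stride):
        for left in range(0, w - tile + 1, stride):
            yield (image[:, top:top + tile, left:left + tile],
                   label[top:top + tile, left:left + tile])
```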

3.3. Adaptive Attention U-Shaped Network (MAFU-Net)

The designed adaptive attention U-Net (MAFU-Net) structure is shown in Figure 5. This architecture includes four downsampling layers, four skip connections, and four upsampling layers. To enhance the model’s performance in extracting independent single aquaculture ponds (ISAPs), we design a median enhancement adaptive fusion module (MEA-FM), as shown in Figure 6, and replace the convolutional layers in the traditional U-Net architecture with the MEA-FM. This module can adaptively select channel receptive fields of different scales, combine the information between channels, and obtain spatial information at multiple scales.

3.4. Median-Enhanced Adaptive Fusion Module (MEA-FM)

Our designed MEA-FM consists of four parts: an input obtained through convolutional layers with different kernel sizes, a channel self-attention mechanism, a channel shuffling mechanism, and a multiscale spatial attention mechanism. Once an image is input into the module, it is processed by three convolutional layers with different receptive fields to obtain three feature maps with different perception scopes. We selected 3 × 3 convolutions, 5 × 5 convolutions, and dilated 3 × 3 convolutions with a dilation rate of 3. The combination of these three convolutional layers allows the model to capture local details and extensive contextual information and to understand features at different scales. This approach helps the model to remain sensitive to details while also providing a global and structural understanding. The feature maps obtained from the three convolutional layers are represented as
$$P_3 = \mathrm{Conv}_{3\times3}(P_{input})$$
$$P_5 = \mathrm{Conv}_{5\times5}(P_{input})$$
$$P_{d3} = \mathrm{Conv}_{3\times3,\,d=3}(P_{input})$$
In the above formulas, $P_{input}$ represents the feature map input into the network, and $\mathrm{Conv}_{3\times3}$ and $\mathrm{Conv}_{5\times5}$ denote convolution operations with kernel sizes of 3 × 3 and 5 × 5, respectively. $\mathrm{Conv}_{3\times3,\,d=3}$ indicates a 3 × 3 convolution with a dilation rate of 3. $P_3$, $P_5$, and $P_{d3}$ represent the feature maps obtained from the convolutions implemented at the three different scales. The three feature maps are subsequently integrated and input into the channel attention module in the next stage.
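The three parallel branches can be written compactly in PyTorch as sketched below; the kernel sizes and dilation rate follow the description above, while the padding values are chosen here so that all branches preserve the spatial size (the paper does not state the padding explicitly).

```python
import torch
import torch.nn as nn

class MultiScaleBranches(nn.Module):
    """Three parallel convolutions: 3x3, 5x5, and 3x3 with dilation rate 3."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv3 = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(in_channels, out_channels, kernel_size=5, padding=2)
        self.conv_d3 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                                 padding=3, dilation=3)

    def forward(self, x):
        # Returns P3, P5, and Pd3 in the notation of the equations above.
        return self.conv3(x), self.conv5(x), self.conv_d3(x)
```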

3.4.1. Channel Self-Attention Module

In the previous stage, we obtained three feature maps at different scales. To enable the network to adaptively learn the target feature information, we developed a median-enhanced channel-adaptive module. Our research revealed that existing channel attention mechanisms typically use global average pooling (GAP) and global maximum pooling (GMP) to extract global statistical information from feature maps. However, these methods are insufficient in handling noise, especially when significant noise is contained in the input feature maps, which can affect the quality of the feature extraction process. Since we use medium-resolution remote sensing imagery, which is more susceptible to noise interference than high-resolution imagery, special attention needs to be given to appropriately treating and eliminating the noise contained in the imagery.
On the basis of the existing problems, to solve the noise issue while retaining the important feature information and increasing the robustness of the channel attention module, we incorporate a median pooling operation into this module. Combined with GAP and GMP, this strategy minimizes the interference caused by noise and allows for the acquisition of sufficient feature information. As shown in Figure 6, the combinations of the three feature maps were passed through GMP, GAP, and GMedP to compress the H and W dimensions of the feature maps into three new feature maps with sizes of 1 × 1. $P_{GAP}$, $P_{GMedP}$, and $P_{GMP}$ can be expressed as follows:
$$P_{GAP} = \mathrm{GAP}(P_3 \oplus P_5 \oplus P_{d3})$$
$$P_{GMedP} = \mathrm{GMedP}(P_3 \oplus P_5 \oplus P_{d3})$$
$$P_{GMP} = \mathrm{GMP}(P_3 \oplus P_5 \oplus P_{d3})$$
where $P_3$, $P_5$, and $P_{d3}$ represent the feature maps obtained via the convolutions at three scales, and $\oplus$ represents the elementwise addition operation. The three new feature maps $P_{GAP}$, $P_{GMedP}$, and $P_{GMP}$ are each subsequently input into a fully connected layer, followed by batch normalization and rectified linear unit (ReLU) activation, to produce new feature maps $P'_{GAP}$, $P'_{GMedP}$, and $P'_{GMP}$, which are represented as follows:
$$P'_{GAP} = \mathrm{ReLU}(\mathrm{BN}(W_{FC} \cdot P_{GAP}))$$
$$P'_{GMedP} = \mathrm{ReLU}(\mathrm{BN}(W_{FC} \cdot P_{GMedP}))$$
$$P'_{GMP} = \mathrm{ReLU}(\mathrm{BN}(W_{FC} \cdot P_{GMP}))$$
where $W_{FC}$ indicates a fully connected operation, BN represents batch normalization, and ReLU represents the activation function. After the three feature maps $P'_{GAP}$, $P'_{GMedP}$, and $P'_{GMP}$ are added in an elementwise manner and activated with the Softmax function, three channel attention maps are obtained. To adaptively select more representative features, the three channel attention maps are used to modulate the feature maps $P_3$, $P_5$, and $P_{d3}$ obtained from the convolutions implemented at the three scales. The modulated feature maps are represented as $P_{C3}$, $P_{C5}$, and $P_{Cd3}$ as follows:
$$P_{C3} = \mathrm{Softmax}(P'_{GAP} \oplus P'_{GMedP} \oplus P'_{GMP}) \otimes P_3$$
$$P_{C5} = \mathrm{Softmax}(P'_{GAP} \oplus P'_{GMedP} \oplus P'_{GMP}) \otimes P_5$$
$$P_{Cd3} = \mathrm{Softmax}(P'_{GAP} \oplus P'_{GMedP} \oplus P'_{GMP}) \otimes P_{d3}$$
where $\otimes$ denotes elementwise multiplication. Afterwards, the three feature maps $P_{C3}$, $P_{C5}$, and $P_{Cd3}$ are merged to serve as the input for the next stage.
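A minimal PyTorch sketch of this median-enhanced channel attention step is given below. Treating each pooling branch's fully connected layer as a 1 × 1 convolution, computing the global median over the flattened spatial dimensions, and applying the Softmax across the channel dimension are implementation assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class MedianEnhancedChannelAttention(nn.Module):
    """Channel attention driven by global average, median, and max pooling."""

    def __init__(self, channels):
        super().__init__()
        # One FC (1x1 conv) + BN + ReLU per pooling branch, mirroring the equations.
        self.fc = nn.ModuleList([
            nn.Sequential(nn.Conv2d(channels, channels, kernel_size=1),
                          nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
            for _ in range(3)
        ])

    def forward(self, p3, p5, pd3):
        s = p3 + p5 + pd3                                   # elementwise sum of the branches
        b, c, h, w = s.shape
        gap = s.mean(dim=(2, 3), keepdim=True)              # global average pooling
        gmp = s.amax(dim=(2, 3), keepdim=True)              # global max pooling
        gmedp = s.view(b, c, -1).median(dim=2).values.view(b, c, 1, 1)  # global median pooling

        att = self.fc[0](gap) + self.fc[1](gmedp) + self.fc[2](gmp)
        att = torch.softmax(att, dim=1)                     # channel attention weights

        # Modulate each branch with the attention map (PC3, PC5, PCd3).
        return p3 * att, p5 * att, pd3 * att
```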
Through the aforementioned channel self-attention module, it is possible to adaptively select more representative feature maps while reducing noise interference. However, a problem may be encountered: the information between channels might not be integrated fully. If the degree of information integration between channels is insufficient, it will limit the ability of the model to express the features and thus cause it to fail to fully leverage the effect of channel attention. Since our images have additional new bands composed of three indices, they are more sensitive to the information between channels. Therefore, whether the information between channels is sufficiently integrated is directly related to the quality of the experimental results. To address this issue, channel shuffling operations were introduced. The enhanced feature maps were divided into several groups (for example, three groups), each containing 1/3 of the total number of channels. Transposition operations were performed on these grouped feature maps to shuffle the order of the channels within each group. The shuffled feature maps were subsequently restored to their original shapes. This method can combine the information between different channels better, enhancing the ability to express features and thus improving the model performance. It can be mathematically expressed as follows:
$$P_C = W_{ChannelShuffle}(P_{C3} \oplus P_{C5} \oplus P_{Cd3})$$
where $P_C$ represents the input for the next stage and $W_{ChannelShuffle}(\cdot)$ denotes the channel shuffling operation. Assuming that the original input feature map has a size of $F \in \mathbb{R}^{C \times H \times W}$, the size of each channel group within the channel shuffling operation is $F' \in \mathbb{R}^{(C/G) \times H \times W}$, where $G$ is the number of groups. The three parts were subsequently combined to restore the size to that of the input data. Through the above operations, it is possible to focus better on the information between channels, thereby improving the model's accuracy.
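Channel shuffling as described above (grouping, transposing the group and channel axes, and restoring the original shape) can be sketched as follows; the default group count G = 3 follows the example in the text.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int = 3) -> torch.Tensor:
    """Shuffle channels across groups, as in the W_ChannelShuffle operation above."""
    b, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by the number of groups"
    # (B, G, C/G, H, W) -> swap the group and channel axes -> flatten back to (B, C, H, W)
    x = x.view(b, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(b, c, h, w)
```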

3.4.2. Multiscale Spatial Attention Module

Channel attention mechanisms focus on the categories of features, whereas spatial attention mechanisms focus on the locations of features. Since a single-scale convolutional kernel may not fully capture the multiscale information contained in a feature map, this limits the model’s performance when features possessing different scales and complexities are addressed. Multiscale feature extraction methods can better capture information at different scales and in different directions in a feature map, thereby enhancing the model’s performance in complex scenes. Multiscale contextual information helps to distinguish objects with different sizes and backgrounds. To further improve the extraction accuracy achieved for ISAPs, we use a multiscale spatial attention module to obtain sufficient spatial information. In the spatial attention submodule, we design a multiscale deep convolution approach. First, the input feature map is processed through a 5 × 5 convolutional layer to extract basic features. Second, these basic feature maps are passed through multiple branches of deep strip convolutions to further extract multiscale features. Last, these multiscale features are added in an elementwise manner, and a spatial attention map is generated through a 1 × 1 convolutional layer. The weighted feature map is elementwise multiplied with the spatial attention map to obtain the final output feature map.
This map can be expressed as
$$P'_C = \mathrm{Conv}_{5\times5}(P_C) \oplus \sum_{i=1}^{n} DW_i\left(\mathrm{Conv}_{5\times5}(P_C)\right)$$
where $P_C$ represents the feature map input into the spatial attention module, $DW_i$ denotes the $i$-th deep convolution branch implemented at a given scale, $n$ represents the number of deep convolution branches, $\oplus$ represents the elementwise addition operation, and $P'_C$ represents the result obtained after the elementwise addition of the multiscale feature maps. In each deep convolution branch, we use two deep directional strip-like convolutions instead of standard convolutional kernels because strip-like convolutions are more lightweight. Moreover, since ISAPs are strip-like rectangles, strip-like convolutions can better extract strip-like features, which could improve the extraction accuracy of the network to some extent. The integrated feature map was subsequently passed through a 1 × 1 convolutional layer to generate the final spatial attention map. This attention map was elementwise multiplied with the feature map produced after channel mixing to obtain the final output feature map $P_{output}$. The mathematical formula is as follows:
$$P_{output} = \mathrm{Conv}_{1\times1}(P'_C) \otimes P_C$$
where $P_{output}$ represents the output of the entire MEA-FM.
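The spatial attention stage can be sketched as below. The overall structure (a 5 × 5 base convolution, paired depthwise strip convolutions, elementwise summation, a 1 × 1 convolution, and multiplicative gating) follows the description above, but the strip-kernel lengths, the number of branches, and the use of a depthwise base convolution are assumptions, since these details are not listed in the text.

```python
import torch
import torch.nn as nn

class MultiScaleSpatialAttention(nn.Module):
    """Spatial attention built from depthwise strip convolutions at several scales."""

    def __init__(self, channels, strip_sizes=(7, 11, 21)):   # kernel lengths are assumed
        super().__init__()
        self.base = nn.Conv2d(channels, channels, kernel_size=5, padding=2)
        self.branches = nn.ModuleList()
        for k in strip_sizes:
            # A pair of depthwise strip convolutions (1xk followed by kx1).
            self.branches.append(nn.Sequential(
                nn.Conv2d(channels, channels, kernel_size=(1, k),
                          padding=(0, k // 2), groups=channels),
                nn.Conv2d(channels, channels, kernel_size=(k, 1),
                          padding=(k // 2, 0), groups=channels),
            ))
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, pc):
        base = self.base(pc)                        # Conv_{5x5}(P_C)
        fused = base
        for branch in self.branches:
            fused = fused + branch(base)            # elementwise sum of multiscale features
        attention = self.proj(fused)                # 1x1 convolution -> spatial attention map
        return attention * pc                       # P_output = Conv_{1x1}(P'_C) * P_C
```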

3.5. Loss Function

Since ISAP extraction is essentially a binary classification problem, we choose the binary cross-entropy (BCE) loss function [58], which measures the difference between the outputs of a model and the true labels, helping the model to learn the correct classification results. For each sample, if the true label is $y_i$ (which can be 0 or 1) and the model output is $\hat{y}_i$ (the predicted probability), the BCE loss is expressed as
$$\mathrm{BCE} = -y_i \log(\hat{y}_i) - (1 - y_i)\log(1 - \hat{y}_i)$$
If the true label is $y_i = 1$, the loss term is $-\log(\hat{y}_i)$, which encourages the model to bring the predicted probability closer to 1; if the true label is $y_i = 0$, the loss term is $-\log(1 - \hat{y}_i)$, which encourages the model to bring the predicted probability closer to 0. During the training process, by minimizing the BCE loss, the model parameters were optimized so that the predicted results of the model were closer to the true labels. This helps the model to perform learning and prediction better in binary classification tasks.
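In PyTorch, this loss corresponds directly to `nn.BCELoss` applied to sigmoid outputs, or `nn.BCEWithLogitsLoss` applied to raw logits; a short sketch with random placeholder tensors is given below.

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()   # combines the sigmoid with binary cross-entropy

logits = torch.randn(8, 1, 256, 256, requires_grad=True)        # raw outputs for 8 tiles
labels = torch.randint(0, 2, (8, 1, 256, 256)).float()          # 0 = background, 1 = pond

loss = criterion(logits, labels)
loss.backward()
```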

4. Experiments

4.1. Experimental Setup

To verify the effectiveness of the proposed ISAP extraction method, we conducted extensive experiments on the CHN-LN4-ISAPs-9 dataset. During the training phase, to increase the generalizability of the network, random data augmentation methods, including random scaling, random horizontal flipping, and colour space transformation, were used [59]. After several experiments, the initial learning rate was set to 0.0001, and adaptive moment estimation (Adam) was used as the optimizer, with a batch size of 8. The operating system was Windows 11, the CPU was a 12th Gen Intel(R) Core(TM) i7-12700K, the GPU was an NVIDIA GeForce RTX 3080 (NVIDIA, Santa Clara, CA, USA), the deep learning framework was PyTorch version 1.7.1, and the coding and running environment was Python 3.7.16.
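A condensed sketch of the reported training configuration (Adam, initial learning rate 0.0001, batch size 8) is shown below; the model, dataset, and epoch count are stand-ins for the authors' implementations, not reported details.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins for the real MAFU-Net and CHN-LN4-ISAPs-9 dataset (placeholders only).
model = nn.Conv2d(9, 1, kernel_size=3, padding=1)
dataset = TensorDataset(torch.randn(32, 9, 256, 256),
                        torch.randint(0, 2, (32, 1, 256, 256)).float())
train_loader = DataLoader(dataset, batch_size=8, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # reported initial learning rate
criterion = nn.BCEWithLogitsLoss()

model.train()
for epoch in range(100):                 # epoch count is a placeholder, not reported
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```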

4.2. Evaluation Metrics

In semantic segmentation tasks, deep learning networks utilize various evaluation metrics calculated from a confusion matrix. To assess the effectiveness of the model developed in this study, we employed four accuracy assessment metrics: precision, recall, the intersection over union (IoU), and the F1 score [60]. The formulas for these four assessment metrics are as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
$$\mathrm{IoU} = \frac{TP}{TP + FN + FP}$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
where true positives (TPs) refer to the number of positive instances correctly predicted as positive by the classification model; false positives (FPs) refer to the number of negative instances incorrectly predicted as positive by the classification model; false negatives (FNs) refer to the number of positive instances incorrectly predicted as negative by the classification model; and true negatives (TNs) refer to the number of negative instances correctly predicted as negative by the classification model.
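These four metrics can be computed directly from the pixel-level confusion matrix, as in the brief NumPy sketch below; the small epsilon added to the denominators to avoid division by zero is an implementation detail.

```python
import numpy as np

def segmentation_metrics(pred, target, eps=1e-7):
    """Precision, recall, IoU, and F1 from binary prediction and label masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.sum(pred & target)
    fp = np.sum(pred & ~target)
    fn = np.sum(~pred & target)

    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    iou = tp / (tp + fn + fp + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return precision, recall, iou, f1
```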

4.3. Ablation Experiments

To demonstrate the effectiveness of different network components and the proposed DIAS in ISAP extraction tasks, we conducted ablation experiments. First, using the same cropping method and sample labels, we created datasets with different numbers of bands from Sentinel-2 remote sensing images containing various bands. We produced three aquaculture pond datasets with different numbers of bands: the CHN-LN4-ISAPs-6 dataset (each image contained bands B2, B3, B4, B8, B11, and B12), the CHN-LN4-ISAPs-8 dataset (each image contained bands B2, B3, B4, B8, B11, and B12; the NDWI; and the EWI), and the CHN-LN4-ISAPs-9 dataset (each image contained bands B2, B3, B4, B8, B11, and B12; the NDWI; the EWI; and the DIAS). Using U-Net as the baseline network, we subsequently performed cross-validation on the three datasets (where U-Net1 represents the baseline U-Net; U-Net2 represents U-Net with the channel self-attention module, U-Net3 represents U-Net with the multiscale spatial attention module, and U-Net4 represents U-Net with MEA-FM, which is our designed network, MAFU-Net). To objectively demonstrate the effectiveness of the proposed method, each experiment was independently repeated 10 times, and the final results are reported as the average values of the 10 experiments. The experimental results are shown in Table 2.
Table 2 shows that when the channel attention module was added to the multiscale spatial attention module, slight improvements were achieved in terms of the four evaluation metrics on the three datasets. By incorporating our designed MEA-FM into the baseline U-Net, all four evaluation metrics were significantly enhanced, revealing the effectiveness of our designed module. This finding also confirms that adding new index bands to an image improves the resulting extraction accuracy. Images with the NDWI, EWI, and DIAS as new bands yielded the highest extraction accuracy under the same method, showing that the DIAS and the proposed method are effective.

4.4. Comparative Experiments

To evaluate the effectiveness of our proposed method, we selected seven relatively advanced deep learning methods for training and validation on the CHN-LN4-ISAPs-9 dataset. The methods used for comparison included U-Net [61], DeepLabV3+ [62], SegNet [63], PSPNet [64], SKNet [65], UPS-Net [66], and SegFormer [67]. To objectively reflect the effectiveness of the proposed method, each experiment was independently repeated 10 times, and the final results are reported as average values obtained across the 10 experiments.
Table 3 shows that when U-Net is reasonably adjusted in terms of its parameters, it can achieve good segmentation effects, confirming the superior performance of U-type networks in segmentation tasks. Although DeepLabV3+ achieved a recall of 88.67%, its other evaluation results were not ideal, indicating that the DeepLabV3+ network is substantially affected by the background and interfering objects in complex environments, making the extraction of ISAPs difficult. The SegFormer network yielded lower results with respect to all four metrics, indicating that it could not effectively complete the extraction task on a small dataset. Although SegNet is relatively lightweight and retains boundary information better, its generalizability is weak, and it has difficulty in addressing complex scenes. Owing to its selective kernel convolution, the SKNet model has improved recognition abilities for targets of different scales, with its IoU and F1 score reaching 80.67% and 87.97%, respectively; however, these values are lower than those of our MAFU-Net model by 3.26% and 2.7%, respectively. PSPNet has an excellent pyramid merging module and, in terms of the evaluation metrics, performs better than the other CNN networks; however, its IoU and F1 score are 2.38% and 2.13% lower than those of the MAFU-Net model.
The MAFU-Net method proposed in this study can obtain multiscale contextual information and fully utilize the information between channels. Compared with the other methods, this method achieves the best performance in terms of the four evaluation metrics and adapts well to complex environments to complete the ISAP extraction task. Figure 7 shows the segmentation results produced by various neural networks on the CHN-LN4-ISAPs-9 dataset.
Figure 7a–h depict images of different areas randomly selected from the CHN-LN4-ISAPs-9 dataset, each with a size of 256 × 256 pixels and a resolution of 10 m. In study area a, U-Net, DeepLabV3+, and SegFormer exhibited more severe adhesion phenomena, whereas the other networks could identify the embankments between the ISAPs better, reducing the frequency of adhesion and clarifying the boundaries. In study areas b, d, and e, due to the interference caused by water bodies and sediment, U-Net, DeepLabV3+, SegNet, SKNet, UPS-Net, and SegFormer all presented varying false detection rates. In comparison, PSPNet and our proposed MAFU-Net were less affected, indicating the better robustness of these models. In study areas a, c, g, and h, which included land features such as saltwater fields and fallow aquaculture ponds with features similar to those of the target ISAPs, under the complex background information, U-Net, DeepLabV3+, SegFormer, SegNet, and UPS-Net were greatly affected and were easily influenced by interfering objects, making the accurate extraction of the target objects difficult. In contrast, PSPNet, SKNet, and MAFU-Net were less affected, with MAFU-Net showing the best extraction effect, distinguishing interfering objects from the target objects better and handling the ISAP extraction task under complex backgrounds effectively. In study area f, which included a range of interfering objects with individual aquaculture ponds, several networks had varying FN rates, and their boundaries were not sufficiently distinct. In comparison, MAFU-Net could pay more attention to the information between channels and focus more on spectral features, resulting in a lower FN rate than the other networks.
Through the analysis of the network model evaluation indicators and experimental results, MAFU-Net is demonstrated to be superior to the other networks in terms of extracting independent aquaculture ponds from medium-resolution remote sensing imagery. By obtaining multiscale contextual information and fully utilizing the information between channels, it greatly reduces the frequencies of misidentification and FNs in complex backgrounds and focuses on distinct boundaries to distinguish independent aquaculture ponds and embankments between ponds. Therefore, MAFU-Net can complete the ISAP extraction task in medium-resolution remote sensing imagery effectively, aiding in the effective extraction of ISAPs and more accurate monitoring of aquaculture areas.

5. Discussion

5.1. Different Band Combinations and Their Impacts on the Experimental Results

To construct the dataset, we selected only the B2, B3, B4, B8, B11, and B12 bands, with the NDWI, EWI, and DIAS index bands from Sentinel-2 imagery, as inputs for the MAFU-Net model. Previous experiments have demonstrated that the integration of specific index bands can increase the accuracy of object extraction from target features and have validated the effectiveness of the DIAS. In this section, we explore various band combinations based on Sentinel-2 imagery as inputs for the MAF-UNet model to demonstrate the rationale behind our band selection for the dataset. We established six experimental groups.
  • Group 1 includes the B2, B3, and B4 bands (visible light) from Sentinel-2 imagery.
  • Group 2 includes the B1, B2, B3, B4, B5, B6, B7, B8, B8a, B9, B10, B11, and B12 bands from Sentinel-2 imagery.
  • Group 3 includes the B2, B3, B4, B8, B11, and B12 bands from Sentinel-2 imagery.
  • Group 4 includes the B2, B3, and B4 bands from Sentinel-2 imagery, as well as the NDWI, EWI, and DIAS index bands.
  • Group 5 includes the B1, B2, B3, B4, B5, B6, B7, B8, B8A, B9, B10, B11, and B12 bands from Sentinel-2 imagery, as well as the NDWI, EWI, and DIAS index bands.
  • Group 6 includes the B2, B3, B4, B8, B11, and B12 bands from Sentinel-2 imagery, with the NDWI, EWI, and DIAS index bands.
The six sets of data were used as inputs for the MAFU-Net model, and the experimental results are shown in Table 4.
The six groups of experimental results indicate that, compared with the first three groups of experiments, the last three groups of data show improved performance across all metrics after three index bands are incorporated. This finding demonstrates that adding the NDWI, EWI, and DIAS index bands can effectively increase the accuracy of the model. Group 6 achieved the best performance across all metrics. Although Group 5 included more bands, its performance did not surpass that of Group 4 and Group 6. This finding suggests that simply adding bands does not necessarily improve the model’s performance; notably, some lower-resolution bands may introduce data redundancy and negatively affect the extraction accuracy. Group 6, which selected key bands (B2, B3, B4, B8, B11, and B12) with the three index bands, provided sufficient feature information and improved the performance of the MAFU-Net model. Therefore, the experimental results indicate that the rational combination of band selection and index bands can improve the ability of the model to extract ISAP feature information. In the future, we will continue to explore different combinations of bands for various types of remote sensing images to further enhance the overall model performance.

5.2. Application of the MAFU-Net Model

To further validate the universality and generalization ability of the MAFU-Net model proposed in this study, we selected four representative areas as validation zones, as shown in Figure 8: Zhangxia Bay, Pulandian Bay, Taiping Bay in Dalian, Liaoning Province, and the coastal aquaculture area in Southern Jinzhou. These areas feature complex backgrounds, significant interference from various objects, and a wide distribution of aquaculture ponds, posing substantial challenges for the accurate extraction of ISAPs. These ISAPs not only provide rich aquatic products for Liaoning Province but also promote the sustainable development of the regional economy through advanced aquaculture technologies and management practices, effectively meeting the growing demand for aquatic products.
To accurately extract ISAPs within the test area, we preprocessed the Sentinel-2 imagery for the validation zones using the GEE cloud platform via the preprocessing methods mentioned in Section 2. We selected bands B2, B3, B4, B8, B11, and B12 and added three additional index bands: the NDWI, EWI, and DIAS. The processed images were subsequently input into the MAFU-Net model, and the extraction results are shown in Figure 8. The results indicate that the MAFU-Net model significantly reduces common errors such as “boundary adhesion” and “misidentification” and that it exhibits lower sensitivity to interference object features in complex environments, effectively extracting ISAPs.
The MAFU-Net model proposed in this study is not only suitable for the extraction of ISAPs; the MEA-FM module also possesses multiscale feature acquisition capabilities and median information enhancement functions. This reduces noise interference and effectively merges the information between channels. As a result, this module can serve as a plug-and-play attention module in the field of deep learning neural networks, demonstrating excellent performance when handling semantic segmentation tasks with medium-resolution imagery.

5.3. Limitations of the Method and Future Directions for Improvement

Although this study has achieved positive results, some challenges remain. On the one hand, due to the influence of the shooting angles and seasonal variations, certain ISAP areas in remote sensing images may exhibit brightness anomalies. Additionally, we need to consider the differences in the characteristics of ISAPs during fallow and non-fallow periods [68], as these differences are related to the local climate, aquaculture species, and management practices. In general, most ISAPs choose to fallow from November to March of the following year to allow the pond water environment to self-recover, reduce the incidence of diseases, and improve the yield and quality of the next cultivation cycle. During this period, the temperatures are low, leading to the slow growth of cultured organisms, and it represents a critical stage for the ecological restoration of water bodies. The non-fallow period typically occurs from April to October and is characterized by suitable temperatures and vigorous biological growth. To manage aquaculture areas more accurately, we need to identify and extract ISAPs during the non-fallow period. To address these issues, we plan to use remote sensing images from different periods in the same region as inputs for the MAFU-Net model in the future. Data from various time points could provide rich background and feature information, helping the model to better identify and extract target features. Furthermore, we will explore the use of multitemporal data to further enhance the learning capabilities of the model, enabling it to generalize better when facing new data.
On the other hand, since this study was focused on the Bohai Rim region of China, the proposed method may not be entirely applicable to coastal aquaculture ponds in Southern China. The research indicates that the background of ISAP areas in Southern China is more complex. Unlike the saltwater fields and other interfering features found in the Bohai Rim region, Southern China also has interfering features with spectral characteristics similar to ISAPs, as well as rectangular grasslands that have similar shapes but different properties. These interfering features limit the applicability of the proposed DIAS index in Southern China. To address this shortcoming, we are considering introducing new index bands (such as the NDVI [69] and EVI [70]) in the original images on the basis of different interfering features to achieve better extraction results.
In future research, we will further improve the model and attempt to incorporate multisource remote sensing data as input to comprehensively capture information about land features. Moreover, we will explore the proposal of new differential indices to address the challenges of ISAP extraction in southern coastal regions. Additionally, we will conduct the dynamic monitoring of large-scale ISAP areas via multitemporal remote sensing imagery, which will allow us to promptly capture changes in water bodies and ISAP areas, providing accurate dynamic information to help to assess ongoing environmental changes. Moreover, through the analysis of the extraction results from remote sensing data at different times, we will identify long-term change trends in ISAP areas and their impacts on the surrounding ecological environments. This research will provide effective and accurate data support for fishery regulatory agencies, assisting relevant organizations in more scientifically managing coastal ISAP regions.

6. Conclusions

In this study, we addressed the challenge of extracting ISAPs from medium-resolution remote sensing images. On the basis of the different spectral characteristics of land cover within the study area, we proposed the DIAS index to differentiate ISAPs from interfering objects and introduced a new deep learning model, MAFU-Net. Unlike existing methods, we combined bands B2, B3, B4, B8, B11, and B12 from Sentinel-2 imagery with the NDWI, EWI, and DIAS spectral indices to construct the CHN-LN4-ISAPs-9 dataset. A dataset created in this way enhances the land features of ISAPs and provides more comprehensive and richer feature information. MAFU-Net improves the traditional U-Net by replacing its convolutional layers with the MEA-FM module, which effectively captures multiscale information and interchannel information through a median-enhanced self-attention mechanism and a channel mixing mechanism, thereby strengthening the ability of the model to extract ISAP features. When applied to ISAP identification in four typical areas of the Bohai Sea region, the model achieved excellent results. In comparative experiments against classic semantic segmentation models (U-Net, DeepLabV3+, SegNet, PSPNet, SKNet, UPS-Net, and SegFormer), MAFU-Net achieved superior performance, with a precision of 89.29%, a recall of 91.03%, an IoU of 83.93%, and an F1 score of 90.67%. Furthermore, we explored combinations of original image bands and demonstrated the rationality of the band selection for the dataset. Ablation experiments showed that incorporating the DIAS index band into the original image effectively improves the extraction accuracy of the model. These findings confirm that the proposed method exploits both land feature information and spectral information, allowing the rapid and precise extraction of ISAPs in complex environments, and they provide a foundation for future work on the automatic extraction of information from remote sensing images.
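To make the channel layout of the CHN-LN4-ISAPs-9 samples concrete, the sketch below assembles one nine-channel input patch. It is illustrative only: the NDWI, EWI, and DIAS rasters are assumed to have been computed as defined earlier in the paper, and the random arrays, patch size, and channel ordering are placeholders rather than the published preprocessing code.

```python
import numpy as np

h, w = 256, 256
# Six Sentinel-2 reflectance bands (B2, B3, B4, B8, B11, B12) resampled to a common grid.
spectral_bands = np.random.rand(6, h, w).astype(np.float32)
# Precomputed index rasters (placeholders here); their formulas are given in the paper.
ndwi = np.random.rand(h, w).astype(np.float32)
ewi = np.random.rand(h, w).astype(np.float32)
dias = np.random.rand(h, w).astype(np.float32)

# One possible ordering of the nine input channels for a training sample.
sample = np.concatenate([spectral_bands, np.stack([ndwi, ewi, dias], axis=0)], axis=0)
assert sample.shape == (9, h, w)
```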
The MAFU-Net model introduced in this paper achieved the precise extraction of ISAPs in the Bohai Sea study area, but further research is needed. Future studies will explore different data combination methods to obtain higher-quality data samples and further enhance model performance. Our goal is to extend MAFU-Net to larger-scale ISAP areas and to integrate multisource remote sensing images with long time series of satellite imagery to analyze the spatiotemporal evolution of these regions. This study provides technical support for the scientific management of coastal aquaculture areas.

Author Contributions

Conceptualization, Z.L., F.W. and Y.Z.; methodology, Z.L. and Y.Z.; validation, Z.L., F.W. and Y.Z.; software, Z.L., F.W. and Y.Z.; investigation, Z.L., F.X. and Y.Z.; resources, Z.L., J.Z. and Y.Z.; writing—original draft preparation, Z.L.; writing—review and editing, Z.L., F.W., F.X., J.Z., P.L. and Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant number 42101257).

Data Availability Statement

The original contributions presented in the study are included in the article, and further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ISAPs: Individually separable aquaculture ponds
MEA-FM: Median augmented adaptive fusion module
MAFU-Net: Median adaptive fusion U-Net
DIAS: Difference index for saltwater field aquaculture areas

References

  1. Bank, M.S.; Metian, M.; Swarzenski, P.W. Defining seafood safety in the Anthropocene. Environ. Sci. Technol. 2020, 54, 8506–8508.
  2. Béné, C.; Barange, M.; Subasinghe, R.; Pinstrup-Andersen, P.; Merino, G.; Hemre, G.I.; Williams, M. Feeding 9 billion by 2050–Putting fish back on the menu. Food Secur. 2015, 7, 261–274.
  3. Jiang, Q.; Bhattarai, N.; Pahlow, M.; Xu, Z. Environmental sustainability and footprints of global aquaculture. Resour. Conserv. Recycl. 2022, 180, 106183.
  4. Action, S. World fisheries and aquaculture. Food Agric. Organ. 2020, 2020, 1–244.
  5. Xu, Z.; Wu, S.; Christie, P.; Gao, X.; Xu, J.; Xu, S.; Liang, P. Impacts of estuarine dissolved organic matter and suspended particles from fish farming on the biogeochemical cycling of mercury in Zhoushan island, eastern China Sea. Sci. Total Environ. 2020, 705, 135921.
  6. Zhang, H.; Xiao, Y.; Deng, Y. Island ecosystem evaluation and sustainable development strategies: A case study of the Zhoushan Archipelago. Glob. Ecol. Conserv. 2021, 28, e01603.
  7. Prasad, K.A.; Ottinger, M.; Wei, C.; Leinenkugel, P. Assessment of coastal aquaculture for India from Sentinel-1 SAR time series. Remote Sens. 2019, 11, 357.
  8. Wang, D.; Wan, B.; Liu, J.; Su, Y.; Guo, Q.; Qiu, P.; Wu, X. Estimating aboveground biomass of the mangrove forests on northeast Hainan Island in China using an upscaling method from field plots, UAV-LiDAR data and Sentinel-2 imagery. Int. J. Appl. Earth Obs. Geoinf. 2020, 85, 101986.
  9. Higgins, S.; Overeem, I.; Tanaka, A.; Syvitski, J.P. Land subsidence at aquaculture facilities in the Yellow River delta, China. Geophys. Res. Lett. 2013, 40, 3898–3902.
  10. Zhou, S.; Zhu, H.; Huang, S.; Zhou, J.; Zhang, S.; Wang, C. Biomagnification and risk assessment of polychlorinated biphenyls in food web components from Zhoushan fishing ground, China. Mar. Pollut. Bull. 2019, 142, 613–619.
  11. Macusi, E.D.; Estor, D.E.P.; Borazon, E.Q.; Clapano, M.B.; Santos, M.D. Environmental and socioeconomic impacts of shrimp farming in the Philippines: A critical analysis using PRISMA. Sustainability 2022, 14, 2977.
  12. Hall, G.M. Impact of climate change on aquaculture: The need for alternative feed components. Turk. J. Fish. Aquat. Sci. 2015, 15, 569–574.
  13. Barange, M.; Bahri, T.; Beveridge, M.; Cochrane, K.L.; Funge-Smith, S.; Poulain, F. Impacts of climate change on fisheries and aquaculture. United Nations’ Food Agric. Organ. 2018, 12, 628–635.
  14. Ahmed, N.; Thompson, S.; Glaser, M. Global aquaculture productivity, environmental sustainability, and climate change adaptability. Environ. Manag. 2019, 63, 159–172.
  15. Gentry, R.R.; Froehlich, H.E.; Grimm, D.; Kareiva, P.; Parke, M.; Rust, M.; Gaines, S.D.; Halpern, B.S. Mapping the global potential for marine aquaculture. Nat. Ecol. Evol. 2017, 1, 1317–1324.
  16. Duan, Y.; Li, X.; Zhang, L.; Liu, W.; Chen, D.; Ji, H. Detecting spatiotemporal changes of large-scale aquaculture ponds regions over 1988–2018 in Jiangsu Province, China using Google Earth Engine. Ocean Coast. Manag. 2020, 188, 105144.
  17. Chen, C.; Liang, J.; Xie, F.; Hu, Z.; Sun, W.; Yang, G.; Yu, J.; Chen, L.; Wang, L.; Wang, L.; et al. Temporal and spatial variation of coastline using remote sensing images for Zhoushan archipelago, China. Int. J. Appl. Earth Obs. Geoinf. 2022, 107, 102711.
  18. Peterson, K.T.; Sagan, V.; Sloan, J.J. Deep learning-based water quality estimation and anomaly detection using Landsat-8/Sentinel-2 virtual constellation and cloud computing. GISci. Remote Sens. 2020, 57, 510–525.
  19. Stiller, D.; Ottinger, M.; Leinenkugel, P. Spatio-temporal patterns of coastal aquaculture derived from Sentinel-1 time series data and the full Landsat archive. Remote Sens. 2019, 11, 1707.
  20. Zhang, X.; Ma, S.; Su, C.; Shang, Y.; Wang, T.; Yin, J. Coastal oyster aquaculture area extraction and nutrient loading estimation using a GF-2 satellite image. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 4934–4946.
  21. Rajandran, A.; Tan, M.L.; Samat, N.; Chan, N.W. A review of Google Earth Engine application in mapping aquaculture ponds. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2022; Volume 1064, p. 012011.
  22. Hou, Y.; Zhao, G.; Chen, X.; Yu, X. Improving satellite retrieval of coastal aquaculture pond by adding water quality parameters. Remote Sens. 2022, 14, 3306.
  23. Xia, Z.; Guo, X.; Chen, R. Automatic extraction of aquaculture ponds based on Google Earth Engine. Ocean Coast. Manag. 2020, 198, 105348.
  24. Hou, T.; Sun, W.; Chen, C.; Yang, G.; Meng, X.; Peng, J. Marine floating raft aquaculture extraction of hyperspectral remote sensing images based decision tree algorithm. Int. J. Appl. Earth Obs. Geoinf. 2022, 111, 102846.
  25. Duan, Y.; Li, X.; Zhang, L.; Chen, D.; Ji, H. Mapping national-scale aquaculture ponds based on the Google Earth Engine in the Chinese coastal zone. Aquaculture 2020, 520, 734666.
  26. Xu, Y.; Hu, Z.; Zhang, Y.; Wang, J.; Yin, Y.; Wu, G. Mapping aquaculture areas with Multi-Source spectral and texture features: A case study in the pearl river basin (Guangdong), China. Remote Sens. 2021, 13, 4320.
  27. Liu, Y.; Wang, Z.; Yang, X.; Zhang, Y.; Yang, F.; Liu, B.; Cai, P. Satellite-based monitoring and statistics for raft and cage aquaculture in China’s offshore waters. Int. J. Appl. Earth Obs. Geoinf. 2020, 91, 102118.
  28. Ottinger, M.; Clauss, K.; Kuenzer, C. Large-scale assessment of coastal aquaculture ponds with Sentinel-1 time series data. Remote Sens. 2017, 9, 440.
  29. Minaee, S.; Boykov, Y.; Porikli, F.; Plaza, A.; Kehtarnavaz, N.; Terzopoulos, D. Image segmentation using deep learning: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3523–3542.
  30. Chen, H.; He, Y.; Zhang, L.; Yao, S.; Yang, W.; Fang, Y.; Liu, Y.; Gao, B. A landslide extraction method of channel attention mechanism U-Net network based on Sentinel-2A remote sensing images. Int. J. Digit. Earth 2023, 16, 552–577.
  31. Zhou, H.; Luo, F.; Zhuang, H.; Weng, Z.; Gong, X.; Lin, Z. Attention multihop graph and multiscale convolutional fusion network for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–14.
  32. Cui, B.G.; Zhong, Y.; Fei, D.; Zhang, Y.H.; Liu, R.J.; Chu, J.L.; Zhao, J.H. Floating raft aquaculture area automatic extraction based on fully convolutional network. J. Coast. Res. 2019, 90, 86–94.
  33. Lu, Y.; Shao, W.; Sun, J. Extraction of offshore aquaculture areas from medium-resolution remote sensing images based on deep learning. Remote Sens. 2021, 13, 3854.
  34. Zeng, Z.; Wang, D.; Tan, W.; Yu, G.; You, J.; Lv, B.; Wu, Z. RCSANet: A full convolutional network for extracting inland aquaculture ponds from high-spatial-resolution images. Remote Sens. 2020, 13, 92.
  35. Su, H.; Wei, S.; Qiu, J.; Wu, W. RaftNet: A new deep neural network for coastal raft aquaculture extraction from Landsat 8 OLI data. Remote Sens. 2022, 14, 4587.
  36. Dang, K.B.; Nguyen, M.H.; Nguyen, D.A.; Phan, T.T.H.; Giang, T.L.; Pham, H.H.; Nguyen, T.N.; Tran, T.T.V.; Bui, D.T. Coastal wetland classification with deep u-net convolutional networks and sentinel-2 imagery: A case study at the tien yen estuary of vietnam. Remote Sens. 2020, 12, 3270.
  37. Gao, L.; Wang, C.; Liu, K.; Chen, S.; Dong, G.; Su, H. Extraction of floating raft aquaculture areas from Sentinel-1 SAR images by a dense residual U-Net model with pre-trained ResNet34 as the encoder. Remote Sens. 2022, 14, 3003.
  38. Wang, J.; Sui, L.; Yang, X.; Wang, Z.; Liu, Y.; Kang, J.; Lu, C.; Yang, F.; Liu, B. Extracting coastal raft aquaculture data from landsat 8 OLI imagery. Sensors 2019, 19, 1221.
  39. Cheng, B.; Liang, C.; Liu, X.; Liu, Y.; Ma, X.; Wang, G. Research on a novel extraction method using Deep Learning based on GF-2 images for aquaculture areas. Int. J. Remote Sens. 2020, 41, 3575–3591.
  40. Fu, Y.; Ye, Z.; Deng, J.; Zheng, X.; Huang, Y.; Yang, W.; Wang, Y.; Wang, K. Finer resolution mapping of marine aquaculture areas using worldView-2 imagery and a hierarchical cascade convolutional neural network. Remote Sens. 2019, 11, 1678.
  41. Liu, Y.; Yang, X.; Wang, Z.; Lu, C.; Li, Z.; Yang, F. Aquaculture area extraction and vulnerability assessment in Sanduao based on richer convolutional features network model. J. Oceanol. Limnol. 2019, 37, 1941–1954.
  42. Zhang, Y.; Wang, C.; Ji, Y.; Chen, J.; Deng, Y.; Chen, J.; Jie, Y. Combining segmentation network and nonsubsampled contourlet transform for automatic marine raft aquaculture area extraction from sentinel-1 images. Remote Sens. 2020, 12, 4182.
  43. Ai, B.; Xiao, H.; Xu, H.; Yuan, F.; Ling, M. Coastal aquaculture area extraction based on self-attention mechanism and auxiliary loss. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 16, 2250–2261.
  44. Fu, Y.; Zhang, W.; Bi, X.; Wang, P.; Gao, F. TCNet: A Transformer–CNN Hybrid Network for Marine Aquaculture Mapping from VHSR Images. Remote Sens. 2023, 15, 4406.
  45. Deng, J.; Bai, Y.; Chen, Z.; Shen, T.; Li, C.; Yang, X. A convolutional neural network for coastal aquaculture extraction from high-resolution remote sensing imagery. Sustainability 2023, 15, 5332.
  46. Zhao, Q.; Yu, L.; Du, Z.; Peng, D.; Hao, P.; Zhang, Y.; Gong, P. An overview of the applications of earth observation satellite data: Impacts and future trends. Remote Sens. 2022, 14, 1863.
  47. Ren, C.; Wang, Z.; Zhang, Y.; Zhang, B.; Chen, L.; Xi, Y.; Xiao, X.; Doughty, R.B.; Liu, M.; Jia, M.; et al. Rapid expansion of coastal aquaculture ponds in China from Landsat observations during 1984–2016. Int. J. Appl. Earth Obs. Geoinf. 2019, 82, 101902.
  48. Duan, Y.; Tian, B.; Li, X.; Liu, D.; Sengupta, D.; Wang, Y.; Peng, Y. Tracking changes in aquaculture ponds on the China coast using 30 years of Landsat images. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102383.
  49. Hu, Y.; Zhang, L.; Chen, B.; Zuo, J. An Object-Based Approach to Extract Aquaculture Ponds with 10-Meter Resolution Sentinel-2 Images: A Case Study of Wenchang City in Hainan Province. Remote Sens. 2024, 16, 1217.
  50. Sridhar, P.; Surendran, A.; Ramana, I. Auto-extraction technique-based digital classification of saltpans and aquaculture plots using satellite data. Int. J. Remote Sens. 2008, 29, 313–323.
  51. Ma, Z.; Li, H.; Ye, Z.; Wen, J.; Hu, Y.; Liu, Y. Application of modified water quality index (WQI) in the assessment of coastal water quality in main aquaculture areas of Dalian, China. Mar. Pollut. Bull. 2020, 157, 111285.
  52. Wang, M.; Mao, D.; Xiao, X.; Song, K.; Jia, M.; Ren, C.; Wang, Z. Interannual changes of coastal aquaculture ponds in China at 10-m spatial resolution during 2016–2021. Remote Sens. Environ. 2023, 284, 113347.
  53. Wang, H.; Xu, X.; Zhu, G. Landscape changes and a salt production sustainable approach in the state of salt pan area decreasing on the Coast of Tianjin, China. Sustainability 2015, 7, 10078–10097.
  54. Rajitha, K.; Mukherjee, C.; Vinu Chandran, R.; Prakash Mohan, M. Land-cover change dynamics and coastal aquaculture development: A case study in the East Godavari delta, Andhra Pradesh, India using multi-temporal satellite data. Int. J. Remote Sens. 2010, 31, 4423–4442.
  55. Liu, C.; Jiang, T.; Zhang, Z.; Sui, B.; Pan, X.; Zhang, L.; Zhang, J. Extraction method of offshore mariculture area under weak signal based on multisource feature fusion. J. Mar. Sci. Eng. 2020, 8, 99.
  56. Gao, B.C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 1996, 58, 257–266.
  57. Wang, S.; Baig, M.H.A.; Zhang, L.; Jiang, H.; Ji, Y.; Zhao, H.; Tian, J. A simple enhanced water index (EWI) for percent surface water estimation using Landsat data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 90–97.
  58. Ruby, U.; Yendapalli, V. Binary cross entropy with deep learning technique for image classification. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9, 5393–5397.
  59. Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 1–48.
  60. Goutte, C.; Gaussier, E. A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In European Conference on Information Retrieval; Springer: Berlin/Heidelberg, Germany, 2005; pp. 345–359.
  61. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; pp. 234–241.
  62. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
  63. Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
  64. Zhao, H.; Shi, J.; Qi, X.; Wang, X.; Jia, J. Pyramid scene parsing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890.
  65. Li, X.; Wang, W.; Hu, X.; Yang, J. Selective kernel networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 510–519.
  66. Cui, B.; Fei, D.; Shao, G.; Lu, Y.; Chu, J. Extracting raft aquaculture areas from remote sensing images via an improved U-net with a PSE structure. Remote Sens. 2019, 11, 2053.
  67. Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and efficient design for semantic segmentation with transformers. Adv. Neural Inf. Process. Syst. 2021, 34, 12077–12090.
  68. Boyd, C.E. Bottom Soils, Sediment, and Pond Aquaculture; Springer Science & Business Media: Philadelphia, PA, USA, 2012.
  69. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring vegetation systems in the Great Plains with ERTS. NASA Spec. Publ. 1974, 351, 309.
  70. Liu, H.Q.; Huete, A. A feedback based modification of the NDVI to minimize canopy background and atmospheric noise. IEEE Trans. Geosci. Remote Sens. 1995, 33, 457–465.
Figure 1. Study area (A represents the coastal aquaculture area of Qingduizi Bay, Zhuanghe City, Liaoning Province; B represents the coastal aquaculture area north of Maya Island, Pulandian District, Dalian City; C represents the coastal aquaculture area of Yingkou City, east of Liaodong Bay; and D represents the Calabash Island Changshan Temple Bay coastal aquaculture area).
Figure 2. True colour images of the Yingkou area in January, April, July, and October.
Figure 3. Spectral analysis charts of the Yingkou area in January, April, July, and October (the green line in Figure 3 represents the spectral values of the aquaculture ponds across the 13 bands, the red line represents the spectral values of the saltwater fields across the 13 bands, and the orange line represents the spectral values of the embankments across the 13 bands).
Figure 4. Extraction results for saltwater fields using DIAS with different thresholds.
Figure 5. Adaptive attention U-Net (MAFU-Net).
Figure 6. Median-enhanced adaptive fusion module (MEA-FM).
Figure 7. Visual comparison results of different network models on the CHN-LN4-ISAPs-9 dataset are presented. Among them, (ah) depict various areas, with the first two columns showing the dataset images and the labeled images, while the subsequent eight columns display the extraction results from different models for each scene. (The black areas represent backgrounds, the white areas represent aquaculture areas, the red elliptical areas indicate misidentified water body regions, the yellow rectangular areas indicate misidentified saltwater fields and other land feature regions, the blue rectangular areas indicate misidentified fallow aquaculture ponds, the orange areas indicate omissions, and the green areas indicate edge adhesion).
Figure 8. Locations of the verification areas (a represents the Zhangxia Bay coastal aquaculture area in Dalian, Liaoning Province; b represents the Pulandian Bay coastal aquaculture area in Dalian, Liaoning Province; c represents the Taiping Bay coastal aquaculture area in Dalian, Liaoning Province; d represents the southern coastal aquaculture area in Jinzhou, Liaoning Province).
Table 1. Image information for each study area.
Area | Image Size (pixels) | Image Extent | Image Date
Study area A | 4710 × 1935 | 123.16°E–123.61°E, 39.41°N–39.68°N | January and December 2022
Study area B | 4938 × 3002 | 122.11°E–122.63°E, 39.15°N–39.46°N | January and December 2022
Study area C | 4221 × 3838 | 122.02°E–122.27°E, 40.33°N–40.68°N | January and December 2022
Study area D | 4633 × 2244 | 120.30°E–120.52°E, 40.22°N–40.42°N | January and December 2022
Table 2. Segmentation results obtained with different network components on three datasets (average values), where the best results are shown in bold.
Dataset | Model | Precision (%) | Recall (%) | IoU (%) | F1 Score (%)
CHN-LN4-ISAPs-6 | U-Net1 | 82.74 | 85.87 | 72.68 | 83.84
CHN-LN4-ISAPs-6 | U-Net2 | 85.48 | 87.36 | 76.29 | 86.94
CHN-LN4-ISAPs-6 | U-Net3 | 83.51 | 86.53 | 73.91 | 84.97
CHN-LN4-ISAPs-6 | U-Net4 (ours) | 88.05 | 89.78 | 81.13 | 89.12
CHN-LN4-ISAPs-8 | U-Net1 | 83.22 | 86.21 | 73.55 | 84.35
CHN-LN4-ISAPs-8 | U-Net2 | 86.13 | 87.69 | 77.86 | 87.33
CHN-LN4-ISAPs-8 | U-Net3 | 84.01 | 87.31 | 74.46 | 85.77
CHN-LN4-ISAPs-8 | U-Net4 (ours) | 88.69 | 90.26 | 82.11 | 89.75
CHN-LN4-ISAPs-9 | U-Net1 | 83.69 | 86.57 | 73.94 | 84.99
CHN-LN4-ISAPs-9 | U-Net2 | 87.62 | 88.14 | 78.29 | 87.68
CHN-LN4-ISAPs-9 | U-Net3 | 84.37 | 87.43 | 74.75 | 85.84
CHN-LN4-ISAPs-9 | U-Net4 (ours) | 89.29 | 91.03 | 83.93 | 90.67
Table 3. Segmentation results produced by different methods on the CHN-LN4-ISAPs-9 dataset (average values). The best results are shown in bold.
Method | Precision (%) | Recall (%) | IoU (%) | F1 Score (%)
U-Net | 83.69 | 86.57 | 73.94 | 84.99
DeepLabV3+ | 79.01 | 88.67 | 71.66 | 82.58
SegNet | 81.06 | 84.05 | 71.02 | 81.81
PSPNet | 87.95 | 89.84 | 81.55 | 88.54
SKNet | 87.01 | 88.13 | 80.67 | 87.97
UPS-Net | 82.96 | 86.11 | 74.01 | 84.52
SegFormer | 78.69 | 82.16 | 70.12 | 79.31
MAFU-Net | 89.29 | 91.03 | 83.93 | 90.67
Table 4. Comparison results of different data combination methods.
Group | Bands | Precision (%) | Recall (%) | IoU (%) | F1 Score (%)
Group 1 | 3 | 86.92 | 88.04 | 80.81 | 87.01
Group 2 | 13 | 86.57 | 87.38 | 79.51 | 86.49
Group 3 | 6 | 88.05 | 89.78 | 81.13 | 89.12
Group 4 | 6 | 88.12 | 90.06 | 82.76 | 89.27
Group 5 | 16 | 87.15 | 88.75 | 81.22 | 87.89
Group 6 | 9 | 89.29 | 91.03 | 83.93 | 90.67