Article

Submarine Landslide Identification Based on Improved DeepLabv3 with Spatial and Channel Attention

by Jingwen Huang 1, Weijing Song 1,*, Tao Liu 1, Xiaoyu Cui 1, Jining Yan 1 and Xiaoyu Wang 2
1 School of Computer Science, China University of Geosciences, Wuhan 430078, China
2 Monitoring and Planning Institute of Inner Mongolia Forestry and Grassland Administration, Hohhot 024050, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(22), 4205; https://doi.org/10.3390/rs16224205
Submission received: 9 September 2024 / Revised: 6 November 2024 / Accepted: 6 November 2024 / Published: 12 November 2024
(This article belongs to the Special Issue Artificial Intelligence and Big Data for Oceanography)

Abstract

Submarine landslides are among the most destructive, hazardous, and frequent marine geohazards, and recognizing them correctly is important for regional risk assessment, disaster prevention, and marine resource development. Many conventional approaches to prediction and mapping require expert insight, oversight, and extensive field investigation, which makes the prediction process time-consuming and labor-intensive. This paper employs a deep neural network semantic segmentation technique to detect submarine landslides, as an alternative to previous methods such as numerical analysis and physical modeling, so that landslide areas can be predicted and identified quickly. The peripheral zone of the western Iberian Sea is selected as the study area. Since neural network image recognition tasks usually take RGB images as input, slope, hillshade, and elevation factors extracted from digital elevation model (DEM) data are combined into RGB images through band synthesis, and data augmentation is used to increase the quantity and diversity of the data. Based on the classical semantic segmentation model DeepLabV3, this paper proposes an improved deep learning method that strengthens feature extraction in complex situations by adding an attention mechanism module and improving the spatial pyramid pooling module. The method raises the landslide intersection over union (IoU) from 0.4257 to 0.5219 and the F1-score from 0.609 to 0.6631, achieving effective identification of submarine landslides.

1. Introduction

As a highly destructive marine geohazard, submarine landslides are widely found in offshore continental shelf areas. They are usually caused by gravity and external triggers such as earthquakes, and they play an important role in seafloor geomorphologic shaping, seafloor sediment transport, and depositional change [1,2,3]. Such landslides not only move large amounts of sediment over long distances but also often damage marine engineering facilities along their route, such as submarine fiber-optic cables, oil rigs [4], and submarine pipelines, and may even trigger tsunamis [5], posing a substantial risk to the lives and property of coastal residents. In 1969, Hurricane Camille struck the Mississippi River Delta, triggering submarine landslides that damaged platforms and caused up to 100 million dollars in economic losses [6]. The huge tsunami triggered by submarine landslides from the 2004 earthquake in Sumatra, Indonesia, which resulted in hundreds of thousands of deaths and huge economic losses [7], and the communication disruption caused by submarine landslides in the Taiwan Straits in 2006 both highlight the wide influence and serious consequences of submarine landslide disasters.
The risk of submarine landslides is increasing with the expansion of global marine engineering construction and the frequent occurrence of extreme weather events. This has become a hotspot and a challenge in the assessment and prevention of marine engineering disasters. Compared with terrestrial landslides, submarine landslides are characterized by larger scale, longer sliding distance, and greater assessment difficulty. Additionally, the mobility of the soil body is significantly enhanced after its occurrence, with a wide range of damage, causing far-reaching impacts on the structure of marine engineering and the submarine environment. Therefore, accurate identification of the morphology, location, and scale of submarine landslides is crucial for regional risk assessment and disaster prevention [8].
Currently, the study of submarine landslides mainly relies on high-precision geophysical exploration techniques [9,10], numerical analysis methods [11,12,13], and physical modeling experiments [14,15]. Despite the progress made using these traditional methods, risk assessment and classification studies are still insufficient in the face of the complex submarine environment and diverse triggering factors. The inherent heterogeneity of natural environments and subsurface conditions, coupled with the reliance on indirect measurements, diverse data types, and multiple formats, poses formidable challenges for subsurface analysis and can lead to errors. Organizing, mining, and processing data for advanced analytics is not only time-consuming but also computationally intensive. To tackle these complexities, diverse and multimodal Artificial Intelligence/Machine Learning (AI/ML) methodologies offer a promising way to accelerate the data collection-to-analysis pipeline while revealing deeper insights. Moreover, the projected 50–70% increase in offshore infrastructure by 2028 underscores the need for tools capable of rapid analysis to support comprehensive risk assessments [16]. Consequently, innovative approaches are urgently needed to analyze these intricate datasets swiftly, enabling timely decision-making and the formulation of effective risk mitigation strategies.
Section 1 and Section 2 of this paper provide a pertinent literature review, outlining the cutting-edge research. Section 3 introduces the study area and data sources, providing a comprehensive background for the subsequent analysis. Section 4 explicates the methodological framework employed in this research, elucidating the models and techniques adopted. Section 5 presents the experimental section, encompassing experimental configurations, results, and their interpretations, constituting the empirical validation of our hypotheses. Lastly, the paper concludes with a summary of key findings and a visionary outlook on potential future directions and avenues for further research.

2. Related Work

2.1. Factors Affecting Submarine Landslides

Submarine landslides are an important geologic hazard, the occurrence and development of which are influenced by a variety of factors. These factors can be generally divided into two primary groups: natural factors and anthropogenic factors.
The natural formation of submarine landslides is influenced by a combination of complex factors, which are rooted in the physical and mechanical characteristics of sediments, the seabed topography, and the seabed geological structure. Specifically, the physical properties of sediments, such as particle size, density, and high water content, significantly reduce their shear strength, enhance their sensitivity, and expand the range of the liquid limit and plastic limit [17], thus promoting the development of landslides. Depending on slope conditions, valley-bottom morphology, and water depth, the seafloor topography can give rise to landslides even on gentle slopes, along which they can continue to slide for long distances. In addition, widely distributed weak layers on the seafloor, such as interfaces and fissures formed by biogenic, sedimentary, or tectonic processes, further weaken the stability of slopes. From a more macroscopic perspective, landslide-prone seafloor settings, such as fjords, active estuarine deltas, submarine canyon-fan systems, open continental slopes, and oceanic volcanic islands and ridges, are prone to landslides due to their unique geologic and hydrodynamic environments. As for external triggering conditions, tectonic movement affects stability by changing the stratigraphic structure [18]; hydrodynamic conditions such as water scouring and storm wave action directly induce landslide activity; and sea-level rise and fall under global climate change and changes in rainfall patterns [19] also indirectly affect the stability of sediments. Together, these constitute a complex and varied system of natural causes of submarine landslides.
Although human activities are not the main culprits of seafloor landslides, some human behaviors undoubtedly exacerbate the consequences of seafloor landslides. The profound intervention of human activities in the seabed environment, such as the over-exploitation of seabed resources, large-scale land reclamation projects, and the man-made adjustment of river flows, has altered the natural distribution of seabed sediments and the topography of the seabed. These activities not only accelerate the process of sediment erosion and redeposition but also may lead to significant changes in seafloor topography, such as an increase in slope, the deepening of seafloor valleys, and the destruction of existing geological structures. Human activities change the seabed environment in direct or indirect ways, and these changes often contribute to the occurrence of seabed landslides, making the already complex and variable seabed geological processes even more difficult to predict and control.

2.2. Current Status of Research on Submarine Landslides

Submarine landslides, as a common and highly destructive marine geological hazard, have undergone significant research advances in recent years, with profound implications for the assessment of seabed stability and the safety of offshore infrastructure. Research in this field is essentially an extension and deepening of terrestrial landslide studies into underwater and deep-sea environments. The first human understanding of submarine landslides dates back to the damage to submarine cables following the 1929 Grand Banks earthquake, which directly demonstrated the potential threat of submarine geological activity to underwater infrastructure [2]. Subsequently, pioneers such as Terzaghi initiated studies on terrestrial landslide mechanisms [20] and gradually broadened their focus to include nearshore submarine landslides [21], marking the official beginning of scientific research on submarine landslides.
Given the diversity of sediment instability types, Brunsden and Prior [22] refined the classification into several major forms, such as rockfalls, slides/collapses, and turbidity currents, and laid the foundation for subsequent classification research. Mulder and Cochonat [23] further elaborated on the evolutionary processes of submarine landslides, pointing out that cohesive sediments can progressively develop into mudflows or even turbidity currents through water degradation and entrainment, revealing the complexity of landslide dynamics. Using the examples of Saguenay Fjord, Quebec, Canada (depth ranges from 0 to 225 m and a total volume of more than 200 million cubic meters); Palos Verdes Slide, CA, USA; and the Canary Islands rock avalanches, Spain (covering an area of 2600 km² for a volume of about 150 km³), Locat and Lee [24] meticulously outlined the reasons behind submarine landslides, the various classification systems, key characteristics, geotechnical research methods, and the underlying mechanical processes. Their work offers a solid theoretical foundation for advancing the field. Harbitz et al. [25] focused on the effects of landslide magnitude, speed, acceleration, and backward movement on tsunami generation, deepening the understanding of the disaster chain associated with submarine landslides.
In terms of triggering mechanisms, the “driving force-resistance” model proposed by Anderson and Anderson [26] suggested that a landslide is triggered when the driving force acting on it exceeds the shear resistance of its base material, providing a critical perspective for understanding the triggering conditions of submarine landslides. The papers by Zhu et al. [27] and Jia et al. [28] focused on the detailed classification of landslide types, characterization, discussion of triggering mechanisms, and evaluation of field investigation methods, demonstrating the breadth and depth of submarine landslide research.
In terms of research methods, Lu et al. [29] used seismic geomorphology interpretation and seismic inversion techniques to effectively predict and identify shallow water flow and natural gas hydrate-related hazards, demonstrating the potential of high-precision geophysical methods in disaster warning. Hamilton et al. [30] and Daniel Orange et al. [31] used integrated multi-source geophysical data (multibeam data consisting of 30 kHz Simrad EM-300 data) combined with high-resolution seafloor imagery to achieve detailed characterization of complex submarine geological structures. Imran et al. [32] and Blasio et al. [33] improved rheological models to simulate the sliding behavior of submarine sediments and the interaction between turbidity currents and water bodies, improving the accuracy and applicability of numerical simulations. The lubrication theory model proposed by Harbitz et al. [34] and the Depth Averaged Material Point Method (DAMPM) model based on multiphase flow theory by Zakeri et al. [35] reveal the dynamic characteristics of submarine landslides from different perspectives. Capone et al. [36] innovatively applied the rheological Smoothed Particle Hydrodynamics (SPH) model to accurately reproduce the complex deformation of submarine landslides and their dynamic interactions with the water body. Building on this, Wang et al. [37] extended the application by using the depth-integrated SPH method to systematically investigate the combined effects of environmental parameters, such as water depth, slope angle, contact friction coefficient, and erosion rate, on the characteristics of submarine landslides.
In particular, the introduction of advanced technologies, such as three-dimensional seismic surveys and multibeam bathymetry, has given new impetus to submarine landslide research. 3D seismic surveys [38,39,40] and multibeam bathymetry [41] have demonstrated unique advantages in macrostructure analysis and geometric morphology imaging, respectively, providing abundant data to support a comprehensive understanding of submarine landslides.

2.3. Machine Learning/Deep Learning and Submarine Landslides

In recent years, with the swift advancement of artificial intelligence, machine learning and deep learning technologies have gradually entered the field of submarine landslide research [42], showing great potential in the analysis of complex geological environments. Tse et al. [43] used an unsupervised learning framework to study synchronization patterns in historical landslide events in the South China Sea region, which provided a new perspective for understanding the geological dynamics of the seafloor. Qi and Tang [44] integrated meta-heuristic algorithms with machine learning techniques to improve the accuracy of slope stability forecasting and expand the boundaries of technology applications (the whole dataset consists of 168 slope cases collected from five published research works). Dyer et al. [45] applied the gradient-boosted decision tree (GBDT) model to submarine landslide susceptibility mapping (LSM) in the northern Gulf of Mexico (total area of 386,753 km²), which, combined with the capabilities of the geographic information system (GIS), significantly improved the accuracy and efficiency of landslide risk assessment. Although these explorations mark early successes in applying machine learning and deep learning to submarine landslides, research in this field is still in its infancy and needs deeper exploration and extensive practice.
In view of this, this paper aims to fill this research gap by proposing an innovative application of the deep neural network semantic segmentation method in deep learning to the automatic identification of submarine landslides to explore its possibilities.
Several classic deep learning network models for semantic segmentation are presented next. FCN [46] is the pioneering model that introduced convolutional neural networks to semantic segmentation. Its core idea is to replace the traditional fully connected layers with convolutional layers, which preserves the geometric and positional details of the input image. Features are extracted through a series of convolutional and pooling operations and then upsampled by a deconvolutional (transposed convolutional) layer to scale the low-resolution feature maps back to the dimensions of the original image. UNet [47] was originally designed for biomedical image segmentation, but its architecture applies to a wide range of semantic segmentation tasks. UNet achieves efficient feature extraction and restoration through a symmetric encoder-decoder structure. PSPNet [48] is a deep learning model for semantic segmentation proposed by Zhao et al. in 2017. By combining the deep features extracted by the backbone network with features extracted at multiple scales by a pyramid pooling module, PSPNet effectively fuses global and local information and efficiently captures contextual information at different scales, thus improving segmentation performance. GCN [49] is a neural network model specialized in processing graph-structured data. By applying convolutional operations to graph nodes, it can effectively capture local and global information in graph structures. Similar to UNet, GCN uses an encoder-decoder architecture along with a large convolutional kernel to capture more global features and help the model learn the full view of the image. DeepLabV3 [50] revisits the use of dilated convolution for semantic segmentation. By introducing techniques such as atrous (dilated) convolution and spatial pyramid pooling, it obtains a larger receptive field while maintaining computational efficiency, mitigating the loss of feature-map resolution and enhancing the ability to capture multi-scale contextual information. DeepLabV3+ [51] further improves on DeepLabV3 by introducing an encoder-decoder architecture, in which the encoder extracts features while the decoder restores their spatial resolution for finer segmentation of target boundaries. Its ASPP module contains multiple parallel atrous convolutional layers, each with a different dilation rate, to produce multi-scale feature maps. It can better capture and reconstruct detail and improves the segmentation accuracy of object boundaries and small targets.
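As a rough illustration of why dilated convolution enlarges the receptive field without adding parameters, the sketch below evaluates the standard relation k_eff = k + (k − 1)(r − 1) for a k × k kernel with dilation rate r. The rates 6, 12, and 18 are those reported for ASPP in the original DeepLabV3 paper at output stride 16 and serve only as an example here; the function name is illustrative and not taken from the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent (in pixels) covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


for rate in (1, 6, 12, 18):
    extent = effective_kernel_size(3, rate)
    # A 3 x 3 kernel keeps 9 weights per channel pair regardless of the rate.
    print(f"rate={rate:2d}: covers {extent} x {extent} pixels with 9 weights")
```

The number of weights stays fixed while the covered extent grows linearly with the rate, which is the property the multi-rate branches rely on.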

3. Study Area and Data Sources

The peripheral zone of the western Iberian Sea, especially its southwestern and northeastern Atlantic margins, which covers the geographical area from 33°45′N to 43°N and from 6°22′W to 16°15′W, constitutes a geologically significant risk area. The exact location is shown in Figure 1. This region is characterized by the continuous northwesterly convergence of the African and European tectonic plates. This process exacerbates the accumulation of crustal stresses in the region and induces frequent seismicity [52,53], including strong earthquakes with moment magnitudes greater than 7 (Mw > 7). In the historical record, the Lisbon earthquake of 1755 and the resulting tsunami are an example of a highly destructive natural disaster [54]. At the same time, due to the mixing wedge created by subduction that started west of the Gulf of Cadiz, submarine landslides are common in the area [55], highlighting the seriousness of the seafloor risks there.
In addition, the large chain of seamounts widely distributed in the region, with their topography rising dramatically from the abyssal plains up to several kilometers in height, not only constitutes a distinctive marine topographic feature but is also closely associated with moderate- to high-intensity seismicity. Such geotectonic conditions are widely recognized as one of the key factors triggering submarine landslides. Recent stability assessments of the study area have revealed a wide range of potential instabilities on continental slopes and seamounts, predicting the vulnerability of these areas to destructive processes. Despite the significant geologic risks in the region, in-depth studies on the phenomenon of submarine landslides are relatively scarce and focus on individual case studies [56,57,58].
Therefore, this work develops a more in-depth study of submarine landslides in the region to better assess and mitigate the geohazard risks. The study utilizes the open-source MAGICLAND database (Marine Geological Hazards Induced by Submarine Landslides along the Western Iberian Margin) provided by Davide Gamboa, which focuses on marine geological hazards caused by underwater landslides in the western Iberian margin. This database comprises 41 items of data, including landslide characteristics, confidence levels of landslide mapping quality, and seafloor depths of the landslides. For this study, data from the digital elevation model (DEM) bathymetric grids and the geomorphological data of 1552 underwater landslides have been employed [59], as shown in Figure 2, Figure 3 and Figure 4.

4. Methodology

This section presents the data preprocessing method for submarine landslide elevation data images and the deep convolutional neural network architecture for DeepLabV3-based semantic segmentation with improvements developed in this paper.

4.1. Data Preparation

The DEM data used in this study come from the MAGICLAND database. These data were acquired using high-precision surveying techniques with a spatial resolution of approximately 115.6 m × 115.6 m and 9773 × 11,380 pixels. These data not only reflect the fine variations in the seafloor topography but also contain rich precursor information on geological hazards. To accurately identify and analyze landslide areas, a polygon annotation method was adopted to meticulously label the landslide phenomena within the DEM data, clearly delineating the shape, boundaries, and coverage area of the landslides, thereby laying a solid foundation for subsequent data processing and analysis.
To fully exploit the topographic feature information contained in the DEM data [60], the study utilized advanced GIS tools. GIS not only possesses powerful spatial data processing capabilities but also effectively integrates multi-source data, enabling precise extraction of topographic features [61]. In this study, GIS was used to derive key topographic feature parameters from the DEM data, including slope, elevation, and hillshade. These parameters respectively represent the inclination of the seabed terrain, the changes in height, and the three-dimensional morphology under lighting conditions. These elements hold significant importance in comprehending the mechanisms that lead to the formation of landslides and their spatial distribution characteristics [62].
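The derivation of slope and hillshade from a DEM can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the GIS workflow used in the study: the finite-difference gradient, the illumination defaults (azimuth 315°, altitude 45°), and the function name are all assumptions, and sign conventions for aspect differ between GIS packages and depend on raster orientation.

```python
import numpy as np


def slope_and_hillshade(dem, cell_size, azimuth_deg=315.0, altitude_deg=45.0):
    """Return slope (degrees) and hillshade (0-255) for a 2-D elevation grid."""
    dem = np.asarray(dem, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    slope = np.arctan(np.hypot(dz_dx, dz_dy))
    aspect = np.arctan2(dz_dy, -dz_dx)

    zenith = np.radians(90.0 - altitude_deg)
    # Compass azimuth (clockwise from north) to a mathematical angle
    # (counter-clockwise from east).
    azimuth = np.radians((360.0 - azimuth_deg + 90.0) % 360.0)

    shade = (np.cos(zenith) * np.cos(slope)
             + np.sin(zenith) * np.sin(slope) * np.cos(azimuth - aspect))
    hillshade = np.clip(255.0 * shade, 0.0, 255.0)
    return np.degrees(slope), hillshade


# Synthetic example: a plane that rises one cell size per column.
cell = 115.6  # metres, the approximate DEM resolution quoted in the text
ramp = np.tile(np.arange(32) * cell, (32, 1))
slope_deg, shade = slope_and_hillshade(ramp, cell)
print(slope_deg.mean(), shade.min(), shade.max())
```

Elevation itself needs no derivation, so together with these two layers it provides the three inputs used for band synthesis.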
The three extracted topographic parameters, slope, elevation, and hillshade, were treated as different “color” channels and combined through a band synthesis operation to generate images analogous to RGB (red, green, blue) images. This process not only provided an intuitive visualization of the topographic features but also preserved the rich information of the original data. The large synthesized RGB images were then cropped into square patches. To ensure the accuracy and representativeness of the training samples, cropping strictly followed the principle that each patch should either fully contain a submarine landslide area or contain no landslide area at all. The cropped images were all of the same size: 1 m × 1 m spatial resolution and 128 × 128 pixels. The specific workflow is illustrated in Figure 5. In addition, four of the resulting samples were selected for comparison with the original image, as shown in Figure 6.
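The band synthesis and cropping rule described above can be sketched as follows. Several details are assumptions made for illustration only: the min-max scaling to 8 bits, the channel order (slope, elevation, hillshade as R, G, B), the non-overlapping tiling grid, and the use of a label raster with one positive integer id per landslide to decide whether a landslide lies entirely inside a tile.

```python
import numpy as np

PATCH = 128  # patch edge length in pixels, as stated in the text


def to_uint8(layer):
    """Min-max scale one raster layer to the 0-255 range."""
    layer = np.asarray(layer, dtype=float)
    lo, hi = np.nanmin(layer), np.nanmax(layer)
    if hi == lo:
        return np.zeros(layer.shape, dtype=np.uint8)
    return np.round(255.0 * (layer - lo) / (hi - lo)).astype(np.uint8)


def compose_bands(slope, elevation, hillshade):
    """Stack three layers as the R, G, B channels of one image."""
    return np.dstack([to_uint8(slope), to_uint8(elevation), to_uint8(hillshade)])


def crop_patches(image, labels, patch=PATCH):
    """Yield (patch_image, patch_mask) pairs from non-overlapping tiles.

    `labels` holds 0 for background and a distinct positive id per landslide.
    A tile is kept only if it has no landslide pixels, or if every landslide
    it touches lies entirely inside the tile.
    """
    totals = np.bincount(labels.ravel())  # pixel count per landslide id
    rows, cols = labels.shape
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            tile = labels[r:r + patch, c:c + patch]
            inside = np.bincount(tile.ravel(), minlength=totals.size)
            ids = np.nonzero(inside[1:])[0] + 1  # landslide ids present in tile
            if np.all(inside[ids] == totals[ids]):
                yield image[r:r + patch, c:c + patch], (tile > 0).astype(np.uint8)


# Synthetic example: one landslide inside a tile, one straddling two tiles.
rng = np.random.default_rng(0)
layers = [rng.random((256, 256)) for _ in range(3)]
rgb = compose_bands(*layers)
labels = np.zeros((256, 256), dtype=np.int64)
labels[10:21, 10:21] = 1       # fully inside the top-left tile
labels[120:140, 200:210] = 2   # crosses the boundary between two tiles
patches = list(crop_patches(rgb, labels))
print(len(patches))
```

Tiles that cut through a landslide are discarded rather than relabeled, which mirrors the all-or-nothing principle stated in the text.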
Finally, we obtain landslide samples of 128 × 128 pixels, each with three channels denoted R (red), G (green), and B (blue), so that every pixel value satisfies x ∈ {0, 1, …, 255}, together with an annotated mask whose pixel values satisfy y ∈ {0, 1}, where 0 and 1 denote non-landslide and landslide, respectively.

4.2. Data Augmentation

In the field of deep learning, especially for semantic segmentation tasks involving complex natural phenomena, such as the automatic identification of submarine landslide areas in this study, the scale and diversity of the dataset play a crucial role in training high-performance deep neural network models. However, a significant challenge often faced in real-world situations is the extremely limited number of reliably labeled samples for specific disaster events (such as landslides) [63]. This data scarcity directly constrains the learning ability and generalization performance of deep neural network models, as deep learning is inherently a data-driven technique whose performance is highly dependent on the richness and representativeness of the training data.
Specifically, when the number of landslide area samples is insufficient, deep neural network models may not fully capture the complex feature differences and boundary variations between landslide and non-landslide areas during training. In such cases, the model is prone to overfitting [64]: it performs excellently on the training set and can accurately identify landslide areas in each training sample, but its recognition ability decreases drastically on new, unseen test samples, and it fails to generalize to the broader data distribution. Overfitting arises essentially because the model fits the noise and incidental details of the training data rather than learning the general and essential features of landslide areas. This not only wastes computational resources but also severely limits the model’s reliability and effectiveness in practical applications [65].
Therefore, this paper employs data augmentation to increase the quantity and diversity of the existing data. Starting from the samples obtained through the aforementioned methods, techniques such as rotation, flipping, transposition, and blurring were applied to transform and perturb the spatial layout and visual appearance of the data. The specific data augmentation operations are shown in Table 1. These methods can alleviate, to some extent, the adverse effects of data scarcity on the training of deep neural network models, thereby enhancing the model’s performance in the semantic segmentation of landslide areas.
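A minimal NumPy sketch of these four kinds of augmentation is given below. The exact operations and parameters of Table 1 are not reproduced here: the rotation angles, the flip axes, and the 3 × 3 mean filter used for blurring are assumptions chosen for illustration. The point of the sketch is that geometric transforms must be applied identically to the image and its mask, whereas blurring changes only the image.

```python
import numpy as np


def box_blur(image):
    """3 x 3 mean filter applied to each channel, with edge replication."""
    img = np.asarray(image, dtype=float)
    padded = np.pad(img, ((1, 1), (1, 1), (0, 0)), mode="edge")
    rows, cols = img.shape[:2]
    total = np.zeros_like(img)
    for dr in range(3):
        for dc in range(3):
            total += padded[dr:dr + rows, dc:dc + cols]
    return np.round(total / 9.0).astype(np.uint8)


def augment(image, mask):
    """Return a list of (image, mask) variants of one square sample."""
    variants = [(image, mask)]
    for k in (1, 2, 3):  # rotations by 90, 180, and 270 degrees
        variants.append((np.rot90(image, k), np.rot90(mask, k)))
    variants.append((np.flip(image, axis=1), np.flip(mask, axis=1)))  # horizontal flip
    variants.append((np.flip(image, axis=0), np.flip(mask, axis=0)))  # vertical flip
    variants.append((image.transpose(1, 0, 2), mask.T))               # transposition
    variants.append((box_blur(image), mask))  # blur alters appearance, not labels
    return variants


# Synthetic sample whose red channel marks the landslide pixels.
mask = np.zeros((128, 128), dtype=np.uint8)
mask[10:40, 20:90] = 1
image = np.zeros((128, 128, 3), dtype=np.uint8)
image[..., 0] = mask * 255
variants = augment(image, mask)
print(len(variants))
```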

4.3. Improved DeepLabV3 with Spatial and Channel Attention

We present a novel model architecture that builds on an optimized ResNet50 backbone, is inspired by DeepLabV3, and is specifically improved to enhance the accuracy of landslide area recognition in complex geological environments. The model integrates a multi-level feature processing mechanism, including efficient feature extraction, refined attention mechanisms, spatial pyramid pooling, multi-scale feature aggregation strategies, advanced channel attention modules, and an efficient decoder design. Together, these components form a powerful and compact deep learning framework. As illustrated in Figure 7, the core innovation of this model lies in its carefully designed spatial and channel attention modules. The spatial attention module enhances the model’s sensitivity to spatial layout and structural information in images. By capturing the differences in features across spatial locations, this module significantly improves the model’s ability to distinguish key spatial areas (such as landslide boundaries and internal structures of landslides) from background areas. This ensures that during training the model focuses more on the spatial details crucial for landslide recognition, thereby enhancing its spatial resolution and localization accuracy. The channel attention module, on the other hand, optimizes feature representation by dynamically adjusting the importance weights of feature channels. It automatically identifies and amplifies the feature channels that contribute most to landslide recognition while suppressing the influence of non-critical channels. This adaptive feature recalibration strategy not only enhances the model’s feature representation capability in the channel dimension but also promotes the effective extraction and utilization of critical information in complex scenarios, further improving the accuracy and robustness of landslide recognition. The two modules complement each other, significantly enhancing the model’s ability to understand and represent deep-level features in images [66].
At the workflow level, we adhered to a strict data partitioning and training evaluation process. To ensure the independence and non-interference of the model’s training, evaluation, and testing phases, the dataset was initially partitioned into distinct sets for training, validation, and testing purposes. During the training phase, iterative optimization algorithms were used to enable the model to gradually learn and internalize the feature representation of landslide areas, achieving precise recognition of these areas. Subsequently, the model’s performance was comprehensively evaluated using the validation set, with quantitative metrics such as accuracy, recall, and F1-score objectively reflecting the model’s learning effectiveness and generalization ability. Finally, the trained model was applied to the test set to verify its predictive performance on unseen data, from which the final landslide recognition predictions were derived (Figure 8).
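For reference, the pixel-wise metrics for the landslide class can be computed from a predicted mask and a ground-truth mask as in the sketch below. It uses the standard definitions in terms of true positives, false positives, and false negatives; the function name and the convention of returning 0.0 when a denominator is zero are assumptions, and the paper's own averaging over images is not specified here.

```python
import numpy as np


def landslide_metrics(pred, target):
    """Precision, recall, F1-score, and IoU for the landslide class (label 1)."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    tp = np.count_nonzero(pred & target)    # landslide predicted and present
    fp = np.count_nonzero(pred & ~target)   # landslide predicted but absent
    fn = np.count_nonzero(~pred & target)   # landslide present but missed

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }


pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [1, 0]])
scores = landslide_metrics(pred, target)
print(scores)
```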
Next, we will give a full introduction to each module of the model.

4.3.1. AttentionModule

Global average pooling is first applied to the input feature map: the values at all spatial locations of each channel are averaged to generate a global feature vector. This vector captures the global spatial information of each channel and is used to represent the importance of that channel. The global feature vector is then reduced in dimension by a 1 × 1 convolutional layer to lower computational complexity and suppress unimportant information. This step can be seen as a further feature extraction from the global features that yields a more compact representation of each channel. Applying the ReLU activation function to the reduced feature vector allows the network to learn nonlinear features. A second 1 × 1 convolutional layer then expands the reduced features so that the number of output channels matches the number of input channels, which is required for the channel dimensions to agree when the result is multiplied with the input feature map. The expanded features are passed through a Sigmoid activation function to generate a set of channel weight coefficients in the range [0, 1], which are used to adjust the feature strength of each channel. Finally, an element-wise multiplication is performed between the original input feature map x and the computed attention coefficient avg-out, so that the feature map of each channel is weighted according to its attention coefficient. Figure 9 shows the module’s network structure diagram.
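The sequence of operations described above can be traced with a small NumPy forward pass. This is a sketch under stated assumptions rather than the authors' implementation: biases are omitted, the weights are random, the channel reduction ratio of 16 is an assumed value that the text does not specify, and `avg_out` is taken to correspond to the coefficient called avg-out in the text. On a 1 × 1 map, a 1 × 1 convolution reduces to a matrix-vector product, which is what the code uses.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def attention_forward(x, w_reduce, w_expand):
    """Apply the described gating to a feature map x of shape (C, H, W).

    w_reduce: (C // r, C) weights of the first 1 x 1 convolution.
    w_expand: (C, C // r) weights of the second 1 x 1 convolution.
    """
    pooled = x.mean(axis=(1, 2))                 # global average pooling -> (C,)
    hidden = np.maximum(w_reduce @ pooled, 0.0)  # channel reduction + ReLU
    avg_out = sigmoid(w_expand @ hidden)         # one weight per channel
    return x * avg_out[:, None, None], avg_out


rng = np.random.default_rng(0)
channels, reduction = 64, 16  # the reduction ratio is an assumed value
x = rng.normal(size=(channels, 8, 8))
w_reduce = rng.normal(size=(channels // reduction, channels))
w_expand = rng.normal(size=(channels, channels // reduction))
out, weights = attention_forward(x, w_reduce, w_expand)
print(out.shape, weights.min(), weights.max())
```

Each channel of the input is scaled by a single coefficient, so the output keeps the input shape.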

4.3.2. ASPP Module

Building on the ASPP [67] module of the original DeepLabV3, we extend and optimize the module so that it captures multi-scale information better and fuses features more accurately, which strengthens the model’s feature extraction capability, especially for complex scenes.
First, the input feature map is globally average pooled, compressing the spatial dimensions into a 1 × 1 feature map that captures global information. A 1 × 1 convolution then reduces the dimensionality to lower the computational cost and adjust the number of feature channels. Compared with the original DeepLabV3, the pooled features are further processed by an additional convolution, batch normalization (BatchNorm), and an activation function (ReLU), which refines the extraction and representation of global information. Second, the input feature maps are processed directly by a 1 × 1 convolution to extract local features, which are normalized and activated by BatchNorm and ReLU so that the output features are more stable and can express nonlinear relationships. Third, three 3 × 3 convolutional layers with different dilation rates capture contextual information at different scales; BatchNorm and ReLU are added to each dilated convolutional layer to maintain the clarity and expressiveness of the features across scales. Finally, the feature maps produced by the different branches are concatenated along the channel dimension to obtain a feature map rich in multi-scale information, which is then fused by a 1 × 1 convolutional layer (bottleneck layer) that adjusts the channel count to the number of output channels, out-features. Figure 9 shows the module’s network structure diagram.
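The branch structure described above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the exact implementation: the class name ASPPSketch, the helper conv_bn_relu, the dilation rates (6, 12, 18), and the bilinear resizing of the pooled branch before concatenation are all assumptions, since the text does not specify them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_relu(in_ch, out_ch, kernel_size, dilation=1):
    """Convolution + BatchNorm + ReLU; the padding keeps H x W unchanged."""
    padding = dilation * (kernel_size // 2)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size, padding=padding,
                  dilation=dilation, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class ASPPSketch(nn.Module):
    """Illustrative ASPP variant; the dilation rates are assumed values."""

    def __init__(self, in_channels, out_features, rates=(6, 12, 18)):
        super().__init__()
        # Global branch: pool to 1 x 1, then 1 x 1 conv + BatchNorm + ReLU.
        self.global_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            conv_bn_relu(in_channels, out_features, 1),
        )
        # Local branch: 1 x 1 conv + BatchNorm + ReLU at full resolution.
        self.local_branch = conv_bn_relu(in_channels, out_features, 1)
        # Three 3 x 3 dilated branches, each with BatchNorm + ReLU.
        self.dilated_branches = nn.ModuleList(
            [conv_bn_relu(in_channels, out_features, 3, dilation=r) for r in rates]
        )
        # Bottleneck: fuse the concatenated branches back to out_features.
        self.bottleneck = nn.Conv2d(out_features * (2 + len(rates)), out_features, 1)

    def forward(self, x):
        h, w = x.shape[2:]
        # Assumption: the pooled branch is resized to H x W so it can be concatenated.
        g = F.interpolate(self.global_branch(x), size=(h, w),
                          mode="bilinear", align_corners=False)
        branches = [g, self.local_branch(x)] + [b(x) for b in self.dilated_branches]
        return self.bottleneck(torch.cat(branches, dim=1))
```

Note that BatchNorm on the 1 × 1 pooled map cannot compute batch statistics from a single sample in training mode, so a batch size of at least 2 is needed during training.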

4.3.3. SEBlock Attention Module

Compared with AttentionModule, SEBlock is a channel-based attention mechanism: it focuses on weighting the channel dimension and emphasizes the relationships between channels. Its main purpose is to enhance feature expression by adjusting the weights between channels, improving the model’s capability to represent features along the channel dimension. AttentionModule, in contrast, is mainly intended to improve the model’s grasp of the spatial structure of the image, strengthening its ability to discriminate features at different spatial locations and emphasizing the overall picture across spatial locations. The specific workflow of the SEBlock module is shown in Figure 10.
SEBlock [68] consists of three main parts: Squeeze, Excitation, and Reweighting. In the Squeeze phase, the spatial extent (height and width) of the input feature maps is compressed to 1 × 1 by a global average pooling operation, giving each channel a global receptive field. This aggregates the spatial information into a single scalar per channel, allowing the model to focus on the importance of each channel across the entire feature map rather than on localized regions, and thus provides the global information that supports the weighting between channels. In the Excitation phase, SEBlock learns the channel relationships through two fully connected layers (implemented in the code via 1 × 1 convolutions). First, fc1 (the first 1 × 1 convolution) reduces the number of channels from in-channels to in-channels/reduction, where reduction is a hyperparameter, usually set to 16, that lowers the computational cost. Next, the ReLU activation function introduces nonlinearity, after which fc2 (the second 1 × 1 convolution) restores the number of channels to the original in-channels. Finally, the Sigmoid activation function restricts the output to (0, 1), yielding the weight of each channel. This stage introduces nonlinearity and inter-channel dependencies, allowing the network to better understand and represent the correlations between channels and providing the basis for the subsequent feature weighting. In the Reweighting phase, SEBlock applies the previously computed channel weights y (after Sigmoid activation) to each channel of the original input feature map x by element-wise multiplication.
Through this weighting operation, the model adaptively adjusts the importance of each channel, emphasizing more relevant features and suppressing irrelevant ones. This gives the model better expressive ability at the channel level and lets it adjust the importance of different features dynamically throughout training, which improves its recognition and classification ability.
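The three phases can be sketched in PyTorch as follows. The names fc1, fc2, reduction, x, and y follow the description above; everything else (the class layout, the attribute name squeeze, and the use of nn.AdaptiveAvgPool2d) is an assumption made for illustration rather than the exact code used in the experiments.

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-Excitation sketch; requires in_channels >= reduction."""

    def __init__(self, in_channels, reduction=16):
        super().__init__()
        # Squeeze: global average pooling to one scalar per channel.
        self.squeeze = nn.AdaptiveAvgPool2d(1)
        # Excitation: two 1 x 1 convolutions acting as fully connected layers.
        self.fc1 = nn.Conv2d(in_channels, in_channels // reduction, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)
        self.fc2 = nn.Conv2d(in_channels // reduction, in_channels, kernel_size=1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        y = self.squeeze(x)             # (N, C, 1, 1)
        y = self.relu(self.fc1(y))      # (N, C // reduction, 1, 1)
        y = self.sigmoid(self.fc2(y))   # (N, C, 1, 1), each weight in (0, 1)
        # Reweighting: broadcast the channel weights over all spatial positions.
        return x * y
```

Because every weight lies in (0, 1), the block can only attenuate a channel relative to its input; which channels are attenuated least is what the two 1 × 1 convolutions learn.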

4.3.4. Decoder Module

The core of the decoder module is a convolutional block containing several convolutional layers and activation functions, designed to gradually recover the spatial resolution of the feature map and generate the final semantic segmentation result. The first convolutional layer receives the feature maps processed by the encoder and the other modules, such as the ASPP and attention modules, and converts the in-channels input channels into 256 channels. Its kernel size is 3 × 3, and padding = 1 keeps the spatial dimensions of the feature map unchanged. A batch normalization layer then normalizes the convolution output, which stabilizes training, accelerates convergence, and reduces sensitivity to parameter initialization. The ReLU (Rectified Linear Unit) activation function is applied to the normalized feature map, introducing nonlinearity that strengthens the model’s capability to learn more intricate features. The second convolutional layer further processes the feature map, maintaining 256 channels and continuing to enhance the feature representation. The third convolutional layer, with a 1 × 1 kernel, compresses the 256 channels into the final number of segmentation classes, so that the output feature map has the same resolution as its input and contains, for each pixel, the probability values or classification results for each class.
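A minimal PyTorch sketch of this three-layer block is given below. The first and third layers follow the description above; the configuration of the second layer (3 × 3 kernel, padding of 1, followed by BatchNorm and ReLU) is an assumption that mirrors the first layer, and the class name DecoderSketch is hypothetical.

```python
import torch
import torch.nn as nn


class DecoderSketch(nn.Module):
    """Illustrative decoder head: in_channels -> 256 -> 256 -> num_classes."""

    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.block = nn.Sequential(
            # First layer: 3 x 3 conv with padding = 1 keeps H x W unchanged.
            nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            # Second layer (assumed to mirror the first): keeps 256 channels.
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            # Third layer: 1 x 1 conv maps 256 channels to per-class scores.
            nn.Conv2d(256, num_classes, kernel_size=1),
        )

    def forward(self, x):
        return self.block(x)
```

Applying a softmax over the class dimension of the output converts the per-class scores into per-pixel probabilities, and an argmax over the same dimension gives the predicted class map.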

5. Experiments

5.1. Experimental Setup

5.1.1. Data Preprocessing

The raw remotely sensed images were derived from DEM data in the MAGICLAND (Marine Geohazards due to Underwater Landslides on the Western Iberian Margin) database. After the data preparation process described in Section 4.1, each image measures 128 × 128 pixels and contains one or more landslides. The dataset was partitioned into a 70% training set, a 15% validation set, and a 15% test set: the training set contains 128 images, the validation set contains 28 images, and the test set contains 28 images.
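As an illustration of how such a partition can be produced, the following sketch shuffles a list of tile names and splits it 70/15/15. The function name split_dataset, the random seed, and the placeholder tile names are assumptions; the total of 184 images is inferred from the reported set sizes (128 + 28 + 28).

```python
import random


def split_dataset(items, train_frac=0.70, seed=0):
    """Shuffle, then split into train/val/test; val and test share the remainder equally."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(train_frac * len(items))
    n_val = (len(items) - n_train) // 2
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


# Placeholder tile names; 184 = 128 + 28 + 28 images.
train, val, test = split_dataset([f"tile_{i:03d}" for i in range(184)])
print(len(train), len(val), len(test))  # 128 28 28
```

Splitting before any augmentation is applied keeps augmented copies of one tile from appearing in more than one subset.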
Because the database provides only a limited number of landslides, the amount of data we obtained was too small to support the training of a deep neural network. We therefore used data augmentation to increase the amount and diversity of the data in terms of space and appearance, and thus to avoid problems such as overfitting during training that would affect the final experimental results. For the training and test set data, we applied augmentation operations such as rotation, inversion, transposition, and blurring, each with a different probability; the process is shown in Figure 11 and Figure 12. For the test set, different cropping strategies were used to generate five sub-images from each original image.
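One practical detail of spatial augmentation for segmentation is that every geometric transform must be applied identically to the image and to its label mask. The sketch below illustrates this with plain Python lists on a single-channel toy grid. The function names, the probability p = 0.5, and the restriction to 90-degree rotation, horizontal flipping, and transposition are assumptions for illustration; the actual probabilities and library used are not specified above, and blurring is omitted because it alters only the image, not the mask.

```python
import random


def rot90(grid):
    """Rotate a 2D grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def hflip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def transpose(grid):
    """Swap the rows and columns of a 2D grid."""
    return [list(row) for row in zip(*grid)]


def augment(image, mask, rng, p=0.5):
    """Apply each spatial transform with probability p to image and mask together."""
    for transform in (rot90, hflip, transpose):
        if rng.random() < p:
            image, mask = transform(image), transform(mask)
    return image, mask


# Toy example: the pixel with value 2 is the only landslide pixel.
image = [[1, 2], [3, 4]]
mask = [[0, 1], [0, 0]]
aug_image, aug_mask = augment(image, mask, random.Random(0))
```

Whatever combination of transforms is drawn, the landslide label in aug_mask stays on the pixel that holds the value 2 in aug_image.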

5.1.2. Evaluation Indicators

In this paper, five performance metrics are used, including pixel accuracy, precision, recall, F1-score, and mean intersection over union (mIoU), which are defined as follows:
Pixel accuracy (PA) measures the fraction of all pixels that the model predicts correctly:
$$\mathrm{PA} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}$$
Precision measures how many of the samples that the model predicts as positive are actually positive. It is concerned with how many of the positive predictions are correct:
$$\mathrm{Precision} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}$$
Recall measures how many of all samples that are positive classes are correctly identified as such. It is concerned with how many of the actual results were correctly predicted:
$$\mathrm{Recall} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}$$
The F1-score combines precision and recall into a single, balanced evaluation metric:
$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
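The mIoU is a commonly used metric in image segmentation. It calculates the ratio of the intersection to the union of the predicted and true regions for each class and then averages over all classes:
$$\mathrm{mIoU} = \frac{1}{2} \times \left( \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}} + \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FN} + \mathrm{FP}} \right)$$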
where TP (true positive), TN (true negative), FP (false positive), and FN (false negative) denote correctly predicted landslide pixels, correctly predicted non-landslide pixels, non-landslide pixels misclassified as landslide, and landslide pixels misclassified as non-landslide, respectively.
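For concreteness, the five metrics can be computed from the four pixel counts as in the following sketch. The function name segmentation_metrics and the example counts are hypothetical and are not taken from the experiments; the sketch also assumes that no denominator is zero.

```python
def segmentation_metrics(tp, tn, fp, fn):
    """Compute PA, precision, recall, F1, and mIoU for a binary landslide mask.

    Assumes every denominator is non-zero (at least one landslide pixel is
    present in both the prediction and the ground truth).
    """
    pa = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou_landslide = tp / (tp + fp + fn)
    iou_background = tn / (tn + fn + fp)
    miou = 0.5 * (iou_landslide + iou_background)
    return {
        "PA": pa,
        "Precision": precision,
        "Recall": recall,
        "F1": f1,
        "IoU_landslide": iou_landslide,
        "mIoU": miou,
    }


# Hypothetical counts for a 1000-pixel example.
metrics = segmentation_metrics(tp=60, tn=900, fp=20, fn=20)
print(metrics["PA"], metrics["IoU_landslide"])  # 0.96 0.6
```

In this example the pixel accuracy is high mainly because non-landslide pixels dominate, whereas the landslide IoU is considerably lower, which is why the landslide IoU and F1-score are the more informative indicators under class imbalance.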

5.1.3. Experimental Configurations

In this paper, six classical semantic segmentation methods (FCN, UNet, GCN, PSPNet, DeepLabV3, and DeepLabV3+) are compared with the improved DeepLabV3 model. The experiments were conducted using the PyTorch framework. The initial learning rate was set to 1.0 × 10⁻⁴ and was reduced automatically during training by the ReduceLROnPlateau scheduler. The batch size is 2, the number of training epochs is 150, and the optimizer is Adam. Because the model is mainly used to identify landslide areas, and landslide and non-landslide areas are imbalanced in the sample images, the weight of the landslide class is set to 2.0 and the weight of the non-landslide class is set to 1.0.
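A configuration along these lines can be set up in PyTorch as sketched below. Several details are assumptions because they are not stated above: the class weights are applied here through a weighted cross-entropy loss, class index 0 is taken to be non-landslide and index 1 landslide, the scheduler's factor and patience are illustrative values, and a single convolution stands in for the segmentation network.

```python
import torch
import torch.nn as nn

# Placeholder network: one 1 x 1 convolution standing in for the segmentation model.
model = nn.Conv2d(3, 2, kernel_size=1)

# Adam with the initial learning rate of 1.0e-4 reported above.
optimizer = torch.optim.Adam(model.parameters(), lr=1.0e-4)

# ReduceLROnPlateau lowers the learning rate when the monitored value stops improving.
# Assumption: factor and patience are illustrative, not the values used in the paper.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5
)

# Assumption: weighted cross-entropy; index 0 = non-landslide (1.0), index 1 = landslide (2.0).
criterion = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0]))

# One illustrative optimization step on random data (batch size 2, 128 x 128 tiles).
images = torch.randn(2, 3, 128, 128)
targets = torch.randint(0, 2, (2, 128, 128))

optimizer.zero_grad()
loss = criterion(model(images), targets)
loss.backward()
optimizer.step()

# In practice the scheduler would be stepped once per epoch with the validation loss.
scheduler.step(loss.item())
```

With these weights, a misclassified landslide pixel contributes twice as much to the loss as a misclassified non-landslide pixel, which counteracts the dominance of background pixels.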

5.2. Results

The experimental results are reported in Table 2, and Figure 13 shows the predicted segmentation maps. From the table and figure, we draw the following conclusions:
  • The performance of seven semantic segmentation models was evaluated on the landslide detection task, with the following results. UNet: landslide IoU of 0.27, pixel accuracy of 0.7561, precision of 0.3994, recall of 0.68, and F1-score of 0.4636. FCN: landslide IoU of 0.1961, pixel accuracy of 0.8201, precision of 0.3516, recall of 0.3473, and F1-score of 0.3716. PSPNet: landslide IoU of 0.1013, pixel accuracy of 0.7579, precision of 0.2425, recall of 0.2305, and F1-score of 0.2363. GCN: landslide IoU of 0.1691, pixel accuracy of 0.8358, precision of 0.3608, recall of 0.2449, and F1-score of 0.2918. DeepLabV3: landslide IoU of 0.4257, pixel accuracy of 0.8911, precision of 0.6569, recall of 0.563, and F1-score of 0.6093. DeepLabV3+: landslide IoU of 0.1574, pixel accuracy of 0.8095, precision of 0.3623, recall of 0.2684, and F1-score of 0.3084. Improved DeepLabV3: landslide IoU of 0.5219, pixel accuracy of 0.9284, precision of 0.664, recall of 0.6695, and F1-score of 0.6631. Among the classic semantic segmentation models, DeepLabV3 performed best on the submarine landslide segmentation task. It achieved a landslide IoU of 0.4257, which reflects the overlap between the predicted and actual regions and is a key indicator of segmentation accuracy, and a pixel accuracy of 0.8911, indicating accurate pixel-level classification. Its precision (0.6569) and recall (0.563) combine into an F1-score of 0.6093, further confirming DeepLabV3’s advantage in balancing precision and recall.
    According to the images, the landslides predicted by UNet, FCN, PSPNet, GCN, and DeepLabV3+ cover roughly the same area as those in the labeled images but differ considerably in shape, whereas the landslides predicted by DeepLabV3 roughly match the labeled images in both area and shape. In conclusion, DeepLabV3 is the best-performing classic model for this task, while the other classic models, including FCN, PSPNet, GCN, and DeepLabV3+, gave less satisfactory results.
  • Although DeepLabV3+, as a more advanced version of DeepLabV3, has a more complex network structure and, in theory, stronger learning capability, it did not surpass DeepLabV3 in this experiment. This can be explained by the trade-off between model complexity and the amount of training data required. More complex models such as DeepLabV3+ often need more training samples to learn and optimize their parameters well enough to generalize. When samples are limited, the relatively simpler DeepLabV3 can use the available data more effectively, avoid overfitting, and thus achieve better segmentation performance.
    Meanwhile, this study found that DeepLabV3, after a series of targeted improvements, performs significantly better than its original version. Specifically, the improved model identifies landslide areas much more accurately, reaching a landslide IoU of 0.5219, a 22.6% relative increase over DeepLabV3’s 0.4257. In the image results, the landslide areas generated by the Improved DeepLabV3 are also more similar to the labeled landslide areas. This improvement highlights the effectiveness of the model optimization strategies and underscores the importance of customizing the model according to the data characteristics of a specific application scenario.
    A key innovation in this study is that the input images are not traditional three-channel RGB color images but single-band images extracted from a DEM, which are then combined into simulated three-band images through specific algorithms. Although such images are important in geographic information science, their data distribution and representation are more complex than those of natural images. In particular, subtle variations in terrain features are difficult to see in the synthesized images, which increases the difficulty of automatically identifying landslide areas.
    To overcome this challenge, the improved model integrates attention mechanism modules into the DeepLabV3 framework. Through dynamic weighting of the feature maps, these modules enhance key feature information and suppress less relevant information, which addresses the problem of landslide features being overlooked or misidentified in complex backgrounds. Specifically, the attention mechanism allows the model to focus on the most discriminative parts of the image, which are crucial for distinguishing between landslide and non-landslide areas, thereby enhancing the model’s feature representation capability and the accuracy of landslide area identification.
    Moreover, the improved model also made progress in the other evaluation dimensions, including pixel accuracy, precision, recall, and F1-score. The pixel accuracy increased to 0.9284, indicating the model’s robustness in pixel-level classification. The simultaneous improvement in precision and recall (0.664 and 0.6695, respectively) reflects the model’s ability to identify landslide areas accurately while maintaining good recall. The growth in the F1-score (to 0.6631) reflects the balanced improvement of precision and recall and further demonstrates the overall performance gain of the model.
  • Data augmentation techniques were used in this experiment to expand the dataset’s size and diversity. In terms of realism, the spatial transformations and appearance perturbations used for augmentation did not distort or disrupt critical geological characteristics: after augmentation, important geological features, such as the distribution and shape of faults and sedimentary strata, maintained their original patterns. To further validate the effectiveness of data augmentation, the augmentation operations were removed in a subsequent experiment, with the results presented in Table 3. The results show that the IoU and F1-score for the landslide regions dropped when data augmentation was removed. For example, the Improved DeepLabV3’s F1-score declined from 0.6631 to 0.4747, while its landslide IoU dropped from 0.5219 to 0.3021. This illustrates the need for and efficacy of the data augmentation procedure and suggests that the augmented samples remain reasonably representative and help to improve the model’s capacity for generalization. By simulating the diversity and complexity of the data, data augmentation effectively increases the dataset’s size when data are scarce. The model thus observes more types of changes and perturbations, which improves its ability to generalize to previously unseen data and to perform well in real-world applications.
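The relative changes implied by the reported values can be checked with a few lines of arithmetic. The helper name relative_change is hypothetical; the input values are the landslide IoU and F1-score figures quoted in the discussion above.

```python
def relative_change(before, after):
    """Relative change from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (after - before) / before


# Landslide IoU: DeepLabV3 (0.4257) versus Improved DeepLabV3 (0.5219).
print(round(relative_change(0.4257, 0.5219), 1))  # 22.6

# Improved DeepLabV3 with versus without data augmentation.
print(round(relative_change(0.5219, 0.3021), 1))  # -42.1 (landslide IoU)
print(round(relative_change(0.6631, 0.4747), 1))  # -28.4 (F1-score)
```

Removing augmentation therefore costs more landslide IoU in relative terms (about 42%) than the architectural changes gain over the DeepLabV3 baseline (about 23%).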

6. Conclusions and Future Work

This paper implements a deep neural network to identify submarine landslides using DEM data images and landslide areas provided by the MAGICLAND database. Based on the DeepLabV3 model, spatial and channel attention mechanism modules are added. The model is compared with six classic semantic segmentation models (FCN, UNet, PSPNet, GCN, DeepLabV3, and DeepLabV3+) and validated on the submarine landslide prediction problem using commonly used experimental metrics for semantic segmentation (pixel accuracy, precision, recall, F1-score, and mean intersection over union). The feasibility of the model in predicting submarine landslides is demonstrated, and the improved model’s accuracy in experimental results is verified. Compared to traditional methods, the use of deep learning neural networks accelerates submarine landslide prediction, reduces workload, and enhances the ability to process complex data.
The innovations of this paper are as follows. First, because submarine images are not as clear or as easy to obtain as surface images, research on submarine landslide susceptibility is very challenging; this paper formalizes landslide susceptibility as a semantic segmentation problem on optical remote sensing images and applies deep neural networks to submarine landslide susceptibility detection. Second, deep neural networks typically require input data in the form of three-channel RGB images; since submarine landslides can only be represented through DEM elevation data, three geological factors were extracted from the DEM data to serve as the RGB channels, yielding synthesized input images. Third, spatial and channel attention mechanisms were added to DeepLabV3 to improve the model’s ability to distinguish features at different spatial locations while focusing more on important channel information, thereby increasing the accuracy of the prediction results.
Given that submarine landslide images are directly extracted from DEM, their intrinsic attributes are significantly different from the color and texture features of traditional RGB images, a characteristic that poses a significant challenge to the learning effectiveness of the deep learning model in the image recognition task [69]. Specifically, DEM data focus on the three-dimensional geometric description of the terrain and lack the rich visual information and intuitive color hierarchy of RGB images, which exacerbates the difficulty of the model in capturing the features of the landslide area, and consequently affects the performance on the final test set, which fails to achieve the desired recognition accuracy [70].
To address this challenge, future research and experiments should focus on deeper preprocessing and enhancement strategies for seabed image data. First, terrain feature enhancement techniques can be explored to extract and highlight the specific terrain markers of landslide areas. Second, image segmentation and feature enhancement algorithms, such as edge detection and texture analysis, can be used to increase the contrast between the landslide area and its surroundings, making it easier for the model to capture key information. Third, multi-source data fusion strategies can be introduced, for example by combining marine geophysical data such as acoustic sounding and side-scan sonar, to construct a more comprehensive, multi-dimensional description of the seabed environment and provide a richer source of information for model learning.
Finally, in this work, we only validate the effectiveness of the semantic segmentation model for submarine landslides. As more and more seafloor hazard datasets become available, future research could expand its application to include the identification of other seafloor features that contribute to geohazards, such as faulting and gas seepage.

Author Contributions

Conceptualization, W.S.; methodology, J.H.; validation, T.L.; formal analysis, T.L. and X.C.; data curation, X.C.; writing—original draft preparation and editing, J.H.; writing—review and editing, W.S. and J.Y.; visualization, J.H.; supervision, W.S.; funding acquisition, X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Open Research Program of the International Research Center of Big Data for Sustainable Development Goals, Grant No. CBAS2023ORP03, the technology project of Hubei Provincial Department of Natural Resources under Grant ZRZY2024KJ22 and the International Research Center of Big Data for Sustainable Development Goals, Grant No. CBAS2022GSP05.

Data Availability Statement

The original contributions presented in the study are included in Ref. [59]; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. McAdoo, B.G.; Praston, L.F.; Orange, D.L. Submarine landslides geomorphology, US continental slope. Mar. Geol. 2000, 69, 103–136.
  2. Heezen, B.C. Turbidity currents and submarine slumps, and the 1929 Grand Banks earthquake. Am. J. Sci. 1952, 250, 849–873.
  3. Piper, D.J.W.; Cochonat, P.; Morrison, M.L. The sequence of events around the epicenter of the 1929 Grand Banks earthquake: Initiation of debris flows and turbidity current inferred from sidescan sonar. Sedimentology 1999, 46, 79–97.
  4. Wang, Y.W.; Wang, L.Z.; Chen, X.D. Offshore petroleum leaking source detection method from remote sensing data via deep reinforcement learning with knowledge transfer. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 5826–5840.
  5. Bea, R.G. How sea floor slides affect offshore structures. Oil Gas J. 1971, 69, 88–92.
  6. Liu, F. Submarine Landslides Induced by Gas Hydrate Decomposition and Environmental Risk Assessment in the Northern Slope of the South China Sea. Ph.D. Thesis, Graduate University of the Chinese Academy of Sciences (Institute of Oceanology), Beijing, China, 2010.
  7. Michael, A.F.; William, R.N.; Greene, H.G.; Homa, J.L.; Ray, W.S. Geology and tsunamigenic potential of submarine landslides in Santa Barbara Channel, Southern California. Mar. Geol. 2005, 224, 1–22.
  8. Zhang, X.H.; Chen, Y.F.; Le, Y. Nearshore Bathymetry Based on ICESat-2 and Multispectral Images: Comparison Between Sentinel-2, Landsat-8, and Testing Gaofen-2. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 2449–2462.
  9. Wang, W.; Wang, D.; Wu, S.; Völker, D.; Zeng, H.; Cai, G.; Li, Q. Submarine landslides on the north continental slope of the South China Sea. J. Ocean Univ. China 2018, 17, 83–100.
  10. Ilstad, T.; De Blasio, F.V.; Elverhøi, A.; Harbitz, C.B.; Engvik, L.; Longva, O.; Marr, J.G. On the frontal dynamics and morphology of submarine debris flows. Mar. Geol. 2004, 213, 481–497.
  11. El-Ramly, H.; Morgenstern, N.R.; Cruden, D.M. Probabilistic slope stability analysis for practice. Can. Geotech. J. 2002, 39, 665–683.
  12. Griffiths, D.V.; Lane, P.A. Slope stability analysis by finite elements. Géotechnique 1999, 49, 387–403.
  13. Ijaz, N.; Ye, W.; Rehman, Z.; Dai, F.; Ijaz, Z. Numerical Study on Stability of Lignosulphonate-Based Stabilized Surficial Layer of Unsaturated Expansive Soil Slope Considering Hydro-Mechanical Effect. Transp. Geotech. 2002, 32, 100697.
  14. Bradshaw, A.S.; Tappin, D.R.; Rugg, D.A. The Kinematics of a Debris Avalanche on the Sumatra Margin. Int. Symp. Submar. Mass Mov. Conseq. 2010, 28, 117–125.
  15. Schofield, A.N. Use of centrifugal model testing to assess slope stability. Rev. Can. Géotech. 2011, 15, 14–31.
  16. Mark-Moser, M.K.; Dyer, A.S.; Zaengle, D. AI/ML Techniques for Submarine Landslide Detection and Landslide Susceptibility Mapping; National Energy Technology Laboratory (NETL): Pittsburgh, PA, USA; Morgantown, WV, USA; Albany, OR, USA, 2023.
  17. Zhenhong, L.; Zuosheng, Y.; Bornhold, B.D. Instability of the subaqueous delta slope of the modern Yellow River. Mar. Geol. Quat. Geol. 1995, 15, 11–22.
  18. Herzer, R.H. Uneven submarine topography south of Mernoo Gap—The result of volcanism and submarine sliding. N. Z. J. Geol. Geophys. 1975, 18, 183–188.
  19. Weaver, P.P.E.; Kuijpers, A. Climatic control of turbidite deposition on the Madeira Abyssal Plain. Nature 1983, 306, 360–363.
  20. Terzaghi, K. Mechanism of Landslides; Geotechnical Society of America: Berkeley, CA, USA, 1951.
  21. Terzaghi, K.; Bjerrum, L.; Rosenqvist, I.T. Varieties of Submarine Slope Failures; Harvard University: Cambridge, MA, USA, 1957.
  22. Prior, D.B.; Coleman, J.M. Submarine Slope Instability; Louisiana State University Coastal Studies Institute: Baton Rouge, LA, USA, 1984.
  23. Mulder, T.; Cochonat, P. Classification of offshore mass movements. J. Sediment. Res. 1996, 66, 43–57.
  24. Locat, J.; Lee, H.J. Submarine landslides: Advances and challenges. Can. Geotech. J. 2002, 39, 193–212.
  25. Harbitz, C.B.; Løvholt, F.; Pedersen, G. Mechanisms of tsunami generation by submarine landslides: A short review. Nor. J. Geol. Geol. Foren. 2006, 86, 255–264.
  26. Anderson, R.S.; Anderson, S.P. Geomorphology: The Mechanics and Chemistry of Landscapes; Cambridge University Press: Cambridge, UK, 2010.
  27. Zhu, C.Q.; Jia, Y.G.; Liu, X.L.; Zhang, H. Classification and genetic mechanism of submarine landslide: A review. Mar. Geol. Quat. Geol. 2015, 35, 153–163.
  28. Jia, Y.G.; Wang, Z.H.; Liu, X.L. The research progress of field investigation and in-situ observation methods for submarine landslide. Period. Ocean Univ. China 2017, 47, 61–72.
  29. Lu, S.; McMechan, G.A. Estimation of gas hydrate and free gas saturation, concentration, and distribution from seismic data. Geophysics 2002, 67, 582–593.
  30. Hamilton, I.W.; Hartley, B.; Angheluta, C. Turning high-resolution geophysics upside down: Application of seismic inversion to site investigation and geohazard problems. In Proceedings of the 36th Offshore Technology Conference, Houston, TX, USA, 3–6 May 2004; Society of Petroleum Engineers: Houston, TX, USA, 2004.
  31. Daniel, O.; Angell, M.; Pawlowski, B. Visualizing seafloor, seismic, gravity and magnetic data in the deepwater Gulf of Mexico improves understanding of geohazards, salt, and seeps. In Proceedings of the 33rd Offshore Technology Conference, Houston, TX, USA, 30 April–3 May 2001; Society of Petroleum Engineers: Houston, TX, USA, 2001.
  32. Imran, J.; Harff, P.; Parker, G. A numerical model of submarine debris flow with graphical user interface. Comput. Geosci. 2001, 27, 717–729.
  33. Blasio, F.V.D.; Engvik, L.; Harbitz, C.B. Hydroplaning and submarine debris flows. J. Geophys. Res. 2004, 109, 1.
  34. Harbitz, C.B.; Parker, G.; Elverhøi, A. Hydroplaning of subaqueous debris flows and glide blocks: Analytical solutions and discussion. J. Geophys. Res. 2003, 108, 23492366.
  35. Zakeri, A.; Høeg, K.; Nadim, F. Submarine debris flow impact on pipelines Part II: Numerical analysis. Coast. Eng. 2009, 56, 1–10.
  36. Capone, T.; Panizzo, A.; Monaghan, J.J. SPH modelling of water waves generated by submarine landslides. J. Hydraul. Res. 2010, 48, 80–84.
  37. Wang, Z.T.; Li, X.Z.; Liu, P. Numerical analysis of submarine landslides using a smoothed particle hydrodynamics depth integral model. Acta Oceanol. Sin. 2016, 35, 134–140.
  38. Bull, S.; Cartwright, J.; Huuse, M. A review of kinematic indicators from mass-transport complexes using 3D seismic data. Mar. Pet. Geol. 2009, 26, 1132–1151.
  39. Frey-Martínez, J. 3D Seismic Interpretation of Mass Transport Deposits: Implications for Basin Analysis and Geohazard Evaluation. In Submarine Mass Movements and Their Consequences; Springer: Dordrecht, The Netherlands, 2010; pp. 553–568.
  40. O’Brien, P.E.; Mitchell, C.H.; Nguyen, D.; Langford, R.P. Mass Transport Complexes on a Cenozoic paleo-shelf edge, Gippsland basin, southeastern Australia. Mar. Pet. Geol. 2018, 98, 783–801.
  41. Twichell, D.C.; Chaytor, J.D.; Ten Brink, U.S.; Buczkowski, B. Morphology of late Quaternary submarine landslides along the US Atlantic continental margin. Mar. Geol. 2009, 26, 4–15.
  42. Han, W.; Li, J.; Wang, S. Geological Remote Sensing Interpretation Using Deep Learning Feature and an Adaptive Multisource Data Fusion Network. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4510314.
  43. Tse, K.C.; Chiu, H.; Tsang, M.; Li, Y.; Lam, E.Y. An unsupervised learning approach to study synchroneity of past events in the South China Sea. Front. Earth Sci. 2019, 13, 628–640.
  44. Qi, C.; Tang, X. Slope stability prediction using integrated metaheuristic and machine learning approaches: A comparative study. Comput. Ind. Eng. 2018, 118, 112–122.
  45. Dyer, A.S.; Mark-Moser, M.K.; Duran, R. Offshore application of landslide susceptibility mapping using gradient-boosted decision trees: A Gulf of Mexico case study. Nat. Hazards 2024, 120, 6223–6244.
  46. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
  47. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III 18. Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241.
  48. Zhao, H.; Shi, J.; Qi, X. Pyramid scene parsing network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2881–2890.
  49. Peng, C.; Zhang, X.; Yu, G. Large kernel matters–improve semantic segmentation by global convolutional network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4353–4361.
  50. Chen, L.C.; Papandreou, G.; Schroff, F. Rethinking atrous convolution for semantic image segmentation. arXiv 2017, arXiv:1706.05587.
  51. Chen, L.C.; Zhu, Y.; Papandreou, G. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
  52. Terrinha, P. Tsunamigenic-seismogenic structures, neotectonics, sedimentary processes and slope instability on the southwest Portuguese Margin. Mar. Geol. 2018, 195, 801–818.
  53. Zitellini, N. The quest for the Africa–Eurasia plate boundary west of the Strait of Gibraltar. Earth Planet. Sci. Lett. 2009, 280, 13–50.
  54. Zitellini, N. Source of 1755 Lisbon earthquake and tsunami investigated. Eos Trans. Am. Geophys. Union 2001, 82, 285–291.
  55. Urgeles, R.; Camerlenghi, A. Submarine landslides of the Mediterranean Sea: Trigger mechanisms, dynamics, and frequency-magnitude distribution. J. Geophys. Res. Earth Surf. 2013, 118, 2600–2618. [Google Scholar] [CrossRef]
  56. Terrinha, P. The Tagus River delta landslide, off Lisbon, Portugal. Implications for Marine geo-hazards. Mar. Geol. 2019, 416, 105983. [Google Scholar] [CrossRef]
  57. Gamboa, D. Destructive episodes and morphological rejuvenation during the lifecycles of tectonically active seamounts: Insights from the Gorringe Bank in the NE Atlantic. Earth Planet. Sci. Lett. 2021, 559, 116772. [Google Scholar] [CrossRef]
  58. Teixeira, M. Interaction of alongslope and downslope processes in the Alentejo Margin (SW Iberia)–Implications on slope stability. Mar. Geol. 2019, 401, 88–108. [Google Scholar] [CrossRef]
  59. Gamboa, D.; Omira, R.; Terrinha, P. A database of submarine landslides offshore West and Southwest Iberia. Sci. Data 2021, 8, 185. [Google Scholar] [CrossRef]
  60. Wang, L.; Zuo, B.; Le, Y. Penetrating remote sensing: Next-generation remote sensing for transparent earth. Innovation 2023, 4, 100519. [Google Scholar] [CrossRef]
  61. Wang, S.; Han, W.; Zhang, X. Geospatial remote sensing interpretation: From perception to cognition. Innov. Geosci. 2024, 2, 100056-1–100056-2. [Google Scholar] [CrossRef]
  62. Cheng, L.; Wang, L.; Feng, R. Remote sensing and social sensing data fusion for fine-resolution population mapping with a multimodel neural network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 5973–5987. [Google Scholar] [CrossRef]
  63. Chen, W.; Zhou, G.; Liu, Z. NIGAN: A framework for mountain road extraction integrating remote sensing road-scene neighborhood probability enhancements and improved conditional generative adversarial network. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5626115. [Google Scholar] [CrossRef]
  64. Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 60. [Google Scholar] [CrossRef]
  65. Wang, L.; Ma, Y.; Yan, J. pipsCloud: High performance cloud computing for remote sensing big data management and processing. Future Gener. Comput. Syst. 2018, 78, 353–368. [Google Scholar] [CrossRef]
  66. Fan, R.; Li, J.; Song, W. Urban informal settlements classification via a transformer-based spatial-temporal fusion network using multimodal remote sensing and time-series human activity data. Int. J. Appl. Earth Obs. Geoinf. 2022, 111, 102831. [Google Scholar] [CrossRef]
  67. Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848. [Google Scholar] [CrossRef] [PubMed]
  68. Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Wu, E. Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2011–2023. [Google Scholar] [CrossRef]
  69. Li, L.; Liu, P.; Wu, J. Spatiotemporal remote-sensing image fusion with patch-group compressed sensing. IEEE Access 2020, 8, 209199–209211. [Google Scholar] [CrossRef]
  70. Du, B.; Zhao, Z.; Hu, X. Landslide susceptibility prediction based on image semantic segmentation. Comput. Geosci. 2021, 155, 104860. [Google Scholar] [CrossRef]
Figure 1. The black area represents a geographical map of the study area.
Figure 2. Elevation images and bathymetry data in the western Iberian Sea area.
Figure 3. Areas of submarine landslides in the western Iberian Sea area. Three selected areas are shown enlarged.
Figure 4. Slope schematic of a selected landslide area, showing the evacuation length, deposit length, and deposit area.
Figure 5. Schematic diagram of the data processing. First, three topographic features (slope, elevation, and hillshade) are extracted from the DEM image; then band synthesis combines them into a three-channel image; finally, the image is clipped.
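The processing chain in Figure 5 (feature extraction, band synthesis, clipping) can be sketched roughly as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation, which may well rely on GIS tooling: the function names (`slope_degrees`, `hillshade`, `compose_rgb`, `clip_tiles`), the sun position of 315°/45°, the aspect convention, the min-max scaling to 0..255, the channel order, and the tile size are all illustrative choices.

```python
import numpy as np


def slope_degrees(dem, cell_size=1.0):
    """Slope angle in degrees from finite-difference elevation gradients."""
    dz_dy, dz_dx = np.gradient(dem.astype(float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


def hillshade(dem, cell_size=1.0, azimuth_deg=315.0, altitude_deg=45.0):
    """Lambertian hillshade in [0, 1]; sun position and aspect convention are assumed."""
    dz_dy, dz_dx = np.gradient(dem.astype(float), cell_size)
    slope = np.arctan(np.hypot(dz_dx, dz_dy))
    aspect = np.arctan2(-dz_dx, dz_dy)
    azimuth = np.radians(azimuth_deg)
    altitude = np.radians(altitude_deg)
    shaded = (np.sin(altitude) * np.cos(slope)
              + np.cos(altitude) * np.sin(slope) * np.cos(azimuth - aspect))
    return np.clip(shaded, 0.0, 1.0)


def to_uint8(band):
    """Min-max scale one band to 0..255; a constant band maps to 0."""
    lo, hi = float(band.min()), float(band.max())
    if hi == lo:
        return np.zeros(band.shape, dtype=np.uint8)
    return np.round((band - lo) / (hi - lo) * 255.0).astype(np.uint8)


def compose_rgb(dem, cell_size=1.0):
    """Band synthesis: stack slope, hillshade and elevation as H x W x 3 uint8."""
    bands = [slope_degrees(dem, cell_size), hillshade(dem, cell_size), dem]
    return np.dstack([to_uint8(b) for b in bands])


def clip_tiles(image, tile=256):
    """Cut an H x W x C image into non-overlapping tile x tile patches."""
    h, w = image.shape[:2]
    return [image[r:r + tile, c:c + tile]
            for r in range(0, h - tile + 1, tile)
            for c in range(0, w - tile + 1, tile)]
```

Each band is scaled independently here so that no single feature dominates the 8-bit range; a fixed global scaling would be an equally plausible choice.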
Figure 6. Comparison of the final processed data with the source data: images on the left, masks on the right.
Figure 7. Framework structure for the improved DeepLabV3 models.
Figure 8. The overall workflow of the entire experimental procedure.
Figure 9. Framework for AttentionModule and ASPP module.
Figure 10. Framework for SEBlock attention module.
Figure 11. Image spatial transformation.
Figure 12. Image appearance disturbance.
Figure 13. Experimental results. (1) RGB image, (2) label, (3) UNet, (4) PSPNet, (5) GCN, (6) FCN, (7) DeepLabV3plus, (8) DeepLabV3, (9) Improved DeepLabv3.
Table 1. Data augmentation.

| Method | Remarks |
| --- | --- |
| RandomRotate90 | Randomly rotate the image by a multiple of 90 degrees (0, 90, 180, or 270 degrees) |
| Flip | Randomly flip the image horizontally |
| Transpose | Randomly transpose the image, i.e., swap its width and height |
| GaussianBlur | Randomly apply Gaussian blur (blur limit between 3 and 7) |
| MotionBlur | Apply motion blur to make the image look as if it was taken while the camera was moving |
| MedianBlur | Apply median blur to reduce noise by replacing each pixel with the median of its neighboring pixels |
| Blur | Apply mean blur to reduce image detail |
| OpticalDistortion | Apply optical distortion to simulate lens distortions |
| CLAHE | Apply adaptive histogram equalization to enhance image contrast |
| Sharpen | Apply sharpening to enhance image details |
| Emboss | Apply an embossing effect to make the image look engraved |
| RandomBrightnessContrast | Randomly adjust the brightness and contrast of the image |
| HueSaturationValue | Randomly adjust the hue, saturation, and value (brightness) of the image |
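In a segmentation setting, the spatial transforms in Table 1 (rotation, flip, transpose) have to be applied identically to an image and its mask, whereas the appearance transforms touch the image only. The following is a minimal NumPy sketch of the spatial part; the function name `augment_spatial` and the probability of 0.5 for each optional step are assumptions made for illustration, and the section does not state which library or settings the authors used.

```python
import numpy as np


def augment_spatial(image, mask, rng):
    """Apply the same random rotation, flip and transpose to image and mask."""
    k = int(rng.integers(0, 4))            # number of 90-degree turns: 0..3
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    if rng.random() < 0.5:                 # horizontal flip (assumed p = 0.5)
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                 # transpose: swap height and width
        image = np.swapaxes(image, 0, 1)
        mask = np.swapaxes(mask, 0, 1)
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)
```

A call such as `augment_spatial(image, mask, np.random.default_rng(0))` returns a transformed pair whose pixels still correspond one to one, which is the property the label masks depend on.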
Table 2. Experimental results.

| Model | mIoU (Mean) | IoU (Background) | IoU (Landslide) | Pixel Acc | Precision | Recall | F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| UNet | 0.4981 | 0.7261 | 0.27 | 0.7561 | 0.3516 | 0.68 | 0.4636 |
| FCN | 0.5039 | 0.8117 | 0.1961 | 0.82 | 0.3994 | 0.3473 | 0.3716 |
| PSPNet | 0.4233 | 0.7454 | 0.1013 | 0.758 | 0.2425 | 0.2305 | 0.2363 |
| GCN | 0.4944 | 0.8196 | 0.1691 | 0.836 | 0.361 | 0.2449 | 0.2918 |
| DeepLabV3 | 0.6468 | 0.8679 | 0.4257 | 0.891 | 0.664 | 0.563 | 0.6093 |
| DeepLabV3plus | 0.4767 | 0.796 | 0.1574 | 0.81 | 0.362 | 0.2684 | 0.3084 |
| Improved DeepLabV3 | 0.7165 | 0.9112 | 0.5219 | 0.9284 | 0.664 | 0.6695 | 0.6631 |
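The columns of Table 2 are related to one another in the usual way. The sketch below assumes the standard definitions of these metrics for a two-class problem with landslide as the positive class; this section does not restate the paper's exact formulas, so the function names and the use of pixel counts `tp`, `fp`, `fn`, `tn` are illustrative. Degenerate cases with a zero denominator are not handled.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def segmentation_metrics(tp, fp, fn, tn):
    """Two-class metrics from pixel counts (landslide = positive class)."""
    iou_landslide = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "miou": (iou_landslide + iou_background) / 2,
        "iou_background": iou_background,
        "iou_landslide": iou_landslide,
        "pixel_acc": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Consistency check against the DeepLabV3 row of Table 2.
print(round(f1_score(0.664, 0.563), 4))    # 0.6093
print(round((0.8679 + 0.4257) / 2, 4))     # 0.6468
```

Under these definitions the DeepLabV3 row is internally consistent: its F1-score follows from its precision and recall, and its mean IoU is the average of the background and landslide IoU.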
Table 3. Experimental results without augmentation.

| Model | mIoU (Mean) | IoU (Background) | IoU (Landslide) | Pixel Acc | Precision | Recall | F1-Score |
| --- | --- | --- | --- | --- | --- | --- | --- |
| UNet | 0.5564 | 0.8809 | 0.232 | 0.8872 | 0.3413 | 0.5602 | 0.4242 |
| FCN | 0.5223 | 0.8555 | 0.1891 | 0.8618 | 0.2739 | 0.5431 | 0.3641 |
| PSPNet | 0.4752 | 0.8624 | 0.0881 | 0.8713 | 0.1977 | 0.2349 | 0.2147 |
| GCN | 0.5336 | 0.905 | 0.1523 | 0.9079 | 0.3067 | 0.3192 | 0.2831 |
| DeepLabV3 | 0.6082 | 0.9518 | 0.2874 | 0.9528 | 0.5321 | 0.354 | 0.4251 |
| DeepLabV3plus | 0.5224 | 0.8973 | 0.1376 | 0.8082 | 0.2778 | 0.3343 | 0.3015 |
| Improved DeepLabV3 | 0.6236 | 0.9452 | 0.3021 | 0.9466 | 0.5409 | 0.423 | 0.4747 |
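Reading Tables 2 and 3 side by side, the effect of augmentation on the landslide class can be made explicit with a small calculation. The landslide IoU values below are copied from the two tables; the names `LANDSLIDE_IOU` and `augmentation_gain` are illustrative only.

```python
# Landslide-class IoU as (without augmentation, with augmentation),
# copied from Table 3 and Table 2 respectively.
LANDSLIDE_IOU = {
    "UNet": (0.2320, 0.2700),
    "FCN": (0.1891, 0.1961),
    "PSPNet": (0.0881, 0.1013),
    "GCN": (0.1523, 0.1691),
    "DeepLabV3": (0.2874, 0.4257),
    "DeepLabV3plus": (0.1376, 0.1574),
    "Improved DeepLabV3": (0.3021, 0.5219),
}


def augmentation_gain(scores):
    """Return {model: IoU with augmentation minus IoU without}, 4 decimals."""
    return {model: round(with_aug - without_aug, 4)
            for model, (without_aug, with_aug) in scores.items()}


gains = augmentation_gain(LANDSLIDE_IOU)
for model, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model:18s} {gain:+.4f}")
```

On these figures every model gains landslide IoU with augmentation, and the two largest gains are for Improved DeepLabV3 (+0.2198) and DeepLabV3 (+0.1383).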

Huang, J.; Song, W.; Liu, T.; Cui, X.; Yan, J.; Wang, X. Submarine Landslide Identification Based on Improved DeepLabv3 with Spatial and Channel Attention. Remote Sens. 2024, 16, 4205. https://doi.org/10.3390/rs16224205