Automatic Detection Method for Loess Landslides Based on GEE and an Improved YOLOX Algorithm

Yu, Zhengbo; Chang, Ruichun; Chen, Zhe

doi:10.3390/rs14184599

by Zhengbo Yu ^1,2,3, Ruichun Chang ^1,2,3,* and Zhe Chen ^1,2,3

¹ College of Mathematics and Physics, Chengdu University of Technology, Chengdu 610059, China
² Digital Hu Line Research Institute, Chengdu University of Technology, Chengdu 610059, China
³ Geomathematics Key Laboratory of Sichuan Province, Chengdu University of Technology, Chengdu 610059, China
* Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(18), 4599; https://doi.org/10.3390/rs14184599

Submission received: 13 July 2022 / Revised: 30 August 2022 / Accepted: 10 September 2022 / Published: 14 September 2022

(This article belongs to the Special Issue Remote Sensing for Engineering and Sustainable Development Goals)

Abstract:

The Loess Plateau is an ecologically fragile area in China, and loess landslides are a typical geological hazard there that severely limits the sustainable development of local society and the economy. Studying the automatic detection of landslides can facilitate disaster prevention and mitigation in the Loess Plateau and help realize the climate action goal (SDG 13) of the United Nations Sustainable Development Goals (SDGs). This paper takes typical loess areas in China as the research object and establishes a historical loess landslide sample database based on Google Earth (GEE) image data, with a total of 1451 loess landslides. The automatic detection of loess landslides is implemented by improving the You Only Look Once X (YOLOX) algorithm. The results show that the average precision (AP) of landslide detection with this method is 95.43% and the precision is 96.32%, showing that the method makes effective use of big Earth data to detect loess landslides automatically. The research results provide technical support for the promotion of disaster prevention and mitigation in China’s loess regions, the realization of sustainable development goals, and the improvement of natural disaster prevention–resistance–reduction systems.

Keywords:

loess landslide; automatic detection; SDG 13; deep learning; improved YOLOX algorithm; Google Earth images

1. Introduction

Loess is widely distributed worldwide, accounting for about one-tenth of all global land. China has the widest distribution and deepest thickness of loess in the world. Loess areas have the characteristics of an unstable stratum structure, loose soil, and low vegetation coverage. When encountering harsh weather conditions—such as strong winds and heavy rains—there will be a significant loss of soil and water, fragmenting the Loess Plateau ravines and terrain [1,2]. Coupled with the population growth in the Loess Plateau, a large number of forests have been cut down for use as arable land, further desertifying the land. Under the combined action of nature and human activity, the ecological environment of the Loess Plateau has deteriorated significantly, resulting in the frequent occurrence of natural disasters, such as landslides, mudslides, and collapses [3]. Among them, loess landslides are the most typical [4]. They seriously threaten the safety of the population and property, in addition to the sustainable development of society and the economy. Therefore, it is imperative to research methods to reduce disasters and avoid risks [5]. Achieving the United Nations Sustainable Development Goals (SDGs) is a common goal for all governments. The Industry, Innovation and Infrastructure Goal (SDG 9) emphasizes building resilient infrastructure. The Sustainable Cities and Communities Goal (SDG 11) calls for cities and human settlements that are inclusive, safe, resilient, and sustainable. The Climate Action Goal (SDG 13) proposes strengthening countries’ resilience against climate-related disasters and natural disasters [6]. Therefore, research on the automatic detection of loess landslides is of significant importance.

There are two common landslide detection methods. The first is the field survey, which is inefficient, labour-intensive, and costly. The second uses remote-sensing images for landslide identification and can be subdivided into three techniques: visual interpretation, object-oriented approaches, and pixel-based methods. Visual interpretation is a traditional method in remote-sensing interpretation; it relies heavily on expert experience and requires a lot of time and effort to implement [7]. The object-oriented approach takes the image elements in a neighbourhood as the smallest processing unit and can exploit spatial features such as the shape and texture of the target object in the image, but its threshold settings are often applicable only to a specific study area, and its transferability needs to be further improved [8,9]. The rapid development of emerging technologies such as big data and artificial intelligence has made big earth data a scientific research paradigm for analysing geoscience laws from massive earth system observation data; therefore, the pixel-based technique has also been widely used. In particular, based on the mechanism of spatial information acquisition and analysis, earth observation data, an important component of big earth data, can quickly, accurately, and macroscopically reflect key disaster-reduction information, such as the spatial location of disasters and the hazard status of disaster-affected bodies. It plays an indispensable role in natural disaster prevention–resistance–reduction systems [10]. In remote-sensing images, loess landslides mostly occur in loess areas, and the landslide area is usually homogeneous with its surrounding environment, having similar colour, brightness, and texture characteristics. Even for a human interpreter, it is difficult to accurately detect and segment loess landslide areas; the automatic detection of loess landslides is therefore very difficult. At present, researchers primarily use mainstream target detection algorithms to identify landslides. Cheng used the YOLOv4 algorithm as a benchmark model and proposed attention-related improvements on 1818 remote-sensing images, verifying the effectiveness of the YOLOv4 model for potential landslide detection [11]. Ji used a convolutional neural network approach to train a semantic segmentation model on landslide satellite images and efficiently predicted new potential landslides [12]. Xu proposed the use of MFFENet and ADANet models to extract the geological features of remote-sensing images and landslides [13]; their method showed stable and reliable performance in landslide identification tasks.

At present, Earth big data technology is developing rapidly. For example, Google has developed Google Earth Engine (GEE), a cloud computing platform that specializes in processing massive amounts of satellite imagery and other Earth observation data. The platform stores nearly 40 years of publicly available global-scale remote-sensing imagery (Landsat, Modis, Sentinel, etc.) and petabyte-scale archives of other data, optimizing the computing infrastructure for the parallel processing of geospatial data. In 2019, the Chinese Academy of Sciences released the ‘Earth Big Data Sharing Service Platform’, which provides users with systematic, diverse, dynamic, continuous, and standardized global big earth data. This promotes the formation of a new model for earth science data sharing by establishing a data-sharing system that integrates data, computing, and services. Based on the big Earth data platform, a series of remote sensing analyses have been carried out worldwide. For landslide information analysis, using the GEE platform, Yu Bo et al. [14] constructed 30 m resolution long-series Landsat annual synthetic data through exponential sorting and constructed a random forest model using texture features. The large-scale extraction of landslide events in central Nepal was achieved by detecting changes in adjacent time-series images [15,16].

With the expansion of the sample size and scene of loess landslide disasters, in addition to the explosive growth of geological disaster-observation data, deep learning has gradually become a hotspot in research on landslide detection methods. It makes full use of the information extraction ability of convolutional neural networks to detect and segment landslides in images and, in comparison to traditional landslide detection, offers a high detection speed and good detection results [17]. As one of the most cutting-edge technologies in machine learning today [18], deep learning is widely implemented in the field of computer vision. The R-CNN algorithm pioneered the two-stage approach, in which candidate regions are proposed first to increase the accuracy of model recognition; it has played a significant role in promoting subsequent target detection technologies such as Faster R-CNN [19], which generates candidate regions through the RPN network, and Mask R-CNN [20]. The SSD [21,22,23] and YOLO [24,25,26,27,28,29] series, as representatives of one-stage algorithms, run faster than two-stage algorithms and can achieve real-time detection. Although they have some limitations in accuracy, an increasing number of researchers are paying attention to one-stage algorithms. Recently, one-stage algorithms have begun to match the accuracy of two-stage algorithms while retaining obvious advantages in speed. Lin argued that one-stage algorithms are less accurate than two-stage algorithms because of the imbalance between positive and negative samples, and proposed the focal loss to address it [30]. Further, anchor-free techniques [31,32] have once again begun to garner research attention. In comparison to anchor-based models, they are more suitable for small-target detection; representative models include FCOS [33] and YOLOX [28].

This article proposes improvements based on the YOLOX algorithm. In comparison to the YOLOv4 and YOLOv5 algorithms, YOLOX has a faster inference speed and better detection performance. Landslides often cause huge losses to industrial and agricultural production and endanger human life and safety, so landslide detection can provide a powerful tool for landslide forecasting and disaster prevention. Usually, when the landslide dataset is too small, the algorithm cannot be trained well enough to identify landslides accurately. Since landslide images differ from the images in mainstream datasets, it may be inappropriate to directly use transfer learning [34,35,36] to load the weights obtained from datasets such as COCO [37] or PASCAL VOC [38]. This study therefore proposes training a classification model on landslide images to provide transfer-learning weights for the object detection model.

Based on the aforementioned discussion, to further improve landslide detection capability, this study combines advanced deep learning techniques and proposes the following innovations based on the YOLOX model. Some of these strategies are also applicable to mainstream target detection.

(1): Propose a method to automatically adjust the loss ratio of the model to improve its detection capability.
(2): Modify the backbone network structure of the model, train and transfer the parameters of the landslide classification model, and solve the issue associated with directly transferring the parameters of the model obtained by pre-training the mainstream dataset.
(3): Propose an activation function that can automatically adjust the gradient, along with a new EMA to improve the robustness of the detection model.
(4): Introduce a new feature fusion structure to improve the detection performance of multi-scale landslides.

2. Data and Methods

2.1. Data Collection

The study area is located in the border area between southern Gansu Province and eastern Qinghai Province, between 102°30′–104°30′E and 35°00′–36°30′N in China, as shown in Figure 1, and is located on the Loess Plateau. This region experiences low average annual rainfall, high evaporation, a dry climate, and sparse vegetation, which can be confirmed through optical images of landslides. The landslide types in the study area are mainly loess landslides.

The remote sensing images used in this study are from the GEE platform. Since Google Earth stitches together images from various satellite sensors, the spectral characteristics, spatial resolution, and dates of the images used in this work are inevitably inconsistent. The spatial resolution of most of the images in this study is 1 × 1 m, and the date range of the images is 2015–2018. A visual interpretation of landslide identification and mapping was carried out in this study using ArcMap. To ensure the quality of landslide annotation, two geologists conducted the cross-validation of the landslide interpretation results. A total of 13,112 landslides were marked in this study.

In this study, loess landslide detection is formulated as an object-detection task. The original polygons of the landslide annotations are converted into the PASCAL VOC annotation format (i.e., the bounding rectangle of the landslide boundary) required by the object-detection algorithm. The landslide dataset is then randomly divided into training, validation, and test datasets containing 1305, 146, and 464 samples, respectively. The training dataset is used to train the deep-learning models, the validation dataset to select the best model, and the test dataset to evaluate the performance of that model. Considering the diversity of the spectrum, texture, landform, topography, landslide type, and scale, along with the tonal differences of the input image data in the landslide area, multiple validation and test areas were selected at the study site to evaluate the method more comprehensively and objectively. To detect landslide and non-landslide areas at the study site, Google Earth images of the validation and test areas were cropped into 2000 × 2000 pixel patches, starting from the upper left corner. To prevent landslide samples in the validation and test areas from being cut in two, adjacent patches overlap by 500 pixels along the cutting direction, a value chosen according to the average length of the landslides.
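The two preprocessing steps described above, converting an annotation polygon to its bounding rectangle and cutting a large image into overlapping patches, can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and the final patch placed flush with the far edge is an assumption, since the text only states the patch size (2000 pixels), the overlap (500 pixels), and the upper-left starting corner.

```python
def polygon_to_voc_box(polygon):
    """Return the (xmin, ymin, xmax, ymax) rectangle enclosing (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def tile_origins(length, patch=2000, overlap=500):
    """Start offsets of patches along one axis, beginning at 0 (upper left)."""
    stride = patch - overlap
    if length <= patch:
        return [0]
    origins = list(range(0, length - patch + 1, stride))
    # Assumption: add one last patch flush with the far edge so that
    # no pixels are left uncovered.
    if origins[-1] + patch < length:
        origins.append(length - patch)
    return origins


def tile_image(width, height, patch=2000, overlap=500):
    """List the (x0, y0, x1, y1) patch windows covering a width x height image."""
    return [
        (x, y, x + patch, y + patch)
        for y in tile_origins(height, patch, overlap)
        for x in tile_origins(width, patch, overlap)
    ]
```

With a stride of 1500 pixels, any landslide shorter than the 500-pixel overlap that is cut by one patch boundary falls entirely inside the neighbouring patch.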

2.2. YOLOX Model

The YOLOX model is an improved version of the YOLOv3 algorithm and offers the best trade-off between speed and accuracy in the current YOLO series of algorithms. In comparison to the YOLOv4 and YOLOv5 algorithms, its speed and accuracy are improved to a certain extent. Unlike YOLOv4 and YOLOv5, which predict targets relative to prior (anchor) boxes, YOLOX directly predicts the size of the target. This not only reduces the complexity of the model but also helps when the scales of the real targets (ground truth) vary widely, a situation that otherwise leads to poor regression of the detection box. As in standard target detection, remote-sensing images are used as the input to YOLOX, and the landslide annotations of these images are converted into the prediction targets of YOLOX. As YOLOX is trained, the model parameters are optimized so that it can automatically extract features from remote-sensing images and identify landslides.

2.2.1. YOLOX Data Enhancement

YOLOX adopts two data augmentation strategies, Mosaic [25] and Mixup [39], in the input process. Mosaic randomly scales, crops, and splices four images to form one training input. The original image is a 2000 × 2000 pixel image, as shown in Figure 2a, and the result after Mosaic enhancement is shown in Figure 2b. Mixup randomly superimposes two different images in a certain proportion (drawn from a Beta distribution); it is usually used to increase the number of samples in the dataset and improve the generalization ability of the model. Because of the particular characteristics of remote-sensing images, and based on our experiments, Mixup is disabled when building the model and only Mosaic data enhancement is used.

In the training process of the model, Mosaic data enhancement first traverses the training set and selects an image, then randomly selects three more images from the training set, and randomly scales and stitches the four images to obtain a composite image, which is resized to the input size of the model. There is therefore a high level of randomness in the selection and combination of images, and the many different combinations enrich the training set to a great extent. Since Mosaic data enhancement significantly increases the number of training samples seen by the model, the robustness of the model is significantly improved. It also reduces the scale of the detection targets, improving the detection of small targets. The final visual quality of remote-sensing images is easily affected by the atmosphere and easily mixed with noise, as shown in some images in Appendix A; because Mosaic data enhancement randomly combines four images, it helps improve the model’s anti-interference ability.
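To make the geometry of Mosaic concrete, the sketch below shows how bounding boxes from four source images could be remapped onto one composite canvas. It is a simplified illustration under stated assumptions, not the YOLOX implementation: each source image is resized to fill its quadrant instead of being randomly scaled and cropped, only the boxes are handled (no pixels), the function names are hypothetical, and the 640-pixel output size is an assumption that does not appear in this section.

```python
import random


def mosaic_layout(size, rng):
    """Pick a random centre and split the square canvas into four quadrants."""
    cx = rng.randint(size // 4, 3 * size // 4)
    cy = rng.randint(size // 4, 3 * size // 4)
    return [
        (0, 0, cx, cy),        # top-left
        (cx, 0, size, cy),     # top-right
        (0, cy, cx, size),     # bottom-left
        (cx, cy, size, size),  # bottom-right
    ]


def place_boxes(boxes, src_size, quad):
    """Scale boxes from a square source image of side src_size into a quadrant."""
    qx0, qy0, qx1, qy1 = quad
    sx = (qx1 - qx0) / src_size
    sy = (qy1 - qy0) / src_size
    return [
        (qx0 + x0 * sx, qy0 + y0 * sy, qx0 + x1 * sx, qy0 + y1 * sy)
        for x0, y0, x1, y1 in boxes
    ]


def mosaic_boxes(samples, src_size=2000, out_size=640, seed=0):
    """Combine the box lists of four images into one composite annotation."""
    rng = random.Random(seed)
    quads = mosaic_layout(out_size, rng)
    merged = []
    for boxes, quad in zip(samples, quads):
        merged.extend(place_boxes(boxes, src_size, quad))
    return quads, merged
```

Because every source image is shrunk into a fraction of the canvas, each landslide box becomes smaller than in the original image, which is the effect the text credits with improving small-target detection.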

2.2.2. YOLOX Model Structure

After data enhancement, the YOLOX model uses Focus-CSPDarknet53 as the backbone network to extract features and introduces the path aggregation feature pyramid network (PAFPN) [40] to fuse semantic and geometric features for outputs of different scales, effectively improving the accuracy of target detection. In comparison to YOLOv4 and YOLOv5, the introduction of three decoupled heads at the output end leads to fewer parameters and a faster computation speed. Unlike YOLOv5 and YOLOv4, YOLOX uses the SimOTA [28] strategy for positive and negative sample allocation. YOLOX’s anchor-free strategy also allows it to perform well in small-target detection. The several model versions provided by YOLOX have few parameters and good detection accuracy and speed, which makes them convenient to deploy. Therefore, this study chooses to improve YOLOX to achieve better landslide detection. Figure 3 shows the overall structure of YOLOX, where the decoupled head is responsible for the final landslide detection.

2.2.3. Model Evaluation Indicators

To evaluate the effectiveness of landslide detection from a comprehensive perspective, this study selects the average precision (AP) as the main evaluation index. AP is a comprehensive consideration of ‘Precision and Recall’, used to evaluate the effectiveness of the model.

The precision rate refers to the proportion of landslides that are correctly predicted as landslides in the predicted results, as shown in Formula (1); the recall rate refers to the proportion of landslides that are correctly predicted as landslides to actual landslides, as shown in Formula (2). Among them, TP refers to the correct prediction as a positive example; FP refers to an incorrect prediction as a positive example; and FN refers to an incorrect prediction as a negative example.

\mathrm{Precision} = \frac{TP}{TP + FP}    (1)

\mathrm{Recall} = \frac{TP}{TP + FN}    (2)

The average precision can be used to evaluate the landslide prediction performance. It is the integral of precision over recall, i.e., the area under the precision–recall (P–R) curve on the interval [0, 1], as given in Formula (3), where p is precision and r is recall.

AP = \int_{0}^{1} p(r) \, dr    (3)
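Formulas (1) to (3) can be evaluated numerically as in the following sketch. It assumes that each detection has already been matched against the ground truth and flagged as a true or false positive, and it approximates the integral in Formula (3) with the all-point interpolation commonly used for PASCAL VOC; the text does not say which interpolation the authors used, so that choice and the function names are assumptions.

```python
def precision_recall(tp, fp, fn):
    """Formulas (1) and (2) from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(scored_hits, n_ground_truth):
    """Area under the P-R curve.

    scored_hits: list of (confidence, is_true_positive) pairs, one per detection.
    n_ground_truth: number of annotated landslides.
    """
    if n_ground_truth == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [0.0], [1.0]
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum rectangle areas between consecutive recall values.
    return sum(
        (recalls[k] - recalls[k - 1]) * precisions[k]
        for k in range(1, len(recalls))
    )
```

For example, three detections ranked as hit, miss, hit against two annotated landslides give recall steps of 0.5 and 1.0 with interpolated precisions of 1 and 2/3.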

2.3. YOLOX Model Improvement

2.3.1. Automatic Selection of the Loss Function Scale

Usually, the task of the target detection algorithm includes two types of loss: classification and regression. In the YOLOX model, the loss is mainly composed of classification loss (detecting the target category and confidence) and regression (detecting target position) loss. In the training process, the ratio is 1:1, by default. The weight of the loss function has a significant effect on model convergence. This study provides a method for selecting the ratio of the loss function.

One way to determine the coefficient ratio of the loss function is to use supervised machine learning over many groups of loss ratios and their corresponding AP values, but this is a time-consuming process. Therefore, a faster method is adopted. First, the ratio of classification to regression loss is set to 1:2 and to 2:1, with 100 rounds of training for each setting. The evaluation indicators of the two results are compared, and the loss ratio for the final training setting is determined from the AP values obtained. As shown in Figure 4, comparing the APs trained under the two classification and regression ratios gives a loss ratio of 5.47:1.

Applying this automatic selection of the regression and classification loss ratio when using the YOLOX model for landslide detection saves a lot of training time and improves the detection results (Table 1).
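The role of the loss ratio can be illustrated with the small sketch below, which weights the two loss terms and collects the AP of one trial per candidate ratio. It is only an outline: `train_and_eval` is a hypothetical callback standing in for a full training and evaluation run, and the rule that maps the two AP values to the final 5.47:1 ratio is not given in the text, so the sketch stops at the comparison step.

```python
def combined_loss(cls_loss, reg_loss, cls_weight=1.0, reg_weight=1.0):
    """Weighted sum of classification and regression loss (default ratio 1:1)."""
    return cls_weight * cls_loss + reg_weight * reg_loss


def compare_trial_ratios(train_and_eval, ratios=((1.0, 2.0), (2.0, 1.0))):
    """Run one trial per (cls_weight, reg_weight) pair and return {ratio: AP}.

    train_and_eval is a hypothetical callable that trains a model with the
    given weights and returns the AP measured on the validation data.
    """
    return {ratio: train_and_eval(*ratio) for ratio in ratios}
```

The returned dictionary holds the two AP values from which the final ratio is then derived.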

2.3.2. Backbone Network

The backbone network of the YOLOX model uses the Focus structure and CSPDarknet. The Focus structure reduces the number of parameters and calculations of the model, with a negligible loss of information. The CSPDarknet structure has also achieved good results when implemented in algorithms such as YOLOv4. Due to the specific features of landslide detection and the complex nature of landslide images, the residual module in the front end of the backbone network is modified on the basis of the Focus-CSPDarknet structure to reduce the loss of detailed features [41]. Figure 5a is a structure based on residuals before modification [42], and Figure 5b is the modified structure. A new residual structure is added to the original network to reduce the loss of details and features of remote-sensing images.

To ensure the validity of the model evaluation metrics, in the training dataset of landslide detection used in this study, the Python programming language is used to locate and save the image features of the landslide according to the landslide annotations, and an equal number of images of non-landslide areas are randomly selected and saved to construct a binary classification dataset of landslides. The convolutional model of binary classification is obtained by linking the improved backbone network to the classification convolutional structure, and the classification model is trained using the binary classification landslide dataset. Since each original image may contain more than one landslide region, the number of images in the constructed binary classification dataset will be larger, and the backbone network of the model can also learn the extraction capability of landslide features in advance. Furthermore, the parameters of the backbone network of the trained classification model can be transferred to the backbone network of the target detection model to reduce the impact of directly transferring the parameters of the backbone network from the mainstream dataset training process.

2.3.3. Activation Function

The YOLOX model uses SiLU (Formula (4)) as the activation function for the backbone network; its shape is similar to that of the Mish activation function. During model training, the activation function affects the gradient used in backpropagation and thus the overall performance of the model. In practical tasks, slightly adjusting the parameters of the activation function may improve the results, as illustrated in Figure 6 and Formulas (5) and (6): in the final experimental results, the AP obtained with the activation function of Formula (5) is improved, while the AP obtained with Formula (6) is slightly reduced. However, manually adjusting these parameters is time-intensive because each trial requires training the YOLOX detection model, and the detection model contains multiple SiLU activation functions, so it is difficult to select appropriate parameters. Hence, this study proposes the AutoSiLU activation function, shown in Formula (7), which adds a variable coefficient i to SiLU and uses backpropagation to automatically adjust the coefficient of each activation function. This removes the need for complex, time-intensive manual adjustment and the difficulty of finding suitable parameters.

\mathrm{Activation} = X \times \mathrm{sigmoid}(X)    (4)

\mathrm{Activation} = X \times \mathrm{sigmoid}(X - 1)    (5)

\mathrm{Activation} = X \times \mathrm{sigmoid}(X + 1)    (6)

\mathrm{Activation} = X \times \mathrm{sigmoid}(X + i)    (7)
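The idea behind Formula (7) can be shown with a dependency-free sketch in which the coefficient i (called `shift` here) is updated by gradient descent. This is not the authors' PyTorch module: the squared-error loss, the learning rate, and the function names are assumptions chosen only to show that the coefficient is learnable, using the derivative of X · sigmoid(X + i) with respect to i, which is X · s · (1 − s) with s = sigmoid(X + i).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def auto_silu(x, shift):
    """Formula (7): x * sigmoid(x + shift); shift = 0 gives plain SiLU."""
    return x * sigmoid(x + shift)


def auto_silu_grad_shift(x, shift):
    """Partial derivative of auto_silu with respect to the shift."""
    s = sigmoid(x + shift)
    return x * s * (1.0 - s)


def sgd_step_on_shift(xs, targets, shift, lr=0.1):
    """One gradient step on a mean squared error, updating only the shift."""
    grad = 0.0
    for x, t in zip(xs, targets):
        err = auto_silu(x, shift) - t
        grad += 2.0 * err * auto_silu_grad_shift(x, shift)
    return shift - lr * grad / len(xs)
```

Setting `shift` to -1 or +1 reproduces Formulas (5) and (6), so the learnable version covers both hand-tuned variants and lets training choose a value for each activation.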

2.3.4. Improved EMA

The YOLOX algorithm adopts the EMA [28,43] algorithm when adjusting the model parameters; that is, the weights obtained by training are updated using Formula (8). W_e represents the weights used each time (after the weighted average), λ is the coefficient of the weights obtained in each training iteration, W represents the weights obtained in each training iteration, and N is the number of training iterations.

W_e[i + 1] = (1 - λ) W_e[i] + λ W[i + 1], \quad i = 1, \dots, N - 1    (8)

In comparison to the original EMA, which uses a constant coefficient to update the weights of the network, the improved EMA can choose a relatively small λ parameter in the early stages of training and a relatively large λ parameter in the later stages, with a nonlinear function gradually increasing the value of λ with each training iteration. Compared with the original EMA, the improved EMA converges quickly in the early stages of training and more stably in the later stages. This study mainly uses two nonlinear functions, arctan (Figure 7a) and sigmoid (Figure 7b), and updates the parameters obtained in each training iteration through Formula (9).

W_e[i + 1] = Func(λ) \cdot W_e[i] + (1 - Func(λ)) \cdot W[i + 1], \quad i = 1, \dots, N - 1    (9)
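A weighted-average update with an iteration-dependent coefficient, in the form of Formula (9), might look like the sketch below. The exact arctan and sigmoid schedules used by the authors are only shown in Figure 7, so the parameterizations here (the `steepness` value, the normalization by the total number of steps, and the function names) are assumptions made for illustration.

```python
import math


def sigmoid_coeff(step, total_steps, steepness=10.0):
    """Assumed sigmoid schedule: rises smoothly from near 0 to near 1."""
    t = step / total_steps
    return 1.0 / (1.0 + math.exp(-steepness * (t - 0.5)))


def arctan_coeff(step, total_steps, steepness=10.0):
    """Assumed arctan schedule: starts at 0 and saturates below 1."""
    t = step / total_steps
    return math.atan(steepness * t) / (math.pi / 2)


def ema_update(avg, new, coeff):
    """Formula (9): coeff weights the running average, 1 - coeff the new weights."""
    return [coeff * a + (1.0 - coeff) * w for a, w in zip(avg, new)]
```

With either schedule the averaged weights follow the freshly trained weights closely at the start and change more slowly as the coefficient grows, which matches the fast early convergence and stable late behaviour described above.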

2.3.5. Feature Fusion

The Neck structure of the YOLOX model is the structure of the PAFPN [40] (Figure 8a). The final detection head consists of three scales, 20 × 20, 40 × 40, and 80 × 80, which are mainly responsible for detecting large-area landslides, medium-area landslides, and small-area landslides, respectively. From the perspective of the complex features of remote-sensing image data, in order to better promote the fusion of shallow detail features and deep semantic features, this study proposes an improved PAFPN structure based on the YOLOX model.

In the first variant, three new convolution modules are added at the back end of the PAFPN, and the up-sampling structure is used to fuse the features (Figure 8b). In comparison to the original YOLOX model, the AP value obtained in the experiments is improved, and the overall inference speed of the model remains relatively unchanged. In the second variant, upsampling is introduced at the front end of the PAFPN structure (Figure 8c), and the AP value obtained in the experiments is slightly reduced.

3. Results

The AP value of the improved model is 95.43%, and the recall rate and precision rate are similarly good. In comparison to YOLOX, the AP value is improved by 6.98%. A certain area in the test set is randomly selected for testing. Figure 9a shows the image of the real landslide area, and Figure 9b is the detection result obtained by the YOLOX algorithm. It can be seen that there are a large number of landslide areas that have not been detected. Figure 9c is the result obtained by the improved YOLOX algorithm, which can better detect the locations of landslides than the algorithm before the improvement (Figure 9b).

The results of the individual improvements are shown in Table 2. Among them, YOLOX_A adopts the method of automatic loss adjustment; YOLOX_B adopts automatic loss adjustment + a new backbone network; YOLOX_C adopts automatic loss adjustment + a new backbone network + AutoSiLU; YOLOX_D adopts automatic loss adjustment + a new backbone network + AutoSiLU + EMA_Arctan; YOLOX_E adopts automatic loss adjustment + a new backbone network + AutoSiLU + EMA_Sigmoid; and YOLOX_F adopts automatic loss adjustment + a new backbone network + AutoSiLU + EMA_Sigmoid + feature fusion (Figure 8b). The improved YOLOX algorithm achieves a clear improvement in landslide detection. The proposed automatic loss adjustment, AutoSiLU, EMA_Arctan, and EMA_Sigmoid improve the detection capability of the YOLOX model without reducing the detection speed, whereas the new backbone network and the new feature fusion improve the detection capability with a small loss in detection speed. Even so, the improved models are still fast in comparison to Faster R-CNN and YOLOv4.

4. Discussion

By improving the YOLOX model, the AP on the test set is improved by 6.98%. On the test set, the number of real landslides is 2748, the number of model predictions is 2853, and the number of correct predictions is 2453. The precision reaches 96.32%, the recall reaches 89.23%, and the F1 value is 0.93.

The improved YOLOX can more accurately predict the location of regional landslides and reduce false detections. In the upper right corner of Figure 10b, the original YOLOX model produces a false detection, while the improved YOLOX model accurately predicts the location of the landslide.

The improved YOLOX model has a higher recall rate. The landslide in the upper left corner of Figure 11a was missed by the YOLOX model, while the improved YOLOX model successfully predicted the landslide in the upper left corner.

Due to the complexity of the texture features of remote-sensing image data, the shadow area formed by sunlight on the ridge or the top of the slope in the image can easily lead to incorrect model predictions. In addition, the detection performance for small area landslides still needs to be improved. Usually, there is less information available for small targets, and with the deepening of the convolutional neural network, the corresponding pixel values of the small target samples in the detection feature map are compressed, which makes the detection of small targets difficult; this is a common issue in many models today.

Although missed detections are reduced to a certain extent in the improved YOLOX model, Figure 12 shows that it still misses small-area landslides.

As shown in Figure 13, the improved YOLOX model is prone to false detections at the edges of remote-sensing images, which affects the accuracy of the final model, to a certain extent.

In practical applications, high-resolution remote-sensing images are cut into images with the same resolution as the training images; each cut image is used for landslide detection separately, and the detection results are then integrated. Thus, the performance of the model over a large study area is in line with the trend of the evaluation metrics on the test set. In view of the abovementioned issues, future work will focus on the following two aspects: (1) improving the detection ability of the model for small-area landslides and (2) enriching the landslide dataset and reducing the probability of false and missed detections in shadow areas and at the edges of remote-sensing images.
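The cut, detect, and integrate workflow described above could be organized as in the following sketch. The text does not state how the per-patch results are integrated, so merging duplicates from overlapping patches with greedy non-maximum suppression (NMS) is an assumption, as are the function names and the 0.5 IoU threshold; the detector itself is left out and detections are plain tuples.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1, ...)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def to_global(detections, tile_x, tile_y):
    """Shift patch-local (x0, y0, x1, y1, score) boxes into full-image coordinates."""
    return [
        (x0 + tile_x, y0 + tile_y, x1 + tile_x, y1 + tile_y, score)
        for x0, y0, x1, y1, score in detections
    ]


def merge_detections(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box among heavily overlapping ones."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) < iou_threshold for k in kept):
            kept.append(det)
    return kept
```

A landslide lying in the overlap between two neighbouring patches is detected twice; after shifting both results into full-image coordinates, the merge step keeps only the higher-scoring box.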

5. Conclusions

The method in this article was implemented in Python with the PyTorch 1.7.1 (Facebook AI Research, New York, NY, USA) deep-learning framework. Using the YOLOX algorithm as a basis, an improved YOLOX algorithm was proposed to identify regional landslides in 1451 images containing landslides in the loess area. The contributions of this study are as follows:

(a): A method is introduced to automatically select the ratio of regression loss to classification loss. To a certain extent, this reduces the time spent manually selecting the ratio and retraining the model multiple times, and it improves the detection performance of the model.
(b): A two-layer residual network structure is used as the front end of the YOLOX backbone network to improve its ability to extract complex information. The landslide image features in the object detection training set are then extracted to construct a landslide classification dataset, and the parameters of the object detection model are initialized from the trained classification model through transfer learning.
(c): The AutoSiLU activation function is proposed, based on the SiLU activation function of YOLOX. Compared with SiLU, it can automatically optimize its functional form through gradient backpropagation, which gives the model greater flexibility and better generalization.
(d): An improved EMA (exponential moving average) is introduced, and two variable-parameter methods are proposed to update the overall parameters of the model, which improves the generalization of the model to a certain extent.
(e): A new fusion method is proposed for the PAFPN structure of the neck network; it promotes the fusion of shallow shape features and deep semantic features and improves the generalization of the overall network.
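To make contribution (d) more concrete, the sketch below shows a plain EMA update of model weights together with two step-dependent decay schedules, one shaped by a sigmoid and one by an arctangent, following the two curve types named in the caption of Figure 7. The exact schedules of the improved EMA are defined in Section 2.3.4; the formulas and the constants used here (`max_decay`, `midpoint`, `scale`) are hypothetical and serve only to illustrate the idea of a variable update weight.

```python
import math


def sigmoid_decay(step, max_decay=0.9998, midpoint=2000.0, scale=500.0):
    """Hypothetical sigmoid-shaped decay that rises towards max_decay."""
    return max_decay / (1.0 + math.exp(-(step - midpoint) / scale))


def arctan_decay(step, max_decay=0.9998, scale=500.0):
    """Hypothetical arctan-shaped decay: 0 at step 0, approaching max_decay."""
    return max_decay * (2.0 / math.pi) * math.atan(step / scale)


def ema_update(ema_weights, weights, decay):
    """Standard EMA step: ema = decay * ema + (1 - decay) * current weights."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


# Toy example with a single scalar "weight" that the optimizer moves towards 1.0.
ema = [0.0]
for step, current in enumerate([0.2, 0.5, 0.8, 1.0], start=1):
    ema = ema_update(ema, [current], arctan_decay(step))
    print(step, round(ema[0], 4))
```

With either schedule the averaged weights track the raw weights closely early in training, when the decay is small, and change more slowly later, which is the usual motivation for a variable EMA weight.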

The final improved YOLOX model achieves an AP of 95.43%, a precision of 96.32%, a recall of 89.23%, an F1 of 0.93, and 106 FPS on the test set. Compared with the original YOLOX algorithm, the improved algorithm is more accurate (AP of 95.43% versus 88.45%) while retaining fast detection (106 FPS versus 116 FPS), which gives it better practical value for landslide detection. Although the improved YOLOX method achieves better results, misidentifications remain because of the variability of landslide morphology and the complexity of natural environmental textures. There is therefore still significant room for improvement in using deep learning for more refined classification and identification of landslides. With the increasing maturity of object detection technology and improvements in remote-sensing satellite technology, deep-learning-based landslide detection in remote-sensing images is expected to make great progress.

Author Contributions

Z.Y.: methodology, validation, conceptualization, data curation, writing-original draft, and software. R.C.: writing-review and editing, funding acquisition, supervision, and project administration. Z.C.: investigation, formal analysis, resources, and visualization. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Program of the Sichuan Provincial Science and Technology Department (grant number 2022YFS0486), the Remote Sensing Identification and Monitoring Project of Geological Hazards in Sichuan Province (grant number 510201202076888), and the National Geological Disaster Identification Project of the Ministry of Natural Resources (grant number 073320180876/2).

Data Availability Statement

Not applicable.

Acknowledgments

The authors are grateful for helpful comments from many researchers and colleagues.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figure 1. Distribution of landslides at the study site, located in southern Gansu Province and eastern Qinghai Province.

Figure 2. Mosaic data enhancement. The solid red boxes mark the real landslide areas, and the dotted boxes help the reader follow how the landslide areas are transformed by data enhancement. (a) Four randomly selected images with landslide labels; (b) A Mosaic-enhanced image.

Figure 3. Flowchart of YOLOX model used for landslide detection.

Figure 4. Flowchart for automatic selection of the loss function scale.

Figure 5. Residual module of backbone network. (a) Residual module in YOLOX; (b) New residual module in improved YOLOX.

Figure 6. Activation function comparison chart.

Figure 7. Parameters used by the improved EMA algorithm. The vertical axis values represent the weights of the model update parameters. (a) Sigmoid activation function; (b) Arctan(x) activation function.

Figure 8. YOLOX’s feature fusion and improved feature fusion structures. (a) PAFPN structure in YOLOX; (b) The back-end of PAFPN adds an upsampling feature fusion structure; (c) The front-end of PAFPN adds an upsampling feature fusion structure.

Figure 9. Real landslide and identification results of the YOLOX model and the improved YOLOX model. (a) Real landslide; (b) YOLOX model prediction results; (c) Improved YOLOX model prediction results.

Figure 10. Real landslide and identification results of the YOLOX model and the improved YOLOX model. (a) Real landslide; (b) YOLOX model prediction results; (c) Improved YOLOX model prediction results.

Figure 11. Real landslide and identification results of the YOLOX model and the improved YOLOX model. (a) Real landslide; (b) YOLOX model prediction results; (c) Improved YOLOX model prediction results.

Figure 12. Real landslide and identification results of the improved YOLOX model. (a) Real landslide; (b) Improved YOLOX model prediction results.

Figure 13. Real landslide and identification results of the improved YOLOX model. (a) Real landslide; (b) Improved YOLOX model prediction results.

Table 1. AP, Precision, and Recall indicators under different loss scales of YOLOX.

Loss ratio (regression:classification)	AP (%)	Precision (%)	Recall (%)
Loss 1:1	88.45	90.48	81.31
Loss 1:2	88.26 (−0.19)	88.02	82.17
Loss 2:1	89.49 (+1.04)	86.76	83.58
Loss 4:1	89.56 (+1.11)	88.26	82.82
Loss 5:1	89.88 (+1.43)	88.21	83.41
Loss 6:1	89.63 (+1.18)	88.02	82.93
Plan (Loss 5.47:1)	90.37 (+1.92)	87.41	84.89
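The ratios in Table 1 can be read as weights in a combined loss. The sketch below assumes the simplest form, a weighted sum of a regression term and a classification term (other loss terms are omitted), and re-derives the AP differences shown in parentheses relative to the 1:1 baseline. The function `combined_loss` and its argument names are illustrative and are not taken from the paper's code.

```python
def combined_loss(reg_loss, cls_loss, reg_weight, cls_weight=1.0):
    """Weighted sum of regression and classification losses (assumed form)."""
    return reg_weight * reg_loss + cls_weight * cls_loss


# AP (%) per regression:classification ratio, copied from Table 1.
AP_BY_RATIO = {
    "1:1": 88.45,
    "1:2": 88.26,
    "2:1": 89.49,
    "4:1": 89.56,
    "5:1": 89.88,
    "6:1": 89.63,
    "5.47:1": 90.37,  # ratio chosen by the automatic selection plan
}

baseline = AP_BY_RATIO["1:1"]
deltas = {ratio: round(ap - baseline, 2) for ratio, ap in AP_BY_RATIO.items()}
best = max(AP_BY_RATIO, key=AP_BY_RATIO.get)

print(deltas)
print("best ratio:", best)

# Example: hypothetical per-batch loss values combined at the selected ratio.
print(combined_loss(reg_loss=0.4, cls_loss=0.2, reg_weight=5.47))
```

The recomputed differences match the values printed in the table, and the automatically selected ratio gives the highest AP among the listed settings.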

Table 2. AP, Precision, Recall, F1 and FPS (Frames Per Second) indicators of different object detection models.

Model	AP (%)	Precision (%)	Recall (%)	F1	FPS
Faster R-CNN	72.46	65.82	82.77	0.73	27
YOLOv4	76.32	83.21	72.56	0.78	86
YOLOv5	83.92	86.59	78.35	0.82	112
YOLOX	88.45	90.48	81.31	0.86	116
YOLOX_A	90.37 (+1.92)	87.41	84.89	0.86	116
YOLOX_B	91.87 (+3.42)	89.58	86.38	0.88	114
YOLOX_C	92.78 (+4.33)	90.10	89.23	0.90	114
YOLOX_D	94.21 (+5.76)	94.23	89.11	0.92	114
YOLOX_E	94.23 (+5.78)	93.66	90.21	0.92	114
YOLOX_F	95.43 (+6.98)	96.32	89.23	0.93	106
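As a consistency check on Table 2, F1 can be recomputed from the reported precision and recall using the standard definition F1 = 2PR / (P + R). The short script below does this for every row; the variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1) copied from Table 2.
TABLE_2 = {
    "Faster R-CNN": (65.82, 82.77, 0.73),
    "YOLOv4": (83.21, 72.56, 0.78),
    "YOLOv5": (86.59, 78.35, 0.82),
    "YOLOX": (90.48, 81.31, 0.86),
    "YOLOX_A": (87.41, 84.89, 0.86),
    "YOLOX_B": (89.58, 86.38, 0.88),
    "YOLOX_C": (90.10, 89.23, 0.90),
    "YOLOX_D": (94.23, 89.11, 0.92),
    "YOLOX_E": (93.66, 90.21, 0.92),
    "YOLOX_F": (96.32, 89.23, 0.93),
}

recomputed = {
    name: round(f1_score(p / 100, r / 100), 2)
    for name, (p, r, _) in TABLE_2.items()
}
mismatches = [
    name for name, (_, _, reported) in TABLE_2.items()
    if recomputed[name] != reported
]

print(recomputed)
print("rows that disagree with the table:", mismatches)
```

Every recomputed value agrees with the reported F1 to two decimal places, so the precision, recall, and F1 columns are mutually consistent.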

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
