Remote Sensing

Research

23 pages, 7788 KiB

Open AccessArticle

A Novel Mamba Architecture with a Semantic Transformer for Efficient Real-Time Remote Sensing Semantic Segmentation

by Hao Ding, Bo Xia, Weilin Liu, Zekai Zhang, Jinglin Zhang, Xing Wang and Sen Xu

Remote Sens. 2024, 16(14), 2620; https://doi.org/10.3390/rs16142620 - 17 Jul 2024

Cited by 8 | Viewed by 2689

Abstract

Real-time remote sensing segmentation technology is crucial for unmanned aerial vehicles (UAVs) in battlefield surveillance, land characterization observation, earthquake disaster assessment, etc., and can significantly enhance the application value of UAVs in military and civilian fields. To realize this potential, it is essential to develop real-time semantic segmentation methods that can be applied to resource-limited platforms, such as edge devices. The majority of mainstream real-time semantic segmentation methods rely on convolutional neural networks (CNNs) and transformers. However, CNNs cannot effectively capture long-range dependencies, while transformers have high computational complexity. This paper proposes a novel remote sensing Mamba architecture for real-time segmentation tasks in remote sensing, named RTMamba. Specifically, the backbone utilizes a Visual State-Space (VSS) block to extract deep features and maintains linear computational complexity, thereby capturing long-range contextual information. Additionally, a novel Inverted Triangle Pyramid Pooling (ITP) module is incorporated into the decoder. The ITP module can effectively filter redundant feature information and enhance the perception of objects and their boundaries in remote sensing images. Extensive experiments were conducted on three challenging aerial remote sensing segmentation benchmarks, including Vaihingen, Potsdam, and LoveDA. The results show that RTMamba achieves competitive performance advantages in terms of segmentation accuracy and inference speed compared to state-of-the-art CNN and transformer methods. To further validate the deployment potential of the model on embedded devices with limited resources, such as UAVs, we conducted tests on the Jetson AGX Orin edge device. The experimental results demonstrate that RTMamba achieves impressive real-time segmentation performance. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

26 pages, 17177 KiB

Open AccessArticle

Direction of Arrival Joint Prediction of Underwater Acoustic Communication Signals Using Faster R-CNN and Frequency–Azimuth Spectrum

by Le Cheng, Yue Liu, Bingbing Zhang, Zhengliang Hu, Hongna Zhu and Bin Luo

Remote Sens. 2024, 16(14), 2563; https://doi.org/10.3390/rs16142563 - 12 Jul 2024

Cited by 2 | Viewed by 817

Abstract

Utilizing hydrophone arrays for detecting underwater acoustic communication (UWAC) signals leverages spatial information to enhance detection efficiency and expand the perceptual range. This study redefines the task of UWAC signal detection as an object detection problem within the frequency–azimuth (FRAZ) spectrum. Employing Faster R-CNN as a signal detector, the proposed method facilitates the joint prediction of UWAC signals, including estimates of the number of sources, modulation type, frequency band, and direction of arrival (DOA). The proposed method extracts precise frequency and DOA features of the signals without requiring prior knowledge of the number of signals or frequency bands. Instead, it extracts these features jointly during training and applies them to perform joint predictions during testing. Numerical studies demonstrate that the proposed method consistently outperforms existing techniques across all signal-to-noise ratios (SNRs), particularly excelling in low SNRs. It achieves a detection F1 score of 0.96 at an SNR of −15 dB. We further verified its performance under varying modulation types, numbers of sources, grating lobe interference, strong signal interference, and array structure parameters. Furthermore, the practicality and robustness of our approach were evaluated in lake-based UWAC experiments, and the model trained solely on simulated signals performed competitively in the trials. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

36 pages, 2623 KiB

Open AccessArticle

Change Detection Methods for Remote Sensing in the Last Decade: A Comprehensive Review

by Guangliang Cheng, Yunmeng Huang, Xiangtai Li, Shuchang Lyu, Zhaoyang Xu, Hongbo Zhao, Qi Zhao and Shiming Xiang

Remote Sens. 2024, 16(13), 2355; https://doi.org/10.3390/rs16132355 - 27 Jun 2024

Cited by 28 | Viewed by 8987

Abstract

Change detection is an essential and widely utilized task in remote sensing that aims to detect and analyze changes occurring in the same geographical area over time, which has broad applications in urban development, agricultural surveys, and land cover monitoring. Detecting changes in remote sensing images is a complex challenge due to various factors, including variations in image quality, noise, registration errors, illumination changes, complex landscapes, and spatial heterogeneity. In recent years, deep learning has emerged as a powerful tool for feature extraction and addressing these challenges. Its versatility has resulted in its widespread adoption for numerous image-processing tasks. This paper presents a comprehensive survey of significant advancements in change detection for remote sensing images over the past decade. We first introduce some preliminary knowledge for the change detection task, such as problem definition, datasets, evaluation metrics, and transformer basics, and provide a detailed taxonomy of existing algorithms from three different perspectives: algorithm granularity, supervision modes, and frameworks in the Methodology section. This survey enables readers to gain systematic knowledge of change detection tasks from various angles. We then summarize the state-of-the-art performance on several dominant change detection datasets, providing insights into the strengths and limitations of existing algorithms. Based on our survey, some future research directions for change detection in remote sensing are identified. This survey paper sheds some light on the topic for the community and will inspire further research efforts in the change detection task. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

23 pages, 3678 KiB

Open AccessArticle

Enhanced Wind Field Spatial Downscaling Method Using UNET Architecture and Dual Cross-Attention Mechanism

by Jieli Liu, Chunxiang Shi, Lingling Ge, Ruian Tie, Xiaojian Chen, Tao Zhou, Xiang Gu and Zhanfei Shen

Remote Sens. 2024, 16(11), 1867; https://doi.org/10.3390/rs16111867 - 23 May 2024

Cited by 1 | Viewed by 1212

Abstract

Before 2008, China lacked high-coverage regional surface observation data, making it difficult for the China Meteorological Administration Land Data Assimilation System (CLDAS) to directly backtrack high-resolution, high-quality land assimilation products. To address this issue, this paper proposes a deep learning model named UNET_DCA, based on the UNET architecture, which incorporates a Dual Cross-Attention module (DCA) for multiscale feature fusion by introducing Channel Cross-Attention (CCA) and Spatial Cross-Attention (SCA) mechanisms. This model focuses on the near-surface 10-m wind field and achieves spatial downscaling from 6.25 km to 1 km. We conducted training and validation using data from 2020–2021, tested with data from 2019, and performed ablation experiments to validate the effectiveness of each module. We compared the results with traditional bilinear interpolation methods and the SNCA-CLDASSD model. The experimental results show that the UNET-based model outperforms SNCA-CLDASSD, indicating that the UNET-based model captures richer information in wind field downscaling compared to SNCA-CLDASSD, which relies on sequentially stacked CNN convolution modules. UNET_CCA and UNET_SCA, incorporating cross-attention mechanisms, outperform UNET without attention mechanisms. Furthermore, UNET_DCA, incorporating both Channel Cross-Attention and Spatial Cross-Attention mechanisms, outperforms UNET_CCA and UNET_SCA, which only incorporate one attention mechanism. UNET_DCA performs best on the RMSE, MAE, and COR metrics (0.40 m/s, 0.28 m/s, 0.93), while UNET_DCA_ars, incorporating more auxiliary information, performs best on the PSNR and SSIM metrics (29.006, 0.880). Evaluation across different methods indicates that the optimal model performs best in valleys, followed by mountains, and worst in plains; it performs worse during the day and better at night; and as wind speed levels increase, accuracy decreases. 
Overall, among various downscaling methods, UNET_DCA and UNET_DCA_ars effectively reconstruct the spatial details of wind fields, providing a deeper exploration for the inversion of high-resolution historical meteorological grid data. Full article
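
The abstract above reports RMSE, MAE, and COR for the downscaled wind field. As a rough illustration of how such scores are typically computed, here is a minimal sketch in plain Python. It assumes COR denotes the Pearson correlation coefficient, which the abstract does not state, and the sample wind speeds are invented purely for illustration.

```python
import math

def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))

def mae(pred, truth):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)

def pearson_cor(pred, truth):
    """Pearson correlation coefficient (assumed meaning of COR)."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    return cov / math.sqrt(var_p * var_t)

# Hypothetical 10-m wind speeds in m/s (not data from the paper).
predicted = [2.0, 3.5, 5.0, 6.5]
observed = [2.5, 3.0, 5.5, 6.0]

print(rmse(predicted, observed))         # 0.5
print(mae(predicted, observed))          # 0.5
print(pearson_cor(predicted, observed))
```

Lower RMSE and MAE and a COR closer to 1 indicate a closer match between the downscaled field and the reference.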

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

22 pages, 7704 KiB

Open AccessArticle

Zero-Shot Sketch-Based Remote-Sensing Image Retrieval Based on Multi-Level and Attention-Guided Tokenization

by Bo Yang, Chen Wang, Xiaoshuang Ma, Beiping Song, Zhuang Liu and Fangde Sun

Remote Sens. 2024, 16(10), 1653; https://doi.org/10.3390/rs16101653 - 7 May 2024

Cited by 1 | Viewed by 1376

Abstract

Effectively and efficiently retrieving images from remote-sensing databases is a critical challenge in the realm of remote-sensing big data. Utilizing hand-drawn sketches as retrieval inputs offers intuitive and user-friendly advantages, yet the potential of multi-level feature integration from sketches remains underexplored, leading to suboptimal retrieval performance. To address this gap, our study introduces a novel zero-shot, sketch-based retrieval method for remote-sensing images, leveraging multi-level feature extraction, self-attention-guided tokenization and filtering, and cross-modality attention update. This approach employs only vision information and does not require semantic knowledge concerning the sketch and image. It starts by employing multi-level self-attention guided feature extraction to tokenize the query sketches, as well as self-attention feature extraction to tokenize the candidate images. It then employs cross-attention mechanisms to establish token correspondence between these two modalities, facilitating the computation of sketch-to-image similarity. Our method significantly outperforms existing sketch-based remote-sensing image retrieval techniques, as evidenced by tests on multiple datasets. Notably, it also exhibits robust zero-shot learning capabilities in handling unseen categories and strong domain adaptation capabilities in handling unseen novel remote-sensing data. The method’s scalability can be further enhanced by the pre-calculation of retrieval tokens for all candidate images in a database. This research underscores the significant potential of multi-level, attention-guided tokenization in cross-modal remote-sensing image retrieval. For broader accessibility and research facilitation, we have made the code and dataset used in this study publicly available online. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

24 pages, 39892 KiB

Open AccessArticle

Evaluation of Ten Deep-Learning-Based Out-of-Distribution Detection Methods for Remote Sensing Image Scene Classification

by Sicong Li, Ning Li, Min Jing, Chen Ji and Liang Cheng

Remote Sens. 2024, 16(9), 1501; https://doi.org/10.3390/rs16091501 - 24 Apr 2024

Viewed by 1618

Abstract

Although deep neural networks have made significant progress in tasks related to remote sensing image scene classification, most of these tasks assume that the training and test data are independently and identically distributed. However, when remote sensing scene classification models are deployed in the real world, the model will inevitably encounter situations where the distribution of the test set differs from that of the training set, leading to unpredictable errors during the inference and testing phase. For instance, in the context of large-scale remote sensing scene classification applications, it is difficult to obtain all the feature classes in the training phase. Consequently, during the inference and testing phases, the model will categorize images of unknown classes into known classes. Therefore, the deployment of out-of-distribution (OOD) detection within the realm of remote sensing scene classification is crucial for ensuring the reliability and safety of model application in real-world scenarios. Despite significant advancements in OOD detection methods in recent years, there remains a lack of a unified benchmark for evaluating various OOD methods specifically in remote sensing scene classification tasks. We designed different benchmarks on three classical remote sensing datasets to simulate scenes with different distributional shifts. Ten different types of OOD detection methods were employed, and their performance was evaluated and compared using quantitative metrics. Numerous experiments were conducted to evaluate the overall performance of these state-of-the-art OOD detection methods under different test benchmarks. The comparative results show that the virtual-logit matching methods without additional training outperform the other types of methods on our benchmarks, suggesting that additional training methods are unnecessary for remote sensing image scene classification applications. Furthermore, we provide insights into OOD detection models and performance enhancement in the real world. To the best of our knowledge, this study is the first evaluation and analysis of methods for detecting out-of-distribution data in remote sensing. We hope that this research will serve as a fundamental resource for future studies on out-of-distribution detection in remote sensing. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

17 pages, 5075 KiB

Open AccessArticle

CNN-BiLSTM: A Novel Deep Learning Model for Near-Real-Time Daily Wildfire Spread Prediction

by Mohammad Marjani, Masoud Mahdianpari and Fariba Mohammadimanesh

Remote Sens. 2024, 16(8), 1467; https://doi.org/10.3390/rs16081467 - 20 Apr 2024

Cited by 17 | Viewed by 4186

Abstract

Wildfires significantly threaten ecosystems and human lives, necessitating effective prediction models for the management of this destructive phenomenon. This study integrates Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) modules to develop a novel deep learning model called CNN-BiLSTM for near-real-time wildfire spread prediction to capture spatial and temporal patterns. This study uses the Visible Infrared Imaging Radiometer Suite (VIIRS) active fire product and a wide range of environmental variables, including topography, land cover, temperature, NDVI, wind information, precipitation, soil moisture, and runoff to train the CNN-BiLSTM model. A comprehensive exploration of parameter configurations and settings was conducted to optimize the model’s performance. The evaluation results and their comparison with benchmark models, such as a Long Short-Term Memory (LSTM) and CNN-LSTM models, demonstrate the effectiveness of the CNN-BiLSTM model with IoU of F1 Score of 0.58 and 0.73 for validation and training sets, respectively. This innovative approach offers a promising avenue for enhancing wildfire management efforts through its capacity for near-real-time prediction, marking a significant step forward in mitigating the impact of wildfires. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

19 pages, 8487 KiB

Open AccessArticle

MRFA-Net: Multi-Scale Receptive Feature Aggregation Network for Cloud and Shadow Detection

by Jianxiang Wang, Yuanlu Li, Xiaoting Fan, Xin Zhou and Mingxuan Wu

Remote Sens. 2024, 16(8), 1456; https://doi.org/10.3390/rs16081456 - 20 Apr 2024

Viewed by 1127

Abstract

The effective segmentation of clouds and cloud shadows is crucial for surface feature extraction, climate monitoring, and atmospheric correction, but it remains a critical challenge in remote sensing image processing. Cloud features are intricate, with varied distributions and unclear boundaries, making accurate extraction difficult, with only a few networks addressing this challenge. To tackle these issues, we introduce a multi-scale receptive field aggregation network (MRFA-Net). The MRFA-Net comprises an MRFA-Encoder and MRFA-Decoder. Within the encoder, the net includes the asymmetric feature extractor module (AFEM) and multi-scale attention, which capture diverse local features and enhance contextual semantic understanding, respectively. The MRFA-Decoder includes the multi-path decoder module (MDM) for blending features and the global feature refinement module (GFRM) for optimizing information via learnable matrix decomposition. Experimental results demonstrate that our model excelled in generalization and segmentation performance when addressing various complex backgrounds and different category detections, exhibiting advantages in terms of parameter efficiency and computational complexity, with the MRFA-Net achieving a mean intersection over union (MIoU) of 94.12% on our custom Cloud and Shadow dataset, and 87.54% on the open-source HRC_WHU dataset, outperforming other models by at least 0.53% and 0.62%. The proposed model demonstrates applicability in practical scenarios where features are difficult to distinguish. Full article
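
Several abstracts in this listing, including the one above, report mean intersection over union (MIoU). The following is a minimal sketch of the standard per-class IoU average over flattened label maps. The function name, the choice to skip classes absent from both maps, and the toy labels are illustrative assumptions, not details taken from the paper.

```python
def mean_iou(pred, truth, num_classes):
    """Average per-class IoU over flattened label sequences.

    Classes that appear in neither pred nor truth are skipped
    (an assumed convention; implementations differ on this).
    """
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union > 0:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)

# Toy labels: 0 = background, 1 = cloud, 2 = shadow (hypothetical).
truth_labels = [0, 0, 1, 1, 2, 2]
pred_labels = [0, 1, 1, 1, 2, 0]

print(mean_iou(pred_labels, truth_labels, num_classes=3))  # 0.5
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, which average to 0.5.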

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

18 pages, 9421 KiB

Open AccessArticle

Remote Sensing Images Secure Distribution Scheme Based on Deep Information Hiding

by Peng Luo, Jia Liu, Jingting Xu, Qian Dang and Dejun Mu

Remote Sens. 2024, 16(8), 1331; https://doi.org/10.3390/rs16081331 - 10 Apr 2024

Viewed by 1063

Abstract

To ensure the security of highly sensitive remote sensing images (RSIs) during their distribution, it is essential to implement effective content security protection methods. Generally, secure distribution schemes for remote sensing images often employ cryptographic techniques. However, sending encrypted data exposes communication behavior, which poses significant security risks to the distribution of remote sensing images. Therefore, this paper introduces deep information hiding to achieve the secure distribution of remote sensing images, which can serve as an effective alternative in certain specific scenarios. Specifically, we propose Deep Information Hiding for RSI Distribution (hereinafter referred to as DIH4RSID), which is based on an encoder–decoder network architecture with a Parallel Attention Mechanism (PAM) and is trained adversarially. Our model is constructed with four main components: a preprocessing network (PN), an embedding network (EN), a revealing network (RN), and a discriminating network (DN). The PN module is primarily based on Inception to capture more details of RSIs and targets of different scales. The PAM module obtains features in two spatial directions to realize feature enhancement and context information integration. The experimental results indicate that our proposed algorithm achieves relatively higher visual quality and security level compared to related methods. Additionally, after extracting the concealed content from hidden images, the average classification accuracy is unaffected. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

17 pages, 4146 KiB

Open AccessArticle

CDEST: Class Distinguishability-Enhanced Self-Training Method for Adopting Pre-Trained Models to Downstream Remote Sensing Image Semantic Segmentation

by Ming Zhang, Xin Gu, Ji Qi, Zhenshi Zhang, Hemeng Yang, Jun Xu, Chengli Peng and Haifeng Li

Remote Sens. 2024, 16(7), 1293; https://doi.org/10.3390/rs16071293 - 6 Apr 2024

Viewed by 1486

Abstract

The self-supervised learning (SSL) technique, driven by massive unlabeled data, is expected to be a promising solution for semantic segmentation of remote sensing images (RSIs) with limited labeled data, revolutionizing transfer learning. Traditional ‘local-to-local’ transfer from small, local datasets to another target dataset plays an ever-shrinking role due to RSIs’ diverse distribution shifts. Instead, SSL promotes a ‘global-to-local’ transfer paradigm, in which generalized models pre-trained on arbitrarily large unlabeled datasets are fine-tuned to the target dataset to overcome data distribution shifts. However, the SSL pre-trained models may contain both useful and useless features for the downstream semantic segmentation task, due to the gap between the SSL tasks and the downstream task. To adapt such pre-trained models to semantic segmentation tasks, traditional supervised fine-tuning methods that use only a small number of labeled samples may drop out useful features due to overfitting. The main reason behind this is that supervised fine-tuning aims to map a few training samples from the high-dimensional, sparse image space to the low-dimensional, compact semantic space defined by the downstream labels, resulting in a degradation of the distinguishability. To address the above issues, we propose a class distinguishability-enhanced self-training (CDEST) method to support global-to-local transfer. First, the self-training module in CDEST introduces a semi-supervised learning mechanism to fully utilize the large amount of unlabeled data in the downstream task to increase the size and diversity of the training data, thus alleviating the problem of biased overfitting of the model. Second, the supervised and semi-supervised contrastive learning modules of CDEST can explicitly enhance the class distinguishability of features, helping to preserve the useful features learned from pre-training while adapting to downstream tasks. 
We evaluate the proposed CDEST method on four RSI semantic segmentation datasets, and our method achieves optimal experimental results on all four datasets compared to supervised fine-tuning as well as three semi-supervised fine-tuning methods. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

23 pages, 4205 KiB

Open AccessArticle

Exploring Uncertainty-Based Self-Prompt for Test-Time Adaptation Semantic Segmentation in Remote Sensing Images

by Ziquan Wang, Yongsheng Zhang, Zhenchao Zhang, Zhipeng Jiang, Ying Yu, Lei Li and Lei Zhang

Remote Sens. 2024, 16(7), 1239; https://doi.org/10.3390/rs16071239 - 31 Mar 2024

Viewed by 1323

Abstract

Test-time adaptation (TTA) has been proven to effectively improve the adaptability of deep learning semantic segmentation models facing continually changing scenes. However, most of the existing TTA algorithms lack an explicit exploration of domain gaps, especially those based on visual domain prompts. To address these issues, this paper proposes a self-prompt strategy based on uncertainty, guiding the model to continuously focus on regions with high uncertainty (i.e., regions with a larger domain gap). Specifically, we still use the Mean-Teacher architecture with the predicted entropy from the teacher network serving as the input to the prompt module. The prompt module processes uncertainty maps and guides the student network to focus on regions with higher entropy, enabling continuous adaptation to new scenes. This is a self-prompting strategy that requires no prior knowledge and is tested on widely used benchmarks. In terms of the average performance, our method outperformed the baseline algorithm in TTA and continual TTA settings of Cityscapes-to-ACDC by 3.3% and 3.9%, respectively. Our method also outperformed the baseline algorithm by 4.1% and 3.1% on the more difficult Cityscapes-to-(Foggy and Rainy) Cityscapes setting, which also surpasses six other current TTA methods. Full article
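
The abstract above feeds the teacher network's predicted entropy into a prompt module. As a hedged illustration of that input only, the sketch below computes a per-pixel Shannon entropy map from class logits in plain Python. The nested-list layout (height x width x classes), the function names, and the sample logits are assumptions made for the example. The prompt module itself is not described in the abstract and is not reproduced here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pixel_entropy(probs):
    """Shannon entropy (in nats) of one pixel's class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_map(logit_map):
    """Entropy for every pixel of an H x W x C nested list of logits."""
    return [[pixel_entropy(softmax(pixel)) for pixel in row] for row in logit_map]

# Hypothetical 1 x 2 teacher output with two classes per pixel.
teacher_logits = [[[0.0, 0.0], [10.0, -10.0]]]
uncertainty = entropy_map(teacher_logits)

# The first pixel is maximally uncertain (entropy ln 2);
# the second is confidently predicted (entropy near 0).
print(uncertainty)
```

Pixels with higher entropy are the ones the abstract describes as having a larger domain gap.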

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

16 pages, 11666 KiB

Open AccessArticle

A Lightweight Building Extraction Approach for Contour Recovery in Complex Urban Environments

by Jiaxin He, Yong Cheng, Wei Wang, Zhoupeng Ren, Ce Zhang and Wenjie Zhang

Remote Sens. 2024, 16(5), 740; https://doi.org/10.3390/rs16050740 - 20 Feb 2024

Cited by 3 | Viewed by 1512

Abstract

High-spatial-resolution urban buildings play a crucial role in urban planning, emergency response, and disaster management. However, challenges such as missing building contours due to occlusion problems (occlusion between buildings of different heights and buildings obscured by trees), uneven contour extraction due to mixing of building edges with other feature elements (roads, vehicles, and trees), and slow training speed in high-resolution image data hinder efficient and accurate building extraction. To address these issues, we propose a semantic segmentation model composed of a lightweight backbone, coordinate attention module, and pooling fusion module, which achieves lightweight building extraction and adaptive recovery of spatial contours. Comparative experiments were conducted on datasets featuring typical urban building instances in China and the Mapchallenge dataset, comparing our method with several classical and mainstream semantic segmentation algorithms. The results demonstrate the effectiveness of our approach, achieving excellent mean intersection over union (mIoU) and frames per second (FPS) scores on both datasets (China dataset: 85.11% and 110.67 FPS; Mapchallenge dataset: 90.27% and 117.68 FPS). Quantitative evaluations indicate that our model not only significantly improves computational speed but also ensures high accuracy in the extraction of urban buildings from high-resolution imagery. Specifically, on a typical urban building dataset from China, our model shows an accuracy improvement of 0.64% and a speed increase of 70.03 FPS compared to the baseline model. On the Mapchallenge dataset, our model achieves an accuracy improvement of 0.54% and a speed increase of 42.39 FPS compared to the baseline model. Our research indicates that lightweight networks show significant potential in urban building extraction tasks. 
In the future, the segmentation accuracy and prediction speed can be further balanced on the basis of adjusting the deep learning model or introducing remote sensing indices, which can be applied to research scenarios such as greenfield extraction or multi-class target extraction. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

23 pages, 18943 KiB

Open AccessArticle

An Approach to Large-Scale Cement Plant Detection Using Multisource Remote Sensing Imagery

by Tianzhu Li, Caihong Ma, Yongze Lv, Ruilin Liao, Jin Yang and Jianbo Liu

Remote Sens. 2024, 16(4), 729; https://doi.org/10.3390/rs16040729 - 19 Feb 2024

Viewed by 1999

Abstract

The cement industry, as one of the primary contributors to global greenhouse gas emissions, accounts for 7% of the world’s carbon dioxide emissions. There is an urgent need to establish a rapid method for detecting cement plants to facilitate effective monitoring. In this study, a comprehensive method based on YOLOv5-IEG and the Thermal Signature Detection module using Google Earth optical imagery and SDGSAT-1 thermal infrared imagery was proposed to detect large-scale cement plant information, including geographic location and operational status. The improved algorithm demonstrated an increase of 4.8% in accuracy and a 7.7% improvement in mAP@0.5:95. In a specific empirical investigation in China, we successfully detected 781 large-scale cement plants with an accuracy of 90.8%. Specifically, of the 55 cement plants in Shandong Province, we identified 46 as operational and nine as non-operational. The successful application of advanced models and remote sensing technology in efficiently and accurately tracking the operational status of cement plants provides crucial support for environmental protection and sustainable development.

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

17 pages, 18157 KiB

Open Access Article

Deep Learning-Based Real-Time Detection of Surface Landmines Using Optical Imaging

by Emanuele Vivoli, Marco Bertini and Lorenzo Capineri

Remote Sens. 2024, 16(4), 677; https://doi.org/10.3390/rs16040677 - 14 Feb 2024

Cited by 2 | Viewed by 6341

Abstract

This paper presents a pioneering study in the application of real-time surface landmine detection using a combination of robotics and deep learning. We introduce a novel system integrated within a demining robot, capable of detecting landmines in real time with high recall. Utilizing YOLOv8 models, we leverage both optical imaging and artificial intelligence to identify two common types of surface landmines: PFM-1 (butterfly) and PMA-2 (starfish with tripwire). Our system runs at 2 FPS on a mobile device, missing at most 1.6% of targets. It demonstrates significant advancements in operational speed and autonomy, surpassing conventional methods while remaining compatible with other approaches such as UAVs. In addition to the proposed system, we release two datasets with remarkable differences in landmine and background colors, built to train the models and test their performance.

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

23 pages, 5562 KiB

Open Access Article

SMFF-YOLO: A Scale-Adaptive YOLO Algorithm with Multi-Level Feature Fusion for Object Detection in UAV Scenes

by Yuming Wang, Hua Zou, Ming Yin and Xining Zhang

Remote Sens. 2023, 15(18), 4580; https://doi.org/10.3390/rs15184580 - 18 Sep 2023

Cited by 18 | Viewed by 4028

Abstract

Object detection in images captured by unmanned aerial vehicles (UAVs) holds great potential in various domains, including civilian applications, urban planning, and disaster response. However, it faces several challenges, such as multi-scale variations, dense scenes, complex backgrounds, and tiny-sized objects. In this paper, we present a novel scale-adaptive YOLO framework called SMFF-YOLO, which addresses these challenges through a multi-level feature fusion approach. To improve the detection accuracy of small objects, our framework incorporates the ELAN-SW object detection prediction head. This newly designed head effectively utilizes both global contextual information and local features, enhancing the detection accuracy of tiny objects. Additionally, the proposed bidirectional feature fusion pyramid (BFFP) module tackles the issue of scale variations in object sizes by aggregating multi-scale features. To handle complex backgrounds, we introduce the adaptive atrous spatial pyramid pooling (AASPP) module, which enables adaptive feature fusion and alleviates the negative impact of cluttered scenes. Moreover, we adopt the Wise-IoU (WIoU) bounding box regression loss to enhance the competitiveness of anchor boxes of different quality, which offers the framework a more informed gradient allocation strategy. We validate the effectiveness of SMFF-YOLO using the VisDrone and UAVDT datasets. Experimental results demonstrate that our model achieves higher detection accuracy, with AP50 reaching 54.3% on the VisDrone dataset and 42.4% on the UAVDT dataset. Visual comparative experiments with other YOLO-based methods further illustrate the robustness and adaptability of our approach.

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

20 pages, 3522 KiB

Open Access Article

Detection Method of Infected Wood on Digital Orthophoto Map–Digital Surface Model Fusion Network

by Guangbiao Wang, Hongbo Zhao, Qing Chang, Shuchang Lyu, Binghao Liu, Chunlei Wang and Wenquan Feng

Remote Sens. 2023, 15(17), 4295; https://doi.org/10.3390/rs15174295 - 31 Aug 2023

Cited by 4 | Viewed by 1528

Abstract

Pine wilt disease (PWD) is a worldwide affliction that poses a significant threat to forest ecosystems. The swift and precise identification of infected pine trees is of paramount importance for managing the disease effectively. Advances in remote sensing and deep learning have promoted the use of target detection and recognition techniques based on remote sensing imagery, which have become the prevailing strategy for pinpointing affected trees. Although existing object detection algorithms have achieved remarkable success, virtually all methods rely solely on a Digital Orthophoto Map (DOM), which is not suitable for diseased tree detection and leads to a high false detection rate for easily confused targets, such as bare land, houses, and brown herbs. To improve the ability to detect diseased trees and prevent the spread of the epidemic, we construct a large-scale PWD detection dataset with both DOM and Digital Surface Model (DSM) images and propose a novel detection framework, DDNet, which makes full use of the spectral features and geomorphological spatial features of remote sensing targets. The experimental results show that the proposed joint network achieves an AP50 2.4% higher than the traditional deep learning network.

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

20 pages, 7667 KiB

Open Access Article

TPH-YOLOv5-Air: Airport Confusing Object Detection via Adaptively Spatial Feature Fusion

by Qiang Wang, Wenquan Feng, Lifan Yao, Chen Zhuang, Binghao Liu and Lijiang Chen

Remote Sens. 2023, 15(15), 3883; https://doi.org/10.3390/rs15153883 - 5 Aug 2023

Cited by 12 | Viewed by 2547

Abstract

Airport detection in remote sensing scenes is a crucial area of research, playing a key role in aircraft blind landing procedures. However, airport detection in remote sensing scenes still faces challenges such as class confusion, poor detection performance on multi-scale objects, and limited dataset availability. To address these issues, this paper proposes a novel airport detection network (TPH-YOLOv5-Air) based on adaptive spatial feature fusion (ASFF). Firstly, we construct an Airport Confusing Object Dataset (ACD) specifically tailored for remote sensing scenarios, containing 9501 instances of airport confusion objects. Secondly, building upon the foundation of TPH-YOLOv5++, we adopt the ASFF structure, which not only enhances the feature extraction efficiency but also enriches feature representation. Moreover, an ASFF strategy based on an adaptive parameter adjustment module (APAM) is proposed, which improves the feature scale invariance and enhances the detection of airports. Finally, experimental results on the ACD dataset demonstrate that TPH-YOLOv5-Air achieves a mean average precision (mAP) of 49.4%, outperforming TPH-YOLOv5++ by 2% and the original YOLOv5 network by 3.6%. This study contributes to the advancement of airport detection in remote sensing scenes and demonstrates the practical application potential of TPH-YOLOv5-Air in this domain. Visualization and analysis further validate the effectiveness and interpretability of TPH-YOLOv5-Air. The ACD dataset is publicly available.

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)

Other

27 pages, 5461 KiB

Open Access Essay

BAFormer: A Novel Boundary-Aware Compensation UNet-like Transformer for High-Resolution Cropland Extraction

by Zhiyong Li, Youming Wang, Fa Tian, Junbo Zhang, Yijie Chen and Kunhong Li

Remote Sens. 2024, 16(14), 2526; https://doi.org/10.3390/rs16142526 - 10 Jul 2024

Cited by 2 | Viewed by 1573

Abstract

Utilizing deep learning for semantic segmentation of cropland from remote sensing imagery has become a crucial technique in land surveys. Cropland is highly heterogeneous and fragmented, and existing methods often suffer from inaccurate boundary segmentation. This paper introduces a UNet-like boundary-aware compensation model (BAFormer). Cropland boundaries typically exhibit rapid changes in pixel values and texture features, often appearing as high-frequency features in remote sensing images. To enhance the recognition of the high-frequency features that represent cropland boundaries, the proposed BAFormer integrates a Feature Adaptive Mixer (FAM) and develops a Depthwise Large Kernel Multi-Layer Perceptron model (DWLK-MLP) to enrich the global and local cropland boundary features separately. Specifically, FAM enhances the boundary-aware method by adaptively acquiring high-frequency features through the combined advantages of convolution and self-attention, while DWLK-MLP further supplements boundary position information using a large receptive field. The efficacy of BAFormer has been evaluated on datasets including Vaihingen, Potsdam, LoveDA, and Mapcup. It demonstrates high performance, achieving mIoU scores of 84.5%, 87.3%, 53.5%, and 83.1% on these datasets, respectively. Notably, BAFormer-T (lightweight model) surpasses other lightweight models on the Vaihingen dataset with scores of 91.3% F1 and 84.1% mIoU.

(This article belongs to the Special Issue Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications)
