remotesensing-logo

Journal Browser

Journal Browser

Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "AI Remote Sensing".

Deadline for manuscript submissions: closed (30 June 2024) | Viewed by 36555

Special Issue Editors


E-Mail Website
Guest Editor
Department of Computer Science, University of Liverpool, Liverpool, UK
Interests: computer vision; remote sensing; change detection; hyperspectral image classification; road extraction
Special Issues, Collections and Topics in MDPI journals

E-Mail Website
Guest Editor
School of Electronic and Information Engineering, Beihang University, No. 37 Xueyuan Road, Haidian District, Beijing 100191, China
Interests: remote sensing; image recognition; domain adaptation; few-shot learning; light-weight neural network
Special Issues, Collections and Topics in MDPI journals

E-Mail
Guest Editor
Scuola Superiore Sant’Anna di Studi Universitari e di Perfezionamento, Pisa, Italy
Interests: artificial intelligence; computer vision; 3D reconstruction; image processing; localization methods; mapping; inspection robotics; deep Learning; industrial monitoring; smart sensors; photogrammetry; LiDAR; SAR; farming applications
Special Issues, Collections and Topics in MDPI journals

E-Mail Website
Guest Editor
1. Faculty of Engineering and IT, University of Technology Sydney, Ultimo, NSW, Australia
2. McGregor Coxall Australia Pty Ltd., Sydney, NSW, Australia
Interests: machine learning; geospatial 3D analysis; geospatial database querying; web GIS; airborne/spaceborne image processing; feature extraction; time-series analysis in forecasting modelling and domain adaptation in various environmental applications
Special Issues, Collections and Topics in MDPI journals

Special Issue Information

Dear Colleagues,

The Special Issue on “Advanced Artificial Intelligence for Remote Sensing: Methodology and Applications” aims to explore the intersection of artificial intelligence (AI) and remote sensing, showcasing cutting-edge methodologies and their diverse applications in this field. Remote sensing, the science of collecting information about the earth's surface from a distance, has become an invaluable tool for monitoring and understanding our planet’s dynamics and changes. Concurrently, artificial intelligence techniques have advanced significantly, offering powerful tools for data analysis and decision making. The integration of AI techniques with remote sensing data has revolutionized the way we analyze, interpret, and extract meaningful information from large-scale Earth observation datasets, enabling us to address critical environmental, social, and economic challenges.

This Special Issue provides a platform for researchers and practitioners to present their latest research findings, methodologies, and applications in the realm of AI for remote sensing. It serves as a forum to exchange knowledge and foster collaborations among experts from diverse disciplines, including computer science, remote sensing, geoscience, and environmental studies. The issue aims to advance the understanding and utilization of AI techniques for remote sensing applications, pushing the boundaries of what can be achieved in terms of data analysis, information extraction, and decision support.

This Special Issue emphasizes the latest advancements in AI algorithms, models, and techniques that have been specifically developed or adapted for remote sensing applications. It aims to showcase novel methodologies, innovative applications, and case studies that demonstrate the potential of AI in addressing real-world challenges in agriculture, urban planning, forestry, climate change, and other domains.

We invite submissions that address various aspects of AI methodologies and their applications in remote sensing. Potential topics of interest include, but are not limited to:

  • AI-driven image classification and recognition in remote sensing.
  • Deep learning techniques for feature extraction and representation learning from remote sensing data.
  • The fusion of multi-source remote sensing data using AI-based approaches.
  • Semantic segmentation and object detection in remote sensing images
  • AI-based approaches for change detection and monitoring using remote sensing data.
  • AI-enabled hyperspectral and LiDAR data analysis.
  • Transfer learning and domain adaptation for remote sensing applications.
  • Few-shot image recognition/semantic segmentation/object detection, etc.
  • Domain adaptation problems in the remote sensing area.
  • Case studies and applications of AI in remote sensing for agriculture, urban planning, forestry, climate change, etc.

Authors are invited to submit original research articles, reviews, or survey papers that contribute to the field of advanced AI for remote sensing. All submissions will undergo a rigorous peer review process to ensure the quality and relevance of accepted papers. Manuscripts should follow the guidelines provided by the journal and should clearly address the Special Issue theme

Dr. Guangliang Cheng
Prof. Dr. Qi Zhao
Dr. Paolo Tripicchio
Dr. Hossein M. Rizeei
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • remote sensing
  • image recognition
  • semantic segmentation
  • object detection
  • change detection
  • environmental monitoring
  • deep learning
  • few-shot learning
  • domain adaptation

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue polices can be found here.

Related Special Issue

Published Papers (18 papers)

Order results
Result details
Select all
Export citation of selected articles as:

Research

Jump to: Other

23 pages, 7788 KiB  
Article
A Novel Mamba Architecture with a Semantic Transformer for Efficient Real-Time Remote Sensing Semantic Segmentation
by Hao Ding, Bo Xia, Weilin Liu, Zekai Zhang, Jinglin Zhang, Xing Wang and Sen Xu
Remote Sens. 2024, 16(14), 2620; https://doi.org/10.3390/rs16142620 - 17 Jul 2024
Cited by 3 | Viewed by 1794
Abstract
Real-time remote sensing segmentation technology is crucial for unmanned aerial vehicles (UAVs) in battlefield surveillance, land characterization observation, earthquake disaster assessment, etc., and can significantly enhance the application value of UAVs in military and civilian fields. To realize this potential, it is essential [...] Read more.
Real-time remote sensing segmentation technology is crucial for unmanned aerial vehicles (UAVs) in battlefield surveillance, land characterization observation, earthquake disaster assessment, etc., and can significantly enhance the application value of UAVs in military and civilian fields. To realize this potential, it is essential to develop real-time semantic segmentation methods that can be applied to resource-limited platforms, such as edge devices. The majority of mainstream real-time semantic segmentation methods rely on convolutional neural networks (CNNs) and transformers. However, CNNs cannot effectively capture long-range dependencies, while transformers have high computational complexity. This paper proposes a novel remote sensing Mamba architecture for real-time segmentation tasks in remote sensing, named RTMamba. Specifically, the backbone utilizes a Visual State-Space (VSS) block to extract deep features and maintains linear computational complexity, thereby capturing long-range contextual information. Additionally, a novel Inverted Triangle Pyramid Pooling (ITP) module is incorporated into the decoder. The ITP module can effectively filter redundant feature information and enhance the perception of objects and their boundaries in remote sensing images. Extensive experiments were conducted on three challenging aerial remote sensing segmentation benchmarks, including Vaihingen, Potsdam, and LoveDA. The results show that RTMamba achieves competitive performance advantages in terms of segmentation accuracy and inference speed compared to state-of-the-art CNN and transformer methods. To further validate the deployment potential of the model on embedded devices with limited resources, such as UAVs, we conducted tests on the Jetson AGX Orin edge device. The experimental results demonstrate that RTMamba achieves impressive real-time segmentation performance. Full article
Show Figures

Figure 1

26 pages, 17177 KiB  
Article
Direction of Arrival Joint Prediction of Underwater Acoustic Communication Signals Using Faster R-CNN and Frequency–Azimuth Spectrum
by Le Cheng, Yue Liu, Bingbing Zhang, Zhengliang Hu, Hongna Zhu and Bin Luo
Remote Sens. 2024, 16(14), 2563; https://doi.org/10.3390/rs16142563 - 12 Jul 2024
Cited by 1 | Viewed by 624
Abstract
Utilizing hydrophone arrays for detecting underwater acoustic communication (UWAC) signals leverages spatial information to enhance detection efficiency and expand the perceptual range. This study redefines the task of UWAC signal detection as an object detection problem within the frequency–azimuth (FRAZ) spectrum. Employing Faster [...] Read more.
Utilizing hydrophone arrays for detecting underwater acoustic communication (UWAC) signals leverages spatial information to enhance detection efficiency and expand the perceptual range. This study redefines the task of UWAC signal detection as an object detection problem within the frequency–azimuth (FRAZ) spectrum. Employing Faster R-CNN as a signal detector, the proposed method facilitates the joint prediction of UWAC signals, including estimates of the number of sources, modulation type, frequency band, and direction of arrival (DOA). The proposed method extracts precise frequency and DOA features of the signals without requiring prior knowledge of the number of signals or frequency bands. Instead, it extracts these features jointly during training and applies them to perform joint predictions during testing. Numerical studies demonstrate that the proposed method consistently outperforms existing techniques across all signal-to-noise ratios (SNRs), particularly excelling in low SNRs. It achieves a detection F1 score of 0.96 at an SNR of −15 dB. We further verified its performance under varying modulation types, numbers of sources, grating lobe interference, strong signal interference, and array structure parameters. Furthermore, the practicality and robustness of our approach were evaluated in lake-based UWAC experiments, and the model trained solely on simulated signals performed competitively in the trials. Full article
Show Figures

Figure 1

36 pages, 2623 KiB  
Article
Change Detection Methods for Remote Sensing in the Last Decade: A Comprehensive Review
by Guangliang Cheng, Yunmeng Huang, Xiangtai Li, Shuchang Lyu, Zhaoyang Xu, Hongbo Zhao, Qi Zhao and Shiming Xiang
Remote Sens. 2024, 16(13), 2355; https://doi.org/10.3390/rs16132355 - 27 Jun 2024
Cited by 9 | Viewed by 5617
Abstract
Change detection is an essential and widely utilized task in remote sensing that aims to detect and analyze changes occurring in the same geographical area over time, which has broad applications in urban development, agricultural surveys, and land cover monitoring. Detecting changes in [...] Read more.
Change detection is an essential and widely utilized task in remote sensing that aims to detect and analyze changes occurring in the same geographical area over time, which has broad applications in urban development, agricultural surveys, and land cover monitoring. Detecting changes in remote sensing images is a complex challenge due to various factors, including variations in image quality, noise, registration errors, illumination changes, complex landscapes, and spatial heterogeneity. In recent years, deep learning has emerged as a powerful tool for feature extraction and addressing these challenges. Its versatility has resulted in its widespread adoption for numerous image-processing tasks. This paper presents a comprehensive survey of significant advancements in change detection for remote sensing images over the past decade. We first introduce some preliminary knowledge for the change detection task, such as problem definition, datasets, evaluation metrics, and transformer basics, as well as provide a detailed taxonomy of existing algorithms from three different perspectives: algorithm granularity, supervision modes, and frameworks in the Methodology section. This survey enables readers to gain systematic knowledge of change detection tasks from various angles. We then summarize the state-of-the-art performance on several dominant change detection datasets, providing insights into the strengths and limitations of existing algorithms. Based on our survey, some future research directions for change detection in remote sensing are well identified. This survey paper sheds some light the topic for the community and will inspire further research efforts in the change detection task. Full article
Show Figures

Figure 1

23 pages, 3678 KiB  
Article
Enhanced Wind Field Spatial Downscaling Method Using UNET Architecture and Dual Cross-Attention Mechanism
by Jieli Liu, Chunxiang Shi, Lingling Ge, Ruian Tie, Xiaojian Chen, Tao Zhou, Xiang Gu and Zhanfei Shen
Remote Sens. 2024, 16(11), 1867; https://doi.org/10.3390/rs16111867 - 23 May 2024
Cited by 1 | Viewed by 865
Abstract
Before 2008, China lacked high-coverage regional surface observation data, making it difficult for the China Meteorological Administration Land Data Assimilation System (CLDAS) to directly backtrack high-resolution, high-quality land assimilation products. To address this issue, this paper proposes a deep learning model named UNET_DCA, [...] Read more.
Before 2008, China lacked high-coverage regional surface observation data, making it difficult for the China Meteorological Administration Land Data Assimilation System (CLDAS) to directly backtrack high-resolution, high-quality land assimilation products. To address this issue, this paper proposes a deep learning model named UNET_DCA, based on the UNET architecture, which incorporates a Dual Cross-Attention module (DCA) for multiscale feature fusion by introducing Channel Cross-Attention (CCA) and Spatial Cross-Attention (SCA) mechanisms. This model focuses on the near-surface 10-m wind field and achieves spatial downscaling from 6.25 km to 1 km. We conducted training and validation using data from 2020–2021, tested with data from 2019, and performed ablation experiments to validate the effectiveness of each module. We compared the results with traditional bilinear interpolation methods and the SNCA-CLDASSD model. The experimental results show that the UNET-based model outperforms SNCA-CLDASSD, indicating that the UNET-based model captures richer information in wind field downscaling compared to SNCA-CLDASSD, which relies on sequentially stacked CNN convolution modules. UNET_CCA and UNET_SCA, incorporating cross-attention mechanisms, outperform UNET without attention mechanisms. Furthermore, UNET_DCA, incorporating both Channel Cross-Attention and Spatial Cross-Attention mechanisms, outperforms UNET_CCA and UNET_SCA, which only incorporate one attention mechanism. UNET_DCA performs best on the RMSE, MAE, and COR metrics (0.40 m/s, 0.28 m/s, 0.93), while UNET_DCA_ars, incorporating more auxiliary information, performs best on the PSNR and SSIM metrics (29.006, 0.880). Evaluation across different methods indicates that the optimal model performs best in valleys, followed by mountains, and worst in plains; it performs worse during the day and better at night; and as wind speed levels increase, accuracy decreases. Overall, among various downscaling methods, UNET_DCA and UNET_DCA_ars effectively reconstruct the spatial details of wind fields, providing a deeper exploration for the inversion of high-resolution historical meteorological grid data. Full article
Show Figures

Figure 1

22 pages, 7704 KiB  
Article
Zero-Shot Sketch-Based Remote-Sensing Image Retrieval Based on Multi-Level and Attention-Guided Tokenization
by Bo Yang, Chen Wang, Xiaoshuang Ma, Beiping Song, Zhuang Liu and Fangde Sun
Remote Sens. 2024, 16(10), 1653; https://doi.org/10.3390/rs16101653 - 7 May 2024
Cited by 1 | Viewed by 1059
Abstract
Effectively and efficiently retrieving images from remote-sensing databases is a critical challenge in the realm of remote-sensing big data. Utilizing hand-drawn sketches as retrieval inputs offers intuitive and user-friendly advantages, yet the potential of multi-level feature integration from sketches remains underexplored, leading to [...] Read more.
Effectively and efficiently retrieving images from remote-sensing databases is a critical challenge in the realm of remote-sensing big data. Utilizing hand-drawn sketches as retrieval inputs offers intuitive and user-friendly advantages, yet the potential of multi-level feature integration from sketches remains underexplored, leading to suboptimal retrieval performance. To address this gap, our study introduces a novel zero-shot, sketch-based retrieval method for remote-sensing images, leveraging multi-level feature extraction, self-attention-guided tokenization and filtering, and cross-modality attention update. This approach employs only vision information and does not require semantic knowledge concerning the sketch and image. It starts by employing multi-level self-attention guided feature extraction to tokenize the query sketches, as well as self-attention feature extraction to tokenize the candidate images. It then employs cross-attention mechanisms to establish token correspondence between these two modalities, facilitating the computation of sketch-to-image similarity. Our method significantly outperforms existing sketch-based remote-sensing image retrieval techniques, as evidenced by tests on multiple datasets. Notably, it also exhibits robust zero-shot learning capabilities in handling unseen categories and strong domain adaptation capabilities in handling unseen novel remote-sensing data. The method’s scalability can be further enhanced by the pre-calculation of retrieval tokens for all candidate images in a database. This research underscores the significant potential of multi-level, attention-guided tokenization in cross-modal remote-sensing image retrieval. For broader accessibility and research facilitation, we have made the code and dataset used in this study publicly available online. Full article
Show Figures

Figure 1

24 pages, 39892 KiB  
Article
Evaluation of Ten Deep-Learning-Based Out-of-Distribution Detection Methods for Remote Sensing Image Scene Classification
by Sicong Li, Ning Li, Min Jing, Chen Ji and Liang Cheng
Remote Sens. 2024, 16(9), 1501; https://doi.org/10.3390/rs16091501 - 24 Apr 2024
Viewed by 1273
Abstract
Although deep neural networks have made significant progress in tasks related to remote sensing image scene classification, most of these tasks assume that the training and test data are independently and identically distributed. However, when remote sensing scene classification models are deployed in [...] Read more.
Although deep neural networks have made significant progress in tasks related to remote sensing image scene classification, most of these tasks assume that the training and test data are independently and identically distributed. However, when remote sensing scene classification models are deployed in the real world, the model will inevitably encounter situations where the distribution of the test set differs from that of the training set, leading to unpredictable errors during the inference and testing phase. For instance, in the context of large-scale remote sensing scene classification applications, it is difficult to obtain all the feature classes in the training phase. Consequently, during the inference and testing phases, the model will categorize images of unidentified unknown classes into known classes. Therefore, the deployment of out-of-distribution (OOD) detection within the realm of remote sensing scene classification is crucial for ensuring the reliability and safety of model application in real-world scenarios. Despite significant advancements in OOD detection methods in recent years, there remains a lack of a unified benchmark for evaluating various OOD methods specifically in remote sensing scene classification tasks. We designed different benchmarks on three classical remote sensing datasets to simulate scenes with different distributional shift. Ten different types of OOD detection methods were employed, and their performance was evaluated and compared using quantitative metrics. Numerous experiments were conducted to evaluate the overall performance of these state-of-the-art OOD detection methods under different test benchmarks. The comparative results show that the virtual-logit matching methods without additional training outperform the other types of methods on our benchmarks, suggesting that additional training methods are unnecessary for remote sensing image scene classification applications. Furthermore, we provide insights into OOD detection models and performance enhancement in real world. To the best of our knowledge, this study is the first evaluation and analysis of methods for detecting out-of-distribution data in remote sensing. We hope that this research will serve as a fundamental resource for future studies on out-of-distribution detection in remote sensing. Full article
Show Figures

Figure 1

17 pages, 5075 KiB  
Article
CNN-BiLSTM: A Novel Deep Learning Model for Near-Real-Time Daily Wildfire Spread Prediction
by Mohammad Marjani, Masoud Mahdianpari and Fariba Mohammadimanesh
Remote Sens. 2024, 16(8), 1467; https://doi.org/10.3390/rs16081467 - 20 Apr 2024
Cited by 9 | Viewed by 3092
Abstract
Wildfires significantly threaten ecosystems and human lives, necessitating effective prediction models for the management of this destructive phenomenon. This study integrates Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) modules to develop a novel deep learning model called CNN-BiLSTM for near-real-time [...] Read more.
Wildfires significantly threaten ecosystems and human lives, necessitating effective prediction models for the management of this destructive phenomenon. This study integrates Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) modules to develop a novel deep learning model called CNN-BiLSTM for near-real-time wildfire spread prediction to capture spatial and temporal patterns. This study uses the Visible Infrared Imaging Radiometer Suite (VIIRS) active fire product and a wide range of environmental variables, including topography, land cover, temperature, NDVI, wind informaiton, precipitation, soil moisture, and runoff to train the CNN-BiLSTM model. A comprehensive exploration of parameter configurations and settings was conducted to optimize the model’s performance. The evaluation results and their comparison with benchmark models, such as a Long Short-Term Memory (LSTM) and CNN-LSTM models, demonstrate the effectiveness of the CNN-BiLSTM model with IoU of F1 Score of 0.58 and 0.73 for validation and training sets, respectively. This innovative approach offers a promising avenue for enhancing wildfire management efforts through its capacity for near-real-time prediction, marking a significant step forward in mitigating the impact of wildfires. Full article
Show Figures

Figure 1

19 pages, 8487 KiB  
Article
MRFA-Net: Multi-Scale Receptive Feature Aggregation Network for Cloud and Shadow Detection
by Jianxiang Wang, Yuanlu Li, Xiaoting Fan, Xin Zhou and Mingxuan Wu
Remote Sens. 2024, 16(8), 1456; https://doi.org/10.3390/rs16081456 - 20 Apr 2024
Viewed by 915
Abstract
The effective segmentation of clouds and cloud shadows is crucial for surface feature extraction, climate monitoring, and atmospheric correction, but it remains a critical challenge in remote sensing image processing. Cloud features are intricate, with varied distributions and unclear boundaries, making accurate extraction [...] Read more.
The effective segmentation of clouds and cloud shadows is crucial for surface feature extraction, climate monitoring, and atmospheric correction, but it remains a critical challenge in remote sensing image processing. Cloud features are intricate, with varied distributions and unclear boundaries, making accurate extraction difficult, with only a few networks addressing this challenge. To tackle these issues, we introduce a multi-scale receptive field aggregation network (MRFA-Net). The MRFA-Net comprises an MRFA-Encoder and MRFA-Decoder. Within the encoder, the net includes the asymmetric feature extractor module (AFEM) and multi-scale attention, which capture diverse local features and enhance contextual semantic understanding, respectively. The MRFA-Decoder includes the multi-path decoder module (MDM) for blending features and the global feature refinement module (GFRM) for optimizing information via learnable matrix decomposition. Experimental results demonstrate that our model excelled in generalization and segmentation performance when addressing various complex backgrounds and different category detections, exhibiting advantages in terms of parameter efficiency and computational complexity, with the MRFA-Net achieving a mean intersection over union (MIoU) of 94.12% on our custom Cloud and Shadow dataset, and 87.54% on the open-source HRC_WHU dataset, outperforming other models by at least 0.53% and 0.62%. The proposed model demonstrates applicability in practical scenarios where features are difficult to distinguish. Full article
Show Figures

Figure 1

18 pages, 9421 KiB  
Article
Remote Sensing Images Secure Distribution Scheme Based on Deep Information Hiding
by Peng Luo, Jia Liu, Jingting Xu, Qian Dang and Dejun Mu
Remote Sens. 2024, 16(8), 1331; https://doi.org/10.3390/rs16081331 - 10 Apr 2024
Viewed by 883
Abstract
To ensure the security of highly sensitive remote sensing images (RSIs) during their distribution, it is essential to implement effective content security protection methods. Generally, secure distribution schemes for remote sensing images often employ cryptographic techniques. However, sending encrypted data exposes communication behavior, [...] Read more.
To ensure the security of highly sensitive remote sensing images (RSIs) during their distribution, it is essential to implement effective content security protection methods. Generally, secure distribution schemes for remote sensing images often employ cryptographic techniques. However, sending encrypted data exposes communication behavior, which poses significant security risks to the distribution of remote sensing images. Therefore, this paper introduces deep information hiding to achieve the secure distribution of remote sensing images, which can serve as an effective alternative in certain specific scenarios. Specifically, the Deep Information Hiding for RSI Distribution (hereinafter referred to as DIH4RSID) based on an encoder–decoder network architecture with Parallel Attention Mechanism (PAM) by adversarial training is proposed. Our model is constructed with four main components: a preprocessing network (PN), an embedding network (EN), a revealing network (RN), and a discriminating network (DN). The PN module is primarily based on Inception to capture more details of RSIs and targets of different scales. The PAM module obtains features in two spatial directions to realize feature enhancement and context information integration. The experimental results indicate that our proposed algorithm achieves relatively higher visual quality and secure level compared to related methods. Additionally, after extracting the concealed content from hidden images, the average classification accuracy is unaffected. Full article
Show Figures

Figure 1

17 pages, 4146 KiB  
Article
CDEST: Class Distinguishability-Enhanced Self-Training Method for Adopting Pre-Trained Models to Downstream Remote Sensing Image Semantic Segmentation
by Ming Zhang, Xin Gu, Ji Qi, Zhenshi Zhang, Hemeng Yang, Jun Xu, Chengli Peng and Haifeng Li
Remote Sens. 2024, 16(7), 1293; https://doi.org/10.3390/rs16071293 - 6 Apr 2024
Viewed by 1216
Abstract
The self-supervised learning (SSL) technique, driven by massive unlabeled data, is expected to be a promising solution for semantic segmentation of remote sensing images (RSIs) with limited labeled data, revolutionizing transfer learning. Traditional ‘local-to-local’ transfer from small, local datasets to another target dataset [...] Read more.
The self-supervised learning (SSL) technique, driven by massive unlabeled data, is expected to be a promising solution for semantic segmentation of remote sensing images (RSIs) with limited labeled data, revolutionizing transfer learning. Traditional ‘local-to-local’ transfer from small, local datasets to another target dataset plays an ever-shrinking role due to RSIs’ diverse distribution shifts. Instead, SSL promotes a ‘global-to-local’ transfer paradigm, in which generalized models pre-trained on arbitrarily large unlabeled datasets are fine-tuned to the target dataset to overcome data distribution shifts. However, the SSL pre-trained models may contain both useful and useless features for the downstream semantic segmentation task, due to the gap between the SSL tasks and the downstream task. To adapt such pre-trained models to semantic segmentation tasks, traditional supervised fine-tuning methods that use only a small number of labeled samples may drop out useful features due to overfitting. The main reason behind this is that supervised fine-tuning aims to map a few training samples from the high-dimensional, sparse image space to the low-dimensional, compact semantic space defined by the downstream labels, resulting in a degradation of the distinguishability. To address the above issues, we propose a class distinguishability-enhanced self-training (CDEST) method to support global-to-local transfer. First, the self-training module in CDEST introduces a semi-supervised learning mechanism to fully utilize the large amount of unlabeled data in the downstream task to increase the size and diversity of the training data, thus alleviating the problem of biased overfitting of the model. Second, the supervised and semi-supervised contrastive learning modules of CDEST can explicitly enhance the class distinguishability of features, helping to preserve the useful features learned from pre-training while adapting to downstream tasks. We evaluate the proposed CDEST method on four RSI semantic segmentation datasets, and our method achieves optimal experimental results on all four datasets compared to supervised fine-tuning as well as three semi-supervised fine-tuning methods. Full article
Show Figures

Figure 1

23 pages, 4205 KiB  
Article
Exploring Uncertainty-Based Self-Prompt for Test-Time Adaptation Semantic Segmentation in Remote Sensing Images
by Ziquan Wang, Yongsheng Zhang, Zhenchao Zhang, Zhipeng Jiang, Ying Yu, Lei Li and Lei Zhang
Remote Sens. 2024, 16(7), 1239; https://doi.org/10.3390/rs16071239 - 31 Mar 2024
Viewed by 1114
Abstract
Test-time adaptation (TTA) has been proven to effectively improve the adaptability of deep learning semantic segmentation models facing continuous changeable scenes. However, most of the existing TTA algorithms lack an explicit exploration of domain gaps, especially those based on visual domain prompts. To [...] Read more.
Test-time adaptation (TTA) has been proven to effectively improve the adaptability of deep learning semantic segmentation models facing continuous changeable scenes. However, most of the existing TTA algorithms lack an explicit exploration of domain gaps, especially those based on visual domain prompts. To address these issues, this paper proposes a self-prompt strategy based on uncertainty, guiding the model to continuously focus on regions with high uncertainty (i.e., regions with a larger domain gap). Specifically, we still use the Mean-Teacher architecture with the predicted entropy from the teacher network serving as the input to the prompt module. The prompt module processes uncertain maps and guides the student network to focus on regions with higher entropy, enabling continuous adaptation to new scenes. This is a self-prompting strategy that requires no prior knowledge and is tested on widely used benchmarks. In terms of the average performance, our method outperformed the baseline algorithm in TTA and continual TTA settings of Cityscapes-to-ACDC by 3.3% and 3.9%, respectively. Our method also outperformed the baseline algorithm by 4.1% and 3.1% on the more difficult Cityscapes-to-(Foggy and Rainy) Cityscapes setting, which also surpasses six other current TTA methods. Full article
Show Figures

Figure 1

16 pages, 11666 KiB  
Article
A Lightweight Building Extraction Approach for Contour Recovery in Complex Urban Environments
by Jiaxin He, Yong Cheng, Wei Wang, Zhoupeng Ren, Ce Zhang and Wenjie Zhang
Remote Sens. 2024, 16(5), 740; https://doi.org/10.3390/rs16050740 - 20 Feb 2024
Cited by 2 | Viewed by 1286
Abstract
High-spatial-resolution urban buildings play a crucial role in urban planning, emergency response, and disaster management. However, challenges such as missing building contours due to occlusion problems (occlusion between buildings of different heights and buildings obscured by trees), uneven contour extraction due to mixing [...] Read more.
High-spatial-resolution urban buildings play a crucial role in urban planning, emergency response, and disaster management. However, challenges such as missing building contours due to occlusion problems (occlusion between buildings of different heights and buildings obscured by trees), uneven contour extraction due to mixing of building edges with other feature elements (roads, vehicles, and trees), and slow training speed in high-resolution image data hinder efficient and accurate building extraction. To address these issues, we propose a semantic segmentation model composed of a lightweight backbone, coordinate attention module, and pooling fusion module, which achieves lightweight building extraction and adaptive recovery of spatial contours. Comparative experiments were conducted on datasets featuring typical urban building instances in China and the Mapchallenge dataset, comparing our method with several classical and mainstream semantic segmentation algorithms. The results demonstrate the effectiveness of our approach, achieving excellent mean intersection over union (mIoU) and frames per second (FPS) scores on both datasets (China dataset: 85.11% and 110.67 FPS; Mapchallenge dataset: 90.27% and 117.68 FPS). Quantitative evaluations indicate that our model not only significantly improves computational speed but also ensures high accuracy in the extraction of urban buildings from high-resolution imagery. Specifically, on a typical urban building dataset from China, our model shows an accuracy improvement of 0.64% and a speed increase of 70.03 FPS compared to the baseline model. On the Mapchallenge dataset, our model achieves an accuracy improvement of 0.54% and a speed increase of 42.39 FPS compared to the baseline model. Our research indicates that lightweight networks show significant potential in urban building extraction tasks. In the future, the segmentation accuracy and prediction speed can be further balanced on the basis of adjusting the deep learning model or introducing remote sensing indices, which can be applied to research scenarios such as greenfield extraction or multi-class target extraction. Full article
Show Figures

Figure 1

23 pages, 18943 KiB  
Article
An Approach to Large-Scale Cement Plant Detection Using Multisource Remote Sensing Imagery
by Tianzhu Li, Caihong Ma, Yongze Lv, Ruilin Liao, Jin Yang and Jianbo Liu
Remote Sens. 2024, 16(4), 729; https://doi.org/10.3390/rs16040729 - 19 Feb 2024
Viewed by 1730
Abstract
The cement industry, as one of the primary contributors to global greenhouse gas emissions, accounts for 7% of the world’s carbon dioxide emissions. There is an urgent need to establish a rapid method for detecting cement plants to facilitate effective monitoring. In this [...] Read more.
The cement industry, as one of the primary contributors to global greenhouse gas emissions, accounts for 7% of the world’s carbon dioxide emissions. There is an urgent need to establish a rapid method for detecting cement plants to facilitate effective monitoring. In this study, a comprehensive method based on YOLOv5-IEG and the Thermal Signature Detection module using Google Earth optical imagery and SDGSAT-1 thermal infrared imagery was proposed to detect large-scale cement plant information, including geographic location and operational status. The improved algorithm demonstrated an increase of 4.8% in accuracy and a 7.7% improvement in [email protected]:95. In a specific empirical investigation in China, we successfully detected 781 large-scale cement plants with an accuracy of 90.8%. Specifically, of the 55 cement plants in Shandong Province, we identified 46 as operational and nine as non-operational. The successful application of advanced models and remote sensing technology in efficiently and accurately tracking the operational status of cement plants provides crucial support for environmental protection and sustainable development. Full article
Show Figures

Figure 1

17 pages, 18157 KiB  
Article
Deep Learning-Based Real-Time Detection of Surface Landmines Using Optical Imaging
by Emanuele Vivoli, Marco Bertini and Lorenzo Capineri
Remote Sens. 2024, 16(4), 677; https://doi.org/10.3390/rs16040677 - 14 Feb 2024
Cited by 1 | Viewed by 4668
Abstract
This paper presents a pioneering study in the application of real-time surface landmine detection using a combination of robotics and deep learning. We introduce a novel system integrated within a demining robot, capable of detecting landmines in real time with high recall. Utilizing [...] Read more.
This paper presents a pioneering study in the application of real-time surface landmine detection using a combination of robotics and deep learning. We introduce a novel system integrated within a demining robot, capable of detecting landmines in real time with high recall. Utilizing YOLOv8 models, we leverage both optical imaging and artificial intelligence to identify two common types of surface landmines: PFM-1 (butterfly) and PMA-2 (starfish with tripwire). Our system runs at 2 FPS on a mobile device missing at most 1.6% of targets. It demonstrates significant advancements in operational speed and autonomy, surpassing conventional methods while being compatible with other approaches like UAV. In addition to the proposed system, we release two datasets with remarkable differences in landmine and background colors, built to train and test the model performances. Full article
Show Figures

Figure 1

23 pages, 5562 KiB  
Article
SMFF-YOLO: A Scale-Adaptive YOLO Algorithm with Multi-Level Feature Fusion for Object Detection in UAV Scenes
by Yuming Wang, Hua Zou, Ming Yin and Xining Zhang
Remote Sens. 2023, 15(18), 4580; https://doi.org/10.3390/rs15184580 - 18 Sep 2023
Cited by 13 | Viewed by 3547
Abstract
Object detection in images captured by unmanned aerial vehicles (UAVs) holds great potential in various domains, including civilian applications, urban planning, and disaster response. However, it faces several challenges, such as multi-scale variations, dense scenes, complex backgrounds, and tiny-sized objects. In this paper, [...] Read more.
Object detection in images captured by unmanned aerial vehicles (UAVs) holds great potential in various domains, including civilian applications, urban planning, and disaster response. However, it faces several challenges, such as multi-scale variations, dense scenes, complex backgrounds, and tiny-sized objects. In this paper, we present a novel scale-adaptive YOLO framework called SMFF-YOLO, which addresses these challenges through a multi-level feature fusion approach. To improve the detection accuracy of small objects, our framework incorporates the ELAN-SW object detection prediction head. This newly designed head effectively utilizes both global contextual information and local features, enhancing the detection accuracy of tiny objects. Additionally, the proposed bidirectional feature fusion pyramid (BFFP) module tackles the issue of scale variations in object sizes by aggregating multi-scale features. To handle complex backgrounds, we introduce the adaptive atrous spatial pyramid pooling (AASPP) module, which enables adaptive feature fusion and alleviates the negative impact of cluttered scenes. Moreover, we adopt the Wise-IoU(WIoU) bounding box regression loss to enhance the competitiveness of different quality anchor boxes, which offers the framework a more informed gradient allocation strategy. We validate the effectiveness of SMFF-YOLO using the VisDrone and UAVDT datasets. Experimental results demonstrate that our model achieves higher detection accuracy, with AP50 reaching 54.3% for VisDrone and 42.4% for UAVDT datasets. Visual comparative experiments with other YOLO-based methods further illustrate the robustness and adaptability of our approach. Full article
Show Figures

Graphical abstract

20 pages, 3522 KiB  
Article
Detection Method of Infected Wood on Digital Orthophoto Map–Digital Surface Model Fusion Network
by Guangbiao Wang, Hongbo Zhao, Qing Chang, Shuchang Lyu, Binghao Liu, Chunlei Wang and Wenquan Feng
Remote Sens. 2023, 15(17), 4295; https://doi.org/10.3390/rs15174295 - 31 Aug 2023
Cited by 3 | Viewed by 1382
Abstract
Pine wilt disease (PWD) is a worldwide affliction that poses a significant menace to forest ecosystems. The swift and precise identification of pine trees under infection holds paramount significance in the proficient administration of this ailment. The progression of remote sensing and deep [...] Read more.
Pine wilt disease (PWD) is a worldwide affliction that poses a significant menace to forest ecosystems. The swift and precise identification of pine trees under infection holds paramount significance in the proficient administration of this ailment. The progression of remote sensing and deep learning methodologies has propelled the utilization of target detection and recognition techniques reliant on remote sensing imagery, emerging as the prevailing strategy for pinpointing affected trees. Although the existing object detection algorithms have achieved remarkable success, virtually all methods solely rely on a Digital Orthophoto Map (DOM), which is not suitable for diseased trees detection, leading to a large false detection rate in the detection of easily confused targets, such as bare land, houses, brown herbs and so on. In order to improve the ability of detecting diseased trees and preventing the spread of the epidemic, we construct a large-scale PWD detection dataset with both DOM and Digital Surface Model (DSM) images and propose a novel detection framework, DDNet, which makes full use of the spectral features and geomorphological spatial features of remote sensing targets. The experimental results show that the proposed joint network achieves an AP50 2.4% higher than the traditional deep learning network. Full article
Show Figures

Graphical abstract

20 pages, 7667 KiB  
Article
TPH-YOLOv5-Air: Airport Confusing Object Detection via Adaptively Spatial Feature Fusion
by Qiang Wang, Wenquan Feng, Lifan Yao, Chen Zhuang, Binghao Liu and Lijiang Chen
Remote Sens. 2023, 15(15), 3883; https://doi.org/10.3390/rs15153883 - 5 Aug 2023
Cited by 11 | Viewed by 2250
Abstract
Airport detection in remote sensing scenes is a crucial area of research, playing a key role in aircraft blind landing procedures. However, airport detection in remote sensing scenes still faces challenges such as class confusion, poor detection performance on multi-scale objects, and limited [...] Read more.
Airport detection in remote sensing scenes is a crucial area of research, playing a key role in aircraft blind landing procedures. However, airport detection in remote sensing scenes still faces challenges such as class confusion, poor detection performance on multi-scale objects, and limited dataset availability. To address these issues, this paper proposes a novel airport detection network (TPH-YOLOv5-Air) based on adaptive spatial feature fusion (ASFF). Firstly, we construct an Airport Confusing Object Dataset (ACD) specifically tailored for remote sensing scenarios containing 9501 instances of airport confusion objects. Secondly, building upon the foundation of TPH-YOLOv5++, we adopt the ASFF structure, which not only enhances the feature extraction efficiency but also enriches feature representation. Moreover, an adaptive spatial feature fusion (ASFF) strategy based on adaptive parameter adjustment module (APAM) is proposed, which improves the feature scale invariance and enhances the detection of airports. Finally, experimental results based on the ACD dataset demonstrate that TPH-YOLOv5-Air achieves a mean average precision (mAP) of 49.4%, outperforming TPH-YOLOv5++ by 2% and the original YOLOv5 network by 3.6%. This study contributes to the advancement of airport detection in remote sensing scenes and demonstrates the practical application potential of TPH-YOLOv5-Air in this domain. Visualization and analysis further validate the effectiveness and interpretability of TPH-YOLOv5-Air. The ACD dataset is publicly available. Full article
Show Figures

Figure 1

Other

Jump to: Research

27 pages, 5461 KiB  
Essay
BAFormer: A Novel Boundary-Aware Compensation UNet-like Transformer for High-Resolution Cropland Extraction
by Zhiyong Li, Youming Wang, Fa Tian, Junbo Zhang, Yijie Chen and Kunhong Li
Remote Sens. 2024, 16(14), 2526; https://doi.org/10.3390/rs16142526 - 10 Jul 2024
Viewed by 1182
Abstract
Utilizing deep learning for semantic segmentation of cropland from remote sensing imagery has become a crucial technique in land surveys. Cropland is highly heterogeneous and fragmented, and existing methods often suffer from inaccurate boundary segmentation. This paper introduces a UNet-like boundary-aware compensation model [...] Read more.
Utilizing deep learning for semantic segmentation of cropland from remote sensing imagery has become a crucial technique in land surveys. Cropland is highly heterogeneous and fragmented, and existing methods often suffer from inaccurate boundary segmentation. This paper introduces a UNet-like boundary-aware compensation model (BAFormer). Cropland boundaries typically exhibit rapid transformations in pixel values and texture features, often appearing as high-frequency features in remote sensing images. To enhance the recognition of these high-frequency features as represented by cropland boundaries, the proposed BAFormer integrates a Feature Adaptive Mixer (FAM) and develops a Depthwise Large Kernel Multi-Layer Perceptron model (DWLK-MLP) to enrich the global and local cropland boundaries features separately. Specifically, FAM enhances the boundary-aware method by adaptively acquiring high-frequency features through convolution and self-attention advantages, while DWLK-MLP further supplements boundary position information using a large receptive field. The efficacy of BAFormer has been evaluated on datasets including Vaihingen, Potsdam, LoveDA, and Mapcup. It demonstrates high performance, achieving mIoU scores of 84.5%, 87.3%, 53.5%, and 83.1% on these datasets, respectively. Notably, BAFormer-T (lightweight model) surpasses other lightweight models on the Vaihingen dataset with scores of 91.3% F1 and 84.1% mIoU. Full article
Show Figures

Figure 1

Back to TopTop