Deep Learning in Optical Satellite Images

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (15 September 2023) | Viewed by 26756

Special Issue Editors

Dr. Xiong Xu
Guest Editor
College of Surveying and Geoinformatics, Tongji University, Shanghai 200000, China
Interests: multi-source data fusion; optical image processing; target detection; 3D reconstruction; deep learning

Prof. Dr. Hongyan Zhang
Guest Editor
State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430079, China
Interests: image reconstruction; hyperspectral image processing; sparse representation; low-rank representation; remote sensing; machine learning; deep learning

Special Issue Information

Dear Colleagues,

Deep learning techniques have been introduced to achieve state-of-the-art performance in numerous fields of optical satellite image processing, such as land cover classification, object recognition, change detection, and 3D reconstruction. Today, multi-source optical satellite images with higher spatial resolution are publicly accessible, so the accumulated historical datasets can be better utilized. The significance of deep learning in optical satellite image processing is expected to grow continuously and to benefit an ever wider range of remote sensing research fields. However, many challenges remain in the robust and intelligent processing of optical satellite images with deep learning techniques. These challenges not only complicate image understanding but also call for more advanced computational methods and innovative applications. The objective of the present Special Issue is to cover the relevant topics, trends, and best practices regarding algorithms, models, analysis, and applications in the field. We welcome topics that include, but are not limited to, the following:

  • Deep learning-based image processing
  • Optical satellite image processing
  • Object recognition
  • Land cover classification
  • Change detection
  • Image matching
  • Multi-source image fusion
  • 3D terrain reconstruction

Dr. Xiong Xu
Prof. Dr. Hongyan Zhang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning-based image processing
  • optical satellite image processing
  • object recognition
  • land cover classification
  • change detection
  • image matching
  • multi-source image fusion
  • 3D terrain reconstruction

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found on the MDPI website.

Published Papers (11 papers)


Research


20 pages, 4338 KiB  
Article
Cross-Modal Retrieval and Semantic Refinement for Remote Sensing Image Captioning
by Zhengxin Li, Wenzhe Zhao, Xuanyi Du, Guangyao Zhou and Songlin Zhang
Remote Sens. 2024, 16(1), 196; https://doi.org/10.3390/rs16010196 - 3 Jan 2024
Cited by 4 | Viewed by 2452
Abstract
Two-stage remote sensing image captioning (RSIC) methods have achieved promising results by incorporating additional pre-trained remote sensing tasks to extract supplementary information and improve caption quality. However, these methods face limitations in semantic comprehension, as pre-trained detectors/classifiers are constrained by predefined labels, leading to an oversight of the intricate and diverse details present in remote sensing images (RSIs). Additionally, handling auxiliary remote sensing tasks separately can make it challenging to ensure seamless integration and alignment with the captioning process. To address these problems, we propose a novel cross-modal retrieval and semantic refinement (CRSR) RSIC method. Specifically, we employ a cross-modal retrieval model to retrieve sentences relevant to each image. The words in these retrieved sentences are then treated as primary semantic information, providing valuable supplementary information for the captioning process. To further enhance caption quality, we introduce a semantic refinement module that refines the primary semantic information, helping to filter out misleading information and emphasize visually salient semantic information. A Transformer Mapper network is introduced to expand the representation of image features beyond the retrieved supplementary information with learnable queries. Both the refined semantic tokens and the visual features are integrated and fed into a cross-modal decoder for caption generation. Through extensive experiments, we demonstrate the superiority of our CRSR method over existing state-of-the-art approaches on the RSICD, UCM-Captions, and Sydney-Captions datasets.
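To make the retrieval step concrete, here is a minimal sketch, assuming pre-computed image and sentence embeddings from some CLIP-style joint encoder; the function name, the toy random embeddings, and the word-pooling rule are illustrative assumptions, not the authors' implementation, and the paper's semantic refinement module would further filter the pooled words.

```python
# Minimal sketch of cross-modal retrieval of primary semantic words.
# All names and the toy embeddings are assumptions, not the paper's code.
import numpy as np

def retrieve_semantic_words(img_emb, sent_embs, sentences, k=3):
    """Pool words from the k corpus sentences most similar to the image."""
    img = img_emb / np.linalg.norm(img_emb)
    sents = sent_embs / np.linalg.norm(sent_embs, axis=1, keepdims=True)
    sims = sents @ img                      # cosine similarity per sentence
    top_k = np.argsort(sims)[::-1][:k]
    words = {w for i in top_k for w in sentences[i].lower().split()}
    return sorted(words), top_k

# Toy usage: random vectors stand in for a real joint encoder.
rng = np.random.default_rng(0)
corpus = ["many planes are parked near the terminal",
          "a dense residential area beside a river"]
words, top = retrieve_semantic_words(rng.normal(size=512),
                                     rng.normal(size=(2, 512)), corpus)
```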

18 pages, 4178 KiB  
Article
Hierarchical Feature Association and Global Correction Network for Change Detection
by Jinquan Lu, Xiangchao Meng, Qiang Liu, Zhiyong Lv, Gang Yang, Weiwei Sun and Wei Jin
Remote Sens. 2023, 15(17), 4141; https://doi.org/10.3390/rs15174141 - 24 Aug 2023
Viewed by 1307
Abstract
Optical satellite image change detection has attracted extensive research due to its wide application in Earth observation. Recently, deep learning (DL)-based methods have become dominant in change detection owing to their outstanding performance. Remote sensing (RS) images contain ground objects of different sizes, so information at different scales is crucial for change detection. However, existing DL-based methods employ only summation or concatenation to aggregate several layers of features, lacking semantic association across layers. On the other hand, the UNet-like backbone is favored by deep learning algorithms, but the gradual downscaling and upscaling operations in the mainstream UNet-like backbone suffer from misalignment, which further affects the accuracy of change detection. In this paper, we propose a hierarchical feature association and global correction network (HFA-GCN) for change detection. Specifically, a hierarchical feature association module is designed to model the correlations among features at different scales, exploiting the redundant but complementary information among them. Moreover, a Transformer-based global correction module is proposed to alleviate feature misalignment in the UNet-like backbone; through feature reuse, it extracts global information to reduce false alarms and missed detections. Experiments were conducted on several publicly available datasets, and the results show that the proposed method is superior to existing state-of-the-art change detection models.
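To illustrate the fusion limitation the paper targets, the sketch below replaces plain summation of multi-scale features with a learned, attention-weighted combination; it is a generic stand-in under assumed shapes, not the hierarchical feature association or global correction modules themselves.

```python
# Generic attention-weighted multi-scale fusion (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveScaleFusion(nn.Module):
    def __init__(self, channels, n_scales):
        super().__init__()
        # Predict one weight per scale from globally pooled features.
        self.gate = nn.Linear(channels * n_scales, n_scales)

    def forward(self, feats):
        """feats: list of (B, C, Hi, Wi) maps; returns (B, C, H0, W0)."""
        target = feats[0].shape[-2:]
        ups = [F.interpolate(f, size=target, mode="bilinear",
                             align_corners=False) for f in feats]
        pooled = torch.cat([f.mean(dim=(2, 3)) for f in ups], dim=1)
        w = torch.softmax(self.gate(pooled), dim=1)     # (B, n_scales)
        return sum(w[:, i, None, None, None] * ups[i]
                   for i in range(len(ups)))
```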

22 pages, 44590 KiB  
Article
A Coarse-to-Fine Feature Match Network Using Transformers for Remote Sensing Image Registration
by Chenbin Liang, Yunyun Dong, Changjun Zhao and Zengguo Sun
Remote Sens. 2023, 15(13), 3243; https://doi.org/10.3390/rs15133243 - 23 Jun 2023
Cited by 3 | Viewed by 2264
Abstract
Feature matching is a core step in feature-based multi-source remote sensing image registration. However, existing methods, whether the classical SIFT algorithm or deep learning-based approaches, essentially rely on generating descriptors from local regions around feature points, which can lead to low matching success rates under various challenges, including gray-scale changes, content changes, local similarity, and occlusions between images. Inspired by the human approach of first finding rough corresponding regions globally and then carefully comparing local regions, as well as by the excellent global attention property of transformers, the proposed feature matching network adopts a coarse-to-fine matching strategy that utilizes both global and local information between images to predict corresponding feature points. Importantly, the network can flexibly match corresponding points for arbitrary feature points and can be trained effectively without strong supervision of corresponding feature points, requiring only the true geometric transformation between images. Qualitative experiments illustrate the effectiveness of the proposed network by matching feature points extracted by SIFT or sampled uniformly. In the quantitative experiments, we used feature points extracted by SIFT, SuperPoint, and LoFTR as the keypoints to be matched, calculated the mean match success ratio (MSR) and mean reprojection error (MRE) of each method at different thresholds on the test dataset, and plotted boxplots to visualize the distributions. Comparing the MSR and MRE values and their distributions shows that the proposed method consistently outperforms the comparison methods in terms of MSR at different thresholds, while its MRE remains within a reasonable range compared with that of the other methods.
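The coarse stage of such a coarse-to-fine scheme is commonly realized as mutual (dual-softmax) matching on a global correlation matrix between coarse feature grids, the mechanism popularized by LoFTR; the sketch below shows that generic mechanism under assumed inputs, not the paper's exact network.

```python
# Dual-softmax coarse matching over flattened coarse-grid descriptors.
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coarse_match(feat_a, feat_b, temperature=0.1, threshold=0.2):
    """feat_a: (Na, d), feat_b: (Nb, d); returns (n, 2) index pairs."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    corr = a @ b.T / temperature                 # global correlation matrix
    p = softmax(corr, axis=1) * softmax(corr, axis=0)  # dual-softmax scores
    j = p.argmax(axis=1)                         # best candidate in b per a
    mutual = p.argmax(axis=0)[j] == np.arange(len(a))  # mutual-best check
    keep = mutual & (p.max(axis=1) > threshold)
    return np.stack([np.nonzero(keep)[0], j[keep]], axis=1)
```

A fine stage would then refine each retained pair inside a local window around the coarse location.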

20 pages, 27522 KiB  
Article
A Multi-Scale Edge Constraint Network for the Fine Extraction of Buildings from Remote Sensing Images
by Zhenqing Wang, Yi Zhou, Futao Wang, Shixin Wang, Gang Qin, Weijie Zou and Jinfeng Zhu
Remote Sens. 2023, 15(4), 927; https://doi.org/10.3390/rs15040927 - 8 Feb 2023
Cited by 9 | Viewed by 2026
Abstract
Building extraction based on remote sensing images has been widely used in many industries. However, state-of-the-art methods produce incomplete segmentations of buildings owing to unstable multi-scale context aggregation and a lack of consideration of semantic boundaries, ultimately resulting in large uncertainties in predictions at building boundaries. In this study, efficient fine building extraction methods were explored, demonstrating that the rational use of edge features can significantly improve building recognition performance. Herein, a fine building extraction network based on a multi-scale edge constraint (MEC-Net) was proposed, which integrates the multi-scale feature fusion advantages of UNet++ and fuses edge features with other learnable multi-scale features to achieve the effect of prior constraints. Attention was paid to alleviating noise interference in the edge features. At the data level, copy-paste was adapted to the characteristics of remote sensing imaging, yielding a data augmentation method for buildings (build-building) that increases the number and diversity of positive samples by simulating the construction of buildings, thereby improving the generalization of MEC-Net. MEC-Net achieved 91.13%, 81.05% and 74.13% IoU on the WHU, Massachusetts and Inria datasets, respectively, with good inference efficiency. The experimental results show that MEC-Net outperforms state-of-the-art methods, and that it improves the accuracy of building boundaries by rationally using prior edge features.
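As a rough illustration of the copy-paste mechanics behind a build-building style augmentation (the paper's placement and appearance rules are more elaborate than this), assuming uint8 tiles with binary building masks:

```python
# Paste one building instance from a source tile into a target tile,
# avoiding overlap with existing buildings. Illustrative sketch only.
import numpy as np

def paste_building(img_t, mask_t, img_s, mask_s, rng, n_tries=10):
    """img_*: (H, W, 3) uint8; mask_*: (H, W) uint8 in {0, 1}."""
    out_img, out_mask = img_t.copy(), mask_t.copy()
    ys, xs = np.nonzero(mask_s)
    if len(ys) == 0:
        return out_img, out_mask                 # nothing to paste
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    patch = img_s[y0:y1, x0:x1]
    pmask = mask_s[y0:y1, x0:x1].astype(bool)
    ph, pw = pmask.shape
    H, W = mask_t.shape
    for _ in range(n_tries):
        ty = rng.integers(0, H - ph + 1)
        tx = rng.integers(0, W - pw + 1)
        if out_mask[ty:ty + ph, tx:tx + pw][pmask].any():
            continue                             # would overlap a building
        out_img[ty:ty + ph, tx:tx + pw][pmask] = patch[pmask]
        out_mask[ty:ty + ph, tx:tx + pw][pmask] = 1
        break
    return out_img, out_mask
```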

19 pages, 3969 KiB  
Article
Dictionary Learning for Few-Shot Remote Sensing Scene Classification
by Yuteng Ma, Junmin Meng, Baodi Liu, Lina Sun, Hao Zhang and Peng Ren
Remote Sens. 2023, 15(3), 773; https://doi.org/10.3390/rs15030773 - 29 Jan 2023
Cited by 4 | Viewed by 1973
Abstract
As deep learning-based methods continue to grow, even in fields where data are scarce, few-shot remote sensing scene classification (FSRSSC) has received considerable attention. One mainstream approach uses base data to train a feature extractor (FE) in the pre-training phase and employs novel data to design the classifier and complete the classification task in the meta-test phase. Due to the scarcity of remote sensing data, obtaining a suitable feature extractor for remote sensing data and designing a robust classifier have become two major challenges. In this paper, we propose a novel dictionary learning (DL) algorithm for few-shot remote sensing scene classification to address these two difficulties. First, we use natural image datasets with sufficient data to obtain a pre-trained feature extractor and fine-tune its parameters on the remote sensing dataset to make it suitable for remote sensing data. Second, we design a kernel-space classifier to map the features to a high-dimensional space and embed the label information into the dictionary learning to improve the discriminative power of the features for classification. Extensive experiments on four popular remote sensing scene classification datasets demonstrate the effectiveness of the proposed dictionary learning method.
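As a simplified rendering of classification by dictionary reconstruction error (omitting the paper's kernel space and label embedding, so this is a sparse-representation-style baseline rather than the proposed method):

```python
# Sparse-code a query feature against per-class dictionaries and pick
# the class whose dictionary reconstructs it best. Illustrative sketch.
import numpy as np

def ista(D, y, lam=0.1, n_iter=100):
    """Solve min_a 0.5 * ||y - D a||^2 + lam * ||a||_1 via ISTA."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = a + D.T @ (y - D @ a) / L    # gradient step
        a = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # shrinkage
    return a

def classify(class_dicts, y):
    """class_dicts: {label: (d, n_atoms) array of support features}."""
    errs = {label: np.linalg.norm(y - D @ ista(D, y))
            for label, D in class_dicts.items()}
    return min(errs, key=errs.get)       # smallest reconstruction error wins
```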

22 pages, 85587 KiB  
Article
Remote Sensing Crop Recognition by Coupling Phenological Features and Off-Center Bayesian Deep Learning
by Yongchuang Wu, Penghai Wu, Yanlan Wu, Hui Yang and Biao Wang
Remote Sens. 2023, 15(3), 674; https://doi.org/10.3390/rs15030674 - 23 Jan 2023
Cited by 7 | Viewed by 2799
Abstract
Obtaining accurate and timely crop area information is crucial for crop yield estimation and food security. Because most existing crop mapping models based on remote sensing data generalize poorly, they cannot be rapidly deployed for crop identification tasks in different regions. Based on a priori knowledge of phenology, we designed an off-center Bayesian deep learning crop classification method that highlights phenological features, combined with an attention mechanism and residual connectivity. In this paper, we first optimize the input image and input features based on a phenology analysis. Then, a convolutional neural network (CNN), a recurrent neural network (RNN), and a random forest classifier (RFC) were built on farm data from northeastern Inner Mongolia for comparison with the proposed method. Classification tests were then performed on soybean, maize, and rice in four measurement areas in northeastern China to verify the accuracy of the above methods. To further explore the reliability of the proposed method, an uncertainty analysis was conducted via Bayesian deep learning to analyze the model's learning process and structure for interpretability. Finally, statistical data collected over many years in Suibin County, Heilongjiang Province, and in Shandong Province in 2020 were used as reference data to verify the applicability of the methods. The experimental results show that the overall classification accuracy of the three crops reached 90.73%, with an average F1 and IoU of 89.57% and 81.48%, respectively. Furthermore, the proposed method can be directly applied to crop area estimation in other regions and years, given its good agreement with official statistics.

22 pages, 8592 KiB  
Article
Improving Spatial Resolution of Satellite Imagery Using Generative Adversarial Networks and Window Functions
by Kinga Karwowska and Damian Wierzbicki
Remote Sens. 2022, 14(24), 6285; https://doi.org/10.3390/rs14246285 - 12 Dec 2022
Cited by 10 | Viewed by 4567
Abstract
Dynamic technological progress has contributed to the development of Earth-surface imaging systems as well as data mining methods. One such example is super-resolution (SR) techniques, which improve the spatial resolution of satellite imagery on the basis of a low-resolution (LR) image and an algorithm using deep neural networks. A limitation of these solutions is the input size parameter, which defines the image size accepted by a given neural network. Unfortunately, the value of this parameter is often much smaller than the size of the images obtained by Earth observation satellites. In this article, we present a new methodology for improving the resolution of an entire satellite image using a window function. In addition, we conducted research on improving the resolution of satellite images acquired with the WorldView-2 satellite using the ESRGAN network and determined the number of buffer pixels that makes it possible to obtain the best image quality. The best reconstruction of the entire satellite image with generative neural networks was obtained using a triangular window (with 10% coverage); the Hann-Poisson window worked best when more overlap between images was used.
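The window-function methodology can be made concrete as follows: super-resolve overlapping tiles, weight each super-resolved tile with a 2-D window, accumulate, and normalize by the summed weights. The sketch below uses a triangular (Bartlett) window and a placeholder sr_model; the tile, overlap, and scale values are illustrative, not the paper's settings, and border tiles are skipped for brevity.

```python
# Window-weighted blending of overlapping super-resolved tiles.
import numpy as np

def blend_tiles(lr, sr_model, tile=128, overlap=12, scale=4):
    """lr: (H, W) float image; sr_model maps a (tile, tile) patch to a
    (tile*scale, tile*scale) patch. Returns the blended SR mosaic."""
    w1d = np.bartlett(tile * scale)
    w2d = np.outer(w1d, w1d) + 1e-8          # separable 2-D triangular window
    H, W = lr.shape
    out = np.zeros((H * scale, W * scale))
    acc = np.zeros_like(out)                 # accumulated window weights
    step = tile - overlap
    for y in range(0, H - tile + 1, step):
        for x in range(0, W - tile + 1, step):
            sr = sr_model(lr[y:y + tile, x:x + tile])
            ys, xs = y * scale, x * scale
            out[ys:ys + tile * scale, xs:xs + tile * scale] += sr * w2d
            acc[ys:ys + tile * scale, xs:xs + tile * scale] += w2d
    return out / np.maximum(acc, 1e-8)
```

Because the Bartlett window tapers to zero at tile edges, seams between tiles are down-weighted, which is the intuition behind the windowed reconstruction.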

19 pages, 23376 KiB  
Article
EMO-MVS: Error-Aware Multi-Scale Iterative Variable Optimizer for Efficient Multi-View Stereo
by Huizhou Zhou, Haoliang Zhao, Qi Wang, Liang Lei, Gefei Hao, Yusheng Xu and Zhen Ye
Remote Sens. 2022, 14(23), 6085; https://doi.org/10.3390/rs14236085 - 30 Nov 2022
Cited by 12 | Viewed by 2193
Abstract
Efficient dense reconstruction of objects or scenes has substantial practical implications and can be applied to various 3D tasks (for example, robotics and autonomous driving). However, because of the expensive hardware required and the overall complexity of all-around scenarios, efficient dense reconstruction using lightweight multi-view stereo (MVS) methods has received much attention from researchers. The technological challenge of efficient dense reconstruction is maintaining low memory usage while rapidly and reliably acquiring depth maps. Most current efficient MVS methods perform poorly in efficient dense reconstruction; this poor performance is mainly due to weak generalization and unrefined object edges in the depth maps. To this end, we propose EMO-MVS, which aims to accomplish multi-view stereo tasks with high efficiency, meaning low memory consumption, high accuracy, and excellent generalization performance. In detail, we first propose an iterative variable optimizer to accurately estimate depth changes. Then, we design a multi-level absorption unit that expands the receptive field and efficiently generates an initial depth map. In addition, we propose an error-aware enhancement module that enhances the initial depth map by optimizing the projection error between multiple views. We conducted extensive experiments on the challenging Tanks and Temples and DTU datasets and performed a complete visual comparison on the BlendedMVS validation set (which contains many aerial scene images), achieving promising performance on all datasets. Among lightweight MVS methods with low memory consumption and fast inference speed, our F-score on the online Tanks and Temples intermediate benchmark is the highest, which shows the best competitiveness in balancing performance and computational cost.

19 pages, 15476 KiB  
Article
Physical-Based Spatial-Spectral Deep Fusion Network for Chlorophyll-a Estimation Using MODIS and Sentinel-2 MSI Data
by Yuting He, Penghai Wu, Xiaoshuang Ma, Jie Wang and Yanlan Wu
Remote Sens. 2022, 14(22), 5828; https://doi.org/10.3390/rs14225828 - 17 Nov 2022
Cited by 1 | Viewed by 1844
Abstract
Satellite-derived chlorophyll-a (Chl-a) is an important environmental indicator for monitoring water environments. However, available satellite images have either a coarse spatial or a low spectral resolution, which restricts the applicability of Chl-a retrieval in coastal water (e.g., less than 1 km from the shoreline) for large- and medium-sized lakes and oceans. Taking Lake Chaohu as the study area, this paper proposes a physical-based spatial-spectral deep fusion network (PSSDFN) for Chl-a retrieval using Moderate Resolution Imaging Spectroradiometer (MODIS) and Sentinel-2 Multispectral Instrument (MSI) reflectance data. The PSSDFN combines residual connectivity and attention mechanisms to extract effective features, and introduces physical constraints, including spectral response functions and a physical degradation model, to reconcile spatial and spectral information. The fused and MSI data were used as input variables for collaborative retrieval, while only the MSI data were used for MSI retrieval. Combined with Chl-a field data, a comparison between MSI and collaborative retrieval was conducted using four machine learning models. The results showed that collaborative retrieval greatly improves accuracy compared with MSI retrieval. This research illustrates that the PSSDFN can improve the estimation accuracy of Chl-a for coastal water (less than 1 km from the shoreline) in large- and medium-sized lakes and oceans.
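The physical constraints can be read as two differentiable forward models that the fused product should satisfy. The sketch below renders this generically under assumed operators (a given SRF matrix and a box-blur stand-in for the sensor's spatial response); which observation pairs with which operator, and the exact loss form, depend on the fusion design and are not taken from the paper.

```python
# Generic physics-consistency terms for a spatial-spectral fusion product.
import numpy as np

def spectral_degrade(x, srf):
    """x: (B_f, H, W) fine-band cube; srf: (B_c, B_f), rows sum to 1."""
    return np.tensordot(srf, x, axes=1)       # -> (B_c, H, W)

def spatial_degrade(x, factor=4):
    """Box blur + decimation as a crude stand-in for the spatial PSF."""
    B, H, W = x.shape
    x = x[:, :H - H % factor, :W - W % factor]
    x = x.reshape(B, -1, factor, x.shape[2] // factor, factor)
    return x.mean(axis=(2, 4))

def physics_loss(fused, obs_bands, obs_coarse, srf, factor=4):
    """Mean-squared violation of the two forward models."""
    spec = np.mean((spectral_degrade(fused, srf) - obs_bands) ** 2)
    spat = np.mean((spatial_degrade(fused, factor) - obs_coarse) ** 2)
    return spec + spat
```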

22 pages, 6720 KiB  
Article
Bias Analysis and Correction for Ill-Posed Inversion Problem with Sparsity Regularization Based on L1 Norm for Azimuth Super-Resolution of Radar Forward-Looking Imaging
by Jie Han, Songlin Zhang, Shouzhu Zheng, Minghua Wang, Haiyong Ding and Qingyun Yan
Remote Sens. 2022, 14(22), 5792; https://doi.org/10.3390/rs14225792 - 16 Nov 2022
Cited by 3 | Viewed by 1714
Abstract
Sparsity regularization based on the L1 norm can significantly stabilize the solution of ill-posed sparse inversion problems, e.g., azimuth super-resolution of radar forward-looking imaging, where it effectively suppresses noise and reduces the blurring effect of the convolution kernel. In practice, the total variation (TV) and TV-sparsity (TVS) regularizations based on the L1 norm are widely adopted for solving such ill-posed problems. Generally, however, the existence of bias is ignored, which is theoretically incomplete. This paper focuses on analyzing the partially biased property of the L1 norm; on this basis, we derive the partially bias-corrected solutions of TVS and TV, which improves the rigor of the theory. Finally, two groups of experimental results show that the proposed methods with partial bias correction preserve higher quality than those without bias correction: visually, they distinguish adjacent targets, suppress noise, and preserve the shape and size of targets, and the improvements in the Peak Signal-to-Noise Ratio, Structural Similarity, and Sum-of-Squared-Errors indexes are 2.15%, 1.88%, and 4.14% overall, respectively. These results confirm the theoretical rigor and practical feasibility of the partially bias-corrected solution with L1-norm sparsity regularization.
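For orientation, the objectives in question can be written in a standard form; the notation below is ours (A an imaging/convolution operator, b the measured data, D a finite-difference operator), not necessarily the paper's, and the shrinkage identity illustrates where the bias comes from rather than reproducing the paper's partial correction.

```latex
% Standard TV- and TVS-regularized inversions (notation assumed here):
\hat{\mathbf{x}}_{\mathrm{TV}}
  = \arg\min_{\mathbf{x}}\; \tfrac{1}{2}\lVert \mathbf{A}\mathbf{x}-\mathbf{b}\rVert_2^2
    + \lambda \lVert \mathbf{D}\mathbf{x}\rVert_1,
\qquad
\hat{\mathbf{x}}_{\mathrm{TVS}}
  = \arg\min_{\mathbf{x}}\; \tfrac{1}{2}\lVert \mathbf{A}\mathbf{x}-\mathbf{b}\rVert_2^2
    + \lambda_1 \lVert \mathbf{D}\mathbf{x}\rVert_1
    + \lambda_2 \lVert \mathbf{x}\rVert_1 .
% The proximal map of the L1 term is soft-thresholding, which shrinks
% every surviving coefficient by \lambda -- the source of the bias:
\operatorname{soft}_{\lambda}(g) = \operatorname{sign}(g)\,\max\bigl(\lvert g\rvert - \lambda,\; 0\bigr).
```

A common remedy is to re-estimate the coefficients on the recovered support without the L1 penalty; deriving which components of the TV/TVS solutions admit such correction is the contribution claimed above.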

Other


17 pages, 19458 KiB  
Technical Note
CamoNet: A Target Camouflage Network for Remote Sensing Images Based on Adversarial Attack
by Yue Zhou, Wanghan Jiang, Xue Jiang, Lin Chen and Xingzhao Liu
Remote Sens. 2023, 15(21), 5131; https://doi.org/10.3390/rs15215131 - 27 Oct 2023
Cited by 2 | Viewed by 1884
Abstract
Object detection algorithms based on convolutional neural networks (CNNs) have achieved remarkable success in remote sensing images (RSIs), for example in aircraft and ship detection, which play a vital role in military and civilian fields. However, CNNs are fragile and can easily be fooled. There has been a series of studies on adversarial attacks for image classification in RSIs; however, existing gradient attack algorithms designed for classification do not perform well when directly applied to object detection, an essential task in RSI understanding. Although some works address adversarial attacks for object detection, they are weak in concealment and easily detected by the naked eye. To handle these problems, we propose a target camouflage network for object detection in RSIs, called CamoNet, which deceives CNN-based detectors by adding an imperceptible perturbation to the image. In addition, we propose a detection space initialization strategy that maximizes the diversity of the detector's outputs among the generated samples, enhancing the performance of gradient attack algorithms on the object detection task. Moreover, a key pixel distillation module is employed, which further reduces the number of modified pixels without weakening the concealment effect. Compared with several of the most advanced adversarial attacks, the proposed attack has advantages in terms of both peak signal-to-noise ratio (PSNR) and attack success rate. The transferability of the proposed target camouflage network is evaluated on three dominant detection algorithms (RetinaNet, Faster R-CNN, and RTMDet) with two commonly used remote sensing datasets (i.e., DOTA and DIOR).
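CamoNet builds on iterative gradient attacks; a generic PGD-style sketch with an L-infinity budget is shown below. Here detector_loss is a placeholder for a differentiable detection loss (e.g., summed detection confidence), and the paper's detection space initialization and key pixel distillation modules are not reproduced.

```python
# Generic PGD-style camouflage perturbation with an L-infinity budget.
import torch

def camouflage(image, detector_loss, eps=8 / 255, alpha=1 / 255, steps=40):
    """image: (1, 3, H, W) in [0, 1]; returns an adversarial image whose
    perturbation is clipped to +/- eps to stay near-imperceptible."""
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = detector_loss(adv)            # detection score to suppress
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv - alpha * grad.sign()  # descend: weaken detections
            adv = image + (adv - image).clamp(-eps, eps)
            adv = adv.clamp(0, 1)
        adv = adv.detach()
    return adv
```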
