State of the Art in Object Detection Based on Computer Vision and Image Processing

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (20 June 2024).

Special Issue Editors


Guest Editor: Prof. Dr. Jong-Eun Ha
Department of Mechanical and Automotive Engineering, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
Interests: object detection; semantic segmentation; image classification; scene understanding

Guest Editor: Prof. Dr. Hyoseok Hwang
Department of Software Convergence, Kyung Hee University, Yongin-si, Gyeonggi-do, Republic of Korea
Interests: object detection; 3D perception; robot vision

Guest Editor: Dr. Ronghui Zhan
College of Electronic Science and Engineering, National University of Defense Technology, Changsha 410073, China
Interests: SAR target detection; SAR target classification; SAR image processing with deep learning

Special Issue Information

Dear Colleagues,

Object detection is one of the most fundamental and challenging topics in remote sensing image analysis, where satellite imagery, aerial imagery, and UAV platforms are all used for surveillance, tracking, and positioning services. At the same time, many new artificial intelligence-based detection methods are being developed rapidly. Artificial intelligence, and in particular computer vision with deep neural networks, can be an extremely effective tool for automatic object detection, able to analyze large amounts of data efficiently. The purpose of object detection is to accurately locate and classify objects. However, unlike the excellent performance deep learning has shown in image classification, object detection still requires significant performance improvements. Due to the complex attributes and variation of objects, most existing methods still have drawbacks: they can easily lose or mislocate objects, and open questions remain regarding how to detect unknown classes with image processing and how to accurately detect 3D objects.

It is our pleasure to announce the launch of a new Special Issue of Remote Sensing, the goal of which is to gather the latest research on applications of remote sensing, covering any or all aspects of image processing for image enhancement, object detection, and anomaly detection. We also welcome papers in which artificial intelligence methods are not applied directly to image processing but are used comprehensively for multi-target detection. Articles may address, but are not limited to, the following topics:

  • Object detection with sensor fusion;
  • Small object detection;
  • Object detection with multimodal information fusion;
  • Occluded object detection;
  • Weakly supervised/unsupervised object detection;
  • Review papers and datasets for object detection.

Prof. Dr. Jong-Eun Ha
Prof. Dr. Hyoseok Hwang
Dr. Ronghui Zhan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • object detection
  • salient object detection
  • weakly supervised object detection
  • sensor fusion
  • deep learning
  • computer vision

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)


Research

20 pages, 63466 KiB  
Article
A Step-Wise Domain Adaptation Detection Transformer for Object Detection under Poor Visibility Conditions
by Gege Zhang, Luping Wang and Zengping Chen
Remote Sens. 2024, 16(15), 2722; https://doi.org/10.3390/rs16152722 - 25 Jul 2024
Abstract
To address the performance degradation of cross-domain object detection under various illumination conditions and adverse weather scenarios, this paper introduces a novel method called the Step-wise Domain Adaptation DEtection TRansformer (SDA-DETR). Our approach decomposes the adaptation process into three sequential steps, progressively transferring knowledge from a labeled dataset to an unlabeled one using the DETR (DEtection TRansformer) architecture. Each step precisely reduces domain discrepancy, thereby facilitating effective transfer learning. In the initial step, a target-like domain is constructed as an auxiliary to the source domain to reduce the domain gap at the image level. Then, we adaptively align the source domain and target domain features at both global and local levels. To further mitigate model bias towards the source domain, we develop a token-masked autoencoder (t-MAE) to enhance target domain features at the semantic level. Comprehensive experiments demonstrate that SDA-DETR outperforms several popular cross-domain object detection methods on three challenging public driving datasets.
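As a rough illustration of the token-masking idea mentioned in the abstract, the following is a minimal PyTorch sketch of a token-masked autoencoder objective: a fraction of feature tokens is replaced by a learned mask token, and a small Transformer is trained to reconstruct the hidden tokens. The module name, mask ratio, and tensor shapes are illustrative assumptions, not the paper's actual t-MAE implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TokenMaskedAutoencoder(nn.Module):
    """Mask a fraction of feature tokens and reconstruct them (sketch)."""
    def __init__(self, dim=256, num_layers=2, mask_ratio=0.5):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, tokens):                       # tokens: (B, N, dim)
        B, N, D = tokens.shape
        num_masked = int(N * self.mask_ratio)
        # choose a random subset of token positions per sample
        idx = torch.rand(B, N, device=tokens.device).argsort(dim=1)
        masked = idx[:, :num_masked].unsqueeze(-1).expand(-1, -1, D)
        corrupted = tokens.clone()
        corrupted.scatter_(1, masked, self.mask_token.expand(B, num_masked, D))
        recon = self.decoder(corrupted)
        # reconstruction loss only on the masked positions
        return F.mse_loss(recon.gather(1, masked), tokens.gather(1, masked))
```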

18 pages, 6007 KiB  
Article
Instantaneous Extraction of Indoor Environment from Radar Sensor-Based Mapping
by Seonmin Cho, Seungheon Kwak and Seongwook Lee
Remote Sens. 2024, 16(3), 574; https://doi.org/10.3390/rs16030574 - 2 Feb 2024
Abstract
In this paper, we propose a method for extracting the structure of an indoor environment using radar. When radar is used indoors, ghost targets appear through the multipath propagation of radio waves. These ghost targets obstruct accurate mapping of the indoor environment and consequently hinder its extraction. We therefore propose a deep learning-based method that uses image-to-image translation to extract the structure of the indoor environment by removing ghost targets from the indoor environment map. The proposed method employs a conditional generative adversarial network (CGAN) comprising a U-Net-based generator and a PatchGAN-based discriminator. By repeatedly judging whether the generated indoor structure is real or fake, the CGAN ultimately returns a structure similar to the real environment. First, we generate a map of the indoor environment using radar, which includes ghost targets. Next, the structure of the indoor environment is extracted from the map using the proposed method. We then compare the proposed method, using the structural similarity index and structural content as metrics, against environment extraction methods based on the k-nearest neighbors algorithm, the Hough transform, and density-based spatial clustering of applications with noise (DBSCAN). Compared with these methods, our approach extracts a more accurate environment without requiring parameter adjustments, even when the environment changes.
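The CGAN described above follows the familiar conditional-GAN recipe. Below is a hedged, pix2pix-style training step, assuming a U-Net `generator` and a PatchGAN `discriminator` are provided as ordinary PyTorch modules; the function name, loss weight, and channel-concatenation conditioning are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()   # adversarial loss on PatchGAN logits
l1 = nn.L1Loss()               # pixel-level reconstruction term

def cgan_step(generator, discriminator, g_opt, d_opt,
              radar_map, clean_map, l1_weight=100.0):
    """One conditional-GAN update: radar map in, ghost-free layout out."""
    fake = generator(radar_map)

    # Discriminator: real (radar, clean) pairs vs. generated pairs.
    d_opt.zero_grad()
    real_logits = discriminator(torch.cat([radar_map, clean_map], dim=1))
    fake_logits = discriminator(torch.cat([radar_map, fake.detach()], dim=1))
    d_loss = (bce(real_logits, torch.ones_like(real_logits))
              + bce(fake_logits, torch.zeros_like(fake_logits)))
    d_loss.backward()
    d_opt.step()

    # Generator: fool the discriminator while staying close to ground truth.
    g_opt.zero_grad()
    fake_logits = discriminator(torch.cat([radar_map, fake], dim=1))
    g_loss = bce(fake_logits, torch.ones_like(fake_logits)) + l1_weight * l1(fake, clean_map)
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```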

17 pages, 13421 KiB  
Article
Inter-Domain Invariant Cross-Domain Object Detection Using Style and Content Disentanglement for In-Vehicle Images
by Zhipeng Jiang, Yongsheng Zhang, Ziquan Wang, Ying Yu, Zhenchao Zhang, Mengwei Zhang, Lei Zhang and Binbin Cheng
Remote Sens. 2024, 16(2), 304; https://doi.org/10.3390/rs16020304 - 11 Jan 2024
Abstract
The accurate detection of relevant vehicles, pedestrians, and other targets on the road plays a crucial role in ensuring the safety of autonomous driving. In recent years, object detectors based on Transformers or CNNs have achieved excellent performance in the fully supervised paradigm. However, when a trained model is applied directly to unfamiliar scenes where the training and testing data are statistically differently distributed, its performance may decrease dramatically. To address this issue, unsupervised domain adaptive object detection methods have been proposed. However, these methods often exhibit decreasing performance as the gap between the source and target domains increases. Previous works mainly focused on exploiting the style gap to reduce the domain gap while ignoring the content gap. To tackle this challenge, we introduce a novel method called IDI-SCD that addresses the style and content gaps simultaneously. Firstly, the domain gap is reduced by disentangling it into a style gap and a content gap, generating corresponding intermediate domains along the way. Secondly, during training, we focus on a single domain gap at a time to achieve inter-domain invariance; that is, the content gap is tackled while maintaining the style gap, and vice versa. In addition, a style-invariant loss is used to narrow the style gap, and the mean teacher self-training framework is used to narrow the content gap. Finally, we introduce a multiscale fusion strategy to enhance the quality of pseudo-labels, which mainly focuses on improving detection of extreme-scale objects (very large or very small objects). We conduct extensive experiments on four mainstream datasets of in-vehicle images. The experimental results demonstrate the effectiveness of our method and its superiority over most existing methods.
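The mean teacher self-training framework mentioned above maintains a teacher model as an exponential moving average (EMA) of the student; the teacher then pseudo-labels unlabeled target-domain images on which the student is trained. A minimal sketch of the EMA update, with an assumed momentum value, is shown below.

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, momentum=0.999):
    """Teacher weights become an EMA of the student: t = m*t + (1-m)*s."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)
```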

22 pages, 6000 KiB  
Article
SACuP: Sonar Image Augmentation with Cut and Paste Based DataBank for Semantic Segmentation
by Sundong Park, Yoonyoung Choi and Hyoseok Hwang
Remote Sens. 2023, 15(21), 5185; https://doi.org/10.3390/rs15215185 - 31 Oct 2023
Abstract
In this paper, we introduce Sonar image Augmentation with Cut and Paste based DataBank for semantic segmentation (SACuP), a novel data augmentation framework specifically designed for sonar imagery. Unlike traditional methods that often overlook the distinctive traits of sonar images, SACuP effectively harnesses these unique characteristics, including shadows and noise. SACuP operates at the level of individual objects, differentiating it from conventional augmentation methods applied to entire images or object groups; it improves semantic segmentation performance while carefully preserving the unique properties of acoustic images, which sets it apart from other approaches. Importantly, this augmentation process requires no additional manual work, as it seamlessly leverages existing images and masks. Our extensive evaluations contrasting SACuP with established augmentation methods reveal its superior performance, registering a 1.10% gain in mean intersection over union (mIoU) over the baseline. Furthermore, our ablation study elucidates the contributions of individual and combined augmentation methods, such as cut and paste, brightness adjustment, and shadow generation, to model enhancement. We anticipate SACuP's versatility in augmenting scarce sonar data across a spectrum of tasks, particularly semantic segmentation. Its potential extends to bolstering the effectiveness of underwater exploration by providing high-quality sonar data for training machine learning models.
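As a hedged sketch of object-level cut-and-paste augmentation in the spirit of SACuP (the paper's DataBank, shadow generation, and noise handling are not reproduced here), the following NumPy function cuts the pixels of one object out of a source image via its segmentation mask and pastes them into a destination image, updating the destination mask accordingly. The function signature and offset handling are illustrative assumptions.

```python
import numpy as np

def cut_and_paste(src_img, src_mask, dst_img, dst_mask, class_id, offset):
    """Paste the pixels of one object (src_mask == class_id) into dst."""
    ys, xs = np.nonzero(src_mask == class_id)
    dy, dx = offset                      # where to translate the object
    ys_new, xs_new = ys + dy, xs + dx
    # keep only pixels that land inside the destination image
    valid = ((ys_new >= 0) & (ys_new < dst_img.shape[0])
             & (xs_new >= 0) & (xs_new < dst_img.shape[1]))
    ys, xs = ys[valid], xs[valid]
    ys_new, xs_new = ys_new[valid], xs_new[valid]
    out_img, out_mask = dst_img.copy(), dst_mask.copy()
    out_img[ys_new, xs_new] = src_img[ys, xs]
    out_mask[ys_new, xs_new] = class_id  # the label follows the pasted object
    return out_img, out_mask
```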

22 pages, 7710 KiB  
Article
Hybrid Cross-Feature Interaction Attention Module for Object Detection in Intelligent Mobile Scenes
by Di Tian, Yi Han, Yongtao Liu, Jiabo Li, Ping Zhang and Ming Liu
Remote Sens. 2023, 15(20), 4991; https://doi.org/10.3390/rs15204991 - 17 Oct 2023
Abstract
Object detection is one of the fundamental tasks in computer vision, holding immense significance in the realm of intelligent mobile scenes. This paper proposes a hybrid cross-feature interaction (HCFI) attention module for object detection in intelligent mobile scenes. Firstly, the paper introduces multiple-kernel (MK) spatial pyramid pooling (SPP), built on standard SPP, and uses its structure to improve channel attention. This results in a hybrid cross-channel interaction (HCCI) attention module with better cross-channel interaction performance. Additionally, we bolster spatial attention by incorporating dilated convolutions, leading to a cross-spatial interaction (CSI) attention module with superior cross-spatial interaction performance. By seamlessly combining the above two modules, we obtain an improved HCFI attention module without resorting to computationally expensive operations. Across a series of experiments involving various detectors and datasets, the proposed method consistently demonstrates superior performance, yielding an improvement of 1.53% for YOLOX on COCO and 2.05% for YOLOv5 on BDD100K. Furthermore, we propose a solution combining HCCI and HCFI to address the challenge of extremely small output feature layers in detectors such as SSD. The experimental results indicate that the proposed method significantly improves the attention capability of object detection in intelligent mobile scenes.
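As a loose illustration of channel attention built from multi-kernel pooling (the actual HCCI module differs; the kernel sizes and the squeeze-excite-style gating here are assumptions), consider the following PyTorch sketch: multi-scale average pooling aggregates context before a per-channel reweighting is computed.

```python
import torch
import torch.nn as nn

class MultiKernelChannelAttention(nn.Module):
    """Channel attention fed by multi-kernel pooled context (sketch)."""
    def __init__(self, channels, kernel_sizes=(5, 9, 13), reduction=16):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.AvgPool2d(k, stride=1, padding=k // 2) for k in kernel_sizes)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                     # x: (B, C, H, W)
        # aggregate multi-scale context, then squeeze to a channel descriptor
        ctx = sum(pool(x) for pool in self.pools) / len(self.pools)
        desc = ctx.mean(dim=(2, 3))           # (B, C)
        weights = self.fc(desc).unsqueeze(-1).unsqueeze(-1)
        return x * weights                    # reweight channels
```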

24 pages, 10282 KiB  
Article
Research on Identification and Detection of Transmission Line Insulator Defects Based on a Lightweight YOLOv5 Network
by Zhilong Yu, Yanqiao Lei, Feng Shen, Shuai Zhou and Yue Yuan
Remote Sens. 2023, 15(18), 4552; https://doi.org/10.3390/rs15184552 - 15 Sep 2023
Abstract
Transmission line fault detection using drones provides real-time assessment of the operational status of transmission equipment, and it is therefore immensely important for ensuring the stable functioning of transmission lines. Currently, identification of transmission line equipment relies predominantly on manual inspections that are susceptible to the influence of natural surroundings, resulting in sluggishness and a high rate of false detections. In view of this, we propose an insulator defect recognition algorithm based on a YOLOv5 model with a new lightweight network as the backbone, combining noise reduction and target detection. First, we propose a new noise reduction algorithm, the adaptive neighborhood-weighted median filtering (NW-AMF) algorithm. This algorithm employs a weighted summation technique to determine the median value of a pixel's neighborhood, effectively filtering noise from the captured aerial images and significantly mitigating the adverse effects of varying noise levels on target detection. Subsequently, the RepVGG lightweight network structure is improved into the newly proposed lightweight structure RcpVGG-YOLOv5. This structure facilitates single-branch inference, multi-branch training, and branch normalization, thereby improving quantization performance while striking a balance between target detection accuracy and speed. Furthermore, we propose a new loss function, Focal EIOU, to replace the original CIOU loss function. This optimization incorporates a penalty on the edge length of the target frame, which increases the contribution of high-quality target gradients, addresses the imbalance of positive and negative samples for small targets, suppresses background positive samples, and ultimately enhances detection accuracy. Finally, to align more closely with real-world engineering applications, the dataset used in this study consists of machine patrol images captured by the Unmanned Aerial Systems (UAS) of the Yunnan Power Supply Bureau Company. The experimental findings demonstrate that the proposed algorithm yields notable improvements in accuracy and inference speed: a 3.7% increase in accuracy and a 48.2% increase in inference speed over YOLOv5s, a 2.7% accuracy improvement and a 33.5% increase in inference speed over YOLOv7, and a 1.5% accuracy enhancement and a 13.1% improvement in inference speed over YOLOv8. Ablation experiments further validate the effectiveness of the proposed algorithm. Consequently, the method presented in this paper is practically applicable to the detection of aerial images of transmission lines in complex environments. Future work should continue collecting aerial images for iterative training, optimize the model further, and investigate the challenges of detecting small targets; such efforts are significant for the advancement of transmission line detection.
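The weighted-median idea behind NW-AMF can be illustrated with a simple, non-adaptive neighborhood-weighted median filter: each pixel in a 3x3 neighborhood is replicated according to an integer weight before the median is taken, so the center pixel and its close neighbors dominate the estimate. The kernel weights below are illustrative assumptions, and the adaptive components of NW-AMF are omitted.

```python
import numpy as np

def weighted_median_filter(img, weights=None):
    """img: 2-D uint8/float array; returns a filtered copy (borders kept)."""
    if weights is None:
        weights = np.array([[1, 2, 1],
                            [2, 4, 2],
                            [1, 2, 1]])       # center-weighted 3x3 kernel
    out = img.copy()
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = img[y - 1:y + 2, x - 1:x + 2]
            # replicate each neighbor by its weight, then take the median
            samples = np.repeat(patch.ravel(), weights.ravel())
            out[y, x] = np.median(samples)
    return out
```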

Other

15 pages, 2378 KiB  
Technical Note
Improved Object Detection with Content and Position Separation in Transformer
by Yao Wang and Jong-Eun Ha
Remote Sens. 2024, 16(2), 353; https://doi.org/10.3390/rs16020353 - 16 Jan 2024
Abstract
In object detection, Transformer-based models such as DETR have exhibited state-of-the-art performance, capitalizing on the attention mechanism to handle spatial relations and feature dependencies. One inherent challenge these models face is the intertwined handling of content and positional data within their attention spans, which can blur the specificity of the information retrieval process. We consider object detection a composite task, and merging content and positional information simultaneously, as in prior work, can exacerbate its complexity. This paper presents the Multi-Task Fusion Detector (MTFD), a novel architecture that dissects the detection process into distinct tasks, addressing content and position through separate decoders. By utilizing assumed fake queries, the MTFD framework enables each decoder to operate under a presumption of known ancillary information, ensuring more specific and enriched interactions with the feature map. Experimental results affirm that this methodical separation, followed by deliberate fusion, not only reduces the difficulty of the detection task but also improves accuracy and clarifies the contribution of each component, providing a fresh perspective on object detection in Transformer-based architectures.
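As a schematic of the content/position separation idea (not the actual MTFD architecture; the query counts, head sizes, and the routing of one decoder to the classification head and the other to the box head are all assumptions), the sketch below runs two DETR-style decoders over the same feature memory, one driven by content queries and one by positional queries.

```python
import torch
import torch.nn as nn

class SeparatedDecoders(nn.Module):
    """Two DETR-style decoders over shared features: content vs. position."""
    def __init__(self, dim=256, num_queries=100, num_classes=91, num_layers=3):
        super().__init__()
        self.content_q = nn.Parameter(torch.randn(num_queries, dim))
        self.position_q = nn.Parameter(torch.randn(num_queries, dim))
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.content_dec = nn.TransformerDecoder(layer, num_layers)   # "what"
        self.position_dec = nn.TransformerDecoder(layer, num_layers)  # "where"
        self.cls_head = nn.Linear(dim, num_classes)
        self.box_head = nn.Linear(dim, 4)          # normalized (cx, cy, w, h)

    def forward(self, memory):                     # memory: (B, HW, dim)
        B = memory.size(0)
        c = self.content_dec(self.content_q.expand(B, -1, -1), memory)
        p = self.position_dec(self.position_q.expand(B, -1, -1), memory)
        return self.cls_head(c), self.box_head(p).sigmoid()
```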
