Single Sensor and Multi-Sensor Object Identification and Detection with Deep Learning

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (31 October 2023) | Viewed by 29385

Special Issue Editors

Guest Editor
Department of Electrical, Computer and Software Engineering, The University of Auckland, Auckland 1010, New Zealand
Interests: IoT-based ambient intelligence; pervasive healthcare systems; human activity recognition; predictive data analytics and bio-cybernetic systems

Guest Editor
James Watt School of Engineering, The University of Glasgow, Glasgow G12 8QQ, UK
Interests: artificial intelligence; machine learning; prediction models; reinforcement learning; big data and data analytics; multimodal information processing; activity recognition; gesture recognition; emotion and sentiment analysis; digital signal processing; signal, speech and image processing; computer vision; human-computer interaction (HCI); robots and robotic systems; intelligent systems; Internet of Things (IoT); real-time and embedded systems; reconfigurable embedded systems; healthcare and monitoring

Guest Editor
Department of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Interests: information fusion; statistical signal processing; intelligent vehicle research
Special Issue Information

Dear Colleagues,

Deep learning has become popular in object detection and recognition. While much work has been devoted to camera-based computer vision, deep learning-based detection and identification have also advanced for other sensors such as radar, infrared (IR), lidar, hyperspectral, and RGB-D sensors. Beyond single-sensor and multi-sensor object detection, multi-sensor identification and multi-modal deep learning are of great interest for enhancing detection and recognition performance. This Special Issue calls for original and novel object detection and recognition methods based on deep learning for single and multiple sensors. Topics of interest include, but are not limited to:

  • Image object detection and identification;
  • Object detection and classification in camera networks;
  • 3D computer vision;
  • RGB-D object detection and recognition;
  • Radar target detection and identification;
  • Lidar object detection and identification;
  • IR object detection and recognition;
  • Object detection and classification in hyperspectral data;
  • Multispectral object detection and recognition;
  • Multi-sensor object detection and classification;
  • Multi-modal deep learning;
  • Object detection and identification in IoT;
  • Object detection and classification in unmanned aerial vehicle (UAV) imagery;
  • Object detection and classification for autonomous driving.

Prof. Dr. Henry Leung
Dr. Kevin I-Kai Wang
Dr. Wasim Ahmad
Dr. Peng Wang
Dr. Hao Zhu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com after registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers are published continuously in the journal (as soon as accepted) and listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (10 papers)


Research

12 pages, 6235 KiB  
Article
3D Object Detection Using Multiple-Frame Proposal Features Fusion
by Minyuan Huang, Henry Leung and Ming Hou
Sensors 2023, 23(22), 9162; https://doi.org/10.3390/s23229162 - 14 Nov 2023
Viewed by 1755
Abstract
Object detection is important in many applications, such as autonomous driving. While 2D images lack depth information and are sensitive to environmental conditions, 3D point clouds provide accurate depth information and a richer description of the environment. However, sparsity is a persistent challenge in single-frame point cloud object detection. This paper introduces a two-stage proposal-based feature fusion method for object detection using multiple frames. The proposed method, called proposal features fusion (PFF), uses a cosine-similarity approach to associate proposals from multiple frames and an attention-weighted fusion (AWF) module to merge the features of these proposals. It allows feature fusion specific to individual objects and offers lower computational complexity while achieving higher precision. Experimental results on the nuScenes dataset demonstrate the effectiveness of the approach, achieving an mAP of 46.7%, 1.3% higher than the state-of-the-art 3D object detection method.
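As a rough illustration of the two operations the abstract names, the sketch below associates proposals across frames by cosine similarity and fuses the matched features with learned attention weights. The shapes, the threshold, and the module design are assumptions, not the paper's actual PFF/AWF implementation.

```python
# Hypothetical sketch of cosine-similarity proposal association and
# attention-weighted fusion (AWF), loosely following the abstract.
import torch
import torch.nn.functional as F

def associate_proposals(curr_feats, prev_feats, threshold=0.5):
    """Match each current-frame proposal to its most similar
    previous-frame proposal. curr_feats: (N, C), prev_feats: (M, C)."""
    sim = F.cosine_similarity(curr_feats.unsqueeze(1),
                              prev_feats.unsqueeze(0), dim=-1)  # (N, M)
    best_sim, best_idx = sim.max(dim=1)
    valid = best_sim > threshold   # unmatched proposals keep single-frame features
    return best_idx, valid

class AttentionWeightedFusion(torch.nn.Module):
    """Fuse a pair of associated proposal features with learned scalar weights."""
    def __init__(self, channels):
        super().__init__()
        self.score = torch.nn.Linear(channels, 1)

    def forward(self, curr, prev):
        # curr, prev: (N, C); softmax over the two frames per proposal
        w = torch.softmax(torch.stack([self.score(curr),
                                       self.score(prev)], dim=0), dim=0)
        return w[0] * curr + w[1] * prev

curr, prev = torch.randn(10, 256), torch.randn(12, 256)
idx, valid = associate_proposals(curr, prev)
fused = AttentionWeightedFusion(256)(curr, prev[idx])
fused = torch.where(valid.unsqueeze(1), fused, curr)  # fall back to single frame
```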

18 pages, 836 KiB  
Article
Person Re-Identification Using Local Relation-Aware Graph Convolutional Network
by Yu Lian, Wenmin Huang, Shuang Liu, Peng Guo, Zhong Zhang and Tariq S. Durrani
Sensors 2023, 23(19), 8138; https://doi.org/10.3390/s23198138 - 28 Sep 2023
Cited by 1 | Viewed by 1048
Abstract
Local feature extraction has been verified to be effective for person re-identification (re-ID) in the recent literature. However, existing methods usually extract local features from a single part of a pedestrian while neglecting the relationships among local features across different pedestrian images. As a result, the local features carry limited information from one pedestrian image and cannot benefit from other pedestrian images. In this paper, we propose a novel approach named Local Relation-Aware Graph Convolutional Network (LRGCN) to learn the relationships among local features across different pedestrian images. To describe these relationships completely, we propose an overlap graph and a similarity graph. The overlap graph formulates the edge weight as the number of overlapping nodes in two nodes' neighborhoods so as to learn robust local features, while the similarity graph defines the edge weight as the similarity between nodes to learn discriminative local features. To propagate information effectively among the different kinds of nodes, we propose the Structural Graph Convolution (SGConv) operation. Unlike traditional graph convolutions, where all nodes share the same parameter matrix, SGConv learns different parameter matrices for a node itself and its neighbor nodes to improve expressive power. Comprehensive experiments on four large-scale person re-ID databases show that LRGCN exceeds the state-of-the-art methods.
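The abstract's core claim about SGConv, separate parameter matrices for a node and its neighbors, can be sketched as below. The mean aggregation and the nonlinearity are assumptions; only the two-matrix structure comes from the abstract.

```python
# Hypothetical sketch of a structural graph convolution (SGConv) layer:
# unlike a vanilla GCN, the node itself and its neighbours use
# different parameter matrices, as the abstract describes.
import torch

class SGConv(torch.nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.w_self = torch.nn.Linear(in_dim, out_dim)   # parameters for the node itself
        self.w_neigh = torch.nn.Linear(in_dim, out_dim)  # parameters for its neighbours

    def forward(self, x, adj):
        """x: (N, in_dim) node features; adj: (N, N) weighted adjacency
        (e.g. from the overlap or similarity graph), zero diagonal."""
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1e-6)
        neigh = (adj @ x) / deg                          # mean-aggregated neighbour features
        return torch.relu(self.w_self(x) + self.w_neigh(neigh))

x = torch.randn(8, 128)                 # 8 local features from different pedestrian images
adj = torch.rand(8, 8).fill_diagonal_(0)
out = SGConv(128, 64)(x, adj)           # (8, 64)
```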

20 pages, 4754 KiB  
Article
A Small Object Detection Method for Oil Leakage Defects in Substations Based on Improved Faster-RCNN
by Qiang Yang, Song Ma, Dequan Guo, Ping Wang, Meichen Lin and Yangheng Hu
Sensors 2023, 23(17), 7390; https://doi.org/10.3390/s23177390 - 24 Aug 2023
Cited by 7 | Viewed by 1694
Abstract
Since substations are key parts of power transmission, ensuring their safety involves monitoring whether substation equipment is in a normal state. Oil leakage detection is one of the daily tasks of substation inspection robots: it can immediately reveal whether operating equipment is leaking oil, helping to preserve the service life of the equipment and maintain the safe and stable operation of the system. Oil leakage detection in substation equipment still faces challenges: accurate methods for detecting small-object oil leakage are lacking, and intelligent inspection robots have not been integrated into the workflow to help substation workers judge oil leakage accidents. To address these issues, this paper proposes a small object detection method for oil leakage defects in substations, based on an improved Faster-RCNN model with a ResNet-101 feature extraction network. To reduce the loss of information from the original image, especially for small objects, the method cancels a downsampling operation and replaces a large convolutional kernel with a small one. In addition, the proposed method is combined with an intelligent inspection robot, and an oil leakage decision-making scheme is designed that provides maintenance recommendations to substation workers for handling oil leakage accidents. Finally, experimental validation is carried out on real substation oil leakage images collected by a camera-equipped intelligent inspection robot. The experimental results show that the proposed FRRNet101-c model outperforms several baseline models for oil leakage detection in substation equipment, improving the mean average precision (mAP) by 6.3% overall and by 12% on small objects.
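The two backbone changes the abstract states, cancelling a downsampling operation and shrinking a large kernel, can be illustrated on a stock torchvision ResNet-101. Which layers the paper actually modifies is not specified, so the choice of the stem below is an assumed reading, not the paper's implementation.

```python
# Hypothetical sketch: keeping more spatial detail for small objects by
# replacing ResNet-101's 7x7 stride-2 stem with a 3x3 stride-1 convolution
# and dropping the stem max-pool. Which layers the paper modifies is an
# assumption here.
import torch
import torchvision

backbone = torchvision.models.resnet101(weights=None)
backbone.conv1 = torch.nn.Conv2d(3, 64, kernel_size=3, stride=1,
                                 padding=1, bias=False)  # small kernel, no downsampling
backbone.maxpool = torch.nn.Identity()                   # cancel the stem's downsampling

x = torch.randn(1, 3, 224, 224)
feat = torch.nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu,
                           backbone.maxpool, backbone.layer1)(x)
print(feat.shape)  # 4x higher spatial resolution per side than the stock stem
```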

15 pages, 1171 KiB  
Article
Cross-Modality Person Re-Identification via Local Paired Graph Attention Network
by Jianglin Zhou, Qing Dong, Zhong Zhang, Shuang Liu and Tariq S. Durrani
Sensors 2023, 23(8), 4011; https://doi.org/10.3390/s23084011 - 15 Apr 2023
Cited by 9 | Viewed by 1934
Abstract
Cross-modality person re-identification (ReID) aims at retrieving a pedestrian image of RGB modality from infrared (IR) pedestrian images, and vice versa. Recently, some approaches have constructed a graph to learn the relevance of pedestrian images of distinct modalities and narrow the gap between the IR and RGB modalities, but they omit the correlation between IR and RGB image pairs. In this paper, we propose a novel graph model called Local Paired Graph Attention Network (LPGAT). It uses the paired local features of pedestrian images from different modalities to build the nodes of the graph. For accurate propagation of information among the nodes, we propose a contextual attention coefficient that leverages distance information to regulate the node-updating process. Furthermore, we put forward Cross-Center Contrastive Learning (C3L) to constrain the distance between local features and their heterogeneous centers, which is beneficial for learning a complete distance metric. Experiments on the RegDB and SYSU-MM01 datasets validate the feasibility of the proposed approach.
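A hedged sketch of what a distance-regulated attention coefficient could look like: attention logits between paired local features are damped by a term derived from pairwise feature distance. The abstract does not give the formulation, so the exponential damping and the temperature below are assumptions.

```python
# Hypothetical sketch of a distance-regulated attention coefficient,
# loosely inspired by the abstract's "contextual attention coefficient".
import torch
import torch.nn.functional as F

def contextual_attention(nodes, temperature=1.0):
    """nodes: (N, C) paired local features (e.g. RGB/IR)."""
    sim = nodes @ nodes.t()                        # raw affinity between nodes
    dist = torch.cdist(nodes, nodes)               # pairwise Euclidean distance
    logits = sim * torch.exp(-dist / temperature)  # damp far-apart pairs
    return F.softmax(logits, dim=1)                # row-normalised coefficients

coeff = contextual_attention(torch.randn(6, 128))
print(coeff.sum(dim=1))  # each row sums to 1
```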

20 pages, 5271 KiB  
Article
3D Object Detection for Self-Driving Cars Using Video and LiDAR: An Ablation Study
by Pascal Housam Salmane, Josué Manuel Rivera Velázquez, Louahdi Khoudour, Nguyen Anh Minh Mai, Pierre Duthon, Alain Crouzil, Guillaume Saint Pierre and Sergio A. Velastin
Sensors 2023, 23(6), 3223; https://doi.org/10.3390/s23063223 - 17 Mar 2023
Cited by 6 | Viewed by 4575
Abstract
Methods based on 64-beam LiDAR can provide very precise 3D object detection, but highly accurate LiDAR sensors are extremely costly: a 64-beam model can cost approximately USD 75,000. We previously proposed SLS-Fusion (sparse LiDAR and stereo fusion), which fuses low-cost four-beam LiDAR with stereo cameras and outperforms most advanced stereo-LiDAR fusion methods. In this paper, we analyze how the stereo and LiDAR sensors contribute to the performance of the SLS-Fusion model for 3D object detection as a function of the number of LiDAR beams used. Data from the stereo camera play a significant role in the fusion model, but it is necessary to quantify this contribution and identify how it varies with the number of LiDAR beams used inside the model. Thus, to evaluate the roles of the parts of the SLS-Fusion network that represent the LiDAR and stereo camera architectures, we propose dividing the model into two independent decoder networks. The results of this study show that, starting from four beams, increasing the number of LiDAR beams has no significant impact on SLS-Fusion performance. The presented results can guide practitioners' design decisions.
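The ablation design, splitting the fused model into two independent decoder branches so each modality can be scored in isolation, might look like the following skeleton. The decoder structure, channel counts, and output head are placeholders, not the SLS-Fusion architecture.

```python
# Hypothetical skeleton of the ablation setup: independent decoder
# branches for the LiDAR and stereo feature streams.
import torch

class ModalityDecoder(torch.nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.head = torch.nn.Sequential(
            torch.nn.Conv2d(in_ch, 64, 3, padding=1), torch.nn.ReLU(),
            torch.nn.Conv2d(64, 1, 1))  # e.g. a depth/score map

    def forward(self, feat):
        return self.head(feat)

lidar_decoder = ModalityDecoder(in_ch=128)   # fed only LiDAR-branch features
stereo_decoder = ModalityDecoder(in_ch=128)  # fed only stereo-branch features

lidar_feat = torch.randn(1, 128, 32, 32)
stereo_feat = torch.randn(1, 128, 32, 32)
# Evaluate each branch in isolation, then compare against the fused model
# while sweeping the number of simulated LiDAR beams (4, 8, 16, ...).
out_l, out_s = lidar_decoder(lidar_feat), stereo_decoder(stereo_feat)
```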

22 pages, 8935 KiB  
Article
Deep Learning Derived Object Detection and Tracking Technology Based on Sensor Fusion of Millimeter-Wave Radar/Video and Its Application on Embedded Systems
by Jia-Jheng Lin, Jiun-In Guo, Vinay Malligere Shivanna and Ssu-Yuan Chang
Sensors 2023, 23(5), 2746; https://doi.org/10.3390/s23052746 - 2 Mar 2023
Cited by 8 | Viewed by 3778
Abstract
This paper proposes a deep learning-based mmWave radar and RGB camera early-fusion method for object detection and tracking, together with its embedded-system realization for ADAS applications. The proposed system can be used not only in ADAS but also in smart Road Side Units (RSUs) in transportation systems to monitor real-time traffic flow and warn road users of probable dangerous situations. Because mmWave radar signals are less affected by weather and lighting conditions such as cloud, sun, snow, rain, and night-time illumination, the system works efficiently in both normal and adverse conditions. Compared with using an RGB camera alone for object detection and tracking, the early fusion of mmWave radar and RGB camera data compensates for the poor performance of the RGB camera when it fails due to bad weather and/or lighting conditions. The proposed method combines the features of the radar and the RGB camera and directly outputs the results from an end-to-end trained deep neural network. The complexity of the overall system is also reduced, so the method can be implemented on PCs as well as on embedded systems such as the NVIDIA Jetson Xavier, where it runs at 17.39 fps.
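Early fusion as described, combining radar and camera data before the detection network rather than merging detections afterwards, can be sketched by concatenating projected radar maps with the RGB channels at the network input. The number and meaning of the radar channels are assumptions.

```python
# Hypothetical sketch of radar/camera early fusion: radar returns are
# projected into the image plane and concatenated with the RGB channels
# before the first convolution of the detection backbone.
import torch

class EarlyFusionStem(torch.nn.Module):
    def __init__(self, radar_channels=2):  # e.g. range and Doppler maps (assumed)
        super().__init__()
        self.conv = torch.nn.Conv2d(3 + radar_channels, 64, 7, stride=2, padding=3)

    def forward(self, rgb, radar_map):
        # rgb: (B, 3, H, W); radar_map: (B, radar_channels, H, W),
        # i.e. radar returns already projected into the camera frame
        return self.conv(torch.cat([rgb, radar_map], dim=1))

stem = EarlyFusionStem()
feat = stem(torch.randn(1, 3, 384, 640), torch.randn(1, 2, 384, 640))
```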

16 pages, 531 KiB  
Article
Resource Optimization for Multi-Unmanned Aerial Vehicle Formation Communication Based on an Improved Deep Q-Network
by Jie Li, Sai Li and Chenyan Xue
Sensors 2023, 23(5), 2667; https://doi.org/10.3390/s23052667 - 28 Feb 2023
Cited by 4 | Viewed by 1670
Abstract
With the widespread application of unmanned aerial vehicle (UAV) formation technology, it is very important to maintain good communication quality with the limited power and spectrum resources available. To simultaneously maximize the transmission rate and the probability of successful data transfer, the convolutional block attention module (CBAM) and the value decomposition network (VDN) algorithm were introduced on top of a deep Q-network (DQN) for a UAV formation communication system. To make full use of the spectrum, this manuscript considers both UAV-to-base-station (U2B) and UAV-to-UAV (U2U) links, where the U2B links can be reused by the U2U communication links. In the DQN, the U2U links, treated as agents, interact with the system and intelligently learn to choose the best power and spectrum. The CBAM improves the training results along both the channel and spatial dimensions. Moreover, the VDN algorithm addresses partial observability at each UAV through distributed execution, decomposing the team Q-function into agent-wise Q-functions. The experimental results show clear improvements in both the data transfer rate and the probability of successful data transfer.
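The VDN idea the abstract relies on, decomposing the team Q-function into agent-wise Q-functions so each U2U link can act on its own partial observation, is sketched below. Network sizes, the observation dimension, and the action space are placeholders.

```python
# Hypothetical sketch of value decomposition (VDN): the team Q-value is
# the sum of per-agent Q-values; execution is decentralised while
# training regresses the summed Q toward the joint TD target.
import torch

class AgentQNet(torch.nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(obs_dim, 128), torch.nn.ReLU(),
            torch.nn.Linear(128, n_actions))

    def forward(self, obs):
        return self.net(obs)

n_agents, obs_dim, n_actions = 4, 32, 16   # e.g. 16 power/channel combinations
agents = [AgentQNet(obs_dim, n_actions) for _ in range(n_agents)]

obs = torch.randn(n_agents, obs_dim)
actions = [q(obs[i]).argmax() for i, q in enumerate(agents)]       # decentralised execution
q_team = sum(q(obs[i])[actions[i]] for i, q in enumerate(agents))  # VDN: sum of agent Qs
# q_team is trained against the joint reward's TD target.
```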

24 pages, 3338 KiB  
Article
Joint Aperture and Power Allocation Strategy for a Radar Network Localization System Based on Low Probability of Interception Optimization
by Chenyan Xue, Ling Wang and Daiyin Zhu
Sensors 2023, 23(5), 2613; https://doi.org/10.3390/s23052613 - 27 Feb 2023
Cited by 1 | Viewed by 1544
Abstract
When a Distributed Radar Network Localization System (DRNLS) is used to further improve the survivability of a carrier platform, the random characteristics of the system's Aperture Resource Allocation (ARA) and Radar Cross Section (RCS) are often not fully considered. However, these random characteristics affect the power resource allocation of the DRNLS to a certain extent, and the allocation result is an essential factor determining the DRNLS's Low Probability of Intercept (LPI) performance, so a DRNLS still has some limitations in practical applications. To solve this problem, a joint aperture and power allocation scheme for the DRNLS based on LPI optimization (JA scheme) is proposed. In the JA scheme, a fuzzy random chance-constrained programming model for radar antenna aperture resource management (RAARM-FRCCP model) minimizes the number of array elements under the given pattern parameters, and a random chance-constrained programming model for minimizing the Schleher intercept factor (MSIF-RCCP model), built on this basis, achieves optimal control of the DRNLS's LPI performance while ensuring the system's tracking performance requirements. The results show that when the RCS has some randomness, the corresponding uniform power distribution is not necessarily the optimal scheme. For the same tracking performance, the required number of elements and power are reduced compared with the full array and the uniformly distributed power. The lower the confidence level, the more often the threshold may be crossed and the lower the required power, so that the DRNLS achieves better LPI performance.
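A Monte Carlo reading of the chance-constrained setup helps make the last claim concrete: the intercept factor must stay below a threshold with probability at least equal to the confidence level, with the RCS treated as random. The intercept-factor model and the RCS distribution below are invented placeholders, used only to show why a lower confidence level admits a lower power.

```python
# Hypothetical Monte Carlo check of a chance constraint on the Schleher
# intercept factor under random RCS; the formulas are placeholders.
import numpy as np

rng = np.random.default_rng(0)

def intercept_factor(power, rcs):
    # Placeholder monotone model: more power raises interceptability,
    # larger RCS lets the radar meet tracking needs with less power.
    return power / np.sqrt(rcs)

def chance_constraint_ok(power, threshold, confidence, n_samples=10_000):
    rcs = rng.exponential(scale=1.0, size=n_samples)  # assumed RCS fluctuation
    return (intercept_factor(power, rcs) <= threshold).mean() >= confidence

# A lower confidence level tolerates more threshold violations,
# so a smaller transmit power passes the check (better LPI).
for conf in (0.99, 0.9, 0.8):
    feasible = [p for p in np.linspace(0.1, 5.0, 50)
                if chance_constraint_ok(p, threshold=2.0, confidence=conf)]
    print(conf, max(feasible))
```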

15 pages, 3761 KiB  
Article
An Adaptive Kernels Layer for Deep Neural Networks Based on Spectral Analysis for Image Applications
by Tariq Al Shoura, Henry Leung and Bhashyam Balaji
Sensors 2023, 23(3), 1527; https://doi.org/10.3390/s23031527 - 30 Jan 2023
Cited by 1 | Viewed by 2063
Abstract
As the pixel resolution of imaging equipment has grown, image sizes and the number of pixels used to represent objects have increased accordingly. This exposes an issue with traditional deep learning models and methods on larger images: they typically rely on mechanisms such as increasing model depth, which suits spatially invariant applications such as image classification but causes problems for applications that rely on the locations of features within images, such as object localization and change detection. This paper proposes an adaptive convolutional kernels layer (AKL), an architecture that adjusts dynamically to image size in order to extract comparable spectral information from images of different sizes. Using the definition of the Fourier transform and the relation between spectral analysis and convolution kernels, the AKL improves the features' spatial resolution without sacrificing the local receptive field (LRF), benefiting image applications that are sensitive to object and feature locations. The proposed method is tested using a Monte Carlo simulation to evaluate its spectral information coverage across images of various sizes, validating its ability to maintain coverage of a given ratio of the spectral domain within a variation of around 20% of the desired coverage ratio. Finally, the AKL is validated on various image applications against architectures such as Inception and VGG; it matches Inception v4 in image classification and outperforms it as images grow larger, with up to a 30% increase in object localization accuracy for the same number of parameters.
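One way to read the AKL's goal, keeping a fixed ratio of spectral coverage as images scale, is to grow the kernel size with the image side length, since a k-tap kernel spans roughly k/N of an N-point spectrum. The sizing rule below is an assumption, not the paper's formulation.

```python
# Hypothetical sketch of size-adaptive convolution kernels: the kernel
# grows with the image so it covers a fixed ratio of the Fourier domain
# rather than a fixed receptive field in pixels.
import torch

def adaptive_kernel_size(image_size, coverage_ratio=0.05):
    """Pick an odd kernel size proportional to the image side length."""
    k = max(3, int(round(image_size * coverage_ratio)))
    return k if k % 2 == 1 else k + 1

for n in (128, 256, 512, 1024):
    k = adaptive_kernel_size(n)
    conv = torch.nn.Conv2d(3, 16, kernel_size=k, padding=k // 2)
    out = conv(torch.randn(1, 3, n, n))
    print(n, k, tuple(out.shape))  # spatial size preserved for every input size
```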

22 pages, 15721 KiB  
Article
Target Recognition in SAR Images by Deep Learning with Training Data Augmentation
by Zhe Geng, Ying Xu, Bei-Ning Wang, Xiang Yu, Dai-Yin Zhu and Gong Zhang
Sensors 2023, 23(2), 941; https://doi.org/10.3390/s23020941 - 13 Jan 2023
Cited by 26 | Viewed by 7836
Abstract
Mass production of high-quality synthetic SAR training imagery is essential for boosting the performance of deep learning (DL)-based SAR automatic target recognition (ATR) algorithms in an open-world environment. To address this problem, we exploit both the widely used Moving and Stationary Target Acquisition and Recognition (MSTAR) SAR dataset and the Synthetic and Measured Paired Labeled Experiment (SAMPLE) dataset, which consists of selected samples from the MSTAR dataset and their computer-generated synthetic counterparts. A series of data augmentation experiments is carried out. First, the sparsity of the targets' scattering centers is exploited to synthesize new target poses. Additionally, training data with various clutter backgrounds are synthesized via clutter transfer, so that the neural networks are better prepared to cope with background changes in the test samples. To effectively augment the synthetic SAR imagery in the SAMPLE dataset, a novel contrast-based data augmentation technique is proposed. To improve the robustness of the neural networks against out-of-distribution (OOD) samples, SAR images of ground military vehicles collected by a self-developed MiniSAR system are used as training data for an adversarial outlier exposure procedure. Simulation results show that the proposed data augmentation methods improve both target classification accuracy and OOD detection performance. This work aims to lay the foundation for large-scale, open-field implementation of DL-based SAR-ATR systems, which is valuable both for theoretical research and for potential military applications.
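Clutter transfer as named in the abstract can be sketched as a masked blend of a target chip into a new clutter background, so a classifier cannot rely on the original background. The threshold-based mask below is a simplification; how the paper actually segments the target is not stated in the abstract.

```python
# Hypothetical sketch of clutter-transfer augmentation for SAR chips.
import numpy as np

def clutter_transfer(target_img, clutter_img, threshold=0.6):
    """target_img, clutter_img: float arrays in [0, 1] of equal shape."""
    mask = (target_img > threshold).astype(np.float32)   # bright scattering centers
    return mask * target_img + (1.0 - mask) * clutter_img

rng = np.random.default_rng(1)
chip = rng.random((128, 128)).astype(np.float32)             # stand-in target chip
new_clutter = rng.random((128, 128)).astype(np.float32) * 0.3  # stand-in background
augmented = clutter_transfer(chip, new_clutter)
```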
