
Advanced Visual Quality Enhancement and Computational Technology for Robotics

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: closed (31 December 2022) | Viewed by 21554

Special Issue Editors


Prof. Dr. Zhongyuan Wang
Guest Editor
School of Computer Science, Wuhan University, Wuhan 430072, China
Interests: video compression; image processing; multimedia communications

Prof. Dr. Tao Lu
Guest Editor
School of Computer Science and Technology, Wuhan Institute of Technology, Wuhan 430205, China
Interests: computer vision; image processing; multimedia communications

Prof. Dr. Yuntao Wu
Guest Editor
School of Computer Science and Technology, Wuhan Institute of Technology, Wuhan 430205, China
Interests: intelligent information processing; intelligent computing; wireless sensor network technology

Prof. Dr. Huabing Zhou
Guest Editor
School of Computer Science and Technology, Wuhan Institute of Technology, Wuhan 430205, China
Interests: computer vision; machine learning

Special Issue Information

Dear Colleagues,

Computer vision provides the key perception and computation capability of unmanned systems such as robots and intelligent video surveillance. Although advanced sensors can acquire clearer images and videos, hardware limitations and environmental factors mean that, in many cases, only low-quality images and videos are available, which poses a major obstacle to computer-vision-based applications. Video and image quality enhancement is a long-standing task, but with the support of modern deep learning it now has more promising prospects. For instance, significant improvements in image/video restoration and enhancement tasks such as super-resolution, denoising, deblurring, and deraining can recover information lost to sensor hardware limitations. Meanwhile, substantial progress has also been made in high-level vision tasks such as recognition, identification, authentication, and abnormality diagnosis. In particular, this Special Issue encourages robotics-related innovative studies, not only in low-level vision but also in modeling, analysis, and understanding driven by visual quality. In addition, AI-synthesized digital content brings confusion and challenges to real-world sensory information. Valuable research therefore covers not only restoring low-quality videos/images but also credibly distinguishing real-world content from AI-generated content.

For the better utility of images and videos acquired from sensors, this Special Issue aims at exploring novel image/video restoration and understanding methods using deep learning and other approaches.

Prof. Dr. Zhongyuan Wang
Prof. Dr. Tao Lu
Prof. Dr. Yuntao Wu
Prof. Dr. Huabing Zhou
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image/video inpainting
  • image/video deblurring
  • image/video dehazing and deraining
  • image/video denoising
  • image/video super-resolution
  • low-light image enhancement
  • image/video quality assessment
  • image/video abnormity diagnosis
  • image/video authenticity authentication
  • image-based modeling
  • scene analysis
  • stereo vision for robotics
  • object detection/recognition/tracking
  • behavior/activity recognition
  • localization, navigation, and mapping
  • other vision-computing-related robotics

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)


Research

16 pages, 502 KiB  
Article
Improving Speech Recognition Performance in Noisy Environments by Enhancing Lip Reading Accuracy
by Dengshi Li, Yu Gao, Chenyi Zhu, Qianrui Wang and Ruoxi Wang
Sensors 2023, 23(4), 2053; https://doi.org/10.3390/s23042053 - 11 Feb 2023
Cited by 9 | Viewed by 6318
Abstract
The current accuracy of speech recognition can reach over 97% on different datasets, but in noisy environments it is greatly reduced. Improving speech recognition performance in noisy environments is a challenging task. Because visual information is not affected by noise, researchers often use lip information to help improve speech recognition performance. This makes the performance of lip reading and the effect of cross-modal fusion particularly important. In this paper, we try to improve the accuracy of speech recognition in noisy environments by improving the lip reading performance and the cross-modal fusion effect. First, because the same lip movements can correspond to multiple meanings, we constructed a one-to-many mapping model between lips and speech, allowing the lip reading model to consider which articulations are represented by the input lip movements. Audio representations are also preserved by modeling the inter-relationships between paired audiovisual representations. At the inference stage, the preserved audio representations can be extracted from memory through the learned inter-relationships using only video input. Second, a joint cross-fusion model using the attention mechanism can effectively exploit complementary intermodal relationships; the model calculates cross-attention weights on the basis of the correlations between joint feature representations and the individual modalities. Lastly, our proposed model achieved a 4.0% reduction in WER in a −15 dB SNR environment compared to the baseline method, and a 10.1% reduction in WER compared to speech recognition alone. The experimental results show that our method achieves a significant improvement over speech recognition models in different noise environments.
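The joint cross-fusion step described in this abstract can be illustrated with a minimal NumPy sketch. The projection weights, feature dimensions, and the dot-product scoring below are illustrative assumptions, not the authors' architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_fusion(audio, video):
    """Fuse (T, D) audio and video feature sequences.

    A joint representation is projected to a query; its correlation with
    each modality yields per-timestep attention weights.
    """
    joint = np.concatenate([audio, video], axis=-1)             # (T, 2D)
    W = rng.normal(0, 0.1, (joint.shape[-1], audio.shape[-1]))  # illustrative projection
    query = joint @ W                                           # (T, D)
    score_a = (query * audio).sum(axis=-1)                      # correlation with audio
    score_v = (query * video).sum(axis=-1)                      # correlation with video
    weights = softmax(np.stack([score_a, score_v], axis=-1))    # (T, 2)
    return weights[:, :1] * audio + weights[:, 1:] * video      # (T, D)

T, D = 5, 8
fused = cross_modal_fusion(rng.normal(size=(T, D)), rng.normal(size=(T, D)))
print(fused.shape)  # (5, 8)
```

In a trained model the projection and the scoring would be learned jointly with the recognizer; the sketch only shows how per-timestep weights can arbitrate between the audio and video streams.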

11 pages, 736 KiB  
Communication
Constant Tension Control System of High-Voltage Coil Winding Machine Based on Smith Predictor-Optimized Active Disturbance Rejection Control Algorithm
by Yuming Ai, Baocheng Yu, Yanduo Zhang, Wenxia Xu and Tao Lu
Sensors 2023, 23(1), 140; https://doi.org/10.3390/s23010140 - 23 Dec 2022
Cited by 2 | Viewed by 3608
Abstract
In the production process of high-voltage coils, a constant tension control system is designed to improve the quality of the transformer. The system is composed of a controller, an execution structure, a detection structure, etc. Active disturbance rejection control (ADRC), optimized by a Smith predictor (SP), is adopted to achieve constant tension control. The experimental results show that the tension control system based on SP-ADRC has higher control accuracy, a shorter stabilization time, and stronger anti-interference ability than the traditional PID algorithm. The actual experiment shows that the constant tension control system of the high-voltage coil winding machine based on SP-ADRC has a superior control effect and high practical value.
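The heart of ADRC is an extended state observer (ESO) that estimates the "total disturbance" and cancels it in the control law. The sketch below simulates plain linear ADRC on a hypothetical first-order tension plant (no transport delay, so the Smith predictor part is omitted); all gains and plant parameters are illustrative, not the paper's tuning:

```python
import numpy as np

def eso_step(z, y, u, b0, betas, h):
    """One Euler step of a linear extended state observer.
    z[0] estimates the plant output, z[1] the total disturbance."""
    e = z[0] - y
    dz0 = z[1] + b0 * u - betas[0] * e
    dz1 = -betas[1] * e
    return np.array([z[0] + h * dz0, z[1] + h * dz1])

def adrc_control(r, z, kp, b0):
    """ADRC law: proportional action on the setpoint error,
    minus the estimated total disturbance, scaled by 1/b0."""
    return (kp * (r - z[0]) - z[1]) / b0

# hypothetical first-order tension plant: y' = -a*y + b*u + d
a, b, d = 1.0, 2.0, 0.5
h, b0, kp = 0.01, 2.0, 10.0
betas = (40.0, 400.0)               # observer gains for bandwidth w0 = 20
y, z = 0.0, np.zeros(2)
for _ in range(2000):               # simulate 20 s
    u = adrc_control(1.0, z, kp, b0)
    y += h * (-a * y + b * u + d)   # plant step (Euler)
    z = eso_step(z, y, u, b0, betas, h)
print(round(y, 2))  # ≈ 1.0: tension settles at the setpoint
```

A Smith predictor would wrap this loop by feeding the controller a delay-free model prediction of the plant output instead of the delayed measurement.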

17 pages, 3439 KiB  
Article
Application of a Bayesian Network Based on Multi-Source Information Fusion in the Fault Diagnosis of a Radar Receiver
by Boya Liu, Xiaowen Bi, Lijuan Gu, Jie Wei and Baozhong Liu
Sensors 2022, 22(17), 6396; https://doi.org/10.3390/s22176396 - 25 Aug 2022
Cited by 7 | Viewed by 1807
Abstract
Radar is an important part of air defense and combat systems. Improving the effectiveness of radar state monitoring and the accuracy of fault diagnosis during operation is of great significance to military defense. However, the structural complexity of radar equipment and the uncertainty of the operating environment greatly increase the difficulty of fault diagnosis in real-life situations. Therefore, a Bayesian network diagnosis method based on multi-source information fusion technology is proposed to solve the fault diagnosis problems caused by uncertain factors such as the high integration and complexity of the system. Taking a fault of a radar receiver as an example, we study 2 typical fault phenomena and 21 fault points. We acquire and process multi-source information, establish a Bayesian network model, determine conditional probability tables (CPTs), and finally output the diagnosis results. The results are convincing and consistent with reality, which verifies the effectiveness of this method for fault diagnosis in radar receivers. The method realizes device-level fault diagnosis, which shortens radar maintenance time and improves radar reliability and maintainability. Our results serve as a guide for locating radar faults and predicting vulnerable radar components.
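The core inference step, combining priors and CPTs to rank fault locations, can be sketched with a toy one-symptom network. The fault nodes and all probabilities below are invented for illustration and do not come from the paper:

```python
# illustrative priors over fault locations in a receiver (not from the paper)
priors = {"mixer": 0.05, "amplifier": 0.08, "no fault": 0.87}
# CPT: P(symptom "no output" observed | fault location)
p_sym = {"mixer": 0.90, "amplifier": 0.70, "no fault": 0.01}

def posterior(symptom_observed=True):
    """Bayes' rule over a single symptom node of the network."""
    lik = p_sym if symptom_observed else {f: 1 - p for f, p in p_sym.items()}
    joint = {f: priors[f] * lik[f] for f in priors}
    z = sum(joint.values())
    return {f: v / z for f, v in joint.items()}

post = posterior(True)
print(max(post, key=post.get))  # most probable fault given the symptom
```

With many symptoms and 21 fault points, the same computation is carried out over the full network structure rather than a single CPT, but the ranking principle is identical.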

14 pages, 2637 KiB  
Article
Methodology for Large-Scale Camera Positioning to Enable Intelligent Self-Configuration
by Yingfeng Wu, Weiwei Zhao and Jifa Zhang
Sensors 2022, 22(15), 5806; https://doi.org/10.3390/s22155806 - 3 Aug 2022
Cited by 1 | Viewed by 1652
Abstract
The development of a self-configuring method for efficiently locating moving targets indoors could enable extraordinary advances in the control of industrial automatic production equipment. Being interactively connected, the cameras that constitute a network represent a promising visual system for wireless positioning, with the ultimate goal of replacing or enhancing conventional sensors. Developing a highly efficient algorithm for collaborating cameras in the network is of particular interest. This paper presents an intelligent positioning system capable of integrating visual information, obtained by large numbers of cameras, through self-configuration. An extended Kalman filter predicts the position, velocity, acceleration, and jerk (the third derivative of position) of the moving target. As a result, the camera-network-based visual positioning system is capable of locating a moving target with high precision: relative errors for positional parameters are all smaller than 10%, and relative errors for linear velocities (vx, vy) are kept to an acceptable level, i.e., lower than 20%. This demonstrates the potential of the visual positioning system to assist the automation industry, including in wireless intelligent control, high-precision indoor positioning, and navigation.
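The constant-jerk motion model mentioned above fixes the form of the EKF prediction step. A minimal one-dimensional sketch of that step, with illustrative noise covariances and time step (not the paper's values):

```python
import numpy as np

def jerk_transition(h):
    """Transition matrix for the state [position, velocity, acceleration, jerk]."""
    return np.array([
        [1.0, h,   h**2 / 2, h**3 / 6],
        [0.0, 1.0, h,        h**2 / 2],
        [0.0, 0.0, 1.0,      h       ],
        [0.0, 0.0, 0.0,      1.0     ],
    ])

def kf_predict(x, P, F, Q):
    """Standard Kalman prediction; an EKF uses the same form with F as
    the Jacobian of the motion model."""
    return F @ x, F @ P @ F.T + Q

h = 0.1                                   # illustrative frame interval (s)
F = jerk_transition(h)
x = np.array([0.0, 1.0, 0.0, 0.0])        # target moving at 1 m/s
P = np.eye(4) * 0.1                       # illustrative uncertainty
Q = np.eye(4) * 1e-4                      # illustrative process noise
x, P = kf_predict(x, P, F, Q)
print(round(x[0], 2))  # 0.1 — predicted position after one step
```

In the full system, each camera's measurement of the target would then be fused in the EKF update step, with a per-camera projection model as the measurement function.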

13 pages, 3945 KiB  
Article
PLI-VINS: Visual-Inertial SLAM Based on Point-Line Feature Fusion in Indoor Environment
by Zhangzhen Zhao, Tao Song, Bin Xing, Yu Lei and Ziqin Wang
Sensors 2022, 22(14), 5457; https://doi.org/10.3390/s22145457 - 21 Jul 2022
Cited by 11 | Viewed by 2758
Abstract
In indoor low-texture environments, point-feature-based visual SLAM systems have poor robustness and low trajectory accuracy. Therefore, we propose a visual-inertial SLAM algorithm based on point-line feature fusion. First, in order to improve the quality of the extracted line segments, a line segment extraction algorithm with an adaptive threshold is proposed. By constructing the adjacency matrix of the line segments and judging their directions, the algorithm decides whether to merge or eliminate line segments. At the same time, geometrically constrained line feature matching is used to improve the efficiency of processing line features. Compared with the traditional algorithm, the processing efficiency of our proposed method is greatly improved. Then, point, line, and inertial data are effectively fused in a sliding window to achieve high-accuracy pose estimation. Finally, experiments on the EuRoC dataset show that the proposed PLI-VINS performs better than traditional visual-inertial SLAM systems using point features and point-line features.
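The merge-or-eliminate decision for line segments can be illustrated with a simple direction-and-gap test. The thresholds, the assumption that segments are ordered end-to-start, and the merge rule below are illustrative stand-ins, not the paper's adaptive algorithm:

```python
import numpy as np

def direction(seg):
    (x1, y1), (x2, y2) = seg
    return np.arctan2(y2 - y1, x2 - x1)

def should_merge(s1, s2, angle_tol=np.deg2rad(5.0), gap_tol=5.0):
    """Merge candidates: near-collinear directions and a small gap between
    the end of s1 and the start of s2 (assumes segments are ordered)."""
    d = abs(direction(s1) - direction(s2))
    d = min(d, np.pi - d)                  # segment direction is unsigned
    gap = np.linalg.norm(np.array(s1[1]) - np.array(s2[0]))
    return d < angle_tol and gap < gap_tol

def merge(s1, s2):
    """Replace two near-collinear segments by their two extreme endpoints
    along the dominant direction."""
    pts = np.array(s1 + s2)
    axis = np.array([np.cos(direction(s1)), np.sin(direction(s1))])
    t = pts @ axis                         # project endpoints on the axis
    return tuple(pts[t.argmin()]), tuple(pts[t.argmax()])

a = ((0.0, 0.0), (10.0, 0.2))
b = ((11.0, 0.3), (20.0, 0.5))
assert should_merge(a, b)
merged = merge(a, b)                       # one long segment (0,0)-(20,0.5)
```

An adaptive version would tighten or relax `angle_tol` and `gap_tol` based on local image statistics, which is the role of the adaptive threshold in the paper.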

18 pages, 6673 KiB  
Article
A Novel Hybrid Algorithm for the Forward Kinematics Problem of 6 DOF Based on Neural Networks
by Huizhi Zhu, Wenxia Xu, Baocheng Yu, Feng Ding, Lei Cheng and Jian Huang
Sensors 2022, 22(14), 5318; https://doi.org/10.3390/s22145318 - 16 Jul 2022
Cited by 4 | Viewed by 1970
Abstract
The closed kinematic structure of Gough–Stewart platforms complicates the kinematic control problem, particularly forward kinematics. In the traditional hybrid algorithm (a backpropagation neural network combined with Newton–Raphson iteration), it is difficult for the neural network part to train on different datasets, causing training errors. Moreover, the Newton–Raphson method cannot operate on a singular Jacobian matrix. In this study, in order to solve the forward kinematics problem of Gough–Stewart platforms, a new hybrid algorithm is proposed that combines an artificial bee colony (ABC)-optimized BP neural network (ABC–BPNN) with a numerical algorithm. The ABC optimization greatly improves the prediction ability of the neural network and can provide a superb initial value to the numerical algorithm. In the design of the numerical algorithm, a modification of Newton's method (QMn-M) is introduced to solve the problem that the traditional algorithm cannot proceed when it encounters a singular matrix. Results show that the maximal improvement in ABC–BPNN error optimization was 46.3%, while the RMSE index decreased by 42.1%. Experiments showed the feasibility of QMn-M on singular matrix data, with improvements of 14.4% in the average number of iterations and 13.9% in the required time.
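The details of QMn-M are not given in this abstract, but the underlying idea, keeping a Newton-type iteration usable when the Jacobian is singular, can be sketched with Levenberg-style damping on a toy system whose Jacobian is singular at the start point. The damping approach here is a generic stand-in, not the paper's specific modification:

```python
import numpy as np

def damped_newton(f, jac, x0, mu=1e-6, tol=1e-10, max_iter=50):
    """Newton-type iteration with Levenberg-style damping: J^T J + mu*I
    stays invertible even where J itself is singular."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        J = jac(x)
        step = np.linalg.solve(J.T @ J + mu * np.eye(len(x)), J.T @ fx)
        x = x - step
    return x

# toy system whose Jacobian is singular at the start point (0, 1)
f = lambda x: np.array([x[0] ** 2 - 1.0, x[0] * x[1] - 1.0])
jac = lambda x: np.array([[2.0 * x[0], 0.0], [x[1], x[0]]])
root = damped_newton(f, jac, [0.0, 1.0])
print(np.round(root, 3))  # [1. 1.]
```

Plain Newton–Raphson fails at this start point (`np.linalg.solve` raises on the singular Jacobian), while the damped normal equations always yield a step; in the paper's hybrid scheme, ABC–BPNN would supply the start point.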

13 pages, 4348 KiB  
Article
An Improved Mixture Density Network for 3D Human Pose Estimation with Ordinal Ranking
by Yiqi Wu, Shichao Ma, Dejun Zhang, Weilun Huang and Yilin Chen
Sensors 2022, 22(13), 4987; https://doi.org/10.3390/s22134987 - 1 Jul 2022
Cited by 5 | Viewed by 2151
Abstract
Estimating accurate 3D human poses from 2D images remains a challenge due to the lack of explicit depth information in 2D data. This paper proposes an improved mixture density network for 3D human pose estimation called the Locally Connected Mixture Density Network (LCMDN). Instead of conducting direct coordinate regression or providing unimodal estimates per joint, our approach predicts multiple possible hypotheses with a Mixture Density Network (MDN). Our network can be divided into two steps: first, the 2D joint points are estimated from the input images; then, information on the correlation of human joints is extracted by a feature extractor. After the human pose feature is extracted, multiple pose hypotheses are generated via the hypothesis generator. In addition, to make better use of the relationships between human joints, we introduce the Locally Connected Network (LCN) as a generic formulation to replace the traditional Fully Connected Network (FCN) in the feature extraction module. Finally, to select the most appropriate 3D pose, a 3D pose selector based on the ordinal ranking of joints is adopted to score the predicted poses. The LCMDN notably improves the representation capability and robustness of the original MDN method. Experiments are conducted on the Human3.6M and MPII datasets. The average Mean Per Joint Position Error (MPJPE) of our proposed LCMDN reaches 50 mm on the Human3.6M dataset, which is on par with or better than state-of-the-art works. The qualitative results on the MPII dataset show that our network has strong generalization ability.
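The MDN output structure, several pose hypotheses plus mixing weights followed by a selector, can be sketched as below. The random untrained weights, dimensions, and the weight-based fallback selector are illustrative; in the paper, hypotheses are scored by ordinal joint rankings and the features come from the LCN extractor:

```python
import numpy as np

rng = np.random.default_rng(0)

def mdn_head(features, K=5, out_dim=3):
    """Toy MDN head: map one feature vector to K pose hypotheses (means)
    plus mixing weights. Weights are random and untrained."""
    D = features.shape[-1]
    W_pi = rng.normal(0, 0.1, (D, K))
    W_mu = rng.normal(0, 0.1, (D, K * out_dim))
    logits = features @ W_pi
    pi = np.exp(logits - logits.max())
    pi /= pi.sum()                           # mixing coefficients, sum to 1
    mu = (features @ W_mu).reshape(K, out_dim)
    return pi, mu

def select_pose(pi, mu, score_fn=None):
    """Pick one hypothesis. The paper scores hypotheses with ordinal joint
    rankings; without such a scorer we fall back to the mixing weight."""
    scores = pi if score_fn is None else np.array([score_fn(m) for m in mu])
    return mu[scores.argmax()]

pi, mu = mdn_head(rng.normal(size=16))       # one 16-d feature vector
pose = select_pose(pi, mu)
print(pose.shape)  # (3,)
```

A real head would also emit per-component variances and predict all joints at once; the sketch keeps only the hypothesis-then-select structure.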
