Camera as a Smart-Sensor (CaaSS)

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: 27 November 2024 | Viewed by 67967

Special Issue Editors


Prof. Dr. Peter Corcoran
Collection Editor
College of Engineering & Informatics, National University of Ireland Galway, University Road, Galway H91 TK33, Ireland
Interests: consumer electronics; computational imaging; biometrics; cyber-security; cloud computing

Prof. Dr. Saraju P. Mohanty
Collection Editor
Computer Science and Engineering, University of North Texas
Interests: smart electronic systems; security- and energy-aware cyber-physical systems (CPS); IoMT-based approaches for smart healthcare; IoT-enabled consumer electronics for smart cities

Special Issue Information

Dear Colleagues,

Digital camera technology has evolved over the past three decades to provide image and video of remarkable quality, while, thanks to the mass-market adoption of digital imaging technologies, the costs of image sensors and the associated optical systems and image signal processors continue to decrease. In parallel, advances in computational imaging and deep learning have led to a new generation of advanced computer vision algorithms. New edge-AI technologies will enable these sophisticated algorithms to be integrated directly with the sensing and optical systems, enabling a new generation of smart-vision sensors. An important aspect of such sensors is that they can meet emerging needs to manage and curate data privacy, and they can also help to reduce energy requirements by eliminating the need to send large amounts of raw image and video data to data centers for postprocessing and cloud-based storage.

This Topical Collection welcomes new research contributions and applied research on new synergies across these fields that enable a new generation of ‘Camera as a Smart Sensor’ technologies. Review articles that are well-aligned with this Topical Collection theme will also be considered.

Suitable topics can include:   

  • Novel combinations of commodity camera or image sensor technologies with edge-AI or embedded computational imaging algorithms;
  • Novel uses of camera or image sensors for new sensing applications;
  • Advanced deep learning techniques to enable new sensing with commodity camera or sensors;
  • New nonvisible sensing technologies that leverage advanced computational or edge-AI algorithms;
  • Optical design aspects of CaaSS;
  • Electronic design aspects of CaaSS, including new ISP hardware architectures;
  • CaaSS research targeted at privacy management or energy optimization;
  • Large-scale deployments or novel products or commercial systems that leverage CaaSS in new use-cases or industry applications (e.g., smart city deployments, checkout-free shops, quality assurance and inspection lines, domestic service robotics).

Prof. Dr. Peter Corcoran
Prof. Dr. Saraju P. Mohanty
Collection Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

 
 

Keywords

  • Digital imaging
  • Digital camera
  • CMOS sensor
  • Embedded computer vision
  • Edge-AI
  • Camera as a Smart Sensor (CaaSS)
 
 

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (12 papers)


Research


19 pages, 2941 KiB  
Article
A Manycore Vision Processor for Real-Time Smart Cameras
by Bruno A. da Silva, Arthur M. Lima, Janier Arias-Garcia, Michael Huebner and Jones Yudi
Sensors 2021, 21(21), 7137; https://doi.org/10.3390/s21217137 - 27 Oct 2021
Cited by 1 | Viewed by 3280
Abstract
Real-time image processing and computer vision systems are now in the mainstream of technologies enabling applications for cyber-physical systems, the Internet of Things, augmented reality, and Industry 4.0. These applications bring the need for Smart Cameras for local real-time processing of images and videos. However, the massive amount of data to be processed within short deadlines cannot be handled by most commercial cameras. In this work, we show the design and implementation of a manycore vision processor architecture to be used in Smart Cameras. Exploiting massive parallelism and application-specific characteristics, our architecture is composed of distributed processing elements and memories connected through a Network-on-Chip. The architecture was implemented as an FPGA overlay, focusing on optimized hardware utilization. The parameterized architecture was characterized by its hardware occupation, maximum operating frequency, and processing frame rate. Different configurations ranging from one to eighty-one processing elements were implemented and compared to several works from the literature. Using a System-on-Chip composed of an FPGA integrated into a general-purpose processor, we showcase the flexibility and efficiency of the hardware/software architecture. The results show that the proposed architecture successfully combines programmability and performance, being a suitable alternative for future Smart Cameras. Full article

16 pages, 3489 KiB  
Article
DisCaaS: Micro Behavior Analysis on Discussion by Camera as a Sensor
by Ko Watanabe, Yusuke Soneda, Yuki Matsuda, Yugo Nakamura, Yutaka Arakawa, Andreas Dengel and Shoya Ishimaru
Sensors 2021, 21(17), 5719; https://doi.org/10.3390/s21175719 - 25 Aug 2021
Cited by 14 | Viewed by 3914
Abstract
The emergence of various types of commercial cameras (compact, high resolution, high angle of view, high speed, high dynamic range, etc.) has contributed significantly to the understanding of human activities. Taking advantage of a high angle of view, this paper demonstrates a system that recognizes micro-behaviors in a small group discussion with a single 360-degree camera, towards quantified meeting analysis. We propose a method that recognizes speaking and nodding, which have often been overlooked in existing research, from a video stream of face images using a random forest classifier. The proposed approach was evaluated on our three datasets. To create the first and second datasets, we asked participants to meet physically: 16 sets of five-minute data from 21 unique participants and seven sets of 10-minute meeting data from 12 unique participants. The experimental results showed that our approach could detect speaking and nodding with a macro average F1-score of 67.9% in a 10-fold random split cross-validation and a macro average F1-score of 62.5% in a leave-one-participant-out cross-validation. Considering the increased demand for online meetings due to the COVID-19 pandemic, we also recorded faces on a screen captured by web cameras as the third dataset and discussed the potential and challenges of applying our ideas to virtual video conferences. Full article
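
The frame-level pipeline described in this abstract — per-frame facial features fed to a random forest — can be approximated with off-the-shelf tools. The sketch below is illustrative only, not the authors' implementation: the feature extractor, the landmark keys, and the synthetic training data are all assumptions used to show the pattern with scikit-learn.

```python
# Minimal sketch of frame-level micro-behavior classification with a random forest,
# in the spirit of the approach described above (not the authors' code).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def extract_frame_features(face_landmarks):
    """Hypothetical per-frame features: a mouth-opening ratio (speaking cue) and
    a head-pitch proxy (nodding cue), computed from detected face landmarks."""
    mouth_open = face_landmarks["mouth_height"] / face_landmarks["mouth_width"]
    head_pitch = face_landmarks["nose_y"] - face_landmarks["eyes_y"]
    return [mouth_open, head_pitch]

# Stand-ins for real per-frame features and annotations (0: none, 1: speaking, 2: nodding).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = rng.integers(0, 3, size=1000)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=10, scoring="f1_macro")   # macro F1, as reported above
print("10-fold macro F1: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```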

21 pages, 7745 KiB  
Article
Transformations in the Photogrammetric Co-Processing of Thermal Infrared Images and RGB Images
by Adam Dlesk, Karel Vach and Karel Pavelka
Sensors 2021, 21(15), 5061; https://doi.org/10.3390/s21155061 - 26 Jul 2021
Cited by 5 | Viewed by 2731
Abstract
The photogrammetric processing of thermal infrared (TIR) images deals with several difficulties. TIR images ordinarily have low resolution and very low contrast. These factors strongly complicate photogrammetric processing, especially when a modern structure-from-motion method is used, but they can be avoided by co-processing the TIR and RGB images. Two co-processing solutions were suggested by the authors and are presented in this article. Each solution requires a different type of transformation: plane transformation or spatial transformation. Both types of transformation are discussed in this paper. The experiments performed illustrate the requirements, advantages, disadvantages, and results of the transformations, which are evaluated mainly in terms of accuracy. The transformations are presented on the suggested methods, but they can easily be applied to other methods of co-processing TIR and RGB images. Full article
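
A plane (projective) transformation between co-acquired TIR and RGB frames can be illustrated with OpenCV, assuming corresponding control points are available in both images. This is a generic sketch of the idea, not the paper's specific transformations; the file names and point coordinates are placeholders.

```python
# Sketch of a plane (projective) transformation mapping a low-resolution TIR image
# onto a co-acquired RGB image from corresponding control points. Illustrative only.
import cv2
import numpy as np

tir = cv2.imread("frame_tir.png")      # hypothetical file names
rgb = cv2.imread("frame_rgb.png")

# Corresponding (x, y) control points picked in the TIR image and in the RGB image.
pts_tir = np.array([[32, 40], [280, 38], [285, 210], [30, 215]], dtype=np.float32)
pts_rgb = np.array([[150, 180], [1750, 175], [1770, 1260], [140, 1275]], dtype=np.float32)

H, _ = cv2.findHomography(pts_tir, pts_rgb, cv2.RANSAC)
tir_on_rgb = cv2.warpPerspective(tir, H, (rgb.shape[1], rgb.shape[0]))

overlay = cv2.addWeighted(rgb, 0.6, tir_on_rgb, 0.4, 0)  # visual check of the co-registration
cv2.imwrite("overlay.png", overlay)
```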

16 pages, 4551 KiB  
Article
Infrared Small Target Detection Method with Trajectory Correction Fuze Based on Infrared Image Sensor
by Cong Zhang, Dongguang Li, Jiashuo Qi, Jingtao Liu and Yu Wang
Sensors 2021, 21(13), 4522; https://doi.org/10.3390/s21134522 - 1 Jul 2021
Cited by 10 | Viewed by 3874
Abstract
Due to the complexity of the background and the diversity of small targets, robust detection of infrared small targets for the trajectory correction fuze has become a challenge. To solve this problem, departing from the traditional methods, a state-of-the-art detection method based on a density-distance space is proposed for the trajectory correction fuze. First, the parameters of the infrared image sensor on the fuze are calculated to set boundary limitations for the target detection method. Second, the density-distance space method is proposed to detect candidate targets. Finally, the adaptive pixel growth (APG) algorithm is used to suppress clutter so as to detect the real targets. Three experiments, including equivalent detection, simulation and hardware-in-the-loop, were implemented to verify the effectiveness of this method. The results illustrate that the infrared image sensor on the fuze has a stable field of view under rotation of the projectile and can clearly observe the infrared small target. The proposed method offers superior noise resistance, detection of targets of different sizes, multi-target detection, and suppression of various types of clutter. Compared with six novel algorithms, our algorithm shows perfect detection performance and acceptable time consumption. Full article
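
The abstract does not define the density-distance space in detail; one common construction of that kind (used in density-peaks clustering) ranks each bright pixel by its local density and by its distance to the nearest pixel of higher density. The sketch below assumes such a construction purely for illustration; it is not the authors' method.

```python
# Assumed density-distance construction for candidate small-target detection: for each
# bright pixel, compute a local density and the distance to the nearest pixel of higher
# density; candidate targets score high on both. Density-peaks-style ranking, illustrative only.
import numpy as np
from scipy.ndimage import uniform_filter

def candidate_targets(img, top_k=5, win=5):
    density = uniform_filter(img.astype(float), size=win)      # local mean as density
    ys, xs = np.nonzero(img > img.mean() + 2 * img.std())      # consider bright pixels only
    pts = np.stack([ys, xs], axis=1)
    dens = density[ys, xs]
    delta = np.empty(len(pts))
    for i, (p, d) in enumerate(zip(pts, dens)):
        higher = pts[dens > d]
        delta[i] = np.min(np.linalg.norm(higher - p, axis=1)) if len(higher) else np.inf
    finite_max = delta[np.isfinite(delta)].max(initial=1.0)
    score = dens * np.where(np.isinf(delta), finite_max, delta)
    order = np.argsort(score)[::-1]
    return pts[order[:top_k]]                                   # (row, col) of target-like pixels

frame = np.random.rand(128, 128) * 0.2
frame[60, 70] = 1.0                                             # synthetic small target
print(candidate_targets(frame, top_k=3))
```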

16 pages, 4129 KiB  
Article
Recognition of Cosmic Ray Images Obtained from CMOS Sensors Used in Mobile Phones by Approximation of Uncertain Class Assignment with Deep Convolutional Neural Network
by Tomasz Hachaj, Łukasz Bibrzycki and Marcin Piekarczyk
Sensors 2021, 21(6), 1963; https://doi.org/10.3390/s21061963 - 11 Mar 2021
Cited by 11 | Viewed by 3818
Abstract
In this paper, we describe a convolutional neural network (CNN)-based approach to the categorization and artefact reduction of cosmic ray images obtained from CMOS sensors used in mobile phones. By artefacts, we mean all images that cannot be attributed to particles' passage through the sensor but rather result from deficiencies of the registration procedure. The proposed deep neural network is composed of a pretrained CNN and a neural-network-based approximator, which models the uncertainty of image class assignment. The network was trained using a transfer learning approach with a mean squared error loss function. We evaluated our approach on a dataset containing 2350 images labelled by five judges. The most accurate results were obtained using the VGG16 CNN architecture; the recognition rate (RR) was 85.79% ± 2.24% with a mean squared error (MSE) of 0.03 ± 0.00. After applying the proposed threshold scheme to eliminate less probable class assignments, we obtained an RR of 96.95% ± 1.38% for a threshold of 0.9, which retained about 62.60% ± 2.88% of the overall data. Importantly, the research and results presented in this paper are part of the pioneering field of applying citizen science to the recognition of cosmic rays and, to the best of our knowledge, this analysis is performed on the largest freely available cosmic ray hit dataset. Full article
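
The ingredients named in this abstract (pretrained VGG16, MSE loss against uncertain class assignments, a 0.9 confidence threshold) can be mocked up with Keras. The sketch below is a minimal illustration under assumed settings — the number of classes, crop size, and training data are placeholders, not the paper's configuration.

```python
# Minimal transfer-learning sketch: a frozen VGG16 backbone with a small head trained
# with MSE against "soft" class assignments (e.g., judge-vote fractions). Illustrative only.
import numpy as np
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

NUM_CLASSES = 4                                   # assumed number of hit/artefact classes
base = VGG16(weights="imagenet", include_top=False, input_shape=(64, 64, 3))
base.trainable = False                            # transfer learning: freeze the backbone

x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(128, activation="relu")(x)
out = layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = Model(base.input, out)
model.compile(optimizer="adam", loss="mse")       # MSE against uncertain (soft) labels

# x_train: cropped sensor hits; y_train: per-image class-probability vectors (stand-ins here).
x_train = np.random.rand(32, 64, 64, 3).astype("float32")
y_train = np.random.dirichlet(np.ones(NUM_CLASSES), size=32).astype("float32")
model.fit(x_train, y_train, epochs=1, batch_size=8)

# Keep only confident predictions, mirroring the thresholding scheme described above.
probs = model.predict(x_train)
confident = probs.max(axis=1) >= 0.9
print("kept", confident.mean() * 100, "% of samples")
```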

13 pages, 6632 KiB  
Article
Feature Point Registration Model of Farmland Surface and Its Application Based on a Monocular Camera
by Yang Li, Dongyan Huang, Jiangtao Qi, Sikai Chen, Huibin Sun, Huili Liu and Honglei Jia
Sensors 2020, 20(13), 3799; https://doi.org/10.3390/s20133799 - 7 Jul 2020
Cited by 7 | Viewed by 2811
Abstract
In this study, an image registration algorithm was applied to calculate the rotation angle of objects when matching images. Several commonly used feature detection algorithms, such as features from accelerated segment test (FAST), speeded up robust features (SURF) and maximally stable extremal regions (MSER), were chosen as feature extraction components. Comparing running time and accuracy, the image registration algorithm based on SURF performed better than the other algorithms. Accurately obtaining the roll angle is one of the key technologies for improving the positioning accuracy and operation quality of agricultural equipment. To acquire the roll angle of agricultural machinery, a roll angle acquisition model based on the image registration algorithm was built. The performance of the model with a monocular camera was then tested in the field. The field test showed that the average error of the roll angle was 0.61°, while the minimum error was 0.08°, and indicated that the model could accurately capture the attitude change trend of agricultural machinery working in irregular farmlands. The model described in this paper could provide a foundation for agricultural equipment navigation and autonomous driving. Full article
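
Estimating a roll angle from feature-based registration of two frames can be illustrated with OpenCV. SURF lives in opencv-contrib (cv2.xfeatures2d) and may not be available in a default install, so the sketch below uses ORB as a stand-in; the file names are placeholders and this is not the authors' implementation.

```python
# Sketch: estimate the rotation (roll) angle between two frames by feature registration.
import cv2
import numpy as np

img1 = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file names
img2 = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)
k1, d1 = orb.detectAndCompute(img1, None)
k2, d2 = orb.detectAndCompute(img2, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:200]

src = np.float32([k1[m.queryIdx].pt for m in matches])
dst = np.float32([k2[m.trainIdx].pt for m in matches])

# Similarity transform (rotation + translation + scale); RANSAC rejects outlier matches.
M, _ = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
roll_deg = np.degrees(np.arctan2(M[1, 0], M[0, 0]))
print("estimated roll angle: %.2f deg" % roll_deg)
```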

23 pages, 4246 KiB  
Article
Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor
by Lu Xiong, Yongkun Wen, Yuyao Huang, Junqiao Zhao and Wei Tian
Sensors 2020, 20(13), 3737; https://doi.org/10.3390/s20133737 - 3 Jul 2020
Cited by 4 | Viewed by 3795
Abstract
We propose a completely unsupervised approach to simultaneously estimate scene depth, ego-pose, ground segmentation and the ground normal vector from only monocular RGB video sequences. In our approach, the estimates for different scene structures mutually benefit each other through joint optimization. Specifically, we use a mutual information loss to pre-train the ground segmentation network before adding the corresponding self-learning label obtained by a geometric method. By exploiting the static nature of the ground and its normal vector, the scene depth and ego-motion can be efficiently learned by the self-supervised learning procedure. Extensive experimental results on both the Cityscapes and KITTI benchmarks demonstrate the significant improvement in estimation accuracy for both scene depth and ego-pose achieved by our approach. We also achieve an average error of about 3° for the estimated ground normal vectors. By deploying our proposed geometric constraints, the IoU accuracy of unsupervised ground segmentation is increased by 35% on the Cityscapes dataset. Full article
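
A geometric label for the ground normal can, in general, be obtained by fitting a plane to 3D points assumed to lie on the ground (for example, points back-projected from the estimated depth of ground pixels). The sketch below shows a generic SVD plane fit and the angular-error check; it is not the paper's specific geometric method, and the synthetic points are placeholders.

```python
# Generic plane fit used as a geometric label for a ground normal vector. Illustrative only.
import numpy as np

def fit_ground_normal(points_xyz):
    """Least-squares plane normal of an (N, 3) array of ground points via SVD."""
    centered = points_xyz - points_xyz.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]                          # direction of least variance
    return normal / np.linalg.norm(normal)

# Synthetic, slightly tilted ground plane with noise.
rng = np.random.default_rng(1)
xy = rng.uniform(-5, 5, size=(500, 2))
z = 0.02 * xy[:, 0] + 0.01 * xy[:, 1] + rng.normal(0, 0.01, 500)
normal = fit_ground_normal(np.column_stack([xy, z]))

true_normal = np.array([-0.02, -0.01, 1.0])
true_normal /= np.linalg.norm(true_normal)
err_deg = np.degrees(np.arccos(abs(normal @ true_normal)))
print("angular error: %.2f deg" % err_deg)
```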

11 pages, 5441 KiB  
Article
Measurement for the Thickness of Water Droplets/Film on a Curved Surface with Digital Image Projection (DIP) Technique
by Lingwei Zeng, Hanfeng Wang, Ying Li and Xuhui He
Sensors 2020, 20(8), 2409; https://doi.org/10.3390/s20082409 - 23 Apr 2020
Viewed by 3740
Abstract
Digital image projection (DIP) with traditional vertical calibration cannot be used for measuring water droplets/film on a curved surface, because significant systematic error would be introduced. An improved DIP technique with normal calibration is proposed in the present paper, including its principles, operation procedures and an analysis of systematic errors, and it was successfully applied to measuring water droplets/film on a curved surface. By comparing the results of a laser profiler, the traditional DIP, the improved DIP and theoretical analysis, the advantages of the improved DIP technique are highlighted. Full article

15 pages, 1970 KiB  
Article
A Unified Deep Framework for Joint 3D Pose Estimation and Action Recognition from a Single RGB Camera
by Huy Hieu Pham, Houssam Salmane, Louahdi Khoudour, Alain Crouzil, Sergio A. Velastin and Pablo Zegers
Sensors 2020, 20(7), 1825; https://doi.org/10.3390/s20071825 - 25 Mar 2020
Cited by 39 | Viewed by 6735
Abstract
We present a deep learning-based multitask framework for joint 3D human pose estimation and action recognition from RGB sensors using simple cameras. The approach proceeds along two stages. In the first, a real-time 2D pose detector is run to determine the precise pixel location of important keypoints of the human body. A two-stream deep neural network is then designed and trained to map detected 2D keypoints into 3D poses. In the second stage, the Efficient Neural Architecture Search (ENAS) algorithm is deployed to find an optimal network architecture that is used for modeling the spatio-temporal evolution of the estimated 3D poses via an image-based intermediate representation and performing action recognition. Experiments on Human3.6M, MSR Action3D and SBU Kinect Interaction datasets verify the effectiveness of the proposed method on the targeted tasks. Moreover, we show that the method requires a low computational budget for training and inference. In particular, the experimental results show that by using a monocular RGB sensor, we can develop a 3D pose estimation and human action recognition approach that reaches the performance of RGB-depth sensors. This opens up many opportunities for leveraging RGB cameras (which are much cheaper than depth cameras and extensively deployed in private and public places) to build intelligent recognition systems. Full article
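
The first stage described above lifts detected 2D keypoints to 3D poses with a learned network. The sketch below is a deliberately simplified, single-stream lifting module in PyTorch; the paper's actual design is a two-stream network whose action-recognition architecture is found via ENAS, and the joint count used here is an assumption.

```python
# Simplified sketch of the 2D-to-3D "lifting" stage: a small fully connected network
# mapping detected 2D keypoints to 3D joint positions. Illustrative only.
import torch
import torch.nn as nn

NUM_JOINTS = 17                               # assumed COCO-style skeleton

class Lifter(nn.Module):
    def __init__(self, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_JOINTS * 2, hidden), nn.ReLU(), nn.Dropout(0.25),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(0.25),
            nn.Linear(hidden, NUM_JOINTS * 3),
        )

    def forward(self, kpts2d):                # (B, NUM_JOINTS, 2) -> (B, NUM_JOINTS, 3)
        out = self.net(kpts2d.flatten(1))
        return out.view(-1, NUM_JOINTS, 3)

model = Lifter()
pose3d = model(torch.randn(4, NUM_JOINTS, 2))
print(pose3d.shape)                           # torch.Size([4, 17, 3])
```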

21 pages, 7594 KiB  
Article
SLAM-Based Self-Calibration of a Binocular Stereo Vision Rig in Real-Time
by Hesheng Yin, Zhe Ma, Ming Zhong, Kuan Wu, Yuteng Wei, Junlong Guo and Bo Huang
Sensors 2020, 20(3), 621; https://doi.org/10.3390/s20030621 - 22 Jan 2020
Cited by 17 | Viewed by 6872
Abstract
The calibration problem of a binocular stereo vision rig is critical for its practical application. However, most existing calibration methods are based on manual, off-line algorithms for specific reference targets or patterns. In this paper, we propose a novel simultaneous localization and mapping (SLAM)-based self-calibration method designed to achieve real-time, automatic and accurate calibration of the binocular stereo vision (BSV) rig's extrinsic parameters in a short period without auxiliary equipment or special calibration markers, assuming the intrinsic parameters of the left and right cameras are known in advance. The main contribution of this paper is to use the SLAM algorithm as the main tool for the calibration method. The method consists of two parts: SLAM-based construction of a 3D scene point map and extrinsic parameter calibration. In the first part, the SLAM constructs a 3D feature point map of the natural environment, which is used as the calibration area map. To improve the efficiency of calibration, a lightweight, real-time visual SLAM is built. In the second part, the extrinsic parameters are calibrated from the 3D scene point map created by the SLAM. Finally, field experiments are performed to evaluate the feasibility, repeatability, and efficiency of our self-calibration method. The experimental data show that the average absolute error of the Euler angles and translation vectors obtained by our method relative to the reference values obtained by Zhang's calibration method does not exceed 0.5° and 2 mm, respectively. The distribution range of the most widely spread parameter among the Euler angles is less than 0.2°, while that among the translation vectors does not exceed 2.15 mm. Under a general texture scene and the normal driving speed of the mobile robot, the calibration time can generally be kept within 10 s. The above results prove that our proposed method is reliable and has practical value. Full article
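
One standard building block for recovering the extrinsics between two rigidly mounted cameras from a shared 3D map is rigid alignment of the map points as seen in each camera frame (Kabsch/Umeyama). The sketch below shows only that building block on synthetic data; it is not the paper's full SLAM-based pipeline, which supplies the point sets and refines the estimate.

```python
# Building-block sketch: recover the rigid transform (R, t) between two camera frames
# by aligning corresponding 3D points expressed in each frame. Illustrative only.
import numpy as np

def rigid_align(P, Q):
    """Find R, t minimizing ||R @ P_i + t - Q_i|| for corresponding (N, 3) point sets."""
    mu_p, mu_q = P.mean(axis=0), Q.mean(axis=0)
    H = (P - mu_p).T @ (Q - mu_q)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1, 1, np.sign(np.linalg.det(Vt.T @ U.T))])   # guard against reflections
    R = Vt.T @ D @ U.T
    t = mu_q - R @ mu_p
    return R, t

# Synthetic check against a known extrinsic rotation (about 2 deg) and translation.
rng = np.random.default_rng(0)
P = rng.uniform(-5, 5, size=(200, 3))                          # points in the left-camera frame
angle = np.radians(2.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0],
                   [np.sin(angle),  np.cos(angle), 0],
                   [0, 0, 1]])
t_true = np.array([0.12, 0.0, 0.003])
Q = P @ R_true.T + t_true + rng.normal(0, 0.002, P.shape)      # points in the right-camera frame

R_est, t_est = rigid_align(P, Q)
rot_err = np.degrees(np.arccos(np.clip((np.trace(R_est.T @ R_true) - 1) / 2, -1.0, 1.0)))
print("rotation error (deg):", rot_err, "translation error:", t_est - t_true)
```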

Review


16 pages, 797 KiB  
Review
Deep Learning-Based Monocular Depth Estimation Methods—A State-of-the-Art Review
by Faisal Khan, Saqib Salahuddin and Hossein Javidnia
Sensors 2020, 20(8), 2272; https://doi.org/10.3390/s20082272 - 16 Apr 2020
Cited by 91 | Viewed by 18361
Abstract
Monocular depth estimation from Red-Green-Blue (RGB) images is a well-studied ill-posed problem in computer vision which has been investigated intensively over the past decade using Deep Learning (DL) approaches. The recent approaches for monocular depth estimation mostly rely on Convolutional Neural Networks (CNN). Estimating depth from two-dimensional images plays an important role in various applications including scene reconstruction, 3D object-detection, robotics and autonomous driving. This survey provides a comprehensive overview of this research topic including the problem representation and a short description of traditional methods for depth estimation. Relevant datasets and 13 state-of-the-art deep learning-based approaches for monocular depth estimation are reviewed, evaluated and discussed. We conclude this paper with a perspective towards future research work requiring further investigation in monocular depth estimation challenges. Full article
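
Monocular depth methods of the kind surveyed here are usually compared with a small set of standard metrics (absolute relative error, RMSE, and delta-threshold accuracies). The helper below follows those common conventions on synthetic data; it is offered as a generic reference, not as the evaluation code of any specific reviewed method.

```python
# Common evaluation metrics for monocular depth estimation (abs-rel, RMSE, delta thresholds).
import numpy as np

def depth_metrics(pred, gt, eps=1e-6):
    """pred, gt: arrays of positive depths over valid pixels (same shape)."""
    pred, gt = pred.ravel(), gt.ravel()
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    ratio = np.maximum(pred / (gt + eps), gt / (pred + eps))
    return {
        "abs_rel": abs_rel,
        "rmse": rmse,
        "d1": np.mean(ratio < 1.25),
        "d2": np.mean(ratio < 1.25 ** 2),
        "d3": np.mean(ratio < 1.25 ** 3),
    }

gt = np.random.uniform(1.0, 80.0, size=10000)          # synthetic ground-truth depths (m)
pred = gt * np.random.normal(1.0, 0.1, size=gt.shape)  # synthetic predictions
print(depth_metrics(pred, gt))
```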

Other


12 pages, 695 KiB  
Letter
Efficient Star Identification Using a Neural Network
by David Rijlaarsdam, Hamza Yous, Jonathan Byrne, Davide Oddenino, Gianluca Furano and David Moloney
Sensors 2020, 20(13), 3684; https://doi.org/10.3390/s20133684 - 30 Jun 2020
Cited by 26 | Viewed by 5022
Abstract
The required precision for attitude determination in spacecraft is increasing, creating a need for more accurate attitude determination sensors. The star sensor or star tracker provides unmatched arc-second precision and, with the rise of microsatellites, these sensors are becoming smaller, faster and more efficient. The most critical component in the star sensor system is the lost-in-space star identification algorithm, which identifies stars in a scene without a priori attitude information. In this paper, we present an efficient lost-in-space star identification algorithm using a neural network and a robust, novel feature extraction method. Since a neural network implicitly stores the patterns associated with a guide star, a database lookup is eliminated from the matching process. The search time is therefore not influenced by the number of patterns stored in the network, making it constant (O(1)). This search time is unrivalled by other star identification algorithms. The presented algorithm provides excellent performance in a simple and lightweight design, making neural networks the preferred choice for star identification algorithms. Full article
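
The general pattern — a rotation-invariant geometric feature per guide star, classified by a network so that no database search is needed at match time — can be sketched as below. The feature used here (sorted angular distances to the nearest neighbours) and the catalogue size are assumptions for illustration; the paper's own feature extraction method is its novel contribution and is not reproduced.

```python
# General lost-in-space pattern only (not the paper's feature design): build a
# rotation-invariant feature from angular distances to neighbouring stars and let a
# small network predict the catalogue index, giving constant-time identification.
import numpy as np
import torch
import torch.nn as nn

K = 4                                             # neighbours per feature vector (assumed)
NUM_GUIDE_STARS = 1000                            # hypothetical catalogue size

def star_feature(target_vec, neighbour_vecs):
    """Sorted angular distances (radians) from a unit vector to its K nearest neighbours."""
    cosines = np.clip(neighbour_vecs @ target_vec, -1.0, 1.0)
    return np.sort(np.arccos(cosines))[:K]

model = nn.Sequential(
    nn.Linear(K, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, NUM_GUIDE_STARS),              # one output per catalogue star
)

# One forward pass per detected star: no database lookup during matching.
rng = np.random.default_rng(0)
vecs = rng.normal(size=(20, 3))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)              # unit star direction vectors
feat = star_feature(vecs[0], vecs[1:])
star_id = model(torch.tensor(feat, dtype=torch.float32).unsqueeze(0)).argmax(dim=1).item()
print("predicted catalogue index:", star_id)
```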
