
Intelligent Sensors and Computer Vision

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Intelligent Sensors".

Deadline for manuscript submissions: closed (20 October 2020) | Viewed by 55910

Special Issue Editors


Prof. Simone Bianco
Guest Editor
Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy
Interests: computer vision; machine learning; optimization

Dr. Marco Buzzelli
Guest Editor
Department of Informatics, Systems and Communication, University of Milano-Bicocca, 20126 Milan, Italy
Interests: signal/image/video processing and understanding; color imaging; machine learning

Prof. Raimondo Schettini
Guest Editor
Department of Informatics, Systems and Communication, University of Milano-Bicocca, Viale Sarca 336, 20126 Milan, Italy
Interests: color imaging; image and video processing; analysis and classification; visual information systems; image quality

Dr. Joost van de Weijer
Guest Editor
Learning and Machine Perception (LAMP) Group, Computer Vision Center, Barcelona, Spain
Interests: computer vision; machine learning; continual learning; active learning

Special Issue Information

Dear Colleagues,

The purpose of this Special Issue is to introduce the current developments in Intelligent Sensors and Computer Vision applications exploiting artificial intelligence (AI) techniques.

Intelligent Sensors are devices that may now incorporate different imaging technologies both to detect events in their environment and to perform logical functions on their own readings. Visual data can be combined with a wide variety of information, ranging from movements (body, head, gait, posture, etc.) to physiological data (heart, brain, skin, etc.) and environmental data (audio, location, noise, etc.). Moreover, Intelligent Sensors can now be equipped with more powerful processing resources, thus enabling higher-complexity reasoning based on advanced multimodal, Computer Vision, and AI techniques.

The joint exploitation of vision and other signals within Intelligent Sensors opens the door to multiple application fields. Depending on the domain of action, possible contributions to this Special Issue include, but are not limited to, the following:

  • Video and multimodal human action recognition
  • Video and multimodal anomaly detection for assisted living and surveillance
  • Video and multimodal human-computer interaction
  • Video and multimodal industrial quality control
  • Video and multimodal advanced driver assistant systems (ADAS)
  • Deep network architectures for computer vision and multimodal understanding
  • Supervised, semi-supervised, and unsupervised learning from video and multimodal signals
  • Vision and multimodal understanding applications such as sports, entertainment, healthcare, etc.
  • Time series multimodal data analysis
  • Deep learning models for multimodal fusion and analysis
  • Security and privacy issues for multimodal sensor data
  • Modeling, understanding, and leveraging of multimodal signals that can be acquired by intelligent sensors
  • Video and multimodal data representation, summarization, and visualization
  • Video and multimodal emotion/affect recognition and modeling
Prof. Simone Bianco
Dr. Marco Buzzelli
Prof. Raimondo Schettini
Dr. Joost van de Weijer
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Computer vision
  • Intelligent sensors
  • Video processing and understanding
  • Image processing and understanding
  • Multimodal processing and understanding
  • Deep learning and machine learning
  • Embedded computer vision
  • Human action recognition
  • Anomaly detection for assisted living
  • Anomaly detection for video surveillance
  • Human–computer interaction
  • Industrial quality control
  • Advanced driver assistance systems (ADAS)
  • Emotion/affect recognition and modeling
  • Supervised, semi-supervised, and unsupervised learning
  • Data representation, summarization, and visualization
  • Security and privacy issues for multimodal sensor data

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (14 papers)

Research

17 pages, 5663 KiB  
Article
A Genetic Algorithm to Combine Deep Features for the Aesthetic Assessment of Images Containing Faces
by Luigi Celona and Raimondo Schettini
Sensors 2021, 21(4), 1307; https://doi.org/10.3390/s21041307 - 12 Feb 2021
Cited by 6 | Viewed by 3203
Abstract
The automatic assessment of the aesthetic quality of a photo is a challenging and extensively studied problem. Most of the existing works focus on the aesthetic quality assessment of photos regardless of the depicted subject and mainly use features extracted from the entire image. It has been observed that the performance of generic content aesthetic assessment methods significantly decreases when it comes to images depicting faces. This paper introduces a method for evaluating the aesthetic quality of images with faces by encoding both the properties of the entire image and specific aspects of the face. Three different convolutional neural networks are exploited to encode information regarding perceptual quality, global image aesthetics, and facial attributes; then, a model is trained to combine these features to explicitly predict the aesthetics of images containing faces. Experimental results show that our approach outperforms existing state-of-the-art methods for both binary (i.e., low/high) and continuous aesthetic score prediction on four different image databases. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
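
The paper's exact pipeline is not reproduced here, but the following minimal sketch illustrates the general idea of using a genetic algorithm to select which pre-extracted deep feature dimensions (quality, aesthetics, face attributes) feed a simple score regressor. The feature shapes, the ridge regressor, and the fitness design are assumptions made for illustration only.

```python
# Toy genetic algorithm over a binary feature-selection mask; synthetic data
# stands in for features extracted by the three CNNs described in the abstract.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 96)), rng.uniform(0, 10, 200)
X_val, y_val = rng.normal(size=(50, 96)), rng.uniform(0, 10, 50)

def fitness(mask):
    """Lower validation MSE -> higher fitness for the selected feature subset."""
    if mask.sum() == 0:
        return -np.inf
    model = Ridge(alpha=1.0).fit(X_train[:, mask], y_train)
    return -mean_squared_error(y_val, model.predict(X_val[:, mask]))

pop = rng.integers(0, 2, size=(20, X_train.shape[1])).astype(bool)
for _ in range(30):                                      # generations
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]              # keep the fittest half
    cuts = rng.integers(1, pop.shape[1], size=10)        # one-point crossover
    children = np.array([np.concatenate([parents[i][:c], parents[(i + 1) % 10][c:]])
                         for i, c in enumerate(cuts)])
    children ^= rng.random(children.shape) < 0.02        # bit-flip mutation
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected", int(best.sum()), "of", best.size, "feature dimensions")
```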

15 pages, 6517 KiB  
Article
T1K+: A Database for Benchmarking Color Texture Classification and Retrieval Methods
by Claudio Cusano, Paolo Napoletano and Raimondo Schettini
Sensors 2021, 21(3), 1010; https://doi.org/10.3390/s21031010 - 2 Feb 2021
Cited by 11 | Viewed by 3479
Abstract
In this paper we present T1K+, a very large, heterogeneous database of high-quality texture images acquired under variable conditions. T1K+ contains 1129 classes of textures ranging from natural subjects to food, textile samples, construction materials, etc. T1K+ allows the design of experiments especially aimed at understanding the specific issues related to texture classification and retrieval. To help the exploration of the database, all the 1129 classes are hierarchically organized in 5 thematic categories and 266 sub-categories. To complete our study, we present an evaluation of hand-crafted and learned visual descriptors in supervised texture classification tasks. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)

18 pages, 1988 KiB  
Article
Revisiting the CompCars Dataset for Hierarchical Car Classification: New Annotations, Experiments, and Results
by Marco Buzzelli and Luca Segantin
Sensors 2021, 21(2), 596; https://doi.org/10.3390/s21020596 - 15 Jan 2021
Cited by 12 | Viewed by 3952
Abstract
We address the task of classifying car images at multiple levels of detail, ranging from the top-level car type, down to the specific car make, model, and year. We analyze existing datasets for car classification, and identify the CompCars as an excellent starting point for our task. We show that convolutional neural networks achieve an accuracy above 90% on the finest-level classification task. This high performance, however, is scarcely representative of real-world situations, as it is evaluated on a biased training/test split. In this work, we revisit the CompCars dataset by first defining a new training/test split, which better represents real-world scenarios by setting a more realistic baseline at 61% accuracy on the new test set. We also propagate the existing (but limited) type-level annotation to the entire dataset, and we finally provide a car-tight bounding box for each image, automatically defined through an ad hoc car detector. To evaluate this revisited dataset, we design and implement three different approaches to car classification, two of which exploit the hierarchical nature of car annotations. Our experiments show that higher-level classification in terms of car type positively impacts classification at a finer grain, now reaching 70% accuracy. The achieved performance constitutes a baseline benchmark for future research, and our enriched set of annotations is made available for public download. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
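
One way to exploit hierarchical annotations of the kind described above is to let a coarse car-type prediction condition the fine-grained make/model classifier. The sketch below shows such a two-head network; the ResNet-18 backbone and the class counts are placeholders and do not correspond to the paper's architecture.

```python
# Hierarchical classifier sketch: the fine head consumes the backbone features
# concatenated with the softmax over coarse car types.
import torch
import torch.nn as nn
from torchvision import models

class HierarchicalCarClassifier(nn.Module):
    def __init__(self, num_types=12, num_models=431):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop fc
        feat_dim = backbone.fc.in_features
        self.type_head = nn.Linear(feat_dim, num_types)
        # Fine-grained head sees both image features and the coarse prediction.
        self.model_head = nn.Linear(feat_dim + num_types, num_models)

    def forward(self, x):
        f = self.features(x).flatten(1)
        type_logits = self.type_head(f)
        type_probs = type_logits.softmax(dim=1)
        model_logits = self.model_head(torch.cat([f, type_probs], dim=1))
        return type_logits, model_logits

net = HierarchicalCarClassifier()
type_logits, model_logits = net(torch.randn(2, 3, 224, 224))
print(type_logits.shape, model_logits.shape)  # [2, 12] and [2, 431]
```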

14 pages, 299 KiB  
Article
Classification Algorithm for Person Identification and Gesture Recognition Based on Hand Gestures with Small Training Sets
by Krzysztof Rzecki
Sensors 2020, 20(24), 7279; https://doi.org/10.3390/s20247279 - 18 Dec 2020
Cited by 9 | Viewed by 2567
Abstract
Classification algorithms require training data initially labelled by classes to build a model and then to be able to classify new data. The amount and diversity of training data affect the classification quality, and usually the larger the training set, the better the accuracy of classification. In many applications, only small amounts of training data are available. This article presents a new time series classification algorithm for problems with small training sets. The algorithm was tested on hand gesture recordings in tasks of person identification and gesture recognition. The algorithm provides significantly better classification accuracy than other machine learning algorithms. For 22 different hand gestures performed by 10 people and a training set size of 5 gesture execution records per class, the error rate for the newly proposed algorithm is from 37% to 75% lower than for the other compared algorithms. When the training set consists of only one sample per class, the new algorithm achieves a 45% to 95% lower error rate. The conducted experiments indicate that the algorithm outperforms state-of-the-art methods in terms of classification accuracy in the problems of person identification and gesture recognition. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)

23 pages, 14758 KiB  
Article
Towards Automated 3D Inspection of Water Leakages in Shield Tunnel Linings Using Mobile Laser Scanning Data
by Hongwei Huang, Wen Cheng, Mingliang Zhou, Jiayao Chen and Shuai Zhao
Sensors 2020, 20(22), 6669; https://doi.org/10.3390/s20226669 - 21 Nov 2020
Cited by 53 | Viewed by 5435
Abstract
On-site manual inspection of metro tunnel leakages suffers from low efficiency and poor accuracy. An automated, high-precision, and robust water leakage inspection method is therefore vital to improve on the manual approach. Existing approaches cannot provide the leakage location due to the lack of spatial information. Therefore, an integrated deep learning method for water leakage inspection using tunnel lining point cloud data from mobile laser scanning is presented in this paper. It is composed of three parts, as follows: (1) establishment of the water leakage dataset using the acquired point clouds of tunnel linings; (2) automated leakage detection via a mask-region-based convolutional neural network; and (3) visualization and quantitative evaluation of the water leakage in 3D space via a novel triangle mesh method. The testing result reveals that the proposed method achieves automated detection and evaluation of tunnel lining water leakages in 3D space, which provides inspectors with an intuitive overall 3D view of the detected water leakages and the leakage information (area, location, lining segments, etc.). Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
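
As a rough illustration of the quantitative evaluation step, the snippet below computes the surface area of a leakage region once it has been mapped onto a triangle mesh of the tunnel lining. The detection and mesh-generation stages are assumed to have been run upstream; the mesh here is synthetic.

```python
# Leakage area as the summed area of mesh triangles (vertices: Nx3, faces: Mx3).
import numpy as np

def mesh_area(vertices: np.ndarray, faces: np.ndarray) -> float:
    """Total surface area of a triangle mesh via per-triangle cross products."""
    a, b, c = vertices[faces[:, 0]], vertices[faces[:, 1]], vertices[faces[:, 2]]
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1).sum()

# Two triangles forming a 1 m x 1 m patch of lining -> area 1.0 m^2.
verts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
tris = np.array([[0, 1, 2], [0, 2, 3]])
print(f"leakage area: {mesh_area(verts, tris):.2f} m^2")
```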

22 pages, 3173 KiB  
Article
Visual Leakage Inspection in Chemical Process Plants Using Thermographic Videos and Motion Pattern Detection
by Mina Fahimipirehgalin, Birgit Vogel-Heuser, Emanuel Trunzer and Matthias Odenweller
Sensors 2020, 20(22), 6659; https://doi.org/10.3390/s20226659 - 20 Nov 2020
Cited by 2 | Viewed by 2738
Abstract
Liquid leakage from pipelines is a critical issue in large-scale chemical process plants since it can affect the normal operation of the plant and pose unsafe and hazardous situations. Therefore, leakage detection in the early stages can prevent serious damage. Developing a vision-based inspection system by means of IR imaging can be a promising approach for accurate leakage detection. IR cameras can capture the effect of leaking drops if they have higher (or lower) temperature than their surroundings. Since the leaking drops can be observed in an IR video as a repetitive phenomenon with specific patterns, motion pattern detection methods can be utilized for leakage detection. In this paper, an approach based on the Kalman filter is proposed to track the motion of leaking drops and differentiate them from noise. The motion patterns are learned from the training data and applied to the test data to evaluate the accuracy of the method. For this purpose, a laboratory demonstrator plant is assembled to simulate the leakages from pipelines, and to generate training and test videos. The results show that the proposed method can detect the leaking drops by tracking them based on obtained motion patterns. Furthermore, the possibilities and conditions for applying the proposed method in a real industrial chemical plant are discussed at the end. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
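
The following sketch shows a generic constant-velocity Kalman filter of the kind the abstract describes, tracking a drop's image position across IR frames so that repetitive leak motion can be separated from noise. The noise covariances and the measurement sequence are invented for the example.

```python
# Constant-velocity Kalman filter over a drop's (x, y) pixel position.
import numpy as np

dt = 1.0                                   # one frame
F = np.array([[1, 0, dt, 0],               # state: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)  # only position is measured
Q = 0.01 * np.eye(4)                       # process noise
R = 2.0 * np.eye(2)                        # measurement noise

x = np.array([50.0, 10.0, 0.0, 3.0])       # initial guess: drop falling along +y
P = 10.0 * np.eye(4)

measurements = [(50.2, 13.1), (49.8, 16.3), (50.1, 18.9), (50.0, 22.2)]
for z in measurements:
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    y = np.asarray(z) - H @ x              # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    print(f"estimated position ({x[0]:.1f}, {x[1]:.1f}), vy={x[3]:.2f}")
```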

22 pages, 22132 KiB  
Article
Robust 3D Hand Detection from a Single RGB-D Image in Unconstrained Environments
by Chi Xu, Jun Zhou, Wendi Cai, Yunkai Jiang, Yongbo Li and Yi Liu
Sensors 2020, 20(21), 6360; https://doi.org/10.3390/s20216360 - 7 Nov 2020
Cited by 8 | Viewed by 3857
Abstract
Three-dimensional hand detection from a single RGB-D image is an important technology which supports many useful applications. Practically, it is challenging to robustly detect human hands in unconstrained environments because the RGB-D channels can be affected by many uncontrollable factors, such as light changes. To tackle this problem, we propose a 3D hand detection approach which improves the robustness and accuracy by adaptively fusing the complementary features extracted from the RGB-D channels. Using the fused RGB-D feature, the 2D bounding boxes of hands are detected first, and then the 3D locations along the z-axis are estimated through a cascaded network. Furthermore, we present a challenging RGB-D hand detection dataset collected in unconstrained environments. Different from previous works, which primarily rely on either the RGB or D channel, we adaptively fuse the RGB-D channels for hand detection. Notably, the evaluation results show that the D channel is crucial for hand detection in unconstrained environments. Our RGB-D fusion-based approach significantly improves the hand detection accuracy from 69.1 to 74.1 compared with one of the state-of-the-art RGB-based hand detectors. The existing RGB- or D-based methods are unstable in unseen lighting conditions: in dark conditions, the accuracy of the RGB-based method drops significantly to 48.9, and in back-light conditions, the accuracy of the D-based method drops dramatically to 28.3. Compared with these methods, our RGB-D fusion-based approach is much more robust, without accuracy degradation, achieving accuracies of 62.5 and 65.9 in these two extreme lighting conditions, respectively. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
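
A minimal sketch of adaptive RGB-D feature fusion is given below: per-pixel gates computed from both modalities weight the RGB and depth feature maps before they are combined. This is a generic gating formulation rather than the authors' exact network; channel counts are arbitrary.

```python
# Gated fusion of RGB and depth feature maps.
import torch
import torch.nn as nn

class AdaptiveRGBDFusion(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2, kernel_size=1),   # one weight per modality
        )

    def forward(self, f_rgb, f_depth):
        w = self.gate(torch.cat([f_rgb, f_depth], dim=1)).softmax(dim=1)
        return w[:, 0:1] * f_rgb + w[:, 1:2] * f_depth

fusion = AdaptiveRGBDFusion()
fused = fusion(torch.randn(1, 64, 56, 56), torch.randn(1, 64, 56, 56))
print(fused.shape)  # torch.Size([1, 64, 56, 56])
```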

26 pages, 3014 KiB  
Article
An API for Wearable Environments Development and Its Application to mHealth Field
by Fabio Sartori
Sensors 2020, 20(21), 5970; https://doi.org/10.3390/s20215970 - 22 Oct 2020
Cited by 4 | Viewed by 3725
Abstract
Wearable technologies are transforming research in traditional paradigms of software and knowledge engineering. Among them, expert systems have the opportunity to deal with knowledge bases that dynamically vary according to real-time data collected by position sensors, movement sensors, etc. However, it is necessary to design and implement appropriate architectural solutions to avoid making expert systems responsible for data acquisition and representation. These solutions should be able to collect and store data according to expert system desiderata, building a homogeneous framework where data reliability and interoperability among the data acquisition, data representation, and data use levels are guaranteed. To this aim, the notion of a wearable environment has been introduced to treat all those information sources as components of a larger platform; a middleware, namely WEAR-IT, has been designed and implemented, which allows each sensor to be considered as a source of information that can be dynamically tied to an expert system application running on a smartphone. As an application example, the mHealth domain is considered. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
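
The snippet below sketches the middleware idea in a few lines: sensors publish samples to named sources, and an expert-system application subscribes to them without managing acquisition itself. All class and method names are invented for illustration and are not the actual WEAR-IT API.

```python
# Hypothetical registry decoupling data acquisition (sensors) from data use (apps).
from typing import Callable, Dict, List

class WearableEnvironment:
    """Routes sensor samples to subscribed application handlers."""
    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = {}

    def subscribe(self, source: str, handler: Callable[[dict], None]) -> None:
        self._subscribers.setdefault(source, []).append(handler)

    def publish(self, source: str, sample: dict) -> None:
        for handler in self._subscribers.get(source, []):
            handler(sample)

env = WearableEnvironment()
# An expert-system rule that reacts to heart-rate samples pushed by a sensor.
env.subscribe("heart_rate", lambda s: print("alert!" if s["bpm"] > 150 else "ok"))
env.publish("heart_rate", {"bpm": 162, "t": 12.4})   # -> alert!
```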

17 pages, 2315 KiB  
Article
Energy Efficient Pupil Tracking Based on Rule Distillation of Cascade Regression Forest
by Sangwon Kim, Mira Jeong and Byoung Chul Ko
Sensors 2020, 20(18), 5141; https://doi.org/10.3390/s20185141 - 9 Sep 2020
Cited by 10 | Viewed by 2429
Abstract
As the demand for human-friendly computing increases, research on pupil tracking to facilitate human–machine interactions (HCIs) is being actively conducted. Several successful pupil tracking approaches have been developed using images and a deep neural network (DNN). However, common DNN-based methods not only require tremendous computing power and energy consumption for learning and prediction; they are also impossible to interpret because they apply a black-box model with an unknown prediction process. In this study, we propose a lightweight pupil tracking algorithm for on-device machine learning (ML) using a fast and accurate cascade deep regression forest (RF) instead of a DNN. Pupil estimation is applied in a coarse-to-fine manner in a layer-by-layer RF structure, and each RF is simplified using the proposed rule distillation algorithm, which removes unimportant rules constituting the RF. The goal of the proposed algorithm is to produce a more transparent and adoptable model for application to on-device ML systems, while maintaining precise pupil tracking performance. Our proposed method experimentally achieves outstanding speed, a reduction in the number of parameters, and better pupil tracking performance compared to several other state-of-the-art methods using only a CPU. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
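
The coarse-to-fine cascade idea can be sketched with off-the-shelf random forests, each stage regressing a residual correction to the current pupil estimate. The toy features below are synthetic, and the paper's rule distillation step is not reproduced.

```python
# Cascade of random forests that iteratively refines a pupil-centre estimate.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n, d = 500, 32
true_xy = rng.uniform(20, 40, size=(n, 2))          # ground-truth pupil centres
features = rng.normal(scale=2.0, size=(n, d))
features[:, :2] += true_xy                           # toy features carrying noisy location cues

estimate = np.full((n, 2), 30.0)                     # coarse initial guess (patch centre)
stages = []
for step in range(3):                                # cascade of three forests
    residual = true_xy - estimate
    rf = RandomForestRegressor(n_estimators=50, max_depth=8, random_state=step)
    rf.fit(np.hstack([features, estimate]), residual)
    stages.append(rf)
    estimate = estimate + rf.predict(np.hstack([features, estimate]))
    print("stage", step, "mean error:",
          round(np.linalg.norm(true_xy - estimate, axis=1).mean(), 2))
```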

22 pages, 6449 KiB  
Article
Feature Channel Expansion and Background Suppression as the Enhancement for Infrared Pedestrian Detection
by Shengzhe Wang, Bo Wang, Shifeng Wang and Yifeng Tang
Sensors 2020, 20(18), 5128; https://doi.org/10.3390/s20185128 - 9 Sep 2020
Cited by 4 | Viewed by 2435
Abstract
Pedestrian detection is an important task in many intelligent systems, particularly driver assistance systems. Recent studies on pedestrian detection in infrared (IR) imagery have employed data-driven approaches. However, two problems in deep learning-based detection are the implicit performance and time-consuming training. In this paper, a novel channel expansion technique based on feature fusion is proposed to enhance the IR imagery and accelerate the training process. In addition, a novel background suppression method is proposed to simulate the attention principle of human vision and shrink the region of detection. A precise fusion algorithm is designed to combine the information from different visual saliency maps in order to reduce the effect of truncation and missed detection. Four different experiments are performed from various perspectives in order to gauge the efficiency of our approach. The experimental results show that the mean average precisions (mAPs) on four different datasets are increased by 5.22% on average. The results prove that background suppression and suitable feature expansion accelerate the training process and enhance the performance of IR image-based deep learning models. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
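
Below is a hedged sketch of channel expansion for single-channel IR frames: the raw image is stacked with two cheap saliency-style maps to form a three-channel detector input, and low-saliency background is suppressed. The specific maps (histogram equalization and Canny edges) are illustrative choices, not the paper's fusion algorithm.

```python
# Expand a single-channel IR frame into three channels and suppress background.
import cv2
import numpy as np

def expand_ir_channels(ir: np.ndarray) -> np.ndarray:
    """ir: single-channel uint8 image -> HxWx3 uint8 detector-ready image."""
    equalized = cv2.equalizeHist(ir)                       # contrast saliency
    edges = cv2.Canny(ir, 50, 150)                         # gradient saliency
    expanded = np.dstack([ir, equalized, edges])
    # Background suppression: zero out pixels that no map considers salient.
    mask = (equalized > 100) | (edges > 0)
    return expanded * mask[..., None].astype(np.uint8)

ir_frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
print(expand_ir_channels(ir_frame).shape)   # (480, 640, 3)
```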

20 pages, 25616 KiB  
Article
Blind First-Order Perspective Distortion Correction Using Parallel Convolutional Neural Networks
by Neil Patrick Del Gallego, Joel Ilao and Macario Cordel II
Sensors 2020, 20(17), 4898; https://doi.org/10.3390/s20174898 - 30 Aug 2020
Cited by 5 | Viewed by 5454
Abstract
In this work, we present a network architecture with parallel convolutional neural networks (CNN) for removing perspective distortion in images. While other works generate corrected images through the use of generative adversarial networks or encoder-decoder networks, we propose a method wherein three CNNs are trained in parallel, to predict a certain element pair in the 3×3 transformation matrix, M̂. The corrected image is produced by transforming the distorted input image using the inverse matrix M̂⁻¹. The networks are trained from our generated distorted image dataset using KITTI images. Experimental results show promise in this approach, as our method is capable of correcting perspective distortions on images and outperforms other state-of-the-art methods. Our method also recovers the intended scale and proportion of the image, which is not observed in other works. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
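
The correction step itself is straightforward once M̂ is known: warp the distorted image with the inverse transform. The sketch below hard-codes a matrix in place of a network prediction and uses a random image as a stand-in input.

```python
# Undo a perspective distortion by warping with the inverse of the predicted matrix.
import cv2
import numpy as np

M_hat = np.array([[1.0, 0.05, 0.0],
                  [0.02, 1.0, 0.0],
                  [1e-4, 2e-4, 1.0]])       # stand-in for the predicted 3x3 distortion

distorted = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # stand-in image
corrected = cv2.warpPerspective(distorted, np.linalg.inv(M_hat), (640, 480))
print(corrected.shape)                       # (480, 640, 3)
```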

23 pages, 13788 KiB  
Article
A New Volumetric Fusion Strategy with Adaptive Weight Field for RGB-D Reconstruction
by Xinqi Liu, Jituo Li and Guodong Lu
Sensors 2020, 20(15), 4330; https://doi.org/10.3390/s20154330 - 3 Aug 2020
Cited by 4 | Viewed by 3190
Abstract
High-quality 3D reconstruction results are very important in many application fields. However, current texture generation methods based on point sampling and fusion often produce blur. To solve this problem, we propose a new volumetric fusion strategy which can be embedded in current online and offline reconstruction frameworks as a basic module to achieve excellent geometry and texture effects. The improvement comes from two aspects. Firstly, we establish an adaptive weight field to evaluate and adjust the reliability of data from RGB-D images by using a probabilistic and heuristic method. By using this adaptive weight field to guide the voxel fusion process, we can effectively preserve the local texture structure of the mesh, avoid incorrect texture problems, and suppress the influence of outlier noise on the geometric surface. Secondly, we use a new texture fusion strategy that combines replacement, integration, and fixedness operations to fuse and update voxel textures and thus reduce blur. Experimental results demonstrate that, compared with the classical KinectFusion, our approach can significantly improve the accuracy in geometry and texture clarity, and can achieve, in real time, texture reconstruction effects equivalent to those of offline reconstruction methods such as Intrinsic3D, and even better results in relief scenes. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
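
The adaptive-weight fusion can be pictured as a weighted running average per voxel, where a per-observation reliability weight controls how strongly new RGB-D data updates the stored value. The reliability values and grid layout below are placeholders, not the paper's weight field.

```python
# Weighted volumetric fusion: each voxel keeps a running value and cumulative weight.
import numpy as np

class VoxelGrid:
    def __init__(self, shape=(64, 64, 64)):
        self.value = np.zeros(shape, dtype=np.float32)    # e.g. TSDF or a colour channel
        self.weight = np.zeros(shape, dtype=np.float32)

    def integrate(self, idx, observation, reliability):
        """Weighted running average; low-reliability observations barely move the voxel."""
        w_old = self.weight[idx]
        self.value[idx] = (w_old * self.value[idx] + reliability * observation) / (w_old + reliability)
        self.weight[idx] = w_old + reliability

grid = VoxelGrid()
vox = (10, 20, 30)
grid.integrate(vox, observation=0.8, reliability=1.0)    # confident observation
grid.integrate(vox, observation=0.1, reliability=0.05)   # noisy/outlier observation
print(round(float(grid.value[vox]), 3))                  # stays close to 0.8
```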

22 pages, 18485 KiB  
Article
A Multi-Level Approach to Waste Object Segmentation
by Tao Wang, Yuanzheng Cai, Lingyu Liang and Dongyi Ye
Sensors 2020, 20(14), 3816; https://doi.org/10.3390/s20143816 - 8 Jul 2020
Cited by 46 | Viewed by 7568
Abstract
We address the problem of localizing waste objects from a color image and an optional depth image, which is a key perception component for robotic interaction with such objects. Specifically, our method integrates the intensity and depth information at multiple levels of spatial granularity. Firstly, a scene-level deep network produces an initial coarse segmentation, based on which we select a few potential object regions to zoom in and perform fine segmentation. The results of the above steps are further integrated into a densely connected conditional random field that learns to respect the appearance, depth, and spatial affinities with pixel-level accuracy. In addition, we create a new RGBD waste object segmentation dataset, MJU-Waste, that is made public to facilitate future research in this area. The efficacy of our method is validated on both MJU-Waste and the Trash Annotation in Context (TACO) dataset. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
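
A simplified sketch of the multi-level strategy is shown below: connected components of a coarse scene-level mask are cropped with a margin and re-segmented at a finer scale, with the results pasted back. The two models are stand-in callables, and the dense CRF refinement stage is omitted.

```python
# Coarse scene-level segmentation followed by zoom-in re-segmentation of each region.
import numpy as np
from scipy import ndimage

def coarse_model(image):                       # placeholder for a scene-level network
    return (image[..., 0] > 0.7).astype(np.uint8)

def fine_model(crop):                          # placeholder for an object-level network
    return (crop[..., 0] > 0.6).astype(np.uint8)

def multi_level_segment(image, margin=8):
    coarse = coarse_model(image)
    refined = coarse.copy()
    labels, n = ndimage.label(coarse)
    for obj in ndimage.find_objects(labels):
        ys, xs = obj
        y0, y1 = max(ys.start - margin, 0), min(ys.stop + margin, image.shape[0])
        x0, x1 = max(xs.start - margin, 0), min(xs.stop + margin, image.shape[1])
        refined[y0:y1, x0:x1] = fine_model(image[y0:y1, x0:x1])
    return refined

img = np.zeros((240, 320, 3))
img[40:90, 50:110, 0] = 0.9                    # two synthetic "waste objects"
img[150:200, 200:260, 0] = 0.8
print(multi_level_segment(img).sum(), "pixels labelled as waste")
```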

17 pages, 9750 KiB  
Article
3D Contact Position Estimation of Image-Based Areal Soft Tactile Sensor with Printed Array Markers and Image Sensors
by Jong-il Lee, Suwoong Lee, Hyun-Min Oh, Bo Ram Cho, Kap-Ho Seo and Min Young Kim
Sensors 2020, 20(13), 3796; https://doi.org/10.3390/s20133796 - 7 Jul 2020
Cited by 5 | Viewed by 3761
Abstract
Tactile sensors have been widely used and researched in various fields of medical and industrial applications. Gradually, they will be used as new input devices and contact sensors for interactive robots. If a tactile sensor is to be applied to various forms of human–machine interaction, it needs to be soft to ensure comfort and safety, and it should be easily customizable and inexpensive. The purpose of this study is to estimate the 3D contact position of a novel image-based areal soft tactile sensor (IASTS) using printed array markers and multiple cameras. First, we introduce the hardware structure of the prototype IASTS, which consists of a soft material with printed array markers and multiple cameras with LEDs. Second, an estimation algorithm for the contact position is proposed based on the image processing of the array markers and their Gaussian fittings. A series of basic experiments was conducted and their results were analyzed to verify the effectiveness of the proposed IASTS hardware and its estimation software. To ensure the stability of the estimated contact positions, a Kalman filter was developed. Finally, it was shown that the contact positions on the IASTS were estimated with a reasonable error value for soft haptic applications. Full article
(This article belongs to the Special Issue Intelligent Sensors and Computer Vision)
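
Sub-pixel marker localisation by Gaussian fitting, as mentioned in the abstract, can be sketched as follows: a 2D Gaussian is fitted to a marker's intensity blob to recover its centre. The synthetic patch stands in for a real camera view, and the multi-camera triangulation step is not shown.

```python
# Fit a 2D Gaussian to a marker blob to estimate its centre with sub-pixel precision.
import numpy as np
from scipy.optimize import curve_fit

def gaussian2d(coords, amp, x0, y0, sigma):
    x, y = coords
    return amp * np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))

# Synthetic 21x21 patch containing a blurred marker centred at (10.4, 9.7).
yy, xx = np.mgrid[0:21, 0:21]
patch = gaussian2d((xx, yy), 1.0, 10.4, 9.7, 2.5) + 0.02 * np.random.randn(21, 21)

popt, _ = curve_fit(
    lambda c, amp, x0, y0, s: gaussian2d(c, amp, x0, y0, s).ravel(),
    (xx, yy), patch.ravel(), p0=(1.0, 10.0, 10.0, 2.0))
print(f"estimated marker centre: ({popt[1]:.2f}, {popt[2]:.2f})")
```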
