Convolutional Neural Networks and Vision Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence Circuits and Systems (AICAS)".

Deadline for manuscript submissions: closed (15 July 2021) | Viewed by 55971

Special Issue Information

Dear Colleagues,

Processing speed is critical for visual inspection automation and mobile visual computing applications. Many powerful and sophisticated computer vision algorithms generate accurate results but require high computational power or resources and are therefore not well suited to real-time vision applications. On the other hand, there are vision algorithms and convolutional neural networks that perform at camera frame rates with moderately reduced accuracy, which arguably makes them better suited to real-time use. This Special Issue invites research on the design, optimization, and implementation of machine-learning-based vision algorithms or convolutional neural networks that are suitable for real-time vision applications.

General topics covered in this special issue include, but are not limited to:

  • Optimization of software-based vision algorithms
  • CNN architecture optimizations for real-time performance
  • CNN acceleration through approximate computing
  • CNN applications that require real-time performance
  • Tradeoff analysis between speed and accuracy in CNNs
  • GPU-based implementations for real-time CNN performance
  • FPGA-based implementations for real-time CNN performance
  • Embedded vision systems for applications that require real-time performance
  • Machine vision applications that require real-time performance

Prof. Dr. D. J. Lee
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (13 papers)


Research

14 pages, 34742 KiB  
Article
Implementation of an Award-Winning Invasive Fish Recognition and Separation System
by Jin Chai, Dah-Jye Lee, Beau Tippetts and Kirt Lillywhite
Electronics 2021, 10(17), 2182; https://doi.org/10.3390/electronics10172182 - 6 Sep 2021
Viewed by 2218
Abstract
The state of Michigan, U.S.A., was awarded USD 1 million in March 2018 for the Great Lakes Invasive Carp Challenge. The challenge sought new and novel technologies to function independently of, or in conjunction with, the fish deterrents already in place to prevent the movement of invasive carp species into the Great Lakes from the Illinois River through the Chicago Area Waterway System (CAWS). Our team proposed an environmentally friendly, low-cost, vision-based fish recognition and separation system, which won fourth place in the challenge among 353 participants from 27 countries. The proposed solution includes an underwater imaging system that captures fish images for processing, a fish species recognition algorithm that identifies invasive carp species, and a mechanical system that guides fish movement and restrains invasive fish for removal. We used our evolutionary learning-based algorithm to recognize fish species, which is considered the most challenging task of this solution. The algorithm was tested on a fish dataset consisting of four invasive and four non-invasive fish species. It achieved a remarkable 1.58% error rate, which is more than adequate for the proposed system, and required only a small number of images for training. This paper details the design of this unique solution and the implementation and testing that have been accomplished since the challenge.

14 pages, 33093 KiB  
Article
Underwater Target Recognition Based on Improved YOLOv4 Neural Network
by Lingyu Chen, Meicheng Zheng, Shunqiang Duan, Weilin Luo and Ligang Yao
Electronics 2021, 10(14), 1634; https://doi.org/10.3390/electronics10141634 - 9 Jul 2021
Cited by 49 | Viewed by 4657
Abstract
The YOLOv4 neural network is employed for underwater target recognition. To improve the accuracy and speed of recognition, the structure of YOLOv4 is modified by replacing the upsampling module with a deconvolution module and by incorporating depthwise separable convolution into the network. Moreover, the training set is preprocessed with a modified mosaic augmentation, in which the gray world algorithm is used to derive two images when performing mosaic augmentation. The recognition results and the comparison with other target detectors demonstrate the effectiveness of the proposed YOLOv4 structure and the data preprocessing method. According to both subjective and objective evaluation, the proposed target recognition strategy effectively improves the accuracy and speed of underwater target recognition and reduces the hardware performance required.
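The abstract does not give the authors' exact preprocessing code; the following is a minimal NumPy/OpenCV sketch of one possible reading, in which the gray world algorithm rescales each color channel to a common mean and a mosaic is built from both the raw and the color-corrected crops, so that each group of four training images "derives two images". All function names, the 416-pixel canvas, and the omission of bounding-box bookkeeping are assumptions, not the paper's implementation.

```python
import cv2
import numpy as np

def gray_world(img):
    """Gray-world color constancy: scale each channel so its mean
    matches the mean intensity over all channels."""
    img = img.astype(np.float32)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gain = channel_means.mean() / (channel_means + 1e-6)
    return np.clip(img * gain, 0, 255).astype(np.uint8)

def mosaic(images, size=416):
    """Stitch four images into one mosaic around a random center point
    (bounding-box bookkeeping omitted for brevity)."""
    assert len(images) == 4
    cx, cy = np.random.randint(size // 4, 3 * size // 4, size=2)
    canvas = np.zeros((size, size, 3), dtype=np.uint8)
    regions = [(0, 0, cx, cy), (cx, 0, size, cy),
               (0, cy, cx, size), (cx, cy, size, size)]
    for img, (x1, y1, x2, y2) in zip(images, regions):
        canvas[y1:y2, x1:x2] = cv2.resize(img, (x2 - x1, y2 - y1))
    return canvas

def augment(images):
    # One reading of "derive two images": mosaic the raw crops and the
    # gray-world-corrected crops, doubling the augmented samples.
    return mosaic(images), mosaic([gray_world(im) for im in images])
```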

17 pages, 98514 KiB  
Article
CNN Algorithm for Roof Detection and Material Classification in Satellite Images
by Jonguk Kim, Hyansu Bae, Hyunwoo Kang and Suk Gyu Lee
Electronics 2021, 10(13), 1592; https://doi.org/10.3390/electronics10131592 - 1 Jul 2021
Cited by 12 | Viewed by 7459
Abstract
This paper proposes an algorithm for extracting the locations of buildings from satellite imagery and using that information to analyze the roof. The roof material is determined by detecting the position of each building in broad satellite images and evaluating the conditions where the building is located. An incomplete roof or unsuitable roofing material greatly increases the possibility of severe damage in disaster situations or from external shocks. To address these problems, we propose an algorithm that detects roofs and classifies their materials in satellite images. Areas where buildings are likely to exist are first located based on roads in the satellite imagery. Using images of the detected buildings, the roof material is then classified with a proposed convolutional neural network (CNN) model consisting of 43 layers. In summary, this paper proposes a CNN structure that detects areas containing buildings in large images and classifies the roof materials in the detected areas.

16 pages, 12723 KiB  
Article
Pointless Pose: Part Affinity Field-Based 3D Pose Estimation without Detecting Keypoints
by Jue Wang and Zhigang Luo
Electronics 2021, 10(8), 929; https://doi.org/10.3390/electronics10080929 - 13 Apr 2021
Cited by 3 | Viewed by 2435
Abstract
Human pose estimation finds application in an extremely wide domain and is therefore never pointless. We propose in this paper a new approach that, unlike any prior work we are aware of, bypasses the 2D keypoint detection step on which the 3D pose estimate is usually based, and is thus pointless. Our motivation is straightforward: 2D keypoint detection is vulnerable to occlusions and out-of-image absences, in which case the 2D errors propagate to the 3D recovery and degrade the results. To this end, we resort to explicitly estimating the human body regions of interest (ROI) and their 3D orientations. Even if a portion of the human body, such as the lower arm, is partially absent, the orientation vector predicted from the upper arm takes advantage of the local image evidence and recovers the 3D pose. This is achieved, specifically, by deforming a skeleton-shaped puppet template to fit the estimated orientation vectors. Despite its simple nature, the proposed approach yields robust, state-of-the-art results on several benchmarks and in-the-wild data.
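The puppet-fitting step in the paper deforms a full skeleton template; as a toy illustration of the underlying idea only (the joint names and bone lengths below are invented, and no fitting optimization is shown), per-bone unit orientation vectors can be turned into 3D joint positions by walking the template from a root joint:

```python
import numpy as np

# Hypothetical template: parent joint and bone length for each joint.
SKELETON = {
    "neck":       ("pelvis",     0.50),
    "head":       ("neck",       0.25),
    "l_shoulder": ("neck",       0.20),
    "l_elbow":    ("l_shoulder", 0.28),
    "l_wrist":    ("l_elbow",    0.25),
}

def fit_puppet(orientations, root=np.zeros(3)):
    """Recover 3D joint positions from predicted per-bone unit orientation
    vectors by forward kinematics from the root joint."""
    joints = {"pelvis": root}
    for joint, (parent, length) in SKELETON.items():
        d = np.asarray(orientations[joint], dtype=float)
        d /= np.linalg.norm(d) + 1e-8          # enforce unit length
        joints[joint] = joints[parent] + length * d
    return joints

# Even if a joint's own keypoint is occluded, the orientation predicted
# from the parent limb still places it in 3D:
pose = fit_puppet({j: [0.0, -1.0, 0.1] for j in SKELETON})
```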

10 pages, 2491 KiB  
Article
Forest Fire Smoke Recognition Based on Anchor Box Adaptive Generation Method
by Enting Zhao, Yang Liu, Junguo Zhang and Ye Tian
Electronics 2021, 10(5), 566; https://doi.org/10.3390/electronics10050566 - 27 Feb 2021
Cited by 19 | Viewed by 2714
Abstract
There are major problems in the field of image-based forest fire smoke detection, including the low recognition rate caused by the changeable and complex appearance of smoke in the forest environment and the high false alarm rate caused by various interfering objects in the recognition process. Here, a forest fire smoke identification method based on the integration of environmental information is proposed. The model uses (1) Faster R-CNN as the basic framework, (2) a component perception module that generates a receptive field integrating environmental information through separable convolution to improve recognition accuracy, and (3) a multi-level Region of Interest (ROI) pooling structure to reduce the deviation caused by rounding in the ROI pooling process. The results showed that the model achieved a recognition accuracy of 96.72%, an Intersection over Union (IoU) of 78.96%, and an average recognition speed of 1.5 ms per image; the false alarm rate was 2.35% and the false-negative rate was 3.28%. Compared with other models, the proposed model effectively improves the recognition accuracy and speed of forest fire smoke detection, providing a technical basis for the real-time and accurate detection of forest fires.
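The component perception module and the multi-level ROI pooling structure are defined in the paper itself; as a hedged illustration of the rounding deviation they address, the PyTorch/torchvision snippet below (not the authors' module) contrasts quantized ROI pooling with bilinearly sampled ROI align on a region with fractional boundaries:

```python
import torch
from torchvision.ops import roi_pool, roi_align

# A 1x1x8x8 feature map whose values equal their column index, so any
# rounding of the ROI boundary shows up directly in the pooled output.
feat = torch.arange(8.0).repeat(8, 1).view(1, 1, 8, 8)

# One ROI in (batch_idx, x1, y1, x2, y2) format with non-integer corners.
rois = torch.tensor([[0.0, 1.4, 1.4, 6.6, 6.6]])

quantized = roi_pool(feat, rois, output_size=(2, 2), spatial_scale=1.0)
aligned = roi_align(feat, rois, output_size=(2, 2), spatial_scale=1.0,
                    sampling_ratio=2, aligned=True)
print(quantized.squeeze())  # ROI boundaries snapped to integer coordinates
print(aligned.squeeze())    # bilinear sampling keeps sub-pixel positions
```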

13 pages, 4580 KiB  
Article
SACN: A Novel Rotating Face Detector Based on Architecture Search
by Anping Song, Xiaokang Xu and Xinyi Zhai
Electronics 2021, 10(5), 558; https://doi.org/10.3390/electronics10050558 - 27 Feb 2021
Cited by 4 | Viewed by 2347
Abstract
Rotation-invariant face detection (RIFD) has been widely used in practical applications; however, the problem of adjusting for the rotation-in-plane (RIP) angle of the human face still remains. Recently, several methods based on neural networks have been proposed to solve the RIP angle problem. However, these methods have various limitations, including low detection speed, large model size, and limited detection accuracy. To solve these problems, we propose a new network, called the Searching Architecture Calibration Network (SACN), which utilizes architecture search, a fully convolutional network (FCN), and bounding box center cluster (CC). SACN was tested on the challenging Multi-Oriented Face Detection Data Set and Benchmark (MOFDDB) and achieved higher detection accuracy at almost the same speed as existing detectors. Moreover, the average angle error is reduced from the current 12.6° to 10.5°.

21 pages, 7498 KiB  
Article
An Anti-Counterfeiting Architecture for Traceability System Based on Modified Two-Level Quick Response Codes
by Shundao Xie and Hong-Zhou Tan
Electronics 2021, 10(3), 320; https://doi.org/10.3390/electronics10030320 - 29 Jan 2021
Cited by 14 | Viewed by 3565
Abstract
Traceability is considered a promising solution for product safety. However, the data in a traceability system are only a claim rather than a fact, so the quality and safety of a product cannot be guaranteed unless the authenticity of the product (i.e., counterfeit detection) can be ensured in the real world. In this paper, we focus on counterfeit detection for traceability systems. The risk of counterfeiting throughout a typical product life cycle in the supply chain is analyzed, and the corresponding requirements for tags, packages, and the traceability system are given to eliminate these risks. Based on this analysis, an anti-counterfeiting architecture for traceability systems based on two-level quick response (2LQR) codes is proposed, in which the problem of detecting a counterfeit product is transformed into the problem of detecting a copied 2LQR code tag. According to the characteristics of the traceability system, the generation process of the 2LQR code is modified, and a corresponding improved algorithm estimates the actual locations of patterns in the scanned image of the modified 2LQR code tag to improve copy-detection performance. A prototype system based on the proposed architecture is implemented, in which consumers can query traceability information by scanning the 2LQR code on the product package with any QR code reader. They can also scan the 2LQR code with a home or office scanner and send the scanned image to the system to perform counterfeit detection. Compared with other anti-counterfeiting solutions, the proposed architecture has the advantages of low cost, generality, and good performance, making it a promising replacement for existing anti-counterfeiting systems.

20 pages, 10184 KiB  
Article
Fine-Grained Recognition of Surface Targets with Limited Data
by Runze Guo, Bei Sun, Xiaotian Qiu, Shaojing Su, Zhen Zuo and Peng Wu
Electronics 2020, 9(12), 2044; https://doi.org/10.3390/electronics9122044 - 2 Dec 2020
Viewed by 2002
Abstract
Recognition of surface targets has a vital influence on the development of military and civilian applications such as maritime rescue patrols, illegal-vessel screening, and maritime operation monitoring. However, owing to the interference of visual similarity and environmental variations and the lack of high-quality datasets, accurate recognition of surface targets has always been a challenging task. In this paper, we introduce a multi-attention residual model based on deep learning, in which channel and spatial attention modules are applied for feature fusion. In addition, we use transfer learning to improve the feature expression capability of the model under conditions of limited data, and a metric-learning-based loss function is adopted to increase the distance between different classes. Finally, a dataset with eight types of surface targets is established. Comparative experiments on our self-built dataset show that the proposed method focuses more on discriminative regions, avoids problems such as vanishing gradients, and achieves better classification results than B-CNN, RA-CNN, MAMC, MA-CNN, and DFL-CNN.
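The abstract does not specify the attention design, so the following is a minimal CBAM-style PyTorch sketch of how channel and spatial attention can be applied to residual features; the module structure, reduction ratio, and kernel size are assumptions rather than the paper's architecture:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel attention (illustrative)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
    def forward(self, x):
        return x * self.fc(x)          # re-weight each channel

class SpatialAttention(nn.Module):
    """Spatial attention from pooled channel statistics (illustrative)."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)
    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx, _ = x.max(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                # re-weight each spatial location

# Fusing a residual block's output with both attentions:
feats = torch.randn(2, 256, 14, 14)
out = SpatialAttention()(ChannelAttention(256)(feats))
```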

16 pages, 3578 KiB  
Article
Defect Detection in Printed Circuit Boards Using You-Only-Look-Once Convolutional Neural Networks
by Venkat Anil Adibhatla, Huan-Chuang Chih, Chi-Chang Hsu, Joseph Cheng, Maysam F. Abbod and Jiann-Shing Shieh
Electronics 2020, 9(9), 1547; https://doi.org/10.3390/electronics9091547 - 22 Sep 2020
Cited by 116 | Viewed by 12974
Abstract
In this study, a deep learning algorithm based on the you-only-look-once (YOLO) approach is proposed for the quality inspection of printed circuit boards (PCBs). The high accuracy and efficiency of deep learning algorithms have resulted in their increased adoption in many fields, and accurate detection of defects in PCBs using deep learning algorithms, such as convolutional neural networks (CNNs), has garnered considerable attention. In the proposed method, highly skilled quality inspection engineers first use an interface to record and label defective PCBs. The data are then used to train a YOLO/CNN model to detect defects in PCBs. In this study, 11,000 images and a network of 24 convolutional layers and 2 fully connected layers were used. The proposed model achieved a defect detection accuracy of 98.79% on PCBs with a batch size of 32.

14 pages, 3279 KiB  
Article
Detection and Localization of Overlapped Fruits Application in an Apple Harvesting Robot
by Yuhua Jiao, Rong Luo, Qianwen Li, Xiaobo Deng, Xiang Yin, Chengzhi Ruan and Weikuan Jia
Electronics 2020, 9(6), 1023; https://doi.org/10.3390/electronics9061023 - 21 Jun 2020
Cited by 33 | Viewed by 4217
Abstract
For yield measurement in an apple orchard or the mechanical harvesting of apples, the target apple fruit must be identified accurately. However, in a natural scene, affected by the apple's growth posture and the camera position, apple images vary widely: apples may overlap or be shaded by one another, by leaves, or by stems. Accurately locating overlapped apples is a challenge; they increase the positioning time and reduce recognition efficiency, which in turn affects the harvesting efficiency of apple-harvesting robots or the accuracy of orchard yield measurement. In response to this problem, an overlapped circle positioning method based on local maxima is proposed. First, the apple image is transformed into the Lab color space and segmented by the K-means algorithm. Second, morphological operations such as erosion and dilation are applied to extract the outline of the apples, and image points are divided into central points, edge points, and outer points. Third, a fast algorithm computes every internal point's minimum distance to the edge, and the centers of the apples are obtained by finding the maxima among these distances. Last, each radius is obtained as the minimum distance between the center and the edge, completing the positioning. Experimental results showed that this method can locate overlapped apples accurately and quickly when the apple contour is complete, and it has practical applicability.
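A minimal OpenCV sketch of the pipeline as described, assuming red apples against a darker background; the cluster-selection rule, kernel sizes, minimum radius, and peak-suppression window are assumptions, and the paper's fast distance computation is replaced here by OpenCV's distance transform:

```python
import cv2
import numpy as np

def locate_overlapped_apples(img_bgr, k=2, min_radius=10):
    """Return (x, y, radius) circles for apples in a BGR image."""
    # 1) Lab color space + K-means segmentation.
    lab = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2LAB)
    pixels = lab.reshape(-1, 3).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1.0)
    _, labels, centers = cv2.kmeans(pixels, k, None, criteria,
                                    3, cv2.KMEANS_PP_CENTERS)
    # Assume the cluster with the highest a* mean (redness) is the apples.
    apple_cluster = int(np.argmax(centers[:, 1]))
    mask = (labels.reshape(lab.shape[:2]) == apple_cluster).astype(np.uint8) * 255

    # 2) Morphological cleanup (erosion/dilation via open and close).
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

    # 3) Each internal point's minimum distance to the edge; local maxima
    #    of this distance map are taken as circle centers.
    dist = cv2.distanceTransform(mask, cv2.DIST_L2, 5)
    dilated = cv2.dilate(dist, np.ones((15, 15), np.uint8))
    peaks = (dist == dilated) & (dist > min_radius)
    ys, xs = np.nonzero(peaks)

    # 4) Radius = the center's distance to the nearest edge point.
    return [(int(x), int(y), float(dist[y, x])) for x, y in zip(xs, ys)]
```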

21 pages, 8137 KiB  
Article
SR-SYBA: A Scale and Rotation Invariant Synthetic Basis Feature Descriptor with Low Memory Usage
by Meng Yu, Dong Zhang, Dah-Jye Lee and Alok Desai
Electronics 2020, 9(5), 810; https://doi.org/10.3390/electronics9050810 - 15 May 2020
Viewed by 2708
Abstract
Feature description plays an important role in image matching and is widely used in a variety of computer vision applications. As an efficient synthetic basis feature descriptor, SYnthetic BAsis (SYBA) requires low computational complexity and provides accurate matching results. However, the number of matched feature points generated by SYBA suffers under large image scaling and rotation variations. In this paper, we improve SYBA's scale and rotation invariance by adding an efficient pre-processing operation. The proposed algorithm, SR-SYBA, represents the scale of the feature region by the location of the maximum gradient response along the radial direction in the log-polar coordinate system. Based on this scale representation, it normalizes all feature regions to the same reference scale to provide scale invariance. The orientation of the feature region is represented as the orientation of the vector from the center of the feature region to its intensity centroid. Based on this orientation representation, all feature regions are rotated to the same reference orientation to provide rotation invariance. The original SYBA descriptor is then applied to the scale- and orientation-normalized feature regions for description and matching. Experimental results show that SR-SYBA greatly improves SYBA for image matching applications with scaling and rotation variations. SR-SYBA obtains comparable or better matching rates than mainstream algorithms while maintaining its advantages of much lower storage and simpler computation. SR-SYBA is also applied to a vision-based measurement application to demonstrate its image matching performance.
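The exact SR-SYBA formulation is given in the paper; the snippet below is only a rough OpenCV sketch of the two normalization steps described in the abstract, using a log-polar warp for the scale estimate and image moments for the intensity centroid. The bin counts, reference scale, and the simplified log-radius-to-scale mapping are assumptions:

```python
import cv2
import numpy as np

def normalize_region(patch, ref_scale=0.5, out_size=64):
    """Normalize a grayscale feature region before SYBA description
    (the SYBA descriptor itself is not shown here)."""
    h, w = patch.shape
    center = (w / 2.0, h / 2.0)
    max_radius = min(center)

    # Scale: location of the maximum gradient response along the radial
    # direction, found in the log-polar representation of the patch.
    polar = cv2.warpPolar(patch.astype(np.float32), (64, 180), center,
                          max_radius, cv2.WARP_POLAR_LOG)
    radial_grad = np.abs(np.diff(polar, axis=1)).mean(axis=0)
    scale_bin = int(np.argmax(radial_grad)) + 1
    scale = scale_bin / 64.0   # log-radius bin treated linearly (simplification)

    # Orientation: vector from the patch center to its intensity centroid.
    m = cv2.moments(patch)
    cx = m["m10"] / (m["m00"] + 1e-8)
    cy = m["m01"] / (m["m00"] + 1e-8)
    angle = np.degrees(np.arctan2(cy - center[1], cx - center[0]))

    # Rotate to the reference orientation and rescale to the reference scale.
    M = cv2.getRotationMatrix2D(center, angle, ref_scale / max(scale, 1e-3))
    warped = cv2.warpAffine(patch, M, (w, h))
    return cv2.resize(warped, (out_size, out_size))
```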

18 pages, 6596 KiB  
Article
Smart Camera for Quality Inspection and Grading of Food Products
by Zhonghua Guo, Meng Zhang, Dah-Jye Lee and Taylor Simons
Electronics 2020, 9(3), 505; https://doi.org/10.3390/electronics9030505 - 19 Mar 2020
Cited by 14 | Viewed by 4153
Abstract
Due to the increasing consumption of food products and the demand for food quality and safety, most food processing facilities in the United States use machines to automate processes such as cleaning, inspection and grading, packing, storing, and shipping. Machine vision technology has been a proven solution for the inspection and grading of food products since the late 1980s. The remaining challenges, especially for small to midsize facilities, include system and operating costs, the demand for highly skilled workers for complicated configuration and operation, and, in some cases, unsatisfactory results. This paper focuses on the development of an embedded solution with learning capability to alleviate these challenges. Three simple application cases are included to demonstrate the operation of this unique solution, and two datasets of more challenging cases were created to analyze and demonstrate the performance of our visual inspection algorithm. One dataset includes infrared images of Medjool dates with four levels of skin delamination for surface quality grading; the other consists of grayscale images of oysters with varying shapes for shape quality evaluation. Our algorithm achieved a grading accuracy of 95.0% on the date dataset and 98.6% on the oyster dataset, both easily surpassing manual grading, which constantly faces the challenges of human fatigue and other distractions. Details of the design and functions of our smart camera and our simple visual inspection algorithm are discussed in this paper.

21 pages, 3573 KiB  
Article
Optimization and Implementation of Synthetic Basis Feature Descriptor on FPGA
by Dah-Jye Lee, Samuel G. Fuller and Alexander S. McCown
Electronics 2020, 9(3), 391; https://doi.org/10.3390/electronics9030391 - 27 Feb 2020
Cited by 1 | Viewed by 2765
Abstract
Feature detection, description, and matching are crucial steps in many computer vision algorithms. These steps rely on feature descriptors to match image features across sets of images. Previous work has shown that our SYnthetic BAsis (SYBA) feature descriptor can offer superior performance to other binary descriptors. This paper focuses on various optimizations and a hardware implementation of the newer, optimized version. The hardware implementation on a field-programmable gate array (FPGA) is a high-throughput, low-latency solution, which is critical for applications such as high-speed object detection and tracking, stereo vision, visual odometry, structure from motion, and optical flow. We compare our solution to other hardware designs of binary descriptors and demonstrate that our hardware implementation of SYBA as a feature descriptor offers superior image feature matching performance while using fewer resources than most binary feature descriptor implementations.
