1. Introduction
In the context of “Made in China 2025”, intelligent manufacturing has become one of the key development fields [1,2]. Machine vision is an important part of intelligent manufacturing because it acquires image information from which the state of industrial target products can be judged accurately, in preparation for subsequent automated operations. At present, some important steps in the assembly line of integrated circuit boards are still completed by skilled workers, such as inserting the pins of electronic components (as shown in Figure 1) into the corresponding holes (as shown in Figure 2) and the quality control of finished products. Manual labor is not only time-consuming; the results of installation and testing also depend on the diligence and work experience of the installers. With the development of computer and information processing technology, object recognition based on deep learning has become one of the most popular directions in machine vision. Because of the complex background of industrial target products, problems such as aliasing, occlusion, and shadow often occur, and there is inter-class similarity, all of which make object recognition difficult.
Many researchers have studied object recognition. For example, Radeva et al. [3] introduced probability modeling using the Bayesian classification method in high-dimensional space to realize cork appearance detection and classification. Akhloufi et al. [4] proposed an effective color texture classification framework for complex industrial products, which combines the statistical features calculated from the generalized isotropic co-occurrence matrix extracted from the color bands with the image entropy. However, the limitation of these methods is that they have difficulty identifying products with similar color and the same texture. Hao et al. [5] proposed a color threshold determination (CTD) method to identify color markers, addressing the problem that traditional identification methods have low recognition accuracy, or fail entirely, in complex scenes containing multiple objects. The Adaboost cascade classifier based on the Histograms of Oriented Gradients (HOG) [6] feature was used to determine the color of each pixel in the candidate region of interest. Then, the color feature was matched according to the preset threshold, and the matching region was retained to obtain the final recognition result. This method performs well in color mark recognition in complex scenes, but it is not suitable for dense and overlapping target objects. Because of the complex background and multi-target aliasing, the objects are highly similar to one another; thus, these detection algorithms cannot accurately segment each object area among heavily overlapping industrial products, which makes it challenging to recognize the objects with traditional detection algorithms, as shown in Figure 1.
In recent years, deep learning technology has achieved great success in object recognition [7,8,9,10]. Unlike traditional algorithms, which rely on hand-crafted features, deep learning algorithms learn representations from large amounts of data and therefore generalize better. At the same time, because the model is scalable, it is more flexible in practical application. At present, deep learning technology is widely used in industrial fields [10,11], such as industrial object classification [7,12], industrial product defect detection [9,13], and fault diagnosis [14,15]. For example, the R-CNN [16] method proposed by Girshick et al. was a successful case of applying deep learning to object recognition. This method combines classical region proposals with a convolutional neural network (CNN) for object detection and classification. It was further improved in Fast RCNN [17] and Faster RCNN [18]. In [18], a region proposal network (RPN) was first used to obtain the regions of interest (ROIs). The bounding boxes were then classified using a classifier. These algorithms provide guidance for industrial product detection. Although the accuracy of the R-CNN family is satisfactory, it requires considerable computing power, which leads to a low detection speed on ordinary computers.
To overcome this problem, Redmon et al. [19] proposed a new neural network, YOLO, which directly predicts the bounding box of the target. The network is simpler and faster than R-CNN while maintaining high accuracy. YOLO does not need an RPN, as it directly performs regression to detect the objects in the image; thus, the detection speed is higher. Although the latest version of the YOLO network (YOLO-V3 [20]) improved the accuracy and speed of detection and is more suitable for small object detection, real-time detection in industrial applications still demands substantial hardware resources; thus, the network structure needs to be made lightweight. To this end, Google proposed the lightweight model Mobilenet [21] to improve the detection speed of neural network algorithms. Mobilenet, based on a streamlined architecture, uses depthwise separable convolution to build lightweight deep neural networks. It is an efficient network architecture that can be used to build small, low-latency models with low computational cost by setting its parameters.
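To make the saving from depthwise separable convolution concrete, the following minimal sketch counts the multiply-add operations of a standard convolution and of its depthwise separable counterpart (a depthwise k x k convolution followed by a 1 x 1 pointwise convolution). The function names and the layer sizes in the usage line are hypothetical and chosen only for illustration; they are not taken from the network described in this paper.

```python
def standard_conv_cost(k, c_in, c_out, h, w):
    """Multiply-adds of a standard k x k convolution producing an h x w map."""
    return k * k * c_in * c_out * h * w


def depthwise_separable_cost(k, c_in, c_out, h, w):
    """Multiply-adds of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in * h * w   # one k x k filter per input channel
    pointwise = c_in * c_out * h * w   # 1 x 1 conv that mixes the channels
    return depthwise + pointwise


def cost_ratio(k, c_in, c_out, h=1, w=1):
    """Separable cost divided by standard cost; equals 1/c_out + 1/k**2."""
    separable = depthwise_separable_cost(k, c_in, c_out, h, w)
    return separable / standard_conv_cost(k, c_in, c_out, h, w)


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 256 -> 256 channels, 13 x 13 output map.
    print(f"cost ratio: {cost_ratio(3, 256, 256, 13, 13):.3f}")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, so such a layer needs roughly eight to nine times fewer multiply-adds than a standard convolution of the same shape.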
Inspired by the above studies, this paper uses an improved YOLO-V3 algorithm for real-time detection of electronic components, in which the Mobilenet network is combined with the YOLO-V3 network to make it lighter.
The rest of this article is organized as follows: Section 2 introduces the construction of the image dataset, including image acquisition, image data enhancement, and label making. The improved YOLO-V3 algorithm is introduced in Section 3. Section 4 describes the experiments, compares the proposed method with the latest detection methods, and discusses the experimental results. Section 5 presents the conclusions and future prospects of this paper.
5. Conclusions
In this paper, an improved YOLO V3 model (YOLOV3–Mobilenet) for the detection of electronic components in complex backgrounds was proposed. To balance detection accuracy and speed, we incorporated the Mobilenet network framework to lighten the YOLO V3 network.
We collected 200 images containing four electronic components (capacitor of 470 μF, capacitor of 220 μF, capacitor of 22 μF, and inductor), applied four data augmentation techniques (contrast enhancement, noise addition, brightness conversion, and blurring) to build a dataset, and manually labeled the dataset.
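The four augmentation operations named above can be sketched with NumPy as follows. This is only an illustrative sketch under stated assumptions: the function names, the parameter values (contrast factor, noise standard deviation, brightness offset, kernel size), and the choice of Gaussian noise and a mean filter are assumptions for illustration, not the settings used to build the dataset in this paper.

```python
import numpy as np


def adjust_contrast(img, factor=1.5):
    """Scale each pixel's deviation from the mean intensity by `factor`."""
    f = img.astype(np.float32)
    out = (f - f.mean()) * factor + f.mean()
    return np.clip(out, 0, 255).astype(np.uint8)


def add_gaussian_noise(img, sigma=10.0, seed=0):
    """Add zero-mean Gaussian noise (assumed noise model) with std `sigma`."""
    rng = np.random.default_rng(seed)
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def adjust_brightness(img, delta=30):
    """Shift all pixel values by `delta`, clipping to the valid range."""
    return np.clip(img.astype(np.int16) + delta, 0, 255).astype(np.uint8)


def box_blur(img, k=3):
    """Blur an H x W x C image with a k x k mean filter (k odd, edge padding)."""
    pad = k // 2
    padded = np.pad(img.astype(np.float32),
                    ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w = img.shape[:2]
    acc = np.zeros(img.shape, dtype=np.float32)
    for dy in range(k):          # sum the k*k shifted copies of the image
        for dx in range(k):
            acc += padded[dy:dy + h, dx:dx + w]
    return np.round(acc / (k * k)).astype(np.uint8)


def augment(img):
    """Return the four augmented variants of one uint8 H x W x C image."""
    return [adjust_contrast(img), add_gaussian_noise(img),
            adjust_brightness(img), box_blur(img)]
```

None of these operations moves pixels geometrically, so in a pipeline like this sketch the image size is unchanged and bounding-box coordinates drawn on the original image would remain valid for each variant.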
To verify the validity of the proposed method, it was compared with some of the latest detection methods. The experimental results showed that, compared with the YOLO V3 model, the YOLOV3–Mobilenet model achieved a significant improvement in detection speed with similar accuracy. Furthermore, it had significant advantages over SSD and Faster R-CNN with the Resnet101 network.
YOLOV3–Mobilenet can now be used for the detection of electronic components, but there is still a gap between its performance and the requirements of real-time detection. Future work will focus on optimizing the existing models so that electronic components can be detected in video at real-time speed. We also plan to deploy the model on embedded devices to make it more portable. In addition, we will optimize the data augmentation techniques to further improve the detection accuracy.