1. Introduction
Bearings are commonly used mechanical components. A bearing’s main function is to support mechanical rotation and reduce friction during movement. Since the roller is the most important part of the bearing, its surface quality has a significant impact on the performance and even the service life of the bearing; the surface quality of the roller must therefore be extremely high. Bearing rollers are shown in Figure 1 below.
Rollers are the main pressure-bearing part in rolling bearings and are easily damaged due to defects and other factors. If there are defects on the roller surface, the stability of the bearing will be greatly reduced during use. Therefore, in mechanical design, the geometric accuracy and surface roughness requirements of the roller are typically one level higher than those of the ferrules and raceways. Among the rolling bearings, deep groove ball bearings are mainly used in small- and medium-sized equipment, while roller bearings are widely used in medium- and large-sized machines. They are widely used in passenger transportation, aerospace, and other transportation fields, as well as agricultural machinery, industrial machinery, medical equipment, and other related machinery industries. The bearing roller is the main research object of this paper.
Defects inevitably occur on the surface of bearing rollers during production. They are mainly distributed on the cylindrical surface, chamfers, and end surfaces. Common defect categories include damage and scratches caused by mechanical collision, corrosion caused by mechanical aging, and material lacking at the chamfer and grind lacking introduced in the production process, as can be seen in Figure 2 below.
The main defect categories are as follows:
Scratch, as shown in Figure 2a,b. A defect caused by a roller being scratched by other hard objects.
Damage, as shown in Figure 2c–f. We describe defects with large areas and irregular shapes as damage.
Corrosion, as shown in Figure 2g,h. The defects caused by corrosion.
Material lacking at the chamfer, as shown in Figure 2i,j. The roller is sunken at the chamfer, making the contour not a circle.
Grind lacking, as shown in Figure 2k. The defects caused by insufficient grinding.
Stamp lacking, as shown in Figure 2l. The defects caused by insufficient stamping.
Figure 2a,c,d,g,i,k,l are images of the end surfaces of the roller, and the rest are images of the cylindrical surface of the roller. These defects greatly influence the performance and stability of the bearing and must be detected. Visual inspection is a good solution because it can greatly reduce the amount of manual inspection. At present, visual inspection technology is used in many scenarios, such as chip pin and circuit solder inspection [1], workpiece vision measurement [2], plastic bottle defect detection [3], metal product surface defect detection [4,5,6], equipment parts identification and classification [7], gear and bearing surface inspection and measurement [8], bearing defect inspection [9], optical character recognition [10], and agricultural product identification [11]. Although visual inspection is widely used, many problems remain in applying it to roller surface inspection. Actual production still relies largely on manual inspection, and the inspection efficiency and quality are relatively low.
Traditional methods used in manufacturing, such as edge detection [12,13], segmentation [14], and line detection [15,16], can hardly extract the internal structure of defects or accurately classify each defect category. Generally, a defect is regarded as a target without distinction: the difference in reflection between the target and the background is used to separate the two, and the bearing roller is then judged as qualified or not according to the position and area of the target. Internal features of defects are not utilized at all. For this reason, textures, marks, oil stains, etc. are easily treated as defects, resulting in low accuracy and a low recall rate. Sometimes we also need to know exactly how many defect categories exist and calculate the frequency of each defect category in order to properly adjust the production process. This is not possible with the traditional surface inspection methods used in manufacturing.
The emergence of deep learning makes up for the disadvantages of traditional algorithms. Since deep learning algorithms have shown state-of-the-art performance in classification and object detection tasks [17], deep neural networks can be used to learn, from a large amount of data, the differences between defect categories and the commonalities within each category, so that accurate classification can be achieved.
For example, Daniel Weimer et al. explored how convolutional neural network architecture and different hyper-parameter settings affect feature extraction in industrial inspection [18]. Yiting Li et al. conducted research on a surface defect detection algorithm based on MobileNet-SSD, which showed that defect detection can be achieved using lightweight networks [19]. Xian Tao et al. designed a cascaded autoencoder architecture for segmenting and localizing defects [20], and showed that their method meets the robustness and accuracy requirements for metallic defect detection. Jinhua Lin et al. used a deep convolutional neural network to detect defects on castings. They established a convolutional neural network to extract defect features from a suspicious area, and the final detection accuracy was more than 96% [21]. S. Nahavand et al. used intelligent algorithms to detect defects on a metal surface [22]; Xian Tao et al. developed a machine vision device to detect defects on an electrical connector using convolutional neural networks, and they discussed the effects of data augmentation on defect recognition [23]; Yuan et al. used a modified segmentation method and deep neural networks to detect defects on the cover glass of mobile phones, and used a GAN to generate new data in order to overcome the drawbacks that arise when a large amount of data is unavailable [24].
This paper introduces a real-time machine vision system for bearing roller surface inspection, which can classify and locate the major categories of defects occurring on the surface of a bearing roller, and determine whether each bearing roller is qualified based on the position, category, and area of the defect. In order to meet industrial requirements, we propose a multi-task convolutional neural network framework for classifying and locating defects simultaneously. The simplified pipeline lays the foundation for future industrial applications. The system can replace manual inspection, and it performs better than the traditional algorithms.
Compared with existing surface inspection research based on deep learning, our method can achieve real-time performance because we use a multi-task learning strategy. The classification task is performed simultaneously with the localization task, making the entire model simpler and more efficient. Our system is a complete surface inspection system for bearing roller defect detection and quality evaluation, which has industrial application value.
The rest of the paper is organized as follows: Section 2 introduces the design of the visual inspection system, including the hardware system and software system; Section 3 elaborates on the defect detection method based on the convolutional neural network; Section 4 gives the implementation and results of the experiment; and Section 5 summarizes the whole paper.
2. System Overview
The visual inspection system mainly consisted of two parts: a hardware system and a software system.
The electrical part of the system was mainly composed of a PLC (Programmable Logic Controller) and an industrial computer. The PLC implemented motion control and digital I/O control. The industrial computer mainly implemented image acquisition, image processing, image analysis, and output. The industrial computer had an Intel Core i7-6700K CPU, an NVIDIA GTX 1080 GPU, and 128 GB of RAM, and ran Windows 10. The mechanical structure is shown in Figure 3 below. It mainly consisted of a feeding device, a feeding conveyor, a pushing mechanism, four cameras, four ring light sources, a strip light source, a cam, a receiving device, etc.
The bearing roller has two end surfaces and a cylindrical surface, so three workplaces were required for image acquisition. The conveyor conveyed the rollers to workplace 1, workplace 2, and workplace 3 in sequence, and triggered the corresponding image acquisition function. At these three workplaces, we used a total of four industrial cameras. At workplace 1 and workplace 2, the roller was stationary. We used two plane-array cameras, with a resolution of 2448 × 2050, to capture the two end surfaces of the roller. At workplace 3, the rollers began to roll under the action of the mechanism. We used two line-array cameras, with a resolution of 4K, to capture the cylindrical surface. As the cylindrical surface is the working surface of the bearing roller, we used two line-array cameras to prevent defects from being missed due to the rolling of the roller. The selection of the cameras was determined by the working distance and image definition requirements.
Visual inspection has strict requirements on illumination, and stable illumination ensures stable image quality. It is important to choose a light source targeted at the defect features. At workplace 1, we set up two ring light sources: a high-angle light source and a low-angle light source. The two ring light sources were arranged in front and rear. Since the end surface of the bearing roller contains planes and chamfers, it is not possible to illuminate both parts with only one light source, so we used two light sources to simultaneously illuminate the chamfer and the plane of the roller. The low-angle light source in front was responsible for illuminating the chamfer, and the high-angle light source behind was responsible for illuminating the plane. The light source setting at workplace 2 was the same as at workplace 1. At workplace 3, we used a strip light source to illuminate the cylindrical surface.
The software system was programmed in C# and C++. The user interface was written in C#, and the underlying algorithms were implemented in C++. The defect detection algorithm was developed using the PyTorch deep learning computing platform. Commonly used image processing algorithms, such as threshold segmentation and morphological processing, were implemented using OpenCV.
3. Surface Inspection Process
Bearing rollers have two end surfaces and a cylindrical surface. Since the cylindrical surface is the working surface of the bearing roller, a roller must be judged as unqualified if there are defects on it. If defects occur on the outer circumference of the end surfaces, such as material lacking at the chamfer and stamp lacking, they also affect the working surface, and the roller must also be judged as unqualified. For the defects inside the end surfaces, we can calculate the defect area to determine whether the roller is qualified.
Because the material lacking at the chamfer, represented by Figure 2i,j above, and the stamp lacking, represented by Figure 2l, can be detected and excluded first in the inspection process described below, our detection algorithm primarily detected and analyzed four categories of defects: damage, scratch, corrosion, and grind lacking. Details of these defect categories are shown in Figure 4 below. Defects other than those mentioned above are not discussed because of their low frequency of occurrence.
Image acquisition was performed at a suitable working distance. For each bearing roller, two images of the end surfaces were captured in total, one per end surface, and each image was cropped to a resolution of 416 × 416. For the cylindrical surface of each bearing roller, two images were captured, and the resolution was also 416 × 416 after cropping.
We note that, although defects of the same category differ in shape, they share features that can be extracted and classified by convolutional neural networks. In this section, we describe in detail the method for identifying various defects on bearing rollers. The complete process pipeline is shown in Figure 5 below.
The process consisted of the following three stages: First, contour detection. It is used to determine if the outer contour of the end surface is a standard circle and exclude the roller with a non-circular contour. Second, defect detection. It uses a multi-task learning convolutional neural network to classify and locate defects. Third, roller quality evaluation. It is used to determine whether the bearing roller is qualified according to the position, category, and area of the defect.
3.1. Contour Detection
In this part, we fitted the outer contour of the end surfaces of the roller using the Hough transform [25]. The pipeline is shown in Figure 6 below.
We performed the Hough circle detection 10 times for each end surface, and then took the average of the radius and the average of the center coordinates as the actual radius and center coordinates of the outer contour of the end surface. Then we used the Canny algorithm to extract the outer contour and calculated the standard deviation of the distance between the actual center coordinates and all points on the contour. The formulas were as follows:
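$$x_s = \frac{1}{10}\sum_{i=1}^{10} x_{si}, \qquad y_s = \frac{1}{10}\sum_{i=1}^{10} y_{si}, \qquad r_s = \frac{1}{10}\sum_{i=1}^{10} r_{si}$$

$$d_j = \sqrt{(x_j - x_s)^2 + (y_j - y_s)^2}$$

$$std = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(d_j - \bar{d}\right)^2}, \qquad \bar{d} = \frac{1}{n}\sum_{j=1}^{n} d_j$$

where $(x_{si}, y_{si})$ and $r_{si}$ are the center coordinates and the radius of the $i$-th circle detected by Hough circle detection; $(x_s, y_s)$ and $r_s$ are the actual center coordinates and the actual radius of the outer contour of the end surface; $(x_j, y_j)$ is the $j$-th of the $n$ points on the contour; $d_j$ is the distance between the $j$-th point on the contour and the coordinate $(x_s, y_s)$; and $std$ is the standard deviation of $d$.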
If the std was less than the set threshold (set to 0.4 by experiment), it meant that the outer contour of the current end surface was a circle, and the sample would be sent into the shared convolutional neural network to extract a feature map for defect classification and localization. On the contrary, if there was a defect at the contour of the end surface, and the outer contour was not a circle, then the bearing roller would be judged as unqualified.
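The circularity test described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names, the synthetic circle estimates, and the synthetic contours are hypothetical, the population form of the standard deviation is assumed, and the 0.4 threshold is taken from the text without knowing the exact units used by the authors. In practice the circle estimates and contour points would come from Hough circle detection and Canny edge extraction.

```python
import math


def mean_circle(circles):
    """Average repeated (x, y, r) circle estimates into one (x_s, y_s, r_s)."""
    n = len(circles)
    x_s = sum(c[0] for c in circles) / n
    y_s = sum(c[1] for c in circles) / n
    r_s = sum(c[2] for c in circles) / n
    return x_s, y_s, r_s


def radial_std(center, contour_points):
    """Population standard deviation of the center-to-contour distances d_j."""
    x_s, y_s = center
    d = [math.hypot(x - x_s, y - y_s) for x, y in contour_points]
    mean_d = sum(d) / len(d)
    return math.sqrt(sum((v - mean_d) ** 2 for v in d) / len(d))


def is_circular(circles, contour_points, threshold=0.4):
    """True when the contour deviates from a circle by less than the threshold."""
    x_s, y_s, _ = mean_circle(circles)
    return radial_std((x_s, y_s), contour_points) < threshold


# Synthetic data (hypothetical): three circle estimates and two 360-point contours.
circles = [(99.9, 100.1, 50.0), (100.0, 100.0, 50.0), (100.1, 99.9, 50.0)]
angles = [2 * math.pi * k / 360 for k in range(360)]
round_contour = [(100 + 50 * math.cos(a), 100 + 50 * math.sin(a)) for a in angles]
# A 40-degree arc pulled 5 pixels inward imitates material lacking at the chamfer.
dented_contour = [
    (100 + (45 if k < 40 else 50) * math.cos(a),
     100 + (45 if k < 40 else 50) * math.sin(a))
    for k, a in enumerate(angles)
]

print(is_circular(circles, round_contour))
print(is_circular(circles, dented_contour))
```

Averaging several circle estimates before measuring the radial spread keeps a single noisy detection from shifting the center, which would otherwise inflate the standard deviation of an intact contour.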
3.2. Feature Extraction Using CNN
In this part, we designed a 26-layer convolutional neural network for feature extraction. The design of this network draws on VGG [26] and ResNet [27]. First, we used small convolution kernels instead of large ones in order to reduce the computation and increase the network depth as well as the nonlinear mapping, so that the model’s data-fitting ability would be stronger. Second, we used 1 × 1 convolution kernels to compress the output of the 3 × 3 convolution kernels and thus reduce the computation of the network. Finally, following ResNet, we added shortcuts to the network in order to alleviate vanishing gradients during training. The structure is shown in Table 1 below. We pre-trained the network on the ImageNet dataset [28] to improve its generalization capability.
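The savings from small kernels and 1 × 1 compression can be made concrete with a short weight count. The sketch below is illustrative arithmetic only: the channel width of 256 and the layer pairings are hypothetical and are not taken from Table 1, and biases are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


C = 256  # hypothetical channel width, chosen only for illustration

# Small versus large kernels: two stacked 3x3 layers cover the same 5x5
# receptive field as one 5x5 layer, with fewer weights and one more nonlinearity.
one_5x5 = conv_params(5, C, C)
two_3x3 = 2 * conv_params(3, C, C)

# 1x1 compression: a 3x3 layer that doubles the channels followed by a 1x1 layer
# that halves them again, compared with using a second 3x3 layer for the reduction.
pair_3x3_1x1 = conv_params(3, C, 2 * C) + conv_params(1, 2 * C, C)
pair_3x3_3x3 = conv_params(3, C, 2 * C) + conv_params(3, 2 * C, C)

print(one_5x5, two_3x3)
print(pair_3x3_3x3, pair_3x3_1x1)
```

With these assumed widths, the stacked 3 × 3 layers need 18C² weights against 25C² for the single 5 × 5 layer, and the 3 × 3 plus 1 × 1 pair needs 20C² against 36C² for two 3 × 3 layers.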
3.3. Defect Classification and Localization
We classified the defects and calculated the position of the defects based on the feature map extracted by the CNN. We used a multi-task CNN architecture to unify classification and localization in order to simplify the entire inspection process. The loss function of the entire CNN was linearly weighted by the loss function of the classification task and the loss function of the localization task, as shown below:
$$L = L_{cls} + \alpha L_{loc}$$

where $L_{cls}$ is the loss function of the classification task, $L_{loc}$ is the loss function of the localization task, and $\alpha$ is the weight of $L_{loc}$.
3.3.1. Classification
The feature map was extracted by the convolutional neural network, and the dimension of the feature map was 13 × 13 × 1024. Each position of 13 × 13 represented a specific area in the original image. We followed the Single Shot MultiBox Detector (SSD) [29] and Faster R-CNN [30] to associate 6 anchor boxes with each location of the feature map. Each anchor box was responsible for predicting whether there was a defect at the position or not. If there was a defect, it would then predict the defect category and calculate the probability of the defect belonging to a certain defect category. In this paper, there were four categories of defects. The loss function of the classification task was defined as follows:
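$$L_{cls} = -\frac{1}{N}\left(\sum_{i \in Pos} x_{ij}^{p} \log \hat{c}_{i}^{p} + \sum_{i \in Neg} \log \hat{c}_{i}^{0}\right)$$

where $N$ is the total number of anchor boxes, $i$ refers to the anchor box index, $j$ refers to the ground-truth box index, $p$ refers to the category index, and 0 represents the background. $Pos$ and $Neg$ denote the anchor boxes matched to a defect and to the background, respectively. $x_{ij}^{p} = 1$ when category $p$ of the $i$-th anchor box and category $p$ of the $j$-th ground-truth box match, otherwise $x_{ij}^{p} = 0$. $\hat{c}_{i}^{p}$ indicates the predicted probability of the category $p$ corresponding to the $i$-th anchor box.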
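As an illustration of the classification task described above, the following sketch computes an SSD-style confidence loss in plain Python. It assumes a softmax over five classes (background at index 0 plus the four defect categories), averages over all anchor boxes, and omits refinements such as hard negative mining; the function names and the toy scores are hypothetical and are not the authors' implementation.

```python
import math

GRID, ANCHORS_PER_CELL = 13, 6
NUM_ANCHORS = GRID * GRID * ANCHORS_PER_CELL  # anchors on a 13 x 13 feature map
NUM_CLASSES = 5  # index 0 is background, 1..4 are the defect categories


def softmax(scores):
    """Numerically stable softmax over one anchor's raw class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classification_loss(scores, labels):
    """Mean negative log-probability of the assigned class over all anchors.

    scores: one list of NUM_CLASSES raw scores per anchor.
    labels: assigned class per anchor (0 for anchors matched to background).
    """
    loss = 0.0
    for anchor_scores, label in zip(scores, labels):
        loss -= math.log(softmax(anchor_scores)[label])
    return loss / len(scores)


# Toy example (hypothetical values): three anchors with uninformative scores.
uniform_scores = [[0.0] * NUM_CLASSES for _ in range(3)]
print(NUM_ANCHORS)
print(classification_loss(uniform_scores, [0, 1, 4]))
```

With uninformative scores every class receives probability 1/5, so the loss equals log 5; it falls toward zero as the predicted probability of the assigned class approaches one.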
3.3.2. Localization
If there was a defect in the current position, we calculated the IoU of each anchor box with the ground-truth box, and removed the anchor boxes whose IoU was smaller than the set threshold by non-maximum suppression, leaving the anchor boxes whose IoU was larger than the set threshold. The remaining boxes were our predicted boxes. IoU was defined as:

$$IoU = \frac{area(GT \cap PB)}{area(GT \cup PB)}$$

where $GT$ is the ground-truth box and $PB$ is the predicted box.
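The IoU computation and the threshold-based filtering can be sketched as follows. Boxes are assumed to be given as (center x, center y, width, height); the function names, the example boxes, and the threshold of 0.5 are hypothetical, since the text does not state the threshold value.

```python
def to_corners(box):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes in (cx, cy, w, h) form."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def filter_by_iou(boxes, ground_truth, threshold=0.5):
    """Keep only the boxes whose IoU with the ground-truth box reaches the threshold."""
    return [b for b in boxes if iou(b, ground_truth) >= threshold]


# Toy example (hypothetical boxes).
gt = (0.0, 0.0, 2.0, 2.0)
candidates = [(0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 2.0, 2.0), (5.0, 5.0, 2.0, 2.0)]
print([iou(c, gt) for c in candidates])
print(filter_by_iou(candidates, gt))
```

The second candidate overlaps the ground truth on half of its area, giving an intersection of 2 and a union of 6, so it is discarded at a threshold of 0.5.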
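Each predicted box contained four predicted values: the center coordinates $(x, y)$ of the box, and the width and height of the box. Through continuous iteration, the loss was gradually reduced, and the predicted box steadily approached the ground-truth box. The loss function was as follows:

$$L_{loc} = \frac{1}{N}\sum_{i}\sum_{j} x_{ij}^{p} \sum_{m \in \{x, y, w, h\}} smooth_{L1}\left(t_{i,m} - t_{j,m}^{*}\right)$$

where $smooth_{L1}$ is the Smooth L1 loss, $N$ is the total number of anchor boxes, and $x_{ij}^{p} = 1$ when category $p$ of the $i$-th anchor box and category $p$ of the $j$-th ground-truth box match, otherwise $x_{ij}^{p} = 0$. $t_i$ is a four-dimensional vector that represents the position of the predicted box. $t_j^{*}$ is a four-dimensional vector that represents the position of the ground-truth box. Their components were parameterized as:

$$t_x = \frac{x - x_a}{w_a}, \qquad t_y = \frac{y - y_a}{h_a}, \qquad t_w = \log\frac{w}{w_a}, \qquad t_h = \log\frac{h}{h_a}$$

$$t_x^{*} = \frac{x^{*} - x_a}{w_a}, \qquad t_y^{*} = \frac{y^{*} - y_a}{h_a}, \qquad t_w^{*} = \log\frac{w^{*}}{w_a}, \qquad t_h^{*} = \log\frac{h^{*}}{h_a}$$

where $x$ and $y$ denote the box’s center coordinates, and $w$ and $h$ denote its width and height, respectively. Variables $x$, $x_a$, and $x^{*}$ are for the predicted box, anchor box, and ground-truth box, respectively (likewise for $y$, $w$, and $h$).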
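The box parameterization and the Smooth L1 localization loss can be sketched in plain Python as below. This assumes the Faster R-CNN style encoding relative to the anchor box and the usual Smooth L1 definition (quadratic below 1, linear above). The function names and toy boxes are hypothetical, and placing the weight α on the localization term in `total_loss` is an assumption borrowed from SSD rather than something the text confirms.

```python
import math


def encode(box, anchor):
    """Encode a (cx, cy, w, h) box relative to an anchor as (t_x, t_y, t_w, t_h)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))


def smooth_l1(z):
    """Smooth L1: quadratic for |z| < 1, linear otherwise."""
    a = abs(z)
    return 0.5 * z * z if a < 1.0 else a - 0.5


def localization_loss(pred_boxes, gt_boxes, anchors, matched):
    """Smooth L1 loss over matched anchors, averaged over all anchors.

    matched[i] is the index of the ground-truth box assigned to anchor i,
    or None when anchor i is treated as background.
    """
    total = 0.0
    for i, j in enumerate(matched):
        if j is None:
            continue
        t = encode(pred_boxes[i], anchors[i])
        t_star = encode(gt_boxes[j], anchors[i])
        total += sum(smooth_l1(a - b) for a, b in zip(t, t_star))
    return total / len(anchors)


def total_loss(l_cls, l_loc, alpha=1.0):
    """Linear combination of the two task losses (alpha placement is assumed)."""
    return l_cls + alpha * l_loc


# Toy example (hypothetical boxes): the prediction sits on the anchor,
# two pixels to the left of the ground-truth center.
anchors = [(10.0, 10.0, 4.0, 4.0)]
gt_boxes = [(12.0, 10.0, 4.0, 4.0)]
pred_boxes = [(10.0, 10.0, 4.0, 4.0)]
print(localization_loss(pred_boxes, gt_boxes, anchors, [0]))
```

Normalizing the offsets by the anchor size and taking logarithms of the size ratios keeps the regression targets on a similar scale for small and large defects.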
3.4. Roller Quality Evaluation
For defects that occurred on the cylindrical surface, the bearing roller was judged as unqualified regardless of the defect category or area. For defects that occurred on the end surfaces, step 3.1, described above, had already excluded defects, such as material lacking at the chamfer and stamp lacking, that caused the outer contour to be non-circular. For corrosion, scratch, damage, and grind lacking defects, the bearing roller was judged based on the defect area. The defects with bounding boxes were treated as ROIs (Regions of Interest), and the ROIs were analyzed separately using image processing methods. Accordingly, we calculated the defect area on each end surface separately. The process is shown in Figure 7 below.
Different defects have different impacts on the performance of the roller. Damage has the greatest impact on the performance, followed by scratch, corrosion, and grind lacking. Our surface inspection system had different tolerances for different defects; therefore, we defined four coefficients for the four defects. When calculating the total defect area, it was necessary to multiply the area of each defect by the corresponding coefficient. For damage, scratch, corrosion, and grind lacking, the coefficients were defined as 3, 1.5, 1, and 0.8, respectively. The coefficients were determined through multiple experiments based on the inspection effect, and different coefficients could be defined according to different situations.
After performing median filtering, Otsu thresholding [31], and morphological processing on the ROIs, defects were segmented from the background, and then we calculated the total area of all the defects. If the total defect area was greater than the set threshold, which was about 5% of the end surface area, the bearing roller would be judged as unqualified. The roller would be judged as qualified only when the defect area of each end surface was smaller than the threshold.
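The decision rule of this section can be sketched as a small Python function. The coefficients and the 5% ratio are taken from the text; the function names, the pixel areas, and the treatment of a weighted area exactly equal to the threshold (accepted here) are assumptions made for illustration. Segmentation of each ROI is assumed to have already produced per-defect areas.

```python
# Coefficients and area ratio as stated in the text.
COEFFICIENTS = {"damage": 3.0, "scratch": 1.5, "corrosion": 1.0, "grind_lacking": 0.8}
AREA_RATIO_THRESHOLD = 0.05


def weighted_defect_area(defects):
    """Sum of defect areas, each scaled by its category coefficient.

    defects: list of (category, area_in_pixels) pairs for one end surface.
    """
    return sum(COEFFICIENTS[category] * area for category, area in defects)


def end_surface_ok(defects, surface_area):
    """An end surface passes when its weighted defect area does not exceed 5%."""
    return weighted_defect_area(defects) <= AREA_RATIO_THRESHOLD * surface_area


def roller_qualified(contours_circular, cylindrical_defects, end_surfaces):
    """Combine the three checks: contour shape, cylindrical surface, end surfaces.

    end_surfaces: list of (defects, surface_area) pairs, one per end surface.
    """
    if not contours_circular or cylindrical_defects:
        return False
    return all(end_surface_ok(defects, area) for defects, area in end_surfaces)


# Toy example (hypothetical areas in pixels, end surface area of 10,000 pixels).
minor = [("scratch", 100.0), ("grind_lacking", 100.0)]  # weighted area 230
severe = [("damage", 200.0)]                            # weighted area 600
print(roller_qualified(True, [], [(minor, 10000.0), ([], 10000.0)]))
print(roller_qualified(True, [], [(minor, 10000.0), (severe, 10000.0)]))
```

Because damage is weighted by 3, a damaged region of 200 pixels already exceeds the 500-pixel allowance in this example, while larger scratch and grind-lacking regions remain acceptable.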
3.5. Data Augmentation
Both classification and localization depend on the CNN model, and a deep CNN model is prone to overfitting due to its powerful fitting ability, especially when the amount of data is not large. For bearing rollers, the probability of defects occurring is relatively low, and the amount of data that can be collected is relatively small, so it is necessary to appropriately augment the original data. We adopted several commonly used augmentation methods, including image rotation, image flipping, image cropping, adding blur, and adding noise. The augmentation results are shown in Figure 8 below.
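For a detection task, geometric augmentations such as flipping and rotation must also be applied to the bounding-box labels. The text does not describe how the labels were updated, so the following is only a minimal sketch of two of the listed augmentations, assuming images stored as nested lists and boxes in (center x, center y, width, height) form with continuous pixel coordinates; all names are hypothetical.

```python
def hflip_image(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def hflip_box(box, width):
    """Mirror a (cx, cy, w, h) box inside an image of the given width."""
    cx, cy, w, h = box
    return (width - cx, cy, w, h)


def rot90_image(image):
    """Rotate the image clockwise by 90 degrees."""
    return [list(col) for col in zip(*image[::-1])]


def rot90_box(box, height):
    """Rotate a (cx, cy, w, h) box clockwise by 90 degrees.

    height is the height of the image before rotation.
    """
    cx, cy, w, h = box
    return (height - cy, cx, h, w)


# Toy example: a 2 x 3 image whose "defect" is the pixel with value 6
# (row 1, column 2), described by a 1 x 1 box centred at (2.5, 1.5).
image = [[1, 2, 3],
         [4, 5, 6]]
box = (2.5, 1.5, 1.0, 1.0)

print(hflip_image(image), hflip_box(box, width=3))
print(rot90_image(image), rot90_box(box, height=2))
```

In both cases the transformed box still encloses the pixel with value 6, which is the property any label-preserving augmentation has to maintain.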