3.3. Image Preprocessing Module Design
The method of image preprocessing is divided into histogram equalization and Gaussian filtering.
Histogram equalization can be divided into two parts: statistics and mapping [
18]. Since statistics work must wait until the image has been processed, this means that the processing method will lose the advantage of pipeline processing, which would vastly reduce the processing speed. Because the inspection robots do not have sudden changes in the lighting environment during its operation, the method of counting the grayscale histogram of this frame can be used to map the next frame. Using the histogram equalization IP core and the software algorithm designed based on this method, the processing result of the working image of the underground belt conveyor is shown in
Figure 3. Compared with the previous pure-software algorithm, the method proposed in this paper only needs two clock cycles to get a result and greatly saves logic resources, but it will cause the mapped image to lose 2 gray levels, that is, the pixel gray value mapping at level 254. From the histograms, the loss of 2 gray levels can be ignored.
The discrete mathematical expression of Gaussian filtering is:
In Equation (1): —Smoothing coefficient; i—the abscissa of the Gaussian template; j—the ordinate of the Gaussian template; r—the radius of the Gaussian template.
When using FPGA to implement Gaussian filtering, in order to save logic resources and improve processing speed, fixed-point calculations are used instead of floating-point calculations and shifts are used instead of divisions. In this paper,
= 1, the template size is 5 × 5, and the coefficient in the template is the theoretical value multiplied by 256. After the calculation is completed, the effect of the coefficient is eliminated by shifting to the right by 8 bits. Due to the central symmetry of the Gaussian template, in order to improve the utilization of logic resources, the convolution process is divided into row convolution and column convolution. The constructed logic circuit is shown in
Figure 4.
The processing results of the Gaussian filtering hardware module constructed in this paper and the Gaussian filtering implemented by the software on the working image of the underground belt conveyor are shown in
Figure 5. As the algorithm IP core does not copy the image frame pixels during the processing, the results will appear as black edges (image reduction), and the convolution result in Gaussian filtering will be truncated to an integer; the error can be ignored.
3.4. Edge Detection Algorithm Module Design
The edge detection IP core consists of four parts, namely, the gradient information calculation module, the non-maximum value judgment module, the threshold segmentation, and the edge link module [
19].
Since the goal of the algorithm is to detect line segments, which are included in the edge part of the image, where the gray level changes sharply, enhancing the edge of the image is an indispensable step for subsequent line segment detection. In order to detect the correct single-pixel edge, in addition to obtaining the accurate gradient value, the gradient information calculation module also needs to solve the maximum value of the gradient, the gradient quadrant, and the edge direction angle (level line angle (LLA)). The calculation architecture of the overall algorithm module is shown in
Figure 6. Additionally, the calculation method is shown in
Figure 7.
Since the offset of the conveyor belt will not be very large, we only pay attention to π/4~3π/4 when grouping the edge direction angles. As shown in
Figure 8, a pyramid configuration is used to group LLAs. First, 3-bit data are used to discretely divide the interval (π/4~3π/4) into 8 groups according to the unit of π/8. Secondly, use the 2-bit grouping value to merge the 8 groups of 3-bit LLA. This grouping method sets the tolerance angle of line segment growth to 22.5 degrees. In addition, a 1 bit is used to mark the angle of the effective interval (π/4~3π/4). When the flag bit is 0, it means a valid grouping. When the flag bit is 1, it means an invalid group that exceeds the valid interval. The flag bit participates in the calculation of subsequent edge suppression. At the same time, in order to avoid the problem of missed detection of the line segment at the edge of the group, the original grouping method is offset by π/16 for regrouping to ensure that the angle on the frame of one grouping method is at the center of the other grouping method. The output of this grouping method is a 3-bit grouping value, and each potential edge point has two grouping results, which will be calculated at the same time in the subsequent area growth module.
Although this method limits the possibility of growth to a certain extent, in the implementation process, multiplication and comparison are used to replace the arc-tangent function solving process, which saves logic resources. Because the two-digit grouping result is conducive to the judgment operation of subsequent modules, the method reduces the difficulty of algorithm placement and routing.
A larger gradient value in a gradient image does not mean that the pixel is an edge point but only shows that the pixel has a higher potential. The non-maximum value judgment determines the single edge property of edge detection, which is particularly important for further line segment detection. This module performs 3 × 3 windowing on the result of gradient calculation, uses interpolation to calculate the potential maximum value, and then determines whether the center pixel is the local gradient maximum value and realizes the removal of false edges.
In addition to eliminating false edges, the edge detection module must also ensure the continuity of the edges. In addition to the result of judging the non-maximum value, whether a pixel can be defined as an edge is still affected by the gradient in the field. In this module, dual thresholds are used to link potential edges and suppress false edges for the results of non-maximum suppression. When the threshold is manually specified, there will always be edge loss caused by too-large thresholds or false edges caused by too-small thresholds. Then, use the maximum between-class variance method, i.e., Otsu’s method, to adaptively determine the high and low thresholds and use this to perform threshold segmentation and edge linking. This method ensures that the divided two classes have the largest inter-class variance, which means that they have the smallest probability of misclassification. The calculation circuit diagram of this algorithm is shown in
Figure 9.
The clock cost of the inter-class variance calculation is 6, and the threshold calculation needs to traverse all gradient levels. Therefore, if the OTSU threshold of this frame of image is used as the basis for gradient image segmentation, an additional 1082 beats of delay will occur. In the inspection process, the edges and textures of the working image of the belt conveyor will not have a sudden short-term change, so the threshold value obtained by the gradient histogram of the previous frame is used as the basis for the segmentation of this frame.
In summary, two-level first-in-first-out (FIFO) buffers, combined with registers, are used in FPGA to construct a 3 × 3 convolution window for the result of non-maximum suppression, and then the threshold determined by the OTSU algorithm is used as the high threshold, and one-half of it is used as the low threshold. Threshold segmentation and edge linking are based on three judgments:
When the central element is greater than the high threshold, it is judged as an edge;
When the central element is less than the low threshold, it is judged as non-edge;
When the central element is between the two thresholds, if there are elements higher than the high threshold in the neighborhood, it is judged as an edge.
Moreover, the pixels judged to be edges also need to satisfy the LLA in the interval [π/4, 3π/4].
3.5. Region Growth Algorithm Module Design
The regional growth module is the difficult key point of the IP core construction of the line segment detection algorithm. The edge information of the running image of the belt conveyor and the corresponding LLA are obtained through the preprocessing and edge detection of the previous stage [
20]. The line segments that meet the requirements are detected from the edge image. In the area growth module, the scanning direction of the convolution window must be consistent with the direction of line segment growth. As shown in
Figure 10, the convolution window constructed in FPGA always scans the image from left to right and top to bottom. When the direction of the line segment is consistent with the scanning direction, it can be detected, but the line segment on the right in
Figure 10 is from right to left in the top-down direction. In other words, the line segment cannot be detected if it is opposite the scanning direction. Therefore, a mirroring module is arranged before the edge detection of the region-growing module, and the region-growing module will obtain the line segment detection result of the whole image by processing the edge image and the mirrored edge image.
The implementation logic of the mirroring module is first-in-last-out. While the top-down scanning direction does not change, the order of elements in a row is reversed. The mirror module, constructed by simple dual-port RAM, will inevitably introduce a clock delay of one row of elements.
Figure 11 shows the processing path from the edge to the line segment information. After the processing of the previous module, the edge image (Gra) and the grouping value of edge direction angle (GAP) are obtained. On the one hand, Gra and GAP directly enter the area growth module to calculate the line segment information. On the other hand, Gra and GAP firstly pass through the mirror module and then enter the regional growth module. Since there are two LLA grouping methods, the area growth operation must be performed at the same time for different grouping values; that is, the area growth module will be called four times in the line segment detection IP core.
The region growth algorithm has three basic elements: a seed selection method, a similarity criterion, and a termination growth criterion. In this paper, combining the characteristics of FPGA processing image data, a region-growing method using linearly increasing column numbers as virtual seeds is proposed [
21]. This method does not use pixels with a certain characteristic in the image as seeds, but when the upstream module sequentially outputs row pixels, the increasing number of columns is used as the virtual seed. Based on the line segment area information stored in the virtual seed, the GAP in the heterogeneous convolution window is combined to realize the growth of the area and the update of the line segment information. In the gradient information calculation module, two-digit values are used to group LLAs. In the region growth module, the equality of GAP values is considered the similarity rule. The tolerance angle of line segment growth is π/8. Hence, when the difference in LLA values is within π/8, it is considered that the two points are aligned. In a digital image, a line segment is a collection of aligned pixels (close to LLA). The principle of stopping the growth will be described later in conjunction with the heterogeneous convolution window.
First, establish a heterogeneous convolution window. Since the growth of the region requires the combination of the seed point and the edge direction angle, this window is used to traverse the gradient image. This is shown in
Figure 12.
After completing the establishment of a convolution heterogeneous window, the method of updating regional information is studied. The update method of the area information depends on three growth methods: initialization, horizontal growth, and vertical growth. The three growth modes are determined by the alignment of the central element and the seed. As shown in
Figure 13, when the gradient of the central element point Gra = 1, growth is triggered. However, if Gra = 0, the growth is waiting to be triggered at Gra = 1.
The judgment of termination of growth is very important to the accurate detection of line segments. In the image, due to the intersection of line segments or noise, the termination of growth cannot be judged only by the alignment of D[I,J], D[I+1,J], and D[I+1,J+1]. In this design, the growth possibility of the three directions is analyzed to judge continued growth, and then NOR operations on the three judgments are performed. Finally, the judgment condition for the termination of the growth is constructed. The right side of
Figure 13 shows the method for judging the termination of growth in the area update. There are three possible directions for the continued growth of the central element D[I,J]:
When the gradient value of D[I+1,J] is 1, the judgment of continuing lateral growth is performed;
When the gradient value of D[I,J+1] is 1, the judgment of continuing vertical growth is performed;
When the gradient value of D[I+1,J+1] is 1, the judgment of continuing diagonal growth is performed.
At the same time, for the correctness of the region growth algorithm, the design of RAM control logic is also important. As shown in
Figure 14, port A is the write port, which is responsible for the initialization of RAM and the update of line segment information. The B port is the read port. Since the updated region information of the seed point will participate in the next growth during the calculation process, the read data will participate in the construction of the heterogeneous convolution window.
In order to ensure the correctness of subsequent calculations, the RAM is initialized before the area growth; that is, the initialization data is written through Port A. The timing of the key signals is shown in
Figure 15.
The construction of the heterogeneous convolution window requires that the data read from RAM and the data read from FIFO are at the same time. The timing logic of key signals is shown in
Figure 16. Both RAM and FIFO have a one-beat read delay. The read-enabled signal of FIFO drives the count of RAM read addresses.
The data update of area growth is determined by the growth mode and the judgment of the termination of growth. The timing diagram of key signals is shown in
Figure 17. When the growth mode is vertical growth or initialization, the write enable of the A port is pulled high by one beat and the write data is determined by the growth termination signal. As the growth mode is horizontal, when the termination of growth is true, initialize J−1. When the termination of growth is false, it is necessary to initialize the address J−1 while updating the address J.