In deep learning, the richness of the dataset strongly affects how well a network model performs in practice. If the dataset is small, the learning ability of the network is greatly compromised during training, and the model cannot learn all the characteristics of the data. The result is insufficient generalization ability, so the performance of the network model cannot meet actual needs. Because of the harsh underground environment and complex working conditions, it is difficult to obtain a large number of relief hole images. Therefore, in this paper, samples of relief hole images are generated based on the GAN model to enrich the sample dataset.
2.1. Basic Principle of GAN
GAN adopts an unsupervised learning mode whose core idea is to make two neural networks confront each other in order to learn. It is mainly composed of a generator G and a discriminator D. The goal of the generator is to transform the input noise z, through continuous training, into an image that is as similar to the original image as possible, and thereby to confuse the discriminator. Through mutual improvement in this continuous game, a point is reached at which the discriminator, despite its strong discrimination ability, can no longer distinguish whether the input data are real or generated. At this point, the generator can be considered to have mastered the feature distribution of the original image.
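As a concrete illustration of this adversarial game, the following minimal sketch shows one training step of an ordinary GAN in PyTorch. The network architectures, noise dimension, and optimizer settings are illustrative assumptions and not the configuration used in this paper.

```python
import torch
import torch.nn as nn

# Minimal sketch of one GAN training step (illustrative, not the paper's exact setup).
# G maps noise to an image; D outputs a real/fake logit per image.
def gan_step(G, D, real_images, opt_G, opt_D, z_dim=100):
    bce = nn.BCEWithLogitsLoss()
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) Train the discriminator D to separate real images from generated ones.
    z = torch.randn(batch, z_dim)
    fake_images = G(z).detach()            # do not backpropagate into G here
    loss_D = bce(D(real_images), real_labels) + bce(D(fake_images), fake_labels)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # 2) Train the generator G so that D classifies its outputs as real.
    z = torch.randn(batch, z_dim)
    loss_G = bce(D(G(z)), real_labels)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```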
In 2019, Shaham et al. proposed SinGAN, a single-image generative adversarial network model. Its basic structure is shown in Figure 1 [24], where zN represents random noise, GN represents the generator, and DN represents the discriminator; DN receives both the image generated by GN and the real image. In general, training a GAN to generate images requires at least thousands of training samples; SinGAN, by contrast, takes a single natural image as input and, like the GAN model, is trained in an unsupervised manner. SinGAN can generate new images similar to the original image: the general features remain essentially the same, while some details and structures are changed.
Although SinGAN can generate any number of images with high authenticity and diversity, this holds only when the input image is a natural scene image with an obvious structure. Because the relief hole image lacks such a clear structural layout, the diversity of the generated images is poor, and the performance of the original SinGAN model falls far short of the actual demand.
2.2. Improved SinGAN
To address the above problems, the SinGAN model is improved by taking the image features of the pressure relief hole into account. The specific measures are as follows: a new image size transformation function is designed to enrich the diversity of the pressure relief hole images, and the number of iterations in each stage is made to change dynamically to shorten the training time of the model and improve its performance.
- (1)
Image size adjustment
At the beginning of training, the main task of the network model is to learn the general features of the images; as training progresses, the detailed features are gradually learned. Considering the uncertainty of the overall feature distribution of images captured in the coal mine, different relief hole images are mainly distinguished by certain details, so when relief hole images are generated, learning should focus on the details to ensure the variety of the images. During model training, images of different sizes are input at different levels, and one resized image is needed for each stage of training. To obtain rich details, a certain number of high-resolution learning stages is required, generally at least three, whereas extensive training at the low-resolution stages is unnecessary. Based on this, the authors design a new curve adjustment function, as shown in Equation (1), to ensure that the training stages are biased toward high resolution.
where r is a constant less than 1, N is the maximum stage number, and n is the current stage number.
For example, if the resolutions of the output images at each stage under the traditional method are 25 × 38, 40 × 60, 89 × 133, and 167 × 250, then the resolutions under the new size adjustment curve function are 56 × 83, 100 × 150, 117 × 176, and 167 × 250. It can be seen that, apart from the identical maximum resolution, the resolution of every improved stage is greater than under the traditional method.
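The sketch below illustrates the idea of the per-stage size schedule. Since Equation (1) is not reproduced above, the exact curve is not shown; the code contrasts the traditional SinGAN geometric schedule r^(N−n) with a hypothetical high-resolution-biased curve, and the shape of that biased curve is purely an illustrative assumption.

```python
from typing import Callable, List, Tuple

def stage_sizes(max_h: int, max_w: int, N: int,
                scale_fn: Callable[[int, int], float]) -> List[Tuple[int, int]]:
    """Compute the training-image size for every stage of a SinGAN-style pyramid.

    scale_fn(n, N) must return the scale factor (0 < s <= 1) for stage n; the
    curve adjustment function of Equation (1) would be plugged in here.
    """
    return [(round(max_h * scale_fn(n, N)), round(max_w * scale_fn(n, N)))
            for n in range(N + 1)]

r = 0.53  # illustrative value; the paper only states that r is a constant less than 1

# Traditional SinGAN schedule: scale factor r**(N - n).
traditional = stage_sizes(167, 250, 3, lambda n, N: r ** (N - n))

# A hypothetical high-resolution-biased curve (illustrative assumption only):
# shrinking the exponent keeps intermediate stages closer to the full resolution.
biased = stage_sizes(167, 250, 3, lambda n, N: r ** ((N - n) ** 2 / N))

print(traditional)  # coarse-to-fine sizes; the last stage equals the full resolution
print(biased)       # intermediate stages are larger than in the traditional schedule
```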
- (2)
Multi-stage training
The setting of the learning rate at each stage strongly affects the training speed of the model: the higher the learning rate, the faster the learning. However, to prevent over-fitting, the learning rate is generally set relatively low. During training, not only is the generator of the current stage trained, but the generators of nearby stages are also trained at the same time. In addition, the learning rates of the lower stages are generally set relatively low.
Figure 2 shows how the generators and their learning rates change during model training. Firstly, at most three generators are trained at the same time; if more than three generators exist, only the generators of the three most recent scales are trained, and the parameters of the other generators remain unchanged. Secondly, the learning rates of the generators at the two lower stages are adjusted to 1/10 and 1/100 of the original value.
Figure 2 illustrates the four top-down stages of the training process. From row 1 to row 3, the number of trained generators is positively correlated with the number of training stages. In the fourth stage, the parameters of G0 are fixed, and the learning rates of G1, G2, and G3 are set to 1%, 10%, and 100% of the original value, respectively.
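As an illustration of this multi-stage scheme, the sketch below builds a PyTorch optimizer whose parameter groups assign scaled learning rates to the three most recently added generators and freeze all earlier ones. The 1/100, 1/10, and 1 scaling factors follow the description above; the base learning rate, optimizer choice, and module structure are assumptions for illustration.

```python
from typing import List
import torch
from torch import nn

def build_stage_optimizer(generators: List[nn.Module], base_lr: float = 5e-4):
    """Train at most the three finest-scale generators; freeze all earlier ones.

    The newest generator keeps the full learning rate, the one below it gets 1/10,
    and the one below that gets 1/100, matching the scheme described above.
    """
    trainable = generators[-3:]            # e.g. G1, G2, G3 in the fourth stage
    frozen = generators[:-3]               # e.g. G0: parameters stay fixed
    for g in frozen:
        for p in g.parameters():
            p.requires_grad = False

    scales = [0.01, 0.1, 1.0][-len(trainable):]   # 1/100, 1/10, 1
    param_groups = [{"params": g.parameters(), "lr": base_lr * s}
                    for g, s in zip(trainable, scales)]
    return torch.optim.Adam(param_groups, lr=base_lr, betas=(0.5, 0.999))
```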
- (3)
Dynamic change of the number of iterations
Because the pyramid structure of SinGAN has multi-scale and multi-stage characteristics, training the model takes a long time. In addition, during image generation, detail features matter more than structural features. Therefore, to further improve the diversity of the generated images and the efficiency of model training, this paper trains only specific generators in the SinGAN model, rather than all generators as in the original SinGAN, and optimizes the number of iterations in each training round, as shown in Equation (2).
where iter refers to the current number of iterations; β represents the rate at which training reaches the maximum value; niter represents the set number of iterations; max represents the maximum number of training stages; and depth represents the stage currently being trained, which changes continuously.
With the above improvements, the number of iterations increases for the generator at certain stages but decreases for most stages. The total number of iterations of the model is therefore significantly reduced and the training time greatly shortened, which improves the training performance of the SinGAN model.
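Because Equation (2) is not reproduced above, the sketch below only illustrates the general idea of a stage-dependent iteration budget: stages closer to the finest resolution receive more iterations, controlled by the rate parameter β. The exact functional form is an assumption for illustration, not the paper's formula.

```python
def iterations_for_stage(niter: int, beta: float, depth: int, max_stage: int) -> int:
    """Illustrative stage-dependent iteration budget (assumed form, not Equation (2)).

    niter     : the configured (maximum) number of iterations per stage
    beta      : rate at which the budget approaches its maximum (0 < beta < 1)
    depth     : index of the stage currently being trained (0 = coarsest)
    max_stage : index of the finest training stage
    """
    # Coarse stages get a reduced budget; the finest stage gets the full niter.
    return max(1, round(niter * beta ** (max_stage - depth)))

# Example: niter = 2000, beta = 0.6, four stages (0..3).
print([iterations_for_stage(2000, 0.6, d, 3) for d in range(4)])
# -> [432, 720, 1200, 2000]: most stages run fewer iterations than niter.
```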
2.3. Image Generation Model for Pressure Relief Hole
The image generation model for the pressure relief hole based on the improved SinGAN is shown in Figure 3, where Ladv is the objective function of the GAN. The model mainly consists of two parts: the generation process of a single pressure relief hole image and the training process of the model. At stage 0, the input of G0 consists only of random noise. At every other stage n, the input includes, in addition to the noise zn, the image output at the previous stage, which must first be upsampled. At each scale n, the noise zn is superimposed on the upsampled image output at the previous scale and then passed to the generator. The image output by the generator, obtained after mixing in the residual image, is the generated image of the n-th stage.
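The sketch below outlines this coarse-to-fine generation pass: at the first stage only noise enters the generator, while at every later stage the previous output is upsampled, mixed with noise, and passed through a generator whose output is added back to the upsampled image as a residual, following the standard SinGAN formulation. The generator internals, channel count, and noise amplitudes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def generate_pyramid(generators, sizes, noise_amp):
    """Coarse-to-fine generation in a SinGAN-style residual scheme.

    generators : one generator module per stage, coarse to fine
    sizes      : (height, width) of the image produced at each stage
    noise_amp  : per-stage scaling of the injected noise
    """
    out = None
    for G, (h, w), amp in zip(generators, sizes, noise_amp):
        z = amp * torch.randn(1, 3, h, w)          # random noise for this stage
        if out is None:                             # stage 0: noise only
            out = G(z)
        else:
            # Upsample the previous stage's output to the current resolution.
            prev_up = F.interpolate(out, size=(h, w), mode="bilinear",
                                    align_corners=False)
            # Noise is superimposed on the upsampled image, and the generator's
            # output is added back as a residual to form the stage-n image.
            out = G(z + prev_up) + prev_up
    return out
```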