In deep learning, the richness of the dataset strongly affects how well a network model performs in practice. If the dataset is small, the learning ability of the network is greatly compromised during training, and the model cannot learn all the characteristics of the data. The result is insufficient generalization ability, so the performance of the network model cannot meet actual needs. Because of the harsh underground environment and complex working conditions, it is difficult to obtain a large number of relief hole images. Therefore, in this paper, samples of relief hole images are generated based on the GAN model to enrich the sample dataset.
2.1. Basic Principle of GAN
GAN adopts an unsupervised learning mode whose core idea is to make two neural networks confront each other in order to learn. It is mainly composed of a generator G and a discriminator D. The goal of the generator is to transform the input noise z, through continuous training, into an image that is as similar to the original image as possible, and thereby to confuse the discriminator. Through mutual improvement in this continuous game, a point is reached at which the discriminator, despite its strong discrimination ability, can no longer distinguish whether the input data are real or generated. At this point, the generator can be considered to have mastered the feature distribution of the original image.
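As a concrete illustration of this adversarial game, the following minimal sketch shows one training step of an ordinary GAN in PyTorch. The network architectures, noise dimension, and optimizer settings are illustrative assumptions and not the configuration used in this paper.

```python
import torch
import torch.nn as nn

# Minimal sketch of one GAN training step (illustrative, not the paper's exact setup).
# G maps noise to an image; D outputs a real/fake logit per image.
def gan_step(G, D, real_images, opt_G, opt_D, z_dim=100):
    bce = nn.BCEWithLogitsLoss()
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) Train the discriminator D to separate real images from generated ones.
    z = torch.randn(batch, z_dim)
    fake_images = G(z).detach()            # do not backpropagate into G here
    loss_D = bce(D(real_images), real_labels) + bce(D(fake_images), fake_labels)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # 2) Train the generator G so that D classifies its outputs as real.
    z = torch.randn(batch, z_dim)
    loss_G = bce(D(G(z)), real_labels)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```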
In 2019, Shaham et al. proposed SinGAN, a single-image generative adversarial network model. Its basic structure is shown in Figure 1 [24], where zN represents random noise, GN represents the generator, and DN represents the discriminator; DN receives both the image generated by GN and the real image. In general, training a GAN to generate images requires at least thousands of training samples; SinGAN, by contrast, takes a single natural image as input and, like the GAN model, is trained in an unsupervised manner. SinGAN can generate new images similar to the original image: the general features remain essentially the same, while some details and structures are changed.
Although SinGAN can generate any number of images with high authenticity and diversity, this holds only when the input image is a natural scene image with an obvious structure. Because the relief hole image lacks such a clear structural layout, the diversity of the generated images is poor, and the performance of the original SinGAN model falls far short of the actual demand.
2.2. Improved SinGAN
To address the above problems, the SinGAN model is improved by taking the image features of the pressure relief hole into account. The specific measures are as follows: a new image size transformation function is designed to enrich the diversity of the pressure relief hole images, and the number of iterations in each stage is made to change dynamically to shorten the training time of the model and improve its performance.
- (1)
Image size adjustment
At the beginning of training, the main task of the network model is to learn the general features of the images; as training progresses, the detailed features are gradually learned. Considering the uncertainty of the overall feature distribution of images captured in the coal mine, different relief hole images are mainly distinguished by certain details, so when relief hole images are generated, learning should focus on the details to ensure the variety of the images. During model training, images of different sizes are input at different levels, and one resized image is needed for each stage of training. To obtain rich details, a certain number of high-resolution learning stages is required, generally at least three, whereas extensive training at the low-resolution stages is unnecessary. Based on this, the authors design a new curve adjustment function, as shown in Equation (1), to ensure that the training stages are biased toward high resolution.
where r is a constant less than 1, N is the maximum stage number, and n is the current stage number.
For example, if the resolutions of the output images at each stage under the traditional method are 25 × 38, 40 × 60, 89 × 133, and 167 × 250, then the resolutions under the new size adjustment curve function are 56 × 83, 100 × 150, 117 × 176, and 167 × 250. It can be seen that, apart from the identical maximum resolution, the resolution of every improved stage is greater than under the traditional method.
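The sketch below illustrates the idea of the per-stage size schedule. Since Equation (1) is not reproduced above, the exact curve is not shown; the code contrasts the traditional SinGAN geometric schedule r^(N−n) with a hypothetical high-resolution-biased curve, and the shape of that biased curve is purely an illustrative assumption.

```python
from typing import Callable, List, Tuple

def stage_sizes(max_h: int, max_w: int, N: int,
                scale_fn: Callable[[int, int], float]) -> List[Tuple[int, int]]:
    """Compute the training-image size for every stage of a SinGAN-style pyramid.

    scale_fn(n, N) must return the scale factor (0 < s <= 1) for stage n; the
    curve adjustment function of Equation (1) would be plugged in here.
    """
    return [(round(max_h * scale_fn(n, N)), round(max_w * scale_fn(n, N)))
            for n in range(N + 1)]

r = 0.53  # illustrative value; the paper only states that r is a constant less than 1

# Traditional SinGAN schedule: scale factor r**(N - n).
traditional = stage_sizes(167, 250, 3, lambda n, N: r ** (N - n))

# A hypothetical high-resolution-biased curve (illustrative assumption only):
# shrinking the exponent keeps intermediate stages closer to the full resolution.
biased = stage_sizes(167, 250, 3, lambda n, N: r ** ((N - n) ** 2 / N))

print(traditional)  # coarse-to-fine sizes; the last stage equals the full resolution
print(biased)       # intermediate stages are larger than in the traditional schedule
```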
- (2)
Multi-stage training
The setting of the learning rate at each stage strongly affects the training speed of the model: the higher the learning rate, the faster the learning. However, to prevent over-fitting, the learning rate is generally set relatively low. During training, not only is the generator of the current stage trained, but the generators of nearby stages are also trained at the same time. In addition, the learning rates of the lower stages are generally set relatively low.
Figure 2 shows how the generators and their learning rates change during model training. Firstly, at most three generators are trained at the same time; if more than three generators exist, only the generators of the three most recent scales are trained, and the parameters of the other generators remain unchanged. Secondly, the learning rates of the generators at the two lower stages are adjusted to 1/10 and 1/100 of the original value.
Figure 2 illustrates the four top-down stages of the training process. From row 1 to row 3, the number of trained generators is positively correlated with the number of training stages. In the fourth stage, the parameters of G0 are fixed, and the learning rates of G1, G2, and G3 are set to 1%, 10%, and 100% of the original value, respectively.
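As an illustration of this multi-stage scheme, the sketch below builds a PyTorch optimizer whose parameter groups assign scaled learning rates to the three most recently added generators and freeze all earlier ones. The 1/100, 1/10, and 1 scaling factors follow the description above; the base learning rate, optimizer choice, and module structure are assumptions for illustration.

```python
from typing import List
import torch
from torch import nn

def build_stage_optimizer(generators: List[nn.Module], base_lr: float = 5e-4):
    """Train at most the three finest-scale generators; freeze all earlier ones.

    The newest generator keeps the full learning rate, the one below it gets 1/10,
    and the one below that gets 1/100, matching the scheme described above.
    """
    trainable = generators[-3:]            # e.g. G1, G2, G3 in the fourth stage
    frozen = generators[:-3]               # e.g. G0: parameters stay fixed
    for g in frozen:
        for p in g.parameters():
            p.requires_grad = False

    scales = [0.01, 0.1, 1.0][-len(trainable):]   # 1/100, 1/10, 1
    param_groups = [{"params": g.parameters(), "lr": base_lr * s}
                    for g, s in zip(trainable, scales)]
    return torch.optim.Adam(param_groups, lr=base_lr, betas=(0.5, 0.999))
```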
- (3)
Dynamic change of the number of iterations
Because the pyramid structure of SinGAN has multi-scale and multi-stage characteristics, training the model takes a long time. In addition, during image generation, detail features matter more than structural features. Therefore, to further improve the diversity of the generated images and the efficiency of model training, this paper trains only specific generators in the SinGAN model, rather than all generators as in the original SinGAN, and optimizes the number of iterations in each training round, as shown in Equation (2).
where iter refers to the current number of iterations; β represents the rate at which training reaches the maximum value; niter represents the set number of iterations; max represents the maximum number of training stages; and depth represents the stage currently being trained, which changes continuously.
With the above improvements, the number of iterations increases for the generator at certain stages but decreases for most stages. The total number of iterations of the model is therefore significantly reduced and the training time greatly shortened, which improves the training performance of the SinGAN model.
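Because Equation (2) is not reproduced above, the sketch below only illustrates the general idea of a stage-dependent iteration budget: stages closer to the finest resolution receive more iterations, controlled by the rate parameter β. The exact functional form is an assumption for illustration, not the paper's formula.

```python
def iterations_for_stage(niter: int, beta: float, depth: int, max_stage: int) -> int:
    """Illustrative stage-dependent iteration budget (assumed form, not Equation (2)).

    niter     : the configured (maximum) number of iterations per stage
    beta      : rate at which the budget approaches its maximum (0 < beta < 1)
    depth     : index of the stage currently being trained (0 = coarsest)
    max_stage : index of the finest training stage
    """
    # Coarse stages get a reduced budget; the finest stage gets the full niter.
    return max(1, round(niter * beta ** (max_stage - depth)))

# Example: niter = 2000, beta = 0.6, four stages (0..3).
print([iterations_for_stage(2000, 0.6, d, 3) for d in range(4)])
# -> [432, 720, 1200, 2000]: most stages run fewer iterations than niter.
```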
2.3. Image Generation Model for Pressure Relief Hole
The image generation model for the pressure relief hole based on the improved SinGAN is shown in Figure 3, where Ladv is the objective function of the GAN. The model mainly consists of two parts: the generation process of a single pressure relief hole image and the training process of the model. At stage 0, the input of G0 consists only of random noise. At every other stage n, the input includes, in addition to the noise zn, the image output at the previous stage, which must first be upsampled. At each scale n, the noise zn is superimposed on the upsampled image output at the previous scale and then passed to the generator. The image output by the generator, obtained after mixing in the residual image, is the generated image of the n-th stage.
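The sketch below outlines this coarse-to-fine generation pass: at the first stage only noise enters the generator, while at every later stage the previous output is upsampled, mixed with noise, and passed through a generator whose output is added back to the upsampled image as a residual, following the standard SinGAN formulation. The generator internals, channel count, and noise amplitudes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def generate_pyramid(generators, sizes, noise_amp):
    """Coarse-to-fine generation in a SinGAN-style residual scheme.

    generators : one generator module per stage, coarse to fine
    sizes      : (height, width) of the image produced at each stage
    noise_amp  : per-stage scaling of the injected noise
    """
    out = None
    for G, (h, w), amp in zip(generators, sizes, noise_amp):
        z = amp * torch.randn(1, 3, h, w)          # random noise for this stage
        if out is None:                             # stage 0: noise only
            out = G(z)
        else:
            # Upsample the previous stage's output to the current resolution.
            prev_up = F.interpolate(out, size=(h, w), mode="bilinear",
                                    align_corners=False)
            # Noise is superimposed on the upsampled image, and the generator's
            # output is added back as a residual to form the stage-n image.
            out = G(z + prev_up) + prev_up
    return out
```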