1. Introduction
Well production optimization aims to find the optimal production strategy for each well so as to maximize the net present value (NPV) or hydrocarbon production of a reservoir. The optimization process involves forecasting future production, for which numerical simulators are often used. However, individual simulation runs can be time-consuming, and a complete optimization may demand numerous simulation iterations [1]. Consequently, it is critical to devise efficient methodologies to address these challenges.
Currently, two main types of methods are commonly used for well production optimization: optimization algorithm-based methods and reinforcement learning (RL) methods. The optimization algorithm-based methods mainly include gradient-based methods and derivative-free methods. Gradient-based algorithms use gradient information to determine the search direction [2,3,4,5]. The mainstream gradient-based methods that have been applied to production optimization problems are the adjoint gradient-based method [6], the stochastic gradient method [3], and simultaneous perturbation stochastic approximation [7]. These methods have been shown to provide fast and accurate solutions for production optimization problems; however, they can only guarantee locally optimal solutions, so more efficient methods capable of finding globally optimal solutions are needed. Alternatively, derivative-free algorithms do not require the explicit computation of derivatives and therefore offer better flexibility [8,9]. Representative algorithms are differential evolution (DE) [10,11], surrogate-assisted evolutionary algorithms (SAEAs) [12,13,14,15,16,17,18,19], and particle swarm optimization (PSO) [20]. These methods have been widely used in various optimization tasks and have shown excellent global search capability. However, they require a large number of simulations, have low computational efficiency, and struggle with high-dimensional problems, which limits their application in this field. A further drawback of optimization algorithm-based methods is that they are task-specific, lack memory, and must restart from scratch for each new task.
Recent studies have attempted to use RL algorithms to solve specific production optimization problems; for example, De Paola et al. used the DQN algorithm (De Paola et al., 2020), and Zhang et al. used the SAC algorithm for full life-cycle water-drive production optimization [21]. Although these studies show that RL can markedly improve the final recovery in dynamic production optimization, most RL models are trained for a specific reservoir and can therefore only be used for that reservoir; when applied to other reservoirs, they generally perform poorly. To address this limitation, recent work has increasingly focused on using RL to solve generalized problems across different reservoir optimization models. For example, Miftakhov et al. proposed an end-to-end strategy optimization combined with pixel data to maximize the NPV of production processes [22]. Additionally, Nasir et al. developed a standard reservoir template on which to train an RL model for field development plan (FDP) optimization; when applied to a real reservoir, the reservoir is rescaled to the template, thereby solving scalable field development optimization problems [23,24]. Furthermore, a general control strategy framework based on deep reinforcement learning (DRL) was developed by Nasir and Durlofsky for closed-loop decision-making in subsurface flow environments [25]. Here, the closed-loop reservoir management problem is formulated as a partially observable Markov decision process and solved with a proximal policy optimization algorithm. However, training such models requires significant computational effort, and a truly generalized RL optimization model has not yet been achieved.
In this paper, we propose a new approach that utilizes convolutional neural networks (CNNs) and RL algorithms to optimize well production in reservoir engineering. Our approach involves dividing the reservoir into fundamental production units. Each unit comprises a group of wells with distinct geological and developmental characteristics. In the field of reservoir engineering, this type of production unit, which consists of a central well and its associated neighboring wells, is commonly referred to as a “well group”. By doing so, the sample size increases significantly, covering a broader range of characteristics and simplifying the model training process. Moreover, image enhancement techniques can be employed to further improve the coverage of the samples.
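To make the well-group sampling and image enhancement steps concrete, the sketch below (not the authors' implementation; the grid shape, patch size, and well coordinates are illustrative assumptions) crops patches centered on each well from a 2D property field and augments them with flips and rotations:

```python
import numpy as np

def crop_well_groups(field: np.ndarray, well_xy: list, half: int = 8) -> list:
    """Crop a square patch (well group) centered on each well.

    field   : 2D array of a reservoir property (e.g., permeability).
    well_xy : list of (row, col) well locations.
    half    : half-width of the patch, so each patch is (2*half+1)^2 cells.
    """
    padded = np.pad(field, half, mode="edge")          # avoid boundary issues
    patches = []
    for r, c in well_xy:
        patches.append(padded[r:r + 2 * half + 1, c:c + 2 * half + 1])
    return patches

def augment(patch: np.ndarray) -> list:
    """Simple image-enhancement step: 4 rotations x 2 flips = 8 variants."""
    variants = []
    for k in range(4):
        rotated = np.rot90(patch, k)
        variants.append(rotated)
        variants.append(np.fliplr(rotated))
    return variants

# Illustrative usage on a synthetic 60 x 60 permeability field with 3 wells.
perm = np.random.lognormal(mean=3.0, sigma=0.5, size=(60, 60))
samples = [v for p in crop_well_groups(perm, [(10, 15), (30, 30), (45, 50)])
           for v in augment(p)]
print(len(samples), samples[0].shape)   # 24 patches of shape (17, 17)
```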
We present two frameworks for optimal control of the well workover regime. The first framework employs a deep CNN that takes the well group as input and outputs the optimal working regime of the well group in terms of bottomhole pressure (BHP); the optimal production strategies obtained through RL are used to label the samples before the network is trained. The second framework also uses a CNN but takes as its sample label the well revenue (NPV) under a given BHP, obtained from the RL algorithm. We demonstrate the effectiveness of our approach through extensive experiments and analysis.
The remainder of this paper is structured as follows. Section 2 introduces the mathematical model for production optimization and describes the main processes of the two frameworks. Section 3 outlines the main algorithms employed in this study. Section 4 describes the dataset used for the study. Section 5 presents a test case to illustrate the application of the proposed methods. Finally, Section 6 summarizes the main conclusions of this study.
2. Production Optimization Problems and Solutions
2.1. Mathematical Model for Production Optimization
Production optimization seeks to achieve maximum financial gain or hydrocarbon production by adjusting the control strategy for each well [26,27].
Mathematically, the optimization problem for oil field development can be expressed as follows:

$$\max_{\mathbf{u} \in X} \; J(\mathbf{u}), \quad \text{s.t.} \;\; \mathbf{c}(\mathbf{u}) \le 0 \tag{1}$$

The objective function, denoted by $J$, is to be optimized, and the decision vector $\mathbf{u}$ defines the specific production strategy of the well. The space $X$ defines the range of values of the decision variables, while the vector $\mathbf{c}$ defines the optimization constraints that must be satisfied.
In this study, the objective function to be optimized is the NPV of the production process. The formula for calculating NPV is as follows:

$$\mathrm{NPV} = \sum_{n=1}^{N_t} \frac{\Delta t_n}{(1+b)^{\,t_n/365}} \left[ \sum_{j=1}^{N_P} \left( r_o\, q_{o,j}^{\,n} - c_w\, q_{w,j}^{\,n} \right) - \sum_{i=1}^{N_I} c_{wi}\, q_{wi,i}^{\,n} \right] \tag{2}$$

In Equation (2), $N_I$ and $N_P$ represent the total number of injection and production wells, respectively. $N_t$ denotes the total number of reservoir simulation steps, with $\Delta t_n$ (in days) being the length of the $n$th time step and $t_n$ (in days) being the cumulative time up to the $n$th time step. $r_o$ denotes the oil revenue in USD/STB, which is set to 70 USD/STB in this paper; $c_w$ represents the cost of disposing of the produced water in USD/STB, which is 5 USD/STB in this paper; $c_{wi}$ denotes the cost of injecting water in USD/STB, which is 5 USD/STB in this paper; and $b$ denotes the annual discount rate, which is 0 in this paper. $q_{o,j}^{\,n}$ and $q_{w,j}^{\,n}$ represent the oil production rate and water production rate (in STB/D) of the $j$th production well during the $n$th time step, respectively, and $q_{wi,i}^{\,n}$ denotes the water-injection rate of the $i$th injection well at the $n$th time step (in STB/D).
In this study, the decision vector $\mathbf{u}$ is the BHP of the well, and the range of values $X$ of the decision variables and the optimization constraints $\mathbf{c}$ are specified in Section 3.1.
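As a concrete reading of Equation (2), the following sketch evaluates the NPV of a candidate strategy from simulator rate outputs; the array names and example step data are illustrative assumptions, while the prices match those stated above (r_o = 70 USD/STB, c_w = c_wi = 5 USD/STB, b = 0):

```python
import numpy as np

def npv(dt, t, q_oil, q_wat, q_inj, r_o=70.0, c_w=5.0, c_wi=5.0, b=0.0):
    """Net present value per Equation (2).

    dt    : (N_t,)       length of each simulation step, days
    t     : (N_t,)       cumulative time at each step, days
    q_oil : (N_t, N_P)   oil rate of each production well, STB/D
    q_wat : (N_t, N_P)   water rate of each production well, STB/D
    q_inj : (N_t, N_I)   injection rate of each injection well, STB/D
    """
    cash_per_day = (r_o * q_oil - c_w * q_wat).sum(axis=1) - (c_wi * q_inj).sum(axis=1)
    discount = (1.0 + b) ** (t / 365.0)
    return float(np.sum(dt * cash_per_day / discount))

# Illustrative usage: 3 steps of 30 days, 2 producers, 1 injector.
dt = np.array([30.0, 30.0, 30.0])
t = np.cumsum(dt)
q_oil = np.array([[400.0, 350.0], [380.0, 330.0], [360.0, 310.0]])
q_wat = np.array([[50.0, 60.0], [70.0, 80.0], [90.0, 100.0]])
q_inj = np.array([[500.0], [500.0], [500.0]])
print(f"NPV = {npv(dt, t, q_oil, q_wat, q_inj):,.0f} USD")
```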
2.2. Algorithmic Framework for Oil Well Production Optimization
In this study, we propose two novel frameworks for developing a general model for regulating well production strategies. These frameworks differ from previous approaches, which rely on iterative optimization algorithms that query numerical simulators or use RL to train a proxy model for well production. The specifics of the two frameworks are elaborated below.
2.2.1. Framework 1
Figure 1 illustrates Framework 1, which consists of three main steps corresponding to ①, ② and ③ on the right side of the figure. The first step is the preparation of the sample set. We develop a personalized deep Q-network (DQN) algorithm and use it to conduct RL on the cropped well group samples. When a well group sample is input to the algorithm, the algorithm aims to maximize the NPV of that well group; it does so by evaluating the NPV of the recommended BHP through numerical simulation until the optimal BHP for the well group is obtained. Each input well group sample, along with the optimal BHP recommended by the personalized DQN algorithm, is stored in the sample repository. By optimizing a large number of cropped well group samples with RL, we generate a diverse sample set containing cropped well group samples and their corresponding optimal BHPs for numerous cases. The second step is the training of the model: we build a CNN and train it on the aforementioned sample set to obtain a CNN model. The third step is the application of the model. The personalized DQN algorithm is explained in Section 3.1, the construction of the CNN network structure in Section 3.2.1, and the preparation of well groups in Section 4.
2.2.2. Framework 2
As shown in Figure 2, Framework 2 comprises a three-step process, i.e., ①, ② and ③ in the figure. The first step is to construct a sample set that contains more development information: unlike Framework 1, which records only the well group and the optimal BHP recommended by the personalized DQN algorithm for each input, Framework 2 also saves the NPV corresponding to each BHP evaluated by the personalized DQN algorithm. The second step is to train a CNN model that can predict development performance under different production strategies: we first construct a CNN network that differs from the one in Framework 1 and then train it on the sample set. The last step is to apply the CNN model to generate optimal production strategies. Particle swarm optimization (PSO), an intelligent optimization algorithm, is used to automatically generate a batch of production strategies, and the CNN model is invoked to quickly predict their development performance. These predictions are passed back to PSO, which generates a new batch of production strategies based on them. By repeating this process, PSO eventually converges to the optimal production strategy. We introduce PSO in Section 3.3 and describe the structure of the CNN network in Section 3.2.2.
3. Algorithms
3.1. Personalized DQN Algorithm
RL is a process of trying different actions and interacting with the environment in order to find the optimal strategy based on the feedback received from the environment; the accumulated experience can also be preserved. In the context of well workover optimization, RL can be applied by using the well group as the input, the BHP as the action, numerical simulation as the environment, and the NPV as the feedback. In this way, optimal BHP or NPV labels can be attached to the well group samples used subsequently.
As RL is not a specific algorithm but rather a generic term for a class of algorithms, a suitable algorithm must be chosen. Popular RL algorithms include DQN, Soft Actor-Critic, and Proximal Policy Optimization. In this study, the DQN algorithm was chosen to minimize the computational burden.
In this paper, we investigate a single-step optimization problem for oil wells. However, the original DQN algorithm [28] was designed for multi-step, time-series problems. Therefore, we have modified the algorithm to suit our research needs, resulting in the personalized DQN algorithm. Specifically, we made two main modifications: (1) we modified the network structure, and (2) we adjusted the mathematical expression used to calculate the value of an action.
Modification of the network structure: In the original DQN algorithm, two neural networks are initialized, namely the original (online) network with parameters $\theta$ and the target network with parameters $\theta^-$. However, the target network is designed to stabilize training for multi-step, time-series problems, which is not necessary for our research. Therefore, we removed the target network from the personalized DQN algorithm to meet the requirements of our single-step optimization problem.
Modification of the mathematical expression for calculating the value of an action: In RL, the value of an action is measured by its reward, which the original DQN estimates using the expression in Equation (3):

$$y = r + \gamma \max_{a'} Q(s', a'; \theta) \tag{3}$$

In Equation (3), $r$ is the reward value of the current action, $\max_{a'} Q(s', a'; \theta)$ is an estimate of the value of the best future action in the next state $s'$, and $\gamma$ is the discount rate, which discounts the value of the future action back to the current node and takes a value between 0 and 1.

Since this paper studies a single-step optimization problem for oil wells, only one action is generated per training step and there are no future actions. There is thus no need to estimate the value of future actions, so the term $\gamma \max_{a'} Q(s', a'; \theta)$ in Equation (3) should be removed. Consequently, Equation (3) becomes $y = r$.
In conclusion, Algorithm 1 presents the pseudo-code of the personalized DQN algorithm that we designed for the single-step optimization problem of well parameters.
Algorithm 1: Personalized DQN algorithm
Initialize replay memory $D$ with capacity $N$;
Initialize the action-value function $Q$ with random parameters $\theta$;
For episode = 1, $M$ do:
Initialize sequence $s_1 = \{x_1\}$ ($x_1$ is the well group image) and preprocessed sequence $\phi_1 = \phi(s_1)$;
With probability $\varepsilon$ select a random action $a_t$,
Otherwise, select the action with the largest value according to the formula $a_t = \arg\max_a Q(\phi(s_t), a; \theta)$;
Execute action $a_t$ in the environment and observe reward $r_t$ and image $x_{t+1}$;
Let $s_{t+1} = (s_t, a_t, x_{t+1})$ and obtain $\phi_{t+1} = \phi(s_{t+1})$;
Store the experience $(\phi_t, a_t, r_t)$ in the replay memory $D$;
Randomly sample a batch of data $(\phi_j, a_j, r_j)$ from $D$;
Using $r_j$ to obtain $y_j = r_j$;
Update the network parameters $\theta$ by gradient descent on $(y_j - Q(\phi_j, a_j; \theta))^2$;
End for
Output: model $Q$ with parameters $\theta$.
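A minimal PyTorch-style sketch of the training loop in Algorithm 1 is given below. It is an interpretation under stated assumptions: the Q-network is simplified relative to the Figure 3 architecture, and the synthetic well-group patches and the toy reward function merely stand in for the cropped samples and the numerical simulator.

```python
import random
import numpy as np
import torch
import torch.nn as nn

N_ACTIONS = 21                       # discretized BHP levels in [-1, 1]
q_net = nn.Sequential(               # simplified stand-in for the Figure 3 network
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Flatten(),
    nn.Linear(16 * 17 * 17, 64), nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay, eps = [], 0.1                # replay memory D and exploration probability

# Synthetic well-group patches and a toy reward stand in for the cropped samples
# and the reservoir simulator; both are illustrative assumptions.
well_group_patches = [np.random.rand(17, 17).astype("float32") for _ in range(100)]

def simulate_npv(patch, action):
    bhp = -1.0 + 2.0 * action / (N_ACTIONS - 1)      # map action index to [-1, 1]
    return float(patch.mean() - (bhp - 0.3) ** 2)    # toy NPV-like reward

for patch in well_group_patches:
    state = torch.tensor(patch)[None, None]          # shape (1, 1, H, W)
    if random.random() < eps:
        action = random.randrange(N_ACTIONS)         # explore
    else:
        action = int(q_net(state).argmax(dim=1))     # exploit current Q estimates
    reward = simulate_npv(patch, action)             # single-step problem: y = r,
    replay.append((state, action, reward))           # so no target network is used
    batch = random.sample(replay, min(32, len(replay)))
    states = torch.cat([s for s, _, _ in batch])
    actions = torch.tensor([a for _, a, _ in batch])
    targets = torch.tensor([r for _, _, r in batch])
    q_pred = q_net(states).gather(1, actions[:, None]).squeeze(1)
    loss = nn.functional.mse_loss(q_pred, targets)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```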
3.2. Construction of CNN Networks
Since this study adopts two frameworks to optimize the oil well production strategies, and the CNN network structures required by these two frameworks differ significantly, two different CNN network structures need to be built. The detailed descriptions of these two CNN network structures are as follows.
3.2.1. Network Structure 1
The first network structure, referred to as Network Structure 1, is designed specifically for Framework 1, as shown in Figure 3. There are no fixed guidelines for designing neural network architectures [29]; the design must be tailored to the specific problem and guided by empirical knowledge. In our study, we drew inspiration from the network structures employed in a related study [21] and adjusted them to the requirements of our research problem, arriving at the architecture depicted in Figure 3. This neural network comprises seven layers: two convolutional layers, four hidden layers, and one output layer. The input to the network is the well group, and the output is the BHP recommended by the algorithm. Notably, the BHP values in the output layer are restricted to the range −1 to 1 and are discretized into 21 values, corresponding to the 21 neurons in the output layer. In this discretization scheme, 0 represents the default BHP value of the well group, while 1 and −1 represent the upper and lower BHP bounds of the well group, respectively. This approach enhances the generality of the trained CNN and reduces the difficulty of training.
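The exact layer widths in Figure 3 are not reproduced here; the following PyTorch sketch merely illustrates the described shape of Network Structure 1 (two convolutional layers, four hidden fully connected layers, and a 21-neuron output over the discretized BHP range), with the channel and neuron counts chosen as assumptions:

```python
import torch
import torch.nn as nn

class NetworkStructure1(nn.Module):
    """Classification-style CNN: well-group image -> 21 discretized BHP classes."""

    def __init__(self, patch_size: int = 17, n_actions: int = 21):
        super().__init__()
        self.features = nn.Sequential(               # two convolutional layers
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        flat = 32 * patch_size * patch_size
        self.head = nn.Sequential(                   # four hidden layers + output
            nn.Linear(flat, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, n_actions),                 # logits over 21 BHP levels
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(start_dim=1))

# Illustrative forward pass on a batch of 4 well-group patches.
logits = NetworkStructure1()(torch.randn(4, 1, 17, 17))
print(logits.shape)        # torch.Size([4, 21])
```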
3.2.2. Network Structure 2
Network Structure 2 is specifically designed for Framework 2, as illustrated in Figure 4. The neural network consists of seven layers in total and has two inputs: the well group, which is fed to the convolutional layers, and the BHP of the well group, which is merged with the well group features after the two convolutions and input to the first fully connected layer. The output of the network is the NPV obtained by producing the well group at the given BHP. We designed this architecture by building on Network Structure 1 and incorporating the distinctive features of Framework 2. For a given well group, incorporating BHP as an input allows the CNN model to learn how different BHP values affect the NPV; the model captures patterns and correlations between BHP settings and the corresponding NPV outcomes, enabling it to predict the NPV from the input well group and BHP. It is important to note that we normalize the input BHP and restrict the NPV of the output layer to the range 0 to 1, where an output closer to 1 indicates a better BHP. This is done to make the trained CNN more generalizable.
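A corresponding sketch of Network Structure 2 is shown below; as above, the layer widths are assumptions, and the key point illustrated is the second input path that merges the normalized BHP with the convolutional features before the first fully connected layer, with a sigmoid output for the normalized NPV:

```python
import torch
import torch.nn as nn

class NetworkStructure2(nn.Module):
    """Regression-style CNN: (well-group image, normalized BHP) -> NPV in [0, 1]."""

    def __init__(self, patch_size: int = 17):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        flat = 32 * patch_size * patch_size
        self.head = nn.Sequential(
            nn.Linear(flat + 1, 256), nn.ReLU(),      # BHP merged after the convolutions
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),           # normalized NPV in [0, 1]
        )

    def forward(self, patch: torch.Tensor, bhp: torch.Tensor) -> torch.Tensor:
        feat = self.features(patch).flatten(start_dim=1)
        return self.head(torch.cat([feat, bhp], dim=1))

# Illustrative forward pass: 4 patches, each paired with one normalized BHP value.
model = NetworkStructure2()
npv_hat = model(torch.randn(4, 1, 17, 17), torch.rand(4, 1) * 2 - 1)
print(npv_hat.shape)       # torch.Size([4, 1])
```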
3.3. PSO Algorithm
PSO is an evolutionary computation method introduced by Eberhart and Kennedy in 1995 [30] that originated from the study of the foraging behavior of bird flocks. Each individual in a flock can be treated as a particle, and the flock can be considered a particle swarm. The detailed algorithmic procedure of PSO can be found in Marini and Walczak (2015) [31].
In this subsection, the main parameters of the PSO algorithm are presented. In this study, the independent variable is the standardized BHP, the objective function is the CNN network model in Framework 2, and the fitness function is the NPV of the model’s output. The termination condition is that the algorithm iterates 50 times. The number of particles is set to 50. The maximum velocity of the particles is set to 0.5, and the inertia factor is set to 1.0. The search space for particles corresponds to the range of standardized BHP values, i.e., −1 to 1. The individual learning factor and social learning factor are both set to 2.
These settings aim to strike a balance between exploration and exploitation during the search and reduce the risk of converging to a suboptimal solution. By utilizing 50 particles, we can explore the solution space more comprehensively. Limiting the maximum velocity helps prevent particles from making sudden large jumps, thereby enhancing the convergence toward the optimal solution. The inertia factor of 1.0 ensures a balanced contribution from the previous velocity and acceleration, facilitating progressive search. Additionally, the individual and social learning factors of 2 promote information sharing and cooperation among particles, facilitating the exploration of promising regions within the search space. This helps prevent the algorithm from getting trapped in local optima by combining insights from personal and team experiences.
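The following numpy sketch illustrates this PSO loop; `predict_npv` is a placeholder for the Framework 2 CNN model (a toy fitness function is used here), and the parameter values mirror those listed above (50 particles, 50 iterations, maximum velocity 0.5, inertia 1.0, learning factors 2):

```python
import numpy as np

def pso_optimize(predict_npv, n_wells, n_particles=50, n_iters=50,
                 v_max=0.5, w=1.0, c1=2.0, c2=2.0, lo=-1.0, hi=1.0):
    """PSO over standardized BHP vectors; fitness is the CNN-predicted NPV."""
    rng = np.random.default_rng(0)
    x = rng.uniform(lo, hi, size=(n_particles, n_wells))       # particle positions
    v = np.zeros_like(x)                                       # particle velocities
    p_best, p_val = x.copy(), np.array([predict_npv(p) for p in x])
    g_best = p_best[p_val.argmax()].copy()                     # global best position
    for _ in range(n_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        v = np.clip(v, -v_max, v_max)                          # limit step size
        x = np.clip(x + v, lo, hi)                             # stay in the search space
        vals = np.array([predict_npv(p) for p in x])
        improved = vals > p_val
        p_best[improved], p_val[improved] = x[improved], vals[improved]
        g_best = p_best[p_val.argmax()].copy()
    return g_best, p_val.max()

# Toy fitness standing in for the Framework 2 CNN: peak NPV at standardized BHP = 0.3.
best_bhp, best_npv = pso_optimize(lambda b: float(-np.sum((b - 0.3) ** 2)), n_wells=1)
print(best_bhp, best_npv)
```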
5. Case Study
5.1. Settings for Model Training and Validation
In the training phase, Training Set 1 and Training Set 2 were used to train the CNN networks in Framework 1 and Framework 2, respectively. The setting of hyperparameters, like the configuration of network structures, lacks a definitive guideline and requires consideration of the specific research questions and experiential knowledge. Therefore, based on the particular problems within these two frameworks and common initial hyperparameter settings for CNNs, we initially established the primary training parameters for the CNN networks under each framework, as presented in Table 2. It is worth noting that the selection of hyperparameters is a dynamic process that may vary depending on the specific dataset, task, and experimental conditions; these initial settings serve as a starting point and are further refined during subsequent hyperparameter tuning.
Two points need to be clarified before proceeding.
First, the CNN network structure in Framework 1 is designed for a multi-class classification problem, while the CNN network structure in Framework 2 is intended for a regression problem. Therefore, the evaluation metrics for the CNN model in Framework 1 are accuracy and loss value, while the evaluation metric for the CNN model in Framework 2 is the loss value during training.
Second, to prevent the trained models from underfitting or overfitting, we split both Training Set 1 and Training Set 2 into a training subset and a test subset at a ratio of 8:2. The model training results therefore show two curves in each figure: one for the training subset and one for the test subset.
In the model validation phase, Validation Set 1 and Validation Set 2 are used to validate the effectiveness of the CNN model in Framework 1 and of the PSO and CNN models in Framework 2, respectively. Model effectiveness is evaluated as the ratio of the number of optimal BHPs accurately recommended by the model to the total number of validation samples. It should be noted that the validation reflects the overall effect of each framework as a whole.
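For concreteness, the validation metric described above amounts to the fraction of exact matches between recommended and actual optimal BHPs; a minimal sketch (array names are assumptions) is:

```python
import numpy as np

def recommendation_accuracy(recommended_bhp: np.ndarray, optimal_bhp: np.ndarray) -> float:
    """Fraction of validation well groups whose recommended BHP matches the true optimum."""
    return float(np.mean(recommended_bhp == optimal_bhp))

# Illustrative usage with discretized BHP indices (0-20) for 5 validation samples.
print(recommendation_accuracy(np.array([3, 7, 12, 12, 20]),
                              np.array([3, 7, 11, 12, 20])))   # -> 0.8
```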
5.2. Model Training Results and Analysis
The results of training the CNN networks in Framework 1 and Framework 2 are presented in Figure 10 and Figure 11, respectively. Figure 10a depicts the training accuracy of the CNN model in Framework 1, which is consistently above 98%, and its validation accuracy, which remains around 95%. Figure 10b shows that both the training loss and the validation loss of the CNN model decrease as training progresses. The CNN model in Framework 1 was trained for 29 s. After training, the training-set loss of the CNN model is approximately 0.15, and the validation-set loss is around 0.05.
As shown in Figure 11, the training and validation losses of the CNN model in Framework 2 decrease sharply at the beginning of training and then gradually stabilize. The model training time for this framework is 35 s, slightly longer than that of Framework 1. Both the training and validation losses of the CNN model are below 0.01 at the end of training.
5.3. Model Validation Results and Analysis
This section presents the comparison between the optimal BHP values recommended by the Framework 1 and Framework 2 models and the actual optimal BHP values. The comparison results are shown in Figure 12 and Figure 13, each of which has two subplots. In Figure 12a and Figure 13a, the x-axis corresponds to the actual optimal BHP value and the y-axis to the model-predicted optimal BHP value; the closer the data points lie to the diagonal line, the better the consistency between predicted and actual values. Figure 12b and Figure 13b are residual plots, where the x-axis is the data point index and the y-axis is the residual; a black line marks zero residual, and the closer the residual points are to this line, the better the consistency between predicted and actual values. Both types of subplot convey the same information in different forms. For clarity, only 100 comparison results are presented in each figure.
From Figure 12, it can be seen that only a few data points deviate from the diagonal line in Figure 12a, and only a few residual points are off the black line in Figure 12b. This indicates that the optimal BHP values recommended by the Framework 1 model are highly consistent with the actual optimal BHP values; the optimal BHP output accuracy of this model reaches 82% over the entire Validation Set 1. In contrast, Figure 13 shows that the agreement between the optimal BHP values recommended by the Framework 2 model and the actual optimal BHP values is lower, although most data points are still close to the diagonal line. The optimal BHP output accuracy of this model is 76% over the entire Validation Set 2.
After comparing the results of the two frameworks, it is evident that the model in Framework 1 outperforms the model in Framework 2 in terms of predicting the optimal BHP values. Additionally, it is worth emphasizing the superior computational efficiency exhibited by the model in Framework 1, as it completes the optimization task within 1 s, whereas the model in Framework 2 requires approximately 8 s. These findings underscore the advantages of the model in Framework 1, which achieves higher accuracy and faster execution, making it a more favorable choice for practical implementation.
To evaluate the performance of the two frameworks on the whole reservoir, we used both frameworks to recommend BHP values for all well groups in the S1 model and produced the entire reservoir according to these recommended BHP values.
The results revealed that Framework 1, with its recommended BHP values, generated a total profit of 11,522,300 USD for the entire reservoir production. On the other hand, Framework 2, with its recommended BHP values, resulted in a profit of 10,160,900 USD. In comparison, the baseline BHP value of 155 bar led to a loss of 701,533,900 USD. These findings clearly demonstrate the effectiveness of both Framework 1 and Framework 2 in optimizing the production of well groups, thereby significantly enhancing the economic performance of the entire reservoir. In terms of NPV, Framework 1 outperformed Framework 2, indicating its superior ability to maximize economic returns.
5.4. Comparison of Production Optimization Framework
This paper proposes two optimization frameworks for optimizing oil reservoir production strategies based on Convolutional Neural Networks (CNNs). In order to demonstrate the superiority of the proposed methods, they are compared with the PSO algorithm mentioned in the literature. Both the proposed frameworks and the PSO algorithm are applied to optimize a randomly selected set of 100 validation samples. The comparison is conducted based on the accuracy of the algorithms and the runtime per iteration. The accuracy is defined as the ratio of the number of wells for which the algorithms find the optimal BHP to the total number of wells requiring BHP optimization.
Regarding the parameter settings of the PSO algorithm in this comparison, the main settings are as follows: the independent variable is the BHP, and the fitness function is the NPV obtained by numerical simulation. The termination condition is that the algorithm iterates 100 times. The number of particles is set to 100, the maximum particle velocity is set to 0.5, and the inertia factor is set to 1.0. The search space for the particles corresponds to the range of standardized BHP values, i.e., −1 to 1. Both the individual learning factor and the social learning factor are set to 2. The comparative results are summarized in Table 3.
While PSO has proven effective in finding optimal solutions, the production strategies it generates must be evaluated with a numerical reservoir simulator, which is time-consuming and computationally intensive. In contrast, the proposed CNN-based frameworks have several advantages. Firstly, they decompose the reservoir into independent production units, reducing computational complexity and improving optimization efficiency. Secondly, image augmentation techniques increase sample size and diversity, improving the accuracy and robustness of the proposed methods. Thirdly, the CNNs can efficiently and accurately predict optimal control strategies.
Experimental results show that the proposed frameworks outperform the PSO-based method in terms of accuracy and computational efficiency. Specifically, Framework 1 achieves 87% accuracy in outputting the optimal BHP for new well groups, while Framework 2 achieves 78%. The runtime for each iteration in both frameworks is less than 1 s.
In conclusion, the proposed CNN-based frameworks provide an effective and accurate method for optimizing oil reservoir production. They have several advantages over PSO-based methods, including reduced computational complexity, improved accuracy and robustness, and faster running time.
6. Conclusions
In this study, we have developed two frameworks based on Convolutional Neural Networks (CNNs) for optimizing production strategies in oil reservoirs. These frameworks utilize a personalized DQN algorithm, embedded CNN network architecture, and PSO algorithm. Our approach has achieved significant success in improving the accuracy of predicting optimal BHP and optimizing oil reservoir production.
To enhance the performance of the models, we conducted hyperparameter optimization by modifying the network structure and adjusting the hyperparameters. The selected hyperparameter combinations were determined through an exhaustive search, resulting in improved accuracy and stability compared to the default parameters. We have also presented training and validation results, demonstrating the superiority of the models trained with the optimal hyperparameters over those without hyperparameter tuning.
Compared to traditional methods such as PSO, CNN-based frameworks offer several advantages. By decomposing the reservoir into independent production units and employing image augmentation techniques, we have reduced the computational complexity and improved the accuracy and robustness of the models. The CNNs have effectively predicted optimal control strategies for production optimization.
Our experimental results demonstrate that the proposed frameworks outperform the PSO-based method in terms of accuracy and computational efficiency. Framework 1 achieves an output accuracy of 87% for the optimal BHP of new well groups, while Framework 2 achieves 78%. Notably, both frameworks exhibit fast running times of less than 1 s per iteration.
In future research, two main areas deserve attention and further investigation. The first area involves conducting extensive experiments to explore a wider range of hyperparameter tuning. This exploration is valuable for determining the optimal combination that maximizes the accuracy of the model. The second area of focus is expanding the dataset to include more data from three-dimensional reservoirs. Currently, the research primarily concentrates on optimizing production in two-dimensional reservoirs. By incorporating data from three-dimensional reservoirs into the dataset, the trained model may exhibit improved applicability to such reservoirs.