Data-Driven Geothermal Reservoir Modeling: Estimating Permeability Distributions by Machine Learning
Round 1
Reviewer 1 Report
Thanks for your interesting work. In the following, you will find my suggestions for possible improvements:
I strongly recommend rewriting the abstract. Line 2: geothermal modelling is not crucial to developing strategies for the future development of society. Most countries do not have geothermal projects, but they have their own strategies for the development of society.
Line 3: It is not clear to me that we need tens of thousands of parameters for the modelling. Do we have such a high number of parameters?
Line 7: What is the relationship between the required input parameters estimation and permeability field? You are working just on the permeability field, not all parameters! Please see these papers for more insights regarding the required parameters (Thermo-hydro-mechanical modeling of an Enhanced geothermal system in a fractured reservoir using CO2 as heat transmission fluid- A sensitivity investigation; Key parameters affecting the performance of fractured geothermal reservoirs: a sensitivity analysis by thermo-hydraulic-mechanical simulation). Permeability is an important parameter, but there are other parameters that control the process. Please mention this.
Line 11: It is better to mention the algorithms here.
Line 15: It is better to discuss with numbers and statistical measures when you are talking about the accuracy of the model.
Line 24: In the literature, COMSOL is gaining interest for geothermal reservoir simulation. You can mention it here besides the other packages. See this paper from this journal as an example (Hydro-Thermal Modeling for Geothermal Energy Extraction from Soultz-sous-Forêts, France).
Line 48: You need a space between "It" and "is".
Line 73 and 74: It is not clear.
Line 75 and 76: These sentences should be connected to each other.
Line 77: You should start the sentence with a capital character "Estimated".
Line 87: "permeability field" instead of "permeability".
Line 89: If CNN is an abbreviation, please define it.
Line 108: In reality, the observed pressure and temperature are functions of the permeability field.
Figure 1 should be revised. Please use boundary conditions instead of "B.C.". The text is outside the defined shapes. Subplots should be defined in the caption.
Line 111: "the numerical simulator" not "a numerical simulator".
Please consider that you are neglecting the mechanical and chemical aspects of the geothermal systems. In other words, you are using TH in 2D instead of the real field THMC in 3D. For the first attempt, it is fine and it could be enhanced in the future. But, it should be mentioned somewhere in the text.
Equation 1: All symbols and constants should be defined.
You need to define your input parameters for the numerical simulations. For example, the values of porosity, thermal conductivity, specific heat capacity...
I was confused. How do we want to use this study in the field? How many pressure and temperature points do we have in the field?
Please mention all subplots in the text.
Line 386: you did not check the different boundary conditions. You examined the different positions of the source and sink! Boundary condition can be constant flux, variable flux....
You mentioned the main shortcoming of your study in the discussion section. For geothermal applications, unlike oil and gas reservoirs, we cannot drill more wells (due to the cost). What is your suggestion to handle this issue?
Line 452: "we" instead of "We".
Author Response
We are very grateful for your very kind and useful comments. We revised our manuscript and respond in red to your comments in black as follows:
----------------------
Thanks for your interesting work. In the following, you will find my suggestions for possible improvements:
I strongly recommend rewriting the abstract. Line 2: geothermal modelling is not crucial to developing strategies for the future development of society. Most countries do not have geothermal projects, but they have their own strategies for the development of society.
Thank you for the comment. “Development of society” was out of place. We deleted it.
Line 3: It is not clear to me that we need tens of thousands of parameters for the modelling. Do we have such a high number of parameters?
We meant that the governing equations in the simulation consist of several parameters, and that each parameter is given on every simulation grid block. The input parameters we need to estimate are therefore the combination of the physical parameters in the governing equations and the number of grid blocks. In some cases, the number of parameters to estimate can be large. However, there is no need to give a specific number here, so we modified the text as follows:
Line 2 “The governing equations in the geothermal reservoir models are consistent of several constitutive parameters and each parameter is given on a large number of grids. The combinations of the parameters we need to estimate are almost limitless.”
Line 7: What is the relationship between the required input parameters estimation and permeability field? You are working just on the permeability field, not all parameters! Please see these papers for more insights regarding the required parameters (Thermo-hydro-mechanical modeling of an Enhanced geothermal system in a fractured reservoir using CO2 as heat transmission fluid- A sensitivity investigation; Key parameters affecting the performance of fractured geothermal reservoirs: a sensitivity analysis by thermo-hydraulic-mechanical simulation). Permeability is an important parameter, but there are other parameters that control the process. Please mention this.
As you pointed out, permeability is just one of the important parameters. Reflecting your comment, we revised the sentence as follows:
Line 5 “There are several parameters which controls the hydrothermal processes in the geothermal reservoir modeling. In this study, as an initial challenge, we focus on permeability, which is one of the most important parameters for the modeling. “
Line 11: It is better to mention the algorithms here.
We added the algorithms as:
Line 13 “Several machine learning algorithms (i.e, linear regression, Ridge regression, Lasso regression, Support Vector Regression (SVR), Multi-Layer Perceptron (MLP), random forest, gradient boosting, and the k-nearest neighbor algorithm)”
Line 15: It is better to discuss with numbers and statistical measures when you are talking about the accuracy of the model.
We added the numbers as:
Line 16 “By comparing the feature importance and the scores of estimations, random forest using pressure differences as feature variables provided the best estimation (the training score of 0.979 and the test score of 0.789).”
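As an illustration of how such training and test scores can be read, the sketch below assumes the scores are coefficients of determination (R²); the quoted text does not state the metric, and the function name and sample values are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is the metric behind the reported scores;
    the quoted manuscript text does not name it.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true has zero variance; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical log-permeability values for three grid cells.
observed = [1.0, 2.0, 3.0]
perfect_fit = r2_score(observed, [1.0, 2.0, 3.0])    # exact predictions
mean_only_fit = r2_score(observed, [2.0, 2.0, 2.0])  # always predicts the mean
```

Under this assumption, a score of 1 means exact predictions and a score of 0 means the model does no better than predicting the mean, so a gap between a training score near 1 and a lower test score suggests some overfitting to the training simulations.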
Line 24: In the literature, COMSOL is gaining interest for geothermal reservoir simulation. You can mention it here besides the other packages. See this paper from this journal as an example (Hydro-Thermal Modeling for Geothermal Energy Extraction from Soultz-sous-Forêts, France).
We added COMSOL and the reference in Line 29.
Line 48: You need a space between "It" and "is".
Thank you for finding this typo. Amended.
Line 73 and 74: It is not clear.
The sentence is not important. Thus, we deleted the sentence.
Line 75 and 76: These sentences should be connected to each other.
As you suggested, we connected these sentences (Line 78)
Line 77: You should start the sentence with a capital character "Estimated".
We apologize. We intended “Assouline et al.” to be the subject, but it was dropped by a LaTeX command. We have also corrected other sentences whose subject was missing for the same reason. (From Line 76)
Line 87: "permeability field" instead of "permeability".
Amended. (Line 91)
Line 89: If CNN is an abbreviation, please define it.
Used “convolutional neural network” (Line 93)
Line 108: In reality, the observed pressure and temperature are functions of the permeability field.
As you suggested, we deleted “described as” and wrote the sentence as:
Line 111 “Since the permeability K is a function of the observed data Pobs, Tobs, it may be possible to derive the permeability in a single simulation, without the requirement for many iterations.”
Figure 1 should be revised. Please use boundary conditions instead of "B.C.". The text is outside the defined shapes. Subplots should be defined in the caption.
As Reviewer 2 also pointed out, this figure was vague, and the content can be explained without it, so we deleted the figure.
Line 111: "the numerical simulator" not "a numerical simulator".
Revised. (Line 114)
Please consider that you are neglecting the mechanical and chemical aspects of the geothermal systems. In other words, you are using TH in 2D instead of the real field THMC in 3D. For the first attempt, it is fine and it could be enhanced in the future. But, it should be mentioned somewhere in the text.
We noted this point in the Introduction and Discussion sections.
Line 119 “It should be noted that our analysis is limited to 2D thermo-hydraulic simulation as a first step to develop the machine learning approach and that 3D thermo-hydraulic-mechanical-chemical simulations is needed to make it available for real field development in the future research.”
Line 437 “In addition, our simulation was limited to thermo-hydraulic simulation as a first step to develop the machine learning approach. To make our analysis available for real field development, 3D thermo-hydraulic-mechanical-chemical simulations is needed in the next research.”
Equation 1: All symbols and constants should be defined.
We explained all symbols and constants in Eq. 1.
You need to define your input parameters for the numerical simulations. For example, the values of porosity, thermal conductivity, specific heat capacity...
We added Table 1 to list the parameters.
I was confused. How do we want to use this study in the field? How many pressure and temperature points do we have in the field?
We assume that measurements can be taken at multiple points and interpolated; in this 2D case, the method can be applied by constructing a surface from at least three data points. This point was added to the discussion.
Line 423 “Geothermal developments can not drill a large number of wells due to the drilling cost. Thus, the pressure data in the natural state is only available for the discrete data from the limited wells. The learning model proposed in this study, however, requires two-dimensional pressure distributions. If we apply our method to field data, it requires to measure the pressure from at least three points surrounding the target surface for the 2D estimation and to interpolate the discrete data by interpolation techniques, such as kriging (e.g., \cite{teng2007}). Next research will examine the estimation errors when interpolations are performed.”
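The three-point minimum can be illustrated with the simplest possible surface, a plane through three measurements. This is a sketch only: planar interpolation is used here as a simpler stand-in for kriging, and the function names, well coordinates, and pressure values are hypothetical.

```python
def plane_through(p1, p2, p3):
    """Return (a, b, c) such that P(x, y) = a*x + b*y + c passes
    through three (x, y, pressure) points."""
    (x1, y1, v1), (x2, y2, v2), (x3, y3, v3) = p1, p2, p3
    # Determinant of the 2x2 system formed by the two edge vectors.
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if det == 0:
        raise ValueError("the three points are collinear; no unique plane")
    a = ((v2 - v1) * (y3 - y1) - (v3 - v1) * (y2 - y1)) / det
    b = ((x2 - x1) * (v3 - v1) - (x3 - x1) * (v2 - v1)) / det
    c = v1 - a * x1 - b * y1
    return a, b, c


def interpolate(coeffs, x, y):
    """Evaluate the fitted plane at (x, y)."""
    a, b, c = coeffs
    return a * x + b * y + c


# Hypothetical wells: (x, y, pressure) in arbitrary units.
wells = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (0.0, 1.0, 13.0)]
coeffs = plane_through(*wells)
p_mid = interpolate(coeffs, 0.5, 0.5)
```

Three non-collinear points determine the plane uniquely, which is why fewer points cannot define a 2D pressure surface. A plane cannot represent curvature, so with more wells a method such as kriging, as named in the response, would be the more realistic choice.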
Please mention all subplots in the text.
Amended.
Line 386: you did not check the different boundary conditions. You examined the different positions of the source and sink! Boundary condition can be constant flux, variable flux....
We changed to “different positions of source and sink”. (L319, L400)
You mentioned the main shortcoming of your study in the discussion section. For geothermal applications, unlike oil and gas reservoirs, we cannot drill more wells (due to the cost). What is your suggestion to handle this issue?
We think this is a very important comment for our paper. We consider uncertainty quantification of estimates made from limited measurement data to be important for geothermal applications, and added the following sentences.
Line 431 “In addition, when not many data points are available, as in the case of geothermal development, it is important to evaluate carefully how uncertain the measurements based on the data are. To evaluate the impacts of the uncertainties on the estimation, uncertainty quantification methods, such as Bayesian approximation and ensemble learning techniques (e.g., \cite{JIANG2021102262}), play some pivotal roles. Combining uncertainty quantification with our approach is also desirable in next study.”
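One simple way to read the ensemble idea is to treat the spread of predictions across ensemble members as an uncertainty indicator for each grid cell. The sketch below assumes this interpretation; it is not the authors' method, and the function name and prediction values are hypothetical.

```python
from statistics import mean, stdev


def ensemble_summary(member_predictions):
    """Summarize an ensemble cell by cell.

    member_predictions: one list of predicted values per ensemble
    member, all of the same length (one value per grid cell).
    Returns a list of (mean, sample standard deviation) per cell.
    """
    if len(member_predictions) < 2:
        raise ValueError("need at least two ensemble members")
    # Transpose so each tuple holds every member's value for one cell.
    per_cell = list(zip(*member_predictions))
    return [(mean(values), stdev(values)) for values in per_cell]


# Hypothetical log-permeability predictions from two members, two cells.
members = [
    [1.0, 2.0],
    [3.0, 2.0],
]
summary = ensemble_summary(members)
```

Cells where the members disagree get a large standard deviation and flag locations where the estimate is less trustworthy, for example far from any well.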
Line 452: "we" instead of "We".
Amended.
Reviewer 2 Report
The paper undertakes a problem of importance, namely the matching of a numerical model to the natural state of the geothermal reservoir, which is a task that often takes months of work. The study investigated the use of machine learning approaches to achieve this, and examines the different types of machine learning algorithm that can be used.
I do have some observations:
- The paper does not make a direct comparison of performance with automated matching approaches, such as iTOUGH2.
- There is no quantitative estimation of the uncertainty of the estimated parameters.
- Figure 1 seems superfluous and I found it hard to understand its meaning or intent. It may be useful in a PPT file where the speaker can describe its meaning, but it doesn't seem to add to the paper.
- There are a few typos, such as Rasso instead of Lasso in Figure 5.
Author Response
We are very grateful for your very kind and useful comments. We revised our manuscript and respond in red to your comments in black as follows:
----------------------
The paper undertakes a problem of importance, namely the matching of a numerical model to the natural state of the geothermal reservoir, which is a task that often takes months of work. The study investigated the use of machine learning approaches to achieve this, and examines the different types of machine learning algorithm that can be used.
I do have some observations:
- The paper does not make a direct comparison of performance with automated matching approaches, such as iTOUGH2.
In response to your comment, we added the following sentence.
Line 409 “Of course, inverse analysis methods developed in the past (e.g., iTOUGH2 \cite{Finsterle}) could also provide good estimates. Although we have not compared the accuracy of the estimation between inverse analysis methods and our approach, the estimation accuracy can be better in both cases depending on the optimization. On the other hand, the good point of our approach is that once the learning model is created, it does not need to be computed over and over again, and it can be used even on computers with low specifications. This would help to expand the spread of small fields, which cannot be developed with large amounts of equipment.”
- There is no quantitative estimation of the uncertainty of the estimated parameters.
In response to your comment, we added the following sentence.
Line 431 “In addition, when not many data points are available, as in the case of geothermal development, it is important to evaluate carefully how uncertain the measurements based on the data are. To evaluate the impacts of the uncertainties on the estimation, uncertainty quantification methods, such as Bayesian approximation and ensemble learning techniques (e.g., \cite{JIANG2021102262}), play some pivotal roles. Combining uncertainty quantification with our approach is also desirable in next study.”
- Figure 1 seems superfluous and I found it hard to understand its meaning or intent. It may be useful in a PPT file where the speaker can describe its meaning, but it doesn't seem to add to the paper.
Thank you for the comment. We eliminated the figure.
- There are a few typos, such as Rasso instead of Lasso in Figure 5.
Thank you for finding the typo.
Round 2
Reviewer 1 Report
Thanks for considering my suggestions.
Author Response
Thank you very much for your kind review. We revised some of the English as pointed out by the editor.