Article
Peer-Review Record

Machine Learning Approach to Predict the Illite Weight Percent of Unconventional Reservoirs from Well-Log Data: An Example from Montney Formation, NE British Columbia, Canada

Appl. Sci. 2024, 14(1), 318; https://doi.org/10.3390/app14010318
by Azzam Barham and Nor Syazwani Zainal Abidin *
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 15 August 2023 / Revised: 10 December 2023 / Accepted: 11 December 2023 / Published: 29 December 2023

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

General comment:

The manuscript presents a case study on employing a machine learning approach to predict the weight percentage of illite using conventional logs from the Montney formation. The authors use a feed-forward neural network to build the predictive model and principal component analysis for exploratory data analysis. The authors are motivated to develop this approach by the challenges associated with core-based analyses and the limitations of logging instruments, which cannot directly measure illite.

The quality of writing can be greatly improved, and some sentences can be made clearer. The background information on geology, mineralogy and domain aspects is good, but the descriptions of the machine learning approach and the terminology used (especially in section 4) deviate strongly from the standard and are very confusing to read. I suggest that the authors reference a standard machine learning textbook (e.g., Pattern Recognition and Machine Learning by Christopher M. Bishop). A major revision is needed before the manuscript can be further considered. Please see my comments below.

 

Comments:

Page 3, Line 116-119: These sentences are not clear. “The primary cause is…” What primary cause are the authors referring to?

 

Page 3, Line 120-122: These sentences can be improved or expanded for clarity. I suggest rephrasing “computer-assisted history matching”, as history matching can refer to geologic model calibration. Please expand on “use of inconsistent elements” and explain what these elements are (i.e., the neural network elements, or elements in the data?).

 

Page 3, Line 124-127: Please expand on “network capacity allocation known as associate adjustive training” so the general readers of the journal can benefit from this manuscript. Are the authors trying to motivate the use of ANN by including these discussions between parametric and non-parametric technique? If so, please mention that in the manuscript.

 

Page 3, Figure 1: I suggest using original artworks for best figure quality (i.e., not distorted).

 

Page 4, Figure 2: Please correct typos (e.g., fine tunning) in the figure and upload figures of at least 300 dpi.

 

Page 4, Line 143: “whole quantitative rock…” perhaps the authors meant “whole quantitative rock analysis”?

 

Page 4, Line 146: unnecessary parentheses

 

Page 4, Line 151: “falls” -> “fall”

 

Page 5, Line 156-164: This whole paragraph is hard to understand. What are the authors trying to imply here: “Due to the fact that prediction errors tend to be considered if the model is executed and the prediction error is substantial, it is crucial to guarantee that all of the utilized data fall within the same range”? Are the authors trying to say that the testing data should fall within the same range as the training data?
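
If the intended meaning is indeed that the test data should fall within the range of the training data, the check can be sketched as below. This is only an illustration of that reading, not the authors' procedure: the function name, the log mnemonics (`GR`, `RHOB`) and all values are hypothetical.

```python
def features_outside_training_range(train_rows, test_rows):
    """Count, per feature, the test samples lying outside the training min/max."""
    flagged = {}
    for name in train_rows[0]:
        lo = min(r[name] for r in train_rows)
        hi = max(r[name] for r in train_rows)
        n_out = sum(1 for r in test_rows if not lo <= r[name] <= hi)
        if n_out:
            flagged[name] = n_out
    return flagged


# Hypothetical log readings (illustrative values only).
train_rows = [
    {"GR": 80.0, "RHOB": 2.55},
    {"GR": 120.0, "RHOB": 2.65},
    {"GR": 100.0, "RHOB": 2.60},
]
test_rows = [
    {"GR": 90.0, "RHOB": 2.58},
    {"GR": 150.0, "RHOB": 2.62},  # GR exceeds the training maximum of 120
]

flagged = features_outside_training_range(train_rows, test_rows)
print(flagged)
```

A non-empty result would mean the network is being asked to extrapolate for those features, which is where large prediction errors are most likely.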

 

Page 5, Line 166-177: Again, this whole paragraph is also hard to understand. For example, in “the strength of each neuron in an input form is represented by a continuous variable in a neural network” – this is the first time the authors mentioned the term “neuron” without first explaining what a neuron is. Does “continuous variable” refer to the output of a neuron?

 

Page 6, Line 185-186: This sentence is inaccurate “The ability of a neural network to generalize to novel patterns is irrelevant to its overall performance evaluation”. It should read something like “A neural network is evaluated on its ability to generalize to novel patterns, achieved using a validation dataset”?

 

Page 6, Line 189: What “group” are the authors referring to? Please keep the terms consistent.

 

Page 6, Line 189-190: What is “manufacturing setting”?

 

Page 6, Line 190-191: This line “The test set can be efficiently constructed by removing 20-30% of the patterns from the training set” is confusing. We typically have a dataset that is split 70%:30% (or some other ratios) into training and testing sets. Saying that you remove 20-30% from the training set is inaccurate.
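
The conventional description, in which the full dataset is partitioned once into disjoint training and testing sets, can be sketched as follows. This is a minimal illustration assuming a 70%:30% ratio and random shuffling; the function name and the seed are hypothetical and not taken from the manuscript.

```python
import random


def split_rows(rows, test_fraction=0.3, seed=0):
    """Shuffle the full dataset once, then cut it into disjoint train and test sets."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled rows become the test set; the rest is training data.
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-in for 100 data rows
train_rows, test_rows = split_rows(rows, test_fraction=0.3, seed=42)
print(len(train_rows), len(test_rows))
```

Both subsets come from the same original dataset, so neither is "removed from the training set"; the training set is simply whatever remains after the split.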

 

Page 6, Line 202: What is “classification mining”? Does not need to mention classification here, since the authors are not performing classification in this work.

 

Page 6: The manuscript will benefit from a Problem Definition section where the variables used are introduced (e.g., m is the size of the training dataset? Therefore x_1, x_2, x_3, …, x_m?). In this section, the authors should explain what exactly the input and output are.

 

Page 7, Table 1: I suggest that the authors plot a histogram for each of the input features in the data as such visual aid is very useful to describe a dataset.

 

More explanation of the dataset is needed. What constitutes a row of training data? Is it the sample points from the petrophysical logs that are at the same depth as the obtained core (that was used to obtain the true illite wt% through core analysis)?

 

Page 9, Figure 5: I suggest using an original artwork. A quick Google search shows that this is a screenshot from a pane in the Matlab Neural Network Toolbox.

 

Page 9, Line 268-274: There seems to be a confusion on what is defined by training, validation, and testing datasets. The authors mentioned monitoring the errors on the training and testing datasets during the training process. In practice, we monitor the training and validation datasets. Here is a quick reference that follows the standard: https://en.wikipedia.org/wiki/Training,_validation,_and_test_data_sets
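
As an illustration of the standard convention, the sketch below monitors the validation error during training and stops once it has not improved for a set number of epochs; the test set plays no part in this loop. The function name, the `patience` value and the error values are hypothetical.

```python
def best_epoch_with_early_stopping(val_errors, patience=3):
    """Return (stop_epoch, best_epoch) given the validation error at each epoch."""
    if not val_errors:
        raise ValueError("val_errors must not be empty")
    best_epoch = 0
    best_error = float("inf")
    for epoch, err in enumerate(val_errors):
        if err < best_error:
            best_error = err
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop and keep the best weights.
            return epoch, best_epoch
    return len(val_errors) - 1, best_epoch


# Hypothetical validation errors: they fall, then rise as the network overfits.
val_errors = [0.90, 0.60, 0.45, 0.40, 0.42, 0.43, 0.47, 0.55]
stop_epoch, best_epoch = best_epoch_with_early_stopping(val_errors, patience=3)
print(stop_epoch, best_epoch)
```

The test error is computed only once, after training has stopped, on data that influenced neither the weights nor the stopping decision.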

 

Page 10, Line 283-285: This line is confusing, please rewrite to follow the standard convention and update the rest of the manuscript accordingly. E.g., “In order to determine the illite wt.%, the network was trained and then subjected to test data from three wells that were not used during training or validation”.
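
Holding out entire wells, as the suggested wording implies, can be sketched as a split by well name rather than by individual sample. The field names (`well`, `depth_m`, `illite_wt_pct`), well names and values below are hypothetical and serve only to illustrate the idea.

```python
def split_by_well(rows, test_wells):
    """Hold out every row from the named wells; the rest is for training/validation."""
    held_out = set(test_wells)
    train_val = [r for r in rows if r["well"] not in held_out]
    test = [r for r in rows if r["well"] in held_out]
    return train_val, test


# Hypothetical rows: one row per depth sample in each well (illustrative values).
rows = [
    {"well": "W1", "depth_m": 2000.0, "illite_wt_pct": 10.0},
    {"well": "W1", "depth_m": 2000.5, "illite_wt_pct": 11.0},
    {"well": "W2", "depth_m": 2100.0, "illite_wt_pct": 8.0},
    {"well": "W3", "depth_m": 2200.0, "illite_wt_pct": 12.0},
    {"well": "W4", "depth_m": 2300.0, "illite_wt_pct": 9.0},
]
train_val_rows, test_rows = split_by_well(rows, test_wells=["W3", "W4"])
print(len(train_val_rows), len(test_rows))
```

Splitting by well prevents neighbouring depth samples from the same well from appearing on both sides of the split, which would make the test error look better than it is.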

 

Page 11, Line 303: What exactly is the input into the neural network? Is it the well log and XRD data, or is it the PCA representation of them?

 

Page 11, Line 307: “a significant advantage over other methods” This conclusion cannot be drawn since no comparison to other methods was done in this work.

 

Page 11, Line 310: “These findings show that the model that was designed was great.” Please rephrase.

 

Page 11, Line 311: The manuscript can benefit from a more balanced discussion on the pros and cons of the method.

Comments on the Quality of English Language

There were minor grammatical errors but sentence structures can be greatly improved.

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

The authors of the manuscript entitled:

“Machine Learning Approach to Predict the Illite Weight Percent of Unconventional Reservoirs: An Example from Montney Formation, NE British Columbia, Canada” have covered an important topic in unconventional reservoir characterization: predicting the weight percent of a type of clay mineral using machine learning (ML). ML is currently a popular approach for predicting the mineralogy of unconventional reservoirs.

However, on reading this manuscript, we have some minor remarks:

1 - The title should be modified by adding the type of data: “from well-logs data”.

2 - The reference list is substantial, but it should be updated with recent references (2022 and 2023) on ML and AI (for example, see the publications of Sid-Ali OUADFEUL and Leila ALIOUANE).

In conclusion, the paper is well written and the results are significant. Good job!

 

Minor revision

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

Dear respected authors,

Please find attached my comments on the manuscript (applsci-2585758).

Comments for author File: Comments.pdf

Comments on the Quality of English Language

The paper has several typos and language issues and warrants a thorough language audit.

Author Response

Please see the attachment

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I would like to thank the authors for seriously considering my suggestions. The paper is easier to follow now. I do have some minor remaining comments below. Thank you. 

 

Page 6, Line 209: “into three categories: training, production, and test…” What is the production data, since it is not mentioned in the paper? Is that supposed to be validation data?

 

Page 7, Line 243: “...and nonlinear classification techniques.” Please explain why there is a mention of nonlinear classification when the loss function used in this work is clearly one of regression. 

 

Page 8, Line 277-278: “Dimensional reduction evaluation PCA is considered one of the oldest algorithms and most utilized” Statements like this are not scientific. Please instead include citations or examples of when it is utilized, especially in petroleum engineering, to show great utilization. For example: 

https://doi.org/10.1007/s10596-013-9343-5 used PCA to determine which prior geologic scenarios are feasible for building static models.

https://doi.org/10.1007/s10596-020-09971-4 used PCA to reduce the dimension of static models and then perform history matching with convolutional neural network by mapping the dynamic data to the PCA coefficients of the static models. 

https://doi.org/10.30632/PJV64N2-2023a2 used PCA to generate a new data set of independent features that are then used to generate class-based permeability-porosity models.

 

Page 9, Line 286-289: Please further specify that the input is actually the petrophysical properties where each input row is per unit depth. 

 

I think the authors should highlight that the PCA representations were not used as input into the neural network. I noticed that the other technical reviewer (and myself) got the impression that the PCA representations were used as input. In this paper, it was just used as a means to explore the data.
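
The distinction can be sketched as follows: PCA is run on the input features only to inspect how much variance a few components capture, while the network still receives the original features. This is a minimal illustration using power iteration for the first component, not the authors' workflow; the function name and the data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (direction, explained_variance_ratio) of the first PC.

    Uses power iteration on the sample covariance matrix. Assumes the data
    are not constant and the start vector is not orthogonal to the first PC.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(v[i] * cov_v[i] for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


# Hypothetical two-feature data in which the second feature is twice the first.
features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, ratio = first_principal_component(features)
print(direction, ratio)

# Exploration only: the network input remains the original feature rows.
network_input = features
```

Here the explained-variance ratio is used purely as a diagnostic of redundancy among the features; nothing derived from the PCA is passed on to the model.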

Comments on the Quality of English Language

Please re-check for grammar and consistency. 

Author Response

Dear esteemed reviewer, We express our sincere gratitude for your valuable insights and feedback. We have carefully considered your comments and have made every effort to address them adequately in our attached reply.

Page 6, Line 209: “into three categories: training, production, and test…” What is the production data, since it is not mentioned in the paper? Is that supposed to be validation data?

 -Done. Please follow the green highlighted line 207.

Page 7, Line 243: “...and nonlinear classification techniques.” Please explain why there is a mention of nonlinear classification when the loss function used in this work is clearly one of regression. 

Yes, we depend on the advantage of the ANN model in capturing the nonlinearity in the data while testing and validating the model. However, when comparing the predicted results to the original, we depend on the linear regression to check.

 

Page 8, Line 277-278: “Dimensional reduction evaluation PCA is considered one of the oldest algorithms and most utilized” Statements like this are not scientific. Please instead include citations or examples of when it is utilized, especially in petroleum engineering, to show great utilization. For example: 

https://doi.org/10.1007/s10596-013-9343-5 used PCA to determine which prior geologic scenarios are feasible for building static models.

https://doi.org/10.1007/s10596-020-09971-4 used PCA to reduce the dimension of static models and then perform history matching with convolutional neural network by mapping the dynamic data to the PCA coefficients of the static models. 

https://doi.org/10.30632/PJV64N2-2023a2 used PCA to generate a new data set of independent features that are then used to generate class-based permeability-porosity models.

-Please follow the green-highlighted lines 273-279

Page 9, Line 286-289: Please specify that the input is the petrophysical properties where each input row is per unit depth. 

-Please follow the green-highlighted lines 283-288

I think the authors should highlight that the PCA representations were not used as input into the neural network. I noticed that the other technical reviewer (and myself) got the impression that the PCA representations were used as input. In this paper, it was just used as a means to explore the data.

 

Resolved; please follow the green highlighted lines 278–280.

Comments on the Quality of English Language

Please re-check for grammar and consistency.

Proofreading for the whole manuscript has been done.

Reviewer 3 Report

Comments and Suggestions for Authors

Unfortunately, the authors did not address my concerns in their revised version. One particular concern I mentioned was the lack of weights and bias data for their developed ANN model, which raises questions about the reproducibility of their work. However, the authors' response suggested that these details are considered sensitive information that can only be shared after publication. I find this position unreasonable, as readers need access to such information in order to evaluate the reproducibility of the model accurately. Hence, I have doubts regarding the reliability of the presented model, as it lacks important information such as the weights and biases table, which most neural network platforms, such as MATLAB, typically provide.

Accordingly, I believe the study does not deserve publication.
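
To illustrate why a weights and biases table matters for reproducibility: with such a table, a reader can recompute the model's predictions without the original software. The sketch below assumes a single hidden layer with tanh activation and a linear output, which is an assumption and not a detail confirmed in this record; all weight values are hypothetical.

```python
import math


def predict(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer network (tanh hidden, linear output)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical table: 2 inputs, 2 hidden neurons, 1 output (illustrative values).
w_hidden = [[0.5, -0.25], [0.0, 1.0]]  # one row of input weights per hidden neuron
b_hidden = [0.0, 0.0]
w_out = [2.0, 1.0]
b_out = 0.5

y_zero = predict([0.0, 0.0], w_hidden, b_hidden, w_out, b_out)
y_other = predict([1.0, 2.0], w_hidden, b_hidden, w_out, b_out)
print(y_zero, y_other)
```

Any input scaling applied before training would also have to be published alongside the table, since the weights are only meaningful for inputs scaled in the same way.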

Comments on the Quality of English Language

Extensive editing of English language required.

Author Response

Dear esteemed reviewer, We express our sincere gratitude for your valuable insights and feedback. We have carefully considered your comments and have made every effort to address them adequately in our attached reply.

 

According to your previous concerns,

The work's specific contributions and novelty points should have been highlighted more explicitly. It is unclear how the proposed model can help better reservoir management, as claimed by the authors in Lines [66–67]. The paper also lacks a critical review of existing studies on the topic.

 

The model will indirectly help in reservoir management by obtaining the illite wt.% in a quick and nearly costless manner. Please follow the blue-highlighted lines (47–54).

The authors should also emphasize how the proposed model is better than others.

To our knowledge, no model has previously been proposed to predict the illite wt.%.

The novelty of this paper should be further justified by highlighting the main contributions to the existing introduction and literature review.

Please follow the blue-highlighted lines (84–93).

Moreover, subsections (4.4, 4.5, and 4.6) are redundant and repetitive; therefore, they must be merged under one subsection.

Sections 4.4 and 4.5 were merged into one section (blue-highlighted line 195).

One particular concern I mentioned was the lack of provided weights and bias data for their developed ANN model, which raises questions about the reproducibility of their work.

Please find attached the weights and biases for our optimized model, and we will send it to the journal as supplementary material.

Comments on the quality of the English language: extensive editing of the English language is required.

A full proofreading has been done for the entire manuscript with a certificate, which has been sent to the editorial board.

Author Response File: Author Response.pdf

Round 3

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you for your responses. The manuscript should now be useful to the audience of Applied Sciences. 

Author Response

Dear reviewer, your insightful feedback greatly influenced the enhancement of this work. Your precise suggestions demonstrate a profound comprehension of the manuscript's subject matter. We express our gratitude for acknowledging the potential scientific value of our work in its completed state, which will benefit readers of the Journal of Applied Sciences and other scholars in the field of artificial intelligence applications in petroleum geology. My co-authors and I really appreciate your consideration.

The authors
