Observational Cosmology with Artificial Neural Networks
Round 1
Reviewer 1 Report
Please see the attachment.
Comments for author File: Comments.pdf
Author Response
Dear referee, first of all we want to thank you for the useful comments and corrections that have helped to improve the paper.
The authors have presented in this manuscript a rather basic introduction to the Artificial Neural Network (ANN) and its application to cosmology. The manuscript covers the fundamental ideas, concepts, as well as the mathematical construction and the heuristic implementation of the ANN. It also provides several examples of the ANN in the cosmological studies. Since, as the authors depict, the era of big data for cosmology has come in the sense that we have now access to not only more and more observational data, but also rapidly developing tools to process such amounts of data, it is reasonable to make a phasic summary of ANN in cosmology and present it to more researchers of both data science and cosmology.
While I find this work indeed makes a useful contribution to the related introductory literature, I would like to make several comments on the manuscript.
- Sections 2 and 3 cover the basic formalism of ANN, which can be found in most textbooks on ANN. Although the building blocks of ANN now have different terminologies, the technical implementation is in fact familiar to most researchers in observational cosmology and data science. These parts should either be more tailored to cosmology or be more focused on introducing the conceptual ideas of why these basic and familiar tools are able to form the new tool of ANN.
We have restructured the paper: the content that can be consulted in a textbook has been moved to the appendix, and we have synthesized the most relevant neural network concepts in Section 2.
- In comparison to the detailed introduction to the basics of ANN, the authors only provide general information about the several ANNs they build for the examples in this work. If the authors want to review the very basics of ANN and help the readers build an ANN step by step from scratch, more details of the implementation procedures, or at least some pseudo-code, should be provided. The current manuscript is unbalanced between the basic theory and the exemplifying realizations.
In the new version of the paper, we try to be more descriptive about some concepts related to neural networks. We have also added some of the algorithms used in the construction of the ANNs.
- The manuscript presents several examples of cosmological applications of ANN. The modeling of the Hubble law with a perceptron in fact is just a linear fit. The aim of presenting this example is not so clear. The previous subsections show quite explicitly that the perceptron, if an identity activation is chosen, works like linear regression. This example just shows again that the perceptron is not something new. Moreover, the linear fit of the Hubble law does not seem to require ANN and it does not represent any basis of further application of ANN in cosmology.
We removed this example, in agreement with the referee's remark that a linear regression does not require the concept of a perceptron.
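The referee's point can be made concrete in a few lines: a perceptron with identity activation computes y = w x + b, so fitting mock Hubble-law-like data with it converges to the ordinary least-squares line. The data below are synthetic and for illustration only; the slope 70 and noise level are hypothetical choices, not values from the paper.

```python
import numpy as np

# A single perceptron with identity activation computes
# y = w * x + b, i.e. exactly the model of a linear fit.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
y = 70.0 * x + rng.normal(0.0, 0.5, size=x.size)  # mock linear data

# Closed-form least squares gives the same answer the
# perceptron would converge to under gradient descent.
A = np.stack([x, np.ones_like(x)], axis=1)
w, b = np.linalg.lstsq(A, y, rcond=None)[0]
```

Since the closed-form solution exists and is exact, training a perceptron on such data adds nothing beyond ordinary regression, which is the referee's objection.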
- In subsection 4.1, the authors build an ANN with a single hidden layer to reconstruct the background evolution history of the Universe, i.e., H(z). In this example, the authors generate a simulated dataset from Eq.(32) and train an ANN so that it can reproduce Eq.(32). This, at least as the manuscript describes it in its current state, seems trivial. Is it the validity and reliability of the ANN that is emphasized here, i.e., that the reconstruction will not run away and go out of control? Or is the generated dataset just a simulation of observational data that, in practice, does not come from an analytic equation? The authors should state the significance of this example more clearly.
We decided to focus only on one example, as the second one caused some trouble, and we are more descriptive about its motivation and significance.
- In subsection 4.2, the authors build an ANN to reconstruct the evolution of the Universe from the differential equations of the density parameters. Besides the same ambiguity about the significance of this example as in the previous one, the generation of the simulated dataset is not stated clearly enough here. Generating a training dataset from the differential equations (30) requires solving them first. It then again seems trivial to reproduce the evolution with an ANN when you have already solved the evolution. Solving differential equations with an ANN is an important area in itself. The text at present does not seem to cover enough information on this subject, if that is what the authors want to discuss here.
The goal of this example is to show that a trained ANN can reduce the computational time when such equations must be evaluated many times, as is the case in Bayesian cosmological inference and in simulations. In addition, the trained network is a functional model that delivers the solutions of the system of equations by simply evaluating the initial conditions in it. Conventional solution methods do not offer this, since they repeat the numerical process each time the solutions are requested.
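As a minimal sketch of this idea, using a hypothetical toy equation (dy/dt = -y) rather than the cosmological system of the paper: the network is trained once on numerically integrated solutions, and afterwards new initial conditions are answered with a single forward pass instead of a new integration loop. All architecture choices below (16 hidden units, learning rate, iteration counts) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: generate training data by solving the ODE dy/dt = -y
# numerically (forward Euler) for many initial conditions y0.
def solve_euler(y0, t, n_steps=1000):
    y, dt = y0, t / n_steps
    for _ in range(n_steps):
        y = y + dt * (-y)
    return y

y0s = rng.uniform(0.5, 1.5, size=200)
ts = rng.uniform(0.0, 2.0, size=200)
X = np.stack([y0s, ts], axis=1)          # inputs: (y0, t)
Y = np.array([solve_euler(a, b) for a, b in zip(y0s, ts)])[:, None]

# Step 2: train a one-hidden-layer network to map (y0, t) -> y(t).
n_hidden = 16
W1 = rng.normal(0, 0.5, (2, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.5, (n_hidden, 1)); b2 = np.zeros(1)

def forward(X):
    H = np.tanh(X @ W1 + b1)
    return H, H @ W2 + b2

_, pred0 = forward(X)
loss_initial = np.mean((pred0 - Y) ** 2)

lr = 0.05
for _ in range(2000):
    H, pred = forward(X)
    err = 2 * (pred - Y) / len(X)         # dL/dpred for the MSE loss
    gW2 = H.T @ err; gb2 = err.sum(0)
    dH = (err @ W2.T) * (1 - H ** 2)      # backprop through tanh
    gW1 = X.T @ dH; gb1 = dH.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, pred = forward(X)
loss_final = np.mean((pred - Y) ** 2)
# Step 3: the trained network now returns y(t) for new initial
# conditions with a single forward pass, no integration loop.
```

The cost of the 1000-step integration is paid once per training sample; afterwards each query is one matrix product, which is the speed-up claimed in the response.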
- The application of ANN to theoretical research has its long-known controversies. No matter how delicate the model is or how precise the result and prediction can be, it cannot help the researchers understand the fundamental theory. That is, roughly speaking, the ANN may understand what is read and predicted, but the human who builds the ANN still does not. In fact, this issue has been considered in some developments of ANN. For example, the Attention Model may provide some information about the correlation between the results and the intermediate parameters. The review should cover some of the attempts in this direction.
We address this point in the Introduction, where we mention that, despite the controversies, ANNs are being used in current cosmological research.
- The ANN architecture introduced in the manuscript is the fully connected neural network. It is indeed the basis of ANN. Various architectures of ANN, such as convolutional neural network (CNN), recurrent neural network (RNN) and Bayesian neural network (BNN), have been developed to address different kinds of problems. They are only mentioned in the last paragraph of the manuscript. There are also some applications of these architectures in cosmology. A simple introduction of these variant architectures and their corresponding kinds of problems in cosmology should be included.
In Section 2 we explain more generalities of neural networks and mention some of the different types that exist. However, as a first approximation, we focus on the multilayer perceptron.
There are some minor problems in the manuscript.
- In most of the manuscript, the character x represents the input data, while the character w indicates the weights of the ANN that should be adjusted during training. In subsection 2.2, when introducing gradient descent, the authors use x to denote the parameters to be optimized. This may raise confusion.
-The notation 'x' was updated to 'w', so that it refers to the parameters to be updated in the networks.
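For illustration, a minimal gradient-descent sketch in the updated notation, where w denotes the parameters being optimized; the objective function and learning rate below are hypothetical, not taken from the paper.

```python
import numpy as np

def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Minimize a loss by following its negative gradient.
    The trainable parameters are denoted w, matching the
    notation used for the network weights."""
    w = np.asarray(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * grad(w)  # update rule: w <- w - lr * dL/dw
    return w

# Toy example: minimize L(w) = (w - 3)^2, whose gradient is 2(w - 3).
w_opt = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=[0.0])
```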
- In section 3, above equation (12), the elements a_j^L of the vector a^L are referred to as the inputs. In the context of ANN, the terms input and output have their specific meaning. The usage of them in the wording should be careful to avoid confusion.
-We have corrected this point.
Reviewer 2 Report
This paper describes the basic properties and groundwork for the construction of deep neural networks, in a pedagogical way. After that, the authors discuss three examples of neural networks that they constructed to approximate data in cosmology. The first model predicts the value of the Hubble parameter, as well as its uncertainty, as a function of the redshift, which is taken as an input parameter, while a slightly modified model also takes into account as inputs the current values of the Hubble parameter and of the matter density parameter. The second model takes as input the scale factor, the current matter density and the current Hubble parameter, in order to predict the Hubble parameter at the given scale factor, as well as matter, radiation and cosmological constant densities at that scale, effectively solving the Friedmann equations for input initial conditions. Finally, the third model takes as input 17 different parameters from the Sloan Digital Sky Survey dataset, and predicts whether an object with those parameters is a star, a galaxy or a quasar.
The authors have presented the material in a well-organized and pedagogical way, albeit with dominant emphasis on the construction and properties of neural networks in general, rather than on the specifics of the networks used in the examples in cosmology. This is understandable, given that the authors are advocating for the application of neural networks in physics and cosmology, and the fact that the majority of the target audience is more familiar with cosmology itself and less familiar with neural networks. However, I find the examples of section 4 lacking in one aspect that I believe would be quite illuminating. Namely, given that neural networks are universal approximators, it would be interesting to estimate one possible efficiency ratio, as follows. For each example, the authors could clearly state the total number of real-valued parameters (weights and biases) present in the neural network, as well as the total (real-valued) numbers of training inputs, and then take the ratio of these two numbers. This would give the readers a crude estimate of the efficiency of the neural network, i.e., its overall performance in data compression ability. In other words, it would be nice to know how much a given neural network optimizes the data, in the case of each cosmological application.
I believe the authors could easily add the above analysis, and it would certainly be an excellent performance estimator. Namely, physicists are usually not too familiar with the inner workings of neural networks, but they have excellent intuition about fitting data to a set of parameters, and the ratio of the number of parameters to the number of data points would provide great insight into the level of usefulness of neural networks as model-building tools in cosmology.
Overall, the paper is good, and I recommend it for publication in Universe, provided that the authors include the analysis described above.
Author Response
Dear referee, first of all we want to thank you for the useful comments and corrections that have helped to improve the paper.
This paper describes the basic properties and groundwork for the construction of deep neural networks, in a pedagogical way. After that, the authors discuss three examples of neural networks that they constructed to approximate data in cosmology. The first model predicts the value of the Hubble parameter, as well as its uncertainty, as a function of the redshift, which is taken as an input parameter, while a slightly modified model also takes into account as inputs the current values of the Hubble parameter and of the matter density parameter. The second model takes as input the scale factor, the current matter density and the current Hubble parameter, in order to predict the Hubble parameter at the given scale factor, as well as matter, radiation and cosmological constant densities at that scale, effectively solving the Friedmann equations for input initial conditions. Finally, the third model takes as input 17 different parameters from the Sloan Digital Sky Survey dataset, and predicts whether an object with those parameters is a star, a galaxy or a quasar.
The authors have presented the material in a well-organized and pedagogical way, albeit with dominant emphasis on the construction and properties of neural networks in general, rather than on the specifics of the networks used in the examples in cosmology. This is understandable, given that the authors are advocating for the application of neural networks in physics and cosmology, and the fact that the majority of the target audience is more familiar with cosmology itself and less familiar with neural networks. However, I find the examples of section 4 lacking in one aspect that I believe would be quite illuminating. Namely, given that neural networks are universal approximators, it would be interesting to estimate one possible efficiency ratio, as follows. For each example, the authors could clearly state the total number of real-valued parameters (weights and biases) present in the neural network, as well as the total (real-valued) numbers of training inputs, and then take the ratio of these two numbers. This would give the readers a crude estimate of the efficiency of the neural network, i.e., its overall performance in data compression ability. In other words, it would be nice to know how much a given neural network optimizes the data, in the case of each cosmological application.
– Response: We have restructured the paper to better describe the three examples of applications in cosmology. Regarding the efficiency ratio, neural networks have their own metrics, such as the loss function or the accuracy, which are reported in the article.
I believe the authors could easily add the above analysis, and it would certainly be an excellent performance estimator. Namely, physicists are usually not too familiar with the inner workings of neural networks, but they have excellent intuition about fitting data to a set of parameters, and the ratio of the number of parameters to the number of data points would provide great insight into the level of usefulness of neural networks as model-building tools in cosmology.
– Response: The ratio between the number of parameters and the number of data points is not a useful quantity in machine learning or deep learning, and we believe that introducing it may give rise to misconceptions about how these methods are used in practice. Data analysis with neural networks generates computational models for the datasets, and the number of parameters is very large; however, through many iterations on the computer, neural networks produce very good models for the data in question. The interpretation of these models is very difficult, but they can have very good performance and a high capacity to make predictions for data not contained in the training sets. In summary, data analysis with neural networks, or other machine learning methods, is a different paradigm from traditional statistical modeling. Nevertheless, to give an idea of the large number of trainable parameters in neural networks, we have added this information to each of our examples; in addition, we have also indicated the size of the datasets used.
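To make the added parameter counts concrete, the number of trainable parameters of a fully connected network follows directly from its layer sizes. The layer widths below are hypothetical, chosen only to match the 17 inputs and 3 output classes of the SDSS example; the hidden-layer sizes are not taken from the paper.

```python
# Number of trainable parameters in a fully connected network:
# each layer with n_in inputs and n_out outputs contributes
# n_out * (n_in + 1) parameters (weights plus biases).
def count_parameters(layer_sizes):
    return sum(n_out * (n_in + 1)
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# e.g. a hypothetical network with 17 inputs, two hidden layers
# of 100 neurons each, and 3 outputs:
n_params = count_parameters([17, 100, 100, 3])  # -> 12203
```

Comparing such a count with the dataset size gives the referee's ratio, even if, as argued above, it is not how performance is judged in practice.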
Overall, the paper is good, and I recommend it for publication in Universe, provided that the authors include the analysis described above.
Reviewer 3 Report
Review report for the paper “Observational cosmology with Artificial Neural Networks”
The Abstract and Introduction sections are poorly written. The Introduction section should be updated with the motivation behind this study, the research gap, the novelty, and the contributions of the work.
The applicability of the method. Why do we need to apply an ANN in this study? I did not see the authors discussing the reason. Therefore, it is impossible to prove the superiority of this model combination in this article. A detailed further explanation is needed.
The motivation towards the proposed work is not clear.
Insufficient expression of the innovative explanations. Does the practical significance of this innovation exist? There is a lack of comparison with previous studies of the same kind. On this point, the innovativeness of the authors' statements needs further explanation.
Why should someone use your proposed work in practice, and what are the advantages of your work in comparison with others?
The Conclusion is very poorly written. Please rewrite the Conclusion section by adding more advantages, limitations, and future research directions for the proposed work.
Author Response
Dear referee, first of all we want to thank you for the useful comments and corrections that have helped to improve the paper.
The Abstract and Introduction sections are poorly written. The Introduction section should be updated with the motivation behind this study, the research gap, the novelty, and the contributions of the work.
–Response: We have restructured the document and rewritten the Introduction to make the motivation of the paper clearer. With the new ordering of the sections, we believe that the Abstract, the Introduction, and the rest of the text are better harmonized.
The applicability of the method. Why do we need to apply an ANN in this study? I did not see the authors discussing the reason. Therefore, it is impossible to prove the superiority of this model combination in this article. A detailed further explanation is needed.
–Response: The aim of the article is to offer a first approach to ANNs. We have rewritten the Introduction and Section 2 to better explain the motivation for focusing on ANNs.
The motivation towards the proposed work is not clear.
–Response: The same answers as for the first two observations apply here.
Insufficient expression of the innovative explanations. Does the practical significance of this innovation exist? There is a lack of comparison with previous studies of the same kind. On this point, the innovativeness of the authors' statements needs further explanation.
–Response: This paper aims to present an introduction and some applications of these methods to the cosmological field. In the application examples, we have added several references to better justify their possibilities in real cosmological research.
Why should someone use your proposed work in practice, and what are the advantages of your work in comparison with others?
–Response: In the Introduction we tried to motivate why ANNs are a valid tool in observational cosmology. However, for some types of cosmological data analysis, traditional statistical methods or other machine learning methods may be a better choice than ANNs.
The Conclusion is very poorly written. Please rewrite the Conclusion section by adding more advantages, limitations, and future research directions for the proposed work.
–Response: We rewrote the Conclusions section to better explain the advantages, limitations and future work.
Reviewer 4 Report
The plots shown in Figures 10, 12, and 14-16 are not clear. It would be good to enlarge them further.
All terms and notations shown in the equations should be explained to clearly show their meaning.
Please re-check Equations 17 and 18.
Author Response
Dear referee, first of all we want to thank you for the useful comments and corrections that have helped to improve the paper.
The plots shown in Figures 10, 12, and 14-16 are not clear. It would be good to enlarge them further.
-Response: The image quality has been improved.
All terms and notations shown in the equations should be explained to clearly show their meaning.
-
Please re-check Equations 17 and 18.
-Response: These equations have been corrected; they contained a small typo.
Round 2
Reviewer 1 Report
The authors have addressed the points of my concern. The manuscript can be published now.
Reviewer 3 Report
The authors have addressed the point of my concern. I am happy with their corrections. Hence, I would like to recommend this manuscript to be published.
Reviewer 4 Report
I have checked the revised manuscript and found all the issues I had suggested have been carefully addressed.