Observational Cosmology with Artificial Neural Networks
Abstract
1. Introduction
2. A Deep Learning Overview
2.1. The Perceptron
Algorithm 1 The perceptron rule.
2.2. Deep Neural Networks
2.3. Learning Process
Algorithm 2 Learning process.
2.4. Overfitting and Underfitting
2.5. Coding Tips
3. Cosmological Framework
4. MLP Applied to the Hubble Parameter
5. Cosmological Differential Equations
- Obtain solutions of the system at points not included in the training set, by evaluating the trained network with the corresponding initial conditions.
- Reduce the computational time needed to obtain multiple solutions (a minimal sketch of this idea is given after this list).
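To make the idea concrete, the following is a minimal sketch, written for this summary rather than taken from the article's code repository, of how a neural network can serve as a trial solution for a differential equation. It solves the toy equation y' = -y with y(0) = 1 by minimizing the squared residual of the equation at a set of collocation points; the tiny architecture, the hyperparameters, and the use of finite-difference gradients are illustrative assumptions only, far simpler than what a cosmological system such as the one in Section 5 requires.

```python
# Minimal sketch (not the paper's code): solve the toy ODE y' = -y, y(0) = 1
# with a neural-network trial solution psi(t) = y0 + t * N(t), illustrating the
# general idea of Section 5. All hyperparameters are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
H = 10                                            # hidden units of the tiny MLP N(t)
params = 0.1 * rng.standard_normal(3 * H + 1)     # [w1 (H), b1 (H), w2 (H), b2 (1)]
t = np.linspace(0.0, 2.0, 50)                     # collocation points
y0 = 1.0                                          # initial condition

def unpack(p):
    return p[:H], p[H:2 * H], p[2 * H:3 * H], p[3 * H]

def residual_loss(p):
    w1, b1, w2, b2 = unpack(p)
    z = np.outer(t, w1) + b1                      # (50, H) pre-activations
    a = np.tanh(z)
    N = a @ w2 + b2                               # network output N(t)
    dN = ((1.0 - a ** 2) * w1) @ w2               # dN/dt by the chain rule
    psi = y0 + t * N                              # trial solution, psi(0) = y0 exactly
    dpsi = N + t * dN
    return np.mean((dpsi + psi) ** 2)             # squared residual of y' + y = 0

def numerical_grad(p, eps=1e-6):
    g = np.zeros_like(p)
    for i in range(p.size):                       # finite differences: slow but dependency-free
        dp = np.zeros_like(p); dp[i] = eps
        g[i] = (residual_loss(p + dp) - residual_loss(p - dp)) / (2 * eps)
    return g

lr = 0.05
for epoch in range(2000):                         # plain gradient descent on the residual
    params -= lr * numerical_grad(params)

w1, b1, w2, b2 = unpack(params)
psi = y0 + t * (np.tanh(np.outer(t, w1) + b1) @ w2 + b2)
print("max deviation from exp(-t):", np.max(np.abs(psi - np.exp(-t))))
```

Once trained, the network can be evaluated at any point of the interval, not only at the collocation points, which is precisely the advantage listed above.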
6. Classification of Astronomical Objects
7. Conclusions
- Neural networks have the ability to emulate any function or pattern underlying a given dataset. This can be very useful in various scientific areas, because such networks generate a computational model for the data and are a good alternative when a satisfactory analytical model is not available.
- A properly trained neural network can be used to replace traditional computational calculations in a wide variety of problems and thus decrease the computational time necessary. In addition, in Section 5 we showed that the neural network also provides a model that can be evaluated and mathematically manipulated, something that traditional numerical methods may not always offer.
- With the last example, we showed the great efficiency of neural networks in tasks that can be complicated to perform—for example, object classification. In our case, we performed a classification with numerical features; however, the literature indicates that neural networks are also a great tool in image and video classification.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Acknowledgments
Conflicts of Interest
Appendix A. Activation Functions
- ReLU—The Rectified Linear Unit (Figure A1a) is the most popular and simplest activation function. Given a number x, it returns $\mathrm{ReLU}(x) = \max(0, x)$, which provides a very simple non-linear transformation over $\mathbb{R}$. The ReLU function retains only the positive elements and discards all negative ones by setting them to 0. Although its derivative is undefined at $x = 0$ (Figure A1b), this is not a practical concern because the input is rarely exactly zero. Derivatives of the activation functions are a relevant part of the learning process, so it is important to consider their properties.
- Sigmoid—The sigmoid function (Figure A1c) maps the real line to the interval $(0, 1)$. Its behaviour was defined with real neurons in mind, which receive stimuli and communicate with each other through pulses. The sigmoid function squashes very negative values of x towards zero, and as x tends to infinity its image tends to 1; that is, it is a good way to emulate a smoothed step function between 0 and 1. It is defined as $\sigma(x) = 1/(1 + e^{-x}) = e^{x}/(e^{x} + 1)$; the second form is better in computational terms.
- Hyperbolic tangent—This function behaves similarly to the sigmoid function, but it also takes negative values: it maps the set of real numbers to the interval $(-1, 1)$. For points close to 0, the hyperbolic tangent (tanh) is almost linear, and it is antisymmetric (odd) about the origin (Figure A1e). Its derivative (Figure A1f) behaves similarly to the derivative of the sigmoid. A short numerical sketch of these three functions and their derivatives follows this list.
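As an illustration, the short sketch below (written for this summary, not taken from the article's code) implements the three activation functions above and their derivatives with NumPy.

```python
# Minimal sketch: the three activation functions of Appendix A and their derivatives.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)            # keeps positive entries, sets negatives to 0

def relu_prime(x):
    return (x > 0).astype(float)         # undefined at x = 0; 0 is used by convention

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))      # maps the real line to (0, 1)

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_prime(x):
    return 1.0 - np.tanh(x) ** 2         # np.tanh maps the real line to (-1, 1)

x = np.linspace(-5, 5, 11)
print(relu(x), sigmoid(x), np.tanh(x), sep="\n")
```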
Appendix B. Gradient Descent
- The gradient $\nabla f$ is orthogonal to the level curve of $f$ at the given point.
- $\nabla f$ points in the direction of maximum increase of $f$.
- Conversely, $-\nabla f$ points in the direction of maximum decrease.
- Too small a learning rate: If the learning rate is very small, the step size will be too short; this can increase the total computation time considerably, or the algorithm may even fail to converge to the desired point (see Figure A2a).
- Too large a learning rate: A very large learning rate can cause an exploding gradient and divergence from the minimum, or the algorithm may bypass the local minimum and overshoot (see Figure A2b).
- An optimal learning rate: A proper learning rate ensures that the algorithm converges to the minimum of the function in a reasonable number of steps, which also reduces the computational time of the process (Figure A2c).
- Compute the partial derivatives with respect to each variable and evaluate them at a random starting point in the domain of the function.
- Construct the gradient of the function. It is recommended to limit the norm of the gradient to a maximum value, while preserving its direction, to avoid exploding gradients during the iterations.
- Apply Equation (A2) to the initial point, and repeat the process for each new point until it is close enough to the minimum (a minimal numerical sketch of these steps is given after this list).
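The sketch below illustrates the procedure just described on a simple convex test function of our own choosing. It assumes that Equation (A2) is the usual update rule $p \leftarrow p - \eta \nabla f(p)$; the learning rate, the clipping threshold, and the stopping tolerance are arbitrary.

```python
# Minimal sketch (not the paper's code) of gradient descent on f(x, y) = x**2 + 3*y**2,
# assuming Equation (A2) is the usual update p <- p - lr * grad f(p).
import numpy as np

def grad_f(p):
    x, y = p
    return np.array([2.0 * x, 6.0 * y])    # partial derivatives of f

rng = np.random.default_rng(1)
p = rng.uniform(-5, 5, size=2)              # random starting point in the domain
lr = 0.1                                    # learning rate: too small is slow, too large overshoots
max_norm = 10.0

for step in range(200):
    g = grad_f(p)
    norm = np.linalg.norm(g)
    if norm > max_norm:                     # clip the norm while preserving the direction
        g *= max_norm / norm
    p = p - lr * g                          # move along -grad f, the direction of maximum decrease
    if norm < 1e-8:                         # stop when close enough to the minimum
        break

print("approximate minimizer:", p)          # should approach (0, 0)
```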
- It is convex.
- Its gradient is Lipschitz continuous; that is, there is a real positive number $L$ such that $\lVert \nabla f(x) - \nabla f(y) \rVert \le L \lVert x - y \rVert$ for all $x$ and $y$.
Appendix C. Backpropagation Equations
- Generate random values for the weights and biases. Then compute the corresponding output by forward propagation.
- Compute the error imputed to the previous layers using Equation (A7).
- Once the new parameters are in place, iterate this procedure until the cost function reaches a very small value (a minimal end-to-end sketch of this loop is given after this list).
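The following sketch, written for this summary, puts the steps above together for a one-hidden-layer network. Since Equation (A7) is not reproduced here, the standard backpropagation relations for a sigmoid hidden layer and a mean-squared-error cost are used in its place; the toy data, architecture, and hyperparameters are arbitrary choices.

```python
# Minimal sketch of the training loop above for a one-hidden-layer network.
# The specific form of Equation (A7) is not reproduced in this summary, so the
# standard backpropagation relations (sigmoid hidden layer, MSE cost) are used.
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(200, 2))                       # toy inputs
y = (np.sin(np.pi * X[:, 0]) * X[:, 1]).reshape(-1, 1)      # toy target function

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 1. Generate random values for the weights and biases.
W1, b1 = 0.5 * rng.standard_normal((2, 16)), np.zeros(16)
W2, b2 = 0.5 * rng.standard_normal((16, 1)), np.zeros(1)

lr = 0.5
for epoch in range(3000):
    # Forward propagation.
    a1 = sigmoid(X @ W1 + b1)
    out = a1 @ W2 + b2                                      # linear output layer
    # Error at the output layer (derivative of the MSE cost).
    delta2 = (out - y) / len(X)
    # Error imputed to the previous (hidden) layer.
    delta1 = (delta2 @ W2.T) * a1 * (1.0 - a1)
    # Gradient-descent update of all parameters.
    W2 -= lr * a1.T @ delta2;  b2 -= lr * delta2.sum(axis=0)
    W1 -= lr * X.T @ delta1;   b1 -= lr * delta1.sum(axis=0)

print("final cost:", 0.5 * np.mean((sigmoid(X @ W1 + b1) @ W2 + b2 - y) ** 2))
```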
References
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).