Enhancing Deep Learning and Computer Image Analysis in Petrography through Artificial Self-Awareness Mechanisms
Abstract
1. Introduction
2. SAL Methodology, Adaptive Learning, and Self-Monitoring
The cross-entropy loss used for classification is

$L = -\sum_{i=1}^{N} y_i \log(\hat{y}_i)$

where:
- $L$ represents the cross-entropy loss;
- $\sum_{i=1}^{N}$ denotes the summation over all N classes;
- $y_i$ is the true label of class i (0 or 1 depending on whether the sample belongs to that class or not); and
- $\hat{y}_i$ is the predicted probability of class i.
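As a quick numerical illustration (not taken from the paper's code), the following snippet evaluates this loss for a single one-hot-encoded sample with N = 3 classes; the probability values are arbitrary.

```python
import numpy as np

y_true = np.array([0.0, 1.0, 0.0])       # one-hot true label: the sample belongs to class 2
y_pred = np.array([0.10, 0.75, 0.15])    # predicted probabilities for the N = 3 classes

# L = -sum_i y_i * log(y_hat_i); only the true-class term contributes for one-hot labels.
loss = -np.sum(y_true * np.log(y_pred))
print(round(float(loss), 4))             # -log(0.75) ≈ 0.2877
```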
3. Architecture, Workflow and Functionalities
3.1. Architecture and Workflow
- Maximum number of iterations: we define in advance a maximum number of iterations to prevent the algorithm from running indefinitely. Once this limit is reached (fixed empirically), the training process can be stopped.
- Convergence of hyper-parameters: the network self-monitors the convergence of the hyper-parameters by tracking their changes between iterations. If the changes fall below a predefined threshold, it can be an indication that the network has reached a stable configuration (for instance, we can set a threshold based on the relative change in hyper-parameter values between iterations).
- Performance improvement: the network self-monitors the performance metric of interest (e.g., accuracy or loss) on a validation set or during cross-validation. If the performance metric does not show significant improvement over a certain number of iterations, it may indicate that further updates to the hyper-parameters are unlikely to yield significant benefits.
- Resource constraints: time and computational budgets are set in advance; if the training process exceeds these limits without substantial improvement in performance, the loop stops automatically.
- Early stopping: an early stopping criterion is defined in advance, typically based on performance on a validation set. If the performance does not improve, or starts to deteriorate, over a certain number of iterations, training is stopped early to avoid overfitting and wasted computational resources. A minimal sketch combining these stopping criteria is given after this list.
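The following is a minimal sketch (not the implementation used in this work) of how these criteria can be combined into a single self-monitoring loop. The callables train_one_epoch and evaluate, and the hyperparams dictionary, are hypothetical placeholders for the user's own training and validation routines.

```python
import time
import numpy as np

def self_monitored_training(train_one_epoch, evaluate, hyperparams,
                            max_iterations=200,    # maximum number of iterations
                            hp_tolerance=1e-3,     # hyper-parameter convergence threshold
                            patience=10,           # iterations without improvement (early stopping)
                            max_seconds=3600):     # resource (time) constraint
    """Training loop that stops on any of the self-monitoring criteria listed above."""
    start_time = time.time()
    best_acc = -np.inf
    epochs_without_improvement = 0
    prev_hp = dict(hyperparams)

    for iteration in range(max_iterations):        # (1) maximum number of iterations
        hyperparams = train_one_epoch(hyperparams)
        val_acc = evaluate()

        # (2) convergence of hyper-parameters: relative change between iterations.
        rel_change = max(abs(hyperparams[k] - prev_hp[k]) / (abs(prev_hp[k]) + 1e-12)
                         for k in hyperparams)
        prev_hp = dict(hyperparams)
        if rel_change < hp_tolerance:
            print(f"Stop at iteration {iteration}: hyper-parameters converged.")
            break

        # (3)/(5) performance improvement and early stopping on the validation metric.
        if val_acc > best_acc:
            best_acc, epochs_without_improvement = val_acc, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                print(f"Stop at iteration {iteration}: no improvement for {patience} epochs.")
                break

        # (4) resource constraints: pre-set time budget.
        if time.time() - start_time > max_seconds:
            print(f"Stop at iteration {iteration}: time budget exceeded.")
            break

    return best_acc, hyperparams
```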
3.2. SAL Functionalities
- Internal State Monitoring: Mechanisms within the neural network architecture that monitor its internal state, including neuron activations and the flow of information through the layers. These give the network basic control over its own activity, through continuous, autonomous analysis of the fundamental hyper-parameters.
- Performance Self-Evaluation: Mechanisms within the neural network architecture that enable the network to evaluate its own performance and recognize errors. These include techniques such as loss prediction and confidence estimation, which assess the network’s uncertainty in its predictions and identify possible convergence problems (for additional details, see Appendix B).
- Metacognition: Integrated metacognitive mechanisms that allow the network to assess its own knowledge and monitor the learning process, helping it identify gaps in its knowledge and make more informed decisions. A key function of metacognition in SAL is to identify gaps or deficiencies in the network’s knowledge, i.e., to recognize situations where the network is uncertain or where its predictions are unreliable. By identifying such gaps, the network can prioritize learning in those areas or seek additional data or training to improve its understanding. An important aspect of metacognition is self-reflection, discussed in detail in a dedicated section below.
- Continual Adaptation and Learning: The neural network is designed to support continual adaptation to new situations and changes in the environment. This may involve architecture updates (here named “artificial plasticity mechanisms”) to facilitate ongoing learning over time.
- Advanced Neural Functions: These include, for instance, “attention mechanisms”: dedicated modules that enable the network to focus attention on specific aspects of its internal state or of the surrounding environment. Other functions are pre-processing self-optimization, automatic feature extraction, automatic feature ranking, and optimal data normalization. Pre-processing involves preparing the input data before feeding them into the neural network. This can include tasks such as resizing images to a standard size, normalizing pixel values, or applying data augmentation techniques to increase the diversity of training examples. For example, in a computer vision task where the neural network is trained to classify images of handwritten digits, pre-processing might involve resizing all input images to a fixed size (e.g., 28 × 28 pixels) and normalizing pixel values to lie within a certain range (e.g., zero to one); a minimal sketch of this step is given after this list.
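As a concrete illustration of the pre-processing step mentioned above, the following is a minimal sketch (not the pipeline used in this work) that resizes grayscale images to 28 × 28 pixels and rescales 8-bit pixel values to the range [0, 1] with TensorFlow; the function name and the example input are illustrative placeholders.

```python
import tensorflow as tf

def preprocess_image(image, target_size=(28, 28)):
    """Resize an image tensor to a standard size and normalize pixel values to [0, 1]."""
    image = tf.image.resize(image, target_size)     # resize to a standard size
    image = tf.cast(image, tf.float32) / 255.0      # normalize 8-bit pixel values to [0, 1]
    return image

# Example usage on a random "image" with shape (height, width, channels).
raw = tf.random.uniform((64, 48, 1), minval=0, maxval=255)
processed = preprocess_image(raw)
print(processed.shape)                              # (28, 28, 1)
print(float(tf.reduce_max(processed)) <= 1.0)       # True
```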
3.3. Self-Reflection
4. A Synthetic Test
5. Test on Real Petrological Data
6. Image Analysis and Classification Using SAL
7. Conclusions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A. A Simplified Example of Code for Self-Reflection Model Update
- The code starts with a “for loop” iterating over the number of epochs specified by the “epochs” variable.
- Within each epoch, a tf.GradientTape() context is created. This context is used to compute the gradients of the trainable variables with respect to a given loss function. It enables automatic differentiation in TensorFlow.
- The model “self_reflection_model” (this can be any initial deep learning model, properly configured by the user) is called on the input data X_train to obtain the model’s predictions (outputs).
- The “sparse_categorical_crossentropy” loss function from Keras is applied to calculate the loss value between the predicted outputs and the true labels y_train.
- The gradients of the loss with respect to the trainable variables of the model are computed using tape.gradient().
- The computed gradients are then used to update the model’s trainable variables using the optimizer’s apply_gradients() method.
- The model is evaluated on the training set (X_train and y_train) to compute the training loss and accuracy.
- The model is also evaluated on the validation set (X_test and y_test) to compute validation loss and accuracy.
- The computed losses and accuracies are appended to their respective lists in the self_reflection_history dictionary.
- If the current validation accuracy is higher than the previous best accuracy, the best accuracy and the configuration of the current model are updated.
- The self-reflection hyper-parameters (in this simplified example, just learning rate, momentum, and dropout rate) are appended to their respective lists in the self_reflection_hyperparameters dictionary.
- Self-reflection mechanisms are applied to adjust the hyper-parameters based on performance on the validation set. If the current validation accuracy is lower than or equal to the previous epoch’s validation accuracy, the hyper-parameters are updated.
```python
# First, import all the necessary libraries
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras

# Assuming that all the previous building blocks have been implemented (data loading,
# initial network model properly set, initial training loop, result plotting block and
# so forth), the following is the core block for the self-reflection model update.
```
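The core block itself is not reproduced in this extract. Purely as an illustration of the steps listed above, the following is a minimal sketch of such a loop, continuing from the imports above and assuming a compiled Keras model named self_reflection_model (with accuracy as its metric) and arrays X_train, y_train, X_test, y_test already defined. The SGD optimizer and the hyper-parameter adjustment rule are simple heuristics chosen here for illustration, not the author’s original update scheme.

```python
epochs = 50
learning_rate, momentum, dropout_rate = 0.01, 0.9, 0.2        # illustrative starting values
optimizer = keras.optimizers.SGD(learning_rate=learning_rate, momentum=momentum)

best_accuracy, best_weights = 0.0, None
self_reflection_history = {"train_loss": [], "train_acc": [], "val_loss": [], "val_acc": []}
self_reflection_hyperparameters = {"learning_rate": [], "momentum": [], "dropout_rate": []}

for epoch in range(epochs):
    # Gradient tape enables automatic differentiation of the loss w.r.t. the weights.
    with tf.GradientTape() as tape:
        outputs = self_reflection_model(X_train, training=True)
        loss = tf.reduce_mean(keras.losses.sparse_categorical_crossentropy(y_train, outputs))
    grads = tape.gradient(loss, self_reflection_model.trainable_variables)
    optimizer.apply_gradients(zip(grads, self_reflection_model.trainable_variables))

    # Evaluate on the training and validation sets (the model is assumed to be compiled
    # with an accuracy metric, so evaluate() returns [loss, accuracy]).
    train_loss, train_acc = self_reflection_model.evaluate(X_train, y_train, verbose=0)
    val_loss, val_acc = self_reflection_model.evaluate(X_test, y_test, verbose=0)
    self_reflection_history["train_loss"].append(train_loss)
    self_reflection_history["train_acc"].append(train_acc)
    self_reflection_history["val_loss"].append(val_loss)
    self_reflection_history["val_acc"].append(val_acc)

    # Keep the best configuration (weights) seen so far.
    if val_acc > best_accuracy:
        best_accuracy = val_acc
        best_weights = self_reflection_model.get_weights()

    # Record the current hyper-parameters.
    self_reflection_hyperparameters["learning_rate"].append(learning_rate)
    self_reflection_hyperparameters["momentum"].append(momentum)
    self_reflection_hyperparameters["dropout_rate"].append(dropout_rate)

    # Self-reflection step: if validation accuracy did not improve with respect to the
    # previous epoch, adjust the hyper-parameters (illustrative heuristic only).
    if epoch > 0 and val_acc <= self_reflection_history["val_acc"][-2]:
        learning_rate *= 0.9
        momentum = min(momentum + 0.02, 0.99)
        dropout_rate = min(dropout_rate + 0.05, 0.5)
        optimizer.learning_rate.assign(learning_rate)
        # Applying the new momentum/dropout would require rebuilding the optimizer and
        # the dropout layers; here they are only tracked for inspection.
```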
Appendix B. An Overview of Artificial Neural Networks, Other Machine Learning Methods and Key Terminology
References
Model | Classification Accuracy | Precision |
---|---|---|
SAL Neural Network | 0.705 | 0.735 |
Decision Tree | 0.579 | 0.529 |
Random Forest | 0.684 | 0.691 |
Naive Bayes | 0.560 | 0.583 |
Logistic Regression | 0.613 | 0.546 |
CN2 Rule Inducer | 0.421 | 0.280 |
Adaptive Boosting | 0.664 | 0.672 |

Model | Classification Accuracy | Precision |
---|---|---|
SAL Neural Network | 0.781 | 0.817 |
Decision Tree | 0.625 | 0.681 |
Random Forest | 0.719 | 0.710 |
Logistic Regression | 0.618 | 0.551 |
CN2 Rule Inducer | 0.428 | 0.311 |
Adaptive Boosting | 0.625 | 0.655 |