Incoherence: A Generalized Measure of Complexity to Quantify Ensemble Divergence in Multi-Trial Experiments and Simulations
Abstract
1. Introduction
1.1. Motivation
1.2. Aims
- offer an intuitive explanation of this specific perspective on complexity for non-statisticians (Section 3)
- determine how the measure compares against other existing and commonly used metrics for quantifying similarity, and demonstrate its uniqueness (Section 5.1)
- explore how this measure behaves in the presence of the identified “features of complexity” to see whether it can help formally quantify and identify those features (Section 5.2)
- demonstrate whether the metric is actually valuable in real-world decision making (Section 6)
2. Information-Theory-Based Measures of Complexity
2.1. Shannon Entropy
2.2. Continuous Entropy
2.3. Information Entropy as Complexity
2.4. Gershenson–Fernández Complexity
3. Defining Ensemble Complexity
3.1. Intuitive Description
3.2. Types of Ensemble Uncertainty
- Epistemic (sometimes called ontological), which arises due to a lack of knowledge about a system. This means that the uncertainty can be reduced by collecting more data; for instance, by tossing a coin thousands of times to determine whether it is biased. Within this category, though, there are two sub-types:
  - (a) Structural (or model misspecification), which describes how well the equations and model actually reflect the real-world dynamics.
  - (b) Parametric, which reflects how well you are able to estimate your model's parameters.
- Aleatoric (called internal variability in climate science or endogenous in economics), on the other hand, cannot be reduced by collecting additional data. It arises from the very nature of the system being non-deterministic and stochastic. For instance, even if you know a coin is biased 80% toward heads in the aggregate, the outcome of each individual flip remains uncertain. A short simulation illustrating this distinction follows this list.
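To make the distinction concrete, the following minimal Python sketch (an illustration written for this discussion, not code from the paper; the bias value of 0.8 is assumed) simulates the biased coin. The standard error of the estimated bias (the epistemic part) shrinks as more flips are collected, while the per-flip variance (the aleatoric part) stays essentially constant.

```python
import numpy as np

rng = np.random.default_rng(0)
true_bias = 0.8  # hypothetical probability of heads, for illustration only

for n_flips in (10, 100, 10_000):
    flips = rng.random(n_flips) < true_bias                  # simulate n_flips biased coin tosses
    estimate = flips.mean()                                  # estimated bias from the data
    std_err = np.sqrt(estimate * (1 - estimate) / n_flips)   # epistemic: shrinks with more data
    per_flip_var = estimate * (1 - estimate)                 # aleatoric: variance of a single flip, ~p(1-p)
    print(f"n={n_flips:6d}  bias estimate={estimate:.3f} ± {std_err:.3f}  per-flip variance={per_flip_var:.3f}")
```

No amount of additional flips drives the per-flip variance to zero; only the uncertainty about the value of the bias itself is reduced.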
3.3. Existing Commonly Used Ensemble Measures
- The root mean square error (RMSE). Here, we compare the predicted value against a real-world historic truth. This measure is extremely useful when trying to reduce structural uncertainty through model improvement. However, it cannot be used when analyzing projections into the future, where we have no truth to compare against [32].
- The mean of the pooled distribution (or mean of means). This is a very pragmatic measure when we need to provide a single number to act upon, particularly for systems that have high parametric but low aleatoric uncertainty. However, as we will see later in Figures 17 and 19, these measures can be misleading when the system is non-ergodic. For instance, a coin cannot be in both a state of heads and a state of tails.
- Proportion or percentage is often used when dealing with scenario-based questions. For instance, in weather forecasts, the public is used to an X% rain prediction, which is simply the proportion of trials in the ensemble where rain was recorded. This is extremely effective and highly intuitive; however, it requires that we first know what scenarios we are looking for and are able to articulate them as binary outcomes. In this paper, we are actively aiming to create a measure to be used upstream of this, one that can detect variations in output so that we can efficiently invest our efforts. A toy numerical sketch of these summary measures follows this list.
- Brier scores are then used to calculate the accuracy of percentage predictions against reality. As with RMSE, we do not compare against these measures in this paper, as we are focused on the aleatoric uncertainty of projections.
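As a toy numerical sketch of the summary measures above (hypothetical data and threshold, not taken from the paper), the following Python snippet computes the RMSE against a known truth, the mean of the pooled distribution, and the proportion of samples meeting a binary scenario:

```python
import numpy as np

rng = np.random.default_rng(1)
truth = 2.0                                                  # historic "truth", only for the RMSE example
ensemble = rng.normal(loc=2.5, scale=1.0, size=(50, 200))    # 50 trials, 200 samples each (toy data)

trial_means = ensemble.mean(axis=1)

# RMSE: only available where a real-world truth exists to compare against.
rmse = np.sqrt(np.mean((trial_means - truth) ** 2))

# Mean of the pooled distribution ("mean of means"): a single actionable number.
mean_of_means = trial_means.mean()

# Proportion: fraction of samples meeting a pre-defined binary scenario, e.g. value > 3.0.
proportion = np.mean(ensemble > 3.0)

print(f"RMSE vs truth:  {rmse:.3f}")
print(f"Mean of means:  {mean_of_means:.3f}")
print(f"P(value > 3.0): {proportion:.1%}")
```

Note that the RMSE line is only meaningful in hindcast settings where a truth exists, which is exactly why it is excluded from the forward-looking comparisons in this paper.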
3.4. The Use of Statistical Tests
3.5. Conditions of a New Measure
- The measure should be continuous, consistent, and bounded by [0, 1]. There are many measures (such as the standard deviation, σ) that, although useful, are incredibly sensitive to scale. For instance, a normal distribution with a small σ is very different from one whose σ is orders of magnitude larger, yet whether a given σ is small or large depends entirely on the scale of the data. This means that in order to fully understand measures such as σ, we need the full context, which makes them extremely hard to compare across different systems and therefore unsuitable for detecting complex dynamics at a large scale. It is thus vitally important that our measure be bounded by [0, 1].
- The measure is minimal at 0 when each of the distributions in the ensemble is identical. This is to ensure that we can be confident that the results are consistent with each other and there are no complex or chaotic dynamics that are affecting the output of the model.
- The measure is maximal at 1 when given an identity matrix, i.e., an ensemble in which the i-th distribution places all of its probability mass on the i-th state and zero probability on every other state. This is because the identity matrix represents the highest level of dissonance: each individual distribution represents the maximum possible certainty (and therefore the lowest entropy), while the ensemble as a whole represents the maximum uncertainty, with the pooled distribution being uniform and therefore in its highest-entropy state. A short numerical sketch of this condition follows this list.
- The measure should be compatible with both discrete and continuous data. For the same reason as (i), we want incoherence to be used to compare across systems. In general, when analyzing systems, we tend to have to pick the measure to fit the data. However, this makes it extremely hard to then have meaningful conversations with stakeholders who are likely unfamiliar with these many unique measures.
- The measure is invariant to the total number of distributions, and in the discrete case, it is invariant to the total number of states. We want this measure to be of the aleatoric uncertainty of the system, not the epistemic uncertainty of the data.
- The measure is consistent across systems and is not driven by the overall average entropy of the system. We want incoherence to be as interpretable and consistent as possible, meaning we want to ensure that systems with a higher baseline entropy do not automatically have higher incoherence values. If anything, it makes more sense that any inconsistency in the results of more ordered systems (with low entropy) should be weighted more highly, since we would expect more ordered systems to stay ordered across trials.
- The measure is pessimistic and errs on the side of assuming the presence of complex dynamics, with an emphasis on detecting outliers. The downside of treating a coherent system as complex is much smaller than the downside of treating an incoherent system as statistical. An example of this is a black swan, which is often a consequence of complexity and long-tail distributions. In these situations, a single, extremely rare (and unpredictable) event can be catastrophic for the system as a whole. Having an emphasis on these outlier distributions is vital for a practical complexity measure.
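To illustrate the identity-matrix condition above, here is a minimal sketch under the definitions given (not code from the paper): each row of the identity matrix has zero Shannon entropy, while the pooled distribution is uniform and therefore has the maximum possible entropy.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (in bits) of a discrete distribution, ignoring zero-probability states."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(np.sum(nz * np.log2(1.0 / nz)))

n = 5
ensemble = np.eye(n)             # n distributions over n states, each certain of a different state
row_entropies = [shannon_entropy(row) for row in ensemble]
pooled = ensemble.mean(axis=0)   # pooled distribution is uniform: [1/n, ..., 1/n]

print("individual entropies:", row_entropies)             # all 0.0 (maximum certainty per trial)
print("pooled entropy:      ", shannon_entropy(pooled))   # log2(n), the maximum possible
print("maximum possible:    ", float(np.log2(n)))
```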
3.6. Definition of the Pooled Distribution
3.7. Choosing a Divergence Metric
3.7.1. f-Divergences
3.7.2. Jensen–Shannon Divergence
3.7.3. Earth Mover’s Distance
3.8. Definition of Incoherence
[Table: two example ensembles of distributions and their associated values (0.6, 0.014, 0.024, 0.107) and (2.2, 0.125, 0.055, 0.103); the distribution plots and column headers were not recovered.]
4. Methods
4.1. Baseline with Existing Metrics
4.1.1. Discrete Data
4.1.2. Continuous Data
4.2. Features of Complexity
4.2.1. Erdos–Renyi Graphs
4.2.2. Cellular Automata
4.2.3. Daisyworld
5. Results
5.1. Baseline with Existing Metrics
5.1.1. Discrete Data
5.1.2. Continuous Data
5.2. Features of Complexity
Disorder and Order
5.3. Criticality
Sensitivity to Initial Conditions
5.4. Perturbation
5.5. Diversity of Diversity
6. Motivated Use Case—Real-World Decision Making
7. Discussion
7.1. Ergodicity
7.2. Existing Non-Parametric Statistical Tests
7.3. Requirement of Multiple Samples per Ensemble
8. Conclusions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Mitchell, M. Complexity: A Guided Tour; Oxford University Press: Oxford, UK, 2009. [Google Scholar]
- Prokopenko, M.; Boschetti, F.; Ryan, A.J. An information-theoretic primer on complexity, self-organization, and emergence. Complexity 2009, 15, 11–28. [Google Scholar] [CrossRef]
- Lloyd, S. Measures of complexity: A nonexhaustive list. IEEE Control Syst. Mag. 2001, 21, 7–8. [Google Scholar]
- Gershenson, C.; Fernández, N. Complexity and information: Measuring emergence, self-organization, and homeostasis at multiple scales. Complexity 2012, 18, 29–44. [Google Scholar] [CrossRef]
- Palutikof, J.P.; Leitch, A.M.; Rissik, D.; Boulter, S.L.; Campbell, M.J.; Perez Vidaurre, A.C.; Webb, S.; Tonmoy, F.N. Overcoming knowledge barriers to adaptation using a decision support framework. Clim. Chang. 2019, 153, 607–624. [Google Scholar] [CrossRef]
- Weaver, C.P.; Lempert, R.J.; Brown, C.; Hall, J.A.; Revell, D.; Sarewitz, D. Improving the contribution of climate model information to decision making: The value and demands of robust decision frameworks. Wiley Interdiscip. Rev. Clim. Chang. 2013, 4, 39–60. [Google Scholar] [CrossRef]
- Mankin, J.S.; Lehner, F.; Coats, S.; McKinnon, K.A. The Value of Initial Condition Large Ensembles to Robust Adaptation Decision-Making. Earth Space Sci. 2022, 9, e2012EF001610. [Google Scholar] [CrossRef]
- Wiesner, K.; Ladyman, J. Measuring complexity. arXiv 2019, arXiv:1909.13243. [Google Scholar]
- Ladyman, J.; Wiesner, K. What Is a Complex System; Yale University Press: New Haven, CT, USA, 2020. [Google Scholar]
- Palmer, T. The primacy of doubt: Evolution of numerical weather prediction from determinism to probability. J. Adv. Model. Earth Syst. 2017, 9, 730–734. [Google Scholar] [CrossRef]
- Peters, O. The ergodicity problem in economics. Nat. Phys. 2019, 15, 1216–1221. [Google Scholar] [CrossRef]
- Farmer, J.D. Making Sense of Chaos; PENGUIN BOOKS Limited: London, UK, 2024. [Google Scholar]
- Poledna, S.; Miess, M.G.; Hommes, C.; Rabitsch, K. Economic forecasting with an agent-based model. Eur. Econ. Rev. 2023, 151, 104306. [Google Scholar] [CrossRef]
- Shannon, C. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef]
- Madukaife, M.S.; Phuc, H.D. Estimation of Shannon differential entropy: An extensive comparative review. arXiv 2024, arXiv:2406.19432. [Google Scholar]
- Giraldo, L.G.S.; Rao, M.; Principe, J.C. Measures of entropy from data using infinitely divisible kernels. IEEE Trans. Inf. Theory 2014, 61, 535–548. [Google Scholar] [CrossRef]
- Ozertem, U.; Uysal, I.; Erdogmus, D. Continuously differentiable sample-spacing entropy estimation. IEEE Trans. Neural Netw. 2008, 19, 1978–1984. [Google Scholar] [CrossRef] [PubMed]
- Davey, T. Cohesion: A Measure of Organisation and Epistemic Uncertainty of Incoherent Ensembles. Entropy 2023, 25, 1605. [Google Scholar] [CrossRef] [PubMed]
- Bar-Yam, Y. Multiscale variety in complex systems. Complexity 2004, 9, 37–45. [Google Scholar] [CrossRef]
- Weaver, W. Science and complexity. Am. Sci. 1948, 36, 536–544. [Google Scholar] [PubMed]
- Gell-Mann, M.; Lloyd, S. Information measures, effective complexity, and total information. Complexity 1996, 2, 44–52. [Google Scholar] [CrossRef]
- Langton, C.G. Computation at the edge of chaos: Phase transitions and emergent computation. Phys. D Nonlinear Phenom. 1990, 42, 12–37. [Google Scholar] [CrossRef]
- Wuensche, A. Discrete dynamical networks and their attractor basins. Complex Syst. 1998, 98, 3–21. [Google Scholar]
- Lopez-Ruiz, R.; Mancini, H.L.; Calbet, X. A statistical measure of complexity. Phys. Lett. A 1995, 209, 321–326. [Google Scholar] [CrossRef]
- Hüllermeier, E.; Waegeman, W. Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods. Mach. Learn. 2021, 110, 457–506. [Google Scholar] [CrossRef]
- Liu, J.Z.; Paisley, J.; Kioumourtzoglou, M.A.; Coull, B.A. Accurate uncertainty estimation and decomposition in ensemble learning. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Curran Associates Inc.: Red Hook, NY, USA, 2019. [Google Scholar]
- Palmer, T.; Shutts, G.; Hagedorn, R.; Doblas-Reyes, F.; Jung, T.; Leutbecher, M. Representing model uncertainty in weather and climate prediction. Annu. Rev. Earth Planet. Sci. 2005, 33, 163–193. [Google Scholar] [CrossRef]
- Zhu, Y. Ensemble forecast: A new approach to uncertainty and predictability. Adv. Atmos. Sci. 2005, 22, 781–788. [Google Scholar] [CrossRef]
- Berner, J.; Ha, S.Y.; Hacker, J.; Fournier, A.; Snyder, C. Model uncertainty in a mesoscale ensemble prediction system: Stochastic versus multiphysics representations. Mon. Weather. Rev. 2011, 139, 1972–1995. [Google Scholar] [CrossRef]
- Demeritt, D.; Cloke, H.; Pappenberger, F.; Thielen, J.; Bartholmes, J.; Ramos, M.H. Ensemble predictions and perceptions of risk, uncertainty, and error in flood forecasting. Environ. Hazards 2007, 7, 115–127. [Google Scholar] [CrossRef]
- Surowiecki, J. The Wisdom of Crowds; Abacus: London, UK, 2012. [Google Scholar]
- Parker, W.S. Ensemble modeling, uncertainty and robust predictions. Wiley Interdiscip. Rev. Clim. Chang. 2013, 4, 213–223. [Google Scholar] [CrossRef]
- Ranganathan, P. An Introduction to Statistics: Choosing the Correct Statistical Test. Indian J. Crit. Care Med. 2021, 25, S184. [Google Scholar] [CrossRef] [PubMed]
- Cai, Y.; Lim, L.H. Distances between probability distributions of different dimensions. IEEE Trans. Inf. Theory 2022, 68, 4020–4031. [Google Scholar] [CrossRef]
- Lin, J. Divergence measures based on the Shannon entropy. IEEE Trans. Inf. Theory 1991, 37, 145–151. [Google Scholar] [CrossRef]
- Rubner, Y.; Tomasi, C.; Guibas, L. A metric for distributions with applications to image databases. In Proceedings of the Sixth International Conference on Computer Vision, Bombay, India, 7 January 1998. [Google Scholar] [CrossRef]
- Cover, T.M.; Thomas, J.A. Elements of Information Theory; Wiley-Interscience: Hoboken, NJ, USA, 2006. [Google Scholar]
- Erdos, P.; Renyi, A. On Random Graphs. Publ. Math. 1959, 18, 290–297. [Google Scholar] [CrossRef]
- Wolfram, S. A New Kind of Science; Wolfram Media: Champaign, IL, USA, 2002. [Google Scholar]
- Watson, A.J.; Lovelock, J.E. Biological homeostasis of the global environment: The parable of Daisyworld. Tellus B 1983, 35, 284–289. [Google Scholar] [CrossRef]
- Khumpuang, S.; Maekawa, H.; Hara, S. Photolithography for minimal fab system. IEEJ Trans. Sens. Micromachines 2013, 133, 272–277. [Google Scholar] [CrossRef]
- Mazzocchi, F. Complexity, network theory, and the epistemological issue. Kybernetes 2016, 45, 1158–1170. [Google Scholar] [CrossRef]
- Sorenson, O.; Rivkin, J.W.; Fleming, L. Complexity, Networks and Knowledge Flows; Edward Elgar Publishing: Northampton, MA, USA, 2010. [Google Scholar]
- Rybko, A.N.; Stolyar, A.L. Ergodicity of stochastic processes describing the operation of open queueing networks. Probl. Peredachi Informatsii 1992, 28, 3–26. [Google Scholar]
- Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139. [Google Scholar] [CrossRef]