Decoding the Atmosphere: Optimising Probabilistic Forecasts with Information Gain
Abstract
1. Introduction
2. Background: Brier Score and Information
- The BS gives a disproportionate weight to common events.
- The BS is dimensionless, which makes interpretation straightforward but renders comparison between forecasts of different variables difficult and subjective.
- Using MSE to measure the distance between two pdfs is constrained by the bounded range of probabilities, imposing a geometric behaviour on the BS that differs from the typical interpretation of MSE between unbounded scalars.
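The bounded-MSE character of the BS can be made concrete with a minimal sketch (our own function name and toy values, not the paper's code):

```python
import numpy as np

def brier_score(f, o):
    """Mean squared error between forecast probabilities f in [0, 1]
    and binary outcomes o in {0, 1}."""
    f = np.asarray(f, dtype=float)
    o = np.asarray(o, dtype=float)
    return float(np.mean((f - o) ** 2))

# Because f and o are bounded, the worst per-event penalty is 1.0
# (e.g. forecasting 0 when the event occurs), unlike MSE between
# unbounded scalars, which has no upper limit.
f = [0.9, 0.1, 0.0]
o = [1, 0, 1]
print(brier_score(f, o))  # (0.01 + 0.01 + 1.0) / 3 ≈ 0.34
```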
3. Evaluating Forecasts of Rare Events
3.1. Desired Scoring Rule
- What do we wish to quantify regarding the forecast’s benefit?
- To convert a complex set of forecast and observed values into an insightful score, we must shed information (in both the colloquial sense and that of information theory) to reduce the uncertainty distributions to a single or small set of values that convey characteristics of the forecasts. How should we perform this reduction while preserving as much salient information as possible?
3.2. Measuring Skill with Information Gain
3.3. Implications of Using Brier Score to Evaluate Predictions of Rare Events
4. Decomposition into Reliability and Discrimination
5. Visualising Surprise
5.1. Worked Example
5.2. Evaluating Our Test Case
- The event occurs but was not forecast, as happens very early in the event in our example; and
- The event does not occur but was forecast, as happens after the event.
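The surprise incurred in these two failure modes can be illustrated with the Ignorance (logarithmic) score, which charges −log2 of the probability the forecast assigned to what actually happened. This is a sketch with our own function name; the clipping threshold `eps` is an illustrative choice, not a prescribed value:

```python
import numpy as np

def ignorance(f, o, eps=1e-7):
    """Bits of surprise: -log2 of the probability assigned to the
    observed outcome. Probabilities are clipped away from 0 and 1
    to keep the score finite for non-verifying binary forecasts."""
    f = np.clip(np.asarray(f, dtype=float), eps, 1 - eps)
    o = np.asarray(o, dtype=float)
    return -np.log2(np.where(o == 1, f, 1 - f))

# Event occurs but was given only 1% probability: large surprise.
print(ignorance(0.01, 1))   # ~6.64 bits
# Event forecast at 99% probability but does not occur: equally large surprise.
print(ignorance(0.99, 0))   # ~6.64 bits
```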
6. Synthesis
- Information gain accounts for higher moments of the probability distribution; the Brier Score, being equal to a second-order approximation of Ignorance, captures a mean error but not more complex aspects of the probability distributions such as kurtosis.
- The two scores diverge sharply at extreme probabilities, such as those that characterise a dataset of rare events, because the BS saturates as probabilities tend to zero rather than tending to infinity as a logarithmic score does;
- Information as measured in units of bits allows comparison between variables and joins a more universal framework of information transfer or surprise removal; the idea of optimising through information gain is shared with machine learning, for instance, and allows cross-comparison of system-component performance;
- We can decompose the gain or loss of information into reliability–discrimination components as with the Brier family of scores;
- While we laud objectively beneficial mathematical properties, information-based scores (unlike the Brier family) require the subjective clipping of certain forecast probabilities: binary events that do not verify yield an infinite error, reflecting the consequence of believing a prediction without question.
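The contrast in the second bullet can be checked numerically: for an event that occurs, the Brier penalty saturates at 1 as the forecast probability shrinks toward zero, while Ignorance grows without bound. A small sketch of our own (not the paper's code):

```python
import numpy as np

# Event occurs (o = 1); shrink the forecast probability f toward zero.
# The Brier penalty (f - 1)^2 is capped at 1, whereas the Ignorance
# penalty -log2(f) diverges, heavily punishing confident misses.
for f in [0.5, 0.1, 0.01, 0.001]:
    bs = (f - 1) ** 2
    ign = -np.log2(f)
    print(f"f={f:>6}: BS={bs:.3f}  IGN={ign:.2f} bits")
```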
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
IG | Information Gain |
IGN(SS) | Ignorance (Skill Score) |
B(S)S | Brier (Skill) Score |
f | Forecast probability scalar |
o | Observation (0 or 1) scalar |
K | Probability bins |
REL | Reliability |
DSC | Discrimination |
UNC | Uncertainty |
DKL | Kullback–Leibler Divergence |
Appendix A. Brier Score as an Approximation of Ignorance
Appendix B. Histogram Binning for Forecast Evaluation
Appendix C. Decomposition into Reliability, Discrimination, and Uncertainty
Appendix C.1. Kullback–Leibler Divergence (DKL)
Appendix C.2. Uncertainty (UNC)
Appendix C.3. Reliability (REL)
Appendix C.4. Discrimination (DSC)
Appendix C.5. Full Expansion and Link with Brier Score and Cross-Entropy
Appendix D. Generating a Synthetic Time Series
Term | Synonyms |
---|---|
Ignorance | Entropy, Uncertainty, Surprise |
Discrimination | Resolution, Sharpness, Goodness |
Reliability | Calibration, Spread, Fit, Confidence |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lawson, J.R.; Potvin, C.K.; Nelson, K. Decoding the Atmosphere: Optimising Probabilistic Forecasts with Information Gain. Meteorology 2024, 3, 212-231. https://doi.org/10.3390/meteorology3020010