Knowable Moments in Stochastics: Knowing Their Advantages
Abstract
1. Introduction
- We redefine K-moments (Section 2.1) as expectations of maxima or minima of a number of stochastic variables that are independent, identically distributed copies of the stochastic variable of interest. The new definition is clearer, more rigorous and more intuitive. In most cases of interest (but not in all), the new definition does not imply changes in the computational framework (Section 2.3).
- We provide a more intuitive explanation of the statistical estimators of K-moments, which are adapted to the new definition, again without implying computational differences in most cases. Furthermore, we provide techniques to accelerate the calculations when the datasets are large, e.g., with millions or billions of values (Section 2.2).
- We extend the scope of K-moments from continuous stochastic variables to also include discrete variables (Section 2.4).
- We discuss cases of generalization where the new definition implies some differences, notational and computational, to the existing one (Section 2.5).
- We discuss the advantages of the K-moment framework, starting from the information it provides about what a classical moment estimator actually determines, which is not the true value of that classical moment (Section 3.1); this is why classical moments are unknowable from samples for orders beyond 3–4.
- We show that K-moments for moment orders up to 4 replace the information contained in classical moments and L-moments, such as summary statistics (Section 3.2).
- We show that the real power of the K-moment framework is its ability to estimate moments reliably and unbiasedly for very high orders up to the sample size, even if this is several million or more. We also show the ability of the framework to readily assign, in a simple manner, a value of the distribution function to each K-moment—a property not shared by any other type of moments (Section 3.3).
- Exploiting the above features, we show how K-moments provide a sound and flexible framework for model fitting, making optimal use of the entire dataset, rather than relying on a few moments (as with classical moments and L-moments) or assigning probabilities to single data values (as with order statistics). This enables utilization of the highest possible moment orders, which are particularly useful in modelling extreme highs or lows, as these are closely associated with high-order moments. The model-fitting concept with K-moments is a new strategy, not shared by any other method of moments, and includes visualization of the goodness of fit (Section 3.4).
- We illustrate that K-moment estimators offer the ability to estimate the probability density function from a sample, a unique feature of the K-moment methodology (Section 3.5).
- We show that K-moments offer the unique advantage of taking into account the estimation bias when the data are not an independent sample but a time series from a process with dependence, even of long range (Section 3.6).
- We provide algorithmic details of the computational framework of K-moments (particularly in Section 2.2, Section 3.4 and Section 3.6).
2. Definitions and Main Derivations
2.1. Definition and Meaning
2.2. Estimation
- We sort the sample into ascending order, i.e., we designate the order statistics x(1) ≤ x(2) ≤ … ≤ x(n).
- For a specific moment order p, we employ Equation (18) to find the weights for i = 1, …, n.
- From Equation (15), we calculate the upper K-moment estimate.
- From Equation (16), we calculate the lower K-moment estimate.
- We repeat steps 2–4 for all required orders p.
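The steps above can be sketched in code. Since Equations (15) and (18) are not reproduced here, the weight formula below is an assumption: it takes b(i, n, p) = (p/n)·C(i−1, p−1)/C(n−1, p−1), which is the standard unbiased estimator of the expected maximum of p independent copies for an i.i.d. sample; the function name and symbols are illustrative.

```python
import math

def upper_k_moment(sample, p):
    """Estimate the noncentral upper K-moment of order p, i.e., the
    expected maximum of p i.i.d. copies of the variable of interest.

    Assumed weights: b(i, n, p) = (p/n) * C(i-1, p-1) / C(n-1, p-1),
    evaluated in log-space so that large n and p do not overflow.
    """
    x = sorted(sample)                            # step 1: ascending order
    n = len(x)
    lg = math.lgamma                              # log-gamma gives log-binomials
    log_cref = lg(n) - lg(p) - lg(n - p + 1)      # log C(n-1, p-1)
    total = 0.0
    for i in range(p, n + 1):                     # weights vanish for i < p
        log_ci = lg(i) - lg(p) - lg(i - p + 1)    # log C(i-1, p-1)
        total += (p / n) * math.exp(log_ci - log_cref) * x[i - 1]
    return total
```

For p = 1 this returns the sample mean and for p = n the sample maximum; the lower K-moment is obtained symmetrically, by applying the mirrored weights to the smallest order statistics (equivalently, by negating the sample).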
2.3. Theoretical Calculations for Continuous Stochastic Variables, q = 1
2.4. Theoretical Calculations for Discrete Stochastic Variables, q = 1
2.5. Theoretical Calculations for Continuous Stochastic Variables, q > 1
3. Applications and Advantages
3.1. Evaluation of Classical Moment Estimates
3.2. Summary Statistics
3.3. Estimation of Distribution Function
3.4. Fitting a Distribution Function
- Why use the first and second moments and not, say, the second and third? One may justify the standard choice of the lowest possible orders by the fact that higher moments are estimated less accurately. On the other hand, one may counter that, when we are interested in extremes, these are better reflected in higher-order moments. It is well known that a model can hardly be a perfect representation of reality. Thus, we cannot expect a model fitted on the first and second moments to be equally suitable for the distribution tail, i.e., the behaviour of extremes.
- Why use two moments and not more? The standard answer, that two equations suffice to find two unknowns, may be adequate from a theoretical mathematical point of view, but not from an empirical and engineering one. (As the saying goes, to draw a straight line a mathematician needs two points, but an engineer needs three.) Certainly, an optimization framework (as in maximizing likelihood or minimizing fitting error) is preferable to solving a system of equations.
This criticism would be valid if the true distribution function were known to be the one chosen as a model for the real-world process studied. But this is hardly ever the case. Let us assume that the time series of flow observations contains three very high values and that we have chosen a certain model, e.g., a Lognormal distribution. How can we be sure that the model is correct? If we are not sure (which is actually always the case), and if we are to design a certain engineering structure, would we prefer a fit of the chosen model that is consistent with theoretical considerations, e.g., based on the maximum likelihood method, even if it departs from the three high values? Or would we feel safer if our fit represents the three high values well?
- We choose a number m of moment orders p_1 < p_2 < … < p_m, with p_m = n. When the sample size n is of the order of several thousand or more, the number m can be chosen much smaller than n, e.g., of the order of 100, to speed up the calculations without compromising accuracy. The orders need not be natural numbers.
- We estimate the upper K-moments of orders p_1, …, p_m using Equations (15) and (18).
- Assuming default values of the distribution parameters, represented as a vector θ, we determine Λ1 from Equation (65), using the theoretical relationships between the parameters of the specified distribution and the mean (such as those in Table 1 or Table 4) and the expression of the distribution function. Alternatively, but not preferably, we can estimate the mean from the sample, along with Equation (65).
- As the vector θ contains the tail index ξ, we determine Λ∞ from the relationships of Table 5.
- Given Λ1 and Λ∞, for each estimated K-moment we compute the empirical distribution function value from Equation (71).
- Given the parameter values in vector θ, for each estimated K-moment, we evaluate the theoretical distribution function at that K-moment value from the expression of the distribution function.
- We form an expression for the total fitting error E as the sum, over the m orders, of the differences between the empirical and theoretical distribution function values.
- We repeat the calculations of steps 3–7 for different parameter vectors θ until the fitting error becomes minimal. The repetitions are executed by a solver (available in every software environment), using the error E as the objective function to be minimized.
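The strategy above — many orders spanning the whole sample, fitted by minimizing a total error rather than solving equations — can be illustrated in a self-contained form. Since Equations (65) and (71) are not reproduced here, the sketch below substitutes a simpler, well-known target: for an exponential distribution with mean mu, the expected maximum of p i.i.d. copies is exactly mu·H_p, with H_p the p-th harmonic number, so the scale can be fitted by least squares over geometrically spaced orders. All names and the weight formula are illustrative assumptions, not the paper's exact procedure.

```python
import math
import random

def upper_k_moment(sample, p):
    # assumed weights b(i,n,p) = (p/n) * C(i-1,p-1) / C(n-1,p-1), in log-space
    x = sorted(sample)
    n = len(x)
    lg = math.lgamma
    log_cref = lg(n) - lg(p) - lg(n - p + 1)
    return sum((p / n) * math.exp(lg(i) - lg(p) - lg(i - p + 1) - log_cref) * x[i - 1]
               for i in range(p, n + 1))

def fit_exponential_scale(sample, orders):
    """Least-squares fit of the exponential scale mu: minimize
    sum_p (Khat_p - mu * H_p)^2, which has the closed form below."""
    k_hat = [upper_k_moment(sample, p) for p in orders]
    h = [sum(1.0 / k for k in range(1, p + 1)) for p in orders]  # harmonic numbers H_p
    return sum(kh * hp for kh, hp in zip(k_hat, h)) / sum(hp * hp for hp in h)

random.seed(1)
sample = [random.expovariate(1 / 3.0) for _ in range(10_000)]   # true scale mu = 3
orders = [1, 4, 16, 64, 256, 1024, 4096, 10_000]                # geometric, up to n
mu_hat = fit_exponential_scale(sample, orders)
```

The geometric spacing of the orders reflects the first step above: a modest number m of orders spanning 1 to n covers the whole distribution, including its tail, at little computational cost.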
3.5. Estimation of Density Function
3.6. Accounting for Estimation Bias in Time Series with Time Dependence
- We construct the climacogram and estimate the Hurst parameter H.
- From Equation (84), we estimate the relative bias.
- We choose a moment order p and estimate, by the standard estimators (15)–(18), the upper (or lower) K-moment as if we had an independent sample.
- We adapt the moment order by Equation (85) and infer that the quantity estimated in step 3 is an estimate of the upper (or lower) K-moment of the adapted order.
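Step 1 can be sketched as follows; the adjustment in steps 2–4 depends on Equations (84) and (85), which are not reproduced here, so the sketch stops at the climacogram-based estimate of H. The log-log regression below is a naive version that ignores the estimation bias of the climacogram itself (which the full framework accounts for); the function name is illustrative.

```python
import math
import random

def climacogram_hurst(series, scales):
    """Estimate the Hurst parameter H from the climacogram: the variance
    of the time-averaged process at scale kappa behaves like kappa^(2H - 2),
    so H = 1 + slope/2, with the slope fitted in log-log coordinates."""
    log_k, log_g = [], []
    for k in scales:
        m = len(series) // k                       # number of scale-k blocks
        means = [sum(series[j * k:(j + 1) * k]) / k for j in range(m)]
        mu = sum(means) / m
        gamma = sum((v - mu) ** 2 for v in means) / (m - 1)   # climacogram value
        log_k.append(math.log(k))
        log_g.append(math.log(gamma))
    kbar = sum(log_k) / len(log_k)
    gbar = sum(log_g) / len(log_g)
    slope = (sum((a - kbar) * (b - gbar) for a, b in zip(log_k, log_g))
             / sum((a - kbar) ** 2 for a in log_k))
    return 1 + slope / 2

random.seed(2)
white_noise = [random.gauss(0.0, 1.0) for _ in range(20_000)]
H = climacogram_hurst(white_noise, [1, 2, 4, 8, 16, 32, 64])  # near 0.5 for white noise
```

For white noise the variance of scale-k means decays as 1/k, giving H ≈ 0.5; persistent processes decay more slowly and yield H > 0.5, the case where the bias adjustment of steps 2–4 matters most.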
4. Discussion
5. Summary and Conclusions
Supplementary Materials
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Proof of Equations (46) and (49)
Characteristic | Equation | Equation No.
---|---|---
Distribution function | | (20)
Classical noncentral moment | | (21)
Upper K-moment | | (22)
Lower K-moment | | (23)
Derivative Order | −3 | −2 | −1 | 0 | 1 | 2 | 3
---|---|---|---|---|---|---|---
0 | | | | 1 | | |
1 | | | −1/2 | 0 | 1/2 | |
2 | | | 1 | −2 | 1 | |
3 | | −1/2 | 1 | 0 | −1 | 1/2 |
4 | | 1 | −4 | 6 | −4 | 1 |
5 | −1/2 | 2 | −5/2 | 0 | 5/2 | −2 | 1/2
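The entries in the table above are the standard central-difference coefficients at the indicated offsets. As a generic illustration (not the paper's equation-specific procedure), the first-derivative row can be used to recover a density numerically from a distribution function:

```python
import math

# first-derivative central-difference weights from the table (offsets -1, 0, 1);
# the weighted sum is divided by the step h
FIRST_DERIVATIVE = {-1: -0.5, 0: 0.0, 1: 0.5}

def density_from_cdf(F, x, h=1e-4):
    """Approximate f(x) = dF/dx with a second-order central difference."""
    return sum(c * F(x + j * h) for j, c in FIRST_DERIVATIVE.items()) / h

# illustration: the standard normal distribution function
Phi = lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
f0 = density_from_cdf(Phi, 0.0)   # close to 1/sqrt(2*pi)
```

Higher derivative orders work the same way, with the corresponding row of coefficients and division by h raised to the derivative order.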
Characteristic | |
---|---|---
Location | |
Dispersion | |
Skewness | |
Kurtosis | |
Case | Symbol | |
---|---|---|---
0 | | |
0 | 2 | |
3 | 9 | |
Characteristic | Definition of Tail Index 1 | Asymptotic Λ | Equation No.
---|---|---|---
| 0 | 1/2 | 1
Upper bounded by , tail index | 0 | 1 | (66)
Upper unbounded, tail index | | – | (67)
Lower bounded by , tail index | 0 | 1 | (68)
Lower unbounded, tail index | | – | (69)
Koutsoyiannis, D. Knowable Moments in Stochastics: Knowing Their Advantages. Axioms 2023, 12, 590. https://doi.org/10.3390/axioms12060590