Complexity Reduction in Analyzing Independence between Statistical Randomness Tests Using Mutual Information
Abstract
1. Introduction
2. Preliminaries
2.1. Mutual Information
2.2. Estimating Mutual Information
2.3. Distribution of Estimators
2.4. MIRT-1 Method
- Let $X$ and $Y$ be continuous random variables.
- Construct the permuted samples in such a way that the possible association between $X$ and $Y$ disappears, $\pi_i$ being the $i$-th permutation of the elements of $Y$, i.e.,
  - $(X, \pi_i(Y))$, for $i = 1, \ldots, B$;
  - $\pi_i \neq \pi_j$, for $i \neq j$;
  - $\pi_0$ is the identity of $Y$.
- Estimate the MI of the permuted samples to obtain $\hat{I}_i = \hat{I}(X, \pi_i(Y))$, where $i = 1, \ldots, B$.
- The p-value associated with the test is calculated by
  $$p = \frac{\#\{i : \hat{I}_i \geq \hat{I}_0\}}{B},$$
  where $\hat{I}_0$ is the MI estimated on the original (unpermuted) sample.
- If $p > \alpha$, then the null hypothesis is not rejected.
- Select PRNGs.
  - The selected generators must generate outputs that satisfy the randomness conditions.
- Build the data samples using the selected generators.
  - Generate n sequences of random numbers of length L to be evaluated using the selected statistical randomness tests.
  - Evaluate each of the n sequences using the k statistical randomness tests to obtain the corresponding p-values for each test.
- Compute the MI between sequences of p-values to detect the possible presence of correlations.
  - Estimate the MI between pairs of sequences of p-values to detect the presence of correlation.
  - In the case of the MIRT-1 method, the estimator used is the shrinkage estimator.
  - For a better interpretation in [17], the MI values were normalized, i.e., $\hat{I}_{\mathrm{norm}}(X,Y) = \hat{I}(X,Y)/\min\{H(X), H(Y)\}$, where $H(X)$ represents the entropy of the variable $X$.
- Determine the significant correlations to conclude the correlation between the tests using the permutation test. The MI values are grouped in a triangular matrix; a sketch of the permutation step is given after this list.
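A minimal Python sketch of the permutation step outlined above, under two stated simplifications: the MI is estimated with a histogram plug-in estimator over k = 10 equal-width bins (MIRT-1 actually uses the shrinkage estimator), and the function names (`plugin_mi`, `permutation_test_mi`) together with the choice of B = 999 permutations are illustrative rather than taken from the original implementation.

```python
import numpy as np

def plugin_mi(x, y, k=10):
    """Histogram (plug-in) estimate of I(X;Y) in nats, using k equal-width bins."""
    c_xy, _, _ = np.histogram2d(x, y, bins=k)
    p_xy = c_xy / c_xy.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X (column vector)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y (row vector)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

def permutation_test_mi(x, y, n_perm=999, alpha=0.01, seed=None):
    """Permutation test of independence based on mutual information (MIRT-1 style)."""
    rng = np.random.default_rng(seed)
    i_obs = plugin_mi(x, y)                                    # MI of the original pairing
    i_perm = np.array([plugin_mi(x, rng.permutation(y)) for _ in range(n_perm)])
    # p-value: fraction of permutations with MI at least as large as the observed value
    p_value = (1 + np.sum(i_perm >= i_obs)) / (n_perm + 1)
    return i_obs, p_value, p_value > alpha                     # True -> do not reject H0

# Example: two independent U(0,1) sequences of p-values should not be flagged.
rng = np.random.default_rng(0)
x, y = rng.uniform(size=1000), rng.uniform(size=1000)
print(permutation_test_mi(x, y, seed=1))
```

The cost of this step is B + 1 MI estimations per pair of tests, which is precisely the term the variant proposed in Section 3 removes.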
3. Reducing the Complexity of the Method Based on Mutual Information to Analyze the Independence between the Statistical Tests of Randomness
3.1. Solution Idea
3.2. Selection of the Critical Value to Determine the Significant Correlations
3.3. Method Modification Proposal
- Select PRNGs.
  - The selected generators must generate outputs that satisfy the randomness conditions.
- Build the data samples using the selected generators.
  - Generate n sequences of random numbers of length L to be evaluated using the selected statistical randomness tests.
  - Evaluate each of the n sequences using the k statistical randomness tests to obtain the corresponding p-values for each test.
- Compute the MI between sequences of p-values to detect the presence of correlations.
  - Calculate the MI between pairs of sequences of p-values to detect the presence of correlation.
  - The MI estimator used is the maximum-likelihood (ML) estimator.
- Calculate $\hat{I}_{\mathrm{ML}}$ and compare it with the critical value $I_{\mathrm{crit}}$ associated with the prefixed significance level $\alpha$. If $\hat{I}_{\mathrm{ML}} > I_{\mathrm{crit}}$, the null hypothesis of independence between the randomness tests is rejected (a sketch of this decision step is given after this list).
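A sketch of the corresponding decision step, assuming the histogram-based ML estimator with k = 10 bins and a critical value derived from the classical result that $2N\hat{I}_{\mathrm{ML}}$ is asymptotically $\chi^2$-distributed with $(k-1)^2$ degrees of freedom under independence (Wilks [52]), approximated here by a normal law. The exact calibration of the critical value used by MIRT-2 may differ, and the function names are illustrative.

```python
import numpy as np
from scipy import stats

def ml_mi(x, y, k=10):
    """Maximum-likelihood (plug-in) MI estimate in nats, k equal-width bins."""
    c_xy, _, _ = np.histogram2d(x, y, bins=k)
    p_xy = c_xy / c_xy.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

def mi_critical_value(n, k=10, alpha=0.01):
    """Critical value for the ML MI estimate under the independence hypothesis.

    Based on 2*n*I_hat being asymptotically chi-square with (k-1)^2 degrees of
    freedom, approximated by a normal law with the same mean and variance.
    """
    df = (k - 1) ** 2
    mean = df / (2 * n)                  # E[I_hat] under H0
    std = np.sqrt(2 * df) / (2 * n)      # sd[I_hat] under H0
    return mean + stats.norm.ppf(1 - alpha) * std

def mirt2_decision(pvals_a, pvals_b, k=10, alpha=0.01):
    """Reject independence between two randomness tests if MI exceeds the critical value."""
    i_hat = ml_mi(pvals_a, pvals_b, k=k)
    return i_hat, i_hat > mi_critical_value(len(pvals_a), k=k, alpha=alpha)
```

Replacing the B MI evaluations of the permutation test with a single evaluation and a precomputed critical value is what drives the running-time ratios reported in Section 4.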
4. Experimental Validation
4.1. Experimental Check of Normality of $\hat{I}_{\mathrm{ML}}$
4.1.1. Design of Experiment 1
4.1.2. Results
4.2. Analysis of the Effectiveness of the Proposed Variant
4.2.1. Design of Experiment 2
4.2.2. Results
4.2.3. Design of Experiment 3
4.3. Analysis of the Consistency and Stability of the MIRT-2 Method
4.3.1. Consistency
- Generate 10,000 samples of independent and identically distributed random variables $X$ and $Y$.
- Calculate the mutual information using both the SH and ML estimators for each sample.
- Calculate the mean and the difference between the two estimates for each sample. To check the assumption of normality of the differences, a test for normal distribution, such as the Shapiro–Wilk or Kolmogorov–Smirnov test, can be conducted for the hypothesis that the distribution of the observations in the sample is normal (Figure 9); if $p < \alpha$, then reject normality.
- Calculate the mean difference and the LOAs (limits of agreement), i.e., the mean difference ± 1.96 times the standard deviation of the differences.
- Interpret the results. If the mean difference is close to zero and the limits of agreement are narrow, then the two methods are considered to be in good agreement (a minimal sketch of this analysis follows this list).
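A minimal sketch of this agreement analysis, assuming the SH and ML estimates have already been computed for each of the 10,000 samples and using the conventional 1.96 multiplier for the 95% limits of agreement; the helper name `bland_altman` is illustrative.

```python
import numpy as np
from scipy import stats

def bland_altman(i_sh, i_ml, alpha=0.05):
    """Bland-Altman agreement analysis between shrinkage (SH) and ML MI estimates."""
    i_sh, i_ml = np.asarray(i_sh), np.asarray(i_ml)
    diffs = i_sh - i_ml
    # Normality of the differences (Shapiro-Wilk); reject normality if p < alpha.
    _, p_norm = stats.shapiro(diffs)
    bias = diffs.mean()
    sd = diffs.std(ddof=1)
    return {
        "bias": bias,
        "sd_of_bias": sd,
        "lower_loa": bias - 1.96 * sd,
        "upper_loa": bias + 1.96 * sd,
        "normal_differences": p_norm >= alpha,
    }
```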
4.3.2. Stability
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Skliar, O.; Monge, R.E.; Medina, V.; Gapper, S.; Oviedo, G. A Hybrid Random Number Generator (HRNG). Rev. de Matemática Teoría Apl. 2011, 18, 265. [Google Scholar] [CrossRef]
- Turan, M.S.; Doğanaksoy, A.; Boztaş, S. On Independence and Sensitivity of Statistical Randomness Tests. In Proceedings of the Sequences and Their Applications-SETA 2008, Lexington, KY, USA, 14–18 September 2008; Springer: Berlin/Heidelberg, Germany, 2008; Volume 5203, pp. 18–29. [Google Scholar]
- Koçak, O. A Unified Evaluation of Statistical Randomness Tests and Experimental Analysis of Their Relations. Ph.D. Thesis, Middle East Technical University, Ankara, Turkey, 2016. [Google Scholar]
- Garcia, F.D.; de Koning Gans, G.; Muijrers, R.; van Rossum, P.; Verdult, R.; Schreur, R.W.; Jacobs, B. Dismantling MIFARE Classic. In Proceedings of the Computer Security—ESORICS 2008, Torremolinos, Spain, 6–8 October 2008; Lecture Notes in Computer Science. Jajodia, S., Lopez, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2008; pp. 97–114. [Google Scholar] [CrossRef]
- Kasper, T.; Silbermann, M.; Paar, C. All You Can Eat or Breaking a Real-World Contactless Payment System. In Proceedings of the International Conference on Financial Cryptography and Data Security, Tenerife, Spain, 25–28 January 2010; Sion, R., Ed.; Springer: Berlin/Heidelberg, Germany, 2010; Volume 6052, pp. 343–350. [Google Scholar] [CrossRef]
- Bernstein, D.J.; Chang, Y.A.; Cheng, C.M.; Chou, L.P.; Heninger, N.; Lange, T.; van Someren, N. Factoring RSA Keys from Certified Smart Cards: Coppersmith in the Wild. In Proceedings of the Advances in Cryptology—ASIACRYPT, Bengaluru, India, 1–5 December 2013; Lecture Notes in Computer Science. Sako, K., Sarkar, P., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 341–360. [Google Scholar] [CrossRef]
- Bundesamt für Sicherheit in der Informationstechnik. Certification Report BSI-DSZ-CC-0212-2004 for Renesas AE45C1 (HD65145C1) Smartcard Integrated Circuit Version 01; Technical Report; Federal Office for Information Security: Bonn, Germany, 2004. [Google Scholar]
- Dunn, W.L.; Shultis, J.K. Pseudorandom Number Generators. In Exploring Monte Carlo Methods; Elsevier: Amsterdam, The Netherlands, 2012; pp. 47–68. [Google Scholar]
- Rukhin, A. Statistical Testing of Randomness: New and Old Procedures. In Randomness through Computation; World Scientific: Singapore, 2011. [Google Scholar]
- Hernandez-Castro, J.; Barrero, D.F. Evolutionary Generation and Degeneration of Randomness to Assess the Independence of the Ent Test Battery. In Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), Donostia, Spain, 5–8 June 2017; pp. 1420–1427. [Google Scholar] [CrossRef]
- Doğanaksoy, A.; Ege, B.; Muş, K. Extended Results for Independence and Sensitivity of NIST Randomness Tests. In Proceedings of the Information Security and Cryptography Conference, Istanbul, Turkey, 25–27 December 2008. [Google Scholar]
- Doğanaksoy, A.; Sulak, F.; Uğuz, M.; Şeker, O.; Akcengiz, Z. Mutual Correlation of NIST Statistical Randomness Tests and Comparison of Their Sensitivities on Transformed Sequences. Turk. J. Elec. Eng. Comp. Sci. 2017, 25, 655–665. [Google Scholar] [CrossRef]
- Hurley-Smith, D.; Hernandez-Castro, J. Great Expectations: A Critique of Current Approaches to Random Number Generation Testing & Certification. In Proceedings of the Security Standardisation Research, Darmstadt, Germany, 26–27 November 2018; Lecture Notes in Computer Science. Cremers, C., Lehmann, A., Eds.; Springer: Cham, Switzerland, 2018; Volume 11322, pp. 143–163. [Google Scholar] [CrossRef]
- Burciu, P.; Simion, E. A systematic approach of NIST statistical tests dependencies. J. Electr. Eng. Electron. Control. Comput. Sci. 2019, 5, 1–6. [Google Scholar]
- Sulak, F.; Uğuz, M.; Koçak, O.; Doğanaksoy, A. On the Independence of Statistical Randomness Tests Included in the NIST Test Suite. Turk. J. Elec. Eng. Comp. Sci. 2017, 25, 3673–3683. [Google Scholar] [CrossRef]
- Fan, L.; Chen, H.; Gao, S. A General Method to Evaluate the Correlation of Randomness Tests. In Proceedings of the International Workshop on Information Security Applications, Jeju Island, Republic of Korea, 19–21 August 2013; pp. 52–62. [Google Scholar]
- Karell-Albo, J.A.; Legón-Pérez, C.M.; Madarro-Capó, E.J.; Rojas, O.; Sosa-Gómez, G. Measuring independence between statistical randomness tests by mutual information. Entropy 2020, 22, 741. [Google Scholar] [CrossRef] [PubMed]
- Luengo, E.A.; Cerna, M.B.L.; Villalba, L.J.G.; Hernandez-Castro, J. A New Approach to Analyze the Independence of Statistical Tests of Randomness. Appl. Math. Comput. 2022, 426, 127116. [Google Scholar] [CrossRef]
- Luengo, E.A.; Cerna, M.B.L.; Villalba, L.J.G.; Hernandez-Castro, J.; Hurley-Smith, D. Critical Analysis of Hypothesis Tests in Federal Information Processing Standard (140-2). Entropy 2022, 24, 613. [Google Scholar]
- Cerna, M.B.L. Nuevas Técnicas Computacionales Para La Estimación de La Independencia de Los Tests de Aleatoriedad. 2021. Available online: https://docta.ucm.es/entities/publication/9c972153-b581-456f-b2a3-3bb7c3c256c9 (accessed on 1 October 2023).
- Thomas, J.A.; Cover, T. Elements of Information Theory; John Wiley & Sons, Inc.: New York, NY, USA, 1991; Volume 6, pp. 187–202. [Google Scholar]
- Darbellay, G.A. An Estimator of the Mutual Information Based on a Criterion for Conditional Independence. Comput. Stat. Data Anal. 1999, 32, 1–17. [Google Scholar] [CrossRef]
- Joe, H. Relative Entropy Measures of Multivariate Dependence. J. Am. Stat. Assoc. 1989, 84, 157–164. [Google Scholar] [CrossRef]
- Renyi, A. On the Foundations of Information Theory. Rev. Int. Stat. Inst. 1965, 33, 1–14. [Google Scholar] [CrossRef]
- Michalowicz, J.V.; Nichols, J.M.; Bucholtz, F. Handbook of Differential Entropy; CRC: Boca Raton, FL, USA, 2013. [Google Scholar]
- Fraser, A.M.; Swinney, H.L. Independent Coordinates for Strange Attractors from Mutual Information. Phys. Rev. A 1986, 33, 1134–1140. [Google Scholar] [CrossRef] [PubMed]
- Darbellay, G.; Vajda, I. Estimation of the Information by an Adaptive Partitioning of the Observation Space. IEEE Trans. Inf. Theory 1999, 45, 1315–1321. [Google Scholar] [CrossRef]
- Silverman, B.W. Density Estimation for Statistics and Data Analysis; CRC Press: Boca Raton, FL, USA, 1986; Volume 26. [Google Scholar]
- Moon, Y.I.; Rajagopalan, B.; Lall, U. Estimation of Mutual Information Using Kernel Density Estimators. Phys. Rev. E 1995, 52, 2318–2321. [Google Scholar] [CrossRef] [PubMed]
- Diks, C.; Manzan, S. Tests for Serial Independence and Linearity Based on Correlation Integrals. Stud. Nonlinear Dyn. Econom. 2002, 6. [Google Scholar] [CrossRef]
- Paninski, L. Estimation of Entropy and Mutual Information. Neural Comput. 2003, 15, 1191–1253. [Google Scholar] [CrossRef]
- Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating Mutual Information. Phys. Rev. E Stat. Physics Plasmas Fluids Relat. Interdiscip. Top. 2004, 69, 066138. [Google Scholar] [CrossRef]
- Daub, C.O.; Steuer, R.; Selbig, J.; Kloska, S. Estimating Mutual Information Using B-spline Functions—An Improved Similarity Measure for Analysing Gene Expression Data. BMC Bioinform. 2004, 5, 118. [Google Scholar] [CrossRef]
- Blinnikov, S.; Moessner, R. Expansions for Nearly Gaussian Distributions. Astron. Astrophys. Suppl. Ser. 1998, 130, 193–205. [Google Scholar] [CrossRef]
- Yavuz, Z.K.; Aydin, N.; Altay, G. Comprehensive review of association estimators for the inference of gene networks. Turk. J. Electr. Eng. Comput. Sci. 2016, 24, 695–718. [Google Scholar]
- Contreras Rodríguez, L.; Madarro-Capó, E.J.; Legón-Pérez, C.M.; Rojas, O.; Sosa-Gómez, G.S.G. Selecting an Effective Entropy Estimator for Short Sequences of Bits and Bytes with Maximum Entropy. Entropy 2021, 23, 561. [Google Scholar] [CrossRef]
- Papana, A.; Kugiumtzis, D. Evaluation of Mutual Information Estimators for Time Series. Int. J. Bifurc. Chaos 2009, 19, 4197–4215. [Google Scholar] [CrossRef]
- Sulewski, P. Equal-Bin-Width Histogram versus Equal-Bin-Count Histogram. J. Appl. Stat. 2021, 48, 2092–2111. [Google Scholar] [CrossRef] [PubMed]
- Sturges, H.A. The Choice of a Class Interval. J. Am. Stat. Assoc. 1926, 21, 65–66. [Google Scholar] [CrossRef]
- Cochran, W.G. Some Methods for Strengthening the Common χ2 Tests. Biom. J. Int. Biom. Soc. 1954, 10, 417–451. [Google Scholar]
- Lane, D. Online Statistics Education: A Multimedia Course of Study. Available online: https://onlinestatbook.com/ (accessed on 13 May 2023).
- Cencov, N.N. Estimation of an Unknown Distribution Density from Observations. Soviet Math. 1962, 3, 1559–1566. [Google Scholar]
- Bendat, J.S.; Piersol, A.G. Measurement and Analysis of Random Data, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 1966. [Google Scholar]
- Larson, H.J. Statistics: An Introduction; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1975. [Google Scholar]
- Velleman, P. Interactive Computing for Exploratory Data Analysis I: Display Algorithms. Proc. Stat. Comput. Sect. 1976, 142–147. [Google Scholar]
- Doane, D.P. Aesthetic Frequency Classifications. Am. Stat. 1976, 30, 181–183. [Google Scholar]
- Mosteller, F.; Tukey, J.W. Data Analysis and Regression; Addison-Wesley Series in Behavioral Science: Quantitative Methods; Addison-Wesley: Reading, MA, USA, 1977. [Google Scholar]
- Terrell, G.R.; Scott, D.W. Oversmoothed Nonparametric Density Estimates. J. Am. Stat. Assoc. 1985, 80, 209–214. [Google Scholar] [CrossRef]
- Ishikawa, K. Guide to Quality Control; UNIPUB/Kraus International: White Plains, NY, USA, 1986. [Google Scholar]
- Boulle, M. Optimal Bin Number for Equal Frequency Discretizations in Supervized Learning. Intell. Data Anal. 2005, 9, 175–188. [Google Scholar] [CrossRef]
- Zhang, Z. Statistical Implications of Turing’s Formula; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2016. [Google Scholar] [CrossRef]
- Wilks, S.S. The Large-Sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses. Ann. Math. Stat. 1938, 9, 60–62. [Google Scholar] [CrossRef]
- Menezes, A.J.; van Oorschot, P.C.; Vanstone, S.A. Handbook of Applied Cryptography; CRC Press: Boca Raton, FL, USA, 1997. [Google Scholar]
- Madarro-Capó, E.J.; Legón-Pérez, C.M.; Rojas, O.; Sosa-Gómez, G. Information theory based evaluation of the RC4 stream cipher outputs. Entropy 2021, 23, 896. [Google Scholar] [CrossRef] [PubMed]
Rules | k |
---|---|
Sturges [39] | |
Cochran [40] | |
Rice [41] | |
Cencov [42] | |
Bendat–Piersol [43] | |
Larson [44] | |
Velleman [45] | if |
if | |
Doane [46] | |
Mosteller–Tukey [47] | |
Terrell–Scott [48] | |
Ishikawa [49] |
Method | MIRT-1 | MIRT-2 |
---|---|---|
Distribution of the p-values | U(0,1) | U(0,1) |
MI estimator | shrinkage (SH) | maximum-likelihood (ML) |
Discretization | k = 10 | k = 10 |
Selection of significant values | Comparison with the critical value obtained by the permutation test. | Comparison with the critical value of the normal distribution for the prefixed significance level $\alpha$.
Normality Test | p-Value | |
---|---|---|
Anderson–Darling | ||
Kolmogorov–Smirnov |
MIRT-1 | MIRT-2 | |
---|---|---|
App. Ent. | ≈Serial 1 | ≈Serial 1 |
CUSUM (f) | ≈CUSUM (b) ≈Frequency | ≈CUSUM (b) ≈Frequency |
CUSUM (b) | ≈Frequency ≈Non-Overlapping | ≈Frequency |
Long. Run | ≈Overlapping | ≈Overlapping |
Random Ex. | ≈Random Ex. Variant | ≈Random Ex. Variant |
Serial 1 | ≈Serial 2 | ≈Serial 2 |
Execution | MIRT-1 | MIRT-2 | Quotient |
---|---|---|---|
1 | 6249.36 | 0.33 | 18,937.45 |
2 | 6230.96 | 0.29 | 21,486.07 |
3 | 6250.03 | 0.3 | 20,833.43 |
4 | 6238.48 | 0.31 | 20,124.13 |
5 | 6240.17 | 0.41 | 15,219.93 |
6 | 6234.23 | 0.31 | 20,110.42 |
7 | 6236.86 | 0.29 | 21,506.41 |
8 | 6244.91 | 0.30 | 20,816.37 |
9 | 6241.02 | 0.30 | 20,803.40 |
10 | 6246.43 | 0.42 | 14,872.45 |
Mean | 6241.245 | 0.326 | 19,144.92
Variance | 40.713 | 0.002 | 17,415.26
Statistic | Value
---|---|
Bias | 0.001768216
Standard deviation of bias | 0.0002812867
Upper LOA | 0.002319538
Lower LOA | 0.001216894
Mean of differences/means | 202.4483
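For reference, the upper and lower limits of agreement in the table follow from the bias and its standard deviation via the usual 1.96 multiplier:
$$0.001768216 + 1.96 \times 0.0002812867 \approx 0.002319538, \qquad 0.001768216 - 1.96 \times 0.0002812867 \approx 0.001216894.$$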