From Whence Commeth Data Misreporting? A Survey of Benford’s Law and Digit Analysis in the Time of the COVID-19 Pandemic
Abstract
1. Introduction
2. Materials and Methods
2.1. Data
2.2. Method
3. Results of Bibliometric Analysis
3.1. Performance Analysis
3.1.1. Publications Related Metrics
3.1.2. Citation Analysis
3.1.3. Collaboration Analysis
3.2. Science Mapping
3.2.1. Social Structure—Co-Authorship Analysis
3.2.2. References Co-Citation Analysis
3.2.3. Conceptual Structure—Co-Word Analysis
4. Results of Content Survey and Analysis
5. Discussion of Results
6. Implications and Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Database | No. of Articles | No. of Publishing Sources | Average Citations per Article | Highest Number of Citations per Article |
---|---|---|---|---|
WoS | 23 | 20 | 5.783 | 24 |
Scopus | 9 | 9 | 1.33 | 7 |
Cluster | Country/Countries | Authors’ Research Interest |
---|---|---|
Cluster 1 | Colombia | Public Health |
Cluster 2 | Brazil | Political Science and Quantitative Methods |
Cluster 3 | Mexico | Statistics |
Cluster 4 | Italy | Data Analytics and Engineering |
Cluster 5 | Brazil | Business Admin and Pharmaceutical Sciences |
Cluster 6 | Spain | Applied Economics and Corporate Finance |
Cluster 7 | US | Audit, Accounting, Corporate Governance, and Population Genetics |
Cluster 8 | US | Causal Inference and Business Analytics |
Cluster 9 | China and Norway | Data mining, Machine Learning, and Big Data Analysis |
Cluster 10 | US and UK | Monetary Policy, Macroeconomics, and Financial Stability |
Cluster 11 | Singapore and Sweden | Public Health and Medical Sciences |
Cluster 12 | Brazil and Portugal | Complex Systems and Statistical Methods |
Cluster | Country/Countries | Authors’ Research Interest |
---|---|---|
Cluster 1 | Brazil | Data Analysis and Artificial Intelligence |
Cluster 2 | US and UK | Healthcare |
Cluster 3 | Malaysia | No particular pattern emerges |
Cluster 4 | Brazil | No particular pattern emerges |
Cluster 5 | India | No particular pattern emerges |
Type of Statistical Test | Frequency |
---|---|
Z-statistic | 24 |
Chi-square | 18 |
MAD | 10 |
Goodness of Fit | 3 |
Kuiper | 5 |
Log likelihood ratio test | 3 |
Distortion factor (DF) | 3 |
Euclidean distance | 3 |
KSD statistic | 5 |
Moreno-Montoya test | 1 |
Chebyshev distance M statistic | 1 |
Leemis M statistic | 1 |
Cho and Gaines D-Statistic | 1 |
SSD | 1 |
RMSD/RMSE | 2 |
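
Two of the most frequently used statistics in the table above, the chi-square statistic and the mean absolute deviation (MAD), can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any surveyed paper: the function names are hypothetical, only first digits are tested, and the sample (powers of 2, a sequence known to follow Benford's Law closely) is illustrative rather than COVID-19 data.

```python
import math
from collections import Counter


def benford_expected():
    """Expected first-digit probabilities under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading digit of a non-zero integer count."""
    return int(str(abs(int(n)))[0])


def benford_stats(values):
    """Return (chi-square statistic, MAD) for the first digits of `values`."""
    digits = [first_digit(v) for v in values if int(v) != 0]  # zeros have no leading digit
    n = len(digits)
    counts = Counter(digits)
    expected = benford_expected()
    # Chi-square: squared gap between observed and expected counts, scaled by expected
    chi2 = sum((counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in expected.items())
    # MAD: average absolute gap between observed and expected proportions over 9 digits
    mad = sum(abs(counts.get(d, 0) / n - p) for d, p in expected.items()) / 9
    return chi2, mad


# Hypothetical illustrative sample, not data from the survey
sample = [2 ** k for k in range(1, 201)]
chi2, mad = benford_stats(sample)
print(f"chi-square = {chi2:.3f}, MAD = {mad:.4f}")
```

With eight degrees of freedom, the chi-square statistic would be compared against a critical value of about 15.507 at the 5% level. The chi-square statistic grows with sample size, whereas MAD does not depend on it, which is one reason the two are often reported together.
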
Number of Different Tests Used | Frequency |
---|---|
1 | 2 |
2 | 6 |
3 | 4 |
4 | 10 |
5 | 2 |
6 | 1 |
Source of the Data | Number of Papers |
---|---|
North America | 5 |
Latin America | 5 |
Western Europe | 3 |
Eastern Europe | 0 |
China | 2 |
India | 0 |
Other Asia | 2 |
World (multiple countries and continents) | 10 |
Summary of Findings | Frequency |
---|---|
Some deviation from Benford’s Law | 13 |
Persistent deviation from Benford’s Law | 7 |
Plausible misreporting | 16 |
Tampering with data | 6 |
Methodological rationalization | 9 |
Pairs of Categorical Variables (H0: Independence) | Fisher’s Exact Test Two-Tailed p-Value | Odds Ratio |
---|---|---|
Period vs. severity of deviation from Benford | 0.645 | 2.153 |
Period vs. rationalization | 0.344 | 0.374 |
No. of different tests vs. severity of deviation from Benford | 0.076 * | n.a. |
No. of different tests vs. rationalization | 0.095 | n.a. |
Severity of deviation from Benford vs. rationalization | 0.032 ** | 9.347 |
No. of countries vs. severity of deviation from Benford | 0.073 * | 8.63 |
No. of countries vs. rationalization | 0.644 | 2.153 |
Deviation from Benford vs. methodological caveat | 0.205 | 3.281 |
No. of different tests vs. methodological caveat | 0.616 | n.a. |
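
For readers unfamiliar with the test reported in the table above, the following is a minimal sketch of a two-tailed Fisher's exact test on a 2×2 contingency table, written with the Python standard library only. The function name and the example counts are hypothetical and are not taken from the survey. The odds ratio computed here is the simple sample cross-product ratio, which can differ from the conditional estimate that some statistical packages report, so it should not be read as the method behind the table's values.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-tailed Fisher's exact p-value and sample odds ratio for [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with all margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Two-tailed: add up every table that is no more likely than the observed one
    p_value = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return min(p_value, 1.0), odds_ratio


# Hypothetical 2x2 table of two binary paper characteristics (illustrative counts only)
p, odds = fisher_exact_2x2(3, 1, 1, 3)
print(f"p = {p:.4f}, odds ratio = {odds:.1f}")  # p = 0.4857, odds ratio = 9.0
```

A large odds ratio paired with a large p-value, as in this toy table, is typical of small samples: the association looks strong but the data are too few to reject independence.
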
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Vâlsan, C.; Puiu, A.-I.; Druică, E. From Whence Commeth Data Misreporting? A Survey of Benford’s Law and Digit Analysis in the Time of the COVID-19 Pandemic. Mathematics 2024, 12, 2579. https://doi.org/10.3390/math12162579