Written Documents Analyzed as Nature-Inspired Processes: Persistence, Anti-Persistence, and Random Walks—We Remember, as Along Came Writing—T. Holopainen
Abstract
Featured Application
1. Introduction
Scientists for the first time have detected gravitational waves, ripples in space and time hypothesized by Albert Einstein a century ago, in a landmark discovery announced on Thursday that opens a new window for studying the cosmos. The waves were unleashed by the collision of the black holes, one of them 29 times the mass of the sun and the other 36 times the solar mass, located 1.3 billion light years from Earth, the researchers said. The scientific milestone was achieved using a pair of giant laser detectors in the United States, located in Louisiana and Washington state, capping a decades-long quest to find these waves. They detected remarkably small vibrations from the gravitational waves as they passed through the Earth. The scientists converted the wave signal into audio waves and listened to the sounds of the black holes merging.
On 14 September 2015 at 09:50:45 UTC the two detectors of the Laser Interferometer Gravitational-Wave Observatory simultaneously observed a transient gravitational-wave signal. The signal sweeps upwards in frequency from 35 to 250 Hz with a peak gravitational-wave strain of 1.0 × 10⁻²¹. It matches the waveform predicted by general relativity for the inspiral and merger of a pair of black holes and the ringdown of the resulting single black hole. The signal was observed with a matched-filter signal-to-noise ratio of 24 and a false alarm rate estimated to be less than 1 event per 203,000 years, equivalent to a significance greater than 5.1σ... These observations demonstrate the existence of binary stellar-mass black hole systems. This is the first direct detection of gravitational waves and the first observation of a binary black hole merger.
1.1. Automated Text Classification
1.2. e-Research
1.3. Analyzing Written Texts as Natural Phenomena
- If 0.5 < H < 1, then the time series represents a persistent process where the trend of previous steps is likely to be kept, i.e., the time series is self-similar. A persistent time series possesses long-term memory [23].
- If 0 < H < 0.5, then the time series represents an anti-persistent process with oscillations. An anti-persistent time series will exhibit higher noise and more volatility.
- If H = 0.5, then the time series represents a process with no dependencies, or a random walk. It corresponds to a stochastic process defined by white noise.
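The three regimes can be expressed as a small decision rule. The following is a minimal sketch, assuming the conventional thresholds for the Hurst parameter H (above 0.5 persistent, below 0.5 anti-persistent, equal to 0.5 a random walk). The function name `classify_hurst` and the tolerance `tol` are hypothetical and are not taken from the article.

```python
def classify_hurst(h: float, tol: float = 0.0) -> str:
    """Label a Hurst parameter with one of the three regimes.

    `tol` is an assumed tolerance band around 0.5; with tol=0.0 only an
    exact value of 0.5 is labelled a random walk.
    """
    if not 0.0 <= h <= 1.0:
        raise ValueError("H is expected to lie between 0 and 1")
    if abs(h - 0.5) <= tol:
        return "random walk"
    return "persistent" if h > 0.5 else "anti-persistent"


if __name__ == "__main__":
    for value in (0.57, 0.48, 0.5):
        print(value, classify_hurst(value))
```

In practice an estimated H is rarely exactly 0.5, so a non-zero tolerance may be a more useful reading of the third case; the width of that band is a modelling choice, not something the text prescribes.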
1.4. Article Organization
2. Materials and Methods
- The corpus was gathered and prepared by the authors.
- The corpus contains:
  - 350 scientific texts.
  - 350 texts in the category of diffusion of science.
  - 350 randomly generated texts.
  - 350 newspaper articles about business.
  - 350 entertainment texts.
- Title, abstract, keywords, introduction and conclusions were the sections analyzed in scientific articles. We left out tables, equations and discipline-specific symbols.
- Scientific, diffusion, business and entertainment articles are all published texts, gathered randomly from the period between 2014 and 2018.
- The scientific articles were chosen from a list given by InCites Journal Citation Report, using the following categories: (1) computer science and its different branches; (2) electronics and its different branches.
- The sources of newspaper science articles are the New York Times, BBC news, CNN-tech, and ScienceDaily, under the tags technology and science.
- Business and entertainment articles were taken from MSN entertainment, Mirror, US weekly, and BBC news.
- All texts are written in English.
- The mathematical routines to read and process the texts are coded in Python.
- To quantify the Hurst parameter, texts are mapped into time series by assigning the corresponding ASCII code to each character appearing in the text. Figure 1 illustrates this mapping.
- The hurstExp function hurst_re, authored by Christopher Scholzel, was chosen to compute the Hurst parameter.
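The mapping and the estimation step can be illustrated with a short, dependency-free sketch. It is not the function cited above: it is a simplified rescaled-range (R/S) estimate written only to show the idea, and the names `text_to_series`, `rescaled_range` and `hurst_rs_sketch`, as well as the window sizes, are assumptions made for this example.

```python
import math


def text_to_series(text):
    """Map each character to its code point (the ASCII code for ASCII text)."""
    return [ord(ch) for ch in text]


def rescaled_range(chunk):
    """R/S of one window: range of cumulative deviations over the std. dev."""
    n = len(chunk)
    mean = sum(chunk) / n
    cum = lo = hi = sq = 0.0
    for x in chunk:
        d = x - mean
        cum += d                      # running sum of deviations from the mean
        lo, hi = min(lo, cum), max(hi, cum)
        sq += d * d
    std = math.sqrt(sq / n)
    return (hi - lo) / std if std > 0 else None  # skip constant windows


def hurst_rs_sketch(series, min_window=8):
    """Estimate H as the slope of log(mean R/S) against log(window size)."""
    n = len(series)
    xs, ys = [], []
    size = min_window
    while size <= n // 2:             # window sizes double: 8, 16, 32, ...
        values = []
        for start in range(0, n - size + 1, size):
            rs = rescaled_range(series[start:start + size])
            if rs is not None:
                values.append(rs)
        if values:
            xs.append(math.log(size))
            ys.append(math.log(sum(values) / len(values)))
        size *= 2
    if len(xs) < 2:
        raise ValueError("series too short for an R/S estimate")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope_num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope_den = sum((x - mx) ** 2 for x in xs)
    return slope_num / slope_den      # least-squares slope


if __name__ == "__main__":
    sample = "The signal sweeps upwards in frequency from 35 to 250 Hz. " * 20
    print(hurst_rs_sketch(text_to_series(sample)))
```

Simple R/S estimates of this kind are known to be biased for short windows, so values from this sketch should not be expected to match those produced by a dedicated library routine.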
3. Experimental Results
- Observation of gravitational waves from a binary black hole merger
- Gravitational-wave observatory
- observed
- gravitational-wave
- gravitational-wave
- merger
- black-holes
- black hole
- black hole
- black hole
- gravitational waves
- observations
- binary
- black hole
- gravitational waves
- binary black hole merger
- first quartile (Q1), corresponding to the 25th percentile.
- third quartile (Q3), corresponding to the 75th percentile.
- inter-quartile range (IQR): 25th to 75th percentile. That is to say, 50% of the data is contained in this range.
- median.
- minimum: Q1 − 1.5IQR.
- maximum: Q3 + 1.5IQR.
- Values below minimum and above maximum are outliers.
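The box-plot quantities listed above can be computed as follows. This is a minimal sketch using Python's standard `statistics` module; the function name `boxplot_summary` and the sample values are illustrative assumptions, not data from the corpus.

```python
import statistics


def boxplot_summary(values):
    """Return the quartiles, IQR, whisker limits and outliers of a sample."""
    # "inclusive" interpolates between data points, treating the sample
    # minimum and maximum as the 0th and 100th percentiles.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    minimum = q1 - 1.5 * iqr          # lower whisker limit
    maximum = q3 + 1.5 * iqr          # upper whisker limit
    outliers = [v for v in values if v < minimum or v > maximum]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "minimum": minimum,
        "maximum": maximum,
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up numbers, chosen only so that one value falls outside the whiskers.
    print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

Different quartile conventions (for example the "exclusive" method) give slightly different Q1 and Q3 on small samples, so the whisker limits depend on that choice.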
- 0.54 for business news.
- 0.49 for diffusion of science.
- 0.53 for entertainment news.
- 0.48 for random texts.
- 0.57 for scientific texts.
- for business news.
- for diffusion of science.
- for entertainment news.
- for random texts.
- for scientific texts.
4. Discussion
- Scientific texts represent persistent processes, where the trend of previous steps is likely to be kept, i.e., the text is self-similar and possesses long-term memory.
- Business articles are self-similar. However, the values of the Hurst parameter are lower than those of scientific texts.
- Diffusion of science reports and entertainment news represent both anti-persistent and persistent processes.
- Random texts represent processes with no dependencies, or a random walk, corresponding to stochastic processes defined by white noise.
4.1. What about This Article?
- The Hurst parameter is 0.5825. This value lies within the upper 75% of the calculated values. The present paper thus reflects a persistent process where linguistic patterns are formed along the body of the text, making it self-similar.
4.2. Comparison with Related Work
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Abbreviations
| Abbreviation | Meaning |
|---|---|
| H | Hurst parameter |
| D | Fractal dimension |
| R/S | Rescaled-range analysis |
References
- Lewis, M.L.; Frank, M.C. Linguistic structure emerges through the interaction of memory constraints and communicative pressures. Behav. Brain Sci. 2019, 39, 38–39.
- Zipf, G.K. Human Behaviour and the Principle of Least Effort; Addison-Wesley: Reading, MA, USA, 1949.
- Dunham, W.; Malone, S. Einstein’s Gravitational Waves Detected in Landmark Discovery. Available online: Reuters.com (accessed on 17 May 2017).
- Abbott, B.P. Observation of gravitational waves from a binary black hole merger. Phys. Rev. Lett. 2016, 116, 1–18.
- Yager, R.E. The importance of terminology in teaching K-12 science. J. Res. Sci. Teach. 1983, 20, 577–588.
- Gimenez, J.; Baldwin, M.; Breen, P.; Guitierrez, J.; Roque, E. Reproduced, reinterpreted, lost: Trajectories of scientific knowledge across contexts. Text Talk 2020, 40, 293–324.
- Alfonso, A.R.; Duque, C.G. Automated text clustering of newspaper and scientific texts in Brazilian Portuguese: Analysis and comparison of methods. J. Inf. Syst. Technol. Manag. 2014, 11, 415–435.
- Grabska-Gradzinska, I.; Klig, A.; Kwapien, J.; Drozdz, S. Complex network analysis of literary and scientific texts. Int. J. Mod. Phys. Comput. Phys. 2012, 23, 1250051–1250060.
- Osipov, G.S.; Devyatkin, D.A.; Kusnetzova, Y.M.; Shvets, A.V. The Possibilities for Intelligent Analysis of Scientific Texts by Construction of their Cognitive Models. Sci. Tech. Inf. Process. 2019, 46, 337–344.
- Klein, M.; Broadwell, P.; Farb, S.E.; Grappone, T. Comparing published scientific journal articles to their pre-print versions. Int. J. Digit. Libr. 2018, 4, 335–350.
- Balas, E.A. International Collaboration and Competition. In Innovative Research in Life Sciences: Pathways to Scientific Impact, Public Health Improvement, and Economic Progress; John Wiley & Sons, Inc.: London, UK, 2019; Chapter 22; pp. 365–380.
- Sanchez, A.; Carro, B. Internet Services: From Broadband to Ultrabroadband. In Digital Services in the 21st Century: A Strategic and Business Perspective; John Wiley & Sons, Inc.: London, UK, 2017; Chapter 2; pp. 9–30.
- Rees, D.; Laramee, R. A Survey of Information Visualization Books. Comput. Graph. 2019, 38, 610–646.
- Großer, B.; Baumol, U. Virtual teamwork in the context of technological and cultural transformation. Int. J. Inf. Syst. Proj. Manag. 2017, 5, 21–35.
- Sanog, P.; Zhang, C.; Xu, Y.; Xue, L.; Wang, K.; Zhang, C. Asymmetrical Interaction in Competitive Internet Technology Diffusion: Implications for the Competition Between Local and Multinational Online Vendors. In Global Diffusion and Adoption of Technologies for Knowledge and Information Sharing; Information Science Reference: Hershey, PA, USA, 2013; Chapter 10; pp. 221–240.
- PubPeer. About Pubpeer. Available online: https://pubpeer.com/static/about (accessed on 3 August 2020).
- Ward, P.; Graber, K.C.; van der Mars, H. Writing Quality Peer Reviews of Research Manuscripts. J. Teach. Phys. Educ. 2015, 34, 700–715.
- Kulczycki, E.; Rozkosz, E.A. Does an expert-based evaluation allow us to go beyond the Impact Factor? Experiences from building a ranking of national journals in Poland. Scientometrics 2017, 1, 417–442.
- Serenko, A.; Dohan, M. Comparing the expert survey and citation impact journal ranking methods: Example from the field of Artificial Intelligence. J. Inf. 2011, 5, 629–648.
- Atanossova, I.; Bertin, M.; Lariviere, V. On the composition of scientific abstracts. J. Doc. 2016, 72, 636–647.
- Hurst, H.E. Methods of using long-term storage in reservoirs. ICE Proc. 1956, 15, 519–543.
- Molino-Minero, E.; Garcia-Nocetti, F.; Benitez-Perez, H. Application of time-scale local Hurst exponent to time series. Digit. Signal Process. 2015, 37, 92–99.
- Kale, M.; Butar Butar, F. Fractal analysis of time series and distribution properties of Hurst exponent. J. Math. Sci. Math. Educ. 2011, 5, 8–19.
- Moreno, C.J.G. Using the Hurst exponent as a monitor and predictor of BWR reactor instabilities. Ann. Nucl. Energy 2010, 37, 432–442.
- Hurst, H.E. A suggested statistical model of some time series which occur in nature. Nature 1957, 180, 494.
- Jiang, C.; Lu, Z.; Zhou, J.; Memon, M.S. Evaluation of fractal dimension of soft terrain surface. J. Terramechanics 2017, 70, 27–34.
- Abboushi, B.; Elzeyadi, I.; Taylor, R.; Sereno, M. Fractals in architecture: The visual interest, preference, and mood response to projected fractal light patterns in interior spaces. J. Environ. Psychol. 2019, 61, 57–70.
- Popovic, N.; Radunovic, M.; Badnjar, J.; Popovic, T. Fractal dimension and lacunarity analysis of retinal microvascular morphology in hypertension and diabetes. Microvasc. Res. 2018, 118, 36–43.
- Ashkenazy, Y. The use of generalized information dimension in measuring fractal dimension of time series. Phys. Stat. Mech. Its Appl. 1999, 271, 427–447.
- Zhokh, A.; Trypolskyi, A.; Strizhak, P. Relationship between the anomalous diffusion and the fractal dimension of the environment. Chem. Phys. 2018, 503, 71–76.
- Batht, S.J.; Dedania, H.V.; Shah, V.R. Fractal dimensional analysis in financial time series. Int. J. Financ. Manag. 2015, 5, 46–52.
- Hollingsworth, A. Weather forecasting: Storm hunting with fractals. Nature 1986, 319, 11–12.
- Cajueiro, D.O.; Tabak, B.M. The rescaled variance statistic and the determination of the Hurst exponent. Math. Comput. Simul. 2005, 70, 172–179.
- Galarnyk, M. Understanding Box-Plots. Available online: https://towardsdatascience.com/understanding-boxplots-5e2df7bcbd51 (accessed on 3 August 2020).
- Lu, C.; Bu, Y.; Wnag, J.; Torvik, V.; Schanaars, M.; Zhang, C. Examining scientific writing styles from the perspective of linguistic complexity. J. Assoc. Inf. Sci. Technol. 2018, 70, 462–475.
- Liakata, M.; Saha, S.; Dobnik, S.; Batchelor, C.; Rebholz-Schuhmann, D. Automatic recognition of conceptualization zones in scientific articles and two life sciences applications. Bioinformatics 2012, 28, 991–1000.
- Leong, A.P.; Toh, A.L.L.; Chin, S.F. Examining Structure in Scientific Research Articles: A Study of Thematic Progression and Thematic Density. Writ. Commun. 2018, 35, 286–614.
- Ngai, S.B.C.; Singh, R.G.; Koon, A.C. A discourse analysis of the macro-structure, metadiscoursal and microdiscoursal features in the abstracts of research articles across multiple science disciplines. PLoS ONE 2018, 13, e0205417.
- Cao, M.; Sun, X.; Zhuge, H. The contribution of cause-effect link to representing the core of scientific paper-The role of Semantic Link Network. PLoS ONE 2018, 13, e0199303.
- Plaven-Sigray, P.; Matheson, G.J.; Schiffler, B.C.; Thompson, W.H. The readability of scientific texts is decreasing over time. eLife 2017, 6, 1–14.
- Gómez-Adorno, H.M.; Ríos, G.; Posadas-Durán, J.P.; Sidorov, G.; Sierra, G. Stylometry-based Approach for Detecting Writing Style Changes in Literary Texts. Comput. Sist. 2018, 22, 1–18.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
López-Ortega, O.; Pérez-Cortés, O.; Castillejos-Fernández, H.; Castro-Espinoza, F.-A.; González-Mendoza, M. Written Documents Analyzed as Nature-Inspired Processes: Persistence, Anti-Persistence, and Random Walks—We Remember, as Along Came Writing—T. Holopainen. Appl. Sci. 2020, 10, 6354. https://doi.org/10.3390/app10186354