Stochastic Diffusion Model for Analysis of Dynamics and Forecasting Events in News Feeds
Abstract
1. Introduction
2. Review of Research on Forecasting Events Based on Text Analysis
3. Materials and Methods
4. Results
4.1. Deriving the Distribution Function for the Time Series Parameters, Which Describe Dynamics of the News Feed Content
4.1.1. Plotting of Difference Schemes of Probabilities of State Transitions in Information Space. Deriving the Main Equation of the Model
- is the probability that the system is in state (x − ε);
- is the probability that it is in state x;
- is the probability that it is in state (x + ξ).
4.1.2. Formulating and Solving a Boundary Value Problem When Predicting News Events in the Information Space for Systems with Memory Implementation and Self-Organization
4.2. Experimental Testing of the Suggested Model for Forecasting News Feed Events
4.2.1. Definition of the Parameters of the Event Forecasting Model Based on Changes in the Cluster Structure in the Information Space of News Feeds
4.2.2. Evaluation of the Value of Cosine Measure of the Event Occurrence Threshold in the Information Space of News Feeds
4.2.3. Modelling of the Predicted Event Occurrence Probability Dependence on Time. Analysis of Modelling Results
4.2.4. Assessment of the Accuracy and Reliability of Forecasts of the Implementation of Events in the News Feed, Obtained on the Basis of the Developed Model of the Dynamics of the News Feeds Content
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Cheap | Buy | Book | Case | Free | Delivery | Discount |
---|---|---|---|---|---|---|
0 | 1 | 1 | 1 | 0 | 0 | 1 |
1 | 1 | 1 | 1 | 1 | 1 | 0 |
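The two rows above can be read as binary term-occurrence vectors for two short texts over the seven listed terms. As a minimal sketch of the cosine measure named in Section 4.2.2, the Python below compares those two rows. The names `text_a`, `text_b`, and `cosine_measure` are illustrative assumptions, as is the reading of each row as one text (the row labels did not survive); this is not the authors' implementation.

```python
import math

# Terms taken from the table header; the row labels are not given in the
# source, so "text_a" and "text_b" are hypothetical names for the two rows.
TERMS = ["Cheap", "Buy", "Book", "Case", "Free", "Delivery", "Discount"]
text_a = [0, 1, 1, 1, 0, 0, 1]
text_b = [1, 1, 1, 1, 1, 1, 0]


def cosine_measure(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: an empty text matches nothing.
        return 0.0
    return dot / (norm_u * norm_v)


similarity = cosine_measure(text_a, text_b)
print(f"cosine measure = {similarity:.4f}")  # 3 / (2 * sqrt(6)) ≈ 0.6124
```

The two texts share three terms (Buy, Book, Case), so the dot product is 3, and the vector lengths are √4 and √6. A threshold on this value is one way to decide whether two texts belong to the same cluster.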
No. | Normalized News Text | Date of Event | Value of Parameter ε | Value of Parameter ξ | Initial State of System x0 (31 December 2016) |
---|---|---|---|---|---|
1. | {“id”:”9dc7c737-0359-418f-a809-28a4aa23b3bb”,”date”:1490774096000,”title”:”The head of the Ministry of Internal Affairs was killed after he identified theft for 10 billion”,”content”:”couple a week attempt on the life of Nikolai Volk write a statement dismissal own desire to refuse to sign inventory internal financial report information life killed the day before head of the Ministry of Internal Affairs of the Ministry of Internal Affairs Nikolay Volkov complained native department to steal an asset billion ruble force to sign blank document testimony native witness to check investigator IC Russia direct killer to look for authorised operative central directorate Criminal Investigation Department Ministry of Internal Affairs of Russia source editorial office report wolf to identify multi-billion dollar embezzlement assign a lot of internal check number of inventory suspicion to be confirmed establish an investigation person demand a high-ranking police officer signing an act verification of the Ministry of Internal Affairs RID allegedly no financial hole theft of the Ministry of Internal Affairs RID to be able to pay in time contractor owes many organizations previously the Ministry of Internal Affairs initiate a case the fact of fraud against the organization mariotrek responsible construction sanatorium ministry Olympics Sochi FSUE RID Ministry of Internal Affairs speak Customer service we are talking about fraud million rubles identification of the fact of involvement of the employee of the Ministry of Internal Affairs RID fraud the case is transferred to the Investigative Committee is known at the moment the Ministry of Internal Affairs should remain a Sochi builder at least one million rubles of the Ministry of Internal Affairs RID to be the defendant arbitration case lawsuit lawsuit Stroy Universal LLC debt million rubles Organization LLC Enterprise RTSPP RID owes a million to the Ministry of Internal Affairs Russia comment this situation refuse to remind the killer to pursue the goal of robbing the wolves take the portfolio money leave the place expensive phone cash money the killer is hiding car VAZ forget the place medical mask IC consider contract murder priority version death head of the Ministry of Internal Affairs RID ““,”url”:” https://life.ru/991216 “,”siteType”:”LIFE”} | 29 March 2017 (implementation term is 88 days) | 0.016 | 0.016 | 0.046 |
2. | {“id”:”3845f74e-c144-4ec3-9b8f-333e8e08b8ad”,”date”:1490776169000,”title”:”Tajikistan becomes the main foreign supplier of suicide bombers for ISIL “,”content”:”conclusion come author study war by suicide statistical analysis industry martyrdom Islamic state yoke publish international center fight terrorism Hague Netherlands period December year November year only suicide bomber yoke to control to load explosives Inghimashi machine fighter belt suicide bomber fight conventional weapons need to be blown up nearby enemy prima life live bomb house indicate foreign fighter mark author research general difficulty foreigner die quality suicide bomber to consider fifteen year mention Kuni accept Islamic tradition nickname associated place of origin prima life Al Muhajir similarly Al Ansari indicate foreigner indicate country of origin stay die quality drive car explosives originate country Tajikistan then go native Saudi Arabia Morocco Tunisia Russia further give the table indicate the exact figure suicide bomber yoke Tajikistan Saudi Arabia Morocco Tunisia Russia strange year numerous to immigrate the Salafi Tunisia to be a large foreign legion yoke to number about a thousand fighter go close thousand a native of the Wahhabi Kingdom of As Saud native to found follow the immigrant Jordan to rule the royal dynasty belong to the Hashemite clan to originate great-grandfather Prophet Muhammad it is possible therefore the list of the suicide bomber indicate the period only and Jordan Moroccan twelve month to go talk significantly Tajik perish Syria Iraq stroke attack to load explosives Ingimasi machine native foreign country celebrate representative International Center fighting terrorism number amazing consider soul population quantity of natives various country number of yoke prima life assume Tajik frequently direct to suicidal explosion minimum partly nationality Organization to prohibit Russia Supreme Court of the Russian Federation”,”url”:” https://life.ru/991022 “,”siteType”:”LIFE”} | 29 March 2017 (implementation term is 88 days) | 0.021 | 0.021 | 0.083 |
3. | {“id”:”5fbf3918-22cc-4ef3-8ad0-20ae2654286c”,”date”:1491441192000,”title”:”In the area of the attack on the employees of the Russian Guard in Astrakhan a firefight is going on “,”content”:”inform life source law enforcement agency Leninsky district Astrakhan to start a firefight crime figure presumably a few hours earlier to attack a Rosguard officer preliminary data special operation pass the area railway station Astrakhan specify the source remind today night three Rosguards get a gunshot wound attack several criminal declare the regional directorate of the ID of RF attack fighter Rosguard involved crime figure April kill police officer Astrakhan”,”url”:” https://life.ru/994664 “,”siteType”:”LIFE”} | 6 April 2017 (implementation term is 96 days) | 0.016 | 0.016 | 0.047 |
4. | {“id”:”c7584973-348d-417a-90c3-2199a4040558”,”date”:1491047117000,”title”:”NATO Does Not Intend to Fight with Russia for Abkhazia and South Ossetia”,”content”: “representative NATO South Caucasus William Lahue declare treaty organization fight Russia Abkhazia South Ossetia case joining Georgia North Atlantic Alliance Georgia must decide status territory clearly understand so far stay Russian army the fifth article Georgia use nobody want war Lahue report member alliance agree Georgia member NATO none term possible joining Georgia alliance call report Interfax slowly matter go forward future Georgia receive invitation know Lahue speech joining Georgia NATO depend parallel factor politics various country willingness Georgia”,”url”:” http://www.vesti.ru/doc.html?id=2872818 “,”siteType”:”VESTI”} | 1 April 2017 (implementation term is 91 days) | 0.011 | 0.011 | 0.036 |
5. | {“id”:”dacb1299-f6fa-4b25-a4cd-95795657cf4c”,”date”:1490474466000,”title”:”Syrian military liberated 195 settlements from IS * since January “,”content”:”number of settlement liberate January Syrian government army terrorist organization Islamic State yoke January reach report Saturday Russian center reconciliation feuding party Syria number of settlement liberate January year Syrian government troops armed formation international terrorist organization Islamic State increase be said bulletin publish web-site Ministry of Defense of the Russian Federation 24 h control government troops cross a square kilometer territory total difficulty liberate a square kilometer number of settlement join reconciliation process 24 h change message center reconciliation continue negotiations accession regime cessation of hostilities detachment armed opposition Aleppo province Damascus Ham Homs El Quneitr number of armed groups declare a cessation of hostilities compliance agreement armistice change terrorist organization forbid Russia”,”url”:” https://ria.ru/syria/20170325/1490808936.html “,”siteType”:”RIA”} | 25 March 2017 (implementation term is 84 days) | 0.016 | 0.016 | 0.060 |
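The "implementation term" in each row matches the number of days between the reference date in the last column header (31 December 2016) and the calendar date encoded in the record's `date` field, which appears to be a Unix timestamp in milliseconds. The sketch below reproduces the five terms under that reading. Interpreting the timestamps in UTC is an assumption, and `implementation_term_days` is a hypothetical helper name, not something defined in the paper.

```python
from datetime import date, datetime, timezone

# Reference date of the initial state x0, as given in the table header.
BASE_DATE = date(2016, 12, 31)

# "date" fields of the five news records in the table (epoch milliseconds).
NEWS_DATES_MS = {
    1: 1490774096000,
    2: 1490776169000,
    3: 1491441192000,
    4: 1491047117000,
    5: 1490474466000,
}


def implementation_term_days(epoch_ms, base=BASE_DATE):
    """Days from `base` to the calendar day of the event.

    Assumption of this sketch: the timestamp is read in UTC.
    """
    event_day = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).date()
    return (event_day - base).days


terms = {n: implementation_term_days(ms) for n, ms in NEWS_DATES_MS.items()}
for number, days in terms.items():
    print(f"news {number}: implementation term is {days} days")
```

For these five records the result agrees with the terms listed in the table (88, 88, 96, 91, and 84 days).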
Normalized News Text | Date of Event | Value of Parameter ε | Value of Parameter ξ | Initial State of System x0 (31 December 2016) |
---|---|---|---|---|
{“id”:”85e74845-70da-434c-a602-497efa002de6”,”date”:1514753700000,”title”:” Roly-Poly Bun”,”content”:”grandmother of the gate speak a handful of two door grandfather the road to live to knead fry roll a winglet swept kneaded towards the window song all the more go to roll the floor half put a chimney sweeper sweep a threshold jump yarned scraped on to concoct sing chill chilled eat through take a distant yard porch bench butter scrape window lie scrape mudroom sour cream old man take a distant yard porch bench butter scrape up a window lie scrape mudroom sour cream old man hare box flour cornbin leave hare box flour cornbin leave old woman old woman old woman old woman Bun Bun Bun Bun Bun Bun bun”,”url”:”http://null.ru/null”,”siteType”:”Fictitious”} | implementation time is not known | 0.0022 | 0.0022 | 0.0076 |
News Number | Accuracy ϒ, % | Squared Deviation from Mean |
---|---|---|
1. | 79.5 | 16.0 |
2. | 74.0 | 2.3 |
3. | 79.2 | 13.7 |
4. | 73.0 | 6.3 |
5. | 72.0 | 12.3 |
Average value | 75.5 | Standard deviation: ±3.2 |
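The figures in this table are consistent with a simple reading: the last column holds the squared deviation of each accuracy from the rounded mean of 75.5, and ±3.2 is the population standard deviation of the five accuracies. The sketch below checks that arithmetic. This interpretation is inferred from the numbers, not stated in the surviving text, and the variable names are illustrative.

```python
import math

# Accuracy values (percent) for news items 1-5, copied from the table.
accuracies = [79.5, 74.0, 79.2, 73.0, 72.0]
# Third-column values from the table, used only for comparison.
table_squared_deviations = [16.0, 2.3, 13.7, 6.3, 12.3]

mean_accuracy = sum(accuracies) / len(accuracies)
rounded_mean = round(mean_accuracy, 1)

# Squared deviation of each accuracy from the rounded mean.
squared_deviations = [(a - rounded_mean) ** 2 for a in accuracies]

# Population standard deviation (divide by n, not n - 1).
std_dev = math.sqrt(sum(squared_deviations) / len(accuracies))

print(f"mean accuracy      = {rounded_mean}")
print(f"standard deviation = {std_dev:.1f}")
```

Dividing by n rather than n − 1 is what reproduces the ±3.2 in the table; the sample estimate with n − 1 would be slightly larger.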
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Zhukov, D.; Andrianova, E.; Trifonova, O. Stochastic Diffusion Model for Analysis of Dynamics and Forecasting Events in News Feeds. Symmetry 2021, 13, 257. https://doi.org/10.3390/sym13020257