Article

Description of the Distribution Law and Non-Linear Dynamics of Growth of Comments Number in News and Blogs Based on the Fokker-Planck Equation

1
Institute of Cybersecurity and Digital Technologies, MIREA-Russian Technological University, 78 Vernadsky Avenue, 119454 Moscow, Russia
2
Institute of Radio Electronics and Computer Science, MIREA-Russian Technological University, 78 Vernadsky Avenue, 119454 Moscow, Russia
*
Author to whom correspondence should be addressed.
Mathematics 2022, 10(6), 989; https://doi.org/10.3390/math10060989
Submission received: 12 January 2022 / Revised: 14 March 2022 / Accepted: 16 March 2022 / Published: 19 March 2022
(This article belongs to the Special Issue Applied and Computational Mathematics for Digital Environments)

Abstract

The article considers stationary and dynamic distributions of news by the number of comments. Processing of the observed data showed that the static distribution of news by the number of comments on that news obeys a power law, while the dynamic distribution (the change in the number of comments over time) has an S-shaped character in some cases and a more complex two-stage character in others. This depends on the time interval between the appearance of a first-level comment and a comment attached to that comment. The power law for the stationary probability density of the news distribution by the number of comments can be obtained from the solution of the stationary Fokker-Planck equation if a number of assumptions are made in its derivation. In particular, we assume that the drift coefficient μ(x), which is responsible in the Fokker-Planck equation for a purposeful change in the state of the system x (x is the current number of comments on a piece of news), depends linearly on the state x, and that the diffusion coefficient D(x), which is responsible for a random change, depends quadratically on x. Solving the non-stationary Fokker-Planck differential equation under these assumptions made it possible to obtain an analytical expression for the probability density of transitions between the states of the system per unit of time, which is in good agreement with the observed data when the delay time between the appearance of a first-level comment and the comment on that comment is taken into account.

1. Introduction

The description of social network behavior and information resources is one of the most important areas of mathematical sociology. From a practical point of view, the development of models describing user opinion dynamics and preferences contributes to the development of systems for automated monitoring of the public mood and its changes. Compared to traditional methods of studying public opinion, the advantage of such systems is that of automated information processing. Social surveys require the development of questionnaires and sampling, which is complicated by the necessity to cover all strata of society. In addition, respondents tend to provide socially desirable responses.
Another advantage of automated information processing for social networks and comments to newsfeed is that it identifies straightforward comments related to a socially significant topic and to highly-publicized news. Therefore, the development of automated information processing tools provides feedback between society and government bodies, starting from the municipal level and ending at the level of state authorities.
The development of automated tools assumes that they rest on algorithms derived from validated mathematical models. In addition, it is of the utmost importance not only to monitor and analyze the processes under study but also to predict their evolution, which is necessary to ensure sustainable social development.
The dynamics of the changes in opinions and moods of Internet users can be largely attributed to stochastic processes, but with the possibility of targeted impact. On the one hand, the human factor (many people with different opinions, preferences, and behavior patterns) creates random changes (due to the wide variety of behavioral models of users). On the other hand, elements of opinion consistency are introduced into the dynamics of changes. A detailed description of the use of stochastic methods for modeling the dynamics of social processes can be found in [1].
In this regard, we consider models based on the Fokker-Planck equation, which take into account both ordered and random changes, to be the most promising for describing the dynamics of changes in public mood.
The Fokker-Planck equation is widely used for analyzing and modeling the behavior of time series when describing processes in complex systems [2,3,4,5], for example, when analyzing the dynamics of the non-stationary time series of stock and commodity indices. To predict changes, the distribution functions of the series levels are constructed from the Fokker-Planck equation and sample data as a sum of polynomials, in which the drift and diffusion coefficients may depend on the level of the series according to various empirically chosen laws.
It should be noted that, apart from the Fokker-Planck equation, other approaches are used for modeling based on differential equations, for example, the Liouville equations [5,6], the diffusion equations [4,7] and many others. A detailed review of modeling social processes is presented in [8].
The Fokker-Planck equation is a second-order partial differential equation that contains not only a term responsible for stochastic changes (“diffusion”), but also a term responsible for opinion consistency (“drift”). From the Fokker-Planck equation, it is possible to obtain the probability density function of transitions per unit of time between the states of a system. A system can be defined as a blog or newsfeed that users comment on, and its state is the number of comments observed at a given time.
In addition to describing dynamic processes, stationary solutions can be obtained from the Fokker-Planck equation, which can describe the state of a system in a stationary state, when, for example, its evolution has already ended, and changes do not occur. One example of such stationarity may be the final static distribution of newsfeed or blogs by the number of comments on them.
The study of processes occurring in complex systems involving the human factor shows that the observed characteristics of these processes very often follow a power-law distribution. If we represent the interconnections of the elements forming a complex system as a graph, it turns out that the resulting networks (social, communication, Internet link, citation networks, and others) are well described by scale-free (scale-invariant) models, in which the degrees of vertices (nodes) are distributed according to the power law p(x) ~ x^γ (where γ is the characteristic exponent) [9,10,11,12,13,14]. Scale-free networks are self-similar, i.e., in any part of the network, the distribution of degrees will be the same.
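As an illustration of how a power-law exponent might be estimated from observed counts, the following minimal sketch applies the standard maximum-likelihood estimator for continuous data. It assumes a density of the form p(x) ∝ x^(−α) with α > 1 for x ≥ x_min; the symbol α, the cutoff x_min, the function names, and the synthetic data are assumptions of this sketch and are not taken from the article.

```python
import math
import random


def fit_power_law_exponent(samples, x_min):
    """Maximum-likelihood estimate of alpha for p(x) ~ x**(-alpha), x >= x_min.

    Uses the continuous-data estimator alpha = 1 + n / sum(ln(x_i / x_min)).
    """
    tail = [x for x in samples if x >= x_min]
    if not tail:
        raise ValueError("no samples at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all samples equal x_min; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha, x_min, n, rng):
    """Draw n synthetic values from a continuous power law by inverse transform."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


# Synthetic check: generate data with a known exponent and recover it.
rng = random.Random(0)
synthetic = sample_power_law(alpha=2.5, x_min=1.0, n=20000, rng=rng)
alpha_hat = fit_power_law_exponent(synthetic, x_min=1.0)
```

For real comment counts, which are integers, a discrete estimator and a goodness-of-fit test would be more appropriate; the continuous form is used here only for brevity.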
The power-law model is widely used in the analysis of processes in complex social systems, but the theoretical justification for its application requires further study. In our opinion, this justification is crucial. Identifying the nature of the processes from which the power law arises is necessary for a deeper study of the behavior of complex social systems.
In addition, we are not aware of attempts to apply a theoretical description of processes in social networks and network mass media based on the Fokker-Planck equation from the standpoint of formulating and solving boundary value problems based on it.
The purpose of our work is to investigate whether the power law for the distribution of process parameters, which is often observed in practice in complex social systems, can be obtained from the Fokker-Planck equation, and to show that under certain assumptions this equation can be used to describe both static and dynamic characteristics.
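To indicate why a power law is a plausible stationary solution, consider the following short derivation. It is a sketch under stated assumptions: the one-dimensional Fokker-Planck equation is written with a factor 1/2 in front of the diffusion term, the drift is μ(x) = a x, the diffusion coefficient is D(x) = b x², and the stationary probability flux is taken to be zero. The constants a and b, the factor 1/2, and the zero-flux condition are assumptions of this illustration and may differ from the conventions used elsewhere in the article.

```latex
\frac{\partial p(x,t)}{\partial t}
  = -\frac{\partial}{\partial x}\bigl[\mu(x)\,p(x,t)\bigr]
  + \frac{1}{2}\frac{\partial^{2}}{\partial x^{2}}\bigl[D(x)\,p(x,t)\bigr],
\qquad \mu(x) = a x,\quad D(x) = b x^{2}.

% Stationary state with zero probability flux:
a x\,p(x) = \frac{1}{2}\frac{d}{dx}\bigl[b x^{2} p(x)\bigr]
\;\Longrightarrow\;
\frac{d}{dx}\ln\bigl[x^{2}p(x)\bigr] = \frac{2a}{b}\,\frac{1}{x}
\;\Longrightarrow\;
p(x) \propto x^{\,2a/b - 2}.
```

Under these assumptions the stationary density is a pure power law, and it is normalizable on x ≥ x_min > 0 only if 2a/b − 2 < −1, that is, if a < b/2.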

2. Research Methods

As described in the Introduction, our article addresses the following issues. First, we collected statistics on the dynamics of the number of comments on news items on the news portal of the Russian radio station «Echo of Moscow» https://echo.msk.ru/ (accessed on 13 September 2021) (one of the leading Russian commercial radio stations and newsfeeds). Then, we describe the processing of the collected data and the results obtained (in particular, in a stationary state, a power law of the news distribution by the number of observed comments). The observed dynamics of change in the number of comments (on news feeds and blogs) is described by either two-stage or S-shaped curves. Further, using the stationary Fokker-Planck equation and a number of assumptions about how the coefficients describing random and purposeful changes in the state of the system x (x is the current number of comments on the news) depend on the magnitude of this state, we derive from the Fokker-Planck equation the power law of the distribution of news by the number of comments. Then, based on the Fokker-Planck equation, we construct a dynamic model of changes in the state of the systems under consideration over time. The analysis of the models showed good agreement with the observed characteristics of the processes. This suggests that the models we have developed can be used not only to analyze social processes, but also to predict their evolution, which is very important for managing the stable development of social relations. In conclusion, we discuss the possible application of our models in practice and the creation of algorithms for automated systems for monitoring public opinion.

3. A Brief Overview of Existing Studies of the Structure of Complex Social Systems and the Processes Observed

One of the directions in the study of complex networks is the study of their structure, based on the possibility of representing processes at the graph level by aggregating a set of embeddings computed at the level of individual nodes (deriving the properties of the whole from the properties of its parts). Aggregation is crucial, since it should, in principle, provide an isomorphism-invariant representation of the graph, i.e., the representation of the graph should be a function of the nodes of the graph, considered as a set.
In [15], a DeepSets aggregation operator based on self-organizing maps (SOM) is considered. Using SOM allows the calculation of node representations that include information about node similarity. Experimental results on real data sets show that the proposed approach provides improved predictive performance compared to the generally accepted sum aggregation and to many modern graph neural network architectures in the literature.
Since the search for similarities between nodes becomes a time-consuming process as the network grows, the researchers in [16] use swarm algorithms to optimize the solution of link prediction and community detection problems. In that article, swarm-based optimization methods used in social network analysis are compared with community analysis and link analysis based on traditionally used approaches.
In [17], the authors consider a mathematical model of mixed membership in user groups that are formed stochastically. The authors base this approach on pairwise measurements, which indicate the presence or absence of a connection between a pair of nodes. When analyzing probabilistic relations between pairs of objects, it is usually necessary to introduce assumptions, for example, independence, or assumptions about the inconsistency of this connection (mixed membership in stochastically forming groups). Under certain assumptions, the proposed model allows the tracking of dynamic changes in the number of nodes in the forming groups and the clustering of nodes by group.
In the model of the development of choice and influence in social networks presented in [18], both the number of nodes and the network topology (structure of connections) are dynamic. A significant disadvantage of this model is that it explicitly considers the connections between all pairs of nodes. This leads to quadratic complexity in calculating the change in the number of participants in various social groups and to a significant increase in calculation time. It is worth noting that real social networks and systems are sparse. This means that most pairs of participants are not connected, and the number of connections of each participant is itself random. Introducing the concept of sparsity into the model [18], as well as taking into account the random nature of the number of connections for each node (user) of the network, can significantly increase the speed and efficiency of using this model.
The authors in [19] use a technique for analyzing the structure of a social network graph that develops dynamically and is therefore multimodal. Applying this approach to real graph structures shows that there is temporal regularity in people’s online social interactions. Moreover, correlations are found between the formation of friendships between participants and the settings of the interactive social network. Separately, it is worth noting that physical contacts between people can be considered as an interactive, dynamically changing network.
In article [20], the authors describe methods of structuring and influencing the dissemination of information in mobile social networks. In these networks, a group of users is typically treated as an entity within which individuals can exchange messages. The authors also note that there are various models for analyzing the dissemination of information in mobile social networks, but none of the existing methods considers information dissemination within a group. Therefore, the authors took the SIR model, which is used to describe the spread of viruses in computer networks, and applied it to the dynamics of the information dissemination process in groups. Simulations using the Monte Carlo method showed that group propagation increases the overall speed of information propagation in the network. In addition, the authors note that a few groups with a significant number of participants disseminate information more effectively than a huge number of groups with a small number of participants. Their analysis of the impact of network structure on the dissemination of information shows no differences between Erdős–Rényi and Barabási–Albert networks. Ref. [20] analyses the stochastic model of opinion dynamics in social networks. This model is based on a multi-agent approach, in which the opinion of each network member is randomly influenced by the actions of others (its neighboring nodes). It is shown by example that, since the number of users (nodes) in the network is finite, the model asymptotically reaches consensus. The consensus value usually corresponds to one of the absorbing states of the Markov system. However, when the number of nodes is large, some metastable transient states are observed. The duration of these transient states may be arbitrarily long, and the states can be characterized using the mean-field approximation for the Markov system. Ultimately, the authors propose a model by which opinion control in the social network is possible.
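For orientation, the classic SIR dynamics mentioned above can be sketched as follows. This is the textbook well-mixed SIR model integrated with the forward Euler method, not the group-based variant studied in [20]; the function name, the parameter values, and the reading of "infected" as "currently spreading the information" are assumptions of this illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt, steps):
    """Integrate the SIR equations with the forward Euler method.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt   # users who start spreading
        new_recoveries = gamma * i * dt          # users who stop spreading
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical parameters: 1000 users, 10 initial spreaders.
history = simulate_sir(beta=0.5, gamma=0.1, s0=990, i0=10, r0=0, dt=0.1, steps=1000)
```

The total population is conserved at every step, which is a quick sanity check on any modification of the update rules.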
We can consider several statistical studies [21,22,23] that have widely used the method of studying profiles in social networks. The purpose of these studies is to identify the social mobility of people based on their publications accompanied by geodata. The authors found a large number of such publications, and based on these an approximate map of the user’s movements was compiled, the main centers of activity were identified, and the person’s place of residence was established. According to the data on the place of residence, the people’s names were found. Further, using a database of names distributed by gender, it was possible to determine the gender of more than half of all the accounts studied; according to the surname data, the researchers tried to establish information about the race and age of users, successfully in 38% and 14% of cases, respectively. These studies have shown that it is possible to establish some demographic characteristics, knowing only about the movements of a person or knowing his first and last name.
Using the comparison of time slices, it is possible to determine dynamically changing temporary communities of users of social network structures. The study of these dynamic communities makes it possible to significantly simplify the analysis of the dynamics of a complex system of social interactions as it evolves over time.
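One simple way to compare time slices is to match communities in consecutive snapshots by the overlap of their members. The sketch below uses the Jaccard index for this purpose; the choice of the Jaccard index, the threshold of 0.5, the function names, and the toy data are assumptions of this illustration rather than the method of any cited study.

```python
def jaccard(a, b):
    """Jaccard index of two member collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_communities(previous, current, threshold=0.5):
    """Pair each community in `current` with its most similar one in `previous`.

    Returns a dict mapping each current community name to the best-matching
    previous name, or to None when no overlap reaches the threshold.
    """
    matches = {}
    for name, members in current.items():
        best_name, best_score = None, 0.0
        for prev_name, prev_members in previous.items():
            score = jaccard(members, prev_members)
            if score > best_score:
                best_name, best_score = prev_name, score
        matches[name] = best_name if best_score >= threshold else None
    return matches


# Two hypothetical time slices of community membership (user ids).
slice_1 = {"A": {1, 2, 3, 4}, "B": {5, 6, 7}}
slice_2 = {"X": {1, 2, 3, 9}, "Y": {10, 11}}
matches = match_communities(slice_1, slice_2)
```

Here community X is treated as the continuation of A, while Y is treated as newly formed.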
Consider [24], which presents the fundamental structures of dynamic social networks based on a high-resolution dataset describing a tightly connected population of 1000 first-year students at a large European university. The authors of this article consider the physically short interactions that they measured using Bluetooth, supplemented with information received from telecommunications networks (information about calls and messages), social networks and the demographic and geolocation data of users.
Human social communities by their nature overlap due to individuals participating in several different groups (in the theory of complex networks, such nodes are called jumpers). During the week, meetings of the subjects of the created compact structure take place, either a meeting of friends outside the university, or of all students (such structures are called cores). In a network of short physical interactions, all participants are present at the same time and are in physical contact.
The location of the core members can also be forecast, and the cores themselves help to do this. By observing the usual routes of the people who make up a core and their behavioral habits, it is possible to predict the geographical location of a person in the next time interval with high accuracy (on average in 93% of cases); such high accuracy shows that human mobility patterns are regular. It is also worth noting that the members of a core have fewer location states than individuals, which leads to lower values of information entropy on average.
Conducting geospatial studies for part of a social group within limited time frames reveals the complex interaction between time, place and social context. It also supports the hypothesis that often, when people are most unpredictable in the geospatial domain, they exhibit some predictable social behavior. Linking the results of this article with the literature on dynamic community detection, it can be noted that many methods in the literature would allow the detection of gatherings in everyday life, but here the authors used a simple comparison of graph components to emphasize that the emerging social structures are natural and that complex methods are not needed to detect them.
In fact, Ref. [24] provides a quantitative assessment of long-term patterns encoded in the microdynamics of a huge system of interacting nodes, which are characterized by predictability and a high degree of order.
Let us consider another paper on dynamic models [25]. Recent developments in the field of social networks have shifted the focus from static representations to dynamic ones, requiring new methods of analysis and modeling. Observations in real social systems have revealed two main facts that play a very important role in the evolution of networks and affect the current processes of distribution: the strategies that individuals adopt when choosing between new or old social systems, connections, and the turbulent nature of social activity that sets the pace of these choices. The results are verified using numerical simulation and compared with two observable data sets.
In [26], public opinion is assessed and the mood of users is identified using a method based on vocabulary and semantics, inherited from the classical approach to the analysis of public sentiment. Neural networks are used for this method. The task of the neural network is to determine important keywords, which are then checked by experts in the subject area. Formally, the program first analyzes articles and determines how often different words are found in them. Next, the program identifies the most commonly used words and expressions and marks them as significant. Then, on their basis, the program builds a lexicon that characterizes the public mood based on the news articles supplied.
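The frequency-counting step described above can be sketched in a few lines. This illustration covers only the word counting and the selection of the most frequent terms; it does not reproduce the neural network or the expert review used in [26]. The function name, the tokenization rule, the stop-word list, and the sample headlines are all assumptions of this sketch.

```python
import re
from collections import Counter


def top_terms(articles, k, stop_words=frozenset()):
    """Return the k most frequent words across articles, with their counts."""
    counts = Counter()
    for text in articles:
        # \w+ matches runs of letters and digits, including non-Latin scripts.
        tokens = re.findall(r"\w+", text.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts.most_common(k)


# Hypothetical headlines standing in for news articles.
articles = [
    "Taxes rise again as the budget debate continues",
    "Budget debate stalls and taxes stay in focus",
    "Voters react to the budget",
]
lexicon = top_terms(articles, k=2, stop_words={"the", "as", "and", "to", "in"})
```

In practice the candidate terms would still need lemmatization and manual review before being treated as a sentiment lexicon.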
In [27], the authors describe an algorithm for analyzing particular topics in a social network. In addition to collecting information, it includes methods for processing and sorting information. The time elapsed between publications is also measured, so that the order of publications can subsequently be restored and a time scale obtained from these data. The result is a graph that can be used to track the growth and decline in popularity of certain topics discussed on social networks. It is also possible to trace which moods accompany which events in society and to determine the period of active discussion for certain topics.
Article [28] describes a method of studying political sentiments in society based on the analysis of a social network. The method searches the text for special words that have previously been entered in the program database. The main task of this system is to track which political parties citizens prefer and which are less significant to them. The system also monitors which topics are most resonant and most discussed in society. Additionally, with the help of the program, it is possible to find out what percentage of people support a certain political party.
The subject of [29] is that of microblogs. The authors of this study used the method of keyword analysis. With the help of such analysis and machine learning, they managed to divide the initial sample into six age groups and identify the topics that participants in each age group most often discuss and on which they most often express their thoughts. Teenagers under 18 most often discuss sports; young people aged 18–25 most often talk about entertainment; people aged 25 to 30 mainly discuss family and business, older people (31–36 years old) are most interested in technology, users aged 26–40 begin to worry about their health and speak about this more often, and those over 40 like to discuss politics. Thus, the most frequent topic for discussion was determined for each age group; this does not mean that each member of this group necessarily discusses this topic, but it is more likely that the person discussing this topic belongs to this age group.
The authors of [30,31,32,33,34] proposed a method that evaluates the mass media according to several criteria (topic, evaluation criteria/properties, classes) by combining topic modeling of context with multi-criteria decision-making. This evaluation system is based on corpus analysis as follows: the conditional distribution of media probabilities by topic, detail and class is calculated after the topic model of the corpora has been built. Several approaches, including manual labeling, a multi-corpus approach and an automatic approach, are used to obtain coefficients that show how each topic relates to each evaluation criterion and to each class described in the document. The multi-corpus approach proposed in the study involves assessing the thematic asymmetry of text corpora to obtain coefficients describing the relationship of each topic to a certain criterion. These coefficients, in combination with the topic model, can be used to evaluate each document in the corpora according to each of the criteria and classes considered. This method was applied to a corpus of 804,829 news publications from 40 Kazakh sources, published from 1 January 2018 to 31 December 2019 (a period of 2 years), to classify negative information on socially significant topics. The study produced a BigARTM model (200 topics) and applied this model, including completion of the analytic hierarchy process (AHP) table and all necessary high-level labeling procedures. The experiments carried out confirm the general possibility of evaluating media using the topic model of text corpora, since the classification achieved an area under the receiver operating characteristic curve (ROC AUC) of 0.81, which is comparable to the results obtained for the same task using the BERT model.
The developed system, in which the proposed model was integrated, allows the solution of classic problems, such as simple reports or sentiment analysis. Moreover, it has a number of unique possibilities for use. It provides options such as automatically analyzing a specific topic, event, or object without having to create a keyword-based query. The analysis is based on an arbitrary list of criteria and not limited to sentiment alone. This list includes social significance, popularity, manipulation, propaganda content, attitude to a certain country, attitude to a certain area, analysis of the dynamic behavior of topics, predictive analysis at the thematic level, etc.
In [35,36,37], the KroMFac technique is proposed, which performs community detection using regularized non-negative matrix factorization (NMF) based on the Kronecker graph model. KroMFac combines network analysis and community discovery methods in a single unified structure. This technique connects four areas of research, namely the detection of communities on graphs, of overlapping communities, of communities in incomplete networks with missing edges, and of complete networks.
It is possible to consider several works, close to the subject of our research, on the description of processes in complex social network structures.
Article [38] considers a model describing the spatial and temporal distribution of information in social networks based on a partial differential equation. In this paper, a non-autonomous diffusion logistic model with Dirichlet boundary conditions was created and investigated, which showed that the diffusion of data is strongly influenced by the diffusion coefficient and internal growth rate (the spread of information or rumors can be considered as a kind of virus that does not have a physical form).
Article [39] proposes a mathematical model of information dissemination and a mechanism for the evolution of the state of an information node, using the theory of molecular thermal diffusion in combination with a model of epidemic infection. Four different network topologies are used for the time-varying online social network (OSN) information dissemination process (regular network, small-world network, random network, and scale-free network).
The concept of information entropy is used to describe the distribution of OSN information. The process of information dissemination determines the transition of the system from one stable state to another. The transfer function is set by such information parameters as information energy, information temperature and energy entropy. The model is based on the relationship between the state of microscopic network nodes and the rules of macroscopic evolution. The authors of the article conduct simulation experiments and empirical comparative experiments in networks with different topological structures. The proposed model is trained and evaluated using experimental data collected from the Chinese network Baidu.
The authors of article [40] propose a model for describing the distribution of messages in social networks. The model is based on systems of differential equations that describe the propagation of various information along a chain in the network graph. The authors are convinced that this model makes it possible to take into account specific mechanisms for transmitting messages. In this model, the vertices of the graph are people who, when a message is received, form their attitude to it. After this, people decide on further transmission of this message over the network, provided that the corresponding interaction potential of the two persons exceeds a certain threshold level.
The authors developed a mathematical method for calculating the timing of the distribution of messages in the corresponding graph chain, which reduces to solving a number of Cauchy problems for systems of ordinary nonlinear differential equations. Formally, these systems can be simplified, and some equations can be replaced by the Boussinesq or Korteweg–de Vries equations. The presence of soliton solutions for these equations gives us reason to consider social and communicative solitons as an effective tool for modeling the processes of disseminating messages on social networks and studying various influences on their distribution. Under certain assumptions, this model, considered in [33], has some analogies with the spread of viral epidemics.
In conclusion, it should be noted that almost no one has studied models based on the Fokker-Planck equation to describe processes in complex network social systems.

4. The Analysis of the Observed Statistics of Comments from Users of Newsfeed Resources and Blogs—Statement of the Research Problem

4.1. Data Source Selection and Presentation

Newsfeeds and blogs on which Internet users leave their comments are among the most important network objects, since they can indicate public opinion in real time. A socially significant topic usually attracts both supporters and opponents, who enter into discussions and leave comments. The more highly publicized the news or blog, the higher the user activity and the greater the number of commentators (a multi-level structure of comments on comments appears). The analysis of the structure of user comments on news posts and blogs is one of the most practically significant and relevant scientific tasks, and its solution contributes to sustainable social development.
To study the nature of the observed processes and collect data, we have selected the commercial radio station and newsfeed «Echo of Moscow» https://echo.msk.ru/ (accessed on 13 October 2021). The choice is determined by the following reasons:
  • The news portal (of the commercial radio station) is among the top 10 news sites in Russia. In July 2021 it took ninth place for attendance and seventh for user activity; at the end of July 2021 it ranked in the top eight of the cited radio stations; and at the end of August 2021 it occupied first place for hyperlinks in social media and fourth place according to the citation index in the media.
  • The portal has various themes (presents news from the political, sporting, economic and scientific arenas, cultural orientation, etc.).
  • The news portal has been in existence since 1990 and has established itself as a reliable, truthful and publicly available news source, and also publishes blogs of well-known media personalities.
  • There is practically no pre-moderation of comments (pre-moderation applies only to new users or users who have previously violated the rules of the news portal), but there is post-moderation of discussions (the requirements for comments and prohibitions on their placement can be found at the link: https://echo.msk.ru/moderate.html (accessed on 13 October 2021)). Users can express different opinions (which do not have to coincide with the official position) and their comments are deleted only for violating the rules.
At first, we downloaded the news range we were interested in, using a special software application (parser). The portal distributes news by day, and each individual day can be found at the link (https://echo.msk.ru/news, (accessed on 13 October 2021), where day is the day, month is the month, year is the year). Each news item has a number of parameters such as: news text, unique identification number on the portal, title, web page address (URL), metadata (date and time of publication, number of views and comments), texts of user comments (as well as unique identification number of the comment, unique identification number of the user, date and time of comment, comment hierarchy level, relationship by level of commenting to the parent comment) and available information about authors (unique identification number of the user on the webpage, city, occupation, place of work, name or nickname, registration date, the number of recommendations and user profile views, the total number of comments for the observed period, etc.). On average, the number of news items varied from 160 to 190 per day. While collecting the data, we downloaded information about which of the users commented on other users’ reaction to news. Based on the data obtained, a database of the newsfeed archive was created.
Figure 1 shows the correspondence of the share of commentators to the number of comments they wrote (the observed density of the distribution of commentators by their number of comments) for the period from 1 January to 31 December 2020. Similar dependencies can be built for any period (day, week, month, quarter, year). The total number of news items published in 2020 was 65,560, of views 196,609,650, and of comments 564,764.
Note that Figure 1 shows only part of the data. Some users managed to write several hundred comments during the year (the maximum number of comments on one news item was 239), but their share is rather small. So, for clarity of presentation, the right part of the chart has been reduced, because it is uninformative.

4.2. Processing of Observed Data

When analyzing the observed data (see Figure 1), it is crucial to establish the distribution law that the observed distribution density obeys. Otherwise, the data obtained are difficult to use for predicting the behavior of the process and making recommendations for decision-making.
Considering the process of creating comments by users to be largely random (due to the different probability of occurrence of various news events and the degree of interest in them, etc.), let us consider the three most frequently observed distribution laws:
  • Gaussian distribution: $\rho(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^{2}}{2\sigma^{2}}}$
  • Exponential distribution: $\rho(x) = a \cdot e^{-\alpha x}$
  • Power distribution: $\rho(x) = \beta \cdot x^{-\gamma}$
If any of these distributions is fulfilled, then the observed data should be linearized in the appropriate coordinates with an acceptable value of the correlation coefficient (0.95–0.98):
  • For the Gaussian distribution: $\ln\{\rho(x)\} = -\ln\{\sigma\sqrt{2\pi}\} - \dfrac{1}{2\sigma^{2}} \cdot x^{2}$
  • For the exponential distribution: $\ln\{\rho(x)\} = \ln\{a\} - \alpha x$
  • For the power distribution: $\ln\{\rho(x)\} = \ln\{\beta\} - \gamma \ln\{x\}$
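The three linearization tests listed above can be sketched as follows. This is a minimal illustration and not the authors' processing code: the function name `linearization_quality` and the synthetic shares (generated from an exact power law with illustrative values of β and γ) are assumptions introduced here.

```python
import numpy as np

def linearization_quality(x, rho):
    """Return |r| (Pearson correlation) for the three linearizing coordinates."""
    x = np.asarray(x, dtype=float)
    y = np.log(np.asarray(rho, dtype=float))
    candidates = {
        "gaussian": x**2,        # ln(rho) against x^2
        "exponential": x,        # ln(rho) against x
        "power": np.log(x),      # ln(rho) against ln(x)
    }
    return {name: abs(np.corrcoef(z, y)[0, 1]) for name, z in candidates.items()}

# Synthetic shares rho = beta * x^(-gamma); beta and gamma are illustrative only.
x = np.arange(1, 51, dtype=float)
rho = 0.23 * x**-1.23

quality = linearization_quality(x, rho)
best = max(quality, key=quality.get)

# In the best coordinates a straight-line fit recovers -gamma and ln(beta).
slope, intercept = np.polyfit(np.log(x), np.log(rho), 1)
```

On real data none of the three correlations is exactly 1, so the candidate with the highest value is accepted only if it also reaches the acceptable range quoted in the text (0.95–0.98).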
The linearization of the observed data for various types of distribution is shown in Figure 2, Figure 3 and Figure 4 (“1”—The area in which the “fluffy tail” is observed.).
As can be seen from Figure 2, Figure 3 and Figure 4, the best linearization is observed when the observed data are approximated by the power law of distribution (see Figure 4). However, the areas marked by an oval in Figure 2, Figure 3 and Figure 4, which we called the “fluffy tail”, deserve special discussion. Their appearance is due to the fact that, in addition to the so-called conscientious users, the commentators include chatbots and users who write comments on a professional basis. A rule can be introduced under which commentators who make more than 6–10 comments per day, or who create several comments in a very short time interval (high-frequency commenting), are classed as unscrupulous users or chatbots.
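The cleaning rule described above can be sketched as a simple filter. This is a hedged illustration only: the function name `flag_suspicious`, the `(user_id, timestamp)` input format, and the burst thresholds (`burst_size`, `burst_window`) are hypothetical, since the text gives only the daily limit of 6–10 comments and does not quantify "a very short time interval".

```python
from collections import defaultdict
from datetime import datetime, timedelta

def flag_suspicious(comments, max_per_day=10, burst_size=3,
                    burst_window=timedelta(minutes=1)):
    """Return ids of users who exceed the daily limit or post in bursts.

    comments: iterable of (user_id, datetime) pairs (hypothetical format).
    """
    by_user = defaultdict(list)
    for user_id, stamp in comments:
        by_user[user_id].append(stamp)

    flagged = set()
    for user_id, stamps in by_user.items():
        stamps.sort()
        # Rule 1: more than max_per_day comments on any calendar day.
        per_day = defaultdict(int)
        for stamp in stamps:
            per_day[stamp.date()] += 1
        if max(per_day.values()) > max_per_day:
            flagged.add(user_id)
            continue
        # Rule 2: burst_size consecutive comments inside burst_window.
        for i in range(len(stamps) - burst_size + 1):
            if stamps[i + burst_size - 1] - stamps[i] <= burst_window:
                flagged.add(user_id)
                break
    return flagged
```

Comments from flagged users would then be excluded before the shares are recomputed and the power-law linearization is repeated.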
After appropriate cleaning, data can be obtained whose linearization for the power law is shown in Figure 5. There is no acceptable linearization for the exponential distribution and the Gaussian distribution. The straight line in Figure 5 shows that the trend line is well described by the linear approximation $y = -1.49 - 1.23z$, where $y = \ln\{\rho(x)\}$, $z = \ln\{x\}$, $\ln\{\beta\} = -1.47$, and the correlation coefficient is 0.98.
In addition, to confirm the conclusion regarding linear approximation, it is possible to investigate the behavior of the residuals and test the hypothesis that they are normally distributed with a mean of zero and a homogeneous variance. The residuals are calculated from the observed values of the natural logarithm of the proportion of commentators who gave a given number of comments and the values given by the fitted equation for the corresponding logarithm of the number of comments. The calculated mean of the residuals is 0.25 and the variance is 0.13. The skewness is 0.64 and the kurtosis is 0.14. Testing the slope hypothesis (two-sample F-test for variances) shows that the variance of the residuals (calculated relative to the trend line) is significantly less than the variance of the deviation of the linear regression points from the average value of the observed data (Σyi/n = Σln{ρ(xi)}/n), which is equal to 2.11 (0.13 << 2.11). Thus, the resulting regression is significant. The skewness characterizes the asymmetry of the distribution function, and for symmetric functions (for example, the normal distribution) it is zero (in our case, it is small and close to zero). The kurtosis characterizes the “tail” of the distribution. With large positive values of the kurtosis, the distribution function decreases more slowly with distance from the average value than with small ones. If the kurtosis is greater than zero, the distribution density graph lies above the normal distribution graph and, if it is less than zero, below it (in our case, it is small and very close to zero).
Thus, from the data obtained, it can be concluded that the distribution of residuals is very close to normal. This supports the conclusion that the natural logarithm of the proportion of commentators depends linearly on the natural logarithm of the number of comments, that is, that the power law holds.
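The residual diagnostics described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' calculation: the function name `residual_summary` and the dictionary keys are introduced here, the fit is an ordinary least-squares line, and the kurtosis is reported as excess kurtosis (zero for a normal distribution).

```python
import numpy as np

def residual_summary(z, y):
    """Fit y = intercept + slope * z and summarize the residuals."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(z, y, 1)
    fitted = intercept + slope * z
    resid = y - fitted

    centered = resid - resid.mean()
    m2 = np.mean(centered**2)
    return {
        "slope": slope,
        "intercept": intercept,
        "mean": resid.mean(),
        "residual_var": resid.var(ddof=1),
        # Variance of the regression points around the mean of the data.
        "explained_var": fitted.var(ddof=1),
        "skewness": np.mean(centered**3) / m2**1.5,
        "excess_kurtosis": np.mean(centered**4) / m2**2 - 3.0,
    }
```

Comparing `residual_var` with `explained_var` mirrors the variance comparison quoted in the text (0.13 against 2.11). Values of `skewness` and `excess_kurtosis` near zero are consistent with, but do not prove, normally distributed residuals.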
Thus, it can be assumed with great certainty that the density of the distribution of commentators by their number of comments obeys a power law.
It seems interesting to consider the dynamics of the changes in the number of comments on news of great public interest (during viewing, such types of newsfeed or blogs gain hundreds of comments) over time.
As an illustrative example, the news item that appeared on the Echo of Moscow portal on 16 April 2020 (https://echo.msk.ru/news/2626290-echo.html, accessed on 21 November 2021) can be chosen: “The Public Council under the Ministry of Defense made a proposal to rename the Prazhskaya metro station in honor of Marshal Konev.” The total number of comments was 221. Figure 6 shows the dynamics of changes in the number of comments on this news item over time. The number of comments at the first level (comments on the news itself) was 107, at the second level (comments on the comments at the first level) 26, at the third 24, and at the fourth and more the remaining 64. The average time for second-level comments to appear is about 130 min.
As another illustrative example, the news that appeared on the Echo of Moscow portal (https://echo.msk.ru/news/2740844-echo.html) accessed on 12 November 2020 can be chosen: “Lavrov said that the Russian Federation has reason to believe that Navalny was poisoned on a plane or in Germany.” The total number of comments was 220. The dynamics of the change in the number of comments on this news item over time is shown in Figure 7.
After removing the time gaps, the dynamics take the form shown in Figure 8.
The number of comments at the first level (comments on the news itself) was 93, at the second (comments on comments at the first level) 32, at the third 22, and at the fourth and more 73. The average time for second-level comments to appear is about 100 min.
It should be noted that in addition to the two-stage curves (see Figure 6 and Figure 8), in some cases the number of comments changes with an S-shaped dynamic (see Figure 9 (without removing the gaps) and Figure 10 (after removing the time gaps) for the news item “Putin has nominated Mishustin for the post of Prime Minister” published on the portal https://echo.msk.ru/news/2571431-echo.html (accessed on 21 November 2021) on 15 January 2020, which received 208 comments).
The number of comments at the first level (on the news itself) was 90, at the second (comments on comments at the first level) 38, at the third 14, and at the fourth and more 66. The average time for second-level comments to appear is about 56 min.
Note that the length of the sections of curves 1, 2 and 3 in Figure 6, Figure 8 and Figure 10 may be different, as well as the growth areas of the S-shaped curves.
The dynamics of the appearance of comments for the news items (1) of 16 April 2020: “The Public Council under the Ministry of Defense made a proposal to rename the Prazhskaya metro station in honor of Marshal Konev”; and (2) of 12 November 2020: “Lavrov stated that the Russian Federation has reason to believe that Navalny was poisoned on an airplane or in Germany”, have a two-stage character (see Figure 6 and Figure 8). For the news item (3) of 15 January 2020: “Putin nominated Mishustin for the post of Prime Minister”, the dynamics are S-shaped (see Figure 10). In our opinion, this may be due to a significant difference in the average time of appearance of second-level comments (the time interval between the appearance of the first-level comment and the comment on this comment). For the first and second news items this is about 130 and 100 min, respectively, whereas for the third it is about 56 min. It should also be noted that the two-stage nature of the dynamics of commenting on the first news item is more evident than for the second, and at the same time the span until the appearance of secondary comments on the second news item is longer. For the other parameters (the total number of comments, the number of comments at the first, second and third levels), the three selected news items are close in quantitative terms.
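The per-level counts and the average delay of second-level comments used in this comparison can be computed from a comment tree along the following lines. This is a sketch under assumptions: the `Comment` record and its field names are hypothetical and only loosely mirror the parameters the parser is said to collect (comment id, parent comment, hierarchy level, time of the comment).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class Comment:
    comment_id: int
    parent_id: Optional[int]   # None for comments on the news item itself
    level: int                 # 1 = comment on the news, 2 = reply to level 1, ...
    minutes: float             # minutes elapsed since the news was published

def level_counts(comments):
    """Count comments at levels 1, 2, 3 and at level 4 or deeper."""
    counts = Counter(min(c.level, 4) for c in comments)
    return {"1": counts[1], "2": counts[2], "3": counts[3], "4+": counts[4]}

def mean_second_level_delay(comments):
    """Average time between a first-level comment and each reply to it."""
    by_id = {c.comment_id: c for c in comments}
    delays = [c.minutes - by_id[c.parent_id].minutes
              for c in comments
              if c.level == 2 and c.parent_id in by_id]
    return sum(delays) / len(delays) if delays else None
```

For the three news items discussed above, the second function would correspond to the reported values of about 130, 100 and 56 min.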
For further study, the following theoretical research task can be formulated: what is the nature of the processes of commenting on news items and blogs, and what features of these complex social systems cause the probability density of the distribution by the number of comments to follow a power law and the dynamics to have, in many cases, a complex two-stage character?

5. Derivation of the Power Law of the Distribution of Comments from the Stationary Fokker-Planck Equation

The Fokker-Planck equation is widely used for the analysis and modeling of non-stationary processes observed in various complex systems and allows the achievement of good agreement with the predicted behavior and observed data. Therefore, as a testable hypothesis, we assume that the Fokker-Planck equation can be used to analyze and model the appearance of comments on newsfeed and blogs.
In general, the Fokker-Planck equation has the form:
$$\frac{\partial \rho(x,t)}{\partial t} = -\frac{\partial}{\partial x}\left[\mu(x)\cdot\rho(x,t)\right] + \frac{1}{2}\,\frac{\partial^{2}}{\partial x^{2}}\left[D(x)\cdot\rho(x,t)\right] \tag{1}$$
where $\rho(x,t)$ is the time-dependent probability density of the distribution over states $x$ (in our case, state $x$ is the number of comments observed at time $t$), $D(x)$ is a state-dependent coefficient that determines a random change in state $x$, and $\mu(x)$ is a state-dependent coefficient that determines a purposeful change in state $x$.
In relation to our model, D ( x ) can be interpreted as user actions caused by a spontaneous impulse that arose when reading the news or comments on it from other users, when the event described in the newsfeed or blog is not essential, but the user is willing to spend time commenting or responding to another commentator (the user has a spontaneous desire to respond to this news). μ ( x ) can be interpreted as purposeful actions caused by the desire to respond to a newsfeed or blog that is essential to the user, as well as to comment on another user’s comment if this touches on a topic that is important from the point of view of this user (the user is constantly interested in this topic).
When analyzing the observed data, as a first step we will not consider the dynamics of the appearance of comments over time, but take a static picture formed over a certain period of time (when the changes stop), so we can proceed to the stationary Fokker-Planck equation, which has the form:
$$-\frac{d}{dx}\left[\mu(x)\cdot\rho(x)\right] + \frac{1}{2}\,\frac{d^{2}}{dx^{2}}\left[D(x)\cdot\rho(x)\right] = 0 \tag{2}$$
Calculate the derivatives in Equation (2):
$$\frac{d}{dx}\left[\mu(x)\cdot\rho(x)\right] = \mu(x)\cdot\frac{d\rho(x)}{dx} + \rho(x)\cdot\frac{d\mu(x)}{dx}$$
$$\frac{d^{2}}{dx^{2}}\left[D(x)\cdot\rho(x)\right] = \frac{d}{dx}\left[D(x)\cdot\frac{d\rho(x)}{dx} + \rho(x)\cdot\frac{dD(x)}{dx}\right] = D(x)\cdot\frac{d^{2}\rho(x)}{dx^{2}} + 2\,\frac{dD(x)}{dx}\cdot\frac{d\rho(x)}{dx} + \rho(x)\cdot\frac{d^{2}D(x)}{dx^{2}}$$
After substituting the derivatives into Equation (2), we obtain:
$$-\mu(x)\cdot\frac{d\rho(x)}{dx} - \frac{d\mu(x)}{dx}\cdot\rho(x) + \frac{1}{2}D(x)\cdot\frac{d^{2}\rho(x)}{dx^{2}} + \frac{dD(x)}{dx}\cdot\frac{d\rho(x)}{dx} + \frac{1}{2}\,\frac{d^{2}D(x)}{dx^{2}}\cdot\rho(x) = 0 \tag{3}$$
Further, to build the model, it is necessary to make assumptions about the dependence of $D(x)$ and $\mu(x)$ on the state $x$, subject to two conditions. Firstly, we consider the magnitude of the terms included in Equation (3): logic suggests that all of them should have the same order of magnitude as $\rho(x)$. Secondly, we can assume that with the growth of state $x$ (an increase in the number of possible comments, i.e., in the significance of a newsfeed or blog), the values of $D(x)$ and $\mu(x)$ should also increase. Both conditions are met if the dependencies of $\mu(x)$ and $D(x)$ on the state $x$ have the form $\mu(x) = \mu_0 \cdot x$ and $D(x) = D_0 \cdot x^{2}$: this form ensures the growth of $D(x)$ and $\mu(x)$ with increasing $x$ and, at the same time, preserves the magnitude of the terms. Substituting $D(x)$ and $\mu(x)$ into Equation (3) gives:
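$$-\mu_0 \cdot x\cdot\frac{d\rho(x)}{dx} - \mu_0\cdot\rho(x) + \frac{1}{2}D_0\cdot x^{2}\cdot\frac{d^{2}\rho(x)}{dx^{2}} + 2D_0\cdot x\cdot\frac{d\rho(x)}{dx} + D_0\cdot\rho(x) = 0$$
$$x^{2}\cdot\frac{d^{2}\rho(x)}{dx^{2}} + 2\left[2 - \frac{\mu_0}{D_0}\right]\cdot x\cdot\frac{d\rho(x)}{dx} + 2\left[1 - \frac{\mu_0}{D_0}\right]\rho(x) = 0$$
Denote $2\left[1 - \dfrac{\mu_0}{D_0}\right] = \gamma$, then:
$$x^{2}\cdot\frac{d^{2}\rho(x)}{dx^{2}} + \left[2 + \gamma\right]\cdot x\cdot\frac{d\rho(x)}{dx} + \gamma\cdot\rho(x) = 0 \tag{5}$$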
Equation (5) is an equation of the Euler type, and its solution can be sought in the form $\rho(x) = \sum_k C_k x^{q_k}$, where $C_k$ are constant coefficients at the corresponding roots $q_k$ of the characteristic equation, which has the form:
$$q(q-1) + \left[2+\gamma\right]q + \gamma = 0$$
This equation has two roots: $q_1 = -1$ and $q_2 = -\gamma$. Thus, for $\rho(x)$ we obtain:
$$\rho(x) = C_1 x^{-1} + C_2 x^{-\gamma}$$
We find the constant coefficients $C_1$ and $C_2$ using the normalization condition of the function $\rho(x)$:
$$\int_{1}^{\infty}\rho(x)\,dx = C_1 \ln(x)\Big|_{1}^{\infty} + C_2\,\frac{x^{1-\gamma}}{1-\gamma}\Big|_{1}^{\infty} = 1 \tag{7}$$
Integral (7) is calculated from 1 to ∞, because there may be users who have made a very large number of comments to the news, but there cannot be commentators who have written less than one comment. Given that $\ln(x) \to \infty$ for $x \to \infty$, we must set $C_1 = 0$; the remaining term converges for $\gamma > 1$ and gives $C_2 = \gamma - 1$. Finally, we get: $\rho(x) = \left[\gamma - 1\right]x^{-\gamma}$.
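The algebra of this step can be checked numerically. The sketch below is only an illustration with hypothetical function names: it expands the characteristic equation to q² + (1 + γ)q + γ = 0, confirms the roots −1 and −γ, and evaluates the closed-form normalization integral of [γ − 1]x^(−γ) on [1, upper].

```python
import numpy as np

def euler_roots(gamma):
    """Roots of q(q - 1) + (2 + gamma) q + gamma = q^2 + (1 + gamma) q + gamma."""
    return np.sort(np.roots([1.0, 1.0 + gamma, gamma]).real)

def stationary_density(x, gamma):
    """rho(x) = (gamma - 1) * x^(-gamma), defined for x >= 1 and gamma > 1."""
    return (gamma - 1.0) * np.asarray(x, dtype=float) ** (-gamma)

def normalization(gamma, upper):
    """Closed-form integral of the stationary density from 1 to upper."""
    return 1.0 - upper ** (1.0 - gamma)

gamma = 1.23                      # exponent taken from the fit in Figure 5
roots = euler_roots(gamma)        # expected to be close to [-gamma, -1]
mass = normalization(gamma, 1e30)
```

Because the exponent 1 − γ = −0.23 is small in magnitude, the tail is heavy: the integral approaches 1 only for extremely large upper limits.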
Let us compare the obtained theoretical result with the observed data (see Figure 5). Linear approximation of the data presented in Figure 5 allowed us to obtain the equation $y = -1.49 - 1.23z$, which must be compared with the equation:
$$\ln\{\rho(x)\} = \ln\{\gamma - 1\} - \gamma\cdot\ln(x)$$
If $\gamma = 1.23$, then $\ln(\gamma - 1) = -1.47$, which shows a very good correspondence between the theory and the observed data.
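The comparison can be reproduced with a few lines of arithmetic. In this sketch the slope and intercept are the fitted values quoted in the text, with the minus signs restored on the assumption that they were lost in typesetting (the shares are below 1, so their logarithms are negative).

```python
import math

fitted_intercept = -1.49   # intercept of the trend line in Figure 5
fitted_slope = -1.23       # slope of the trend line in Figure 5

gamma = -fitted_slope                          # exponent of the power law
predicted_intercept = math.log(gamma - 1.0)    # ln(gamma - 1) from the theory

# Relative disagreement between the theoretical and the fitted intercept.
relative_gap = abs(predicted_intercept - fitted_intercept) / abs(fitted_intercept)
```

The predicted intercept rounds to −1.47, and the gap to the fitted value of −1.49 is on the order of one percent.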
The results obtained show that, with a linear dependence of $\mu(x)$ on the state $x$ and a quadratic dependence of $D(x)$ on the state $x$, the power law for the probability density of the distribution by the number of comments (states $x$) can be obtained from the solution of the stationary Fokker-Planck equation, and that the observed data and theoretical calculations are in good agreement with each other.
Special attention should be paid to this result. Memory and self-organization effects play an important role in the dynamics of social processes. Nevertheless, the Fokker-Planck equation, which describes the dynamics as a whole at the macro level and whose derivation relies on a completely stochastic Markov approximation, yields theoretical results that are in good agreement with the observed data. We can assume that the multidirectional nature of the many local actions and processes, each of which has both memory and self-organization, causes memory to largely disappear in the aggregate.

6. A Model of the Nonlinear Dynamics of the Appearance of Comments Based on the Fokker-Planck Equation

Since the use of the Fokker-Planck equation and the approach described above allow us to obtain the power law of distribution observed in practice, it is advisable to use this equation to describe the dynamics of the observed processes.
Using the method of Laplace transformations for Equation (1), it is possible to obtain (see Appendix A) the following expression for the distribution function:
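$$\rho(x,t) = \int \frac{\dfrac{[\ln(x)]^{2}}{D_0 t} + \left[\dfrac{1}{2} - \dfrac{\mu_0}{D_0}\right]\ln(x) - 1}{\sqrt{2\pi D_0 t^{3}}}\; \exp\left\{-\left[\frac{[\ln(x)]^{2}}{2D_0 t} + \left[\frac{3}{2} - \frac{\mu_0}{D_0}\right]\ln(x) + \left[\frac{1}{2} - \frac{\mu_0}{D_0}\right]^{2}\frac{D_0 t}{2}\right]\right\}dt$$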
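The distribution function above can be evaluated numerically. The sketch below is not the authors' code and makes two assumptions that the text does not state explicitly: the time integral runs from 0 to t, and the signs inside the expression are those that follow from Appendix A. The function name `rho_dynamic`, the grid size and the cut-off `t_min` are hypothetical choices.

```python
import numpy as np

def rho_dynamic(x, t, mu0, D0, n=20001, t_min=1e-4):
    """Trapezoidal estimate of rho(x, t) on a logarithmic time grid.

    Assumes the time integral runs from 0 to t. The lower cut-off t_min is
    harmless only when ln(x) is not too small (roughly x > 1.1 here).
    """
    a = np.log(x)
    alpha = 0.5 - mu0 / D0
    s = np.geomspace(t_min, t, n)                 # integration variable
    bracket = a**2 / (D0 * s) + alpha * a - 1.0
    exponent = -(a**2 / (2.0 * D0 * s) + (1.0 + alpha) * a
                 + alpha**2 * D0 * s / 2.0)
    integrand = bracket / np.sqrt(2.0 * np.pi * D0 * s**3) * np.exp(exponent)
    g = integrand * s                             # ds = s d(ln s)
    u = np.log(s)
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(u)))

# Parameters chosen so that gamma = 2 * (1 - mu0 / D0) = 1.23 (illustrative).
D0 = 0.5
mu0 = 0.385 * D0
value = rho_dynamic(5.0, 5000.0, mu0, D0)
stationary = 0.23 * 5.0**-1.23
```

For γ > 1 and large t, the estimate should approach the stationary density [γ − 1]x^(−γ), which gives a simple consistency check between the dynamic and the stationary results.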
The probability that the number of comments reaches a certain number $L$ by time $t$ can be found by formula (10):
$$P(L,t) = 1 - \int_{0}^{L}\left[\int \frac{\dfrac{[\ln(x)]^{2}}{D_0 t} + \left[\dfrac{1}{2} - \dfrac{\mu_0}{D_0}\right]\ln(x) - 1}{\sqrt{2\pi D_0 t^{3}}}\; \exp\left\{-\left[\frac{[\ln(x)]^{2}}{2D_0 t} + \left[\frac{3}{2} - \frac{\mu_0}{D_0}\right]\ln(x) + \left[\frac{1}{2} - \frac{\mu_0}{D_0}\right]^{2}\frac{D_0 t}{2}\right]\right\}dt\right]dx \tag{10}$$
Here the integral over $x$ from 0 to $L$, that is, $\int_{0}^{L}\rho(x,t)\,dx$, determines the probability that the threshold $L$ (for example, the maximum possible value of the number of comments) will not be reached by time $t$. The dependence of the number of comments $N(t)$ on time $t$ will be described by the equation $N(t) = P(L,t)\cdot L$.
We will conduct simulation modeling and analyze the theoretical results obtained. As an example, we choose $L = 100$ and three sets of values of $\mu_0$ and $D_0$ in conventional units: $\mu_0 = 0.45$ and $D_0 = 0.50$ ($\mu_0 < D_0$, see curve 1 in Figure 11); $\mu_0 = 0.50$ and $D_0 = 0.50$ ($\mu_0 = D_0$, see curve 2 in Figure 11); and $\mu_0 = 0.55$ and $D_0 = 0.50$ ($\mu_0 > D_0$, see curve 3 in Figure 11). Figure 11 shows the results of modeling the dynamics of changes over time in the number of comments $N(t)$ at the selected values of the model parameters $\mu_0$, $D_0$ and $L$.
Theoretical calculations show that, with the growth of μ 0 relative to D 0 , the growth rate of the curve increases (see Figure 11).
It is important to note that the model based on the Fokker-Planck equation for all values of the parameters μ 0 and D 0 shows the S-shaped nature of the dynamics of changes in the number of comments to the news over time, which in many cases is not consistent with the observed data (see Figure 6 and Figure 8).
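The S-shaped curves can be reproduced with a short sketch. It is not the authors' implementation and rests on assumptions that go beyond the text: the density is taken to be supported on x ≥ 1 (the normalization used in Appendix A), so that formula (10) equals the probability mass above L, and the time integral is taken from 0 to t. Under these assumptions, integrating the Laplace image from Appendix A over x from L to ∞ gives the image of the first-passage-time distribution of a Brownian motion in ln(x) with drift μ0 − D0/2 and variance D0, for which a standard closed form in terms of the normal distribution function exists. The function names are hypothetical.

```python
import math

def _normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def reach_probability(L, t, mu0, D0):
    """P(L, t) under the assumptions stated in the lead-in (closed form)."""
    if t <= 0.0:
        return 0.0
    a = math.log(L)                 # distance to the threshold in ln(x)
    nu = mu0 - 0.5 * D0             # drift of ln(x)
    sd = math.sqrt(D0 * t)
    # The exponential factor stays moderate for the parameters used here.
    return (_normal_cdf((nu * t - a) / sd)
            + math.exp(2.0 * nu * a / D0) * _normal_cdf((-a - nu * t) / sd))

def comments_curve(L, times, mu0, D0):
    """N(t) = P(L, t) * L for each t in times."""
    return [L * reach_probability(L, t, mu0, D0) for t in times]

L = 100
times = list(range(0, 201))
curves = {mu0: comments_curve(L, times, mu0, 0.50) for mu0 in (0.45, 0.50, 0.55)}
```

In this sketch a larger μ0 at fixed D0 shifts the curve to earlier times, which matches the qualitative statement that the growth rate increases with μ0 relative to D0.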
The correspondence of the theoretical model and the observed data (see Figure 6 and Figure 8) can be obtained if we assume that two processes with different $\mu_0$ and $D_0$ can occur simultaneously. Moreover, the sum of the partial fractions of the processes should be equal to 1, i.e., $P_{total}(L,t) = \alpha_1\cdot P_1(L,t) + \alpha_2\cdot P_2(L,t)$, where $\alpha_1 + \alpha_2 = 1$. One of the processes is generated by commenting on the newsfeed or blog itself, and the second by commenting on comments. To describe this, we include in the model a possible time delay in commenting on comments. If we introduce the delay time (denoted $\tau$), then the distribution function takes the form:
$$\rho(x,t-\tau) = \int \frac{\dfrac{[\ln(x)]^{2}}{D_0 [t-\tau]} + \left[\dfrac{1}{2} - \dfrac{\mu_0}{D_0}\right]\ln(x) - 1}{\sqrt{2\pi D_0 [t-\tau]^{3}}}\; \exp\left\{-\left[\frac{[\ln(x)]^{2}}{2D_0 [t-\tau]} + \left[\frac{3}{2} - \frac{\mu_0}{D_0}\right]\ln(x) + \left[\frac{1}{2} - \frac{\mu_0}{D_0}\right]^{2}\frac{D_0 [t-\tau]}{2}\right]\right\}dt$$
As we wrote earlier, this may be due to a significant difference in the average time of appearance of second-level comments (the time interval between the appearance of a first-level comment and a comment on this comment), which may lead to the implementation of two-stage dynamics in the appearance of comments.
As an example of modeling, we will choose the following model parameters for the process of commenting on the newsfeed or blog itself: $\mu_{0,1} = 0.55$, $D_{0,1} = 0.50$, and for the second process (commenting on comments): $\mu_{0,2} = 0.50$, $D_{0,2} = 0.50$, $\tau = 50$ conventional units, with $\alpha_1 = 0.75$, $\alpha_2 = 0.25$ and $L = 100$ ($\mu_{0,1} > \mu_{0,2}$ was chosen based on the assumption that commenting on the news is a more primary process for users than commenting on comments).
Figure 12 shows the results of modeling the dynamics of changes in the number of comments $N(t)$ over time when two processes occur in parallel. As can be seen from the simulation results presented in Figure 12, there is a good coincidence of real data (see Figure 6 and Figure 8) and theoretical calculations (curve 1, constructed considering the time delay $\tau$). Without considering the delay, the dynamics of the news commenting process is S-shaped (see curve 2 in Figure 12), which coincides with the observed data presented in Figure 10 and is consistent with a significant difference in the average time of appearance of second-level comments for news items 1, 2 and 3 selected as an example.
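A two-process curve with a delay can be sketched in the same spirit. The sketch uses the parameter values quoted above but is otherwise hypothetical: it assumes that each partial probability has the first-passage closed form of a Brownian motion in ln(x) with drift μ0 − D0/2 and variance D0 (which follows from Appendix A only if the density is supported on x ≥ 1 and the time integral runs from 0 to t), and that the delayed process contributes nothing before t = τ.

```python
import math

def _normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def reach_probability(L, t, mu0, D0):
    """First-passage closed form assumed for P(L, t); zero for t <= 0."""
    if t <= 0.0:
        return 0.0
    a = math.log(L)
    nu = mu0 - 0.5 * D0
    sd = math.sqrt(D0 * t)
    return (_normal_cdf((nu * t - a) / sd)
            + math.exp(2.0 * nu * a / D0) * _normal_cdf((-a - nu * t) / sd))

def two_stage_comments(t, L=100, tau=50.0, alpha1=0.75, alpha2=0.25,
                       mu01=0.55, D01=0.50, mu02=0.50, D02=0.50):
    """N(t) = L * [alpha1 * P1(L, t) + alpha2 * P2(L, t - tau)]."""
    p1 = reach_probability(L, t, mu01, D01)          # comments on the news
    p2 = reach_probability(L, t - tau, mu02, D02)    # delayed replies
    return L * (alpha1 * p1 + alpha2 * p2)

with_delay = [two_stage_comments(t) for t in range(0, 201)]
without_delay = [two_stage_comments(t, tau=0.0) for t in range(0, 201)]
```

With the delay, the first process has almost saturated at about α1·L before the second one starts, which produces the plateau between the two stages; with τ = 0 the two contributions overlap and the curve stays S-shaped.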
The parallel flow of the two processes does not violate the integrity of the model, because the stationary solution of the modified Fokker-Planck equation (taking into account the delay by τ ) and the usual equation has the same form, which was described in the section “Derivation of the power law of the distribution of comments from the stationary Fokker-Planck equation”.

7. Discussion

Firstly, it is possible to analyze the topics of news items that gain the largest number of comments (i.e., attract the greatest public interest), rank them by popularity, and study their static distributions. Further, within each group, it is possible to determine the exponent of the power law $\rho(x) = \left[\gamma - 1\right]x^{-\gamma}$. Then, considering that $\dfrac{\mu_0}{D_0} = 1 - \dfrac{\gamma}{2}$, it is possible to determine the value of $\mu_0/D_0$, from which one can judge for which types of news and messages purposeful commenting is predominant (an increase in the ratio $\mu_0/D_0$) and for which commenting is “random” (a decrease in the ratio $\mu_0/D_0$). This will allow prediction of which news items may cause which user behavior, and how they may influence public opinion.
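The conversion from a fitted exponent to the ratio μ0/D0 is a one-line formula. In the sketch below the topic names and their exponents are invented purely for illustration and do not come from the data; only the value 1.23 is taken from the fit reported in the text.

```python
def drift_to_diffusion_ratio(gamma):
    """mu0 / D0 = 1 - gamma / 2, from gamma = 2 * (1 - mu0 / D0)."""
    return 1.0 - gamma / 2.0

# Hypothetical exponents for illustrative topic groups.
fitted_exponents = {"politics": 1.15, "economy": 1.23, "sport": 1.40}

ratios = {topic: drift_to_diffusion_ratio(g) for topic, g in fitted_exponents.items()}

# Topics ordered from the most "purposeful" to the most "random" commenting.
ranking = sorted(ratios, key=ratios.get, reverse=True)
```

For the exponent obtained from Figure 5, the ratio is 1 − 1.23/2 = 0.385.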
Secondly, using the dynamic distribution functions obtained in this work, it is possible to analyze the observed processes of commenting on newsfeed and blogs. Further, based on this, it is possible to determine the parameters of the model μ0, D0 and τ for various types of news, which can also allow prediction in the future what news may cause what user behavior, and how this may influence public opinion.
In conclusion, we note that the complex nature of the dynamics of processes in complex social systems can be described not only by models based on the Fokker-Planck equation. For example, in [41,42,43,44,45,46,47,48], models were developed specifically to describe the stochastic dynamics of changes in the state of complex social systems. These models take into account the processes of self-organization and the presence of memory. To create them, graphical diagrams of the probabilities of transitions between possible states of the described systems were considered, taking into account previous states. This method makes it possible to take memory into account and to describe not only Markov but also non-Markov processes. Using this approach, a nonlinear second-order differential equation was derived, which allows problems to be set and solved for determining the probability density function of the amplitude of deviations of the parameters describing the observed processes of a non-stationary time series, depending on the time interval of its determination and the depth of memory. The differential equation obtained contains not only terms responsible for random change (diffusion) and ordered change (destruction), but also a term responsible for the possibility of self-organization, which significantly distinguishes it from the Fokker-Planck equation. Within the framework of the models developed in [41,42,43,44,45,46,47,48], it is possible to describe processes whose dynamics have both an S-shaped and a two-stage character.
The novelty of our work in comparison with the works of our predecessors is, firstly, that a stationary version of the Fokker-Planck equation yields a power law for the distribution of the observed parameters that is consistent with the data observed in practice. In this case, it can be supposed that the multidirectional nature of many local actions and processes, each of which has both memory and self-organization, leads in aggregate to the fact that memory can largely disappear and the process in a generalized form becomes Markovian. This allowed us, under certain assumptions for the coefficients in the Fokker-Planck equation, to obtain from its stationary form a power law of distribution for the number of comments on news and blogs. As shown in our paper, the theory aligns well with the data observed in reality.
Secondly, assuming that the Fokker-Planck equation can under certain circumstances be applied to describe the dynamics in the systems in question (for example, on the basis described above), we considered the temporal dependencies of the appearance of comments on various news items and found that they can be either S-shaped or have a more complex two-stage form, which can be explained within the framework of the Fokker-Planck equation only by the presence of two processes and a delay time.

8. Conclusions

The results obtained in the work allow us to draw several conclusions:
  • The stationary distribution of news observed in practice by the number of comments on it corresponds to the power law $\rho(x) = \left[\gamma - 1\right]x^{-\gamma}$, where $\rho(x)$ is the share of news items in their total number having $x$ comments, and $\gamma$ is the exponent.
  • The dynamics of changes over time in the number of comments to a newsfeed or blog can have both an S-shaped form and a two-stage one, which may be due to a significant difference in the average time of appearance of comments at the second level (the time interval between the appearance of a comment at the first level and a comment on this comment), i.e., the value of the average delay.
  • The power law observed in practice for the stationary probability density of the distribution of news by the number of comments (states $x$) can be obtained from the solution of the stationary Fokker-Planck equation if some assumptions are made during its derivation. We assume that the coefficient $\mu(x)$ responsible in the Fokker-Planck equation for a purposeful change in the state of the system $x$ ($x$ is the current number of comments on the news) depends linearly on the state $x$, and the coefficient $D(x)$ responsible for a random change depends on $x$ quadratically. All this suggests that the Fokker-Planck equation can be used to describe processes in complex network structures.
  • The solution of the unsteady Fokker-Planck equation under the assumptions of the linear dependence of $\mu(x)$ on the state $x$ and the quadratic dependence of $D(x)$ on the state $x$ allows us to obtain an equation for the probability density of transitions between the states of the system per unit of time, which is in good agreement with the observed data, taking into account the effect of the delay time between the appearance of the first-level comment and the comment on this comment.
  • The models developed based on the Fokker-Planck equation are in good agreement with the observed data, which makes it possible to create algorithms for monitoring and predicting the evolution of public opinion of users of news information resources.

Author Contributions

D.Z.: conceptualization, formal analysis, writing-review & editing; J.P.: methodology, visualization; V.K.: data curation, writing-original draft. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Russian Science Foundation (RSF), grant no. 22-21-00109 “Development of the dynamics forecasting models of social moods based on the analysis of text content time series of social networks using the Fokker-Planck and nonlinear diffusion equations”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

One of the solutions of the Fokker-Planck equation can be obtained as follows. Using the method of Laplace transformations for Equation (1), we can write:
$$s\,\overline{G(s,x)} - \rho(0,x) = -\frac{d}{dx}\left[\mu(x)\cdot\overline{G(s,x)}\right] + \frac{1}{2}\,\frac{d^{2}}{dx^{2}}\left[D(x)\cdot\overline{G(s,x)}\right] \tag{A1}$$
Considering that at time t = 0 (the beginning of the process) there are no comments, then: ρ ( 0 , x ) = 0.
Further, substituting into Equation (A1) the corresponding derivatives and dependencies μ ( x ) and D ( x ) (the choice of which was discussed earlier, and their use leading to the results of the distribution of the number of comments according to the power law observed in reality), we obtain:
$$x^{2}\cdot\frac{d^{2}\,\overline{G(s,x)}}{dx^{2}} + 2\left[2 - \frac{\mu_0}{D_0}\right]\cdot x\cdot\frac{d\,\overline{G(s,x)}}{dx} + 2\left[1 - \frac{\mu_0 + s}{D_0}\right]\overline{G(s,x)} = 0$$
We are looking for a solution to this equation in the form $\overline{G(s,x)} = \sum_k C_k x^{q_k}$, where $C_k$ are the coefficients for the roots $q_k$ of the characteristic equation, which has the form $q(q-1) + 2\left[2 - \dfrac{\mu_0}{D_0}\right]q + 2\left[1 - \dfrac{\mu_0 + s}{D_0}\right] = 0$. Let us find the roots of the characteristic equation:
$$q_{1,2} = -\frac{\left[3 - \dfrac{2\mu_0}{D_0}\right]}{2} \pm \frac{\sqrt{\left[1 - \dfrac{2\mu_0}{D_0}\right]^{2} + \dfrac{8s}{D_0}}}{2}$$
We write it down as follows:
G ( s , x ) ¯ = x [ 3 2 μ 0 D 0 ] 2 { C 1 x [ 1 2 μ 0 D 0 ] 2 + 8 s D 0 2 + C 2 x [ 1 2 μ 0 D 0 ] 2 + 8 s D 0 2 }
Given that γ = 2 [ 1 μ 0 D 0 ] we write:
G ( s , x ) ¯ = C 1 · x [ γ + 1 ] [ γ 1 ] 2 + 8 s D 0 2 + C 2 · x [ γ + 1 ] + [ γ 1 ] 2 + 8 s D 0 2
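For $s \to \infty$ ($t \to 0$), $\rho(x,0)$ must be equal to 0 for any $x$, so $C_1$ must be set to 0 (otherwise $\frac{-[\gamma+1]+\sqrt{[\gamma-1]^{2}+\frac{8s}{D_0}}}{2} \to +\infty$ and the corresponding power of $x$ tends to $+\infty$). Using the normalization condition (for the image, the integral over $x$ from 1 to $\infty$ must be equal to $\frac{1}{s}$), we find the coefficient $C_2$:
$$\left.\frac{2C_2}{-[\gamma+1]-\sqrt{[\gamma-1]^{2}+\frac{8s}{D_0}}+2}\cdot x^{-\frac{[\gamma+1]+\sqrt{[\gamma-1]^{2}+\frac{8s}{D_0}}}{2}+1}\right|_{1}^{\infty} = \frac{1}{s}$$
$$C_2 = \frac{[\gamma-1]+\sqrt{[\gamma-1]^{2}+\frac{8s}{D_0}}}{2s}$$
$$\overline{G(s,x)} = \frac{[\gamma-1]+\sqrt{[\gamma-1]^{2}+\frac{8s}{D_0}}}{2s}\cdot x^{-\frac{[\gamma+1]+\sqrt{[\gamma-1]^{2}+\frac{8s}{D_0}}}{2}}$$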
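The normalization of the image can be cross-checked by numerical integration. This is a rough sketch under stated assumptions: the parameter values and the helper names (`simpson`, `image_norm`) are hypothetical, and the semi-infinite integral over $x$ is truncated after the substitution $x = e^{u}$. It integrates $C_2\,x^{q_2}$ from 1 to $\infty$ and compares the result with $1/s$.

```python
import math

# Illustrative parameter values (assumptions, not taken from the article).
MU0, D0, S = 0.3, 1.2, 0.7


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def image_norm(mu0, d0, s, u_max=40.0, n=4000):
    """Integral of C2 * x**q2 over x in [1, inf), computed with x = exp(u)."""
    gamma = 2.0 * (1.0 - mu0 / d0)
    root = math.sqrt((gamma - 1.0) ** 2 + 8.0 * s / d0)
    c2 = ((gamma - 1.0) + root) / (2.0 * s)
    q2 = -((gamma + 1.0) + root) / 2.0
    # dx = exp(u) du, so the integrand becomes c2 * exp((q2 + 1) * u);
    # q2 + 1 < 0 for s > 0, which makes the truncation at u_max harmless.
    return simpson(lambda u: c2 * math.exp((q2 + 1.0) * u), 0.0, u_max, n)


print("numerical integral:", image_norm(MU0, D0, S))
print("expected 1/s      :", 1.0 / S)
```

The two printed numbers should agree closely; repeating the check for other positive values of `S` exercises the $s$-dependence of $C_2$.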
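Substituting $\gamma$ and introducing the notation:
$$\alpha = \frac{1-\frac{2\mu_0}{D_0}}{2} = \frac{1}{2}-\frac{\mu_0}{D_0}$$
$$\frac{3-\frac{2\mu_0}{D_0}}{2} = 1+\frac{1-\frac{2\mu_0}{D_0}}{2} = 1+\alpha$$
$$\beta = \frac{1}{2}\sqrt{\frac{8}{D_0}} = \sqrt{\frac{2}{D_0}}$$
$$k = \frac{D_0}{8}\left[1-\frac{2\mu_0}{D_0}\right]^{2} = \frac{D_0}{2}\left[\frac{1}{2}-\frac{\mu_0}{D_0}\right]^{2} = \left[\frac{\alpha}{\beta}\right]^{2}$$
we have $\frac{[\gamma-1]}{2} = \alpha$ and $\frac{1}{2}\sqrt{[\gamma-1]^{2}+\frac{8s}{D_0}} = \beta\sqrt{k+s}$, so the $s$-dependent power of $x$ becomes:
$$x^{-\beta\sqrt{k+s}} = e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}$$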
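The image can then be written as:
$$\overline{G(s,x)} = \left[\alpha\cdot\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s} + \beta\sqrt{k+s}\cdot\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s}\right]\cdot x^{-[1+\alpha]}$$
We first find the original of $\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s}$; the original of $\beta\sqrt{k+s}\cdot\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s}$ is then obtained by differentiating the original of $\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s}$ with respect to $\ln(x)$:
$$\frac{d}{d(\ln(x))}\left[\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s}\right] = -\beta\sqrt{k+s}\cdot\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s}$$
$$\overline{G(s,x)} = \left[\alpha\cdot\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s} - \frac{d}{d(\ln(x))}\left[\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s}\right]\right]\cdot x^{-[1+\alpha]}$$
Here $\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s} = \frac{1}{s}\cdot e^{-y\sqrt{k+s}}$, where $y = \beta\cdot\ln(x)$.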
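Dividing an image by $s$ corresponds to integrating the original of $e^{-y\sqrt{k+s}}$ over $t$. Let us find this original:
$$e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}} \;\longleftrightarrow\; \frac{\beta\cdot\ln(x)}{2\sqrt{\pi t^{3}}}\cdot e^{-\frac{[\beta\cdot\ln(x)]^{2}}{4t}}\cdot e^{-kt}$$
$$\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s} \;\longleftrightarrow\; \int\frac{\beta\cdot\ln(x)}{2\sqrt{\pi t^{3}}}\cdot e^{-\frac{[\beta\cdot\ln(x)]^{2}}{4t}}\cdot e^{-kt}\,dt$$
$$\frac{d}{d(\ln(x))}\left[\frac{e^{-[\beta\cdot\ln(x)]\cdot\sqrt{k+s}}}{s}\right] \;\longleftrightarrow\; \int\frac{\beta}{2\sqrt{\pi t^{3}}}\left[1-\frac{\beta^{2}}{2t}\cdot\ln^{2}(x)\right]\cdot e^{-\frac{[\beta\cdot\ln(x)]^{2}}{4t}}\cdot e^{-kt}\,dt$$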
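The transform pair for $e^{-y\sqrt{k+s}}$ can be sanity-checked by computing the forward Laplace integral of the claimed original numerically. The sketch below is only illustrative: the values of `Y`, `K`, `S`, the truncation point `t_max`, and the helper names are assumptions made for this check, and $y > 0$ (that is, $x > 1$) is required.

```python
import math

# Illustrative values (assumptions): y = beta * ln(x) > 0, k >= 0, s > 0.
Y, K, S = 1.5, 0.2, 0.7


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def original(t, y, k):
    """Claimed original of exp(-y*sqrt(k+s)); its limit at t -> 0+ is 0."""
    if t <= 0.0:
        return 0.0
    return (y / (2.0 * math.sqrt(math.pi * t ** 3))
            * math.exp(-y * y / (4.0 * t) - k * t))


def laplace_numeric(s, y, k, t_max=60.0, n=60000):
    """Truncated Laplace integral of the original over t in [0, t_max]."""
    return simpson(lambda t: math.exp(-s * t) * original(t, y, k), 0.0, t_max, n)


print("numerical image:", laplace_numeric(S, Y, K))
print("closed form    :", math.exp(-Y * math.sqrt(K + S)))
```

Agreement of the two values supports the pair used above; the factor $e^{-kt}$ is the usual shift $s \to s + k$ applied to the standard original of $e^{-y\sqrt{s}}$.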
After making all the necessary substitutions, we obtain the following expression for the distribution function:
$$\rho(x,t) = \int\frac{\left[\frac{[\ln(x)]^{2}}{D_0 t}+\left[\frac{1}{2}-\frac{\mu_0}{D_0}\right]\ln(x)-1\right]}{\sqrt{2\pi D_0 t^{3}}}\; e^{-\left[\frac{[\ln(x)]^{2}}{2D_0 t}+\left[\frac{3}{2}-\frac{\mu_0}{D_0}\right]\ln(x)+\left[\frac{1}{2}-\frac{\mu_0}{D_0}\right]^{2}\frac{D_0 t}{2}\right]}\,dt$$
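As a consistency check on the final substitution, the integrand of $\rho(x,t)$ can be compared with the combination $\left[\alpha f - \partial f/\partial \ln(x)\right]\cdot x^{-[1+\alpha]}$, where $f$ is the original of $e^{-[\beta\ln(x)]\sqrt{k+s}}$. The sketch below is an assumption-laden illustration: the parameter values, the function names, and the use of a central finite difference for the derivative are choices made here, not part of the article.

```python
import math

# Illustrative parameter values (assumptions, not taken from the article).
MU0, D0 = 0.3, 1.2
ALPHA = 0.5 - MU0 / D0
BETA = math.sqrt(2.0 / D0)
K = (ALPHA / BETA) ** 2


def f_original(log_x, t):
    """Original of exp(-[beta*ln(x)]*sqrt(k+s)) as a function of ln(x) and t."""
    y = BETA * log_x
    return (y / (2.0 * math.sqrt(math.pi * t ** 3))
            * math.exp(-y * y / (4.0 * t) - K * t))


def integrand_closed(log_x, t):
    """Integrand of the final expression for rho(x, t)."""
    pref = ((log_x ** 2 / (D0 * t) + ALPHA * log_x - 1.0)
            / math.sqrt(2.0 * math.pi * D0 * t ** 3))
    expo = (log_x ** 2 / (2.0 * D0 * t)
            + (1.0 + ALPHA) * log_x          # [3/2 - mu0/D0] = 1 + alpha
            + ALPHA ** 2 * D0 * t / 2.0)
    return pref * math.exp(-expo)


def integrand_from_image(log_x, t, h=1e-5):
    """[alpha*f - df/d(ln x)] * x**-(1+alpha), derivative by central difference."""
    df = (f_original(log_x + h, t) - f_original(log_x - h, t)) / (2.0 * h)
    return (ALPHA * f_original(log_x, t) - df) * math.exp(-(1.0 + ALPHA) * log_x)


for x, t in [(3.0, 0.8), (1.5, 0.2), (10.0, 2.5)]:
    lx = math.log(x)
    print(x, t, integrand_closed(lx, t), integrand_from_image(lx, t))
```

The two columns should coincide up to the finite-difference error, which indicates that the closed-form integrand follows from the intermediate expressions term by term.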

References

  1. Gardiner, C. Stochastic Methods: A Handbook for the Natural and Social Sciences; Springer: Berlin, Germany, 2009. [Google Scholar]
  2. Lux, T. Inference for systems of stochastic differential equations from discretely sampled data: A numerical maximum likelihood approach. Ann. Financ. 2012, 9, 217–248. [Google Scholar] [CrossRef] [Green Version]
  3. Hurn, A.; Jeisman, J.; Lindsay, K. Teaching an old dog new tricks: Improved estimation of the parameters of stochastic differential equations by numerical solution of the Fokker-Planck equation. In Financial Econometrics Handbook; Gregoriou, G., Pascalau, R., Eds.; Palgrave: London, UK, 2010. [Google Scholar]
  4. Elliott, R.J.; Siu, T.K.; Chan, L. A PDE approach for risk measures for derivatives with regime switching. Ann. Financ. 2007, 4, 55–74. [Google Scholar] [CrossRef]
  5. Orlov, Y.N.; Fedorov, S.L. Generation of nonstationary time series trajectories based on the Fokker-Planck equation. WORKS MIPT 2016, 8, 126–133. [Google Scholar]
  6. Chen, Y.; Cosimano, T.F.; Himonas, A.A.; Kelly, P. An Analytic Approach for Stochastic Differential Utility for Endowment and Production Economies. Comput. Econ. 2013, 44, 397–443. [Google Scholar] [CrossRef]
  7. Savku, E.; Weber, G.-W. Stochastic differential games for optimal investment problems in a Markov regime-switching jump-diffusion market. Ann. Oper. Res. 2020, 1–26. [Google Scholar] [CrossRef]
  8. Andrianova, E.G.; Golovin, S.A.; Zykov, S.V.; Lesko, S.A.; Chukalina, E.R. Review of modern models and methods of analysis of time series of dynamics of processes in social, economic and socio-technical systems. Russ. Technol. J. 2020, 8, 7–45. [Google Scholar] [CrossRef]
  9. Dorogovtsev, S.N.; Mendes, J.F.F. Evolution of networks. Adv. Phys. 2002, 51, 1079–1187. [Google Scholar] [CrossRef] [Green Version]
  10. Newman, M.E.J. The structure and function of complex networks. SIAM Rev. 2003, 45, 167–256. [Google Scholar] [CrossRef] [Green Version]
  11. Dorogovtsev, S.N.; Mendes, J.F.F.; Samukhin, A.N. Generic scale of the scale-free growing networks. Phys. Rev. E 2001, 63, 062101. [Google Scholar] [CrossRef] [Green Version]
  12. Golder, S.A.; Wilkinson, D.M.; Huberman, B.A. Rhythms of social interaction: Messaging within a massive online network. In Communities and Technologies 2007; Steinfield, C., Pentland, B.T., Ackerman, M., Contractor, N., Eds.; Springer: London, UK, 2007; pp. 41–66. [Google Scholar]
  13. Kumar, R.; Novak, J.; Tomkins, A. Structure and evolution of online social networks. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’06), Philadelphia, PA, USA, 20–23 August 2006; pp. 611–617. [Google Scholar]
  14. Mislove, A.; Marcon, M.; Gummadi, K.P.; Druschel, P.; Bhattacharjee, B. Measurement and analysis of online social networks. In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement (IMC ’07), San Diego, CA, USA, 24–26 October 2007; pp. 29–42. [Google Scholar]
  15. Pasa, L.; Navarin, N.; Sperdut, A. SOM-based aggregation for graph convolutional neural networks Neural Computing and Applications Neural Comput. Applic. 2022, 34, 5–24. [Google Scholar]
  16. Pulipati, S.; Somula, R.; Parvathala, B.R. Nature inspired link prediction and community detection algorithms for social networks: A survey. Int. J. Syst. Assur. Eng. Manag. 2021. [Google Scholar] [CrossRef]
  17. Airoldi, E.M.; Blei, D.M.; Fienberg, S.E.; Xing, E.P. Mixed membership stochastic blockmodels. J. Mach. Learn. Res. 2008, 9, 1981–2014. [Google Scholar] [PubMed]
  18. Cho, Y.-S.; Steeg, G.V.; Galstyan, A. Co-evolution of selection and influence in social networks. In Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2011), San Francisco, CA, USA, 7–11 August 2011. [Google Scholar]
  19. Sahafizadeh, E.; Ladani, B.T. The impact of group propagation on rumor spreading in mobile social networks. Phys. A Stat. Mech. Its Appl. 2018, 506, 412–423. [Google Scholar] [CrossRef]
  20. Varma, V.S.; Morarescu, I.C.; Haye, Y. Analysis and control of multi-leveled opinions spreading in social networks. In Proceedings of the American Control Conference (ACC 2018), Milwaukee, WI, USA, 27–29 June 2018; pp. 3404–3409. [Google Scholar]
  21. López-Santamaría, L.-M.; Almanza-Ojeda, D.-L.; Gomez, J.C.; Ibarra-Manzano, M.-A. Age and Gender Identification in Unbalanced Social Media. In Proceedings of the 2019 International Conference on Electronics, Communications and Computers (CONIELECOMP), Cholula, Mexico, 27 February–1 March 2019. [Google Scholar] [CrossRef]
  22. Barberá, P. Less is More? How Demographic Sample Weights Can Improve Public Opinion Estimates Based on Twitter Data. 2016. Available online: http://pablobarbera.com/static/less-is-more.pdf (accessed on 21 December 2021).
  23. Luo, F.; Cao, G.; Mulligan, K.; Li, X. Explore Spatiotemporal and Demographic Characteristics of Human Mobility via Twitter: A Case Study of Chicago. Appl. Geogr. 2016, 70, 11–25. [Google Scholar] [CrossRef] [Green Version]
  24. Sekara, V.; Stopczynski, A.; Lehmann, S. Fundamental structures of dynamic social networks. Proc. Natl. Acad. Sci. USA 2016, 113. [Google Scholar] [CrossRef] [Green Version]
  25. Ubaldi, E.; Vezzani, A.; Karsai, M.; Perra, N.; Burioni, R. Burstiness and tie activation strategies in time-varying social networks. Sci. Rep. 2017, 7, srep46225. [Google Scholar] [CrossRef]
  26. Yatim, A.F.M.; Wardhana, Y.; Kamal, A.; Soroinda, A.A.R.; Rachim, F.; Wonggo, M.I. A corpus-based lexicon building in Indonesian political context through Indonesian online news media. In Proceedings of the 2016 International Conference on Advanced Computer Science and Information Systems (ICACSIS), Malang, Indonesia, 15–16 October 2016. [Google Scholar] [CrossRef]
  27. Kirn, S.L.; Hinders, M.K. Dynamic wavelet fingerprint for differentiation of tweet storm types. Soc. Netw. Anal. Min. 2020, 10, 4. [Google Scholar] [CrossRef]
  28. Karami, A.; Elkouri, A. Political Popularity Analysis in Social Media; Springer: Berlin, Germany, 2019; pp. 456–465. [Google Scholar]
  29. Koti, P.; Pothula, S.; Dhavachelvan, P. Age Forecasting Analysis—Over Microblogs. In Proceedings of the 2017 Second International Conference on Recent Trends and Challenges in Computational Models (ICRTCCM), Tindivanam, India, 3–4 February 2017; pp. 83–86. [Google Scholar] [CrossRef]
  30. Mukhamediev, R.I.; Yakunin, K.; Mussabayev, R.; Buldybayev, T.; Kuchin, Y.; Murzakhmetov, S.; Yelis, M. Classification of Negative Information on Socially Significant Topics in Mass Media. Symmetry 2020, 12, 1945. [Google Scholar] [CrossRef]
  31. Ko, H.; Jong, Y.; Sangheon, K.; Libor, M. Human-machine interaction: A case study on fake news detection using a backtracking based on a cognitive system. Cogn. Syst. Res. 2019, 55, 77–81. [Google Scholar]
  32. Bushman, B.; Whitaker, J. Media Influence on Behavior. Reference Module in: Neuroscience and Biobehavioral Psychology. 2017. Available online: http://scitechconnect.elsevier.com/neurorefmod/ (accessed on 24 November 2020).
  33. Bandari, R.; Asur, S.; Huberman, B.A. The Pulse of News in Social Media: Forecasting Popularity. arXiv 2012, arXiv:1202.0332v1. Available online: https://arxiv.org/pdf/1202.0332.pdf (accessed on 21 December 2021).
  34. Willaert, T.; Van Eecke, P.; Beuls, K.; Steels, L. Building Social Media Observatories for Monitoring Online Opinion Dynamics. Soc. Media Soc. 2020, 6. [Google Scholar] [CrossRef]
  35. Tran, C.; Shin, W.-Y.; Spitz, A. Community Detection in Partially Observable Social Networks. ACM Trans. Knowl. Discov. Data 2021, 16, 1–24. [Google Scholar] [CrossRef]
  36. Chen, Z.; Li, X.; Bruna, J. Supervised community detection with line graph neural networks. In Proceedings of the 7th International Conference on Learning Representations (ICLR 2019), New Orleans, LA, USA, 6–9 May 2019. [Google Scholar]
  37. Hoffmann, T.; Peel, L.; Lambiotte, R.; Jones, N.S. Community detection in networks without observing edges. Sci. Adv. 2020, 6, eaav1478. [Google Scholar] [CrossRef] [Green Version]
  38. Du, B.; Lian, X.; Cheng, X. Partial differential equation modeling with Dirichlet boundary conditions on social networks. Bound. Value Probl. 2018, 2018, 50. [Google Scholar] [CrossRef]
  39. Liu, X.; He, D.; Liu, C. Modeling information dissemination and evolution in time-varying online social network based on thermal diffusion motion. Phys. A Stat. Mech. its Appl. 2018, 510, 456–476. [Google Scholar] [CrossRef]
  40. Bomba, A.; Kunanets, N.; Pasichnyk, V.; Turbal, Y. Mathematical and computer models of message distribution in social networks based on the space modification of Fermi-Pasta-Ulam approach. Adv. Intell. Syst. Comput. 2019, 836, 257–266. [Google Scholar]
  41. Zhukov, D.; Khvatova, T.; Zaltsman, A. Stochastic Dynamics of Influence Expansion in Social Networks and Managing Users’ Transitions from One State to Another. In Proceedings of the 11th European Conference on Information Systems Management (ECISM 2017), Genoa, Italy, 14–15 September 2017; pp. 322–329. [Google Scholar]
  42. Sigov, A.S.; Zhukov, D.O.; Khvatova, T.Y.; Andrianova, E.G. A Model of Forecasting of Information Events on the Basis of the Solution of a Boundary Value Problem for Systems with Memory and Self-Organization. J. Commun. Technol. Electron. 2018, 63, 1478–1485. [Google Scholar] [CrossRef]
  43. Zhukov, D.; Khvatova, T.; Millar, C.; Zaltcman, A. Modelling the stochastic dynamics of transitions between states in social systems incorporating self-organization and memory. Technol. Forecast. Soc. Chang. 2020, 158, 120134. [Google Scholar] [CrossRef]
  44. Zhukov, D.; Khvatova, T.; Istratov, L. A stochastic dynamics model for shaping stock indexes using self-organization processes, memory and oscillations. In Proceedings of the European Conference on the Impact of Artificial Intelligence and Robotics (ECIAIR 2019), Oxford, UK, 31 October–1 November 2019; pp. 390–401. [Google Scholar]
  45. Zhukov, D.; Khvatova, T.; Istratov, L. Analysis of non-stationary time series based on modelling stochastic dynamics considering self-organization, memory and oscillations. In Proceedings of the International Conference on Time Series and Forecasting (ITISE 2019), Granada, Spain, 25–27 September 2019; Volume 1, pp. 244–254. [Google Scholar]
  46. Khvatova, T.; Zaltsman, A.; Zhukov, D. Information processes in social networks: Percolation and stochastic dynamics. CEUR Workshop. In Proceedings of the 2nd International Scientific Conference “Convergent Cognitive Information Technologies”; Springer: Berlin/Heidelberg, Germany, 2017; Volume 2064, pp. 277–288. [Google Scholar]
  47. Zhukov, D.O.; Lesko, S.A. Stochastic self-organissation of poorly structured data and memory realisation in an information domain when designing news events forecasting models. In Proceedings of the 2nd IEEE International Conference on Big Data Intelligence and Computing, Auckland, New Zealand, 8–12 August 2016. [Google Scholar] [CrossRef]
  48. Zhukov, D.O.; Zaltcman, A.G.; Khvatova, T.Y. Changes in States in Social Networks and Sentiment Security Using the Principles of Percolation Theory and Stochastic Dynamics. In Proceedings of the 2019 IEEE International Conference “Quality Management, Transport and Information Security, Information Technologies” (IT and QM and IS), Sochy, Russia, 23–27 September 2019; pp. 149–153. [Google Scholar]
Figure 1. Density of distribution of commentators by their number of comments for the period from 1 January to 31 December 2020.
Figure 2. Linearization of the observed data for the Gaussian distribution.
Figure 3. Linearization of observed data for exponential distribution.
Figure 4. Linearization of the observed data for power distribution.
Figure 5. Linearization of the observed data for power distribution after removing unscrupulous users.
Figure 6. Observed change over time in the number of comments on a news item of public interest that appeared on the portal https://echo.msk.ru/news/2626290-echo.html on 16 April 2020.
Figure 7. Observed change over time in the number of comments on a news item of public interest that appeared on the portal https://echo.msk.ru/news/2740844-echo.html on 12 November 2020.
Figure 8. Change over time in the number of comments on a news item of public interest that appeared on the portal https://echo.msk.ru/news/2740844-echo.html on 12 November 2020, after removing the time gaps.
Figure 9. Observed change over time in the number of comments on a news item of public interest that appeared on the portal https://echo.msk.ru/news/2571431-echo.html on 15 January 2020.
Figure 10. Change over time in the number of comments on a news item of public interest that appeared on the portal https://echo.msk.ru/news/2571431-echo.html on 15 January 2020, after removing the time gaps.
Figure 11. Dynamics of changes over time in the number of comments on the news in a simulation model based on the Fokker-Planck equation.
Figure 12. Dynamics of changes over time in the number of comments on the news in the simulation model based on the Fokker-Planck equation, considering two parallel processes.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Zhukov, D.; Perova, J.; Kalinin, V. Description of the Distribution Law and Non-Linear Dynamics of Growth of Comments Number in News and Blogs Based on the Fokker-Planck Equation. Mathematics 2022, 10, 989. https://doi.org/10.3390/math10060989

AMA Style

Zhukov D, Perova J, Kalinin V. Description of the Distribution Law and Non-Linear Dynamics of Growth of Comments Number in News and Blogs Based on the Fokker-Planck Equation. Mathematics. 2022; 10(6):989. https://doi.org/10.3390/math10060989

Chicago/Turabian Style

Zhukov, Dmitry, Julia Perova, and Vladimir Kalinin. 2022. "Description of the Distribution Law and Non-Linear Dynamics of Growth of Comments Number in News and Blogs Based on the Fokker-Planck Equation" Mathematics 10, no. 6: 989. https://doi.org/10.3390/math10060989

APA Style

Zhukov, D., Perova, J., & Kalinin, V. (2022). Description of the Distribution Law and Non-Linear Dynamics of Growth of Comments Number in News and Blogs Based on the Fokker-Planck Equation. Mathematics, 10(6), 989. https://doi.org/10.3390/math10060989

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop