1. Introduction
The debate within the scientific community regarding power-law behavior in social and physical systems has been long-standing [1,2]. Typically, power-law behavior is observed at the macro level of a system, prompting researchers to seek microscopic interpretations of these phenomena. Mathematically, the power law is unique in satisfying the scale-free property [1], which establishes a close relationship between the self-similarity of stochastic processes and power-law behavior [3]. This statistical property is a characteristic feature of both social and financial systems. Measures of long-range memory based on self-similarity are often ambiguous, as Markov processes with power-law statistical properties can exhibit the signatures of long-range memory, including a slowly decaying auto-correlation [4,5,6,7,8,9,10,11,12]. Financial markets, in particular, provide empirical limit order book (LOB) data that exhibit these power-law statistical properties [13].
From the perspective of econophysics, it is essential to provide microscopic interpretations of econometric models, which typically serve as macroscopic descriptions of complex social systems. These models frequently rely on assumptions of self-similarity and long-range dependence. Empirical data analysis is essential for verifying these assumptions before macroscopic models are applied. Nonlinearity in Markov processes can produce statistical properties that are usually regarded as long-range dependence in physical and social systems [14].
The order-splitting behavior of financial market traders appears in empirical order book data as long-range persistence [15,16]. Thus, we have evidence of genuine long-range dependence in financial systems, which has to be accounted for in comprehensive modeling. The order-splitting behavior of traders should be evident in the sequence of submitted limit orders. In this contribution, we demonstrate the critical role of assumptions regarding self-similarity and illustrate how a straightforward model of opinion dynamics based on the discrete autoregressive fractionally integrated moving average (ARFIMA) series can challenge these assumptions. Our proposed model is empirically grounded in the order imbalance time series from financial markets [17,18] and can serve as a theoretical interpretation of empirical findings. This contribution concludes our research on order flow imbalance dynamics and generalizes the previous findings as a model of opinion dynamics. Here, we interpret the empirically observed power law of cancellation times as a consequence of the heterogeneity of trading agents. We use the Kolmogorov–Smirnov (KS) statistic to test the self-similarity of the investigated time series.
Section 2 provides a brief overview of the limit order time series, serving as the foundation for a broader interpretation of opinion dynamics. In Section 3, we present a model of power-law waiting times arising from a system of heterogeneous agents. Section 4 offers evidence of the broken self-similarity assumption when opinion cancellation is included in the model. Finally, we discuss our results and offer conclusions in Section 5.
2. Modeling Limit Order Flow and/or Opinion Dynamics
In our recent work [17], we analyzed the limit order flow of the market and denoted it as
:
where
represents the volume of the submitted limit order. We examined
through the lens of the ARFIMA process, as the probability density functions (PDFs) of order volumes
exhibit power-law tails. We documented fluctuations in the memory parameter for various stocks, finding values in the range of
. Despite the rough approximation of the PDF of volumes
by the Lévy stable distribution, the time series
can be considered fractional Lévy stable motion (FLSM)-like. Though we deal with a discrete time series and use event time in modeling the empirical series, the more general concept of FLSM seems appropriate here. In the limit, the aggregated discrete-time ARFIMA process converges to either fractional Brownian motion or fractional Lévy stable motion [19].
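As an illustration of the kind of series discussed above, the following minimal sketch generates an ARFIMA(0,d,0) sequence driven by symmetric Lévy stable noise and accumulates it into a flow. It is not the authors' implementation: the truncated moving-average representation, the Chambers-Mallows-Stuck sampler, the function names, and the values d = 0.3 and alpha = 1.8 are all assumptions chosen for illustration, and the flow is assumed to be the running sum of the signed volumes.

```python
import itertools
import math
import random


def symmetric_stable(alpha, rng):
    """One draw from a standard symmetric alpha-stable law (Chambers-Mallows-Stuck)."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(u)  # Cauchy special case
    return (math.sin(alpha * u) / math.cos(u) ** (1 / alpha)
            * (math.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))


def arfima_0d0(n, d, alpha, seed=0, n_lags=500):
    """ARFIMA(0,d,0) via a truncated MA(infinity) expansion of (1 - B)^(-d)."""
    rng = random.Random(seed)
    # psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k
    psi = [1.0]
    for k in range(1, n_lags):
        psi.append(psi[-1] * (k - 1 + d) / k)
    noise = [symmetric_stable(alpha, rng) for _ in range(n + n_lags)]
    return [
        sum(psi[k] * noise[t + n_lags - k] for k in range(n_lags))
        for t in range(n)
    ]


# Illustrative (hypothetical) parameter values, not taken from the source.
volumes = arfima_0d0(n=500, d=0.3, alpha=1.8, seed=1, n_lags=200)
flow = list(itertools.accumulate(volumes))  # assumed running sum of signed volumes
print(len(flow), flow[-1])
```

The truncation length n_lags trades accuracy of the long memory for speed; a production generator would more likely use an FFT-based convolution.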
The series
serves as a macroscopic measure of opinion in the order flow, exhibiting long-range dependence due to the heterogeneity of the agents. However, a more comprehensive measure of traders’ macro opinion should incorporate events of order cancellation and execution. Therefore, we explore an alternative sequence of order flow:
where the first sum includes all live limit orders, encompassing all limit order volumes
submitted before event
j and awaiting cancellation or execution. A sequence of limit order submissions of length
N generates a series of order imbalance
of length
, as each submission pairs with a cancellation or execution event. Notably,
differs significantly from
:
is bounded while
is unbounded. Additionally, our evaluation of the memory parameter
d for
using ARFIMA assumptions yielded contradictory results. Our previous work [17] concluded that the time series defined in Equation (2) does not exhibit FLSM-like properties. Consequently, the persistence of the limit order submission flow, that is, its long-range dependence, is concealed from econometric methods when analyzing the time series of order imbalance
.
To reinterpret the order imbalance series
, we introduced a q-extension of the geometric distribution, the discrete q-exponential probability mass function (PMF) [17]:
The connection with generalized Tsallis statistics [20] supports the choice of this power-law distribution. The PMF in Equation (3) fits the empirical limit order cancellation times well. Our analysis [17] showed that the fitted parameters have low sensitivity to order sizes and price levels. The fitted parameters,
, and
, are consistent across stocks and trading days analyzed.
One could consider the stochastic queuing model, in which tasks are executed based on a continuous-valued priority [21,22], as an explanation of the power-law waiting time distribution. We propose an alternative line of reasoning and, in Section 3, derive the power law of the waiting time distribution from the heterogeneity of agents.
We propose a relatively simple limit order imbalance model based on a fractional Lévy stable limit order inflow and the discrete q-exponential lifetime distribution. This model, derived from empirical analysis [17], may also be applicable to the modeling of other social systems.
Extending the model’s interpretation, we consider it a version of opinion dynamics applicable to other social systems. Originally, the model consisted of two random series: (a) A series of submitted limit order volumes
, which we generate as a discrete-time process ARFIMA{0,d,0}{α, N}, where d denotes the memory parameter, α the stability index, and N the series length. (b) An independent series of the same length for the limit order cancellation times, generated using the PMF defined in Equation (3).
In the extended interpretation of the model, the signed volume of a submitted order represents the opinion weight: positive for a buy and negative for a sell. The limit order cancellation time, measured in event space, represents the opinion lifetime. While initially designed for analyzing financial market order flow, this extended interpretation can be applied to other instances of weighted opinions in social systems. The model exemplifies a simple time series constructed from an ARFIMA sequence that nevertheless exhibits properties beyond the assumption of self-similarity.
With these independent discrete-time sequences, we calculate the model time series
defined by sequence
(see Equation (
2)). Here, the opinion submission event number
and its cancellation event number
are determined for each
of sequence (a) and the corresponding discrete-time interval
k of sequence (b). The generated discrete-time series represents an artificial analog of the empirical order imbalance, comparable with order flow data in financial markets [
17]. We achieved good correspondence with empirical data by choosing the artificial model parameters:
;
;
[
17]. For other applications, the model can be simplified by replacing the sequence
with unit weights:
We denote these series with an additional index
S, for example,
. Another simplification involves choosing
in Equation (
3), yielding a geometric distribution:
where
. The geometric distribution, as a discrete version of the exponential distribution, is a common choice for waiting times in many physical and social systems.
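The construction described in this section can be sketched as follows for the simplified case of unit weights and geometric lifetimes. This is a hedged illustration rather than the authors' procedure: the event clock used here (submission i occurs at time i, its cancellation at time i + k, with cancellations ordered before submissions at equal times) is an assumption, as are the function names, the standard parametrization P(k) = p(1 - p)^(k-1) for k = 1, 2, ..., the value p = 0.1, and the use of independent random signs in place of the signs of an ARFIMA series.

```python
import math
import random


def geometric_lifetime(p, rng):
    """Inverse-transform draw of k = 1, 2, ... with P(k) = p * (1 - p)**(k - 1), 0 < p < 1."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return 1 + int(math.log(u) / math.log(1.0 - p))


def imbalance_series(weights, lifetimes):
    """Running sum of live opinion weights after each submission or cancellation event."""
    events = []
    for i, (v, k) in enumerate(zip(weights, lifetimes)):
        events.append((i, 1, v))       # submission adds weight v
        events.append((i + k, 0, -v))  # cancellation removes it k steps later
    # Assumed ordering: by time, cancellations before submissions on ties.
    events.sort(key=lambda e: (e[0], e[1]))
    series, total = [], 0
    for _, _, dv in events:
        total += dv
        series.append(total)
    return series


rng = random.Random(42)
n = 1000
signs = [rng.choice((-1, 1)) for _ in range(n)]            # placeholder unit weights
lifetimes = [geometric_lifetime(0.1, rng) for _ in range(n)]
y_s = imbalance_series(signs, lifetimes)
print(len(y_s), min(y_s), max(y_s))
```

Because every submission is paired with exactly one cancellation, N submissions yield a series of 2N events that returns to zero at the end, which is what keeps the imbalance bounded while the cumulative inflow is not.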
3. Heterogeneity of Agents and Power-Law of Waiting Time
While power-law waiting times are observed in stochastic queueing models with continuous-valued priorities [21,22], another plausible explanation for this phenomenon is the heterogeneity of trading agents. In financial markets, agents manage a diverse range of assets, leading to substantial variability in their trading activities and the lifetimes of their orders. To address this, we propose a model that uses agent heterogeneity to derive a power-law distribution for limit order cancellation times.
Consider n categories of agents, each with different rates for limit order submission and cancellation. The lowest rate is one limit order per trading day (the duration of the time series under investigation). Let us denote this probability as . Agents who submit two orders per day have a probability , and agents submitting i limit orders have a probability . The most active traders, who submit n limit orders, have a probability .
Under this framework, the waiting (cancellation) time for agents in the
i-th category follows a geometric distribution with the PMF given by
. To obtain the overall PMF for the ensemble of agents, we average this distribution over all categories. According to our assumptions, the arrival probability of orders from an agent category is proportional to its index i, while the number of agents in the category is inversely proportional to i, in line with Zipf’s law. The two factors cancel, so all categories contribute equally, and the PMF for the entire ensemble of agents can be expressed as follows:
To understand the result given by Equation (
7), consider the limit as
:
which reveals a power-law with an exponent
. This power-law nature is illustrated in
Figure 1, alongside partial sums defined by
where
and
.
Our assumptions, incorporating Zipf’s law, lead to a power-law of cancellation (waiting) time that is exponentially stretched on both sides. This restriction arises from the fixed number of agent categories
n or the related number of opinions (orders) submitted,
. Crucially, the power-law exponent in Equation (
7) is
. Given the relationship between the
q-exponential distribution and the Pareto distribution, this implies an exponent
, as empirically determined in [17]. Therefore, the presented derivation of the PMF for waiting times in a heterogeneous agent ensemble supports the conclusion that this power-law exponent is a stylized fact in financial markets. Further empirical studies of cancellation times using the proposed PMF in Equation (7) would be valuable.
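The averaging argument of this section can be checked numerically with a short sketch. The per-category probabilities are not preserved in the text above, so the code assumes p_i = i/n for category i, with equal weights for all n categories; the function name is hypothetical. Under that assumption the equal-weight average is a Riemann sum of x(1 - x)^(k-1) over [0, 1], which tends to 1/(k(k + 1)) as n grows, that is, a tail decaying as the inverse square of k.

```python
import math


def ensemble_pmf(k, n):
    """Equal-weight average of geometric PMFs with assumed p_i = i / n, i = 1..n."""
    return sum((i / n) * (1 - i / n) ** (k - 1) for i in range(1, n + 1)) / n


n = 10_000
for k in (1, 10, 100):
    limit = 1.0 / (k * (k + 1))  # large-n limit under the assumption above
    print(k, ensemble_pmf(k, n), limit)

# Local log-log slope between k and 2k, well below the cutoff near k ~ n.
k = 50
slope = math.log(ensemble_pmf(2 * k, n) / ensemble_pmf(k, n)) / math.log(2)
print("local slope:", slope)
```

For finite n the slowest category has p = 1/n, so the power-law range ends with an exponential cutoff at waiting times of the order of n.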
4. Self-Similarity Analysis of Proposed Model
Building upon previous efforts to unravel long-range dependence in social systems [14], it is crucial to juxtapose macroscopic descriptions with empirical data and agent-based modeling. Extensive empirical studies of volatility, trading activity, and order flow in financial markets have solidified the foundation for examining long-range memory properties [15,16,23,24,25,26,27]. Various econometric models based on fractional noise have been proposed to represent time series of financial variables [23,28,29,30,31,32,33]. Yet, from an econophysics perspective, these models often serve merely as macroscopic interpretations of complex social phenomena, frequently relying on ad hoc assumptions of long-range memory. Despite advancements in trading algorithms and machine learning, predicting stock price movements remains a formidable challenge for researchers [34,35,36].
In this section, we address the requirement of self-similarity, a cornerstone in modeling long-range dependence, within our proposed opinion dynamics model. Econometric methods commonly accept the assumption of self-similarity without scrutiny; however, a deeper examination is crucial [37].
Stochastic time series are often assumed to be self-similar if they satisfy certain scaling relations. For instance, a series
is self-similar if it holds that
, where ∼ indicates identical distributions for any
and
. Moreover, these series should exhibit stationary increments:
for any
and
. These processes, characterized by self-affine increments, follow the rule that
for any
[38]. All these properties are defined through equality in distribution; thus, the simplest estimation of H should also be based on distributional equality. By comparing these distributions, we can identify deviations from the self-similarity requirement.
Our model of limit order flow
and opinion imbalance
assumes stationary increments, as they stem from a Lévy stable distribution. We express the self-similarity condition as follows:
To compare distributions, we employ the KS two-sample test [39] and compute the KS distance D:
where
represents the cumulative empirical distribution functions for an integer sequence
and a corresponding sequence of
:
From the definition of self-similarity (
11), we expect consistent values of
H that minimize
D for any
. Diverse values of
H across different
suggest a failure to meet the self-similarity criterion.
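The procedure described above can be sketched as follows. Since the exact form of the test statistic is not preserved in the text, this sketch makes several assumptions: it uses the standard two-sample KS distance (the maximum gap between two empirical distribution functions), compares increments at a given lag, rescaled by the lag raised to the power -H, against increments at lag 1, and scans H over a grid. The function names and the Gaussian random walk used as a test series are illustrative only.

```python
import random


def ks_distance(a_sorted, b_sorted):
    """Two-sample KS distance between two already sorted samples."""
    i = j = 0
    na, nb = len(a_sorted), len(b_sorted)
    d = 0.0
    while i < na and j < nb:
        x = min(a_sorted[i], b_sorted[j])
        while i < na and a_sorted[i] <= x:
            i += 1
        while j < nb and b_sorted[j] <= x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    return d


def increments(x, lag):
    return [x[t + lag] - x[t] for t in range(len(x) - lag)]


def best_hurst(x, lag, h_grid):
    """Return (H, D) minimizing the KS distance for the given lag."""
    base = sorted(increments(x, 1))
    lagged = sorted(increments(x, lag))  # positive rescaling keeps the order
    best = None
    for h in h_grid:
        scale = lag ** (-h)
        d = ks_distance([v * scale for v in lagged], base)
        if best is None or d < best[1]:
            best = (h, d)
    return best


# Test series: a Gaussian random walk, which is self-similar with H = 1/2.
rng = random.Random(7)
walk = [0.0]
for _ in range(20_000):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

grid = [round(0.05 * m, 2) for m in range(2, 19)]  # H = 0.10 ... 0.90
results = {lag: best_hurst(walk, lag, grid) for lag in (4, 16)}
for lag, (h, d) in results.items():
    print(f"lag={lag}: H={h}, D={d:.4f}")
```

For a self-similar series the minimizing H should agree across lags up to sampling noise; a systematic drift of the minimizing H with the lag is the signature of broken self-similarity discussed in this section.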
Our opinion dynamics model provides a useful case study to illustrate how introducing opinion cancellation disrupts a self-similar series of opinion inflow
. We simulate the ARFIMA series
with the following parameters:
,
,
200,000, alongside a series of opinion durations (waiting times)
using parameters
and
. We then generate the series
of opinion imbalance (
2) with a length of
.
In Figure 2, we compare numerically calculated KS distances
, Equation (
12), as functions of
H for various series, demonstrating that, while the series
maintains self-similarity, the series
does not, as evidenced by the range of
H values obtained for different
. Even when simplifying the model to only consider signs of volumes, the KS distance
is less sensitive to
H, supporting the conclusion that, while
can be considered self-similar,
is not. In our view, this self-similarity check should be applied to any observed time series.
While researchers employ various methodologies to estimate the self-similarity parameter H of observed time series, the self-similarity assumption itself is often left unvalidated [37]. Greater attention should be devoted to developing and refining methods that rigorously test this assumption. In particular, the method we propose here, while robust for complex models, is less accurate when applied to simplified series such as
and
, where the numerically calculated functions
display a fractured structure indicative of potential method inadequacies.
In Table 1, we list the Hurst parameter evaluation results using diverse methodologies for the model series
,
,
, and
. Further details on the estimation of the mean square displacement (MSD) and of H using different methods, such as the Absolute Value Estimator (AVE) or Higuchi’s method, are given in [17,18]. These results show that formally evaluated Hurst parameters can yield misleading conclusions regarding persistence and long-range dependence. Although all series were generated with the same memory parameter d, a correct interpretation of self-similarity is essential for accurately understanding memory effects in these time series.
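To make the notion of a formally evaluated Hurst parameter concrete, the following sketch implements the textbook absolute-value (aggregation) method: the increment series is averaged over blocks of size m, and the mean absolute deviation of the block means is assumed to scale as m^(H-1). This is only a generic version of the idea; the estimators actually used for Table 1 are described in [17,18] and may differ in detail. The function name, the block sizes, and the Gaussian test noise are assumptions.

```python
import math
import random


def absolute_value_hurst(x, block_sizes):
    """Estimate H from the scaling of mean absolute block averages of a noise series."""
    mean = sum(x) / len(x)
    points = []
    for m in block_sizes:
        n_blocks = len(x) // m
        am = sum(
            abs(sum(x[b * m:(b + 1) * m]) / m - mean) for b in range(n_blocks)
        ) / n_blocks
        points.append((math.log(m), math.log(am)))
    # Ordinary least-squares slope of log(AM) against log(m).
    mx = sum(p[0] for p in points) / len(points)
    my = sum(p[1] for p in points) / len(points)
    slope = (sum((px - mx) * (py - my) for px, py in points)
             / sum((px - mx) ** 2 for px, _ in points))
    return 1.0 + slope


rng = random.Random(3)
noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]  # uncorrelated test noise
sizes = [4, 8, 16, 32, 64, 128, 256]
h_est = absolute_value_hurst(noise, sizes)
print("estimated H:", h_est)
```

The estimator returns a number for any input series, whether or not the series is self-similar, which is why a distributional check of the scaling assumption is a useful complement to such estimates.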
In conclusion, the discrete-time model of artificial order imbalance and opinion dynamics provides valuable insights into the statistical properties of financial market limit order flow and imbalance. The comparison with empirical data underscores the utility of the model. The power-law nature of the limit order cancellation time distribution contributes significantly to the correct interpretation of order imbalance time series.
5. Discussion and Conclusions
In our previous work [17], we introduced a discrete q-exponential distribution, as outlined in Equation (3). This q-extension of the geometric distribution is directly related to the theoretical foundations of generalized Tsallis statistics [20]. Empirical validation of this model on limit order cancellation times across ten different stocks and trading days demonstrated its robustness, with the fitted q-exponential PMF parameter
independent of the other order properties. This model aligns with the second-class Pareto distribution, which is known to exhibit a power-law tail with an exponent
[40]. In this contribution, we use a heterogeneous agent model to explain this distinctive power-law characteristic.
Our approach categorizes trading agents based on their activity within selected intervals, such as one trading day, leading to
n categories where
represents the number of limit orders submitted per agent of the category. Given that each order is canceled or executed, it is natural to model the lifetime
k of orders from each agent group
i with a geometric PMF
. Assuming the number of agents in each group
i is inversely proportional to the group’s index, consistent with Zipf’s law, the probabilities
contribute equally when averaging waiting times across all agent categories. This leads to the explicit form of the PMF of cancellation (waiting) times defined in Equation (7), which captures the empirically observed power-law behavior of limit order cancellation times [17].
We further expand and generalize this model by integrating two independent random sequences, the ARFIMA{0,d,0}{α, N} and
from Equation (
3), to form the imbalance series
, portraying opinion dynamics. This model not only explains the properties of limit order imbalance in financial markets but also offers insights into the complexity of long-range dependence observed in various social systems [14,18,41].
The proposed model serves as an example of a time series with hidden long-range dependence. We therefore propose a self-similarity test and demonstrate that the opinion imbalance series are not self-similar. Though the result is predictable, the proposed method may be useful for checking other empirical time series before widely accepted methods of self-similar series analysis are applied.
This study advances our understanding of order imbalance and memory in financial markets. By combining the ARFIMA series with q-exponentially distributed waiting times, we provide a framework for modeling complex behaviors in social systems. Our findings help bridge the gap between theoretical constructs and empirical observations and open the way for future research aimed at developing more precise models and gaining deeper insights into financial market dynamics and beyond.