1. Introduction
Alongside globalization, the transfer of knowledge among different fields is a characteristic of our times, resulting in the emergence and strengthening of border domains. Among the most interesting of these is econophysics. Our field of interest is the second law of thermodynamics and its further developments, generated by the dispute between Max Planck and C. Caratheodory [1,2], which led to the crystallization of the entropy concept. The entropy concept was introduced by L. Boltzmann (1844–1906), and the formula $S = k \log W$, representing the dependency of the entropy $S$ on the probability $W$, was engraved on his gravestone [3]. It requires a macro-system with known state probabilities $(p_1, p_2, \dots, p_n)$, subject to the restriction $\sum_{i=1}^{n} p_i = 1$. The debates and emulation inspired by the Gibbs paradox, which showed a deviation from the second law of thermodynamics, generated a different approach to entropy [4].
Considered the father of information theory, Claude Shannon proposed a new version of entropy calculus [5,6,7]:
$$H = -\sum_{i=1}^{n} p_i \log p_i, \quad \text{with} \quad \sum_{i=1}^{n} p_i = 1.$$
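As a minimal numerical illustration (our own, not part of the original computations), Shannon entropy can be evaluated for any discrete probability vector:

```python
import math

def shannon_entropy(p):
    """Shannon entropy H = -sum(p_i * log(p_i)) of a discrete distribution.
    Terms with p_i = 0 contribute nothing (the limit of p*log p is 0)."""
    assert abs(sum(p) - 1.0) < 1e-9, "probabilities must sum to 1"
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Entropy is maximal (log n) for the uniform distribution ...
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # = log 4 ~ 1.386
# ... and zero when the whole mass sits in a single state.
print(shannon_entropy([1.0, 0.0, 0.0, 0.0]))      # = 0.0
```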
Over time, Shannon entropy has diversified in type and field of application, with variants such as relative entropy, entropy of a composed system, conditional entropy, entropy of a correlated system, Tsallis entropy, Sharma-Taneja-Mittal entropy, Kaniadakis entropy, and Abe entropy [2,8,9,10]. For a while, the two fields evolved separately, but more recently papers have been published that aim to harmonize and combine econophysics with different areas of economics. The most spectacular evolution has been in the financial field. In this regard, Jovanovic and Schinckus [11] developed some relevant features regarding the interplay of econophysics and finance, using accessible language and presenting a combination of methods that helps the development of both domains. Thus, they combine probability theory with empirical data processing in the financial field. In the same sense, Ausloos, Jovanovic and Schinckus [12] conclude that econophysics and finance are not irreconcilable domains: both can progress by identifying procedures and models for the benefit of the economic sciences, and there are similar approaches, with positive results, in handling financial phenomena. The authors also bring a number of methodological clarifications. In parallel with the development of the entropy indicator, a new group of indicators came to be used in the analysis of the structure of economic systems. One category of methods is the econometric one, for example the Cobb-Douglas production function [13]:
$$Q_t = A \, L_t^{\beta} K_t^{\alpha}.$$
In this equation:
$Q_t$—total production output (the real value of all goods produced in a year);
$L$—labor input (the total number of person-hours worked in a year);
$K$—capital input (the real value of all machinery, equipment, and buildings);
$A$—total factor productivity;
$t$—time;
$\alpha$ and $\beta$—the output elasticities of capital and labor, respectively (a minimal numerical sketch follows).
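To make the formula concrete, here is a brief numerical sketch; the parameter values are arbitrary illustrations, not estimates from any data set:

```python
def cobb_douglas(L, K, A=1.0, alpha=0.3, beta=0.7):
    """Cobb-Douglas output Q = A * L**beta * K**alpha.
    alpha, beta: output elasticities of capital and labor, respectively."""
    return A * (L ** beta) * (K ** alpha)

# With alpha + beta = 1, the function has constant returns to scale:
# doubling both inputs doubles output.
q1 = cobb_douglas(L=100.0, K=50.0)
q2 = cobb_douglas(L=200.0, K=100.0)
print(q1, q2, q2 / q1)  # the ratio is exactly 2.0
```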
Also framed in this category of methods is the Input-Output model proposed by W. Leontief [14,15,16]. Vitanov [17] discusses the complex dynamics of science structures and systems, arguing that the evaluation of research productivity requires a combination of qualitative and quantitative methods and measures. Dimitrova and Vitanov [18] investigate how the adaptation of the competition coefficients of populations competing for the same limited resource influences the system dynamics in the regions of the parameter space where chaotic motion of the Shilnikov kind exists. Vitanov and Ausloos [19] present a rich inventory of dynamic models based on the behavior of groups of scientists, suitable for describing the emergence and spreading of new ideas in a competitive process. The authors also discuss the role of fluctuations during the emergence of innovation, and when it is best to turn from deterministic models to more complex stochastic models.
Among the other specific methods used in the analysis of the structure of macroeconomic systems, some are quite significant. For example, the Herfindahl index of regional specialization [20,21] is useful in analyzing the geographical distribution of territorial-administrative indicators, or specializations in the economic sector. The Krugman index [22], called the K-spec index in the economic literature, assumes dividing a country into geographical regions, or macro neighborhoods in the border areas of the European Union. For a region, the K-spec index characterizes the contrasts that exist between the structure of the workload in that region and the economy of the defined area. The converted Gini index [23,24] is a statistical measure used for analyzing the concentration among the values of a frequency distribution. The benefit of this index is that it also applies to qualitative series (for example, production distribution by NACE activity sectors, income distribution by administrative subdivisions, etc.). This coefficient is calculated as the ratio between the average of the absolute deviations and the arithmetic mean of the items. The Gini index may also be computed from a chart, as twice the surface of the concentration area. The Gini index calculated against per capita income is used to define types of countries, such as the Organization for Economic Co-operation and Development (OECD) countries, the countries of Latin America, and the countries of Eastern Europe. The Theil index [25,26,27] is a statistical indicator inspired by the entropy measure: calculated for an uncertain event characterized by a probability vector, it is defined as the difference between the maximum value of the event’s entropy and its actual entropy. The Theil index is directly proportional to the concentration of the distribution values; if the distribution values are equiprobable, meaning the concentration is at its minimum, then the value of this index is zero.
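Under this definition, the Theil index of a discrete probability vector is $T = \log n - H$, where $H$ is the Shannon entropy and $\log n$ its maximum value; a minimal sketch (our own helper, self-contained):

```python
import math

def theil_index(p):
    """Theil index T = log(n) - H(p): maximum entropy minus actual entropy.
    T = 0 for the uniform (equiprobable) vector; it grows with concentration."""
    n = len(p)
    h = -sum(pi * math.log(pi) for pi in p if pi > 0)
    return math.log(n) - h

print(theil_index([0.25, 0.25, 0.25, 0.25]))  # 0.0 -> minimal concentration
print(theil_index([0.7, 0.1, 0.1, 0.1]))      # ~0.45 -> concentrated
```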
The length of the structural vector $x = (x_1, x_2, \dots, x_n)$ [28] is defined by $\|x\| = \sqrt{\sum_{i=1}^{n} x_i^2}$, or, computed on the weights $p_i = x_i / \sum_{j=1}^{n} x_j$, by $\|p\| = \sqrt{\sum_{i=1}^{n} p_i^2}$, with the limits $1/\sqrt{n} \le \|p\| \le 1$, the bounds being reached for the uniform structure and for total concentration, respectively.
It is estimated that the concentration is directly proportional to the indicator value, but it is recommended to take two observations into consideration: first, the indicator size is strongly influenced both by the number of groups into which the population is divided and by the amplitude of the distribution; second, the indicator is expressed in the measurement unit of the characteristic. For this reason, it cannot be used to compare the concentration of the population units relative to different characteristics.
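A short sketch of this indicator under the normalized-weights reading reconstructed above (our interpretation of a garbled formula in the source, so treat the exact normalization as an assumption):

```python
import math

def structure_vector_length(values):
    """Length of the structure vector built from the weights p_i = x_i / sum(x).
    Ranges from 1/sqrt(n) (uniform structure) to 1 (total concentration)."""
    total = sum(values)
    weights = [v / total for v in values]
    return math.sqrt(sum(w * w for w in weights))

data = [120.0, 80.0, 50.0, 50.0]
n = len(data)
# lower bound, indicator value, upper bound
print(1 / math.sqrt(n), structure_vector_length(data), 1.0)
```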
The Lorenz curve [29,30] is used to characterize the diversification or concentration of information in an economic system. The concentration curve was first used in economic analysis by Atkinson, to measure income distribution and redistribution, and in time it became recognized as the “gold standard”. It is constructed differently for discrete and for continuous data. Consider a quantitative characteristic $X$ defined by the distribution $(x_i, n_i)$, $i = 1, \dots, k$, where $x_i$ is a value, a group, or an interval of values of the characteristic $X$, and $n_i$ is its absolute frequency. In order to analyze the concentration of the values of variable $X$ starting from this distribution, two sets of relative frequencies are calculated, transforming the distribution for the analysis of the concentration degree: the frequency weights $f_i = n_i / \sum_{j=1}^{k} n_j$ and the value weights $g_i = x_i n_i / \sum_{j=1}^{k} x_j n_j$.
To draw the curve for a data series with known values of the characteristic and their direct cumulates, the following steps are taken:
The geographic regions are arranged in increasing order of the per-unit values of the characteristic (the ratio $x_i / n_i$);
For each variable, the relative frequencies and the cumulative relative frequencies are calculated;
The Lorenz curve is drawn by joining the points $(F_i, G_i)$, where $F_i = \sum_{j \le i} f_j$ and $G_i = \sum_{j \le i} g_j$.
The concentration area, between the first bisector and the concentration curve, is a much more effective measure for characterizing revenue distribution than the simple graphical representation. The proposed procedure seeks to highlight changes in the structure of an ensemble measured by weights with sub-unitary values whose sum is not restricted to equal 1, as it is for other indicators with the same purpose, such as entropy or the Gini coefficient.
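The steps above translate directly into code. The following sketch is our own illustration with made-up values; it builds the cumulative coordinates of the Lorenz curve (for the case of individual observations, $n_i = 1$) and measures the concentration area by the trapezoidal rule:

```python
def lorenz_points(values):
    """Return the (F_i, G_i) points of the Lorenz curve for positive values."""
    xs = sorted(values)                      # step 1: increasing order
    n, total = len(xs), sum(xs)
    F, G, points = 0.0, 0.0, [(0.0, 0.0)]
    for x in xs:                             # steps 2-3: cumulate and join
        F += 1.0 / n                         # cumulative frequency weight
        G += x / total                       # cumulative value weight
        points.append((F, G))
    return points

def concentration_area(values):
    """Area between the first bisector and the Lorenz curve (trapezoidal rule).
    The Gini coefficient equals twice this area."""
    pts = lorenz_points(values)
    area_under = sum((f2 - f1) * (g1 + g2) / 2.0
                     for (f1, g1), (f2, g2) in zip(pts, pts[1:]))
    return 0.5 - area_under

print(concentration_area([10.0, 10.0, 10.0]))     # 0.0  -> perfect equality
print(concentration_area([1.0, 2.0, 3.0, 10.0]))  # ~0.22 -> concentration
```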
3. Results
3.1. A Particular Case
We will analyze the particular case where the parameter takes a fixed value (hence, the distribution is completely specified), in order to outline the difficulties met when calculating the distribution function in the general case. Indeed, taking into account a known series expansion ([36], p. 133), the expression of the distribution function follows explicitly. Even for this simple case, the calculus of the theoretical median, for instance, leads to a transcendental equation, which does have a solution on the interval (0,1), as can be checked by comparing the two sides of the equation at the endpoints of the interval. The solution is unique, as we can prove that the graphs of the two curves intersect at a single point on the interval (0,1).
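The actual distribution function comes from Equations (9) and (10), which are not reproduced in this excerpt; purely as an illustration of the numerical step, the median equation $F(m) = 1/2$ can be solved with a bracketing root-finder once $F$ is implemented. The CDF below is a stand-in (a Beta(2,2) law), not the paper’s:

```python
from scipy.optimize import brentq

def F(x):
    """Stand-in CDF on (0, 1); replace with the distribution function of
    Equations (9)-(10). Here: F(x) = x**2 * (3 - 2*x), the Beta(2,2) CDF."""
    return x * x * (3.0 - 2.0 * x)

# The median solves F(m) = 1/2; brentq needs a sign change on the bracket,
# which holds because F(0) = 0 < 1/2 < 1 = F(1) and F is increasing.
median = brentq(lambda x: F(x) - 0.5, 1e-9, 1.0 - 1e-9)
print(median)  # 0.5 for this symmetric stand-in
```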
The value of the parameter is obtained, for instance, from the estimation relation given by the method of moments, and is higher than the “critical level 0.125”, in accordance with Equations (9) and (10) [37].
Consequently, it is clear that the various problems related to the direct use of the distribution function (quantile assessment, natural tolerance, etc.) must be treated for particular values of the parameter in order to finalize the computations. In the next sections, we develop the analysis using the sequential method.
3.2. Verifying Certain Statistical Hypotheses
Verifying a simple statistical hypothesis $H_0\colon \theta = \theta_0$, with the alternative $H_1\colon \theta = \theta_1$, represents a hypothesis on the stability of the system structure at a given moment [33,37]. By using the likelihood ratio test for a single observation, we deduced the equation providing the decision constant of the test according to $\theta_0$ and $\theta_1$ and to the significance level $\alpha$:
In this equation, the transcendental term can be approximated either by a first-degree polynomial, leading on the left-hand side to a parabola, or by a second-degree polynomial, leading on the left-hand side to a fourth-degree polynomial curve. Again, the particular case considered above proves interesting, since we have to show that the polynomial intersects the hyperbola only once on the interval (0,1); this can be checked from the signs of the difference of the two functions at the endpoints of the interval.
In this case, we would like to use successive sequences of observations; therefore, it is better to use the sequential probability ratio test (SPRT), the Wald procedure (the developments in the area of theoretical and applied sequential analysis have sustained a dedicated journal since 1984: Sequential Analysis: Design Methods and Applications), in which the likelihood ratio of the accumulated observations is compared with two decision constants; taking logarithms leads to the following region. If $\alpha$ and $\beta$ are the risks associated with the two hypotheses, and $A$ and $B$ are the decision constants of the sequential test, $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$, then the uncertainty (continuation) area of the experiment is given by the double inequality:
$$\ln B < \sum_{i=1}^{n} \ln \frac{f(x_i, \theta_1)}{f(x_i, \theta_0)} < \ln A.$$
Here, $(x_1, x_2, \dots, x_n)$ is the sequential sample.
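A compact sketch of the Wald SPRT decision rule described above. The accumulation and the constants $A$ and $B$ follow the text; the pair of Gaussian hypotheses in the example is our stand-in, not the paper’s distribution:

```python
import math
import random

def sprt(samples, llr, alpha=0.05, beta=0.10):
    """Wald's SPRT: accumulate log-likelihood ratios until one of the
    decision thresholds A = (1-beta)/alpha, B = beta/(1-alpha) is crossed."""
    log_A = math.log((1 - beta) / alpha)
    log_B = math.log(beta / (1 - alpha))
    s = 0.0
    for n, x in enumerate(samples, start=1):
        s += llr(x)
        if s >= log_A:
            return "accept H1", n
        if s <= log_B:
            return "accept H0", n
    return "undecided", len(samples)  # still inside the uncertainty region

# Stand-in hypotheses: H0: N(0,1) vs H1: N(1,1), for which llr(x) = x - 0.5.
random.seed(0)
data = [random.gauss(1.0, 1.0) for _ in range(200)]  # generated under H1
print(sprt(data, lambda x: x - 0.5))
```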
3.3. The Sequential Comparison
If we have two systems characterized by parameters $\theta_X$ and $\theta_Y$, then, from a practical point of view, it is interesting to compare the levels of the respective parameters. This reduces to verifying the compound hypothesis $H\colon \theta_X < \theta_Y$ versus the alternative $H'\colon \theta_X > \theta_Y$.
Girshick [38] proposed an SPRT test as follows: let $X$ and $Y$ be the two systems, or the same system in two periods (in our case), characterized by the densities $f(x, \theta_X)$ and $f(y, \theta_Y)$, respectively. We choose two values, $\theta_1$ and $\theta_2$ with $\theta_1 < \theta_2$, and let $H$ be the statistical hypothesis that the joint distribution of the variables $X$ and $Y$ has the form $f(x, \theta_1) f(y, \theta_2)$, with the alternative $H'$ that the joint distribution is $f(x, \theta_2) f(y, \theta_1)$. In other words, verifying $H$ versus $H'$ is reduced to verifying $(\theta_X, \theta_Y) = (\theta_1, \theta_2)$ versus $(\theta_X, \theta_Y) = (\theta_2, \theta_1)$.
By using appropriate notations for the two joint densities, the likelihood ratio associated with the observation pairs $(x_i, y_i)$, $i = 1, \dots, n$, is, in our case:
$$\lambda_n = \prod_{i=1}^{n} \frac{f(x_i, \theta_2)\, f(y_i, \theta_1)}{f(x_i, \theta_1)\, f(y_i, \theta_2)}.$$
So, the uncertainty area is given by the relations:
$$B < \lambda_n < A.$$
Girshick showed that the verification of $H$ versus $H'$ can be reduced to a simpler test if there is a function $u(x, y)$ with certain properties (in particular, it changes sign when its arguments are swapped). This function, also called the Girshick function, can be considered a measure of the difference between $X$ and $Y$.
Vaduva [39] proved that, for the distributions of the type considered here, Girshick’s function has the form $u(x, y) = \pm\,[r(x) - r(y)]$, where (+) or (−) applies according as $r$ is increasing or decreasing.
In the case of our distribution, the corresponding function $r$ can be written explicitly. Hence, the resulting function fully meets the conditions required by Girshick’s method.
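As an illustration of the paired sequential comparison, here is a sketch with a stand-in model of exponential densities $f(x, \theta) = \theta e^{-\theta x}$, for which the per-pair log-likelihood ratio reduces to $(\theta_1 - \theta_2)(x_i - y_i)$, an antisymmetric function of the pair, in the spirit of the Girshick function:

```python
import math
import random

def girshick_sprt(pairs, theta1, theta2, alpha=0.05, beta=0.10):
    """Sequential comparison of two systems via paired observations (x_i, y_i).
    Stand-in model: exponential densities f(x, theta) = theta * exp(-theta*x),
    for which the per-pair log-likelihood ratio of H' against H is
    (theta1 - theta2) * (x - y)."""
    log_A = math.log((1 - beta) / alpha)
    log_B = math.log(beta / (1 - alpha))
    s = 0.0
    for n, (x, y) in enumerate(pairs, start=1):
        s += (theta1 - theta2) * (x - y)
        if s >= log_A:
            return "accept H'", n   # evidence that the parameters are swapped
        if s <= log_B:
            return "accept H", n
    return "undecided", len(pairs)

random.seed(1)
t1, t2 = 1.0, 2.0
pairs = [(random.expovariate(t1), random.expovariate(t2)) for _ in range(500)]
print(girshick_sprt(pairs, t1, t2))  # data generated under H -> accepts H
```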
4. Application and Conclusions
The European Union’s economy is growing stronger, year after year, as a leading player in the world economy. In 2015, the total gross domestic product (GDP) of the 28 member states exceeded 14,600 billion Euros, accounting for around 20% of world trade and making the EU, next to the US and China, the third world economic power.
We examine whether, within the group of the five largest economies in the European Union (Germany, Spain, France, Italy, United Kingdom), each with an annual GDP of more than 1000 billion Euros, there have been changes in the structure of the group.
Table 1 shows the data for the years 2003 and 2015.
In 2015, the group of the most powerful economies in the European Union held about 68% of the EU’s GDP, a decrease compared with 2003, and there were also structural movements within the group. For example, the weight of Spain increased (+52.05% in 2015 compared to 2003), while the weights of the remaining group members decreased, despite the increase of GDP in absolute value (+27.29% for the group of the five most developed countries, +39.51% across the European Union). Characterizing this structure with the methods of informational statistics (entropy, Gini coefficients, structure vectors, etc.) is not possible, because $\sum_{i=1}^{5} p_i \neq 1$, but the method outlined above can be applied. The mean values determined by Equation (3), the variances calculated by Equation (4), the parameters computed by Equation (10), and the other intermediate elements necessary for testing the hypotheses (16a) and (16b), and for determining the uncertainty intervals by Equations (22) and (24), are presented in Table 2.
Based on the information processed in Table 2, for the usual values of the type I and type II risks, $\alpha = 0.05$ and $\beta = 0.10$, with $A = (1-\beta)/\alpha = 18$ and $B = \beta/(1-\alpha) \approx 0.105$, the Wald test for the validation of the hypotheses (16a) and (16b) gives the following result: both through the interval established by Equation (22) and in version (24), the test statistic is placed between the limits of uncertainty.
The conclusion is that, in the group of the five developed European Union countries, there were no major changes in structure between 2003 and 2015. This decision is made at a type I risk of 5% and a type II risk of 10%.
The sequential test we chose for testing the hypothesis on the movement of the GDP structure of the group of developed countries in the European Union offers, unlike other possible testing variants, the advantage of working with an explicit uncertainty area. This is useful because GDP is also determined for sub-annual periods and for national territorial divisions (Nomenclature of Territorial Units for Statistics), as well as for neighboring regions (Euroregions), and it is interesting to follow the structural changes over time. The proposed method can also be used beyond the suggested application, for example, in characterizing the structure and its changes within economic activities (European Classification of Economic Activities) to highlight economic upturn through an increasing share of high value-added activities, or the changes over time in occupational structure (International Standard Classification of Occupations). Likewise, population structures can be analyzed by nationality, ethnic group, age group, and many other categories.