Article

Divergence and Similarity Characteristics for Two Fuzzy Measures Based on Associated Probabilities

by Gia Sirbiladze *, Bidzina Midodashvili and Teimuraz Manjafarashvili
Department of Computer Sciences, Ivane Javakhishvili Tbilisi State University, Tbilisi 0186, Georgia
* Author to whom correspondence should be addressed.
Axioms 2024, 13(11), 776; https://doi.org/10.3390/axioms13110776
Submission received: 26 August 2024 / Revised: 31 October 2024 / Accepted: 5 November 2024 / Published: 9 November 2024
(This article belongs to the Special Issue New Perspectives in Fuzzy Sets and Their Applications)

Abstract:
The article deals with definitions of the distance, divergence, and similarity characteristics between two finite fuzzy measures, which generalize the corresponding definitions for two finite probability distributions. As is known, a fuzzy measure can be uniquely represented by its so-called associated probability class (APC). The idea of the generalization is that the new definitions of distance, divergence, and similarity between fuzzy measures are reduced to the definitions of distance, divergence, and similarity between the APCs of the fuzzy measures. These definitions are based on the concept of a distance generator. A proof of the correctness of the generalizations is provided. The constructed distance, similarity, and divergence relations can be used in such applied problems as determining the difference between Dempster-Shafer belief structures; constructing collaborative filtering similarity relations; defining metrics on phase spaces of non-additive and interactive machine learning parameters; object clustering; classification; and other tasks. In this work, the new concept is used in the fuzzy measure identification problem for a certain multi-attribute decision-making (MADM) environment. For this, a conditional optimization problem is formulated with one objective function representing the distance, divergence, or similarity index. Numerical examples are discussed and a comparative analysis of the obtained results is presented.

1. Introduction

1.1. On the Information Measures of Distance for Two Probability Distributions

It is known that in statistics, probability theory, and information theory, the concept of distance quantitatively describes the proximity or similarity between two statistical objects; it can be reduced to two finite probability distributions and/or two statistical samples. Thus, the distance between two populations is interpreted as a measure of closeness between two finite probability distributions. Note that many of the statistical distances already defined are not metrics, and some of them lack the property of symmetry. Some types of distances, which generate so-called squared distances, are called (statistical) divergences.
When comparing finite probability distributions, the Mahalanobis [1], Bhattacharyya [2], and Hellinger [3] distance formulas are most commonly used; depending on the need, other distances may be used as well. The Mahalanobis distance is used in statistical analysis when the compared probability distributions have different means and variances. The Bhattacharyya distance is mainly used in the construction of face recognition processes, and the Hellinger distance is an important tool in text mining and document classification. Note that the Hellinger distance is often called the Jeffreys distance.
Let $X=\{x_1,\dots,x_n\}$ be a finite set and let $P(X)$ denote the space of all probability distributions defined on $X$. Let $P^{(1)}$ and $P^{(2)}$ be any two probability distributions from $P(X)$. Introduce the notations
$$P^{(1)}(x_i)=P_i^{(1)}\ \text{and}\ P^{(2)}(x_i)=P_i^{(2)},\quad i=1,\dots,n;\qquad \sum_{i=1}^{n}P_i^{(k)}=1,\quad k=1,2.$$
The following expression is called the Bhattacharyya coefficient of the distributions $P^{(1)}$ and $P^{(2)}$ [4]:
$$BC\left(P^{(1)},P^{(2)}\right)=\sum_{i=1}^{n}\left(P_i^{(1)}P_i^{(2)}\right)^{1/2}.$$
The Bhattacharyya distance between two probability distributions is defined as follows [4]:
$$D_B\left(P^{(1)},P^{(2)}\right)=-\log\left(BC\left(P^{(1)},P^{(2)}\right)\right)=-\log\left(\sum_{i=1}^{n}\sqrt{P_i^{(1)}P_i^{(2)}}\right).$$
$D_B(\cdot)$ is not a metric on $P(X)$, but it measures the similarity and closeness between two probability distributions. There is also the Bhattacharyya angle [4], which measures the distance between two probability distributions:
$$\Delta_B\left(P^{(1)},P^{(2)}\right)=\arccos\left(BC\left(P^{(1)},P^{(2)}\right)\right).$$
The Hellinger distance between two probability distributions is defined by the expression [3]:
$$D_H\left(P^{(1)},P^{(2)}\right)=\left[1-BC\left(P^{(1)},P^{(2)}\right)\right]^{1/2}=\sqrt{1-\sum_{i=1}^{n}\left(P_i^{(1)}P_i^{(2)}\right)^{1/2}}.$$
$D_H$ forms a bounded metric on $P(X)$, which also measures the similarity and closeness between two probability distributions.
The Mahalanobis distance measures the distance between a point $x=(x_1,\dots,x_n)\in\mathbb{R}^n$ and a probability distribution $P$ on $\mathbb{R}^n$ [1]. Given a probability distribution $P$ on $\mathbb{R}^n$ with expected value $m=(m_1,\dots,m_n)$ and positive semi-definite covariance matrix $C$, the Mahalanobis distance of the point $x$ from the probability distribution $P$ is defined by the expression
$$D_M(x,P)=\sqrt{(x-m)^{\top}C^{-1}(x-m)}.$$
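For illustration, the distance formulas above translate directly into code. A minimal Python sketch (the function names and the sample distributions are ours, not from the article):

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """BC(P1, P2) = sum_i sqrt(P1_i * P2_i)."""
    return float(np.sum(np.sqrt(np.asarray(p) * np.asarray(q))))

def bhattacharyya_distance(p, q):
    """D_B = -log BC(P1, P2); measures closeness but is not a metric."""
    return float(-np.log(bhattacharyya_coefficient(p, q)))

def hellinger_distance(p, q):
    """D_H = sqrt(1 - BC(P1, P2)); a bounded metric on P(X)."""
    bc = bhattacharyya_coefficient(p, q)
    # guard against tiny negative rounding error when P1 == P2
    return float(np.sqrt(max(0.0, 1.0 - bc)))

def mahalanobis_distance(x, mean, cov):
    """D_M(x, P) = sqrt((x - m)^T C^{-1} (x - m))."""
    d = np.asarray(x, float) - np.asarray(mean, float)
    return float(np.sqrt(d @ np.linalg.solve(np.asarray(cov, float), d)))

p1 = [0.5, 0.3, 0.2]
p2 = [0.4, 0.4, 0.2]
```

With the identity covariance, the Mahalanobis distance reduces to the Euclidean distance, which gives a simple sanity check.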

1.2. On the Divergences for Two Probability Distributions

Now we turn to the analysis of divergences between two probability distributions. We will give definitions of several divergences based on the concept of distances of probability distributions, which will be important in our further synthesis problems.
In general, an $f$-divergence is a function $D_v^{f}(P^{(1)}\|P^{(2)})$ [5] that measures the difference between two probability distributions $P^{(1)}$ and $P^{(2)}$. However, here we present a slightly different general divergence measure, first introduced by A. Rényi [6]:
$$D_v^{R(\alpha)}\left(P^{(1)}\|P^{(2)}\right)=\frac{1}{\alpha-1}\log\left(\sum_{i=1}^{n}\left(P_i^{(1)}\right)^{\alpha}\left(P_i^{(2)}\right)^{1-\alpha}\right)=\frac{1}{\alpha-1}\log E_{P^{(1)}}\left[\left(\frac{P^{(1)}}{P^{(2)}}\right)^{\alpha-1}\right],$$
where $0<\alpha<\infty$, $\alpha\neq 1$, and $E$ denotes mathematical expectation. Calculating the Rényi divergence for different values of $\alpha$, we obtain well-known divergences as particular cases.
An important divergence is the Kullback-Leibler divergence [7,8]:
$$D_v^{KL}\left(P^{(1)}\|P^{(2)}\right)=D_v^{R(1)}\left(P^{(1)}\|P^{(2)}\right)=\sum_{i=1}^{n}P_i^{(1)}\log\frac{P_i^{(1)}}{P_i^{(2)}}=E_{P^{(1)}}\left(\log\frac{P^{(1)}}{P^{(2)}}\right),$$
which is the limiting case $\alpha\to 1$ of the Rényi divergence and also represents Shannon's relative entropy with respect to the distributions $P^{(1)}$ and $P^{(2)}$. $D_v^{KL}$ is also called an asymmetric divergence.
The following symmetric divergence is produced from it, today called the Jeffreys divergence [9]:
$$D_v^{J}\left(P^{(1)},P^{(2)}\right)=D_v^{KL}\left(P^{(1)}\|P^{(2)}\right)+D_v^{KL}\left(P^{(2)}\|P^{(1)}\right)=D_v^{J}\left(P^{(2)},P^{(1)}\right).$$
Then
$$D_v^{J}\left(P^{(1)},P^{(2)}\right)=\sum_{i=1}^{n}\left(P_i^{(1)}\log\frac{P_i^{(1)}}{P_i^{(2)}}-P_i^{(2)}\log\frac{P_i^{(1)}}{P_i^{(2)}}\right)=\sum_{i=1}^{n}\left(P_i^{(1)}-P_i^{(2)}\right)\left(\log P_i^{(1)}-\log P_i^{(2)}\right).$$
The Kullback-Leibler divergence also produces the symmetric Jensen-Shannon divergence [10,11,12]:
$$D_v^{JS}\left(P^{(1)},P^{(2)}\right)=\frac{1}{2}\left(D_v^{KL}\left(P^{(1)}\|R\right)+D_v^{KL}\left(P^{(2)}\|R\right)\right)=\frac{1}{2}\sum_{i=1}^{n}\left(P_i^{(1)}\log\frac{2P_i^{(1)}}{P_i^{(1)}+P_i^{(2)}}+P_i^{(2)}\log\frac{2P_i^{(2)}}{P_i^{(1)}+P_i^{(2)}}\right),$$
where $R=\frac{1}{2}\left(P^{(1)}+P^{(2)}\right)$ is a probability distribution from $P(X)$. $D_v^{JS}(\cdot)$ represents a particular case of the so-called $\lambda$-divergence
$$D_v^{\lambda}\left(P^{(1)},P^{(2)}\right)=\lambda\,D_v^{KL}\left(P^{(1)}\|R\right)+(1-\lambda)\,D_v^{KL}\left(P^{(2)}\|R\right),$$
where $R=\lambda P^{(1)}+(1-\lambda)P^{(2)}$, $\lambda\in[0,1]$, is some probability distribution from $P(X)$. Obviously, $D_v^{JS}\left(P^{(1)},P^{(2)}\right)=D_v^{1/2}\left(P^{(1)},P^{(2)}\right)$. The following relation between the Jensen-Shannon divergence and the Shannon entropy is valid:
$$D_v^{JS}\left(P^{(1)},P^{(2)}\right)=h\left(\frac{P^{(1)}+P^{(2)}}{2}\right)-\frac{h\left(P^{(1)}\right)+h\left(P^{(2)}\right)}{2},$$
where
$$h(P)=-\sum_{i=1}^{n}P_i\ln P_i$$
is the Shannon entropy, $P\in P(X)$, $P=(P_1,P_2,\dots,P_n)$, and $0\le D_v^{JS}\le\log 2$.
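The divergences above can likewise be computed element-wise. A minimal sketch assuming strictly positive distributions (the names and sample data are ours):

```python
import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(P || Q) = sum_i p_i log(p_i / q_i).
    Assumes strictly positive distributions (no zero entries)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

def jeffreys_divergence(p, q):
    """Symmetrized KL: D_J = D_KL(P||Q) + D_KL(Q||P)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def jensen_shannon_divergence(p, q):
    """D_JS = (D_KL(P||R) + D_KL(Q||R)) / 2 with R = (P + Q) / 2."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    r = 0.5 * (p + q)
    return 0.5 * (kl_divergence(p, r) + kl_divergence(q, r))

p1 = [0.5, 0.3, 0.2]
p2 = [0.2, 0.3, 0.5]
```

The assertions check the properties stated in the text: KL vanishes only on identical distributions, Jeffreys is symmetric, and the Jensen-Shannon divergence is symmetric and bounded by $\log 2$.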
We could continue the list of divergence definitions, but those above suffice for discussing the main problem of this article. It would also be natural to describe the relationships between the divergences mentioned here and entropy measures; however, this would take us far beyond the problems developed in the article.
It should be noted that in recent years, fundamental studies on distance, similarity, and difference relations in the space of finite probability distributions have hardly appeared; we find only the use of classical definitions in important practical problems. Let us consider some of them. Ref. [13] proposes an algorithm to measure the similarity between nominal variables of the same attribute, based on the fact that the similarity between nominal variables depends on the relationship between attribute subsets; this algorithm uses the difference in distributions, quantified by f-divergence, to form a feature vector of nominal variables. In [14], neighborhood-based collaborative filtering is considered: a new similarity scheme is proposed that breaks free of the constraint of co-rated items, and an item similarity measure based on the Kullback-Leibler divergence is presented, which identifies the relation between items based on the probability distribution of ratings. In [15], the use of a distribution-free overlapping measure is illustrated as an alternative way to quantify sample differences and assess research hypotheses expressed in terms of Bayesian evidence; the main features and potential of the overlapping index are presented by means of three empirical applications. The distance between probability distributions of different dimensions is proposed in [16]. Ref. [17] introduces fuzzy parameters and assesses the similarity between probability distributions using the fuzzy extended Kullback-Leibler divergence. In [18], a new divergence measure based on the Kullback-Leibler divergence is proposed to measure the difference between different basic probability assignments (BPAs) for the Dempster-Shafer belief structure. The book [19] covers the methods of metric distances and their application in probability theory and other fields.
The methods are fundamental in the study of limit theorems and generally in assessing the quality of approximations to a given probabilistic model.

1.3. On Non-Additive, Monotone Measures and Their Probability Representations in Aggregation Functions in the MADM Environment: A Basic Motivation of the Work

As is known, in stochastic modeling the construction of a probability distribution on the space of system states $X=\{x_1,\dots,x_n\}$ reduces to calculating the frequency distribution of the system states from historical objective data obtained by conducting random experiments. When studying different populations, additive aggregation functions, such as the expected value, the weighted expected value, and others, are mainly used to estimate basic population parameters.
However, today we often encounter experiments for which there is very little or no historical objective data. In such cases, the only source for estimating population parameters is experts and their knowledge. In processing these data, we use non-additive, monotone (often called fuzzy) measures instead of additive probability measures to evaluate the degree of activity of a system state or a group of states [20,21,22]. The syntactic representations of expert assessments of the population are often quite free in form, and we use the modern generalizations and extensions of L. Zadeh's fuzzy set theory to construct the corresponding semantic forms. Therefore, in aggregation procedures, non-additive but monotone aggregation integral functions are used, such as the finite Choquet [20] and Sugeno [21] integrals and others. In such aggregation functions, two poles are considered: the index of uncertainty of expert assessments (fuzzy measures) and the imprecision characteristic (fuzzy sets, variables).
Definition 1 
[22]. Let $X=\{x_1,\dots,x_n\}$ be some finite set and denote the algebra of all its subsets by $2^X$. We say that a set function $g:2^X\to[0,1]$ is a fuzzy measure if
$$\text{(i) } g(\varnothing)=0;\qquad \text{(ii) } g(X)=1;\qquad \text{(iii) if } A\subseteq B,\ A,B\in 2^X,\text{ then } g(A)\le g(B).$$
On the basis of the reasoning presented above, it can be concluded that, for aggregating expert assessments in different types of problems such as face recognition, data classification, decision making, and forecasting, it is essential to identify the uncertainty index, the fuzzy measure, and, on the space $G(X)$ of all fuzzy measures defined on $X$, to construct relations of similarity and difference, as is performed in the corresponding stochastic analysis (see the previous subsection) [21]. However, here we face an almost unsolvable obstacle: a fuzzy measure has $2^{|X|}$ values rather than $|X|$, as in the case of a probability distribution (measure), so the direct extension of the definitions of distance and divergence between probability measures given in Section 1.1 and Section 1.2 to fuzzy measures is meaningless. As is known, however, a fuzzy measure is completely equivalent to its associated probability class.
Definition 2 
[23]. Let $g:2^X\to[0,1]$ be some fuzzy measure on $2^X$ and let $S_n$ denote the class of all permutations of the elements of the set $\{1,2,\dots,n\}$. For each permutation $\sigma=(\sigma(1),\sigma(2),\dots,\sigma(n))\in S_n$ define
$$\begin{aligned} P_\sigma(x_{\sigma(1)})&=g(\{x_{\sigma(1)}\}),\\ P_\sigma(x_{\sigma(2)})&=g(\{x_{\sigma(1)},x_{\sigma(2)}\})-g(\{x_{\sigma(1)}\}),\\ &\;\;\vdots\\ P_\sigma(x_{\sigma(i)})&=g(\{x_{\sigma(1)},\dots,x_{\sigma(i)}\})-g(\{x_{\sigma(1)},\dots,x_{\sigma(i-1)}\}),\\ &\;\;\vdots\\ P_\sigma(x_{\sigma(n)})&=1-g(\{x_{\sigma(1)},\dots,x_{\sigma(n-1)}\}), \end{aligned}\qquad g(\{x_{\sigma(0)}\})\equiv 0.$$
The probability distribution $P_\sigma=\{P_\sigma(x_{\sigma(1)}),\dots,P_\sigma(x_{\sigma(n)})\}$ is called the probability distribution associated with the fuzzy measure $g$, and $\{P_\sigma(\cdot)\}_{\sigma\in S_n}$ is called the associated probability class (APC).
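Definition 2 is constructive, so the APC can be computed directly from a table of fuzzy-measure values. The following sketch (our own illustration, with hypothetical measure values) enumerates all $n!$ permutations:

```python
from itertools import permutations

def associated_probability_class(g, elements):
    """Build the APC of Definition 2: for each permutation sigma,
    P_sigma(x_sigma(i)) = g({x_sigma(1..i)}) - g({x_sigma(1..i-1)})."""
    apc = {}
    for sigma in permutations(elements):
        p, prev, prefix = {}, 0.0, []
        for x in sigma:
            prefix.append(x)
            cur = g[frozenset(prefix)]
            p[x] = cur - prev   # increment of g along the growing prefix
            prev = cur
        apc[sigma] = p
    return apc

# A hypothetical non-additive fuzzy measure on X = {a, b}:
g = {frozenset('a'): 0.2, frozenset('b'): 0.3, frozenset('ab'): 1.0}
apc = associated_probability_class(g, ['a', 'b'])
```

For this non-additive measure, the two permutations yield the distinct distributions $(0.2, 0.8)$ and $(0.7, 0.3)$, which together represent $g$.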
It should be noted that, due to its non-additivity, the fuzzy measure is a rather inflexible tool for representing the index of uncertainty in models of difficult, complex problems, for example, in MADM aggregation operators. In practice, its values are used in the integral operators of Choquet and Sugeno. Logically, the APC of a fuzzy measure should offer more possibilities of use, since its elements are additive measures, although they form a single class whose "direct insertion" into aggregations requires quite convenient techniques. The APC of a fuzzy measure has been widely used by the authors in their research, especially in generalizations of MADM aggregation operators where the fuzzy measure is replaced by probability distributions from its APC. Consider a brief overview of these studies. New fuzzy aggregation operators based on the APC for the extension of the finite Choquet integral are presented in [24]. In [25], the authors used associated probabilities in associated probability intuitionistic fuzzy aggregations for MADM; probabilistic OWA and other weighted aggregation operators are generalized to associated probability operators for intuitionistic fuzzy environments. For the same purpose, in [26], the APC is used in the construction of new aggregations, particularly in interactive MADM under q-rung orthopair fuzzy and q-rung picture linguistic environments. In [26], the APC is used in MADM aggregations with interacting attributes for the extension of the Choquet integral under q-rung orthopair fuzzy information; the constructed operators are included in the definition of new-type objective functions in the multi-objective facility location selection problem. In [27], associated probability aggregations are developed in multi-stage investment decision-making under a q-rung orthopair fuzzy environment. In [28], new associated statistical parameters are developed with the APC aggregations in interactive MADM.
The APC is developed in new fuzzy extensions of the binomial distribution in [29]. In [30], possibilistic simulation-based interactive fuzzy MADM is constructed on the basis of the APC under discrimination q-rung picture linguistic information; in the same article, associated probabilities are used in the processing of insufficient expert data for MADM.
As we have already noticed, there are no new studies on the definition of metric, similarity, or divergence relations in fuzzy measures space. Only two works were found [31,32], where the notion of metric for two finite fuzzy measures is introduced. Our study is the first attempt in this direction for finite fuzzy measures. As for the introduction of the mentioned relations in the space of finite probability distributions, this was conducted a long time ago, and a significant place has already been given to this issue in this article too. The new definitions of the mentioned relations in the finite fuzzy measures space, which are based on the APCs of the fuzzy measures, are a kind of generalization of these definitions. It is clear that, in recent years, there have been no new important studies of the mentioned definitions in finite probability spaces. Therefore, what we have included in the article are only the publications published in recent years that deal with the applications of similarity and difference relations of finite probability distributions (Section 1.2).
Based on the above, we can now state the main motivation of this research. As was said, in many studies, the use of a finite fuzzy measure in various types of problems is equated with the use of the class of probability distributions associated with it. Definitions of distance, similarity, or divergence indexes between finite probability distributions are determined by elementary arithmetic calculations on the probability distribution values at the points of the phase space. Why should we not use this idea for finite fuzzy measures in the definitions of similar indexes, with the probability distribution replaced by the class of associated probabilities, the APC? In the case of distance, this task can be formulated in general as follows:
As was mentioned, the fuzzy measure $g$ is uniquely represented by its so-called associated probability class $APC(g)\subset P(X)$ [31], and the distance between two fuzzy measures is reduced to the distance between their associated probability classes. If $g_1$ and $g_2$ are two fuzzy measures, and $APC(g_i)$, $i=1,2$, are their classes of associated probabilities, then in general the distance between them is defined as
$$D(g_1,g_2)=D_P\left(APC(g_1),APC(g_2)\right).$$
We will use this idea in the current study to generalize classical measures of similarity and divergence between two fuzzy measures from analogs of probabilistic measures, respectively.
If this could be done, then many interesting practical and applied problems, whose uncertainty index researchers consider the fuzzy measure, would be reduced to algebraic manipulations of the elements of the class of associated probabilities. After all, this class is a very convenient tool, because each of its elements is a finite probability distribution. All this would make it easier to solve many problems, because we will not be dealing with a monotone, but an additive class of measures. Let us cite some examples of such problems: effective solution of fuzzy measure identification, which is one of the problems of this research; determining the difference between the Dempster—Shafer belief structures; measuring distribution similarities between samples; problems of determining the similarity of non-additive measures of different dimensions; problems of constructing user similarity relations in collaborative filtering models; metrics introduction problems in the phase space of non-additive and interacting parameters of machine learning; clustering and classification of complex objects and other problems.
The second section presents the preliminary concepts. In particular, certain classes of fuzzy measures are briefly described: the Sugeno $\lambda$-additive measures [21] and the Choquet second-order capacities, equivalent to supermodular and submodular dual fuzzy measures [20,33,34]. Representations of their associated probabilities, as well as the Dempster-Shafer belief structure [35,36] and its probabilistic representations, are formulated. The third section discusses divergence and similarity indexes, as well as new definitions of distance for two fuzzy measures. The proof of the correctness of the generalizations of the definitions of distance, divergence, and similarity measures is presented in the fourth section. The use of these generalized concepts in fuzzy measure identification problems for different fuzzy measure classes is discussed in the fifth section. The obtained results and prospects for research development are presented in the sixth section.

2. Preliminary Concepts

2.1. Sugeno λ -Additive Measures and the Choquet Second-Order Capacities: Representations of Their Associated Probabilities

In this subsection, we briefly introduce fuzzy measure classes that are important and frequently used in practice. Non-additive monotone measures (the $\lambda$-additive fuzzy measures) were first introduced by Sugeno [21], who called monotone measures fuzzy measures; later, A. Kandel called them fuzzy statistics [37].
Definition 3 
[21]. Let $X=\{x_1,\dots,x_n\}$ be some finite set and denote the algebra of all its subsets by $2^X$. A fuzzy measure $g_\lambda:2^X\to[0,1]$ $(\lambda>-1)$ is called a $\lambda$-additive fuzzy measure if for every two sets $A,B\in 2^X$, $A\cap B=\varnothing$,
$$g_\lambda(A\cup B)=g_\lambda(A)+g_\lambda(B)+\lambda\,g_\lambda(A)\,g_\lambda(B),$$
with the following normalization condition:
$$\frac{1}{\lambda}\left\{\prod_{x_i\in X}\left(1+\lambda\,g_\lambda(\{x_i\})\right)-1\right\}=1.$$
It is easy to show that $\forall A\in 2^X$
$$g_\lambda(A)=\frac{1}{\lambda}\left\{\prod_{x_i\in A}\left(1+\lambda\,g_\lambda(\{x_i\})\right)-1\right\}.$$
It is not difficult to write the associated probabilities of a $\lambda$-additive fuzzy measure:
$$\forall\sigma\in S_n,\qquad P_\sigma(x_i)=g_\lambda(\{x_i\})\prod_{j=1}^{i(\sigma)-1}\left(1+\lambda\,g_\lambda(\{x_{\sigma(j)}\})\right),$$
where $i=1,2,\dots,n$, and $i(\sigma)$ is the position of the element $x_i$ in the permutation $\sigma=(\sigma(1),\dots,\sigma(n))$. If $i(\sigma)=1$, then $\prod_{j=1}^{0}\equiv 1$.
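The normalization condition fixes $\lambda$ once the densities $g_\lambda(\{x_i\})$ are chosen. A sketch (the densities are hypothetical) that solves for $\lambda$ by bisection and evaluates $g_\lambda$ via the product formula:

```python
def make_lambda_measure(densities, lam):
    """g_lambda(A) = (prod_{x_i in A}(1 + lam * g({x_i})) - 1) / lam, lam != 0."""
    def g(indices):
        prod = 1.0
        for i in indices:
            prod *= 1.0 + lam * densities[i]
        return (prod - 1.0) / lam
    return g

def solve_lambda(densities, lo=1e-9, hi=100.0):
    """Bisection for the normalization g_lambda(X) = 1, i.e.
    prod(1 + lam * g_i) = 1 + lam.  Assumes sum(densities) < 1, so lam > 0."""
    def f(lam):
        prod = 1.0
        for d in densities:
            prod *= 1.0 + lam * d
        return prod - 1.0 - lam
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

dens = [0.2, 0.3, 0.4]          # hypothetical densities g_lambda({x_i}), sum < 1
lam = solve_lambda(dens)
g = make_lambda_measure(dens, lam)
```

Since the density sum is below 1, the solved $\lambda$ is positive and the resulting measure is superadditive.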
Definition 4
[23]. Dual fuzzy measures $g_*$ and $g^*$ on $2^X$ are called the Choquet second-order lower and upper capacities, respectively, if $\forall A,B\in 2^X$
$$g_*(A\cup B)+g_*(A\cap B)\ge g_*(A)+g_*(B);\qquad g^*(A\cup B)+g^*(A\cap B)\le g^*(A)+g^*(B),$$
where $g^*(A)=1-g_*(\bar{A})$.
The Choquet second-order capacities form a fairly broad class of fuzzy measures; the Sugeno $\lambda$-additive fuzzy measures also belong to it. It is not difficult to show that the measure dual to $g_\lambda$ is the $\lambda'$-additive measure with $\lambda'=-\lambda/(1+\lambda)$.
It is easy to prove that
Proposition 1 
[23]. If $g_*,g^*:2^X\to[0,1]$ are dual fuzzy measures, then they have a common associated probability class $\{P_\sigma(\cdot)\}_{\sigma\in S_n}$, and for every $\sigma\in S_n$ we have $P_\sigma^{(g_*)}(\cdot)=P_{\sigma'}^{(g^*)}(\cdot)$, where $\sigma'$ is the permutation dual to $\sigma$:
$$\sigma'(i)=\sigma(n-i+1),\quad i=1,\dots,n.$$
Let us introduce the following notations:
Suppose $G(X)$ is the class of all fuzzy measures defined on $2^X$ and $G_{Ch(2)}(X)$ is the class of Choquet second-order capacities on $2^X$ ($G_{Ch(2)}(X)\subset G(X)$); let $G_S(X)$ be the class of Sugeno $\lambda$-additive fuzzy measures on $2^X$ ($G_S(X)\subset G_{Ch(2)}(X)\subset G(X)$). It is clear that if $P:2^X\to[0,1]$ is any probability measure (probability distribution), then it can formally be considered as a fuzzy measure, and its associated probability class consists of one element coinciding with this distribution itself. Denoting by $P(X)$ the class of all probability measures defined on $2^X$, we have
$$P(X)\subset G_S(X)\subset G_{Ch(2)}(X)\subset G(X).$$
Proposition 2 
[23]. If $g_*,g^*:2^X\to[0,1]$ are Choquet dual capacities of the second order ($g_*,g^*\in G_{Ch(2)}(X)$), then $\forall A\subseteq X$:
$$g_*(A)=\min_{\sigma\in S_n}P_\sigma(A);\qquad g^*(A)=\max_{\sigma\in S_n}P_\sigma(A),$$
where $\{P_\sigma(\cdot)\}_{\sigma\in S_n}$ is the common associated probability class of the dual measures $g_*$ and $g^*$.
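Proposition 2 can be checked numerically on a small example. In the sketch below (the measure values are ours), the minimum and maximum of $P_\sigma(A)$ over the two permutations reproduce the lower capacity and its dual upper capacity, respectively:

```python
from itertools import permutations

# A supermodular (lower) capacity on X = {a, b}:
# g(A ∪ B) + g(A ∩ B) >= g(A) + g(B) holds, since 1.0 + 0.0 >= 0.2 + 0.3.
g_low = {frozenset(): 0.0, frozenset('a'): 0.2,
         frozenset('b'): 0.3, frozenset('ab'): 1.0}

def apc_value(g, sigma, A):
    """P_sigma(A): sum of associated probabilities of the elements of A."""
    p, prev, prefix = {}, 0.0, []
    for x in sigma:
        prefix.append(x)
        cur = g[frozenset(prefix)]
        p[x] = cur - prev
        prev = cur
    return sum(p[x] for x in A)

def capacity_bounds(g, elements, A):
    """min and max of P_sigma(A) over all permutations sigma."""
    vals = [apc_value(g, s, A) for s in permutations(elements)]
    return min(vals), max(vals)

lo, hi = capacity_bounds(g_low, ['a', 'b'], {'a'})
```

Here `lo` recovers $g_*(\{a\})=0.2$ and `hi` recovers the dual value $g^*(\{a\})=1-g_*(\{b\})=0.7$.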

2.2. Dempster-Shafer’s Belief Structure and Its Probabilistic Representations

The theory of evidence is based on two dual fuzzy measures, the belief ($Bel$) and plausibility ($Pl$) measures [35,36]. It is easy to show that these two classes of dual measures are also subclasses of the second-order Choquet capacities. These two fuzzy measures on $2^X$ are uniquely determined by the so-called basic probability assignment (BPA) on $2^X$ [36]:
$$m:2^X\to[0,1]$$
with the following two conditions:
$$\text{(i) } m(\varnothing)=0;\qquad \text{(ii) } \sum_{B\in 2^X}m(B)=1.$$
Of course, generally speaking, $m$ is a probability distribution on $2^X$, not on $X$!
Then
$$Pl(A)=\sum_{B:\,A\cap B\neq\varnothing}m(B),\quad\forall A\in 2^X;\qquad Bel(A)=\sum_{B:\,B\subseteq A}m(B),\quad\forall A\in 2^X.$$
There is also the inverse relation:
$$m(A)=\sum_{B:\,B\subseteq A}(-1)^{|A\setminus B|}Bel(B),\quad\forall A\in 2^X.$$
Each set $A\in 2^X$ for which $m(A)>0$ is called a focal element. The pair $\langle F,m\rangle$ is called the body of evidence, where
$$F=\{A\in 2^X:\ m(A)>0\}.$$
It is easy to show that
$$m(A)=\sum_{B\in F:\,B\subseteq A}(-1)^{|A\setminus B|}\min_{\sigma\in S_n}P_\sigma^{(Bel)}(B),\quad\forall A\in 2^X,$$
where $\{P_\sigma^{(Bel)}(\cdot)\}_{\sigma\in S_n}$ is the class of associated probabilities of the measure $Bel$.
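The $Bel$ and $Pl$ formulas are direct sums over focal elements. A minimal sketch with a hypothetical BPA:

```python
def plausibility(m, A):
    """Pl(A) = sum of m(B) over focal elements B with A ∩ B nonempty."""
    return sum(v for B, v in m.items() if A & B)

def belief(m, A):
    """Bel(A) = sum of m(B) over focal elements B contained in A."""
    return sum(v for B, v in m.items() if B <= A)

# Hypothetical body of evidence on X = {a, b, c}:
m = {frozenset('a'): 0.5, frozenset('ab'): 0.3, frozenset('abc'): 0.2}
X = frozenset('abc')
A = frozenset('ab')
```

The last assertion checks the duality $Pl(A)=1-Bel(\bar{A})$.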
Definition 5 
[38]. A fuzzy measure $Pos$ on $2^X$ is called a possibility measure if there exists a possibility distribution $\pi:X\to[0,1]$ on $X$, $\pi(x_i)=Pos(\{x_i\})$, $i=1,\dots,n$, with $\exists x_0:\ \pi(x_0)=1$, such that
$$Pos(A)=\max_{x_i\in A}\pi(x_i),\quad\forall A\in 2^X.$$
The necessity measure $Nes$ is dual to the possibility measure: $Nes(A)=1-Pos(\bar{A})$, $\forall A\in 2^X$ [38]. It is known that for every $Pos$ measure on $2^X$ there is a so-called consonant body of evidence [36]
$$F=\{A_{l_1}\subset A_{l_2}\subset\dots\subset A_{l_k}\},\qquad A_{l_i}=\{x_1,x_2,\dots,x_{l_i}\},\quad i=1,2,\dots,k,$$
and, with the notations $m_{l_i}\equiv m(A_{l_i})$, $i=1,\dots,k$, and $\pi_j\equiv\pi(x_j)$, $j=1,\dots,n$, the following connection between them holds:
$$\begin{cases} \pi_i=\sum_{j:\,x_i\in A_{l_j}} m_{l_j}, & i=1,2,\dots,n,\\[4pt] m_{l_j}=\pi_{l_j}-\pi_{l_{j+1}},\quad \pi_{l_{k+1}}\equiv 0, & j=1,2,\dots,k. \end{cases}$$
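The $\pi\leftrightarrow m$ connection is a simple telescoping sum. A sketch assuming the elements are already indexed so that $\pi_1\ge\pi_2\ge\dots$ with $\pi_1=1$ (the sample values are ours):

```python
def bpa_from_possibility(pi):
    """m_j = pi_j - pi_{j+1} for nested focal sets A_j = {x_1, ..., x_j};
    assumes pi sorted non-increasingly with pi[0] = 1 and pi_{n+1} = 0."""
    n = len(pi)
    return [pi[j] - (pi[j + 1] if j + 1 < n else 0.0) for j in range(n)]

def possibility_from_bpa(m):
    """pi_i = sum of m_j over the nested focal sets A_j containing x_i,
    i.e. over all j >= i."""
    return [sum(m[i:]) for i in range(len(m))]

pi = [1.0, 0.7, 0.4]
m = bpa_from_possibility(pi)
```

In the article's notation, focal sets with $m_{l_j}=0$ would simply be dropped from the consonant body; the sketch keeps them for brevity.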
Let $\{P_\sigma^{(Pos)}(\cdot)\}_{\sigma\in S_n}$ be the associated probability class of the $Pos$ measure. Then the connections between the elements $\{\pi_i\}$, $\{m_{l_i}\}$, and $\{P_\sigma^{(Pos)}(\cdot)\}_{\sigma\in S_n}$ are easily obtained:
$$P_\sigma^{(Pos)}(\{x_{\sigma(i)}\})=\max_{l=1,\dots,i}\{\pi_{\sigma(l)}\}-\max_{l=1,\dots,i-1}\{\pi_{\sigma(l)}\}=\max_{j=1,\dots,i}\sum_{s:\,x_{\sigma(j)}\in A_{l_s}} m_{l_s}-\max_{j=1,\dots,i-1}\sum_{s:\,x_{\sigma(j)}\in A_{l_s}} m_{l_s}.$$
On the other hand,
$$\pi_i=Pos(\{x_i\})=\max_{\sigma\in S_n}P_\sigma^{(Pos)}(\{x_i\}),\quad i=1,2,\dots,n.$$
Since $Pos$ is a second-order Choquet capacity, we obtain
$$m_{l_j}=\pi_{l_j}-\pi_{l_{j+1}}=\max_{\sigma\in S_n}P_\sigma^{(Pos)}(\{x_{l_j}\})-\max_{\sigma\in S_n}P_\sigma^{(Pos)}(\{x_{l_{j+1}}\}),\quad j=1,\dots,k.$$
Denote by G P o s ( X ) the class of all possibility measures defined on 2 X .
Let $m$ be some BPA with body of evidence $\langle F,m\rangle$, $F=\{A_1,A_2,\dots,A_k\}$. Suppose that for each focal element $A_j$, $j=1,\dots,k$, we have an $|A_j|$-dimensional weight vector $W_j=\langle w_j(1),\dots,w_j(|A_j|)\rangle$, such that $w_j(i)\in[0,1]$ and $\sum_{i=1}^{|A_j|}w_j(i)=1$. This is called the allocation vector of the element $A_j$. R. Yager [39] studied the set function
$$g(A)=\sum_{j=1}^{k}\left[m(A_j)\sum_{i=1}^{|A_j\cap A|}w_j(i)\right],$$
which represents the fuzzy measure associated with the belief structure $\langle F,m\rangle$ on $2^X$. Denote by $G_{\langle F,m\rangle}(X)$ the class of fuzzy measures associated with the body of evidence $\langle F,m\rangle$ on $2^X$. Obviously, each such fuzzy measure is uniquely defined by the class of allocation vectors $W=\{W_1,\dots,W_k\}$. If, for a consonant body of evidence, each $W_j$ is chosen so that $w_j(1)=1$ and $w_j(i)=0$, $i=2,\dots,|A_j|$, then the fuzzy measure $g$ coincides with the measure $Pos$; vice versa, if $w_j(|A_j|)=1$ and $w_j(i)=0$, $i=1,\dots,|A_j|-1$, then $g$ coincides with the measure $Nes$. Note that all fuzzy measures of this class have a common Shapley entropy [39]. It is easy to see the relationship between the associated fuzzy measure, the associated probability class, the focal elements, and the allocation vector class $W=\{W_1,\dots,W_k\}$:
$$P_\sigma^{(g)}(\{x_{\sigma(i)}\})=\sum_{j=1}^{k}m(A_j)\left[\sum_{l=1}^{|A_j\cap\{x_{\sigma(1)},\dots,x_{\sigma(i)}\}|}w_j(l)-\sum_{l=1}^{|A_j\cap\{x_{\sigma(1)},\dots,x_{\sigma(i-1)}\}|}w_j(l)\right]=\sum_{A_j\in F:\,x_{\sigma(i)}\in A_j}m(A_j)\,w_j\left(|A_j\cap\{x_{\sigma(1)},\dots,x_{\sigma(i)}\}|\right).$$
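Yager's set function and its $Pos$/$Nes$ special cases can be checked on a small consonant body of evidence. The sketch below uses hypothetical focal elements and weights:

```python
def yager_measure(focal, m, W, A):
    """g(A) = sum_j m(A_j) * sum_{i=1..|A_j ∩ A|} w_j(i)  (Yager [39])."""
    total = 0.0
    for Aj, mj, wj in zip(focal, m, W):
        k = len(Aj & A)                 # |A_j ∩ A|
        total += mj * sum(wj[:k])       # first k allocation weights
    return total

# Hypothetical consonant body of evidence on X = {a, b}:
focal = [frozenset('a'), frozenset('ab')]
m = [0.6, 0.4]
W_pos = [[1.0], [1.0, 0.0]]   # all weight on the first slot  -> Pos
W_nes = [[1.0], [0.0, 1.0]]   # all weight on the last slot   -> Nes
```

With these weights, $g$ reproduces $Pos(\{a\})=1$, $Pos(\{b\})=0.4$ and $Nes(\{a\})=Bel(\{a\})=0.6$, $Nes(\{b\})=0$.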

3. Divergence, Similarity and Distance Parameters for Two Fuzzy Measures

In this section, we realize the main problem formulated in the introduction, which, for the case of distance, is stated in the form (16). We will try to generalize the distance and divergence measures of two probability distributions to the case of two fuzzy measures. As was shown in the previous section, any fuzzy measure $g\in G(X)$ generates its APC $\{P_\sigma^{(g)}(\cdot)\}_{\sigma\in S_n}$. The reverse statement is also valid.
Proposition 3 
[23]. Let a fuzzy measure $g\in G(X)$ and its APC $\{P_\sigma^{(g)}(\cdot)\}_{\sigma\in S_n}$ be given. If $A=\{x_{i_1},x_{i_2},\dots,x_{i_s}\}\in 2^X$ is some set, then
$$g(A)=P_\tau^{(g)}(A),$$
where in the permutation $\tau=(\tau(1),\tau(2),\dots,\tau(s),\dots,\tau(n))$ the first $s$ indexes coincide with the indexes $i_1,i_2,\dots,i_s$: $\tau(j)=i_j$, $j=1,2,\dots,s$.
Therefore, from the associated probabilities we always recover the values of the fuzzy measure $g$ on any argument $A\in 2^X$. Thus, we will say that the fuzzy measure $g\in G(X)$ is given if and only if the class of its associated probabilities $\{P_\sigma^{(g)}(\cdot)\}_{\sigma\in S_n}$ is known.
The main concept of our study is to relate the measurement of similarity or divergence between two fuzzy measures to the similarity and difference between their APCs (15).

3.1. Distance Generalizations for Two Fuzzy Measures on G ( X )

Definition 6. 
Let $T_m\equiv\{(z_1,z_2,\dots,z_m)\in\mathbb{R}^m\mid z_i\ge 0,\ i=1,2,\dots,m\}$ and let $F:T_m\to\mathbb{R}^{+}$ be a function. We say that $F$ is a distance-generating function if it satisfies the following five properties:
$$\text{(1) } F(z_1,z_2,\dots,z_m)=0\ \text{ if }\ z_1=z_2=\dots=z_m=0;$$
$$\text{(2) if } z_i\le z_i'\ \text{for every } i,\ \text{then } F(z_1,z_2,\dots,z_m)\le F(z_1',z_2',\dots,z_m');$$
that is, $F$ is monotonically non-decreasing;
$$\text{(3) } F(z_1+z_1',z_2+z_2',\dots,z_m+z_m')\le F(z_1,z_2,\dots,z_m)+F(z_1',z_2',\dots,z_m');$$
that is, $F$ is sub-additive;
$$\text{(4) } F(z,z,\dots,z)=z;$$
that is, $F$ is idempotent;
$$\text{(5) } F(z_{\sigma(1)},z_{\sigma(2)},\dots,z_{\sigma(m)})=F(z_1,z_2,\dots,z_m),\quad\forall\sigma\in S_m;$$
that is, $F$ is symmetric.
Sort the $m=n!$ permutations $\sigma=(\sigma(1),\dots,\sigma(n))$ of $S_n$ by some criterion so as to renumber them from 1 to $m$; the associated probability class $\{P_\sigma(\cdot)\}_{\sigma\in S_n}$ can then be regarded as an $m$-dimensional vector $(P_1,P_2,\dots,P_m)$, where $m=n!$. Let $d$ be a distance on $P(X)$.
Definition 7. 
Let two fuzzy measures $g_1,g_2\in G(X)$ be given. The binary function
$$D_{(F,d)}(g_1,g_2)=F\left(d\left(P_1^{(g_1)},P_1^{(g_2)}\right),\dots,d\left(P_m^{(g_1)},P_m^{(g_2)}\right)\right)$$
is called the distance between the two fuzzy measures, where $\left(P_1^{(g_1)},\dots,P_m^{(g_1)}\right)$ and $\left(P_1^{(g_2)},\dots,P_m^{(g_2)}\right)$ are the associated probability classes of the fuzzy measures $g_1$ and $g_2$, respectively.
It is not difficult to show that the function D^(F,d) : G(X) × G(X) → R_+ defined by the distance generator function satisfies the distance axioms. Here are two examples of the generator function F that we will use below:
F_q(z_1, z_2, …, z_m) ≡ ( (1/m) Σ_{i=1}^{m} z_i^q )^{1/q} , q ≥ 1 ;
F_max(z_1, z_2, …, z_m) = max_{1≤i≤m} { z_i } .
If the distance d between probability distributions is calculated by the following two formulas:
d_max(P(g_1), P(g_2)) = max_{1≤i≤n} | P(g_1)(x_i) − P(g_2)(x_i) | ;
d_q(P(g_1), P(g_2)) = ( Σ_{i=1}^{n} | P(g_1)(x_i) − P(g_2)(x_i) |^q )^{1/q} , q ≥ 1 ,
then the following proposition is easy to verify.
Proposition 4. 
Let us be given two fuzzy measures g 1 , g 2 G ( X ) . The distances between fuzzy measures and their corresponding dual fuzzy measures coincide
D ( F , d ) ( g 1 , g 2 ) = D ( F , d ) ( g 1 , g 2 ) ,
where g 1 and g 2 are dual fuzzy measures to g 1 and g 2 , respectively.
As a concrete example, consider the case when
F 2 ( z 1 , , z m ) = 1 m i = 1 m z i 2      
and distance on P ( X ) is defined by the formula:
d 2 ( P ( 1 ) , P ( 2 ) ) = i = 1 n ( P ( 1 ) ( x i ) P ( 2 ) ( x i ) ) 2 ,
then
D ( F 2 , d 2 ) ( g 1 , g 2 ) = 1 m j = 1 m i = 1 n ( P j ( g 1 ) ( x i ) P j ( g 2 ) ( x i ) ) 2 = 1 n ! σ S n i = 1 n ( P σ ( g 1 ) ( x σ ( i ) ) P σ ( g 2 ) ( x σ ( i ) ) ) 2 .
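As a quick illustration of this last formula, the distance D^(F2,d2) can be computed directly from the two APCs. The sketch below is an assumption-laden illustration, not the paper's implementation: the APC is stored as a dict mapping each permutation σ to the probability vector ( P_σ(x_1), …, P_σ(x_n) ), and the toy numbers are invented.

```python
import math

def d2(p, q):
    """Euclidean distance between two probability vectors."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def distance_F2_d2(apc1, apc2):
    """D^(F2,d2): quadratic mean, over the n! permutations, of the Euclidean
    distances between the corresponding associated probabilities."""
    m = len(apc1)
    return math.sqrt(sum(d2(apc1[s], apc2[s]) ** 2 for s in apc1) / m)

# Toy APCs on X = {x1, x2} (m = 2! = 2 permutations; invented values).
apc_g1 = {(0, 1): (0.7, 0.3), (1, 0): (0.6, 0.4)}
apc_g2 = {(0, 1): (0.7, 0.3), (1, 0): (0.6, 0.4)}
print(distance_F2_d2(apc_g1, apc_g2))  # identical APCs -> 0.0
```

For an additive measure the APC collapses to a single distribution, so this reduces to the ordinary Euclidean distance between distributions.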
Let us introduce the concepts of associated parameters with associated probabilities of any fuzzy measures g 1 , g 2 G ( X ) , which will be analogous to the definitions provided in the introduction of the article. Let { P σ ( g 1 ) } σ S n and { P σ ( g 2 ) } σ S n be the APCs for the fuzzy measures g 1 and g 2 from G ( X ) , respectively.
Definition 8. 
1. Bhattacharyya’s F —coefficient for two fuzzy measures g 1 , g 2 G ( X ) is called
B C ( F ) ( g 1 , g 2 ) = F ( B C ( P 1 ( g 1 ) , P 1 ( g 2 ) ) , , B C ( P m ( g 1 ) , P m ( g 2 ) ) ) ,
where BC(P_i(g_1), P_i(g_2)) = Σ_{j=1}^{n} [ P_i(g_1)(x_j) · P_i(g_2)(x_j) ]^{1/2} , i = 1, …, m .
2. Then Bhattacharyya’s F —distance between two fuzzy measures g 1 , g 2 G ( X ) is called
D_B^(F)(g_1, g_2) = F( −log(BC(P_1(g_1), P_1(g_2))) , … , −log(BC(P_m(g_1), P_m(g_2))) ) .
Other generalized F —distances between two fuzzy measures are defined similarly. Calculate the distance for the F = F max function:
D_B^(Fmax)(g_1, g_2) = max_{1≤i≤m} { −log(BC(P_i(g_1), P_i(g_2))) } = −min_{σ∈S_n} { log(BC(P_σ(g_1), P_σ(g_2))) } = −min_{σ∈S_n} { log( Σ_{i=1}^{n} [ P_σ(g_1)(x_{σ(i)}) P_σ(g_2)(x_{σ(i)}) ]^{1/2} ) } ,
while for F = F q , ( q 1 ) , we have
D_B^(Fq)(g_1, g_2) = ( (1/m) Σ_{i=1}^{m} [ log( 1 / BC(P_i(g_1), P_i(g_2)) ) ]^q )^{1/q} = ( (1/n!) Σ_{σ∈S_n} [ log( 1 / Σ_{i=1}^{n} ( P_σ(g_1)(x_{σ(i)}) P_σ(g_2)(x_{σ(i)}) )^{1/2} ) ]^q )^{1/q} .
For the particular case F = F_1 , we have the simple expression
D_B^(F1)(g_1, g_2) = −(1/m) Σ_{σ∈S_n} log(BC(P_σ(g_1), P_σ(g_2))) = −(1/n!) log [ Π_{σ∈S_n} ( Σ_{i=1}^{n} ( P_σ(g_1)(x_{σ(i)}) P_σ(g_2)(x_{σ(i)}) )^{1/2} ) ] .
In the same way, we can construct Bhattacharyya's F—angle between two fuzzy measures. Similarly, the Hellinger F—distance for two fuzzy measures can also be constructed (omitted here).
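The Bhattacharyya-type formulas above follow one pattern: compute the classical coefficient permutation-by-permutation, take −log, and scale with F_1 or F_max. A minimal sketch, assuming the APC-as-dict convention and aligned probability vectors (all data here are hypothetical):

```python
import math

def bhattacharyya_coeff(p, q):
    """BC(P, Q) = sum_j sqrt(P(x_j) * Q(x_j)) for aligned probability vectors."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

def bhattacharyya_F1(apc1, apc2):
    """D_B^(F1): arithmetic mean of the per-permutation -log BC values."""
    m = len(apc1)
    return sum(-math.log(bhattacharyya_coeff(apc1[s], apc2[s])) for s in apc1) / m

def bhattacharyya_Fmax(apc1, apc2):
    """D_B^(Fmax): worst-case -log BC over the permutations."""
    return max(-math.log(bhattacharyya_coeff(apc1[s], apc2[s])) for s in apc1)
```

Since BC = 1 exactly when the two distributions coincide, both generalized distances vanish when the APCs are identical, as the distance axioms require.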

3.2. Divergence Generalizations and Similarity Index for Two Fuzzy Measures

Now let us turn to generalizations of divergences for two fuzzy measures.
Definition 9. 
Rényi F —divergence between two fuzzy measures g 1 , g 2 G ( X ) is called:
D v R ( F , α ) ( g 1 g 2 ) = F ( D v R ( α ) ( P 1 ( g 1 ) P 1 ( g 2 ) ) , , D v R ( α ) ( P m ( g 1 ) | | P m ( g 2 ) ) ) ,
where 0 < α < ,       α 1 .
For the function F = F q we obtain
D_vR^(Fq,α)(g_1 ‖ g_2) = ( (1/m) Σ_{σ∈S_n} [ D_vR^(α)(P_σ(g_1) ‖ P_σ(g_2)) ]^q )^{1/q} = ( 1/(m(α−1)^q) Σ_{σ∈S_n} ( log Σ_{i=1}^{n} ( P_σ(g_1)(x_{σ(i)}) )^α / ( P_σ(g_2)(x_{σ(i)}) )^{α−1} )^q )^{1/q} .
Definition 10. 
The Kullback—Leibler F —divergence between two fuzzy measures g 1 , g 2 G ( X ) is called:
D v K L ( F ) ( g 1 g 2 ) = F [ D v K L ( P 1 ( g 1 ) P 1 ( g 2 ) ) , , D v K L ( P m ( g 1 ) P m ( g 2 ) ) ] .
As an example, let us construct the Kullback–Leibler F_q—divergence (the α → 1 limit of the Rényi divergence):
D_vKL^(Fq)(g_1 ‖ g_2) = ( (1/m) Σ_{σ∈S_n} [ D_vKL(P_σ(g_1) ‖ P_σ(g_2)) ]^q )^{1/q} = ( (1/n!) Σ_{σ∈S_n} ( Σ_{i=1}^{n} P_σ(g_1)(x_{σ(i)}) log( P_σ(g_1)(x_{σ(i)}) / P_σ(g_2)(x_{σ(i)}) ) )^q )^{1/q} .
Let us also construct the Kullback—Leibler F max —divergence.
D v K L ( F max ) ( g 1 g 2 ) = max σ S n { D v K L ( P σ ( g 1 ) , P σ ( g 2 ) ) } = max σ S n { i = 1 n ( P σ ( g 1 ) ( x σ ( i ) ) log P σ ( g 1 ) ( x σ ( i ) ) P σ ( g 2 ) ( x σ ( i ) ) ) } .
Definition 11. 
The Jeffrey’s F —divergence for two fuzzy measures g 1 , g 2 G ( X ) is called:
D v J ( F ) ( g 1 , g 2 ) D v K L ( F ) ( g 1 g 2 ) + D v K L ( F ) ( g 2 g 1 ) = D v J ( F ) ( g 2 , g 1 ) .
Let us construct the Jeffrey’s divergence for F = F q :
D_vJ^(Fq)(g_1, g_2) = ( (1/n!) Σ_{σ∈S_n} ( Σ_{i=1}^{n} [ ( P_σ(g_1)(x_{σ(i)}) − P_σ(g_2)(x_{σ(i)}) ) log( P_σ(g_1)(x_{σ(i)}) / P_σ(g_2)(x_{σ(i)}) ) ] )^q )^{1/q} .
For F = F max we obtain:
D v J ( F max ) ( g 1 , g 2 ) = max σ S n [ i = 1 n ( P σ ( g 1 ) ( x σ ( i ) ) P σ ( g 2 ) ( x σ ( i ) ) ) log P σ ( g 1 ) ( x σ ( i ) ) P σ ( g 2 ) ( x σ ( i ) ) ] .
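The Kullback–Leibler and Jeffrey generalizations again share one pattern: a classical divergence is computed permutation-by-permutation and the resulting vector is scaled by F_q or F_max. A hedged sketch of that pattern (APC-as-dict convention; strictly positive second arguments are assumed to avoid division by zero):

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence of aligned probability vectors
    (terms with p_i = 0 contribute 0; q is assumed strictly positive)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jeffreys(p, q):
    """Jeffrey's symmetric divergence: KL(p||q) + KL(q||p)."""
    return kl(p, q) + kl(q, p)

def divergence_Fq(apc1, apc2, div, q=1):
    """F_q scaling of a per-permutation divergence `div` over two APCs."""
    m = len(apc1)
    return (sum(div(apc1[s], apc2[s]) ** q for s in apc1) / m) ** (1.0 / q)

def divergence_Fmax(apc1, apc2, div):
    """F_max scaling: the worst-case divergence over the permutations."""
    return max(div(apc1[s], apc2[s]) for s in apc1)
```

Passing `kl` or `jeffreys` as `div` reproduces the D_vKL- and D_vJ-type expressions above for any q ≥ 1.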
Definition 12. 
F , λ —divergence for two fuzzy measures g 1 , g 2 G ( X ) is called
D v ( F , λ ) ( g 1 , g 2 ) λ D v K L ( F ) ( g 1 R ) + ( 1 λ ) D v K L ( F ) ( g 2 R ) ,
where the associated probabilities of a fuzzy measure R are obtained as a convex combination of the associated probabilities of fuzzy measures g 1 and g 2 .
P σ ( R ) ( x σ ( i ) ) = λ P σ ( g 1 ) ( x σ ( i ) ) + ( 1 λ ) P σ ( g 2 ) ( x σ ( i ) )   ,         0 λ 1 .
If in this definition λ = 1 / 2 , then we get a generalization of Jensen-Shannon’s classical divergence with respect to fuzzy measures.
Definition 13. 
The Jensen-Shannon F —divergence for two fuzzy measures g 1 , g 2 G ( X ) is called
D v J S ( F ) ( g 1 , g 2 ) = D v ( F , 1 / 2 ) ( g 1 , g 2 ) = 1 2 ( D v K L ( F ) ( g 1 R ) + D v K L ( F ) ( g 2 R ) ) .
Let us construct the Jensen-Shannon F = F q —divergence
D_vJS^(Fq)(g_1, g_2) = (1/2) ( (1/n!) Σ_{σ∈S_n} ( Σ_{j=1}^{n} [ P_σ(g_1)(x_{σ(j)}) log( 2 P_σ(g_1)(x_{σ(j)}) / ( P_σ(g_1)(x_{σ(j)}) + P_σ(g_2)(x_{σ(j)}) ) ) + P_σ(g_2)(x_{σ(j)}) log( 2 P_σ(g_2)(x_{σ(j)}) / ( P_σ(g_1)(x_{σ(j)}) + P_σ(g_2)(x_{σ(j)}) ) ) ] )^q )^{1/q} .
Let us also construct the Jensen-Shannon F = F max —divergence
D_vJS^(Fmax)(g_1, g_2) = (1/2) max_{σ∈S_n} { Σ_{j=1}^{n} [ P_σ(g_1)(x_{σ(j)}) log( 2 P_σ(g_1)(x_{σ(j)}) / ( P_σ(g_1)(x_{σ(j)}) + P_σ(g_2)(x_{σ(j)}) ) ) + P_σ(g_2)(x_{σ(j)}) log( 2 P_σ(g_2)(x_{σ(j)}) / ( P_σ(g_1)(x_{σ(j)}) + P_σ(g_2)(x_{σ(j)}) ) ) ] } .
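The inner sums in these Jensen–Shannon expressions are the classical per-permutation divergence with respect to the midpoint distribution. A minimal numerical sketch, assuming the APC-as-dict convention (natural logarithm, so the per-permutation value is bounded by log 2):

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence of two probability vectors: the symmetrized
    KL divergence to the midpoint mixture (p + q) / 2."""
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * (kl(p, mix) + kl(q, mix))

def js_Fmax(apc1, apc2):
    """D_vJS^(Fmax)-style value: worst-case Jensen-Shannon divergence
    over the permutations of the two APCs."""
    return max(js_divergence(apc1[s], apc2[s]) for s in apc1)
```

Symmetry in the two arguments and the bound log 2 carry over to the generalized quantity, since F_max and F_q are symmetric and idempotent.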
As we can see, there are many possible variants and combinations of such generalizations; however, pursuing them further would take us too far from the main topic of the article.

3.3. Similarity Index Between Two Fuzzy Measures

In this subsection, we will consider the generalization of the definition of similarity relation to the space of fuzzy measures. We will also provide a short analysis of it.
Huang [40] defined the index of similarity of two probability distributions.
Definition 14 
[40]. If two probability distributions P^(1), P^(2) ∈ P(X) are defined on X , then their distribution similarity index is defined as:
D S I ( P ( 1 ) , P ( 2 ) ) = β ( P ( 1 ) , P ( 2 ) ) 1 2 [ β ( P ( 1 ) ) + β ( P ( 2 ) ) ] ,
where the expression
β ( P ) = i = 1 n ( P ( x i ) ) 2 = E P ( P )
is called an informity of the distribution P P ( X ) , and the expression
β ( P ( 1 ) , P ( 2 ) ) = i = 1 n P ( 1 ) ( x i ) P ( 2 ) ( x i )
is called a cross informity of two distributions.
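Definition 14 is a ratio: the cross informity of the pair over the average of the two informities. A direct sketch of the three quantities (toy vectors only, for illustration):

```python
def informity(p):
    """Huang's informity: the self-expectation sum_i P(x_i)^2."""
    return sum(pi ** 2 for pi in p)

def cross_informity(p, q):
    """Cross informity of two aligned distributions: sum_i P1(x_i) P2(x_i)."""
    return sum(pi * qi for pi, qi in zip(p, q))

def dsi(p, q):
    """Huang's distribution similarity index,
    beta(P1, P2) / ((beta(P1) + beta(P2)) / 2); equals 1 iff p == q."""
    return 2 * cross_informity(p, q) / (informity(p) + informity(q))
```

By the Cauchy–Schwarz inequality the numerator never exceeds the denominator, so the index always lies in [0, 1], reaching 1 only for identical distributions and 0 for distributions with disjoint supports.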
Huang noted [40] that DSI values of 0.75, 0.5, and 0.25 correspond to high, medium, and low similarity of the distributions P^(1) and P^(2), respectively. This informational analysis and the related synthesis issues are indeed debatable, and several authors have noted the limited correctness of this analysis. It may be useful to describe similarity as a linguistic variable with fuzzy terms, in order to use it further in synthesis questions; one example of this is shown in Figure 1:
Obviously, if we begin analyzing the divergences defined above, we will have to introduce new, stronger semantic forms to solve the practical synthesis problems associated with them.
We can easily generalize Huang’s similarity index in the case of two fuzzy measures as well.
Definition 15. 
Huang’s F —similarity index for two fuzzy measures g 1 , g 2 G ( X ) is called:
DSI^(F)(g_1, g_2) = F( DSI(P_1(g_1), P_1(g_2)) , … , DSI(P_m(g_1), P_m(g_2)) ) , m = n! .
For F = F q —similarity index we have:
DSI^(Fq)(g_1, g_2) = 2 ( (1/n!) Σ_{σ∈S_n} ( Σ_{j=1}^{n} P_σ(g_1)(x_j) P_σ(g_2)(x_j) / Σ_{j=1}^{n} [ ( P_σ(g_1)(x_j) )^2 + ( P_σ(g_2)(x_j) )^2 ] )^q )^{1/q} ,
and for the F = F max case:
DSI^(Fmax)(g_1, g_2) = 2 max_{σ∈S_n} { Σ_{j=1}^{n} P_σ(g_1)(x_j) P_σ(g_2)(x_j) / Σ_{j=1}^{n} [ ( P_σ(g_1)(x_j) )^2 + ( P_σ(g_2)(x_j) )^2 ] } .
It is easy to show that D S I ( F ) ( ) [ 0 , 1 ] .

3.4. On the Correctness of the Generalizations of Definitions of Distance, Divergence and Similarity Measures

As was mentioned, the APC { P_σ }_{σ∈S_n} of a probability distribution consists of a single element, and this element is the probability distribution itself. Obviously, if the two fuzzy measures in the distance, divergence, and similarity definitions developed in this article are themselves probability measures, then the generalized expressions should coincide with the corresponding classical formulas for probability measures.
Proposition 5. 
If in the role of fuzzy measures g 1 and g 2 in Formulas (47)–(64) and (68) we consider probability distributions g 1 = P ( 1 ) and g 2 = P ( 2 ) , then for any F —distance generator these formulas coincide with the Formulas (2)–(13) and (65) of classical probabilistic definitions, respectively.
Proof. 
Let us show the validity of this proposition in the cases of distance, divergence, and similarity relations.
(a) Let us show for a distance that
D ( F , d ) ( P ( 1 ) , P ( 2 ) ) = d ( P ( 1 ) , P ( 2 ) )
for any F —distance generator function.
Consider
D^(F,d)(P^(1), P^(2)) = F( d(P_1(P^(1)), P_1(P^(2))) , … , d(P_m(P^(1)), P_m(P^(2))) ) .
Since each associated probability of a probability distribution coincides with the distribution itself, that is, P_i(P^(1)) = P^(1) and P_i(P^(2)) = P^(2) , i = 1, 2, …, m , we obtain
D^(F,d)(P^(1), P^(2)) = F( d(P^(1), P^(2)) , … , d(P^(1), P^(2)) ) = d(P^(1), P^(2)) ,
because F is idempotent.
(b) Let us show for the divergence that
D v R ( F , α ) ( P ( 1 ) P ( 2 ) ) = D v R ( α ) ( P ( 1 ) P ( 2 ) )
for any F —distance generator function. Similarly, to the previous case, we have
D_vR^(F,α)(P^(1) ‖ P^(2)) = F( D_vR^(α)(P_1(P^(1)) ‖ P_1(P^(2))) , … , D_vR^(α)(P_m(P^(1)) ‖ P_m(P^(2))) ) = F( D_vR^(α)(P^(1) ‖ P^(2)) , … , D_vR^(α)(P^(1) ‖ P^(2)) ) = D_vR^(α)(P^(1) ‖ P^(2)) .
(c) A similar argument holds for Huang's F—index DSI^(F)(⋅,⋅) (proof omitted):
D S I ( F ) ( P ( 1 ) , P ( 2 ) ) = D S I ( P ( 1 ) , P ( 2 ) )

4. The Use of Generalized Distance, Divergence, and Similarity Parameters in Fuzzy Measure Identification Problems for G ( 2 ) ( X ) , G C h ( 2 ) ( X ) , G S ( X ) , G P o s ( X ) and G F , m Fuzzy Measures Classes

Let us briefly review fuzzy measure identification problems and results that are widely used in interactive MADM models.
In [41], the practical applicability of two probability representations of a finite fuzzy measure—the Campos-Bolanos representation (CBR), equivalent to the APC of a fuzzy measure, and the Murofushi-Sugeno representation (MSR)—is examined within the context of multi-criteria decision-making (MCDM) models. This work constructs a new MSR-type representation-interpreter specifically for a particular class of finite fuzzy measures. In [41], a universal interpreter for a capacity (fuzzy measure) in probability MSR, under the Choquet integral framework and second-order dual capacities, is explored. The next research focus is on analyzing fuzzy measure non-additivity indexes, which are relevant for interactive MCDM models where attribute interactions are observed. The non-additivity index effectively measures the degree of attribute interaction. In [42], this index is employed to evaluate the range of advantages concerning decision-maker alternatives. Ref. [43] introduces the use of the non-additivity index to replace the Shapley concurrent interaction index and develop an updated MADM decision scheme. A method for calculating the non-additivity index and a decision support algorithm to establish dominance relationships for optimal alternatives ranking are also presented. Key properties of the non-additivity index are considered in [44], along with a capacity identification algorithm based on this index. The algorithm uses linear constraints to reflect decision-maker advantages over alternatives and formulates a linear programming problem to determine the optimal capacity. A capacity identification simulation algorithm based on the non-additivity index is developed in [45]. Another research direction involves the additivity defectiveness of capacities, as discussed in [46]. This paper introduces the concept of capacity defectiveness, representing the degree of capacity non-additivity, and calculates or approximates the defectiveness coefficient for certain capacity classes.
An optimal approximation approach for fuzzy integrals is also developed for replacing fuzzy measures with classical measures. The identification of fuzzy measures, including interaction indexes and importance values (Shapley values), is further examined in [47], where various fuzzy measure representations, such as Möbius transformations and k-order additive measures, are considered. Ref. [48] shows that every discrete fuzzy measure can be represented as a k-order additive fuzzy measure, and presents alternative representation methods using interaction indexes and Shapley values. A learning algorithm for identifying k-maxitive measures based on heuristic least mean squares is given in [49]. Ref. [50] explores the structure and properties of a specific fuzzy measure type applicable to interactive MADM models, utilizing interaction coefficients, Möbius representation and dual fuzzy measures. Ref. [49] introduces the generalized interaction index, or g-index, which requires significant computational resources, and presents algorithms to calculate the g-index for k-maxitive measures. Ref. [51] provides a new visualization scheme for understanding fuzzy measures, while Ref. [52] examines a joint Choquet integral-fuzzy measures operator that uses attribute interactions. Ref. [53] utilizes a hesitant fuzzy linguistic term set to describe attribute interactivity in fuzzy measure identification. Finally, Ref. [47] reviews current approaches to fuzzy measure identification and their advantages and limitations, with Ref. [54] discussing fuzzy measure representations for learning Choquet and Sugeno integrals.
In this section, we present a completely new approach to fuzzy measure identification problems. Conditional optimization problems will be constructed, where the similarity, distance, and divergence parameters defined in the previous sections for two fuzzy measures will be used as objective functions, and the requirements and data of the interactive MADM will be considered in the constraints. The identification fuzzy measure APC will be considered as optimization unknown variables. Specific identification will be performed for specific fuzzy measure classes.
As a first example, let us consider the class G^(2)(X) of two-additive fuzzy measures. Suppose we are given an interactive MADM model [55,56] with possible alternatives D = ( d_1, d_2, …, d_m ) and attributes X = { x_1, x_2, …, x_n } . Suppose that in the interactive MADM model, the attribute importance values { I_1, I_2, …, I_n } and pairwise interactions are known in the form of a symmetric matrix { I_ij } , i = 1, …, n ; j = 1, …, n , i ≠ j [57,58,59,60,61,62]. As is known from [26], if ĝ ∈ G^(2)(X) , then its associated probabilities are calculated by the following formula: for every σ ∈ S_n ,
P ^ σ ( x σ ( i ) ) P σ ( g ^ ) ( x σ ( i ) ) = I σ ( i ) + 1 2 j = 1 i 1 I σ ( i ) σ ( j ) 1 2 j = i + 1 n I σ ( i ) σ ( j ) ,         i = 1 , , n .
In (76), if i = 1 , then the second addend is equal to 0, and if i = n , then the third addend is equal to 0. Assume that in the decision-making matrix there are alternatives with ratings { ξ i j } from [0, 1] (see Table 1):
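Formula (76) builds the whole APC of a two-additive measure from the importance values and the symmetric interaction matrix. A sketch of that computation follows; the importance and interaction numbers in the test are invented stand-ins, not those of Table 3, and a valid two-additive measure is assumed so that all resulting values are nonnegative.

```python
from itertools import permutations

def apc_two_additive(I, Iij):
    """Associated probability class of a two-additive fuzzy measure from
    importance values I[k] and a symmetric interaction matrix Iij[k][l] with
    zero diagonal (formula (76)). Returns a dict mapping each permutation
    sigma to the vector (P_sigma(x_sigma(1)), ..., P_sigma(x_sigma(n)))."""
    n = len(I)
    apc = {}
    for sigma in permutations(range(n)):
        p = []
        for i in range(n):
            v = I[sigma[i]]
            # + half the interactions with already-placed attributes ...
            v += 0.5 * sum(Iij[sigma[i]][sigma[j]] for j in range(i))
            # ... minus half the interactions with the remaining ones.
            v -= 0.5 * sum(Iij[sigma[i]][sigma[j]] for j in range(i + 1, n))
            p.append(v)
        apc[sigma] = p
    return apc
```

Because each pairwise interaction enters once with +1/2 and once with -1/2 along any permutation, every vector of the APC sums to Σ_k I_k, i.e., to 1 when the importance values are normalized.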
Suppose that the index of uncertainty of expert evaluations on the attributes X = { x_1, x_2, …, x_n } is represented by a fuzzy measure g ∈ G(X) . If there is a dependence between the attributes in the form of interactions, then it is entirely appropriate to use the finite Choquet integral as an aggregation tool [20]. Then, to obtain a scalar evaluation for the ranking of each alternative d_i and to aggregate its ratings ( ξ_i1, ξ_i2, …, ξ_ij, …, ξ_in ) for the final decision, we will use the value of the Choquet integral with respect to the fuzzy measure g [20]. We highlight the authors' publications [63,64] on extensions of the Choquet and Sugeno integral operators based on associated probabilities. We also note that this aggregated value is known in fuzzy statistics as the monotone expectation ( ME(d_i) ) [23], i.e.,
M E ( d i ) = 0 + g ( { x j X ,       ξ i j α } ) d α   ,         i = 1 , 2 , , m ,
or
M E ( d i ) = j = 1 n P τ ( g ) ( x τ ( j ) ) ξ i τ ( j )   ,         i = 1 , 2 , , m ,
where τ = ( τ(1), …, τ(n) ) ∈ S_n is a permutation for which 1 ≥ ξ_{iτ(1)} ≥ ξ_{iτ(2)} ≥ … ≥ ξ_{iτ(n)} ≥ 0 .
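Computationally, (78) says: sort the ratings of an alternative in descending order, pick the associated probability attached to that sorting permutation, and take the ordinary expectation. A minimal sketch, assuming the APC is stored as a dict mapping each permutation τ to the vector ( P_τ(x_τ(1)), …, P_τ(x_τ(n)) ); the toy APC in the test is additive, so the result coincides with a plain expected value.

```python
def monotone_expectation(ratings, apc):
    """Choquet-integral aggregation (monotone expectation) of an alternative's
    ratings via the APC: choose the permutation tau sorting the ratings in
    descending order and average the ratings under P_tau."""
    n = len(ratings)
    tau = tuple(sorted(range(n), key=lambda k: -ratings[k]))
    p_tau = apc[tau]  # vector listed in the order x_tau(1), ..., x_tau(n)
    return sum(p_tau[j] * ratings[tau[j]] for j in range(n))
```

For a genuinely non-additive measure, different alternatives may be scored by different associated probabilities, which is exactly how attribute interaction enters the aggregation.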
Now let us consider the notion of the best approximation of a particular fuzzy measure g¯ ∈ G(X) by a fuzzy measure from a given class. The idea is that, for a fixed fuzzy measure ĝ from the given subclass, we seek the fuzzy measure whose distance or divergence to ĝ is the smallest, or whose similarity to ĝ is the greatest. In doing so, we determine a fuzzy measure g¯ ∈ G(X) similar to a concrete approximating fuzzy measure from the given class. Consider Definitions 16–19, which cover different cases of approximation.
Definition 16. 
The fixed two-additive fuzzy measure ĝ ∈ G^(2)(X) is called the F—best approximation of the fuzzy measure g¯ ∈ G(X) according to the similarity relation if, for the F—distance generator,
D S I ( F ) ( g ¯ , g ^ ) = sup g G ( X ) D S I ( F ) ( g , g ^ ) .
That is, among all fuzzy measures, g¯ is the one most similar to the known two-additive fuzzy measure ĝ .
Analogous definitions can be made in relation to the distance and divergence between fuzzy measures.
Definition 17. 
A fixed two-additive fuzzy measure ĝ ∈ G^(2)(X) is called the F—best approximation of the fuzzy measure g¯ ∈ G(X) according to the D^(F,d)—distance relation if, for the F—distance generator,
D ( F , d ) ( g ¯ , g ^ ) = inf g G ( X ) D ( F , d ) ( g , g ^ ) .
Note that the D^(F,d)—distance in Definition 17 can be replaced by the D_B^(F)—distance defined above or by the other distances generalized here.
Definition 18. 
The fixed two-additive fuzzy measure ĝ ∈ G^(2)(X) is called the F—best approximation of the fuzzy measure g¯ ∈ G(X) according to the D_vR^(F,α)—divergence if, for the F—distance generator,
D v R ( F , α ) ( g ¯ g ^ ) = inf g G ( X ) D v R ( F , α ) ( g g ^ ) .
Note here again that the D v R ( F , α ) —divergence in Definition 18 can be replaced by divergences D v K L ( F )   ,       D v J ( F )   ,       D v J S ( F ) defined above or other generalized ones mentioned here.
Note that the approximation domain G^(2)(X) in Definitions 16, 17, and 18 can be replaced by other practically important subclasses of fuzzy measures presented here: G_S(X)—Sugeno λ—additive fuzzy measures, G_Ch^(2)(X)—Choquet second-order capacities, G_Pos(X)—possibility measures, G_{F,m}(X)—fuzzy measures associated with a body of evidence, and other classes. Obviously, we consider those subclasses of fuzzy measures for which the associated probability classes have been derived and whose representations involve only the basic data of these classes: for example, formula (31) for possibility measures; formula (17) for the Sugeno λ—additive measure; formula (37) for the measure associated with the body of evidence. Therefore, let us introduce analogous definitions for these classes of fuzzy measures.
Definition 19. 
The fixed g ^ λ G S ( X ) —Sugeno λ —additive fuzzy measure is called the F —best approximation of the fuzzy measure g ¯ G ( X ) according to the D ( F , d ) —distance relation (according to the D v ( F ) —divergence relation), if for the F —distance generator
D ( F , d ) ( g ¯ , g ^ λ ) = inf g G ( X ) D ( F , d ) ( g , g ^ λ ) ( D v ( F ) ( g ¯ , g ^ λ ) = inf g G ( X ) D v ( F ) ( g , g ^ ) ) .
Let us now introduce a completely new type of fuzzy measure identification problem in the MADM environment based on the definitions given here. The general idea is as follows: a scalar conditional optimization problem is constructed whose objective function is one of the approximation criteria. The constraints are built from the data and restrictions of the particular MADM problem, and the unknown optimization variables are the associated probabilities from the APC to be identified. Specifically, Definitions 16–18 in the MADM environment allow us to construct the model uncertainty index, i.e., a fuzzy measure g¯ that is the best approximation from the given class of fuzzy measures, while taking the other MADM data into account as constraints. For example, from a practical point of view, experts may provide approximate interval evaluations of the monotone expectation of some alternatives (confidence intervals and the like). Consider a particular problem with respect to the DSI index for the class G^(2)(X). To solve it, we formulate a conditional optimization problem in which the associated probability class { P_σ(⋅) }_{σ∈S_n} of the fuzzy measure g¯ is unknown:
{ DSI^(F)(g, ĝ) = F( DSI(P_σ, P̂_σ) , σ ∈ S_n ) → max
S.C. Σ_{i=1}^{n} P_σ(x_i) = 1 , σ ∈ S_n ,
0 ≤ P_σ(x_i) ≤ 1 , σ ∈ S_n , i = 1, 2, …, n ,
M_{i_k}^− ≤ ME(d_{i_k}) = Σ_{j=1}^{n} P_{τ_k}(x_{τ_k(j)}) ξ_{i_k τ_k(j)} ≤ M_{i_k}^+ , k = 1, 2, …, s ,
where ME(d_{i_k}) are the monotone expected values of the alternatives d_{i_1}, d_{i_2}, …, d_{i_s} , evaluated by the experts as intervals [ M_{i_k}^− , M_{i_k}^+ ] . In the formulas of (83), for each i_k there is a permutation τ_k ∈ S_n such that ξ_{i_k τ_k(1)} ≥ ξ_{i_k τ_k(2)} ≥ … ≥ ξ_{i_k τ_k(n)} .
The fuzzy measure g¯ ∈ G(X) that solves the conditional optimization problem (83) is the closest (most similar) fuzzy measure to the second-order additive capacity ĝ that satisfies the constraints on the associated probabilities, namely the attribute interaction indexes and the MADM data on expert evaluations of the expected ratings of the alternatives. Consider a numerical example of (83):
Example 1. 
Let F = F 1   ,       n = 3 and m = 4 . Suppose that the decision-making matrix is as follows (Table 2):
Assume that the importance values { I j } of the attributes and the indexes of their pairwise interaction { I i j } are as follows (Table 3):
Using Table 3 and Formulas (76), calculate the associated probabilities class { P ^ σ ( ) } σ S 3 of the fuzzy measure g ^ G ( 2 ) ( X ) (see Table 4).
Then (83) will take a specific form with respect to the unknown associated probabilities values { P σ ( ) } :
{ DSI^(F1)(g, ĝ) = (1/3!) Σ_{j=1}^{6} ( 2 Σ_{i=1}^{3} P_{σ_j}(x_i) P̂_{σ_j}(x_i) / Σ_{i=1}^{3} [ ( P_{σ_j}(x_i) )^2 + ( P̂_{σ_j}(x_i) )^2 ] ) → max
S.C. 0 ≤ P_{σ_j}(x_i) ≤ 1 , j = 1, 2, …, 6 , i = 1, 2, 3 ,
Σ_{i=1}^{3} P_{σ_j}(x_i) = 1 , j = 1, 2, …, 6 ,
0.45 ≤ Σ_{i=1}^{3} P_{σ_5}(x_{σ_5(i)}) ξ_{2σ_5(i)} = ME(d_2) ≤ 0.65 , σ_5 = (3, 1, 2) ,
0.35 ≤ Σ_{i=1}^{3} P_{σ_1}(x_{σ_1(i)}) ξ_{4σ_1(i)} = ME(d_4) ≤ 0.55 , σ_1 = (1, 2, 3) ,
where
( ξ 2 σ 5 ( 1 ) ,       ξ 2 σ 5 ( 2 ) ,         ξ 2 σ 5 ( 3 ) ) = ( 0.8 , 0.6 , 0.4 ) , ( ξ 4 σ 1 ( 1 ) ,       ξ 4 σ 1 ( 2 ) ,         ξ 4 σ 1 ( 3 ) ) = ( 0.8 , 0.7 , 0.4 ) .
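Problem (84) is a smooth constrained maximization over the 6 × 3 unknown associated probabilities. The sketch below only sets up the objective and a feasibility check; the target APC values are placeholders invented for illustration (not Table 4's numbers), and a general-purpose constrained NLP solver (for instance SLSQP-type methods) would then be applied to these functions.

```python
# All APCs use the same convention: a dict mapping each permutation sigma
# to a probability vector aligned with that permutation.
PERMS = [(0, 1, 2), (0, 2, 1), (1, 0, 2), (1, 2, 0), (2, 0, 1), (2, 1, 0)]

# Hypothetical target APC of the two-additive measure (stand-in numbers).
P_hat = {s: (0.5, 0.3, 0.2) for s in PERMS}

def dsi(p, q):
    """Huang's similarity index of two aligned probability vectors."""
    num = 2 * sum(pi * qi for pi, qi in zip(p, q))
    den = sum(pi ** 2 for pi in p) + sum(qi ** 2 for qi in q)
    return num / den

def objective(apc):
    """DSI^(F1): mean per-permutation similarity to P_hat (to be maximized)."""
    return sum(dsi(apc[s], P_hat[s]) for s in PERMS) / len(PERMS)

def feasible(apc, me_bounds):
    """Check the simplex constraints and the monotone-expectation intervals.
    me_bounds maps a permutation tau to (lo, hi, ratings_sorted_along_tau)."""
    for s in PERMS:
        p = apc[s]
        if abs(sum(p) - 1) > 1e-9 or any(pi < 0 for pi in p):
            return False
    for tau, (lo, hi, xi) in me_bounds.items():
        me = sum(pj * xj for pj, xj in zip(apc[tau], xi))
        if not (lo <= me <= hi):
            return False
    return True
```

At the target APC itself the objective equals 1, its global maximum, so any feasible solution found by a solver reports how close the identified measure can get to ĝ under the MADM constraints.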
The numerical solution of the conditional optimization problem (84) as an associated probabilities class of the fuzzy measure g ¯ G ( X ) is presented in Table 5.
The maximum similarity index equals DSI^(F1)(g¯, ĝ) = 0.89997 .
Of course, any of the distance Formulas (47)–(52) or divergence Formulas (53)–(64) could be used as the objective function in problems (83) and (84). As a minimization criterion, we could use the distances D_2 , D_B^(F) ( D_B^(Fq) , D_B^(Fmax) ), or the divergences D_vR^(F,α) , D_vKL^(F) ( D_vKL^(Fq) , D_vKL^(Fmax) ), D_vJ^(F) ( D_vJ^(Fmax) , D_vJ^(Fq) ), D_v^(F,λ) , and D_vJS^(F) ( D_vJS^(Fq) , D_vJS^(Fmax) ). Consider another example:
Example 2. 
We consider the conditional optimization problem (83)–(84), but choose Jeffrey's F_1—divergence as the objective function; the constraints remain the same. The MADM data are unchanged, that is, the class of associated probabilities { P̂_σ(⋅) }_{σ∈S_3} is the same. We obtain the following conditional optimization problem:
{ D_vJ^(F1)(g, ĝ) = (1/3!) Σ_{j=1}^{6} ( Σ_{i=1}^{3} [ P_{σ_j}(x_i) − P̂_{σ_j}(x_i) ] log( P_{σ_j}(x_i) / P̂_{σ_j}(x_i) ) ) → min
S.C. Σ_{i=1}^{3} P_{σ_j}(x_i) = 1 , j = 1, 2, …, 6 ,
0 ≤ P_{σ_j}(x_i) ≤ 1 , j = 1, 2, …, 6 ; i = 1, 2, 3 ,
0.45 ≤ Σ_{i=1}^{3} P_{σ_5}(x_{σ_5(i)}) ξ_{2σ_5(i)} ≤ 0.65 ,
0.35 ≤ Σ_{i=1}^{3} P_{σ_1}(x_{σ_1(i)}) ξ_{4σ_1(i)} ≤ 0.55
Table 6 presents the results of the numerical solution of the conditional optimization problem (85) in the form of the class of associated probabilities of the fuzzy measure g ¯ G ( X ) .
The minimum divergence value equals D_vJ^(F1)(g¯, ĝ) = 0.00021 .
Of course, for the identification of the associated probabilities class of a fuzzy measure, one could also consider multi-objective conditional optimization problems by considering several objective functions. This should be derived from the issues of the synthesis of a specific research problem.
Consider one last example:
Example 3. 
Consider another example. Let us choose as the approximation domain of the fuzzy measure g¯ the class of fuzzy measures G_{F,m}(X) associated with a body of evidence ⟨F, m⟩ defined on 2^X , where for each fuzzy measure the associated probabilities are calculated as follows (Formula (38)), for every σ ∈ S_n :
P σ ( x σ ( i ) ) = A j F : A j { x σ ( i ) } m ( A j ) w j ( | A j { x σ ( 1 ) , , x σ ( i ) } | ) ,       i = 1 , , n .
As the index of the difference between the approximation fuzzy measure g ¯ and the fuzzy measure g ^ for the objective function of the corresponding conditional optimization problem, let us choose the Jeffrey’s symmetric F 2 —divergence D v J ( F 2 ) ( , ) :
D_vJ^(F2)(g, ĝ) = ( (1/n!) Σ_{σ∈S_n} ( Σ_{i=1}^{n} [ P_σ(x_i) − P̂_σ(x_i) ] [ log_2 P_σ(x_i) − log_2 P̂_σ(x_i) ] )^2 )^{1/2} .
To construct a numerical example, let us take the same matrix from Table 2 as the decision-making matrix of the MADM model. Consider the Choquet integral with respect to the fuzzy measure g ¯ as an aggregation operator. Consider the following body of evidence F , m : F = { A 1 , A 2 } ,     A 1 = { x 1 , x 2 } ,     A 2 = { x 2 , x 3 }   ,       m ( A 1 ) = 2 3     ,       m ( A 2 ) = 1 3     and the vector of allocation weights W = { w 1 , w 2 } , w 1 = { w 1 ( 1 ) , w 1 ( 2 ) } = { 3 4 , 1 4 }   ;       w 2 = { w 2 ( 1 ) , w 2 ( 2 ) } = { 1 5 , 4 5 } . Table 7 presents the fuzzy measure g ^ associated with this body of evidence and weights, and more precisely, its associated probability class { P ^ σ } σ S 3 , for this we used Formula (38).
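Under my reading of the formula above, the APC induced by a body of evidence can be computed mechanically: along each permutation, every focal element distributes its mass according to how many of its attributes have already appeared. A sketch using exact rational arithmetic, fed with the Example 3 data (the specific output values below are my own derivation from that formula, not a quotation of Table 7):

```python
from fractions import Fraction as Fr
from itertools import permutations

def apc_from_evidence(n, focal, masses, weights):
    """APC of the fuzzy measure induced by a body of evidence:
    focal elements as sets of attribute indices, basic probability masses,
    and per-focal-element allocation weight vectors w_j(1..|A_j|).
    Returns {sigma: (P_sigma(x_sigma(1)), ..., P_sigma(x_sigma(n)))}."""
    apc = {}
    for sigma in permutations(range(n)):
        p = []
        for i in range(n):
            seen = set(sigma[: i + 1])  # attributes placed so far
            v = Fr(0)
            for A, mA, w in zip(focal, masses, weights):
                if sigma[i] in A:
                    # w is 1-indexed in the text, 0-indexed here.
                    v += mA * w[len(A & seen) - 1]
            p.append(v)
        apc[sigma] = p
    return apc

# Example 3's body of evidence on X = {x1, x2, x3}.
focal = [{0, 1}, {1, 2}]
masses = [Fr(2, 3), Fr(1, 3)]
weights = [[Fr(3, 4), Fr(1, 4)], [Fr(1, 5), Fr(4, 5)]]
apc = apc_from_evidence(3, focal, masses, weights)
```

Since each weight vector sums to 1, every focal mass is fully allocated along any permutation, so each APC row is a genuine probability distribution.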
To identify the fuzzy measure g ¯ as the MADM uncertainty index, consider the conditional optimization problem with the constraints of the previous examples and the divergence D v J ( F 2 ) ( , ) minimization criterion:
{ D_vJ^(F2)(g, ĝ) = ( (1/3!) Σ_{j=1}^{6} ( Σ_{i=1}^{3} [ P_{σ_j}(x_i) − P̂_{σ_j}(x_i) ] [ log_2 P_{σ_j}(x_i) − log_2 P̂_{σ_j}(x_i) ] )^2 )^{1/2} → min
S.C. Σ_{i=1}^{3} P_{σ_j}(x_i) = 1 , j = 1, 2, …, 6 ,
0 ≤ P_{σ_j}(x_i) ≤ 1 , j = 1, 2, …, 6 ; i = 1, 2, 3 ,
0.45 ≤ Σ_{i=1}^{3} P_{σ_5}(x_{σ_5(i)}) ξ_{2σ_5(i)} ≤ 0.65 ,
0.35 ≤ Σ_{i=1}^{3} P_{σ_1}(x_{σ_1(i)}) ξ_{4σ_1(i)} ≤ 0.55
The associated probability class of the fuzzy measure g ¯ G ( X ) for the solution of the problem (88) is given in Table 8:
The minimum value of the objective function is D_vJ^(F2)(g¯, ĝ) = 0.00506 .
For all three examples, at the last stage, we complete the MADM problem presented in Table 2 and rank the alternatives d 1 , d 2 , d 3 , d 4 using the identified fuzzy measure g ¯ . Aggregation will be accomplished again with the Choquet integral
M E ( d i ) = j = 1 3 P τ ( i ) ( x τ ( i ) ( j ) ) ξ i τ ( i ) ( j )   ,         i = 1 , 2 , 3 , 4 ,
where for each alternative d i there is a permutation τ ( i ) = ( τ ( i ) ( 1 ) , τ ( i ) ( 2 ) , τ ( i ) ( 3 ) ) S 3 such that
ξ i τ ( i ) ( 1 ) ξ i τ ( i ) ( 2 ) ξ i τ ( i ) ( 3 ) .
Table 9 summarizes the aggregation values for all three examples, and Table 10 presents the ranking of the alternatives.
It becomes clear that the alternative d 4 is the best for all problems.
Comparative Analysis: We note again that the examples given here are for illustrative purposes only; their purpose is to make it easy for the reader to construct an analogous optimization model. Nevertheless, a brief comparative analysis can be conducted. Examples 1 and 2 are practically indistinguishable in that the MADM environment is the same and only the objective functions differ. In Example 1, the objective function is the similarity parameter DSI^(F1)(⋅,⋅) and its maximization is considered; the attained maximum similarity is quite high— DSI^(F1) = 0.89997 . In Example 2, the objective function is the divergence parameter D_vJ^(F1)(⋅,⋅) and its minimization is considered; here the attained minimum divergence is quite small— D_vJ^(F1) = 0.00021 . Both examples consider an interactive MADM environment with pairwise interaction of attributes, where the two-additive fuzzy measure is taken as the uncertainty index (Table 4). In both examples, the approximation class is the same and the similarity between the identified fuzzy measures is high (Table 5 and Table 6). Therefore, the rankings of the alternatives should be almost identical (in our case they are completely identical; Table 10).
As for the third example, its MADM environment partially differs from that of the previous two. Here, the uncertainty index is taken to be a body of evidence and the class of fuzzy measures associated with it (86), with a specific allocation weight vector and a specific body of evidence; the probabilistic representations of the associated fuzzy measure are given in Table 7. All three examples agree in that the constraints of the conditional optimization problems are the same. In the third example, the objective function is again Jeffrey's divergence, now D_vJ^(F2)(⋅,⋅), and its minimization is considered; here too the attained minimum divergence is quite small— D_vJ^(F2) = 0.00506 . As a result, the ranking of the alternatives in the third example partially coincides with the rankings in the first and second examples.
We tested different divergence and similarity parameters in the role of the objective function for the same MADM environment, and we also varied the MADM environment itself. For a fixed MADM environment, the resulting rankings of the alternatives are almost identical when the finite Choquet integral is taken as the aggregation operator, which illustrates the stability of the identification problem constructed in the article with respect to the choice of objective function.
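The closeness of the identified measures can be checked mechanically. The sketch below is a rough illustration, not the paper's DvJ definition: it applies the Jensen–Shannon divergence pairwise to the associated probabilities transcribed from Tables 4 and 5 and averages the six values, with the arithmetic mean standing in for a distance generator.

```python
import math

# APCs transcribed from Tables 4 and 5: one probability vector per
# permutation sigma of {1, 2, 3}.  Illustrative only -- the paper's exact
# DvJ divergence and its distance generator are not reproduced here.
apc_g_hat = [  # Table 4
    [0.100, 0.425, 0.475],
    [0.100, 0.525, 0.375],
    [0.250, 0.275, 0.475],
    [0.500, 0.275, 0.225],
    [0.350, 0.525, 0.125],
    [0.500, 0.375, 0.125],
]
apc_g_bar = [  # Table 5
    [0.0855, 0.4398, 0.4747],
    [0.1103, 0.5322, 0.3575],
    [0.2500, 0.2750, 0.4750],
    [0.5000, 0.2750, 0.2250],
    [0.3500, 0.5250, 0.1250],
    [0.5000, 0.3750, 0.1250],
]

def kl(p, q):
    """Kullback-Leibler divergence between two finite distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence (symmetric and bounded)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def apc_divergence(apc1, apc2):
    """Average pairwise JS divergence over matching permutations; the
    arithmetic mean plays the role of a distance generator here."""
    return sum(js(p, q) for p, q in zip(apc1, apc2)) / len(apc1)

d = apc_divergence(apc_g_hat, apc_g_bar)
print(f"divergence ~ {d:.5f}")  # small, consistent with high similarity
```

The computed value is small, consistent with the high similarity reported above, although its exact magnitude depends on the chosen divergence and generator.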

5. Conclusions

The presented paper discusses binary relations of distance, divergence, and similarity defined on the space of finite fuzzy measures, which, in a certain sense, generalize the analogous binary relations defined on the space of finite probability distributions. The correctness of the generalizations is proved. The generalizations are based on the fuzzy measure's associated probability class (APC): the distance, divergence, and similarity parameters between two fuzzy measures are determined by the corresponding parameters between their APCs. To define these parameters between APCs, the concept of a distance generator is introduced, which scales the values computed on the APC into the scalar value of the corresponding parameter.
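As a minimal sketch of the APC construction underlying these definitions: for each permutation σ of the ground set, the associated probability of an element is the increment of the fuzzy measure along the chain of sets built in the order of σ. The measure g below is a made-up example on a three-element set, not one taken from the paper.

```python
from itertools import permutations

# A made-up monotone fuzzy measure on X = {1, 2, 3}, given over frozensets,
# with g(empty) = 0 and g(X) = 1.
g = {
    frozenset(): 0.0,
    frozenset({1}): 0.1, frozenset({2}): 0.3, frozenset({3}): 0.2,
    frozenset({1, 2}): 0.5, frozenset({1, 3}): 0.4, frozenset({2, 3}): 0.6,
    frozenset({1, 2, 3}): 1.0,
}

def associated_probabilities(g, universe):
    """Return {sigma: probability vector ordered by element of the universe},
    where P_sigma(x_sigma(i)) = g(A_i) - g(A_{i-1}) with A_i the first i
    elements of sigma."""
    apc = {}
    for sigma in permutations(universe):
        p, prev = {}, frozenset()
        for x in sigma:
            cur = prev | {x}
            p[x] = g[cur] - g[prev]  # increment of g along the chain
            prev = cur
        apc[sigma] = [p[x] for x in universe]
    return apc

apc = associated_probabilities(g, (1, 2, 3))
# every associated probability sums to 1 by construction, since g(X) = 1
for p in apc.values():
    assert abs(sum(p) - 1.0) < 1e-12
```

A distance or divergence between two fuzzy measures is then obtained by applying an ordinary probabilistic distance to the matching pairs of associated probabilities and collapsing the resulting 3! values with a distance generator (for instance a mean or a maximum).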
The concepts of distance, divergence, and similarity between fuzzy measures are used in the fuzzy measure identification problem for a certain multi-attribute decision-making (MADM) environment. For this, a conditional optimization problem is formulated with a single objective function representing a distance, divergence, or similarity parameter. The constraints of the optimization problem represent restrictions on the MADM data as well as on the associated probabilities of the fuzzy measure being identified. At the extremum of the objective function, the identified fuzzy measure is the best approximation within the class of fuzzy measures admissible in the MADM environment.
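The identification scheme can be illustrated in the simplest, purely additive special case: search the probability simplex for the distribution closest, in a divergence sense, to a target while honoring side constraints. The target vector and the bound on the first coordinate below are hypothetical stand-ins for the MADM restrictions; the paper's actual problem optimizes over classes of fuzzy measures, not single distributions.

```python
import math

# Hypothetical target distribution playing the role of the given measure.
target = [0.10, 0.425, 0.475]

def kl(p, q):
    """Kullback-Leibler divergence; the objective function of the sketch."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Brute-force search over a discretized 2-simplex.
best, best_d = None, float("inf")
step = 0.005
n = int(round(1 / step))
for i in range(n + 1):
    for j in range(n + 1 - i):
        p = [i * step, j * step, 1 - (i + j) * step]
        # example constraint standing in for the MADM restrictions:
        if not (0.2 <= p[0] <= 0.5):
            continue
        d = kl(p, target)
        if d < best_d:
            best, best_d = p, d

print(best, best_d)  # optimum sits on the constraint boundary p[0] = 0.2
```

Because the target's first coordinate (0.10) lies outside the admissible interval, the minimizer is forced onto the boundary, mirroring how the identified fuzzy measure is the best admissible approximation rather than the target itself.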
The classes of second-order Choquet capacities, two-additive fuzzy measures, Sugeno λ-additive fuzzy measures, possibility measures, and fuzzy measures associated with a body of evidence are considered. Numerical examples are discussed and a comparative analysis of the obtained results is presented. In the conditional optimization problems corresponding to a simple example of three parallel MADMs, the same constraints are considered. The problems differ only in the admissible classes of the sought fuzzy measure, which reflect the nature of the specific problem, and in the choice of the objective function, which derives from the preferences of the decision-maker.
The results are obtained at the level of the ranking of the MADM alternatives. For the ranking, the data of each alternative are aggregated by the Choquet aggregation operator using the identified fuzzy measure. The results show some differences, which are due to the choice of the approximation class of the fuzzy measure and to the decision-maker's preferences in selecting the objective function.
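The aggregation step can be sketched with the discrete Choquet integral; the measure and scores below are illustrative, not the identified ones from the examples.

```python
# A made-up monotone fuzzy measure on X = {1, 2, 3} (g(empty)=0, g(X)=1).
g = {
    frozenset(): 0.0,
    frozenset({1}): 0.1, frozenset({2}): 0.3, frozenset({3}): 0.2,
    frozenset({1, 2}): 0.5, frozenset({1, 3}): 0.4, frozenset({2, 3}): 0.6,
    frozenset({1, 2, 3}): 1.0,
}

def choquet(values, g):
    """Discrete Choquet integral of values: {attribute: score in [0,1]}
    with respect to the fuzzy measure g."""
    order = sorted(values, key=values.get)  # attributes by increasing score
    total, prev = 0.0, 0.0
    remaining = set(values)
    for x in order:
        # weight each increment by the measure of the still-larger attributes
        total += (values[x] - prev) * g[frozenset(remaining)]
        prev = values[x]
        remaining.remove(x)
    return total

scores = {1: 0.3, 2: 0.7, 3: 0.5}  # one alternative's attribute scores
result = choquet(scores, g)
print(result)  # 0.3*1.0 + 0.2*0.6 + 0.2*0.3 = 0.48
```

When g is additive, this reduces to the ordinary weighted mean; non-additivity is exactly what lets the aggregation reflect attribute interaction.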
We have also considered the same examples in the case of the finite Sugeno integral; the resulting rankings of the alternatives agreed closely with those obtained with the Choquet integral. It would also be interesting to consider other aggregation integral operators that use fuzzy measures in their calculations.
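For comparison, the finite Sugeno integral replaces the Choquet sums by a max–min composition. Below is a sketch on made-up data (repeated here so the snippet is self-contained), not the paper's identified measures.

```python
# A made-up monotone fuzzy measure on X = {1, 2, 3} (g(empty)=0, g(X)=1).
g = {
    frozenset(): 0.0,
    frozenset({1}): 0.1, frozenset({2}): 0.3, frozenset({3}): 0.2,
    frozenset({1, 2}): 0.5, frozenset({1, 3}): 0.4, frozenset({2, 3}): 0.6,
    frozenset({1, 2, 3}): 1.0,
}

def sugeno(values, g):
    """Discrete Sugeno integral: max over i of min(h_(i), g(A_i)), where
    A_i is the set of the i top-scoring attributes."""
    order = sorted(values, key=values.get, reverse=True)
    top, best = set(), 0.0
    for x in order:
        top.add(x)
        best = max(best, min(values[x], g[frozenset(top)]))
    return best

scores = {1: 0.3, 2: 0.7, 3: 0.5}
s = sugeno(scores, g)
print(s)  # max(min(0.7,0.3), min(0.5,0.6), min(0.3,1.0)) = 0.5
```

Since max–min aggregation is ordinal, the Sugeno integral can rank alternatives similarly to the Choquet integral while being less sensitive to the exact numeric scores.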
Future studies will consider other classes of fuzzy measure approximations related to specific, practically important problems, as well as other divergence and distance generalizations for fuzzy measures that are beyond the scope of this article. The sensitivity of the results to the choice of aggregation operator will also be studied further.
In future studies, we will also consider the use of the metrics, similarity, and divergence relations defined on the space of fuzzy measures in aggregations with non-additive and interacting parameters or attributes, such as users' similarity relations in collaborative filtering systems, phase-space metrics for non-additive components of machine learning, and measures for clustering and classifying complex objects. It would also be natural to describe the relationships between the divergences discussed here and entropy measures.

Author Contributions

The authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shota Rustaveli National Scientific Foundation of Georgia (SRNSF), grant number [FR-22-969].

Data Availability Statement

The paper is original and, therefore, no data were used.

Acknowledgments

The authors are grateful to the anonymous reviewers for their valuable comments and suggestions in improving the quality of the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Mahalanobis, P.C. On the generalized distance in statistics. Sankhya Indian J. Stat. Ser. A 1936, 80, 1–7. [Google Scholar]
  2. Bhattacharyya, A. On discrimination and divergence. In Proceedings of the India Science Congress, Asiatic Society of Bengal, Calcutta, India, 1942; Available online: https://books.google.com.sg/books/about/Proceedings_of_the_Indian_Science_Congre.html?id=9mYbAAAAMAAJ&redir_esc=y (accessed on 15 August 2024).
  3. Nikulin, M.S. Hellinger distance 1994. In Encyclopedia of Mathematics; EMS Press, 2001; Available online: https://en.wikipedia.org/wiki/Hellinger_distance (accessed on 15 August 2024).
  4. Bhattacharyya, A. On a Measure of Divergence between Two Multinomial Populations. Sankhyā 1946, 7, 401–406. [Google Scholar]
  5. Wikipedia, F-Divergence. Available online: https://en.wikipedia.org/wiki/F-divergence (accessed on 15 August 2024).
  6. Rényi, A. On measures of entropy and information. In Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability, Los Angeles, CA, USA, 30 June–30 July 1960; University of California Press: Berkeley, CA, USA, 1961; pp. 547–561. [Google Scholar]
  7. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86. [Google Scholar] [CrossRef]
  8. Csiszar, I. $I$-Divergence Geometry of Probability Distributions and Minimization Problems. Ann. Probab. 1975, 3, 146–158. [Google Scholar] [CrossRef]
  9. Jeffreys, H. Theory of Probability, 2nd ed.; Oxford University Press: Oxford, UK, 1948. [Google Scholar]
  10. Schütze, H.; Manning, C.D. Foundations of Statistical Natural Language Processing; MIT Press: Cambridge, UK, 1999; p. 304. [Google Scholar]
  11. Sterreicher, F.; Vajda, I. A new class of metric divergences on probability spaces and its statistical applications. Ann. Inst. Statist. Math. 2003, 55, 639–653. [Google Scholar] [CrossRef]
  12. Nielsen, F. On a variational definition for the Jensen-Shannon summarization of distances based on the information radius. Entropy 2021, 23, 464. [Google Scholar] [CrossRef]
  13. Liang, Z.; Jianhui, L. The Similarity for Nominal Variables Based on F-Divergence. Int. J. Database Theory Appl. 2016, 9, 191–202. [Google Scholar] [CrossRef]
  14. Deng, J.; Wang, Y.; Guo, J.; Deng, Y.; Park, Y. A similarity measure based on Kullback–Leibler divergence for collaborative filtering in sparse data. J. Inf. Sci. 2018, 45, 656–675. [Google Scholar] [CrossRef]
  15. Pastore, M.; Calcagnì, A. Measuring Distribution Similarities Between Samples: A Distribution-Free Overlapping Index. Front. Psychol. 2019, 10, 1089. [Google Scholar] [CrossRef]
  16. Cai, Y.; Lim, L.-H. Distances Between Probability Distributions of Different Dimensions. IEEE Trans. Inf. Theory 2022, 68, 4020–4031. [Google Scholar] [CrossRef]
  17. Le, H.; Sang, V.N.T.; Thuy, L.N.L.; Bao, P.T. The fuzzy Kullback–Leibler divergence for estimating parameters of the probability distribution in fuzzy data: An application to classifying Vietnamese Herb Leaves. Sci. Rep. 2023, 13, 14537 . [Google Scholar] [CrossRef]
  18. Fei, L.; Deng, Y. A new divergence measure for basic probability assignment and its applications in extremely uncertain environments. Int. J. Intell. Syst. 2018, 34, 584–600. [Google Scholar] [CrossRef]
  19. Rachev, S.T.; Klebanov, L.B.; Stoyanov, S.V.; Fabozzi, F. The Methods of Distances in the Theory of Probability and Statistics; Springer: New York, NY, USA, 2013; p. 619. [Google Scholar]
  20. Choquet, G. Theory of Capacities. Ann. De L’institut Fourier 1953, 5, 131–295. [Google Scholar]
  21. Sugeno, M. Theory of Fuzzy Integrals and Its Applications. Ph.D. Thesis, Tokyo Institute of Technology, Tokyo, Japan, 1974. [Google Scholar]
  22. Denneberg, D. Non-Additive Measure and Integral; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1994. [Google Scholar] [CrossRef]
  23. Campos, L.M.; Bolanos, M.N. Representation of fuzzy measures through probabilities. Fuzzy Set Syst. 1989, 31, 23–36. [Google Scholar]
  24. Sirbiladze, G. New Fuzzy Aggregation Operators Based on the Finite Choquet Integral—Application in the MADM Problem. Int. J. Inf. Technol. Decis. Mak. 2016, 15, 517–551. [Google Scholar] [CrossRef]
  25. Sirbiladze, G.; Khutsishvili, I.; Midodashvili, B. Associated immediate probability intuitionistic fuzzy aggregations in MCDM. Comput. Ind. Eng. 2018, 123, 1–8. [Google Scholar] [CrossRef]
  26. Sirbiladze, G. Associated Probabilities in Interactive MADM under Discrimination q-Rung Picture Linguistic Environment. Mathematics 2021, 9, 2337. [Google Scholar] [CrossRef]
  27. Sirbiladze, G.; Garg, H.; Khutsishvili, I.; Ghvaberidze, B.; Midodashvili, B. Associated probabilities aggregations in multistage investment decision-making. Kybernetes 2021, 52, 1370–1399. [Google Scholar] [CrossRef]
  28. Sirbiladze, G.; Khvedelidze, T. Associated Statistical Parameters’ Aggregations in Interactive MADM. Mathematics 2023, 11, 1061. [Google Scholar] [CrossRef]
  29. Sirbiladze, G.; Kacprzyk, J.; Manjafarashvili, T.; Midodashvili, B.; Matsaberidze, B. New Fuzzy Extensions on Binomial Distribution. Axioms 2022, 11, 220. [Google Scholar] [CrossRef]
  30. Sirbiladze, G.; Kacprzyk, J.; Davitashvili, T.; Midodashvili, B. Associated Probabilities in Insufficient Expert Data Analysis. Mathematics 2024, 12, 518. [Google Scholar] [CrossRef]
  31. de Campos, L.M.; Lamata, M.T.; Moral, S. Distances between fuzzy measures through associated probabilities: Some applications. Fuzzy Sets Syst. 1990, 35, 57–68. [Google Scholar] [CrossRef]
  32. Sirbiladze, G.; Gachechiladze, T. Restored fuzzy measures in expert decision-making. Inf. Sci. 2005, 169, 71–95. [Google Scholar] [CrossRef]
  33. Grabisch, M. k-order additive discrete fuzzy measures and their representation. Fuzzy Sets Syst. 1997, 92, 167–189. [Google Scholar] [CrossRef]
  34. Wu, J.-Z.; Zhang, Q. 2-order additive fuzzy measure identification method based on diamond pairwise comparison and maximum entropy principle. Fuzzy Optim. Decis. Mak. 2010, 9, 435–453. [Google Scholar] [CrossRef]
  35. Dempster, A.P. Upper and Lower Probabilities Induced by a Multivalued Mapping. Ann. Math. Stat. 1967, 38, 325–339. [Google Scholar] [CrossRef]
  36. Shafer, G. A Mathematical Theory of Evidence; Princeton University Press: Princeton, NJ, USA, 1976. [Google Scholar]
  37. Kandel, A. On the control and evaluation of uncertain processes. IEEE Trans. Autom. Control 1980, 25, 1182–1187. [Google Scholar] [CrossRef]
  38. Dubois, D.; Prade, H. Possibility Theory; Plenum Press: New York, NY, USA, 1988. [Google Scholar]
  39. Yager, R.R. On the Entropy of Fuzzy Measures. IEEE Trans. Fuzzy Syst. 2000, 8, 453–561. [Google Scholar] [CrossRef]
  40. Huang, H. A New Index for Measuring the Difference Between Two Probability Distributions; CC-BY 4.0; Qeios: London, UK, 2024. [Google Scholar] [CrossRef]
  41. Sirbiladze, G.; Midodashvili, B.; Midodashvili, L.; Siprashvili, D. About One Representation-Interpreter of a Monotone Measure. J. Comput. Cogn. Eng. 2021, 1, 13–24. [Google Scholar] [CrossRef]
  42. Wu, J.-Z.; Beliakov, G. Nonadditivity Index Oriented Decision Preference Information Representation and Capacity Identification. Econ. Comput. Econ. Cybern. Stud. Res. 2020, 54, 281–297. [Google Scholar]
  43. Huang, L.; Wu, J.-Z.; Beliakov, G. Multicriteria correlation preference information (MCCPI) with nonadditivity index for decision aiding. J. Intell. Fuzzy Syst. 2020, 39, 3441–3452. [Google Scholar] [CrossRef]
  44. Wu, J.-Z.; Beliakov, G. Nonadditivity index and capacity identification method in the context of multicriteria decision making. Inf. Sci. 2018, 467, 398–406. [Google Scholar] [CrossRef]
  45. Grabisch, M.; Kojadinovic, I.; Meyer, P. A review of methods for capacity identification in Choquet integral based multi- at-tribute utility theory. Eur. J. Oper. Res. 2008, 186, 766–785. [Google Scholar] [CrossRef]
  46. Huang, L.; Wu, J.-Z.; Xi, R.-J. Nonadditivity Index Based Quasi-Random Generation of Capacities and Its Application in Comprehensive Decision Aiding. Mathematics 2020, 8, 301. [Google Scholar] [CrossRef]
  47. Grabisch, M. Alternative Representations of Discrete Fuzzy Measures for Decision Making. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 1997, 5, 587–607. [Google Scholar] [CrossRef]
  48. Javier, M.; Guillaume, S.; Pilar, B. k-maxitive fuzzy measures: A scalable approach 2 to model interactions. Fuzzy Sets Syst. 2017, 324, 33–48. [Google Scholar]
  49. Javier, M.; Serge, G.; Tewfik, S.; Pilar, B. An algorithm for computing the generalized interaction index for k-maxitive fuzzy measures. J. Intell. Fuzzy Syst. 2020, 38, 1–11. [Google Scholar]
  50. Ünver, M.; Özçelik, G.; Olgun, M. A fuzzy measure theoretical approach for multi criteria decision making problems containing sub-criteria. J. Intell. Fuzzy Syst. 2018, 35, 6461–6468. [Google Scholar] [CrossRef]
  51. Buck, A.R.; Anderson, D.T.; Keller, J.M.; Wilkin, T.; Islam, M.A. A Weighted Matrix Visualization for Fuzzy Measures and Integrals. In Proceedings of the 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Glasgow, UK, 19–24 July 2020; pp. 1–8. [Google Scholar] [CrossRef]
  52. Abdullah, L.; Awang, N.A.; Othman, M. Application of Choquet Integral-Fuzzy Measures for Aggregating Customers’ Satisfaction. Adv. Fuzzy Syst. 2021, 2021, 1–8. [Google Scholar] [CrossRef]
  53. Zhang, M.; Cao, C. A 2-order Additive Fuzzy Measure Identification Method Based on Hesitant Fuzzy Linguistic Interaction Degree and Its Application in Credit Assessment. Res. Sq. 2021; preprint. [Google Scholar] [CrossRef]
  54. Beliakov, G.; Divakov, D. On representation of fuzzy measures for learning Choquet and Sugeno integrals. Knowl. Based Syst. 2020, 189, 105134. [Google Scholar] [CrossRef]
  55. Grabisch, M. The representation of importance and interaction of features by fuzzy measures. Pattern Recognit. Lett. 1996, 17, 567–575. [Google Scholar] [CrossRef]
  56. Roubens, M. Interaction between criteria and definition of weights in MCDA problems. In Proceedings of the 44th Meeting of the European Working Group “Multicriteria Aid for Decisions”, Brussels, Belgium, 10 March 1996. [Google Scholar]
  57. Kojadinovic, I. Modeling interaction phenomena using fuzzy measures: On the notions of interaction and independence. Fuzzy Sets Syst. 2002, 135, 317–340. [Google Scholar] [CrossRef]
  58. Marichal, J.L.; Roubens, M. Dependence between criteria and multiple criteria decision aid. In Proceedings of the 2nd International Workshop on Preferences and Decisions, Trento, Italy, 1–3 July 1998; pp. 69–75. [Google Scholar]
  59. Beliakov, G.; Cabrerizo, F.J.; Herrera-Viedma, E.; Wu, J.-Z. Random generation of k-interactive capacities. Fuzzy Sets Syst. 2020, 430, 48–55. [Google Scholar] [CrossRef]
  60. Beliakov, G.; Divakov, D. Aggregation with dependencies: Capacities and fuzzy integrals. Fuzzy Sets Syst. 2022, 446, 222–232. [Google Scholar] [CrossRef]
  61. Sirbiladze, G.; Sikharulidze, A. Weighted fuzzy averages in fuzzy environment, Parts I, II. Int. J. Uncertain. Fuzziness Knowl. 2003, 11, 139–172. [Google Scholar] [CrossRef]
  62. Sirbiladze, G.; Manjafarashvili, T. Connections between Campos-Bolanos and Murofushi–Sugeno Representations of a Fuzzy Measure. Mathematics 2022, 10, 516. [Google Scholar] [CrossRef]
  63. Sirbiladze, G. Extremal Fuzzy Dynamic Systems: Theory and Applications; IFSR International Series on Systems Science and Engineering 28; Springer: New York, NY, USA; Heidelberg, Germany; Dordrecht, The Netherlands; London, UK, 2013. [Google Scholar]
  64. Sirbiladze, G. Associated Probabilities’ Aggregations in Interactive MADM for q-Rung Orthopair Fuzzy Discrimination Environment. Int. J. Intell. Syst. 2020, 35, 335–372. [Google Scholar] [CrossRef]
Figure 1. Fuzzy terms of similarity of two fuzzy measures. Q 1 = “very low similarity”, Q 2 = “low similarity”, Q 3 = “medium similarity”, Q 4 = “high similarity”, Q 5 = “very high similarity”.
Table 1. MADM decision-making matrix.
d \ x    x1     x2     …    xj     …    xn
d1
d2
⋮
di       ξi1    ξi2    …    ξij    …    ξin
⋮
dm
Table 2. Decision matrix.
di    x1     x2     x3     ME(di)
d1    0.3    0.7    0.5    -
d2    0.6    0.4    0.8    0.45 ≤ ME(d2) ≤ 0.65
d3    0.5    0.6    0.7    -
d4    0.8    0.7    0.4    0.35 ≤ ME(d4) ≤ 0.55
Table 3. Magnitudes of attribute values { I j } and pairwise interaction indexes { I i j } .
Iij    x1      x2      x3      Ij
x1     -       0.15    0.25    0.3
x2     0.15    -       0.10    0.4
x3     0.25    0.10    -       0.3
Table 4. The class of associated probabilities { P ^ σ ( ) } σ S 3 .
σ               P̂σ(x1)    P̂σ(x2)    P̂σ(x3)
σ1 = (1,2,3)    0.100      0.425      0.475
σ2 = (1,3,2)    0.100      0.525      0.375
σ3 = (2,1,3)    0.250      0.275      0.475
σ4 = (2,3,1)    0.500      0.275      0.225
σ5 = (3,1,2)    0.350      0.525      0.125
σ6 = (3,2,1)    0.500      0.375      0.125
Table 5. Associated probabilities class of the fuzzy measure g ¯ (for the problem (84)).
σ               Pσ(x1)    Pσ(x2)    Pσ(x3)
σ1 = (1,2,3)    0.0855    0.4398    0.4747
σ2 = (1,3,2)    0.1103    0.5322    0.3575
σ3 = (2,1,3)    0.2500    0.2750    0.4750
σ4 = (2,3,1)    0.5000    0.2750    0.2250
σ5 = (3,1,2)    0.3500    0.5250    0.1250
σ6 = (3,2,1)    0.5000    0.3750    0.1250
Table 6. The class of associated probabilities of fuzzy measure g ¯ for problem (85).
σ               Pσ(x1)    Pσ(x2)    Pσ(x3)
σ1 = (1,2,3)    0.1052    0.4203    0.4745
σ2 = (1,3,2)    0.1002    0.5249    0.3749
σ3 = (2,1,3)    0.2500    0.2749    0.4751
σ4 = (2,3,1)    0.5000    0.2749    0.2251
σ5 = (3,1,2)    0.3500    0.5250    0.1250
σ6 = (3,2,1)    0.5000    0.3749    0.1251
Table 7. Class of associated probabilities { P ^ σ } σ S 3 of the fuzzy measure g ^ associated with the body of evidence F , m and the given allocation weights vector W = { w 1 , w 2 } .
σ               P̂σ(x1)    P̂σ(x2)    P̂σ(x3)
σ1 = (1,2,3)    0.500      0.233      0.267
σ2 = (1,3,2)    0.500      0.433      0.067
σ3 = (2,1,3)    0.167      0.567      0.267
σ4 = (2,3,1)    0.167      0.567      0.267
σ5 = (3,1,2)    0.500      0.433      0.067
σ6 = (3,2,1)    0.167      0.767      0.067
Table 8. The associated probability class of the approximation fuzzy measure g ¯ .
σ               Pσ(x1)    Pσ(x2)    Pσ(x3)
σ1 = (1,2,3)    0.5218    0.2411    0.2371
σ2 = (1,3,2)    0.4953    0.4332    0.0715
σ3 = (2,1,3)    0.1839    0.5478    0.2683
σ4 = (2,3,1)    0.1678    0.5636    0.2686
σ5 = (3,1,2)    0.4991    0.4457    0.0552
σ6 = (3,2,1)    0.1714    0.7632    0.0654
Table 9. Values of aggregations on alternatives d 1 , d 2 , d 3 , d 4 for all three examples.
Alternative    Example 1    Example 2    Example 3
d1             0.461        0.454        0.578
d2             0.529        0.529        0.520
d3             0.569        0.563        0.590
d4             0.577        0.570        0.680
Table 10. Rankings of alternatives d 1 , d 2 , d 3 , d 4 .
Alternative Rankings
Example 1:    d4 ≻ d3 ≻ d2 ≻ d1
Example 2:    d4 ≻ d3 ≻ d2 ≻ d1
Example 3:    d4 ≻ d3 ≻ d1 ≻ d2
