Article

An Aggregation Metric Based on Partitioning and Consensus for Asymmetric Distributions in Likert Scale Responses

by Juan Moreno-Garcia 1,†, Benito Yáñez-Araque 2,*,†, Felipe Hernández-Perlines 2,† and Luis Rodriguez-Benitez 3,†
1 Department of Information Systems and Technologies, School of Industrial and Aerospace Engineering, University of Castilla-La Mancha, 45071 Toledo, Spain
2 Department of Business Administration, University of Castilla-La Mancha, 45071 Toledo, Spain
3 Department of Information Systems and Technologies, College of Computer Science, University of Castilla-La Mancha, 13071 Ciudad Real, Spain
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2022, 10(21), 4115; https://doi.org/10.3390/math10214115
Submission received: 26 September 2022 / Revised: 25 October 2022 / Accepted: 31 October 2022 / Published: 4 November 2022

Abstract

A questionnaire is a basic tool for collecting information in survey research. Often, its questions are measured using a Likert scale. With multiple items on the same broad object, the response codes can be summed or averaged to give an indication of each respondent’s overall positive or negative orientation towards that object. This is the basis for Likert scales. Aggregation methods have been widely used in different research areas. Most of them are mathematical methods, such as the arithmetic mean, the weighted arithmetic mean, or the OWA (Ordered Weighted Averaging) operator. Likert scale data are usually summarized by the mean. This paper presents a new approach for computing an aggregate value from Likert scale responses represented as a histogram; it is designed to handle asymmetric distributions better than the mean. The method generates a set of partitions using an approach based on successive division. After every division, each partition is evaluated using a consensus measure and the one with the best value is then selected. Once the process of division has finished, the aggregate value is computed using the resulting partitions. Promising results have been obtained. Experiments show that our method is appropriate for distributions with large asymmetry and is not far from the behavior of the arithmetic mean for symmetric distributions. Overall, the article sheds light on the need to consider presentations of Likert scale data, beyond the mean, that are more suitable for asymmetric distributions.

1. Introduction

The process of obtaining information from an investigation in some areas is usually achieved by designing and creating a survey. A survey is a quantitative technique used to obtain primary information and the basic tool for collecting information is provided by the questionnaire. It is used in research to obtain the information that is used to reach the conclusions about the treated problem. Questionnaires have been used in different areas, such as the social sciences [1,2,3], education [4,5], market studies [6], sports [7,8], accident prevention [9,10], augmented reality [11], clinical data [12,13], etc.
The information gathered from a questionnaire is represented using two types of variables: observable variables and latent variables. An observable variable can be defined as “a concept that can be directly observed and measured”, i.e., observable variables are “the directly measured proxy variables that contain the raw data” [14]. These variables are also called indicators, items, or manifest variables. In contrast, a latent variable is a theoretical concept that cannot be directly observed or measured, but can be inferred from observable variables or from other latent variables that are measured in the first instance from observable variables. A variable of this type is also called a construct, composite, or factor and can be understood as something “meaningful”. A construct, then, is inferred from one or more indicators, that is, a latent variable can be calculated from one or more observable variables. A multidimensional construct is a construct where “their indicators are themselves latent constructs” [15]. A multidimensional construct establishes a relationship among various latent constructs and allows researchers to match broad predictors with broad outcomes, increasing the explained variance [16]. An indicator of a multidimensional construct is called a dimension and represents one clearly defined aspect of the content domain of the overarching construct. Indicator or item, factor, composite, construct and multidimensional construct are the names usually used in statistical analysis applied to social, political, and behavioral sciences.
In most cases, the data gathered from a questionnaire allows for obtaining valuable information from it. Statistical analysis or other analytic techniques such as fsQCA [17,18,19] are used for this purpose. Those methods usually need a process where the information is aggregated. This aggregation is especially useful in two main situations: (1) to conflate the responses to the different items composing a question (construct) [20] or (2) to combine the answers offered by the different respondents to a concrete question item.
The use of measures of central tendency such as the arithmetic mean is common. A measure of central location tries to represent with a single value the center or location of the distribution. A subject’s score on a factor is the sum of his or her responses to the items selected to define that factor (direct scores). This practice is equivalent to assigning weights of 1 to the items if they define the factor, or 0 if they do not [21,22]. When the factors are not defined by the same number of items (as is usual), it is recommended to calculate for each subject his/her mean on each factor (sum of his/her responses to the items of the factor divided by the number of items of the factor); in this way, the scores of each subject (or group means) on the different factors can be compared with each other [23]. These measures are very sensitive to extreme values, which limits their usefulness for asymmetric distributions. Furthermore, the opinions collected in a survey can generate an asymmetric distribution that will not be correctly reflected by a central tendency measure. Therefore, measures of central tendency are not always the most appropriate for obtaining a value that represents the information collected in a survey. Taking into account these considerations, studied in previous works of the authors in the field of Social Sciences [1,2,3], it is interesting to design a method that generates groups of opinions from the surveys. Thus, instead of representing the distribution by means of a measure of central tendency, we propose to represent it with a value closer to the most representative group. To do this, the groups of opinions are detected from the survey respondents, and each of these groups is represented by a single value when the aggregated value is computed. To obtain the appropriate groups of opinion, consensus measures are used, and, to detect such groups, domain partition techniques are applied.
This paper proposes a new aggregation method for questionnaires with closed-ended questions [24] in order to measure a given phenomenon under investigation. The proposed method can treat ordered bounded domains, such as an ordinal-polytomous scale [25]. It is based on a successive division of the input space [26,27] that groups close answers into partitions. To establish such clusters, the concept of entropy [28,29] and a distance measure are used. In order to test the validity of our proposal, the results obtained from experimentation are compared with the arithmetic mean using two synthetic datasets and a real database.

1.1. Contributions of the Present Paper

Multiple-item scales are widely used in making hiring decisions, assessing student, customer, and employee satisfaction, conducting needs assessments and program evaluations, and in scientific research projects. Unfortunately, those who construct these scales often have little knowledge of how to effectively develop and evaluate them. When scales are not properly developed, they typically yield data that are unsuitable for their intended purpose.
The Likert scale aims primarily at solving a technical problem that has arisen in relation to the quantitative aspects of the study of social attitudes. Likert [30], in his original research work on the development of the scales that bear his name, assumes that, for experimental purposes, attitudes are distributed fairly normally. However, he recognizes that this assumption may not be correct, so he calls for future research to determine its correctness or incorrectness by further experiment. This paper responds to that call, proposing a solution in the calculation of Likert scales when the assumption of normality is not met, that is, when the distribution is asymmetric.
We present a new approach to compute an aggregate value that represents Likert scale responses as a histogram and is suited to asymmetric distributions. In addition, the behavior of our method is not far from that of the arithmetic mean for symmetric distributions. There are different possibilities in the literature to aggregate information into a single value, and most of them are based on measures of central tendency that are not appropriate for asymmetric distributions.
The presented method is based on the use of successive divisions to generate partitions of the input space. Furthermore, the use of consensus has been proven to be an appropriate way to demonstrate the validity of a set of opinions. For this reason, consensus has been used to measure the goodness of the candidate partitions. More concretely, the approach of Tastle [29] used to measure a Likert scale is used in this paper. Our approach proposes a measure based on consensus and suggests the distances between the candidate partitions to select one of them. In addition, the way the candidate partitions are generated is another innovative contribution.
Finally, this paper proposes the use of a parameter $\alpha$ to control the partitioning rate. It takes values in the interval $[0, 1]$; if $\alpha = 1$, the parameter has no effect. In this way, the behavior of the algorithm can be controlled by the final user, something that is not possible in other existing aggregation functions.

1.2. Structure

This paper is organized as follows: Section 2 presents the background of the treated issue. Then, Section 3 provides details on our proposal to compute the aggregate value. Section 4 tests the proposed method and presents a discussion about the results achieved. Lastly, Section 5 summarizes the final conclusions and future research.

2. Background

Aggregation methods have been widely used in different research areas. Most of them are mathematical methods, such as the arithmetic mean [31] or the weighted arithmetic mean, which allows a weight to be assigned to each component. The Ordered Weighted Averaging (OWA) operator is another method, introduced by Yager [32]. It is widely used in applied mathematics and fuzzy logic, and it can also be used to aggregate the information obtained from a questionnaire. For instance, He et al. [33] provide new aggregation operators for Extended Comparative Linguistic Expressions with Symbolic Translation (ELICIT) information by developing novel OWA-based operators, such as the Induced OWA (IOWA) operator, in order to avoid the OWA operator's need to reorder its arguments, since ELICIT information does not have an inherent order due to its fuzzy representation. Pons-Vives et al. [34] provide an application of OWA operators to customer classification in hotels. These authors argue that the use of the OWA operator improves the performance of the classical K-means and reduces the number of convergence iterations. An associated collection of weights $W = (w_1, w_2, \ldots, w_n)$ lying in the unit interval is needed in order to compute an OWA operator of dimension $n$ (Equation (1)).
$$F(a_1, \ldots, a_n) = \sum_{j=1}^{n} w_j \cdot b_j \qquad (1)$$
where $b_j$ is the $j$th largest element of $(a_1, a_2, \ldots, a_n)$ and $\sum_{i=1}^{n} w_i = 1$.
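As an illustration of Equation (1), the following minimal Python sketch applies the OWA operator to a list of responses. The function name `owa` and the sample data are illustrative choices, not part of the original work.

```python
from typing import Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered Weighted Averaging: weight j is applied to the j-th largest value."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 or w > 1 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    ordered = sorted(values, reverse=True)  # b_1 >= b_2 >= ... >= b_n
    return sum(w * b for w, b in zip(weights, ordered))


responses = [4, 5, 5, 4, 1, 1, 2, 5, 5]

# Uniform weights reduce OWA to the arithmetic mean.
uniform = [1 / len(responses)] * len(responses)
print(owa(responses, uniform))

# All weight on the first position returns the maximum.
print(owa([1, 2, 3], [1, 0, 0]))
```

Because the weights attach to positions in the sorted list rather than to particular respondents, the same operator can behave like a maximum, a minimum, or a mean depending only on the weight vector.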
Another category of methods, such as the one we present in this paper, are based on consensus [20,35,36]. A recent study of the aggregation methods based on consensus is presented in [20] and, now, some relevant papers about consensus will be described in order to introduce the reader to the treated issue.
Fedrizzi and Pasi [37] present a review of fuzzy logic-based approaches to model consensus. It is applied in the context of fuzzy Group Decision Making (GDM) and two different models are synthesized so as to model a consensus under individual fuzzy preferences. In that paper, OWA operators are used to come to a consensus. Parreiras et al. [38] presented a flexible consensus scheme that allows for capturing a consistent collective opinion from a group of experts. It is based on a linguistic hierarchical model, which is advantageous both from the human viewpoint and from the operational viewpoint. A moderator can intervene in the discussion process and the authors test the approach on a hypothetical enterprise strategy planning problem. Alcantud et al. [39] presented the concept of approval consensus measure (ACM) to measure the cohesiveness that the expression of dichotomous opinions conveys. The 2012 presidential elections in the USA were selected as a real scenario to test their approach. In [40], from a set of objects, a consensus partition containing a maximum number of joined or separated pairs in such set is established. Cultural Consensus Theory (CCT) is an approach to information pooling (aggregation and data fusion) and is used for measurement and inference in the social and behavioral sciences [41]. France and Batchelder [42] described research related to CCT where a clusterwise version of continuous cultural consensus analysis called CONSCLUS was designed based on k-means. Their CCT-Means algorithm is used to fit CONSCLUS and an algorithm similar to k-means is used to compute the clusters. The centroids of the clusters are calculated using the continuous CCT procedure and the algorithm gives a good performance on relatively large datasets. They also implemented an extension of CONSCLUS for fuzzy clustering using an alternating least squares extension to the C-means algorithm. 
In their experimentation, they showed how CONSCLUS could be used to analyze a set of online review data. To compute the topology of a public transport network, Fiori et al. [43] used a consensus clustering density-based approach, referred to as DeCoClu (Density Consensus Clustering). They infer the geographical locations of stops with GPS data and use a consensus clustering strategy based on a new distance function to compute the relative distances between points. Experiments were conducted on real-data collections provided by a public transport company, and showed the utility of their proposal. Plaia et al. define a consensus ranking in order to assign a class label or class ranking to each node in a decision tree [44]. Zhang and Li [45] develop two consensus-based TOPSIS-Sort-B (variation of the Technique for Order of Preference by Similarity to Ideal Solution, known as TOPSIS-Sort method) algorithms to deal with multi-criteria sorting in the context of group decision-making (MCS-GDM) problems. Then, the authors define the consensus measures and devise different feedback adjustment mechanisms and consensus reaching algorithms to help experts reach consensus by considering different needs for MCS-GDM problems. Thus, TOPSIS-Sort-B is presented as an improved version of TOPSIS-Sort for sorting problems in which boundary profiles should be determined. Other variations of the TOPSIS-Sort method have been developed such as TOPSIS-Sort-C, which should be used to address problems in which it is more appropriate to determine characteristic profiles [46]. For their part, Gai et al. [47] propose a consensus-trust driven framework of bidirectional interaction for social network large-group decision-making. The proposed consensus framework is applied to a blockchain platform selection problem in the supply chain to demonstrate the effectiveness and applicability of the model.

3. Materials and Methods

A questionnaire is composed of a group of $m$ questions, represented as $Q = \{Q_1, Q_2, \ldots, Q_m\}$, where each question $Q_i$ can be seen as a group of $m_i$ items $Q_i = \{I_1^i, I_2^i, \ldots, I_{m_i}^i\}$, where each $I_j^i$ takes values in a concrete domain. The Likert scale [30] is frequently used as the domain to measure the responses in a questionnaire. For this reason, the proposed method is designed to be used with this scale.
In order to contextualize our proposal, we now study a concrete example based on the first question of a questionnaire using a 5-point Likert scale such as this: “1 = strongly disagree, 2 = disagree, 3 = have no idea, 4 = agree, 5 = strongly agree” taken from [48] (Table 1). Suppose that nine respondents completed this question with the values shown in Table 2 and that the arithmetic mean is used in this example to aggregate the responses. In Table 2, there are two levels of aggregation for each question: the first one is shown in the column “1st level” and represents the response of each respondent to an item; the second one, column “2nd level”, is the result of the aggregation process applied to the values of the first level. In addition, the arithmetic mean of each answer is shown in the last row (col. mean).
Section 3.1 shows the algorithm in detail. The proposed approach builds a successive partition of the frequency histogram of the responses. The initial histogram is split into two partitions, and these two partitions are then divided again and again until they cannot be divided any more (Section 3.2). The partition condition is a measure specifically designed with this aim (Section 3.3). Once the splitting has finished, the final aggregate value is computed (Section 3.4). Finally, an example is given in Section 3.5.

3.1. Detailed Algorithm of the Proposed Method

Algorithm 1 details the operation of our method. It computes the histogram without taking into account the items with a frequency equal to 0 (Lines 1–2). These elements are stored in $L_{proc}$ (the processing list). The output set containing the partitions is named $L_{out}$ (Line 3). A new element is taken from $L_{proc}$ (Line 5). After that, all possible partitions composed of two sets of responses are generated (Line 9 and Section 3.2) if this is possible (Line 6). Each partition is then evaluated using an evaluation function that allows for selecting the best partition (Line 10 and Section 3.3). A parameter $\alpha$ is used to control the way the partitions are made (Line 11). This is the only parameter used; it sets the rate of partitioning, and its behavior will be observed in the experiments. The smaller $\alpha$ is, the greater the rate of partitioning becomes. If two new partitions are obtained, they are added to $L_{proc}$, and they will be processed in a later iteration of the loop (Lines 12–13); otherwise, the original partition is sent to the output (Line 15). This process is repeated until no new partition can be obtained (Line 4). Lastly, the aggregate value is computed using the partition of the input space (Line 20 and Section 3.4). Another aggregation function must be selected if the input histogram has not been divided (Line 22).
Algorithm 1 Partitioning and consensus based method
 1: hist = CalculatingTheHistogram(X_1 … X_n)
 2: L_proc = DeletingElementsWithFrequencyZero(hist)
 3: L_out = ∅ {Output list with the selected partitions}
 4: while L_proc ≠ ∅ do
 5:     e = L_proc.pop() {e is removed from L_proc and is processed}
 6:     if length(e) == 1 then
 7:         L_out.add(e) {e cannot be divided and is added to L_out}
 8:     else
 9:         L_parts = GeneratingPartitions(e) {Section 3.2}
10:         part = ObtainingBestPartition(L_parts) {Section 3.3}
11:         if eval(part) > α · cons(e) then
12:             L_proc.add(part_1) {Adding first partition of part}
13:             L_proc.add(part_2) {Adding second partition of part}
14:         else
15:             L_out.add(e) {e is not divided and is added to L_out}
16:         end if
17:     end if
18: end while
19: if length(L_out) > 1 then
20:     r = ComputingOutputValue(L_out) {Section 3.4}
21: else
22:     r = using another aggregation function, such as the arithmetic mean, OWA, etc.
23: end if
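The control flow of Lines 2–18 of Algorithm 1 can be sketched in Python as follows. This is only an outline under stated assumptions: the evaluation and consensus measures are passed in as functions, and `toy_evaluate` and `toy_consensus` are hypothetical placeholders used solely to make the sketch runnable; they are not the measures of Section 3.3.

```python
from collections import deque
from typing import Callable, List, Tuple

Hist = List[Tuple[int, float]]   # (Likert value, relative frequency)
Split = Tuple[Hist, Hist]        # (part_1, part_2)


def partition_histogram(hist: Hist,
                        evaluate: Callable[[Split], float],
                        consensus: Callable[[Hist], float],
                        alpha: float) -> List[Hist]:
    """Keep splitting a histogram while the best split passes the alpha test."""
    start = [(x, f) for x, f in hist if f > 0]        # Line 2: drop empty bins
    l_proc = deque([start]) if start else deque()
    l_out: List[Hist] = []                            # Line 3
    while l_proc:                                     # Line 4
        e = l_proc.pop()                              # Line 5
        if len(e) == 1:                               # Lines 6-7
            l_out.append(e)
            continue
        candidates = [(e[:cut], e[cut:]) for cut in range(1, len(e))]  # Line 9
        best = max(candidates, key=evaluate)          # Line 10
        if evaluate(best) > alpha * consensus(e):     # Line 11
            l_proc.extend(best)                       # Lines 12-13
        else:
            l_out.append(e)                           # Line 15
    return sorted(l_out)


# Placeholder scoring functions (assumptions, for demonstration only).
def toy_evaluate(split: Split) -> float:
    part1, part2 = split
    return (part2[0][0] - part1[-1][0]) / 5           # gap between the two parts


def toy_consensus(e: Hist) -> float:
    return 1.0


hist = [(1, 2 / 9), (2, 1 / 9), (3, 0.0), (4, 2 / 9), (5, 4 / 9)]
parts = partition_histogram(hist, toy_evaluate, toy_consensus, alpha=0.3)
print([[x for x, _ in p] for p in parts])
```

Lines 19–23 then decide how the returned list is aggregated: if it contains more than one partition, the aggregate value of Section 3.4 is computed; otherwise another aggregation function is used.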

3.2. Generation of the Partitions

The selected partition is divided into two parts: each part contains consecutive values of the original partition, and all possible combinations are computed. If the input histogram has $n$ values, then there exist $n - 1$ possible partitions using this cutting strategy. For example, let $e$ be the histogram of the column $I_1^1$ (Table 2) taking the values $\{\{1, \frac{2}{9}\}, \{2, \frac{1}{9}\}, \{4, \frac{2}{9}\}, \{5, \frac{4}{9}\}\}$, representing nine responses using a 5-point Likert scale where item 1 has been chosen two times, item 2 only in one response, etc. There are three possible partitions since $e$ contains four values. Table 3 shows the partition $part$ divided into two partitions, referred to as $part_1$ and $part_2$.
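The cutting strategy described above can be sketched in a few lines of Python. The function name `generate_partitions` and the list-of-tuples representation of the histogram are illustrative assumptions.

```python
from fractions import Fraction
from typing import List, Tuple

Histogram = List[Tuple[int, Fraction]]  # (Likert value, relative frequency)


def generate_partitions(e: Histogram) -> List[Tuple[Histogram, Histogram]]:
    """Return every split of e into two non-empty runs of consecutive entries."""
    return [(e[:cut], e[cut:]) for cut in range(1, len(e))]


e = [(1, Fraction(2, 9)), (2, Fraction(1, 9)),
     (4, Fraction(2, 9)), (5, Fraction(4, 9))]

splits = generate_partitions(e)
for part1, part2 in splits:
    print([v for v, _ in part1], "|", [v for v, _ in part2])
```

For the four-value histogram of the example, this yields the three candidate splits, one for each possible cut position.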

3.3. Evaluating the Partitions

Now, the equations that define our proposal will be presented. Equation (2) is used to evaluate the partition $part$, taking values in the interval $[0, 1]$. A larger evaluation value indicates a better partition:
$$eval(part) = cons(part) \cdot distance(part) \qquad (2)$$
The consensus of each component of a partition $part$ (Equation (3)) and the distance between the partitions $part_1$ and $part_2$ (Equation (5)) are considered. Equation (3) takes a value in the interval $[0, 1]$ since the function $Cns$ offers an output value in this interval:
$$cons(part) = \frac{|part_1|}{|part|} \cdot Cns(part_1) + \frac{|part_2|}{|part|} \cdot Cns(part_2) \qquad (3)$$
where $part$ is the partition composed of $part_1$ and $part_2$, and $|part_i|$ is the width of $part_i$.
Different measures of consensus have been proposed previously in the literature. The approach of Tastle and Wierman [29] has been selected since it is a measure designed specifically to measure the consensus of a standard Likert scale. Furthermore, this method is valid for measuring other scales of closed-ended questions. Tastle and Wierman proposed the use of the mean and standard deviation as measures of the dispersion along with the Shannon entropy (Equation (4)). In their proposal, the consensus of a partition with a single element is $1.0$, and several examples showing its behavior can be found in the original paper:
$$Cns(X) = 1 + \sum_{i=1}^{n} p_i \cdot \log\left(1 - \frac{|X_i - \bar{X}|}{X_{max} - X_{min}}\right) \qquad (4)$$
where $p_i$ is the probability associated with the distribution under consideration, $\bar{X}$ is the mean of $X$, $X_{max} - X_{min}$ is the width of $X$, and $|\cdot|$ is the absolute value.
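A minimal Python sketch of Equation (4) is given below. Several details are assumptions made for illustration only: the logarithm is taken in base 2 (as in Tastle and Wierman's original definition), the frequencies are renormalized so that they sum to 1 within the histogram passed in, and the width is supplied explicitly by the caller. The function name `cns` is likewise hypothetical.

```python
import math
from typing import Sequence, Tuple


def cns(hist: Sequence[Tuple[float, float]], width: float) -> float:
    """Consensus of a histogram of (value, frequency) pairs.

    Assumptions: base-2 logarithm; frequencies renormalized to sum to 1;
    `width` is X_max - X_min of the scale being considered.
    """
    total = sum(f for _, f in hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one response")
    probs = [(x, f / total) for x, f in hist if f > 0]
    if len(probs) == 1 or width == 0:
        return 1.0  # a single response value means complete consensus
    mean = sum(x * p for x, p in probs)
    return 1.0 + sum(p * math.log2(1.0 - abs(x - mean) / width)
                     for x, p in probs)


# Everyone gives the same answer: complete consensus.
print(cns([(3, 1.0)], width=4))

# Responses split evenly between the two extremes of a 5-point scale.
print(cns([(1, 0.5), (5, 0.5)], width=4))
```

With these conventions the measure ranges from 0 (responses split evenly between the two extremes) to 1 (all responses identical).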
As indicated above, a distance measure is defined between two partitions. Equation (5) takes a value in the interval $[0, 1]$, and it holds that $rv(part_2) > rv(part_1)$ since each element in $part_2$ is greater than each element in $part_1$:
$$distance(part) = \frac{rv(part_2) - rv(part_1)}{|Likert|} \qquad (5)$$
where $|Likert|$ is the width of the input Likert scale, and $rv(part_i)$ returns the representative value for $part_i$.
To compute the representative value of a partition, the function $rv$ is used (Equation (6)):
$$rv(part) = \frac{\sum fact(f_i) \cdot norm(f_i)}{|part|} \qquad (6)$$
where $part = (\{j, f_j\}, \{j+1, f_{j+1}\}, \ldots, \{j+|part|, f_{j+|part|}\})$, $\{j, f_j\}$ is a tuple where $j$ is a Likert scale value and $f_j$ its frequency, and $norm(f_i)$ normalizes the value $f_i \in part$.
The idea is to detect the areas of maximum frequency concentration within the partition. For this reason, a method has been designed that reduces the small frequencies and maintains the highest ones (Equation (7)) by applying a factor to each of the frequencies in the sample. The frequencies that fall in the lower third of the highest frequency ($max(part)$) are multiplied by a factor of $\frac{1}{3}$, those that fall in the second third are multiplied by a factor of $\frac{2}{3}$, and finally those that fall in the upper third are not modified, so the factor is equal to 1:
$$fact(f_i) = \begin{cases} \frac{1}{3}, & f_i < max(part) \cdot \frac{1}{3} \\ \frac{2}{3}, & f_i \geq max(part) \cdot \frac{1}{3} \ \text{and} \ f_i < max(part) \cdot \frac{2}{3} \\ 1, & f_i \geq max(part) \cdot \frac{2}{3} \end{cases} \qquad (7)$$
where $part = (\{1, f_1\}, \{2, f_2\}, \ldots, \{|Likert|, f_{|Likert|}\})$ and $freq = \{f_1, \ldots, f_{|Likert|}\}$.
The distribution obtained is then normalized and the mean, which is the representative value of the sample, is calculated. By reducing the lowest frequencies and keeping the highest ones, the new average approaches the areas of maximum frequency concentration.
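The damping, normalization, and averaging steps described above can be sketched in Python as follows. This is one possible reading, stated as an assumption: the representative value is taken as the mean of the Likert values weighted by the damped and renormalized frequencies, and $|Likert|$ in Equation (5) is taken as the number of points of the scale. Under this reading the sketch yields 1.25 and 4.75 for the two parts of the running example and a distance of 0.7, the values quoted in the worked examples of this section. The function names are hypothetical.

```python
from fractions import Fraction
from typing import List, Tuple

Histogram = List[Tuple[int, Fraction]]  # (Likert value, relative frequency)


def fact(f: Fraction, f_max: Fraction) -> Fraction:
    """Damping factor of Equation (7), based on thirds of the highest frequency."""
    if f < f_max * Fraction(1, 3):
        return Fraction(1, 3)
    if f < f_max * Fraction(2, 3):
        return Fraction(2, 3)
    return Fraction(1)


def rv(part: Histogram) -> Fraction:
    """Representative value: mean of the damped, renormalized distribution (assumed reading)."""
    f_max = max(f for _, f in part)
    damped = [(x, f * fact(f, f_max)) for x, f in part]
    total = sum(f for _, f in damped)
    return sum(x * f for x, f in damped) / total


def distance(part1: Histogram, part2: Histogram, likert_points: int) -> Fraction:
    """Equation (5), with |Likert| assumed to be the number of scale points."""
    return (rv(part2) - rv(part1)) / likert_points


part1 = [(1, Fraction(2, 9)), (2, Fraction(1, 9))]
part2 = [(4, Fraction(2, 9)), (5, Fraction(4, 9))]

print(float(rv(part1)), float(rv(part2)))
print(float(distance(part1, part2, likert_points=5)))
```

Exact fractions are used so that the threshold comparisons of Equation (7) are not affected by floating-point rounding.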
The main idea of the approach is to split only when a new partition $part$ reaches a good consensus (Equation (3)) and when the two partitions in $part$ are far enough from each other (Equation (5)). In this case, we consider the best way of obtaining a more consistent partitioning. For instance, let $part$ be the partition selected in Section 3.2 (row 2 of Table 3): $part_1 = \{\{1, \frac{2}{9}\}, \{2, \frac{1}{9}\}\}$ and $part_2 = \{\{4, \frac{2}{9}\}, \{5, \frac{4}{9}\}\}$. First of all, Equation (4) is applied to calculate the consensus values, obtaining $0.9025$ and $0.6325$, respectively. Next, Equation (3) is computed: $\frac{|part_1|}{|part|} \cdot Cns(part_1) + \frac{|part_2|}{|part|} \cdot Cns(part_2) = \frac{1}{4} \cdot 0.9025 + \frac{1}{4} \cdot 0.6325 = 0.7675$, where $\frac{|part_1|}{|part|} = \frac{|2 - 1|}{|5 - 1|} = \frac{1}{4}$ and $\frac{|part_2|}{|part|} = \frac{|5 - 4|}{|5 - 1|} = \frac{1}{4}$. Equation (5) computes the distance between $part_1$ and $part_2$, and the result is $0.7$. Lastly, the evaluation value is calculated using Equation (2): it is equal to $0.7675 \cdot 0.7 = 0.5373$.

3.4. Calculating the Aggregate Value

The aggregate value is computed using Equation (8) once the best partition has been selected. This operation multiplies the sum of the frequencies of each partition ($sumFreq(part_i)$) by a representative value of that partition ($rv(part_i)$) and adds up the results. Thus, each partition is represented by a single value (Equation (6)):
$$v = \sum_{i=1}^{m} sumFreq(part_i) \cdot rv(part_i) \qquad (8)$$
where $L_{out} = \{part_1, part_2, \ldots, part_m\}$, $part_i$ is a partition, and $rv(part_i)$ is calculated using Equation (6).
Although $rv$ is used in Equation (8), it can be substituted by other measures such as the mode or the median. We use $rv$ because our aim is to represent each partition with a value that reflects the zone where the highest frequencies are concentrated. As an example, the partition given in Section 3.3 (row 2 of Table 3) is now used in order to illustrate the operation of Equation (8). $part_1$ and $part_2$ are represented by the values $1.25$ and $4.75$, respectively. Then, the aggregate value is computed in the following way:
$$(sumFreq(part_1) \cdot rv(part_1)) + (sumFreq(part_2) \cdot rv(part_2)) = \left(\left(\tfrac{2}{9} + \tfrac{1}{9}\right) \cdot 1.25\right) + \left(\left(\tfrac{2}{9} + \tfrac{4}{9}\right) \cdot 4.75\right) = \tfrac{3}{9} \cdot 1.25 + \tfrac{6}{9} \cdot 4.75 = 3.583$$
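The arithmetic of this example can be checked with a short Python sketch of Equation (8). The representative values are passed in directly, using the figures 1.25 and 4.75 given in the text; the function name `aggregate` is an illustrative choice.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple

Histogram = List[Tuple[int, Fraction]]  # (Likert value, relative frequency)


def aggregate(partitions: Sequence[Histogram],
              representative_values: Sequence[Fraction]) -> Fraction:
    """Equation (8): sum over partitions of sumFreq(part_i) * rv(part_i)."""
    return sum(sum(f for _, f in part) * r
               for part, r in zip(partitions, representative_values))


part1 = [(1, Fraction(2, 9)), (2, Fraction(1, 9))]
part2 = [(4, Fraction(2, 9)), (5, Fraction(4, 9))]

v = aggregate([part1, part2], [Fraction(5, 4), Fraction(19, 4)])
print(round(float(v), 3))
```

Swapping the second argument for modes or medians of each partition gives the alternative aggregations mentioned above without changing the function.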

3.5. Example of the Operation of Algorithm 1

The operation of this algorithm will now be shown step by step using an example. The input consists of nine responses on a 5-point Likert scale, $\{4, 5, 5, 4, 1, 1, 2, 5, 5\}$, taken from the column $I_1^1$ of Table 2. The value for $\alpha$ is $0.55$. Then,
  • Line 1: The histogram $hist = \{\{1, \frac{2}{9}\}, \{2, \frac{1}{9}\}, \{3, \frac{0}{9}\}, \{4, \frac{2}{9}\}, \{5, \frac{4}{9}\}\}$ is obtained using the answers.
  • Line 2: $L_{proc}$ is initialized with a set containing every single element, $\{\{1, \frac{2}{9}\}, \{2, \frac{1}{9}\}, \{4, \frac{2}{9}\}, \{5, \frac{4}{9}\}\}$, where the value 3 is not added because its frequency of occurrence is equal to 0.
  • Line 3: The list that contains the output partitions ($L_{out}$) is initialized to ∅.
  • Line 4: The main loop finishes when there are no partitions to process, i.e., $L_{proc}$ is empty.
  • Line 5: An element is taken and removed from $L_{proc}$; in this case, as there is only one, $e = \{\{1, \frac{2}{9}\}, \{2, \frac{1}{9}\}, \{4, \frac{2}{9}\}, \{5, \frac{4}{9}\}\}$.
  • Lines 6–7: If $e$ only contains one element, it is added to the output list $L_{out}$ (Line 7) since it cannot be divided. This situation does not occur in this example.
  • Lines 8–9: Otherwise, all the possible partitions of $e$ are generated and stored in $L_{parts}$, which is the list that contains all the possible partitions as shown in Table 4 (Section 3.2).
  • Line 10: The best partition $part$ is now selected using Equation (2). The column “eval” in Table 4 shows the values obtained in this case. The best partition is $P_2$ since it reaches the greatest value ($0.537$).
  • Lines 11–13: If the evaluation value of $part$ (Equation (2)) is greater than $\alpha$ times the consensus value of $e$, then the two components of $P_2$ ($part_1$ and $part_2$) are added to $L_{proc}$ (Lines 12 and 13); otherwise, $e$ is not split any more and is added to $L_{out}$ (Lines 14–15). In our example, the division is made because $part$ obtains a greater value ($distance(part) \cdot cons(part) = 0.7 \cdot 0.7675 = 0.5373 \approx 0.537$) than the value obtained by $e$ ($\alpha \cdot Cns(e) = 0.55 \cdot 0.4627 = 0.2544$), i.e., $L_{proc} = \{\{\{1, \frac{2}{9}\}, \{2, \frac{1}{9}\}\}, \{\{4, \frac{2}{9}\}, \{5, \frac{4}{9}\}\}\}$. By means of the parameter $\alpha$, the user can indicate the desired rate of splitting. In the next iteration of the main loop, the partitions in $L_{proc}$ are processed in the same way.
  • Line 19: When the loop finishes, the selected partition takes the value $L_{out} = \{\{\{1, \frac{2}{9}\}, \{2, \frac{1}{9}\}\}, \{\{4, \frac{2}{9}\}, \{5, \frac{4}{9}\}\}\}$ and the final aggregate value is equal to $3.583$ (Line 20).

4. Results

Two tests will be performed, one using two randomly generated synthetic samples, and the other using a real database. The synthetic datasets are used to test how the presented method creates the partitions and how the parameter α influences the creation of the partitions (Section 4.1). In order to test our proposal in a real dataset, a 2014 survey carried out by some of the authors of this work and used in different research publications of the field of Social Sciences [1,49] is taken as input of the experiments (Section 4.2).

4.1. Study of the Obtained Partitions and the Influence of the Parameter α in the Creation of the Partitions

To perform the tests, two datasets of 300 surveys each have been randomly generated, using a 7-point Likert scale and a 9-point Likert scale, respectively. Each survey of a dataset is represented by a histogram generated randomly with the same probability for each of the items. The histogram is represented in the same way as in Section 3.2.
The obtained results are studied according to the symmetry of the sample. Fisher’s coefficient [50] has been used as the measure of symmetry, since it is one of the most widely used. In addition, four intervals have been defined to categorize the symmetry of a sample: “symmetric” ($Q_1 = [|0|, |0.25|)$), “little symmetric” ($Q_2 = [|0.25|, |0.50|)$), “very little symmetric” ($Q_3 = [|0.50|, |0.75|)$), and “not at all symmetric” ($Q_4 = [|0.75|, |\infty|)$). As can be seen, a negative symmetry value is converted to its absolute value to determine the interval to which it belongs. Each generated dataset consists of 75 surveys with a symmetry value belonging to each one of the four intervals ($75 \cdot 4 = 300$); in this way, the results can be studied using the same number of surveys for each of the defined intervals. In order to evaluate the results, a representative value $v_{rep}$ has been defined to be used for comparison with the value calculated by our method and with the arithmetic mean (Equation (9)).
\[ v_{rep} = \sum norm(f_i) \cdot n_i \quad \forall f_i \in part \mid f_i \geq 0.5 \cdot f_{max} \qquad (9) \]
where f_max is the maximum frequency value in part, n_i is the number of elements of part that fulfil f_i ≥ 0.5 · f_max, and norm(f_i) normalizes the value f_i considering the values f_max ∈ part that fulfil f_max ≥ 0.5.
Thus, only the highest frequencies are considered in order to find the greatest group of opinion.
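One possible reading of Equation (9) is sketched below. The text leaves the exact roles of n_i and norm open, so the sketch makes two explicit assumptions: the items whose frequency reaches half of the maximum frequency are retained and their frequencies are renormalized to sum to one, and v_rep is the resulting frequency-weighted mean of the retained item values. The function name representative_value and the ratio parameter are hypothetical.

```python
from fractions import Fraction


def representative_value(hist, ratio=Fraction(1, 2)):
    """Frequency-weighted mean of the dominant items of a histogram.

    Assumption: an item is dominant when f_i >= ratio * f_max, and the
    frequencies of the dominant items are renormalized to sum to one.
    """
    f_max = max(f for _, f in hist)
    dominant = [(x, f) for x, f in hist if f >= ratio * f_max]
    total = sum(f for _, f in dominant)
    return sum(x * f / total for x, f in dominant)


# Histogram e from Table 4, with exact fractions to avoid rounding issues.
e = [(1, Fraction(2, 9)), (2, Fraction(1, 9)),
     (4, Fraction(2, 9)), (5, Fraction(4, 9))]
v_rep = representative_value(e)      # items 1, 4 and 5 are retained
mean = sum(x * f for x, f in e)      # plain arithmetic mean of e
print(float(v_rep), float(mean))
```

For the histogram e, item 2 is discarded because 1/9 is below half of 4/9, so under these assumptions the representative value (3.75) sits closer to the dominant answers 4 and 5 than the arithmetic mean (about 3.56).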
For the study of the behavior of the parameter α, the experiment has been repeated with α taking the values 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9. The results for the 7-point and 9-point Likert scales are shown in Table 5 and Table 6, respectively. In these tables, the first two columns show the α value and the method used (Me.): our approach (Our), the arithmetic mean (A.M.), and a third entry (Equal) for the samples where both methods obtain the same result. The third column, adequateness (Adq.), indicates the number and percentage of samples for which a method obtains the value nearest to the representative one. The fourth column (Asm.) gives the average of the symmetry measures of the samples for which each method is the best, and columns Q1, Q2, Q3, and Q4 give the detailed counts and percentages for each of the symmetry intervals. For instance, in Table 5, if α = 0.6, our proposal is the best for 197 samples (65.67%), and the arithmetic mean is the best for 78 samples (26.0%). In addition, the same aggregated value is obtained by both methods in 25 samples (8.33%). The average of the symmetry measures where our method is the best is 0.58 and, for the arithmetic mean, it is 0.19. The last four columns show the number of surveys where each method is the best for each of the defined intervals. For example, taking Q2, our proposal is the most adequate 35 times (46.67%); on 30 occasions (40.0%), the arithmetic mean has been more suitable, and 10 times (13.33%) the same result has been obtained.
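The tally behind the Adq. column can be sketched as follows. The helper names closest_method and adequateness, the tolerance, and the handling of equidistant values as "Equal" are assumptions made for illustration; the text only defines "Equal" as both methods obtaining the same result.

```python
from collections import Counter


def closest_method(ours, mean, v_rep, tol=1e-9):
    """Label one sample with the method whose value is nearest to v_rep."""
    if abs(ours - mean) <= tol:
        return "Equal"   # both methods give the same aggregated value
    d_ours, d_mean = abs(ours - v_rep), abs(mean - v_rep)
    if abs(d_ours - d_mean) <= tol:
        return "Equal"   # assumption: equidistant values count as a tie
    return "Our" if d_ours < d_mean else "A.M."


def adequateness(labels):
    """Return {method: (count, percentage)} over all labelled samples."""
    counts = Counter(labels)
    total = len(labels)
    return {m: (counts[m], round(100 * counts[m] / total, 2))
            for m in ("Our", "A.M.", "Equal")}


# Counts reported in Table 5 for alpha = 0.6 (300 samples).
labels = ["Our"] * 197 + ["A.M."] * 78 + ["Equal"] * 25
print(adequateness(labels))
```

With the counts reported for α = 0.6 in Table 5, the tally reproduces the percentages quoted in the text (65.67%, 26.0%, and 8.33%).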

Discussion

Next, the results obtained with the variation of α , the values of asymmetry obtained, and the results for each interval of asymmetry are provided in detail.
On the basis of the results obtained, it can be concluded that the highest values of α obtain the worst results (column Adq. of Table 5 and Table 6) and that, as α decreases, results improve. For α < 0.5, the results start to get worse again and, with α = 0.1, the output value is the same as the one obtained with the arithmetic mean for every sample. The best results are obtained for α = 0.6: 65.67% with a 7-point Likert scale and 76.67% with a 9-point Likert scale. Results are better when using a 9-point Likert scale because there are more items and more combinations to perform the partitioning (Section 3.2).
The proposed method is the best for samples with a high average asymmetry value (Table 5 and Table 6, column Asm.); more concretely, it reaches values between 0.56 and 0.65 for the 7-point Likert scale test and values from 0.48 to 0.56 for the 9-point Likert scale test, labeled as “very little symmetric” in both cases. The measures of asymmetry obtained by the arithmetic mean are much lower: the intervals [0.18, 0.26] and [0.17, 0.31] for the 7-point and 9-point Likert scale tests, respectively. In addition, the study of the intervals Q1 to Q4 corroborates these results, since our proposal is the most adequate when 0.2 ≤ α ≤ 0.7 in the intervals with more asymmetry, Q3 and Q4. In particular, when α = 0.5 and α = 0.6, results greater than 93% are obtained in both tests, and greater than 95% in all but one case (Q3 with α = 0.5 on the 7-point scale, 93.33%). For the Q2 interval, our proposal is the best except on the 7-point Likert scale with α = 0.3. This shows that our method is suitable for asymmetric distributions (labeled as “little symmetric”, “very little symmetric”, and “not at all symmetric”), and that it behaves similarly to the arithmetic mean for symmetric distributions.
Table 7 and Table 8 show the results obtained with respect to the partitioning rate. The rows indicate the α values and the columns the number of partitions, i.e., 0 indicates no division of the histogram, 1 indicates one division of the histogram, etc. With respect to the parameter α, as expected (Line 11 in Algorithm 1), a greater value of α implies less partitioning. More concretely, when α is close to 1, the rate of partitioning is low, and, as α decreases, the rate of partitioning increases (Table 7 and Table 8). When α ≤ 0.5, partitions are made for all samples on both of the Likert scales tested. For very low α, the partitioning rate is very high, reaching almost the maximum number of possible partitions.
In brief, on the basis of the experimentation carried out, a value in the interval [0.5, 0.6] appears to be a good choice for the α parameter. The tests show that this parameter can be used to choose the rate of partitioning. In addition, they show that our method is appropriate for distributions labeled as “little symmetric”, “very little symmetric”, and “not at all symmetric”. Once the experimentation with synthetic data has been carried out, the next step is to use real data.

4.2. Real Dataset

In the field of Social Sciences, it is very common to aggregate the values of several items. For example, the study conducted in [1,49] used a primary database acquired from a survey entitled “Spain Survey of Training and Dynamic Capabilities of the Firm” (STraDyCaF). This survey was administered and distributed between May and December of 2014 using the web-based tool LimeSurvey Version 2.05+. This web application is open source and specializes in the creation and distribution of surveys and in the management of the surveyed population. The tool sent an e-mail containing a personalized link to each of the survey respondents (Senior Executives). The representative sample for the survey was drawn from Spanish companies all over the nation with more than 50 employees. Table 9 summarizes the technical details of the survey.
In [49], four variables were used to conduct the analysis with the fsQCA method: Training (CONDF), Organizational performance (DORG), Absorptive capacity (ACAP), and Innovation capacity (INN). Table 10 shows the characteristics of each variable: the Likert scale selected (7-point, because it is the most used in questionnaires [51,52,53]), the number and names of the underlying dimensions, and the number of items. These data are described in greater detail in [1,49].
As can be observed, the representative value of each variable must be computed as an aggregate of a large number of items (last column in Table 10). For instance, for the variables DORG and INN, the average of nine items must be computed. This fact leads us to consider a method that obtains the groups of opinion (consensus) in the items to be aggregated. The resulting value would be closer to the group with the greatest frequency than measures of central tendency are.
In this test, the aggregate value for each variable used in STraDyCaF has been computed, and a test similar to the one detailed in Section 4.1 has been completed. The values for α are 0.5 and 0.6 (the values that reach the best results in the tests detailed in Section 4.1). Table 11 shows the distribution of the samples over the symmetry intervals for each STraDyCaF dataset variable. As can be observed, most samples (90%) are categorized as symmetric. The results are shown in Table 12 for the variables ACAP, CONDF, DORG, and INN. The last rows (TOTAL) aggregate all the tests.
The first observation is that, in many cases (63%), our method obtains the same aggregated value as the arithmetic mean, because the samples are symmetric in 90% of the cases (Table 11 and Table 12). Our method surpasses the arithmetic mean in all samples with an asymmetry belonging to the intervals Q3 and Q4 (“very little symmetric” and “not at all symmetric”). For samples belonging to the Q2 interval, our method is the most adequate more often than the arithmetic mean (44.75% vs. 36.84%). For the samples falling in the Q1 interval, the value obtained is identical for both methods in most cases (68.19%). In summary, our method is appropriate for distributions that are little symmetric, very little symmetric, or not at all symmetric, and is not far from the behavior of the arithmetic mean for symmetric distributions. The experiments using the real survey therefore show a great similarity to those detailed in Section 4.1.
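The pooled percentages quoted above can be re-derived from the per-variable counts in the N.V. column of Table 12. The snippet below is only an arithmetic check; the dictionary layout and variable names are assumptions made for illustration.

```python
# (Our, A.M., Equal) counts per variable and alpha, copied from the
# N.V. column of Table 12.
runs = {
    ("ACAP", 0.5): (28, 40, 44),
    ("ACAP", 0.6): (27, 41, 44),
    ("CONDF", 0.5): (5, 7, 100),
    ("CONDF", 0.6): (7, 15, 90),
    ("DORG", 0.5): (14, 28, 70),
    ("DORG", 0.6): (14, 28, 70),
    ("INN", 0.5): (9, 29, 74),
    ("INN", 0.6): (9, 30, 73),
}

# Column-wise sums over the eight runs of 112 surveys each.
totals = [sum(column) for column in zip(*runs.values())]
grand_total = sum(totals)
shares = [round(100 * t / grand_total, 2) for t in totals]
print(totals, grand_total, shares)
```

The eight runs add up to 896 evaluations, of which 565 (63.06%) give the same value for both methods, which is the 63% mentioned in the text.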

5. Conclusions

A new method for aggregating collected information, more suitable for asymmetric distributions, has been presented. This method generates a set of partitions using an approach based on successive divisions. It forms groups of close answers, represented as partitions, using a measure based on consensus and a distance measure between the possible partitions. The method allows the rate of partitioning to be selected by means of a parameter α; several experiments have been completed to test the behavior of this parameter.
The structure of the algorithm allows for changing the evaluation function to consider other criteria for making the division of the input partition. The evaluation function calculates the distance between the possible partitions using a consensus measure and a distance measure between a unique representing value of each partition.
The experimentation tests how the method generates the partitions and how the parameter α influences the creation of these partitions on synthetic and real datasets. Promising results have been obtained, verifying that the aggregate value obtained by our approach is appropriate for asymmetric distributions and is similar to the arithmetic mean for symmetric distributions.
As future work, we intend to investigate other evaluation functions of the partitions in order to obtain an aggregate value that improves on the one obtained by the current method. In addition, the study of alternative consensus measures is another important research line to be explored.

Author Contributions

Conceptualization, J.M.-G. and L.R.-B.; methodology, J.M.-G., L.R.-B. and B.Y.-A.; software, J.M.-G. and L.R.-B.; validation, B.Y.-A., J.M.-G., F.H.-P. and L.R.-B.; formal analysis, J.M.-G., B.Y.-A. and L.R.-B.; investigation, B.Y.-A. and J.M.-G.; resources, B.Y.-A. and J.M.-G.; data curation, J.M.-G. and L.R.-B.; writing—original draft preparation, J.M.-G., B.Y.-A., F.H.-P. and L.R.-B.; writing—review and editing, J.M.-G., B.Y.-A., F.H.-P. and L.R.-B.; visualization, J.M.-G., B.Y.-A., F.H.-P. and L.R.-B.; supervision, J.M.-G., B.Y.-A. and L.R.-B.; project administration, J.M.-G. and B.Y.-A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to requirements of ethics approval.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hernández-Perlines, F.; Moreno-Garcia, J.; Yáñez Araque, B. Using fuzzy-set qualitative comparative analysis to develop an absorptive capacity-based view of training. J. Bus. Res. 2016, 69, 1510–1515.
  2. Yeh, T.-M.; Pai, F.-Y.; Wu, L.-C. Relationship Stability and Supply Chain Performance for SMEs: From Internal, Supplier, and Customer Integration Perspectives. Mathematics 2020, 8, 1902.
  3. Ruiz-Palomino, P.; Yáñez-Araque, B.; Jiménez-Estévez, P.; Gutiérrez-Broncano, S. Can servant leadership prevent hotel employee depression during the COVID-19 pandemic? A mediating and multigroup analysis. Technol. Forecast. Soc. Chang. 2022, 174, 121192.
  4. Hernández-Perlines, F.; Moreno-Garcia, J.; Yáñez-Araque, B. Training and business performance: The mediating role of absorptive capacities. SpringerPlus 2016, 5, 2074.
  5. Robbers, E.; Petegem, P.V.; Donche, V.; Maeyer, S.D. Predictive validity of the learning conception questionnaire in primary education. Int. J. Educ. Res. 2015, 74, 61–69.
  6. Kubacki, K.; Rundle-Thiele, S.; Pang, B.; Buyucek, N. Minimizing alcohol harm: A systematic social marketing review (2000–2014). J. Bus. Res. 2015, 68, 2214–2222.
  7. Calabuig Moreno, F.; Prado-Gascó, V.; Crespo Hervás, J.; Núñez-Pomar, J.; Añó Sanz, V. Spectator emotions: Effects on quality, satisfaction, value, and future intentions. J. Bus. Res. 2015, 68, 1445–1449.
  8. Calabuig Moreno, F.; Prado-Gascó, V.; Crespo Hervás, J.; Núñez-Pomar, J.; Añó Sanz, V. Predicting future intentions of basketball spectators using SEM and fsQCA. J. Bus. Res. 2015, 69, 1396–1400.
  9. Carpio de los Pinos, A.J.; González García, M.; Soriano, J.A.; Yáñez Araque, B. Development of the Level of Preventive Action Method by Observation of the Characteristic Value for the Assessment of Occupational Risks on Construction Sites. Int. J. Environ. Res. Public Health 2021, 18, 8387.
  10. Stephens, A.N.; Fitzharris, M. Validation of the driver behavior questionnaire in a representative sample of drivers in Australia. Accid. Anal. Prev. 2015, 86, 186–198.
  11. Santos, M.E.C.; Polvi, J.; Taketomi, T.; Yamamoto, G.; Sandor, C. Toward Standard Usability Questionnaires for Handheld Augmented Reality. IEEE Comput. Graph. Appl. 2015, 35, 66–75.
  12. Arslanturk, S.; Siadat, M.; Ogunyemi, T.; Killinger, K.; Diokno, A. Analysis of incomplete and inconsistent clinical survey data. Knowl. Inf. Syst. 2016, 46, 731–750.
  13. Yáñez-Araque, B.; Gómez-Cantarino, S.; Gutiérrez-Broncano, S.; López-Ruiz, V. Examining the Determinants of Healthcare Workers’ Performance: A Configurational Analysis during COVID-19 Times. Int. J. Environ. Res. Public Health 2021, 18, 5671.
  14. Hair, J.; Hult, G.; Ringle, C.; Sarstedt, M. A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM); Sage Publications: Thousand Oaks, CA, USA, 2014.
  15. Polites, G.L.; Roberts, N.; Thatcher, J. Conceptualizing models using multidimensional constructs: A review and guidelines for their use. Eur. J. Inf. Syst. 2012, 21, 22–48.
  16. Edwards, J.R. Multidimensional Constructs in Organizational Behavior Research: An Integrative Analytical Framework. Organ. Res. Methods 2001, 4, 144–192.
  17. Ragin, C. Fuzzy-Set Social Science; University of Chicago Press: Chicago, IL, USA, 2000.
  18. Ragin, C. Redesigning Social Inquiry: Fuzzy Sets and Beyond; University of Chicago Press: Chicago, IL, USA, 2008.
  19. Rihoux, B.; Ragin, C. Configurational Comparative Methods: Qualitative Comparative Analysis (QCA) and Related Techniques; SAGE: Los Angeles, CA, USA, 2009.
  20. Chiclana, F.; Garcia, J.M.T.; del Moral, M.J.; Herrera-Viedma, E. A statistical comparative study of different similarity measures of consensus in group decision-making. Inf. Sci. 2013, 221, 110–123.
  21. Kline, P. An Easy Guide to Factor Analysis; Routledge: London, UK, 1994.
  22. Nunnally, J.C. Psychometric Theory; McGraw-Hill: London, UK, 1978.
  23. DiStefano, C.; Min, Z.; Mindrila, D. Understanding and Using Factor Scores: Considerations for the Applied Researcher. Pract. Assess. Res. Eval. 2009, 14, 20.
  24. Mellenbergh, G. Chapter 10: Tests and Questionnaires: Construction and administration. In Advising on Research Methods: A consultant’s Companion; Johannes van Kessel Publishing: Huizen, The Netherlands, 2008.
  25. Gnaldi, M.; Bacci, S.; Bartolucci, F. A multilevel finite mixture item response model to cluster examinees and schools. Adv. Data Anal. Classif. 2016, 10, 53–70.
  26. Moreno-Garcia, J.; Castro-Sanchez, J.J.; Jimenez, L. A direct linguistic induction method for systems. Fuzzy Sets Syst. 2004, 146, 79–96.
  27. Quinlan, J. Induction of decision trees. Mach. Learn. 1986, 1, 81–106.
  28. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
  29. Tastle, W.J.; Wierman, M.J. Consensus and dissention: A measure of ordinal dispersion. Int. J. Approx. Reason. 2007, 45, 531–545.
  30. Likert, R. A Technique for the Measurement of Attitudes. Arch. Psychol. 1932, 140, 1–55.
  31. Tran, T.; Phung, D.; Venkatesh, S. Modelling human preferences for ranking and collaborative filtering: A probabilistic ordered partition approach. Knowl. Inf. Syst. 2016, 47, 157–188.
  32. Yager, R.R. On ordered weighted averaging aggregation operators in multicriteria decision-making. IEEE Trans. Syst. Man Cybern. 1988, 18, 183–190.
  33. He, W.; Dutta, B.; Rodríguez, R.M.; Alzahrani, A.A.; Martínez, L. Induced OWA Operator for Group Decision Making Dealing with Extended Comparative Linguistic Expressions with Symbolic Translation. Mathematics 2021, 9, 20.
  34. Pons-Vives, P.J.; Morro-Ribot, M.; Mulet-Forteza, C.; Valero, O. An Application of Ordered Weighted Averaging Operators to Customer Classification in Hotels. Mathematics 2022, 10, 1987.
  35. Hussain, S.; Bashir, S. Co-clustering of multi-view datasets. Knowl. Inf. Syst. 2016, 47, 545–570.
  36. Zhao, W.; Liu, H.; Dai, W.; Ma, J. An entropy-based clustering ensemble method to support resource allocation in business process management. Knowl. Inf. Syst. 2016, 48, 305–330.
  37. Fedrizzi, M.; Pasi, G. Fuzzy logic approaches to consensus modelling in group decision-making. Stud. Comput. Intell. 2008, 117, 19–37.
  38. Parreiras, R.O.; Ekel, P.Y.; Martini, J.S.C.; Palhares, R.M. A flexible consensus scheme for multicriteria group decision-making under linguistic assessments. Inf. Sci. 2010, 180, 1075–1089.
  39. Alcantud, J.C.R.; Calle, R.D.A.; Cascón, J.M. On measures of cohesiveness under dichotomous opinions: Some characterizations of approval consensus measures. Inf. Sci. 2013, 240, 45–55.
  40. Guénoche, A. Consensus of partitions: A constructive approach. Adv. Data Anal. Classif. 2011, 5, 215–229.
  41. Batchelder, W.H.; Anders, R. Cultural Consensus Theory: Comparing different concepts of cultural truth. J. Math. Psychol. 2012, 56, 316–332.
  42. France, S.L.; Batchelder, W.H. Unsupervised consensus analysis for online review and questionnaire data. Inf. Sci. 2014, 283, 241–257.
  43. Fiori, A.; Mignone, A.; Rospo, G. Decoclu. Inf. Sci. 2016, 328, 378–388.
  44. Plaia, A.; Sciandra, M. Weighted distance-based trees for ranking data. Adv. Data Anal. Classif. 2019, 13, 427–444.
  45. Zhang, Z.; Li, Z. Consensus-based TOPSIS-Sort-B for multi-criteria sorting in the context of group decision-making. Ann. Oper. Res. 2022.
  46. de Lima Silva, D.F.; de Almeida Filho, A.T. Sorting with TOPSIS through Boundary and Characteristic Profiles. Comput. Ind. Eng. 2020, 141, 106328.
  47. Gai, T.; Cao, M.; Chiclana, F.; Zhang, Z.; Dong, Y.; Herrera-Viedma, E.; Wu, J. Consensus-trust Driven Bidirectional Feedback Mechanism for Improving Consensus in Social Network Large-group Decision Making. Group Decis. Negot. 2022, 1–30.
  48. Pharhi, O. Lessons learned: A practical approach. J. Knowl. Manag. Pract. 2009, 10.
  49. Yáñez-Araque, B.; Hernández-Perlines, F.; Moreno-Garcia, J. From Training to Organizational Behavior: A Mediation Model through Absorptive and Innovative Capacities. Front. Psychol. 2017, 8, 1532.
  50. Brase, C.H.; Brase, C.P. Understanding Basic Statistics, 7th ed.; Cengage Learning: Hampshire, UK, 2016.
  51. Dawes, J. Do data characteristics change according to the number of scale points use? An experiment using 5 point 7 point and 10 point scales. Int. J. Mark. Res. 2008, 50, 61–104.
  52. Norman, G. Likert scales, levels of measurement and the “laws” of statistics. Adv. Health Sci. Educ. 2010, 15, 625–632.
  53. Miller, G.A. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychol. Rev. 1956, 101, 343–352.
Table 1. Example of questionnaire with two questions.
Part I: The Planning Phase | 1 | 2 | 3 | 4 | 5
Was the project score clear to the team members?
Did the team members actively participate in the estimation process?
Did you feel that your opinion was heard in the planning phase?
Did you state your opinion regarding the construction of the WBS?
Part II: The Execution Phase | 1 | 2 | 3 | 4 | 5
Were the project meetings held at adequate intervals?
Were progress indicator models used?
Were changes in the project adequately controlled?
1 = strongly disagree, 2 = disagree, 3 = have no idea, 4 = agree, 5 = strongly agree.
Table 2. Responses and aggregate values for the two questions of the questionnaire.
Surveyed | I_1^1 | I_2^1 | I_3^1 | I_4^1 | 1st Level | 2nd Level
1 | 4 | 3 | 5 | 4 | 4.00 | 3.19
2 | 4 | 2 | 4 | 3 | 3.25 |
3 | 5 | 4 | 4 | 4 | 4.25 |
4 | 1 | 4 | 4 | 3 | 3.25 |
5 | 2 | 1 | 2 | 1 | 1.50 |
6 | 1 | 2 | 4 | 3 | 2.50 |
7 | 5 | 4 | 4 | 3 | 4.00 |
8 | 5 | 4 | 3 | 3 | 3.75 |
9 | 5 | 1 | 1 | 2 | 2.25 |
col. mean | 3.67 | 2.78 | 3.44 | 2.89 | 3.19 |
Table 3. Partitions obtained for the histogram e.
Part | Part1 | Part2
1 | {{1, 2/9}} | {{2, 1/9}, {4, 2/9}, {5, 4/9}}
2 | {{1, 2/9}, {2, 1/9}} | {{4, 2/9}, {5, 4/9}}
3 | {{1, 2/9}, {2, 1/9}, {4, 2/9}} | {{5, 4/9}}
Table 4. Partitions when e = {{1, 2/9}, {2, 1/9}, {4, 2/9}, {5, 4/9}}.
Part. | Part1 | Part2 | Eval
P1 | {{1, 2/9}} | {{2, 1/9}, {4, 2/9}, {5, 4/9}} | 0.526
P2 | {{1, 2/9}, {2, 1/9}} | {{4, 2/9}, {5, 4/9}} | 0.537
P3 | {{1, 2/9}, {2, 1/9}, {4, 2/9}} | {{5, 4/9}} | 0.403
Table 5. Study of the parameter α with a 7-Likert scale.
α | Me. | Adq. | Asm. | Q1 | Q2 | Q3 | Q4
0.9 | Our | 4 (1.33%) | 0.63 | 1 (1.33%) | 0 (0.0%) | 0 (0.0%) | 3 (4.0%)
0.9 | A.M. | 0 (0.0%) | 0.0 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
0.9 | Equal | 296 (98.67%) | 0.44 | 74 (98.67%) | 75 (100.0%) | 75 (100.0%) | 72 (96.0%)
0.8 | Our | 42 (14.0%) | 0.56 | 7 (9.33%) | 6 (8.0%) | 11 (14.67%) | 18 (24.0%)
0.8 | A.M. | 8 (2.67%) | 0.24 | 3 (4.0%) | 5 (6.67%) | 0 (0.0%) | 0 (0.0%)
0.8 | Equal | 250 (83.33%) | 0.43 | 65 (86.67%) | 64 (85.33%) | 64 (85.33%) | 57 (76.0%)
0.7 | Our | 143 (47.67%) | 0.62 | 8 (10.67%) | 19 (25.33%) | 47 (62.67%) | 69 (92.0%)
0.7 | A.M. | 50 (16.67%) | 0.18 | 31 (41.33%) | 19 (25.33%) | 0 (0.0%) | 0 (0.0%)
0.7 | Equal | 107 (35.67%) | 0.34 | 36 (48.0%) | 37 (49.33%) | 28 (37.33%) | 6 (8.0%)
0.6 | Our | 197 (65.67%) | 0.58 | 14 (18.67%) | 35 (46.67%) | 73 (97.33%) | 75 (100.0%)
0.6 | A.M. | 78 (26.0%) | 0.19 | 47 (62.67%) | 30 (40.0%) | 1 (1.33%) | 0 (0.0%)
0.6 | Equal | 25 (8.33%) | 0.19 | 14 (18.67%) | 10 (13.33%) | 1 (1.33%) | 0 (0.0%)
0.5 | Our | 195 (65.0%) | 0.57 | 17 (22.67%) | 34 (45.33%) | 70 (93.33%) | 74 (98.67%)
0.5 | A.M. | 82 (27.33%) | 0.2 | 51 (68.0%) | 29 (38.67%) | 1 (1.33%) | 1 (1.33%)
0.5 | Equal | 23 (7.67%) | 0.28 | 7 (9.33%) | 12 (16.0%) | 4 (5.33%) | 0 (0.0%)
0.4 | Our | 186 (62.0%) | 0.57 | 19 (25.33%) | 29 (38.67%) | 65 (86.67%) | 73 (97.33%)
0.4 | A.M. | 76 (25.33%) | 0.21 | 44 (58.67%) | 27 (36.0%) | 3 (4.0%) | 2 (2.67%)
0.4 | Equal | 38 (12.67%) | 0.28 | 12 (16.0%) | 19 (25.33%) | 7 (9.33%) | 0 (0.0%)
0.3 | Our | 149 (49.67%) | 0.62 | 9 (12.0%) | 18 (24.0%) | 52 (69.33%) | 70 (93.33%)
0.3 | A.M. | 65 (21.67%) | 0.26 | 32 (42.67%) | 23 (30.67%) | 6 (8.0%) | 4 (5.33%)
0.3 | Equal | 86 (28.67%) | 0.28 | 34 (45.33%) | 34 (45.33%) | 17 (22.67%) | 1 (1.33%)
0.2 | Our | 75 (25.0%) | 0.65 | 3 (4.0%) | 10 (13.33%) | 23 (30.67%) | 39 (52.0%)
0.2 | A.M. | 29 (9.67%) | 0.24 | 15 (20.0%) | 9 (12.0%) | 4 (5.33%) | 1 (1.33%)
0.2 | Equal | 196 (65.33%) | 0.4 | 57 (76.0%) | 56 (74.67%) | 48 (64.0%) | 35 (46.67%)
0.1 | Our | 0 (0.0%) | 0.0 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
0.1 | A.M. | 0 (0.0%) | 0.0 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
0.1 | Equal | 300 (100.0%) | 0.45 | 75 (100.0%) | 75 (100.0%) | 75 (100.0%) | 75 (100.0%)
Table 6. Study of the parameter α with a 9-Likert scale.
α | Me. | Adq. | Asm. | Q1 | Q2 | Q3 | Q4
0.9 | Our | 8 (2.67%) | 0.48 | 2 (2.67%) | 2 (2.67%) | 1 (1.33%) | 3 (4.0%)
0.9 | A.M. | 1 (0.33%) | 0.17 | 1 (1.33%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
0.9 | Equal | 291 (97.0%) | 0.45 | 72 (96.0%) | 73 (97.33%) | 74 (98.67%) | 72 (96.0%)
0.8 | Our | 47 (15.67%) | 0.53 | 6 (8.0%) | 14 (18.67%) | 10 (13.33%) | 17 (22.67%)
0.8 | A.M. | 6 (2.0%) | 0.17 | 5 (6.67%) | 1 (1.33%) | 0 (0.0%) | 0 (0.0%)
0.8 | Equal | 247 (82.33%) | 0.44 | 64 (85.33%) | 60 (80.0%) | 65 (86.67%) | 58 (77.33%)
0.7 | Our | 150 (50.0%) | 0.56 | 13 (17.33%) | 33 (44.0%) | 51 (68.0%) | 53 (70.67%)
0.7 | A.M. | 34 (11.33%) | 0.19 | 23 (30.67%) | 10 (13.33%) | 0 (0.0%) | 1 (1.33%)
0.7 | Equal | 116 (38.67%) | 0.39 | 39 (52.0%) | 32 (42.67%) | 24 (32.0%) | 21 (28.0%)
0.6 | Our | 230 (76.67%) | 0.54 | 25 (33.33%) | 57 (76.0%) | 74 (98.67%) | 74 (98.67%)
0.6 | A.M. | 61 (20.33%) | 0.17 | 44 (58.67%) | 16 (21.33%) | 0 (0.0%) | 1 (1.33%)
0.6 | Equal | 9 (3.0%) | 0.19 | 6 (8.0%) | 2 (2.67%) | 1 (1.33%) | 0 (0.0%)
0.5 | Our | 227 (75.67%) | 0.54 | 24 (32.0%) | 58 (77.33%) | 72 (96.0%) | 73 (97.33%)
0.5 | A.M. | 69 (23.0%) | 0.19 | 48 (64.0%) | 17 (22.67%) | 2 (2.67%) | 2 (2.67%)
0.5 | Equal | 4 (1.33%) | 0.23 | 3 (4.0%) | 0 (0.0%) | 1 (1.33%) | 0 (0.0%)
0.4 | Our | 222 (74.0%) | 0.53 | 26 (34.67%) | 55 (73.33%) | 69 (92.0%) | 72 (96.0%)
0.4 | A.M. | 71 (23.67%) | 0.21 | 46 (61.33%) | 18 (24.0%) | 4 (5.33%) | 3 (4.0%)
0.4 | Equal | 7 (2.33%) | 0.32 | 3 (4.0%) | 2 (2.67%) | 2 (2.67%) | 0 (0.0%)
0.3 | Our | 202 (67.33%) | 0.54 | 24 (32.0%) | 46 (61.33%) | 62 (82.67%) | 70 (93.33%)
0.3 | A.M. | 80 (26.67%) | 0.25 | 44 (58.67%) | 23 (30.67%) | 9 (12.0%) | 4 (5.33%)
0.3 | Equal | 18 (6.0%) | 0.33 | 7 (9.33%) | 6 (8.0%) | 4 (5.33%) | 1 (1.33%)
0.2 | Our | 143 (47.67%) | 0.55 | 16 (21.33%) | 31 (41.33%) | 45 (60.0%) | 51 (68.0%)
0.2 | A.M. | 77 (25.67%) | 0.31 | 36 (48.0%) | 19 (25.33%) | 13 (17.33%) | 9 (12.0%)
0.2 | Equal | 80 (26.67%) | 0.42 | 23 (30.67%) | 25 (33.33%) | 17 (22.67%) | 15 (20.0%)
0.1 | Our | 0 (0.0%) | 0.0 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
0.1 | A.M. | 0 (0.0%) | 0.0 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
0.1 | Equal | 300 (100.0%) | 0.45 | 75 (100.0%) | 75 (100.0%) | 75 (100.0%) | 75 (100.0%)
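Table 7. Number of partitions using a 7-Likert scale.
Number of Partitions (7-Points)
α | 0 | 1 | 2 | 3 | 4 | 5 | 6
0.9 | 296 | 4 | 0 | 0 | 0 | 0 | 0
0.8 | 250 | 50 | 0 | 0 | 0 | 0 | 0
0.7 | 106 | 192 | 2 | 0 | 0 | 0 | 0
0.6 | 9 | 249 | 42 | 0 | 0 | 0 | 0
0.5 | 0 | 165 | 117 | 18 | 0 | 0 | 0
0.4 | 0 | 76 | 111 | 111 | 2 | 0 | 0
0.3 | 0 | 5 | 39 | 131 | 125 | 0 | 0
0.2 | 0 | 0 | 0 | 1 | 79 | 205 | 15
0.1 | 0 | 0 | 0 | 0 | 0 | 0 | 300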
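Table 8. Number of partitions using a 9-Likert scale.
Number of Partitions (9-Points)
α | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
0.9 | 291 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0
0.8 | 247 | 53 | 0 | 0 | 0 | 0 | 0 | 0 | 0
0.7 | 115 | 184 | 1 | 0 | 0 | 0 | 0 | 0 | 0
0.6 | 6 | 258 | 36 | 0 | 0 | 0 | 0 | 0 | 0
0.5 | 0 | 192 | 82 | 25 | 1 | 0 | 0 | 0 | 0
0.4 | 0 | 109 | 105 | 61 | 25 | 0 | 0 | 0 | 0
0.3 | 0 | 7 | 66 | 122 | 66 | 38 | 1 | 0 | 0
0.2 | 0 | 0 | 0 | 2 | 66 | 129 | 77 | 24 | 2
0.1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 300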
Table 9. Technical data sheet. (a) Source: DIRCE 2015 (Central Business Register, CBR or DIRCE in Spanish, on 1 January 2014).
Population scope (universe) | Spanish companies with 50 or more employees, in any sector except public administration, agricultural sector and activities of households and extraterritorial organizations and bodies
Geographical scope | All the national territory/Spanish national territory
Sampling unit/Unit of analysis | Firm
Population census (a) | 22013
Effect size/Statistical power | 0.2301 (small)/0.8001 (Post-Hoc analysis, t test correlation point biserial model, one tail, α error probability of 0.05)
Sample size/response rate | 112 valid surveys/7.18
Sampling procedure | Simple random sampling without replacement
Confidence level | 95%; z = 1.96; p = q = 0.50; α = 0.05
Sampling error | 9.24%
Key respondents | Senior Executives
Date of fieldwork/data collected | Between May and December 2014
Table 10. Information of the variables used in [49].
Variable | Likert Scale | Dimensions No. and Their Name | Items No.
ACAP | 7 | 4: acquisition, assimilation, transformation & exploitation | 14: 3 + 4 + 4 + 3
CONDF | 7 | 1: Training | 5
DORG | 7 | 2: Economic performance & satisfaction performance | 9: 5 + 4
INN | 7 | 2: product innovation & process innovation | 9: 5 + 4
Table 11. Symmetry distribution of the STraDyCaF dataset.
Var. | Q1 | Q2 | Q3 | Q4
ACAP | 96 | 15 | 1 | 0
CONDF | 101 | 11 | 0 | 0
DORG | 103 | 4 | 2 | 3
INN | 104 | 8 | 0 | 0
Total | 404 (90%) | 38 (8%) | 3 (1%) | 3 (1%)
Table 12. Obtained results for the variables of the STraDyCaF dataset.
Var. | α | Me. | N.V. | A.M. | Q1 | Q2 | Q3 | Q4
ACAP | 0.5 | Our | 28 (25.0%) | 0.21 | 18 (18.75%) | 9 (60.0%) | 1 (100.0%) | 0 (0.0%)
ACAP | 0.5 | A.M. | 40 (35.71%) | 0.12 | 34 (35.42%) | 6 (40.0%) | 0 (0.0%) | 0 (0.0%)
ACAP | 0.5 | Equal | 44 (39.29%) | 0.07 | 44 (45.83%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
ACAP | 0.6 | Our | 27 (24.11%) | 0.21 | 17 (17.71%) | 9 (60.0%) | 1 (100.0%) | 0 (0.0%)
ACAP | 0.6 | A.M. | 41 (36.61%) | 0.12 | 35 (36.46%) | 6 (40.0%) | 0 (0.0%) | 0 (0.0%)
ACAP | 0.6 | Equal | 44 (39.29%) | 0.07 | 44 (45.83%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
CONDF | 0.5 | Our | 5 (4.46%) | 0.12 | 5 (4.95%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
CONDF | 0.5 | A.M. | 7 (6.25%) | 0.15 | 6 (5.94%) | 1 (9.09%) | 0 (0.0%) | 0 (0.0%)
CONDF | 0.5 | Equal | 100 (89.29%) | 0.11 | 90 (89.11%) | 10 (90.91%) | 0 (0.0%) | 0 (0.0%)
CONDF | 0.6 | Our | 7 (6.25%) | 0.12 | 7 (6.93%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
CONDF | 0.6 | A.M. | 15 (13.39%) | 0.2 | 8 (7.92%) | 7 (63.64%) | 0 (0.0%) | 0 (0.0%)
CONDF | 0.6 | Equal | 90 (80.36%) | 0.1 | 86 (85.15%) | 4 (36.36%) | 0 (0.0%) | 0 (0.0%)
DORG | 0.5 | Our | 14 (12.5%) | 0.47 | 7 (6.8%) | 2 (50.0%) | 2 (100.0%) | 3 (100.0%)
DORG | 0.5 | A.M. | 28 (25.0%) | 0.11 | 26 (25.24%) | 2 (50.0%) | 0 (0.0%) | 0 (0.0%)
DORG | 0.5 | Equal | 70 (62.5%) | 0.06 | 70 (67.96%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
DORG | 0.6 | Our | 14 (12.5%) | 0.47 | 7 (6.8%) | 2 (50.0%) | 2 (100.0%) | 3 (100.0%)
DORG | 0.6 | A.M. | 28 (25.0%) | 0.11 | 26 (25.24%) | 2 (50.0%) | 0 (0.0%) | 0 (0.0%)
DORG | 0.6 | Equal | 70 (62.5%) | 0.06 | 70 (67.96%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
INN | 0.5 | Our | 9 (8.04%) | 0.27 | 3 (2.88%) | 6 (75.0%) | 0 (0.0%) | 0 (0.0%)
INN | 0.5 | A.M. | 29 (25.89%) | 0.14 | 27 (25.96%) | 2 (25.0%) | 0 (0.0%) | 0 (0.0%)
INN | 0.5 | Equal | 74 (66.07%) | 0.06 | 74 (71.15%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
INN | 0.6 | Our | 9 (8.04%) | 0.27 | 3 (2.88%) | 6 (75.0%) | 0 (0.0%) | 0 (0.0%)
INN | 0.6 | A.M. | 30 (26.79%) | 0.14 | 28 (26.92%) | 2 (25.0%) | 0 (0.0%) | 0 (0.0%)
INN | 0.6 | Equal | 73 (65.18%) | 0.06 | 73 (70.19%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)
TOTAL |  | Our | 113 (12.61%) | 0.27 | 67 (8.30%) | 34 (44.75%) | 6 (100.0%) | 6 (100.0%)
TOTAL |  | A.M. | 218 (24.33%) | 0.14 | 190 (23.51%) | 28 (36.84%) | 0 (0.0%) | 0 (0.0%)
TOTAL |  | Equal | 565 (63.06%) | 0.07 | 551 (68.19%) | 14 (18.42%) | 0 (0.0%) | 0 (0.0%)
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Moreno-Garcia, J.; Yáñez-Araque, B.; Hernández-Perlines, F.; Rodriguez-Benitez, L. An Aggregation Metric Based on Partitioning and Consensus for Asymmetric Distributions in Likert Scale Responses. Mathematics 2022, 10, 4115. https://doi.org/10.3390/math10214115
