1. Introduction
The process of obtaining information from an investigation is, in many areas, achieved by designing and conducting a survey. A survey is a quantitative technique used to obtain primary information, and the basic tool for collecting that information is the questionnaire. It is used in research to obtain the information needed to reach conclusions about the problem under study. Questionnaires have been used in different areas, such as the social sciences [1,2,3], education [4,5], market studies [6], sports [7,8], accident prevention [9,10], augmented reality [11], clinical data [12,13], etc.
The information gathered from a questionnaire is represented using two types of variables: observable variables and latent variables. An observable variable can be defined as “a concept that can be directly observed and measured”; i.e., observable variables are “the directly measured proxy variables that contain the raw data” [14]. These variables are also called indicators (or items, or manifest variables). In contrast, a latent variable is a theoretical concept that cannot be directly observed or measured but can be inferred from observable variables, or from other latent variables that are themselves measured from observable variables. A variable of this type is also called a construct, composite, or factor and can be understood as something “meaningful”. A construct, then, is inferred from one or more indicators; that is, a latent variable is computable from some observable variable or variables. Thus, a multidimensional construct is a construct whose “indicators are themselves latent constructs” [15]. A multidimensional construct establishes a relationship among various latent constructs and allows researchers to match broad predictors with broad outcomes, increasing the explained variance [16]. An indicator of a multidimensional construct is called a dimension and represents one clearly defined aspect of the content domain of the overarching construct. Indicator (or item), factor, composite, construct, and multidimensional construct are the names usually used in statistical analysis applied to the social, political, and behavioral sciences.
In most cases, the data gathered from a questionnaire allow valuable information to be extracted from it. Statistical analysis or other analytic techniques, such as fsQCA [17,18,19], are used for this purpose. These methods usually require a process in which the information is aggregated. This aggregation is especially useful in two main situations: (1) to conflate the responses to the different items composing a question (construct) [20] or (2) to combine the answers given by the different respondents to a concrete question item.
The use of measures of central tendency, such as the arithmetic mean, is common. A measure of central location tries to represent the center or location of the distribution with a single value. A subject's score on a factor is the sum of his or her responses to the items selected to define that factor (direct scores). This practice is equivalent to assigning weights of 1 to the items if they define the factor, or 0 if they do not [21,22]. When the factors are not defined by the same number of items (as is usual), it is recommended to calculate each subject's mean on each factor (the sum of his/her responses to the items of the factor divided by the number of items of the factor); in this way, the scores of each subject (or group means) on the different factors can be compared with each other [23]. These measures have the problem of being very sensitive to extreme values, which makes them unsuitable for asymmetric distributions. Furthermore, the opinions collected in a survey can generate an asymmetric distribution that will not be correctly reflected by a central tendency measure. Therefore, measures of central tendency are not the most appropriate for obtaining a value that represents the information collected in a survey. Taking into account these considerations, studied in previous works of the authors in the field of the social sciences [1,2,3], it is interesting to design a method that generates groups of opinions from the surveys. Thus, instead of representing the distribution by means of a measure of central tendency, we propose to represent it with a value closer to the most representative group. To do this, groups of opinions are detected among the survey respondents, and each of these groups is represented by a single value when the aggregate value is computed. To obtain the appropriate groups of opinion, consensus measures are used, and, to detect such groups, domain partition techniques are applied.
This paper proposes a new aggregation method for questionnaires with closed-ended questions [24] used to measure a given phenomenon under investigation. The proposed method can treat ordered bounded domains, such as an ordinal-polytomous scale [25]. It is based on a continuous division of the input space [26,27] that groups closer answers into partitions. To establish such clusters, the concept of entropy [28,29] and a distance measure are used. In order to test the validity of our proposal, the results obtained from experimentation are compared with the arithmetic mean using both synthetic and real databases.
1.1. Contributions of the Present Paper
Multiple-item scales are widely used in making hiring decisions, in assessing student, customer, and employee satisfaction, in conducting needs assessments and program evaluations, and in scientific research projects. Unfortunately, those who construct these scales often have little knowledge of how to effectively develop and evaluate them. When scales are not properly developed, they typically yield data that are unsuitable for their intended purpose.
The Likert scale aims primarily at solving a technical problem that arose in relation to the quantitative aspects of the study of social attitudes. Likert [30], in his original research work on the development of the scales that bear his name, assumes that, for experimental purposes, attitudes are distributed fairly normally. However, he recognizes that this assumption may not be correct, so he calls for future research to determine its correctness or incorrectness by further experiment. This paper responds to that call, proposing a solution for the calculation of Likert scales when the assumption of normality is not met, that is, when the distribution is asymmetric.
We present a new approach to computing an aggregate value that represents Likert scale responses, viewed as a histogram, in a way that is adequate for asymmetric distributions. In addition, the behavior of our method is not far from that of the arithmetic mean for symmetric distributions. There are different possibilities in the literature for aggregating information into a single value, and most of them are based on measures of central tendency, which are not appropriate for asymmetric distributions.
The presented method is based on the use of successive divisions to generate partitions of the input space. Furthermore, the use of consensus has been proven to be an appropriate way to validate a set of opinions. For this reason, consensus is used to measure the goodness of the candidate partitions; more concretely, this paper adopts the approach of Tastle [29], designed to measure a Likert scale. Our approach proposes a measure based on consensus, together with the distances between the candidate partitions, to select one of them. In addition, the way the candidate partitions are generated is another innovative contribution.
Finally, this paper proposes the use of a parameter to control the partitioning rate. It takes values in a bounded interval and, at one particular value, has no effect. In this way, the behavior of the algorithm can be controlled by the end user, something that is not possible in other existing aggregation functions.
1.2. Structure
This paper is organized as follows: Section 2 reviews the background of the treated issue. Then, Section 3 provides details on our proposal to compute the aggregate value. Section 4 tests the proposed method and presents a discussion of the results achieved. Lastly, Section 5 summarizes the final conclusions and future research.
2. Background
Aggregation methods have been widely used in different research areas. Most of them are mathematical methods, such as the arithmetic mean [31] or the weighted arithmetic mean, which allows a weight to be indicated for each component. The Ordered Weighted Averaging (OWA) operator is another method, introduced by Yager [32]. It is widely used in applied mathematics and fuzzy logic, and it can also be used to aggregate the information obtained from a questionnaire. For instance, He et al. [33] provide new aggregation operators for Extended Comparative Linguistic Expressions with Symbolic Translation (ELICIT) information by developing novel OWA-based operators, such as the Induced OWA (IOWA) operator; these avoid the OWA operator's need to reorder its arguments, since ELICIT information has no inherent order due to its fuzzy representation. Pons-Vives et al. [34] provide an application of OWA operators to customer classification in hotels. These authors argue that the use of the OWA operator improves the performance of the classical k-means algorithm and reduces the number of convergence iterations. An associated collection of weights $W = (w_1, \ldots, w_n)$, with each $w_j$ lying in the unit interval and $\sum_{j=1}^{n} w_j = 1$, is needed in order to compute an OWA operator of dimension $n$ (Equation (1)):

$$\mathrm{OWA}(a_1, \ldots, a_n) = \sum_{j=1}^{n} w_j\, b_j, \qquad (1)$$

where $b_j$ is the $j$th largest element in $(a_1, \ldots, a_n)$.
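To illustrate, a minimal Python sketch of the OWA operator of Equation (1) follows; the function name and the example weights are ours, not taken from the cited works:

```python
def owa(values, weights):
    """Ordered Weighted Averaging (Equation (1)): the weights are applied
    to the values sorted in descending order, not to their positions."""
    assert len(values) == len(weights)
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must lie in [0, 1] and sum to 1"
    ordered = sorted(values, reverse=True)        # b_j = jth largest value
    return sum(w * b for w, b in zip(weights, ordered))

# An "optimistic" weighting that emphasizes the largest responses:
print(owa([3, 5, 1, 4], [0.4, 0.3, 0.2, 0.1]))    # 0.4*5 + 0.3*4 + 0.2*3 + 0.1*1 = 3.9
```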
Another category of methods, such as the one we present in this paper, is based on consensus [20,35,36]. A recent study of the aggregation methods based on consensus is presented in [20]; some relevant papers about consensus are now described in order to introduce the reader to the treated issue.
Fedrizzi and Pasi [37] present a review of fuzzy logic-based approaches to modeling consensus. These are applied in the context of fuzzy Group Decision Making (GDM), and two different models are synthesized to model consensus under individual fuzzy preferences; in that paper, OWA operators are used to reach a consensus. Parreiras et al. [38] presented a flexible consensus scheme that allows a consistent collective opinion to be captured from a group of experts. It is based on a linguistic hierarchical model, which is advantageous both from the human viewpoint and from the operational viewpoint; a moderator can intervene in the discussion process, and the authors test the approach on a hypothetical enterprise strategy planning problem. Alcantud et al. [39] presented the concept of approval consensus measure (ACM) to measure the cohesiveness conveyed by the expression of dichotomous opinions; the 2012 presidential elections in the USA were selected as a real scenario to test their approach. In [40], from a set of objects, a consensus partition containing a maximum number of joined or separated pairs in such a set is established. Cultural Consensus Theory (CCT) is an approach to information pooling (aggregation and data fusion) used for measurement and inference in the social and behavioral sciences [41]. France and Batchelder [42] described research related to CCT in which a clusterwise version of continuous cultural consensus analysis, called CONSCLUS, was designed based on k-means. Their CCT-Means algorithm is used to fit CONSCLUS, and an algorithm similar to k-means is used to compute the clusters. The centroids of the clusters are calculated using the continuous CCT procedure, and the algorithm gives good performance on relatively large datasets. They also implemented an extension of CONSCLUS for fuzzy clustering, using an alternating least squares extension of the C-means algorithm, and in their experimentation they showed how CONSCLUS could be used to analyze a set of online review data. To compute the topology of a public transport network, Fiori et al. [43] used a consensus clustering density-based approach, referred to as DeCoClu (Density Consensus Clustering). They infer the geographical locations of stops from GPS data and use a consensus clustering strategy based on a new distance function to compute the relative distances between points. Experiments conducted on real data collections provided by a public transport company showed the utility of their proposal. Plaia et al. [44] define a consensus ranking in order to assign a class label or class ranking to each node in a decision tree. Zhang and Li [45] develop two consensus-based TOPSIS-Sort-B algorithms (a variation of the Technique for Order of Preference by Similarity to Ideal Solution, known as the TOPSIS-Sort method) to deal with multi-criteria sorting in the context of group decision-making (MCS-GDM) problems. The authors define consensus measures and devise different feedback adjustment mechanisms and consensus-reaching algorithms to help experts reach consensus, considering the different needs of MCS-GDM problems. Thus, TOPSIS-Sort-B is presented as an improved version of TOPSIS-Sort for sorting problems in which boundary profiles should be determined. Other variations of the TOPSIS-Sort method have been developed, such as TOPSIS-Sort-C, which should be used to address problems in which it is more appropriate to determine characteristic profiles [46]. For their part, Gai et al. [47] propose a consensus-trust-driven framework of bidirectional interaction for social network large-group decision-making; the proposed framework is applied to a blockchain platform selection problem in the supply chain to demonstrate its effectiveness and applicability.
3. Materials and Methods
A questionnaire is composed of a group of $m$ questions, represented as $Q = \{q_1, \ldots, q_m\}$, where each question $q_i$ can be seen as a group of $n_i$ items $\{it_{i1}, \ldots, it_{in_i}\}$, and each item takes values in a concrete domain. The Likert scale [30] is frequently used as the domain to measure the responses in a questionnaire. For this reason, the proposed method is designed to be used with this scale.
In order to contextualize our proposal, we now study a concrete example based on the first question of a questionnaire taken from [48] (Table 1), using a 5-point Likert scale such as this: “1 = strongly disagree, 2 = disagree, 3 = have no idea, 4 = agree, 5 = strongly agree”. Suppose that nine respondents completed this question with the values shown in Table 2 and that the arithmetic mean is used in this example to aggregate the responses. In Table 2, there are two levels of aggregation for each question: the first one, shown in the column “1st level”, represents the response of each respondent to an item; the second one, column “2nd level”, is the result of the aggregation process applied to the values of the first level. In addition, the arithmetic mean of each answer is shown in the last row (“mean”).
The proposed approach builds a successive partition of the frequency histogram of the occurrences of the responses: the initial histogram is split into two partitions, and these partitions are divided again and again until they cannot be divided any more (Section 3.2). The partition condition is a measure designed specifically for this purpose (Section 3.3). Once the splitting has finished, the final aggregate value is computed (Section 3.4). Section 3.1 shows the algorithm in detail, and an example is given in Section 3.5.
3.1. Detailed Algorithm of the Proposed Method
Algorithm 1 details the operation of our method. It computes the histogram, discarding the items with a frequency equal to 0 (Lines 1–2); the remaining elements are stored in the processing list $LP$. The output list containing the selected partitions, $LO$, is initialized next (Line 3). In each iteration, an element is taken from $LP$ (Line 5). If it can still be divided (Line 6), all possible partitions composed of two sets of responses are generated (Line 9 and Section 3.2). Each candidate partition is then evaluated using an evaluation function that allows the best partition to be selected (Line 10 and Section 3.3). A parameter $\alpha$ is used to control the way the partitions are made (Line 11). This is the only parameter used; it selects the rate of partitioning, and its behavior will be observed in the experiments: the smaller $\alpha$ is, the greater the rate of partitioning becomes. If two new partitions are obtained, they are added to $LP$, and they will be processed in the following iterations of the loop (Lines 12–13); otherwise, the original partition is produced as output (Line 15). This process is repeated until no new partition can be obtained (Line 4). Lastly, the aggregate value is computed using the partition of the input space (Line 20 and Section 3.4). Another aggregation function must be selected if the input histogram has not been divided (Line 22).
Algorithm 1 Partitioning and consensus-based method

1:  H = CalculatingTheHistogram()
2:  LP = DeletingElementsWithFrequencyZero(H)   {LP: list of elements pending processing}
3:  LO = ∅                                      {output list with the selected partitions}
4:  while LP ≠ ∅ do
5:      e = LP.pop()                            {e is removed from LP and is processed}
6:      if length(e) = 1 then
7:          LO.add(e)                           {e cannot be divided and is added to LO}
8:      else
9:          C = GeneratingThePartitions(e)      {Section 3.2}
10:         P = ObtainingBestPartition(C)       {Section 3.3}
11:         if eval(P) > α · cons(e) then       {α: partitioning-rate parameter}
12:             LP.add(P1)                      {adding first partition of P}
13:             LP.add(P2)                      {adding second partition of P}
14:         else
15:             LO.add(e)                       {e is not divided and is added to LO}
16:         end if
17:     end if
18: end while
19: if length(LO) > 1 then
20:     av = ComputingAggregateValue(LO)        {Section 3.4}
21: else
22:     compute av using another aggregation function, such as the arithmetic mean, OWA, etc.
23: end if
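The following is a structural sketch of Algorithm 1 in Python. The evaluation, consensus, and representative-value functions are left as parameters, histograms are lists of (value, frequency) tuples, and all names are ours; treating the parameter alpha as a multiplier on cons(e) in Line 11 is an assumption, since the exact comparison did not survive extraction:

```python
def partition_and_aggregate(histogram, eval_fn, cons_fn, rv_fn, alpha=1.0):
    """Structural sketch of Algorithm 1. eval_fn scores a candidate cut
    (left, right); cons_fn and rv_fn stand for Equations (3)/(4) and (6);
    alpha is the partitioning-rate parameter (multiplicative use assumed)."""
    pending = [[vf for vf in histogram if vf[1] > 0]]      # Lines 1-2
    output = []                                            # Line 3
    while pending:                                         # Line 4
        e = pending.pop()                                  # Line 5
        if len(e) == 1:                                    # Lines 6-7
            output.append(e)
            continue
        cuts = [(e[:i], e[i:]) for i in range(1, len(e))]  # Line 9, Section 3.2
        best = max(cuts, key=eval_fn)                      # Line 10, Section 3.3
        if eval_fn(best) > alpha * cons_fn(e):             # Line 11
            pending.extend(best)                           # Lines 12-13
        else:
            output.append(e)                               # Line 15
    if len(output) > 1:                                    # Lines 19-20, Equation (8)
        freqs = [sum(f for _, f in p) for p in output]
        return sum(fp * rv_fn(p) for fp, p in zip(freqs, output)) / sum(freqs)
    total = sum(f for _, f in histogram)                   # Line 22: fall back to the mean
    return sum(v * f for v, f in histogram) / total
```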
3.2. Generation of the Partitions
The selected partition is divided into two parts: each part contains consecutive values of the original partition, and all possible combinations are computed. If the input histogram has $n$ values, then there exist $n - 1$ possible partitions using this cutting strategy. For example, let $e$ be the histogram of one of the columns of Table 2, representing nine responses on a 5-point Likert scale in which item 1 has been chosen two times, item 2 in only one response, etc. There are three possible partitions, since $e$ contains four values. Table 3 shows the partition $e$ divided into two partitions, referred to as $p_1$ and $p_2$.
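A minimal sketch of this cutting strategy follows, assuming the histogram is given as a list of (value, frequency) tuples with zero-frequency values already removed; the frequencies for items 4 and 5 in the example are illustrative, since the concrete values of Table 2 are not reproduced here:

```python
def generate_partitions(histogram):
    """Return the n - 1 ways of cutting a histogram of n entries into
    two non-empty parts, each containing consecutive scale values."""
    return [(histogram[:i], histogram[i:]) for i in range(1, len(histogram))]

# Nine responses on a 5-point scale (value 3 has frequency 0 and is omitted);
# four entries yield three possible cuts.
h = [(1, 2), (2, 1), (4, 2), (5, 4)]
for left, right in generate_partitions(h):
    print(left, "|", right)
```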
3.3. Evaluating the Partitions
Now, the equations that define our proposal will be presented. Equation (2) is used to evaluate a candidate partition $P$; it takes values in the interval $[0, 1]$, and a larger evaluation value indicates a better partition. It considers the consensus of each component of the partition $P$ (Equation (3)) and the distance between the partitions $p_1$ and $p_2$ (Equation (5)). Equation (3) takes a value in the interval $[0, 1]$, since the consensus function $\mathrm{Cns}$ (Equation (4)) offers an output value in this interval; in it, $P$ is the partition composed of $p_1$ and $p_2$, and $w_i$ is the width of $p_i$.
Different measures of consensus have been proposed previously in the literature. The approach of Tastle and Wierman [29] has been selected, since it is a measure designed specifically to measure the consensus of a standard Likert scale. Furthermore, this method is valid for measuring other scales of closed-ended questions. Tastle and Wierman proposed the use of the mean and standard deviation as measures of dispersion, along with the Shannon entropy (Equation (4)). In their proposal, the consensus of a partition with a single element is 1, and several examples showing its behavior can be found in the original paper:

$$\mathrm{Cns}(X) = 1 + \sum_{i=1}^{n} p_i \log_2\left(1 - \frac{|X_i - \mu_X|}{d_X}\right), \qquad (4)$$

where $p_i$ is the probability associated with the distribution under consideration, $\mu_X$ is the mean of $X$, $d_X$ is the width of $X$, and $|\cdot|$ is the absolute value.
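A sketch of this measure as reconstructed in Equation (4), operating on a (value, frequency) histogram (function and variable names are ours):

```python
import math

def consensus(histogram):
    """Tastle-Wierman consensus (Equation (4)) of a histogram given as
    a list of (value, frequency) tuples."""
    total = sum(f for _, f in histogram)
    mean = sum(v * f for v, f in histogram) / total
    width = max(v for v, _ in histogram) - min(v for v, _ in histogram)
    if width == 0:
        return 1.0  # a partition with a single element has consensus 1
    return 1 + sum((f / total) * math.log2(1 - abs(v - mean) / width)
                   for v, f in histogram)

print(consensus([(1, 5), (5, 5)]))  # opinions split between the extremes -> 0.0
print(consensus([(3, 10)]))         # unanimous responses -> 1.0
```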
As indicated above, a distance measure is defined between two partitions. Equation (5) takes a value in the interval $[0, 1]$, and it fulfills $rv(p_2) > rv(p_1)$, since each element in $p_2$ is greater than each element in $p_1$:

$$dist(p_1, p_2) = \frac{rv(p_2) - rv(p_1)}{w}, \qquad (5)$$

where $w$ is the width of the input Likert scale, and $rv$ returns the representative value of a partition.
To compute the representative value of a partition, the function $rv$ is used (Equation (6)):

$$rv(p) = \frac{1}{Z} \sum_{(j, f_j) \in p} j \cdot \varphi(f_j) \cdot f_j, \qquad (6)$$

where $(j, f_j)$ is a tuple in which $j$ is a Likert scale value and $f_j$ its frequency, $\varphi$ is the reduction factor defined by Equation (7), and $Z = \sum_{(j, f_j) \in p} \varphi(f_j) \cdot f_j$ normalizes the value.

The idea is to detect the areas of maximum frequency concentration within the partition. For this reason, a method has been designed that reduces the small frequencies and maintains the highest ones (Equation (7)) by applying a factor to each of the frequencies in the sample. The frequencies that fall in the lower third of the highest frequency are multiplied by the smallest reduction factor, those that fall in the second third are multiplied by a larger one, and those that fall in the upper third are not modified, so their factor is equal to 1.

The distribution obtained is then normalized, and its mean, which is the representative value of the sample, is calculated. By reducing the lowest frequencies and keeping the highest ones, the new average approaches the areas of maximum frequency concentration.
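The sketch below follows this description; since the two reduction factors did not survive extraction, FACTOR_LOW and FACTOR_MID are illustrative assumptions rather than the authors' constants:

```python
FACTOR_LOW = 0.25   # assumed factor for frequencies in the lower third
FACTOR_MID = 0.50   # assumed factor for frequencies in the middle third

def representative_value(histogram):
    """Representative value rv (Equation (6)): damp the smaller frequencies
    (Equation (7)), renormalize, and return the mean of the result."""
    top = max(f for _, f in histogram)

    def factor(f):
        if f <= top / 3:
            return FACTOR_LOW
        if f <= 2 * top / 3:
            return FACTOR_MID
        return 1.0          # upper third: frequency kept unchanged

    damped = [(v, factor(f) * f) for v, f in histogram]
    z = sum(f for _, f in damped)               # normalization constant
    return sum(v * f for v, f in damped) / z    # mean of the damped distribution

# The damped mean moves toward the zone of highest frequency concentration:
print(representative_value([(1, 1), (4, 5), (5, 6)]))  # ~4.47 vs plain mean 4.25
```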
The main idea of the approach is to split only when a new partition $P$ reaches a good consensus (Equation (3)) and when the two partitions in $P$ are far enough from each other (Equation (5)). We consider this the best way of obtaining a more consistent partitioning. For instance, let $P$ be the partition selected in Section 3.2 (row 2 of Table 3), composed of $p_1$ and $p_2$. First of all, Equation (4) is applied to calculate the consensus value of each component. Next, Equation (3) is computed from these consensus values and the widths $w_1$ and $w_2$ of the components. Equation (5) then computes the distance between $p_1$ and $p_2$. Lastly, the evaluation value of $P$ is calculated using Equation (2).
3.4. Calculating the Aggregate Value
The aggregate value is computed using Equation (8) once the best partition has been selected. This operation sums, over the partitions, the total frequency of each partition multiplied by a representative value of that partition, so that each partition is represented by a single value (Equation (6)):

$$av = \frac{1}{N} \sum_{p \in LO} f_p \cdot rv(p), \qquad (8)$$

where $N$ is the total number of responses, $f_p$ is the sum of the frequencies of the elements of a partition $p$ in the output list $LO$, and $rv(p)$ is calculated using Equation (6).
Although $rv$ is used in Equation (8), it can be substituted by other measures, such as the mode or the median. The justification for using $rv$ is related to our aim of representing each partition with a value that considers the zone that concentrates the maximum frequencies. As an example, the partition given in Section 3.3 (row 2 of Table 3) is now used to illustrate the operation of Equation (8): $p_1$ and $p_2$ are represented by the values $rv(p_1)$ and $rv(p_2)$, respectively, and the aggregate value is then computed as their frequency-weighted mean.
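A short sketch of this final step, reading Equation (8) as a frequency-weighted mean of the representative values (our reading of the prose; the numbers are illustrative):

```python
def aggregate_value(partitions, rep_values):
    """Equation (8) as reconstructed above: each partition p_k, holding a
    total frequency f_k, is represented by the single value rep_values[k]."""
    freqs = [sum(f for _, f in p) for p in partitions]
    return sum(f * rv for f, rv in zip(freqs, rep_values)) / sum(freqs)

# Two partitions holding 3 and 6 of the 9 responses, represented by the
# illustrative values 1.5 and 4.5: (3*1.5 + 6*4.5)/9 = 3.5.
print(aggregate_value([[(1, 2), (2, 1)], [(4, 2), (5, 4)]], [1.5, 4.5]))
```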
3.5. Example of the Operation of Algorithm 1
The operation of this algorithm will now be shown step by step using an example. The item used has been answered by nine respondents, each response on a 5-point Likert scale; the values have been taken from one of the columns of Table 2, and a concrete value is fixed for the parameter $\alpha$. Then:
Line 1: The histogram is obtained using the answers.
Line 2: The processing list $LP$ is initialized with a single set containing every element of the histogram; the value 3 is not added because its frequency of occurrence is equal to 0.
Line 3: The list that contains the output partitions, $LO$, is initialized to ∅.
Line 4: The main loop finishes when there are no partitions left to process, i.e., $LP$ is empty.
Line 5: An element $e$ is taken and removed from $LP$; in this case, there is only one, the whole histogram.
Lines 6–7: If $e$ only contains one element, it is added to the output list (Line 7), since it cannot be divided. This situation does not occur at this point of the example.
Lines 8–9: Otherwise, all the possible partitions of $e$ are generated and stored in $C$, the list that contains all the possible partitions, as shown in Table 4 (Section 3.2).
Line 10: The best partition $P$ is now selected using Equation (2). The column “eval” in Table 4 shows the values obtained in this case; the best partition is the one that reaches the greatest evaluation value.
Lines 11–13: If the evaluation value of $P$ (Equation (2)) is greater than the consensus value of $e$, then the two components of $P$ ($p_1$ and $p_2$) are added to $LP$ (Lines 12 and 13); otherwise, $e$ is not split any more and is added to $LO$ (Lines 14–15). In our example, the division is carried out because $P$ obtains a greater evaluation value than the consensus value obtained by $e$. By means of the parameter $\alpha$, the user can indicate the desired rate of splitting. In the next iteration of the main loop, the partitions in $LP$ are processed in the same way.
Line 19: When the loop finishes, the final partition of the input space has been selected, and the final aggregate value is computed from it using Equation (8) (Line 20).
5. Conclusions
A new method for aggregating the collected information, more suitable for asymmetric distributions, has been presented. This method generates a set of partitions using an approach based on successive divisions. It forms groups of closer answers, represented as partitions, using a measure based on consensus and a distance measure between the possible partitions. The method allows the rate of partitioning to be selected by means of a parameter; several experiments have been completed to test the behavior of such a parameter.
The structure of the algorithm allows the evaluation function to be changed in order to consider other criteria when dividing the input partition. The evaluation function combines a consensus measure with a distance measure between a unique representative value of each partition.
The experimentation aims to test how the method generates the partitions, and the influence of the parameter on the creation of these partitions, on synthetic and real datasets. Promising results have been obtained, verifying that the aggregate value obtained by our approach is appropriate for asymmetric distributions while yielding results similar to the arithmetic mean for symmetric distributions.
As future work, we intend to investigate other evaluation functions for the partitions in order to obtain an aggregate value that improves on the one obtained by the current method. In addition, the study of alternative consensus measures is another important research line to be explored.