Article

iKeyCriteria: A Qualitative and Quantitative Analysis Method to Infer Key Criteria since a Systematic Literature Review for the Computing Domain

1 Departamento de Informática y Ciencias de la Computación, Escuela Politécnica Nacional, Quito 170525, Ecuador
2 Centro de Estudios en Microelectrónica y Sistemas Distribuidos, Universidad de Los Andes, Mérida 5101, Venezuela
3 GIDITIC, Universidad EAFIT, Medellín 050022, Colombia
4 Intelligent and Interactive Systems Laboratory, Universidad de Las Américas, Quito 170125, Ecuador
* Author to whom correspondence should be addressed.
Submission received: 25 April 2022 / Revised: 16 May 2022 / Accepted: 17 May 2022 / Published: 26 May 2022
(This article belongs to the Section Information Systems and Data Management)

Abstract

A systematic literature review is a synthesis of the available evidence in which the quantitative and qualitative aspects of primary studies are reviewed to summarize the existing information on a particular topic. Researchers extract key criteria from the papers collected in their study area by answering research questions and analyzing documents. Nonetheless, in some cases these criteria are not properly justified, so their true level of importance in the study subject remains unknown. Hence, an additional study is necessary to explain the relevance of the criteria in the papers studied using qualitative and quantitative premises. The correct identification of these key criteria is a critical factor in prioritizing and achieving appropriate results in any scientific research work. In this paper, we propose a new method to determine key criteria from a literature review. It is composed of three components: input, process, and output. First, the inputs are a set of criteria to evaluate and a set of documents to analyze. Next, the process component examines the document set to indicate whether the criteria to be analyzed are found, and produces a Boolean matrix. This matrix is the input of a mathematical logic process whose output is the set of key criteria considered necessary and sufficient. The iKeyCriteria method has been applied in different computing domains, particularly serious games design and virtual organizations, giving positive results in each context. Finally, we developed an online tool that supports the execution of our method.

1. Introduction

A systematic literature review (SLR) is a research methodology designed to answer a focused research question. This methodology is becoming an important resource that helps researchers search for information and solve problems in different research areas. Its findings enable them to learn about advancements, characteristics, and challenges, and thus to create new proposals or improve existing ones in specific fields. Currently, it is applied in all study areas [1].
An SLR is a way of evaluating and interpreting all the available research relevant to a particular research domain, thematic area, or phenomenon of interest [2]. For example, Cochrane reviews [3] summarize the results of available and carefully designed studies for controlled clinical trials and provide a high level of evidence on the efficacy of health interventions. According to [4], an SLR allows the synthesis of research findings to discover areas where further research is needed. In general, an SLR must define research questions, search strings, inclusion and exclusion criteria, quality criteria, among others, and then retrieve literature that helps answer the research questions.
SLRs are carried out in the scientific field as a means of extracting key criteria for a study field through answers to research questions and document analysis. Nonetheless, in some cases these criteria are not adequately justified and, as a result, their relevance in the papers examined is not revealed. This makes it impossible to contrast their level of importance in the study subject. For this reason, an additional study is necessary to explain the relevance of the criteria in the papers studied using qualitative and quantitative foundations. Such a method would significantly speed up the research process when generating new proposals or solutions in various study fields, whereas the lack of methods or processes to establish the relevance of a criteria selection delays research results and their adequate justification.
We consider it critical to review the literature early in the research process to identify gaps in a specific study area and, based on these findings, to propose new supported knowledge.
In our work, the final documents of the SLR are the input for the qualitative and quantitative analysis that yields the key criteria of a specific domain. This article aims to be a starting point for researchers interested in applying an SLR methodology, helping them to highlight and identify key criteria in each field of research. Namely, we propose a novel qualitative and quantitative analysis method that identifies the key criteria related to a specific area of knowledge and that has been tested particularly in the computing domain.
This proposal is an extension of a regular SLR. It combines the TF-IDF (term frequency-inverse document frequency) metric, which is related to the frequency of a term in a set of papers, with a qualitative analysis that considers the researchers’ opinion. A set of logical axioms is then used to determine key patterns, from which the key elements to be considered in each area are obtained. The quantitative method based on the TF-IDF metric was automated using the papers collected in the SLR. The automation process can manage papers in many languages.
The qualitative and quantitative analysis method was applied in different case studies to obtain key criteria in the areas of serious games [5], user-centered design [6], and virtual organizations, the latter to define a characterization in the Industry 4.0 context [7].
The main contribution of this paper is to help researchers select key criteria and verify their relevance in different areas of knowledge, allowing them to obtain the relevance of a criterion in a single document as well as in a collection of documents. With these results, researchers can know which elements are important and useful for generating new knowledge in their study area. The paper is structured as follows: Section 2 presents related work; Section 3 describes our proposal to obtain the necessary criteria from an SLR; Section 4 describes the automation of our proposal’s quantitative analysis process; Section 5 presents a case study in the domain of methodologies for serious games design; and Section 6 reports conclusions and future work.

2. Related Works

The impact of emerging technology is essential [8] when integrating communication through different interactive media. For this reason, a summary of different articles related to the key criteria selection to improve decision-making in different research areas is presented.
In [9], the authors propose a criteria identification to evaluate the health literature critically, which is obtained from an SLR as well as the experts’ knowledge consideration, to define a priority.
The authors of [10] present a study on the selection of environmentally friendly suppliers that seek to provide efficient and simple alternatives for taking advantage of organic waste from the community in general, based on key ecological selection criteria. This work shows the importance of a set of criteria that supports decisions about suppliers that meet the ecological characteristics. However, in this study, the selected criteria are based on a literature analysis of general environmental factors, allowing for the development of common ecological standards. Also, the experts’ experience gives weight to the criteria to obtain the most important ones in the ecological area. Our methodological proposal based on qualitative and quantitative analysis would be helpful for this research, since the authors could define procedures to determine basic criteria from an SLR to select green suppliers.
Other authors [11] present criteria selection using text mining approaches. As a result of an SLR on text mining techniques, papers were published. In this article, the criteria definition begins from an SLR observed to avoid the subjectivity presented in some cases.
In contrast, [12] describes a process to identify criteria that aid in the selection of a production system compatible with manufacturing companies. The authors explain the procedure for choosing these criteria: a literature search is carried out first to identify the criteria; criteria obtained from the experts’ opinions are then added; and the criteria are validated using a Delphi method. In that work, the method to determine criteria for a specific topic is the main subject, and the Delphi method is used to resolve ambiguities. We consider a similar approach to criteria validation, but use logical axioms to obtain the criteria pattern necessary in a specific context.
In earlier research, the authors of [13] proposed a method to find selection criteria for Enterprise Resource Planning (ERP) and Customer Relationship Management (CRM) systems used by small and medium-sized enterprises (SMEs) in Poland. The method collects information from online surveys and analyzes it using Spearman’s rank-order correlation to find the key selection criteria that ensure the optimal purchase decision for the correct ERP and/or CRM system. An SLR on the selection criteria of ERP and/or CRM systems was conducted, and the opinions of 83 respondents with work experience in SMEs that had previously implemented such systems were considered. According to the authors, the study results can be applied in similar contexts related to decision-making in SMEs for the selection of a CRM and/or ERP class system.
In summary, the researchers in the studies analyzed begin with an SLR when selecting criteria for their works, and experts’ opinions or mathematical methods are then used to prioritize the criteria and determine their relevance. However, each of the articles analyzed presents only a general criteria selection process starting from an SLR, without a complete and detailed process that shows each activity needed to find the selection criteria. Such a process would allow researchers to replicate the criteria selection in different areas of specific knowledge.
An SLR answers a defined research question by collecting and summarizing all empirical evidence that fits pre-specified eligibility criteria. A meta-analysis is the use of statistical methods to summarize the results of these studies.
In our case, we applied a rigorous meta-analysis using a statistical method to summarize the results of these studies. Our proposal presents some characteristics to consider for relevant criteria selection in a specific study area, which can be replicated in any context of computer science.
Table 1 shows what aspects are considered with the proposed method. As we see, not all the studies analyzed contemplate these elements when selecting or presenting relevant criteria. They always present their results via SLR.

3. Methodological Proposal to Infer Key Criteria

Our method is composed of three components: input-process-output. The process illustrated in Figure 1 presents the activities to be carried out to identify key criteria.
First, the inputs are a set of criteria to evaluate and a document set to analyze. Then, the process offers two paths: one is based on the researcher’s opinion who analyzes the document sets to indicate if the criteria to be analyzed are found; the other is based on statistical analysis, which considers the frequency of occurrence of the criteria in the document sets.
The two paths generate a Boolean matrix, which serves as the input of the mathematical logic process; its output is the set of key criteria considered necessary and sufficient.
The following sections describe in detail each part of the process shown in Figure 1.

3.1. Inputs

The base documents often include identical results and certain differential characteristics that we seek to identify and classify through statistical analysis. The initial criteria are the basic arguments or keywords from independent primary studies obtained from a review of several bibliographical sources of information (documents of the context to be studied) focused on the same question. When the study context is clear, documents or books serve as a base with a set of concepts or key terms, that enclose that domain, and each of the key criteria arises as input to the process. The initial criteria selection can be based on the experience of the researcher, or based on a source of scientific information.
For example, if we want to validate the characteristics that a software requirement must fulfill to be well-formulated, we can start from the individual and group characteristics of a requirement proposed by the ISO/IEC/IEEE 29148 standard [14]. In this case, the criteria to validate would be appropriate, unambiguous, complete, unique, feasible, verifiable, correct, and compliant. Moreover, the set of base documents would be all software engineering standards related to requirements.

3.1.1. Set of Initial Criteria

The initial criteria must be organized in an analysis matrix. This matrix is the input of the methodological proposal and allows for defining the initial criteria, synonyms, and understanding the context of the term. The columns of the analysis matrix, as shown in Table 2, are as follows.
  • Criteria. The initial criteria to validate.
  • Definition. A description of the exact meaning of the criterion.
  • Context. It provides additional information to allow readers to understand a criterion.
  • Synonyms. Terms related to the criteria that allow eliminating ambiguities and identifying the number of occurrences in the collection of documents.
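As a minimal sketch, the analysis matrix described above could be held in a small Python data structure. The class name `Criterion`, its field names, the helper `search_terms`, and the placeholder definition, context, and synonyms are all illustrative assumptions; they are not taken from Table 2.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    """One row of the analysis matrix (hypothetical structure)."""
    name: str
    definition: str
    context: str
    synonyms: list = field(default_factory=list)

    def search_terms(self):
        """Return the criterion name plus its synonyms, lower-cased, without duplicates."""
        seen = []
        for term in [self.name, *self.synonyms]:
            term = term.lower()
            if term not in seen:
                seen.append(term)
        return seen


# Placeholder row written for this sketch only.
analysis_matrix = [
    Criterion(
        name="Unambiguous",
        definition="Placeholder definition written for this sketch.",
        context="Placeholder context: software requirements.",
        synonyms=["clear", "unambiguous"],
    ),
]
```

Keeping the synonyms next to each criterion makes it straightforward to count a criterion and its synonyms together in later steps.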

3.1.2. Set of Documents for Evaluation

The set of documents for evaluation is obtained through an SLR that provides a complete and comprehensive summary of relevant research studies related to the research questions. Our method requires making two groups of documents (P, Q).
  • The set of documents named P should group works related to the study context.
  • The set of documents named Q should group works related to the opposite study context.

3.2. Processes

The methodological process is composed of two antecedent (preceding) processes and one consequent process (see Figure 1). Only one of the two antecedent processes needs to be chosen. Antecedent process 1 is based on expert opinion and assumes objectivity or scientific impartiality; the role of the researcher becomes instrumental, since anyone would reach the same conclusions if they followed the same steps. However, it is impossible to rule out intuition and individual discernment, so antecedent process 2 is based instead on the TF-IDF metric, which relies on the number of times a term appears in a document.

3.2.1. Antecedent Process 1 Based on the Expert’s Opinion

The first activity in this process is to obtain the justification matrix (see Table 3), which argues the presence or absence of each initial criterion in the documents selected from the SLR, based on the expert’s opinion. From it, the Boolean matrix is obtained (see Table 4), whose entries are 0s or 1s indicating the absence or presence of each criterion. Both matrices are built for the groups of documents P and Q.

3.2.2. Antecedent Process 2 Based on the TF-IDF Metric

TF-IDF is a metric that determines the importance of a word in a collection of papers. In our case, this metric is used to identify the importance of the initial criteria in the set of documents. The TF-IDF value increases proportionally to the number of times a term appears in a document and is offset by the number of documents in the corpus that contain the term, which helps to adjust for the fact that some terms appear more frequently in general. The process starts with the term-frequency matrix (see Table 5), which stores the number of times (f) that a term (t) occurs in a document (d), counting each term together with its synonyms. To build it, we start from a reduced analysis matrix (see Table 6) that contains only the criteria and their synonyms.
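The counting step can be sketched as follows. This is only an illustration under stated assumptions: the function name `term_frequency`, the whole-word, case-insensitive matching rule, and the sample criteria and document texts are hypothetical, and the tool itself may extract and match text from the PDF files differently.

```python
import re


def term_frequency(text, terms):
    """Count occurrences of a criterion and its synonyms in one document's text."""
    lowered = text.lower()
    total = 0
    for term in terms:
        # Whole-word (or whole-phrase) match, case-insensitive. Assumption of this sketch.
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        total += len(re.findall(pattern, lowered))
    return total


# Hypothetical reduced analysis matrix: criterion -> synonyms.
criteria = {
    "methodology": ["method", "process"],
    "gamification": [],
}

# Hypothetical document texts.
documents = {
    "d1": "The method follows an agile process. Each process step is tested.",
    "d2": "Gamification improves engagement; the methodology is iterative.",
}

# Term-frequency matrix: one row per criterion, one column per document.
tf_matrix = {
    crit: {doc: term_frequency(text, [crit] + syns) for doc, text in documents.items()}
    for crit, syns in criteria.items()
}
```

With whole-word matching, plural or inflected forms are counted only if they are listed explicitly as synonyms.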
After obtaining the term frequency matrix, the normalized and the inverse frequency are calculated to obtain the TF-IDF value. Thus, the normalized frequency matrix is obtained (see Table 7), which is calculated using Equation (1).
Normalized frequency. The normalized frequency adjusts the frequency of the term or relevance score to normalize the effect of document length on the document ranking. It is obtained by dividing the absolute frequency value by the value of the maximum frequency of the term contained in the document.
Equation (1) allows obtaining the content of the normalized frequency table for each term in the document.
$$tf(t_i, d_j, a) = a + (1 - a)\,\frac{tf(t_i, d_j)}{tf_{max}(d_j)} \tag{1}$$
where,
  • $tf(t_i, d_j, a)$ represents the normalized frequency value.
  • The constant $a = 0.5$ is a value that softens the frequency function of the term and leads to its normalization, recommended by Christopher D. Manning [15].
  • $tf(t_i, d_j)$ represents the absolute frequency of the term $t_i$ in the document $d_j$.
  • $tf_{max}(d_j)$ represents the raw frequency of the most frequently occurring term in the document $d_j$.
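A minimal sketch of Equation (1) in Python is shown below. The function name `normalized_tf` and the handling of a document whose maximum frequency is zero are assumptions of this sketch, not part of the method's description.

```python
def normalized_tf(raw_tf, max_tf, a=0.5):
    """Normalized frequency: a + (1 - a) * tf / tf_max, as in Equation (1)."""
    if max_tf == 0:
        # Assumption: with no counted term in the document, only the smoothing term remains.
        return a
    return a + (1 - a) * raw_tf / max_tf


# A term seen 3 times in a document whose most frequent term occurs 6 times.
example = normalized_tf(3, 6)
```

With a = 0.5, every value lies between 0.5 (term absent) and 1.0 (the most frequent term), which limits the effect of document length.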
Also, the inverse frequency matrix is obtained (see Table 8).
Inverse frequency. The inverse document frequency is a measure of whether a term is common or rare in the collection of documents. It is obtained by dividing the total number of documents by the number of documents that contain the term and taking the logarithm of this quotient, as shown in Equation (2).
$$idf(t_i, D) = \log\frac{n(D)}{df_{t_i}} \tag{2}$$
where
  • $n(D)$ refers to the total number of documents in the collection.
  • $df_{t_i}$ refers to the number of documents in which the term $t_i$ appears.
  • $idf(t_i, D)$ is computed for each term.
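Equation (2) can be sketched in Python as follows. The text does not state the base of the logarithm, so base 10 is an assumption here, as are the function name and the value returned for a term that appears in no document.

```python
import math


def inverse_document_frequency(n_docs, doc_freq):
    """Inverse document frequency: log(n(D) / df_t), as in Equation (2)."""
    if doc_freq == 0:
        # Assumption: a term absent from every document gets an IDF of 0.
        return 0.0
    return math.log10(n_docs / doc_freq)  # base 10 is an assumption of this sketch


# A term present in every one of 26 documents carries no discriminating weight.
idf_common = inverse_document_frequency(26, 26)
# A term present in 10 of 100 documents.
idf_rare = inverse_document_frequency(100, 10)
```

A term that occurs in all documents receives an IDF of 0, so its TF-IDF value is also 0 regardless of how often it appears.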
Finally, the TF-IDF matrix is obtained (see Table 9).
TF-IDF matrix. It is the product of the two measures: the values of the normalized frequency $tf(t_i, d_j, a)$ are multiplied by the values of the inverse frequency $idf(t_i, D)$ (see Equation (3)).
$$tfidf(t_i, d_j, a, D) = tf(t_i, d_j, a) \times idf(t_i, D) \tag{3}$$
With the values of the TF-IDF matrix, we proceed to create a Boolean matrix (0s and 1s). For this matrix, a threshold variable k is used, computed from the TF-IDF values (for example, their average or variance). If $tfidf(t_i, d_j, a, D) \geq k$, the corresponding entry of the Boolean matrix is 1; otherwise, it is 0 (see Table 10).
Besides the mean (average), the mode, median, or variance can be used to compute k; the measure is selected according to the end-user’s needs before the Boolean matrix is generated.
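The thresholding step can be sketched as below. Several details are assumptions of this sketch rather than statements from the method: k is computed over all entries of the TF-IDF matrix, the comparison is "greater than or equal to", the variance is the population variance, and the names `MEASURES` and `boolean_matrix` as well as the sample values are hypothetical.

```python
import statistics

# Candidate measures for the threshold k (population variance is an assumption).
MEASURES = {
    "mean": statistics.mean,
    "median": statistics.median,
    "mode": statistics.mode,
    "variance": statistics.pvariance,
}


def boolean_matrix(tfidf, measure="mean"):
    """Turn a TF-IDF matrix into 0/1 entries by comparing each value with k."""
    # Assumption: k is computed over every entry of the matrix.
    values = [v for row in tfidf.values() for v in row]
    k = MEASURES[measure](values)
    matrix = {
        term: [1 if v >= k else 0 for v in row]
        for term, row in tfidf.items()
    }
    return k, matrix


# Hypothetical TF-IDF values for two criteria over three documents.
tfidf = {
    "gamification": [0.5, 0.25, 0.0],
    "methodology": [0.125, 0.125, 0.5],
}
k, booleans = boolean_matrix(tfidf, measure="mean")
```

Because the choice of measure changes k, the same TF-IDF matrix can yield different Boolean matrices, which is why the measure is left to the end user.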

3.2.3. Consequent Process

It is the inference process and aims to obtain the matrices of instantiation, behavior, and pairing. It is based on logical axioms to extract the criteria.
The objective of this process is to infer the key criteria. The boolean matrices obtained through the antecedent processes are used as inputs. Next, the process of obtaining the final key criteria is described.

Instantiation Matrix

This matrix specifies whether the criterion is present in each document being analyzed. The pair of values in each row of the matrix is later checked against the truth table to determine which category of the logical pattern matrix the criterion falls into.
The instantiation matrix is made up of two columns, criteria and documents (see Table 11). One matrix is built for each term or criterion to be evaluated.
Criteria column: It contains the values of the criterion’s row in the Boolean matrix (from antecedent process 1 or 2).
Documents column: It contains 1 (one) when the document belongs to the set P related to the study context, and 0 (zero) when it belongs to the set Q related to the opposite study context.
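As a minimal sketch, the instantiation matrix of one criterion can be built by pairing each Boolean value with the group of its document. The function name and the representation as a list of (criterion, document) tuples are assumptions of this sketch.

```python
def instantiation_matrix(bool_row_p, bool_row_q):
    """Pair each Boolean value of a criterion with its document group (1 = P, 0 = Q)."""
    pairs_p = [(value, 1) for value in bool_row_p]  # documents of the study context
    pairs_q = [(value, 0) for value in bool_row_q]  # documents of the opposite context
    return pairs_p + pairs_q


# Hypothetical Boolean rows of one criterion: two documents in P, two in Q.
pairs = instantiation_matrix([1, 0], [0, 1])
```

Each tuple corresponds to one row of the matrix: the first element is the criteria column and the second is the documents column.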

3.3. Output

Behavior Matrix

This matrix serves to categorize and determine which criteria are relevant in the study context and is validated with the help of truth tables.
The behavior matrix is made up of two sections (see Table 12). The first section corresponds to a truth table of mathematical logic with a column for each input variable (for example, p and q). The second section shows all the possible results of comparing the pairs of values (criteria, document) of the instantiation matrix. In the truth table that we use, p and q mean the following.
  • p: the criterion whose presence (or absence) in the documents needs to be verified.
  • q: the documents, which are either from the study context (P) or from the opposite context (Q).
The pair (p, q) can take 1s or 0s as appropriate, and each pair of values is looked up in the instantiation matrix (criteria, documents). If a pair of values from the truth table is found in the instantiation matrix, the result is true (1); otherwise, it is false (0).
Then, the result of the second section is compared with the matrix of logical patterns (see Table 12).
For example, to better understand the logic, consider the following.
$p \rightarrow q$: if p, then q.
The researcher states, “If I find the key criterion in my study document, this could be a document from my study context.” Such a statement is known as a conditional.
  • p: criterion
  • q: study context
The statement can therefore be expressed as $p \rightarrow q$.
Its truth table is as follows (see Table 13):
The interpretation of the results of the truth table, which checks whether the researcher lied when making the previous statement, is as follows:
When $p = 1$ and $q = 1$, the key criterion was found in the study document and the document is from the study context; therefore, $p \rightarrow q = 1$ (the researcher told the truth).
When $p = 1$ and $q = 0$, the key criterion was found in the study document, but the document is not from the study context; therefore, $p \rightarrow q = 0$ (the researcher lied).
When $p = 0$ and $q = 1$, the key criterion was not found in the study document, but the document does belong to the study context, so the researcher did not lie and $p \rightarrow q = 1$.
When $p = 0$ and $q = 0$, the criterion was not found in the study document and the document is not from the study context; the researcher did not lie either, so $p \rightarrow q = 1$.
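The four cases above can be reproduced with a short sketch of the material conditional. The function name `implies` and the row order (1,1), (1,0), (0,1), (0,0) are choices made for this illustration.

```python
def implies(p, q):
    """Material conditional p -> q on 0/1 values: false only when p = 1 and q = 0."""
    return int((not p) or bool(q))


# Truth table rows as (p, q, p -> q).
truth_table = [(p, q, implies(p, q)) for p in (1, 0) for q in (1, 0)]
```

The only row that evaluates to 0 is p = 1, q = 0, the case in which the researcher's statement is contradicted.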
Considering the previous statements, to achieve the result, we apply mathematical logic that will allow us to classify the criterion into a category.
Table 14 presents the logical axioms that will allow for classifying the criteria. It comprises two sections; the first presents the truth table, and the second shows logical patterns useful for specifying the necessary and sufficient conditions.
Next, we describe each of the categories in which a criterion can be categorized:
  • Necessary and not sufficient (N-NS), $q \rightarrow p$: The criterion appears in all documents of the study context and appears at least once in the opposite study context.
  • Not necessary and sufficient (NN-S), $p \rightarrow q$: The criterion appears at least once in the study context and does not appear in any document from the opposite study context.
  • Necessary and sufficient (NS), $p \leftrightarrow q$: The criterion appears in all documents of the study context and does not appear in any document of the opposite study context.
  • Not necessary and not sufficient (NN-NS), $(p \wedge q) \rightarrow q$: The criterion appears at least once in the study context and also appears at least once in the opposite study context.
  • None: The criterion was not in any of the categorized groups in the logical pattern matrix.
Pairing Matrix. This matrix is used to categorize all the previously analyzed criteria according to a logical pattern of behavior. See Table 15.
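The categorization can be sketched as a lookup of the behavior vector in a table of logical patterns. This is an illustration under stated assumptions: the truth-table rows are ordered (1,1), (1,0), (0,1), (0,0); each pattern is the truth-value column of the corresponding formula (with the NN-NS pattern taken as all ones); and the names `TRUTH_ROWS`, `PATTERNS`, `behavior_vector`, and `classify`, as well as the sample data, are hypothetical.

```python
# Truth-table rows as (p, q): p = criterion found, q = document belongs to P.
TRUTH_ROWS = [(1, 1), (1, 0), (0, 1), (0, 0)]

# Assumed logical patterns: truth-value column of each formula over TRUTH_ROWS.
PATTERNS = {
    (1, 1, 0, 1): "N-NS",   # q -> p
    (1, 0, 1, 1): "NN-S",   # p -> q
    (1, 0, 0, 1): "NS",     # p <-> q
    (1, 1, 1, 1): "NN-NS",  # every (p, q) pair occurs
}


def behavior_vector(pairs):
    """Mark with 1 each truth-table row that occurs in the instantiation matrix."""
    present = set(pairs)
    return tuple(1 if row in present else 0 for row in TRUTH_ROWS)


def classify(pairs):
    """Return the category whose pattern matches the behavior vector, else 'None'."""
    return PATTERNS.get(behavior_vector(pairs), "None")


# Hypothetical instantiation matrices as (criterion, document) pairs.
instantiations = {
    "gamification": [(1, 1), (1, 1), (0, 0), (0, 0)],
    "methodology": [(1, 1), (1, 1), (1, 0), (0, 0)],
    "lifecycle": [(1, 1), (0, 1), (1, 0), (0, 0)],
}

# Pairing matrix: one category per criterion.
pairing = {name: classify(pairs) for name, pairs in instantiations.items()}
```

Any behavior vector that matches none of the four patterns falls into the "None" category described above.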
This method has been applied in several case studies, helping researchers who carry out an SLR to determine the important criteria in each of them [5,6,7].

4. Automation of the Method

In order to simplify the use of our methodological proposal, we developed a software tool that supports the method’s execution and saves time. The tool can either use the TF-IDF metric to obtain the Boolean matrix or take the Boolean matrix from the researcher’s analysis and go directly to the mathematical logic to obtain the categorized results. The following section describes the tool in depth.

4.1. Description of the Tool

Our web application is divided into three modules: information, administration, and the process itself.
  • The informative module has two interfaces: the first at the beginning of the application, which contains a description of the method, and the second contains the tool manual.
  • The user administration module has an interface where the user’s information is displayed, but its main task is to record each of the processes carried out by the researcher. It focuses on two states of the process, one when it is completed, in which case the application saves the PDF files, the criteria, and the results, and the other when it is running. This module automatically saves each change of the current state of the process, for example, when a file or criterion is added or deleted, or when the variable k changes the calculation method.
  • The application module, or the process itself, consists of the sub-modules described as follows:
    • Uploading PDF files: this module consists of four interfaces that allow the researcher to upload to the application the files that will be used in the process. The first two interfaces allow files to be uploaded to each of the two research groups established by the researcher. The following interface displays the files uploaded in each group, and the last one is an informative interface for the correct upload of the files to the application.
    • Uploading criteria and synonyms: this module consists of two text inputs that allow for entering the criteria and all synonyms related to that criterion. All the criteria and synonyms entered by the researcher are shown in a table.
    • Select type of process: this module allows for selecting process 1 or process 2. Process 1 is based on the researcher’s opinion and uses the Boolean matrix that the researcher obtained from their analysis; when it is selected, the application goes directly to the results. Process 2 uses the TF-IDF metric; when it is selected, the type of measure k used for the comparison is chosen in the next step.
    • Select type of measurement: this module allows for selecting between the median, mode, mean, and variance to calculate the variable “k” used to obtain the Boolean matrix.
    • Results: the last module of the application shows the results obtained, as a pie chart of the categorized criteria together with all the matrices obtained in the process. Additionally, the results can be exported to an Excel file.

4.2. Implementation

The application was implemented in Django, a high-level Python web framework that encourages rapid development and enables the building of secure and maintainable websites.
Python itself is a high-level language well suited for scientific and engineering environments [16,17]. It is widely used in statistical data analysis, automation, and the development of dependable, scalable systems, with modules that make programming easier, and it has a low learning curve.
The results of the processes carried out in the method are matrices; Python has libraries such as NumPy that provide high-performance multidimensional array objects, which reduces execution time.
The user administration module was developed using Django, which provides a framework for sessions and user management; SQLite is used as the database.
Our tool, called the Criteria checker application (see Figure 2), is installed on the servers of the research laboratory and is available at the web address http://criteria-checker.epn.edu.ec, accessed on 24 April 2022.

5. Application in a Case Study

5.1. Application Context

The general purpose of this case study is to determine the key criteria for creating educational serious games from the study of software development methodologies and serious games. The main study context is the study of serious games development methodologies, and the opposite context is the study of traditional software development methodologies. The research is based on these two contexts and on the application of the antecedent process based on the TF-IDF metric.
This case study begins by following the steps proposed by the Kitchenham methodology [2]: an SLR is carried out to obtain the set of documents that meet the search criteria and pass the review process based on inclusion and exclusion criteria.
Once the review was executed, 26 related papers were found. This resulting group of documents constitutes the input of our process.

5.2. Initial Criteria

The initial documentary review focuses on the areas of interest, where, within the literature, terminologies referring to the main contexts are identified, which can even be synonymous.
Based on our application context, if the research needs to find information regarding “the key criteria for the creation of educational serious games”, several criteria can be defined for each of the main search terms, which are “methodology”, “lifecycle”, “agile process”, “pedagogical objectives”, “gamification”, and “GamePlay”, among others; these initial criteria can be related to synonymous terms such as methodologies, methods, processes, principles, tools, mechanisms, strategies, designs, and more.
The final criteria selection relies strongly on the experience, knowledge, and skills of the researcher, who decides whether or not the words apply to the context and discards those terms, or combinations of terms, that are not used traditionally or professionally in the study field. For example, a combination of words unlikely to be found would be information referring to a “design for the manufacture of serious games”.
For the context of the application, the selected criteria are those set out in Table 16.
Our study case determined that, of the 26 resulting documents, 12 papers were classified for group P and the remaining 14 papers for group Q.

5.3. Using the Support Tool

1. The 12 documents related to group P are loaded, as shown in Figure 3 and Figure 4.
The documents of the main context used in this case study are [18,19,20,21,22,23,24,25,26,27,28].
2. The 14 documents related to group Q are loaded, as shown in Figure 5.
The documents of the opposite context are [29,30,31,32,33,34,35,36,37,38,39,40,41,42].
3. It is necessary to verify that all 26 papers have been uploaded without problems, as shown in Figure 6.
4. Each of the defined criteria is entered with its respective synonyms, as shown in Figure 7.
5. Next, it is necessary to choose between antecedent process 1, which measures the expert’s opinion, and antecedent process 2, which applies the $tfidf$ metric.
We select antecedent process 2 (TF-IDF). If the obtained value satisfies $tfidf(t_i, d_j, a, D) \geq k$, the value in the Boolean matrix is 1; otherwise, it is 0. Here, $k$ is one of the median, mean, mode, or variance of the frequency of the TF-IDF value.
Process 2 allows selecting the type of measurement used to obtain the Boolean table; for our case study, the “mode” measurement was chosen (Figure 8). Once the app finishes the calculation, it shows all the results explained in Section 3.2.2 related to the execution of the TF-IDF method, as shown in the examples in Figure 9, Figure 10 and Figure 11.
For details of the result tables, see the Mendeley dataset repository [43].
The results of the analysis process using k (median) are detailed in Table 17 and shown in Figure 12.
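The thresholding in step 5 can be sketched as follows. This is a minimal illustration under stated assumptions: a value is mapped to 1 when it is at least k, k is computed over all TF-IDF values in the matrix, and the variance is the population variance. The function `boolean_matrix` and the sample values are hypothetical and do not reproduce the support tool.

```python
import statistics

# Statistics offered for the threshold k (population variance is an assumption)
STATISTICS = {
    "median": statistics.median,
    "mean": statistics.mean,
    "mode": statistics.mode,
    "variance": statistics.pvariance,
}

def boolean_matrix(tfidf, statistic="median"):
    """Return (matrix, k): each TF-IDF value becomes 1 if it is >= k, else 0."""
    # Assumption: k is computed over every value in the matrix
    values = [v for row in tfidf.values() for v in row.values()]
    k = STATISTICS[statistic](values)
    matrix = {
        criterion: {doc: int(v >= k) for doc, v in row.items()}
        for criterion, row in tfidf.items()
    }
    return matrix, k

# Hypothetical TF-IDF values for two criteria and four documents
tfidf = {
    "story": {"P1": 0.30, "P2": 0.20, "Q1": 0.00, "Q2": 0.00},
    "feedback": {"P1": 0.10, "P2": 0.00, "Q1": 0.10, "Q2": 0.00},
}

matrix, k = boolean_matrix(tfidf, "median")
print(k)
print(matrix)
```

The choice of statistic changes k and therefore the resulting Boolean table, so the same statistic should be reported alongside any result derived from it.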

6. Conclusions

This work proposes a new method for selecting the key criteria of a study field, which supports decision-making in that research field.
In most of the studies analyzed, criteria selection starts from the literature review, allowing researchers to obtain grounded criteria. However, these criteria are not contrasted and validated against the same information set, which would help determine whether they are really important.
In this study context, we propose an add-on to a systematic literature review, giving researchers an alternative way to find the relevant criteria, characteristics, or congruent patterns to create and propose new ideas.
Our method applies a statistical analysis to the selection of relevant criteria in a specific study area and can be replicated in other contexts. It combines a qualitative analysis and a quantitative analysis based on the TF-IDF metric; both result in the same kind of analysis table, which feeds a mathematical process that yields the necessary and sufficient criteria, or categorizes the criteria in different states, as presented in the method.
Several case studies have been carried out using the qualitative and quantitative analysis methods, allowing criteria selection in areas such as serious games, user-centered design, serious games methodologies, and requirements.
To simplify the use of our method, we developed a software tool that supports the method’s execution and saves time. However, a limitation of the tool is that the documents must be unlocked for the analysis.
The web application allows us to validate the criteria defined by the expert using mathematical logic criteria and determine if there is a need to include new criteria to satisfy the systematic literature review performed.
Automation of this method helps researchers obtain validated criteria from papers that study a specific research domain.
The method applied in this study can serve as a basis for application to any topic that requires validation of the systematic literature review of previous studies. In future work, we suggest further validating this application in case studies to assess the needs of other researchers and improve it based on their feedback.
This study has some limitations. The first is the possible ambiguity of terms, which made it necessary to include synonyms and terms in another language. The second concerns the two paths. Path 1 is based on the subjective opinion of the researcher, which in some cases is advantageous, since the researcher can resolve ambiguous terms and judge whether the document actually refers to the study context. Path 2 is based on a statistical process that leaves no room for subjectivity and relies on the occurrence of the terms; given possible ambiguity, it might be less precise. Finally, it is important to ensure that the documents to be studied have no security blocks.
In future work, we plan to continue testing this method to obtain feedback and improve the process and its automation, applying it in different case studies to obtain key criteria for both gamification elements in software development related to data analytics and requirement validation in software development.
Finally, this methodology can serve as a starting point for future work in order to show relevant results and trends of research on the subject studied.

Author Contributions

Conceptualization, M.C.-T., J.A., M.S., B.A., C.-P.L., P.A.-V. and M.P.; methodology, M.C.-T., J.A. and M.S.; validation, M.C.-T., M.S. and M.N.; formal analysis, M.C.-T. and M.S.; investigation, M.C.-T. and M.S.; writing—original draft preparation, M.C.-T., M.S., B.A. and P.A.-V.; writing—review and editing, M.C.-T., M.S., J.A., P.A.-V., B.A. and M.P.; supervision, J.A. and M.P.; project administration, P.A.-V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Universidad de Las Américas-Ecuador, internal research project INI.PAV.20.01.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Lame, G. Systematic literature reviews: An introduction. In Proceedings of the International Conference on Engineering Design, ICED, Bordeaux, France, 24–28 July 2019; Volume 2019, pp. 1633–1642.
  2. Kitchenham, B.; Charters, S. Guidelines for performing Systematic Literature reviews in Software Engineering. Engineering 2007, 45, 1051.
  3. Ibrahim, A.; Ahmed, M.; Conway, R.; Carey, J.J. Risk of infection with methotrexate therapy in inflammatory diseases: A systematic review and meta-analysis. J. Clin. Med. 2019, 8, 15.
  4. Snyder, H. Literature review as a research methodology: An overview and guidelines. J. Bus. Res. 2019, 104, 333–339.
  5. Carrión, M.; Santorum, M.; Pinaida, A.; Aguilar, J. Estudio para inferir criterios clave para el diseño de Juegos Serios. In Proceedings of the 2019 International Conference on Information Systems and Software Technologies (ICI2ST), Quito, Ecuador, 13–15 November 2019; pp. 63–70.
  6. Quimbita, L.; Santorum, M. Estudio de Metodologías Participativas y de Enfoques Centrados en el Usuario para la definición de una Metodoogía de Diseño de Juegos Serios Educativos. Ph.D. Thesis, Escuela Politécnica Nacional, Quito, Ecuador, 2020.
  7. Guamushig, T.; Lopez, C.; Santorum, M. Análisis y Caracterización de las Organizaciones Virtuales para la Colaboración en el Contexto de la Industria 4.0. Ph.D. Thesis, Escuela Politécnica Nacional, Quito, Ecuador, 2020.
  8. Pavlič, J.; Tomažič, T.; Kožuh, I. The impact of emerging technology influences product placement effectiveness: A scoping study from interactive marketing perspective. J. Res. Interact. Mark. 2021.
  9. Sale, J.E.; Brazil, K. A strategy to identify critical appraisal criteria for primary mixed-method studies. Qual. Quant. 2004, 38, 351–365.
  10. Banaeian, N.; Mobli, H.; Nielsen, I.E.; Omid, M. Criteria definition and approaches in green supplier selection—A case study for raw material and packaging of food industry. Prod. Manuf. Res. 2015, 3, 149–168.
  11. Hashimi, H.; Hafez, A.; Mathkour, H. Selection criteria for text mining approaches. Comput. Hum. Behav. 2015, 51, 729–733.
  12. Dohale, V.D.; Akarte, M.M.; Verma, P. Determining the Process Choice Criteria for Selecting a Production System in a Manufacturing Firm Using a Delphi Technique. In Proceedings of the International Conference on Industrial Engineering and Engineering Management (IEEM), Macao, China, 15–18 December 2019; pp. 1265–1269.
  13. Cieciora, M.; Bołkunow, W.; Pietrzak, P.; Gago, P. Key criteria of ERP/CRM systems selection in SMEs in Poland. Online J. Appl. Knowl. Manag. 2020, 8, 85–98.
  14. International Organization for Standardization. Systems and Software Engineering—Life Cycle Processes—Requirements Engineering; International Organization for Standardization: Geneva, Switzerland, 2018.
  15. Manning, C.D.; Raghavan, P.; Schutze, H. An Introduction to Information Retrieval; Cambridge University Press: Cambridge, England, 2009; p. 569.
  16. Esteves, R.A.; Wang, C.; Kraft, M. Python-based open-source electro-mechanical co-optimization system for MEMS inertial sensors. Micromachines 2022, 13, 1.
  17. Django. Django Documentation. Available online: https://docs.djangoproject.com/en/4.0/ (accessed on 16 May 2022).
  18. Bousbia, S.; Miladi, E.; Kooli, Z.; Boumaiza, W. Proposal of a methodology of serious games’ demystification for the teaching of technical modules. IEEE Glob. Eng. Educ. Conf. EDUCON 2016, 10–13, 1155–1159.
  19. M̧ehmet, C. A methodological approach for serious game software development: An application for language disorders. Master’s Thesis, Atilim University, Ankara, Turkey, 2012.
  20. Cano, S.P.; González, C.S.; Collazos, C.A.; Arteaga, J.M.; Zapata, S. Agile software development process applied to the serious games development for children from 7 to 10 years old. Int. J. Inf. Technol. Syst. Approach 2015, 8, 64–79.
  21. Cano, S.P. Propuesta Metodológica para el Diseño de Juegos Serios para Niños con Implante Coclear. Ph.D. Thesis, Universidad del Cauca, Cauca, Colombia, 2016.
  22. Lepe-Salazar, F. A model to analyze and design educational games with pedagogical foundations. In Proceedings of the 12th International Conference on Advances in Computer Entertainment Technology—ACE ’15, Iskandar, Malaysia, 16–19 November 2015; pp. 1–14.
  23. Nadolski, R.; Hummel, H.; Van den Brink, H.; Hoefakker, R.; Slootmaker, A.; Kurvers, H.; Storm, J. EMERGO: A methodology and toolkit for developing serious games in higher education. Simul. Gaming 2008, 39, 338–352.
  24. Najoua, T.; Mohamed, E.A. KASP: A Cognitive-Affective Methodology for Designing Serious Learning Games. Int. J. Adv. Comput. Sci. Appl. 2018, 9, 2–11.
  25. De Lope, R.P.; Medina-medina, N.; Soldado Montes, R.; Mora García, A.; Gutiérrez-vela, F.L. Designing educational games: Key elements and methodological approach. In Proceedings of the 9th International Conference on Virtual Worlds and Games for Serious Applications (VS-Games), Athens, Greece, 6–8 September 2017; pp. 63–70.
  26. Padilla, N. Metodología Para El Diseño De Videojuegos Educativos Sobre Una Arquitectura Para El Análisis Del Aprendizaje Colaborativo. Ph.D. Thesis, Universidad de Granada, Granada, Spain, 2011.
  27. De Lope, R.P.; Medina-Medina, N.; Paderewski, P.; Gutierrez-Vela, F.L. Design methodology for educational games based on interactive screenplays. CEUR Workshop Proc. 2015, 1394, 90–101.
  28. Tran, C.D.; George, S.; Marfisi-Schottman, I. EDoS: An authoring environment for serious games design based on three models. In Proceedings of the 4th European Conference on Games Based Learning 2010, ECGBL, Copenhagen, Denmark, 21–22 October 2010; pp. 393–402.
  29. Kuhrmann, M.; Ternité, T. Including the Microsoft Solution Framework as an agile method into the V-Modell XT. Eval. Nov. Approaches Softw. Eng. 2006.
  30. Arévalo, W.; Atehortúa, A. Metodología de Software MSF en pequeñas empresas MSF software methodology in small businesses. Cuad. Act. 2012, 4, 83–90.
  31. Balduino, R. Introduction to OpenUP (Open Unified Process). Organization 2007, 1–9.
  32. Jacobson, I.; Ng, P.W.; Spence, I. The essential unified process. Dr. Dobb’s J. 2006, 31, 40–45.
  33. Ruel, H.J.; Bondarouk, T.; Smink, S. The Waterfall Approach and Requirement Uncertainty. Int. J. Inf. Technol. Proj. Manag. 2010, 1, 43–60.
  34. Petersen, K.; Wohlin, C.; Baca, D. The waterfall model in large-scale development. Lect. Notes Bus. Inf. Process. 2009, 32, 386–400.
  35. Company, R. Rational Unified Process for Systems Engineering; Company, R.: New York, NY, USA, 2005.
  36. Zuehlke Engineering. Unified Software Development Process; Technical Report; Zuehlke Engineering. Available online: http://www0.cs.ucl.ac.uk/staff/ucacwxe/lectures/3C05-01-02/aswe5.pdf (accessed on 28 April 2022).
  37. Kruchten, P. The Rational Unified Process An Introduction, 2nd ed.; Addison-Wesley Professional: Boston, MA, USA, 2000; p. 320.
  38. Leyva, E.; González, M. An Unified Process adaptation for the development of Joomla based web applications. In Ciencias Holguín; Centro de Información y Gestión Tecnológica de Santiago de Cuba: Holguín, Cuba, 2006; Volume XII, pp. 1–13.
  39. Abreu, R.B.; Guntín, Y.O.; Alfonso, Y.Á.; Mena, J.C. Metodología ágil Crystal Clear. Un caso de estudio. Ser. Cient. 2009, 2, 3.
  40. Duarte, A.O. Las Metodologías de Desarrollo Ágil como una Oportunidad para la Ingeniería del Software Educativo. Rev. Av. Sist. Inform. 2008, 5, 159–171.
  41. Schwaber, K.; Sutherland, J. La Guía definitiva de Scrum: Las Reglas del Juego, Scrum Guides. 2013. Available online: https://scrumguides.org/docs/scrumguide/v2020/2020-Scrum-Guide-Spanish-Latin-South-American.pdf (accessed on 24 April 2022).
  42. Tobergte, D.R.; Curtis, S. Ingenieria de Software un enfoque practico. arXiv 2013, arXiv:1011.1669v3.
  43. Carrión-Toro, M. “tf-idf metric iKeyCriteria (dataset)”, Mendeley Data, V1. Available online: https://data.mendeley.com/datasets/jc8k4cnr2c/1 (accessed on 16 May 2022).
Figure 1. The input-process-output approach.
Figure 2. Criteria checker application.
Figure 3. Uploading group P files.
Figure 4. Group P files uploaded.
Figure 5. Group Q files uploaded.
Figure 6. Group P and Group Q files successfully uploaded.
Figure 7. Uploading criteria.
Figure 8. Choice of the antecedent process.
Figure 9. Result tables.
Figure 10. Absolute term frequency matrix (see Table 5).
Figure 11. Normalized frequency matrix (see Table 7).
Figure 12. Key Criteria Results.
Table 1. iKeyCriteria method characteristics.
Characteristics iKeyCriteria Methodology
SLR | Criterion Relevance in a Document | Criterion Relevance in a Collection of Documents | Criteria Categorization According to Relevance - Use Truth Tables | Selection or Inference of Key Criteria
[10]x x
[12]x x
[11]x x
[9]x x
[13]x x
Table 2. Initial criteria matrix.
| Criteria | Definition | Context | Synonyms |
| --- | --- | --- | --- |
| Term 1 | Description of the term | What to look for? Term explanation | Related terms and synonyms |
| Term n | Description of the term n | What to look for? Term explanation | Related terms and synonyms |
Table 3. Justification matrix.
| | P | P | Q | Q |
| --- | --- | --- | --- | --- |
| CRITERIA | Document 1P | Document 3P | Document nQ | Document nQ |
| The term appears in the document | | | | |
| Term 1 | (Yes or No justification) | Justification | Justification | Justification |
| Term n | (Yes or No justification) | Justification | Justification | Justification |
Table 4. Boolean matrix.
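| | P | P | Q | Q |
| --- | --- | --- | --- | --- |
| CRITERIA | Document 1P | Document 2P | Document 3Q | Document nQ |
| Term 1 | $0 \lor 1$ | $0 \lor 1$ | $0 \lor 1$ | $0 \lor 1$ |
| Term n | $0 \lor 1$ | $0 \lor 1$ | $0 \lor 1$ | $0 \lor 1$ |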
Table 5. Term frequency matrix.
| | P | P | Q | Q |
| --- | --- | --- | --- | --- |
| CRITERIA | Document 1P | Document Pn | Document 1Q | Document Qn |
| Term 1 | $tf(t, d)$ | $tf(t, d)$ | $tf(t, d)$ | $tf(t, d)$ |
| Term n | $tf(t, d)$ | $tf(t, d)$ | $tf(t, d)$ | $tf(t, d)$ |
Table 6. Reduced analysis matrix.
| Criteria | Synonyms |
| --- | --- |
| Term 1 | Related terms and synonyms |
Table 7. Normalized frequency matrix.
| | P | P | Q | Q |
| --- | --- | --- | --- | --- |
| CRITERIA | Document 1P | Document Pn | Document 1Q | Document Qn |
| Term 1 | $tf(t_i, d_j, a)$ | $tf(t_i, d_j, a)$ | $tf(t_i, d_j, a)$ | $tf(t_i, d_j, a)$ |
| Term n | $tf(t_i, d_j, a)$ | $tf(t_i, d_j, a)$ | $tf(t_i, d_j, a)$ | $tf(t_i, d_j, a)$ |
Table 8. Inverse document frequency matrix.
| Criteria | idf value |
| --- | --- |
| Term 1 | $idf(t_i, D)$ |
| Term n | $idf(t_i, D)$ |
Table 9. TF-IDF matrix.
| | P | Q |
| --- | --- | --- |
| CRITERIA | Document 1P | Document 1Q |
| Term 1 | $tfidf(t_i, d_j, a, D)$ | $tfidf(t_i, d_j, a, D)$ |
| Term n | $tfidf(t_i, d_j, a, D)$ | $tfidf(t_i, d_j, a, D)$ |
Table 10. TF-IDF Boolean matrix.
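| | P | P | Q | Q |
| --- | --- | --- | --- | --- |
| CRITERIA | Document 1P | Document nP | Document 1Q | Document nQ |
| Term 1 | $0 \lor 1$ | $0 \lor 1$ | $0 \lor 1$ | $0 \lor 1$ |
| Term n | $0 \lor 1$ | $0 \lor 1$ | $0 \lor 1$ | $0 \lor 1$ |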
Table 11. Instantiation Matrix.
CriteriaDocument
“Term 1” ( P 1 ) ( Q 0 )
0 1 1 ( P 1 )
0 1 1 ( P 2 )
0 1 0 ( Q 1 )
0 1 0 ( Q 2 )
Table 12. Behavior matrix.
| Truth Table | Truth Table | Result-Criteria |
| --- | --- | --- |
| p | q | |
| 1 | 1 | 1 true, 0 false |
| 1 | 0 | 1 true, 0 false |
| 0 | 1 | 1 true, 0 false |
| 0 | 0 | 1 true, 0 false |
Table 13. Truth table.
| p | q | $p \rightarrow q$ |
| --- | --- | --- |
| 1 | 1 | 1 |
| 1 | 0 | 0 |
| 0 | 1 | 1 |
| 0 | 0 | 1 |
Table 14. Logic patterns matrix.
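| Truth Table | Truth Table | Necessary and Not Sufficient Condition | Not Necessary and Sufficient Condition | Necessary and Sufficient Condition | Not Necessary and Not Sufficient Condition | Not Necessary and Not Sufficient Condition |
| --- | --- | --- | --- | --- | --- | --- |
| p | q | $q \rightarrow p$ | $p \rightarrow q$ | $p \leftrightarrow q$ | $(p \land q)$ | $(p \land q) \rightarrow p$ |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | 0 | 1 | 0 | 0 | 0 | 1 |
| 0 | 1 | 0 | 1 | 0 | 0 | 1 |
| 0 | 0 | 1 | 1 | 1 | 0 | 1 |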
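The truth values in Table 14 can be checked mechanically. The sketch below takes the five result columns to be the formulas q → p, p → q, p ↔ q, p ∧ q and (p ∧ q) → p; this reading is an assumption, chosen because it reproduces the listed values row by row. The helper `implies` is defined only for this example.

```python
from itertools import product

def implies(a, b):
    """Material implication: a -> b."""
    return (not a) or b

rows = []
# Enumerate (p, q) in the order used by the table: 11, 10, 01, 00
for p, q in product([True, False], repeat=2):
    rows.append((
        int(p),
        int(q),
        int(implies(q, p)),        # necessary and not sufficient
        int(implies(p, q)),        # not necessary and sufficient
        int(p == q),               # necessary and sufficient
        int(p and q),              # intermediate conjunction
        int(implies(p and q, p)),  # tautology column
    ))

for row in rows:
    print(row)
```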
Table 15. Pairing matrix.
| | Criterion 1 | Criterion 2 | Criterion …n |
| --- | --- | --- | --- |
| | 1 | 1 | 1 |
| | 1 | 0 | 0 |
| | 0 | 1 | 0 |
| | 1 | 1 | 1 |
| Logical Pattern | Necessary and not sufficient | Not necessary and sufficient | Necessary and sufficient |
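One possible reading of the pairing step is sketched below; it is an interpretation of Tables 11 to 15, not a description confirmed by the text. The assumptions are that p is 1 when the criterion is present in a document, that q is 1 when the document belongs to group P, that each of the four (p, q) combinations is marked 1 if at least one document shows it, and that the resulting four-value column is compared with the pattern columns. The names `PATTERNS`, `COMBOS` and `classify` and the sample data are hypothetical.

```python
# Pattern columns over the rows (p, q) = 11, 10, 01, 00
PATTERNS = {
    (1, 1, 0, 1): "Necessary and not sufficient",      # q -> p
    (1, 0, 1, 1): "Not necessary and sufficient",      # p -> q
    (1, 0, 0, 1): "Necessary and sufficient",          # p <-> q
    (1, 1, 1, 1): "Not necessary and not sufficient",  # (p and q) -> p
}
COMBOS = [(1, 1), (1, 0), (0, 1), (0, 0)]

def classify(presence, in_p):
    """presence: doc -> 0/1 for one criterion; in_p: doc -> True if doc is in group P."""
    # Collect every (p, q) combination that occurs in at least one document
    observed = {(presence[doc], int(in_p[doc])) for doc in presence}
    signature = tuple(int(combo in observed) for combo in COMBOS)
    # A column that matches no pattern is reported as "None"
    return PATTERNS.get(signature, "None")

# Hypothetical Boolean rows for three criteria over four documents
in_p = {"P1": True, "P2": True, "Q1": False, "Q2": False}
story = {"P1": 1, "P2": 0, "Q1": 0, "Q2": 0}
feedback = {"P1": 1, "P2": 0, "Q1": 1, "Q2": 0}
gamification = {"P1": 0, "P2": 0, "Q1": 0, "Q2": 0}

print(classify(story, in_p))
print(classify(feedback, in_p))
print(classify(gamification, in_p))
```

Under this reading, a criterion that never reaches the threshold in any document matches no pattern, which would be reported as “None”.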
Table 16. Selected criteria for the application of the method.
| Criteria | Synonyms | Criteria | Synonyms |
| --- | --- | --- | --- |
| methodology | method | tests | evaluation |
| lifecycle | stages | roles | character |
| agile process | scrum | pedagogical objectives | educational aims |
| resources | digital resources | gamification | ludification |
| story | narrative | product quality | property |
| prototyping | mock-up | quality of use | usability |
| modeling language | machine language | gamePlay | actions |
Table 17. Key Criteria Matrix based on TF-IDF metric.
| Criteria | Logical Pattern |
| --- | --- |
| Methodology | None |
| lifecycle | None |
| agile process | None |
| resources | None |
| story | Sufficient and not Necessary |
| prototyping | None |
| modeling language | None |
| tests | None |
| roles | None |
| pedagogical objectives | Sufficient and not Necessary |
| gamification | None |
| product quality | None |
| quality of use | None |
| gameplay | None |
| display devices | None |
| interaction devices | None |
| feedback | Not Necessary and Not Sufficient |
| difficulty settings | None |
| usability | None |
| accesability | None |
| visual elements | Not Necessary and Not Sufficient |
| sound elements | None |
| initial test | None |
| context help | None |
| adaptability | None |
| learning techniques | Sufficient and not Necessary |
| avatar | None |
| context centered design | None |
| user centered design | None |
| end users | None |
| experts | None |
| linguisic game | None |
| creative techniques | None |
| consensus mechanisms | None |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Carrión-Toro, M.; Aguilar, J.; Santórum, M.; Pérez, M.; Astudillo, B.; Lopez, C.-P.; Nieto, M.; Acosta-Vargas, P. iKeyCriteria: A Qualitative and Quantitative Analysis Method to Infer Key Criteria since a Systematic Literature Review for the Computing Domain. Data 2022, 7, 70. https://doi.org/10.3390/data7060070

