Review

Computational Methods for Information Processing from Natural Language Complaint Processes—A Systematic Review

by J. C. Blandón Andrade 1,*, A. Castaño Toro 1, A. Morales Ríos 1 and D. Orozco Ospina 2
1 Systems and Telecommunications Engineering Program, Catholic University of Pereira, Pereira 60001, Colombia
2 Pereira Energy Company, Pereira 60003, Colombia
* Author to whom correspondence should be addressed.
Computers 2025, 14(1), 28; https://doi.org/10.3390/computers14010028
Submission received: 28 November 2024 / Revised: 17 December 2024 / Accepted: 25 December 2024 / Published: 20 January 2025

Abstract
Complaint processing is of great importance for companies because it allows them to understand customer satisfaction levels, which is crucial for business success. Complaints reveal the real perceptions of users and thus make visible the problems derived from the provision of a service, and they are usually expressed in oral or written natural language. In addition, the treatment of complaints is relevant because, according to the laws of each country, companies are obliged to respond to them within a specified time. The specialized literature mentions that enterprises lost USD 75 billion due to poor customer service, and it highlights that companies need to know and understand customer perceptions, especially emotions, and product reviews, since the voice of the customer is important for an organization. In general, there is a need for research related to computational language processing to handle user requests. The authors show great interest in computational techniques for processing this information in natural language and in how this could contribute to the improvement of processes within the productive sector. This work searches indexed journals for information related to computational methods for processing relevant data from user complaints. We apply a systematic literature review (SLR) method that combines Kitchenham's review guidelines with the PRISMA statement. The systematic process allows the extraction of consistent information, and after applying it, 27 articles were obtained on which the analysis was conducted. The results show various proposals using linguistic, statistical, machine learning, and hybrid methods. We find that most authors combine Natural Language Processing (NLP) and Machine Learning (ML) to create hybrid methods. The methods extract relevant information from customer complaints in natural language in various domains, such as government, medical, banking, e-commerce, public services, agriculture, customer service, environmental, and tourism, among others. This work contributes as support for the creation of new systems that can give companies a significant competitive advantage due to their ability to reduce complaint response times within the limits established by law.

1. Introduction

Complaints refer to quality of service; they allow a company to measure and evaluate whether a product or service it offers is well received by users [1,2]. Companies must offer different service channels for users to register their comments, then process the information in natural language and provide a response to the user within the time limits established by law. Finally, they must propose strategies to improve the good or service and thus avoid future sanctions [3,4]. Natural language is defined as the medium used for communication between people; its highly sophisticated cycle of evolution makes it complex to process [5]. Natural Language Processing (NLP) is concerned with creating computational systems to perform language-related tasks in pursuit of human-machine communication [6,7].
Hanni et al. [8] mention the growth of e-commerce sites on the Internet as a great opportunity to learn about customer feedback on products and thereby improve them. Akella et al. [9] contextualize the importance of the voice of the customer for an organization; they explain that these data must be processed using NLP so that they can be converted into concrete actions in companies. Saranya and Jayanthy [10] highlight the importance of extracting information, especially emotions, from texts and product reviews to gain insight. Ramaswamy et al. [11] mention that companies need to understand customer perceptions to build effective plans for marketing their products and that these messages are found across multiple service channels. Lam et al. [12] highlight that in 2017, companies lost USD 75 billion due to poor customer service; they found that inefficient agents spent a lot of time on processes that did not solve the problem. They emphasize that companies should be concerned about how agents respond, for example, to a cancellation request and how certain actions can lead to a positive or negative outcome. Other authors also point to the need for research related to computational language processing to handle user requests and thereby improve the services companies offer to their customers [13,14,15,16,17]. Accordingly, this work focuses on finding natural language processing methods that allow businesses to process complaints given by users in natural language about the provision of a good or service, identify their relevant aspects, and thus contribute to providing a response to users in less time.
This work applies a systematic literature review method consisting of three phases: (i) definition of research questions, where the questions that must be resolved at the end of the analysis are posed; (ii) carrying out the search process, where search strings are posed and then launched in the different databases selected; and (iii) screening and filtering, which is the section where the information is organized. After applying the systematic process, 27 articles were obtained to answer the research questions. The results show different proposals focused on linguistic, statistical, and machine learning methods and combinations thereof. It is expected that this work will support the creation of new systems focused on the treatment of natural language complaints.
The paper is organized as follows: Section 2 describes the method used to carry out the study and details the subsequent steps. Section 3 presents the results. Section 4 discusses the findings. Section 5 presents the conclusions of the review.

2. Materials and Methods

2.1. Literature Review Method

A systematic literature review is a reliable, accurate, and verifiable method to search for objective information on a specific topic. The available literature is evaluated and interpreted based on the research questions posed for the study. In this work, an adaptation of the Kitchenham [18] method was used, and the results are reported following the PRISMA statement [19]. The public registry of the review is available at https://doi.org/10.17605/OSF.IO/3QGK5, accessed on 25 December 2024.

2.1.1. Literature Review Guide by Kitchenham

We use the Kitchenham method [18], which is categorized as a tertiary literature review and whose main objective is to improve the original method presented in 2004 through the inclusion of quality assessment. The method proposes the phases presented in Table 1.
In this work, relevant information related to computational methods for information processing from natural language complaints was extracted from scientific material and then the information obtained was analyzed. In the following sections, we describe the activities included in Table 1.

2.1.2. PRISMA Statement

The PRISMA statement provides the preferred reporting items for systematic reviews and meta-analyses. It includes a 27-item checklist that covers everything from the title to the discussion of a systematic literature review, together with a flow chart that documents the screening and filtering stages of the review and delivers evidence about the review protocol used.

2.2. Research Questions

In this study, a research question was posed, which made it possible to determine the scope and precision of the information required. The research question defined for the review is: What are the characteristics of the computational methods used to process information from natural language complaints?

2.3. Database Search Criteria

Search Process

With the research question defined, the search strings to be used were designed and then launched in the databases, and the resulting articles were used to try to answer the research question. Two search strings were defined: (i) Methods for Complaint Process, and (ii) Programming Languages for Complaint Process. Considering the strings, a search equation was constructed (see Equation (1)) for use in the databases:
((("petitions" OR "complaint process") AND "method") OR (("complaint process" OR "petitions") AND ("java" OR "python" OR "javascript" OR "php"))) OR ("personal complaint" AND "method")   (1)
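As an informal illustration only (not part of the review protocol), the boolean logic of Equation (1) can be expressed as a small predicate over a document's text. The function below is a hypothetical sketch; the simple phrase-matching strategy is an assumption, since each database applies its own query syntax and operators.

```python
import re

def matches_search_equation(text: str) -> bool:
    """Hypothetical predicate mirroring the boolean structure of Equation (1)."""
    lowered = text.lower()

    def has(term: str) -> bool:
        # Simple case-insensitive phrase match; real databases use richer query operators.
        return re.search(re.escape(term), lowered) is not None

    string_1 = (has("petitions") or has("complaint process")) and has("method")
    string_2 = (has("complaint process") or has("petitions")) and (
        has("java") or has("python") or has("javascript") or has("php")
    )
    string_3 = has("personal complaint") and has("method")
    return string_1 or string_2 or string_3

# A record mentioning a complaint process and a method would be retrieved.
print(matches_search_equation("A method for the complaint process in utility companies"))  # True
```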
An electronic scientific database is defined as a set of standardized information records for easy access with the possibility of ordering them according to desired criteria. For this research, databases specialized in computing and artificial intelligence were chosen. In order to strengthen the selection, some bibliographic references were considered [18,20,21,22,23]. With the elements described above, the following databases were used for this systematic review: (i) IEEE Xplore Digital Library; (ii) Science Direct; (iii) Springer; and (iv) Web of Science.

2.4. Screening and Filtering

2.4.1. Inclusion and Exclusion Criteria

The inclusion and exclusion criteria seek to establish limits for the systematic literature review and thus allow the information to be interpreted reliably in order to properly classify the studies with direct evidence on the research question [18,24]. The inclusion criteria defined were: (i) the document or study is an academic journal article, conference article, or book chapter; (ii) the language of the document is English or Spanish; and (iii) the document is related to processing information from complaints expressed in natural language. The exclusion criteria are: (i) the publication date of the article is earlier than 2018; (ii) the article is repeated; (iii) the full text is incomplete; and (iv) the document or study is a white paper, book, non-scientific publication, or an abstract. With these criteria, we expect to obtain high-quality articles relevant to the processing of complaints written in natural language. There was also interest in works published in recent years so that the findings are relevant to our research.
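The following sketch is purely illustrative and assumes a hypothetical record structure (the field names are ours, not part of the protocol); it shows how the inclusion and exclusion criteria above could be applied programmatically to a bibliographic record.

```python
def passes_criteria(record: dict) -> bool:
    """Apply the inclusion and exclusion criteria described above to one record."""
    included_types = {"journal article", "conference article", "book chapter"}
    inclusion = (
        record["type"] in included_types
        and record["language"] in {"English", "Spanish"}
        and record["about_natural_language_complaints"]
    )
    exclusion = (
        record["year"] < 2018
        or record["is_duplicate"]
        or not record["full_text_available"]
        or record["type"] in {"white paper", "book", "non-scientific publication", "abstract"}
    )
    return inclusion and not exclusion

# Hypothetical record; the field names and values are illustrative only.
example = {
    "type": "conference article",
    "language": "English",
    "year": 2021,
    "is_duplicate": False,
    "full_text_available": True,
    "about_natural_language_complaints": True,
}
print(passes_criteria(example))  # True
```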

2.4.2. Quality Assessment

Quality assessment allows the quality of the selected articles to be measured, which is useful for the synthesis and analysis of the study results. Checklists of the factors to be evaluated in each study are often used, and numerical quality scores can be obtained [24]. In this work, quality was assessed using four criteria, embodied in the questions below. QA1 guarantees that the methods can be compared in the future, regardless of the model presented by the authors. QA2 aims to gauge the robustness of each system. QA3 sheds light on the technologies used to implement the method and whether they are modern and current. QA4 makes it possible to know the performance of the methods and compare them.
QA1. Does the study clearly present a computational method for processing information from a natural language complaint?
QA2. Does the study mention the number of documents processed in the computational method?
QA3. Does the study mention a programming language or framework for the development of the computational method?
QA4. Does the method present any evaluation criteria?
The questions were then scored as follows:
QA1: Yes, a computational method for processing information from a natural language complaint is described. No, a computational method for processing information from a natural language complaint is not found.
QA2: Yes, the number of documents processed by the computational method is mentioned. No, the number of documents processed by the computational method is not mentioned.
QA3: Yes, at least one programming language or framework is mentioned for the development of the computational method. No, there is no mention of a programming language or framework for the development of the computational method.
QA4: Yes, some method evaluation criteria are present. No, no method evaluation criteria are present.
The authors verified, for each article, whether each quality criterion was met, and based on this, a table of results was constructed. The scoring was Yes = 1, No = 0.
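As a minimal sketch of the scoring rule (Yes = 1, No = 0), assuming a simple dictionary per article with the four questions above as keys:

```python
# Answers for one hypothetical article (Yes = 1, No = 0).
qa_answers = {"QA1": 1, "QA2": 0, "QA3": 1, "QA4": 1}

total_score = sum(qa_answers.values())
print(total_score)  # 3, the maximum score observed in Table 4
```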

2.4.3. Data Collection

In this section, information is extracted by retrieving relevant data. To achieve the objective, a classification scheme was constructed to organize the information and thus answer the defined research question. Table 2 presents the scheme that was used.
Here, one researcher extracted the data, and another checked the extraction. The main researcher coordinated the data extraction and checking tasks, which involved all the authors of this paper.

2.4.4. Data Analysis

To analyze the data, we extracted the information into columns in an Excel file. Subsequently, they were tabulated as follows:
  • Source of information.
  • Technology of the method found in the article.
  • Application domain.
  • Number of documents processed by the method.
  • Evaluation level of the computational method.
  • Programming languages or frameworks used by the method.
With the information collected, the research question could be answered through the information contained in each of the columns.
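The tabulation itself can be reproduced with any spreadsheet or data-frame tool; the sketch below assumes pandas and uses two invented rows only to illustrate the column layout and how grouping by method category supports answering the research question.

```python
import pandas as pd

# Illustrative rows only; the real sheet tabulates the 27 selected studies.
rows = [
    {"source": "[25]", "technology": "Hybrid", "domain": "Citizen complaints portal",
     "documents_processed": 10000, "evaluation": "Accuracy", "language_or_framework": None},
    {"source": "[33]", "technology": "ML", "domain": "Water pollution complaints",
     "documents_processed": None, "evaluation": None, "language_or_framework": "Python"},
]
df = pd.DataFrame(rows)

# Count the studies per method category, mirroring the analysis reported in Section 4.
print(df.groupby("technology").size())
```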

2.4.5. Flow Chart of Screening and Filtering

Figure 1 shows the flow chart of the screening and filtering process recommended by the PRISMA statement. It shows the four databases where the articles were found, how we excluded unwanted articles, and the information about the included records.
In the identification stage, 128 records were collected from 4 databases: IEEE Xplore Digital Library (32), Science Direct (38), Springer (27), and Web of Science (31). A total of 25 records were removed before the screening process (20 due to duplicates and 5 for other reasons). In the screening phase, 103 records were reviewed, of which 60 were excluded after a human evaluation because their results were unclear and the technologies used were not considered significant. Subsequently, 43 reports were sought, but 3 could not be obtained. In the eligibility stage, the 40 retrieved studies were assessed, excluding 13 because their publication dates were earlier than 2018. Finally, 27 studies met the inclusion criteria and were used for the review.

3. Results

After launching the search equation (Equation (1)) in each of the selected databases, articles whose title, abstract, or conclusions showed a relation to the subject were selected. The classification scheme was then used to collect the information. Table 3 shows the number of articles found in each of the databases.
The application of the criteria made it possible to obtain the most relevant and suitable articles to answer the research question posed. Table 3 also presents the results after applying the inclusion and exclusion criteria, resulting in 27 articles for the study.
Next, the quality criteria defined for the study were evaluated to verify the level of completeness of the information contained in the documents selected for the study. Table 4 shows the papers in descending order according to the score obtained.
As shown in Table 4, none of the 27 articles selected for the study met all four quality criteria established. Overall, 11.12% met three quality criteria, 44.45% met two criteria, 37.03% met one criterion, and 7.4% did not meet any quality criteria. The above shows that 92.6% of the articles at least explain the method they developed to obtain their results. In total, 7.4% did not clearly show the method, only mentioning the technology without further details.
After we conducted data collection, we classified the information using Table 2. Then we performed data analysis by tabulating the information in an Excel file to answer the research question with consistent data. The detailed analysis is presented in the following section.

4. Discussion

This section analyzes the data in order to answer the study's research question. Regarding the sources of information, the authors of the works related to the treatment of complaint information mainly belonged to universities located in countries such as Pakistan, China, Indonesia, Taiwan, Japan, the United States, India, Korea, Qatar, Australia, England, Israel, Egypt, and Brazil. Regarding the date of publication, only articles published from 2018 onward were considered, because they may include advancements in artificial intelligence [52], data analysis, and automation, key technologies that have transformed complaint processing in terms of efficiency and personalization.

4.1. Linguistic Methods

Linguistic methods regularly use the rules of language and, after a process of analysis, reach the desired result. Usui et al. [26] present a system that takes data from the electronic drug history of a Japanese pharmacy, approximately 5000 documents, which are used as a complaint mechanism. The authors formulate rules based on morphological analysis, execute them, and then automatically annotate data from free text with the international disease code. Although the system does not have the best performance, reporting only 66% accuracy and 63% recall, the authors hope to improve it. Anggraini et al. [31] present a rule-based method and categorization methods for a company that provides drinking water. The system takes user comments from the web, processes approximately 100 documents, evaluates them textually using rules, and then determines whether a comment is positive or negative. Farouk et al. [34] present a method to evaluate the semantic similarity between Arabic sentences, taking into account farmers' complaints and trying to give them a solution. The method uses TF-IDF as a weighting scheme, then classifies using a MapReduce Support Vector Machine to refine the semantic similarity, and obtains an F-measure of 86.7%. Tootooni et al. [37] present a method that takes the first digital patient record from an emergency department. They developed a structured list that categorizes the complaints, then developed an NLP-based algorithm called Chief Complaint Mapper (CCMapper), with which they assign a category to the complaint in free text. The method is validated by two expert physicians, who use the Kappa statistic to contrast their evaluations. The method has a sensitivity of 82.3%, specificity of 99.1%, and F-score of 82.3%. Yoshikawa et al. [50] present a recommendation system for e-commerce. They mainly use customer feedback and satisfaction information. With those inputs, they use an information extraction method to obtain positive and negative information and then make a recommendation to the customer.
The advantage of this type of method is that, thanks to the rules that are defined, many elements of the language can be extracted, such as verb tenses, to obtain the root and meaning of the texts. The disadvantage of these methods is the effort involved in constructing large sets of rules.
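As a minimal illustration of the rule-based idea (the keyword lists and categories below are invented for this sketch and are not taken from [26] or [31]), a tiny polarity rule over a customer comment could look like this:

```python
# Illustrative keyword rules in the spirit of the rule-based approaches above;
# the word lists and decision rule are invented for this sketch only.
POSITIVE_TERMS = {"thanks", "excellent", "resolved", "satisfied"}
NEGATIVE_TERMS = {"leak", "delay", "no response", "billing error", "outage"}

def classify_comment(comment: str) -> str:
    """Very small rule set: count matched terms and decide the polarity of a comment."""
    text = comment.lower()
    pos = sum(term in text for term in POSITIVE_TERMS)
    neg = sum(term in text for term in NEGATIVE_TERMS)
    if neg > pos:
        return "negative"
    if pos > neg:
        return "positive"
    return "neutral"

print(classify_comment("Three days without water and still no response to my billing error"))
# negative
```

Real systems extend this idea with morphological analysis and much larger rule sets, which is precisely the construction effort mentioned above.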

4.2. Statistical Methods

Statistical methods are based on the distribution of words in the corpus. It should be clarified that most machine learning methods use statistics as a fundamental part of the operation of their algorithms. Achcar and de Godoy [46] present a method that allows evaluation of the service quality standard of a telecommunications company using statistical process control (SPC). They used a dataset from January 2018 to November 2019 containing monthly and weekly counts of user complaints regarding the technical services offered. The authors use multiple linear regression models with the count data transformed to a logarithmic scale and Poisson regression models with the original count data, thus detecting significant factors to improve. They mention that forecasting future complaint counts with statistical models will help the company plan the distribution of technicians in the different areas and thus improve service. The authors use Minitab Statistical Software (version 20.4), which can examine current and past data to discover trends, find and predict patterns, uncover hidden relationships between variables, and create good visualizations.
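A minimal sketch of the Poisson-regression idea described in [46], using synthetic counts and assuming the statsmodels library; the simulated data and simple time-trend predictor are ours, not the company's.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic monthly complaint counts; the real study [46] used company data from 2018-2019.
rng = np.random.default_rng(0)
months = np.arange(1, 24)                     # time index used as a single predictor
counts = rng.poisson(lam=30 + 0.8 * months)   # simulated complaint counts per month

X = sm.add_constant(months)                   # intercept + trend term
poisson_model = sm.GLM(counts, X, family=sm.families.Poisson()).fit()

print(poisson_model.summary())
# The fitted rate could then be projected forward to help plan technician allocation.
```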
Statistical methods are reliable for understanding new words or detecting errors, such as wrong words or accidental omissions. Their disadvantage appears when extra terms are added as synonyms to the original information during preprocessing, because it then becomes difficult to maintain or improve the precision and recall of the methods.

4.3. Machine Learning (ML) Methods

Machine learning methods typically learn from data, from which a wide number of possibilities are derived. Singh and Saha [28] present a method for commerce that seeks to benefit from social networks and shopping websites. They suggest that complaints are usually worked on as plain text, but that advantage can also be taken of code-mixed text. The authors manually annotate classes such as complaint, emotion, or sentiment in the Product Review dataset, a corpus of mixed-language complaints consisting of 3711 annotated instances. They then develop a framework based on a Graph Attention Network (GAT), adding self-attention layers to perform complaint detection (the main task), sentiment classification, and emotion recognition simultaneously. They obtained a precision of 72.82% and a Macro-F1 of 71% in the complaint detection task. Alamsyah et al. [27] present a method to classify one million complaints across the five dependencies of a bank in Indonesia. The authors perform text preprocessing, including use of the TF-IDF algorithm, and then use a Convolutional Neural Network to perform the classification. The results show an accuracy of 85%, although the authors acknowledge that the system has yet to be implemented in a real environment. Hsu et al. [30] present a method to analyze the chief complaints of preschool children to detect influenza-like illnesses, to help with physician diagnosis and to act quickly in the face of an outbreak. The authors use deep learning tools, especially the BERT model, to classify texts, obtaining an accuracy of 72.87%. Assaf and Srour [32] present a method to analyze occupant complaints in 16 buildings (approximately 6000 complaints) and try to forecast thermal complaints as a strategy for predictive maintenance of facilities. The authors used a multilayer perceptron model. Fan et al. [33] present a method consisting of a Deep Cross Domain Network (DCDN), which takes water pollution complaints and classifies whether or not the complaint has bad intentions. They first use an LSTM to extract the domain features, then a self-attention mechanism fuses the shared domain features and private domain features, and finally a multilayer perceptron generates the classification result. They use the Python programming language. Singh et al. [36] present a system for identifying complaints and classifying sentiments. They label a corpus with sentiments according to the text written by users, which can be positive, negative, or neutral. They then use deep learning tools, among them AffectiveSpace 2, to determine whether a text is a complaint or a sentiment, finding that there is a correlation between these two variables. They obtain an accuracy of 83.63% and a Macro-F1 score of 81.9% for the complaint identification task. Fan et al. [41] propose an annotation-based text classification method for environmental complaint reporting. They first use a small amount of labeled data to establish a corpus of cell vocabulary. Then, the cell vocabulary is expanded into the corpus of the pre-trained model. Finally, a TextCNN model is trained to perform automatic labeling and classification of the complaint text. Tong et al. [42] present a method for classifying complaint text from the web based on a character-level Convolutional Neural Network (CNN). The authors remove negative elements using lexicons, then perform character embedding to encode the characters, then perform feature extraction to reduce dimensionality, and finally classify by means of a convolutional network. Luo et al.
[43] take short texts from the 12345 civic hotline of Haikou city to perform text classification given the large number of calls. The authors perform experiments to compare FastText, TextCNN, TextRNN, and RCNN and conclude that the best technology in their experiments was TextCNN. Chen et al. [44] present a method that takes complaints from a tourism page, calculates word frequency, and applies the LDA topic model (a Bayesian model) to classify complaints into their respective categories and thereby contribute to complaint management. Shin et al. [45] present a method for indoor water leakage management. They apply machine learning (ML) to predict the spatial distribution of customer complaints, specifically using the XGBoost and LightGBM models. The authors mention that their tool can contribute to decision-making. Zhong et al. [38] present a method for building quality complaints, which should be classified and resolved quickly. They use Convolutional Neural Networks (CNN) to capture semantic features in texts and then automatically classify the writings into predefined categories.
The authors conclude that, compared to support vector machine and Bayes-based classifiers, CNNs perform better. Wang et al. [47] present a method that processes written air pollution complaints in Beijing from the years 2019 and 2020. The authors extract names and addresses of geographical points, as well as times and types of complaints, using Bidirectional Encoder Representations from Transformers (BERT) plus Conditional Random Fields (CRF). They then perform filtering operations and create heat maps to identify the most polluted areas in Beijing more accurately and address emergencies more quickly. Chen et al. [51] present an intelligent government complaint prediction method to respond to citizen complaints through machine learning (ML) technologies. The system collects and integrates complaints, performing label correction to refine the labels and, in some cases, unify them into one category. With the refined data, the central server processes solutions to the complaints through classification algorithms. The authors mention that their major contribution is to apply text classification, as well as label correction, to better train the classification method.
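Several of the works above train CNN-based text classifiers (e.g., [38,41,42]). The sketch below is a generic TextCNN-style model in tf.keras, offered only as an orientation; the vocabulary size, sequence length, number of categories, and hyperparameters are placeholders and do not come from those studies.

```python
import tensorflow as tf

# Placeholder sizes; real studies choose these from their corpora and label sets.
VOCAB_SIZE, SEQ_LEN, NUM_CLASSES = 20000, 200, 5

model = tf.keras.Sequential([
    tf.keras.Input(shape=(SEQ_LEN,)),                     # padded token-id sequences
    tf.keras.layers.Embedding(VOCAB_SIZE, 128),            # learned word embeddings
    tf.keras.layers.Conv1D(filters=128, kernel_size=5, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),                  # keep strongest feature per filter
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(padded_token_ids, category_ids, epochs=...) would train the classifier
# once the complaint texts have been tokenized and padded to SEQ_LEN.
```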
The advantage of machine learning methods is that they are capable of learning from large amounts of linguistic data, recognizing the relationships between words, phrases, and sentences in texts without the need for explicit rules. The disadvantage of this type of method is that the resulting models are not easily understandable by humans, which makes it difficult to diagnose the reason for false positives or false negatives in the developed systems; this, in turn, increases the effort required to construct the training sets.

4.4. Hybrid Methods

Hybrid methods use one technique or a combination of several techniques for information processing. Qurat-ul-ain et al. [25] present a method that automatically classifies complaints received through a web portal. First, preprocessing is performed, where information about the complaints is collected; then tokenization, stemming, and lemmatization are applied through NLP tools. Next, feature extraction is performed through a Count Vectorizer and TF-IDF to convert textual data into numerical data. Finally, 10,000 complaints are classified into 10 different classes using Support Vector Machine (SVM), Random Forest, Logistic Regression, Multinomial Naive Bayes, and K-Nearest Neighbor (KNN) algorithms. They obtained an accuracy of 85%. Yance Nanlohy et al. [29] present a public complaint method for reporting problems about government performance, where there are several categories and the objective is to classify the complaints. The method starts with data preparation, then NLP feature extraction (number of words, number of characters, average number of words) is performed, then preprocessing transforms the unstructured textual data into a structured model, then term frequency weighting (TF-IDF) is calculated, and finally the Multinomial Naive Bayes algorithm is applied for classification. They obtained a precision of 91.38% and a recall of 90.73%. HaCohen-Kerner et al. [35] present a method for automatic text classification of complaint letters written in Hebrew, which were sent to several companies and needed to be classified into different categories. The method starts by computing the frequencies of word unigrams; then Batch Normalization (BN), Supervised Learning (SL), Sequential Minimal Optimization (SMO), and Random Forest (RF) machine learning models are used, and the IG and CFS filters are applied. They obtain an accuracy of 84.5% for seven categories. Ke and Chen [40] present a method where they take customer complaints to a gas company between the years 2018 and 2020 to perform text classification. First, they perform text preprocessing, then the complaint is segmented using dictionaries, then a Naive Bayes classifier combined with an N-gram model is used, and finally word frequency is analyzed. Rao and Zhang [39] present a method where they take complaints from an online portal, use a Bayesian network to segment words in Chinese, and then extract emotional semantic features as well as text content features. Finally, they classify complaint content using the K-Means clustering algorithm. They obtain a precision of 92.93% and a recall of 93.90%. Li et al. [48] present a method using work order information from customer complaints at an energy company. They perform natural language processing with the following steps: work order data cleaning, text segmentation, information characterization, and training and evaluation of machine learning models. Kim and Lim [49] present a method that, from user complaints, seeks to suggest improvements. The method takes complaints from the database and then performs data analysis using NLP. Subsequently, a hierarchy of service features is constructed with a keyword dictionary, customer complaints are identified using sentiment analysis, and customer complaint tables are developed using statistical process control (SPC) analysis for service quality. The authors acknowledge that the method can be improved.
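Many of these hybrid pipelines follow the pattern preprocess, weight terms with TF-IDF, then classify (e.g., [25,29]). The sketch below is a toy reconstruction of that pattern with scikit-learn; the example complaints, labels, and parameter choices are invented and are not taken from those studies.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy labeled complaints; real systems such as [25] and [29] train on thousands of records.
texts = [
    "water bill charged twice this month",
    "street light broken for two weeks",
    "no internet connection since yesterday",
    "refund for duplicate payment not received",
]
labels = ["billing", "infrastructure", "connectivity", "billing"]

# Hybrid pattern: linguistic preprocessing (lowercasing, stop-word removal, tokenization
# inside TfidfVectorizer) followed by a statistical/ML classifier.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english", ngram_range=(1, 2))),
    ("clf", MultinomialNB()),
])
pipeline.fit(texts, labels)

print(pipeline.predict(["charged twice on my water bill"]))  # expected: ['billing']
```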
The advantage of hybrid methods is the ability to combine several techniques; this is important because solutions to different problems within language processing can be modeled. They are a highly promising approach because they integrate expert knowledge through linguistic or logical rules, enabling the effective handling of exceptional cases and limited datasets. This capability enhances accuracy in complex scenarios while ensuring greater interpretability, an essential factor in complaint analysis, where context and language sensitivity are critical. Furthermore, hybrid methods provide the flexibility to adapt quickly to changes in complaint patterns without requiring full model retraining, resulting in a more robust and efficient solution compared to purely data-based approaches. In contrast, purely machine learning methods often require large amounts of labeled data and are prone to errors in domain-specific or ambiguous contexts. A possible disadvantage is their relative lack of maturity; in addition, depending on the techniques used, a hybrid method may also combine the problems of each technique.

4.5. Synthesis of Methods

As a way of synthesizing the information of the works found, Table 5 is presented.
These methods demonstrate promising approaches, yet several limitations impact their real-world applicability. The issues include data quality, such as noise, misspellings, and domain-specific terminology, which complicate preprocessing steps like tokenization and feature extraction. Additionally, many models rely on large, labeled datasets, which are often difficult and costly to acquire and may struggle with imbalanced classes. The black-box nature of many machine learning models limits their interpretability, which is critical for decision-making in customer service and regulatory contexts. The hybrid models introduce complexity and integration challenges because they may not efficiently handle growing datasets without significant computational resources and require regular retraining to adapt to evolving complaint patterns, adding to their maintenance cost. Despite these limitations, hybrid methods and continuous improvements in preprocessing and feature extraction techniques offer a path toward more robust and efficient complaint classification systems, although further research is needed to address these issues for practical deployment.
The automation of complaint processing also raises important ethical concerns, particularly related to data privacy and potential biases in automated systems. First, sensitive customer information must be handled and protected in compliance with data privacy regulations. Second, there is a risk of bias in automated systems, because machine learning models often reflect the biases present in the training data. Third, the lack of transparency in many machine learning models complicates accountability in decision-making. To mitigate these risks, it is essential to incorporate ethical guidelines, such as fairness-aware algorithms and rigorous data governance practices, ensuring that automated systems are both reliable and just.
Some limitations of the current research relate to (i) the method used: because it was adapted from the original, this could introduce variations that affect the reliability and scalability of the results; (ii) the searches in the different databases mostly return methods for processing the English language, with very few examples for other languages; and (iii) using a single search string could introduce some bias into the research. The adaptation of the method was carried out because we believe that the refinement proposed by the authors of the method is unnecessary for this research. We wanted to show complaint extraction methods for other languages, but English prevails in the databases; despite this, we tried to present complaint-processing methods in other languages. Finally, we consider that, having defined a single research question, it was appropriate to define a single search string.
As future work, according to the reviewed literature, the authors suggest: (i) A multilabel classifier to process complaint texts with multiple labels [29,30,35,53]; (ii) classifying complaints according to the severity of their explicit and/or covert verbal violence [35]; (iii) complaint identification from social media data such as politeness markers in texts [36]; (iv) investigating the impact of emotions on complaints [35]; (v) deeper analysis of the question mark series and their interpretation using sentiment analysis [35,50]; (vi) building and applying model(s) that will also use key phrases, expansions of abbreviations, and summaries that can be extracted from the complaints [35]; (vii) incorporating clustering analysis and association rule mining to identify the categories of complaints and how they are interrelated with the actions taken to handle them [32]; and (viii) the extraction of complaints from audios or videos [39].

5. Conclusions

For organizations, complaints are a good form of feedback that can be used to improve their most sensitive processes, i.e., those related to users' perceptions of a good or service. In this work, a systematic literature review was conducted to find the most used computational methods for complaint processing and the technologies they employ for the linguistic treatment of the important information contained in complaints.
This work defined one research question for the study. The search process was then performed with the search equation in the four selected databases, which resulted in 128 articles that met the criteria established up to that point. After applying the inclusion and exclusion criteria, 27 articles remained that could be used to answer the research question. The quality of the articles was then evaluated against the four quality criteria defined, to establish how many criteria each article met, concluding that 92.6% of the articles met at least one quality criterion. The information was then tabulated for analysis. The 27 articles used different technologies, as shown in Table 5: 3.7% of the authors used statistical methods to process complaint texts, 18.52% used linguistic methods, 51.85% used machine learning methods (including deep learning), and 25.93% used hybrid methods to which they made different adaptations.
According to the findings of this research, linguistic methods are used less because of the human cost of constructing the rules, statistical methods can introduce errors during preprocessing when extra terms are added as synonyms, and machine learning methods require large training sets. The combination of NLP and machine learning techniques can help extract the relevant information from customer complaints. We conclude that hybrid methods allow the integration of various approaches, that they are widely used for processing and extracting information from complaints, and that this can provide a significant competitive advantage to companies that develop these types of systems, as they can reduce response times to complaints within the framework of the law. However, before an automatic complaint process can be useful to enterprises, both technical and ethical issues need to be addressed.

Author Contributions

Conceptualization, J.C.B.A., A.C.T. and A.M.R.; methodology, J.C.B.A. and A.M.R.; software, J.C.B.A., A.M.R., A.C.T. and D.O.O.; validation, J.C.B.A., A.M.R., A.C.T. and D.O.O.; formal analysis, J.C.B.A., A.M.R. and A.C.T.; investigation, J.C.B.A., A.M.R. and A.C.T.; writing—original draft preparation, J.C.B.A., A.M.R., A.C.T. and D.O.O.; data curation, J.C.B.A., A.M.R., A.C.T. and D.O.O.; writing—review and editing, J.C.B.A. and A.M.R.; project administration, J.C.B.A. and D.O.O.; supervision, J.C.B.A.; visualization, A.C.T. and D.O.O.; funding acquisition, J.C.B.A. and D.O.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research received external funding from the Ministry of Science, Technology, and Innovation; the Catholic University of Pereira; and Pereira Energy Company in Colombia.

Data Availability Statement

Data are available upon request.

Conflicts of Interest

The authors declare that this study received funding from Pereira Energy Company. The funder was not involved in the study design, collection, analysis, interpretation of data, the writing of this article or the decision to submit it for publication. In addition, author Daniela Orozco Ospina is employed by Pereira Energy Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. del Gómez Torres, M.P.; Rojas Berrío, S.; Robayo Pinzón, Ó.J. Identificación de niveles de calidad en el servicio a partir de peticiones, quejas y reclamos, en entidades bancarias de Colombia 2007–2014. Libre Empresa 2015, 12, 11–26. [Google Scholar] [CrossRef]
  2. Ramírez Puentes, J.P.; Prada Vargas, C. Implementation of improvements in the process of requests, complaints, claims, suggestions and congratulations of a health sector organization in Colombia. In Proceedings of the 2021 Congreso Internacional de Innovación y Tendencias en Ingeniería (CONIITI), Bogotá, Colombia, 29 September–1 October 2021; pp. 1–4. [Google Scholar]
  3. Chacón, E.Y.; Barajas, H.P.; Sanchez Mojica, K.Y. Efectividad de los PQRS como Canal de Comunicación entre los Estudiantes y la Fundación de Estudios Superiores Comfanorte FESC. Rev. Convicciones 2018, 3, 163–168. [Google Scholar]
  4. Campo Bedoya, M.A. Los Beneficios de la Tecnología Frente a la Interposición y/o Notificación de las Peticiones, Quejas, Reclamos y Solicitudes (PQRS), Una Vez se Expidió el Código de Procedimiento Administrativo y de lo Contencioso Administrativo (Ley 1437 De 2011). Specialization Thesis, Universidad Militar Nueva Granada, Bogotá, Colombia, 2017. [Google Scholar]
  5. Vásquez, A.C.; Huerta, H.V.; Quispe, J.P.; Huayna, A.M. Procesamiento de lenguaje natural. Rev. Investig. Sist. Informática 2009, 6, 45–54. [Google Scholar]
  6. Mejía, J.M.M. Lingüística Computacional y de Corpus: Teorías, Métodos y Aplicaciones; Universidad de Antioquia: Medellín, Colombia, 2021; ISBN 978-958-50-1039-0. [Google Scholar]
  7. Blandón Andrade, J.C. Aplicaciones del Procesamiento de Lenguaje Natural. Entre Cienc. Ing. 2022, 16, 7–8. [Google Scholar] [CrossRef]
  8. Hanni, A.R.; Patil, M.M.; Patil, P.M. Summarization of customer reviews for a product on a website using natural language processing. In Proceedings of the 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Jaipur, India, 21–24 September 2016; pp. 2280–2285. [Google Scholar]
  9. Akella, K.; Venkatachalam, N.; Gokul, K.; Choi, K.; Tyakal, R. Gain Customer Insights Using NLP Techniques. SAE Int. J. Mater. Manuf. 2017, 10, 333–337. [Google Scholar] [CrossRef]
  10. Saranya, K.; Jayanthy, S. Onto-based sentiment classification using machine learning techniques. In Proceedings of the 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), Coimbatore, India, 17–18 March 2017; pp. 1–5. [Google Scholar]
  11. Ramaswamy, S.; DeClerck, N. Customer Perception Analysis Using Deep Learning and NLP. Procedia Comput. Sci. 2018, 140, 170–178. [Google Scholar] [CrossRef]
  12. Lam, S.; Chen, C.; Kim, K.; Wilson, G.; Crews, J.H.; Gerber, M.S. Optimizing customer-agent interactions with natural language processing and machine learning. In Proceedings of the 2019 Systems and Information Engineering Design Symposium (SIEDS), Charlottesville, VA, USA, 26 April 2019. [Google Scholar]
  13. Sezgen, E.; Mason, K.J.; Mayer, R. Voice of airline passenger: A text mining approach to understand customer satisfaction. J. Air Transp. Manag. 2019, 77, 65–74. [Google Scholar] [CrossRef]
  14. Ogudo, K.A.; Nestor, D.M.J. Sentiment Analysis Application and Natural Language Processing for Mobile Network Operators’ Support on Social Media. In Proceedings of the 2019 International Conference on Advances in Big Data, Computing and Data Communication Systems (icABCD), Winterton, South Africa, 5–6 August 2019; pp. 1–10. [Google Scholar]
  15. Hagen, L.; Uzuner, Ö.; Kotfila, C.; Harrison, T.M.; Lamanna, D. Understanding Citizens’ Direct Policy Suggestions to the Federal Government: A Natural Language Processing and Topic Modeling Approach. In Proceedings of the 2015 48th Hawaii International Conference on System Sciences, Kauai, HI, USA, 5–8 January 2015; pp. 2134–2143. [Google Scholar]
  16. Jungherr, A.; Jürgens, P. The Political Click: Political Participation through E-Petitions in Germany. Policy Internet 2010, 2, 131–165. [Google Scholar] [CrossRef]
  17. Mistler, M.; Schlueter, N.; Loewer, M.; Rafalczyk, V. Methodology for a Model-based Traceability of Requirements from Complaints in Business Networks Using e-DeCoDe. In Proceedings of the 2021 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), Singapore, 13–16 December 2021; pp. 1255–1259. [Google Scholar]
  18. Kitchenham, B.; Pearl Brereton, O.; Budgen, D.; Turner, M.; Bailey, J.; Linkman, S. Systematic literature reviews in software engineering—A systematic literature review. Inf. Softw. Technol. 2009, 51, 7–15. [Google Scholar] [CrossRef]
  19. Jack, H. Engineering Design, Planning, and Management; Academic Press: New York, NY, USA, 2021; ISBN 978-0-12-824164-6. [Google Scholar]
  20. Medina Otalvaro, C.M.; Blandón Andrade, J.C.; Zapata Jaramillo, C.M.; RiosPatiño, J.I. IoT Best Practices and their components: A Systematic Literature Review. IEEE Lat. Am. Trans. 2022, 20, 2217–2228. [Google Scholar] [CrossRef]
  21. Todorov, T. Practical aspects of journal indexing in scientific databases. In Proceedings of the 2021 5th International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), Ankara, Turkey, 21–23 October 2021; pp. 233–236. [Google Scholar]
  22. Nieto-Chaupis, H. Interpretation of Scimago Ranking in Terms of Success Probabilities. In Proceedings of the 2019 IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON), Valparaiso, Chile, 13–27 November 2019; pp. 1–4. [Google Scholar]
  23. Gamboa, J.O.A. Bases de datos y calidad de las revistas científicas: La aportación de Latindex. ESPACIO I+D Innovación Más Desarro. 2017, 6, 8–28. [Google Scholar] [CrossRef]
  24. Kitchenham, B. Procedures for Performing Systematic Reviews; Keele University Technical Report TR/SE-0401; Keele University: Keele, UK, 2004; Volume 33. [Google Scholar]
  25. Qurat-Ul-Ain; Shaukat, A.; Saif, U. NLP based Model for Classification of Complaints: Autonomous and Intelligent System. In Proceedings of the 2022 2nd International Conference on Digital Futures and Transformative Technologies (ICoDT2), Rawalpindi, Pakistan, 24–26 May 2022; pp. 1–6. [Google Scholar]
  26. Usui, M.; Aramaki, E.; Iwao, T.; Wakamiya, S.; Sakamoto, T.; Mochizuki, M. Extraction and Standardization of Patient Complaints from Electronic Medication Histories for Pharmacovigilance: Natural Language Processing Analysis in Japanese. JMIR Med. Inf. 2018, 6, e11021. [Google Scholar] [CrossRef]
  27. Alamsyah, D.P.; Arifin, T.; Ramdhani, Y.; Hidayat, F.A.; Susanti, L. Classification of Customer Complaints: TF-IDF Approaches. In Proceedings of the 2022 2nd International Conference on Intelligent Technologies (CONIT), Hubli, India, 24–26 June 2022; pp. 1–5. [Google Scholar]
  28. Singh, A.; Saha, S. GraphIC: A graph-based approach for identifying complaints from code-mixed product reviews. Expert Syst. Appl. 2022, 216, 119444. [Google Scholar] [CrossRef]
  29. Yance Nanlohy, L.; Mulyanto Yuniarno, E.; Mardi Susiki Nugroho, S. Classification of Public Complaint Data in SMS Complaint Using Naive Bayes Multinomial Method. In Proceedings of the 2020 4th International Conference on Vocational Education and Training (ICOVET), Malang, Indonesia, 19 September 2020; pp. 241–246. [Google Scholar]
  30. Hsu, J.-H.; Weng, T.-C.; Wu, C.-H.; Ho, T.-S. Natural Language Processing Methods for Detection of Influenza-Like Illness from Chief Complaints. In Proceedings of the 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Auckland, New Zealand, 7–10 December 2020; pp. 1626–1630. [Google Scholar]
  31. Anggraini, A.; Kusumaningtyas, E.M.; Barakbah, A.R.; Fiddin Al Islami, M.T. Indonesian Conjunction Rule Based Sentiment Analysis For Service Complaint Regional Water Utility Company Surabaya. In Proceedings of the 2020 International Electronics Symposium (IES), Surabaya, Indonesia, 29–30 September 2020; pp. 541–548. [Google Scholar]
  32. Assaf, S.; Srour, I. Using a data driven neural network approach to forecast building occupant complaints. Build. Environ. 2021, 200, 107972. [Google Scholar] [CrossRef]
  33. Fan, Q.; Han, H.; Wu, S. Credibility analysis of water environment complaint report based on deep cross domain network. Appl. Intell. 2022, 52, 8134–8146. [Google Scholar] [CrossRef]
  34. Farouk, R.A.; Khafagy, M.H.; Ali, M.; Munir, K.; Badry, R.M. Arabic Semantic Similarity Approach for Farmers’ Complaints. Int. J. Adv. Comput. Sci. Appl. 2021, 12, 348–358. [Google Scholar] [CrossRef]
  35. HaCohen-Kerner, Y.; Dilmon, R.; Hone, M.; Ben-Basan, M.A. Automatic classification of complaint letters according to service provider categories. Inf. Process. Manage. 2019, 56, 102102. [Google Scholar] [CrossRef]
  36. Singh, A.; Saha, S.; Hasanuzzaman, M.; Dey, K. Multitask Learning for Complaint Identification and Sentiment Analysis. Cogn. Comput. 2022, 14, 212–227. [Google Scholar] [CrossRef]
  37. Tootooni, M.S.; Pasupathy, K.S.; Heaton, H.A.; Clements, C.M.; Sir, M.Y. CCMapper: An adaptive NLP-based free-text chief complaint mapping algorithm. Comput. Biol. Med. 2019, 113, 103398. [Google Scholar] [CrossRef]
  38. Zhong, B.; Xing, X.; Love, P.; Wang, X.; Luo, H. Convolutional neural network: Deep learning-based classification of building quality problems. Adv. Eng. Inform. 2019, 40, 46–57. [Google Scholar] [CrossRef]
  39. Rao, Z.; Zhang, Y. Research on Content of User Complaint Classification Based on Data Mining. In Proceedings of the 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 12–14 June 2020; pp. 1080–1085. [Google Scholar]
  40. Ke, Y.; Chen, L. Research on Text Classification and Data Analysis of Complaints by Shanghai Gas Company. In Proceedings of the 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China, 12–14 March 2021; Volume 5, pp. 1626–1630. [Google Scholar]
  41. Fan, Q.; Yang, K.; Qiu, C.; Wang, Z. Environmental Complaint Text Classification Scheme Combining Automatic Annotation and TextCNN. In Proceedings of the 2021 China Automation Congress (CAC), Beijing, China, 22–24 October 2021; pp. 4731–4736. [Google Scholar]
  42. Tong, X.; Wu, B.; Wang, S.; Lv, J. A Complaint Text Classification Model Based on Character-Level Convolutional Network. In Proceedings of the 2018 IEEE 9th International Conference on Software Engineering and Service Science (ICSESS), Beijing, China, 23–25 November 2018; pp. 507–511. [Google Scholar]
  43. Luo, J.; Qiu, Z.; Xie, G.; Feng, J.; Hu, J.; Zhang, X. Research on Civic Hotline Complaint Text Classification Model Based on word2vec. In Proceedings of the 2018 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Zhengzhou, China, 18–20 October 2018; pp. 180–1803. [Google Scholar]
  44. Chen, Y.; Cai, Z.; Xu, T.; Lai, G. The Early-Warning and Control of Service Complaint Based on Time Series Forecasting Method and SPC Model-Take Ctrip as an Example. In Proceedings of the 2018 15th International Conference on Service Systems and Service Management (ICSSSM), Hangzhou, China, 21–22 July 2018; pp. 1–6. [Google Scholar]
  45. Shin, J.; Son, S.; Cha, Y. Spatial distribution modeling of customer complaints using machine learning for indoor water leakage management. Sustain. Cities Soc. 2022, 87, 104255. [Google Scholar] [CrossRef]
  46. Achcar, J.A.; de Godoy, D.M. Quality of Services: An Application with Customer Complaint Data from a Telecommunication Company. Indep. J. Manag. Prod. 2021, 12, 928–944. [Google Scholar] [CrossRef]
  47. Wang, X.; Zhu, Y.; Zeng, H.; Cheng, Q.; Zhao, X.; Xu, H.; Zhou, T. Spatialized Analysis of Air Pollution Complaints in Beijing Using the BERT plus CRF Model. Atmosphere 2022, 13, 1023. [Google Scholar] [CrossRef]
  48. Li, H.; Li, Z.; Rao, Z. Text Mining Strategy of Power Customer Service Work Order Based on Natural Language Processing Technology. In Proceedings of the 2019 International Conference on Intelligent Computing, Automation and Systems (ICICAS), Chongqing, China, 6–8 December 2019; pp. 335–338. [Google Scholar]
  49. Kim, J.; Lim, C. Customer complaints monitoring with customer review data analytics: An integrated method of sentiment and statistical process control analyses. Adv. Eng. Inform. 2021, 49, 101304. [Google Scholar] [CrossRef]
  50. Yoshikawa, T.; Wang, Y.; Kawai, Y. A Product Recommendation System Based on User Complaint Analysis Using Product Reviews. In Proceedings of the 2019 IEEE 8th Global Conference on Consumer Electronics (GCCE), Osaka, Japan, 15–18 October 2019; pp. 710–714. [Google Scholar]
  51. Chen, S.; Zhang, Y.; Song, B.; Du, X.; Guizani, M. An Intelligent Government Complaint Prediction Approach. Big Data Res. 2022, 30, 100336. [Google Scholar] [CrossRef]
  52. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems; Curran Associates, Inc.: Long Beach, CA, USA, 2017; Volume 30, pp. 5998–6008. [Google Scholar]
  53. Blandón-Andrade, J.C.; Castaño-Toro, A.; Morales-Ríos, A.; Tangarife, N. Complaint Process Management in an Electric Power Company. Comput. Sist. 2024, 28, 1143–1154. [Google Scholar] [CrossRef]
Figure 1. PRISMA-statement-based screening and filtering flow chart.
Table 1. SLR Method.

Phase | Activity
Research questions | Definition of research questions.
Search Process | Definition of search strings. Selection of databases.
Inclusion and exclusion criteria | Definition of inclusion and exclusion criteria.
Quality assessment | Definition of quality criteria.
Data collection | Extraction of relevant information from each document.
Data analysis | Answer the research questions.
Source: the authors.
Table 2. Classification scheme.

Information | Description
Title | Name of the article.
Authors | Name of author(s).
Year of publication | Year of publication available in the journal.
Method | Method for extracting information from natural language complaints.
Programming Languages | Programming languages or frameworks mentioned in the article.
Application domain | The domain for which the computational method was developed.
Number of documents processed | Number of documents processed by the computational method.
Evaluation Criteria | Criterion that measures the effectiveness of the method presented.
Source: the authors.
Table 3. Results of applying inclusion and exclusion criteria.

Database | Items in First Search | % of First Search | Items After Applying Criteria | % After Applying Criteria
IEEE Xplore Digital Library | 32 | 25.0% | 13 | 48.2%
Science Direct | 38 | 29.7% | 6 | 22.2%
Springer | 27 | 21.09% | 2 | 7.4%
Web of Science | 31 | 24.21% | 6 | 22.2%
Total | 128 | 100% | 27 | 100%
Source: the authors.
Table 4. Scoring of the quality criteria.

Source | QA1 | QA2 | QA3 | QA4 | Total Score
Qurat-ul-ain et al. [25] | Yes | No | Yes | Yes | 3
Usui et al. [26] | Yes | No | Yes | Yes | 3
Alamsyah et al. [27] | Yes | No | Yes | Yes | 3
Singh and Saha [28] | Yes | No | No | Yes | 2
Yance Nanlohy et al. [29] | Yes | No | No | Yes | 2
Hsu et al. [30] | Yes | No | No | Yes | 2
Anggraini et al. [31] | Yes | No | Yes | No | 2
Assaf and Srour [32] | Yes | No | Yes | No | 2
Fan et al. [33] | Yes | Yes | No | No | 2
Farouk et al. [34] | Yes | No | No | Yes | 2
HaCohen-Kerner et al. [35] | Yes | No | No | Yes | 2
Singh et al. [36] | Yes | No | No | Yes | 2
Tootooni et al. [37] | Yes | No | No | Yes | 2
Zhong et al. [38] | Yes | No | No | Yes | 2
Rao and Zhang [39] | Yes | No | No | Yes | 2
Ke and Chen [40] | Yes | No | No | No | 1
Fan et al. [41] | Yes | No | No | No | 1
Tong et al. [42] | Yes | No | No | No | 1
Luo et al. [43] | Yes | No | No | No | 1
Chen et al. [44] | Yes | No | No | No | 1
Shin et al. [45] | Yes | No | No | No | 1
Achcar and de Godoy [46] | Yes | No | No | No | 1
Wang et al. [47] | Yes | No | No | No | 1
Li et al. [48] | Yes | No | No | No | 1
Kim and Lim [49] | Yes | No | No | No | 1
Yoshikawa et al. [50] | No | No | No | No | 0
Chen et al. [51] | No | No | No | No | 0
Source: the authors.
Table 5. Synthesis of information.

Source | Year | Method | Category | Application Domain | Documents Processed | Complaint Language | Evaluation Criteria | Programming Language
Qurat-ul-ain et al. [25] | 2022 | NLP and machine learning | Hybrid | Citizen complaints portal | 10,000 | English | Accuracy: 86% | --
Usui et al. [26] | 2018 | Morphological analysis | NLP | Patient complaints in community pharmacy | 5000 | Japanese | Accuracy: 66%; Recall: 63% | --
Alamsyah et al. [27] | 2022 | Neural networks and TF-IDF | ML | Complaints, Bank Rakyat Indonesia | 1 million | Indonesian | Accuracy: 85% | --
Singh and Saha [28] | 2022 | Graph Attention Network (GAT) | ML | Complaints on web pages | -- | English | Accuracy: 72.82%; Macro-F1: 71% | --
Yance Nanlohy et al. [29] | 2020 | NLP and Multinomial Naive Bayes | Hybrid | Complaints about the government | -- | Indonesian | Accuracy: 91.38%; Recall: 90.73% | --
Hsu et al. [30] | 2020 | BERT | ML | Medical complaints | -- | Chinese | Accuracy: 72.87% | --
Anggraini et al. [31] | 2020 | Rule-based sentiment analysis and categorization | NLP | Complaints, drinking water company | 100 | Indonesian | -- | --
Assaf and Srour [32] | 2021 | Multilayer perceptron | ML | Complaints from building occupants | 6000 | English | -- | --
Fan et al. [33] | 2022 | Deep cross domain network | ML | Water pollution complaints | -- | Chinese | -- | Python
Farouk et al. [34] | 2021 | Latent Semantic Analysis approach | NLP | Complaints from Arab farmers | -- | Arabic | F-measure: 86.70% | --
HaCohen-Kerner et al. [35] | 2019 | Unigrams, machine learning, and filtering methods | Hybrid | Letters of complaint published on the Internet | -- | Hebrew | Accuracy: 84.5% | --
Singh et al. [36] | 2022 | Sentiment classification and feature detection with AffectiveSpace | ML-DL | Customer service complaints, business sector | -- | English | Accuracy: 83.63%; Macro-F1: 81.9% | --
Tootooni et al. [37] | 2019 | Chief Complaint Mapper (CCMapper) | NLP | Patient complaints in the emergency department | -- | English | Sensitivity: 82.3%; Specificity: 99.1%; F-score: 82.3% | --
Zhong et al. [38] | 2019 | Convolutional Neural Networks (CNN) | ML | Complaints about building quality | -- | Chinese | Accuracy: 72.6%; Recall: 47%; F1-score: 53.4% | --
Rao and Zhang [39] | 2020 | Bayes and K-Means | Hybrid | Online complaints about Chinese websites | -- | Chinese | Accuracy: 92.93%; Recall: 93.90% | --
Ke and Chen [40] | 2021 | N-grams and Naive Bayes algorithm | Hybrid | Gas service complaints | -- | Chinese | -- | --
Fan et al. [41] | 2021 | TextCNN | ML | Environmental complaints | -- | Chinese | -- | --
Tong et al. [42] | 2018 | Character-level CNN | ML | Complaints on web platforms | -- | Chinese, English | -- | --
Luo et al. [43] | 2018 | FastText, TextCNN, TextRNN, and RCNN | ML | Haikou 12345 hotline complaints | -- | Chinese | -- | --
Chen et al. [44] | 2018 | LDA (Latent Dirichlet Allocation) model | ML | Tourism complaints | -- | Chinese | -- | --
Shin et al. [45] | 2022 | ML: XGBoost and LightGBM | ML | Complaints of urban problems | -- | Korean | -- | --
Achcar and de Godoy [46] | 2021 | Multiple linear regression and Poisson regression models | Statistical | Quality of service, telecommunications company | -- | Portuguese | -- | --
Wang et al. [47] | 2022 | BERT + CRF | ML | Air pollution complaints | -- | Chinese | -- | --
Li et al. [48] | 2019 | NLP and KNN, SVM, CNN, RNN, LSTM | Hybrid | Energy company complaints | -- | Chinese | -- | --
Kim and Lim [49] | 2021 | Sentiment analysis and SPC analysis | Hybrid | Quality of service complaints | -- | English | -- | --
Yoshikawa et al. [50] | 2019 | Information extraction | NLP | E-commerce complaints | -- | Japanese | -- | --
Chen et al. [51] | 2022 | Text classification with ML (label correction) | ML | Government complaints | -- | Chinese | -- | --
Source: the authors.
