Classification and Prediction of Sustainable Quality of Experience of Telecommunication Service Users Using Machine Learning Models

Banjanin, Milorad K.; Stojčić, Mirko; Danilović, Dejan; Ćurguz, Zoran; Vasiljević, Milan; Puzić, Goran

doi:10.3390/su142417053


by Milorad K. Banjanin ¹, Mirko Stojčić ²,*, Dejan Danilović ², Zoran Ćurguz ², Milan Vasiljević ¹ and Goran Puzić ³

¹ Department of Computer Science and Systems, Faculty of Philosophy Pale, University of East Sarajevo, Alekse Šantića 1, 71420 Pale, Bosnia and Herzegovina

² Department of Information and Communication Systems in Traffic, Faculty of Transport and Traffic Engineering Doboj, University of East Sarajevo, Vojvode Mišića 52, 74000 Doboj, Bosnia and Herzegovina

³ Faculty of Economics and Engineering Management in Novi Sad, University Business Academy in Novi Sad, Cvećarska 2, 21102 Novi Sad, Serbia

* Author to whom correspondence should be addressed.

Sustainability 2022, 14(24), 17053; https://doi.org/10.3390/su142417053

Submission received: 14 November 2022 / Revised: 4 December 2022 / Accepted: 12 December 2022 / Published: 19 December 2022

(This article belongs to the Special Issue Industry 4.0: Quality Management and Technological Innovation)


Abstract:

The quality of experience (QoE) of the individual user of telecommunication services is one of the most important criteria for choosing the service package of mobile providers. To evaluate the sustainability of QoE, this paper uses indicators of user satisfaction or dissatisfaction with the quality of network services (QoS), especially with conversational, streaming, interactive and background classes of traffic in networks. The impact of selected combinations of paired legal–regulatory, technological–process, content-formatted and performative, contextual–relational and subjective–user influencing factors on QoE sustainability is investigated using a multiple linear regression model created in Minitab statistical software, a machine learning model based on boosted decision trees created in the MATLAB software package, and predictive models created using an automatic modeling method. The classification of influence factors and their matching for the analysis of interaction fields of users and services aim to mark QoE as sustainable by determining the accuracy of the weight of subjective ratings of user satisfaction indicators as transitional variables in the predictive model of QoE. The hypothetical setting is that the individual user’s curiosity, creativity, communication, personality, courage, confidence, charisma, competence, common sense and memory are adequate transition variables in a sustainable QoE model. Using the applied methodology with an original research approach, data were collected on the evaluations of research variables from anonymous users of mobile operators in the geo-space of Republika Srpska and B&H. By treating the data with mathematical and machine learning models, the QoE assessment was performed at the level of an individual user, and after that, several models were created for the prediction and classification of QoE_i. The results show that the relative error (RE) of the predictive models created over the collected dataset is not sufficiently low, so prediction performance was improved via data augmentation (DA). In this way, the relative prediction error was reduced to RE = 0.247. The DA method was also applied to create a classification model, which at best achieved an accuracy of 94.048%.

Keywords:

QoS; sustainable quality QoE_i; paired components; influence factors; transition variables in the model; user satisfaction and rigidity; DA; prediction; classification; machine learning models

1. Introduction

The increasing user demand for continuous, permanent, reliable and fast data transmission occurs along with increasing expectations related to an appropriate quality of experience (QoE) when using certain telecommunication services. This is especially important for users of cellular networks who use services, particularly streaming, in various states of mobility. Therefore, QoE represents one of the most important criteria when choosing a provider of telecommunication services today. On the other hand, not only is it also very important for telecommunication operators to understand the ways of user perception and estimation of QoE in order to achieve a competitive advantage, but knowledge of user QoE can also be a very important reference for optimizing network resources. In addition to subjective measurements of QoE based on user ratings, providers must perform objective measurements of quality of service (QoS) through certain parameters, and, by improving them, influence the increase in the level of QoE [1]. It is especially important to predict user needs for a certain level of QoE in the future, which is achieved via predictive and classification models [2] based on machine learning techniques, such as artificial neural networks [3]. Predictive QoE modeling can be descriptively defined as a broad term that refers to the process of developing a mathematical tool or model that generates an accurate forecast based on existing historical data. Therefore, it can be said that predictive modeling facilitates the achievement of sustainability, which according to [4] is viewed as a main normative principle for modern society that includes the long-term ethical relationship of current generations with the generations of the future.

QoE is a multidisciplinary concept represented in various fields, such as psychology, ergonomics, marketing, information technology, artificial intelligence, social sciences, etc. This wide representation is one of the main reasons for the lack of a single definition of this term, as well as a list of factors that influence its level [5]. However, in the next section, several definitions that are most often used will be presented. According to [6], QoE can be defined as “A measure of user performance based on both objective and subjective psychological measures of using an ICT service or product”. The Standardization Sector of the International Telecommunication Union (ITU-T) defines QoE as “The overall acceptability of an application or service, as perceived subjectively by the end-user” [7,8,9]. In the technical specification of the European Telecommunications Standards Institute (ETSI) [5] and ITU-T recommendation [10], QoE is defined as “the degree of delight or annoyance of the user of an application or service”, which is also one of the current definitions of this term. In addition to regulatory bodies, numerous researchers formulated their definitions and on the basis of them created certain models for assessing the quality of experience. According to [11], QoE represents a subjective user assessment of service quality, and the most frequently used qualitative ratings are: “Good”, “Poor”, “Fair”, “Bad”. It is the state of the field, with a multitude of definitions and ways to measure QoE, that opened the door for future research into this concept. Most published models present QoE via mean opinion score (MOS) for a group of users. In this paper, the quality of user experience at the level of an individual user QoE_i is observed as a dependent/output variable, where i ranges within 1...n users. This means that the arithmetic mean of QoE_i values corresponds to the total QoE for a group of users expressed through MOS.

Today, special attention is paid to modeling QoE in user interactions with video streaming services, which represent the most significant part of total mobile internet traffic [12]. This progress in the field of video streaming resulted in the rise of both video-on-demand services (YouTube, Netflix, Amazon Video, Hulu, etc.) and live (Twitch.tv, YouTube Gaming) streaming [13]. In order for providers to perform constant monitoring and ensure a satisfactory level of QoE for the mentioned services in the future, special emphasis is now placed on models based on machine learning techniques [14].

The main goal of this research is to model the overall quality of user experience in interactions with telecommunication services that belong to the conversational, streaming, interactive and background classes of network traffic. For the case study, the three largest mobile operators providing these services on the territory of the Republic of Srpska and Bosnia and Herzegovina were chosen: M:tel, BH Telecom and HT Eronet [15].

The hypothetical setting is that the individual user’s curiosity, creativity, communication, personality, courage, confidence, charisma, competence, common sense and memory are adequate transition variables in a sustainable QoE model. The second assumption is that the use of transitional variables of the sustainable quality of individual user QoE_i can be a very important parameter for planning and optimizing network resources and realizing the competitive advantage of individual service providers in the future.

The following contributions of this paper can be clearly identified:

This paper uses a subjective approach to assessing the level of user experience, which is based on a unique questionnaire with 11 questions.
Survey questions were formulated based on the selected original set of five factors affecting QoE: (1) legal–regulatory; (2) technological–process; (3) content-formatted and performative; (4) contextual–relational; (5) subjective–user.
The subjects of user evaluation are all telecommunications services of the three largest mobile operators operating on the territory of the Republic of Srpska and Bosnia and Herzegovina. It is important to point out that there is no previously published research on this topic that is related to the mentioned geographical area, as well as the observed set of services, which also represents the great practical importance of this paper.
This paper presents a unique methodology based on a combination of mathematical, statistical and machine learning methods in order to assess, classify and predict the quality of user experience at the level of an individual user, which is why a large number of models were created.
The possibilities of synthetic data augmentation using the data augmentation (DA) method were demonstrated, as well as the way in which this method affects the performance improvements of machine learning models.

The paper is structurally divided into six sections. After the introductory section, Section 2 provides a review of relevant published research. Section 3 contains an overview of research materials and methods. The analysis of influencing factors on the quality of user experience is given in Section 4. The main focus of the research is in Section 5, where the results of QoE_i modeling are presented with discussion. Section 6 refers to the conclusion, after which a list of references is given.

2. Review of Relevant Published Research

Voice transmission over Internet protocol (VoIP), a real-time service that is important to a very large number of users, has been extensively researched with different goals. In [16], QoE modeling for VoIP service is presented by improving the simplified E-model with a subjective MOS model, and the advantages of the final model in terms of accuracy and reliability are explained. Developing wireless network technologies have made high-speed data transmission possible. Accordingly, in [17], the authors present a holistic modeling of QoE estimation for live video streaming services in 4G networks. The main goal of that research is the assessment and prediction of subjective QoE metrics, taking into account various variables related to QoS, bit stream and basic video quality metrics.

The paper [18] presents a QoE model based on machine learning principles aimed at protecting the privacy data of users who participated in “sensitive” case studies. The basic idea in this orientation is that there is no need to share user datasets amongst researchers. Instead, each researcher collaboratively participates in partial model training. The paper [19] presents an approach that solves this problem using intelligent sampling and the results of experiments with similar outcomes. A generic QoE model for evaluating the quality of video calls within a Wi-Fi network is presented in the paper [20]. The authors proposed metrics that are independent of content, application and user equipment or device. Specifically, they selected the following QoE parameters: perceptual bitrate—PBR, freeze ratio, length and number of video freezes in real time. When it comes to the World Opera application for real-time video content transfer, the QoE quality assessment is modeled in the paper [21]. In a very complex solution of the model, the video streaming of several musical artists, spatially dislocated, are combined into one video. The assessment of the level of QoE in this case is determined on the basis of an indicator called perceived reliability, because the quality directly depends on possible failures of some of the end-to-end software, hardware or network components, which are included in the implementation of this telecommunication service. In addition to QoE models, which are mainly related to streaming traffic today, other services should also be highlighted. Thus, in the paper [22], the authors propose a Web QoE model for evaluating the level of user experience when using interactive services using a Web browser. This model is based on psychological characteristics of the user, which are connected with their memory, that is, the memory of the experience gained in the previous period. 
Similar research related to the modeling of the quality of experience for Web applications (QoEWA) is presented in the paper [23]. This approach integrates key performance indicators (KPI) and key quality indicators (KQI) [24].

The four categories of factors affecting QoE in the paper [25] are: human-related influencing factors; system-related influencing factors; context-related influencing factors; service-/application-content-related influencing factors. The focus of the observation is audio and video streaming services available through wireless and mobile technologies (4G, 5G).

Regression algorithms of machine learning (artificial neural networks) were applied in research [26] to map QoS parameters onto QoE level for VoIP services. In the paper [27], the authors model the dependencies of the subjectively assessed level of QoE and technical network parameters obtained via measurements when using web browsing services. The recognizable aim of the paper is to connect subjective QoE with measurable parameters of service quality, which can be monitored in an operational environment, enabling an objective and real assessment of QoE. Today, quality is becoming an increasingly prevalent concept in the field of software engineering. During software development, great attention is paid to the prediction and modeling of its quality with the help of machine learning techniques such as artificial neural networks [2,3].

In addition to the above, reviews of relevant research related to QoE modeling are given in papers [11,13,28,29,30,31]. Table 1 provides a comparative overview of QoE modeling in this paper with the references of eleven papers from a reference set of previous research. The main improvements presented in this paper compared to previously published papers are: (a) assessment and modeling of the quality of user experience at the level of individual user QoE_i, not at the level of a group of users; (b) original questionnaire for subjective user assessment of QoE_i level with a unique set of questions representing independent/input, transition and dependent/output variables; (c) services that are subject to evaluation: all services and classes of telecommunication traffic, not just individual services or applications; (d) operators that are subject to evaluation: all mobile operators on the territory of the Republic of Srpska and B&H; (e) estimation of QoE_i based on transition variables that represent indicators of satisfaction and indicators of dissatisfaction of service users; (f) original classification of combinations of paired factors affecting QoE_i into five groups; (g) creation of an interactive model of factors that affect QoE_i; (h) creation of multiple models for QoE_i prediction and classification based on machine learning techniques; (i) the IBM SPSS Modeler software platform was used for modeling; (j) the DA method was applied. The mark in parentheses in front of individual improvements (a),...,(j) is used in the sixth column of Table 1 under the title “comparative improvements presented in the paper”.

In addition to the code of each of the 11 references in the first column and the titles of papers in the second column, the third column of the table shows the services/applications observed, the fourth shows methods and models used, and the fifth column shows the factors affecting QoE as a research variable. In the sixth column, the comparative advantages of the solution in this paper are interpreted.

3. Materials and Research Methods

The research process can be presented algorithmically, through several successive steps, as follows:

In the first step, we performed the analysis of various factors that affect the quality of user experience and created an interactive QoE_i model;
In the next step, the research process was implemented in accordance with a subjective approach to the assessment of the level of user experience and the survey method, on the basis of which a research instrument was created, a QoE questionnaire. A correlation was established between the factors influencing QoE_i in the formulated questions and the independent, transition and dependent research variables;
The third step was the process of online surveying of users of network services and applications about the level of certain indicators of the quality of subjective–user experience in interactions with communication services performed by professional companies—providers of telecommunication services in certain locations;
Data obtained by surveying users was prepared for processing in the fourth step;
The statistical analysis of the research sample was performed in the fifth step, where basic statistical indicators related to the responses to individual questions and the structure of respondents were given;
In the sixth step, a mathematical model was created to assess the subjective-user QoE_i based on the responses to the questions from the QoE questionnaire as input variables;
In the seventh step, a QoE_i probability model was created;
Correlation analysis of research variables was performed in the eighth step;
The last step represents the special focus of the research and refers to the results of QoE_i modeling. Within this step, the results of the QoE_i prediction and classification model based on machine learning techniques are particularly important.

A more detailed description of each of the aforementioned steps with applied methods is given in the next part of this section.

3.1. Analysis of Influencing Factors and Creation of an Interactive QoE_i Model

Factors affecting QoE can be defined according to [5] as: “any characteristic of a user, system, service, application or context whose actual state or setting may have influence on the Quality of Experience of the user”. Within the framework of various studies, authors singled out different factors of influence on the overall level of QoE of a service. For example, in the paper [25], these factors are classified into four categories: human-related influencing factors, system-related influencing factors, context-related influencing factors and content-related influencing factors.

In this paper, influencing factors were categorized as five paired components, as follows:

Subjective–user influencing factors: demographic and socio-economic background, physical and mental constitution or emotional state of the user.
Technological–process influencing factors: transmission, encoding, storage, display and reproduction/media display, etc.
Contextual–relational influencing factors: any property of the situation that describes the user environment, in terms of physical (location and space, activities, state-mobility and behavior), time, social (people who are present or involved in the experience), economic (costs, type of subscription or type of brand of service/system), and technical characteristics.
Content-formatted and performative influencing factors, which in the case of videos are related to traffic class or streaming quality, encoding speed, resolution, duration, movement patterns, type and content structure of videos, etc.
Legal–regulatory influencing factors, which act in a multidimensional space on the intuitive and systemic quality of the user experience. Building on technical specification [5], this paper considers an expanded set of five multidimensional areas in which QoE influencing factors for a specific service/application are evident: the application robustness area, the operator/provider network resource area, the network traffic context area, the subjective user area and the legal–regulatory area. This categorization of space is aligned with the paired factors of influence on the overall level of QoE, i.e., legal–regulatory, technological–process, contextual–relational, content-formatted and performative, and subjective–user.

Based on the analyzed and observed factors affecting QoE_i, in this step of the research modeling of their interactions was performed. Their interdependencies and correlations, as well as correlations with research variables, are shown graphically.

3.2. QoE Questionnaire and Selection of Research Variables

In research considering the assessment and measurement of QoE, two basic approaches are most often applied:

Service level measurements represent subjective measurements. They are most often carried out by agents accessing telecommunication services and responding to the created research questions at the end.
Network level measurements (objective approach) allow the estimation of QoE values depending on the measured values of network QoS parameters (e.g., bit error ratio (BER), packet loss ratio (PLR), delay, etc.) [24,32].

Subjective assessment is considered the most accurate approach to measuring QoE, where a discrete MOS scale of five levels is used to quantify factors affecting QoE: 1: bad, 2: poor, 3: fair, 4: good, 5: excellent [25,33]. In this research, the evaluation of user QoE_i is carried out with a subjective approach and a survey method, based on which a research instrument was created: a QoE questionnaire with 11 questions [15], through which five reconstructed groups of paired factors affecting QoE_i were presented.

In order to model QoE_i, a selection of research variables was made on the basis of the questions in the questionnaire. Ten independent/input variables (X₁...X₁₀) were identified, which were defined by questions 1–9 of the questionnaire: age (X₁); gender (X₂); legal–regulatory affiliation of the organization/firm (X₃); provider(s) with whom the user has experience (X₄); qualitative level of experience—perceived level of QoS (X₅); level of satisfaction with the price of the provider’s services (X₆); evaluation of the user’s legal security in interactions with the service provider in the area of service provision on the basis of contracts and payment of bills by cost calculation (X₇); user experience expressed by characteristics for four traffic classes (X₈); user experience expressed by levels for four traffic classes (X₉); length of user experience of the provider’s services (X₁₀).

Figure 1 graphically shows how the five observed groups of paired factors affecting QoE_i are represented by 11 questions in the questionnaire, based on which, in the next step, three classes of research variables are defined. It is noticeable that the input independent variables (X₁....X₁₀), transition variables (D₁...D₁₀; C₁...C₅) and an output dependent variable QoE_i were identified.

In addition to the identified independent variables, in Figure 1 it can be seen that question 10 in the questionnaire defines 10 transition variables as indicators of satisfaction (Delight-D) up to the level of delight, and, in question 11, five transition variables as indicators of dissatisfaction (Complaint-C) up to the level of anxiety for the specific user service. Satisfaction indicators represent the effects of online services on the user: curiosity (D₁), creativity (D₂), communication (D₃), personality (D₄), courage (D₅), confidence (D₆), charisma (D₇), competence (D₈), common sense (D₉), and memory (D₁₀). Dissatisfaction indicators are defined by connection of online services with the following forms and measures of rigidity that services cause in users: interpersonal (C₁), behavioral (C₂), structural—longing for structure and coping with the lack of structure (C₃), prospective anxiety (C₄), inhibitory rigidity (C₅). The two groups of indicators represent subjective user factors, which, based on a mathematical model, enable a direct assessment of the quality of user experience QoE_i as a dependent variable in the evaluation model.

3.3. Survey Process

The survey process was conducted online, using the Google Forms web application, which enables survey administration and is included as part of the free web-based package Google Docs Editors. The questionnaire was open for filling out in the period from 21 April 2021 until 26 April 2021.

The target group of respondents were mainly employees of companies, state or local administrative bodies, educational/scientific institutions, agencies and associations (chambers, agencies, citizens’ unions, entrepreneurs, banks, business and diplomatic missions, etc.), and other institutional forms in the territory of Bosnia and Herzegovina. The focus of the research is subjective user evaluations of respondents’ own experience in interactions with telecommunication services offered by telecommunication service providers M:tel [24], HT Eronet, BH Telecom and others operating in the territory of Bosnia and Herzegovina.

3.4. Preprocessing of Data Collected

The data collected by filling out the questionnaire were automatically stored in an Excel document. A total of 167 responses were collected, and the preprocessing of the data collected included filtering, i.e., removal of missing values and coding of individual responses into numerical values.
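The filtering and coding step described above can be sketched as follows. This is a minimal illustration only: the field names, the sample answers and the `RATING_CODES` mapping are hypothetical and are not taken from the questionnaire or from the authors' Excel workflow.

```python
from typing import Optional

# Hypothetical coding of five-level verbal answers into MOS-style numbers.
RATING_CODES = {"bad": 1, "poor": 2, "fair": 3, "good": 4, "excellent": 5}


def preprocess(responses: list[dict[str, Optional[str]]]) -> list[dict[str, int]]:
    """Drop responses with missing answers and code the rest numerically."""
    cleaned = []
    for response in responses:
        # Filtering: skip any response that has a missing or empty answer.
        if any(a is None or a.strip() == "" for a in response.values()):
            continue
        # Coding: map each verbal answer to its numeric level.
        cleaned.append({q: RATING_CODES[a.strip().lower()] for q, a in response.items()})
    return cleaned


# Illustrative input: three made-up responses, one of them incomplete.
raw = [
    {"X5": "Good", "X6": "Fair"},
    {"X5": "Excellent", "X6": None},  # removed: missing value
    {"X5": "poor", "X6": "bad"},
]
coded = preprocess(raw)
```

Dropping whole responses rather than imputing missing answers is one possible reading of "removal of missing values"; the source does not state which variant was used.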

3.5. Statistical Analysis of Data Collected

Based on a formatted sample of 157 completed questionnaires, a brief statistical overview of the responses to each question was determined. One part of the questions was processed according to the percentage share of the responses offered in the sample (e.g., age, gender, length of experience), while the second part was processed according to average value of the user’s rating, which is measured on the MOS continuous scale.

3.6. Assessment of User QoE_i with a Mathematical Model

The subjective approach to assessing the specific value of QoE_i, as the first step, implies the identification of KQI that directly affect the quality of user experience [11]. The following expression represents the generic relation between the QoE value and identified indicators:

QoE = f(I_{E}^{1}, \dots, I_{E}^{K})    (1)

where I_{E}^{j} denotes the jth indicator (j = 1…K). In practice, a linear relationship or function f is often assumed between QoE and I_{E}^{j}, which is also the case in this research. In order to determine the specific value of QoE_i, it is necessary to assign certain weighting factors to the identified indicators. Thus, most often, QoE can be expressed by a linear combination of weight indicators, which is shown by the following expression [11]:

QoE = \sum_{j=1}^{K} w_{j} \cdot I_{E}^{j}    (2)

where w_j represents the weighting factor assigned to the indicator I_{E}^{j}. In this research, both satisfaction indicators and dissatisfaction indicators were taken into account, and memorized in the subjective user experience for which each QoE_i was determined when creating a linear mathematical model.
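As a minimal sketch of the linear combination in expression (2), the weighted sum can be computed as shown here. The ratings and the equal weights are hypothetical values chosen for illustration; the source does not list the weights actually assigned, nor how satisfaction and dissatisfaction indicators are combined.

```python
def qoe_linear(indicators: list[float], weights: list[float]) -> float:
    """Expression (2): weighted sum of indicator ratings."""
    if len(indicators) != len(weights):
        raise ValueError("one weight per indicator is required")
    return sum(w * x for w, x in zip(weights, indicators))


# Hypothetical example: ten satisfaction ratings D1..D10 on a 1-5 scale,
# with equal weights summing to 1 (an assumption, not the paper's weights).
ratings = [4, 5, 3, 4, 4, 5, 3, 4, 4, 4]
equal_weights = [1 / len(ratings)] * len(ratings)
qoe_i = qoe_linear(ratings, equal_weights)
```

With equal weights that sum to one, the result reduces to the arithmetic mean of the ratings, which keeps QoE_i on the same scale as the individual indicators.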

3.7. Creating a QoE_i Probability Model

The main goal of this research stage is to find the density function of the probability distribution f(x), which allows the calculation of the probability that QoE_i will take a value from a certain interval [a, b]. Using the SPC tool for Excel and the distribution fitting procedure, data fitting, i.e., QoE_i values with several common distributions, was tested. As a result of this procedure, the Anderson–Darling (AD) statistic test, as well as corresponding p-values, were given for data fitting with a certain type of distribution, and the distributions were ranked according to the Akaike information criterion (AIC). AIC, as one of the most common tools in statistical modeling, is a means of selecting the best model in a set of models. Therefore, this criterion is an indicator of the degree of the statistical model fitting with real data (goodness of fit—GOF). Its mathematical formulation is an extension of the maximum likelihood principle [34]:

AIC = -2 \ln(\hat{L}) + 2K    (3)

where \hat{L} is the maximum value of the likelihood function, and K is the number of model parameters [35]. AIC is an indicator of the relative amount of information lost by creating a given model over the actual data, with AIC making a compromise between GOF and model simplicity. This means that the less information is lost, the lower its value and the higher the quality of the model.
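To illustrate how expression (3) ranks candidate distributions, the sketch below fits a normal and an exponential distribution by maximum likelihood and compares their AIC values. The sample values are made up, and the two candidates are an assumption made for brevity; the source reports using the SPC tool for Excel with several common distributions.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Expression (3): AIC = -2 ln(L_hat) + 2K."""
    return -2.0 * log_likelihood + 2.0 * n_params


def normal_loglik(data: list[float]) -> float:
    """Maximised log-likelihood of a normal distribution (MLE mean, variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # MLE divides by n, not n - 1
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def exponential_loglik(data: list[float]) -> float:
    """Maximised log-likelihood of an exponential distribution (rate = 1/mean)."""
    n = len(data)
    rate = n / sum(data)
    return n * math.log(rate) - rate * sum(data)


# Hypothetical QoE_i values, for illustration only.
sample = [3.2, 3.8, 4.1, 2.9, 3.5, 4.4, 3.7, 3.0]
candidates = {
    "normal": aic(normal_loglik(sample), n_params=2),
    "exponential": aic(exponential_loglik(sample), n_params=1),
}
# Lower AIC means less information lost, so the best candidate comes first.
ranking = sorted(candidates.items(), key=lambda item: item[1])
```

The penalty term 2K is what lets a simpler distribution win when an extra parameter improves the likelihood only slightly.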

3.8. Correlation Analysis of Research Variables

Determining the strength and direction of the interrelationships of all observed research variables was carried out via correlation analysis. Spearman’s correlation coefficients show non-parametric measures of the correlation rank between independent (X₁–X₁₀), transitional (D, C) and a dependent variable (QoE_i). One of the main reasons for using Spearman’s correlation coefficient in this research is that the values of most variables are measured on an ordinal scale, where categories are ranked from 1 to 5. As a means of visualizing correlation coefficients, a correlogram or correlation matrix is used [36,37]. A correlogram enables the analysis of the relationship between each pair of numerical variables through a scatterplot, or some other symbol that represents the correlation (in this case, color).
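Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing the average of their positions. The sketch below shows this under the assumption of plain Python lists as input; it is an illustration of the statistic, not the software the authors used.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation; undefined when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sx * sy)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation between two equally long samples."""
    return pearson(average_ranks(x), average_ranks(y))
```

Average ranks matter here because answers on a 1 to 5 ordinal scale produce many ties, and the shortcut formula based on squared rank differences is exact only when there are none.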

3.9. Creating a Model for QoE_i Prediction and Classification

Predictive and classification models are two types of models from machine learning and data mining that can be used either to forecast future trends in ever-growing volumes of data or to determine which class of telecommunication network services an observation belongs to [38]. By definition, a prediction is a forecast from the present to the future based on data obtained in the past [39]. Classification can be descriptively defined as the process of identifying the category to which a new observation belongs, based on a training dataset containing observations whose category membership is known. Prediction models therefore output continuous values, while classification models output categorical (discrete) class labels. In this paper, both types of models were created, in accordance with the type of target output variable. User QoE_i ratings, calculated on the basis of a mathematical model, are continuous in nature, and prediction models are created for their forecasting. Mapping these values into a discrete, ordinal scale with numeric levels 0, 1, 2, 3, 4 and 5 allows for the creation of classification models based on machine learning techniques.
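One way to map continuous QoE_i values onto the ordinal levels 0 to 5 is to round to the nearest level and clamp to the ends of the scale, as sketched here. The rounding rule is an assumption made for illustration; the source does not state which mapping was applied.

```python
import math


def to_level(qoe_value: float, lowest: int = 0, highest: int = 5) -> int:
    """Map a continuous QoE_i value to the nearest ordinal level in [lowest, highest]."""
    # floor(x + 0.5) rounds halves upward, avoiding Python's banker's rounding.
    nearest = math.floor(qoe_value + 0.5)
    # Clamp values that fall outside the scale to its end points.
    return max(lowest, min(highest, nearest))


# Hypothetical continuous ratings and their class labels.
levels = [to_level(v) for v in [3.49, 3.5, -0.2, 5.7, 2.5]]
```

The resulting integers can serve as class labels for a classifier, while the original continuous values remain the target for the prediction models.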

Models for QoE_i Prediction

As part of the research, predictive QoE_i models were created based on multiple linear regression, boosted decision trees and machine learning models generated in the IBM SPSS Modeler software environment using an automatic modeling method [40]. The data obtained by filling out the questionnaire, along with the estimated QoE_i values, were structured in an Excel file into input/output vectors and divided into two parts. Ninety percent of the vectors were used to train the models and the remaining ten percent were used to test prediction accuracy. Ten independent variables (X₁…X₁₀) were used as inputs to all created models. All models were created according to the supervised learning paradigm [39].
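The 90%:10% division of the input/output vectors can be sketched as below. Shuffling before the split and the fixed random seed are assumptions; the text does not state how the vectors were assigned to the two parts, and the function name is hypothetical.

```python
import random


def train_test_split(vectors, train_fraction=0.9, seed=0):
    """Shuffle the vectors (assumed) and cut them into training and testing parts."""
    shuffled = list(vectors)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integer identifiers stand in for the 157 input/output vectors.
train, test = train_test_split(range(157))
```

With 157 vectors, this rule gives 141 training and 16 testing vectors.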

A multiple linear regression model was created in Minitab statistical software. The coefficients of the model were determined via the method of least squares.

A machine learning model based on decision trees (boosted decision tree) was created in the MATLAB software package.

Predictive models created by using an automatic modeling method: In the IBM SPSS Modeler operating environment, structured data for training and testing were brought to the input of the Auto Numeric node in order to automatically create different predictive models simultaneously. In this way, in just one pass through the modeling process, the Auto Numeric node examines different machine learning techniques with default option settings, and based on that, offers the most accurate potential solutions and ranks them according to correlation and relative prediction error [24]. Supported techniques include neural networks, classification and regression trees (C&R tree), CHAID (Chi-square automatic interaction detection), linear regression, generalized linear regression and SVM. The relative error (RE) of the prediction was used as a criterion for quality assessment and selection of the most accurate model [24]. RE represents the ratio of the sum of squared errors of the dependent variable QoE_i and the sum of squared errors of the null model, or, given mathematically:

RE = \frac{\sum_{i=1}^{n} (QoE_{i} - QoE_{ip})^{2}}{\sum_{i=1}^{n} (QoE_{i} - QoE_{im})^{2}}

(4)

where QoE_i is the estimated value of user experience of the ith user, QoE_ip is the corresponding value predicted by the model, and QoE_im is the arithmetic mean of the variable QoE_i.
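A minimal sketch of expression (4) is given below; the function name and the sample values are hypothetical. The sketch also shows the property used later in the discussion: a null model that always returns the arithmetic mean has RE equal to 1.

```python
def relative_error(actual, predicted):
    """Ratio of the model's sum of squared errors to that of the null (mean) model."""
    mean_actual = sum(actual) / len(actual)
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_null = sum((a - mean_actual) ** 2 for a in actual)
    return sse_model / sse_null


# Hypothetical estimated QoE_i values.
actual = [0.2, 0.5, 0.9, 1.4]
null_prediction = [sum(actual) / len(actual)] * len(actual)
re_null = relative_error(actual, null_prediction)   # null model
re_perfect = relative_error(actual, actual)         # perfect prediction
```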

In order to increase the accuracy of the model prediction, the data augmentation (DA) method was used to artificially expand the available dataset [41,42]. Synthetic data were generated using an autoencoder artificial neural network implemented in the MATLAB programming environment, which was trained according to the unsupervised learning paradigm.

Additional improvement of the accuracy of predictive models in the IBM SPSS Modeler software was carried out using the boosting method. This method is based on the creation of several models (components) in sequence, i.e., creating an ensemble of models that, as a result, provide a joint prediction of the dependent variable.

QoE_i classification models created by using an automatic modeling method.

QoE_i classification models, like the predictive models, were created in the IBM SPSS Modeler software environment, in this case using the Auto Classifier option. Supported techniques included neural networks; C&R trees; quick, unbiased, efficient statistical trees (QUEST); CHAID; C5.0; logistic regression; decision lists; Bayesian networks; discriminant analysis; nearest neighbors and SVM. The DA method was used to improve classification performance by artificially expanding the training and testing datasets. A set of 10 independent variables was used as an input to the classification models, and the dataset was divided in the ratio 90%:10%.

4. Factors Affecting the Quality of User Experience

4.1. Legal–Regulatory Factors

Legal–regulatory factors include the application of standards and legal norms, and special attention in this paper is given to norms defined by international and European standardization bodies in the field of telecommunications, which are related to the quality of user experience. These are the 3rd Generation Partnership Project (3GPP), the ETSI and the ITU-T. In the focus of the research are the following documents of the aforementioned organizations, which can be classified into three categories:

Technical specifications (TS): ETSI TS 103 294 V1.1.1 (2014-12) [5], ETSI TS 102 250-2 V2.4.1 (2015-05) [43], ETSI TS 102 250-1 V2.2.1 (2011-04) [44].
Technical reports (TR): ETSI TR 103 488 V1.1.1 (2019-01) [45], 3GPP TR 26.944 V10.0.0 (2011-03) [46], ETSI TR 102 643 V1.0.1 (2009-12) [6].
Recommendations (R): ITU-T Recommendation P.10/G.100 (11/2017) [10], ITU-T Recommendation G.1000 [47], ITU-T Recommendation P.10/G.100, Amendment 2 [8].

Within the documents, particularly important are normative or prescriptive definitions of terms related to QoE and QoS [24], which affect the correct establishment of relations between them. Considering that normativity represents theory, order and truth, it means that this type of definition prescribes “how something should look in reality” in order to achieve a satisfactory level of QoE. Apart from the definitions, other very important contents of the mentioned norms are the procedures for measuring QoE and the glossaries of terms.

Subscribers sign contracts with the operator, which regulate interrelations regarding the provision/use of a service. According to the General Terms and Conditions for the provision of telecommunication services of the Mtel operator, a subscriber is “any legal or physical person who has concluded a contract with Mtel for using telecommunications services that are the subject of these General Terms and Conditions” [48]. Among other things, contracts can also regulate the minimum levels of QoS that the operator should deliver to users, as well as the maintenance of network components and services. Terminal user equipment/devices are covered by manufacturers’ guarantees. Unlike the General Conditions that apply to all services, the conditions for the provision of individual telecommunication services, such as prices, are regulated by Special Conditions.

4.2. Technological–Process Factors of Network Services/Applications

Technological–process factors affecting the quality of user experience (QoE) are directly related to QoS parameters and their objective measurements. According to [43,45], QoS parameters enable the quantification of the quality of user experience. They are often referred to as key performance indicators (KPIs) and are under the control of the service provider [49]. Based on a review of published relevant research, the authors of [31] classified factors related to QoS parameters into a group of system factors, such as transmission speed, frame rate, packet loss, resolution, codec, jitter, quantization level, delay, packet error rate, spatial degradation, compression level, encoding, quality degradation, throughput, etc. A similar list of system parameters is given in [25]. The basic criterion for classifying traffic into conversational, streaming, interactive and background classes is sensitivity to delay [24]. According to [46], the term System Quality of Service (SQoS) refers to network indicators (attributes) that are defined from the perspective of the service provider, not the user, and that directly affect QoE. The technical report [46] defines another term arising from QoS, namely end-to-end service quality of service (ESQoS), which describes the overall end-to-end level of QoS. The parameters through which SQoS and ESQoS can be evaluated are defined by the service provider.

4.3. Content-Formatted and Performative Factors

Users make their QoS requests from a portfolio of telecommunication traffic classes interacting with one or more network devices. Providers provide the most common contents of fixed and mobile telephony services, data transfer and Internet access, as well as wholesale of voice and IP services and telecommunication capacities.

The order of discourses possessed by formatted service content and application algorithms are of particular importance for improving the quality of services and user experience. Discourses are special ways of understanding performative contents and their meanings, technological tools and algorithmic procedures, behavioral patterns and ways of acting of service providers and application creators. Within the discourse there are performatively available knowledges with satisfactory or unsatisfactory contents of services and applications that are integral parts of corporative “culture in action”. The corporate culture of the service provider/operator does not influence the action by offering the ultimate values towards which the action of individuals strives, but by shaping the repertoire of contents that in certain life situations help service users to solve certain problems. It forms a practice for users to routinely treat subjects in communication dynamics in different contexts and spaces mentioned earlier (Section 3.1), in handling objects with which they interact, in describing objects of interest and understanding the world in reality.

If a parallel between the meaning of the discourse shaped by the provider and the practice that is developed in the interaction spaces is established, then discourse is treated as being and practice is treated as existence from which discourses arise. Certainly, the practice changes dynamically because it simultaneously includes the thought and activities of the user in creating an experience under the influence of interactive factors on the quality of user experience.

A very present and important concept that regulates relations between providers/operators and society of service users is corporate social responsibility (CSR). This concept is integrated into the development strategies of telecommunication operators in our country and in the world. Social responsibility in modern society requires the company to be a corporate citizen that responds to the needs of customers (users), i.e., to deliver good products and services at a reasonable price and thus contribute to the values of society as a whole. Therefore, the behavior of a telecommunication company, as well as individuals in different roles in the phrase “culture in action”, should be evaluated according to the code of ethics, and that is why companies are responsible for their behavior ethically and legally [50]. According to the results presented in the paper [50], socially responsible business has a great impact on user satisfaction, retaining and building loyal customers, bringing competitive advantage, increasing market share, increasing sales revenue and increasing sales volume. Above all, content–formative and performative–interpretive factors determine the status and reputation of the operator and its position on the market as a clear indicator of the quality of the service it provides, and therefore how it contributes to the quality of user experience. Operators with a larger number of current users have a greater chance of attracting more potential users in the future. In this way, the development, expansion of the capacity and range of company’s services in the field of telecommunications is a predictable consequence.

4.4. Contextual–Relational Factors

From a sociological aspect, the users of telecommunication services are people of the new age who are trained to work with knowledge and new orientations in responsive and strategic contexts of providers/operators representing a company [51]. The need to research the experience of people is gaining importance in the interactions of users of company’s services with intelligent products, operating in an intelligent environment and communicating in intelligent contexts. The strongest influence on the quality of user experience is recognizable in the person’s orientations.

According to [52], context is defined as a “communication microenvironment that allows the meaning of information or message to be evaluated as good or bad, pleasant or unpleasant, true or false, useful or useless, permanent or temporary, individual or common, general or special”. Numerous publications take into account the influence of contextual factors on the perceived level of QoE. According to [53], context factors “include any situational property to describe the user’s environment in terms of physical, temporal, social, economic, task and technical characteristics”.

According to [54], the social context is defined by the interpersonal relationships that exist during the experience. Therefore, it is important to consider whether the user of an application/system is alone or with other people, and even how many different people are involved in the experience. Additionally, it is necessary to take into account the cultural, educational, professional and other levels of users. In [55], user activities on social networks, such as contact lists, links and interactions, as well as types of shared information, are observed as socio-contextual factors. Therefore, the social context becomes very important at the level of content recommendation, which, based on the collected information about the context, guarantees a better user experience. Collaborative recommendation is a concept in which a user is recommended items consumed by other users with similar preferences [54,56].

4.5. Subjective–User Factors

According to [53], user factors represent “any variant or invariant property or characteristic of a human user. The characteristic can describe the demographic and socio-economic background, the physical and mental constitution or the user’s emotional state”. This means that they can include: bio-physical, emotional and mental characteristics; age [57]; gender; emotions; motivations; fears; levels of attention; education; expectations; needs; knowledge; previous experiences; etc. Based on the review of the literature presented in [31], the following user factors that were analyzed in studies were identified: gender [58], age, culture, personality, attitude, mood, emotion and interest.

Subjective–user factors are very complex and strongly interrelated. They influence the perceptual process at two important levels [59]. At the level of early sensory or so-called low-level processing, the main role is played by properties related to the user’s physical (health), emotional and mental constitution. At the level of higher-level cognitive processing, which includes interpretation and reasoning, other subjective–user factors are important [53].

4.6. An Interaction Model of Paired Factors Affecting QoE_i

Figure 2 graphically shows an interaction model between individual factors that affect the QoE_i value. As can be concluded from the figure, there are two-way relations of influence, according to the “each with each” principle, between the paired factors and their corresponding input variables, including transitory subjective–user variables of satisfaction and/or dissatisfaction [15]. The value of QoE_i as a dependent variable is obtained via a mathematical model, directly based on users’ ratings of satisfaction and dissatisfaction indicators, while independent variables serve as inputs to QoE_i prediction and classification models.

5. Results of QoE_i Modeling and Discussion

At the beginning of this section, an overview of the preliminary statistics of the research sample is given based on the data collected through online surveys. Then the mathematical model is presented, which was used to estimate QoE_i values, the probability model of the dependent variable and the correlation analysis of the research variables. Based on the data collected and values calculated, the second part of the section describes how to create predictive and QoE_i classification models.

5.1. Research Sample Statistics

The research sample consisted of 167 surveyed users of communication services on the territory of the Republic of Srpska and BiH. After filtering the data, i.e., removing missing values, the sample size was reduced to 157 completely filled forms used for further analysis and processing, of which 52% were completed by male respondents and 48% by female respondents.

Regarding age structure, respondents between the ages of 30 and 45 dominate, with 55% of the total sample. The representation of users in the other age intervals, in descending order, is as follows: 45 to 60 years with 29%, 20 to 30 years with 8%, older than 60 with 8% and younger than 20 with a negligible 0.006% [15].

The questionnaire offered six possible responses to the question about the legal–regulatory form of the respondent’s organization/firm. The analysis of the results shows that 53% of the people surveyed work in educational/scientific institutions, followed by 24% in companies, 16% in state or local administration bodies, 3% in agencies, associations, chambers, unions, entrepreneurs, banks, representative offices, etc., and 3% in other organizational forms, while the smallest share of respondents, 1%, belongs to the category of students.

Out of the total number of people surveyed, 128 or 81.5% declared that they had experience using services from only one provider. Of these, 82.8% are users of the Mtel network. Second place in terms of frequency of use belongs to other IT providers with 7.8%, followed by BH Telekom with 5.4%, HT Eronet with 3.1% and other networks with 0.8%. The rest of the respondents in the sample (18.5%) have used or use the services of two or more telecommunication service providers.

The average qualitative assessment of the level of user experience, i.e., the perceived level of QoS, can be rated as good. The corresponding quantitative rating expressed through the MOS is equal to 2.97. When it comes to user satisfaction with the price of services, the achieved MOS value is 2.83, which can be considered as solid. Users rated the legal security in interactions with the provider in the area of contract-based service provision and bill payment by cost accounting with an average score of 2.97 (solid) [15].

For four defined classes of traffic [24], based on the MOS values that represent the average ratings of the quality of experience level, it can be concluded that background traffic services have the highest rating, MOS = 3.87. On the other hand, real-time services, such as audio and video streaming, were rated with the lowest average rating, MOS = 3.54. The largest number of users or 29.9% declared that they had experience in using the provider’s services from 15 to 20 years. Only 2.5% of users had a short experience, from one to five years.

Table 2 shows descriptive statistics for indicators of user satisfaction and dissatisfaction. In addition to the mark of each indicator, the values of the arithmetic mean (MOS), standard deviation, variance, sum of squares, minimum, median and maximum are given. The statistical focus on question 10 shows that communication is the best-rated indicator of user satisfaction with MOS = 3.71, while the worst-rated indicator is courage with an average MOS of 2.92. Negative influences on the final level of QoE_i values are expressed through five indicators of dissatisfaction defined by question 11, out of which behavioral rigidity has the greatest impact, rated with MOS = 2.88, and the least influence is caused by prospective anxiety, with a value of MOS = 2.80 [15]. As can be seen on the basis of Table 2, the MOS values are lower for indicators of dissatisfaction compared to indicators of satisfaction.

5.2. QoE_i Estimation Model

A weight value equal to 0.1 was assigned to each of the 10 identified user satisfaction indicators (D₁,…,D₁₀). These indicators affect the increase in the overall level of QoE_i by multiplying their weights with a corresponding user rating. Negative indicators (C₁,…,C₅) are associated with weight values equal to −0.2, which means that they affect the reduction of the QoE_i level. Based on the above, a mathematical formulation can be derived for estimating the user QoE_i:

QoE_{i} = 0.1 \sum_{j=1}^{10} D_{j} - 0.2 \sum_{k=1}^{5} C_{k}

(5)

where D_j denotes the user’s ratings of satisfaction indicators, and C_k denotes the ratings of dissatisfaction indicators. Therefore, the total level of user QoE_i is determined on a scale of real numbers from 0 to 5, noting that negative values of QoE_i are mapped to the value of 0. The arithmetic mean of individual, subjective QoE_i values calculated in this way, where i = 1...157, is equal to the QoE value for a group of 157 users and amounts to MOS = 0.503. If this MOS is rounded to an integer value, a score of 1 is obtained, which qualitatively represents a bad user QoE rating for the observed services and telecommunications operators.
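Expression (5), together with the mapping of negative values to 0, can be sketched as follows. The function name and the example ratings are hypothetical; the weights 0.1 and 0.2 are those given in the text.

```python
def estimate_qoe(satisfaction, dissatisfaction):
    """Estimate QoE_i from 10 satisfaction (D) and 5 dissatisfaction (C) ratings."""
    if len(satisfaction) != 10 or len(dissatisfaction) != 5:
        raise ValueError("expected 10 satisfaction and 5 dissatisfaction ratings")
    raw = 0.1 * sum(satisfaction) - 0.2 * sum(dissatisfaction)
    return max(raw, 0.0)  # negative values are mapped to 0


# Hypothetical respondent: every D rated 4, every C rated 2.
qoe_satisfied = estimate_qoe([4] * 10, [2] * 5)
# Hypothetical respondent whose dissatisfaction outweighs satisfaction.
qoe_dissatisfied = estimate_qoe([3] * 10, [4] * 5)
```

In the first case the estimate is 0.1 · 40 − 0.2 · 10 = 2; in the second the raw value is negative and is therefore reported as 0.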

5.3. QoE_i Probability Model

Table 3 shows the results of the distribution fitting procedure, giving the Anderson–Darling (AD) test statistics and the associated p-values, with the distributions ranked according to AIC. The LogNormal distribution has the lowest value, AIC = 26.61, and therefore best describes the probability density distribution of the variable QoE_i. Although a smaller value of the AD statistic indicates a better fit of the data to the observed type of distribution (the logistic distribution has the smallest value, AD = 9.604), the choice was made according to AIC. Smaller p-values represent evidence against the null hypothesis that the data follow a particular distribution.
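The ranking of candidate distributions by AIC can be illustrated with the following sketch, which fits a normal and a log-normal distribution by maximum likelihood and compares AIC = 2k − 2 ln L. The sample is hypothetical and strictly positive (the logarithm is undefined at 0), and the fitting tool used to produce Table 3 is not assumed to work exactly this way.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_likelihood


def normal_loglik(data):
    """Maximized log-likelihood of a normal distribution (ML mean and variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(data):
    """Maximized log-likelihood of a log-normal distribution.

    Equals the normal log-likelihood of ln(x) plus the Jacobian term -sum(ln x).
    """
    logs = [math.log(x) for x in data]
    return normal_loglik(logs) - sum(logs)


# Hypothetical right-skewed, strictly positive sample.
sample = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 1.0, 1.5, 2.5, 4.0]
aic_normal = aic(normal_loglik(sample), n_params=2)
aic_lognormal = aic(lognormal_loglik(sample), n_params=2)
```

For this skewed sample the log-normal fit has the lower AIC, which is the same selection logic as the one applied to Table 3.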

Figure 3 graphically shows the shape of the LogNormal probability distribution density function.

5.4. Correlation Analysis of Research Variables

The results of the correlation analysis are shown graphically, in the form of a correlogram in Figure 4. Each value of the correlation coefficient in the interval between −1 and 1 is represented by a certain shade of blue for positive and red for negative coefficients. Based on Figure 4, it can be concluded that, in line with the model for estimating the QoE_i value, the strongest correlations (by absolute value) with user ratings of the quality of experience are shown by the transitional variables used to define satisfaction indicators D (0.41) and dissatisfaction indicators C (−0.55). According to the correlogram, the highest correlation exists between variables X₆ and X₇ (0.63), followed by that between X₅ and X₉, with a coefficient of 0.43. Additionally, it can be noted that a correlation equal to zero exists between the independent variables X₁ and X₆ and between X₁ and the dissatisfaction indicator C.

5.5. Predictive Models of QoE_i

In this section, the results of QoE_i prediction using a multiple linear regression model, a boosted decision tree model and machine learning models created in the IBM SPSS Modeler software environment are presented.

5.5.1. Multiple Linear Regression Model

Linear regression models are among the most commonly used statistical models. Their popularity stems mainly from the simplicity of their creation. However, they have a number of disadvantages, especially when applied to large databases. In the observed case, the multiple linear regression model was created in the statistical software Minitab and can be expressed by the following equation:

QoE_i = 0.343 + 0.0115 X₁ + 0.104 X₂ − 0.0438 X₃ − 0.0081 X₄ − 0.0133 X₅ + 0.0271 X₆ + 0.0210 X₇ − 0.0085 X₈ − 0.0007 X₉ + 0.0035 X₁₀

(6)

As an indicator of user QoE_i prediction accuracy, the mean square error (MSE) on the test set is MSE = 0.625, while the coefficient of determination is R² = 2.5%. This very low coefficient of determination confirms the shortcomings of the linear model.
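For illustration, equation (6) can be evaluated for a given input vector, and the MSE can be computed on a test set, as in the following sketch. The coefficients are those of equation (6); the function names and input values are hypothetical.

```python
INTERCEPT = 0.343
# Coefficients of X1..X10 as given in equation (6).
COEFFICIENTS = [0.0115, 0.104, -0.0438, -0.0081, -0.0133,
                0.0271, 0.0210, -0.0085, -0.0007, 0.0035]


def predict_qoe_linear(x):
    """Evaluate the multiple linear regression model for one input vector."""
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 10 independent variables")
    return INTERCEPT + sum(c * v for c, v in zip(COEFFICIENTS, x))


def mean_squared_error(actual, predicted):
    """Mean of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical respondent with every independent variable equal to 1.
prediction = predict_qoe_linear([1] * 10)
```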

5.5.2. Boosted Decision Tree Model

By implementing the modified code [60] in the MATLAB software package, according to the supervised learning paradigm, a machine learning model based on boosted decision trees was developed. The test results show that the MSE of the created model’s prediction is 0.388, which is about 1.6 times smaller than the error of the linear model (MSE = 0.625). In addition, based on the results shown in Figure 5, which were obtained by executing the code [60], it is concluded that the user experience expressed by the characteristics of each of the four traffic classes (X₈) and the user experience expressed by the levels for the four traffic classes (X₉) represent the variables with the greatest influence (predictor importance) on the prediction of QoE_i. In last place in terms of influence is the variable representing the age (X₁) of the respondent.

If a decision tree is created with 15 inputs, which represent indicators of satisfaction and dissatisfaction with the quality of user experience, the QoE_i prediction error is then MSE = 0.16. As a result of training and testing the model, by executing the code [60], the MATLAB program package generates a graphic representation of the importance of the influence of each of the mentioned inputs on the prediction of QoE_i (Figure 6a). By creating a decision tree with inputs that are exclusively satisfaction indicators, Figure 6b shows as a result the ranking of the importance of the influence on the prediction of QoE_i for each of them. It can be concluded that the rating of the service impact on the user’s charisma has the greatest influence (D₇). The prediction error of the quality of user experience in this case has a value of MSE = 0.45. Nevertheless, the smallest prediction error is achieved if only five dissatisfaction indicators denoted by C are considered as inputs (MSE = 0.09). This result is in accordance with the presented mathematical model for estimating QoE_i (expression (5)) in which higher-weighting coefficients according to absolute value (0.2) are associated with indicators of dissatisfaction. Figure 6c graphically shows the importance of the influence of indicator C on the prediction of the target variable, from which it can be concluded that the rating of the connection of services with prospective anxiety (C₄) is at the top of the ranking by the value of prediction importance. So, taking into account the MSE values, it is clear that the prediction using the boosted decision tree model shows significantly more accurate results compared to the linear model.

5.5.3. Predictive Models Created by Using an Automatic Modeling Method

Table 4 shows the results of testing the three most accurate predictive models created using the Auto Numeric node [36,37]. The regression model has the smallest relative testing error, which is 1.070. However, a relative prediction error equal to 1 is a characteristic of the null or intercept model, which returns the arithmetic mean of the target variable as its prediction. Models with relative testing errors less than 1 are considered better than the null model, and the smaller the relative error, the better the model. Since all relative testing errors shown in Table 4 exceed 1, the created models can be considered unsatisfactorily accurate [61].

If the obtained correlation coefficients for the three models from Table 4 are compared with the scale of Pearson’s correlation coefficients, shown in Table 5, it is concluded that, in the best case, there is only a low correlation between the estimated QoE_i values and the QoE_i values obtained by prediction, and that is for the model based on the k-NN machine learning technique (r = 0.206). The C&R tree model shows the worst prediction performance in terms of both correlation (0.000) and relative error (1.147) [62].

The presented results of model testing are a direct consequence of the insufficiently large training dataset. Therefore, using the data augmentation method, the basic dataset was augmented with “synthetic” data, obtained on the basis of the 157 existing vectors [41]. The artificial expansion of the dataset was performed with an autoencoder artificial neural network implemented in the MATLAB programming environment, whose architecture is shown in Figure 7 [63]. The autoencoder is a feedforward neural network that sends signals in only one direction, from the input to the output layer, and thus forms a directed acyclic graph; its aim is to learn to compress input vectors. The autoencoder therefore represents the data presented at the input side of the network using fewer nodes in the hidden layer. From this compressed representation, its task is to reconstruct the input as completely as possible. Therefore, the hidden layer can be considered a feature detector.

The architecture of the autoencoder, shown in Figure 7, consists of an input layer with 11 nodes (ten independent variables X₁...X₁₀ and one dependent variable QoE_i), 5 nodes in the hidden layer and 11 nodes in the output layer in which the input vectors (X₁’...X₁₀’, QoE_i’) are reconstructed. The transfer functions of the neurons of the hidden and output layer have the form of a logistic sigmoid function (logsig):

f(z) = \frac{1}{1 + e^{-z}}

(7)

where z is the input to the neuron. By bringing 157 available vectors to the input layer of the network, the same number of modified copies is obtained at the output as the interpretation of the input. Therefore, with one described iteration, the input dataset is doubled. In each subsequent iteration, the sum of the input number of vectors and the corresponding number of reconstructed vectors from the previous iteration is added to the input layer of the autoencoder network, which is shown graphically in Figure 8.
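The forward pass of an 11–5–11 autoencoder with logistic sigmoid transfer functions can be sketched as follows. The weights here are random and untrained, the inputs are assumed to be scaled to the interval from 0 to 1, and all names are hypothetical; this is not the MATLAB implementation, only an illustration of how an input vector is compressed to five hidden activations and then reconstructed.

```python
import math
import random


def logsig(z):
    """Logistic sigmoid transfer function, expression (7)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer followed by the logsig transfer function."""
    return [logsig(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_out, n_in, rng):
    """Random (untrained) weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
enc_w, enc_b = init_layer(5, 11, rng)    # encoder: 11 inputs -> 5 hidden nodes
dec_w, dec_b = init_layer(11, 5, rng)    # decoder: 5 hidden nodes -> 11 outputs


def reconstruct(vector):
    """Compress an 11-element vector to 5 activations and reconstruct it."""
    hidden = dense(vector, enc_w, enc_b)
    return dense(hidden, dec_w, dec_b)


# Hypothetical scaled vector (X1..X10 and QoE_i).
output = reconstruct([0.5] * 11)
```

Training would adjust the weights so that the output approximates the input; the slightly modified reconstructions are what serve as synthetic vectors.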

Figure 8 shows that in the fourth iteration, the total number of vectors increased to 2512. After each iteration, training and testing of several models was performed, and the prediction results for each of the three most accurate predictive models are shown in the diagram in Figure 9.
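The growth of the dataset over the iterations can be sketched as a loop in which the current vectors and their reconstructions together form the input to the next iteration. The `reconstruct` argument is a stand-in for the trained autoencoder; the simple scaling function used here is purely hypothetical.

```python
def augment(vectors, reconstruct, iterations):
    """Double the dataset in each iteration by appending reconstructed copies."""
    data = list(vectors)
    sizes = [len(data)]
    for _ in range(iterations):
        data = data + [reconstruct(v) for v in data]
        sizes.append(len(data))
    return data, sizes


# Stand-in for the autoencoder output: a slightly modified copy of the input.
def fake_reconstruct(vector):
    return [0.99 * x for x in vector]


original = [[0.5] * 11 for _ in range(157)]
augmented, sizes = augment(original, fake_reconstruct, iterations=4)
```

Starting from 157 vectors, the sizes after each iteration are 314, 628, 1256 and 2512.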

From Figure 9, it can be concluded that the value of the relative prediction error decreases with the expansion of the training and testing dataset, while at the same time the correlation coefficients increase, which is the case for all created models. The predictive model that shows the best performance of all presented is the model based on C&R trees. According to Figure 9, the RE prediction of this model is 0.274, while the correlation coefficient is equal to 0.853 in the fourth iteration. If the scale of Pearson’s coefficients shown in Table 5 is taken into account, it can be concluded that the QoE_i values from the set for testing the model and the values obtained as a result of prediction have a very high level of correlation. In total, 2512 input–output vectors were used for training and testing the tree model, out of which 2260 were used for training and 252 for testing. The trend of decreasing relative error of all the models shown in Figure 9 can be represented by the following linear equation:

RE = -0.042 \cdot m + 0.788

(8)

where m represents the serial number of the model starting from the generalized linear model from the first iteration, which has serial number 1. Figure 10 shows the layered structure of the created tree with a depth equal to five (tree depth = 5).

The C&R tree algorithm starts by examining the input fields to find the best split, defined as the one that yields the minimum impurity index. The Gini index, which is related to the probability of misclassifying a random sample, is most often used to define impurity. In the IBM SPSS Modeler software, a minimum impurity change of 10⁻⁴ is defined by default to create a new split in the tree. All divisions are binary, which means that each division generates two subgroups, each of which is then divided into another two and so on until one of the stopping criteria is reached. The following were set as stopping criteria in the created C&R tree model:

Minimum records in parent branch—prevents splitting if the number of records in a node to be split (parent) is less than the set value—2% of the total dataset.
Minimum records in child branch—prevents the split if the number of records in any branch created by the split (child node) would be less than the set value—1% of the total dataset.
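The split search and the two stopping criteria described above can be sketched as follows. This is a simplified illustration for a single numeric input and categorical labels, not the IBM SPSS Modeler implementation; the function and parameter names are hypothetical, and only the thresholds (2%, 1% and 10⁻⁴) are taken from the text.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels, total_records,
               min_parent=0.02, min_child=0.01, min_gain=1e-4):
    """Return (threshold, impurity_decrease) of the best binary split,
    or None if a stopping criterion prevents any split."""
    n = len(values)
    # Stopping criterion 1: the parent node holds too few records.
    if n < min_parent * total_records:
        return None
    parent_impurity = gini(labels)
    best = None
    # Candidate thresholds: every distinct value except the largest.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Stopping criterion 2: a child node would hold too few records.
        if min(len(left), len(right)) < min_child * total_records:
            continue
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent_impurity - weighted
        # A split is kept only if it reduces impurity by at least min_gain.
        if gain >= min_gain and (best is None or gain > best[1]):
            best = (threshold, gain)
    return best


# Toy data (hypothetical): six records that separate perfectly at value 3.
print(best_split([1, 2, 3, 4, 5, 6],
                 ["low", "low", "low", "high", "high", "high"],
                 total_records=6))
```

Applying `best_split` recursively to the left and right subsets would grow the binary tree until no node admits a further split.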

In order to further improve the accuracy of the C&R tree model in the IBM SPSS Modeler software, the boosting method was applied. This method is based on creating several models (components) in sequence, i.e., creating an ensemble of models. Prediction accuracy is increased by building each subsequent model in the ensemble so that it focuses on the inputs for which the previous model made poor predictions. Finally, the prediction is made by applying the whole series of models and combining their separate predictions into one result using a weighted voting procedure. Nevertheless, the results show that the relative error achieved by the ensemble prediction, RE = 0.404, is greater than the RE of the existing model. Higher prediction accuracy, i.e., a lower RE value than that of the existing C&R tree model, was achieved with the reference model. This is the first model in the ensemble of 10 created components shown in Table 6; its relative error, RE = 0.247, is smaller by 0.027 than the previously considered RE = 0.274 and corresponds to the prediction accuracy A = 69.7%. The prediction accuracy A can be represented by the following expression:

A = 1 - \frac{\sum_{i=1}^{n} |QoE_{ip} - QoE_{i}|}{QoE_{ip,\max} - QoE_{ip,\min}}

(9)

where n is the number of input–output vectors in the testing dataset, QoE_ip is the predicted value of QoE_i for the i-th vector, QoE_ip_max is the maximum value of the prediction variable QoE_i and QoE_ip_min is its minimum value. In addition to the value of A, Table 6 shows the number of inputs for each component and the size of the model expressed by the number of nodes.
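A minimal sketch of how an accuracy of this kind could be computed is given below. It assumes that the absolute errors are averaged over the n test vectors (mean absolute error) before being divided by the range of predicted values; this averaging is an assumption made here so that A stays in a sensible range, and it is not written explicitly in Expression (9). The function name and the sample values are hypothetical.

```python
def prediction_accuracy(predicted, observed):
    """1 minus the mean absolute error divided by the range of predictions.

    Assumption: the summed absolute errors are averaged over n.
    """
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    mean_abs_error = sum(abs(p - o) for p, o in zip(predicted, observed)) / n
    value_range = max(predicted) - min(predicted)
    if value_range == 0:
        raise ValueError("predicted values must not all be equal")
    return 1.0 - mean_abs_error / value_range


# Hypothetical predicted and observed QoE_i values for four test vectors.
print(prediction_accuracy([0.2, 0.4, 0.6, 0.8], [0.3, 0.4, 0.5, 0.8]))
```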

Apart from the smaller relative error achieved by the reference model, based on Table 6, it is necessary to point out another advantage of this model compared to the others. That advantage is the smaller number of inputs (9 inputs), which favors its simplicity.

5.6. Models for QoE_i Classification

Given that the QoE_i values obtained from the mathematical assessment model presented by Expression (5) are continuous, the first step in creating this type of model is to map the continuous values onto the discrete absolute category rating (ACR) scale, as presented in Table 7 [64,65].
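The mapping step can be illustrated with a short sketch. The interval boundaries below are hypothetical equal-width bins on an assumed [0, 1] range and do not reproduce Table 7; the five category names follow the common five-point ACR wording, which may differ from the labels used in the table.

```python
from bisect import bisect_right

# Common five-point ACR wording (the labels in Table 7 may differ).
ACR_LABELS = ["Bad", "Poor", "Fair", "Good", "Excellent"]

# Hypothetical bin edges, assuming QoE_i lies in [0, 1]; not from Table 7.
HYPOTHETICAL_EDGES = [0.2, 0.4, 0.6, 0.8]


def to_acr(qoe_i, edges=HYPOTHETICAL_EDGES):
    """Map a continuous QoE_i value to an (ACR score, label) pair."""
    index = bisect_right(edges, qoe_i)
    return index + 1, ACR_LABELS[index]


print(to_acr(0.503))  # (3, 'Fair') with these hypothetical edges
```

With the actual boundaries from Table 7 substituted for `HYPOTHETICAL_EDGES`, the same lookup would produce the class labels used to train the classification models.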

Structured input–output vectors for training and testing are processed in the Auto Classifier node in order to automatically create different models for QoE_i classification simultaneously. Similar to the Auto Numeric node, in only one pass through the modeling process, Auto Classifier examines different machine learning techniques with default option settings. Options in this case mean the number of neural network layers, the number of neurons in each layer, the shape and parameters of the classification function, the training algorithm, stopping criteria, tree size, etc. Based on the test results, Auto Classifier offers the most accurate solutions and ranks them according to the overall classification accuracy expressed in percentages. Based on the available set of 157 vectors, several models were created, and Table 8 shows the results of testing the three most accurate models.

Based on the results shown in Table 8, it is concluded that the k-NN model has the highest classification accuracy of all tested models, with 50% of input vectors correctly classified. In order to increase the classification accuracy, the basic dataset is expanded with synthetic copies using the DA method, in the same way as for the QoE_i prediction models. Four iterations of increasing the dataset were performed, following the procedure previously carried out for the predictive models (Figure 8), and the results of testing the three most accurate models after each iteration are presented in the diagram in Figure 11. The criterion for selecting the final, most accurate model was the maximum overall classification accuracy, which was achieved by the SVM-based model after the fourth iteration, i.e., with a total set of 2512 input–output vectors, and amounts to 94.048%. The increase in overall classification accuracy with the growth of the dataset can be represented by a linear trend of the following form:

\text{Overall Accuracy} = 2.009 \cdot c + 71.487

(10)

where c represents the serial number of the classification model starting from the C&R Tree model with serial number 1 in the first iteration.

SVM represents one of the most powerful prediction methods, which is based on statistical learning theory and the VC (Vapnik–Chervonenkis) theory proposed by Vapnik (1982, 1995) and Chervonenkis (1974). This machine learning technique belongs to the supervised learning paradigm; therefore, in the training dataset, there must be a label of one of the two categories to which each input vector belongs. The SVM training algorithm creates a model that assigns new examples to one of the categories, making it a non-probabilistic binary linear classifier.

SVM maps data into a multidimensional space of variables so that the input data can be categorized even when they are not linearly separable. A separator between the categories is sought, and the data are transformed in such a way that the separator can be constructed as a hyperplane with the maximum margin between the categories. The characteristics of new data can then be used to predict the group to which the new data should belong [66]. The mathematical function used for this transformation is known as the kernel function. SVM in IBM SPSS Modeler supports linear, polynomial, radial basis function (RBF) and sigmoid kernel functions, and in this paper, the RBF kernel was used for the observed SVM model. The model has a regularization parameter C, which controls the trade-off between maximizing the margin and minimizing the number of misclassified data. In this research, it is left at its default value of 10. The stopping criterion defines the moment when the optimization algorithm should be stopped. Its value ranges from 10⁻¹ to 10⁻⁶, and according to the default settings in the software, it is 10⁻³. A smaller criterion value results in a more accurate model, but it takes more time to train [67]. The RBF gamma parameter is 0.1 and can be considered as the “expansion” of the kernel and thus of the decision region. When gamma is low, the decision area is very wide. When gamma is high, the decision boundaries are very distorted and so-called decision boundary islands are created around the points belonging to the same category.
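The effect of gamma can be illustrated with the usual form of the RBF kernel, K(x, y) = exp(-gamma · ||x - y||²). The sketch below is a plain-Python illustration and not the IBM SPSS Modeler implementation; the function name and the sample points are hypothetical, and only gamma = 0.1 is taken from the text.

```python
import math


def rbf_kernel(x, y, gamma=0.1):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical points at Euclidean distance 5 (squared distance 25).
p, q = [0.0, 0.0], [3.0, 4.0]
print(rbf_kernel(p, q, gamma=0.1))   # low gamma: the points still "see" each other
print(rbf_kernel(p, q, gamma=10.0))  # high gamma: similarity is practically zero
```

A low gamma keeps the kernel value noticeably above zero even for distant points, which gives wide decision regions, while a high gamma makes each training point influence only its immediate neighborhood, which is what produces the isolated decision boundary islands.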

6. Conclusions

The paper presents an original research approach to creating models for the assessment, prediction and classification of the sustainable quality of experience of anonymous users of telecommunication services, especially in interactions with conversational, streaming, interactive and background classes of network traffic. For the case study, representative providers in the geo-area of Republika Srpska and Bosnia and Herzegovina were chosen, and data from anonymous users were collected online by means of an originally designed questionnaire. The distinctive feature of the questionnaire design is its transition variables, which in the models act as indicators of the characteristics of individual service users. They indicate the user’s curiosity in relation to the provider’s service portfolio; situational creativity during the use of services; strength of character and ethical consistency in public communication; the effectiveness of multimodal communication in the interaction field of users and services; the courage of users in the network interaction field that is based on conviction; charisma; competence; common-sense reasoning; and operational readiness and functional ability of the user’s perceptive, cognitive and receptor memory. The output variable contains indicators of shades of satisfaction up to the level of enthusiasm in the interaction field and/or dissatisfaction up to the level of rigidity on the interpersonal, behavioral, structural inhibitory and prospective anxiety levels. Satisfaction and dissatisfaction are thus the indicators of the sustainable quality of user experience in the dependent output variable. The research data were processed in modern software environments for statistical analysis and for the creation of machine learning models for the assessment, classification and prediction of the sustainable quality of user experience.

Based on the analysis of paired legal–regulatory, technological–process, content-formatted and performative, contextual–relational and subjective–user factors affecting the quality of user experience, survey questions were formulated and used to define the independent input variables and the transition variables of the research, which serve as QoE_i indicators. One of the important novelties presented in this paper compared to previous publications is the dependent/output variable QoE_i, which indicates the sustainable quality of user experience at the level of an individual user. The arithmetic mean of QoE_i for i = 1, ..., 157 represents the value of QoE for the specified group of users expressed through MOS. Given that the MOS value calculated in Section 5.2 is 0.503, it can be concluded that the overall quality of experience for the sample of 157 users is insufficiently acceptable.

As part of QoE_i predictive modeling, a multiple linear regression model, a machine learning model based on decision trees and predictive models based on an automatic modeling method were created. The results show that the accuracy of the models trained and tested on the collected dataset alone was insufficiently high. In order to improve the prediction accuracy, a synthetic expansion of the dataset was performed using the data augmentation method in four iterations, which increased the basic set from 157 to 2512 input–output vectors. The smallest relative prediction error, RE = 0.274, was achieved with the C&R tree model, and it was further reduced by 9.9% to RE = 0.247 with the boosting method.

When it comes to QoE_i classification, the models trained and tested on the initial set of 157 vectors also demonstrated insufficiently high accuracy, 50% at best. However, by augmenting the dataset with the DA method in four iterations, it was shown that the model based on SVM achieves the highest accuracy of 94.048% among the created models. All created models for assessment, prediction and classification of sustainable quality of user experience can serve entities that provide telecommunication services as a tool to adapt them to users in the future and thus achieve long-term sustainability of quality with increased effectiveness and efficiency of business results.

In addition to a unique subjective approach to the assessment of the quality of experience and an original questionnaire, the subject of the user assessment of the quality of experience stands out as an important scientific contribution and novelty applied in this paper. It represents all services and classes of telecommunication traffic of all observed mobile operators in the given case study. Future research can be oriented towards modeling the quality of experience of users of individual telecommunication services.

Some limitations of this paper refer to the QoE models that were created on the basis of data obtained for the services of telecommunication operators that operate in a perhaps insufficiently wide geo-territorial space. Therefore, it would probably be a good opportunity to train and test the model on new data for application in other geo-areas and for certain services and operators. Additionally, given that the models were created on the basis of an artificially expanded data set, certain deviations in the results are possible if the same models were created on a real data set.

Author Contributions

Conceptualization, M.K.B. and M.S.; methodology, M.K.B., M.S. and Z.Ć.; software, M.S. and M.V.; validation, M.K.B., Z.Ć. and G.P.; formal analysis, M.K.B. and M.S.; investigation, M.K.B., M.S. and Z.Ć.; resources, M.S., M.K.B., Z.Ć., D.D. and M.V.; data curation, M.K.B., M.S., Z.Ć., G.P., D.D. and M.V.; writing—original draft preparation, M.K.B. and M.S.; writing—review and editing, M.K.B., M.S., Z.Ć. and G.P.; visualization, M.S., M.K.B., M.V. and D.D.; supervision, M.K.B. and Z.Ć.; project administration, M.S., Z.Ć. and D.D.; funding acquisition, D.D., Z.Ć. and G.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Available on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zhu, Y.; Heynderickx, I.; Redi, J.A. Understanding the role of social context and user factors in video quality of experience. Comput. Hum. Behav. 2015, 49, 412–426.
2. Geiser, M.; Panwar, D.; Tomar, P.; Harsh, H.; Zhang, X.; Solanki, A.; Nayyar, A.; Alzubi, J.A. An optimization model for software quality prediction with case study analysis using MATLAB. IEEE Access 2019, 7, 85123–85138.
3. Movassagh, A.A.; Alzubi, J.A.; Gheisari, M.; Rahimi, M.; Mohan, S.; Abbasi, A.A.; Nabipour, N. Artificial neural networks training algorithm integrating invasive weed optimization with differential evolutionary model. J. Ambient Intell. Humaniz. Comput. 2021, 12, 1–9.
4. Hansmann, R.; Mieg, H.A.; Frischknecht, P. Principal sustainability components: Empirical analysis of synergies between the three pillars of sustainability. Int. J. Sustain. Dev. World Ecol. 2012, 19, 451–459.
5. ETSI TS 103 294; Speech and Multimedia Transmission Quality (STQ); Quality of Experience; A Monitoring Architecture, Technical Specification, V1.1.1. European Telecommunications Standards Institute: Sophia Antipolis Cedex, France, 2014. Available online: https://www.etsi.org/deliver/etsi_ts/103200_103299/103294/01.01.01_60/ts_103294v010101p.pdf (accessed on 30 March 2021).
6. ETSI TR 102 643; Human Factors (HF); Quality of Experience (QoE) Requirements for Real-Time Communication Services, Technical Report, V1.0.1 (2009-12). European Telecommunications Standards Institute: Sophia Antipolis Cedex, France, 2009. Available online: https://www.etsi.org/deliver/etsi_tr/102600_102699/102643/01.00.01_60/tr_102643v010001p.pdf (accessed on 30 March 2021).
7. Laghari, K. On Quality of Experience (QoE) for Multimedia Services in Communication Ecosystem. Ph.D. Thesis, Institut National des Telecommunications, Télécom SudParis, Paris, France, 30 April 2012. Available online: https://tel.archives-ouvertes.fr/tel-00873612/document (accessed on 13 November 2022).
8. ITU-T Recommendation P.10/G.100; Amendment 2: New Definitions for Inclusion in Recommendation ITU-T P.10/G.100. International Telecommunication Union: Geneva, Switzerland, 2008. Available online: https://www.itu.int/rec/T-REC-P.10-200807-S!Amd2/en (accessed on 8 November 2021).
9. Vakili, A.; Grégoire, J.-C. QoE management in a video conferencing application. In Future Information Technology, Application and Service; Lecture Notes in Electrical Engineering; Park, J.J., Leung, V.C.M., Wang, C.L., Shon, T., Eds.; Springer: Dordrecht, The Netherlands, 2012; Volume 164, pp. 191–201.
10. ITU-T Recommendation P.10/G.100 (11/17); Vocabulary for Performance, Quality of Service and Quality of Experience. International Telecommunication Union: Geneva, Switzerland, 2017. Available online: https://www.itu.int/rec/T-REC-P.10-201711-I/en (accessed on 8 November 2022).
11. Dai, Q. A Survey of Quality of Experience. In Energy-Aware Communications. EUNICE 2011; Lecture Notes in Computer Science; Lehnert, R., Ed.; Springer: Berlin, Heidelberg, 2011; Volume 6955, pp. 146–156.
12. Eswara, N.; Ashique, S.; Panchbhai, A.; Chakraborty, S.; Sethuram, H.P.; Kuchi, K.; Kumar, A.; Channappayya, S.S. Streaming video QoE modeling and prediction: A long short-term memory approach. IEEE Trans. Circuits Syst. Video Technol. 2019, 30, 661–673.
13. Barman, N.; Martini, M.G. Qoe modeling for HTTP adaptive video streaming–a survey and open challenges. IEEE Access 2019, 7, 30831–30859.
14. Ruan, J.; Xie, D. A survey on QoE-oriented VR video streaming: Some research issues and challenges. Electronics 2021, 10, 2155.
15. Banjanin, M.K.; Maričić, G.; Stojčić, M. Multifactor influences on the quality of experience service users of telecommunication providers in the Republic of Srpska, Bosnia and Herzegovina. Int. J. Qual. Res. 2022, 17.
16. Daengsi, T.; Wuttidittachotti, P. QoE Modeling for Voice over IP: Simplified E-model Enhancement Utilizing the Subjective MOS Prediction Model: A Case of G. 729 and Thai Users. J. Netw. Syst. Manag. 2019, 27, 837–859.
17. García-Pineda, M.; Segura-Garcia, J.; Felici-Castell, S. A holistic modeling for QoE estimation in live video streaming applications over LTE Advanced technologies with Full and Non Reference approaches. Comput. Commun. 2018, 117, 13–23.
18. Ickin, S.; Vandikas, K.; Fiedler, M. Privacy preserving qoe modeling using collaborative learning. In Proceedings of the 4th Internet-QoE Workshop on QoE-Based Analysis and Management of Data Communication Networks, Los Cabos, Mexico, 21 October 2019; pp. 13–18.
19. Khokhar, M.J.; Saber, N.A.; Spetebroot, T.; Barakat, C. An intelligent sampling framework for controlled experimentation and QoE modeling. Comput. Netw. 2018, 147, 246–261.
20. Dasari, M.; Sanadhya, S.; Vlachou, C.; Kim, K.H.; Das, S.R. Scalable Ground-Truth Annotation for Video QoE Modeling in Enterprise WiFi. In Proceedings of the 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), Banff, AB, Canada, 4–6 June 2018; pp. 1–6.
21. Veeraragavan, N.R.; Montecchi, L.; Nostro, N.; Vitenberg, R.; Meling, H.; Bondavalli, A. Modeling QoE in dependable tele-immersive applications: A case study of world opera. IEEE Trans. Parallel Distrib. Syst. 2015, 27, 2667–2681.
22. Hoßfeld, T.; Biedermann, S.; Schatz, R.; Platzer, A.; Egger, S.; Fiedler, M. The memory effect and its implications on Web QoE modeling. In Proceedings of the 2011 23rd International Teletraffic Congress (ITC), San Francisco, CA, USA, 6–9 September 2011; pp. 103–110.
23. Lycett, M.; Radwan, O. Developing a quality of experience (QoE) model for web applications. Inf. Syst. J. 2019, 29, 175–199.
24. Banjanin, M.K.; Stojčić, M.; Drajić, D.; Ćurguz, Z.; Milanović, Z.; Stjepanović, A. Adaptive Modeling of Prediction of Telecommunications Network Throughput Performances in the Domain of Motorway Coverage. Appl. Sci. 2021, 11, 3559.
25. Bouraqia, K.; Sabir, E.; Sadik, M.; Ladid, L. Quality of experience for streaming services: Measurements, challenges and insights. IEEE Access 2020, 8, 13341–13361.
26. Hu, Z.; Yan, H.; Yan, T.; Geng, H.; Liu, G. Evaluating QoE in VoIP networks with QoS mapping and machine learning algorithms. Neurocomputing 2020, 386, 63–83.
27. Isak-Zatega, S.; Lipovac, A.; Lipovac, V. Logistic regression based in-service assessment of mobile web browsing service quality acceptability. EURASIP J. Wirel. Commun. Netw. 2020, 96, 1–21.
28. Mitra, K.; Zaslavsky, A.; Åhlund, C. QoE modelling, measurement and prediction: A review. arXiv 2014, arXiv:1410.6952.
29. Pal, D.; Triyason, T. A survey of standardized approaches towards the quality of experience evaluation for video services: An ITU perspective. Int. J. Digit. Multimed. Broadcast. 2018, 2018, 1724.
30. Juluri, P.; Tamarapalli, V.; Medhi, D. Measurement of quality of experience of video-on-demand services: A survey. IEEE Commun. Surv. Tutor. 2015, 18, 401–418.
31. Baraković Husić, J.; Baraković, S.; Cero, E.; Slamnik, N.; Oćuz, M.; Dedović, A.; Zupčić, O. Quality of experience for unified communications: A survey. Int. J. Netw. Manag. 2020, 30, e2083.
32. Banjanin, M.K.; Stojčić, M. Conceptual Model of the Cyber-physical System in the Space of the M9J Road Section. In Proceedings of the 15th International Conference on Advanced Technologies, Systems and Services in Telecommunications (TELSIKS), Niš, Serbia, 20–22 October 2021; pp. 299–302.
33. Belmudez, B.; Möller, S. Audiovisual quality integration for interactive communications. EURASIP J. Audio Speech Music Process. 2013, 2013, 1–23.
34. Cavanaugh, J.E.; Neath, A.A. The Akaike information criterion: Background, derivation, properties, application, interpretation, and refinements. Wiley Interdiscip. Rev. Comput. Stat. 2019, 11, e1460.
35. Portet, S. A primer on model selection using the Akaike Information Criterion. Infect. Dis. Model. 2020, 5, 111–128.
36. Ćurguz, Z.; Banjanin, M.; Stojčić, M. Machine learning models for prediction of mobile network user throughput in the area of trunk road and motorway sections. In Proceedings of the First International Conference on Advances in Traffic and Communication Technologies, Sarajevo, Bosnia and Herzegovina, 26–27 May 2022; pp. 27–35.
37. Ćurguz, Z.; Banjanin, M.; Stojčić, M. Prediction of user throughput in the mobile network along the motorway and trunk road. Sci. Eng. Technol. 2022, 2, 23–30.
38. Simakovic, M.; Cica, Z.; Drajic, D. Big-Data Platform for Performance Monitoring of Telecom-Service-Provider Networks. Electronics 2022, 11, 2224.
39. Stojčić, M.; Banjanin, M.K. Predictive Modeling of Telecommunications Traffic Performance Based on Machine Learning Techniques. In Proceedings of the VIII International Symposium NEW HORIZONS 2021 of Transport and Communications, Doboj, Bosnia and Herzegovina, 26–27 November 2021; pp. 378–385.
40. Ivaniš, P.; Drajić, D. Information Theory and Coding-Solved Problems; Springer International Publishing: Cham, Switzerland, 2012; ISBN 978-3-319-49369-5.
41. Stojčić, M.; Banjanin, M.; Ćurguz, Z.; Stjepanović, A. Machine Learning Model of Communication of Physical and Virtual Sensors in the Mobile Network on the Motorway Section. In Proceedings of the 44th International Convention, CTI, MIPRO 2021, Opatija, Croatia, 27 September–1 October 2021; pp. 447–452.
42. Tensorflow. Available online: https://www.tensorflow.org/tutorials/images/data_augmentation (accessed on 13 November 2022).
43. ETSI TS 102 250-2; Speech and Multimedia Transmission Quality (STQ); QoS Aspects for Popular Services in Mobile Networks; Part 2: Definition of Quality of Service Parameters and Their Computation, Technical Specification, V2.4.1. European Telecommunications Standards Institute: Sophia Antipolis Cedex, France, 2015. Available online: https://www.etsi.org/deliver/etsi_ts/102200_102299/10225002/02.04.01_60/ts_10225002v020401p.pdf (accessed on 2 April 2021).
44. ETSI TS 102 250-1; Speech and Multimedia Transmission Quality (STQ); QoS Aspects for Popular Services in Mobile Networks; Part 1: Assessment of Quality of Service, Technical Specification, V2.2.1 (2011-04). European Telecommunications Standards Institute: Sophia Antipolis Cedex, France, 2011. Available online: https://www.etsi.org/deliver/etsi_ts/102200_102299/10225001/02.02.01_60/ts_10225001v020201p.pdf (accessed on 2 April 2021).
45. ETSI TR 103 488; Speech and Multimedia Transmission Quality (STQ); Guidelines on OTT Video Streaming; Service Quality Evaluation Procedures, Technical Specification, V1.1.1 (2019-01). European Telecommunications Standards Institute: Sophia Antipolis Cedex, France, 2019. Available online: https://www.etsi.org/deliver/etsi_tr/103400_103499/103488/01.01.01_60/tr_103488v010101p.pdf (accessed on 11 April 2021).
46. 3GPP TR 26.944; 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; End-to-End Multimedia Services Performance Metrics (Release 10), Technical Report, V10.0.0 (2011-03). 3rd Generation Partnership Project: Sophia Antipolis, France, 2011. Available online: https://www.arib.or.jp/english/html/overview/doc/STDT63v9_10/5_Appendix/Rel10/26/26944-a00.pdf (accessed on 31 March 2021).
47. ITU-T Recommendation G.1000; Communications Quality of Service: A Framework and Definitions. International Telecommunication Union: Geneva, Switzerland, 2002.
48. Mtel. Opšti Uslovi za Pružanje Telekomunikacionih Usluga (Prečišćeni Tekst). Available online: https://mtel.ba/Binary/397/Opsti-uslovi-za-pruzanje-telekomunikacionih-usluganesluzbeni-precisceni-tekst.pdf (accessed on 8 October 2022).
49. GSM Association. Definition of Quality of Service Parameters and Their Computation; Official Document IR.42, Version 9.0. Available online: https://www.gsma.com/newsroom/wp-content/uploads//IR.42-v9.0.pdf (accessed on 3 April 2021).
50. Babatunde, K.A.; Akinboboye, S. Corporate social responsibility effect on consumer patronage-management perspective: Case study of a telecommunication company in Nigeria. J. Komun. 2013, 29, 55–71.
51. Maričić, G.; Banjanin, M.K.; Stojčić, M. Legal-Regulatory Paired Component in the QoE Model for Assessment of the Quality of Experience of Users of Services of Company. In Proceedings of the Materials of 1st International Scientific and Practical Internet Conference “The impact of COVID-19 Pandemic on development of modern world: Threats and opportunities”-WayScience, Dnipro, Ukraine, 9–10 September 2021; pp. 33–36.
52. Banjanin, K.M. Komunikacioni Inženjering; Univerzitet u Istočnom Sarajevu, Saobraćajno-Tehnički Fakultet Doboj: Doboj, Bosnia and Herzegovina, 2007; ISBN 978-99938-859-4-8.
53. Brunnström, K.; Beker, S.A.; De Moor, K.; Dooms, A.; Egger, S.; Garcia, M.-N.; Hossfeld, T.; Jumisko-Pyykkö, S.; Keimel, C.; Larabi, C.; et al. Qualinet white paper on definitions of quality of experience. In Proceedings of the Fifth Qualinet Meeting, Novi Sad, Serbia, 12 March 2013.
54. Reiter, U.; Brunnström, K.; Moor, K.D.; Larabi, M.C.; Pereira, M.; Pinheiro, A.; Zgank, A. Factors influencing quality of experience. In Quality of Experience; Möller, S., Raake, A., Eds.; Springer: Cham, Switzerland, 2014; pp. 55–72.
55. Rahman, M.A.; El Saddik, A.; Gueaieb, W. Augmenting context awareness by combining body sensor networks and social networks. IEEE Trans. Instrum. Meas. 2010, 60, 345–353.
56. Su, J.H.; Yeh, H.H.; Yu, P.S.; Tseng, V.S. Music recommendation using content and context information mining. IEEE Intell Syst. 2010, 25, 1541–1672.
57. Naumann, A.B.; Wechsung, I.; Hurtienne, J. Multimodal interaction: A suitable strategy for including older users? Interact. Comput. 2010, 22, 465–474.
58. Hyder, M.; Laghari, K.u.R.; Crespi, N.; Haun, M.; Hoene, C. Are QoE Requirements for Multimedia Services Different for Men and Women? Analysis of Gender Differences in Forming QoE in Virtual Acoustic Environments. In Emerging Trends and Applications in Information Communication Technologies IMTIC 2012. Communications in Computer and Information Science; Chowdhry, B.S., Shaikh, F.K., Hussain, D.M.A., Uqaili, M.A., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; Volume 281, pp. 200–209.
59. Jumisko-Pyykkö, S.; Häkkinen, J.; Nyman, G. Experienced quality factors: Qualitative evaluation approach to audiovisual quality. In Multimedia on Mobile Devices; SPIE: Bellingham, WA, USA, 2007; Volume 6507, pp. 169–180.
60. MathWorks. Available online: https://www.mathworks.com/products/demos/machine-learning/boosted-regression.html (accessed on 8 November 2022).
61. IBM. Available online: https://www.ibm.com/docs/en/SS3RA7_18.3.0/pdf/ModelerModelingNodes.pdf (accessed on 8 November 2022).
62. Selvanathan, M.; Jayabalan, N.; Saini, G.K.; Supramaniam, M.; Hussin, N. Employee Productivity in Malaysian Private Higher Educational Institutions. PalArch’s J. Archaeol. Egypt/Egyptol. 2020, 17, 66–79.
63. Stackoverflow. Available online: https://stackoverflow.com/questions/39265746/data-augmentation-techniques-forgeneral-datasets (accessed on 8 November 2022).
64. Singla, A.; Rao, R.R.R.; Göring, S.; Raake, A. Assessing media qoe, simulator sickness and presence for omnidirectional videos with different test protocols. In Proceedings of the 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), Osaka, Japan, 23–27 March 2019; pp. 1163–1164.
65. Kara, P.A.; Bokor, L.; Sackl, A.; Mourão, M. What your phone makes you see: Investigation of the effect of end-user devices on the assessment of perceived multimedia quality. In Proceedings of the 2015 Seventh International Workshop on Quality of Multimedia Experience (QoMEX), Messinia, Greece, 26–29 May 2015; pp. 1–6.
66. IBM. Available online: https://www.ibm.com/docs/en/spss-modeler/saas?topic=models-how-svm-works (accessed on 8 November 2022).
67. Scikit-Learn. Available online: https://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html (accessed on 8 November 2022).

Figure 1. Correlation amongst QoE_i influencing factors, questionnaire questions and research variables.

Figure 2. Interaction model of paired factors affecting QoE_i.

Figure 3. LogNormal probability distribution density function of the QoE_i variable.

Figure 4. Correlogram of research variables.

Figure 5. The importance of the influence of certain independent variables on the prediction of QoE_i.

Figure 6. The importance of the influence of indicators on the prediction of QoE_i: (a) all indicators; (b) satisfaction indicators; (c) indicators of dissatisfaction (adapted with permission from Ref. [15]. 2023, International Journal for Quality Research).

Figure 7. Architecture of the used autoencoder neural network.

Figure 8. Four iterations of dataset augmentation using the autoencoder network.

Figure 9. Results of testing predictive models of QoE_i in four iterations of increasing the dataset using the DA method.

Figure 10. Structure of the created C&R tree model.

Figure 11. Results of testing the QoE_i classification model in four iterations of increasing the dataset using the DA method.

Table 1. Comparative overview of QoE modeling in previous research with QoE_i modeling in this paper.

Reference	Title of Paper	Service/Application Observed	Methods and Models Used	Observed Factors/Variables Affecting QoE	Comparative Improvements Presented in This Paper
[16]	QoE Modeling for Voice over IP: Simplified E-model Enhancement Utilizing the Subjective MOS Prediction Model: A Case of G.729 and Thai Users	VoIP	Objective simplified E-model; subjective MOS model for prediction	Delay, packet loss, jitter	(a), (b), (c), (d), (e), (f), (g), (h), (i), (j)
[17]	A holistic modeling for QoE estimation in live video streaming applications over LTE Advanced technologies with Full and Non Reference approaches	Live video streaming	Statistical modeling—regression analysis for objective assessment of video quality; factor analysis	Variables related to QoS, bit stream and basic video quality metrics grouped into factors	(a), (b), (d), (e), (f), (g), (i), (j)
[18]	Privacy Preserving QoE Modeling using Collaborative Learning	Applicable to all services	A machine learning model with data privacy protection—a collaborative machine learning model	Maximum bandwidth for downlink; search time; assessment time	(a), (b), (d), (e), (f), (g), (i), (j)
[19]	An Intelligent Sampling Framework for Controlled Experimentation and QoE Modeling	YouTube video streaming	Machine learning models	QoS variables (delay, bandwidth...)	(a), (b), (c), (d), (e), (f), (g), (i), (j)
[20]	Scalable Ground-Truth Annotation for Video QoE Modeling in Enterprise WiFi	Video telephony	Adaboosted decision trees	Perceptual bitrate (PBR), freeze ratio, freeze length and number of video freezes	(a), (b), (c), (d), (e), (f), (g), (i), (j)
[21]	Modeling QoE in Dependable Tele-immersive Applications: A Case Study of World Opera	World Opera application	Subjective method based on perceived reliability; stochastic activity networks (SANs)	Human perception of video and audio, audience characteristics, performance elements and artistic content	(a), (b), (c), (d), (e), (f), (h), (i), (j)
[22]	The Memory Effect and Its Implications on Web QoE Modeling	Interactive Web services	Support vector machines; iterative exponential regressions; two-dimensional hidden Markov models	Technical factors (scope, page load time, packet loss...); psychological factors (expectations, memory effects, user)	(a), (b), (c), (d), (e), (f), (g), (i), (j)
[25]	Quality of Experience for Streaming Services: Measurements, Challenges and Insights	Streaming services	Subjective methods; objective methods; hybrid methods	Human-related influencing factors; system-related influencing factors; context-related influencing factors; content-related influencing factors	(a), (b), (c), (d), (e), (f), (h), (i), (j)
[26]	Evaluating QoE in VoIP networks with QoS mapping and machine learning algorithms	VoIP services	MOS model; PESQ model; E-model; a single-layer artificial neural network model	Echo, packet loss, jitter, bandwidth, delay	(a), (b), (c), (d), (e), (f), (g), (i), (j)
[23]	Developing a Quality of Experience (QoE) model for Web Applications	Web applications	Quality of experience of Web application (QoEWA) model	Objective factors (KPI); subjective factors (KQI)	(a), (b), (d), (e), (f), (g), (h), (i), (j)
[27]	Logistic regression based in-service assessment of mobile web browsing service quality acceptability	Searching the Web	Binary logistic regression model	Average time-to-connect-TCP	(a), (b), (d), (e), (f), (g), (h), (i), (j)

Table 2. Descriptive statistics for indicators of user satisfaction and dissatisfaction.

	Mark	Mean (MOS)	Standard Deviation	Variance	Sum of Squares	Min	Median	Max
Indicators of user satisfaction	D₁	3.32	1.01	1.03	1889	1	3	5
	D₂	3.17	0.99	0.99	1727	1	3	5
	D₃	3.71	0.96	0.91	2300	1	4	5
	D₄	3.00	1.02	1.04	1575	1	3	5
	D₅	2.92	1.00	1.01	1499	1	3	5
	D₆	2.99	1.02	1.03	1568	1	3	5
	D₇	3.04	1.01	1.03	1616	1	3	5
	D₈	3.16	0.95	0.90	1697	1	3	5
	D₉	2.96	1.11	1.23	1569	1	3	5
	D₁₀	3.09	1.09	1.19	1674	1	3	5
Indicators of user dissatisfaction (forms and measures of rigidity)	C₁	2.83	1.05	1.09	1426	1	3	5
	C₂	2.88	1.02	1.03	1462	1	3	5
	C₃	2.82	1.01	1.01	1402	1	3	5
	C₄	2.80	1.06	1.12	1408	1	3	5
	C₅	2.87	1.08	1.17	1472	1	3	5

Table 3. Results of fitting different tested distributions with a QoE_i variable.

Distribution	AD	p	AIC
LogNormal—three parameter	13.47	0.000	26.61
LogLogistic—three parameter	12.16	<0.005	47.09
Exponential—two parameter	37.41	<0.001	102.3
Logistic	9.604	<0.005	292.0
Normal	10.53	0.000	295.5
Smallest extreme value	−157.0	>0.250	176,443
Largest extreme value	−86.76	>0.250	1,579,994,965

Table 4. Ranking list of the best tested models for QoE_i prediction.

Created Model	Correlation	Relative Error
1. Regression	0.127	1.070
2. k-nearest neighbors (k-NN)	0.206	1.075
3. C&R tree	0.000	1.147

Table 5. The usual five-point scale of Pearson’s correlation coefficients.

Absolute Value of the Correlation Coefficient	Qualitative Assessment
$0.00 < |r| \leq 0.19$	Very low correlation
$0.20 \leq |r| \leq 0.39$	Low correlation
$0.40 \leq |r| \leq 0.59$	Moderate correlation
$0.60 \leq |r| \leq 0.79$	High correlation
$0.80 \leq |r| \leq 1.00$	Very high correlation
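
The scale in Table 5 can be applied programmatically, for example to the correlation values reported in Table 4. The following is a minimal sketch, not the authors' implementation. The function name `describe_correlation` is hypothetical, and two details are assumptions: |r| is rounded to two decimals so that values falling between the tabulated bands (such as 0.195) receive a label, and |r| = 0, which Table 5 does not cover, returns `None`.

```python
from bisect import bisect_right

# Lower bounds of the bands above "very low", taken from Table 5.
_BOUNDS = [0.20, 0.40, 0.60, 0.80]
_LABELS = [
    "Very low correlation",
    "Low correlation",
    "Moderate correlation",
    "High correlation",
    "Very high correlation",
]


def describe_correlation(r):
    """Return the Table 5 qualitative label for a Pearson coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie in [-1, 1]")
    # Assumption: round |r| to two decimals, matching the table's precision.
    magnitude = round(abs(r), 2)
    if magnitude == 0.0:
        # Assumption: Table 5 starts above 0.00, so zero gets no label.
        return None
    return _LABELS[bisect_right(_BOUNDS, magnitude)]


if __name__ == "__main__":
    # Correlation values of the regression and k-NN models in Table 4.
    for r in (0.127, 0.206):
        print(r, describe_correlation(r))
```

Under these assumptions, the regression model's correlation of 0.127 falls in the very low band, while the k-NN value of 0.206 rounds to 0.21 and falls in the low band.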

Table 6. Prediction results for ensemble components of the C&R tree model.

Model/Component Number	Prediction Accuracy (A)	Number of Inputs	Number of Nodes
1	69.7%	9	23
2	52.3%	9	19
3	47.0%	10	17
4	39.2%	10	25
8	34.6%	10	35
5	30.4%	10	25
9	27.8%	10	29
6	25.5%	10	29
10	13.7%	10	25
7	10.0%	10	21

Table 7. Mapping of continuous QoE_i values into the ACR scale.

Continuous Scale	ACR Scale
0 ≤ QoE_i < 0.5	0
0.5 ≤ QoE_i < 1.5	1
1.5 ≤ QoE_i < 2.5	2
2.5 ≤ QoE_i < 3.5	3
3.5 ≤ QoE_i < 4.5	4
4.5 ≤ QoE_i ≤ 5	5
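
The mapping in Table 7 amounts to a lookup against five thresholds. The sketch below illustrates it under stated assumptions: the function name `to_acr` is hypothetical, and rejecting values outside [0, 5] with an error is a choice made here, since the table does not say how such values are handled.

```python
from bisect import bisect_right

# Lower bounds of ACR classes 1 to 5, taken from Table 7.
_THRESHOLDS = [0.5, 1.5, 2.5, 3.5, 4.5]


def to_acr(qoe_i):
    """Map a continuous QoE_i value in [0, 5] to its ACR class (0 to 5)."""
    if not 0.0 <= qoe_i <= 5.0:
        # Assumption: Table 7 defines no class outside [0, 5].
        raise ValueError("QoE_i must lie in [0, 5]")
    # bisect_right counts the thresholds that are <= qoe_i, so each
    # interval is closed on the left and open on the right, as in Table 7.
    return bisect_right(_THRESHOLDS, qoe_i)


if __name__ == "__main__":
    for value in (0.0, 0.49, 0.5, 2.5, 3.49, 4.5, 5.0):
        print(value, "->", to_acr(value))
```

Explicit thresholds are used rather than `floor(qoe_i + 0.5)` so that the interval boundaries stay exactly as tabulated and are not affected by floating-point rounding near 0.5.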

Table 8. Ranking list of the best tested models for QoE_i classification.

Model Created	Total Classification Accuracy [%]
1. k-NN	50.00
2. C&R tree	46.15
3. Neural network	42.31

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

