1. Introduction
The COVID-19 crisis has had an unequivocally strong destructive impact on tourism and travel worldwide [1,2,3,4], draining the liquidity of hotel firms, increasing overall exposure to the risk of bankruptcy, and threatening the general collapse of the sector. Predicting the impact of the COVID-19 crisis on various business areas and aspects of business is important for policymakers, stakeholders, and the public, as it focuses their attention on maintaining sustainability and avoiding general economic collapse. In addition, the COVID-19 crisis has shown that it is necessary to deepen knowledge regarding the behavior and efficiency of different prognostic models, and that the adaptation of existing models and the creation of new models are needed for application in sudden crises and volatile market conditions. Non-linear data are complicated to forecast, and it is even more difficult to forecast non-linear data with shocks [5]. In this way, knowledge acquired on the basis of COVID-19 and formulated through prognostic models is valuable for development in the fields of financial modeling, disaster management, and business analysis [6], as well as for adjusting public and governmental supportive measures to the new conditions.
Investigating bankruptcy is especially important as bankruptcy is a disruptive, costly, and enormously impactful phenomenon for both entrepreneurs and a wide range of stakeholders [7,8,9]. It can have significant destructive repercussions on economic growth in terms of job losses and the destruction of assets and the productive base [10], and can have a negative impact on the environment, given that companies subject to bankruptcy often take excessive environmental and public health risks [11]. In the COVID-19 era, the importance of studying bankruptcy has been emphasized, bearing in mind that bankruptcies are a lagging economic indicator [11] and that large-scale governmental support measures, as well as temporary adaptive measures implemented by firms, have additionally delayed the impact of the crisis on the risk of bankruptcy in already weakened local economies. Accordingly, the number of bankruptcies worldwide decreased between 2019 and 2020, with an average decrease of 12.4% [12]. The number of closed companies in the Republic of Serbia decreased by 58% over the same period, while the number of newly established companies increased by 34% [13].
The importance of studying bankruptcy is further emphasized for the tourism and travel sector (hereinafter T&T), which is both a co-creator and the main receiver of the pandemic and its ramifications [14], and has been especially plagued by uncertainties since the outbreak of COVID-19 [15]. This problem has been magnified for the hotel industry, which is inherently characterized by a high ratio of fixed to current assets, the limited marketability of its real estate, and high leverage. Financial strength is crucial for the survival of hotel companies during the COVID-19 crisis [16].
Furthermore, the topic is especially important for the Republic of Serbia for the following reasons: (1) T&T is an important factor of economic development in the Republic of Serbia [17,18,19]; (2) T&T showed a trend of growth in the years preceding the COVID-19 outbreak (for example, see [20,21,22]); (3) for the hotel industry (hereinafter HI), bankruptcies result in substantial losses for investors, hotel companies, employees, and other stakeholders, and reorganizations are hardly feasible in the Republic of Serbia [23,24].
Artificial neural networks (ANNs) are trained to make market predictions by recognizing market regularities from knowledge gained in an earlier period of time, and their effective application under changed laws and relations is questionable, especially when these changes occur suddenly and dramatically, as was the case with the COVID-19 crisis. In addition, the impact of the COVID-19 crisis on earlier predictive models, in terms of their accuracy, has been insufficiently examined in the scientific literature. Furthermore, another motive for studying the risk of bankruptcy in the HI stems from a serious gap identified in the scientific literature in terms of the number and scope of studies dealing with the application and development of models for predicting and analyzing bankruptcy risk in the HI and in the wider hospitality sector. To the best of our knowledge, only three studies have previously dealt with the analysis of the risk of bankruptcy in the HI in the Republic of Serbia, one of which analyzed this risk during the COVID-19 crisis; our study builds on that research.
In this study we perform an ex-post analysis of the performance of different ANN models in predicting the bankruptcy risk of companies from the HI in our sample, where an ANN model is defined as a combination of four components: (1) the set of input factors (parameters of the ANNs), (2) the length of the series of previous years used by the ANNs for learning, (3) the time interval with which the input factors used by the ANNs are associated, and (4) the application of the specific hybrid cascade model presented in the paper. Various indicators of model accuracy are used as performance indicators of the ANNs, while the following are taken into account within the set of input factors: bankruptcy risk zones defined by Altman's EM model and their variations, various market factors relevant to the HI, and non-financial internal indicators of the sample companies.
The goal of testing the accuracy of different ANN models is twofold: in addition to the general contribution to the knowledge base on the performance of different ANN configurations, their accuracies in predicting the bankruptcy risk zones of companies in the HI are used as indicators of the degree of change in the general business environment of the sampled companies and, indirectly, in the HI. As a result, we present 11 stability indicators for individual years, most of which are effective in identifying significant structural changes in 2020, the year of the emergence of the COVID-19 crisis.
Furthermore, we present a method for monitoring the dynamic patterns of changes in companies' tendency towards the risk of bankruptcy, for the purpose of deriving different classes of companies and identifying those for which ANNs made the largest prediction errors, which may be an indicator of their existential instability or of fraud in their financial reports.
We analyze various indicators of accuracy for different ANN configurations in the assessment of Altman's bankruptcy risk zones for a sample of 100 randomly selected non-bankrupt hotel companies in the Republic of Serbia, using data for these companies for the period 2015 to 2020. Overall, the developed ANNs assessed the risk zones of the firms in the period 2016 to 2021, and the number of ANNs whose results are directly presented in the study is 9580. Among these networks, 9060 ANNs, implemented within 635 prediction models, were used for predictive purposes, while 520 ANNs were trained for classification problems within four classification models.
Unlike the basic approach, where bankruptcy prediction is treated as a binary classification problem that predicts whether a firm will go bankrupt or not, we present an approach that classifies hotel firms into three zones of bankruptcy risk according to Altman's Z″-score model for emerging markets, giving an assessment of the risk of bankruptcy to which the firm will be exposed in the short term (up to one year) in relation to the forecast year. In the derived approach presented, classification in relation to four bankruptcy risk zones is also performed, providing an even more gradual assessment of this risk. Altman's Z″-score model for emerging markets is used as an indicator of the assessed bankruptcy risk exposure of the hotel firms. Altman's Z-score model was chosen because it represents the gold standard for assessing bankruptcy risk, its usability in risk prediction has been confirmed in a large number of studies (e.g., Soon et al. [25], Reisz and Perlich [26]), even in modern times [27,28], and its applicability has been confirmed in the economic environment of the Republic of Serbia [29,30,31].
The ANNs presented in this paper comprise two-level time-delay time-series ANNs (TDNNs) that include input time-series vectors with two levels of time lags, based on which they realize the classification of bankruptcy risk zones for particular firms in individual years. The time-series model presented in the study implies that ANNs at the input layer receive time-series datasets, where each set comprises n observations, i.e., firms, and one set corresponds to one year from the period preceding the forecast year. Each of the sets for individual years includes time-series vectors for different variables, which are referred to herein as input factors of the ANNs. The vectors of the time-series input factors describe the predispositions of firms from the previous period to form a zone in the year for which the classification is made, which can be either the year of learning or the year of forecast. Thus, the data points in the data vector for individual input factors within individual sets for particular years are the lag observations of the factors. They contain time-dependent patterns which can be learnt by an algorithm. Although TDNNs have been reported to be successful for prediction and classification [32,33,34], this concept, to the best of our knowledge, has not been previously used to predict bankruptcies in any business area.
The study was conceived around six research problems. The first objective of the study was to identify changes in the generalization abilities and precision of the set of initial ANN models, which were used to predict the bankruptcy risk zones of companies in 2018 and 2019 compared to 2020 and 2021, in order to assess the overall impact of the COVID-19 crisis in the last two years. A total of 2060 ANNs were developed and tested for this purpose.
As input factors, the ANNs used financial indicators and the indicators of Altman's Z″-score model (Z″-score, zone, change in the Z″-score), as well as a number of market and internal non-financial factors of the companies. In addition, an input clustering factor representing a company's cluster in relation to the historical development of its bankruptcy tendencies was used, where K-means cluster analysis was employed to form the clusters. In order to test the applicability of ANNs configured in this way, their goodness-of-prediction indicators in the pre-COVID-19 period (i.e., the accuracy of predictions in 2018 and 2019) were compared with the same indicators in 2020, the year characterized by the presence of the COVID-19 crisis.
The second objective of the study, i.e., the development of indicators for evaluating the stability of a year and, indirectly, the effects of the COVID-19 crisis in 2020, was inspired by three observations identified through the analysis of ANN operation in the considered period: (1) that the goodness-of-fit indicators of an ANN, when the ANN is used for evaluation purposes (i.e., when it learns from observations in the same year for which the classification of firm zones is made), can be used as indicators of differences between the states of bankruptcy risk in the previous period and the state in the year of analysis (i.e., as indicators of changes in a given year); (2) that a longer period from which the data used by an ANN originate will give better results in terms of model accuracy when no significant changes occur; and (3) that differences in the goodness-of-fit and goodness-of-prediction of ANNs used for forecasting in one year, observed relative to the accuracy of the same models used for forecasting in other years, can be used as indicators of changes in the year of forecasting.
The generalization capabilities of ANNs have attracted significant attention from the scientific community (e.g., German et al. [35], Wolpert [36], Lawrence and Giles [37], Li et al. [38], Wu and Liu [39], Neyshabur et al. [40]) and represent the ability of trained ANNs to generalize well from the learning data to new unseen data. For the TDNN ANNs presented in this paper, the generalization errors relating to the testing phase (i.e., the generalization capabilities of ANNs tested using data for the years that were used in the learning process of the ANNs) differ from the generalization errors when the same ANNs are used for forecasting in the forecast year, due to differences in the patterns of zone formation between the previous period used for learning and the forecast year. The generalization gap that is calculated on the basis of this difference, and is inferred from changes in the general accuracy of ANNs for unchanged ANN configuration factors, is interpreted in this paper as an indicator of relative changes, i.e., the stability of the forecast year. With two-level time-delay prediction networks, a smaller generalization gap means greater stability of the year over the overall analysis period. In the one-level time-delay networks (which we call “self-classification networks”) used to formulate 8 of the 11 stability indicators, the generalization gap increases with increasing stability of years, which can be attributed to the overfitting effect that this class of ANN tends to exhibit in relatively stable years.
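As an illustrative formalization only (using the model-level accuracy indicators introduced in Section 3, not a formula quoted from the original study), the gap for a forecast year Fyear can be written as GapFyear = ACClearning − MEAFyear, where ACClearning denotes the accuracy achieved over the training and testing sets in the learning phase and MEAFyear denotes the effective accuracy achieved for the forecast year. Under this reading, a smaller gap for the two-level time-delay networks indicates a more stable forecast year, while for the self-classification networks the relationship is reversed.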
Additional indicators of stability for the years were derived based on analysis of the accuracy of the networks in predictions within the three risk zones of the Altman model, as well as four derived risk zones. A special class of stability indicator was derived based on differences in overestimating and underestimating the risk zones of corporate bankruptcies, based on the assumption that ANNs will overestimate firms in the year in which negative structural economic circumstances occur, while, conversely, they will underestimate firms for years with positive development of operating factors. Stability indicators can also be used at the level of individual firms, with firms with higher fluctuations in the generalization capabilities of ANNs being considered more at risk of bankruptcy. This application is based on the conclusions of a study conducted by Dambolena and Sarkis [41], according to which the instability in financial ratios showed a significant increase over time as the corporation approached failure.
Validation of the demonstrated indicators was performed (1) by comparative analysis with the results of [41], (2) in relation to the descriptive indicators of Altman's EM Z″-score model for the hotel industry in the Republic of Serbia, (3) using the relative strength index (RSI) (see Wilder [42], Rodríguez-González [43], Rodríguez-González et al. [44], and Yao and Herbert [45]), and (4) in relation to the results of the study of Matejić et al. [6], which gave an assessment of the bankruptcy risk for hotel companies in the Republic of Serbia in the period 2016–2026.
For the third objective of the paper, the analysis of the effects of the learning period and of the preceding time horizon from which the input factors were taken served as a basis for modeling the second group of stability indicators and for assessing how the lengths of these periods affect an ANN's generalization capabilities. The fourth objective of the paper, i.e., testing the contribution of non-financial, industry, and internal indicators to the accuracy of the ANN models, was chosen to assess how these factors affect model accuracy in times of crisis. The results of the ANN models for 2019, 2020, and 2021 were compared with the results of the MDA analysis used for model feature selection, to demonstrate the differences in the results of these techniques.
In addition, we combined the basic ANN models with K-means clustering of firms based on their affiliation with risk zones in two ways. For the fifth objective of the paper (testing the usage of hybrid cascading ANN models), the clusters of the firms served as an additional input factor for the ANN models; the precisions of the hybrid cascade models obtained in this way were compared with the precisions of the basic ANN models. Clusters were formed based on the dynamics of changes in firms' zones over the years involved in the clustering process. The sixth objective of the paper was the identification of groups of companies, in terms of the development of their bankruptcy risk (i.e., dynamic risk patterns), for which the ANN models had the highest prediction error, including the application of clustering. Differences in the accuracy of ANNs for different clusters of firms in 2020 can be interpreted through potential embezzlement in financial reporting and/or through a stronger impact of the COVID-19 crisis on this class of firms.
Overall, we present a new conceptual framework for bankruptcy risk analysis at the level of individual firms, groups of firms, industries, and markets. The conceptual framework presented can potentially be used for other business areas, and in the assessment of categories of bankruptcy risk that are not strictly related to the zones of the Altman model. In this case, the classes of risk of other models, such as well-known scoring models (e.g., Conan and Holder [46], Springate [47], Fulmer et al. [48], Ohlson [49], Zmijewski [50], etc.) and classes obtained from purpose-built models that result in classification groups can be used, including empirical data on the number of bankrupt and non-bankrupt firms.
The rest of this paper is organized as follows.
Section 2 reviews the current literature on bankruptcy prediction, and specifically bankruptcy prediction for the hotel industry.
Section 3 provides the methodology used in the study.
Section 4 reports the empirical results and discussion.
Section 5 concludes the research.
2. Literature Review
Since the 1980s, a large number of scientific and professional studies have dealt with the application of ANNs in the development of bankruptcy prediction models. The reasons for this interest lie in the advantages of neural networks as techniques for developing predictive models, such as: (1) representation of non-linear dependencies, (2) application regardless of the type of distribution of the training set and input parameters, (3) robustness to sampling variations in overall classification performance [51] and resistance to noisy data [52], (4) fair generalization capabilities, and (5) relatively higher accuracy of ANNs compared to other classification techniques. Their ability to represent non-linear relationships makes them well-suited to modeling the frequently non-linear relationship between the likelihood of bankruptcy and commonly used variables (i.e., financial ratios) [53].
However, the main disadvantage of ANN applications is that there are no clear and universally applicable rules for configuring ANNs to obtain acceptable classification accuracy. This problem is further complicated by the fact that ANN configurations include a number of technical and functional parameters, such as training methods; learning parameters such as momentum, number of epochs, and learning rate; number of hidden layers and number of nodes; input and output activation functions; rescaling methods for covariates; type of training and optimization algorithms; input factors; learning periods; etc. As a result, in many cases it is necessary to test an enormous number of configurations in order to find ANNs with adequate accuracy in classification and prediction. Another problem in the implementation of ANNs arises from the large number of different types of precision, i.e., network errors that can be used in their evaluation, and of methods for minimizing the errors in back-propagation algorithms. In addition to common error types such as Type I and Type II errors, ANN accuracy can be measured at the level of the training set, the testing set, or both, and this error does not have to correspond to the prediction error in actual use in a given problem area. Third, ANNs do not have intuitive semantics and do not reveal acquired knowledge about dependencies within the problem area for which they are trained, i.e., users do not have the ability to understand the rules generated by ANNs to represent the problem [54].
Among the basic ANN methods, the most commonly used is the multilayer perceptron (MLP), which is a class of feed-forward artificial neural network that uses a supervised learning algorithm for training, namely the back-propagation of errors. MLPs are affected by a number of limitations (e.g., local minima and over-adaptation to samples) and their predictive power depends on a number of parameters [55].
The first group of studies investigating the application of ANNs in predicting bankruptcies compared the precision achieved by applying this classification technique with that of other techniques, and demonstrated the good relative predictive abilities of ANNs, with the majority of this research consisting of comparative analyses of ANNs and multiple discriminant analysis (MDA) (Table 1):
Another important research area in the field of ANNs relates to improving the accuracy of ANN models in bankruptcy prediction studies. A large number of studies in this domain have dealt with the input factors (parameters) for which ANNs have the best results (for example, Liang et al. [63] (financial ratios, board structure and ownership structure, net profit/income), Becerra-Vicario et al. [64] and Kovacova [65] (the country of origin of a company)), while other papers have dealt with the length of the time horizon (Fathi and Anis [60]), models for the selection of input factors (Sun et al. [66]), etc.
In recent years, special attention has been focused on combining multiple classifiers and the development of hybrid ANN models, i.e., on the development of advanced machine learning techniques (Table 2).
The third group of studies deals with the adaptation of ANNs and the analysis of the effectiveness of ANNs in different business areas (Table 3).
However, there is a paucity of papers that have used ANNs for prediction and analysis under COVID-19 conditions. Bernardi et al. [79] used a logit model and an ANN to simulate the effect of the COVID-19 crisis on the bankruptcy risk of firms in Brescia in 2020, and showed that these models are less reliable under COVID-19 conditions. Chandra and He [80] showed that, due to high volatility in stock prices during COVID-19, it is more challenging to provide forecasts through the use of ANNs.
Although ANNs have received a great deal of attention from the scientific and professional communities in recent years because of their superiority over traditional techniques for forecasting the demand for hotel services [81], prices, and the habits of tourists (for example, Zhang et al. [82] and Song and Li [83]), a limited number of studies have dealt with predicting bankruptcies in this sector using ANNs or other machine learning techniques. ANN algorithms have mostly been applied to longitudinal data in the form of time series in forecasting tourism and hospitality demand, including travel, hotels, transportation, and consumers' values and satisfaction [84]. In this field, forecast accuracy in assessing demand has been shown to depend on the country of origin and the forecasting horizon [85], error measures, data frequency, the competitive set, data-generation processes [86], and data trend patterns [81]. Huang et al. [5] adapted a neural network fuzzy-based time-series model for forecasting Taiwan's tourism demand, where the relationship between two successive values in a time series was established in terms of degrees of membership and the relationships were assessed only over two successive years. Their forecasting model outperformed other models in forecasting tourism demand during the SARS event. Teixeira and Fernandes [87] used an ANN to predict tourism time series, where tourism revenue and total overnight stays were monitored for hotels in the north region of Portugal.
In the field of hotel bankruptcy prediction, Kim [88] showed that an ANN was more accurate, with smaller estimated relative error costs, than an SVM. Young and Gu [89] formulated models for predicting the risk of hotel bankruptcies in Korea using logistic regression and an ANN, for a set of different financial variables and a sample of 102 hotels, and showed that ANN techniques produce better results than logit in terms of accuracy (81.8% accuracy for the ANN). Li and Sun [90] built models for predicting hotel insolvency using standard statistical techniques as well as an MLP and a support vector machine (SVM), applied to various financial indicators (including indicators of certain tourism activities) for a sample of 23 Chinese hotels, and showed that the ANNs produced better results than the statistical techniques when data from two years before the bankruptcy were used. Fernández-Gámez et al. [91] trained a probabilistic neural network and MLP networks on equal samples of bankrupt and non-bankrupt hotels, including 22 input financial and non-financial factors, and showed that MLPs have greater precision than the second model and than traditional statistical techniques. Park and Hancer [92] compared the precision of ANNs and logit using a sample of hotels, restaurants, and entertainment service companies and confirmed the superiority of ANNs.
A slightly larger number of papers have addressed parameters relevant to the prediction of hotel bankruptcies; however, these parameters have not been combined with advanced prediction techniques such as ANNs. Among the factors found to be important for predicting bankruptcies in the HI are the following: EBITDA [91], the sex of the CEO [93], size, location [94], fluctuations in customer demand [95], financial indicators included in Altman's Z-score model [96], financial leverage variables [97], etc.
Some studies have dealt with the application of scoring techniques, especially Altman's models, in predicting the risk of bankruptcy in HI (Altman's models: Diakomihalis [98], Goh et al. [96]; Springate and Grover models: Kesuma et al. [99]). Statistical techniques have also been directly used in predicting the risk of bankruptcy in the T&T sector. Gu and Gao [100] published the first study predicting bankruptcy one year prior to bankruptcy in American hotels and restaurants, using an MDA model. Pacheco [97] used MDA and logit models in SME Portuguese hotels and restaurants.
The number of studies dealing with the risk of bankruptcy in the pre-COVID-19 period in the hotel industry in the Republic of Serbia is insufficient. Milašinovic et al. [31] applied the Z″-score model to a sample of seven hotels in the Republic of Serbia, and their results were later confirmed in practice: within the identified group of risky companies, one went bankrupt while the other two withdrew their shares. Mizdraković et al. [23] applied Altman's Z′- and Z″-scores to the hotel industry of the Republic of Serbia for the period 2008–2012. When 2008 and 2011 are compared, the average Altman scores record a decrease of approximately 70%, and the other scores confirmed the same results. The authors of [23] also point out that further studies might focus on formulating a bankruptcy prediction model for Serbian hotels.
When it comes to studies dealing with the impact of the COVID-19 crisis on the risk of bankruptcy in the HI, to the best of our knowledge only three papers can be found in the scientific literature in which models are applied to the newly emergent conditions. In a study by Wieprow and Gawlik [101], the MDA technique and logit models were used to assess the risk of bankruptcy of companies in the Polish tourism sector under the crisis conditions caused by the COVID-19 pandemic in the first half of 2020. Theirs is the first paper to provide a critical evaluation of this kind, and neural networks are mentioned therein. The results of the study presented in [16] indicate that financial strength will be crucial for the survival of hotel companies under the conditions of the COVID-19 crisis. That study predicts that 25% of firms will face a financial distress situation if revenues fall by 60%, while the percentage of such firms will be 32% if revenues fall by 80%. Matejić et al. [6] assessed the impact of the COVID-19 crisis on the bankruptcy risk of hotel companies in the Republic of Serbia, covering the period to 2026 and showing the results of six novel structural models, which were tested for Altman's EM, Springate, and Zmijewski scores. In that paper, novel zonal dynamic indicators were introduced and used for monitoring the industry's dynamism with regard to changes in the risk zones of Altman's Z″-score model and for analyzing the trend in bankruptcy risk conditions. These indicators can be used for assessing the stability of the periods, but they do not include the application of ANNs. The authors also state that the COVID-19 crisis has shown that it is necessary to deepen knowledge regarding the behavior and efficiency of different prognostic models, and that the adaptation of existing models, and the creation of new ones, are needed.
Overall, the literature analysis has indicated three significant gaps in the scientific literature. Firstly, an insufficient number of studies have focused on predicting the risk of bankruptcy solely for hotel firms, and although the success of ANNs in predicting the risk of bankruptcy has been confirmed in this domain, the applications of various network input factors, their configurations, and hybrid techniques in this sector remain insufficiently researched. Secondly, there is a gap in the literature in the field of critical evaluation and adaptation of models for predicting and analyzing bankruptcies in the context of COVID-19 and crisis conditions in general. Third, when it comes to the Republic of Serbia, only three papers have dealt with assessing bankruptcy risk in the industry. The research presented in this paper is related to the results of Matejić et al. [6], the difference being that it is focused only on ex-post analysis, the usage of ANNs, and the derivation and validation of stability indicators, for which conditions of crisis are valuable grounds for research. This paper is a logical continuation of [6], although it does not deal with the prediction of bankruptcies in the coming period; instead, it discusses in detail various aspects of the application of ANNs for the hotel industry in the Republic of Serbia.
3. Research Methodology
In this study we used data from a sample of 100 randomly selected hotel companies within the I-Accommodation and Catering Services sector for the period 2015 to 2020. The research was conducted based on the financial statements of the companies available on the website of the Agency for Business Registers. At the time of writing, no financial statements for 2021 were available, so no analysis of the factual Z″-scores for that year has been performed.
The MLP method with standardized rescaling of covariates was used for the ANNs, whereby batch-type training and the scaled conjugate gradient optimization method were applied; ANN training was performed on 70% of the sample, while 30% was used for testing. The MLP paradigm was chosen because 95% of ANN implementations used in the business domain use this method [102]. IBM SPSS Version 27 was used, with the following rules for training: maximum steps without a decrease in error: 1; data used for computing prediction error: both training and test data; maximum training time: 15 min; maximum training epochs: automatic; minimum relative change in training error: 0.0001; minimum relative change in training error ratio: 0.001. We applied the principle of including in the results only those ANNs for which 100% of the sample was included in network testing and training, in order to show the reference precision of the ANNs.
When it comes to network topology, multiple attempts with different topologies showed that the best results are obtained using a configuration with one hidden layer; the hyperbolic tangent input activation function and the softmax output activation function were used in all ANN models developed and tested. Hidden layers enable ANNs to generalize. Increasing the number of hidden layers increases the overfitting risk for ANNs, which can lead to poor forecasting abilities and can increase the time needed for training. The overfitting phenomenon is connected to the number of weights (number of hidden layers and nodes) in ANNs, meaning that a larger number of weights relative to the number of observations in the training set gives the ANN a greater ability to adapt to the residuals and idiosyncrasies of the observations and degrades its ability to generalize [103]. As a result of the autocorrelations in a time series, the number of input nodes, or lagged observations, used in the neural networks is often more important than the number of hidden nodes [51]. For the ANNs presented in this paper, the number of hidden nodes varies from 3 to 11; the number of hidden nodes was chosen for each ANN model separately, with the goal of maximizing the accuracy of the ANNs in the testing phase.
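The networks in this study were configured and trained in IBM SPSS. Purely as an illustrative sketch (not the authors' implementation), a roughly comparable setup can be expressed in Python with scikit-learn; note that scikit-learn does not provide SPSS's scaled conjugate gradient optimizer, so the "lbfgs" solver is substituted here, while the standardized rescaling, the 70/30 split, the hyperbolic tangent hidden-layer activation, and the 3–11 hidden-node search follow the description above (the arrays X and y are assumed inputs):

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# X: input factors, y: Altman risk zones (1, 2, 3) -- hypothetical arrays.
def train_best_mlp(X, y):
    X = StandardScaler().fit_transform(X)          # standardized rescaling of covariates
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=0)
    best, best_acc = None, -1.0
    for hidden in range(3, 12):                    # 3 to 11 hidden nodes, as described above
        net = MLPClassifier(hidden_layer_sizes=(hidden,),
                            activation="tanh",     # hyperbolic tangent in the hidden layer
                            solver="lbfgs",        # substitute for SPSS scaled conjugate gradient
                            max_iter=2000, random_state=0)
        net.fit(X_tr, y_tr)                        # MLPClassifier applies softmax for multi-class output
        acc = net.score(X_te, y_te)                # accuracy on the 30% testing set
        if acc > best_acc:
            best, best_acc = net, acc
    return best, best_acc
```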
In the research, different ANN models were tested; these are described in the paper as a combination of the following basic components: (1) the group of input factors of the model, i.e., the features of the ANN model (model of entering factors, MEF); (2) the network learning period (learning period, LP), which denotes the number of years preceding the year of the effective classification from which observations were used for the ANN learning process; (3) the length of the period from which the input factors used by the ANNs are taken (factors period, FP); and (4) the application of the input factor of the cluster of company zones (C), in order to provide a forecast of the zone of the Altman model in which the company will find itself within a one-year period. The derived components of each model are the total period of analysis, i.e., the sum of LP and FP, which is denoted as the range of analysis (ROF), and the MEF class (MC ∈ [1, 3]), where the models are divided into three classes in relation to the MEF, depending on whether they include only financial indicators of firms provided on the basis of Altman's model (Class 1) or also include market (Class 2) and non-financial internal (Class 3) factors. In some analyses, where the aim was to draw general conclusions regarding MEF classes, MC was used as a model component instead of the MEF component. In this way, each model can be expressed as the combination (LP).(FP).(MEF).(C). Among all models, two classes of reference models were singled out: the RM19–21 class represents a set of common models for the period 2019–2021, while RM18–21 represents a set of common models for the period 2018–2021. The reason for separating these subsets is that in the years after 2018 the availability of chronological data increased, as longer FP and LP periods became available, so including all models in certain segments of the analysis would combine the effects of different applied models.
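To make the (LP).(FP).(MEF).(C) notation concrete, the following minimal sketch (purely illustrative; the paper itself uses only the dotted label) shows one way such a model description could be represented:

```python
from typing import NamedTuple

class ANNModel(NamedTuple):
    lp: int    # learning period: years of actual zones used for training
    fp: int    # factors period: years from which the input factors are taken
    mef: int   # model of entering factors (input-factor set identifier)
    c: int     # 1 if the cluster input factor is used, 0 otherwise

    def label(self) -> str:
        return f"{self.lp}.{self.fp}.{self.mef}.{self.c}"

    @property
    def rof(self) -> int:
        # range of analysis: the total look-back, i.e., the sum of LP and FP
        return self.lp + self.fp

m = ANNModel(lp=2, fp=2, mef=3, c=0)
print(m.label(), m.rof)   # "2.2.3.0" 4
```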
The basic approach in bankruptcy prediction is to treat bankruptcy as a binary classification problem, where the target (output) variable of the models is commonly a dichotomous variable in which “firm filed for bankruptcy” is set to 1 and “firm remains solvent” is set to 0. In contrast to this approach, the ANNs presented in this paper classify hotel firms into three zones of bankruptcy risk according to Altman's Z″-score model for emerging markets, so the predictive ANNs presented in the study provide an assessment of the categorical risk of bankruptcy to which the firm will be exposed in the short term, bearing in mind that Altman's model is most effective in short-term forecasts. Thus, the original Altman Z-score model has an accuracy in predicting bankruptcies of between 82% and 94% over a six-month period, 72% over an 18-month period, 48% over a three-year period, and 29% over a four-year period [104].
Altman’s Z-score models are credit scoring techniques based on multiple discriminant analysis (MDA), which provides metrics for classifying and predicting the health of companies based on a set of financial indicators. The models classify firms into three risk zones, i.e., classes of different exposure to bankruptcy risk based on the value of the variant of the Z-score used for the year under analysis and on discriminant factors (cut-off values) set by the models. The zones include the following: third zone (no-risk zone, safe zone, green zone), second zone (moderate risk zone, gray zone), and first zone (high-risk zone, distress zone, red zone). We use the Z″-score model (1995) and the original Z-score model’s adaptation for non-manufacturing activities and emerging markets, whose formula can be found in Srebro et al. [
30].
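For illustration, a minimal sketch of the three-zone mapping is given below; the cut-off values 1.1 and 2.6 are the ones commonly cited for the Z″-score model and are assumed here, while the exact formula and cut-offs applied in this study are those given in Srebro et al. [30]:

```python
def altman_zone(z_double_prime: float) -> int:
    """Map a Z''-score to a risk zone.
    Cut-offs 1.1 and 2.6 are the commonly cited Z''-score thresholds (assumed here).
    Returns 1 = distress (red) zone, 2 = gray zone, 3 = safe (green) zone."""
    if z_double_prime < 1.1:
        return 1
    if z_double_prime <= 2.6:
        return 2
    return 3

print(altman_zone(0.8), altman_zone(1.9), altman_zone(4.2))  # 1 2 3
```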
In the derived division into four risk zones, an even more gradual assessment of the risk of bankruptcy is given in relation to Altman's model. This approach cannot be found in the earlier scientific literature [6] and has the advantage, when compared to binary classification, that it gives a more gradual assessment of the risk of bankruptcy, while its disadvantage is that it depends on the discriminant values defined through Altman's model for determining the risk of bankruptcy.
The accuracies of all models for individual years in the period 2018–2021, which were used to predict one of the three zones of the Altman EM model, are presented in Appendix A (Table A2, Table A3, Table A4, Table A5 and Table A6), and the results of the models for classification into one of the four risk zones are presented in the same section (Table A7 and Table A8). The obtained values of the different indicators of ANN accuracy are also presented in Appendix A (Table A2, Table A3, Table A4, Table A5, Table A6, Table A7 and Table A8).
The ANNs presented in this paper are two-level time-delay time-series ANNs that include input time-series parameters with two levels of time lags, based on which they realize the classification of bankruptcy risk zones for particular firms in individual years. In practice, this means that the presented ANN model generates one-year forecasts (Fyear), which in the time-series formulation of forecasting ANNs can be used as discrete input parameters for ANNs for forecasts in future years {Fyear+1, ... Fyear+2}.
Time-delay neural networks (TDNNs) are a paradigm presented by Waibel et al. [105], and can be seen as a special structure of recurrent neural networks, where such networks at the input layer receive input vectors of time-series data with constant sampling rates and these vectors overlap in their segments. Since TDNNs are able to use the dynamics of a system and to forecast the outputs at the current time, TDNNs have typically been reported to be successful for prediction and classification [32,33,34,106]. TDNNs have been shown to be more accurate than traditional time-series forecasting models, such as the ARIMA model [34]. However, as far as we know, this useful approach has not previously been used in studies dealing with the prediction of bankruptcies in any business area.
The time-series model presented in this paper does not pre-select the input parameters for the ANN, as is the case in classical machine learning approaches to the classification of time-series data, which assume that time series are modeled as a generative process [107] by assuming a certain time-series model, such as the autoregressive model [108] or the hidden Markov model [109]. Instead, a brute-force analysis of the impact of different time-series data for different input factors of the ANNs on their accuracy was carried out.
A two-level time-delay time-series ANN model implies that the ANNs at the input layer receive multidimensional time-series datasets, i.e., batches, where each set has time-series vectors for x input factors and n firms from the sample and corresponds to a year from the interval [Fyear−i, Fyear], whereby Fyear represents the year in which the ANN will predict the firm's zone (forecast year), while the interval [Fyear−i, Fyear−1] represents the set of years preceding the year for which the ANN predicts the firm's zone; we refer to this as the “learning period” (hereinafter LP). Thus, the LP represents the number of previous years in relation to the forecast year for which the ANN had data on the actual realized zone and used these data during the learning process.
Therefore, within the set of input batches B, there are two classes of batches: BL ⊂ B, which includes the batches for which i > 0 and which receive, as an additional input factor, a zone indicator in Fyear−i (ZoneFyear−i); and BF ⊂ B, which includes the one batch for which the ANN is used to forecast the risk zone, for which i = 0 and no additional indicator of the zone is used. Therefore, the first layer of time delay refers to the delay between the batch for which the zone is forecast and the batches used for network learning.
In the second layer of time delay, each of the input batches includes time series for different variables, which are referred to as input factors, and these time series describe the predispositions of companies from the previous period to form a zone in the year for which classification is performed, which can be either a year of learning or a year of prediction. Thus, the data points in the data vector for individual variables within individual sets for particular years are the lag observation variables that originate from a period at least 2 years earlier than the year of prediction. They contain time-dependent patterns within different sets of years, which can be learnt by the ANN algorithm, i.e., its hidden layer. The set of all input factors, which may have different time-series intervals, is referred to in the paper as the model of entering factors (MEF) and can be expressed as the union, over all input factors x, of the lagged series {xFyear−i−1, xFyear−i−2, …, xFyear−i−jx}, where x denotes a specific factor, Fyear denotes the year of prediction, jx represents the length of the time-series interval of the factor x, and parameter i denotes the number of years preceding the prediction year for the corresponding batch. By taking the maximum of jx over all factors, we obtain the period from which the input factors originate, which is referred to herein as the factors period (hereinafter FP).
Thus, each batch uses the same MEF, which we refer to as the MEF of a model, while the time-series lengths of the vectors for input factors and the combinations of factors used are set so that they are the same for individual factors across different batches, although these lengths may differ among different factors and may, for individual factors, take values up to FP, which represents the maximum number of years preceding the year of classification from which the given input factor originates. In this sense, the maximum delay of the MEF model in relation to the forecast year has the value LP + FP. This period is referred to as the range of analysis (ROF), which represents the sum of the LP and FP periods.
The overlap of segments in the ANN time series is significant because it provides a representation of nonstationary features at different time positions [110]. In the basic conceptual model of two-level time-delay time-series ANNs that we present (Figure 1), overlapping has been achieved through the intersections of the datasets of individual years. Thus, for example, with the model LP2020 = 2, FP = 2, the network is prepared to predict the zone in 2020 by learning from the examples of 2018 and 2019, i.e., based on the sets for 2018 and 2019, which include, in addition to data on the realized zones of companies in these years, the input factors originating from 2016 and 2017 in the first case, and from 2017 and 2018 in the second case.
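A minimal sketch of how the overlapping batches in the LP2020 = 2, FP = 2 example could be assembled is given below (illustrative only; the firm-by-year zone and factor tables and their values are invented):

```python
import pandas as pd

# Hypothetical firm-by-year tables (rows: firms, columns: years).
years = [2016, 2017, 2018, 2019, 2020]
zones   = pd.DataFrame([[3, 3, 2, 2, 1], [1, 1, 1, 2, 2]], index=["A", "B"], columns=years)
factors = pd.DataFrame([[3.1, 2.9, 2.0, 1.9, 0.8], [0.7, 0.9, 1.0, 1.2, 1.3]],
                       index=["A", "B"], columns=years)

def build_batch(class_year: int, fp: int, with_target: bool) -> pd.DataFrame:
    """One batch per classification year: FP lagged factor values per firm,
    plus the realized zone as target for learning batches (BL)."""
    batch = pd.DataFrame({f"factor_lag{k}": factors[class_year - k] for k in range(1, fp + 1)})
    if with_target:
        batch["zone"] = zones[class_year]      # known zone -> learning batch (BL)
    return batch

# LP2020 = 2, FP = 2: learning batches for 2018 (factors from 2016-2017) and 2019
# (factors from 2017-2018), plus one forecast batch (BF) for 2020 (factors from 2018-2019).
bl_2018 = build_batch(2018, fp=2, with_target=True)
bl_2019 = build_batch(2019, fp=2, with_target=True)
bf_2020 = build_batch(2020, fp=2, with_target=False)
print(bl_2018, bl_2019, bf_2020, sep="\n\n")
```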
A special class of ANN models presented herein, used to determine the stability indicators of a year, are ANNs that have been trained on their own example, so that they have only one layer of time delay. In this case, the ANNs have an LP of 0 and one batch of the BL type on the input layer, such that the year of learning for the batch is equal to the year of prediction of the model. ANNs with such a model are designated in the paper as ANNs for self-classification (hereinafter ASC). For ASCs, bFyear ∈ BL, such that i = 0. The ASCs presented in the paper applied different FP periods and one MEF model, did not apply the cluster factor, and are marked as (LP = 0).(FP).(MEF = 3).(C = 0).
The available ROF determines the maximum value of the sum of LP and FP, so that the choice of one component determines the possible values of the other component of the model: LP ∈ [0, ROF], FP ∈ [1, ROF], and (LP + FP) ≤ ROF. For the ANN models presented herein, the available period from which the input factors originated covered 2015 to 2020, while the networks learned based on the prediction of zones by year for the period 2016 to 2020. For the range of years with available data, the last year for which data on company zones are available is 2020, and the maximum network learning period is 5 years. Analogously, the FP parameter took values from the interval [1, 5]. For example, for LP2021 = 1, an ANN was trained on 2020 to forecast zones in 2021, while for LP2021 = 4, with the same goal, the network learned based on a period of 4 years starting with 2016. An indicator analogous to the LP indicator is n, the number of observations used for training, so that, for example, a learning period of one year corresponds to the size of the sample used, n = 100.
The input factors constituting the MEF component of the model (presented in Table 4) were tested within MEF Class 1, i.e., time-series vectors for the Z″-score of the company in the FP period (Z″year−i), the zone in which the company found itself in the FP period (Zoneyear−i), and the change in the Z″-score compared to the previous year in the FP period (ΔZyear−i). Among the Class 2 indicators, market indicators for the analyzed period were considered: the average Z″-score of the hotel sector in the FP period (Z″ind−i), the percentage contribution of tourism and hospitality to GDP in the FP period (%GDPyear−i), GNI per capita in the FP period (GNIPC/year−i), the projected value of the Z″-score of the market for the year for which the forecast is made (Fyear(Z″ind)), the long-term provision costs of the accommodation sector (LPCyear−i), and the long-term liabilities of the accommodation sector (LLSyear−i). Models with market factors were not applicable to models with a learning period of 1 year, as in this case they have a constant value for the observations in a sample. The previous market indicators were chosen because they are based on Altman's model or reflect the effects of the COVID-19 crisis quickly enough, as is the case with %GDPyear−i, GNIPC/year−i, LPCyear−i, and LLSyear−i. The Z″-score of the HI in the FP period was calculated as the arithmetic mean of the Z″-scores of the companies in the sample for each of the years in the FP period, while the predicted value of the Z″-score for the year for which the forecast was made, i.e., Fyear(Z″ind), was provided by applying a multiple regression model. The values of the market factors used at the input layer of the ANNs are shown in Table A1.
In order to generate the values for Fyear(Z″ind), various financial and non-financial indicators of the HI, obtained from BRA (Business Registers Agency) reports, were tested as candidate factors in order to build a model at a 95% confidence level; the selected factors were Z″ind/year−1, ΔZ″ind/year−1, LLSyear−1, and LPCyear−1. The model had a perfect fit for the analyzed data (R² = 1, α = 0.05) and predicted that the industry Z″-score in 2021 would be 7.618. The multiple regression model we used therefore has the form Fyear(Z″ind) = β0 + β1·Z″ind/year−1 + β2·ΔZ″ind/year−1 + β3·LLSyear−1 + β4·LPCyear−1.
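The study reports the selected factors and the fit statistics of this regression but not its coefficients; as an illustration only, such a model could be fitted as follows (the DataFrame and its values are hypothetical, and the estimated coefficients depend entirely on the underlying industry data):

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical yearly industry-level data; column names and values are illustrative.
industry = pd.DataFrame({
    "z_ind_lag1":  [7.9, 8.1, 8.0, 7.7, 7.5, 7.6],        # Z''_ind/year-1
    "dz_ind_lag1": [0.2, 0.2, -0.1, -0.3, -0.2, 0.1],      # change in Z''_ind/year-1
    "lls_lag1":    [1.10, 1.15, 1.22, 1.31, 1.40, 1.38],   # long-term liabilities, year-1
    "lpc_lag1":    [0.05, 0.06, 0.06, 0.07, 0.08, 0.08],   # long-term provision costs, year-1
    "z_ind":       [8.1, 8.0, 7.7, 7.5, 7.6, 7.6],         # target: industry Z''-score
})
X = sm.add_constant(industry[["z_ind_lag1", "dz_ind_lag1", "lls_lag1", "lpc_lag1"]])
model = sm.OLS(industry["z_ind"], X).fit()
forecast = model.predict(X.iloc[[-1]])   # one-step-ahead style projection
print(model.rsquared, float(forecast.iloc[0]))
```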
Furthermore, the non-financial internal indicators of the firms included in the MEF models were as follows: the age of the company (AGE); the type of business structure of the company (Type ∈ {LLC, Corp}), where LLC stands for Limited Liability Company and Corp stands for Corporation; the country of origin of the main owner (ORIG ∈ {Dom, For, ChangedtoFor}), where Dom stands for domestic, For stands for foreign, and ChangedtoFor stands for changed to foreign in the period 2016 to 2019; and the type of distribution of the ownership rights (DIS ∈ {Dom, Dis, ChangedToDis, ChangedToDom}), where Dom denotes dominant (51%) ownership by one entity, Dis stands for distributed, ChangedToDis denotes a change to distributed ownership within the period 2016 to 2019, and ChangedToDom denotes a change to dominant ownership within the same period. In addition to the previous factors, which took the last official value for each company, the models with MEF = 17 and MEF = 19 took into account data on the number of employees in the past year (NEyear−1) and the change in the number of employees in the past year (ΔNEyear−1), where the first model was available starting in 2020 while the second was available only in 2021, given that, at the time of writing, data on the number of employees in the sampled companies were available only for 2018 to 2020. Through the models with MEF = 12 and MEF = 20, the indicator RSTAByear−i is implemented; this is one of the stability indicators introduced herein.
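SPSS treats categorical factors natively; purely as an illustration of how the categorical internal indicators listed above could be prepared for an ANN outside SPSS, a one-hot encoding sketch is given below (the firm records are invented, and the category labels follow the definitions in the text):

```python
import pandas as pd

firms = pd.DataFrame({
    "AGE":  [12, 5, 30],                              # numeric, left as-is
    "Type": ["LLC", "Corp", "LLC"],
    "ORIG": ["Dom", "For", "ChangedtoFor"],
    "DIS":  ["Dom", "Dis", "ChangedToDis"],
})
# One-hot encode the categorical indicators for use as ANN input factors.
encoded = pd.get_dummies(firms, columns=["Type", "ORIG", "DIS"])
print(encoded.columns.tolist())
```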
A total of 28 baseline MEF models were initially developed, of which 8 models with the lowest ANN zone-prediction accuracy were eliminated; these models are not shown in the study results. We present the results of 635 ANN prediction models trained for the forecast period from 2018 to 2021, with a total of 8890 implemented ANNs; 4 models used for self-classification ANNs, with a total of 280 ANNs that classified firms into the three Altman risk zones and 240 classification ANNs that classified firms into 4 custom risk zones; and an additional 170 ANNs trained to forecast firms in 4 risk zones based on a single input-factor model. Thus, in total, the results of 9580 ANNs are demonstrated herein, while the number of networks trained in preparation for the research is at least as great.
We also tested a hybrid cascade ANN model, whereby the ANN models tested in this research received, as additional input factors, the cluster memberships of the firms, describing dynamic patterns of changes in firm zones in the period preceding the year of observation used by the models. The idea of applying the cluster was to classify companies into groups, for each year used by the models for learning or forecasting, according to their predispositions during the year for classification into a particular zone, expressed through the dynamics of changes in the zones of a given company in the previous period (the dynamic pattern). Next, the obtained cluster for each company was added to each ANN model as a parameter of the input layer, and we analyzed the precision of the ANNs obtained in this way and the differences in relation to the precision of the same models obtained without cluster application. In this way, the continuity of changes in the zones of companies and the dynamism of these changes were approximated through dynamic patterns represented by the clusters. Clustering was performed for the sets KFyear of dynamic patterns DP, where DP indicates a dynamic template, Fyear indicates the forecast year, ROF represents the analysis range, and KFyear denotes the set that is clustered for the individual forecast years for which the networks were trained, with the clustering process resulting in cluster values for each firm and each year in the period [Fyear−i, Fyear].
In this part of the analysis, a series of experiments was performed in which the precision of different types of clusters was analyzed depending on (1) the number of formed clusters, (2) the value of parameter i, and (3) the value of parameter j. When choosing the number of clusters to be produced by the clustering, in addition to the precision of the ANNs, the sizes of the groups of companies within each cluster were taken into account, so that the numbers of non-empty clusters were selected under the criterion that no cluster should have a size exceeding 50% of the sample.
The clustering was performed using the K-means cluster analysis iterate-and-classify method, with a maximum of 10 iterations. For each year, the three cluster input factors for which the ANN models had the highest precision compared to other cluster input factors were selected, and the cluster input factors obtained in this way were marked as CFyear(NC/j), where NC denotes the number of groups within the clustering. Thus, in the networks prepared for forecasting in 2019, the following clusters were applied: C2019(8/3), which included 8 clusters and data on the respective zones of companies in the previous 3-year period in relation to the year for which the clustering was performed, and provided clustering of companies in 2018 and 2019; C2019(8/2), which included 8 clusters and data on the affiliation of zones in the previous 2-year period in relation to the year of clustering, and provided clustering of companies for the period 2017 to 2019; and C2019(3/1), which included 3 clusters and data on the affiliation of zones in the previous year compared to the year of clustering, and provided clustering of companies in the period 2016 to 2019. For ANNs prepared for forecasting in 2018, the clusters C2018(8/2) and C2018(3/1) were similarly used. For forecasting networks in 2020 we used the clusters C2020(8/3), C2020(8/2), and C2020(3/1) (Table 5), while for forecasts in 2021 the clusters C2021(8/3), C2021(8/2), and C2021(3/1) were used.
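A minimal sketch of how such dynamic-pattern clusters could be produced with K-means, including the criterion that no cluster exceeds 50% of the sample, is given below (illustrative only; the authors used SPSS's iterate-and-classify K-means with a maximum of 10 iterations, and the zone histories here are invented):

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_dynamic_patterns(zone_history: np.ndarray, n_clusters: int, seed: int = 0) -> np.ndarray:
    """zone_history: firms x years matrix of realized risk zones (1-3).
    Returns a cluster label per firm; raises if any cluster holds more than 50% of the sample."""
    km = KMeans(n_clusters=n_clusters, max_iter=10, n_init=10, random_state=seed)
    labels = km.fit_predict(zone_history)
    _, counts = np.unique(labels, return_counts=True)
    if counts.max() > 0.5 * len(labels):
        raise ValueError("A cluster exceeds 50% of the sample; try a different n_clusters.")
    return labels

# Example: zone histories over the previous 2 years for 8 hypothetical firms, 3 clusters.
history = np.array([[3, 3], [3, 3], [3, 2], [2, 2], [2, 2], [1, 1], [1, 2], [1, 1]])
print(cluster_dynamic_patterns(history, n_clusters=3))
```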
Common practice for ANNs is to divide time-series sets into three distinct sets: training, testing, and validation (out-of-sample) sets. The training set is used by the ANNs to learn the hidden patterns present in the data, while the testing set is used to evaluate the generalization capabilities of the ANNs in the pre-prediction phase and as a criterion for selecting the best-performing ANN. The validation set is used as a final indicator of the performance of the trained ANN. The ANNs presented herein use 70% of the observations from the input time-series vectors as the training set, and the rest of the observations as the testing set. The testing set is randomly selected from the input vectors; in this way, the danger of using a test set characterized by one type of market conditions is largely avoided [103]. The validation set for the ANNs consists of observations that chronologically follow the training set, and includes observations for the forecast years for which data on the factual zonal memberships of the firms are available. In this way, testing the accuracy of the ANNs on the validation set demonstrates their forecasting ability and generalization capabilities in the prediction phase.
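A brief sketch of this split logic (a random 70/30 training/testing split within the learning-period observations, with the forecast-year observations held out as the chronologically later validation set; array names are illustrative):

```python
from sklearn.model_selection import train_test_split

def make_sets(X_learning, y_learning, X_forecast, y_forecast, seed: int = 0):
    """Random 70/30 training/testing split of the learning-period observations;
    the forecast-year observations form the chronologically later validation set."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_learning, y_learning, test_size=0.30, random_state=seed)
    return (X_tr, y_tr), (X_te, y_te), (X_forecast, y_forecast)
```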
In this paper we use a dedicated indicator of ANN accuracy (ACC) that represents the average accuracy in the training and testing phases and is calculated based on the average errors in the two phases, where ACC ∈ [0%, 100%], PIPTEST represents the percentage of incorrect predictions for the testing set, PIPTRAIN represents the percentage of incorrect predictions for the training set, NTRAIN represents the total number of companies in the training set, NTEST represents the number of firms in the testing set, FPZTRAIN represents the number of false positives for the training set for zone Z, and FPZTEST represents the number of false positives for the test set for zone Z.
The reason for using a dedicated indicator stems from the fact that SPSS reports the data on ANN accuracy in the form of the percentage of incorrect predictions for the sets used, so that the calculation of standard accuracy indicators such as accuracy, sensitivity, specificity, and Type I and Type II errors for the three groups of results, depending on the zones, would be laborious and would limit the possibilities for testing a large number of ANNs. Secondly, the stability indicators presented herein are derived from the model generalization errors, and at the same time the dedicated accuracy indicator includes generalization errors in the pre-prediction (learning) phase through the inclusion of errors in both the testing and training phases. In fact, the obtained values of the precision indicator are lower than those for accuracy, sensitivity, and specificity for a set of classifications that combines the training and testing sets. For example, in the case of a network with ACC = 82%, PIPTEST = 19.3%, and PIPTRAIN = 20.3%, the values of the standard indicators for the training and testing set would be as follows: accuracy = 95.9%, sensitivity = 84.3%, and specificity = 90%.
The accuracy of a model is defined through three indicators: (1) AMAN (accuracy of the most accurate network), (2) AA (averaged accuracy), and (3) MEA (the most accurate network effective accuracy). AMAN represents the value of a dedicated indicator of accuracy for the most accurate network within a set of trained ANNs with the model (LP). (FP). (MEF). (C).
AA represents the arithmetic mean of the dedicated indicators of all ANNs trained for a model. AMAN and AA are used to describe the accuracy of ANNs in the learning phase (pre-prediction phase), whereby the zones assessed by the ANN were compared with the actual zones for datasets in the relevant years (precision in the learning phase). We equate the learning phase with the set of years for which firms’ zones are assessed on the basis of input factors, and in which the ANN learns using data for the actual values of the zones. Thus, the accuracy in the learning phase can be equated with the accuracy of the classification of the period or individual year, which does not include data for the forecast year or errors that would occur in the assessment of the zone for that year, and can be used to assess the stability of the period constituting the learning period. Accuracy in the learning phase is also important for the selection of a forecasting model, although, as will be shown later, there is no strong linear or positive relationship between accuracy in the learning phase and effective precision, which becomes especially evident in times of crisis. These indicators correspond to the generalization capabilities of ANNs established on the basis of testing data.
The MEA indicator represents the effective precision of the most accurate network for a set of trained ANNs for a given model and corresponds to the generalization capabilities of ANNs established on the basis of validation data, where for validation data the complete sets of zones for the sample in a forecast year are used. Unlike the previous two indicators, the MEA describes prediction accuracy, where the accuracy is tested by comparing ANN forecasts with the factual zones in the forecast year, which were not known to the ANNs during their training. In this sense the prediction accuracy is calculated as the difference 100% − PIP, where PIP represents the percentage of incorrect predictions for the sample. All three indicators can take a value from 0% to 100%.
The AMAN, AA, and MEA indicators are also applied to the groups of models, where the group is denoted in such a way that one or more of its components takes the exact value. For example, the group of models (LP = 2). (FP). (MEF). (C=0) represents all ANNs with LP = 2 for which the clustering factor was not applied, regardless of their FP and MEF values. In the group of models, two more accuracy indicators are used: AVM (averaged accuracy for the most accurate networks), which represents the arithmetic mean of accuracy in the learning phase of the most precise ANNs for individual models in a group of models; and AEP (averaged effective accuracy for the most accurate networks), which denotes the average effective accuracy in the prediction phase of the most accurate ANNs in the group of models.
Methodological Basis for Formulating Stability Indicators
In order to determine whether ANNs had more difficulties in understanding and interpreting the process of generating zones in 2020 and 2021 in relation to the previous period, we analyzed the dynamics of market changes, i.e., the stability of particular years from the ANNs’ viewpoint. The assumption of this analysis was that years can be considered more stable in terms of market dynamics if time-delay ANNs can be trained to more accurately classify zones in the same year, i.e., to better understand the relationships between inputs from the previous period and realized zones in a given year as part of the learning process and effective forecasting.
Thus, the term “stability” for the year in the context of ANNs denotes the predictability of the zone of the companies in the sample during the analyzed year, for a set of development preconditions for each company from the previous period. The notion of stability of the year in terms of the dynamics of changes in the market denotes the presence of unforeseen changes in the year in relation to conditions from the previous period in the market that are relevant to the monitored indicators, i.e., deviation of the year under review in relation to the trend in the development of bankruptcy risk derived from the previous period. In this way, goodness-of-fit and goodness-of-prediction indicators of time-delay networks can be viewed as indicators of their relative stability. Dambolena and Sarkis [
41] similarly used the error in the estimate of an MDA function as an indicator of the stability of financial ratios for assessing corporate failure risk. A similar approach is used in out-of-trend (OOT) analysis within stability studies, where OOT results are defined as stability results that do not follow the expected trend [
111,
112]. These can be found in quality control studies [
113], where process stability refers to the predictability of the process, i.e., its remaining within control limits. The variations occur either because of the inherent nature of the process or because of external or forced changes, called disturbances [114], which Shewhart, the “father of statistical quality control” [115], named “chance-cause” and “assignable-cause” variation, respectively. In this sense, we consider the development of bankruptcy risk to be a dynamic process, while the precision in the assessment of this risk, expressed through the goodness-of-fit and goodness-of-prediction of the models used for the assessment, serves as an indicator of the controllability of the bankruptcy development process.
The conceptual framework in the application of ANN accuracies as the indicators of stability can be described starting from the econometric approach to trend assessment, which considers a time series to be composed of several combined components, where econometric series are usually of strictly limited duration and often exhibit strong trends [
116].
In this sense, the time-series data for the variable y at time t are represented in this paper as the sum of three additive components: (1) the trend component Ty(t), which is the part of the value of y at time t that results from changes in the value of y in the period preceding the moment t, and can be described as the expected value of y based on the previous period; (2) the change component εcy(t), resulting from changes at time t compared to the previous period, which includes irregular movements of the indicator y in the year of analysis in relation to the previous trend in the movement of y; and (3) the intrinsic model error εiy(t), which can be marked as an “assignable-cause” error and which we assume to be stationary in the analyzed periods. In this way, the value of the indicator y realized at time t, Ey(t), can be formulated as:
Ey(t) = Ty(t) + εcy(t) + εiy(t).
If the precision of the ANN model is viewed as a developmental auto-regressive process, i.e., a time-series variable that varies over time and depends on its previous values, the realized component EA(t) can be approximated through the effective precision of the ANN at time t, while Ty(t) (the expected value of accuracy based on trend data) can be approximated by the accuracy of the ANN model in the network training phase, TA(t).
In this case, the change component εcA(t) denotes the change in general precision due to differences in zone formation patterns between the previous period and the year of analysis, i.e., changes in the general circumstances of the industry for which the classification is performed, with unchanged ANN configuration factors (such as FP and LP period lengths, MEF models, number of hidden layers, learning methods, learning schedules, etc.). The change component described in this way captures the “chance-cause” errors [115] in individual years and can take a positive or negative value. The intrinsic error εiA(t), which arises when accuracy is considered a time-series variable, denotes a decrease in accuracy that is a product of chance in the choice of observations for the training and test sets during network learning, as well as of noise in the sampled measurements, which is analyzed throughout the literature [117]. With the other ANN configuration parameters unchanged, the intrinsic error can be considered stationary and can be interpreted as white noise in situations when a sufficient number of ANNs is used for the calculation of ANN accuracies.
The previous definition of the time series of ANN model precision is related to the generalization capability of ANNs, i.e., the success in fitting an ANN by training it on the input dataset in order to produce an associated set of target outputs. Over roughly twenty years of rapid development of ANN technology, many methods for improving generalization capability have been proposed, but the generalization problem related to ANNs is still serious [118]. The generalization error of a machine learning model measures the ability of the trained model (algorithm) to generalize from the learning data to new, unseen data, and in practice it is usually measured by the difference between the error in the training data and the error in the test data [
119]. In Neyshabur et al. [
40], several different measures and explanations for the generalization capabilities of ANNs are examined. Bias–variance decomposition is a way of analyzing a model’s generalization error with respect to a particular problem as a sum of the following: bias, which represents average network approximation error; variance, which represents the sensitivity of a model regarding the sample it uses; and irreducible error, which is a product of the noise in the problem itself. Stable models are not sensitive to small changes in the training set, i.e., they are models for which a small change in the training set results in a small change in the output [
118].
A number of papers in the scientific literature have dealt with methods for reducing generalization errors in ANNs (such as Li et al. [
38] and Wu and Liu [
39]), as well as tension between the complexity and generalization of a model [
120], while other papers have addressed the trade-off between bias and variance as basic components of generalization error in ANNs [
35,
36,
37,
121]. Musavi et al. [
121] used the expected error of the outputs of ANNs under a probabilistic input model as a measure of generalization ability. Our study does not use generalization learning strategies to reduce generalization errors; instead, all ANNs learn from an equal number of observations, and these errors are used to interpret the changes in the year of analysis.
Generalization error is related to two common situations in model development, i.e., underfitting and overfitting, which reduce the generalization capability of the models. The literature emphasizes that ANNs are susceptible to the “overfitting” problem [
122,
123] since ANNs are usually over-parameterized models that are trained using a small amount of data [118]. In this problem, the bias of the model tends to zero while the variance, which describes the sensitivity of the ANN to the data presented at its input layer, increases. Significant overfitting in the time-delay ANNs presented herein was avoided through relatively simple models with a small number of input factors and one hidden layer, resulting in models with low degrees of freedom, while the networks were trained on a relatively large number of observations. Conversely, “underfitting” occurs when the ANN has a high bias, while its variance tends to zero.
In the presented time-series model with three components, the generalization error can be represented as a generalization gap between the accuracy of the ANN in the training phase, TA(t), and its effective accuracy, EA(t), where the change component εcA(t) intensifies the bias and variance of the model. Analogously, the component of the model that results from the changes in the year of analysis t can be expressed as εcA(t) = EA(t) − TA(t) − εiA(t). Thus, the effects of the changes that occurred in the year of analysis t in relation to the previous period can be interpreted on the basis of the generalization error in that year, while it is assumed that the intrinsic error of the model εiA(t) tends to zero, since the precision of the ANNs is analyzed in parallel over a set of years that are assumed to have stationary intrinsic errors.
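Under the assumption that the intrinsic error tends to zero, the change component can be approximated by the gap between effective and learning-phase accuracy averaged over a group of networks. The following minimal sketch illustrates this reading; the function name and the sample values are hypothetical.

```python
import statistics

def change_component(train_accuracies, effective_accuracies):
    """Approximate the change component eps_cA(t) as the gap between the
    average effective accuracy EA(t) and the average training-phase accuracy
    TA(t); averaging over several ANNs is assumed to push the intrinsic
    error eps_iA(t), treated as white noise, towards zero."""
    ta = statistics.mean(train_accuracies)
    ea = statistics.mean(effective_accuracies)
    return ea - ta

# Hypothetical accuracies (%) for a group of networks trained for one year;
# a clearly negative value would indicate destabilization in that year.
print(change_component([88.0, 87.5, 89.1], [80.2, 79.4, 81.0]))
```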
The precision in the learning phase that we present is calculated based on errors in the training and test observations, and it therefore incorporates a generalization error that is related not to the zone assessment process in the year of analysis t but to the network training process, i.e., to the error in representing the process of forming zones in the previous period, expressed through the differences in ANN accuracy between the training and testing sets. Let us call this error the generalization error of the learning period. Thus, in two-level time-delay prediction networks, two levels of generalization error are analyzed: the generalization error of the learning period and the generalization error in the forecast year. The year can be considered more stable if the difference between the ANN generalization error in the forecast year and the generalization error of the learning period (that is, the difference between the effective precision and the precision in the learning phase, which incorporates the generalization error of the learning period) tends to zero.
Time-delay networks with self-classification were used as the basic tool for formulating SIs in 8 out of the 11 indicators, because it has been shown that in these networks, where the learning period is equal to the prediction period, unlike in two-level time-delay networks, the generalization gap increases with increasing stability of the year. That is, less effective precision is achieved in more stable years, while precision in the learning phase settles at a higher level in more stable years. This can be interpreted through the effect of overfitting, which occurs for self-classification networks in relatively stable years when they are exposed to a larger sample of out-of-sample input data, owing to the insufficient complexity of the classification problem presented to the ANNs. Thus, in contrast to prediction networks, in self-classification networks there may be cases where the effective accuracy is higher than the accuracy in the learning phase, i.e., the change component of the initial time-series model can also take a positive value. In the case of such networks, it is enough to use precision in the learning phase for estimating the generalization error, i.e., the stability of individual years within the analyzed period of length LP + 1.
In this study, for each of the years in the analyzed period, ANNs were trained for self-classification of zones with different FP periods for the group of input factors MEF = 3, since this model was available for application in 2016 and is the basic model on which the other models were built. For each (LP = 0). (FP). (MEF = 3). (C = 0) configuration, 20 ANNs were created, i.e., a total of 280 ANNs, and for each group of networks the network with the highest precision in the learning phase (AMAN) was identified, as shown in Table 6. An additional 240 ANNs for self-classification were trained to classify firms into 4 derived bankruptcy risk zones, which were formulated on the basis of Altman’s risk zones in Matejić et al. [6].
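The study trained the ANNs in SPSS; purely as an illustration of the selection principle described above (several networks per configuration, with the most accurate one in the learning phase retained), the following sketch uses scikit-learn's MLPClassifier as a stand-in, with the 70/30 split and the averaging of training- and testing-set accuracies as assumptions.

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def train_and_select_aman(X, y, n_networks=20, hidden_units=8):
    """Train several ANNs with the same configuration and return the one with
    the highest learning-phase accuracy (the AMAN network); the learning-phase
    accuracy is approximated here as the mean of the training-set and
    testing-set accuracies."""
    best_net, best_acc = None, -1.0
    for seed in range(n_networks):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=0.7, random_state=seed, stratify=y)
        net = MLPClassifier(hidden_layer_sizes=(hidden_units,),
                            max_iter=1000, random_state=seed).fit(X_tr, y_tr)
        learning_acc = 100.0 * (net.score(X_tr, y_tr) + net.score(X_te, y_te)) / 2.0
        if learning_acc > best_acc:
            best_net, best_acc = net, learning_acc
    return best_net, best_acc
```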
The MSTAB2 indicator differs from the first three indicators shown in Table 6: for this indicator, differences in precision in the learning phase are obtained by varying the FP component for unchanged LP periods, so that the differences in the generalization errors of the learning phase for different years of classification are based on variations in the learning period resulting from a change in the FP component instead of the LP component. A special class of indicators, which includes the UO and UOR indicators, is based on differences in the nature of errors, i.e., on the differences between overestimating and underestimating the bankruptcy risk zones of companies, based on the assumption that ANNs will overestimate the zones of firms in years with negative structural economic circumstances, while conversely underestimating the zones of firms in years with positive development of market factors.
In order to validate the SIs based on self-classification, we trained 10 ANNs for every prediction model with (LP > 0). (FP). (MEF). (C) that classified the risk zones of firms into three zones, for the individual years in the period 2018–2021. In this part of the analysis, a total of 8890 ANNs were generated as follows: 580 ANNs for 2018, 1340 ANNs for 2019, 2780 ANNs for 2020, and 4190 ANNs for 2021. A total of 635 ANN models were created for the three years, of which 61 models were applied for all years in the period 2019–2021 (RM19–21), which included 1830 ANNs, while 28 models were applicable for all years in the period 2018–2021 (RM18–21), which included 1120 ANNs. For each of the models and each year, the network with the highest precision in the learning phase was identified. The obtained values of the AMAN indicators for all four years are shown in
Table A2,
Table A3,
Table A4 and
Table A5.
These models were also used in the analysis of the applicability of ANNs in 2020 under the COVID-19 crisis and in the formulation of the market instability indicator (DESTAB), where in both cases the effective precision of the ANNs was used. The instability indicator was formed on the basis of the average precision in the learning phase of the most precise networks trained within the reference models RM19–21 and RM18–21, and the average effective precision of the same ANNs; this indicator thus approximates the change component of the time-series model of ANN precision. The indicators STAB2 and STAB3 are based on the precision of the classification of companies into the derived subzone 21 and zone 1 of bankruptcy risk by networks that are trained to predict the zones of companies and divide them into 4 risk zones. Unlike the previous indicators, these indicators are based on the average accuracies of all trained prediction networks.
SIs can be used not only to evaluate the dynamics of market changes but also at the level of individual firms, in which case the same ANNs developed for the sample of firms can be used to analyze the accuracy of the zone estimates for a particular firm in different time periods. In this sense, firms with higher fluctuations in ANN precision may be considered more at risk of bankruptcy. Frequent changes in the SIs of individual firms can be interpreted as a consequence of inaccurate or untrue financial reporting, frequent changes in organization and management, oscillations in the financial structure, susceptibility to shocks and impacts coming from the market, and so on. Assessing the stability of the financial indicators of firms is important from the point of view of their bankruptcy risk, as it has been shown that instability in financial ratios increases significantly over time as a corporation approaches failure, and the inclusion of stability factors increases the accuracy of predictive models [
41].
4. Results and Discussion
This section is divided into seven sub-sections. In the first sub-section, along with the empirical results of the study, we introduce three SI indicators (STAB, MSTAB1, and RSTAB), which are based on the accuracies and relative accuracies of ANNs for self-classification, i.e., ASCs. In the second sub-section, we introduce the MSTAB2 indicator, based on the observation that FP lengths affect ANN accuracies differently depending on the stability of the forecast year, and present an analysis of changes in ANN accuracies depending on LP and FP period lengths. In the third sub-section, we introduce the instability indicator (DESTAB), which starts from the differences between effective accuracies and accuracies in the learning phase. Within this sub-section, an assessment of the applicability of the ANN models presented in the study in the first years of the COVID-19 crisis is given. In the fourth sub-section, we introduce four additional SIs based on the accuracies of ANNs within three and four bankruptcy risk zones, and an analysis of the accuracies within individual zones is presented. In the fifth sub-section, the accuracies of ANNs for different zonal dynamic patterns are analyzed and two additional SIs are introduced, which are based on the degrees of overestimation and underestimation of the zones made by the ANNs (UO and UOR). In the sixth sub-section, the validation of the SIs is performed through mutual comparison, comparison with descriptive indicators of bankruptcy risk in the HI in the Republic of Serbia, and comparison with the relative strength index, which is used to analyze the strength of the trend of indicators related to zones and Z″-scores in individual years. In the seventh sub-section, the influence of input factor models, model classes, and the inclusion of cluster input factors on AMAN indicators is tested for a set of 8310 ANNs for 2019, 2020, and 2021. In addition, a series of univariate analyses and two-tailed paired-samples t-tests is performed, and the obtained results are compared with the results of MDA analysis.
4.1. Stability Indicators Based on ANN Precision for Self-Classification
Within the second research objective of the paper, the first assumption on which the SIs are based is that goodness-of-fit indicators can be used to analyze ANN models, and that when ANNs are used for self-classification these indicators can identify the differences between the bankruptcy risk that can be inferred from data from the previous period and the actual situation in a given year, i.e., they can be used as an indicator of the level of changes in a given year. This assumption was confirmed through a quantitative analysis of the accuracies of ASCs, where it was found that in this model the generalization gap grows as the generalization gap in the learning period of the ANNs, represented through their general precision in the learning period, decreases. These results can be interpreted through the effect of overfitting of the ANNs, which is present in this case and which is shown in Table 7. Therefore, in the case of such networks, it is enough to use the accuracy in the learning phase for estimating the generalization error, i.e., the stability of individual years within the analyzed period of length LP + 1 ending with the year of analysis.
The stability of a year can be estimated in relation to a period of one year (FP = 1), for which it is assessed to what extent the zones in the given year were predictable based on indicators from the previous year; in relation to a period of two years (FP = 2), for which it is assessed to what extent the zones in the given year were predictable based on indicators relevant to the risk of bankruptcy in the past two years; or in relation to longer preceding periods. Hence, stability can be expressed through the general accuracy of the ANNs in the learning period, which is in turn expressed through the accuracy of the most accurate network (AMAN) indicator:
STAB = AMAN(M), where M: (LP = 0). (FP). (MEF = 3). (C = 0),
i.e., as the AMAN of the self-classification model trained for the year under analysis with the given FP period.
In order to be able to compare the stability indicator of a given year with those of the previous years in a period, the SI is calculated as the maximum value over the FP values that are available for every individual year in that period. Therefore:
MSTAB1 = max {STAB(FP) | FP = 1, …, FPmax},
where FPmax represents the maximum FP value available for each individual year in the compared period and the SI for an individual year is again based on the models M: (LP = 0). (FP). (MEF = 3). (C = 0). For example, if we want to compare SIs for the period 2016 to 2020, FPmax = 1, and the indicator MSTAB1 with FPmax = 1 (equal to STAB with FP = 1) is used for each of the five years. For the period 2018 to 2020, FPmax = 3, and the indicator MSTAB1 with FPmax = 3 (denoted MSTAB13) is used for 2018, 2019, and 2020.
In the case when stability is assessed relative to the previous year, it is expressed through the change in the MSTAB1 indicator of the year of analysis with respect to that of the previous year:
RSTAB = MSTAB1(YEAR) − MSTAB1(YEAR − 1),
where both MSTAB1 values are calculated with the FPmax value common to the compared years, so that negative values of RSTAB indicate a decline in stability relative to the previous year. In the case when stability is expressed relative to the previous year but in the context of a longer preceding period, it is expressed in the same way, with the MSTAB1 values calculated using the FPmax value available for all of the years considered, where N expresses the number of previous years in the period.
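A minimal sketch of how these three indicators could be computed from a table of AMAN values, under the readings given above (STAB as the AMAN of the self-classification model, MSTAB1 as its maximum over the available FP values, and RSTAB as the year-on-year change); the dictionary layout and the numerical values are hypothetical.

```python
def stab(aman, year, fp):
    """STAB: AMAN of the self-classification model
    (LP = 0).(FP).(MEF = 3).(C = 0) trained for the given year and FP."""
    return aman[(year, fp)]

def mstab1(aman, year, fp_max):
    """MSTAB1: maximum STAB over the FP values available to all compared
    years (FP = 1 .. fp_max)."""
    return max(stab(aman, year, fp) for fp in range(1, fp_max + 1))

def rstab(aman, year, fp_max):
    """RSTAB (assumed form): change of MSTAB1 relative to the previous year,
    calculated with the FP range common to both years; negative values
    signal a decline in stability."""
    return mstab1(aman, year, fp_max) - mstab1(aman, year - 1, fp_max)

# Hypothetical AMAN values (%) keyed by (year, FP).
aman = {(2018, 1): 86.0, (2018, 2): 88.3, (2018, 3): 90.2,
        (2019, 1): 87.1, (2019, 2): 89.0, (2019, 3): 88.4,
        (2020, 1): 84.0, (2020, 2): 83.1, (2020, 3): 83.6}
print(mstab1(aman, 2020, 3), rstab(aman, 2020, 3))
```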
Based on the calculated indicators STAB, MSTAB1, and RSTAB, it can be concluded that the greatest stability in the period under analysis was achieved in 2018, that a smaller decline in stability occurred in 2019, and that according to all indicators the decline in 2020 was significant, amounting to 6% to 8% according to the RSTAB indicators. In 2017, the values of the SIs decreased slightly compared to 2016, while overall we can say that in the period 2016 to 2020 the HI in the Republic of Serbia was a dynamic industry in terms of changes in exposure to bankruptcy risk, with frequent changes on an annual basis.
4.2. FP and LP Period Lengths and Stability Indicators
As shown in
Table 4, years with different stabilities showed different variations in the learning-phase accuracies and effective accuracies of ANNs resulting from changes in the length of the period from which the input factors originated. For example, if we compare 2018, 2019, and 2020, for which models with FP values of up to 3 are collectively available, in 2018 the most accurate network in the learning phase had FP = 3 and 4.2% higher accuracy compared to the network with FP = 1, while the most accurate network in 2019 had FP = 2 and 1.9% higher accuracy compared to the network with FP = 1. In the case of 2020, the most accurate network in this comparative analysis was the network with FP = 1, i.e., extending the period from which the input factors originate up to FP = 4 did not improve the accuracy of the networks in the learning phase. In practical terms, this means that in relatively stable years, as in the case of exposure to bankruptcy risk in 2018, extending the period from which the input factors originate beyond FP = 1 generally improves the accuracy of the ANNs. Conversely, when major changes occur, and especially in the case of a crisis such as in 2020, the increase in accuracy resulting from extending the FP period relative to FP = 1 begins to decline, and accuracy may even fall, so that the positive dependence between the length of the period from which the input factors originate and the ANN accuracies in the learning phase is lost. In this sense, the stability of a year within the analyzed period with a maximum available FP value of FPmax can be defined as the difference between the highest learning-phase accuracy achieved for FP between 2 and FPmax and the accuracy achieved for FP = 1:
MSTAB2 = max {AMAN(FP) | FP = 2, …, FPmax} − AMAN(FP = 1).
Thus, unlike for the first three SI indicators, where the differences in generalization errors in the learning phase between different years were a consequence of different training periods, for this indicator the differences are based on variations in the learning periods resulting from a change in the FP component instead of the LP component. Using the example of the comparative analysis of 2018, 2019, and 2020, the values of this indicator are +4.2% for 2018 and +1.9% for 2019, while for 2020 the indicator takes a negative value. Similarly, for the period 2017–2020, the indicator MSTAB2 with FPmax = 2 (MSTAB22) is used, and the resulting values are, respectively, +2.2%, +2.3%, +1.9%, and −0.9%.
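Continuing the previous sketch, MSTAB2 under the reading given above could be computed as follows; the formalization as a maximum over FP > 1 is an assumption.

```python
def mstab2(aman, year, fp_max):
    """MSTAB2 (assumed form): gain in learning-phase accuracy obtained by
    extending the input-factor period beyond FP = 1, within FP <= fp_max;
    a negative value means that no extension improved on FP = 1."""
    return max(aman[(year, fp)] for fp in range(2, fp_max + 1)) - aman[(year, 1)]

# With the hypothetical AMAN table from the previous sketch, the indicator is
# positive in a stable year and negative in a crisis year:
# mstab2(aman, 2018, 3) -> +4.2; mstab2(aman, 2020, 3) -> -0.4
```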
When ANNs are used for predictive purposes, as already mentioned, the accuracies in the learning phase can be interpreted based on the precision of ANNs for classification trained for the period preceding the year of prediction, but this dependence is not direct, because ANNs with LP > 0 and FP > 1 are trained on the basis of examples of individual years, and the knowledge that they acquire from later years within the learning period is used to infer dependence in earlier years within this period. In this way, the accuracies of ANNs in the learning phase for networks with this configuration can be interpreted as a consequence of the stability of the entire learning period.
As shown in
Table 8 in the next section, for the set of reference models RM
19–21 and all FP periods, the average accuracy in the learning phase (AVM) for ANNs trained to make forecasts for 2019 was 88.2%, while this accuracy dropped drastically in the case of forecasting networks for 2021 (79.9%). Similarly, for the set of reference models RM
18–21, which had an FP value of up to 2, the highest accuracy in the learning phase included networks trained for 2019 (86.1%), followed by forecasting networks for 2020 (85.7%) and 2018 (76.8%), while forecasting networks had the lowest goodness-of-fit accuracy for 2021 (75.2%). Such low precision in the learning phase in the most stable year of 2018 is a consequence of the relative instability of the period preceding this year, which is a sign that the changes in the period 2015 to 2017 were greater than in the period until 2019, including that year.
Therefore, although the forecasting networks for 2018 were trained from a previously unstable period, the accuracies of the self-classification networks showed that this year was stable and that it fitted into the pattern movements from the previous period. This was not the case for networks trained for forecasts for 2019; these were trained in a more stable period compared to 2018, but this year differed in relation to the period that preceded it.
In order to analyze in more detail the influence of the LP and FP components of the model on the accuracy of networks in the learning phase, for reference models RM
19–21 we calculated the average AVM indicators for different LP and FP values. In the case of 2019, networks with LP = 1 (AVM = 84.8%) had significantly higher accuracy in the learning phase compared to networks with LP = 2 (AVM = 79.4%) and LP = 3 (AVM = 76.2%); these results are shown in
Figure 2. For LP = 1, extending the period from which the input factors originate from 1 to 2 years increased accuracies in the learning phase by as much as 7.2%, which is due to the fact that the inclusion of 2016 in FP had a positive impact on the stability of the whole period. This was not the case with the addition of data from 2015, where a growth in stability of only 1.1% was registered. For networks trained for 2020, the order in respect of LP periods was retained; however, in this case the differences in precision caused by changing the length of the learning period were smaller (85.7%, 83.6%, and 79.4%). For LP = 1, the networks for 2020 were the most accurate when they applied an FP value of 3 (87.6%), the accuracy was slightly lower for FP = 2 (86%), and accuracy was the lowest in the case of FP = 1 (83.5%), i.e., in this case the largest achieved improvement was 4.1%.
In the case of 2021, however, the regularities observed for the period in which the COVID-19 crisis did not yet have a greater impact on the risk of bankruptcy no longer held. Thus, in this case there was no significant difference in accuracy for different LP periods: for LP = 1 the average accuracy was 79.9%, for LP = 2 it was 79.7%, and for LP = 3 it was 78.8%. Observing the combinations of LP and FP values, in the reference models for 2021 the highest accuracy was achieved for LP = 1 and FP = 1 (AVM = 80%), while with the same LP the differences for FP = 2 (AVM = 79.5%) and FP = 3 (AVM = 79.9%) were insignificant. Thus, the addition of data from 2017 and 2016 did not increase the accuracy in the learning phase of the networks prepared for 2021.
Similarly, for reference models that included 2018 in addition to the previous years (RM18–21), the accuracy in the learning phase of ANNs for 2018 was approximately the same for models that were trained using data from one and two previous years (75.3% and 75.8%, respectively), while in the case of forecasting networks for 2019, better accuracies were achieved for LP = 1 (82.5%) than for LP = 2 (77.5%). In the case of forecasting networks for 2020, slightly higher accuracies were achieved when they were trained using data from 2019 (84.5%) than in the case when 2018 was also included (82.1%). ANNs had higher accuracy with FP = 2 in all years in the period 2018–2021 than in the case when FP had a value of 1. The largest increase in accuracy due to the addition of the year from which the input factors originate was achieved in networks trained for application for 2019 (+7.7%), and a slightly smaller increase was realized in forecasting networks for 2020 (+3%), while in the case of ANNs for 2021, there was a decrease in accuracy of 0.2%.
Based on the above, it can be concluded that in general, the extension of the period from which input factors originate and learning periods in time-series ANN models for predicting the risk of bankruptcy has beneficial effects on the accuracy of ANNs for classification in years assessed as stable within the analyzed period, regardless of the stability of the learning period, as was the case in 2018. Conversely, the extension of these periods may not be effective for years that are assessed as “unstable” as long as the previous period is stable, as was the case in 2019.
4.3. Stability Indicators Based on Effective Precisions of ANNs
Unlike self-classification ANNs, for which the generalization gap grows with the increase in their accuracies during the learning phase, i.e., with increasing stability of individual years, this is not the case with forecasting networks. On the subset of reference models for the period 2019–2021 (RM19–21), we calculated AEP indicators as estimates of the effective accuracy EA(t) and compared them with AVM values, which estimate the accuracy in the learning phase TA(t), as shown in Table 8. As shown in Table A6, which presents the accuracies in the learning phase and the effective accuracies of the most accurate networks in the learning phase prepared to predict the zones for 2019 and 2020, for networks trained for 2020 the difference between the average effective accuracy and the average accuracy of the ANNs in the learning phase was significantly larger than this difference in 2019. In the case of 2021, effective precision was not calculated, since at the time of writing data on the actual zones of companies in that year were not available.
Starting from these observations, we formulated the instability indicator of the year, which is expressed as the difference between the average accuracy in the learning phase of the most accurate networks of the reference models and the average effective accuracy of the same networks. The instability indicator can then be expressed as:
DESTAB = AVM − AEP,
where AVM and AEP are calculated for the most accurate networks of the reference models available for the year of analysis.
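A minimal sketch of the DESTAB calculation as the difference between the averaged learning-phase accuracies (AVM) and the averaged effective accuracies (AEP) of the most accurate networks; the listed values are hypothetical.

```python
def destab(avm_accuracies, aep_accuracies):
    """DESTAB: difference between the average learning-phase accuracy (AVM)
    and the average effective accuracy (AEP) of the most accurate networks of
    the reference models; positive values indicate destabilization."""
    avm = sum(avm_accuracies) / len(avm_accuracies)
    aep = sum(aep_accuracies) / len(aep_accuracies)
    return avm - aep

# Hypothetical accuracies (%) of the most accurate networks of a reference-model set.
print(destab([85.7, 84.1, 86.3], [70.2, 68.5, 69.9]))  # clearly positive => unstable year
```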
Based on the instability indicators calculated on the basis of reference models RM
19–21 and RM
18–21, which are shown in
Table 8 above, it is evident that this indicator had up to 10% higher values for networks trained for forecasting in 2020 than for networks trained for forecasting in 2019, while for the reference models that included 2018 the value of this indicator was over 18 times higher, which confirms the results of the previously introduced indicators. When it comes to 2019, both indicators calculated for RM18–21 indicated the appearance of destabilization in this year compared to 2018, which was also indicated by the previously considered indicators.
Effective precision may be higher than precision in the learning phase; this occurs in years with high stability, which favor ANN overfitting, as was the case with the forecasting networks for 2018 for models with FP = 2. The value of the instability indicator ranges from −100% to +100%, where the scale is inverse, i.e., negative values reflect high stability of the year.
Overall, precision in the learning phase is not linearly related to the effective precision of the same networks; the nonlinear relationship between these two indicators is caused by unexpected positive and negative changes that occur in the forecast year, which further limits the usability of ANNs under conditions of sudden change, owing to the difficulty of choosing the most accurate model for forecasting in the next year.
In relation to the first research goal of the paper, in order to answer the question of applicability of the developed models in pre-COVID-19 and COVID-19 crisis conditions, based on the data from
Table 8 it can be concluded that the accuracy of the tested ANNs was satisfactory in the case of predictions for 2018 and 2019, but in the case of 2020 the effective accuracy of the reference models decreased significantly and amounted to 68% for RM
19–21 at best, while for RM
18–21 the accuracy was 67% at best, which, with the nonlinear nature of the dependency between accuracies in the learning phase and effective accuracies, makes the application of ANNs questionably beneficial in predicting risk zones of the Altman model in the early years following a sudden crisis like COVID-19 for individual firms. Thus, among the most accurate networks of all models in the learning phase, for 2019 and 2020, as shown in
Table A6, the highest effective accuracy for 2019 of 83% was registered, while for 2020 this value was 11% lower.
4.4. Stability Indicators Based on ANN Precision in Classifications within Bankruptcy Risk Zones
In the next part of the analysis of the dynamics of market changes, we analyzed the changes in the context of the predictive potentials of ANNs within different zones. For the initial set of 280 ANNs for self-classification, for each combination of FP, zone, and year, the network with the highest precision was identified as shown in
Table A7. Accuracy within individual zones was calculated as the average percentage share of TP (true positive) predictions within a given zone in relation to the total number of predictions in that zone, for the pooled testing and training samples, with FP = 1, for which data on accuracies are available for the period 2016 to 2020. In terms of the AMAN indicators, the best networks were on average the most accurate in assessing that a company would be in the second zone (100%), followed by the assessment that the company would be in the third zone (92.9%) and the first zone (91.8%).
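The per-zone accuracy described above can be illustrated as the share of true-positive predictions among all predictions assigned to a zone, with the training and testing observations pooled; the zone labels and vectors below are hypothetical.

```python
from collections import Counter

def zone_accuracies(predicted, actual):
    """Per-zone accuracy: share of true-positive predictions within a zone
    relative to the total number of predictions assigned to that zone
    (training and testing observations pooled)."""
    predictions_per_zone = Counter(predicted)
    true_positives = Counter(p for p, a in zip(predicted, actual) if p == a)
    return {zone: 100.0 * true_positives[zone] / n
            for zone, n in predictions_per_zone.items()}

# Hypothetical zone labels (1 = highest risk, 3 = safe zone).
predicted = [1, 1, 2, 3, 3, 2, 1, 3, 2, 3]
actual    = [1, 2, 2, 3, 3, 2, 1, 3, 3, 3]
print(zone_accuracies(predicted, actual))
```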
According to the adopted principle for stability assessment, where the maximum value for AMAN is observed within the set of available data for FP, depending on the comparative periods, the analysis of three zones was not specific enough, and it could only be concluded that 2018 and 2019 were better than other years in terms of accuracy in the first zone; that in 2020, compared to those two years, the accuracy in the first zone decreased significantly; and that the indicators from the third zone showed that 2017 had greater inaccuracy compared to all other years. In the context of this analysis of zones for baseline models, zone 1 therefore revealed the most information about changes that occurred within individual years when ANNs were used for classification into three risk zones, based on the generalization error in the learning phase; in this case, changes that occurred in 2020 were recognized as being opposed to changes in 2019.
In order to examine in more detail the usability of accuracy indicators within the zones, the second zone was divided into two subzones, Z21 and Z22, with a discrimination value of 4.533, since it was previously identified that the risk of falling from the second to the first zone in 2020 for companies with Z″-scores of 3.75–4.335 in the past year increased to 42% (compared to 6% in 2016), while the changes within this group in 2019 were also noticeable. However, for companies with Z″-scores in the interval 4.533–5.85, stability was maintained in 2019 and 2020 [6].
For the model group, (LP > 0). (FP = 1/2/3). (MEF = 3). (C = 0), and for each year of forecasting for the period 2017 to 2020, 10 ANNs were trained, i.e., a total of 170 ANNs, for which the average forecasting accuracies within 4 zones (AA and AMAN) were monitored. In addition, for each forecast year in the period 2016 to 2020, 20 networks for self-classification were trained and used, making a total of 240 ANNs used to assess stability in the context of AMAN indicators. The obtained precision values for all 410 ANNs are presented in
Table A7 and
Table A8.
The trained self-classification networks for the period 2016 to 2020, which classified companies into 4 groups, were for FP = 1 the most accurate in the learning phase in predicting that a company would be in the third zone (98.5%), followed by the first zone (93.2%), while the accuracy was much lower for forecasts within the second zone (Z21: 55%, Z22: 70.8%). In order to test whether improvements in the effective precision of ANNs can be achieved through this separation, we calculated the effective precision of the most precise networks from
Table A7, whereby in determining the effective accuracy of the companies classified in subzones 21 and 22, these companies were treated as companies in zone 2. The most precise ANN found for 2019 had a precision of 82%, while in 2020 it had a precision of 70%, based on which we can conclude that such subzone separation did not increase the effective precision of the ANNs.
Analysis of the accuracy of these 240 ANNs for self-classification into 4 zones showed that zone 1 revealed the least about the changes that occurred within a single year, while the most information was provided by zone 22 and especially zone 21. We compared the accuracy of these ANNs with data on the year for which the classification was performed, using univariate analysis (with α = 0.05). It was found that the year for which the classification was performed had a statistically significant influence on accuracy within Z21 (p = 0.035), while in the case of the remaining zones no statistically significant relationship was identified. Thus, for the period of analysis with a common FP = 1, only in 2018 was higher accuracy achieved in the forecasts within subzone 21 compared to other years; this amounted to 75%, while in the other years the accuracy within this subzone was 50%, which supports the suggestion of high turbulence in the hotel industry in the period preceding 2018. When 2018, 2019, and 2020 were compared, the precisions in 2019 (83.3%) and 2020 (80%) were significantly lower than in 2018 (100%). Therefore, unlike the stability analysis based on precision within 3 risk zones, the analysis for 4 risk groups showed that changes occurred in 2019 in the domain of subzone 21. This result is in line with the conclusion of the study of Matejić et al. [6], according to which the progress of companies in the second risk zone was endangered in 2019; indeed, according to the factual data, the number of companies in subzone 21 increased from 8 in 2018 to 12 in 2019.
A series of univariate analysis tests (with α = 0.05) for AA indicators for 170 ANNs were performed, where AAs were compared for different time periods used in the input dataset. This analysis confirmed that the accuracy in subzone 21 was statistically significantly influenced by the usage of data from 2018 (
p = 0.019), while the accuracy within subzone 22 was not significantly affected by any particular year. When it comes to the first zone, AA accuracy was influenced by the presence of 2017 (
p = 0.031) and 2020 (
p = 0.011). This means that 2020 brought unexpected changes in the first zone, unlike 2019. In the event of a crisis, therefore, the real decline of firms into the first zone in 2020 exceeded the networks’ expectations. The number of firms in zone 1 thus increased from 26 in 2019 to 36 in 2020, as shown in
Table A7.
For the accuracy of assessments within the third zone, a statistically significant impact of 2017 was registered (p = 0.004), whose data negatively affected the accuracy. In this case, the lowest accuracy in the learning phase was achieved when 2017 and 2020 were included in the input set, whereby the inclusion of 2017 had a more negative effect than that of 2020. Interestingly, in 2020 the number of companies in the third zone decreased by as much as 23%, while in 2017, which had a statistically significant impact, the number of companies was unchanged compared to 2016. This can be interpreted through unexpected changes in 2017 (such as the increase in transitions from the third to the second zone and from the first to the third zone, as shown in Matejić et al. [6]), while the negative trends in 2020 for zone 3 were already expected by the networks on the basis of the data from 2019.
The SI indicators based on the accuracies of ANNs in classifications within individual zones can be grouped as follows:
Classification into 3 zones: MSTAB13Z1;
Classification into 4 zones: MSTAB13Z21, STAB2, and STAB3.
The MSTAB13Z1 indicator, calculated for individual years in the period 2018–2020 and three risk zones, indicates that negative changes occurred in the first zone in 2020, with the value of this indicator being 90.9%, while in 2019, according to this indicator, no qualitative change occurred (a value of 100% in both 2018 and 2019). The STAB3 indicator calculated for the same period indicates smaller changes in 2019 and 2020 within zone 1 (94.9% and 93%, respectively, compared to 95.3% achieved in 2018), while the STAB2 indicator shows that larger negative changes within subzone 21 occurred in 2019 than in 2020 (58% in 2019, 60.1% in 2020, and 63% in 2018). However, the indicator MSTAB13Z21 indicates an evident change in 2019, with a value of 83.4% compared to the 100% achieved in 2018, while the negative trend in subzone 21 continued in 2020 with a value of 80%.
4.5. Accuracies of ANNs for Different Zonal Dynamic Patterns
For the 9 most precise ANN models for self-classification in the three risk zones, or a total of 180 ANNs, which were used for predictions in 2018, 2019, and 2020, we determined particular accuracies for 10 groups of firms to identify the groups with the lowest accuracies. The aim of this analysis was to identify the relationship between the dynamics of changes within the zones of firms in the previous period with effective accuracies of ANNs, and to identify groups of firms with certain dynamics of changes in zones in the previous period in which the ANN made the greatest classification errors. In other words, clusters of firms that represented different patterns in zone changes in the previous period were compared with effective accuracies (MEA) for these groups, to draw conclusions about the effect of the COVID-19 crisis in 2020 on firms with different dynamic patterns in terms of previous zonal changes. In order to cluster and generalize the dynamics of changes in the zones in the previous period, for each year of classification, 10 groups of companies were determined, depending on the zones to which these companies belonged in the period before the year of classification. For clustering purposes, K-means cluster analysis was used (method: iterate and classify) with a maximum of 10 iterations. Data in respect of company clusters are presented in
Table A9.
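The study performed the clustering in SPSS (K-means, “iterate and classify”, maximum 10 iterations); the sketch below only illustrates the idea of grouping firms by their zone histories, using scikit-learn's KMeans as a stand-in and hypothetical zone histories.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_zone_histories(zone_histories, n_clusters=10, max_iter=10, seed=0):
    """Group firms by the pattern of risk zones they occupied in the years
    preceding the classification year; one row per firm, one column per year."""
    X = np.asarray(zone_histories, dtype=float)
    model = KMeans(n_clusters=n_clusters, max_iter=max_iter,
                   n_init=10, random_state=seed)
    return model.fit_predict(X)

# Hypothetical zone histories of six firms over four preceding years.
labels = cluster_zone_histories(
    [[3, 3, 3, 3], [1, 1, 1, 1], [2, 3, 3, 3], [3, 2, 1, 1], [2, 2, 3, 3], [1, 2, 2, 2]],
    n_clusters=3)
print(labels)
```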
Clusters of companies for 2018 were formed on the basis of zones to which companies belonged in the period 2015 to 2018, and within them the dominant group of companies was placed in cluster 1, where there were companies with stable low exposure to bankruptcy, as well as cluster 2, where there were companies that were stably exposed to a high risk of bankruptcy. For these two groups of companies, ANNs were highly accurate in assessing their zone in 2018. The lowest accuracy of ANNs was achieved for cluster 9, where there were only 2 companies whose zones fell in 2016, and cluster 8, with 3 companies that had a significant decline in zone in 2016, while in 2017 they recovered their zones. Such deviations, given the number of firms in these clusters, can be interpreted as exceptions.
In 2019, clusters were formed on the basis of zones to which companies belonged in the three-year period 2016 to 2018, and the cluster where the networks were most inaccurate was cluster 3, which had only 3 companies and where companies fell from zone 3 to zone 2 between 2016 and 2017. ANNs were also relatively inaccurate for cluster 2, which included 18 companies that were stable in the first zone in the past period. This conclusion is in line with the results of the analysis of zonal SIs, according to which a significant change in subzone 21 was achieved in 2019, which was the result of an unexpected jump of companies from the first risk zone. For as many as 6 clusters, ANNs had an effective accuracy of 100%.
For the classification in 2020, clusters were formed based on the zones that companies occupied in the period 2016 to 2019, while the classification was performed using the ANNs only for 2020. It was shown that in 2020, the best ANNs for all FP values had the lowest accuracy when it came to groups: 3, 7, 9, and 10. Basically, the networks were the most inaccurate for companies with a rise in zone in the period 2016 to 2018, and the higher the rise in that period, the more inaccurate the networks were in the assessment of the zone in 2020. Thus, the crisis in 2020 affected larger clusters of firms and had the greatest negative effect on firms that had a significant improvement in the risk of bankruptcy in the period 2016 to 2018, while less affected firms were stable in the same zone during this period or recorded a decline in the zone in the previous period.
In addition to the previous analysis, within the same set of 180 ANNs, we calculated for each group of networks for 2018, 2019, and 2020 the type of error in the effective precision of the most accurate networks by year, i.e., whether ANNs were overestimating or underestimating the company’s zone. Thus, unlike 2018, for which underestimations of companies prevailed over their overestimations, which is a sign that in 2018 the actual exposure to bankruptcy risk was lower than the expectations of networks (i.e., positive changes), in 2019 the changes were negative, given that among the mistakes, overestimations of the zones of companies began to prevail. In 2020, this negative trend continued, and was more destructive compared to 2019.
In this sense, the difference between the number of firms for which zones were underestimated in relation to the number of firms in which zones were overestimated in the forecast year can be seen as an indicator of the nature of changes in the year, while a change in the previous indicator in two successive years can be considered a relative indicator of the change in the analyzed year compared to the previous year. Neural networks have previously been shown to be able to reduce systematic overestimation or underestimation [
124,
125]; therefore, the values of these effective indicators can be largely attributed to changes that occur in the analyzed year, and to a lesser extent to model errors. According to the above, the following indicator of the stability of a year can be introduced:
UOYEAR = NRUE − NROE,
where UOYEAR is an indicator of the nature of changes in the year, UOYEAR ∈ [−NR, +NR], NR represents the number of observations for which a prediction was made, NRUE represents the number of underestimated firms, for which the forecast generated a lower zone compared to the zone in which the firm actually found itself, and NROE represents the number of overestimated firms, for which the forecast generated a higher zone in relation to the zone in which the firm actually found itself.
Moreover, the relative stability of a year can be expressed as:
UORYEAR = UOYEAR − UOYEAR−1,
where UORYEAR is an additional relative indicator of the nature of the change in the year compared to the previous year, with UORYEAR ∈ [−2NR, +2NR]. According to these indicators, as shown in Table 9, 2018 was a year during which positive changes occurred, while in 2019 there was a negative qualitative change. In the case of 2020, the value of the UO indicator was lower than in 2019, which indicates a greater extent of overestimation of companies, while the UOR indicator shows that the negative trend continued in 2020 compared to 2019, although it was weaker than the turning point in the nature of change that occurred in 2019.
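A minimal sketch of the UO and UOR calculations from predicted and factual zones, following the definitions above; the zone vectors are hypothetical.

```python
def uo(predicted, actual):
    """UO: number of underestimated firms (forecast zone lower than actual)
    minus number of overestimated firms (forecast zone higher than actual)."""
    under = sum(1 for p, a in zip(predicted, actual) if p < a)
    over = sum(1 for p, a in zip(predicted, actual) if p > a)
    return under - over

def uor(predicted_t, actual_t, predicted_prev, actual_prev):
    """UOR: change in the UO indicator relative to the previous year."""
    return uo(predicted_t, actual_t) - uo(predicted_prev, actual_prev)

# Hypothetical forecasts vs. factual zones for two successive years.
print(uo([2, 3, 1, 2, 3], [2, 2, 1, 3, 3]))   # one over- and one underestimation -> 0
print(uor([2, 3, 1, 2, 3], [2, 2, 1, 3, 3],
          [3, 3, 2, 2, 3], [2, 3, 2, 2, 3]))
```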
4.6. Validation of Stability Indicators
In order to perform the validation of SIs, their values were first compared with descriptive indicators for Z″-scores and company zones in the period 2015 to 2020, i.e., the risk of bankruptcy in the HI in the Republic of Serbia. The results are presented in
Table 10. In addition to basic indicators, the descriptive analysis included indicators that are presented in Matejić et al. [
6]: staticity of a year (S), positive flows of a year (PF), and negative flows of a year (NF), for which data have been available since 2016.
To further analyze the impact of adding data from individual years on the trend strength of individual time-series indicators, we incrementally calculated the value of the relative strength indicator (hereinafter RSI), where this indicator was used to approximate the component of the time-series model describing changes in operating conditions in the analyzed year, i.e., the change component εcy(t).
In this approach, the strength of the trend of the relevant bankruptcy risk indicators in the analyzed period ending with the year for which the assessment of bankruptcy risk is performed reflects the stability of that year within the analysis period and determines the accuracy that ANNs will achieve in bankruptcy risk classification for individual firms in that year. High precision of the ANNs in a particular year of analysis can be interpreted through a strong trend in the time interval ending with the year of analysis, and vice versa. Through the analysis of the strength of the trend within the period ending with the year under analysis, and of various descriptive time-series variables that include indicators of centrality and scattering as basic factors of stability, an evaluation of the stability of the year can be performed.
The relative strength indicator is a well-known technical analysis indicator for assessment of the price momentum of securities and is a part of the diverse calculations and formulas commonly present in software computing research [
47]. It has been shown that combining neural networks with the RSI improves trading systems [
43,
44]. The original form of this indicator, presented by Wilder in 1978 and used in [42], was adjusted to measure the strength of the trend linearly in the range 0–100%. The RSI used in this paper was calculated based on the formula:
RSIp = 100% × AGp / (AGp + ALp),
where RSIp represents the strength of the trend for the period p = {p ∈ P | p ∈ [YEAR − i, YEAR]}; RSIp ∈ [0, 100%]; AGp represents the arithmetic mean of the increases, between every two successive years in the analyzed period, of the value whose trend is measured; and ALp represents the arithmetic mean of the corresponding decreases between every two successive years in the analyzed period. Unlike the original Wilder model, which is intended for use at shorter time intervals in a time series and which uses 14 periods for the calculation of the indicator, the RSI used herein is based on a reduced number of periods in order to be more sensitive to the annual values of the monitored indicators of centrality and dispersion of zones and Z″-scores.
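A minimal sketch of the adjusted RSI described above, assuming the linear form AG/(AG + AL) (equivalent to Wilder's 100 − 100/(1 + AG/AL)); the series of median Z″-scores is hypothetical.

```python
def rsi(values):
    """Adjusted relative strength index for a short annual series: the average
    gain between successive years as a share of the sum of the average gain
    and average loss, which maps trend strength linearly onto 0-100%
    (equivalent to Wilder's 100 - 100/(1 + AG/AL))."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    gains = [d for d in diffs if d > 0]
    losses = [-d for d in diffs if d < 0]
    ag = sum(gains) / len(diffs)   # average gain per step
    al = sum(losses) / len(diffs)  # average loss per step
    if ag + al == 0:
        return 50.0                # flat series: neutral trend strength
    return 100.0 * ag / (ag + al)

# Hypothetical median Z''-scores for 2015-2020.
print(rsi([4.1, 4.4, 4.2, 4.6, 4.5, 3.9]))
```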
Table 10 presents the values of RSI-monitored indicators describing the zones and Z″-scores of firms for the periods 2015–2020 (RSI1) and 2016–2020 (RSI2), where the values of these indicators for individual years from 2017 to 2020 were obtained by calculating the strength of the trend in the period starting from 2015 until the year of analysis in the first case, while in the second case the calculation was performed for the period starting from 2016.
Table 11 presents the values of different SIs previously presented in the paper for 2018, 2019, and 2020, together with the number of ANNs used to calculate the presented values for these three years. SIs were calculated in relation to this period, i.e., for a maximum value of the FP period of 3 for indicators that used ANNs for self-classification, and FP values of 2 for indicators that used ANNs as a forecasting tool; these were the maximum available FP values for 2018.
SIs: STAB, MSTAB1, and RSTAB indicated that in 2018 the stability of the HI in terms of bankruptcy risk exposure increased compared to 2016 and 2017, and that in 2017 there was a smaller decrease in stability compared to 2016, while in general in the pre-COVID-19 period the HI in the Republic of Serbia was a very turbulent industry. Based on the insight into the descriptive indicators presented in
Table 10, it is noticeable that in 2016 there was an increase in the mean values of Z″-scores, with a larger decline from zone 3 than from zone 2. A total of 24 company transitions between the zones were registered that year, with an equal share of positive and negative transitions. The year 2017 was even more dynamic: the trend of falling from the third risk zone continued, while the number of jumps from zone 1 increased, with the median Z″-score values falling, indicating a decline in firm diversification after its growth in 2016. In total, in 2017 there were 27 changes in the zones of companies, of which 13 were positive transitions. The indicators point to a slowdown in the flows of bankruptcy risk in 2018, in which only 10 transitions took place, of which 7 were positive and 3 were negative. Despite the small number of transitions, the Z″-scores recorded growth in terms of the median, which indicates that in 2018 the risk of bankruptcy decreased, especially for companies that had previously been more exposed to this risk, while the decline of companies from zone 3 slowed down.
RSI values for all indicators except the arithmetic mean of Z″-scores show that the strength of the trend was higher when the 2018 data were added to the time-series than was the case for 2017, while the auto-correlation of Z″-scores increased significantly in 2018 compared to the previous year. Overall, although the increase in Z″-scores in 2017 indicates a continuation of the positive trend from the previous period, in 2017 the average strength of the trend in the third class of indicators was only 17%, while the strength of the trend of the median Z″-score was 57%, which indicates a decline in stability in 2017. In 2018, the average trend strength of the third group of indicators increased to 33% and the strength of the median trend increased to 74%, which indicates an increase in stability in 2018.
Compared to 2018, in 2019 there was a slight decline in stability: 10 out of 11 SIs registered signs of a trend change, and only MSTAB13Z1 did not identify a change in trend, so this indicator can be considered relatively insensitive to the type of change that occurred in 2019. Minor changes in 2019 were identified by STAB, MSTAB13, STAB3, and MSTAB23. A slightly larger decline in stability in 2019 was detected by the indicators MSTAB13Z21 and STAB2, both of which detected changes in subzone 21. The RSTAB and DESTAB indicators also proved to be effective indicators of change: RSTAB showed a decrease in stability in 2019 compared to 2018 of 1.1%, while DESTAB showed a qualitative change in the trend of sample bankruptcy risk, from negative destabilization to the onset of a destabilization process measured at +3.9%. The UO and UOR indicators also showed that a qualitative change took place in 2019, with UO indicating that in 2019 overestimations of companies began to exceed underestimations.
Correspondingly, in 2019 most quantitative indicators recorded a decline, except for the staticity indicator (S), which remained at the same level as in 2018. A total of 16 transitions were recorded, of which 10 were negative, primarily from zone 3, while the largest number of jumps came from zone 1. The declines in the median and arithmetic mean indicate negative trends for 2019. Partial autocorrelations of Z″-scores with a time lag of one year, however, show that although these correlations declined in 2019, they remained at a higher level than in the period 2015 to 2017; i.e., 2019 is more strongly correlated with the previous year than was the case in the years preceding 2018, which is reflected in the increased accuracy of the ANNs in 2018 compared to 2016 and 2017.
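As a simplified illustration of the lag-1 association discussed above, the following Python sketch correlates firms' Z″-scores in each year with their values in the preceding year. The panel data are hypothetical, and plain cross-sectional Pearson correlations are used here as a stand-in for the partial autocorrelations reported in the paper.

```python
import pandas as pd

# Hypothetical panel: one row per firm, one column per year of Z''-scores
z = pd.DataFrame(
    {2017: [1.8, 2.9, 0.7, 3.4, 2.2],
     2018: [1.9, 3.1, 0.9, 3.2, 2.4],
     2019: [1.6, 3.0, 0.8, 2.8, 2.1],
     2020: [0.9, 2.2, 0.4, 1.9, 1.3]},
    index=["firm_A", "firm_B", "firm_C", "firm_D", "firm_E"],
)

# Lag-1 association across firms: correlate each year with the previous one
for year in z.columns[1:]:
    r = z[year].corr(z[year - 1])  # Pearson correlation across the sample
    print(f"{year - 1}->{year}: r = {r:.2f}")
```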
In respect of the RSI indicators, in 2019 the strength of the trend was significantly reduced for most indicators compared to 2018, while for the remaining indicators it stayed at the 2018 level, for both the 2015–2020 and 2016–2020 trends, which is a clear sign of declining stability in 2019. In terms of firm transitions between zones, a decline in trend was observed for all types of transitions except those originating from the third zone. Transitions from the second to the first zone show no trend over the entire analyzed period and can be assessed as a stochastic process, while the retention of companies in the first zone does not depend on the characteristics of the year in the period 2018 to 2020, since there was no change in the trend of this indicator. It is notable that in 2019 trends in zone 2 were significantly disrupted, which explains the problems the ANNs had with forecasts of zone 21 in 2019, considering that for this year the ANNs learned from the example of 2018, which had the opposite tendencies within this subzone.
These results are not fully in line with the analysis of indicators performed in Matejić et al. [6], where 2019 was assessed as a neutral year; however, that assessment was given for the period starting from 2015, while in the results presented here 2019 is more unstable than 2018, although more stable than 2016 and 2017.
In 2020, however, there was an overall collapse of the SIs, with all indicators except STAB2 registering a significant change compared to the previous period. The strength of the changes in this year is reflected by the SIs, which registered: a decrease in the average classification accuracy of the ANNs of 8.9% compared to 2019 and of up to 9.5% compared to 2018; a decrease in the maximum classification accuracy of 8% compared to 2019 and 9.1% compared to 2018; relative stability lower than in the previous year by 6.4%; no effect of increasing the FP period on the accuracy of the ANNs; an intensification of the destabilization process of 18.7%; less negative changes in the bankruptcy risk of companies that were in the first risk zone; and a continuation of the negative trend of overestimating the zone of companies relative to underestimating it, of 1.5%. The improvement in 2020 compared to 2019 in the area of changes in subzone 21 was indicated only by the STAB2 indicator, while the MSTAB13Z21 indicator showed a further deterioration in the predictability of zone 21.
The descriptive indicators also show that in 2020 there was a larger decline than in 2019: the standard deviation of firms' zones increased, which indicates greater stratification among firms in the context of their bankruptcy risk, and negative transition flows (NFs) increased, for the first time in the analyzed period exceeding positive transitions. Significant changes particularly affected zone 3, from which as many as 15 companies departed, while the decline from the second zone and jumps from lower to higher zones were not significantly affected compared to the period starting in 2018. A total of 25 zone changes were recorded, of which as many as 21 were negative. However, partial autocorrelations remained at a higher level than in 2016 and 2017, as was already the case in 2018, which confirms that the changes in 2020 were smaller than those in 2016 and 2017; as a result, the accuracy of the ANNs for self-classification did not decrease compared to these two years.
The decline in 2020 was also identified by the RSI indicators, except for the arithmetic means of Z″-scores and zones, which were previously shown not to be the best indicators of the stability of individual years, and negative transition flows (NFs), for which the growth trend has been increasing since 2018. Slightly higher stability compared to 2019 was recorded for transitions in which companies remained in the second risk zone, where the trend increased from 9% to 11% in 2020. That year, the staticity trend fell from 75% to 22%, the median of Z″-scores fell from 32% to 10%, and the median zone fell from 100% to 0%, while the average class II RSI indicator fell from 61% in 2019 to 40% and within class IV it fell from 44% to 27%, which indicates that 2020 was a year with a significant decline in stability.
As shown in
Table 7, the RSI indicators for transitions originating from the third zone registered a significant decline in the trend of these transitions in 2020, while, at the same time, the ANNs maintained their accuracy in forecasting within zone 3. This can be explained by the fact that the ANNs were trained on data from a period dominated by negative transitions, especially transitions from the third to the second zone, which were also the transitions hit hardest by the COVID-19 crisis in 2020, so the ANNs were prepared to recognize such changes in 2020 even if the COVID-19 crisis had not occurred.
In the period from 2016 to 2020, therefore, negative transitions dominated the structure of transitions between two successive years, with a total of 42 such transitions compared to 34 positive transitions. A total of 17 transitions involved a significant change of zone: 8 from zone 3 to zone 1 and 9 from zone 1 to zone 3 in two consecutive years. These indicators support the conclusion that the HI in the Republic of Serbia is a dynamic industry in terms of changes in bankruptcy risk. Overall, during the analyzed period there was a trend of relative increase of negative transitions, i.e., declines from higher to lower zones compared to positive transitions in the same year; starting from 2016, these differences were 0, 1, 4, 4, and 17. Among the negative transitions over the entire analyzed period, the dominant transitions were those from the third to the second zone, of which as many as 22 were recorded in the pre-COVID-19 period, while the number of transitions from the second to the first zone was 18.
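The counting of positive and negative transitions described above can be illustrated with the following Python sketch. The zone data are hypothetical, and the numbering convention (a higher zone number corresponds to lower bankruptcy risk, so a drop in zone number is a negative transition) is an assumption made for the example.

```python
import pandas as pd

# Hypothetical firm risk zones per year (assumption: 3 = lowest risk, 1 = highest)
zones = pd.DataFrame(
    {2016: [3, 2, 1, 3, 2],
     2017: [2, 2, 1, 3, 3],
     2018: [2, 3, 1, 3, 3],
     2019: [1, 3, 2, 2, 3],
     2020: [1, 2, 1, 2, 2]},
    index=["firm_A", "firm_B", "firm_C", "firm_D", "firm_E"],
)

for year in zones.columns[1:]:
    delta = zones[year] - zones[year - 1]
    positive = int((delta > 0).sum())   # jumps to a higher (better) zone
    negative = int((delta < 0).sum())   # declines to a lower zone
    from_zone3 = int(((zones[year - 1] == 3) & (delta < 0)).sum())
    print(year, "PF:", positive, "NF:", negative,
          "declines from zone 3:", from_zone3)
```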
Given that 2019, in which the effects of the COVID-19 crisis were not present, already brought a significant decline in stability relative to the pre-COVID-19 period, separating the impact of the COVID-19 crisis in 2020 is a difficult problem. The indicators that stood out in identifying a special form of crisis in 2020 are the destabilization indicator and MSTAB13Z1. The appearance of a special form of crisis in 2020 is also indicated by the descriptive indicator of staticity, given that the value of this indicator did not change in 2019, while in 2020 it fell by 21%. To fully separate the impact of the COVID-19 crisis, it is necessary to take into account the cyclical trends in HI bankruptcy risks in the Republic of Serbia, as was done in Matejić et al. [6], and the results of that paper can be used to validate the SIs under conditions of major structural changes. Overall, the COVID-19 crisis represented a significant change in 2020 with respect to bankruptcy risk in the HI in the Republic of Serbia.
The success of SIs in describing changes in time-series data was validated from the point of view of analyzing their correlation with the values of RSI indicators for the period 2016 to 2020 (RSI2), which is shown in more detail in
Table A10. This analysis shows that ANNs largely follow RSI indicators of HI bankruptcy risk in the Republic of Serbia in the given period of analysis, despite evident turbulences in this industry during the period, with an average correlation coefficient value of 0.79. As shown in
Table A10, different SIs have different success rates in tracking the RSI of different indicators: for the first class of indicators, the highest average correlation was achieved with the Z″-scores over the given period, while for the second class it was shown that the accuracies of the ANNs are good approximators of changes in trend data for the staticity indicator of the year (S) and the positive flows of the year (PFs).
Among the indicators themselves, the destabilization indicator had the highest average correlation with the analyzed RSI indicators of the various variables relevant to monitoring bankruptcy risk at the HI level, with high correlations also achieved by the STAB and STAB3 indicators, while the STAB2 indicator had the lowest overall results. It is also notable that the overestimation and underestimation indicators (UO and UOR) had the worst results for firms within the third risk zone.
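The correlation analysis underlying Table A10 can be sketched as follows: each cell is essentially a Pearson correlation between the annual series of one stability indicator and the RSI series of one bankruptcy-risk indicator. The values below are hypothetical and serve only to illustrate the computation.

```python
import pandas as pd

# Hypothetical annual series, 2016-2020: one stability indicator (SI) and
# the RSI trend strength of one bankruptcy-risk indicator
si = pd.Series({2016: 0.62, 2017: 0.55, 2018: 0.71, 2019: 0.68, 2020: 0.41},
               name="STAB")
rsi = pd.Series({2016: 48.0, 2017: 35.0, 2018: 74.0, 2019: 61.0, 2020: 22.0},
                name="RSI2_median_Z")

# One cell of a Table A10-style correlation matrix
print(f"Pearson r = {si.corr(rsi):.2f}")
```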
4.7. Influence of Input Factors of the Model on the ANN Accuracies in the Learning Phase
In the last part of the analysis, we tested the impact of the set of input factor models, the model classes, and the addition of a cluster input factor on the AMAN accuracies of ANNs for predicting firm risk zones in the individual years 2019 to 2021. A set of 8310 ANNs was used, with different learning periods and periods from which input factors originated, along with different MEF models and different applications of cluster input factors. For the set of all models, univariate analysis (α = 0.05) was performed separately for each year, and the statistical significance of the influence of the model components on AMAN model accuracy was tested. Through additional univariate tests, the influence of the MC (model class) component was tested for the sets of Class 1 and Class 2 models and, separately, Class 1 and Class 3 models. For models with a learning period of one year, it was not possible to use market indicators, as they are constant in the sample. Some models were not applicable due to the unavailability of data for certain years; such models are listed in
Table A2,
Table A3,
Table A4 and
Table A5. The results of the univariate tests are shown in
Table 12.
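The univariate tests reported in Table 12 can be illustrated, under simplifying assumptions, by a one-way ANOVA of learning-phase accuracies grouped by one model component; the grouped accuracy values below are hypothetical, and the tests performed in the study may include additional components.

```python
from scipy import stats

# Hypothetical AMAN accuracies (learning phase) grouped by MEF model,
# one group per input-factor model for a single forecast year
aman_by_mef = {
    3:  [0.86, 0.84, 0.88, 0.85, 0.87],
    7:  [0.80, 0.79, 0.82, 0.78, 0.81],
    17: [0.90, 0.88, 0.91, 0.89, 0.92],
}

f_stat, p_value = stats.f_oneway(*aman_by_mef.values())
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # significant at alpha = 0.05 if p < 0.05
```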
For networks prepared to predict bankruptcy risk zones in 2019 and 2020, the choice of input factor model and MEF class had a statistically significant impact on ANN accuracy. A partial comparison of the model class based on Altman's indicators with the class of models that additionally applied market factors showed a statistically significant difference in the accuracy of these two classes in both years. For 2019, the most precise Class 1 ANN had a precision in the learning phase of 91.7%, while the most precise Class 2 ANN had a precision of 83.4%. For 2020, the most precise Class 1 network had an AMAN of 90.7%, while for Class 2 this precision was 88.5%. Thus, in the case of 2019 and 2020, adding market indicators among the input factors of the model led to a decrease in the accuracy of the ANNs in the learning phase. These results are expected, because it was shown that the Z″-scores of firms and their zones in a given year in the period 2016–2019 are very weakly correlated with the market indicators used in the study (0.003 and 0.007, respectively), while the market indicators are at the same time strongly mutually correlated (0.974, correlation matrix determinant = 0.000).
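The multicollinearity argument above can be illustrated in miniature as follows; the series are synthetic, with the two "market indicators" constructed to be nearly collinear, so the determinant of the correlation matrix approaches zero.

```python
import numpy as np

# Synthetic standardized series: a Z''-score indicator and two market indicators
rng = np.random.default_rng(0)
z_ind = rng.normal(size=100)
market_a = rng.normal(size=100)
market_b = 0.97 * market_a + 0.03 * rng.normal(size=100)  # nearly collinear pair

corr = np.corrcoef(np.vstack([z_ind, market_a, market_b]))
print(corr.round(3))
print("determinant:", round(float(np.linalg.det(corr)), 4))  # near 0 => multicollinearity
```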
When it comes to the internal non-financial indicators of the firms, no statistically significant difference was identified in either case, which means that adding non-financial internal indicators did not, in general, increase the accuracy of the ANNs in the learning phase for 2019 and 2020. As previously mentioned, accuracy in the learning phase did not include the effective accuracy achieved in 2019 and 2020; in the case of 2019, it represents classification accuracy for the preceding period, ending in 2018, depending on the specific LP period. In the case of networks prepared for 2020, the last year affecting the AMAN value was 2019, for which previous analyses established that it did not deviate significantly from the preceding period in terms of bankruptcy risk exposure. In this sense, we can generalize the previous conclusions: the addition of market indicators reduces accuracy in the learning phase when learning takes place under regular market conditions, when the impact of a sudden crisis is not felt, while adding non-financial internal indicators does not improve the precision of the ANNs for the models presented in this study.
When it comes to the choice of a specific model of input factors, in the case of 2019 and 2020 there were generally statistically significant differences in the accuracy of the ANNs in the learning phase arising from the choice of the MEF model. To examine the application of market and non-financial factors in more detail, we performed a series of two-tailed paired-samples t-tests, comparing the precision of individual models that included market or non-financial factors against the results of models with MEF = 3 and the same LP and FP. The results are shown in
Table 13.
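A minimal sketch of such a paired comparison is shown below; the accuracy vectors are hypothetical and simply illustrate pairing ANNs with the same LP and FP across the reference model (MEF = 3) and a variant model.

```python
from scipy import stats

# Hypothetical learning-phase accuracies of paired ANNs (same LP and FP):
# the reference MEF = 3 model vs. a model that adds one market factor
acc_mef3    = [0.88, 0.85, 0.90, 0.87, 0.86, 0.89, 0.84, 0.88]
acc_variant = [0.84, 0.83, 0.87, 0.85, 0.82, 0.86, 0.81, 0.85]

t_stat, p_value = stats.ttest_rel(acc_mef3, acc_variant)  # two-tailed by default
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```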
For networks prepared for forecasting in 2019, models with MEF ∈ {12, 18–20} were not applicable. Among the market factors, the accuracies of the ANNs were statistically significantly reduced by the isolated application of the factors Fyear (Z″ind) (MEF = 7), LPCyear−i (MEF = 9), and LLSyear−i (MEF = 11), and, among the internal non-financial indicators, by the firm age factor (MEF = 13), while precision was improved by combining the country of origin of the dominant owners with the change in Z″-score in the previous year (MEF = 17). Among the models that applied these input factors, the most accurate network in the learning phase for 2019 had an accuracy of 91.7%.
For networks prepared for forecasting in 2020, models with MEF = 19 were not available, as we possessed data on the number of employees of the sample companies only for the period 2018 to 2020. In 2020, the impact of the MEF component was weaker than in 2019, and two MEF models that compromised the accuracy of the ANNs were identified: the model with MEF = 8, which took into account the arithmetic mean of the sample Z″-scores in the past year (Z″ind−i), and the model with MEF = 14 (TYPE), which took into account the institutional type of the hotel firm.
For networks prepared for prediction in 2021, models with MEF = 6 were not available, as at the time of writing we did not have data for %GDP2020. In the case of 2021, which included 2020 in the learning period, none of the MEF classes stood out in terms of accuracy compared to the others, while the application of input factor models with individual market and internal indicators began to yield results in terms of improving the accuracy of the networks.
The model class had a statistically significant impact for all models in general and when comparing Class 1 models with Class 2 models. The most accurate MEF Class 1 network had an accuracy of 86%, the most accurate Class 2 network 84.2%, and the most accurate Class 3 network 84.1%. Network accuracy was improved, for models with the same FP and LP, by the following market factors: MEF = 8, which took into account the arithmetic mean of the sample Z″-scores (Z″ind−i); MEF = 11, which took into account LLSYEAR−i; and MEF = 12, which took into account the indicator of the relative stability of the past year (RSTABYEAR−i). An improvement was also achieved with the combined model, which in addition to this indicator took into account the age of the company (MEF = 20). The MEF = 8 models, which reduced the accuracy of networks learning for forecasting in 2020, increased the accuracies of the ANNs learning for 2021.
In general, under conditions of a sudden crisis, we can state that market input factors improve the accuracy of ANNs with the same LP and FP in the first years of the crisis, but they have limited application because they cannot be used for LP = 1, for which the highest ANN accuracies were achieved in the learning phase in periods with significant changes in stability. Further empirical research is needed in this domain to analyze the influence of market input factors on the accuracy of ANNs for forecasting zones in 2022.
The application of clusters generally had a statistically significant effect on the accuracies of ANNs prepared for forecasting zones in 2019 and 2020; however, the improvements were not large. In 2019, applying the cluster input factor C20198/3 improved the accuracy of the model to which it was applied by 1.7%, applying C20198/2 improved it by 1.1%, and applying C20193/1 improved it by only 0.3%. The application of clusters gave better results when the full available period for FP and LP was not used, since the cluster input factor in some models effectively extended the period from which data were fed to the network. Thus, for FP = 1 the accuracy was improved by 2%, while for FP = 2 and FP = 3 the improvements were negligible (0.2% each). Similarly, the accuracy of the models with LP = 1 was improved the most, by 1.7%, while for higher LP values the improvement was lower (1.1% and 0.3% for LP values of 2 and 3). The improvement in the accuracy of the reference models for 2019 was 1.3%.
For 2020, the application of cluster C20208/3 improved the accuracy of ANNs by 1.5% and cluster C20208/2 by 1.9%, while the application of C20203/1 was counterproductive and reduced the value of the AVM indicator by 0.9%. The largest improvement in accuracy was achieved with models with FP = 2 and amounted to 1.6%, followed by models with FP = 3, where the improvement was 1.4%, and FP = 1, with 1.1%, while the smallest improvement, of 0.8%, was achieved for models with LP = 2 and LP = 4.
In 2021, when different regularities prevailed, the application of the cluster input factor did not have a statistically significant impact on AMAN accuracies. When the networks were trained for forecasting zones in this year, they had the highest accuracy for the LP = 1, FP = 1 configuration, and the application of clusters did not give results, as the clusters included data from a longer period before 2021, which in this case was undesirable. Thus, for all three MEF classes, the most accurate ANNs were obtained for models that did not use the cluster input factor.
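The hybrid cascade idea, i.e., feeding a cluster label derived from firms' historical risk zones to the ANN as an additional input factor, can be sketched as follows. The clustering method, network architecture, and data below are illustrative stand-ins (KMeans and a scikit-learn MLP on synthetic data) rather than a reproduction of the models used in the study.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(42)

# Synthetic data: per-firm financial input factors and zone trajectories
# from preceding years (the latter used only to form the cluster input factor)
n_firms = 60
financials = rng.normal(size=(n_firms, 4))           # e.g., Altman-type ratios
past_zones = rng.integers(1, 4, size=(n_firms, 3))   # zones in 3 prior years
target_zone = rng.integers(1, 4, size=n_firms)       # zone in the forecast year

# Stage 1: cluster firms by their historical zone trajectories
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(past_zones)

# Stage 2: feed the cluster label to the ANN as an additional input factor
X = np.column_stack([financials, clusters])
ann = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
ann.fit(X, target_zone)
print("learning-phase accuracy:", round(ann.score(X, target_zone), 3))
```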
We used multiple discriminant analysis (MDA) as a model factor selection method, applying the stepwise method with Wilks' lambda for function separation. When the internal non-financial indicators of firms were tested separately, it was established at a 95% confidence level that the firm's zone in the individual years of the period 2016–2019 can be described only through the age of the firm (AGE), with Wilks' lambda values of 0.887 (p = 0.003), 0.897 (p = 0.005), 0.926 (p = 0.024), and 0.917 (p = 0.015), respectively, by year in the analyzed period. In the case of 2020, none of the internal factors considered by this method, including the number of employees in the past two years and the change in the number of employees in the past year, had a statistically significant impact on the achieved zone of the company in that year.
The MDA analysis confirmed that the cluster input factors, formed on the basis of firm risk zones in the period preceding the year in which the firms' zones were analyzed, had a statistically significant impact on firm zones in 2019 and 2020, which was not the case in 2021. In contrast to the analysis of the impact of internal non-financial factors, in this case the dependencies between the firm's zone in the year of analysis and the clusters used to train the ANNs for a given forecast year were examined separately. A statistically significant association was found for the following clusters applied to ANNs for forecasting in 2019 and 2020: C20198/3 (Wilks' lambda = 0.733, p = 0.000), C20198/2 (Wilks' lambda = 0.930, p = 0.029), C20208/3 (Wilks' lambda = 0.916, p = 0.014), C20208/2 (Wilks' lambda = 0.763, p = 0.000), and C20203/1 (Wilks' lambda = 0.686, p = 0.000). According to this method, C20193/1 (Wilks' lambda = 0.979, p = 0.359) had no statistically significant impact on risk zones in 2019; this is also the cluster for which the ANNs registered the least improvement in accuracy.
When it comes to market factors, the application of MDA for individual years is not possible, since the values of market factors for firms within a single year are constant. In this case, the MDA method was applied to different periods within 2016–2020, covering intervals of two to five years, testing both the impact of market factors from the previous period on the achieved zone of companies in the analyzed periods and the impact of market factors from the current year on the firm's zone in that same year. No statistically significant relationship was found between a market factor and a firm's zone for any combination of period and market factors. For example, when testing the relationship between the realized zone in a year and previous values of market factors, the lowest p-values for the period 2017–2020 were obtained for RSTAByear−1 (0.079) and Z″ind−1 (0.061); for the period 2018–2020, for RSTAByear−1 (0.078) and LPCyear−1 (0.075); and for the period 2019–2020, for LPCyear−1 (0.202) and Z″ind−1 (0.207). No statistically significant relationships were established for the pre-COVID-19 periods either.
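For reference, the univariate Wilks' lambda used when screening a single candidate variable (such as AGE) against the risk-zone grouping can be computed as the ratio of the within-group to the total sum of squares, together with the associated one-way ANOVA F-test. The sketch below uses hypothetical data and does not reproduce the full stepwise MDA procedure.

```python
import numpy as np
from scipy import stats

def wilks_lambda_univariate(x, groups):
    """Univariate Wilks' lambda (SS_within / SS_total) for one candidate
    discriminating variable against a grouping (here: risk zones), with the
    one-way ANOVA F-test p-value used for significance."""
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups)
    ss_total = ((x - x.mean()) ** 2).sum()
    ss_within = sum(((x[groups == g] - x[groups == g].mean()) ** 2).sum()
                    for g in np.unique(groups))
    lam = ss_within / ss_total
    samples = [x[groups == g] for g in np.unique(groups)]
    _, p_value = stats.f_oneway(*samples)
    return lam, p_value

# Hypothetical firm ages and their risk zones in one year
age  = np.array([25, 3, 12, 30, 7, 18, 2, 22, 9, 15, 28, 5])
zone = np.array([3, 1, 2, 3, 1, 2, 1, 3, 2, 2, 3, 1])

lam, p = wilks_lambda_univariate(age, zone)
print(f"Wilks' lambda = {lam:.3f}, p = {p:.3f}")
```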
Overall, the results of the MDA analysis are in principle consistent with the results of the ANNs, with the difference that the ANNs found usable information in Z″ind−i, RSTAByear−i, and LLSyear−i for zone forecasts in 2021. At the same time, the ANNs did not find the company age factor useful except when it was combined with RSTAByear−i in forecasting the zone in 2021. More broadly, financial distress prediction is crucial in the financial domain, as serious financial losses may be avoided through good financial distress prediction [126]. Furthermore, the detection and prediction of fraudulent financial reporting has become increasingly important to researchers and practitioners due to the growing number of corporate fraud reports, with bankruptcy among the consequences [127]; indeed, bankruptcy can result from fraudulent financial reporting [128]. Researchers have used various techniques and models to detect and predict fraudulent financial reporting, given that it can obscure a company's possible bankruptcy. Predicting corporate bankruptcy has been extensively studied in accounting, as all stakeholders in a firm have a vested interest in monitoring its financial performance. A study by Wilson [129] shows that neural networks perform significantly better than discriminant analysis in predicting the bankruptcy of firms. The detection of management fraud is an important issue facing the accounting and auditing profession, and ANNs can be used as a decision aid (analytical procedure) in detecting instances of management fraud [130]. Fanning and Cogger [131] suggest that ANNs offer a superior ability compared to standard statistical methods in detecting fraudulent financial statements (FFS).
5. Conclusions
In this study, a comprehensive multidimensional analysis of changes in various indicators of accuracy, i.e., of the generalization capabilities of ANNs for assessing the bankruptcy risk zones of Altman's model, was performed for hotel companies in the Republic of Serbia. Different ANN models, varying in learning periods, in the periods from which the input factors originate, and in the application of hybrid cascade models, were trained and tested for different years with different dynamic characteristics. ANN generalization errors served as a basis for evaluating the stability of individual years and the dynamics of changes in the market, and consequently for formulating the series of annual stability indicators of the HI in terms of bankruptcy risk presented throughout the paper. Special attention was paid to the indicators of stability in 2020, in which the effects of the COVID-19 crisis were present, in order to establish the extent to which this year differed from previous years, taking into account trends in the development of different factors.
In line with the first research objective of the paper, it was shown, using the example of 1220 ANNs for the period 2019–2020 and 840 ANNs for the period 2018–2020, that the ANNs lost a significant part of their average goodness-of-prediction accuracy when predicting the bankruptcy risk zones of HI firms in 2020, with average accuracies ranging from 61.5% to 68%. The average effective accuracy of the trained ANNs for 2019 ranged from 77.4% to 83%, while for 2018 a slightly lower accuracy was achieved, in the range of 74.8% to 79.3%.
In addition, the nonlinear nature of the relationship between goodness-of-fit accuracy and the effective accuracy of the ANN models further complicates the selection of an adequate ANN prognostic model, making the ANN models tested in this study unusable for zone assessments of individual companies in the first year of a sudden crisis such as the COVID-19 crisis, i.e., in 2020. Analogously, in line with the declining accuracy of ANNs for forecasting the zones in 2020, the learning-phase accuracy of the reference models for risk forecasting in 2021 decreased from 85.4% in 2020 to 74.4% in 2021. However, growth of effective precision in 2021 compared to 2020 is possible by adapting the ANNs to the new working conditions and increasing the stability of the learning period, i.e., by including 2020 in a learning period that is similar in nature to 2021, which requires further empirical research.
The second objective of the paper, i.e., the development of indicators for evaluating the stability of a year and, through that, the effects of the COVID-19 crisis in 2020, resulted in the 11 different stability indicators presented in the study. The indicators were validated through quantitative analysis of bankruptcy risk indicators, their mutual comparison, comparison of the obtained results with the results presented in Matejić et al. [6], and analysis of the correlation of stability indicator values with the RSI values of bankruptcy risk indicators. It was shown that, for the period 2016–2020, RSI indicators are highly correlated with the stability indicators, so that ANN generalization errors are a good indicator of the stability of individual years within the analyzed periods.
The stability indicators for the pre-COVID-19 period showed that the HI in the Republic of Serbia is a dynamic industry with large oscillations in the risk of bankruptcy. Within the period starting from 2016, in 2017 there was an increase in the risk of bankruptcy, while 2018 was a year of stabilization, for which the predictive potential of ANNs was at the highest level in the analyzed period, and for which underestimations of the zones were more significant than overestimations, while the registered moderate changes were of a positive nature. Compared to 2018, as many as 10 of the 11 indicators registered a change in trend in 2019. Analysis of descriptive indicators of bankruptcy risk confirmed that changes in 2019 were generally negative in nature, where the risk of bankruptcy increased, especially for firms in the sample that were previously less exposed to the risk of bankruptcy; however, the change in 2019 was smaller compared to the changes in 2016 and 2017. Overall, in the period 2016 to 2019, there was a trend of relative increase in negative transitions that indicates a decline from higher to lower zones in relation to positive transitions, among which the most pronounced are the transitions from the third to the second zone of bankruptcy risk.
For 2020, the stability indicators collapsed in the context of corporate exposure to bankruptcy risk; they describe the severity of the crisis in 2020 through the following: a decline compared to 2019 on a scale of 6.4% to 8.9%, and of up to 9.5% compared to 2018; an intensification of the destabilization process of 18.7%; and a continuation of the negative trend of overestimating the zone of companies relative to underestimating it.
However, separating the impact of the COVID-19 crisis is a difficult problem under the conditions of the HI in the Republic of Serbia, given that bankruptcy risk trends in this industry were volatile in the pre-COVID period and that a negative trend was already present in 2019 compared to 2018. The indicators that stood out in identifying a different form of crisis in 2020 are the destabilization indicator, which identified a strengthening of the destabilization process of 18.7% (while the value of this indicator in 2019 was 1%), and the indicator that measures accuracy within the first zone of Altman's model, which singled out 2020 relative to 2019 due to the nature of the changes. Further, the descriptive indicator of the staticity of the year was found to be a good indicator of the COVID-19 crisis, given that its value did not change in 2019, while in 2020 it fell by as much as 21%. Similarly, although a trend of relative increase of negative transitions, i.e., declines from higher to lower zones compared to positive transitions in the same year, was already marked in 2019, in 2020 this difference amounted to as many as 17 transitions, the largest in the entire analyzed period from 2016 to 2020. The findings of the paper are in accordance with the results of Matejić et al. [6], who confirmed that the COVID-19 crisis had a degrading effect on the HI in the Republic of Serbia in 2020.
In terms of the third objective of the paper, the conclusion of the analysis is that extending the time horizon from which input factors originate, and extending the learning periods, in time-series ANN models for bankruptcy risk-zone prediction has beneficial effects on ANN accuracy in years assessed as stable within the analyzed period, regardless of the stability of the overall learning period as expressed through ANN accuracy in the learning phase for the classification year, as was the case in 2018.
In the context of the fourth objective of the paper (testing whether non-financial, industry, and internal indicators contribute to the accuracy of the ANN models), it was shown, using the example of 8310 ANNs, that adding market indicators to Altman's model indicators (Z″year−i and Zoneyear−i) leads to a decrease in ANN accuracy in the learning phase during stable learning periods, while the application of internal non-financial indicators of firms has no impact on accuracy. In the case of the ANNs forecasting 2021, the addition of market factors increased their accuracy in the learning phase, but this addition was not effective overall because, as shown previously, the highest accuracy for 2021 was achieved when learning used data solely from 2020, in which case market indicators, being constant on an annual basis, could not be used.
The addition of cluster input factors, i.e., the use of the hybrid cascade model presented herein, improved prediction accuracy for the networks trained for 2019 and 2020, although these improvements were small, depended on the type of cluster used, and had a greater effect on models whose analysis period was shorter than the maximum available period, since the cluster factor effectively extended the analysis period to the maximum available one. For the ANNs prepared for 2021, however, the use of cluster input factors did not yield results, because in that case extending the analysis period was counterproductive due to the difference between 2020 and the preceding period.
In terms of the sixth objective of the paper, the analysis of dynamic patterns, wherein companies were classified into 10 groups in 2018, 2019, and 2020 depending on changes in their zones in the previous periods, showed that in 2018 no group of companies stood out in terms of the effective accuracies of the ANNs, while in 2019 lower effective accuracies were achieved for a group of 18 companies that had previously been stable in zone 1 and that achieved improvements within that zone in 2019, which is in line with the declining predictability within subzone 21 in 2019. In 2020, however, the crisis affected a larger group of companies and had a greater negative effect on companies that had had a positive trend, i.e., a decline in exposure to bankruptcy risk, in the period from 2016 to 2018.
This study makes several scientific contributions. First, we present a new conceptual framework for the application of ANNs in predicting bankruptcy risks through the analysis of their generalization capabilities. This is the first study after that of Matejić et al. [
6] to tackle firm risk zones using ANNs with scoring technique input factors (in this case Altman’s Z″-score model) and to apply time-delay time-series neural networks to predict bankruptcy risk in the hotel industry. Second, we have provided empirical results regarding the applicability of ANNs and the effectiveness of different ANN models in predicting bankruptcy risk zones in the hotel industry in pre-COVID-19 conditions and the first year of the COVID-19 crisis. In this way, in general, the empirical basis for the influence on ANN accuracies of various input factors is expanded, including non-financial internal indicators of companies, market indicators, and application of hybrid cascade ANNs, as well as different entry and learning periods. Third, we have presented a methodology for assessing the annual stability of the industry, and the degree and nature of changes that occur globally in a given year in the analyzed industry, providing a set of stability indicators based on different types of ANN accuracy for different ANN models. The stability indicators are important in ex-post analyses, especially in the field of interpreting the accuracy of ANNs in predicting bankruptcy risk zones and identifying factors that affect their effectiveness. Fourth, we have demonstrated one way to apply clustering to assess changes in bankruptcy risk development for different groups of firms, depending on their historical development of bankruptcy risk. Fifth, we have suggested that RSI indicators for various descriptive indicators relevant to the development of bankruptcy risk can be used both in the evaluation of ANN models and in the pre-evaluation of the usability of ANNs to predict this phenomenon in years with different contributions to data trends over a wider period of time. Sixth, through a detailed analysis of trends in the development of bankruptcy risk in the HI in the Republic of Serbia, we have provided a basis for better understanding of this phenomenon, which is important for public and internal policy makers, creditors, employees, and other relevant entities.
The limitations of the study stem from the limited sample of available data, especially in the period when the COVID-19 crisis was present, and we analyzed the impact of the crisis only for 2020. The persuasiveness of the study is limited both by the randomness factor in network training and the limited possibilities for validation through earlier research studies, due to the innovativeness of the concept of bankruptcy prediction presented herein.
The methodology of measuring dynamic changes presented herein is potentially applicable in other industries and national economies and in the general evaluation of market dynamics, not only in times of crisis but also in cases of developmental and positive changes in the market and at the level of individual firms; for such applications, its robustness must first be tested. In this sense, it is necessary to expand the analysis period, to test the accuracy of the ANN models and stability indicators for the years after 2021 and for periods in which positive development changes occurred in the hotel industry in Serbia, and to apply the same approach to other samples of companies. In addition to testing the robustness of the model for the hotel industry in the Republic of Serbia, its applicability should be tested in other industries and national economies in order to ensure wider applicability.
Furthermore, the results of the presented concept are limited by the capacities of Altman's EM model and depend on its discriminant values for determining bankruptcy risk zones. In this sense, it is necessary in further studies to test the effectiveness of the presented model in predicting bankruptcy itself, especially for different scenarios that can be explored through the manipulation of the discriminant (cut-off) values of Altman's model. Additionally, it is advisable to test the robustness of the approach by applying other scoring models (such as Conan and Holder [46], Springate [47], Fulmer et al. [48], Ohlson [49], Zmijewski [50], etc.) and integrative scoring techniques. Further development of artificial neural networks, and of other machine learning techniques, for the prediction of fraudulent financial reporting is also advisable.