1. Introduction
Since the end of 2019, throughout 2020 and into 2021, we have experienced the rapid propagation of a novel coronavirus that killed more than eighteen hundred people and infected thousands of individuals in just the first two months of the pandemic [1]. The virus has since spread rapidly to many cities on all continents. The most notable symptoms in patients (based on clinical data) are a dry cough, dyspnea, high fever and other related symptoms. At the beginning, most cases were localized to the city of Wuhan in China. In response to the spreading outbreak, on 30 January 2020, the World Health Organization (WHO) officially declared the COVID-19 outbreak to be a Public Health Emergency of International Concern [2].
Given the importance of the problem, many research groups around the world have dedicated their efforts to understanding all facets of the COVID-19 pandemic; we mention some of these works as a representative sample of the current literature. One study identified emerging patterns that may contribute to the automatic diagnosis of COVID-19 using convolutional neural networks, and its results showed that the method can have a relevant impact on automatic diagnosis [3]. Another relevant work studied COVID-19 cases in China based on a dynamic statistical approach [4]. Other important articles include a prediction, using deep neural learning models, of commercially available antiviral drugs that have a high probability of a positive impact on the novel coronavirus [5] and an early prediction of the COVID-19 outbreak in China based on a particular design of a mathematical model [6]. Additionally, the work presented in [7] described a range of practical online/mobile geographical information systems, mapping dashboards and applications for tracking the COVID-19 pandemic as it evolves around the globe, and [8] proposed using cartograms to better visualize the spread of COVID-19. Finally, we can outline some recent studies that use artificial intelligence (AI); for example, in [9] the authors put forward the idea of using learning methods to identify COVID-19 cases more quickly with a mobile phone-based web survey. In addition, AI techniques have been successfully utilized in decision-making problems for healthcare applications. This suggests that AI-driven methods can be useful in identifying when COVID-19 outbreaks will occur, as well as in predicting the nature and rate of their spread around the globe [10].
However, the works mentioned above have mostly treated the temporal facet of the problem; that is, most of these contributions have been aimed at predicting or forecasting COVID-19 data in a variety of ways. This facet is relevant, as organizations need to estimate the number of COVID-19 cases in order to make the best decisions about the financial support to be directed to solving the COVID-19 problem. The current gap in the existing knowledge is therefore the lack of models that can intelligently combine both the spatial and temporal aspects of the dynamics of COVID-19. In this sense, one of the most important contributions of this article is the use of neural networks for clustering countries that are similar with respect to their status in the COVID-19 pandemic, which makes it possible to put forward common strategies for countries in the same cluster. Another important contribution is the use of the fuzzy fractal approach for efficient prediction in each of the classes formed by the neural network. In our opinion, these contributions are important both on their own and when combined, as they complement each other: the temporal view of the problem is complemented by the spatial aspect in order to arrive at a solution to the global problem. In addition, we want to emphasize that the proposed approach combines three methods (self-organizing maps, fuzzy logic and fractal methods) for the spatial and temporal analysis of data and its application to time series prediction, a combination that has not previously been reported in the literature.
Regarding the relevance of the paper to the sustainability area, the proposed approach is directly related to the healthcare theme. More generally, this study proposes a novel methodology that can be used to monitor and stop the spread of COVID-19, which is a major public concern across the world. Therefore, this study contributes to the social sustainability dimension.
The rest of the article is organized as follows. In Section 2, we briefly summarize the most important concepts of a special kind of unsupervised neural network. Section 3 explains the theoretical basis of the fractal dimension concept. Section 4 briefly describes the basic concepts of fuzzy logic and its application in time series prediction. Section 5 offers a description of the problem to be solved and the method that is proposed in this work. Section 6 outlines the experiments and summarizes the results achieved with the proposal in this article. Section 7 offers a discussion of the obtained results. Lastly, Section 8 summarizes the conclusions of this work.
2. Self-Organizing Neural Networks
The Kohonen map, also known as the self-organizing map (SOM), is a type of unsupervised neural network model that can be used to find and analyze patterns in datasets of high dimensionality. This neural network was originally put forward in 1982 by the Finnish researcher Teuvo Kohonen. The SOM is a grouping method that finds clusters in a dataset and does not require the use of statistical methods. The SOM is composed of only two layers: the input and the output layers [11]. The main aim of this method is to map the input elements, each having n attributes, to the output in a form where related elements are placed together (this is achieved by forming the clusters). In this model, connection weights relate the neurons to one another, and the inputs are directly connected to the outputs. The weights from the N inputs to the M output nodes are initialized with small random values [12]. The activations of the output units in this model are given by Equation (1). The method for adapting the weights is defined by Equation (2).
$$y_j = F_{\min}(d_j) = F_{\min}\left(\sum_{i=1}^{N} (x_i - w_{ij})^2\right) \qquad (1)$$

$$\Delta w_{ij} = y_j \, \eta(t) \, (x_i - w_{ij}) \qquad (2)$$

where $y_j$ is the activation of output unit $j$, $x_i$ is the activation value of input unit $i$, $w_{ij}$ are the weights of the lateral connections to the output, $d_j$ is the distance evaluated for the neurons in the neighborhood, $F_{\min}$ is a unity function giving a value of 1 or 0 and $\eta(t)$ is a gain term that decreases over time. Competitive learning is provided by the lateral connections: the output layer neurons compete to classify the input patterns. In the initial phase of training, the input patterns are presented to the network, and the winning output neuron is the one whose weight vector is closest to the input; it is regarded as the cluster representative. Equation (1) shows how the distance is used to select the winning neuron [13]. Figure 1 illustrates the self-organizing network architecture, showing the neighborhood around the winner neuron. The network has a very simple structure without hidden layers, using only an input and an output layer. For the application in this paper, both the input and output layers have 199 nodes (the number of countries considered in the study) and the number of epochs is 1000.
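The competitive learning scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MATLAB implementation used in this work: the function names `train_som` and `assign_clusters`, the one-dimensional arrangement of the output neurons, the linear decay of the gain term and of the neighborhood radius, and the toy data are all hypothetical choices made for the example.

```python
import numpy as np


def train_som(data, n_outputs, epochs=100, eta0=0.5, radius0=None, seed=0):
    """Train a one-dimensional self-organizing map with competitive learning.

    data      : array of shape (n_samples, n_inputs)
    n_outputs : number of output neurons (M)
    Returns the weight matrix of shape (n_outputs, n_inputs).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = data.shape
    # Small random initial weights, as described in the text.
    weights = rng.uniform(-0.05, 0.05, size=(n_outputs, n_inputs))
    if radius0 is None:
        radius0 = max(n_outputs / 2.0, 1.0)  # assumed starting radius
    positions = np.arange(n_outputs)

    for epoch in range(epochs):
        # Assumed schedule: gain and radius decrease linearly over time.
        decay = 1.0 - epoch / epochs
        eta = eta0 * decay
        radius = max(radius0 * decay, 0.5)
        for x in data[rng.permutation(n_samples)]:
            # Squared Euclidean distance from the input to every weight vector.
            dist = np.sum((x - weights) ** 2, axis=1)
            winner = np.argmin(dist)
            # Indicator equal to 1 inside the winner's neighborhood, 0 outside.
            in_neighborhood = (np.abs(positions - winner) <= radius).astype(float)
            # Move the selected weight vectors toward the input.
            weights += eta * in_neighborhood[:, None] * (x - weights)
    return weights


def assign_clusters(data, weights):
    """Return, for each sample, the index of the closest output neuron."""
    dist = np.sum((data[:, None, :] - weights[None, :, :]) ** 2, axis=2)
    return np.argmin(dist, axis=1)


if __name__ == "__main__":
    # Toy data (not from the paper): two well-separated groups in the plane.
    group_a = np.array([[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.1, 0.2]])
    group_b = group_a + 10.0
    toy = np.vstack([group_a, group_b])
    som_weights = train_som(toy, n_outputs=2, epochs=50)
    print(assign_clusters(toy, som_weights))
```

The binary neighborhood indicator mirrors the unity function mentioned in the text; a smooth (for example Gaussian) neighborhood is a common alternative.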
Neural networks such as the SOM model have been widely applied to real-world problems, such as identifying salinity sources [14], determining plant communities based on bryophytes [15] and diagnosing arthritis [16]. In this article, however, the neural network is used to classify 199 countries with confirmed COVID-19 cases. The world dataset was obtained from the Humanitarian Data Exchange (HDX) [17].
3. Theoretical Background on the Fractal Dimension
Recently, significant advances have been achieved in the study of fractal theory constructs for understanding the geometrical complexity of objects [18]. As an example, time series coming from financial and economic dynamic systems can exhibit a fractal structure [19,20]. In addition, fractal theoretical constructs have found remarkable applications in a plethora of areas, such as medicine, manufacturing, aerospace and control. A well-known definition of the dimension is:
$$d = \lim_{r \to 0} \frac{\ln N(r)}{\ln (1/r)} \qquad (3)$$

where N(r) represents the number of boxes needed to cover a particular object and r represents the box size. The fractal dimension can be approximated numerically by counting the number of boxes that cover the object for different values of r (the size of the box) and then computing a least squares regression to approximate the value of d; this is known as the box counting algorithm. Figure 2 illustrates this algorithm for an arbitrary curve C. For different r values we obtain different numbers of boxes; a regression over these values then gives the box dimension through Equation (4):

$$\ln N(r) = \ln \beta - d \ln r \qquad (4)$$

where d represents the estimate of the fractal dimension and $\ln \beta$ is the intercept of the fitted line; the least squares method approximates these values from a given dataset.
In this paper, a time series can be classified using its fractal dimension (the value of d lies between 1 and 2 because the data are on the plane). The fundamental idea behind this classification method is that the dimension of a smoother object is close to one, whereas the dimension of a rougher object is close to two.
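As an illustration of the box counting algorithm, the following Python sketch counts occupied grid boxes for several box sizes and fits a line to ln N(r) against ln r. It is a simplified example and not the Fractalyse procedure used in this work; the function name `box_counting_dimension`, the choice of box sizes and the assumption that the points are scaled to the unit square are hypothetical.

```python
import numpy as np


def box_counting_dimension(points, box_sizes):
    """Estimate the box dimension of a planar point set.

    points    : array of shape (n_points, 2), assumed scaled to the unit square
    box_sizes : iterable of box side lengths r
    Returns (d, counts), where d is the negated slope of ln N(r) against ln r.
    """
    points = np.asarray(points, dtype=float)
    sizes = np.asarray(list(box_sizes), dtype=float)
    counts = []
    for r in sizes:
        # Map each point to the integer index of the grid box that contains it.
        boxes = np.floor(points / r).astype(int)
        # N(r) is the number of distinct occupied boxes.
        counts.append(len(np.unique(boxes, axis=0)))
    counts = np.array(counts, dtype=float)
    # Least squares fit of ln N(r) = intercept + slope * ln r, with d = -slope.
    slope, intercept = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope, counts


if __name__ == "__main__":
    sizes = [0.5 ** k for k in range(1, 7)]
    # A densely sampled straight line is a smooth object.
    t = np.linspace(0.0, 1.0, 20000, endpoint=False)
    line = np.column_stack([t, t])
    print(box_counting_dimension(line, sizes)[0])
    # A densely filled square is the opposite extreme on the plane.
    g = np.linspace(0.0, 1.0, 200, endpoint=False)
    square = np.array([(a, b) for a in g for b in g])
    print(box_counting_dimension(square, sizes)[0])
```

The two test objects bracket the range discussed in the text: the straight line gives an estimate close to one and the filled square an estimate close to two, so a time series plotted on the plane is expected to fall between them.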
4. Basic Concepts of Fuzzy Logic for Forecasting
A fuzzy rule base can be used as a forecasting model; for this, a suitable partition of the input space has to be made. The partition is needed to discriminate among different objects by their features. To simplify the analysis, without loss of generality, the objects are assumed to be on the plane; in this particular situation they are time series graphs. Fuzzy clustering techniques [21,22] can be used to start grouping the data, and after the clusters are formed, a fuzzy rule base can be constructed that constitutes a forecasting scheme for a particular application.
If we suppose that there are n objects $O_1, O_2, \ldots, O_n$, then fuzzy clustering algorithms may be used to find n pairs $(X_i, Y_i)$, $i = 1, \ldots, n$, that correspond to the n cluster centers. A fuzzy system can then be defined directly:

$$\text{Rule } i: \ \text{IF } X \text{ is } X_i \text{ AND } Y \text{ is } Y_i \ \text{THEN Object is } O_i, \qquad i = 1, \ldots, n \qquad (5)$$
This general scheme of fuzzy rules can be used for pattern recognition or for time series prediction, because both situations are structurally similar. These rules are in the Mamdani form [20] but can also be expressed as a Sugeno fuzzy model [22]. For cases of higher dimensionality, this approach can be extended directly. The most important issue, however, is that the number of rules grows exponentially with the number of input variables. The complete description of the fuzzy system in Equation (5) requires defining the membership functions of the X and Y fuzzy variables and finding the optimal values of their parameters.
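To make the role of the membership functions concrete, the sketch below evaluates a small rule base of the kind described above, assuming one rule per cluster center of the form "IF X is X_i AND Y is Y_i THEN object is O_i". The Gaussian membership functions, the common width `sigma`, the minimum operator for AND, the function names and the numerical centers are all assumptions made for illustration; they are not the membership functions or parameters designed in this work.

```python
import numpy as np


def gaussian_mf(value, center, sigma):
    """Gaussian membership degree of `value` in a fuzzy set centered at `center`."""
    return np.exp(-0.5 * ((value - center) / sigma) ** 2)


def fire_rules(x, y, centers, sigma=0.2):
    """Firing strength of each rule "IF X is X_i AND Y is Y_i THEN object is O_i".

    centers : array of shape (n_rules, 2) holding the cluster centers (X_i, Y_i)
    The AND connective is taken as the minimum, as in Mamdani-type systems.
    """
    centers = np.asarray(centers, dtype=float)
    mu_x = gaussian_mf(x, centers[:, 0], sigma)
    mu_y = gaussian_mf(y, centers[:, 1], sigma)
    return np.minimum(mu_x, mu_y)


def classify(x, y, centers, sigma=0.2):
    """Index of the rule (object) with the largest firing strength."""
    return int(np.argmax(fire_rules(x, y, centers, sigma)))


if __name__ == "__main__":
    # Hypothetical cluster centers, for example found by fuzzy clustering.
    example_centers = [[1.1, 0.2], [1.5, 0.5], [1.9, 0.9]]
    print(fire_rules(1.12, 0.25, example_centers))
    print(classify(1.12, 0.25, example_centers))
```

With two inputs and one rule per cluster the rule base stays small; a grid partition with several fuzzy sets per input is what leads to the rapid growth in the number of rules noted in the text.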
6. Simulation Results
The proposed approach based on unsupervised neural networks was applied to create clusters of countries around the globe. Based on these clusters, the countries were then classified into four classes defined with respect to the emergency levels of COVID-19: very high, high, medium and low (indicated by red, orange, yellow and green colors, respectively).
Table 1 lists the countries ordered according to the number of cases in the clusters and then alphabetically inside each cluster. The results achieved with this approach are presented in the following figures. The details of the implementation are as follows. The fractal dimension was calculated with the Fractalyse 2.4.1 software. The SOM network and the fuzzy logic model were implemented in MATLAB R2018b. The hardware was a personal computer with an Intel Core™ i7-4510 processor at 2.60 GHz and 16 GB of RAM.
A plot of the clusters created with the neural network is presented in Figure 12, which indicates the classes of confirmed COVID-19 cases for the period from 22 January 2020 to 20 January 2021.
A plot of the clusters of recovered cases created with the neural network is shown in Figure 13, which indicates the classes of recovered COVID-19 cases for the period from 22 January 2020 to 20 January 2021.
An analogous analysis can also be made for the spatial distribution of COVID-19 deaths around the globe. A plot of the groups created with the neural network is presented in Figure 14, which indicates the COVID-19 classes for death cases for the period from 22 January 2020 to 20 January 2021. Table 2 and Table 3 show the results for recovered and death cases, respectively.
The time series prediction case was used to illustrate that spatial classification helps to improve the prediction of the COVID-19 time series. The prediction approach is based on the temporal analysis information and also uses the spatial information from the clustering produced by the neural network. In summary, we consider this spatial–temporal approach, which combines the fuzzy fractal part with the self-organizing neural network, to be a good hybridization of methods for improving results. It is important to recall that the class of a country is not a fixed value, as it depends on the complexity evaluation for a specific time window. After the initial control actions, the class value can decrease in the next time window if the control action was the correct one. Based on preliminary experiments and analyses, we have found that in some situations one additional month of data is enough for the proposed method to recognize a change in the class of a country. In other cases, detecting a change could require longer periods, such as two to three months. We consider this an interesting area of future work that we would like to investigate.
A sequence of figures shows forecasting plots produced by the SOM fuzzy fractal approach for some countries over a more recent period. In this case, forecasts 10 days ahead (21 January 2021 to 30 January 2021), based on the data used for designing the fuzzy system (22 January 2020 to 20 January 2021), are presented. Figure 15 illustrates forecasted confirmed cases in Belgium, where it is noticeable that the forecasted values are relatively near to the real values. Figure 16 illustrates forecasted confirmed cases in Italy. The percentage errors for Belgium and Italy are 0.24 and 0.05, respectively. In both cases, the forecasts are extremely near the real values, which supports the claim that the proposed approach deals appropriately with the time series prediction problem. Finally, for the same periods of time, Figure 17 and Figure 18 show the forecasts for the United States of America (USA) and Mexico, respectively. Again, the forecasts are very good, as the predicted values are very near to the real values. The percentage errors for the USA and Mexico are 1.06 and 0.69, respectively.
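The text does not state how the percentage error is computed. One common choice, shown here only as an assumption, is the mean absolute percentage error over the forecast horizon. The function name and the numbers in the usage example are hypothetical and are not data from this study.

```python
import numpy as np


def mean_absolute_percentage_error(actual, forecast):
    """Mean of |actual - forecast| / |actual|, expressed in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(actual - forecast) / np.abs(actual)) * 100.0)


if __name__ == "__main__":
    # Made-up cumulative case counts for a short horizon (illustration only).
    real_values = [1000.0, 1010.0, 1025.0]
    predicted_values = [1002.0, 1008.0, 1030.0]
    print(mean_absolute_percentage_error(real_values, predicted_values))
```

Because this measure divides by the real values, it is only meaningful when those values are nonzero, which holds for cumulative confirmed cases over the period considered.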
In summary, the hybrid approach shows very good results and we plan to test with other periods of time and also with more countries.
7. Discussion of Results
In summary, based on the previous results, the proposed SOM fuzzy fractal approach performs well, as it is able to predict the number of COVID-19 cases for the countries considered in the experiments. The prediction accuracy is not the same for all countries, but this was expected, as their dynamics are different. Most similar works dealing with COVID-19 prediction [5,6] present prediction approaches based on mathematical models, which are very interesting but depend on having a very good mathematical model of the epidemic from which the prediction equations can be derived. By contrast, the approach proposed in this article is mainly based on artificial intelligence techniques, such as fuzzy logic theory, which make it possible to represent expert knowledge on control directly by using fuzzy rules that form a fuzzy model. In this way, the fuzzy system does not depend on a mathematical model, but rather on an intelligent approach based on previous data (collected daily) and knowledge from human experts on the problem. In addition, the fractal dimension is used in this fuzzy model to provide more information about the complexity of the dynamics of COVID-19 data, which helps to improve results. Finally, in this work the fuzzy rules and membership functions were designed through trial and error in combination with knowledge about the problem; we believe that applying an optimization method can improve the results, and we will consider this task in our future work. In particular, metaheuristics from computational intelligence could be used to optimize the fuzzy model for prediction and further improve the results.
8. Conclusions
In this article, we have presented a spatial and temporal study of the dynamics of COVID-19 propagation by applying a special kind of neural network, the self-organizing map, for the spatial analysis of data, and a fuzzy fractal system for modeling the temporal trends of COVID-19 data of the studied countries. Based on the self-organizing neural network, countries that have a similar COVID-19 propagation can be spatially grouped; in this fashion, we are able to analyze which countries have similar behavior and thus may benefit from using similar strategies in controlling the virus propagation. In addition, a fuzzy fractal approach is used for the temporal analysis of time series trends of the studied countries. A hybrid combination of the self-organizing maps and the fuzzy fractal system is then proposed for the efficient forecasting of COVID-19 for the studied countries. Most of the previous articles concerning COVID-19 data have viewed the problem mainly from the temporal aspect, which is certainly important, but we believe that combining both aspects of the problem is relevant for improving the forecasting ability needed in real applications. In conclusion, the most relevant contribution of this article is the use of unsupervised neural networks for clustering similar countries and of the fuzzy fractal approach for forecasting the time series, which can help in the fight against the COVID-19 pandemic and puts forward the idea that strategies for similar countries could be established in accordance with the proposed hybrid combination.
Viewed more generally, our proposal addresses the current gap in the existing knowledge, namely the lack of models that combine both the spatial and temporal aspects of dynamic systems; in particular, the lack of intelligent models that combine the temporal and spatial components of real problems, where fuzzy logic is used to perform this combination. Although the proposed model has been illustrated in this paper with the COVID-19 problem, we believe that it is applicable to other problems as long as they have both temporal and spatial components. We only need data from a problem that can be processed both by the self-organizing map and by the fractal dimension algorithm, and then the proposed hybrid model can provide the prediction results. We envision that the proposed model can also be used for predicting economic or financial time series from countries, or for similar problems. As future work, we may also consider applying other computational intelligence techniques (such as type-2 fuzzy logic, convolutional neural networks, metaheuristic algorithms and swarm intelligence) that may help in dealing with this problem in a more convenient and improved way. We are also planning to develop a user-friendly software implementation of the proposed model, which would require more detailed issues to be considered. Finally, we envision considering other novel approaches, such as the ones outlined in [26,27], and other recent interesting works related to evolutionary or swarm fuzzy models and chaos, as in [28,29,30,31].