Deep Learning Algorithms for Forecasting COVID-19 Cases in Saudi Arabia
Abstract
1. Introduction
2. Related Work
Deep Learning and COVID-19 Forecasting
3. Methods and Data
3.1. Methodology
- (1) Data Collection: The required COVID-19 dataset for Saudi Arabia was provided by the Ministry of Health.
- (2) Pre-Analysis: Exploratory analysis was performed on the data to discover hidden patterns.
- (3) Processing: The data were processed by handling missing values and removing unimportant variables.
- (4) Time Series Extraction: The day, month, and year were extracted from the collected data and stored as separate attributes for the analysis.
- (5) Data Scaling: The data were scaled, which is important for good performance when applying the models.
- (6) Building Models: This is the core step, in which three models are built on three different algorithms: LSTM, CNN, and ARIMA.
- (7) Predicting Outcomes: For each model, we predict the outcomes of different cases in cities.
- (8) Results of Analysis: After applying the models, the results of each algorithm are analyzed separately.
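The scaling step and the reshaping of a daily series into supervised training pairs can be sketched as follows. This is a minimal illustration, not the authors' code: min-max scaling to [0, 1], the window length of 3, and the function names `min_max_scale`, `inverse_scale`, and `make_windows` are all assumptions, and the case counts are made-up numbers.

```python
def min_max_scale(values):
    """Scale a sequence to [0, 1]; also return (lo, hi) so forecasts can be unscaled."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant series
    return [(v - lo) / span for v in values], (lo, hi)


def inverse_scale(scaled, bounds):
    """Map scaled values back to the original units."""
    lo, hi = bounds
    span = (hi - lo) or 1.0
    return [s * span + lo for s in scaled]


def make_windows(series, window):
    """Build (inputs, target) pairs: `window` consecutive past values -> the next value."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets


# Made-up daily counts, for illustration only.
daily_cases = [10, 20, 40, 30, 50, 60]
scaled, bounds = min_max_scale(daily_cases)
X, y = make_windows(scaled, window=3)  # assumed look-back of 3 days
restored = inverse_scale(scaled, bounds)
```

Each row of `X` holds three consecutive scaled days and the matching entry of `y` is the day that follows, which is the input shape a sequence model such as an LSTM or a 1D CNN is typically trained on. Keeping `bounds` allows predictions to be converted back to case counts before error metrics are computed.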
3.2. Research Datasets
- (1) Confirmed cases (newly infected cases).
- (2) Recovered cases.
- (3) Death/mortality cases.
3.3. Data Preprocessing Steps
- Step 1: The dataset must be cleaned before the algorithms are applied; in this step, missing values are filled with zero.
- Step 2: Sort the records by date in ascending order, starting with 1 April 2020.
- Step 3: Extract day, month, and year from the date column for analysis.
- Step 4: Separate weekend days from weekdays and pivot the indicator column so that each case type has its own column, then fill missing values with zero.
- Step 5: Create and prepare a new dataset for each case type, in both daily and cumulative form, for Saudi Arabia.
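The five steps above can be sketched with the Python standard library. This is an illustrative outline under stated assumptions, not the authors' pipeline: the long record layout `(date, indicator, count)`, the indicator labels, the helper name `preprocess`, and the Friday/Saturday weekend definition are hypothetical, and the counts are made-up numbers.

```python
from collections import defaultdict
from datetime import date
from itertools import accumulate

INDICATORS = ("Confirmed", "Recovered", "Deaths")  # assumed indicator labels
WEEKEND = {4, 5}  # Friday, Saturday in date.weekday() numbering (assumed weekend)

# Hypothetical long-format records; None marks a missing value.
records = [
    ("2020-04-03", "Confirmed", 120),
    ("2020-04-01", "Confirmed", 100),
    ("2020-04-01", "Deaths", None),
    ("2020-04-02", "Recovered", 30),
    ("2020-04-02", "Confirmed", 110),
]


def preprocess(records):
    # Steps 1 and 4: pivot indicators into columns, treating missing values as zero.
    by_day = defaultdict(lambda: dict.fromkeys(INDICATORS, 0))
    for iso_day, indicator, count in records:
        by_day[date.fromisoformat(iso_day)][indicator] += count or 0

    rows = []
    for d in sorted(by_day):  # Step 2: ascending date order
        row = {
            "date": d,
            "day": d.day, "month": d.month, "year": d.year,  # Step 3
            "is_weekend": d.weekday() in WEEKEND,             # Step 4
        }
        row.update(by_day[d])
        rows.append(row)

    # Step 5: keep the daily columns and add a cumulative column per case type.
    for name in INDICATORS:
        totals = list(accumulate(r[name] for r in rows))
        for r, total in zip(rows, totals):
            r["cum_" + name] = total
    return rows


rows = preprocess(records)
```

Each returned row carries the daily counts, the calendar attributes, a weekend flag, and running totals, so either the daily or the cumulative series can be passed on to the models.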
3.4. Selection of the Deep Learning Algorithms
- (1) Long short-term memory (LSTM)
- (2) Convolutional neural network (CNN)
- (3) Autoregressive integrated moving average (ARIMA)
3.5. Performance Evaluation Metrics
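As a rough illustration of the error measures reported in this study (RMSE, nRMSE, MAE, nMAE, and MAPE), the sketch below computes them with their standard definitions, together with the coefficient of determination R². Several points are assumptions rather than statements from the paper: the normalized variants divide by the range of the observed values and are expressed in percent, days with zero observed cases are skipped in MAPE, R² is assumed to be the goodness-of-fit score reported alongside the errors, and the function name `forecast_metrics` is hypothetical.

```python
import math


def forecast_metrics(actual, predicted):
    """Return common forecast error metrics for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined where the observed value is zero; skip those days (assumption).
    nonzero = [(a, e) for a, e in zip(actual, errors) if a != 0]
    mape = 100 * sum(abs(e / a) for a, e in nonzero) / len(nonzero)

    # Assumed normalizer: range of the observed values, reported in percent.
    value_range = max(actual) - min(actual)

    mean = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean) ** 2 for a in actual)

    return {
        "RMSE": rmse,
        "nRMSE": 100 * rmse / value_range,
        "MAE": mae,
        "nMAE": 100 * mae / value_range,
        "MAPE": mape,
        "R2": 1 - ss_res / ss_tot,
    }


# Made-up observations and forecasts, for illustration only.
metrics = forecast_metrics([100, 200, 300, 400], [110, 190, 310, 390])
```

The sketch assumes a non-constant observed series with at least one nonzero value; otherwise the normalizers or the R² denominator would be zero and a guard would be needed.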
3.6. COVID-19 Assumptions
4. Implementation and Discussion of Results
4.1. Experimental Setup
4.2. LSTM Deep Learning (Setup and Training)
4.2.1. Selecting the Nodes, Layers, and Hyper-Parameters of LSTM
4.2.2. Building, Training, and Testing the LSTM Model
4.2.3. Prediction of Future Confirmed, Death, and Recovered Cases
4.3. Selecting the ARIMA Parameters
4.3.1. Building, Training, and Testing the ARIMA Model
4.3.2. Prediction of Future Confirmed, Death, and Recovered Cases
4.4. Selecting the CNN Parameters
4.4.1. Building, Training, and Testing the CNN Model
4.4.2. Prediction of Future Confirmed, Death, and Recovered Cases
4.5. Comparative Analysis
5. Conclusions
- The best model to predict the confirmed cases is LSTM, which has the best RMSE and R² values. Still, CNN performs comparably to LSTM.
- The best model to predict death cases is LSTM, with the best RMSE and R² values of the three models; ARIMA attains slightly lower MAE and MAPE.
- The best model to predict the recovered cases is CNN, with better RMSE, MAE, and R² values than the other two models, although its MAPE is the highest of the three.
- The most difficult cases to predict are the recovered cases, for which every algorithm records its highest MAPE.
- LSTM unexpectedly performed badly when predicting the recovered cases, with RMSE and R² values of 641.32 and 0.3134, respectively.
- There is only a slight difference between ARIMA and LSTM in predicting death cases: ARIMA has MAE and MAPE values of 2.25 and 16.27%, respectively, against 2.34 and 16.9% for LSTM.
- To sum up, LSTM has a better predictive performance for the confirmed and death cases, while CNN has a better performance in predicting the recovered cases.
6. Future Work
- Investigating other advanced deep learning and machine learning algorithms and comparing their performance to the techniques used in this research.
- Building city-wide forecasting models to predict the spread of COVID-19 cases in major cities in Saudi Arabia, such as Riyadh and Mecca.
- Considering other feature selection methods to determine the optimal combinations of features and avoid overfitting and underfitting, which in turn improves the generalization of the models.
- Enriching the datasets through feature extraction and engineering to find more relevant features that lead to more accurate forecasts.
- Avoiding manual selection of the hyperparameters of the DL algorithms by using advanced optimization techniques to automatically search for their optimal values.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Model | Predicted Variable | RMSE | nRMSE | MAE | nMAE | MAPE (%) | R² |
---|---|---|---|---|---|---|---|
LSTM | Confirmed | 196.58 | 4.64 | 104.78 | 2.47 | 11.65 | 0.96 |
LSTM | Deaths | 3.29 | 6.33 | 2.34 | 4.51 | 16.9 | 0.94 |
LSTM | Recovered | 641.3 | 13.6 | 312.8 | 6.64 | 32.7 | 0.313 |

Model | Predicted Variable | RMSE | nRMSE | MAE | nMAE | MAPE (%) | R² |
---|---|---|---|---|---|---|---|
ARIMA | Confirmed | 578.1 | 13.7 | 258.6 | 6.1 | 21.4 | 0.15 |
ARIMA | Deaths | 3.6 | 7.0 | 2.25 | 4.33 | 16.27 | 0.88 |
ARIMA | Recovered | 473.48 | 10.05 | 248.62 | 5.28 | 34.98 | 0.39 |

Model | Predicted Variable | RMSE | nRMSE | MAE | nMAE | MAPE (%) | R² |
---|---|---|---|---|---|---|---|
CNN | Confirmed | 200.62 | 4.74 | 97.58 | 2.31 | 10.03 | 0.95 |
CNN | Deaths | 4.41 | 8.48 | 2.75 | 5.28 | 17.62 | 0.90 |
CNN | Recovered | 426.29 | 9.05 | 213.26 | 4.53 | 38.86 | 0.67 |

Model | Predicted Variable | RMSE | nRMSE | MAE | nMAE | MAPE (%) | R² |
---|---|---|---|---|---|---|---|
ARIMA | Confirmed | 578.1 | 13.7 | 258.6 | 6.1 | 21.4 | 0.15 |
ARIMA | Deaths | 3.6 | 7.0 | 2.25 | 4.33 | 16.27 | 0.88 |
ARIMA | Recovered | 473.48 | 10.05 | 248.62 | 5.28 | 34.98 | 0.39 |
LSTM | Confirmed | 196.58 | 4.64 | 104.78 | 2.47 | 11.65 | 0.96 |
LSTM | Deaths | 3.29 | 6.33 | 2.34 | 4.51 | 16.9 | 0.94 |
LSTM | Recovered | 641.3 | 13.6 | 312.8 | 6.64 | 32.7 | 0.31 |
CNN | Confirmed | 200.62 | 4.74 | 97.58 | 2.31 | 10.03 | 0.95 |
CNN | Deaths | 4.41 | 8.48 | 2.75 | 5.28 | 17.62 | 0.90 |
CNN | Recovered | 426.29 | 9.05 | 213.26 | 4.53 | 38.86 | 0.67 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Al-Rashedi, A.; Al-Hagery, M.A. Deep Learning Algorithms for Forecasting COVID-19 Cases in Saudi Arabia. Appl. Sci. 2023, 13, 1816. https://doi.org/10.3390/app13031816