1. Introduction
In December 2019, the first case of coronavirus disease 2019 (COVID-19) was reported in Wuhan, China. Since then, the virus has affected millions of people, with almost 630 million cases and 6.5 million deaths reported worldwide [1]. The most common symptoms of COVID-19 are fever, cough, fatigue, headache, dizziness, sputum and dyspnea. Some patients sustained further damage to their respiratory system; specifically, lesions were detected in the lower lobes of both lungs. Severe cases of COVID-19 can result in acute respiratory distress syndrome or complete respiratory failure [2].
Given the severity of COVID-19, reliable and swift diagnosis is extremely important. Numerous methods have been used for the detection of COVID-19. The primary method is reverse-transcription polymerase chain reaction (RT-PCR) [3]. These tests suffer from high false-positive or false-negative rates due to sample contamination, virus mutations or user error during sample extraction [4]. As a result, several studies [5,6] suggested using computed tomography (CT) scans for diagnosis, since they showed higher accuracy. It was also shown that the majority of COVID-19 cases share similar radiographic features, such as bilateral abnormalities and multifocal ground-glass opacities, mostly in the lower lung lobes during the early stages of the disease, with pulmonary consolidation observed in the final stages [7]. However, compared to CT scans, chest X-rays are cheaper and faster to produce; furthermore, X-ray imaging is widely accessible and exposes the body to less radiation during the procedure [8]. Chest X-rays are already used as a diagnostic tool for COVID-19 [9]. There are also some concerns regarding the radiation exposure of patients during COVID-19 screening. Reducing the radiation dose, however, lowers the image quality, introducing noise and artifacts into the produced images and compromising the diagnosis. In [10], the authors used U-Net-based discriminators in a GAN framework, which enabled it to learn both global and local differences between the denoised and normal-dose images. Results on simulated and real-world datasets showed excellent performance in denoising low-dose CT (LDCT) images, which in turn enables safer patient screening. In a different study [11], the authors applied neural architecture search (NAS) to LDCT and proposed a multi-scale and multi-level memory-efficient NAS for LDCT denoising. Their proposed method showed better results using fewer parameters than other state-of-the-art methods.
Machine learning has grown immensely over the past few years. In medicine specifically, it is used for various tasks, such as the classification of cardiovascular diseases, diabetic retinopathy and others [12,13,14]. The revolutionary performance of the convolutional neural network (CNN) has enabled medical experts to use it for many tasks, such as the diagnosis of skin lesions and the detection of brain tumors and breast cancer [15,16,17].
Applying deep learning models to chest X-ray (CXR) images has proven beneficial, with various researchers reporting promising results in the diagnosis of pulmonary diseases, including COVID-19 pneumonia. Notably, Rajpurkar et al. [18] developed a new CNN architecture called CheXNet, based on DenseNet121, for the classification of 14 different pulmonary diseases by training it on over 100,000 X-ray images. They reported that their method exceeds average radiologist performance on the F1 metric. Similarly, in [19] the authors proposed a method for the automatic detection of COVID-19 pneumonia from CXR images using pre-trained convolutional neural networks, reaching an accuracy of approximately 99%. In addition, Keidar et al. [20] proposed a deep learning model for the detection of COVID-19 from CXR images and for clustering images similar to the model’s result. Lastly, in [21] a method for the detection of COVID-19 is proposed that uses various deep learning models together with a support vector machine (SVM) as the classifier.
Additional studies also proposed methods for the automatic diagnosis of COVID-19 from CXR images using deep learning [22,23,24,25]. Their methods showed high performance in detecting COVID-19; however, they have a few flaws. Foremost, all the mentioned studies had a limited number of COVID-19 CXR images. This can affect the training and evaluation performance of these methods, resulting in poor generalization to future data. In addition, they did not use external, unseen data to evaluate their methods.
The goal of this study is the comparative evaluation of deep learning methods for COVID-19 CXR image classification and of their potential to be used as decision-making tools for COVID-19 diagnosis. Our analysis is performed using five deep learning models covering various state-of-the-art architectures. We also applied all models to the largest dataset available (at the time of writing and to the best of our knowledge) [26].
4. Discussion
There is no doubt that COVID-19 has affected millions of people worldwide, jeopardizing their health while pushing health care services to their limits. Fast and accurate identification of positive COVID-19 cases is essential for preventing the spread of the virus. CXR imaging is widely available at low cost and produces results quickly compared to more commonly used methods, such as RT-PCR tests and CT scans. Furthermore, LDCT scans can be used for patient screening, since recent methods have been developed that successfully denoise the produced images.
Numerous studies on COVID-19 identification from CXR images using deep learning methods have shown excellent results. However, some of them used limited data for training and evaluation. A model trained on insufficient data will probably not generalize well to new, unseen data, which limits its usefulness in a clinical setting. In this study, a system is proposed for the automatic detection and diagnosis of COVID-19 from CXR images using deep learning methods. To achieve this, the largest COVID-19 CXR dataset was used to train and evaluate five different deep learning models on COVID-19 identification.
The methods proposed in this study achieved high performance in COVID-19 identification, as shown in Table 8, attaining Precision and Recall scores of at least 93%. The best performer was ResNet101, achieving 96% across all metrics.
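For reference, Precision, Recall and the F1 score for a single class can be derived from confusion-matrix counts. The following is a minimal sketch of that arithmetic; the function name and the example counts are hypothetical and are not taken from Table 8.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) for one class from raw counts.

    tp: true positives, fp: false positives, fn: false negatives.
    """
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of actual positives that the model identified.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for the positive class (illustrative only).
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

With these illustrative counts, Precision is 90/100 and Recall is 90/120, so a model can score well on one metric while lagging on the other, which is why both are reported.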
Going forward, the plan for this study is to apply lung segmentation and localization to CXR images to increase the classification accuracy of the system, and to test an ensemble model, which may make it more robust and help it generalize even better to new CXR images. Another goal is to test the system against professional radiologists and see how well it performs. Collaborating with professional radiologists would also yield valuable feedback on the usability of the system in a clinical environment as a decision-making tool.
It is worth mentioning that ensemble models can be a powerful tool for improving the performance of deep learning algorithms [44]. However, the scope of our work was to highlight the significance and potential of individual deep learning models, rather than to focus specifically on ensemble techniques. Therefore, we decided to evaluate each model separately and to present their results in a comparable manner. We believe that this approach allows us to gain a better understanding of the strengths and limitations of each model and to provide insights into their potential for improving the accuracy and efficiency of COVID-19 CXR image analysis.
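To illustrate the ensemble idea mentioned above, one common scheme is soft voting, in which the class probabilities predicted by several models are averaged before the most likely class is chosen. The sketch below shows how such an ensemble could be assembled under that assumption; it is not part of the evaluated system, and the class labels and probability values are hypothetical.

```python
def soft_vote(prob_lists):
    """Average per-class probabilities from several models for one image.

    prob_lists: one list of class probabilities per model.
    Returns (index of the winning class, averaged probabilities).
    """
    if not prob_lists:
        raise ValueError("at least one model output is required")
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    if any(len(p) != n_classes for p in prob_lists):
        raise ValueError("all models must output the same number of classes")
    # Element-wise mean of the probabilities across models.
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    # Pick the class with the highest averaged probability.
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg


# Hypothetical class order and hypothetical softmax outputs of three models.
CLASSES = ["COVID-19", "non-COVID"]
outputs = [
    [0.60, 0.40],
    [0.40, 0.60],
    [0.70, 0.30],
]
label_idx, avg = soft_vote(outputs)
print(CLASSES[label_idx], [round(a, 2) for a in avg])
```

In this example the second model disagrees with the other two, but the averaged probabilities still favor the first class, which is the kind of robustness an ensemble is meant to provide.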