1. Introduction
Each year, thousands of people suffer from lung diseases and eventually die from them; these diseases include lung opacity (LO), pneumonia, COVID-19, tuberculosis (TB), bacterial pneumonia (BP), and viral pneumonia (VP) [
1]. This number is anticipated to rise every year [
2]. According to the WHO, the three diseases that kill the most people worldwide are COVID-19, pneumonia, and TB [
3]. There are 450 million afflicted individuals, and a disproportionate number of cases involve minors (657 out of 1000). Furthermore, the rapid growth in the number of COVID-19 patients has put massive stress on the worldwide health care system. COVID-19 has been a terrible pandemic, and TB, LO, and pneumonia all pose a serious risk of death [
4,
5]. Therefore, the prompt and correct diagnosis of these disorders is essential for providing effective care and sparing lives [
6,
7].
On chest radiographs, which are frequently used by radiologists, an LO basically refers to a white area of unknown significance. Because the lungs normally appear dark on a CRI, anything that prevents X-rays from passing through appears white. Therefore, a white area within the normally dark lungs could represent cancer, an infection, hemorrhage, fluid, or a foreign substance, among other things. The radiologist who reads the CRI makes an effort to provide an accurate and specific diagnosis using the available medical data, such as symptoms (coughing and fever), previous investigations, and laboratory results. So, if a patient visits the emergency room with a cough and fever, pneumonia will probably be found as an opacity on a CRI. A more rounded opacity in a long-term smoker is much more likely to be cancerous. White opacities in both lungs of a person with heart failure are most likely caused by edema, or fluid in the lungs.
Globally, pneumonia is thought to be the leading cause of child fatalities. Pneumonia is a lung infection that can be brought on by either bacteria or viruses. Fortunately, antibiotics and antiviral medications work well in treating this bacterial or viral infectious condition. However, quicker identification of viral or bacterial pneumonia and the subsequent administration of the appropriate medication can considerably help prevent a patient’s health from deteriorating, which ultimately results in mortality [
8]. Different kinds of pneumonia have been identified using CRIs, CT scans, and complete blood count (CBC) tests. Another kind of pneumonia, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is COVID-19. COVID-19, which now ranks as the largest pandemic in history, causes acute respiratory infections in humans. The virus initially infected people in Wuhan, China, in December 2019 [
9]. Due to its rapid spread, COVID-19 has been fatal to many people. According to the WHO, there have been 761,402,282 confirmed COVID-19 cases reported globally to date, with 6,887,000 fatalities [
10]. According to the WHO, 268,252,496, 184,161,028, 60,719,433, and 9,431,508 COVID-19 cases have been reported in the Americas, Europe, Africa, and Southeast Asia, respectively. The Pakistani government reports that 1,518,083 COVID-19 cases have been documented there, with 30,304 deaths and 1,469,930 recoveries [
11]. All over the world, COVID-19 is typically detected with antibody and PCR (polymerase chain reaction) tests. These identification techniques are laborious and inefficient, and it takes a while to receive results. So, with the aid of a doctor, chest radiology procedures such as CT scans and CRIs are performed to obtain outcomes more quickly. The signs of both illnesses include sneezing, coughing, fever, shortness of breath, and fatigue.
Furthermore, more than a million individuals lose their lives to TB each year because it is a serious infection that primarily attacks the lungs. With timely detection and proper differentiation from other conditions with comparable radiologic features, TB can be treated to lessen the disease burden. With approximately 1.8 million deaths and 10.4 million new cases worldwide every year, TB is the second most frequent cause of death from infectious disease after human immunodeficiency virus (HIV), according to the WHO. Many underdeveloped countries are witnessing an increase in TB cases. Although both women and men can be affected, it seems to affect men more often. A lengthy course of antibiotic therapy and treatment is provided to patients with active TB [
12]. Chest radiography has been recommended by the WHO and other organizations as an efficient approach for effective case finding and screening examinations for the identification of TB.
All of the aforementioned illnesses share indications such as cough, sneezing, fever, shortness of breath, and exhaustion. These lung disorders are categorized and identified using CBC tests, RT-PCR, ultrasounds, and TST tests. These tests can take longer and still miss 20% of cases, because the RT-PCR test has only an 80% sensitivity. After 24 h, a CT scan and a CRI are conducted in order to effectively control the false negatives in both asymptomatic and symptomatic individuals. A big issue with CT scans and chest radiographs, though, is the potential for COVID-19, pneumonia, LO, and TB diagnoses to be made at the same time. Moreover, manual tests are time-consuming and costly. To solve this, we require an efficient method that quickly and accurately categorizes CRIs employing trained convolutional neural networks (CNNs). Because they are less expensive, provide a clear view of the air sacs, and are processed more quickly than CT scans, CRIs are utilized frequently.
According to recent studies, DL-based artificial intelligence (AI) approaches can accurately diagnose a variety of disorders using CRIs with a level of precision comparable to that of experienced radiologists [
13,
14]. In resource-constrained situations where qualified radiologists are not easily accessible, these computer-aided detection (CAD) systems can reduce inter-reader variability and improve practitioners’ CRI interpretation accuracy [
15]. Similar to this, it has been shown that CAD approaches based on DL or conventional machine learning (ML), which might be utilized in clinical settings, can reliably categorize COVID-19 and other lung infections on chest radiographs [
16,
17]. Many novel DL architectures have been created by researchers to identify different diseases using CRIs. For identifying COVID-19 using CRIs, the authors of [
18] presented a novel COVID-Net DL framework and COVIDx (an easily accessible COVID-19 dataset). COVID-Net categorizes CRIs into one of three categories. The framework consisted of two phases of projections, expansions, depthwise representations, and extensions, all relying on lightweight residual projection–expansion–projection–extension process models. The innovative TB-Net, a self-attention DL network for TB detection employing CRIs, was developed by the researchers in [
19]. TB-Net is a highly specialized DL network containing attention condensers. They also tested TB-Net’s decision-making abilities using an explainability-driven effectiveness verification process. In [
20], the authors created a COVID-CXDNetV2 model for pneumonia and COVID-19 identification using CRIs. The model was based on ResNets and YOLOv2 architectures. Furthermore, in order to select the inputs (images or clinical data) and objectives of the system that could help obtain a trustworthy DL-based tool for difficulties related to COVID-19, the most pertinent and current medical studies and articles were examined in [
21]. DL approaches, in contrast to traditional ML methods, use unstructured data, automatically extract robust traits, and generate reliable outcomes. DL has several advantages over ML and other categorization techniques: accuracy can increase with a bigger dataset; evaluation and classification are quick and effective; and features do not have to be chosen and extracted manually, since the CNN handles this for unstructured, unlabeled datasets. DL therefore makes it simple to build frameworks that create more precise outcomes in identifying and predicting particular lung diseases using chest radiographs.
To the best of our understanding, current lung disease classification research has some drawbacks: most previous studies utilized small datasets or used the images of only one dataset, which limits the generalization ability of the models. When less training data are available, DL-based models are not fully generalizable, and the chances of overfitting are high. The vast majority of research uses transfer learning (TL) and conventional ML methods to detect lung diseases. Yet, the major issue with conventional ML (such as support vector machines, or SVMs) is the extended training time for large datasets. The most significant restrictions in TL systems, however, are overfitting and negative transfer. One of their drawbacks is that pre-trained classification systems are frequently honed using the ImageNet database, which contains images unrelated to medical imagery. Moreover, pre-trained TL models require a lot of computing effort. It is still difficult to set up an effective CAD system to promptly and effectively diagnose lung illness using chest radiographs. Furthermore, numerous researchers have suggested categorizing COVID-19, different kinds of pneumonia, TB, and normal CRIs. To the best of our knowledge, there is no single model and single dataset for lung disease classification into COVID-19, LO, pneumonia (or VP and BP), TB, and images of healthy individuals. This motivates the development of an automatic and reliable model for the classification of various lung illnesses.
The DeepLungNet DL model is suggested as a solution to these constraints. It makes use of filter-based feature extraction, which can aid in obtaining an excellent classification performance. DeepLungNet extracts features hierarchically and is capable of end-to-end learning, in contrast to traditional methods for feature extraction and selection that demand specialized knowledge. The convolutional layers and Leaky ReLU (LReLU) activation functions utilized to create the proposed framework extract the most important and in-depth features from the CRIs. The framework reduces the number of weight parameters by using a max-pooling procedure. We added batch normalization (BN) operations, convolutional layers, group convolutional layers, a squeeze ConV layer with numerous 1 × 1 filters, and a combination of 1 × 1 and 3 × 3 ConV layers (expand layer) to make the suggested model a novel lung disease classification technique. Our approach is cost-effective and less time consuming compared to traditional lung disease detection and classification approaches. Additionally, the proposed architecture was verified using common, publicly available Kaggle datasets. Finally, the performance of our framework was compared with hybrid approaches (DL-based model plus SVM). To further show the model’s usefulness, the suggested model was evaluated on a different publicly accessible dataset from the agriculture domain. According to the results, the proposed structure performs admirably in test accuracy for lung disease classification. The following is a summary of the study’s primary contributions:
For the purpose of lung disease classification utilizing chest radiographs, an effective DeepLungNet model is proposed.
CRIs are classified into five classes: TB, pneumonia, COVID-19, LO, and normal.
CRIs are classified into six classes: VP, BP, COVID-19, normal, TB, and LO.
To improve the model’s performance, demonstrate the model’s generalizability, and prevent the overfitting issue, data augmentation is used.
To determine the effectiveness of the DeepLungNet framework, we used hybrid methodologies to assess the classification performance of the presented approach in the same experimental setting and on the same dataset. For this goal, we employed a range of classification criteria, including precision, F1-score, recall, and accuracy.
The proposed framework is validated on another publicly accessible dataset from the agriculture domain to prove the generalization ability and usefulness of the framework.
The remainder of this article is structured as follows:
Section 2 provides details about related work,
Section 3 describes the method used,
Section 4 describes the experiments and model’s results,
Section 5 presents a discussion, and conclusions are drawn in
Section 6.
2. Related Work
In the majority of nations, CRIs are routinely employed as a feasible choice for the identification of COVID-19 and other lung diseases. However, detecting COVID-19 is a challenging task that requires the clinical imaging of individuals. Lung cancer (LC) is one of the main causes of death, and a prompt diagnosis may increase survival. Image processing and ML have demonstrated significant potential for the analysis of pulmonary illnesses. A number of hybrid, ML, and DL methods have been published in the past to detect and categorize lung disorders, an issue that is still being studied and deserves more attention. In-depth analyses of the DL approaches for LO, TB, COVID-19, VP, pneumonia, and BP are included in this section.
The authors of [
22] used an SVM and multi-level thresholding for COVID-19 detection. After examining the patients’ CRIs, the authors enhanced the contrast of the input CRIs by employing a median filter. A multi-level image segmentation threshold was then applied utilizing the Otsu objective function, and the SVM was employed to distinguish between lungs with an infection and lungs without an infection. In [
23], the author presented a method based on an autoregressive integrated moving average (ARIMA) and a least-squares SVM (LS-SVM) to identify or detect COVID-19. The five countries with the highest number of confirmed COVID-19 patients are Italy, the United States, Spain, France, and the United Kingdom. The method used the verified cases as an input to forecast the disease’s transmission one month in advance. LS-SVM surpassed ARIMA in accuracy. A novel COVID-19 detection procedure built on a self-organizing map (SOM) and locality-weighted learning was proposed by the authors in [
24] (LWL-SOM). They utilized the SOM technique to group the CRIs into clusters on the basis of shared features in order to differentiate between healthy and COVID-19 patients. Furthermore, the LWL technique was utilized to develop a framework for recognizing COVID-19. The recommended framework enhanced the correlation coefficients between normal and COVID-19 cases and between pneumonia and COVID-19 cases. However, existing ML-based techniques that utilize AI assessment measures to differentiate between normal and COVID-19 patients outperform the suggested framework.
Unfortunately, standard ML approaches underperform DL approaches since they rely heavily on human feature extraction and precise feature selection. DL approaches extract more robust deep features, make use of unstructured data, and produce more accurate outcomes compared to traditional ML algorithms. Nowadays, it has become standard procedure to automatically extract classification features using DL algorithms. Classifiers built on DL can be used to fully and automatically detect COVID-19 from CRIs.
For the categorization of CRIs, in [
25], the authors proposed a DL framework with nine layers. The two-class classification of three illness categories, i.e., TB, pneumonia, and COVID-19, was accomplished by means of six diverse datasets of publicly accessible CRIs, employing a DL framework that was completely trained from scratch. In [
26], the authors trained a DL model with 6587 CRIs using stochastic gradient descent. The model successfully classified CRIs into four classes (normal, TB, pneumonia, and COVID-19) using 128 × 128 CRIs. In [
27], the authors developed TL with VGG16 for TB diagnosis on CRIs. They refined the model using 1324 CRIs, and it produced satisfactory classification results for TB and healthy CRIs. In [
28], the authors used a pre-trained DCNN-based Inception-V3 framework with TL. The collected dataset had 3532 CRIs in total, each of which was enhanced and scaled to 299 × 299. However, the study did not categorize TB in CRIs. In order to categorize CRIs, the authors of [
29] combined VGG16 and an attention mechanism. The technique, used to classify CRIs into COVID-19, normal, no findings, BP, and VP, achieved a good classification performance on three CRI datasets.
Similar to this, in [
30], the authors compared seven different popular DL neural network topologies. The small dataset employed in the study consisted of 50 CRIs: 25 from COVID-19 patients and 25 from healthy patients. Only the classifier was trained using radiographs; the rest of each model was pre-trained using the ImageNet database, a broad image dataset comprising about 14 million images of diverse types. The best-performing design in their tests was VGG19. A similar approach was used to offer a modification of the VGG model that incorporates the convolutional COVID-19 block in [
31]. The framework was assessed utilizing a diverse dataset consisting of 1887 images from 2 distinct publicly accessible datasets. Three categories of images were used: normal (654 images), pneumonia (864 images), and COVID-19 (300 images). In [
32], many chest X-ray images from diverse sources were combined to form one of the main freely accessible collections of CRIs. Then, COVID-CXNet was created by the authors of [
32] utilizing the TL approach and the well-known CheXNet model. This reliable model was able to recognize novel COVID-19 pneumonia based on important and relevant features with accurate localization. In [
33], the authors classified CRIs as belonging to COVID-19 patients, healthy people, or VP patients using eleven CNN models. They considered three possible approaches to improving the COVID-19 identification designs by including extra layers. The models under examination were all well-known frameworks that have proved to be effective in applications for image recognition and detection. Using a COVID-19 radiography database, the recommended techniques for each explored design were assessed, with the Xception and EfficientNetB4 models producing the best performance results. Moreover, the authors of [
34] proposed a CNN-based architecture for COVID-19 detection from CRIs, increasing the test’s efficacy and reliability. The suggested method combines a custom model with a TL approach to increase accuracy. Several pre-trained DL networks, including MobileNetV2, InceptionV3, VGG16, and ResNet50, were used to extract features. The performance indicator in this study was classification accuracy. The results of this research demonstrate that DL can identify COVID-19 in CRIs. InceptionV3 attained the highest accuracy compared to the other TL methods.
Previous studies have also used hybrid approaches, which integrate both DL- and ML-based procedures, in addition to ML approaches and DL models. In [
35], the authors used a hybrid technique (an SVM with a deep-feature-based approach) to identify patients infected with COVID-19 from CRIs. An SVM is utilized for classification instead of a DL-based classifier since DL models require a sizable amount of training data. For COVID-19 categorization and classification, deep features from the fully connected (FC) layers of DL models are gathered and input into the SVM. Pneumonia, normal, and COVID-19 CRIs from distinct data sources were employed in the technique. The method helps doctors differentiate among normal, pneumonia, and COVID-19 cases. The features of 13 DL frameworks were used to assess the SVM algorithm’s COVID-19 identification performance. ResNet50 with SVM attained the highest classification performance. Furthermore, in [
36], the authors used CRI data to train CNN frameworks as feature extractors and an SVM as the classification algorithm to assess whether individuals were healthy, had pneumonia, or were suffering from COVID-19. The tests compared various classes, feature extraction frameworks, feature selection algorithms, and kernels. To discriminate among the three groups of pneumonia, COVID-19, and normal, the investigators employed the ResNet50, ResNet18, ResNet101, and GoogLeNet TL methods, with which they achieved the highest average accuracy.
The previous works leave considerable room for improvement. According to the aforementioned literature review, different ML, DL, and hybrid techniques have been used to classify various lung illnesses based on CRIs. However, existing approaches are unable to classify lung diseases into TB, VP, pneumonia, BP, COVID-19, and LO. Additionally, to evaluate the generalizability and robustness of models, we need to train and test models on multiple datasets or on datasets containing images drawn from multiple sources. The majority of studies employed only one dataset for model performance validation. This paper proposes a DeepLungNet model which is trained on images from multiple datasets to verify the robustness of the model. This study’s main objective is to detect multiple lung diseases using a single model with adequate accuracy while minimizing false positives. Analysis of the data reveals that the suggested system for lung disease classification is useful and reliable.
3. Methodology
The application of DL approaches has already had a significant positive impact on image processing and, more specifically, medical imaging. In this study, we propose the DeepLungNet DL framework for lung disease classification using chest radiographs. Using our integrated dataset, we categorize chest radiographs into the following five groups: TB, normal, LO, COVID-19, and pneumonia.
Figure 1 depicts an abstract representation of the suggested strategy. To put the suggested technique into practice, we provided the model with chest radiograph images. The input images in the datasets came in a variety of sizes, so we used pre-processing to resize them to 224 × 224 pixels in order to ensure homogeneity and speed up the procedure. To categorize the CRIs into the five target classes, a DeepLungNet architecture with only 20 learned layers was created. For each experiment, independent subsets were used for testing and training: we utilized 80% of the dataset for training and 20% for testing. The two datasets were then used to evaluate the proposed model.
3.1. Data Pre-Processing
3.1.1. Data Augmentation
One of the issues when attempting to use DL frameworks for medical imaging detection and classification tasks is the lack of suitable (balanced) data to train the frameworks. It is necessary to collect more medical imaging data, yet doing so requires a large amount of time and money. By applying data augmentation strategies to the pre-existing data, without gathering any new medical imaging data, we boosted the amount of data available and overcame the class imbalance issue. The radiograph scans in the dataset were randomly rotated at an arbitrary angle between −20 and 20 degrees and shifted up to 30 pixels in both the vertical and horizontal directions. To create new images, we also scaled the existing images at random by a factor between 0.9 and 1.1. It is worth keeping in mind that, in each training session, the imageDataAugmenter function continuously produces new sets of augmented images. By dramatically increasing the dataset’s image count, we were able to train our deep learning framework with more training images and improve its performance.
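As a brief illustration, the augmentation described above could be configured with MATLAB’s imageDataAugmenter using the stated rotation, translation, and scaling ranges (the variable name is ours, and the exact settings used in the study may differ):

```matlab
% Augmentation settings matching the ranges described above:
% rotation of -20 to 20 degrees, shifts of up to 30 pixels, and scaling of 0.9-1.1.
augmenter = imageDataAugmenter( ...
    'RandRotation',     [-20 20], ...    % random rotation (degrees)
    'RandXTranslation', [-30 30], ...    % horizontal shift (pixels)
    'RandYTranslation', [-30 30], ...    % vertical shift (pixels)
    'RandXScale',       [0.9 1.1], ...   % random horizontal scaling
    'RandYScale',       [0.9 1.1]);      % random vertical scaling
```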
3.1.2. Image Resizing
The input CRIs in the datasets come in a variety of dimensions. To ensure homogeneity and improve the processing speed, we pre-processed the radiographs by scaling them to 224 × 224 pixels in accordance with the input image size required by our model.
3.2. Dataset Partitioning
The CRIs were separated into testing and training groups for each experiment. More exactly, the framework was trained on 80% of the dataset and then tested on the remaining 20%.
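As an illustration, and assuming the CRIs are organized in one folder per class (a hypothetical layout), the 80/20 split and the 224 × 224 resizing from Section 3.1.2 could be wired to the augmenter defined above as follows:

```matlab
% Hypothetical folder layout: one subfolder per class (e.g., COVID-19, LO, Normal, Pneumonia, TB).
imds = imageDatastore('chest_xray_dataset', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');

% 80% of each class for training, the remaining 20% for testing.
[imdsTrain, imdsTest] = splitEachLabel(imds, 0.8, 'randomized');

% Resize every radiograph to the 224 x 224 model input size; grayscale scans are
% replicated to three channels, and augmentation is applied to the training set only.
augTrain = augmentedImageDatastore([224 224 3], imdsTrain, ...
    'DataAugmentation', augmenter, 'ColorPreprocessing', 'gray2rgb');
augTest  = augmentedImageDatastore([224 224 3], imdsTest, ...
    'ColorPreprocessing', 'gray2rgb');
```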
3.3. DeepLungNet Architecture Details
In this study, we propose the DeepLungNet framework for lung illness classification. Only 20 learned layers, i.e., 18 convolutional layers and 2 FC layers, make up the DeepLungNet model. In total, there are 64 layers in our architecture: 1 for the image input, 16 for convolutions, 2 for group convolutions, 18 for batch normalization (BN), 19 for leaky ReLU (LReLU), 1 for maximum pooling, 2 for fully connected layers, 1 for average pooling, 1 for dropout, 1 for softmax, and 1 for classification. An LReLU activation function comes after each ConV and group convolutional layer.
Table 1 displays the DeepLungNet model’s architecture. In the DeepLungNet model, the first (input) layer is the top (initial) layer. Its size is equivalent to the size of the input features, and it contains I × J elements. For processing, our framework takes input images with a 224 × 224-pixel size. ConV layers with kernel sizes of 7 × 7, 3 × 3, and 1 × 1 are used, which perform ConV operations to create feature maps. The first ConV layer extracts features from the CRIs (of size 224 × 224) by using 64 filters of size 7 × 7 with a stride of 2 × 2 and padding of 3 × 3. Following the application of the convolutions and kernels, the output of the ConV layers (the feature map) can be derived using Equation (1). Equation (1) represents the ConV operation between the image and kernel [
34]:
$$F_c^k = \sum_{r}\sum_{s} j_d(r,s)\, k_c^k(v,w) \quad (1)$$
$F_c^k$ represents the output feature map, and $j_d(r,s)$ represents the chest radiograph values, which are multiplied by the $(v,w)$ index of the $k$th kernel of the $c$th layer. After employing convolutions on the input chest radiographs, an output of size $\lfloor (i - k + 2p)/s \rfloor + 1$ is formed, whereby $i$ stands for the input size, $p$ for the padding, $k$ for the kernel size, and $s$ for the stride.
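For instance, for the first ConV layer, with $i = 224$, $k = 7$, $p = 3$, and $s = 2$, the output size is $\lfloor (224 - 7 + 2 \cdot 3)/2 \rfloor + 1 = 112$; the 224 × 224 input is thus reduced to a 112 × 112 feature map.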
All ConV and group ConV layers are followed by activation functions. The most popular activation functions in the past were sigmoid and tanh; however, both saturate for large input magnitudes, which can cause vanishing gradients. These limitations led researchers to develop substitute activation functions, such as the rectified linear unit (ReLU) and its derivatives (ELU, Noisy ReLU, and LReLU), which are presently utilized in the bulk of DL applications. A node in a layer uses the activation function to transform the weighted sum of the input into the output. All neurons with negative values are deactivated by the ReLU activation function, rendering a significant percentage of the network inactive. To enhance the model’s classification performance, we applied an improved ReLU activation function (the LReLU activation function), which defines the output for negative inputs as a very small linear fraction of x rather than 0. In contrast to ReLU, the LReLU does not deactivate these inputs and also generates an output for negative values. The LReLU activation function works according to Equation (2):
$$f(x) = \begin{cases} x, & x > 0 \\ 0.01x, & x \le 0 \end{cases} \quad (2)$$
When given a positive input, the LReLU function returns x, but when given a negative input, it returns 0.01 times x (a small value).
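As a minimal sketch, the LReLU of Equation (2) with a 0.01 slope can be written element-wise in MATLAB, and corresponds to the built-in leaky ReLU layer used when defining a network (the variable and layer names below are illustrative):

```matlab
% Element-wise LReLU as in Equation (2): f(x) = x for x > 0, and 0.01*x otherwise.
lrelu = @(x) max(x, 0) + 0.01 * min(x, 0);

% Equivalent layer object used inside the network definition.
lreluLayer = leakyReluLayer(0.01, 'Name', 'lrelu_1');
```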
To normalize the outputs of ConV layers, we used the BN operation. BN enables regularization and accelerates the learning process of neural networks, and it also helps to prevent overfitting.
The output feature map of the first ConV layer is delivered into the next convolutional layer (the fire module) after applying the activation function (LReLU) and the BN operation. Three ConV layers make up the fire module: a squeeze ConV layer with 1 × 1 filters, followed by an expand layer combining 1 × 1 and 3 × 3 ConV filters. We chose 1 × 1 layers to decrease the total number of parameters: the total number of parameters in a layer is the number of input channels multiplied by the number of filters and by the filter size (here, three). Therefore, we utilized fewer kernels in the squeeze layer than in the expand layer to reduce the number of inputs to the 3 × 3 kernels. We used padding of 1 pixel in the ConV layers with 3 × 3 filters in order to make the outputs of the 3 × 3 and 1 × 1 filters the same size. After the fire module, we employed a maximum pooling layer. The maximum pooling layer with a stride of 2 × 2 after the fourth convolutional layer was used for down-sampling. This layer reduces the spatial size, computational complexity, the number of parameters, and the number of calculations. Equation (3) shows the working of the maximum pooling layer.
$$f(x) = \max_{(r,s) \in R} x_{r,s} \quad (3)$$
Here, $f(x)$ represents the resulting (pooled) feature map, and $R$ denotes the pooling neighborhood. In our model, a filter size of 3 × 3 and a stride of 2 × 2 are utilized to select the highest value from the neighboring pixels (in a radiograph image) using maximum pooling.
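As a rough illustration only (the exact filter counts and layer names of DeepLungNet are not reproduced here), a fire module of this kind, a 1 × 1 squeeze layer feeding parallel 1 × 1 and 3 × 3 expand branches whose outputs are concatenated, followed by the 3 × 3, stride-2 max-pooling layer, could be sketched with MATLAB's Deep Learning Toolbox as follows:

```matlab
% Sketch of a SqueezeNet-style fire module followed by max pooling.
% Filter counts (16 squeeze, 64 + 64 expand) are illustrative assumptions.
squeezeBranch = [
    convolution2dLayer(1, 16, 'Name', 'fire_squeeze')        % squeeze: 1x1 filters
    batchNormalizationLayer('Name', 'fire_squeeze_bn')
    leakyReluLayer(0.01, 'Name', 'fire_squeeze_lrelu')
    convolution2dLayer(1, 64, 'Name', 'fire_expand1x1')];     % expand branch A: 1x1 filters

lgraph = layerGraph(squeezeBranch);
lgraph = addLayers(lgraph, convolution2dLayer(3, 64, 'Padding', 1, ...
    'Name', 'fire_expand3x3'));                               % expand branch B: 3x3 filters, pad 1
lgraph = addLayers(lgraph, depthConcatenationLayer(2, 'Name', 'fire_concat'));
lgraph = addLayers(lgraph, maxPooling2dLayer(3, 'Stride', 2, 'Name', 'pool1'));

% Both expand branches read the squeeze activation; their outputs are concatenated
% along the channel dimension and then down-sampled by the max-pooling layer.
lgraph = connectLayers(lgraph, 'fire_squeeze_lrelu', 'fire_expand3x3');
lgraph = connectLayers(lgraph, 'fire_expand1x1', 'fire_concat/in1');
lgraph = connectLayers(lgraph, 'fire_expand3x3', 'fire_concat/in2');
lgraph = connectLayers(lgraph, 'fire_concat', 'pool1');
```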
The output of the fire module is passed as an input to a ConV layer with 64 kernels of size 3 × 3 and padding of 1 × 1. Similarly, the next ConV layer also applies 64 kernels of size 3 × 3 with padding of 1 × 1. For this convolutional layer, the activation function is applied after the addition layer. The next six convolutional layers are connected using shortcut connections, whereas the remaining (last) six convolutional layers are connected sequentially.
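Continuing the layer-graph sketch above, one shortcut connection of this type can be expressed with an addition layer whose second input bypasses the two ConV layers; the filter count of 128 is assumed only so that the summed tensors have matching depth, and the activation is applied after the addition layer, as described above:

```matlab
% Sketch of a shortcut (skip) connection around two 3x3 ConV + BN blocks.
% 128 filters are assumed so that the main path matches the depth of the shortcut input.
residualBlock = [
    convolution2dLayer(3, 128, 'Padding', 1, 'Name', 'conv_a')
    batchNormalizationLayer('Name', 'bn_a')
    leakyReluLayer(0.01, 'Name', 'lrelu_a')
    convolution2dLayer(3, 128, 'Padding', 1, 'Name', 'conv_b')
    batchNormalizationLayer('Name', 'bn_b')
    additionLayer(2, 'Name', 'add_ab')           % element-wise sum of main and shortcut paths
    leakyReluLayer(0.01, 'Name', 'lrelu_ab')];   % activation applied after the addition layer

lgraph = addLayers(lgraph, residualBlock);
lgraph = connectLayers(lgraph, 'pool1', 'conv_a');       % main path
lgraph = connectLayers(lgraph, 'pool1', 'add_ab/in2');   % shortcut path skips the two ConV layers
```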
The first FC layer receives the output of the final (i.e., eighteenth) ConV layer. The FC layer converts the two-dimensional feature maps produced by the ConV layers into a one-dimensional feature vector. The operations of an FC layer are elaborated in Equation (4):
$$y_i = \sum_{m}\sum_{n}\sum_{d} w_{i,m,n,d}\, x_{m,n,d} + b_i \quad (4)$$
where $x$ denotes the input feature map, $y_i$ denotes the $i$th output, and $i$, $m$, $n$, $d$, $w$, and $b$ stand for the output index, width, height, depth, weights, and bias of the FC layer, respectively. We used the dropout layer after the initial FC layer (to prevent overfitting). The final FC layer is followed by the softmax and classification layers.
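As a final illustration in the same vein, the classification head (average pooling, the two FC layers with dropout in between, and the softmax and classification layers) could be written as below; the width of the first FC layer (512) is an assumption for illustration only, while the five outputs correspond to the five-class setting:

```matlab
% Sketch of the classification head: pooling -> FC -> dropout -> FC -> softmax -> classification.
headLayers = [
    globalAveragePooling2dLayer('Name', 'avg_pool')   % average pooling over each feature map
    fullyConnectedLayer(512, 'Name', 'fc1')           % first FC layer (512 units is an assumed width)
    dropoutLayer(0.5, 'Name', 'drop1')                % 50% dropout after the first FC layer
    fullyConnectedLayer(5, 'Name', 'fc2')             % final FC layer: one output per class
    softmaxLayer('Name', 'softmax')
    classificationLayer('Name', 'output')];
```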
3.4. Hyper-Parameters
The choice of hyper-parameters is central to the effectiveness of DL architectures. In order to discover the appropriate value for each hyper-parameter given the wide range of alternatives available, we investigated the effectiveness of the suggested DeepLungNet model using a number of hyper-parameter settings. We selected a few hyper-parameters to determine how each DL architecture hyper-parameter affects the behavior of the entire network. The model is trained on dataset 1 using different parameter values, and the model performance metrics are examined. As shown in Table 2, this process is repeated using a new set of hyper-parameter values until the model reaches its optimal accuracy.
Table 3 shows the final hyper-parameter values. We employed the stochastic gradient descent optimization approach since it is effective for larger datasets, fast, and memory efficient. To limit the possibility of overfitting, we trained the model for 50 epochs.
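Assuming the augmented datastores and a fully assembled layer graph from the earlier sketches, the SGDM training run could be configured roughly as follows; the mini-batch size and learning rate shown are placeholders rather than the exact values listed in Table 3:

```matlab
% Illustrative SGDM training setup; batch size and learning rate are placeholder values.
options = trainingOptions('sgdm', ...
    'MaxEpochs', 50, ...              % 50 epochs, as used in this study
    'MiniBatchSize', 32, ...          % placeholder value
    'InitialLearnRate', 1e-3, ...     % placeholder value
    'Shuffle', 'every-epoch', ...
    'ValidationData', augTest, ...
    'Verbose', false, ...
    'Plots', 'training-progress');

% Train the (fully assembled) DeepLungNet layer graph on the augmented training set.
net = trainNetwork(augTrain, lgraph, options);
```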
5. Discussion
The key goal of this paper was to present a DL-based framework for the effective classification and identification of lung diseases, including LO, pneumonia, TB, VP, COVID-19, and BP, from chest radiographs. Because DL approaches provide better results for the classification or detection of different diseases of both plants and humans [
50,
51,
52,
53,
54], we have created an end-to-end solution that does not require manual feature extraction or selection. We validated the robustness and generalizability of our suggested technique using two datasets that contained images from various databases. In this work, we suggested a DeepLungNet DL-based framework that, when trained on chest radiographs, surpasses competing models in terms of accuracy (97.47%). The framework’s testing and training accuracy increase, and its training and testing loss rapidly decreases, after each epoch. The proposed framework was evaluated against both state-of-the-art frameworks presented in the past and hybrid methods (DL + SVM). To evaluate the system’s effectiveness and generalizability, we also evaluated it using the “Lemon Quality Dataset,” a popular and openly available Kaggle dataset from the agriculture domain. The suggested framework works well and outperforms both the state-of-the-art and hybrid techniques.
Our research methodology performs well because the DeepLungNet framework employs the LReLU activation function instead of the ReLU activation function. We used the LReLU activation function to address the issue of dying ReLUs; when a dying ReLU issue occurs, parts of the DL framework remain inactive. The suggested DeepLungNet approach uses an LReLU to resolve this issue: the LReLU activation mechanism permits a non-zero (small) gradient when the unit is not active, so learning continues rather than coming to a halt or running into a brick wall. As a result, the proposed DeepLungNet model’s lung disease classification performance is improved by the LReLU activation function’s enhanced feature extraction capability. The vanishing gradient and degradation issues are resolved by DeepLungNet’s skip connection method. Each layer that impairs the framework’s effectiveness can be skipped, and the gradient has access to an alternative shortcut path. Learning does not degrade from the first layers to the last layers, since the skip connection transfers the output from a previous layer to a following layer. These results are further explained by the fact that our suggested method can effectively extract the most robust, distinctive, and in-depth features to represent the CRI for exact and reliable categorization. Color, edges, and other low-level features are extracted by the first convolutional layers, whereas higher layers are in charge of detecting high-level features, such as an anomaly in the CRIs. Furthermore, our architecture is based on the following concepts. We used filters of different sizes, i.e., 7 × 7, 3 × 3, and 1 × 1, to extract both local and global features. The max-pooling layer in our model aids in reducing the model’s dimensions and parameters while retaining critical feature information. The model also lessens the computational cost (to speed up training) by using group ConV operations. A 50% dropout rate is used to reduce co-adaptation and overfitting. Co-adaptation, which leads to overfitting, occurs when many neurons in a layer extract extremely similar or identical deep features from the input images or data. Moreover, BN is utilized to speed up training, standardize the inputs, stabilize the framework, reduce the number of epochs, and provide regularization to prevent the model from overfitting.
Detecting lung problems is time consuming and requires a lot of effort. The images from chest radiographs are also less clear due to noise and fluctuating contrast, which makes it difficult for professionals to inspect the CRIs immediately. This study provides an automated system for classifying lung disorders that aids in the early detection of lung ailments. This method significantly enhances patient survival and treatment options. The suggested approach provides a trustworthy and efficient way to recognize lung conditions on chest radiographs, supporting the physician in making quick and accurate decisions.
Although the suggested strategy yielded good outcomes, we point out a few limitations and make some recommendations for future investigations. The proposed method is unable to categorize many other lung disorders, such as pneumothorax, LC, asthma, etc. How successfully the system detects lung disorders when using additional imaging modalities, such as computed tomography (CT) scans, is also uncertain. In the suggested method, the image data are divided into a test set (20%) and a training set (80%); yet, different splits can lead to different results. Despite the fact that our technique performed exceptionally well on two CRI datasets, this study’s conclusions have not been validated in real clinical investigations. In the future, we will employ the suggested approach on larger and more varied datasets to resolve the above-mentioned limitations and further demonstrate the effectiveness of the DeepLungNet algorithm. So far, we have compared the effectiveness of our framework with hybrid methods; in the future, we will evaluate it against alternative TL-based methods in which the FC layer, rather than the SVM, is used for classification. Future work will also examine how well the suggested model performs in classifying CRIs into more precise categories, such as pneumothorax, LC, asthma, etc., by incorporating data from additional research datasets. In order to use the proposed DeepLungNet model in practical applications to diagnose diseases such as TB, breast cancer, LO, etc., we plan to test its generalizability using more datasets on lung diseases or other medical datasets (e.g., detection and bone crack detection datasets) based on CT scans, MRI images, etc. Additionally, in order to validate the outcomes of the suggested method, we wish to judge the DeepLungNet technique using actual clinical cases.