A Hybrid Learning-Architecture for Improved Brain Tumor Recognition

Dixon, Jose; Akinniyi, Oluwatunmise; Abdelhamid, Abeer; Saleh, Gehad A.; Rahman, Md Mahmudur; Khalifa, Fahmi

doi:10.3390/a17060221

Open AccessArticle

A Hybrid Learning-Architecture for Improved Brain Tumor Recognition

by

Jose Dixon

^1,†

,

Oluwatunmise Akinniyi

^1,†

,

Abeer Abdelhamid

²

,

Gehad A. Saleh

³

,

Md Mahmudur Rahman

⁴

and

Fahmi Khalifa

^1,2,*

¹

Electrical and Computer Engineering Department, School of Engineering, Morgan State University, Baltimore, MD 21251, USA

²

Electronics and Communications Engineering Department, Mansoura University, Mansoura 35516, Egypt

³

Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Mansoura University, Mansoura 35516, Egypt

⁴

Department of Computer Science, School of Computer, Mathematical and Natural Sciences, Morgan State University, Baltimore, MD 21251, USA

^*

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Algorithms 2024, 17(6), 221; https://doi.org/10.3390/a17060221

Submission received: 17 March 2024 / Revised: 15 May 2024 / Accepted: 18 May 2024 / Published: 21 May 2024

(This article belongs to the Special Issue Algorithms for Computer Aided Diagnosis)

Download

Browse Figures

Versions Notes

Abstract

:

The accurate classification of brain tumors is an important step for early intervention. Artificial intelligence (AI)-based diagnostic systems have been utilized in recent years to help automate the process and provide more objective and faster diagnosis. This work introduces an enhanced AI-based architecture for improved brain tumor classification. We introduce a hybrid architecture that integrates vision transformer (ViT) and deep neural networks to create an ensemble classifier, resulting in a more robust brain tumor classification framework. The analysis pipeline begins with preprocessing and data normalization, followed by extracting three types of MRI-derived information-rich features. The latter included higher-order texture and structural feature sets to harness the spatial interactions between image intensities, which were derived using Haralick features and local binary patterns. Additionally, local deeper features of the brain images are extracted using an optimized convolutional neural networks (CNN) architecture. Finally, ViT-derived features are also integrated due to their ability to handle dependencies across larger distances while being less sensitive to data augmentation. The extracted features are then weighted, fused, and fed to a machine learning classifier for the final classification of brain MRIs. The proposed weighted ensemble architecture has been evaluated on publicly available and locally collected brain MRIs of four classes using various metrics. The results showed that leveraging the benefits of individual components of the proposed architecture leads to improved performance using ablation studies.

Keywords:

brain tumor; neural networks; deep learning; vision transformer; feature fusion

1. Introduction

According to the Central Brain Tumor report of the National Brain Tumor Society [1], about 1M Americans are living with a brain tumor today, and more than 90K people will be diagnosed with a primary brain tumor in 2023 [2]. A brain tumor is defined as abnormal cells in the brain, which are frequently categorized as benign/low grade (non-cancerous) or malignant (cancerous) [3]. Benign tumors (grades I and II) are non-progressive and, therefore, considered less aggressive. They originate in the brain, grow gradually, and cannot spread to other body parts. Malignant brain tumors, on the other hand, come in a variety of sorts and grades. The predominant primary brain tumors in adults are gliomas (comprising astrocytomas, oligodendrogliomas, and ependymomas) and meningiomas. Meningioma is a single category in the WHO’s Central Nervous System (CNS) fifth edition, with 15 subtypes. Most subtypes are benign and considered CNS WHO grade I [4]. Glioma tumors have diverse intensities and spread across the brain’s glial cells. They can be classified into four grades (grades I through IV), from benign to the most malignant [5]. Pituitary adenomas are typically benign, slow-growing tumors and are the most common type of pituitary gland tumor [6]. Meningioma and pituitary tumors both grow around the skull region and the pituitary gland, respectively. Thus, brain tumor detection for early diagnosis becomes a critical yet challenging task for assisting in the appropriate selection of treatment options to preserve the patients’ lives.

Generally, a series of physical and neurological examinations are utilized to diagnose brain tumors. The most reliable method for diagnosing brain tumors is through biopsy. This involves removing and examining a tissue sample under a microscope using various histological techniques. However, biopsies are invasive and carry a risk of bleeding, tissue injury, and functional loss [7]. Diagnosing brain tumors poses a challenge for healthcare providers due to their complicated nature. The prompt identification and treatment of brain tumors is critical to the survival rate of these patients [8]. Various imaging techniques (e.g., magnetic resonance imaging (MRI) and computerized tomography (CT)) are effectively utilized as noninvasive assistive tools for diagnosis, and a biopsy and pathological examination are performed to confirm a diagnosis. MRI is the preferred choice among these imaging techniques as both a non-ionizing and non-invasive method [5]. At the core of modern neuroimaging is the adoption of non-invasive brain tumor diagnosis and classification using MRI as it permits clinicians to examine the fundamental, structural, and functional characteristics of brain tumors [7,9]. As a result, medical imaging plays a crucial role in determining the type and stage of the tumor and developing a treatment plan. However, the manual review of these images is time-consuming, disorganized, and prone to errors due to patient volume. Brain tumor classification is one of the most difficult tasks due to its heterogeneity, isointense and hypointense features, and associated perilesional edema. T1-weighted (T1-w) contrast-enhanced MRIs are typically used to classify primary tumors, such as meningioma, and secondary cancers. As opposed to standard feature extraction and classification techniques, the main approaches for classifying brain tumors rely on region-based tumor segmentation. As a result, using AI-based tools, e.g., deep learning (DL) methodologies, the paradigm changed toward classification challenges.

Various traditional machine learning (ML) or advanced DL models have been adopted to build effective tools for brain tumor classification. In [10], Cheng et al. enhanced tumor region-of-interests with an adaptive spatial technique to divide these regions into subregions. Model-based features were extracted, including histogram-derived, texture-derived using gray-level co-occurrence matrix (GLCM), and bag-of-words features. Their experiments showed accuracies of 87.54%, 89.72%, and 91.28% on estimated features using the spatial partition method. However, a smaller dataset and only engineered features were utilized for classification. Using principal component analysis (PCA), Rathi and Palani [11] developed a brain tumor classification tool using linear discriminant analysis. They incorporated several hand-crafted features, such as intensity, texture, and shape, to label brain tissues as white matter, gray matter, CSF, abnormal, and normal. The classification was based on a support vector machine (SVM). However, their approach demonstrated a lower average accuracy of 0.83 (sensitivity: 0.88; specificity: 0.80); and their method employed hand-crafted features only and used traditional ML. Kumar et al. [12] proposed a structure for classifying brain tumors in MRIs based on gray wolf optimization and multiclass SVM. The gray wolf optimization technique performed better than the firefly algorithm and particle swarm optimization and reached a classification accuracy of 95.23%. Unlike the proposed system, they only employed GLCM for feature extraction and used a smaller dataset (3,064 images). Ismael and Abdel-Qader [13] integrated 2D discrete-wavelet transform and Gabor filtering to build a strong transform-domain statistical feature set to classify brain tumors. They used a back-propagation multilayer neural network using the derived statistical MRI characteristics. Also, their approach combined the two methods and achieved a fairly small score of 91.9% for accuracy using a small dataset. Similar to [10], Abir et al. proposed a probabilistic neural network for brain tumor classification [14]. In preprocessing, they applied image filtering, sharpening, resizing, and contrast enhancement, as well as extracting texture features using GLCM. Their suggested method achieved an accuracy level of 83.33%.

In addition to research work utilizing the extraction of hand-crafted features (e.g., [10,11,12,13]), a model involving deep architecture to classify images in self-learning scenarios has also been developed. A modified capsule network called CapsNet was suggested by Afshar et al. [15]. This network used the spatial link between the brain lesion and the non-tumor tissues around it. The CapsNet architecture renders an overall accuracy of 88.33%. Although they used a modified neural architecture, the method scored a low accuracy score with a fairly longer training setting (50 epochs). Abiwinanda et al. [16] used a CNN-based DL model to classify brain tumor images using five classification models. A ReLU layer and a maximum pool layer were included in the final architecture. The study claimed a validation accuracy of 84.19%. However, a shallow architecture was built and validated on a small dataset with three classes [17]. Deepak et al. [18] used the same data source as [16] and the same deep TL technique (GoogleNet) to classify the images. The images were analyzed for characteristics, which were then employed in compiling test and classification models. The authors reached an accuracy of 98% using a five-fold cross-validation. Although the overfitting was studied, system performance is affected by the training sample size reduction. Also, a high learning rate and a single classifier (SVM or KNN) were used. Similarly, Swati et al. [19] exploited pre-trained CNNs and employed a block-wise fine-tuning mechanism using a pre-trained VGG19 and achieved an overall accuracy of 94.82%. A hybrid feature extraction method by Gumaei et al. [20] was developed for classifying brain tumors. The feature vector was extracted using PCA and normalized GIST descriptors. Finally, a regularized extreme learning machine was proposed to classify brain tumors with 94.23% accuracy. Although their method showed improved classification accuracy, they used a hold-out evaluation method tested on 3064 images. A generic CNN-based algorithm consisting of six convolutional and max-pooling layers and only one fully connected layer was introduced by Anaraki et al. [21]. Following that, the best model generated by the genetic algorithm (GA) is averaged. Brain tumor classification accuracy reached 94.2% on the tested dataset. Their method was implemented using a longer training time frame (100 epochs), and a holdout evaluation was conducted using only 615 images (500 + 115) for testing.

Recently, Sharif et al. [22] proposed a decision support system utilizing pre-trained DenseNet201. Entropy–Kurtosis-based high feature values (EKbHFV) and modified GA (MGA) meta-heuristics were used for feature extraction. BRATS2018 and BRATS2019 datasets were evaluated using a multiclass SVM cubic classifier. The accuracy on BRATS2018 (BRATS2019) was 99.7% (99.8%) and 98.8% (99.3%) on glioblastoma (GBM/HGG) and lower grade glioma (LGG), respectively. The main limitation of their study is the removal of certain important features that affect the system’s accuracy. A comparative study was proposed using five CNN architectures by Asif et al. [23]. They modified the final layers of Xception, DenseNet201, DenseNet121, ResNet152V2, and InceptionResNetV2 with a deep dense block and softmax layer as the output layer. Using the Figshare dataset (3064 T1-w MRIs), they achieved an accuracy of 99.67% on the three-class dataset and 95.87% on the four-class (inclusive of healthy patients) dataset. The results show that the proposed model based on Xception architecture is the most suitable deep model for multi-class brain tumor classification. Despite the high accuracy, they applied fine-tuning parameters for the pre-trained models only using small-size training data. Agrawal et al. [24] developed a DL model called MultiFeNet that uses multi-scale architecture for feature extraction. Instead of employing various kernel sizes, multi-scaling was implemented using a dilation rate. The introduced model was tested using five-fold cross-validation to achieve 96.4% for sensitivity, F1-score, precision, and accuracy. However, the multi-scale feature scaling increased system complexity, computational expense, and network training and hence optimization complexity. Also, there was a lack of cross-dataset generalization as the authors tested/trained their method on the same dataset. A transfer learning (TL)-based DL approach for the multi-class classification of brain tumor type via fine-tuning of pre-trained EfficientNets was proposed by Zulfiqar et al. [25]. Five variants of modified EfficientNets were trained under different experimental settings. GradCAM-based visualization maps of modified EfficientNetB2 were applied to MR brain tumor sequences. The reported accuracy, precision, recall/sensitivity, and F1-score were 98.86%, 98.65%, 98.77%, and 98.71%, respectively. However, reduced accuracy of 91.53% was observed when performing cross-validation experiments on different datasets. A predictive CNN model using a hybrid generative adversarial network (GAN) was proposed by Sahoo et al. [26]. Both GAN-augmented samples and the original augmented dataset were fed into an in-house CNN classification model. Among various GAN architectures, progressive-growing (PGGAN) demonstrated accuracy, precision, recall, F1-score, and NPV metrics of 98.8%, 98.45%, 97.2%, 98.11%, and 98.09%, respectively. Although the system showed promising results on various diseases, brain tumor recognition was evaluated in a small dataset, Figshare, using train/validation/test split only. El-Wahab et al. [27] leveraged the benefits of using the 1 × 1 convolution layer and TL to realize a fast classification brain tumor process. The method realized an average accuracy of 98.63%, using five TL iterations, and 98.86% using retrained k-fold cross-validation (internal TL between the folds, k = 5). Despite the fact that the method overcame the overfitting issue because of unnecessary parameters, it was trained only on the Figshare dataset with increased computational cost and struggled with noisy data. In [28], a reinforcement learning-based architecture was introduced by Chaki et al. Similar to the present study, they used multiple datasets. The first dataset included 7023 images (1645 for meningioma, 1621 for glioma, 1757 for pituitary, and 2000 for normal classes), and the second dataset contained 253 images (155 brain tumors and 98 normal). Their suggested approach achieved an accuracy of 97.5%. The limitation of their study is that the interference time was too long (9 h). The crossover smell agent-optimized multilayer perception (CSA-MLP) was suggested for the identification and classification of a tumor by Arumugam et al. [29]. Their images have been collected from three datasets, and their method was based on employing CNN+ MLP classification head, which was incorporated with the CSA optimization algorithm to enhance the accuracy. Their approach scored an accuracy of 98.56%. Although the authors combined CNN and higher hand-crafted features, only binary classification and holdout validation scenarios were employed.

Various studies have been proposed in the literature with promising results, and this paper extends the existing work for brain tumor classification. Generally, most of the above-mentioned studies used a single dataset compared to our approach, where we integrated multiple datasets for system training and evaluated our method on an additional local dataset. The limited availability of annotated data combined with the small size may lead to overfitting and hinder the generalization of the proposed framework to diverse clinical scenarios. While pre-trained CNNs offer transferable features learned from large-scale datasets, their applicability to medical imaging tasks, particularly brain tumor characterization, may be constrained by domain-specific variations and complexities not adequately captured in general-purpose pre-trained models.

The proposed framework utilizes an ensemble architecture as an automated tool for tumor detection and diagnosis. The developed architecture integrates multiple learnable modules with the ability to capture localized and long-range dependencies. Integrating those modules improves the system’s ability to recognize complex patterns and facilitates the understanding of the context of features across the entire brain image, thus improving the system’s performance. The main contribution aspects of the present work include (i) a robust ensemble architecture that integrates a weighted feature fusion of information-rich features for classification as compared to direct feature concatenation; (ii) prominent disease-related features are retained through three learnable modules compared to single feature-based methods; (iii) capturing of long-range dependencies, non-local relationships, and complex pattern recognition by integrating a ViT in addition to a localized CNN-based method;and (iv) we documented an improved system classification accuracy using public local datasets via the utilization of a cross-validation evaluation scenario.

The paper is structured into five sections, starting with the introduction of modern CAD systems for brain tumor detection and diagnosis, the relevant review of the recent and related literature work followed by the outlined contributions of this work in Section 1. Full descriptions of the data, the methodology, and the details of the learnable modules and features extractions strategies are completely given in Section 2. Employed evaluation criteria, conducted experiments, and obtained results are given in Section 4. A discussion of our results and associated conclusions as well as future work suggestions are outlined in Section 5 and Section 6, respectively.

2. Methodology

The schematic diagram of our analysis pipeline for brain tumor classification is shown in Figure 1, which consists of multiple fractions: pre-processing and normalization, model training and parameters settings for feature extraction, feature fusion, and finally, multi-label classification. All those steps are illustrated next.

2.1. Datasets

The proposed ensemble architecture is evaluated using public and locally acquired datasets that are two-dimensional (2D). The public dataset contains 7023 human brain MRI images, classified into four classes: glioma, meningioma, pituitary macroadenomas, and normal brain with no detected tumor. The 7023 images (of size 512 × 512) are collected from Figshare, SARTAJ, and Br35H datasets [30] and the number of images are 1645, 1621, and 1757 images for meningioma, glioma, and pituitary, respectively. The objective of collecting from three sources is to obtain near-balanced classes to avoid bias in model training and prediction. In addition to the brain tumor types, an additional control (normal or no tumor) class was used, containing 2000 images. Class distributions are shown in Table 1.

A locally acquired dataset was also utilized for additional evaluation. This dataset is collected from an MR brain study of 64 patients at the diagnostic radiology department, Mansoura University Hospital, Mansoura, Egypt (IRB protocol # R.21.09.1437.R1). The images were acquired using a 1.5T Philips-Ingenia MR scanner using standard head coils, where non-contrast MRI sequences (T2-w, T1-w, and FLAIR) were first acquired. The locally acquired data included three classes: normal (no detected tumor), benign (including both meningioma and pituitary macroadenoma), and malignant (gliomas). The resolution of the images is 176 × 176, representing MRI files originating in DICOM, with detailed parameters shown in Table 2.

2.2. Data Preprocessing

First, data is prepossessed to improve the overall analysis results. The images are loaded from the original resolution of the public and local datasets and resized to a standard size of 224 × 224 pixels using bilinear interpolation. No noise filtering and reduction operations are applied to the images. The data augmentation procedure involves various data augmentations through the Keras ImageDataGenerator. During the data augmentation process, all images are randomly rotated at 15%, width and height are shifted at 10%, zoom at 20%, flipped vertically or horizontally. Image enhancement is carried out to adjust the image brightness within the 80% to 120% range. Then, using the Pillow (PIL) library, each image is converted to a PIL image object and then normalized by dividing them by 255.0 to scale the pixel values between 0 and 1. After the data augmentation process, images are converted to RGB format and then reshaped into arrays to be generated iteratively. An example of the applied data normalization and enhancement is shown in Figure 2.

Then, the augmented enhanced images are down-sampled using average pooling. Average pooling retains a lot of feature composition compared with max pooling, which discards a significant portion of the feature composition [31]. It is defined as:

y_{i, j, c} = \frac{1}{k_{h} k_{w}} \sum_{m = 0}^{k_{h} - 1} \sum_{n = 0}^{k_{w} - 1} x_{(i \times s_{h} + m), (j \times s_{w} + n), c}

(1)

where

y (i, j, c), k_{h}

and

k_{w}

, and

1 / k_{h} (k_{w})

represent the output value at pixel

(i, j)

of the cth channel, windows of values in the input feature map, and the normalization factor, respectively. The stride values are determined by

s_{h}

and

s_{w}

, and

x (a, b, c)

represents the input value at position

(a = i \times s_{h} + m), (b = j \times s_{w} + n)

in the input feature map c.

2.3. Features Extraction

To enhance brain tumor classification accuracy and precision, the proposed pipeline integrates MRI-derived information-rich features: CNN-derived, ViT-derived, texture, and radiomic features. These feature-deriving algorithms are well-known for their proficiency in medical image classification tasks. The extracted features are then integrated using a fusion strategy where the layers of these models are combined using the stacking method to create an ensemble classifier, resulting in a more robust brain tumor classification framework.

At first, a set of texture features from the MR images to enhance brain tumor type differentiation by helping the proposed architecture harness the spatial interactions between image intensities. The features are computed using two descriptors: higher-order texture and structural features. The first hand-crafted features are derived from GLCM-based Haralick features and the gray-level run length matrix (GLRLM) features. The GLCM model is a statistical approach that leverages pixel connectivity with its neighbors (usually eight neighbors) to quantify homogeneity, yielding features such as contrast, correlation, energy, and homogeneity [32]. GLCM generates a square matrix according to the number of gray levels present in the image, where each cell within the matrix signifies the frequency of co-occurring pairs of related gray levels in the image. For a preprocessed image, I, with dimension

N \times M

; the GLCM matrix, G, parameterized by an offset (

Δ x

,

Δ y

) can be constructed as in [33] using:

G_{Δ x, Δ y} (i, j) = \sum_{x = 1}^{N} \sum_{y = 1}^{M} \{\begin{matrix} 1, & I (x, y) = i and I (x + Δ x, y + Δ y) = j \\ 0, & otherwise \end{matrix} .

(2)

Here, i and j are the pixel (gray level) values;

I (x, y)

is the gray-level value at the pixel defined by the spatial positions x and y. Co-occurrence matrices are commonly large and sparse. Therefore, different metrics are often derived from the matrix to obtain a more meaningful set of features. These metrics, such as energy, contrast, correlation, homogeneity, and dissimilarity, can be calculated from the GLCM matrix [34]. GLRLM texture features are also used in our approach as an additional set of higher-order statistical texture features. Like GLCM, the GLRLM analysis typically extracts the pixel’s spatial plane features based on high-order statistics of immediate neighboring pixels [35]. The process yields a normalized 2D feature matrix in which each component represents the overall number of occurrences of the gray level in the given direction [36]. Typically, the GLRLM extractor captured information on pixel pairs at angles 0, 45, 90, and 135°. Mathematically, each element

L (i, j | θ)

of the run length matrix L, represents the number of runs with pixels of gray-level intensity equal to i and length of run equal to j along a specific orientation,

i, j \in [0, 255], θ \in {0, 45, 90, 135}

. From L for an input image of size

N \times M

, many features, including short/long run emphasis (SRE/LRE), gray level/run length non-uniformity (GLN/RLN), run percentage (RP), low/high gray level run emphasis (LGRE/HGRE) for a given

θ

can be calculated using the number of gray level (g) and some discrete run lengths (r) of a given image as follows:

$S R E = \sum_{i = 1}^{g} \sum_{j = 1}^{r} \frac{L (i, j)}{j^{2}}$	$L R E = \sum_{i = 1}^{g} \sum_{j = 1}^{r} j^{2} L (i, j)$	$G L N = \sum_{i = 1}^{g} {(\sum_{j = 1}^{r} L (i, j))}^{2}$
$R L N = \sum_{i = 1}^{r} {(\sum_{j = 1}^{g} L (i, j))}^{2}$	$R P = \frac{1}{N \times M} \sum_{i = 1}^{g} \sum_{j = 1}^{r} L (i, j)$	$L G R E = \sum_{i = 1}^{g} \sum_{j = 1}^{r} \frac{L (i, j)}{i^{2}}$
	$H G R E = \sum_{i = 1}^{g} \sum_{j = 1}^{r} i^{2} L (i, j)$

In summary, the GLRLM features texture patterns analysis provides discriminative power for image classification and complements other features like color, shape, and intensity for comprehensive representations and better performance. It is worth mentioning that the images from each pair are rotational invariant (or symmetric). Moreover, local binary patterns (LBP)-derived features are also integrated to make texture analysis more efficient due to its sturdiness towards the monotonic gray-scale changes caused by different lighting conditions and noise. Compared with GLCM as a global texture encoding approach (for the whole image), LBP is a local feature encoding scheme that effectively characterizes the local texture variation. Like GLCM, the method compares each pixel in an image with its neighbors and encodes the result as a binary pattern. Thus, LBP adds additional structural characteristics over the GLCM-statistical and GLRLM run length features while being more computationally efficient [37]. A demonstrative example of the estimated GCLM (color-coded) and the LBP is shown in Figure 3.

Secondly, we harness the power of neural networks, particularly convolutional neural networks (CNN), which have demonstrated performance improvements compared to traditional ML methods due to their inherent characteristics of self-learning and automated feature extraction, capturing localized patterns, hierarchical representations, and spatial hierarchies. Although transfer learning can be employed using pre-trained CNNs, we opted to build our architecture so that it is trained from scratch on the problem at hand to learn directly from the available data and thus extract meaningful features. Some experiments were performed to optimize the architecture hyperparameters. In particular, we employed the GridSearchCV [38] method to determine the optimal layers and parameters. As a result, the network comprises five convolutional and pooling layers with final flattening to allow the model to learn complex relationships between the high-level features.

Some of the disadvantages of using CNNs are that they require strong spatial organization, struggle with non-local relationships and scale variation, and have a limited ability to generalize data. Thus, our ensemble architecture integrates a ViT model that addresses the limitations of CNN-based models by introducing fixed-sized patch embedding. Moreover, ViTs can capture long-range dependencies, making them suitable for recognizing complex patterns and understanding the context of features across the entire image. The proposed model uses a pre-trained ViT model for feature extraction. The patch encoder uses patch and position embedding for each patch. The transformer encoders comprise multi-head attention, feedforward, normalization, and residual connections layers. After the transformer layers, a global average pooling layer is applied to obtain a fixed-size representation of the entire sequence. The ViT model employs softmax activation and is pre-trained on ImageNet to learn generic features from diverse visual data.

3. Weighted Ensemble Classification

After feature extraction, we adopted a feature fusion strategy to integrate them for classification. Generally, a fusion of features in deep learning is an integration method that combines different information extracted for the same input (or sample) to improve the overall model performance. Extracted MR features are initially weighted and then presented as input to a classifier. The weighing of the features is employed to highlight the impact of various features on tumor classification compared to traditional fusion techniques that employ equal feature weighing.

Currently, three base-feature extractors (CNN, hand-crafted, and ViT) are employed to provide individual classification or prediction of the input sample (an MR image). Each model’s output for a given input sample

y_{i}

is a probability vector

P_{k}

, where

k \in [1, \dots, m c]

and

m c

represent the number of classes. The feature extractor modules were trained individually, and to improve the system’s overall accuracy, the final system performance is based on the weighted combined features using weights

w_{i} \in [0, 1]

[39]. To weigh the features vector before ensemble classification, the weights were assigned based on the models’ priority. The model with the best performance out of the three was prioritized using the following equation.

w_{i} = \frac{P_{i}}{\sum_{k = 1}^{3} P_{i}}

(3)

Weighted ensemble prediction can be performed after each model’s weights are assigned. After that, test data can be utilized to evaluate the model by comparing its performance to the predicted output. The last step of the analysis pipeline is the classification using the weighted feature vector. A multilayer perceptron (MLP) classifier was used for classification. The classifier choice is based on the fact that MLP can learn intricate nonlinear patterns by magnifying pertinent aspects of input data while suppressing irrelevant information [40].

4. Experimental Results

In this work, the proposed ensemble architecture is evaluated using the public and locally-acquired datasets described in Section 2.1. For the public dataset, the ground truth labels were provided to the researchers and used to assess the accuracy/sensitivity of the proposed system. All patients were first described for the local dataset based on MRI findings on T2, DWI, and contrast-enhanced T1 images. All patients with malignant tumors had a biopsy.

The back-end for the proposed architecture was Tensorflow 2.13.0 and Python 3.8. The CNN as a feature extractor consisted of five layers, each followed by a ReLU activation function and a max-pooling layer to improve computational efficiency. A flattened layer was introduced to reshape the feature maps to the one-dimensional vector, passed into an MLP trained over 50 epochs with a batch size of 125 for classification. MLP employed a sparse categorical cross-entropy loss function and Adam optimizer with a learning rate starting at 0.001 and was reduced by default during the training process for better results. The ViT model was configured to extract a 16 × 16 image patch size, with 20 layers, 3072 hidden dimensions, and 12 attention heads. A NumPy expanded dimension parameter was added to represent an additional dimension added to the batch dimension. A predicted image parameter was provided to pass the preprocessed image through the ViT model to extract features. A flattened feature parameter converted the extracted features to a one-dimensional vector. The MLP classifier model consisted of a flattened layer with 86,000 features, three dense layers each, ReLU activation, 0.5-dropout, batch normalization, and Softmax activation for the final layer. The MLP model was compiled and evaluated with the Adam optimizer, a batch size of 125, 30 epochs, sparse categorical cross-entropy loss, and accuracy metrics.

All of our experiments and analysis were conducted on a Dell workstation with a 12th Gen Intel^® Core™ i9-12700, 20 processors, 64.0 GB of memory, 1.5 TB disk capacity and an NVIDIA GeForce RTX 3060 with GPU. The end-to-end execution time for testing the proposed ensemble was 60 ± 0.29 s, including feature extraction, fusion, and MLP classification. The model assessment used five-fold cross-validation for performance evaluation. As an unbiased estimator, cross-validation enhances how the deep architecture will transfer to an independent dataset and helps to partially decrease overfitting or selection bias. For cross-validation, we used a ratio of 80% to 20% of the entire dataset for training and testing consecutively. In our experiments, quantitative assessment relied on three metrics: accuracy (Ac), sensitivity (Se), and specificity (Sp) [41].

Ac = \frac{Tp + Tn}{Tn + Tn + Fn + Fp}

(4)

Sn = \frac{Tp}{Tp + Fn}

(5)

Sp = \frac{Tn}{Tn + Fp}

(6)

Here, Tp (Tn) represents the number of true positive (negative), and Fp (Fn) represents the number of false positive (negative) samples.

The above metrics summarize the overall accuracy in Table 3. To examine the importance of different MRI-derived features, we employed ablation techniques to understand the contributions of individual ensemble components. Three different ablation scenarios were explored: individual modules, paired modules, and all three modules. For paired modules, we assessed the contribution of (1) higher-order texture and deeper features while excluding the ViT-derived features; (2) deeper and ViT-derived features while leaving out the texture features; (3) ViT-derived and texture features while excluding the CNN model. The performance of the ensemble model was compared, and the results using the public dataset are tabulated in Table 3.

Furthermore, the confusion matrix (CM) and receiver operating characteristics (ROC), powerful tools for evaluating and comparing classification models, were used to perform another quantitative evaluation of the proposed methods. The first row of Figure 4 shows the individual model’s CM, which is extremely useful for determining which classes were misclassified the most. Moreover, other metrics (e.g., precision, F1-score, and recall) may be calculated from a given matrix. The second row of Figure 4 shows the ROC which provides robustness analysis using graphical illustrations for a given class. Technically, intermediate curve points are constructed by changing the decision threshold (i.e., the control parameter) to classify instances into positive or negative classes. Various trade-offs between Tp and Fn rates are generated by this threshold, and the curve is constructed. Additionally, broken black (diagonal) represents the random guess, for which ROCs falling below this line indicate poor model performance in identifying such classes. ROCs are thus valuable assessment tools showing how well an ML classifier distinguishes between different classes. The ROCs and CMs of the proposed ensemble method tested using the first (public) dataset are shown in the Figure 5.

To demonstrate the effectiveness of the proposed architecture, the trained model on the public dataset was used to classify brain images from the local dataset. The accuracy is reported in the last row of Table 3, and both the confusion matrix and ROC curves are shown in Figure 6. The results obtained on the second (local) dataset demonstrate how powerful the proposed model is in generalizing to a different dataset. In total, the quantitative results and scores of the ROC and CMs document the higher performance of the proposed model in correctly classifying brain MRIs.

Finally, to emphasize the benefits of the proposed architecture, we compared the accuracy of our proposed model with its competitive recent literature models that partially utilized similar datasets or combined datasets. Some comparative models use the Figshare dataset compared to our study, which uses three datasets from public sources, including Figshare. We also compared against similar approaches that have adopted an aggregation of different datasets to ensure that our methodology’s efficacy and generalizability are fairly assessed. To strengthen the argument for our framework and demonstrate its advantages without being unfair to other authors, we deliberately avoided reimplementing the approaches or methodologies of competitive state-of-the-art (SOTA) models. Instead, we focused on comparing the accuracy of our proposed model with recent literature models that utilized similar datasets. This approach ensures a fair comparison while still providing valuable insights into how our framework performs within similar datasets and methodologies used by other researchers. The major factor for comparing classification results is accuracy. As a result, we considered the mean accuracy for the quantitative analysis. Table 4 shows how the proposed approach compares to other methods tested on various brain tumor MRI data regarding accuracy.

5. Discussion

Precise diagnosis and classification of brain tumors is of utmost importance for early intervention. Clinically, various physical and neurological examinations are utilized for diagnosis, and biopsy remains the most reliable diagnostic method, i.e., the gold standard. However, it is invasive and has potential risks of bleeding, tissue injury, and functional loss. As a result, AI-based research utilizes medical imaging for its crucial role in determining the type and stage of the tumor and developing a treatment plan. Over the last decade, various works of literature have been developed that exploited different image modalities (e.g., MRI, CT, etc.) for the prompt identification of brain tumors; the authors of [48] presented a comprehensive review. MRI is the preferred choice as it is a non-ionizing and non-invasive method. The main objective of this work is to develop a robust architecture for brain tumor identification using MRI. The focus is identifying prominent localized and non-localized information-rich features associated with disease, building upon public and local datasets for system training and validation. Thus, this work contributed to a system capable of classifying brain tumors based on rich-MRI-derived cues.

We have introduced a hybrid architecture for multi-classification of brain tumors integrating three learning modules to extract texture (radiomics) and deep hidden features, all combined by a feature weighing scenario to form a rich feature vector for tumor classification. Quantitatively, and as shown in Table 3, the integrative efforts of the employed models demonstrate the superiority of our ensemble approach (accuracy ≥ 99%) for both datasets. The subsequent exploration of ablation techniques, dissecting the ensemble model under various scenarios, reveals that the combined ViT with texture features model outperforms others with an impressive overall accuracy of ∼95%.

Further, Table 3 shows the summary of the evaluation metrics of the ensemble and ablated models. The results/performance of CNN-derived features (CnF) alone showed reduced performance compared with the hand-crafted features (HcF). This can partly be explained by the shallow nature of neural architecture and the integrative nature of three sets of radiomics (i.e., GLCM, GLRM, and LBP). The ViT-based (ViF) classification alone showed the best results, which is explained by its ability to capture long-range dependencies, making it suitable for recognizing complex patterns. The “CnF+HcF”, “CnF+ViF,” and “ViF+HcF” ablated models’ results show the performance of the ensemble model without the ViT, HcF, and CNN models, respectively. Table 3 showed that the fused experiments with ViT-derived features increase the performance of the individual branch. This affirms the utility of the ViT-derived feature and the fusion strategy employed, where weights are assigned to each feature branch based on its prediction accuracy.

In addition to quantitative metrics, robustness analysis is also highlighted using ROC curves, as shown by the interconnected lines in Figure 5 and Figure 6. Generally, the assessment of model effectiveness using those curves is based on the area under the curve (AUC). The latter quantitatively assesses the model’s ability to identify a specific class with “1” and “0”, indicating the best and the worst performance, respectively. Notably, in the ROC curves for the first datasets, (Figure 5) the normal and pituitary class reside closest to the top-left corner (AUC is ∼100%), while the meningioma and glioma class exhibits a smaller distance from this position. A similar observation can be drawn from Figure 6 for the second dataset. The ROC curve and AUC scores show that our architecture correctly classified each class well. For individual model performance in predicting each class, the ROC curves in Figure 4d–f also revealed that the ViT model performed the best in predicting all classes as it has curves that are more tilted to the top left corner, while the CNN ROC curves are the farthest from the top left corner amongst the three individual models. This is consistent with the results in Table 3. The confusion matrix in Figure 4a–c further evaluates the performance of the three models as it shows the proportion of instances each model predicted correctly.

The comparative accuracy, in Table 4, contrasting the proposed method against recent literature work tested on various brain tumor datasets, highlight its advantages. Some of the compared methods (e.g., refs. [11,12,13,18,19,20,26] were tested partially on the public dataset in Table 1, while others (e.g., refs. [21,29] adopted a method similar to our approach of integrating various datasets to reduce model overfitting and avoid unbalanced data class scenarios. No previous studies have utilized the same three datasets used in our study. However, by employing this strategy, we aim to showcase the distinct contributions and advantages of our proposed model without directly competing with or undermining the efforts of other authors. While this may not represent the optimum comparative scenario, it offers valuable context by illustrating how our framework stands relative to recent work that shares similarities in dataset usage. This approach allows for a nuanced evaluation of our framework’s performance and position within the broader research landscape utilizing similar datasets, thus providing valuable insights for readers and researchers.

In our proposed framework, we emphasize critical issues concerning the scientific community but also offer a solution with far-reaching implications beyond brain tumor classification alone. While the presented architecture is designed specifically for brain tumor classification, its underlying principles and methodologies have the potential to be used in a variety of oncology applications. The weighted fusion of multiple learnable modules within our ensemble architecture enables the capture of both localized and long-range dependencies. The latter increases our system’s ability to identify complex patterns indicative of brain tumors with high sensitivity. The transferability of pre-trained models and feature extraction techniques suggests that they could be applied to other cancer types, opening the door to multi-modal and multi-organ image analysis frameworks.

Despite the promising results, our method has some limitations, including the reliance on a single image modality (i.e., MRI) for prediction, the lack of explainability and interpretability of the machine decisions, and the use of a single ML classifier. The above-mentioned limitations stand as a motivation for future improvement.

6. Conclusions and Future Work

This work has introduced a robust analysis technique that integrated three learning modules to extract texture and deep hidden features from the brain images for tumor classification. The experimental results and evaluation using public and local datasets documented that integration radiomic-based features (extracted using GLCM, LBP, and GLRLM texture descriptors) and deep features (extracted using convolutional neural network and vision transformer) yielded higher accuracy. Various ablation studies and comparisons with other state-of-the-art methods documented the superiority of our ensemble approach.

Our future research will focus on exploring other powerful deep neural modules (e.g., attention modules), and investigating the performance of different ML classifiers (e.g, support vector machine, random forests, etc.) or stacking classifiers. Also, we recognize the value of segmentation in precise tumor delineation and treatment planning. Future research directions may incorporate segmentation outputs to improve classification performance and clinical utility. Further, we will explore expanding the number of classes to make them more user-friendly and practical for use in the medical field.

Author Contributions

Conceptualization, J.D., O.A., A.A., G.A.S., M.M.R. and F.K.; methodology, J.D., O.A., A.A. and F.K.; software, J.D., O.A. and A.A.; validation, G.A.S., M.M.R. and F.K.; formal analysis, A.A., G.A.S., M.M.R. and F.K.; investigation and resources, G.A.S., M.M.R. and F.K.; data curation, J.D., O.A. and G.A.S.; writing—original draft preparation—review and editing, J.D., O.A., A.A., G.A.S., M.M.R. and F.K.; visualization, A.A. and F.K.; supervision, M.M.R. and F.K.; project administration G.A.S., M.M.R. and F.K.; funding acquisition, M.M.R. and F.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work supported by the Center for Equitable Artificial Intelligence and Machine Learning Systems (CEAMLS), Morgan State University, Project #11202202; and in part funded by the National Institutes of Health (NIH) Agreement No.# 1OT2OD032581. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the NIH.

Institutional Review Board Statement

For the local dataset, the study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board (or Ethics Committee) of Mansoura University (IRB protocol # R.21.09.1437.R1, 09-23-2021).

Informed Consent Statement

For the local dataset, all participants were given complete information about the study’s objectives and they gave their informed consent.

Data Availability Statement

The local dataset will made available upon reasonable request to the corresponding author after publication. The publicly-used dataset is available at http://mohammadmahoor.com/affectnet/ and https://www.kaggle.com/datasets/msambare/fer2013 (accessed on 11 December 2023).

Conflicts of Interest

The authors declare no conflict of interest.

References

Ostrom, Q.T.; Patil, N.; Cioffi, G.; Waite, K.; Kruchko, C.; Barnholtz-Sloan, J.S. CBTRUS statistical report: Primary brain and other central nervous system tumors diagnosed in the United States in 2013–2017. Neuro-Oncology 2020, 22, iv1–iv96. [Google Scholar] [CrossRef] [PubMed]
Society, N.B.T. Brain Tumor Facts 2023. Available online: https://braintumor.org/brain-tumors/about-brain-tumors/brain-tumor-facts/ (accessed on 21 October 2023).
Lapointe, S.; Perry, A.; Butowski, N.A. Primary brain tumours in adults. Lancet 2018, 392, 432–446. [Google Scholar] [CrossRef] [PubMed]
Louis, D.N.; Perry, A.; Wesseling, P.; Brat, D.J.; Cree, I.A.; Figarella-Branger, D.; Hawkins, C.; Ng, H.; Pfister, S.M.; Reifenberger, G.; et al. The 2021 WHO classification of tumors of the central nervous system: A summary. Neuro-Oncology 2021, 23, 1231–1251. [Google Scholar] [CrossRef] [PubMed]
Bezdan, T.; Zivkovic, M.; Tuba, E.; Strumberger, I.; Bacanin, N.; Tuba, M. Glioma brain tumor grade classification from mri using convolutional neural networks designed by modified fa. In Proceedings of the International Conference on Intelligent and Fuzzy Systems, Izmir, Turkey, 21–23 July 2020; pp. 955–963. [Google Scholar]
Melmed, S. Pituitary-tumor endocrinopathies. N. Engl. J. Med. 2020, 382, 937–950. [Google Scholar] [CrossRef]
Villanueva-Meyer, J.E.; Mabray, M.C.; Cha, S. Current clinical brain tumor imaging. Neurosurgery 2017, 81, 397. [Google Scholar] [CrossRef] [PubMed]
McFaline-Figueroa, J.R.; Lee, E.Q. Brain tumors. Am. J. Med. 2018, 131, 874–882. [Google Scholar] [CrossRef] [PubMed]
Roberts, T.A.; Hyare, H.; Agliardi, G.; Hipwell, B.; d’Esposito, A.; Ianus, A.; Breen-Norris, J.O.; Ramasawmy, R.; Taylor, V.; Atkinson, D.; et al. Noninvasive diffusion magnetic resonance imaging of brain tumour cell size for the early detection of therapeutic response. Sci. Rep. 2020, 10, 9223. [Google Scholar] [CrossRef]
Cheng, J.; Huang, W.; Cao, S.; Yang, R.; Yang, W.; Yun, Z.; Wang, Z.; Feng, Q. Enhanced performance of brain tumor classification via tumor region augmentation and partition. PLoS ONE 2015, 10, e0140381. [Google Scholar] [CrossRef] [PubMed]
Rathi, V.; Palani, S. Brain Tumor Detection and Classification Using Deep Learning Classifier on MRI Images. Res. J. Appl. Sci. Eng. Technol. 2015, 10, 177–187. [Google Scholar]
Kumar, A.; Ansari, M.A.M.H.; Ashok, A. A Hybrid Framework for Brain Tumor Classification using Grey Wolf Optimization and Multi-Class Support Vector Machine. Int. J. Recent Technol. Eng. 2019, 8, 7746–7752. [Google Scholar] [CrossRef]
Ismael, M.R.; Abdel-Qader, I. Brain tumor classification via statistical features and back-propagation neural network. In Proceedings of the 2018 IEEE International Conference on Electro/Information Technology (EIT), Rochester, MI, USA, 3–5 May 2018; pp. 252–257. [Google Scholar]
Abir, T.A.; Siraji, J.A.; Ahmed, E.; Khulna, B. Analysis of a novel MRI based brain tumour classification using probabilistic neural network (PNN). Int. J. Sci. Res. Sci. Eng. Technol. 2018, 4, 65–79. [Google Scholar]
Afshar, P.; Plataniotis, K.N.; Mohammadi, A. Capsule networks for brain tumor classification based on MRI images and coarse tumor boundaries. In Proceedings of the ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 1368–1372. [Google Scholar]
Abiwinanda, N.; Hanif, M.; Hesaputra, S.T.; Handayani, A.; Mengko, T.R. Brain tumor classification using convolutional neural network. In World Congress on Medical Physics and Biomedical Engineering 2018, Prague, Czech Republic, 3–8 June 2018; Springer: Singapore, 2019; pp. 183–189. [Google Scholar]
Cheng, J. Figshare. Dataset. 2017. Available online: https://figshare.com/articles/dataset/brain_tumor_dataset/1512427/5 (accessed on 11 December 2023).
Deepak, S.; Ameer, P. Brain tumor classification using deep CNN features via transfer learning. Comput. Biol. Med. 2019, 111, 103345. [Google Scholar] [CrossRef] [PubMed]
Swati, Z.N.K.; Zhao, Q.; Kabir, M.; Ali, F.; Ali, Z.; Ahmed, S.; Lu, J. Content-based brain tumor retrieval for MR images using transfer learning. IEEE Access 2019, 7, 17809–17822. [Google Scholar] [CrossRef]
Gumaei, A.; Hassan, M.M.; Hassan, M.R.; Alelaiwi, A.; Fortino, G. A hybrid feature extraction method with regularized extreme learning machine for brain tumor classification. IEEE Access 2019, 7, 36266–36273. [Google Scholar] [CrossRef]
Anaraki, A.K.; Ayati, M.; Kazemi, F. Magnetic resonance imaging-based brain tumor grades classification and grading via convolutional neural networks and genetic algorithms. Biocybern. Biomed. Eng. 2019, 39, 63–74. [Google Scholar] [CrossRef]
Sharif, M.; Khan, M.; Alhussein, M.; Khursheed, K.; Raza, M. A decision support system for multimodal brain tumor classification using deep learning. Complex Intell. Syst. 2021, 8, 3007–3020. [Google Scholar] [CrossRef]
Asif, S.; Ming, Z.; Tang, F.; Zhu, Y. An enhanced deep learning method for multi-class brain tumor classification using deep transfer learning. Multimed. Tools Appl. 2023, 82, 31709–31736. [Google Scholar] [CrossRef]
Agrawal, T.; Choudhary, P.; Shankar, A.; Singh, P.; Diwakar, M. MultiFeNet: Multi-scale feature scaling in deep neural network for the brain tumour classification in MRI images. Int. J. Imaging Syst. Technol. 2023, 34, e22956. [Google Scholar] [CrossRef]
Zulfiqar, F.; Bajwa, U.I.; Mehmood, Y. Multi-class classification of brain tumor types from MR images using EfficientNets. Biomed. Signal Process. Control 2023, 84, 104777. [Google Scholar] [CrossRef]
Sahoo, S.; Mishra, S.; Panda, B.; Bhoi, A.K.; Barsocchi, P. An Augmented Modulated Deep Learning Based Intelligent Predictive Model for Brain Tumor Detection Using GAN Ensemble. Sensors 2023, 23, 6930. [Google Scholar] [CrossRef]
Abd El-Wahab, B.; Nasr, M.; Khamis, S.; Ashour, A.S. BTC-fCNN: Fast Convolution Neural Network for Multi-class Brain Tumor Classification. Health Inf. Sci. Syst. 2023, 11, 3. [Google Scholar] [CrossRef] [PubMed]
Chaki, J.; Woźniak, M. Brain Tumor Categorization and Retrieval Using Deep Brain Incep Res Architecture Based Reinforcement Learning Network. IEEE Access 2023, 11, 130584–130600. [Google Scholar] [CrossRef]
Arumugam, M.; Thiyagarajan, A.; Adhi, L.; Alagar, S. Crossover smell agent optimized multilayer perceptron for precise brain tumor classification on MRI images. Expert Syst. Appl. 2024, 238, 121453. [Google Scholar] [CrossRef]
Chaki, J.; Wozniak, M. Brain Tumor MRI Dataset. 2023. Available online: https://ieee-dataport.org/documents/brain-tumor-mri-dataset (accessed on 11 December 2023).
Nirthika, R.; Manivannan, S.; Ramanan, A.; Wang, R. Pooling in convolutional neural networks for medical image analysis: A survey and an empirical study. Neural Comput. Appl. 2022, 34, 5321–5347. [Google Scholar] [CrossRef]
Haralick, R.M.; Shanmugam, K.; Dinstein, I.H. Textural features for image classification. IEEE Trans. Syst. Man Cybern. 1973, 6, 610–621. [Google Scholar] [CrossRef]
Sun, W.; Zeng, N.; He, Y. Morphological arrhythmia automated diagnosis method using gray-level co-occurrence matrix enhanced convolutional neural network. IEEE Access 2019, 7, 67123–67129. [Google Scholar] [CrossRef]
Jafarpour, S.; Sedghi, Z.; Amirani, M.C. A robust brain MRI classification with GLCM features. Int. J. Comput. Appl. 2012, 37, 1–5. [Google Scholar]
Mohanty, A.K.; Beberta, S.; Lenka, S.K. Classifying benign and malignant mass using GLCM and GLRLM based texture features from mammogram. Int. J. Eng. Res. Appl. 2011, 1, 687–693. [Google Scholar]
Öztürk, Ş.; Akdemir, B. Application of feature extraction and classification methods for histopathological image using GLCM, LBP, LBGLCM, GLRLM and SFTA. Procedia Comput. Sci. 2018, 132, 40–46. [Google Scholar] [CrossRef]
Park, B.; Shin, T.; Cho, J.S.; Lim, J.H.; Park, K.J. Improving blueberry firmness classification with spectral and textural features of microstructures using hyperspectral microscope imaging and deep learning. Postharvest Biol. Technol. 2023, 195, 112154. [Google Scholar] [CrossRef]
LaValle, S.M.; Branicky, M.S.; Lindemann, S.R. On the relationship between classical grid search and probabilistic roadmaps. Int. J. Robot. Res. 2004, 23, 673–692. [Google Scholar] [CrossRef]
Breve, F.A. COVID-19 detection on Chest X-ray images: A comparison of CNN architectures and ensembles. Expert Syst. Appl. 2022, 204, 117549. [Google Scholar] [CrossRef] [PubMed]
LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
Shreffler, J.; Huecker, M.R. Diagnostic Testing Accuracy: Sensitivity, Specificity, Predictive Values and Likelihood Ratios. 2022. Available online: https://europepmc.org/article/nbk/nbk557491 (accessed on 10 January 2024).
London, I.C. IXI Dataset. Available online: https://brain-development.org/ixi-dataset/ (accessed on 10 May 2024).
Scarpace, L.; Flanders, A.E.; Jain, R.; Mikkelsen, T.; Andrews, D.W. Data from REMBRANDT. Cancer Imaging Arch. 2015, 10, 818–833. [Google Scholar]
Scarpace, L.; Mikkelsen, L.; Cha, T.; Rao, S.; Tekchandani, S.; Gutman, S.; Pierce, D. Radiology data from the cancer genome atlas glioblastoma multiforme [TCGA-GBM] collection. Cancer Imaging Arch. 2016, 11, 1. [Google Scholar]
Pedano, N.; Flanders, A.E.; Scarpace, L.; Mikkelsen, T.; Eschbacher, J.M.; Hermes, B.; Ostrom, Q. Radiology data from the cancer genome atlas low grade glioma [TCGA-LGG] collection. Cancer Imaging Arch. 2016, 2, 1–6. [Google Scholar]
Nickparvar, M. Brain Tumor MRI Dataset; Kaggle: San Francisco, CA, USA, 2021. [Google Scholar]
Chakrabarty, N. Brain MRI images for brain tumor detection. J. Exp. Med. 2019, 216, 539–555. [Google Scholar]
Batool, A.; Byun, Y.C. Brain tumor detection with integrating traditional and computational intelligence approaches across diverse imaging modalities-Challenges and future directions. Comput. Biol. Med. 2024, 175, 108412. [Google Scholar] [CrossRef]

Figure 1. Schematic overview of the proposed methodology for brain tumor classification.

Figure 2. Data augmentation process applied to the preprocessed images from left to right, horizontal and vertical flip, rotate, shear, scale, sigmoid contrast, gamma contrast.

Figure 3. Example of contrast-enhanced axial brain T1-w MR images from the dataset: (a) normal brain with no detected tumor, (b) right temporal glioma, (c) pituitary macroadenoma, and (d) right frontal meningioma showing the difference between the classes based on their estimated GLCM (second row) and LBP (third row).

Figure 4. Model performance demonstrated using confusion matrices (first row) and ROC curves (second row) based on using an individual set of features texture (a,d), CNN-derived (b,e), and ViT-derived (c,f), respectively.

Figure 5. Proposed model performance on the public dataset using the confusion matrix (a) and ROC curves (b).

Figure 6. Proposed model accuracy and performance tested on the second (local) dataset using the confusion matrix (a) and the ROC curve (b).

Table 1. Per-class distribution for the public dataset.

	Dataset
	Figshare	SARTAJ	Br35H	Total
No Tumor	0	500	1500	2000
Glioma	1426	195	0	1621
Meningioma	708	937	0	1645
Pituitary Macroadenomas	930	827	0	1757
Total	3064	2459	1500	7023

Table 2. DICOM header information for the MRI sequences of the local dataset.

Parameter	T1-w	T2-w	FLAIR
Repetition time (TR) [ms]	580	4432	10,000
Echo time (TE) [ms]	15	100	115
Inversion time (TI) [ms]	-	-	2700
Matrix size	$80 \times 80$	$80 \times 80$	$80 \times 80$
Field-of-view (FOV) [mm²]	$250 \times 170$	$250 \times 170$	$250 \times 170$
Slice thickness [mm]	5	5	5
Contrast agent flow rate [mL/s]	2	-	-
Maximum dose [ml]	10	-	-

Table 3. Overall model’s accuracy and the associated ablation studies using various metrics. Ac: accuracy; Sn: sensitivity; Sp: specificity; CNN: convolutional neural network; and ViT: vision transformer.

	Evaluation Metrics, (%)
Method	Ac	Se	Sp
Hand-crafted Features (HcF)	84.05	82.99	94.11
CNN-derived Features (CnF)	79.20	78.52	92.21
ViT-derived Features (ViF)	92.77	92.54	97.50
CnF+HcF	93.74	93.21	97.88
CnF+ViF	88.86	88.08	96.01
ViF+HcF	94.66	94.24	98.15
Proposed (public dataset)	99.19	99.52	99.53
Proposed (local dataset)	99.38	99.46	99.50

Table 4. Comparison of the classification accuracy with state-of-the-art models.

		Evaluation Metrics, (%)
Method	Dataset	Ac	Se	Sp
Rathi and Palani [11]	[17]	83.00	88.00	80.00
Kumar et al. [12]	[17]	95.23	–	–
Ismael and Abdel-Qader [13]	[17]	91.9	–	–
Deepak and Ameer [18]	[17]	97.12	–	–
Swati et al. [19]	[17]	94.82	93.00	94.60
Gumaei et al. [20]	[17]	94.23	–	–
Anaraki et al. [21]	[17,42,43,44,45]	94.20	–	–
Sahoo et al. [26]	[17]	98.85	–	–
Chaki et al. [28]	[46]	97.5	-	-
Arumugam et al. [29]	[17,46,47]	98.56	98.64	98.45

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Dixon, J.; Akinniyi, O.; Abdelhamid, A.; Saleh, G.A.; Rahman, M.M.; Khalifa, F. A Hybrid Learning-Architecture for Improved Brain Tumor Recognition. Algorithms 2024, 17, 221. https://doi.org/10.3390/a17060221

AMA Style

Dixon J, Akinniyi O, Abdelhamid A, Saleh GA, Rahman MM, Khalifa F. A Hybrid Learning-Architecture for Improved Brain Tumor Recognition. Algorithms. 2024; 17(6):221. https://doi.org/10.3390/a17060221

Chicago/Turabian Style

Dixon, Jose, Oluwatunmise Akinniyi, Abeer Abdelhamid, Gehad A. Saleh, Md Mahmudur Rahman, and Fahmi Khalifa. 2024. "A Hybrid Learning-Architecture for Improved Brain Tumor Recognition" Algorithms 17, no. 6: 221. https://doi.org/10.3390/a17060221

APA Style

Dixon, J., Akinniyi, O., Abdelhamid, A., Saleh, G. A., Rahman, M. M., & Khalifa, F. (2024). A Hybrid Learning-Architecture for Improved Brain Tumor Recognition. Algorithms, 17(6), 221. https://doi.org/10.3390/a17060221

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

A Hybrid Learning-Architecture for Improved Brain Tumor Recognition

Abstract

1. Introduction

2. Methodology

2.1. Datasets

2.2. Data Preprocessing

2.3. Features Extraction

3. Weighted Ensemble Classification

4. Experimental Results

5. Discussion

6. Conclusions and Future Work

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI