Article

VSA-GCNN: Attention Guided Graph Neural Networks for Brain Tumor Segmentation and Classification

by
Kambham Pratap Joshi
1,
Vishruth Boraiah Gowda
2,
Parameshachari Bidare Divakarachari
3,*,
Paramesh Siddappa Parameshwarappa
4 and
Raj Kumar Patra
5
1
Department of CSE- Data Science/Cyber Security, MLR Institute of Technology, Hyderabad 500043, India
2
Department of Information Science and Engineering, SJB Institute of Technology, Affiliated to Visvesvaraya Technological University, Bangalore 560060, India
3
Department of Electronics and Communication Engineering, Nitte Meenakshi Institute of Technology, Bengaluru 560064, India
4
Department of Computer Science & Engineering, School of Engineering, Central University of Karnataka, Kalaburagi 585367, India
5
Computer Science and Engineering, CMR Technical Campus, Hyderabad 501401, India
*
Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2025, 9(2), 29; https://doi.org/10.3390/bdcc9020029
Submission received: 13 November 2024 / Revised: 8 January 2025 / Accepted: 12 January 2025 / Published: 31 January 2025

Abstract
For the past few decades, brain tumors have had a substantial influence on human life, and they pose severe health risks if not diagnosed and treated in the early stages. Brain tumors are highly diverse and vary extensively in size, type, and location. This diversity makes it challenging to develop an accurate and reliable diagnostic tool. Several further developments are still required to segment and classify the tumor region effectively and reach an accurate diagnosis. Thus, the purpose of this research is to accurately segment and classify brain tumor Magnetic Resonance Images (MRI) to enhance diagnosis. Primarily, the images are collected from the BraTS 2019, 2020, and 2021 datasets and pre-processed using min–max normalization to eliminate noise. The pre-processed images are then passed to the segmentation stage, where a Variational Spatial Attention with Graph Convolutional Neural Network (VSA-GCNN) is applied to handle the variations in tumor shape, size, and location. The segmented outputs are then passed to feature extraction, where an AlexNet model is used to reduce the dimensionality. Finally, in the classification stage, a Bidirectional Gated Recurrent Unit (Bi-GRU) is employed to classify the brain tumor regions as gliomas and meningiomas. From the results, it is evident that the proposed VSA-GCNN-BiGRU shows superior results on the BraTS 2019 dataset in terms of accuracy (99.98%), sensitivity (99.92%), and specificity (99.91%) when compared with existing models. On the BraTS 2020 dataset, the proposed VSA-GCNN-BiGRU shows superior results in terms of Dice similarity coefficient (0.4), sensitivity (97.7%), accuracy (98.2%), and specificity (97.4%). When evaluated on the BraTS 2021 dataset, the proposed VSA-GCNN-BiGRU achieved specificity of 97.6%, Dice similarity of 98.6%, sensitivity of 99.4%, and accuracy of 99.8%. Overall, the proposed VSA-GCNN-BiGRU supports accurate brain tumor segmentation and classification, providing clinical significance in MRI when compared with existing models.

1. Introduction

Nowadays, the early detection of brain tumors is essential to support doctors in treating cancer patients and maximizing their lifespan [1,2,3]. Technological advancements and machine learning (ML) can assist neurosurgeons in tumor diagnosis without requiring expensive surgical procedures [4,5,6]. Medical imaging comprises the methods and procedures used to produce visual depictions of the body's interior for clinical analysis and medical intervention, as well as to show how specific organs or tissues function. It is also used to recognize and diagnose diseases in structures enclosed by skin and bone [7,8,9]. Medical imaging also builds a database of typical anatomy and physiology against which abnormalities can be found. A brain scan is a diagnostic procedure in which images of the brain are taken. Just as an X-ray can detect a fractured bone inside the body, a brain scan can detect a tumor inside the skull [10,11]. MRI and computed tomography (CT) scans are the two scans used most frequently to identify brain tumors. A CT scan is frequently used to identify the presence of a tumor, and an MRI scan is frequently used to obtain more specific information about its size, location, and potential type [12,13].
Brain tumor segmentation (BTS) from MRI scans is a complex and time-consuming process because of variable patient features, which is a major concern in early detection and diagnosis. Hence, for effective detection and identification of brain tumors, many ML-based studies on classifying brain tumors from medical images have been conducted [14,15,16]. Pre-processing involves enhancement, segmentation, and feature selection; post-processing involves identification and classification. Manual segmentation of brain tumors, on the other hand, requires clinical professionals to correctly locate the tumor types [17]. The manual segmentation procedure is labor-intensive, depends on physician experience, and is time-consuming. As a result, automatic computer-based segmentation models have been created, saving the surgeon time while providing trustworthy and precise segmentation findings. These automatic models lessen the effort clinicians require to complete the diagnosis of disease mechanisms [18,19]. Several machine learning algorithms are used to distinguish between healthy and unhealthy brain tissues in MRI scans. However, selecting fully automated characteristics is difficult and necessitates a combination of medical competence and computer engineering [20]. Even with major advancements in ML-based models, brain tumors remain difficult to segment and identify with existing methods, particularly when irregularly shaped tumors show complex morphological features. Most conventional models struggle to represent the complex spatial interactions among tumor locations owing to heterogeneity, image acquisition, and resolution. These spatial interactions make it challenging for conventional models to accurately segment and classify brain tumors. Moreover, several models concentrate only on either segmentation or classification, ignoring how crucial it is to combine the two tasks for detailed tumor analysis.
The following research gaps are targeted by this work:
  • inadequate handling of complex tumor shapes;
  • limited integration of segmentation and classification tasks;
  • inadequate assessment of robustness and generalizability on diverse brain tumor datasets.
The research question addressed by this work is as follows:
  • What are the key factors contributing to the effectiveness of the proposed VSA-GCNN-BiGRU?
To address these gaps, the major objective of this research is to provide effective segmentation and classification of brain tumors using the VSA-GCNN-BiGRU method. In contrast to earlier approaches, the proposed method combines a VSA-GCNN and a Bi-GRU to efficiently model local and global context interactions between regions. Moreover, the proposed method manages small tumors and class imbalance issues more successfully than existing state-of-the-art methods. Overall, it offers a novel and efficient strategy for classifying and segmenting brain tumors, with significant benefits over existing models in accuracy and computational time. The major contributions of this manuscript are given below:
  • The VSA-GCNN model includes a morphological gradient function and a Dice loss function for segmenting the tumor type in the MRI images and reducing feature information loss in the max pooling layer.
  • Effective segmentation is achieved by extracting active feature vectors with a convolutional architecture, AlexNet, owing to its ability to identify tumor changes in the brain across different frames efficiently.
  • The VSA-GCNN model depth is increased in this manuscript for obtaining active semantic feature vectors that help in improving the BTS results by adding more spatial attention to the tumor regions in the input MRI.
This manuscript is organized as follows: articles related to BTS are reviewed in Section 2. The mathematical formulation and simulation outcomes of the developed model are stated in Section 3 and Section 4, respectively. The discussion, findings, and future extensions of this manuscript are presented in Sections 5 and 6.

2. Literature Review

An automatic brain tumor segmentation method was developed by Adham Aleid et al. [21] for the early detection of tumors using MRI images from the BraTS 2017 dataset. This method was developed to overcome the high computational complexity, expensive infrastructure, and small training databases of existing methods. It has the advantages of noise suppression, better execution time, and lower computational complexity. Even though the system achieved average accuracy in detecting the tumor region, it failed to perform precise pixel-level segmentation on the determined region of interest.
A DNN-based automatic brain tumor segmentation using 3D attention U-Net was suggested by Ajay S. Ladkat et al. [22] using MRI images from the BraTS 2019 dataset. Each pixel of the 3D image was enhanced and segmented with the attention mechanism of the 3D U-Net. The per-pixel accuracy of the system was better owing to the ability of U-Net to handle pixel-level segmentation. However, since the developed 3D attention U-Net has more parameters due to its skip connections, the suggested system is prone to overfitting.
Khiet Dang et al. [23] developed a DL-based brain tumor segmentation framework that integrated pre-processing methods for the MRI images of BraTS 2019 to perform segmentation with U-Net and classification with VGG and GoogleNet. As VGG and GoogleNet are complex architectures, the framework achieved better tumor classification results with these classification techniques. However, the suggested U-Net model with VGG and GoogleNet failed to achieve precise detection of the core tumor in the BraTS dataset due to overfitting of the model.
Agus Subhan Akbar et al. [24] suggested a U-Net architecture with an attention mechanism combined with residual paths and skip connections, known as the Multipath Residual Attention Block. This ensemble model was evaluated on all three BraTS datasets and achieved better tumor segmentation results owing to the pixel-level segmentation of U-Net. However, the suggested U-Net model has a poor Dice value on complex regions of tumor sub-types and faces difficulty in identifying severity.
A CNN-based transformer parallel network was developed by Yu Chen et al. [25] for the multimodal segmentation of brain tumors. This method of processing contextual information in a CNN was developed to overcome the limitations of long-distance information transfer. With the use of an encoder and decoder, the limitations in acquiring local information were addressed, and the CNN-based parallel network produced better segmentation results. However, the suggested approach falls short in processing various complex regions of brain tumor images, and its segmentation performance diminishes when the core tumor pixel region overlaps with other regions.
Metlek and Çetiner [26] suggested a novel shuffled-YOLO network for the BraTS 2019 dataset. The segmentation features were enhanced using multi-scale data extracted from the network backbone. For instance segmentation, the enriched feature maps were scaled using anchor boxes from YOLO. Taking the feature maps as inputs reduced the computational cost and enhanced the learning capability of the model. However, the accuracy of brain tumor segmentation is low with this approach.
An advanced CAD system for the automatic detection of brain tumors was developed by T. S. Sheela Shiney [27] using Modified Hierarchical K-means with Firefly Clustering (MHKFC) to improve the quality of image segmentation on the BraTS 2021 dataset. It showed the advantage of locating brain nodules precisely and was cost-effective to implement in real-time medical applications. However, the suggested MHKFC model has difficulty classifying brain tumor types precisely and requires more characteristic features to distinguish the tumor types effectively.
An automated BTS was suggested by Mahmoud Elmezain et al. [28] using a deep capsule network (CapsNet) and latent-dynamic conditional random fields (LDCRF) on the BraTS 2021 dataset. To address the unpredictable variations in tumor size and shape, CapsNet and LDCRF were combined, which produced better segmentation results. However, the CapsNet model is more complex and must derive many features to achieve better classification accuracy, and identifying such characteristics in a typical brain tumor MRI scan is difficult.
N. Nagarani et al. [29] demonstrated the classification of brain tumors on MRI using a generative adversarial network called SPGAN-MSOA-CBT MRI. Initially, the data were collected from the BraTS 2019 dataset and pre-processed, which lessened noise and improved the quality of the input images. This model offered an advantage in classifying regular and irregular areas of brain tumors for the early detection and diagnosis of brain disease. However, the suggested model was not evaluated across multiple datasets.
Ayesha Jabbar et al. [30] illustrated the detection of brain tumors by means of a hybrid Caps-VGGNet model. The efficiency of the suggested algorithm was evaluated using the BraTS 2019 dataset, which contains high-quality brain tumor images. The presented model addressed the need for large datasets by dynamically extracting and classifying features. However, the illustrated model has limited interpretability due to the complexity of the hybrid model. Additionally, the Caps-VGGNet model requires careful initialization of the capsule networks while training on the data.
Snehal Rajput et al. [31] presented brain tumor segmentation with a triplanar ensemble model on volumetric multiparametric magnetic resonance images using the BraTS 2020 dataset. The presented method involved an optimized triplanar ensemble model using channel and spatial attention mechanisms, with multiple orthogonal planar views to forecast segmentation labels, and achieved better accuracy with fewer constraints. However, the optimized triplanar model lacks the ability to fine-tune the filter size for optimality without increasing the network complexity, and it achieved only a competitive Dice score compared with the coronal planar models.
Saima Esmaeilzadeh Asl et al. [32] suggested hybrid filtering with a U-Net architecture for brain tumor segmentation in multimodal MRI volumes using the BraTS 2020 dataset. The suggested U-Net method performs automated brain tumor segmentation with 3D filters and images. The U-Net model utilized semantic segmentation and pre-processing by a hybrid filter comprising a bilateral filter and a black top-hat transform. This method exhibited better tumor segmentation with lower computational memory usage and was more efficient for all tumor images. However, the U-Net architecture struggles to segment gliomas accurately, since gliomas proliferate rapidly, vary in shape, and shrink under radiation therapy.
Dongwei Liu et al. [33] proposed an SGEResU-Net method for segmentation using the BraTS 2021 dataset. The suggested SGEResU-Net embeds residual blocks and spatial group-wise enhance (SGE) attention blocks into a single 3D U-Net architecture. The proposed SGEResU-Net achieved excellent DSC values on the BraTS 2021 dataset, indicating its effectiveness in precise tumor segmentation. Nevertheless, automatic segmentation of brain tumors remains difficult because their location, size, and shape vary, which affected the model's consistency across cases. Moreover, the model's complexity, with embedded residual and attention blocks, requires significant expertise and computational resources for implementation and maintenance.
From the related works, the common limitations observed are poor segmentation results on pixels in areas of interest, long processing times, implementation complexity, poor segmented image quality, average accuracy, and unsuitability for real-time applications. To overcome these limitations, a VSA-GCNN mechanism is proposed in this research for the effective segmentation of brain tumors, as discussed in the following sections.

3. Methodology

In the proposed methodology, VSA-GCNN is proposed to perform effective segmentation of brain tumors. The BraTS 2019, BraTS 2020, and BraTS 2021 imaging datasets are utilized for the proposed method. Min–max normalization is applied to the images in the pre-processing stage, and the convolutional layers of the AlexNet architecture are used to extract features from the MRI images. VSA-GCNN and Bi-GRU are employed for efficient segmentation and classification, respectively. The overall diagram of the proposed study is given in Figure 1.

3.1. Database Description

The data are collected from the MRIs of benchmark datasets: BraTS 2019 (https://www.kaggle.com/datasets/aryashah2k/brain-tumor-segmentation-brats-2019, accessed on 1 June 2024), BraTS 2020 (https://www.kaggle.com/datasets/awsaf49/brats20-dataset-training-validation, accessed on 1 June 2024), and BraTS 2021 (https://www.kaggle.com/datasets/dschettler8845/brats-2021-task1/code, accessed on 1 June 2024), which share the same structure but differ in their training, testing, and validation splits. These datasets consist of 3D MRI scans with precise brain tumor types and the same modalities: T1, T1-contrast-enhanced (T1ce), T2, and FLAIR. BraTS 2021 consists of 1251 training, 570 testing, and 219 validation scans. Sample MRI scans from BraTS 2021 are depicted in Figure 2.
Modalities of the BraTS 2020 database are equipped with manual segmentations of lesions; the database also contains data from 125 subjects without manual ground truth, which can be used for comparison. Example MRI scans from the BraTS 2020 database are shown in Figure 3.
The BraTS 2019 data include 259 higher-grade and 76 lower-grade glioma images; the annotation protocol relies on ground truth created manually by expert neuroradiologists. The MRI scans in the database belong to four modalities, namely T1, T2, T1-CE, and T2-FLAIR, and example scans are given in Figure 4.

3.2. Data Pre-Processing

Min–max normalization and histogram equalization are applied to the tumor images in this study. Min–max rescaling is based on the minimum and maximum values of the un-normalized data. The lower and upper bounds are defined linearly by the min–max normalization approach, and the data are rescaled to the range 0 to 1 or −1 to 1. Equation (1) shows how the min–max normalization is calculated.
$$x'_{i,n} = \frac{x_{i,n} - \min(x_i)}{\max(x_i) - \min(x_i)} \left( n_{\max} - n_{\min} \right) + n_{\min} \quad (1)$$
where $\min(x_i)$ and $\max(x_i)$ signify the minimum and maximum values of the $i$th characteristic, respectively, and $n_{\min}$ and $n_{\max}$ are the lower and upper bounds of the target range [37]. This rescaling suppresses pixel values that are irrelevant to the determined segmentation region while preserving the relative order of, and distances between, the data points. After normalizing the image, histogram equalization is applied to eliminate noise from the normalized images.
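As a concrete sketch of Equation (1), the following NumPy function rescales an image; the function name and the guard against constant images are our own illustrative additions:

```python
import numpy as np

def min_max_normalize(x, n_min=0.0, n_max=1.0):
    """Rescale an MRI slice/volume to [n_min, n_max], per Equation (1)."""
    x = x.astype(np.float32)
    x_lo, x_hi = x.min(), x.max()
    if x_hi == x_lo:                          # guard: constant image (our addition)
        return np.full_like(x, n_min)
    scaled = (x - x_lo) / (x_hi - x_lo)       # map to [0, 1]
    return scaled * (n_max - n_min) + n_min   # stretch to [n_min, n_max]
```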
Histogram equalization is an image enhancement method that shifts the probability distribution of intensities towards a uniform distribution. The normalized histogram of an $M \times N$ image is defined as given in Equation (2).
$$NHE_i = \frac{h_i}{\sum_{i=0}^{L-1} h_i} = \frac{h_i}{M \times N} \quad (2)$$
where $h_i$ is the statistical histogram count at the $i$th intensity level of the normalized image $x'_{i,n}$. The transformation function $T(l)$ is computed as shown in Equations (3) and (4).
$$T(l) = (L - 1) \sum_{k=0}^{l} NHE_k \quad (3)$$
$$T(l) = \frac{L - 1}{M \times N} \sum_{k=0}^{l} h_k \quad (4)$$
where $l = 0, 1, 2, \ldots, L-1$ denotes the $l$th intensity level. The enhanced image is obtained by mapping the input image's intensity levels through the transformation function [38]. Histogram equalization typically increases the global contrast of the processed image, making it easier to distinguish the brain tumor classes and to segment the tumor from the non-tumor region.
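Equations (2)–(4) amount to mapping each pixel through the scaled cumulative histogram; a minimal NumPy sketch for an 8-bit grayscale slice is given below (for uint8 inputs, OpenCV's cv2.equalizeHist performs the equivalent operation):

```python
import numpy as np

def histogram_equalize(img, L=256):
    """Histogram equalization per Equations (2)-(4); img is a uint8 grayscale array."""
    h, _ = np.histogram(img.ravel(), bins=L, range=(0, L))  # statistical histogram h_k
    nhe = h / img.size                            # normalized histogram, Equation (2)
    cdf = np.cumsum(nhe)                          # cumulative sum over intensity levels
    T = np.round((L - 1) * cdf).astype(np.uint8)  # transformation T(l), Equations (3)-(4)
    return T[img]                                 # map every pixel through T
```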

3.3. Brain Tumor Segmentation

After collecting the MRI brain scans, the VSA-GCNN model is proposed for brain tumor segmentation, which effectively handles variations in tumor appearance and location. Initially, the collected MRI brain scans are resized to $64 \times 64$. The image is segmented using VSA-GCNN, a two-branch cascade model that segments and predicts the given input image with the help of an encoder and decoder architecture. The proposed VSA-GCNN uses a ResNet for the lower and upper branches at the encoding stage to attain low-level semantics and deepen the high-level semantic depth. Enhanced semantic analysis using the VSA block is implemented in the upper-branch procedure, which includes dense feature data. To compensate for missing information in the high-level extracted features and attain better semantic depth analysis in the upper branch, an attention module is used at the corresponding location in the lower-level branch, and the relevant receptive field is expanded. As a result, more spatial regions of the image can be captured via variational convolution layers at multiple scales. Each layer of the encoding stage contains skip connections that pass image spatial data from one encoding layer to the corresponding decoding layer, enhancing the MRI information density. In the upper branch, feature-fusion blocks combine the different features, which enhances the encoder's attention to the spatial regions of the relevant receptive field in the MRI images. In the lower branch, the long-range dependencies between feature information are strengthened by strip convolution, and image spatial data at the same level are fused to attain the predicted result.

3.3.1. Variational Spatial Attention Mechanism

Variational spatial attention is used to increase the attention weights of the pixels representing the brain tumor in the MRI images and to create the input to the GCNN. To create $X_0$, the input $X$, an array of normalized histogram images $NHE$, is dimension-reduced with a $1 \times 1$ convolution kernel, and three-way branch processing is applied to $X_0$ from top to bottom in the network. The size of the MRI image spatial map is $W \times H$. The feature maps are partitioned into multiple patches, and each patch is passed to the attention block to calculate pixel-level relevance. To improve network performance, $In_1$ and $In_2$ are obtained by convolutions that reduce the dimension of the input features $In$ in the attention block. The size of each spatial data map in the upper processing layer is $w \times h$. The pixel-level correlation is then measured with a Gaussian function to produce the result $e$, as in Equation (5). To create the output $Out$, $e$ is finally combined with the original input spatial data $In$, as shown in Equations (5) and (6).
$$e_i = \frac{gauss\left( In_1 \cdot In_2 \right)}{\sum_{i=1}^{N} gauss\left( In_1 \cdot In_2 \right)} \quad (5)$$
where $N$ is the number of patches and $gauss$ is the Gaussian function.
$$Out = \sum_{i=1}^{N} e_i + In \quad (6)$$
The correlation between pixels enhances the network's performance in the higher-layer processing. The dimensionality-reduction processing is motivated by concepts from topology; it increases the processing speed of the task's critical path and enhances algorithmic efficiency. The attention block in the VSA, which completes the multi-scale extraction of feature information for high-level semantics, is depicted in Figure 5.
In MRI imaging, the segmentation target and its surrounding pixels usually resemble one another. As a result, the suggested approach collects the locations related to the target area in the middle-layer processing and then operates with the local spatial feature block after passing through the attention block, as in Equation (7).
$$O = \delta\left( P_1 G_1, P_2 G_2, P_3 G_3, \ldots, P_S G_S \right) \quad (7)$$
where $\delta$ is the activation function and $P_1, P_2, P_3, \ldots, P_S$ determine the segmentation target. The proposed method performs element-wise multiplication of $P$ and $G$ and combines $O$ and $X_i$ at the end of each module, as in Equation (8).
$$X'_i = O + X_i \quad (8)$$
The patches of the spatial data maps are altered by adjusting the cut-off ratio $s$. Layers of features extracted at various scales are built into the module, processing the input spatial data $X$ concurrently in order to acquire denser spatial information. To produce the final $Output$, as shown in Equation (9), the outputs of the layers are fused at the channel level using a $1 \times 1$ convolution [39].
$$Output = I\left( X', X_1, \ldots, X_n \right) \quad (9)$$
where $I(\cdot)$ stands for the $1 \times 1$ convolution.
The attention-weighted spatial data, which represent the features of the image, are then passed to the GCNN for segmentation, and the $Output$ is taken as the input $x$ to the GCNN.
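To make the weighting concrete, the following NumPy sketch implements the computation of Equations (5) and (6); the text does not fully specify the tensor shapes, so the patch layout and the toy projections here are our own interpretation:

```python
import numpy as np

def gauss(x, sigma=1.0):
    """Gaussian scoring function used for pixel-level correlation."""
    return np.exp(-(x ** 2) / (2.0 * sigma ** 2))

def vsa_attention(In, In1, In2):
    """Sketch of Equations (5) and (6).
    In:  original patch features, shape (N, d)
    In1, In2: dimension-reduced projections of In, shape (N, d)."""
    corr = np.sum(In1 * In2, axis=1)   # correlation In1 . In2 for each of the N patches
    g = gauss(corr)
    e = g / g.sum()                    # Equation (5): normalized attention weights e_i
    Out = e[:, None] * In + In         # Equation (6)-style residual fusion with In
    return Out

# example: 16 patches with 32-dimensional features
rng = np.random.default_rng(0)
In = rng.normal(size=(16, 32))
Out = vsa_attention(In, In * 0.5, In * 0.25)  # toy projections for illustration
```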

3.3.2. VSA-GCNN

The weighted attention-based features are given as input to the GCNN, which comprises an expanded up-sampling path for deep feature maps and considerably enhances medical image segmentation performance. The proposed approach utilizes the GCNN to perform image segmentation under certain constraints imposed on the Region of Interest (ROI). Mainly, the GCNN is applied to the image Region Adjacency Graph (RAG) to predict region labels for sample nodes representing the resulting regions, since it effectively captures the complex relationships among the regions. The GCNN is then trained in a semi-supervised learning framework to accomplish node-level classification. Region constraints can be supplied as input in many ways; in this study, they are given as scribbles that signify the rough location of the ROI and represent its properties. Graph nodes covered by scribbles are treated as labeled nodes (L), with region labels assigned according to each scribble's unique color, while nodes not covered by scribbles remain unlabeled. The GCNN predicts their region labels; a general overview of the proposed approach is given in Figure 6. The proposed approach segments several ROIs at a time, given seeds for each.
The resulting GCNN structure stacks two convolutional layers, which are expressed as Equation (10):
$$H_p = \sigma\left( H_{p-1} \ast_G g \right) \quad (10)$$
where $H_p \in \mathbb{R}^{N \times n_p}$ is the feature matrix of the $p$th layer; $g$ is the node-aggregating function; $n_p$ is the number of feature maps, with $H_0 = X$; $\ast_G$ is the graph convolution operator; and $\sigma(\cdot)$ is the activation function. The GCNN layers are separated by a dropout layer, which is exploited to minimize overfitting. The number of input channels to the first layer ($H_0 \in \mathbb{R}^{N \times C}$) is $C$, and the number of output channels is 16. The ReLU activation $f(H) = \max(0, H)$ is applied to the first layer's output. The second layer $H_1 \in \mathbb{R}^{N \times 16}$ then takes 16 channels as input and outputs $N_C$ channels, where $N_C$ is the number of labels in the segmented image. The second convolutional layer's output is subjected to softmax activation to determine the likelihood that node $v_i$ belongs to each of the $N_C$ regions $R_i$. Finally, $v_i$ is assigned the label of the region with the highest probability. Both spectral and spatial alternatives were used to test the GCNN design; they differ in how convolution on a graph is computed in the network's convolutional layers. The most commonly used graph convolution operators illustrating the spectral and spatial approaches are considered and evaluated side by side.
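A minimal NumPy sketch of this two-layer node classification is given below; it assumes the widely used Kipf-style spectral propagation rule, one of the operator choices the paragraph mentions, and hypothetical weight matrices W0 and W1:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def two_layer_gcnn(A, X, W0, W1):
    """Two-layer node classification over the region adjacency graph (Equation (10)),
    assuming the Kipf-style spectral rule H' = act(A_hat @ H @ W).
    A: RAG adjacency (N, N); X = H_0 (N, C); W0: (C, 16); W1: (16, N_C)."""
    A_tilde = A + np.eye(A.shape[0])           # add self-loops
    d = A_tilde.sum(axis=1)
    A_hat = A_tilde / np.sqrt(np.outer(d, d))  # symmetric degree normalization
    H1 = relu(A_hat @ X @ W0)                  # first layer: C -> 16 channels
    P = softmax(A_hat @ H1 @ W1)               # second layer: 16 -> N_C region labels
    return P.argmax(axis=1)                    # each node gets its likeliest region
```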

3.3.3. Loss Function

Eventually, the image segmentation problem transforms into a classification task for each pixel in the image. Compared with natural image segmentation, medical image segmentation is rather straightforward and primarily a two-class problem: it merely needs to separate target pixels from background pixels. However, the proportion of target pixels in the overall image may be significantly lower than that of background pixels, resulting in a class imbalance issue. Cross-entropy is the most commonly used loss function, yet it may not be the ideal choice in class imbalance situations. Because it is agnostic to the number of foreground or background pixels, Dice loss can help reduce the class imbalance problem; however, Dice loss can have a negative impact on back-propagation and render training unstable. The combined loss function is therefore calculated as given in Equation (11).
$$L(p, g) = L_{BE}(p, g) + L_{dice}(p, g) \quad (11)$$
where $p$ refers to the predicted image and $g$ to the ground truth; $L_{BE}$ is the binary cross-entropy loss and $L_{dice}$ is the Dice loss. The VSA mechanism is employed to focus selectively on relevant areas of the input image. This is accomplished by computing the attention weights, represented as Equation (12):
$$\alpha = \sigma\left( W_c \cdot X + b \right) \quad (12)$$
where $X$ is the input feature map, $\sigma$ represents the sigmoid function, and $W_c$ and $b$ are the learnable weights and biases. By analyzing the attention weights, we can show that the VSA-GCNN method effectively focuses on the tumor regions, leading to improved segmentation performance.
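A minimal TensorFlow sketch of the combined loss in Equation (11), assuming a binary tumor/background probability map; the smoothing constant eps is our own numerical safeguard:

```python
import tensorflow as tf

def dice_loss(p, g, eps=1e-6):
    """Soft Dice loss; eps guards against empty masks (our addition)."""
    inter = tf.reduce_sum(p * g)
    return 1.0 - (2.0 * inter + eps) / (tf.reduce_sum(p) + tf.reduce_sum(g) + eps)

def combined_loss(g, p):
    """Equation (11): binary cross-entropy plus Dice loss.
    g: binary ground-truth mask, p: predicted tumor-probability map."""
    bce = tf.reduce_mean(tf.keras.losses.binary_crossentropy(g, p))
    return bce + dice_loss(p, g)
```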

3.4. Feature Extraction

The segmented image is fed to AlexNet to perform feature extraction, a process that reduces the dimensionality of the processed data. It also involves compressing and dividing the raw, unprocessed data into smaller and simpler groups. The primary characteristic of massive data is the large number of distinct variables it contains. Feature extraction helps extract the best features from this mass of data by selecting and merging distinct variables into features, thereby minimizing the data size effectively. While remaining easy to use, these features precisely and distinctly describe the actual dataset. In this research, the AlexNet fully connected layers and the convolution operations are applied to obtain the characteristics from these layers for the segmented image $y(x)$ of size $240 \times 240 \times 3$. The structure of AlexNet, with its convolutional and fully connected layers, is depicted in Figure 7.
The AlexNet structure contains five convolutional layers, three max-pooling layers, two normalization layers, two fully connected (FC) layers, and one SoftMax layer. Dropout is used in the first two FC layers to avoid overfitting, and max-pooling is applied after the 1st, 2nd, and 5th convolutional layers to take the highest value in each patch and produce a down-sampled map. Kernel maps in a layer that run on the same GPU are linked to the following max-pooling layers [40,41]. The neurons in the fully connected layers are connected to all neurons in the layer below. Each convolution layer consists of a convolution filter and the non-linear activation function ReLU. The input size is fixed because of the fully connected layers, and max-pooling is performed by the pooling layers. The $240 \times 240$ pixel input is processed by moving $5 \times 5$ and $3 \times 3$ pixel filters over the image in the convolutional layers. AlexNet employs Rectified Linear Units (ReLU) for their advantage in training speed: a CNN using ReLU reaches a low error rate in less time. The entire AlexNet architecture can be expressed mathematically as shown in Equation (13).
$$X_{features} = \left[ CNN \rightarrow RN \rightarrow MP \right] \times 2 \rightarrow \left[ CNN \right] \times 3 \rightarrow MP \rightarrow \left[ FC \rightarrow DO \right] \times 2 \rightarrow Linear \rightarrow softmax \quad (13)$$
where $CNN$ is a convolutional layer with ReLU activation, $RN$ is local response normalization, $MP$ is max-pooling, $FC$ is a fully connected layer, $DO$ is dropout, and $Linear$ is a fully connected layer without an activation function. Activation maps with more efficient characteristics are produced and passed to the following layer; their most significant property is that they are distinctive. Without affecting the image's features, the pooling layers are exploited to minimize the image's size and computational cost, and a $1000 \times 1$ feature vector is extracted from AlexNet.
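For concreteness, the Keras sketch below follows the layer sequence of Equation (13); the filter counts and the 11/5/3 kernel sizes follow the classic AlexNet configuration rather than values stated in the text, so they should be read as illustrative assumptions. The feature_extractor model returns the pre-softmax 1000 × 1 vector used downstream:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def lrn(x):
    """Local response normalization (the RN step in Equation (13))."""
    return tf.nn.local_response_normalization(x)

def build_alexnet(input_shape=(240, 240, 3), feature_len=1000):
    """AlexNet-style extractor following Equation (13):
    [CNN->RN->MP] x 2 -> CNN x 3 -> MP -> [FC->DO] x 2 -> Linear -> softmax."""
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(96, 11, strides=4, activation="relu")(inp)
    x = layers.Lambda(lrn)(x)
    x = layers.MaxPooling2D(3, strides=2)(x)
    x = layers.Conv2D(256, 5, padding="same", activation="relu")(x)
    x = layers.Lambda(lrn)(x)
    x = layers.MaxPooling2D(3, strides=2)(x)
    x = layers.Conv2D(384, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(384, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(3, strides=2)(x)
    x = layers.Flatten()(x)
    x = layers.Dropout(0.5)(layers.Dense(4096, activation="relu")(x))
    x = layers.Dropout(0.5)(layers.Dense(4096, activation="relu")(x))
    feats = layers.Dense(feature_len)(x)   # Linear: the 1000 x 1 feature vector
    probs = layers.Softmax()(feats)
    return models.Model(inp, feats), models.Model(inp, probs)

feature_extractor, classifier = build_alexnet()
```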

3.5. Classification

The extracted features are then fed to a Bi-GRU for classification, which is explained briefly here. The GRU is appropriate for processing sequential data and was proposed as a solution to the problems of numerous parameters and lengthy training times. It also memorizes the information of prior nodes through the use of gates, which addresses the vanishing gradient issue. The GRU has fewer parameters than the LSTM because it has only two gates, the update gate and the reset gate, yet it accomplishes the same goal in less training time. Figure 8 displays the structural diagram of the GRU model.
Here, $x_t$ and $h_t$ refer to the input and the output; $r_t$ and $z_t$ represent the reset gate and the update gate. $z_t$ and $r_t$ jointly regulate the computation from the hidden state $h_{t-1}$ to $h_t$. $z_t$ handles both $x_t$ and the earlier $h_{t-1}$, and its range lies between 0 and 1; $z_t$ defines how much of $h_{t-1}$ is transferred to the following state. The detailed gate units are given in Equations (14)–(17):
$$z_t = \sigma\left( w_z \times \left[ h_{t-1}, x_t \right] \right) \quad (14)$$
$$r_t = \sigma\left( w_r \times \left[ h_{t-1}, x_t \right] \right) \quad (15)$$
$$\tilde{h}_t = \tanh\left( W \times \left[ r_t \times h_{t-1}, x_t \right] \right) \quad (16)$$
$$h_t = \left( 1 - z_t \right) \times h_{t-1} + z_t \times \tilde{h}_t \quad (17)$$
The sigmoid function $\sigma$ acts as a gate-control signal by transforming the data into values between 0 and 1. The GRU is a one-way network topology: states are transmitted from front to rear, yet the output state at any given time is connected to both the previous and the following states. A bidirectional GRU is therefore required to resolve this issue. The BiGRU model extracts information in both directions, and the final output is $h_t = \left[ \overrightarrow{h}_t, \overleftarrow{h}_t \right]$, where $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ denote the forward and backward GRU outputs at time $t$. It likewise offers a rapid response time with low complexity. Therefore, the hidden state contains the complete information of the input sequence and classifies the stages of the brain tumor.
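A minimal Keras sketch of the Bi-GRU classification head is given below; reshaping the 1000-dimensional AlexNet feature vector into ten pseudo-timesteps and the hidden size of 64 are our own illustrative choices:

```python
from tensorflow.keras import layers, models

def build_bigru_classifier(feature_len=1000, n_classes=2):
    """Bi-GRU head over the AlexNet feature vector (illustrative sketch)."""
    inp = layers.Input(shape=(feature_len,))
    seq = layers.Reshape((10, feature_len // 10))(inp)       # (timesteps, features)
    h = layers.Bidirectional(layers.GRU(64))(seq)            # forward + backward states
    out = layers.Dense(n_classes, activation="softmax")(h)   # e.g., glioma vs. meningioma
    return models.Model(inp, out)

model = build_bigru_classifier()
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```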

4. Results and Discussion

In this scenario, the VSA-GCNN model's effectiveness is validated using Python 3.7 in the Anaconda Navigator 3.5.2.0 (64-bit) software environment. Additionally, the VSA-GCNN evaluation is carried out on a system with a 4 TB hard disk, 128 GB of RAM, a Windows 10 (64-bit) OS, and an i9 processor. Libraries such as TensorFlow, Keras, NumPy, and OpenCV are used to investigate the VSA-GCNN model's effectiveness. According to the protocols of the BraTS 2019, 2020, and 2021 databases, the tumor structure is categorized into three sub-regions: Whole Tumor (WT), Enhancing Tumor (ET), and Tumor Core (TC). Values in these sub-regions are reported using Dice score coefficients and the Hausdorff distance (HD). Sensitivity is the proportion of actual positive observations that are correctly predicted as positive; it is mathematically specified in Equation (18). Specificity is the proportion of actual negative observations that are correctly predicted as negative, as indicated in Equation (19). Further, accuracy is one of the important performance metrics in BTS: it is the share of correctly detected observations among all observations, as denoted in Equation (20).
$$Sensitivity = \frac{TP}{TP + FN} \times 100 \quad (18)$$
$$Specificity = \frac{TN}{TN + FP} \times 100 \quad (19)$$
$$Accuracy = \frac{TP + TN}{TN + TP + FN + FP} \times 100 \quad (20)$$
Precision is an important performance measure that quantifies how many of the predicted positives are correct, as shown in Equation (21). The F-measure is the harmonic mean of precision and recall, as shown in Equation (22); it summarizes the recall and precision values evaluated on the dataset. The DSC measures the spatial overlap between the ground truth and the segmentation using Equation (23). The average HD between the volume-pixel sets of the ground truth and the segmentation is measured using Equation (24).
$$Precision = \frac{TP}{TP + FP} \times 100 \quad (21)$$
$$F\text{-}measure = \frac{2 \times Precision \times Recall}{Precision + Recall} \times 100 \quad (22)$$
$$DSC = \frac{2\,TP}{2\,TP + FP + FN} \times 100 \quad (23)$$
$$HD = \frac{1}{2}\left( \frac{GtoS}{|G|} + \frac{StoG}{|S|} \right) \times 100 \quad (24)$$
where $TP$, $TN$, $FP$, and $FN$ denote true positives, true negatives, false positives, and false negatives, respectively. $GtoS$ refers to the directed average HD from the ground truth to the segmentation, and $StoG$ to the reverse; $|G|$ is the number of volume pixels in the ground truth, and $|S|$ is the number of volume pixels in the segmentation.
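Equations (18)–(23) reduce to simple arithmetic on confusion-matrix counts; a short Python sketch (the function name and the example counts are our own) is given below:

```python
def segmentation_metrics(tp, tn, fp, fn):
    """Equations (18)-(23) from confusion-matrix counts, reported in percent."""
    sensitivity = tp / (tp + fn) * 100
    specificity = tn / (tn + fp) * 100
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
    precision = tp / (tp + fp) * 100
    recall = sensitivity
    f_measure = 2 * precision * recall / (precision + recall)  # inputs already in percent
    dsc = 2 * tp / (2 * tp + fp + fn) * 100
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "precision": precision,
            "f_measure": f_measure, "dsc": dsc}

print(segmentation_metrics(tp=950, tn=9000, fp=40, fn=10))  # toy counts
```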

4.1. Quantitative Analysis

The proposed VSA-GCNN architecture is validated on the BraTS 2019, 2020, and 2021 datasets, which contain MRIs of brain tumor patients with various tumor types. The VSA-GCNN architecture is incorporated into the network to address memory and computational challenges and to exploit the multimodal nature of the scans. The experimental evaluation in Table 1 compares the Dice scores of existing models with the proposed method. For instance, on BraTS 2020, the suggested model outperformed CNN, ResNet, and U-Net, with a WT score of 98.6, ET of 100, and TC of 99.8. The performance of the various segmentation processes on the chosen BraTS datasets is measured with DSC as well as HD, as shown in Table 1.
The proposed method performance with various segmentation methods is measured based on accuracy, precision, and F-measure for identifying TC, ET, and WT in Table 2, Table 3 and Table 4. The graphical representation of these results is illustrated in Figure 9, Figure 10 and Figure 11, respectively.
From Table 1, Table 2, Table 3 and Table 4, it is observed that the VSA-GCNN achieves superior results compared with the existing methods. Even though the CNN has a high accuracy rate in processing images, it still has limitations, requiring large amounts of training data and high computational resources. ResNet enhances deep neural networks' performance by adding neural layers and reducing the error percentage; its skip connections combine the outputs of prior layers with the stacked layers' outputs, permitting deeper networks to be trained than previously attainable. However, it requires a longer training period, which makes it impractical for real-world applications. The GCNN is well suited for multi-class image segmentation problems since it can handle a high number of classes and provide pixel-level segmentation maps for each. Table 5 shows the analysis of computational time for various segmentation models: CNN, ResNet, GCNN, and VSA-GCNN. Table 5 clearly shows that the proposed VSA-GCNN obtained the segmented portions in under 36 s, much better than the existing CNN, ResNet, and GCNN, which require 82 s, 71 s, and 57 s, respectively.

4.2. Comparative Analysis

This section compares the proposed VSA-GCNN with existing brain tumor segmentation methods, such as SPGAN-MSOA-CBT MRI [29], Shuffled YOLO [23], and Single level U-Net 3D [22]. Table 6 displays the comparative analysis of the proposed VSA-GCNN segmentation with existing methods on BraTS 2019 in terms of accuracy, precision, sensitivity, specificity, and F-measure. A graphical representation of the BraTS 2019 results is given in Figure 12.
The Dice score, specificity, and sensitivity are the parameters considered in the comparative analysis, as shown in Table 7 and Table 8. Graphical representations of the tabulated results are given in Figure 13 and Figure 14.

5. Discussion

This research achieves effective segmentation and classification of brain tumors by proposing a novel approach named VSA-GCNN-BiGRU. The segmentation results are compared with conventional techniques on three datasets: BraTS 2019, 2020, and 2021. The SPGAN-MSOA-CBT MRI [29] is compared with the proposed VSA-GCNN on the BraTS 2019 dataset; the Single U-Net 3D [22] and Shuffled YOLO [23] are compared with the proposed VSA-GCNN on BraTS 2020; and MHKFC [27] and CapsNet + LDCRF [28] are compared with the proposed VSA-GCNN on BraTS 2021. When the proposed VSA-GCNN is evaluated on BraTS 2020, the obtained Dice score, specificity, and sensitivity values are 98.66%, 99.70%, and 97.40%, which are higher than those of Refs. [22,23]. When the proposed VSA-GCNN is evaluated on BraTS 2021, the obtained Dice score, specificity, and sensitivity values are 98.45%, 97.70%, and 97.40%, which are higher than those of Refs. [27,28]. From the overall comparison, the proposed VSA-GCNN-BiGRU approach addresses existing problems in brain tumor segmentation and classification, including inaccurate segmentation, computational inefficiency, and limited robustness, achieving high Dice scores and outperforming the existing methods. Notably, the proposed VSA-GCNN accomplishes a significant reduction in computational time, requiring 36 s compared with the other existing models. Regarding complexity, the proposed VSA-GCNN has a modular design and is robust to implement. The obtained results confirm that the VSA-GCNN achieves higher results than the existing methods on all the collected datasets.
Still, significant computational resources are required for the suggested method, especially during the training stage, which might restrict its use in resource-limited settings. The proposed approach makes use of deep learning methods, which can be challenging to interpret; this lack of explainability may constrain its adoption in clinical settings. Compared with transformer-based models, the VSA-GCNN requires fewer computational resources, making it more appropriate for deployment on resource-constrained devices. However, the VSA-GCNN may struggle with larger MRI input sizes and complex tumor structures, where transformer-based models produce better results in accurately classifying brain tumors.

6. Conclusions

In this manuscript, the VSA-GCNN-BiGRU model is implemented for effective BT segmentation and classification. The inclusion of batch normalization and histogram equalization layers enables the segmentation model to learn and stabilize the multi-scale, fine-grained segment region effectively. Additionally, the incorporation of the morphological gradient function and the Dice loss function improves tumor pixel segmentation in the images and decreases feature information loss. In this research, the BraTS 2019, 2020, and 2021 databases are utilized for experimental analysis. The VSA-GCNN-BiGRU model is robust and effective and reduces the segmentation time, with accuracies of 99.98%, 98.2%, and 99.8% on the BraTS 2019, 2020, and 2021 databases, respectively. The experimental results obtained are superior to those of the existing models in terms of DSC, accuracy, specificity, and sensitivity. In the future, the VSA-GCNN-BiGRU model can be utilized to segment and classify sub-types of medical images by developing an optimal set of model hyperparameters.

Author Contributions

The paper investigation, resources, data curation, writing (original draft preparation), writing (review and editing), and visualization were done by K.P.J. and V.B.G. The paper conceptualization and software were conducted by P.S.P. and R.K.P. The validation, formal analysis, methodology, supervision, project administration, and funding acquisition of the version to be published were conducted by P.B.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data available in a publicly accessible repository.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Sahoo, A.K.; Parida, P.; Muralibabu, K.; Dash, S. An improved DNN with FFCM method for multimodal brain tumor segmentation. Intell. Syst. Appl. 2023, 18, 200245. [Google Scholar] [CrossRef]
  2. Budati, A.K.; Katta, R.B. An automated brain tumor detection and classification from MRI images using machine learning techniques with IoT. Environ. Dev. Sustain. 2022, 24, 10570–10584. [Google Scholar] [CrossRef]
  3. Saeedi, S.; Rezayi, S.; Keshavarz, H.R.; Niakan Kalhori, S. MRI-based brain tumor detection using convolutional deep learning methods and chosen machine learning techniques. BMC Med. Inf. Decis. Making 2023, 23, 16. [Google Scholar] [CrossRef] [PubMed]
  4. Zhu, Z.; He, X.; Qi, G.; Li, Y.; Cong, B.; Liu, Y. Brain tumor segmentation based on the fusion of deep semantics and edge information in multimodal MRI. Inf. Fusion 2023, 91, 376–387. [Google Scholar] [CrossRef]
  5. Shah, H.A.; Saeed, F.; Yun, S.; Park, J.H.; Paul, A.; Kang, J.M. A robust approach for brain tumor detection in magnetic resonance images using finetuned efficientnet. IEEE Access 2022, 10, 65426–65438. [Google Scholar] [CrossRef]
  6. Farajzadeh, N.; Sadeghzadeh, N.; Hashemzadeh, M. Brain tumor segmentation and classification on MRI via deep hybrid representation learning. Expert Syst. Appl. 2023, 224, 119963. [Google Scholar] [CrossRef]
  7. Karayegen, G.; Aksahin, M.F. Brain tumor prediction on MR images with semantic segmentation by using deep learning network and 3D imaging of tumor region. Biomed. Signal Process. Control 2021, 66, 102458. [Google Scholar] [CrossRef]
  8. Tandel, G.S.; Tiwari, A.; Kakde, O.G. Performance optimisation of deep learning models using majority voting algorithm for brain tumour classification. Comput. Biol. Med. 2021, 135, 104564. [Google Scholar] [CrossRef] [PubMed]
  9. Shaik, N.S.; Cherukuri, T.K. Multi-level attention network: Application to brain tumor classification. Signal Image Video Process 2022, 16, 817–824. [Google Scholar] [CrossRef]
  10. Das, S.; Bose, S.; Nayak, G.K.; Satapathy, S.C.; Saxena, S. Brain tumor segmentation and overall survival period prediction in glioblastoma multiforme using radiomic features. Concurr. Comput. Pract. Exper. 2022, 34, e6501. [Google Scholar] [CrossRef]
  11. Sasank, V.V.S.; Venkateswarlu, S. An automatic tumour growth prediction based segmentation using full resolution convolutional network for brain tumour. Biomed. Signal Process. Control 2022, 71, 103090. [Google Scholar] [CrossRef]
  12. Asthana, P.; Hanmandlu, M.; Vashisth, S. Brain tumor detection and patient survival prediction using U-Net and regression model. Int. J. Imaging Syst. Technol. 2022, 32, 1801–1814. [Google Scholar] [CrossRef]
  13. Srinivas, C.; KS, N.P.; Zakariah, M.; Alothaibi, Y.A.; Shaukat, K.; Partibane, B.; Awal, H. Deep transfer learning approaches in performance analysis of brain tumor classification using MRI images. J. Healthc. Eng. 2022, 2022, 3264367. [Google Scholar] [CrossRef] [PubMed]
  14. Maqsood, S.; Damaševičius, R.; Maskeliūnas, R. Multi-modal brain tumor detection using deep neural network and multiclass SVM. Medicina 2022, 58, 1090. [Google Scholar] [CrossRef]
  15. Bairagi, V.K.; Gumaste, P.P.; Rajput, S.H.; Chethan, K.S. Automatic brain tumor detection using CNN transfer learning approach. Med. Biol. Eng. Comput. 2023, 61, 1821–1836. [Google Scholar] [CrossRef] [PubMed]
  16. Akinyelu, A.A.; Zaccagna, F.; Grist, J.T.; Castelli, M.; Rundo, L. Brain tumor diagnosis using machine learning, convolutional neural networks, capsule neural networks and vision transformers, applied to MRI: A survey. J. Imaging 2022, 8, 205. [Google Scholar] [CrossRef] [PubMed]
  17. Jena, B.; Saxena, S.; Nayak, G.K.; Balestrieri, A.; Gupta, N.; Khanna, N.N.; Laird, J.R.; Kalra, M.K.; Fouda, M.M.; Saba, L.; et al. Brain tumor characterization using radiogenomics in artificial intelligence framework. Cancers 2022, 14, 4052. [Google Scholar] [CrossRef] [PubMed]
  18. Atia, N.; Benzaoui, A.; Jacques, S.; Hamiane, M.; Kourd, K.E.; Bouakaz, A.; Ouahabi, A. Particle swarm optimization and two-way fixed-effects analysis of variance for efficient brain tumor segmentation. Cancers 2022, 14, 4399. [Google Scholar] [CrossRef]
  19. Haq, A.U.; Li, J.P.; Agbley, B.L.Y.; Khan, A.; Khan, I.; Uddin, M.I.; Khan, S. IIMFCBM: Intelligent integrated model for feature extraction and classification of brain tumors using MRI clinical imaging data in IoT-healthcare. IEEE J. Biomed. Health Inf. 2022, 26, 5004–5012. [Google Scholar] [CrossRef] [PubMed]
  20. Aamir, M.; Rahman, Z.; Dayo, Z.A.; Abro, W.A.; Uddin, M.I.; Khan, I.; Imran, A.S.; Ali, Z.; Ishfaq, M.; Guan, Y.; et al. A deep learning approach for brain tumor classification using MRI images. Comput. Electr. Eng. 2022, 101, 108105. [Google Scholar] [CrossRef]
  21. Aleid, A.; Alhussaini, K.; Alanazi, R.; Altwaimi, M.; Altwijri, O.; Saad, A.S. Artificial Intelligence Approach for Early Detection of Brain Tumors Using MRI Images. Appl. Sci. 2023, 13, 3808. [Google Scholar] [CrossRef]
  22. Ladkat, A.S.; Bangare, S.L.; Jagota, V.; Sanober, S.; Beram, S.M.; Rane, K.; Singh, B.K. Deep neural network-based novel mathematical model for 3D brain tumor segmentation. Comput. Intell. Neurosci. 2022, 2022, 4271711. [Google Scholar] [CrossRef] [PubMed]
  23. Dang, K.; Vo, T.; Ngo, L.; Ha, H. A deep learning framework integrating MRI image preprocessing methods for brain tumor segmentation and classification. IBRO Neurosci. Rep. 2022, 13, 523–532. [Google Scholar] [CrossRef] [PubMed]
  24. Akbar, A.S.; Fatichah, C.; Suciati, N. Single level UNet3D with multipath residual attention block for brain tumor segmentation. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 3247–3258. [Google Scholar] [CrossRef]
  25. Chen, Y.; Yin, M.; Li, Y.; Cai, Q. CSU-Net: A CNN-Transformer parallel network for multimodal brain tumour segmentation. Electronics 2022, 11, 2226. [Google Scholar] [CrossRef]
  26. Metlek, S.; Çetıner, H. ResUNet+: A New Convolutional and Attention Block-Based Approach for Brain Tumor Segmentation. IEEE Access 2023, 11, 69884–69902. [Google Scholar] [CrossRef]
  27. Shiney, T.S.; Jerome, S.A. An Intelligent System to Enhance the Performance of Brain Tumor Diagnosis from MR Images. J. Digit. Imaging 2023, 36, 510–525. [Google Scholar] [CrossRef]
  28. Elmezain, M.; Mahmoud, A.; Mosa, D.T.; Said, W. Brain tumor segmentation using deep capsule network and latent-dynamic conditional random fields. J. Imaging 2022, 8, 190. [Google Scholar] [CrossRef]
  29. Nagarani, N.; Karthick, R.; Sophia, M.S.C.; Binda, M.B. Self-attention based progressive generative adversarial network optimized with momentum search optimization algorithm for classification of brain tumor on MRI image. Biomed. Signal Process. Control 2024, 88, 105597. [Google Scholar] [CrossRef]
  30. Jabbar, A.; Naseem, S.; Mahmood, T.; Saba, T.; Alamri, F.S.; Rehman, A. Brain tumor detection and multi-grade segmentation through hybrid caps-VGGNet model. IEEE Access 2023, 11, 72518–72536. [Google Scholar] [CrossRef]
  31. Rajput, S.; Kapdi, R.; Roy, M.; Raval, M.S. A triplanar ensemble model for brain tumor segmentation with volumetric multiparametric magnetic resonance images. Healthc. Anal. 2024, 5, 100307. [Google Scholar] [CrossRef]
  32. Esmaeilzadeh Asl, S.; Chehel Amirani, M.; Seyedarabi, H. Brain tumors segmentation using a hybrid filtering with U-Net architecture in multimodal MRI volumes. Int. J. Inf. Technol. 2024, 16, 1033–1042. [Google Scholar] [CrossRef]
  33. Liu, D.; Sheng, N.; He, T.; Wang, W.; Zhang, J.; Zhang, J. SGEResU-Net for brain tumor segmentation. Math. Biosci. Eng. 2022, 19, 5576–5590. [Google Scholar] [CrossRef] [PubMed]
  34. BraTS 2021. Available online: https://www.kaggle.com/datasets/dschettler8845/brats-2021-task1/code (accessed on 1 June 2024).
  35. BraTS 2020. Available online: https://www.kaggle.com/datasets/awsaf49/brats20-dataset-training-validation (accessed on 1 June 2024).
  36. BraTS 2019. Available online: https://www.kaggle.com/datasets/aryashah2k/brain-tumor-segmentation-brats-2019 (accessed on 1 June 2024).
  37. Kim, H.J.; Baek, J.W.; Chung, K. Associative knowledge graph using fuzzy clustering and Min-Max normalization in video contents. IEEE Access 2021, 9, 74802–74816. [Google Scholar] [CrossRef]
  38. Li, Y.; Yuan, Z.; Zheng, K.; Jia, L.; Guo, H.; Pan, H.; Guo, J.; Huang, L. A novel detail weighted histogram equalization method for brightness preserving image enhancement based on partial statistic and global mapping model. IET Image Proc. 2022, 16, 3325–3341. [Google Scholar] [CrossRef]
  39. Fu, Z.; Li, J.; Hua, Z. MSA-Net: Multiscale spatial attention network for medical image segmentation. Alex. Eng. J. 2023, 70, 453–473. [Google Scholar] [CrossRef]
  40. Cui, Y.; Wang, R.; Si, Y.; Zhang, S.; Wang, Y.; Lin, A. T-type inverter fault diagnosis based on GASF and improved AlexNet. Energy Rep. 2023, 9, 2718–2731. [Google Scholar] [CrossRef]
  41. Samee, N.A.; Mahmoud, N.F.; Atteia, G.; Abdallah, H.A.; Alabdulhafith, M.; Al-Gaashani, M.S.; Ahmad, S.; Muthanna, M.S.A. Classification framework for medical diagnosis of brain tumor with an effective hybrid transfer learning model. Diagnostics 2022, 12, 2541. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Graphical representation of proposed brain tumor segmentation and classification.
Figure 2. Example MRI brain scans of BraTS 2021 database [34].
Figure 3. Example MRI brain scans of BraTS 2020 database [35].
Figure 4. Sample MRI brain scans of BraTS 2019 database [36].
Figure 5. Architecture of attention block.
Figure 6. Process of GCNN for segmentation.
Figure 7. Structure of AlexNet.
Figure 8. Structure of GRU.
Figure 9. Performance metrics analysis of VSA-GCNN on BraTS 2019.
Figure 10. Performance metrics analysis of VSA-GCNN on BraTS 2020.
Figure 11. Performance metrics analysis of VSA-GCNN on BraTS 2021.
Figure 12. Performance of VSA-GCNN compared to existing models on BraTS 2019 [29].
Figure 13. Performance of VSA-GCNN compared to existing models on BraTS 2020.
Figure 14. Performance of VSA-GCNN compared to existing methods on BraTS 2021.
Table 1. Performance analysis of various segmentation methods on the BraTS datasets (DSC, HD, specificity, and sensitivity for the TC, ET, and WT sub-regions).

Performance evaluation on BraTS 2019

| Methods | DSC TC | DSC ET | DSC WT | HD TC | HD ET | HD WT | Spec. TC | Spec. ET | Spec. WT | Sens. TC | Sens. ET | Sens. WT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CNN | 83.4 | 73.3 | 89.3 | 29 | 22.29 | 23.61 | 98.41 | 97.59 | 97.47 | 96.65 | 97.12 | 94.99 |
| ResNet | 92.1 | 85.1 | 93.5 | 22.15 | 18.56 | 21.29 | 98.58 | 97.79 | 97.64 | 96.80 | 97.29 | 95.15 |
| GCNN | 96.6 | 99.1 | 94.6 | 20.07 | 16.89 | 20.79 | 98.73 | 97.97 | 97.81 | 96.95 | 97.45 | 95.29 |
| VSA-GCNN | 99.6 | 99.9 | 98.4 | 13.47 | 12.68 | 18.74 | 98.87 | 98.11 | 97.95 | 97.07 | 97.63 | 95.45 |

Performance evaluation on BraTS 2020

| Methods | DSC TC | DSC ET | DSC WT | HD TC | HD ET | HD WT | Spec. TC | Spec. ET | Spec. WT | Sens. TC | Sens. ET | Sens. WT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CNN | 84.6 | 75.2 | 90.6 | 30.1 | 24.29 | 25.11 | 99.39 | 99.43 | 99.25 | 98.04 | 98.74 | 96.99 |
| ResNet | 93.8 | 86.9 | 94.9 | 23.45 | 19.76 | 23.29 | 99.57 | 99.59 | 99.43 | 98.16 | 98.91 | 97.13 |
| GCNN | 97.9 | 100.7 | 96.3 | 21.47 | 17.89 | 22.49 | 99.75 | 99.73 | 99.56 | 98.35 | 99.07 | 97.23 |
| VSA-GCNN | 99.8 | 100 | 98.6 | 13.77 | 12.68 | 18.94 | 99.95 | 99.92 | 99.70 | 98.47 | 99.25 | 97.40 |

Performance evaluation on BraTS 2021

| Methods | DSC TC | DSC ET | DSC WT | HD TC | HD ET | HD WT | Spec. TC | Spec. ET | Spec. WT | Sens. TC | Sens. ET | Sens. WT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CNN | 85.2 | 81.3 | 92.3 | 29.04 | 21.32 | 22.12 | 99.07 | 99.44 | 97.24 | 97.76 | 98.94 | 96.95 |
| ResNet | 95.2 | 83.5 | 95.1 | 27.23 | 19.11 | 18.43 | 99.22 | 99.59 | 97.37 | 97.92 | 99.06 | 97.09 |
| GCNN | 96.5 | 85.8 | 97.2 | 23.51 | 18.09 | 16.22 | 99.39 | 99.71 | 97.51 | 98.10 | 99.16 | 97.26 |
| VSA-GCNN | 98.8 | 92.8 | 98.4 | 20.02 | 15.05 | 14.21 | 99.56 | 99.84 | 97.70 | 98.23 | 99.35 | 97.40 |
Table 2. Evaluation of performance metrics of the proposed method on BraTS 2019.

| Methods | Acc. TC | Acc. ET | Acc. WT | Prec. TC | Prec. ET | Prec. WT | F-Meas. TC | F-Meas. ET | F-Meas. WT |
|---|---|---|---|---|---|---|---|---|---|
| CNN | 88.1 | 79.9 | 85.6 | 83.1 | 72.3 | 87.4 | 94.5 | 95.0 | 96.2 |
| ResNet | 94.2 | 91.3 | 92.3 | 90.1 | 80.2 | 92.3 | 97.4 | 97.8 | 99.2 |
| GCNN | 99.0 | 94.2 | 91.9 | 95.2 | 90.4 | 97.2 | 99.7 | 97.9 | 98.9 |
| VSA-GCNN | 99.8 | 95.3 | 92.1 | 94.4 | 90.5 | 99.1 | 97.8 | 97.4 | 97.8 |
Table 3. Evaluation of performance metrics of the proposed method on BraTS 2020.

| Models | Acc. TC | Acc. ET | Acc. WT | Prec. TC | Prec. ET | Prec. WT | F-Meas. TC | F-Meas. ET | F-Meas. WT |
|---|---|---|---|---|---|---|---|---|---|
| CNN | 90.1 | 80.9 | 86.9 | 84.1 | 73.2 | 89.2 | 95.5 | 95.2 | 98.5 |
| ResNet | 96.9 | 92.9 | 94.1 | 91.9 | 82.4 | 93.2 | 99.1 | 99.1 | 99.2 |
| GCNN | 98.2 | 97.0 | 92.3 | 94.9 | 91.0 | 97.5 | 99.3 | 98.8 | 98.8 |
| VSA-GCNN | 99.8 | 95.9 | 92.3 | 94.6 | 90.5 | 99.1 | 98.1 | 97.9 | 97.6 |
Table 4. Evaluation of performance metrics of the proposed method on BraTS 2021.

| Models | Acc. TC | Acc. ET | Acc. WT | Prec. TC | Prec. ET | Prec. WT | F-Meas. TC | F-Meas. ET | F-Meas. WT |
|---|---|---|---|---|---|---|---|---|---|
| CNN | 91.1 | 79.3 | 85.9 | 85.7 | 74.7 | 88.5 | 94.9 | 94.9 | 97.9 |
| ResNet | 95.9 | 93.1 | 93.2 | 91.2 | 83.6 | 92.9 | 98.9 | 98.4 | 98.9 |
| GCNN | 98.9 | 96.8 | 94.2 | 95.2 | 92.8 | 93.2 | 98.6 | 96.1 | 98.5 |
| VSA-GCNN | 99.6 | 95.4 | 98.2 | 94.6 | 90.5 | 99.1 | 98.1 | 97.9 | 97.6 |
Table 5. Analysis of computational time.

| Method | Computational Time (s) |
|---|---|
| CNN | 82 |
| ResNet | 71 |
| GCNN | 57 |
| VSA-GCNN | 36 |
Table 6. Comparative analysis on BraTS 2019.

| Method | Accuracy (%) | Precision (%) | Sensitivity (%) | Specificity (%) | F-Measure (%) |
|---|---|---|---|---|---|
| SPGAN-MSOA-CBT MRI [29] | 99.97 | 99.88 | 99.91 | 99.87 | 99.77 |
| VSA-GCNN | 99.98 | 99.91 | 99.92 | 99.91 | 99.86 |
Table 7. Comparison of VSA-GCNN segmentation with existing methods on BraTS 2020.

| Methods | Dice ET | Dice TC | Dice WT | Spec. ET | Spec. TC | Spec. WT | Sens. ET | Sens. TC | Sens. WT |
|---|---|---|---|---|---|---|---|---|---|
| Shuffled YOLO [23] | 94.00 | 93.00 | 98.00 | 93.00 | 91.00 | 95.00 | 96.00 | 89.00 | 97.00 |
| Single level U-Net 3D [22] | 72.91 | 80.19 | 88.58 | 99.92 | 99.91 | 99.23 | 73.71 | 81.75 | 92.76 |
| VSA-GCNN | 100 | 99.80 | 98.60 | 99.95 | 99.92 | 99.70 | 98.47 | 99.25 | 97.40 |
Table 8. Comparative analysis of VSA-GCNN with existing methods on BraTS 2021.

| Methods | Dice ET | Dice TC | Dice WT | Spec. ET | Spec. TC | Spec. WT | Sens. ET | Sens. TC | Sens. WT |
|---|---|---|---|---|---|---|---|---|---|
| MHKFC [27] | 94.00 | 93.00 | 98.00 | 93.00 | 91.00 | 95.00 | 96.00 | 89.00 | 97.00 |
| CapsNet + LDCRF [28] | 85.00 | 88.00 | 92.00 | 87.00 | 91.00 | 93.00 | 85.00 | 86.00 | 90.00 |
| VSA-GCNN | 99.98 | 99.23 | 98.45 | 99.56 | 99.84 | 97.70 | 98.23 | 99.35 | 97.40 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
