1. Introduction
Breast cancer affects women worldwide; it is the most frequently diagnosed cancer, is among the four most dangerous types of cancer, and is the leading cause of cancer death in women. An estimated 2.3 million new cases in 2020 mean that roughly one in every eight cancer diagnoses that year was breast cancer. In the same year, there were an estimated 684,996 deaths from breast cancer, and the number of cases is projected to increase by almost 50% by 2040 [
1,
2].
In cancer diagnosis, tumors are classified as benign or malignant. Benign tumors do not spread throughout the body and usually do not reappear once removed by surgery. Malignant tumors invade the tissue around the breast, and the cancer cells can spread to other organs, leading to metastasis, one of several complications that can cause the patient's death. Treatment for breast cancer can be highly effective, halting progression and eradicating the disease, with a 90% or higher probability of survival, particularly when the disease is detected early. Treatment typically combines surgical removal, radiotherapy, immunotherapy, and chemotherapy [
3,
4].
This study examines two medical imaging techniques, namely, mammography (MG) and ultrasound (US). In the first case, images are formed by emitting small amounts of radiation, which is absorbed to a degree that depends on the density of the tissues; the image is then built from the dose of radiation that passes through the different tissues [
5]. In the second case, US images are produced by emitting ultrasound waves into the tissue. Acoustic impedance plays an important role here, as it describes the resistance to the passage of US energy through a substance or tissue due to its refractive and absorption properties. Because different tissues have different impedances, those with higher impedances, such as bone, appear brighter, since the wave is reflected back with greater intensity. The US sensor also measures the return time of each echo, so near reflectors return their echoes before distant ones and are displayed correspondingly closer to the top of the image [
6,
7].
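To make these two relationships concrete, the following minimal sketch (not part of the cited works) computes the intensity reflection coefficient at a boundary between two acoustic impedances and the reflector depth recovered from an echo's round-trip time; the impedance values and the 1540 m/s soft-tissue speed of sound are standard approximations used here only for illustration.

```python
# Illustrative ultrasound relations: fraction of intensity reflected at a
# tissue boundary and reflector depth from the echo round-trip time.

def intensity_reflection_coefficient(z1, z2):
    """Fraction of incident intensity reflected at a boundary between media
    with acoustic impedances z1 and z2 (in rayls)."""
    return ((z2 - z1) / (z2 + z1)) ** 2

def echo_depth(round_trip_time_s, speed_of_sound_m_s=1540.0):
    """Depth of the reflector in metres; 1540 m/s is the usual soft-tissue
    assumption used by clinical scanners."""
    return speed_of_sound_m_s * round_trip_time_s / 2.0

# Soft tissue (~1.63 MRayl) against bone (~7.8 MRayl): a strong, bright echo.
print(intensity_reflection_coefficient(1.63e6, 7.8e6))   # ~0.43
print(echo_depth(65e-6))                                  # ~0.05 m (5 cm deep)
```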
The American College of Radiology (ACR) established a standardized method for describing the perceptual features of a breast lesion in medical images such as MG and Computed Tomography (CT). This system, called BI-RADS (Breast Imaging Reporting and Data System), makes it possible to judge whether a mass is benign or malignant from features such as shape, texture, and size, and indicates the probability of each state. The patient's subsequent treatment then depends on the diagnosis obtained [
6,
8,
9,
10]. Below, we present a brief description of the BI-RADS system.
BI-RADS 0 is assigned when the image does not provide enough information for a diagnosis; prior studies must be requested, and new images are acquired for analysis. BI-RADS 1 characterizes a normal breast in the MG image, i.e., one that does not present suspicious findings. In BI-RADS 2, there are no signs of cancer, but there may be benign findings. BI-RADS 3 to BI-RADS 5 express increasing probabilities of malignancy, ranging from probably benign (a likelihood of malignancy of 2% or less) through suspicious (greater than 2% but less than 95%) to highly suggestive of malignancy (95% or greater). BI-RADS 6 indicates that the presence of cancer has been confirmed.
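For quick reference, the categories above can be summarized in a simple lookup structure; the sketch below is purely illustrative, with the likelihood-of-malignancy ranges taken from the ACR BI-RADS atlas rather than from the proposed system.

```python
# Illustrative lookup of the BI-RADS categories summarized above; the
# malignancy-likelihood ranges follow the ACR BI-RADS atlas and are included
# only as a compact reference, not as part of the proposed system.
BI_RADS = {
    0: ("Incomplete: additional imaging or prior studies needed", None),
    1: ("Negative: no suspicious findings", "essentially 0%"),
    2: ("Benign findings", "essentially 0%"),
    3: ("Probably benign", "<= 2%"),
    4: ("Suspicious abnormality", "> 2% and < 95%"),
    5: ("Highly suggestive of malignancy", ">= 95%"),
    6: ("Known, biopsy-proven malignancy", "confirmed"),
}

for category, (description, likelihood) in BI_RADS.items():
    print(category, description, likelihood)
```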
Most automated CAD (Computer-Aided Diagnosis) systems are based on various machine learning or deep learning strategies, applying deep or handcrafted features to obtain superior performance in different applications, such as segmentation and classification. The performance of a CAD system is characterized by commonly used metrics such as Accuracy (ACC), Precision (PRE), Sensitivity (SEN), Specificity (SPE), and the F1-Score. Below, we present a brief review of recently proposed CAD systems that have demonstrated excellent performance in terms of these metrics.
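All of these metrics are derived from the binary confusion matrix; the short sketch below (illustrative only, with made-up counts) shows how they are computed from the true/false positive and negative counts.

```python
# Evaluation metrics computed from binary confusion-matrix counts
# (TP, TN, FP, FN); the counts below are illustrative only.

def cad_metrics(tp, tn, fp, fn):
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    precision   = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # recall / true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    f1_score    = 2 * precision * sensitivity / (precision + sensitivity)
    return dict(ACC=accuracy, PRE=precision, SEN=sensitivity,
                SPE=specificity, F1=f1_score)

print(cad_metrics(tp=90, tn=85, fp=10, fn=5))
```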
Wei et al. [
11] used a database collected from Quanzhou First Hospital in Fujian, China. Their system removed the edges of the images to eliminate artifacts. As handcrafted features, they employed uniform Local Binary Patterns (uLBP), Histogram of Oriented Gradients (HOG), and Grey Level Co-occurrence Matrix (GLCM) texture features. Finally, two different SVM classifiers, based on Bayes' theorem, separated these features into two classes. In binary classification, they obtained an ACC of 91.11%, SEN of 94.34%, and SPE of 86.49%.
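A hedged sketch of this kind of handcrafted texture pipeline is given below; it uses scikit-image and scikit-learn to extract uLBP, HOG, and GLCM descriptors and feed them to an SVM, with parameter choices that are assumptions rather than the configuration reported by Wei et al.

```python
# Illustrative handcrafted-texture pipeline: uLBP + HOG + GLCM features
# followed by an SVM. All parameters are assumptions for illustration.
import numpy as np
from skimage.feature import local_binary_pattern, hog, graycomatrix, graycoprops
from sklearn.svm import SVC

def texture_features(gray):                      # gray: 2-D uint8 image
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    hog_vec = hog(gray, orientations=9, pixels_per_cell=(16, 16),
                  cells_per_block=(2, 2))
    glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    glcm_vec = np.hstack([graycoprops(glcm, prop).ravel()
                          for prop in ("contrast", "homogeneity",
                                       "energy", "correlation")])
    return np.hstack([lbp_hist, hog_vec, glcm_vec])

# X_train: stacked feature vectors, y_train: benign/malignant labels (0/1)
# clf = SVC(kernel="rbf").fit(X_train, y_train)
```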
Zhang et al. [
12] first segmented the ROI of the MG image by removing noise, enhancing the image via a logarithmic spatial transform, and removing the pectoral muscle as well as the background. Next, the coefficients of the time-frequency spectrum were obtained via the fractional Fourier transform, and these features were then reduced with PCA. In the final stage, SVM and k-nearest-neighbor classifiers were employed, with the SVM yielding a SEN of 92.22%, SPE of 92.10%, and ACC of 92.16%.
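The reduction-and-classification stage of such a pipeline can be sketched as follows, assuming the fractional Fourier spectral features are already available as a feature matrix; the retained-variance threshold and neighbor count are illustrative, not those of Zhang et al.

```python
# PCA reduction of precomputed spectral features followed by SVM or k-NN;
# the spectral features themselves (fractional Fourier coefficients) are
# assumed to be available as rows of X_train.
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

svm_model = make_pipeline(StandardScaler(), PCA(n_components=0.95), SVC())
knn_model = make_pipeline(StandardScaler(), PCA(n_components=0.95),
                          KNeighborsClassifier(n_neighbors=5))
# svm_model.fit(X_train, y_train); knn_model.fit(X_train, y_train)
```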
Daoud et al. [
13] obtained the ROI image using a bounding box. US breast lesions were then classified by extracting deep features with the VGG-19 model and selecting handcrafted features, namely texture (800 features) and morphological (18 features) descriptors. The method combined the handcrafted features with the deep features obtained from each convolutional layer of the CNN architecture, achieving an ACC of 96.1%, SEN of 95.7%, and SPE of 96.3%.
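The sketch below illustrates the general idea of extracting deep features with a pre-trained VGG-19 and concatenating them with a handcrafted feature vector; using a single global-average-pooled output is a simplification of the per-layer combination described by Daoud et al., and the input size and preprocessing are assumptions.

```python
# Deep-feature extraction with a pre-trained VGG-19, concatenated with a
# handcrafted feature vector (simplified, illustrative fusion).
import numpy as np
from tensorflow.keras.applications import VGG19
from tensorflow.keras.applications.vgg19 import preprocess_input

base = VGG19(weights="imagenet", include_top=False, pooling="avg",
             input_shape=(224, 224, 3))

def fused_features(rgb_image, handcrafted_vector):
    """rgb_image: (224, 224, 3) array; handcrafted_vector: 1-D array."""
    x = preprocess_input(rgb_image[np.newaxis].astype("float32"))
    deep = base.predict(x, verbose=0).ravel()   # 512-D global-average-pooled
    return np.concatenate([deep, handcrafted_vector])
```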
Jabeen et al. [
14] performed several main steps in their system: data augmentation, followed by processing with the pre-trained DarkNet-53 architecture, modifying its output layer and extracting the features contained in the Global Average Pooling layer. Afterwards, two optimization algorithms, Reformed Differential Evaluation (RDE) and Reformed Gray Wolf (RGW), were used to select the best features. Classification of the selected features with a cubic SVM reported a PRE of 99.3%.
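A "cubic SVM" is commonly understood as an SVM with a third-degree polynomial kernel; the minimal sketch below applies such a classifier to an already-selected feature matrix and does not reproduce the RDE/RGW selection itself.

```python
# Cubic SVM (third-degree polynomial kernel) applied to features that are
# assumed to have been selected already; scaling is an added assumption.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cubic_svm = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3))
# cubic_svm.fit(X_selected_train, y_train)
# y_pred = cubic_svm.predict(X_selected_test)
```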
Heenaye-Mamode et al. [
15] developed a convolutional neural network (CNN) to segment and classify distinct types of breast abnormalities: asymmetry, calcifications, masses, and carcinomas. First, transfer learning was performed on their dataset using the pre-trained ResNet-50 model. They then enhanced the deep learning model by adjusting the learning rate adaptively according to variations in the error curves. The resulting model achieved a PRE of 88% in classifying these four types of abnormalities in MG images.
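The two ingredients of this approach, transfer learning from a pre-trained ResNet-50 and a learning rate that adapts when the error curve stops improving, can be sketched in Keras as follows; the callback, layer sizes, and hyperparameters are assumptions, not the authors' exact settings.

```python
# ResNet-50 transfer learning with an adaptive learning rate that reacts to
# a stalling validation error curve (illustrative hyperparameters).
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.callbacks import ReduceLROnPlateau

base = ResNet50(weights="imagenet", include_top=False, pooling="avg",
                input_shape=(224, 224, 3))
base.trainable = False                       # freeze the pretrained weights

model = models.Sequential([
    base,
    layers.Dense(256, activation="relu"),
    layers.Dense(4, activation="softmax"),   # masses, calcifications, carcinomas, asymmetry
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

adaptive_lr = ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3)
# model.fit(train_ds, validation_data=val_ds, epochs=50, callbacks=[adaptive_lr])
```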
Tsai et al. [
16] performed BI-RADS classification using a database from the E-Da Hospital in Taiwan, with labels for each image assigned by physicians. Each 224 × 224 block, sampled with a 36-pixel pitch, was assigned a category according to the proportion of lesion area it contained. The method was based on the EfficientNet deep architecture and achieved a PRE of 94.22%, SEN of 95.31%, and SPE of 99.15%.
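The block-sampling rule can be sketched as below: 224 × 224 patches are taken with a 36-pixel pitch and each patch is labeled from the proportion of lesion pixels it contains; the threshold used here is an assumption for illustration.

```python
# Sliding-window block sampling: 224 x 224 patches with a 36-pixel pitch,
# each labeled from the fraction of lesion pixels it contains.
import numpy as np

def sample_blocks(image, lesion_mask, block=224, pitch=36, min_fraction=0.5):
    patches, labels = [], []
    h, w = image.shape[:2]
    for top in range(0, h - block + 1, pitch):
        for left in range(0, w - block + 1, pitch):
            patch = image[top:top + block, left:left + block]
            mask = lesion_mask[top:top + block, left:left + block]
            patches.append(patch)
            labels.append(int(mask.mean() >= min_fraction))  # assumed threshold
    return np.array(patches), np.array(labels)
```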
Muduli et al. [
17] proposed a CNN model for automated breast cancer classification from different types of images: MG and US. The model contained five learnable layers: four convolutional layers and one fully connected layer serving as the classifier. The model automatically extracted prominent features from the images with fewer tunable parameters. Exhaustive simulations on MG datasets (MIAS, DDSM, and INbreast) and US datasets (BUS-1 and BUS-2) confirmed better performance than recent state-of-the-art schemes. In addition, data augmentation helped reduce overfitting. Their CNN model achieved ACCs of 96.55%, 90.68%, and 91.28% on the MIAS, DDSM, and INbreast datasets, respectively, and ACCs of 100% and 89.73% on the BUS-1 and BUS-2 datasets, respectively.
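A minimal example of the kind of data augmentation used to reduce overfitting is shown below; the specific transforms and ranges are assumptions rather than the authors' pipeline.

```python
# Illustrative on-the-fly augmentation for breast-image training batches;
# transform types and ranges are assumptions.
from tensorflow.keras import layers, models

augmentation = models.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomTranslation(0.05, 0.05),
])
# augmented_batch = augmentation(batch_of_images, training=True)
```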
In their work, Raza et al. [
18] presented a CNN architecture with 24 layers, comprising 6 convolutional layers, 9 Inception modules, and 1 fully connected layer. They used the ReLU, Leaky ReLU, and clipped ReLU activation functions together with Batch Normalization. The designed architecture reached an ACC of 99.35%, PRE of 99.6%, SEN of 99.66%, and an F1-Score of 99.6%.
Alsheikhy et al. [
19] presented a study that used the AlexNet CNN architecture with different classifiers, such as K-Nearest Neighbors (KNN), Naive Bayes with a Gaussian kernel, and Decision Tree (DT). The discrete wavelet transform (DWT) was applied to the images to suppress white Gaussian noise, and PCA was used to reduce the dimensionality of the extracted data. Three publicly available datasets were evaluated: Kaggle Breast Histopathology Images (BHI), CBIS-DDSM Breast Images, and Breast Cancer Wisconsin (BCW). The average ACC exceeded 98.6%, and several other metrics were above 98.0%.
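Wavelet-domain denoising of this kind can be sketched with PyWavelets as follows: a 2-D DWT, soft-thresholding of the detail coefficients with a universal threshold, and reconstruction; the wavelet, decomposition level, and threshold rule are assumptions.

```python
# 2-D DWT denoising: decompose, soft-threshold detail bands, reconstruct.
# Wavelet, level, and the VisuShrink-style threshold are assumptions.
import numpy as np
import pywt

def dwt_denoise(image, wavelet="db4", level=2):
    coeffs = pywt.wavedec2(image.astype(float), wavelet, level=level)
    # Noise level estimated from the finest diagonal detail band (MAD rule).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    threshold = sigma * np.sqrt(2 * np.log(image.size))
    denoised = [coeffs[0]] + [
        tuple(pywt.threshold(band, threshold, mode="soft") for band in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(denoised, wavelet)
```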
In the study of Zhang et al. [
20], the authors employed a standard eight-layer CNN and improved it by integrating two techniques, Batch Normalization (BN) and Dropout (DO), with Rank-based Stochastic Pooling (RSP) in the final stage. The resulting BDR–CNN model was hybridized with a two-layer graph convolutional network (GCN), yielding the novel BDR–CNN–GCN model. It was evaluated on 322 MG images from the mini-MIAS dataset using a 14-way data augmentation method, achieving a SEN of 96.20%, SPE of 96.00%, and ACC of 96.10%.
Nagwan et al., in their study [
21], generated pseudocolor input images using Contrast Limited Adaptive Histogram Equalization (CLAHE) and pixel-wise intensity adjustment. The generated image contained the original image in the first channel, the CLAHE-enhanced image in the second channel, and the intensity-adjusted image in the last channel. These images were fed into a CNN backbone to generate high-level deep features, after which a feature-processing technique based on Logistic Regression (LR) and Principal Component Analysis (PCA) was applied. The system was evaluated on two datasets, INbreast and mini-MIAS, achieving the highest ACCs of 98.60% and 98.80%, respectively.
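The three-channel pseudocolor construction can be sketched with OpenCV as below, stacking the original, CLAHE-enhanced, and intensity-adjusted images into one input; the clip limit, tile size, and normalization step are assumptions.

```python
# Three-channel pseudocolor input: original, CLAHE-enhanced, and
# intensity-adjusted channels stacked together (illustrative parameters).
import cv2
import numpy as np

def pseudocolor_input(gray):                       # gray: 2-D uint8 mammogram
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(gray)
    adjusted = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)  # assumed intensity adjustment
    return np.dstack([gray, enhanced, adjusted])   # H x W x 3 pseudocolor image
```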
The major drawback of the previous studies that employed deep feature extraction is the absence of a procedure for characterizing and selecting the deep features, i.e., one that measures the informational significance of each feature with respect to the classification performance. The current study proposes a novel fusion strategy for identifying informative features and eliminating irrelevant ones that might degrade the classification performance. Additionally, we investigate and justify the improved performance obtained by combining deep features with handcrafted features, which leads to higher classification performance.
To overcome the above issues, we propose an efficient deep learning and handcrafted feature model suitable for MG and US breast images. The major contributions of this work are as follows:
Deep learning and handcrafted features are fused via analysis of the lesion features according to statistical criteria, yielding better diagnostic performance;
Two imaging modalities of different natures, MG and US, are used. Evaluating the developed systems on MG and US databases, both standalone and combined, supports the claim of better performance than recent state-of-the-art systems;
Several feature fusion algorithms, such as genetic algorithms and mutual information, are employed; these are based on probabilistic methods and demonstrate superior performance in classifying lesions in both MG and US images (a minimal selection sketch is given after this list).
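As an illustration of the mutual-information-based selection mentioned above, the following sketch ranks concatenated deep and handcrafted features by their mutual information with the class label and keeps the top k; the value of k and the use of scikit-learn's SelectKBest are assumptions, not the exact procedure of the proposed system.

```python
# Mutual-information-based selection over fused deep + handcrafted features;
# k and the SelectKBest wrapper are illustrative assumptions.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

def select_informative(deep_features, handcrafted_features, labels, k=200):
    fused = np.hstack([deep_features, handcrafted_features])   # (n_samples, n_features)
    selector = SelectKBest(score_func=mutual_info_classif, k=k)
    reduced = selector.fit_transform(fused, labels)
    return reduced, selector.get_support(indices=True)          # kept feature indices
```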
The rest of the manuscript is organized as follows: Section 2 describes the proposed system and the fusion procedure for the proposed features; Section 3 explains the experimental setup and presents the performance evaluation results; a discussion of the evaluation is given in Section 4; finally, the conclusions of the study are stated in Section 5.