1. Introduction
The number of new female breast cancer cases continues to rise, and breast cancer is reported as the most common cancer worldwide in terms of incidence and mortality, surpassing other types of cancer [1]. According to the World Health Organization (WHO), more than 2.3 million women were diagnosed with breast cancer globally in 2020, and 685,000 died from the disease [2]. This means that, on average, a woman is diagnosed with breast cancer every 14 s. Preventive measures, including imaging screening, can detect cancer at an early stage, which in turn increases the patient’s survival rate [3]. Among the screening methods currently available for breast cancer, mammography has become the main and most effective imaging method, and it has increased the rate of early-stage cancer detection [4,5].
One of the important signs in cancer detection on mammographic images is the presence of microcalcifications, which appear as small bright spots within an inhomogeneous background [6].
Figure 1a,b illustrates examples of benign and malignant microcalcifications on mammogram images taken from the MIAS database [7]. The morphology of a microcalcification is a crucial predictor of its pathological nature. Large, round, or oval calcifications of uniform size exhibit benign (non-cancerous) characteristics, whereas smaller, non-uniform calcifications exhibit characteristics of malignant growth [8,9]. In clinical practice, it is difficult and time-consuming for radiologists to interpret and evaluate microcalcifications accurately, especially when they appear in low contrast and are obscured by the background tissue of the image [10]. Errors arising from subjective evaluation may lead to unnecessary biopsy procedures, which can cause harm and anxiety for patients [11].
To improve the accuracy of assessing microcalcifications, numerous studies have developed computational approaches that could aid radiologists in distinguishing benign from malignant microcalcifications [3]. A standard computer-aided detection (CAD) pipeline consists of image preprocessing, segmentation, feature extraction, feature selection, and classification, as depicted in Figure 2a. Each phase involves a different technique, and its performance relies heavily on the preceding phase. Preprocessing is the initial phase of the pipeline, and filtering is commonly applied at this stage to remove noise and other artefacts from the image. Noise emerges in mammograms when the image’s brightness varies in areas representing the same tissue owing to non-uniform photon distribution [3]. It produces a grainy appearance, reducing the visibility of some features within the image, especially microcalcifications in dense breast tissue. Because noise, edges, and texture are all high-frequency components, distinguishing them is challenging [12].
Various filtering techniques have been used in the literature to reduce noise in mammogram images, each with its own benefits and drawbacks. For example, the Wiener filter, a linear filter, can improve images by reducing random noise but may produce a blurring effect and incomplete noise removal [13]. Non-linear filters, such as the median filter, can overcome some limitations of linear filters: they are straightforward and offer reasonable noise removal performance, but they may distort fine edges even at low noise densities [14]. The comparative review in [12] concluded that different types of noise require different filtering techniques.
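The trade-off described for the median filter can be illustrated with a minimal NumPy sketch. The function name, the 3x3 window, and the toy patches are assumptions made for illustration and are not taken from the cited works. The first patch shows an isolated impulse being removed; the second shows a one-pixel-wide bright line (a fine structure) being erased as well.

```python
import numpy as np

def median_filter_3x3(image):
    """Apply a 3x3 median filter; borders are handled by edge replication."""
    padded = np.pad(image, 1, mode="edge")
    rows, cols = image.shape
    # Stack the nine shifted views of the image, one per neighbourhood position.
    windows = np.stack([
        padded[r:r + rows, c:c + cols]
        for r in range(3)
        for c in range(3)
    ])
    return np.median(windows, axis=0)

# Toy patch 1 (hypothetical data): flat tissue with a single impulse pixel.
impulse = np.full((5, 5), 100.0)
impulse[2, 2] = 255.0
impulse_filtered = median_filter_3x3(impulse)

# Toy patch 2 (hypothetical data): a one-pixel-wide bright vertical line.
thin_line = np.full((5, 5), 100.0)
thin_line[:, 2] = 255.0
line_filtered = median_filter_3x3(thin_line)

print(impulse_filtered[2, 2])  # impulse replaced by the background value
print(line_filtered[:, 2])     # the thin line is removed too
```

In both cases the bright pixels occupy fewer than half of every 3x3 window, so the median returns the background value. This is why the filter suppresses impulse noise but can also remove fine detail.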
Another important phase that influences classification performance is feature extraction [11]. Feature extraction methods analyse images to extract the most salient features, which are then used as inputs to machine learning classifiers to distinguish between benign and malignant classes. Such features include intensity, statistical, shape, and textural features [5]. The grey-level co-occurrence matrix (GLCM), which counts how often pairs of grey levels co-occur in a region of interest (ROI), is a well-known texture descriptor and is used extensively in the literature [5,15,16,17]. Nevertheless, all of these features focus on local information in the images and are often burdened with details, resulting in data complexity [18].
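As a brief illustration of the GLCM idea, the following sketch counts horizontally adjacent grey-level pairs in a small ROI and derives the contrast feature from the normalised matrix. It is a simplified, NumPy-only version: the function names, the single horizontal offset, and the 4-level toy ROI are assumptions for illustration and do not reproduce the settings used in the cited studies.

```python
import numpy as np

def glcm_horizontal(image, levels):
    """Count pairs (i, j) where grey level j lies immediately to the right of i."""
    glcm = np.zeros((levels, levels), dtype=np.int64)
    left = image[:, :-1].ravel()
    right = image[:, 1:].ravel()
    # np.add.at accumulates correctly even when the same (i, j) pair repeats.
    np.add.at(glcm, (left, right), 1)
    return glcm

def glcm_contrast(glcm):
    """Contrast: sum of p(i, j) * (i - j)^2 over the normalised matrix."""
    p = glcm / glcm.sum()
    i, j = np.indices(p.shape)
    return float(np.sum(p * (i - j) ** 2))

# Hypothetical 4-level ROI used only to demonstrate the computation.
roi = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])

glcm = glcm_horizontal(roi, levels=4)
contrast = glcm_contrast(glcm)
print(glcm)
print(contrast)
```

The matrix depends only on neighbouring pixel pairs, which reflects the local nature of such texture features.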
This study proposes a new classification approach based on persistent homology (PH) that extracts informative features from the images. The approach consists of filtering and feature extraction processes, as shown in Figure 2b.
PH, the core tool of topological data analysis (TDA), has recently been widely used as a multi-scale representation of topological features. It extracts topological summaries from data that capture the birth and death of connected components, loops, and voids through a filtration process [19]. Persistence diagrams (PDs) are one of the topological descriptors produced by PH [20]. A PD comprises a collection of points in the half-plane above the diagonal whose coordinates are the birth and death values of topological features, which helps in distinguishing robust from noisy topological properties [21]. The geometric measurement of the associated topological properties correlates directly with the lifespan (the difference between death and birth). A long lifespan, represented by a point far from the diagonal in the diagram, is considered a prominent feature. In contrast, short lifespans, represented by points close to the diagonal, are interpreted as noise [22].
There has been minimal effort to explore the potential use of PDs as a filtering approach, especially for mammogram images. A noteworthy study [23] filtered out 20% of the points close to the diagonal in the PD, but it focused on quality assessment of eye fundus images. The PD of a microcalcification image contains thousands of points, many with short lifespans (noise), necessitating an additional filtering procedure. Thus, a novel method for filtering noise in a persistence diagram based on the maximum lifespan of the image is proposed.
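One way to picture filtering a PD relative to its maximum lifespan is sketched below. The exact rule is not stated in this section, so the sketch assumes that a point is kept when its lifespan is at least a chosen fraction of the maximum lifespan in the diagram. The function name, the `level` parameter, and the toy diagram are hypothetical.

```python
import numpy as np

def filter_diagram(diagram, level):
    """Keep points whose lifespan is at least `level` times the maximum lifespan.

    diagram: array of (birth, death) pairs.
    level:   fraction in [0, 1]; an assumed parameterisation of the filtering level.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    if diagram.shape[0] == 0:
        return diagram  # nothing to filter
    lifespan = diagram[:, 1] - diagram[:, 0]
    threshold = level * lifespan.max()
    return diagram[lifespan >= threshold]

# Hypothetical 1-dimensional PD: three short-lived points and two long-lived ones.
pd1 = np.array([[10, 12], [20, 21], [30, 90], [40, 70], [55, 58]])
kept = filter_diagram(pd1, level=0.25)
print(kept)
```

Here the maximum lifespan is 60, so a level of 0.25 removes every point with a lifespan below 15 and keeps the two points far from the diagonal. Because the threshold scales with each image's own maximum lifespan, the same level can be applied to images with different intensity ranges.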
In PH, the topological features generated from a PD can be vectorised and integrated into machine learning models. Various vectorisation approaches have been proposed, with promising results in various fields. For example, the persistence image (PI) proposed by [24] has been used in hepatic tumour classification with considerable accuracy [25,26]. Meanwhile, persistent entropy (PE) and p-norm features were applied to dark soliton detection [27], and persistence landscapes (PL) to the quantitative analysis of fluorescence microscopy images [28]. Furthermore, the authors of [29] employed Betti numbers as image features for evaluating tumour heterogeneity. Nevertheless, a recent study [30] stated that keeping track of the lifespan is more informative than the progression of Betti numbers. The researchers used the mean and standard deviation of the lifespan of each cycle, together with PE, for 0- and 1-dimensional features as inputs to machine learning models for determining the Gleason score of prostate cancer, reporting an accuracy above 95%.
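The lifespan-based summaries mentioned above can be computed directly from a PD. The sketch below uses the usual definition of persistent entropy, E = -sum(p_i log p_i) with p_i = l_i / sum(l), where l_i is the lifespan of point i. The natural logarithm, the function name, and the toy diagram are assumptions made for illustration.

```python
import numpy as np

def lifespan_features(diagram):
    """Return the mean, standard deviation, and persistent entropy of lifespans."""
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    lifespan = diagram[:, 1] - diagram[:, 0]
    lifespan = lifespan[lifespan > 0]  # zero-length bars carry no probability mass
    p = lifespan / lifespan.sum()
    entropy = float(-np.sum(p * np.log(p)))
    return {
        "mean": float(lifespan.mean()),
        "std": float(lifespan.std()),
        "persistent_entropy": entropy,
    }

# Hypothetical diagram in which all four points have the same lifespan of 2.
uniform_pd = [[0, 2], [1, 3], [2, 4], [5, 7]]
features = lifespan_features(uniform_pd)
print(features)
```

When all lifespans are equal, the entropy reaches its maximum, log(n) for n points. A diagram dominated by one long-lived feature gives a value closer to zero.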
PH-based machine learning is a promising technique with many potential applications across different fields [31]. However, one of the challenges mentioned in the published literature is PH-based feature representation [19]. In other words, selecting suitable topological features is crucial because their suitability depends on the data type and the problem at hand. This study explores the potential use of PH for both noise filtering and feature extraction. To the best of our knowledge, this is the first work using PH to tackle the challenge of filtering noise and selecting suitable topological features to improve the classification of microcalcifications in mammogram images. The objectives of this paper are: (i) to propose multi-level noise filtering of the PD of the 1-dimensional homology group based on maximum lifespan; (ii) to obtain vectorised topological features from the filtered PD using PI and PE; (iii) to compare the performance of filtered and unfiltered PDs, including the performance of individual and concatenated features, using several machine learning models; and (iv) to suggest the optimal filtration level for the MIAS and DDSM datasets. As this work aims to highlight the importance of a topological approach to classifying microcalcifications in a machine learning setting, prior knowledge of machine learning is assumed and is not recalled in this section or elsewhere in the paper.
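To make objective (ii) more concrete, the following is a simplified sketch of how a PD can be turned into a PI vector. It maps each point to (birth, persistence) coordinates, weights it linearly by persistence, and places a Gaussian on a fixed grid. Unlike the original PI construction, which integrates the surface over each pixel, this sketch only samples the surface at pixel centres. The resolution, `sigma`, the grid extent, and the toy diagram are all assumed values, not the settings used in this study.

```python
import numpy as np

def persistence_image(diagram, resolution=10, sigma=1.0, extent=(0.0, 10.0)):
    """Sample a persistence-weighted sum of Gaussians on a square grid.

    Rows index persistence and columns index birth.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    birth = diagram[:, 0]
    persistence = diagram[:, 1] - diagram[:, 0]

    lo, hi = extent
    step = (hi - lo) / resolution
    centres = lo + step * (np.arange(resolution) + 0.5)  # pixel centres
    grid_birth, grid_pers = np.meshgrid(centres, centres)

    image = np.zeros((resolution, resolution))
    for b, p in zip(birth, persistence):
        sq_dist = (grid_birth - b) ** 2 + (grid_pers - p) ** 2
        gaussian = np.exp(-sq_dist / (2 * sigma ** 2)) / (2 * np.pi * sigma ** 2)
        image += p * gaussian  # linear weighting: long-lived points dominate
    return image

# Hypothetical diagram: one prominent point and one point near the diagonal.
toy_pd = [[2.5, 7.0], [6.0, 6.5]]
pi_image = persistence_image(toy_pd)
pi_vector = pi_image.ravel()  # fixed-length feature vector for a classifier
peak = np.unravel_index(pi_image.argmax(), pi_image.shape)
print(pi_vector.shape, peak)
```

The resulting vector has the same length for every image, regardless of how many points the PD contains, which is what allows it to be concatenated with PE and passed to standard classifiers.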
4. Discussion and Future Work
The literature commonly describes PH as robust against image noise [25,31]. However, this study demonstrates that if the PD is used directly without any filtering, machine learning models cannot successfully classify the vectorised topological features of microcalcifications. Moreover, the experimental results indicate that performance improved when the optimal filtering level was applied to each dataset.
In terms of topological features, PI is more prominent in the MIAS dataset, whereas PE is more prominent in the DDSM dataset.
Figure 12a,b illustrates the discriminant values of every feature based on the DT model. Compared to malignant microcalcifications, benign microcalcifications have lower PI and PE values.
In Figure 13, we present some examples of images from both datasets in which benign images have higher PI and PE values than malignant images when the topological features are unfiltered. This results in misclassification between the two classes and degrades the classification performance. Because benign microcalcifications present a long lifespan in the PD (refer to Figure 5), the significant topological features can be distinguished after filtering the PD.
PH offers some unique advantages. One is its ability to analyse data at multiple scales, which can be particularly useful for medical images containing features of different sizes. By analysing the data at multiple scales, PH can capture information about features that other methods may miss. In addition, it imposes a lower computational burden on the system because classification operates not on the image matrix but on a compact vector derived from the input data [23]. This study uses only two topological features for each image, PE and PI, which matters because the complexity of a CAD system increases rapidly with the number of features used [44]. Furthermore, the proposed filtering procedure does not affect image quality, because it operates only on the PD. This contrasts with some preprocessing methods, such as linear filtering, which can degrade edges and image details and give the images a blurred effect [13].
Comparative Analysis
Several existing state-of-the-art non-PH models evaluated on the same MIAS and DDSM datasets were chosen for performance comparison, as shown in Table 7. In this table, two types of images are used for both datasets: greyscale and binary. Greyscale indicates that classification is performed directly on the original image from the dataset, which has 256 grey levels. Binary indicates that the original images undergo a segmentation process to produce a black-and-white image, where pixel values of 0 (black) are treated as the background region and pixel values of 1 (white) as the segmented regions of microcalcifications. However, some microcalcifications, usually in malignant cases, are not clearly visible and are closely connected to background tissue, making it impossible for segmentation algorithms to segment these calcifications completely [45]. For that reason, the proposed methods are tested on the greyscale images so that the whole topological structure of the image can be considered.
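The limitation of binary images can be illustrated with a deliberately simple example. The sketch below uses a single global threshold as a stand-in for a segmentation algorithm. The cited works use their own segmentation methods, so the function, the threshold value, and the toy ROI are assumptions for illustration only.

```python
import numpy as np

def binarise(image, threshold):
    """Return 1 (white) where the pixel value reaches the threshold, else 0 (black)."""
    return (image >= threshold).astype(np.uint8)

# Hypothetical 8-bit ROI: background tissue with two calcifications.
roi = np.full((5, 5), 90, dtype=np.uint8)
roi[1, 1] = 200  # high-contrast calcification
roi[3, 3] = 110  # low-contrast calcification, close to the background level

mask = binarise(roi, threshold=128)
print(mask)
```

The low-contrast calcification is absent from the binary mask, although it is still present in the greyscale values. Any feature computed from the mask alone therefore cannot reflect it.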
Various features have been studied in the literature to classify benign and malignant microcalcifications. Fadil et al. [15] used the 2D discrete wavelet transform for contrast enhancement of the microcalcifications and extracted eight textural features from the GLCM, achieving 95% accuracy and 0.92 AUC with the random forest (RF) classifier. Suhail et al. [46] obtained a single feature value by applying a scalable linear Fisher discriminant analysis (LDA) approach, achieving up to 96% accuracy with 0.95 AUC using SVM. Mahmood et al. [16] integrated machine learning with a radiomic approach to classify textural and statistical features, attaining 98% accuracy and 0.90 AUC. Gowri et al. [17] also used textural features, together with fractal analysis, and obtained 96.3% accuracy. Melekoodappattu et al. [5] proposed a hybrid extreme learning machine classifier consisting of the extreme learning machine (ELM) with the fruit fly optimisation algorithm (ELM-FOA) along with glowworm swarm optimisation (GSO). Preprocessing was performed using the Wiener filter, followed by enhancement with contrast-limited adaptive histogram equalisation (CLAHE). In total, 44 features were extracted using speeded-up robust features (SURF), the Gabor filter, and the GLCM, and the method achieved 99.15% accuracy.
In addition, topological features have been studied in [47,48] for modelling and classifying microcalcifications, with promising results. To the best of our knowledge, our proposed method is the first to use a persistence diagram as a filtering approach, together with PH features, to tackle the challenge of discriminating between benign and malignant microcalcifications. The method achieves 96.2% accuracy with 0.96 AUC for the MIAS dataset and 99.3% accuracy with 0.99 AUC for the DDSM dataset. This is comparable to other state-of-the-art non-PH approaches developed to solve the same problem.
Although the performance is satisfactory, this study has several limitations. First, it used a limited number of images. Hence, additional testing must be conducted on mammographic images collected from hospitals or population-screening projects. Increasing the number of images would permit a more in-depth assessment and help reduce bias in the data. Second, external validation of model performance was not conducted; doing so could have further demonstrated the model's generalisability. Third, the performance of the model was not compared with deep learning approaches owing to the small amount of data. In future studies, the persistent homology features could be integrated into deep learning architectures, potentially improving the performance and robustness of image classification algorithms, particularly in medical imaging, where complex structures and patterns are often present. Lastly, the choice of preprocessing steps, such as image enhancement or denoising, can also affect the resulting persistent homology features and introduce bias into the model. It is important to consider carefully which preprocessing steps are appropriate for a specific dataset and to evaluate their impact on model performance.