1. Introduction
Ultrasound (US) imaging is considered the most-affordable clinical imaging technique [
1]. In addition, it is a safe and effective method for diagnosing a number of medical conditions [
2,
3].
US imaging has been used, for example, in the diagnosis of Breast Cancer (BC). BC is the most-commonly diagnosed type of cancer worldwide [
4]. BC screening programs, where diagnosis usually occurs before symptoms appear, are based on mammography [
5]. Mammography has a positive impact on mortality rates, with one study finding a reduction of almost 30% [
6]. However, it also has downsides, such as false positives and false negatives [
7]. Moreover, it is also known that mammography sensitivity decreases for women having denser breasts [
8]. US imaging plays a role in overcoming this specific disadvantage [
9,
10].
COVID-19 had, as of March 2023, infected over 759 million people worldwide, causing almost 7 million deaths [
11]. Owing to its portability, Point-Of-Care US (POCUS) has emerged as a safe and viable alternative for evaluating COVID-19, since it does not require patient transportation to clinical facilities [
12]. Besides, it was found that higher lung US scores are related to severe stages of the disease, meaning that POCUS may play an important role in patient risk stratification [
13]. Moreover, the use of POCUS in respiratory failure has shown excellent performance in terms of diagnostic speed and accuracy [
14].
Despite all the benefits, US images typically show Speckle Noise (SN), which limits contrast resolution and decreases lesion conspicuity. As a consequence, diagnostic interpretation is impaired [
15,
16]. SN usually presents itself as granularity in otherwise clean US images, but its aspect is dependent on the characteristics of both the machinery used and the analyzed tissues. This Speckle appearance occurs when US waves emitted by the source are reflected on a surface or tissue that is irregular [
17]. After the waves are emitted, they interact with different tissues and organs (which may be irregular) and are reflected back to the sensor with different phases. This results in random constructive and destructive interference between these waves, which creates a random granular pattern called Speckle [
16].
Mathematically, SN multiplies the US signal resulting from wave reflection on the tissue. For that reason, the SN level depends directly on the local pixel intensity of the area where it occurs. Consequently, SN not only increases image blurriness, but also decreases image contrast [
18].
The signal received by the US machinery is thus composed of a useful signal and SN. Speckle is predominantly a multiplicative type of noise [
15], but it comprises both a multiplicative and an additive component [
18]. For that reason, it can be mathematically modeled through Equation (1):
$$O(x,y) = S(x,y)\,M(x,y) + A(x,y) \qquad (1)$$
where
(x,y) are the 2D coordinates of the US image,
O is the observed signal,
S is the original signal,
M is the multiplicative component of the noise, and
A is the additive component of the noise. Since the multiplicative component is usually much more significant than the additive component, the latter is usually disregarded [
18].
Multiplicative SN can be modeled by a Rayleigh distribution, following the probability density function in Equation (2):
$$f(x;\sigma) = \frac{x}{\sigma^{2}} \exp\left(-\frac{x^{2}}{2\sigma^{2}}\right), \quad x \geq 0 \qquad (2)$$
where x represents a random variable with a Rayleigh distribution, and the σ parameter represents the level of the noise that decreases lesion conspicuity [19].
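As an illustration of the multiplicative model with Rayleigh-distributed noise, the following minimal sketch simulates speckle on a toy image. It assumes that the additive term is dropped and that the multiplicative component is drawn independently per pixel from a Rayleigh distribution with scale parameter `sigma`. The function names (`sample_rayleigh`, `add_speckle`) and the toy image are hypothetical and chosen only for this example; the sketch is not the simulation procedure used in this study.

```python
import math
import random


def sample_rayleigh(sigma, rng):
    """Draw one Rayleigh(sigma) sample by inverting its CDF."""
    u = rng.random()  # uniform in [0, 1)
    # CDF: F(x) = 1 - exp(-x^2 / (2 sigma^2))  =>  x = sigma * sqrt(-2 ln(1 - u))
    return sigma * math.sqrt(-2.0 * math.log(1.0 - u))


def add_speckle(image, sigma, seed=0):
    """Return O = S * M for a 2D list of intensities (additive term A dropped).

    M is drawn independently per pixel from Rayleigh(sigma); this sampling
    scheme is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    return [[s * sample_rayleigh(sigma, rng) for s in row] for row in image]


# Toy "image" (hypothetical): a dark left half and a bright right half.
clean = [[0.2, 0.2, 0.8, 0.8] for _ in range(4)]
noisy = add_speckle(clean, sigma=0.5)
```

Because the noise multiplies the signal, the absolute fluctuations in the bright half of the toy image are larger than those in the dark half, which is the intensity dependence described above.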
The presence of SN represents a problem when assessing medical US images. For that reason, different efforts have been made to remove this type of noise while preserving image characteristics.
Given what was discussed above, filtering is an essential step for image analysis. The main challenge is to remove SN without changing important features of the image [
15].
There are several classical approaches to removing SN based on filtering techniques. The Mean and Median filters are two commonly used approaches. While the former replaces the central pixel of the neighborhood being analyzed with the average value in that neighborhood, the latter replaces it with the median. Since the Mean filter weighs all pixel values in the neighborhood, it blurs image details, which the Median filter largely avoids. This difference occurs because the Median filter only takes the middle value of the sorted neighborhood, which also helps to preserve edges [
20]. The Wiener filter, which operates based on local variance values, applies more smoothing where the variance is lower and has also been successfully used for removing SN [
20,
21]. Adaptive-Mean filters are also widely used for SN removal [
22]. The Lee filter is one such example. When applying it, smoothing is specifically performed in regions with low variance, rather than high variance. This approach preserves the edges present in the image, as edges are typically characterized by higher variance regions [
15]. This selective smoothing can present itself as a problem since it disregards the presence of SN near the edges. The Kuan filter, also used for SN removal, converts the multiplicative noise into additive noise. It is very similar to the Lee filter, but it uses a different weighting function [
23].
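The behavior of these classical filters can be sketched in a few lines. The example below is a simplified, dependency-free illustration operating on 2D lists. The Lee filter is written in its commonly used simplified form, where the output is the local mean plus a variance-based weight times the deviation from that mean. The `noise_var` parameter, the border handling (clipped windows), and all function names are assumptions of this sketch and do not reproduce the implementations evaluated in the cited works.

```python
from statistics import mean, median, pvariance


def window(image, r, c, radius=1):
    """Collect the values in the (2*radius+1)^2 neighborhood, clipped at the borders."""
    rows, cols = len(image), len(image[0])
    return [
        image[i][j]
        for i in range(max(0, r - radius), min(rows, r + radius + 1))
        for j in range(max(0, c - radius), min(cols, c + radius + 1))
    ]


def apply_filter(image, pixel_fn, radius=1):
    """Apply pixel_fn(center_value, neighborhood_values) at every pixel."""
    return [
        [pixel_fn(image[r][c], window(image, r, c, radius)) for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def mean_filter(image, radius=1):
    # Every neighbor contributes, so outliers and edges get smeared.
    return apply_filter(image, lambda center, nb: mean(nb), radius)


def median_filter(image, radius=1):
    # Only the middle value of the sorted neighborhood is kept.
    return apply_filter(image, lambda center, nb: median(nb), radius)


def lee_filter(image, noise_var, radius=1):
    """Simplified Lee filter: out = m + w * (center - m), w = v / (v + noise_var).

    Low local variance v gives w near 0 (strong smoothing); high local
    variance, as found at edges, gives w near 1 (pixel mostly preserved).
    """
    def pixel_fn(center, nb):
        m = mean(nb)
        v = pvariance(nb, m)
        w = v / (v + noise_var) if (v + noise_var) > 0 else 0.0
        return m + w * (center - m)

    return apply_filter(image, pixel_fn, radius)


# Toy image (hypothetical): flat background with one bright outlier.
img = [[10.0] * 5 for _ in range(5)]
img[2][2] = 100.0
```

On this toy image, the Median filter removes the outlier at the center entirely, while the Mean filter spreads it over the neighborhood. The Lee filter's behavior depends on `noise_var`: large values make it act like the Mean filter, and small values leave high-variance regions nearly untouched, which is why SN close to edges tends to survive.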
A study by Khan et al. [
22] compared the capability of several filters in the task of SN removal. This comparison used the Root-Mean-Squared Error, Peak Signal-to-Noise Ratio (PSNR), Speckle Suppression Index, Standard Deviation-to-Mean Ratio, and Structural Similarity Index Measure (SSIM) as performance metrics. Their analysis showed that, for variance values of pixel intensities below one, the filters that were generally more robust in removing SN from the US images were the Median, Lee, and Kuan filters.
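For reference, two of the full-reference metrics mentioned above can be computed as in the following minimal sketch. It assumes images stored as 2D lists with intensities in [0, `max_value`]; the function names and the default `max_value` of 1.0 are choices of this example only. SSIM is omitted because it requires local windowed statistics.

```python
import math


def rmse(reference, estimate):
    """Root-Mean-Squared Error between two equally sized 2D lists."""
    sq_diffs = [
        (a - b) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for a, b in zip(ref_row, est_row)
    ]
    return math.sqrt(sum(sq_diffs) / len(sq_diffs))


def psnr(reference, estimate, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB: 20 * log10(max_value / RMSE)."""
    err = rmse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(max_value / err)
```

Both metrics need a clean reference image, which is why they can only be used with simulated added SN and not with naturally occurring SN.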
Moving forward to novel approaches, Artificial Intelligence (AI) has made its way into the medical field, and noise removal is no exception. Recent research has been aiming to reduce the effect of SN in medical US images using Deep Learning (DL).
The work of Mishra et al. [
24] consisted of using AI for SN removal of liver US images. Their model was a Convolutional Neural Network (CNN) made up of a series of three ResNet [
25] blocks (with skip connections), besides having a convolutional block at the beginning and at the end of the network. The authors found that their DL model outperformed all the classical techniques for despeckling US [
26,
27,
28,
29], in terms of the PSNR and SSIM. Despite the robust results that this approach achieved, the ground-truth despeckled data were obtained through state-of-the-art approaches, which means that, instead of learning how to obtain clean data from noisy US images, the model was learning how to mimic those approaches or an ensemble of them.
Another study [
30] aimed to remove SN from US images. To do that, a pre-trained Residual Network was used and tested on US images with simulated added SN and on natural images. The variance of the simulated noise took the values 0.02, 0.04, 0.06, 0.08, and 0.1. Model performance on natural images was assessed with the PSNR and SSIM, while for US images, a no-reference quality metric was used: the Naturalness Image Quality Evaluator (NIQE) [
31]. The authors compared the model's performance with that of classical filters (for example, Mean, Median, Lee, Kuan) and found that the model outperformed all filters in terms of both the PSNR and SSIM for all noise levels. Similar results were found for the NIQE metric when evaluating SN suppression in US images. The authors also aimed to evaluate model performance in removing the SN that occurs naturally in clinical US images. They found, once again, that their model achieved a higher NIQE value than that of classic filters, showing that the DL approach might be better at removing naturally occurring SN in US images. One of the downsides of their evaluation is that NIQE is primarily designed for evaluating the quality of natural images. While it can provide a measure of perceived quality for general images, it may not be the most-appropriate metric to assess the quality of US images.
A group of researchers aimed to test five different networks in the task of SN removal: a dilated convolution autoencoder, two U-shaped networks (one with batch normalization and another with batch re-normalization), a generative adversarial network, and a CNN-Residual Network [
32]. To do so, they first added simulated SN to their medical US images at four different noise levels (σ = 0.1, 0.25, 0.5, 0.75). Then, they trained several classic DL architectures and tested them on the task of removing the added SN with two different test sets. Besides comparing the performance of the networks in terms of the PSNR and SSIM, the authors also compared them to classical filtering techniques (including the Median, Mean, Lee, and Kuan filters). Considering the first dataset, the autoencoder was the architecture with the best PSNR for the three highest noise levels, whereas, in terms of the SSIM, a U-net outperformed all the others. Moreover, the authors found that, in general, the proposed DL techniques outperformed classical methods in terms of removing simulated added SN.
Although the studies described here demonstrated that DL models outperform several classical filters, they did not focus on the clinical relevance of their models. Additionally, it is worth noting that most of the proposed architectures had more than 500,000 parameters, with all of them surpassing 150,000 parameters [
24,
30,
32]. A higher parameter count translates directly into higher computational requirements in terms of processing power, memory, and time, so training and deploying such models can be costly. Furthermore, point-of-care US is frequently used in settings with limited memory and processing power, where the real-time or online application of complex models can be compromised. Finally, larger models consume more energy, which, in turn, increases carbon emissions and their detrimental impact on the environment [
33].
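To make the link between architecture and parameter count concrete, the following sketch counts the trainable parameters of a stack of 2D convolutional layers and estimates the memory needed to store them as 32-bit floats. The four-layer configuration is purely hypothetical and is not the architecture proposed in this work or in the cited studies; only the counting formula is standard.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Trainable parameters of one 2D convolution with a square kernel."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def total_params(layers):
    """Sum parameters over a list of (in_channels, out_channels, kernel_size) tuples."""
    return sum(conv2d_params(c_in, c_out, k) for c_in, c_out, k in layers)


def float32_megabytes(n_params):
    """Storage for n_params 32-bit floats, in megabytes (10^6 bytes)."""
    return n_params * 4 / 1e6


# Hypothetical small convolutional autoencoder: 1 -> 16 -> 8 -> 16 -> 1 channels, 3x3 kernels.
tiny_autoencoder = [(1, 16, 3), (16, 8, 3), (8, 16, 3), (16, 1, 3)]
n_tiny = total_params(tiny_autoencoder)
```

Since the count of a convolutional layer grows with the product of its input and output channels, narrowing the layers reduces the total quadratically, whereas removing layers only reduces it linearly.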
The novelty of our work lies in simplifying DL approaches for SN reduction in medical US images, towards a more environmentally friendly, cost-effective, and resource-efficient model. Moreover, a classification task was also explored, considering the importance of studying the impact of removing naturally occurring SN in clinical practice.
The remainder of the paper is organized as follows. Section 2 starts by exploring the data used in this study and their division into different sets. It continues by describing in detail the constructed models, their variations, and the parameter configuration for training. Section 3 presents the results. Section 4 discusses the obtained findings while comparing our work to the ones presented in Section 1. Finally, Section 5 addresses the general findings of the work, draws attention to the limitations associated with the followed methodology, and points out possible directions of future work.
5. Conclusions
Motivated by the goal of improving diagnostic capability using US images and considering the lack of practical AI-based solutions for SN removal, we proposed a DL model to remove SN from US images with 5× to 20× fewer parameters than other proposed DL approaches [
24,
30,
32]. All CNN-AE models achieved better results, in terms of the SSIM and PSNR, when compared with the Median and Lee filters. We also showed that our model was less affected by the increase in the noise level than the filters. These results suggest that simplicity may be the solution.
The novelty of this study was the investigation of the removal of naturally occurring SN from original US images for a real-world application. Based on a CNN classification model that differentiates malignant from benign breast lesions, we tested the impact of SN removal in such a clinical task. Our results showed that removing SN decreased the MCC, F1-score, and accuracy of the CNN classification model. However, considering that the correct detection of a malignant lesion is the most-essential task in clinical practice, sensitivity and the NPV should be considered as targets for evaluating the developed model. Considering this, the CNN-AEs ( and ), which were not the best models in terms of the PSNR and SSIM, may be considered the most-appropriate models for the clinical task considered in this study.
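The metrics discussed above all derive from the confusion matrix of the binary classifier. The sketch below uses their standard definitions, with the malignant class taken as positive; the function name and the two sets of counts are hypothetical and serve only to show how accuracy can decrease while sensitivity and the NPV increase. They are not results from this study.

```python
import math


def _safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts (malignant = positive)."""
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _safe_div(tp, tp + fn),  # malignant lesions correctly detected
        "npv": _safe_div(tn, tn + fn),          # "benign" predictions that are truly benign
        "accuracy": _safe_div(tp + tn, tp + fp + tn + fn),
        "f1": _safe_div(2 * tp, 2 * tp + fp + fn),
        "mcc": _safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for two classifiers evaluated on the same 100 lesions.
baseline = classification_metrics(tp=40, fp=10, tn=45, fn=5)
despeckled = classification_metrics(tp=43, fp=18, tn=37, fn=2)
```

In this hypothetical comparison, the second classifier misses fewer malignant lesions, so its sensitivity and NPV are higher, but it produces more false positives, so its accuracy is lower. This is the kind of trade-off that makes the choice of target metric a clinical decision.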
Overall, our exploratory study stands as a way to argue that every AI application for solving medical problems must focus on its application to the real world. In the case of SN removal, please note that:
1. The CNN-AE’s performance is comparable to that of the Lee filter in terms of sensitivity and the NPV, but without compromising diagnostic accuracy, needing parameter tuning, or being time-consuming.
2. A low-parameter CNN-AE is able to achieve high performance and is more computationally efficient and cost-effective than more-complex models.
3. For models trained with US images with simulated added SN as the input, it is essential to test them with the original US images to evaluate their applicability in the real world.
When considering image quality improvements, researchers should focus on their benefits to clinical practice. Here, we demonstrated that, although the CNN-AE () achieved the best results in terms of the SSIM and PSNR, it did not display a clear benefit in real-world applicability. On the other hand, the CNN-AE trained with higher noise levels showed an increased capability to correctly identify malignant breast lesions, which is extremely important in clinical practice. As a matter of fact, the CNN-AE generally outperformed the classification model trained with the original US images in terms of both sensitivity and the NPV.
As a final note, the effort being put into SN removal might not necessarily add distinct value to clinical practice. In this medical domain, current studies have focused on removing simulated added SN, often overlooking the challenge of removing naturally occurring SN. Moreover, there is no focus on the relevance of removing such naturally occurring SN in terms of medical diagnosis. Our study showed that the SSIM and PSNR metrics do not directly translate into added clinical value. For that reason, future steps in this area of research should study the clinical impact of removing naturally occurring SN present in US images. Only by doing that will the DL models actually add value to real-world clinical practice.