Article

Physics-Based Practical Speckle Noise Modeling for Optical Coherence Tomography Image Denoising

1. Department of Precision Machinery and Instruments, University of Science and Technology of China, Hefei 230026, China
2. Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou 215123, China
3. Division of Life and Medicine, School of Biomedical Engineering, University of Science and Technology of China, Hefei 230026, China
* Authors to whom correspondence should be addressed.
Photonics 2024, 11(6), 569; https://doi.org/10.3390/photonics11060569
Submission received: 18 April 2024 / Revised: 4 June 2024 / Accepted: 5 June 2024 / Published: 17 June 2024
(This article belongs to the Section Biophotonics and Biomedical Optics)

Abstract:
Optical coherence tomography (OCT) has been extensively utilized in the field of biomedical imaging due to its non-invasive nature and its ability to provide high-resolution, in-depth imaging of biological tissues. However, the use of low-coherence light can lead to unintended interference phenomena within the sample, which inevitably introduces speckle noise into the imaging results. This type of noise often obscures key features in the image, thereby reducing the accuracy of medical diagnoses. Existing denoising algorithms, while removing noise, tend to also damage the structural details of the image, affecting the quality of diagnosis. To overcome this challenge, we propose a practical speckle noise (PSN) framework. The core of this framework is an innovative dual-module noise generator that decomposes the noise in OCT images into speckle noise and equipment noise, addressing each type independently. By integrating the physical properties of noise into the design of the noise generator and training it with unpaired data, we are able to synthesize realistic noise images that match clear images. These synthesized paired images are then used to train a denoiser to effectively denoise real OCT images. Our method has demonstrated its superiority on both private and public datasets, particularly in maintaining the integrity of the image structure. This study emphasizes the importance of considering the physical information of noise in denoising tasks, providing a new perspective and solution for enhancing OCT image denoising technology.

1. Introduction

With the continuous advancement of technology, the application of biomedical imaging techniques in clinical diagnosis is becoming increasingly common. Among various imaging techniques, optical coherence tomography (OCT) has gained wide recognition for its ability to non-invasively acquire cross-sectional images of biological tissues [1,2]. OCT utilizes the principle of low-coherence interferometry to emulate the imaging mechanism of ultrasound pulse echoes, capturing the light scattering signals of microstructures within biological tissues to generate high-resolution images. Low-coherence interferometry utilizes the interference signals from low-coherence light sources to measure the shape, thickness, and other parameters of objects. This technique achieves sub-micron-level longitudinal and transverse spatial resolutions and can detect extremely weak reflection signals with an intensity as low as 10^{-10} of the incident light power [3]. OCT plays an important role in the detection and diagnosis of various retinal diseases in ophthalmology, providing precise depth information and three-dimensional structural images for conditions such as glaucoma, macular degeneration, macular edema, and diabetic retinopathy. By accurately visualizing biological tissues, OCT greatly enhances the diagnostic capabilities for these common retinal pathologies [4,5,6,7]. In addition to detailed examination of the fundus, OCT can also provide precise three-dimensional structural images of the cornea [8], aiding in monitoring changes in corneal thickness and identifying abnormalities in corneal structure. OCT technology has also demonstrated unique value in assisting ophthalmic surgeries [9,10], providing real-time image guidance and navigation for surgeons. During procedures such as vitreoretinal surgery and cataract surgery, the high-resolution depth and three-dimensional images provided by OCT help surgeons precisely locate lesions [11], enabling accurate interventions.
Beyond ophthalmology, OCT technology also plays a role in the field of dermatology [12], aiding in the diagnosis and monitoring of skin diseases. It can provide high-resolution cross-sectional images of the skin [13], assisting physicians in observing skin structures, hair follicles, sweat glands, and evaluating skin lesions. OCT finds numerous applications in dentistry as well [14], enabling non-invasive detection of dental caries and cavities as well as precise diagnosis of pulpal and periodontal diseases [15]. The high-resolution images provided by OCT are crucial for assessing the health of teeth and periodontal tissues, facilitating comprehensive dental restoration and implant planning by dental professionals [16]. OCT also enables high-definition imaging of the esophagus, stomach, small intestine, large intestine, bile ducts, and pancreatic ducts, aiding in early diagnosis and accurate assessment of gastrointestinal diseases [17]. In addition to the field of biomedical imaging, OCT has also demonstrated its unique potential and value in various other fields. For example, in the domain of art preservation and restoration, OCT technology can be used to assess the integrity of artworks [18]. In materials science, OCT is employed for precise thickness measurement of material coatings [19], aiding in the identification of potential defects.
Researchers developing OCT imaging systems are continuously reducing the size and cost of OCT devices while improving imaging performance (speed, resolution, etc.) and flexibility, making them suitable for a wide range of applications. Handheld OCT devices have already been widely used in disease examinations [20]. This technology will find a broader user base and become an indispensable tool for high-resolution, non-invasive, and non-destructive imaging. Handheld probes based on dual-axis MEMS technology [21] have achieved undistorted OCT volume imaging without the need for post-processing. By optimizing control signals and using custom control boards, high-speed scanning and high-quality real-time imaging have been achieved, providing a new solution for portable, undistorted OCT imaging.
The design of OCT draws inspiration from the Michelson interferometer, utilizing the principle of low-coherence interference for imaging. The Michelson interferometer splits light into two paths, reflects each off a mirror, and recombines them to create an interference pattern, which is used for precise measurements of distance or refractive index. OCT constructs high-resolution images of the internal structure of samples by detecting the interference signal between the reflected light from the reference mirror and the scattered light from the sample [22]. However, when low-coherence light interacts with a rough surface of the sample, it gives rise to undesired interference phenomena among the scattered light, introducing speckle noise into the imaging results. Additionally, various types of noise are inevitably introduced in OCT due to factors such as sensor material properties, environmental conditions, and circuit design. During actual image acquisition, the noise becomes even more complex due to imperfections in equipment design, uncertainties in the environment, and the subject being imaged. The presence of noise significantly impacts image quality by reducing contrast, obscuring details of target structures, increasing the workload for medical diagnosis, and ultimately decreasing the accuracy of image-based disease diagnosis [23]. These noises also compromise the accuracy of quantitative analysis of the retina and the choroid, which exhibits a rich vascular structure [24,25]. Compared to the noise in traditional cameras, the noise in OCT is more complex and exhibits distinctive characteristics. Therefore, achieving effective and robust denoising of OCT images is of great significance.
The landscape of OCT image denoising has seen a variety of approaches. Some efforts focus on hardware improvements to reduce noise at its source, targeting detectors, scanners [26,27,28], and other components. However, such improvements offer limited suppression of the noise itself and are often impractical routes to substantial enhancement. Another line of research focuses on developing algorithmic solutions to effectively mitigate the effects of noise. This approach offers high portability and can be applied to various OCT systems through post-processing. Many traditional methods have been used for denoising, such as block-matching and 3D filtering (BM3D) [29] and non-local means (NLM) [30]. These traditional methods can lead to over-denoising, resulting in the removal of fine details [31].
In recent years, deep learning has demonstrated remarkable results in the field of image processing, including image segmentation, super-resolution, classification, and denoising [32,33,34]. Zhang et al. introduced DnCNN [35], which was the first to apply a CNN with global residual learning to image denoising, and it has demonstrated strong capabilities on supervised denoising problems. Recently, D. Hu et al. attempted to utilize diffusion models for denoising networks in OCT denoising [36]. However, these methods rely on supervised networks that require paired images for training, and obtaining paired data typically involves time-consuming multi-frame averaging in OCT denoising tasks. Because uncontrolled eye movement causes the captured fundus structure to vary with each acquisition, it is difficult to collect a large number of well-matched pairs. These data-related issues significantly limit the development of supervised approaches in OCT denoising tasks [37].
Unsupervised networks, which do not require paired data, hold great potential for OCT denoising tasks [38]. Huang et al. proposed an innovative denoising algorithm called DRGAN based on generative adversarial net (GAN) [39] methods [40]. This method utilizes a discriminator and a feature fusion mechanism to constrain the network, enabling the extraction of noise and clean features at the feature level. Meanwhile, Geng et al. also proposed a GAN-based denoising network called TCFL [41], which focuses on extracting noise and clean features at the image level. Both of these networks employ a strategy in the denoising process: combining the extracted noise features from the noisy images with the clean features to generate paired data, followed by training the denoising network using supervised learning methods. However, both of these methods share a common oversight regarding the correlation of the noise signals. Specifically, they do not fully consider the signal correlation of the noise, i.e., the noise characteristics can vary for different captured objects. Directly transplanting erroneous noise onto clean images for training the denoiser may result in the loss of structural information during the denoising process.
Noise synthesis-based denoising algorithms are cutting-edge approaches to denoising in the absence of paired data [42]. They address the shortage of paired data by employing unsupervised methods to generate data and subsequently training a denoiser on the generated data in a supervised manner. This approach can also be seen as a variant of the cycle-consistent generative adversarial network [43], in which noise injection and denoising are trained separately; this separation allows the denoising network to pay more attention to preserving fine feature details. Fang et al. utilized flow-based models for noise generation [44]. Taking the physical priors of noise into account during noise synthesis has proven to be immensely advantageous. For example, Wei et al. addressed noise generated in nighttime photography by constructing a noise synthesis model based on the physical principles of camera imaging [45]. Zhang et al. employed both physics-based and learning-based approaches to separately synthesize signal-dependent and signal-independent noise [46]. Moseley et al. combined the physical noise model of the camera with real noise samples and carefully selected training image scenes based on 3D ray tracing to generate high-fidelity training data [47]. The C2N method [48] utilizes a GAN-based framework to generate authentic noise and demonstrates outstanding performance in unsupervised noise synthesis tasks. However, there is currently a lack of noise synthesis-based denoising algorithms specifically designed for OCT noise, making it challenging to apply existing networks designed for other noise types to OCT.
In our research, we introduce a groundbreaking denoising algorithm that leverages the physical properties of OCT and the synthesis of noise, encapsulated within the practical speckle noise (PSN) framework. As depicted in Figure 1, our methodology sets itself apart from existing approaches by harnessing the intrinsic characteristics of pristine images and the inherent physical knowledge of OCT during the noise synthesis phase. This innovative strategy allows for more nuanced and effective noise management within the OCT imaging process. In this paper, we specifically term the noise present in OCT as “practical speckle noise”. This choice of terminology serves a dual purpose: first, to clearly distinguish it from other forms of speckle noise, and second, to address a common oversight in the field where “speckle noise” is sometimes used as a blanket term for all OCT noise, thereby overshadowing the existence and impact of other noise sources. By clearly delineating “practical speckle noise”, we enhance the precision of our understanding and treatment of noise in OCT imaging. Noise originating from other sources is, in this context, categorized as “device noise”. The PSN framework is designed to generate realistic OCT noise images that are paired with clean images, forming the basis for training our denoiser. Our work, therefore, not only contributes a novel perspective to OCT noise characterization but also presents a significant advancement in the practical application of denoising algorithms, promising to enhance the quality and reliability of OCT imaging across various applications.
In summary, the contributions of this paper are as follows:
  • We attempted to address the problem of OCT image denoising using a noise synthesis-based approach combined with deep learning. Our method successfully generated noise images that closely resembled the distribution of real noise while preserving the noise patterns consistent with the original clean images.
  • In the process of designing the denoising algorithm, we took into full consideration the physical characteristics of OCT and incorporated speckle simulation algorithms into the noise synthesizer. This innovative approach of combining speckle simulation with deep learning represents the first of its kind in current OCT denoising research.
  • We designed an innovative dual-module noise generator to specifically address speckle noise and device noise present in practical speckle noise.
  • Extensive experiments have shown that our proposed method has achieved state-of-the-art performance in the field of unpaired OCT image denoising, demonstrating its powerful ability to preserve structural details. Additionally, through ablation experiments, the importance of incorporating OCT physical information for improving denoising results has been further confirmed.

2. Materials and Methods

2.1. Method Background

In real-world scenarios, the generation of practical speckle noise is a highly complex process influenced by various device factors. Practical speckle noise can be considered as a form of multiplicative noise. However, by transforming the data into a logarithmic space, the multiplication operation can be simplified to addition. Therefore, in our work, we simplify OCT noise images I as a linear combination of a clean image C and practical speckle noise N. This can then be expressed as I = C + N .
Practical speckle noise is inherently complex. During the imaging process, coherent light is emitted from the device, illuminating the target object; the reflected light is subsequently received and processed. At each stage, noise is introduced. The illumination of the target by coherent light generates speckle noise, which is a consequence of the coherent nature of the light. Additionally, various environmental and device-related factors contribute to additional noise during both the emission and reception of light. In summary, OCT noise encompasses a combination of speckle noise and device-related noise sources. Therefore, in this context, we consider the practical speckle noise N as the superposition of device noise G and speckle noise S. This can then be expressed as N = G + S .
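As a toy illustration of this additive decomposition (a minimal numpy sketch with synthetic stand-in arrays, not actual OCT data), the relations I = C + N and N = G + S can be checked directly:

```python
import numpy as np

# Multiplicative speckle becomes additive after a log transform, so a
# noisy OCT image I can be written as I = C + N with N = G + S.
# The arrays below are hypothetical stand-ins for illustration only.
rng = np.random.default_rng(0)

clean = np.full((8, 8), 2.0)                 # stand-in for the clean image C (log domain)
speckle = 0.1 * rng.standard_normal((8, 8))  # stand-in for speckle noise S
device = 0.05 * rng.standard_normal((8, 8))  # stand-in for device noise G

practical_speckle_noise = device + speckle   # N = G + S
noisy = clean + practical_speckle_noise      # I = C + N

# Under this additive model, the noise is exactly recoverable from I and C
assert np.allclose(noisy - clean, practical_speckle_noise)
```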
Based on this noise model assumption, we have developed a denoising algorithm called the practical speckle noise (PSN) framework, which incorporates both OCT physical information and synthesized noise. Figure 2 showcases the complete algorithm framework of PSN and our innovative dual-module noise generator.

2.2. Train the Noise Generator Using Unpaired Images

We train our noise generator G using a GAN framework, as shown in Figure 2a. The generator G takes real clean images as input and generates noise related to them. Practical speckle noise is then added to the clean images to synthesize noisy images. A discriminator D is used to distinguish between real noisy images and synthesized noisy images.
Our noise generator G generates corresponding practical speckle noise n based on real clean images c. The clean image c, when combined with practical speckle noise n, results in the corresponding synthesized noise image y.
y = c + n = c + G(c)
Concurrently, we trained a discriminator network D to differentiate whether a given noisy image is synthesized by our generator G or sampled from an authentic dataset. The objective of training the noise generator G is to deceive the discriminator D, rendering it incapable of distinguishing between the synthetic noise images and the genuine noise images. This forms an adversarial learning process that trains the noise generator to produce realistic noise images: the more realistic the noise images produced by G, the stronger D must become to tell them apart. The two networks, G and D, can be adversarially optimized using the Wasserstein distance with a gradient penalty loss, as follows [49]:
L_adv(D, G) = E_{y′∼P_N}[D(y′)] − E_{x∼P_C}[D(x + G(x))] + λ E_{x_σ∼P_σ}[(‖∇_{x_σ} D(x_σ)‖_2 − 1)^2]
In this context, the training objective of the generator G is to minimize the overall loss, while the training objective of the discriminator D is to maximize it. P_N and P_C represent the collection of real noisy images and the collection of real clean images, respectively. The real noisy image y′ is sampled from the collection of real noisy images; the prime symbol indicates that this image is not paired with the real clean image x. To ensure more stable training of the network, we employ a gradient penalty term, with λ set to 10; x_σ ∼ P_σ is an interpolated point between a generated image and a real image. We also incorporate the L_std loss to mitigate the potential problem of color bias in the generated noise. This loss term is defined as follows:
L_std = (1/N) ‖Σ_{i∈P} n̂_i‖_1
N represents the number of pixels, P represents the set of positions in the generated noise, and n̂_i represents the noise value at position i. Minimizing this loss encourages the generated noise to have a global mean of 0, making GAN training more stable and limiting color shifts in the generated images. We only restrict the global mean to be close to 0; the average of the noise in local regions can still deviate from 0, depending on the underlying signal. Combining our two losses, the total optimization loss is:
L_G = L_adv + w_std · L_std
We set w_std = 0.01 during the experiment.
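The zero-mean regularizer can be illustrated in isolation. Below is a minimal numpy sketch of a loss of this form (the GAN networks themselves are omitted, and the noise arrays are synthetic stand-ins):

```python
import numpy as np

def l_std(noise: np.ndarray) -> float:
    # Absolute value of the global mean of the generated noise:
    # zero when the noise has no global color/brightness shift.
    return abs(noise.sum()) / noise.size

rng = np.random.default_rng(1)
zero_mean = rng.standard_normal((64, 64))
zero_mean -= zero_mean.mean()   # force an exactly zero global mean
biased = zero_mean + 0.5        # inject a global color shift

# The regularizer vanishes on zero-mean noise and grows with the shift
assert l_std(zero_mean) < 1e-12
assert abs(l_std(biased) - 0.5) < 1e-12
```

Note that this penalizes only the global mean: local patches may still have a nonzero average, which is what allows the generated noise to remain signal-dependent.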

2.3. The Noise Generator

Based on the noise formation process in OCT, we have developed an innovative dual-module noise generator G. Our noise generator G integrates the physical priors of the noise, improving the accuracy and authenticity of the generated noise, as shown in Figure 2c. Our noise generator, denoted as G, is primarily composed of two distinct modules: the Speckle Noise Synthesis Module and the Device Noise Synthesis Module, each carefully designed to serve its specific purpose within the system. The Speckle Noise Synthesis Module primarily generates noise originating from laser interference. The Device Noise Synthesis Module primarily generates the inherent noise or variations introduced by electronic devices or components during the signal acquisition or processing process.
Speckle Noise Synthesis Module. We incorporate the physical priors of speckle noise. When coherent light illuminates a randomly rough surface, it generates speckle noise that can be simulated using mathematical models; the simulation algorithm is illustrated in Algorithm 1. We integrate the simulated speckle patterns into the network. This module begins by extracting features from clean images using a feature extractor, which consists of a 1 × 1 convolutional layer and five residual blocks. The simulated speckle patterns generated by the simulation algorithm are fused with the extracted features using a spatial attention mechanism. Specifically, the extracted features undergo a sigmoid operation to obtain an attention matrix; multiplying this attention matrix with the speckle patterns yields the initial noise mapping. This step aligns the generated noise with the features of the clean image. The initial mapping is further adjusted using a multi-level feature adjuster f_dm, which consists of two branches. One branch consists of two modified residual blocks [50] with a 1 × 1 convolutional layer, primarily adjusting the noise features at the pixel level. The other branch consists of three residual blocks with a convolutional kernel size of 3, mainly implementing spatially correlated feature adjustments. To enhance the training stability and the noise generation capability of the network, we also feed the unmodified algorithm-simulated speckle images in directly as an initial noise mapping. Subsequently, we utilize a multi-level feature adjuster called f_im [48], which consists of three residual blocks with a 1 × 1 convolutional layer, to optimize the noise features.
Device Noise Synthesis Module. We intend for this module to synthesize device noise. However, the formation of device noise is a highly intricate and stochastic process, stemming from various internal components as well as environmental factors. In light of this complexity, we employ heteroscedastic Gaussian noise as a physical prior for modeling device noise and consider its inherent connection to the features of the clean image. First, a feature extractor, consisting of a 1 × 1 convolutional layer and five residual blocks without relying on any pre-trained models, extracts features from the clean image that parameterize the mean and variance of a Gaussian distribution. Subsequently, sampling is performed at each position to obtain the initial noise mapping, representing the heteroscedastic Gaussian noise. The initial mapping is further refined through a multi-level feature adjuster known as f_im. Similarly, we directly employ Gaussian noise as an initial noise mapping, which is then subject to feature adjustments using the multi-level feature adjuster f_dm. It is important to note that f_im and f_dm in the Device Noise Synthesis Module are structurally consistent with those in the Speckle Noise Synthesis Module, but they do not share weights [48].
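The per-position sampling step of this module can be sketched as follows, assuming hypothetical per-pixel mean and standard-deviation maps in place of the ones the feature extractor would actually predict:

```python
import numpy as np

# Sketch of heteroscedastic Gaussian sampling. `mu` and `sigma` below are
# hand-made stand-ins for the per-pixel maps a feature extractor would
# predict from the clean image; the real module learns them end to end.
rng = np.random.default_rng(42)

clean = np.linspace(0.0, 1.0, 16).reshape(4, 4)  # toy clean image
mu = np.zeros_like(clean)                        # assumed per-pixel mean
sigma = 0.01 + 0.1 * clean                       # noise scale grows with signal

# Reparameterized sampling: noise = mu + sigma * eps with eps ~ N(0, 1),
# which keeps the sampling step differentiable in a trained network.
eps = rng.standard_normal(clean.shape)
device_noise = mu + sigma * eps

assert device_noise.shape == clean.shape
```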
In the final step, we combine all the noise features that have been adjusted by f i m and f d m , resulting in a comprehensive noise feature. This feature effectively captures the noise characteristics expressed by each module. To ensure compatibility with the target clean image, we apply a 1 × 1 convolution to the final noise feature for the last scaling adjustment. By directly superimposing the processed noise feature onto the clean image, we successfully generate the final synthesized noise map. In this way, we can derive the following equation.
y = c + S(c) + D(c)
Here, y represents the synthesized noise image, c denotes the clear image, S stands for the Speckle Noise Synthesis Module, and D signifies the Device Noise Synthesis Module. This is a more complete noise synthesis process of our method.
Algorithm 1 Speckle Simulation
Require: image size L and mask diameter D
 1: Initialize an empty matrix E with size L × L
 2: for h = 0 to L do
 3:   for w = 0 to L do
 4:     s_hw ← sample from the uniform distribution U(0, 1)
 5:     e_hw ← exp{i · 2π · s_hw}
 6:     if distance((h, w), (L/2, L/2)) > D/2 then
 7:       e_hw ← 0
 8:     end if
 9:     Assign e_hw to the (h, w) entry of matrix E
10:   end for
11: end for
12: I ← |FFT(E)|^2
13: return speckle image I

2.4. Simulation of Speckle Image

When describing light waves in terms of probability amplitude, speckle can be explained using the following formula:
I(x, y) = I_0 · |E(x, y)|^2
Here, I(x, y) represents the speckle intensity received at position (x, y), and I_0 is the intensity of the incident light. E(x, y) corresponds to the complex amplitude of the electric field at position (x, y).
For a simplified speckle model, we assume that the light wave is a plane wave and neglect spatial variations in phase. In this case, we express the complex amplitude of the electric field as:
E(x, y) = A · exp(iϕ)
Here, A represents the amplitude, and ϕ represents the phase. When the light wave interacts with an obstruction or scattering object, speckle patterns are formed. In the region of speckle, there are spatial variations in the amplitude and phase of the light wave. To simulate this effect, we introduce a random phase factor to describe the formation of speckle:
E(x, y) = A · exp(iϕ) · exp(i · 2π · s_xy)
Here, s_xy is a randomly sampled value from a uniform distribution on the interval [0, 1] and represents the random phase of the speckle.
Finally, we compute the speckle intensity I(x, y), which corresponds to the received light intensity, by taking the squared modulus of the complex amplitude:
I(x, y) = I_0 · |E(x, y)|^2 = I_0 · |A · exp(iϕ) · exp(i · 2π · s_xy)|^2
Using this approach, we can generate speckle patterns by randomly sampling the phase factor s_xy and specifying the amplitude A and incident light intensity I_0. To facilitate the simulation, we set both I_0 and A · exp(iϕ) to 1.
The Speckle Noise Synthesis Module uses a speckle simulation algorithm to generate speckle images. Based on the physical phenomena and characteristic features of speckles, various simulation algorithms have been developed; among them, we have adopted one of the most extensively utilized methods. The simulation algorithm we employ is illustrated in Algorithm 1, and the specific process of image synthesis is depicted in Figure 3. In this simulation, a square matrix of size L × L is created and filled with unit-amplitude complex numbers whose phases are sampled uniformly from (0, 1) (scaled by 2π). To generate the speckle pattern, the points in the L × L array located at a distance greater than D/2 from the center are set to zero, leaving a circular region with a diameter of D. The array then undergoes an inverse Fourier transform, and each point in the transformed array is multiplied by its complex conjugate. This process yields the synthesized speckle image. It is important to note that, according to the Fourier transform theorem, the position of the circular region within the larger array is irrelevant, while the ratio of L to D determines the minimum size of the speckle. As an example, if L/D = 2, the Nyquist criterion is satisfied, resulting in a minimum speckle size of 2 pixels in width. This algorithm employs the fast Fourier transform (FFT) to achieve a many-to-one mapping, simulating the observed laser speckle pattern. Finally, squaring the magnitude of the Fourier transform result yields the final speckle image [51].
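A runnable sketch of this simulation, following the steps of Algorithm 1 (the size and diameter values here are arbitrary illustrative choices satisfying L/D = 2):

```python
import numpy as np

def simulate_speckle(L: int = 256, D: int = 128, seed: int = 0) -> np.ndarray:
    """Simulate a speckle image of size L x L from a circular mask of diameter D."""
    rng = np.random.default_rng(seed)
    # Unit-amplitude phasors e_hw = exp(i * 2*pi * s_hw), s_hw ~ U(0, 1)
    s = rng.uniform(0.0, 1.0, size=(L, L))
    E = np.exp(1j * 2 * np.pi * s)
    # Zero out everything farther than D/2 from the center (circular mask)
    h, w = np.mgrid[0:L, 0:L]
    E[np.hypot(h - L / 2, w - L / 2) > D / 2] = 0
    # Fourier transform and squared modulus give the speckle intensity
    return np.abs(np.fft.fft2(E)) ** 2

speckle = simulate_speckle()
assert speckle.shape == (256, 256)
assert np.all(speckle >= 0)   # intensities are non-negative by construction
```

With L/D = 2 as above, the minimum speckle size is about 2 pixels; whether the forward or inverse FFT is used does not change the speckle statistics.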

2.5. Train the Denoising Model Using Paired Images

The data generated by the trained noise generator can be directly used to optimize the supervised denoising network F, as shown in Figure 2b. We start by generating pseudo-noisy images y from clean samples x and then use these pairs to train the denoising model in a supervised manner.
Based on past experience [35,52,53], we minimize the L_2 reconstruction loss, which is defined as:
L_2 = (1/m) Σ_{k=1}^{m} ‖F(ŷ_k) − x_k‖^2
Here, ŷ represents the noise image we generated, ŷ_k = x_k + G(x_k); x is a real clean image, and m is the batch size. G represents the dual-module noise generator that we proposed, and k is the index of each image in a mini-batch B of size m.
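This training objective can be sketched as follows, with an identity placeholder standing in for the DnCNN denoiser F and synthetic arrays standing in for the generated pairs:

```python
import numpy as np

def l2_loss(denoised: np.ndarray, clean: np.ndarray) -> float:
    # Mean over the batch of per-image squared L2 distances:
    # L_2 = (1/m) * sum_k ||F(y_k) - x_k||^2
    m = denoised.shape[0]
    return float(((denoised - clean) ** 2).reshape(m, -1).sum(axis=1).mean())

rng = np.random.default_rng(7)
x = rng.random((4, 32, 32))                     # batch of clean images x_k
y = x + 0.1 * rng.standard_normal(x.shape)      # pseudo-noisy images y_k = x_k + G(x_k)

denoise = lambda img: img                       # identity placeholder for F
loss = l2_loss(denoise(y), x)
assert loss > 0.0   # nonzero because the placeholder removes no noise
```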

3. Results

3.1. Experimental Preparation

Dataset: To train and evaluate our noise synthesis-based denoising algorithm, we created a proprietary dataset for OCT image denoising. We utilized a custom spectral-domain OCT (SDOCT) system to acquire multiple frames of B-scan OCT images in a consistent scanning direction. These acquired images underwent a careful selection process, where only those with minimal misalignment of tissue structures were retained. We then performed an averaging operation to obtain the corresponding clean images, keeping only the noisy image closest to each clean image in the final dataset. In total, we curated 175 pairs of noisy and clean images. To enhance the diversity of noise patterns in the dataset, we introduced an additional set of 400 unrelated noisy images. These images were initially sized at 360 × 512. Moreover, to augment the authenticity of the dataset, we acquired 60 true clean images using an SVision brand swept-source OCT device. These images were captured with an extended exposure time to enhance clarity, making them a faithful representation of clean images. However, these 60 images had a larger size of 2635 × 1112, significantly exceeding 360 × 512, so we cropped and resized them, yielding 300 true clean images sized at 360 × 512. The paired set of 175 images was exclusively reserved for evaluating the denoising efficacy of our method. Additionally, we evaluated the generalization performance of our trained model by testing it on the publicly available PKU37 dataset [41]. This dataset consists of 33 clean images, each paired with 50 noisy images; the clean images are obtained by averaging the corresponding 50 noisy images, so the dataset comprises a total of 1650 noisy images.
Implementation Details and Optimization: To train our noise generator, we constructed batches of size 4. We utilized the Adam optimizer [54] with an initial learning rate of 10^{-4}. The training of the noise generator extended beyond 60 epochs. For the denoising model, we employed a DnCNN architecture with 17 layers [35] and trained it using images generated by the pre-trained noise generator. During the training of the denoiser, we divided the images into 30 smaller image patches and set the batch size to 16. The Adam optimizer with an initial learning rate of 10^{-4} was used. The training process spanned 50 epochs and lasted approximately one hour. All our experiments were conducted on an NVIDIA GTX 1080Ti GPU, highlighting the efficiency and simplicity of our algorithm design.
Quantitative Measurements: For the quantitative comparison of denoising performance, we employed three supervised (full-reference) metrics. In addition to the commonly used peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) [54], we introduced the gradient conduction mean square error (GCMSE) [55]. A higher PSNR indicates a smaller divergence between the reconstructed and original images. SSIM ranges from 0 to 1, with 1 denoting identical images, so a higher SSIM reflects a closer resemblance between the two. PSNR is driven primarily by the mean squared error of the image, whereas SSIM jointly accounts for luminance, contrast, and structural information. GCMSE focuses specifically on edge and boundary details, and a smaller value indicates better preservation of structural detail by the denoising algorithm. Together, these metrics quantify the quality, similarity, and differences of the reconstructions, enabling the evaluation and comparison of performance among denoising algorithms.
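PSNR follows directly from its definition; the gradient-based error below is a simplified edge-sensitive stand-in for GCMSE, since the full gradient-conduction weighting of [55] is not reproduced here:

```python
import numpy as np

def psnr(ref: np.ndarray, test: np.ndarray, data_range: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; higher means the
    reconstruction diverges less from the reference."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def gradient_mse(ref: np.ndarray, test: np.ndarray) -> float:
    """Simplified edge-sensitive error: MSE of the image gradients.
    A rough proxy for GCMSE's focus on edges and boundaries."""
    gy_r, gx_r = np.gradient(ref.astype(np.float64))
    gy_t, gx_t = np.gradient(test.astype(np.float64))
    return float(np.mean((gy_r - gy_t) ** 2 + (gx_r - gx_t) ** 2))
```

SSIM requires local luminance, contrast, and structure statistics and is best taken from an existing implementation rather than re-derived here.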

3.2. Comparison of Synthesized Noise with Other Noise Synthesis Models

To validate the effectiveness of our noise synthesis approach, we compared it against several other noise synthesis methods: the Gaussian noise model, C2N [48], TCFL [41], and DRGAN [40]. The Gaussian noise model, being a traditional approach, requires no data-specific training, whereas C2N, TCFL, DRGAN, and our method were all trained on our curated dataset. In addition, as a direct baseline, we superimposed two kinds of noise, Gaussian noise and speckle simulation images, onto the clean images.
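The two superimposed baselines can be sketched as follows; the Gaussian sigma and the gamma-distributed multiplicative field (a standard fully developed L-look speckle approximation) are illustrative choices, not the paper's exact simulation:

```python
import numpy as np

def add_gaussian(clean: np.ndarray, sigma: float,
                 rng: np.random.Generator) -> np.ndarray:
    """Additive white Gaussian noise baseline."""
    return clean + rng.normal(0.0, sigma, clean.shape)

def add_speckle(clean: np.ndarray, looks: int,
                rng: np.random.Generator) -> np.ndarray:
    """Multiplicative speckle baseline: intensity scaled by a
    gamma-distributed field with unit mean (L-look model)."""
    multiplier = rng.gamma(looks, 1.0 / looks, clean.shape)
    return clean * multiplier
```

Note that the speckle baseline is signal-dependent (noise scales with intensity), whereas the Gaussian baseline is not; this is exactly the distinction visible in Figure 4b.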
Figure 4 presents the results of the noise synthesis experiments. Figure 4a shows the clean image used as input, the subsequent panels show the outputs of the various noise synthesis algorithms applied to it, and Figure 4h shows a randomly chosen real noise image for comparison. Figure 4b clearly demonstrates the noticeable difference between Gaussian noise and practical speckle noise. Figure 4c shows the output of C2N, which exhibits commendable noise synthesis ability but falls short of capturing the finer characteristics of practical speckle noise; this limitation arises because C2N was designed primarily for camera images rather than specifically for OCT imaging. Figure 4e and Figure 4f depict the noise synthesis results of DRGAN and TCFL, respectively. The major limitation in both cases is the clear misalignment between the noise characteristics and the signal features; this is particularly evident in Figure 4f, where the background noise shows obvious feature misalignment. Figure 4g shows the result of our method, which resembles the characteristics of real noise more closely than any of the other methods.

3.3. Comparison of Denoising Results with Other OCT Denoising Methods

The purpose of synthesizing practical speckle noise in our model is denoising; achieving better denoising results is therefore the primary evaluation criterion. In our evaluation, we compared traditional methods, state-of-the-art deep learning-based OCT denoising algorithms, and noise synthesis methods, including NLM [30], BM3D [29], DRGAN [40], TCFL [41], additive white Gaussian noise (AWGN), and C2N [48]. We first trained and tested the models on our curated dataset. The AWGN, C2N, and our noise synthesis models all employ the same DnCNN [35] as the denoising network; we trained this denoiser on the generated paired data, and its outputs serve as the denoising results of the corresponding noise synthesis models. Each method was used to denoise the 175 real noisy images in our dataset that have corresponding clean images, after which quantitative metrics were computed to evaluate performance.
Table 1 presents the quantitative denoising results. It is evident that the denoising algorithm proposed in this paper surpasses other denoising algorithms in all three mainstream metrics, demonstrating its superior performance. This indicates that our approach excels in effectively eliminating noise, especially background noise. In comparison to the C2N network, our approach has successfully attained a remarkable PSNR score of 38.504, marking a substantial enhancement of 7.47%. In terms of SSIM, it has reached an impressive 0.954, reflecting a notable improvement of 2.80%. Furthermore, it has demonstrated exceptional performance by achieving a GCMSE value of 0.899, which corresponds to a significant reduction of 31.8%. These advancements underscore the effectiveness of our method in enhancing image denoising outcomes.
The visual results are depicted in Figure 5. Figure 5a shows a real noisy image used for denoising: the retinal structure is severely degraded by noise and difficult to discern, and the background is heavily contaminated. Figure 5i is the ground truth obtained by averaging multiple noisy images, with red and green arrows indicating the preserved details in the retinal image. Figure 5b shows the denoising result of BM3D; even with fine-tuned parameters, BM3D removes the noise but erases the retinal structure along with it, rendering the result unacceptable. Figure 5c shows the result of NLM, which not only fails to remove noise effectively but also introduces new noise artifacts. Figure 5d shows the result of training on data synthesized with a Gaussian noise model; owing to the significant disparity between Gaussian noise and real noise, the trained denoiser performs poorly, removing neither background nor structural noise adequately. Figure 5e shows the result of C2N; although promising, residual background noise remains and the structure becomes blurred because the synthesized noise still differs from real noise. Figure 5f shows the result of DRGAN; its use of multiple discriminators tends to over-smooth the image, losing crucial fine details. Preserving fine structures is vital in medical imaging, as even small structures carry diagnostic significance. Figure 5g shows the result of TCFL, which removes background noise well but performs poorly on fine structures: its synthesized noise often exhibits structural deviations, making it less effective on structures that deviate from the noise distribution.
Finally, Figure 5h presents the denoising outcome using our proposed PSN method, which achieves the best results in both background noise removal and preservation of detailed structures.
To validate the generalization of our method, we tested the model trained on our dataset on the PKU37 dataset; the quantitative results are shown in Table 2. Despite some domain differences between our dataset and PKU37, our method outperforms TCFL in terms of PSNR, SSIM, and GCMSE. Notably, TCFL was trained directly on PKU37, and its metrics are cited from the TCFL paper; PKU37 itself does not contain enough data to train our method, and its clean images are obtained by averaging the noisy ones, which yields paired noisy-clean images. The improvement is particularly pronounced for SSIM. We attribute this primarily to our PSN, which generates practical speckle noise images paired with real clean images; the generated noise closely resembles actual noise, allowing better preservation of the underlying structure. Compared with TCFL, our approach achieves a PSNR of 30.490 (a 3.64% increase), an SSIM of 0.794 (a substantial 17.95% boost), and a GCMSE of 6.922 (a significant 10.48% reduction). In summary, our method denoises effectively while preserving detail, and it demonstrates good generalization even when tested on a different dataset.

3.4. Ablation Experiments

In this study, we investigated the impact of incorporating two types of speckle priors into the network. We conducted ablation experiments targeting the Speckle Noise Synthesis Module only, excluding the Device Noise Synthesis Module, since our focus was the influence of speckle physical priors on the noise synthesis process and there is already extensive research on simulating device noise with Gaussian models. For clarity, let us denote the two branches of the speckle noise module as Speckle-I and Speckle-D: the Speckle-I branch exclusively utilizes speckle images for the initial noise mapping, while the Speckle-D branch serves a different purpose. We therefore conducted two ablation experiments: in the first, we removed the entire Speckle Noise Synthesis Module from the noise generator, retaining only the Device Noise Synthesis Module, and retrained and retested the model; in the second, we removed only the Speckle-D branch while keeping the Device Noise Synthesis Module and the Speckle-I branch intact, again followed by retraining and testing.
Quantitative results are presented in Table 3. By incorporating the Speckle-I module into the base generator structure, we observed improvements of 1.12% and 1.83% in the PSNR and SSIM metrics, respectively. Building upon this, the addition of the Speckle-D structure further enhanced the performance, resulting in a significant 6.27% improvement in PSNR and a notable 0.95% improvement in SSIM.
Visual comparisons of the results are shown in Figure 6. Gradually incorporating the Speckle Noise Synthesis Module resulted in a reduction of artifacts introduced by the denoising process, leading to enhanced visualization of retinal structures in the image. The ablation experiments unequivocally validate the crucial and beneficial influence of incorporating physical information on denoising in our proposed algorithm.

4. Discussion

In this study, we propose a denoising algorithm based on OCT physical information and noise synthesis, referred to as the Practical Speckle Noise framework, to achieve better structure-preserving denoising of OCT images. We conducted comparative experiments on our dataset and the PKU37 dataset, comparing three mainstream metrics: PSNR, SSIM, and GCMSE, to validate the advanced denoising performance of our algorithm. Subsequently, we performed ablation experiments to confirm the significant positive impact of incorporating physical information on the denoising results.
In our experiments, we initially tested traditional algorithms and GAN-based OCT denoising algorithms, which have been commonly used in the past. The traditional algorithms evaluated for denoising performance were NLM and BM3D. BM3D outperformed NLM on both datasets and achieved reasonably good results. However, it still had some limitations in terms of both metrics and visual effects. BM3D did not completely remove the noise and could introduce structural artifacts. DRGAN and TCFL are advanced GAN-based denoising algorithms specifically designed for OCT. They both incorporate a noise synthesis approach into their network design by extracting noise from noisy images and adding it to clear images to create paired training data. However, this noise synthesis approach overlooks the signal correlation of the noise. This can be observed in Figure 4, where the noise synthesis images generated by DRGAN and TCFL exhibit noticeable misalignment of noise patterns. Such improper noise synthesis directly affects the final denoising results. While these methods perform well in background denoising, as directly transplanting noise onto the background is reasonable, they introduce significant structural artifacts in structural denoising. The noise in structural regions is highly correlated with the structures, and direct transplantation is an incorrect approach.
Noise synthesis-based denoising algorithms instead synthesize noise on the clean image in a more principled manner, generating noise that fully reflects the characteristics of that image. In addition to our proposed algorithm, we tested two other noise synthesis-based denoising approaches: AWGN and C2N. The results of C2N are clearly superior to those of AWGN, because C2N is explicitly trained to capture the characteristics of practical speckle noise. Nevertheless, C2N still struggles to synthesize practical speckle noise, particularly with respect to background noise removal and structure preservation. In contrast, our method, which is built on the noise formation process in OCT and incorporates the relevant physical information, outperforms both models: it produces superior denoising results, effectively preserves fine details, and remains sensitive to small structures, striving to retain potential structural features, which further validates its effectiveness.
Our proposed method integrates the physical prior information of OCT noise into the noise synthesis process, ensuring that the synthesized noise closely mirrors real-world conditions. Practical speckle noise, a common occurrence in OCT images, is influenced by factors such as sensor noise, the coherence of the light source, and the scattering properties of the sample. By simulating these physical processes, our noise synthesizer generates noise that aligns with the noise actually observed in OCT images, providing a more robust foundation for subsequent image denoising tasks.
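One classical way to simulate fully developed speckle, in the spirit of the pupil-plane methods surveyed by Duncan and Kirkpatrick [51], is to apply a uniform random phase inside a circular aperture and take the intensity of its inverse Fourier transform; the aperture radius below is an illustrative parameter, not a value from the paper:

```python
import numpy as np

def simulate_speckle(n: int = 256, aperture_radius: int = 32,
                     seed: int = 0) -> np.ndarray:
    """Fully developed speckle from a random-phase pupil:
    uniform random phase inside a circular aperture, inverse FFT,
    intensity = |field|^2, normalized to unit mean."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
    pupil = (x ** 2 + y ** 2 <= aperture_radius ** 2).astype(float)
    phase = rng.uniform(0.0, 2.0 * np.pi, (n, n))
    field = np.fft.ifft2(np.fft.ifftshift(pupil * np.exp(1j * phase)))
    intensity = np.abs(field) ** 2
    return intensity / intensity.mean()
```

The aperture radius controls the speckle grain size (larger aperture, finer speckle), and the resulting pattern has a speckle contrast (std/mean) near 1, as expected for fully developed speckle.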
We conducted a series of ablation experiments to validate the effectiveness of the physical information we incorporated and the method of its integration. As demonstrated in Table 3 and Figure 6, the results clearly show that as the unique physical information of OCT is gradually stripped from the noise synthesizer, the denoising performance correspondingly deteriorates. This indicates that the rational integration of OCT’s physical information into the network is crucial for generating realistic OCT noise, and the more accurate the synthesized noise is, the better the denoising results will be. This further confirms the pivotal role of physical information in enhancing the performance of denoising algorithms.
Ultimately, our research holds significant clinical implications for noise processing in OCT images. By improving image quality and accuracy, we can assist clinicians in more accurately diagnosing and treating eye diseases, thereby enhancing the quality of life for patients. Moreover, our method provides new insights and technical support for future medical research and clinical applications based on OCT images, potentially driving advancements in the field.

5. Conclusions

In this study, we aimed to achieve better structure preservation in denoising results and proposed a novel framework for denoising based on OCT physical information and noise synthesis, named the Practical Speckle Noise (PSN) framework. Our framework takes into account the noise formation process in OCT and integrates physical prior information into the noise generator, introducing an innovative dual-module noise synthesizer that enhances its capability to synthesize practical speckle noise. The uniqueness of our work lies in the combination of the mathematical simulation of speckles with deep learning networks, an unexplored territory until now. Moreover, the approach of separately handling OCT noise into speckle noise and equipment noise is also a pioneering attempt. Our noise generator is capable of learning the distribution of practical speckle noise in an unpaired setting, enabling the synthesis of realistic noise images. By training the denoiser with the generated noise and paired clear images, we achieved denoising performance superior to existing unsupervised methods, particularly in preserving fine structural details. Notably, our method not only excels on our dataset but also demonstrates excellent generalization and effectiveness on other public datasets, proving its universality and practicality. Ablation experiments have proven that incorporating physical prior information into noise modeling is a significant breakthrough in addressing the challenge of practical speckle noise removal. Our method opens up new directions for future research. It encourages the integration of physical prior information in other image denoising tasks and the combination of mathematical simulations with deep learning algorithms. The promising denoising results we obtained provide compelling evidence of the advantages of integrating physical information into medical image denoising. This breakthrough paves the way for new possibilities in the field of medical image processing.

Author Contributions

Conceptualization, L.Y. and D.W.; methodology, L.Y., D.W. and W.G.; software, L.Y.; validation, L.Y. and W.G.; formal analysis, L.Y. and D.W.; investigation, W.G.; resources, D.W.; data curation, D.W. and W.G.; writing—original draft preparation, L.Y.; writing—review and editing, M.S. and R.X.X.; visualization, M.S.; supervision, M.S. and R.X.X.; project administration, M.S. and R.X.X.; funding acquisition, M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of Jiangsu Province grant number BK20231214; the National Key Research and Development Program of China grant number 2022YFA1104803.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Aumann, S.; Donner, S.; Fischer, J.; Müller, F. Optical coherence tomography (OCT): Principle and technical realization. In High Resolution Imaging in Microscopy and Ophthalmology: New Frontiers in Biomedical Optics; Springer: Berlin/Heidelberg, Germany, 2019; pp. 59–85. [Google Scholar]
  2. Drexler, W.; Liu, M.; Kumar, A.; Kamali, T.; Unterhuber, A.; Leitgeb, R.A. Optical coherence tomography today: Speed, contrast, and multimodality. J. Biomed. Opt. 2014, 19, 071412. [Google Scholar] [CrossRef]
  3. Huang, D.; Swanson, E.A.; Lin, C.P.; Schuman, J.S.; Stinson, W.G.; Chang, W.; Hee, M.R.; Flotte, T.; Gregory, K.; Puliafito, C.A.; et al. Optical coherence tomography. Science 1991, 254, 1178–1181. [Google Scholar] [CrossRef]
  4. Schmitt, J.M. Optical coherence tomography (OCT): A review. IEEE J. Sel. Top. Quantum Electron. 1999, 5, 1205–1215. [Google Scholar] [CrossRef]
  5. Puliafito, C.A.; Hee, M.R.; Lin, C.P.; Reichel, E.; Schuman, J.S.; Duker, J.S.; Izatt, J.A.; Swanson, E.A.; Fujimoto, J.G. Imaging of macular diseases with optical coherence tomography. Ophthalmology 1995, 102, 217–229. [Google Scholar] [CrossRef]
  6. Drexler, W.; Morgner, U.; Ghanta, R.K.; Kärtner, F.X.; Schuman, J.S.; Fujimoto, J.G. Ultrahigh-resolution ophthalmic optical coherence tomography. Nat. Med. 2001, 7, 502–507. [Google Scholar] [CrossRef] [PubMed]
  7. Adhi, M.; Duker, J.S. Optical coherence tomography–current and future applications. Curr. Opin. Ophthalmol. 2013, 24, 213–221. [Google Scholar] [CrossRef]
  8. Maeda, N. Optical coherence tomography for corneal diseases. Eye Contact Lens 2010, 36, 254–259. [Google Scholar] [CrossRef]
  9. Ahronovich, E.Z.; Simaan, N.; Joos, K.M. A review of robotic and OCT-aided systems for vitreoretinal surgery. Adv. Ther. 2021, 38, 2114–2129. [Google Scholar] [CrossRef] [PubMed]
  10. El-Haddad, M.T.; Tao, Y.K. Advances in intraoperative optical coherence tomography for surgical guidance. Curr. Opin. Biomed. Eng. 2017, 3, 37–48. [Google Scholar] [CrossRef]
  11. Nguyen, P.; Chopra, V. Applications of optical coherence tomography in cataract surgery. Curr. Opin. Ophthalmol. 2013, 24, 47–52. [Google Scholar] [CrossRef]
  12. Gambichler, T.; Jaedicke, V.; Terras, S. Optical coherence tomography in dermatology: Technical and clinical aspects. Arch. Dermatol. Res. 2011, 303, 457–473. [Google Scholar] [CrossRef] [PubMed]
  13. Pierce, M.C.; Strasswimmer, J.; Park, B.H.; Cense, B.; De Boer, J.F. Advances in optical coherence tomography imaging for dermatology. J. Investig. Dermatol. 2004, 123, 458–463. [Google Scholar] [CrossRef]
  14. Erdelyi, R.A.; Duma, V.F.; Sinescu, C.; Dobre, G.M.; Bradu, A.; Podoleanu, A. Dental diagnosis and treatment assessments: Between X-rays radiography and optical coherence tomography. Materials 2020, 13, 4825. [Google Scholar] [CrossRef] [PubMed]
  15. Hsieh, Y.S.; Ho, Y.C.; Lee, S.Y.; Chuang, C.C.; Tsai, J.c.; Lin, K.F.; Sun, C.W. Dental optical coherence tomography. Sensors 2013, 13, 8928–8949. [Google Scholar] [CrossRef]
  16. Carvalho, L.; Roriz, P.; Simões, J.; Frazão, O. New trends in dental biomechanics with photonics technologies. Appl. Sci. 2015, 5, 1350–1378. [Google Scholar] [CrossRef]
  17. Kirtane, T.S.; Wagh, M.S. Endoscopic optical coherence tomography (OCT): Advances in gastrointestinal imaging. Gastroenterol. Res. Pract. 2014, 2014, 376367. [Google Scholar] [CrossRef]
  18. Targowski, P.; Rouba, B.; Góra, M.; Tymińska-Widmer, L.; Marczak, J.; Kowalczyk, A. Optical coherence tomography in art diagnostics and restoration. Appl. Phys. A 2008, 92, 1–9. [Google Scholar] [CrossRef]
  19. Liu, P.; Groves, R.M.; Benedictus, R. Non-destructive evaluation of delamination growth in glass fiber composites using optical coherence tomography. In Proceedings of the Nondestructive Characterization for Composite Materials, Aerospace Engineering, Civil Infrastructure, and Homeland Security, San Diego, CA, USA, 11–15 April 2014; SPIE: Philadelphia, PA, USA, 2014; Volume 9063, pp. 378–386. [Google Scholar]
  20. Monroy, G.L.; Won, J.; Spillman, D.R., Jr.; Dsouza, R.; Boppart, S.A. Clinical translation of handheld optical coherence tomography: Practical considerations and recent advancements. J. Biomed. Opt. 2017, 22, 121715. [Google Scholar] [CrossRef]
  21. Cogliati, A.; Canavesi, C.; Hayes, A.; Tankam, P.; Duma, V.F.; Santhanam, A.; Thompson, K.P.; Rolland, J.P. MEMS-based handheld scanning probe with pre-shaped input signals for distortion-free images in Gabor-domain optical coherence microscopy. Opt. Express 2016, 24, 13365–13374. [Google Scholar] [CrossRef]
  22. Schmitt, J.M.; Xiang, S.; Yung, K.M. Speckle in optical coherence tomography. J. Biomed. Opt. 1999, 4, 95–105. [Google Scholar] [CrossRef]
  23. Xiang, S.; Zhou, L.; Schmitt, J.M. Speckle noise reduction for optical coherence tomography. In Proceedings of the Optical and Imaging Techniques for Biomonitoring III, San Remo, Italy, 6–8 September 1997; SPIE: Philadelphia, PA, USA, 1998; Volume 3196, pp. 79–88. [Google Scholar]
  24. Lv, H.; Fu, S.; Zhang, C.; Zhai, L. Speckle noise reduction of multi-frame optical coherence tomography data using multi-linear principal component analysis. Opt. Express 2018, 26, 11804–11818. [Google Scholar] [CrossRef] [PubMed]
  25. Nickla, D.L.; Wallman, J. The multifunctional choroid. Prog. Retin. Eye Res. 2010, 29, 144–168. [Google Scholar] [CrossRef] [PubMed]
  26. Jayaraman, V.; John, D.; Burgner, C.; Robertson, M.; Potsaid, B.; Jiang, J.; Tsai, T.; Choi, W.; Lu, C.; Heim, P.; et al. Recent advances in MEMS-VCSELs for high performance structural and functional SS-OCT imaging. Opt. Coherence Tomogr. Coherence Domain Opt. Methods Biomed. XVIII 2014, 8934, 11–21. [Google Scholar]
  27. Sorkin, N.; Achiron, A.; Abumanhal, M.; Abulafia, A.; Cohen, E.; Gutfreund, S.; Mandelblum, J.; Varssano, D.; Levinger, E. Comparison of two new integrated SS-OCT tomography and biometry devices. J. Cataract. Refract. Surg. 2022, 48, 1277–1284. [Google Scholar] [CrossRef]
  28. Feng, X.; Wang, Y.; Liang, J.; Xu, Y.; Ortega-Usobiaga, J.; Cao, D. Analysis of lens thickness distribution based on swept-source optical coherence tomography (SS-OCT). J. Ophthalmol. 2021, 2021, 4717996. [Google Scholar] [CrossRef]
  29. Dabov, K.; Foi, A.; Katkovnik, V.; Egiazarian, K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 2007, 16, 2080–2095. [Google Scholar] [CrossRef]
  30. Buades, A.; Coll, B.; Morel, J.M. A non-local algorithm for image denoising. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–26 June 2005; Volume 2, pp. 60–65. [Google Scholar]
  31. Miao, Q.; Xue, M.; Wang, H. Noise reduction of fingertip OCT image based on generated unpaired high-quality datasets. In Proceedings of the International Conference on Optical and Photonic Engineering (icOPEN 2022), Online, 24–27 November 2022; SPIE: Philadelphia, PA, USA, 2023; Volume 12550, pp. 308–314. [Google Scholar]
  32. Viedma, I.A.; Alonso-Caneiro, D.; Read, S.A.; Collins, M.J. Deep learning in retinal optical coherence tomography (OCT): A comprehensive survey. Neurocomputing 2022, 507, 247–264. [Google Scholar] [CrossRef]
  33. Sagheer, S.V.M.; George, S.N. A review on medical image denoising algorithms. Biomed. Signal Process. Control 2020, 61, 102036. [Google Scholar]
  34. Zhao, H.; Yang, T.; Zhou, X. Laser speckle denoising with deep convolutional network. In Proceedings of the Twelfth International Conference on Digital Image Processing (ICDIP 2020), Osaka, Japan, 19–22 May 2020; SPIE: Philadelphia, PA, USA, 2020; Volume 11519, pp. 404–412. [Google Scholar]
  35. Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155. [Google Scholar] [CrossRef]
  36. Hu, D.; Tao, Y.K.; Oguz, I. Unsupervised denoising of retinal OCT with diffusion probabilistic model. In Proceedings of the Medical Imaging 2022: Image Processing, Online, 21–27 March 2022; SPIE: Philadelphia, PA, USA, 2022; Volume 12032, pp. 25–34. [Google Scholar]
  37. Ge, C.; Yu, X.; Yuan, M.; Fan, Z.; Chen, J.; Shum, P.P.; Liu, L. Self-supervised Self2Self denoising strategy for OCT speckle reduction with a single noisy image. Biomed. Opt. Express 2024, 15, 1233–1252. [Google Scholar] [CrossRef]
  38. Tian, C.; Fei, L.; Zheng, W.; Xu, Y.; Zuo, W.; Lin, C.W. Deep learning on image denoising: An overview. Neural Netw. 2020, 131, 251–275. [Google Scholar] [CrossRef] [PubMed]
  39. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
  40. Huang, Y.; Xia, W.; Lu, Z.; Liu, Y.; Chen, H.; Zhou, J.; Fang, L.; Zhang, Y. Noise-powered disentangled representation for unsupervised speckle reduction of optical coherence tomography images. IEEE Trans. Med. Imaging 2020, 40, 2600–2614. [Google Scholar] [CrossRef] [PubMed]
  41. Geng, M.; Meng, X.; Zhu, L.; Jiang, Z.; Gao, M.; Huang, Z.; Qiu, B.; Hu, Y.; Zhang, Y.; Ren, Q.; et al. Triplet Cross-Fusion Learning for Unpaired Image Denoising in Optical Coherence Tomography. IEEE Trans. Med. Imaging 2022, 41, 3357–3372. [Google Scholar] [CrossRef] [PubMed]
  42. Zhang, Y.; Qin, H.; Wang, X.; Li, H. Rethinking noise synthesis and modeling in raw denoising. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 4593–4601. [Google Scholar]
  43. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar]
  44. Abdelhamed, A.; Brubaker, M.A.; Brown, M.S. Noise flow: Noise modeling with conditional normalizing flows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3165–3173. [Google Scholar]
  45. Wei, K.; Fu, Y.; Yang, J.; Huang, H. A physics-based noise formation model for extreme low-light raw denoising. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2758–2767. [Google Scholar]
  46. Zhang, F.; Xu, B.; Li, Z.; Liu, X.; Lu, Q.; Gao, C.; Sang, N. Towards General Low-Light Raw Noise Synthesis and Modeling. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 10820–10830. [Google Scholar]
  47. Moseley, B.; Bickel, V.; López-Francos, I.G.; Rana, L. Extreme low-light environment-driven image denoising over permanently shadowed lunar regions with a physical noise model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 6317–6327. [Google Scholar]
  48. Jang, G.; Lee, W.; Son, S.; Lee, K.M. C2n: Practical generative noise modeling for real-world denoising. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 2350–2359. [Google Scholar]
  49. Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of wasserstein gans. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  50. Lim, B.; Son, S.; Kim, H.; Nah, S.; Mu Lee, K. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 136–144. [Google Scholar]
  51. Duncan, D.D.; Kirkpatrick, S.J. Algorithms for simulation of speckle (laser and otherwise). In Proceedings of the Complex Dynamics and Fluctuations in Biomedical Photonics V, San Jose, CA, USA, 19–21 January 2008; SPIE: Philadelphia, PA, USA, 2008; Volume 6855, pp. 23–30. [Google Scholar]
  52. Zhao, H.; Gallo, O.; Frosio, I.; Kautz, J. Loss functions for neural networks for image processing. arXiv 2015, arXiv:1511.08861. [Google Scholar]
  53. Yu, S.; Park, B.; Jeong, J. Deep iterative down-up CNN for image denoising. In Proceedings of the CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–17 June 2019; pp. 2095–2103. [Google Scholar]
  54. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef]
  55. López-Randulfe, J.; Veiga, C.; Rodríguez-Andina, J.J.; Farina, J. A quantitative method for selecting denoising filters, based on a new edge-sensitive metric. In Proceedings of the 2017 IEEE International Conference on Industrial Technology (ICIT), Toronto, ON, Canada, 22–25 March 2017; pp. 974–979. [Google Scholar]
Figure 1. The process of synthesizing a noisy image. The previous method involved extracting noise features from a noisy image and merging them into a clean image to create paired training images. However, this approach overlooked the signal dependency of the noise. In contrast, our method generates noise based on the signal of the clean image and physical priors, ensuring that the generated noise corresponds appropriately to each clean image in the pair.
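The signal dependency highlighted in Figure 1 can be illustrated with a minimal NumPy sketch (illustrative parameters only, not the paper's generator): under a multiplicative speckle model the residual spread grows with the clean signal level, whereas an additive Gaussian model keeps it constant.

```python
import numpy as np

rng = np.random.default_rng(0)
clean = np.linspace(0.1, 1.0, 10_000)  # synthetic range of clean signal levels

# Signal-independent model: additive Gaussian noise of fixed variance.
additive = clean + rng.normal(0.0, 0.05, clean.shape)

# Signal-dependent model: multiplicative speckle, drawn here from a
# unit-mean gamma distribution (L-look intensity speckle).
L = 4
speckle = rng.gamma(shape=L, scale=1.0 / L, size=clean.shape)
multiplicative = clean * speckle

# Residual spread grows with signal level only in the multiplicative case.
lo, hi = clean < 0.2, clean > 0.9
print("additive:      ", np.std((additive - clean)[lo]), np.std((additive - clean)[hi]))
print("multiplicative:", np.std((multiplicative - clean)[lo]), np.std((multiplicative - clean)[hi]))
```

Extracting noise from one image and pasting it onto another, as in the previous approach, discards exactly this coupling between noise amplitude and local signal.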
Figure 2. Overview of PSN framework. (a) We train the generator G using the framework of a generative adversarial network (GAN) [39]. (b) We employ the trained generator G to generate paired data, which are subsequently utilized to train a supervised denoising network F. (c) Our dual-module noise generator G is primarily divided into two components: the Speckle Noise Synthesis Module and the Device Noise Synthesis Module. The Speckle Noise Synthesis Module is used to synthesize the noise characteristics generated by laser interference. The speckle image in the Speckle Noise Synthesis Module is generated using a speckle simulation algorithm, which provides an initial approximation of the speckle noise. Specific details are shown in Algorithm 1. The Device Noise Synthesis Module is used to simulate the noise characteristics originating from devices and uncontrollable factors.
Figure 3. The process of speckle image simulation.
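The speckle simulation of Figure 3 follows the general FFT-based recipe of Duncan and Kirkpatrick [51]: fill a circular pupil with unit-amplitude phasors carrying uniform random phases, Fourier transform, and take the squared magnitude. A minimal sketch of one common variant (illustrative grid size and aperture radius, not necessarily the exact Algorithm 1) is:

```python
import numpy as np

def simulate_speckle(shape=(256, 256), aperture_radius=32, seed=None):
    """Simulate a fully developed speckle intensity pattern.

    A circular pupil is filled with unit-amplitude phasors with uniform
    random phases; the squared magnitude of its Fourier transform yields
    an intensity pattern with negative-exponential statistics.
    """
    rng = np.random.default_rng(seed)
    h, w = shape
    yy, xx = np.mgrid[:h, :w]
    mask = (yy - h / 2) ** 2 + (xx - w / 2) ** 2 <= aperture_radius ** 2
    phasors = np.exp(1j * rng.uniform(0.0, 2 * np.pi, size=shape))
    pupil = np.where(mask, phasors, 0)
    field = np.fft.fft2(np.fft.ifftshift(pupil))
    intensity = np.abs(field) ** 2
    return intensity / intensity.mean()  # normalize to unit mean
```

The contrast (std/mean) of the resulting intensity is close to 1, the signature of fully developed speckle; the aperture radius controls the speckle grain size.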
Figure 4. Comparison of synthesized noise results. (a) Real clean image. (b) Gaussian noise image. (c) Noise generated by C2N. (d) Gaussian noise combined with speckle-simulated noise. (e) Noise synthesized by the noise synthesis module extracted from the trained DRGAN. (f) Noise synthesized by the noise synthesis module extracted from the trained TCFL. (g) Our method. (h) Real noisy image.
Figure 5. Denoising performance comparison of different methods on our dataset. The red and green arrows indicate subtle structures within the image. (a) Real noisy image. (b) BM3D. (c) NLM. (d) AWGN. (e) C2N with DnCNN as the denoiser. (f) DRGAN. (g) TCFL. (h) Our method with DnCNN as the denoiser. (i) Clean image.
Figure 6. Results of the ablation experiments for our method. The green arrow delineates intricate details within the image. (a) Real noisy image. (b) Without the Speckle Noise Synthesis Module. (c) Without the Speckle-D branch in the Speckle Noise Synthesis Module. (d) Our complete method. (e) Clean image.
Table 1. Quantitative results of different methods on our dataset.
| Metric | Noisy | BM3D | NLM | DRGAN | TCFL | AWGN | C2N | OURS |
|---|---|---|---|---|---|---|---|---|
| PSNR | 19.026 ± 0.093 | 33.762 ± 0.216 | 26.404 ± 0.244 | 28.686 ± 1.834 | 30.523 ± 0.753 | 34.340 ± 0.474 | 35.828 ± 0.467 | 38.504 ± 0.590 |
| SSIM | 0.122 ± 0.011 | 0.858 ± 0.004 | 0.437 ± 0.008 | 0.863 ± 0.043 | 0.875 ± 0.086 | 0.902 ± 0.011 | 0.928 ± 0.004 | 0.954 ± 0.002 |
| GCMSE | 18.233 ± 4.578 | 2.993 ± 1.969 | 6.567 ± 2.794 | 23.507 ± 5.214 | 5.810 ± 2.532 | 1.199 ± 1.169 | 1.318 ± 1.280 | 0.899 ± 1.046 |
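The PSNR and SSIM columns follow standard definitions, with SSIM per Wang et al. [54]. A NumPy sketch of PSNR together with a single-window simplification of SSIM (the full metric averages this statistic over local Gaussian windows, so values will differ from a windowed implementation) is:

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def global_ssim(x, y, data_range=1.0):
    """Single-window simplification of SSIM (Wang et al., 2004).

    The standard metric computes this statistic in local Gaussian
    windows and averages; here the whole image is one window.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

GCMSE [55] is an edge-sensitive metric based on gradient conduction and is not reproduced here.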
Table 2. Quantitative results of different methods on the PKU37 dataset.
| Metric | Noisy | BM3D | NLM | DRGAN | TCFL | OURS |
|---|---|---|---|---|---|---|
| PSNR | 20.157 ± 0.383 | 29.259 ± 0.560 | 24.455 ± 0.325 | 26.934 ± 0.641 | 29.418 ± 0.539 | 30.490 ± 0.993 |
| SSIM | 0.172 ± 0.021 | 0.557 ± 0.024 | 0.325 ± 0.025 | 0.569 ± 0.014 | 0.635 ± 0.026 | 0.749 ± 0.023 |
| GCMSE | 33.410 ± 6.702 | 15.709 ± 3.513 | 14.80 ± 3.854 | 14.832 ± 3.634 | 7.723 ± 3.850 | 6.922 ± 2.854 |
Table 3. The results of the model’s ablation experiments on our test set.
| Speckle-I | Speckle-D | PSNR | SSIM |
|---|---|---|---|
| × | × | 35.828 ± 0.467 | 0.928 ± 0.039 |
| ✓ | × | 36.230 ± 0.482 | 0.945 ± 0.017 |
| ✓ | ✓ | 38.504 ± 0.590 | 0.954 ± 0.002 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yang, L.; Wu, D.; Gao, W.; Xu, R.X.; Sun, M. Physics-Based Practical Speckle Noise Modeling for Optical Coherence Tomography Image Denoising. Photonics 2024, 11, 569. https://doi.org/10.3390/photonics11060569
