1. Introduction
The use of light for biomedical imaging dates back to pioneering studies by researchers such as T.B. Curling (1843), R. Bright (1831), and M. Cutler (1929) [1,2,3,4]. Subsequent advances in science and technology have led to the development of various light sources (lasers and LEDs) and image acquisition sensors, driving the widespread adoption of light in the medical and life sciences. Existing imaging modalities have limitations due to ionizing radiation, contrast agents, metal restrictions, sophisticated systems, and high costs. Therefore, alternative optical imaging techniques with simple, radiation-free, and affordable designs are crucial. Transillumination imaging requires only a simple system consisting of light sources, a camera as a detector, and a computer for control and image processing. The use of transillumination (diaphanography) to monitor the pathology of human organs has attracted interest in recent years, as there have been many new advances in light source technology, sensors, and theoretical, experimental, and clinical results [3,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24]. However, this method faces great challenges: strong scattering of light in biological tissues and time-consuming image processing [24].
Figure 1 shows the transillumination mode and the epi-illumination mode of transillumination imaging. In the transillumination mode, the light source is placed on the opposite side of the object from the recording device (typically a camera). The epi-illumination mode can also be considered a mode of transillumination imaging, with the light source and the recording device positioned on the same side of the object under appropriate lighting conditions. When the lighting conditions are adjusted so that the light diffuses well in the turbid medium, it becomes possible to acquire the distribution of the absorption structure on the surface of the body.
The transillumination image consists of blurred shadows of the absorbing structures in a turbid medium, treated as a collection of point absorbers. As the depth of the absorbing structure increases, the resulting image exhibits progressively more pronounced blurring. Additionally, acquiring light signals when light penetrates thick body parts is challenging. This challenge is also reflected in the breast-light and blood-vessel-finder applications currently on the market [22,23]. Consequently, removing scattering blur from observed images has remained difficult until recently. To realize transillumination imaging, many studies have been conducted to reduce scattering [24,25,26,27,28,29]. K. Shimizu et al. derived a depth-dependent point spread function (PSF) to characterize the scattering of a point light source in biological tissue [24]. The depth-dependent PSF is presented as Equation (1) [24], where C, μ′s, μa, and d represent a constant, the reduced scattering coefficient, the absorption coefficient, and the depth of the structure, respectively.
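To illustrate the deconvolution step that later sections rely on, the sketch below restores a blurred image with a plain Richardson–Lucy iteration. The Gaussian kernel is only a hypothetical stand-in for the depth-dependent PSF of Equation (1) (in practice its spread would be determined by μ′s, μa, and d); all function names here are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy.signal import fftconvolve

def gaussian_psf(size, sigma):
    """Hypothetical stand-in for the depth-dependent PSF: an isotropic
    Gaussian whose spread would grow with the depth d of the absorber."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()

def richardson_lucy(blurred, psf, n_iter=30):
    """Plain Richardson-Lucy deconvolution; as noted in the text, the
    restored image depends on the chosen iteration number."""
    estimate = np.full_like(blurred, 0.5)
    psf_flip = psf[::-1, ::-1]
    for _ in range(n_iter):
        conv = fftconvolve(estimate, psf, mode="same")
        ratio = blurred / np.maximum(conv, 1e-12)
        estimate *= fftconvolve(ratio, psf_flip, mode="same")
    return estimate
```

A square absorber blurred by the kernel and restored this way ends up closer to the original than the blurred observation, which is the behavior the scattering-suppression studies exploit.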
At the observation surface, the light can be decomposed into two components: direct light, which undergoes scattering and absorption within the surrounding medium, and diffused light. Back-scatter (back-reflection) imaging uses a light source and a recording device placed on the same side of the object; under appropriate lighting conditions, the distribution of the absorption structure near the body's surface can be obtained.
Tran et al. have successfully applied this PSF to imaging absorption structures in biological tissues, assuming a uniform distribution of light in the plane containing the absorption structures [25]. As a result, Tran et al. demonstrated optical computed tomography (OCT) with transillumination imaging and reconstructed internal structures in small animals [25]. However, the scattering suppression in this study depends on deconvolution with the Lucy–Richardson algorithm, so the restored image depends on the selected iteration number. The use of the depth-dependent point spread function in conjunction with deep learning to suppress scattering is one of the most remarkable recent advances. Van et al. developed a scattering suppression technique and depth estimation for structures in a turbid medium using deep learning [27]. The research team succeeded in suppressing scattering and estimating depth using convolutional neural network (CNN) and fully convolutional network (FCN) models. Using the technique proposed by Van et al., de-blurred images, depth information, and the three-dimensional (3D) structure of simple or single absorption structures were estimated [27]. The blood flow in the reconstructed 3D vessels could be estimated using a depth-dependent contrast model [29].
Shimizu et al. recently proposed novel techniques to reconstruct a 3D structure in a turbid medium from a single blurred 2D image obtained using near-infrared transillumination imaging [30]. One technique uses 1D information from the intensity profile of the light-absorbing image. Profiles under different conditions are calculated by convolution with the depth-dependent PSF of the transillumination image. The profiles are stored in databanks as lookup tables that connect the contrast and spread of the profile to the absorber depth. A one-to-one correspondence was found from the contrast and spread to the absorber depth and thickness. Another technique uses 2D information from the transillumination image of a volumetric absorber. A blurred 2D image is deconvolved with the depth-dependent PSF, producing many images with points of focus on different parts. The depth of an image part can be estimated by searching the deconvolved images for the part with the best focus. These techniques are time-consuming because of the nature of the convolution and deconvolution processes. In addition, they can only be applied to simple structures.
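The lookup-table idea can be sketched as follows: profiles of a bar absorber are precomputed for candidate depths, and the (contrast, spread) pair of an observed profile is matched against the stored table. A Gaussian kernel stands in for the true depth-dependent PSF of Equation (1), and the sigma-to-depth relation, depth range, and names are all hypothetical.

```python
import numpy as np

def blurred_profile(width_px, sigma, n=201):
    """1D transmitted-light profile of a bar absorber blurred by a
    depth-dependent kernel (Gaussian stand-in for the PSF of Eq. (1))."""
    x = np.arange(n) - n // 2
    bar = (np.abs(x) <= width_px // 2).astype(float)    # absorber shadow
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    return 1.0 - np.convolve(bar, kernel, mode="same")  # transmitted light

def features(profile):
    """Contrast and spread (width of the dip at half depth), the two
    lookup keys described in the text."""
    dip = profile.max() - profile
    contrast = dip.max() / profile.max()
    spread = int((dip >= dip.max() / 2).sum())
    return contrast, spread

# Build the lookup table depth -> (contrast, spread); sigma = 3*d is an
# assumed, purely illustrative depth dependence.
depths = np.arange(1, 11)                # hypothetical depths (mm)
table = {d: features(blurred_profile(20, sigma=3.0 * d)) for d in depths}

def estimate_depth(profile):
    """Nearest-neighbor lookup from (contrast, spread) to depth."""
    c, s = features(profile)
    return min(table, key=lambda d: (table[d][0] - c) ** 2
                                    + (table[d][1] / 100 - s / 100) ** 2)
```

Because contrast decreases and spread increases monotonically with depth in this model, the mapping from (contrast, spread) back to depth is one-to-one, mirroring the correspondence reported in the text.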
The results of previous studies still show limitations when processing a depth of mm, related to the efficiency of scattering suppression, the shape of the reconstructed structures, the estimated depth, and the applicability to complex structures [25,26,28,29,30]. These problems are related to the complexity of the absorption structure, the heterogeneity of the biological tissue, the training data, and the neural network model itself. Dang et al. proposed the Attention Res-UNet model for de-blurring by adding the attention gate and the residual block to the common U-Net model structure. As a result, a correlation of more than 89% could be achieved between the de-blurred image and the original structure image [31]. Dang et al. also proposed depth estimation using the DenseNet169 model with high accuracy beyond the limit of mm [31].
The complexity of the light-absorbing structure also remains unresolved. The current solution is to subdivide the image into several parts, each of which contains only one simple structural part whose relative spatial location lies roughly in the same plane [27,29,30]. Current techniques also cannot handle multiple structures distributed at different depths next to each other in the same image.
This paper presents a new method, named the pixel-by-pixel scan matrix method, that uses deep learning to de-blur transillumination images and estimate the depth of the absorbing structures in a turbid medium. With de-blurred two-dimensional (2D) images at different angles serving as projection images, the 3D de-blurred absorbing structures and their cross-sectional images can then be reconstructed using the filtered back-projection method. The method also restores a "clear" image of the light-absorbing structure, so that only one convolutional neural network is needed for both depth estimation and explicit image reconstruction. If successful, the 2D de-blurred image and its per-pixel depth information enable the reconstruction of a 3D view of the absorbing structures from a limited acquisition angle, or even from a single 2D image.
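As a minimal sketch of how a de-blurred 2D image and its per-pixel depth matrix could be combined into a 3D view, the function below places each foreground pixel of the de-blurred image into a voxel grid at its estimated depth bin. The array layout and names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def volume_from_depth_map(binary_img, depth_map, n_depth_bins):
    """Build a sparse 3D occupancy volume from a de-blurred binary image
    and a per-pixel integer depth map (hypothetical output of the
    pixel-by-pixel scan over the CNN depth estimates)."""
    h, w = binary_img.shape
    vol = np.zeros((n_depth_bins, h, w), dtype=np.uint8)
    ys, xs = np.nonzero(binary_img)          # foreground pixels only
    vol[depth_map[ys, xs], ys, xs] = 1       # place each at its depth bin
    return vol
```

Each absorbing pixel thus contributes exactly one voxel, so structures lying at different depths in the same 2D image separate into different slices of the volume.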
3. Experiment with the Complex Structures in the Tissue-Equivalent Phantom
The feasibility and effectiveness of the proposed method were examined in an experiment with complex structures in a tissue-equivalent phantom.
Figure 7 presents a schematic of the experimental system for obtaining transillumination images. The phantom was irradiated with 800 nm near-infrared (NIR) light from a laser through a beam expander and a diffuser for homogeneous illumination. Images were captured over all 360 degrees using a CMOS camera placed on the opposite side of the phantom. The image observed with the scattering medium is considerably blurred compared to the image observed with the clear medium.
Figure 8 presents the normalized intensity profiles at the 150th pixel row for four images (Figure 8a–d), each scaled to . In Figure 8a, the observed image depicts an absorbing structure in a clear medium at a 0-degree orientation, providing a baseline for comparison with the width . Figure 8b illustrates the observed image of the same structure in a scattering medium, highlighting the impact of the scattering effect on the apparent width and contrast of the object, recorded at . As proposed in previous research, the effectiveness of scattering suppression via PSF deconvolution is demonstrated in Figure 8c, yielding a contrast value of . The reconstructed object's width corresponds to an error of %. Using the proposed technique, Figure 8d achieves a perfect contrast of . This result is attributed to the output of the de-blurring model and the pixel-by-pixel scanning method, which produce binary values (0 and 1), with an object width and an error of %.
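One plausible way to obtain contrast and width values like those quoted above from a normalized intensity profile is sketched below; the Michelson contrast and width-at-half-depth definitions are assumptions, since the exact formulas are not spelled out here.

```python
import numpy as np

def profile_metrics(profile):
    """Contrast and width (in pixels) of the absorption dip in a
    normalized intensity profile. Contrast is the Michelson definition;
    width is measured where the dip exceeds half its maximum depth."""
    i_max, i_min = profile.max(), profile.min()
    contrast = (i_max - i_min) / (i_max + i_min)
    dip = i_max - profile
    width = int((dip >= dip.max() / 2).sum())
    return contrast, width
```

For a binary profile such as the pixel-by-pixel scan output (values 0 and 1 only), this definition yields the perfect contrast of 1 mentioned for Figure 8d.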
Figure 9 presents the normalized intensity profiles at the 350th pixel row for four images (Figure 9a–d), each scaled to . In Figure 9a, the observed image depicts an absorbing structure in a clear medium at the 0-degree orientation, providing a baseline for comparison. Figure 9b illustrates the observed image of the same structure in a scattering medium, highlighting the scattering effect's impact on the object's apparent width and contrast, recorded at . As proposed in previous research, the effectiveness of scattering suppression via PSF deconvolution is demonstrated in Figure 9c, yielding a contrast value of . Employing the proposed technique, Figure 9d achieves a contrast of .
In this study, the 3D image reconstruction process averages per image. This level of performance is achieved using a computational setup that includes an NVIDIA Tesla V100 GPU, complemented by 12.0 GB of GPU memory and powered by a 16-core Intel Xeon processor. These hardware specifications, while not at the cutting edge, are chosen to demonstrate the considerations of the proposed method on commonly available equipment, making it accessible for broader medical research and diagnostic applications. This approach ensures that the method is practical not only in terms of technical performance but also in terms of its adaptability to a variety of real-world settings.
The contrast improvement ratio (CIR) [35] serves as a metric to evaluate the effectiveness of different image processing techniques across 360 rotation angles of an object, as depicted in Figure 10. The orange line in the graph represents the CIR of the deconvolution method employing the PSF function, the purple line indicates the CIR of the proposed method, and the red line signifies the percentage improvement between the two methods. A notable observation from this graph is the similar trend exhibited by both methods, particularly their lowest CIR values at the 90- and 270-degree rotation angles. This similarity in the CIR trend, with specific low points at these angles, can be attributed to several factors. First, these angles typically correspond to the longest path lengths through the object, potentially leading to increased scattering and reduced contrast. Second, the alignment of certain features within the object's structure at these angles might amplify the scattering effects, further diminishing contrast. The consistency of this trend across both methods suggests that these dips in CIR are due to the inherent geometry and optical properties of the object rather than limitations of the image processing techniques.
The percentage improvement metric allows for the following observations. The initial large difference at the first angle: a substantial difference in CIR at the first angle may indicate that one method significantly outperforms the other under specific imaging conditions, influenced by the initial orientation of the object relative to the imaging apparatus. Decrease to a minimum at the 90th angle: the minimum CIR at the 90th angle suggests reduced effectiveness for both methods, likely due to structural or optical characteristics of the object that increase scattering or reduce contrast at this orientation. Increase to a maximum at the 180th angle: the peak in CIR at the 180th angle indicates a point of optimal performance for both methods, likely due to more favorable conditions for contrast enhancement. Decrease to a minimum at the 270th angle and increase towards the 360th angle: this pattern highlights the influence of the object's orientation and imaging conditions on the performance of the methods. The cyclic nature of this pattern implies that certain angles consistently present challenges or advantages for contrast enhancement.
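A CIR curve like the one in Figure 10 could be computed as sketched below. The local-contrast-based definition used here is one found in the image enhancement literature and is only an assumption; the paper's exact formula is given in its Ref. [35].

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_contrast(img, size=5):
    """Local contrast |I - m| / (I + m), with m the neighborhood mean."""
    m = uniform_filter(img.astype(float), size)
    return np.abs(img - m) / np.maximum(img + m, 1e-12)

def cir(processed, blurred, size=5):
    """One common CIR definition (assumed, not necessarily Ref. [35]'s):
    sum((C_proc - C_blur)^2) / sum(C_blur^2) over local contrast maps."""
    cp = local_contrast(processed, size)
    cb = local_contrast(blurred, size)
    return float(((cp - cb) ** 2).sum() / (cb ** 2).sum())
```

Evaluating this for the de-blurred image at each of the 360 rotation angles against the corresponding blurred observation yields a per-angle CIR curve like the orange and purple lines in Figure 10.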
Figure 11 shows cross-sectional images at the mid-height of the upper object, each scaled to a resolution of . In Figure 11a, the image obtained in a clear medium reveals an object width of mm. Figure 11b shows the image in a scattering medium, where the scattering effects obscure the object's dimensions. Figure 11c demonstrates the application of the erasing template technique, yielding a reconstructed object width of mm, corresponding to an error of . This error suggests that while the technique is beneficial for enhancing image clarity, it may alter the perceived dimensions of the object. Finally, Figure 11d depicts the result of the proposed technique, with a reconstructed object width of mm and a reduced error of . This reduced error indicates higher fidelity in preserving the object's true dimensions while effectively mitigating scattering effects.
Similarly, Figure 12 shows cross-sectional images at the mid-height of the lower object, each scaled to . In Figure 12a, captured in a clear medium, the width of the object is measured at mm. Figure 12b, taken in a scattering medium, illustrates how scattering effects can significantly obscure the dimensions of the object. The use of the erasing template technique, shown in Figure 12c, results in a reconstructed object width of mm, with an associated error of . This suggests that while the technique enhances image clarity, it might also slightly distort the object's perceived size. In particular, the proposed technique, as seen in Figure 12d, achieves a more accurate reconstruction, producing an object width of mm and a significantly lower error of . This demonstrates the technique's higher accuracy in maintaining the object's true dimensions despite the presence of scattering effects.
Figure 13 shows the results of the filtered back-projection method using the 360-degree dataset with two-level thresholds common to all the figures. The internal structure, which is barely seen in Figure 13d, became visible with the proposed technique.
Figure 14 shows the results of the proposed method for a single view at 302 degrees. The 3D view of the de-blurred internal structures became visible with the proposed method.
4. Experiment with Mouse
Figure 15 shows the experimental apparatus, where a living female mouse (Slc:ICR, 20 weeks old, g) serves as the subject. Anesthesia was administered by intraperitoneal pentobarbital injection, ensuring immobilization and comfort of the mouse throughout the experiment. The mouse was then placed securely within a cylindrical holder made of transparent acrylic resin to allow unobstructed observation and light penetration. Illumination was provided by an 800 nm laser, propagated through a beam expander and a diffuser to establish uniform illumination across one side of the holder, while a CMOS camera positioned on the opposite side of the holder captured the transilluminated images. Transillumination images from diverse perspectives were acquired by rotating the holder. This setup enables the reconstruction of 3D images via the FBP algorithm, provided that the requisite projection images are successfully acquired.
Figure 16a shows an ultrasound image of the kidney region of the mouse, with the horizontal dimension of the left kidney measured at mm. In contrast, Figure 16b shows a cross-sectional view in the horizontal plane containing the kidney, reconstructed from the 360-degree transillumination images of the mouse. However, in these observed images, internal organs such as the kidney are barely discernible and difficult to distinguish.
Figure 16c further presents the cross-sectional image reconstructed from the deconvolved images, using the sample removal technique described in a previous study [25]. In this reconstruction, the left kidney is distinguishable, with a measured width of mm and an associated error of . Using the proposed technique, Figure 16d reveals a de-blurred image that significantly improves the cross-sectional view. The reconstructed left kidney in this image has a width of mm, with a reduced error of , demonstrating the proposed technique's efficacy in enhancing image clarity and precision.
The stack of cross-sectional images was vertically arranged to create a 3D image.
Figure 17 shows the results at different threshold levels with conventional thresholding applied. In Figure 17a, the internal structure was barely discernible, but the previous technique (Figure 17b) and the proposed technique (Figure 17c) provided greater visibility, enabling the identification of high-absorption organs such as the kidneys and the lower sections of the liver.
Figure 18 shows two stages of 3D reconstruction imaging.
Figure 18a shows the output image of the scattering de-blurring process, highlighting the significant reduction in image blur. Scattering effects within the image have been effectively suppressed, revealing clearer details of the absorption structures.
Figure 18b illustrates the result of the 3D reconstruction imaging, combining scatter de-blurring and depth estimation. The result is a 3D reconstructed image that provides an insightful and comprehensive representation of the internal light-absorbing structures. The addition of depth estimation contributes to the spatial dimension, enhancing the ability to visualize structures within a three-dimensional context.
This experimental investigation confirmed the practicality of achieving 3D imaging of the internal light-absorbing structure of a small animal.
Figure 19 validates the efficacy of the CNN depth estimation model in producing clear images while using a threshold of . The correlation coefficient between the de-blurred images generated by the FCN and CNN models is 0.9134. Concerning the width measurement, all images maintain a consistent scale of . Specifically, focusing on the object positioned in the lower right corner at the 350th pixel row, the actual width of the object is measured at (Figure 19a). Using the PSF deconvolution approach (Figure 19c), the width is computed as (with an error of 7.88%). Using the de-blurring method with the FCN model (Figure 19d) yields a width measurement of (with an error of 2.94%). Meanwhile, utilizing the image reconstruction technique from the depth matrix via the CNN model (Figure 19e) results in a width of (with an error of 6.82%). These results indicate the feasibility of both the FCN model-based de-blurring and the image reconstruction from the depth matrix via the CNN model for scatter de-blurring.
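The correlation coefficient quoted between the FCN and CNN de-blurred outputs (0.9134) is presumably a Pearson correlation over pixels; a minimal sketch:

```python
import numpy as np

def image_correlation(a, b):
    """Pearson correlation coefficient between two images of equal shape,
    computed over their flattened pixel values."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
```

A value near 1 indicates that the two models recover essentially the same structure, which is what supports using the single CNN for both depth estimation and image reconstruction.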
Inherent limitations accompany the thresholding method used for reconstructing de-blurred images from the CNN depth output, presenting both benefits and challenges. On the positive side, using a single CNN model for image reconstruction reduces cost and computational resources while maintaining good accuracy and precision. However, a notable drawback emerges when a threshold is applied to the reconstruction: pixels with values below the threshold are set to black, which diminishes the generality of the reconstruction.
Furthermore, when adequate depth information is available and a clear image is achieved after de-blurring, it becomes possible to perform a 3D reconstruction of the light-absorbing structures from a single 2D image.
5. Conclusions
This research addressed the challenging tasks of de-blurring images degraded by scattering, restoring complex absorbing structures, and estimating the depths of complex structures presented in the transilluminated image of biological tissue. The key contribution of our work lies in developing a pixel-by-pixel scanning method that incorporates deep learning models to provide structural information and depth values for a given blurred image of absorption structures with multiple depth levels. This novel approach enables us to associate a depth with each pixel and then estimate the depths of the absorbing structures in the whole image. Additionally, when a full viewing angle is available, this study has demonstrated the ability to reconstruct complete 3D structures, providing a comprehensive understanding of the structures within the imaged medium.
Integrating the U-Net and CNN models in the reconstruction process has yielded remarkable results. By combining the clear, de-blurred 2D image from the U-Net model with the multiple depth estimates from the CNN model, we obtain a comprehensive 3D representation of the absorbing structures within the turbid medium. This multidimensional insight provides valuable information for researchers and experts and deepens our understanding of the complex nature of absorption structures within turbid media and other related domains.
Although this approach leverages the capabilities of deep learning models, it is essential to acknowledge the challenges related to data size and computational power. Creating large datasets for training and conducting computationally intensive operations require careful consideration and optimization. The expanded capability of scattering suppression and depth estimation for absorption structures in turbid media using deep learning, combined with our pixel-by-pixel scanning method, represents a significant achievement. This technique can be applied to the advancement of medical imaging and other related fields. Using the strengths of the U-Net and CNN models and the novel depth estimation process, researchers gain access to a powerful tool for reconstructing 3D structures from 2D images or even a single 2D image.