1. Introduction
To enable researchers to handle medical images more securely, a framework is needed that ensures patients cannot be identified from these images even if they are leaked. One way to achieve this is to add strong noise to the images. However, such anonymization can render the medical images useless: privacy is protected, but meaningful analysis becomes impossible. Moreover, there are no guidelines on how strong the added noise should be, and naive noise addition offers no theoretical guarantee of privacy protection.
On the other hand, differential privacy [1,2], a technology that has been gaining attention and is increasingly being deployed in practice, is believed to be capable of realizing such a framework, as it provides a theoretical guarantee and a controllable parameter (the privacy budget) for the strength of privacy protection. Differential privacy comes in two forms: global differential privacy, which trusts the administrator (here, the medical image analyst), and local differential privacy (LDP), which does not trust even the administrator. This study extends the application of LDP. The medical images produced by our algorithm carry a theoretical upper bound on the probability of identifying an individual, particularly from the regions outside the lesions, even if the images are leaked to external parties. The application of differential privacy to multidimensional data such as images is still in its early stages, and this study contributes a novel method to these pioneering efforts. Until now, it has been challenging to generate medical images that maintain utility under a realistic privacy budget. By adopting the technique developed by Shibata et al. [3], which combines LDP with diffusion models [4], this study aims to overcome this issue.
Previous studies have typically concentrated on retaining the identity of the subject (e.g., a patient) in an image while removing anomalous regions. This is particularly important in medical image analysis, because a counterfactual (fake) normal image yields an anomaly map when subtracted from the real anomalous image. In contrast, in this study, we formulate and validate a framework that retains anomalous regions while diffusing (i.e., altering) the identity of the patient using LDP techniques and diffusion models. This is the key highlight of our research, and we call this framework the Identity Diffuser. It is far more challenging than the above-mentioned “normalizer”, because almost the entire body structure must be estimated from the anomalous region alone, which typically contains far less information than the rest of the image.
The remainder of this paper is structured as follows.
Section 2 reviews related research.
Section 3 formulates the proposed method and the experimental validation approach.
Section 4 presents the medical images generated by the proposed method and the experimental results.
Section 5 discusses the results.
2. Related Works
The present study can be categorized as a special case of inpainting, the task of estimating unknown masked region(s) in an image. Bertalmio et al. [5] adopted a continuous diffusion equation to interpolate the masked region; this is a mathematical model. Lugmayr et al. [6] adopted denoising diffusion probabilistic models [4] to recover the masked region using image prior knowledge (the RePaint algorithm); this is a data-driven model. The two approaches share the keyword “diffusion”, but their algorithms differ substantially. In this study, we adopt the RePaint algorithm because it integrates easily with the Gaussian local differential privacy mechanism explained in the next section. RePaint performs inpainting in the image space, but there are also inpainting algorithms that operate in the latent space of latent diffusion models [
7]. Tran et al. [
8] applied deep learning-based inpainting techniques to CXR images, but not in the context of privacy protection, and differential privacy techniques were not utilized in their work. In a broader setting than the normalization of lesion structures, several studies have applied defect-repair techniques to medical imaging [
9,
10,
11].
Although the application target is facial images rather than medical images, several studies have adopted local differential privacy algorithms to erase image features [
12,
13]. Regarding the application to medical images, Shibata et al. have recently employed flow-based deep generative models [
14] or diffusion models [
3] to formulate and validate the approach. It is important to note that these approaches are different from DP-SGD [
15], which applies differential privacy to the gradient information during the backpropagation process in neural network training.
3. Methods
Figure 1 shows the flowchart of the proposed method. The Identity Diffuser is built on RePaint [6], an inpainting algorithm that uses diffusion models. In this study, we further combine the RePaint algorithm with the theory of local differential privacy (LDP).
Specifically, on a supercomputer (using a single NVIDIA A100 GPU; NVIDIA, Santa Clara, CA, USA), we trained a diffusion model on 7808 “normal” CXR images. The batch size was set to 8 with gradient accumulation of 2, effectively emulating a batch size of 16. The Adam optimizer was employed with a fixed learning rate. The CXR images had a resolution of 256 × 256, and FP16 mixed-precision computation was utilized, but Flash Attention was not used. Training ran for 180,000 steps, equivalent to approximately 369 epochs, and took about 18 h. For inference, the model checkpoint from the 180,000th step was loaded. The RePaint algorithm was employed for inpainting, with 12 “internal iterations” executed (for details on the internal iterations, please refer to the RePaint paper [6]). The components of the proposed algorithm are detailed below.
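For concreteness, the following is a minimal sketch of how this training configuration could look with the open repository [21]. The data path, the number of image channels, the UNet width (dim=64), and the learning rate value are illustrative placeholders rather than a verbatim record of our settings.

```python
# Minimal training sketch using the denoising-diffusion-pytorch repository [21].
# Hyperparameters not stated in the text (UNet width, channels, learning rate,
# data path) are placeholders, not our exact settings.
from denoising_diffusion_pytorch import Unet, GaussianDiffusion, Trainer

model = Unet(
    dim=64,                  # base channel width (placeholder)
    dim_mults=(1, 2, 4, 8),
    channels=1,              # grayscale CXR (assumption)
    flash_attn=False,        # Flash Attention was not used
)

diffusion = GaussianDiffusion(
    model,
    image_size=256,          # 256 x 256 CXR images
    timesteps=1000,
    beta_schedule='sigmoid', # sigmoid noise scheduling (Section 3.1)
)

trainer = Trainer(
    diffusion,
    '/path/to/normal_cxr/',      # 7808 normal CXR images (placeholder path)
    train_batch_size=8,
    gradient_accumulate_every=2, # effective batch size 16
    train_lr=1e-4,               # placeholder; exact value not restated here
    train_num_steps=180_000,     # ~369 epochs, ~18 h on one A100
    amp=True,                    # FP16 mixed precision
)
trainer.train()
```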
3.1. Diffusion Models
Diffusion models [4] gradually add noise to data such as images and then model the reverse process (denoising) using deep neural networks, typically UNets [16]. Training a diffusion model is mathematically equivalent to indirectly minimizing the Kullback–Leibler divergence between the distribution of the true images and the modeled distribution. We represent this UNet by the vector function $\boldsymbol{\epsilon}_{\theta}(\mathbf{x}_t, t)$, where $\mathbf{x}_t$ is the input image and $t$ is the denoising time step, which represents the noise intensity in the input image. Instead of restoring meaningful images directly from completely random vectors, the process gradually approaches meaningful images: $\mathbf{x}_T$ is a completely noisy image, and $\mathbf{x}_0$ is a completely denoised image (the generated image). Numerous methods have been proposed for this reverse process, with Denoising Diffusion Implicit Models [17] being relatively well known. For training, the network is usually set up to predict the original image (or the noise itself) from images with various noise intensities. These images with different noise intensities are uniquely identified by the denoising time step $t$. The maximum value of the denoising time steps (i.e., the number of distinct noise-intensity distributions prepared) is often taken to be between 200 and 1000. The settings for the noise intensity at each step (noise scheduling) also vary; sigmoid scheduling, which we adopted in this study, and linear scheduling are representative examples.
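The forward (noising) process and the two schedules mentioned above can be made concrete with a short sketch. The schedule constants below (the beta range and the sigmoid span) are common defaults and are assumptions, not the exact values used in our training.

```python
import torch

def linear_beta_schedule(T, beta_start=1e-4, beta_end=2e-2):
    # Per-step noise variances that grow linearly with t.
    return torch.linspace(beta_start, beta_end, T)

def sigmoid_beta_schedule(T, beta_start=1e-4, beta_end=2e-2):
    # Sigmoid-shaped growth: slow at the start and end, fast in the middle.
    s = torch.sigmoid(torch.linspace(-6.0, 6.0, T))
    return beta_start + (beta_end - beta_start) * s

T = 1000
betas = sigmoid_beta_schedule(T)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

def q_sample(x0, t, alpha_bars):
    """Forward process in closed form: noise a clean image x0 to time step t."""
    eps = torch.randn_like(x0)
    ab = alpha_bars[t]
    return ab.sqrt() * x0 + (1.0 - ab).sqrt() * eps

x0 = torch.rand(1, 1, 256, 256)  # stand-in for a normalized CXR image
x_500 = q_sample(x0, t=500, alpha_bars=alpha_bars)
```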
3.2. Local Differential Privacy
The local differential privacy (LDP) with the Gaussian mechanism [1,18] is formulated by
$$\Pr[M(x) = y] \le e^{\varepsilon} \Pr[M(x') = y] + \delta,$$
where $x$ and $x'$ are two “neighborhood” images (here, CXR images) and $y$ is the LDP-processed image. $M$ is a probabilistic algorithm, and $\Pr$ is the probability that $y$ is obtained when we input $x$ or $x'$ to the algorithm. In this study, we define the neighborhood very loosely: for a fixed CXR image $x$, the neighborhood image $x'$ can be any image in the image space. This definition potentially results in a large privacy budget if we are to retain the utility of LDP-processed images.
In LDP for images [3,12,14,19], noise is added to the image. Specifically, Gaussian noise is added by adopting the Gaussian mechanism. This can be mathematically expressed as
$$\tilde{x} = x + n, \quad n \sim \mathcal{N}(0, \sigma^2 I),$$
where $\tilde{x}$ is the noisy image, $x$ is the original image, and $n$ represents Gaussian noise with mean 0 and variance $\sigma^2$. When noise is added in this way, the processed image satisfies the above inequality. For the theoretical background on the setting of the variance, please refer to previous research [
3].
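As an illustration, the sketch below adds pixel-wise Gaussian noise calibrated with the classic $(\varepsilon, \delta)$ bound $\sigma \ge \sqrt{2\ln(1.25/\delta)}\,\Delta_2/\varepsilon$ (valid for $\varepsilon < 1$). This textbook calibration and the sensitivity choice are shown only for intuition; our experiments follow the calibration of Shibata et al. [3].

```python
import math
import numpy as np

def gaussian_sigma(epsilon, delta, sensitivity):
    # Classic Gaussian-mechanism calibration (valid for epsilon < 1):
    # sigma >= sqrt(2 * ln(1.25 / delta)) * sensitivity / epsilon.
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon

def ldp_perturb(image, epsilon, delta, sensitivity):
    """Release image + Gaussian noise so that the output satisfies
    (epsilon, delta)-LDP under the given neighborhood definition."""
    sigma = gaussian_sigma(epsilon, delta, sensitivity)
    return image + np.random.normal(0.0, sigma, size=image.shape)

# Under the loose neighborhood above (x' may be any image), the L2
# sensitivity is the diameter of the pixel domain: sqrt(d) for [0, 1]^d.
x = np.random.rand(256, 256)  # stand-in CXR with intensities in [0, 1]
y = ldp_perturb(x, epsilon=0.5, delta=1e-5, sensitivity=math.sqrt(x.size))
```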
3.3. Inpainting Using Diffusion Models: RePaint
RePaint [6] is a missing-region restoration (inpainting) algorithm that utilizes diffusion models. The inputs to the algorithm are the original image and a binary mask indicating the areas to be restored. The original RePaint algorithm assumes that the restoration areas are completely noisy (corresponding to a privacy budget of 0 in LDP). In this study, however, the noise in those areas is not necessarily complete.
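A sketch of one RePaint reverse step is given below. The `diffusion.q_sample` (forward noising, as in Section 3.1) and `diffusion.p_sample` (one learned denoising step) calls are assumed interfaces of a trained diffusion model, not a verbatim excerpt of the RePaint implementation.

```python
def repaint_step(x_t, t, x0_known, mask, diffusion):
    """One RePaint reverse step (sketch). mask == 1 marks known pixels
    (to keep), mask == 0 marks pixels to inpaint. `diffusion.q_sample`
    and `diffusion.p_sample` are assumed interfaces for forward noising
    and one learned denoising step, respectively."""
    # Known region: forward-noise the original image down to step t - 1.
    x_known = diffusion.q_sample(x0_known, t - 1)
    # Unknown region: take one learned reverse (denoising) step from x_t.
    x_unknown = diffusion.p_sample(x_t, t)
    # Merge the two via the mask. RePaint's "internal iterations" repeat
    # this step several times per t (12 in our experiments), re-noising
    # the merged image back to step t between repetitions.
    return mask * x_known + (1.0 - mask) * x_unknown
```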
3.4. Post Processing for LDP with Diffusion Models
Shibata et al. [
3] proposed associating this noise addition process with LDP, specifically suggesting that adding noise in the diffusion process corresponds to limiting the privacy budget. More specifically, it becomes possible to back-calculate the privacy budget assumed by the Gaussian mechanism from the variance of the noise added to the image in the forward process. They found that the denoising timesteps in diffusion models correspond to different privacy budgets. Additionally, Shibata et al. [
3] found that denoising in diffusion models dramatically improves the quality, or utility, of images that satisfy LDP by having noise added.
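This back-calculation can be sketched as follows: at forward step $t$, the noise added per unit of (rescaled) signal has standard deviation $\sigma_t = \sqrt{(1 - \bar{\alpha}_t)/\bar{\alpha}_t}$, and inverting a Gaussian-mechanism calibration turns $\sigma_t$ into a privacy budget $\varepsilon(t)$. The classic bound used below is an assumed stand-in for intuition; the exact correspondence is derived in [3].

```python
import math

def epsilon_from_timestep(t, alpha_bars, delta, sensitivity):
    """Back-calculate the privacy budget implied by forward-noising to
    step t (sketch; `alpha_bars` is the cumulative product from the
    schedule sketch in Section 3.1). Larger t -> more noise -> smaller
    epsilon, i.e., stronger privacy."""
    ab = float(alpha_bars[t])
    sigma_t = math.sqrt((1.0 - ab) / ab)  # noise std per unit of signal
    # Invert the classic Gaussian-mechanism calibration (an assumption):
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / sigma_t
```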
3.5. Identity Diffuser (Proposed Framework)
First, the diffusion model was trained using only normal CXR images extracted from the RSNA Pneumonia detection dataset [
20] to ensure that no new lesions are generated during the medical image creation process. Specifically, we trained a diffusion model with 7808 normal CXRs from the RSNA dataset. The resolution of the CXR images was downsampled to 256 × 256 by averaging the pixel values. Second, we prepared CXR images with bounding boxes that localize the anomalous regions. These bounding boxes were adapted from the annotations accompanying the NIH CXR dataset, which served as the basis for the RSNA dataset. These CXR images were processed as follows: inside the bounding box, we retained all the information, i.e., the pixel intensities; outside the bounding box, we added Gaussian noise based on the privacy budget ($\varepsilon$ and $\delta$; i.e., $(\varepsilon, \delta)$-local differential privacy). Third, we ran the RePaint algorithm with the trained model and obtained denoised CXR images. The implementation of the diffusion model was adopted, with slight modifications, from an open repository [21] for 2D image applications, into which we incorporated the RePaint algorithm. Lastly, three radiologists individually inspected the denoised CXR images together with the original images and assigned scores.
To summarize, we have the LDP-processed image $\tilde{x} = M(x)$, which satisfies
$$\Pr[\mathcal{E}(M(x)) = \mathcal{E}(y)] \le e^{\varepsilon}\, \Pr[\mathcal{E}(M(x')) = \mathcal{E}(y)] + \delta,$$
where the operator $\mathcal{E}$ extracts the normal region from the entire image.
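Putting the pieces together, the per-image preparation can be sketched as follows; the bounding-box convention and the helper name are illustrative, and $\sigma$ is obtained from $(\varepsilon, \delta)$ as in Section 3.2. The resulting noisy image and mask are then passed to RePaint, which denoises the region outside the box while leaving the lesion pixels untouched.

```python
import numpy as np

def prepare_identity_diffuser_input(cxr, bbox, sigma):
    """Keep the lesion bounding box intact and add Gaussian noise to the
    rest of the image (sketch). bbox = (row0, row1, col0, col1); sigma is
    calibrated from the privacy budget (epsilon, delta) as in Section 3.2."""
    mask = np.zeros_like(cxr)  # 1 = keep (lesion), 0 = perturb and inpaint
    r0, r1, c0, c1 = bbox
    mask[r0:r1, c0:c1] = 1.0
    noisy = cxr + np.random.normal(0.0, sigma, size=cxr.shape)
    # Lesion pixels are retained exactly; everything else is LDP-noised.
    return mask * cxr + (1.0 - mask) * noisy, mask
```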
The quantification experiments by the physicians were performed on a single laptop using a custom-made application: a Windows Forms application developed with the .NET Framework, which allowed sequential image presentation and comparison. The physicians could systematically input the quality of each image.
There are three evaluation criteria. The first is the overall naturalness of the CXR image. The second is whether the embedded lesion appears natural when contrasted with the rest of the image. The third is the degree of identity suppression. Each criterion is rated on a scale from 1 to 5, where 1 indicates the least natural (or least suppressed) and 5 the most natural (or most suppressed). The criteria were assessed using lesions from ten different diagnoses, two cases per diagnosis, three distinct sampling results, and four privacy budgets; thus, 240 CXR images were evaluated per rater. The evaluation results from the three physicians were averaged to obtain the reported values.
4. Results
Figure 2 shows the results of CXR image processing using the proposed method.
Table 1,
Table 2 and
Table 3 show the relationship (the raw data) between privacy protection and utility for CXR images obtained in this experiment. Utility was evaluated by averaging the results of the first and second evaluation criteria, and the privacy-protection metric was taken directly from the results of the third evaluation criterion.
5. Discussion
It can be observed that as the privacy budget increases, the strength of privacy protection decreases and utility slightly increases. Even when the privacy budget is at its maximum, utility does not reach its peak, indicating room for improvement in the extrapolation algorithm. On the other hand, even at the minimum privacy budget, the strength of privacy protection does not reach its maximum (P5), revealing that individuals can still be somewhat identified from the lesions alone.
This method differs qualitatively from traditional techniques in the way it embeds lesions. It is particularly useful for data augmentation in machine learning for rare diseases, for which only a limited number of images are available.
Although it slightly deviates from the main subject of this study, by strictly defining image adjacency, it may be possible to evaluate the value of the privacy budget more rigorously and generate images that maintain utility with a smaller privacy budget.
Since the lesions are not altered at all, the disease remains fully visible to the human eye; however, the alteration of the surrounding regions may slightly affect the performance of diagnostic (classification) models. Examining the impact of the proposed method on the performance of lesion classification models is designated for future research.
Finally, methods for supplementing missing parts in medical images, such as chest X-ray images, have already been developed (e.g., [
8]). However, to the best of the authors’ knowledge, there are no prior studies that deal with methods for removing individuality while preserving lesion structures, as in this study. Therefore, a direct comparison with prior studies in the strict sense is not possible.
Author Contributions
Conceptualization, H.S. and S.H.; methodology, H.S.; software, H.S.; validation, H.S.; formal analysis, H.S.; investigation, H.S., S.K., S.M., and Y.S.; resources, H.S.; data curation, H.S.; writing—original draft preparation, H.S.; writing—review and editing, H.S., S.H., S.K., S.M., Y.S., and O.A.; visualization, H.S.; supervision, S.H. and O.A.; project administration, S.H.; funding acquisition, S.H. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by Japan Science and Technology Agency (JST), CREST Grant Number JPMJCR21M2.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Acknowledgments
The Department of Computational Diagnostic Radiology and Preventive Medicine, the University of Tokyo Hospital, is sponsored by HIMEDIC Inc. and Siemens Healthcare K.K. This research was conducted using the FUJITSU Supercomputer PRIMEHPC FX1000 and FUJITSU Server PRIMERGY GX2570 (Wisteria/BDEC-01) at the Information Technology Center, the University of Tokyo.
Conflicts of Interest
The authors declare no conflicts of interest. The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.
Abbreviations
The following abbreviations are used in this manuscript:
CXR | Chest X-rays
LDP | Local Differential Privacy
RSNA | Radiological Society of North America
References
- Dwork, C. Differential privacy. In Lecture Notes in Computer Science, Proceedings of the International Colloquium on Automata, Languages, and Programming, Venice, Italy, 10–14 July 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 1–12. [Google Scholar]
- Dwork, C.; Kenthapadi, K.; McSherry, F.; Mironov, I.; Naor, M. Our data, ourselves: Privacy via distributed noise generation. In Lecture Notes in Computer Science, Proceedings of the Advances in Cryptology-EUROCRYPT 2006: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, 28 May–1 June 2006; Proceedings 25; Springer: Berlin/Heidelberg, Germany, 2006; pp. 486–503. [Google Scholar]
- Shibata, H.; Hanaoka, S.; Nakao, T.; Kikuchi, T.; Nakamura, Y.; Nomura, Y.; Yoshikawa, T.; Abe, O. Practical Medical Image Generation with Provable Privacy Protection based on Denoising Diffusion Probabilistic Models for High-resolution Volumetric Images. Appl. Sci. 2024, 14, 3489. [Google Scholar] [CrossRef]
- Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 2020, 33, 6840–6851. [Google Scholar]
- Bertalmio, M.; Sapiro, G.; Caselles, V.; Ballester, C. Image Inpainting. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH ’00), New Orleans, LA, USA, 23–28 July 2000; pp. 417–424. [Google Scholar] [CrossRef]
- Lugmayr, A.; Danelljan, M.; Romero, A.; Yu, F.; Timofte, R.; Van Gool, L. Repaint: Inpainting using denoising diffusion probabilistic models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11461–11471. [Google Scholar]
- Corneanu, C.; Gadde, R.; Martinez, A.M. Latentpaint: Image inpainting in latent space with diffusion models. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2024; pp. 4334–4343. [Google Scholar]
- Tran, M.T.; Kim, S.H.; Yang, H.J.; Lee, G.S. Deep learning-based inpainting for chest X-ray image. In Proceedings of the 9th International Conference on Smart Media and Applications, Jeju, Republic of Korea, 17–19 September 2020; pp. 267–271. [Google Scholar]
- Armanious, K.; Mecky, Y.; Gatidis, S.; Yang, B. Adversarial inpainting of medical image modalities. In Proceedings of the ICASSP 2019—2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 3267–3271. [Google Scholar]
- Zhang, R.; Lu, W.; Wei, X.; Zhu, J.; Jiang, H.; Liu, Z.; Gao, J.; Li, X.; Yu, J.; Yu, M.; et al. A progressive generative adversarial method for structurally inadequate medical image data augmentation. IEEE J. Biomed. Health Inform. 2021, 26, 7–16. [Google Scholar] [CrossRef] [PubMed]
- Jiang, Y.; Xu, J.; Yang, B.; Xu, J.; Zhu, J. Image inpainting based on generative adversarial networks. IEEE Access 2020, 8, 22884–22892. [Google Scholar] [CrossRef]
- Fan, L. Differential privacy for image publication. In Proceedings of the Theory and Practice of Differential Privacy (TPDP) Workshop, London, UK, 11 November 2019; Volume 1, p. 6. [Google Scholar]
- Wen, Y.; Liu, B.; Song, L.; Cao, J.; Xie, R. Differential Private Identification Protection for Face Images. In Face De-identification: Safeguarding Identities in the Digital Era; Springer: Cham, Switzerland, 2024; pp. 75–108. [Google Scholar]
- Shibata, H.; Hanaoka, S.; Cao, Y.; Yoshikawa, M.; Takenaga, T.; Nomura, Y.; Hayashi, N.; Abe, O. Local differential privacy image generation using flow-based deep generative models. Appl. Sci. 2023, 13, 10132. [Google Scholar] [CrossRef]
- Abadi, M.; Chu, A.; Goodfellow, I.; McMahan, H.B.; Mironov, I.; Talwar, K.; Zhang, L. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 308–318. [Google Scholar]
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Lecture Notes in Computer Science, Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III 18; Springer: Cham, Switzerland, 2015; pp. 234–241. [Google Scholar]
- Song, J.; Meng, C.; Ermon, S. Denoising diffusion implicit models. arXiv 2020, arXiv:2010.02502. [Google Scholar]
- Dwork, C. Differential privacy: A survey of results. In Lecture Notes in Computer Science, Proceedings of the International Conference on Theory and Applications of Models of Computation, Xi’an, China, 25–29 April 2008; Springer: Berlin/Heidelberg, Germany, 2008; pp. 1–19. [Google Scholar]
- Xue, H.; Liu, B.; Ding, M.; Zhu, T.; Ye, D.; Song, L.; Zhou, W. Dp-image: Differential privacy for image data in feature space. arXiv 2021, arXiv:2103.07073. [Google Scholar]
- RSNA. RSNA Pneumonia Detection Challenge. 2018. Available online: https://www.rsna.org/rsnai/ai-image-challenge/rsna-pneumonia-detection-challenge-2018 (accessed on 23 July 2024).
- Denoising Diffusion Probabilistic Model, in Pytorch. 2024. Available online: https://github.com/lucidrains/denoising-diffusion-pytorch (accessed on 23 July 2024).