Article

Image Steganography and Style Transformation Based on Generative Adversarial Network

1 School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China
2 CAS Key Laboratory of Electro-Magnetic Space Information, University of Science and Technology of China, Hefei 230027, China
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(4), 615; https://doi.org/10.3390/math12040615
Submission received: 9 January 2024 / Revised: 1 February 2024 / Accepted: 3 February 2024 / Published: 19 February 2024
(This article belongs to the Special Issue Representation Learning for Computer Vision and Pattern Recognition)

Abstract

Traditional image steganography conceals secret messages in unprocessed natural images by modifying pixel values, so the resulting stego differs from the original image in its statistical distribution and can therefore be detected by a well-trained steganalysis classifier. To keep the steganography imperceptible, and in line with the growing popularity on social networks of art images produced by Artificial-Intelligence-Generated Content (AIGC), this paper proposes embedding hidden information throughout the generation of an art-style image by designing an image-style-transformation neural network with a steganography function. The proposed scheme takes a content image, an art-style image, and the messages to be embedded as inputs, processes them with an encoder–decoder model, and finally generates a styled image that simultaneously contains the secret messages. An adversarial training technique is applied to make the generated art-style stego images indistinguishable from plain style-transferred images. The lack of an original cover image makes it difficult for an opponent's learning-based steganalyzer to identify the stego. According to the experimental results, the proposed approach successfully withstands existing steganalysis techniques and attains an embedding capacity of three bits per pixel for a color image.

1. Introduction

Image steganography is a concealed communication method that uses seemingly benign digital images to hide sensitive information. An image with hidden messages is known as a stego. The existing mainstream approaches for image steganography are content-adaptive: they embed secrets into highly textured or noisy regions by minimizing a heuristically defined distortion function that measures statistical detectability or distortion. Based on near-optimal steganographic coding schemes [1,2], numerous efficient steganographic cost functions have been put forth over the years, many of them based on statistical models [3,4] or heuristic principles [5,6,7]. The performance of steganography can also be enhanced by taking into account the correlations between neighboring pixels, as in [8,9,10].
Image steganalysis, on the other hand, seeks to identify the presence of a hidden message inside an image. Traditional steganalysis methods rely on statistical analysis or on training a classifier [11] with hand-crafted features [12,13,14]. In recent years, deep neural networks have been proposed for steganalysis [15,16,17,18,19] and have outperformed traditional methods, which challenges the security of steganography. To defend against steganalysis, some researchers have proposed embedding secret messages using deep neural networks and simulating the competition between steganography and steganalysis with a Generative Adversarial Network (GAN), which alternately updates a generator and a discriminator so that enhanced cover images or distortion costs can be learned. However, since these methods embed messages into an existing image, the adversary can generate cover–stego pairs, which provide additional information for steganalysis. To solve this problem, some works utilize GANs to learn a mapping from the secret information to the stego and directly produce stego images without a cover [20,21,22,23,24,25]. However, the images obtained by such GANs are unsatisfactory in visual quality because of the difficulty of the image-generation task.
The goal of the above-mentioned methods is to keep the stego images indistinguishable from unprocessed natural images, since sharing natural images has been a common phenomenon in recent years. Recently, with the rapid growth of AIGC, well-performing image-generation and image-processing models such as DALL·E 2 [26] and Stable Diffusion [27] have emerged in great numbers, drawing increasing attention to steganography in AI-generated or AI-processed images [28,29]. Among the images produced by AI, art-style images have become especially popular on social networks; generating stegos that are indistinguishable from style-transferred images could therefore be a new route to high-capacity and secure steganography. In [30], Zhong et al. proposed a steganography method for stylized images. They produced two similar stylized images with different parameters, one used for embedding and the other as a reference. However, because the method still depends on the framework of embedding distortion and STC coding, the adversary may detect the stego by generating cover–stego pairs and training a classifier; thereby, the stego images face the risk of being detected. In this paper, we propose to encode secret messages into images at the same time as the generation of style-transferred images. The contributions of the paper are as follows:
  • We designed a framework for image steganography during the process of image style transfer. The proposed method is more secure compared to traditional steganography since the steganalysis without the corresponding cover–stego pairs is difficult.
  • We validated the effectiveness of the proposed method by experiments. The results showed that the proposed approach can successfully embed a payload of 1 bit per channel per pixel (bpcpp), and the generated stego cannot be distinguished from the clean style-transferred images generated by a model without steganography. The accuracy of the recovered information was 99%. Although it was not 100%, this can be addressed by encoding the secret information with error-correction codes before hiding it in the image.

2. Related Works

2.1. Image Steganography

The research on steganography is usually based on the “prisoner’s problem” model, which was proposed by American scholar Simmons in 1983 and is described as follows: “Assuming Alice and Bob are held in different prisons and wish to communicate with each other to plan their escape, but all communication must be checked by the warden Wendy”. The steganographic communication process is shown in Figure 1. The sender, Alice, hides the message in a seemingly normal carrier by selecting a carrier that Wendy allows and using the key shared with the receiver, Bob. This process can be represented as:
Emb(c, m, k) = s
Then, the carrier is transmitted to the receiver, Bob, through a public channel. Bob receives the carrier containing the message and uses a shared key to extract the message:
Ext(s, k) = m.
Wendy, the monitor in the public channel, aims to detect the presence of covert communication.
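To make the Emb/Ext notation concrete, here is a toy, key-driven LSB scheme in Python; it only illustrates the roles of c, m, k, and s and is not the method proposed in this paper (the PRNG-based pixel selection is an assumption of the example).

```python
import numpy as np

def emb(cover, message_bits, key):
    """Toy Emb(c, m, k) = s: hide bits in the LSBs of key-selected pixels."""
    rng = np.random.default_rng(key)                     # the shared key seeds the PRNG
    idx = rng.choice(cover.size, size=len(message_bits), replace=False)
    stego = cover.copy().ravel()
    stego[idx] = (stego[idx] & 0xFE) | np.asarray(message_bits, dtype=np.uint8)
    return stego.reshape(cover.shape)

def ext(stego, key, n_bits):
    """Toy Ext(s, k) = m: read the LSBs of the same key-selected pixels."""
    rng = np.random.default_rng(key)
    idx = rng.choice(stego.size, size=n_bits, replace=False)
    return stego.ravel()[idx] & 1

cover = np.random.randint(0, 256, (8, 8), dtype=np.uint8)
bits = [1, 0, 1, 1, 0]
assert ext(emb(cover, bits, key=42), key=42, n_bits=len(bits)).tolist() == bits
```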
Existing steganography methods can be divided into three categories: (1) cost-based steganography, (2) model-based steganography, and (3) coverless (generative) steganography.

2.1.1. Cost-Based Steganography

Each cover element i ∈ {1, …, N} is allocated a cost ρ_i ≥ 0 and a probability β_i of modifying its value according to the image content; techniques such as UNIWARD, WOW, and HILL propose a variety of cost-design methods. The objective of cost-based steganography is to embed the secret message into the cover in a way that minimizes the expected total cost of all modified pixels, calculated as d = Σ_{i=1}^{N} β_i ρ_i. To this end, the problem of secret embedding is formulated as source coding with a fidelity constraint, and near-optimal coding schemes, Syndrome-Trellis Codes (STCs) and Steganographic Polar Codes (SPCs), have been developed. Cost-based steganography embeds secrets adaptively; hence, the steganography is imperceptible. However, the costs are often determined via heuristic methods and cannot be mathematically associated with the detectability of the embedding changes. Moreover, when a well-informed adversary is cognizant of the change rates of the embedding, which serve as a kind of side information in steganalysis, the adversary can improve the steganalysis accuracy by utilizing selection-channel-aware features or Convolutional Neural Networks.
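As a numerical illustration of the expected distortion d = Σ_i β_i ρ_i, the sketch below simulates the standard payload-limited sender: given per-pixel costs ρ_i, it searches for the Lagrange multiplier λ so that the total entropy of the modification probabilities β_i matches the payload. The binary embedding model and the random costs are assumptions of this example, not taken from any specific scheme.

```python
import numpy as np

def modification_probs(rho, payload_bits, iters=60):
    """Binary payload-limited sender: beta_i = exp(-lam*rho_i) / (1 + exp(-lam*rho_i)),
    with lam found by binary search so that the sum of binary entropies equals the payload."""
    rho = rho.ravel().astype(np.float64)

    def entropy_and_probs(lam):
        beta = np.exp(-lam * rho) / (1.0 + np.exp(-lam * rho))
        eps = 1e-12
        h = -beta * np.log2(beta + eps) - (1 - beta) * np.log2(1 - beta + eps)
        return h.sum(), beta

    lo, hi = 1e-6, 1e3
    beta = None
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        h, beta = entropy_and_probs(mid)
        if h > payload_bits:      # too many embedding changes allowed -> increase lambda
            lo = mid
        else:
            hi = mid
    return beta

rho = np.random.rand(256 * 256) * 10.0                         # hypothetical per-pixel costs
beta = modification_probs(rho, payload_bits=0.4 * rho.size)    # 0.4 bpp payload
expected_distortion = float(np.sum(beta * rho))                # d = sum_i beta_i * rho_i
```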

2.1.2. Model-Based Steganography

Model-based steganography establishes a mathematical model for the distribution of carriers, aiming to embed messages while preserving the inherent distribution model as much as possible. MiPOD is an example of such a model-based steganographic scheme. It assumes the noise residuals in a digital image follow a Gaussian distribution with zero mean and variances σ_i^2, which are estimated for each cover element i. The messages are embedded so as to minimize the effectiveness of the most advanced detector that an adversary can build. While this approach is theoretically secure, challenges arise because the distribution models vary among multimedia data, such as images and videos, acquired by different sensors. Furthermore, the influence of distinct temporal and environmental factors on the pixel distribution complicates the identification of a universally applicable model that accurately fits real-world distributions.
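A rough sketch of how per-pixel variances σ_i^2 could be estimated from a cover image is given below; it is a simplified stand-in (high-pass residual plus local averaging), not MiPOD's actual estimator, which relies on a Wiener filter and a local parametric model.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def estimate_pixel_variances(img, block=9):
    """Estimate a variance sigma_i^2 for every pixel of a grayscale image."""
    img = img.astype(np.float64)
    residual = img - uniform_filter(img, size=3)         # crude high-pass noise residual
    sigma2 = uniform_filter(residual ** 2, size=block)   # local second moment of the residual
    return np.maximum(sigma2, 1e-3)                      # floor the variances for stability
```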

2.1.3. Coverless Steganography

Unlike cost-based or model-based methods, where a prepared cover object is modified to hide the data, coverless steganography is based on the principle that natural carriers may already carry the secret information that both parties wish to transmit. It does not prepare a cover to be modified; instead, it embeds the information directly within the carrier medium itself. Traditional methods achieve this by selecting an image that happens to match the message to be transmitted. With the development of generative models, recent research has proposed embedding the messages during image generation or processing.

2.2. Image Style Transfer

Image style conversion methods can be divided into Non-Photorealistic Rendering (NPR) methods and computer vision methods. NPR has developed into an important area in the field of computer graphics; however, most NPR stylization algorithms are designed for specific artistic styles and are not easily extended to other styles. Computer vision methods regard style transformation as a texture-synthesis problem, that is, extracting the source texture and mapping it onto the target texture. The "image analogy" learning framework achieves universal style conversion by learning from provided pairs of unstylized and stylized example images. However, these methods only use low-level image patterns.
Such low-level features cannot effectively capture high-level structural information of images. Inspired by Convolutional Neural Networks (CNNs), Gatys et al. [31] first studied how to use CNNs to transform natural images into famous painting styles, such as van Gogh's The Starry Night. They proposed modeling the content of photos as intermediate-layer features of pretrained CNNs and modeling artistic styles as the statistics of those intermediate-layer features.
With the rapid development of CNN-based style-transfer networks, the efficiency of image style conversion has gradually improved, and image-processing applications such as Prisma and Deep Forger have become popular, making the sharing of artistic-style images on social platforms a common phenomenon. Therefore, covert communication using stylized images as carriers should not arouse suspicion easily. Based on this observation, this paper proposes a steganography method built on image style transfer, which embeds secret messages while performing image stylization, making the generated stego indistinguishable from a clean stylized image and improving the steganography's security and capacity.

3. Proposed Methods

It has been shown that deep neural networks can learn to encode a wealth of relevant information as invisible perturbations [24]. Image style transfer can be viewed as encoding the target style information into the content image. Therefore, we encode the secret information during the process of image style transfer, directly creating a stylized image with hidden secret messages, as opposed to first performing style transfer and then applying a separate embedding step to the image. The style-transferred image containing the secret message is expected to be indistinguishable from one without the secret message; to enhance its visual quality, a GAN model is used, where SRNet is utilized as the discriminator, since it learns detailed image features and performs well at distinguishing traditional stegos from covers.
As shown in Figure 2, the network architecture consists of four parts: (1) a generator G, which takes the content image and the to-be-embedded message as inputs and simultaneously performs style transformation and information embedding; (2) a message extractor E, which is trained along with G, takes the stego image as the input, and precisely retrieves the hidden information; (3) a discriminator A, which is iteratively updated with the generator and extractor; and (4) a style-transfer loss-computing network L, a pretrained VGG model employed to compute the resulting image's style and content losses. The whole model is trained by the sender, and once the model is well trained, the message-extraction network E is shared with the receiver to extract the secret messages hidden in the received image. The implementation details of each part are as follows.
In our implementation, we adopted the architecture of the image transformation networks in [32] as the generator G. It first uses two stride-2 convolutions to downsample the input, followed by several residual blocks; then, two convolutional layers with stride 1/2 are used to upsample, followed by a stride-1 convolutional layer with a 9 × 9 kernel. Instance Normalization [33] is added at the start and end of the network.
To encode secret messages during the image style transfer, we concatenated the message M of size C_m × H × W with the output of the first convolutional layer applied to the input content image X_c of size C × H × W and took the resulting tensor as the input of the second convolutional layer; in this way, we obtain a feature map that contains both the secret messages and the input content. The following architecture resembles an encoder–decoder, which first combines and condenses the information and then restores an image from the condensed features. The final output of G is a style-transferred image Y_s of size C × H × W, which also contains the secret messages. The details of the architecture are shown in Table 1.
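The following condensed PyTorch sketch shows the message-injection idea described above: the C_m × H × W message tensor is concatenated with the feature map produced by the first convolution, and the concatenated tensor then flows through the encoder–decoder. The residual blocks of Table 1 are omitted and the layer sizes are simplified, so the exact channel counts here are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class StegoStyleGenerator(nn.Module):
    """Sketch of the generator G: the message is concatenated with the first conv's
    feature map before the encoder-decoder. Residual blocks are omitted and the
    channel counts are illustrative, not the exact configuration of Table 1."""
    def __init__(self, msg_channels=3):
        super().__init__()
        self.first_conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=9, stride=1, padding=4),
            nn.InstanceNorm2d(32), nn.ReLU(inplace=True))
        self.encoder = nn.Sequential(                      # two stride-2 downsampling convs
            nn.Conv2d(32 + msg_channels, 64, 3, stride=2, padding=1),
            nn.InstanceNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1),
            nn.InstanceNorm2d(128), nn.ReLU(inplace=True))
        self.decoder = nn.Sequential(                      # upsample back to the image size
            nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(64), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1, output_padding=1),
            nn.InstanceNorm2d(32), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, kernel_size=9, stride=1, padding=4))

    def forward(self, content, message):
        feat = self.first_conv(content)            # C x H x W content -> 32 x H x W features
        feat = torch.cat([feat, message], dim=1)   # inject the C_m x H x W secret message
        return self.decoder(self.encoder(feat))    # stego style-transferred image Y_s

content = torch.rand(1, 3, 256, 256)                       # content image X_c
message = torch.randint(0, 2, (1, 3, 256, 256)).float()    # 1 bpcpp binary message M
stego = StegoStyleGenerator()(content, message)            # 1 x 3 x 256 x 256 output
```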

3.1. Style Transfer Loss Computing

The resulting image should preserve the content of X_c and possess the target style, which is defined by a target style image X_s. For this reason, we applied a loss-calculation network L to quantify the high-level content difference between the resulting image and X_c and the style difference between the resulting image and X_s, respectively. L is implemented as a 16-layer VGG network pre-trained on the ImageNet dataset for image classification. To achieve style transfer, two perceptual losses were designed: the content reconstruction loss and the style reconstruction loss.

3.1.1. Content Reconstruction Loss

We define the content reconstruction loss as the difference between the activations of the intermediate layers of L when X_c and Y_s are given as inputs. Denoting the activation maps of the j-th layer of the network for an input image x as ϕ_j(x), the content loss is defined as the mean-squared error between the activation maps of Y_s and X_c:
L_{cont}(X_c, Y_s, j) = \frac{1}{C_j H_j W_j} \left\| \phi_j(X_c) - \phi_j(Y_s) \right\|_2^2
It is shown in [32] that the high-level content of the image is kept in the responses of the higher layers of the network, while the detailed pixel information is kept in the responses of the lower layers. Therefore, we calculated the perceptual loss for style transfer at the high layers. This does not require that the output image Y_s perfectly matches X_c; instead, it encourages it to be perceptually similar to X_c; hence, there is extra room for us to implement style transfer and steganography.
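A small sketch of the content reconstruction loss using torchvision's pretrained VGG-16 as the loss network L is given below; picking the relu3_3 activation (features[:16]) as the "high layer" and skipping ImageNet normalization are simplifying assumptions of this example.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Truncate VGG-16 after relu3_3 and freeze it; this plays the role of phi_j in Equation (3).
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def content_loss(x_c, y_s):
    """L_cont: mean-squared error between phi_j(X_c) and phi_j(Y_s),
    i.e., the squared difference averaged over the C_j * H_j * W_j activations."""
    return F.mse_loss(vgg(y_s), vgg(x_c))
```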

3.1.2. Style Reconstruction Loss

To implement style transfer, in addition to the content loss, a style reconstruction loss is required to penalize Y_s when its style, such as colors and textures, deviates from that of X_s. To this end, we first define the Gram matrix G_j^ϕ(x) as a matrix of size C_j × C_j, whose elements are defined as:
G_j^{\phi}(x)_{c,c'} = \frac{1}{C_j H_j W_j} \sum_{h=1}^{H_j} \sum_{w=1}^{W_j} \phi_j(x)_{h,w,c} \, \phi_j(x)_{h,w,c'}
To achieve better performance, we calculated the style loss L_sty over a set of layers J instead of a single layer j. Specifically, L_sty is defined as the sum of the per-layer losses for each j ∈ J, as described in Equation (5).
L_{sty} = \sum_{j \in J} \left\| G_j^{\phi}(X_s) - G_j^{\phi}(Y_s) \right\|_F^2
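A short sketch of the Gram-matrix computation of Equation (4) and the multi-layer style loss of Equation (5) follows; the lists of activations at the chosen layers J are assumed to be produced elsewhere (e.g., by hooks on the VGG loss network).

```python
import torch

def gram_matrix(phi):
    """G_j^phi for a batch of activations of shape B x C_j x H_j x W_j,
    normalized by C_j * H_j * W_j as in Equation (4)."""
    b, c, h, w = phi.shape
    feats = phi.view(b, c, h * w)
    return feats @ feats.transpose(1, 2) / (c * h * w)   # B x C_j x C_j

def style_loss(phis_style, phis_output):
    """L_sty: sum over layers j in J of the squared Frobenius distance between the
    Gram matrices of the style target X_s and the generated image Y_s."""
    return sum(torch.sum((gram_matrix(a) - gram_matrix(b)) ** 2)
               for a, b in zip(phis_style, phis_output))
```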

3.2. Extractor

To accurately recover the embedded information, a message-extraction network E, which has the same architecture as the generator G, is trained together with G. It takes the generated image Y_s as the input and outputs a tensor O of size C_m × H × W. The revealed message M′ is obtained from O as:
M'_{i,j,k} = \begin{cases} 0, & \text{if } O_{i,j,k} < 0 \\ 1, & \text{if } O_{i,j,k} \ge 0 \end{cases}
The loss for revealing the information is defined as the mean-squared error between the embedded message M and the extracted message M′:
L_{ext} = \| M - M' \|_2^2
When the model is well trained, E is shared between Alice and Bob for covert communication and plays the role of the secret key. Therefore, it is crucial to keep the trained E secret.
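The sketch below shows how the extractor output is turned into bits (Equation (6)) and how the extraction loss (Equation (7)) can be computed during training; applying the hard threshold only at test time, and training on the raw output O instead, is an assumption of this sketch, since the thresholding itself provides no gradient.

```python
import torch
import torch.nn.functional as F

def extract_bits(extractor, stego_image):
    """Test-time recovery: O = E(Y_s), then bit = 0 if O < 0 and 1 otherwise."""
    with torch.no_grad():
        o = extractor(stego_image)          # real-valued tensor of size C_m x H x W
    return (o >= 0).float()                 # recovered message M'

def extraction_loss(extractor, stego_image, message):
    """Training loss L_ext = || M - M' ||^2, computed here on the raw extractor output."""
    return F.mse_loss(extractor(stego_image), message)
```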

3.3. Adversary

To enhance the visual quality of the resulting Y_s, an adversarial training technique is applied, where SRNet [18] serves as a discriminator to classify the generated style-transferred images containing secret messages and the clean style-transferred images generated by a style-transfer network without the steganography function. The cross-entropy loss is applied to measure the performance of the discriminator, as defined in Equation (8).
L_{adv} = - \left[ y \log A(x) + (1 - y) \log (1 - A(x)) \right]
Here, A(x) denotes the discriminator's output for an input image x, and y is its label (1 for stego, 0 for clean). When updating the generator, the objective is to maximize L_adv, while when updating the discriminator, the objective is to minimize L_adv.

3.4. Training

In the training process, we iteratively update the parameters of the generator and the adversary. Each iteration contains two epochs: in the first epoch, we keep the parameters of the discriminator fixed and update the parameters of the first convolution layer, the generator, and the extractor by minimizing the content loss L_cont, style loss L_sty, and message-extraction loss L_ext while maximizing the discriminator's loss L_adv; hence, the total loss for training is defined as follows:
L_{total} = \alpha L_{cont} + \beta L_{sty} + \lambda L_{ext} - \gamma L_{adv},
where α, β, λ, and γ are hyper-parameters that balance the content, style, message-extraction accuracy, and the risk of being detected by the discriminator. In the second epoch, we update the parameters of the adversary using the loss defined in Equation (8) while keeping the remaining parameters fixed.
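A minimal sketch of this alternating schedule is given below. G, E, and D stand for the generator, extractor, and SRNet discriminator; content_loss and style_loss are the VGG-based losses of Section 3.1; clean_transfer is an assumed helper that produces cover images from a steganography-free style-transfer network; the hyper-parameter values and the assumption that D ends in a sigmoid are choices of this sketch, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def training_iteration(G, E, D, content_loss, style_loss, clean_transfer,
                       x_c, x_s, msg, opt_ge, opt_d,
                       alpha=1.0, beta=10.0, lam=1.0, gamma=0.1):
    # Phase 1: update G and E while D is frozen; the generator tries to *increase* D's loss.
    y_s = G(x_c, msg)
    d_out = D(y_s)                                           # D outputs P(stego), assumed in (0, 1)
    l_adv = F.binary_cross_entropy(d_out, torch.ones_like(d_out))
    l_total = (alpha * content_loss(x_c, y_s) + beta * style_loss(x_s, y_s)
               + lam * F.mse_loss(E(y_s), msg) - gamma * l_adv)   # Equation (9)
    opt_ge.zero_grad()
    l_total.backward()
    opt_ge.step()

    # Phase 2: update D while G and E are frozen; D separates stego from clean covers.
    with torch.no_grad():
        y_s = G(x_c, msg)
        y_clean = clean_transfer(x_c)                        # cover: style transfer without secret
    d_stego, d_clean = D(y_s), D(y_clean)
    d_loss = (F.binary_cross_entropy(d_stego, torch.ones_like(d_stego))
              + F.binary_cross_entropy(d_clean, torch.zeros_like(d_clean)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()
```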

4. Experiments

To verify the efficiency of the suggested approach, we randomly chose a style image from the WikiArt dataset as the target style and randomly took 20,000 content images from COCO [34], 10,000 for training and 10,000 for testing. We repeated the experiments 10 times. All the images were resized to 512 × 512 px with 3 channels, and the messages to be embedded were binary data of size 3 × 512 × 512, i.e., the payload was set to 1 bit per channel per pixel (bpcpp). In the training, the Adam optimizer was applied, and the learning rate was set to 1 × 10^{-4}. We trained the network for 200 epochs. The performance of the proposed method was evaluated from two aspects: (1) the accuracy of message extraction and (2) the ability to resist steganalysis. To demonstrate the versatility and robustness of the proposed method, we also validated it on another style image taken from the Internet.

4.1. Message Extraction Accuracy Analysis

We assumed that the sender and the receiver share the parameters and architecture of the extractor, and that the adversary knows the data-hiding algorithm and can train a model by herself but will obtain mismatched parameters. We explored whether, in such a situation, the hidden message can be extracted accurately by the receiver and whether the secret messages could be leaked to the adversary.
We trained five models of the same architecture (illustrated in Figure 2) but with different random seeds. The well-trained networks are denoted Net_1, Net_2, Net_3, Net_4, and Net_5, respectively. We randomly split the content dataset into two separate sets, one for training and the other for testing. The secret messages to be embedded were randomly generated binary sequences reshaped to 3 × 256 × 256. In the testing stage, we extracted the hidden messages using extractors from the different trained models. The results are displayed in Table 2, from which we can infer that the matched extractor can successfully extract the concealed message: the accuracy of the extracted message reached 99.2%, demonstrating that the receiver can accurately recover the messages. In contrast, an adversary cannot steal the secret messages hidden by the proposed method, since a mismatched extractor recovers less than 50% of the message bits correctly.

4.2. Security in Resisting Steganalysis

To verify the security of the embedded secret messages, we compared the generated stego style-transferred images with the clean style-transferred images generated by the style-transfer network without steganography [32]. We trained four networks, M_c1, M_c2, M_s1, and M_s2. M_s1 and M_s2 share the architecture proposed in this paper but have different parameters; M_c1 and M_c2 are style-transfer networks without steganography [32]. The generated images are displayed in Figure 3, where it is clear that the message embedding has no visually perceptible effect on the image.
The residuals between the clean images and the stego images carrying secrets are shown in Figure 4. It should be noted that the difference between the generated stego style-transferred images and the style-transferred images without hidden messages is caused not only by the message embedding, but also by the different parameters of the models; e.g., the two clean models M_c1 and M_c2 produce images that differ from each other, just as they differ from those of M_s1 and M_s2. Thereby, it is difficult to tell whether an image has been produced by a style-transfer network with the steganography function or by another ordinary style-transfer network without steganography.
To verify the security of the proposed method, we assumed the attacker is trying to distinguish the generated stegos from covers generated by other normal style-transfer networks without the steganography function. According to the Kerckhoffs principle, we considered a powerful steganalyzer who knows the target style image and all the knowledge of the model (i.e., the architecture and parameters) the steganographer has used. In this case, the attacker can generate the same stegos as the steganographer, taking the generated stegos as positive samples and the covers generated by the clean models as negative samples to train a binary classifier. We applied different steganalysis methods, including classifiers trained on the traditional SPAM [13] and SRM [14] features, as well as the deep learning methods XuNet [16] and SRNet [18]. As is common in steganalysis, we kept the cover and stego of the same content in the same batch when training the deep-learning-based steganalyzers. The experimental findings are given in Table 3. The average testing errors were all about 0.5, confirming the security of the proposed method.
We also compared the security of the proposed method with other state-of-the-art steganography methods. The performance under a 0.4 bpp payload is shown in Table 4. It can be seen that the detection error of our method was about 0.5, which is equivalent to random guessing; hence, we can infer that our method cannot be detected. Since traditional methods embed secrets by modifying the pixel values of the original image, the modification traces can be reflected by some statistical features or learned by a deep neural network. In contrast, the proposed method embeds the secrets during image generation; there is no exact cover for the steganalyzer to refer to, so it is difficult to detect.

5. Conclusions

In this study, we proposed a high-capacity and secure method for image steganography. We hide secret messages in an art-style image during image generation by a GAN model. Experiments verified that the proposed approach achieves a high capacity of 1 bpcpp and that the generated images cannot be distinguished from clean images of the same content and style. The proposed method provides a new way for covert communication on social networks. However, there are still some limitations in its application. The message-recovery accuracy did not reach 100%; in addition, real-world communication channels introduce complex noise, and some platforms compress images before transmission, which will decrease the accuracy of message recovery. In the future, we will consider applying error-correction coding to secret messages before embedding them into the image and explore how to improve the robustness of the steganography.

Author Contributions

Conceptualization, L.L.; Methodology, L.L. and K.C.; Software, L.L.; Validation, K.C.; Writing—original draft, L.L.; Writing—review & editing, G.F. and D.W.; Visualization, D.W.; Supervision, X.Z. and W.Z.; Project administration, X.Z. and G.F.; Funding acquisition, X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Science Foundation of China under Grants U22B2047, U1936214, and 62302286 and the China Postdoctoral Science Foundation under Grant No. 2023M742207.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Filler, T.; Judas, J.; Fridrich, J. Minimizing Additive Distortion in Steganography using Syndrome-Trellis Codes. IEEE Trans. Inf. Forensics Secur. 2011, 6, 920–935. [Google Scholar] [CrossRef]
  2. Yao, Q.; Zhang, W.; Chen, K.; Yu, N. LDGM Codes Based Near-optimal Coding for Adaptive Steganography. IEEE Trans. Commun. 2023, 2023, 1. [Google Scholar] [CrossRef]
  3. Pevný, T.; Filler, T.; Bas, P. Using high-dimensional image models to perform highly undetectable steganography. In Proceedings of the International Workshop on Information Hiding, Calgary, AB, Canada, 28–30 June 2010; pp. 161–177. [Google Scholar]
  4. Sedighi, V.; Cogranne, R.; Fridrich, J. Content-Adaptive Steganography by Minimizing Statistical Detectability. IEEE Trans. Inf. Forensics Secur. 2015, 11, 221–234. [Google Scholar] [CrossRef]
  5. Holub, V.; Fridrich, J. Designing Steganographic Distortion Using Directional Filters. In Proceedings of the IEEE Workshop on Information Forensic and Security (WIFS), Tenerife, Spain, 2–5 December 2012; pp. 234–239. [Google Scholar]
  6. Holub, V.; Fridrich, J.; Denemark, T. Universal Distortion Function for Steganography in an Arbitrary Domain. EURASIP J. Inf. Secur. 2014, 2014, 1–13. [Google Scholar] [CrossRef]
  7. Li, B.; Wang, M.; Huang, J.; Li, X. A new cost function for spatial image steganography. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Paris, France, 27–30 October 2014; pp. 4206–4210. [Google Scholar]
  8. Li, B.; Wang, M.; Li, X.; Tan, S.; Huang, J. A strategy of clustering modification directions in spatial image steganography. IEEE Trans. Inf. Forensics Secur. 2015, 10, 1905–1917. [Google Scholar]
  9. Denemark, T.; Fridrich, J. Improving steganographic security by synchronizing the selection channel. In Proceedings of the 3rd ACM Information Hiding and Multimedia Security Workshop, Portland, OR, USA, 17–19 June 2015; pp. 5–14. [Google Scholar]
  10. Li, W.; Zhang, W.; Chen, K.; Zhou, W.; Yu, N. Defining joint distortion for JPEG steganography. In Proceedings of the 6th ACM Workshop on Information Hiding and Multimedia Security, Innsbruck, Austria, 20–22 June 2018; pp. 5–16. [Google Scholar]
  11. Kodovský, J.; Fridrich, J.; Holub, V. Ensemble classifiers for steganalysis of digital media. IEEE Trans. Inf. Forensics Secur. 2012, 7, 432–444. [Google Scholar] [CrossRef]
  12. Holub, V.; Fridrich, J. Low-complexity features for JPEG steganalysis using undecimated DCT. IEEE Trans. Inf. Forensics Secur. 2015, 10, 219–228. [Google Scholar] [CrossRef]
  13. Li, B.; Li, Z.; Zhou, S.; Tan, S.; Zhang, X. New steganalytic features for spatial image steganography based on derivative filters and threshold LBP operator. IEEE Trans. Inf. Forensics Secur. 2018, 13, 1242–1257. [Google Scholar] [CrossRef]
  14. Fridrich, J.; Kodovsky, J. Rich models for steganalysis of digital images. IEEE Trans. Inf. Forensics Secur. 2012, 7, 868–882. [Google Scholar] [CrossRef]
  15. Qian, Y.; Dong, J.; Wang, W.; Tan, T. Deep learning for steganalysis via Convolutional Neural Networks. Proc. SPIE 2015, 9409, 94090J. [Google Scholar]
  16. Xu, G.; Wu, H.Z.; Shi, Y.Q. Structural design of Convolutional Neural Networks for steganalysis. IEEE Signal Process. Lett. 2016, 23, 708–712. [Google Scholar] [CrossRef]
  17. Ye, J.; Ni, J.; Yi, Y. Deep learning hierarchical representations for image steganalysis. IEEE Trans. Inf. Forensics Secur. 2017, 12, 2545–2557. [Google Scholar] [CrossRef]
  18. Boroumand, M.; Chen, M.; Fridrich, J. Deep residual network for steganalysis of digital images. IEEE Trans. Inf. Forensics Secur. 2018, 14, 1181–1193. [Google Scholar] [CrossRef]
  19. Butora, J.; Yousfi, Y.; Fridrich, J. How to Pretrain for Steganalysis. In Proceedings of the 9th Information Hiding and Multimedia Security Workshop, Brussels, Belgium, 22–25 June 2021. [Google Scholar]
  20. Zhang, J.; Chen, K.; Li, W.; Zhang, W.; Yu, N. Steganography with Generated Images: Leveraging Volatility to Enhance Security. IEEE Trans. Dependable Secur. Comput. 2023, 2023, 1–12. [Google Scholar] [CrossRef]
  21. Chen, K.; Zhou, H.; Wang, Y.; Li, M.; Zhang, W.; Yu, N. Cover Reproducible Steganography via Deep Generative Models. IEEE Trans. Dependable Secur. Comput. 2022, 20, 3787–3798. [Google Scholar] [CrossRef]
  22. Zhu, J.; Kaplan, R.; Johnson, J.; Li, F.F. Hidden: Hiding data with deep networks. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 657–672. [Google Scholar]
  23. Tan, J.; Liao, X.; Liu, J.; Cao, Y.; Jiang, H. Channel Attention Image Steganography with Generative Adversarial Networks. IEEE Trans. Netw. Sci. Eng. 2022, 9, 888–903. [Google Scholar] [CrossRef]
  24. Tang, W.; Li, B.; Mauro, B.; Li, J.; Huang, J. An automatic cost learning framework for image steganography using deep reinforcement learning. IEEE Trans. Inf. Forensics Secur. 2020, 16, 952–967. [Google Scholar] [CrossRef]
  25. Guan, Z.; Jing, J.; Deng, X.; Xu, M.; Jiang, L.; Zhang, Z.; Li, Y. DeepMIH: Deep invertible network for multiple image hiding. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 372–390. [Google Scholar] [CrossRef]
  26. Ramesh, A.; Dhariwal, P.; Nichol, A.; Chu, C.; Chen, M. Hierarchical Text-Conditional Image Generation with Clip Latents. arXiv 2022, arXiv:2204.06125. [Google Scholar]
  27. Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 10684–10695. [Google Scholar]
  28. Bui, T.; Agarwal, S.; Yu, N.; Collomosse, J. RoSteALS: Robust Steganography using Autoencoder Latent Space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 933–942. [Google Scholar]
  29. Yu, J.; Zhang, X.; Xu, Y.; Zhang, J. CRoSS: Diffusion Model Makes Controllable, Robust and Secure Image Steganography. arXiv 2023, arXiv:2305.16936. [Google Scholar]
  30. Zhong, N.; Qian, Z.; Wang, Z.; Zhang, X. Steganography in stylized images. J. Electron. Imaging 2019, 28, 033005. [Google Scholar] [CrossRef]
  31. Gatys, L.A.; Ecker, A.S.; Bethge, M. A neural algorithm of artistic style. arXiv 2015, arXiv:1508.06576. [Google Scholar] [CrossRef]
  32. Johnson, J.; Alahi, A.; Li, F. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision; Springer International Publishing: Berlin/Heidelberg, Germany, 2016; pp. 694–711. [Google Scholar]
  33. Huang, X.; Belongie, S. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1501–1510. [Google Scholar]
  34. Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in context. In Proceedings of the Computer Vision–ECCV, Zurich, Switzerland, 6–12 September 2014; pp. 740–755. [Google Scholar]
Figure 1. Comparison with traditional image steganography and style transfer steganography.
Figure 2. Framework of hiding information in style transform network.
Figure 3. Comparison of clean style-transferred images without steganography (columns (c,d)) and stego style-transferred images (columns (e,f)).
Figure 4. Residuals of the style-transferred images with and without secret information: C_Mc1 and C_Mc2 denote the style-transferred images generated by the clean models M_c1 and M_c2, respectively; S_Ms1 and S_Ms2 denote the style-transferred images with secret messages generated by the steganography models M_s1 and M_s2, respectively.
Table 1. Structure of the message-embedding network and message-extraction network.

| Message-Embedding Network |  | Message-Extraction Network |  |
| Network Layer | Output Size | Network Layer | Output Size |
| input | 3 × 256 × 256 | input | 3 × 256 × 256 |
| padding (40 × 40) | 3 × 336 × 336 | 3 × 9 × 9 conv, step 1 | 3 × 256 × 256 |
| 32 × 9 × 9 conv, step 1 | 32 × 336 × 336 | 32 × 3 × 3 conv, step 1/2 | 32 × 128 × 128 |
| secret message | 3 × 336 × 336 | 64 × 3 × 3 conv, step 1 | 64 × 64 × 64 |
| message concat | 35 × 336 × 336 | residual block, 128 filters | 128 × 64 × 64 |
| 64 × 3 × 3 conv, step 2 | 64 × 168 × 168 | residual block, 128 filters | 128 × 68 × 68 |
| 128 × 3 × 3 conv, step 2 | 128 × 84 × 84 | residual block, 128 filters | 128 × 72 × 72 |
| residual block, 128 filters | 128 × 80 × 80 | residual block, 128 filters | 128 × 76 × 76 |
| residual block, 128 filters | 128 × 76 × 76 | residual block, 128 filters | 128 × 80 × 80 |
| residual block, 128 filters | 128 × 72 × 72 | 128 × 3 × 3 conv, step 2 | 128 × 84 × 84 |
| residual block, 128 filters | 128 × 68 × 68 | 64 × 3 × 3 conv, step 2 | 64 × 168 × 168 |
| residual block, 128 filters | 128 × 64 × 64 | 32 × 9 × 9 conv, step 2 | 32 × 336 × 336 |
| 64 × 3 × 3 conv, step 1/2 | 64 × 128 × 128 | 3 × 9 × 9 conv, step 1 | 3 × 336 × 336 |
| 32 × 3 × 3 conv, step 1/2 | 32 × 256 × 256 |  |  |
| 3 × 9 × 9 conv, step 1 | 3 × 256 × 256 |  |  |
Table 2. Message recovery accuracy using different extractors.

| Net_train \ Net_test | Net_1 | Net_2 | Net_3 | Net_4 | Net_5 |
| Net_1 | 0.99 | 0.39 | 0.31 | 0.28 | 0.32 |
| Net_2 | 0.37 | 0.99 | 0.28 | 0.23 | 0.38 |
| Net_3 | 0.31 | 0.19 | 0.99 | 0.33 | 0.41 |
| Net_4 | 0.40 | 0.29 | 0.31 | 0.99 | 0.34 |
| Net_5 | 0.29 | 0.32 | 0.37 | 0.19 | 0.98 |

The diagonal entries correspond to the matched extractor. Net_1, Net_2, Net_3, Net_4, and Net_5 share the architecture illustrated in Figure 2.
Table 3. Average error of stego with 1 bpp under the detection of different steganalysis methods.

| Steganalysis Method | SPAM [13] | SRM [14] | XuNet [16] | SRNet [18] | SCA-SRNet [18] |
| P_E | 0.48 | 0.49 | 0.51 | 0.47 | 0.48 |
Table 4. Detection error comparison with different steganography methods at 0.4 bpp.

| Steganography | WOW [5] | S-UNIWARD [6] | HILL [7] | Ours |
| SRNet | 0.91 | 0.89 | 0.86 | 0.47 |
| SCA-SRNet | 0.92 | 0.91 | 0.87 | 0.49 |


