1. Introduction
The construction of digital watermark systems is one of the directions of steganography, the purpose of which is to conceal data in digital media or files so that the embedded mark would be difficult to remove or detect. Unlike the usual task of steganography, i.e., the problem of hiding the fact of data transmission, it is usually not necessary to hide the presence of a watermark in most applications. It is believed that it is possible to predict in advance whether a watermark is contained or not. The scope of applications for digital watermarks includes copyright protection and protection against illegal copying. Various features of the classification of digital watermarks (DWMs) are considered in [
1,
2,
3]: DWM sare classified according to reliability, visibility, capacity and embedding methods. In this paper, we focus on inconspicuous watermarks.
A digital watermark is called invisible if the source signal and the output signal are visually indistinguishable. In this case, an interpolated bitmap image acts as the source signal, and an image with embedded information acts as the output signal.
During the study, the following works were analyzed concerning the issues of interpolation, digital watermarks and application of the Bezier curve:
Onearticle presents studies of the current state and development of digital steganography. A review of the subject area is conducted, and the basic concepts of steganography are described. The principles of digital methods and the purpose of digital watermarks are studied. The article discusses the use of digital watermarks. The difference between the task of hidden data transmission and the task of embedding digital watermarks is shown [
4].
Abook presents an exposition of the basic concepts of steganography, as well as a brief description of known stegosystems. The authors describe the main criteria for the stability of stegosystems, including a new criterion proposed by the authors. New results of the work of the research group on the detection of stegosystems for still images, embedding in digital documents and stegosystems for channels with noise, as well as methods for minimizing the detectability of stegosystems in blind stegoanalysis, are presented [
5].
In onepaper, the authors propose a new approach to shape interpolation based on the Poisson equation. The authors propose a method of interpolation of a nonlinear gradient field that takes into account both the coordinates of the vertices and the orientation of the surface. When proper boundary conditions are met, intermediate shapes are restored implicitly from interpolated gradient fields, while traditional methods usually manipulate vertex coordinates directly. Experimental results demonstrate that this method avoids the shrinkage problem that occurs during linear shape interpolation [
6].
In anotherwork, the methods of interpolation of regions proposed in the literature on databases of moving objects impose restrictions that can have a significant impact on the representation of the evolution of moving regions, in particular when there is a turn between two observations. In this paper, the authors propose a data model for moving regions that allows moving segments to rotate and change their length during their evolution between two observations and uses square Bezier curves to determine the trajectories of their endpoints. Experimental results show that the strategy can be used in addition to the methods of interpolation of regions [
7].
In anotherarticle, the researchers present an interpreted fuzzy logic model for edge detection based on a new generalized domain-independent parametric adaptive membership function based on the Bezier curve (ABCMF) constructed for image blurring. In order to create a reliable fuzzy structure using the developed new membership function, the blurred image is convoluted using a simplified edge detector based on fuzziness to determine the direction of intensity changes. From the estimated indicators, the authors conclude that the proposed method provides stable accuracy exceeding 91% compared to its analogues [
8].
2. Background on Lagrange Interpolation and Bezier Curves
In our work, we consider bitmap stegocontainers represented by bmp files. The proposed algorithm is also applicable to other formats that use non-distorting compression methods: tiff, png, pcx, tga, pgm. These containers are matrices of pixel brightness into which copyright information can be recorded.Unlike the known methods of embedding information in images, we propose to initially transform an empty stegocontainer using Lagrange interpolation. Thus, the bitmap can be enlarged by two times so that in addition to the original pixels, there are interpolated values. Let’s take a closer look at how Lagrange interpolation works in a container transform.
Lagrange interpolation is a common method of signal and image processing [
19,
20,
21,
22]. In the segment
at certain points
, (node points)
, the values of the function
are known. The form of the polynomial of order
,
, whose values at the node points are equal to the values of the function
, is:
And polynomial
:
is called an interpolation polynomial which is constructed from points
. Using the Lagrange interpolation formula, it is possible to write
as a linear combination of the interpolation values of the
at node points:
Considering all the relevant conditions [
23], it can be seen that
is determined by the formula:
Thus, the Lagrange interpolation polynomial has the following form:
To solve the problem of watermark creation, we divide the pixels of the interpolated image into groups of five points, on which the Bezier curve is built using this formula:
in which
n = 4,
P(t) is the value of the Bezier curve point,
is image pixel value and
. The step for
can take values of 0.1 or less. The smaller the step value, the more curve values we obtain.
The curves in question were first presented to the general audience in 1962 by Pierre Bezier, who used them for the computer-aided design of automobile bodies. The curves were named after Bezier, and the recursive method of defining curves was named after de Casteljou (de Casteljau’s algorithm). To date, Bezier curve formulas are used in many branches of science, and many scientists are also conducting research in this area [
24,
25,
26,
27].
The Bezier curve is a special case of a Bernstein polynomial, which is a parametric curve and defined by the following expression:
in which
is a number of reference points,
is the reference point number,
is the step,
is the coordinate of the reference point and
is the basis function of the Bezier curve. Then the coordinates of the curve are described depending on the parameter
in the following way:
For two points: .
For three points: .
For four points: .
For five points:
3. Embedding Method LIBC5
We call the proposed algorithm LIBC5 (Lagrange interpolation Bezier curve for 5 points). The algorithm is arranged as follows. First, consider a group of five pixel brightness values of an interpolated bitmap image . The values are the pixel values of the original image for the single selected color component (R, G or B), and the values and are added by interpolation. Embedding a bit of information occurs in pixels and by taking the nearest point with a Bezier curve rounded to an integer, such that the lowest bit of its value coincides with the bit of information we want to embed. To select the desired value of the Bezier curve point, it is necessary to set such a step t thatprovides a sufficient choice of the values of the curve points.
Nowconsider the graph of the averaged pixel brightness and the constructed Bezier curve with step
(
Figure 1):
Along the -axis are the brightness values of the image of one of the color component pixels taken from the first row of the pixel matrix in order. The -axis shows the linear order of pixels and their ordinal numbers. For clarity, in this example, the step was taken as t = 0.1. The graph shows that for every 5 brightness points of the interpolated image (blue line), we have 11 points of the Bezier curve (red line).
Bits of information are written to
and to
by taking such a value from the Bezier curve, in which the lowest bit is equal to the embedded bit of the message.
Figure 2 shows how the corresponding values for replacing pixels
and
are determined.
The graph in
Figure 2 shows that for point
, there is a choice of four values of the points the Bezier curve, which can be used by rounding them to an integer value and using their lowest bit. It is necessary to take from the Bezier curve the closest value with the point
such that the lowest bit of this value will coincide with the bit of the embedded message. Then, for point
, we also need to choose a suitable value with a Bezier curve from four possible ones, after which a segment of the Bezier curve is constructed at the next five pixel points and the actions are repeated.
So, let’s denote the rounded values of the Bezier curve which can be used to replace the current point
or
by the sets
and
, respectively, where
is the number of values at a given step
. The value of
actually represents the number of segments into which the curve passing from point
to point
is divided, with the exception of the first and last segments, and with the exception of those two segments that surround the value of the curve corresponding to
in order. Thus, to calculate
, we need to divide the unit segment by the length of the interval
, subtract four segments and divide the resulting value in half to replace the two values
and
. Then, we need to add one, since in the end we do not need to get the number of segments, but the number of points obtained:
Simplifying this expression, we obtain a formula for calculating
:
Thus, we see that at , a choice of four possible values is created. The edge values on the curve corresponding to , and are not used to replace the values of and because they correspond to those pixels of the image that originally formed the pixel matrix before the interpolation process.
Since the values of the points of the Bezier curve must be rounded to an integer, we can get the same values, which will reduce the choice for replacing
and
. Therefore, we next consider the case at
. In
Figure 3, the red line shows the Bezier curve, and the blue line plots the brightness of the current bitmap image:
Inconstructing a Bezier curve with a step of
t = 0.05, we have a choice of
= 9 values to replace each of
and
.
Figure 4 shows the possible points to replace
and
.
Let’s consider an example of embedding a message with the author’s information in a 029.bmp image file using the LIBC5 method considered.
Table 1 shows the rounded values for the corresponding
t values for the 029.bmp file, as well as the correspondence of these values to points
and
:
Table 1 shows that the following options are available to replace the value of
: 48, 49, 51, 52, 54, 55, 56 and 57, along withoptions such as 58, 57, 56, 55, 54 and 53 to replace
. It should also be taken into account that of the possible values to replace
and
, we have to choose one suitable value, the lowest bit of which will coincide with the message bit for embedding. At the same time, if the lowest bit of
and
has already coincided with the message bit, then no replacement is required.
If we assume that the sequence of message bits is = (1, 0), we canconvert to binary the values of the set , corresponding to :
=4810=1100002
=4910=1100012
=5110=1100112
=5210=1101002
=5410=1101102
=5510=1101112
=5610=1110002
=5710=1110012
To embed the first bit from the sequence , we need values from whose lowest bit is 1, that is, and . Replacing the pixel values for with any of them, we write bit 1 into the pixel matrix of the image file and proceed to .
For , we convert the values of the corresponding set to binary, excluding duplicate values:
=5810=1110102
=5710=1110012
=5610=1110002
=5510=1101112
=5410=1101102
=5310=1101012
To embed the second bit from sequence , we need those values from that have the least significant bit equal to 0, that is, and . Replacing the values of the pixel with any of these, we write bit 0 inthe pixel matrix of the image file. Hence, a block of five pixels has been passed, and two bits of the message are written to it.
Similarly, embedding into subsequent blocks of the image, taken sequentially from the pixel matrix, can occur. The 029.bmp image with dimensions of 450 in width and 450 in height allows us to use 90 blocks in each row for each of the RGB components. Thus, the total number of blocks for this image will be 40,500 when using one of the component pixels, or 121,500 when using three components. When embedding two bits in each block, we get the formula (Equation (10)) for calculating the maximum number of embedded bits in a bitmap image by one of the components of the dimensions
in width and
in height by the proposed embedding method:
In the example considered, N = 81,000 bits or 10,125 bytes. If all the components of the pixel brightness are used, the value of is multiplied by three. It should be noted that if it is necessary to fill the container with half of its capacity, you can use blocks of pixels through one, since the Bezier curve for five points is built separately for each block of five pixels.
If we return to the example considered, the question naturally arises: by what algorithm is the value selected to replace an image pixel if we have a sample of several suitable values? This algorithm was implemented in such a way that the first in order is selected from a suitable sample of values and used to replace the pixel. So, when replacing , the value of the Bezier curve point closest to from the set of suitable values is used.
Next, we candenote the set of suitable values for embedding each next bit as and ; that is, contains values from and contains values from such that their lowest bit is the same as the current bit of the embedded message. The values of e and f are variable and depend on the pixels of the image used. The smaller the step , the less likely it is that will be equal to or . If such a situation occurs that or , none of the values of the set or is suitable for embedding the next bit of the message. In a process of decoding, it will be impossible to determine the presence of such a case, which will create a decoding error due to the loss of a message bit. In the software implementation, we took the step , which will reduce the probability of this error and allow us to obtaina sufficient sample of and . But no matter how small the step t is, the number of values of the sets and will always be limited, since they are rounded values calculated by Equation (1), and even with the maximum possible number of points of the Bezier curve constructed, we will get several equal values. Therefore, it is worth thinking about the quality of the source containers here. If we consider a bitmap image in which there are blocks with only the same brightness, we will inevitably encounter the loss of the embedded bit of the message. Thus, for unambiguous decoding, it is necessary to provide for cases of obviously unsuitable blocks of pixels that will be skipped. The LIBC5 algorithm was launched for a set of 800 grayscale bitmaps with a size of 450 by 450 points, and the embedded messages representing a pseudo-random sequence were successfully decoded. The Bezier curve was plotted in increments of .
By replacing the values obtained by Lagrange interpolation in the pixel matrix, we do not violate the overall pixel statistics of the image. At the same time, using the Bezier curve, we use a smooth transition of pixel values, minimizing possible distortions.
Thus, you can fill in up to 50% of the pixels of the image with embedded information. But since the percentage of embedding remained at 21% in the previously studied methods [
27,
28,
29], we embedded approximately the same amount of information for comparative analysis.
4. LIBC5 Algorithm Analysis
Next, we conducted a stegoanalysis of the developed LIBC5 method. For the experiment, a set of 800 images with a size of 450 × 450 pixels was used, filled with the proposed method by 21%. The message to be embedded was obtained using a pseudo-random number generator, and then it was encrypted with Vernam cipher.
The analysis of the obtained stegocontainers was carried out by the RS (regular–singular) method [
30]. RS analysis uses a sensitive method of double statistics derived from spatial correlations in images. In the RS method, there are three main factors that affect the accuracy of the estimated message length: the initial deviation, the noise level of the container image and the placement of the message bits in the image. This method shows a fairly accurate result even on noisy images. The experiments carried out by the RS method showed the stego stability of the LIBC5 algorithm.
Table 2 shows the result of RS analysis on a set of empty containers, and
Table 3shows the result of RS on filled containers using the LIBC5 method, where
L shows the percentage of information detected. The obtained results of calculating the
type I error shown that its probability is about 0%. The percentage of detection of the embedded information by the RS method showed the absolute stability of the LIBC5 method developed by us in relation to this type of stagoanalysis.
We will compare the results of the developed method with the INMI (improved neighbor mean interpolation) steganographic method, which we investigated in the article [
28]. In [
31], a modification of the INMI method was previously obtained based on the use of a Lagrange interpolation polynomial of the second degree to obtain a container image. The Lagrange interpolation formula has the following form:
where
is the polynomial of degree
, taking a value equal to one, in the node of
, equal to zero—in other nodes
,
.
In the modified algorithm of the INMI method, the image obtained by adding additional rows and columns of pixels to the original image was considered in fragments of five pixels numbered from 0 to 4. Known pixels (0, 2, 4) are considered interpolation nodes. Therefore, unknown pixel values (1 and 3) were found using the Lagrange interpolation polynomial of degree 2. The image obtained by adding additional rows and columns of pixels to the original image was considered in fragments of five pixels, numbered from 0 to 4.
The pixel values of the container image are obtained using the following formula:
where
is the pixel number in a fragment of five pixels.
Based on the INMI algorithm considered, in [
28] we determined that the maximum container capacity is 21% and depends on the image. We conducted research on a set of 800 images of size 450 by 450, that is, using the same set of stegocontainers as in this study of the LIBC5 method. Here are the resulting tables of the stegoanalysis of the INMI method:
Comparing the results obtained using the LIBC5 method in
Table 2 and
Table 3, and the results of the INMI method in
Table 4 and
Table 5, we can conclude that LIBC5 is more resistant to steganalysis and can be successfully used on image files forthe tasks of embedding DWM (digital watermarks).