1. Introduction
In recent years, there has been tremendous advancement in multimedia technologies that encompasses images, video and audio [
1,
2,
3]. As a result, it has become easy to create, broadcast, distribute and store digital content. A number of tools are easily available through which tampering of digital media can be done very effectively [
4,
5]. This poses a serious problem in case a digital content is to be used as an evidence. To address this issue, digital watermarking and hashing techniques have been proposed. A digital watermark is an imperceptible signal added to a digital content for integrity verification and copyright protection. The disadvantage of using a watermark is the extra payload added to the original content. Hashing is an alternate way to check the integrity of digital content. Cryptographic hash functions such as SHA1 generates a fixed size 160-bit code which can be used for integrity verification and authentication. An important property of cryptographic hash is its sensitivity to the change in input data. For example, hash generated through SHA1 completely changes even if a single bit of input data is changed.
Despite successful use of cryptographic hash functions in e-commerce and other applications, researchers have developed novel hashing algorithms for multimedia content verification during the past two decades. This is because of the difference between text data and multimedia data. For text data, a single bit change in data would change the entire meaning whereas in case of multimedia data, a single bit change in data value generally keeps the semantic of the content intact. Therefore, for integrity verification of multimedia data like digital images, a robust hash function is required. Image hashing is a process which converts the actual input image into a short numeric representation. Image hash is also referred to as a Perceptual Hash Function (PHF). To obtain hash of an image, spatial or transform domain features are extracted from the image and used to generate its hash [
6]. The selection of feature is an important aspect in image hashing. Features should be chosen such that they are robust to content preserving or non-malicious operations such as compression, filtering, brightness contrast adjustment, geometric transformation, etc., and should change when the image undergoes malicious tampering. In addition, the hash should be made secure against hash collision and other attacks by using one or more secret keys. The hash should also be compact in size.
The general framework of image hashing is shown in
Figure 1. The input image is first pre-processed before feature extraction. Pre-processing generally involves image resizing, color to gray-scale conversion and low-pass filtering to remove unwanted noise. Relevant features are then extracted from the pre-processed image. The type of feature selected has a very significant impact on robustness and tamper detection capability of an image hashing scheme. The extracted features are processed to form image hash. This step may additionally involve compression of features so that the size of the hash is considerably reduced. For image integrity verification, the generated hash and secret key is usually transmitted to the receiver through a secure channel, while the image be may transmitted using an insecure channel. The hash of the received image is calculated and compared with the received hash. If both the hashes match, the received image is declared authentic. Different comparison metrics can be used; for example, Hamming distance, bit-error rate, Euclidean distance, sum of absolute difference, etc. To make the hash secure against intentional attacks, a secret key is used either in the hash generation or the feature extraction stage. The purpose of using secret key is to make it extremely difficult for an attacker to produce correct hash for a given input image.
Following are some important properties of a PHF [
7]:
A PHF should be robust to non-malicious distortions like compression, Gaussian noise, motion blurring, etc.
A PHF needs to be sensitive to tampering along with tamper localization.
A PHF should be key dependent. Without knowledge of the secret key, it should be extremely difficult to calculate the correct hash.
To explain the first two points, consider the Baboon image shown in
Figure 2a and its tampered version in
Figure 2b. Tampering is done by changing the area around the right eye ball. This is an example of malicious distortion or tampering. An effective image hashing scheme should detect this tampering as it has changed the content of the image. To show an example of non-malicious distortion, consider the JPEG compressed version of the Baboon image shown in
Figure 2c. In this case, the content of the image is same, however the pixel values would be different. An ideal image hash function should generate similar hash value for images shown in
Figure 2a,c as they have similar content. Practical image hash functions however would generate difference hash values. The difference in hash values should be considerably small. A threshold is used to distinguish malicious and non-malicious distortions. Choosing a suitable threshold is still a research challenge in image hashing and is also discussed in this paper.
Devising an image hashing scheme that is robust to non-malicious operations, sensitive to tampering with localization and also secure is quite a challenging task. For example, increasing robustness has a negative effect on tamper detection and security of an image hashing scheme. Image hashing is very much different from cryptographic hash functions, as in the latter, there is no robustness issue. In this paper, a new image hashing scheme is proposed that generates a compact image hash having the ability to detect minute level tampering, offers high robustness to non-malicious distortions and is also secure. To obtain these properties, Gaussian pyramid decomposition of the input image is performed. Although pyramid decomposition has been widely used in image compression [
8], computer vision [
9], etc., its use in image hashing, as proposed in this paper, has proved to be every effective.
Let us discuss the usefulness of applying Gaussian pyramid decomposition to the input image to obtain a hash which is compact, robust to non-malicious and sensitive to malicious tampering. The input image of size pixels is first resized to pixels and then subjected to repeated subsampling and Gaussian filtering to obtain higher levels of the pyramid. Let the resized image of size pixels be considered as the base level. The fourth level Gaussian pyramid decomposition of this image would yield a matrix. If hash of the input image is constructed using this matrix, it would be compact in size. Due to repeated subsampling and Gaussian filtering, the coefficients at higher levels of the pyramid do not change much when the input image (base level) is subjected to non-malicious distortions. Similarly, if there is malicious tampering in the input image such that the tampered area is greater than a certain percentage of the total image area, there will be change in the corresponding matrix coefficient(s). This enables tamper detection with localization. In addition, the hash is made secure using random noise addition to the input image. This makes base level of the pyramid to become a random matrix which is a function of the input image (whose hash is to be calculated) and the random noise which is a function of the secret key shared between the sender and receiver. With this brief overview of the proposed scheme, the main contributions of this paper are as follows:
A robust and secure image hashing scheme using Gaussian Pyramids is proposed.
A technique of adding random noise is introduced to enforce hash security.
The proposed scheme is evaluated using Receiver Operating Characteristics curve (ROC) which demonstrate effectiveness of the proposed scheme under different non-malicious and malicious distortions.
The rest of paper is organized as follows.
Section 2 presents review of some image hashing schemes reported in the literature. In
Section 3, the effect of secret key on hash security is discussed.
Section 4 describes the proposed scheme. Experimental results are presented in
Section 5. Finally,
Section 6 concludes this paper.
2. Literature Review
A number of image hashing schemes have been proposed during the last 15 years to address the core issues discussed in
Section 1. Some schemes are focused on tamper detection and reduction of hash size, while others are focused more towards robustness. It is interesting to note that the choice of feature significantly effects the robustness and tamper detection capability of an image hashing scheme. For example, if a feature is robust to geometric distortion and compression, it might be sensitive to contrast enhancement. On the other hand, a feature which is robust to a number of content preserving manipulations might not be sensitive to minute level of malicious tampering. This makes the design of a robust image hashing scheme extremely challenging as there is no defined boundary between content preserving operations and tampering. To get an idea about different features and their impact on image hashing, the following review highlights some image hashing schemes proposed in the literature.
Swaminathan et al. [
10] presented a robust and secure image hashing technique using Fourier transform features and controlled randomization. This scheme is robust against a number of non-malicious operations like JPEG compression, RST attacks, shearing, gamma correction, etc. It can also detect cut and paste attacks. Ouyang et al. [
11] used Quaternion Discrete Fourier Transforms (QDFT) and Log polar Transform to generate hash for color images. This scheme shows good sensitivity and is robust against content preserving manipulations, especially large angle rotation. Tang et al. [
12] proposed an image hashing algorithm using ring partition and non-negative matrix factorization (NMF). This algorithm is robust to image scaling, JPEG compression, contrast adjustment, image rotation, gamma correction, and Gaussian filtering. It shows good discriminative capability against non-malicious manipulations. Tabatabaei et al. [
13] introduced a two-stage robust content-based image authentication scheme using Approximate Message Authentication Codes (AMACs). AMACs consists of error correction codes and message authentication codes. Hash generation consists of three steps; pre-processing, intermediate and final hash generation. The results reported in the paper demonstrate high robustness of this scheme against a number of non-malicious operations and is able to detect minute tampering with localization. Ahmed et al. [
7] proposed a DWT-based hashing scheme for image authentication which addresses core issues such as tamper detection, robustness and security. The proposed technique yields good robustness against JPEG compression, low-pass filtering and image sharpening. In addition, the scheme can identify small tampering with localization of the tampered regions. However, the scheme is not robust to brightness, contrast enhancement and rotation.
Monga and Mihcak [
14] construct an image hash using the idea of non-negative matrix factorization (NMF). They argue that standard rank reduction techniques such as QR factorization and Singular Value Decomposition produce low rank bases which do not necessarily preserve the intrinsic structure of an image. The use of NMF allows computationally simple design with good robustness capability. The hash generated using this method has good discriminative capability when used to compare different images. However it is not clear if small tampered regions can be successfully detected. Zhenjun and Linlin [
15] proposed a three-step hashing method to achieve a desirable balance between discrimination and robustness using a technique which they refer to as Local Linear Embedding (LLE). In the first step, the input image is converted into a normalized matrix. The second step consists of constructing a secret key-based secondary image from the normalized matrix obtained from the pre-processed image. Finally, LLE is applied to the secondary image to get low dimensional vectors whose statistics are calculated to produce a short image hash. This scheme provides desirable discrimination and is robust to common signal processing operations.
Xiang et al. [
16] constructed a histogram based image hashing scheme. This scheme is robust against geometric deformation. The scheme consists of two steps; firstly, the input image is filtered using a low-pass Gaussian filter. Secondly, a histogram of the preprocessed image is extracted using mean value of the image and a binary string is computed. The hash is secured by making it key-dependent. Experimental results show that the scheme achieves robustness against scaling, rotation, translation and geometric attacks. Similarly, Abbas et al. [
17] have also proposed a histogram-based image hashing scheme using Noise Resistant Local Binary Pattern (NRLBP) for achieving robustness and discrimination capabilities. This scheme shows good robustness to a number of content preserving operations and can effectively detect small tampered regions with localization. A disadvantage of this scheme is large size of hash. Tang et al. in [
18] use the idea of Color Vector Angle (CVA) to generate image hash. The authors argue in their paper that most of the hashing schemes use luminance information for hash generation of color images. Although computationally less expensive, luminance based features do not always capture details of a color image. To address this issue, Histogram of CVA referred to as HVCA is used as a feature to generate image hash. Results reported in the paper show good discrimination between different sets of images. It appears that this scheme is not designed to detect minute level of tampering with localization.
Zhao and Wei [
19] use rotation invariance of magnitudes and corrected phases of zernike moments to generate image hash. This scheme is robust against a number of non-malicious attacks, excluding rotation. It can also detect tampered area with localization. Wu and Niu [
20] have proposed an image hashing technique using features extracted by using the Scale Invariant Feature Transform (SIFT). Cyclic matrix are generated from the SIFT features thus translating rotation into elementary transformation of the cyclic matrix. Image hash is obtained by performing Eigen value decomposition of the cyclic matrix. Besides rotation, the proposed scheme is also robust to JPEG compression (QF = 60), scaling, brightness adjustment, minor translation, median filtering and Gaussian noise. The paper however does not show any result exhibiting minute tamper detection and localization capability. In [
21], Karsh et al. have proposed an image hashing scheme which is invariant to RST transformations. The hash is constructed using local and global features. Local features are extracted from salient regions of an image using Markov absorption probabilities. Global features are extracted using some statistics measure. The two set of features help to identify small and large tampered regions with good robustness against non-malicious distortions.
Wang et al. [
22] proposed a robust hashing scheme for detection of region duplication forgery. This means copying one portion or block of an image and pasting it to another portion or block of the same image. This method is robust to blurring, lossy compression, noise and also copy region rotation. It also reduces total number of blocks and narrows the block matching search space. In [
23], Bodin et al. have proposed an image hashing scheme for hybrid document security. This algorithm is capable of securing graphical parts of paper and digital documents with unprecedented performance using a very small image digest. The main contribution of this work is the ability of the hash to work under print and scan noise environment. The input image is first normalized and second-order moments are calculated to form the hash of the image. The image hash contains a number of components such as moments, image resolution, index mapping and color mapping table. Khelif and Jiang [
24] proposed a robust and secure hashing algorithm based on virtual watermark detector. The idea proposed by the authors to generate image hash is very interesting. The input image is first pre-processed using a high-pass filter to obtain edge information. The edge-map image is then divided into overlapping blocks and mean of absolute value of coefficients of each block is calculated to form a feature vector. A secret key is used to generate a sequence of virtual watermark. Another secret key is then used to permute the virtual watermark. The permuted virtual watermark and the feature vector coefficients obtained from the input image are given to the watermark detector module. The output of the watermark detector generates hash of the input image. Hash obtained using this scheme is robust to a number of content preserving operations. Results shown in the paper reveals that this scheme cannot detect small level of tampering.
From the above discussion, it is clear that devising an image hashing scheme that is robust to all types of content preserving operations, sensitive to minute level tampering and is also secure is quite a challenging task. Some of the papers reviewed above perform very well for non-malicious distortions but cannot detect minute level tampering. Most of the papers do not secure the hash, which makes it difficult to use hashing for image authentication. If hash is generated without employing a secret key, then an attacker could launch hash collision attacks to defeat the basic purpose of image authentication [
7]. On the basis of literature survey, it is concluded that till date, no benchmark has been defined for image hashing schemes because there is always a trade-off between robustness and discriminative capability.
3. Effect of Secret Key on Hash Security
Hash is also known as a fingerprint because it is unique for every image. An image hash should also be secure against intentional attacks. In order to achieve this property, Tangent Delay Ellipse Reflecting Cavity Map System (TD-ERCS) [
25,
26] and Skew tent map [
27] have been introduced in the proposed scheme. Mathematically, TD-ERCS is written as [
25]:
where
Skew tent map is:
,
and
are random numbers between 0 and 1. Here
and
m are known as seed parameters of TD-ERCS map. In skew tent map,
is the initial value, and
is the control parameter. Using
,
and
, the noisy image is generated using the following pseudo-random number generator (PRNG) equation:
TD-ERCS and Skew tent maps initial parameters are initialized to generate a noisy image having the same dimension as that of the input image. The noisy image is then added to the input image to obtain the output image. Hash of the output image now depends on the input image and the secret key. In other words, the hash feature space is made random. The output image is then processed through Gaussian pyramid decomposition to generate the hash of the input image.
Figure 3 shows the perceptual appearance of the original image and the noisy image. As the PRNG is generated using the secret key, therefore the hash depends upon the secret key. By changing the secret key, the hash value will be different.
Figure 4 shows the effect of noise on the Cameraman image at different values of
. By increasing the value of
, the effect of noise increases. For example, when
, the PSNR was 17.87 dB, when
, the PSNR was 12.76 dB. Similarly for
and
, the observed PSNR was 11.25 dB and 10.83 dB, respectively. Therefore, by increasing the value of
, the generated hash will become more random thus making it difficult for an attacker to guess the hash value and launch hash collision attacks. In addition, for the same image, different hash values can be generated by changing the secret key.
4. The Proposed Scheme
A new image hashing scheme is proposed in this paper which simultaneously attempts to address the core issues of robustness to non-malicious distortions, detection of minute level tampering and securing the hash using a secret key. This is accomplished by obtaining Gaussian pyramid decomposition of the input image. An illustration of a 4-level Gaussian pyramid is shown in
Figure 5. The input image is considered as the base level,
. The upper level of the pyramid is obtained by down-sampling and low-pass filtering the previous level. Let
,
,
and
represent the first, second, third and fourth levels of the Gaussian pyramid, respectively. As an example, suppose the image
has a dimension of
pixels. The entries in
will therefore be
. There will be only 256 entries in
as compared to
entries in
. Therefore, the hash generated by using
will be compact in size. In addition, the entries in
do not change significantly due to non-malicious manipulations like filtering, compression, etc. However, there is a significant change in an
entry whose corresponding spatial area in
has been maliciously tampered. This explains the reason behind the proposed scheme to be robust against non-malicious distortions and sensitive to detect tampering with localization. To further reduce hash size, disk and LOG filters are applied to the top level of the pyramid to obtain two different matrices. The difference of these two matrices is the hash of the input image
. Since both these filters perform blurring operation to
, therefore the entries in the difference matrix are small in magnitude as compared to entries in
. In fact, some of entries in
are very close to zero, thus significantly reducing hash size. The value of
for the LOG filter is
and the value of radius for the disk filter is 2. These parameters have been obtained after thorough experimentation involving a number of non-malicious distortions and different types of tampering cases. The proposed scheme consists of hash generation and hash verification modules which are now explained.
4.1. Hash Generation Module
The block diagram of hash generation module is given in
Figure 6 and the steps are mentioned below.
The input image
of dimension
is converted to a gray scale image and then resized to 256 × 256 pixels. The notation
x,
y is used to represent the spatial location of a pixel. Let
represent the normalized image which is given by Equation (
10).
Let
k be the secret key which is shared between the sender and the receiver through a secure channel. To enforce security in the hash, a noisy image is added to
to obtain
.
where
controls the intensity of noise. The notation
refers to the noisy image which is generated using the secret key
k. A pseudo-random number is initialized using
k to generate the noisy image.
The image is decomposed to level 4 using Gaussian pyramid decomposition to get a 16 × 16 matrix represented by . Level 3 decomposition would generate hash of a larger size, whereas level 5 decomposition would yield hash features that would not be sensitive enough to detect minute level tampering. After experimentation, level 4 Gaussian pyramid decomposition for a 256 × 256 image was found to be the best choice.
The image is then filtered using LOG and disk filters. As for the LOG filter, the value of is 0.5 and the window size is . Let be the output image of the LOG filter. The size of radius for the disk filter is equal to 2. If value of the radius increases, blurring will increase. The output image of disk filter is represented by .
and are subtracted and their difference is the final hash H of the input image I.
4.2. Hash Verification Module
The hash verification module shown in
Figure 7 is used to verify the authenticity of the received image,
.
The received image of dimension is processed through the same steps as discussed in the previous section to obtain hash of the received image, .
A difference matrix
d is calculated as shown by Equation (
13)
Each element of d is compared with a chosen threshold . If any element of d is greater than , then the corresponding spatial area of the image would be considered as tampered.