Optics Communications 278 (2007) 257–263 www.elsevier.com/locate/optcom
Image watermarking based on an iterative phase retrieval algorithm and sine–cosine modulation in the discrete-cosine-transform domain

H. Zhang, L.Z. Cai *, X.F. Meng, X.F. Xu, X.L. Yang, X.X. Shen, G.Y. Dong

Department of Optics, Shandong University, Jinan 250100, PR China

Received 30 January 2007; received in revised form 12 April 2007; accepted 12 April 2007
Abstract

A novel digital image watermarking system based on an iterative phase retrieval algorithm and sine–cosine modulation in the discrete-cosine-transform (DCT) domain is proposed. The original hidden image is first encrypted into two phase masks. Then the cosine and sine functions of one of the phase masks are introduced as a watermark to be embedded into an enlarged host image in the DCT domain. By extracting the watermark from the enlarged superposed image and decrypting it, we can retrieve the hidden image. The feasibility of this method and its robustness against some attacks, such as occlusion, noise and quantization, have been verified by computer simulations. This approach can avoid the cross-talk noise due to direct information superposition and enhance the imperceptibility of hidden data.

© 2007 Elsevier B.V. All rights reserved.

PACS: 42.30.Rx; 42.30.Wb; 42.30.Va

Keywords: Image encryption; Watermarking; Phase retrieval; Image reconstruction; Digital image processing
* Corresponding author. Tel.: +86 531 88362857; fax: +86 531 88364613. E-mail address: [email protected] (L.Z. Cai).
0030-4018/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.optcom.2007.04.013

1. Introduction

Research in the field of optical image hiding, encryption and digital watermarking has been receiving increasing attention since Refregier and Javidi proposed the method of double random-phase encoding [1,2], which has been further extended from the Fourier domain [3] to other domains [4–10]. Other important image hiding or encryption methods include iteratively retrieved phase encoding with a modified projection onto constraint sets (POCS) algorithm [11] or the traditional phase retrieval algorithm [12]; in these two algorithms the original hidden image can be iteratively encoded into one phase-only mask (POM) as a key, located in either the Fourier or the input plane, and the other phase mask is fixed as a lock [13]. To enlarge the key space in the iteration process, Chang et al. [14] have proposed a multiple-phase retrieval algorithm in the spatial-frequency domain where all the phase masks are placed in the input and Fourier planes of a cascaded lens system. To achieve faster convergence, Situ and Zhang [15,16] reported a two-phase-mask system with a 4f setup where the phase distributions of both masks can be adjusted simultaneously in each iteration. The iteratively retrieved phase encoding techniques can also be applied in the Fresnel domain, as in the security system based on multiple-phase retrieval that we proposed [17].

Digital watermarking techniques have also been widely reported in the information security field [18–23]. The basic idea of digital watermarking is to embed some hidden information into the host image in a way that prevents the information from being read by unauthorized persons but lets the message be read as needed. Usually, the most important requirements for digital watermarking are imperceptibility and robustness. The former means that the hidden information cannot be detected directly by the eye or by detection devices, and the latter means that the watermarking scheme
should survive some attacks such as occlusion, noise and quantization. The hidden information can be embedded in the spatial domain [18] or in a transform domain [19]. Recently, Zhou and Chen have proposed an information hiding scheme in which the watermark can be extracted by subtraction between each two pixels in neighboring lines or columns [24], but cross-talk noise exists in the retrieved hidden image because of the direct information superposition. To eliminate the cross-talk noise and enhance the imperceptibility of the hidden data, in this paper we suggest an improved watermarking method. We first explain its theoretical principles and then verify it by computer simulations.

2. Theoretical analysis

2.1. Iterative phase retrieval algorithm

The principle of the encryption process based on an iterative phase retrieval algorithm is schematically illustrated in Fig. 1 with a three-plane system: the input plane P1, the intermediate plane P2 and the output plane P. In the three planes are placed a gray-scale host image f(x,y) in close contact with a phase mask PM1, a phase mask PM2, and a gray-scale image to be hidden, g(x,y), respectively. A plane wave of unit amplitude is assumed to illuminate the input plane. The image encryption process is actually a problem of finding two correct phase functions $\psi_1$ and $\psi_2$ of the two phase masks under the constraint of the given input and output functions f(x,y) and g(x,y), which can be solved by the modified POCS algorithm [11] or the phase retrieval algorithm [12]. The phase retrieval algorithm consists of a number of iteration cycles, each involving both forward (from left to right) and backward propagation. Suppose that in the kth iteration (k = 1, 2, 3, ...) the two phase distributions are $\psi_1^k(x,y)$ and $\psi_2^k(x,y)$ and the input amplitude function is $f_k$; then the output image will be [17]

$$\hat{g}_k(x,y) = g_k(x,y)\exp[i\phi_k(x,y)] = \mathrm{FrT}\big\{\mathrm{FrT}\{f_k(x_1,y_1)\exp[i\psi_1^k(x_1,y_1)];\, z_1\}\exp[i\psi_2^k(x_2,y_2)];\, z_2\big\} \tag{1}$$

under the Fresnel approximation, where FrT{f(x,y); z} denotes the Fresnel transform of f(x,y) over a distance z, and $z_1$ and
$z_2$ are the two distances between planes P1 and P2 and between P2 and P, respectively.

Fig. 1. Schematic diagram of the encryption process based on iterative phase retrieval algorithm. (Labels in the diagram: input plane P1(x1,y1) with f(x1,y1) and ψ1; plane P2(x2,y2) with ψ2; output plane P(x,y) with g(x,y); illumination wavelength λ; distances z1 and z2.)

There are mainly two digital methods to simulate the Fresnel transform: the convolution approach (or angular spectrum algorithm) [25,26] and the discrete Fresnel transform approach [10,17,23]. In the former algorithm, all the planes in the simulations should have the same resolution and pixel sizes; otherwise, the retrieved image quality may be affected. In the latter method, the resolution and pixel sizes may be different. We will adopt the latter approach in the next section, in which the wavelength, the propagation distance, the image resolution, the sampling number and other parameters can be specified by the Shannon sampling theorem [26]. When $\hat{g}_k$ is obtained, the two phase functions can be modified as

$$\psi_2^{k+1}(x_2,y_2) = \mathrm{angle}\left\{\frac{\mathrm{IFrT}\{g(x,y)\exp[i\phi_k(x,y)];\, z_2\}}{\mathrm{FrT}\{f(x_1,y_1)\exp[i\psi_1^k(x_1,y_1)];\, z_1\}}\right\} \tag{2}$$

$$\psi_1^{k+1}(x_1,y_1) = \mathrm{angle}\left\{\mathrm{IFrT}\left\{\frac{\mathrm{IFrT}\{g(x,y)\exp[i\phi_k(x,y)];\, z_2\}}{\exp[i\psi_2^{k+1}(x_2,y_2)]};\, z_1\right\}\right\} \tag{3}$$

where angle{} denotes the phase extraction operation, and IFrT stands for the inverse Fresnel transform. In each iteration both phase distributions are adjusted simultaneously. In general, the convergence criteria can be the mean square error (MSE) and the correlation coefficient (CC), which are defined by

$$\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left|h_k - h\right|^2, \tag{4}$$

$$\mathrm{CC} = \frac{\mathrm{COV}(h, h_k)}{\sigma_h\,\sigma_{h_k}}, \tag{5}$$
where h denotes the original host image or the hidden image, and $h_k$ stands for its kth iteration result; M and N are the numbers of pixels in each row and column of the image, and i and j represent the pixel position in each row and column, respectively; σ is the standard deviation of the image; and COV(h, $h_k$) is the covariance of the two images, defined as

$$\mathrm{COV}(h, h_k) = E\{[h - E(h)][h_k - E(h_k)]\}, \tag{6}$$
where E{} is the expected value operator. If both the iterated hidden image $g_k(x,y)$ and the host image $f_k(x_1,y_1)$ satisfy the convergence criteria, that is, the MSE is less than a preset threshold value MSEth and the CC is larger than a threshold value CCth, the iteration process stops. Otherwise, the iteration continues, with $f_k$ in Eq. (1) always chosen to be the original image f to guarantee accuracy. The simultaneous adjustment of both phase distributions in each iteration ensures fast convergence of the iteration process. Arbitrary initializations generate recovered images with almost the same quality, although they may result in different distributions of the two phase masks.
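The iteration of Eqs. (1)–(3) and the criteria of Eqs. (4)–(6) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used for the simulations in this paper: the Fresnel transform is replaced by an angular-spectrum propagator (the convolution approach mentioned above, chosen here because a negative distance gives its exact inverse), the stopping test is applied to the output plane only, and g is rescaled to the energy of f because the propagator conserves energy. The function names, the default thresholds and the seed are hypothetical.

```python
import numpy as np


def fresnel_propagate(u, wavelength, z, pitch):
    """Propagate the complex field u over a distance z (negative z inverts).

    Assumption: angular-spectrum propagation stands in for FrT/IFrT.
    """
    ny, nx = u.shape
    fxx, fyy = np.meshgrid(np.fft.fftfreq(nx, d=pitch), np.fft.fftfreq(ny, d=pitch))
    # Longitudinal wavenumber; evanescent components are clamped to zero phase.
    kz = 2.0 * np.pi * np.sqrt(np.maximum(1.0 / wavelength**2 - fxx**2 - fyy**2, 0.0))
    return np.fft.ifft2(np.fft.fft2(u) * np.exp(1j * kz * z))


def mse(h, hk):
    """Mean square error, Eq. (4)."""
    return float(np.mean(np.abs(hk - h) ** 2))


def cc(h, hk):
    """Correlation coefficient, Eqs. (5) and (6)."""
    dh, dhk = h - h.mean(), hk - hk.mean()
    return float(np.mean(dh * dhk) / (h.std() * hk.std()))


def decrypt(f, psi1, psi2, wavelength, z1, z2, pitch):
    """Output amplitude of Eq. (1) for the given phase masks."""
    u2 = fresnel_propagate(f * np.exp(1j * psi1), wavelength, z1, pitch)
    return np.abs(fresnel_propagate(u2 * np.exp(1j * psi2), wavelength, z2, pitch))


def retrieve_phases(f, g, wavelength, z1, z2, pitch,
                    n_iter=50, mse_th=1e-4, cc_th=0.999, seed=0):
    """Iterate Eqs. (1)-(3) from random initial phases (hypothetical defaults)."""
    rng = np.random.default_rng(seed)
    psi1 = rng.uniform(-np.pi, np.pi, f.shape)
    psi2 = rng.uniform(-np.pi, np.pi, f.shape)
    # Assumption: rescale g so that both planes carry the same energy.
    g = g * np.sqrt(np.sum(f**2) / np.sum(g**2))
    for _ in range(n_iter):
        # Forward pass, Eq. (1); f is always the original host image.
        u2 = fresnel_propagate(f * np.exp(1j * psi1), wavelength, z1, pitch)
        g_hat = fresnel_propagate(u2 * np.exp(1j * psi2), wavelength, z2, pitch)
        if mse(g, np.abs(g_hat)) < mse_th and cc(g, np.abs(g_hat)) > cc_th:
            break
        # Backward pass: keep the output phase, impose the target amplitude.
        b = fresnel_propagate(g * np.exp(1j * np.angle(g_hat)), wavelength, -z2, pitch)
        psi2 = np.angle(b * np.conj(u2))  # Eq. (2): phase of the ratio b / u2
        psi1 = np.angle(fresnel_propagate(b * np.exp(-1j * psi2), wavelength, -z1, pitch))  # Eq. (3)
    return psi1, psi2
```

A call such as `retrieve_phases(f, g, 532e-9, 48.1e-3, 48.1e-3, 10e-6)` followed by `cc(g, decrypt(f, psi1, psi2, 532e-9, 48.1e-3, 48.1e-3, 10e-6))` reports how closely the output amplitude matches the hidden image. The phase of the ratio in Eq. (2) is computed as the phase of a product with the complex conjugate, which gives the same angle and avoids division by values close to zero.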
2.2. Watermark embedding

The schematic diagram of the watermark embedding process is shown in Fig. 2a. In this method, the host image f(x,y) is first enlarged from size M × N to size 2M × 2N by copying each pixel to four pixels in neighboring lines and columns according to the following rule [24]:

$$\begin{aligned}
e(2m-1,\,2n-1) &= f(m,n), \\
e(2m-1,\,2n) &= f(m,n), \\
e(2m,\,2n-1) &= f(m,n), \\
e(2m,\,2n) &= f(m,n),
\end{aligned}
\qquad m = 1, 2, 3, \ldots, M;\quad n = 1, 2, 3, \ldots, N. \tag{7}$$

Here f and e denote the host image before and after being enlarged, respectively. It was reported that the real and imaginary parts of the encrypted data may be embedded in one enlarged host image based on double random-phase encoding in a 4f system, and then the original image can be decrypted by the neighbor pixel value subtraction (NPVS) algorithm, but this method suffers from cross-talk noise caused by the subtraction of the real and imaginary parts of the encoded data [24]. To eliminate the cross-talk noise, here we adopt an improved NPVS algorithm in the discrete-cosine-transform (DCT) domain, in which the phase information is modulated by sine and cosine functions and then embedded into the DCT coefficient matrix of an enlarged host image e, and by using the improved NPVS algorithm in the watermark extraction and decryption stage the hidden image can be retrieved perfectly, without any noise. We describe the algorithm of watermark embedding and extraction below.

The two-dimensional DCT of an image h(u,v) of size M × N and the corresponding inverse discrete-cosine-transform (IDCT) are defined as [19]

$$H(\xi,\eta) = \frac{2}{\sqrt{MN}}\,c(\xi)c(\eta)\sum_{u=0}^{M-1}\sum_{v=0}^{N-1} h(u,v)\cos\frac{(2u+1)\xi\pi}{2M}\cos\frac{(2v+1)\eta\pi}{2N}, \tag{8}$$

$$h(u,v) = \frac{2}{\sqrt{MN}}\sum_{\xi=0}^{M-1}\sum_{\eta=0}^{N-1} c(\xi)c(\eta)H(\xi,\eta)\cos\frac{(2u+1)\xi\pi}{2M}\cos\frac{(2v+1)\eta\pi}{2N}, \tag{9}$$

where

$$c(\xi) = \begin{cases} \dfrac{1}{\sqrt{2}}, & \text{for } \xi = 0, \\ 1, & \text{for } \xi = 1, 2, \ldots, M-1, \end{cases}
\qquad
c(\eta) = \begin{cases} \dfrac{1}{\sqrt{2}}, & \text{for } \eta = 0, \\ 1, & \text{for } \eta = 1, 2, \ldots, N-1. \end{cases} \tag{10}$$

Fig. 2. Schematic diagram of (a) the watermark embedding process and (b) the watermark extraction process.

For simplicity, we apply the DCT operation to each column of the enlarged host image e, and then obtain the corresponding DCT coefficient matrix d. From Eq. (7) we know that the same relation holds true for d as for e, that is

$$\begin{aligned}
d(2m-1,\,2n-1) &= d(2m-1,\,2n), \\
d(2m,\,2n-1) &= d(2m,\,2n),
\end{aligned}
\qquad m = 1, 2, 3, \ldots, M;\quad n = 1, 2, 3, \ldots, N. \tag{11}$$
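The column-pair equality in Eq. (11) follows because columns 2n−1 and 2n of e are identical by Eq. (7), and a DCT applied column by column maps identical columns to identical columns. The short NumPy sketch below checks this numerically. It is only an illustration: the helper names are hypothetical, the 1-D DCT matrix is built directly from the one-dimensional factor of Eqs. (8) and (10), and the 1-based indices of the text become 0-based slices in the code.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal 1-D DCT matrix: the one-dimensional factor of Eq. (8)."""
    k = np.arange(n)[:, None]    # frequency index (xi)
    u = np.arange(n)[None, :]    # sample index (u)
    c = np.sqrt(2.0 / n) * np.cos((2 * u + 1) * k * np.pi / (2 * n))
    c[0, :] /= np.sqrt(2.0)      # c(0) = 1/sqrt(2), Eq. (10)
    return c


def enlarge(f):
    """Copy every pixel of f into a 2 x 2 block, Eq. (7)."""
    return np.repeat(np.repeat(f, 2, axis=0), 2, axis=1)


def column_dct(e):
    """Apply the 1-D DCT to each column of e."""
    return dct_matrix(e.shape[0]) @ e
```

With `d = column_dct(enlarge(f))`, the arrays `d[:, 0::2]` and `d[:, 1::2]` agree to rounding error, which is the statement of Eq. (11). Because the matrix returned by `dct_matrix` is orthogonal, its transpose performs the column-wise IDCT.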
Since

$$\cos\psi_2 = \frac{\exp(i\psi_2) + \exp(-i\psi_2)}{2}, \qquad \sin\psi_2 = \frac{\exp(i\psi_2) - \exp(-i\psi_2)}{2i}, \tag{12}$$

we can embed the cosine and sine functions of the phase mask $\psi_2$ into the neighboring lines or columns of the DCT coefficients of d as

$$\begin{aligned}
d'(2m-1,\,2n-1) &= d(2m-1,\,2n-1) + w\cos\psi_2(m,n), \\
d'(2m-1,\,2n) &= d(2m-1,\,2n), \\
d'(2m,\,2n-1) &= d(2m,\,2n-1) + w\sin\psi_2(m,n), \\
d'(2m,\,2n) &= d(2m,\,2n),
\end{aligned}
\qquad m = 1, 2, 3, \ldots, M;\quad n = 1, 2, 3, \ldots, N. \tag{13}$$

Here, w is a weighting factor that controls the imperceptibility of the hidden information, and d′ is the composite image after embedding. By applying the IDCT to each column of the composite image d′, the enlarged superposed image e′ can be obtained, which will be sent to the authorized receiver together with the first phase mask information $\psi_1$ and the geometrical parameters as encrypted data.

2.3. Watermark extraction and decryption

The schematic diagram of the watermark extraction process is shown in Fig. 2b. The authorized receiver first performs the DCT on each column of the enlarged superposed image e′ to get d′. With reference to Eqs. (11) and (13), we can find

$$R(m,n) = d'(2m-1,\,2n-1) - d'(2m-1,\,2n) = d(2m-1,\,2n-1) + w\cos\psi_2(m,n) - d(2m-1,\,2n) = w\cos\psi_2(m,n), \tag{14}$$

$$I(m,n) = d'(2m,\,2n-1) - d'(2m,\,2n) = d(2m,\,2n-1) + w\sin\psi_2(m,n) - d(2m,\,2n) = w\sin\psi_2(m,n), \tag{15}$$

where m = 1, 2, ..., M; n = 1, 2, ..., N. Consequently, the weighted phase information $\psi_2$ can be retrieved as

$$\exp(i\psi_2') = R + iI = w(\cos\psi_2 + i\sin\psi_2) = w\exp(i\psi_2), \tag{16}$$

where $\psi_2'$ is the retrieved phase information of $\psi_2$. By normalization, the weighting factor w in the equation above does not affect the reconstruction of $\psi_2$. This feature can be considered one unique advantage of this method, making it very convenient in use. On the other hand, the original host image f before being enlarged can also be extracted from the enlarged superposed image e′ as

$$f'(m,n) = e'(2m-1,\,2n). \tag{17}$$

Therefore, the original hidden image can finally be obtained as

$$g'(x,y) = \mathrm{FrT}\big\{\mathrm{FrT}\{f'(x_1,y_1)\exp[i\psi_1(x_1,y_1)];\, z_1\}\exp[i\psi_2'(x_2,y_2)];\, z_2\big\}. \tag{18}$$

This watermarking scheme is performed with a subtraction algorithm. From Eq. (13), we can see that the sine and cosine functions of one phase key can be embedded in two pixel positions of the DCT coefficients while the pixel values of the other two positions are fixed. This ensures that the embedded information can be successfully extracted by the subtraction algorithm between each two pixels in neighboring columns, as shown in Eqs. (14) and (15). If both phase keys were embedded, we could not apply the subtraction algorithm for watermark extraction. Therefore, we choose only one phase mask here to be modulated and embedded, and the other serves as the private key to be sent.

3. Computer simulation

3.1. Performance of this method

A series of computer simulations has been carried out to verify the feasibility of our proposed method and investigate its performance. In all these simulations, the original host image f(x1,y1) and the hidden image g(x,y) are shown in Fig. 3a and b, respectively, with 256 × 256 pixels of 10 μm pixel pitch and 256 gray levels; the size of all phase masks is the same as that of the original images, and the parameters we used are λ = 532 nm and z1 = z2 = 48.1 mm. First, we verify the validity of this method with all the right keys. Two random-phase distributions $\psi_1$ and $\psi_2$ were taken for PM1 and PM2 in Fig. 1 to start the iteration process. Fig. 3c and d show the retrieved host image and hidden image, respectively, after 50 iteration cycles. We can see that the retrieved images are almost the same as the original ones. The enlarged host image is shown in Fig. 4a; and the enlarged superposed images carrying
Fig. 3. (a) Original host image; (b) original hidden image; (c) retrieved host image after 50 iteration cycles and (d) retrieved hidden image after 50 iteration cycles.
Fig. 4. (a) Enlarged host image; (b), (c) and (d) enlarged superposed image with weighting factor of 0.01, 0.1 and 0.5, respectively; (e), (f) and (g) retrieved hidden image with weighting factor of 0.01, 0.1 and 0.5, respectively, using our proposed method.
watermark are shown in Fig. 4b–d, where the weighting factor w is 0.01, 0.1 and 0.5, respectively. Fig. 4e–g are the retrieved hidden images with the weighting factor w being 0.01, 0.1 and 0.5, respectively. It is clear that the enlarged superposed image suffers more severe degradation as w grows larger; however, the corresponding original hidden images are perfectly retrieved, without any cross-talk, even for a very large w. As a comparison, we also give the corresponding results obtained with the approach of Zhou and Chen [24] for the same set of weighting factors, 0.01, 0.1 and 0.5, in Fig. 5a–c. Obviously these images suffer severe cross-talk noise.

Second, the necessity of using the correct phase masks at their correct positions is checked. The simulations show that the retrieved hidden image becomes totally noisy when one of the two phase masks is removed, translated by just one pixel, or replaced by a wrong one. The sensitivity of the image retrieval to the geometrical parameters λ, z1 and z2 is also tested. The results indicate that a relative change of 0.5% to 1% in these parameters leads to noise patterns in the retrieved image.
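The independence of the retrieved phase from the weighting factor w can be reproduced with a small NumPy sketch of Eqs. (7), (13)–(15) and (17). It is an illustration under assumptions rather than the code behind Fig. 4: the helper names are hypothetical, the column-wise DCT is written as a matrix product with an orthonormal DCT matrix, the superposed image is kept in floating point (no gray-scale quantization), and the 1-based indices of the equations become 0-based slices.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal 1-D DCT matrix: the one-dimensional factor of Eq. (8)."""
    k = np.arange(n)[:, None]
    u = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * u + 1) * k * np.pi / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c


def embed(f, psi2, w):
    """Return the enlarged superposed image e' for host f and phase psi2."""
    e = np.repeat(np.repeat(f, 2, axis=0), 2, axis=1)  # Eq. (7)
    c = dct_matrix(e.shape[0])
    d = c @ e                                          # DCT of each column
    d[0::2, 0::2] += w * np.cos(psi2)                  # Eq. (13), rows 2m-1
    d[1::2, 0::2] += w * np.sin(psi2)                  # Eq. (13), rows 2m
    return c.T @ d                                     # IDCT of each column


def extract(e_prime):
    """Recover the phase psi2' and the host image f' from e'."""
    d = dct_matrix(e_prime.shape[0]) @ e_prime
    r = d[0::2, 0::2] - d[0::2, 1::2]                  # Eq. (14): w cos(psi2)
    i = d[1::2, 0::2] - d[1::2, 1::2]                  # Eq. (15): w sin(psi2)
    psi2_rec = np.angle(r + 1j * i)                    # the angle discards w
    f_rec = e_prime[0::2, 1::2]                        # Eq. (17)
    return psi2_rec, f_rec
```

Calling `extract(embed(f, psi2, w))` for w = 0.01, 0.1 and 0.5 returns the same phase up to rounding error, because taking the angle of R + iI removes the positive factor w. The host image is recovered from the even-numbered columns, which Eq. (13) leaves unmodified.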
Fig. 5. (a), (b) and (c) Retrieved hidden images with weighting factors of 0.01, 0.1 and 0.5, respectively, using the method of Zhou and Chen [24].
3.2. Robustness of this method against occlusion and noise attacks

The robustness of this method against some attacks is also investigated. In Fig. 6a and b we give an enlarged superposed image with weighting factor w = 0.1 cut by 25% occlusion and the corresponding retrieved hidden image, respectively. Fig. 6c and d are similar results, but now the superposed image is attacked by salt and pepper noise with 0.01 density. From these pictures we can see that in all cases the retrieved hidden images can be recognized without doubt.

Fig. 6. (a) Enlarged superposed image with weighting factor 0.1 cut by 25% occlusion; (b) retrieved hidden image from (a); (c) enlarged superposed image with weighting factor 0.1 attacked by salt and pepper noise with 0.01 density and (d) retrieved hidden image from (c).

3.3. Effect of image gray-scale quantization

Finally, the effect of gray-scale quantization of the enlarged superposed image is analyzed. Fig. 7a–c show the retrieved hidden image when the gray scale of the superposed image with weighting factor 0.1 is quantized to 4, 8 and 16 bits, respectively. For quantitative comparison, in Fig. 8 we give some curves of CCs versus weighting factors for different quantization levels of the superposed image. From the curves we can see that a higher quantization level yields a greater CC for a given weighting factor w, while a greater weighting factor yields a greater CC for a given quantization level. Consequently, we can choose an appropriate weighting factor for a certain quantization level to ensure an acceptable quality of the retrieved hidden image. For example, if the quantization level is only four bits, we must choose w ≥ 0.2; even in this condition the correlation coefficient CC is still less than 0.8. In contrast, when the quantization level is 8 bits, w ≥ 0.02 will guarantee CC ≥ 0.95; and in the case of 16 quantization bits, any w greater than or equal to 0.01 will make CC reach its maximum value 1. Therefore, the use of a reasonably high quantization level, such as 8 or 16 bits, which is easy to realize in digital processing, is an effective means to increase the imperceptibility of the hidden information with a small weighting factor and to improve the quality of the retrieved image. As quantitative estimations, in Table 1 we give the calculated MSEs and CCs of the original hidden image and the retrieved hidden image in Figs. 4, 6 and 7. From these figures we can see that the hidden image can be retrieved without doubt using our proposed method, and that this method has a certain robustness against attacks.

Fig. 7. Retrieved hidden images when the enlarged superposed image with weighting factor 0.1 is quantized to (a) 4, (b) 8 and (c) 16 bits, respectively.

Fig. 8. Curves of CCs versus weighting factors for different quantization levels of the superposed image. (Horizontal axis: weighting factor w, 0.0 to 1.0; vertical axis: CC, 0.0 to 1.1; curves for 4 bits, 8 bits and 16 bits.)

Table 1
Calculated MSEs and CCs of the original hidden image and the retrieved hidden image in Figs. 4, 6 and 7

Figure   CC       MSE
4e       1.0000   4.6847 × 10^-12
4f       1.0000   4.6847 × 10^-12
4g       1.0000   4.6847 × 10^-12
6b       0.4979   0.1086
6d       0.3670   0.1162
7a       0.5787   0.0843
7b       0.9969   3.6097 × 10^-4
7c       1.0000   2.8722 × 10^-9

4. Conclusions

In summary, we have proposed a method of digital image watermarking based on an iterative phase retrieval
algorithm and sine–cosine modulation in the DCT domain. In this approach, the original hidden image is first encrypted into two phase masks. Then the cosine and sine functions of one of the phase masks are introduced as a watermark to be embedded into an enlarged host image in the DCT domain with an appropriate weighting factor. By extracting the watermark from the enlarged superposed image and decrypting it, we can retrieve the hidden image. The use of sine and cosine modulation can enhance the security of the system, and the embedding operation in the DCT domain can reduce the degradation of the superposed image and thus improve the imperceptibility of the encrypted data. Furthermore, the improved NPVS algorithm has the advantage of eliminating the cross-talk noise resulting from direct information superposition. The robustness of this method against some attacks, such as occlusion, noise and quantization, has been verified. Both the correlation coefficient (CC) and the mean square error (MSE) are adopted as convergence criteria to evaluate quantitatively the performance of the decrypted images. This method can also be used for color image watermarking. For this purpose both the hidden image and the host image may be directly split into red (R), green (G)
and blue (B) components, and the three channels can be employed to embed the watermark separately. On the other hand, a single-channel color encryption technique based on color image format conversion [27] can also be adopted for color image watermarking with higher data transmission efficiency than the multi-channel method. We will report our detailed work in this field in a separate paper. We believe this method may find wide applications in practice, especially on the Internet.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grants 60477005 and 60677026) and the Natural Science Foundation of Shandong Province (Grants Y2004G01 and Y2006A09), China. The authors also thank the reviewers for some useful suggestions.

References

[1] P. Réfrégier, B. Javidi, Opt. Lett. 20 (1995) 767.
[2] B. Javidi, Optical and Digital Techniques for Information Security, Springer, New York, 2005.
[3] S. Kishk, B. Javidi, Appl. Opt. 41 (2003) 5462.
[4] O. Matoba, B. Javidi, Opt. Lett. 24 (1999) 762.
[5] T. Nomura, S. Mikan, Y. Morimoto, B. Javidi, Appl. Opt. 42 (2003) 1508.
[6] E. Tajahuerce, B. Javidi, Appl. Opt. 39 (2000) 6595.
[7] G. Unnikrishnan, J. Joseph, K. Singh, Opt. Lett. 25 (2000) 887.
[8] B. Hennelly, J.T. Sheridan, Opt. Lett. 28 (2003) 269.
[9] H. Kim, D.H. Kim, Y.H. Lee, Opt. Express 12 (2004) 4912.
[10] L.Z. Cai, M.Z. He, Q. Liu, X.L. Yang, Appl. Opt. 43 (2004) 3078.
[11] J. Rosen, Opt. Lett. 18 (1993) 1183.
[12] J.R. Fienup, Appl. Opt. 21 (1982) 2758.
[13] Y. Li, K. Kreske, J. Rosen, Appl. Opt. 39 (2000) 5295.
[14] H.T. Chang, W.C. Lu, C.J. Kuo, Appl. Opt. 41 (2002) 4825.
[15] G. Situ, J. Zhang, Opt. Commun. 245 (2005) 55.
[16] G. Situ, J. Zhang, Optik 114 (2003) 473.
[17] X.F. Meng, L.Z. Cai, X.L. Yang, X.X. Shen, G.Y. Dong, Appl. Opt. 45 (2006) 3289.
[18] N. Takai, Y. Mifune, Appl. Opt. 41 (2002) 865.
[19] H.T. Chang, C.L. Tsan, Appl. Opt. 44 (2005) 6211.
[20] S. Kishk, B. Javidi, Opt. Lett. 28 (2003) 167.
[21] H. Kim, Y.H. Lee, Opt. Express 13 (2005) 2881.
[22] X. Peng, L. Yu, L. Cai, Opt. Commun. 226 (2003) 155.
[23] L.Z. Cai, M.Z. He, Q. Liu, X.L. Yang, Appl. Opt. 43 (2004) 3078.
[24] X. Zhou, J.G. Chen, J. Mod. Optics 53 (2006) 1777.
[25] G. Situ, J. Zhang, Opt. Lett. 30 (2005) 1306.
[26] U. Schnars, W.P.O. Jüptner, Meas. Sci. Technol. 13 (2002) R85.
[27] S. Zhang, M.A. Karim, Microwave Opt. Technol. Lett. 21 (1999) 318.