Reversible steganography based on side match and hit pattern for VQ-compressed images

Information Sciences 181 (2011) 2218–2230 Contents lists available at ScienceDirect Information Sciences journal homepage: www.elsevier.com/locate/i...

Download PDF

1MB Sizes 0 Downloads 35 Views

Report

PDF Reader
Full Text

Information Sciences 181 (2011) 2218–2230

Contents lists available at ScienceDirect

Information Sciences journal homepage: www.elsevier.com/locate/ins

Reversible steganography based on side match and hit pattern for VQ-compressed images Cheng-Hsing Yang a, Wei-Jen Wang b, Cheng-Ta Huang b, Shiuh-Jeng Wang c,⇑ a

Department of Computer Science, National Pingtung University of Education, Pingtung 900, Taiwan Department of Computer Science and Information Engineering, National Central University, Taoyuan 320, Taiwan c Department of Information Management, Central Police University, Taoyuan 333, Taiwan b

a r t i c l e Article history: Received 1 October Received in revised Accepted 7 January Available online 14

i n f o

2009 form 5 January 2011 2011 January 2011

Keywords: Data hiding Reversible data hiding Steganography Vector quantization Side match vector quantization

a b s t r a c t Data hiding is an important technique in multimedia security. Multimedia data hiding techniques enable message senders to disguise secret data by embedding them into cover media. Thus, delivering secret messages is as easy as sending the cover media. Recently, many researchers have studied reversible data hiding for images. Those methods can reconstruct the original cover image and extract the embedded secret data from a stegoimage. This study proposes a novel reversible steganographic method of embedding secret data into a vector quantization (VQ) compressed image by applying the concept of side match. The proposed method uses extra information, namely the hit pattern, to achieve reversibility. Moreover, its small hit pattern enables the embedding of the entire hit pattern along with the secret data in most cases. To optimize visual quality of the output stego-image, the method applies the concept of partitioned codebooks (state codebooks). The partition operation on the codebook uses a look-up table to minimize embedding and extraction time. We also propose the use of diagonal seed blocks to embed the entire hit pattern into the cover image without producing any extra control messages. Compared to the Chang and Lin method, the experimental results show that the proposed method has higher capacity, better visual quality, and shorter execution time. 2011 Elsevier Inc. All rights reserved.

1. Introduction Illegal tampering and copying are notorious problems in digital multimedia. Therefore, integrity authentication and copyright protection for digital multimedia have become critical research issues. Data hiding is among the most promising solutions for these problems. Many data hiding techniques for images have been proposed for different purposes, such as secret communication, copyright protection, and tampering detection [10,13,23]. These techniques use dedicated data hiding approaches to embed secret messages into the original images and to create stego-image outputs. Digital images are usually stored in compressed formats such as JPEG, vector quantization (VQ) [6], and side-match vector quantization (SMVQ). Various data hiding schemes have been proposed for VQ or SMVQ images [5,11]. Fig. 1 shows a typical example of VQ encoding in which codebook size and vector size are 256 and 16, respectively. In the ﬁrst step, the cover image is divided into non-overlapping blocks, each of which is 4 4 pixels, shown on the left side of Fig. 1. For each block, the

⇑ Corresponding author. Fax: +886 3 3272038. E-mail address: [email protected] (S.-J. Wang). 0020-0255/$ - see front matter 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.ins.2011.01.015

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

2219

Fig. 1. The process of VQ encoding.

closest codeword from the codebook is found, and the index for the codeword is recorded in the index table. For example, codeword cw1 is selected by the ﬁrst block, and index 1 is recorded in the index table (right side, Fig. 1). Some data hiding techniques are reversible because they can recover the original image and extract the embedded message [1,7,8,12,15–19]. Reversible data hiding methods for images can be roughly classiﬁed into non-VQ-based and VQ-based methods. A non-VQ-based method may use a difference expansion method [8,16,17] or a histogram-based method [7,12,15,18] to embed data; a VQ-based method uses a VQ image or a SMVQ image to embed data [2–4,9,14,20–22]. Reversible data hiding approaches based on VQ or SMVQ can be classiﬁed into three groups: images as outputs [4], legitimate VQ or SMVQ codes as outputs [2], and VQ-derived or SMVQ-derived codes with additional control messages as outputs [3,9,14,20–22]. A typical approach in the ﬁrst group uses multiple codewords to represent a block and then produces a stego-image as the output [4]. In [4], a SMVQ reversible method was proposed for creating a state codebook for each block X and then searching for the closest codeword cwa to block X from the state codebook. It then searches for the closest codeword cwb to codeword cwa from the state codebook. If the bit to be embedded is 0, block X is replaced by codeword cwa; otherwise, block X is replaced by (2 cwa + 1 cwb)/3. The second major VQ-based approach uses only one codeword to represent a block. It then produces legitimate VQ or SMVQ codes as the output [2]. The method proposed in [2] uses a side-match approach to utilize the spatial locality of an image. The scheme constructs three state codebooks, G0, G1, and G2, from the codebook, and uses multiple pre-designed hit maps to resolve conﬂicts. The third major VQ-based approach accepts VQ or SMVQ codes as inputs and then produces sequences of VQ-derived or SMVQ-derived codes along with additional control messages as outputs [3,9,14,20–22]. The declustering strategy proposed in [3] uses the spatial locality of an image to embed data. In [21], a sorted VQ codebook is divided into 2B clusters where B is the number of secret bits per embeddable block, and only half of the clusters are used to embed secret data. Various strategies of priorities, indicators, and index exchanging have also been developed to improve embedding capacity and compression ratio. In [22], the fractal Hilbert curve is applied to replace the traditional trace of processing the VQ index table. In [14], the authors proposed a MFCVQ-based technique for embedding secret data by calculating the difference value between a block and its neighboring blocks for use in reversible data hiding. The method proposed in [20] uses a two-way path and an initial key to achieve reversible data hiding. In [9], the authors apply the search-order coding method to data hiding for VQ images. Their method embeds six secret bits for every two blocks. The novel reversible data hiding scheme proposed in the current study is based on the concept of side match for VQcompressed images. Both the input and the output in this scheme are legitimate VQ-compressed codes, and the method can recover the original VQ-compressed codes and extract the embedded data from a stego-image. For each block in an image, the method partitions the codebook into many linked state codebooks of related codewords. A hit pattern is also used to resolve conﬂicts. The experimental results show that, despite the use of hit patterns, the proposed method has low overhead. The experimental results also conﬁrm its high embedding capacity, good visual quality, and fast processing speed. Moreover, the method enables complete embedding of the hit pattern in the cover media. The remainder of this paper is organized as follows. In Section 2, the background knowledge and the research issue are introduced. The proposed method is presented in Section 3, and the experimental results are given in Section 4. Finally, conclusions are given in Section 5.

2220

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

2. Background 2.1. The Chang and Lin’s embedding method Chang and Lin [2] proposed a reversible embedding method, which uses only one codeword to represent an image block. Despite its high compression rate, the method produces legitimate image outputs. Chang and Lin used the SMVQ concept to create three state codebooks, G0, G1, and G2, for each block X. Fig. 2 illustrates the concept in the case of three 4 4 blocks, U, X, and L. Block U is the upper adjacent block of X, and block L is the left adjacent block of X. Assume that both U and L have been encoded at some step in the Chang and Lin method and Yi represents a codeword in the codebook. First, the border values of X are temporarily assigned by its upper and left adjacent blocks U and L such that x0 = (u12 + l3)/2, x1 = u13, x2 = u14, x3 = u15, x4 = l7, x8 = l11, and x12 = l15. Second, the seven assigned values are used to ﬁnd the n closest codewords from the codebook according to the following equation:

DðX; Y i Þ ¼

3 3 X X ðxj yi;j Þ2 þ ðx4j yi;4j Þ2 ; j¼0

ð1Þ

j¼1

where n is the size of the state codebook, and xj and yi,j are jth elements of X and Yi, respectively. The n closest codewords are then used to construct a state codebook. Finally, the codeword of the state codebook with the minimum Euclidean distance from X is used to encode X. The input of the scheme in [2] is a VQ-compressed image (i.e., a VQ index table). In this discussion, ‘‘X’’ refers to a block that is reconstructed from the corresponding VQ index. In [2], the blocks of the ﬁrst column and the ﬁrst row are not processed. Each block X from left to right and from top to bottom is then processed according to the side-match concept. For each block X, three state codebooks G0, G1, and G2 with the same size are created. Fig. 3 gives an example of G0, G1, and G2 where the size of the state codebook is 4 and the size of the super codebook is 16. In the initial stage, the codewords of the super codebook are clustered as shown in Fig. 3. The G0 is constructed by the side-match prediction of X, where the four closest codewords are selected from the codebook by the distance function deﬁned in Eq. (1). Then, each codeword cwi of G0 successively identiﬁes its closest codeword in the same cluster as the corresponding codeword of G1. If a codeword of G0 cannot identify its closest codeword, the corresponding position of G1 is set to null. The resulting corresponding state codebook G1 should satisfy G0 \ G1 = Ø and "i–jcwi, cwj 2 G1, cwi – cwj. For example, as shown in Fig. 3, codeword cw4 of G0 identiﬁes the closest codeword cw7 as the corresponding codeword of G1. However, because codeword cw9 of G0 cannot identify the closest codeword from its cluster, the corresponding position of G1 is set to null. After G0 and G1 are constructed, they can be used to perform the embedding process. If X is equal to the ith codeword of G0, and if the ith codeword of G1 is available, one secret bit is embedded in X. If the secret bit is 0, X is unchanged; otherwise, X is replaced with the ith codeword of G1. If the ith codeword of G1 is null, X cannot embed a secret bit. However, if X is in G1, X must be replaced with the corresponding codeword of G2. Moreover, for accurate recovery of X, their method creates a state codebook G2 for X to ensure that X is not in G2. A hit map records whether a codeword is a potential replacement for X. For example, as shown in Fig. 3, codeword cw7 of G1 identiﬁes its closest codeword cw3 from the codewords with zero hits in the hit map. If a position of G1 is null, the corresponding position of G2 is set to null. The third state codebook G2 should also satisfy the following conditions: G0 \ G1 \ G2 = Ø and "i–jcwi, cwj 2 G2, cwi–cwj. The hit map is created by training each block X according to the following rules: if X is equal to the ith codeword of its sorted codebook, the ith bit of the hit map is set to 1. Otherwise, the ith bit of the hit map is 0. The sorted codebook consists of the codewords that are sorted by Eq. (1) in the

Fig. 2. Block X and its left block L and upper block U.

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

2221

Fig. 3. An example of generating G0, G1, and G2.

super codebook. Moreover, to enlarge the candidate pool of G2, which has a hit value of 0, twelve hit maps are proposed and separated according to the smoothness in the neighborhood of each block X.

2.2. Limitations of the Chang and Lin method In the Chang and Lin method, conﬂicts are resolved by multiple hit maps. However, the hit maps are too large to be embedded in cover media since all hit maps are needed when the extracting phase begins. Let the size of a hit map be jCj bits. The total number of the codewords in the codebook must then be jCj. Since their method may use as many as twelve hit maps, least 12 jCj bits are needed to store hit maps. Chang and Lin claimed that multiple hit maps can be embedded in a VQ index table by using only G0 and G1. The embedded hit maps and secret data can then be extracted blindly. The current study, however, demonstrates that their method is problematic. Fig. 4 shows a scenario in which their hit maps cannot be successfully extracted. The ﬁgure shows hit maps initially embedded into the embedded index table (the stego-image). In the extracting phase, the ﬁrst index 15 is not in G0 or in G1. Because the embedded hit maps have not yet been extracted, G2 cannot be constructed immediately. Two cases are possible: (1) index 15 is in G2 or (2) index 15 is not in G2. If index 15 is in G2 as shown in Fig. 4, it must be recovered from index 13. If index 15 is not in G2, the recovered index is still index 15. Because G2 has not been created, their algorithm cannot make an exact reversion for index 15. For the next index 44, reversion is still impossible because creating G0 and G1 requires its left block and upper block. Moreover, the method cannot judge whether a bit is embedded for index 44.

Fig. 4. The problem of Chang and Lin reversible method.

2222

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

3. The proposed method In this section, we propose a new reversible data hiding method which produces either legitimate VQ codes as outputs. The proposed method uses a novel hit pattern, which is small and can be embedded together with the secret data during the embedding phase. 3.1. Reversible data hiding For illustration purposes, the processed block shown in Fig. 2 is designated block X, and the left block and the upper block of X are L and U, respectively. Fig. 5 shows the proposed method for processing block X. First, codebook C is partitioned into m state codebooks S0, S1, . . ., Sm1 with the same size, where S0 is called the embedding state codebook, S1, S2, . . ., Sm2 are called the middle state codebooks, and Sm1 is called the ﬁnal state codebook. The S0 is generated based on the SMVQ method, which uses the left block L and the upper block U of block X to predict and select codewords from codebook C. The Si is generated by using codewords in Si1 to ﬁnd the closest codewords from the remaining codewords in C, where i = 1, 2, . . ., m 1. Generating Si from Si1 reduces the distance between each pair of the corresponding codewords in Si1 and Si. Block X is processed in one of three ways: Case (a): If block X is equal to the ith codeword of S0, the embedding procedure for block X is invoked. If the to-beembedded bit is 0, X remains unchanged; otherwise, X is set to the ith codeword of S1. Case (b): If block X is equal to the ith codeword of one middle state codebook Sj, X is set to the ith codeword of Sj+1. If the original X is in Sm2, a hit bit 0 is outputted. Case (c): If block X is in the ﬁnal state codebook Sm1, a hit bit 1 is outputted, and X remains unchanged. Fig. 6 gives an overview of the proposed embedding strategy for block X. The arrows represent possible transitions of block X from one state codebook to another state codebook in the proposed algorithm. One secret bit is embedded when X is in S0, and one hit bit is created when X is in Sm2 or Sm1. 3.2. Embedding and extraction The embedding algorithm is given below. The algorithm takes an index table (i.e. legitimate VQ codes), a codebook, and secret data as inputs; it produces another index table (i.e. legitimate VQ codes) and a hit pattern as outputs.

Fig. 5. The ﬂowchart for embedding block X.

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

2223

Fig. 6. The main idea of the proposed embedding method.

Algorithm Embedding: Input: Index table I, codebook C, secret W Output: Embedded index table I0 , hit pattern H 1. Indexes belonging to the ﬁrst row or to the ﬁrst column of I are not applied in the embedding phase. For each index Ii in the ﬁrst row or the ﬁrst column, set I0i ¼ Ii 2. From left to right and from top to down, process each remaining index Ii by executing Step 3–Step 6. 3. For index Ii, construct state codebooks S0, S1, . . ., Sm1 from codebook C 4. If Ii 2 S0, one bit is embedded. If the to-be-embedded bit in W is 0, set I0i ¼ Ii . If the to-be-embedded bit in W is 1, set I0i to the corresponding index in S1 5. If Ii 2 Sj and j = 1, 2, . . ., m 2, the block to be processed is not embedded. Set index I0i to the corresponding index in Sj+1. Moreover, if Ii 2 Sm2, output a hit bit 0 to hit pattern H 6. If Ii2 Sm1, the block to be processed is not embedded. Set I0i ¼ Ii , and output a hit bit 1 to hit pattern H

Fig. 7 shows an illustrative example of an embedding phase with 32 state codebooks. Indexes located in the ﬁrst row and in the ﬁrst column of the index table remain unchanged and are used as seed blocks, namely the upper-left seed blocks. The other indexes are then processed from left to right and from top to bottom in Steps 3–6. Now consider the case in Fig. 7, which shows codebooks in a ﬁxed state for illustration purposes. Note that the proposed method requires each index to construct its own state codebooks S0, S1, . . ., Sm1. In the case of Fig. 7, the ﬁrst index 13 belongs to S2. Thus, it is not embedded, and Ii0 is set to the corresponding index 15 in S3. The second index 44 is processed and then identiﬁed in S0. Thus, one bit is embedded. Because the to-be-embedded bit is 0, Ii0 is set to 44. The third index 58 is in S0, and the to-be-embedded bit is 1. Therefore, I0i is set to the corresponding index 69 in S1. The fourth index 17 is in S31. Therefore, I0i is set to 17, and a hit bit 1 is outputted. The ﬁfth index 73 is in S30. Therefore, I0i is set to the corresponding index 85 in S31, and a hit bit 0 is outputted.

Fig. 7. A simpliﬁed example of the proposed embedding method.

2224

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

The extracting and reversing algorithm is as follows: Algorithm Extracting_and_Reversing: Input: Embedded index table I0 , codebook C, hit pattern H Output: Original index table I, extracted secret W 1. Indexes in the ﬁrst row and in the ﬁrst column of I0 remain unchanged in the extracting phase. For each index I0i in the ﬁrst row or the ﬁrst column, set Ii = I0i 2. From left to right and from top to down, process each remaining index I0i by executing Steps 3–7 3. For index I0i , construct state codebooks S0, S1, . . ., Sm1 from codebook C 4. If I0i 2 S0, extract a bit 0 to W, and set Ii ¼ I0i 5. If I0i 2 S1, extract a bit 1 to W, and set Ii to the corresponding index in S0 6. If I0i 2 Sj and j = 2, . . ., m 2, set Ii to the corresponding index in Sj1 7. If I0i 2 Sm1 and the to-be-processed hit bit in H is 0, set Ii to the corresponding index in Sm2; if I0i 2 Sm1 and the tobe-processed hit bit in H is 1, set Ii ¼ I0i

Fig. 8 gives an illustrative example of extraction and reversion, where the embedded table is generated as shown in Fig. 7. Similarly, indexes located in the ﬁrst row and in the ﬁrst column of the embedded index table remain unchanged. The other indexes are handled by executing Steps 3–7. For each embedded index I0i , a unique state codebook must be constructed. The new codebooks must be the same as the state codebooks created by index Ii because the left index and upper index are identical in indexes I0i and Ii. In Fig. 8, state codebooks are ﬁxed for illustration purposes. The ﬁrst embedded index I0i , which is 15, belongs to S3. Therefore, no secret data is embedded, and Ii is set to the corresponding index 13 in S2. Next, the embedded index I0i is 44, and it belongs to S0. Therefore, a secret bit 0 is extracted, and Ii is set to 44. Since the third index, I0i ¼ 69, belongs to S1, a secret bit 1 is extracted, and Ii is set to the corresponding index 58 in S0. The fourth index, I0i ¼ 17, is in S31. Because the to-be-processed hit bit is 1, Ii is set to 17. The ﬁfth index, I0i ¼ 85, must be found in S31. The to-be-processed hit bit is 0. Thus, Ii is set to its corresponding index 73 in S30. 3.3. Applying lookup table and similarity distance threshold Our approach is required to construct a different set of state codebooks S0, S1, . . ., Sm1 for each block X. In the constructing phase, the proposed method needs to search the closet codeword for each codeword in Si from the remaining codewords in C, such that i = 1, . . ., m 1. This phase is time consuming and represents a potential performance bottleneck. To minimize execution time, a lookup table for maintaining the distance between each pair of codewords is pre-constructed and presorted. The proposed hit pattern strategy creates a hit bit only when X is in Sm2 or in Sm1. Since the probability of X in Sm2 or Sm1 is very low, the size of the hit pattern is usually small. To enable adjustment of hit pattern size, a threshold TH is used to modify the state transition rules (Fig. 6). Using a smaller threshold maintains stego-image quality because it minimizes the possibility of poor codeword replacement. For each block X, suppose that the closest codeword of X, denoted by cwi, belongs to Si. Let cw0, cw1, . . ., cwm1 be the corresponding codewords of cwi in S0, S1, . . ., Sm1, respectively. As Fig. 9 shows, all the corresponding codewords of cwi are listed, including two additional dummy codewords cw1 and cwm. Let DS(cwi, cwi+1) be the Euclidian distance between codewords cwi and cwi+1. If DS(cwi, cwi+1) > TH, edge (cwi, cwi+1) is denoted by a bold line. Otherwise, edge (cwi, cwi+1) is denoted by a thin line. Note that edge (cw1, cw0) is deﬁned as a thin line whereas edge (cwm1, cwm) is deﬁned as a bold line. The procedure for processing block X is then modiﬁed according to the following embedding rules: (1) If DS(cwi, cwi+1) 6 TH, the codeword of block X is set to Si+1 instead of Si. Otherwise, X remains unchanged.

Fig. 8. A simpliﬁed example of the proposed extracting and reversing method.

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

2225

Fig. 9. The new embedding rules.

(2) Let cwj be the ﬁnal encoded result of X. Note that cwj should be cwi or cwi+1. If DS(cwj, cwj+1) > TH and DS(cwj1, cwj) 6 TH, a hit bit is created as follows: Case (a) cwj = cw0 The created hit bit is 1 if the embedded bit is 1; otherwise, the hit bit is 0. Case (b) cwj – cw0 The created hit bit is 1 or 0 according to cwj = cwi or cwj = cwi+1, respectively.

3.4. Diagonal seed blocks for embedding and extraction To ensure that the entire hit pattern can be embedded, diagonal seed blocks are used to increase the total number of blocks that have two side-match neighboring blocks with known values. Fig. 10 shows the diagonal sequence of processing. The gray blocks in Fig. 10 are seed blocks, which do not participate in the embedding phase. Blocks located in the upper left create their own state codebooks S0, S1, . . ., Sm1 according to their down-side neighboring blocks and their right-side neighboring blocks. However, blocks located in the lower right use their upper-side neighboring blocks and their left-side neighboring blocks for encoding and decoding. The hit pattern H is created by pre-scanning all blocks in the proposed order. To explain the proposed method, two types of blocks are deﬁned as follows: A block that generates a hit bit is called a hit-bit-generating block. A block that can be used to embed one bit is called a secret-bit-embedding block. Note that a block can be processed if both of its neighboring blocks have been processed. In Fig. 10(a), for example, blocks numbered 1–10 are ready for processing. In the embedding phase, the hit pattern H is embedded before the secret bits. Blocks are scanned in the proposed order, and the hit pattern H is embedded according to the following rules: 1. For a hit-bit-generating block X, if its hit bit Hi has been embedded in one of the previous secret-bit-embedding blocks, block X is ready and is processed. If its hit bit has not yet been embedded, block X is put into a queue. 2. For a secret-bit-embedding block, a hit bit Hi is embedded. If the hit-bit-generating block X that generates the embedded hit bit Hi is in the queue, block X is removed from the queue and is processed. 3. If the queue is not empty and no block can be processed, the generated hit bits of the hit-bit-generating blocks in the queue cannot be embedded. These hit bits are outputted, and the hit-bit-generating blocks are then removed from the queue for processing. Using the diagonal processing order in our method is advantageous since the probability of encountering a secretbit-embedding block is enormously larger than that of encountering a hit-bit-generating block. Therefore, the chance of embedding the entire hit pattern is extremely high.

Fig. 10. The proposed diagonal sequence of processing: (a) the case of n n VQ blocks; (b) the case of n m VQ blocks where n – m.

2226

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

4. Experimental results This section describes the results of several experiments that conﬁrmed the effectiveness and efﬁciency of the proposed method. The experiments were performed on an Intel Core 2 T5500 1.66 GHz machine with 2 GB RAM equipped with a Microsoft XP operating system. The method was implemented in C language. The cover images used in the experiments were 512 512 gray images, and the secret data were random bit strings. The size of the codebook, which was created by the Linde–Buzo–Gray (LBG) algorithm, was 512 codewords. The state codebook size was set to 16 codewords. The peak signal-to-noise ratio (PSNR) was used to quantify stego-image output quality. Fig. 11 shows the VQ-compressed cover images Lena, Sailboat, Baboon, Peppers, F-16, and Boat. The threshold (TH) in the experiments was set to control the distance between each state codebook and its next state codebook. Tables 1 and 2 show the experimental results of our method using (1) the original upper-left seed blocks (described in Section 3.2) and (2) the diagonal seed blocks (the diagonal seeding method), respectively, for different threshold (TH) values. Because the diagonal seeding method used fewer blocks, its payload and hit-pattern size were slightly larger than those of the original upper-left seeding method did. The experimental results indicate that, given the same test image, the size of the payload remains unchanged for different threshold values, and a larger TH value results in a smaller hit pattern. The experimental results also indicate that a larger TH value results in a lower PSNR value given the same test image. Table 3 shows the size (in bits) of the hit pattern that cannot be embedded for each test image. The results of the upperleft seeding method show that few hit-pattern data cannot be embedded into images Baboon, Sailboat, and Boat. As for the remaining test images, all hit pattern data were successfully embedded. The results show that all hit patterns can be embedded into their corresponding stego-images when using the diagonal seeding method. The results conﬁrm that using diagonal seed blocks is practical because it does not require extra control messages. Fig. 12 shows the relationship among the PSNR, threshold, and pure payload (numbers above the curves) for upper-left seed blocks when using the proposed method: (a) Lena, (b) Baboon, (c) Sailboat, (d) Peppers, (e) F-16 and (f) Boat. The pure payload is deﬁned as the size of the payload minus the size of the hit pattern. The experimental results conﬁrm that, as the TH value increases, the pure payload increases. Thus, capacity increases as well. However, the PSNR value decreases. Table 4 compares the results obtained by the proposed method and by the Chang and Lin method given similar PSNR values. The experimental results show that the size of the hit pattern obtained by the proposed method is smaller than that in the Chang and Lin hit maps. Therefore, the proposed method has smaller overhead because less information is needed to embed secret data. The payload in the proposed method is also larger, which indicates a larger capacity. In [2], the multiple hit maps need 512 12 = 6144 bits, where 512 is the codebook size, and 12 is the number of hit maps. On average, the hit

Fig. 11. Six VQ-compressed cover images. (a) Lena with a PSNR of 32.24 dB. (b) Baboon with a PSNR of 24.70 dB. (c) Sailboat with a PSNR of 29.25 dB. (d) Peppers with a PSNR of 31.40 dB. (e) F-16 with a PSNR of 30.295 dB. (f) Boat with a PSNR of 30.163 dB.

2227

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230 Table 1 Experimental results of the proposed upper-left seeding method for different thresholds (TH). Proposed method with upper-left seed blocks

Measures

Lena

Baboon

Sailboat

Peppers

F-16

Boat

Average

TH = 90

PSNR (dB) Payload (bits) Hit pattern size

31.247 12665 7539

24.302 6242 5060

28.899 11621 9509

29.254 12178 7592

29.934 12395 10023

29.831 11967 9310

28.911 11178 8172

TH = 100

PSNR (dB) Payload (bits) Hit pattern size

30.953 12665 5862

24.242 6242 4442

28.752 11621 8407

29.034 12178 6010

29.689 12395 8870

29.476 11967 7901

28.691 11178 6915

TH = 150

PSNR (dB) Payload (bits) Hit pattern size

29.939 12665 2324

23.870 6242 2879

27.863 11621 3171

28.449 12178 2769

28.834 12395 6104

28.439 11967 3989

27.899 11178 3539

TH = 200

PSNR (dB) Payload (bits) Hit pattern size

29.203 12665 1002

23.410 6242 2211

27.235 11621 1493

28.051 12178 1914

28.452 12395 4145

27.698 11967 2301

27.342 11178 2178

TH = 1

PSNR (dB) Payload (bits) Hit pattern size

27.521 12665 74

21.770 6242 198

25.562 11621 114

25.711 12781 52

25.212 12395 124

26.017 11967 136

25.299 11178 116

Table 2 Experimental results of the proposed diagonal seeding method for different thresholds (TH). Proposed method with diagonal seed blocks

Measures

Lena

Baboon

Sailboat

Peppers

F-16

Boat

Average

TH = 90

PSNR (dB) Payload (bits) Hit pattern size

31.260 12781 7706

24.208 6535 5410

28.774 11729 10055

28.728 12399 8122

29.769 12679 10530

29.664 12120 9879

28.734 11374 8617

TH = 100

PSNR (dB) Payload (bits) Hit pattern size

30.947 12781 5999

24.148 6535 4786

28.605 11729 8824

28.508 12399 6452

29.585 12679 9384

29.417 12120 8441

28.535 11374 7314

TH = 150

PSNR (dB) Payload (bits) Hit pattern size

29.929 12781 2533

23.747 6535 3229

27.604 11729 3520

27.926 12399 2942

28.906 12679 6615

28.448 12120 3972

27.760 11374 3802

TH = 200

PSNR (dB) Payload (bits) Hit pattern size

29.035 12781 1203

23.284 6535 2400

27.031 11729 1730

27.440 12399 2040

27.600 12679 4330

27.688 12120 2342

27.013 11374 2341

TH = 1

PSNR (dB) Payload (bits) Hit pattern size

26.61 12781 194

21.60 6535 341

24.98 11729 224

25.02 12399 168

24.485 12679 239

25.372 12120 253

24.678 11374 237

Table 3 The size of the un-embedded hit pattern for various images (in bits). Proposed methods

Threshold (TH)

Lena

Baboon

Sailboat

Peppers

F-16

Boat

Average

Upper-left seeding

90 100 150 200 1

0 0 0 0 0

1 1 7 6 0

4 3 0 0 0

0 0 0 0 0

0 0 0 0 0

1 1 0 0 0

1 0.83 1.17 1 0

Diagonal seeding

90 100 150 200 1

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

0 0 0 0 0

pattern size in the proposed method is about 18.8% smaller than the hit map size in the Chang and Lin method. Notably, whereas the Chang and Lin method cannot embed both secret data and hit maps into the cover image, the proposed method has a high probability of doing so. Table 5 compares average execution time between the Chang and Lin method and the proposed method. The average execution time for the proposed method consists of (1) the average time for processing all test images for different TH parameters, and (2) the average time for constructing a lookup table if it is used. The average execution time for constructing a lookup table in the experiments in this study was about two seconds. Note that the time required to construct a lookup table

2228

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

Fig. 12. Relationship among the PSNR, threshold, and pure payload (numbers above the curves) of the proposed method using upper-left seed blocks: (a) Lena, (b) Baboon, (c) Sailboat, (d) Peppers, (e) F-16 and (f) Boat.

Table 4 Comparisons of the Chang and Lin method and the proposed method using upper-left seed blocks. Methods

Measures

Lena

Baboon

Sailboat

Pepper

Average

Chang and Lin [2]

Payload (bits) PSNR (dB) Hit map size

8707 30.23 6144

3400 24.04 6144

7601 28.00 6144

8421 29.15 6144

7032 27.86 6144

Proposed method using upper-left seed blocks

Payload (bits) PSNR (dB) Hit pattern size TH

12665 30.26 3207 130

6242 24.07 3758 120

11621 28.00 4715 130

12178 29.18 8289 87

10677 27.88 4992 117

Table 5 Comparison based on average execution time in seconds. Chang and Lin [2]

Average execution time (seconds)

36.6

Proposed method Upper-left seeding method

Diagonal seeding method

Without lookup table

With lookup table

Without lookup table

With lookup table

627.6

30.3

941.3

43.9

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

2229

can be eliminated if the lookup table is already created and stored for reuse. As Table 5 shows, the execution time of the proposed method improves dramatically if a lookup table is used. The results also show that the proposed method with a lookup table has the shortest execution time. Fig. 13 compares the stego-images produced by the Chang and Lin method with those obtained by the proposed method. The comparison results show that the visual quality of the stego-images obtained by the proposed method is much better than that given by the Chang and Lin method, which produces numerous artifacts that are easily visible to the human eye. For example, their stego-image of Lena clearly shows white spots in the forehead region and black spots in the hair region. This occurs because their G2 is constructed using codewords with the hit bit 0. The distance between the corresponding codewords in G1 and G2 is very large. The proposed method, however, searches for the codewords that are closest to the corresponding codewords in the current state codebook, which it then uses to construct the next state codebook. Therefore, the stego-images obtained by the proposed method are of higher visual quality. Even when TH = 1, the proposed method does not produce visible artifacts, as shown in Fig. 13(d). 5. Conclusions The proposed reversible data hiding method based on the SMVQ prediction embeds secret data into VQ-compressed images. The codebook is partitioned into several related state codebooks to minimize the distance between the corresponding codewords in the current processing state codebook and in the next state codebook. Therefore, it does not generate artifacts in stego-images. After using a lookup table to store all distances between each pair of codewords, a partition operation is performed on the codebook to improve the execution time for embedding and extraction. The hit pattern used to resolve

Fig. 13. Experimental results of image Lena: (a) Original VQ-compressed image (PSNR = 32.24) (b) Stego-image (PSNR = 30.23) in [2] (c) Stego-image obtained by the proposed upper-left seeding method with TH = 130 (PSNR = 30.235) (d) Stego-image obtained by the proposed upper-left seeding method with TH = 1 (PSNR = 27.52).

2230

C.-H. Yang et al. / Information Sciences 181 (2011) 2218–2230

conﬂicts is also smaller than that in Chang and Lin [2]. The proposed diagonal seeding method ensures that the hit pattern can be embedded into the cover image. The experimental results conﬁrm that the proposed method is superior to the Chang and Lin method in terms of capacity, visual quality, and execution time. Acknowledgments This research was partially supported by the National Science Council of the Republic of China under the Grants NSC 982221-E-153-001, NSC99-2218-E-008-012, NSC 98-2221-E-015-001-MY3, and NSC 99-2918-I-015-001. We are also appreciated to Ted Knoy for the comments offers in improving the paper presentations References [1] M.U. Celik, G. Sharma, A.M. Tekalp, E. Saber, Reversible data hiding, in: Proceeding of International Conference on Image Processing, ICIP’02, Rochester, New York, 2002, pp. 157–160. [2] C.C. Chang, C.Y. Lin, Reversible steganography for VQ-compressed images using side matching and relocation, IEEE Transactions on Information Forensics and Security 1 (2006) 493–501. [3] C.C. Chang, C.Y. Lin, Reversible steganographic method using SMVQ approach based on declustering, Information Sciences 177 (2007) 1796–1805. [4] C.C. Chang, W.L. Tai, C.C. Lin, A reversible data hiding scheme based on side match vector quantization, IEEE Transactions on Circuits and Systems for Video Technology 16 (2006) 1301–1308. [5] C.C. Chang, W.C. Wu, Y.H. Chen, Joint coding and embedding techniques for multimedia images, Information Sciences 178 (2008) 3543–3556. [6] T.H. Chen, C.S. Wu, Compression-unimpaired batch image encryption combining vector quantization and index compression, Information Sciences 180 (2010) 1690–1701. [7] K.S. Kim, M.J. Lee, H.K. Lee, Y.H. Suh, Histogram-based reversible data hiding technique using subsampling, in: Proceedings of The 10th ACM Workshop on Multimedia and Security, Oxford, United Kingdom, 2008, pp. 69–74. [8] H.J. Kim, V. Sachnev, Y.Q. Shi, J. Nam, H.G. Choo, A novel difference expansion transform for reversible data embedding, IEEE Transactions on Information Forensics and Security 3 (2008) 456–465. [9] C.C. Lee, W.H. Ku, S.Y. Huang, A new steganographic scheme based on VQ and SOC, IET Image Processing 3 (2009) 243–248. [10] X. Li, J. Wang, A steganographic method based upon JPEG and particle swarm optimization algorithm, Information Sciences 177 (2007) 3099–3109. [11] C.C. Lin, S.C. Chen, N.L. Hsueh, Adaptive embedding techniques for VQ-compressed images, Information Sciences 179 (2009) 140–149. [12] C.C. Lin, N.L. Hsueh, A lossless data hiding scheme based on three-pixel block differences, Pattern Recognition 41 (2008) 1415–1425. [13] I.C. Lin, Y.B. Lin, C.M. Wang, Hiding data in spatial domain images with distortion tolerance, Computer Standards & Interfaces 31 (2009) 458–464. [14] Z.M. Lu, J.X. Wang, B.B. Liu, An improved lossless data hiding scheme based on image VQ-index residual value coding, Journal of Systems and Software 82 (2009) 1016–1024. [15] Z. Ni, Y.Q. Shi, N. Ansari, W. Su, Reversible data hiding, IEEE Transactions on Circuits and Systems for Video Technology16 (2006) 354–362. [16] D.M. Thodi, J.J. Rodriguez, Expansion embedding techniques for reversible watermarking, IEEE Transactions on Image Processing 16 (2007) 721–730. [17] J. Tian, Reversible data embedding using a difference expansion, IEEE Transactions on Circuits and Systems for Video Technology 13 (2003) 890–896. [18] P. Tsai, Y.C. Hu, H.L. Yeh, Reversible image hiding scheme using predictive coding and histogram shifting, Signal Processing 89 (2009) 1129–1143. [19] H.W. Tseng, C.P. Hsieh, Prediction-based reversible data hiding, Information Sciences 179 (2009) 2460–2469. [20] J.X. Wang, Z.M. Lu, A path optional lossless data hiding scheme based on VQ index table joint neighboring coding, Information Sciences 179 (2009) 3332–3348. [21] C.H. Yang, Y.C. Lin, Reversible data hiding of a VQ index table based on referred counts, Journal of Visual Communication and Image Representation 20 (2009) 399–407. [22] C.H. Yang, Y.C. Lin, Fractal curves to improve the reversible data embedding for VQ-indexes based on locally adaptive coding, Journal of Visual Communication and Image Representation 21 (2010) 334–342. [23] C.H. Yang, C.Y. Weng, S.J. Wang, H.M. Sun, Adaptive data hiding in edge areas of images with spatial LSB domain systems, IEEE Transactions on Information Forensics and Security 3 (2008) 488–497.