Pattern Recognition 38 (2005) 1111 – 1115 www.elsevier.com/locate/patcog
Rapid and brief communication
Efficient encryption of wavelet-based coded color images
Karl Martin∗, Rastislav Lukac, Konstantinos N. Plataniotis
Multimedia Laboratory, The Edward S. Rogers Sr. Department of ECE, University of Toronto, 10 King’s College Road, Toronto, Ontario, M5S 3G4, Canada
Received 6 December 2004; accepted 3 January 2005
Abstract
An efficient, secure color image coder incorporating Color-SPIHT (C-SPIHT) compression and partial encryption is presented. Confidentiality of the image data is achieved by encrypting only the significance bits of individual wavelet coefficients for K iterations of the C-SPIHT algorithm. By varying K, the level of confidentiality vs. processing overhead can be controlled. For K = 2, adequate security is achieved and an average of only 0.40% of bits needed encrypting for test images coded at 0.8 bpp.
© 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Keywords: Image encryption; Partial encryption; Secure coding; Secure wavelet based coding; Secure SPIHT
∗ Corresponding author. Tel.: +1 416 978 6845; fax: +1 416 978 4425.
E-mail addresses: [email protected] (K. Martin), [email protected] (R. Lukac).
URLs: www.dsp.utoronto.ca/~kmartin (K. Martin), www.dsp.utoronto.ca/~lukacr (R. Lukac).
0031-3203/$30.00 © 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.patcog.2005.01.002

1. Introduction

Multimedia systems security has become increasingly important in recent times. Data communication has become largely multimedia in nature, with imaging capabilities embedded in a myriad of portable devices, such as mobile phones and personal digital assistants (PDAs). Simultaneously, communication channels, such as wireless and wired Internet, are increasingly hostile and offer no confidentiality. To promote the proliferation of next generation multimedia applications such as online collaboration, the assurance of confidentiality must be a central component of the services offered.

Strong encryption schemes, such as the Advanced Encryption Standard (AES), are designed to ensure the confidentiality of arbitrary binary data, but do not take into account the special properties of multimedia data. Specifically, still images and video can be overwhelming to encrypt and decrypt due to their large size. Typically, this type of data already requires significant processing overhead to transmit and access because of the necessary use of complex compression algorithms. Furthermore, the use of power-limited portable devices such as mobile phones makes applying encryption schemes such as AES to all of the data prohibitive [1].

In this paper, we present a new efficient method for encrypting still color images based on the principle of partial or selective encryption [1,2]. It relies on the color set partitioning in hierarchical trees (C-SPIHT) [3] compression algorithm to produce a secure, coded image. The proposed method uses the C-SPIHT coder to direct a stream cipher to encrypt only certain significance bits, resulting in minimal computational overhead. The rate-distortion performance of the compression algorithm is unaffected since specific bits of the image are encrypted after encoding. This novel scheme is more efficient than previously published schemes, such as [2], since encryption is performed only on the significance bits of individual coefficients and not on those of trees. Also, the level of confidentiality vs. computational overhead may be controlled via a parameter that determines the number of coding iterations for which encryption is performed. The confidentiality of the image results not only from the encryption of the significance bits, but also from the unknown arrangement of encrypted and unencrypted bits.
2. Background

2.1. Color-SPIHT

The Set Partitioning in Hierarchical Trees (SPIHT) wavelet-based compression algorithm is well regarded as a highly efficient technique for lossy image compression, but was not designed for multi-channel images such as RGB color images [3]. The C-SPIHT algorithm in [3] uses SPIHT with a modified spatial orientation tree (SOT) structure to allow efficient coding of natural RGB images. The image is first transformed into a decorrelated color space, such as YCbCr, then each channel is transformed using a discrete wavelet transform (DWT). The modified SOT structure, shown in Fig. 1, uses one in every four luminance channel (Y) LL subband coefficients as the root of a tree containing chrominance channel (e.g., Cb and Cr) coefficients. The rest of the tree structures are the same as in SPIHT. The net result is that the chrominance channel coefficients, comprising 2/3 of the total number of coefficients but typically having less energy than the luminance channel coefficients, are contained in only 1/4 of the SOTs. In this way, fewer bits can be used to code their insignificance in the initial iterations of the coding algorithm.

As in SPIHT, the C-SPIHT algorithm maintains three ordered lists: the list of insignificant pixels (LIP), containing all the coefficients segregated from trees but not yet found to be significant at the current or previous quantization thresholds; the list of insignificant sets (LIS), containing entire trees that do not have any coefficients found significant at the current or previous quantization thresholds; and the list of significant pixels (LSP), containing coefficients found significant at the current or previous quantization thresholds.

Fig. 1. Root parent–child structure in the C-SPIHT algorithm.

2.2. Partial encryption

Partial or selective encryption is the technique of securing the confidentiality of compressed data by encrypting only a fraction of the total data to reduce computational overhead [1]. The most significant portion of the data, as dictated by a compression algorithm, is encrypted to disallow decoding without knowledge of the decryption key.

In Ref. [2], partial encryption of quadtree-based compressed images and SPIHT compressed images was demonstrated. Using SPIHT with grayscale images, it was proposed in Ref. [2] that only the significance bits (both for individual coefficients and trees) of the top two tree levels, along with the initial quantization threshold, be encrypted. The number of bits encrypted is variable depending on the image content. Grayscale test images of size 512 × 512 for screen-display applications were encoded at 0.8 bpp and it was found that less than 2% of the coded source was typically encrypted. It was claimed in Ref. [2] that confidentiality was achieved not just through securing the most significant information, but by making the correct state of the decoder very difficult to determine.

A different approach was taken in Ref. [4], where 3D-SPIHT coded video is encrypted using two stages of coefficient confusion in which the wavelet coefficients are scrambled both within each cube and between cubes before compression. Also, the signs of the low-frequency coefficients are encrypted using a chaotic stream cipher. The computational cost of encryption was found to be less than 10% of that of compression if the cubes were no bigger than 32 × 32 × 32 and the DWT was performed using four levels or fewer.

In Ref. [5], random permutations of wavelet coefficients before coding using SPIHT or JPEG2000, as a means of fast encryption, were studied. However, the authors found that this significantly reduced compression performance (up to 27% for test images).

3. New scheme for efficient color image encryption

Let us denote the C-SPIHT bit-stream as the ordered set of bits $B$. The bit-stream can be divided into the ordered subsets $B = \{B_0, B_1, B_2, \ldots\}$, where $B_k$ is the set of bits obtained during iteration $k$ of the C-SPIHT algorithm. Each $B_k$ can be further subdivided into $B_k = \{B_{k,\text{LIP}}, B_{k,\text{LIS}}, B_{k,\text{LSP}}\}$, where $B_{k,\text{LIP}}$ denotes the ordered set of bits obtained during the first phase of the sorting pass where coefficients in the LIP are tested for significance, $B_{k,\text{LIS}}$ denotes the ordered set of bits obtained during the second phase of the sorting pass where entire trees are tested for significance, and $B_{k,\text{LSP}}$ denotes the ordered set of bits obtained during the refinement pass.
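As a rough illustration of this partition (not part of the original paper), the following Python sketch stores the three ordered subsets produced in each iteration and concatenates them in coding order. The names `IterationBits`, `assemble`, and `lip_start` are hypothetical, and the bit values are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IterationBits:
    """Bits produced during one coding iteration k (the subset B_k)."""
    lip: List[int] = field(default_factory=list)  # LIP phase of the sorting pass
    lis: List[int] = field(default_factory=list)  # LIS phase of the sorting pass
    lsp: List[int] = field(default_factory=list)  # refinement pass

    def ordered(self) -> List[int]:
        # Within an iteration the order is LIP bits, then LIS bits, then LSP bits.
        return self.lip + self.lis + self.lsp


def assemble(iterations: List[IterationBits]) -> List[int]:
    """Concatenate B_0, B_1, ... into the full bit-stream B."""
    stream: List[int] = []
    for it in iterations:
        stream.extend(it.ordered())
    return stream


def lip_start(iterations: List[IterationBits], k: int) -> int:
    """Offset in B at which the LIP bits of iteration k begin."""
    return sum(len(it.ordered()) for it in iterations[:k])


# Illustrative values only.
demo = [IterationBits([1, 0], [1], [0]), IterationBits([0], [1, 1], [])]
print(assemble(demo))
print(lip_start(demo, 1))
```

Because the subsets have variable length, the starting offset of every later pass depends on the lengths of all earlier passes, which is why a reader of the stream has to follow the coder step by step.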
Fig. 2. Composition of subset Bk of C-SPIHT bit-stream.
Each set of bits $B_{k,\text{LIP}}$ is composed of significance bits ($B_{k,\text{LIP-sig}}$) and sign bits ($B_{k,\text{LIP-sgn}}$). Similarly, each set of bits $B_{k,\text{LIS}}$ is composed of significance bits ($B_{k,\text{LIS-sig}}$) and sign bits ($B_{k,\text{LIS-sgn}}$) for individual coefficients, as well as significance bits ($B_{k,\text{LIS-Tsig}}$) for trees. This decomposition of the bit-stream is shown in Fig. 2.

The proposed efficient encryption scheme uses an encryption function $f_E(\cdot)$ to encrypt only the bits $B_{k,\text{LIP-sig}}$ and $B_{k,\text{LIS-sig}}$, for $k = 0, 1, \ldots, K-1$, where the parameter $K$ is controlled by the user at the time of encryption/encoding to determine the number of coding iterations to be encrypted. That is, only the coefficient significance bits obtained during the first $K$ iterations of the C-SPIHT algorithm are encrypted. The sign bits $B_{k,\text{LIP-sgn}}$ and $B_{k,\text{LIS-sgn}}$ remain unencrypted, as do the tree significance bits $B_{k,\text{LIS-Tsig}}$. While the bits $B_{k,\text{LIS-Tsig}}$ represent significance information, they are not encrypted because their values are determined by the significance of one or more unknown coefficients within a large group of coefficients. This can be considered nonspecific information that does not directly affect reconstruction values and hence does not require encryption. The combined coding and encryption system is shown in Fig. 3(a).

The encryption function $f_E(\cdot)$ is implemented using a symmetric key stream cipher. A block cipher cannot be used because the bits need to be decrypted individually to determine the execution path of the decoder, which in turn specifies the location of the subsequent bits to be decrypted before decoding. The use of a stream cipher also provides two incidental advantages: stream ciphers generally require fewer computational resources than block ciphers, and they do not contribute to error propagation.

The proposed scheme can be viewed as partial bit-plane encryption in the wavelet domain. The higher-order bit-planes of the DWT coefficients found significant during the initial $K$ iterations of the C-SPIHT algorithm are encrypted. Confidentiality is achieved not only by encrypting these high-order bit-planes of the most significant coefficients, but also by making the correct state of the encoder/decoder difficult to determine. For small $K$, the correct interpretation of the unencrypted portion may reveal important features of the image but, as in Ref. [2], the proposed approach makes it difficult to determine which bits are unencrypted and how to decode them correctly.
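The bit-serial coupling between the stream cipher and the coder can be illustrated with a minimal Python sketch of the LIP phase alone. It is not the authors' implementation. The keystream generator (SHA-256 in counter mode) is only a stand-in for whichever stream cipher is chosen, all function names are hypothetical, and the rule that a coefficient's sign bit immediately follows a significance bit equal to 1 is assumed from the standard SPIHT sorting pass. LIS and refinement bits are omitted.

```python
import hashlib
from itertools import count
from typing import Iterator, List, Tuple

Coeff = Tuple[bool, bool]  # (is_significant, is_negative) for one LIP entry


def keystream_bits(key: bytes) -> Iterator[int]:
    """Toy keystream (SHA-256 in counter mode); a stand-in, not a vetted cipher."""
    for counter in count():
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        for byte in block:
            for shift in range(7, -1, -1):
                yield (byte >> shift) & 1


def encode_lip_pass(coeffs: List[Coeff], ks: Iterator[int], encrypt: bool) -> List[int]:
    """Emit one LIP pass; only significance bits are XORed with the keystream."""
    out: List[int] = []
    for significant, negative in coeffs:
        sig = int(significant)
        out.append(sig ^ next(ks) if encrypt else sig)
        if significant:
            out.append(int(negative))  # sign bit is left in the clear
    return out


def decode_lip_pass(bits: List[int], n_coeffs: int, ks: Iterator[int],
                    encrypted: bool) -> List[Coeff]:
    """Read one LIP pass; each decrypted bit decides how the next bit is read."""
    pos, coeffs = 0, []
    for _ in range(n_coeffs):
        sig = bits[pos] ^ next(ks) if encrypted else bits[pos]
        pos += 1
        negative = False
        if sig:  # assumed SPIHT rule: a significant coefficient is followed by its sign
            negative = bool(bits[pos])
            pos += 1
        coeffs.append((bool(sig), negative))
    return coeffs


def encode(passes: List[List[Coeff]], key: bytes, K: int) -> List[List[int]]:
    """Encrypt significance bits only in iterations k < K."""
    ks = keystream_bits(key)
    return [encode_lip_pass(c, ks, k < K) for k, c in enumerate(passes)]


def decode(streams: List[List[int]], sizes: List[int], key: bytes, K: int) -> List[List[Coeff]]:
    ks = keystream_bits(key)
    return [decode_lip_pass(b, n, ks, k < K)
            for k, (b, n) in enumerate(zip(streams, sizes))]


# Illustrative data: three iterations, with K = 2 iterations encrypted.
passes = [
    [(True, False), (False, False), (True, True)],
    [(False, False), (True, True)],
    [(True, False), (False, False)],
]
key = b"demo key"
streams = encode(passes, key, K=2)
recovered = decode(streams, [len(p) for p in passes], key, K=2)
print(recovered == passes)
```

The decoder cannot decrypt a pass in bulk: it must learn the plaintext value of each significance bit before it knows whether the next bit is a sign bit or another significance bit, which is the reason a bit-oriented stream cipher fits this design.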
Fig. 3. System level diagram of coding and encryption (a), and decryption and decoding (b). Dark gray bits represent encrypted significance bits.
Decryption and decoding are achieved in a similar manner (Fig. 3(b)), with only the significance bits $B_{k,\text{LIP-sig}}$ and $B_{k,\text{LIS-sig}}$, for $0 \le k < K$, decrypted and interpreted by the decoder at the appropriate point in the C-SPIHT algorithm. The decryption function $f_D(\cdot)$ of the stream cipher uses the secret key to perform the actual decryption. Each decrypted bit must be passed to the decoder before decryption continues so that the decoder can instruct the stream cipher which bit to decrypt next. It is assumed that the transmission channel for the encrypted/coded image is error free.

To demonstrate the difficulty encountered by a cryptanalyst attempting to determine which bits are unencrypted, we use $b^{j}_{k,\text{LIP}}$ to denote the $j$th bit in the set $B_{k,\text{LIP}}$, for $j = 0, 1, 2, \ldots, N_{k,\text{LIP}} - 1$, where $N_{k,\text{LIP}}$ is the total number of bits in $B_{k,\text{LIP}}$. It is known a priori that the first bit is a significance bit:

$$ b^{0}_{k,\text{LIP}} \in B_{k,\text{LIP-sig}} \quad \forall k < K. \tag{1} $$

However, the classification of the second bit depends on the first bit:

$$ b^{1}_{k,\text{LIP}} \in \begin{cases} B_{k,\text{LIP-sig}} & \text{if } b^{0}_{k,\text{LIP}} = 0, \\ B_{k,\text{LIP-sgn}} & \text{otherwise.} \end{cases} \tag{2} $$

This can be generalized as follows:

$$ b^{j}_{k,\text{LIP}} \in \begin{cases} B_{k,\text{LIP-sgn}} & \text{if } \left(b^{j-1}_{k,\text{LIP}} \in B_{k,\text{LIP-sig}} \text{ and } b^{j-1}_{k,\text{LIP}} = 1\right), \\ B_{k,\text{LIP-sig}} & \text{otherwise,} \end{cases} \tag{3} $$

for $1 \le j < N_{k,\text{LIP}}$. From Eq. (3), it is evident that the cryptanalyst must correctly interpret all previous bits $b^{l}_{k,\text{LIP}}$, $0 \le l < j$, in order to determine whether $b^{j}_{k,\text{LIP}}$ is unencrypted. Hence, without the decryption key, not only do the encrypted bits remain confidential, but the locations of the unencrypted bits cannot be determined and are thus confidential. The situation is made even more difficult for the cryptanalyst for $1 \le k < K$ since the beginning location of $B_{k,\text{LIP}}$ within $B$ will also be unknown. Similar arguments can be made for $B_{k,\text{LIS}}$. Note that confidentiality depends on the number of encrypted bits being large enough to make an exhaustive search for all possible combinations computationally infeasible.

In summary, the proposed encryption scheme achieves confidentiality in two ways: (a) encryption of the most significant information of individual wavelet coefficients, and (b) making the correct state of the decoder very difficult to determine. This differs from the scheme in Ref. [2] in three important ways: (a) the bits $B_{k,\text{LIP-sig}}$ and $B_{k,\text{LIS-sig}}$ are encrypted regardless of which subband the particular coefficients being considered reside in, (b) computational overhead is reduced by not encrypting $B_{k,\text{LIS-Tsig}}$, and (c) the level of confidentiality achieved vs. computational overhead can be controlled via the parameter $K$. Property (a) of the proposed scheme means that no a priori decisions are made as to which particular coefficients should be encrypted; the scheme in Ref. [2] limits encryption to the coefficients in the top two levels of the trees. In the scheme proposed here, the decision is made by the C-SPIHT coder via the successive approximation quantization, hence encrypting the coefficients which represent the most significant features of the image regardless of which subbands they reside in.

A set of test images and their encrypted versions for K = 2 are shown in Fig. 4. The images were coded using C-SPIHT with a 5-level DWT using the CDF 9/7 biorthogonal wavelet filters. The images on the bottom are decoded without knowledge of the encryption key. Clearly, no features of the images are revealed and confidentiality is ensured. Table 1 lists the number of bits encrypted for each test image for K = 1, 2, ..., 6. With K = 2, an adequate level of confidentiality could be achieved with only an average 0.40% of the bits encrypted for the test images coded at 0.8 bpp.

Fig. 4. Original images (top) and encrypted images using K = 2 (bottom). Lena and Peppers are 512 × 512 in size, and Monarch and Parrots are 512 × 768 in size. The encrypted images are decoded without decrypting.

Table 1
The number of bits encrypted for the test images using different values of K. The percentage of total bits encrypted for the test images coded at 0.8 bpp is shown in brackets.

K    Lena             Monarch           Peppers          Parrots
1    256 (0.12%)      384 (0.10%)       404 (0.19%)      488 (0.12%)
2    990 (0.47%)      1221 (0.31%)      1130 (0.54%)     1138 (0.29%)
3    1718 (0.82%)     2810 (0.71%)      2012 (0.96%)     1925 (0.49%)
4    2898 (1.38%)     5805 (1.48%)      3713 (1.77%)     3370 (0.86%)
5    6061 (2.89%)     12878 (3.28%)     9130 (4.31%)     6679 (1.70%)
6    13527 (6.45%)    28008 (7.122%)    19968 (9.52%)    13978 (3.55%)

4. Summary

The proposed efficient encryption scheme uses a stream cipher to encrypt only the significance bits for individual coefficients encountered during the first K sorting passes of the C-SPIHT algorithm. Confidentiality of the image data is ensured since the most significant information is encrypted and the unencrypted information is difficult to locate. The user may control the confidentiality vs. processing overhead by choosing K at the time of encrypting/encoding. The principles behind the proposed scheme may be applied to other embedded image and video coders.
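The 0.40% average quoted for K = 2 can be reproduced from the bracketed percentages in Table 1. The short Python sketch below is an illustration only: the values are transcribed from the table as printed, and the names `PERCENT_ENCRYPTED` and `average_percent` are hypothetical.

```python
from typing import Dict, List

# Bracketed percentages from Table 1, listed for K = 1, 2, ..., 6.
PERCENT_ENCRYPTED: Dict[str, List[float]] = {
    "Lena":    [0.12, 0.47, 0.82, 1.38, 2.89, 6.45],
    "Monarch": [0.10, 0.31, 0.71, 1.48, 3.28, 7.122],
    "Peppers": [0.19, 0.54, 0.96, 1.77, 4.31, 9.52],
    "Parrots": [0.12, 0.29, 0.49, 0.86, 1.70, 3.55],
}


def average_percent(K: int) -> float:
    """Mean percentage of encrypted bits over the four test images for a given K."""
    values = [row[K - 1] for row in PERCENT_ENCRYPTED.values()]
    return sum(values) / len(values)


for K in range(1, 7):
    print(f"K = {K}: {average_percent(K):.2f}% of bits encrypted on average")
```

For K = 2 the four values are 0.47, 0.31, 0.54, and 0.29, whose mean is 0.4025, matching the reported 0.40%. The same loop shows how quickly the overhead grows as K is increased.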
Acknowledgment

The work of K. Martin is partially supported by a grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) under the Network for Effective Collaboration Technologies through Advanced Research (NECTAR) project.
References

[1] T. Lookabaugh, D.C. Sicker, Selective encryption for consumer applications, IEEE Commun. Mag. 42 (5) (2004) 124–129.
[2] H. Cheng, X. Li, Partial encryption of compressed images and videos, IEEE Trans. Signal Process. 48 (8) (2000) 2439–2451.
[3] A.A. Kassim, W.S. Lee, Embedded color image coding using SPIHT with partially linked spatial orientation trees, IEEE Trans. Circuits Syst. Video Technol. 13 (2) (2003) 203–206.
[4] S. Lian, J. Sun, Z. Wang, A secure 3D-SPIHT codec, in: Proceedings of European Signal Processing Conference, 2004, pp. 813–816.
[5] R. Norcen, A. Uhl, Encryption of wavelet-coded imagery using random permutations, in: Proceedings of IEEE International Conference on Image Processing, 2004, pp. 3431–3434.
About the Author: KARL MARTIN received the B.A.Sc. degree in Engineering Science (Electrical specialty) and the M.A.Sc. degree in Electrical Engineering from the University of Toronto, Canada, in 2001 and 2003, respectively. He is currently pursuing a Ph.D. in the Edward S. Rogers Sr. Department of Electrical and Computer Engineering at the University of Toronto. His research interests include multimedia security, multimedia processing, wavelet-based image coding, object-based coding, and CFA processing. Mr. Martin is a member of both the IEEE Signal Processing Society and Communications Society. Since 2003 he has held the position of Vice-Chair of the Signals and Applications Chapter, IEEE Toronto Section.

About the Author: RASTISLAV LUKAC received the M.S. (Ing.) and Ph.D. degrees in Telecommunications from the Technical University of Kosice, Slovak Republic, in 1998 and 2001, respectively. From February 2001 to August 2002 he was an Assistant Professor at the Department of Electronics and Multimedia Communications at the Technical University of Kosice. Since August 2002 he has been a Researcher at the Slovak Image Processing Center in Dobsina, Slovak Republic. From January 2003 to March 2003 he was a Postdoctoral Fellow at the Artificial Intelligence & Information Analysis Lab at the Aristotle University of Thessaloniki, Greece. Since May 2003 he has been a Postdoctoral Fellow with the Edward S. Rogers Sr. Department of Electrical and Computer Engineering at the University of Toronto in Toronto, Canada. His research interests include digital camera image processing, microarray image processing, multimedia security, and nonlinear filtering and analysis techniques for color image & video processing. Dr. Lukac is a Member of the IEEE Circuits and Systems, IEEE Consumer Electronics, and IEEE Signal Processing Societies. He serves as a Technical Reviewer for various scientific journals and he participates as a Member of numerous International Conference Committees. In 2003 he was awarded the NATO/NSERC Science Award.

About the Author: KONSTANTINOS N. PLATANIOTIS received the B. Engineering degree in Computer Engineering from the Department of Computer Engineering and Informatics, University of Patras, Patras, Greece, in 1988 and the M.S. and Ph.D. degrees in Electrical Engineering from the Florida Institute of Technology (Florida Tech), Melbourne, Florida, in 1992 and 1994, respectively. He was affiliated with the Computer Technology Institute (C.T.I.), Patras, Greece, from 1989 to 1991. From August 1997 to June 1999 he was an Assistant Professor with the School of Computer Science at Ryerson University. He is currently an Associate Professor at the Edward S. Rogers Sr. Department of Electrical and Computer Engineering, where he researches and teaches adaptive systems and multimedia signal processing. He co-authored, with A.N. Venetsanopoulos, a book on “Color Image Processing & Applications”, Springer Verlag, May 2000; he is a contributor to seven books; and he has published more than 200 papers in refereed journals and conference proceedings in the areas of multimedia signal processing, image processing, adaptive systems, communications systems and stochastic estimation. Dr. Plataniotis is a Senior Member of IEEE and a past member of the IEEE Technical Committee on Neural Networks for Signal Processing. He was the Technical Co-Chair of the Canadian Conference on Electrical and Computer Engineering (CCECE) 2001 and CCECE 2004. He is the Technical Program Co-Chair for the 2006 Intelligent Transportation Systems Conference and the 2006 International Conference on Multimedia and Expo (ICME 2006).