J. Vis. Commun. Image R. 24 (2013) 885–894
Contents lists available at SciVerse ScienceDirect
J. Vis. Commun. Image R. journal homepage: www.elsevier.com/locate/jvci
Image representation using block compressive sensing for compression applications Zhirong Gao a,c,1, Chengyi Xiong b,⇑, Lixin Ding c, Cheng Zhou b a
College of Computer Science, South-Central University for Nationalities, Wuhan 430074, China College of Electronic and Information Engineering, Hubei Key Lab of Intelligent Wireless Communication, South-Central University for Nationalities, Wuhan 430074, China c Computer School, Wuhan University, Wuhan 430074, China b
a r t i c l e
i n f o
Article history: Received 17 November 2012 Accepted 2 June 2013 Available online 11 June 2013 Keywords: Image representation Image compression Encrypted image Robust coding Discrete cosine transform Block compressive sensing Coefficient random permutation Adaptive sampling
a b s t r a c t The emerging compressive sensing (CS) theory has pointed us a promising way of developing novel efficient data compression techniques, although it is proposed with original intention to achieve dimensionreduced sampling for saving data sampling cost. However, the non-adaptive projection representation for the natural images by conventional CS (CCS) framework may lead to an inefficient compression performance when comparing to the classical image compression standards such as JPEG and JPEG 2000. In this paper, two simple methods are investigated for the block CS (BCS) with discrete cosine transform (DCT) based image representation for compression applications. One is called coefficient random permutation (CRP), and the other is termed adaptive sampling (AS). The CRP method can be effective in balancing the sparsity of sampled vectors in DCT domain of image, and then in improving the CS sampling efficiency. The AS is achieved by designing an adaptive measurement matrix used in CS based on the energy distribution characteristics of image in DCT domain, which has a good effect in enhancing the CS performance. Experimental results demonstrate that our proposed methods are efficacious in reducing the dimension of the BCS-based image representation and/or improving the recovered image quality. The proposed BCS based image representation scheme could be an efficient alternative for applications of encrypted image compression and/or robust image compression. Ó 2013 Elsevier Inc. All rights reserved.
1. Introduction Over the last two decades, great improvements have been made in image and video compression techniques driven by a growing demand for storage and transmission of visual information. By far, many image and video compression standards, such as JPEG, JPEG2000 and MPEG-4 AVC/H.264 etc., have been proposed. However, these mainstream compression coding techniques still may be not very efficiency in some special compression applications, for examples of robust image coding and encrypted image compression. Robust coding is very important in digital multimedia transmission over internet and wireless networks, where the image codec needs to have not only excellent compression performance to reduce data rate, but also high robustness performance to resist transmission error due to channel noise and/or packetloss. In addition, when data is transmitted over insecure bandwidth-limited channels, data compression and encryption are always necessary. Although the traditional scheme that the ⇑ Corresponding author. Fax: +86 2767842854. E-mail addresses:
[email protected] (Z. Gao),
[email protected]. edu.cn (C. Xiong),
[email protected] (L. Ding). 1 Fax: +86 2767842854. 1047-3203/$ - see front matter Ó 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jvcir.2013.06.006
encryption follows compression is suitable for most of the applications, there are some applications which require encryption to be carried out before compression, where an information owner and a network operator do not trust each other. In such a case, to protect his content, the information owner wants to encrypt his data before giving it to network operator. Because an encryption algorithm converts the data from comprehensible to incomprehensible structure, it makes the encrypted data difficult to efficiently compress using any of the classical compression algorithms, which relies on intelligence embedded in the data. The emerging compressive sensing or compressive sampling (CS) theory [1–4] has pointed us a promising way of developing novel efficient data compression techniques, although it is proposed with original intention to achieve dimension-reduced sampling for saving sampling cost. The CS is a new signal sampling theory telling us that we can exactly recover the original signals through few measurements less than Shannon sampling rate if the signal is sparse or compressible, which is different from Shannon sampling theory thinking that the sampling rate must be at least twice of signal’s highest frequency if we want to reconstruct a bandwidth limited signal without distortion. According to the CS theory, it is possible that we can achieve data sampling and compression at the same time. Since its introduction of CS theory, the CS technique
886
Z. Gao et al. / J. Vis. Commun. Image R. 24 (2013) 885–894
has already been widely used in many application areas such as signal processing, pattern recognition, communication and etc., besides signal sensing [5]. Although traditional CS combined with ordinary quantization and entropy coding has been shown not a good compression technique [6] in view of compression efficiency, research on CS based image/video coding has become a hot topic in recent years. Petrisor et al. [7] proposed a two description coding scheme based on a general frame synthesis operator by using CS recovery. Zhang et al. [8] proposed a multiple description image/video coding method by CS theory, which employed the CS theory into the popular discrete cosine transform (DCT) based image/video coding framework to generate different descriptions directly in the frequency domain to enhance the error resilient property for video transmission. Huang et al. [9] and Venkatraman et al. [10] proposed CS based approaches to object based video coding, which employed CS into the residual object error of a video frame. Sanei [11] proposed an interactive and adaptive joint source-channel coding for progressive transmission of medical images, in which a hierarchical alternative least squares based CS is used for adaptive selection of wavelet coefficients. Wen et al. [12] proposed a CS image compression algorithm using quantized DCT coefficients and noiselet transform for compression of images that characterized by an abundance of high frequency information, significant noise and/or missing data. Prades et al. [13] proposed a distributed video coding scheme using compressive sampling, where the non-key frame was encoded and recovered by CS. Kumar and Makur [14] proposed a lossy compression scheme of encrypted image by CS technique, where the encryption was applied on the image in the spatial domain followed by the CS based compression. Deng et al. [15,16] proposed a robust image coding method with a number of descriptions based on CS for high packet loss transmission, where the discrete wavelet transform (DWT) was applied for sparse representation, and the resulting low-frequency subband and high-frequency subbands were, respectively encoded by different CS codecs. In conventional CS framework, the original signal is sub-sampled by projecting it onto a measurement bases in encoder side, and then recovered in decoder side by solving a linear programming problem [3,4]. This conventional CS framework adopts nonadaptive random projection to obtain the dimensional-reduced representation of original input image to achieve compression. The major challenges existed in CS for practical applications include how to efficiently reduce measurement ratio (MR) with good recovered signal quality and decrease the implementation complexity for the CS recovery, where the MR is defined as the ratio of the measurement dimension to the total dimension of signal measured. To achieve better reconstructed image quality with as small MR as possible in CS framework, many optimal reconstruction algorithms have been proposed in previous literature [17–26], while only few optimal sampling schemes [27–30] based on the image characteristics that target to improve CS encoding efficiency have been presented. A perceptual based weighting measurement [27] has been introduced to extract more low-frequency information in CS encoder side, which is based on the knowledge that human eyes are more sensitive to low-frequency information of image than to high-frequency information, where the JPEG quantization Table is used. A reweighted sampling based on the statistical characteristics of image has been proposed in [28], in which the sampled coefficients are modified by the corresponding weighting values before performing CS, which can get much more performance gains than the reweighted reconstruction method [19]. Comparing to frame-based CS, block-based CS (BCS) of image [25,26] is an efficient alternative for significantly reducing computational complexity, but allocating the same number of measurement dimension to any image block with different spatial
characteristic will be no doubtful deficiency in BCS, because different spatial characteristics may imply large difference in sparsity, and blocks with weak sparsity will be reconstructed worse than those with strong sparsity. To address this problem, a saliencybased BCS (SBCS) method has been proposed in [29], in which the number of measurement dimension allocating to different image block is proportional to its saliency information. However, there exists a drawback in this SBCS that it is not easy to found an efficient criterion for reasonably assigning the numbers of measurement dimension to the corresponding sampled image blocks. In this paper, the BCS with DCT based image representation for compression applications is investigated, where two efficient methods, one called coefficient random permutation (CRP) in DCT domain originally presented in [30], and the other termed adaptive sampling (AS), are proposed to enhance compression performance. The proposed BCS-based image representation scheme could be an efficient alternative for encrypted compression applications (especially for that encryption followed by compression), and/or robust image compression applications. The CRP can be applied to achieve both encrypting image in transform domain prior to further compression and balancing the sparsity of the measured vectors for increasing the BCS sampling efficiency at the same time. The AS can be used to increase encoding efficiency by distinctively extracting the information of different image components according to their importance to image reconstruction quality, which is implemented by adaptively designing the measurement matrix used in CS based on the energy distribution characteristics of image in DCT domain. Experimental results demonstrate that the proposed methods are efficient in improving the BCS based image representation ability for increasing compression performance. The rest of the paper is organized as follows. In Section 2, a brief overview for CS is provided for convenience of the following discussion. The CRP technique for BCS based image representation is described in detail in Section 3, followed by the AS technique as proposed in Section 4. Experimental results and discussions are given in Section 5, and a brief conclusion for this paper is provided in Section 6. 2. Overview of compressive sensing The compressive sensing (CS) theory has been introduced in public for several years [1,2]. Despite a large number of researches in this related field have been reported, and now not a few overview papers [3,4] on the CS theory can be searched, a brief induction for CS is provided in this Section for convenience of the following discussion. The CS framework includes sampling process in encoder side and reconstruction process in decoder side. The sampling process is basically to take non-adaptive linear projection, which is later called conventional compressive sensing (CCS) and described as follows:
y ¼ Ux
ð1Þ
where x denotes a real-value, discrete time signal with finite dimension N, i.e. x 2 RN, y represents the obtained measurement values vector with reduced dimension n (n << N), and U is a sampling matrix (also called measurement matrix) with dimension n N. CS requires that the sampled signals are sparse in time/space domain, or can be sparsely represented in a certain transform domain w (e.g. DCT or wavelet), i.e. x can be represented as x = ws, and here the variable s denotes the sparse representation of signal x. So the above sampling process can also be described in a more general form [3,4] as follows:
y ¼ Uws ¼ H s
ð2Þ
where H = Uw is called a n N sensing matrix. Although recovering x from y is an ill-posed problem in condition of n << N, according to
887
Z. Gao et al. / J. Vis. Commun. Image R. 24 (2013) 885–894
compressive sensing theory, the original signal can be exactly reconstructed by solving a combinational optimization problem expressed as (3) as long as x is sparse in some domain.
min : k^skl0 ;
s:t: : y ¼ H^s
ð3Þ
where k kl0 denotes l0-norm, and ^s represents the estimation of the sampled sparse vector s. However, searching a solution with this constraint in (3) is NP-hard, which is commonly replaced by solving the linear programming problem as
min : k^skl1 ;
s:t: : y ¼ H^s
ð4Þ
It is proved that if U and w are incoherent and their product H ¼ Uw satisfies the Restricted Isometry Property (RIP) [2,3], then the K-sparse (i.e. K non-zero values) signal x can be well reconstructed by (4) from the n P cK logðN=KÞ measurements, where c is a small constant. One of the suitable ways to measure the coherence between U and w (called mutual coherence) is to look at the columns of H ¼ Uw . The mutual coherence can be defined as the largest absolute and normalized inner product between different columns in H as
lfHg ¼ max16i;j6N and i–j
hTi hj khi k khj k
ð5Þ
The mutual coherence provides a measure of the worst similarity between the columns of the sensing matrix H, a value that exposes the matrix’s vulnerability, as such two closely related columns may confuse any pursuit technique. For each integer K = 1,2,. . ., defining the isometry constant dK of a matrix H as the smallest number such that
ð1 dK Þksk2l2 ¼ kHsk2l2 ¼ ð1 þ dK Þksk2l2
ð6Þ
holds for all K-sparse vectors s, we will loosely say that the sensing matrix H obeys the RIP of order K if dK is not too close to one. When this property holds, H approximately preserves the Euclidean length of K-sparse signals, which in turn implies that K-sparse vectors cannot be in the null space of H. An equivalent description of the RIP is to say that all subsets of K columns taken from H are in fact nearly orthogonal. The commonly used sampling matrix U can be i.i.d. Gaussian matrix with entries being outcomes of i.i.d. Gaussian variables [3,4], while the structure measurement matrix is more simpler for hardware implementation [31,32] at cost of the CS performance. It is universal in the sense that the product of an i.i.d Gaussian matrix and the transform matrix H ¼ Uw is also i.i.d Gaussian thus will have the RIP with high probability regardless the choice of transform basis w. In [31], Donoho et al. reported several families of random measurement ensembles which behave equivalently, including random spherical, random signs, partial Fourier and partial Hadamard. Among the reconstruction algorithms, Basis Pursuit (BP) is the first one proposed to solve this problem [1], and Orthogonal Matching Pursuit (OMP) [17] is proposed for fast reconstruction, etc.
smooth region should have stronger sparsity than those located at rich edge or texture region. The method of bits random permutations has been used in channel coding in communication system for design of interleaver, which can maximally scatter the burst error generated in the process of data channel transmission, thus plays a very important role in increasing the reliability of data transmission. Motivated by the above fact, the CRP in DCT domain of image is exploited to equalize the sparsity of the sampled vectors in CS encoding stage for enhancing measurement efficiency of BCS of image in this Section, which is followed by a CRP based BCS scheme. The proposed CRP can be at the same time used to achieve encrypting image in transform domain at information owner side. In traditional BCS, the CS sampling does not exploit the intercoefficient correlation within a sampled signal vector, so coefficient permutations across signal vectors will not adversely affect encoding performance. On the other hand, because the random permutations make the distribution of coefficients become randomization and homogenization, such random permutations will make sparsity of all the sampled vectors nearly identical, which improves reconstruction performance since it is likely that before CRP some less sparse vectors will not be reconstructed well. The proposed CRP in DCT domain of image is described in detail as follows. The input image is first divided into B = ðM=mÞ ðN=nÞ image blocks with the same size of m n, where M (N) denotes the number of row (column) of the original image, and is integer multiple of m (n). All image blocks are then sparsified by block 2-D DCT with block size of m n, obtaining the DCT coefficient arrays of all B image blocks, respectively denoted as ai, i = 1,2,...,B. The DCT coefficients in all arrays are further grouped to form m n coefficient vectors, each corresponding to different frequency compon o nent of image, denoted as bj ¼ aji ; i ¼ 1; 2; :::; B , j = 1,2,. . .,m n, where aji represents the jth coefficient in array ai. The random permutation operations for different frequency component vector are ^j ¼ Perm½b , carried out independently, modeled as b j
j = 1,2,. . .,m n, and Perm[] represents random permutation operator. Finally, the block coefficients rebuilding operation (BCRO) is performed to form the coefficient vectors to be measured in the following sampling stage of CS. Every rebuilt coefficient vector to be sampled is still composed of m n coefficients, each from different frequency component vector after random permutations, repren o ^i ; j ¼ 1; 2; :::; m n , i = 1,2,. . .,B, where b ^i is the ith ^i ¼ b sented as a j
j
^j . The pseudo-code for the CRP algorithm is coefficient in vector b shown in Fig. 1. The overall architecture of BCS scheme with CRP in DCT domain is shown in Fig. 2. It contains two main modules: a CRP-based BCS encoder module and a CRP-based BCS decoder module. The BCS encoder module receives an input image, and then in sequence performs block-based 2-D DCT (B2DCT), CRP and Sampling as (2).
3. CRP based block compressive sensing As mentioned above, the theory foundation of CS is that the signal sampled is sparse or compressible, and the required minimal number of measurement dimension for perfect recovery is determined by the sparsity of the sampled signal. When an image is represented by a BCS scheme, it will be inefficient to assign the same number of measurement dimension to each sampled vector corresponding to the different image block, because the image block with different spatial characteristic has significantly different sparsity from each other. In general, the image blocks located at
Fig. 1. The proposed CRP algorithm.
888
Z. Gao et al. / J. Vis. Commun. Image R. 24 (2013) 885–894
Original image B2DCT
Reconstruction
CRP
CIRP
Sampling
B2IDCT
Measurement data
Reconstructed image
BCS encoder
BCS decoder
Fig. 2. The overall architecture of BCS scheme with CRP.
The BCS encoder module generates the measurement data for all sampled vectors produced by CRP unit. The CRP-based BCS decoder module receives the measurement data transmitted by the BCS encoder module, and then in sequence performs reconstruction as (4) for all sampled vectors, coefficients inverse random permutations (CIRP) and block-based 2-D Inverse DCT (B2IDCT). The CRP-based BCS decoder module generates the recovered image. CIRP is the inverse operation of CRP, which is used to obtain the reconstructed DCT coefficients for each original image block from the reconstructed vector by the BCS Reconstruction unit, because the reconstructed vectors produced by the BCS Reconstruction unit are scrambling representation in DCT domain of image blocks. Here the adopted CRP is content-independent, so only a seed for generating a pseudo random number is required to transmit to the decoder side for CIRP. Although the CRP is image content independent, the CRP can reassign randomly (pseudo randomly) the position of all coefficients, which has the ability to make them distribute approximately identical to guarantee that the rebuilt sampled coefficient vectors by the CRP have nearly identical sparsity. To investigate the stability and efficiency of CRP in equalizing the sparsity of the sampled vectors, extensive experiments have been carried out, and the experimental results have been provided in later.
4. AS based block compressive sensing In conventional CS scheme, a fixed measurement matrix is used to achieve non-adaptively sampling signal. This scheme is simple but not high efficiency, because it identically samples all signals with different feature, and non-distinctively extracts all information components of a signal that maybe have different importance to recovery signal quality. In this Section, a novel method of developing adaptive measurement matrix is proposed to achieve adaptive sampling (AS) in our BCS in DCT domain. As what we have known, image signals are bandwidth limited, and their energy are mostly distributed in the low-frequency components. In addition, the human eyes have certain masking effect on some parts of the frequency spectrum of image signals, which like low-pass filters that makes them more sensitive to low frequency components than high frequency ones. So, the low frequency components occupying large part of energy of image signals are more important to image quality than the high frequency components occupying less part of energy. Considering these facts, we propose to develop an adaptive measurement matrix in BCS framework based on the energy distribution characteristic of the sampled image signal in DCT domain to enhance the performance of BCS. The key idea is to adaptively extract more
information of the important frequency components than that of the relatively non-important frequency components in sampling stage to reduce recovery error. It is implemented by adaptively scaling the coefficients of measurement matrix according to the sampled image’s characteristic, i.e. unequally measuring the different frequency components based on their importance to image quality. The related details for the BCS with AS are described as follows. Let ai denotes the coefficients vector of the ith image block in DCT domain, and aji is the jth coefficient in vector ai that represents the jth frequency component belonging to this image block, then the energy distributed in the jth frequency component of full-size image could be defined as
Ej ¼
B 2 X j ai
ð7Þ
i¼1
Accordingly, we define a weighting vector based on the energy of every frequency component as
x ¼ ½E1 ; E2 ; . . . ; Emn
ð8aÞ
and a weighting matrix as
X ¼ ½x; x; . . . ; x
ð8bÞ
Then a modified measurement matrix adaptive to the sampled image is designed as
^ ¼ orth ðU: XÞT U
T
ð9Þ
where U is the measurement matrix designed according to the conventional CS scheme, superscript ‘T’ denotes transpose operator, and orth() returns an re-orthogonalized measurement row bases (MRBs). From (9), the adaptive measurement matrix is designed by entry-wise multiplication of original measurement matrix with weighting matrix followed by re-orthogonalization of row bases, where the weighting coefficient is determined by the energy of each frequency component. Although the proposed AS scheme at first glance looks like the reweighted sampling scheme introduced in [28], it is considerably different from the latter as follows. The first, the reweighted method introduced in [28] weighs the input vector sampled, while the AS weighs the measurement matrix, and the re-orthogonalizing the measurement row bases is applied to preserve the RIP of the new measurement matrix. The second, the reweighted method introduced in [28] uses the linear combination of the absolute value of its mean and the square root of its variance of each frequency component to determine weighting coefficient, while the AS uses the energy of each frequency component (which should be equivalent to square of mean plus variance). The proposed AS can enhance the BCS performance more efficiently than the method introduced in [28] as shown in the later Section. Why the proposed AS scheme can efficiently enhance the CS performance can be interpreted in a geometric way. The essence of CS representation of signal is a projection transform representation in the subspace spanned by a set of MRBs forming the measurement matrix. A random measurement matrix forms a projection subspace orienting to a random direction, a CS measurement sequence y for a signal x is a projection of the original signal x from the high dimensional space onto the subspace, and certainly the original signal x can be approximated by a linear combination of all the MRBs. If the selected MRBs are orthogonal from each other, they will form an orthogonal subspace. And then, the original signal x can be optimally linearly approximated by all the orthogonal MRBs, and the optimal weighting coefficients are the measurement values y. Furthermore, if the MRBs forming random measurement matrix are both orthogonal from each other and
Z. Gao et al. / J. Vis. Commun. Image R. 24 (2013) 885–894
completed, then the original signal x can be reconstructed perfectly by the linear combination of the complete orthogonal MRBs, and the optimal weighting coefficients are the measurement values. In other words, based on the signals’ orthogonal decomposition theory, when the projection transform bases are orthogonal from each other, the energy in the transform domain of signal can be best approximation to that of the original signal. And when a set of orthogonal bases forming a space is complete, the energy in the transform domain of signal will be equal to that of the original signal, which is called law of energy conservation. So, the orthogonalization of MRBs is very important to guarantee the RIP of the measurement matrix, and re-orthogonalizing the MRBs of the measurement matrix after weighting is necessary, because the orthogonalization of MRBs can make the energy of the signal in the transform domain be approximation of that of the original signal as much as possible. In CCS, the direction of the projection subspace is random, so all the components of the sampled signal will typically contribute the same amount of energy with the same probability. While adaptively reweighted the measurement matrix could rotate the subspace direction to approach that of the sampled vector, where the signal components have large magnitudes. So, more energy of large magnitude coefficients will be captured, and thus can be recovered more accurately than the small magnitude ones. Weighting the sampled coefficients [28] before CS is equivalent to weighting the measurement matrix, which makes the subspace approach the directions where the signal components have large magnitudes, and in total improves encoding efficiency. However, this simply weighting the measurement matrix will destroy the orthogonality of measurement bases and cause measurement redundancy because the correlation among the MRBs is increased, thus make the energy preservation property of the measurement dictionary bad, and the RIP of the measurement matrix bad as shown by the below experiment results. On the other hand, by using the reweighted sampling, the reconstruction distortion for the frequency component with large magnitude will be reduced while that for the frequency component with small magnitude may be increased. If the weighting coefficients are defined suitably, the total amount of distortion of all frequency components will be significantly reduced and the reconstructed image quality will be improved obviously. On the contrary, if the weighting coefficients are defined inappropriately, the total amount of distortion of all frequency components may not be reduced efficiently, and even it will be increased and the reconstructed image quality is degraded. Optimal design of the weighting coefficients becomes a key problem for efficiently implementing reweighted sampling. The weighting coefficients determined as [28] by the linear combination of the absolute value of its mean (first order moment) and the square root of its variance (second order moment) of each frequency component, which is essentially equivalent to that determined by the square root of the energy (second order moment) of each frequency component (it also can be validated by the simulation experimental results as list in the later Section), can make the subspace nearly first-order approach the directions where the signal components have large magnitudes, and hence successfully improve CS performance. While weighting the measurement matrix by the energy of each frequency component instead of its square root can force the projection subspace quadratic approach the directions where the signal components have large magnitude, which makes the subspace approach the directions where the signal components have large energy more closely, hence more information of the frequency component with large energy will be captured, this frequency component will be reconstructed more accurately. This energy based weighting scheme by combining re-orthogonalization has a better tradeoff in amplifying large coefficient and shrinking small
889
coefficient than that based on the reweighted method [28], which achieves to further improve the reconstructed image quality. Certainly, how to get a perfect weighting matrix based on image characteristics is still worth to be further exploited. The proposed BCS scheme with AS is described in details as follows. The input image is first divided into B = ðM=mÞ ðN=nÞ image blocks with size of m n, as introduced in Section 3, and each image block is sparsified by performing 2-D DCT. The weighting matrix X is then obtained by (7) and (8), and the adaptive ^ is designed as (9). Finally, each image block measurement matrix U is represented in DCT domain as (10a), (10b) based on whether CRP is adopted or not at the same time:
^a ^i yi ¼ U
ð10aÞ
^ ai yi ¼ U
ð10bÞ
In order to achieve CS reconstruction with AS, the weighting coefficients vector x should be transmitted to the decoder side, which can be efficiently compressed by conventional entropy coding algorithm before being transmitted so that only fewer additional memory resources will be required. 5. Experimental results To evaluate the efficiency of our proposed methods, the comparison experiments and results are given in this Section. We have carried out several sets of experiments to validate the efficiency of our methods: (a) the proposed CRP can efficiently balance the sparsity of coefficients vectors to be sampled; (b) the proposed CRP and/or AS can efficiently enhance the CS representation performance in both reducing measurement ratio and/or increasing PSNR and visual quality; (c) re-orthogonalization of the weighted measurement matrix is efficient to enhance the measurement efficiency, and the weighted sampling by using energy of each frequency component is more efficient than that by using the linear combination of the absolute value of its mean and the square root of its variance of each frequency component;(d) performance comparison with other similar works. We first select several 256 256 gray-level .bmp compression test images for our experiments, and set block size as 16 16. The i.i.d Gaussian matrix is used as the conventional CS measurement matrix U, and OMP [17] algorithm for CS reconstruction, which is fast but not high efficiency. Here only parts of experimental results are given for space saving. Fig. 3 illustrates the comparison of distribution of coefficients magnitude decay curves for the sampled vectors in DCT domain, where 256 256 gray-level lenna.bmp is used for test. The magnitude values are expressed in logarithm format as 20 log 10() for clearly shown. Fig. 3(a) illustrates the distribution in the case of not using CRP technique, and Fig. 3(b) the distribution in the case of using CRP technique. Experimental results indicate that the CRP is efficient to make the distribution of coefficients magnitude decay curves of all the sampled vectors identical, which could considerately enhance the sampling efficiency when equally allocating the measurement dimension to different sampled vectors. To verify the stability and efficiency of CRP in equalizing the sparsity of the sampled vectors, 10 100-tries with different permutation matrix generated randomly have been done, and all the distribution graphics of magnitude decay curves obtained from which are plotted overlapped in one figure as shown in Fig. 3(c), from which we can see the answer is obviously positive. We further have compared the BCS performance of these methods: (1) conventional compressive sensing (CCS) representation without using other optimization; (2) compressive sensing representation with the reweighted sampling (CS + RWS) [28]; (3) the
890
Z. Gao et al. / J. Vis. Commun. Image R. 24 (2013) 885–894 magnitude decay curve with permutation 70
60
60
50
50 magnitude (in dB)
magnitude (in dB)
magnitude decay curve without permutation 70
40 30
40 30
20
20
10
10 0
0 0
50
100 150 block size
200
250
0
50
100
150
200
250
block size
(b)
(a)
(c) Fig. 3. Comparison of distribution of coefficients magnitude decay curves for the sampled vectors. (a) The distribution of magnitude decay curves without using CRP, (b) the distribution of magnitude decay curves with using CRP, (c) the distribution of magnitude decay curves with using CRP for 100 tries.
proposed compressive sensing representation with coefficients random permutations (CS + CRP); (4) the proposed compressive sensing representation with adaptively sampling (CS + AS); and (5) the proposed compressive sensing representation with combining CRP and AS (CS + CRP + AS). The PSNR performance comparisons are tabulated in Table 1 at several measurement ratios (MR) ranging from 0.2 to 0.5, and the visual quality comparisons are illustrated in Fig. 4, where the 256 256 gray-level lenna.bmp is still used for test. From the comparison results, it can be drawn that the CRP method can efficiently enhance BCS representation performance, and the performance gain obtained nonlinearly increases as the MR increases; the proposed AS method based on energy distribution in DCT domain is more efficient than the RWS method [28] based on the first-order and second-order statistics characters in DCT domain in enhancing
Table 1 PSNR (in dB) performance comparisons for several optimal sampling schemes. MR Method
CCS CS + RWS [28] CS + CRP CS + AS CS + CRP + AS
0.2
0.3
0.4
0.5
20.7106 25.6006 22.5514 26.5400 27.9764
23.2085 27.1063 25.4565 28.3899 30.1709
25.0567 28.1842 28.0055 29.6009 31.7731
27.0152 29.1852 30.7128 30.7668 33.4267
BCS performance. The proposed CS representation with combining CRP and AS techniques can obtain performance gains best. To validate the efficiency of the proposed methods, we also respectively carry out the following comparison experiments: (1) comparing the original reweighted sampling of [28] (RWS [28]) to the reweighted sampling with weighting coefficients determined by the square root of energy of each frequency component (RWS + SRE) (i.e. (8a) is replaced by x = sqrt[E1,E2,...,Em n]), and reweighted sampling with re-orthogonalizing measurement matrix (RWS [28]+RO, and RWS + SRE + RO); (2) comparing our methods (CS + AS, and CS + CRP + AS) with other similar works including [14], [26] and [28]. Here several 512 512 pgm gray images are used in our tests, which are illustrated in Fig. 5. The block size is set as 16 16, the i.i.d Gaussian matrix is used as the conventional CS measurement matrix U, and OMP [17] and NESTA [21] algorithm for CS reconstruction are used. The PSNR performance comparison results are tabulated in Table 2 and Table 3 at several measurement ratios (MR) ranging from 0.1 to 0.5. Table 2 lists the experimental results for comparing RWS [28] to RWS + RO, RWS + SRE, and RWS + SRE + RO where OMP reconstruction algorithm is used, and Table 3 lists the experimental results for comparing our proposed methods to other similar works including [14], [26] and [28] where NESTA reconstruction algorithm is used. The experimental results for [14] are obtained by replacing the Basis Pursuit recovery algorithm adopted
891
Z. Gao et al. / J. Vis. Commun. Image R. 24 (2013) 885–894
Fig. 4. Visual quality comparisons for lenna.bmp (measurement ratio equals to 0.4, and block size is 16 16). (a) Original image, (b) recovered by CCS, (c) recovered by CS + RWS, (d) recovered by CS + CRP, (e) recovered by CS + AS, (f) recovered by CS + CRP + AS.
(a) lenna
(b) barbara
(c) goldhill
(d) boat
(e) peppers
Fig. 5. Test images used in our experiments.
(f) mandrill
(g) zelda
892
Z. Gao et al. / J. Vis. Commun. Image R. 24 (2013) 885–894
Table 2 PSNR (in dB) performance comparisons for the affection of re-orthogonalization. MR Lenna
Barbara
Goldhill
Boat
Pepper
Mandrill
Zelda
RWS [28] RWS + SRE RWS [28] + RO RWS + SRE + RO RWS [28] RWS + SRE RWS [28] + RO RWS + SRE + RO RWS [28] RWS + SRE RWS [28] + RO RWS + SRE + RO RWS [28] RWS + SRE RWS [28] + RO RWS + SRE + RO RWS [28] RWS + SRE RWS [28] + RO RWS + SRE + RO RWS [28] RWS + SRE RWS [28] + RO RWS + SRE + RO RWS [28] RWS + SRE RWS [28] + RO RWS + SRE + RO
0.1
0.2
0.3
0.4
0.5
26.7461 26.7532 27.4266 27.4537 22.9687 22.9895 23.2044 23.2206 26.3884 26.3990 26.8856 26.9044 24.7151 24.7170 25.4635 25.4500 25.9456 25.9897 26.9271 26.9372 20.0824 20.0442 19.8948 19.8893 30.7375 30.7683 31.5236 31.5251
29.4068 29.4210 30.3163 30.3240 25.0254 25.0303 25.6109 25.6228 28.4547 28.4272 29.0213 29.0019 27.1931 27.2202 28.1942 28.2156 29.0629 29.0138 30.0465 30.0334 21.2653 21.2521 21.2016 21.1751 33.5478 33.5159 34.4345 34.4241
31.3464 31.3304 32.4222 32.4237 26.6924 26.7073 27.7599 27.7658 29.8529 29.8430 30.4830 30.4733 29.0315 29.0427 30.3268 30.3247 31.0388 31.0592 31.9728 31.9678 22.2545 22.2675 22.3303 22.3070 35.4171 35.4426 36.4454 36.4315
32.6084 32.6184 33.7770 33.7779 28.0989 28.1254 29.7696 29.7888 30.7933 30.7950 31.5235 31.5457 30.3630 30.3752 31.8678 31.8774 32.1753 32.1896 33.0949 33.0811 23.0399 23.0546 23.2713 23.2570 36.6363 36.6548 37.7133 37.7224
33.7801 33.7861 35.0284 35.0278 29.5207 29.5544 31.5197 31.5328 31.7763 31.8090 32.6812 32.6844 31.8076 31.7809 33.4085 33.4094 33.1793 33.2445 34.0224 34.0300 23.8967 23.9029 24.3622 24.3862 37.8869 37.8956 38.9308 38.9405
Table 3 PSNR (in dB) performance comparisons of several similar works. MR Lenna
Barbara
Goldhill
Boat
Pepper
Mandrill
Zelda
Kumar’s [14] RWS [28] + RO Mun’s + DDWT [26] Proposed AS Proposed CRP + AS Kumar’s [14] RWS [28] + RO Mun’s + DDWT [26] Proposed AS Proposed CRP + AS Kumar’s [14] RWS [28] + RO Mun’s + DDWT [26] Proposed AS Proposed CRP + AS Kumar’s [14] RWS [28] + RO Mun’s + DDWT [26] Proposed AS Proposed CRP + AS Kumar’s [14] RWS [28] + RO Mun’s + DDWT [26] Proposed AS Proposed CRP + AS Kumar’s [14] RWS [28] + RO Mun’s + DDWT [26] Proposed AS Proposed CRP + AS Kumar’s [14] RWS [28] + RO Mun’s + DDWT [26] Proposed AS Proposed CRP + AS
0.1
0.2
0.3
0.4
0.5
20.9449 29.3319 27.8925 30.3642 31.4374 18.9614 24.0166 22.6668 24.4255 25.1659 22.3567 28.1355 26.9712 28.9880 29.2958 20.6701 26.9488 25.5759 27.8925 28.6605 20.2674 29.1194 28.585 29.9277 30.6586 17.6292 20.6412 20.7532 21.2555 21.5076 24.8490 33.6270 32.0401 34.4698 35.3401
25.5516 32.6233 30.8921 33.6259 34.7328 22.8641 26.8551 23.9335 27.1651 28.4126 25.4141 30.3325 28.7911 31.1807 31.6374 23.8130 30.1983 28.293 31.2151 32.2614 25.2703 32.0310 31.6131 32.8315 33.4685 19.5063 22.1743 21.871 22.9414 23.2706 29.7030 36.6081 34.7534 37.6072 38.4256
28.5599 34.9488 33.1207 35.9803 36.9860 25.4603 29.6445 25.389 29.9905 31.5860 27.4289 32.0733 30.2972 33.0500 33.5058 26.5177 33.1698 30.3551 34.3370 35.3267 28.2057 33.7339 33.4837 34.5726 35.0773 20.8098 23.6259 22.9088 24.5548 25.0842 32.6180 38.8354 36.5706 39.9241 40.3475
30.6952 36.5222 34.8477 37.6565 38.6389 27.5011 32.2459 26.883 33.0302 34.3817 29.1129 33.4447 31.6221 34.5399 35.2228 28.6435 35.5280 32.0424 36.9672 38.0095 30.3093 34.8937 34.9349 35.6976 36.2933 21.9859 24.8926 23.9626 25.9718 26.8064 34.7929 40.4292 37.9766 41.5520 41.7548
32.8962 38.1220 36.4943 39.2998 40.0898 29.7092 34.9044 28.5789 36.1403 37.3357 30.8299 35.0074 32.9345 36.1523 36.8781 30.9570 37.9864 33.6668 39.5585 40.4928 32.3908 36.0467 36.2580 36.8767 37.5298 23.3562 26.4518 25.0873 27.6428 28.5930 36.7653 41.8620 39.3300 43.0214 43.1772
in [14] by NESTA algorithm used here and without using quantization for fairness of comparison. The experimental results for Mun’s + DDWT method (here we only list this method for
comparison for space limitation because it has almost best performance among all the methods introduced in [26]) are obtained by their open software BCS-SPL-1.5-1 from http://
893
Z. Gao et al. / J. Vis. Commun. Image R. 24 (2013) 885–894 Table 4 Energy ratios of between the measurement data and the original image. MR Lenna
Baboon
Pepper
CCS RWS [28] RWS [28] + RO Our AS CCS RWS [28] RWS [28] + RO Our AS CCS RWS [28] RWS [28] + RO Our AS
0.1
0.2
0.3
0.4
0.5
0.0916 46065 0.9795 0.9875 0.0914 67733 0.9600 0.9652 0.0914 69815 0.9881 0.9931
0.2110 28369 0.9881 0.9932 0.2111 41745 0.9672 0.9719 0.2103 43017 0.9932 0.9962
0.2664 19005 0.9925 0.9958 0.2670 27964 0.9733 0.9772 0.2668 28820 0.9956 0.9976
0.3910 14638 0.9947 0.9970 0.3929 21538 0.9782 0.9813 0.3936 22195 0.9969 0.9982
0.4716 11801 0.9963 0.9979 0.4704 17356 0.9826 0.9851 0.4683 17887 0.9978 0.9988
www.ece.msstate.edu/~fowler/BCSSPL, where the block size of 16 16 is chosen. From the comparison results, we can see that (1) reweighted sampling of [28] is nearly equivalent to the reweighted sampling with weighting coefficients determined by the square root of energy of each frequency component; (2) re-orthogonalizing the reweighted measurement matrix is meaningful to enhance the CS performance; (3) the proposed AS that combines determining weighting coefficients by the energy of each frequency component with re-orthogonalizing the reweighted measurement matrix can get more measuring performance gains than RWS [28], i.e. considerably reduce measurement ratio and/or improving the reconstructed image quality; and (4) The proposed CRP + AS can get much more CS performance gains than other similar works. In addition, to reveal the necessity and efficiency of re-orthogonalizing the measurement row bases for preserving the RIP of the measurement matrix, the energy ratios of between the measurement data and the original image by different measurement matrix are compared, and the results with 256 256 test images lenna.bmp, baboon.bmp and pepper.bmp are listed in Table 4, where the size of block is set as 16 16, and the MR ranges from 0.1 to 0.5. From Table 5, we can see that the energy ratios by our AS are always more close to 1 than those by others, which indicate that our proposed AS with re-orthogonalizing the measurement row bases has better energy preservation property, and thus could be efficient to guarantee RIP of measurement matrix and improve the CS performance. Meanwhile, we can see from Table 4 that the energy ratios by the reweighted measurement of [28] are much greater than 1, which indicate that there exist a large number of measurement redundancies by the method in [28], and the RIP of its equivalent measurement matrix could be not very well. 6. Conclusions Image representation using BCS in DCT domain for compression applications (especially suited for encrypted image compression) is investigated in this paper. Two efficient techniques used in the stage of encoding for enhancing BCS performance are exploited. The proposed CRP not only provides an efficient method to implement image encryption before compression, but also can efficiently equalize the sparsity of the sampled coefficients vectors for improving CS encoding efficiency. Some less sparse blocks will not be reconstructed well without CRP when assigning the same numbers of measurement dimension to all blocks, while the random permutations across blocks does not adversely affect encoding of CS that does not exploit inter-coefficient correlation within a block, and such permutations make all vectors to be measured nearly equally sparse, therefore CRP can enhance measurement efficiency. The proposed AS can achieve extracting distinctively
different frequency components based on their importance to image recovery quality by adaptively scaling the coefficients of measurement matrix, which is obviously efficient in enhancing CS encoding performance and/or improving the reconstructed image quality. The fact that the proposed AS can get more performance gains indicates that the reweighted sampling based on the energies of frequency components can acquire important information of image components more efficiently, and re-orthogonalizing the weighting measurement matrix is necessary to guarantee the RIP of the measurement matrix and effective to enhance measurement efficiency. Our proposed scheme could be an efficient alternative for encrypted image compression. Comprehensive investigations on efficiently alleviating blocking artifacts due to block-based sampling and noises generated by CRP, as well as high efficient robust image compression based on this image representation scheme, is our work to be deeply carried out in future.
Acknowledgments The authors would like to thank Editor and the reviewers for their helpful comments that lead to an essential improvement of the manuscript. This work was supported by the Fundamental Research Funds for the Central Universities (granted No. CZY12006), Natural Science Foundations of China (granted No. 60972081) and Hubei Natural Science Foundations of China (granted No. 2009CDA139).
References [1] D.L. Donoho, Compressed sensing, IEEE Trans. Inform. Theory 52 (4) (2006) 1289–1306. [2] E.J. Candes, T. Tao, Near-optimal signal recovery from random projections: universal encoding strategies?, IEEE Trans Inform. Theory 52 (12) (2006) 5406–5425. [3] R.G. Baraniuk, Compressive sensing [lecture notes], IEEE Signal Process. Mag. 24 (4) (2007) 118–124. [4] E.J. Candes, M.B. Wakin, An introduction to compressive sampling, IEEE Signal Process. Mag. 25 (2) (2008) 21–30. [5] R.G. Baraniuk, E. Candes, M. Elad, Y. Ma, Applications of sparse representation and compressive sensing, Proc. IEEE 98 (6) (2010) 906–909. [6] V.K. Goyal, A.K. Fletcher, S. Rangan, Compressive sampling and lossy compression, IEEE Signal Process. Mag. 25 (2) (2008) 48–56. [7] T. Petrisor, B. Pesquet-Popescu, J.-C. Pesquet, A compressed sensing approach to frame-based multiple description coding, in: Proc. Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), vol. 2, 2007, pp. 709–712. [8] Y. Zhang, S. Mei, Q. Chen, Z. Chen, A novel image/video coding method based on compressive sensing theory, in: Proc. Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), 2008, pp. 1361–1364. [9] H. Huang, A. Makur, D. Venkatraman, Video object error coding method based on compressive sensing, in: Porc. Int. Conf. Control, Automation, Robotics and Vision, 2008, pp. 1287–1291. [10] D. Venkatraman, A. Makur, A compressive sensing approach to object-based surveillance video coding, in: Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), 2009, pp. 3513–3516.
894
Z. Gao et al. / J. Vis. Commun. Image R. 24 (2013) 885–894
[11] S. Sanei, A. H. Phan, J.-L. Lo V. Abolghasemi and A. Cichocki, ‘‘A compressive sensing approach for progressive transmission of images’’, in Proc. Int. Conf. DSP’2009, pp. 1–5, 2009. [12] J. Wen, Z. Chen, Y. Han, J.D. Villasenor, S. Yang, A compressive sensing image compression algorithm using quantized DCT and noiselet information, in: Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), 2010, pp. 1294–1297. [13] J. Prades-Nebot, Y. Ma, T. Huang, Distributed video coding using compressive sampling, in: Proc. Int. Conf. Picture Coding Symposium (PCS), 2009, pp. 1–4. [14] A.A. Kumar, A. Makur, Lossy compression of encrypted image by compressive sensing technique, in: Proc. Int. Conf. TENCON’2009, 2009, pp. 1–5. [15] C. Deng, W. Lin, B.-S. Lee, C.T. Lau, Robust image compression based on compressive sensing, in: Proc. Int. Conf. Multimedia & Expo (ICME), 2010, pp. 462–467. [16] C. Deng, W. Lin, B.-S. Lee, C.T. Lau, Robust image compression based on compressive sensing, IEEE Trans. Multimedia 14 (2) (2012) 278–290. [17] J.A. Tropp, A.C. Gilbert, Signal recovery from random measurements via orthogonal matching pursuit, IEEE Trans. Inform. Theory 53 (12) (2007) 4655– 4666. [18] M.A.T. Figueiredo, R.D. Nowak, S.J. Wright, Gradient projection for sparse reconstruction application to compressed sensing and other inverse problems, IEEE J. Sel. Top. Signal Process. 1 (4) (2007) 586–597. [19] R. Chartrand, W. Yin, Iteratively reweighted algorithms for compressive sensing, in: Proc. Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), 2008, pp. 3869–3872. [20] E.J. Candes, M.B. Wakin, S. Boyd, Enhancing sparsity by reweighted l1 minimization, J. Fourier Anal. Appl. 14 (5) (2008) 877–905.
[21] S. Becker, J. Bobin, E. Candes, NESTA: a fast and accurate first-order method for sparse recovery, SIAM J. Imaging Sci. 4 (1) (2009) 1–39. [22] L. He, L. Carin, Exploiting structure in wavelet-based Bayesian compressive sensing, IEEE Trans. Signal Process. 57 (9) (2009) 3488–3497. [23] R.G. Baraniuk, V. Cevher, M. Duarte, C. Hegde, Model-based compressive sensing, IEEE Trans. Inf. Theory 56 (4) (2010) 1982–2001. [24] B. Han, F. Wu, D. Wu, Image representation by compressive sensing for visual sensor networks, J. Vis. Commun. Image Represent. 21 (4) (2010) 325–333. [25] L. Gan, Block compressed sensing of natural images, in: Proc. of Int. Conf. Digital Signal Processing, 2007, pp. 403–406. [26] S. Mun, J.E. Fowler, Block compressed sensing of images using directional transforms, in: Proc. Int. Conf. Image Processing (ICIP), 2009, pp. 3021–3024. [27] Y. Yang, O.C. Au, L. Fang, X. Wen, W. Tang, Perceptual compressive sensing for image signals, in: Proc. Int. Conf. Multimedia & Expo (ICME), 2009, pp. 89–92. [28] Y. Yang, O.C. Au, L. Fang, X. Wen, W. Tang, Reweighted compressive sampling for image compression, in: Proc. Int. Conf. Picture Coding Symposium (PCS), 2009, pp. 89–92. [29] Y. Yu, B. Wang, L. Zhang, Saliency-based compressive sampling for image signals, IEEE Signal Process. Lett. 17 (11) (2010) 973–976. [30] Z. Gao, C. Xiong, C. Zhou, H. Wang, Compressive Sampling with Coefficients Random Permutations for Image Compression, in: Proc. Int. Conf. on Multimedia and Signal Processing, 2011, pp. 321–324. [31] D. Donoho, Y. Tsaig, Extensions of compressed sensing, Signal Process. 86 (2006) 533–548. [32] W. Bajwa, J. Haupt, G. Raz, S.J. Wright, R. Nowak, Toeplitz-structured compressed sensing matrices, in: IEEE 14th Workshop on Statistical, Signal Processing, 2007, pp. 294–298.