Extrema points coding based on empirical mode decomposition: An improved image sub-band coding method


Computers and Electrical Engineering 39 (2013) 882–892


Guang-tao Ge a,b,*, Lu Yu a

a Institute of Information and Communication Engineering, Zhejiang University, Hangzhou 310027, China
b School of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou 310018, China

Article history: Available online 19 February 2013

Abstract: The extrema points coding approach is an attractive way to represent an image. It has the potential to become a new scheme for still-image compression and to advance mobile visual search techniques. Following the idea of extrema points coding, this paper presents a complete image compression approach. Based on Empirical Mode Decomposition (EMD) theory, a digital image is decomposed into different sub-bands, from which the most important extrema points are extracted. For the mid- and low-frequency sub-bands, a novel compression method with clear advantages over current methods is proposed. Combining this method with Linderhed's "coding of the EMD using DCT of variable sampled blocks" (VSDCTEMD), a complete digital image compression scheme is designed that improves on Linderhed's EMD-based image compression scheme. © 2013 Elsevier Ltd. All rights reserved.

1. Introduction

The most important lossy compression methods are discrete cosine transform (DCT) based methods [1] and discrete wavelet transform (DWT) based methods [2]. Based on the DCT, the Joint Photographic Experts Group (JPEG) committee published the first standard [3,4], commonly known as the JPEG standard. In 1996, the JPEG committee began to investigate a new still-image compression standard named JPEG2000 [5], which is based on the DWT and has been studied intensively in recent decades. However, JPEG2000 still has several problems. One of them concerns the selection of the best wavelet basis: since the two-dimensional wavelet transform represents functions dominated by "point singularities" well but not those dominated by "linear singularities" [6], the standard two-dimensional wavelet transform cannot be regarded as an optimal tool for describing two-dimensional spatial information [7]. Thus, methods that integrate fractal coding and the DWT [8] are being explored, along with other new image coding methods, including improved versions of SPIHT coding and directionlet-based methods [9]. Nevertheless, an adaptive decomposition method for image compression is still needed. Empirical mode decomposition (EMD), an adaptive multi-scale analysis tool for non-stationary signals, was first introduced by Huang et al. [10]. During the EMD process, the original signal is decomposed into a series of intrinsic mode functions (IMFs), each with a single instantaneous phase and a single instantaneous frequency, reflecting the intrinsic time-frequency or space-frequency characteristics of the original signal. Being self-adaptive and fully data-driven, EMD chooses its basis functions adaptively, which makes it much more flexible than the DWT. While the one-dimensional case has been widely studied, the idea has also been extended to two dimensions, where it is called Bidimensional Empirical Mode Decomposition (BEMD).
The BEMD method has been applied to fields such as feature extraction, texture analysis, image de-noising and image fusion [11]. In particular, the frequency identity of the IMFs has raised researchers' interest in transforming and coding images on the

Reviews processed and recommended for publication to Editor-in-Chief by Deputy Editor Dr. Ferat Sahin.

* Corresponding author at: Institute of Information and Communication Engineering, Zhejiang University, Hangzhou 310027, China. E-mail address: [email protected] (G.-t. Ge). 0045-7906/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.compeleceng.2013.01.003

sub-bands of different IMFs. Based on BEMD theory, Linderhed successfully compressed still images in her PhD thesis [12]. Following Linderhed's primary idea, He improved the bidimensional interpolation used in the EMD decomposition and designed a new image compression method [13]. Tian presented a BEMD-based image compression method built on the multi-resolution characteristics of the BIMFs [6]. Moreover, Guaragnella proposed a technique to enhance JPEG encoding with the BEMD process [14]. As a classical sub-band image compression and coding method, Linderhed's scheme treats every sub-band similarly. Even though the algorithm under this scheme gives quite good experimental results, ignoring each sub-band's distinct space-frequency characteristics remains an obstacle to higher compression ratios. To exploit each sub-band's space-frequency characteristics, this paper extends Linderhed's research by further exploring a general variable sampling method based on extrema points. The method in this paper helps represent a digital image purely with extrema point information. Using this information, we could transmit the extrema points of one sub-band (IMF or residue) and then recognize image contents on that sub-band, which would be very valuable for mobile visual search [15]. This paper is organized as follows: the concepts of EMD and BEMD are explained in Section 2, the primary idea of Linderhed's compression method is introduced in Section 3, and the details of the compression method with grid characteristic points are described in Section 4. Section 5 briefly describes the new scheme of image sub-band coding based on BEMD. Experimental results are shown in Section 6, and Section 7 concludes the paper.

2. EMD and BEMD

The principle of EMD is based on characterizing the signal through its decomposition into IMFs, which can be defined as follows:

Definition 1.
A function is an intrinsic mode function (IMF) if the number of extrema equals the number of zero-crossings and if it has a zero local mean [10]. With this definition, the principle of EMD can be described as in Section 2.1.

2.1. Empirical mode decomposition (EMD)

The signal x(t) can be decomposed as

x(t) = \sum_{i=1}^{n} c_i + r_n    (1)

In formula (1), each c_i is an IMF; that is, n empirical modes can be decomposed from the original signal. The decomposition also yields a residue r_n, which is the low-frequency trend of x(t). The IMFs c_1, c_2, ..., c_n occupy different frequency bands ranging from high to low, and the frequency components on each band are different.

2.2. Bidimensional Empirical Mode Decomposition (BEMD)

For a two-dimensional signal f(x, y), x = 1, ..., M, y = 1, ..., N, the sifting process can be described as follows:

(1) Initialize: let j = 1, r_{j-1}(x, y) = r_0(x, y) = f(x, y).
(2) Extract the jth IMF:
    (a) Initialize: let k = 1, h_{k-1}(x, y) = h_0(x, y) = r_{j-1}(x, y).
    (b) Extract the local maxima and minima of h_{k-1}.
    (c) Compute the upper and lower envelope functions u_max(x, y) and u_min(x, y) by interpolating, respectively, the local maxima and the local minima of h_{k-1}.
    (d) Compute m(x, y) = [u_max + u_min]/2 (mean envelope, the local mean surface).
    (e) Update: h_k(x, y) = h_{k-1}(x, y) - m(x, y).
    (f) Calculate the stopping criterion (standard deviation, SD).
    (g) Put k = k + 1 and repeat steps (b)-(f) until SD < e, where e is the gate value.
(3) Update the residue: r_j = r_{j-1} - c_j, where c_j is the jth IMF.
(4) Repeat steps (2) and (3) with j = j + 1 until the number of IMFs meets the demands.
(5) Finally, we get the result of BEMD:

f(x, y) = \sum_{i=1}^{n} c_i(x, y) + r_n(x, y)    (2)

The standard deviation SD is calculated as

SD = \sum_i \sum_j \frac{|h_{k-1}(i, j) - h_k(i, j)|^2}{h_{k-1}^2(i, j)}    (3)
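The sifting loop above can be sketched in a one-dimensional form (an illustrative simplification: the paper's BEMD interpolates 2-D envelope surfaces, while this sketch builds cubic-spline envelopes on a 1-D signal; the function names are our own):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def sift_once(h):
    """One sifting iteration, steps (b)-(f): subtract the mean envelope."""
    t = np.arange(len(h))
    # step (b): interior local maxima and minima
    maxima = [i for i in range(1, len(h) - 1) if h[i - 1] < h[i] > h[i + 1]]
    minima = [i for i in range(1, len(h) - 1) if h[i - 1] > h[i] < h[i + 1]]
    if len(maxima) < 4 or len(minima) < 4:
        return h, 0.0  # too few extrema to build spline envelopes
    # step (c): upper envelope through maxima, lower envelope through minima
    upper = CubicSpline(maxima, h[maxima])(t)
    lower = CubicSpline(minima, h[minima])(t)
    m = (upper + lower) / 2.0   # step (d): local mean
    h_new = h - m               # step (e): update
    # step (f): the SD stopping criterion of formula (3)
    sd = np.sum(np.abs(h - h_new) ** 2 / (h ** 2 + 1e-12))
    return h_new, sd

def extract_imf(x, eps=0.2, max_iter=100):
    """Repeat sifting until SD < eps (step (g)), yielding one IMF."""
    h = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        h, sd = sift_once(h)
        if sd < eps:
            break
    return h
```

Subtracting the returned IMF from the input gives the residue, i.e. step (3) of the procedure.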


2.3. The first IMF, the first residue and the first local mean surface

If the decomposition goes only one round, we obtain a simplification of formula (2):

f(x, y) = c_1(x, y) + r_1(x, y)    (4)

In this paper, c_1(x, y) is defined as the first IMF, and r_1(x, y) is defined as the first residue. The mean envelope m(x, y) computed from the upper and lower envelope functions in the first sifting iteration is denoted m_1(x, y) and defined as the first local mean surface of the first IMF and the first residue. Similarly, the second local mean surface of the first IMF and the first residue is denoted m_2(x, y), the third m_3(x, y), and so on.

3. The primary idea of Linderhed's compression method

Linderhed's image compression method is a typical sub-band compression method based on BEMD [16]. One of its key points is to discuss the possibility of representing and reconstructing images with extrema points. Linderhed's initial work along this approach was to transmit the extrema points of each IMF and residue; the IMFs and residues are then reconstructed with spline interpolation at the decoder [12,17]. However, because the positions of the extrema points are irregular, coding them costs too many bits for the decoder to reach an acceptable compression ratio. Alternatively, Linderhed proposed the "coding of the EMD using DCT of the variable sampled blocks" (VSDCTEMD) method and applied it to the IMFs. Since different regions of an image usually have different contents, the frequency distribution varies across the image. By setting a uniform sampling rate over the whole image, the traditional sampling principle inevitably introduces redundancy, because it gives low-frequency regions a comparatively high sampling rate, producing far more sampling points than necessary. If, on the contrary, the sampling rate is adjusted adaptively according to the image's local frequency, a much higher compression ratio can be achieved. This is the primary idea of "variable sampling".
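The variable-sampling idea can be illustrated with a small sketch: for each pixel template, estimate the highest significant spatial frequency from its spectrum, then derive the local sampling step from the Shannon theorem. The spectral-energy threshold `energy_frac` is our own illustrative choice, not a parameter of Linderhed's method:

```python
import numpy as np

def local_sampling_step(block, energy_frac=0.99):
    """Estimate a per-template sampling step (assumes a square block).

    The radial frequency containing `energy_frac` of the spectral energy
    is treated as the block's maximum frequency; the Shannon theorem
    then gives the sampling step (>= 2 samples per period)."""
    n = block.shape[0]
    spec = np.abs(np.fft.fftshift(np.fft.fft2(block - block.mean()))) ** 2
    # radial frequency index of every spectral coefficient
    fy, fx = np.indices(spec.shape) - n // 2
    radius = np.sqrt(fx ** 2 + fy ** 2)
    order = np.argsort(radius.ravel())
    energy = np.cumsum(spec.ravel()[order])
    if energy[-1] == 0:
        return n  # flat block: a single sample suffices
    k = np.searchsorted(energy, energy_frac * energy[-1])
    f_max = max(radius.ravel()[order][k] / n, 1.0 / n)  # cycles per pixel
    return max(int(1.0 / (2.0 * f_max)), 1)  # Nyquist: sample at >= 2*f_max
```

A high-frequency checkerboard block yields step 1 (sample every pixel), while a flat block yields one sample per block, matching the intuition that low-frequency regions need far fewer samples.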
In the VSDCTEMD method, one sampling unit is one pixel template (usually a 7 × 7 template). The local frequency of each pixel template is estimated first, from which the maximum frequency in that template is determined. The actual sampling rate in the template is then decided according to this maximum frequency and the Shannon sampling theorem. Normally, the local sampling rate differs from template to template. To reduce mosaic artifacts during reconstruction, an overlapping pixel line is retained between neighboring templates; the overlapped pixels guarantee equal boundary values between templates at reconstruction. Each IMF is sampled into a variable sampled points block, and this block is then coded using the DCT as the last step of the IMF's compression. The scheme for compressing one image with VSDCTEMD is shown in Fig. 1.

4. Compression with the grid characteristic points

In her PhD thesis, Linderhed proposed that the extrema points coding approach is an attractive way to represent the image. According to her research, the major distortion comes from the reconstruction by interpolation, because position data coding is really difficult for randomly placed extrema points.

Fig. 1. The scheme of the VSDCTEMD approach (coding: Image Input → BEMD → Variable Sample → DCT & Quantize → Huffman Coding; decoding: Huffman Decoding → iDCT & Reconstruct → Data Synthesized → Image Reconstruction of IMF1, ..., IMFn and the Residue).

The VSDCTEMD method is an alternative approach to extrema points coding. Generally speaking, decomposing an original image into two sub-bands (the first IMF and the first residue) is enough for compression. The VSDCTEMD method fits the first IMF well, but for the first residue it runs into a paradox. For this reason, Linderhed fell back on traditional DCT compression for the first residue, calling this improved DCT method "DCT Threshold Coding" (DCTT) [18]. Unfortunately, the primary idea of extrema points coding could thus not be carried through to the end. In this section, we improve the plan of representing and reconstructing the image with extrema points and solve the problem of position data coding when representing the first residue. The novel method presented here is effective for compressing an image's mid- and low-frequency information.

4.1. Comparison of two primary ideas for representing and reconstructing the first residue image with extrema points

In Linderhed's VSDCTEMD method, the choice of sampling points is based on the distribution of extrema points (or "characteristic points"). In the final analysis, the VSDCTEMD method represents and reconstructs the image with "extrema points in blocks", as shown in Fig. 2. The primary idea of VSDCTEMD is therefore to divide an image into blocks. Since the local density of extrema points varies between blocks, different sampling rates are assigned: high in high-density blocks and low in low-density blocks. However, this idea causes problems for an image without widely distributed high-frequency information, such as the first residue image. If the distribution of extrema points is very sparse, dividing the image into many blocks is unnecessary, because the data coding for the blocks reduces the compression ratio.
On the other hand, if the image is divided into only a few large blocks, there is a high probability that the sampling rate will be very high across a whole large block, which happens whenever a locally dense cluster of extrema points exists in some small region. In this case, the compression ratio also decreases. This is the so-called paradox of VSDCTEMD when facing the first residue image. To solve this problem, we design a novel approach to compress the first residue image. Our primary idea is to represent and reconstruct the image with "extrema points in layers", as shown in Fig. 3. Considering the primary idea of BEMD, as step (e) of (2) in Section 2.2 shows, the decomposition process actually subtracts local mean surfaces from the original image. The first residue is the sum of the several local mean surfaces of the first round of BEMD:

r_1(x, y) = \sum_{i=1}^{k} m_i(f(x, y))    (5)

Therefore, we can compress the first residue by compressing these several local mean surfaces. We may look for important points at normative positions to represent the local mean surfaces, since normative positions mean a low coding bit rate.

Fig. 2. Linderhed's idea to represent and reconstruct the image: the image is divided into blocks of random extrema points, and the sampling rate in each block is set according to the local extrema-point density.

Fig. 3. Our idea to represent and reconstruct the image: the first residue is represented as layers of random extrema points, which are then coded.

4.2. Extracting the grid characteristic points from local mean surfaces

The distribution of extrema points on the local mean surfaces is also disordered and unsystematic. To represent a local mean surface, we must provide the exact position of each extrema point together with its gray level. For the ith local mean surface with N_i extrema points, the number of data values to represent the surface is therefore 3 × N_i. However, the local mean surface is much smoother than the image and the envelope of the extrema points. For this reason, we propose a normative choice of grid characteristic points, selected as follows [19]. Consider a two-dimensional dataset sampled on a rectangular grid. First, we compute the number of extrema points on each row, denoted W_rj, j = 1, 2, ..., N_r, where N_r is the total number of rows of the grid. W_r(max) and W_r(min) are respectively the maximum and minimum of W_rj. A row is selected if its number of extrema points is greater than the value WA_r given by:

WA_r = (W_r(max) + W_r(min)) / G,    (6)

where G is a parameter specified empirically between 1.5 and 7.5. In formula (6), different G values give different compression ratios: a smaller G means a higher compression ratio. The first and last rows are always selected, regardless of whether their numbers of extrema points exceed WA_r. The columns are processed in the same way. The knots for spline interpolation are then the extrema points on each of the selected rows and columns, together with the intersection points of the selected rows and columns. This procedure produces a relatively sparse grid on the data domain. If the ith local mean surface has x_i selected rows and y_i selected columns, the number of grid characteristic points is M = x_i × y_i. Thus M gray-level values must be transmitted, while x_i and y_i carry the position information and are only two one-dimensional arrays; hence (M + x_i + y_i) values represent one local mean surface. The local mean surface is reconstructed at the receiver with bi-cubic spline interpolation.

4.3. Reconstructing local mean surfaces with the grid characteristic points

Fig. 4 shows the BEMD decomposition of Lena 128 × 128 taking SD = 0.2: Fig. 4a is the original image, Fig. 4b the first IMF, Fig. 4c the first residue, and Fig. 4d-f the first, second and third local mean surfaces. After extracting the grid characteristic points from the three local mean surfaces, we apply Huffman coding to the grid characteristic points' positions and the quantized amplitudes of their gray levels. The three mean surfaces and the first residue can then be reconstructed after Huffman decoding and bi-cubic interpolation. As Fig. 5 shows, the reconstructed images are virtually perfect. Fig. 5a is the reconstruction of the 1st local mean surface. Fig. 5b is the reconstruction of the 2nd local mean surface.
Fig. 5c is the reconstruction of the 3rd local mean surface. Fig. 5d is the reconstruction of the 1st residue.
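The selection and decoder-side reconstruction described above can be sketched as follows (function names are ours; the spline degree falls back to linear when too few rows or columns are selected, a guard the paper does not discuss):

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

def extrema_per_line(a):
    """Count interior extrema on each row of `a` (sign change of the
    discrete derivative; plateaus are ignored in this sketch)."""
    d = np.sign(np.diff(a, axis=1))
    return np.sum(d[:, :-1] * d[:, 1:] < 0, axis=1)

def select_lines(counts, G):
    """Keep lines with more extrema than WA = (Wmax + Wmin)/G of
    formula (6), always including the first and the last line."""
    wa = (counts.max() + counts.min()) / G
    keep = set(np.where(counts > wa)[0].tolist()) | {0, len(counts) - 1}
    return np.array(sorted(keep))

def grid_characteristic_points(surface, G=1.6):
    rows = select_lines(extrema_per_line(surface), G)
    cols = select_lines(extrema_per_line(surface.T), G)
    # gray levels at the intersections of the selected rows and columns
    return rows, cols, surface[np.ix_(rows, cols)]

def reconstruct_surface(rows, cols, values, shape):
    """Decoder side: bi-cubic spline through the grid points."""
    kx = min(3, len(rows) - 1)
    ky = min(3, len(cols) - 1)
    spline = RectBivariateSpline(rows, cols, values, kx=kx, ky=ky)
    return spline(np.arange(shape[0]), np.arange(shape[1]))
```

Only the two index arrays `rows` and `cols` and the intersection values need to be transmitted, which is what keeps the position-coding cost low.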

Fig. 4. The BEMD decomposition of Lena 128 × 128 taking SD = 0.2: (a) Lena 128 × 128; (b) IMF1; (c) residue1; (d) mean1; (e) mean2; (f) mean3.

Fig. 5. The reconstruction of the local mean surfaces and the first residue: (a) reconstructed mean1; (b) reconstructed mean2; (c) reconstructed mean3; (d) reconstructed residue1.

4.4. Compression ratio calculation

To assure the first residue's reconstruction quality, the most important factor is the precision of the grid characteristic points' coordinates. For this reason, the coordinate data are retained to four digits after the decimal point, so the coordinate error is less than 0.0001; one normalized coordinate therefore needs 14 bits. For an m × n image, we first calculate the bit rate per pixel before Huffman coding, which we call the Original Bits per pixel (OBpp). The OBpp measures the compression ratio without considering coding redundancy and describes the primary compression capability: the lower the OBpp, the higher the compression ratio. The OBpp of the first residue image is calculated as follows:

OBpp_1 = \left( \sum_{i=1}^{3} x_i \cdot y_i \cdot 8 \right) / (m \cdot n)    (7)

OBpp_2 = \left( \sum_{i=1}^{3} (x_i + y_i) \cdot 14 \right) / (m \cdot n)    (8)

OBpp = OBpp_1 + OBpp_2    (9)

According to formulas (7)-(9), the OBpp of the first residue image shown in Figs. 4 and 5 is 1.2417 bpp. After Huffman coding, we again calculate the bits per pixel and call this compression ratio Bits per pixel (Bpp). The Bpp is the final measure of the compression ratio after the coding redundancy has been removed; as before, the lower the Bpp, the higher the compression ratio. The Bpp of the first residue image is calculated as follows:

Bpp_1 = \left( \sum_{i=1}^{3} (x_i \cdot y_i \cdot 8) / Cr_i \right) / (m \cdot n)    (10)

where Cr_i is the compression ratio of the x_i \cdot y_i point data under Huffman coding.

Bpp_2 = \left( \sum_{i=1}^{3} \mathrm{Huffman}((x_i + y_i) \cdot 14) \right) / (m \cdot n)    (11)

where Huffman((x_i + y_i) \cdot 14) is the number of bits after Huffman coding of the (x_i + y_i) \cdot 14 bits of data.

Bpp = Bpp_1 + Bpp_2    (12)
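The bit-rate formulas (7)-(12) can be sketched as follows. The Huffman total code length stands in for the compression-ratio factor Cr_i of formula (10) (an equivalent formulation), and the choice of symbol alphabets is our own illustration:

```python
import heapq
from collections import Counter

def obpp(surfaces_xy, m, n):
    """Formulas (7)-(9): raw bits per pixel before entropy coding.
    surfaces_xy holds (x_i, y_i), the numbers of selected rows and
    columns of each local mean surface; 8 bits per gray level and
    14 bits per normalized coordinate."""
    gray = sum(x * y * 8 for x, y in surfaces_xy)        # OBpp1 numerator
    coords = sum((x + y) * 14 for x, y in surfaces_xy)   # OBpp2 numerator
    return (gray + coords) / float(m * n)

def huffman_bits(symbols):
    """Total code length of `symbols` under a Huffman code built on
    their frequencies (sum of internal-node weights)."""
    freqs = list(Counter(symbols).values())
    if len(freqs) == 1:
        return len(symbols)  # degenerate one-symbol alphabet: 1 bit each
    heapq.heapify(freqs)
    total = 0
    while len(freqs) > 1:
        a, b = heapq.heappop(freqs), heapq.heappop(freqs)
        total += a + b  # each merge adds one bit to every leaf below it
        heapq.heappush(freqs, a + b)
    return total

def bpp(gray_symbols_per_surface, coord_symbols_per_surface, m, n):
    """Formulas (10)-(12) with Huffman-coded gray levels and coordinates."""
    gray = sum(huffman_bits(s) for s in gray_symbols_per_surface)
    coords = sum(huffman_bits(s) for s in coord_symbols_per_surface)
    return (gray + coords) / float(m * n)
```

The heap-based `huffman_bits` avoids building the code tree explicitly: summing the merged weights equals summing (frequency × code length) over all symbols.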

Therefore, according to formulas (10)-(12), the final Bpp of the first residue image shown in Figs. 4 and 5 is 0.5625 bpp. We compared the compression results of our method with the JPEG and JPEG2000 standards, taking the peak signal-to-noise ratio (PSNR) as the measure of distortion of the reconstructed images. From Table 1, the distortion of our method on the first residue image is much lower than that of JPEG and JPEG2000, while its compression ratio is markedly higher. We also compared our method with Linderhed's DCT Threshold Coding (DCTT) method; according to Table 2, our method has clear advantages over DCTT when compressing the first residue image.
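The PSNR used as the distortion measure throughout the tables can be computed with the standard definition (the 8-bit peak value of 255 is assumed here):

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    diff = original.astype(float) - reconstructed.astype(float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A higher PSNR means lower distortion, which is why the 68.13 dB entry in Table 1 indicates a nearly lossless first-residue reconstruction.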

Table 1. Comparison of compression level and PSNR with JPEG & JPEG2000.

              JPEG     JPEG2000  Our method
Bpp (bpp)     0.79     0.69      0.5625
PSNR (dB)     44.6352  39.5226   68.1268

Table 2. Comparison of compression level and PSNR with DCTT.

              DCTT              Our method
Bpp (bpp)     0.30     0.44     0.30     0.31
PSNR (dB)     45.0000  50.0000  47.1602  50.0113

4.5. The influence of SD on the first residue compression

A more rigorous SD requires more sifting iterations, so BEMD produces more local mean surfaces. For instance, when SD = 0.12, the first decomposition of the Lena 128 × 128 image yields 5 local mean surfaces, as shown in Fig. 6. If every local mean surface had to be extracted and represented, the compression level of our method would not be stable. Nevertheless, a simple analysis of the physical properties of these local mean surfaces shows that this need not become a bottleneck. The definition of an IMF states: "At any point, the mean value of the envelopes defined by the local maxima and minima is zero." This means the local mean surface produced in the sifting round closest to satisfying the SD criterion has the smallest fluctuation amplitude; its local maxima and minima are nearly zero, so such a surface carries only high-frequency information of small amplitude. Conversely, since the first local mean surface is the furthest from satisfying the SD criterion, it carries most of the image's local low-frequency information, and the local low- and mid-frequency information carried by the subsequent local mean surfaces decreases rapidly. Therefore, we tried reconstructing the first residue image with only the first three local mean surfaces; the result equals the reconstruction at SD = 0.2. We then compared this result with the original first residue image at SD = 0.12 to calculate the distortion. Meanwhile, we also reconstructed the first residue image with all five local mean surfaces at SD = 0.12 and calculated its distortion. Table 3 compares our experimental results with the JPEG and JPEG2000 results.

Fig. 6. The BEMD decomposition of Lena 128 × 128 taking SD = 0.12: (a) Lena 128 × 128; (b) IMF1; (c) residue1; (d) mean1; (e) mean2; (f) mean3; (g) mean4; (h) mean5.

Table 3. Comparison of compression level and PSNR among approaches of three- or five-surface reconstruction and JPEG & JPEG2000.

              JPEG     JPEG2000  First three local mean surfaces  Five local mean surfaces
Bpp (bpp)     0.83     0.75      0.5625                           0.89
PSNR (dB)     44.4639  38.8310   30.6557                          66.4979

According to Table 3, the compression result with five local mean surfaces has an advantage over JPEG and JPEG2000 only in its low distortion. Studying the compression result with the first three local mean surfaces, we find that its compression ratio has a clear advantage over JPEG and JPEG2000, while the distortion remains acceptable. A practical recommendation is therefore that a rigorous SD criterion is unnecessary when using BEMD for image compression: the compression process does not need precise IMFs for time-frequency analysis, but rather an appropriate tool for obtaining a high compression ratio. As a data decomposition tool, the job of BEMD in image compression is to decompose the original image into several frequency sub-bands that are convenient to represent and reconstruct in the next step; an application of BEMD can be called good if it solves the problem discussed above. In most cases, SD = 0.2 is strict enough for BEMD to finish the compression task.

5. The improved image sub-band coding scheme based on BEMD

We use the grid-characteristic-points method of Section 4 on the local mean surfaces because a local mean surface is a slowly varying two-dimensional signal. The first residue, being the sum of the local mean surfaces of the first round of decomposition, can also be taken as a slowly varying two-dimensional signal, so the method fits the first residue as well. However, the method is not good enough for the first IMF, whose two-dimensional signal is not slowly varying: the information of the first IMF lies mainly in the high-frequency band, and its random high-frequency extrema points cannot be recovered by bi-cubic spline interpolation. Therefore, we cannot discard the extrema points that carry important high-frequency transitions.
As the sum of the first IMF and the first residue, the original image cannot simply be compressed with this method either. To compress the original image, it is feasible to combine the grid-characteristic-points method of Section 4 with a method specialized for high-frequency image compression. Since Linderhed's VSDCTEMD method compresses the first IMF well, we use VSDCTEMD for the first IMF and the method of Section 4 for the first residue. The new scheme of image sub-band coding based on BEMD is the combination of these two methods, as shown in Fig. 7.

6. Experimental results

Using the VSDCTEMD scheme, Linderhed compressed the Lena 128 × 128 image; the result is shown in Fig. 8 [12], where the left picture is the original image and the right picture is the compressed image.

Fig. 7. The scheme of the improved image compression approach based on BEMD (coding: Input image → BEMD; the IMF is coded with VSDCTEMD, and the residue via the grid characteristic points of the first through nth local mean surfaces, merged by a Data Synthesizer; decoding: Data Separator → iVSDCTEMD for the IMF and local mean surface reconstruction for the residue, summed to the reconstructed image).

Fig. 8. The compression of Lena 128 × 128 with the VSDCTEMD approach.

Taking a rigorous SD (SD = 0.12), we reconstructed the first residue image with five local mean surfaces. Combined with the VSDCTEMD method, this gives the compressed Lena 128 × 128 image shown in Fig. 9, with the parameter G equal to 1.6; the left picture is the original image and the right picture is the compressed image. As an alternative, we reconstructed the first residue image with only the first three local mean surfaces under the same rigorous SD; combined with VSDCTEMD, this gives the compressed image shown in Fig. 10 (G = 1.6, original on the left, compressed on the right). Finally, with SD = 0.2, we reconstructed the first residue image with all three local mean surfaces; combined with VSDCTEMD, this gives the compressed image shown in Fig. 11 (G = 1.6, original on the left, compressed on the right). The Bpp and PSNR values of these four experiments (Figs. 8-11) are compared in Table 4. The simulations ran on an Intel(R) Core(TM)2 Duo T6570 CPU at 2.1 GHz with approximately 1.3 GB of memory per core, using MATLAB 7.0; the computation costs are also shown in Table 4. According to Table 4, the distortion of Fig. 11 is very low while its compression ratio is quite high at a small computation cost, which shows that taking SD = 0.2 is the most favorable BEMD choice for image compression. Comparing Fig. 11 with Fig. 8, our method achieves a markedly higher compression ratio at a similar distortion, illustrating that it successfully improves Linderhed's method. We also compared the results of Fig. 11 with the JPEG and JPEG2000 results for compressing the original Lena 128 × 128 image, as shown in Table 5.

Fig. 9. The compression image when using five mean surfaces to reconstruct the first residue.

Fig. 10. The compression image when using three mean surfaces to reconstruct the first residue.


Fig. 11. The compression image when using SD = 0.2 as the stopping criterion.

Table 4. Comparison among four experiments.

                    Fig. 8  Fig. 9  Fig. 10  Fig. 11
Bpp (bpp)           1.35    1.12    1.04     1.05
PSNR (dB)           30.520  30.600  28.368   30.590
Encoding time (s)   6.078   6.159   5.219    5.018
Decoding time (s)   3.151   3.236   2.111    2.215

Table 5. Comparison of compression level and PSNR between JPEG & JPEG2000 and our method.

              JPEG    JPEG2000  Fig. 11 (our method)
Bpp (bpp)     0.8     0.8       1.05
PSNR (dB)     32.862  31.243    30.590

Table 6. Comparison of compression level and PSNR between Multi-resolution Coding-1, 2 and our method (compressing the Lena 128 × 128 image).

              Multi-resolution Coding-1  Multi-resolution Coding-2  Our method
Bpp (bpp)     1.22                       1.22                       1.22
PSNR (dB)     29.540                     33.921                     33.698

Table 7. Quality comparison of decoded images evaluated using different images and methods.

Image    Bpp (bpp)  Algorithm                  PSNR (dB)
Baboon   1.2        JPEG                       29.571
                    JPEG2000                   28.233
                    Multi-resolution Coding-2  29.682
                    Our method                 29.018
Couple   1.0        JPEG                       31.952
                    JPEG2000                   29.813
                    Multi-resolution Coding-2  32.040
                    Our method                 29.297
Woman    0.9        JPEG                       33.252
                    JPEG2000                   31.543
                    Multi-resolution Coding-2  33.450
                    Our method                 30.087

According to Table 5, we found that the compression ratio of Fig. 11 cannot reach the level of JPEG and JPEG2000, even though the distortion rate is little lower. Nevertheless, our method can also be called an effective compression method since the compression result is quite good at bitrates around 1 bpp. On the other hand, we also compared the coding results of our method with other BEMD-based image compression methods such as the references [6,14], as in Table 6. According to Table 6, we found that our method is better than Multi-resolution Coding-1 but is not as good as Multi-resolution Coding-2. However, after analyzing the calculation complexity of Multi-resolution Coding-2, we can draw the


conclusion that our method's complexity is much lower than that of Multi-resolution Coding-2, since Multi-resolution Coding-2 includes image pyramid down-sampling and cascade feedback [6]. For further performance evaluation, we included more images with different features. The experimental results are shown in Table 7. Table 7 shows that our method produces PSNR values similar to those of the JPEG2000 standard for these test images. Even though the PSNR values are not as good as those produced by the JPEG standard or Multi-resolution Coding-2, the proposed method is still a feasible compression method, since the quality of the decoded images is good and stable at bit rates around 1 bpp. Additionally, as the proposed method is a complete extrema points coding method with low complexity, it is meaningful for the development of mobile visual search techniques.

7. Conclusion

As Linderhed stated, "The extrema points coding approach is an attractive way to represent the image. . ." [12], and the main contribution of this paper is the design of a complete extrema points coding scheme for the different sub-bands. In addition, this paper presents a way to code the position data of the extrema points in the low- and mid-frequency sub-bands. Both contributions help to improve the representation and reconstruction of digital images with extrema points. Unlike transform-domain standards such as JPEG and JPEG2000, the extrema points coding method offers a new choice for still image compression. Even though the method presented here cannot yet reach the highest compression levels of the JPEG and JPEG2000 standards, the experimental results show that it is an effective compression method. Moreover, compared with JPEG and JPEG2000, the "grid characteristic points" method of Section 4 has clear advantages in compression ratio and distortion on the low- and mid-frequency sub-bands.
It offers the possibility of performing a very high-speed initial step of mobile visual search in the mobile communication network.

Acknowledgements

This work is supported by the China Postdoctoral Science Foundation (2012M511881) and the National Natural Science Foundation of China (61076021, 61102146). Many thanks for the creative work of Dr. Anna Linderhed and for the work of Mr. Zengzhe Zhang. Thanks also to Dr. Beibei Zhu and Leilei Jia. Best regards to my parents.

References

[1] Pennebaker WB, Mitchell JL. JPEG still image data compression standard. New York: Van Nostrand Reinhold; 1993.
[2] Usevitch BE. A tutorial on modern lossy wavelet image compression: foundations of JPEG 2000. IEEE Signal Process Mag 2001;18(5):22–35.
[3] Information technology – digital compression and coding of continuous-tone still images – Part 1: Requirements and guidelines. ISO/IEC International Standard 10918-1, ITU-T Rec. T.81; 1993.
[4] Rabbani M, Joshi R. An overview of the JPEG2000 still image compression standard. Signal Process: Image Commun 2002;17:3–48.
[5] Information technology – JPEG2000 image coding system. ISO/IEC International Standard 15444-1, ITU-T Rec. T.800; 2000.
[6] Tian Y, Zhao K, Xu Y, Peng F. An image compression method based on the multi-resolution characteristics of BEMD. Comput Math Appl 2011;61:2142–7.
[7] Jiao LC, Tan S, Liu F. Ridgelet theory: from ridgelet transform to curvelet. Chin J Eng Math 2005;22:761–73.
[8] Iano Y, da Silva FS, Cruz AL. A fast and efficient hybrid fractal-wavelet image coder. IEEE Trans Image Process 2006;15:98–105.
[9] Velisavljevic V, Beferull-Lozano B, Vetterli M. Space–frequency quantization for image compression with directionlets. IEEE Trans Image Process 2007;16:1761–73.
[10] Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, et al. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc Roy Soc A: Math Phys Eng Sci 1998;454:903–95.
[11] Nunes JC, Guyot S, Deléchelle E. Texture analysis based on local analysis of the bidimensional empirical mode decomposition. Mach Vision Appl 2005;16:177–88.
[12] Linderhed A. Adaptive image compression with wavelet packets and empirical mode decomposition [PhD thesis]. Linköping, Sweden: Linköping University; 2004.
[13] He J, Peng F. Algorithm for image compression based on improved EMD. J Infrared Millim Waves 2008;27:295–8.
[14] Guaragnella C, Manni A, Palumbo F, Politi T. Bidimensional empirical mode decomposition for multiresolution image coding. In: Computational intelligence for measurement systems and applications (CIMSA), 2010 IEEE international conference; 2010. p. 34–7.
[15] Girod B, Chandrasekhar V, Chen DM, Cheung NM, Grzeszczuk R, Reznik Y, et al. Mobile visual search. IEEE Signal Process Mag 2011;28:61–76.
[16] Linderhed A. 2D empirical mode decompositions in the spirit of image compression. Proc SPIE 2002;4738:1–8.
[17] Linderhed A. Image empirical mode decomposition: a new tool for image processing. Adv Adapt Data Anal 2009;01:265–94.
[18] Linderhed A. Variable sampling of the empirical mode decomposition of two-dimensional signals. Int J Wavelets Multiresolut Inf Process 2005;03:435–52.
[19] Xu Y, Liu B, Liu J, Riemenschneider S. Two-dimensional empirical mode decomposition by finite elements. Proc Roy Soc A: Math Phys Eng Sci 2006;462:3081–96.

Guangtao Ge received a Ph.D. degree from Harbin Engineering University, China, in 2009. He is currently a Postdoctoral Researcher in the Institute of Information and Communication Engineering, Zhejiang University. He is also a lecturer in the School of Information and Electronic Engineering, Zhejiang Gongshang University. His research areas are multimedia communication, sparse representation and optimization calculation.

Lu Yu received a Ph.D. degree from Zhejiang University, China, in 1996. She is currently a Professor in the Institute of Information and Communication Engineering, Zhejiang University. Her research areas are video coding, multimedia communication and related ASIC design. She has published more than 100 technical papers and contributed 200 proposals to international and national standards in recent years.