Signal Processing: Image Communication 22 (2007) 86–101 www.elsevier.com/locate/image

Combined image compression and denoising using wavelets

V. Bruni, D. Vitulano

Istituto per le Applicazioni del Calcolo "M. Picone", C.N.R., Viale del Policlinico 137, 00161 Rome, Italy

Received 21 June 2006; received in revised form 23 November 2006; accepted 23 November 2006

Abstract

This paper presents a novel scheme for simultaneous compression and denoising of images: WISDOW-Comp (Wavelet based Image and Signal Denoising via Overlapping Waves—Compression). It is based on the atomic representation of wavelet details employed in WISDOW for image denoising. However, atoms can also be used for achieving compression. In particular, the core of WISDOW-Comp consists of recovering wavelet details, i.e. atoms, by exploiting wavelet low frequency information. Therefore, just the approximation band and the significance map of atoms' absolute maxima have to be encoded and sent to the decoder for recovering a cleaner as well as compressed version of the image under study. Experimental results show that WISDOW-Comp outperforms the state of the art of compression based denoisers in terms of both rate and distortion. Some technical devices will also be investigated for further improving its performance.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Image restoration; Image compression; Wavelets; Thresholding; Overlapping effects principle; Minimum description length principle

1. Introduction

Denoising is currently one of the most interesting and widely investigated topics in image processing, from both a theoretical and an empirical point of view. It commonly occurs in image acquisition and transmission in a variety of fields such as astronomy, remote sensing and so on [23,25,36]. The problem can be stated as follows:

g = h * f + t,    (1)

where g is the noisy signal, h is the blur, f is the clean signal and t is noise. In this paper t will be zero-mean Gaussian with variance \sigma^2 while h is the Dirac function.

Corresponding author. Tel.: +39 6 88470224; fax: +39 6 4404306. E-mail addresses: [email protected] (V. Bruni), [email protected] (D. Vitulano).

Despite the huge number of approaches proposed in the literature, denoisers' performances mainly depend on a suitable representation able to compact, as much as possible, the original image information in a few coefficients. From this point of view, wavelets seem to be a powerful tool (see [9,33, Chapter 9]), as proved by the amount of approaches proposed in the literature (see for instance [6,8,12,14,17,27,30,35,38,49] and references therein) as well as the considerable effort in improving their compaction ability [10,13,29,43]. But compaction of information is also fundamental for compression purposes [26,42]. In fact, encoding consists of eliminating redundant and "minor" information, usually corresponding to image high frequencies. Although the affinity between denoising and compression

0923-5965/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.image.2006.11.006

theoretically began around the middle of the past century [11], only recently has there been practical interest in frameworks sharing operations common to these two tasks. They are usually called comp-denoisers and are required in all cases where computing time has to be low or hardware power is limited [44]; a short review of this topic is given in Section 2. Comp-denoisers are mainly based on thresholding the wavelet details of the signal to be transmitted: only coefficients exceeding the threshold are retained. As a matter of fact, the choice of the best threshold is a hard problem since it implicitly requires knowledge of the original data. Donoho's universal threshold [14] theoretically gives an asymptotically optimal estimation error in the minimax sense, i.e. when the number of samples reaches infinity. Nonetheless, both noise and part of the clean signal are eliminated by thresholding, yielding some annoying artifacts, as depicted in Fig. 1. As an alternative, soft-thresholding [12] can be used. It is closer to the optimum minimax rate and preserves signal regularity, i.e. the estimator is at least as regular as the original signal [12]. Soft-thresholding tries to reduce the gap between preserved and discarded coefficients, leading to a better recovery. Results can be further improved by making soft-thresholding adaptive [6]. But such a strategy has two main drawbacks. Firstly, thresholding performances strongly depend on the size of the context in which the threshold has to be estimated. Secondly, it has to be performed for each coefficient, which is somewhat expensive. A solution may be to fix the threshold only once per wavelet sub-band, as in [7]. But in this case, adaptive thresholds tend to assume values quite smaller than the universal threshold. This results in a higher number of selected coefficients, yielding worse performances in terms of sparseness and, then, compression ratio. In this paper we present a novel comp-denoiser, WISDOW-Comp, able to achieve a good trade-off between the aforementioned requirements. It inherits all the properties of WISDOW, designed for image and signal denoising [5]. The latter splits wavelet details into atoms behaving as waves traveling through both space and scales. Each atom is composed of those coefficients generated by a singularity point in the original signal, as depicted in the left column of Fig. 2. Different kinds of singularity will produce slightly different shapes that can be approximated by a fixed shape on a fixed basis (right column of Fig. 2). This way, the absolute maximum amplitude (ama) of each atom, along with its location, allows us to recover the remaining coefficients. WISDOW-Comp then


Fig. 1. Thresholding effects on a noisy (top) and clean (bottom) step signal using a spline biorthogonal 3/9 wavelet. From left to right: signal, its wavelet representation and final result. The dotted line in the wavelet representation represents the coefficients discarded using Donoho's universal threshold, while the solid line represents the retained coefficients.
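As an illustration of the two rules discussed above, the following minimal sketch (our own, with a synthetic detail band and an assumed known noise level sigma) contrasts hard and soft thresholding under Donoho's universal threshold:

```python
import numpy as np

def universal_threshold(n, sigma):
    # Donoho's universal threshold: sigma * sqrt(2 ln n), n = number of samples
    return sigma * np.sqrt(2.0 * np.log(n))

def hard_threshold(d, T):
    # keep coefficients whose magnitude exceeds T, zero the rest
    return np.where(np.abs(d) > T, d, 0.0)

def soft_threshold(d, T):
    # shrink every retained coefficient toward zero by T, reducing the
    # gap between preserved and discarded coefficients
    return np.sign(d) * np.maximum(np.abs(d) - T, 0.0)

# synthetic noisy detail band: a few large "edge" coefficients plus noise
rng = np.random.default_rng(0)
sigma = 1.0
d = rng.normal(0.0, sigma, 512)
d[[50, 200, 400]] += 12.0

T = universal_threshold(d.size, sigma)
d_hard = hard_threshold(d, T)
d_soft = soft_threshold(d, T)
```

Note that soft thresholding also shrinks the surviving large coefficients by T, which is exactly the bias/regularity trade-off mentioned in the text.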


Fig. 2. (Left) Infinite ramp signal (top) and its corresponding atom in the wavelet domain (bottom). (Right) A 2nd degree polynomial ramp (solid line) can be locally approximated by an infinite ramp (dotted line). Their corresponding wavelet atoms at a fixed scale level are also depicted (bottom).

exploits the WISDOW representation, since the significance map will be composed of the locations of atoms' absolute maxima instead of the positions of all significant coefficients. Therefore, this approach gives a sparser representation than thresholding schemes (adaptive or not) and it is completely independent of context size tuning. However, the main contribution of this paper is to fully exploit the intra-scale property of the aforementioned representation. In fact, amas can be recovered from the significance map and the corresponding values in the low frequency component at the same scale. This easily follows by exploiting differential properties of the wavelet transform [33]. Such a strategy can then be successfully inserted in a classical multiresolution scheme for recovering the final image. In particular, just the significance map (at all scale levels) along with the wavelet approximation band (at the coarsest one) are sent to the decoder. The latter is then able to recover the detail components at the coarsest scale level. Hence, the approximation component and the recovered detail bands allow us to reach the approximation band at the successive (finer) scale. The process is then iterated till the first (the finest) scale level is reached.

Summing up, the proposed compression based denoiser is completely automatic, requires a low computational effort and outperforms the state-of-the-art comp-denoisers. In addition, since wavelet atoms satisfy the well-known overlapping effects principle at each scale level, they can be independently managed, enabling the decoder to be progressive. The organization of the paper is as follows. Section 2 contains a short presentation of available comp-denoisers. Some theoretical and practical aspects will be dealt with, and discussions about their limits will be provided. Section 3 briefly describes the WISDOW details useful to follow the design of WISDOW-Comp, contained in Section 4. In particular, the propagation of information across scales will be investigated by exploiting links between low and high frequency wavelet coefficients at each scale level. The corresponding algorithm will be the topic of Section 4.2. Section 5 shows some experimental results on selected test images as well as comparisons with the state-of-the-art comp-denoisers. Some technical variants oriented to further increasing the rate-distortion performances of the basic scheme are also presented. Finally, conclusions are drawn in Section 6.


2. Combining compression and denoising

Theoretical interest in linking signal compression and denoising is not novel at all; see for instance [1,2,16,18–22,40,45,46,48]. Classical approaches showed [2,11,45] and proved [16,46] that the combined problem denoising + compression is equivalent to solving them separately with respect to the L2 norm. However, some recent investigations have been oriented to justify the intuition of a stronger link between denoising and compression. For instance, Natarajan in [28,37] argued that lossy compression can be used for denoising by using Occam's hypothesis: "The simplest explanation of the observed phenomenon is more likely to be correct". His approach practically brought together these two fields, highlighting their link. A successive step forward in establishing a deeper connection between denoising and compression was made by Liu and Moulin [31]. Their elegant formulation proved that the complexity regularized [36] estimate of an image can be seen as a compressed version of the noisy one, where the coder operates at a precise point (different from the one in [37]) of the rate-distortion curve. Bounds for the MSE (mean square error) of the estimation method are then given in terms of the index of resolvability, quantifying the compressibility of the true image. More recently, Yea and Pearlman [47] tried to unify the two aforementioned approaches. They have shown that the optimal compression rate for denoising is given by the critical rate of the noise source. The latter is linked to the saturation point of the adopted denoiser at mid-high bit rates: from this point on the rate grows while the MSE remains almost unchanged. They also showed that this point does not coincide with the ones found by the Occam filter and complexity regularization. All the aforementioned comp-denoisers are based on well-known models and try to find the optimal point on the rate-distortion curve. The two following ones are somewhat different. They propose new models for compression combined with denoising. The first one has been proposed by Chang et al. in [7]. They stress an optimal design of an adaptive threshold (BayesShrink) in a Bayesian framework, based on the ratio between noise and original signal variance at each decomposition sub-band. Thresholding is then performed by the quantizer zero-zone, using the assumption that the quantization error of preserved coefficients is negligible. The minimum description length (MDL) criterion [39] is finally used to design


the uniform quantizer for a successive coding of wavelet coefficients [7]. The resulting comp-denoiser is called BayesShrink + Compress. Walker's comp-denoiser [44] (TAWS-Comp: Tree Adapted Wavelet Shrinkage and COMPression) employs a different strategy: it exploits a zero-tree based coder [42] for achieving denoising. The clustering property of wavelets, where higher amplitude coefficients are thickened around signal edges, is combined with the different behavior of true information and noise along scale levels. In fact, there is empirical evidence that true image edges produce higher and higher wavelet coefficients for increasing scale levels (from finer to coarser), unlike noise, which has the opposite behavior.¹ Hence, the universal threshold is empirically decreased for minimizing thresholding distortion. Following the same philosophy as Occam, this comp-denoiser shows, in some manner, the potentialities of JPEG2000 as a denoiser, as [47] does for SPIHT. From this short review, it turns out that both sparseness and scale linking play a key role for a robust and efficient comp-denoiser. In the next sections, we will present how WISDOW-Comp tries to accomplish these requirements.

3. WISDOW

Before presenting WISDOW-Comp, a brief overview of WISDOW [5] and its properties for denoising will be given. Let us consider a signal having an isolated singularity in t_0 (infinite ramp), defined as follows:

f(t) = \begin{cases} a, & t < t_0, \\ a_1 (t - t_0) + a, & t \geq t_0. \end{cases}    (2)

Its wavelet transform at time u and scale s, using the mother wavelet \psi, assumes the following nice form:

Wf(u,s) = \frac{1}{\sqrt{s}} \int_{-\infty}^{+\infty} f(t)\, \psi\!\left(\frac{t-u}{s}\right) dt = a_1 s \sqrt{s}\, F(u,s),    (3)

where F(u,s) = z \int_{-\infty}^{z} \psi(t)\, dt + \int_{z}^{+\infty} t\, \psi(t)\, dt, with z = (t_0 - u)/s. We can then define a basic atom centered at location t_0 as the set of wavelet coefficients of a function having a first order singularity in t_0 and represented by Eq. (3). At each scale s, Eq. (3) is completely determined by the slope

¹ This phenomenon has also been rigorously proved in [33, Chapter 6], [34] and [24, Chapter 11].


a_1 and the singularity location t_0. The former can be estimated from Wf in the least squares sense, using F as model function. With regard to t_0, the point where Wf attains its absolute modulus maximum, as depicted in Fig. 2 (leftmost bottom), can be selected. In other words, the highest energy point of Wf at a fixed scale s is a pointer to the absolute modulus maximum of the basic atom in (3). Fig. 3 depicts two waves approaching each other till they overlap. As a matter of fact, it is the wavelet transform at different scales of a signal having two singularity points, such as the one depicted in Fig. 4. At each scale level, each of them corresponds to a basic atom. The interference of the corresponding waves can then be modeled using the overlapping effects principle, i.e. by splitting the wavelet transform at each scale level into single atoms having the same shape. Using this kind of representation, a signal having more than two singularity points will correspond to the superimposition of single atoms in the wavelet domain [5]. In particular, jump discontinuities are composed of two interfering basic atoms having opposite sign, as shown in [5] and depicted in Fig. 5. Singularities of a different order generate differently shaped atoms. Nonetheless, we propose to approximate each of them with the same basic atom defined in (3). From a mathematical point of view, this corresponds to a first order approximation around the singularity location in the original signal (see right column of Fig. 2). It is obvious that the smaller the wavelet support, the more negligible the approximation error. This approximation allows us to overcome some drawbacks coming from a precise shape recovery as in [15]. This kind of representation proves useful for denoising purposes. Let g(t) be the infinite ramp


Fig. 3. Wavelet details of the signal in Fig. 4 computed up to 6th scale level (from left to right, top to bottom) using a spline biorthogonal 3/9 wavelet.


Fig. 4. Signal having two singularity points in t1 and t2 .

corrupted by white Gaussian noise u, i.e. g(t) = f(t) + u(t), and Wg its wavelet transform. According to the previous observations, f can be recovered by getting an estimation of the slope a_1 from the noisy data Wg, using F(u,s) as model function. Therefore, Wg can be substituted by \widetilde{Wf} = \tilde{a}_1 s \sqrt{s}\, F(u,s) and the restored signal is achieved by inverting the transform. A detailed description of the denoising algorithm can be found in Appendix A, while the extension to 2-D is in Appendix B. Notice that \widetilde{Wf} provides an estimation of each coefficient in the cone of influence of the analyzed singularity, i.e. for all u \in [t_0 - (L/2)s, t_0 + (L/2)s], where [-L/2, L/2] is the wavelet support. It can be rigorously proved that the estimation error E_{PR} = \|Wf - \widetilde{Wf}\|_2 is lower than the hard-thresholding one E_D [33]; see [5] for details.
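The atom model of Eqs. (2)–(3) and the least-squares slope estimation can be checked numerically. The sketch below is our own: it uses a Gaussian-derivative stand-in for the mother wavelet (not the paper's spline biorthogonal 3/9), computes F(u,s) by simple quadrature, verifies Wf = a_1 s√s F against a directly computed wavelet coefficient of an infinite ramp, and then recovers a_1 from noisy coefficients:

```python
import numpy as np

# Dense grid and a stand-in mother wavelet (Gaussian derivative, zero mean)
t = np.linspace(-20.0, 20.0, 8001)
dt = t[1] - t[0]

def psi(x):
    return -x * np.exp(-x**2 / 2.0)

def F(z):
    # F(u,s) = z * int_{-inf}^{z} psi(t) dt + int_{z}^{+inf} t*psi(t) dt
    p = psi(t)
    return z * p[t <= z].sum() * dt + (t * p)[t > z].sum() * dt

def W(f, u, s):
    # wavelet coefficient (1/sqrt(s)) * int f(t) psi((t-u)/s) dt
    return (f * psi((t - u) / s)).sum() * dt / np.sqrt(s)

# Infinite ramp with singularity at t0 and slope a1 (Eq. (2))
t0, a, a1, s = 1.0, 3.0, 2.0, 2.0
ramp = np.where(t < t0, a, a1 * (t - t0) + a)

# Check Eq. (3): Wf(u,s) = a1 * s * sqrt(s) * F(u,s), with z = (t0 - u)/s
u0 = 0.0
lhs = W(ramp, u0, s)
rhs = a1 * s * np.sqrt(s) * F((t0 - u0) / s)

# Least-squares slope estimate from noisy coefficients around the singularity
rng = np.random.default_rng(0)
us = np.linspace(-2.0, 4.0, 25)
Fs = np.array([F((t0 - ui) / s) for ui in us])
Wg = np.array([W(ramp, ui, s) for ui in us]) + rng.normal(0.0, 0.5, us.size)
a1_hat = (Wg @ Fs) / (s * np.sqrt(s) * (Fs @ Fs))
```

The closed-form solution `a1_hat` is just the normal equation of the one-parameter least-squares fit; in WISDOW the same fit is performed per atom within the cone of influence.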


Fig. 5. First row: Piecewise polynomial signal (left) and its portion (right). Second row: Wavelet details of the selected portion of the signal (left); the latter is composed of two interfering polynomial atoms (right).

It is worth outlining the difference between the WISDOW representation and the schemes in [15,34]. In particular, WISDOW is able to overcome some of their critical drawbacks. Firstly, the proposed approach does not require any constraint on the distance between close atoms (or their maxima). Hence, the scale level of the wavelet decomposition can be freely selected. Moreover, it links wavelet scale levels in a less rigid way, without building maxima chains, as will become clearer in the following. Finally, the atomic decomposition leads to a faster algorithm without involving energy minimizations or matching pursuit. An in-depth discussion on this topic can be found in [5].

4. WISDOW-Comp

WISDOW improves the sparsity of the wavelet representation and emphasizes the energy compaction property of the wavelet transform. In fact, each wavelet detail band of the image under study is represented by the couples (location, amplitude) of the absolute modulus maxima of the selected atoms, which represent the points of highest energy for a singularity in the original signal. This effect becomes more evident as the scale level increases (from the finest to the coarsest), since the wavelet support becomes wider. In order to stress this point, Fig. 6 shows

significance maps concerning the vertical sub-bands at the third scale level of the Lena image. They are produced by, respectively, classical hard-thresholding (left) and WISDOW (right). While the significance map of thresholding is composed of all the locations of significant coefficients, the WISDOW representation just consists of the locations of atoms' maxima. Notice how the WISDOW map is sparser, even though both schemes employ Donoho's universal threshold. Such a trend is independent of the adopted threshold. Nonetheless, this effect is attenuated in a multiresolution representation, which is more suitable for compression. For instance, an adaptive Bayesian threshold [7] has been used both in WISDOW and in a shrinkage scheme employing the same wavelet basis (spline biorthogonal 3/9). The resulting significance maps are, respectively, shown in Fig. 7a and b. It is obvious that the higher the threshold, the lower the bit rate, i.e. the higher the distortion. Nevertheless, WISDOW allows us to use a higher threshold without increasing distortion, as will be shown in the experimental results. This is possible since it also predicts the values of the coefficients under the threshold. Hence, WISDOW will employ the universal threshold, since it is high enough and automatic. The resulting significance map is shown in Fig. 7c. Notice that atoms are located along edges, where wavelet coefficients are known to be larger.
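The difference between the two significance maps can be illustrated on a toy 1-D detail band. This is our own simplification (WISDOW's actual atom selection is more elaborate than plain local-maxima picking), but it shows why one pointer per atom is sparser than one flag per significant coefficient:

```python
import numpy as np

def threshold_map(d, T):
    # significance map of plain hard thresholding: every |d| > T is flagged
    return np.abs(d) > T

def maxima_map(d, T):
    # WISDOW-style map: only local maxima of |d| exceeding T are flagged,
    # i.e. one pointer per atom instead of every significant coefficient
    m = np.abs(d)
    out = np.zeros(d.size, dtype=bool)
    out[1:-1] = (m[1:-1] > m[:-2]) & (m[1:-1] >= m[2:]) & (m[1:-1] > T)
    return out

# toy detail band: two smooth bumps (atoms) in mild noise
x = np.arange(256)
d = 8.0 * np.exp(-(x - 64)**2 / 50.0) - 6.0 * np.exp(-(x - 180)**2 / 80.0)
rng = np.random.default_rng(1)
d += rng.normal(0.0, 0.1, d.size)

T = 3.0
n_thr = int(threshold_map(d, T).sum())
n_max = int(maxima_map(d, T).sum())   # far fewer flagged locations
```

Since every flagged maximum also exceeds T, the maxima map is always a subset of the thresholding map; the gain comes from dropping all the non-peak significant coefficients, which WISDOW re-predicts at the decoder.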


Fig. 6. Vertical detail sub-band at 3rd scale level of a wavelet overcomplete expansion of the 512 × 512 × 8 bit Lena image. Significant locations are white on a dark background. (Left) Significance map of hard thresholded coefficients. (Right) Significance map of atoms' maxima selected by WISDOW. In both cases Donoho's universal threshold has been employed.

Fig. 7. Vertical detail sub-band at 3rd scale level of the 512 × 512 × 8 bit Lena image using spline biorthogonal 3/9 filters: (a) significance map of atoms' maxima selected by WISDOW using an adaptive Bayesian threshold, as computed in [7]; (b) significance map of coefficients selected by the BayesShrink algorithm [7]; (c) significance map of atoms' maxima selected by WISDOW using Donoho's universal threshold.
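For reference, the adaptive Bayesian threshold of [7] used in the caption above is commonly computed per sub-band as T_B = σ²/σ_x. A sketch under that common formulation (the MAD-based noise estimate is the usual companion rule, assumed here rather than taken from this paper):

```python
import numpy as np

def noise_sigma_mad(finest_diag):
    # robust noise estimate from the finest diagonal sub-band:
    # sigma = median(|d|) / 0.6745
    return np.median(np.abs(finest_diag)) / 0.6745

def bayes_threshold(subband, sigma):
    # BayesShrink: T = sigma^2 / sigma_x, where sigma_x^2 is the clean-signal
    # variance estimated as max(E[d^2] - sigma^2, 0) in this sub-band
    sigma_x2 = max(float(np.mean(subband**2)) - sigma**2, 0.0)
    if sigma_x2 == 0.0:
        return float(np.max(np.abs(subband)))  # band is all noise: discard everything
    return sigma**2 / np.sqrt(sigma_x2)
```

The higher the noise relative to the sub-band energy, the higher T_B, which is the behavior the text contrasts with the fixed universal threshold.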

Nonetheless, it is possible to further reduce the information to code by exploiting intra-scale properties of the atomic representation combined with the equivalence between a wavelet operator and derivation. Let us come back to Eq. (2) and compute its wavelet approximation component at time u and scale s using the scaling function \phi:

Sf(u,s) = \frac{1}{\sqrt{s}} \int_{-\infty}^{+\infty} f(t)\, \phi\!\left(\frac{t-u}{s}\right) dt = a \sqrt{s} \int_{-\infty}^{+\infty} \phi(y)\, dy + a_1 s \sqrt{s}\, G(u,s),    (4)

where

G(u,s) = \int_{z}^{+\infty} t\, \phi(t)\, dt - z \int_{z}^{+\infty} \phi(t)\, dt.

As for the function F in Eq. (3), G is known once the basic atom is fixed and the singularity location is

known. Even though Sf depends on both the parameters a and a_1, its second partial derivative with respect to u depends on a_1 only. In fact, it holds

\frac{d^2 Sf(u,s)}{du^2} = a_1 s \sqrt{s}\, \frac{d^2 G(u,s)}{du^2}.

a_1 can be computed by inverting the previous equation. In particular, for the discrete case it holds:

a_1 = \frac{1}{2^{3j/2}} \, \frac{Sf(t_0 + \Delta t, 2^j) - 2\, Sf(t_0, 2^j) + Sf(t_0 - \Delta t, 2^j)}{G(t_0 + \Delta t, 2^j) - 2\, G(t_0, 2^j) + G(t_0 - \Delta t, 2^j)},    (5)

where t_0 is the location of the atom under study (Eq. (3)) and \Delta t is the size of the adopted grid. This way, amas can be extracted from the approximation band using the information relative to the singularities' locations, i.e. the significance map. It turns out that the modulus maxima significance map of the wavelet detail bands and the approximation at the coarsest


scale could be the only information to be sent to the decoder, allowing a considerable saving in terms of the required bit budget.

4.1. Algorithm

The structure of WISDOW-Comp is simple. The encoder works as WISDOW, computing the atoms' maxima locations: the significance map. The latter is then sent to the decoder along with the approximation band. The decoder works in a progressive way. At the coarsest scale level, it uses the significance map information for computing the corresponding amas from the approximation band. Hence, the basic wavelet atom is modeled in the cone of influence of the analyzed position via (3), as depicted in Fig. 8. It is suitably added to the wavelet band we are going to reconstruct, according to the WISDOW model. This operation is performed for all points of the significance map corresponding to the wavelet detail band under study.

Fig. 8. Marked points in the leftmost figure represent the approximation band coefficients used for the estimation of the slope a_1. Hence, from a_1 and t_0 (contained in the significance map) it is possible to recover the corresponding atom in the high pass component (rightmost) via Eq. (3).
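Eq. (5) can be sketched directly in code. The check below is deterministic and uses an arbitrary stand-in sequence for G (not the true scaling-function integral): the constant term of Sf in Eq. (4) is annihilated by the second difference, so a_1 is recovered exactly:

```python
import numpy as np

def slope_from_approximation(Sf, G, i, j):
    # Eq. (5): ratio of second differences of the approximation band Sf and
    # of the known function G around the atom location i, at scale s = 2^j
    num = Sf[i + 1] - 2.0 * Sf[i] + Sf[i - 1]
    den = G[i + 1] - 2.0 * G[i] + G[i - 1]
    return num / (2.0**(1.5 * j) * den)

# deterministic check: build Sf with the shape of Eq. (4),
# Sf = const + a1 * 2^(3j/2) * G (the constant models a*sqrt(s)*int(phi))
j, a1_true = 2, 3.0
G = np.sin(np.linspace(0.0, 3.0, 32))     # stand-in for the true G sequence
Sf = 7.0 + a1_true * 2.0**(1.5 * j) * G
a1_rec = slope_from_approximation(Sf, G, 10, j)
```

This is why only the approximation band and the maxima locations need to be transmitted: the amplitude a_1 is re-derivable at the decoder.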

Fig. 9. WISDOW-Comp block scheme: (top) encoder, (bottom) decoder.


Fig. 10. Decimated basic atom from 1st to 3rd scale level using odd phase (solid line). Its approximation using a symmetric shape (dotted line).

Fig. 11. (Left) Denoised and compressed Lena image (σ = 20) using WISDOW-Comp at bpp = 0.24 and PSNR = 29.77 dB. (Right) Denoised and compressed Lena image (σ = 10) using WISDOW-Comp at bpp = 0.482 and PSNR = 32.96 dB.

Once the wavelet detail band is completely recovered, a step of the inverse wavelet transform is applied for achieving the approximation band at the next (finer) scale level. The algorithm is iterated till the first (finest) scale level is reached. A block scheme of WISDOW-Comp is depicted in Fig. 9, while the precise algorithm is described below.

4.1.1. Coder

Let {a_{2^J}, {d_{2^j}}_{1 \le j \le J}} be the decimated wavelet transform of the noisy signal {g(i), 1 \le i \le N}, computed up to the Jth level. Let L be the support size of the selected wavelet, and f the denoised signal, whose decimated wavelet decomposition will be indicated with {a_{2^J}, {r_{2^j}}_{1 \le j \le J}}.

- Perform the WISDOW algorithm (in Appendix A) adapted to a decimated wavelet decomposition.²

² It remains the same except for the fact that the function F must be suitably decimated according to the scale level, while the cone of influence of a maximum point m_i is [m_i - L/2, m_i + L/2].

- Let \hat{a}_{2^J} be the uniformly quantized and entropy coded approximation band a_{2^J}.
- For each detail band, extract the amas significance map: {B_{2^j}}_{1 \le j \le J}.
- Run-length and entropy code the binary vectors {B_{2^j}}_{1 \le j \le J}.
- Send \hat{a}_{2^J} and {B_{2^j}}_{1 \le j \le J} to the decoder.

4.1.2. Decoder

The bands {r_{2^j}}_{1 \le j \le J} are initialized to zero. Set j = J.

(1) For each B_{2^j}(i) \ne 0 with 1 \le i \le N/2^j:
    - Compute the corresponding a_1 using (5), where t_0 = i and \Delta t = \lfloor L/2 \rfloor.
    - Model the basic atom a_1 2^{3j/2} F(k, 2^j), where k \in [i - L/2, i + L/2].
    - Add a_1 2^{3j/2} F(k, 2^j) to r_{2^j}.
(2) Perform a step backward of the DWT, computing a_{2^{j-1}}.
(3) If j > 1, set j = j - 1 and go to (1), else stop.
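The decoder's progressive structure can be sketched as follows. This is our illustration only: a Haar analysis/synthesis pair stands in for the paper's spline biorthogonal 3/9 filters, and `recover_details` abstracts the atom modelling of step (1):

```python
import numpy as np

def haar_analysis(x):
    # one decimated DWT step (orthonormal Haar)
    return (x[0::2] + x[1::2]) / np.sqrt(2.0), (x[0::2] - x[1::2]) / np.sqrt(2.0)

def haar_synthesis(a, d):
    # one step backward of the DWT (step (2) of the decoder)
    out = np.empty(2 * a.size)
    out[0::2] = (a + d) / np.sqrt(2.0)
    out[1::2] = (a - d) / np.sqrt(2.0)
    return out

def decode(a_coarse, recover_details, J):
    # progressive decoder: from the coarsest level J down to level 1, rebuild
    # the detail band r_{2^j} from the current approximation (atom modelling,
    # step (1)), then invert one DWT step to reach the finer approximation
    a = a_coarse
    for j in range(J, 0, -1):
        r = recover_details(a, j)
        a = haar_synthesis(a, r)
    return a

# sanity check of the transform pair: synthesis inverts analysis
x = np.arange(8.0)
a, d = haar_analysis(x)
x_back = haar_synthesis(a, d)
```

With a `recover_details` that returns the true detail bands, `decode` inverts the analysis exactly; in WISDOW-Comp the bands are instead re-modelled from the significance map via Eq. (5), which is where the rate saving and the denoising both come from.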


The 2-D extension of the above algorithm is straightforward, accounting for the considerations in Appendix B. It is worth outlining that WISDOW-Comp must employ a multiresolution approach rather than an overcomplete representation as WISDOW does. Even though this choice is more suitable for compression purposes, it introduces an indetermination in the atoms' shapes. In other words, decimation introduces various kinds of distortion that obviously yield a loss of quality. An example is shown in Fig. 10. A solution may be to spend a bit per atom for predicting the phase of the decimation. However, we will not adopt any such strategy, in order not to complicate the encoder and decoder structure. This choice does not dramatically influence WISDOW-Comp performances, as we will see in the experimental results.

5. Experimental results

The results presented in this section concern three well-known 512 × 512 × 8 bit images: Lena, Goldhill and Barbara. They are a representative sample for testing our scheme since they are characterized by different frequency content. In fact, the last one is highly detailed while the other two are smoother [7]. A three scale level decimated decomposition using a spline biorthogonal 3/9 wavelet has been performed. Even though WISDOW is independent of the adopted wavelet, the symmetry of the aforementioned wavelet helps us to better exploit the atomic decomposition. With regard to the choice of the scale level, it is worth spending some words. We have seen that the only information required by WISDOW-Comp is the approximation band and the significance map of amas. It is well known that noise energy diminishes along scale levels j, since it is proportional to \sigma^2/2^j [34], where \sigma^2 is the noise variance. On the contrary, clean signal information grows along scales, in agreement with its local Lipschitz regularity [33]. Then the higher the scale level, the cleaner the approximation.

On the other hand, the error of propagation of amas from the coarser to the finer scale has to be accounted for too. In fact, the atoms' decay, which is regulated by the Lipschitz order, has not been considered, in order not to complicate the encoding scheme. Hence, the minimization of the corresponding error requires that the decomposition level not be too high. It has been empirically observed that a three scale level wavelet decomposition gets a


good trade-off between the two above-mentioned effects in terms of rate distortion. The classical MDL criterion has been adopted for quantizing the wavelet approximation band using the settings in [7]. This choice allows us to make an objective comparison with BayesShrink + Compress. Fig. 11 depicts the denoised and decoded Lena image achieved by WISDOW-Comp for σ = 20 and 10, while Table 1 contains rate-distortion versus noise standard deviation values of WISDOW-Comp. It is compared in Fig. 12 with the most effective comp-denoisers proposed in the literature: BayesShrink + Compress [7], TAWS-Comp [44], Critical rate + SPIHT [47] and Complexity Regularized + EQ [31,32]. The latter outperforms alternative quantization schemes like JPEG [26] and the Morphological Coder [41]. WISDOW-Comp achieves a considerable gain in terms of rate-distortion results for various noise variance values. The same trend is followed for Barbara and Goldhill, as shown in Fig. 13. We have seen that the choice of the scale level in WISDOW-Comp has been made accounting for the trade-off between a clean approximation band and the minimization of the propagation of the amas prediction error. However, this strategy does not provide a clean approximation band for high variance noise, since annoying mottling artefacts are visible; see

Table 1
Lena image: results in terms of PSNR (dB) and bpp achieved for various values of σ using WISDOW-Comp

Noise σ    PSNR     bpp
5          36.04    1.510
           34.80    0.617
           32.29    0.452
7          34.47    1.436
           34.13    0.547
           32.06    0.444
10         32.96    0.482
           31.58    0.337
           28.31    0.164
20         29.95    0.390
           29.77    0.240
           27.95    0.155
30         28.45    0.196
           27.46    0.149
35         27.86    0.183
           27.13    0.148


Fig. 12. Lena image: PSNR versus bpp at different noise standard deviations. Comparisons between WISDOW-Comp, BayesShrink + Compress [7], Complexity regularization + EQ [31], TAWS-Comp [44] and Critical rate + SPIHT [47].


Fig. 13. (Left) Barbara image. (Right) Goldhill image. PSNR versus bpp at different noise standard deviations. Comparisons between WISDOW-Comp, BayesShrink + Compress [7] and TAWS-Comp [44].

Fig. 14. They can be avoided by processing the approximation band via Wiener filtering its significant coefficients.3 Moreover, smoothness of this 3

Wiener2 matlab routine with default parameters has been used in all tests.

band can be exploited for achieving a lower rate too. Regions of insignificant coefficients can be reconstructed by a suitable interpolation using significant ones. We use bilinear interpolation with non-equally spaced knots in all experiments presented in this paper. It is to be highlighted that


Fig. 14. Zoomed and enhanced mottling artefacts of wavelet approximation sub-band.

Fig. 15. The 3rd scale level approximation sub-band of Lena image. (Left) Black regions represent the complement of the enlarged amas significance map. (Right) Approximation sub-band after interpolation.

the significance map must be enlarged by including the modulus maxima neighborhood, which is necessary for estimating the detail band coefficients by means of Eq. (5) (Fig. 15). According to the proposed comp-denoiser, the neighborhood of a modulus maximum is composed of its adjacent coefficients: the previous and the next one along either columns or rows, depending on the considered sub-band (see also Fig. 8). This strategy usually guarantees about a 20% saving of bits with a gain of 0.5 dB on average in terms of PSNR. This variant


will be called WISDOW-Comp.2; results on the Lena image are in Table 2. It is worth outlining that the bilinear interpolation does not affect the data used for atom estimation, while it decreases the noise in smooth regions.

We have seen that the error made in propagating information through scales can represent a possible limit of the proposed comp-denoiser. In order to control it, it is possible to code the amas of the vertical sub-band, since they are useful for reconstructing the diagonal one. In fact, at each scale level the columns of the vertical band are the wavelet approximation of the corresponding diagonal one: v_j = a_j * ψ_row * φ_col and d_j = a_j * ψ_row * ψ_col, where φ and ψ are, respectively, the scaling function and the mother wavelet. Instead of coding all vertical amas, this amount of bits can be allocated differently. That is, only a part of the selected amas of the coarsest vertical sub-band is coded, while the omitted (smaller) vertical ones are predicted using the rows of the approximation sub-band. The remaining bits are used for coding the greater amas of the other coarsest sub-bands, horizontal and diagonal. It is obvious that the number of vertical detail amas to be coded must be larger than for the other sub-bands, because of their crucial role. In other words, instead of coding all M vertical band amas, the following empirical allocation can be performed: 2/3 M from the coarsest vertical band, while the remaining 1/3 M are equally picked from the coarsest horizontal and diagonal bands.

The classical MDL criterion has been adopted for quantizing amas using the algorithm in [7]. Anyhow, it is possible to observe that the proposed representation can be exploited in the quantization step too. In fact, amas can be estimated by the coder from the corresponding wavelet approximation using Eq. (5). Hence, only the difference between estimated and actual values of the amas has to be sent to the decoder. The range of this prediction error is obviously smaller than that of the amas themselves. This peculiarity can then be exploited for reducing the quantization error at the same rate.
In particular, the number of bins can be about three times larger than the one required for coding the amas at the same rate. This variant will be denoted WISDOW-Comp.3.

It is worth highlighting that a different selected amount of amas at each band corresponds to an adaptive threshold. We have then embedded a Bayesian threshold [7] in the proposed comp-denoiser instead of the universal one. We noticed that such a strategy further improves the results achieved via our empirical choice, provided that



Table 2
Lena image: results in terms of PSNR and bpp achieved for various noise standard deviations σ using WISDOW-Comp.2, WISDOW-Comp.3 and WISDOW-Comp.2+3

Noise σ   Variant            PSNR                 bpp
5         WISDOW-Comp.2      36.50  34.95  32.48  1.25   0.58   0.40
          WISDOW-Comp.3      36.22  34.90  32.40  1.41   0.60   0.44
          WISDOW-Comp.2+3    36.55  35.02  32.60  1.18   0.57   0.387
7         WISDOW-Comp.2      35.31  34.35  32.40  1.14   0.50   0.39
          WISDOW-Comp.3      34.55  34.25  32.26  1.336  0.53   0.435
          WISDOW-Comp.2+3    35.50  34.53  32.55  1.07   0.489  0.381
10        WISDOW-Comp.2      33.25  31.60  28.70  0.430  0.323  0.151
          WISDOW-Comp.3      33.15  31.60  28.50  0.45   0.327  0.157
          WISDOW-Comp.2+3    33.50  31.70  28.92  0.41   0.315  0.142
20        WISDOW-Comp.2      30.75  30.12  28.30  0.34   0.193  0.138
          WISDOW-Comp.3      30.20  30.01  28.11  0.375  0.22   0.147
          WISDOW-Comp.2+3    30.93  30.30  28.50  0.331  0.18   0.130
30        WISDOW-Comp.2      28.83  28.00         0.182  0.135
          WISDOW-Comp.3      28.60  27.60         0.186  0.141
          WISDOW-Comp.2+3    29.00  28.21         0.175  0.130
35        WISDOW-Comp.2      28.47  27.80         0.169  0.133
          WISDOW-Comp.3      28.00  27.36         0.175  0.14
          WISDOW-Comp.2+3    28.72  27.98         0.160  0.127


all Bayesian thresholds are multiplied by a constant γ. This is easily understandable, since Bayesian thresholds are generally smaller than the universal one; on the contrary, WISDOW-Comp relies on higher threshold values, since it is able to recover the remaining significant information. In our empirical tests, using γ = 3, a 0.3 dB gain on average has been achieved with respect to the WISDOW-Comp.3 results reported in Table 2.

WISDOW-Comp.2 and WISDOW-Comp.3 can be combined into WISDOW-Comp.2+3, which achieves better results in terms of rate distortion, as shown in Table 2 and Fig. 16. The main properties of each variant of WISDOW-Comp are listed in Table 3.

Finally, the numerical implementation of WISDOW-Comp is easy and fast. In fact, it adds simple operations to the O(N) complexity required by hard-thresholding, where N is the length of the signal under study. These operations are due to the least squares estimation of the wavelet atoms and require O(ML) operations, where M (< N) is the number of maxima atoms of the wavelet decomposition and L is the wavelet filter length. WISDOW-Comp.3 does not increase the computational effort of WISDOW-Comp, since the decoder computes both location and amplitude of the amas for getting their significance map. On the contrary, WISDOW-Comp.2 implies a

Fig. 16. Lena image: PSNR versus bpp at different noise standard deviations; comparisons between WISDOW-Comp.2+3, BayesShrink + Compress [7], Complexity regularization + EQ [31], TAWS-Comp [44] and Critical rate + SPIHT [47].

Wiener filtering operation and the bilinear interpolation, that is, an additional 2·O(N/2^J).

6. Conclusions

An effective framework (WISDOW-Comp) for simultaneous denoising and compression has been presented. It allows us to outperform



Table 3
Description of the main differences between WISDOW-Comp and its variants WISDOW-Comp.2, WISDOW-Comp.3 and WISDOW-Comp.2+3

WISDOW-Comp        {h_j, v_j, d_j} estimated from a_j using B_j
WISDOW-Comp.2      Wiener filter and bilinear interpolation for a_j    Gain in rate and distortion
WISDOW-Comp.3      Encoding of the amas estimation error               Gain in distortion
WISDOW-Comp.2+3    Combined WISDOW-Comp.2 and WISDOW-Comp.3            Gain in rate and distortion

current comp-denoisers, thanks to properties such as context size independence, compaction of information, theoretical optimality with respect to hard-thresholding in the L2 metric and propagation of information through scales. In a sense, the first three make it a generalization of the philosophy of Chang et al., oriented to reducing context dependence as well as improving thresholding performances; on the other hand, the propagation of wavelet atoms through scales extends Walker's comp-denoiser, which is based on following single coefficients. As shown in the experimental results, WISDOW-Comp enables us to reach distortion results comparable to the best available comp-denoisers, with a considerable bit saving on the Lena image. Furthermore, the lack of parameters to be tuned, along with a low complexity, makes it very attractive in many application fields. Future research will investigate a more rigorous travelling law of wavelet atoms through scales. More precisely, the intra-scale dependencies of wavelet coefficients can be more rigorously formalized for predicting atom locations at different scales [3,4].

Appendix A. Algorithm 1

In this section the denoising algorithm based on the WISDOW representation is described. Let us consider an overcomplete wavelet decomposition of a noisy signal g, computed up to the Jth scale level,(5) i.e. {a_{2^J}, {d_{2^j}}_{1≤j≤J}}, where a_{2^J} is the coarsest approximation band and {d_{2^j}}_{1≤j≤J} are the detail bands. Let L be the support size of the selected wavelet, and let f be the denoised signal, whose overcomplete wavelet decomposition will be indicated with {a_{2^J}, {r_{2^j}}_{1≤j≤J}}. The bands {r_{2^j}}_{1≤j≤J} are initialized to zero.


(5) J is a priori fixed by the user; in general, 3 or 4 are the best choices.
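Algorithm 1 assumes a shift-invariant (undecimated) decomposition in which every band keeps the input length. The following is a minimal numpy sketch with Haar-like filters; the paper's actual wavelet filters may differ, and the telescoping relation d_j = a_{j-1} − a_j makes reconstruction trivial:

```python
import numpy as np

def atrous_haar(signal, levels):
    """Undecimated (a trous) Haar-like decomposition.
    Returns (approximation, [d_1, ..., d_J]); every band has the same
    length as the input, as required by the shift-invariant scheme."""
    a = np.asarray(signal, dtype=float)
    details = []
    for j in range(levels):
        # lowpass with holes: average a with its 2^j-shifted copy (periodic boundary)
        a_next = 0.5 * (a + np.roll(a, -2 ** j))
        details.append(a - a_next)   # detail band d_{j+1}
        a = a_next
    return a, details

def atrous_haar_inverse(approx, details):
    """Perfect reconstruction: the details telescope, so the signal is
    simply the approximation plus all detail bands."""
    return approx + sum(details)
```

Because d_j = a_{j-1} − a_j, summing all detail bands and adding the coarsest approximation recovers the input exactly.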

For each scale level j:

(1) Perform hard-thresholding on d_{2^j} using the Donoho universal threshold T [14]. Let d̄_{2^j} be the thresholded vector.

(2) Extract compact domains D_k from d̄_{2^j}, where each D_k is composed of non-zero neighboring coefficients of d̄_{2^j}. Let K be the number of found domains.

(3) For each D_k:
  - Compute the locations of its absolute modulus maxima and put them in the vector M_k.
  - Check interference. For each m_i ∈ M_k, single out the related estimation domain I_i such that:
    - if there are no minima in D_k, then I_i ≡ D_k;
    - if m_i has just one adjacent minimum in D_k, then I_i has a common extremum with D_k, while the other one corresponds to the found minimum location;
    - if m_i has two adjacent minima in D_k (one on its left and one on its right), they will be the extrema of the domain I_i.
    If |I_i| > L·2^{j-1}, then I_i = I_i ∩ [m_i − L·2^{j-1}, m_i + L·2^{j-1}].

(4) Let M = ∪_{k=1}^{K} M_k. Sort M in decreasing order with respect to the absolute value of the amplitudes.

(5) Least squares estimation. For each m_i ∈ M:
  - Estimate the associated slope a_i in the corresponding domain I_i as
    a_i = arg min_a Σ_{k∈I_i} |d̄_{2^j}(k) − a·2^{3j/2} F(k, 2^j)|²,
    i.e.
    a_i = Σ_{k∈I_i} d̄_{2^j}(k) F(k, 2^j) / (2^{3j/2} Σ_{k∈I_i} F(k, 2^j)²);
  - compute the atom a_i·2^{3j/2} F(k, 2^j), k ∈ [m_i − L·2^{j-1}, m_i + L·2^{j-1}];
  - add this atom to r_{2^j} and subtract it from d̄_{2^j}.
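The least-squares estimation in step (5) can be sketched as follows. The hat-shaped template used in the usage note is a hypothetical stand-in for the wavelet-dependent atom F(·, 2^j), and the estimation domain I_i is passed in explicitly:

```python
import numpy as np

def estimate_atom_amplitude(d, template, center, domain):
    """Least-squares amplitude of one atom over the estimation domain I_i:
    a_i = sum_k d(k) F(k - m_i) / sum_k F(k - m_i)^2
    (the 2^{3j/2} scale normalization is folded into the template here)."""
    ks = np.asarray(domain)
    f = template(ks - center)
    den = np.sum(f ** 2)
    return float(np.sum(d[ks] * f) / den) if den > 0 else 0.0

def subtract_atom(d, r, template, center, support):
    """Step (5): estimate the atom, add it to the recovered band r and
    subtract it from the residual detail band d (both modified in place)."""
    a = estimate_atom_amplitude(d, template, center, support)
    ks = np.asarray(support)
    atom = a * template(ks - center)
    r[ks] += atom
    d[ks] -= atom
    return a
```

For a noise-free atom the amplitude estimate is exact; in the noisy case it is the L2-optimal amplitude over I_i, e.g. with `hat = lambda t: np.maximum(0.0, 1.0 - np.abs(t) / 3.0)` as a toy template.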


Invert the overcomplete wavelet decomposition using {a_{2^J}, {r_{2^j}}_{1≤j≤J}} to obtain the denoised signal f.

Appendix B. Algorithm 2

The proposed 1-D denoising scheme can also be used for images. The aim is to retain just a single and easily recognizable atom shape. We then exploit the separability of 2-D wavelet bases [33] for adapting the 1-D scheme. Let ψ be the 1-D mother wavelet and φ the associated scaling function. The horizontal detail sub-band is the result of convolving the rows of the approximation band of the previous (finer) scale level by φ and the columns by ψ; vice versa for the vertical detail sub-band. For the diagonal one, both rows and columns are processed by ψ. Therefore, a simple way of adapting the proposed 1-D denoising scheme consists of independently dealing with the columns of the horizontal and diagonal sub-bands and the rows of the vertical one [5]. For example, the horizontal sub-band at scale level j is h_{2^j} = a_{2^{j−1}} * φ_row * ψ_col. WISDOW then locally models a_{2^{j−1}} * φ_row as the original signal, and the recovery process involves the atoms produced by the convolution between a_{2^{j−1}} * φ_row and ψ_col.

The 2-D algorithm is the following. Let us consider an overcomplete wavelet decomposition of a noisy image g(x, y), computed up to the Jth scale level, i.e. {a_{2^J}, {h_{2^j}, v_{2^j}, d_{2^j}}_{1≤j≤J}}, where a_{2^J} is the coarsest approximation band, while h_{2^j}, v_{2^j} and d_{2^j} are, respectively, the horizontal, vertical and diagonal detail bands.

For each scale level j:

(1) Perform hard-thresholding on {h_{2^j}, v_{2^j}, d_{2^j}} using the Donoho universal threshold T [14].
(2) For each column of h_{2^j}, perform Algorithm 1 to obtain the denoised band h̄_{2^j}.
(3) For each row of v_{2^j}, perform Algorithm 1 to obtain v̄_{2^j}.
(4) For each column of d_{2^j}, perform Algorithm 1 to obtain d̄_{2^j}.

Invert the overcomplete wavelet decomposition using {a_{2^J}, {h̄_{2^j}, v̄_{2^j}, d̄_{2^j}}_{1≤j≤J}} to obtain the denoised image f(x, y).

References

[1] E. Ayanoglu, On optimum quantization of noisy sources, IEEE Trans. Inform. Theory 36 (November 1990) 1450–1452.

[2] T. Berger, Rate Distortion Theory, Prentice-Hall, Englewood Cliffs, NJ, 1971.
[3] V. Bruni, B. Piccoli, D. Vitulano, Wavelet time-scale dependencies for signal and image compression, in: Proceedings of IEEE International Conference ISPA 2005, Zagreb, 2005, pp. 105–110.
[4] V. Bruni, B. Piccoli, D. Vitulano, Signal and image denoising via scale-space atoms, in: Proceedings of EUSIPCO 2006, Florence, September 2006.
[5] V. Bruni, D. Vitulano, Wavelet based signal denoising via simple singularities approximation, Signal Process. 86 (April 2006) 859–876.
[6] S.G. Chang, B. Yu, M. Vetterli, Spatially adaptive thresholding with context modeling for image denoising, IEEE Trans. Image Process. 9 (9) (September 2000) 1522–1530.
[7] S.G. Chang, B. Yu, M. Vetterli, Adaptive wavelet thresholding for image denoising and compression, IEEE Trans. Image Process. 9 (9) (September 2000) 1532–1546.
[8] H. Choi, R. Baraniuk, Analysis of wavelet-domain Wiener filters, in: Proceedings of the IEEE-SP International Symposium on Time-frequency and Time-scale Analysis, October 1998.
[9] I. Daubechies, Ten Lectures on Wavelets, SIAM, Philadelphia, 1992.
[10] M.N. Do, M. Vetterli, Contourlets: a new directional multiresolution image representation, in: Proceedings of the 36th Asilomar Conference on Signals, Systems and Computers, vol. 1, November 2002, pp. 497–501.
[11] R.L. Dobrushin, B.S. Tsybakov, Information transmission with additional noise, IEEE Trans. Inform. Theory IT-8 (1962) 293–304.
[12] D.L. Donoho, Denoising by soft-thresholding, IEEE Trans. Inform. Theory 41 (3) (May 1995).
[13] D.L. Donoho, Orthonormal ridgelets and linear singularities, SIAM J. Math. Anal. 31 (5) (2000) 1062–1099.
[14] D.L. Donoho, I.M. Johnstone, Ideal spatial adaptation via wavelet shrinkage, Biometrika 81 (1994) 425–455.
[15] P.L. Dragotti, M. Vetterli, Wavelet footprints: theory, algorithms and applications, IEEE Trans. Signal Process. 51 (5) (May 2003) 1306–1323.
[16] Y. Ephraim, R.M. Gray, A unified approach for encoding clean and noisy sources by means of waveform and autoregressive model vector quantization, IEEE Trans. Inform. Theory 34 (July 1988) 826–834.
[17] G. Fan, X. Xia, Image denoising using a local contextual hidden Markov model in the wavelet domain, IEEE Signal Process. Lett. 8 (5) (May 2001) 125–128.
[18] T. Fine, Optimum mean square quantization of a noisy input, IEEE Trans. Inform. Theory IT-11 (April 1965) 293–294.
[19] T.R. Fisher, J.D. Gibson, B. Koo, Estimation and noisy source coding, IEEE Trans. Acoustic Speech Signal Process. 38 (January 1990) 23–24.
[20] D. Geiger, F. Girosi, Parallel and deterministic algorithms from MRF's: surface reconstruction, IEEE Trans. Pattern Anal. Mach. Intell. 13 (May 1991) 401–412.
[21] S. Geman, D. Geman, Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-6 (November 1984) 721–741.

[22] J.D. Gibson, B.K.S.D. Gray, Filtering of colored noise for speech enhancement and coding, IEEE Trans. Signal Process. 39 (August 1991) 1732–1742.
[23] R.C. Gonzalez, R.E. Woods, Digital Image Processing, second ed., Prentice-Hall, Englewood Cliffs, NJ, 2002.
[24] S. Jaffard, Y. Meyer, R.D. Ryan, Wavelets: Tools for Science and Technology, SIAM, Philadelphia, 2001.
[25] A.K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall, Englewood Cliffs, NJ, 1989.
[26] JPEG software codec, Portable Res. Video Group, Stanford University, Stanford, CA, 1997.
[27] M. Kazubek, Wavelet domain image denoising by thresholding and Wiener filtering, IEEE Signal Process. Lett. 10 (11) (November 2003) 324–326.
[28] K. Konstantinos, B.K. Natarajan, An architecture for lossy compression of waveforms using piecewise-linear approximation, IEEE Trans. Signal Process. 42 (9) (September 1994).
[29] E. Le Pennec, S. Mallat, Non linear image approximation with bandelets, Technical Report, CMAP/Ecole Polytechnique, 2003.
[30] X. Lin, M.T. Orchard, Spatially adaptive image denoising under overcomplete expansion, in: Proceedings of the IEEE International Conference on Image Processing, September 2000, pp. 300–303.
[31] J. Liu, P. Moulin, Complexity-regularized image denoising, IEEE Trans. Image Process. 10 (6) (June 2001) 841–851.
[32] S. LoPresto, K. Ramchandran, T. Orchard, Image coding based on mixture modeling of wavelet coefficients and a fast estimation-quantization framework, in: Proceedings of the Data Compression Conference '97, Snowbird, UT, 1997, pp. 221–230.
[33] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, New York, 1998.
[34] S. Mallat, W.L. Hwang, Singularity detection and processing with wavelets, IEEE Trans. Inform. Theory 38 (2) (March 1992).
[35] M.K. Mihcak, I. Kozintsev, K. Ramchandran, P. Moulin, Low-complexity image denoising based on statistical modeling of wavelet coefficients, IEEE Signal Process. Lett. 6 (12) (December 1999) 300–303.
[36] P. Moulin, J. Liu, Statistical imaging and complexity regularization, IEEE Trans. Inform. Theory 46 (5) (August 2000) 1762–1777.


[37] B.K. Natarajan, Filtering random noise from deterministic signals via data compression, IEEE Trans. Signal Process. 43 (11) (November 1995) 2595–2605.
[38] J. Portilla, V. Strela, M. Wainwright, E. Simoncelli, Image denoising using scale mixtures of Gaussians in the wavelet domain, IEEE Trans. Image Process. 12 (11) (November 2003) 1338–1351.
[39] J. Rissanen, MDL denoising, IEEE Trans. Inform. Theory 46 (7) (November 2000).
[40] S.J. Sakrison, Source encoding in the presence of random disturbance, IEEE Trans. Inform. Theory IT-14 (January 1968) 165–167.
[41] S.D. Servetto, K. Ramchandran, M.T. Orchard, Image coding based on a morphological representation of wavelet data, IEEE Trans. Image Process. 8 (September 1999) 1161–1174.
[42] J.M. Shapiro, Embedded image coding using zero-trees of wavelet coefficients, IEEE Trans. Signal Process. 41 (December 1993) 3445–3462.
[43] J.L. Starck, E.J. Candes, D.L. Donoho, The curvelet transform for image denoising, IEEE Trans. Image Process. 11 (6) (June 2002) 670–684.
[44] J.S. Walker, Combined image compressor and denoiser based on tree-adapted wavelet shrinkage, Optical Engineering 41 (7) (July 2002) 1520–1527.
[45] H.S. Witsenhausen, Indirect rate distortion problems, IEEE Trans. Inform. Theory IT-26 (September 1980) 518–521.
[46] J.K. Wolf, J. Ziv, Transmission of noisy information to a noisy receiver with minimum distortion, IEEE Trans. Inform. Theory IT-16 (July 1970) 406–411.
[47] S. Yea, W.A. Pearlman, Critical encoding rate in combined denoising and compression, in: Proceedings of ICIP 2005, Genoa, Italy, September 2005.
[48] J. Zhang, The mean field theory in EM procedures for blind Markov random field image restoration, IEEE Trans. Image Process. 2 (January 1993) 27–40.
[49] H. Zhang, A. Nosratinia, R.O. Wells, Image denoising via wavelet-domain spatially adaptive FIR Wiener filtering, in: Proceedings of IEEE ICASSP 2000, vol. 5, Istanbul, Turkey, June 2000, pp. 2179–2182.