Signal Processing: Image Communication 17 (2002) 509–524
Adaptive and global optimization methods for weighted vector median filters Laurent Lucata,1, Pierre Siohanb,*,1, Dominique Barbac a Conexant, 2 avenue P. Piffault, 72100 Le Mans, France IRISA-INRIA, Campus de Beaulieu, 35042 Rennes Cedex, France c IRCCyN, 63, Rue Christian Pauc, La Chantrerie-BP 60601, 44306 Nantes cedex, France b
Received 4 December 2001; received in revised form 1 March 2002; accepted 8 March 2002
Abstract Weighted vector median filters (WVMF) are a powerful tool for the non-linear processing of multi-components signals. These filters are parametrized by a set of N weights and, in this paper, we propose two optimization techniques of these weights for colour image processing. The first one is an adaptive optimization of the N 1 peripheral weights of the filter mask. The major and more difficult task is to get a mathematical expression for the derivative of the WVMF output with respect to its weights; two approximations are proposed to measure this filter output sensitivity. The second optimization technique corresponds to a global optimization of the central weight alone, the value of which is determined, in a noise reduction context, by an analytical expression depending upon the mask size and the probability of occurrence of an impulsive noise. Both approaches are evaluated by simulations related to the denoising of textured, or natural, colour images, in the presence of impulsive noise. Furthermore, as they are complementary, they are also tested when used together. r 2002 Elsevier Science B.V. All rights reserved. Keywords: Colour images; Impulsive noise removing; Non-linear filters optimization; Vector filters; Weighted vector median filters
1. Introduction Non-linear filters using order statistics are useful tools in digital image and video processing. Among them, the median filter is well known for its impulsive noise removal and ability to preserve edges. For a median filter the only tunable parameter being its sliding window, shape or size, *Corresponding author. E-mail address:
[email protected] (P. Siohan). 1 Dr. Laurent Lucat and Dr. Pierre Siohan performed the initial part of this work while being at France Telecom R&D, 4, rue du Clos Courtel, 35512 Cesson-S!evign!e cedex, France.
it may be impossible to profile it properly for a given application. Hence, more flexible filters, such as the weighted median filters (WMFs) have been proposed: WMFs are parametrized by a set of N weights, where N is the number of samples included in the window; a full analysis of WMFs can be found in [21]. A judicious adjustment of the weights allows one, for instance, to reach a better preservation of desired patterns. This emphasizes the need to develop techniques for adequately choosing these parameters. A structural constraint approach [20] or an error gradient back-propagation algorithm [14,12] are efficient ways to adapt the WMF parameters. As the adaptive WVMFs
0923-5965/02/$ - see front matter r 2002 Elsevier Science B.V. All rights reserved. PII: S 0 9 2 3 - 5 9 6 5 ( 0 2 ) 0 0 0 2 3 - 1
510
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
allow to maintain a good compromise between impulse-like noise reduction and details preservation, they are often used for still and moving image processing [7]. Another class of non-linear order statistics filters is the L-filter [4], whose output is defined as a linear combination of the order statistics (i.e. the sorted samples). L-filters are also parametrized by a set of N weights, which allows the filter to reach various profiles, including the mean or the median behaviour. An optimization scheme, based on an error gradient back propagation has also been proposed by Pitas [13]. In order to efficiently process multi-component signals, such as colour images, satellite images, or motion vector fields in video coders, scalar nonlinear filters, such as median, WMF, L-filter, have been extended to the multi-variate case, leading to the vector median filter (VMF), the WVMF [3] and the multi-channel L-filter, respectively. These extensions involve specific definitions related to the ordering of multi-valued data [5]. Thus, the VMF and WVMF involve a reduced ordering, while multi-channel L-filters have been proposed using either a reduced [6,11] or the marginal [8] ordering. This reduced ordering can also be based on colour vector angles as fully described in [18]. Unlike for the scalar case, there is a lack of optimization procedures for vector non-linear filters. Indeed, the underlying mathematical model is less tractable than in the scalar case and leads to a difficult optimization problem. To the best of our knowledge, the only proposed optimization scheme is devoted to multi-channel L-filters, based on the reduced [11] and marginal [8] ordering, where the numerous NP2 parameters (P denotes the number of channels) are optimized according to a square error minimization, and under straight assumptions. Here we consider the WVMFs, which are efficient tools in colour image processing. Similar to the scalar WMF, they are parametrized by a set of N weights which have to be appropriately determined in order to reach high filtering performances. In [19], typical sets of weights are proposed, which allow a preservation of all stationary regions in image sequences. An adaptive WVMF is proposed in [2] to process motion
vector fields, where the weights are ‘‘judiciously’’ adjusted according to measures of the motion associated to each pixel included in the sliding window. An adaptive WVMF is also used in [1] to regularize optic flow estimates. However, no ‘‘optimal’’ WVMF has yet been derived. In this paper, we propose two new approaches to optimize the WVMF weights. The first one consists of an adaptive scheme, where the weights are regularly updated in order to adapt to the local signal content. The second approach can be seen as a global optimization, where the central weight of the filter mask is chosen so as to satisfy a statistical criterion; the underlying context considered here is the channel-independent impulsive noise removing in colour images. Finally, both methods are combined, providing a powerful processing tool, which is evaluated in the filtering of noise into colour textured images. Our paper is organized as follows. In Section 2, the adaptive optimization scheme is described; two versions are proposed, involving different expressions of the filter output sensitivity. Performance evaluation are based on a weight convergence test and on results of colour image denoising. The global optimization is described in Section 3; results of the proposed approach are compared to those of an empirical minimization of the mean absolute or quadratic error, conducted on a colour image, and the enhancement due to the WVMF optimization is illustrated through a comparison with the unoptimized filter results. Both approaches are then combined in Section 4 and conclusions are given in the last section.
1.1. Nomenclature X ¼ fxi gi¼1;y;N : xpi : xc : fwi gi¼1;y;N : wki :
set of input vector samples included in the filter’s window pth component of the input vector xi ð1pppPÞ central vector of the filter’s window set of scalar weights of the filter’s window value of the scalar weight wi at iteration k
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
sgnv ðaÞ ¼ ðsgnða1 Þ; y; sgnðaP ÞÞT : LMS: mL1E: mL2E: RGB: SSP: VML1 : VML2 : WVML1 : WVML2 :
sign function for a vector a
least mean squares mean L1 error mean L2 error red, green, blue sample selection probability vector median with the L1 norm vector median with the L2 norm weighted vector median with the L1 norm weighted vector median with the L2 norm
511
be represented by a set of M noise-free vectors d ¼ fdm gm¼1;y;M ; and by fym gm¼1;y;M the set of corresponding estimates at the filter output, the mean L1 error and P mean L2 error are given by ML1 E P ¼ ð1=MÞ M m¼1 jjdm ym jjL1 and ML2 E ¼ 2 ð1=MÞ M m¼1 jjdm ym jjL2 ; respectively. The optimization is conducted independently for each weight wi and is based on a gradient backpropagation algorithm [10]. Considering the ML2 E criterion and using the stochastic approximation which leads to an LMS-like algorithm (least mean squares), it can be summarized by wkþ1 ¼ wki þ 2mðd yÞ i
@y ; @wki
1pipN;
ð2Þ
2. Adaptive optimization of WVMF
where m is the adaptation step, and wki denotes the value of the weight wi at the kth iteration. When the ML1 E criterion is used, this leads to
2.1. WVMF optimization formulation
wkþ1 ¼ wki þ m sgnv ðd yÞ i
The WVMF is defined [3] by y ¼ arg min xj AX
N X
wi jjxj xi jjLp ;
ð1Þ
i¼1
where X ¼ fxi gi¼1;y;N is the set of input vector samples included in the window, fwi gi¼1;y;N are the associated weights, Lp is a norm (usually L1 or L2 ), and y is the filter output. We denote by P the number of components of these input and output vectors, with P being equal to 3 for RGB colour image processing. When the min in Eq. (1) is not unique, an additional rule is required to choose the WVMF output sample. In this paper, the filter mask is supposed being specified with respect to its shape and size, these parameters are not optimized as it may be the case for some scalar filters, as for instance for the rank order-based filter (ROBF) in [16]. The weights of the optimal WVMF are derived from the minimization of a cost function, which corresponds to the ML1 E or ML2 E criteria. The latter [19] are respective extensions to vector data of the well-known minimum absolute error (MAE) and minimum square error (MSE). That means, denoting by d the desired output, which may
@y ; @wki
1pipN;
ð3Þ
where sgnv ðaÞ ¼ ðsgnða1 Þ; y; sgnðaP ÞÞT ; sgn denoting the sign function and a ¼ ða1 ; y; aP ÞT : Thus, the central problem in both cases is finding, at each iteration k; a mathematical expression for the derivatives @y=@wki ; 1pipN: When the WVMF output is uniquely defined (i.e. there is a single sample corresponding to the min in Eq. (1)), it can be proved that the filter output is invariant to an infinitesimal variation of its weights. Hence, the mathematical derivative @y=@wki is null for each i; therefore it does not provide relevant information on the filter output sensitivity. It is worth mentioning that the approach based on the implicit function theorem, which has been used in the scalar case [16,17], is no longer valid in this vector case. In order to get suitable expressions for the derivatives, two solutions are proposed, whose validity is illustrated afterwards by a convergence test and filtering results on colour images. Note also that, as the optimization is carried out independently for each weight, and as the computation principle is the same at each iteration, in order to alleviate the notation the lower and upper indexes i and k; respectively, are omitted in the description of the two solutions we propose.
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
512
2.2. First solution: local approximation for @y=@w
f ’(xj )
(D 5 )
(D 4 ) = (D j o) (D 1 )
In order to get a suitable expression, we consider, at first, the derivative as a local approximation of the output increasing rate, leading to the following formulation: @y dy ; ð4Þ ¼ lim @w s1 jdwj-0 dw
(D 2 ) (D 3 ) = (D i )
dya0
where the subscript s1 corresponds to solution 1. Let us look for the smallest increment dw of the weight w such that dy does not become null. Hence, the output of the WVMF switches from the sample xj0 to a sample xj1 due to a weight increment equal to dw j1 : Using these notations, the derivative approximation can be written as x j xj @y ¼ 1 j 0: ð5Þ @w s1 dw 1 The problem is now to find the index j1 and the weight increment dw j1 : Let us define f ðxj Þ ¼ PN i¼1 wi jjxj xi jj: The WVMF expression in Eq. (1) can be rewritten as y ¼ arg minxj AX f ðxj Þ: The partial derivative of f ðxj Þ with respect to w is @f ðxj Þ ¼ jjxj xjj: ð6Þ @w Denoting by f 0 ðxj Þ the aggregate weighted distance f ðxj Þ after a weight incrementation dw; we get f 0 ðxj Þ ¼ f ðxj Þ þ dwjjxj xjj:
ð7Þ
Hence, it can be seen that ff 0 ðxj Þgj¼1;y;N corresponds to a set of affine functions of the weight increment dw: In Fig. 1, these affine functions are represented through the set of N straight lines fðDj Þgj¼1;y;N : The intersection points of each ðDj Þ with ðDj0 Þ are the key issue of the problem; they are computed for 1pjpN (and for each weight wi ) through f 0 ðxj Þ ¼ f 0 ðxj0 Þ 3 dw j ¼
f ðxj Þ f ðxj0 Þ : jjxj0 xjj jjxj xjj
ð8Þ
Thus, we know each minimal weight increment dw j of the weight w implying f ðxj Þ becomes lower than f ðxj0 Þ: at this point, the straight line ðDj Þ becomes under ðDj0 Þ: Hence, dw j1 ¼ minj jdw j j
δ w 5i
0
δ w 2i
δ w 3i
δ w 1i
δ wi
Fig. 1. Evolution of the aggregate weighted distances according to the weight increment wi ð1pipNÞ:
corresponds to the minimal weight increment (or decrement) of w which allows the WVMF output to switch from the sample xj0 to another sample, namely xj1 ; corresponding to the local sensitivity. In the example presented in Fig. 1, j0 ¼ 4 and j1 ¼ 2: 2.3. Second solution: formulation of @y=@w based on a mean sensitivity The objective of this alternative is to estimate a more global behaviour of the filter when a weight is modified. This global analysis is motivated by the fact that a gradient approach, such as the one studied in Section 2.2, is confined to the search of a local minimum of the cost function; however, a local variation of the WVMF output due to a small weight increment, or decrement, may not be representative of the filter behaviour for larger fluctuations. The alternative solution principle is to take into account the increments dw j ; jaj0 ; of the weight w which are necessary to make the WVMF output switch from the sample xj0 to each sample xjaj0 ; when these switches are possible. These required weight increments can be seen as critical values in the filter output evolution, in the way that each one indicates a new direction (in the vector component space) of the WVMF
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
513
output variation. Then, the global behaviour may be illustrated by the mean WVMF output increasing rate, computed from all critical points. This leads to the following expression of the filter output derivative: X xj xj dy 1 0 ¼ ; ð9Þ dw s2 cardðSÞ x AS dw j
reference line (Dj n ) and the others ðDj Þ is obtained in the same way as in Section 2.2, using Eq. (8) where the index j0 is then replaced with the current index j n : In the example of Fig. 2, the mean WVMF output increasing rate is then given by ! ! ! !! dy 1 X4 X5 X4 X2 X4 X6 X4 X3 ¼ þ þ þ ; dw s2 4 dw5 dw2 dw6 dw3
where S denotes the whole set of potential output vectors involved when increasing or decreasing the weight w; and the subscript s2 corresponds to solution 2. This second solution is expected to be less reactive than the first one, but the weights evolution may be more regular because of the mean behaviour. The problem is now to find the location of the critical points. These points correspond to each weight increment implying that a given f ðxj Þ becomes lower than all other f ðxi Þ (the straight line (Dj ) is then under the other lines (Di )). This leads to the search of a convex curve, such as the one illustrated in Fig. 2. The intersection points are iteratively computed; this search starts with the straight line ðDj0 Þ and is independently conducted for negative versus positive weight increments. At each step, the computation of the intersections between the current
! ! ! where Xi Xj ¼ OXj OXi ¼ xj xi : As a matter of comparison, the solution of Section 2.2 would ! result in ðdy=dwÞs1 ¼ X4 X2 =dw2 ; dw2 being the smallest increment of the weight w implying a WVMF output switch. Note that, in order to maintain the curve convexity, when a multi-intersection occurs, such as the point I4;5 in Fig. 2, the new reference line is the one having the larger slope for negative dw and the lower one for positive dw:
j
f ’(x j)
(D5 )
(D7 )
(D4 ) = (Djo ) (D1 ) (D2 )
I 6,3
(D6 )
(D3 ) = (Di )
I 2,6 I 4,2
I 4,5
δw5i
0 δw2 i
δw6i
δw3i
δwi
Fig. 2. Iterative search of the intersection points for the second solution and for the weight wi ð1pipNÞ:
2.4. Parameter convergence test The aim of the test is to study the evolution of the filter’s parameters. However, as the proposed optimization aims to regularly adapt to any given signal content, in general, no convergence to a given set of weights can be observed. Nevertheless, if for a given input signal such an optimal set of weights exists, it is of great interest to assess whether the parameters can converge to desired values, using the proposed adaptation schemes. Thus, we consider the test illustrated in Fig. 3. The input signal consists of a one-dimensional (1D) fragment (70 pixels) extracted from a textured colour image, and periodically duplicated. The adaptive and the target filters are 1D WVMF defined using the L1 norm as distance metric and such that N ¼ 5: The set of target weights is f1; 2; 3; 4; 5g and the optimization of the adative WVMF is based on solution 1, with a step m; in Eq. (2), set to 107 : In Fig. 4, we give the results of the convergence test. It can be observed that the weights converge to values close to the normalized target weights. At the convergence, which is reached in this example after 620 iterations, the output y is equal to the desired output d: Other simulations with different sets of target weights, as well as using solution 2, lead to a
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
514
WVMF with target weights
x
d
+
-
y
WVMF with adaptive weights
input signal
output signal
Fig. 3. Schematic illustration of the convergence test.
0.4 0.35
normalized weights
Weight 5
0.3 Weight 4
0.25 0.2
Weight 3
0.15 Weight 2
0.1 Weight 1
0.05 0
100 200 300 400 500 600 700 800 900 1000
iterations
Fig. 4. Evolution of the weights.
weight convergence, with again y ¼ d; for length pattern of up to 70 pixels. This ‘‘positive’’ test illustrates the validity of our optimization approach.
optimization solution number 2. This example emphasizes the benefit of using adaptive weights for a better preservation of fine details. Other simulations, using solution 1 and the L2 norm as distance metric have also shown the significant improvements due to our adaptive optimization. According to our simulations, and as shown with the ML2 E measures presented in Table 1, the WVML2 filter seems to work better with solution 1 while the WVML1 filter performs better using solution 2. In the present case of impulsive noise, the optimization can be performed without any reference to the noise-free image; the corrupted image is then considered as the reference. Further simulations have shown that the resulting degradation of the performances is weak, with regard to the gain reached with the adaptive filtering. This is due to the fact that, at least up to a probability of the impulse noise such that p ¼ 0:2; the ‘‘corrupted’’ adaptation steps where the reference data correspond to impulses are not correlated among them, and thus do not influence the global evolution of the weights [9]. Finally, we have to note that the central pixel of the mask, denoted wc ; has been ignored here ðwc ¼ 0Þ; because a positive wc quickly involves a high value of the central weight leading to an identity filter. Therefore, an appropriate specification of wc is expected to further improve the filtering performances.
2.5. Application to colour image filtering In this section, we evaluate the performance of the adaptive WVMF for removing impulsive noise. The input image is a textured image embedded in a R–G–B channel-independent impulsive noise. The objective of the filtering is to remove the impulses while preserving the noise-free patterns; in this way, the WVMF weights should adapt to these patterns. In Fig. 5, we present illustrations of original noisy (channel-independent impulsive noise of probability p ¼ 0:2) images and the respective results of the adaptive WVMF and VMF,2 using a 5 5 filter mask, the L1 norm and 2 Which is equivalent to an unoptimized WVMF whose weights are equal to 1.
3. Global optimization 3.1. Global approaches In this section, rather than adaptively optimizing the parameters of the WVMF filter in order to be locally adapted to the signal, we look for fixed parameters in adequation to some global constant features of the signal. The proposed approach is a statistical one. It takes into account the fact that the weight values, wj ; naturally have a strong impact on the filter output y; and, consequently, on the probabilities Pr½y ¼ xj ; usually denoted SSPðxj Þ (sample selection probability) in the scalar case. This terminology can be straightforwardly
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
(b) VMF 5×5
(a) Noisy images ( p = 0.2)
515
(c) Adaptive WVML1 5×5
Fig. 5. Illustration of the benefit due to the WVML1 adaptation (solution 2) for texture 1 (at the top), texture 2 (in the middle) and texture 3 (at the bottom).
Table 1 ML2 E measures for the VMF and the optimized adaptive WVMF defined with L1 or L2 Norm for the filters
VMF
WVMF solution 1
WVMF solution 2
Texture 1
L1 L2
1123 1162
995 880
865 923
Texture 2
L1 L2
2229 2268
2153 1866
1906 1956
extended to the vector case. If the desired SSP values are known, and if a mathematical formulation, or a conversion algorithm, is available to derive the weights from the SSPs, then we are able
to express, or compute, the optimal weights. Such an approach has been used in the scalar case, for instance to determine the parameters of the WMF and weighted order statistics filter (WOSF) [15]. It involves a direct account of the SSP, the desired image being assumed to be known. However, even if a similar approach could be investigated with WVMF its complexity would probably be very high. In the following, we prefer to focus on a particular case related to the statistical optimization of the central weight, wc ; according to the SSP of the central sample xc : The first and straightforward advantage is to have a single weight to optimize and a second one is that an estimation of the SSP is no longer required, which also means that a priori knowledge of the desired image is not necessary.
516
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
3.2. Statistical optimization of the central weight 3.2.1. General formulation In the context of impulsive noise reduction for colour images, our goal is to find an optimal value for the central weight of the WVMF, wc ; or for its P normalized version, wnc ¼ wc = N i¼1 wi : This computation naturally takes into account all the other weights included in the filter mask. Indeed, in the case of an impulsive noise, the central sample xc plays a particularly important role: if it is not affected by an impulse, it is desirable that the filter output takes its value, therefore it is then required that the central weight wc be high. On the contrary, when the central pixel is affected by an impulse, it is better if the central weight is small. Assuming no impulse detection is carried out, which is the case in our study, we have no a priori knowledge to tell us if the target pixel is noisy or not. In fact the only information, which is assumed to be known, is statistical: it is the probability of occurrence of noise, denoted by p: The ‘‘optimal’’ choice for wc has to be based on the input data and, especially, on the parameter p: Thus, this problem is directly related to the computation of SSPðxc Þ: We have to find the adequate relation between SSPðxc Þ and p; such as SSPðxc Þ ¼ probability that xc be not noisy: ð10Þ Otherwise said, the optimization has to find the value of wc providing an SSPðxc Þ such that Eq. (10) is satisfied. It is assumed that the impulsive noise is scalar and independent for each one of the three components of the colour images. The probability that the kth component is not noisy is ð1 pÞ: Therefore, the vector xc will not be noisy if each one of the component is not; the probability of this event is ð1 pÞ3 : The second and harder remaining problem consists of finding the left member in Eq. (10), i.e. the probability P ¼ SSPðxc Þ; as a function of the weights. 3.2.2. Computation of the SSPðxc Þ In a probabilistic framework the input samples fxi gi¼1;y;N are realizations of a random variable
X : The weights associated to them are deterministic parameters. The SSPðxc Þ can then be formulated as P ¼ Pr½xc ¼ WVMFðfxi ; wi gi¼1;y;N Þ : Taking into account the definition of the WVMF, see Eq. (1), it can be rewritten as " # N X P ¼ Pr xc ¼ arg min wi jjxj xi jj xj
¼ Pr
" N X
i¼1
wi jjxc xi jjo
i¼1
N X
# wi jjxj xi jj; 8jac :
i¼1
If we want to be able to go beyond this first step, simplifying assumptions cannot be avoided. The samples fxj gj¼1;y;N are supposed to be independent, although some strong relations may exist between the samples included in the mask. Furthermore, we also assume that the measures P f N w jjx x j i jjgj¼1;y;N are independent and we i¼1 i will see later on that, even if such a strong assumption is required, it nevertheless leads to useful results. According to these assumptions, we can express P as follows: " # N N N Y X X P¼ Pr wi jjxc xi jjo wi jjxj xi jj : j¼1;jac
i¼1
i¼1
For xj axc ; P can be written as " PN N Y i¼1ðiaj;cÞ wi ðjjxc xi jj jjxj xi jjÞ Pr P¼ jjxc xj jj j¼1ðjacÞ # owc wj :
ð11Þ
The problem is then to find the probability density function (pdf) of the random variable Z given by PN i¼1ðiaj;cÞ wi ðjjxc xi jj jjxj xi jjÞ : Z¼ jjxc xj jj The derivation of this density function can be decomposed in four steps: 1. Find the pdf of jjxj xi jj: This pdf naturally depends on the pdf of the samples but is also related to the norm under consideration. The substraction ðxj xi Þ and the norm computation imply convolutions of the pdf of the
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
1 0 exp Kn2 1 1 þ z2 =2W 2 pffiffiffi qðzÞ ¼ 2pW " 3=2 pffiffiffi z2 0 Kn p 1 þ 2W 2 1=2 ! z2 0 erf Kn 1 þ 2W 2 1 z2 þ 1þ 2W 2 1 !# z2 02 exp Kn 1 þ : 2W 2
random variable X : Consequently, the pdf for jjxj xi jj is close to a Gaussian distribution. Later on we will assume that jjxj xi jj has a Gaussian distribution Nðmn ; sn Þ; where the subscript n is related to the selected norm. The validity of this assumption, justified in Appendix A, can be visually checked in Fig. 6. 2. Deduce the pdf of jjxc xi jj jjxj xi jj: The samples being assumed independent we have to consider the substraction of two random variables (X1 X2 ), where X1 and X2 admit pdfs given by N1 ðmn ; sn Þ and N2 ðmn ; sn Þ; respectively. ðX1 X2 Þ is therefore another random variable with a density function given by the convolution of N1 and N2 ; which p leads again to a normal distribution: ffiffiffi N3 ð0; 2sn Þ: P 3. Calculate the pdf of wi ðjjxc xi jj jjxj xi jjÞ: Each element in the summation correspondspto ffiffiffi a random variable with distribution Ni ð0; 2wi sn Þ: To obtain the pdf of the sum we have to convolve all Ni distributions. This ‘‘multiconvolution’’ again leads to a normal pffiffiffi qffiffiffiffiffiffiffiffiffiffiffiffi P 2ffi distribution: Nð0; 2sn i wi Þ: 4. Find the pdf of q the quotient. ffiffiffiffiffiffiffiffiffiffiffiffi pffiffiffi P 2ffi 0 Denoting W ¼ i wi and Kn ¼ mn = 2sn ; it
x 10
_3
7
3.5
x 10
_3
6
3
5
2.5 4 2 3
1.5
2
1
1
0.5 0
0
100
200
300
400
500
(a) L1 Norm
600
700
800
ð12Þ
The general form of the probability expressed in Eq. (11) is such as Pr½Zowc wj : Denoting by QðzÞ the cumulative distribution function (cdf) of the random variable Z; we find that the corresponding probability is Qðwc wj Þ: To find Q we have to integrate qðzÞ: But before proceeding to this numerical integration we have to determine Kn0 and, therefore, the parameters mn and sn corresponding to the Gaussian-like distribution of jjxj xi jj; these equivalent parameters are also given in Appendix A.
can be shown [9] that the pdf is
4
517
0 0
50 100 150 200 250 300 350 400 450 500
(b) L2 Norm
Fig. 6. Theoretical distribution (solid line) and ‘‘equivalent Gaussian distribution’’ (dotted line) for jjxj xi jj:
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
518
3.2.3. Central weight of the WVMF and probability of occurrence of the impulsive noise As indicated in Section 3.2.1 we are looking for a single parameter, the central weight, being optimal and determined once for all with respect to global characteristics of the signal, more precisely according to the occurrence probability of the impulses in the input image to be filtered. On the other hand, taking into account the explanations given in Section 3.2.2, Eq. (11) can be rewritten as P ¼ ð1 pÞ3 ¼
N Y
Qðwc wj Þ:
ð13Þ
j¼1ðjacÞ
The computation of P is therefore dependent on the values of the peripheral weights wj ðjacÞ: In order to simplify the use of the WVMF filter, we propose computing the function Q for a set of standard weights fwjac g: For example, we impose that all the weights be equal to 1. Nevertheless, as ffi qffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi PN 2 qðzÞ depends on W ¼ i¼1ðiaj;cÞ wi ; and therefore on N; a computation of QðzÞ has to be done for each desired value of N: With this standard set of weights, qðzÞ and QðzÞ are independent from j; so we can rewrite Eq. (13) as P ¼ ð1 pÞ3 ¼ ðQðwc 1ÞÞN1 : This formulation allows us to express the central weight value, wc ; or the normalized central weight wnc (the unitary normalization consists of P N n j¼1 wj ¼ 1), as a function of the probability occurrence of the impulses wc ¼ 1 þ Q1 ðð1 pÞ3=ðN1Þ Þ or wnc ¼
1 þ Q1 ðð1 pÞ3=ðN1Þ Þ N þ Q1 ðð1 pÞ3=ðN1Þ Þ
;
ð14Þ
where the function Q simply depends on Kn0 (according to the WVMF norm) and N: In order to evaluate the relevance of the proposed method for the determination of the WVMF central weight, simulations have been carried out. For that we used a textured image and distorted it with an impulsive independent noise on each component R, G and B; the probability of occurrence of the impulses were p ¼ 0:05; 0:1; 0:2 and 0:5; respectively; for each value of
p different values of the normalized central weight have been tested and the ML1 E and ML2 E errors have been computed. These simulations allowed us to find the optimal empirical weights with an accuracy of 102 : For each value of p under consideration we have compared these optimal empirical weights with the ones given by our method (cf. Eq. (14)) for masks with sizes 3 3; 5 5; 7 7 and for the two norms L1 and L2 of the WVMF filter. The results are synthesized in Fig. 7. As in the previous section the filter size and shape are assumed to be known. Naturally, if this size is increased too much, we get a loss in fine image details as well as higher computation time. Based on our experiments [9], it seems that a size of 5 5 leads to a good tradeoff. These simulations show that the proposed technique allows a good estimation of the central weight, taking into account the probability of occurrence of the impulses. Nevertheless, it is also clear that our method is less efficient when p-0 ðpo0:05 or 0:1). Indeed our model does not take into account a typical feature of the WVMF, which is that a normalized weight wni ; so that wni X0:5; implies that the output is necessarily xi : Therefore, the optimal empirical weights converge towards 0.5 while the curve becomes close to 1. It is worth noting that in such a situation one has rather to proceed in two steps in order to get rid of the impulses: detecting them in a first step and filtering them in the second one. 3.3. Performances of the WVMF filter with optimized central weight We now want to evaluate the improvement of the quality of the filtered image using a WVMF with a central weight optimized according to the technique proposed in Section 3.2. All the peripheral weights are supposed to be fixed and all equal. Our test images are distorted by an impulsiveindependent noise with probability p ¼ 0:2: Optimizing the central weight, as explained above, provides a very significant gain with regard to the filtering performances, for both norms L1 and L2 : In Fig. 8 we can see that, for a texture, in comparison with a VML1 ; the WVML1 with
1
1
0.9
0.9
0.8
0.8
Normalized central weight
Normalized central weight
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
0.7 0.6 0.5 0.4 0.3 0.2
0.7 0.6 0.5 0.4 0.3 0.2
0.1
0.1
0
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Probability of the impulsive noise (for each component)
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Probability of the impulsive noise (for each component)
(d) WVML2 3×3. 1
0.9
0.9
0.8
0.8
Normalized central weight
Normalized central weight
(a) WVML1 3×3. 1
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Probability of the impulsive noise (on each component)
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Probability of the impulsive noise (on each component)
(e) WVML2 5×5.
1
1
0.9
0.9
0.8
0.8
Normalized central weight
Normalized central weight
(b) WVML1 5×5.
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0
519
0.7 0.6 0.5 0.4 0.3 0.2 0.1
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Probability of the impulsive noise (on each component)
(c) WVML1 7×7.
0
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Probability of the impulsive noise (on each component)
(f) WVML2 7×7.
Fig. 7. Normalized central weight for our method (solid line: -), or empirically optimized in order to minimize the ML1 EðþÞ and ML2 E (o) criteria, for the WVML1 [a,b,c] and WVML2 [d,e,f], respectively.
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
520
(a) VML1 5×5.
(b) WVML1 5×5.
Fig. 8. Comparison of the VML1 and WVML1 with optimized central weight and equal peripheral weights. Table 2 ML2 E errors for the VML1 and WVML1 with optimized central weight and identical peripheral weights (mask sizes 5 5); input noisy images with p ¼ 0:2
improvement can also be obtained for natural images.
Filtered image
VML1
WVML1
4. Adaptive and global optimization of the WVMF
Texture 1 Texture 2 Texture 3
1123 2229 1639
595 1167 864
The two optimization schemes presented in the two previous sections may be complementary. The adaptive technique presented in Section 2 consists of a local optimization of the peripheral weights while, on the contrary, the global statistical optimization in Section 3 only deals with the central weight. In the latter section, we have seen that the computation of the function QðzÞ was only carried out for a standard set of identical and constant peripheral weights. However, if we have
identical peripheral weights and optimized central weight leads to a clearly better compromise in terms of noise reduction/detail preservation. The corresponding ML2 E measures are reported in Table 2. In Fig. 9 another illustration is given, with the image ‘‘Boat’’, showing that a significant
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
no a priori knowledge of these weights, it is, in theory, necessary to achieve the numerical integration of qðzÞ for each value of wjac ; and, therefore, for each new filtering operation if these weights are not constant. Even if it is possible, that would be very time consuming from a computational point of view. To combine adaptive and global optimizations it is in fact simpler, and just as efficient as the results will show, to use a unique function qðzÞ for given N and p: Therefore, the main difference with the adaptive method presented in Section 2 is that instead of being equal to 0 the central weight is fixed at a value which is the result of a global optimization. Then, at each iteration of the adaptive algorithm, the peripheral weights are locally optimized, and normalized afterwards to take account of the central weight value. The simulation results show a very significant improvement concerning the quality of the filtered images for the two adaptive methods described in Section 2 and for the two types of vector filters WVML1 and WVML2 : It also appears that for the two optimization methods of the peripheral weights, the WVML1 provides better results, in terms of visual quality and ML2 E measures, than the WVML2 ; even if the noisy image is used as a reference to compute the error at the filter output. For the overall set of possible situations, in which the central weight is optimally computed, the WVML1 filter provides the better results, with solution 1 if the reference image is noisy, and otherwise with solution 2. In Fig. 10 we illustrate, with the WVML1 and using solution 2, the interest of optimally specifying the central weight. Indeed, the optimal central weight leads to a picture having a clearly better visual quality. In Table 3 we have reported the whole set of results for the case of non-noisy reference. These numerical results, firstly, show the significant advantage of optimized WVMF over VMF, the signal to noise improvement being of, at least, 2:4 dB: On the other hand, it also appears that there are significant differences, up to nearly 1 dB; according to the filter, WVML1 performing better than WVML2 ; and to the adaptive optimization solution, 1 being better than 2, used.
521
Original noisy "Boat" image ( p = 0.2).
Image boat filtered by a 5×5 VMF.
Image boat filtered by a 5×5 WVML1 with optimal central weight.
Fig. 9. Comparison of the VMF and of the WVMF with optimal central weight for impulsive noise filtering.
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
522
(a) Central weight equal to 0
(b) Optimized central weight
Fig. 10. Comparison between WVML1 (solution 2, mask size 5 5; noise free reference) with and without central weight.
Table 3 Comparison of the WVML1 and WVML2 with globally optimized central weight and peripheral weight optimized according to solution 1 or solution 2 (mask size 5 5; input noisy images with p ¼ 0:2; noise-free references)
Solution 1
Filtered image
WVML1
WVML2
Texture 1
ML2 E ¼ 570 SNRi=VML1 ¼ 2:94 dB ML2 E ¼ 1127 SNRi=VML1 ¼ 2:96 dB ML2 E ¼ 517 SNRi=VML1 ¼ 3:37 dB ML2 E ¼ 1042 SNRi=VML1 ¼ 3:30 dB
ML2 E ¼ 669 SNRi=VML2 ¼ 2:40 dB ML2 E ¼ 1248 SNRi=VML2 ¼ 2:59 dB ML2 E ¼ 616 SNRi=VML2 ¼ 2:76 dB ML2 E ¼ 1166 SNRi=VML2 ¼ 2:89 dB
Texture 2 Solution 2
Texture 1 Texture 2
5. Conclusion If various optimization techniques are available for non-linear scalar filters, it is not the case for their vector counterpart. In this paper, we have proposed two optimization techniques to compute the weights of the WVMF. The first technique provides an optimization of the WVMF peripheral weights, which takes into account the non-stationary behaviour of the signal. Two
variants of an adaptive algorithm minimizing the ML2 E criterion have been tested for impulse noise reduction. The results obtained with textured images have shown that the noise could be strongly reduced while keeping a good preservation of fine details, even when using a noisy reference. According to our simulations WVML2 filters gave better results for the first variant of our adaptive algorithm (solution 1) and WVML1 filters performed better when using solution 2.
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
For this first set of optimization techniques the filter central weight was assumed to be zero. In the second approach our purpose was to achieve a global optimization of the central weight, all the peripheral weights being fixed and equal; the optimization issue being to remove as much as possible an additive impulsive noise affecting the colour images. The principle of our statistical optimization consisted of constraining the probability that the filter output corresponding to the central sample of the filter mask be equal to the probability that this sample be not distorted by an impulse. Using some simplifying assumptions, we were able to provide an analytical expression of the optimal central weight as a function of the probability of occurrence of the noise. Our analytical approach has been validated by many simulation results, showing also the efficiency of the method for denoising colour images, textures and natural images as well. These two techniques, the local and the global ones, can be combined together and we finally showed that their respective benefits can be added to each other.
Appendix A. Parameters of the equivalent Gaussian distributions First of all, let us examine the case of the L1 norm. It can be expressed as jjxj xi jj ¼
P X
jxkj xki j;
k¼1
where xki denotes the kth component of the vector xi and P the number of components. In the following, we focus on colour image processing and therefore we set P ¼ 3: We assume that xki is uniformly distributed in the range ½0; 255 for images coded using 8 bits per component and, therefore, 24 bits for colour images. As the original image is distorted by a uniformly distributed impulse noise, this assumption becomes still more valid. Afterwards, to determine the parameters of a Gaussian distribution being as close as possible to the density function, jjxj xi jjL1 ; we use the
523
logarithmic version of the maximum likelihood criterion. Let us consider a collection of samples denoted fui gi¼1;y;N ; and let us assume that this collection is characterized by a Gaussian distribution. The probability that its parametrisation be given by ðm; sÞ is characterized by the ‘‘log-likelihood’’ pffiffiffiffiffiffi log Lðm; sÞ ¼ N logð 2psÞ
N 1 X ðui mÞ2 : 2s2 i¼1
After a discretization of the continuous distribution jjxj xi jjL1 ; we get a new collection fui gi¼1;y;N containing nðkÞ samples equal to k; the nðkÞ being proportional to the density function. The ‘‘log-likelihood’’ can therefore be rewritten as log Lðm; sÞ pffiffiffiffiffiffi 1 X ¼ N logð 2psÞ 2 nðkÞðk mÞ2 ; 2s k X with nðkÞ ¼ N: k
Then it can be shown that the maximum likelihood is reached for the following set of parameters: sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi P P 2 knðkÞ k k nðkÞ m¼ k and s ¼ m2 : N N Let m1 and s1 denote their corresponding numerical values, respectively, we find m1 ¼ 255:3 and s1 ¼ 103:8; which leads to K10 ¼ 1:7: Let us now consider the case of the L2 norm rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi XP ðjxkj xki jÞ2 : jjxj xi jj ¼ k¼1 Again we note that the resulting distribution for jjxj xi jjL2 is close to a Gaussian distribution, even though our initial assumption is one of a scalar uniform distribution for fxki g: jjxj xi jjL2 will therefore be supposed to follow a Gaussian distribution and its parameters will be computed according to the maximum likelihood criterion. The parameters found for the L2 norm are m2 ¼ 169:3 and s2 ¼ 63:4; leading to K20 ¼ 1:9: In Fig. 6, we show the ‘‘equivalent Gaussian distributions’’ as defined previously in the case of
524
L. Lucat et al. / Signal Processing: Image Communication 17 (2002) 509–524
the L1 and L2 norms. It is also worth noting that if, instead of a uniform distribution for xki ; we had chosen any ‘‘bell-shaped’’ distribution the final result would be still closer to the Gaussian.
References [1] L. Alparone, M. Barni, F. Bartolini, R. Caldelli, Regularization of optic flow estimates by means of weighted vector median filtering, IEEE Trans. Image Process. 8 (10) (1999) 1462–1467. [2] L. Alparone, M. Barni, F. Bartolini, V. Cappellini, Adaptively weighted vector-median filters for motion-fields smoothing, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 96), Vol. 4, 1996, pp. 2267–2270. [3] J. Astola, P. Haavisto, Y. Neuvo, Vector median filters, in: Proceedings of the IEEE, Vol. 78-4, 1990, pp. 678–689. [4] A.C. Bovik, T.S. Huang, D.C. Munson, A generalization of median filtering using linear combinations of order statistics, IEEE Trans. Acoust. Speech Signal Process. 31 (6) (1983) 1342–1349. [5] R.C. Hardie, G. Arce, Ranking in Rp and its use in multivariate image processing, IEEE Trans. Circuits Systems Video Technol. 1 (2) (1991) 197–209. [6] S.A. Kassam, M. Aburdene, Multivariate median filters and their extensions, in: Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS 91), 1991, pp. 85–88. [7] J.-S. Kim, H.W. Park, Adaptive 3-D median filtering for restoration of an image corrupted by impulsive noise, Signal Processing: Image Communication 16 (7) (2001) 657–668. [8] C. Kotropoulos, I. Pitas, Multichannel L-filters based on marginal data ordering, IEEE Trans. Signal Process. 42 (10) (1994) 2581–2595. [9] L. Lucat, Analyse et synth"ese d’une famille de filtres nonlin!eaires: Application au traitement d’images couleurs, Ph.D. Thesis, Universit!e de Nantes, France, no d’ordre ED 82-323, September 1998.
[10] L. Lucat, P. Siohan, Adaptive weighted vector median filter using a gradient algorithm, in: Proceedings of the European Signal Processing Conference (EUSIPCO 98), Vol. 2, 1998, pp. 777–780. [11] N. Nikolaidis, I. Pitas, Multichannel L-filter based on reduced ordering, IEEE Trans. Circuits Systems Video Technol. 6 (5) (1996) 470–482. [12] J.L. Paredes, G.R. Arce, An optimization algorithm for recursive weighted median filters with realvalued weights, in: Proceedings of the IEEE International Conference on Image Processing, Vol. 1, 2000, pp. 892–895. [13] I. Pitas, S. Vougioukas, LMS order statistic filter adaptation by backpropagation, Signal Processing 25 (3) (1991) 319–335. [14] M. Ropert, D. Pel!e, Synthesis of adaptive weighted order statistics filters with gradient algorithms and application to image processing, in: Proceedings of the IEEE International Conference on Image Processing (ICIP 94), Vol. 2, 1994, pp. 512–516. [15] M. Ropert, F.M. de Saint Martin, D. Pel!e, A new representation of weighted order statistic filters, Signal Processing 54 (2) (1996) 201–206. [16] P. Salembier, Adaptive rank order based filters, Signal Processing 27 (1) (1992) 1–25. [17] M. Tabiza, P. Bolon, Performance evaluation of Da-filters, in: Proceedings of the Sixth European Signal Processing Conference (EUSIPCO 96), Vol. 1, 1996, pp. 49–52. [18] M.I. Vardavoulia, I. Andreadis, P. Tsalides, A new vector median filter for colour image processing, Pattern Recognition Lett. 22 (6–7) (2001) 675–689. . amo, . Y. Neuvo, Three-dimensional [19] T. Viero, K.O. Oist. median-related filters for color image sequence filtering, IEEE Trans. Circuits Systems Video Technol. 4 (2) (1994) 129–142. [20] R. Yang, L. Yin, M. Gabbouj, J. Astola, Y. Neuvo, Optimal weighted median filters under structural constraints, in: Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS 93), 1993, pp. 942–945. [21] L. Yin, R. Yang, M. Gabbouj, Y. Neuvo, Weighted median filters: a tutorial, IEEE Trans. Circuits Systems II 43 (3) (1996) 157–192.