Pattern Recognition Letters 33 (2012) 1388–1396
Multifocus image fusion and denoising: A variational approach

Cosmin Ludusan a,b,*, Olivier Lavialle a

a Université de Bordeaux, IMS UMR 5218 CNRS, Bordeaux Sciences Agro, Talence, France
b Technical University of Cluj-Napoca, The Faculty of Electronics, Telecommunications and Information Technology, Cluj-Napoca, Romania
Article info

Article history: Received 9 March 2011. Available online 7 March 2012. Communicated by H. Sako.

Keywords: Multifocus image fusion; Denoising; Variational model; Partial Differential Equations; Image enhancement; Image restoration
Abstract

In this letter we propose a variational approach for concurrent image fusion and denoising of multifocus images, based on error estimation theory and Partial Differential Equations (PDEs). In real-world scenarios, the assumption that the inputs of an image fusion process contain only useful information pertinent to the desired fused output more often than not does not hold. The image fusion problem therefore needs to be addressed from a more complex, fusion–denoising point of view, in order to provide a fused result of greater quality. The novelty of our approach consists in defining an image geometry-driven, anisotropic fusion model, numerically expressed using an anisotropy-reinforcing discretization scheme that further increases the anisotropic behavior of the proposed fusion paradigm. The preliminary experimental analysis shows that robust anisotropic denoising can be attained in parallel with efficient image fusion, thus bringing two paramount image processing tasks into complete synergy. One immediate application of the proposed method is the fusion of multifocus, noise-corrupted images. © 2012 Elsevier B.V. All rights reserved.
1. Introduction

Image fusion aims at extracting and synthesizing information from multiple sources, e.g. multisensor, multifocus or multiframe, in order to produce a more accurate, more reliable composite result. This synthesis can be performed at different levels of complexity (Samadzadegan, 2003): signal-level, image-level (pixel-level), feature-level or decision-level. In order to properly manage fusion tasks, according to their complexity and purpose, a classification of the main fusion methods is required. One such classification is given by Blum et al. (2006), dividing fusion methods into two main groups: multiscale-decomposition-based (e.g. Li and Yang (2008), Liu et al. (2001), Chen et al. (2008) and Li et al. (2005)) and non-multiscale-decomposition-based (e.g. Huang and Jing (2007), Li et al. (2002)) fusion methods. The proposed variational model belongs to the latter group, together with the methods described in (Socolinsky, 2000; Socolinsky and Wolff, 2002; John and Vorontsov, 2005; Ballester et al., 2006; Fischer and Modersitzki, 2006; Wang and Ye, 2006, 2007; Piella, 2008, 2009). These methods are based on the assumption that the input images are noise-free, an assumption that, more often than not, does not hold as a precondition in practical image fusion applications. Complexity-wise, our variational model falls between image-level
and feature-level fusion, since one part of the fusion process implies fusing edge maps. Often, in practice, the input images are corrupted by noise from various sources at different stages of processing, from acquisition to transmission or storage. Following this line of reasoning, Pop et al. (2007a,b), Mitianoudis and Stathaki (2008), Pop (2008) and Wang et al. (2008) propose a series of variational and non-variational PDE-based image fusion methods with implicit denoising, better suited for practical image fusion applications. Within this family of fusion methods we define the prerequisites for the proposed fusion model, leading to the development of a more robust and improved variational approach for concurrent fusion and denoising of noise-corrupted multifocus images. The increased noise robustness is attained by using an anisotropic approach to image fusion, combined with an anisotropy-reinforcing numerical scheme, better suited for noise filtering and salient feature detection. The novelty of the proposed joint fusion–denoising model with respect to the previously mentioned variational and non-variational PDE-based methods consists in:
(a) The use of an "intelligent" diffusion (denoising) component in the form of a geometry-driven anisotropic diffusion mechanism, as opposed to the isotropic diffusion employed by many of the existing variational and PDE-based methods.
(b) An edge-enhancing component (part of the fusion mechanism) that is also defined anisotropically, as opposed to similar formalisms that are based, for example, on a classic Gaussian-smoothed gradient map.
This letter is organized as follows: Section 2 briefly describes the basic concepts pertaining to error estimation theory in the context of variational image fusion, as they were first laid out by John and Vorontsov (2005). A detailed description of the proposed model is given in Section 3, followed by an experimental analysis and discussion in Section 4. Conclusions, remarks and future development are outlined in Section 5.

2. Robust error estimation theory using a variational framework

Let I(x, y): Ω → ℝ be a degraded observed image and I_r(x, y) the recovered version of I(x, y), where (x, y) refers to the pixel of coordinates (x, y) within the image space Ω ⊂ ℝ². Using error estimation theory, the recovered image I_r(x, y) can be estimated by means of an error functional E(I_r(x, y)) that expresses the difference between the original image and the estimated one, as a function of I_r(x, y) (John and Vorontsov, 2005):
$$E(I_r(x,y)) = \int_{\Omega} \rho\big((x,y),\, I_r(x,y),\, |\nabla I_r(x,y)|\big)\, dx\, dy \tag{1}$$
where Ω is the image support and ∇I_r(x, y) is the image gradient vector in (x, y). The error norm ρ is defined according to the requirements of the application or the nature of the degradation, e.g. for filtering out AWGN (Additive White Gaussian Noise) from a degraded image, one suitable choice for ρ would be a least square error norm (John and Vorontsov, 2005). The extremum of (1) is estimated using the Euler–Lagrange equation, satisfied by a function f of argument u which extremizes the following functional:
$$E(f) = \int F\big(u,\, f(u),\, f'(u)\big)\, du \tag{2}$$
where F is a given function with continuous first order partial derivatives and f is the function to be found. The Euler–Lagrange equation can be described using an ODE (Ordinary Differential Equation) of variable u that extremizes E(f):
$$\frac{\partial}{\partial f(u)} F\big(u, f(u), f'(u)\big) - \frac{d}{du}\, \frac{\partial}{\partial f'(u)} F\big(u, f(u), f'(u)\big) = 0 \tag{3}$$
Thus, using (3) the extremum of (1) can be analogously derived, leading to an Euler–Lagrange equation of the form:
$$\frac{\partial \rho}{\partial I_r} - \nabla \cdot \left( \frac{1}{|\nabla I_r|}\, \frac{\partial \rho}{\partial |\nabla I_r|}\, \nabla I_r \right) = 0 \tag{4}$$
A closed-form solution I_r(x, y) is obtainable from (4), the estimation of I_r being performed using numerical optimization methods, e.g. gradient descent optimization. Hence, I_r can be iteratively estimated using the following update rule:
$$I_r(x,y,t+1) \leftarrow I_r(x,y,t) - \tau\, \frac{\partial I_r(x,y,t)}{\partial t} \tag{5}$$
where t is the time evolution parameter, τ the optimization step size, and:

$$\frac{\partial I_r(x,y,t)}{\partial t} = -\frac{\partial \rho}{\partial I_r} + \nabla \cdot \left( \frac{1}{|\nabla I_r|}\, \frac{\partial \rho}{\partial |\nabla I_r|}\, \nabla I_r(x,y,t) \right) \tag{6}$$

with initial condition I_r(x, y, 0) = I(x, y). The time evolution of (6), through the gradient descent optimization method, will iteratively continue until a given minimization criterion is satisfied. In practice, only a finite number of iterations is required for obtaining visually satisfactory results (John and Vorontsov, 2005).

3. Anisotropic image fusion and denoising

Following the guidelines established by John and Vorontsov (2005) in their seminal work on image fusion through Total Variation (TV) minimization, the proposed model is formulated on the same TV minimization principle of establishing a parallel between the fused image Ĩ and the recovered version I_r of an initial image I from error estimation theory. As already discussed in the previous section, one possible choice for the error norm ρ from (1) is the least square error norm, expressed as:

$$\rho\big((x,y),\, |\nabla I_r(x,y,t)|\big) = \frac{1}{2}\, |\nabla I_r(x,y,t)|^{2} \tag{7}$$

This error norm provides a simple, straightforward way of filtering AWGN in an isotropic manner, being a function of |∇I_r(x, y)| but not explicitly of the image itself. From an image fusion point of view, a more suitable error norm would be one that combines AWGN filtering with edge enhancement, since the aim of a fusion process is to transfer salient features from the input images to the fused result. In image fusion, saliency is usually defined as edge information; thus, the fusion process is aimed at enhancing and synthesizing this type of information. Such an error norm, for noise filtering and edge enhancement, was proposed and used by John and Vorontsov (2005), Wang et al. (2008) and Mitianoudis and Stathaki (2008):

$$\rho\big((x,y),\, I_r(x,y,t),\, |\nabla I_r(x,y,t)|\big) = \frac{\alpha}{2}\, |\nabla I_r(x,y,t)|^{2} + \frac{\beta}{2}\, J_I(x,y)\, \big[I_r(x,y,t) - I(x,y)\big]^{2} \tag{8}$$

where α and β are constants that control the level of noise filtering and edge enhancement respectively, while J_I is a Gaussian-smoothed edge map of the form:

$$J_I(x,y) = \int |\nabla I(x',y')|^{2}\, G(x - x',\, y - y';\, \sigma)\, dx'\, dy' \tag{9}$$

G being a zero-mean Gaussian function of standard deviation σ. Application-wise, isotropic filtering is not suitable for practical use in image processing since it destroys salient information in the form of edges and structures (Terebes, 2004a). Conversely, anisotropic filtering has been for some time the de facto approach in image restoration and enhancement, edge detection or image segmentation. Since the main objective in image fusion is to properly detect and fuse salient features (i.e. edges and structures) originating from multiple sources, anisotropic processing provides the means of efficiently attaining this objective. To this end, in defining our variational fusion paradigm we use an alternative, anisotropic error norm ρ that better responds to the needs of image fusion, both in terms of noise filtering and of edge enhancement:

$$\rho\big((x,y),\, I_r(x,y,t),\, |\nabla I_r(x,y,t)|\big) = \frac{\alpha}{|\nabla I_r(x,y,t)|^{2}}\, \nabla^{-1}\big[D_a(I_r(x,y,t))\big] \cdot \nabla I_r(x,y,t) - \frac{\beta}{2}\, J(x,y,t)\, \big[I_r(x,y,t) - I(x,y)\big]^{2} \tag{10}$$

where ∇⁻¹ is the inverse del operator and J(x, y, t) = J_I(x, y) − J_{I_r}(x, y, t) is the gain function first introduced by John and Vorontsov (2005) that we have redefined in an anisotropic manner:

$$J(x,y,t) = \int |\dot{\nabla} I(x',y')|\, G(x - x',\, y - y';\, \sigma_{em})\, dx'\, dy' - \int |\dot{\nabla} I_r(x',y',t)|\, G(x - x',\, y - y';\, \sigma_{em})\, dx'\, dy' \tag{11}$$
with ∇̇ the gradient along the η direction (η⃗ = ∇I_r/|∇I_r|), its numerical expression and computation being discussed in more detail in Appendix A.3. Choosing only the η direction for computing the gradient map enforces the overall anisotropic characteristic of our approach and also helps in avoiding unwanted contour artifacts along the ξ direction due to small registration errors or initial noise contamination (at t = 0, before the denoising process starts). Substituting the proposed anisotropic error norm from (10) into (6) leads to the following estimation model for I_r(x, y):
$$\frac{\partial I_r(x,y,t)}{\partial t} = -\alpha\, D_a(I_r(x,y,t)) + \beta\, H(J(x,y,t))\, J(x,y,t)\, \big[I_r(x,y,t) - I(x,y)\big] \tag{12}$$

where D_a is a diffusion differential operator defined in the sense of Alvarez et al. (1992) and computed using an explicit numerical approximation scheme (Terebes et al., 2002, 2004), while H(J) is defined so as to allow only salient information to be transferred to I_r (John and Vorontsov, 2005):

$$H(J) = \begin{cases} 1, & \text{if } J \geq 0 \\ 0, & \text{if } J < 0 \end{cases} \tag{13}$$
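As a concrete illustration, the following is a minimal NumPy/SciPy sketch of the gain map (11) and the selector (13). It uses a plain central-difference gradient magnitude in place of the (ξ, η)-frame directional gradient ∇̇ of Appendix A.3, and the function names are ours, not the paper's:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def directional_gradient_magnitude(img):
    """Stand-in for the directional gradient magnitude: here a plain
    central-difference gradient magnitude; the paper instead computes I_eta
    in the interpolated (xi, eta) frame (Appendix A.3)."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def gain_map(I, I_r, sigma_em=7.0):
    """Gain function J(x, y, t) of Eq. (11): difference between the
    Gaussian-smoothed edge maps of the input I and of the current estimate I_r."""
    J_I = gaussian_filter(directional_gradient_magnitude(I), sigma_em)
    J_Ir = gaussian_filter(directional_gradient_magnitude(I_r), sigma_em)
    return J_I - J_Ir

def H(J):
    """Selector of Eq. (13): 1 where J >= 0 (salient information still missing
    from the estimate), 0 elsewhere."""
    return (J >= 0).astype(float)
```

With these helpers, one explicit step combining the update rule (5) with (12) reads I_r ← I_r − τ(−α D_a(I_r) + β H(J) J (I_r − I)), D_a being the operator detailed in Appendix A.3.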
Mathematically, a fusion process can be described as comprising S input images (sources), I_1(x, y), …, I_S(x, y): Ω → ℝ, representing the same scene, spatially or temporally. Linking the two theoretical constructs of error estimation and variational fusion is unambiguously achieved by extending the notion of recovered image I_r, from error estimation theory, to the fused image Ĩ, from image fusion theory. Thus, every input image I_s(x, y), s = {1, …, S}, from the fusion process is regarded as a degraded image I, recovered using the error estimation variational approach described by (12). One way of classifying fusion techniques, according to Mitianoudis and Stathaki (2008), is into spatial domain and transform domain techniques. In the spatial domain, a fused image can be defined as:
$$\tilde{I}(x,y) = C\big(I_1(x,y), \ldots, I_S(x,y)\big) \tag{14}$$
where C represents the fusion rule, describing how the relevant features from the input images are combined to yield the fused image Ĩ. The fused image Ĩ can therefore be constructed as a linear combination of the S input images I_s, where the useful information is transferred to the fused result by means of weight functions w_s(x, y, t), as part of the error estimation process (Mitianoudis and Stathaki, 2008):
$$\tilde{I}(x,y,t) = \sum_{s=1}^{S} w_s(x,y,t)\, I_s(x,y) \tag{15}$$
The process of fusing multiple inputs into a single fused result requires, first and foremost, ensuring proper edge preservation. This cannot be achieved using a sequential approach to estimating the S fusion weight functions, since salient information belonging to one input could be lost if not correlated to all the other inputs (Mitianoudis and Stathaki, 2008). That is why concurrent fusion and denoising as a parallel, simultaneous process is not equivalent to a sequential denoising followed by a simple fusion process. Simultaneously estimating the derivatives ∂w_s/∂t, s = {1, …, S}, for all S input images can be accomplished by using the following simplifying mathematical construct:
$$\frac{\partial \tilde{I}}{\partial t} = \frac{\partial \tilde{I}}{\partial w_s}\, \frac{\partial w_s}{\partial t} = I_s\, \frac{\partial w_s}{\partial t} \tag{16}$$

Thus, all S weights w_s are simultaneously estimated based on their corresponding input I_s and correlated to the fused image Ĩ (Mitianoudis and Stathaki, 2008):

$$\frac{\partial w_s(x,y,t)}{\partial t} = \frac{1}{I_s(x,y)}\, \frac{\partial \tilde{I}(x,y,t)}{\partial t} \tag{17}$$

Each fusion weight function is iteratively estimated using an update rule of the following form:

$$w_s(x,y,t+1) \leftarrow w_s(x,y,t) - \tau\, \frac{\partial w_s(x,y,t)}{\partial t} \tag{18}$$

where

$$\frac{\partial w_s(x,y,t)}{\partial t} = \frac{1}{I_s(x,y)} \Big\{ -\alpha\, D_a(\tilde{I}(x,y,t)) + \beta\, H(J_s(x,y,t))\, J_s(x,y,t)\, \big[\tilde{I}(x,y,t) - I_s(x,y)\big] \Big\} \tag{19}$$
Js(x, y, t) in this case is expressed as the difference:
$$J_s(x,y,t) = \int |\dot{\nabla} I_s(x',y')|\, G(x - x',\, y - y';\, \sigma_{em})\, dx'\, dy' - \int |\dot{\nabla} \tilde{I}(x',y',t)|\, G(x - x',\, y - y';\, \sigma_{em})\, dx'\, dy' = J_{I_s}(x,y,t) - J_{\tilde{I}}(x,y,t) \tag{20}$$
The time evolution of each weight w_s, governed by the update rule (18), leads to the fused result Ĩ described by (15). At the beginning of the fusion process, at time t = 0, all fusion weights are initialized to w_s(x, y, 0) = 1/S; over time, the fusion weights w_s(x, y, t) will adapt in order to emphasize the salient information found in their corresponding input images I_s(x, y), correlated to the fused image Ĩ(x, y, t) at that particular instant of time t. At first glance, when comparing the proposed error norm (12) to the error norm employed by the John and Vorontsov (2005) fusion model (8), the proposed norm appears to be defined in a counterintuitive manner, since the diffusion term D_a describes a negative advection process, while the edge-enhancement term describes a positive advection process, the opposite of how a standard PDE-based restoration with edge enhancement process is normally described. The fundamental difference between the two models lies in the fact that the John–Vorontsov model directly employs its error norm in defining a fusion paradigm, while our proposed model is constructed on the premise of describing a fusion process as a linear combination of weighted inputs. Thus, the variational framework is indirectly used to describe the time evolution of each independent weight function, a process that adds an additional overall sign inversion, ultimately leading to the standard PDE-based restoration with edge enhancement process description. With regard to the previous remark about the overall sign inversion, assume a typical fusion scenario of S input images I_1, …, I_S, with the initial value of the fusion weight functions set to 1/S for all w_s(x, y, 0) and, in accordance with the update rule (18), with ∂w_s(x, y, t)/∂t expressed as (19). By integrating the computed ∂w_s(x, y, t)/∂t into (18), w_s(x, y, t + 1) is now equal to:
$$w_s(x,y,t+1) \leftarrow w_s(x,y,t) - \frac{\tau}{I_s(x,y)} \Big\{ -\alpha\, D_a(\tilde{I}(x,y,t)) + \beta\, H(J_s(x,y,t))\, J_s(x,y,t)\, \big[\tilde{I}(x,y,t) - I_s(x,y)\big] \Big\} \tag{21}$$
Finally, using (15), Ĩ(x, y, t + 1) is expressed as:

$$\tilde{I}(x,y,t+1) = \sum_{s=1}^{S} w_s(x,y,t+1)\, I_s(x,y) = \sum_{s=1}^{S} \left\{ w_s(x,y,t) - \frac{\tau}{I_s(x,y)} \Big[ -\alpha\, D_a(\tilde{I}(x,y,t)) + \beta\, H(J_s(x,y,t))\, J_s(x,y,t)\, \big(\tilde{I}(x,y,t) - I_s(x,y)\big) \Big] \right\} I_s(x,y) = \sum_{s=1}^{S} w_s(x,y,t)\, I_s(x,y) + \tau \sum_{s=1}^{S} \alpha\, D_a(\tilde{I}(x,y,t)) - \tau \sum_{s=1}^{S} \beta\, H(J_s(x,y,t))\, J_s(x,y,t)\, \big[\tilde{I}(x,y,t) - I_s(x,y)\big] \tag{22}$$
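To make the update concrete, here is a minimal Python/NumPy sketch of the weight-evolution loop of Eqs. (15) and (18)–(22), under our own naming; the anisotropic operator D_a (Appendix A.3) and the gain map J_s of Eq. (20) are passed in as callables, and the small constant eps guarding the division by I_s is our addition, not part of the model:

```python
import numpy as np

def fuse(inputs, D_a, gain_map, alpha=1.0, beta=0.6, tau=0.1, iters=15,
         sigma_em=7.0, eps=1e-6):
    """Joint fusion-denoising sketch: evolve the weights w_s of Eq. (15) with the
    update rule (18), using the derivative (19), and rebuild the fused image (22)."""
    S = len(inputs)
    w = [np.full(inputs[0].shape, 1.0 / S) for _ in range(S)]   # w_s(x, y, 0) = 1/S
    fused = sum(w_s * I_s for w_s, I_s in zip(w, inputs))       # Eq. (15)
    for _ in range(iters):
        diffusion = D_a(fused)                                  # D_a of the fused estimate
        for s, I_s in enumerate(inputs):
            J_s = gain_map(I_s, fused, sigma_em)                # Eq. (20)
            dw_dt = (-alpha * diffusion
                     + beta * (J_s >= 0) * J_s * (fused - I_s)) / (I_s + eps)  # Eq. (19)
            w[s] = w[s] - tau * dw_dt                           # Eq. (18)
        fused = sum(w_s * I_s for w_s, I_s in zip(w, inputs))   # Eq. (22) via (15)
    return fused
```

The default parameter values mirror the ES1 setting used later in Section 4 (α = 1, β = 0.6, τ = 0.1, 15 iterations, σ_em = 7).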
Recalling the general expression of a time-evolving PDE-based image processing paradigm, (22) can be expressed in generic terms as:
$$\tilde{I}(x,y,t+1) = \tilde{I}(x,y,t) + \tau\, \frac{\partial \tilde{I}(x,y,t)}{\partial t} \tag{23}$$
and also further simplified, by breaking it down into its constituent parts:
$$\begin{cases} \dfrac{\partial \tilde{I}}{\partial t} = \displaystyle\sum_{s=1}^{S} \alpha\, D_a(\tilde{I}(x,y,t)) - \sum_{s=1}^{S} \beta\, H(J_s(x,y,t))\, J_s(x,y,t)\, \big[\tilde{I}(x,y,t) - I_s(x,y)\big] \\[2ex] \tilde{I}(x,y,0) = \displaystyle\sum_{s=1}^{S} w_s(x,y,0)\, I_s(x,y) \end{cases} \tag{24}$$
where τ is equivalent to dt. By analogy with the Kornprobst et al. (1997) method for image deblurring and denoising, ∂Ĩ(x, y, t)/∂t can also be formalized in terms of its functional components, as follows:
$$\frac{\partial \tilde{I}}{\partial t} = \underbrace{\sum_{s=1}^{S} \alpha\, D_a(\tilde{I}(x,y,t))}_{\text{Denoising}} - \sum_{s=1}^{S} \underbrace{\beta\, H(J_s(x,y,t))\, J_s(x,y,t)}_{\text{Edge enhancement}}\, \underbrace{\big[\tilde{I}(x,y,t) - I_s(x,y)\big]}_{\text{Coupling}} \tag{25}$$
In our case, the deblurring component from the Kornprobst et al. (1997) model, in essence an edge-enhancement process, is replaced by the multifocus image fusion process with its intrinsic edge-enhancing characteristic. Details regarding the numerical implementation, from orientation estimation (Appendix A.1) to the numerical discretization scheme (Appendix A.2) and gradient computation (Appendix A.3), will be detailed in Appendix A.

4. Experimental results

For testing our image fusion model we have chosen a multifocus image fusion scenario comprising two input images (Fig. 1(a) and (b)): the original images (ECE – Lehigh University, 2011) are real-world, registered images, describing the same scene in different focus points. The initial, estimated AWGN of the original images is of σ_noise_bias = 0.48 and will be considered as a noise bias when further adding AWGN. In order to validate the fusion method's denoising capabilities, the test scenario is constructed on the assumption of a reference image, in relative terms, since the desired output varies from application to application. In our case, the multifocus image fusion of noisy images is required to produce a fused result of the scene described by the input images (I_1 and I_2) that is both in focus and properly denoised. For that purpose, the relative reference image was obtained using the proposed fusion method itself, with an empirically chosen set of parameters that produce a fused image within the desired quality specifications. This result is not necessarily the best per se, since that would require knowing a priori the ideal outcome of the fusion process, thus contradicting the fundamental definition of information and making image fusion, in itself, redundant. In order to establish a somewhat objective basis of reference in relation with the fusion reference, a second relative reference was used, obtained by averaging the original input images, hence using the most basic fusion paradigm, also known as the simple average fusion rule. The two references were thus obtained: the simple average reference R_avg, illustrated in Fig. 1(c), was obtained by averaging the original input images (Fig. 1(a) and (b)). The fusion reference R_fus,
illustrated in Fig. 1(d), was computed using (15) and (19) with an empirically chosen set of parameters, ES1 (ES – Experimental Setting): iterations = 15, τ = 0.1 (theoretical time t = τ · iterations = 1.5), W_PCA = 3, K_ξ = 10, K_η = 5, σ_em = 7, α = 1 and β = 0.6 (see Appendix A for the full parameter description). Fig. 1(e)–(h) represent enlarged details of the input images, R_avg and R_fus respectively, and illustrate the difference in visual quality and filtered noise between the original inputs, R_avg and R_fus. The fusion method can be analyzed in terms of noise filtering and edge enhancement using objective evaluation measures that require a reference image, e.g. RMSE, PSNR, VIF and SSIM. The SSIM (Structural Similarity index) defined by Wang and Bovik (2002) quantifies edge information (from an image fusion point of view this information represents salient information), while the VIF (Visual Information Fidelity) quantifies the Shannon information shared between the reference and the test image, relative to the information contained in the reference image itself (Sheikh and Bovik, 2006). In order to ensure an objective, unbiased quality assessment, all of the previously mentioned quality metrics were used under a third-party implementation, i.e. the MeTriX MuX Visual Quality Assessment Package (Gaubatz, 2011). The test scenario is constructed on the idea of verifying the fusion method's robustness to AWGN while correctly performing multifocus image fusion, that is, yielding a fused result both in focus and denoised. The AWGN varies from no added noise (σ_n = σ_noise_bias) to σ_n = σ_noise + σ_noise_bias = 30. For the first experimental setting (ES1), the quality metrics' evolution with respect to σ_noise is illustrated in Fig. 2, for:
- The fused result Ĩ^{σ_noise}, for all tested values of σ_noise, with respect to its relative reference R_fus, where R_fus falls within the category of "greater" quality results, fulfilling the two important criteria of being properly fused and denoised through a joint fusion–denoising approach.
- The averaged result I_avg^{σ_noise}, for all tested values of σ_noise, with respect to its relative reference R_fus, in order to qualitatively quantify the difference between the simple average fusion method and the proposed fusion–denoising model.
- The two inputs I_{1,2}^{σ_noise}, for all tested values of σ_noise, with respect to their relative reference R_fus, to provide an extra point of reference, quality-wise.
Upon analyzing the experimental results illustrated in Fig. 2, the following immediate remarks can be formulated:
(i) The proposed fusion paradigm, with the parameter set ES1, best suited for low to medium noise contamination, performs as intended even for high AWGN levels.
(ii) As can be seen from the quality metric graphs (Fig. 2), the proposed fusion method yields promising results even when the method's parameters are empirically chosen, without any prior parameter optimization.
(iii) The proposed fusion process exhibits a robust salient information transfer and integration even for high AWGN levels, as indicated by the SSIM evolution in Fig. 2(d).
The experimental analysis continues by defining additional experimental settings in order to emulate a real-world scenario of adapting the image processing model to the conditions of the input data. Thus, two new experimental settings, ES2 and ES3, are defined as instances of the proposed method, particularly adapted, parameter-wise, to tackle medium and high noise contamination, respectively.
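The evaluation protocol above (AWGN of increasing σ_noise, metrics computed against a relative reference) is easy to reproduce; the paper used the MeTriX MuX MATLAB package, so the following scikit-image based sketch is purely illustrative, with our own function names and with VIF omitted:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def add_awgn(img, sigma, seed=0):
    """Add zero-mean AWGN of standard deviation sigma (in grey levels)."""
    rng = np.random.default_rng(seed)
    return img.astype(float) + rng.normal(0.0, sigma, img.shape)

def evaluate_against_reference(fused, reference):
    """RMSE / PSNR / SSIM of a fused result with respect to its relative reference."""
    data_range = float(reference.max() - reference.min())
    rmse = float(np.sqrt(np.mean((fused - reference) ** 2)))
    psnr = peak_signal_noise_ratio(reference, fused, data_range=data_range)
    ssim = structural_similarity(reference, fused, data_range=data_range)
    return {"RMSE": rmse, "PSNR": psnr, "SSIM": ssim}
```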
Fig. 1. Multifocus image fusion test setting: (a) first input image I_1; (b) second input image I_2; (c) simple average reference image R_avg; (d) fusion reference image R_fus; (e) I_1 – detail; (f) I_2 – detail; (g) R_avg – detail; (h) R_fus – detail.

Fig. 2. Multifocus image fusion test scenario (ES1) – quality metrics evolution: (a) RMSE/σ_noise evolution; (b) PSNR/σ_noise evolution; (c) VIF/σ_noise evolution; (d) SSIM/σ_noise evolution.

The three proposed experimental settings (Table 1) emulate a coarse optimization scenario (Fig. 3), where, in order to maintain
a high-quality fused result regardless of the input images and the degree of their contamination, the model's parameters are adjusted so as to compensate for a drop in quality, or at least to limit this drop as much as possible. Thus, the resulting optimization is achieved by using the input parameters from one of the three experimental settings at each level of σ_noise, according to the best measured value, quality metric-wise.
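The selection itself is straightforward; a hypothetical sketch follows, in which the fusion routine and the per-setting parameter dictionaries are assumed to exist and SSIM is used as the deciding metric purely for illustration:

```python
from skimage.metrics import structural_similarity as ssim

def coarse_optimize(noisy_inputs_by_level, reference, settings, run_fusion):
    """For each noise level, run the fusion with every experimental setting
    (ES1-ES3, Table 1) and keep the name of the setting whose fused result
    scores best against the relative reference."""
    best = {}
    for sigma_noise, inputs in noisy_inputs_by_level.items():
        scores = {name: ssim(reference, run_fusion(inputs, **params),
                             data_range=float(reference.max() - reference.min()))
                  for name, params in settings.items()}
        best[sigma_noise] = max(scores, key=scores.get)
    return best
```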
All of the quality metrics used for the coarse optimization test scenario indicate that the increase in AWGN can be dynamically compensated by a proper selection of the fusion method's input parameters. Hence, a proper parametrization ensures the robustness of the fusion method across the entire range of σ_noise values. The evolution of the SSIM quality metric (Fig. 3(d)) for the optimized experimental setting R_fus/OPT underlines the implicit
Table 1. Noise robustness assessment – experimental settings.

Experimental setting | t (= τ · iterations) | τ | W_PCA | K_ξ | K_η | σ_em | α | β
ES1 | 1.5 | 0.1 | 3 | 10 | 5 | 7 | 1 | 0.6
ES2 | 1.5 | 0.1 | 5 | 10 | 5 | 7 | 1.6 | 0.3
ES3 | 4 | 0.1 | 5 | 10 | 5 | 9 | 2.2 | 0.3
anisotropic characteristic of the fusion method, both for noise filtering and edge enhancement. The three experimental settings are also evaluated using dedicated fusion quality metrics (Fig. 3(e) and (f)), i.e. QAB/F (Xydeas and Petrovic, 2000; Petrovic, 2001) and QW (Piella, 2004), having as reference the original input images I_1^{σ_noise_bias} and I_2^{σ_noise_bias} (Fig. 1(a) and (b)). Since there are no completely noise-free input images available, the quality assessment slightly under-evaluates the fused results, but within acceptable limits, providing an accurate enough evaluation. In Fig. 4 a concurrent fusion and denoising example is presented in order to provide an additional visual assessment of the proposed fusion method's result with respect to the simple average fusion result. The previous experimental analysis also underlines that for high and very high noise levels the drop in quality is significant and partly due to external factors, such as:
(a) The limited number of input images – given a noise of zero mean, for a sufficiently large number of input images, the initialization process (input averaging) alone would act as an implicit denoising process. This theoretical property is employed as an additional denoising step by the Socolinsky fusion model (Socolinsky, 2000), where the fusion process is always assumed to comprise a sufficiently large number of inputs.
Fig. 3. Noise robustness test scenario – coarse optimization: (a) RMSE/σ_noise; (b) PSNR/σ_noise; (c) VIF/σ_noise; (d) SSIM/σ_noise; (e) QAB/F/σ_noise; (f) QW/σ_noise.

Fig. 4. Concurrent image fusion and denoising example: (a) I_1 + AWGN of σ_noise = 7.52 input image; (b) I_2 + AWGN of σ_noise = 7.52 input image; (c) simple average fusion result and detail (d); (e) proposed fusion method result using the ES2 parameter set and detail (f).

(b) The entropy of the input images – usually, filtering methods that employ a geometry-driven processing mechanism have difficulties in initially assessing the image geometry when the inputs are contaminated with high-value noise, e.g. the Osher and Rudin (1990) shock filter. A simple and efficient solution, employed by many denoising methods, e.g. Catté et al. (1992), is to integrate into the filtering model a pre-smoothing component that helps decrease the initial entropy and allows the model's edge detector to properly describe the geometry of the input image.
Thus, in the case of the proposed joint fusion–denoising paradigm, an extra boost in output quality is available by integrating a pre-smoothing component in the overall model, the fusion inputs being thereby expressed as G ∗ I_s, where G is a Gaussian PSF of standard deviation σ_ps. In order to illustrate the difference in output quality between the proposed fusion model without pre-smoothing and the proposed fusion model with pre-smoothing, the following experimental setting is constructed: for the input image set I_{1,2}^{σ_n=30}, the best quality scores for the proposed fusion model without a pre-smoothing component are obtained using the ES3 parameter set; in order to assess the difference in quality for the proposed fusion model with pre-smoothing, the same set of parameters is kept, i.e. ES3 (Table 1), with the additional parameter σ_ps = 0.5. For the input image set I_{1,2}^{σ_n=30} the initial entropies are E_{I_1} = 7.639 and E_{I_2} = 7.657, respectively, and the fused result Ĩ^{σ_n=30} (Fig. 5(a)) is obtained using the ES3 input parameter set, as previously mentioned. For the pre-smoothed case, the input image set G_{σ_ps=0.5} ∗ I_{1,2}^{σ_n=30} has the following initial entropies: E_{I_1} = 7.525 and E_{I_2} = 7.579, respectively, and the fused result Ĩ_{σ_ps=0.5}^{σ_n=30} (Fig. 5(b)) is obtained using the ES3 input parameter set and the additional input parameter σ_ps = 0.5.

Fig. 5. Noise robustness test scenario – pre-smoothing example: (a) Ĩ^{σ_n=30} fused result; (b) Ĩ_{σ_ps=0.5}^{σ_n=30} fused result.

Table 2. Proposed fusion model w/ pre-smoothing: qualitative evaluation.

The quality assessment for the proposed experimental setup is summarized in Table 2, from which it can be concluded that even a small decrease in the initial entropy of the input images, through a pre-smoothing process (σ_ps = 0.5), can provide a significant increase in the quality of the output fused image Ĩ. Moreover, the previous test scenario (Table 2) provides the following conclusions: on the one hand, it supports the initial inference stating that a small pre-smoothing, with the purpose of reducing the input images' entropy, can provide an additional increase in fused output quality, as indicated by the image fusion quality metrics for ES3 w/ σ_ps = 0.5, while on the other hand, as the remaining experimental settings indicate (Table 2 – ES3 w/ σ_ps = 1.5, …, 3), the fused result quality decreases if the pre-smoothing is anything other than minimal, just enough to slightly decrease the initial entropy.
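The pre-smoothing step and the entropy check that motivates it are simple to express; a sketch under the assumption that SciPy and scikit-image are available (function names are ours):

```python
from scipy.ndimage import gaussian_filter
from skimage.measure import shannon_entropy

def presmooth_inputs(inputs, sigma_ps=0.5):
    """Pre-smoothing G * I_s of every fusion input with a Gaussian PSF of
    standard deviation sigma_ps, applied before the joint fusion-denoising."""
    return [gaussian_filter(I_s.astype(float), sigma_ps) for I_s in inputs]

def input_entropies(inputs):
    """Shannon entropy of each input; a small drop after pre-smoothing
    (e.g. 7.639 -> 7.525 for I_1 at sigma_n = 30) is the intended effect."""
    return [shannon_entropy(I_s) for I_s in inputs]
```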
5. Conclusion and remarks

In this letter we have proposed, discussed and tested a multifocus image fusion and denoising method based on a variational approach, intended to perform concurrent anisotropic denoising and edge enhancement in a unified manner, as part of the fusion process. The main features of the proposed model are: integrated anisotropic denoising and edge enhancement; the use of PCA and of a discrete interpolation scheme for the model's numerical implementation, in order to enforce the anisotropic behavior described by the theoretical model; and, last but not least, a fusion process defined as a linear combination of its weighted inputs, its time evolution being indirectly described by the evolution of the fusion weights themselves. The results obtained thus far indicate that the fusion method efficiently filters AWGN (even for high values of σ_n) while properly enhancing edges and contours, delivering a fused result of the desired quality, both in focus and denoised. It needs to be stressed that the
obtained results and their quality were not the outcome of an express parameter optimization; therefore, fusion performance could be further improved by an express and exhaustive parameter optimization (the subject of ongoing work). Further work on this model will consist in, but will not be limited to: exhaustive parameter optimization; comparative analysis with existing fusion methods, using an even more extensive set of quality metrics; further improving the fusion process by adding a measure of confidence (coherence) when computing the orientation θ through PCA; strengthening the edge enhancement in order to further improve contrast preservation in the fused image; and extending the fusion method's application range, e.g. multisensor image fusion, medical image fusion, remote sensing image fusion.
Acknowledgements This work was funded by the Romanian National Research Council CNCSIS-UEFISCSU under project number PNII-IDEI code 908/2007 and through a Young Researchers BD Grant, and also by EGIDE France and Bordeaux Sciences Agro.
Appendix A

Let I: Ω ⊂ ℝ² → ℝ be a real-valued image function and ∇I = (I_x, I_y)^T the gradient vector of I at pixel coordinates (x, y).

A.1. Vector orientation estimation

In order to define the notion of orientation, we assume that the image function I is continuous or piecewise continuous, thus making it differentiable. A robust and reliable way of describing the orientation associated to a given vector, belonging to a vector field, is through PCA, as thoroughly explained in (Donias, 1999). In order to determine θ through PCA, a 2 × 2 symmetrical covariance matrix is constructed in each point of the vector field:

$$M = \begin{pmatrix} m_{11} & m_{12} \\ m_{12} & m_{22} \end{pmatrix} \tag{A.1}$$

and, using its eigenvalues, the orientation of the first eigenvector (θ_1) can be computed as follows:

$$\theta_1 = \arctan\left( \frac{m_{22} - m_{11} + \sqrt{(m_{11} - m_{22})^{2} + 4\, m_{12}^{2}}}{2\, m_{12}} \right) \tag{A.2}$$

θ_1 represents the orientation of the gradient vector ∇I, or more precisely of the η direction, where η⃗ = ∇I/|∇I|, η⃗ ∥ ∇I. In order to work in an anisotropic manner, we require the use of the ξ direction, ξ⃗ ⊥ η⃗, describing the structure's (edge's) geometry and having the following orientation:

$$\theta = \left( \theta_1 + \frac{\pi}{2} \right) \bmod \pi \tag{A.3}$$

In computing θ_1, and consequently θ, a PCA with rectangular support of fixed size W_PCA was performed in each pixel. The orientation θ has meaning only for non-zero gradient vectors. For null gradient vectors, describing homogeneous regions of the image, a given value (by convention) will be associated to θ.
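A compact way to picture (A.1)–(A.3) is the following NumPy sketch; the choice of the gradient field and of the window averaging used to fill m11, m12 and m22 is our assumption about the (Donias, 1999) construction, not a statement of it:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def orientation_map(img, w_pca=3, default_theta=0.0, grad_eps=1e-8):
    """Per-pixel structure orientation theta of Eq. (A.3), from the covariance
    matrix (A.1) of the local gradient field and the first-eigenvector angle (A.2)."""
    gy, gx = np.gradient(img.astype(float))
    m11 = uniform_filter(gx * gx, w_pca)      # covariance entries accumulated over
    m12 = uniform_filter(gx * gy, w_pca)      # a w_pca x w_pca rectangular support
    m22 = uniform_filter(gy * gy, w_pca)
    # Eq. (A.2): orientation of the gradient (eta) direction
    theta1 = np.arctan((m22 - m11 + np.sqrt((m11 - m22) ** 2 + 4.0 * m12 ** 2))
                       / (2.0 * m12 + grad_eps))
    # Eq. (A.3): structure (xi) direction, orthogonal to the gradient
    theta = np.mod(theta1 + np.pi / 2.0, np.pi)
    # conventional value on (near-)homogeneous regions with null gradient
    theta[np.hypot(gx, gy) < grad_eps] = default_theta
    return theta
```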
A.2. Numerical interpolation scheme

In the discrete image space, I is not a continuous function, hence its derivative cannot be directly computed, only approximated. One straightforward approximation for I_x at pixel (i, j), where i = 1, …, x, j = 1, …, y, is through finite difference approximation, e.g. the forward difference approximation:

$$D_{+x}(I) = \frac{I(x + dx, y) - I(x, y)}{dx} = I_{i+1,j} - I_{i,j} \tag{A.4}$$

In order to construct a structure-dependent frame of reference, more adequate for anisotropic image processing, we will use the directions best describing structures and gradients, i.e. (ξ, η). Properly describing salient features in an image often requires sub-pixel accuracy; therefore, a numerical interpolation scheme is required for computing the finite difference approximations within the (ξ, η) frame of reference. The approximations required in each pixel are the following: I(ξ, η) = I(x, y), I(ξ + dξ, η), I(ξ − dξ, η), I(ξ, η + dη), I(ξ, η − dη). In computing the interpolation points for the previously mentioned approximations, the notion of orientation is required, and to this end θ from (A.3) will be used. Computing I(ξ + dξ, η) requires three interpolation points, V_1, V_2, V_3, expressed as:

$$\begin{aligned} V_1 &= I_{i,j-1} + 0.5\,[I_{i+1,j-1} - I_{i-1,j-1}]\, d\xi \cos\theta + 0.5\,[I_{i-1,j-1} - 2 I_{i,j-1} + I_{i+1,j-1}]\,(d\xi \cos\theta)^{2} \\ V_2 &= I_{i,j} + 0.5\,[I_{i+1,j} - I_{i-1,j}]\, d\xi \cos\theta + 0.5\,[I_{i-1,j} - 2 I_{i,j} + I_{i+1,j}]\,(d\xi \cos\theta)^{2} \\ V_3 &= I_{i,j+1} + 0.5\,[I_{i+1,j+1} - I_{i-1,j+1}]\, d\xi \cos\theta + 0.5\,[I_{i-1,j+1} - 2 I_{i,j+1} + I_{i+1,j+1}]\,(d\xi \cos\theta)^{2} \end{aligned} \tag{A.5}$$

yielding:

$$I(\xi + d\xi, \eta) = V_2 + 0.5\,(V_1 - V_3)\, d\xi \sin\theta + 0.5\,(V_1 + V_3 - 2 V_2)\,(d\xi \sin\theta)^{2} \tag{A.6}$$

Analogously, I(ξ − dξ, η) is computed using:

$$\begin{aligned} V'_1 &= I_{i,j-1} - 0.5\,[I_{i+1,j-1} - I_{i-1,j-1}]\, d\xi \cos\theta + 0.5\,[I_{i-1,j-1} - 2 I_{i,j-1} + I_{i+1,j-1}]\,(d\xi \cos\theta)^{2} \\ V'_2 &= I_{i,j} - 0.5\,[I_{i+1,j} - I_{i-1,j}]\, d\xi \cos\theta + 0.5\,[I_{i-1,j} - 2 I_{i,j} + I_{i+1,j}]\,(d\xi \cos\theta)^{2} \\ V'_3 &= I_{i,j+1} - 0.5\,[I_{i+1,j+1} - I_{i-1,j+1}]\, d\xi \cos\theta + 0.5\,[I_{i-1,j+1} - 2 I_{i,j+1} + I_{i+1,j+1}]\,(d\xi \cos\theta)^{2} \end{aligned} \tag{A.7}$$

yielding:

$$I(\xi - d\xi, \eta) = V'_2 - 0.5\,(V'_1 - V'_3)\, d\xi \sin\theta + 0.5\,(V'_1 + V'_3 - 2 V'_2)\,(d\xi \sin\theta)^{2} \tag{A.8}$$

I(ξ, η + dη) and I(ξ, η − dη) are obtained in a similar manner, each requiring three interpolation points:

$$\begin{aligned} Z_1 &= I_{i,j-1} + 0.5\,[I_{i+1,j-1} - I_{i-1,j-1}]\, d\eta \sin|\theta| + 0.5\,[I_{i-1,j-1} - 2 I_{i,j-1} + I_{i+1,j-1}]\,(d\eta \sin\theta)^{2} \\ Z_2 &= I_{i,j} + 0.5\,[I_{i+1,j} - I_{i-1,j}]\, d\eta \sin|\theta| + 0.5\,[I_{i-1,j} - 2 I_{i,j} + I_{i+1,j}]\,(d\eta \sin\theta)^{2} \\ Z_3 &= I_{i,j+1} + 0.5\,[I_{i+1,j+1} - I_{i-1,j+1}]\, d\eta \sin|\theta| + 0.5\,[I_{i-1,j+1} - 2 I_{i,j+1} + I_{i+1,j+1}]\,(d\eta \sin\theta)^{2} \end{aligned} \tag{A.9}$$

yielding:
$$I(\xi, \eta + d\eta) = \begin{cases} Z_2 - 0.5\,(Z_1 - Z_3)\, d\eta \cos\theta + 0.5\,(Z_1 + Z_3 - 2 Z_2)\,(d\eta \cos\theta)^{2}, & \theta > 0 \\ Z_2 + 0.5\,(Z_1 - Z_3)\, d\eta \cos\theta + 0.5\,(Z_1 + Z_3 - 2 Z_2)\,(d\eta \cos\theta)^{2}, & \theta < 0 \end{cases} \tag{A.10}$$
and
$$\begin{aligned} Z'_1 &= I_{i,j-1} - 0.5\,[I_{i+1,j-1} - I_{i-1,j-1}]\, d\eta \sin|\theta| + 0.5\,[I_{i-1,j-1} - 2 I_{i,j-1} + I_{i+1,j-1}]\,(d\eta \sin\theta)^{2} \\ Z'_2 &= I_{i,j} - 0.5\,[I_{i+1,j} - I_{i-1,j}]\, d\eta \sin|\theta| + 0.5\,[I_{i-1,j} - 2 I_{i,j} + I_{i+1,j}]\,(d\eta \sin\theta)^{2} \\ Z'_3 &= I_{i,j+1} - 0.5\,[I_{i+1,j+1} - I_{i-1,j+1}]\, d\eta \sin|\theta| + 0.5\,[I_{i-1,j+1} - 2 I_{i,j+1} + I_{i+1,j+1}]\,(d\eta \sin\theta)^{2} \end{aligned} \tag{A.11}$$

yielding:
$$I(\xi, \eta - d\eta) = \begin{cases} Z'_2 + 0.5\,(Z'_1 - Z'_3)\, d\eta \cos\theta + 0.5\,(Z'_1 + Z'_3 - 2 Z'_2)\,(d\eta \cos\theta)^{2}, & \theta > 0 \\ Z'_2 - 0.5\,(Z'_1 - Z'_3)\, d\eta \cos\theta + 0.5\,(Z'_1 + Z'_3 - 2 Z'_2)\,(d\eta \cos\theta)^{2}, & \theta < 0 \end{cases} \tag{A.12}$$
A.3. Expressing D_a and ∇̇

D_a is an anisotropic diffusion differential operator, defined in the Alvarez et al. (1992) sense and described in general terms by the following PDE:
$$D_a(I(x,y,t)) = c_{\xi}\, I_{\xi\xi} + c_{\eta}\, I_{\eta\eta} \tag{A.13}$$

where

$$\begin{cases} c_{\xi} = \dfrac{\partial}{\partial \xi}\big[g_{\xi}(I_{\xi})\, I_{\xi}\big] & \text{with } g_{\xi}(s) = \dfrac{1}{1 + (s/K_{\xi})^{2}} \\[2ex] c_{\eta} = \dfrac{\partial}{\partial \eta}\big[g_{\eta}(I_{\eta})\, I_{\eta}\big] & \text{with } g_{\eta}(s) = \dfrac{1}{1 + (s/K_{\eta})^{2}} \end{cases} \tag{A.14}$$
Using finite difference approximations of the form (A.4) and the (ξ, η) frame of reference approximations computed in Appendix A.2, (A.13) can be rewritten as (Terebes et al., 2002; Terebes et al., 2004; Terebes, 2004a):
$$D_a(I) = g_{\xi}\big(D_{+\xi}(I)\big)\, D_{+\xi}(I) - g_{\xi}\big(D_{-\xi}(I)\big)\, D_{-\xi}(I) + g_{\eta}\big(D_{+\eta}(I)\big)\, D_{+\eta}(I) - g_{\eta}\big(D_{-\eta}(I)\big)\, D_{-\eta}(I) \tag{A.15}$$
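A possible NumPy/SciPy realization of (A.13)–(A.15) is sketched below; it replaces the parabolic sub-pixel scheme of Appendix A.2 with bilinear sampling, which is a simplification of ours:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def D_a(I, theta, K_xi=10.0, K_eta=5.0):
    """Discrete directional diffusion operator of Eq. (A.15), with the
    diffusivities g of Eq. (A.14). theta is the structure (xi) orientation
    of Eq. (A.3); the gradient (eta) direction is theta - pi/2."""
    I = I.astype(float)
    yy, xx = np.mgrid[0:I.shape[0], 0:I.shape[1]]

    def sample(dy, dx):
        # I evaluated at the sub-pixel positions (y + dy, x + dx), bilinearly
        return map_coordinates(I, [yy + dy, xx + dx], order=1, mode="nearest")

    def g(s, K):
        return 1.0 / (1.0 + (s / K) ** 2)

    xi_y, xi_x = np.sin(theta), np.cos(theta)
    eta_y, eta_x = np.sin(theta - np.pi / 2.0), np.cos(theta - np.pi / 2.0)

    d_plus_xi = sample(xi_y, xi_x) - I          # D_{+xi}(I)
    d_minus_xi = I - sample(-xi_y, -xi_x)       # D_{-xi}(I)
    d_plus_eta = sample(eta_y, eta_x) - I       # D_{+eta}(I)
    d_minus_eta = I - sample(-eta_y, -eta_x)    # D_{-eta}(I)

    return (g(d_plus_xi, K_xi) * d_plus_xi - g(d_minus_xi, K_xi) * d_minus_xi
            + g(d_plus_eta, K_eta) * d_plus_eta - g(d_minus_eta, K_eta) * d_minus_eta)
```

Combined with the orientation map of (A.1)–(A.3), this is the kind of operator that the fusion loop of Section 3 consumes.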
∇̇I = (I_η, 0)^T represents the directional gradient, computed using only the η component of the (ξ, η) frame of reference. I_η is computed using (A.10) and (A.12) with a central difference approximation scheme.

References

Alvarez, L., Lions, P.L., Morel, J.M., 1992. Image selective smoothing and edge detection by nonlinear diffusion. II. SIAM J. Numer. Anal. 29, 845–866. Ballester, C., Caselles, V., Igual, L., Verdera, J., Rougé, B., 2006. A variational model for P + XS image fusion. Int. J. Comput. Vision 69, 43–58. Blum, R., Xue, Z., Zhang, Z., 2006. Multi-Sensor Image Fusion and Its Applications. CRC Press Taylor and Francis Group (Chapter: An Overview of Image Fusion, pp. 1–36). Catté, F., Lions, P.L., Morel, J.M., Coll, T., 1992. Image selective smoothing and edge detection by nonlinear diffusion. SIAM J. Numer. Anal. 29, 182–193. Chen, S.h., Su, H., Zhang, R., Tian, J., 2008. Fusing remote sensing images using à trous wavelet transform and empirical mode decomposition. Pattern Recogn. Lett. 29, 330–342. Donias, M., 1999. Caractérisation de Champs d'Orientations par Analyse en Composantes Principales et Estimation de la Curbure (in French). Ph.D. thesis,
L’Université Bordeaux I – École Doctorale des Sciences Physiques et de L’Ingénieur. ECE – Lehigh University, 2011. Investigations of image fusion. http:// www.ece.lehigh.edu/SPCRL/IF/disk.htm. Fischer, B., Modersitzki, J., 2006. Image Fusion and Registration - a Variational Approach. In: Krause, E., Shokin, Y., Resch, M., Shokina, N. (Eds.), Computational Science and High Performance Computing II, Notes on Numerical Fluid Mechanics and Multidisciplinary Design, Vol. 91. Springer, Berlin–Heidelberg, pp. 193–203. Gaubatz, M., 2011. MeTriX MuX Visual Quality Assessment Package. http:// foulard.ece.cornell.edu/gaubatz/metrix_mux/. Huang, W., Jing, Z., 2007. Multi-focus image fusion using pulse coupled neural network. Pattern Recogn. Lett. 28, 1123–1132. John, S., Vorontsov, M., 2005. Multiframe selective information fusion from robust error estimation theory. IEEE Trans. Image Process. 14, 577–584. Kornprobst, P., Deriche, R., Aubert, G., 1997. Image coupling, restoration and enhancement via PDE’s. In: Proceedings of the 1997 IEEE International Conference on Image Processing, IEEE Computer Society, Washington, DC, USA, pp. 458–461. Li, S., Kwok, J.T., Wang, Y., 2002. Multifocus image fusion using artificial neural networks. Pattern Recogn. Lett. 23, 985–997. Li, S., Yang, B., 2008. Multifocus image fusion by combining curvelet and wavelet transform. Pattern Recogn. Lett. 29, 1295–1301. Li, Z., Jing, Z., Yang, X., Sun, S., 2005. Color transfer based remote sensing image fusion using non-separable wavelet frame transform. Pattern Recognit. Lett. 26, 2006–2014. Liu, Z., Tsukada, K., Hanasaki, K., Ho, Y.K., Dai, Y.P., 2001. Image fusion by using steerable pyramid. Pattern Recogn. Lett. 22, 929–939. Mitianoudis, N., Stathaki, T., 2008. Image Fusion: Algorithms and Applications. Academic Press (Chapter Enhancement of multiple sensor images using joint image fusion and blind restoration, pp. 299–326). Osher, S., Rudin, L., 1990. Feature-oriented image enhancement using shock filters. SIAM J. Numer. Anal. 27, 919–940. Petrovic, V., 2001. Multisensor Pixel-level Image Fusion. Ph.D. thesis, University of Manchester. Piella, G., 2004. New quality measures for image fusion. In: Proceedings of the 7th International Conference on Information Fusion, pp. 542–546. Piella, G., 2008. A variational approach for image fusion visualization. In: 16th European Signal Processing Conference (EUSIPCO). Piella, G., 2009. Image fusion for enhanced visualization: a variational approach. Int. J. Comput. Vision 83, 1–11. Pop, S., 2008. Modèles de Fusion et Diffusion par Équations aux Dérivées Partielles: Application à la Sismique Azimutale (in French). Ph.D. thesis. L’Université Bordeaux I - École Doctorale des Sciences Physiques et de L’Ingénieur en cotutelle avec L’Université Technique de Cluj-Napoca (Roumanie). Pop, S., Lavialle, O., Terebes, R., Borda, M., 2007a. Low-level fusion: a PDE-based approach. In: Proceedings of the 10th International Conference on Information Fusion, pp. 1–8. Pop, S., Terebes, R., Borda, M., Guillon, S., Keskes, N., Baylou, P., Lavialle, O., 2007b. 3D Seismic data fusion and filtering using a PDE-based approach. In: Proceedings of the 2007 IEEE International Conference on Image Processing, pp. 117–120. Samadzadegan, F., 2003. Fusion techniques in remote sensing. In: Challenges in Geospatial Analysis, Integration and Visualization II, ISPRS Commission IV Joint Workshop, ISPRS. Int. Arch. Photogram. Rem. Sens. Spatial Inform. Sci. Sheikh, H., Bovik, A., 2006. 
Image information and visual quality. IEEE Trans. Image Process. 15, 430–444. Socolinsky, D., 2000. A variational approach to image fusion. Ph.D. thesis, The Johns Hopkins University. Socolinsky, D., Wolff, L., 2002. Multispectral image visualization through first-order fusion. IEEE Trans. Image Process. 11, 923–931. Terebes, R., 2004. Diffusion Directionnelle. Applications à la Restauration et à l’Amelioration d’Images de Documents Anciens (in French). Ph.D. thesis, L’Université Bordeaux I - École Doctorale des Sciences Physiques et de L’Ingénieur en cotutelle avec L’Université Technique de Cluj-Napoca (Roumanie). Terebes, R., Borda, M., Baozong, Y., Lavialle, O., Baylou, P., 2004. A new PDE based approach for image restoration and enhancement using robust diffusion directions and directional derivatives based diffusivities. In: Proceedings of the 7th International Conference on Signal Processing, pp. 707–712. Terebes, R., Lavialle, O., Baylou, P., Borda, M., 2002. Directional anisotropic diffusion. In: 11th European Signal Processing Conference, European Association for Signal Processing (EURASIP). Wang, C., Ye, Z., 2006. First-order fusion of volumetric medical imagery. IEE Proceedings - Vision Image Signal Process 153, 191–198. Wang, C., Ye, Z., 2007. Perceptual Contrast-Based Image Fusion: A Variational Approach. Acta Autom. Sin. 33, 132–137. Wang, W.W., Shui, P.L., Feng, X.C., 2008. Variational models for fusion and denoising of multifocus images. IEEE Signal Process. Lett. 15, 65–68. Wang, Z., Bovik, A., 2002. A universal image quality index. IEEE Signal Process. Lett. 9, 81–84. Xydeas, C., Petrovic, V., 2000. Objective image fusion performance measure. Electron. Lett. 36, 308–309.