Infrared Physics & Technology 72 (2015) 37–51
Infrared and visible image fusion with the use of multi-scale edge-preserving decomposition and guided image filter

Wei Gan (a), Xiaohong Wu (a,*), Wei Wu (a), Xiaomin Yang (a), Chao Ren (a), Xiaohai He (a), Kai Liu (b)
(a) College of Electronics and Information Engineering, Sichuan University, Chengdu, Sichuan 610064, PR China
(b) School of Electrical Engineering and Information, Sichuan University, Chengdu, Sichuan 610064, PR China
Highlights
- An edge-preserving multi-scale decomposition is employed to extract features.
- The guided image filter is used to optimize the weighting maps of each layer image.
- A novel image fusion scheme for IR and VIS images is proposed.
- Phase congruency is adopted to extract the saliency maps from source images.
Article history: Received 22 September 2014; Available online 22 July 2015
Keywords: Image fusion; Multi-scale edge-preserving decomposition; Guided image filter; Phase congruency
Abstract

Infrared (IR) and visible (VIS) image fusion techniques enhance visual perception capability by integrating IR and VIS images into a single fused image under various environments. This process serves an important function in image processing applications. In this paper, a novel IR and VIS image fusion framework is proposed by combining multi-scale decomposition and the guided filter. The proposed scheme can not only preserve the details of the source IR and VIS images but can also suppress artifacts effectively by combining the advantages of multi-scale decomposition and the guided filter. First, both IR and VIS images are decomposed with a multi-scale edge-preserving filter. Saliency maps of the IR and VIS images are then calculated on the basis of phase congruency. Subsequently, guided filtering is adopted to generate weighting maps. Finally, the resultant image is reconstructed with the weighting maps. Experiments show that the proposed approach achieves better performance than other methods in terms of subjective visual effect and objective assessment.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction

Infrared imaging technology has recently been widely used in various image processing and computer vision applications. Complementary and redundant information exists between infrared (IR) and visible (VIS) images of the same scene because of the different properties of IR and VIS image sensors. Thus, a technique known as IR and VIS image fusion is desired to enhance visual perception capability by integrating IR and VIS images into a single fused image under various environments. The fused image can provide more comprehensive information on a scene and is more appropriate for computer processing or human interpretation than any single source image. Recent extensive research and various techniques have been dedicated to IR and VIS image fusion. The average fusion method,
which is the most common fusion method, obtains the fused image by averaging the source images pixel by pixel. Although simple, this method results in a blurred image. The human visual system is highly sensitive to image features at different scales. Therefore, multi-scale transforms [1–10], which include the Laplacian pyramid [11], the wavelet transform [5–8], the contourlet transform [9,10], etc., are widely used in image fusion. Multi-scale transform-based methods generally consist of the following steps: (1) performing multi-scale decomposition on the source images; (2) fusing transform coefficients with some rules; and (3) reconstructing the fused image with the fused coefficients. Although these methods can perform well, the fused image may have some distortions or artifacts because spatial consistency is not well considered in these methods [12]. Li et al. [12] presented a guided filter-based weighted average fusion method to solve this problem. Guided filtering (GF) [13] is used to solve the spatial consistency problem in this method. Despite exhibiting efficient performance, the method has two drawbacks. One is that the
saliency maps generated by the Laplace operator do not represent the visually discernable features in an image well. Another is that multi-scale decomposition is not considered, which causes some details of the source IR and VIS images to be missing in the fused image. The traditional multi-scale transform-based methods are confronted with the spatial inconsistency problem. Although the guided filter-based method can overcome the spatial inconsistency problem, another issue emerges in that some details are missing. This study presents a novel image fusion scheme that takes advantage of both multi-scale edge-preserving decomposition and the guided filter to overcome the aforementioned problems. Multi-scale decomposition preserves the details of the source IR and VIS images, whereas the guided filter effectively suppresses artifacts by optimizing the weighting maps. The proposed scheme can therefore not only preserve the details of the source IR and VIS images but also suppress artifacts effectively by combining the advantages of multi-scale decomposition and the guided filter. Moreover, phase congruency (PC) rather than the Laplace operator is adopted in this study to obtain better saliency maps, which improves the performance of the proposed method. Therefore, the scheme outperforms conventional fusion methods. The main contributions of the proposed scheme can be summarized as follows:
1. A multi-scale decomposition with an edge-preserving filter is employed to extract useful image features at different scales. This process improves the quality of the fused image.
2. The guided image filter is used to optimize the weighting maps of each layer image obtained by the multi-scale edge-preserving decomposition. Artifacts can be effectively suppressed in this way.
3. A novel image fusion scheme for IR and VIS images is proposed that makes full use of both the multi-scale edge-preserving decomposition and the guided filter. Therefore, the proposed scheme outperforms other fusion methods.
4. PC is adopted to extract the saliency maps from the source images. PC is utilized instead of the Laplace operator to extract visually discernable features [14] and to generate saliency maps because the former provides a measure of feature significance that is insensitive to variations of contrast and brightness.

The remainder of this paper is organized as follows: PC, multi-scale decomposition with an edge-preserving filter, and the guided image filter are briefly reviewed in Section 2. The details of the proposed approach are presented in Section 3. The experimental results are discussed in Section 4. Conclusions are drawn in Section 5.

2. Mathematical theory

A number of concepts are adopted in this paper. This section describes these concepts, which include PC, multi-scale decomposition with an edge-preserving filter, and the guided image filter.

2.1. Phase congruency

PC, which is insensitive to changes in illumination and contrast, provides a measure of feature significance in an image [15–18]. The Fourier phase carries more perceptual information than the Fourier amplitude of an image. A point with higher PC contains more highly informative features [16,15], which is the main reason that PC is adopted in this study. Fig. 1 shows a comparison of saliency maps generated by PC and the Laplace operator. Fig. 1(a) is an input IR image. Fig. 1(b) presents the saliency map extracted by the Laplace operator, whereas Fig. 1(c) shows the saliency map generated by PC. The saliency map generated by PC holds more visually discernable features than the saliency map generated by the Laplace operator.

2.2. Multi-scale decomposition with edge-preserving filter

A multi-scale decomposition can capture the useful features of an image at different scales. This process is useful for many image processing applications. However, the Laplacian pyramid, a traditional multi-scale decomposition, is constructed with linear filters, which might cause the fused image to have some artifacts. These artifacts may be mitigated by nonlinear edge-preserving filters rather than linear filters [19–21]. Farbman et al. [19] proposed a nonlinear edge-preserving filter within a weighted least squares framework. The task of this edge-preserving filter is a balance between two goals. The first goal is to make the filtering output I_out as close to the source image I_in as possible. The second goal is to make the filtering output I_out as smooth as possible, except at the significant edges of the source image I_in. Formally, we have:

I_{out} = \arg\min_{I_{out}} \left( \| I_{in} - I_{out} \|_2^2 + \beta \left( \| a_x D_x I_{out} \|_2^2 + \| a_y D_y I_{out} \|_2^2 \right) \right)    (1)

where D_x and D_y are the forward horizontal and vertical difference operators, respectively. The parameter β is a positive regularization constant, which controls the balance between the first goal and the second goal. The parameters a_x and a_y are the horizontal and vertical smoothness weights, respectively, which can be expressed as:
a_x = \left| \frac{\partial I_{in}}{\partial x} \right|^{-\alpha}, \qquad a_y = \left| \frac{\partial I_{in}}{\partial y} \right|^{-\alpha}    (2)

Fig. 1. Comparison of saliency maps generated by PC and the Laplace operator: (a) input image; (b) saliency map extracted by the Laplace operator; (c) saliency map extracted by PC.

The exponent α determines the sensitivity to the gradients of the source image I_in. Meanwhile, we can tune the parameter β to
generate a range of approximate images with different degrees of smoothness. Thus, we can obtain I_out from I_in through:
I_{out} = D(\beta, I_{in})    (3)

where D(·) refers to an operator that generates a smooth filtering output I_out from the input image I_in by Eq. (1). In other words, D(·) generates a coarse version of the input image I_in. Similar to the Laplacian pyramid, a multi-scale decomposition can be constructed with the edge-preserving filter. A flowchart of the multi-scale decomposition with the edge-preserving filter is illustrated in Fig. 2. The decomposition consists of a coarse version of the input image, i.e., an approximate image of the input image, and a sequence of detailed images that capture the details at progressively finer scales. Suppose we construct a K-level decomposition. First, we generate a sequence of progressively coarser images I_c^1, I_c^2, ..., I_c^{K-1} by:

I_c^i = D(\beta_i, I_{in}), \quad i = 1, 2, \ldots, K-1    (4)

where β_i < β_{i+1}. The coarsest of these versions, I_c^{K-1}, is considered as the approximate image of the input image. The detailed images I_d^i, i.e., the differences of two neighboring coarse images, are calculated as:

I_d^i = I_c^{i-1} - I_c^i, \quad i = 1, 2, \ldots, K-1    (5)

where I_c^0 = I_in. We can reconstruct the input image I_in by summing up the approximate and detailed images:

I_{in} = I_c^{K-1} + \sum_{i=1}^{K-1} I_d^i    (6)

Fig. 2. Flow chart of multi-scale decomposition with edge-preserving filter.
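To make the decomposition concrete, the following is a minimal Python sketch of the weighted least squares smoothing of Eqs. (1) and (2) and of the multi-scale decomposition of Eqs. (4)–(6). It is an illustrative implementation under stated assumptions (row-major flattening, Neumann boundaries, and a small constant added to the smoothness-weight denominators to keep them finite), not the authors' reference code.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def forward_diff(m):
    # (m x m) forward-difference matrix with a zero last row (Neumann boundary).
    d = sparse.diags([-np.ones(m), np.ones(m - 1)], [0, 1], format='lil')
    d[m - 1, m - 1] = 0.0
    return d.tocsr()

def wls_smooth(img, beta, alpha=0.6, tiny=1e-4):
    """Edge-preserving smoothing D(beta, I_in) of Eqs. (1)-(3).
    img: 2-D float array; beta: regularization constant; alpha: gradient exponent."""
    h, w = img.shape
    n = h * w
    # Smoothness weights a_x, a_y of Eq. (2); `tiny` keeps them finite in flat regions.
    gx = np.hstack([np.diff(img, axis=1), np.zeros((h, 1))])
    gy = np.vstack([np.diff(img, axis=0), np.zeros((1, w))])
    ax = 1.0 / (np.abs(gx) ** alpha + tiny)
    ay = 1.0 / (np.abs(gy) ** alpha + tiny)
    # Forward difference operators D_x, D_y acting on the flattened image.
    Dx = sparse.kron(sparse.identity(h), forward_diff(w), format='csr')
    Dy = sparse.kron(forward_diff(h), sparse.identity(w), format='csr')
    # Normal equations of Eq. (1): (I + beta (Dx' Ax Dx + Dy' Ay Dy)) I_out = I_in.
    A = sparse.identity(n) \
        + beta * (Dx.T @ sparse.diags(ax.ravel()) @ Dx
                  + Dy.T @ sparse.diags(ay.ravel()) @ Dy)
    return spsolve(A.tocsc(), img.ravel()).reshape(h, w)

def multiscale_decompose(img, betas=(0.075, 0.600, 4.800)):
    """K-level decomposition of Eqs. (4)-(6): returns (approximate image, detail images)."""
    coarse = [img]                                   # I_c^0 = I_in
    for b in betas:                                  # beta_i < beta_{i+1}
        coarse.append(wls_smooth(img, beta=b))
    details = [coarse[i] - coarse[i + 1] for i in range(len(betas))]
    return coarse[-1], details                       # I_c^{K-1}, [I_d^1 ... I_d^{K-1}]
```

Summing the returned approximate image and detail images recovers the input, which mirrors Eq. (6).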
2.3. Guided image filtering

The guided image filter [13] is a local linear translation-variant filter based on a local linear model. This filter can be used in many applications such as up-sampling, image matting, and image fusion [12]. Guided image filtering, which involves a filtering source image I_in, a guidance image I_c, and a filtering output I_out, is a compromise between two conditions. The first condition is that a local linear model exists between the guidance image and the filtering output. The second is that the filtering output should be as similar as possible to the source image. The filtering output at pixel i can be expressed as [13]:

I_{out}(i) = \sum_j W_{ij}^{Guided}(I_c) \, I_{in}(j)    (7)

where W_{ij}^{Guided}(·) denotes a filter kernel function of the guidance image I_c and is independent of the source image I_in. The filter kernel function in Eq. (7) can be expressed as:

W_{ij}^{Guided}(I_c) = \frac{1}{|C|^2} \sum_{k:(i,j) \in C_k} \left( 1 + \frac{(I_c(i) - \mu_k)(I_c(j) - \mu_k)}{\sigma_k^2 + \epsilon} \right)    (8)

where μ_k and σ_k^2 are the mean and variance of the guidance image I_c in a square window C_k, respectively, and |C| is the number of pixels in the window. Herein, the radius of the window is r. For simplicity, we can rewrite Eq. (7) as:

I_{out} = G(I_{in}, I_c, r, \epsilon)    (9)

where the parameters r and ε determine the filter size and the blur degree of the guided filter, respectively.
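For reference, here is a minimal sketch of the box-filter form of the guided filter G(·, I_c, r, ε) of Eq. (9), following the formulation of He et al. [13]; the function name and the use of scipy's uniform_filter are implementation choices, not part of the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(p, guide, r, eps):
    """G(p, guide, r, eps) of Eq. (9): p is the image to be filtered (e.g. an
    initial weighting map), guide is the guidance image, r the window radius,
    eps the regularization constant. Both are 2-D float arrays of equal size."""
    win = 2 * r + 1
    mean = lambda x: uniform_filter(x, size=win)
    mean_g, mean_p = mean(guide), mean(p)
    var_g = mean(guide * guide) - mean_g * mean_g     # local variance of the guide
    cov_gp = mean(guide * p) - mean_g * mean_p        # local covariance
    a = cov_gp / (var_g + eps)                        # local linear coefficients
    b = mean_p - a * mean_g
    return mean(a) * guide + mean(b)                  # averaged model applied to the guide
```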
3. Proposed method

The framework of the proposed method is shown in Fig. 3. The algorithm is performed with the following main steps: (1) decomposing the source IR and VIS images; (2) generating saliency maps with PC; (3) calculating weighting maps; and (4) reconstructing the fused image. The source IR and VIS images are decomposed into approximate and detailed images in the decomposition stage. Saliency maps of the IR and VIS images are obtained by using PC in the saliency map generation stage. The weighting maps are generated by guided image filtering in the weighting map calculation stage. The fused image is constructed with the composite approximate and detailed images in the reconstruction stage.

3.1. Multi-scale decomposition

The source IR image I_IR is decomposed into an approximate IR image I_IR^c and a set of detailed IR images I_IR^{D(i)}, i = 1, 2, ..., K-1 with the multi-scale decomposition. First, a sequence of progressively coarser IR images I_IR^{C(1)}, I_IR^{C(2)}, ..., I_IR^{C(K-1)} is generated using Eq. (4):

I_{IR}^{C(i)} = D(\beta_i, I_{IR}), \quad i = 1, 2, \ldots, K-1    (10)

The coarsest of these versions, I_IR^{C(K-1)}, is considered as the approximate image I_IR^c. Let I_IR^{C(0)} = I_IR. Thus, a set of detailed images I_IR^{D(i)} is computed as:

I_{IR}^{D(i)} = I_{IR}^{C(i-1)} - I_{IR}^{C(i)}, \quad i = 1, 2, \ldots, K-1    (11)

Similarly, the source VIS image I_VIS can be decomposed into an approximate VIS image I_VIS^c and a set of detailed VIS images I_VIS^{D(i)}, i = 1, 2, ..., K-1 at different scales.

3.2. Saliency map generation

We adopt PC to generate the saliency maps of the source IR and VIS images because PC is a powerful tool for extracting visually discernable features. We can calculate the saliency map of the IR image by:

S_{IR} = P(I_{IR})    (12)

where P(·) refers to the PC operator and S_IR is the saliency map of the IR image after performing PC.
Fig. 3. Diagram of the proposed method.
Fig. 4. Test image pairs (first row: IR images; second row: corresponding VIS images): (a) "Quad" image set; (b) "UNcamp" image set; (c) "7118a" image set; (d) "trees_4917" image set; (e) "e518a" image set.
In a similar manner, we can obtain the saliency map of the VIS image, S_VIS, from the source VIS image I_VIS by using PC.

3.3. Weighting map calculation

Weighting maps are calculated in two steps. The first step is to infer an initial weighting map by comparing the saliency maps of the IR and VIS images. However, the initial weighting maps may be noisy and may produce artifacts in the fused image [12]. To solve this problem, GF is applied to the initial weighting maps to generate the ultimate weighting maps in the second step. The initial weighting map for the source IR image is calculated with the saliency maps extracted in the previous step. The initial weighting map can be computed as follows:

M_{IR}(i, j) = \begin{cases} 1, & \text{if } S_{IR}(i, j) > S_{VIS}(i, j) \\ 0, & \text{otherwise} \end{cases}    (13)

where M_IR(i, j) denotes the initial weight at position (i, j) in the initial weighting map M_IR of the IR image, and S_IR(i, j) and S_VIS(i, j) refer to the saliency values at position (i, j) in the IR and VIS images, respectively. The weighting maps obtained above may produce artifacts in the fused image because of the spatial inconsistency [12] in the weighting maps. To resolve this problem, we apply the GF to the initial weighting maps with the corresponding source images serving as the guidance images to generate the ultimate weighting maps, in which pixels with similar brightness tend to have similar weights. The ultimate weighting maps of the IR image at different scales can be calculated as:

W_{IR}^c = G(M_{IR}, I_{IR}, r_0, \epsilon_0), \qquad W_{IR}^{D(i)} = G(M_{IR}, I_{IR}, r_i, \epsilon_i), \quad i = 1, 2, \ldots, K-1    (14)
Fig. 5. IR and VIS image fusion results of different fusion algorithms for the "Quad" image set: (a) IR image; (b) VIS image; (c) WT based method; (d) SIDWT based method; (e) GF based method; (f) NSCT based method; (g) SAID based method; (h) the proposed method.
where W_IR^c and W_IR^{D(i)} are the ultimate weighting maps of the approximate and detailed images, respectively, and r_i, ε_i (i = 0, 1, ..., K-1) are the parameters of the guided filter at each scale. The way in which the values of r_i and ε_i are set is of great importance. The weighting maps should generally be consistent with their corresponding images [12]. In other words, if an image is spatially smooth, its weighting map should likewise be spatially smooth; by contrast, the weighting map should be sharp if the image is spatially sharp. Thus, an approximate image prefers a smooth weighting map, whereas a detailed image prefers a sharp weighting map. Therefore, we adopt a large filter size and blur degree for the approximate image, whereas we adopt a small filter size and a small blur degree for the detailed images. We can obtain the ultimate weighting maps of the approximate image W_VIS^c and detailed images W_VIS^{D(i)} of the VIS image in the same manner.
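As an illustration of Eqs. (13) and (14), the following sketch builds the initial weighting map by comparing precomputed saliency maps and then refines it with guided filtering at each scale. It assumes the guided_filter() helper sketched in Section 2.3 and externally supplied saliency maps (e.g. from a phase congruency implementation); the function names and default parameter values are illustrative only.

```python
import numpy as np

def weighting_maps(sal_ir, sal_vis, ir, radii=(45, 15, 7, 3),
                   eps=(0.7, 0.03, 0.003, 0.0003)):
    """Eqs. (13)-(14) for the IR image: binary initial map refined by guided
    filtering, with the source IR image as guidance at each scale.
    sal_ir, sal_vis: saliency maps; ir: source IR image (guidance).
    guided_filter is the helper sketched in Section 2.3."""
    m_ir = (sal_ir > sal_vis).astype(np.float64)              # Eq. (13)
    # W_IR^c uses (r_0, eps_0); W_IR^{D(i)} uses (r_i, eps_i), i = 1..K-1 (Eq. (14))
    w_approx = guided_filter(m_ir, ir, radii[0], eps[0])
    w_details = [guided_filter(m_ir, ir, r, e)
                 for r, e in zip(radii[1:], eps[1:])]
    return w_approx, w_details

# The VIS weighting maps are obtained in the same manner with the comparison
# of the saliency maps reversed, e.g. m_vis = (sal_vis >= sal_ir).
```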
3.4. Fused image reconstruction

The approximate and detailed images can be fused by weighted averaging:

I_f^B = W_{IR}^c I_{IR}^c + W_{VIS}^c I_{VIS}^c    (15)

I_f^{D(i)} = W_{IR}^{D(i)} I_{IR}^{D(i)} + W_{VIS}^{D(i)} I_{VIS}^{D(i)}    (16)

where I_f^B and I_f^{D(i)} refer to the fused approximate and detailed images, respectively. Finally, the fused image I_f can be synthesized from the fused approximate and detailed images:

I_f = I_f^B + \sum_{i=1}^{K-1} I_f^{D(i)}    (17)
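A compact sketch of the reconstruction step, Eqs. (15)–(17), assuming the weighting maps and the decomposed images have already been computed with the helpers sketched above (variable names are illustrative):

```python
def reconstruct(w_ir, w_vis, dec_ir, dec_vis):
    """w_*   = (approximate weight map, [detail weight maps])
       dec_* = (approximate image, [detail images])"""
    (wa_ir, wd_ir), (wa_vis, wd_vis) = w_ir, w_vis
    (a_ir, d_ir), (a_vis, d_vis) = dec_ir, dec_vis
    fused = wa_ir * a_ir + wa_vis * a_vis                       # Eq. (15)
    for wi, wv, di, dv in zip(wd_ir, wd_vis, d_ir, d_vis):
        fused = fused + wi * di + wv * dv                       # Eqs. (16)-(17)
    return fused
```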
4. Experimental results and analysis

Five pairs of fully registered source IR and VIS images, which are called "Quad", "UNcamp", "7118a", "trees_4917", and "e518a" (see Fig. 4), are selected from www.imagefusion.org to assess the performance of the proposed method. Five other image fusion methods are considered for comparison in this experiment: the wavelet transform (WT)-, shift-invariant discrete WT (SIDWT)-, nonsubsampled contourlet transform (NSCT)- [9], saliency analysis and image decomposition (SAID)- [21], and guided filter (GF)- [12] based fusion methods. Three levels of decomposition are set for the WT- and SIDWT-based methods. The average rule is employed to combine low-frequency coefficients, whereas the maximum-energy rule is adopted to fuse high-frequency coefficients for these two methods. The parameters used for the NSCT-, SAID-, and GF-based fusion methods are those given by the authors of [9,21,12], respectively. The proposed method mainly involves two types of parameters. One type is for the multi-scale edge-preserving decomposition, and the other is for the guided filter. The parameters of the multi-scale decomposition comprise the number of decomposition levels K, the regularization constants β_i, i = 1, 2, ..., K-1, and the exponent α. The level of decomposition is set to K = 4 in the experiments, the regularization constants are set to β_i = {0.075, 0.600, 4.800}, i = 1, 2, 3, and α is set to 0.6. Given that the guided filter is adopted to optimize the initial weighting maps, two parameters, i.e., the filter size r_i and the regularization constant ε_i, are present at each scale. The filter sizes and the regularization parameters are set to r_i = {45, 15, 7, 3}, i = 0, 1, 2, 3, and ε_i = {0.7, 0.03, 0.003, 0.0003}, i = 0, 1, 2, 3, respectively.
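For convenience, the parameter settings listed above can be collected in a single configuration block; the dictionary layout and key names below are illustrative, not part of the paper.

```python
# Experimental parameters reported in the paper (key names are illustrative).
FUSION_PARAMS = {
    "levels": 4,                               # K
    "alpha": 0.6,                              # gradient exponent of Eq. (2)
    "betas": (0.075, 0.600, 4.800),            # beta_i, i = 1..K-1
    "gf_radii": (45, 15, 7, 3),                # r_i, i = 0..3
    "gf_eps": (0.7, 0.03, 0.003, 0.0003),      # eps_i, i = 0..3
}
```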
4.1. Subjective performance evaluation

The experimental results of the different image fusion methods for each image set are shown in Figs. 5 and 7–10. Fig. 5 shows the experimental results on the "Quad" image set. Fig. 5(a) and (b) show the source IR and VIS images of the "Quad" set, respectively. The cars and the traffic lights are clear in the IR image, whereas the advertising board is blurred. By contrast, the cars and traffic lights can hardly be seen in the VIS image, whereas the advertising board is clear. The cars, traffic lights, and advertising board are all clear in the fused images. The results obtained from the different fusion methods contain the major information from the IR and VIS images in terms of visual quality. However, the results of the different fusion methods differ slightly in terms of contrast and details. Fig. 6 shows a zoomed view of the marked region in Fig. 5 for further comparison. Fig. 6(c) shows that the result of the WT-based method contains some obvious artifacts. Furthermore, the WT-based method smooths the fused image, which causes some details to be missing: both the car and the traffic lights are unclear. Fig. 6(d) shows fewer artifacts in the fused image obtained by the SIDWT-based method because the shift-invariant transform can successfully suppress pseudo-Gibbs phenomena and can achieve better edges in the fused image.
Fig. 6. Zoomed version of the marked region in Fig. 5: (a)–(h) the zoomed versions of the labeled areas of Fig. 5(a)–(h).
Fig. 7. IR and VIS image fusion results of different fusion algorithms for the "UNcamp" image set: (a) IR image; (b) VIS image; (c) WT based method; (d) SIDWT based method; (e) GF based method; (f) NSCT based method; (g) SAID based method; (h) the proposed method.
Fig. 6(e), which is obtained from the GF-based method, has fewer artifacts, but some details are missing in the result: the guided filter overcomes the spatial inconsistency problem, but it tends to smooth the weighting map, which causes details to be lost. Fig. 6(f) shows that some details are not clear in the fused image obtained by the NSCT-based method [9]. Fig. 6(g) shows that the fused image obtained by the SAID-based method [21] contains considerable noise. Fig. 6(h) shows the resultant image obtained with the proposed method. The resultant image appears natural, and its edges are clear. Furthermore,
the artifacts, even those on top of the car, are almost eliminated because the proposed method takes advantage of both the multi-scale decomposition and the guided filter. The multi-scale decomposition extracts the useful image features at different scales to improve the quality of the fused image, whereas the guided filter eliminates the artifacts produced by spatial inconsistency. Furthermore, PC generates a better saliency map than the Laplace operator. Thus, the result of the proposed method is better than that of the other methods, and the proposed method outperforms them in terms of visual effect. Fig. 7 shows the resultant images obtained with different methods for the "UNcamp" image set. Fig. 7(a) and (b) show the source IR and VIS images, respectively. Fig. 7(c) and (d) show that the WT- and SIDWT-based methods produce obvious artifacts, and the resultant images obtained with these two methods are unclear. Fig. 7(e) is the result of the GF-based method; it shows that the GF-based method performs better than the WT- and SIDWT-based methods, although part of the fence remains unclear even though the artifacts are reduced. Fig. 7(f) shows that some details are not clear in the fused image obtained by the NSCT-based method. The fused image obtained by the SAID-based method, shown in Fig. 7(g), contains considerable noise. Fig. 7(h) shows the result of the proposed method. The proposed method not only preserves the image details of the source images well but also suppresses the artifacts. In addition, the resultant image has a higher contrast. The performance of the proposed method is superior to that of the other methods in terms of visual effect. Figs. 8–10 show examples on the "7118a", "trees_4917", and "e518a" image sets, respectively. Similar to the previous results, the results of the proposed method are better than those of the other methods: they contain the details of the source images and have few artifacts. Therefore, the proposed method generally achieves better performance than the other methods, and we can conclude that it outperforms them in terms of visual effect.
4.2. Objective performance evaluation

The information in the source IR and VIS images should be effectively transferred into the final fused image by the fusion methods. Three objective fusion quality criteria [22,23], i.e., an information theory-based metric (MI) [24], an image feature-based metric (Q^{AB/F}) [25], and an image structure-based metric (Q^{SSIM}) [26], are adopted to assess the fusion results and to quantify the performance of the proposed method. Mutual information is adopted in the information theory-based metric. The metric based on mutual information [24] measures the amount of source image information preserved in the fused image after fusion processing. The Q^{AB/F} metric [25] is a feature-based metric, which measures how well the edge information is kept in the fused images. The image structure-based metric Q^{SSIM} uses structural similarity (SSIM) [27] for fusion assessment. The metric Q^{SSIM} [26] assesses the amount of structural similarity transferred from the source images to the fused images. Higher values of these three metrics indicate better fusion quality of the fused image. Quantitative comparisons of the different fusion methods' performance are given in Table 1 for the five test image sets. In addition, we compare the contributions of the IR and VIS images when evaluating MI, Q^{AB/F}, and Q^{SSIM} in Table 1, where MI_VIS and MI_IR denote the contributions of the VIS image and the IR image in terms of MI, respectively; Q^{AB/F}_VIS and Q^{AB/F}_IR refer to the contributions of the VIS image and the IR image in terms of Q^{AB/F}, respectively; and Q^{SSIM}_VIS and Q^{SSIM}_IR are the contributions of the VIS image and the IR image in terms of Q^{SSIM}, respectively.
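As an example of how such a metric can be computed, the following is a minimal sketch of the mutual information fusion metric in the spirit of [24]: MI = MI(VIS, F) + MI(IR, F), estimated from joint grey-level histograms. The bin count and normalization details are assumptions, not the exact protocol used in the paper.

```python
import numpy as np

def mutual_information(x, y, bins=256):
    """MI between two images, estimated from their joint histogram."""
    joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                                    # avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def fusion_mi(ir, vis, fused, bins=256):
    """MI fusion metric: per-source contributions and their sum."""
    mi_ir = mutual_information(ir, fused, bins)
    mi_vis = mutual_information(vis, fused, bins)
    return mi_vis, mi_ir, mi_vis + mi_ir            # MI_VIS, MI_IR, MI
```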
Fig. 8. IR and VIS image fusion results of different fusion algorithms for the "7118a" image set: (a) IR image; (b) VIS image; (c) WT based method; (d) SIDWT based method; (e) GF based method; (f) NSCT based method; (g) SAID based method; (h) the proposed method.

Table 1 shows that the SIDWT-based method performs better than the WT-based method in terms of MI, Q^{AB/F}, and Q^{SSIM}. The GF-based method obtains better results than the SIDWT-based method in terms of MI as a whole. The performance of the GF-based method is worse than
that of the SIDWT-based method in terms of Q^{AB/F}. This finding demonstrates that the GF-based method can transfer the mutual information of the source images into the fused image well but does not effectively import the edge information of the source images into the fused image. This finding is consistent with the observation that the GF-based method can effectively suppress the artifacts but
tends to produce a smooth fused image, i.e., it tends to miss some details. Compared with the WT- and SIDWT-based methods, the NSCT-based method obtains better results in terms of MI and Q^{SSIM} as a whole. This finding illustrates that the NSCT-based method imports the mutual information and the structural similarity of the source images into the fused image well.
Fig. 9. IR and VIS image fusion results of different fusion algorithms for the "trees_4917" image set: (a) IR image; (b) VIS image; (c) WT based method; (d) SIDWT based method; (e) GF based method; (f) NSCT based method; (g) SAID based method; (h) the proposed method.
Fig. 10. IR and VIS image fusion results of different fusion algorithms for the "e518a" image set: (a) IR image; (b) VIS image; (c) WT based method; (d) SIDWT based method; (e) GF based method; (f) NSCT based method; (g) SAID based method; (h) the proposed method.
Generally, the SAID-based method performs better than the WT-, SIDWT-, GF-, and NSCT-based methods in terms of MI and Q^{SSIM}. This demonstrates that the SAID-based method imports the mutual information and the structural similarity of the source images into the fused image well. As a whole, the proposed method has higher Q^{AB/F} and Q^{SSIM} indices than the other methods. In general, the performance of the proposed method is the best. From Table 2, we can see that the average Q^{AB/F} and Q^{SSIM} of the proposed method are higher than those of the other methods. Although the average MI of the proposed method is not the highest, it is the second highest.
Table 1
Evaluation indices (MI, MI_VIS, MI_IR, Q^{AB/F}, Q^{AB/F}_VIS, Q^{AB/F}_IR, Q^{SSIM}, Q^{SSIM}_VIS, Q^{SSIM}_IR) for the fused images on the test image sets.

Image set    Method                MI    MI_VIS  MI_IR  Q^AB/F  Q^AB/F_VIS  Q^AB/F_IR  Q^SSIM  Q^SSIM_VIS  Q^SSIM_IR
Quad         WT based method       1.57  1.18    0.39   0.488   0.332       0.156      0.524   0.248       0.276
             SIDWT based method    1.84  1.45    0.39   0.581   0.406       0.175      0.562   0.272       0.293
             GF based method       1.44  0.65    0.79   0.490   0.227       0.263      0.622   0.177       0.445
             NSCT based method     1.90  1.47    0.43   0.456   0.352       0.105      0.567   0.356       0.211
             SAID based method     2.52  2.22    0.31   0.481   0.438       0.043      0.799   0.710       0.089
             Proposed method       2.30  1.92    0.38   0.673   0.517       0.155      0.961   0.611       0.350
UNcamp       WT based method       1.37  0.54    0.84   0.434   0.185       0.249      0.463   0.215       0.248
             SIDWT based method    1.45  0.55    0.90   0.503   0.210       0.293      0.494   0.234       0.260
             GF based method       1.61  0.64    0.97   0.515   0.235       0.280      0.512   0.236       0.276
             NSCT based method     2.01  0.83    1.18   0.383   0.186       0.197      0.507   0.310       0.198
             SAID based method     3.52  2.90    0.62   0.455   0.351       0.104      0.755   0.714       0.041
             Proposed method       1.68  0.57    1.11   0.564   0.240       0.324      0.761   0.354       0.407
7118a        WT based method       2.37  1.54    0.84   0.484   0.329       0.155      0.212   0.135       0.076
             SIDWT based method    2.65  1.71    0.95   0.552   0.375       0.177      0.274   0.169       0.105
             GF based method       2.53  1.24    1.29   0.487   0.290       0.198      0.237   0.170       0.067
             NSCT based method     3.30  1.83    1.47   0.457   0.297       0.160      0.337   0.156       0.181
             SAID based method     3.17  2.10    1.07   0.372   0.283       0.089      0.706   0.656       0.050
             Proposed method       3.75  2.56    1.20   0.615   0.460       0.155      0.461   0.368       0.093
trees_4917   WT based method       1.35  0.56    0.79   0.472   0.221       0.251      0.476   0.256       0.219
             SIDWT based method    1.45  0.59    0.87   0.525   0.243       0.282      0.515   0.273       0.242
             GF based method       1.77  0.91    0.86   0.512   0.261       0.251      0.391   0.252       0.139
             NSCT based method     1.66  0.37    1.29   0.385   0.144       0.241      0.650   0.081       0.569
             SAID based method     1.50  1.23    0.27   0.423   0.257       0.167      0.631   0.565       0.066
             Proposed method       1.68  0.54    1.13   0.604   0.274       0.330      0.722   0.372       0.350
e518a        WT based method       1.82  1.02    0.80   0.542   0.328       0.214      0.588   0.329       0.259
             SIDWT based method    2.00  1.15    0.85   0.607   0.361       0.246      0.619   0.342       0.277
             GF based method       2.17  1.39    0.78   0.623   0.390       0.233      0.627   0.367       0.260
             NSCT based method     2.28  1.21    1.06   0.425   0.233       0.192      0.549   0.298       0.251
             SAID based method     2.93  2.41    0.52   0.483   0.406       0.077      0.860   0.770       0.090
             Proposed method       2.34  1.50    0.84   0.662   0.397       0.265      0.952   0.545       0.407
The values highlighted in bold indicate that they are the best or highest among the compared results.

Table 2
The average MI, MI_VIS, MI_IR, Q^{AB/F}, Q^{AB/F}_VIS, Q^{AB/F}_IR, Q^{SSIM}, Q^{SSIM}_VIS and Q^{SSIM}_IR values of fused images on the test image sets.

Method               MI    MI_VIS  MI_IR  Q^AB/F  Q^AB/F_VIS  Q^AB/F_IR  Q^SSIM  Q^SSIM_VIS  Q^SSIM_IR
WT based method      1.70  0.97    0.73   0.484   0.279       0.205      0.453   0.237       0.216
SIDWT based method   1.88  1.09    0.79   0.554   0.319       0.235      0.493   0.258       0.236
GF based method      1.90  0.97    0.94   0.525   0.280       0.245      0.478   0.241       0.237
NSCT based method    2.23  1.14    1.09   0.421   0.242       0.179      0.522   0.240       0.282
SAID based method    2.73  2.17    0.56   0.443   0.347       0.096      0.750   0.683       0.067
Proposed method      2.35  1.42    0.93   0.624   0.378       0.246      0.771   0.450       0.321
The values highlighted in bold indicate that they are the best or highest among the compared results.

Table 3
The MI, Q^{AB/F}, Q^{SSIM} values of fused images for role analysis of the used techniques.

Image set    Method                MI    Q^AB/F  Q^SSIM
Quad         GF based method       1.44  0.490   0.627
             PC based method       2.30  0.686   0.711
             MSEPD based method    1.55  0.504   0.820
             Proposed method       2.30  0.673   0.961
UNcamp       GF based method       1.61  0.515   0.512
             PC based method       1.62  0.552   0.528
             MSEPD based method    1.64  0.527   0.741
             Proposed method       1.68  0.564   0.761
7118a        GF based method       2.53  0.487   0.237
             PC based method       3.90  0.618   0.302
             MSEPD based method    2.50  0.496   0.364
             Proposed method       3.75  0.615   0.461
trees_4917   GF based method       1.77  0.512   0.391
             PC based method       1.54  0.592   0.531
             MSEPD based method    1.70  0.528   0.546
             Proposed method       1.68  0.604   0.722
e518a        GF based method       2.17  0.623   0.627
             PC based method       2.18  0.648   0.687
             MSEPD based method    2.26  0.635   0.912
             Proposed method       2.34  0.662   0.952

The values highlighted in bold indicate that they are the best or highest among the compared results.

Table 4
The average MI, Q^{AB/F}, Q^{SSIM} values of fused images for role analysis of the used techniques.

Method               MI    Q^AB/F  Q^SSIM
GF based method      1.90  0.478   0.525
PC based method      2.31  0.552   0.619
MSEPD based method   1.93  0.677   0.538
Proposed method      2.35  0.771   0.624

The values highlighted in bold indicate that they are the best or highest among the compared results.
These results indicate that the performance of the proposed method is generally the best among the competing methods: the mutual, edge, and structural information of the source images is well transferred into the fused image. This finding further demonstrates that the proposed method achieves better results by taking advantage of the multi-scale decomposition, the guided filter, and PC. Therefore, the fused result of the proposed method preserves the most image information of the source IR and VIS images. In addition, the objective metrics are consistent with the above visual effect. Hence, the proposed method outperforms the WT-, SIDWT-, GF-, NSCT-, and SAID-based methods in terms of both visual and objective qualities. From Tables 1 and 2, we can observe that in general the NSCT-based method transfers more information of the IR image into the fused image than the proposed method in terms of MI. However, its contributions from the IR and VIS images are not balanced: the NSCT-based method does transfer the information of the IR image into the fused image in terms of MI, but it transfers little information of the VIS image. Similarly, the SAID-based method transfers more information of the VIS image into the fused image than the proposed method in terms of Q^{SSIM}, and its contributions from the IR and VIS images are likewise unbalanced: it transfers the information of the VIS image in terms of Q^{SSIM}, but it transfers little information of the IR image. From Table 2, we can see that the average contributions of the VIS and IR images with the proposed method are larger than those of the other methods in terms of Q^{AB/F}. Moreover, the average contribution of the IR image with the proposed method is the best among the compared methods in terms of Q^{SSIM}. Although the average contributions of the VIS and IR images with the proposed method are not the best in terms of MI, they are ranked in the top three. The proposed method transfers both the VIS and IR images into the fused image more effectively than the other methods in terms of Q^{AB/F} and Q^{SSIM}. In addition, the information of the IR and VIS images is evenly transferred into the fused image. This leads to the conclusion that the proposed method has the best performance among the compared methods on the whole.

Fig. 11. The impact of the exponent α on the fused images: (a) in terms of MI; (b) in terms of Q^{AB/F}; (c) in terms of Q^{SSIM}.

Fig. 12. The impact of the regularization constant β_1 on the fused images: (a) in terms of MI; (b) in terms of Q^{AB/F}; (c) in terms of Q^{SSIM}.

4.3. Role analysis of the used techniques

Three techniques, i.e., the guided image filter, PC, and the multi-scale edge-preserving decomposition (MSEPD), are adopted in the proposed method. It is important to know which one plays the dominant role and what the advantage of combining the three techniques is. To analyze the role of the three techniques, we compared four methods: the GF-based method [12], a PC-based method, an MSEPD-based method, and the proposed method. The GF-based method, proposed in [12], adopts the guided filter and the Laplacian operator to fuse the source images. The PC-based method is similar to the GF-based method except that the Laplacian operator is replaced with PC; that is, it adopts the guided filter and PC to fuse the source images. The MSEPD-based method is similar to the GF-based method except that MSEPD is introduced; that is, it adopts the guided filter, the Laplacian operator, and MSEPD to fuse the source images. The proposed method adopts the guided filter, PC, and MSEPD to fuse the source images. A comparison of the four methods is shown in Table 3. From Tables 3 and 4, we can see that both the PC-based method and the MSEPD-based method perform better than the GF-based method in terms of MI, Q^{AB/F}, and Q^{SSIM}. This finding demonstrates that both PC and MSEPD can effectively improve the fusion performance.
Fig. 13. The impact of the scale constant S on the fused images: (a) in terms of MI; (b) in terms of Q^{AB/F}; (c) in terms of Q^{SSIM}.
Fig. 14. The impact of the filter size of the first level r_1 on the fused images: (a) in terms of MI; (b) in terms of Q^{AB/F}; (c) in terms of Q^{SSIM}.
More specifically, the average MI, Q^{AB/F}, and Q^{SSIM} gains of the PC-based method over the GF-based method are 0.405, 0.071, and 0.094, respectively. Furthermore, we can observe that the average MI, Q^{AB/F}, and Q^{SSIM} gains of the MSEPD-based method over the GF-based method are 0.026, 0.199, and 0.013, respectively. These findings illustrate that PC plays the dominant role in terms of MI and Q^{SSIM}, while MSEPD plays the dominant role in terms of Q^{AB/F}. In other words, PC rather than the Laplacian operator can significantly improve the performance of the fusion method in terms of MI and Q^{SSIM}, while MSEPD can significantly improve the performance in terms of Q^{AB/F}. In addition, we can see that the proposed method generally performs better than the PC-based method and the MSEPD-based method in terms of MI, Q^{AB/F}, and Q^{SSIM}. We can draw the conclusion that PC plays the dominant role in terms of MI and Q^{SSIM}, while MSEPD plays the main role in terms of Q^{AB/F}; thus, PC and MSEPD are complementary to each other. By combining the guided image filter, PC, and MSEPD, the proposed method performs better than the GF-based, PC-based, and MSEPD-based methods in terms of MI, Q^{AB/F}, and Q^{SSIM}.

4.4. Parameter analysis

The proposed method mainly involves two types of parameters. One type is for the multi-scale edge-preserving decomposition (MSEPD), and the other is for the guided filter. For the multi-scale decomposition, the parameters consist of the exponent α and the regularization constant β_i at each scale. To simplify the regularization parameter analysis, we suppose that β_{i+1} = S·β_i, where S is a scale constant. Thus, for the multi-scale decomposition, we have three parameters, i.e., the exponent α, the regularization constant β_1, and the scale constant S. We investigated the impact of the exponent α, the regularization constant β_1, and the scale constant S on the fused images.
Fig. 15. The impact of the filter size of the second level r_2 on the fused images: (a) in terms of MI; (b) in terms of Q^{AB/F}; (c) in terms of Q^{SSIM}.
Fig. 16. The impact of the regularization constant ε_1 on the fused images: (a) in terms of MI; (b) in terms of Q^{AB/F}; (c) in terms of Q^{SSIM}.
Figs. 11–13 plot the MI, Q^{AB/F}, and Q^{SSIM} curves against α, β_1, and S, respectively. From Figs. 11 and 12, we can observe that when α or β_1 increases, the MI index increases while the Q^{SSIM} index decreases; Q^{AB/F} is not sensitive to α or β_1. There is thus a trade-off between the MI index and the Q^{SSIM} index: a higher MI index is obtained at the expense of the Q^{SSIM} index by tuning α or β_1. From Fig. 13, we can see that when S increases, the MI index slowly increases; when S goes beyond ten, no significant improvement is observed. In contrast, Q^{AB/F} changes slowly and Q^{SSIM} decreases as S increases. The guided filter contains two parameters, i.e., the filter size r_i and the regularization constant ε_i, at each scale. To simplify the parameter analysis, we suppose that r_3 = r_2 - 1, r_4 = r_3 - 1, and ε_i/ε_{i+1} = 100. Thus, for the guided filter, we have three parameters, i.e., the filter size of the first level r_1, the filter size of the second level r_2, and the regularization constant ε_1. Figs. 14–16 plot the MI, Q^{AB/F}, and Q^{SSIM} curves against r_1, r_2, and ε_1, respectively. From Figs. 14 and 15, we can find that when r_1 or r_2 increases, the MI index slowly increases, while the Q^{SSIM} index decreases; Q^{AB/F} is not sensitive to r_1 or r_2. From Fig. 16, we can see that all three indices, MI, Q^{AB/F}, and Q^{SSIM}, are insensitive to ε_1. This means that ε_1 is not critical to the results.
5. Conclusion

We developed a new IR and VIS image fusion scheme by combining multi-scale edge-preserving decomposition with the guided filter. The multi-scale edge-preserving decomposition can effectively extract the useful information from the source images, whereas the guided image filter can eliminate artifacts. The proposed fusion scheme achieves the best results by taking advantage of both the multi-scale decomposition and the guided filter. In addition, PC rather than the Laplace operator is adopted in this study to obtain better saliency maps, which improves the performance of the proposed method. Many experiments were conducted to evaluate the performance of the proposed method. The experiments show that the proposed method outperforms other fusion methods in terms of both visual quality and objective evaluation. The proposed method can be further improved by exploiting methods other than PC in future work. Another improvement is to select parameters automatically instead of using fixed parameters to achieve good performance. In addition, another future direction is to apply the proposed method to other multi-modality image fusion tasks.

Conflict of interest

This paper has not been published previously and is not under consideration for publication in any other journal. The authors of this paper certify that they have no affiliations with or involvement in any organization or entity with any financial or non-financial interest in the subject matter or materials discussed in this paper.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grant Nos. 61271330, 11176018), by the Science and Technology Plan of Sichuan Province (No. 2014GZ0005), by the
National Science Foundation for Post-doctoral Scientists of China (No. 2014M552357), by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry, and by the Open Research Fund of Jiangsu Province Key Laboratory of Image Processing and Image Communication, Nanjing University of Posts and Telecommunications (No. LBEK2013001).

References

[1] X. Qu, J. Yan, G. Xie, Z. Zhu, B. Chen, A novel image fusion algorithm based on bandelet transform, Chinese Opt. Lett. 5 (2007) 569–572.
[2] X. Bai, S. Gu, F. Zhou, B. Xue, Weighted image fusion based on multi-scale top-hat transform: algorithms and a comparison study, Optik 124 (2013) 1660–1668.
[3] W. Kong, L. Zhang, Y. Lei, Novel fusion method for visible light and infrared images based on NSST-SF-PCNN, Infrared Phys. Technol. 65 (2014) 103–112.
[4] X. Bai, X. Chen, F. Zhou, Z. Liu, B. Xue, Multiscale top-hat selection transform based infrared and visual image fusion with emphasis on extracting regions of interest, Infrared Phys. Technol. 60 (2013) 81–93.
[5] J.J. Lewis, R.J. O'Callaghan, S.G. Nikolov, D.R. Bull, N. Canagarajah, Pixel- and region-based image fusion with complex wavelets, Inform. Fusion 8 (2007) 119–130.
[6] H. Wang, J. Peng, W. Wu, Fusion algorithm for multisensor images based on discrete multiwavelet transform, in: IEE Proceedings Vision, Image and Signal Processing, vol. 149, pp. 283–289.
[7] F. Xu, S. Su, An enhanced infrared and visible image fusion method based on wavelet transform, in: 5th International Conference on Intelligent Human–Machine Systems and Cybernetics, pp. 453–456.
[8] B. Yang, Z. Jing, Image fusion using a low-redundancy and nearly shift invariant discrete wavelet frame, Opt. Eng. 46 (2007) 107002-1–107002-10.
[9] J. Adu, J. Gan, Y. Wang, J. Huang, Image fusion based on nonsubsampled contourlet transform for infrared and visible light image, Infrared Phys. Technol. 61 (2013) 94–100.
[10] W. Cai, M. Li, X.Y. Li, Infrared and visible image fusion scheme based on contourlet transform, in: Fifth International Conference on Image and Graphics, pp. 516–520.
[11] P. Burt, E. Adelson, The Laplacian pyramid as a compact image code, IEEE Trans. Commun. 31 (1983) 532–540.
[12] S. Li, X. Kang, J. Hu, Image fusion with guided filtering, IEEE Trans. Image Process. 22 (2013) 2864–2875.
[13] K. He, J. Sun, X. Tang, Guided image filtering, IEEE Trans. Pattern Anal. Mach. Intell. 35 (2013) 1397–1409.
[14] C. Li, A.C. Bovik, X. Wu, Blind image quality assessment using a general regression neural network, IEEE Trans. Neural Netw. 22 (2011) 793–799.
[15] W. Chen, Y.Q. Shi, W. Su, Image splicing detection using 2-D phase congruency and statistical moments of characteristic function, in: Proceedings of SPIE, Security, Steganography and Watermarking of Multimedia Contents IX, vol. 6505.
[16] L. Zhang, L. Zhang, X. Mou, D. Zhang, FSIM: a feature similarity index for image quality assessment, IEEE Trans. Image Process. 20 (2011) 2378–2386.
[17] M.C. Morrone, R.A. Owens, Feature detection from local energy, Pattern Recogn. Lett. 6 (1987) 303–313.
[18] P. Kovesi, Image features from phase congruency, Videre: J. Comput. Vis. Res. 1 (1999) 1–26.
[19] Z. Farbman, R. Fattal, D. Lischinski, R. Szeliski, Edge-preserving decompositions for multi-scale tone and detail manipulation, ACM Trans. Graph. 27 (2008) 67.
[20] J. Zhao, H. Feng, Z. Xu, Q. Li, T. Liu, Detail enhanced multi-source fusion using visual weight map extraction based on multi scale edge preserving decomposition, Opt. Commun. 287 (2013) 45–52.
[21] J. Zhao, Q. Zhou, Y. Chen, H. Feng, Z. Xu, Q. Li, Fusion of visible and infrared images using saliency analysis and detail preserving based image decomposition, Infrared Phys. Technol. 56 (2013) 93–99.
[22] Z. Liu, E. Blasch, Z. Xue, J. Zhao, R. Laganiere, W. Wu, Objective assessment of multiresolution image fusion algorithms for context enhancement in night vision: a comparative study, IEEE Trans. Pattern Anal. Mach. Intell. 34 (2012) 94–109.
[23] G. Bhatnagar, Q.M.J. Wu, Z. Liu, Directive contrast based multimodal medical image fusion in NSCT domain, IEEE Trans. Multimedia 15 (2013) 1014–1024.
[24] G. Qu, D. Zhang, P. Yan, Information measure for performance of image fusion, Electron. Lett. 38 (2002) 313–315.
[25] C. Xydeas, V. Petrovic, Objective image fusion performance measure, Electron. Lett. 36 (2000) 308–309.
[26] C. Yang, J.-Q. Zhang, X.-R. Wang, X. Liu, A novel similarity based quality metric for image fusion, Inform. Fusion 9 (2008) 156–160.
[27] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process. 13 (2004) 600–612.