Optik - International Journal for Light and Electron Optics 201 (2020) 163497
Original research article
Infrared and visible image fusion via hybrid decomposition of NSCT and morphological sequential toggle operator
Zhishe Wang a,⁎, Jiawei Xu b, Xiaolin Jiang a, Xiaomei Yan c

a School of Applied Science, Taiyuan University of Science and Technology, Taiyuan, Shanxi, 030024, China
b School of Computer Science, University of Newcastle, Newcastle upon Tyne, NE4 5TF, UK
c School of Electronical and Information Engineering, Taiyuan University of Science and Technology, Taiyuan, Shanxi, 030024, China
ARTICLE INFO

Keywords: Image fusion; Hybrid decomposition; Image feature; Infrared image

ABSTRACT
Infrared and visible image fusion improves scene description capability and target detection accuracy in modern image processing. In this paper, we propose a novel and effective enhanced image fusion method based on a hybrid decomposition of the non-subsampled contourlet transform (NSCT) and the morphological sequential toggle operator (MSTO). The MSTO is constructed as a multi-scale decomposition on the basis of the top-hat transformation. We employ the MSTO to extract bright and dark image features (BIF/DIF) from the approximation subband of the NSCT decomposition. This hybrid decomposition effectively suppresses the noise and pseudo-edges of the source images. The extracted BIF and DIF are fused with a maximum selection rule based on local energy maps at different scales, and the guided filter is used to enhance the fused BIF and DIF. The enhanced fused BIF and DIF are then integrated into the combined approximation subband, which largely improves the contrast and visual quality of the final fused image. Our experiments demonstrate that the proposed approach is superior to other fusion methods in terms of both visual inspection and objective measures.
1. Introduction

Image fusion technology combines the complementary information of infrared and visible images into a single image, and it is widely used in military and civilian applications such as military reconnaissance, video surveillance, and target tracking [1,2]. Infrared imaging sensors capture salient target information from infrared radiation; however, the acquired infrared images typically suffer from scene information loss and low resolution [3]. In contrast, visible imaging sensors have high spatial resolution and can provide abundant detail and texture information, but when they operate under poor illumination, or when the color and spatial characteristics of the target are similar to those of the background, the acquired visible images are usually of low quality [4,5]. Infrared and visible image fusion exploits the complementary strengths of these two types of sensors: the fused image contains both salient target information and abundant scene information, which enhances scene description for human observation and improves target detection accuracy for subsequent processing.

Multi-scale transform (MST) methods have been extensively applied to infrared and visible image fusion over the last decades. MST approaches decompose the source images into sub-images at different scales, a process that simulates characteristics of the human visual system. Thus, MST can extract useful image features from different scales and has good local characteristics. Several studies have demonstrated that fused images obtained by MST-based methods have good visual quality [6]. Popular
⁎ Corresponding author. E-mail address: [email protected] (Z. Wang).
https://doi.org/10.1016/j.ijleo.2019.163497
Received 27 May 2019; Received in revised form 26 September 2019; Accepted 28 September 2019
0030-4026/ © 2019 Elsevier GmbH. All rights reserved.
MST approaches include the discrete wavelet transform (DWT), the dual-tree complex wavelet transform (DTCWT) [7], the curvelet transform (CVT) [8,9], the non-subsampled contourlet transform (NSCT) [10], and so on. Among these MST approaches, NSCT inherits the multi-scale and multi-directional properties of the contourlet transform and has good local characteristics. Moreover, it is a flexible and fully shift-invariant model, which reduces the pseudo-Gibbs phenomenon introduced by sampling. Therefore, NSCT remains one of the most active tools for infrared and visible image fusion [11–17]. For example, Fu et al. [11] combined NSCT with robust principal component analysis (RPCA) to fuse infrared and visible images, using the sparse matrices of the RPCA decomposition to guide the fusion rule of the NSCT transformation. Meng et al. [17] extracted the object regions of the infrared image to obtain a primary fused image, and the final fused image was obtained by the inverse NSCT. All these methods combine NSCT decomposition with another technique to obtain a better fused image.

In recent years, research on edge-preserving filters (EPFs) has been very active; they have been widely used for image enhancement [18,19], image denoising [20,21], and image fusion [22–29]. Similar to MST, the source images can be decomposed by EPF smoothing into a base layer and several detail layers. The base layer mainly captures the large-scale intensity changes of the source images, while detail information is preserved in the detail layers at different scales. The main merits of EPFs are that they keep spatial structures consistent and decrease halo artifacts. For example, Hu et al. [27] constructed a multi-scale directional bilateral filter (MDBF) to decompose the source images and obtained better fusion results with suitable fusion rules. Li et al. [28] successfully applied guided filtering to infrared and visible image fusion, yielding a fast and efficient fusion method. Building on this work, Jian et al. [29] proposed a novel rolling guidance filter (RGF) based fusion method that controls image smoothing in an iterative manner. These EPF-based fusion methods efficiently reduce halos and preserve scale-aware information.

Through MST or EPF, the infrared and visible images can be decomposed into an approximation subband and detail subbands. In general, the fusion rules are "average" for the approximation subband and "max-absolute" for the detail subbands. However, the conventional "average" rule does not exploit the approximation subband efficiently, so the fused image often suffers from defects such as low contrast, missing details, and artifacts. To overcome this problem, we propose in this paper a hybrid multi-scale decomposition based on NSCT and the morphological sequential toggle operator (MSTO). NSCT decomposition first decomposes the source images into an approximation subband and detail subbands. The approximation subband controls the overall visual quality and contrast of the final fused image, so we adopt the MSTO to extract bright and dark image features (BIF/DIF) from it, which can be viewed as a second decomposition. Because the values of the extracted BIF and DIF are very small, the guided filter (GF) is adopted to enhance these features. The enhanced BIF and DIF are then integrated into the primary combined approximation subband. This method efficiently improves the contrast and visual quality of the final image.
The experiments demonstrate that the proposed method is superior to other state-of-the-art fusion methods.

The remainder of this paper is organized as follows: the advantages and problems of NSCT- and MSTO-based fusion methods are introduced in Section 2. The details of the proposed image fusion method are presented in Section 3. The experimental results are discussed in Section 4. Finally, Section 5 draws the main conclusions.

2. Related work

2.1. Analysis of NSCT-based image fusion

NSCT decomposition is implemented by a non-subsampled pyramid filter bank (NSPFB) and a non-subsampled directional filter bank (NSDFB). The NSPFB is a shift-invariant filtering process that captures discontinuity points, while the NSDFB is a directional filter that links discontinuity points into linear structures. NSCT has two main strengths for image processing. First, it inherits the multi-scale and multi-directional traits of the contourlet transform and preserves good local characteristics in both the frequency and spatial domains. Second, it is shift-invariant, so it can extract useful geometric image features. Because of these advantages, NSCT is widely applied to infrared and visible image fusion.

However, the conventional NSCT-based fusion method still has two drawbacks. The first is the loss of contrast in the fused image. Infrared and visible imaging sensors have different imaging modalities, so the two types of images have different brightness in the same region. When the conventional "average" fusion rule is used to combine the approximation subbands, some energy in these local regions is dropped. This directly reduces the contrast of those regions in the fused image, which then shows a low-quality visual effect as a whole. The second is the difficulty of selecting the NSCT decomposition level. To extract sufficient detail features from the source images, the NSCT decomposition level must be large, such as 4 or 5. Experiments have demonstrated that as the NSCT decomposition level becomes larger, the fused images may exhibit serious artifacts [6], caused by mis-registration and noise in the source images, which degrade the quality of the detail subband fusion. Thus, the selection of the optimal decomposition level must be considered carefully.

In our method, we adopt the morphological sequential toggle operator to extract the BIF and DIF from the approximation subband of the NSCT decomposition, and these BIF and DIF are enhanced by the guided filter. The enhanced BIF and DIF are combined into the primary combined approximation subband, so the final fused image has higher contrast and better edges. Moreover, because a hybrid multi-scale decomposition is used, we do not need a large NSCT decomposition level, and the difficulty of selecting the NSCT decomposition level is thus efficiently resolved.

2.2. Analysis of MSTO-based image fusion

In top-hat transformation theory, the white and black top-hat transforms of a gray image $f(x, y)$ can be defined as:
Optik - International Journal for Light and Electron Optics 201 (2020) 163497
Z. Wang, et al.
$$\mathrm{WTH}(f) = f - f \circ B \tag{1}$$
$$\mathrm{BTH}(f) = f \bullet B - f \tag{2}$$
where $f \circ B$ and $f \bullet B$ denote the opening and closing of the image, respectively, implemented by operating a structuring element $B(u, v)$ on the original image. They are defined as:
$$f \circ B = (f \ominus B) \oplus B \tag{3}$$
$$f \bullet B = (f \oplus B) \ominus B \tag{4}$$
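For readers who want to experiment, a minimal sketch of Eqs. (1)–(4) in Python with OpenCV follows. The image path and the 5×5 elliptical structuring element are illustrative assumptions, not values prescribed by the paper.

```python
import cv2
import numpy as np

# Illustrative input; any single-channel image works.
f = cv2.imread("infrared.png", cv2.IMREAD_GRAYSCALE).astype(np.float64)

# Structuring element B(u, v); shape and size are assumptions.
B = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))

f_open = cv2.morphologyEx(f, cv2.MORPH_OPEN, B)    # Eq. (3): erosion then dilation
f_close = cv2.morphologyEx(f, cv2.MORPH_CLOSE, B)  # Eq. (4): dilation then erosion

wth = f - f_open    # Eq. (1): white top-hat, bright structures smaller than B
bth = f_close - f   # Eq. (2): black top-hat, dark structures smaller than B
```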
When a structuring element sequence $B = [B_0, B_1, \dots, B_s]$ is used in place of the single structuring element $B(u, v)$, we obtain the multi-scale white and black top-hat transforms, which are respectively defined as:
$$\mathrm{WTH}_{B_s}(f) = f - f \circ B_s \tag{5}$$
$$\mathrm{BTH}_{B_s}(f) = f \bullet B_s - f \tag{6}$$
where $B_0$ represents the initial structuring element and $B_s$ is the result of dilating $B_0$ $s$ times. On this basis, we construct two new operators, namely the top-hat based contrast operator and the toggle contrast operator:
$$\mathrm{THCO}_{B_s}(f) = f + \mathrm{WTH}_{B_s}(f) - \mathrm{BTH}_{B_s}(f) \tag{7}$$
$$\mathrm{TCO}_{B_s}(f) = \begin{cases} f \oplus B_s, & \text{if } (f \oplus B_s) - f < f - (f \ominus B_s) \\ f \ominus B_s, & \text{if } (f \oplus B_s) - f > f - (f \ominus B_s) \\ f, & \text{otherwise} \end{cases} \tag{8}$$
Through the sequential combination of the top-hat based contrast operator and the toggle contrast operator, we obtain the morphological sequential toggle operator, defined as:
$$\mathrm{MSTO}_{B_s}(f) = \mathrm{TCO}_{B_s}(\mathrm{THCO}_{B_s}(f)) \tag{9}$$
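Under the same assumptions as the sketch above, Eqs. (5)–(9) can be sketched as follows, building the structuring-element sequence by repeated dilation of $B_0$ and chaining the two contrast operators:

```python
import cv2
import numpy as np

def se_sequence(B0, s):
    """B_s: the initial structuring element B0 dilated s times (Eqs. (5)-(6))."""
    Bs = B0.copy()
    for _ in range(s):
        Bs = cv2.dilate(Bs, B0)
    return (Bs > 0).astype(np.uint8)

def msto(f, Bs):
    """Morphological sequential toggle operator, Eq. (9): TCO applied to THCO."""
    f_open = cv2.morphologyEx(f, cv2.MORPH_OPEN, Bs)
    f_close = cv2.morphologyEx(f, cv2.MORPH_CLOSE, Bs)
    thco = f + (f - f_open) - (f_close - f)        # Eq. (7)
    dil = cv2.dilate(thco, Bs)                     # dilation candidate for Eq. (8)
    ero = cv2.erode(thco, Bs)                      # erosion candidate for Eq. (8)
    return np.where(dil - thco < thco - ero, dil,  # Eq. (8): toggle contrast
           np.where(dil - thco > thco - ero, ero, thco))
```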
MSTO-based fusion methods have been applied to infrared and visible image fusion, extracting target regions while preserving details well [30–32]. Although this approach achieves high performance, it still has two drawbacks. First, the bright and dark image features (BIF/DIF) extracted directly from the source images contain a large amount of noise and pseudo-edges, which causes serious artifacts in the fused image. Second, the pixel values of the extracted BIF and DIF are very small, which limits the contrast improvement of the fused image.

In the proposed method, NSCT decomposition is used to extract the useful detail features of the infrared and visible images. The approximation subband can be viewed as an approximation of the source image, with smooth gray-value changes overall, while the detail subbands represent the detail and texture information. We therefore adopt the MSTO to extract the BIF and DIF from the approximation subband. This extracts salient structures and target regions and preserves details well, while effectively suppressing noise and pseudo-edges. We then use the GF to enhance the extracted BIF and DIF, making bright features brighter and dark features darker, so that the contrast of the fused image is further improved. We experimentally demonstrate that the proposed fusion method is effective and feasible; detailed results and comparisons are given in Section 4.

3. Proposed fusion method

The framework of the proposed image fusion method is shown in Fig. 1. The method mainly consists of the following steps: (1) decompose the infrared and visible images by NSCT; (2) extract the BIF and DIF by the MSTO; (3) fuse the BIF and DIF by a maximum selection rule based on local energy maps; (4) enhance the fused BIF and DIF by the guided filter; (5) reconstruct the fused image by the inverse NSCT.

3.1. Extraction of BIF and DIF

The infrared image $f(x, y)$ and visible image $g(x, y)$ are decomposed by NSCT to obtain the approximation subbands $\{C^f_{j0}, C^g_{j0}\}$ and the detail subbands $\{C^f_{j,l}, C^g_{j,l}\}$. We extract the BIF and DIF from the approximation subband by comparing the gray values of $\mathrm{MSTO}_{B_s}$ and $C^f_{j0}$ pixel by pixel. For the infrared image, when the gray value of $\mathrm{MSTO}_{B_s}$ is greater than that of $C^f_{j0}$, we take the difference between $\mathrm{MSTO}_{B_s}$ and $C^f_{j0}$; otherwise, we take 0. This yields the BIF extracted from the approximation subband of the infrared image; the DIF is extracted in the opposite way. The BIF and DIF are respectively defined as:
$$\mathrm{BIF}^f_{B_s} = \max\{\mathrm{MSTO}_{B_s} - C^f_{j0},\ 0\} \tag{10}$$
$$\mathrm{DIF}^f_{B_s} = \max\{C^f_{j0} - \mathrm{MSTO}_{B_s},\ 0\} \tag{11}$$
Fig. 1. Overall framework of the proposed fusion method.
Similarly, for the approximation subband of the visible image, the BIF and DIF can be respectively defined as:
$$\mathrm{BIF}^g_{B_s} = \max\{\mathrm{MSTO}_{B_s} - C^g_{j0},\ 0\} \tag{12}$$
$$\mathrm{DIF}^g_{B_s} = \max\{C^g_{j0} - \mathrm{MSTO}_{B_s},\ 0\} \tag{13}$$
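Given the `msto` sketch above, the per-pixel feature extraction of Eqs. (10)–(13) reduces to a pair of clipped differences; here `c0` stands for an approximation subband $C_{j0}$ of either source image:

```python
import numpy as np

def extract_bif_dif(c0, Bs):
    """Eqs. (10)-(13): bright/dark features of an approximation subband c0."""
    m = msto(c0, Bs)                 # msto() from the sketch above
    bif = np.maximum(m - c0, 0.0)    # bright image features
    dif = np.maximum(c0 - m, 0.0)    # dark image features
    return bif, dif
```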
3.2. Fusion of BIF and DIF

Once the BIF and DIF have been extracted from the infrared and visible images, we need to fuse these features. The fusion process consists of two steps: fusion at the same scale and fusion across different scales. Both steps use a maximum selection rule based on a local energy map. For decomposition scale $s$, the local energy maps of the BIF for the infrared and visible images are calculated as:
$$EB^f_{j0,B_s}(x, y) = \sum_m \sum_n \mathrm{BIF}^f_{j0,B_s}(x + m,\, y + n)^2 \tag{14}$$
$$EB^g_{j0,B_s}(x, y) = \sum_m \sum_n \mathrm{BIF}^g_{j0,B_s}(x + m,\, y + n)^2 \tag{15}$$
At each scale $s$, the fused BIF of the infrared and visible images is obtained as follows:
$$\mathrm{FBIF}_{j0,B_s} = \begin{cases} \mathrm{BIF}^f_{j0,B_s}, & \text{if } EB^f_{j0,B_s} \ge EB^g_{j0,B_s} \\ \mathrm{BIF}^g_{j0,B_s}, & \text{otherwise} \end{cases} \tag{16}$$
Across the different scales $s$, the fused BIF is obtained by the maximum selection operation based on the local energy map:
$$\mathrm{FBIF}_{j0} = \max_{s}(\mathrm{FBIF}_{j0,B_s}), \quad s = 1, 2, \dots, n \tag{17}$$
In the same way, the fused DIF is obtained as follows:
$$ED^f_{j0,B_s}(x, y) = \sum_m \sum_n \mathrm{DIF}^f_{j0,B_s}(x + m,\, y + n)^2 \tag{18}$$
$$ED^g_{j0,B_s}(x, y) = \sum_m \sum_n \mathrm{DIF}^g_{j0,B_s}(x + m,\, y + n)^2 \tag{19}$$
$$\mathrm{FDIF}_{j0,B_s} = \begin{cases} \mathrm{DIF}^f_{j0,B_s}, & \text{if } ED^f_{j0,B_s} \ge ED^g_{j0,B_s} \\ \mathrm{DIF}^g_{j0,B_s}, & \text{otherwise} \end{cases} \tag{20}$$
$$\mathrm{FDIF}_{j0} = \max_{s}(\mathrm{FDIF}_{j0,B_s}), \quad s = 1, 2, \dots, n \tag{21}$$
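Eqs. (14)–(21) can be sketched as follows; the local window size is an assumption, since the paper does not state the extent of $(m, n)$:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_energy(x, win=3):
    """Eqs. (14)/(15)/(18)/(19): windowed sum of squares (win is an assumption)."""
    return uniform_filter(x ** 2, size=win) * (win * win)

def fuse_features(feat_ir, feat_vi):
    """feat_ir/feat_vi: lists of BIF (or DIF) arrays over scales s = 1..n."""
    fused = []
    for fi, fv in zip(feat_ir, feat_vi):
        e_ir, e_vi = local_energy(fi), local_energy(fv)
        fused.append(np.where(e_ir >= e_vi, fi, fv))  # Eqs. (16)/(20)
    return np.maximum.reduce(fused)                   # Eqs. (17)/(21): max over s
```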
3.3. Enhancement of fused BIF and DIF

The gray values of the fused BIF and DIF from the approximation subband are relatively low, owing to inherent defects of the source images and the multi-scale decomposition. If the fused BIF and DIF were directly integrated into the primary fused approximation subband, the contrast of the final fused image could not be efficiently improved. Therefore, the fused BIF and DIF need to be enhanced. In this paper, we adopt the guided filter for this purpose. The guided filter [28,33] has been successfully used in image enhancement, image fusion, image matting, and so on. For an input image $I_{in}$, the guided filter requires a guidance image $I_c$. For simplicity, the filtered output image $I_{out}$ can be expressed as:

$$I_{out} = G(I_{in}, I_c, r, \varepsilon) \tag{22}$$
The parameter $r$ represents the filter size, and $\varepsilon$ determines the blur degree of the guided filter. When the guided filter is used for image enhancement, the guidance image can be replaced by the input image [28]. Thus, using the guided filter, the enhanced fused BIF and DIF can be respectively expressed as:

$$\mathrm{EFBIF}_{j0} = \lambda \cdot (\mathrm{FBIF}_{j0} - I_{out}^{\mathrm{FBIF}_{j0}}) + I_{out}^{\mathrm{FBIF}_{j0}} \tag{23}$$
$$\mathrm{EFDIF}_{j0} = \lambda \cdot (\mathrm{FDIF}_{j0} - I_{out}^{\mathrm{FDIF}_{j0}}) + I_{out}^{\mathrm{FDIF}_{j0}} \tag{24}$$

where $\mathrm{EFBIF}_{j0}$ and $\mathrm{EFDIF}_{j0}$ are the enhanced fused BIF and DIF, respectively, $I_{out}^{\mathrm{FBIF}_{j0}}$ and $I_{out}^{\mathrm{FDIF}_{j0}}$ are the guided-filter outputs of $\mathrm{FBIF}_{j0}$ and $\mathrm{FDIF}_{j0}$, and $\lambda$ represents the enhancement coefficient.
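Eqs. (22)–(24) can be sketched with the guided filter from opencv-contrib-python (an assumption about tooling; any guided-filter implementation works), using the feature itself as guidance as in [28]:

```python
import cv2  # requires opencv-contrib-python for cv2.ximgproc
import numpy as np

def enhance(feature, r=16, eps=0.01, lam=3.0):
    """Eqs. (23)/(24): boost a fused feature map around its filtered base.

    eps assumes intensities roughly in [0, 1]; rescale it for 8-bit data.
    """
    f32 = feature.astype(np.float32)
    base = cv2.ximgproc.guidedFilter(guide=f32, src=f32, radius=r, eps=eps)
    return lam * (f32 - base) + base
```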
3.4. Setting of the fusion rule

In this paper, the averaging rule is adopted to combine the approximation subbands of the infrared and visible images, and the enhanced fused BIF and DIF are directly integrated into the combined approximation subband. The primary fused approximation subband can then be expressed as:
$$C^F_{j0} = (C^f_{j0} + C^g_{j0})/2 + \alpha \cdot \mathrm{EFBIF}_{j0} - \beta \cdot \mathrm{EFDIF}_{j0} \tag{25}$$
where the parameters α and β represent the fusion weight coefficients of the enhanced fused BIF and DIF, respectively, with α, β ∈ [0, 1]. The detail subbands are combined with the maximum-absolute rule. Finally, the fused image is reconstructed with the inverse NSCT.

4. Experimental results and analysis

To evaluate the performance of the proposed method, we test five pairs of infrared and visible images selected from https://figshare.com/articles/TNO_Image_Fusion_Dataset/1008029 and http://www.imagefusion.org/. These image pairs are denoted "Quad", "UNcamp", "Kaptein", "Trees" and "Duck". Five other approaches are thoroughly compared in our experiments: fusion methods based on the non-subsampled contourlet transform (NSCT-), the morphological sequential toggle operator (MSTO-) [30], gradient transfer fusion (GTF-) [26], the guided filter (GFF-) [28], and DenseFuse (DF-) [34]. Because the proposed hybrid multi-scale decomposition does not require a large scale, the decomposition level of NSCT is set to {2, 2, 3}. A two-scale decomposition with a 'disk' structuring element is used for the MSTO. The guided filter parameters r and ε default to 16 and 0.01, the parameter λ is set to 3, and the fusion weight parameters α and β are both 0.5.
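To make the data flow concrete before turning to the results, here is a hedged end-to-end sketch of steps (1)–(5) using the settings above. `nsct_decompose` and `nsct_reconstruct` are hypothetical stand-ins, since no standard Python NSCT implementation exists (one could, e.g., port the MATLAB NSCT toolbox); `se_sequence`, `extract_bif_dif`, `fuse_features` and `enhance` are the sketches from Section 3.

```python
import cv2
import numpy as np

def fuse(ir, vi, levels=(2, 2, 3), n_scales=2):
    # Step 1: NSCT decomposition (hypothetical API standing in for a real NSCT).
    c0_ir, det_ir = nsct_decompose(ir, levels)
    c0_vi, det_vi = nsct_decompose(vi, levels)
    # Step 2: BIF/DIF extraction at two MSTO scales; a 'disk'-like SE (size assumed).
    B0 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    bif_ir, dif_ir, bif_vi, dif_vi = [], [], [], []
    for s in range(1, n_scales + 1):
        Bs = se_sequence(B0, s)
        bi, di = extract_bif_dif(c0_ir, Bs)
        bv, dv = extract_bif_dif(c0_vi, Bs)
        bif_ir.append(bi); dif_ir.append(di)
        bif_vi.append(bv); dif_vi.append(dv)
    # Steps 3-4: energy-based fusion, then guided-filter enhancement.
    efbif = enhance(fuse_features(bif_ir, bif_vi))
    efdif = enhance(fuse_features(dif_ir, dif_vi))
    # Eq. (25) with alpha = beta = 0.5, then max-absolute detail fusion.
    c0_f = (c0_ir + c0_vi) / 2.0 + 0.5 * efbif - 0.5 * efdif
    det_f = [np.where(np.abs(a) >= np.abs(b), a, b) for a, b in zip(det_ir, det_vi)]
    # Step 5: inverse NSCT (hypothetical).
    return nsct_reconstruct(c0_f, det_f, levels)
```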
4.1. Experimental comparison of BIF and DIF

The extracted and enhanced BIF and DIF are shown in Fig. 2. Fig. 2(a) and (b) are the results extracted directly from the source images. The extracted fused BIF and DIF include the major features, such as the cars, traffic lights and pedestrians, but they also contain a large amount of noise and pseudo-edges. If these features were directly integrated into the source images, they would reduce the quality of the fused image. Fig. 2(c) and (d) show the results extracted from the approximation subband. Because the approximation subband obtained by NSCT decomposition is similar to the source image and its gray values change smoothly, the extracted results not only include the major features but also preserve unambiguous edges and contain very little noise. This indicates that extracting the BIF and DIF from the approximation subband is feasible and effective. However, compared with the source image, the gray values of the BIF and DIF extracted from the approximation subband are smaller; if these features were directly integrated into the combined approximation subband, the contrast of the fused image could not be greatly improved. Therefore, the extracted fused BIF and DIF need to be enhanced.

Fig. 2. Extracted and enhanced results of BIF and DIF. (a) and (b) are extracted results from the source infrared image, (c) and (d) are extracted results from the low-frequency (approximation) images, (e) and (f) are enhanced results of (c) and (d).
The enhanced results of the fused BIF and DIF are shown in Fig. 2(e) and (f). When these enhanced features are integrated into the combined approximation subband, the contrast and visual quality of the final fused image are effectively improved, which benefits both human visual perception and subsequent image processing.

4.2. Subjective performance evaluation

To compare our method with others, the experimental results for the different image sets and fusion methods are shown in Figs. 3–7. In these figures, (a) and (b) show the source infrared and visible images; (c), (d), (e), (f) and (g) are the fused images produced by the NSCT-, MSTO- [30], GTF- [26], GFF- [28] and DF- [34] based methods, respectively; and (h) is the result of our proposed method.

Fig. 3 shows the experimental results on the "Quad" images. In the infrared image, the traffic lights, pedestrians and cars are clear but the advertising board is blurred; by contrast, in the visible image the traffic lights, pedestrians and cars are dim but the advertising board is clear. All the fused images contain the main information and important characteristics of the source images, but they differ in brightness, contrast and details. Fig. 3(c) is obtained by the NSCT-based fusion method. The fused image has few artifacts and good edges, but its contrast is poor: although NSCT has the merits of multi-scale transformation and shift-invariance, the conventional "average-max" fusion rule leads to poor contrast, so the traffic lights, pedestrians, cars and advertising board appear relatively dim. The fused image obtained by MSTO (Fig. 3(d)) has serious artifacts. This method extracts the BIF and DIF directly from the source images, so the fused image preserves the brightness of the traffic lights, pedestrians and cars well but contains a large amount of noise and pseudo-edges. The fused image obtained by GTF is shown in Fig. 3(e). Through gradient transfer, the fused image contains the important information of the two source images.
Fig. 3. Fusion results of different fusion methods for "Quad" images. (a) infrared, (b) visible, (c) NSCT-, (d) MSTO- [30], (e) GTF- [26], (f) GFF- [28], (g) DF- [34], (h) the proposed method.
Fig. 4. Fusion results of different fusion methods for "UNcamp" images. (a) infrared, (b) visible, (c) NSCT-, (d) MSTO- [30], (e) GTF- [26], (f) GFF- [28], (g) DF- [34], (h) the proposed method.
Fig. 5. Fusion results of different fusion methods for "Kaptein" images. (a) infrared, (b) visible, (c) NSCT-, (d) MSTO- [30], (e) GTF- [26], (f) GFF- [28], (g) DF- [34], (h) the proposed method.
Fig. 6. Fusion results of different fusion methods for "Trees" images. (a) infrared, (b) visible, (c) NSCT-, (d) MSTO- [30], (e) GTF- [26], (f) GFF- [28], (g) DF- [34], (h) the proposed method.
However, the edges and structures of the targets are deformed in the fused image, which also shows obvious artifacts and reduced target brightness: the traffic lights, pedestrians, cars and advertising board are blurred. In Fig. 3(f), the fused image obtained by the GFF-based method preserves the brightness of one pedestrian well, but the brightness of the traffic lights, the other two pedestrians and the cars is poor, and the loss of detail information and obvious halos persist. Fig. 3(g) is the fused image obtained by the DF-based method, a deep network built from a CNN with dense blocks. This method preserves the main information of both source images well, but too much object information from the infrared image is lost; for example, the
Fig. 7. Fusion results of different fusion methods for "Duck" images. (a) infrared, (b) visible, (c) NSCT-, (d) MSTO- [30], (e) GTF- [26], (f) GFF- [28], (g) DF- [34], (h) the proposed method.
contrast of the pedestrians is low. The fusion result of our method is shown in Fig. 3(h). The fused image preserves the brightness of the traffic lights, pedestrians and cars well, and its edge details are clear. This is because the proposed method uses the MSTO to extract the BIF and DIF from the approximation subband and enhances those features with the guided filter. Therefore, the result of our method has better edges and higher contrast than that of any other method.

Fig. 4 shows the experimental results on the "UNcamp" images. In the visible image, the trees and road are clear but the pedestrian cannot be seen; by contrast, in the infrared image the pedestrian is clear but the trees and road are blurred. Fig. 4(c) is the fused result produced by NSCT. The fused image preserves the pedestrian of the infrared image and the trees and road of the visible image, but the contrast of the pedestrian is poor. Fig. 4(d) is the result obtained by MSTO, which contains a large amount of noise and pseudo-edges: the fused image preserves the brightness of the pedestrian well but shows serious artifacts at the same time, such as blurred trees and road. The fused result produced by GTF (Fig. 4(e)) loses some detail features; it loses too much background information and shows serious distortion. Fig. 4(f) is the fused result produced by GFF, which is better than the MSTO- and GTF-based methods, but it reduces the brightness of the pedestrian and has poor overall contrast; in particular, the detail of the road is missing. The fused result produced by DF (Fig. 4(g)) has obvious artifacts and reduces the brightness of the pedestrian. Fig. 4(h) is the result obtained by our method. Compared with the other methods, the fused image preserves the brightness of the pedestrian well, the road and trees are clear, and the contrast is better in terms of visual inspection.

Figs. 5–7 show the experimental results on the "Kaptein", "Trees" and "Duck" images, respectively. The fused images obtained by the different fusion methods all contain the major characteristics of the source images, but their contrast and details differ, consistent with the previous results. Our fusion result not only preserves the brightness of the targets well but also retains better detail and scene information. In summary, the proposed method achieves a better subjective fusion effect than the other methods.
4.3. Objective performance evaluation

We adopt Q0, Qw, Qe and Qabf as fusion quality metrics. The metrics Q0 and Qw reflect the degree of image distortion and the degree of salient information transfer, respectively, while Qe and Qabf reflect the edge information and the visual information quality of the fused image. For all four metrics, a higher value indicates a fused image of better quality; detailed calculation formulas and theoretical analysis can be found in [35–37].

The quantitative assessments of the different fusion methods on the five image sets are shown in Table 1. Our method achieves the highest Q0, Qw and Qe for the "Quad", "UNcamp", "Kaptein" and "Trees" images, and the highest value of all four metrics for the "Duck" images. The highest Q0 demonstrates that our fused images have the least structural and contrast distortion, confirming that our method effectively suppresses noise and pseudo-edges in the fused image. The highest Qw indicates that the proposed method extracts salient information from the source images well. In addition, the highest Qe, together with a third-place ranking in Qabf, shows that the fused images of our method preserve edges and detailed information well compared with the other methods. This further illustrates that better fusion quality is obtained by integrating the enhanced fused BIF and DIF into the primary combined approximation subband. Table 2 reports the average quantitative assessment of the different fusion methods. Our method has the highest average values of Q0, Qw and Qe and ranks second in Qabf, which further illustrates that the quality of its fused images is better than that of the other methods.

To further evaluate our fusion method, two image sequences, Nato_camp and Dune, are adopted. Fig. 8 illustrates the quantitative comparisons in terms of the four metrics Q0, Qw, Qe and Qabf. The proposed method clearly achieves the best Q0, Qw and Qe for both image sequences (see the red circles). In Qabf, it obtains relatively good results, especially for the Nato_camp sequence, where it ranks third.
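For reference, the pairwise index underlying Q0, the universal image quality index of [35], can be sketched as below. The 8×8 sliding window follows the common choice in [35] and is an assumption here, as is the handling of flat (zero-denominator) windows:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def q0_index(x, y, win=8):
    """Universal image quality index [35] between two images, averaged over windows."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mx, my = uniform_filter(x, win), uniform_filter(y, win)
    sxx = uniform_filter(x * x, win) - mx * mx   # local variance of x
    syy = uniform_filter(y * y, win) - my * my   # local variance of y
    sxy = uniform_filter(x * y, win) - mx * my   # local covariance
    num = 4.0 * sxy * mx * my
    den = (sxx + syy) * (mx * mx + my * my)
    # Degenerate flat windows are mapped to 1.0 (a simplification).
    q = np.where(den > 1e-12, num / np.maximum(den, 1e-12), 1.0)
    return float(q.mean())
```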
Table 1
Quantitative assessment of different fusion methods.

Images    Method     Q0       Qw       Qe       Qabf
Quad      NSCT-      0.5262   0.7496   0.7320   0.4946
          MSTO-      0.5018   0.6842   0.6682   0.3010
          GTF-       0.4678   0.5807   0.5672   0.3741
          GFF-       0.4798   0.7684   0.7600   0.5394
          DF-        0.5244   0.6696   0.6540   0.3594
          Proposed   0.5344   0.7788   0.7606   0.5167
UNcamp    NSCT-      0.5944   0.7761   0.7579   0.4800
          MSTO-      0.5891   0.7210   0.7041   0.3315
          GTF-       0.4599   0.7126   0.6959   0.4117
          GFF-       0.5661   0.7389   0.7216   0.5161
          DF-        0.5949   0.7226   0.7057   0.3484
          Proposed   0.6054   0.7915   0.7730   0.4709
Kaptein   NSCT-      0.5785   0.7906   0.7721   0.5003
          MSTO-      0.5762   0.7372   0.7200   0.3430
          GTF-       0.4354   0.6539   0.6386   0.3370
          GFF-       0.5500   0.7883   0.7699   0.5090
          DF-        0.5767   0.7138   0.6971   0.3384
          Proposed   0.5835   0.8274   0.8081   0.4928
Trees     NSCT-      0.6910   0.8925   0.8717   0.4998
          MSTO-      4.2754   0.6607   0.8416   0.5089
          GTF-       0.5476   0.7163   0.6996   0.4451
          GFF-       0.6563   0.8345   0.8150   0.5483
          DF-        0.6923   0.8707   0.8503   0.3907
          Proposed   0.6912   0.8984   0.8774   0.4937
Duck      NSCT-      0.6172   0.8949   0.8739   0.7138
          MSTO-      0.5616   0.7747   0.7566   0.4419
          GTF-       0.4195   0.7612   0.7434   0.6540
          GFF-       0.6075   0.9277   0.8860   0.7193
          DF-        0.6145   0.7588   0.7411   0.4663
          Proposed   0.6189   0.9092   0.8879   0.7235
Table 2
Average assessment for the fused images of different methods.

Method     Q0       Qw       Qe       Qabf
NSCT-      0.6015   0.8207   0.8015   0.5377
MSTO-      1.3008   0.7156   0.7381   0.3853
GTF-       0.4660   0.6849   0.6689   0.4444
GFF-       0.5719   0.8116   0.7905   0.5664
DF-        0.6006   0.7471   0.7296   0.3806
Proposed   0.6067   0.8411   0.8214   0.5395
In addition, for both image sequences, the NSCT-based method (blue circles) performs better than several of the other methods, which shows that NSCT is excellent at multi-scale decomposition and can efficiently extract useful information from the original images. From the above experiments and analysis, we conclude that the proposed method retains the important information of the original images and achieves a better objective fusion effect than the other state-of-the-art methods.
5. Conclusion

In this paper, we developed a novel and efficient infrared and visible image fusion method based on a hybrid decomposition of NSCT and the MSTO. The proposed method employs the MSTO to extract the BIF and DIF from the approximation subband obtained by NSCT decomposition. The extracted results not only include the major features but also preserve unambiguous edges and contain very little noise. The GF is adopted to enhance the fused BIF and DIF, and the enhanced features are integrated into the primary combined approximation subband, which largely improves the visual quality of the fused image. Five infrared and visible image sets, two image sequences and four objective metrics were used in the experiments. The results prove that our fusion method is superior in terms of both visual inspection and objective measures. In the future, we will study the optimization of the fusion weight coefficients and apply our method to other multi-modality images.
Fig. 8. Quantitative comparisons of the four metrics Q0, Qw, Qe and Qabf on the Dune (left column) and Nato_camp (right column) sequences. For all four metrics, larger values indicate better performance.
Acknowledgements

This work was supported by the Fund for Shanxi "1331 Project" Key Innovative Research Team (1331KIRT), the Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi under Grant 2017162, and the Startup Foundation for Doctors of Taiyuan University of Science and Technology under Grant 20162004.

References

[1] J. Ma, Y. Ma, C. Li, Infrared and visible image fusion methods and applications: a survey, Inf. Fusion 45 (2019) 153–178.
[2] S. Li, X. Kang, L. Fang, et al., Pixel-level image fusion: a survey of the state of the art, Inf. Fusion 33 (2017) 100–112.
[3] Z. Huang, L. Chen, et al., Robust contact-point detection from pantograph-catenary infrared images by employing horizontal-vertical enhancement operator, Infrared Phys. Technol. 101 (2019) 146–155.
[4] Z. Huang, H. Fang, et al., Optical remote sensing image enhancement with weak structure preservation via spatially adaptive gamma correction, Infrared Phys. Technol. 94 (2018) 38–47.
[5] Z. Huang, Z. Zhang, et al., Unidirectional variation and deep CNN denoiser priors for simultaneously destriping and denoising optical remote sensing images, Int. J. Remote Sens. 15 (2019) 5737–5748.
[6] S. Li, B. Yang, J. Hu, Performance comparison of different multi-resolution transforms for image fusion, Inf. Fusion 12 (2) (2011) 74–84.
[7] J.J. Lewis, R.J. O'Callaghan, S.G. Nikolov, et al., Pixel- and region-based image fusion with complex wavelets, Inf. Fusion 8 (2007) 119–130.
[8] E.J. Candès, D.L. Donoho, Curvelets and curvilinear integrals, J. Approximation Theory 113 (1) (2001) 59–90.
[9] F. Nencini, A. Garzelli, S. Baronti, et al., Remote sensing image fusion using the curvelet transform, Inf. Fusion 8 (2007) 143–156.
[10] A.L. da Cunha, J. Zhou, M.N. Do, The nonsubsampled contourlet transform: theory, design, and applications, IEEE Trans. Image Process. 15 (10) (2006) 3089–3101.
[11] Z. Fu, X. Wang, J. Xu, et al., Infrared and visible images fusion based on RPCA and NSCT, Infrared Phys. Technol. 77 (2016) 114–123.
[12] J. Adu, J. Gan, Y. Wang, et al., Image fusion based on nonsubsampled contourlet transform for infrared and visible light image, Infrared Phys. Technol. 61 (2013) 94–100.
[13] Y. Chen, J. Xiong, H. Liu, et al., Fusion method of infrared and visible images based on neighborhood characteristic and regionalization in NSCT domain, Opt. Int. J. Light Electron Opt. 125 (17) (2014) 4980–4984.
[14] C. Zhao, Y. Guo, Y. Wang, A fast fusion scheme for infrared and visible light images in NSCT domain, Infrared Phys. Technol. 72 (2015) 266–275.
[15] Q. Zhang, X. Maldague, An adaptive fusion approach for infrared and visible images based on NSCT and compressed sensing, Infrared Phys. Technol. 74 (2016) 11–20.
[16] T. Xiang, L. Yan, R. Gao, A fusion algorithm for infrared and visible images based on adaptive dual-channel unit-linking PCNN in NSCT domain, Infrared Phys. Technol. 69 (2015) 53–61.
[17] F. Meng, M. Song, B. Guo, et al., Image fusion based on object region detection and non-subsampled contourlet transform, Comput. Electr. Eng. 62 (7) (2017) 375–383.
[18] Z. Huang, Z. Zhang, et al., Progressive dual-domain filter for enhancing and denoising optical remote sensing images, IEEE Geosci. Remote Sens. Lett. 15 (2018) 759–763.
[19] Z. Huang, L. Huang, et al., Framelet regularization for uneven intensity correction of color images with illumination and reflectance estimation, Neurocomputing 314 (2018) 154–168.
[20] Z. Huang, Z. Zhang, et al., Spatially adaptive denoising for X-ray angiogram image, Biomed. Signal Process. Control 40 (2018) 131–139.
[21] Z. Huang, Q. Li, et al., Iterative weighted sparse representation for X-ray cardiovascular angiogram image denoising over learned dictionary, IET Image Process. 12 (2) (2018) 254–261.
[22] X. Yan, H. Qin, J. Li, et al., Infrared and visible image fusion using multiscale directional nonlocal means filter, Appl. Opt. 54 (13) (2015) 4299–4308.
[23] J. Ma, Z. Zhou, B. Wang, et al., Infrared and visible image fusion based on visual saliency map and weighted least square optimization, Infrared Phys. Technol. 82 (2017) 8–17.
[24] W. Gan, X. Wu, W. Wu, et al., Infrared and visible image fusion with the use of multi-scale edge-preserving decomposition and guided image filter, Infrared Phys. Technol. 72 (2015) 37–51.
[25] J. Zhao, G. Cui, X. Gong, et al., Fusion of visible and infrared images using global entropy and gradient constrained regularization, Infrared Phys. Technol. 81 (2017) 201–209.
[26] J. Ma, C. Chen, C. Li, et al., Infrared and visible image fusion via gradient transfer and total variation minimization, Inf. Fusion 31 (9) (2016) 100–109.
[27] J. Hu, S. Li, The multiscale directional bilateral filter and its application to multisensor image fusion, Inf. Fusion 13 (3) (2012) 196–206.
[28] S. Li, X. Kang, J. Hu, Image fusion with guided filtering, IEEE Trans. Image Process. 22 (7) (2013) 2864–2875.
[29] L. Jian, X. Yang, Z. Zhou, et al., Multi-scale image fusion through rolling guidance filter, Future Gener. Comput. Syst. 83 (6) (2018) 210–325.
[30] X. Bai, X. Chen, F. Zhou, et al., Multiscale top-hat selection transform based infrared and visual image fusion with emphasis on extracting regions of interest, Infrared Phys. Technol. 60 (2013) 81–93.
[31] X. Bai, Infrared and visual image fusion through feature extraction by morphological sequential toggle operator, Infrared Phys. Technol. 71 (2015) 77–86.
[32] Z. Wang, F. Yang, Z. Peng, et al., Multi-sensor image enhanced fusion algorithm based on NSST and top-hat transformation, Opt. Int. J. Light Electron Opt. 126 (23) (2015) 4184–4190.
[33] K. He, J. Sun, X. Tang, Guided image filtering, IEEE Trans. Pattern Anal. Mach. Intell. 35 (2013) 1397–1409.
[34] H. Li, X.J. Wu, DenseFuse: a fusion approach to infrared and visible images, IEEE Trans. Image Process. 28 (5) (2019) 2614–2623.
[35] Z. Wang, A. Bovik, A universal image quality index, IEEE Signal Process. Lett. 9 (3) (2002) 81–84.
[36] C.S. Xydeas, V. Petrović, Objective image fusion performance measure, Electron. Lett. 36 (4) (2000) 308–309.
[37] G. Piella, H. Heijmans, A new quality metric for image fusion, Proc. IEEE Int. Conf. Image Process., Barcelona, Spain, 2003, pp. 173–176.