Fusion of infrared-visible images using improved multi-scale top-hat transform and suitable fusion rules

Pan Zhu, Xiaoqing Ma, Zhanhua Huang

Key Laboratory of Opto-electronic Information Technology, Ministry of Education, Tianjin University, Tianjin 300072, China
Abstract: Integration of infrared and visible images is an active and important topic in image understanding and interpretation. In this paper, a new fusion method is proposed based on the improved multi-scale center-surround top-hat transform, which can effectively extract the feature information and detail information of the source images. Firstly, the multi-scale bright (dark) feature regions of the infrared and visible images are extracted at different scale levels by the improved multi-scale center-surround top-hat transform. Secondly, the feature regions at the same scale in both images are combined by a multi-judgment contrast fusion rule, and the final feature images are obtained by adding all scales of feature images together. Then, a base image is calculated by applying a Gaussian fuzzy logic combination rule to the two smoothed source images. Finally, the fusion image is obtained by importing the extracted bright and dark feature images into the base image with a suitable strategy. Both objective assessment and subjective inspection of the experimental results indicate that the proposed method is superior to currently popular MST-based methods and morphology-based methods in the field of infrared-visible image fusion.

Keywords: Infrared and visible image fusion; Multi-scale top-hat transform; Bright and dark feature extraction; Fuzzy logic; Multi-judgment contrast
1. Introduction
Infrared and visible images acquired by different sensors convey different object information owing to their different imaging properties. For instance, infrared images are capable of showing important and hidden targets because infrared sensors are sensitive to the unbalanced distribution of temperature in an observed scene; their Achilles' heel, however, is low contrast and dim details, which are not always consistent with human visual characteristics. On the contrary, visible images, containing plentiful texture details and high spatial resolution, can clearly reveal the scene information, but they cannot detect obscured and disguised targets, and visible image quality degrades dramatically in low illumination [1, 2]. In other words, infrared and visible images separately reveal specific information about the observed scene and complement each other well. However, neither can provide sufficient information about the scene on its own, which influences subsequent information processing and human observation. Therefore, it is necessary to combine the complementary information of infrared and visible images by using an image fusion algorithm [1]. With the development of information fusion techniques, various fusion methods have been proposed in the last decade. These methods can be classified into three categories: substitution methods [3-6], neural network methods [7] and multi-scale transform methods [10-22]. The substitution methods, including independent component analysis (ICA) and principal component analysis (PCA), can extract the main information and have been successfully applied in image fusion [3, 4]. However, these algorithms easily lose some details in the process of extracting the main information of the source images. In addition, as a promising substitution fusion method, sparse representation (SR) has experienced rapid development, but this method needs to train a large
dictionary for restoring image information, so it is time-consuming [5, 6]. SR also easily loses some high-frequency information in the process of information substitution, which usually makes the fusion image over-smoothed [6]. Neural network methods such as the pulse coupled neural network (PCNN) [7] have attracted the attention of many scholars and have been widely developed in the image fusion domain. Although most PCNN-based methods outperform other current conventional ones, a number of inherent drawbacks still exist. PCNN has so many parameters and such a complex structure that their optimal estimation is a major limitation for practical application. Besides, multi-scale transform (MST) methods are the most popular fusion tools used in various image scenarios owing to their simplicity and effectiveness in implementation. Typical MST-based fusion methods include the Laplacian Pyramid (LP) [8], Gradient Pyramid [9], Discrete Wavelet Transform (DWT) [10], Curvelet Transform (CVT) [11], Contourlet Transform (CT) [12], Non-Subsampled Contourlet Transform (NSCT) [13], Non-Subsampled Shearlet Transform (NSST) [1, 14-15], Bidimensional Empirical Mode Decomposition (BEMD) [16] and so on. These methods are capable of fusing image information at different scale levels through artificial multi-scale decomposition filters with various feasible fusion rules. These fusion rules are mainly used to capture salient features such as edges and lines from the source images to achieve a satisfactory fusion effect. However, it is worth noticing that the predesigned multi-scale filters inevitably have some deficiencies in analyzing the frequency or spatial characteristics of the source images [17]. These deficiencies tend to influence the performance of MST-based fusion methods. For example, the LP-based method easily leads to blocking phenomena in the fusion results. DWT- and CVT-based methods often cause artifacts and Gibbs effects, because they lack the ability to capture spatial detail information. The fusion result based on CT is also prone to Gibbs phenomena due to the lack of shift-invariance. Although NSCT- and NSST-based methods have achieved great success and perform almost perfectly in image fusion, their high computational complexity remains a limitation, especially for real-time applications. Moreover, different pyramid filters selected in the NSCT-based transform domain make the performance of the fusion images vary considerably. Researchers have tried to improve computation speed by combining NSST and fast non-negative matrix factorization [14]. However, some disadvantages such as halos and over-smoothing are newly introduced by using non-negative matrix factorization. In general, conventional MST-based fusion methods may not always maintain the details and important features of the source images well, since the different filter structures and transform patterns selected by classical MST-based methods can make the fusion result change significantly with the properties of the source images. This greatly limits their applications [17]. Although BEMD, without a priori filtering, can adaptively decompose the source image and obtain good image information according to the nature of the signal [18], its extraordinary time consumption also limits its application in real-time processing. Mathematical morphology provides an alternative approach to image processing based on shape concepts stemming from set theory [19].
Among numerous morphology operations, the top-hat transform, as an important and effective image feature extraction operation, needs no a priori filtering and is very suitable for extracting the target feature regions of images and fetching their distinct details [19, 20]. Therefore, with different structuring elements and combined with multi-scale techniques, the top-hat transform has been widely used in target detection, image segmentation, image enhancement, image fusion and so on [21-27]. The traditional multi-scale top-hat transform employing a single structuring element can extract multi-scale image features and
perform well in some fusion cases. However, some interesting image regions or image details may be smoothed, which affects the performance of these algorithms [28, 29]. In order to avoid important image regions or details being smoothed, some improved top-hat transforms have been proposed to enhance the feature extraction performance by changing the structuring elements and the sequence of morphology operations [30-36]. Among them, X.Z. Bai proposed a new algorithm called the center-surround top-hat transform, which can effectively extract image features and interesting details [30-32]. Based on the center-surround top-hat transform, image fusion results can not only keep abundant feature regions and detail information, but also obtain an appropriately enhanced contrast ratio [32]. The toggle operator, as another important operation for extracting image features in morphology methods, is also widely used in the field of image fusion. However, it too has some shortcomings. Using an opening- and closing-based toggle operator can preserve details in the fusion results [33], but some important feature regions are smoothed, because this method only focuses on detail extraction. The multi-scale toggle contrast operator is used for preserving the target edges well in the fusion image [34], but the contrast of the whole fusion image is unacceptable, because the algorithm easily loses sight of the target regions in the source images. Combining the multi-scale top-hat transform with the toggle contrast operator can preserve image regions and details in the fusion result [35], but the visual effect of the fusion image is unsatisfying, especially for the infrared image. An exciting fusion result can be acquired for most source images by using a sequentially combined toggle and top-hat based contrast operator [36], but the performance of that method is undesirable for infrared images, because not all the smoothed image features are useful in the sequential operations. So far, there are always some shortcomings when the toggle operator is employed in image fusion. In addition, the newly proposed morphology methods focus mostly on feature region extraction, while the effect caused by the image fusion rules is neglected to some degree. For instance, the base image (similar to the low frequency components) is usually obtained by simply averaging the different source images or smoothed source images, which will probably result in undesirable effects such as reduced contrast and loss of some important background information [32-36]. Although weighted average methods according to local energy are adopted and analyzed in [27] to improve the contrast of the fused image, the differences in imaging characteristics between the infrared image and the visible image have not been taken into account. Meanwhile, in most of the morphology fusion methods, the feature regions (similar to the high frequency components) are often merged by simply picking the maximum value, which might suffer from detail loss and image distortion [9, 17]. Therefore, it is worthwhile to introduce proper fusion rules into the improved top-hat transform based fusion method to overcome the defects mentioned above. Inspired by the ideas of X.Z. Bai [30-32], an improved multi-scale top-hat transform for effectively extracting feature regions and detail information is proposed in this paper based on two different but correlated structuring elements.
The bright (dark) feature regions at the same scale level, extracted from the two source images, are merged by using the multi-judgment contrast fusion rule, and the final feature images are obtained by adding all scales of the same feature regions together. Then, the base image, to be imported into the fusion result, is acquired by performing the Gaussian fuzzy logic fusion rule on the two smoothed source images. Finally, the fusion image is obtained by combining the feature images and the base image with a proper strategy. The proposed method can effectively decrease ambiguity and minimize redundancy while retaining visible details well and
highlighting infrared target information. To verify the effectiveness of the proposed fusion algorithm, both visual analysis and quantitative analysis are employed against eight comparative algorithms: LP [8], DWT [10], NSCT [13] and the multi-scale morphology methods proposed in [27, 29, 32-34]. Meanwhile, the proposed algorithm with the conventional fusion rules used in Ref. [27] is also compared with the proposed algorithm with suitable fusion rules, which helps to verify the effectiveness of the suitable fusion rules for infrared-visible image fusion. The experimental results on various infrared and visible images prove that the performance of the proposed fusion algorithm with suitable fusion rules is obviously superior to the nine comparative fusion algorithms in terms of preserving the details and characteristics of the source images. The fusion results of the proposed algorithm have high contrast, remarkable infrared target information and rich visible details that are more suitable for human visual characteristics. The rest of this paper is organized as follows: In Section 2, the multi-scale top-hat transform with a center-surround structuring element is introduced. In Section 3, the details and fusion framework of the proposed algorithm are given. In Section 4, five groups of fusion experiments on infrared-visible images are performed with nine comparative fusion methods to verify the effectiveness of the proposed method, and the fusion results and analysis are provided. Finally, conclusions are drawn in Section 5.
2. Methodology
This section is divided into two subsections to explain the classical top-hat transform and the improved multi-scale center-surround top-hat transform. Section 2.1 introduces the basic theory of morphological transforms and the multi-scale top-hat transform. Section 2.2 illustrates the idea of the improved multi-scale center-surround top-hat transform.
2.1 Classic top-hat transform
Mathematical morphology has been widely used in image processing [19]. Most morphology operations consist of sequential operations based on dilation and erosion with a specific structuring element. Dilation obtains the maximum value within the neighborhood defined by the structuring element, whereas erosion obtains the minimum value. Assuming that f(x, y) is a grayscale image of size M×N and b(u, v) is a structuring element, the dilation and erosion of f(x, y) by b(u, v), denoted by f⊕b and fΘb respectively, are defined as

(f \oplus b)(x, y) = \max_{u,v} \{ f(x-u, y-v) + b(u, v) \}   (1)

(f \ominus b)(x, y) = \min_{u,v} \{ f(x+u, y+v) - b(u, v) \}   (2)
Based on dilation and erosion, the opening and closing operations of f(x, y) by b(u, v), denoted by f∘b and f•b respectively, are defined as

f \circ b = (f \ominus b) \oplus b, \quad f \bullet b = (f \oplus b) \ominus b   (3)

Applying the opening and closing operations, the classic top-hat transformations of f(x, y) by b(u, v) are defined as

WTH = f - f \circ b, \quad BTH = f \bullet b - f   (4)

The opening operation smooths bright image regions by removing bright features smaller than the structuring element, so WTH, the white top-hat transformation, can be used to extract the bright features of the image. Conversely, the closing operation smooths dark image regions by removing dark features smaller than the structuring element, so BTH, the black top-hat transformation, can be used to extract the dark regions of the image. In other words, the top-hat transform decomposes an image into a base image and a feature region image, and the performance of the top-hat transformation is sensitive to image feature sizes. Because an image usually includes rich feature regions that have different sizes and exist at different scales, a single structuring element will fail to extract all the feature regions of interest. To avoid leaving out interesting characteristics because of a single structuring element, multi-scale techniques are widely used in the top-hat transformation for different applications [24-29, 32-36].
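To make the classical transform concrete, the short Python sketch below computes WTH and BTH of Eq. (4) with SciPy's flat grayscale morphology; the function name and the square structuring element size are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy import ndimage

def classical_top_hats(f, se_size=5):
    """White/black top-hat of Eq. (4) with a flat square structuring element."""
    f = f.astype(np.float64)
    opened = ndimage.grey_opening(f, size=(se_size, se_size))  # smooths bright features
    closed = ndimage.grey_closing(f, size=(se_size, se_size))  # smooths dark features
    wth = f - opened   # bright features smaller than the structuring element
    bth = closed - f   # dark features smaller than the structuring element
    return wth, bth
```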
2.2 Improved multi-scale center-surround top-hat transform
By changing the shape of the structuring element, various improved multi-scale top-hat transforms have been proposed [26-29]. However, the specific information between the region of interest and its surrounding regions is not well considered, because the two are difficult to distinguish with structuring elements of the same shape. To improve the discrimination ability of the top-hat transform in information extraction, X.Z. Bai proposed the multi-scale center-surround top-hat transform, which uses two different but correlated structuring elements and can effectively extract the discriminating information between the regions of interest and the surrounding regions while maintaining other region features well [30]. Soon after, this new top-hat transform was successfully applied to different image processing tasks [31, 32]. Owing to the dual-shape design of the structuring elements, the multi-scale center-surround top-hat transform can extract the region features and details of interest [31, 32]. However, some disadvantages still exist, because some feature information is removed in the process of obtaining the feature regions. Here, we inherit the idea of the dual-shape design and propose an improved multi-scale top-hat transform. Similar to the method in [30, 31], the new structuring elements used in the proposed multi-scale top-hat transform are constructed from two different but correlated structuring elements, defined as the inner structuring element (Bi) and the outer structuring element (Bo). It is important to note that the size of Bo must be larger than the size of Bi. The first structuring element is denoted by Bb, whose size can change from Bi to Bo according to the application. The second structuring element, called the marginal structuring element and denoted by ΔB = Bo − Bi, picks out the discriminating information between the region of interest and its surrounding regions. An intuitive illustration of the dual structuring elements with a square shape is shown in Fig. 1.
Fig. 1 The dual structuring elements used in this paper: (a) Bb (size L), (b) ΔB (size W, margin width M)
The parameters of the dual structuring elements are marked in Fig. 1: L and W are the sizes of Bb and ΔB, and the width of the margin in ΔB is set as M. Based on the dual structuring elements, the new opening and closing operations, denoted by f□Boi and f■Boi, can be calculated as follows [30, 31].

f \square B_{oi}(x, y) = ((f \oplus \Delta B) \ominus B_b)(x, y)   (7)

f \blacksquare B_{oi}(x, y) = ((f \ominus \Delta B) \oplus B_b)(x, y)   (8)
Boi denotes the set of dual structuring elements consisting of Bo and Bi. With the dual structuring elements, f□Boi and f■Boi can be used to smooth the bright and dark feature regions, respectively. Different from the method of extracting the bright and dark feature regions used by X.Z. Bai [30-32], in this paper the new white top-hat transform and black top-hat transform, denoted by NWTH and NBTH, are expressed as follows.

NWTH(x, y) = f(x, y) - f \square B_{oi}(x, y)   (9)

NBTH(x, y) = f \blacksquare B_{oi}(x, y) - f(x, y)   (10)
NWTH and NBTH extract the bright and dark feature regions at the scale determined by the structuring elements used. To extract the multi-scale feature regions of the source image, we acquire multi-scale structuring elements with the designed dual shapes by increasing their sizes regularly according to a certain step length.
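As a concrete illustration of Eqs. (7)-(10), the sketch below builds the square structuring element Bb and the marginal ring ΔB and then computes NWTH and NBTH with SciPy grayscale morphology. The footprint construction, the parameter interpretation (Bb of side L; ΔB of side W with margin width M) and the operator order follow our reading of Fig. 1 and Eqs. (7)-(8); they are illustrative assumptions rather than the authors' code.

```python
import numpy as np
from scipy import ndimage

def square_footprint(size):
    return np.ones((size, size), dtype=bool)

def ring_footprint(outer, hole):
    """Marginal structuring element dB = Bo - Bi: a square ring with a hole."""
    fp = np.ones((outer, outer), dtype=bool)
    pad = (outer - hole) // 2
    fp[pad:pad + hole, pad:pad + hole] = False
    return fp

def new_top_hats(f, L=5, W=5, M=2):
    """NWTH/NBTH of Eqs. (9)-(10) with dual structuring elements."""
    f = f.astype(np.float64)
    bb = square_footprint(L)
    db = ring_footprint(W, max(W - 2 * M, 1))
    # New opening (Eq. (7)): dilation by dB, then erosion by Bb -> smooths bright regions
    f_open = ndimage.grey_erosion(ndimage.grey_dilation(f, footprint=db), footprint=bb)
    # New closing (Eq. (8)): erosion by dB, then dilation by Bb -> smooths dark regions
    f_close = ndimage.grey_dilation(ndimage.grey_erosion(f, footprint=db), footprint=bb)
    nwth = f - f_open    # Eq. (9): bright feature regions
    nbth = f_close - f   # Eq. (10): dark feature regions
    return nwth, nbth
```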
3 Proposed fusion algorithm
3.1 Extracting multi-scale image features
Similar to X.Z. Bai's scale expansion method [32], to extract the multi-scale region features of the source image, we acquire multi-scale structuring elements by regularly increasing their sizes. Let L and W be the sizes of the initial structuring element Bb and the initial marginal structuring element ΔB, let the width of the margin in ΔB be M, let the scale of the structuring element be expanded n times, and let the expanding step length be Step each time. Then, the sizes of the structuring elements at each scale s (1 ≤ s ≤ n) are

L_s = L + s \times Step, \quad W_s = W + s \times Step   (11)
At scale s, the dual structuring elements are denoted by ΔBs and Bbs, and their sizes by Ws and Ls, respectively. Suppose that there are k source images f1, ..., fj, ..., fk to be fused, 1 ≤ j ≤ k. Then, the bright and dark feature regions extracted by the new top-hat transform at scale s are

NWTH_j^s(x, y) = f_j(x, y) - (f_j \square B_{oi,s})(x, y)   (12)

NBTH_j^s(x, y) = (f_j \blacksquare B_{oi,s})(x, y) - f_j(x, y)   (13)
As the scale of the structuring element increases, the extracted feature regions become rougher. Moreover, the features extracted at a higher scale usually contain the features existing at a lower scale [37]. Therefore, the feature information extracted at different scales is highly redundant, which will affect the final fusion result. In order to obtain good feature information, we use the method below to acquire the bright and dark image feature regions, which can be regarded as image details and target information.
NWTH_j^{s,s+1}(x, y) = NWTH_j^{s+1}(x, y) - NWTH_j^s(x, y)   (14)

NBTH_j^{s,s+1}(x, y) = NBTH_j^{s+1}(x, y) - NBTH_j^s(x, y)   (15)

Here, NWTH_j^s contains the bright features of fj from scale 1 to s, and NWTH_j^{s+1} contains the bright features of fj from scale 1 to s+1, so NWTH_j^{s,s+1} represents the bright feature image of fj at scale s+1. Similarly, NBTH_j^{s,s+1} represents the dark feature image of fj at scale s+1.
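Combining Eqs. (11)-(15), a possible multi-scale extraction loop is sketched below. It reuses new_top_hats from the earlier listing; treating scale 0 as the initial sizes and simply enlarging both structuring elements by Step at every scale is our assumption.

```python
def multiscale_features(f, n=6, L0=5, W0=5, M=2, step=2):
    """Per-scale bright/dark feature images of Eqs. (12)-(15)."""
    nwth, nbth = [], []
    for s in range(n + 1):                       # scales 0..n, sizes per Eq. (11)
        Ls, Ws = L0 + s * step, W0 + s * step
        w, b = new_top_hats(f, L=Ls, W=Ws, M=M)
        nwth.append(w)
        nbth.append(b)
    # Eqs. (14)-(15): keep only what scale s+1 adds over scale s
    d_bright = [nwth[s + 1] - nwth[s] for s in range(n)]
    d_dark = [nbth[s + 1] - nbth[s] for s in range(n)]
    return d_bright, d_dark
```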
3.2 Image feature region fusion
The extracted bright (dark) image features contain the main characteristics and details of the source images, so a key problem of feature image fusion is whether the information extracted from the source images is reasonably retained. Simply taking the maximum value among the source feature images at the same scale ignores the relevance between pixels, which may cause loss of detail information and image distortion; moreover, noise is also retained in the fusion image [38]. In order to strengthen the relevance between pixels, neighborhood-based regional processing (such as regional gradient energy contrast, regional variance contrast and regional clarity contrast) is adopted in image feature fusion; it makes use of the correlation between pixels to compute the fusion factor and achieves acceptable effects in many fusion scenarios [27]. However, not every neighborhood-based fusion rule can combine the image features well, because a single fusion rule easily loses image details and reduces the ability to suppress noise [38]. Therefore, judging the optimal fusion value from multiple fusion rules is a good solution to the defects of a single rule [38]. Regional energy and regional clarity each represent the extracted image features well to some extent, so a multi-judgment contrast fusion rule built on regional energy and regional clarity can maintain the important features and details effectively and maximize the relative information at each individual scale level. The contrast ratios of both regional energy and regional clarity are compared in the used multi-judgment fusion rule as below [38].
(1) Fusion rule based on regional energy contrast: the regional energy of the extracted infrared and visible bright (dark) features at the same scale level is first calculated by Eq. (17), and the regional energy contrast is

R_1(x, y) = \frac{E_{VI}(x, y) / \bar{E}_{VI}}{E_{IR}(x, y) / \bar{E}_{IR}}   (16)

E(x, y) = \frac{1}{M \times N} \sum_{m=-(M-1)/2}^{(M-1)/2} \sum_{n=-(N-1)/2}^{(N-1)/2} f(x+m, y+n)^2   (17)
In Eq. (16), EVI(x, y) and EIR(x, y), calculated by Eq. (17), denote the regional energy of the visible feature and the infrared feature in the corresponding window (the local window is 3×3), and ĒVI and ĒIR are their mean values over the whole feature image. R1 denotes the regional energy contrast. If R1(x, y) > 1, which means that the regional energy contrast of the visible feature is larger than that of the infrared feature, we choose the feature value from the visible feature; otherwise, the infrared feature value is chosen.
(2) Fusion rule based on regional clarity contrast:

C(x, y) = \left[ \frac{1}{M \times N} \sum_{x \in M, y \in N} \left( \left( f(x, y) - f(x+1, y) \right)^2 + \left( f(x, y) - f(x, y+1) \right)^2 \right) \right]^{1/2}   (18)
In Eq. (18), C (x, y) denotes the clarity of each pixel point in feature image. R2 ( x, y )
CVI ( x, y ) / CVI C IR ( x, y ) / C IR
(19)
In Eq. (19), CVI(x, y) and CIR(x, y), calculated by Eq. (18), denote the regional clarity of the visible feature and the infrared feature in the corresponding window (the local window is 3×3), and C̄VI and C̄IR are their mean values over the whole feature image.
R2 denotes the regional clarity contrast. If R2(x, y) > 1, which means that the regional clarity contrast of the visible feature is larger than that of the infrared feature, we choose the feature value from the visible feature; otherwise, the infrared feature value is chosen.

R(x, y) = \max\left( R_1(x, y), R_2(x, y) \right)   (20)

In Eq. (20), R(x, y) denotes the maximum of R1(x, y) and R2(x, y). Finally, we obtain the fused feature images (WF_{s,s+1}, BF_{s,s+1}) at each scale level according to the value of R(x, y), based on the consistency validation scheme [39]. In order to integrate the bright and dark feature regions at all scales, all the WF_{s,s+1} and BF_{s,s+1} are combined as follows:

WF = \sum_{s=1}^{n} WF_{s,s+1}, \quad BF = \sum_{s=1}^{n} BF_{s,s+1}   (21)

WF and BF represent the final fused bright and dark feature images, respectively.
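The multi-judgment contrast rule of Eqs. (16)-(20) can be sketched as follows. The 3×3 regional energy and clarity are computed with a uniform filter, the normalization by the image-wide mean reflects our reading of Eqs. (16) and (19), and the consistency validation of [39] is omitted for brevity.

```python
import numpy as np
from scipy import ndimage

def regional_energy(f, win=3):
    return ndimage.uniform_filter(f ** 2, size=win)                       # Eq. (17)

def regional_clarity(f, win=3):
    gx = np.diff(f, axis=1, append=f[:, -1:])                             # horizontal differences
    gy = np.diff(f, axis=0, append=f[-1:, :])                             # vertical differences
    return np.sqrt(ndimage.uniform_filter(gx ** 2 + gy ** 2, size=win))   # Eq. (18)

def fuse_features(feat_vi, feat_ir, eps=1e-12):
    """Fuse one pair of visible/infrared feature images (Eqs. (16)-(20))."""
    e_vi, e_ir = regional_energy(feat_vi), regional_energy(feat_ir)
    c_vi, c_ir = regional_clarity(feat_vi), regional_clarity(feat_ir)
    r1 = (e_vi / (e_vi.mean() + eps)) / (e_ir / (e_ir.mean() + eps) + eps)  # Eq. (16)
    r2 = (c_vi / (c_vi.mean() + eps)) / (c_ir / (c_ir.mean() + eps) + eps)  # Eq. (19)
    r = np.maximum(r1, r2)                                                  # Eq. (20)
    return np.where(r > 1, feat_vi, feat_ir)
```

The per-scale outputs are then summed as in Eq. (21), for example WF = sum(fuse_features(v, i) for v, i in zip(bright_vi, bright_ir)).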
3.3 Acquiring the base image and the fusion image
As the fused bright and dark feature images, WF and BF mainly represent the image details and image features of the source images. Therefore, a base image that contains the target information of the infrared image and the background details of the visible image should be prepared. As the opening and closing operations respectively smooth the bright and dark information of an image, some researchers obtain the base image by taking the maximum value among the smoothed source images or the mean value of the source images [30-36]. However, this not only results in low contrast, but also leads to loss of background details in infrared and visible image fusion. Although weighted average methods can improve the contrast of the fusion image [27], infrared targets and visible background details may be weakened in the final fusion image; moreover, averaging and weighted averaging cause high redundancy. Interestingly, Gaussian fuzzy logic accords well with the histogram characteristics of the smoothed infrared image, and the degrees of membership give a good description of the target and background of the infrared image [40]: they assign larger weights to the target and smaller weights to the background in the smoothed infrared image under the pervasive presence of uncertainty. Therefore, Gaussian fuzzy logic can be used to highlight the target information, suppress the infrared background, and thereby effectively discriminate the target from the background in the smoothed infrared image [40]. In turn, the smoothed visible image is given appropriate weights to emphasize the background details through an adaptive weighted average between the smoothed infrared image and the smoothed visible image. The fused base image can thus enhance the infrared target and keep the visible background details well, while the Gaussian fuzzy logic fusion rule decreases ambiguity and minimizes redundancy in the base image. The details of the Gaussian fuzzy logic fusion rule for acquiring the base image are described below. Firstly, the smoothed infrared (visible) bright and dark images are acquired by using the opening and closing operations with the largest-scale structuring element Bbn:

O_{IR} = f_{IR} \circ B_b^n, \quad O_{VI} = f_{VI} \circ B_b^n   (22)

C_{IR} = f_{IR} \bullet B_b^n, \quad C_{VI} = f_{VI} \bullet B_b^n   (23)
OIR (OVI) is the smoothed infrared (visible) bright image, and CIR (CVI) is the smoothed infrared (visible) dark image. Secondly, the smoothed bright images are combined by using Gaussian fuzzy logic with an adaptive weighted average:

O(x, y) = \eta_T(x, y) O_{IR}(x, y) + \eta_B(x, y) O_{VI}(x, y)   (24)

Herein, the subscripts IR and VI stand for the smoothed infrared and visible bright images, respectively. ηT and ηB are the degrees of membership (weights) of the target and the background for each pixel in the smoothed infrared and visible images, respectively. The weight ηT highlights the infrared target information and suppresses the infrared background, while ηB emphasizes the background details from the smoothed visible image. Therefore, the fused smoothed bright image O(x, y) preserves both the bright target information of the smoothed infrared image and the bright background details of the smoothed visible image. The formulas of ηT and ηB are

\eta_T(x, y) = \exp\left( -\frac{\left( O_{IR}(x, y) - \mu \right)^2}{2 (k \sigma)^2} \right)   (25)

\eta_B(x, y) = 1 - \eta_T(x, y)   (26)
In Eq. (25), ηT describes the target and background characteristics of the infrared image through Gaussian fuzzy logic, while μ and σ are the mean value and standard deviation of the smoothed infrared bright image, respectively. k is a constant that may be optimized experimentally; its classical threshold value is usually set between 1 and 3, and in our experiments k = 1.5. In the same way, the fused smoothed dark image C(x, y) can be obtained. After obtaining the fused smoothed bright (dark) images, the base image with a highlighted infrared target and emphasized visible background details is obtained as

A(x, y) = \left( O(x, y) + C(x, y) \right) / 2   (27)
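A minimal sketch of the base-image computation of Eqs. (22)-(27) is given below; using a flat square structuring element of side l_max for the largest-scale opening and closing is our simplifying assumption.

```python
import numpy as np
from scipy import ndimage

def base_image(f_ir, f_vi, l_max, k=1.5):
    """Base image of Eqs. (22)-(27) via Gaussian fuzzy-logic weighting."""
    f_ir = f_ir.astype(np.float64)
    f_vi = f_vi.astype(np.float64)
    se = (l_max, l_max)
    o_ir = ndimage.grey_opening(f_ir, size=se)   # Eq. (22): smoothed bright images
    o_vi = ndimage.grey_opening(f_vi, size=se)
    c_ir = ndimage.grey_closing(f_ir, size=se)   # Eq. (23): smoothed dark images
    c_vi = ndimage.grey_closing(f_vi, size=se)

    def fuzzy_combine(s_ir, s_vi):
        mu, sigma = s_ir.mean(), s_ir.std()
        eta_t = np.exp(-((s_ir - mu) ** 2) / (2.0 * (k * sigma) ** 2))  # Eq. (25)
        eta_b = 1.0 - eta_t                                             # Eq. (26)
        return eta_t * s_ir + eta_b * s_vi                              # Eq. (24)

    return 0.5 * (fuzzy_combine(o_ir, o_vi) + fuzzy_combine(c_ir, c_vi))  # Eq. (27)
```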
The final operation is to import the combined bright and dark feature images into the base image with a reasonable strategy to obtain the final fusion image. The mean-value-weighted fusion method, which can adaptively adjust the imported bright and dark features, imports the useful bright and dark features into the base image and enhances the local contrast of the fusion image [27]. We adopt the mean-value-weighted method to integrate the feature regions and the base image as follows.

fu(x, y) = A(x, y) + p_w(x, y) WF(x, y) - p_b(x, y) BF(x, y)   (28)

p_w(x, y) = \frac{m_w(x, y)}{m_w(x, y) + m_b(x, y)}, \quad p_b(x, y) = \frac{m_b(x, y)}{m_w(x, y) + m_b(x, y)}   (29)

where mw and mb are the mean values of WF and BF within a window of size a × a (a = 3 in this paper), A represents the base image, and fu represents the final fusion image.
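A sketch of Eqs. (28)-(29) follows; the local means are taken with a uniform filter, and the sign convention of adding bright features and subtracting dark ones reflects our reconstruction of Eq. (28).

```python
import numpy as np
from scipy import ndimage

def combine(base, wf, bf, win=3, eps=1e-12):
    """Mean-value-weighted import of WF and BF into the base image (Eqs. (28)-(29))."""
    mw = ndimage.uniform_filter(wf, size=win)   # local mean of the bright feature image
    mb = ndimage.uniform_filter(bf, size=win)   # local mean of the dark feature image
    pw = mw / (mw + mb + eps)                   # Eq. (29)
    pb = mb / (mw + mb + eps)
    return base + pw * wf - pb * bf             # Eq. (28)
```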
3.4 Image fusion framework
In this subsection, the image fusion framework, shown in Fig. 2, is proposed. We assume that every group of source images is pre-registered. The fusion procedure for infrared and visible images is as follows (a minimal end-to-end sketch is given after the step list):
Step 1: Extract the bright (dark) feature regions at several scale levels by the improved multi-scale center-surround top-hat transform of Section 3.1.
Step 2: Combine the bright feature regions of the infrared and visible images at each scale level by the multi-judgment contrast fusion rule, then obtain the final bright feature image by adding all scales of bright feature images together. The final dark feature image is obtained in the same way.
Step 3: Calculate the smoothed bright (dark) images of the source images by the opening and closing operations with the largest-scale structuring element, and acquire the base image by performing the Gaussian fuzzy logic fusion rule on the smoothed images.
Step 4: Obtain the fusion image by importing the bright and dark feature images into the base image with the proper weight strategy.
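Tying Steps 1-4 together, a hypothetical driver built from the earlier sketches could look like this:

```python
def fuse(f_ir, f_vi, n=6, L0=5, W0=5, M=2, step=2, k=1.5):
    # Step 1: multi-scale bright/dark features of each source image
    bri_ir, dar_ir = multiscale_features(f_ir, n, L0, W0, M, step)
    bri_vi, dar_vi = multiscale_features(f_vi, n, L0, W0, M, step)
    # Step 2: multi-judgment contrast fusion per scale, then sum over scales (Eq. (21))
    wf = sum(fuse_features(v, i) for v, i in zip(bri_vi, bri_ir))
    bf = sum(fuse_features(v, i) for v, i in zip(dar_vi, dar_ir))
    # Step 3: Gaussian fuzzy-logic base image with the largest-scale structuring element
    base = base_image(f_ir, f_vi, l_max=L0 + n * step, k=k)
    # Step 4: import the feature images into the base image
    return combine(base, wf, bf)
```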
Fig.2. The image fusion framework of the proposed method. Part 1: acquire the base image; Part 2: acquire the bright and dark feature images; Part 3: acquire the final fusion image
4 Fusion experiments and quality analysis In order to verify the validity and correctness of the proposed fusion framework, five pairs of visible-infrared images are tested in MATLAB 2011b (64bit) on a PC with Intel Core i3/2.3 GHz/12G (64bit). Three currently popular MST-based fusion methods, LP [8], DWT [10], NSCT
[13], are utilized for comparison with the proposed method. In order to make a persuasive and reliable comparison, the number of decomposition layers of the MST-based methods is set to four following Ref. [17], the "max-absolute" rule with a 3×3 window-based consistency verification scheme is adopted to merge the high-pass bands, and the "area energy weight" rule with a 3×3 window is used to combine the low-pass bands. These fusion rules are favorable for acquiring good fusion results with the MST-based methods [17]. Moreover, the multi-scale morphological fusion methods in the literature [27, 29, 32-34] are also compared with the proposed method to demonstrate its advantages. Both visual analysis and objective metrics are employed to evaluate the image fusion performance.
4.1 Objective evaluation metrics
Objective assessment, as the complement of subjective inspection, can effectively give a quantitative comparison according to the nature of the image. In this paper, the following five objective fusion quality metrics are employed for the quantitative comparison [1, 35].
1) Information entropy (IE). IE directly reflects the amount of average information in the fusion image. The larger the IE is, the more abundant the information of the fusion image is. It is defined as

IE = -\sum_{i=0}^{l-1} P(i) \log_2 P(i)   (30)
where P(i) indicates the probability that a pixel's gray value equals i, over the total image pixels.
2) Standard deviation (SD). SD shows the extent of deviation between the gray values of the pixels and the average gray value of the fused image. A bigger SD indicates that the fusion image contains more texture information; in a sense, the quality of the fusion image is directly proportional to the value of SD.

SD = \sqrt{ \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} \left( f(i, j) - \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=1}^{n} f(i, j) \right)^2 }   (31)
where f denotes the final fusion image, whose size is m × n.
3) Average gradient (AG). AG indicates the clarity and details of the image. Similar to SD, the fusion image contains richer gradients and clearer region information as the AG value increases.

AG = \frac{1}{m \times n} \sum_{i=2}^{m} \sum_{j=2}^{n} \sqrt{ \frac{\left( f(i, j) - f(i, j-1) \right)^2 + \left( f(i, j) - f(i-1, j) \right)^2}{2} }   (32)
4) Mutual information (MI). MI reflects the fusion effect and measures the relativity between two or more images. The larger MI is, the more abundant the information the fused image contains.

MI = \frac{JE_{VIS,F} + JE_{IR,F}}{IE_{VIS} + IE_{IR}}   (33)

JE_{A,F} = \sum_{i=0}^{L-1} \sum_{k=0}^{L-1} P_{A,F}(i, k) \log \left[ P_{A,F}(i, k) / \left( P_A(i) P_F(k) \right) \right]   (34)

In Eq. (33), JE denotes the joint entropy between a source image (VIS or IR) and the fusion image (F), and IE is the entropy of the source image. The joint entropy is given by Eq. (34), where PA,F(i, k) is the normalized joint gray-level histogram of the source image and the fusion image.
5) Spatial frequency (SF). SF reflects the overall activity of the image in the spatial domain. The fusion image tends to be clearer and to contain more image details when SF is larger.

SF = \sqrt{ RF^2 + CF^2 }   (35)

RF = \sqrt{ \frac{1}{m \times n} \sum_{i=2}^{m} \sum_{j=1}^{n} \left( f(i, j) - f(i-1, j) \right)^2 }, \quad CF = \sqrt{ \frac{1}{m \times n} \sum_{i=1}^{m} \sum_{j=2}^{n} \left( f(i, j) - f(i, j-1) \right)^2 }   (36)

where RF and CF are the row frequency and column frequency, respectively.
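For reference, the sketch below computes IE, SD, AG and SF following Eqs. (30)-(32) and (35)-(36) on an 8-bit image; MI would additionally require the joint histogram of Eq. (34). Boundary handling of the difference terms is simplified.

```python
import numpy as np

def metrics(f):
    """IE, SD, AG and SF of an 8-bit fused image f (illustrative sketch)."""
    f = f.astype(np.float64)
    hist, _ = np.histogram(f, bins=256, range=(0, 256))
    p = hist / hist.sum()
    ie = -np.sum(p[p > 0] * np.log2(p[p > 0]))                      # Eq. (30)
    sd = np.sqrt(np.mean((f - f.mean()) ** 2))                      # Eq. (31)
    dx = f[:, 1:] - f[:, :-1]                                       # column-wise differences
    dy = f[1:, :] - f[:-1, :]                                       # row-wise differences
    ag = np.mean(np.sqrt((dx[1:, :] ** 2 + dy[:, 1:] ** 2) / 2.0))  # Eq. (32)
    rf = np.sqrt(np.mean(dy ** 2))                                  # Eq. (36)
    cf = np.sqrt(np.mean(dx ** 2))
    sf = np.sqrt(rf ** 2 + cf ** 2)                                 # Eq. (35)
    return ie, sd, ag, sf
```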
The greater the above evaluation values are, the better the fusion algorithm performs, and the more useful information the fusion results contain.
4.2 Experiments and analysis
This section includes two subsections that discuss the fusion image quality through subjective qualitative evaluation and objective quantitative evaluation, respectively. To exhibit the advantages of the proposed fusion algorithm over the conventional MST-based and morphology-based methods, five groups of comparative fusion results of infrared-visible images are shown in Figs. 3-7. In these figures, (a) and (b) show the source infrared and visible images, and (c)-(j) show the fused results of the LP-, DWT- and NSCT-based methods and the multi-scale morphological methods proposed in [27, 29, 32-34], respectively. Meanwhile, the fused result of the proposed algorithm with conventional fusion rules (combining the feature images by the maximum value method and acquiring the base image by the mean-value-weighted method), shown in (k), is also used for comparison with the proposed algorithm with suitable fusion rules, to verify the effectiveness of the suitable fusion rules. Finally, the fused results of the proposed algorithm with suitable fusion rules are shown in (l). A value labeled in bold indicates the best performance among the methods on the corresponding fusion metric in Tables 1-5. For simplicity, we denote the fusion methods as M1-M10, as follows. M1-M3 represent the LP-, DWT- and NSCT-based fusion methods, respectively. M4 represents the method of [27]; Ref. [27] analyzes the fusion effect of the extracted features when using area weights in the multi-scale top-hat transform, but the base image is obtained only by averaging the source images, which reduces the contrast of the fusion image. M5 denotes the method of [29]; Ref. [29] acquires the image regions of interest by setting an appropriate threshold in the multi-scale top-hat transform, which may lose some detail information, and the way of obtaining the base image is the same as in M4. M6 is based on the multi-scale center-surround top-hat transform [32]; a new top-hat transform with dual structuring elements is proposed there and works well in image fusion, but the fusion rules and multi-scale technology used in that work degrade the quality of the fusion image. M7 represents the method of [33]; the algorithm places emphasis on detail extraction by using the opening- and closing-based toggle operator and loses sight of the relative information. M8 denotes the fusion method proposed in [34]; the algorithm focuses on image
edge information retention by using the multi-scale toggle contrast operator, which also ignores the relative information. M9 is the proposed fusion method but with conventional fusion rules; its other settings are the same as M10, and it is taken as a comparison of fusion rules against M10. M10 is the proposed fusion method with suitable fusion rules, whose initial dual structuring element parameters are set as L = W = 5 and M = 2, with a scale expanding step length of 2 and a maximum expanding number of 6. The parameter settings of M4-M8 are in strict accordance with the literature [27, 29, 32-34].
4.2.1. Subjective visual analysis
In order to make the subjective evaluation more reliable and persuasive, we first invited several colleagues to make a comprehensive survey of the fusion results of the ten methods. According to their conclusion, the fused results of the proposed method outperform the comparative methods in general. For example, Fig. 3 shows the comparison results for the ''UN Camp'' infrared and visible images. The source infrared image has a prominent target and the source visible image has rich visual details, as shown in Fig. 3(a) and (b). The fusion images based on M1-M3 have a clear infrared target feature of the pedestrian; however, they lose a large amount of visible detail information, such as the shrubs and the road. The fusion result based on M4 is not very clear, which leads to texture disorder. The fusion result based on M5 is superior to the previous methods in terms of texture details. However, some details in the fusion image are still smoothed, such as the trees at the bottom right of the fusion image, which affects the visual impression. The fusion image based on M6 keeps the infrared target of the pedestrian well, but other image information from the infrared and visible images, such as the bush, trees and wire netting, is not very clear. The fusion image based on M7 highlights the detail information of the source images and preserves the infrared target well, but it does not reasonably merge the infrared intensity information and the visible background detail information, which results in a bad visual effect with low contrast. The target edges in M8 are enhanced to a certain extent; however, too many feature details and too much background information are lost, and the fusion result is also discordant with human vision. Although the algorithm structure of M9 is similar to M10, M9 uses conventional fusion rules to combine the image information; it can be seen that the result of M9 is visually discordant because it has low contrast and loses many source image details. Obviously, compared with the above nine comparative methods, M10 with suitable fusion rules not only obtains high contrast and clarity, but also possesses rich visible spectral detail information and preserves the primary infrared target information well. As we can see, the trees at the bottom right and bottom left of the fusion image show obvious infrared and visible features, the shrubs on the right have much more visible texture information, the infrared target of the pedestrian is significantly enhanced, and the overall visual effect is the best among all the used methods. Fig. 4 shows the second example, on the ''boat'' image set. Fig. 4(a) only emphasizes the thermal source information, such as the engine and the chimney, whereas Fig. 4(b) describes the overall surroundings. The visual effects based on M1-M3 look similar and acceptable, because they retain the main characteristics of the source images.
However, we can find that some visual detail information is dim or lost, such as the cloud in the sky and the tires on the edge of the boat. The fusion results based on M4-M9 have poor visual effect resulting from the lack of many visual background details and infrared target information. Intuitively, M10 properly combines the infrared and visible characteristics, and the fusion result has much more reasonable contrast and
better visual effects. For example, the chimney, control tower and cloud show obvious target and detail information coming from the source infrared and visible images, and the contrast and details in the fusion image are highlighted appropriately, especially the surface waves. The third comparison experiment, shown in Fig. 5, is on the ''Octec'' infrared and visible images. As seen in Fig. 5(c)-(l), all ten methods can merge the target information of the infrared image and the detail information of the visible image, but there are obvious differences among them. The fusion results based on M1-M3 have a similar visual effect and preserve many visual details. However, they lose some infrared signatures, such as the cloud and the people in the infrared image, and the image contrast is reduced. The fusion results based on M4 and M9 are also unsatisfactory, for some details and targets are blurry and the two images have low contrast, which leads to a bad visual effect. Although the fusion results based on M5-M6 preserve some visible detail information and infrared target information, some information is still lost or distorted, such as the cloud and the window. The fusion results based on M7-M8 place heavy emphasis on highlighting details and edges of the source images and miss the main energy and background information, which results in image blurring. As shown in Fig. 5(l), M10 preserves the infrared target and visible detail information well, such as the cloud, lawn and houses, and properly enhances the image contrast, which produces the best visual effect. Fig. 6 shows the comparative fusion results for the ''Trees'' visible and infrared images. The source visible image is not clear due to poor contrast, but the infrared target is distinct in the source infrared image. The results based on M1-M3 are similar because of the same fusion rules, especially the ''area energy weight'' rule. Their infrared target information is well preserved, but obvious halos appear as a result of pseudo-Gibbs phenomena, and the visible details become unclear. M4 and M5 perform better than the previous methods; their fusion images have clear target and texture information. However, it is still difficult to distinguish the trees and the hill because some visible details are lost. Although the infrared targets based on M6 are enhanced to a certain extent, the areas of the hillside and the trees can hardly be distinguished, and the clarity of the whole image is unacceptable. Similar to the above analysis, M7 and M8 have a very bad visual effect. The fusion result based on M9 has low contrast and obvious blurring. Obviously, M10 still achieves the best visual effect, with high contrast, rich visible details and highlighted infrared targets. For example, the pedestrian and trees show obvious infrared target and visible detail information, and the contrast of the whole fusion image is highlighted appropriately, so it is easy to identify the pedestrian, trees and lawn. The last comparative fusion experiments are performed on the ''hillside'' visible and infrared images, shown in Fig. 7. Similarly, the used methods successfully combine the information of the source images. The visual effect of M1-M3 is similar: the infrared target information is well preserved, but the details from the visible image become smoothed. The fusion results based on M4-M7 and M9 are not clear apart from the pedestrian, because some details are smoothed. The fusion result based on M8 still has a very bad visual effect.
Compared with M1-M9, M10 achieves a better visual effect through properly preserving the infrared target and visible background details. The image contrast is also good while the gray distribution is maintained. For example, the pedestrian is highlighted, and the detail information and contrast of the hillside are highlighted appropriately, which is conducive to human vision. According to the subjective discussion above, a preliminary conclusion can be drawn that the proposed method with suitable fusion rules is superior to the other nine fusion methods in both
preserving important visible details and infrared target information and inheriting the characteristics of source images. Meanwhile, the proposed method can appropriately enhance the contrast of fusion images which is convenient for people's visual sense.
Fig. 3 Fused results of ''UN Camp'' images: (a) infrared image, (b) visible image, (c)-(l) fused results of M1-M10
Fig. 4 Fused results of ''Boat'' images: (a) infrared image, (b) visible image, (c)-(l) fused results of M1-M10
Fig. 5 Fused results of ''Octec'' images: (a) infrared image, (b) visible image, (c)-(l) fused results of M1-M10
Fig. 6 Fused results of ''Trees'' images: (a) infrared image, (b) visible image, (c)-(l) fused results of M1-M10
Fig. 7 Fused results of ''hillside'' images: (a) infrared image, (b) visible image, (c)-(l) fused results of M1-M10
4.2.2 Objective evaluation analysis
Although subjective visual evaluation provides instinctive comparisons, many factors, such as eyesight level, mental state and the equipment used, may influence the subjective results. Moreover, in most cases there are few differences among the fused results, which makes it difficult to give a reliable judgment depending only on subjective visual evaluation. For example, the fusion results (c)-(e) in Figs. 3-7 are almost the same to the human eye; it is hard to find their tiny differences through subjective visual evaluation. Therefore, it is necessary to provide objective assessment as the complement of subjective inspection. In this paper, we use five objective fusion quality metrics, IE, MI, SD, AG and SF, for the quantitative comparison. A value labeled in bold indicates the best performance among the methods on the corresponding fusion metric in Tables 1-5, which show the objective evaluation of Figs. 3-7 in order. From these tables, we find that the IE, MI and SD values of M10 are obviously larger than those of M1-M8, although the MI value of M3 is slightly larger than that of M10 in the second case, which demonstrates that the fused results based on M10 have better image quality, with richer detail information and clearer target regions, than those of M1-M8. It is obvious that the IE, MI and SD values of M4-M8 are lower than those of M1-M3 in the vast majority of cases, because the morphology methods M4-M8 focus on detail and edge extraction and ignore the fusion of the related information between the source images, especially M7 and M8. The SF and AG values of M10 are superior to those of M1-M6, which shows that the fused images based on M10 have a higher clarity level and higher contrast than the results based on M1-M6. Although the SF and AG values of M7-M8 are larger than those of M10, the IE, MI and SD values of M7-M8 are obviously smaller than those of M10. Moreover, M7 focuses on detail extraction and M8 places emphasis on highlighting edge information, which contributes to larger SF and AG values. However, neither M7 nor M8 has regard for the other relevant image information, which shows clearly in the visual effect of the fused images and in the IE, MI and SD values. Therefore, in terms of comprehensive performance, M10 is still superior to M7-M8. Meanwhile, all objective quality metrics of M10 are significantly greater than those of M9, which indicates that the suitable fusion rules can greatly improve the fusion quality of infrared and visible images compared with the conventional fusion rules widely used in morphological fusion methods. Through careful observation, we can also find that the objective fusion performance of M1-M9 changes with the different source images, which demonstrates that the methods M1-M9 have poor adaptability for infrared and visible images acquired in different conditions. On the whole, the proposed method has more excellent performance than the other nine comparative methods, which is consistent with the results of the subjective analysis. Summing up the results above, the proposed method makes the fusion results of infrared and visible images contain rich gradients, clear regions, abundant details, clear contrast and highlighted targets, because it has clear advantages in preserving important visible details and infrared target information. In order to give a more intuitive description, the average quantitative results of each algorithm are shown in Fig. 8. The trend of the chart data is consistent with the results of Tables 1-5; therefore, it can also be recognized that the proposed method is superior to the comparative methods in this paper. From the objective analysis above, it is easily found that the results of the objective evaluation agree well with the subjective analysis. According to the results of both, it can be concluded that the proposed method possesses clear advantages over currently popular MST-based methods and the existing multi-scale morphology methods for infrared and visible image fusion, and that the suitable fusion rules can greatly improve the fusion quality of infrared and visible images compared with the conventional fusion rules. Meanwhile, the proposed fusion method has good adaptability for infrared and visible images acquired in different conditions.
Table 1 Objective assessment of ten fusion methods for ''UN Camp'' images.
Methods    IE        MI        SD         SF         AG
M1         6.8183    1.7245    30.7785    11.4427    4.5376
M2         6.6894    1.8378    28.968     10.9001    4.1929
M3         6.7259    1.85      29.6565    11.1679    4.4842
M4         6.7614    1.3189    31.2617    12.9901    5.7978
M5         6.7864    1.3856    32.56      14.5409    5.9496
M6         6.7552    1.8027    33.8282    14.0882    5.1003
M7         6.6723    1.3653    30.0781    14.7657    5.941
M8         6.5693    1.2744    27.1842    20.5384    6.5514
M9         6.5911    1.3646    27.7175    12.8862    5.3328
M10        7.1775    1.9437    40.6689    16.1646    6.7132
Table 2 Objective assessment of ten fusion methods for ''boat'' images.
Methods    IE        MI        SD         SF         AG
M1         5.8707    2.4748    17.5233    7.0484     2.0532
M2         5.7146    2.4013    16.9728    6.9196     1.9842
M3         5.7353    2.469     17.2383    6.9658     2.0342
M4         5.3762    1.2606    14.7393    7.9064     2.7254
M5         5.3915    1.6683    16.0213    9.3514     2.6027
M6         5.1398    1.6078    13.1571    7.2414     1.9731
M7         5.3534    1.3223    14.0937    9.6643     2.8561
M8         5.2945    1.2081    13.2917    12.0675    3.0713
M9         5.2383    1.272     12.912     8.1602     2.5909
M10        6.0133    2.4338    19.4432    9.6018     2.657
Table 3 Objective assessment of ten fusion methods for ''Octec'' images.
Methods    IE        MI        SD         SF         AG
M1         6.7498    3.1208    40.1625    13.6584    3.947
M2         6.766     3.2053    39.882     13.6095    3.9174
M3         6.7752    3.2313    40.0605    13.6179    3.9414
M4         6.6565    2.4076    30.8034    16.7751    5.1205
M5         6.7676    2.5572    32.4499    18.4819    5.051
M6         6.4913    3.1785    28.475     12.2417    3.4301
M7         6.6829    2.5785    31.5329    21.1827    5.8127
M8         6.6011    2.3933    30.3799    23.6655    5.6023
M9         6.534     2.6087    29.0048    16.3091    4.8335
M10        7.1117    3.5708    56.1842    18.5268    5.1874
Table 4 Objective assessment of ten fusion methods for ''Trees'' images.
Methods    IE        MI        SD         SF         AG
M1         6.1512    2.0627    18.87      9.0871     3.3715
M2         6.1046    2.0735    18.258     8.8281     3.2224
M3         6.1161    2.1448    18.462     8.8067     3.3164
M4         6.1034    1.5611    17.7684    9.0296     3.7669
M5         6.1818    1.7066    18.7075    10.7921    4.4439
M6         6.1568    2.0578    19.1685    10.8642    3.6697
M7         6.187     1.668     18.5943    13.286     4.9962
M8         6.1015    1.4265    17.3106    15.7855    5.1927
M9         6.0556    1.7433    16.9604    9.6386     3.7842
M10        6.6619    2.1587    29.067     11.158     4.4705
Table 5 Objective assessment of ten fusion methods for ''hillside'' images.
Methods    IE        MI        SD         SF         AG
M1         5.9923    1.4138    15.555     7.9412     3.1766
M2         5.8554    1.4456    14.1015    6.645      2.6237
M3         5.9504    1.4082    15.0705    7.6684     3.1218
M4         6.2347    1.1931    18.431     8.2438     3.8469
M5         6.1878    1.2231    17.8185    9.267      4.2004
M6         6.1003    1.4775    17.318     9.2265     3.5003
M7         6.1463    1.182     17.3178    11.3527    4.5885
M8         6.0386    1.0475    16.0681    14.2516    4.9502
M9         6.0622    1.2446    16.3186    8.5559     3.7141
M10        6.5166    1.6011    22.3873    9.693      4.3408
Fig. 8 Quantitative comparison using the mean value of each algorithm: (a) IE, (b) MI, (c) SD, (d) SF, and (e) AG
Apart from the subjective and objective performance, the computational efficiency, characterized by the average running time (ART), is also an important index for judging algorithm performance. We take the experimental results of the ''UN Camp'' source images as an example and obtain the ART by measuring each method 10 times. Table 6 records the ART of M1-M10. M1 and M2 involve simple multi-scale decompositions, so they are the most time-saving, requiring only 0.0512 s and 0.1956 s, respectively. Compared with M1 and M2, M3 is more time-consuming due to the complex mechanism of NSCT, requiring 41.3684 s. The multi-scale morphology methods (M4-M10) are time-saving on the whole. Among them, M7 and M8 have ideal running times (1.0823 s and 0.7635 s), because they only conduct simple morphological operations without complex fusion rules. Compared with M7-M8, M4 and M9 are a bit more time-consuming because they introduce the area-energy fusion rule, but they still only need 1.1947 s and 1.1281 s. Because M5 introduces the standard deviation as a threshold value in the process of extracting feature regions, it needs more time (2.4181 s). M6 involves the new structuring elements and its own way of acquiring feature regions, so it needs a bit more time than M4-M5 and M9. M10 combines the improved top-hat transform and the suitable fusion rules in one fusion method. These introduced operations are direct and convenient, so its ART is less than those of M5 and M6. Of course, the ART of M10 is a bit longer than those of the conventional morphology methods that only use simple morphological operations, such as M4 and M7-M9. However, the ART of M10 is acceptable for image processing.
Table 6 Average running time (ART) based on ''UN Camp'' images (unit: second).
Method    M1        M2        M3         M4        M5        M6        M7        M8        M9        M10
ART       0.0512    0.1956    41.3684    1.1947    2.4181    3.2349    1.0823    0.7635    1.1281    1.7232
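The ART measurement described above is straightforward to reproduce. The sketch below shows one way to time a fusion routine over repeated runs; the function name `proposed_fusion` and the image variables are hypothetical placeholders, not part of the released code.

```python
import time
import numpy as np

def average_running_time(fusion_func, vis, ir, repeats=10):
    # Run the fusion routine `repeats` times and return the mean wall-clock time (ART)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fusion_func(vis, ir)
        times.append(time.perf_counter() - start)
    return float(np.mean(times))

# Example usage with a hypothetical fusion function and source images:
# art = average_running_time(proposed_fusion, vis_img, ir_img, repeats=10)
```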
5. Conclusions
Although infrared and visible images are significantly different, they carry abundant complementary information. To integrate this information effectively and to obtain a fusion image with clear contrast, highlighted targets and rich details, a new morphological fusion method has been proposed based on the improved multi-scale center-surround top-hat transform combined with a multi-judgment contrast fusion rule and a Gaussian fuzzy logic fusion rule. The proposed method has been tested against nine other fusion methods on five groups of infrared and visible images. The subjective and objective analyses demonstrate that the proposed method not only fuses infrared and visible images effectively, but also yields higher contrast, finer visible details and more pronounced infrared target information than the conventional MST-based methods. In general, the proposed method is also superior to the morphology-based methods of [27, 29, 32-34] and to the method using conventional fusion rules for infrared-visible image fusion. A careful analysis shows that the proposed method not only outperforms the nine comparative methods in both visual quality and objective evaluation, but also performs stably on infrared and visible images acquired under different conditions.
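To make the overall structure of such a pipeline easier to picture, the sketch below illustrates a simplified multi-scale top-hat fusion scheme in Python with scikit-image. It uses the classical white/black top-hat operators, per-scale maximum selection and an averaged base image as stand-ins; the improved center-surround top-hat transform, the multi-judgment contrast rule and the Gaussian fuzzy logic rule of the proposed method are not reproduced here, and the structuring-element scales are illustrative assumptions.

```python
import numpy as np
from skimage.morphology import disk, white_tophat, black_tophat

def multiscale_tophat_features(img, scales=(3, 5, 7, 9)):
    """Extract bright and dark feature maps at several structuring-element scales."""
    img = img.astype(np.float64)
    bright, dark = [], []
    for r in scales:
        se = disk(r)
        bright.append(white_tophat(img, se))  # bright regions smaller than the SE
        dark.append(black_tophat(img, se))    # dark regions smaller than the SE
    return bright, dark

def naive_fuse(vis, ir, scales=(3, 5, 7, 9)):
    """Illustrative fusion: per-scale maximum selection plus an averaged base image."""
    vb, vd = multiscale_tophat_features(vis, scales)
    ib, id_ = multiscale_tophat_features(ir, scales)
    bright = sum(np.maximum(b1, b2) for b1, b2 in zip(vb, ib))
    dark = sum(np.maximum(d1, d2) for d1, d2 in zip(vd, id_))
    base = (vis.astype(np.float64) + ir.astype(np.float64)) / 2.0
    fused = base + bright - dark  # import bright features, suppress dark ones
    return np.clip(fused, 0, 255).astype(np.uint8)
```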
Conflict of interest The authors declare that there is no conflict of interest.
Acknowledgments We would like to thank the editors and the reviewers for their careful work and invaluable suggestions, which helped us improve this paper. We are also grateful to the websites www.imagefusion.org and www.vcipl.okstate.edu/otcbvs/bench/Data for providing the experimental images. This work is supported by the National Natural Science Foundation of China (Grant Nos. 61275009 and 61475113).
References
[1] W.W. Kong, L. Yang, Technique for image fusion between gray-scale visual light and infrared images based on NSST and improved RF, Optik 124 (2013) 6423-6431.
[2] S. Jamal, F. Karim, Infrared and visible image fusion using fuzzy logic and population-based optimization, Applied Soft Computing 12 (2012) 1041-1054.
[3] C. He, Q. Liu, H. Li, H. Wang, Multimodal medical image fusion based on IHS and PCA, Procedia Engineering 7 (2010) 280-285.
[4] N. Cvejic, D. Bull, N. Canagarajah, Region-based multimodal image fusion using ICA bases, IEEE Sens. J. 7 (5) (2007) 743-751.
[5] M. Ding, L. Wei, B. Wang, Research on fusion method for infrared and visible images via compressive sensing, Infrared Phys. Technol. 57 (2013) 56-67.
[6] X.Q. Lu, B.H. Zhang, Y. Zhao, H. Liu, H.Q. Pei, The infrared and visible image fusion algorithm based on target separation and sparse representation, Infrared Phys. Technol. 67 (2014) 397-407.
[7] Z. Wang, Y. Ma, J. Gu, Multi-focus image fusion using PCNN, Pattern Recognit. 43 (2010) 2003-2016.
[8] P.J. Burt, E.H. Adelson, The Laplacian pyramid as a compact image code, IEEE Trans. on Communications 31 (4) (1983) 532-540.
[9] G. Piella, A general framework for multiresolution image fusion: from pixels to regions, Inform. Fusion 4 (4) (2003) 259-280.
[10] H.M. Lu, L.F. Zhang, S. Serikawa, Maximum local energy: an effective approach for multisensor image fusion in beyond wavelet transform domain, Comput. Math. Appl. 64 (5) (2012) 996-1003.
[11] H.H. Li, L. Guo, H. Liu, Research on image fusion based on the second generation curvelet transform, Acta Optica Sinica 26 (5) (2006) 657-662.
[12] H.J. Wang, Q.K. Yang, R. Li, Tunable-Q contourlet-based multi-sensor image fusion, Signal Process. 93 (7) (2013) 1879-1891.
[13] J.H. Adu, J.H. Gan, Y. Wang, J. Huang, Image fusion based on nonsubsampled contourlet transform for infrared and visible light image, Infrared Phys. Technol. 61 (1) (2013) 94-100.
[14] W.W. Kong, Y. Lei, H.X. Zhao, Adaptive fusion method of visible light and infrared images based on non-subsampled shearlet transform and fast non-negative matrix factorization, Infrared Phys. Technol. 67 (2014) 161-172.
[15] W.W. Kong, L. Zhang, Y. Lei, Novel fusion method for visible light and infrared images based on NSST-SF-PCNN, Infrared Phys. Technol. 65 (2014) 103-112.
[16] P. Zhu, Z.H. Huang, H. Lei, Fusion of infrared and visible images based on BEMD and NSDFB, Infrared Phys. Technol. 77 (2016) 82-93.
[17] S.T. Li, B. Yang, J.W. Hu, Performance comparison of different multi-resolution transforms for image fusion, Information Fusion 12 (2011) 74-84.
[18] N.E. Huang, Z. Shen, S.R. Long, M.C. Wu, H.H. Shih, The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis, Proc. R. Soc. London A 454 (1998) 903-995.
[19] J. Serra, Image Analysis and Mathematical Morphology, Academic Press, New York, 1982.
[20] Y. Jiang, M.H. Wang, Image fusion with morphological component analysis, Information Fusion 18 (2014) 107-118.
[21] Y. Li, B. Yong, H.Y. Wu, R. An, H.W. Xu, An improved top-hat filter with sloped brim for extracting ground points from airborne lidar point clouds, Remote Sens. 6 (2014) 12885-12908.
[22] M. Liao, Y.Q. Zhao, X.H. Wang, P.S. Dai, Retinal vessel enhancement based on multi-scale top-hat transformation and histogram fitting stretching, Optics & Laser Technology 58 (2014) 56-62.
[23] X.Y. Tan, M. Chen, C.S. Jiang, The small target detection based on wavelet transform and mathematical morphology, Electronics Optics & Control 15 (9) (2008) 25-28.
[24] T. Barata, P. Pina, Improving classification rates by modelling the clusters of training sets in features space using mathematical morphology operators, Pattern Recognition 4 (2002) 90-93.
[25] J.W. Klingler, C.L. Vaughan, T.D. Fraker, L.T. Andrews, Segmentation of echocardiographic images using mathematical morphology, IEEE Transactions on Biomedical Engineering 11 (35) (1988) 925-934.
[26] S. Mukhopadhyay, B. Chanda, Fusion of 2D grayscale images using multiscale morphology, Pattern Recognition 34 (2001) 1939-1949.
[27] X.Z. Bai, S.H. Gu, F.G. Zhou, B.D. Xue, Weighted image fusion based on multi-scale top-hat transform: Algorithms and a comparison study, Optik 124 (2013) 1660-1668.
[28] Y.F. Li, X.Y. Feng, M.W. Xu, Infrared and visible image features enhancement and fusion using multi-scale top-hat decomposition, Infrared and Laser Engineering 41 (10) (2012) 2825-2832.
[29] X.Z. Bai, X.W. Chen, F.G. Zhou, Z.Y. Liu, B.B. Xue, Multiscale top-hat selection transform based infrared and visual image fusion with emphasis on extracting regions of interest, Infrared Phys. Technol. 60 (2013) 81-93.
[30] X.Z. Bai, F.G. Zhou, Analysis of new top-hat transformation and the application for infrared dim small target detection, Pattern Recognition 43 (2010) 2145-2156.
[31] X.Z. Bai, F.G. Zhou, B.B. Xue, Infrared image enhancement through contrast enhancement by using multi scale new top-hat transform, Infrared Phys. Technol. 54 (2) (2011) 61-69.
[32] X.Z. Bai, F.G. Zhou, B.D. Xue, Fusion of infrared and visual images through region extraction by using multi scale center-surround top-hat transform, Optics Express 9 (19) (2011) 8444-8457.
[33] X.Z. Bai, Y. Zhang, Detail preserved fusion of infrared and visual images by using opening and closing based toggle operator, Optics & Laser Technology 63 (2014) 105-113.
[34] X.Z. Bai, F.G. Zhou, B.D. Xue, Edge preserved image fusion based on multiscale toggle contrast operator, Image and Vision Computing 29 (2011) 829-839.
[35] X.Z. Bai, Morphological image fusion using the extracted image regions and details based on multi-scale top-hat transform and toggle contrast operator, Digital Signal Processing 23 (2013) 542-554.
[36] X.Z. Bai, Image fusion through feature extraction by using sequentially combined toggle and top-hat based contrast operator, Applied Optics 51 (31) (2012) 7566-7575.
[37] J.H. Bosworth, S.T. Acton, Morphological scale-space in image processing, Digital Signal Processing 13 (2003) 338-367.
[38] Y. Chen, J. Xiong, H.L. Liu, Q. Fan, Fusion method of infrared and visible images based on neighborhood characteristic and regionalization in NSCT domain, Optik 125 (2014) 4980-4984.
[39] H. Li, B. Manjunath, S. Mitra, Multisensor image fusion using the wavelet transform, Graph. Models Image Process. 57 (3) (1995) 235-245.
[40] S.F. Yin, L.C. Cao, Q.F. Tan, G.F. Jin, Infrared and visible image fusion based on NSCT and fuzzy logic, Proceedings of the 2010 IEEE, 2010, Xi'an, China.
Highlights
- The improved multi-scale top-hat transform is used for infrared-visible image fusion.
- The bright and dark feature regions of the source images are extracted at different scales.
- The bright (dark) feature regions are merged by the multi-judgment contrast fusion rule.
- The base image is acquired by the Gaussian fuzzy logic combination rule.
- The proposed method is effective for infrared and visible image fusion.