Optics Communications 335 (2015) 168–177
A novel fusion scheme for visible and infrared images based on compressive sensing

Zhaodong Liu a, Hongpeng Yin a,b,*, Bin Fang c, Yi Chai a,d

a College of Automation, Chongqing University, Chongqing 400030, China
b Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education, Chongqing 400030, China
c College of Computer Science, Chongqing University, Chongqing 400030, China
d State Key Laboratory of Power Transmission Equipment and System Security and New Technology, College of Automation, Chongqing University, Chongqing 400030, China
ARTICLE INFO

Article history:
Received 13 June 2014
Received in revised form 10 July 2014
Accepted 11 July 2014
Available online 26 September 2014

Keywords:
Compressive sensing
Image fusion
DWT
CoSaMP
Gaussian matrix

ABSTRACT

An appropriate fusion of infrared and visible images can integrate their complementary information and yield a more reliable and complete description of the environmental conditions. Compressed sensing theory, as a low-rate signal sampling and compression method based on the sparsity of a signal under a certain transformation, is widely used in various fields. Applied to image fusion, only a subset of the sparse coefficients needs to be fused, and the fused sparse coefficients can then be used to accurately reconstruct the fused image. A CS-based fusion approach can therefore greatly reduce the computational complexity while simultaneously enhancing the quality of the fused image. In this paper, an improved image fusion scheme based on compressive sensing is presented. The proposed approach preserves more detail information, such as edges, lines and contours, than conventional transform-based image fusion approaches. In the proposed approach, the sparse coefficients of the source images are obtained by the discrete wavelet transform. The low- and high-frequency coefficients of the infrared and visible images are fused by an improved entropy-weighted fusion rule and a max-abs-based fusion rule, respectively. The fused image is reconstructed by a compressive sampling matched pursuit algorithm after local linear projection using a random Gaussian matrix. Several comparative experiments are conducted. The experimental results show that the proposed image fusion scheme achieves better fusion quality than existing state-of-the-art methods.

© 2014 Elsevier B.V. All rights reserved.
1. Introduction

In the last decade, with extraordinary advances in sensor technology, numerous imaging sensors have been applied in military and civilian applications. Infrared (IR) and visible images, two widely used image modalities, contain quite different and complementary information [1–8]. IR images can provide a promising alternative to visible images and have good radiometric resolution, since IR image sensors respond to temperature differences in the surrounding environment. Visible images, in turn, provide more detail information, since the visual sensor interprets the surrounding environment by processing the information contained in visible light. Thus, an appropriate fusion of IR and visible images can combine the complementary information and obtain a more reliable and better description of the environmental conditions.
* Corresponding author at: College of Automation, Chongqing University, Chongqing 400030, China. Tel.: +81 23 6510 2481.
E-mail address: [email protected] (H. Yin).

http://dx.doi.org/10.1016/j.optcom.2014.07.093
0030-4018/© 2014 Elsevier B.V. All rights reserved.
The image fusion approach for IR and visible images can improve spatial awareness, increase accuracy in target detection and recognition, reduce operator workload and increase system reliability [3–8]. The fusion of these two kinds of images has attracted increasing attention due to its widespread use in fields such as night vision and video surveillance [4–8].

Many fusion schemes for IR and visible images have been proposed in recent years. These can be classified into pixel-level, feature-level and decision-level fusion. In comparison to feature- and decision-level fusion, pixel-level fusion retains much more of the original information. Among the existing methods, multi-scale transform-based image fusion is popular in the pixel-level fusion domain, including the discrete wavelet transform (DWT) [5,6], stationary wavelet transform (SWT) [7,8], dual-tree complex wavelet transform (DT-CWT) [9,10], curvelet transform [11], ridgelet transform [12], contourlet transform [13,14], etc. For example, Indhumadhi and Padmavathi utilize a Laplacian fusion algorithm and an SF algorithm to fuse the low and high approximations obtained by the 2D-DWT in order to
Fig. 1. The conventional transform-based image fusion scheme.
enhance the performance of the fused image [15]. Mirajka and Sachin explore transform fusion based on the SWT to acquire more edge information, and their method achieves better quality [16]. Hill et al. introduce a novel image fusion approach based on the shift-invariant DT-CWT [17]. Fig. 1 shows a generic and widely used fusion scheme in a transform domain [12,14,18,19]. In this scheme, there are two different approaches to selecting the transform coefficients. One approach merges all of the coefficients of the source images to obtain a high-quality fused image. The other integrates a number of sparse coefficients, selected by a constrained threshold, to speed up reconstruction. However, both approaches have disadvantages specific to their coefficient selection methods. Although fusing all of the coefficients can achieve better fused image quality, it suffers from high computational complexity, and in real applications it may lead to the problem of "information overload". Conversely, merging only certain coefficients greatly reduces the computational complexity, but the quality of the integrated image cannot be guaranteed. The key problem is how to select the constrained threshold, which usually depends on prior knowledge of the source images [4,12,18]. An ill-suited threshold may result in blocking artifacts and poor fidelity.

In comparison, compressive sensing (CS) theory can accurately reconstruct a high-quality merged image by fusing fewer sparse coefficients. The only constraint is that the coefficients should be sparse [20–24], and in the transform domain almost all of the transform coefficients are sparse. In practical applications, compressive sensing can thus be utilized to overcome the problems of information overload, blocking artifacts and poor fidelity [20–24]. Owing to these merits, various image fusion approaches based on compressive sensing have been proposed. Li and Qin present a self-adaptive weighted-average fusion scheme based on the standard deviation of the measurements to merge IR and visible images; it uses total variation optimization as the recovery tool in the compressive sensing domain [21]. Looney and Mandic take advantage of complex-valued empirical mode decomposition to optimize the quality of the fused image [25]. Chen and Xiao propose a multi-sensor image fusion algorithm based on distributed compressive sensing [26]. Yang and Li propose an image fusion scheme using sparse representation theory [27]. However, there are two major limitations in all of these algorithms. First, only one fusion rule is used to integrate the different coefficients, which may lead to blocking artifacts and poor fidelity for multi-source images. Second, these methods produce larger reconstruction errors because they focus on fusing the measurements.

In this work, an improved fusion scheme for visible and infrared images based on compressive sensing is proposed. In the proposed approach, the DWT is first utilized to obtain the sparse coefficients: approximation coefficients and detail coefficients. Then, an improved entropy-weighted fusion rule is used to fuse
the low-frequency information, and the max-abs-based fusion rule is employed to integrate the high-frequency information. Together these remove the blocking-artifact problem and enhance the quality of the fused image. Finally, the fused image is reconstructed by the Compressive Sampling Matched Pursuit (CoSaMP) algorithm after non-adaptive linear projection with a random Gaussian matrix. In particular, the CS-based image fusion approach fuses fewer sparse coefficients than a transform-based fusion scheme while still reconstructing the fused image accurately, so the computational complexity is greatly reduced and the quality of the integrated image is enhanced at the same time. The key contributions of this work are as follows: (1) An improved CS-based image fusion scheme for infrared and visible images is proposed. Directly fusing the sparse coefficients before the non-adaptive linear projection can reduce the reconstruction error. Furthermore, the proposed approach is not limited to the fusion of IR and visible images; it can also be extended to other image fusion applications. (2) An improved fusion rule based on entropy and mutual information is proposed to preserve more detail information, such as edges, lines and contours. In this work, the proposed fusion rule and the maximum-selection fusion rule are utilized to fuse the low- and high-frequency information, respectively. These generate quite satisfying results with good removal of visual artifacts.
2. The proposed fusion framework

Reviewing image fusion in the CS domain, one intuitive way is to fuse the linear projections before reconstructing the integrated image. However, in a real scenario it is hard to obtain accurate measurements. To overcome this limitation, directly merging the sparse coefficients is a proper solution. With this method, prior information about the source images is not necessary, which simplifies the algorithmic implementation of the proposed fusion method. The framework of the image fusion approach based on compressive sensing is shown in Fig. 2.

In this approach, a multi-layer discrete wavelet transform is utilized to represent the visible and IR images [6,18]. The key merits of the DWT are its high compression ratios and good localization. The complementary coefficients describe the approximation and detail components of the input images. The decomposition of the coefficients can be denoted by

$$A_{l-1}(i,j) = \sum_{m,n\in R}\hat{h}(m)\hat{h}(n)\,A_{l}(2i-m,\,2j-n)$$
$$D^{1}_{l-1}(i,j) = \sum_{m,n\in R}\hat{h}(m)\hat{g}(n)\,A_{l}(2i-m,\,2j-n)$$
$$D^{2}_{l-1}(i,j) = \sum_{m,n\in R}\hat{g}(m)\hat{h}(n)\,A_{l}(2i-m,\,2j-n)$$
$$D^{3}_{l-1}(i,j) = \sum_{m,n\in R}\hat{g}(m)\hat{g}(n)\,A_{l}(2i-m,\,2j-n), \tag{1}$$
Fig. 2. The framework of the image fusion approach based on compressive sensing.
where $A_{l-1}$ denotes the low-frequency coefficients of the source images, and $D^{1}_{l-1}$, $D^{2}_{l-1}$, $D^{3}_{l-1}$ are the horizontal, vertical and diagonal high-frequency coefficients, respectively. $i$, $j$ are the pixel coordinates and $l$ is the index of spatial resolution. $m$, $n$ range over the integer set $R$ determined by the size of the source images. $\hat{h}(m)$, $\hat{g}(n)$ are quadrature mirror filters (QMFs) [28]. Corresponding to the decomposition, the reconstruction equation is given as

$$A_{l}(i,j) = 4\sum_{m,n\in R} h(m)h(n)\,A_{l-1}\!\left(\frac{i-m}{2},\frac{j-n}{2}\right) + 4\sum_{m,n\in R} h(m)g(n)\,D^{1}_{l-1}\!\left(\frac{i-m}{2},\frac{j-n}{2}\right) + 4\sum_{m,n\in R} g(m)h(n)\,D^{2}_{l-1}\!\left(\frac{i-m}{2},\frac{j-n}{2}\right) + 4\sum_{m,n\in R} g(m)g(n)\,D^{3}_{l-1}\!\left(\frac{i-m}{2},\frac{j-n}{2}\right), \tag{2}$$

where $h(n)$, $g(n)$ are QMFs with $h(n) = \hat{h}(-n)$, $g(n) = \hat{g}(-n)$ and $g(n) = (-1)^{(1-n)}h(1-n)$.

In order to obtain the significant coefficients of the visible and IR images, the multi-layer discrete wavelet transform is applied as follows. The initial $M \times N$ visible and infrared images are vectorized separately, and the coefficients in the DWT domain are then obtained by the following equation:
$$\hat{x}_A = [A_{AL},\ D^{1}_{AH},\ D^{2}_{AH},\ D^{3}_{AH}], \qquad \hat{x}_B = [A_{BL},\ D^{1}_{BH},\ D^{2}_{BH},\ D^{3}_{BH}], \tag{3}$$

where $\hat{x}_A$ and $\hat{x}_B$ are the frequency information of the visible and infrared images, respectively; $A_{*L}$ denotes the low-frequency coefficients of a source image, and $D^{1}_{*H}$, $D^{2}_{*H}$ and $D^{3}_{*H}$ denote its horizontal, vertical and diagonal high-frequency coefficients, respectively.

Designing a proper set of fusion rules is the key factor in an image fusion scheme. Inspired by DWT-based fusion methods, the proposed fusion rule based on mutual information and the max-abs fusion rule are applied to combine the low- and high-frequency coefficients, respectively. Mutual information [29] reflects the correlation of the source images, so it can appropriately assign a weight to each source image. On account of the correlation between the visible and infrared images, the presented fusion rule extends the "average" pixel-based rule to improve the quality of the fused image.

For the high-frequency information, the maximum selection fusion rule [30] is employed; its principle is to select the frequency coefficient of maximal absolute value:

$$D^{l}_{FH}(i,j) = \begin{cases} D^{l}_{AH}(i,j), & \text{if } |D^{l}_{AH}(i,j)| > |D^{l}_{BH}(i,j)|,\\ D^{l}_{BH}(i,j), & \text{if } |D^{l}_{AH}(i,j)| \le |D^{l}_{BH}(i,j)|, \end{cases} \tag{4}$$

where $l$ is the decomposition level of the DWT, $D_{FH}$ denotes the fused high-frequency coefficients, and $D_{AH}$ and $D_{BH}$ are the high-frequency coefficients of the visible and infrared images, respectively.

For the low-frequency information, the improved fusion rule takes the mutual information between the visible and infrared images into account. The entropy $H$ of an image and the mutual information $\Gamma$ of two images $R$ and $F$ can be denoted as

$$H = -\sum_{i=0}^{L-1} p(i)\log_2 p(i), \qquad \Gamma = \sum_{i=1}^{L}\sum_{j=1}^{L} h_{R,F}(i,j)\log_2 \frac{h_{R,F}(i,j)}{h_R(i)h_F(j)}, \tag{5}$$

where the parameter $p(i)$ is the distribution probability of gray level $i$, $h_{R,F}(i,j)$ is the normalized joint gray-level histogram of the two images $R$ and $F$, $R$ and $F$ denote the visible and infrared images, respectively, and $L$ is the number of gray levels.

The conventional weighted-average fusion rule is widely used in the CS domain, but it leaves considerable striped noise and excessive brightness in the fused image. In this work, an improved weighted fusion rule is presented to fuse the low-frequency information; it is more reasonable because it considers the inter-correlation of the visible and infrared images. The weights $w_1$, $w_2$ can be defined as

$$w_1 = \frac{H_1 - \Gamma}{H_1 + H_2 - 2\Gamma}, \qquad w_2 = \frac{H_2 - \Gamma}{H_1 + H_2 - 2\Gamma}, \tag{6}$$

where $H_1$, $H_2$ are the entropies of the visible and IR images, respectively, $H_{12}$ is the joint entropy of the source images (so that $\Gamma = H_1 + H_2 - H_{12}$), and $\Gamma$ is the mutual information. For the fusion of visible and infrared images, the sum of the weights should be 1: $w_1 + w_2 = 1$. For the low-frequency sub-bands, the fused coefficients $A_{FL}$ are then computed as

$$A^{l}_{FL}(i,j) = w_1 A^{l}_{AL}(i,j) + w_2 A^{l}_{BL}(i,j). \tag{7}$$
In the proposed fusion scheme, the low- and high-frequency coefficients are fused via different rules prior to projection. This reduces the reconstruction error compared with conventional schemes and also greatly reduces the time complexity. The proposed scheme is likewise convenient for reconstructing the fused image and raises the quality of the fused image.
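To make the fusion rules concrete, the following minimal sketch fuses two 8-bit grayscale source images with a single-level 2D DWT (PyWavelets' `db1` wavelet stands in for the QMF pair; the paper later selects three levels). The low band uses the entropy/mutual-information weights in the form of Eq. (6) as reconstructed above, and the detail bands use the max-abs rule of Eq. (4). This is an illustrative assumption, not the authors' reference implementation.

```python
import numpy as np
import pywt

def entropy_and_mi(a, b, bins=256):
    # H1, H2 and mutual information Gamma = H1 + H2 - H12, cf. Eq. (5)
    pa = np.histogram(a, bins=bins, range=(0, 256))[0].astype(float)
    pb = np.histogram(b, bins=bins, range=(0, 256))[0].astype(float)
    pj = np.histogram2d(a.ravel(), b.ravel(), bins=bins,
                        range=[[0, 256], [0, 256]])[0]
    pa, pb, pj = pa / pa.sum(), pb / pb.sum(), pj / pj.sum()
    H = lambda p: -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return H(pa), H(pb), H(pa) + H(pb) - H(pj)

def fuse(vis, ir, wavelet="db1", level=1):
    ca = pywt.wavedec2(vis.astype(float), wavelet, level=level)
    cb = pywt.wavedec2(ir.astype(float), wavelet, level=level)
    H1, H2, g = entropy_and_mi(vis, ir)
    # Eq. (6); assumes the sources are not identical, so w1 + w2 = 1
    w1 = (H1 - g) / (H1 + H2 - 2 * g)
    fused = [w1 * ca[0] + (1 - w1) * cb[0]]          # low band, Eq. (7)
    for da, db in zip(ca[1:], cb[1:]):               # detail bands, Eq. (4)
        fused.append(tuple(np.where(np.abs(x) > np.abs(y), x, y)
                           for x, y in zip(da, db)))
    return np.clip(pywt.waverec2(fused, wavelet), 0, 255)
```

A call such as `fuse(vis, ir, level=3)` would mirror the three-level setting selected in Section 3.1.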
3. The implementation of the proposed fusion scheme

In this section, the CS-based multi-scale discrete wavelet transform is explored and verified; it provides the sparse coefficients of the visible and IR images. The random Gaussian matrix [31] is used to sense the sparse coefficients, capitalizing on the
theory of the measurement matrix. The CoSaMP algorithm [32] is exploited to reconstruct the fused image; it achieves good fidelity and resolution at a reasonable speed.

3.1. Image sparse representation method

Sparse representation obtains a compact, high-fidelity representation of the observed image and also extracts its significant information. Choosing a proper basis to represent the visible and infrared images is crucial to successfully exploiting sparse representation in image processing. Images naturally have sparse representations with respect to fixed bases (e.g., the discrete cosine transform or the discrete wavelet transform), or concatenations of them. Discrete wavelet transform-based coding affords substantial improvements in image quality at higher compression ratios; the basic reason lies in its overlapping basis functions and better energy compaction. By utilizing spatial correlation and selecting the wavelet basis adaptively according to the image characteristics, the compression ratio can be improved substantially. Many scholars have paid close attention to combining the discrete wavelet transform with image fusion [20,21,26,27]. The principle of the discrete wavelet transform is to use a group of wavelet bases to represent a function or an image; the decomposition transforms the pixel values of an image along every row and every column in turn. Examples of the multi-layer discrete wavelet transform at different levels are shown in Fig. 3.
The source image is shown in Fig. 3(a), and the results of image decomposition by the DWT at different levels are shown in Fig. 3(b)–(d). Fig. 3(c) contains more detail information and less approximation information than Fig. 3(b), while Fig. 3(d) contains more detail information and less approximation information than Fig. 3(c). Clearly, as the decomposition level increases, the representation contains more high-frequency information and less low-frequency information; a larger scaling factor also implies a smaller location (time) resolution. The selection of a particular scale is therefore a key step in representing the source infrared and visible images, and the choice depends on the desired trade-off between time resolution and sparse representation. To illustrate the selection of the discrete wavelet transform level, several experiments are sketched here. Firstly, the reconstructed low-light-television images based on the multi-scale discrete wavelet transform are demonstrated in Fig. 4. The source image is shown in Fig. 4(a). The reconstructed image in Fig. 4(b) is acquired by the conventional compressive sensing method (CS for short) [21,22]; in the majority of published examples, the conventional CS method only attends to the approximation coefficients. The image in Fig. 4(b) is obtained by the DWT at the first level, while results for the DWT at other levels are given in Table 1. Compared with the source image, the edge of the reconstruction
Fig. 3. The example of DWT. (a) The original image, (b) result of DWT at the first level, (c) result of DWT at the second level, and (d) result of DWT at the third level.
Fig. 4. The source image and reconstruction experimental results for the visible image. (a) The original image, (b) result of the conventional CS method, (c) result of the CoSaMP algorithm based on DWT at the first level, (d) result of the CoSaMP algorithm based on DWT at the second level, (e) result of the CoSaMP algorithm based on DWT at the third level, (f) result of the CoSaMP algorithm based on DWT at the fourth level, (g) result of the CoSaMP algorithm based on DWT at the fifth level, and (h) result of the CoSaMP algorithm based on DWT at the sixth level.
image is blurry in Fig. 4(b). The experimental results of the reconstructed images based on the DWT at different levels, acquired by the proposed compressive sensing-based method, are demonstrated in Fig. 4(c)–(h). Better reconstructed images are obtained especially in Fig. 4(d) and (e); compared with these, there is some loss of information in Fig. 4(c) and (f)–(h). In this sense, partial components are lost during reconstruction because larger decomposition scales contain more high-frequency and less low-frequency information. In order to provide an objective assessment, the Peak Signal-to-Noise Ratio (PSNR) [33] of the visible image is displayed in Table 1. The PSNR values intuitively demonstrate that the best DWT level in the compressive sensing domain is the third level: the detail information is well approximated, as is the approximation information. The results acquired by the proposed approach are better than those of the conventional compressive sensing method. They also indicate that the significant frequency information of the low-light-television image can be obtained by the DWT at the third level. Corresponding experiments were applied to all the source visible images, and the detail and approximation information, such as edges, lines and contours, is likewise well approximated at the third level. Analogously, the simulation for the forward-looking-infrared image is shown in Fig. 5. In comparison to the source infrared image in Fig. 5(a), there is practically no loss of information in Fig. 5(c)–(h). The limitations of the conventional CS approach, which uses the DWT at the first level and only attends to the approximation coefficients, can be seen in Fig. 5(b). In order to assess the performance visually, the results
of the DWT at different levels are elaborated in Table 2. For the forward-looking-infrared image, the reconstructed images demonstrate that the proposed method achieves better visual quality with many skeletal features preserved. The evaluation criteria for each experiment, namely the PSNR values for the infrared image, are depicted in Table 2. As shown there, a better objective evaluation is obtained by the proposed approach than by the conventional CS method, and the results on the infrared images are consistent with those on the visible images. Corresponding experiments were again performed on all the source infrared images. In conclusion, several comparative experiments on the source infrared and visible images were conducted to explore the optimal balance, and the results illustrate that the optimal representation is achieved by the DWT at the third level. Therefore, three levels of DWT are utilized to obtain the significant approximation and detail coefficients in the proposed algorithm.
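As a rough, self-contained analogue of this level-selection experiment, one can sweep the decomposition level and compare PSNR. In the sketch below, PyWavelets' `db1` wavelet, a synthetic test image and simple coefficient thresholding stand in for the paper's measurement-and-CoSaMP pipeline; all of these are assumptions made for illustration.

```python
import numpy as np
import pywt

def psnr(ref, test):
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse)

def sparsify_and_reconstruct(img, level, keep=0.1, wavelet="db1"):
    # Keep only the largest `keep` fraction of DWT coefficients, then
    # invert the transform -- a crude stand-in for sparse CS recovery.
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=level)
    arr, slices = pywt.coeffs_to_array(coeffs)
    thresh = np.quantile(np.abs(arr), 1.0 - keep)
    arr = np.where(np.abs(arr) >= thresh, arr, 0.0)
    coeffs = pywt.array_to_coeffs(arr, slices, output_format="wavedec2")
    return pywt.waverec2(coeffs, wavelet)

# Synthetic smooth 8-bit test image (a real LLTV frame would be used instead)
x = np.linspace(0, 1, 256)
img = (255 * np.outer(np.sin(4 * np.pi * x) ** 2,
                      np.cos(2 * np.pi * x) ** 2)).astype(np.uint8)

for level in range(1, 7):
    rec = sparsify_and_reconstruct(img, level)
    print(level, psnr(img, rec[:img.shape[0], :img.shape[1]]))
```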
Table 1
The PSNR (dB) of reconstruction results based on the multiscale DWT for the visible image, for DWT levels 1–6.

Method      1         2         3         4         5         6
Proposed    26.9090   30.1025   37.2285   20.1873   20.8890   17.9667
CS          24.4436   23.7853   21.7262   18.7595   16.3627   14.9932

Table 2
The PSNR (dB) of reconstruction results based on the multiscale DWT for the infrared image, for DWT levels 1–6.

Method      1         2         3         4         5         6
Proposed    37.0564   37.8311   38.8119   35.0500   32.4940   26.1237
CS          32.8390   26.9879   22.3465   18.6354   16.0208   14.3803

3.2. The selection of measurement matrix

The theory of compressive sensing captures and represents sparse images at a rate significantly below the Nyquist rate. The approach leverages non-adaptive linear projections that preserve the structure of the image; the integrated image can then be reconstructed from these projections by solving an optimization problem [22,23]. The measurement matrix plays an important part in the acquisition of the measurement vector and in image reconstruction. For an original K-sparse signal x, the measurement matrix should satisfy the condition generally called the restricted isometry property (RIP) [34].
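The RIP requirement can be illustrated empirically: a normalized random Gaussian matrix approximately preserves the norm of sparse vectors with high probability. The following sketch (dimensions chosen arbitrarily for illustration) checks this:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 128, 512, 10
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # normalized Gaussian matrix

ratios = []
for _ in range(1000):
    x = np.zeros(N)
    x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
    ratios.append(np.linalg.norm(Phi @ x) / np.linalg.norm(x))

print(min(ratios), max(ratios))   # both close to 1 for K-sparse inputs
```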
Fig. 5. The source image and reconstruction experimental results for infrared image. (a) The original image, (b) result of conventional CS method, (c) result of CoSaMP algorithm based on the first level of DWT, (d) result of CoSaMP algorithm based on DWT at the second level, (e) result of CoSaMP algorithm based on DWT at the third level, (f) result of CoSaMP algorithm based on DWT at the fourth level, (g) result of CoSaMP algorithm based on DWT at the fifth level, and (h) result of CoSaMP algorithm based on DWT at the sixth level.
An equivalent condition to the RIP, provided by Baraniuk [24], is that the measurement matrix Φ and the sparse basis Ψ are incoherent. The random Gaussian measurement matrix [20,24,35] is the most common encoding measure employed in the compressive sensing domain, for two reasons. In the acquisition and processing of images, it is important to bound the average error over an ensemble of images; and in real applications, the linear projection procedure not only captures more information but also enables more effective image processing, leading to excellent results in inverse problems. Moreover, the random Gaussian matrix has been proved to be incoherent with most sparse orthogonal bases. Therefore, in this work the random Gaussian matrix is selected as the encoding measurement, based on the advantages mentioned above.

3.3. The reconstruction of fused image

The major challenge in compressive sensing is to approximate a sparse image from a set of noisy samples. The literature describes various approaches, including greedy pursuits, convex relaxation and combinatorial algorithms, which usually ignore the fact that most images contain scant information or noise [20–24]. At the present stage, each class of schemes has its essential limitations. The combinatorial methods are highly efficient but demand somewhat unusual samples. At the other extreme, convex relaxation produces better results but is computationally burdensome. Greedy pursuits are intermediate in speed and sampling efficiency. The CoSaMP algorithm [36], essentially a greedy pursuit method, affords the same uniform guarantees as the best optimization-based schemes. It also borrows ideas from the combinatorial methods to improve the running time and to offer a rigorous error bound. Meanwhile, it is likely to be efficient in real applications because it requires only matrix–vector multiplies with the sampling matrix; the running time is merely O(N log² N) [37]. The steps of the CoSaMP algorithm are shown in Algorithm 1.

Algorithm 1. Framework of the CoSaMP algorithm.

Input: the CS observation y, the sampling matrix Φ and the sparsity level K.
Output: a K-sparse approximation x of the target.
Initialization: x_0 = 0 (x_J is the estimate of x at the J-th iteration) and r = y (the current residual).
Iterate until convergence:
1. Compute the current error (note that for a Gaussian Φ, Φ*Φ is approximately diagonal):
   e = Φ* r.   (8)
2. Compute the best 2K support set Ω of the error (an index set):
   Ω = supp(e_{2K}).   (9)
3. Merge the strongest support sets and perform a least-squares signal estimation:
   T = Ω ∪ supp(x_{J−1}),  b|_T = Φ⊛|_T y,  b|_{T^c} = 0.   (10)
4. Prune x_J and compute the residual for the next round:
   x_J = b_K,  r = y − Φ x_J.   (11)

In Algorithm 1, Φ* denotes the Hermitian transpose of Φ, and Φ⊛ the pseudo-inverse of Φ, Φ⊛ = (Φ*Φ)^{−1}Φ*. |T| denotes the cardinality of the set T, i.e., the number of elements it contains, and T^c the complement of the set T. x|_T denotes the vector x restricted to the elements indexed by T, and Φ|_{T^c} the matrix Φ restricted to the columns contained in T^c. supp(x) denotes the best support set of the vector x. For a compressible image, the CoSaMP algorithm is a pragmatic reconstruction approach that takes many factors into account, including robustness, efficient resource usage and optimal error guarantees.
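For reference, a compact NumPy sketch of Algorithm 1 follows. A plain least-squares solve replaces the explicit pseudo-inverse, and the dimensions and convergence test are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cosamp(Phi, y, K, max_iter=50, tol=1e-6):
    """Recover a K-sparse x from y = Phi @ x (Algorithm 1)."""
    M, N = Phi.shape
    x = np.zeros(N)
    r = y.copy()                                   # current residual
    for _ in range(max_iter):
        e = Phi.T @ r                              # step 1: current error, Eq. (8)
        omega = np.argsort(np.abs(e))[-2 * K:]     # step 2: best 2K support, Eq. (9)
        T = np.union1d(omega, np.flatnonzero(x))   # step 3: merge supports
        b = np.zeros(N)
        b[T] = np.linalg.lstsq(Phi[:, T], y, rcond=None)[0]  # least squares, Eq. (10)
        x = np.zeros(N)
        keep = np.argsort(np.abs(b))[-K:]          # step 4: prune to K terms, Eq. (11)
        x[keep] = b[keep]
        r = y - Phi @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(y):
            break
    return x

# Toy demonstration with a random Gaussian sampling matrix
rng = np.random.default_rng(0)
M, N, K = 128, 512, 10
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x_true = np.zeros(N)
x_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
y = Phi @ x_true
print(np.linalg.norm(cosamp(Phi, y, K) - x_true))  # small recovery error
```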
4. Experiments and analysis

In this section, three sets of fusion experiments are conducted to show the validity of the proposed approach. The results show that the proposed image fusion scheme achieves better fusion quality than other fusion schemes, including the max-abs, Principal Component Analysis (PCA), average, Laplacian Pyramid (LP), DWT, IDWT (the fused image is acquired by the inverse DWT without applying the CoSaMP reconstruction algorithm) and CS-based approaches (the fused image is acquired by integrating the linear projections rather than the sparse coefficients) [21,25,27,38,39].

Subjective and objective quality evaluation are the basic means of analyzing the quality of a fused image. A few quality measures are commonly used to evaluate image fusion results, such as Average Gradient (AG), Mutual Information (MI), Image Entropy (IE), Structural Similarity (SSIM) and the edge-retention measure Q^{AB/F} [40–46]. AG reflects the contrast of small details and texture change, as well as the definition of the fused image. MI measures the average information transferred to the fused image; larger values of AG and MI indicate better fused image quality. The index Q^{AB/F} emphasizes the structural similarity among the infrared image, the visible image and the fused image, and SSIM measures the distinction between the integrated image and the reference image; values of Q^{AB/F} and SSIM approaching 1 indicate a better fused image.

All the experiments are implemented in Matlab 7.0 on a Pentium(R) 2.5 GHz PC with 2.00 GB RAM. The helicopter navigation images were provided by Dr. Oliver Rockinger. The natural scenario images were kindly provided by Mr. John Lewis and Dr. Stavri Nikolov, University of Bristol. The remote sensing images are from the IEEE GRSS Data Fusion Committee Web Portal. All these images can be found at http://www.imagefusion.org.
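The reference-free measures used below are straightforward to compute; the following sketch gives one common formulation of IE, AG and MI (bin counts and gradient definitions vary across the literature, so treat these as illustrative assumptions):

```python
import numpy as np

def image_entropy(img, bins=256):
    # IE: entropy of the normalized gray-level histogram
    p = np.histogram(img, bins=bins, range=(0, 256))[0].astype(float)
    p = p[p > 0] / p.sum()
    return -np.sum(p * np.log2(p))

def average_gradient(img):
    # AG: mean magnitude of local horizontal/vertical differences
    f = img.astype(float)
    gx, gy = np.diff(f, axis=1)[:-1, :], np.diff(f, axis=0)[:, :-1]
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0))

def mutual_information(a, b, bins=256):
    # MI between a source image and the fused image, from the joint histogram
    pj = np.histogram2d(a.ravel(), b.ravel(), bins=bins,
                        range=[[0, 256], [0, 256]])[0]
    pj /= pj.sum()
    px, py = pj.sum(axis=1, keepdims=True), pj.sum(axis=0, keepdims=True)
    nz = pj > 0
    return np.sum(pj[nz] * np.log2(pj[nz] / (px @ py)[nz]))
```

For a fused image F obtained from sources A and B, the tabulated MI is typically reported as MI(A, F) + MI(B, F).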
4.1. Fusion of helicopter navigation image

In this section, the reconstruction algorithm is first run under five measurement conditions to validate the properties of compressive sensing theory and of the sensing measurement. When the measurement is given as y = N, only two components are involved: sparsification and reconstruction. When given as N(IDWT), the fused image is directly acquired via the inverse DWT without applying the CoSaMP reconstruction algorithm. The PSNR is used to evaluate the merged image quality; the detailed description is given in Table 3, where the reconstruction run-time is obtained by averaging 20 recovery runs. The distinction between sensing measurement and non-sensing measurement is significant. As shown in Table 3, high image quality and a high compression ratio are achieved for the measurement y = 0.7N as well as for no measurement (y = N). In particular, for N(IDWT) the results show no advantage in image quality or time complexity. Therefore, the measurement y = 0.7N is used to implement the proposed approach.
Helicopter navigation images comprise forward-looking-infrared (FLIR) and low-light-television (LLTV) images, which contain much edge and direction information as well as background information. The display of FLIR images, shown in Fig. 6(b), differs from that of night-vision devices, which operate in the visible-light and near-infrared ranges. The LLTV image in Fig. 6(a) allows viewing of objects at extremely low light levels that would not be seen by the naked eye. A possible improvement under the current limitations of helicopter navigation is to integrate both imaging sources into a single fused image, providing the relevant image information of both imaging devices under poor visibility conditions. A group of simulation experiments with the proposed approach and the conventional methods is displayed in Fig. 6. Subjectively, the fused image acquired by the proposed scheme in Fig. 6(j) is the best for human vision because of its better spatial information. Some details and edge information are lost by the other methods, i.e., the average-based, DWT-based, LP-based, max-based, PCA-based, CS-based and IDWT-based methods, as shown in Fig. 6(c)–(i). In order to provide an objective assessment, detailed quantitative results are given in Table 4. The results demonstrate that the improved fusion method provides a remarkably effective fusion, although its image entropy value is not the largest in comparison to the max-based fusion method. The objective evaluation criteria coincide with the judgment of human visual perception. Together these support the conclusion that the proposed approach achieves better quality than the other conventional methods.
Table 3
The running time (seconds), compression size (bits) and PSNR under different measurements.

Measurement    N(CoSaMP)    0.7N         0.5N         0.3N         N(IDWT)
Time (s)       0.678330     0.457751     0.371382     0.118947     0.669911
Size (bits)    512×512×8    512×358×8    512×256×8    512×154×8    512×512×8
PSNR (dB)      36.0305      35.1187      27.7410      22.6466      30.8384
Table 4
The objective evaluation of various methods for the helicopter navigation image.

Fusion methods   MI       AG       IE       Q^{AB/F}   SSIM
Average          5.0991   5.9152   5.9152   0.4186     0.7435
DWT              5.1006   6.8895   6.1887   0.6167     0.7586
LP               5.5401   7.1550   6.1562   0.7296     0.7997
Max              6.9043   7.1299   6.7539   0.8034     0.7498
PCA              6.2630   5.4165   6.5814   0.6431     0.6950
CS               5.7594   6.9965   6.6306   0.6806     0.7725
IDWT             6.5365   6.2801   6.7024   0.6308     0.7218
Proposed         6.9642   8.1053   6.6726   0.8251     0.8742

4.2. Fusion of natural scenario image

The most important property of CS theory is that the construction of the measurement matrix determines the acquisition of the measurement vector and the reconstruction error. In this subsection, experiments are implemented to validate the performance of CS theory and the merits of the sensing measurement. In the experiments, the running time of the reconstruction processing is obtained by averaging 20 recovery runs, and PSNR is used to describe the fused image quality; a high compression ratio also reduces the storage space and time complexity. The experimental results are shown in Table 5. Although the compression ratio is high when the measurement is given as y = 0.7N, the quality of the reconstructed image is comparable to that obtained without linear projection (y = N). The results using the inverse DWT without the CoSaMP reconstruction algorithm (IDWT for short) are also shown in Table 5, which further illustrates the high compression ratio gained by exploiting the measurement matrix. Owing to these merits, the measurement y = 0.7N is used to implement the proposed method.

Natural scenario images are characterized by highly structured statistical properties. Observing the visible and IR images in Fig. 7(a) and (b), information is apparently lost to human
Fig. 6. The fusion results of different fusion methods for helicopter navigation images. (a) and (b) The LLTV and FLIR images, (c) the fused image using average-based method, (d) the fused image using DWT-based method, (e) the fused image using LP-based method, (f) the fused image using max-based method, (g) the fused image using PCA-based method, (h) the fused image using CS-based method, (i) the fused image using IDWT-based method without CoSaMP, and (j) the fused image using the proposed method.
Table 5
The running time (seconds), compression size (bits) and PSNR under different measurements.

Measurement    N(CoSaMP)    0.7N         0.5N         0.3N         N(IDWT)
Time (s)       0.702894     0.315974     0.224774     0.123576     0.803777
Size (bits)    632×496×8    632×348×8    632×248×8    632×148×8    632×496×8
PSNR (dB)      35.6051      34.5645      25.4716      19.9076      30.1758
Fig. 7. The fusion results of different fusion methods for natural scenario images. (a) and (b) The visible and infrared images, (c) the fused image using average-based method, (d) the fused image using DWT-based method, (e) the fused image using LP-based method, (f) the fused image using max-based method, (g) the fused image using PCA-based method, (h) the fused image using CS-based method, (i) the fused image using IDWT-based method without CoSaMP, and (j) the fused image using the proposed method.
vision, so a better-quality image characterizing the natural environment is desirable. The experimental results are demonstrated in Fig. 7. The visible and infrared images are shown in Fig. 7(a) and (b), and the fused images obtained by the average-based, DWT-based, Laplacian Pyramid-based, max-based, PCA-based, CS-based and IDWT-based methods and the proposed approach are shown in Fig. 7(c)–(j), respectively. Subjectively, the experimental result of the proposed image fusion scheme is superior to the other conventional methods: in comparison to the source images, some information is lost in Fig. 7(c)–(i), while the fused image acquired by the proposed fusion approach in Fig. 7(j) contains more suitable information for human vision. The defined criteria are listed in Table 6. The objective evaluation values clearly indicate that the proposed image fusion scheme attains a better fusion quality. Although the image entropy of the fused image acquired by the proposed approach is lower than that of the general DWT method, the difference is very small. Both the subjective visual effect and the objective evaluation demonstrate that the proposed scheme accomplishes a better fusion quality.

4.3. Fusion of remote sensing image

Remote sensing images afford a repetitive and consistent view of the earth. To satisfy the requirements of different remote sensing applications, the fusion of remotely sensed images spans a wide range of spatial, spectral, radiometric and temporal resolutions; a high-fidelity image carries more detailed geometric features and plenty of spectral information. In this experiment, firstly, some experiments are sketched to illustrate the importance of the measurement matrix and compressive sensing theory. The reconstruction time is obtained by
Table 6
The objective evaluation of various methods for the natural scene image.

Fusion methods   MI       AG       IE       Q^{AB/F}   SSIM
Average          6.3754   3.0066   5.9143   0.3716     0.8860
DWT              6.6866   3.3226   6.9905   0.5708     0.6264
LP               7.3440   3.0750   6.7585   0.6785     0.7704
Max              6.9523   3.5621   6.4766   0.6268     0.9230
PCA              6.2128   3.1683   6.0316   0.4989     0.9205
CS               6.5355   3.6514   6.0608   0.6251     0.9206
IDWT             6.322    3.3997   6.2637   0.6083     0.9417
Proposed         7.5407   3.9595   6.7621   0.6861     0.9689
Table 7
The running time (seconds), compression size (bits) and PSNR under different measurements.

Measurement    N(CoSaMP)    0.7N         0.5N         0.3N         N(IDWT)
Time (s)       0.717984     0.326660     0.178622     0.141252     0.865756
Size (bits)    512×512×8    512×358×8    512×256×8    512×154×8    512×512×8
PSNR (dB)      42.7171      41.4644      36.8956      26.1848      33.4510
averaging 20 recovery runs at the different measurements. The efficiency of the sensing measurement processing is estimated by PSNR; utilizing the non-adaptive linear projection also has the advantages of a low sampling ratio and low time complexity. The results are shown in Table 7, which clearly indicates that employing the measurement matrix enhances the speed of the reconstruction procedure. As in the previous sections, the PSNR values demonstrate that a high-quality reconstructed image is achieved when the measurement is given as y = 0.7N or y = N, whereas a high compression ratio is the general property of applying the measurement matrix in
Fig. 8. The fusion results of different fusion methods for remote sensing images. (a) and (b) The remote sensing images, (c) the fused image using average-based method, (d) the fused image using DWT-based method, (e) the fused image using LP-based method, (f) the fused image using max-based method, (g) the fused image using PCA-based method, (h) the fused image using CS-based method, (i) the fused image using IDWT-based method without CoSaMP, and (j) the fused image using the proposed method.
Table 8
The objective evaluation of various methods for the remote sensing image.

Fusion methods   MI       AG       IE       Q^{AB/F}   SSIM
Average          6.3573   3.6854   6.2558   0.4444     0.8770
DWT              5.9581   4.2928   6.4586   0.4823     0.9069
LP               6.0151   4.4511   6.8054   0.5482     0.9730
Max              6.1635   4.9859   6.8830   0.6578     0.8288
PCA              6.5667   4.3910   6.9030   0.6294     0.8769
CS               5.4910   4.7682   6.3360   0.5910     0.9201
IDWT             6.3595   4.9427   6.6364   0.5915     0.9067
Proposed         6.7665   4.9861   6.9231   0.6628     0.9865
Table 7. The merits of applying the sensing measurement, in comparison to the IDWT condition, are also clearly demonstrated in Table 7. Owing to these merits, the measurement y = 0.7N is used to implement the proposed algorithm.

The proposed method and the other algorithms are used to fuse the remote sensing images, as shown in Fig. 8. The source remote sensing images are displayed in Fig. 8(a) and (b), and the fused images acquired by the various methods in Fig. 8(c)–(j). In comparison to the result acquired by the proposed scheme, the results in Fig. 8(c) and (d) are of little practical significance because too much information is lost. Although the other methods obtain reasonable results in Fig. 8(e)–(i), the fused image produced by the presented approach in Fig. 8(j) is the most appropriate for human visual perception. In Table 8, the proposed fusion approach is superior to the other methods in terms of all the evaluation criteria. All approaches enhance the quality of the fused image to some extent, but only the proposed method provides consistent enhancement, since it takes all the frequency information into account and fuses the coefficients directly. The evaluation criteria sufficiently indicate that the proposed scheme outperforms the existing state-of-the-art methods.
5. Conclusion

In this paper, an improved image fusion scheme for visible and infrared images based on compressive sensing is proposed. The sparse transform, the DWT, is utilized to decompose the source images into two components: low-frequency coefficients and high-frequency coefficients. The max-abs-based fusion rule is used to fuse the high-frequency coefficients, while a mutual-information-based fusion rule is proposed to fuse the low-frequency coefficients. The fused image is accurately reconstructed by the Compressive Sampling Matched Pursuit algorithm after the non-adaptive linear projection by a Gaussian matrix. Experimental results demonstrate that the proposed fusion approach achieves better fused image quality than the existing state-of-the-art methods, with low time complexity and low storage space.

Although the proposed fusion scheme obtains satisfactory results, much work remains for follow-up study. In practice, the DWT is not applicable to all types of images, and how to design an appropriate sparse transform is a fundamental problem. Furthermore, the Gaussian measurement matrix may generate uncertain results due to its randomness, so how to design an optimal deterministic measurement matrix is a key issue in putting the proposed approach into practice. We will consider these problems in our future work.
Acknowledgments

We would like to acknowledge the support of the National Natural Science Foundation of China (61374135, 61203321, 61472053, 61173129), the China Postdoctoral Science Foundation (2012M521676), the Postdoctoral Scientific Research Project of Chongqing special funding (XM201307), the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20120191110026) and the China Central Universities Foundation (106112013 CDJZR170005).

References

[1] W. Hizem, L. Allano, A. Mellakh, B. Dorizzi, IET Signal Process. 3 (4) (2009) 282.
[2] S.R. Schnelle, A.L. Chan, Enhanced target tracking through infrared-visible image fusion, in: Proceedings of the 14th International Conference on Information Fusion, Chicago, IL, USA, 2011, pp. 1–8.
[3] S. Jamal, F. Karim, Appl. Soft Comput. 12 (3) (2012) 1041.
[4] S.K. Senthi, K.B. Mahesh, S. Muttan, Comput. Vis. Image Anal. 10 (1) (2011) 1.
[5] H. Li, B.S. Manjunath, S.K. Mitra, J. Gr. Models Image Process. 57 (3) (1995) 235.
[6] R. Paresh, G. Sapna, V. Pankaj, Int. J. Comput. Technol. Electron. Eng. 1 (1) (2011) 1.
[7] Z.M. Zhou, L. Jiang, J. Wang, P. Zhang, Image fusion by combining SWT and variational model, in: 2011 4th International Congress on Image and Signal Processing (CISP), Shanghai, China, 2011, pp. 1907–1910.
[8] M. Beaulieu, S. Foucher, L. Gagnon, Multi-spectral image resolution refinement using stationary wavelet transform, in: Proceedings of the International Geoscience and Remote Sensing Symposium, 2003, pp. 4032–4034.
[9] J.J. Lewis, R.J. Ocallaghan, S.G. Nikolov, D.R. Bull, C.N. Canagarajah, Inf. Fusion 8 (2) (2007) 119.
[10] R.P.S. Chauhan, R. Dwivedi, S. Negi, Int. J. Appl. Inf. Syst. 4 (2) (2012) 40.
[11] S.G. Mamatha, L. Gayatri, Glob. J. Adv. Eng. Technol. 1 (2) (2012) 69.
[12] K. Parmar, R. Kher, A comparative analysis of multimodality medical image fusion methods, in: Sixth Asia Modelling Symposium (AMS), Bali, Indonesia, 2012, pp. 93–97.
[13] A.L. Cunha, J.P. Zhou, M.N. Do, IEEE Trans. Image Process. 15 (10) (2006) 3089.
[14] H.A. Melkamu, S.A. Vijanth, I. Lila, F.M.H. Ahmad, Int. J. Electr. Eng. Inf. 2 (1) (2010) 30.
[15] N. Indhumadhi, G. Padmavathi, Int. J. Soft Comput. Eng. 5 (1) (2011) 298.
[16] P.P. Mirajka, R.D. Sachin, Int. J. Adv. Eng. Res. Stud. 2 (4) (2013) 99.
[17] P. Hill, N. Canagarajah, D. Bull, Image fusion using complex wavelets, in: Proceedings of the British Machine Vision Conference, Cardiff, UK, 2002, pp. 487–496.
[18] S. Tania, Image Fusion: Algorithms and Applications, Elsevier, USA, 2008.
[19] S.T. Li, X.D. Kang, J.W. Hu, B. Yang, Inf. Fusion 14 (2) (2013) 147.
[20] J. Romberg, IEEE Signal Process. Mag. 25 (2) (2008) 14.
[21] X. Li, S.Y. Qin, IET Image Process. 5 (2) (2011) 141.
[22] D.L. Donoho, IEEE Trans. Inf. Theory 52 (4) (2006) 1289.
[23] E. Candes, Compressive sampling, in: Proceedings of the International Congress of Mathematicians, Zurich, Switzerland, European Mathematical Society Publishing House, 2006, pp. 1433–1452.
[24] R.G. Baraniuk, IEEE Signal Process. Mag. 24 (4) (2007) 118.
[25] D. Looney, D.P. Mandic, IEEE Trans. Signal Process. 4 (57) (2009) 1626.
[26] Y.S. Chen, G. Xiao, Fusion of infrared and visible images based on distributed compressive sensing, in: International Conference on Information Science and Engineering, Yangzhou, China, 2011.
[27] B. Yang, S.T. Li, Inf. Fusion 1 (13) (2012) 10.
[28] J.C. Goswami, A.K. Chan, Fundamentals of Wavelets: Theory, Algorithms, and Applications, Electrical and Electronics Engineering, 2010.
[29] J.B. Kinney, G.S. Atwal, Equitability, mutual information, and the maximal information coefficient, arXiv:1301.7745, 2013.
[30] S. Milan, H. Vaclav, B. Roger, Image Processing, Analysis, and Machine Vision, Cengage Learning, USA, 2007.
[31] E.J. Candes, J. Romberg, Found. Comput. Math. 6 (2) (2006) 227.
[32] D.L. Donoho, Y. Tsaig, I. Drori, J.L. Starck, IEEE Trans. Inf. Theory (2006).
[33] D. Salomon, Data Compression: The Complete Reference, Springer-Verlag, London, 2007.
[34] E.J. Candes, C.R. Math. 346 (9–10) (2008) 589.
[35] E.J. Candes, J. Romberg, T. Tao, IEEE Trans. Inf. Theory 2 (52) (2006) 489.
[36] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, IEEE Trans. Image Process. 13 (4) (2004) 600.
[37] M. Whittle, V.J. Gillet, P. Willett, J. Loeselr, A. Alexande, J. Chem. Inf. Comput. Sci. 44 (2004) 1840.
[38] M.T. Sadeghi, M. Samiei, J. Kittler, EURASIP J. Adv. Signal Process. 23 (2010) 1.
[39] D.A. Yocky, J. Opt. Soc. Am. Part A 12 (9) (1995) 1834.
[40] C.F. Li, Y.W. Ju, A.C. Bovik, X.J. Wu, Q.B. Sang, Opt. Eng. 52 (5) (2013) 1.
[41] Z.D. Liu, H.P. Yin, B. Fang, Y. Chai, Expert Syst. Appl. (2014).
[42] H.F. Li, Y. Chai, H.P. Yin, G.Q. Liu, Opt. Commun. 285 (2) (2012) 91.
[43] L.Q. Guo, M. Dai, M. Zhu, Opt. Express 20 (17) (2012) 18846.
[44] X.Z. Bai, Appl. Opt. 51 (31) (2012) 7566.
[45] L. Chen, J.B. Li, C.L. Philip Chen, Opt. Express 21 (4) (2013) 5182.
[46] Y. Chai, H.F. Li, G.F. Li, Opt. Commun. 284 (19) (2011) 4376.