Digital Signal Processing 23 (2013) 542–554
Morphological image fusion using the extracted image regions and details based on multi-scale top-hat transform and toggle contrast operator

Xiangzhi Bai a,b,*

a Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
b State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China
Article history: Available online 13 November 2012

Keywords: Top-hat transform; Toggle contrast operator; Multi-scale; Image fusion; Mathematical morphology
Abstract

Retaining the useful information of the original images in the fusion image is essential in image fusion. To this end, an image fusion algorithm based on the multi-scale top-hat transform and the toggle contrast operator, using the extracted image regions and details, is proposed in this paper. The top-hat transform extracts image regions, and operations constructed from the toggle contrast operator extract image details. Moreover, the multi-scale top-hat transform and toggle contrast operator extract the effective image regions and details at multiple scales of the original images. The extracted image regions and details are then imported into the final fusion image to form an effective fusion result. The proposed multi-scale top-hat transform and toggle contrast operator based algorithm is thus an effective image fusion algorithm that keeps more useful image information. The combination of the top-hat transform and the toggle contrast operator for effective image fusion is the main contribution of this paper, extending previous work that used only the toggle contrast operator for edge-preserved image fusion. Experimental results on multi-modal and multi-focus images show that the proposed algorithm performs very well for image fusion.

© 2012 Elsevier Inc. All rights reserved.
1. Introduction

Image fusion is a useful technique for combining the image information of different images: it produces one fusion image containing the useful image information of several input images, such as multi-modal images or multi-focus images. The fusion image is then very useful for image analysis and pattern recognition. The key to image fusion is therefore extracting the useful information of the different images and combining it reasonably into the final fusion image.

To achieve good performance, several mathematical tools have been used for image fusion, and image fusion algorithms have been proposed based on them [1–14]. The direct average algorithm forms the final fusion image by pixel-wise averaging of the original images, which may smooth useful image regions and details. K–L transform or PCA based algorithms [1–3] extract the important image features of the original images and treat them as the final fusion image; however, some other useful image features may be smoothed. Wavelet and curvelet transform based algorithms [4–10] decompose the original images into multi-scale
images containing different image features. The useful image information in these multi-scale images is then extracted and used to reconstruct the final fusion image, but some useful image information may be lost, which results in an ineffective fusion image. Segmenting the useful image regions and then combining them in the final fusion image is also a good approach in some cases [11,12], but inappropriate segmentation may affect the performance of the algorithm. Other tools, such as neural networks and blurring measures [13,14], have also been applied to multi-focus image fusion. However, these algorithms have only been verified on multi-focus image fusion and may not be effective for multi-modal image fusion.

Mathematical morphology [15,16] has been widely used for image processing since it was proposed, including for image fusion [17–19]. Some mathematical morphology based algorithms use a multi-scale strategy [17–19] or the top-hat transform [17–24] to extract useful image regions at different scales, and then combine the extracted regions to form the final fusion image. Because region extraction usually needs no special rules for particular applications, mathematical morphology based algorithms can be used for different types of image fusion, such as multi-modal and multi-focus image fusion, which is important for general applications. However, although the top-hat transform effectively extracts
Fig. 1. Example of TCO on one medical image.
image regions, some useful image details are smoothed, which may be harmful for image fusion.

One type of toggle contrast operator [25–31] is a contrast enhancement operator based on dilation and erosion, which sharpens the image and therefore enhances image details. This means that, based on this toggle contrast operator, image details may be extracted and then used for image fusion; the toggle contrast operator can thus also be used to obtain a good fusion result with clear details [27]. Therefore, by combining the image details extracted by the toggle contrast operator with the image regions extracted by the top-hat transform, an effective image fusion result may be achieved. Also, a multi-scale technique [32,33] can be used with the top-hat transform and the toggle contrast operator to extract useful image regions and details at multiple scales, which further improves performance. Moreover, the multi-scale technique reduces the difficulty of choosing the structuring element or fixing its size, which simplifies the application of the algorithm.

In light of this, an image fusion algorithm using the extracted image regions and details based on the multi-scale top-hat transform and toggle contrast operator is proposed in this paper. The multi-scale top-hat transform is used to extract image regions of the original images, and the multi-scale toggle contrast operator is used to extract image details of the original images. The extracted effective image regions and details are then combined into a base image to form the final fusion image. The main contribution of this paper is the extension of our previous work [27] to image fusion using both the extracted image regions and details, through combining the top-hat transform and the toggle contrast operator. Experimental results on multi-modal and multi-focus images show that the proposed algorithm performs well for image fusion on different types of images, and that its performance is better than that of several other algorithms.

2. Mathematical morphology

Mathematical morphology is an important theory for image processing based on set theory [15,16]. This paper focuses on gray level mathematical morphology. Two sets are used: the original image f(x, y) and the structuring element B(u, v), where (x, y) and (u, v) are pixel coordinates. The two basic operations, dilation (⊕) and erosion (⊖), of image f(x, y) by B(u, v) are defined as follows:
$f \oplus B(x, y) = \max_{u,v}\{f(x - u, y - v) + B(u, v)\},$  (1)

$f \ominus B(x, y) = \min_{u,v}\{f(x + u, y + v) - B(u, v)\}.$  (2)

By composing dilation and erosion, the opening (◦) and closing (•) of f(x, y) by B(u, v) are defined as follows:

$f \circ B = (f \ominus B) \oplus B,$  (3)

$f \bullet B = (f \oplus B) \ominus B.$  (4)
By comparing the results of opening and closing with the original image, the white top-hat transform (WTH) and black top-hat transform (BTH) are defined as follows:
$\mathrm{WTH}(x, y) = f(x, y) - f \circ B(x, y),$  (5)

$\mathrm{BTH}(x, y) = f \bullet B(x, y) - f(x, y).$  (6)
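For concreteness, the operators of eqs. (1)–(6) can be computed with off-the-shelf routines. The sketch below uses SciPy's grayscale morphology with a flat, disk-shaped structuring element; the image `f` is a placeholder random array, and the `disk` helper is an illustrative name introduced here, not part of the paper.

```python
import numpy as np
from scipy import ndimage

def disk(radius):
    # Boolean disk footprint: a flat structuring element, i.e. B(u, v) = 0.
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    return x * x + y * y <= radius * radius

f = np.random.rand(128, 128)  # placeholder grayscale image
B = disk(1)

dilated = ndimage.grey_dilation(f, footprint=B)  # eq. (1)
eroded = ndimage.grey_erosion(f, footprint=B)    # eq. (2)
opened = ndimage.grey_opening(f, footprint=B)    # eq. (3): erosion then dilation
closed = ndimage.grey_closing(f, footprint=B)    # eq. (4): dilation then erosion
wth = ndimage.white_tophat(f, footprint=B)       # eq. (5): f - f ∘ B
bth = ndimage.black_tophat(f, footprint=B)       # eq. (6): f • B - f
```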
Opening (closing) is usually used to smooth bright (dark) image regions corresponding to the structuring element B, so bright (dark) image regions can be extracted by WTH (BTH). The toggle contrast operator (denoted TCO) is defined from dilation and erosion through a selective output following certain rules. One type of toggle contrast operator is defined as follows:
$\mathrm{TCO}(x, y) = \begin{cases} f \oplus B(x, y), & \text{if } f \oplus B(x, y) - f(x, y) < f(x, y) - f \ominus B(x, y), \\ f \ominus B(x, y), & \text{if } f \oplus B(x, y) - f(x, y) > f(x, y) - f \ominus B(x, y), \\ f(x, y), & \text{otherwise.} \end{cases}$  (7)

The definition of TCO indicates that its output at each pixel is the gray value of the dilation result, the erosion result, or the original image. Dilation and erosion enlarge or shrink image regions, which changes the gray values of the marginal regions of image regions. Therefore TCO, as a selective output among dilation, erosion and the original image, makes the marginal regions of image regions clear, which achieves image sharpening [15,27]. An example of TCO on a medical image is shown in Fig. 1, using a circular structuring element of radius 1: (a) is the original medical image and (b) is the result of TCO. The result of TCO (Fig. 1(b)) is well sharpened, being clearer than the original medical image. Moreover, marginal regions are usually important image details for image processing [15,27]. In other words, based on TCO, important image details can be extracted, which is very useful for image processing.
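The selection rule of eq. (7) is straightforward to implement pixel-wise. Below is a minimal sketch, assuming a float-valued image so the differences do not wrap, and reusing the `disk` helper and imports from the sketch above.

```python
def toggle_contrast(f, footprint):
    # Toggle contrast operator of eq. (7): per pixel, output the dilation if it
    # is strictly closer to f than the erosion is, the erosion if the erosion
    # is strictly closer, and the original value otherwise.
    f = f.astype(float)
    d = ndimage.grey_dilation(f, footprint=footprint)
    e = ndimage.grey_erosion(f, footprint=footprint)
    take_d = (d - f) < (f - e)
    take_e = (d - f) > (f - e)
    return np.where(take_d, d, np.where(take_e, e, f))
```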
3. Algorithm

3.1. Image regions from multi-scale top-hat transform
3.1.1. Multi-scale top-hat transform

The top-hat transform extracts image regions with sizes smaller than that of the used structuring element. To extract all the
Fig. 2. Example of fusion image regions using pixel-wise maximum operation.
possible image regions, multi-scale structuring elements with different sizes should be used, and the multi-scale top-hat transform can be used to extract multi-scale image regions [22]. Suppose there are n scales of structuring elements with the same shape and increasing sizes, B_1, B_2, ..., B_n, where

$B_i = \underbrace{B_1 \oplus B_1 \oplus \cdots \oplus B_1}_{\text{dilation } i \text{ times}}, \quad 1 \le i \le n.$
To extract bright image regions corresponding to scale i, represented by structuring element B_i, the white top-hat transform using B_i is calculated:
$\mathrm{WTH}_i(f) = f - f \circ B_i.$  (8)
Similarly, to extract dark image regions corresponding to scale i, represented by structuring element B_i, the black top-hat transform using B_i is expressed as follows:
$\mathrm{BTH}_i(f) = f \bullet B_i - f.$  (9)
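A sketch of eqs. (8)–(9) over n scales follows, reusing the `disk` helper and imports above. It approximates B_i (the i-fold dilation of B_1) by a disk of radius i, which is an assumption of this sketch; the exact i-fold dilation could be used instead.

```python
def multiscale_tophats(f, n_scales=3):
    # WTH_i and BTH_i of eqs. (8)-(9) for i = 1..n; B_i is approximated by a
    # disk of radius i (the i-fold dilation of a radius-1 disk grows linearly).
    wth, bth = [], []
    for i in range(1, n_scales + 1):
        B_i = disk(i)
        wth.append(ndimage.white_tophat(f, footprint=B_i))
        bth.append(ndimage.black_tophat(f, footprint=B_i))
    return wth, bth
```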
Therefore, by using the multi-scale top-hat transform, multi-scale image regions can be extracted and used for effective image fusion.

3.1.2. Image regions extraction

Suppose the original images for image fusion are f and g. Opening smooths bright image regions at the scale corresponding to the size of the used structuring element B_i, so the white top-hat transform extracts bright image regions at the scale corresponding to B_i. The effective bright image regions are thus the pixels with large gray values in WTH. Image fusion should combine the effective image regions into the final fusion image. Therefore, the effective bright image regions at the scale corresponding to B_i extracted from f and g can be obtained through the pixel-wise maximum of WTH_i(f) and WTH_i(g) as follows:
$\mathrm{WTH}_i(f, g) = \max\{\mathrm{WTH}_i(f), \mathrm{WTH}_i(g)\}.$  (10)
Fig. 2 is an example of fusing bright image regions using the pixel-wise maximum of WTH_i(f) and WTH_i(g) at scale 1, following expression (10): (a) is the original image f; (b) is the original image g; (c) is WTH_1(f), the bright image regions of f extracted by the top-hat transform at scale 1; (d) is WTH_1(g), the bright image regions of g extracted at scale 1; (e) shows the fused bright image regions at scale 1. Fig. 2 shows that the extracted bright image regions (Fig. 2(c) and (d)) are indeed bright regions with large gray values, so combining them with the pixel-wise maximum is reasonable. In the fused bright image regions at scale 1 (Fig. 2(e)), the regions extracted from the different original images are well combined, so the pixel-wise maximum is effective for image region fusion. Other operations could also be used for the fusion procedure; however, because the extracted bright image regions have large gray values, the pixel-wise maximum maintains the effective image regions well in the final fusion image and thus achieves an effective fusion result.

WTH_i(f, g) contains the fused bright image regions at scale i, which have large gray values at every scale. Therefore, to maintain the bright image regions of all scales, the final bright image regions for image fusion should take, for each pixel, the largest gray value over all scales, calculated with the pixel-wise maximum as follows (other operations could also be used to combine the bright image regions of different scales):
$\mathrm{WR} = \max_i\{\mathrm{WTH}_i(f, g)\}.$  (11)
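Eqs. (10)–(11), and their dark-region analogues in eqs. (12)–(13) below, reduce to pixel-wise maxima over the two images and over the scales. A sketch continuing the code above (same disk-radius assumption):

```python
def fused_bright_regions(f, g, n_scales=3):
    # Eqs. (10)-(11): fuse WTH_i across the two images by pixel-wise maximum,
    # then take the maximum over all scales to obtain WR.
    per_scale = [np.maximum(ndimage.white_tophat(f, footprint=disk(i)),
                            ndimage.white_tophat(g, footprint=disk(i)))
                 for i in range(1, n_scales + 1)]
    return np.maximum.reduce(per_scale)

def fused_dark_regions(f, g, n_scales=3):
    # Eqs. (12)-(13): the same construction with the black top-hat, giving BR.
    per_scale = [np.maximum(ndimage.black_tophat(f, footprint=disk(i)),
                            ndimage.black_tophat(g, footprint=disk(i)))
                 for i in range(1, n_scales + 1)]
    return np.maximum.reduce(per_scale)
```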
Similarly, we define the effective dark image regions at the scale corresponding to B_i extracted from f and g as follows:

$\mathrm{BTH}_i(f, g) = \max\{\mathrm{BTH}_i(f), \mathrm{BTH}_i(g)\},$  (12)

and we obtain the final dark image regions for image fusion as follows:

$\mathrm{BR} = \max_i\{\mathrm{BTH}_i(f, g)\}.$  (13)

3.2. Image details from multi-scale toggle contrast operator

3.2.1. Image details from toggle contrast operator

Dilation and erosion usually enlarge or shrink image regions, which mainly changes the gray values of the marginal regions of image regions. Therefore, the result of the toggle contrast operator, which is a selective output of the result of dilation, erosion, or the original image, contains image details produced by dilation and erosion of the original image [27].

The gray value of the dilation result is not smaller than that of the original image, i.e. $f \oplus B(x, y) \ge f(x, y)$ [15,27]. So the image details produced by dilation in TCO can be calculated as follows [27]:

$\mathrm{DTCO}(f)(x, y) = \max\{\mathrm{TCO}(f)(x, y) - f(x, y), 0\}.$  (14)

Also, the gray value of the erosion result is not larger than that of the original image, i.e. $f \ominus B(x, y) \le f(x, y)$ [15,27]. So the image details produced by erosion in TCO can be calculated as follows [27]:

$\mathrm{ETCO}(f)(x, y) = \max\{f(x, y) - \mathrm{TCO}(f)(x, y), 0\}.$  (15)

3.2.2. Image details extraction

TCO enhances image contrast by replacing the gray values of the image with those of the dilation or erosion result. However, dilation and erosion use only one structuring element, so TCO enhances image details at the scale corresponding to the size of that structuring element [27]. To enhance image details at all scales, multi-scale structuring elements with the same shape and different sizes should be used in TCO [27]; that is, a multi-scale toggle contrast operator can be used to extract multi-scale image details [27]. In this section, we use the strategy of the previous work [27] to extract the multi-scale image details for image fusion.

Suppose there are m scales of structuring elements with the same shape and increasing sizes B_1, B_2, ..., B_m, where $B_s = \underbrace{B_1 \oplus B_1 \oplus \cdots \oplus B_1}_{\text{dilation } s \text{ times}}$, $1 \le s \le m$. The dilation and erosion of image f(x, y) by structuring element B_s(u, v) at scale s are expressed as follows:

$f \oplus B_s(x, y) = \max_{u,v}\{f(x - u, y - v) + B_s(u, v)\},$  (16)

$f \ominus B_s(x, y) = \min_{u,v}\{f(x + u, y + v) - B_s(u, v)\}.$  (17)

Based on dilation and erosion using the multi-scale structuring elements B_s, 1 ≤ s ≤ m, the multi-scale toggle contrast operator at scale s using structuring element B_s(u, v) is defined as follows:

$\mathrm{TCO}_s(x, y) = \begin{cases} f \oplus B_s(x, y), & \text{if } f \oplus B_s(x, y) - f(x, y) < f(x, y) - f \ominus B_s(x, y), \\ f \ominus B_s(x, y), & \text{if } f \oplus B_s(x, y) - f(x, y) > f(x, y) - f \ominus B_s(x, y), \\ f(x, y), & \text{otherwise.} \end{cases}$  (18)

Then, the image details in the result of the toggle contrast operator produced by dilation and erosion at each scale can be extracted and used for image fusion. Suppose the original images for image fusion are f and g. Using the multi-scale toggle contrast operator, the multi-scale image details produced by dilation in TCO for images f and g at scale s can be calculated as follows:

$\mathrm{DTCO}_s(f)(x, y) = \max\{\mathrm{TCO}_s(f)(x, y) - f(x, y), 0\},$  (19)

$\mathrm{DTCO}_s(g)(x, y) = \max\{\mathrm{TCO}_s(g)(x, y) - g(x, y), 0\}.$  (20)

DTCO contains the extracted image details produced by dilation. The gray values of the dilation result are larger than those of the original image, so the effective image details in DTCO should be large. The effective image details produced by dilation from f and g at scale s for image fusion can therefore be calculated by the pixel-wise maximum of DTCO_s(f) and DTCO_s(g) as follows:

$\mathrm{DTCO}_s(f, g) = \max\{\mathrm{DTCO}_s(f), \mathrm{DTCO}_s(g)\}.$  (21)

Fig. 3 is an example of fusing bright image details using the pixel-wise maximum of DTCO_1(f) and DTCO_1(g) at scale 1. The original images are the same as Fig. 2(a) and (b). In Fig. 3, (a) is DTCO_1(f), the bright image details of f extracted by the toggle contrast operator at scale 1; (b) is DTCO_1(g), the bright image details of g extracted at scale 1; (c) shows the fused bright image details at scale 1. Fig. 3 shows that the extracted bright image details (Fig. 3(a) and (b)) have large gray values, so the pixel-wise maximum effectively maintains the extracted bright image details of the original images in the final fusion image (Fig. 3(c)), which is useful for improving the fusion performance.

DTCO_s(f, g) contains the fused bright image details produced by dilation at scale s, which have large gray values at every scale. Therefore, to maintain the bright image details of all scales, the final bright image details produced by dilation should take, for each pixel, the largest gray value over all scales, calculated as the pixel-wise maximum of DTCO_s(f, g) over all scales:

$\mathrm{DF} = \max_s\{\mathrm{DTCO}_s(f, g)\}.$  (22)

Similarly, using the multi-scale toggle contrast operator, the multi-scale image details produced by erosion in TCO for images f and g at scale s can be calculated as follows:

$\mathrm{ETCO}_s(f)(x, y) = \max\{f(x, y) - \mathrm{TCO}_s(f)(x, y), 0\},$  (23)

$\mathrm{ETCO}_s(g)(x, y) = \max\{g(x, y) - \mathrm{TCO}_s(g)(x, y), 0\}.$  (24)

Also, the effective image details produced by erosion from f and g at scale s for image fusion can be calculated as follows:

$\mathrm{ETCO}_s(f, g) = \max\{\mathrm{ETCO}_s(f), \mathrm{ETCO}_s(g)\}.$  (25)

And we obtain the final image details produced by erosion for image fusion as follows:

$\mathrm{EF} = \max_s\{\mathrm{ETCO}_s(f, g)\}.$  (26)
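A sketch of eqs. (19)–(26), reusing `disk` and `toggle_contrast` from the earlier sketches: DF and EF are the image-wise and scale-wise maxima of the nonnegative bright and dark residues of the toggle contrast operator.

```python
def fused_details(f, g, m_scales=3):
    # Eqs. (19)-(26): bright (DF) and dark (EF) details from the multi-scale
    # toggle contrast operator, fused by pixel-wise maxima over images/scales.
    f, g = f.astype(float), g.astype(float)
    df_per_scale, ef_per_scale = [], []
    for s in range(1, m_scales + 1):
        B_s = disk(s)
        tf, tg = toggle_contrast(f, B_s), toggle_contrast(g, B_s)
        # eqs. (19)-(21): nonnegative dilation residues, fused over both images
        df_per_scale.append(np.maximum(np.maximum(tf - f, 0),
                                       np.maximum(tg - g, 0)))
        # eqs. (23)-(25): nonnegative erosion residues, fused over both images
        ef_per_scale.append(np.maximum(np.maximum(f - tf, 0),
                                       np.maximum(g - tg, 0)))
    # eqs. (22) and (26): maxima over all scales
    return np.maximum.reduce(df_per_scale), np.maximum.reduce(ef_per_scale)
```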
Fig. 3. Example of fusion image details using pixel-wise maximum operation.
3.3. Image fusion

One idea of image fusion is to combine the extracted image regions and details into one base image as follows:
$f_u = f_b + (\mathrm{WR} - \mathrm{BR}) + (\mathrm{DF} - \mathrm{EF}).$  (27)
Here f_u is the final fusion result image and f_b is the base image, which can be calculated as follows [19,27]:
$f_b(x, y) = 0.5 \times (f(x, y) + g(x, y)).$  (28)
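Putting the pieces together, eqs. (27)–(28) combine the base image with the fused regions and details. A sketch reusing the helpers defined above; the final clipping to an 8-bit display range is an assumption of this sketch, not part of the paper's formulation.

```python
def fuse(f, g, n_scales=3, m_scales=3):
    f, g = f.astype(float), g.astype(float)
    fb = 0.5 * (f + g)                         # eq. (28): pixel-wise mean base image
    wr = fused_bright_regions(f, g, n_scales)  # eq. (11)
    br = fused_dark_regions(f, g, n_scales)    # eq. (13)
    df, ef = fused_details(f, g, m_scales)     # eqs. (22) and (26)
    fu = fb + (wr - br) + (df - ef)            # eq. (27)
    return np.clip(fu, 0, 255)                 # assumed 8-bit display range
```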
The base image is the simple pixel-wise mean of the original images. Although the pixel-wise mean smooths some image regions and details, the basic image information of the original images is retained in the base image f_b. Importing the extracted effective image regions and details into this base image then produces an effective fusion result image, so using the pixel-wise mean of the original images as the base image is reasonable.

WR and BR are the extracted bright and dark image regions. Adding WR to and subtracting BR from the base image retains the extracted image regions and enhances the contrast between them. Likewise, DF and EF are the extracted image details; adding DF to and subtracting EF from the base image retains the extracted image details and enhances the contrast between them. Therefore, the final fusion result is effective.

3.4. Implementation

The implementation of the algorithm is shown in Fig. 4. The multi-scale white and black top-hat transforms are used to extract multi-scale bright and dark image regions, and the multi-scale operations constructed from the toggle contrast operator are used to extract bright and dark image details. After that, the extracted image regions and details are imported into the mean of the original images to form the final fusion image. Because multi-scale image regions and details are extracted and used for image fusion, the proposed algorithm is effective.

4. Experimental results

To show the effectiveness of the proposed algorithm, different types of images are used, including multi-modal medical images, remote sensing images, infrared and visible images, and multi-focus images. For comparison, the direct average algorithm, the K–L transform based algorithm [2,3], the wavelet pyramid algorithm [6–8] and the toggle contrast operator based algorithm [27] are also used. Some experimental results are demonstrated below.
Fig. 4. Implementation of the proposed algorithm. f b is the base image; WR and BR are the extracted final bright and dark image regions; DF and EF are the extracted final bright and dark image details; f u is the fusion result image.
Fig. 5. Example of infrared and vision image fusion. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
The structuring element and the number of scales are the two important parameters of the proposed algorithm. The flat structuring element is simple and widely used in morphological operators, so we use a flat structuring element in this paper [15,16]; in a flat structuring element, B(u, v) = 0 for all (u, v). Then only the shape and size of the structuring element need to be decided. The widely used shapes are rectangle, square and circle [15,16]. Because the circle has no sharp corners, a circular shape may suppress some block effects, so we use the circular shape in this paper. The size of the structuring element at each scale corresponds to the scale number.

The number of scales used in the multi-scale top-hat transform and toggle contrast operator determines the image features extracted by the proposed algorithm. Theoretically, using a large number of scales may extract more image features and thus improve performance. However, useful image features, including image regions and details, usually exist at low scales, and a large number of scales noticeably increases the computation time, which may reduce the applicability of the proposed algorithm. Usually, 3–5 scales are reasonable. In this paper, the number of scales used in the multi-scale top-hat transform and toggle contrast operator is n = m = 3. Experimental results on different types of images verified that these parameters were effective and that the proposed algorithm performed better than several other algorithms.
4.1. Visual comparison

Figs. 5–9 show visual comparison examples on different types of multi-modal and multi-focus images. In these figures, (a) and (b) are the original images for fusion; (c) is the fusion result of the direct average algorithm; (d) of the K–L transform based algorithm; (e) of the wavelet pyramid algorithm; (f) of the toggle contrast operator based algorithm; and (g) of the proposed algorithm.

In general, all the algorithms realize image fusion. However, because some image details are smoothed and some image regions may be invisible, the direct average, K–L transform based and wavelet pyramid algorithms do not combine the effective information of the original images well, and their results are less clear than those of the toggle contrast operator based algorithm and the proposed algorithm. Because most image details are well maintained by the toggle contrast operator, the result of the toggle contrast operator based algorithm is clear; however, some small image details are still smoothed, which may reduce the resolution of the fusion image. Because the proposed algorithm effectively extracts and utilizes the image regions and details of the original images, its fusion results are clearer and contain more image information than those of the other algorithms. Specific discussions on the different types of images follow.

Fig. 5 is an example of multi-modal infrared and visible image fusion. (a) is the original infrared image. (b) is the original
Fig. 6. Example of multi-modal medical image fusion. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
visible image. Effective infrared and visible image fusion should combine the regions of interest in the infrared image with the image details of the visible image. The important regions of interest, such as the person region labeled by a red rectangle, are dim and unclear in the fusion results of the direct average algorithm (Fig. 5(c)), the K–L transform based algorithm (Fig. 5(d)) and the wavelet pyramid algorithm (Fig. 5(e)), which may be harmful for target detection or recognition. The toggle contrast operator based algorithm enhances the edge details of the result image, but some image details are smoothed (Fig. 5(f)); in particular, the details of the person region in the red rectangle are not as well preserved as in the result of the proposed algorithm. In the result of the proposed algorithm (Fig. 5(g)), the image regions of the original images are effectively extracted and combined in the final fusion image, and the image details are also well combined. In particular, the target (person) region is very clear and its details are much clearer than in the other algorithms.
The resolution of the original infrared image is low, so the resolution of the fusion results may be reduced. The results of the direct average algorithm (Fig. 5(c)) and the toggle contrast operator based algorithm (Fig. 5(f)) have very low resolution, while the results of the wavelet pyramid algorithm (Fig. 5(e)) and the proposed algorithm (Fig. 5(g)) are better. Because the fusion result of the proposed algorithm is clearer than that of the wavelet pyramid algorithm, its resolution may appear slightly lower; however, the image details of the proposed algorithm are much richer and clearer than in the result of the wavelet pyramid algorithm. Therefore, the proposed algorithm achieves a better fusion result than the other algorithms in Fig. 5.

Fig. 6 is an example of multi-modal medical image fusion. The important image regions and details should be well combined and displayed in the final fusion image to assist diagnosis. The inner region labeled by the red rectangle contains rich image details. However, because the direct average algorithm, K–L transform
Fig. 7. Example of multi-modal remote sensing image fusion. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
based algorithm and wavelet pyramid algorithm smooth image details and some image regions, this region is not clear in their results (Fig. 6(c), (d) and (e)). Although the result of the toggle contrast operator based algorithm is clear, the region labeled by the red rectangle is still less clear than in the result of the proposed algorithm and contains fewer image details. Also, the two regions labeled by yellow rectangles in the original image (Fig. 6(a)) contain important bright regions, which are well maintained in the fusion result of the proposed algorithm (Fig. 6(g)); the fusion result of the toggle contrast operator based algorithm (Fig. 6(f)) does not maintain these important bright regions well.

Fig. 7 is an example of multi-modal remote sensing image fusion. The original images contain many image details, and some image regions are not clear. The fusion result of an effective algorithm should be clear and contain rich image details. It is easily observed that the results of the direct average algorithm (Fig. 7(c)), the K–L transform based algorithm (Fig. 7(d)) and the wavelet pyramid algorithm (Fig. 7(e)) are not clear and contain fewer image details than the results of the toggle contrast operator based algorithm
(Fig. 7(f)) and the proposed algorithm (Fig. 7(g)). Although the toggle contrast operator based algorithm obtains a clear result, the contrast of its image regions and details is still not as good as in the result of the proposed algorithm. The contrast of the final fusion image of the proposed algorithm is very good and the image details are very clear; in particular, in the region labeled by the yellow rectangle, the contrast of the result of the proposed algorithm is better than that of the toggle contrast operator based algorithm. Also, at the bottom of the result images, the image details of the proposed algorithm are richer and clearer. So the proposed algorithm performs well for image fusion.

Fig. 8 is an example of multi-focus image fusion. (a) and (b) are the original multi-focus images. A multi-focus image fusion algorithm should extract and combine the in-focus regions of the original images to form a clear fusion image. Because the in-focus regions of the original images are not well extracted by the direct average algorithm (Fig. 8(c)), the K–L transform based algorithm (Fig. 8(d)) and the wavelet pyramid algorithm (Fig. 8(e)), their final fusion images are not clear and some regions are even less clear than in the original images. The results of the toggle contrast operator
Fig. 8. Example of multi-focus image fusion. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
based algorithm (Fig. 8(f)) and the proposed algorithm (Fig. 8(g)) are clearer than those of the other algorithms. Moreover, some regions, such as the number region labeled by the yellow rectangle, are clearer in the fusion result of the proposed algorithm than in the result of the toggle contrast operator based algorithm. The in-focus regions of the final fusion image of the proposed algorithm are even clearer than in the original images, because the proposed algorithm extracts image details well and maintains them in the final fusion image. Therefore, the proposed algorithm is effective for multi-focus image fusion.

Fig. 9 is another example of multi-focus image fusion. (a) and (b) are the original multi-focus images. Again, the proposed algorithm effectively extracts the in-focus image regions and details and combines them well in the final fusion image, giving the best fusion result. In particular, in the text regions and the region containing many straight lines, labeled by the yellow rectangles, the result of the proposed algorithm is the clearest.
These experimental results show that, because the multi-scale top-hat transform extracts image regions well, the multi-scale toggle contrast operator extracts image details well, and the extracted image regions and details are reasonably combined into the final fusion image, the proposed algorithm is effective for image fusion. Moreover, although the images used in the experiments are of different types, the algorithm performs well on all of them, which indicates that the proposed algorithm may be useful for different applications.

4.2. Quantitative comparison

For a quantitative comparison, two measures, the mean gradient [34] and the spatial frequency [35], are used in this paper. To give a reasonable quantitative comparison and clearly show the results, different groups of images, including multi-modal and multi-focus images, are used to calculate the quantitative measures, and the mean value of each measure over the fusion results of the different groups of images is calculated for each algorithm.

Because a number of different groups of images are used, we give the overall quantitative comparisons on all the images using
Fig. 9. Another example of multi-focus image fusion. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
two different measures. However, all the individual comparisons on each group of images also verified that the performance of the proposed algorithm was better than that of the other algorithms. This is mainly because the proposed algorithm does not need special information about the original images, so the image regions and details can be well extracted and reasonably combined to form an effective fusion result. It has to be pointed out that, although we have tried to provide reasonable quantitative comparisons using two measures, giving accurate and sufficient quantitative comparisons for image fusion is always difficult; this is why we use two measures together with overall comparisons on all the images. The overall comparison using the mean value of each measure is shown in Figs. 10 and 11.

Fig. 10 is the quantitative comparison using the mean gradient. The mean gradient is the mean value of the gradient image of the final fusion image. The gradient usually represents image details and clear image regions, so a larger mean gradient indicates that the image contains more image details and clearer image regions, and thus a more effective fusion result. Fig. 10 shows that
Fig. 10. Overall quantitative comparison result using mean gradient.
the proposed algorithm has a much larger mean gradient value than the other algorithms, which indicates that it effectively extracts the image details and regions of the original images and gives a better fusion result.
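The exact normalization of the mean gradient varies in the literature [34]; a minimal sketch of one common formulation (the average magnitude of the finite-difference gradient) follows.

```python
import numpy as np

def mean_gradient(img):
    # Mean (average) gradient: mean magnitude of the finite-difference
    # gradient. One common formulation; normalizations differ across papers.
    img = img.astype(float)
    gx = np.diff(img, axis=1)[:-1, :]  # horizontal first differences
    gy = np.diff(img, axis=0)[:, :-1]  # vertical first differences
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))
```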
Fig. 11. Overall quantitative comparison result using spatial frequency.
Fig. 11 is the quantitative comparison using the spatial frequency. The spatial frequency represents the spatial information contained in an image: if the final fusion image contains more of the effective image regions and details of the original images, it contains more spatial information. So a larger spatial frequency indicates a better fusion result. Fig. 11 shows that the spatial frequency value of the proposed algorithm is larger than that of the other algorithms. The reason is that the proposed algorithm extracts useful image regions and details well and combines them into the final fusion result, so the final fusion image of the proposed algorithm contains more image information and achieves a better fusion result.

Some examples of individual comparisons using the mean gradient and the spatial frequency are also shown in Figs. 12 and 13.
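The spatial frequency measure [35] has a standard closed form built from row and column first differences; a sketch (up to the normalization convention) follows.

```python
def spatial_frequency(img):
    # Spatial frequency: sqrt(RF^2 + CF^2), with row frequency RF from
    # horizontal first differences and column frequency CF from vertical ones.
    img = img.astype(float)
    rf2 = np.mean(np.diff(img, axis=1) ** 2)  # row frequency squared
    cf2 = np.mean(np.diff(img, axis=0) ** 2)  # column frequency squared
    return float(np.sqrt(rf2 + cf2))
```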
Fig. 12. Examples of individual comparisons using mean gradient.
Fig. 13. Examples of individual comparisons using spatial frequency.
In them, the quantitative results shown in (a), (b), (c), (d) and (e) correspond to the original images of Figs. 5, 6, 7, 8 and 9, respectively. Figs. 12 and 13 show that the values of the proposed algorithm are larger than those of the other algorithms, which indicates more effective performance; these individual comparisons further verify the effectiveness of the proposed algorithm.

The quantitative comparison results show that, because the proposed algorithm effectively extracts the useful image regions and details of the original images and combines them into the final fusion image, it performs well on both the mean gradient and the spatial frequency. Moreover, different types of multi-modal and multi-focus images are used in this experiment. All of these results indicate that the
proposed algorithm is very useful for image fusion and could be widely used in different applications.

5. Conclusions

To keep more useful image information in the final fusion image, an image fusion algorithm using the extracted image regions and details based on the multi-scale top-hat transform and toggle contrast operator is proposed in this paper. The multi-scale top-hat transform extracts bright and dark image regions of the original images for image fusion, and multi-scale operations constructed from the toggle contrast operator extract bright and dark image details. So, by appropriately importing the extracted image regions and details into the final fusion image,
the effective fusion result can be obtained. Experimental results show that the proposed algorithm performs very well for image fusion. Moreover, the images used in the experiments are different multi-modal and multi-focus images from different applications, and the proposed algorithm performs very well on all of them. Therefore, the proposed algorithm could be widely used in different applications for different purposes, such as biomedical engineering, target detection and so on.

Because the proposed algorithm is mainly based on the extraction and combination of the high frequency components of the original images, the fusion result image may appear enhanced, which may affect the fusion of the low frequency components. Also, because a multi-scale scheme is used, the morphological operators have to be computed several times, which increases the computation time of the proposed algorithm. However, accelerating strategies [36–38] have been proposed to speed up morphological operations and could be used to reduce the computation time. And, because the proposed algorithm is more effective than several other algorithms and needs no special information about the original images, it can be well used in different applications related to multi-modal and multi-focus image fusion.

Furthermore, different types of morphological operators have different properties for image processing, which may be used to identify different types of image information. This paper is a preliminary example of combining the strengths of different morphological operators; further research on appropriately combining and utilizing the image information identified by different morphological operators may achieve more effective results in a wide range of applications.

Acknowledgments

The author thanks the anonymous reviewers for their very constructive comments. This work has been partly supported by the National Natural Science Foundation of China (Grant No. 61271023), the open funding project of the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (Grant No. BUAA-VR-12KF-04), the Fundamental Research Funds for the Central Universities (Grant No. YWF-11-03-Q-065) and the Innovation Foundation of AVIC (Grant No. CXY2010BH02). The author is grateful to Dr. Yan Li at Peking University, Beijing, China, for many helpful discussions and comments. The author acknowledges the original images available at http://www.imagefusion.org.

References

[1] N. Cvejic, D. Bull, N. Canagarajah, Region-based multimodal image fusion using ICA bases, IEEE Sens. J. 7 (5) (2007) 743–751.
[2] I. Jollife, Principal Component Analysis, Springer, 1986.
[3] M. González-Audícana, J. Saleta, R. Catalán, R. García, Fusion of multispectral and panchromatic images using improved IHS and PCA mergers based on wavelet decomposition, IEEE Trans. Geosci. Remote Sens. 42 (6) (2004) 1291–1299.
[4] G. Piella, A general framework for multiresolution image fusion: from pixels to regions, Inf. Fusion 4 (2003) 259–280.
[5] G. Pajares, J. Cruz, A wavelet-based image fusion tutorial, Pattern Recogn. 37 (2004) 1855–1872.
[6] K. Amolins, Y. Zhang, P. Dare, Wavelet based image fusion techniques—An introduction, review and comparison, ISPRS J. Photogramm. Remote Sens. 62 (2007) 249–263.
[7] G. Bhutada, R. Anand, S. Saxena, Edge preserved image enhancement using adaptive fusion of images denoised by wavelet and curvelet transform, Digital Signal Process. 21 (2011) 118–130.
[8] Y. Chibani, A. Houacine, Redundant versus orthogonal wavelet decomposition for multisensor image fusion, Pattern Recogn. 36 (2003) 879–887.
[9] F. Nencini, A. Garzelli, S. Baronti, L. Alparone, Remote sensing image fusion using the curvelet transform, Inf. Fusion 8 (2007) 143–156.
[10] M. Choi, R. Kim, M. Nam, H. Kim, Fusion of multispectral and panchromatic satellite images using the curvelet transform, IEEE Geosci. Remote Sens. Lett. 2 (2) (2005) 136–140.
[11] S. Li, B. Yang, Multifocus image fusion using region segmentation and spatial frequency, Image Vision Comput. 26 (2008) 971–979.
[12] A. Toet, M. Hogervorst, S. Nikolov, J. Lewis, T. Dixon, D. Bull, N. Canagarajah, Towards cognitive image fusion, Inf. Fusion 11 (2010) 95–113.
[13] Z. Wang, Y. Ma, J. Gu, Multi-focus image fusion using PCNN, Pattern Recogn. 43 (2010) 2003–2016.
[14] Y. Zhang, L. Ge, Efficient fusion scheme for multi-focus images by using blurring measure, Digital Signal Process. 19 (2009) 186–193.
[15] P. Soille, Morphological Image Analysis—Principles and Applications, Springer, Germany, 2003.
[16] J. Serra, Image Analysis and Mathematical Morphology, Academic Press, New York, 1982.
[17] I. De, B. Chanda, A simple and efficient algorithm for multifocus image fusion using morphological wavelets, Signal Process. 86 (2006) 924–936.
[18] I. De, B. Chanda, B. Chattopadhyay, Enhancing effective depth-of-field by image fusion using mathematical morphology, Image Vision Comput. 24 (2006) 1278–1287.
[19] S. Mukhopadhyay, B. Chanda, Fusion of 2D grayscale images using multiscale morphology, Pattern Recogn. 34 (2001) 1939–1949.
[20] X. Bai, F. Zhou, Y. Xie, T. Jin, Enhanced detectability of point target using adaptive morphological clutter elimination by importing the properties of the target region, Signal Process. 89 (10) (2009) 1973–1989.
[21] X. Bai, F. Zhou, B. Xue, Noise suppressed image enhancement using multiscale top-hat selection transform through region extraction, Appl. Opt. 51 (2011) 338–347.
[22] X. Bai, F. Zhou, B. Xue, Image enhancement using multi scale image features extracted by top-hat transform, Opt. Laser Technol. 44 (2012) 328–336.
[23] X. Bai, F. Zhou, Analysis of new top-hat transformation and the application for infrared dim small target detection, Pattern Recogn. 43 (6) (2010) 2145–2156.
[24] X. Bai, F. Zhou, Analysis of different modified top-hat transformations based on structuring element constructing, Signal Process. 90 (11) (2010) 2999–3003.
[25] S. Meyer, J. Serra, Contrasts and activity lattice, Signal Process. 16 (1989) 303–317.
[26] H. Kramer, J. Bruckner, Iterations of non-linear transformations for enhancement on digital images, Pattern Recogn. 7 (1975) 53–58.
[27] X. Bai, F. Zhou, B. Xue, Edge preserved image fusion based on multiscale toggle contrast operator, Image Vision Comput. 29 (2011) 829–839.
[28] P. Maragos, Morphological filtering for image enhancement and feature detection, in: A.C. Bovik (Ed.), The Image and Video Processing Handbook, 2nd edition, Elsevier Academic Press, 2005, pp. 135–156.
[29] J. Schavemaker, M. Reinders, J. Gerbrands, E. Backer, Image sharpening by morphological filtering, Pattern Recogn. 33 (2000) 997–1012.
[30] L. Dorini, N. Leite, A scale-space toggle operator for morphological segmentation, in: Proceedings of the 8th International Symposium on Mathematical Morphology, Rio de Janeiro, Brazil, October 10–13, 2007, MCT/INPE, vol. 1, 2007, pp. 101–112.
[31] L. Dorini, R. Minetto, N. Leite, White blood cell segmentation using morphological operators and scale-space analysis, in: Proceedings of the XX Brazilian Symposium on Computer Graphics and Image Processing, Belo Horizonte, Minas Gerais, Brazil, October 7–10, 2007, pp. 294–301.
[32] A. Jalba, M. Wilkinson, J. Roerdink, Morphological hat-transform scale spaces and their use in pattern classification, Pattern Recogn. 37 (2004) 901–915.
[33] M. Oliveira, N. Leite, A multiscale directional operator and morphological tools for reconnecting broken ridges in fingerprint images, Pattern Recogn. 41 (2008) 367–377.
[34] W. Wang, P. Tang, C. Zhu, A wavelet transform based image fusion method, J. Image Graph. 6 (11) (2001) 1130–1136.
[35] V. Aslantas, R. Kurban, A comparison of criterion functions for fusion of multi-focus noisy images, Opt. Commun. 282 (2009) 3231–3242.
[36] F. Gonzalez, O. Tubio, F. Tobajas, V. De Armas, R. Esper-Chain, R. Sarmiento, Morphological processor for real-time image applications, Microelectron. J. 33 (2002) 1115–1122.
[37] M. Droogenbroeck, H. Talbot, Fast computation of morphological operations with arbitrary structuring elements, Pattern Recogn. Lett. 17 (1996) 1451–1460.
[38] H. Park, R. Chin, Decomposition of arbitrarily shaped morphological structuring elements, IEEE Trans. Pattern Anal. Mach. Intell. 17 (1) (1995) 2–15.
Xiangzhi Bai received his B.S. and Ph.D. from Beijing University of Aeronautics and Astronautics (BUAA) in 2003 and 2009, respectively. He is currently an associate professor in Image Processing Center of BUAA. He has won the nomination prize of the national best doctoral thesis and several awards from different organizations. He holds 3 national invention patents and has published more than 70 international journal and conference papers in the field of mathematical morphology, image analysis, pattern recognition and bioinformatics. He also acts as active reviewer for around 20 international journals and conferences.