ARTICLE IN PRESS Signal Processing 90 (2010) 1643–1654
Signal Processing journal homepage: www.elsevier.com/locate/sigpro
Enhancement of dim small target through modified top-hat transformation under the condition of heavy clutter Xiangzhi Bai , Fugen Zhou, Ting Jin Image Processing Center, Beijing University of Aeronautics and Astronautics, 100191 Beijing, China
Article info
Article history: Received 31 May 2009; received in revised form 9 November 2009; accepted 10 November 2009; available online 27 November 2009.
Keywords: Target enhancement; Dim small target; Top-hat transformation; Heavy clutter; False alarm reduction; Target detection

Abstract
A new algorithm that enhances dim small targets through a modified top-hat transformation is proposed in this paper. Firstly, the property of the top-hat transformation is analyzed with respect to the property of small target regions. Secondly, a judging value is calculated from the properties of the target region and the top-hat transformation. Finally, the judging value is imported into the top-hat transformation to form the modified top-hat transformation. Because of the judging value, the dim target can be significantly enhanced and heavy clutter can be effectively suppressed. Experimental results verify that the modified top-hat transformation is effective and robust for target enhancement under the conditions of heavy clutter and dim target intensity. © 2009 Elsevier B.V. All rights reserved.
1. Introduction
Dim small target detection is a crucial technique in image processing applications and has attracted increasing interest in recent years. Because the target is far from the imaging equipment, the target region in the image usually appears as a point-like shape embedded in a heavy clutter background. When the target is dim and the clutter is heavy, the target may have a low signal-to-noise ratio (SNR), a cluttered background, an unknown velocity, or no available shape information. These features make target detection difficult. The purpose of target detection is to find the potential target regions in the clutter background, so heavy clutter and dim target intensity increase the difficulty of target detection. In order to detect the dim target embedded in heavy clutter background, the
Corresponding author. Tel.: +86 10 82338048. E-mail addresses: [email protected], [email protected] (X. Bai).
0165-1684/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.sigpro.2009.11.014
strategy of target enhancement through clutter elimination is adopted by most algorithms. Many algorithms have been proposed to enhance a dim target embedded in heavy clutter, such as the median filter, max-mean filter [1], max-median filter [1], kernel smoothing methods [2], wavelet based algorithm [3], rectification filters [4], 3D directional filters [5], TTF [6] and 2D adaptive lattice algorithm [7]. However, the median filter, max-mean filter [1], max-median filter [1], kernel smoothing methods [2], wavelet based algorithm [3] and rectification filters [4] are sensitive to heavy clutter or to the variation of target features. Likewise, 3D directional filters [5] need several consecutive images with target motion properties, and such images are not easy to obtain. Although TTF [6] is not very sensitive to evolving cloud data, the selection of optimized parameters is difficult. The 2D adaptive lattice algorithm [7] can suppress the effect of clutter, but the algorithm is complex and time consuming. Some intelligent tools such as neural networks [8], support vector machines [9] and probability visual learning [10,11] have been used successfully in some cases, but the effective training set they need is difficult to organize. Because of the
parallel property and easy implementation in real-time hardware systems [12], mathematical morphology based algorithms [13–16] are widely used to enhance small targets, especially the top-hat transformation [14–16]. However, because of heavy clutter and the detail-smoothing property of morphological operations, these algorithms are sensitive to clutter variation and may not perform well if the target is dim or the clutter is heavy, which makes the pre- or post-processing of target detection difficult. In all, most of the methods mentioned above are complex, time consuming or ineffective under the conditions of heavy clutter and dim target intensity.
In order to make dim target detection easy and efficient, the classical top-hat transformation is modified following the property of the target region, so that it can enhance the dim target under the condition of heavy clutter. Firstly, the top-hat transformation is analyzed following the property of target regions. Secondly, a judging value is calculated following the property of the target region and the top-hat transformation. Finally, the judging value is imported into the top-hat transformation to improve its efficiency for target enhancement. Importing the judging value not only significantly improves the performance of the modified top-hat transformation for target enhancement, but also improves its adaptability to heavy clutter and dim target intensity. Experimental results demonstrate that the modified top-hat transformation is effective and robust for target enhancement.
This paper is organized as follows. Section 2 presents the definitions of mathematical morphology. Section 3 analyzes the properties of the top-hat transformation. Section 4 gives the details of the modified top-hat transformation. Section 5 gives the results of the algorithm, and Section 6 concludes the discussion.
2. Mathematical morphology
Because the target is far away from imaging equipment, the size of small target in image is always very small, which is convenient for structuring element selection when the mathematical morphology is applied. Thus, the mathematical morphology is appropriate for target enhancement. Mathematical morphology is developed from geometry and based on set theory [12]. Most of the mathematical morphological operations are introduced by two basic operations: dilation and erosion, and work with two sets. One set is the original image to be analyzed and the other set is called structuring element. Let f and B represent the gray image and a structuring element, respectively. The dilation and erosion of f(x, y) by B(p, q), denoted by $f \oplus B$ and $f \ominus B$, are defined by

$$(f \oplus B)(x, y) = \max_{p,q}\bigl(f(x - p, y - q) + B(p, q)\bigr), \tag{1}$$

$$(f \ominus B)(x, y) = \min_{p,q}\bigl(f(x + p, y + q) - B(p, q)\bigr), \tag{2}$$

where the domains of $f \oplus B$ and $f \ominus B$ are the dilation and erosion of the domain of f with the domain of B. Because of the maximum (minimum) operation, the gray value of the result image of dilation (erosion) is larger (smaller) than the original image. If the structuring element is flat, that is

$$B(p, q) = 0, \quad \forall p, q, \tag{3}$$

then Eqs. (1) and (2) become:

$$(f \oplus B)(x, y) = \max_{p,q}\bigl(f(x - p, y - q)\bigr), \tag{4}$$

$$(f \ominus B)(x, y) = \min_{p,q}\bigl(f(x + p, y + q)\bigr). \tag{5}$$

Based on dilation and erosion, opening and closing of f(x, y) by B(p, q), denoted by $f \circ B$ and $f \bullet B$, are defined by

$$(f \circ B)(x, y) = \bigl((f \ominus B)(x, y)\bigr) \oplus B(p, q), \tag{6}$$

$$(f \bullet B)(x, y) = \bigl((f \oplus B)(x, y)\bigr) \ominus B(p, q). \tag{7}$$

In general, opening smooths bright small regions of the image, and closing eliminates dark small holes. Then, the white top-hat transformation and black top-hat transformation of image f, denoted by WTH and BTH, are defined by

$$\mathrm{WTH}(x, y) = f(x, y) - (f \circ B)(x, y), \tag{8}$$

$$\mathrm{BTH}(x, y) = (f \bullet B)(x, y) - f(x, y). \tag{9}$$

Eqs. (8) and (9) indicate that the white top-hat transformation (WTH) returns an image with the objects from the original image which are smaller than the structuring element and brighter than the surrounding regions. And, the black top-hat transformation (BTH) returns an image with the objects from the original image which are smaller than the structuring element and darker than the surrounding regions. A small target region in an infrared image is usually a small bright region. So, WTH can be used to directly detect the potential targets in an infrared image through target enhancement as follows [15]:

$$f_T(x, y) = \mathrm{WTH}(x, y) = f(x, y) - (f \circ B)(x, y) = f(x, y) - f_b(x, y), \tag{10}$$
where $f_T(x, y)$ is the enhanced image and $f_b(x, y)$ is the estimated background.
3. Property of classical top-hat transformations (WTH and BTH)
Opening and closing change most of the gray values of the original image [12], which means that every changed region produces an output in the image resulting from the top-hat transformations. So, the result of a top-hat transformation contains many non-zero regions other than the target region, which greatly increases the number of possible false alarms. On the other hand, because opening (closing) usually eliminates bright (dark) regions from the original image, the gray change of a bright (dark) region after WTH (BTH) is larger than that of most surrounding background regions. Moreover, a small target region in infrared images is usually a bright region embedded in the background [15]. Together, these observations mean that not all the non-zero regions in $f_T$ are potential target regions; only the regions with larger gray changes are.
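As an illustration of Eqs. (4)-(10), the following is a minimal NumPy sketch of flat erosion, dilation, opening and WTH. The function names, the square structuring element and the edge-replicating border handling are assumptions made for this sketch; they are not specified in the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(f, size):
    """All size x size neighbourhoods of f (odd size, edge-padded: an assumption)."""
    r = size // 2
    padded = np.pad(f, r, mode="edge")
    return sliding_window_view(padded, (size, size))


def erode(f, size):
    # Flat erosion, Eq. (5): minimum over the structuring element.
    return _windows(f, size).min(axis=(-2, -1))


def dilate(f, size):
    # Flat dilation, Eq. (4): maximum over the structuring element.
    return _windows(f, size).max(axis=(-2, -1))


def opening(f, size):
    # Opening, Eq. (6): erosion followed by dilation.
    return dilate(erode(f, size), size)


def white_top_hat(f, size):
    # WTH, Eqs. (8) and (10): image minus the estimated background.
    return f - opening(f, size)


# Toy image: flat background (10) with a 2 x 2 bright target (60).
image = np.full((7, 7), 10.0)
image[3:5, 3:5] = 60.0
wth = white_top_hat(image, size=5)
```

In this toy case the 5 × 5 structuring element is larger than the target, so opening removes the target and WTH keeps only the four target pixels. `scipy.ndimage.white_tophat` offers a library implementation of the same operation, although its border handling may differ from the sketch.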
Therefore, a judging value can be imported into the top-hat transformation to differentiate the real target region from the slowly varying clutter background by differentiating the gray changes of these regions. Obviously, the gray change range of each pixel after the opening or closing operation is fixed. If the gray change of these pixels can be determined, a reasonable judging value to differentiate the potential target region in $f_T$ can also be determined. Therefore, the key to finding a reasonable judging value is to make clear the gray change range of the bright region. The following propositions give the properties of opening and closing. The properties indicate the gray change range of a bright (dark) region after the opening (closing) operation and the principle of selecting the judging value.

Proposition 1. Let B denote the structuring element, which is flat. Let $\hat{I}$ denote a bright region in image I, which is overlaid by B to be eroded. The minimum value in the region of B corresponding to any position in $\hat{I}$ is $\min V$; all $\min V$ corresponding to all the positions in $\hat{I}$ form a set BV. Let $\max I$ denote the maximum value in $\hat{I}$, and $\min I$ denote the maximum value in BV. Then,

$$\hat{I}(u, v) - (\hat{I} \circ B)(u, v) \le \max I - \min I, \tag{11}$$

where (u, v) is the pixel coordinate of $\hat{I}$. If (u, v) is the local maximum position, then

$$\hat{I}(u, v) - (\hat{I} \circ B)(u, v) = \max I - \min I. \tag{12}$$

Proof. Let $D_{\hat{I}}$ represent the domain of $\hat{I}$. Because $\max I$ is the maximum value in $\hat{I}$, $\forall (u, v) \in D_{\hat{I}}$, $\hat{I}(u, v) \le \max I$. Let $\hat{I}_1$ denote the eroded region of $\hat{I}$ by B. Following the definition of erosion, $\hat{I}_1$ is the minimum value of $\hat{I}$ by B. So, all the erosion values of the positions in $\hat{I}$ form the set BV. Because $\hat{I}$ is the bright region of I, the maximum value of BV, denoted by $\min I$, should be smaller than the original value of $\hat{I}$. That is, $\forall (u, v) \in D_{\hat{I}}$, $\min I \le \hat{I}(u, v) \le \max I$. Following the definition of opening, erosion is followed by dilation, which takes the maximum value of the region of interest. Meanwhile, $\min I$ is the maximum value of BV, which is formed by the values of $\hat{I}_1$. That is, $\forall (u, v) \in D_{\hat{I}}$, $(\hat{I} \circ B)(u, v) = \max\{\hat{I}_1(u, v)\} = \min I$. Therefore,

$$\hat{I}(u, v) - (\hat{I} \circ B)(u, v) \le \max I - \min I.$$

If (u, v) is the position of the local maximum value, that is $\hat{I}(u, v) = \max I$, then the inequality becomes the equation

$$\hat{I}(u, v) - (\hat{I} \circ B)(u, v) = \max I - \min I. \qquad \square$$

This proposition can be illustrated by the example shown in Fig. 1. The example demonstrates that the bright region eroded by the structuring element is filled with small gray values, and that the largest value in this eroded region (39) is not larger than any pixel value of the original bright region. Then, the bright region of the opening image is replaced by 39, and the gray changes of all the pixels in the bright region are not larger than the difference between the largest gray value (90) of the original bright region and 39. Also, the gray change of the pixel with the maximum gray value (90) is the difference between 90 and 39, which is 51.
The resulting image of Fig. 1(a) through top-hat transformation is shown in Fig. 1(e). Because of the heavy clutter in Fig. 1(a), the clutter in Fig. 1(e) is heavy, too. The gray values of some clutter regions in Fig. 1(e), labeled by circles, are very close to those of the target region. These regions are difficult to remove and will become false alarms. Moreover, the difference between the target region and the labeled regions in Fig. 1(e) is even smaller
Fig. 1. Illustration of Proposition 1: (a) original image; (b) structuring element; (c) erosion image of (a); (d) opening image of (a); (e) resulting image of (a) through top-hat transformation.
than that of the original image in Fig. 1(a), which means using top-hat transformation to enhance target is inefficient when clutter is heavy.
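The bound of Proposition 1 can be checked numerically. The sketch below builds a hypothetical 9 × 9 image (not the image of Fig. 1) with a 2 × 2 bright region and a 5 × 5 flat structuring element, and compares the gray change after opening with max I − min I. It assumes that the bright region is smaller than the structuring element and uses edge-replicating borders; only the inequality of Eq. (11) is checked.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def flat_filter(f, size, reduce):
    """Apply reduce (np.min or np.max) over each size x size window (edge-padded)."""
    r = size // 2
    win = sliding_window_view(np.pad(f, r, mode="edge"), (size, size))
    return reduce(win, axis=(-2, -1))


# Hypothetical clutter background (gray values 30..40) with a 2 x 2 bright region.
rng = np.random.default_rng(0)
image = rng.integers(30, 41, size=(9, 9)).astype(float)
image[4:6, 4:6] = [[80.0, 90.0], [85.0, 88.0]]
region = np.zeros(image.shape, dtype=bool)
region[4:6, 4:6] = True

size = 5                                    # structuring element larger than the region
eroded = flat_filter(image, size, np.min)   # erosion, Eq. (5)
opened = flat_filter(eroded, size, np.max)  # opening, Eq. (6)

max_i = image[region].max()                 # max I: maximum of the bright region
min_i = eroded[region].max()                # min I: maximum of the set BV
gray_change = (image - opened)[region]      # left-hand side of Eq. (11)
bound = max_i - min_i                       # right-hand side of Eq. (11)
```

Every entry of `gray_change` stays at or below `bound`, as Eq. (11) states for a bright region covered by the structuring element.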
Similarly, the dark region and closing have the corresponding property shown in Proposition 2.

Proposition 2. Let B denote the structuring element, which is flat. Let $\hat{I}$ denote a dark region in image I, which is overlaid by B to be dilated. The maximum value in the region of B corresponding to any position in $\hat{I}$ is $\max V$; all $\max V$ corresponding to all the positions in $\hat{I}$ form a set BV. Let $\min I$ denote the minimum value in $\hat{I}$, and $\max I$ denote the minimum value in BV. Then,

$$(\hat{I} \bullet B)(u, v) - \hat{I}(u, v) \le \max I - \min I, \tag{13}$$

where (u, v) is the pixel coordinate of $\hat{I}$. If (u, v) is the local minimum position, then

$$(\hat{I} \bullet B)(u, v) - \hat{I}(u, v) = \max I - \min I. \tag{14}$$

Proof. Let $D_{\hat{I}}$ represent the domain of $\hat{I}$. Because $\min I$ is the minimum value in $\hat{I}$, $\forall (u, v) \in D_{\hat{I}}$, $\hat{I}(u, v) \ge \min I$. Let $\hat{I}_1$ denote the dilated region of $\hat{I}$ by B. Following the definition of dilation, $\hat{I}_1$ is the maximum value of $\hat{I}$ by B. So, all the dilation values of the positions in $\hat{I}$ form the set BV. Because $\hat{I}$ is the dark region of I, the minimum value of BV, denoted by $\max I$, should be larger than the original value of $\hat{I}$. That is, $\forall (u, v) \in D_{\hat{I}}$, $\min I \le \hat{I}(u, v) \le \max I$. Following the definition of closing, dilation is followed by erosion, which takes the minimum value of the region of interest. Meanwhile, $\max I$ is the minimum value of BV, which is formed by the values of $\hat{I}_1$. That is, $\forall (u, v) \in D_{\hat{I}}$, $(\hat{I} \bullet B)(u, v) = \min\{\hat{I}_1(u, v)\} = \max I$. Therefore,

$$(\hat{I} \bullet B)(u, v) - \hat{I}(u, v) \le \max I - \min I.$$

If (u, v) is the position of the local minimum value, that is $\hat{I}(u, v) = \min I$, then the inequality becomes the equation

$$(\hat{I} \bullet B)(u, v) - \hat{I}(u, v) = \max I - \min I. \qquad \square$$

Proposition 1 indicates that the maximum gray change of a bright region is not larger than a certain value, $\max I - \min I$. So, in order to discriminate the bright region of a real target, the judging value used to differentiate the gray change of the bright target region from that of the clutter background should be smaller than this value. Otherwise, the bright region will be lost. Therefore, the judging value should be selected reasonably. Conversely, if the judging value is reasonably selected, it can be imported into the top-hat transformation to differentiate the real bright regions from the slowly varying background.
In the following section, a strategy to calculate the judging value is proposed first. Then, the modified top-hat transformation is defined by importing the calculated judging value into the classical top-hat transformation. Finally, the properties of the modified top-hat transformation are analyzed.

4. Modified top-hat transformation

4.1. Judging value calculation

Proposition 1 indicates that the gray change of a bright region is not larger than the difference between the local maximum value and the maximum value of BV. This difference is not easy to calculate because BV is difficult to determine. In fact, there is no need to find the exact value of the difference; it is enough to find a value that lies in the range specified by Proposition 1 and approximates the exact value. In this paper, a strategy is proposed to find the approximate value of the gray change.
All the gray values of the target region are larger than the gray values of the surrounding pixels [14–17]. Following this property, an L × L window denoted by w, whose center is placed at each pixel of the image in turn, is used to calculate the approximate gray change of each pixel. The approximate gray changes of all pixels calculated with the window w form a gray change map, denoted by GCM in this paper. For each pixel of GCM, the center of w is first placed at that pixel in the original image. Then, the maximum and minimum gray values of the pixels of the original image covered by w are obtained as $\max\{f(x - L/2 + i, y - L/2 + j),\ i \in [0, L-1],\ j \in [0, L-1]\}$ and $\min\{f(x - L/2 + i, y - L/2 + j),\ i \in [0, L-1],\ j \in [0, L-1]\}$, where L is the size of the window. Finally, the value of this pixel in GCM is assigned as the difference between this maximum and this minimum. The calculation of GCM can be expressed as follows:

$$\mathrm{GCM}(x, y) = \max\{f(x - L/2 + i, y - L/2 + j),\ i \in [0, L-1],\ j \in [0, L-1]\} - \min\{f(x - L/2 + i, y - L/2 + j),\ i \in [0, L-1],\ j \in [0, L-1]\}. \tag{15}$$

Eq. (15) indicates that the value of GCM at each pixel is the difference between the maximum value and the minimum value in w. So, if the window size L is smaller than the size of the target region, the minimum value of w will be larger than the maximum value of BV. Then, the value calculated by Eq. (15) will not exceed the range specified by Proposition 1 and can be used as the approximate gray change of each pixel of the target region. A large L results in a large output in GCM, which may exceed the range specified by Proposition 1. Therefore, L should be smaller than the size of the target region.
L is related to the size of the structuring element used in the top-hat transformation. Experiments are used to determine the value of L. In the experiments, images with different clutter backgrounds are used. Firstly, the size of the structuring element B is determined following the prior knowledge of the images used. Then, different values of L are used in our algorithm to enhance the dim targets. Experiments show that the algorithm is efficient if L is selected as follows:

$$L = (0.2\text{–}0.5) \times \mathrm{size}(B), \tag{16}$$
where, size(B) is the size of structuring element B. Because the target is small, the structuring element B can be square
or rhombus. size(B) should be determined following the prior knowledge of the target. Because the target is small, size(B) is small, usually smaller than 10. In this case, the image can be divided into three classes according to the values of GCM: the target region, the edge region of the target region, and the clutter background. Firstly, the target region has a small output in GCM because of the consistency of gray values inside the target region, and this output is smaller than the gray change of the region specified by Proposition 1. Secondly, the edge region of the target region has a large output because the gray values of the target region are larger than those of the surrounding clutter background. These large outputs mark out the target region. Thirdly, the output of the clutter background is small because the gray values of background pixels change slowly. So, large values of GCM correspond to the target region and small values of GCM correspond to the clutter background.
On the other hand, following Proposition 1, because the gray value in the target region is large and the gray value in the surrounding background is small, the gray change of the target region after opening, which is the difference between the gray values of the target region and the surrounding background, is larger than the gray change of the clutter background. Moreover, the window size L used to calculate GCM is smaller than the size of the structuring element B used in opening, and the large values in GCM are also differences between the gray values of the target region and the surrounding clutter background. So, large values in GCM approximate the gray change of target regions and are larger than the gray change of the clutter background.
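The gray change map of Eq. (15) can be sketched as a sliding-window maximum minus a sliding-window minimum. The function name `gray_change_map`, the restriction to odd L and the edge-replicating border handling are assumptions of this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gray_change_map(f, L=3):
    """GCM of Eq. (15): max minus min of f in an L x L window centred on each pixel.

    Assumes odd L; borders are edge-padded, which the paper does not specify.
    """
    r = L // 2
    padded = np.pad(np.asarray(f, dtype=float), r, mode="edge")
    win = sliding_window_view(padded, (L, L))
    return win.max(axis=(-2, -1)) - win.min(axis=(-2, -1))


# Toy image: flat background (20) with a 3 x 3 bright target (70).
image = np.full((9, 9), 20.0)
image[3:6, 3:6] = 70.0
gcm = gray_change_map(image, L=3)

interior = gcm[4, 4]    # window lies fully inside the target
edge = gcm[3, 3]        # window straddles target and background
background = gcm[0, 0]  # window lies fully in the background
```

In this toy case the target interior and the background both give a small GCM value, while the target edge gives a large one, which matches the three classes described above.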
Based on the analysis above, a value that can differentiate the large and small values in GCM can also differentiate the large and small gray changes of the original image after opening. Therefore, a judging value, which is not larger than the large values in GCM and is larger than the small values in GCM, can be obtained from GCM. Here, the mean value and variation of GCM are combined as shown in Eq. (17):

$$t = m + e \times s, \tag{17}$$

where m and s are the mean value and variation of GCM, and e is a parameter to modulate t. The heavier the clutter is, the larger e is. m and s are the first and second order moments of GCM. Using the first and second order moments of an image to calculate a value that differentiates its large and small values is an efficient approach [15]. Here, this technique is used to calculate the judging value t. Eq. (17) indicates that t, which combines the first and second order moments of GCM, is smaller than the large values in GCM and larger than the small values in GCM. Values in GCM correspond to the gray changes of pixels of the original image after opening. So, t is smaller than the large gray changes of the pixels of the original image and larger than the small gray changes. At the same time, the target region in the original image has a large gray change after the opening operation, and the clutter background has a small gray change after the opening operation. Therefore, t can be used as a judging value in the top-hat transformation to differentiate the target region from the clutter background by differentiating their gray changes.

4.2. Modified top-hat transformation

By importing t into the top-hat transformation, the whole procedure of the modified WTH is demonstrated in Fig. 2. In Fig. 2, i is the index of the pixel of the image and N is the total number of pixels in the image. Following Fig. 2, the modified WTH can be expressed as follows:

$$\mathrm{MWTH}(x, y) = \max\bigl(f(x, y) - (f \circ B)(x, y),\ t\bigr) - t, \tag{18}$$

where t is the judging value calculated from GCM. Similar to the WTH transformation, MWTH can be used to mark bright targets, such as small targets. MWTH indicates that not all the changed regions are potential target regions; only the brighter regions whose gray change is larger than t are. This suppresses most of the clutter background and enhances the dim target in the clutter background, which greatly decreases false alarms and simplifies target detection and tracking. Also, the modified BTH can be expressed as follows:

$$\mathrm{MBTH}(x, y) = \max\bigl((f \bullet B)(x, y) - f(x, y),\ t\bigr) - t. \tag{19}$$

MBTH possesses a property similar to that of MWTH, and can be used to enhance dark regions.
Fig. 2. Procedure of modified white top-hat transformation (MWTH).
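The procedure of Fig. 2 and Eqs. (15), (17) and (18) can be put together in a short NumPy sketch. All function names are hypothetical, a square structuring element and edge-replicating borders are assumed, and the "variation" s of GCM is taken to be its standard deviation, which is an interpretation rather than something the text states. The default parameters follow the values reported for the experiments (L = 3, size(B) = 5, e = 1.0).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(f, size):
    # All size x size neighbourhoods (odd size, edge-padded: an assumption).
    r = size // 2
    return sliding_window_view(np.pad(f, r, mode="edge"), (size, size))


def opening(f, size):
    eroded = _windows(f, size).min(axis=(-2, -1))      # erosion, Eq. (5)
    return _windows(eroded, size).max(axis=(-2, -1))   # dilation, Eq. (4)


def gray_change_map(f, L):
    win = _windows(f, L)
    return win.max(axis=(-2, -1)) - win.min(axis=(-2, -1))  # Eq. (15)


def judging_value(gcm, e=1.0):
    # Eq. (17): t = m + e * s, with s assumed to be the standard deviation.
    return gcm.mean() + e * gcm.std()


def mwth(f, se_size=5, L=3, e=1.0):
    f = np.asarray(f, dtype=float)
    t = judging_value(gray_change_map(f, L), e)
    wth = f - opening(f, se_size)        # classical WTH, Eq. (8)
    return np.maximum(wth, t) - t, t     # MWTH, Eq. (18)


# Toy image: slowly varying ramp background with a 2 x 2 target (+40).
image = np.tile(20.0 + np.arange(15.0), (15, 1))
image[7:9, 7:9] += 40.0
wth = image - opening(image, 5)
enhanced, t = mwth(image)
```

In this toy case the classical WTH leaves weak non-zero residue from the ramp near the right border, while MWTH keeps only the four target pixels because the residue is below t.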
Fig. 3. GCM and result of Fig. 1(a) after MWTH (L =3, t= 50.7118): (a) GCM of Fig. 1(a); (b) result of Fig. 1(a) by MWTH.
Obviously, if t = 0, the modified top-hat transformations (MWTH, MBTH) degenerate into the classical top-hat transformations (WTH, BTH). The GCM and the result of Fig. 1(a) after MWTH are shown in Fig. 3. The values of the edge region in GCM (Fig. 3(a)) are large and mark out the target region. Heavy clutter also brings some large values into GCM, which results in a larger s. Then, t (50.7118) is larger than the gray change of the clutter background and smaller than the gray change of the target region (51). So, all the clutter remaining in Fig. 1(e) is eliminated because of t, and the target region is retained in Fig. 3(b). Although the gray value of the target region is small (0.2882) in Fig. 3(b) after MWTH, the value can be enlarged through linear extension. Also, the clutter in Fig. 3(b) is suppressed compared with Fig. 1(e), and the gray values of the clutter will not be enlarged by linear extension because they are all 0. So, the target region will be greatly enhanced in the resulting image of MWTH.
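The arithmetic of this example can be reproduced in a few lines. The target gray change (51) and t (50.7118) are taken from the text; the three clutter gray changes are hypothetical values chosen below t, and linear extension is assumed here to be a min-max stretch to [0, 255].

```python
import numpy as np


def linear_extension(img, out_max=255.0):
    """Min-max stretch to [0, out_max] (an assumed form of linear extension)."""
    img = np.asarray(img, dtype=float)
    lo, hi = img.min(), img.max()
    if hi == lo:
        return np.zeros_like(img)
    return (img - lo) / (hi - lo) * out_max


t = 50.7118
# First entry: gray change of the target (from the text).
# Remaining entries: hypothetical clutter gray changes below t.
gray_change = np.array([51.0, 30.0, 12.5, 0.0])

mwth = np.maximum(gray_change, t) - t   # Eq. (18) applied to the WTH values
stretched = linear_extension(mwth)
```

The target keeps a small positive value (51 − 50.7118 = 0.2882) and is mapped to the top of the output range by the stretch, while the clutter entries stay at 0 both before and after it.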
4.3. Property analysis
4.3.1. Better performance of MWTH for target enhancement
Eq. (18) indicates that, if t is larger than 0, more clutter is eliminated by MWTH than by WTH. That means the performance of MWTH for potential target region identification is better than that of WTH no matter how small t is. So, a proper selection of t will greatly suppress clutter and clearly enhance the target. The parameter e in Eq. (17) is used to modulate t. Because both the clutter background and the bright region exist in the original image, the mean value m of GCM is larger than 0. So, no matter how small e is, t is larger than 0, and the performance of MWTH for potential target region identification is better than that of WTH. Therefore, the constraint on the selection of e is loose. In fact, experimental results on our image dataset show that setting e = 1.0 is efficient for almost all the images, and in this paper we set e = 1.0 for all the experiments. Moreover, if the image used is sensitive to e, e can be a very small value or even 0. In this case, t is still larger than 0 and MWTH remains efficient.
4.3.2. Advantages of importing t
If the region is a real target region, the value of the edge region of the bright region in GCM will be large because of the large gray values of the bright region and the small gray values of the surrounding region. So, large values in GCM mark out the bright region. t is calculated from GCM as the combination of the mean value and variation of GCM, which are the first and second order moments of the values of GCM. Therefore, the large values of GCM should be larger than t. That is, t will be smaller than the gray change of the real target regions. In this case, the real target region will be retained in the result image. If the region is a clutter background region, its gray value would be only a little brighter than the surrounding region, and the values in GCM corresponding to this region would be small. t, as the combination of the first and second order moments of the values of GCM, will be larger than these small values. In this case, the clutter background will be suppressed. So, t will suppress most of the slowly varying regions and decrease false alarms.
Meanwhile, t is subtracted from the image in Eq. (18). t is selected from GCM to differentiate the background from potential bright regions, and is calculated as a value to threshold the image. Thus, the subtraction of t acts as the subtraction of one special value from the image, which is one of the widely used operations for target enhancement, such as the temperature non-linearity removal in infrared images [15] and some kernel smoothing methods [2]. Eq. (19) indicates that MBTH possesses a similar property.
Therefore, the selection of t from GCM increases the efficiency of target enhancement of MWTH. Moreover, the subtraction of t combines the advantages of top-hat transformations and other target enhancement operations, which increases the efficiency of MWTH in identifying real potential target regions.
5. Experimental results
For easy detection and simpler post-processing, the enhancement of a small target should efficiently enhance the target (Section 5.1, target enhancement), greatly decrease the number of false alarms (Section 5.2, false alarm rejection) and suppress more clutter background (Section 5.3, clutter background suppression).
An infrared image dataset including more than 200 images with small targets is used in this paper. The images have different types of heavy clutter. The main types of clutter background in infrared images are caused by sensors, the atmosphere, and the environment around the target. The infrared imaging sensor we used to capture the images produces considerable noise, which appears as small non-flat clutter regions in the image. When ship images are captured, fog and water vapor over the sea heavily blur the target. Moreover, the environment around the target, such as buildings or sea waves, also produces many clutter regions in some of the images. So, the image dataset includes the main types of clutter, which is important for verifying the performance of our algorithm. The targets in our image dataset are aircraft in the sky or ships at sea. The size of the original images is 720 × 576. The parameters of MWTH in all the
Fig. 4. 3D target intensity plots of the original image and the enhanced results of various methods in a 33 × 33 window around the target. Rows 1 and 2 are the original images and their corresponding 3D target intensity plots. Rows 3 to 10 are the 3D target intensity plots of the enhanced results of MWTH, WTH, Max-median (3 × 3), Max-mean (3 × 3), Max-median (5 × 5), Max-mean (5 × 5), U-Kernel and E-Kernel.
experiments are the same: L = 3, size(B) = 5, e = 1.0. The shape of the structuring element used in the experiments is square. Using the same parameter values for different types of images indicates that our algorithm is not very sensitive to the parameters. For comparison, some widely used methods, including the classical top-hat transformation (WTH) [14,15], max-mean filter [1], max-median filter [1], U-Kernel smoothing method [2] and E-Kernel smoothing method [2], are used in the experiments, and their results are compared with those of our algorithm (MWTH).
5.1. Target enhancement
A dim target in a clutter background is difficult to detect. To efficiently detect the dim target, the target regions should be enhanced. So, the target enhancement performance of our algorithm is verified first in this experiment. Our algorithm and the comparison methods are applied to the images from the image dataset. Several example images and the corresponding results are shown in Fig. 4.
In Fig. 4, the first row shows the original images. The targets are labeled by squares. The targets in the original images are dim. Clutter caused by sea waves and fog exists in the first and third images, and clutter caused by the sky and the sensor system exists in the second image. The 3D target intensity plots of the original images and of the enhanced results of the various methods are shown in rows 2 to 10. The 3D target intensity plot is the 3D display of the gray values of the pixels in a 33 × 33 window around the target. The 3D target intensity plots of the original images are shown in the second row of Fig. 4. The enhanced result of MWTH is shown in the third row of Fig. 4. The enhanced results of WTH, Max-median (3 × 3), Max-mean (3 × 3), Max-median (5 × 5), Max-mean (5 × 5), U-Kernel and E-Kernel are shown in rows 4 to 10 of Fig. 4. Fig. 4 shows that the targets of the original images are located in clutter background.
Max-mean and max-median methods can efficiently suppress the clutter background, but the targets are also smoothed, which decreases the performance of these methods for dim target enhancement. E-Kernel and U-Kernel enhance the targets, but the clutter background is not efficiently suppressed. So, the performances of E-Kernel and U-Kernel are also inefficient. WTH enhances the targets, but its clutter background suppression is also inefficient. In fact, some clutter in the result images of WTH is also enhanced (see the results of the first and third images in the fourth row of Fig. 4). In contrast, MWTH efficiently enhances the targets, and almost all the clutter backgrounds are suppressed. So, MWTH is efficient for dim target enhancement in clutter.

Fig. 5. An example of an image with one true target and one false target. The true target is labeled by a square. The false target is labeled by an ellipse. The first row is the original image and the corresponding 3D target intensity plot of the real target in the original image. The enhanced results of MWTH, WTH, Max-median (3 × 3), Max-mean (3 × 3), Max-median (5 × 5), Max-mean (5 × 5), U-Kernel and E-Kernel and their corresponding 3D target intensity plots of the real targets in the enhanced images are illustrated in rows 2 to 9.

Fig. 5 is an example of an image with a true target and a false target. The true target is labeled by a square. The false target is labeled by an ellipse. In Fig. 5, the first row is the original image and the corresponding 3D target
1651
intensity plot of the real target in original images. The enhanced results of MWTH, WHT, Max-median(3 3), Max-mean(3 3), Max-median(5 5), Max-mean(5 5), U-Kernel and E-Kernel and the corresponding 3D target intensity plots of the real target in enhanced image are illustrated in rows from 2 to 9. To clearly show the
Table 1
LSBRs of the target after different methods (dB).

Method              Target 1  Target 2  Target 3  Target 4  Target 5  Target 6
Original image        0.1000    0.2689    0.3765    0.4133    0.7241    1.3706
MWTH                  7.8397    9.2529   10.7577   12.7783   15.3646   21.4988
WTH                   0.4177    0.6178    0.8283    0.6279    0.7134    1.7394
Max-median(3 × 3)     3.8669    3.4882    3.7248    5.8054    5.9431    5.9518
Max-median(5 × 5)     0.8286    0.5992    0.6636    0.8880    0.7929    1.3311
Max-mean(3 × 3)       0.6646    0.5511    0.6884    1.2352    1.0413    1.5951
Max-mean(5 × 5)       0.9259    1.0348    1.3181    1.5026    1.2697    2.8307
U-Kernel              0.3667    0.4561    0.6036    0.8011    4.0034    5.6554
E-Kernel              0.3861    0.4400    0.5717    0.8016    3.9620    5.2140
Fig. 6. PFA per pixel versus varying LSBR: (a) PFA per pixel versus varying LSBR in clutter images; (b) PFA per pixel versus varying LSBR in heavy clutter images.
enhanced results, the enhanced images in rows 2 to 9 of Fig. 5 are linearly stretched so that their gray values lie in the interval [0, 255]. Fig. 5 shows that WTH, U-Kernel and E-Kernel enhance the true target, but they cannot efficiently suppress the clutter background, and the false target is also enhanced. Max-mean and max-median methods can efficiently suppress the background and the false target, but the true target is also smoothed. MWTH efficiently enhances the true target and suppresses both the clutter background and the false target. This experiment shows the efficient performance of MWTH for true target enhancement in clutter background.

For a quantitative comparison, a measure is needed that describes the small target enhancement ability of an algorithm. Because a small target region is small in size and large in gray value, a small target usually behaves like noise in the image, which means the signal-to-noise ratio (SNR) is not suitable for this purpose. Because the enhancement ability depends on the signal intensities of both the target and the surrounding background, the local signal-to-background ratio (LSBR) [16] is more suitable. LSBR is defined as follows:

$$\mathrm{LSBR} = 10\log_{10}\left\{ \frac{1}{\sigma_b^2} \sum_{k=-W/2}^{W/2} \sum_{j=-W/2}^{W/2} \left[ I(x-k,\, y-j) - m_b \right]^2 \right\}, \tag{20}$$

where $\sigma_b^2$ is the variance and $m_b$ is the mean of the background in the window of width and height W around the pixel of interest (x, y). In this paper, W = 33. An efficient target enhancement algorithm enhances the target and suppresses the clutter background, which makes the local signal-to-background ratio large. So, a large LSBR indicates good target enhancement performance.

To demonstrate the target enhancement performance of MWTH, the LSBRs of small targets after the different algorithms are listed in Table 1. Table 1 shows that when the LSBR of the original target is small, most of the algorithms perform worse. However, the LSBR of the target enhanced by MWTH is higher than that of any other algorithm. Moreover, the LSBR of the target after MWTH is still very large when the LSBR of
the original target is very small, which means MWTH is very efficient for dim target enhancement in clutter background.

5.2. False alarm reduction

Because the modified top-hat transformation enhances the target, the LSBR of the target increases and the number of false alarms therefore decreases. To demonstrate the superiority of MWTH in false alarm reduction, a measure based on LSBR provided by Soni [16] is used: the probability of false alarms (PFA) per pixel versus input LSBR. For targets with a given LSBR, the image is enhanced by one algorithm. The enhanced result is then converted into a binary image using a threshold that is just low enough to correctly retain the real targets. The number of bright regions in the binary image that are not target regions is divided by the number of pixels in the image. The result of this division is the PFA per pixel of the enhancement algorithm at that LSBR.

Fig. 6(a) and (b) show how the false alarms per pixel of the resulting images vary as a function of LSBR for different clutter backgrounds, after processing by the different target enhancement algorithms. Fig. 6(a) shows that the performance of max-median and max-mean is sensitive to the window size, and that they perform even worse than the original image when the LSBR is large. U-Kernel and E-Kernel perform worse when the LSBR is small. MWTH performs better than any other algorithm regardless of LSBR. In Fig. 6(b), because of the heavy clutter, all the algorithms except MWTH perform worse than in Fig. 6(a), whereas MWTH again performs very well. This experiment demonstrates that MWTH is robust and efficient in false alarm reduction, and hence in dim small target enhancement, under the conditions of dim target intensity and heavy clutter.
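The two measures used in Sections 5.1 and 5.2 can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the evaluation code behind Table 1 or Fig. 6: `lsbr_db` is a literal reading of Eq. (20) in which the background statistics exclude a small central patch (the text does not say how the background region is delimited, so absolute values are not expected to reproduce Table 1), and `pfa_per_pixel` assumes 8-connected regions and takes the threshold as the weakest target response. All function and parameter names are hypothetical.

```python
from collections import deque

import numpy as np


def lsbr_db(image, x, y, W=33, target_half=1):
    """Literal reading of Eq. (20) in dB for the W x W window centred on (x, y).

    Assumptions: x is the column and y the row, the window lies fully inside
    the image, the background variance is non-zero, and m_b and sigma_b^2
    exclude a (2 * target_half + 1)^2 patch around the centre.
    """
    h = W // 2
    win = np.asarray(image, dtype=float)[y - h:y + h + 1, x - h:x + h + 1]
    background = np.ones(win.shape, dtype=bool)
    background[h - target_half:h + target_half + 1,
               h - target_half:h + target_half + 1] = False
    m_b = win[background].mean()
    var_b = win[background].var()
    return 10.0 * np.log10(np.sum((win - m_b) ** 2) / var_b)


def count_regions(binary):
    """Label 8-connected True regions with a breadth-first flood fill."""
    labels = np.zeros(binary.shape, dtype=int)
    n = 0
    for r, c in zip(*np.nonzero(binary)):
        if labels[r, c]:
            continue
        n += 1
        labels[r, c] = n
        queue = deque([(r, c)])
        while queue:
            i, j = queue.popleft()
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    a, b = i + di, j + dj
                    if (0 <= a < binary.shape[0] and 0 <= b < binary.shape[1]
                            and binary[a, b] and not labels[a, b]):
                        labels[a, b] = n
                        queue.append((a, b))
    return labels, n


def pfa_per_pixel(enhanced, targets):
    """False alarms per pixel; targets is a list of (row, col) positions."""
    enhanced = np.asarray(enhanced, dtype=float)
    # Assumed threshold rule: the weakest target response, so that every
    # real target just survives the binarisation.
    threshold = min(enhanced[r, c] for r, c in targets)
    labels, n = count_regions(enhanced >= threshold)
    target_labels = {labels[r, c] for r, c in targets}
    return (n - len(target_labels)) / enhanced.size


# Toy data: checkerboard clutter, with and without a bright centre pixel.
clutter = (np.indices((40, 40)).sum(axis=0) % 2) * 2.0
scene = clutter.copy()
scene[20, 20] += 50.0
print(lsbr_db(clutter, 20, 20), lsbr_db(scene, 20, 20))

demo = np.zeros((40, 40))
demo[20, 20] = 10.0   # real target response
demo[5, 5] = 12.0     # residual clutter brighter than the target
demo[30, 8] = 3.0     # weak residue below the threshold
pfa = pfa_per_pixel(demo, targets=[(20, 20)])
print(pfa)  # one false region over 1600 pixels
```

Counting regions rather than pixels follows the description above, where the numerator is the number of non-target bright regions. The connectivity rule is a choice made here for illustration.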
5.3. Clutter background suppression

Less residual background remaining in the enhanced image fT leads to better target enhancement and fewer false alarms in fT. This means more accurate
Table 2
Comparison of clutter background suppression by using MARB.

Method              Image 1  Image 2  Image 3  Image 4  Image 5  Image 6
MWTH                 0.0003   0.0019   0.0020   0.0027   0.0396   0.0434
WTH                  0.5752   0.2887   0.4722   0.7449   5.6029   3.0896
Max-median(3 × 3)    0.0178   0.0045   0.0141   0.0252   0.0908   0.0579
Max-median(5 × 5)    0.0286   0.0097   0.0244   0.0403   0.4391   0.2028
Max-mean(3 × 3)      0.0406   0.0186   0.0356   0.0563   0.3616   0.1878
Max-mean(5 × 5)      0.0522   0.0241   0.0452   0.0723   0.5559   0.3230
U-Kernel             0.5908   0.4072   0.4815   0.7158   2.2208   1.9853
E-Kernel             0.5300   0.3537   0.4271   0.6376   1.9508   1.8485
estimation of the clutter background (fb) results in better clutter background suppression by the algorithm, which in turn enhances the dim target well. To demonstrate the clutter background suppression performance of the algorithms, the mean absolute value of residual background (MARB) [17] is defined and used to measure the residual background remaining in fT:

$$\mathrm{MARB} = \frac{\sum_{x,y} \left| F_b(x,y) - f_b(x,y) \right|}{L_w \times L_h}, \tag{21}$$
where Fb is the background of the original image; fb is the estimated background corresponding to the image fT enhanced by the different algorithms; and Lw and Lh are the width and height of the image, respectively. If the clutter background is estimated more accurately, less residual background remains in the enhanced image, and the difference between Fb and fb is smaller. This leads to a small MARB. Therefore, a smaller MARB indicates better clutter background suppression. To calculate MARB, more than 200 images from the image dataset are used. The MARBs of the different algorithms on some images are listed in Table 2. Because of the importing and subtracting of t in MWTH, more of the clutter background is estimated as background
and removed from fT than with WTH. So, the MARB of MWTH is smaller than that of WTH. The MARB of MWTH is also smaller than the MARBs of the other methods, which means that MWTH suppresses the clutter background better than the other methods.

To show the performance of MWTH on more images, the MARBs of the different algorithms are calculated on images from the image dataset, and the mean MARB of each algorithm is computed. These mean values are illustrated in Fig. 7. Fig. 7 shows that the mean MARB of MWTH is the smallest, which means that MWTH gives the best clutter background suppression among these algorithms. This clearly improves the performance of MWTH for dim target enhancement.

This experiment demonstrates the efficient performance of MWTH for dim small target enhancement through clutter background suppression. Moreover, the target enhancement in Section 5.1 and the false alarm reduction in Section 5.2 of MWTH are better than those of the other methods. All of these indicate that the performance of MWTH is robust and efficient.

5.4. Comparison of computation time

Several infrared images of size 128 × 128 are used to analyze the computation time of the different methods (CPU: Intel Pentium 4, 2.6 GHz; memory: 512 MB). The average times of the different algorithms are listed in Table 3. Because the target is small, the structuring element used in MWTH is also small, which makes the morphological operations in MWTH fast. However, computing the mean and variance of the GCM takes some time, so the computation time of MWTH is a little larger than that of WTH, U-Kernel and E-Kernel. Nevertheless, MWTH enhances targets better than WTH, U-Kernel and E-Kernel. Furthermore, a well-designed implementation of the algorithm and the use of a hardware system would clearly decrease the computation time of MWTH.
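The per-algorithm summaries reported in Sections 5.3 and 5.4 (mean MARB over a set of images and average run time) could be gathered with a small harness like the one below. It is a sketch under assumptions: `estimator` is a hypothetical callable that returns the estimated background fb for an image, the true backgrounds Fb are taken as given (here they come from synthetic data), and the trivial median estimator stands in for the paper's algorithms, which are not reimplemented.

```python
import time

import numpy as np


def marb(true_background, estimated_background):
    """Eq. (21): mean absolute difference between F_b and f_b over the image."""
    F_b = np.asarray(true_background, dtype=float)
    f_b = np.asarray(estimated_background, dtype=float)
    return np.abs(F_b - f_b).sum() / (F_b.shape[0] * F_b.shape[1])


def summarize(estimator, images, true_backgrounds):
    """Mean MARB and mean run time (s) of one background estimator.

    `estimator` is a hypothetical callable mapping an image to its estimated
    background f_b; it is a placeholder, not one of the compared methods.
    """
    scores, times = [], []
    for image, F_b in zip(images, true_backgrounds):
        start = time.perf_counter()
        f_b = estimator(image)
        times.append(time.perf_counter() - start)
        scores.append(marb(F_b, f_b))
    return float(np.mean(scores)), float(np.mean(times))


# Synthetic 128 x 128 frames: flat background of 50 plus one bright pixel.
backgrounds = [np.full((128, 128), 50.0) for _ in range(3)]
frames = []
for F_b in backgrounds:
    frame = F_b.copy()
    frame[64, 64] += 30.0
    frames.append(frame)


def median_estimator(image):
    """Placeholder estimator: a constant image at the global median."""
    return np.full_like(image, np.median(image))


mean_marb, mean_time = summarize(median_estimator, frames, backgrounds)
print(mean_marb, mean_time)
```

Timing only the estimator call, and not the MARB computation, keeps the run-time figure comparable across estimators. Measured values depend on the machine and implementation and are not comparable to Table 3.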
Therefore, MWTH could be used in quasi real-time target detection and tracking applications.

Fig. 7. Comparison of the mean values of the MARBs of different algorithms.

6. Conclusions
Heavy clutter and dim target intensity greatly increase the difficulty of small target detection. To make target detection easier, the dim target should be enhanced. In this paper, a simple and efficient target enhancement method, a modified top-hat transformation, is proposed. Firstly, the GCM is constructed
Table 3
Comparison of computation time (s).

Method              Time (s)
MWTH                0.330
WTH                 0.018
Max-median(3 × 3)   2.290
Max-mean(3 × 3)     1.890
Max-median(5 × 5)   2.326
Max-mean(5 × 5)     1.948
U-Kernel            0.020
E-Kernel            0.022
following the property of the target region and the propositions of top-hat transformations. Secondly, a judging value is calculated from the GCM. After importing the judging value into the top-hat transformations, the modified top-hat transformations are formed. Finally, the resulting image, in which the dim small target is clearly enhanced, is obtained through the modified top-hat transformation. The judging value in the modified top-hat transformations is calculated following the property of the target region, which efficiently suppresses the clutter and false alarms. The subtraction of the judging value in the modified top-hat transformations also suppresses some background and false alarms. Therefore, by importing the properties of the small target region into the morphological transformation, the proposed modified top-hat transformation is an effective dim target enhancement algorithm. In addition, because of the highly parallel nature of mathematical morphology and the small computation time, our algorithm could be implemented in a real-time hardware system. Comparative analysis with other target enhancement algorithms reveals its superiority over other widely used methods in target enhancement, false alarm reduction and clutter background suppression.

The main operation in this paper is the top-hat transformation. The modification of the top-hat transformation is based on the region smoothing of the morphological opening operation. The top-hat transformation and the opening operation do not need any information on the statistical properties of the image clutter. Besides, the infrared small target images used in this paper were captured under various conditions, such as in fog, at night and in complicated environments, which results in very different clutter backgrounds in different images.
More importantly, the experimental results on images with very different clutter backgrounds show that our algorithm performs very well, and even better than some other widely used methods. All of these indicate that the performance of our algorithm is not very sensitive to the statistical properties of the clutter. So, our algorithm is robust with respect to different clutter and could be used on a wide range of images with different clutter statistical properties. This would be meaningful for dim small target detection.

Moreover, the purpose of small target enhancement is to enhance the small target and suppress the clutter background, so that the target can be detected easily, which significantly improves the detection capability of target detection algorithms. So, efficient enhancement of the dim target indicates efficient detection capability. Our algorithm can efficiently enhance the dim target and performs better than some widely used methods. Therefore, our algorithm for dim target enhancement will clearly improve the detection capability of target detection algorithms. All of these indicate that our algorithm can be widely used for dim small target enhancement and detection.
Acknowledgments

The authors would like to thank the anonymous reviewers for their very constructive comments and suggestions. This work is partly supported by the National Natural Science Foundation of China (60902056) and Aeronautical Science Foundation of China (20090151007) and Innovation Foundation of Beijing University of Aeronautics and Astronautics for Ph.D. Students. The authors also would like to thank Dr. Yan Li at School of Geology and Space Science in Peking University, Beijing, China for many helpful suggestions and discussions.

References

[1] S.D. Deshpande, M.H. Er, V. Ronda, P. Chan, Max-mean and max-median filters for detection of small-targets, Proceedings of SPIE 3809 (1999) 74–83.
[2] S. Leonov, Nonparametric method for clutter removal, IEEE Transactions on Aerospace and Electronic Systems 37 (3) (2001) 832–848.
[3] T. Arodz, M. Kurdziel, T.J. Popiela, E.O.D. Sevre, D.A. Yuen, Detection of clustered microcalcifications in small field digital mammography, Computer Methods and Programs in Biomedicine 81 (2006) 56–65.
[4] B. Zhang, T. Zhang, K. Zhang, Z. Cheng, Z. Cao, Adaptive rectification filter for detecting small IR targets, IEEE A&E Systems Magazine 22 (8) (2007) 20–26.
[5] T. Zhang, Z. Zuo, W. Yang, X. Sun, Moving dim point target detection with three-dimensional wide-to-exact search directional filtering, Pattern Recognition Letters 28 (2) (2007) 246–253.
[6] C.E. Cafer, J. Silverman, J.M. Mooney, Optimization of point target tracking filters, IEEE Transactions on Aerospace and Electronic Systems 36 (1) (2000) 15–25.
[7] P.A. Ffrench, J.R. Zeidler, W.H. Ku, Enhanced detectability of small objects in correlated clutter using an improved 2-D adaptive lattice algorithm, IEEE Transactions on Image Processing 6 (3) (1997) 383–397.
[8] X. Jin, C.H. Davis, Vehicle detection from high-resolution satellite imagery using morphological shared-weight neural networks, Image and Vision Computing 25 (2007) 1422–1431.
[9] P. Wang, J.W. Tian, C.Q. Gao, Infrared small target detection using directional highpass filters based on LS-SVM, Electronics Letters 45 (3) (2009) 156–158.
[10] B. Moghaddam, A. Pentland, Probabilistic visual learning for object representation, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (9) (1997) 696–710.
[11] Z. Liu, X. Shen, C. Chen, Small objects detection in image data based on probabilistic visual learning, in: Proceedings of Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, China, 2005, pp. 5517–5521.
[12] P. Soille, Morphological Image Analysis: Principles and Applications, Springer-Verlag, Berlin, Germany, 2003.
[13] S. Halkiotis, T. Botsis, M. Rangoussi, Automatic detection of clustered microcalcifications in digital mammograms using mathematical morphology and neural networks, Signal Processing 87 (2007) 1559–1568.
[14] M. Zeng, J. Li, Z. Peng, The design of top-hat morphological filter and application to infrared target detection, Infrared Physics and Technology 48 (2006) 67–76.
[15] F. Zhang, C. Li, L. Shi, Detecting and tracking dim moving point target in IR image sequences, Infrared Physics and Technology 46 (2005) 323–328.
[16] X. Bai, F. Zhou, Y. Xie, New class of top-hat transformation to enhance infrared small targets, Journal of Electronic Imaging 17 (3) (2008) 0305011–0305013.
[17] X. Bai, F. Zhou, Y. Xie, T. Jin, Enhanced detectability of point target using adaptive morphological clutter elimination by importing the properties of the target region, Signal Processing 89 (2009) 1973–1989.