Optics and Lasers in Engineering 50 (2012) 131–139
Data field-based transition region extraction and thresholding

Tao Wu (a, c), Kun Qin (b)
a State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430079, China
b School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
c School of Information Science and Technology, Zhanjiang Normal University, Zhanjiang 524048, China
Article history: Received 25 June 2011; received in revised form 12 September 2011; accepted 30 September 2011; available online 14 October 2011.

Abstract
Thresholding is a popular image segmentation method that converts a gray level image into a binary image. In this paper, we propose a data field-based method for transition region extraction and thresholding, which involves three major steps: generating the image data field, deriving the transition region by comparing potential values, and calculating the threshold from the transition region. The image data field effectively represents the spatial interactions of neighboring pixels, and its potential value is a more robust measurement of gray level change. In addition, we introduce a fully automatic scheme for parameter selection. The approach is validated both quantitatively and qualitatively: compared with existing related methods on a variety of synthetic and real images, with or without noise, the experimental results suggest that the presented method is efficient and effective.

© 2011 Elsevier Ltd. All rights reserved.
Keywords: Data field; Transition region; Image segmentation; Image thresholding
1. Introduction

Great interest has been shown in image segmentation, which serves a variety of applications such as image classification [1], iris segmentation and recognition [2–4], and salient object extraction [5]. A number of image segmentation approaches have been proposed and have proven successful in many applications, but none of them is generally applicable to all images, and different algorithms are usually not equally suitable for any particular application. In spite of several decades of investigation, image segmentation remains a challenging research topic.

Thresholding is one of the most important and effective techniques for image segmentation, and it plays a key role when segmenting images with distinctive gray levels corresponding to object and background [6]. Many techniques and performance evaluation metrics for thresholding have been developed over the years; comprehensive overviews and comparative studies can be found in the literature [7–10]. Sezgin et al. [10] provide a recent survey in which the Otsu method [11], the Kapur method [12], and Minimum Error Thresholding (MET) [13] are taken as state-of-the-art algorithms. However, selecting a threshold that ensures both quality and speed still requires considerable effort. Thresholding is sensitive to noise and suffers from spatial uncertainty, since pixel location and neighborhood information are usually ignored by thresholding methods [14].
Thus, drawing on knowledge from other fields to overcome these drawbacks should be helpful. For example, some innovative methods for image segmentation inspired by the physical world have appeared. Sun et al. [15] introduce a low-level edge detection algorithm based on the law of gravitation: each pixel is treated as a celestial body whose mass is represented by its grayscale intensity, and each celestial body exerts forces on its neighboring pixels and in return suffers forces from them, all calculated by the law of gravity. Lopez-Molina et al. [16] calculate the gravitational forces using triangular norms (t-norms) instead of the product operation in [15], and then present an extension that studies t-conorms as substitutes for t-norms in the generalized gravitational approach to edge detection [17]. Similarly, Wang et al. [18] present an approach for edge detection based on the theory of electrostatic fields. However, none of these methods focuses on image thresholding.

Recently, transition region-based image thresholding has received some attention [19–25]. It is an intermediate approach between edge detection and region extraction, since transition regions have both edge and region characteristics. Transition regions have a certain pixel width and a non-zero area; they lie between the object and the background and surround the objects. The main idea of transition region-based image thresholding is to measure changes in an image according to a specific criterion, extract the transition regions by choosing an appropriate threshold, and finally set a segmentation threshold corresponding to the peak or mean of the transition region histogram. Liu et al. [23] review the transition region-based methods and point out that the quality of transition region extraction directly influences the accuracy of the optimal threshold and the quality of the segmentation result.
Among the existing transition region-based methods, gradient-based methods are the most classical, such as the effective average gradient-based (EAG) method [19], which is a historical standard. Non-gradient-based methods include the local entropy (LE) method [22] and the gray level difference (GLD) method [25]. The GLD method generates a gray level difference image over a predefined neighborhood and extracts transition regions by appropriate thresholding; the final segmentation threshold is determined as the mean grayscale value of the pixels in the transition region. Li et al. [25] analyze the advantages of the gray level difference for determining the transition region, and the method exhibits better performance than EAG and LE in experiments.

However, the GLD method has some drawbacks, and its results are unsatisfactory or even questionable in some cases. First, GLD fails to represent gray level changes in the neighborhood, since the measurement proposed by Li et al. [25] cannot reflect the difference between the gray level of the central pixel and that of each neighboring pixel. Second, GLD cannot accurately capture the extent of gray level changes in the neighborhood: given different pixels with the same grayscale value, the gray level difference between these pixels and the central pixel may be exactly the same, while the extent of gray level change is not always consistent. The absolute spatial difference should be taken into account, but the GLD method neglects this. We therefore believe that an improved descriptor for the transition region is necessary.

This paper proposes the image data field to represent local gray level changes in the neighborhoods of transition regions, and provides a novel approach for transition region-based image thresholding. The image data field is developed by simulating a short-range nuclear force. The main ideas are: (1) each image pixel is a data particle with mass and interacts with its neighboring pixels; (2) the potential sum at any pixel is calculated according to the law of a short-range nuclear force field; (3) transition regions are extracted by an appropriate potential threshold; (4) the final segmentation threshold is determined as the mean grayscale value of the pixels in the transition region. Compared with two related methods (LE and GLD) and three classical methods (the Otsu method, the Kapur method, and MET), the performance of the proposed method is demonstrated on a variety of images, with and without noise. Experimental results show the effectiveness and efficiency of the proposed method.

The rest of the paper is organized as follows. Section 2 proposes a novel algorithm for image thresholding and presents the algorithm analysis, including parameter setup and computational complexity. Section 3 shows experimental results and provides some discussion. Finally, conclusions are drawn in Section 4.
2. The image data field-based method

Li [26] regards each data object as a particle with mass in a data space, and uses data fields to describe the complex correlations among data objects, where effects and interactions exist in an unknown way. We introduce the data field and construct a novel mechanism for transition region extraction. Similar to the approaches of Sun et al., Lopez-Molina et al., and Wang et al. [15,16,18], the mechanism encourages homogeneous pixels to cluster and then separates the transitional pixels from the homogeneous regions. This idea enables us to make an analogy with the mechanism of nuclear field theory: in a nuclear field, the nuclear force binds protons and neutrons together to form the nucleus of an atom. Similarly, we take each pixel as a particle whose mass relates to the grayscale difference in a certain neighborhood.
Each pixel receives and exerts attraction or repulsion from and on other pixels, and the magnitude of the attraction or repulsion is determined by the corresponding potential value: the higher the potential value, the greater the magnitude of the repulsion. We refer to this mechanism as the image data field, and take the corresponding potential value as an indication of gray level change with which to extract transition regions.

2.1. Data field

The data field [26] is clearly the key technique. Given a data object x in a data space Ω, let φ_x(y) be the potential at any position y ∈ Ω produced by x; then φ_x(y) can be computed by any one of the following equations:
\varphi_x(y) = m_x \exp\!\left(-\left(\frac{\lVert x-y\rVert}{\sigma}\right)^{k}\right) \qquad (1)

\varphi_x(y) = \frac{G\, m_x}{1+\left(\lVert x-y\rVert/\sigma\right)^{k}} \qquad (2)

\varphi_x(y) = \frac{m_x}{4\pi\varepsilon_0\left(1+\left(\lVert x-y\rVert/\sigma\right)^{k}\right)} \qquad (3)
where ‖x−y‖ is the distance between x and y, the interaction strength m_x ≥ 0 can be regarded as the mass or charge of the data object, the natural number k is the distance index, and σ ∈ (0, +∞) is the influence factor that indicates the range of interaction. The distance is usually measured by the Euclidean, Manhattan, or Chebyshev metric; in this paper we choose the Chebyshev distance for ‖x−y‖. Eqs. (1)–(3) are three common choices of potential function. Eq. (1) imitates a nuclear field with Gaussian potential, while Eqs. (2) and (3) imitate the gravitational field and the electrostatic field respectively, where G and ε₀ are the constants given by the law of gravitation and Coulomb's law. Mathematically, therefore, the latter two are essentially the same, and we only compare Eqs. (1) and (2) in the next subsection. In addition, we note that there are several alternative formulae for φ_x(y), such as an electromagnetic field, a temperature field, or a nuclear field with exponential potential.

In general, there is more than one object in the data space. To obtain the potential at any position under these circumstances, all interactions from the data objects should be considered. Given a data set D = {x₁, x₂, ..., x_n}, by superposition the potential at any position y in the data space is the sum of all contributions:
\varphi(y) = \sum_{i=1}^{n} \varphi_{x_i}(y) \qquad (4)
where φ_{x_i}(y) is calculated by one of Eqs. (1)–(3).

2.2. Image data field

Suppose P = {p = (p_x, p_y) | p_x ∈ [0, w] ∧ p_y ∈ [0, h] ∧ p_x, p_y ∈ ℤ} is a finite space of two-dimensional pixels and f : P → [0, L−1] is a mapping; an image is then a pair I = ⟨P, f⟩, where ℤ denotes the set of integers and h, w, and L are the height, width, and number of gray levels of the image, respectively. Inspired by the data field, each pixel p ∈ P is treated as a particle with mass, and the grayscale-change interactions (attraction or repulsion) between pixels form an image data field on P.

Because there are several alternative formulae for φ_x(y), we need to choose an appropriate one for image segmentation. The nuclear force field corresponding to Eq. (1) is a short-range interaction in the physical world [27]; conversely, the gravitational field corresponding to Eq. (2) is a long-range interaction, and the field intensity of the former attenuates rapidly as the interaction distance increases. To examine how the field form and a given σ affect the potential attenuation, we fix m_x = 1 and k = 2 in Eqs. (1) and (2) and vary σ over the cases σ ∈ {1, 2, 5}; the resulting attenuation curves are compared in Fig. 1(a).
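For illustration, the attenuation behavior of Eqs. (1) and (2) can be reproduced with a short NumPy sketch; the function names and the choice G = 1 are assumptions made purely for the shape comparison in Fig. 1(a), not part of the method itself.

```python
import numpy as np

def nuclear_potential(d, sigma, m=1.0, k=2):
    # Eq. (1): Gaussian-type, short-range potential
    return m * np.exp(-(d / sigma) ** k)

def gravitational_potential(d, sigma, m=1.0, k=2, G=1.0):
    # Eq. (2): long-range potential; G is set to 1 here only to compare curve shapes
    return G * m / (1.0 + (d / sigma) ** k)

d = np.linspace(0.0, 10.0, 201)
for sigma in (1.0, 2.0, 5.0):
    # e.g. at d = 2*sigma the nuclear potential has dropped to exp(-4) ~ 0.018,
    # while the gravitational potential is still 1/5 = 0.2
    print(sigma,
          nuclear_potential(2 * sigma, sigma),
          gravitational_potential(2 * sigma, sigma))
```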
Fig. 1. Comparisons of the potential functions with varied forms or parameters: (a) attenuation curves of the nuclear force field and the gravitational field with σ = 1, 2, 5 (m_x = 1, k = 2); (b) attenuation curves of the nuclear force field with k = 1, 2, 3, 5, 10 (m_x = 1, σ = 1).
For the same σ, the potential of the nuclear force field attenuates much more rapidly than that of the gravitational field. This short-range behavior is better suited to describing grayscale-change relationships in an image neighborhood, since two pixels should have relatively low relevance if the distance between them is large. Furthermore, we compare the influence of k on the potential. The attenuation curves of the nuclear force field for various k are shown in Fig. 1(b), where m_x = 1, σ = 1, and k ∈ {1, 2, 3, 5, 10}. A smaller k implies a longer attenuation range, but this does not mean that a very large k should be chosen: if k is too large, all interactions are nearly identical as long as the pixel distance is within σ. As shown in Fig. 1(b), the potential values of the nuclear field with k = 10 are almost equal for interaction distances ‖x−y‖ within 1. In the extreme case, Eq. (1) with k → ∞ becomes approximately equivalent to a nuclear field with a square potential. Hence a slightly large k is favorable; in the following we use the nuclear force field with k = 3.

Finally, the mass is also significant for the image data field. Three kinds of mass are defined in theoretical physics: inertial mass, active gravitational mass, and passive gravitational mass; the mass m_x in Eq. (1) refers to the active gravitational mass of x acting on y. Here we provide a measurement related to the grayscale difference between the central pixel and its neighboring pixels. To adaptively capture local information, the mass of a pixel changes with its local neighborhood. This mass also determines the influence of the pixel over the field: the greater the local grayscale difference, the higher the potential value. Thus, the active gravitational mass acting on pixel p by pixel q is

m_{pq} = \lvert f(q) - f(p) \rvert \qquad (5)
where f(p) and f(q) are the grayscale values of p and q, respectively.

To sum up, we now state the precise definition of our image data field. Given the two-dimensional space P, each pixel acts on the others through interaction forces and forms an image data field. For two pixels p, q ∈ P, let φ_q(p) be the potential at pixel p produced by q; then φ_q(p) can be computed by
\varphi_q(p) = \lvert f(q) - f(p) \rvert \, \exp\!\left( -\left( \frac{\max(\lvert p_x - q_x \rvert, \lvert p_y - q_y \rvert)}{\sigma} \right)^{3} \right) \qquad (6)
The above potential simulates the short-range nuclear force field with a Gaussian-type potential. The Gaussian function satisfies the three-sigma rule; thus, in the image data field, the influence range of a data object is the geometrical neighborhood within a distance of 3σ/√2. Beyond this distance, data objects are hardly influenced by the specified object and the potential values are essentially zero, as can be seen from Fig. 1(b). Therefore, the potential of any pixel p is

\varphi(p) = \sum_{q \in \omega(p)} \lvert f(q) - f(p) \rvert \, \exp\!\left( -\left( \frac{\max(\lvert p_x - q_x \rvert, \lvert p_y - q_y \rvert)}{\sigma} \right)^{3} \right) \qquad (7)

where ω(p) = {q | q ∈ P ∧ ‖p−q‖ ≤ 3σ/√2} denotes the pre-specified neighborhood of p.
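A direct NumPy sketch of Eq. (7) is given below; the function name `potential_field` and the edge-padding border treatment are our own choices, not prescribed by the method.

```python
import numpy as np

def potential_field(img, sigma, k=3):
    """Potential phi(p) of Eq. (7): superposed |f(q) - f(p)| contributions, weighted by
    exp(-(d/sigma)^k) with d the Chebyshev distance, over a square neighborhood of
    radius round(3*sigma/sqrt(2))."""
    img = img.astype(np.float64)
    h, w = img.shape
    r = max(1, int(round(3 * sigma / np.sqrt(2))))
    padded = np.pad(img, r, mode='edge')          # border handling: replicate edge pixels
    phi = np.zeros_like(img)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy == 0 and dx == 0:
                continue                           # a pixel does not act on itself
            d = max(abs(dy), abs(dx))              # Chebyshev distance to the neighbor
            weight = np.exp(-(d / sigma) ** k)
            neighbor = padded[r + dy:r + dy + h, r + dx:r + dx + w]
            phi += np.abs(neighbor - img) * weight
    return phi
```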
2.3. The connection between the image data field and the transition region

2.3.1. A case study

Intuitively, once σ is assigned, the image data field describes the grayscale variation in a neighborhood: the higher the potential of p, the larger the variation. For example, given the 256 × 256 rice image in Fig. 2(a), an image data field is generated. Fig. 2(b) shows the equipotential lines produced by this image data field, where the two dimensions denote the horizontal and vertical pixel positions. As can be seen in Fig. 2(b), the potential values inside the rice grains and in the background regions are lower than those in the transition regions.

In physics, scalar fields often describe the potential energy associated with a particular force; the force is a vector field that can be obtained as the gradient of the potential energy scalar field. A particle exerts forces in all directions and receives forces from its neighboring particles. For a detailed analysis, we cut out a sub-block of Fig. 2(a) (bordered with a red rectangle); the equipotential lines and field lines are shown in Fig. 2(c) and (d), and a combined map is shown in Fig. 2(e), in which the magnitude of the force is denoted by the length of the line. In Fig. 2(c) it is clear that higher equipotential lines surround the transition pixels. In Fig. 2(d) the forces between particles in the current form are repulsive; near the edge or in transition regions the force rises, otherwise it falls, and the force field points toward the edge.
Fig. 2. Equipotential map for an image data field: (a) original image, (b) equipotential lines on the whole image, (c) equipotential lines on a sub-image (bordered with a red rectangle in (a)), (d) field lines on the sub-image, (e) combined map on the sub-image. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Additionally, we note that attraction and repulsion are twin concepts: as one falls, the other rises. Similar to a nuclear field, the forces of the image data field bind homogeneous pixels together to form homogeneous regions. That is to say, in homogeneous regions the force field is uniform, pixels attract each other and group into a cluster; in transition regions the force field is non-uniform, pixels are repulsed by the homogeneous pixels and are thereby separated from the homogeneous regions. From Fig. 2(e) one can see that repulsion between two pixels indicates a higher potential value, whereas attraction corresponds to a lower potential, even zero. Therefore, the corresponding potential value is an indication of the force type (attraction or repulsion) and, furthermore, it can serve as a measurement of gray level change, which is useful for extracting transition regions.
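The qualitative picture in Fig. 2(d) and (e) can be probed numerically by taking the gradient of the potential surface; the sketch below is only an illustration of this relation, and the returned sign convention of the components is our own choice.

```python
import numpy as np

def force_field(phi):
    # The force is a vector field derived from the potential surface phi of Eq. (7);
    # its magnitude is large near edges / transition pixels and small in homogeneous regions.
    gy, gx = np.gradient(phi.astype(np.float64))   # partial derivatives along rows and columns
    magnitude = np.hypot(gx, gy)
    return gx, gy, magnitude
```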
2.3.2. Relations among mass, distance, potential value, and field force

To understand the potential and the force in the image data field in more detail, we investigate the influence of the gravitational mass and the interaction distance. Eq. (7) shows that: (1) at the same distance, the more massive pixel produces the higher potential; (2) for the same mass, the longer the distance, the lower the potential; (3) the gravitational mass is more dominant than the interaction distance. Table 1 lists the possible cases of their relations. The force is attraction when the mass is small (regardless of the distance), and the two pixels should be homogeneous, while the two should be heterogeneous if the mass is great and the distance is short. For a single interaction one may not be able to distinguish Case 4 from Case 1 accurately, but once the neighboring pixels are considered and the potential values are superposed, differences between Case 1 and Case 4 emerge.
Table 1
Relations among mass, distance, potential value, and field force.

Possible case   Mass acting on p by q   Distance between p and q   φ_q(p)        Interaction force
1               Small                   Short                      Low           Attraction
2               Small                   Long                       Low           Attraction
3               Great                   Short                      High          Repulsion
4               Great                   Long                       Likely high   Repulsion
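The four cases of Table 1 can be illustrated by evaluating Eq. (6) for a single pixel pair; the gray level differences (10 and 200), the distances (1 and 2), and σ = 2 below are hypothetical numbers chosen only to make the qualitative pattern visible.

```python
import numpy as np

def pair_potential(mass, d, sigma=2.0, k=3):
    # Eq. (6) for one pixel pair: phi_q(p) = |f(q) - f(p)| * exp(-(d/sigma)^k)
    return mass * np.exp(-(d / sigma) ** k)

cases = {1: (10, 1), 2: (10, 2), 3: (200, 1), 4: (200, 2)}   # (mass, Chebyshev distance)
for case, (mass, d) in cases.items():
    print(case, round(pair_potential(mass, d), 1))
# roughly 8.8, 3.7, 176.5, 73.6: a small mass gives a low potential regardless of distance,
# a great mass at short range gives a high potential, and a great mass at a longer (but
# still in-range) distance keeps a fairly high value, matching Cases 1-4 of Table 1.
```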
2.4. Transition region extraction

The image data field has two important features with respect to the transition region: (1) pixels in the interior of a homogeneous region have low potential values, even zero, which indicates that all the pixels in the neighborhood have the same grayscale; (2) pixels near edge points have higher potential values. Hence the potential value is related to the grayscale change in the neighborhood: the sharper the change, the higher the potential. In other words, we can estimate the possibility that a pixel belongs to a transition region by comparing the potential value of the central pixel in a neighborhood. An appropriate threshold T can then be set to detect the pixels in the candidate transition regions, that is,

R = \{\, p = (p_x, p_y) \in P \mid \varphi(p) \ge T \,\} \qquad (8)
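Given a potential map φ from Eq. (7) and a potential threshold T (for example from the Rosin method described in Section 2.5 below), the transition region of Eq. (8) and the resulting segmentation threshold reduce to a few array operations. The sketch below is an illustration only; the object/background polarity of the final comparison is an assumption.

```python
import numpy as np

def transition_region_threshold(img, phi, T):
    """Eq. (8): R = {p in P : phi(p) >= T}; the segmentation threshold g_opt is the mean
    gray level of the transition pixels (Step 4 of the algorithm in Section 2.6)."""
    transition = phi >= T                  # boolean mask of the transition region R
    g_opt = img[transition].mean()         # mean gray level over R
    binary = img >= g_opt                  # assumed polarity: bright objects, dark background
    return transition, g_opt, binary
```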
2.5. The automatic scheme for parameter selection

The proposed method requires the choice of two parameters: the influence factor σ in Eq. (7) and the potential threshold T in Eq. (8).
For the influence factor σ, Li [26] has proposed a general method (the Li method for short) that is also suitable for the image data field. The Li method introduces the potential entropy to evaluate whether or not the potential distribution fits the sample data, and then searches for the optimal influence factor with an iterative method that minimizes the potential entropy. When the sample data set is large, random sampling is adopted to reduce the time cost.

For the potential threshold T, we introduce a method based on the potential histogram. Rosin [28] has proposed a threshold for unimodal histograms (the Rosin method for short). The Rosin method assumes that there is one dominant population in the image that produces one main peak located at the lower end of the histogram relative to the secondary population; this also applies to the potential histogram. In Fig. 3, the threshold divides the histogram into two classes: the first is mostly constituted by homogeneous pixels, while the second is constituted by transition pixels. A straight line is drawn from the peak to the high end of the histogram, and the threshold is selected as the histogram index that maximizes the perpendicular distance between this line and the histogram (see Fig. 3).

Fig. 3. Rosin's method for the potential threshold.
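A minimal sketch of the Rosin threshold applied to the potential histogram is given below; the number of bins and the use of the last non-empty bin as the "high end" are our own implementation choices.

```python
import numpy as np

def rosin_threshold(phi, bins=256):
    """Rosin's unimodal threshold: draw a line from the dominant histogram peak to the
    high end of the histogram, and return the bin value that maximizes the perpendicular
    distance between the line and the histogram."""
    hist, edges = np.histogram(phi.ravel(), bins=bins)
    peak = int(np.argmax(hist))
    end = int(np.nonzero(hist)[0][-1])               # last non-empty bin (the "high end")
    x1, y1 = float(peak), float(hist[peak])
    x2, y2 = float(end), float(hist[end])
    xs = np.arange(peak, end + 1, dtype=float)
    ys = hist[peak:end + 1].astype(float)
    # unnormalized point-to-line distance; the constant denominator does not affect argmax
    dist = np.abs((y2 - y1) * xs - (x2 - x1) * ys + x2 * y1 - y2 * x1)
    return edges[peak + int(np.argmax(dist))]        # potential threshold T
```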
2.6. The proposed algorithm

The proposed algorithm for transition region-based image thresholding is as follows:

Step 1: Given the original image, calculate the influence factor σ according to the Li method [26].
Step 2: Generate the image data field and calculate the potential values by Eq. (7).
Step 3: Compute the potential threshold T using the Rosin method [28], and extract the transition regions by Eq. (8).
Step 4: Calculate the mean grayscale value of the pixels in the transition region and take it as the segmentation threshold (denoted g_opt).
Step 5: Segment the image with the threshold g_opt.

2.7. Computational complexity

Apart from parameter selection, the main time costs of the algorithm lie in Steps 2–5. In Step 2, generating the image data field takes time of order O((2e+1)²·h·w), where e = round(3σ/√2) and round(·) rounds a number to the nearest integer; in general, (2e+1)² ≪ h·w. Steps 3 and 4 take time O(N_t), where N_t is the number of pixels in the transition regions, and N_t is less than h·w. In Step 5, each pixel is scanned once, which takes about O(h·w). Additionally, the Li method and the Rosin method for parameter selection cost no more than O(N_t). Therefore, the time complexity of the proposed algorithm is approximately linear in the size of the original image (h·w).
3. Experimental results

In order to illustrate the performance of the algorithm, both synthetic and real images, with and without Gaussian white noise contamination, are considered. We conduct three groups of experiments with other existing techniques. In the first two groups, the LE method [22] and the GLD method [25] are also implemented in the MATLAB 2007b environment; both LE and GLD have two parameters, and for a fair comparison the parameters of all three methods, including the influence factor σ and the potential threshold T of our method, are chosen by guess in these two groups. In addition, we compare the proposed method with three state-of-the-art algorithms, the Otsu method [11], the Kapur method [12], and MET [13], on non-destructive testing (NDT) images; in this last group all methods select their parameters automatically. All experiments are performed on a 2.3 GHz dual-core PC with 2 GB RAM.

3.1. Experiments with synthetic images

It is usually desirable to test a thresholding algorithm on synthetic images for which the ideal threshold can be identified directly [29]. The 256 × 256 synthetic image named gearwheel is chosen for the first group of experiments. The original image and its segmentation results are shown in Fig. 4. As visible in the histogram in Fig. 4(b), this image can be segmented with an optimal threshold of about 100. The three methods obtain similar results, but only the proposed method extracts an effective and accurate transition region; by comparison, the transition regions extracted by the other two methods are too rough or discontinuous, which is likely to influence the quality of the segmentation results.

To investigate the performance of the new algorithm in a noisy environment, Gaussian white noise of zero mean, with variances from 0.01 to 0.50 in steps of 0.01, is added to the gearwheel image. We quantify the performance of the methods by means of the misclassification error (ME) [30] and Baddeley's Delta Metric (BDM) [31]. Considering image segmentation as a pixel classification process, the percentage of misclassified pixels is a measure of discrepancy. ME reflects the percentage of background pixels wrongly assigned to the foreground and, conversely, of foreground pixels incorrectly assigned to the background. For the two-class segmentation problem it can be expressed as [30]

\mathrm{ME} = 1 - \frac{\lvert B_o \cap B_t \rvert + \lvert F_o \cap F_t \rvert}{\lvert B_o \rvert + \lvert F_o \rvert} \qquad (9)

where the background and foreground are denoted by B_o and F_o for the ground-truth image, and by B_t and F_t for the test image; |B_o ∩ B_t| is the number of background pixels correctly assigned to the background, and |F_o ∩ F_t| the number of foreground pixels correctly assigned to the foreground; |·| is the cardinality of a set. ME varies from 0 to 1, and 0 means a perfectly classified image.
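A direct NumPy sketch of Eq. (9), assuming Boolean foreground masks for the ground-truth and test images:

```python
import numpy as np

def misclassification_error(gt_fg, test_fg):
    """Eq. (9): ME = 1 - (|Bo ∩ Bt| + |Fo ∩ Ft|) / (|Bo| + |Fo|), where gt_fg and
    test_fg are Boolean foreground masks of the ground-truth and test images."""
    gt_bg, test_bg = ~gt_fg, ~test_fg
    agreed = np.logical_and(gt_bg, test_bg).sum() + np.logical_and(gt_fg, test_fg).sum()
    return 1.0 - agreed / gt_fg.size       # |Bo| + |Fo| equals the total number of pixels
```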
Fig. 4. Transition region extraction and thresholding for gearwheel: (a) original image; (b) histogram; (c) transition region extracted by the LE method; (d) segmentation result obtained by the LE method (g_opt = 128); (e) transition region extracted by the GLD method; (f) segmentation result obtained by the GLD method (g_opt = 85); (g) transition region extracted by the proposed method; (h) segmentation result obtained by the proposed method (g_opt = 90).
BDM [31] is an error measure for binary images based on the Hausdorff distance. A binary image takes a binary value at each pixel p ∈ P, interpreted as background or foreground, so the ground-truth image and the test image can be uniquely identified by their sets of foreground pixels F_o and F_t. BDM reflects the discrepancy between the ground-truth image and the test image with respect to both misclassification error and location error. In our experiments we use the following specific form of BDM [31]:

\mathrm{BDM} = \Delta_2(F_o, F_t) = \left( \frac{1}{h\,w} \sum_{p \in P} \bigl\lvert \min(c, d(p, F_o)) - \min(c, d(p, F_t)) \bigr\rvert^{2} \right)^{1/2} \qquad (10)

where d(p, F_o) denotes the shortest distance from p ∈ P to F_o ⊆ P, and c ≥ 0 is a cut-off value; we use c = 5 following Baddeley [31]. Lower values of BDM mean that the segmented image is more similar to the ground-truth image.
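Eq. (10) can be evaluated with a Euclidean distance transform; the sketch below uses SciPy's distance transform to obtain d(p, F) for each image, with c = 5 as in the experiments.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def baddeley_delta(gt_fg, test_fg, c=5.0):
    """Eq. (10): Baddeley's Delta Metric between two binary images given as Boolean
    foreground masks; d(p, F) is the distance from pixel p to the nearest pixel of F."""
    d_gt = distance_transform_edt(~gt_fg)      # distance to nearest ground-truth foreground pixel
    d_test = distance_transform_edt(~test_fg)  # distance to nearest test foreground pixel
    diff = np.minimum(c, d_gt) - np.minimum(c, d_test)
    return float(np.sqrt(np.mean(diff ** 2)))
```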
Since Gaussian noise contamination is a random process, we repeat the experiment ten times for each variance and average the results. The overall results for the gearwheel image with zero-mean Gaussian white noise are shown in Fig. 5; to view the details more closely, the same curves restricted to the interval [0, 0.1] are also shown. From the ME curves in Fig. 5(b), for variances not greater than 0.1 the proposed method obtains the lowest ME values and performs better than the LE and GLD methods, with GLD second and LE worst owing to its largest ME values; the curves in Fig. 5(a) show that this advantage of the proposed method decreases gently as the variance increases. On the other hand, the BDM curves in Fig. 5(c) show that the proposed method always obtains the lowest BDM values over the whole range of noise variances; since the BDM measure considers both misclassification error and location error, it spreads the results over a relatively wide range, and here the LE method yields moderate results and the GLD method the worst. Comparing the curves in Fig. 5, the GLD method is even more sensitive to noise contamination than the LE method. In terms of timing, the proposed method takes nearly the same time as the GLD method and much less than the LE method: for this group of experiments, the average running times of LE, GLD, and the proposed method are 15.123 s, 1.655 s, and 1.241 s, respectively. In practice, the proposed method produces a result within 1.5 s for a 256 × 256 image.

3.2. Experiments with laser cladding images

In this section, three laser cladding images, named Laser 1, Laser 2, and Laser 3, are involved in the experiments. Laser cladding by powder injection is an advanced material processing technique with several industrial applications [32,33]: a laser beam melts powder and a thin layer of the substrate to create a layer on the substrate, and a reliable feedback system for closed-loop control is critical to this process. We therefore conduct this group of experiments with laser cladding images. Quantitative comparisons of the segmentation results yielded by the various methods are listed in Table 2, which shows that our results correspond to lower ME values, lower BDM values, and fewer misclassified pixels; in other words, the proposed method yields better segmentations. The results are displayed in Fig. 6, from which one can observe that our results are closest to the ground-truth images.

3.3. Experiments with NDT images

In NDT applications, thresholding is again often the first critical step in a series of processing operations such as morphological filtering, measurement, and statistical assessment [10]. To investigate the performance on complex images, a small NDT dataset is involved in this group of experiments. The test data consist of 25 NDT images with ground-truth images [10]. Sample images and representative results are shown in Fig. 7. The results provided by the proposed method are compared with those yielded by three thresholding techniques widely used in the literature: the Otsu method, the Kapur method, and MET. Sezgin et al. [10] summarize five metrics for thresholding performance, namely ME, Edge MisMatch (EMM), region Non-Uniformity (NU), Relative foreground Area Error (RAE), and shape distortion penalty via the Modified Hausdorff Distance (MHD).
Fig. 5. The overall results for the gearwheel image with Gaussian white noise of zero mean: (a) ME curve; (b) ME curves in interval [0,0.1]; (c) BDM curve; (d) BDM curves in interval [0,0.1].
Table 2
Thresholds, numbers of misclassified pixels, values of ME, values of BDM, and running times obtained by applying the various methods to the laser cladding images.

Laser 1                    LE        GLD       Proposed method
  Threshold                201       136       211
  Misclassified pixels     2519      10748     1769
  ME value                 0.0384    0.164     0.027
  BDM value                0.854     1.984     0.672
  Running time (s)         15.058    1.751     0.749

Laser 2                    LE        GLD       Proposed method
  Threshold                188       107       199
  Misclassified pixels     1497      12631     806
  ME value                 0.0228    0.1927    0.0123
  BDM value                0.607     2.189     0.372
  Running time (s)         15.081    1.711     0.664

Laser 3                    LE        GLD       Proposed method
  Threshold                196       124       202
  Misclassified pixels     2167      12970     1654
  ME value                 0.0331    0.1979    0.0252
  BDM value                0.782     2.188     0.645
  Running time (s)         15.089    1.616     0.616

These performance measures are adjusted so that their scores vary from 0 for a totally correct segmentation to 1 for a totally erroneous case [10]. The comparison is based on two combined performance measures, FEM [34] and AVE [10]: FEM is a fuzzy evaluation measure of the first four metrics, and AVE is the arithmetic average of all five. They are defined as follows [34,10]:

\mathrm{FEM} = \bigl( \mu_{\mathrm{ME}}(x)\,\mathrm{ME} + \mu_{\mathrm{EMM}}(x)\,\mathrm{EMM} + \mu_{\mathrm{NU}}(x)\,\mathrm{NU} + \mu_{\mathrm{RAE}}(x)\,\mathrm{RAE} \bigr)/4, \qquad
\mathrm{AVE} = \bigl( \mathrm{ME} + \mathrm{EMM} + \mathrm{NU} + \mathrm{RAE} + \mathrm{MHD} \bigr)/5 \qquad (11)
where μ_ME(x), μ_EMM(x), μ_NU(x), and μ_RAE(x) are obtained using the S-function in [34], and ME, EMM, NU, RAE, and MHD are the scores of the five metrics. In addition, another measure based on Baddeley's Delta Metric, i.e. BDM [31], is also adopted. The overall evaluation results on the NDT images are given in Fig. 8. For FEM, our method ranks first, followed by Kapur, MET, and Otsu. For AVE, Kapur ranks first, followed by MET, the proposed method, and Otsu; it should be noted that the differences in these scores are not very large. For BDM, the results of the four methods are spread over a relatively wide range; even so, our method still obtains the second-best result, which is very close to the best one, obtained by MET. Overall, the results indicate that our method is effective compared with the three classical algorithms.
Fig. 6. Transition region extraction and thresholding for laser cladding images. From top to bottom, the rows show the results for Laser 1, Laser 2, and Laser 3, respectively. For each row, the first two columns are the original image and its ground-truth image; the next three columns are the segmentation results obtained by the LE method, the GLD method, and the proposed method, respectively.
Fig. 7. Image thresholding for sample NDT images. For each row, the first two columns are the original image and its ground-truth image; the last four columns are the segmentation results obtained by Kapur, Otsu, MET, and the proposed method, respectively.
3.4. Discussion
Fig. 8. The combined performance measures over 25 NDT images.
To sum up, we draw the following conclusions from the experiments:

(1) For images with and without noise, the LE method yields acceptable results, but it is too time-consuming and fails to extract effective transition regions in some cases.

(2) For images without noise contamination, the GLD method generally obtains preferable results, but it produces extremely poor results on the laser cladding images. For images contaminated by noise, the GLD method cannot provide effective segmentation results, since the gray level difference captures grayscale changes only partially and is easily disturbed by noise.

(3) For images without noise contamination, the proposed method obtains results as good as those of the GLD method; in addition,
T. Wu, K. Qin / Optics and Lasers in Engineering 50 (2012) 131–139
for laser cladding images, the proposed method yields results that are vastly preferable to those obtained by the GLD method. For noisy images, the proposed method still performs robustly, and even compared with the state-of-the-art algorithms it shows good performance. Moreover, the proposed method is efficient: for a 256 × 256 image, the time consumed is usually not more than 1 s in our practice.
4. Conclusion

In this paper, a new image data field-based method for transition region extraction and thresholding has been proposed. The image data field is generated by analogy with the short-range nuclear force field in the physical world, and the potential value in the image data field serves as a measurement of the grayscale changes in the corresponding image. Compared with related methods on a variety of synthetic and real images, experimental results show that images can be segmented accurately, effectively, and efficiently by the new technique. The extension of the technique to images under uneven lighting conditions is currently under investigation and will be reported later.
Acknowledgements

The authors would like to thank all the anonymous reviewers for their valuable comments and thoughtful suggestions, which improved the quality of the presented work. Furthermore, the authors wish to thank Zuoyong Li for kindly providing the code for GLD. This work was partially supported by the National Natural Science Foundation of China under Grant No. 60875007, and by the National Key Basic Research and Development Program under Grant No. 2007CB311003.

References

[1] Fernández A, Álvarez MX, Bianconi F. Image classification with binary gradient contours. Opt Lasers Eng 2011;49(9–10):1177–84.
[2] Roy K, Bhattacharya P, Suen CY. Iris segmentation using variational level set method. Opt Lasers Eng 2011;49(4):578–88.
[3] Belcher C, Du Y. Region-based SIFT approach to iris recognition. Opt Lasers Eng 2009;47(1):139–47.
[4] Basit A, Javed M. Localization of iris in gray scale images using intensity gradient. Opt Lasers Eng 2007;45(12):1107–14.
[5] Liu Z, Shen L, Zhang Z. Unsupervised image segmentation based on analysis of binary partition tree for salient object extraction. Signal Process 2011;91(2):290–9.
[6] Li Z, Liu C, Liu G, Cheng Y, Yang X, Zhao C. A novel statistical image thresholding method. AEU Int J Electron Commun 2010;64(12):1137–47.
[7] Sahoo P, Soltani S, Wong A. A survey of thresholding techniques. Comput Vision Graphics Image Process 1988;41(2):233–60.
[8] Pal NR, Pal SK. A review on image segmentation techniques. Pattern Recognition 1993;26(9):1277–94.
[9] Zhang Y. A survey on evaluation methods for image segmentation. Pattern Recognition 1996;29(8):1335–46.
[10] Sezgin M, Sankur B. Survey over image thresholding techniques and quantitative performance evaluation. J Electron Imag 2004;13(1):146–65.
[11] Otsu N. A threshold selection method from gray-level histogram. IEEE Trans Syst Man Cybern 1979;9(1):62–6.
[12] Kapur J, Sahoo P, Wong A. A new method for gray-level picture thresholding using the entropy of the histogram. Comput Graphics Image Process 1985;34(11):273–85.
[13] Kittler J, Illingworth J. Minimum error thresholding. Pattern Recognition 1986;19(1):41–7.
[14] Yang X, Zhao W, Chen Y, Fang X. Image segmentation with a fuzzy clustering algorithm based on ant-tree. Signal Process 2008;88(10):2453–62.
[15] Sun G, Liu Q, Liu Q, Ji C, Li X. A novel approach for edge detection based on the theory of universal gravity. Pattern Recognition 2007;40(10):2766–75.
[16] Lopez-Molina C, Bustince H, Fernandez J, Couto P, De Baets B. A gravitational approach to edge detection based on triangular norms. Pattern Recognition 2010;43(11):3730–41.
[17] Lopez-Molina C, Bustince H, Fernandez J, Couto P, De Baets B. On the use of t-conorms in the gravity-based approach to edge detection. In: Ninth international conference on intelligent systems design and applications. Pisa, Italy: IEEE; 2009. p. 1347–52.
[18] Wang Z, Quan Y. A novel approach for edge detection based on the theory of electrostatic field. In: Proceedings of the international symposium on intelligent signal processing and communication systems. Xiamen, China: IEEE; 2007. p. 260–3.
[19] Zhang Y, Gerbrands JJ. Transition region determination based thresholding. Pattern Recognition Lett 1991;12(1):13–23.
[20] Groenewald AM, Barnard E, Botha EC. Related approaches to gradient-based thresholding. Pattern Recognition Lett 1993;14(7):567–72.
[21] Zhang Y. Transition region and image segmentation. Acta Electron Sin 1996;24(1):12–7.
[22] Yan C, Sang N, Zhang T. Local entropy-based transition region extraction and thresholding. Pattern Recognition Lett 2003;24(16):2935–41.
[23] Liu S, Yang J. Survey on extraction methods of transition region. Chin Eng Sci 2007;9(9):89–96.
[24] Hu Q, Luo S, Qiao Y, Qian G. Supervised grayscale thresholding based on transition regions. Image Vision Comput 2008;26(12):1677–84.
[25] Li Z, Liu C. Gray level difference-based transition region extraction and thresholding. Comput Electr Eng 2009;35(5):696–704.
[26] Li D, Du Y. Artificial intelligence with uncertainty. Boca Raton: Chapman and Hall/CRC; 2007.
[27] Van Kolck U. Effective field theory of nuclear forces. Prog Part Nucl Phys 1999;43:337–418.
[28] Rosin PL. Unimodal thresholding. Pattern Recognition 2001;34(11):2083–96.
[29] Bazi Y, Bruzzone L, Melgani F. Image thresholding based on the EM algorithm and the generalized Gaussian distribution. Pattern Recognition 2007;40(2):619–34.
[30] Arifin AZ, Asano A. Image segmentation by histogram thresholding using hierarchical cluster analysis. Pattern Recognition Lett 2006;27(13):1515–21.
[31] Baddeley A. An error metric for binary images. In: Robust computer vision: quality of vision algorithms. Karlsruhe: Wichmann Verlag; 1992.
[32] Tizhoosh HR. Image thresholding using type-2 fuzzy sets. Pattern Recognition 2005;38(12):2363–72.
[33] Bustince H, Barrenechea E, Pagola M, Fernandez J, Sanz J. Comment on: "Image thresholding using type-2 fuzzy sets". Importance of this method. Pattern Recognition 2010;43(9):3188–92.
[34] Sezgin M, Sankur B. Selection of thresholding methods for non-destructive testing applications. In: Proceedings of the 2001 international conference on image processing. Thessaloniki, Greece: IEEE; 2001. p. 260–3.