Automatic image segmentation based on PCNN with adaptive threshold time constant


Neurocomputing 74 (2011) 1485–1491


Shuo Wei, Qu Hong, Mengshu Hou
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, PR China


Abstract

Article history: received 24 June 2010; received in revised form 24 November 2010; accepted 15 January 2011; available online 24 February 2011. Communicated by M.T. Manry.

PCNN is a novel neural network model that simulates the synchronous phenomenon in the visual cortex of mammals. It has been widely used in image processing and pattern recognition. However, there are still some limitations when it is applied to image processing problems, such as trial-and-error parameter settings and manual selection of the final results. This paper studies a simplified model of PCNN (S-PCNN) and applies it to image segmentation. The main contributions of this paper are: (1) a new method based on the simplified model of PCNN is proposed to segment images automatically; (2) the parameter settings are studied to ensure that the threshold decay of S-PCNN is adjusted adaptively according to the overall characteristics of the image; (3) based on the time series of S-PCNN, a simple selection criterion for the final results is presented to improve the efficiency of the proposed method; (4) simulations are carried out to illustrate the performance of the proposed method.

Keywords: image segmentation; PCNN; parameter adjusting; adaptive threshold decay; time series

1. Introduction

In the late 1980s, during their study of the synchronous oscillation phenomenon in the visual cortex of cats, Eckhorn et al. introduced a mammalian neuron model, the Eckhorn neuron model [1]. This model was further developed by Johnson [2] and eventually evolved into the pulse coupled neural network (PCNN). PCNN is reported to possess a strong capability for solving image processing problems, e.g., image segmentation [3,4], image fusion [5,6] and object detection [7,8]. Many studies have applied PCNN to image segmentation. Kuntimad derived conditions that guarantee perfect segmentation of an image when the intensity ranges of adjacent regions overlap, and proposed an inhibition receptive field to reduce the extent of this overlap [3]. Karina applied PCNN to segment land from water in satellite images [4]. Hiroaki introduced inhibitory connections into the traditional PCNN; each pixel has three neurons associated with it, responsible for the R, G and B colors, respectively. This model was also used to segment color images in [9]. Gu brought forward a unit-linking PCNN model for image segmentation, which significantly reduced the number of parameters in the model [10].

Since the traditional PCNN involves too many parameters, many researchers have focused on simplified PCNN models and have sought automatic methods to adjust those parameters [11,12]. Meanwhile, since each iteration generates a binary output and the final result is often selected manually, the model is inconvenient to use. In this paper, we propose a new method to set the parameters of PCNN so that the decay speed of the threshold is adjusted adaptively. A simple new output selection standard based on the time series is also discussed. Experiments are carried out, and comparisons with other segmentation methods show the usefulness and high segmentation accuracy of the proposed method. The remainder of the paper is organized as follows. A brief description of the traditional PCNN and its simplified version is presented in Section 2. The proposed parameter-setting method, the new criterion for selecting the final results, and the new image segmentation algorithm are described in Section 3. Examples and comparisons with traditional segmentation methods are given in Section 4. Finally, conclusions are drawn in Section 5.

2. PCNN model

This work was supported by the National Science Foundation of China under Grant 60905037 and the Specialized Research Fund for the Doctoral Program of Higher Education of China under Grant 200806141049. Corresponding author: Q. Hong, Tel.: +86 28 61831670, e-mail: [email protected].

doi:10.1016/j.neucom.2011.01.005

2.1. Traditional PCNN model

As illustrated in Fig. 1, a typical PCNN neuron consists of three parts: the input field, the modulation field and the pulse generator [3].


Fig. 1. The structure of a PCNN neuron. Fig. 2. The structure of an S-PCNN neuron.

Each neuron receives signals from both an external source S_ij and other neurons through the input field. The signals reach the neuron through two different channels: F_ij through the feeding channel and L_ij through the linking channel. The input field can be seen as a leaky integrator, and it simulates the dendritic part of the corresponding biological neuron [4]. In the modulation field, signals from the linking and feeding channels are combined in a nonlinear way into the internal activity U_ij, which simulates the electric potential of the excited biological neuron [9]. As its name implies, the last field takes charge of the pulse-generating activity, firing [8]. It uses an adaptive threshold variable θ_ij to control the firing event. This threshold operates as a step function. Generally, a neuron fires only when its internal activity is larger than (in some cases, equal to or larger than) the value of the threshold. If a neuron fires, its threshold rises immediately to a very high value, preventing the neuron from firing again. The time interval between two firing events of a specific neuron is called the refractory period. If the neuron does not fire, its threshold decays exponentially over time until it becomes smaller than the neuron's internal activity, which then enables the neuron to fire again [2]. The behavior of a single pulse coupled neuron (PCN) can be described as [2]

F_{ij}[n] = e^{-\alpha_F} F_{ij}[n-1] + V_F \sum_{(k,l) \in N(i,j)} m_{ijkl} Y_{kl}[n-1] + S_{ij},

L_{ij}[n] = e^{-\alpha_L} L_{ij}[n-1] + V_L \sum_{(k,l) \in N(i,j)} w_{ijkl} Y_{kl}[n-1],

2.2. Simplified version

Although the PCNN model can effectively simulate the synchronous firing phenomenon, it is hard to use because it involves too many parameters. Therefore, simplified models are more often used to solve practical problems. In this paper, we consider the following simplified model:

F_{ij}[n] = S_{ij},  (1)

L_{ij}[n] = \sum_{(k,l) \in N(i,j)} w_{ijkl} Y_{kl}[n-1],  (2)

U_{ij}[n] = F_{ij}[n] (1 + \beta L_{ij}[n]),  (3)

Y_{ij}[n] = \begin{cases} 1 & \text{if } U_{ij}[n] > \theta_{ij}[n-1], \\ 0 & \text{otherwise}, \end{cases}  (4)

\theta_{ij}[n] = e^{-\alpha_\theta} \theta_{ij}[n-1] + V_\theta Y_{ij}[n-1].  (5)

The simplified model is illustrated in Fig. 2. The suitable selection of parameters is critical when PCNN is applied to image segmentation, and to date the relations between the various parameters are still not completely clear; experiments and experience play important roles. In this simplified model, only four parameters are preserved, namely w, β, α_θ and V_θ, which greatly facilitates its application while maintaining the key features of PCNN.

U_{ij}[n] = F_{ij}[n] (1 + \beta L_{ij}[n]),

Y_{ij}[n] = \begin{cases} 1 & \text{if } U_{ij}[n] > \theta_{ij}[n-1], \\ 0 & \text{otherwise}, \end{cases}

\theta_{ij}[n] = e^{-\alpha_\theta} \theta_{ij}[n-1] + V_\theta Y_{ij}[n-1],

where ij (1 ≤ i ≤ M, 1 ≤ j ≤ N, with M and N representing the length and width of the network, respectively) stands for the position of a neuron in the network, N(i,j) refers to the specified neighborhood of neuron (i,j), kl stands for the position of a neuron belonging to N(i,j), F_ij and L_ij are as described before, with α_F and α_L their associated time constants, U_ij and θ_ij are also as mentioned above, S_ij represents the external source, Y_ij is the pulse output, β is the linking strength, m_ijkl and w_ijkl represent the constant synaptic weights from neuron (k,l) to neuron (i,j), and α_θ and V_θ are the threshold decay time constant and the normalization constant, respectively.
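As a concrete illustration, one iteration of the traditional update equations above can be sketched in NumPy. This is an illustrative sketch only: the 3×3 inverse-distance weight kernel and the default parameter values are our own assumptions, not values from the paper.

```python
import numpy as np

def pcnn_step(S, F, L, U, Y, theta,
              alpha_F=0.1, alpha_L=0.3, alpha_theta=0.2,
              V_F=0.5, V_L=0.2, V_theta=20.0, beta=0.2):
    """One iteration of the traditional PCNN; all arguments are M x N arrays.

    The 3x3 kernel below plays the role of both synaptic weight sets
    m_ijkl and w_ijkl (an assumed choice for illustration).
    """
    # 3x3 neighborhood weights: inverse Euclidean distance, center excluded
    kernel = np.array([[0.5, 1.0, 0.5],
                       [1.0, 0.0, 1.0],
                       [0.5, 1.0, 0.5]])
    # Weighted sum of the previous iteration's pulses over each neighborhood
    pad = np.pad(Y, 1)
    nbr = sum(kernel[a, b] * pad[a:a + Y.shape[0], b:b + Y.shape[1]]
              for a in range(3) for b in range(3))
    F = np.exp(-alpha_F) * F + V_F * nbr + S            # feeding channel
    L = np.exp(-alpha_L) * L + V_L * nbr                # linking channel
    U = F * (1.0 + beta * L)                            # internal activity
    Y_new = (U > theta).astype(float)                   # pulse output, uses theta[n-1]
    theta = np.exp(-alpha_theta) * theta + V_theta * Y  # decay/reset, uses Y[n-1]
    return F, L, U, Y_new, theta
```

Note that, following the equations above, the threshold update uses the previous pulse output Y[n-1], while the new pulses are compared against the previous threshold θ[n-1].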

3. Parameter adjusting and output selection standard

In this section, two important improvements are introduced to further facilitate the use of PCNN in image segmentation: an adaptive threshold decay time constant, and a selection standard for the final results. The former spares the user from having to set this particular parameter anew for every image; the latter makes it possible for the model to select the final output image automatically.

3.1. Adaptive threshold decay time constant

In existing methods for image segmentation using PCNN, different images may require quite different parameters. Therefore, an adaptive parameter-adjusting method is needed. In this subsection, we introduce a method to adjust the parameter α_θ adaptively according to the overall features of the target image.

Researchers have found that the subjective feeling of the human eye is logarithmically related to the actual light intensity [13], as described by

S = K \log_{10} B + K_0,  (6)

where B is the actual light intensity, S is the subjective feeling of light, and K and K_0 are two constants. This phenomenon implies that light and dark images should be treated with different parameters. To further illustrate it, the corresponding curve is shown in Fig. 3. In a high-intensity area, the intensity difference a human perceives is smaller than the actual difference, so a slower decay speed is more suitable for segmenting the parts of the image. Conversely, for a relatively dark image, a faster decay speed is needed to obtain a complete segmentation. To simulate this phenomenon, the average gray level μ is first computed; then an inverse correlation between the average gray level and the threshold decay time constant is established as

\alpha_\theta = C / \mu,  (7)

where C is a constant, μ is the average gray level, and α_θ is the threshold decay time constant. Note that this method does not deal with the other parameters, so they still have to be determined manually. But later experiments show that the same values of those parameters can be applied to many different images once the threshold decay time constant is adjusted with our method.

Fig. 3. The relationship between the subjective feeling of light and the actual light intensity.

3.2. Output selection standard

With our proposed method, the parameter α_θ can be adjusted adaptively, which greatly increases the flexibility of PCNN in dealing with different images. However, since each iteration of PCNN generates a binary image, the final result has to be selected from many candidates. This selection is often done manually, which not only makes the model hard to use but also impossible to implement in hardware, since it involves human judgment. Thus, a selection standard must be employed. Entropy is often used in existing selection standards for the resulting images. Here, we propose a quite simple standard based on the time series of PCNN and show its advantage over the entropy standard.

3.2.1. Image entropy
Shannon introduced the concept of entropy in 1948, establishing the measuring standard of information [14]. Following this concept, image entropy has been adopted in image processing [15]:

H = -P_1 \log_2 P_1 - P_0 \log_2 P_0,  (8)

where P_1 and P_0 denote the proportions of 1s and 0s in the output binary image, respectively. The output image with the largest entropy is selected as the final segmentation result. The hypothesis underlying this standard is that the final binary image should contain as much information as possible [16]. In the following, we show that the effect of this standard is equivalent to that of our time-series standard.

3.2.2. Time series
Lindblad [13] introduced the time series of the PCNN model:

G[n] = \sum_{(i,j) \in V(NET)} Y_{ij}[n],  (9)

where V(NET) is the collection of all neurons in the network. It is usually used to perform feature extraction or target recognition tasks. Here, we present a variant of the time series and use it as the PCNN output selection standard:

G_{min}[n] = \min(G[n], 1 - G[n]).  (10)
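Both the adaptive decay constant of Eq. (7) and the selection index Gmin of Eq. (10) are one-line computations; a minimal sketch (the function names are ours, and C = 0.16 is the value reported later in Table 1):

```python
import numpy as np

def adaptive_alpha_theta(image, C=0.16):
    """Eq. (7): alpha_theta = C / mu, with mu the average gray level.

    `image` is a gray-level array scaled to [0, 1]. A darker image gives a
    smaller mu, hence a larger alpha_theta and a faster threshold decay.
    """
    return C / float(image.mean())

def g_min(G):
    """Eq. (10): Gmin = min(G, 1 - G), for a G normalized to [0, 1]."""
    return min(G, 1.0 - G)
```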

Fig. 4. Curve graphs of (a) H, (b) Gmin and (c) H over Gmin.


G[n] is normalized as

G[n] = \sum_{(i,j) \in V(NET)} Y_{ij}[n] / (MN),  (11)

where M and N refer to the length and width of the image to be segmented, respectively. In fact, G[n] is exactly P_1 of the n-th iteration. The output that maximizes G_{min}[n] is selected as the final segmentation result. We now show that G_{min}[n] can be used instead of H.

In order to see the relationship between the image entropy and the time series clearly, we rewrite Eq. (8) as

H = -P_1 \log_2 P_1 - (1 - P_1) \log_2 (1 - P_1).  (12)

- When G equals 0.5, both H and G_{min} achieve their highest values, 1 and 0.5, respectively.
- When G is smaller than 0.5, G_{min} equals G, thus
  H = -G \log_2 G - (1 - G) \log_2 (1 - G) = -G_{min} \log_2 G_{min} - (1 - G_{min}) \log_2 (1 - G_{min}).  (13)
- When G is larger than 0.5, G_{min} equals 1 - G, thus
  H = -G \log_2 G - (1 - G) \log_2 (1 - G) = -(1 - G_{min}) \log_2 (1 - G_{min}) - G_{min} \log_2 G_{min}.  (14)

So in either case,

H = -(1 - G_{min}) \log_2 (1 - G_{min}) - G_{min} \log_2 G_{min}.  (15)

Table 1
Parameter list.

  Parameter   Value
  C           0.16
  β           0.2
  V_θ         100
  w           0.1036 (diagonal direction); 0.1464 (horizontal or vertical direction)
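The equivalence can also be checked numerically. The sketch below (helper name is ours) evaluates H of Eq. (15) as a function of Gmin on a grid over (0, 0.5] and confirms it is strictly increasing there, so maximizing Gmin selects the same iteration as maximizing H:

```python
import math

def entropy_from_gmin(gmin):
    """H of Eq. (15) as a function of Gmin = min(G, 1 - G), 0 < Gmin <= 0.5."""
    return -(1 - gmin) * math.log2(1 - gmin) - gmin * math.log2(gmin)

# Strict monotonicity of H in Gmin on (0, 0.5]
xs = [i / 1000 for i in range(1, 501)]
hs = [entropy_from_gmin(x) for x in xs]
assert all(a < b for a, b in zip(hs, hs[1:]))
```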

Fig. 5. Segmentation of different images by our PCNN and other methods: Otsu, GMM and K-means, respectively. Column 1 is the original image. Column 2 lists the segmentation by our PCNN. Column 3 lists the segmentation by Otsu. Column 4 lists the segmentation by GMM. Column 5 lists the segmentation by K-means.


The derivative of H with respect to G_{min} is

\dot{H} = \log_2 (1/G_{min} - 1).  (16)

Since G_{min} is always smaller than 0.5, \dot{H} is always positive, which means H is a strictly increasing function of G_{min}. So G_{min} can be used instead of H in deciding which iteration's output is to be selected. The relationship between H, G and G_{min} is shown in Fig. 4. In cases where it is unlikely that more than half the neurons in the network fire at the same time, the time series G can be used directly as the output selection standard. Compared with the image entropy, it is simpler and saves considerable time. It is also the standard we adopt in our later experiments.

3.3. Segmentation using the proposed PCNN method

For clarity, the steps of segmentation with our proposed PCNN method are as follows:

(1) Initialize the network: set S so that the value of each neuron is the gray level of the corresponding pixel; set Y to 0; set every element of θ to the gray level of the brightest pixel; initialize F as S. In addition, a one-dimensional vector R is adopted to store the output Y of each iteration.
(2) Parameter determination: set α_θ according to Eq. (7).
(3) Run the network:
    (a) calculate L according to Eq. (2);
    (b) calculate U according to Eq. (3);
    (c) calculate Y according to Eq. (4) and save it in R[n] for the n-th iteration;
    (d) calculate θ according to Eq. (5);
    (e) if the process has run for the given number of iterations TIMES, go to step (4); otherwise, return to step (3a).
(4) Select the segmentation result: calculate G for each R[n], n = 1, 2, ..., TIMES. Suppose G[i] of the i-th iteration has the highest value; then R[i] is selected as the final segmentation result.

Using this algorithm, only three parameters need to be set manually, namely w, β and V_θ. Since C is an invariant constant for all images, α_θ is adjusted adaptively, and the final result is chosen automatically without human intervention.
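The steps above can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' code: the function name is ours, the linking weights follow Table 1, and storing only the best output (rather than the whole vector R) is a space-saving simplification that yields the same result.

```python
import numpy as np

def spcnn_segment(image, C=0.16, beta=0.2, V_theta=100.0, times=100):
    """Sketch of the S-PCNN segmentation algorithm of Section 3.3.

    `image`: 2-D gray-level array scaled to [0, 1]. Returns the binary
    output of the iteration with the largest time series G[n] (Eq. (11)).
    """
    S = image.astype(float)
    M, N = S.shape
    alpha_theta = C / S.mean()                  # Eq. (7): adaptive decay constant
    decay = np.exp(-alpha_theta)
    F = S.copy()                                # Eq. (1): F stays equal to S
    Y = np.zeros((M, N))
    theta = np.full((M, N), S.max())            # threshold starts at brightest pixel
    w = np.array([[0.1036, 0.1464, 0.1036],     # Table 1 linking weights
                  [0.1464, 0.0,    0.1464],
                  [0.1036, 0.1464, 0.1036]])
    best_G, best_Y = -1.0, Y
    for _ in range(times):
        pad = np.pad(Y, 1)                      # sum pulses over 8-neighborhoods
        L = sum(w[a, b] * pad[a:a + M, b:b + N]
                for a in range(3) for b in range(3))   # Eq. (2)
        U = F * (1.0 + beta * L)                # Eq. (3)
        Y_new = (U > theta).astype(float)       # Eq. (4), compares with theta[n-1]
        theta = decay * theta + V_theta * Y     # Eq. (5), uses previous Y
        Y = Y_new
        G = Y.sum() / (M * N)                   # Eq. (11), normalized time series
        if G > best_G:                          # step (4): keep the output with max G
            best_G, best_Y = G, Y
    return best_Y
```

A usage example would be `segmented = spcnn_segment(gray_image / 255.0)` for an 8-bit grayscale array.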


Fig. 6. Time series generated by S-PCNN for the four original images, in the same order as they appear in Fig. 5.



Fig. 7. Gray histograms of (a) Fig. 5(p) and (b) its GMM fitting.

Apparently, this algorithm greatly increases the flexibility and efficiency of PCNN and makes it possible for the application of PCNN in image segmentation to be implemented in hardware.

4. Experiment and comparison

Several images have been processed to show the effectiveness of the adaptive time constant α_θ and our selection standard, the time series G. The experiments are carried out with MATLAB 7.8. All images have a resolution of 512 × 512. Table 1 lists the parameters of our method. C is the invariant constant for all images and is determined through experiments; β and V_θ are also determined experimentally. In this paper, only the neurons directly connected to a specified neuron are considered to be in its neighborhood. w stands for the linking weight between two adjacent neurons and is set to the normalized inverse of the Euclidean distance between them. The initial threshold value is set to the gray level of the brightest pixel in the image, and the number of iterations is 100. The simulation results are shown in Fig. 5. Each row presents the segmentation results of four different methods: our S-PCNN method, Otsu [17], GMM [18] and K-means [19]. The first column shows the original image; the other four columns show the corresponding segmentation results by our PCNN, Otsu, GMM and K-means, respectively. Fig. 6 shows the corresponding time series generated by S-PCNN on the four original images. For Fig. 5(a), since n = 5 gives the largest value of the time series (see Fig. 6(a)), the fifth output is selected by our method. Fig. 5(b) is the result of our method, and Fig. 5(c)–(e) are the results of Otsu, GMM and K-means, respectively. Our PCNN method separates the blood cells from the background with very high accuracy, while all the other methods mistake some parts of the background for objects. It is much easier to count the number of cells based on our segmentation result. For Fig. 5(f), the output of the third iteration is selected. In the segmentation result of our PCNN, every single cell is distinctly separated from the others.
The results generated by the Otsu method and K-means are better than that of GMM, but in both of them the central two cells are not clearly separated. The result of GMM shows too many impurities within the bodies of the cells, which makes the cells difficult to identify.

In Fig. 6(c), n = 3 has the largest value, so the third output is selected by our method for Fig. 5(k). Fig. 5(l) is the result of our method, and Fig. 5(m)–(o) are the results of Otsu, GMM and K-means, respectively. It can be seen that the nasal and hat parts are very clear in the result of S-PCNN, while they are more or less missing in the results of the others. As for the last image, the output of the second iteration is selected by our method. The results show that the segmentation by our PCNN performs better than the Otsu and K-means methods: the contour of the blonde's nose is clear in Fig. 5(q), while it has disappeared in Fig. 5(r) and (t). The segmentation result of the GMM method is superior to ours because the whiteness of the teeth is preserved. This is due to a peculiar characteristic of the original image (Fig. 5(p)). Fig. 7(a) shows the gray histogram of Fig. 5(p); the distribution of gray levels is very similar to a mixture of two Gaussian distributions, making the image very suitable for GMM. Fig. 7(b) is the gray histogram of the GMM fitting.

5. Conclusions

Parameter determination is very important to PCNN applications. This paper deals with the automatic adjustment of the threshold decay time constant α_θ. A new PCNN output selection standard is also presented. Experiments on image segmentation using PCNN are carried out, and comparison with other segmentation methods shows the usefulness of our proposal. However, a good segmentation by PCNN depends on all the parameters. Here, in order to exhibit the effectiveness of our method, all parameters except the threshold decay time constant were set to the same values across images. The quality of the segmentation results could be further improved if the other parameters were also adjusted, so further research is required to set the other parameters automatically.

References

[1] R. Eckhorn, H.J. Reitboeck, M. Arndt, Feature linking via synchronization among distributed assemblies: simulations of results from cat visual cortex, Neural Computation 2 (3) (1990) 293–307.
[2] J.L. Johnson, M.L. Padgett, PCNN models and applications, IEEE Transactions on Neural Networks 10 (3) (1999) 480–498.
[3] G. Kuntimad, H.S. Ranganath, Perfect image segmentation using pulse coupled neural networks, IEEE Transactions on Neural Networks 10 (3) (1999) 591–598.
[4] K. Waldemark, T. Lindblad, et al., Patterns from the sky: satellite image analysis using pulse coupled neural networks for pre-processing, segmentation and edge detection, Pattern Recognition Letters 21 (3) (2000) 227–237.
[5] J.M. Kinser, C.L. Wyman, B.L. Kerstiens, Spiral image fusion: a 30 parallel channel case, Optical Engineering 37 (2) (1998) 492–498.
[6] R.P. Broussard, S.K. Rogers, et al., Physiologically motivated image fusion for object detection using a pulse coupled neural network, IEEE Transactions on Neural Networks 10 (3) (1999) 554–563.
[7] H.S. Ranganath, G. Kuntimad, Object detection using pulse coupled neural networks, IEEE Transactions on Neural Networks 10 (3) (1999) 615–620.
[8] R.C. Mureşan, Pattern recognition using pulse-coupled neural networks and discrete Fourier transforms, Neurocomputing 51 (2003) 487–493.
[9] H. Kurokawa, S. Kaneko, M. Yonekawa, A color image segmentation using inhibitory connected pulse coupled neural network, in: Advances in Neuro-Information Processing, Lecture Notes in Computer Science, vol. 5507, 2009, pp. 776–783.
[10] X. Gu, S. Guo, D. Yu, A new approach for automated image segmentation based on unit-linking PCNN, in: Proceedings of the First International Conference on Machine Learning and Cybernetics, vol. 1, 2002, pp. 175–178.
[11] G. Székely, T. Lindblad, Parameter adaptation in a simplified pulse-coupled neural network, in: Proceedings of SPIE, vol. 3728, 1999, pp. 278–285.
[12] H. Berg, R. Olsson, T. Lindblad, J. Chilo, Automatic design of pulse coupled neurons for image segmentation, Neurocomputing 71 (10–12) (2008) 1980–1993.
[13] J.M. Kinser, T. Lindblad, Implementation of pulse-coupled neural networks in a CNAPS environment, IEEE Transactions on Neural Networks 10 (3) (1999) 584–590.
[14] C.E. Shannon, A mathematical theory of communication, Bell System Technical Journal 27 (1948) 379–423.
[15] K. Bouzouba, L. Radouane, Image identification and estimation using the maximum entropy principle, Pattern Recognition Letters 21 (8) (2000) 691–700.
[16] M. Ferraro, G. Boccignone, T. Caelli, Entropy-based representation of image information, Pattern Recognition Letters 23 (12) (2002) 1391–1398.
[17] N. Otsu, A threshold selection method from gray-level histograms, IEEE Transactions on Systems, Man, and Cybernetics SMC-9 (1) (1979) 62–66.
[18] S.-K.S. Fan, Y. Lin, C.-C. Wu, Image thresholding using a novel estimation method in generalized Gaussian distribution mixture modeling, Neurocomputing 72 (1–3) (2008) 500–512.
[19] T. Kanungo, D.M. Mount, et al., An efficient k-means clustering algorithm: analysis and implementation, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (7) (2002) 881–892.

Wei Shuo received the B.S. degree in mathematics from Anhui University, Hefei, China, in 2007. He is currently working toward the M.S. degree in the Computational Intelligence Laboratory, School of Computer Science and Engineering, University of Electronic Science and Technology of China. His research interests include neural networks, image processing and optimization.

Hong Qu received the B.S., M.S. and Ph.D. degrees in computer science and engineering from the University of Electronic Science and Technology of China, in 2000, 2003 and 2006, respectively. He is currently working in the Computational Intelligence Laboratory, School of Computer Science and Engineering, University of Electronic Science and Technology of China. His current research interests include neural networks, robotics, neurodynamics, intelligent computation and optimization.

Hou Mengshu received the M.S. and Ph.D. degrees in computer science and engineering from the School of Computer Science and Engineering, University of Electronic Science and Technology of China, in 2002 and 2005, respectively. Since 2005, he has been working in the School of Computer Science and Engineering. He has published more than 10 papers in scientific journals, including ACM and computer science. His current research interests include networks, computer system, intelligent computation and optimization.