
Neurocomputing 67 (2005) 456–463 www.elsevier.com/locate/neucom

Letter

A method to improve the transiently chaotic neural network

Xinshun Xu, Zheng Tang, Jiahai Wang

Faculty of Engineering, Toyama University, Toyama-shi 930-8555, Japan

Received 29 February 2004; received in revised form 6 July 2004; accepted 6 July 2004
Available online 19 March 2005
Communicated by R.W. Newcomb

Abstract

In this article, we propose a method for improving the transiently chaotic neural network (TCNN) by introducing several time-dependent parameters. This method allows the network to have rich chaotic dynamics in its initial stage and to reach a state in which all neurons are stable soon after the last bifurcation. The network therefore searches widely at first and needs less CPU time to reach a stable state. Simulation results on the N-queen problem confirm that this method effectively improves both the solution quality and the convergence speed of the TCNN.
© 2005 Elsevier B.V. All rights reserved.

Keywords: Neural network; Transiently chaotic neural network; N-queen problem

1. Introduction

Neural networks have been shown to be powerful tools for solving combinatorial optimization problems, particularly NP-hard problems. The Hopfield neural network [4,5], one of the best-known models of this type, converges to a stable equilibrium point because of its gradient-descent dynamics; however, it suffers from severe local-minimum problems whenever it is applied to optimization problems. Although many methods have been suggested to improve it [8,13], the results have not always been satisfactory. Recently, many artificial neural networks with chaotic dynamics have been investigated [1,10]; even though their equations are simple, they exhibit much richer, far-from-equilibrium dynamics with various coexisting attractors, including not only fixed and periodic points but also strange attractors. It is usually difficult, however, to decide how long to maintain the chaotic dynamics, or how to harness the chaotic behavior so that the network converges to a stable equilibrium point corresponding to an acceptably near-optimal state [2]. Wishing to reconcile the Hopfield network's convergent dynamics with the principles of chaotic dynamics, Chen and Aihara proposed a transiently chaotic neural network (TCNN) [2]. However, the TCNN has many parameters that affect its convergence speed and solution quality [2], and these therefore need to be chosen carefully. Furthermore, a network with rich dynamics usually requires more steps to stabilize. In this article, we present a method that introduces several time-dependent parameters into the original TCNN model. The modified TCNN has relatively rich dynamics initially and, after bifurcation, converges sooner to a state in which all its neurons are stable.

2. The transiently chaotic neural network

Chen and Aihara's TCNN model is defined as follows:

$$v_i(t) = \frac{1}{1 + e^{-u_i(t)/\varepsilon}}, \qquad (1)$$

$$u_i(t+1) = k\,u_i(t) + \alpha\Big(\sum_{j=1,\ j\neq i}^{n} w_{ij} v_j(t) + I_i\Big) - z_i(t)\big(v_i(t) - I_0\big), \qquad (2)$$

$$z_i(t+1) = (1-\beta)\,z_i(t), \qquad (3)$$

where i is the index of a neuron and n the number of neurons (1 ≤ i ≤ n), v_i the output of neuron i, u_i the internal state of neuron i, w_ij the connection weight from neuron j to neuron i, I_i the input bias of neuron i, α the positive scaling parameter for the inputs, k the damping factor of the nerve membrane (0 ≤ k ≤ 1), z_i(t) the self-feedback connection weight or refractory strength (z_i(t) ≥ 0), ε the steepness parameter of the output function (ε > 0), β the damping factor of the time-dependent z_i(t), and I_0 a positive parameter. In this model, the variable z_i(t) corresponds to the temperature in the usual stochastic annealing process, so Eq. (3) is an exponential cooling schedule for the annealing. The network behaves much like the Hopfield network once z_i(t) is sufficiently small. In [2], Chen et al. showed both that the parameter β governs the bifurcation speed of the transient chaos and that the parameter α affects the neurodynamics: when α is too large, the influence of the energy function is too strong for transient chaos to be generated, and when α is too small, the energy function is not sufficiently reflected in the neurodynamics. So, for the network to have rich dynamics initially, β must be set to a small value and α to a suitable one. However, when β is small, the network requires more steps for all neurons to become saturated. In addition, it is difficult to find a suitable value of α for different problems.
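To make the update concrete, the following Python sketch implements Eqs. (1)-(3) as one synchronous iteration. This is a minimal illustration, not the authors' code: the function names are ours, the default values of k, I_0, and ε are illustrative assumptions, and W is assumed to have a zero diagonal so that the sum over j ≠ i reduces to a matrix-vector product.

```python
import numpy as np

def tcnn_step(u, z, W, I, alpha, k=0.9, I0=0.65, eps=0.004):
    """One synchronous TCNN update (Eqs. (1) and (2)).

    u : internal states u_i(t), shape (n,)
    z : self-feedback strengths z_i(t), shape (n,)
    W : connection weights w_ij with zero diagonal, shape (n, n)
    I : input biases I_i, shape (n,)
    """
    v = 1.0 / (1.0 + np.exp(-u / eps))                   # Eq. (1): neuron outputs
    u_next = k * u + alpha * (W @ v + I) - z * (v - I0)  # Eq. (2)
    return u_next, v

def decay_self_feedback(z, beta):
    """Eq. (3): exponential 'cooling' of the self-feedback connection weight."""
    return (1.0 - beta) * z
```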

3. The method to improve the original model

In order to improve the convergence speed and the search ability of the original TCNN, we replace the constants α, β, and ε of the original model with three time-dependent variables α(t), β(t), and ε(t). They are updated as follows:

$$\alpha(t+1) = \begin{cases} (1+\lambda)\,\alpha(t), & \alpha(t) < 0.1,\\ 0.1, & \text{otherwise}, \end{cases} \qquad (4)$$

$$\beta(t+1) = \begin{cases} (1+\phi)\,\beta(t), & \beta(t) < 0.2,\\ 0.2, & \text{otherwise}, \end{cases} \qquad (5)$$

$$\varepsilon(t+1) = \begin{cases} (1-\eta)\,\varepsilon(t), & \varepsilon(t) > 0.001,\\ 0.001, & \text{otherwise}, \end{cases} \qquad (6)$$

where λ, φ, and η are small positive constants (selected empirically). Initially, α(0) and β(0) are set to small values. When α(t) is small, the influence of the energy function is still weak enough to allow transient chaos to be generated. At first, while β(t) is still small, z_i(t) decreases slowly, which gives the network enough time to maintain the large self-feedback without which it cannot generate chaos; then, when β(t) reaches a large value, z_i(t) begins to decrease quickly. Moreover, as α(t) increases gradually, the influence of the energy function becomes stronger, which means that the self-feedback signal weakens relative to it in the motion equation. Therefore, using Eqs. (4) and (5), we can allow the neural network to have rich chaotic dynamics in its initial stage; then, once both α(t) and β(t) have become sufficiently large, the chaos disappears quickly. Once bifurcation has occurred, the self-feedback becomes too weak to generate chaotic signals. Nevertheless, the network usually still needs many steps to converge to a stable state; that is to say, each neuron requires many steps to stabilize. In order to enable the neural network to converge soon after bifurcation to a state in which all its neurons are stable, we use Eq. (6) to update the variable ε(t), the steepness parameter of the output function (Eq. (1)). Initially, ε(t) is set to a large value, which makes the output function shallow and allows the network to generate chaos easily. Then, once ε(t) has decreased, the output function becomes steep; that is to say, even a small internal state drives the output of a neuron to 1 or 0 quickly. Of course, the network may become uncontrollable unless these parameters are kept within bounds, so a bound is assigned to each parameter; once a parameter reaches its bound, it no longer changes. Based on the conditions for the network's stability [3], in this article α(t) is bounded at 0.1, β(t) at 0.2, and ε(t) at 0.001.
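As an illustration of how the schedules interact with their bounds, here is a hedged Python sketch of Eqs. (4)-(6). The function name and argument defaults are ours; the bounds follow the values used in this article (0.1, 0.2, 0.001).

```python
def update_parameters(alpha, beta, eps,
                      lam=0.002, phi=0.002, eta=0.002,
                      alpha_bound=0.1, beta_bound=0.2, eps_bound=0.001):
    """Time-dependent parameter schedules of Eqs. (4)-(6).

    alpha(t) and beta(t) grow geometrically until they reach their bounds,
    eps(t) shrinks geometrically until it reaches its bound; after that,
    each parameter stays fixed so the stability conditions of [3] hold.
    """
    alpha = (1.0 + lam) * alpha if alpha < alpha_bound else alpha_bound  # Eq. (4)
    beta = (1.0 + phi) * beta if beta < beta_bound else beta_bound       # Eq. (5)
    eps = (1.0 - eta) * eps if eps > eps_bound else eps_bound            # Eq. (6)
    return alpha, beta, eps
```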


Fig. 1. Evolution of the output of a single neuron (output versus iteration): (a) no parameter is updated; (b) α is updated by Eq. (4); (c) β is updated by Eq. (5); (d) ε is updated by Eq. (6).

In order to show the tradeoff between dynamics and convergence, the evolution of a single-neuron model is plotted in Fig. 1. The parameters were set as follows:

$$k = 0.9,\quad z(0) = 0.08,\quad \alpha(0) = 0.001,\quad \beta(0) = 0.002,\quad \varepsilon(0) = 0.008,\quad \lambda = 0.002,\quad \phi = 0.002,\quad \eta = 0.002. \qquad (7)$$

Fig. 1(a) plots the evolution when α, β, and ε are constants; the network reaches a saturated state only after about 4000 steps. Fig. 1(b), (c), and (d) show the evolution when α, β, and ε, respectively, are updated. In Fig. 1(b) we can see that when α is updated by Eq. (4), the TCNN maintains chaotic dynamics at the beginning, converges sooner to a stable state, and needs only about 1400 steps to reach a saturated state. The network shows similar behavior when β is updated by Eq. (5). We can also see clearly that the network converges sooner to a saturated state when ε is updated by Eq. (6).
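The single-neuron experiment of Fig. 1 can be reproduced along the following lines. This is our own sketch under stated assumptions: the article does not list the neuron's input bias, I_0, or the initial internal state, so the values of I, I0, and u below are illustrative placeholders; everything else follows Eqs. (1)-(7).

```python
import numpy as np

def single_neuron_run(steps=5000, update_alpha=True, update_beta=True, update_eps=True):
    """Evolve a single TCNN neuron with the settings of Eq. (7), optionally
    applying the schedules of Eqs. (4)-(6); returns the output trajectory."""
    k, z, alpha, beta, eps = 0.9, 0.08, 0.001, 0.002, 0.008   # Eq. (7)
    lam = phi = eta = 0.002                                   # Eq. (7)
    I, I0, u = 0.5, 0.65, 0.2        # assumed bias, I0 and initial state
    outputs = []
    for _ in range(steps):
        v = 1.0 / (1.0 + np.exp(-u / eps))        # Eq. (1)
        u = k * u + alpha * I - z * (v - I0)      # Eq. (2) with n = 1, w_11 = 0
        z = (1.0 - beta) * z                      # Eq. (3)
        if update_alpha:
            alpha = (1.0 + lam) * alpha if alpha < 0.1 else 0.1    # Eq. (4)
        if update_beta:
            beta = (1.0 + phi) * beta if beta < 0.2 else 0.2       # Eq. (5)
        if update_eps:
            eps = (1.0 - eta) * eps if eps > 0.001 else 0.001      # Eq. (6)
        outputs.append(v)
    return outputs

# The four panels of Fig. 1 correspond roughly to:
# (a) single_neuron_run(update_alpha=False, update_beta=False, update_eps=False)
# (b) single_neuron_run(update_beta=False, update_eps=False)
# (c) single_neuron_run(update_alpha=False, update_eps=False)
# (d) single_neuron_run(update_alpha=False, update_beta=False)
```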

4. Simulations on the N-queen problem

In order to confirm the effectiveness of the proposed method, we tested it on the N-queen problem: placing N queens on an N-by-N chessboard in such a way that no queen attacks another. The N-queen problem has been solved with a variety of artificial neural networks [9,12] and, more recently, with several chaotic models [6,7,11]; in [11], a chaotic neural network with reinforced self-feedback was proposed. In this article, we use the formulation of the N-queen problem given in [6]. The simulation results are summarized in Table 1. The columns "N", "TCNN1", "TCNN2", "Proposed", and "Ohta" give, respectively, the number of queens; the results of the original TCNN with β set to the small value 0.001; the results of the original TCNN with β set to the large value 0.08; the results of the proposed model; and, for comparison, the results of Ohta's chaotic neural network [11], with dT set to 0.0005. In the proposed model, the parameters are the same as those in Eq. (7), and the bounds for α(t), β(t), and ε(t) are 0.1, 0.2, and 0.001, respectively. For each instance, each algorithm performed 100 runs with different initial neuron states; Table 1 reports the rate of optimal solutions (Conv.) and the average number of steps (Steps). The network is considered stable, and the run is terminated, when the criterion |V_ij(t+1) − V_ij(t)| < 5 × 10⁻⁵ for all i, j = 1, 2, …, N is met. An examination of Table 1 yields the following observations:

• The proposed model solves the N-queen problem with a high global-minimum convergence rate, and the number of steps does not increase as the problem scale grows.

Table 1
Simulation results on the N-queen problem

          TCNN1             TCNN2             Proposed          Ohta
  N     Conv.   Steps     Conv.   Steps     Conv.   Steps     Conv.   Steps
 10     100%   1615.2      91%    435.5     100%    472.3     100%    116.4
 20      99%   2145.7      93%    457.2     100%    463.7     100%    177.1
 50      98%   2354.6      87%    477.9      99%    479.4      98%    185.7
100     100%   2511.4      90%    613.6     100%    485.2      99%    193.5
200      97%   2587.7      82%    733.4     100%    477.9      96%    274.8
300      98%   2713.4      76%    792.1      98%    491.5      90%    516.3
500      96%   2823.8      79%    803.2      99%    467.8      57%    783.7
600      96%   2991.9      57%    833.1      96%    482.6      46%    804.9


Fig. 2. Evolution of the output V of neuron #1 (V versus iteration) for the 10-queen problem: (a) the original model (β = 0.001); (b) the proposed model.





• Although the original model with β set to a small value (TCNN1) solves the problem with high solution quality, it requires more than 2000 steps to converge to a stable state. When β is set to a large value (TCNN2), the number of steps is reduced, but the convergence rate is unsatisfactory.

• Ohta's algorithm yields good results when the problem scale is small; however, its convergence rate and average number of steps are unsatisfactory for large-scale problems.

In order to gain insight into the evolution of individual neurons, Fig. 2 plots the evolution of neuron #1 for both the proposed model and the original model (β = 0.001) when N = 10. Fig. 2 shows that the proposed model has rich dynamics in its initial stage and that the neuron stabilizes in fewer steps after bifurcation. Although the original model also has rich dynamics initially, its neuron converges to a saturated state only after more than 1500 steps. We can therefore conclude that the proposed method effectively improves both the convergence speed and the solution quality of the TCNN.
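For readers who want to reproduce the experiment, the sketch below shows one common penalty mapping of the N-queen problem onto a network input term, together with the stopping rule quoted above. It is a hypothetical illustration: the exact formulation and penalty weights used in [6] may differ, and the names nqueens_input and converged and the coefficients A, B, C are ours.

```python
import numpy as np

def nqueens_input(V, A=1.0, B=1.0, C=1.0):
    """Negative gradient of a standard penalty energy for N queens:
    one queen per row, one per column, at most one per diagonal.
    Plays the role of (sum_j w_ij v_j + I_i) in Eq. (2)."""
    N = V.shape[0]
    row = V.sum(axis=1, keepdims=True) - 1.0      # row-constraint violations
    col = V.sum(axis=0, keepdims=True) - 1.0      # column-constraint violations
    diag = np.zeros_like(V)
    for i in range(N):
        for j in range(N):
            d1 = sum(V[p, p + j - i] for p in range(N) if 0 <= p + j - i < N) - V[i, j]
            d2 = sum(V[p, i + j - p] for p in range(N) if 0 <= i + j - p < N) - V[i, j]
            diag[i, j] = d1 + d2                  # other queens sharing a diagonal
    return -(A * row + B * col + C * diag)

def converged(V_new, V_old, tol=5e-5):
    """Stopping rule of Section 4: |V_ij(t+1) - V_ij(t)| < 5e-5 for all i, j."""
    return np.max(np.abs(V_new - V_old)) < tol
```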

5. Conclusions

We have presented a method to improve the original TCNN for combinatorial optimization problems. This method allows a TCNN to maintain rich dynamics and then, after the last bifurcation, to converge in fewer steps to a state in which all its neurons are stable. Simulations on the N-queen problem show that the proposed model is superior to the original model and to another chaotic neural network in both convergence rate and average number of update steps. For other combinatorial optimization problems, these parameters may need to be adjusted; the method could therefore be further improved by using adaptive algorithms to modify its parameters. We leave this problem open.

Acknowledgements

We wish to thank Prof. R. Newcomb and our anonymous reviewers for their valuable suggestions.

References

[1] K. Aihara, T. Takabe, M. Toyoda, Chaotic neural networks, Phys. Lett. A 144 (6–7) (1990) 333–340.
[2] L. Chen, K. Aihara, Chaotic simulated annealing by a neural network model with transient chaos, Neural Networks 8 (6) (1995) 915–930.
[3] L. Chen, K. Aihara, Chaos and asymptotical stability in discrete-time neural networks, Physica D 104 (3–4) (1997) 286–325.
[4] J.J. Hopfield, D.W. Tank, Neural computation of decisions in optimization problems, Biol. Cybern. 52 (4) (1985) 141–152.
[5] J.J. Hopfield, D.W. Tank, Computing with neural circuits: a model, Science 233 (1986) 625–633.
[6] T. Kwok, K.A. Smith, Experimental analysis of chaotic neural network models for combinatorial optimization under a unifying framework, Neural Networks 13 (7) (2000) 731–744.
[7] T. Kwok, K.A. Smith, L. Wang, Incorporating chaos into the Hopfield neural network for combinatorial optimization, in: Proceedings of the World Multiconference on Systemics, Cybernetics and Informatics, Florida, vol. 1, 1998, pp. 659–665.
[8] S.Z. Li, Improving convergence and solution quality of Hopfield-type neural networks with augmented Lagrange multipliers, IEEE Trans. Neural Networks 7 (6) (1996) 1507–1516.
[9] J. Mandziuk, Neural networks for the N-queens problem: a review, Control Cybern. 31 (2) (2002) 217–248.
[10] H. Nozawa, A neural network model as a globally coupled map and applications based on chaos, Chaos 2 (3) (1992) 377–386.
[11] M. Ohta, Chaotic neural networks with reinforced self-feedbacks and its application to N-queen problem, Math. Comput. Simulation 59 (4) (2002) 305–317.
[12] K.A. Smith, Neural networks for combinatorial optimization: a review of more than a decade of research, INFORMS J. Comput. 11 (1) (1999) 15–34.
[13] D.E. Van den Bout, T.K. Miller, Improving the performance of the Hopfield-Tank neural network through normalization and annealing, Biol. Cybern. 62 (2) (1989) 129–139.

Xinshun Xu received a B.S. degree from Shandong Normal University, Shandong, China, and an M.S. degree from Shandong University, Shandong, China, in 1998 and 2002, respectively. From 1998 to 2000, he was an Engineer in the Shandong Provincial Education Department, Shandong, China. He is now working toward the Ph.D. degree at Toyama University, Toyama, Japan. His main research interests are neural networks, machine learning, pattern recognition, image processing, and optimization problems.


Zheng Tang received a B.S. degree from Zhejiang University, Zhejiang, China in 1982, and an M.S. degree and a D.E. degree from Tsinghua University, Beijing, China in 1984 and 1988, respectively. From 1988 to 1989 he was an Instructor in the Institute of Microelectronics at Tsinghua University. From 1990 to 1999, he was an Associate Professor in the Department of Electrical and Electronic Engineering, Miyazaki University, Miyazaki, Japan. In 2000, he joined Toyama University, Toyama, Japan, where he is currently a Professor in the Department of Intellectual Information Systems. His current research interests include intellectual information technology, neural networks, and optimization.

Jiahai Wang received a B.S. degree from Gannan Teachers College, Jiangxi, China and an M.S. degree from Shandong University, Shandong, China in 1999 and 2001, respectively. Now he is working toward the Ph.D. degree at Toyama University, Toyama, Japan. His main research interests include neural networks, meta-heuristic algorithms, evolutionary computation, hybrid soft computing algorithms and their applications to various real-world combinatorial optimization problems.