PSO-based single multiplicative neuron model for time series prediction


Expert Systems with Applications 36 (2009) 2805–2812

Liang Zhao *, Yupu Yang
Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, PR China
* Corresponding author. Tel.: +86 21 3420 4261; fax: +86 21 3420 4427. E-mail address: [email protected] (L. Zhao).

Abstract

The single multiplicative neuron model is a recently introduced neural network model that has been used for time series prediction and function approximation. The model is based on a polynomial architecture, the product of linear functions in the different dimensions of the input space. In this paper, particle swarm optimization (PSO), a global optimization method, is proposed to train the single neuron model. An improved version of the original PSO, cooperative random learning particle swarm optimization (CRPSO), is put forward to enhance the performance of the conventional PSO. The proposed CRPSO, PSO, the back-propagation algorithm and a genetic algorithm are employed to train the model on three well-known time series prediction problems. The experimental results demonstrate the superiority of the CRPSO-based neuron model in efficiency and robustness over the other three algorithms. © 2008 Elsevier Ltd. All rights reserved.

Keywords: Multiplicative neuron model; Global optimization; Particle swarm optimization; Cooperative random learning particle swarm optimization; Time series prediction

1. Introduction

Time series prediction, predicting the future events and behavior of a system from currently available data, is an important tool in complex system identification. It has been used widely in scientific and engineering areas such as statistics, signal processing and econometrics. Methods from a variety of fields have been employed for this task to handle real-life nonlinear time series. Various neural network models and training algorithms have been used for time series prediction (Geva, 1998; Manevitz, Birtar, & Giroli, 2005; Mendes, Kennedy, & Neves, 2004; Shimodaira, 1996; Wakuya & Zurada, 2001; Yadav, Kalra, & John, 2007). In Yadav et al. (2007), the authors presented a single multiplicative neuron model for time series prediction inspired by single neuron computation in neuroscience (Koch, 1997; Koch & Segev, 2000).


The single neuron converts incoming streams of binary pulses into analog, spatially distributed variables, such as the postsynaptic membrane potential and the calcium distribution throughout the dendritic tree and cell body (Koch & Segev, 2000). A single real neuron is as complex an operational unit as an entire artificial neural network (ANN), and formalizing the complex computations performed by real neurons is essential to the design of the enhanced processor elements for use in the next generation of ANNs (McKenna, Davis, & Zornetzer, 1992). Artificial single neuron models have been used to solve many engineering problems (Iyoda, Nobuhara, & Hirota, 2003; Yadav et al., 2007; Yadav, Mishra, Yadav, Ray, & Kalra, 2007). Such models, with simple structures and low computational complexity, are easy to implement in practice using the standard back-propagation (BP) learning algorithm and can exhibit better performance than multilayered neural networks with special structures (Yadav, Kalra, & John, 2007; Yadav, Mishra, et al., 2007).


As is well known, the BP algorithm is based on the gradient descent method, which depends on the initial values and often converges to a suboptimal point. A number of evolutionary algorithms have been proposed to improve the capability of neural networks (Angeline, Saunders, & Pollack, 1994; Liang, 2007; Tang, Quek, & Ng, 2005). In order to enhance the approximation capability of the single neuron model, particle swarm optimization (PSO) is introduced here to train the model. PSO is a population-based stochastic optimization algorithm developed by Kennedy and Eberhart (1995); it is one of the more recent evolutionary optimization techniques and has been used for solving a wide range of real-world optimization problems as a substitute for the genetic algorithm (GA). Like other evolutionary algorithms, however, the conventional PSO may easily get trapped in a local optimum when tackling complex problems. A cooperative random learning particle swarm optimization (CRPSO), which evolves multiple sub-swarms simultaneously and uses a gbest selected at random from all the sub-swarms to update the velocity and position of each particle, is proposed to overcome these shortcomings of the conventional PSO. The single neuron model trained with CRPSO is applied to a variety of time series prediction problems, and the results are compared with those of three other learning algorithms: BP, GA and PSO.

The rest of the paper is organized as follows. Section 2 describes the single neuron model and its BP learning algorithm. The standard PSO algorithm and the proposed CRPSO algorithm are presented in Section 3. Section 4 discusses the application of the model with the various learning algorithms to time series prediction problems, and conclusions are drawn in Section 5.

2. The single multiplicative neuron model

The artificial single multiplicative neuron model, first proposed by Yadav et al. (2007), is used as a learning machine for function approximation and other applications. Its capabilities, such as nonlinear generalization, input–output mapping and noise tolerance, have been investigated using Vapnik's statistical learning theory (Vapnik, 1998); a detailed description can be found in Yadav et al. (2007). The single neuron model can be sufficient for applications that would otherwise require a conventional neural network with a number of neurons in different layers. The model and its learning algorithm are described in the following sections.

2.1. The structure of the single multiplicative neuron model

Fig. 1. The structure of the single multiplicative neuron.

The diagram of a generalized single multiplicative neuron with its learning algorithm is illustrated in Fig. 1, where (x1, x2, . . ., xn) is the input pattern, and (w1, w2, . . ., wn) and (b1, b2, . . ., bn) are the weights and biases of the model, respectively. The operator Ω is the multiplicative operation given in Eq. (1), and u is set equal to Ω:

$$\Omega = \prod_{i=1}^{n} (w_i x_i + b_i). \qquad (1)$$

The output function is the logsig function defined in Eq. (2):

$$y = \frac{1}{1 + e^{-u}}. \qquad (2)$$
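For concreteness, a minimal NumPy sketch of the forward computation in Eqs. (1) and (2) might look as follows; the function and variable names are illustrative and not taken from the paper:

```python
import numpy as np

def neuron_output(x, w, b):
    """Single multiplicative neuron: u = prod_i(w_i*x_i + b_i), y = logsig(u)."""
    u = np.prod(w * x + b)           # Eq. (1): product over all input dimensions
    return 1.0 / (1.0 + np.exp(-u))  # Eq. (2): logsig output

# Example: a 4-dimensional input pattern with arbitrary weights and biases
x = np.array([0.2, 0.5, 0.3, 0.7])
w = np.array([0.1, -0.4, 0.8, 0.05])
b = np.array([0.3, 0.2, -0.1, 0.6])
print(neuron_output(x, w, b))
```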

The learning algorithm is used to minimize the error between the output of the model and the actual values. In this study, the BP algorithm, GA, PSO and CRPSO are employed.

2.2. BP learning algorithm for the single neuron

Back-propagation has been used widely in neural network learning. The standard BP algorithm is based on the steepest descent gradient approach applied to the minimization of an energy function representing the instantaneous error. The BP algorithm is adopted here to train the single neuron model by minimizing the function defined in Eq. (3):

$$E = \frac{1}{2N}\sum_{p=1}^{N}\left(y_p - y_p^d\right)^2, \qquad (3)$$

where $y_p^d$ represents the desired network output for the pth input pattern and $y_p$ is the actual output of the neuron shown in Fig. 1. Using the steepest descent gradient approach and the chain rule for the partial derivatives, the learning rules for the weights and biases of the model are given in Eqs. (4) and (5), respectively:

$$\Delta w_i = -\eta\,\frac{\partial E}{\partial w_i} = -\eta\,\frac{1}{N}\sum_{p=1}^{N}(y_p - y_p^d)\,y_p(1-y_p)\,\frac{u}{w_i x_i + b_i}\,x_i, \qquad (4)$$

$$\Delta b_i = -\eta\,\frac{\partial E}{\partial b_i} = -\eta\,\frac{1}{N}\sum_{p=1}^{N}(y_p - y_p^d)\,y_p(1-y_p)\,\frac{u}{w_i x_i + b_i}, \qquad (5)$$


where $\eta$ is the learning rate parameter, which controls the convergence speed of the algorithm. The update equations for the weights and biases can be written as Eqs. (6) and (7), respectively:

$$w_i^{\mathrm{new}} = w_i^{\mathrm{old}} + \Delta w_i, \qquad (6)$$

$$b_i^{\mathrm{new}} = b_i^{\mathrm{old}} + \Delta b_i. \qquad (7)$$

According to Eqs. (6) and (7), the iterative procedure is repeated until a predefined termination criterion, such as the maximum number of generations or the error goal, is reached.
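As a rough illustration of one BP update per Eqs. (3)–(7), the following sketch performs a single batch gradient step; the function names are illustrative, and only the learning rate value 0.7 matches the setting reported later in Section 4.1:

```python
import numpy as np

def bp_step(X, yd, w, b, eta=0.7):
    """One batch gradient update of the weights and biases, Eqs. (4)-(7).
    X: (N, n) input patterns; yd: (N,) desired outputs."""
    N = X.shape[0]
    dw = np.zeros_like(w)
    db = np.zeros_like(b)
    for x, d in zip(X, yd):
        terms = w * x + b
        u = np.prod(terms)                            # Eq. (1)
        y = 1.0 / (1.0 + np.exp(-u))                  # Eq. (2)
        common = (y - d) * y * (1.0 - y) * u / terms  # shared factor of Eqs. (4)-(5)
        dw += common * x                              # gradient contribution to w_i
        db += common                                  # gradient contribution to b_i
    w_new = w - eta * dw / N                          # Eq. (6)
    b_new = b - eta * db / N                          # Eq. (7)
    return w_new, b_new
```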

3. Particle swarm optimization

3.1. The standard particle swarm optimization

Particle swarm optimization is an evolutionary algorithm paradigm that imitates the movement of flocking birds or schooling fish looking for food. Each particle has a position and a velocity, representing a candidate solution to the optimization problem and a search direction in the search space. Each particle adjusts its velocity and position according to the best position it has found, called pbest, and the best position found by its neighbors, called gbest. The updating equations of the velocity and position of a particle are given as follows:

$$v(t+1) = \omega\, v(t) + c_1 r_1 [p - x(t)] + c_2 r_2 [p_g - x(t)], \qquad (8)$$

$$x(t+1) = x(t) + v(t+1), \qquad (9)$$

where v and x represent the velocity and position of the particle; c1 and c2 are positive constants referred to as acceleration constants; r1 and r2 are random numbers drawn from the uniform distribution between 0 and 1; p refers to the best position found by the particle and p_g to the best position found by its neighbors. Introduced by Shi and Eberhart (1998), ω is the inertia weight, which balances the global and local search abilities of the algorithm by controlling the influence of the previous velocity on the new velocity update (Shi & Eberhart, 1998; Shi & Eberhart, 1999). A major difference between PSO and other population-based algorithms is that the individuals always survive during the entire search process.
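A bare-bones sketch of the velocity and position updates in Eqs. (8) and (9), vectorized over a swarm, might look as follows; the parameter values are assumed defaults rather than prescriptions from the paper:

```python
import numpy as np

def pso_update(x, v, pbest, gbest, w_inertia=0.7, c1=2.0, c2=2.0, rng=np.random):
    """One PSO iteration. x, v, pbest: (num_particles, dim); gbest: (dim,)."""
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v_new = w_inertia * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # Eq. (8)
    x_new = x + v_new                                                      # Eq. (9)
    return x_new, v_new
```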


3.2. Cooperative random learning particle swarm optimization

In the PSO algorithm, each particle, which is associated with the best solutions it has found, keeps track of its coordinates in the search space. Consistent behavior of all the particles is achieved by moving them toward their pbest and gbest locations. As seen from Eq. (8), the velocity of a particle is determined by three factors: the first component, v(t), serves as a momentum term to prevent excessive oscillations in the search direction; the second, c1 r1 [p − x(t)], is the cognitive component, which represents the natural tendency of individuals to return to environments where they have experienced their best performance and determines the local search capability of the algorithm; and the last, c2 r2 [p_g − x(t)], is the social component, which represents the tendency of individuals to follow the success of other individuals and determines the global search capability of the algorithm (Bergh & Engelbrecht, 2006).

Ideal solutions cannot be achieved by simply adjusting the inertia weight ω and the parameters c1 and c2 when PSO faces complex optimization problems. Several learning strategies that use more of the information found by all the particles have been proposed to balance the local and global search abilities (Liang, Qin, Suganthan, & Baskar, 2006; Mendes et al., 2004). Cooperative search algorithms have been studied extensively in the past decade for solving complex optimization problems. The basic approach involves running more than one search module and exchanging information among them in order to explore the search space more efficiently and reach better solutions; it can be regarded as a parallel approach, since the modules may run in parallel (El-Abd & Kamel, 2006).

Under the cooperative search framework, this paper presents an improved PSO algorithm, named CRPSO, which employs a cooperative random learning mechanism to trade off local and global search. In the proposed mechanism, multiple sub-swarms search different portions of the search space simultaneously, and the particles in each sub-swarm learn from a randomly chosen gbest among all the sub-swarms when updating their velocities and positions. The velocity updating equation is rewritten as follows:

$$v_j(t+1) = \omega\, v_j(t) + c_1 r_1 [p_j - x_j(t)] + c_2 r_2 [p_g(r) - x_j(t)], \qquad (10)$$

where j = 1, . . ., n indexes the sub-swarms, n is the number of sub-swarms, and r is a random integer between 1 and n giving the index of the gbest selected to update the velocity at that iteration. The information exchange is implemented through r. The schematic diagram of CRPSO is shown in Fig. 2, where the central circle represents the archive of gbest values found by all the sub-swarms and the surrounding circles represent the different sub-swarms. In this study, three sub-swarms are used, a number chosen according to previous experiments that provided the best solutions.

Fig. 2. The schematic diagram of the information exchange of CRPSO.


The information exchange among the sub-swarms is carried out in two stages: first, each sub-swarm contributes its gbest to the archive whenever its gbest is updated; second, each particle selects a gbest from the archive at random to renew its velocity and position. The process of CRPSO is summarized in the following seven steps:

Step 1. Initialize the positions and velocities of the particles of all the sub-swarms.
Step 2. Calculate the fitness of all the particles, initialize the pbest and gbest of each sub-swarm as in the conventional PSO, and compose the initial gbest archive from the obtained gbest values.
Step 3. Generate a random integer r between 1 and the number of sub-swarms.
Step 4. Update the velocity and position of particle i in each sub-swarm according to Eqs. (10) and (9), respectively.
Step 5. If f(xi) < f(p), update the pbest of the sub-swarm.
Step 6. If f(p) < f(pg), update the gbest of the sub-swarm.
Step 7. Repeat Steps 3–6 for a given maximal number of iterations.

Here f(xi), f(p) and f(pg) represent the fitness values of particle i, pbest and gbest, respectively.

The initialization and evolution of each sub-swarm are performed independently, and the particles do not always move toward a single gbest location. This mechanism helps the algorithm escape from local optima, which is the primary advantage of CRPSO compared with the canonical PSO. Thus, the diversity of the swarm is maintained efficiently and more feasible solutions can be obtained from the enlarged search space. Furthermore, the multiple independent random initializations of the sub-swarms increase the probability that the algorithm finds the global optimum and make it more robust. On the other hand, the additional information from different sub-swarms used when updating the velocities and positions of the particles counteracts the extra computational cost of preserving and selecting the gbest and leads to fast convergence. A compact sketch combining this procedure with the encoding strategy of Section 3.3 is given below.

3.3. Encoding strategy

In the PSO literature, there are several encoding strategies, such as the vector encoding strategy and the binary encoding strategy. The binary encoding strategy was first proposed by Kennedy and Eberhart (1997); binary PSO, whose particles are encoded as bit strings of 0s and 1s, similar to chromosomes in a genetic algorithm, is used to search discrete spaces for particular optimization tasks. The vector encoding strategy encodes each particle as a real vector and is often used in continuous optimization problems.
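Under the stated assumptions, a compact sketch of the CRPSO procedure (Eq. (10) and Steps 1–7) applied to the neuron parameters could look as follows. The sub-swarm count, particle count, generations and search range follow Table 1; the fixed inertia weight, the fitness function (the training error of Eq. (3)) and all helper names are illustrative simplifications rather than the authors' implementation:

```python
import numpy as np

def mse_fitness(particle, X, yd):
    """Decode a particle [w_1..w_n, b_1..b_n] and return the training error of Eq. (3)."""
    n = X.shape[1]
    w, b = particle[:n], particle[n:]
    u = np.prod(w * X + b, axis=1)
    y = 1.0 / (1.0 + np.exp(-u))
    return np.mean((y - yd) ** 2) / 2.0

def crpso(X, yd, n_swarms=3, n_particles=10, iters=1000, w_inertia=0.7,
          c1=2.0, c2=2.0, bound=15.0, rng=np.random.default_rng(0)):
    dim = 2 * X.shape[1]
    # Step 1: initialize positions and velocities of all sub-swarms
    pos = rng.uniform(-bound, bound, (n_swarms, n_particles, dim))
    vel = np.zeros_like(pos)
    # Step 2: initialize pbest of every particle and the gbest archive of each sub-swarm
    pbest = pos.copy()
    pbest_f = np.array([[mse_fitness(p, X, yd) for p in swarm] for swarm in pos])
    gbest = np.array([sp[np.argmin(f)] for sp, f in zip(pbest, pbest_f)])
    gbest_f = pbest_f.min(axis=1)
    for _ in range(iters):
        for j in range(n_swarms):
            # Step 3: pick a random gbest index from the archive of all sub-swarms
            r = rng.integers(n_swarms)
            r1, r2 = rng.random((2, n_particles, dim))
            # Step 4: velocity/position update, Eq. (10), with the randomly selected gbest
            vel[j] = (w_inertia * vel[j] + c1 * r1 * (pbest[j] - pos[j])
                      + c2 * r2 * (gbest[r] - pos[j]))
            pos[j] += vel[j]
            # Steps 5-6: update pbest and this sub-swarm's entry in the gbest archive
            f = np.array([mse_fitness(p, X, yd) for p in pos[j]])
            improved = f < pbest_f[j]
            pbest[j][improved] = pos[j][improved]
            pbest_f[j][improved] = f[improved]
            if pbest_f[j].min() < gbest_f[j]:
                gbest_f[j] = pbest_f[j].min()
                gbest[j] = pbest[j][np.argmin(pbest_f[j])]
    best = np.argmin(gbest_f)
    return gbest[best], gbest_f[best]
```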

This strategy is adopted in the present study, with the encoding [w1, w2, . . ., wn, b1, b2, . . ., bn], where n is the dimension of the input variable; the vector represents all the parameters of the single neuron model. When calculating the outputs of the model, each particle is decoded into a weight vector and a bias vector. With this encoding, the weight vector [w1, w2, . . ., wn] and the bias vector [b1, b2, . . ., bn] are obtained by splitting the particle vector directly, and the fitness of the particles can be calculated easily.

4. Results and discussion

4.1. The parameters of the algorithms

The developed CRPSO-based single multiplicative neuron model is applied here to three time series prediction problems: the Mackey-Glass (MG) time series (Mackey & Glass, 1977), the Box-Jenkins (BJ) time series (Box, Jenkins, & Reinsel, 1994) and Electroencephalogram (EEG) data (http://www.cs.colostate.edu). The robustness and efficiency of the proposed method are compared with BP, PSO and GA. The parameters of these algorithms are given in the following paragraphs.

The learning rate η of the BP algorithm is set to 0.7 and the maximum number of epochs is set to 5000 in all the experiments. The MATLAB Genetic Algorithm and Direct Search Toolbox (GADS) is used to carry out the GA optimization in this study; the population size of GA is set to 30, the maximum number of generations to 1000, and the other parameters are left at their default values. The parameters of CRPSO are listed in Table 1. PSO uses the same parameters as CRPSO except that the number of generations is set to 3000.

The data sets have been pre-processed by normalizing them between 0.1 and 0.9. All the experiments are run 50 times. The mean MSE, the standard deviations and the "robustness" are compared in the following sections. Here, the term "robustness" means that the algorithm succeeded in converging below a specified threshold using fewer than the maximum number of iterations; a "robust" algorithm reaches the threshold consistently in all runs (Bergh & Engelbrecht, 2004). The efficiency of the algorithms, given the same numbers of fitness evaluations as presented in Table 2, is evaluated by comparing their convergence results. (A small sketch of this evaluation protocol is given after Table 2.)

Table 1
The parameters of CRPSO

Parameter            Value
Sub-swarm numbers    3
Generations          1000
Particle numbers     10
Search range         [-15, 15]
c1 and c2            [2, 2]
ω                    0.9–0.5

Table 2
The numbers of fitness evaluations of CRPSO, PSO and GA

Parameters                   PSO     CRPSO   GA
Individual numbers           10      30      30
Generations                  3000    1000    1000
Fitness evaluation numbers   30000   30000   30000
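The run-level comparison described in Section 4.1 (50 independent runs, mean and standard deviation of the MSE, and the count of "successful" runs below a threshold) can be sketched as follows; the threshold shown is the MG value from Section 4.2, and the trainer argument stands for any of the learning algorithms, so the names are illustrative:

```python
import numpy as np

def model_mse(params, X, y):
    """MSE of the single multiplicative neuron for a decoded particle [w..., b...]."""
    n = X.shape[1]
    w, b = params[:n], params[n:]
    out = 1.0 / (1.0 + np.exp(-np.prod(w * X + b, axis=1)))
    return float(np.mean((out - y) ** 2))

def evaluate(trainer, X_tr, y_tr, X_te, y_te, runs=50, threshold=1e-3):
    """Repeat training and report mean/std MSE plus the 'successful' run counts."""
    train_mse, test_mse = [], []
    for _ in range(runs):
        params, _ = trainer(X_tr, y_tr)          # e.g. the crpso(...) sketch above
        train_mse.append(model_mse(params, X_tr, y_tr))
        test_mse.append(model_mse(params, X_te, y_te))
    train_mse, test_mse = np.array(train_mse), np.array(test_mse)
    return {
        "train": (train_mse.mean(), train_mse.std(), int((train_mse < threshold).sum())),
        "test": (test_mse.mean(), test_mse.std(), int((test_mse < threshold).sum())),
    }
```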


Table 3
The training and testing performance for predicting the MG time series

                   BP           PSO          CRPSO          GA
Training  Mean     0.0038       0.0017       5.2504e-004    5.9584e-004
          Std      (0.0037)     (0.0025)     (2.1940e-006)  (7.1716e-005)
          Best     5.3577e-004  5.2559e-004  5.2311e-004    5.3821e-004
Testing   Mean     0.0041       0.0018       5.4910e-004    6.2173e-004
          Std      (0.0040)     (0.0027)     (2.9311e-006)  (7.2473e-005)
          Best     5.6499e-004  5.3089e-004  5.4657e-004    5.1773e-004

4.2. Mackey-Glass time series

The Mackey-Glass (MG) series, based on the Mackey-Glass differential equation (Mackey & Glass, 1977), is often regarded as a benchmark for testing the performance of neural network models. It is a chaotic time series generated from the following time-delay ordinary differential equation:

$$\frac{dy(t)}{dt} = \frac{a\,y(t-\tau)}{1 + y^{10}(t-\tau)} - b\,y(t), \qquad (11)$$

where τ = 17, a = 0.2 and b = 0.1. The task in this study is to predict the value of the time series at the point y(t + 1) from the earlier points y(t), y(t − 6), y(t − 12) and y(t − 18). The training is performed on 450 samples and 500 samples are used for testing the generalization ability of the model. The training and testing minimum thresholds are both set to 0.001. The training and testing MSEs are given in Table 3 and the "successful" numbers of the four algorithms in 50 runs are given in Table 4. The training and testing results are shown in Fig. 3.

From Table 3, it is observed that CRPSO, PSO and GA perform better than BP in the mean MSEs and the best MSEs in both the training and testing cases. PSO and GA have better best testing MSEs than CRPSO; in all the other cases CRPSO shows the best performance of the four algorithms, so it can be concluded that CRPSO is the most effective learning algorithm for training the model. As seen in Table 4, CRPSO has the most "successful" runs both in training and testing, 50 out of 50, while BP behaves the worst, only 56 out of 100. The model trained by CRPSO follows the chaotic behavior of the MG time series very well, as demonstrated in Fig. 3.
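As an illustration of the data preparation described above, the following sketch generates an MG series by simple Euler integration of Eq. (11), rescales it to [0.1, 0.9] as in Section 4.1, and builds the lagged patterns y(t), y(t − 6), y(t − 12), y(t − 18) with target y(t + 1); the step size, initial history and exact split points are assumptions, not taken from the paper:

```python
import numpy as np

def mackey_glass(n_points, tau=17, a=0.2, b=0.1, dt=1.0, y0=1.2):
    """Euler integration of Eq. (11) with a constant initial history y(t <= 0) = y0."""
    history = int(tau / dt)
    y = np.full(n_points + history, y0)
    for t in range(history, n_points + history - 1):
        y_tau = y[t - history]
        y[t + 1] = y[t] + dt * (a * y_tau / (1.0 + y_tau ** 10) - b * y[t])
    return y[history:]

def make_patterns(series, lags=(0, 6, 12, 18), horizon=1):
    """Input patterns [y(t), y(t-6), y(t-12), y(t-18)] with target y(t+1)."""
    start = max(lags)
    X = np.column_stack([series[start - lag:len(series) - horizon - lag] for lag in lags])
    y = series[start + horizon:]
    return X, y

series = mackey_glass(1000)
series = 0.1 + 0.8 * (series - series.min()) / (series.max() - series.min())  # scale to [0.1, 0.9]
X, y = make_patterns(series)
X_train, y_train = X[:450], y[:450]     # assumed split: 450 training samples
X_test, y_test = X[450:950], y[450:950] # assumed split: 500 testing samples
```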

Table 4
"Successful" numbers out of 50 runs for the MG time series

Method   BP   PSO   CRPSO   GA
Train    28   39    50      49
Test     28   39    50      50

Table 5
The training and testing performance for predicting the BJ time series

                   BP        PSO       CRPSO          GA
Training  Mean     0.0030    0.0029    0.0017         0.0018
          Std      (0.0019)  (0.0019)  (5.5406e-004)  (7.7220e-004)
          Best     0.0016    0.0016    0.0016         0.0016
Testing   Mean     0.0056    0.0054    0.0021         0.0023
          Std      (0.0049)  (0.0050)  (0.0015)       (0.0021)
          Best     0.0019    0.0019    0.0018         0.0018

Table 6
"Successful" numbers out of 50 runs for the BJ time series

Method   BP   PSO   CRPSO   GA
Train    33   33    49      48
Test     8    33    49      42

Fig. 3. The prediction results of the MG time series using the CRPSO-based model.

Fig. 4. The prediction results for the BJ time series using the CRPSO-based model.

4.3. Box-Jenkins gas furnace

The Box-Jenkins gas furnace data set was recorded from a combustion process of a methane-air mixture. There are 296 original data points y(t), u(t), from t = 1 to t = 296, where y(t) is the output CO2 concentration and u(t) is the input gas flow rate. Most methods have found that the best set of input variables for predicting y(t) is y(t − 1) and u(t − 4), which is also used in this study. The training is performed on 140 samples and the model is tested on 150 samples. The training and testing minimum thresholds are both set to 0.002. The training MSEs and testing MSEs are provided in Table 5 and the "successful" numbers of the four algorithms are given in Table 6. Fig. 4 shows the training and testing results.

As seen in Table 5, the four algorithms have the same best training MSE of 0.0016, while CRPSO and GA have better best testing MSEs than BP and PSO. CRPSO has the best training and testing mean MSEs and BP has the worst. It is observed from Table 6 that CRPSO has the most "successful" runs both in training and testing, 49 out of 50 each, while BP behaves the worst, especially in testing with only 8 out of 50. The model trained by CRPSO follows the dynamic behavior of the series with small deviations, as shown in Fig. 4.

4.4. Electroencephalogram (EEG) data

The Electroencephalogram (EEG) data used in this work were taken from http://www.cs.colostate.edu. The EEG data were recorded by Zak Keirn at Purdue University for his Master of Science thesis in the Electrical Engineering Department. This problem is intentionally selected because it has been observed that it cannot be predicted by linear models. Four measurements, y(t − 1), y(t − 2), y(t − 4) and y(t − 8), are used to predict y(t). The training is performed on 150 samples and another 150 samples are used to test the model. Both the training and testing success thresholds are set to 0.009. The training MSEs and testing MSEs are given in Table 7 and the numbers of runs that converged below the thresholds for the four algorithms are listed in Table 8. Fig. 5 provides the training and testing results.

From Table 7, it can be seen that CRPSO and GA have the same training and testing mean MSEs, slightly worse than those of PSO, except that CRPSO has the best testing MSE of 0.0063. BP has the worst values in all cases. Table 8 shows that CRPSO, PSO and GA have the same "successful" numbers in both training and testing, 50 out of 50, whereas BP succeeded in only 14 training runs and 15 testing runs, respectively. The CRPSO-based model follows the trend of the EEG data well, as seen in Fig. 5.

Table 8
"Successful" numbers out of 50 runs for the EEG data

Method   BP   PSO   CRPSO   GA
Train    14   50    50      50
Test     15   50    50      50

Table 7
Comparison of performance for the EEG data among the CRPSO-, BP-, GA- and PSO-based models

                   BP        PSO            CRPSO          GA
Training  Mean     0.0142    0.0080         0.0081         0.0081
          Std      (0.0044)  (7.6044e-009)  (7.5500e-005)  (6.2712e-005)
          Best     0.0081    0.0080         0.0080         0.0080
Testing   Mean     0.0107    0.0066         0.0067         0.0067
          Std      (0.0032)  (3.2200e-007)  (2.7111e-004)  (2.1826e-004)
          Best     0.0066    0.0066         0.0063         0.0064


Fig. 5. The prediction results for the EEG data using the CRPSO-based model.

4.5. Discussion

The results of the aforementioned experiments show that the BP algorithm performs worse than the other three algorithms in both efficiency and robustness. The reason is that BP is a gradient descent based algorithm and cannot escape from a local minimum once trapped, particularly when facing non-differentiable problems or other complicated tasks. Moreover, BP is very sensitive to the initial values; thus, it often converges to a local optimum and has worse robustness. The conventional PSO is also sensitive to its parameters and initial values, which may cause it to be trapped in a local optimum when solving complex problems; therefore, PSO performs worse on the three problems. The novel learning mechanism of CRPSO maintains the diversity of the population effectively and provides more useful information during the iterative process, so CRPSO has the best training and testing MSEs in most cases among the four algorithms. Another benefit of the mechanism is that the multiple sub-swarms search the space independently, which ensures that the search space is sampled thoroughly and increases the chance of finding a good solution. Consequently, CRPSO has better standard deviations and more successful convergence counts than the other algorithms; that is, CRPSO is the most robust algorithm. The genetic algorithm is also a preferable learning algorithm for training neural networks; in this study, GA performs better than BP and PSO, but slightly worse than CRPSO.

5. Conclusion

A cooperative random learning PSO algorithm is introduced to train the single multiplicative neuron model for time series prediction. The single neuron model can be considered a neural network with a simple structure and few parameters, and it is used as a learning machine for function approximation.

PSO is one of the latest evolutionary algorithms and has been applied to a wide range of optimization problems. CRPSO, a variant of the original PSO, decreases the sensitivity to initial values and increases the "robustness" of the original PSO. The proposed CRPSO, PSO, GA and BP have been used as learning algorithms for the single neuron model. The simulation results show that the CRPSO algorithm exhibits much better performance on most criteria than the other algorithms. The proposed method can also be used to train other neural network models and to carry out other optimization tasks in further studies.

References

Angeline, P. J., Saunders, G. M., & Pollack, J. P. (1994). An evolutionary algorithm that constructs recurrent neural networks. IEEE Transactions on Neural Networks, 5, 54–65.
Bergh, F. V. D., & Engelbrecht, A. P. (2004). A cooperative approach to particle swarm optimization. IEEE Transactions on Evolutionary Computation, 8, 225–239.
Bergh, F. V. D., & Engelbrecht, A. P. (2006). A study of particle swarm optimization particle trajectories. Information Sciences, 176, 937–971.
Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (1994). Time series analysis: Forecasting and control. Englewood Cliffs, NJ: Prentice-Hall.
El-Abd, M., & Kamel, M. (2006). Cooperative particle swarm optimizers: A powerful and promising approach. Studies in Computational Intelligence (Vol. 31, pp. 239–259). Berlin, Heidelberg: Springer-Verlag.
Geva, A. B. (1998). ScaleNet-multiscale neural network architecture for time series prediction. IEEE Transactions on Neural Networks, 9(6), 1471–1482.
Iyoda, E. M., Nobuhara, H., & Hirota, K. (2003). A solution for the N-bit parity problem using a single multiplicative neuron. Neural Processing Letters, 18, 233–238.
Kennedy, J., & Eberhart, R. C. (1995). Particle swarm optimization. In Proceedings of the IEEE international conference on neural networks (pp. 1942–1948). Perth, Australia.
Kennedy, J., & Eberhart, R. C. (1997). A discrete binary version of the particle swarm algorithm. In Proceedings of the world multiconference on systemics, cybernetics and informatics (pp. 4104–4109). Piscataway, NJ.
Koch, C. (1997). Computation and the single neuron. Nature, 385, 207–210.
Koch, C., & Segev, I. (2000). The role of single neurons in information processing. Nature Neuroscience, 3(Suppl.), 1171–1177.
Liang, Yi-Hui (2007). Evolutionary neural network modeling for forecasting the field failure data of repairable systems. Expert Systems with Applications, 33, 1090–1096.


Liang, J. J., Qin, A. K., Suganthan, P. N., & Baskar, S. (2006). Comprehensive learning particle swarm optimizer for global optimization of multimodal functions. IEEE Transactions on Evolutionary Computation, 10, 281–285.
Mackey, M., & Glass, L. (1977). Oscillation and chaos in physiological control systems. Science, 197, 287–289.
Manevitz, L., Birtar, A., & Giroli, D. (2005). Neural network time series forecasting of finite element mesh adaptation. Neurocomputing, 63, 447–463.
McKenna, T., Davis, J., & Zornetzer, S. F. (1992). Single neuron computation (Neural nets: Foundations to applications). Academic Press.
Mendes, R., Kennedy, J., & Neves, J. (2004). The fully informed particle swarm: Simpler, maybe better. IEEE Transactions on Evolutionary Computation, 8, 204–210.
Shi, Y., & Eberhart, R. C. (1998). A modified particle swarm optimizer. In Proceedings of the IEEE congress on evolutionary computation (pp. 69–73). Piscataway, USA.

Shi, Y., & Eberhart, R. C. (1999). Empirical study of particle swarm optimization. In Proceedings of the IEEE congress on evolutionary computation (pp. 1945–1950). Piscataway, USA.
Shimodaira, H. (1996). A method for selecting similar learning data in the prediction of time series using neural networks. Expert Systems with Applications, 10(3–4), 429–434.
Tang, A. M., Quek, C., & Ng, G. S. (2005). GA-TSKfnn: Parameters tuning of fuzzy neural network using genetic algorithms. Expert Systems with Applications, 29, 769–781.
Vapnik, V. (1998). Statistical learning theory. New York: John Wiley and Sons.
Wakuya, H., & Zurada, J. M. (2001). Bidirectional computing architecture for time series prediction. Neural Networks, 14(9), 1307–1321.
Yadav, R. N., Kalra, P. K., & John, J. (2007). Time series prediction with single multiplicative neuron model. Applied Soft Computing, 7, 1157–1163.
Yadav, A., Mishra, D., Yadav, R. N., Ray, S., & Kalra, P. K. (2007). Time-series prediction with single integrate-and-fire neuron. Applied Soft Computing, 7, 739–745.