TAIFEX and KOSPI 200 forecasting based on two-factors high-order fuzzy time series and particle swarm optimization


Expert Systems with Applications 37 (2010) 959–967

Contents lists available at ScienceDirect

Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

Jin-Il Park a, Dae-Jong Lee a, Chang-Kyu Song b, Myung-Geun Chun a,*

a Department of Electrical and Computer Engineering, Chungbuk National University, Gaeshin-dong, 12, Cheongju 361-763, Republic of Korea
b CBNU BK21 Chungbuk Information Technology Center, Chungbuk National University, Gaeshin-dong, 12, Cheongju 361-763, Republic of Korea

Keywords: Fuzzy time series; Two-factors high-order fuzzy logical relationships; Particle swarm optimization; TAIFEX; KOSPI 200

Abstract

Fuzzy time series forecasting methods provide a powerful framework for coping with vague or ambiguous problems, and they have therefore been widely used in real applications. The forecasting accuracy of these methods, however, usually depends on the universe of discourse and the length of the intervals. We therefore present a new forecasting method that uses two-factors high-order fuzzy time series and particle swarm optimization (PSO) to increase the forecasting accuracy. To show the effectiveness of the proposed method, we applied it to forecasting the Taiwan futures exchange (TAIFEX) index and the Korea composite stock price index (KOSPI) 200. The results show better forecasting accuracy than previous methods.

© 2009 Elsevier Ltd. All rights reserved.

1. Introduction

Song and Chissom (1994a, 1994b) presented the fuzzy time series model, based on fuzzy set theory (Zadeh, 1965), to forecast the historical enrollments of the University of Alabama. The model has some drawbacks, such as a large computation time when the fuzzy rule matrix is large and a lack of persuasiveness in determining the universe of discourse and the length of intervals. Chen (1996, 2002) presented a simple fuzzy composition method and a high-order fuzzy time series to overcome these problems and improve the forecasting accuracy of the fuzzy time series. Lee, Wang, Chen, and Leu (2004, 2006) presented a two-factors high-order fuzzy time series for temperature prediction and Taiwan futures exchange (TAIFEX) forecasting. The forecasting accuracy of these methods depends mainly on the universe of discourse and the length of intervals.

In recent years, several methods have been proposed to remedy these drawbacks of the fuzzy time series. Huarng (2001a, 2001b) presented two methods, one based on distribution and one based on average, for choosing effective lengths of intervals, and then added a heuristic function to obtain better forecasting results. Lee and Chen (2004) presented a method for temperature prediction using genetic algorithms and fuzzy time series. Lee, Wang, and Chen (2008) also presented a method for temperature prediction and TAIFEX forecasting based on high-order fuzzy logical relationships and genetic simulated annealing techniques (Gen & Cheng, 1997; Goldberg, 1998; Goldberg, Korb, & Deb, 1989; Holland, 1975; Kirkpatric, Gelatt, & Vecchi, 1983; Lin & Chen, 2004).

* Corresponding author. Tel.: +82 43 261 2388; fax: +82 43 268 2386. E-mail address: [email protected] (M.-G. Chun).
0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2009.05.081

On the other hand, the particle swarm optimization (PSO) technique was developed by Kennedy and Eberhart (1995). PSO mimics the social behavior of a flock of migrating birds trying to reach an unknown destination. Emad, Tarek, and Donald (2005) compared five recent evolutionary-based optimization algorithms: genetic algorithms, memetic algorithms, particle swarm, ant-colony systems, and shuffled frog leaping. Their comparison shows that the PSO method generally performed better than the other algorithms in terms of success rate and solution quality. Comparative studies on real problems also report that PSO based results outperform those based on GA (Gaing, 2004; Panda & Padhy, 2007).

Motivated by these previous works, we present a new method that uses PSO and two-factors high-order fuzzy time series to increase the forecasting accuracy. We have applied the proposed method to Taiwan futures exchange (TAIFEX) forecasting and to Korea composite stock price index (KOSPI) 200 forecasting. In both cases, the proposed method shows better forecasting accuracy than previous methods.

The rest of the paper is organized as follows. In Section 2, we give a brief overview of fuzzy time series and particle swarm optimization. In Section 3, we present a particle swarm optimization based two-factors high-order fuzzy time series, and some experimental results are given in Section 4. Finally, some concluding remarks are given in Section 5.

2. Related works

2.1. Fuzzy time series

Song and Chissom (1993a, 1993b, 1994a, 1994b) presented the fuzzy time series based on the fuzzy set and fuzzy boundary. It


provides a powerful framework to cope with vague or ambiguous problems and can express linguistic values and human subjective judgments of natural language. The universe of discourse $U$ can be represented as $U = \{u_1, u_2, \ldots, u_n\}$. A fuzzy set $A_i$ defined in the universe of discourse $U$ can be represented as $A_i = f_{A_i}(u_1)/u_1 + f_{A_i}(u_2)/u_2 + \cdots + f_{A_i}(u_n)/u_n$. Here, $f_{A_i}$ is the membership function of the fuzzy set $A_i$, $f_{A_i}: U \to [0, 1]$; $u_k$ belongs to the fuzzy set $A_i$, and $f_{A_i}(u_k)$ denotes the degree of membership of $u_k$ in the fuzzy set $A_i$. The fuzzy time series can be summarized as follows (Song & Chissom, 1993a, 1993b, 1994a, 1994b).

• Fuzzy time series $F(t)$ is defined by a collection of $f_i(t)$.
• If there exists a fuzzy logical relation $R(t-1, t)$ such that $F(t) = F(t-1) \circ R(t-1, t)$, then $F(t-1) \to F(t)$, where '$\circ$' is the max–min composition operator.
• If a fuzzy logical relation is defined by $A_i \to A_j$, where $F(t-1) = A_i$ and $F(t) = A_j$, then $A_i$ is called the 'left-hand side' and $A_j$ the 'right-hand side' of the fuzzy logical relation.
• Fuzzy logical relationship groups can be formed by the left-hand side of the fuzzy logical relation: $A_i \to A_{j1}, A_{j2}, \ldots$
• Suppose $F(t)$ is caused by $F(t-1)$ only, and $F(t) = F(t-1) \circ R(t-1, t)$. For any $t$, if $R(t-1, t)$ is independent of $t$, then $F(t)$ is named a time-invariant fuzzy time series; otherwise it is a time-variant fuzzy time series.

2.2. Particle swarm optimization

The PSO method is a member of the wide category of swarm intelligence methods for solving optimization problems. Since PSO is based on a simple concept, the social behavior of particles in the swarm, it has been applied with success to complex real world problems. Each particle in PSO flies through the search space with an adaptable velocity that is dynamically modified according to its own flying experience and the flying experience of the other particles. Further, each particle has a memory and hence is capable of remembering the best position in the search space it has ever visited. In PSO, the best position that a particle remembers from its own flying experience is called pbest, and the overall best among all the particles in the population is called gbest. The features of the searching procedure of the PSO can be summarized as follows (Panda & Padhy, 2007).

• Initial positions of pbest are different. However, using the different directions of pbest and gbest, all particles gradually get close to the global optimum.
• The method is usually applied to discrete problems using grids for the XY position and its velocity. However, the modified value of the particle position can be continuous, so the method can also be applied to continuous problems.
• There are no inconsistencies in the searching procedure even if continuous and discrete state variables are used with continuous axes and grids for XY positions and velocities. Namely, the method can be applied naturally and easily to mixed integer non-linear optimization problems with continuous and discrete state variables.

The new velocity and position of each particle can be calculated using the current velocity and the distances from the current position to pbest and gbest, as shown in the following equations:

$$v_i^{(t+1)} = w \cdot v_i^{(t)} + c_1 \cdot r_1 \cdot (pbest_i - x_i^{(t)}) + c_2 \cdot r_2 \cdot (gbest - x_i^{(t)}) \qquad (1)$$

$$x_i^{(t+1)} = x_i^{(t)} + v_i^{(t+1)} \qquad (2)$$

where $v_i^{(t)}$ and $x_i^{(t)}$ are the current velocity and position of the $i$th particle at iteration $t$, and $w$ is the inertia weight factor. The parameters $c_1$ and $c_2$ are called the cognitive and social acceleration factors, respectively. $r_1$ and $r_2$ are two independent random numbers uniformly distributed in the range [0, 1]. Usually, the inertia weight at the $t$th iteration, $w(t)$, is set according to the following equation:

$$w(t) = w_{max} - t \cdot (w_{max} - w_{min}) / iter_{max} \qquad (3)$$

where $w_{max}$ is the initial weight, $w_{min}$ is the final weight, and $iter_{max}$ is the maximum iteration number. Applying a large inertia weight at the start of the algorithm and letting it decay to a small value during the PSO execution makes the algorithm search globally at the beginning of the search and locally at the end of the execution. In the PSO, each particle moves in the search space with a velocity determined by its own previous best solution and its group's previous best solution. The parameters $c_1$ and $c_2$ determine the relative pull of pbest and gbest, and the random numbers $r_1$ and $r_2$ vary these pulls stochastically. In the above equations, superscripts denote the iteration number. Fig. 1 shows the velocity and position updates of a particle for a two-dimensional parameter space, and the overall computational flow of the PSO algorithm is given in Fig. 2.

Fig. 1. Description of velocity and position updates in particle swarm optimization for two-dimensional parameter space (axes $x_1$ and $x_2$).

    Initialize;
        Generate random population of N solutions (particles);
        Initialize the value of the weight factor w;
    Do
        For each particle i ∈ N;
            Calculate objective value (i);
        For each particle;
            Set pbest as the best position of particle i;
            If objective value (i) is better than pbest;
                pbest(i) = objective value (i);
        End;
        Set gbest as the best objective value of all particles;
        For each particle;
            Calculate particle velocity according to Eq. (1);
            Update particle position according to Eq. (2);
        End;
        Update the value of the weight factor w according to Eq. (3);
    Until termination condition is true.

Fig. 2. Particle swarm optimization algorithm.
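As a rough illustration of Eqs. (1)–(3) and the flow in Fig. 2, the following Python sketch minimizes a generic objective function. The function name `pso_minimize`, the default parameter values, the per-dimension random draws, and the clamping of velocities and positions to the search bounds are assumptions made for this example; they are not taken from the paper.

```python
import random


def pso_minimize(objective, dim, lower, upper, swarm_size=30, iter_max=200,
                 c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, seed=0):
    """Minimal PSO sketch: returns (gbest position, gbest objective value)."""
    rng = random.Random(seed)
    span = upper - lower

    # Random initial positions and small random initial velocities.
    x = [[rng.uniform(lower, upper) for _ in range(dim)]
         for _ in range(swarm_size)]
    v = [[rng.uniform(-0.1 * span, 0.1 * span) for _ in range(dim)]
         for _ in range(swarm_size)]

    # pbest: best position each particle has visited so far.
    pbest = [p[:] for p in x]
    pbest_val = [objective(p) for p in x]
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(iter_max):
        # Eq. (3): linearly decreasing inertia weight.
        w = w_max - t * (w_max - w_min) / iter_max
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (1): inertia + cognitive pull + social pull.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                # Assumption: clamp velocity to the width of the search range.
                v[i][d] = max(-span, min(span, v[i][d]))
                # Eq. (2), followed by clamping to the search bounds.
                x[i][d] = max(lower, min(upper, x[i][d] + v[i][d]))
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
        # gbest is refreshed once per iteration, as in Fig. 2.
        g = min(range(swarm_size), key=lambda i: pbest_val[i])
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g][:], pbest_val[g]
    return gbest, gbest_val


def sphere(p):
    """Toy objective used only for this example."""
    return sum(c * c for c in p)


best_pos, best_val = pso_minimize(sphere, dim=2, lower=-5.0, upper=5.0)
```

In the forecasting method of Section 3, the objective would be the forecasting error of the partition encoded by each particle rather than this toy function.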

Fig. 3. The encoding format of the particles in the PSO (a main-factor part followed by a second-factor part).

3. A particle swarm optimization based two-factors high-order fuzzy time series

In this section, we present a new forecasting method using particle swarm optimization and two-factors high-order fuzzy time series. Let $F(t)$ be a fuzzy time series. If $F(t)$ is caused by $F(t-1), F(t-2), \ldots$, and $F(t-n)$, then the fuzzy logical relationship is represented by

$$F(t-n), \ldots, F(t-2), F(t-1) \to F(t) \qquad (4)$$

and it is called a one-factor $n$th-order fuzzy time series forecasting model. If $F(t)$ is caused by $(F_1(t-1), F_2(t-1)), (F_1(t-2), F_2(t-2)), \ldots$, and $(F_1(t-n), F_2(t-n))$, then this fuzzy logical relationship is represented by

$$(F_1(t-n), F_2(t-n)), \ldots, (F_1(t-2), F_2(t-2)), (F_1(t-1), F_2(t-1)) \to F(t) \qquad (5)$$

and it is called the two-factors $n$th-order fuzzy time series forecasting model, where $F_1(t)$ and $F_2(t)$ are called the main-factor and the second-factor, respectively. Based on the two-factors high-order fuzzy time series adopted from Lee et al. (2008), the proposed forecasting algorithm can be described as follows:


Step 1: Define the universe of discourse $U$ of the main-factor, $U = [D_{min} - D_1, D_{max} + D_2]$, where $D_{min}$ and $D_{max}$ are the minimum value and the maximum value of the known historical data, respectively, and $D_1$ and $D_2$ are two proper positive real numbers chosen to divide the universe of discourse $U$ into equal length intervals. Likewise, define the universe of discourse $V$ of the second-factor, $V = [E_{min} - E_1, E_{max} + E_2]$, where $E_{min}$ and $E_{max}$ are the minimum value and the maximum value of the known historical data, respectively, and $E_1$ and $E_2$ are two selected positive numbers chosen to divide the universe of discourse $V$ into equal length intervals. Initialize the PSO parameters (weight factor, acceleration factors and maximum iteration) with predefined values and generate random initial velocities and positions for all particles. Divide each particle into two parts, consisting of $n-1$ elements 'X' and $m-1$ elements 'Y', where the X elements delimit the intervals of the main-factor and the Y elements delimit the intervals of the second-factor. The format of each particle is represented as follows:

$$(x_{i1}, x_{i2}, \ldots, x_{i(n-1)}, y_{i1}, y_{i2}, \ldots, y_{i(m-1)})$$

where $x_{i1} \le x_{i2} \le \ldots \le x_{i(n-1)}$ and $y_{i1} \ge y_{i2} \ge \ldots \ge y_{i(m-1)}$, for $i = 1, 2, \ldots, c$, and $c$ is the swarm size. Here, the universe of discourse $U$ of the main-factor is partitioned into $n$ intervals $u_1, u_2, \ldots, u_n$, where $u_1 = [U_{min}, x_{i1}]$, $u_2 = [x_{i1}, x_{i2}]$, $\ldots$, and $u_n = [x_{i(n-1)}, U_{max}]$. Also, the universe of discourse $V$ of the second-factor is partitioned into $m$ intervals $v_1, v_2, \ldots, v_m$, where $v_1 = [y_{i1}, V_{max}]$, $v_2 = [y_{i2}, y_{i1}]$, $\ldots$, and $v_m = [V_{min}, y_{i(m-1)}]$.

Step 2: Encode the values from the positions of all particles into the particles in the PSO. Here, the two factors are encoded in a single particle as shown in Fig. 3.

Step 3: Define the linguistic terms $A_i$, for $i = 1, 2, \ldots, n$, which are represented by fuzzy sets of the main-factor:

$$\begin{aligned}
A_1 &= 1/u_1 + 0.5/u_2 + 0/u_3 + 0/u_4 + 0/u_5 + \ldots + 0/u_{n-2} + 0/u_{n-1} + 0/u_n \\
A_2 &= 0.5/u_1 + 1/u_2 + 0.5/u_3 + 0/u_4 + 0/u_5 + \ldots + 0/u_{n-2} + 0/u_{n-1} + 0/u_n \\
A_3 &= 0/u_1 + 0.5/u_2 + 1/u_3 + 0.5/u_4 + 0/u_5 + \ldots + 0/u_{n-2} + 0/u_{n-1} + 0/u_n \\
&\;\;\vdots \\
A_n &= 0/u_1 + 0/u_2 + 0/u_3 + 0/u_4 + 0/u_5 + \ldots + 0/u_{n-2} + 0.5/u_{n-1} + 1/u_n
\end{aligned} \qquad (6)$$

where $A_1, A_2, \ldots, A_n$ are linguistic terms that describe the values of the main-factor, which are divided into equal length intervals $u_1, u_2, \ldots, u_n$. Also define the linguistic terms $B_j$, for $j = 1, 2, \ldots, m$, which are represented by fuzzy sets of the second-factor as follows:

$$\begin{aligned}
B_1 &= 1/v_1 + 0.5/v_2 + 0/v_3 + \ldots + 0/v_{m-2} + 0/v_{m-1} + 0/v_m \\
B_2 &= 0.5/v_1 + 1/v_2 + 0.5/v_3 + \ldots + 0/v_{m-2} + 0/v_{m-1} + 0/v_m \\
B_3 &= 0/v_1 + 0.5/v_2 + 1/v_3 + \ldots + 0/v_{m-2} + 0/v_{m-1} + 0/v_m \\
&\;\;\vdots \\
B_m &= 0/v_1 + 0/v_2 + 0/v_3 + \ldots + 0/v_{m-2} + 0.5/v_{m-1} + 1/v_m
\end{aligned} \qquad (7)$$

where $B_1, B_2, \ldots, B_m$ are linguistic terms that describe the values of the second-factor, which are divided into equal length intervals $v_1, v_2, \ldots, v_m$.

Step 4: Fuzzify the historical data as follows. Find the interval $u_i$, where $1 \le i \le n$, to which the value of the main-factor belongs.
Case (1) If the value of the main-factor belongs to $u_1$, then the value of the main-factor is fuzzified into $1/A_1 + 0.5/A_2$, denoted by $X_1$.
Case (2) If the value of the main-factor belongs to $u_i$, where $2 \le i \le n-1$, then the value of the main-factor is fuzzified into $0.5/A_{i-1} + 1/A_i + 0.5/A_{i+1}$, denoted by $X_i$.
Case (3) If the value of the main-factor belongs to $u_n$, then the value of the main-factor is fuzzified into $0.5/A_{n-1} + 1/A_n$, denoted by $X_n$.
Find the interval $v_j$, where $1 \le j \le m$, to which the value of the second-factor belongs.
Case (1) If the value of the second-factor belongs to $v_1$, then the value of the second-factor is fuzzified into $1/B_1 + 0.5/B_2$, denoted by $Y_1$.
Case (2) If the value of the second-factor belongs to $v_j$, where $2 \le j \le m-1$, then the value of the second-factor is fuzzified into $0.5/B_{j-1} + 1/B_j + 0.5/B_{j+1}$, denoted by $Y_j$.
Case (3) If the value of the second-factor belongs to $v_m$, then the value of the second-factor is fuzzified into $0.5/B_{m-1} + 1/B_m$, denoted by $Y_m$.
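Steps 1 to 4 can be sketched in Python as follows. The sketch decodes the cut points stored in a particle into interval indices and applies the three fuzzification cases. The cut points are the PSO values reported in Table 3 and the universe [6100, 7700] is the one given in Table 5; the function names and the convention that a value equal to a cut point falls into the higher-valued interval are assumptions of this example.

```python
from bisect import bisect_right

# PSO cut points of the best particle (Table 3, seventh-order case).
X_CUTS = [6200, 6414, 6566, 6701, 6726, 6769, 6794, 6848,
          6880, 6916, 6952, 7039, 7221, 7288, 7325]   # ascending, main-factor
Y_CUTS = [7481, 7301, 7297, 7004, 6996, 6978, 6929, 6902,
          6588, 6532, 6412, 6408, 6398, 6361, 6304]   # descending, second-factor


def intervals_from_cuts(lo, hi, cuts):
    """List the consecutive intervals of [lo, hi] delimited by the cut points."""
    edges = [lo] + sorted(cuts) + [hi]
    return list(zip(edges[:-1], edges[1:]))


def interval_index(value, lo, hi, cuts, descending=False):
    """Return the 1-based index of the interval that contains value.

    With descending=True the intervals are numbered from the top of the
    universe downwards, as for v_1 = [y_1, V_max] in Step 1.
    """
    if not lo <= value <= hi:
        raise ValueError("value lies outside the universe of discourse")
    ascending_index = bisect_right(sorted(cuts), value) + 1
    if descending:
        return len(cuts) + 2 - ascending_index
    return ascending_index


def fuzzify(i, n):
    """Step 4: membership 1 for term i and 0.5 for its existing neighbours."""
    memberships = {i: 1.0}
    if i > 1:
        memberships[i - 1] = 0.5
    if i < n:
        memberships[i + 1] = 0.5
    return memberships


n_terms = len(X_CUTS) + 1                              # 16 intervals
i = interval_index(6955, 6100, 7700, X_CUTS)           # TAIFEX on 8/24/1998
j = interval_index(6958, 6100, 7700, Y_CUTS, descending=True)  # TAIEX, same day
X_i = fuzzify(i, n_terms)
```

Here `X_i` maps the index of each linguistic term $A_k$ to its membership degree, so the dictionary `{11: 0.5, 12: 1.0, 13: 0.5}` stands for $0.5/A_{11} + 1/A_{12} + 0.5/A_{13}$.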


Step 5: Get the two-factors $n$th-order fuzzy logical relationships based on the fuzzified main-factor and the fuzzified second-factor from the fuzzified historical data obtained in Step 4. If the fuzzified historical data of the main-factor of day $i$ is $X_i$, then construct the two-factors $k$th-order fuzzy logical relationships '$((X_{i-k}, Y_{i-k}), \ldots, (X_{i-2}, Y_{i-2}), (X_{i-1}, Y_{i-1})) \to X_i$' from day $i-k$ to day $i$, where $2 \le k \le n$ and $X_{i-k}, \ldots, X_{i-2}, X_{i-1}$ denote the fuzzified values of the main-factor of days $i-k, \ldots, i-2, i-1$, respectively; $Y_{i-k}, \ldots, Y_{i-2}, Y_{i-1}$ denote the fuzzified values of the second-factor of days $i-k, \ldots, i-2, i-1$, respectively. Then, divide the derived fuzzy logical relationships into fuzzy logical relationship groups based on the current states of the fuzzy logical relationships.

Step 6: Calculate the forecasted values for each individual based on the following principles.

(1) If the two-factors $k$th-order fuzzified historical data before day $i$ are $(X_{i-k}, Y_{i-k}), (X_{i-(k-1)}, Y_{i-(k-1)}), \ldots$, and $(X_{i-1}, Y_{i-1})$, where $k \ge 2$; $X_{i-k}, X_{i-(k-1)}, \ldots, X_{i-1}$ and $X_j$ are fuzzified values represented by fuzzy sets of the main-factor fuzzy time series; $Y_{i-k}, Y_{i-(k-1)}, \ldots, Y_{i-1}$ are fuzzified values represented by fuzzy sets of the second-factor fuzzy time series, and there is the following fuzzy logical relationship in the $k$th-order fuzzy logical relationship groups:

$$(X_{i-k}, Y_{i-k}), (X_{i-(k-1)}, Y_{i-(k-1)}), \ldots, (X_{i-1}, Y_{i-1}) \to X_j$$

then the forecasted value $t_j$ of day $i$ is calculated as follows:

$$t_j = \begin{cases}
\dfrac{m_1 + 0.5\,m_2}{1 + 0.5}, & \text{if } j = 1 \\[2ex]
\dfrac{0.5\,m_{j-1} + m_j + 0.5\,m_{j+1}}{0.5 + 1 + 0.5}, & \text{if } 2 \le j \le n-1 \\[2ex]
\dfrac{0.5\,m_{n-1} + m_n}{0.5 + 1}, & \text{if } j = n
\end{cases} \qquad (8)$$

where $m_{j-1}$, $m_j$ and $m_{j+1}$ are the midpoints of the intervals $u_{j-1}$, $u_j$ and $u_{j+1}$, respectively.

(2) If the two-factors $k$th-order fuzzified historical data before day $i$ are $(X_{i-k}, Y_{i-k}), (X_{i-(k-1)}, Y_{i-(k-1)}), \ldots$, and $(X_{i-1}, Y_{i-1})$, where $k \ge 2$; $X_{i-k}, X_{i-(k-1)}, \ldots, X_{i-1}, X_{j1}, X_{j2}, \ldots$, and $X_{jp}$ are fuzzified values represented by fuzzy sets of the main-factor fuzzy time series; $Y_{i-k}, Y_{i-(k-1)}, \ldots, Y_{i-1}$ are fuzzified values represented by fuzzy sets of the second-factor fuzzy time series, and the fuzzy logical relationships in the $k$th-order fuzzy logical relationship group are as follows:

$$\begin{aligned}
(X_{i-k}, Y_{i-k}), (X_{i-(k-1)}, Y_{i-(k-1)}), \ldots, (X_{i-1}, Y_{i-1}) &\to X_{j1} \\
(X_{i-k}, Y_{i-k}), (X_{i-(k-1)}, Y_{i-(k-1)}), \ldots, (X_{i-1}, Y_{i-1}) &\to X_{j2} \\
&\;\;\vdots \\
(X_{i-k}, Y_{i-k}), (X_{i-(k-1)}, Y_{i-(k-1)}), \ldots, (X_{i-1}, Y_{i-1}) &\to X_{jp}
\end{aligned} \qquad (9)$$

where $X_{i-k}, X_{i-(k-1)}, \ldots, X_{i-1}, X_{j1}, X_{j2}, \ldots$, and $X_{jp}$ are fuzzy sets of the main-factor and $Y_{i-k}, Y_{i-(k-1)}, \ldots, Y_{i-1}$ are fuzzy sets of the second-factor, and the numbers of $X_{j1}, X_{j2}, \ldots$, and $X_{jp}$ appearing in the fuzzy logical relationship group are $n_{j1}, n_{j2}, \ldots$, and $n_{jp}$, respectively, then the forecasted value of day $i$ is calculated as follows:

$$\frac{n_{j1} \cdot t_{j1} + n_{j2} \cdot t_{j2} + \cdots + n_{jp} \cdot t_{jp}}{n_{j1} + n_{j2} + \cdots + n_{jp}} \qquad (10)$$

where the values of $t_{j1}, t_{j2}, \ldots$, and $t_{jp}$ are calculated by (8), respectively.

(3) If the two-factors $k$th-order fuzzified historical data before day $i$ are $(X_{i-k}, Y_{i-k}), (X_{i-(k-1)}, Y_{i-(k-1)}), \ldots$, and $(X_{i-1}, Y_{i-1})$, where $k \ge 2$; $X_{i-k}, X_{i-(k-1)}, \ldots$, and $X_{i-1}$ are fuzzified values represented by fuzzy sets of the main-factor fuzzy time series; $Y_{i-k}, Y_{i-(k-1)}, \ldots, Y_{i-1}$ are fuzzified values represented by fuzzy sets of the second-factor fuzzy time series, and there is the following fuzzy logical relationship in the $k$th-order fuzzy logical relationship groups in which the right-hand side of the fuzzy logical relationship is an unknown value '#':

$$(X_{i-k}, Y_{i-k}), (X_{i-(k-1)}, Y_{i-(k-1)}), \ldots, (X_{i-1}, Y_{i-1}) \to \# \qquad (11)$$

then the forecasted value of day $i$ is calculated as follows:

$$\frac{1 \cdot t_{i-k} + 2 \cdot t_{i-(k-1)} + \cdots + k \cdot t_{i-1}}{1 + 2 + \cdots + k} \qquad (12)$$

where the values of $t_{i-k}, t_{i-(k-1)}, \ldots$, and $t_{i-1}$ are calculated by (8), respectively.

Here, we use the mean square error (MSE) shown in Eq. (13) as the objective value of each particle in the particle swarm optimization.

$$MSE = \frac{\sum_{i=1}^{n} (\text{Forecasted value of } i - \text{Actual value of } i)^2}{n} \qquad (13)$$

Step 7: Calculate the pbest based on the objective values for each particle. If the obtained objective value is better than the previous value, then the pbest is updated with the new value for each particle. Otherwise, the pbest keeps its previous best value. Then set the gbest of all particles, which is the best pbest of all particles. If the new objective value of gbest is better than the previous value, then the gbest is updated with the new value; otherwise it keeps the previous value.

Step 8: For each particle, calculate the particle velocity according to Eq. (1), which is based on the gbest and pbest values, and then update the particle position according to Eq. (2). This step leads each particle to a more promising solution for the optimal tuning of the two-factors $n$th-order fuzzy time series.

Step 9: Update the value of the weight factor $w$ according to Eq. (3).

Step 10: Stop if the stop criterion of the PSO, the maximum number of iterations, is satisfied. Otherwise, go to Step 2.

The overall computational flow of the proposed PSO based two-factors high-order fuzzy time series forecasting is described in Fig. 4.

    Initialize;
        Define the universe of discourse (U, V);
        Generate random population of N solutions (particles);
        Initialize the value of the weight factor w;
    Do
        For each particle i ∈ N;
            Construct the two-factors nth-order fuzzy time series of particle i;
            Calculate objective value (i) according to Eq. (13);
            Set pbest as the best position of particle i;
            If objective value (i) is better than pbest;
                pbest(i) = objective value (i);
        End;
        Set gbest as the best objective value of all particles;
        For each particle;
            Calculate particle velocity according to Eq. (1);
            Update particle position according to Eq. (2);
        End;
        Update the value of the weight factor w according to Eq. (3);
    Until termination condition is true.

Fig. 4. PSO based two-factors high-order fuzzy time series forecasting algorithm.

Table 1
Historical data of the TAIFEX and the TAIEX from August 3, 1998 to September 30, 1998 (Huarng, 2001b).

Date        TAIFEX index   TAIEX index
8/3/1998    7552           7599
8/4/1998    7560           7593
8/5/1998    7487           7500
8/6/1998    7462           7472
8/7/1998    7515           7530
8/10/1998   7365           7372
8/11/1998   7360           7384
8/12/1998   7330           7352
8/13/1998   7291           7363
8/14/1998   7320           7348
8/15/1998   7300           7372
8/17/1998   7219           7274
8/18/1998   7220           7182
8/19/1998   7285           7293
8/20/1998   7274           7271
8/21/1998   7225           7213
8/24/1998   6955           6958
8/25/1998   6949           6908
8/26/1998   6790           6814
8/27/1998   6835           6813
8/28/1998   6695           6724
8/29/1998   6728           6736
8/31/1998   6566           6550
9/1/1998    6409           6335
9/2/1998    6430           6472
9/3/1998    6200           6251
9/4/1998    6403.2         6463
9/5/1998    6697.5         6756
9/7/1998    6722.3         6801
9/8/1998    6859.4         6942
9/9/1998    6769.6         6895
9/10/1998   6709.75        6804
9/11/1998   6726.5         6842
9/14/1998   6774.55        6860
9/15/1998   6762           6858
9/16/1998   6952.75        6973
9/17/1998   6906           7001
9/18/1998   6842           6962
9/19/1998   7039           7150
9/21/1998   6861           7029
9/22/1998   6926           7034
9/23/1998   6852           6962
9/24/1998   6890           6980
9/25/1998   6871           6980
9/28/1998   6840           6911
9/29/1998   6806           6885
9/30/1998   6787           6834

Fig. 5. Convergence of best objective values for GA and PSO based forecasting methods (seventh-order case).

Fig. 6. Convergence of average objective values for GA and PSO based forecasting methods (seventh-order case).

Table 2
A comparison of mean square errors for different orders of each method.

Order           GA based forecasting (Lee et al., 2008)   PSO based forecasting
First-order     799.19                                    577.45
Second-order    193.88                                    115.17
Third-order     208.79                                    121.55
Fourth-order    142.26                                    101.67
Fifth-order     143.31                                    70.47
Sixth-order     147.14                                    72.38
Seventh-order   105.02                                    55.96
Eighth-order    124.45                                    61.71

Table 3
The partition of the universe of discourse for both the main-factor and the second-factor of the best particle (seventh-order case).

k     x_k (GA)   x_k (PSO)   y_k (GA)   y_k (PSO)
1     6300       6200        7466       7481
2     6372       6414        7170       7301
3     6455       6566        7111       7297
4     6679       6701        6967       7004
5     6722       6726        6939       6996
6     6730       6769        6865       6978
7     6833       6794        6814       6929
8     6846       6848        6785       6902
9     6876       6880        6742       6588
10    6920       6916        6723       6532
11    6971       6952        6676       6412
12    7107       7039        6641       6408
13    7211       7221        6571       6398
14    7231       7288        6536       6361
15    7369       7325        6488       6304
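The forecasting rules of Step 6 and the objective of Eq. (13) can be sketched in Python as follows. The function names, the use of integer indices $j$ to stand for the fuzzified values $X_j$, and the small four-interval partition in the usage lines are assumptions of this example, not part of the original method description. The objective is written as a plain mean of squared errors.

```python
def midpoints(lo, hi, cuts):
    """Midpoints m_1..m_n of the intervals of [lo, hi] delimited by cuts."""
    edges = [lo] + sorted(cuts) + [hi]
    return [(a + b) / 2 for a, b in zip(edges[:-1], edges[1:])]


def t_value(j, mids):
    """Eq. (8): forecasted value for right-hand side X_j (1-based j)."""
    n = len(mids)
    if j == 1:
        return (mids[0] + 0.5 * mids[1]) / (1 + 0.5)
    if j == n:
        return (0.5 * mids[n - 2] + mids[n - 1]) / (0.5 + 1)
    return (0.5 * mids[j - 2] + mids[j - 1] + 0.5 * mids[j]) / (0.5 + 1 + 0.5)


def group_forecast(counts, mids):
    """Eq. (10): counts maps each right-hand side index j to its count n_j."""
    total = sum(counts.values())
    return sum(n_j * t_value(j, mids) for j, n_j in counts.items()) / total


def unknown_forecast(history, mids):
    """Eq. (12): history holds the indices of X_{i-k}, ..., X_{i-1} in order."""
    k = len(history)
    weighted = sum(w * t_value(j, mids) for w, j in enumerate(history, start=1))
    return weighted / (k * (k + 1) / 2)


def mse(forecasts, actuals):
    """Eq. (13): mean of the squared forecasting errors."""
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals)


# Hypothetical partition with four intervals: [5,15], [15,25], [25,35], [35,45].
mids = midpoints(5, 45, [15, 25, 35])
single = t_value(2, mids)                      # one right-hand side, case (1)
grouped = group_forecast({2: 1, 3: 3}, mids)   # several right-hand sides, case (2)
fallback = unknown_forecast([2, 3], mids)      # unknown right-hand side, case (3)

# First two forecasted days of the proposed method in Table 4.
error = mse([7325, 7287.5], [7330, 7291])
```

In the full algorithm, `mse` over all forecasted days would be the objective value that each particle passes to the PSO loop.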

4. Experimental results

We use two stock market datasets to compare the proposed method with the most recently studied GA based two-factors high-order fuzzy time series (Lee et al., 2008). In the first experiment, we apply the proposed method to the TAIFEX from August 3, 1998 to September 30, 1998, where the TAIFEX is the main-factor and the TAIEX (Taiwan stock exchange capitalization weighted stock index) is the second-factor. In the second experiment, we forecast the Korea composite stock price index (KOSPI) 200, where the futures price is the main-factor and the underlying price is the second-factor. To compare the performance of the two methods, every experiment is executed fifty times for each fixed population size and number of iterations, and we present the best-performing solution in each experiment. For reference, the initial parameters of the PSO and GA algorithms are presented in Table 5.

Table 5
Parameters used for PSO and GA based forecasting.

PSO parameters:
    Swarm size: 30
    Maximum number of iterations: 500
    $c_1 = 0.2$, $c_2 = 2$
    $w_{min} = 0.4$, $w_{max} = 0.9$
GA parameters:
    Population size: 30
    Maximum number of iterations: 500
    Type of selection: grade selection
    Type of crossover: simple crossover [0.8]
    Type of mutation: dynamic mutation [0.05]
Universe of discourse: main-factor $[U_{min} = 6100, U_{max} = 7700]$, second-factor $[V_{min} = 6100, V_{max} = 7700]$

4.1. Forecasting for TAIFEX

Table 1 shows the historical data of the TAIFEX and the TAIEX from August 3, 1998 to September 30, 1998. The following figures show the performance of the TAIFEX forecasting. The convergence of the best objective values and of the average objective values for the GA and PSO based two-factors seventh-order fuzzy time series is depicted in Figs. 5 and 6, respectively. These results show that the proposed method has better prediction accuracy in terms of mean square error. The simulation results for each order of the models are presented in Table 2, where the proposed method shows better forecasting accuracy than the GA based forecasting method for all orders. The seventh-order case shows the best performance. The partitions of the universe of discourse of the main-factor and the second-factor that give the best forecasting are shown in Table 3, where the universes of discourse of the main-factor and the second-factor are each divided into 16 intervals, as in Lee et al. (2008). Table 4 compares the forecasted values of the TAIFEX and the mean square errors of different forecasting methods. As shown in this table, the proposed method gives better forecasting results than the previous methods.

Table 4
A comparison of the forecasting values of the TAIFEX and the mean square errors for different forecasting methods.
Columns: Actual = actual TAIFEX index; Chen = Chen's method (Chen, 1996); Huarng-2 = Huarng's method (Huarng, 2001a) (two-variable heuristic); Huarng-3 = Huarng's method (Huarng, 2001b) (three-variable heuristic); Lee06 = Lee et al.'s method (Lee et al., 2006); Lee08 = Lee et al.'s method (Lee et al., 2008) (GA; seventh-order; annealing constant α = 0.9); Proposed = proposed forecasting method (seventh-order case).

Date        Actual    Chen      Huarng-2  Huarng-3  Lee06     Lee08     Proposed
8/3/1998    7552      –         –         –         –         –         –
8/4/1998    7560      7450      7450      7450      –         –         –
8/5/1998    7487      7450      7450      7450      –         –         –
8/6/1998    7462      7500      7450      7500      7450      –         –
8/7/1998    7515      7500      7500      7500      7550      –         –
8/10/1998   7365      7450      7450      7450      7350      –         –
8/11/1998   7360      7300      7350      7300      7350      –         –
8/12/1998   7330      7300      7300      7300      7350      7329      7325
8/13/1998   7291      7300      7350      7300      7250      7289.5    7287.5
8/14/1998   7320      7183.33   7100      7188.33   7350      7329      7325
8/15/1998   7300      7300      7350      7300      7350      7289.5    7287.5
8/17/1998   7219      7300      7300      7300      7250      7215      7221.3
8/18/1998   7220      7183.33   7100      7100      7250      7215      7221.3
8/19/1998   7285      7183.33   7300      7300      7250      7289.5    7287.5
8/20/1998   7274      7183.33   7100      7188.33   7250      7289.5    7287.5
8/21/1998   7225      7183.33   7100      7100      7250      7215      7221.3
8/24/1998   6955      7183.33   7100      7100      6950      6949.5    6952.2
8/25/1998   6949      6850      6850      6850      6950      6949.5    6952.2
8/26/1998   6790      6850      6850      6850      6750      6796      6794.3
8/27/1998   6835      6775      6650      6775      6850      6848      6848.2
8/28/1998   6695      6850      6750      6750      6650      6698.5    6700.8
8/29/1998   6728      6750      6750      6750      6750      6726      6725.6
8/31/1998   6566      6775      6650      6650      6550      6569.5    6566
9/1/1998    6409      6450      6450      6450      6450      6417      6414.1
9/2/1998    6430      6450      6550      6550      6450      6417      6414.1
9/3/1998    6200      6450      6350      6350      6250      6205      6200
9/4/1998    6403.2    6450      6450      6450      6450      6417      6414.1
9/5/1998    6697.5    6450      6550      6550      6650      6698.5    6700.8
9/7/1998    6722.3    6750      6750      6750      6750      6726      6725.6
9/8/1998    6859.4    6775      6850      6850      6850      6848      6848.2
9/9/1998    6769.6    6850      6750      6750      6750      6763      6768.7
9/10/1998   6709.75   6775      6650      6650      6750      6726      6700.8
9/11/1998   6726.5    6775      6850      6775      6750      6726      6725.6
9/14/1998   6774.55   6775      6850      6775      6817      6763      6768.7
9/15/1998   6762      6775      6650      6775      6817      6763      6768.7
9/16/1998   6952.75   6775      6850      6850      6817      6949.5    6952.2
9/17/1998   6906      6850      6950      6850      6950      6904.5    6916
9/18/1998   6842      6850      6850      6850      6850      6848      6848.2
9/19/1998   7039      6850      6950      6850      7050      7064      7039
9/21/1998   6861      6850      6850      6850      6850      6848      6848.2
9/22/1998   6926      6850      6850      6850      6950      6904.5    6916
9/23/1998   6852      6850      6850      6850      6850      6848      6848.2
9/24/1998   6890      6850      6950      6850      6850      6904.5    6880.5
9/25/1998   6871      6850      6850      6850      6850      6848      6880.5
9/28/1998   6840      6850      6750      6750      6850      6848      6848.2
9/29/1998   6806      6850      6750      6850      6850      6796      6794.3
9/30/1998   6787      6850      6750      6750      6750      6796      6794.3
MSE                   9668.94   7856.5    5437.58   1364.56   105.02    55.96

4.2. Forecasting for KOSPI 200

Table 6
Historical data of the KOSPI 200 from March 3, 2008 to April 30, 2008.

Date        Futures price   Underlying price
3/3/2008    213.05          211.73
3/4/2008    213.65          212.14
3/5/2008    213.85          212.14
3/6/2008    217.4           214.92
3/7/2008    211.5           210.57
3/10/2008   208.05          206.29
3/11/2008   208.9           208.06
3/12/2008   212.8           210.49
3/13/2008   206             204.42
3/14/2008   206             202.63
3/17/2008   200.95          199.68
3/18/2008   202.95          201.79
3/19/2008   208.3           206.48
3/20/2008   207.35          206.62
3/21/2008   211.55          209.69
3/24/2008   212.8           211
3/25/2008   214.9           213.73
3/26/2008   216.05          214.41
3/27/2008   214.8           213.85
3/28/2008   219.1           217.22
3/31/2008   217.6           217.65
4/1/2008    219.1           217.81
4/2/2008    225.85          223.76
4/3/2008    228             226.99
4/4/2008    229.45          226.95
4/7/2008    229.75          227.79
4/8/2008    227.15          225.1
4/10/2008   228.4           226.44
4/11/2008   229.9           228.8
4/14/2008   225.7           224.54
4/15/2008   224.6           223.51
4/16/2008   227.3           225.36
4/17/2008   228.85          226.98
4/18/2008   228.9           227.25
4/21/2008   232.5           231.11
4/22/2008   230.9           229.22
4/23/2008   232.35          230.56
4/24/2008   232.7           230.77
4/25/2008   236.55          234.56
4/28/2008   236.4           234.79
4/29/2008   234.75          233.24
4/30/2008   237             235

The Korea composite stock price index (KOSPI) 200 consists of 200 major stocks selected from all stocks listed on the Korea Exchange (KRX) stock market. We consider the daily stock price index from March 3, 2008 to April 30, 2008, as shown in Table 6. Here, the futures price is taken as the main-factor and the underlying price as the second-factor. The historical data of the KOSPI 200 were taken from http://eng.krx.co.kr/index.html. Figs. 7 and 8 show the convergence of the best and the average objective values, respectively, for the GA and PSO based two-factors eighth-order fuzzy time series. Detailed results of the KOSPI 200 forecasting are given in Table 7. The partitions of the universe of discourse that gave the best forecasting result for the main-factor and the second-factor are shown in Table 8, where each universe of discourse is divided into 16 intervals. Table 9 compares the forecasted values of the KOSPI 200 and the mean square errors of each method. The proposed method has a smaller mean square error than the GA based forecasting method (0.16 versus 0.25 in the eighth-order case).

Fig. 7. Convergence of best objective values for GA and PSO based forecasting methods (eighth-order case).

Fig. 8. Convergence of average objective values for GA and PSO based forecasting methods (eighth-order case).

5. Concluding remarks

The stock market is by nature a highly volatile time series, and it is difficult to express its underlying relationships as a mathematical model. For such real-world problems, fuzzy time series have shown good performance. This paper presented a new forecasting method based on PSO and two-factors high-order fuzzy time series. After applying the proposed method to the real-world TAIFEX and KOSPI 200 datasets, we found that it gives better forecasting accuracy than previous methods. In this work, we forecasted the future price of a stock based solely on the trends

Table 7 A comparison of mean square errors for different orders of each method.

| Algorithms | First-order | Second-order | Third-order | Fourth-order | Fifth-order | Sixth-order | Seventh-order | Eighth-order |
|---|---|---|---|---|---|---|---|---|
| GA based forecasting | 1.80 | 0.36 | 0.26 | 0.26 | 0.27 | 0.26 | 0.26 | 0.25 |
| PSO based forecasting | 0.71 | 0.21 | 0.17 | 0.24 | 0.17 | 0.18 | 0.17 | 0.16 |
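To make the gap in Table 7 easier to read, the short sketch below expresses each PSO result as a percentage reduction relative to the GA result of the same order. The MSE values are transcribed from Table 7; the helper name `relative_reduction` is ours and does not appear in the paper.

```python
# MSE values transcribed from Table 7 (first-order through eighth-order).
ga_mse = [1.80, 0.36, 0.26, 0.26, 0.27, 0.26, 0.26, 0.25]
pso_mse = [0.71, 0.21, 0.17, 0.24, 0.17, 0.18, 0.17, 0.16]


def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline` (hypothetical helper)."""
    return 100.0 * (baseline - proposed) / baseline


reductions = [relative_reduction(g, p) for g, p in zip(ga_mse, pso_mse)]

for order, r in enumerate(reductions, start=1):
    print(f"order {order}: PSO MSE is {r:.1f}% lower than GA MSE")
```

In every order the reduction is positive, although its size varies considerably from one order to the next.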

Table 8 The partition of the universe of discourse for the main-factor and the second-factor of the best particle (eighth-order case).

| i | x_i (GA based forecasting) | x_i (PSO based forecasting) | y_i (GA based forecasting) | y_i (PSO based forecasting) |
|---|---|---|---|---|
| 1 | 199 | 201 | 237 | 233 |
| 2 | 204 | 203 | 232 | 223 |
| 3 | 207 | 206 | 226 | 221 |
| 4 | 208 | 208 | 221 | 220 |
| 5 | 210 | 212 | 216 | 219 |
| 6 | 214 | 215 | 211 | 218 |
| 7 | 217 | 219 | 208 | 216 |
| 8 | 219 | 225 | 202 | 214 |
| 9 | 221 | 227 | 196 | 210 |
| 10 | 224 | 229 | 195 | 206 |
| 11 | 227 | 230 | 192 | 206 |
| 12 | 229 | 231 | 190 | 199 |
| 13 | 231 | 233 | 189 | 197 |
| 14 | 234 | 235 | 188 | 196 |
| 15 | 239 | 237 | 187 | 186 |
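As an illustration of how 15 cut points split a universe of discourse into 16 intervals, the sketch below builds the intervals from the PSO values x1 to x15 in Table 8 and locates one futures price in them. It assumes that the x values partition the main-factor universe [185, 240] given in Table 10 and that intervals are half-open on the right; both the function names and the boundary convention are our assumptions, not taken from the paper.

```python
from bisect import bisect_right

# Universe of discourse bounds as listed in Table 10.
U_MIN, U_MAX = 185, 240

# PSO cut points x1..x15 transcribed from Table 8 (eighth-order case).
pso_cuts = [201, 203, 206, 208, 212, 215, 219, 225, 227, 229, 230, 231, 233, 235, 237]


def build_intervals(cuts, lo, hi):
    """Return consecutive (lower, upper) pairs covering [lo, hi]."""
    bounds = [lo] + sorted(cuts) + [hi]
    return list(zip(bounds[:-1], bounds[1:]))


def interval_index(value, cuts):
    """0-based index of the interval containing `value`.

    Assumption: a value equal to a cut point belongs to the interval
    that starts at that cut point.
    """
    return bisect_right(sorted(cuts), value)


intervals = build_intervals(pso_cuts, U_MIN, U_MAX)
idx = interval_index(213.05, pso_cuts)  # futures price on 3/3/2008
print(len(intervals), idx, intervals[idx])
```

With these cut points the futures price 213.05 falls between 212 and 215, that is, in the sixth of the 16 intervals.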

Table 9 A comparison of the forecasting values of the KOSPI 200 and the mean square errors of each method.

| Date | Actual futures price | GA based forecasting (eighth-order case) | PSO based forecasting (eighth-order case) |
|---|---|---|---|
| 3/3/2008 | 213.05 | – | – |
| 3/4/2008 | 213.65 | – | – |
| 3/5/2008 | 213.85 | – | – |
| 3/6/2008 | 217.4 | – | – |
| 3/7/2008 | 211.5 | – | – |
| 3/10/2008 | 208.05 | – | – |
| 3/11/2008 | 208.9 | – | – |
| 3/12/2008 | 212.8 | – | – |
| 3/13/2008 | 206 | 205.88 | 206 |
| 3/14/2008 | 206 | 205.88 | 206 |
| 3/17/2008 | 200.95 | 200.94 | 200.95 |
| 3/18/2008 | 202.95 | 203.2 | 202.95 |
| 3/19/2008 | 208.3 | 207.95 | 207.83 |
| 3/20/2008 | 207.35 | 207.95 | 207.83 |
| 3/21/2008 | 211.55 | 212.07 | 212.18 |
| 3/24/2008 | 212.8 | 212.07 | 212.18 |
| 3/25/2008 | 214.9 | 215.3 | 215.25 |
| 3/26/2008 | 216.05 | 215.3 | 215.25 |
| 3/27/2008 | 214.8 | 215.3 | 215.25 |
| 3/28/2008 | 219.1 | 218.59 | 218.60 |
| 3/31/2008 | 217.6 | 218.59 | 218.60 |
| 4/1/2008 | 219.1 | 218.59 | 218.60 |
| 4/2/2008 | 225.85 | 225.38 | 225.38 |
| 4/3/2008 | 228 | 227.48 | 228.54 |
| 4/4/2008 | 229.45 | 229.99 | 229.70 |
| 4/7/2008 | 229.75 | 229.99 | 229.70 |
| 4/8/2008 | 227.15 | 227.48 | 227.22 |
| 4/10/2008 | 228.4 | 228.73 | 228.54 |
| 4/11/2008 | 229.9 | 229.99 | 229.70 |
| 4/14/2008 | 225.7 | 225.38 | 225.38 |
| 4/15/2008 | 224.6 | 225.38 | 225.38 |
| 4/16/2008 | 227.3 | 227.48 | 227.22 |
| 4/17/2008 | 228.85 | 228.73 | 228.54 |
| 4/18/2008 | 228.9 | 228.73 | 228.54 |
| 4/21/2008 | 232.5 | 232.59 | 232.52 |
| 4/22/2008 | 230.9 | 229.99 | 230.90 |
| 4/23/2008 | 232.35 | 232.59 | 232.52 |
| 4/24/2008 | 232.7 | 232.59 | 232.52 |
| 4/25/2008 | 236.55 | 237.38 | 236.65 |
| 4/28/2008 | 236.4 | 237.38 | 236.65 |
| 4/29/2008 | 234.75 | 234.51 | 234.75 |
| 4/30/2008 | 237 | 237.38 | 236.65 |
| MSE | | 0.25 | 0.16 |
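The MSE figures in the last row of Table 9 can be checked directly from the tabulated values. The sketch below assumes the usual definition, the mean of squared differences between actual and forecasted values, taken over the 34 trading days (3/13/2008 to 4/30/2008) for which eighth-order forecasts exist. All numbers are transcribed from Table 9; the function name `mse` is ours.

```python
# Actual futures prices for 3/13/2008 to 4/30/2008 (Table 9).
actual = [206, 206, 200.95, 202.95, 208.3, 207.35, 211.55, 212.8, 214.9,
          216.05, 214.8, 219.1, 217.6, 219.1, 225.85, 228, 229.45, 229.75,
          227.15, 228.4, 229.9, 225.7, 224.6, 227.3, 228.85, 228.9, 232.5,
          230.9, 232.35, 232.7, 236.55, 236.4, 234.75, 237]

# GA based eighth-order forecasts for the same days (Table 9).
ga = [205.88, 205.88, 200.94, 203.2, 207.95, 207.95, 212.07, 212.07, 215.3,
      215.3, 215.3, 218.59, 218.59, 218.59, 225.38, 227.48, 229.99, 229.99,
      227.48, 228.73, 229.99, 225.38, 225.38, 227.48, 228.73, 228.73, 232.59,
      229.99, 232.59, 232.59, 237.38, 237.38, 234.51, 237.38]

# PSO based eighth-order forecasts for the same days (Table 9).
pso = [206, 206, 200.95, 202.95, 207.83, 207.83, 212.18, 212.18, 215.25,
       215.25, 215.25, 218.60, 218.60, 218.60, 225.38, 228.54, 229.70, 229.70,
       227.22, 228.54, 229.70, 225.38, 225.38, 227.22, 228.54, 228.54, 232.52,
       230.90, 232.52, 232.52, 236.65, 236.65, 234.75, 236.65]


def mse(actual_values, forecasts):
    """Mean of squared forecast errors over paired observations."""
    errors = [(a - f) ** 2 for a, f in zip(actual_values, forecasts)]
    return sum(errors) / len(errors)


print(f"GA  MSE: {mse(actual, ga):.2f}")
print(f"PSO MSE: {mse(actual, pso):.2f}")
```

Under this definition the computed values round to the 0.25 and 0.16 reported in the table.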

For reference, the initial parameters of the PSO and GA algorithms are listed in Table 10.

Table 10 Parameters used for PSO and GA based forecasting.

| PSO parameters | GA parameters |
|---|---|
| Swarm size: 30 | Population size: 30 |
| Maximum number of iterations: 500 | Maximum number of iterations: 500 |
| $c_1 = 0.2$, $c_2 = 2$ | Type of selection: grade selection |
| $w_{\min} = 0.4$, $w_{\max} = 0.9$ | Type of crossover: simple crossover [0.8] |
| | Type of mutation: dynamic mutation [0.05] |

Universe of discourse: main-factor $[U_{\min} = 185,\ U_{\max} = 240]$, second-factor $[V_{\min} = 185,\ V_{\max} = 240]$
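To show where the PSO parameters of Table 10 enter the algorithm, the following is a minimal sketch of one particle update in the standard PSO formulation of Kennedy and Eberhart. The swarm size, iteration limit, c1, c2, inertia bounds and universe bounds are copied from Table 10 as printed. The linearly decreasing inertia schedule, the clamping of positions to the universe of discourse, and all function names are our assumptions and are not confirmed by this part of the paper.

```python
import random

# Values as printed in Table 10.
SWARM_SIZE = 30
MAX_ITER = 500
C1, C2 = 0.2, 2.0
W_MIN, W_MAX = 0.4, 0.9
U_MIN, U_MAX = 185.0, 240.0


def inertia_weight(iteration, max_iter=MAX_ITER, w_min=W_MIN, w_max=W_MAX):
    """Assumed schedule: linear decrease from w_max to w_min."""
    return w_max - (w_max - w_min) * iteration / max_iter


def update_particle(position, velocity, personal_best, global_best, iteration, rng):
    """One standard PSO step for a particle given as a list of cut points."""
    w = inertia_weight(iteration)
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        v_next = w * v + C1 * r1 * (p - x) + C2 * r2 * (g - x)
        # Assumption: keep each cut point inside the universe of discourse.
        x_next = min(max(x + v_next, U_MIN), U_MAX)
        new_velocity.append(v_next)
        new_position.append(x_next)
    return new_position, new_velocity


rng = random.Random(0)
position = [200.0, 215.0, 230.0]
velocity = [0.0, 0.0, 0.0]
new_position, new_velocity = update_particle(
    position, velocity, position, [201.0, 212.0, 229.0], 0, rng
)
print(new_position, new_velocity)
```

A full optimizer would repeat this step for all 30 particles over up to 500 iterations, scoring each particle by the forecasting error of the partition it encodes.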

of the two factors; adding further factors may improve the forecasting accuracy if they are correlated with the stock price. Multi-factor forecasting based on the described scheme is therefore under consideration for further study.

References

Chen, S. M. (1996). Forecasting enrollments based on fuzzy time series. Fuzzy Sets and Systems, 81(3), 311–319.
Chen, S. M. (2002). Forecasting enrollments based on high-order fuzzy time series. Cybernetics and Systems, 33(1), 1–16.
Emad, E., Tarek, H., & Donald, G. (2005). Comparison among five evolutionary-based optimization algorithms. Advanced Engineering Informatics, 19, 43–53.
Gaing, Z. L. (2004). A particle swarm optimization approach for optimum design of PID controller in AVR system. IEEE Transactions on Energy Conversion, 19(2), 384–391.
Gen, M., & Cheng, R. (1997). Genetic algorithms and engineering design. New York: John Wiley & Sons.
Goldberg, D. E. (1998). Genetic algorithms in search optimization and machine learning. Massachusetts: Addison-Wesley.
Goldberg, D. E., Korb, B., & Deb, K. (1989). Messy genetic algorithms: Motivation, analysis, and first results. Complex Systems, 3(5), 493–530.
Holland, J. H. (1975). Adaptation in natural and artificial systems. Massachusetts: MIT Press.
Huarng, K. (2001a). Effective lengths of intervals to improve forecasting in fuzzy time series. Fuzzy Sets and Systems, 123(3), 387–394.
Huarng, K. (2001b). Heuristic models of fuzzy time series for forecasting. Fuzzy Sets and Systems, 123(3), 369–386.
Kennedy, J., & Eberhart, R. C. (1995). Particle swarm optimization. In Proceedings of IEEE international conference on neural networks (Vol. 4, pp. 1942–1948).
Kirkpatrick, S., Gelatt, C. D., Jr., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671–680.
Lee, L. W., & Chen, S. M. (2004). Temperature prediction using genetic algorithms and fuzzy time series. In Proceedings of the 2004 international conference on information managements, Miaoli, Taiwan, Republic of China (pp. 299–306).
Lee, L. W., Wang, L. H., Chen, S. M., & Leu, Y. H. (2004). A new method for handling forecasting problems based on two-factors high-order fuzzy time series. In Proceedings of the 2004 ninth conference on artificial intelligence and applications, Taipei, Taiwan, Republic of China.
Lee, L. W., Wang, L. H., & Chen, S. M. (2008). Temperature prediction and TAIFEX forecasting based on high-order fuzzy logical relationships and genetic simulated annealing techniques. Expert Systems with Applications, 34, 328–336.
Lee, L. W., Wang, L. H., Chen, S. M., & Leu, Y. H. (2006). Handling forecasting problems based on two-factors high-order fuzzy time series. IEEE Transactions on Fuzzy Systems, 14(3), 468–477.
Lin, C. H., & Chen, S. M. (2004). A new method for multiple DNA sequence alignment based on genetic simulated annealing algorithms. In Proceedings of the 2004 international conference on information management, Miaoli, Taiwan, Republic of China.
Panda, S., & Padhy, N. P. (2007). Comparison of particle swarm optimization and genetic algorithm for FACTS-based controller design. Applied Soft Computing, ASOC-417, 1–10.
Song, Q., & Chissom, B. S. (1993a). Fuzzy time series and its models. Fuzzy Sets and Systems, 54(3), 269–277.
Song, Q., & Chissom, B. S. (1993b). Forecasting enrollments with fuzzy time series – Part I. Fuzzy Sets and Systems, 54(1), 1–9.
Song, Q., & Chissom, B. S. (1994a). Some properties of defuzzification neural networks. Fuzzy Sets and Systems, 61(1), 83–89.
Song, Q., & Chissom, B. S. (1994b). Forecasting enrollments with fuzzy time series – Part II. Fuzzy Sets and Systems, 62(1), 1–8.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–353.