Particle swarm intelligence tunning of fuzzy geometric protoforms for price patterns recognition and stock trading


Expert Systems with Applications 40 (2013) 2391–2397


Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

Piotr Ładyżyński a, Przemysław Grzegorzewski a,b,⇑

a Faculty of Mathematics and Computer Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland
b Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01-447 Warsaw, Poland

Article info

Keywords: Consolidation phase; Decision trees; Fuzzy protoform; Fuzzy sets; Genetic optimization; Particle swarm intelligence; Patterns recognition; Time series

Abstract

A novel approach for detecting patterns in price time series is presented. The proposed system for identifying consolidation phases is based on fuzzy geometric protoforms and classification trees. Promising results of the empirical studies indicate that the suggested fuzzy geometric protoforms are very useful for identifying patterns in graphical visualizations of data. Moreover, the architecture of the system enables successful incorporation of genetic optimization, which makes it possible to capture the structure of various data sets and the unstable conditions on financial markets.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

While observing price charts of financial instruments (shares, futures contracts, commodities) one can notice that prices tend to form some unique patterns before extraordinary market events happen. Investors try to identify such patterns to predict significant price movements, changes in market volatility, or the start of a bull or bear market.

Although the problem of identifying crucial characteristics of financial time series is so important, there are not many papers dealing with price pattern recognition. An interesting method for assessing the predictive power of price patterns was proposed by Savin, Weller, and Zvingelis (2007). Wu, Lin, and Lin (2006) present a stock trading method that combines the filter rule and the decision tree technique. Five trend following (TF) algorithms are presented and compared in Fong, Si, and Tai (2012). Most publications try to solve the problem using neural networks (Kara, Boyacioglu, & Baykan, 2011; Kamijo & Tanigawa, 1990; Guo, Liang, & Li, 2007) trained with preprocessed price time series (usually smoothing averages and technical analysis indicators are calculated). Other systems extract patterns with manually established rules comparing some indicators calculated for fixed time windows. Unfortunately, such systems cannot adapt to changing markets.

⇑ Corresponding author at: Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01-447 Warsaw, Poland.
E-mail addresses: [email protected] (P. Ładyżyński), pgrzeg@ibspan.waw.pl (P. Grzegorzewski).
0957-4174/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.eswa.2012.10.066

In this paper we propose a novel approach for detecting patterns in price time series. Our system utilizes a new concept of the so-called fuzzy geometric protoform, which seems to be quite effective for modeling patterns. The paper is organized as follows. In Section 2 we describe typical patterns that appear in technical analysis. In Section 3 we propose a mathematical model of the consolidation phase price pattern. Then in Section 4 we define a fuzzy geometric protoform, which is the core of our system. Two trend indicators are discussed in Section 5. In Section 6 we present the structure of the consolidation phase detection system, while in Section 7 we describe a Particle Swarm Optimization algorithm for intelligent tuning of our system. Finally, in Section 8 we discuss the results of the empirical study.

2. Patterns in technical analysis

Investors usually forecast future price trends based on historical prices and trading volume summarized in the form of charts. In Fig. 1 prices form the so-called price channel, oscillating between two parallel lines called resistance (upper line) and support (lower line), respectively. An upstroke in late November (we say that there is an upstroke when prices break the resistance line) is an indicator of an approaching uptrend. On the other hand, when prices came back to the old resistance line in the first week of December, it was a strong negation of the previous signal. In Fig. 2 prices form a consolidation phase. After the resistance line is broken, prices rise in a significant uptrend. A well-formed consolidation phase is a rather rare market phenomenon. Moreover, a consolidation phase breakup is one of the strongest technical


Fig. 1. A price channel and an upstroke in late November.

Fig. 2. A consolidation phase and an upstroke confirmed by a strong uptrend.

analysis indicators. It provides an opportunity to open a position with a high probability of an above-average return and low risk. The two best-known trading rules connected with the consolidation phase are:

• Upper breakup: an uptrend is expected. We open a long position and place a STOP LOSS order in the middle of the breakup day price candle. If prices follow the uptrend, we move the STOP LOSS up to protect the profit.
• Lower breakup: a downtrend is expected. We open a short position and place a STOP LOSS order in the middle of the breakup day price candle. If prices follow the downtrend, we move the STOP LOSS down to protect the profit.

The support and resistance lines in the consolidation phase price pattern form a semi-rectangle of unknown length.

3. The support and resistance lines

Let $\{X_t\}_{t=1}^{n}$ denote a time series of prices for a given financial instrument (or a part of a longer time series $\{X_t\}_{t \in T}$). To examine patterns that may appear in this time series the following sets will be helpful.

Definition 1. Let $d \in \{1, \ldots, n\}$ denote the length of a subwindow. Then a set $SB_d \subset \mathbb{R}^2$ given by

$$SB_d = \{(t, Y_t) : Y_t = \min\{X_i, X_{i+1}, \ldots, X_{i+d}\},\ i = 1, \ldots, n-d\} \tag{1}$$

is called the d-span support base of the time series $\{X_t\}_{t=1}^{n}$. Similarly, a set $SC_d \subset \mathbb{R}^2$ given by

$$SC_d = \{(t, Y_t) : Y_t = \max\{X_i, X_{i+1}, \ldots, X_{i+d}\},\ i = 1, \ldots, n-d\} \tag{2}$$

is called the d-span resistance base of the time series $\{X_t\}_{t=1}^{n}$.

We also define the truncations of these sets:

Definition 2. Let $t_p, t_0$ denote time moments such that $t_p < t_0$. Then

$$SB_{d|[t_p,t_0]} = \{(t, Y) \in SB_d : t \in [t_p, t_0]\} \tag{3}$$

$$SC_{d|[t_p,t_0]} = \{(t, Y) \in SC_d : t \in [t_p, t_0]\} \tag{4}$$
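The support and resistance bases of Definition 1 are rolling minima and maxima, which can be sketched in a few lines of Python. This is only an illustration: the function name `span_bases` is hypothetical, and dating each value at the last day of its window is an assumption (Definition 1 does not state it explicitly, although the caption of Fig. 3 suggests it).

```python
import numpy as np


def span_bases(prices, d=10):
    """Return time stamps and the d-span support and resistance bases.

    Entry i of each base is the minimum (maximum) of
    prices[i], ..., prices[i + d], so every window holds d + 1 prices
    and there are len(prices) - d windows, as in Definition 1.
    """
    x = np.asarray(prices, dtype=float)
    if not 1 <= d < x.size:
        raise ValueError("d must satisfy 1 <= d < len(prices)")
    windows = np.lib.stride_tricks.sliding_window_view(x, d + 1)
    support_base = windows.min(axis=1)
    resistance_base = windows.max(axis=1)
    # Assumption: each value is dated at the last day of its window.
    t = np.arange(d, x.size)
    return t, support_base, resistance_base
```

Truncating to an interval $[t_p, t_0]$, as in Definition 2, then amounts to a boolean mask on the returned time stamps.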

Further on we utilize the support base and the resistance base to define a price-quadrangle for our time series $\{X_t\}_{t=1}^{n}$. Please note that all concepts discussed above are obtained for a fixed subwindow $d$. First of all, let us define the support line and the resistance line for the given time interval $[t_p, t_0]$.

Definition 3. Let $y = a_b^{(t_p,t_0)} t + b_b^{(t_p,t_0)}$ denote a line obtained using the least squares method for points $SB_{d|[t_p,t_0]} \subset \mathbb{R}^2$. Then

$$B_{d|[t_p,t_0]} = \left\{(t, y) : t_p \le t \le t_0,\ y = a_b^{(t_p,t_0)} t + b_b^{(t_p,t_0)}\right\} \tag{5}$$

is called the d-span support of the time series $\{X_t\}$ for the interval $[t_p, t_0]$. Similarly, let $y = a_c^{(t_p,t_0)} t + b_c^{(t_p,t_0)}$ denote a line obtained using the least squares method for points $SC_{d|[t_p,t_0]} \subset \mathbb{R}^2$. Then

$$C_{d|[t_p,t_0]} = \left\{(t, y) : t_p \le t \le t_0,\ y = a_c^{(t_p,t_0)} t + b_c^{(t_p,t_0)}\right\} \tag{6}$$

is called the d-span resistance of the time series $\{X_t\}$ for the interval $[t_p, t_0]$.

Further on, unless declared otherwise, we call the d-span support base (resistance base) simply the support base (resistance base) and the d-span support (resistance) simply the support (resistance). Empirical studies show that a d-span of 10 days works quite well.

Now we are able to define one of the fundamental concepts of our contribution, called a price-quadrangle.

Definition 4. A price-quadrangle based on the time series $\{X_t\}_{t=1}^{n}$ for the given interval $[t_p, t_0]$ is a set

$$Q_{d|[t_p,t_0]} = \left\{(t, y) : t_p \le t \le t_0,\ B_{d|[t_p,t_0]} \le y \le C_{d|[t_p,t_0]}\right\} \tag{7}$$
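Definitions 3 and 4 can be illustrated with an ordinary least-squares fit. The sketch below is not the authors' implementation; the helper names `fit_line`, `price_quadrangle` and `contains` are hypothetical, and the inputs are assumed to be the truncated base points for one interval $[t_p, t_0]$. The check of $t_p \le t \le t_0$ is left to the caller.

```python
import numpy as np


def fit_line(t, y):
    """Least-squares line y = a * t + b; returns (a, b, sse)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    a, b = np.polyfit(t, y, deg=1)
    residuals = y - (a * t + b)
    return float(a), float(b), float(residuals @ residuals)


def price_quadrangle(t, support_base, resistance_base):
    """Fit the support and resistance lines over the given base points."""
    a_b, b_b, sse_b = fit_line(t, support_base)
    a_c, b_c, sse_c = fit_line(t, resistance_base)
    return {"support": (a_b, b_b), "resistance": (a_c, b_c),
            "sse": (sse_b, sse_c)}


def contains(quadrangle, t, y):
    """True if (t, y) lies between the support and resistance lines."""
    a_b, b_b = quadrangle["support"]
    a_c, b_c = quadrangle["resistance"]
    return a_b * t + b_b <= y <= a_c * t + b_c
```

The two slopes returned under `"support"` and `"resistance"` are the quantities $a_b$ and $a_c$ whose size and difference indicate how close the quadrangle is to a horizontal rectangle.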

Different examples of price-quadrangles are given in Figs. 1–3. In particular, the price-quadrangle in Fig. 1 forms an increasing price channel with parallel support and resistance. In Fig. 2 we have shown the price-quadrangle for the consolidation phase, while in Fig. 3 we find two price-quadrangles: one corresponding to a decreasing price channel and the second illustrating the consolidation phase. These few examples show that the shape of the price-quadrangle may be useful for detecting different patterns in the time series under study. In particular, if the support and resistance are both increasing (decreasing) then we get an increasing (decreasing) price channel, while a price-quadrangle formed by support and resistance parallel to the time axis seems to be characteristic of the consolidation phase. In other words, the desired shape of the price-quadrangle indicating the consolidation phase is a rectangle parallel (perpendicular) to the time (price) axis. Of course, obtaining an ideal rectangle for real data may be impracticable. However, we should expect that the price-quadrangle corresponding to the consolidation phase would be at least semi-rectangular, i.e. the slope of the support $a_b$ and the slope of the resistance $a_c$ should be "nearly" equal and "close" to 0. Thus a natural question arises: how to model such expressions as "nearly", "close", etc.? It seems that fuzzy set theory might be quite useful here.

Besides modeling vague statements one also has to be aware of the random error that often occurs for sample data. Since each price-quadrangle is formed by two straight lines obtained by the least squares method, the corresponding random error will be defined by

$$r_Q = \sqrt{SSE_b^2 + SSE_c^2}, \tag{8}$$

where $SSE_b^2$ and $SSE_c^2$ denote the sum of squared errors for the support and resistance, respectively.

4. Fuzzy geometric protoform

The concept of a protoform (prototypical form), recently strongly advocated by Zadeh (2002), is defined as an abstract prototype. Zadeh points out the relevance of protoforms in the formalization of human-consistent reasoning in various problems related to broadly perceived information technology and knowledge engineering. Kacprzyk and Zadrożny (2005) apply protoforms for generating families of linguistic database summaries and fuzzy queries. In this application protoforms prove to be powerful conceptual tools because they provide a uniform way to formulate and handle different types of linguistic summaries. Below we define an abstract concept of a fuzzy geometric protoform and then we propose how to apply this idea for modeling and validating the consolidation phase in a time series.

Definition 5. A fuzzy geometric protoform (FGP) is an ordered triple $(\mathcal{P}, \mathcal{F}, \mathcal{M})$, where:

• $\mathcal{P}$ is a family of parameters describing the geometric structure of the object under study,
• $\mathcal{F}$ is a (fuzzy) domain for parameters in $\mathcal{P}$,
• $\mathcal{M}$ is a valuation function defined on $\mathcal{F}$.

Here we do not discuss fuzzy geometric protoforms in general but we show that this concept might be useful for consolidation phase detection.

Fig. 3. Dark grey points represent minimum prices from the last ten trading days and light grey points represent maximum prices (visualization of sets $SB_d$ and $SC_d$; see Definition 1).


Definition 6. Let $(X_t)$ be a given price time series. A fuzzy geometric protoform of the consolidation phase is an ordered triple $(\mathcal{P}, \mathcal{F}, \mathcal{M})$ such that

• $\mathcal{P} = \left(a_b^{(t_p,t_0)}, a_c^{(t_p,t_0)}, T\right)$, where $T$ is a time window, $t_p, t_0 \in T$ are certain time moments such that $t_p < t_0$, while $a_b^{(t_p,t_0)}$ and $a_c^{(t_p,t_0)}$ denote the slope of the support and resistance, respectively, calculated for the time interval $[t_p, t_0]$;
• $\mathcal{F} = (A_c, A_{par}, A_T)$, where $A_c$, $A_{par}$ and $A_T$ denote fuzzy subsets of the real line $\mathbb{R}$ which are domains of the successive parameters in $\mathcal{P}$, respectively;
• $\mathcal{M} = \mathcal{S}\left(\mathcal{T}\left(\mu_c\left(a_c^{(t_p,t_0)}\right), \mu_{par}\left(a_c^{(t_p,t_0)} - a_b^{(t_p,t_0)}\right), \mu_T(t_0)\right)\right)$, where $\mathcal{S}$ and $\mathcal{T}$ denote an S-conorm and a T-norm, respectively, while $\mu_c$, $\mu_{par}$ and $\mu_T$ stand for the membership functions of the successive fuzzy sets $A_c$, $A_{par}$, $A_T$ that appear in $\mathcal{F}$.

Please note that the classical S-conorm and T-norm are defined on the unit square $[0, 1]^2$, while the functions considered above take more than two arguments. Hence they are actually an S-multiconorm and a T-multinorm. However, such functions are easily obtained from the classical ones due to the associativity of S-conorms and T-norms.

A natural question arises: how to define the fuzzy domains of the parameters enclosed in $\mathcal{P}$ and the valuation function $\mathcal{M}$? Of course, each particular fuzzy geometric protoform of the consolidation phase should depend on the problem under study and hence all objects which characterize such a fuzzy protoform ought to be strongly consistent with the real-life data.

A fuzzy set $A_c$ corresponds to possible values of the slope of the resistance. Since our intention is to detect the consolidation phase, the most desired values of the slope of the resistance are close to 0, while values far from 0 are unwanted for the consolidation. Thus a natural membership function for $A_c$ is symmetrical about 0. The same is true for $A_b$. However, in our study we have decided to substitute the fuzzy set $A_b$ corresponding to the support by a fuzzy set $A_{par}$ describing parallelism. Instead of checking whether the slopes of the resistance and the support are both close to 0, we may examine only one of these coefficients and then test whether the support and resistance are parallel, i.e. whether the distance between their slopes is zero. Thus the membership function of $A_{par}$ should also be symmetrical about 0. We refer the reader to Fig. 4 for the graphs of $\mu_c$ and $\mu_{par}$.

Finally, we need a valuation function $\mathcal{M}$. Actually we are interested in the value of $\mathcal{M}$ for a fixed time moment $t_0$. Since the price-quadrangle is calculated for a given interval $[t_p, t_0]$ from the time window $T$, $\mathcal{M}$ also strongly depends on $[t_p, t_0]$. Without loss of generality we may assume that $T = \{t_1, \ldots, t_n\}$ such that $t_n = t_0$. In our study (see Section 8 for the description and results) we compare price-quadrangles corresponding to short-term, medium-term and long-term investment horizons. Of course, such statements as "short-term", "medium-term" and "long-term" are fuzzy and therefore they are satisfactorily modeled by fuzzy subsets $A_{T_S}$, $A_{T_M}$, $A_{T_L}$ of the real line $\mathbb{R}$. In our study we have decided to utilize trapezoidal fuzzy numbers, both because of their simplicity and because of a natural interpretation which is easy for users to grasp. Membership functions $\mu_{T_S}$, $\mu_{T_M}$ and $\mu_{T_L}$ of these fuzzy sets are given in Fig. 4. Therefore, we define three valuation functions, each one connected with a different investment horizon. We have decided to apply the max operator as the S-conorm and the product as the T-norm. Moreover, in our opinion the random error connected with the resistance and support should also be taken into consideration. So we propose the following valuation functions

$$M_s = \max_{t_p \in T} \left[\mu_c\left(a_c^{(t_p,t_0)}\right) \cdot \mu_{par}\left(a_c^{(t_p,t_0)} - a_b^{(t_p,t_0)}\right) \cdot \mu_{SSE}\left(r_Q^{(t_p,t_0)}\right) \cdot \mu_s(t_p)\right] \tag{9}$$

where $s \in \{T_S, T_M, T_L\}$ and $\mu_{SSE}$ is a membership function of a fuzzy set $A_{SSE}$ which describes the acceptable values of the error $r_Q^{(t_p,t_0)}$ corresponding to the price-quadrangle calculated for the time interval $[t_p, t_0]$. As before, $\mu_{SSE}$ is symmetrical about 0. Its graph is also shown in Fig. 4. Please note that any valuation function takes values in the unit interval $[0, 1]$ and hence the estimated value of $M_s$ may be perceived as a degree of conviction that we are in the consolidation phase. In particular, high values of the valuation functions $M_s$ indicate that we are already in the consolidation phase.

5. Trend following indicators

Exponential smoothing of the rate of return of a financial instrument can be used as a trend following indicator. Tests for trend may also be useful for consolidation phase detection, since in well-formed consolidations there is no significant trend. Now let us recall two classical tools for handling time series that will be useful in the next section.

Definition 7. The exponential moving average for the time series $\{X_t\}$ is a time series $\{F_t\}$ defined by

$$F_t = \lambda X_t + (1 - \lambda) F_{t-1}, \tag{10}$$

where $F_0 = X_0$.

Fig. 4. Membership functions of fuzzy sets used for the construction of valuation function.
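The valuation function (9) and the exponential moving average (10) can both be sketched compactly in Python. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the membership functions are passed in by the caller because their exact shapes are only given graphically in Fig. 4, and $t_p$ is handed to $\mu_s$ exactly as written in (9), so whether it is a calendar index or a distance in days from $t_0$ is left to the caller.

```python
def trapezoid(x, a, b, c, d):
    """Membership of x in the trapezoidal fuzzy number (a, b, c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


def valuation(candidates, mu_c, mu_par, mu_sse, mu_s):
    """Eq. (9): best product of memberships over candidate start points.

    candidates holds tuples (t_p, a_c, a_b, r_q) already computed for the
    intervals [t_p, t_0]; the four mu_* arguments are membership functions
    supplied by the caller (their shapes are an assumption of this sketch).
    """
    return max(
        mu_c(a_c) * mu_par(a_c - a_b) * mu_sse(r_q) * mu_s(t_p)
        for t_p, a_c, a_b, r_q in candidates
    )


def exponential_moving_average(x, lam):
    """Eq. (10): F_0 = X_0 and F_t = lam * X_t + (1 - lam) * F_{t-1}."""
    smoothed = [float(x[0])]
    for value in x[1:]:
        smoothed.append(lam * value + (1.0 - lam) * smoothed[-1])
    return smoothed
```

Because the max operator plays the role of the S-conorm and the product that of the T-norm, a single candidate interval with a nearly flat, nearly parallel and well-fitted quadrangle is enough to push the valuation towards 1.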

Definition 8. The exponential n-moving average of rate of returns $\hat{m}^{exp,(n)}$ for the price time series $\{X_t\}$ is the exponential n-moving average calculated for the time series $\{m_t\}$, where

$$m_t = \frac{X_t - X_{t-1}}{X_{t-1}} \tag{11}$$

is the rate of return from $X_t$ in the time interval $[t-1, t]$.

6. Consolidation phase detection system

In previous sections we have shown how to compute measures aggregating information that might be useful for detecting patterns in a price time series. In this section we suggest how to construct a decision-rule-based system for detecting the consolidation. It is known that machine learning algorithms can adapt to the problem structure and automatically capture decision rules from data. Random forests, for instance, seem to work quite well for our problem (see Hastie, Tibshirani, & Friedman, 2009). The following steps show how to implement a system for consolidation phase recognition:

1. Prepare a training set by selecting days which belong to the consolidation phase. To every trading day we assign a class $y_i$, such that $y_i = 1$ corresponds to the consolidation phase and $y_i = 0$ indicates the lack of consolidation.
2. For each trading day from the learning set we calculate a vector

$$x_i = \left(M_{T_S}, M_{T_M}, M_{T_L}, m_{15}^{(exp)}, m_{25}^{(exp)}, m_{35}^{(exp)}\right). \tag{12}$$

3. Train the decision tree using the learning set $\{(y_i, x_i)\}$.

7. Particle Swarm Optimization of the system

Although the system proposed in the previous section gives satisfactory results, there is room for further improvement. Note that the time horizons and exponential smoothing parameters in (12) were set with reference to investment practices and expert knowledge. Let us recall that dependencies in the structure of financial markets are not constant in time. Moreover, the properties of various price charts also differ. It therefore seems to be a good idea to incorporate into the system some tools for the automatic adjustment of the system parameters. In this section we present a modification of the Particle Swarm Optimization algorithm for intelligent tuning of our system (see Kennedy & Eberhart, 1995).

Firstly, we extend the definition of the informative vector (12), which consists of three protoforms and three moving averages. We will search for an appropriate number of protoforms and moving averages (at most ten of each) along with the most suitable parameters:

$$x_i = \left(M_{T_1}, M_{T_2}, \ldots, M_{T_{10}}, m_{\alpha_1}^{(exp)}, m_{\alpha_2}^{(exp)}, \ldots, m_{\alpha_{10}}^{(exp)}\right) \tag{13}$$

Let us define a genome of the particle $x_i$ as a quadruple consisting of 70 natural numbers:

$$p = (h, e, H, E), \tag{14}$$

where $h = (h_1, h_2, \ldots, h_{10})$ is a vector defining active protoforms (if $h_i = 1$ then the protoform is active, and if $h_i = 0$ we do not consider its value in the construction of the informative vector), $e = (e_1, e_2, \ldots, e_{10})$ is a vector defining active moving averages, $H = (T_1, \ldots, T_{10})$ is a vector containing information on the trapezoidal fuzzy sets $T_i = (A_i, B_i, C_i, D_i)$ representing the time horizon of each protoform, and $E = (\alpha_1, \ldots, \alpha_{10})$ represents the lengths of the moving averages. For example, using this convention vector (12) might be coded as follows:

$$\begin{aligned}
p(x_i) &= (h, e, H, E) \\
h &= (0, 0, 1, 1, 0, 0, 0, 0, 1, 0) \\
e &= (1, 0, 0, 0, 0, 1, 0, 1, 0, 0) \\
H &= ((0,0,0,0), (0,0,0,0), (6,10,15,19), (15,19,24,29), (0,0,0,0), \\
  &\qquad (0,0,0,0), (0,0,0,0), (0,0,0,0), (24,29,29,44), (0,0,0,0)) \\
E &= (15, 0, 0, 0, 0, 25, 0, 35, 0, 0)
\end{aligned} \tag{15}$$

Secondly, let us define a cost function for our system performance which we try to minimize. Let $(y_i, x_i)_{i=1}^{N}$ be a training set. For each particle defined by (14) do:

• Divide the training set into five subsets, each of about $N/5$ elements.
• Perform a five-fold cross-validation performance test of the system by training random forests on four of the five parts of the data and validating on the last one.
• Define a cost function for the particle:

$$f(p) = 1 - \hat{f} \tag{16}$$

where $\hat{f}$ is the fraction of properly classified points in the cross-validation test.

The original PSO algorithm proposed by Kennedy and Eberhart (1995) needs some modification to work well for our particle genome definition. Let $S$ be the number of particles. Then:

1. For each particle $p_k = (h^k, e^k, H^k, E^k)$, $k = 1, \ldots, S$, do:

• Initialize at random the vectors $h^k = (h^k_1, h^k_2, \ldots, h^k_{10})$ and $e^k = (e^k_1, e^k_2, \ldots, e^k_{10})$ by the random generator defined as $h^k_i \sim X$, $e^k_j \sim X$, with $P(X = 1) = \frac{1}{3}$ and $P(X = 0) = \frac{2}{3}$.
• Let $H_k^{ind} = \{i_1, \ldots, i_{n_k} : h^k_{i_j} = 1,\ j = 1, \ldots, n_k\}$ denote the set of indexes of active protoforms and let $n_k = |H_k^{ind}|$ stand for the number of active protoforms. For each $i_j \in H_k^{ind}$ we define trapezoidal fuzzy sets $T^k_{i_j} = (A^k_{i_j}, B^k_{i_j}, C^k_{i_j}, D^k_{i_j})$ as follows:

$$\begin{aligned}
T^k_{i_1} &= (A^k_{i_1}, B^k_{i_1}, C^k_{i_1}, D^k_{i_1}) = \left(0,\ \lfloor |Y_1| \rfloor,\ \lfloor Y_1 + X_1 \rfloor,\ \lfloor |Y_1| + |X_1| + |Z_1| \rfloor\right) \\
T^k_{i_2} &= (A^k_{i_2}, B^k_{i_2}, C^k_{i_2}, D^k_{i_2}) = \left(C^k_{i_1} - \lfloor |Y_2| \rfloor,\ C^k_{i_1},\ C^k_{i_1} + \lfloor |X_2| \rfloor,\ C^k_{i_1} + \lfloor |X_2| + |Z_2| \rfloor\right) \\
&\ \ \vdots \\
T^k_{i_j} &= (A^k_{i_j}, B^k_{i_j}, C^k_{i_j}, D^k_{i_j}) = \left(C^k_{i_{j-1}} - \lfloor |Y_j| \rfloor,\ C^k_{i_{j-1}},\ C^k_{i_{j-1}} + \lfloor |X_j| \rfloor,\ C^k_{i_{j-1}} + \lfloor |X_j| + |Z_j| \rfloor\right) \\
&\ \ \vdots \\
T^k_{i_{n_k}} &= (A^k_{i_{n_k}}, B^k_{i_{n_k}}, C^k_{i_{n_k}}, D^k_{i_{n_k}}) = \left(C^k_{i_{n_k - 1}} - \lfloor |Y_{n_k}| \rfloor,\ C^k_{i_{n_k - 1}},\ C^k_{i_{n_k - 1}} + \lfloor |X_{n_k}| \rfloor,\ C^k_{i_{n_k - 1}} + \lfloor |X_{n_k}| + |Z_{n_k}| \rfloor\right)
\end{aligned} \tag{17}$$

where $X_i$, $Y_i$ and $Z_i$ denote random variables with normal distributions

$$\begin{aligned}
X_i &\sim N\left(\frac{D}{2 n_k}, \frac{D}{2 n_k}\right), \quad i = 1, \ldots, n_k, \\
Y_i &\sim N\left(\frac{D}{2 n_k}, \frac{D}{2 n_k}\right), \quad i = 1, \ldots, n_k, \\
Z_i &\sim N\left(\frac{D - \sum_{l=0}^{i-1} |X_l| - |Y_1|}{n_k - (i - 1)}, \frac{D - \sum_{l=0}^{i-1} |X_l| - |Y_1|}{n_k - (i - 1)}\right), \quad i = 1, \ldots, n_k
\end{aligned} \tag{18}$$

and where $D$ is the length of the maximal time horizon we consider (in our approach $D = 60$). Note that the fuzzy sets $T^k_{i_1}, \ldots, T^k_{i_{n_k}}$ define a fuzzy partition of the time interval $[0, D]$.

• Let $E_k^{ind} = \{u_1, \ldots, u_{m_k} : e^k_{u_j} = 1,\ j = 1, \ldots, m_k\}$ denote the set of indexes of active moving averages and let $m_k = |E_k^{ind}|$ denote the number of active moving averages. For each $u_j \in E_k^{ind}$ we initialize the exponential moving average length at random according to the discrete uniform distribution $\alpha_{u_j} \sim U[2, 3, \ldots, D]$.
• Initialize the particle's best known position to its initial position: $p_k^{best} \leftarrow p_k$.
• Group particles into swarms $S_1, \ldots, S_{10}$:

$$p_k \in S_j \iff \left|H_k^{ind}\right| = \left|E_k^{ind}\right| = j, \quad j = 1, \ldots, 10 \tag{19}$$

• For each swarm $S_j$, $j = 1, \ldots, 10$, find its best known position $g_j$ satisfying the condition $f(g_j) \le f(p_k)$ for all elements $p_k \in S_j$.
• Let us define a transform which projects the particle genome into a real number search space:

$$Q(p_k) = \left(A^k_{i_1}, B^k_{i_1}, C^k_{i_1}, D^k_{i_1}, \ldots, A^k_{i_{n_k}}, B^k_{i_{n_k}}, C^k_{i_{n_k}}, D^k_{i_{n_k}}, \alpha_{u_1}, \ldots, \alpha_{u_{m_k}}\right), \tag{20}$$

where $i_j \in H_k^{ind}$, $j = 1, \ldots, n_k$ and $u_j \in E_k^{ind}$, $j = 1, \ldots, m_k$. The transform $Q$ produces a vector of real numbers based on the parameters characterizing the trapezoidal fuzzy sets of the active protoforms and the lengths of the active moving averages in the given genome. Note that the number of dimensions of $Q(p_k)$ is equal to $5j$ if and only if particle $p_k$ belongs to swarm $S_j$.
• Initialize the velocity vector of particle $p_k$:

$$v_k \sim U\left[-\left|b_{lo}^{j} - b_{hi}^{j}\right|,\ \left|b_{lo}^{j} - b_{hi}^{j}\right|\right], \tag{21}$$

where $b_{lo}^{j}$ and $b_{hi}^{j}$ are the lower and upper boundaries of the search space in swarm $S_j$ such that $p_k \in S_j$.

2. Until the preassumed fitness condition is reached, repeat:

• For each swarm $S_j$, $j = 1, \ldots, 10$, do:
  → For each particle $p_k \in S_j$ do:
    • For each dimension $d = 1, \ldots, 5j$ do:
      ↪ Pick random numbers $r_p, r_g \sim U[0, 1]$.
      ↪ Update particle velocity:

$$v_k^{(d)} \leftarrow \omega v_k^{(d)} + \varphi_p r_p \left(Q(p_k^{best})^{(d)} - Q(p_k)^{(d)}\right) + \varphi_g r_g \left(Q(g_j)^{(d)} - Q(p_k)^{(d)}\right), \qquad w_k^{(d)} \leftarrow \left\lfloor v_k^{(d)} \right\rfloor \tag{22}$$

      ↪ Update the particle genome by adding the integer velocity vector $w_k$ to the corresponding dimensions (we update only the dimensions of active protoforms and moving averages in the given particle's genome):

$$p_k \leftarrow p_k + w_k \tag{23}$$

    • If $f(p_k) < f(p_k^{best})$ do:
      ↪ Update the particle's best position: $p_k^{best} \leftarrow p_k$.
      ↪ If $f(p_k) < f(g_j)$ update the best swarm position: $g_j \leftarrow p_k$.

$$g = \max(g_1, \ldots, g_{10})$$

3. $g$ is the best solution.

Parameters $S$, $\omega$, $\varphi_p$, $\varphi_g$ are set empirically according to the suggestions given in Pedersen and Chipperfield (2010) and Pedersen (2010).

8. Empirical results

The system for detecting consolidation phases discussed above was trained on KGHM (the largest copper mining company in Europe) quotations from 1-01-1998 to 1-01-2006 (1426 trading days). Then the performance of the system was examined using data from 2-01-2006 to 1-09-2011. During this period 29 consolidation phases occurred, 27 of which were recognized by the system, which gives a detection rate of 93%. It is worth noting that the implementation of the PSO algorithm for intelligent tuning of our system resulted in a significant improvement. Experts classified 386 days as belonging to consolidations (out of 1426 days overall in the test period). The accuracy of the system (in terms of detected days) without PSO was equal to 0.84, while after incorporating PSO tuning it rose to 0.89.
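Returning to step 2 of the tuning procedure in Section 7, the per-dimension velocity and genome update, together with the cost function (16), can be sketched as below. This is a minimal illustration rather than the authors' implementation: the names `pso_step` and `cost` are hypothetical, the vectors are assumed to be the images $Q(\cdot)$ of the genomes restricted to the active dimensions, and taking the integer step as the floor of the velocity is our reading of the update rule.

```python
import math


def pso_step(position, velocity, personal_best, swarm_best,
             omega, phi_p, phi_g, rng):
    """Apply one velocity and integer position update to one particle.

    All four vectors are equal-length lists. rng needs a random() method
    returning floats in [0, 1), e.g. random.Random(seed).
    """
    new_position, new_velocity = [], []
    for x, v, best, g in zip(position, velocity, personal_best, swarm_best):
        r_p, r_g = rng.random(), rng.random()
        v = omega * v + phi_p * r_p * (best - x) + phi_g * r_g * (g - x)
        new_velocity.append(v)
        # Assumption: the integer step w is the floor of the velocity.
        new_position.append(x + math.floor(v))
    return new_position, new_velocity


def cost(accuracy):
    """Cost of a particle: one minus the cross-validated accuracy."""
    return 1.0 - accuracy
```

With the reported accuracies, `cost` would move from about 0.16 without tuning to about 0.11 with PSO tuning, which is the quantity each swarm tries to drive down.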

Fig. 5. Sample consolidations extracted by the system. Thick vertical lines mark days selected by the system as consolidation days. The technical consolidation is marked by the black rectangle. 16th February was omitted in the classification.

Fig. 6. Sample consolidations extracted by the system.

Dark grey thick lines in Figs. 5 and 6 mark days classified as the consolidation phase. Looking at Fig. 5 one may conclude that the system has found the consolidation phase lasting from the second week of September to the first week of October. A technical analyst would say that this consolidation phase begins in the third week of August. This is true, but we should remember that when a consolidation starts we are unable to verify its existence until it forms a certain pattern. The existence of the consolidation phase itself is not precisely defined. When constructing the learning set we assume that the consolidation phase starts when prices rebound from one of the boundary lines (support, resistance) for the second time without crossing it. Empirical results show that the system is able to adapt to this assumption (see Figs. 5 and 6).

Despite its promising performance, the system has a weak point. In some cases the detected consolidation phases were not coherent, which means that inside such a consolidation phase single days or small groups of days were classified as non-consolidation. PSO tuning of the system reduces this problem but does not eliminate it, as we can see in Fig. 6.

9. Conclusions

In this paper we presented a novel system for detecting patterns in price time series. The system utilizes the so-called fuzzy geometric protoforms, also introduced in the paper. Promising results of the empirical study indicate that the concept of a fuzzy geometric protoform is a useful tool for identifying patterns in graphical visualizations of data. Moreover, the architecture of the system enables successful incorporation of genetic optimization, which makes it possible to capture the structure of various data sets and the changing conditions of financial markets.

Although the performance of the system is quite satisfactory, some further improvements would be desirable. Firstly, we should try to upgrade the classifier using a boosting technique. Secondly, the problem of single misclassified days inside the consolidation phase needs further investigation.

References

Fong, S., Si, Y. W., & Tai, J. (2012). Trend following algorithms in automated derivatives market trading. Expert Systems with Applications, 39, 11378–11390.
Guo, X., Liang, X., & Li, X. (2007). A stock pattern recognition algorithm based on neural networks. In Natural Computation, ICNC 2007.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning. Springer-Verlag.
Kacprzyk, J., & Zadrożny, S. (2005). Linguistic database summaries and their protoforms: Towards natural language based knowledge discovery tools. Information Sciences, 173, 281–304.
Kara, Y., Boyacioglu, M. A., & Baykan, O. K. (2011). Predicting direction of stock price index movement using artificial neural networks and support vector machines: The sample of the Istanbul stock exchange. Expert Systems with Applications, 38, 5311–5319.
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia.
Kamijo, K., & Tanigawa, T. (1990). Stock price pattern recognition – A recurrent neural network approach. In International Joint Conference Neural Networks, 1990 IJCNN.
Pedersen, M. E. H., & Chipperfield, A. J. (2010). Simplifying particle swarm optimization. Applied Soft Computing, 10, 618–628.
Pedersen, M. E. H. (2010). Good parameters for particle swarm optimization. Technical Report HL1001, Hvass Laboratories.
Savin, N. E., Weller, P. A., & Zvingelis, J. (2007). The predictive power of head-and-shoulders price patterns in the U.S. stock market. Journal of Financial Econometrics, 5, 243–265.
Wu, M. C., Lin, S. Y., & Lin, C. H. (2006). An effective application of decision tree to stock trading. Expert Systems with Applications, 31, 270–274.
Zadeh, L. (2002). A prototype-centered approach to adding deduction capabilities to search engines – The concept of a protoform. BISC Seminar, University of California, Berkeley.