Available online at www.sciencedirect.com
Mathematics and Computers in Simulation 79 (2008) 539–554
A computational method of forecasting based on fuzzy time series
S.R. Singh ∗
Department of Mathematics, Banaras Hindu University, Varanasi 221005, India
Received 6 February 2008; accepted 24 February 2008
Available online 7 March 2008
Abstract
In this paper, a computational method of forecasting based on fuzzy time series is developed to provide improved forecasts in situations of high uncertainty, where the time series shows large fluctuations between consecutive years' values and no visible trend or periodicity. The proposed model is of order three and uses a time variant difference parameter on the current state to forecast the next state. The model has been tested on the historical student enrollments of the University of Alabama for comparison with existing methods, and has been applied to forecasting the production of the lahi crop, a crop production system with high uncertainty. The suitability of the developed model has been examined in comparison with other models to show its superiority.
© 2008 IMACS. Published by Elsevier B.V. All rights reserved.
Keywords: Fuzzy time series; Time invariant; Time variant; Linguistic variables; Fuzzy logical relations
1. Introduction
Time series forecasting investigates the relations in a sequential set of past data measured over time in order to forecast future values. The area has been widely studied, and traditional forecasting is frequently carried out with statistical tools such as regression analysis, moving averages, integrated moving averages and autoregressive integrated moving averages. One major limitation of these methods is that they do not address forecasting problems in which the historical data are given in linguistic terms. Fuzzy set theory and fuzzy logic, introduced by Zadeh [28,29], provide a general method for handling uncertainty and vagueness in information available in linguistic terms. Song and Chissom [18–20] used Zadeh's fuzzy set theory to develop models for fuzzy time series forecasting and considered the problem of forecasting enrollments on the time series data of the University of Alabama. Chen [1] presented a method of fuzzy time series forecasting of enrollments using simplified arithmetic operations. The major problem in fuzzy time series forecasting is the accuracy of the forecast. Many researchers, including Sullivan and Woodall [22], Kim and Lee [9], Hwang et al. [8], Chen and Hwang [3] and Huarng [6], have worked on various models of fuzzy time series forecasting to improve the forecast.
∗ Tel.: +91 5422307435. E-mail address: singh [email protected].
0378-4754/$32.00 © 2008 IMACS. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.matcom.2008.02.026
S.R. Singh / Mathematics and Computers in Simulation 79 (2008) 539–554
Chen [2] considered the forecasting of enrollments with high-order fuzzy time series models. Song [21] used an average autocorrelation function as a measure of the dependency between fuzzy data for selecting a suitable order of the fuzzy time series model. Lee and Chou [10] explored improvements in fuzzy time series forecasting by redefining the universe of discourse and then partitioning it into intervals that serve as supports of the fuzzy numbers representing the linguistic values of the linguistic variables. Li and Chen [12] considered recursive, level-by-level partitioning of the universe of discourse to improve the enrollment forecast. Chen and Hsu [4] proposed a method to forecast enrollments by dividing the universe of discourse into intervals, re-dividing these intervals according to the frequency of their occurrence, and framing rules for improved forecasting. Yu [26,27] presented a refined fuzzy time series model and a weighted fuzzy time series model for TAIEX forecasting, redefining the length of intervals to improve the forecasted values. Own and Yu [14] presented a heuristic higher-order model and applied it to forecast TAIFEX. Tsaur et al. [24] used the concept of entropy to measure the degree of fuzziness of a system and to determine a time of steady state to improve the enrollments forecast. Further, Huarng and Yu [7] applied ratio-based lengths of intervals to improve the enrollments forecast. Cheng et al. [5] studied fuzzy time series forecasting of enrollments by two approaches: the minimize entropy principle approach (MEPA) and the trapezoid fuzzification approach (TFA). Singh [17] presented a simple method of forecasting based on the fuzzy time series concept using a difference operator and tested it on forecasting of enrollments and wheat crop production.

Fuzzy time series is also being studied using clustering techniques. Ozava et al. [15] studied the application of fuzzy autoregression to time series data. Singh [16] studied the application of fuzzy pattern matching in financial time series forecasting. Moller et al. [13] considered a fuzzy random process for modeling time series with fuzzy data in a case of heavy goods vehicle traffic data. Lee et al. [11] considered a fuzzy candlestick pattern with fuzzy linguistic variables of fuzzy time series for financial forecasting and also studied the case of enrollments forecasting. Further, Wang and Chen [25] presented a method for temperature and TAIFEX forecasting based on automatic clustering techniques and two-factor high-order fuzzy time series.

The motivation for the present work is to develop a computational method of forecasting suitable for general application in agricultural production forecasting, to support the development of crop simulation models for tactical and forecasting applications despite the limited availability of weather, soil and crop management information. In tactical applications, crop models are run prior to the growing season to help the farming community, producers and decision makers. In forecasting applications, the main interest is in the final expected yield for planning the crop in the season. Apart from crop producers, it is also important for local companies to have an optimal plan for their required input of raw material. The application study of the developed model has been made on an agricultural production system, in which the crop yield is uncertain even when all the standard cropping practices are adopted. Here, we consider the time series data of the lahi (mustard) crop, which is very sensitive to agro-climatic conditions, pests and diseases, the uncertain and uncontrolled parameters affecting crop production.
The objective of the present study is to develop a forecasting model based on fuzzy time series that can cope with situations of high uncertainty, where consecutive values fluctuate widely. The study comprises model development, testing on the enrollments forecast to examine its suitability relative to the other available models, and implementation in forecasting agricultural crop (lahi) production. The application of the developed model has been studied on the historical time series data of lahi production of the Pantnagar farm, G.B. Pant University of Agriculture and Technology, Pantnagar, India. Here, the lahi production is recorded in terms of productivity in kilograms per hectare.
1.1. Basics of fuzzy time series
To make the exposition self-contained, the definitions and properties of fuzzy time series forecasting found in [1–29] are summarized below.
Definition 1. A fuzzy set is a class of objects with a continuum of grades of membership. Let $U$ be the universe of discourse with $U = \{u_1, u_2, u_3, \ldots, u_n\}$, where the $u_i$ are possible linguistic values of $U$. Then a fuzzy set of linguistic variables $A_i$ of $U$ is defined by

$$A_i = \frac{\mu_{A_i}(u_1)}{u_1} + \frac{\mu_{A_i}(u_2)}{u_2} + \frac{\mu_{A_i}(u_3)}{u_3} + \cdots + \frac{\mu_{A_i}(u_n)}{u_n} \tag{1}$$
Here $\mu_{A_i}$ is the membership function of the fuzzy set $A_i$, such that $\mu_{A_i}: U \to [0, 1]$. If $u_j$ is a member of $A_i$, then $\mu_{A_i}(u_j)$ is the degree of belonging of $u_j$ to $A_i$.

Definition 2. Let $Y(t)$ $(t = \ldots, 0, 1, 2, 3, \ldots)$, a subset of $\mathbb{R}$, be the universe of discourse on which fuzzy sets $f_i(t)$ $(i = 1, 2, 3, \ldots)$ are defined, and let $F(t)$ be the collection of the $f_i(t)$. Then $F(t)$ is called a fuzzy time series on $Y(t)$.

Definition 3. Suppose $F(t)$ is caused only by $F(t-1)$, denoted by $F(t-1) \to F(t)$. Then there is a fuzzy relationship between $F(t)$ and $F(t-1)$, which can be expressed as the fuzzy relational equation

$$F(t) = F(t-1) \circ R(t, t-1) \tag{2}$$
Here "$\circ$" is the max–min composition operator. The relation $R$ is called the first-order model of $F(t)$. Further, if the fuzzy relation $R(t, t-1)$ of $F(t)$ is independent of time $t$, that is, if $R(t_1, t_1 - 1) = R(t_2, t_2 - 1)$ for different times $t_1$ and $t_2$, then $F(t)$ is called a time invariant fuzzy time series.

Definition 4. If $F(t)$ is caused by more fuzzy sets, $F(t-n), F(t-n+1), \ldots, F(t-1)$, the fuzzy relationship is represented by $A_{i1}, A_{i2}, \ldots, A_{in} \to A_j$, where $F(t-n) = A_{i1}$, $F(t-n+1) = A_{i2}$, $\ldots$, $F(t-1) = A_{in}$. This relationship is called the $n$th-order fuzzy time series model.

Definition 5. Suppose $F(t)$ is caused by $F(t-1), F(t-2), \ldots,$ and $F(t-m)$ $(m > 0)$ simultaneously and the relations are time variant. Then $F(t)$ is said to be a time variant fuzzy time series, and the relation can be expressed as the fuzzy relational equation

$$F(t) = F(t-1) \circ R^{w}(t, t-1) \tag{3}$$
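The max–min composition used in Eqs. (2) and (3) can be illustrated with a minimal Python sketch. The membership vector `f_prev` and the relation matrix `relation` below are hypothetical values chosen only for illustration; they are not taken from the paper, and the function name `max_min_compose` is an assumption.

```python
def max_min_compose(f_prev, relation):
    """Max-min composition: result[j] = max over i of min(f_prev[i], relation[i][j])."""
    n_cols = len(relation[0])
    return [
        max(min(f_prev[i], relation[i][j]) for i in range(len(f_prev)))
        for j in range(n_cols)
    ]


# Hypothetical membership grades of F(t-1) over three linguistic values.
f_prev = [0.5, 1.0, 0.5]

# Hypothetical 3x3 fuzzy relation R(t, t-1).
relation = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]

f_next = max_min_compose(f_prev, relation)
print(f_next)
```

Building and composing such relational matrices is the step that the model proposed later in the paper replaces with a single numeric difference parameter.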
In Eq. (3), $w > 1$ is a time (number of years) parameter by which the forecast $F(t)$ is affected. Various complicated computational methods are available for computing the relation $R^{w}(t, t-1)$.

1.2. Time series analysis: statistical vs. fuzzy approach
A series of observations $x(t)$ made sequentially in time constitutes a time series. The statistical analysis of time series is in general carried out in two domains: the time domain and the frequency domain. In the frequency domain, the spectral density function (spectrum) is the usual tool for studying the frequency properties of a time series. In time domain analysis, a time series is represented by the mathematical model

$$Y(t) = f(t) + X(t) \tag{4}$$
where f(t) represents a systematic or ordered part and X(t) represents a random part. The former is also known as the signal component and the latter as the noise component of a time series. The two components cannot be observed separately, and each may involve several parameters. In general, in model (4) the random part is treated as a stochastic process, a random function of time, and the systematic part as a non-random or deterministic function of time. The deterministic part is analyzed in terms of characteristics described as trend, cyclic component and seasonal component. These components are
studied by using functions such as a low-degree polynomial, a short Fourier series, a periodic function or a trigonometric function. The analysis of the random part, a stochastic process $\{X_t, t \in I\}$, depends on the structure of $\{X_t\}$. It may be a purely random process if $E\{X_t\} = m$ is independent of time. The time series is classified as stationary or non-stationary depending on the structure of $\{X_t\}$. A stationary time series is in general handled by a first-order Markov process, a moving average (MA) process, an autoregressive (AR) process or an autoregressive moving average (ARMA) process. Several models are also available for non-stationary time series. The Box–Jenkins forecasting method for non-stationary time series is based on a class of models called the integrated autoregressive moving average (ARIMA) process. Box and Jenkins further gave a general multiplicative seasonal model by extending the ARIMA model to the seasonal integrated autoregressive moving average (SARIMA) process.

Fuzzy time series, as described in Section 1.1, on the other hand deals with forecasting in a fuzzy environment in which the historical time series data are linguistic and contain uncertainty, imprecision and vagueness; the two approaches therefore differ in their philosophy. The concepts of stationary and non-stationary time series in statistical time series analysis correspond here to time invariant and time variant fuzzy time series. Further, fuzzy time series analysis deals with fuzzy logical relations in the time series data rather than with the random and non-random functions of the usual time series analysis. In many real life situations it is hard to extract a trend or a cyclic component from time series observations, and such data can be analyzed using fuzzy time series methods.

1.3. Proposed model
The proposed model is of order three, as F(t + 1) is caused by F(t − 2), F(t − 1) and F(t), and F(t + 1) is computed as

$$F(t+1) = F(t-1) * R(t, t-1, t-2)$$

Here the fuzzy relation R is taken to be a numeric value rather than a fuzzy relational matrix. It is computed as the difference between two differences of consecutive values: that of year n with year n − 1, and that of year n − 1 with year n − 2. The computational procedure for forecasting the value of year n + 1 is presented in the form of a computational algorithm in the next section.

The developed model makes extensive use of a time variant parameter

$$D_i = \big|\,|E_i - E_{i-1}| - |E_{i-1} - E_{i-2}|\,\big|$$

obtained from the differences of the past 3 years' data, as the current state, to forecast the value for the next state. Since the value of this parameter depends on time, the relation is time variant and the model is therefore a time variant fuzzy time series. The complete procedure for computing F(t + 1) with known F(t), F(t − 1) and F(t − 2) is given in the next section.

2. Computational algorithm of proposed method for fuzzy time series forecasting
In this section, we present the stepwise procedure of the proposed method for fuzzy time series forecasting based on historical time series data.

(1) Define the universe of discourse U based on the range of the available historical time series data, by the rule $U = [D_{\min} - D_1, D_{\max} + D_2]$, where $D_1$ and $D_2$ are two proper positive numbers.
(2) Partition the universe of discourse into intervals of equal length: u1, u2, . . ., um. The number of intervals is chosen in accordance with the number of linguistic variables (fuzzy sets) A1, A2, . . ., Am to be considered.
(3) Construct the fuzzy sets Ai in accordance with the intervals of Step 2 and apply the triangular membership rule to each interval in each fuzzy set so constructed.
(4) Fuzzify the historical data and establish the fuzzy logical relationships by the rule: if Ai is the fuzzified production of year n and Aj is the fuzzified production of year n + 1, then the fuzzy logical relation is denoted as Ai → Aj. Here Ai is called the current state and Aj the next state.
(5) Rules for forecasting
Some notations used are defined as
• [*Aj] is the corresponding interval uj for which the membership in Aj is supremum (i.e. 1).
• L[*Aj] is the lower bound of interval uj.
• U[*Aj] is the upper bound of interval uj.
• l[*Aj] is the length of the interval uj whose membership in Aj is supremum (i.e. 1).
• M[*Aj] is the mid value of the interval uj having supremum value in Aj.

For a fuzzy logical relation Ai → Aj:
• Ai is the fuzzified enrollments of year n.
• Aj is the fuzzified enrollments of year n + 1.
• Ei is the actual enrollments of year n.
• Ei−1 is the actual enrollments of year n − 1.
• Ei−2 is the actual enrollments of year n − 2.
• Fj is the crisp forecasted enrollments of year n + 1.
This model of order three uses the historical data of years n − 2, n − 1 and n to frame rules that are applied to the fuzzy logical relation Ai → Aj, where Ai, the current state, is the fuzzified enrollments of year n and Aj, the next state, is the fuzzified enrollments of year n + 1. The proposed forecasting method is stated as a rule for generating the relations between the time series data of years n − 2, n − 1 and n in order to forecast the enrollment of year n + 1.

Computational algorithm: Forecasting enrollments Fj for year n + 1 (i.e. 1974) and onwards
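As a small illustration of the time variant parameter Di = ||Ei − Ei−1| − |Ei−1 − Ei−2|| defined in Section 1.3, the following Python sketch computes it for the first three years of the enrollment data (1971–1973, values from Table 1), which form the current state for the 1974 forecast. The function name `difference_parameter` is an assumption made for this sketch, and the code covers only this parameter, not the forecasting rules themselves.

```python
def difference_parameter(e_i, e_prev, e_prev2):
    """D_i = | |E_i - E_(i-1)| - |E_(i-1) - E_(i-2)| | for three consecutive years."""
    return abs(abs(e_i - e_prev) - abs(e_prev - e_prev2))


# Actual enrollments of the University of Alabama (Table 1).
enrollments = {1971: 13055, 1972: 13563, 1973: 13867}

# Current state for forecasting 1974 uses years n = 1973, n-1 = 1972, n-2 = 1971.
d_1973 = difference_parameter(
    enrollments[1973], enrollments[1972], enrollments[1971]
)
print(d_1973)  # | |304| - |508| | = 204
```

Because the parameter is recomputed from the latest three observations at every step, its value changes from year to year, which is what makes the relation time variant.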
3. Computations of enrollments forecast
The algorithm of the proposed method is implemented on the time series data of enrollments at the University of Alabama, and the stepwise results obtained are as follows.

Step 1: Universe of discourse U = [13000, 20000].
Step 2: Partition of the universe of discourse U into seven intervals (linguistic values):
u1 = [13000, 14000]
u2 = [14000, 15000]
u3 = [15000, 16000]
u4 = [16000, 17000]
u5 = [17000, 18000]
u6 = [18000, 19000]
u7 = [19000, 20000]
Step 3: Define seven fuzzy sets A1, A2, . . ., A7 as linguistic variables on the universe of discourse U. These fuzzy variables are defined as
A1: poor enrollment
A2: below average enrollment
A3: average enrollment
A4: good enrollment
A5: very good enrollment
A6: excellent enrollment
A7: extraordinary enrollment
and the membership grades of these fuzzy sets of linguistic values are defined as
A1 = 1/u1 + 0.5/u2 + 0/u3 + 0/u4 + 0/u5 + 0/u6 + 0/u7
A2 = 0.5/u1 + 1/u2 + 0.5/u3 + 0/u4 + 0/u5 + 0/u6 + 0/u7
A3 = 0/u1 + 0.5/u2 + 1/u3 + 0.5/u4 + 0/u5 + 0/u6 + 0/u7
A4 = 0/u1 + 0/u2 + 0.5/u3 + 1/u4 + 0.5/u5 + 0/u6 + 0/u7
A5 = 0/u1 + 0/u2 + 0/u3 + 0.5/u4 + 1/u5 + 0.5/u6 + 0/u7
A6 = 0/u1 + 0/u2 + 0/u3 + 0/u4 + 0.5/u5 + 1/u6 + 0.5/u7
A7 = 0/u1 + 0/u2 + 0/u3 + 0/u4 + 0/u5 + 0.5/u6 + 1/u7
Step 4: The historical time series data of enrollments are fuzzified using the triangular membership function to obtain the enrollments in terms of linguistic variables; these are given in Table 1. The fuzzy logical relations are then established.
Step 5: Using the proposed algorithm (rules for forecasting) of Section 2, the computations have been carried out with the proposed model, and the results are given in Table 2 along with the results of other models.

To compare the accuracy of the forecasted values of the proposed model with that of other models, the mean square error (MSE) and the average forecasting error have been computed as

$$\text{mean square error} = \frac{\sum_{i=1}^{n} (\text{actual value}_i - \text{forecasted value}_i)^2}{n}$$

$$\text{forecasting error (\%)} = \frac{|\text{forecasted value} - \text{actual value}|}{\text{actual value}} \times 100$$

$$\text{average forecasting error (\%)} = \frac{\text{sum of forecasting errors}}{\text{number of errors}}$$
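The two error measures can be sketched in Python as follows. The data are the actual enrollments and the proposed-method forecasts for 1974–1992 with seven intervals, copied from Table 2; the function names are assumptions made for this sketch.

```python
def mean_square_error(actual, forecast):
    """Mean of squared differences between actual and forecasted values."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def average_forecasting_error(actual, forecast):
    """Mean of |forecast - actual| / actual * 100 over all forecasted years."""
    errors = [abs(f - a) / a * 100 for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)


# Actual enrollments, 1974-1992 (Table 2).
actual = [14696, 15460, 15311, 15603, 15861, 16807, 16919, 16388, 15433, 15497,
          15145, 15163, 15984, 16859, 18150, 18970, 19328, 19337, 18876]

# Forecasts of the proposed method with seven intervals, 1974-1992 (Table 2).
forecast = [14331, 15489, 15463, 15412, 15559, 16500, 16616, 16516, 15538, 15440,
            15497, 15280, 15351, 16395, 18500, 18376, 19366, 19407, 18604]

mse = mean_square_error(actual, forecast)
avg_error = average_forecasting_error(actual, forecast)
print(round(mse), round(avg_error, 4))
```

Run on these rounded table values, the results should come out close to the MSE of 95306 and the average error of 1.5319 reported for the proposed method in Table 4.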
The enrollments forecasted by the proposed method for different numbers of intervals of the universe of discourse, along with the average error and MSE in each case, are given in Table 3. The accuracy of the proposed model is compared with that of several other forecasting models, on the basis of the MSE and average forecasting error, in Table 4. To obtain a realistic comparison in Table 4, the MSE and average error of the other methods have also been computed from the year 1974 onwards.
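Table 1
Fuzzy historical enrollments

| Year | Actual enrollments | Enrollments in linguistic variables |
|------|--------------------|-------------------------------------|
| 1971 | 13055 | A1 |
| 1972 | 13563 | A1 |
| 1973 | 13867 | A1 |
| 1974 | 14696 | A2 |
| 1975 | 15460 | A3 |
| 1976 | 15311 | A3 |
| 1977 | 15603 | A3 |
| 1978 | 15861 | A3 |
| 1979 | 16807 | A4 |
| 1980 | 16919 | A4 |
| 1981 | 16388 | A4 |
| 1982 | 15433 | A3 |
| 1983 | 15497 | A3 |
| 1984 | 15145 | A3 |
| 1985 | 15163 | A3 |
| 1986 | 15984 | A3 |
| 1987 | 16859 | A4 |
| 1988 | 18150 | A6 |
| 1989 | 18970 | A6 |
| 1990 | 19328 | A7 |
| 1991 | 19337 | A7 |
| 1992 | 18876 | A6 |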
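Steps 1 to 4 for the enrollment data can be sketched in Python as below. The helper names (`partition`, `membership_grades`, `fuzzify`) are assumptions made for this sketch, and so is the boundary convention: a value lying exactly on an interval boundary is assigned to the upper interval, a case the paper does not specify and that does not arise in this data set.

```python
def partition(lower, upper, m):
    """Split [lower, upper] into m intervals of equal length (Step 2)."""
    width = (upper - lower) / m
    return [(lower + k * width, lower + (k + 1) * width) for k in range(m)]


def membership_grades(m):
    """Grades of A_1..A_m over u_1..u_m: 1 on the own interval, 0.5 on neighbours (Step 3)."""
    grades = []
    for i in range(m):
        row = [0.0] * m
        row[i] = 1.0
        if i > 0:
            row[i - 1] = 0.5
        if i < m - 1:
            row[i + 1] = 0.5
        grades.append(row)
    return grades


def fuzzify(value, intervals):
    """Return the 1-based index i of the fuzzy set A_i with maximum membership (Step 4)."""
    for i, (low, high) in enumerate(intervals, start=1):
        if low <= value < high:
            return i
    if value == intervals[-1][1]:  # right end of the universe of discourse
        return len(intervals)
    raise ValueError("value lies outside the universe of discourse")


# Actual enrollments 1971-1992 (Table 1).
enrollments = [13055, 13563, 13867, 14696, 15460, 15311, 15603, 15861, 16807,
               16919, 16388, 15433, 15497, 15145, 15163, 15984, 16859, 18150,
               18970, 19328, 19337, 18876]

intervals = partition(13000, 20000, 7)   # Step 1 and Step 2
grades = membership_grades(7)            # Step 3
labels = ["A%d" % fuzzify(e, intervals) for e in enrollments]  # Step 4
print(labels)
```

With U = [13000, 20000] and seven intervals, the labels produced this way agree with the third column of Table 1.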
Table 2
Forecasted enrollments by different models at a glance, using seven linguistic variables

| Year | Actual enrollments | Proposed method | Singh [17] | Cheng [5] MEPA | Cheng [5] TFA | Tsaur et al. [24] | Lee and Chou [10] |
|------|-------|-------|-------|-------|-------|-------|-------|
| 1971 | 13055 | – | – | – | – | – | – |
| 1972 | 13563 | – | – | 15430 | 14230 | 14000 | 14025 |
| 1973 | 13867 | – | – | 15430 | 14230 | 14000 | 14568 |
| 1974 | 14696 | 14331 | 14286 | 15430 | 14230 | 14000 | 14568 |
| 1975 | 15460 | 15489 | 15361 | 15430 | 15541 | 15500 | 15654 |
| 1976 | 15311 | 15463 | 15468 | 15430 | 15541 | 15500 | 15654 |
| 1977 | 15603 | 15412 | 15512 | 15430 | 15541 | 16000 | 15654 |
| 1978 | 15861 | 15559 | 15582 | 15430 | 16196 | 16000 | 15654 |
| 1979 | 16807 | 16500 | 16500 | 16889 | 16196 | 16000 | 16197 |
| 1980 | 16919 | 16616 | 16361 | 16871 | 16196 | 16500 | 17283 |
| 1981 | 16388 | 16516 | 16362 | 16871 | 17507 | 16500 | 17283 |
| 1982 | 15433 | 15538 | 15735 | 15447 | 16196 | 15500 | 16197 |
| 1983 | 15497 | 15440 | 15446 | 15430 | 15541 | 15500 | 15654 |
| 1984 | 15145 | 15497 | 15498 | 15430 | 15541 | 15500 | 15654 |
| 1985 | 15163 | 15280 | 15306 | 15430 | 15541 | 15500 | 15654 |
| 1986 | 15984 | 15351 | 15442 | 15430 | 15541 | 15500 | 15654 |
| 1987 | 16859 | 16395 | 16558 | 16889 | 16196 | 16500 | 16197 |
| 1988 | 18150 | 18500 | 18500 | 16871 | 17507 | 18500 | 17283 |
| 1989 | 18970 | 18376 | 18475 | 19333 | 18872 | 19000 | 18369 |
| 1990 | 19328 | 19366 | 19382 | 19333 | 18872 | 19000 | 19454 |
| 1991 | 19337 | 19407 | 19487 | 19333 | 18872 | 19000 | 19454 |
| 1992 | 18876 | 18604 | 18744 | 19333 | 18872 | – | – |
Table 3
Interval analysis in forecasting by proposed method (forecasted enrollments by the proposed method for different numbers of intervals)

| Year | Actual enrollments | 5 intervals | 7 intervals | 10 intervals | 14 intervals | 20 intervals |
|------|-------|-------|-------|-------|-------|-------|
| 1971 | 13055 | – | – | – | – | – |
| 1972 | 13563 | – | – | – | – | – |
| 1973 | 13867 | – | – | – | – | – |
| 1974 | 14696 | 14790 | 14331 | 14615 | 14750 | 14527 |
| 1975 | 15460 | 14916 | 15489 | 15472 | 15236 | 15686 |
| 1976 | 15311 | 15432 | 15463 | 15459 | 15400 | 15369 |
| 1977 | 15603 | 15208 | 15412 | 15463 | 15765 | 15569 |
| 1978 | 15861 | 16140 | 15559 | 16024 | 15667 | 15965 |
| 1979 | 16807 | 15935 | 16500 | 16850 | 16750 | 16675 |
| 1980 | 16919 | 16683 | 16616 | 16872 | 16796 | 16975 |
| 1981 | 16388 | 16680 | 16516 | 16116 | 16168 | 16325 |
| 1982 | 15433 | 15260 | 15538 | 15377 | 15191 | 15203 |
| 1983 | 15497 | 15255 | 15440 | 15435 | 15234 | 15578 |
| 1984 | 15145 | 15249 | 15497 | 15488 | 15231 | 15299 |
| 1985 | 15163 | 15141 | 15280 | 15384 | 15194 | 15281 |
| 1986 | 15984 | 16165 | 15351 | 16049 | 15791 | 15903 |
| 1987 | 16859 | 16304 | 16395 | 16819 | 16769 | 17025 |
| 1988 | 18150 | 17900 | 18500 | 18250 | 18250 | 18075 |
| 1989 | 18970 | 19227 | 18376 | 18966 | 18766 | 19079 |
| 1990 | 19328 | 19164 | 19366 | 19668 | 19207 | 19475 |
| 1991 | 19337 | 19325 | 19407 | 19570 | 19277 | 19471 |
| 1992 | 18876 | 19264 | 18604 | 19045 | 18792 | 18707 |
| Average error | | 1.644277 | 1.5319 | 0.80391 | 0.834423 | 0.736901 |
| MSE | | 115860 | 95306 | 28451.63 | 23685.42 | 17702.74 |
Fig. 1. Actual enrollments vs. forecasted enrollments.
Table 4
A comparison of mean square error and average error

| Model | MSE | Average error |
|-------|-----|---------------|
| Proposed method (seven intervals) | 95306 | 1.5319 |
| Singh 2007 | 90977 | 1.5321 |
| Cheng 2006 MEPA | 181755 | 1.7236 |
| Cheng 2006 TFA | 258303 | 2.5436 |
| Tsaur et al. 2005 | 138322 | 1.8508 |
| Lee and Chou 2004 | 240047 | 2.4998 |
| Huarng 2001 | 239483 | 2.4826 |
| Yu 2005 | 1041158 | 4.45589 |
| Huarng 2006 | 174809 | 1.8418 |
The trends in the forecasts of the above-mentioned methods are illustrated in Fig. 1. The comparison of MSE and average error, together with the graphical representation of the forecasted values, indicates the superiority of the proposed model over the other fuzzy time series models.
4. Computation of lahi production forecast
In this section, the proposed method is applied to a real life problem, a dynamical system containing fuzziness such as crop production. In view of its suitability as shown in Section 3, the proposed model is used to forecast lahi production. The historical time series data of lahi production are from the large farm of G.B. Pant University, Pantnagar, India, and are expressed as productivity in kilograms per hectare (Table 5). The proposed method given in Section 2 has been implemented, and the computations are presented stepwise.

Step 1: Universe of discourse U = [400, 1100].
Step 2: The universe of discourse is partitioned into seven intervals of linguistic values:
u1 = [400, 500]
u2 = [500, 600]
u3 = [600, 700]
u4 = [700, 800]
u5 = [800, 900]
u6 = [900, 1000]
u7 = [1000, 1100]
Step 3: Define seven fuzzy sets A1, A2, . . ., A7 having linguistic values on the universe of discourse U. The linguistic values of these fuzzy variables are as follows:
A1: poor production
A2: below average production
A3: average production
A4: good production
A5: very good production
A6: excellent production
A7: extraordinary production
The membership grades of these fuzzy sets of linguistic variables are defined as
A1 = 1/u1 + 0.5/u2 + 0/u3 + 0/u4 + 0/u5 + 0/u6 + 0/u7
A2 = 0.5/u1 + 1/u2 + 0.5/u3 + 0/u4 + 0/u5 + 0/u6 + 0/u7
A3 = 0/u1 + 0.5/u2 + 1/u3 + 0.5/u4 + 0/u5 + 0/u6 + 0/u7
A4 = 0/u1 + 0/u2 + 0.5/u3 + 1/u4 + 0.5/u5 + 0/u6 + 0/u7
A5 = 0/u1 + 0/u2 + 0/u3 + 0.5/u4 + 1/u5 + 0.5/u6 + 0/u7
A6 = 0/u1 + 0/u2 + 0/u3 + 0/u4 + 0.5/u5 + 1/u6 + 0.5/u7
A7 = 0/u1 + 0/u2 + 0/u3 + 0/u4 + 0/u5 + 0.5/u6 + 1/u7
Step 4: The historical time series data are fuzzified with the triangular membership function in order to obtain the fuzzy logical relations. The historical time series data of actual lahi production and the corresponding fuzzified production in linguistic terms are given in Table 6, and the fuzzy logical relations obtained are given in Tables 7 and 8.
Step 5: The forecasted values have been obtained by using the algorithm of Section 2. Further, the lahi production forecast has also been obtained by the method of Chen [1], originally given for enrollments forecasting. The forecasted production of lahi obtained by these methods is given in Table 9. The suitability of the proposed model in forecasting lahi production has been studied on the basis of the MSE and average error of the forecast, in comparison with the other fuzzy time series models, in Table 10.
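Table 5
The historical data of lahi production

| Year | Production (kg/ha) |
|------|--------------------|
| 1981 | 1025 |
| 1982 | 512 |
| 1983 | 1005 |
| 1984 | 852 |
| 1985 | 440 |
| 1986 | 502 |
| 1987 | 775 |
| 1988 | 465 |
| 1989 | 795 |
| 1990 | 970 |
| 1991 | 742 |
| 1992 | 635 |
| 1993 | 994 |
| 1994 | 759 |
| 1995 | 883 |
| 1996 | 599 |
| 1997 | 499 |
| 1998 | 590 |
| 1999 | 911 |
| 2000 | 862 |
| 2001 | 801 |
| 2002 | 1067 |
| 2003 | 917 |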
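Table 6
Lahi production

| Year | Actual production | Production in linguistic variables |
|------|-------------------|------------------------------------|
| 1981 | 1025 | A7 |
| 1982 | 512 | A2 |
| 1983 | 1005 | A7 |
| 1984 | 852 | A5 |
| 1985 | 440 | A1 |
| 1986 | 502 | A2 |
| 1987 | 775 | A4 |
| 1988 | 465 | A1 |
| 1989 | 795 | A4 |
| 1990 | 970 | A6 |
| 1991 | 742 | A4 |
| 1992 | 635 | A3 |
| 1993 | 994 | A6 |
| 1994 | 759 | A4 |
| 1995 | 883 | A5 |
| 1996 | 599 | A2 |
| 1997 | 499 | A1 |
| 1998 | 590 | A2 |
| 1999 | 911 | A6 |
| 2000 | 862 | A5 |
| 2001 | 801 | A5 |
| 2002 | 1067 | A7 |
| 2003 | 917 | A6 |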
Table 7
Fuzzy logical relationships of the historical lahi production

| | | | | |
|---|---|---|---|---|
| A7 → A2 | A2 → A7 | A7 → A5 | A5 → A1 | A1 → A2 |
| A2 → A4 | A4 → A1 | A1 → A4 | A4 → A6 | A6 → A4 |
| A4 → A3 | A3 → A6 | A6 → A4 | A4 → A5 | A5 → A2 |
| A2 → A1 | A1 → A2 | A2 → A6 | A6 → A5 | A5 → A5 |
| A5 → A7 | A7 → A6 | | | |
Table 8
Fuzzy logical relationship groups

| | | | |
|---|---|---|---|
| A1 → A2 | A1 → A4 | | |
| A2 → A1 | A2 → A4 | A2 → A6 | A2 → A7 |
| A3 → A6 | | | |
| A4 → A1 | A4 → A3 | A4 → A5 | A4 → A6 |
| A5 → A1 | A5 → A2 | A5 → A5 | A5 → A7 |
| A6 → A4 | A6 → A5 | | |
| A7 → A2 | A7 → A5 | A7 → A6 | |
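The fuzzy logical relationships and their groups can be derived mechanically from the lahi production data with a short Python sketch. The production values are taken from Table 5; the function `fuzzify` and its integer-division shortcut for U = [400, 1100] with seven intervals of length 100 are assumptions made for this sketch.

```python
# Lahi production (kg/ha), 1981-2003 (Table 5).
production = [1025, 512, 1005, 852, 440, 502, 775, 465, 795, 970, 742, 635,
              994, 759, 883, 599, 499, 590, 911, 862, 801, 1067, 917]


def fuzzify(value, lower=400, width=100, m=7):
    """1-based index of the interval of U = [400, 1100] that contains value."""
    return min(int((value - lower) // width) + 1, m)


# Fuzzified series: index i stands for the fuzzy set A_i.
states = [fuzzify(v) for v in production]

# Fuzzy logical relationships A_i -> A_j between consecutive years.
relations = list(zip(states, states[1:]))

# Group the relationships by current state, dropping repeated next states.
groups = {}
for current, nxt in relations:
    groups.setdefault(current, set()).add(nxt)
groups = {current: sorted(nxts) for current, nxts in sorted(groups.items())}

for current, nxts in groups.items():
    print("A%d -> %s" % (current, ", ".join("A%d" % j for j in nxts)))
```

The 22 relationships obtained this way follow the sequence listed in Table 7, and the grouped output corresponds row by row to Table 8.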
The comparison of MSE in Table 10 shows the superiority of the proposed model over the Singh [17] model, and it is much better than the Chen [1] model, as it provides forecasts of higher accuracy in the case of high uncertainty in the time series data. Table 11 shows that increasing the number of intervals generally gives forecasts of better accuracy. Moreover, if the enrollments forecast of the proposed method with 10 intervals is compared with that of the Lee et al. [11] method with 10 intervals, the MSE of the proposed method is 28451, whereas the MSE of the forecast by Lee et al. [11] is 44686. The accuracy of the forecast of the proposed model can be examined further in Fig. 2.
Fig. 2. Actual lahi production vs. forecasted lahi production.

Table 9
Actual lahi production vs. forecasted lahi production by different methods with seven intervals

| Year | Actual production (kg/ha) | Forecasted by proposed method | Forecasted by Singh method [17] | Forecasted by Chen method [1] |
|------|------|------|------|------|
| 1981 | 1025 | – | – | – |
| 1982 | 512 | – | – | 783.33 |
| 1983 | 1005 | – | – | 800 |
| 1984 | 852 | 850 | 850 | 783.33 |
| 1985 | 440 | 450 | 450 | 725 |
| 1986 | 502 | 541.42 | 559.75 | 650 |
| 1987 | 775 | 750 | 750 | 800 |
| 1988 | 465 | 450 | 450 | 725 |
| 1989 | 795 | 750 | 750 | 650 |
| 1990 | 970 | 950 | 950 | 725 |
| 1991 | 742 | 750 | 750 | 800 |
| 1992 | 635 | 658.33 | 669.5 | 725 |
| 1993 | 994 | 974 | 950 | 950 |
| 1994 | 759 | 746 | 746 | 800 |
| 1995 | 883 | 851.33 | 851.33 | 725 |
| 1996 | 599 | 550 | 550 | 725 |
| 1997 | 499 | 444.5 | 444.5 | 800 |
| 1998 | 590 | 553.92 | 570.5 | 650 |
| 1999 | 911 | 950 | 950 | 800 |
| 2000 | 862 | 858.72 | 850 | 800 |
| 2001 | 801 | 833.33 | 850 | 725 |
| 2002 | 1067 | 1050 | 1050 | 725 |
| 2003 | 917 | 957.25 | 957.25 | 783.33 |
Table 10
Mean square error and average error of lahi production forecast by different methods with seven intervals

| Method | Average error | MSE |
|--------|---------------|-----|
| Proposed method | 3.824774 | 913.6202 |
| Singh method [17] | 4.23012 | 1140.133 |
| Chen method [1] | 21.0841 | 28200.71 |
Table 11
Interval analysis in proposed method for lahi production forecasting (forecasted production by the proposed method for different numbers of intervals)

| Year | Actual production | 5 intervals | 7 intervals | 10 intervals | 14 intervals | 20 intervals |
|------|------|------|------|------|------|------|
| 1981 | 1025 | – | – | – | – | – |
| 1982 | 512 | – | – | – | – | – |
| 1983 | 1005 | – | – | – | – | – |
| 1984 | 852 | 917.5 | 850 | 855 | 875 | 837.5 |
| 1985 | 440 | 491 | 450 | 435 | 425 | 452.5 |
| 1986 | 502 | 485.9722 | 541.42 | 497.6389 | 514.875 | 491.8056 |
| 1987 | 775 | 750 | 750 | 785 | 775 | 767.5 |
| 1988 | 465 | 470 | 450 | 435 | 475 | 452.5 |
| 1989 | 795 | 750 | 750 | 785 | 775 | 802.5 |
| 1990 | 970 | 1030 | 950 | 995 | 975 | 977.5 |
| 1991 | 742 | 782.5 | 750 | 715 | 725 | 732.5 |
| 1992 | 635 | 609.6667 | 658.33 | 656.6667 | 630.5 | 631.75 |
| 1993 | 994 | 1014 | 974 | 996.5 | 986.5 | 977.5 |
| 1994 | 759 | 746 | 746 | 785 | 775 | 754.75 |
| 1995 | 883 | 864.6667 | 851.33 | 853 | 879 | 877.75 |
| 1996 | 599 | 607 | 550 | 562.5 | 562.5 | 592.5 |
| 1997 | 499 | 476 | 444.5 | 512 | 475 | 487.5 |
| 1998 | 590 | 582 | 553.92 | 570.3333 | 583 | 591.75 |
| 1999 | 911 | 890 | 950 | 925 | 925 | 907.5 |
| 2000 | 862 | 891.375 | 858.72 | 860.3889 | 867.0556 | 872.5833 |
| 2001 | 801 | 771.6667 | 833.33 | 798.5556 | 820.8333 | 804.3889 |
| 2002 | 1067 | 1030 | 1050 | 1065 | 1075 | 1082.5 |
| 2003 | 917 | 876 | 957.25 | 925 | 925 | 907.5 |
| Average error | | 3.924092 | 3.824774 | 2.159027 | 1.924074 | 1.238136 |
| MSE | | 1124.081 | 913.6202 | 335.9656 | 239.1218 | 92.89591 |
5. Conclusions
The motivation for applying fuzzy time series to crop production forecasting is to support the development of decision support systems for agricultural production, a real life problem with uncertainty in both known and unknown parameters. Past experience shows that the agricultural production system is a complex process that is hard to model by mathematical formulations: even when all the standard cropping practices are adopted, crop production remains uncertain because of some uncontrolled parameters. Further, since crop production is recorded as field data, the precision of the data is always a matter of concern. The historical crop production time series used in the present study is from a mechanized farm of Pant University where standard agricultural practices are adopted, yet it clearly shows fuzziness in production.

The scope and suitability of fuzzy time series forecasting in crop production, relative to commonly used methods, can be seen in Fig. 3, where the trend of the forecast by the proposed method is compared with forecasts by commonly used statistical methods applicable under an uncertain trend, namely moving averages and polynomial models fitted with higher degrees such as 5 and 6. It is evident from the R² values that these generally used statistical methods are not suitable for forecasting in such a fuzzy environment. Further, both polynomial fitting and moving averages lie far from the actual production curve, whereas the forecast of the proposed model stays close to the actual production.
The present study, the theory and application of fuzzy time series model for short-term agricultural forecasting for local area may help farmers, producers in estimating crop yield for expected financial gain and can provide an advantageous basis to farm administration for better postharvest management and to local industries in planning for their raw material requirement management. Thus, it can be optimally utilized in agri-business management. Further, the proposed forecasting model is a computational method for time series forecasting having the complexity of linear order and hence processing of any huge time series data may not be a matter of concern. It minimizes the complicated computations of fuzzy relational
Fig. 3. Comparison of forecasting methods.
matrices and the search for a suitable defuzzification process, and provides forecasts of better accuracy. The suitability of the developed model can be seen from Tables 4 and 10, which contain the forecasting errors for the enrollment and lahi production cases. It gives improved forecasts in cases of high uncertainty, and hence the developed method is a generalized forecasting method that provides forecasts of better accuracy than the existing models.
Acknowledgement The author is highly thankful to The General Manager, Farms, Govind Ballabh Pant University of Agriculture and Technology, Pantnagar 263145, Udham Singh Nagar, Uttaranchal, India, for providing the valuable time series data of crop (lahi) production.
References
[1] S.M. Chen, Forecasting enrollments based on fuzzy time series, Fuzzy Sets Syst. 81 (1996) 311–319.
[2] S.M. Chen, Forecasting enrollments based on high-order fuzzy time series, Cybernet. Syst.: Int. J. 33 (2002) 1–16.
[3] S.M. Chen, J.R. Hwang, Temperature prediction using fuzzy time series, IEEE Trans. Syst. Man Cybernet. B: Cybernet. 30 (2000) 263–275.
[4] S.M. Chen, C.C. Hsu, A new method to forecast enrollments using fuzzy time series, Int. J. Appl. Sci. Eng. 2 (2004) 234–244.
[5] C.H. Cheng, J.R. Chang, C.A. Yeh, Entropy-based and Trapezoid fuzzification-based fuzzy time series approaches for forecasting IT project cost, Technol. Forecast. Social Change 73 (2006) 524–542.
[6] K. Huarng, Heuristic models of fuzzy time series for forecasting, Fuzzy Sets Syst. 123 (2001) 369–386.
[7] K. Huarng, T.H.K. Yu, Ratio-based lengths of intervals to improve fuzzy time series forecasting, IEEE Trans. SMC B: Cybernet. 36 (2006) 328–340.
[8] J.R. Hwang, S.M. Chen, C.H. Lee, Handling forecasting problems using fuzzy time series, Fuzzy Sets Syst. 100 (1998) 217–228.
[9] I. Kim, S.R. Lee, A fuzzy time series prediction method based on consecutive values, in: IEEE International Fuzzy Systems Conference, Proceedings II, Seoul, Korea, August 22–25, 1999, pp. 703–707.
[10] H.S. Lee, M.T. Chou, Fuzzy forecasting based on fuzzy time series, Int. J. Comput. Math. 81 (2004) 781–789.
[11] C.H.L. Lee, Alan Lin, W.S. Chen, Pattern discovery of fuzzy time series for financial prediction, IEEE Trans. Knowl. Data Eng. 18 (2006) 613–625.
[12] S.T. Li, Y.P. Chen, Natural partitioning-based forecasting model for fuzzy time series, in: FUZZ-IEEE-2004 Hungary, IEEE, 2004, pp. 1355–1359.
[13] B. Moller, M. Beer, U. Reuter, Theoretical basics of fuzzy randomness-application to time series with fuzzy data, Proc. ICOSSAR (2005) 1701–1707.
[14] C.M. Own, P.T. Yu, Forecasting fuzzy time series on a heuristic high-order model, Cybernet. Syst.: Int. J. 36 (2005) 705–717.
[15] K. Ozawa, T. Watanabe, M. Kanke, Fuzzy auto regressive model and its applications, in: First International Conference on Knowledge-based Intelligent Electronic Systems, Adelaide, May 21–23, 1997.
[16] S. Singh, Pattern modeling in time series forecasting, Cybernet. Syst. 31 (2000) 49–65.
[17] S.R. Singh, A simple method of forecasting based on fuzzy time series, Appl. Math. Comput. 186 (2007) 330–339.
[18] Q. Song, B.S. Chissom, Fuzzy time series and its models, Fuzzy Sets Syst. 54 (1993) 269–277.
[19] Q. Song, B.S. Chissom, Forecasting enrollments with fuzzy time series, Part I, Fuzzy Sets Syst. 54 (1993) 1–9.
[20] Q. Song, B.S. Chissom, Forecasting enrollments with fuzzy time series, Part II, Fuzzy Sets Syst. 64 (1994) 1–8.
[21] Q. Song, A note on fuzzy time series model relation with sample autocorrelation functions, Cybernet. Syst.: Int. J. 34 (2003) 93–107.
[22] J. Sullivan, W.H. Woodall, A comparison of fuzzy forecasting and Markov modeling, Fuzzy Sets Syst. 64 (1994) 279–293.
[23] C.C. Tsai, S.J. Wu, Forecasting local region data with fuzzy time series, in: Proceedings of the ISIE 2001, IEEE, Pusan, Korea, 2001, pp. 122–133.
[24] R.C. Tsaur, J.C.O. Yang, H.F. Wang, Fuzzy relation analysis in fuzzy time series model, Comput. Math. Appl. 49 (2005) 539–548.
[25] N.Y. Wang, S.-M. Chen, Temperature prediction and TAIFEX forecasting based on automatic clustering techniques and two factors high order fuzzy time series, Expert Syst. Appl. (2007), doi:10.1016/j.eswa.2007.12.013.
[26] H.K. Yu, A refined fuzzy time series model for forecasting, Physica A 346 (2005) 657–681.
[27] H.K. Yu, Weighted fuzzy time series model for TAIEX forecasting, Physica A 349 (2005) 609–624.
[28] L.A. Zadeh, Fuzzy sets, Inform. Control 8 (1965) 338–353.
[29] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, Part I, Inform. Sci. 8 (1975) 199–249.