A new approach for crude oil price analysis based on Empirical Mode Decomposition


Available online at www.sciencedirect.com

Energy Economics 30 (2008) 905 – 918 www.elsevier.com/locate/eneco

Xun Zhang a,b, K.K. Lai c, Shou-Yang Wang a,b,⁎

a Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China
b School of Management, Graduate University of Chinese Academy of Sciences, Beijing 100080, China
c Department of Management Sciences, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong

Received 22 September 2006; received in revised form 26 February 2007; accepted 26 February 2007
Available online 4 May 2007

Abstract

The underlying characteristics of international crude oil price movements attract much attention from academic researchers and business practitioners. Due to the intrinsic complexity of the oil market, however, most efforts fail to produce consistently good results. Empirical Mode Decomposition (EMD), recently proposed by Huang et al., is a novel data analysis method for nonlinear and non-stationary time series. By decomposing a time series into a small number of independent intrinsic modes with concrete implications, based on scale separation, EMD explains the generation of time series data from a novel perspective. Ensemble EMD (EEMD) is a substantial improvement of EMD which separates the scales more naturally by adding white noise series to the original time series and then treating the ensemble averages as the true intrinsic modes. In this paper, we extend EEMD to crude oil price analysis. First, three crude oil price series with different time ranges and frequencies are decomposed into several independent intrinsic modes, from high to low frequency. Second, the intrinsic modes are composed into a fluctuating process, a slowly varying part and a trend, based on fine-to-coarse reconstruction. The economic meanings of the three components are identified as short term fluctuations caused by normal supply-demand disequilibrium or some other market activities, the effect of a shock of a significant event, and a long term trend. Finally, the EEMD is shown to be a vital technique for crude oil price analysis.

© 2007 Elsevier B.V. All rights reserved.

JEL classification: C49; Q49
Keywords: Empirical Mode Decomposition; Crude oil price; Forecasting; Composition; Volatility

⁎ Corresponding author. Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China. E-mail address: [email protected] (S.-Y. Wang). 0140-9883/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.eneco.2007.02.012


X. Zhang et al. / Energy Economics 30 (2008) 905–918

1. Introduction

Oil is one of the most important energy resources in the world and is known for wide price swings. It has significant effects on global economic activity. High oil prices often lead to an increase in inflation and subsequently hurt the economies of oil-importing countries. Low oil prices, on the other hand, may result in economic recession and political instability in oil-exporting countries, since their economic development can be retarded. Besides the price level, economic losses are also driven by the volatility of the oil price. A relatively small increase in price can result in sizeable losses. Studies show that a 10% increase in the price of oil is equivalent to 0.6 to 2.5% GDP growth for the US (Sauter and Awerbuch, 2003; Rotemberg and Woodford, 1996).

There have been abundant studies on the analysis and forecasting of crude oil price. The approaches can be grouped into two categories: structure models and data-driven methods. Standard structure models outline the world oil market and then analyze oil price volatility in terms of a supply-demand equilibrium schedule (e.g. Bacon, 1991; Al Faris, 1991; Huntington, 1994; Mehrzad, 2004; Yang et al., 2002). Data-driven models include linear models such as Autoregressive Moving Average (ARMA) and Autoregressive Conditional Heteroscedasticity (ARCH) type models (e.g. Sadorsky, 2002; Morana, 2001), and nonlinear models such as Artificial Neural Networks (e.g. Mirmirani and Li, 2004; Moshiri, 2004; Nelson et al., 1994; Yu et al., 2006) and Support Vector Regression (Xie et al., 2006). A number of other references on this topic exist (e.g. Abosedra and Baghestani, 2004; Abramson and Finizza, 1991; Abramson and Finizza, 1995; Chaudhuri, 2001; Hagen, 1994; Maurice, 1994; Nelson et al., 1994; Pindyck, 1999; Stevens, 1995; Wang et al., 2005; Watkins and Plourde, 1994).

The structure models can help understand the mechanisms of oil price determination and quantify each factor's impact on the oil price. However, this approach has proved to be difficult due to some specific characteristics of the crude oil market. For example, supply is hard to model because oil is supplied both by a set of independent producers (non-OPEC nations) that act as price takers and by an organization (OPEC) that uses myriad factors, besides installed capacity, to determine levels of production (Déesa et al., 2007). In addition, the dynamic and unstable market environment increases the difficulty of modeling. Data-driven methods often perform well when applied to short term forecasting, but they lack economic meaning and cannot explain the inner driving forces that move the crude oil price.

The dilemma between difficulties in modeling and lack of economic meaning can be addressed by an objective data analysis method, namely Empirical Mode Decomposition (EMD), introduced by Huang et al. (1998). EMD is an empirical, intuitive, direct and self-adaptive data processing method proposed especially for nonlinear and non-stationary data. The core of EMD is to decompose data into a small number of independent and nearly periodic intrinsic modes based on the local characteristic scale, which is defined in EMD as the distance between two successive local extrema. Each derived intrinsic mode is dominated by scales in a narrow range. Thus, according to the scale, the concrete implications of each mode can be identified. For example, an intrinsic mode derived from an economic time series with a scale of three months can often be recognized as the seasonal component. Since data are the only link we have with reality, by exploring the intrinsic modes of the data, EMD not only helps discover the characteristics of the data but also helps understand the underlying rules of reality.

EMD was initially proposed for the study of ocean waves, and was then successfully applied in many areas, such as biomedical engineering, structural health monitoring, earthquake engineering, and global primary productivity evolution. However, these applications are mainly limited to natural science and engineering. There have been only two successful applications in the social sciences so far. The first is an application of EMD to financial data, used to examine the changeability of the markets


by Huang et al. (2003b). The second, by Cummings et al. (2004), uses EMD to demonstrate the existence of a spatial-temporal traveling wave in the incidence of dengue hemorrhagic fever in Thailand. We show below that EMD can be used more widely in the social sciences, for example in crude oil price analysis.

In this paper, we apply Ensemble EMD (EEMD, Wu and Huang, 2004), an improved EMD, to crude oil price data and find that it can help interpret the formation of the crude oil price from a novel perspective. First, three crude oil price series with different time ranges and frequencies are decomposed into several independent intrinsic modes, from high to low frequency. Second, the intrinsic modes are composed into a fluctuating process, a slowly varying part and a trend, based on fine-to-coarse reconstruction. The economic meanings of the three components are identified, according to their respective scales and characteristics, as short term fluctuations caused by normal supply-demand disequilibrium or some other market activities, the effect or shock of a significant event, and a long term trend. Finally, we characterize the features of the three components and the evolution of the crude oil price. Some forecasting strategies for crude oil price, based on our conclusions, are also discussed at the end of the paper.

The rest of the paper is organized as follows. Section 2 gives a brief introduction to the basic theory and algorithm of EMD and EEMD. Section 3 introduces the data and decomposes them by EEMD; the derived intrinsic modes are also shown in this section. Detailed analyses based on a composition of the intrinsic modes are presented in Section 4. Section 5 concludes the paper.

2. Empirical Mode Decomposition

2.1. EMD theory and algorithm

EMD is a general data processing method for nonlinear, non-stationary data, developed by Huang et al. (1998). It assumes that the data, depending on their complexity, may have many different coexisting modes of oscillation at the same time.
EMD can extract these intrinsic modes from the original time series, based on the local characteristic scale of the data itself, and represent each intrinsic mode as an intrinsic mode function (IMF), which meets the following two conditions:

1) The function has the same number of extrema and zero-crossings, or the two numbers differ by at most one;
2) The function is symmetric with respect to the local zero mean.

The two conditions ensure that an IMF is a nearly periodic function with zero mean. An IMF is a harmonic-like function, but with amplitude and frequency that vary over time. In practice, the IMFs are extracted through a sifting process. The EMD algorithm is described as follows:

1) Identify all the maxima and minima of the time series x(t);
2) Generate its upper and lower envelopes, $e_{\max}(t)$ and $e_{\min}(t)$, with cubic spline interpolation;
3) Calculate the point-by-point mean m(t) of the upper and lower envelopes:

$$m(t) = \frac{e_{\min}(t) + e_{\max}(t)}{2} \tag{1}$$

4) Subtract the mean from the time series and define the difference between x(t) and m(t) as d(t):

$$d(t) = x(t) - m(t) \tag{2}$$


5) Check the properties of d(t):
• If it is an IMF, denote d(t) as the ith IMF and replace x(t) with the residual r(t) = x(t) − d(t). The ith IMF is often denoted as $c_i(t)$, and i is called its index;
• If it is not, replace x(t) with d(t);
6) Repeat steps 1)–5) until the residual satisfies some stopping criterion.

One stopping criterion proposed by Huang et al. (2003a) for extracting an IMF is to continue sifting for a predefined number of iterations after the residue first satisfies the restriction that the numbers of zero-crossings and extrema differ by no more than one. The whole sifting process can be stopped by either of the following predetermined criteria: when the component $c_i(t)$ or the residue r(t) becomes smaller than a predetermined value of substantial consequence, or when the residue r(t) becomes a monotonic function from which no more IMFs can be extracted. The total number of IMFs is limited to $\log_2 N$, where N is the length of the data series. The original time series can be expressed as the sum of the IMFs and a residue:

$$x(t) = \sum_{j=1}^{N} c_j(t) + r(t). \tag{3}$$
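The sifting steps 1)–6) above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`local_extrema`, `sift_once`, `emd`), the fixed number of sifting passes `n_sift` used in place of the stopping criterion of Huang et al. (2003a), and the treatment of the series end points are all choices made for the example. It relies on NumPy and on SciPy's `CubicSpline` for the envelopes.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def local_extrema(x):
    """Return indices of strict interior local maxima and minima of x."""
    d = np.diff(x)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima


def sift_once(x):
    """Steps 1)-4): subtract the mean of the upper and lower envelopes."""
    t = np.arange(len(x))
    maxima, minima = local_extrema(x)
    # Assumption: both end points are added to each envelope so that the
    # splines cover the whole series; the paper does not state its end rule.
    up = np.r_[0, maxima, len(x) - 1]
    lo = np.r_[0, minima, len(x) - 1]
    e_max = CubicSpline(up, x[up])(t)
    e_min = CubicSpline(lo, x[lo])(t)
    m = (e_min + e_max) / 2.0      # Eq. (1)
    return x - m                   # Eq. (2)


def emd(x, n_sift=10, max_imfs=None):
    """Decompose x into IMFs and a residue; returns (imfs, residue)."""
    x = np.asarray(x, dtype=float)
    if max_imfs is None:
        max_imfs = int(np.log2(len(x)))      # at most log2(N) IMFs
    imfs, residue = [], x.copy()
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) + len(minima) < 3:    # residue is nearly monotonic
            break
        d = residue.copy()
        for _ in range(n_sift):              # simplified: fixed sifting count
            d = sift_once(d)
        imfs.append(d)
        residue = residue - d                # step 5): r(t) = x(t) - d(t)
    return np.array(imfs), residue
```

By construction, summing the returned IMFs and the residue reproduces the input series, which is the content of Eq. (3).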

Here N denotes the number of IMFs (not the length of the series), and r(t) is the final residue. In the sifting process, the first component, c1, contains the finest scale (the shortest period component) of the time series. The residue after extracting c1 contains the longer period variations in the data. Therefore, the modes are extracted from high frequency to low frequency, and EMD can be used as a filter to separate high frequency (fluctuating process) and low frequency (slowly varying component) modes. In practice, the following algorithm is adopted, based on fine-to-coarse reconstruction, i.e. high-pass filtering by adding fast oscillations (IMFs with smaller index) up to slow ones (IMFs with larger index):

1) Compute the mean of the sum of c1 to ci for each component (except for the residue);
2) Use a t-test to identify the i for which the mean significantly departs from zero;
3) Once i is identified as a significant change point, the partial reconstruction with the IMFs from this index to the end is identified as the slowly varying mode, and the partial reconstruction with the other IMFs is identified as the fluctuating process.

The advantages of EMD can be briefly summarized as follows. First, it can reduce any data, from non-stationary and nonlinear processes, into simple independent intrinsic mode functions. Second, since the decomposition is based on the local characteristic time scale of the data and only extrema are used in the sifting process, it is local, self-adaptive, concrete in its implications and highly efficient. (This characteristic distinguishes EMD from wavelet analysis (Daubechies, 1992); see Huang et al. (1998) for detailed comparisons between EMD and wavelets.) Third, the IMFs have a clear instantaneous frequency, defined as the derivative of the phase function, so the Hilbert transform can be applied to the IMFs, allowing us to analyze the data in a time–frequency–energy space.

2.2. Ensemble EMD

EMD has proved to be quite versatile in a broad range of applications for extracting signals from data generated in nonlinear and non-stationary processes. However, the original EMD has a


drawback: the frequent appearance of mode mixing, which is defined as a single IMF either consisting of signals of widely disparate scales, or a signal of a similar scale residing in different IMF components. To overcome the problem, Wu and Huang (2004) proposed EEMD. The basic idea of EEMD is that observed data are amalgamations of the true time series and noise. Thus, even if data are collected by separate observations, each with a different noise level, the ensemble mean is close to the true time series. Therefore, an additional step of adding white noise is taken, which may help extract the true signal in the data. The procedure of EEMD is as follows:

1) Add a white noise series to the targeted data;
2) Decompose the data with added white noise into IMFs;
3) Repeat steps 1) and 2), with a different white noise series each time, and obtain the (ensemble) means of the corresponding IMFs of the decompositions as the final result.

The added white noise series present a uniform reference frame in the time–frequency and time–scale space for signals of comparable scales to collate in one IMF, and then cancel themselves out (via ensemble averaging) after serving their purpose; the procedure therefore significantly reduces the chance of mode mixing and represents a substantial improvement over the original EMD. The effect of the added white noise can be controlled according to the well-established statistical rule proved by Wu and Huang (2004):

$$\varepsilon_n = \frac{\varepsilon}{\sqrt{N}}, \tag{4}$$

where N is the number of ensemble members, ε is the amplitude of the added noise, and $\varepsilon_n$ is the final standard deviation of error, which is defined as the difference between the input signal and the corresponding IMFs. In practice, the number of ensemble members is often set to 100 and the standard deviation of the white noise series is set to 0.1 or 0.2.

3. Decomposition

Through EEMD, crude oil price data series can be decomposed into a set of independent IMFs with different scales, plus the residue. The analyses of these IMFs and the residue help explore the variability and formation of the crude oil price from a new perspective.

3.1. Data

The monthly data of the West Texas Intermediate (WTI) crude oil spot price, which is treated as the benchmark crude oil price for international oil markets, are used in our analysis. Fig. 1 shows the WTI data series from Jan. 1946 to May 2006. In our experiments, three subsets of the WTI data are used. The first is all the monthly data from Jan. 1946 to May 2006, 725 data points in total. The long time range of this data set helps extract more information and analyze the crude oil price from a long term view. The inclusion of crude oil prices from different time periods, which have different characteristics, does not affect the final results since EEMD is local. The other two data sets cover only the period from Jul. 2000 to May 2006, but one is weekly data of 308 data points and the other is monthly data of 71 data points. The shorter period lets us focus on features of the recent period of high oil prices, and the different frequencies of the data allow us to explore the features of the crude oil price in different frequency ranges.
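The EEMD procedure of Section 2.2 can be sketched as below. This is an illustrative outline rather than the implementation used in the paper. The function `eemd`, its parameters, and the assumption that the noise amplitude is expressed relative to the standard deviation of the data are choices made for the example. To stay self-contained, the decomposer is passed in as an argument, and the demonstration uses a hypothetical stand-in, `two_scale_split`, which is a simple moving-average split and not EMD. For reference, Eq. (4) with ε = 0.2 and N = 100 gives ε_n = 0.2/√100 = 0.02.

```python
import numpy as np


def eemd(x, decompose, n_ensemble=100, noise_std=0.2, seed=0):
    """Ensemble-average the modes of `decompose` over noisy copies of x.

    `decompose` maps a 1-D series to a 2-D array of modes (one row per mode).
    Assumption: it returns the same number of rows on every call.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: noise amplitude is given relative to the std of the data.
    scale = noise_std * np.std(x)
    total = None
    for _ in range(n_ensemble):
        noisy = x + rng.normal(0.0, scale, size=x.shape)    # step 1)
        modes = np.asarray(decompose(noisy), dtype=float)   # step 2)
        total = modes if total is None else total + modes
    return total / n_ensemble                               # step 3)


def two_scale_split(x, window=12):
    """Stand-in decomposer (not EMD): detail and moving-average parts."""
    kernel = np.ones(window) / window
    smooth = np.convolve(x, kernel, mode="same")
    return np.vstack([x - smooth, smooth])


# Demonstration: the leftover noise after averaging shrinks like Eq. (4).
x = np.sin(2 * np.pi * np.arange(300) / 50)
modes = eemd(x, two_scale_split, n_ensemble=100, noise_std=0.2)
error = modes.sum(axis=0) - x
expected = 0.2 * np.std(x) / np.sqrt(100)
```

In this demonstration, the standard deviation of `error` should be close to `expected`, because the stand-in decomposer is linear and the averaged modes therefore sum to the signal plus the ensemble mean of the noise.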


Fig. 1. The observed time series of WTI oil price from Jan. 1946 to May 2006.

3.2. IMFs

The IMFs and the residue derived by applying EEMD to the three data sets are shown in Figs. 2, 3, and 4. In all three decompositions, an ensemble size of 100 is used, and the white noise added in each ensemble member has a standard deviation of 0.2. Since the number of IMFs is restricted to $\log_2 N$, where N is the number of samples, the sifting processes produce 7 IMFs plus one residue for the longer monthly data, 3 IMFs plus one residue for the shorter monthly data and 5 IMFs plus one residue for the shorter weekly data. All the IMFs are listed in the order in which they are extracted, that is, from the highest frequency to the lowest frequency. In each figure, the last panel is the residue. All the IMFs present changing frequencies and amplitudes, unlike any harmonic. As the frequency changes from high to low, the amplitudes of the IMFs become larger: for example, all the amplitudes of IMF1 in Fig. 2 are smaller than 5, whereas the amplitudes of IMF7 in Fig. 2 reach as much as 15. The residues are modes slowly varying around the long term average.

3.3. IMF statistics

The following measures are used to analyze the IMFs: the mean period of each IMF, the correlation between each IMF and the original data series, and the variance and variance percentage of each IMF. Tables 1, 2, and 3 report these measures for the three decompositions. The mean period is defined here as the total number of points divided by the number of peaks of each IMF, since the frequency and amplitude of an IMF may change continuously with time and the period is not constant. Two correlation coefficients, the Pearson product moment correlation coefficient and the Kendall rank correlation coefficient, are used to measure the correlations between the IMFs and the observed data from different points of view. In addition, since these IMFs are independent of each other, it is possible to sum up the variances and use the percentage of variance to explain the contribution of each IMF to the total volatility of the observed data. However, the variances of the IMFs and the residue do not always add up to the observed variance (there is a positive 8.00% difference in Table 1, a positive 2.46% difference in Table 2 and a negative 9.34% difference in Table 3), due to a combination of rounding errors,


Fig. 2. The IMFs and residue for the WTI monthly data from Jan. 1946 to May 2006 derived through EEMD.

nonlinearity of the original time series and introduction of variance by the treatment of the cubic spline end conditions (Peel et al., 2005).

For all three decompositions, the dominant mode of the observed data is not any IMF but the residue. Both the Pearson and Kendall coefficients between the residue and the observed data are high, at least 0.75, 0.66 and 0.80 in Tables 1, 2 and 3, respectively. At the same time, the variance of the residue accounts for more than 75% of the total variability in every case, and the highest share exceeds 90%. As Huang et al. (1998) mentioned, the residue is often treated as the deterministic long term behavior. For the longer time series, the second most important mode is the lowest frequency IMF, IMF7, which has a mean period of about 30 years. Interestingly, the two correlation coefficients differ greatly for IMF7 and also for IMF6. This is because these IMFs vary

Table 1
Measures of IMFs and the residue for the WTI monthly data from Jan. 1946 to May 2006 derived through EEMD

| | Mean period (month) | Pearson correlation | Kendall correlation | Variance | Variance as % of observed | Variance as % of (ΣIMFs + residual) |
|---|---|---|---|---|---|---|
| Observed | | | | 181.41 | | |
| IMF1 | 3.09 | 0.10⁎ | 0.06⁎ | 0.66 | 0.36% | 0.34% |
| IMF2 | 6.90 | 0.07 | 0.05 | 0.66 | 0.36% | 0.34% |
| IMF3 | 13.43 | 0.14⁎ | 0.07⁎ | 0.89 | 0.49% | 0.45% |
| IMF4 | 26.85 | 0.17⁎ | −0.00 | 2.97 | 1.64% | 1.51% |
| IMF5 | 60.42 | 0.29⁎ | 0.03 | 2.55 | 1.41% | 1.30% |
| IMF6 | 145.00 | 0.23⁎ | 0.01 | 7.91 | 4.36% | 4.04% |
| IMF7 | 362.50 | 0.34⁎ | −0.01 | 32.53 | 17.93% | 16.61% |
| Residue | | 0.81⁎ | 0.75⁎ | 147.76 | 81.45% | 75.41% |
| Sum | | | | | 108.00% | 100.00% |

⁎: Correlation is significant at 0.05 level (2-tailed).


Table 2
Measures of IMFs and the residue for the WTI monthly data from Jul. 2000 to May 2006 derived through EEMD

| | Mean period (month) | Pearson correlation | Kendall correlation | Variance | Variance as % of observed | Variance as % of (ΣIMFs + residual) |
|---|---|---|---|---|---|---|
| Observed | | | | 195.02 | | |
| IMF1 | 3.38 | 0.16 | 0.12 | 3.16 | 1.62% | 1.58% |
| IMF2 | 7.89 | −0.01 | 0.01 | 2.36 | 1.21% | 1.18% |
| IMF3 | 14.20 | 0.19 | 0.22⁎ | 7.32 | 3.75% | 3.66% |
| Residue | 71.00 | 0.97⁎ | 0.66⁎ | 186.98 | 95.88% | 93.58% |
| Sum | | | | | 102.46% | 100.00% |

⁎: Correlation is significant at 0.05 level (2-tailed).

slowly. An up (down) movement can last for a long time before the direction changes. Therefore, their movement is often the reverse of that of the observed data, which are highly volatile. Although the residue also has a long cycle, it always maintains an increasing trend, and this direction is the same as that of the observed data at most of the data points, so its Kendall correlation is still high. The variances of the two most important components, the residue and IMF7, together contribute 92.02% of the total variance. On the other hand, the first two IMFs not only exhibit very low correlation coefficients with the observed data but also account for less than 1% of the total variance. This means these IMFs do not have a serious effect on the crude oil price.

Table 3
Measures of IMFs and the residue for the WTI weekly data from Jul. 2000 to May 2006 derived through EEMD

| | Mean period (week) | Pearson correlation | Kendall correlation | Variance | Variance as % of observed | Variance as % of (ΣIMFs + residual) |
|---|---|---|---|---|---|---|
| Observed | | | | 196.09 | | |
| IMF1 | 3.28 | 0.07 | 0.06 | 0.82 | 0.42% | 0.46% |
| IMF2 | 8.32 | 0.09 | 0.07 | 0.99 | 0.51% | 0.56% |
| IMF3 | 18.12 | 0.18⁎ | 0.12⁎ | 2.93 | 1.49% | 1.65% |
| IMF4 | 38.50 | 0.14⁎ | 0.10⁎ | 1.93 | 0.98% | 1.08% |
| IMF5 | 102.67 | 0.39⁎ | 0.27⁎ | 6.60 | 3.36% | 3.71% |
| Residue | | 0.96⁎ | 0.80⁎ | 164.50 | 83.89% | 92.54% |
| Sum | | | | | 90.66% | 100.00% |

⁎: Correlation is significant at 0.05 level (2-tailed).

For the two shorter time series, the second most important mode is still the IMF of the lowest frequency, as for the longer time series. But its variance percentage is rather small, less than 4%. The two monthly data sets generate nearly the same high frequency IMFs. But IMFs with long periods cannot be extracted from the shorter monthly data set due to the length restriction. Therefore, all the information of IMF4 to IMF7 and the residue of the longer monthly data set is contained in the residue of the shorter data set. This means that when a data series is observed over a short time span, the long term factors of a longer time span are treated as part of the trend. The decomposition results of the weekly and monthly data in the same time range show the consistency of EEMD. The monthly and weekly decomposition results correspond well, in accordance with their mean periods, and the modes extracted from the monthly data can be reconstructed roughly, one by one, from the modes generated by the weekly data. We use the word ‘roughly’ because some weekly data cover daily data in two successive months, which


Fig. 3. The IMFs and residue for the WTI monthly data from Jul. 2000 to May 2006 derived through EEMD.

results in errors in reconstruction. However, the first two modes of the weekly data fluctuate rapidly, revealing information contained only in high frequency data.

4. Composition

In Section 3, the decomposition results of three WTI price data sets and a basic analysis of each IMF are provided. In this section, the IMFs are separated into high frequency parts and low frequency

Fig. 4. The IMFs and residue for the WTI weekly data from Jul. 2000 to May 2006 derived through EEMD.


parts, based on the algorithm described in Section 2.1. The two components and the residue have rich economic meaning and reveal some new features of the crude oil price. We focus only on the longest data set, since its decomposition provides more information about the composing factors, owing to the longer time span.

The mean of the fine-to-coarse reconstruction as a function of the IMF index K is shown in Fig. 5. It departs significantly from zero at IMF4. Therefore, the partial reconstruction with IMF1, IMF2 and IMF3 represents the high frequency component, and the partial reconstruction with IMF4, IMF5, IMF6 and IMF7 represents the low frequency component. The residue is treated separately. Fig. 6 shows the three components and Table 4 gives statistical measures, including the Pearson and Kendall correlations between each component and the observed price, the variance of each component and the variance percentages.

Each component has distinct characteristics. The residue, as mentioned before, varies slowly around the long term mean and is therefore treated as the long term trend in the evolution of the oil price. Each sharp rise or fall of the low frequency component corresponds to a significant event, so this component represents the effects of such events. The high frequency component, characterized by small amplitudes, contains the effects of the markets' short term fluctuations.

4.1. Trend

The trend is highly correlated with the original price and accounts for more than 70% of the variability, suggesting that it is a deterministic force for oil price evolution in the long run. The continuously increasing trend is consistent with the economic development of the world, which may imply that the long term trend of the crude oil price is determined by global economic development.

Fig. 5. The mean of the fine-to-coarse reconstruction as a function of index K. The vertical dashed line at K = 4 indicates that the mean departs significantly from zero (p < 0.01).
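The fine-to-coarse reconstruction of Section 2.1, whose outcome is summarized in Fig. 5, can be sketched as follows. This is a simplified illustration, not the authors' code: the function names `fine_to_coarse_split` and `group_components` are hypothetical, and the p-value uses a large-sample normal approximation to the two-sided t-test, which is an assumption made to keep the example free of extra dependencies. The input `imfs` is assumed to be a 2-D array with one IMF per row, ordered from high to low frequency.

```python
import math
import numpy as np


def fine_to_coarse_split(imfs, alpha=0.01):
    """Return the 1-based index K at which the mean of c1 + ... + cK first
    departs significantly from zero, or None if no index qualifies."""
    imfs = np.asarray(imfs, dtype=float)
    n = imfs.shape[1]
    partial = np.cumsum(imfs, axis=0)      # row i-1 holds c1 + ... + ci
    for i, s in enumerate(partial, start=1):
        t_stat = s.mean() / (s.std(ddof=1) / math.sqrt(n))
        # Assumption: normal approximation to the two-sided t-test p-value.
        p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
        if p_value < alpha:
            return i
    return None


def group_components(imfs, residue, k):
    """High frequency = IMFs before index k; low frequency = IMFs from k on."""
    imfs = np.asarray(imfs, dtype=float)
    high = imfs[: k - 1].sum(axis=0)
    low = imfs[k - 1:].sum(axis=0)
    return high, low, np.asarray(residue, dtype=float)


# Synthetic check: only the third mode has a non-zero mean.
t = np.arange(720)
demo_imfs = np.vstack([
    np.sin(2 * np.pi * t / 6),
    np.sin(2 * np.pi * t / 24),
    2.0 + np.sin(2 * np.pi * t / 240),
])
demo_residue = 0.01 * t
k = fine_to_coarse_split(demo_imfs)
high, low, trend = group_components(demo_imfs, demo_residue, k)
```

With K = 4, as found for the long monthly series, `group_components` would place IMF1 to IMF3 in the high frequency component and IMF4 to IMF7 in the low frequency component, matching the grouping described in the text.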


Fig. 6. The three components of the WTI monthly data series from Jan. 1946 to May 2006.
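The measures reported in Tables 1 to 4 (mean period, Pearson and Kendall correlations, and the two variance percentages) can be computed along the following lines. This is a sketch under assumptions: the helper names are hypothetical, the Kendall coefficient is the simple tau-a variant without a correction for ties (the paper does not say which variant was used), and significance tests are omitted.

```python
import numpy as np


def mean_period(c):
    """Number of points divided by the number of local maxima (peaks)."""
    d = np.diff(c)
    n_peaks = np.sum((d[:-1] > 0) & (d[1:] < 0))
    return len(c) / n_peaks if n_peaks else float("inf")


def kendall_tau(a, b):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs."""
    sa = np.sign(a[:, None] - a[None, :])
    sb = np.sign(b[:, None] - b[None, :])
    n = len(a)
    # Each unordered pair is counted twice, hence n * (n - 1) ordered pairs.
    return (sa * sb).sum() / (n * (n - 1))


def component_stats(x, components):
    """Per-component mean period, correlations with x, and variance shares."""
    x = np.asarray(x, dtype=float)
    comps = np.asarray(components, dtype=float)
    variances = comps.var(axis=1, ddof=1)
    rows = []
    for c, v in zip(comps, variances):
        rows.append({
            "mean_period": mean_period(c),
            "pearson": np.corrcoef(c, x)[0, 1],
            "kendall": kendall_tau(c, x),
            "var_pct_observed": 100 * v / x.var(ddof=1),
            "var_pct_sum": 100 * v / variances.sum(),
        })
    return rows


# Synthetic check: a 12-point cycle plus a linear trend.
t = np.arange(120)
cycle = np.sin(2 * np.pi * t / 12)
trend = 0.1 * t
stats = component_stats(cycle + trend, [cycle, trend])
```

The two percentage columns differ because the component variances need not add up to the variance of the observed series, which is why the tables report a "Sum" row that departs from 100% in the "% of observed" column.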

In fact, comparing the trend with the observed price, we can see that historically, although the oil price fluctuated dramatically due to significant events, it returned to the trend after the influence of each event was over. For example, the oil crisis of 1979 and 1980 made oil rise suddenly from $15.5/barrel to $39.5/barrel, but the price fell slowly after that and finally returned to the trend price of $22.5/barrel in 1986.

Table 4
The correlation and the variance of the components for the WTI monthly data series from Jan. 1946 to May 2006

| | Pearson correlation | Kendall correlation | Variance | Variance as % of observed | Variance as % of (ΣIMFs + residual) |
|---|---|---|---|---|---|
| Observed | | | 181.41 | | |
| High frequency component | 0.16⁎ | 0.10⁎ | 2.90 | 1.60% | 1.38% |
| Low frequency component | 0.43⁎ | −0.13⁎ | 59.65 | 32.88% | 28.36% |
| Trend | 0.81⁎ | 0.75⁎ | 147.76 | 81.45% | 70.26% |
| Sum | | | | 115.93% | 100.00% |

⁎: Correlation is significant at the 0.05 level (2-tailed).

4.2. Effects of significant events

The effects of significant events are mainly described by IMF4 to IMF7. The shortest mean period of these IMFs is more than two years and the longest is about thirty years, suggesting that it is hard for the market itself to eliminate these effects quickly; the effect of a significant event may last a very long time. In addition, the amplitudes at some data points can exceed $10, suggesting that the effects of some significant events on the oil price may be very serious.

Since the trend changes slowly and the markets' normal fluctuations are small and occur at high frequency, large fluctuations in the medium term arise only from significant events. In fact, if we define the change rate as the difference between the prices in two consecutive months divided by the price in the earlier month, the change rate of the original price is consistent with that of the low frequency component. But this change rate for the low frequency


component often changes more slowly, since it excludes the effects of very high frequency activities. This is also why the curve of the significant event component looks like a smoothed crude oil price series. By separating significant events, as the low frequency component, from the whole price, the effect of every significant event can be measured, and the result can then serve as a reference for forecasting the effect of the next significant event of the same type. For example, during the period of the Iranian revolution and the Iran–Iraq war in 1979 and 1980, this component produced a price increase of $22.05, which means the maximum effect of the two events was $22.05. This upward movement began at the beginning of 1979 and did not return to zero until the end of 1985. Since no other serious event occurred during this period, we can conclude that the influence of the two events lasted for 7 years.

4.3. Normal market disequilibrium

Besides significant events and the intrinsic trend, crude oil prices are also influenced by many other factors, such as bad weather, strikes and depletion of inventory. The durations of these effects are often short, so they are classified as high frequency events and their effects are contained in the high frequency component. Although we refer to this component as the effects of normal market disequilibrium for short, it should be treated as a collection of events with a short term impact on the oil price. Since the data used in our experiments are monthly data, ‘short term’ should be understood, in general, as less than one year.

Normal market fluctuations, such as supply-demand disequilibrium, have no serious impact on the oil price: the effect is generally within $5. But these events are becoming more and more frequent and have lately become the fundamental impetus pushing oil prices up. Thus, normal market fluctuations can be neglected in long term trend prediction, but they are important for short term forecasting.
The main finding of this section is that it is reasonable to treat the crude oil price as composed of three components. The residue, which is also described as the “trend” in EMD, represents the major trend of the oil price in the long run. The low frequency component can be treated as the effects of significant events and is the main reason for dramatic oil price variability in the medium term. The high frequency component, in contrast, should be explained as normal market fluctuations or events which have only a short term impact on the crude oil price. Through this method, the price of $70.94/barrel in May 2006 can be decomposed into a trend price ($33.42), a significant event price ($32.55), and a normal fluctuation of $4.86.

5. Conclusion

The West Texas Intermediate crude oil price data are decomposed into several independent intrinsic modes with different and varying frequencies, bringing out some interesting features of crude oil price volatility. The IMFs and the residue are summed into only three components, based on fine-to-coarse reconstruction. The crude oil price can then be explained as the composite of a long term trend, the effect of shocks from significant events, and short term fluctuations caused by normal supply-demand disequilibrium. The oil price in the long run is basically determined by the trend, which changes continuously and stays around the long term mean. The sharp falls or rises in oil prices are triggered by unpredictable and significant events, the impact of which may endure for several years. Meanwhile, the small fluctuations in the short term are mainly driven by normal market activities or small events which do not have a serious influence on oil markets.

By analyzing the composition of oil price, several forecasting strategies can be considered. The first is to predict every IMF based on its own characteristics, for example by using a polynomial function to fit the residue, using a Fourier function to simulate low frequency IMFs, and applying nonlinear forecasting techniques to high frequency IMFs, and then to integrate the individual parts to obtain a final result. The second is to group the IMFs into a nonlinear part and a linear part, forecast each individually, and then sum the forecasts. Considering the concrete implications of each component, we suggest that every part be forecast based on both its concrete implications and its data characteristics. The trend can be predicted by curve fitting, and the short term fluctuations can be handled with nonlinear forecasting techniques such as Support Vector Regression and Artificial Neural Networks. Significant events, however, are hard to predict and evaluate. A significant event is itself influenced by many factors, such as the political situation, weather and other complicated factors. No one knows when, where or what will happen. Even when an irregular event is expected, evaluating its influence is still very difficult, and even if the same event happens again, it may have a different impact on oil price at a different time. A new method or an integrated forecasting framework is therefore needed to handle these issues.

Acknowledgements

This work is supported by the NSFC, CAS and RGC of Hong Kong. The authors would like to thank the two anonymous referees for their many very helpful comments and suggestions.

References

Abosedra, S., Baghestani, H., 2004. On the predictive accuracy of crude oil future prices. Energy Policy 32, 1389–1393.
Abramson, B., Finizza, A., 1991. Using belief networks to forecast oil prices. International Journal of Forecasting 7 (3), 299–315.
Abramson, B., Finizza, A., 1995. Probabilistic forecasts from probabilistic models: a case study in the oil market. International Journal of Forecasting 11 (1), 63–72.
Al Faris, A., 1991. The determinants of crude oil price adjustment in the world petroleum market. OPEC Review 15.
Bacon, R., 1991. Modelling the price of oil. Oxford Review of Economic Policy 7 (2), 17–34.
Chaudhuri, K., 2001. Long-run prices of primary commodities and oil prices. Applied Economics 33, 531–538.
Cummings, D.A.T., Irizarry, R.A., Huang, N.E., Endy, T.P., Nisalak, A., Ungchusak, K., Burke, D.S., 2004. Travelling waves in the occurrence of dengue haemorrhagic fever in Thailand. Nature 427 (6972), 344–347.
Daubechies, I., 1992. Ten Lectures on Wavelets. SIAM, Philadelphia.
Dées, S., Karadeloglou, P., Kaufmann, R.K., Sánchez, M., 2007. Modelling the world oil market: assessment of a quarterly econometric model. Energy Policy 35, 178–191.
Hagen, R., 1994. How is the international price of a particular crude determined? OPEC Review 18 (1), 145–158.
Huang, N.E., Shen, Z., Long, S.R., 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and nonstationary time series analysis. Proceedings of the Royal Society of London. A 454, 903–995.
Huang, N.E., Wu, M.L., Long, S.R., Shen, S.S.P., Qu, W.D., Gloersen, P., Fan, K.L., 2003a. A confidence limit for the empirical mode decomposition and the Hilbert spectral analysis. Proceedings of the Royal Society of London. A 459, 2317–2345.
Huang, N.E., Wu, M.L., Qu, W.D., Long, S.R., Shen, S.S.P., 2003b. Applications of Hilbert–Huang transform to nonstationary financial time series analysis. Applied Stochastic Models in Business and Industry 19, 245–268.
Huntington, H.G., 1994. Oil price forecasting in the 1980s: what went wrong? The Energy Journal 15 (2), 1–22.
Maurice, J., 1994. Summary of the Oil Price. Research Report, La Documentation Francaise Website. http://www.agiweb. org/gap/legis106/Oil Price.html.
Zamani, M., 2004. An econometrics forecasting model of short term oil spot price. IIES Energy Economist, 6th IAEE European Conference 2004.
Mirmirani, S., Li, H.C., 2004. A comparison of VAR and neural networks with genetic algorithm in forecasting price of oil. Applications of Artificial Intelligence in Finance and Economics: Advances in Econometrics 19, 203–223.
Morana, C., 2001. A semiparametric approach to short-term oil price forecasting. Energy Economics 23 (3), 325–338.

Moshiri, S., 2004. Testing for Deterministic Chaos in Futures Crude Oil Price: Does Neural Network Lead to Better Forecast? Economics, Working Paper, vol. 5.
Nelson, Y., Stoner, S., Gemis, G., Nix, H.D., 1994. Results of Delphi VIII survey of oil price forecasts. Energy Report, California Energy Commission.
Peel, M.C., Amirthanathan, G.E., Pegram, G.G.S., McMahon, T.A., Chiew, F.H.S., 2005. Issues with the application of empirical mode decomposition. In: Zerger, A., Argent, R.M. (Eds.), Modsim 2005 International Congress on Modelling and Simulation, pp. 1681–1687.
Pindyck, R.S., 1999. The long-run evolution of energy prices. The Energy Journal 20 (2), 1–25.
Rotemberg, J.J., Woodford, M., 1996. Imperfect competition and the effects of energy price increases on economic activity. Journal of Money, Credit and Banking 28 (4), 550–577.
Sadorsky, P., 2002. Time-varying risk premiums in petroleum futures prices. Energy Economics 24 (6), 539–556.
Sauter, R., Awerbuch, S., 2003. Oil price volatility and economic activity: a survey and literature review. IEA Research Paper.
Stevens, P., 1995. The determination of oil prices 1945–1995. Energy Policy 23 (10), 861–870.
Wang, S.Y., Yu, L., Lai, K.K., 2005. Crude oil price forecasting with TEI@I methodology. Journal of Systems Science and Complexity 18 (2), 145–166.
Watkins, G.C., Plourde, A., 1994. How volatile are crude oil prices? OPEC Review 18 (4), 220–245.
Wu, Z., Huang, N.E., 2004. Ensemble Empirical Mode Decomposition: a Noise-assisted Data Analysis Method. Centre for Ocean-Land-Atmosphere Studies. Technical Report, vol. 193, p. 51. http://www.iges.org/pubs/tech.html.
Xie, W., Yu, L., Xu, S.Y., Wang, S.Y., 2006. A new method for crude oil price forecasting based on support vector machines. Lecture Notes in Computer Science 3994, 441–451.
Yang, C.W., Hwang, M.J., Huang, B.N., 2002. An analysis of factors affecting price volatility of the US oil market. Energy Economics 24, 107–119.
Yu, L., Wang, S.Y., Lai, K.K., 2006. Forecasting Foreign Exchange Rates and International Crude Oil Price Volatility — TEI@I Methodology. Hunan University Press, Changsha.