Introducing an Ogive method for discontinuous data

Introducing an Ogive method for discontinuous data

Agricultural and Forest Meteorology 162–163 (2012) 58–62 Contents lists available at SciVerse ScienceDirect Agricultural and Forest Meteorology jour...

395KB Sizes 0 Downloads 58 Views

Agricultural and Forest Meteorology 162–163 (2012) 58–62

Contents lists available at SciVerse ScienceDirect

Agricultural and Forest Meteorology journal homepage: www.elsevier.com/locate/agrformet

Short communication

Introducing an Ogive method for discontinuous data B.L. Barnhart a,∗ , W.E. Eichinger a , J.H. Prueger b a

IIHR – Hydroscience and Engineering, The University of Iowa, 100 C. Maxwell Stanley Hydraulics Laboratory, Iowa City, IA 52242-1585, USA United States Department of Agriculture – Agricultural Research Service, National Laboratory for Agriculture and the Environment, 2110 University Boulevard, Room 242, Ames, IA 50011-3120, USA b

a r t i c l e

i n f o

Article history: Received 18 January 2012 Received in revised form 28 March 2012 Accepted 2 April 2012 Keywords: Empirical mode decomposition (EMD) Ogive Data gaps

a b s t r a c t Empirical mode decomposition (EMD) is a spectral decomposition algorithm, which acts as a dyadic filter in the time-domain when extracting periodic components from turbulent atmospheric data. A new development in the algorithm allows it to work with discontinuous data. This investigation uses the discontinuous form of EMD (DEMD) to develop a new Ogive function, or cumulative flux calculation, which may be used with atmospheric data containing data gaps. The method is simple and effective, and will extend the utility of Ogives. The code is written in Matlab and available for use. Published by Elsevier B.V.

1. Introduction

cumulative flux is compared with the total flux as calculated for a long duration, for example 150 min duration.

The eddy covariance technique is the most popular method for determining the turbulent energy exchange near the earth’s surface (Foken, 2008; Moncrieff et al., 2004). The technique utilizes turbulent atmospheric measurements such as wind velocity, temperature, humidity, and CO2 concentrations in order to characterize their near-surface exchange. In analyzing these measurements, a number of tools have been developed which assess the quality of the data (Foken and Wichura, 1996). One of these tools is the Ogive function, which was introduced by Oncley et al. (1990). A traditional Ogive function is the cumulative sum of the covariance between two variables as summed from high-frequency to low-frequency contributions as shown in Eq. (1) (Foken and Wichura, 1996; Foken, 2008; Moncrieff et al., 2004; Oncley et al., 1990).

   OGx min   OG  = CC



f

OGwk (f ) =

COwk (f  )df 

(1)

fNyquist

The highest frequency, the Nyquist frequency, is determined by the sampling rate, and the lowest frequency is determined by the sampling duration. The Ogive of all frequencies is equal to the total covariance. The Ogive function has often been used as a quality assessment tool to determine whether meteorological signals have been sufficiently measured to capture the total turbulent fluxes present. A convergence criterion, CC, can be defined as in Eq. (2) where the

∗ Corresponding author. Tel.: +1 319 335 5403. E-mail address: [email protected] (B.L. Barnhart). 0168-1923/$ – see front matter. Published by Elsevier B.V. http://dx.doi.org/10.1016/j.agrformet.2012.04.003

(2)

150 min

If the CC converges to within 10% of unity, then the point of its convergence represents the largest eddy structure (lowestfrequency) which contains contributions to the total flux. If the Ogive (CC within 10% of unity) does not converge, then presumably there exist larger eddies which are not adequately sampled and contribute to the flux (Foken, 2008; Moncrieff et al., 2004; Oncley et al., 1990). One problem with the Ogive function, though, is that it only works with continuous data. Currently, if artificial data cannot be reliably inserted to fill gaps of discontinuous data, then the entire data period must be removed from consideration. A new spectral decomposition algorithm called discontinuous empirical mode decomposition (DEMD) provides an alternative means by which to calculate Ogive functions, while leaving the original data (with gaps) intact. The new method will be introduced and a proof of concept will be shown. The codes were developed in Matlab and are available for general use.

2. Discontinuous EMD (DEMD) method The empirical mode decomposition (EMD) algorithm can be used to decompose a data set into a set of time-domain functions (called intrinsic mode functions (IMFs)), which represent its periodic variability at different scales (Huang et al., 1998; Rao and Hsu, 2008). The algorithm utilizes a sifting process described here:

B.L. Barnhart et al. / Agricultural and Forest Meteorology 162–163 (2012) 58–62

59

1. Locate the local maxima and minima of a signal. 2. Conjoin the local maxima with an interpolation function to form an upper envelope. 3. Conjoin the local minima with an interpolation function to form a lower envelope. 4. Compute the local mean, which is defined as half the difference between the upper and lower envelopes. 5. Subtract the local mean from the signal. 6. Iterate on the residual until the signal meets the criteria of an IMF. An IMF is, by definition, a signal which embodies the following two criteria:

Fig. 1. Flowchart depicting the construction of an Ogive method, which can be used for discontinuous data.

1. The number of extrema and zero-crossings must be equal or differ at most by one. 2. At any point, the mean value defined by local envelopes must be zero.

The specifics of the original EMD algorithm are beyond the scope of this investigation. For a complete explanation, see Huang et al. (1998), Rao and Hsu (2008), or Barnhart and Eichinger (2011). The discontinuous EMD (DEMD) algorithm is a generalization of the EMD algorithm, which is able to handle data with gaps. Essentially, it relies on constructing piecewise envelopes around discontinuous data and then sifting each individual segment to obtain discontinuous IMFs (Barnhart et al., 2011). For a complete explanation of discontinuous EMD, see Barnhart et al. (2011). For this investigation, it is sufficient to note that the output of the discontinuous EMD algorithm provides a set of discontinuous IMF functions which represent the intrinsic periodic variability of the original signal in the time-domain (Barnhart et al., 2011). The sum of the IMFs returns the original signal.

The sifting process is repeated until the first IMF has been extracted from the data. This IMF is the highest frequency component of the signal. Then, the remaining signal is sifted until all IMFs have been extracted and a monotonic trend remains. The decomposition is expressed in Eq. (3).

D(t) = Rn +

n−1 

IMFj (t)

(3)

j=1

Temperature

IMF1

38 36

IMF2

IMF3

IMF4

IMF5

1

1

1

1

1

0.5

0.5

0.5

0.5

0.5

0

0

0

0

0

−0.5

−0.5

−0.5

−0.5

−0.5

34 32 30 28

0

1 Pts

2

−1

5

5

x 10

IMF6

10 Pts

−1

15

5

4

x 10

IMF7

10 Pts

−1

15

5

4

x 10

IMF8

10 Pts

−1

15

5

4

x 10

IMF9

10 Pts

−1

15 x 10

IMF10

1

1

1

1

1

0.5

0.5

0.5

0.5

0.5

0.5

0

0

0

0

0

0

−0.5

−0.5

−0.5

−0.5

−0.5

−0.5

5

10 Pts

−1

15

5

4

x 10

IMF12

10 Pts

−1

15

5

4

x 10

IMF13

10 Pts

−1

15

5

4

x 10

IMF14

10 Pts

−1

15

5

4

x 10

IMF15

10 Pts

−1

15

5

4

x 10

IMF16

1

1

1

1

1

0.5

0.5

0.5

0.5

0.5

0

0

0

0

0

−0.5

−0.5

−0.5

−0.5

−0.5

10 Pts

15 4

x 10

IMF11

1

−1

5

4

10 Pts

15 4

x 10

IMF17 40

35

30

−1

5

10 Pts

−1

15 4

x 10

5

10 Pts

−1

15 4

x 10

5

10 Pts

−1

15 4

x 10

5

10 Pts

−1

15 4

x 10

5

10 Pts

25

15 4

x 10

5

10 Pts

Fig. 2. Example of discontinuous data decomposed into its set of IMFs, which represent the periodic variability of the signal at different scales.

15 4

x 10

60

B.L. Barnhart et al. / Agricultural and Forest Meteorology 162–163 (2012) 58–62

Day 1

Day 2

1

1

0.5

0.5

0

0

2

4

6

8 10 IMF Number Day 3

12

14

16

18

0

1

1

0.5

0.5

0

0

2

4

6

8 10 IMF Number Day 5

12

14

16

18

0

1

1

0.5

0.5

0

0

2

4

6

8 10 IMF Number Day 7

12

14

16

18

0

1

1

0.5

0.5

0

0

2

4

6

8 10 IMF Number Day 9

12

14

16

18

0

1

1

0.5

0.5

0

0

2

4

6

8 10 IMF Number

12

14

16

18

0

0

2

4

6

8 10 IMF Number Day 4

12

14

16

18

0

2

4

6

8 10 IMF Number Day 6

12

14

16

18

0

2

4

6

8 10 IMF Number Day 8

12

14

16

18

0

2

4

6

8 10 IMF Number Day 10

12

14

16

18

0

2

4

6

8 10 IMF Number

12

14

16

18

Fig. 3. Discontinuous empirical mode decomposition Ogive function shown for 10 separate days. The y-axis is the fractional cumulative sum of the contributions to the total flux, where the total flux is defined as the 2.5-h duration flux. Durations ranged from 5 min to 2.5 h. The horizontal lines set the convergence threshold at 10%.

The DEMD algorithm will now be used to construct an Ogive method for discontinuous data. It will be shown that by calculating the relative contributions from each of the IMFs to the total covariance, an Ogive can be constructed, regardless of gaps present. 3. A new Ogive method using DEMD The goal of this investigation is to introduce an Ogive method, which works for discontinuous data. The method is shown in a flowchart in Fig. 1. First, two discontinuous data sets (input1, input2) are chosen. For this example, vertical wind velocity, w, and temperature, T,



(wIMF1 , T IMF1 )  ⎜ (wIMF2 , T IMF1 ) nancov(w, T ) = nancov ⎜ .. ⎝ . (wIMFN , T IMF1 ) were used. They were 20 Hz data measured from towers during the 2002 Soil Moisture EXperiment (SMEX) in Ames, Iowa (see Prueger et al. (2009) for complete experiment information). The DEMD was used to decompose each data set into its set of IMFs. These IMFs are discontinuous time-domain functions, which represent the periodic variability of the data at different scales. An

example decomposition of 150 min temperature data is shown in Fig. 2. Two artificial gaps were induced in the data set before decomposition. Each gap was 5000 data points long (∼3% of the data set duration). The algorithm’s dependence on gap size will be shown later. The sum of the IMFs is equivalent to the original signal. Therefore, the contribution to the total covariance can be obtained by calculating the covariances between each IMF pairs of the different variables. For instance, the Matlab function nancov() was used to calculate the covariance between two IMFs of w and T, while skipping the gaps (NaNs). A covariance matrix is then produced as shown in Eq. (4). (wIMF1 , T IMF2 ) (wIMF2 , T IMF2 ) .. . (wIMFN , T IMF2 )

··· ··· .. . ···



(wIMF1 , T IMFN ) (wIMF2 , T IMFN ) ⎟ ⎟ .. ⎠ . (wIMFN , T IMFN )

(4)

This shows the covariance contribution from each of the IMF pairs, skipping gaps. The total covariance is equal to the sum of the covariances from all the IMF pairs. The columns of the covariance matrix can then be summed so that all cross-term contributions are combined. For example, the first column combines the sum of the covariance contributions

B.L. Barnhart et al. / Agricultural and Forest Meteorology 162–163 (2012) 58–62

61

between wIMF1 and all the TIMFs. The second column combines the sum of the covariance contributions from wIMF2 and all the TIMFs, etc. See Eq. (5).

⎞ ⎛ n−1  nancov(wIMF1 , T IMFj ) 0 0 ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ j=1 ⎟ ⎜ n−1  ⎜ ⎟ ⎟ ⎜ 0 nancov(wIMF2 , T IMFj ) 0 nancov(w, T ) = ⎟ ⎜ ⎟ ⎜ j=1 ⎟ ⎜ n−1 ⎟ ⎜  ⎝ 0 0 nancov(wIMFn−1 , T IMFj ) ⎠

(5)

j=1

Finally, Ogive curves can be constructed from the cumulative sum of the covariance contributions shown in Eq. (5). To demonstrate this new Ogive concept using actual data, 10 days of 20 Hz vertical wind velocity and temperature data from the 2002 Soil Moisture EXperiment (SMEX) were used. Each data set began at 1000 local time and ended 1230 local time. Two artificial gap durations, with equal lengths (5000 points each), were induced in each data set at two locations. The data sets were then decomposed into their IMFs, a covariance matrix was calculated, and the cumulative sum was plotted in Fig. 3. The y-axis in Fig. 3 is the cumulative sum of the covariance as contributed from the IMFs, and the x-axis is the IMF number, or column number in Eq. (5). IMF 13 represents scales of approximately 30 min. Each plot was normalized by the total 2.5-h covariance. The Ogive curves work in precisely the same way as traditional Ogive curves. That is, if the Ogive is within some threshold of the convergence criterion, as shown in Eq. (2), then the flux is said to have been sufficiently captured. Two horizontal lines were drawn in the plots within Fig. 3, which denote 10% convergence thresholds. If the Ogive remains within these limits, then the Ogive meets the convergence criterion. Then, the entire covariance is said to be contained in the eddies (or in this case, IMFs) with periodicities equal or less than the point of convergence. If the Ogive is either above or below the 10% convergence thresholds at the longest periodicities

Fractional Ogive of Gap vs. No Gap 1 5min gap 10min gap 20min gap 0.99

0.98

0.97

0.96

(largest IMF numbers), then presumably larger structures, which have not been completely sampled, contribute to the covariance (Foken and Wichura, 1996). In Fig. 3, days 1, 3, 6, 7, and 9 all fall within the 10% convergence threshold. Days 4 and 10 do not reach convergence while days 2, 5, and 8 rise above the convergence threshold. Note that if the convergence threshold was set to 15%, then 7 out of 10 days would be within the convergence threshold. There are a number of reasons for non-convergence, including nonstationarity and largescale advection (Finnigan et al., 2003; Sakai et al., 2001). Rather than postulating the reason for non-convergence, this tool can merely be used as a quality check to determine which data sets are adequately sampled. When the Ogives fail to meet this quality check, it is assumed that flux-carrying eddies exist at longer wavelengths and are undersampled. It is also interesting to notice that the IMFs at which the Ogives converge varies. This describes which size (frequency) scales are most responsible for the occurring flux, and is due to the local conditions near the measurement system. The gap size within the data will disrupt flux calculations and ultimately determine the resulting shape of the Ogive functions. In order to test this, one 2.5 h data set of vertical wind velocity and temperature were used. A single data gap was induced in the middle of the data set for a given duration. Three different durations were used including 5, 10, and 20 min. Then, the DEMD Ogive functions were calculated and compared with the calculated Ogives for the same data with no gaps. The results are shown in Fig. 4. For the 5 min gap, the Ogive estimates were approximately 99% of those calculated for the data with no gaps. As the gap size is increased, the fractional Ogive decreases. However, even for a 20 min data gap, the discontinuous Ogive gives values which are within 10% of the no gap Ogive. Increasing the gap size further results in highly irregular fractional Ogives. Therefore, we suggest the DEMD method works well for gaps which have durations less than 15% of the total data length. Further research is needed to verify these conclusions with a number of different data sets during different atmospheric conditions.

0.95

4. Conclusions 0.94

0.93

0

2

4

6

8

10

12

14

16

18

IMF #

Fig. 4. Fractional Ogive calculations between data with gaps and no gaps. Each data set with a particular gap size in the direct middle of the data are analyzed with discontinuous EMD and its Ogive is calculated. Then, it is divided by the Ogive estimates from the continuous, no gaps, data. The results show that the Ogive estimates decrease as the gap size widens.

The Ogive function has been long used as a quality assessment tool to determine whether the measured atmospheric fluxes of near-surface variables have been sufficiently sampled. However, it is a fundamental flaw that Ogive functions cannot be used with discontinuous data. Data gaps are common in flux data, either due to instrument failure or specialized instrumentation which require frequent calibration. Therefore, it is beneficial to extend our use of Ogive functions to data which may contain gaps.

62

B.L. Barnhart et al. / Agricultural and Forest Meteorology 162–163 (2012) 58–62

A new method has been developed which allows Ogive curves to be calculated from discontinuous data sets, without the inclusion of artificial data. This will greatly increase the application of Ogive methods. For this investigation, we utilized the discontinuous empirical mode decomposition (DEMD) algorithm. Studies have noted that the original EMD, and thus DEMD, acts as a dyadic filter in the time-domain when analyzing turbulent data (Flandrin et al., 2004). That is, each IMF extracted has a mean period which is twice as long as the previous, when processing turbulent data. Since turbulence can be thought as a collection of eddies at all size (frequency) scales, each IMF then is actually the superposition of all eddy structures within that particular dyadic size (frequency) range. This suggests that any spectral filter, which works entirely in the timedomain, could also be used to create discontinuous Ogive curves. This should prompt further research to compare the effectiveness of different time-domain filtering methods and their ability to generate Ogive curves as well as their relative speed and ease of implementation.

Appendix A. Supplementary material Supplementary material associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/ j.agrformet.2012.04.003.

References Barnhart, B.L., Wa Nandage, H.K., Eichinger, W.E., 2011. Assessing discontinuous data using Ensemble Empirical Mode Decomposition. Adv. Adapt. Data Anal. 3 (4), 1–9. Barnhart, B.L., Eichinger, W.E., 2011. Analyzing sunspot variability using empirical mode decomposition. Sol. Phys. 269 (2), 439–449. Finnigan, J., Clement, R., Malhi, Y., Leuning, R., Cleugh, H.A., 2003. A re-evaluation of long-term flux measurement techniques. Part I. Averaging and coordinate rotation. Boundary-Layer Meteorol. 107, 1–48. Flandrin, P., Rilling, G., Gonc¸alvès, P., 2004. Empirical mode decomposition as a filter bank. IEEE Signal Process. Lett. 11 (2), 112. Foken, T., 2008. Micrometeorology. Springer-Verlag, Berlin, Heidelberg. Foken, T., Wichura, B., 1996. Tools for quality assessment of surface-based flux measurements. Agric. Forest Meteorol. 78, 83–105. Huang, N.E., Shen, Z., Long, S., Wu, M., Shih, H., Zheng, Q., Yen, N., Tung, C., Liu, H., 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. Roy. Soc. London A 454 (1971), 903. Moncrieff, J., Clement, R., Finnigan, J., Meyers, T., 2004. Averaging, detrending, and filtering of eddy covariance time series. In: Lee et al., X. (Ed.), Handbook of Micrometeorology. Kluwer Academic Publishers, Netherlands, pp. 7–31. Oncley, S.P., Businger, J.A., Itsweire, E.C., Friehe, C.A., LaRue, J.C., Chang, S.S.,1990. Surface layer profiles and turbulence measurements over uniform land under near-neutral conditions. In: 9th Symp. on Boundary Layer and Turbulence. American Meteorological Society, Roskilde, Denmark, pp. 237–240. Prueger, J., Hatfield, J., Albertson, J., Cahill, T., Cooper, D., Eichinger, B., Hipps, L., Kustas, B., Norman, J., 2009. SMEX02 SMACEX Tower Meteorological/Flux Data: Iowa. National Snow and Ice Data Center. Digital media, Boulder, CO, USA. Rao, A.R., Hsu, E.-C., 2008. Hilbert-Huang Transform Analysis of Hydrological and Environmental Time Series. Springer, Berlin, ISBN 978-1-4020-6453-1. Sakai, R., Fitzjarrald, D., Moore, K.E., 2001. Importance of low-frequency contributions to eddy fluxes observed over rough surfaces. J. Appl. Meteorol. 40, 2178–2192.