Double-Sided Optimization of ITS Data Aggregation Via Wavelet Transformation


JOURNAL OF TRANSPORTATION SYSTEMS ENGINEERING AND INFORMATION TECHNOLOGY Volume 8, Issue 1, February 2008 Online English edition of the Chinese language journal Cite this article as: J Transpn Sys Eng & IT, 2008, 8(1), 49−54.

RESEARCH PAPER

LIU Menghan1, YU Lei1,2,*, GENG Yanbin3, CHEN Xumei1
1 School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China
2 Department of Transportation Studies, Texas Southern University, Houston 77004, USA
3 Decision Support Research Center, Transportation Planning and Research Institute, Ministry of Communications, Beijing 100029, China

Abstract: This paper addresses a modeling approach to analyzing an integrated system that includes freeway mainlines, ramp metering, and an upstream signalized diamond interchange. The modeling approach takes into consideration the various components, their operational characteristics, and their interactions within the system. Strategies are also proposed for real-time operation and possible field implementation. The key element of achieving an integrated operation is to control the traffic feeding the ramp through special signal timings at the diamond interchange. Whenever a long queue is detected at the metered ramp, the signal timing should be adjusted to reduce the traffic flows entering the ramp. In this way, the ramp meter will remain in operation as long as possible, which would delay the onset of queue flush (i.e., termination of ramp metering) and minimize the possibility of a freeway breakdown. Because a diamond interchange is usually controlled by special signal phasing and timing, the control strategies focus on the special diamond signal phasing schemes. The system design and system architecture are also presented for potential deployment of the system in the field.

Key Words: data aggregation; wavelet transformation; double-sided optimization; information-loss index

1 Introduction

As a new signal processing and analysis tool, the wavelet technique has been widely used in many application fields and has shown unique advantages in solving specific traffic engineering problems in Intelligent Transportation Systems (ITS). For the aggregation of multi-source, real-time ITS data, an effective method is urgently needed to process the massive data and extract useful information, so as to improve the ability to make correct decisions in a complicated environment. In this context, the wavelet transformation is introduced into ITS data aggregation to enhance the efficiency of data management.

2 Data aggregation

2.1 Conception of data aggregation

The collection intervals of raw traffic flow data in ITS are usually less than two minutes. Data at such short intervals exhibit large fluctuations and cannot be used directly for planning applications. Thus, the traffic flow data should be aggregated[1]. For traffic flow data, the aggregation level refers to the time interval over which the data are measured. The optimal aggregation level is the optimal time interval calculated for a particular traffic application need[2]. Data aggregation is the process of transforming the real-time traffic flow data into aggregated data at optimal intervals using certain methods, according to the requirements of different users. The real-time traffic flow data analyzed in this paper include volume, speed, and occupancy.

2.2 Existing aggregation methods

There are primarily two types of data aggregation techniques: the statistical-based method[3] developed by Gajewski et al. and the wavelet-based method[4] proposed by Qiao et al. Both methods share the goal of preserving the changing characteristics of the short-interval data after aggregation. The difference is that the statistical-based method analyzes only a single sample to derive the trends of the whole sample, so undesirable information (such as errors and noise) could be included in the aggregated data series. The wavelet-based method, on the other hand, determines the common characteristics by analyzing the similarities among multiple samples. Therefore, the noise and useless components in the signal are removed, and the common information can be distilled. Through the analysis of multiple data sets, the effect of accidental traffic events can be eliminated. Therefore, the wavelet method has more practical significance and value.

The initial wavelet-based model is a process of single-sided optimization, in which the similarities among stored data with common attributes (i.e., data from the same weekday of several weeks) are compared[5]. More specifically, the power spectra of the signal components under different wavelet scales are analyzed and processed, the comparison criteria are then determined mathematically, and finally the left side of the optimal aggregation level is determined. This method can determine only that one boundary of the aggregation levels; it is therefore an aggregation of single-sided optimization (SSO)[6,7]. The intention of this paper is to present a double-sided optimization (DSO) method, originally proposed by Qiao et al.[8], which also calculates the other boundary of the aggregation levels.

Received date: 2007-06-13; Revised date: 2007-09-04; Accepted date: 2007-09-28
*Corresponding author. E-mail: [email protected]
Copyright © 2008, China Association for Science and Technology. Electronic version published by Elsevier Limited. All rights reserved.

3 Methodologies

The basic ideas and a detailed description of the methodologies presented in this paper can be found in Qiao et al.[8] After full consideration of the characteristics of ITS data in Beijing, the data aggregation framework, based on the idea of developing a range of optimal aggregation levels, is shown in Fig. 1. Building on the left-side aggregation method, the DSO method that meets the application needs of ITS data is developed by introducing the Information-Loss Index (ILI). The framework starts from the SSO method, determines the right boundary of the aggregation level using the ILI, and then determines the range of the optimal aggregation level. The following uses five days of traffic volume data from the third-ring express road in Beijing as an example to explain the detailed steps of the right-side optimization.

Fig. 1 Framework of ITS data aggregation based on DSO

(1) Data aggregation

To construct the ILI, the raw ITS data series are first aggregated directly at several specified aggregation levels, which serve as the basis for evaluating the information loss of each aggregation level. In Fig. 2, the horizontal axis represents time and the vertical axis represents volume. After data aggregation, the peak value of the traffic volume is reduced, the fluctuations are smoother, and the trends are clearer.

Fig. 2 Comparison between ITS data before and after aggregation (panels: raw data and aggregated data; horizontal axis: time; vertical axis: volume)
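The direct aggregation in step (1) can be illustrated with a minimal sketch. The paper does not state the aggregation operator, so averaging consecutive records is an assumption here (chosen because the text reports a reduced peak value after aggregation); the function name `aggregate`, the sample values, and the choice to drop an incomplete trailing block are all hypothetical.

```python
def aggregate(series, level):
    """Average every `level` consecutive records of a raw series.

    Assumption: the aggregated value is the block mean, and a trailing
    block shorter than `level` is dropped.
    """
    if level < 1:
        raise ValueError("level must be a positive integer")
    n_blocks = len(series) // level
    return [
        sum(series[k * level:(k + 1) * level]) / level
        for k in range(n_blocks)
    ]


# Hypothetical short-interval volume counts (not data from the paper).
raw = [10, 30, 20, 40, 80, 20]
aggregated = aggregate(raw, 2)  # pairs of records are averaged
```

With these made-up values the largest aggregated value is smaller than the largest raw value, which mirrors the peak reduction described for Fig. 2.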

(2) Wavelet decomposition

Both the raw data and the aggregated data are decomposed by the wavelet transformation to specific decomposition scales, where larger scales correspond to the lower-frequency components of the signals. The upper part of Fig. 3 shows the wavelet decomposition coefficients of the raw data, and the lower part shows those of the aggregated data. In this figure, the vertical axis represents the wavelet coefficients. The peak value decreases from 400 to 300 after aggregation, which reflects the information loss caused by the data aggregation.


Fig. 3 Results of ITS Data after CWT (Continuous Wavelet Transformation) (upper panel: raw data; lower panel: aggregated data; vertical axis: coefficients of CWT)
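The wavelet decomposition of step (2) can be sketched in a few lines. The paper performs the CWT with the MATLAB Wavelet Toolbox[9] and does not name the mother wavelet, so the Mexican-hat (Ricker) wavelet, the truncation of the kernel at five scale widths, and the function names `ricker` and `cwt` are assumptions made for illustration only. The scales used here are far smaller than the 800 to 6000 reported in the paper.

```python
import math


def ricker(t, scale):
    """Mexican-hat (Ricker) wavelet at offset t, stretched by `scale` (assumed wavelet)."""
    x = t / scale
    return (1.0 - x * x) * math.exp(-0.5 * x * x) / math.sqrt(scale)


def cwt(signal, scales):
    """Return {scale: coefficients}, correlating the signal with the scaled wavelet.

    Samples outside the signal are treated as zero (an assumption).
    """
    n = len(signal)
    result = {}
    for s in scales:
        half = int(5 * s)  # the wavelet is negligible beyond about 5 scale widths
        offsets = range(-half, half + 1)
        kernel = [ricker(k, s) for k in offsets]
        row = []
        for i in range(n):
            acc = 0.0
            for k, w in zip(offsets, kernel):
                j = i + k
                if 0 <= j < n:
                    acc += signal[j] * w
            row.append(acc)
        result[s] = row
    return result


# Hypothetical test signals: a flat series and a single spike.
flat = [10.0] * 60
spike = [0.0] * 60
spike[30] = 1.0
flat_coeffs = cwt(flat, [2])[2]
spike_coeffs = cwt(spike, [2])[2]
```

Because the wavelet has zero mean, the flat series yields coefficients close to zero away from the edges, while the spike produces its largest coefficient at the spike position. Larger scales stretch the wavelet and therefore respond to slower (lower-frequency) variations, as stated in the text.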

(3) Fast Fourier transformation

The results of the wavelet decomposition are transformed from the time domain to the frequency domain by the Fast Fourier Transformation. The power spectra of the raw data and the aggregated data are then calculated, and their energy loss is compared. The upper and lower parts of Fig. 4 show the power spectra of the raw data and the aggregated data, respectively. The maximum of the power spectrum drops from 70,000 for the raw data to 40,000 for the aggregated data. Data aggregation eliminates the noise and the high-frequency components, but also removes part of the signal energy.

Fig. 4 Power spectrum coefficients of ITS data (upper panel: raw data; lower panel: aggregated data; vertical axis: power spectrums)

(4) Information-loss index calculation

For each pair of raw and aggregated data series, the ILI can be calculated by the following equation, in which $M_i$ is the indicator of information loss at the decomposition level $i$:

$$M_i = \sum_{t=1}^{n_t} \sum_{j=1}^{n_j} \left( q_{ij}^{t} - q_{ij}^{a} \right)$$

where $M_i$ is the information-loss index over the $n_t$ data series at the decomposition level $i$; $q_{ij}^{t}$ is the power at point $j$ for the raw data series $t$ at the decomposition level $i$; $q_{ij}^{a}$ is the power at point $j$ for the aggregated data series at the decomposition level $i$; and $n_j$ is the number of points.

Fig. 5 shows how the ILI, calculated under different wavelet decomposition scales, changes with the aggregation level. The horizontal axis represents the aggregation levels, and the vertical axis represents the ILI. The curves in different colors stand for different scales of wavelet decomposition, ranging from 800 to 6000. It can be concluded that the information loss becomes higher as the aggregation level increases. When the aggregation level is fixed, the information loss between the raw data and the aggregated data under larger scales is much higher than under smaller scales. This indicates that the energy distributions of the raw data and the aggregated data are more similar under smaller scales.

Fig. 5 Information-loss indexes of ITS data (horizontal axis: aggregated levels; vertical axis: ILIs; one curve per scale: 800, 1000, 1200, 1500, 1800, 2000, 2500, 3000, 4000, 6000)

(5) Double-sided optimization

The Dissimilarity Index (DI) curve, the exponential regression curve of the DI, and the ILI curve are plotted in Fig. 6. As observed in this figure, as the aggregation level increases, the DI becomes lower, whereas the ILI becomes higher. The objective of DSO is to find the range of aggregation levels that preserves both the similarities and the useful information of the ITS data to the greatest extent. Therefore, two cut-off values can be set, one for the DI and one for the ILI. The aggregation levels at which the DI and the ILI meet their respective cut-off values give the left and right boundaries of the range. The optimal range of the aggregation levels in Fig. 6 is 6–8 min, determined by the two curves together.
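Steps (3) and (4) can be sketched as follows. The paper uses the Fast Fourier Transformation; to keep the example dependency-free, a direct O(n^2) discrete Fourier transform is substituted here, which gives the same power values on short inputs. The paper does not say how raw and aggregated spectra of different lengths are aligned, so comparing only their common leading points is an assumption, as are the function names and all numbers.

```python
import cmath


def power_spectrum(coeffs):
    """Power |X_k|^2 of the discrete Fourier transform (direct DFT, not an FFT)."""
    n = len(coeffs)
    spectrum = []
    for k in range(n):
        x_k = sum(
            c * cmath.exp(-2j * cmath.pi * k * m / n)
            for m, c in enumerate(coeffs)
        )
        spectrum.append(abs(x_k) ** 2)
    return spectrum


def information_loss_index(raw_spectra, agg_spectra):
    """Sum of (raw power - aggregated power) over all series and points at one scale.

    Assumption: each raw spectrum is paired with one aggregated spectrum,
    and only their common leading points are compared.
    """
    total = 0.0
    for q_raw, q_agg in zip(raw_spectra, agg_spectra):
        n_j = min(len(q_raw), len(q_agg))
        total += sum(q_raw[j] - q_agg[j] for j in range(n_j))
    return total


# A constant series carries all of its power in the zero-frequency term.
flat_power = power_spectrum([1.0, 1.0, 1.0, 1.0])

# Hypothetical power values for two series (not data from the paper).
raw_spectra = [[4.0, 2.0], [3.0, 1.0]]
agg_spectra = [[3.0, 1.0], [2.0, 1.0]]
ili = information_loss_index(raw_spectra, agg_spectra)
```

The ILI values plotted in Fig. 5 lie between 0 and 1, which suggests some normalization that the text does not describe; the sketch above returns the unnormalized sum only.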

Fig. 6 Optimal decomposition scale based on DSO (horizontal axis: aggregation levels; vertical axis: indexes; curves: DI, regression for DI, ILI)

4 Case study

In this paper, ITS data aggregation software based on DSO is developed on the MATLAB platform[9]. The traffic flow data (volume, speed, and occupancy) from the three adjacent lanes of detector #03006, located in the northwest corner of the Hu Jialou Intersection, on the Jing Guang Bridge of the third-ring express road, are used as an example. Data from ten whole Wednesdays, from March 13 to May 22, 2002, have been selected[10] to compare SSO and DSO. The results are shown in Table 1, and a further sensitivity analysis is conducted. The first three parts of Table 1 show the optimal aggregation levels of the three traffic parameters (volume, occupancy, and speed) obtained by the DSO method, whereas the last part is the optimal aggregation level for speed obtained by the SSO method.
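The boundary selection of step (5) in Section 3, which produces the DSO results compared in this case study, can be illustrated with a minimal sketch. It assumes that an aggregation level is acceptable when both the DI and the ILI are at or below their cut-off values, and that the range runs from the smallest to the largest acceptable level. The function name `dso_range`, the curves, and the cut-off values are invented for illustration and are not taken from Fig. 6.

```python
def dso_range(levels, di, ili, di_cutoff, ili_cutoff):
    """Return (left, right) aggregation levels meeting both cut-offs, or None.

    Assumption: a level qualifies when DI <= di_cutoff and ILI <= ili_cutoff.
    """
    feasible = [
        level
        for level, d, m in zip(levels, di, ili)
        if d <= di_cutoff and m <= ili_cutoff
    ]
    if not feasible:
        return None  # the cut-offs are too strict for this data
    return min(feasible), max(feasible)


# Hypothetical index curves over aggregation levels in minutes.
levels = [2, 4, 6, 8, 10, 12]
di = [0.90, 0.60, 0.30, 0.20, 0.15, 0.10]
ili = [0.05, 0.10, 0.20, 0.30, 0.50, 0.70]

optimal = dso_range(levels, di, ili, di_cutoff=0.35, ili_cutoff=0.30)
too_strict = dso_range(levels, di, ili, di_cutoff=0.05, ili_cutoff=0.01)
```

Returning `None` when no level satisfies both cut-offs is a design choice that makes the "no boundary found" situation explicit, so that a user can relax one cut-off or fall back to a single side.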

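Table 1 Optimal aggregation level (min) of different hours based on DSO and SSO

| Time period (0:00 – 23:00) | Volume(1) Lane 1 | Volume(1) Lane 2 | Volume(1) Lane 3 | Occupancy(1) Lane 1 | Occupancy(1) Lane 2 | Occupancy(1) Lane 3 | Speed(1) Lane 1 | Speed(1) Lane 2 | Speed(1) Lane 3 | Speed(2) Lane 1 | Speed(2) Lane 2 | Speed(2) Lane 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0–1 | 6 | 20 | 12 | 24 | 22 | 14 | 24 | 10 | 12 | 24 | 10 | 12 |
| 1–2 | 14 | 24 | 24 | 30 | 22 | 20 | 18 | 18 | 22 | 18 | 18 | 22 |
| 2–3 | 16 | 24 | 16 | 16 | 24 | 18 | 18 | 22 | 24 | 18 | 22 | 24 |
| 3–4 | 24 | 18 | 24 | 24 | 20 | 24 | 24 | 16 | 14 | 24 | 16 | 14 |
| 4–5 | 22 | 12 | 22 | 24 | 18 | 20 | 30 | 14 | 16 | 30 | 14 | 16 |
| 5–6 | 18 | 14 | 14 | 20 | 12 | 14 | 16 | 10 | 12 | 16 | 10 | 12 |
| 6–7 | 12 | 10 | 14 | 12 | 14 | 10 | 8 | 12 | 8 | 8 | 6 | 8 |
| 7–8 | 10 | 10 | 12 | 10 | 10 | 20 | 12 | 20 | 10 | 4 | 4 | 6 |
| 8–9 | 10 | 8 | 12 | 8 | 8 | 12 | 8 | 16 | 8 | 6 | 4 | 8 |
| 9–10 | 12 | 10 | 16 | 16 | 14 | 14 | 8 | 8 | 8 | 8 | 6 | 6 |
| 10–11 | 20 | 10 | 12 | 12 | 12 | 12 | 8 | 8 | 8 | 8 | 6 | 8 |
| 11–12 | 10 | 10 | 12 | 12 | 12 | 14 | 10 | 14 | 8 | 10 | 4 | 8 |
| 12–13 | 8 | 10 | 12 | 10 | 12 | 14 | 10 | 10 | 8 | 10 | 6 | 8 |
| 13–14 | 14 | 10 | 10 | 16 | 10 | 10 | 10 | 10 | 8 | 10 | 6 | 8 |
| 14–15 | 14 | 10 | 10 | 12 | 8 | 16 | 8 | 14 | 8 | 8 | 4 | 6 |
| 15–16 | 10 | 8 | 12 | 8 | 8 | 10 | 8 | 20 | 8 | 8 | 4 | 8 |
| 16–17 | 14 | 8 | 10 | 12 | 8 | 12 | 10 | 24 | 12 | 6 | 2 | 4 |
| 17–18 | 12 | 8 | 8 | 10 | 8 | 18 | 12 | 30 | 8 | 4 | 2 | 6 |
| 18–19 | 2 | 10 | 10 | 8 | 10 | 10 | 8 | 16 | 8 | 6 | 4 | 6 |
| 19–20 | 10 | 10 | 8 | 10 | 8 | 12 | 10 | 8 | 8 | 10 | 6 | 8 |
| 20–21 | 16 | 10 | 12 | 18 | 8 | 16 | 8 | 8 | 10 | 8 | 6 | 10 |
| 21–22 | 14 | 8 | 10 | 18 | 14 | 12 | 10 | 8 | 10 | 10 | 8 | 10 |
| 22–23 | 22 | 10 | 14 | 24 | 12 | 14 | 18 | 10 | 8 | 18 | 10 | 8 |

Note: (1) DSO (2) SSO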

Figure 7 illustrates the aggregation levels of the different traffic parameters on lane 2 using DSO and compares their trends over the time periods of the day. The curves of the optimal aggregation level for volume and occupancy are similar to each other. Both have relatively low aggregation levels under high traffic demand, such as 8 min in the peak traffic hours from 8:00 a.m. to 9:00 a.m. (morning peak) and from 4:00 p.m. to 5:00 p.m. (evening peak) on lane 2. Moreover, the aggregation levels are much higher in the non-peak hours, such as 24 min from 2:00 a.m. to 3:00 a.m. These characteristics are explained by the lower data fluctuations and fewer frequency components in the non-peak hour traffic data, compared with the peak hour data under the same aggregation levels; the non-peak hour data also exhibit more regularity. Therefore, the information loss in non-peak hours is less than in peak hours, and the calculated aggregation level of the non-peak hours is much higher than that of the peak hours under the same conditions.

Fig. 7 Comparison among three traffic flow parameters (curves: volume, occupancy, speed; horizontal axis: time period; vertical axis: aggregation levels)

The optimal aggregation levels of adjacent lanes obtained using DSO are compared in Fig. 8 across the time periods of a day. The optimal aggregation levels of the different lanes show similar characteristics at the same time, because adjacent lanes experience peak and non-peak conditions at the same time and the measured traffic flow information changes in similar ways.

Fig. 8 Comparison among three traffic lanes (curves: lane 1, lane 2, lane 3; horizontal axis: time period; vertical axis: aggregation levels)

The optimization results of DSO and SSO for the speed parameter on the three lanes are compared in Fig. 9. The optimal aggregation levels acquired by DSO are always higher than or equal to those produced by SSO. This is because DSO additionally searches for the upper levels that still keep the information loss of the aggregated data relatively low. Therefore, in real applications, the aggregation results of DSO either keep the aggregation level given by SSO or expand it to larger ones.

Fig. 9 Comparison between SSO and DSO (panels: (a) lane 1, (b) lane 2, (c) lane 3; curves: DSO, SSO; horizontal axis: time period; vertical axis: aggregation levels)
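The statement that DSO levels are never below SSO levels can be checked directly against Table 1. The short sketch below uses the lane 2 speed values from that table (columns Speed(1) and Speed(2), hours 0–1 through 22–23); the function name `compare_levels` is hypothetical.

```python
# Optimal aggregation levels (min) for speed on lane 2, taken from Table 1,
# one value per hourly period from 0-1 through 22-23.
dso_speed_lane2 = [10, 18, 22, 16, 14, 10, 12, 20, 16, 8, 8, 14,
                   10, 10, 14, 20, 24, 30, 16, 8, 8, 8, 10]
sso_speed_lane2 = [10, 18, 22, 16, 14, 10, 6, 4, 4, 6, 6, 4,
                   6, 6, 4, 4, 2, 2, 4, 6, 6, 8, 10]


def compare_levels(dso, sso):
    """Return whether DSO >= SSO in every period, and the periods where DSO is larger."""
    never_lower = all(d >= s for d, s in zip(dso, sso))
    expanded = [hour for hour, (d, s) in enumerate(zip(dso, sso)) if d > s]
    return never_lower, expanded


never_lower, expanded_hours = compare_levels(dso_speed_lane2, sso_speed_lane2)
```

For this lane the two methods agree during the night hours, and DSO expands the level in the periods starting at 6:00 through 20:00.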

5 Conclusions

On the basis of the analysis in the case study, conclusions and suggestions are summarized as follows:

(1) Under the same aggregation levels, the information loss at peak hours is higher than that at non-peak hours. It is therefore suggested that data users keep short-interval data for peak hours and store long-interval data for non-peak hours as far as possible.

(2) The optimal aggregation levels of adjacent lanes have similar changing trends. It is therefore suggested that, when a detector loses some historical data because of data collection problems, the aggregation levels of its adjacent lane be used as a reference.

(3) DSO is a balancing process, and sometimes users cannot obtain the boundaries of the aggregation levels in accordance with the evaluation criteria, because of strict parameter settings or the characteristics of the data themselves. Under these circumstances, the aim of the research must be considered: which is the more important factor, the loss of information after data aggregation or the similarity among the data? Data users must then choose either both sides of the DSO method or only the more important side to define the optimal aggregation levels according to their practice.

The aggregation of traffic flow data is important in ITS data management. The wavelet analysis method, based on digital signal processing, can handle massive real-time traffic flow data simultaneously, which gives it an advantage over the conventional statistical method. Its powerful and efficient data-processing capabilities meet the requirements of ITS well and have broad applications. In this paper, the DSO approach has been preliminarily applied, and suggestions have been provided. Further studies will be conducted on the practical application of the data aggregation algorithm, based on the existing research results.

References

[1] Geng Y B, Yu L. Current situation and demand analysis of real-time ITS data. Journal of Transportation Systems Engineering and Information Technology, 2005, 5(6): 47–53.

[2] Wu J Q, Yu L, Yuan Z Z. Application and case study of real-time ITS data sampling strategy. Journal of Transportation Systems Engineering and Information Technology, 2004, 4(2): 31–41.

[3] Gajewski B, Turner S, Eisele W, et al. Intelligent transportation systems data archiving: statistical techniques for determining optimal aggregation widths for inductive loop detector speed data. Transportation Research Record 1719, TRB, National Research Council, Washington, D.C., 2000, 85–93.

[4] Qiao F X, Wang X, Yu L. Optimizing aggregation level for ITS data based on wavelet decomposition. Transportation Research Record 1840, TRB, National Research Council, Washington, D.C., 2003, 10–20.

[5] Liu M H, Yu L, Yuan Z Z. Improved aggregation levels of ITS data via wavelet decomposition and fast Fourier transform algorithm. Shanghai, China: IEEE 6th International Conference on Intelligent Transportation Systems, 2003, 1780–1785.

[6] Liu M H, Yu L, Geng Y B. Sensitive analysis of ITS data aggregation via wavelet transform-based algorithm. Beijing: 2005 Doctoral Forum of China, 2005, 101–106.

[7] Yu L, Chen X M, Geng Y B. Optimized aggregation level for ITS data based on wavelet decomposition. Journal of Tsinghua University (Science and Technology), 2004, 44(6): 793–796.

[8] Qiao F, Wang X, Yu L. Double-Sizes optimization of aggregation level for ITS data. Transportation Research Record 1879, TRB, National Research Council, Washington, D.C., 2004, 80–88.

[9] Misiti M, Misiti Y, Oppenheim G, et al. Wavelet Toolbox for Use with MATLAB: User’s Guide Version 2.1, The Math Works, Inc. Natick, MA, 2001.

[10] Beijing Jiaotong University. Report of Research on Intelligent Transportation System Data Management Technology, the Ministry of Science and Technology of China, Beijing, 2004.