Clustering of interval-valued time series of unequal length based on improved dynamic time warping


Accepted Manuscript

Xiao Wang, Fusheng Yu, Witold Pedrycz, Lian Yu

PII: S0957-4174(19)30005-3
DOI: https://doi.org/10.1016/j.eswa.2019.01.005
Reference: ESWA 12406
To appear in: Expert Systems With Applications
Received date: 14 August 2018
Revised date: 1 January 2019
Accepted date: 2 January 2019

Please cite this article as: Xiao Wang, Fusheng Yu, Witold Pedrycz, Lian Yu, Clustering of interval-valued time series of unequal length based on improved dynamic time warping, Expert Systems With Applications (2019), doi: https://doi.org/10.1016/j.eswa.2019.01.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Xiao Wang1,2, Fusheng Yu3,∗, Witold Pedrycz4 and Lian Yu3

1 School of Economics and Management, Beijing Institute of Petrochemical Technology, Beijing 102617, China
2 Beijing Academy of Safety Engineering and Technology, Beijing 102617, China
3 School of Mathematical Sciences, Beijing Normal University, Beijing 100875, China
4 Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6R 2V4, Canada

∗ Corresponding author. Tel.: +86 10 58807735

E-mail addresses: [email protected] (X. Wang), [email protected] (F. Yu), [email protected] (W. Pedrycz), [email protected] (L. Yu).

Abstract: Clustering a group of interval-valued time series of unequal length is a frequently encountered task, and the key point of such clustering is the distance measure between two interval-valued time series. However, most distance measures apply only to interval-valued time series of equal length, and the other methods applicable to unequal-length ones usually incur a high computational cost. In order to give a reasonable and efficient distance measure, this paper first proposes a new representation of interval-valued time series in the form of a sequence of 3-tuples. This representation fully takes into account the time-axis and value-axis information to decrease the loss of information. Meanwhile, this representation is guaranteed to achieve dimensionality reduction. Based on the new representation, the dynamic time warping algorithm is then employed and an improved dynamic time warping algorithm is produced. Furthermore, a hierarchical clustering algorithm based on the newly proposed distance measure is designed for interval-valued time series of equal or unequal length. Experimental results show the effectiveness of the proposed distance and quantify the performance of the designed clustering method.

Keywords: interval-valued time series, unequal length, hierarchical clustering, distance measure, dynamic time warping.

1 Introduction

Numerical time series, which are used to represent time-varying phenomena, have gained a great deal of attention and have been studied in many application areas. However, in some situations, types other than numeric ones are of interest. For instance, consider the outdoor temperature during a certain time period in a region. We usually record the maximum and minimum temperature for each day. In this case, an interval-valued time series is generated and is more appropriate to describe this phenomenon. In medicine, daily diastolic and systolic blood pressure is stored in an interval-valued dataset. Then the blood pressure within a certain period of time also forms an interval-valued time series. In essence, interval-valued time series (IVTS, for short) are interval-valued data collected in a chronological sequence. IVTS also arise quite naturally in the fields of economics (Arroyo, Espínola & Maté, 2011; Wang, Zhang & Li, 2016; Xiong, Bao & Hu, 2014), engineering (García-Ascanio & Maté, 2010; Muñoz et al., 2007) and agriculture (Lin & González-Rivera, 2016).

There has been a growing focus on such time series, and data mining techniques for them have been gaining a great deal of attention. As one of the data mining tasks, clustering of IVTS is of much significance. Classical time-series clustering is usually divided into whole clustering and subsequence clustering (Keogh, Lin & Truppel, 2003). In whole clustering, given a group of time series, the objective is to group similar time series into the same cluster, while subsequence clustering is implemented on a single time series and similar subsequences are extracted. Such a division is also suitable for interval-valued time series.

For subsequence clustering of interval-valued time series, Coppi and D’Urso (2002) proposed three fuzzy clustering models based on the analysis of the cross-sectional or longitudinal characteristics of their components (centers and spreads). Subsequently, Coppi and D’Urso (2003) also discussed the clustering of fuzzy multivariate time trajectories, where interval-valued time series can be considered a particular case of a fuzzy multivariate time trajectory. Since an interval-valued time series is composed of a series of interval-valued data ordered in time, clustering of interval-valued data has also attracted the attention of scholars. Different versions of dynamic clustering or fuzzy c-means algorithms (Carvalho & Lechevallier, 2009; Carvalho & Tenório, 2010; Carvalho & Simões, 2017; Coppi, D’Urso & Giordani, 2012; D’Urso & Giovanni, 2006; D’Urso et al., 2017) were proposed for clustering interval-valued datasets. In addition, fuzzy c-ordered medoids clustering of interval-valued data was proposed in (D’Urso & Leski, 2016), and two robust clustering methods for interval-valued data were provided in (D’Urso & Giordani, 2006; D’Urso & Giovanni, 2014). These clustering algorithms rely on different distances between interval-valued data, such as the adaptive quadratic distance or the L2 distance.

This paper mainly focuses on whole clustering, which relies heavily on the formulation of a suitable distance measure between two interval-valued time series. Based on the Hausdorff distance (Irpino & Tontodonato, 2006) and the Ichino-Yaguchi distance (Ichino & Yaguchi, 1994) between a pair of interval-valued data, Arroyo and Maté (2006) proposed two distance measures for IVTS. González et al. (2007) defined a kernel-based distance between IVTS and Sun et al. (2012) adapted the Euclidean distance to IVTS. These existing distance methods are applicable to interval-valued time series of equal length and may fail to work when encountering unequal-length ones. However, due to various reasons, the data set we are faced with is not necessarily complete, and time series or interval-valued time series of unequal length are commonly encountered. For time series of unequal length, Wang et al. (2016) studied the distance measure, and a hierarchical clustering algorithm was proposed (Wang, Yu, Pedrycz & Wang, 2018). Interval-valued time series of unequal length also often exist in reality. For instance, the volatility of a daily stock price is represented by interval-valued data. The volatility of a stock price since its issue can be regarded as an IVTS. Generally, the recording dates of two stocks are different, and interval-valued time series of unequal length arise. Hence, clustering methods suitable for interval-valued time series of unequal length are urgently needed. Since clustering is closely related to the distance measure, how to construct an appropriate distance measure between IVTS of unequal length becomes the most important task.

For interval-valued time series of unequal length, the dynamic time warping algorithm is commonly used to calculate their distance, in which the entries of the distance matrix are calculated by a distance measure between interval-valued data, such as the Hausdorff distance, L1 distance, L2 distance, Wasserstein distance or Ichino-Yaguchi distance. In addition, D’Urso and Giovanni (2006) proposed a weighted distance for interval-valued data, which is called the D’Urso-Giovanni distance. More details about these distance measures are shown in Table 1, where ωα + ωβ = 1 and 0 ≤ ωβ ≤ ωα in the D’Urso-Giovanni distance. It is apparent that the existing studies of IVTS paid more attention to the interval-valued data at each time point. That is, the value-axis information has attracted much attention, while the time-axis information is often overlooked. Beyond that, the dynamic time warping (DTW, for short) algorithm has quadratic time and space complexity, which limits its use for large-scale interval-valued time series.

Table 1: Distance measures used for interval-valued data [l, u] and [v, w].

Distance measure | Formula
L1 distance (Souza & Carvalho, 2004) | $\lvert l - v\rvert + \lvert u - w\rvert$
Hausdorff distance (Irpino & Tontodonato, 2006) | $\max\{\lvert l - v\rvert, \lvert u - w\rvert\}$
L2 distance (Carvalho, Brito & Bock, 2006) | $\sqrt{\lvert l - v\rvert^2 + \lvert u - w\rvert^2}$
Wasserstein distance (Verde & Irpino, 2008) | $\sqrt{\left(\frac{l+u}{2} - \frac{v+w}{2}\right)^2 + \frac{1}{3}\left(\frac{u-l}{2} - \frac{w-v}{2}\right)^2}$
Ichino-Yaguchi distance (Ichino & Yaguchi, 1994) | $\max(u, w) - \min(l, v) - \frac{1}{2}[(u - l) + (w - v)]$
D’Urso-Giovanni distance (D’Urso & Giovanni, 2006) | $\sqrt{\omega_\alpha^2\left(\frac{l+u}{2} - \frac{v+w}{2}\right)^2 + \omega_\beta^2\left(\frac{u-l}{2} - \frac{w-v}{2}\right)^2}$
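The six measures of Table 1 can be written down directly. The following is a minimal sketch, assuming an interval is stored as a `(lower, upper)` pair; the function names are hypothetical, and the default weights 0.7 and 0.3 are the values of ωα and ωβ used later in the experiments.

```python
import math


def _center_radius(interval):
    """Return the midpoint and half-width of an interval (lower, upper)."""
    lower, upper = interval
    return (lower + upper) / 2.0, (upper - lower) / 2.0


def l1_distance(a, b):
    (l, u), (v, w) = a, b
    return abs(l - v) + abs(u - w)


def hausdorff_distance(a, b):
    (l, u), (v, w) = a, b
    return max(abs(l - v), abs(u - w))


def l2_distance(a, b):
    (l, u), (v, w) = a, b
    return math.hypot(l - v, u - w)


def wasserstein_distance(a, b):
    (c1, r1), (c2, r2) = _center_radius(a), _center_radius(b)
    return math.sqrt((c1 - c2) ** 2 + (r1 - r2) ** 2 / 3.0)


def ichino_yaguchi_distance(a, b):
    (l, u), (v, w) = a, b
    return max(u, w) - min(l, v) - 0.5 * ((u - l) + (w - v))


def durso_giovanni_distance(a, b, w_alpha=0.7, w_beta=0.3):
    # Weights follow the constraint w_alpha + w_beta = 1, 0 <= w_beta <= w_alpha.
    (c1, r1), (c2, r2) = _center_radius(a), _center_radius(b)
    return math.sqrt(w_alpha ** 2 * (c1 - c2) ** 2 + w_beta ** 2 * (r1 - r2) ** 2)
```

Any of these functions can serve as the point-wise cost inside a DTW distance matrix when two IVTS are compared interval by interval.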

In order to reduce computing overhead and space complexity, dimensionality reduction is necessary before using the dynamic time warping algorithm. To decrease the loss of information as far as possible in the dimensionality reduction process, this paper first proposes a new representation of interval-valued time series, where each interval-valued time series is represented by a sequence of 3-tuples. This sequence of 3-tuples, which includes the time-axis and value-axis information of the original interval-valued time series, is produced in three stages. In the first stage, the time axis and value axis are nondimensionalized by normalization. After normalization, in the two-dimensional space, each interval-valued time series is regarded as a belt-shaped region, which is bounded by the lower boundary time series and the upper boundary time series. In the second stage (segmentation), a series of non-overlapping triangles fill this belt-shaped region by using l1 trend filtering (Kim et al., 2009), piecewise linear representation (Li, 2015) and a certain rule-based connection. In the third stage, the geometrical barycenter and the exterior radius of each triangle are calculated. Here, the geometrical barycenter and the exterior radius form a 3-tuple, and the interval-valued time series is approximated by a sequence of 3-tuples. At the same time, this new representation is guaranteed to achieve dimensionality reduction.

After dimensionality reduction, the DTW algorithm is employed to calculate the distance between the constructed sequences of 3-tuples. This distance measure based on geometrical barycenter, exterior radius and DTW is named BRDTW (geometrical Barycenter and exterior Radius-based DTW) for short, and the algorithm for calculating such a distance between two interval-valued time series is called the BRDTW algorithm.

The proposed distance measure exhibits two advantages. (1) It can be applied to interval-valued time series of equal or unequal length. (2) The computational complexity of the BRDTW algorithm decreases significantly compared to the dynamic time warping algorithm. In addition, we apply the BRDTW algorithm to the clustering of interval-valued time series of unequal length and derive a hierarchical clustering method by considering single-linkage clustering. The overall processing is shown in Fig. 1.

[Figure 1 diagram: Input: a group of IVTS of unequal length ([X]1, [X]2, ..., [X]N) → Distance matrix by BRDTW → Single-linkage method → Output: clustering result (Cluster 1, Cluster 2, ..., Cluster K)]

Figure 1: The flow of BRDTW-based single-linkage clustering algorithm.

This paper is organized as follows. Section 2 gives a discussion of all necessary prerequisites, such as the concept of interval-valued time series, l1 trend filtering and the dynamic time warping algorithm. A new distance measure called BRDTW between two interval-valued time series is presented in Section 3, and the BRDTW-based single-linkage clustering algorithm for IVTS of unequal length is given in Section 4. In Section 5, a series of experiments are reported to illustrate the performance of the BRDTW algorithm and the BRDTW-based single-linkage clustering algorithm. Finally, Section 6 provides a conclusion of the whole work.

2 Preliminaries

In this section, we briefly recall the concept of interval-valued time series, l1 trend filtering (Kim et al., 2009) and dynamic time warping algorithm (Berndt & Clifford, 1994) being employed in the study.

2.1 Interval-valued time series

At each time point, an interval-valued time series takes an interval defined by two values, namely the lower and the upper bounds. An interval-valued time series $[X]$ is expressed as follows

$$[X] = \{(t_X + \Delta t, [l_1, u_1]), (t_X + 2\Delta t, [l_2, u_2]), \cdots, (t_X + n\Delta t, [l_n, u_n])\}$$

where $n$ is the number of points in $[X]$ and is usually called the length of $[X]$, $t_X + i\Delta t$ denotes the time coordinate of the $i$th element of $[X]$, $\Delta t$ is the time unit and can be taken as a second, minute or day, and the interval $[l_i, u_i]$ denotes the bounds of the amplitude of $[X]$ at time $t_X + i\Delta t$, with $l_i, u_i \in \mathbb{R}$ and $l_i \le u_i$ for $i = 1, 2, \cdots, n$.

An interval-valued time series $[X]$ can be decomposed into its lower boundary time series

$$X_l = \{(t_X + \Delta t, l_1), (t_X + 2\Delta t, l_2), \cdots, (t_X + n\Delta t, l_n)\}$$

and upper boundary time series

$$X_u = \{(t_X + \Delta t, u_1), (t_X + 2\Delta t, u_2), \cdots, (t_X + n\Delta t, u_n)\}.$$
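As a small illustration of this definition, an IVTS can be held as a list of `(time, (lower, upper))` pairs and split into its two boundary series. This is only a sketch; the names `make_ivts` and `split_boundaries` and the list-of-pairs layout are assumptions made here, not notation from the paper.

```python
def make_ivts(t_start, dt, intervals):
    """Build [X] = {(t_start + i*dt, [l_i, u_i])} for i = 1, ..., n."""
    for lower, upper in intervals:
        if lower > upper:
            raise ValueError("each interval must satisfy l_i <= u_i")
    return [(t_start + i * dt, (lower, upper))
            for i, (lower, upper) in enumerate(intervals, start=1)]


def split_boundaries(ivts):
    """Decompose an IVTS into its lower and upper boundary time series."""
    lower_series = [(t, lower) for t, (lower, _) in ivts]
    upper_series = [(t, upper) for t, (_, upper) in ivts]
    return lower_series, upper_series


# Hypothetical daily low/high values, with t_X = 0 and a time unit of one day.
example = make_ivts(0.0, 1.0, [(2.0, 4.0), (3.0, 6.0), (1.0, 5.0)])
X_l, X_u = split_boundaries(example)
```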

2.2 l1 trend filtering

Assume that a time series $y = (y_1, y_2, \cdots, y_n)$ consists of an underlying trend component $x = (x_1, x_2, \cdots, x_n)$ and a random component $z = (z_1, z_2, \cdots, z_n)$. The trend $x$ can be obtained by solving an optimization problem of the following form (Kim et al., 2009)

$$\min Q(x) = \frac{1}{2}\sum_{t=1}^{n}(y_t - x_t)^2 + \lambda\sum_{t=2}^{n-1}\lvert x_{t-1} - 2x_t + x_{t+1}\rvert \qquad (1)$$

where the non-negative parameter $\lambda$ controls the trade-off between the smoothness of $x$ and the magnitude of the residual $y - x$, and $\lambda$ is usually given in advance. The first term in (1) measures the size of the residual $y - x$, while the second term quantifies the smoothness of $x$. The trend $x$ is usually called the l1 trend filtering of $y$, and l1 trend filtering is illustrated in Fig. 2.

[Figure 2 plot: the upper panel shows the series y; the lower panel, titled "l1 trend filtering", shows the trend x]

Figure 2: l1 trend filtering.

Based on (Efron et al., 2004; Rosset & Zhu, 2007), the solution $x^{lt} = (x^{lt}_1, x^{lt}_2, \cdots, x^{lt}_n)$ of (1) is a piecewise linear function. Assume that there are $p$ points of slope changes $(t_i, x^{lt}_{t_i})$ with $t_1 < t_2 < \cdots < t_p$. The initial point and end point of $x^{lt}$ are written as $(t_0, x^{lt}_{t_0})$ and $(t_{p+1}, x^{lt}_{t_{p+1}})$, where $t_0 = 1$ and $t_{p+1} = n$. Then $x^{lt}_t$ in the time period $[t_i, t_{i+1}]$ can be expressed as a line segment for $i = 0, 1, \cdots, p$. Meanwhile, the original time series $y$ is approximated by $p + 1$ line segments, where the value of $p$ depends on the parameter $\lambda$.
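To make the two terms of (1) and the notion of slope-change points concrete, the sketch below evaluates the objective for a given candidate trend and lists the time indices at which a piecewise linear trend changes slope. It does not solve the optimization problem, which requires a convex solver; the function names and the tolerance `tol` are assumptions for illustration.

```python
def l1_trend_objective(y, x, lam):
    """Evaluate Q(x) of (1): half the squared residual plus lam times the
    sum of absolute second differences of x."""
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    fit = 0.5 * sum((yt - xt) ** 2 for yt, xt in zip(y, x))
    penalty = sum(abs(x[t - 1] - 2 * x[t] + x[t + 1])
                  for t in range(1, len(x) - 1))
    return fit + lam * penalty


def slope_change_points(x, tol=1e-9):
    """Return the 1-based time indices where the second difference of x is
    non-zero, i.e. the kinks of a piecewise linear trend."""
    return [t + 1 for t in range(1, len(x) - 1)
            if abs(x[t - 1] - 2 * x[t] + x[t + 1]) > tol]


# A trend with a single kink: p = 1 slope change, hence p + 1 = 2 line segments.
trend = [0, 1, 2, 1, 0]
kinks = slope_change_points(trend)
segments = len(kinks) + 1
```

A perfectly straight trend has no kinks and a zero penalty term, which is why larger values of λ tend to yield fewer line segments.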

2.3 Dynamic time warping algorithm

Given two time series $q = (q_1, q_2, \cdots, q_n)$ and $r = (r_1, r_2, \cdots, r_m)$, dynamic time warping is performed by applying dynamic programming on an $n \times m$ grid, where each grid point $(i, j)$ corresponds to an alignment between points $q_i$ and $r_j$. A warping path $W = \{w_1, w_2, \cdots, w_K\}$ is produced by aligning the elements of $q$ and $r$, where $\max(m, n) \le K \le m + n - 1$ and $w_k = (i, j)$ for $k = 1, 2, \cdots, K$. In addition, the warping path $W$ also needs to satisfy three constraints as below.

(1) Endpoint constraint: $w_1 = (1, 1)$ and $w_K = (n, m)$.

(2) Monotonicity constraint: if $w_k = (i, j)$ and $w_{k-1} = (i', j')$, then $i \ge i'$ and $j \ge j'$.

(3) Continuity constraint: if $w_k = (i, j)$ and $w_{k-1} = (i', j')$, then $i \le i' + 1$ and $j \le j' + 1$.

A warping path $W$ which minimizes the warping cost is expressed as

$$\min_{W}\left\{\sum_{k=1}^{K} d(w_k)\right\}$$

where $d(w_k)$ is the distance between $q_i$ and $r_j$ when $w_k = (i, j)$, and the Euclidean distance is usually considered (Berndt & Clifford, 1994). In order to find the optimal warping path, dynamic programming is employed and the following recurrence expression is evaluated (Berndt & Clifford, 1994):

$$\Gamma(i, j) = d(i, j) + \min\{\Gamma(i, j - 1), \Gamma(i - 1, j), \Gamma(i - 1, j - 1)\}$$

where $d(i, j)$ and $\Gamma(i, j)$ are the distance and cumulative distance between $q_i$ and $r_j$, respectively. Once the cumulative distance $\Gamma$ has been built, the warping path $W$ is immediately determined and the dynamic time warping distance between $q$ and $r$ is $\Gamma(n, m)$.
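The recurrence can be sketched in a few lines. The following is a minimal illustration, assuming scalar series and the absolute difference as the point-wise distance; the function name `dtw` and the backtracking rule for ties are choices made here, not prescribed by the paper.

```python
import math


def dtw(q, r, dist=lambda a, b: abs(a - b)):
    """Return (Gamma(n, m), warping path) for sequences q and r.

    The path is a list of 1-based grid points (i, j), from (1, 1) to (n, m).
    """
    n, m = len(q), len(r)
    # gamma[i][j] holds the cumulative distance; row/column 0 is a border.
    gamma = [[math.inf] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            gamma[i][j] = dist(q[i - 1], r[j - 1]) + min(
                gamma[i][j - 1], gamma[i - 1][j], gamma[i - 1][j - 1])

    # Backtrack from (n, m) by always stepping to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i, j))
        _, i, j = min((gamma[i - 1][j - 1], i - 1, j - 1),
                      (gamma[i - 1][j], i - 1, j),
                      (gamma[i][j - 1], i, j - 1))
    path.reverse()
    return gamma[n][m], path


distance, path = dtw([1, 2, 3], [1, 2, 2, 3])
```

Because the two series may have different lengths, the same routine applies unchanged to unequal-length inputs; only the grid size n × m changes.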

3 Distance measure between two interval-valued time series

Definition 3.1 Under the time unit $\Delta t$, given are two interval-valued time series $[X] = \{(t_X + \Delta t, [l_1, u_1]), (t_X + 2\Delta t, [l_2, u_2]), \cdots, (t_X + m\Delta t, [l_m, u_m])\}$ of length $m$ and $[Y] = \{(t_Y + \Delta t, [v_1, w_1]), (t_Y + 2\Delta t, [v_2, w_2]), \cdots, (t_Y + n\Delta t, [v_n, w_n])\}$ of length $n$. If $m \neq n$, $[X]$ and $[Y]$ are called interval-valued time series of unequal length.

It is noted that there is no requirement that $t_X = t_Y$. Denote by $X_l$ and $Y_l$ the lower boundary time series of $[X]$ and $[Y]$, respectively. Meanwhile, let $X_u$ and $Y_u$ represent the upper boundary time series of $[X]$ and $[Y]$, respectively. In more detail,

$$X_l = \{(t_X + \Delta t, l_1), (t_X + 2\Delta t, l_2), \cdots, (t_X + m\Delta t, l_m)\},$$

$$X_u = \{(t_X + \Delta t, u_1), (t_X + 2\Delta t, u_2), \cdots, (t_X + m\Delta t, u_m)\},$$

$$Y_l = \{(t_Y + \Delta t, v_1), (t_Y + 2\Delta t, v_2), \cdots, (t_Y + n\Delta t, v_n)\}$$

and

$$Y_u = \{(t_Y + \Delta t, w_1), (t_Y + 2\Delta t, w_2), \cdots, (t_Y + n\Delta t, w_n)\}.$$

For the distance measure between two interval-valued time series of unequal length, the dynamic time warping algorithm is commonly used. Each element of the distance matrix in the dynamic time warping algorithm is a distance between interval-valued data, and the time-axis information in the interval-valued time series is ignored. Besides that, the dynamic time warping algorithm has quadratic time and space complexity, which limits its use for large-scale interval-valued time series. In order to reduce computing overhead and space complexity, dimensionality reduction is necessary before using the dynamic time warping algorithm. To decrease the loss of information as far as possible in the dimensionality reduction process, a new representation of interval-valued time series is first proposed in the following part.

3.1 New representation of interval-valued time series

Since an interval-valued time series is composed of a series of interval-valued data, previous work has been more concerned with the value-axis information than with the time-axis information. However, an interval-valued time series $[X] = \{(t_X + \Delta t, [l_1, u_1]), (t_X + 2\Delta t, [l_2, u_2]), \cdots, (t_X + m\Delta t, [l_m, u_m])\}$ includes both aspects. In order to fully take into account the time-axis and value-axis information and achieve dimensionality reduction, we propose a new representation of $[X]$.

In the first stage, we take into account that the dimension of the time axis is not consistent with that of the value axis. As shown in Fig. 3, the time axis and value axis of $[X]$ are nondimensionalized by normalization, and $[X]$ is transformed into

$$[X]' = \{(t'_X + \Delta t', [l'_1, u'_1]), (t'_X + 2\Delta t', [l'_2, u'_2]), \cdots, (t'_X + m\Delta t', [l'_m, u'_m])\}$$

where $\Delta t' = \frac{1}{m-1}$, $t'_X = -\Delta t'$,

$$l'_i = \frac{l_i - \min\{l_1, l_2, \cdots, l_m\}}{\max\{u_1, u_2, \cdots, u_m\} - \min\{l_1, l_2, \cdots, l_m\}}$$

and

$$u'_i = \frac{u_i - \min\{l_1, l_2, \cdots, l_m\}}{\max\{u_1, u_2, \cdots, u_m\} - \min\{l_1, l_2, \cdots, l_m\}}$$

for $i = 1, 2, \cdots, m$. For the sake of convenience, we still use $[X]$ to denote the interval-valued time series after normalization.
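The normalization stage can be sketched as follows. With $t'_X = -\Delta t'$, the $i$th time coordinate becomes $(i-1)/(m-1)$, so both axes fall in $[0, 1]$. The function name and the input layout (a list of `(lower, upper)` pairs in chronological order) are assumptions, and the sketch requires at least two points and a non-degenerate value range.

```python
def normalize_ivts(intervals):
    """Map an IVTS given as [(l_1, u_1), ..., (l_m, u_m)] to [X]'.

    Returns a list of (t'_i, (l'_i, u'_i)) with t'_i = (i - 1) / (m - 1).
    """
    m = len(intervals)
    if m < 2:
        raise ValueError("at least two time points are required")
    lowest = min(lower for lower, _ in intervals)
    highest = max(upper for _, upper in intervals)
    span = highest - lowest
    if span <= 0:
        raise ValueError("the value range must be non-degenerate")
    dt = 1.0 / (m - 1)
    return [(i * dt, ((lower - lowest) / span, (upper - lowest) / span))
            for i, (lower, upper) in enumerate(intervals)]


normalized = normalize_ivts([(2.0, 4.0), (3.0, 6.0), (1.0, 5.0)])
```

After this step two series of different lengths share the same time range, which is what makes the time coordinates of their key points comparable.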

[Figure 3 plot: the upper panel shows [X] on its original axes; the lower panel, after normalization, shows [X]' with both axes ranging from 0 to 1]

Figure 3: Normalization of an interval-valued time series [X].

In the two-dimensional space, the whole information of $[X]$ is represented by a belt-shaped region, which is bounded by the lower boundary time series $X_l$ and the upper boundary time series $X_u$. In the second stage, we segment the belt-shaped region so that a series of non-overlapping triangles fill it, as shown in Fig. 4. The specific segmentation process is summarized as Algorithm 1.

Algorithm 1 Segmentation
Step 1 Apply l1 trend filtering to $X_l$ and $X_u$ (the key points of $X_l$ and $X_u$ are produced).
Step 2 Employ the piecewise linear representation method ($X_l$ and $X_u$ are approximated by line segments).
Step 3 Generate a series of non-overlapping triangles.

In Step 3 of Algorithm 1, we merge the key points of the lower boundary time series $X_l$ and the upper boundary time series $X_u$, rearrange them along the time axis in ascending order and connect the key points to form a series of non-overlapping triangles. Assume that $X_l$ is composed of $K_1$ line segments and written as $X_l = \{\underline{X}^1, \underline{X}^2, \cdots, \underline{X}^{K_1}\}$, where $\underline{X}^i = \langle (t_{L_i}, l_{L_i}), (t_{R_i}, l_{R_i}) \rangle$ is the $i$th line segment, $(t_{L_i}, l_{L_i})$ and $(t_{R_i}, l_{R_i})$ represent its left and right endpoints in the two-dimensional space for $i = 1, 2, \cdots, K_1$, and the right endpoint of $\underline{X}^i$ coincides with the left endpoint of $\underline{X}^{i+1}$. Suppose that $X_u$ is composed of $K_2$ line segments and expressed as $X_u = \{\overline{X}^1, \overline{X}^2, \cdots, \overline{X}^{K_2}\}$, where $\overline{X}^j = \langle (t_{L_j}, u_{L_j}), (t_{R_j}, u_{R_j}) \rangle$ for $j = 1, 2, \cdots, K_2$. After segmentation, the number of non-overlapping triangles produced from $[X]$ is $K_1 + K_2 - 2$.

[Figure 4 plot: the upper panel shows the normalized [X]; the lower panel, after segmentation, shows the belt-shaped region filled with triangles]

Figure 4: Segmentation of an interval-valued time series [X].

In the third stage, we use the geometrical barycenter and the exterior radius to represent the information of a triangle, where the geometrical barycenter reflects its location and the exterior radius (the radius of its circumcircle) reflects its scale. We call this stage ‘Calculation’.

Suppose that the coordinates of the vertices of a triangle ($\triangle ABC$) in the two-dimensional space are $(t_1, v_1)$, $(t_2, v_2)$ and $(t_3, v_3)$. Then the geometrical barycenter $(t_G, v_G)$ and the exterior radius ($r$) of this triangle are shown in Fig. 5 and given in the form

$$(t_G, v_G) = \left(\frac{t_1 + t_2 + t_3}{3}, \frac{v_1 + v_2 + v_3}{3}\right)$$

and

$$r = \frac{AB \cdot AC \cdot BC}{4S}$$

where

$$S = \frac{1}{2}\left|\det\begin{pmatrix} t_1 & v_1 & 1 \\ t_2 & v_2 & 1 \\ t_3 & v_3 & 1 \end{pmatrix}\right|,$$

$AB = \sqrt{(t_1 - t_2)^2 + (v_1 - v_2)^2}$, $AC = \sqrt{(t_1 - t_3)^2 + (v_1 - v_3)^2}$ and $BC = \sqrt{(t_2 - t_3)^2 + (v_2 - v_3)^2}$.
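The calculation stage for a single triangle can be sketched as follows. The function name `triangle_to_tuple` is hypothetical, and the sketch rejects degenerate (collinear) triangles, for which the area S is zero and the radius is undefined.

```python
import math


def triangle_to_tuple(a, b, c):
    """Return (t_G, v_G, r) for a triangle with vertices a, b, c = (t, v)."""
    (t1, v1), (t2, v2), (t3, v3) = a, b, c
    t_g = (t1 + t2 + t3) / 3.0
    v_g = (v1 + v2 + v3) / 3.0
    # Half the absolute value of the 3x3 determinant, expanded by hand.
    area = 0.5 * abs(t1 * (v2 - v3) + t2 * (v3 - v1) + t3 * (v1 - v2))
    if area == 0.0:
        raise ValueError("the three vertices are collinear")
    ab = math.hypot(t1 - t2, v1 - v2)
    ac = math.hypot(t1 - t3, v1 - v3)
    bc = math.hypot(t2 - t3, v2 - v3)
    radius = ab * ac * bc / (4.0 * area)
    return t_g, v_g, radius


# A 3-4-5 right triangle: the circumradius is half the hypotenuse.
t_g, v_g, r = triangle_to_tuple((0.0, 0.0), (3.0, 0.0), (0.0, 4.0))
```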

At this time, the 3-tuple $(t_G, v_G, r)$ represents the information of the triangle $\triangle ABC$. Since the interval-valued time series $[X]$ is made up of a series of non-overlapping triangles, $[X]$ can then be transformed into a sequence of 3-tuples. Specifically, $[X]$ can be represented as

$$\{(t_{G_1}, v_{G_1}, r_1), (t_{G_2}, v_{G_2}, r_2), \cdots, (t_{G_{K_1+K_2-2}}, v_{G_{K_1+K_2-2}}, r_{K_1+K_2-2})\}$$

where $(t_{G_i}, v_{G_i})$ and $r_i$ respectively denote the geometrical barycenter and exterior radius of the $i$th triangle generated in the stage of segmentation for $i = 1, 2, \cdots, K_1 + K_2 - 2$. Generally, $K_1, K_2 \ll m$ and the dimension of the new representation of $[X]$ is greatly reduced compared to the initial expression form

$$[X] = \{(t_X + \Delta t, [l_1, u_1]), (t_X + 2\Delta t, [l_2, u_2]), \cdots, (t_X + m\Delta t, [l_m, u_m])\}.$$

[Figure 5 plot: triangle ABC with vertices A (t1, v1), B (t2, v2) and C (t3, v3), its barycenter (tG, vG), its circumcircle and the radius r]

Figure 5: The geometrical barycenter (tG, vG) and the exterior radius (r) of a triangle.

Since the proposed representation of interval-valued time series relies mostly on the geometrical barycenter and exterior radius, we call this new representation ‘geometrical barycenter and exterior radius-based transformation’.

3.2 Geometrical barycenter and exterior radius-based DTW algorithm

A new representation of the interval-valued time series $[X]$ was proposed in Section 3.1. In the same way, $[Y]$ is also represented by a sequence of 3-tuples

$$\{(s_{G_1}, u_{G_1}, R_1), (s_{G_2}, u_{G_2}, R_2), \cdots, (s_{G_{K_3+K_4-2}}, u_{G_{K_3+K_4-2}}, R_{K_3+K_4-2})\}$$

where $K_3$ and $K_4$ are the numbers of line segments of $Y_l$ and $Y_u$, respectively.

In general, the number of triangles produced by $[X]$ is not equal to that produced by $[Y]$. In the sequel, we employ the dynamic time warping algorithm to calculate the distance between the 3-tuple sequences

$$\{(t_{G_1}, v_{G_1}, r_1), (t_{G_2}, v_{G_2}, r_2), \cdots, (t_{G_{K_1+K_2-2}}, v_{G_{K_1+K_2-2}}, r_{K_1+K_2-2})\}$$

and

$$\{(s_{G_1}, u_{G_1}, R_1), (s_{G_2}, u_{G_2}, R_2), \cdots, (s_{G_{K_3+K_4-2}}, u_{G_{K_3+K_4-2}}, R_{K_3+K_4-2})\}$$

and regard this dynamic time warping distance as the distance between the original interval-valued time series $[X]$ and $[Y]$. The algorithm for calculating such a distance between two interval-valued time series of unequal length under DTW and the ‘geometrical barycenter and exterior radius-based transformation’ is named the BRDTW algorithm (geometrical Barycenter and exterior Radius-based DTW algorithm) for short, and its flow chart is shown in Fig. 6.

[Figure 6 flow chart: Begin → Input: interval-valued time series [X] and [Y] of unequal length → Normalization → Segmentation → Calculation → DTW on new representation → Output: distance between [X] and [Y] → End]

Figure 6: The flow chart of BRDTW algorithm.

Remark 3.1 For interval-valued time series $[X]$ and $[Y]$, the $(i, j)$ entry of the distance matrix in the BRDTW algorithm is $(t_{G_i} - s_{G_j})^2 + (v_{G_i} - u_{G_j})^2 + (r_i - R_j)^2$ for $i = 1, 2, \cdots, K_1 + K_2 - 2$ and $j = 1, 2, \cdots, K_3 + K_4 - 2$.

As shown in Fig. 6, the BRDTW algorithm includes four stages. In addition, the second stage consists of three parts, namely l1 trend filtering, piecewise linear representation and generation of triangles. The computational complexity of Stage 1 is O(1). According to (Kim et al., 2009), the computational complexity of l1 trend filtering is O(n). The computational complexity of Stage 3 is O(n). Meanwhile, the computational complexity of Stage 4 is $O((K_1 + K_2)(K_3 + K_4))$, where $K_1, K_2 \ll m$ and $K_3, K_4 \ll n$. Therefore, the overall complexity of the BRDTW algorithm is O(n).
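The fourth stage can be sketched as DTW over two sequences of 3-tuples with the entry of Remark 3.1 as the point-wise cost. The version below keeps only two rows of the cumulative matrix, an implementation choice made here to show that the memory need not be quadratic; the function names and the sample tuples are hypothetical.

```python
import math


def tuple_cost(a, b):
    """Entry of the BRDTW distance matrix: squared differences of the
    barycenter coordinates and of the exterior radii."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dtw_distance(seq_x, seq_y, cost=tuple_cost):
    """Cumulative DTW distance Gamma(len(seq_x), len(seq_y)), row by row."""
    prev = [0.0] + [math.inf] * len(seq_y)   # border row Gamma(0, .)
    for a in seq_x:
        curr = [math.inf]                    # border column Gamma(., 0)
        for j, b in enumerate(seq_y, start=1):
            curr.append(cost(a, b) + min(curr[j - 1], prev[j], prev[j - 1]))
        prev = curr
    return prev[-1]


# Hypothetical 3-tuple sequences (t_G, v_G, r) of different lengths.
tuples_x = [(0.1, 0.5, 0.2), (0.5, 0.5, 0.3), (0.9, 0.4, 0.2)]
tuples_y = [(0.1, 0.5, 0.2), (0.9, 0.4, 0.2)]
brdtw_xy = dtw_distance(tuples_x, tuples_y)
```

The double loop visits one cell per pair of triangles, which is the $O((K_1 + K_2)(K_3 + K_4))$ cost stated for Stage 4.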

4 BRDTW-based single-linkage clustering algorithm

In this section, we discuss the clustering of interval-valued time series of unequal length. Suppose that we are given a group of interval-valued time series of unequal length $X = \{[X]_1, [X]_2, \cdots, [X]_N\}$, where

$$[X]_i = \{(t_i + \Delta t, [l_{i1}, u_{i1}]), (t_i + 2\Delta t, [l_{i2}, u_{i2}]), \cdots, (t_i + n_i\Delta t, [l_{in_i}, u_{in_i}])\},$$

$n_i$ is the length of the interval-valued time series $[X]_i$ for $i = 1, 2, \cdots, N$, and there exists at least one pair $i \neq j$ such that $n_i \neq n_j$.

First of all, by using the BRDTW algorithm, we obtain the distances among the interval-valued time series in $X$, and the distance matrix of dataset $X$ is generated. As for the clustering method, we employ the agglomerative hierarchical clustering method for three reasons.

(1) It does not require the number of clusters to be provided in advance.

(2) It has the ability to cluster the dataset with different sample sizes.

(3) It has attractive capabilities to visualize the clustering result by a dendrogram.

In addition, this paper concretely employs the hierarchical clustering algorithm with single linkage, where the distance between two clusters $C_1$ and $C_2$ is taken to be the minimum distance between pairs of elements $x$ in $C_1$ and $y$ in $C_2$.

Finally, under the distance measure calculated by the BRDTW algorithm, a novel hierarchical clustering algorithm for interval-valued time series of unequal length is formed. It is referred to as the BRDTW-based single-linkage clustering algorithm, and the overall processing can be seen in Fig. 1. In essence, this clustering algorithm is composed of two phases: (I) distance matrix calculation and (II) single-linkage hierarchical clustering.

Phase I. Distance matrix calculation
For two interval-valued time series

$$[X]_i = \{(t_i + \Delta t, [l_{i1}, u_{i1}]), (t_i + 2\Delta t, [l_{i2}, u_{i2}]), \cdots, (t_i + n_i\Delta t, [l_{in_i}, u_{in_i}])\}$$

and

$$[X]_j = \{(t_j + \Delta t, [l_{j1}, u_{j1}]), (t_j + 2\Delta t, [l_{j2}, u_{j2}]), \cdots, (t_j + n_j\Delta t, [l_{jn_j}, u_{jn_j}])\}$$

coming from $X = \{[X]_1, [X]_2, \cdots, [X]_N\}$, the distance between $[X]_i$ and $[X]_j$, denoted as $d([X]_i, [X]_j)$, can be calculated by invoking the BRDTW algorithm. Then, the distance matrix of dataset $X$, denoted as $D$, is produced, in which the $(i, j)$ element of $D$ is $d([X]_i, [X]_j)$ for $i, j = 1, 2, \cdots, N$.

Phase II. Single-linkage hierarchical clustering
The single-linkage hierarchical clustering method is applied to the distance matrix $D$. Then, the clustering result of dataset $X$ is produced and displayed in the form of a dendrogram.
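Phase II can be sketched as a naive agglomerative procedure on a precomputed distance matrix. This is an illustration only: the function name, the returned merge list (which plays the role of the dendrogram) and the toy matrix are assumptions, and the cubic-time search is chosen for readability rather than speed.

```python
def single_linkage(D, num_clusters=1):
    """Agglomerative single-linkage clustering on a symmetric matrix D.

    Returns (clusters, merges); each merge is (cluster_a, cluster_b, height),
    listed in the order in which the dendrogram joins them.
    """
    clusters = [[i] for i in range(len(D))]
    merges = []
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: minimum distance over all cross pairs.
                d = min(D[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters, merges


# Hypothetical distance matrix for four series (not taken from the paper).
D_toy = [[0, 1, 4, 5],
         [1, 0, 3, 6],
         [4, 3, 0, 2],
         [5, 6, 2, 0]]
two_clusters, history = single_linkage(D_toy, num_clusters=2)
```

Running the procedure down to a single cluster yields the full merge history, so the number of clusters can be chosen afterwards by cutting the dendrogram at a suitable height.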

5 Experimental studies

In this section, a series of experiments are reported to test our proposed methods. First of all, we compare BRDTW with other distance measures on three interval-valued time series of unequal length from the perspective of effectiveness and efficiency. Furthermore, we show the performance of the BRDTW-based single-linkage clustering algorithm by comparing the clustering results under different distance measures, where hierarchical clustering is employed in all of the clustering processes.

5.1 Comparative analysis

The main objective of this experiment is to test the performance of the BRDTW algorithm. At first, we briefly describe the data set used in this part. Then we illustrate the performance of the BRDTW algorithm from the perspective of effectiveness and efficiency.

5.1.1 Dataset description

In this experiment, we randomly choose three interval-valued time series of unequal length from the Shanghai Stock Exchange (http://www.aigaogao.com/tools/history.html?s=), shown in Fig. 7, in which the interval at each time point is formed from the low price and the high price. The lengths of the interval-valued time series [X], [Y] and [Z] are 500, 495 and 485, respectively, and the time unit is ‘day’.

[Figure 7 plot: three panels showing the interval-valued time series [X], [Y] and [Z] against the time index (0 to 500)]

Figure 7: Three interval-valued time series coming from Shanghai Stock Exchange.

5.1.2 Effectiveness comparison

The lengths of the interval-valued time series [X], [Y] and [Z] are different. For interval-valued time series of unequal length, the dynamic time warping algorithm is commonly used to calculate their distance, in which each entry of the distance matrix is calculated by one of the distance measures shown in Table 1. In what follows, we compare the BRDTW algorithm with the DTW algorithms based on the Hausdorff distance, L1 distance, L2 distance, Wasserstein distance, Ichino-Yaguchi distance and D’Urso-Giovanni distance by carrying out an experiment on the data set shown in Fig. 7. After the first stage of the BRDTW algorithm, the three interval-valued time series after normalization of both the value axis and the time axis are displayed in Fig. 8. Moreover, the parameter λ is set to 0.1 in the l1 trend filtering appearing in the stage of segmentation, and the parameters ωα and ωβ are set to 0.7 and 0.3, respectively, in the D’Urso-Giovanni distance. The distance matrices produced by the BRDTW algorithm and the DTW algorithms based on the Hausdorff distance, L1 distance, L2 distance, Wasserstein distance, Ichino-Yaguchi distance and D’Urso-Giovanni distance are denoted by $D_1$, $D_2$, $D_3$, $D_4$, $D_5$, $D_6$ and $D_7$, respectively. In more detail,

$$D_1 = \begin{pmatrix} 0 & 179.5 & 287.6 \\ 179.5 & 0 & 244.9 \\ 287.6 & 244.9 & 0 \end{pmatrix}, \quad D_2 = \begin{pmatrix} 0 & 1182.5 & 325.2 \\ 1182.5 & 0 & 702.5 \\ 325.2 & 702.5 & 0 \end{pmatrix}, \quad D_3 = \begin{pmatrix} 0 & 2290.1 & 590.8 \\ 2290.1 & 0 & 1315.9 \\ 590.8 & 1315.9 & 0 \end{pmatrix},$$

$$D_4 = \begin{pmatrix} 0 & 1625.8 & 424.1 \\ 1625.8 & 0 & 943.7 \\ 424.1 & 943.7 & 0 \end{pmatrix}, \quad D_5 = \begin{pmatrix} 0 & 1145.9 & 294.5 \\ 1145.9 & 0 & 653.1 \\ 294.5 & 653.1 & 0 \end{pmatrix}, \quad D_6 = \begin{pmatrix} 0 & 1145.0 & 295.4 \\ 1145.0 & 0 & 658.0 \\ 295.4 & 658.0 & 0 \end{pmatrix}$$

and

$$D_7 = \begin{pmatrix} 0 & 801.4 & 205.0 \\ 801.4 & 0 & 454.1 \\ 205.0 & 454.1 & 0 \end{pmatrix}.$$

0.5

0.1

0.2

0.3

[ X ]' 0.4

0.5

0.6

0.7

0.8

0.9

1

CE

0

PT

1

0



AN US



1

AC

0.5

0

0

[ Y ]' 0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1

0.5 [ Z ]' 0

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

Figure 8: Three interval-valued after normalization. 14

0.9

1



  658.0   0

ACCEPTED MANUSCRIPT
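As a rough illustration of how a DTW distance between two interval-valued series of unequal length is obtained from a pointwise interval distance, consider the sketch below. It assumes the usual bound-wise forms of the Hausdorff, L1 and L2 distances between intervals (the definitions in Table 1 remain authoritative), uses plain DTW without a warping window or path-length normalization, and works on the interval values only. The function names and the toy series are illustrative and do not come from the paper.

```python
import math
from typing import Callable, List, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def hausdorff(a: Interval, b: Interval) -> float:
    """Assumed form: larger of the two bound differences."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def l1_distance(a: Interval, b: Interval) -> float:
    """Assumed form: sum of the two bound differences (city-block)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def l2_distance(a: Interval, b: Interval) -> float:
    """Assumed form: Euclidean norm of the two bound differences."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def dtw(x: List[Interval], y: List[Interval],
        dist: Callable[[Interval, Interval], float]) -> float:
    """Classic dynamic-programming DTW; x and y may differ in length."""
    n, m = len(x), len(y)
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(x[i - 1], y[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


# Toy series of lengths 3 and 2 (hypothetical values).
x = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
y = [(0.0, 1.0), (2.0, 3.0)]
d_hausdorff = dtw(x, y, hausdorff)
d_l1 = dtw(x, y, l1_distance)
d_l2 = dtw(x, y, l2_distance)
```

Swapping the `dist` argument is all that distinguishes the DTW variants in this sketch; the BRDTW algorithm itself operates on a different representation and is not reproduced here.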

From the above distance matrices, we can see that the values in matrix $D_1$ indicate that the interval-valued time series [X] is closer to [Y] than to [Z], while the values in the remaining matrices indicate that [X] is closer to [Z] than to [Y].

In addition, we apply the DTW algorithms based on the Hausdorff distance, L1 distance, L2 distance, Wasserstein distance, Ichino-Yaguchi distance and D’Urso-Giovanni distance to the normalized interval-valued time series shown in Fig. 8, and regard the resulting distance as the distance between the interval-valued time series before normalization. The parameters $\omega_\alpha$ and $\omega_\beta$ are again set to 0.7 and 0.3, respectively, in the D’Urso-Giovanni distance. The distance matrices among [X]', [Y]' and [Z]' resulting from the DTW algorithms based on the Hausdorff distance, L1 distance, L2 distance, Wasserstein distance, Ichino-Yaguchi distance and D’Urso-Giovanni distance are denoted by $D_2'$, $D_3'$, $D_4'$, $D_5'$, $D_6'$ and $D_7'$, respectively, and are expressed as follows:

$$
D_2' = \begin{pmatrix} 0 & 101.9 & 42.3 \\ 101.9 & 0 & 108.4 \\ 42.3 & 108.4 & 0 \end{pmatrix}, \quad
D_3' = \begin{pmatrix} 0 & 181.0 & 69.7 \\ 181.0 & 0 & 192.0 \\ 69.7 & 192.0 & 0 \end{pmatrix}, \quad
D_4' = \begin{pmatrix} 0 & 131.9 & 51.6 \\ 131.9 & 0 & 138.4 \\ 51.6 & 138.4 & 0 \end{pmatrix},
$$

$$
D_5' = \begin{pmatrix} 0 & 90.0 & 33.7 \\ 90.0 & 0 & 95.9 \\ 33.7 & 95.9 & 0 \end{pmatrix}, \quad
D_6' = \begin{pmatrix} 0 & 90.5 & 34.8 \\ 90.5 & 0 & 96.0 \\ 34.8 & 96.0 & 0 \end{pmatrix} \quad \text{and} \quad
D_7' = \begin{pmatrix} 0 & 61.9 & 22.9 \\ 61.9 & 0 & 66.6 \\ 22.9 & 66.6 & 0 \end{pmatrix}.
$$

The values in the above distance matrices consistently indicate that the interval-valued time series [X] is closer to [Z] than to [Y], whereas in shape [X] is closer to [Y] than to [Z], as shown in Fig. 7. Through the above analysis, for the data set shown in Fig. 7, the distance matrix produced by the BRDTW algorithm reflects the differences in shape, while the other algorithms fail to do so. Hence, the BRDTW algorithm is feasible and effective for measuring the difference in shape between two interval-valued time series of unequal length.

5.1.3 Efficiency comparison

In general, efficiency is vital to any algorithm. As described in Section 3.2, the BRDTW algorithm consists of four stages: (I) normalization, (II) segmentation, (III) calculation and (IV) DTW on the new representation. We compare the computing overhead on the data set reported in Fig. 8. The runtimes of the BRDTW algorithm and the DTW algorithms based on the Hausdorff distance, L1 distance, L2 distance, Wasserstein distance, Ichino-Yaguchi distance and D’Urso-Giovanni distance are listed in Table 2, where the time costs of the four stages of the BRDTW algorithm, (I) normalization, (II) segmentation, (III) calculation and (IV) DTW on the new representation, are 0.0949 s, 0.2597 s, 0.0512 s and 0.1509 s, respectively, and ‘s’ stands for the time unit ‘second’.

From Table 2, we can see that the BRDTW algorithm is superior to the DTW algorithms based on the Hausdorff distance, L1 distance, L2 distance, Wasserstein distance, Ichino-Yaguchi distance and D’Urso-Giovanni distance for interval-valued time series of unequal length in terms of computing overhead. In fact, the computing overhead of the BRDTW algorithm is about a third of that of the Hausdorff distance-based DTW algorithm and about half of that of the L1 distance-based DTW algorithm.

Table 2: Comparison of (1) BRDTW, (2) Hausdorff distance-based DTW, (3) L1 distance-based DTW, (4) L2 distance-based DTW, (5) Wasserstein distance-based DTW, (6) Ichino-Yaguchi distance-based DTW and (7) D’Urso-Giovanni distance-based DTW in time consumption.

Distance metric algorithm                      Time cost (s)
(1) BRDTW                                      0.5567
(2) Hausdorff distance-based DTW               1.5637
(3) L1 distance-based DTW                      1.0453
(4) L2 distance-based DTW                      1.0719
(5) Wasserstein distance-based DTW             1.1249
(6) Ichino-Yaguchi distance-based DTW          1.1847
(7) D’Urso-Giovanni distance-based DTW         1.1034
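The reported timings can be cross-checked with a few lines of arithmetic. The sketch below sums the four BRDTW stage times quoted in the text and computes the ratio of the BRDTW total to each DTW baseline in Table 2; all numbers are copied from this subsection, and the variable names are illustrative.

```python
# Stage times of BRDTW in seconds, as quoted in the text.
stage_times = {
    "normalization": 0.0949,
    "segmentation": 0.2597,
    "calculation": 0.0512,
    "DTW on new representation": 0.1509,
}
brdtw_total = sum(stage_times.values())

# Baseline DTW runtimes in seconds, as listed in Table 2.
baseline_times = {
    "Hausdorff": 1.5637,
    "L1": 1.0453,
    "L2": 1.0719,
    "Wasserstein": 1.1249,
    "Ichino-Yaguchi": 1.1847,
    "D'Urso-Giovanni": 1.1034,
}

# Fraction of each baseline's runtime that BRDTW needs.
ratios = {name: brdtw_total / t for name, t in baseline_times.items()}
```

The stage times add up to the 0.5567 s listed for BRDTW in Table 2. The ratio is close to 0.36 against the Hausdorff distance-based DTW and lies between roughly 0.47 and 0.53 against the other five baselines.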

All in all, the proposed method offers not only a sound approach but also high efficiency when quantifying the distance between interval-valued time series of unequal length.

5.2 Clustering analysis

In this section, we apply the proposed clustering method (the BRDTW-based single-linkage clustering algorithm) in two experiments to show its performance.

5.2.1 Experiment I

Since much research shows that the most commonly used distance measure in interval-valued time-series clustering is the Euclidean distance, which applies only to series of equal length, the data set in this experiment is composed of interval-valued time series of equal length. We randomly select six time series ($X_1, X_2, \dots, X_6$) from the ‘Face four’ dataset of the UCR Time Series Classification Archive (Chen et al., 2015). Each time series has 350 data points, that is, $X_i = (v_{i1}, v_{i2}, \dots, v_{i350})$ for $i = 1, 2, \dots, 6$. The ‘Face four’ dataset consists of univariate time series rather than the interval-valued time series we need. Based on these six time series, we construct a group of interval-valued time series named Group I by the following two steps.

(1) The time unit is added and $X_i$ is translated into $X_i = \{(1, v_{i1}), (2, v_{i2}), \dots, (350, v_{i350})\}$.

(2) The real number $v_{ij}$ is replaced by an interval number $[l_{ij}, u_{ij}]$ at each time point, where $l_{ij} = v_{ij} - 0.2$, $u_{ij} = v_{ij} + 0.2$ for $i = 1, 2, 3$; $j = 1, 2, \dots, 350$ and $l_{ij} = v_{ij} - 0.4$, $u_{ij} = v_{ij} + 0.4$ for $i = 4, 5, 6$; $j = 1, 2, \dots, 350$.
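The two construction steps for Group I can be sketched as follows. The sketch assumes the six univariate series are already available as Python lists; the helper names and the three-point toy series are hypothetical stand-ins for the 350-point ‘Face four’ series.

```python
from typing import List, Tuple

Interval = Tuple[float, float]
IntervalSeries = List[Tuple[int, Interval]]


def to_interval_series(values: List[float], half_width: float) -> IntervalSeries:
    """Steps (1) and (2): attach time indices and widen each value to an interval."""
    return [(t, (v - half_width, v + half_width))
            for t, v in enumerate(values, start=1)]


def build_group_one(series_list: List[List[float]]) -> List[IntervalSeries]:
    """Half-width 0.2 for series 1 to 3 and 0.4 for series 4 to 6."""
    group = []
    for i, values in enumerate(series_list, start=1):
        half_width = 0.2 if i <= 3 else 0.4
        group.append(to_interval_series(values, half_width))
    return group


# Toy stand-in: six short series instead of the real 350-point data.
toy_series = [[0.0, 1.0, -1.0] for _ in range(6)]
group_one = build_group_one(toy_series)
```

In this construction the two intended clusters differ only in interval width (0.4 versus 0.8), since all six series keep their original centre values.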

Group I, comprising the six interval-valued time series $[X]_1, [X]_2, \dots, [X]_6$, is shown in Fig. 9 and exhibits two clusters, $C_1 = \{[X]_1, [X]_2, [X]_3\}$ and $C_2 = \{[X]_4, [X]_5, [X]_6\}$.

Figure 9: Group I produced from the ‘Face four’ dataset, exhibiting two clusters: $C_1 = \{[X]_1, [X]_2, [X]_3\}$ and $C_2 = \{[X]_4, [X]_5, [X]_6\}$.

Figure 10: Clustering results of Group I under eight clustering algorithms, where the horizontal axis and vertical axis represent the interval-valued time series label and the distance, respectively.

In what follows, we compare (1) the BRDTW-based single-linkage clustering algorithm with the hierarchical clustering algorithms using (2) Hausdorff distance-based DTW, (3) L1 distance-based DTW, (4) L2 distance-based DTW, (5) Wasserstein distance-based DTW, (6) Euclidean distance, (7) Ichino-Yaguchi distance-based DTW and (8) D’Urso-Giovanni distance-based DTW. The parameter λ is set to 1 in the l1 trend filtering used in the BRDTW algorithm, and the parameters $\omega_\alpha$ and $\omega_\beta$ are set to 0.7 and 0.3, respectively, in the D’Urso-Giovanni distance. The clustering results of Group I under these eight clustering algorithms are displayed in Fig. 10 in the form of dendrograms, and the running times of these eight clustering algorithms are 0.7743 s, 2.2751 s, 2.8073 s, 2.8124 s, 2.5037 s, 0.0218 s, 2.4221 s and 2.4190 s, respectively. In terms of time consumption, the hierarchical clustering algorithm based on Euclidean distance is optimal. However, Fig. 10 shows that the correct clustering result cannot be obtained by the hierarchical clustering algorithm based on Euclidean distance. The clustering results generated by the remaining algorithms accord with the structure of Group I. Among the clustering algorithms that produce the correct clustering result, the time cost of the BRDTW-based single-linkage clustering algorithm is about a third of that of the other algorithms. Considering both feasibility and effectiveness, the BRDTW-based single-linkage clustering algorithm is superior to the other clustering algorithms.

5.2.2 Experiment II

The data set in this experiment is composed of interval-valued time series of unequal length. We randomly select six time series ($X_1, X_2, \dots, X_6$) from the ‘CBF’ dataset, which also comes from the UCR Time Series Classification Archive (Chen et al., 2015). Each time series has 128 data points, namely, $X_i = (v_{i1}, v_{i2}, \dots, v_{i128})$ for $i = 1, 2, \dots, 6$. Based on these six time series, we construct a group of interval-valued time series of unequal length named Group II by the following three steps.

(1) The time unit is added and $X_i$ is translated into $X_i = \{(1, v_{i1}), (2, v_{i2}), \dots, (128, v_{i128})\}$.

(2) The real number $v_{ij}$ is replaced by an interval number $[l_{ij}, u_{ij}]$ at each time point, where $l_{ij} = v_{ij} - 0.08i$ and $u_{ij} = v_{ij} + 0.08i$ for $i = 1, 2, \dots, 6$ and $j = 1, 2, \dots, 128$. $X_i$ is then translated into $[X]_i = \{(1, [l_{i1}, u_{i1}]), (2, [l_{i2}, u_{i2}]), \dots, (128, [l_{i128}, u_{i128}])\}$.

(3) Some elements are deleted so that the series in Group II are of unequal length. In detail, the last four time points and the corresponding interval numbers of $[X]_2$ are deleted; the last six time points and the corresponding interval numbers of $[X]_3$ are deleted; the first four time points and the corresponding interval numbers of $[X]_4$ are deleted; and the last five time points and the corresponding interval numbers of $[X]_6$ are deleted.

As a result, Group II, comprising the six interval-valued time series $[X]_1, [X]_2, \dots, [X]_6$ of unequal length, is shown in Fig. 11 and exhibits three clusters: Cylinder $\{[X]_1, [X]_2\}$, Bell $\{[X]_3, [X]_4\}$ and Funnel $\{[X]_5, [X]_6\}$.

Since the interval-valued time series in Group II are of unequal length, the hierarchical clustering algorithm with Euclidean distance cannot be applied. Hence, we compare (1) the BRDTW-based single-linkage clustering algorithm with the hierarchical clustering algorithms using (2) Hausdorff distance-based DTW, (3) L1 distance-based DTW, (4) L2 distance-based DTW, (5) Wasserstein distance-based DTW, (6) Ichino-Yaguchi distance-based DTW and (7) D’Urso-Giovanni distance-based DTW. The parameter λ is set to 0.25 in the l1 trend filtering used in the BRDTW algorithm, and the parameters $\omega_\alpha$ and $\omega_\beta$ are set to 0.7 and 0.3, respectively, in the D’Urso-Giovanni distance. The clustering results of Group II under these seven clustering algorithms are shown in Fig. 12 in the form of dendrograms.

Figure 11: Group II produced from the ‘CBF’ dataset, exhibiting three clusters: Cylinder $\{[X]_1, [X]_2\}$, Bell $\{[X]_3, [X]_4\}$ and Funnel $\{[X]_5, [X]_6\}$.
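The three construction steps for Group II described in this subsection can be sketched as below. The half-width 0.08i and the trimming rules follow the text; the function name `build_group_two`, the `trims` table and the constant dummy series are assumptions used purely for illustration.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]
IntervalSeries = List[Tuple[int, Interval]]

# Series index -> (points dropped at the start, points dropped at the end).
trims: Dict[int, Tuple[int, int]] = {2: (0, 4), 3: (0, 6), 4: (4, 0), 6: (0, 5)}


def build_group_two(series_list: List[List[float]]) -> List[IntervalSeries]:
    group = []
    for i, values in enumerate(series_list, start=1):
        half_width = 0.08 * i  # interval width grows with the series index
        full = [(t, (v - half_width, v + half_width))
                for t, v in enumerate(values, start=1)]
        head, tail = trims.get(i, (0, 0))
        group.append(full[head:len(full) - tail])
    return group


# Dummy stand-in for the six 128-point CBF series (all zeros).
dummy = [[0.0] * 128 for _ in range(6)]
group_two = build_group_two(dummy)
lengths = [len(s) for s in group_two]
```

With 128-point inputs, the resulting lengths are 128, 124, 122, 124, 128 and 123, and the fourth series starts at time index 5 because its first four points are removed.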

Figure 12: Clustering results of Group II under seven clustering algorithms, where the horizontal axis and vertical axis represent the interval-valued time series label and the distance, respectively.

In this experiment, only the BRDTW-based single-linkage clustering algorithm obtains the correct clustering result, while the clustering algorithms with Hausdorff distance-based DTW, L1 distance-based DTW, L2 distance-based DTW and Wasserstein distance-based DTW assign wrong labels to the middle two interval-valued time series (see Fig. 12). In light of this experiment, we conclude that the BRDTW-based single-linkage clustering algorithm is capable of capturing the similarity in shape of interval-valued time series of unequal length and of providing the correct clustering result.
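The single-linkage step used throughout Section 5.2 operates on a precomputed distance matrix, whatever distance produced it. A minimal sketch of agglomerative single-linkage clustering is given below; it is a generic textbook version rather than the authors' implementation, and it is run on the BRDTW distances among [X], [Y] and [Z] reported in Section 5.1.2 only as a small worked example. The function name `single_linkage` and its return format are choices made here.

```python
from typing import List, Tuple

Merge = Tuple[List[str], List[str], float]  # (cluster A, cluster B, merge height)


def single_linkage(dist: List[List[float]], labels: List[str]) -> List[Merge]:
    """Repeatedly merge the two clusters whose closest members are nearest."""
    clusters = [[i] for i in range(len(labels))]
    merges: List[Merge] = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is the minimum pairwise distance.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append(([labels[i] for i in clusters[a]],
                       [labels[j] for j in clusters[b]], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


# BRDTW distances among [X], [Y], [Z] transcribed from Section 5.1.2.
brdtw_matrix = [
    [0.0, 179.5, 287.6],
    [179.5, 0.0, 244.9],
    [287.6, 244.9, 0.0],
]
merges = single_linkage(brdtw_matrix, ["[X]", "[Y]", "[Z]"])
```

On this matrix, [X] and [Y] merge first at height 179.5, and [Z] joins them at height 244.9, the smaller of its two distances to the merged cluster. The recorded heights correspond to what a dendrogram would draw on its vertical axis.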

6 Conclusion

This paper studied the clustering of interval-valued time series of unequal length. First, we proposed a novel representation of interval-valued time series in the form of a sequence of 3-tuples. Based on the new representation and dynamic time warping, the BRDTW algorithm was developed to measure the distance between interval-valued time series. In addition, we designed the BRDTW-based single-linkage clustering algorithm to cluster interval-valued time series of unequal length. Experiments carried out on the high and low price data of the Shanghai Stock Exchange and on the UCR time series classification archive illustrated the good performance of the BRDTW algorithm, and showed that the BRDTW-based single-linkage clustering algorithm applied to a group of interval-valued time series of unequal length exhibits good effectiveness and efficiency.

In the future, we will use this clustering method for other mining tasks on interval-valued time series, such as outlier detection and forecasting.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (Nos. 11701338, 11571001), a Project of Shandong Province Higher Educational Science and Technology Program (No. J17KB124) and the Natural Science Foundation of Shandong Province (No. ZR2016AP12).


References

Arroyo, J., & Maté, C. (2006). Introducing interval time series: accuracy measures. In: Proceedings in computational statistics, Heidelberg: Physica-Verlag, 1139–1146.

Arroyo, J., Espínola, R., & Maté, C. (2011). Different approaches to forecast interval time series: a comparison in finance. Computational Economics, 37(2), 169–191.

Berndt, D., & Clifford, J. (1994). Using dynamic time warping to find patterns in time series. In: KDD Workshop, Seattle, 10, 359–370.

Carvalho, F., Brito, P., & Bock, H. (2006). Dynamic clustering for interval data based on L2 distance. Computational Statistics, 21(2), 231–250.

Carvalho, F., & Lechevallier, Y. (2009). Dynamic clustering of interval-valued data based on adaptive quadratic distances. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 39(6), 1295–1306.

Carvalho, F., & Tenório, C. (2010). Fuzzy K-means clustering algorithms for interval-valued data based on adaptive quadratic distances. Fuzzy Sets and Systems, 161(23), 2978–2999.

Carvalho, F., & Simões, E. (2017). Fuzzy clustering of interval-valued data with City-Block and Hausdorff distances. Neurocomputing, 266, 659–673.

Chen, Y., Keogh, E., Hu, B., Begum, N., Bagnall, A., Mueen, A., & Batista, G. (2015). The UCR time series classification archive, URL=http://www.cs.ucr.edu/~eamonn/time_series_data. Accessed 5 Nov 2018.

Coppi, R., & D’Urso, P. (2002). Fuzzy K-means clustering models for triangular fuzzy time trajectories. Statistical Methods and Applications, 11(1), 21–40.

Coppi, R., & D’Urso, P. (2003). Three-way fuzzy clustering models for LR fuzzy time trajectories. Computational Statistics & Data Analysis, 43(2), 149–177.

Coppi, R., D’Urso, P., & Giordani, P. (2012). Fuzzy and possibilistic clustering for fuzzy data. Computational Statistics & Data Analysis, 56(4), 915–927.

D’Urso, P., & Giordani, P. (2006). A robust fuzzy k-means clustering model for interval valued data. Computational Statistics, 21(2), 251–269.

D’Urso, P., & Giovanni, L.D. (2006). A weighted fuzzy c-means clustering model for fuzzy data. Computational Statistics & Data Analysis, 50(6), 1496–1523.

D’Urso, P., & Giovanni, L.D. (2014). Robust clustering of imprecise data. Chemometrics and Intelligent Laboratory Systems, 136(7), 58–80.

D’Urso, P., & Leski, J.M. (2016). Fuzzy c-ordered medoids clustering for interval-valued data. Pattern Recognition, 58, 49–67.

D’Urso, P., Massari, R., Giovanni, L.D., & Cappelli, C. (2017). Exponential distance-based fuzzy clustering for interval-valued data. Fuzzy Optimization and Decision Making, 16(1), 51–70.

Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32(2), 407–499.

García-Ascanio, C., & Maté, C. (2010). Electric power demand forecasting using interval time series: a comparison between VAR and iMLP. Energy Policy, 38(2), 715–725.

González, L., Velasco, F., Angulo, C., Ortega, J., & Ruiz, F. (2007). Sobre núcleos, distancias y similitudes entre intervalos. Inteligencia Artificial, 8(23), 111–117.

http://www.aigaogao.com/tools/history.html?s= Accessed 13 Nov 2018.

Ichino, M., & Yaguchi, H. (1994). Generalized Minkowski metrics for mixed feature type data analysis. IEEE Transactions on Systems, Man, and Cybernetics, 24(4), 698–708.

Irpino, A., & Tontodonato, V. (2006). Clustering reduced interval data using Hausdorff distance. Computational Statistics, 21(2), 271–288.

Keogh, E., Lin, J., & Truppel, W. (2003). Clustering of time series subsequences is meaningless: implications for previous and future research. In: Proceedings of the Third IEEE International Conference on Data Mining, 115–122.

Kim, S., Koh, K., Boyd, S., & Gorinevsky, D. (2009). l1 trend filtering. SIAM Review, 51(2), 339–360.

Li, H. (2015). Piecewise aggregate representations and lower-bound distance functions for multivariate time series. Physica A: Statistical Mechanics and its Applications, 427, 10–25.

Lin, W., & González-Rivera, G. (2016). Interval-valued time series models: estimation based on order statistics exploring the agriculture marketing service data. Computational Statistics & Data Analysis, 100, 694–711.

Muñoz, A., Maté, C., Arroyo, J., & Sarabia, Á. (2007). iMLP: applying multi-layer perceptrons to interval-valued data. Neural Processing Letters, 25(2), 157–169.

Rosset, S., & Zhu, J. (2007). Piecewise linear regularized solution paths. Annals of Statistics, 35(3), 1012–1030.

Souza, R., & Carvalho, F. (2004). Clustering of interval data based on city-block distances. Pattern Recognition Letters, 25(3), 353–365.

Sun, T., Sun, H., & Chen, W. (2012). Dimensionality reduction for interval time series. In: Proceedings of World Congress on Information and Communication Technologies, 1115–1120.

Verde, R., & Irpino, A. (2008). A new interval data distance based on the Wasserstein metric. In: Data Analysis, Machine Learning and Applications, Springer, Berlin, Heidelberg, 705–712.

Wang, X., Zhang, Z., & Li, S. (2016). Set-valued and interval-valued stationary time series. Journal of Multivariate Analysis, 145, 208–223.

Wang, X., Yu, F., & Pedrycz, W. (2016). An area-based shape distance measure of time series. Applied Soft Computing, 48, 650–659.

Wang, X., Yu, F., Pedrycz, W., & Wang, J. (2018). Hierarchical clustering of unequal-length time series with area-based shape distance. Soft Computing, doi:10.1007/s00500-018-3287-6.

Xiong, T., Bao, Y., & Hu, Z. (2014). Multiple-output support vector regression with a firefly algorithm for interval-valued stock price index forecasting. Knowledge-Based Systems, 55, 87–100.