Mining information from time series in the form of sentences of natural language


International Journal of Approximate Reasoning 78 (2016) 192–209

Contents lists available at ScienceDirect

International Journal of Approximate Reasoning www.elsevier.com/locate/ijar

Vilém Novák

University of Ostrava, Institute for Research and Applications of Fuzzy Modeling, NSC IT4Innovations, 30. dubna 22, 701 03 Ostrava 1, Czech Republic

Article info

Article history:
Received 30 September 2015
Received in revised form 5 July 2016
Accepted 12 July 2016
Available online 18 July 2016

Keywords: Fuzzy natural logic; Evaluative linguistic expressions; Intermediate quantifiers; Fuzzy transform; Fuzzy type theory; Tectogrammatical tree

Abstract

The goal of this paper is to explain in more detail how special formulas that characterize properties of the trend of a time series can be formed and how they are interpreted. We then show how these formulas can be used in a tectogrammatical tree that construes special sentences of natural language conveying information on the behavior of the time series. We also outline the principles of mining this information. The last part is devoted to the application of the theory of intermediate quantifiers to mining summarized information on time series, again in sentences of natural language.

© 2016 Elsevier Inc. All rights reserved.

E-mail address: [email protected]. http://dx.doi.org/10.1016/j.ijar.2016.07.006 0888-613X/© 2016 Elsevier Inc. All rights reserved.

1. Introduction

Mining information from time series is an interesting problem that has been studied extensively. A nice overview of various tasks raised in connection with time series, and of some of the possible methods for solving them, is presented by Fu in [7], where one can also find many references to special methods.

One of the tasks is the generation of automatic comments in natural language. This problem is, in fact, wider because time series are only one possible sort of data. It was initiated almost 60 years ago by Luhn [17] (see also [23] and the citations therein) and since then many authors have contributed to it. Summarization of knowledge about time series is a narrower problem. In the literature, one can find methods for finding interesting patterns in time series (cf. [41]). In [15], Kobayashi and Okumura describe a method for generating comments in natural language that can serve as a recommendation for a stock broker about what happened with stock price trends during the day. Kacprzyk and Wilbik in [11] present a method for finding sequences of monotonous behavior and, on the basis of them, characterize the similarity of time series.

A special task is the summarization of knowledge about time series using quantifiers of natural language. This task was solved using techniques of fuzzy set theory, especially by Kacprzyk, Wilbik, Zadrożny [12,13] and Castillo-Ortega, Marín, Sánchez (see, e.g., [2,3]). The authors suggested various heuristic methods for mining information on the basis of which proper natural language expressions can be generated.

In this paper, we suggest alternative methods for this task. First, we apply techniques based on the theory of the fuzzy transform, which makes it possible to analyze time series. Then, we apply the theory of fuzzy natural logic (FNL), which provides a formal model of the semantics of special expressions of natural language as well as schemes for reasoning with them. We first form special formulas of FNL and then obtain expressions of natural language by interpreting them.

Ramos-Soto et al. in [39] analyze two ways in which linguistic descriptions of data can be produced: using a standard natural language generation (NLG) approach (cf. [40]) or using template-based NLG. Both approaches have their pros and cons. The former is often taken as superior to the latter because of its ability to generate richer structures that are closer to the way people speak. However, as pointed out by van Deemter et al. in [43], this distinction is becoming increasingly blurred because both kinds of systems are developing, and even the NLG systems have to be simplified because of the extreme complexity of natural language. Therefore, they are not that far from templates.

In this paper, we suggest methods for the generation of natural language expressions characterizing time series. Our solution is essentially template-based. Its merit is that we start from the formal model of semantics developed within FNL. We address three tasks: (a) characterization of trend in natural language, (b) finding intervals of definite character of trend, and (c) summarization of the characteristics of time series using intermediate quantifiers. Special attention is paid to task (c), because tasks (a) and (b) were elaborated in detail in the previous papers [30,32,33].

The structure of the paper is as follows. In the next section we briefly introduce the main mathematical concepts, namely the fuzzy type theory (FTT), two theories belonging to FNL, and the fuzzy transform, and we also precisely specify the concept of time series. The main contribution of this paper is in Section 3. We first introduce special formulas of FTT representing properties of time series, overview some already published methods and focus on mining linguistic information on the course of time series.
The subsequent subsection is focused on mining summarized information using the concept of intermediate quantifier. In the conclusion, we summarize the results of this paper and outline future research that, among other things, will focus on utilizing the results in the theory of generalized syllogistic reasoning.

2. Preliminaries

The theoretical frame for the methods developed in this paper is formed by fuzzy natural logic (FNL) and the fuzzy (F-)transform. Recall that the former is a formal logical theory that consists of (a) a formal theory of evaluative linguistic expressions (see [27]), (b) a formal theory of fuzzy IF–THEN rules and approximate reasoning (see [26,31]) and (c) a formal theory of intermediate and generalized fuzzy quantifiers (see [6,20,22,28]). FNL is a mathematical theory developed using the formalism of the fuzzy type theory (FTT) (see [25,29]). Its formal language is an extension of the lambda calculus.

By a fuzzy set we understand in this paper a function $A : U \to E$ where $E$ is a set of truth values. By $\mathcal{F}(U)$ we denote the (crisp) set of all fuzzy sets on the universe $U$. Hence, the formula $A \in \mathcal{F}(U)$ is either true or false. If $A$ is a fuzzy set on $U$ then we often write $A \underset{\sim}{\subset} U$. The symbol $A(u)/u$ denotes a fuzzy singleton where $u \in U$ is an element and $A(u)$ is its membership degree.

2.1. Fuzzy type theory

The chosen version of FTT is the Łukasiewicz one (Ł-FTT). Some of its main concepts are briefly overviewed below. Truth values are supposed to form the standard Łukasiewicz MV$_\Delta$-algebra

$$\mathcal{L} = \langle [0, 1], \vee, \wedge, \otimes, \to, 0, 1, \Delta \rangle \tag{1}$$

where $\Delta$ is a unary operation such that $\Delta(a) = 1$ if $a = 1$ and $\Delta(a) = 0$ otherwise.

Each formula in FTT is assigned a type. This can be understood as an index encoding the kind of objects that are represented by the given formula. The primitive types are $o$, representing truth values, and $\epsilon$, representing primary objects. We will also consider the type $\tau$, which will represent both real numbers and time moments. More complex types are formed by concatenation of simpler ones.

The language $J$ of Ł-FTT consists of variables $x_\alpha, \ldots$, special constants $c_\alpha, \ldots$ ($\alpha \in \mathit{Types}$), the symbol $\lambda$, and brackets. The connectives (which are special constants) are fuzzy equality/equivalence $\equiv$, conjunction $\wedge$, implication $\Rightarrow$, negation $\neg$, Łukasiewicz conjunction $\&$, disjunction $\vee$, and delta $\Delta$. As usual, we will write fuzzy equality between formulas of type $\alpha$ as $(A_\alpha \equiv B_\alpha)$ instead of $(\equiv B_\alpha) A_\alpha$. Note that this is a formula of type $o$ (truth value).

If $A$ is a formula or a variable of type $\alpha$, then we write $A_\alpha$.¹ Hence, if $\alpha \neq \beta$ then $A_\alpha$ and $A_\beta$ are different formulas. Sometimes we will also write $A \in \mathit{Form}_\alpha$ to stress that $A$ is a formula of type $\alpha$. If the type of a formula is clear from the context, we will omit it to simplify reading of the formulas. Recall that $\lambda x_\alpha A_\beta$ is a formula of type $\beta\alpha$. If $A_{\beta\alpha}$ is a formula of type $\beta\alpha$ and $B_\alpha$ is a formula then $A_{\beta\alpha} B_\alpha$ is a formula of type $\beta$. We will omit brackets wherever possible and use them only to clarify reading of more complex formulas.

Semantics of FTT is defined on the basis of a general frame

$$\mathcal{M} = \langle (M_\alpha, =_\alpha)_{\alpha \in \mathit{Types}}, \mathcal{E} \rangle \tag{2}$$

¹ In FTT we do not distinguish between terms and formulas and call all of them just formulas. Various authors alternatively call them λ-terms.

where $\mathcal{E}$ is an algebra of truth values. Furthermore, formulas of type $o$ are assigned elements from the set $M_o = E$ and formulas of type $\epsilon$ are assigned elements from $M_\epsilon$. In general, a formula of type $\beta\alpha$ is assigned a function from objects of type $\alpha$ into objects of type $\beta$, i.e., an element from the set $M_{\beta\alpha} \subseteq M_\beta^{M_\alpha}$. Thus, a formula $A_{o\alpha}$ is interpreted as a fuzzy set $\mathcal{M}(A_{o\alpha}) \underset{\sim}{\subset} M_\alpha$ (i.e., it is a function $\mathcal{M}(A_{o\alpha}) : M_\alpha \to E$). Likewise, $A_{(o\alpha)\alpha}$ is interpreted as a fuzzy relation $\mathcal{M}(A_{(o\alpha)\alpha}) : M_\alpha \times M_\alpha \to E$. The binary relation $=_\alpha$ is a fuzzy equality on $M_\alpha$, i.e., $\mathcal{M}(=_\alpha) : M_\alpha \times M_\alpha \to E$.

Formulas of type $\tau$ are assigned elements from $M_\tau = \mathbb{R}$. We will also consider the classical inequalities $<$, $>$ between real numbers, i.e., their interpretations in a model are sharp relations that attain only the truth values 0 or 1.

The following is a special formula saying that the given formula $A_o$ represents a nonzero truth value:

$$\Upsilon_{oo} \equiv \lambda z_o \cdot \neg\Delta(\neg z_o).$$

It can be proved that $\mathcal{M}(\Upsilon(A_o)) = 1$ iff $\mathcal{M}(A_o) > 0$ holds in any model $\mathcal{M}$.

2.2. Evaluative linguistic expressions

The central role in FNL is played by the theory of evaluative linguistic expressions, for example small, very big, rather strong, more or less light, etc. Since we have already presented this theory in many papers, we will only very briefly recall its main concepts. In this paper, we will consider simple evaluative linguistic expressions that have the form

$$\langle\text{linguistic modifier}\rangle\langle\text{TE-adjective}\rangle \tag{3}$$

where ⟨TE-adjective⟩² is one of the adjectives “small, medium, big” (and possibly other specific adjectives, especially the so-called gradable or evaluative ones), or “zero”, or an arbitrary symmetric fuzzy number. The ⟨linguistic modifier⟩ is a special expression representing a linguistic phenomenon called hedging. The latter specifies more closely the topic of the utterance. In our case, the linguistic modifier makes the meaning of the TE-adjective either more or less specific. Quite often it is represented by an intensifying adverb such as “very, roughly, approximately, significantly”, etc.

The evaluative expressions (3) are called simple. We may also form compound ones using logical connectives (usually “and” and “or”). Note that the syntactic and semantic limitations of natural language prevent compound evaluative expressions from forming a Boolean algebra. If the linguistic hedge is missing (i.e., we deal with expressions such as “weak, large”, etc.) then we understand that an empty linguistic hedge is present. Thus, all the simple evaluative expressions have the same form (3). Since they characterize values on an ordered scale, we may assume that scales are divided into two parts that are usually interpreted as positive and negative. Hence, the evaluative expressions may also have a sign, namely “positive” or “negative”.

In this paper, we will use several chosen hedges that can be ordered according to their semantic effect from the most narrowing to the most widening:

$$\text{Ex (extremely)} \ll \text{Si (significantly)} \ll \text{Ve (very)} \ll \langle\text{empty hedge}\rangle \ll \text{ML (more or less)} \ll \text{Ro (roughly)} \ll \text{QR (quite roughly)} \ll \text{VR (very roughly)}. \tag{4}$$

Since there is also the natural ordering

$$\text{Ze (zero)} \ll \text{Sm (small)} \ll \text{Me (medium)} \ll \text{Bi (big)}$$

we obtain a linear lexicographic ordering $\ll$ of simple evaluative expressions (3) that can be interpreted as “to be more specific”.

We distinguish abstract evaluative expressions from more specific evaluative predications. The latter are expressions of natural language of the form ‘$X$ is $\mathcal{A}$’ where $\mathcal{A}$ is an evaluative expression and $X$ is a variable which stands for objects, for example “degrees of temperature, height, length, speed”, etc. Examples are “temperature is high”, “speed is extremely low”, “quality is very high”, etc. In general, the variable $X$ represents certain features of objects such as “size, volume, force, strength”, etc., and so its values are often real numbers.

An important notion is that of linguistic context.³ In our theory it is a triple of (real) numbers $w = \langle v_L, v_S, v_R \rangle$ where $v_L$ is the leftmost typically small value, $v_S$ is a typically medium value and $v_R$ is the rightmost typically big value. For example, when speaking about the height of people in Europe, we may set $v_L = 140$ cm, $v_S = 170$ cm and $v_R = 220$ cm. In the sequel, we will consider the set of all linguistic contexts

$$W = \{ w = \langle v_L, v_S, v_R \rangle \mid v_L, v_S, v_R \in \mathbb{R},\ v_L < v_S < v_R \}. \tag{5}$$

We say that an element $x$ belongs to a context $w \in W$ if $x \in [v_L, v_S] \cup [v_S, v_R]$. Then we write $x \in w$.⁴

² “TE” is short for “trichotomic evaluative”.
³ In philosophical logic, a more general concept of possible world is used instead. Because we deal with a limited repertoire of linguistic expressions, we prefer using the special concept of linguistic context.
⁴ Of course, $[v_L, v_S] \cup [v_S, v_R] = [v_L, v_R]$. We write it as a union of two overlapping intervals to emphasize the role of the middle point $v_S$.
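The notion of a linguistic context in (5) can be illustrated by a minimal Python sketch. The class name `LinguisticContext` and its field names are hypothetical and serve only as an illustration; the numeric values are the height example from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinguisticContext:
    """A linguistic context w = <v_L, v_S, v_R> as in (5) (illustrative names)."""
    v_left: float    # leftmost typically small value v_L
    v_middle: float  # typically medium value v_S
    v_right: float   # rightmost typically big value v_R

    def __post_init__(self):
        # Definition (5) requires v_L < v_S < v_R.
        if not (self.v_left < self.v_middle < self.v_right):
            raise ValueError("a context requires v_L < v_S < v_R")

    def __contains__(self, x):
        # x belongs to w iff x lies in [v_L, v_S] ∪ [v_S, v_R] = [v_L, v_R].
        return self.v_left <= x <= self.v_right


# Height of people in Europe (in cm), as in the example above.
height = LinguisticContext(140.0, 170.0, 220.0)
print(180.0 in height)  # True
print(230.0 in height)  # False
```

The ordering constraint is checked on construction, so an ill-formed triple cannot be used as a context by accident.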

When dealing with time series we need to distinguish the positive (part of the) context $w^+ = \langle v_L^+, v_S^+, v_R^+ \rangle$ and the negative one $w^- = \langle v_R^-, v_S^-, v_L^- \rangle$. We can switch between both kinds of context by simple multiplication by $-1$. Hence, $w^- = \langle -1 \cdot v_R^+, -1 \cdot v_S^+, -1 \cdot v_L^+ \rangle$. Therefore, in the formulas below we will consider one context (5) only. Note that it is usual to put $v_L^- = v_L^+ = 0$.

The meaning of an evaluative linguistic expression and predication is represented by its intension

$$\mathrm{Int}(X \text{ is } \mathcal{A}) : W \to \mathcal{F}(\mathbb{R}) \tag{6}$$

where $\mathcal{F}(\mathbb{R})$ is the set of all fuzzy sets on $\mathbb{R}$ (the set of real numbers). The set of all the considered evaluative expressions is denoted by $\mathit{EvExpr}$. In the sequel, the symbol $Ev \in \mathit{EvExpr}$ denotes an arbitrary evaluative expression. For each context $w \in W$, the extension $\mathrm{Ext}_w(X \text{ is } \mathcal{A}) \underset{\sim}{\subset} \mathbb{R}$ is a specific fuzzy set on $\mathbb{R}$, i.e., $\mathrm{Ext}_w(X \text{ is } \mathcal{A}) = \mathrm{Int}(X \text{ is } \mathcal{A})(w)$. Extensions of a few selected evaluative expressions are depicted in Fig. 2 below.

Definition 1. Let $W$ be a set of contexts and consider an $x \in \mathbb{R}$. Let $K \subseteq \mathit{EvExpr}$ be a finite set of evaluative predications such that $(K, \ll)$ is linearly ordered. Put

$$P = \{ \text{“}X \text{ is } \mathcal{B}\text{”} \in K \mid \mathrm{Ext}_w(X \text{ is } \mathcal{B})(x) > 0 \}. \tag{7}$$

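Furthermore, let $P' \subseteq P$ be a set of evaluative predications such that

$$\text{“}X \text{ is } \mathcal{B}\text{”} \in P' \quad\text{iff}\quad \mathrm{Ext}_w(X \text{ is } \mathcal{B})(x) \text{ is maximal},$$

and

$$P'' = \{ \text{“}X \text{ is } \mathcal{B}\text{”} \in K \mid \mathrm{Ext}_w(X \text{ is } \mathcal{B})(x) \geq a_0 \} \subseteq P, \tag{8}$$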

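where $a_0 > 0$ is some threshold.⁵ The function of local perception is a partial function $\mathrm{LPerc}^K : \mathbb{R} \times W \to K$ which to any value $x \in \mathbb{R}$ and any context $w \in W$ assigns an evaluative predication as follows:

$$\mathrm{LPerc}^K(x, w) = \text{“}X \text{ is } \mathcal{A}\text{”} = \begin{cases} \min P'', & \text{if } P'' \neq \emptyset, \\ \min P', & \text{if } P'' = \emptyset \text{ and } P \neq \emptyset, \\ \text{undefined}, & \text{otherwise} \end{cases} \tag{9}$$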

where the minimum is taken with respect to the ordering $\ll$. The predication “$X$ is $\mathcal{A}$” $\in P$ is the most specific (i.e., the smallest) with respect to the ordering $\ll$ either in the set $P''$, if it is non-empty, or in $P'$.⁶

We see that using the function of local perception (9), each value $x \in \mathbb{R}$ is characterized by some evaluative expression. This will be called a perception of $x$ in the context $w$ and it can be understood as a certain kind of “measurement” done by people in a concrete situation.

2.3. Intermediate quantifiers

Intermediate quantifiers are expressions of natural language such as most, many, almost all, a few, a large part of, etc. A detailed (informal) analysis of their meaning was presented by Peterson in [37]. A mathematical model of their meaning was developed using means of higher-order fuzzy logic in a series of papers [20–22,28]. Let us remark that these quantifiers belong among fuzzy generalized quantifiers (cf. [5,6,28]).

Intermediate quantifiers of type $\langle 1, 1 \rangle$ have the general form “$Q$ $B$ are $A$” and are expressed by means of the following formulas:

$$(Q^{\forall}_{Ev}\, x)(B, A) := (\exists z)((\Delta(z \subseteq B) \,\&\, (\forall x)(z x \Rightarrow A x)) \wedge Ev((\mu B) z)), \tag{10}$$

$$(Q^{\exists}_{Ev}\, x)(B, A) := (\exists z)((\Delta(z \subseteq B) \,\&\, (\exists x)(z x \wedge A x)) \wedge Ev((\mu B) z)) \tag{11}$$

where $z, B, A \in \mathit{Form}_{o\alpha}$ are formulas of type “fuzzy set”⁷ and $x \in \mathit{Form}_\alpha$ is a variable of type $\alpha$. Interpretation of (10) and (11) in a model is simple (informally): we take the largest fuzzy set $z \subseteq B$ such that each element $x$ having the property represented by $z$ (and so, having also the property $B$) has also the property $A$ and, at the same time, the size of $z$ w.r.t. $B$ is evaluated by the evaluative expression $Ev$ (for example, it can be very big, not small, etc.).

⁵ We usually put $a_0 = 0.9$ or $a_0 = 1$.
⁶ If $P = \emptyset$ then also $P' = \emptyset$.
⁷ Their interpretation is a function from a set $M_\alpha$ of objects of type $\alpha$ to a set $M_o$ of truth values.

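The size of the fuzzy set $z \subseteq B$ w.r.t. $B$ is mathematically characterized by a measure $(\mu B) z$. An example of a possible measure for a fuzzy set $B$ with finite support is

$$(\mu B) z = \begin{cases} 1 & \text{if } z = B, \\ \min\left\{1, \dfrac{|z|}{|B|}\right\} & \text{if } z \neq B \text{ and } |B| \neq 0, \\ 0 & \text{otherwise} \end{cases} \tag{12}$$

where $|B| = \sum_{x \in \mathrm{Supp}(B)} B x$. The evaluation is realized w.r.t. the standard context $\bar{w} = \langle 0, 0.5, 1 \rangle$.

We can also introduce type $\langle 1 \rangle$ intermediate quantifiers:

$$(Q^{\forall}_{Ev}\, x) A x := (\forall x)(x \in \mathrm{Supp}(A) \Rightarrow A x) \wedge Ev((\mu V) A), \tag{13}$$

$$(Q^{\exists}_{Ev}\, x) A x := (\exists x)(A x \,\&\, x \in \mathrm{Supp}(A)) \wedge Ev((\mu V) A) \tag{14}$$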

where $V$ is a universe and $\mathrm{Supp}(A)$ is the support of the fuzzy set $A$. Below are a few concrete intermediate quantifiers studied in the mentioned papers:

A: All B are A := $Q^{\forall}_{Bi\,\Delta}(B, A) \equiv (\forall x)(Bx \Rightarrow Ax)$
P: Almost all B are A := $Q^{\forall}_{Bi\,Ex}(B, A)$ (extremely big part of B has A)
T: Most B are A := $Q^{\forall}_{Bi\,Ve}(B, A)$ (very big part of B has A)
K: Many B are A := $Q^{\forall}_{\neg(Sm\,\bar{\nu})}(B, A)$ (not small part of B has A)
B: Few B are A := $Q^{\forall}_{Sm\,Ve}(B, A)$ (very small part of B has A)
I: Some B are A := $Q^{\exists}_{Bi\,\Delta}(B, A) \equiv (\exists x)(Bx \wedge Ax)$

where $\Delta$ is a unary connective interpreted by the function $\Delta$ from (1). Note that A and I collapse to the classical quantifiers. The quantifiers are linearly ordered by implication. The higher quantifiers are called superalterns and the lower ones subalterns, which means that in any model, the truth value of a superaltern is smaller than or equal to that of its subaltern. For example, A ⇒ P, i.e., A is a superaltern of P and the latter is a subaltern of the former.

2.4. Fuzzy transform

The fuzzy transform (or simply F-transform) is a universal technique introduced by Perfilieva in [35] and further elaborated in [36] and several other papers. Its fundamental idea is to map a bounded continuous function $f : [a, b] \to \mathbb{R}$ to a finite vector of numbers and then to transform it back. The former is called a direct F-transform and the latter an inverse one. The result of the inverse F-transform is a function $\hat{f}$ that approximates the original function $f$. Parameters of the F-transform can be set in such a way that the approximating function $\hat{f}$ has desired properties. In the case of time series, it was proved in [34] that the F-transform has the ability to filter out high frequencies and reduce noise. At the same time, it is possible to estimate average values of the first and second derivatives of $f$ over a specified area [16].

The first step of the F-transform procedure is to form a fuzzy partition of the domain $[a, b]$. It consists of a finite set of fuzzy sets

$$\mathcal{P} = \{ A_0, \ldots, A_n \}, \qquad n \geq 2, \tag{15}$$

defined over nodes $a = c_0, \ldots, c_n = b$. Properties of the fuzzy sets from $\mathcal{P}$ are specified by five axioms, namely normality, locality, continuity, unimodality, and orthogonality; the last one is formally defined by $\sum_{i=0}^{n} A_i(x) = 1$, $x \in [a, b]$. A fuzzy partition $\mathcal{P}$ is called $h$-uniform if the nodes $c_0, \ldots, c_n$ are $h$-equidistant, i.e., for all $k = 0, \ldots, n - 1$, $c_{k+1} = c_k + h$, where $h = (b - a)/n$. The membership functions $A_0, \ldots, A_n$ of the fuzzy sets forming the fuzzy partition $\mathcal{P}$ are usually called basic functions.

A direct F-transform of a continuous function $f$ is a vector $\mathbf{F}[f] = (F_0[f], \ldots, F_n[f])$, where each $k$-th component $F_k[f]$ is equal to

$$F_k[f] = \frac{\int_a^b f(x) A_k(x)\, dx}{\int_a^b A_k(x)\, dx}, \qquad k = 0, \ldots, n.$$

The inverse F-transform of $f$ with respect to $\mathbf{F}[f]$ is a continuous function $\hat{f} : [a, b] \to \mathbb{R}$ such that

$$\hat{f}(x) = \sum_{k=0}^{n} F_k[f] \cdot A_k(x), \qquad x \in [a, b]. \tag{16}$$

All the details and full proofs of many theorems characterizing properties of the F-transform can be found in [35,36].
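The direct and inverse F-transform can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the basic functions are taken to be triangular on an $h$-uniform partition (one common choice that satisfies the axioms above), the integrals are approximated by the trapezoidal rule, and all function names are hypothetical.

```python
def integrate(g, a, b, steps=2000):
    """Composite trapezoidal rule for the integral of g over [a, b]."""
    dx = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for i in range(1, steps):
        total += g(a + i * dx)
    return total * dx


def uniform_triangular_partition(a, b, n):
    """Nodes c_0..c_n and triangular basic functions A_0..A_n on [a, b]."""
    h = (b - a) / n
    nodes = [a + k * h for k in range(n + 1)]
    # A_k(x) = max(0, 1 - |x - c_k| / h); these sum to 1 on [a, b].
    basis = [lambda x, c=c: max(0.0, 1.0 - abs(x - c) / h) for c in nodes]
    return nodes, basis


def direct_ft(f, a, b, basis):
    """Components F_k[f]: weighted averages of f with weights A_k."""
    return [
        integrate(lambda x, A=A: f(x) * A(x), a, b) / integrate(A, a, b)
        for A in basis
    ]


def inverse_ft(components, basis):
    """The approximating function f_hat(x) = sum_k F_k[f] * A_k(x), cf. (16)."""
    return lambda x: sum(F * A(x) for F, A in zip(components, basis))


# Usage: transform f(x) = x on [0, 1] with n = 10 and transform it back.
nodes, basis = uniform_triangular_partition(0.0, 1.0, 10)
components = direct_ft(lambda x: x, 0.0, 1.0, basis)
f_hat = inverse_ft(components, basis)
```

For a time series given only at discrete time moments, the integrals would be replaced by sums over the observed points; the sketch keeps the continuous formulation of the text.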

Fig. 1. $F^1$-transform of the function $f(x) = 2x + \sin 10x - \sin 30x$.

The F-transform introduced above is the $F^0$-transform (i.e., the zero-degree F-transform). Its components are real numbers. If we replace them by polynomials of arbitrary degree $m \geq 0$, we arrive at the higher-degree $F^m$-transform. The following theorem is important for our methods described below.

Theorem 1. If $f$ is four-times continuously differentiable on $[a, b]$ then for each $k = 1, \ldots, n - 1$,

$$\beta_k^0[f] = f(c_k) + O(h^2), \tag{17}$$

$$\beta_k^1[f] = f'(c_k) + O(h^2), \tag{18}$$

where

$$\beta_k^0[f] = \frac{\int_{c_{k-1}}^{c_{k+1}} f(x) A_k(x)\, dx}{\int_{c_{k-1}}^{c_{k+1}} A_k(x)\, dx}, \tag{19}$$

$$\beta_k^1[f] = \frac{\int_{c_{k-1}}^{c_{k+1}} f(x)(x - c_k) A_k(x)\, dx}{\int_{c_{k-1}}^{c_{k+1}} (x - c_k)^2 A_k(x)\, dx}. \tag{20}$$
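Coefficient (20) is straightforward to evaluate numerically. The following Python sketch does so for the test function (21) introduced below, with node distance 1.356 as in the text. It assumes triangular basic functions, which the text does not specify, and uses a plain trapezoidal rule; the name `f1_slope` is hypothetical.

```python
import math


def f1_slope(f, c, h, steps=4000):
    """Approximate coefficient (20) at node c for a triangular basic
    function supported on [c - h, c + h] (an assumption of this sketch)."""
    a = c - h
    dx = 2.0 * h / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps + 1):
        x = a + i * dx
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoidal end weights
        basic = max(0.0, 1.0 - abs(x - c) / h)     # A_k(x)
        numerator += weight * f(x) * (x - c) * basic
        denominator += weight * (x - c) ** 2 * basic
    # The common factor dx cancels in the ratio.
    return numerator / denominator


def f(x):
    """Test function (21): linear trend 2x plus two periodic components."""
    return 2.0 * x + math.sin(10.0 * x) - math.sin(30.0 * x)


h = 1.356
slopes = [f1_slope(f, k * h, h) for k in range(1, 7)]
print([round(s, 3) for s in slopes])
```

Under these assumptions the six values stay close to the slope 2 of the linear part, even though the function itself oscillates strongly.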

Thus, the $F^1$-transform components provide a weighted average of the values of the function $f$ in the area around the node $c_k$ (17), and also a weighted average of its slopes (18) in the same area.

To demonstrate that the F-transform is indeed able to extract the slope of a complex and volatile function, let us consider the following function:

$$f(x) = 2x + \sin 10x - \sin 30x. \tag{21}$$

This is a simple linear function with two added periodic components. The function $Tr(x) = 2x$ represents the trend of (21). On the basis of Theorem 1, we apply the $F^1$-transform to $f$ in (21) using a fuzzy partition with the distance between nodes equal to 1.356 (= 2 · 0.628). The result is depicted in Fig. 1. One can see that the “trend” $Tr(x)$ is well approximated. Moreover, the values of the coefficients (20) in nodes 1.356, 2.712, 4.068, 5.424, 6.780, 8.136 are 1.979, 2.015, 2.036, 2.025, 1.990, 1.965, respectively. One can see that they are very close to the expected value 2 of the slope (do not forget that these values were obtained from function (21), which is quite volatile).

Let us remark that in [11,12], another method for characterization of the direction of time series is suggested. Unlike the $F^1$-transform, however, it is much more complicated and, moreover, it is purely heuristic, without a proof that it is indeed able to find the slope of a more complicated function.

2.5. Time series

A time series is a stochastic process (see [1,10]) $X : \mathbb{T} \times \Omega \to \mathbb{R}$ where $\Omega$ is a set of elementary random events and $\mathbb{T} = \{0, \ldots, p\} \subset \mathbb{N}$ is a finite set whose elements are interpreted as time moments. Our basic assumption is that the time series can be decomposed into four components:

$$X(t, \omega) = Tr(t) + C(t) + S(t) + R(t, \omega), \qquad t \in \mathbb{T},\ \omega \in \Omega, \tag{22}$$

where $\Omega$ is a set of elementary random events, $Tr(t)$ is the trend, $C(t)$ is the cycle and $S(t)$ is the seasonal component. Both $C(t)$ and $S(t)$ are mixtures of periodic functions. $R(t, \omega)$ is random noise with zero mean and finite variance.

Because it is difficult to distinguish $Tr$ and $C$, they are often joined into one component, the trend-cycle $TC(t) = Tr(t) + C(t)$. It was proved in [34] that the F-transform enables us to extract the trend-cycle $TC$ with an error close to zero. Likewise, we can use it also for extraction of the trend $Tr$.

We will use below the following notation: Let $\bar{\mathbb{T}} \subseteq \mathbb{T}$ be a subinterval of $\mathbb{T}$. Then $X|\bar{\mathbb{T}}$ denotes the time series $X$ in (22) restricted to $\bar{\mathbb{T}}$, i.e., the domain of $X|\bar{\mathbb{T}}$ is $\bar{\mathbb{T}} \times \Omega$ and $X|\bar{\mathbb{T}}(t, \omega) = X(t, \omega)$ for all $t \in \bar{\mathbb{T}}$ and $\omega \in \Omega$. In the same way we can also write the whole time series (22) as $X|\mathbb{T}$. In the sequel, we will work with one realization of $X$ for a fixed $\omega \in \Omega$. Therefore, we will write $X(t)$ or also $X|\mathbb{T}(t)$.

3. Mining information from time series

In this section, we will present methods for mining two kinds of information about time series:

(a) Characterization of its course.
(b) Summarization of trend-related information about one time series and also about a set of them.

In both cases, the information is provided in special expressions of natural language. To be able to solve this task, we must first precisely specify what properties we are speaking about. Because a time series is a fairly complicated object, we need higher-order fuzzy logic (fuzzy type theory) to formulate the properties in question precisely.

3.1. Formalization of properties of trend

In this subsection, we will introduce precise logical formulas characterizing properties of trend, which is necessary for understanding their structure well. Subsequent interpretation of the formulas in a model enables us to develop algorithms that can be applied in the mining process.

3.1.1. Special formulas

The formulas presented below concern one realization of a time series (cf. definition (22)). We will extend the language $J$ by special constants, namely $X \in \mathit{Form}_{\tau\tau}$ and $\mathbf{B} \in \mathit{Form}_{\tau((o\tau)\tau)}$. In a model, the constant $X$ is assigned a realization of a concrete time series (22) and $\mathbf{B}_{\tau((o\tau)\tau)}$ is interpreted by the coefficient (20). For further analysis, we will represent a time series by a formula

$$\mathbf{X}_{(o\tau)\tau} \equiv \lambda t_\tau \lambda x_\tau \cdot \Delta(X t \equiv x). \tag{23}$$

Formula (23) represents a time series as a fuzzy relation. Namely, every time moment $t$ and every value $x$ are assigned a truth value 0 or 1 (because of $\Delta$) depending on the truth of the equality $X t \equiv x$ (that is, “the value of $X$ in time $t$ is equal to $x$”). Because the interpretation of $X$ is a function, to every $t$ there is only one value $x$ for which the latter equality is true. The restriction of a time series to a given time domain is the following formula of type $((o\tau)\tau)(o\tau)$:

$$\mathbf{X}| \equiv \lambda z_{o\tau} \lambda t_\tau \lambda x_\tau \cdot (z t \,\&\, \Delta(X t \equiv x)). \tag{24}$$

Let a time domain $\mathbb{T} \subseteq \mathbb{R}$ be given. Then $\mathbf{X}|\mathbb{T} \in \mathit{Form}_{(o\tau)\tau}$ is the time series $X$ restricted to the time domain $\mathbb{T}$. Note that (24) represents a time series restricted to the time domain $z_{o\tau}$ as a fuzzy relation. Note also that $z_{o\tau}$ can be, in general, a fuzzy set. The formula

$$\beta^1 \equiv \lambda v_{(o\tau)\tau}\, \mathbf{B} v \tag{25}$$

where $v_{(o\tau)\tau}$ is a variable for a time series represents the slope of the time series computed using (20). The time series $X$ can be restricted to a time domain $\mathbb{T}$ so that the resulting formula $\beta^1(\mathbf{X}|\mathbb{T}) \in \mathit{Form}_\tau$ represents the slope of $X$ over $\mathbb{T}$.

The next step is to introduce formulas for properties. Each property is represented by an intension, which is a function from the set of all contexts (more generally, from the set of all possible worlds) into a set of fuzzy sets over some universe of elements. Therefore, we also introduce a type $\omega$ for contexts⁸ and use a variable $w \in \mathit{Form}_\omega$.

In the formal language of evaluative expressions we deal with specific formulas $Ze, Sm\,\nu, Me\,\nu, Bi\,\nu \in \mathit{Form}_{(o\alpha)\omega}$ where $\omega$ is the type of context and $\nu$ is a hedge. These are formulas for intensions of the evaluative expressions. For example, $SmVe$ is the intension of “very small”. We will often use a meta-variable $Ev \in \mathit{Form}_{(o\alpha)\omega}$ to denote the intension of an arbitrary evaluative expression.

The property “to be stagnating” is defined as follows:

$$\mathit{Stag} \equiv \lambda v_{(o\tau)\tau} \lambda w_\omega \lambda z_{o\tau} \cdot [(Ze\, w)(\beta^1(v|z))] \vee [(Ze(-1w))(\beta^1(v|z))] \vee [(SmEx)w(\beta^1(v|z))] \vee [(SmEx)(-1w)(\beta^1(v|z))]. \tag{26}$$

⁸ According to the theory of evaluative expressions, the type $\omega$ is, in fact, a complex type $\alpha o$; cf. [27].

Table 1
Definition of special hedges used in the characterization of the trend of time series.

Special hedge        Ev in (30) and (31)
Ne (negligible)      SmSi (significantly small)
Sl (slight)          SmVe (very small)
Sw (somewhat)        SmRa (rather small)
Sm (small)           Sm (small)
Cl (clear)           SmVR or Me (very roughly small or medium)
Ro (rough)           BiVR (very roughly big)
FL (fairly large)    BiRo (roughly big)
QL (quite large)     BiRa (rather big)
La (large)           Bi (big)
Sh (sharp)           BiVe (very big)
Si (significant)     BiSi (significantly big)
Hu (huge)            BiEx (extremely big)
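For implementation purposes, Table 1 can be stored as a simple lookup structure. The Python sketch below is only an illustration: the dictionary layout and the helper `trend_property` are hypothetical, and the generated phrases merely follow the pattern “slight increase” / “slight decrease” used in the text.

```python
# Table 1 as a mapping: special hedge -> (linguistic form, Ev used in (30), (31)).
SPECIAL_HEDGES = {
    "Ne": ("negligible", "SmSi"),
    "Sl": ("slight", "SmVe"),
    "Sw": ("somewhat", "SmRa"),
    "Sm": ("small", "Sm"),
    "Cl": ("clear", "SmVR or Me"),
    "Ro": ("rough", "BiVR"),
    "FL": ("fairly large", "BiRo"),
    "QL": ("quite large", "BiRa"),
    "La": ("large", "Bi"),
    "Sh": ("sharp", "BiVe"),
    "Si": ("significant", "BiSi"),
    "Hu": ("huge", "BiEx"),
}


def trend_property(hedge, direction):
    """Return the phrase and the evaluative expression for a trend property.

    `direction` is "Inc" or "Dec"; both names are illustrative only.
    """
    word, evaluative = SPECIAL_HEDGES[hedge]
    noun = {"Inc": "increase", "Dec": "decrease"}[direction]
    return f"{word} {noun}", evaluative


print(trend_property("Sl", "Inc"))  # ('slight increase', 'SmVe')
```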

Formula (26) should be read as follows: $\mathit{Stag}$ is a function which assigns to each time series $v_{(o\tau)\tau}$ (cf. (23)), each context $w_\omega$ and each time domain $z_{o\tau}$ a truth value. The latter is obtained as a disjunction of the following truth values: $(Ze\, w)(\beta^1(v|z))$ is the truth value of the proposition saying that the slope $\beta^1(v|z)$ of the time series $v$ restricted to the time domain $z$ is zero in the (positive) context $w$. Similarly, $(Ze(-1w))(\beta^1(v|z))$ says the same but for the negative context $-1w$. Analogously, $(SmEx)w(\beta^1(v|z))$ and $(SmEx)(-1w)(\beta^1(v|z))$ represent truth values of the propositions saying that the given slope $\beta^1(v|z)$ is extremely small in the positive and in the negative context, respectively. Put more freely, the trend of a time series is stagnating if its slope is either zero or extremely small (in a given context). The disjunction ensures that the resulting truth value is the highest of the partial truth values of all the disjuncts in (26).

Using (26) we obtain the intension of the property “$X$ is stagnating in the period $z$” where $z \in \mathit{Form}_{o\tau}$ is a variable:

$$\mathit{Stag}\,\mathbf{X} \equiv \lambda w_\omega \lambda z_{o\tau} \cdot [(Ze\, w)(\beta^1(\mathbf{X}|z))] \vee [(Ze(-1w))(\beta^1(\mathbf{X}|z))] \vee [(SmEx)w(\beta^1(\mathbf{X}|z))] \vee [(SmEx)(-1w)(\beta^1(\mathbf{X}|z))]. \tag{27}$$

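According to (27), each context $w_\omega$ is assigned an extension $(\mathit{Stag}\,\mathbf{X}) w_\omega$, which is a fuzzy set of time domains $z$ in which $X$ is stagnating (i.e., it has the property $\mathit{Stag}$); see below. The properties “to be increasing” and “to be decreasing” are defined as

$$\mathit{Inc} \equiv \lambda v_{(o\tau)\tau} \lambda w_\omega \lambda z_{o\tau} \cdot \neg\Upsilon((\mathit{Stag}\, v) w z) \wedge (\beta^1(v|z) > 0), \tag{28}$$

$$\mathit{Dec} \equiv \lambda v_{(o\tau)\tau} \lambda w_\omega \lambda z_{o\tau} \cdot \neg\Upsilon((\mathit{Stag}\, v) w z) \wedge (\beta^1(v|z) < 0). \tag{29}$$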
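Definitions (28) and (29) translate directly into a crisp decision rule once the slope and the truth degree of stagnation are known. A minimal Python sketch follows; the function names are hypothetical, and the degree of stagnation is assumed to be supplied from elsewhere, e.g. from an implementation of (26).

```python
def upsilon(truth_value):
    """Crisp test for a nonzero truth value: 1 iff truth_value > 0."""
    return 1 if truth_value > 0 else 0


def direction(slope, stagnation_degree):
    """Truth values (inc, dec) in {0, 1} following (28) and (29).

    `slope` stands for the coefficient beta^1 over the chosen time domain and
    `stagnation_degree` for the truth degree of "the trend is stagnating".
    """
    not_stagnating = 1 - upsilon(stagnation_degree)
    inc = min(not_stagnating, 1 if slope > 0 else 0)
    dec = min(not_stagnating, 1 if slope < 0 else 0)
    return inc, dec


print(direction(0.3, 0.0))   # (1, 0): increasing
print(direction(-0.3, 0.0))  # (0, 1): decreasing
print(direction(0.3, 0.2))   # (0, 0): stagnating to a nonzero degree
```

Any nonzero degree of stagnation rules out both increase and decrease, which is why the two properties are crisp.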

We see that formulas (28) and (29) for a given time series $v_{(o\tau)\tau}$ represent intensions of crisp properties (because the operation $\Upsilon$ is crisp and the relations $>$ and $<$ are also crisp). Namely, $(\mathit{Inc}\,\mathbf{X}) w \mathbb{T}$ is true if the time series $X$ in a context $w$ and time interval $\mathbb{T}$ is not stagnating at all (i.e., the truth degree of stagnation is 0) and the coefficient $\beta^1(\mathbf{X}|\mathbb{T}) > 0$. Otherwise it is false. Similarly for the decrease.

Various kinds of properties of the trend of a time series are defined in Table 1. They are distinguished by a special evaluative expression occurring in their definition:

$$\langle\text{special hedge}\rangle\,\mathit{Inc} \equiv \lambda v_{(o\tau)\tau} \lambda w_\omega \lambda z_{o\tau} \cdot (\beta^1(v|z) > 0) \wedge (Ev\, w)(\beta^1(v|z)), \tag{30}$$

$$\langle\text{special hedge}\rangle\,\mathit{Dec} \equiv \lambda v_{(o\tau)\tau} \lambda w_\omega \lambda z_{o\tau} \cdot (\beta^1(v|z) < 0) \wedge (Ev(-1w))(\beta^1(v|z)). \tag{31}$$

In the sequel, we will use the symbol

$$(\mathit{EvTr}\,\mathbf{X}) w \mathbb{T} \tag{32}$$

that will stand for any of the formulas on the right-hand side of (30) and (31). For example, according to Table 1, the properties “slight increase” and “slight decrease” are defined as follows:

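\[
SlInc \equiv \lambda v_{(o\tau)\tau}\, \lambda w_\omega\, \lambda z_{o\tau} \cdot (\beta^1(v|z) > 0) \wedge (SmVe)\,w\,(\beta^1(v|z)), \tag{33}
\]

\[
SlDec \equiv \lambda v_{(o\tau)\tau}\, \lambda w_\omega\, \lambda z_{o\tau} \cdot (\beta^1(v|z) < 0) \wedge (SmVe)(-1w)(\beta^1(v|z)). \tag{34}
\]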

After substituting a time series X into (33) and (34) we obtain the following intensions:

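\[
SlInc\,X \equiv \lambda w_\omega\, \lambda z_{o\tau} \cdot (\beta^1(X|z) > 0) \wedge (SmVe)\,w\,(\beta^1(X|z)), \tag{35}
\]

\[
SlDec\,X \equiv \lambda w_\omega\, \lambda z_{o\tau} \cdot (\beta^1(X|z) < 0) \wedge (SmVe)(-1w)(\beta^1(X|z)). \tag{36}
\]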


Fig. 2. Extensions of evaluative expressions Ev used in the right column of Table 1 in the context w = ⟨0, 0.4, 1⟩. The curves represent membership functions of extensions of the following evaluative expressions (left to right): SmEx, SmSi, SmVe, Sm, Me, BiVR, BiRo, Bi, BiRa, BiVe, BiSi, BiEx.

Formula (35) (or (36)) has the following meaning: each context w ω and each time interval z ∈ Formoτ are assigned a truth value of the statement “the coefficient β 1 in (20) computed over the time domain z for the time series X is very small”. Formulas (35) and (36) represent intensions of the corresponding properties. For each context w ω , they give extensions ( SlInc X ) w and ( SlDec X ) w which represent fuzzy sets of time domains. Hence, the formulas

\[
(SlInc\,X)\,w\,T \equiv (\beta^1(X|T) > 0) \wedge (SmVe)\,w\,(\beta^1(X|T)), \tag{37}
\]

\[
(SlDec\,X)\,w\,T \equiv (\beta^1(X|T) < 0) \wedge (SmVe)(-1w)(\beta^1(X|T)) \tag{38}
\]

provide the truth degree of the property “trend of the time series X in the time interval T is in the context w slightly increasing” (and, respectively, slightly decreasing). Of course, this truth degree can be any value from [0, 1]. This is discussed in the next subsection. Similar considerations can be performed also with the other properties from Table 1.

3.1.2. Interpretation of special formulas

As usual in logic, interpretation of formulas of FTT starts with the definition of a model, and then each symbol of the language is assigned a precise object in it. In this paper we do not need to construct a model precisely. Instead we will use the following notation: if $A_\alpha$ is a formula then its interpretation in an (arbitrary) model is $\llbracket A_\alpha \rrbracket$. This denotes an element of type $\alpha$ assigned to the formula $A_\alpha$. For example, if $A_{o\alpha}$ is a formula of type $o\alpha$ then $\llbracket A_{o\alpha} \rrbracket \underset{\sim}{\subset} M_\alpha$ is a fuzzy set in the set $M_\alpha$ (this is assigned to the type $\alpha$). Similarly for the other types.

Following this notation, interpretation of the time series $X \in Form_{\tau\tau}$ is one realization of (22), which is a real function $\llbracket X \rrbracket : \mathbb{R} \to \mathbb{R}$. Then interpretation of X in (23) is the (crisp) relation $\llbracket X \rrbracket : \mathbb{R} \times \mathbb{R} \to \{0, 1\}$. Interpretation of $X|T$ is a fuzzy relation $\llbracket X|T \rrbracket : T \times \mathbb{R} \times \mathbb{R} \to [0, 1]$. Interpretation $\llbracket \beta^1(X|T) \rrbracket$ of the coefficient $\beta^1(X|T) \in Form_\tau$ is determined by formula (20). Interpretation of the intension $(SlInc\,X) \in Form_{o(o\tau)\omega}$ is a function

\[
\llbracket SlInc\,X \rrbracket : W \to \mathcal{F}(\mathcal{F}(\mathbb{R})) \tag{39}
\]

which assigns to each context $\llbracket w \rrbracket \in W$ the extension $(SlInc\,X)\,w$ whose interpretation is a fuzzy set of time domains T (each of which can also be a fuzzy set):

\[
\llbracket (SlInc\,X)\,w \rrbracket = \left\{ \llbracket (SlInc\,X)\,w\,T \rrbracket \big/ \llbracket T \rrbracket \;\middle|\; \llbracket T \rrbracket \in \mathcal{F}(\mathbb{R}) \right\}. \tag{40}
\]

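In words: (40) is a fuzzy set of all time domains T over which the time series X is in the context w slightly increasing. The symbol $\llbracket (SlInc\,X)\,w\,T \rrbracket$ denotes the truth value of the latter proposition. The other properties introduced in Table 1 are interpreted similarly. Note that extensions of the properties (28) and (29) are the following crisp sets:

\[
\llbracket (Inc\,X)\,w \rrbracket = \left\{ a_T \big/ \llbracket T \rrbracket \;\middle|\; \llbracket T \rrbracket \in \mathcal{F}(\mathbb{R}) \right\}, \tag{41}
\]

\[
\text{where } a_T = \begin{cases} 1 & \text{if } \llbracket (Stag\,X)\,w\,T \rrbracket = 0 \text{ and } \llbracket \beta^1(X|T) \rrbracket > 0, \\ 0 & \text{otherwise,} \end{cases} \tag{42}
\]

\[
\llbracket (Dec\,X)\,w \rrbracket = \left\{ a_T \big/ \llbracket T \rrbracket \;\middle|\; \llbracket T \rrbracket \in \mathcal{F}(\mathbb{R}) \right\}, \tag{43}
\]

\[
\text{where } a_T = \begin{cases} 1 & \text{if } \llbracket (Stag\,X)\,w\,T \rrbracket = 0 \text{ and } \llbracket \beta^1(X|T) \rrbracket < 0, \\ 0 & \text{otherwise.} \end{cases} \tag{44}
\]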

Let us summarize the above procedure as follows.

Computation of the truth value $\llbracket (SlInc\,X)\,w\,T \rrbracket$

(a) Let the time domain T be the interval $\llbracket T \rrbracket = [c_{k-1}, c_{k+1}] \subset \mathbb{R}$ where $c_{k-1}, c_k, c_{k+1} \in \mathbb{R}$ are nodes. Define a basic function $A_k$ over them and compute the value of the coefficient $\beta^1(X|T)$ using (20):

\[
\llbracket \beta^1(X|T) \rrbracket = \frac{\int_{c_{k-1}}^{c_{k+1}} X(t)\,(t - c_k)\,A_k(t)\,dt}{\int_{c_{k-1}}^{c_{k+1}} (t - c_k)^2 A_k(t)\,dt}. \tag{45}
\]

(b) Set the context $\llbracket w_{tg} \rrbracket = \langle v_L, v_S, v_R \rangle$ for the slope.
(c) Evaluate the truth value $\llbracket \beta^1(X|T) > 0 \rrbracket$, which is equal to 1 or 0 (because the inequality > is crisp).
(d) If the truth value of (c) is 1 then evaluate the truth value $\llbracket ((SmVe)\,w)\,\beta^1(X|T) \rrbracket$, which is equal to the membership degree of (45) in the extension of the evaluative expression SmVe (very small) in the context $w_{tg}$ (cf. (35)).
(e) Put

\[
\llbracket (SlInc\,X)\,w\,T \rrbracket = \llbracket \beta^1(X|T) > 0 \rrbracket \wedge \llbracket ((SmVe)\,w)\,\beta^1(X|T) \rrbracket. \tag{46}
\]
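Steps (a)–(e) above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the basic function A_k is taken to be triangular (the text above only says that a basic function is defined over the nodes), the integrals in (45) are replaced by sums on a uniform grid, and the membership function of SmVe in the context w_tg is passed in as a callable because its shape is not given in this section. All function names are hypothetical.

```python
def triangular(t, left, centre, right):
    """Assumed triangular basic function A_k over the nodes left < centre < right."""
    if t <= left or t >= right:
        return 0.0
    if t <= centre:
        return (t - left) / (centre - left)
    return (right - t) / (right - centre)


def slope_beta1(x, left, centre, right, steps=2000):
    """Approximate the ratio of integrals in (45) by sums on a uniform grid."""
    h = (right - left) / steps
    num = 0.0
    den = 0.0
    for i in range(steps + 1):
        t = left + i * h
        a = triangular(t, left, centre, right)
        num += x(t) * (t - centre) * a
        den += (t - centre) ** 2 * a
    return num / den  # the grid step h cancels in the ratio


def sl_inc(beta1, sm_ve):
    """Steps (c)-(e): the crisp test beta1 > 0 in conjunction with the
    membership degree sm_ve(beta1) supplied by the caller."""
    return sm_ve(beta1) if beta1 > 0 else 0.0
```

For a linear series X(t) = 5 + 2t over the nodes 0, 1, 2, `slope_beta1` returns a value numerically equal to the slope 2, as expected of an average-slope estimate. A call such as `sl_inc(beta1, lambda s: max(0.0, 1.0 - s / 10.0))` uses a toy membership function that stands in for SmVe and is not the one shown in Fig. 2.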

3.2. Characterization of local trend by sentences of natural language

In this section, we will show how sentences of natural language characterizing the direction of trend of a time series (i.e., whether it is increasing, stagnating, or decreasing, and to what extent) can be automatically generated. We will start by choosing several kinds of sentences doing the job. The next step is to analyze their semantic structure, on the basis of which we suggest a procedure for finding a formula of FNL construing the given sentence. Interpretation of this formula in a model provides new information about the time series as well as an algorithm by which a sentence that mediates this information can be generated. There are two techniques that can be applied. One is based on the concept of protoform. The second one is based on the well-established technique developed in linguistic theory which provides full analysis of a sentence of natural language on several levels: the first level, called phonetic, represents its sound form; the highest level, called tectogrammatical, represents its semantic structure. The latter is the starting point for our procedure.

3.2.1. Protoforms

In the literature on applications of fuzzy techniques to natural language, for example [12–14,18,19,39,44] and the citations therein, one can meet the term protoform.⁹ This term was introduced by Zadeh in [45]. He provides the following definition:

A protoform (abbreviation of “prototypical form”) is an abstracted summary. More concretely, a protoform, A, of an object, B, written as A = PF(B), is defined as a deep semantic structure of B. B may be a proposition, command, question, scenario, geometrical form, functional form or other type of construct. Usually, A is a string of symbols, but more generally it may be a graph, network, a geometrical form or other entity.

As can be seen, this idea is intended to be more general but, so far, it has been applied in linguistic considerations.
The cited literature contains the following examples:

(i) Zadeh’s examples: let us consider the sentence “Eva is young” and let A be an abstraction of “age”, B an abstraction of “Eva” and C an abstraction of “young”. Then we may translate this sentence into a sequence of protoforms “A(Eva) is young”, “A(Eva) is C” and “A(B) is C”.

(ii) Examples of summarizing sentences on time series: “Among all segments, most are slowly increasing” is translated into the protoforms “Among all segments, Q are P”, or “Among R segments, Q are P” where R is a quantifier of type ⟨1⟩. Similarly, the sentence “Among all trends, trends of large variability took most of the time” is translated into the protoform “Among R trends, P trends took Q time”. The sentence “Slowly decreasing trends that took most of the time are of a large variability” is translated into the protoform “R trends that took Q time are P”.

9 In linguistics, the term “protoform” is used as a hypothetical form of a word, reconstructed from derived words. For example, the proto-form of “pepper” is “*peppar”. Instead of “protoform”, linguists use also “urform”.


(iii) Characterization of periodicity of time series: Let us consider the sentence “The first two months, the series is highly periodic [with a periodicity of approximately 1 week]” and let Prec₁, Prec₂ be evaluative expressions, such as exactly, approximately, Pdty be a periodicity assessment characterized again by an evaluative expression, π be a computed value of periodicity and Prec₂ p̂ be a fuzzy number. Then the above sentence is translated into the protoform “Prec₁ TimeCtxt the series is Pdty(π) [with a period of Prec₂ p̂ units]”.

(iv) In [44], the so-called gradual summaries are considered: “Problems are solved faster”, “Time series is decreasing”, which are translated into the protoform “y’s are getting P_G” where y are the objects that are summarized and P_G is called a gradual summarizer (in fact, this is an adjectival phrase containing an adjective in comparative form).

The problem with the concept of protoform is the lack of a precise definition. One can see that in a protoform we can replace an arbitrary part of the sentence by some symbol and then assign it a meaning. As can be seen from the examples above, the replaced parts are usually evaluative expressions (special adjectival phrases) and generalized quantifiers (usually the intermediate ones), but this is not specified precisely in advance. The symbols are interpreted by a certain heuristic procedure without an explicitly constructed formula that would provide clear semantics of the expression concerned. Another way in which the semantics of sentences can be represented is by tectogrammatical trees.

3.2.2. Tectogrammatical trees

A well developed, sophisticated linguistic technique that makes it possible to capture the structure and meaning of sentences is the Functional Generative Description of natural language (FGD). It was introduced by P. Sgall in 1968 and then intensively developed by the Prague linguistic circle. It is described in detail in [42] and elsewhere (cf. [4]).
As mentioned above, the sentence is characterized on four hierarchically ordered levels. For us, the most interesting is the highest one, called tectogrammatical. On this level, the meaning of the sentence is represented in the form of a tectogrammatical tree. The tectogrammatical tree is a dependency tree, where individual nodes represent the meaning units of the given sentence and edges stand for (deep) syntactic relations between the relevant nodes. Each meaning unit (i.e., relevant word of a sentence) is assigned several characteristics called grammatemes, which express the role of the word in the meaning of the sentence and provide correspondence to its surface form. Each edge also contains its membership in topic or focus.¹⁰ The “vertical dimension” of the dependency tree captures relations between a head (governor) and its modifiers (dependents). The “horizontal dimension”, which is the order of the nodes from left to right, represents the deep word order which accounts for its topic-focus articulation. Note that synonymous sentences have a single representation on this level, while ambiguous ones have more than one tectogrammatical representation. A detailed and precise presentation of FGD can be found (among others) in the cited book; a precise definition of the tectogrammatical tree can be found in [9,38]. In this paper, we show how nodes of the tectogrammatical tree can be replaced by (sub)formulas. The result is a formula that, together with the structure of the tectogrammatical tree, construes the meaning of the given sentence. Let us remark that the first version of such a procedure has already been proposed in the book [24]. As the tectogrammatical tree is precisely constructed from the syntactical structure of the sentence, and we need a precise method by which logical formulas representing the meaning of sentences can be constructed, we prefer the tectogrammatical representation of the semantics of sentences.
Concrete (simplified) examples of tectogrammatical trees are presented in the next subsection.

3.2.3. Construction of formulas construing tectogrammatical trees

In this subsection, we will demonstrate how tectogrammatical trees of special sentences characterizing the trend of a given time series can be constructed and assigned formulas of FNL that construe their meaning. Let us remark that recognition of trend may not be a trivial task even when watching the graph. For example, see the time series in Fig. 3: what is its trend in various time intervals? In Subsection 2.4 we gave arguments that the F¹-transform is able to cope well with this problem. Indeed, Theorem 1 assures us that using it, we obtain an estimation of the average slope (tangent) of a given function in a specified subset of its domain. Moreover, in [34] we proved that the F-transform is able to filter out high frequencies which, in the case of time series, means that the F¹-transform provides a convincing estimation of its slope (cf. Fig. 1). In [30], we described a method which, by combining the F¹-transform and fuzzy natural logic, makes it possible to generate an evaluation of the trend of time series in the form of natural language expressions. The result is depicted in Fig. 3 and presented in its caption. One can see that volatility of the time series is not the principal obstacle. We argue that the presented results are in good correspondence with intuition.¹¹

10 According to the theory of the topic-focus articulation, each sentence is divided into two parts: the topic, that is the known information, and the focus, that is the new information. For example, there are at least two readings of the following sentence, namely “EVA_f is young_t”, i.e., we learn about the age of EVA and not of JOHN, or “Eva_t IS_f YOUNG_f”, i.e., we learn that Eva IS YOUNG and not OLD. The theory is elaborated in many papers and books, e.g., [8].

11 For more examples, see [30].


Fig. 3. Linguistic evaluation of local trend of time series in the following time intervals: T1 = [1, 49]: somewhat decrease; T2 = [85, 91]: sharp increase; T3 = [116, 127]: negligible decrease. Trend of the whole time series is stagnating. The range of values of the time series is 3900–6450. The context for the slope was set to w_tg = ⟨0, 800/12, 2000/12⟩.

Having the necessary tools at our disposal, we will now elaborate more precisely a method for generating comments on the direction of trend¹² of time series. Let us consider the following sentence:

Trend of time series X in time period P is slightly increasing

(47)

where X is the name of a time series (e.g., “unemployment rate”), P is the name of a time period (e.g., “first quarter” or simply “spring”) and “slightly increasing” characterizes the direction and extent of its slope. The simplified¹³ tectogrammatical tree of this sentence is the following:

In this tree, “is” is a copula that has an actor (Act) and a predicate (Pred) realized by the adjectival complementation. The letters “t, f” denote membership of the given node in the topic and focus, respectively. “Appurt” is short for appurtenance. It is a free modification of a noun and characterizes objects denoted by the latter. Extent is a modification of an adjective and “Temp” is a temporal modification of the verb “to be”. We may also consider sentence (47) in the past form, i.e., the present “is” can be replaced by “was”. This depends on the use of the sentence in a wider context, e.g., whether we describe a past situation or simply analyze the present data. In the tectogrammatical tree, the corresponding node for the verb must be extended by specific grammatemes which characterize its semantics more precisely (cf. [42]). As this problem is unimportant from the point of view of our analysis, for simplicity, we omitted grammatemes at the nodes of the tectogrammatical tree. Each node is accompanied by the symbol :=, which is followed by a (sub)formula that construes the given node. The respective branches then represent successive substitutions that lead to the resulting formula. Hence, starting from below, v is replaced by X so that the left branch is construed by the formula $\beta^1(X|z)$. Then, the left and middle branches are construed by $\beta^1(X|T)$, where z is replaced by T by means of the identity relationship. The right branch represents the meaning of the expression “some value is slightly increasing”, which is construed by $(y > 0) \wedge (SmVe)\,w\,y$. Finally, the whole tree is construed by the formula

(β 1 ( X |T) > 0) ∧ (SmVe) w (β 1 ( X |T))

(48)

which is the formula ( SlInc X ) w T defined in (37). Let us now consider another sentence:

Increase of trend of time series X in time period P is slight.

(49)

12 Note that, by abuse of language, it is common to say, e.g., “trend is increasing (decreasing, stagnating)” instead of “direction of trend is increasing (decreasing, stagnating)”.

13 We omitted grammatemes of the meaning units.


Table 2. Formulas construing sentences of the type (47) or (49). The first column contains the “Extent”, the second one contains a formula construing the whole sentence.

| Level | Evaluative expression in the focus of (47) | (EvTr X) w T |
|-------|--------------------------------------------|--------------|
| 0 | Stagnating | (Stag X) w T |
| 1 | Increasing/Decreasing | (Inc X) w T / (Dec X) w T |
| 2 | Negligibly increasing/decreasing | (NeInc X) w T / (NeDec X) w T |
| 3 | Slightly increasing/decreasing | (SlInc X) w T / (SlDec X) w T |
| 4 | Somewhat increasing/decreasing | (SwInc X) w T / (SwDec X) w T |
| 5 | Small increase/decrease | (SmInc X) w T / (SmDec X) w T |
| 6 | Clearly increasing/decreasing | (ClInc X) w T / (ClDec X) w T |
| 7 | Roughly increasing/decreasing | (RoInc X) w T / (RoDec X) w T |
| 8 | Fairly large increase/decrease | (FLInc X) w T / (FLDec X) w T |
| 9 | Quite largely increasing/decreasing | (QLInc X) w T / (QLDec X) w T |
| 10 | Largely increasing/decreasing | (LaInc X) w T / (LaDec X) w T |
| 11 | Sharply increasing/decreasing | (ShInc X) w T / (ShDec X) w T |
| 12 | Significantly increasing/decreasing | (SiInc X) w T / (SiDec X) w T |
| 13 | Hugely increasing/decreasing | (HuInc X) w T / (HuDec X) w T |

Its tectogrammatical tree is the following:

Similarly as above, the left branch is construed by the formula $(\beta^1(X|T) > 0)$. Following this, the adverbial complementation of Extent gives the formula (48), i.e., $(\beta^1(X|T) > 0) \wedge (SmVe)\,w\,(\beta^1(X|T))$. We again obtain the formula $(SlInc\,X)\,w\,T$ defined in (37). Though the two tectogrammatical trees are different, and so, strictly speaking, sentences (47) and (49) are not synonymous, they lead to the same formula which formalizes their meaning. Consequently, both sentences have the same semantics.

The topic of these sentences determines the context w. Information about the context, however, is not contained in them but comes from other, in general non-linguistic, sources. It depends both on the kind of the time series X and on the time period P. For example, an increase of national product by 1% during 1 year can be a fairly large increase for some European country but small for China. The same increase during half a year can already be sharp for a European country. Hence, after assigning w a concrete interpretation $w_{tg}$, we obtain the truth value (46), which is the extension of both sentences in the context $w_{tg}$.

One can see that the evaluative expression “slightly increasing” (or “slight” (increase)) in sentences (47) and (49) is contained in their focus and can be changed. All the possibilities are summarized in Table 2. Note also that though the tectogrammatical trees of both sentences (47) and (49) are construed by the same formula, the sentences are not synonymous and should be used in different places of the text. Moreover, the grammar of English does not allow generating them equally with the same evaluative expression from Table 2. For example, we cannot say “The increase of time series X is somewhat”. Similarly, the property “stagnating” can appear only in sentence (47).

3.3. Time series segmentation

An important task solved in [30] is finding time intervals (segments) $\bar{T}_i$, $i = 1, \ldots, s$, within which the time series X behaves monotonously (“increasing, decreasing, stagnating”). This task, according to [7], belongs to the general problem of time series segmentation. An algorithm presented in [30] is based on a simple idea: Let T be a time domain on which the time series X is defined. First, we set nodes $c_0, c_1, \ldots, c_m \in T$, where $c_0 = 0$ is the first and $c_m = p$ the last time moment of T, and introduce auxiliary time intervals $\llbracket \bar{T}_j \rrbracket = c_j - c_{j-1}$, $j = 1, \ldots, m$. Each $\llbracket \bar{T}_j \rrbracket$ is assigned a formula $(EvTr\,X)\,w\,\bar{T}_j$ characterizing the evaluation of X in this time interval using (50). Then the time segments $\bar{T}_i$, $i = 1, \ldots, s$, with monotonous behavior of the time series are constructed as follows.

The algorithm starts with the last segment $\bar{T}_m$. Let $\llbracket \bar{T}_i \rrbracket$ be processed. Then the left interval $\llbracket \bar{T}_j \rrbracket$ is considered, the formula $(EvTr\,X)\,w\,(\bar{T}_i \cup \bar{T}_j)$ is constructed and either the equality $(EvTr\,X)\,w\,(\bar{T}_i \cup \bar{T}_j) = (EvTr\,X)\,w\,\bar{T}_i$, or the equality $(EvTr\,X)\,w\,(\bar{T}_i \cup \bar{T}_j) = (EvTr\,X)\,w\,\bar{T}_{j-1}$ is tested. If neither of them is fulfilled, we take $\llbracket \bar{T}_i \rrbracket$ as completed, generate formulas (32) and continue to form the next segment $\bar{T}_{i+1}$.

Note that this algorithm gives not only the found segments but also a linguistic evaluation of the direction of trend within them. The precise algorithm is described in [30]. Note that in [11], an algorithm providing a similar result is also given. Its problem is that it uses the real values of the time series, which can be (and usually are) subject to unwelcome short periodicities and random disturbances that may significantly influence the result. This problem can be essentially avoided when using the F-transform.

3.4. Mining linguistic information about course of time series

The theory described above makes it possible to mine information about the behavior of a given time series X and to express it in sentences of natural language. We suppose to have the following data at our disposal:

(a) One realization of the time series X in (22).
(b) The list $\mathcal{T} = \{T, \bar{T}_1, \ldots, \bar{T}_s\}$ of time segments of T. These segments can be determined either using the algorithm mentioned in the previous subsection, or specified on the basis of some additional assumptions ($\bar{T}_i$ can represent months, weeks, etc.).
(c) The context $\llbracket w_{tg} \rrbracket$ for the slope of X.
The task of mining information about the course of time series consists in generating formulas (32) (the concrete examples of which are (30) or (31)) that were demonstrated to construe the semantics of sentences (47) or (49). Note that the evaluative expression Ev inside (30) or (31) according to Table 1 can be obtained using the function of local perception, where the set K = {Ze, SmSi, SmVe, Sm, SmVR, Me, BiVR, BiRo, BiRa, Bi, BiVe, BiSi, BiEx}:

\[
Ev = LPerc^{K}(\beta^1(X|T), \llbracket w_{tg} \rrbracket) \tag{50}
\]

where $\beta^1(X|T)$ is the coefficient providing an estimation of the slope of the trend of X over the interval T, whose interpretation is computed using (20).

Mining procedure

For all time segments $\bar{T} \in \mathcal{T}$ do the following:

(i) Compute the values of the coefficient $\beta^1(X|\bar{T})$ using (45).
(ii) Using (i) and function (9) determine the local perceptions $LPerc(\beta^1(X|\bar{T}), \llbracket w_{tg} \rrbracket)$.
(iii) Using (ii) determine the formulas $(EvTr\,X)\,w\,\bar{T}$ (32) and compute the truth values $\llbracket (EvTr\,X)\,w\,\bar{T} \rrbracket$.
(iv) Formulas $(EvTr\,X)\,w\,\bar{T}$ obtained in step (iii) construe the meaning of sentences of natural language of the form (47) and (49). Having them at our disposal, we can generate the corresponding sentences, for example:

The trend of time series X in the first (second, third, fourth) quarter of the year is slightly (sharply, clearly, etc.) decreasing (increasing, stagnating).

This sentence has a structure similar to (47). Its tectogrammatical tree, however, must be modified in the temporal branch as follows:

The RSTR is a functor of restriction and “the first quarter of the year” is construed by a special time period T1 .
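The mining procedure (i)–(iv) can be illustrated by the following Python sketch. Several simplifications are assumed and should not be read as the method of [30]: the function of local perception is replaced by a plain choice of the expression with the highest membership degree, negative slopes are handled by taking the absolute value instead of the context −1w, stagnation is left out, and the membership functions and the wording table are toy placeholders. Only the pairing of SmVe with “slightly” follows from (33); the other pairing is hypothetical.

```python
def local_perception(value, memberships):
    """Stand-in for LPerc in (50): the expression with the highest membership."""
    name = max(memberships, key=lambda n: memberships[n](value))
    return name, memberships[name](value)


def mine_sentences(slopes, memberships, wording, series="X"):
    """slopes maps a period name to the slope beta^1 on that period.
    Returns a list of (sentence, truth value) pairs in the form of (47)."""
    out = []
    for period, beta1 in slopes.items():
        if beta1 == 0:
            continue  # stagnation needs (27) and is omitted from this sketch
        direction = "increasing" if beta1 > 0 else "decreasing"
        ev, truth = local_perception(abs(beta1), memberships)
        sentence = (f"The trend of time series {series} in {period} "
                    f"is {wording[ev]} {direction}.")
        out.append((sentence, truth))
    return out


# Toy membership functions (NOT the curves of Fig. 2) and a toy wording table.
TOY_MEMBERSHIPS = {
    "SmVe": lambda s: max(0.0, 1.0 - s / 5.0),
    "Bi": lambda s: min(1.0, s / 20.0),
}
TOY_WORDING = {"SmVe": "slightly", "Bi": "largely"}  # "Bi"/"largely" is assumed
```

For example, `mine_sentences({"the first quarter": 1.0}, TOY_MEMBERSHIPS, TOY_WORDING)` yields the sentence “The trend of time series X in the first quarter is slightly increasing.” together with its truth value under the toy membership functions.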


Similarly, we can generate a sentence of the form (49):

The decrease of trend of time series X in the first (second, third, fourth) quarter of the year is slight (sharp, quite large, etc.).

Note that for $\bar{T} \equiv T$ we can also generate a special form of (47):

The global trend of time series X is slightly (sharply, quite largely, etc.) increasing.

Let us remark that step (iii) also enables us to generate other kinds of sentences providing additional information about the time series, for example:

The longest period of stagnation (increase, slight decrease, etc.) of time series X lasted $\bar{T}_i$ time moments (days, weeks, months, etc.).

A more detailed elaboration of such additional possibilities, which apparently go beyond simple template-based generation, will be published in some of the next papers.

3.5. Linguistic summarization of knowledge and syllogistic reasoning

As mentioned in the introduction, this is an interesting task that has already been addressed by several authors. Our ideas in this paper are essentially the same. However, our approach is much more formal and uses the logical theory developed in FNL. In the case of summarization, we suggest applying the sophisticated formal theory of intermediate quantifiers.

3.5.1. Summarization of knowledge about one time series

A typical sentence summarizing knowledge about one time series is the following:

In most time intervals, trend of the time series X is slightly increasing.

(51)

The tectogrammatical tree of this sentence is the following:

Using the results from Section 3 and the theory of intermediate quantifiers (see Section 2.3), we will construe this sentence using the following type ⟨1⟩ quantifier T (cf. definition (13) and below):

\[
(Q^{\forall}_{Bi\,Ve}\, z)((SlInc\,X)\,w\,z) \equiv (\forall z)(z \in \mathrm{Supp}((SlInc\,X)\,w) \Rightarrow (SlInc\,X)\,w\,z) \wedge Ev((\mu\mathcal{T})(SlInc\,X)\,w). \tag{52}
\]

Interpretation of this formula is the following. The quantified variable is $z \in Form_{o\tau}$ (a time period) and the universe is the set of time periods $\mathcal{T}$. Hence, the interpretation $\llbracket z \rrbracket$ is one of the intervals contained in $\mathcal{T}$. Recall that it is a finite set having s + 1 elements. The formula $(SlInc\,X)\,w\,z$ was defined in (37). The further procedure is the following:

(a) For all $\bar{T} \in \mathcal{T}$ find the truth values $\llbracket (SlInc\,X)\,w \rrbracket(\bar{T})$ using (46).
(b) Determine the support (40)

\[
\mathrm{Supp}(\llbracket (SlInc\,X)\,w \rrbracket) = \{ \bar{T} \mid \llbracket (SlInc\,X)\,w \rrbracket(\bar{T}) > 0,\ \bar{T} \in \mathcal{T} \}.
\]

(c) Compute the relative cardinality (12)

\[
\llbracket (\mu\mathcal{T})((SlInc\,X)\,w) \rrbracket = \frac{\sum_{\bar{T} \in \mathcal{T}} \llbracket (SlInc\,X)\,w \rrbracket(\bar{T})}{|\mathcal{T}|}
\]

and determine the truth value $\llbracket BiVe((\mu\mathcal{T})(SlInc\,X)\,w) \rrbracket$ in the standard context $\llbracket w \rrbracket = \langle 0, 0.5, 1 \rangle$.
(d) Compute the truth value of (52) as follows:¹⁴

\[
\llbracket (Q^{\forall}_{Bi\,Ve}\, z)((SlInc\,X)\,w\,z) \rrbracket = \min\Big( \min\big\{ \llbracket (SlInc\,X)\,w\,z \rrbracket \;\big|\; \llbracket z \rrbracket \in \mathrm{Supp}(\llbracket (SlInc\,X)\,w \rrbracket) \big\},\ \llbracket BiVe((\mu\mathcal{T})((SlInc\,X)\,w)) \rrbracket \Big). \tag{53}
\]
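Steps (a)–(d) translate almost directly into code. The sketch below is a minimal illustration, assuming that the truth values of (SlInc X) w T̄ for all segments have already been obtained by (46) and that the membership function of the evaluative expression (BiVe for “most”) in the standard context is supplied by the caller; its shape is not given in this section, so any concrete function used with the sketch is a placeholder. The name `quantifier_truth` is hypothetical.

```python
def quantifier_truth(truths, ev):
    """Truth value of a formula of the form (52), computed as in (53).

    truths: truth values of (SlInc X) w T for every segment T in the list.
    ev: membership function on [0, 1] of the evaluative expression
        applied to the relative cardinality (e.g. BiVe for "most").
    """
    support = [a for a in truths if a > 0]           # step (b)
    # The universally quantified part is vacuously true on an empty support.
    universal = min(support) if support else 1.0
    rel_card = sum(truths) / len(truths)             # step (c)
    return min(universal, ev(rel_card))              # step (d)
```

With a toy membership function such as `lambda r: max(0.0, min(1.0, (r - 0.5) / 0.4))`, the truth values 0.9, 0.8, 1.0 and 0.0 give a relative cardinality of 0.675 and a resulting truth value of about 0.44; the minimum over the support (0.8) does not limit the result in this case.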

Mining procedure

Mining summarizing knowledge in sentences of natural language of the form (51) proceeds as follows.

(i) Analogously to (52), compute the truth value

\[
\llbracket (Q^{\forall}_{Ev}\, z)((EvTr\,X)\,w\,z) \rrbracket \tag{54}
\]

for each formula

EvTr ∈ {Stag, NeInc, NeDec, SlInc, SlDec, SmInc, SmDec, ClInc, ClDec, RoInc, RoDec, FLInc, FLDec, QLInc, QLDec, LaInc, LaDec, ShInc, ShDec, HuInc, HuDec}

and for each intermediate quantifier A, P, B, T, K, I from Section 2.3 (restricted to type ⟨1⟩). The result is a set of formulas (54) (cf. also (52)), each of which construes a sentence of the form (51), for example:

In most (many, few) cases, the time series was stagnating (sharply decreasing, roughly increasing).

(ii) Formulas of the form (54) in which we consider various quantifiers have, of course, various truth degrees. We choose a formula that has the highest truth degree (possibly greater than a given threshold) and at the same time characterizes the highest level of steepness of the time series.

3.5.2. Summarization of knowledge about a set of time series

Let a set $\{X_i \mid i = 1, \ldots, r\}$ of time series (22) be given. Analogously to the previous section, we can formally construe the semantics of sentences such as (a) and (b):

(a) Most (many, few, etc.) analyzed time series stagnated recently but their future trend is slightly (negligibly, largely, significantly, etc.) increasing.

Let $v_{(o\tau)\tau}$ be a variable for time series (23). Then this sentence is construed using the following type ⟨1, 1⟩ quantifier:

\[
(Q^{\forall}_{Bi\,Ve}\, v)((Stag\,v)\,w\,\bar{T}_i,\ (SlInc\,v)\,w\,\bar{T}_{i+1})
\]

where $\bar{T}_i$ is the recent time period and $\bar{T}_{i+1}$ the future one.

(b) There is evidence of a huge (slight, clear, etc.) decrease of trend of almost all time series in the recent quarter of the year.

Formalization:

\[
(Q^{\forall}_{Bi\,Si}\, v)((HuDec\,v)\,w\,\bar{T}_i).
\]

Interpretation of the corresponding formulas is constructed in the same way as explained above. The difference is that the quantified variable runs over the set of time series. It is also possible to combine more quantifiers, i.e., to mine linguistic information such as

Few financial time series often stagnated.

This can be formalized by

\[
(Q^{\forall}_{Sm\,Ve}\, v)(v_{Fin},\ (Q^{\forall}_{Bi}\, z)(Stag\,v_{Fin})\,w\,z)
\]

where $v_{Fin}$ is a variable representing a financial time series (no context is involved) and “often” is modeled by the quantifier $(Q^{\forall}_{Bi}\, z)$.

14 This is the truth value of the proposition “the relative cardinality $(\mu\mathcal{T})((SlInc\,X)\,w)$ is very big”.


There are almost unlimited possibilities for mining interesting information from the given set of time series using the developed formalism. It is especially interesting to include also a forecast of their future development. Then we can generate comments on interesting time intervals, or we can also give answers to questions, for example: In which period was the time series sharply increasing? How long was the time series stagnating or decreasing before a sharp increase?

4. Conclusion and future work

In this paper, we focused on the problem of mining information in the form of sentences of natural language from time series. Note that there are many methods for mining information from time series. Very few, however, provide information in sentences of natural language (see [7] and the citations therein). Our main contribution consists in the utilization of the coherent techniques of F-transform and fuzzy natural logic. With their help we are able to disclose various kinds of information and put them together using sentences of natural language. Using the F-transform we are able to provide new data about the structure of time series, such as the course of trend, segments of special behavior or the numerical value of its slope. The strength and reliability of these techniques are based on their mathematically proved properties. By combining fuzzy natural logic and special linguistic techniques, we are able to construct formulas that construe essential parts of sentences of natural language. Namely, we suggest a sentence that gives a certain kind of information on time series (from this point of view, our approach can be taken as “template based”). Then we form a tectogrammatical tree that provides semantic analysis of the given sentence and finally replace its nodes by the formulas concerned. The result is a formal structure that construes the semantics of the given sentence.
Having a formula construing the meaning of the given sentence at disposal, we can construct a model that fits the data provided by the time series (after application of special mathematical techniques, such as F-transform). If such a model exists then the given sentence can be provided as a comment to the time series. Notice that precise and detailed formal analysis of relatively simple sentences is quite complicated. Our future work will therefore be focused on refinement and extension of the above presented techniques. We must also include in the formal model the topic-focus articulation phenomenon. An exciting possibility not mentioned yet is to derive new information on the basis of the formal theory of generalized syllogistic reasoning developed, e.g., in [20,22] where validity of over 105 generalized Aristotle’s syllogisms with intermediate quantifiers was formally proved. It is important to note that if a syllogism is valid then it is true in all models. In our case this means that it is true for arbitrary time series and in arbitrary context. The following is example of a syllogism of Fig. II: No sharply increasing trend is long

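$$\neg(\exists z)\big((\mathit{ShInc}\,X)\,w\,z \wedge (\mathit{Bi}\,w')\,z\big)$$

Most increasing trends are long

$$(Q^{\forall}_{\mathit{Bi}\,\mathit{Ve}}\,z)\big((\mathit{Inc}\,X)\,w\,z,\ (\mathit{Bi}\,w')\,z\big)$$

Most increasing trends are not increasing sharply

$$(Q^{\forall}_{\mathit{Bi}\,\mathit{Ve}}\,z)\big((\mathit{Inc}\,X)\,w\,z,\ \neg(\mathit{ShInc}\,X)\,w\,z\big)$$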

where the formula $(\mathit{Bi}\,w')\,z$ says that the period $z$ is “big” (= “long”) in the context $w'$. Note that $w$ is a context for the slope, interpreted as $w_{tg}$, while $w'$ represents a context for the length of the period, which depends on the interpretation of the time series (for example, it can be around 30 for a time series with daily periodicity, or 3–4 months for monthly periodicity, etc.). The interpretation of all the formulas occurring in these syllogisms can be constructed in a straightforward way, analogously to (53). The detailed formal analysis and possible algorithms will be the topic of a future paper.

Acknowledgement

This paper was supported by the project LQ1602 IT4Innovations excellence in science.

References

[1] J. Anděl, Statistical Analysis of Time Series, SNTL, Praha, 1976 (in Czech).
[2] R. Castillo-Ortega, N. Marín, D. Sánchez, Time series comparison using linguistic fuzzy techniques, in: E. Hüllermeier, R. Kruse, F. Hoffmann (Eds.), Computational Intelligence for Knowledge-Based Systems Design, in: Lecture Notes in Computer Science, Springer, Berlin, 2010, pp. 330–339.
[3] R. Castillo-Ortega, N. Marín, D. Sánchez, A fuzzy approach to the linguistic summarization of time series, J. Mult.-Valued Log. Soft Comput. 17 (2–3) (2011) 157–182.
[4] S. Cinková, et al., Annotation of English on the Tectogrammatical Level, Universitas Carolina Pragensis, Prague, 2008.
[5] M. Delgado, M. Ruiz, D. Sanchez, M. Vila, Fuzzy quantification: a state of the art, Fuzzy Sets Syst. 242 (2014) 1–30.
[6] A. Dvořák, M. Holčapek, L-fuzzy quantifiers of the type 1 determined by measures, Fuzzy Sets Syst. 160 (2009) 3425–3452.
[7] T.-C. Fu, A review on time series data mining, Eng. Appl. Artif. Intell. 24 (2011) 164–181.
[8] E. Hajičová, B.H. Partee, P. Sgall, Topic-Focus Articulation, Tripartite Structures, and Semantic Content, Kluwer, Dordrecht, 1998.
[9] E. Hajičová, P. Sgall, J. Hana, T. Hoskovec (Eds.), Prague Linguistic Circle Papers, vol. 4, J. Benjamin, Amsterdam/Philadelphia, 2002.
[10] J. Hamilton, Time Series Analysis, Princeton University Press, Princeton, 1994.


[11] J. Kacprzyk, A. Wilbik, Using fuzzy linguistic summaries for the comparison of time series: an application to the analysis of investment fund quotations, in: J.P. Carvalho, D. Dubois, U. Kaymak, J. da Costa Sousa (Eds.), Proceedings of the Joint 2009 IFSA World Congress and 2009 EUSFLAT Conference, Lisbon, Portugal, 2009, pp. 1321–1326.
[12] J. Kacprzyk, A. Wilbik, S. Zadrożny, Linguistic summarization of time series using a fuzzy quantifier driven aggregation, Fuzzy Sets Syst. 159 (2008) 1485–1499.
[13] J. Kacprzyk, A. Wilbik, S. Zadrożny, An approach to the linguistic summarization of time series using a fuzzy quantifier driven aggregation, Int. J. Intell. Syst. 25 (2010) 411–439.
[14] J. Kacprzyk, R. Yager, Linguistic summaries of data using fuzzy logic, Int. J. Gen. Syst. 30 (2001) 133–154.
[15] I. Kobayashi, N. Okumura, Verbalizing time-series data: with an example of stock price trends, in: J.P. Carvalho, D. Dubois, U. Kaymak, J. da Costa Sousa (Eds.), Proceedings of the Joint 2009 IFSA World Congress and 2009 EUSFLAT Conference, Lisbon, Portugal, 2009, pp. 234–239.
[16] V. Kreinovich, I. Perfilieva, Fuzzy transforms of higher order approximate derivatives: a theorem, Fuzzy Sets Syst. 180 (2011) 55–68.
[17] H.P. Luhn, The automatic creation of literature abstracts, IBM J. Res. Dev. 2 (1958) 159–165.
[18] N. Marín, D. Sánchez, On generating linguistic descriptions of time series, Fuzzy Sets Syst. 285 (2016) 6–30.
[19] G. Moyse, M. Lesot, Linguistic summaries of locally periodic time series, Fuzzy Sets Syst. 285 (2016) 94–117.
[20] P. Murinová, V. Novák, A formal theory of generalized intermediate syllogisms, Fuzzy Sets Syst. 186 (2012) 47–80.
[21] P. Murinová, V. Novák, Analysis of generalized square of opposition with intermediate quantifiers, Fuzzy Sets Syst. 242 (2014) 89–113.
[22] P. Murinová, V. Novák, The structure of generalized intermediate syllogisms, Fuzzy Sets Syst. 247 (2014) 18–37.
[23] A. Nenkova, K. McKeown, Automatic summarization, Found. Trends Inf. Retr. 5 (2011) 103–233.
[24] V. Novák, The Alternative Mathematical Model of Linguistic Semantics and Pragmatics, Plenum, New York, 1992.
[25] V. Novák, On fuzzy type theory, Fuzzy Sets Syst. 149 (2005) 235–273.
[26] V. Novák, Perception-based logical deduction, in: B. Reusch (Ed.), Computational Intelligence, Theory and Applications, Springer, Berlin, 2005, pp. 237–250.
[27] V. Novák, A comprehensive theory of trichotomous evaluative linguistic expressions, Fuzzy Sets Syst. 159 (22) (2008) 2939–2969.
[28] V. Novák, A formal theory of intermediate quantifiers, Fuzzy Sets Syst. 159 (10) (2008) 1229–1246.
[29] V. Novák, EQ-algebra-based fuzzy type theory and its extensions, Log. J. IGPL 19 (2011) 512–542.
[30] V. Novák, Linguistic characterization of time series, Fuzzy Sets Syst. 285 (2016) 52–72.
[31] V. Novák, S. Lehmke, Logical structure of fuzzy IF–THEN rules, Fuzzy Sets Syst. 157 (2006) 2003–2029.
[32] V. Novák, V. Pavliska, I. Perfilieva, M. Štěpnička, F-transform and fuzzy natural logic in time series analysis, in: Proc. Int. Conference EUSFLAT-LFA’2013, Milano, Italy, 2013, pp. 40–47.
[33] V. Novák, V. Pavliska, M. Štěpnička, L. Štěpničková, Time series trend extraction and its linguistic evaluation using F-transform and fuzzy natural logic, in: L. Zadeh, A. Abbasov, R. Yager, S. Shahbazova (Eds.), Recent Developments and New Directions in Soft Computing, Springer, Berlin, 2014, pp. 429–442.
[34] V. Novák, I. Perfilieva, M. Holčapek, V. Kreinovich, Filtering out high frequencies in time series using F-transform, Inf. Sci. 274 (2014) 192–209.
[35] I. Perfilieva, Fuzzy transforms: theory and applications, Fuzzy Sets Syst. 157 (2006) 993–1023.
[36] I. Perfilieva, M. Daňková, B. Bede, Towards a higher degree F-transform, Fuzzy Sets Syst. 180 (2011) 3–19.
[37] P. Peterson, Intermediate Quantifiers. Logic, Linguistics, and Aristotelian Semantics, Ashgate, Aldershot, 2000.
[38] V. Petkevič, A new formal specification of underlying representations, Theor. Linguist. 21 (1995) 7–61.
[39] A. Ramos-Soto, A. Bugarín, S. Barro, On the role of linguistic descriptions of data in the building of natural language generation systems, Fuzzy Sets Syst. 285 (2016) 31–51.
[40] E. Reiter, R. Dale, Building Natural Language Generation Systems, Cambridge University Press, Cambridge, 2010.
[41] A. Said, T. Taskaya-Temizel, A. Khurshid, Summarizing time series: learning patterns in ‘Volatile’ series, in: Z. Yang, H. Yin, R. Everson (Eds.), Intelligent Data Engineering and Automated Learning – IDEAL 2004, in: Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2004, pp. 523–532.
[42] P. Sgall, E. Hajičová, J. Panevová, The Meaning of the Sentence in Its Syntactic and Pragmatic Aspects, D. Reidel, Dordrecht, 1986.
[43] K. van Deemter, E. Krahmer, M. Theune, Real versus template-based natural language generation: a false opposition?, Comput. Linguist. 31 (2005) 15–24.
[44] A. Wilbik, U. Kaymak, Gradual linguistic summaries, Commun. Comput. Inf. Sci. 443 (2014) 405–413.
[45] L. Zadeh, A prototype-centered approach to adding deduction capabilities to search engines – the concept of a protoform, in: Proc. of NAFIPS 2002, New Orleans, LA, USA, 2002, pp. 523–525.