Fuzzy least-squares algorithms for interactive fuzzy linear regression models


Fuzzy Sets and Systems 135 (2003) 305 – 316 www.elsevier.com/locate/fss

Miin-Shen Yang ∗, Hsien-Hsiung Liu

Department of Mathematics, Chung Yuan Christian University, Chung-Li, Taiwan 32023, Republic of China

Received 10 August 2001; accepted 20 March 2002

Abstract

Fuzzy regression analysis can be thought of as a fuzzy variation of classical regression analysis. It has been widely studied and applied in diverse areas. In general, the analysis of fuzzy regression models can be roughly divided into two categories. The first is based on Tanaka's linear-programming approach. The second is based on the fuzzy least-squares approach. In this paper, new types of fuzzy least-squares algorithms with a noise cluster for interactive fuzzy linear regression models are proposed. These algorithms are robust for the estimation of fuzzy linear regression models, especially when outliers are present. Numerical examples are given to detail the effectiveness of this approach. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Fuzzy sets; Regression models; Estimation; Fuzzy least squares; Linear programming; Noise cluster; Outlier

1. Introduction

Regression analysis is used to model the functional relationship between dependent and independent variables. In conventional regression analysis, deviations between the observed values and the estimates are assumed to be due to random errors. Thus, statistical techniques are applied to perform estimation and inference in regression analysis. However, the deviations are sometimes due to the indefiniteness of the structure of the system or to imprecise observations. The uncertainty in this type of regression model is fuzziness, not randomness. Since Zadeh [18] proposed fuzzy sets, fuzziness has received more attention, and fuzzy data analysis has become increasingly important (see [2]). Tanaka et al. [14] first proposed a study in linear regression analysis with a fuzzy model. They considered the parameter estimations of fuzzy linear regression (FLR) models under two factors,

Corresponding author. Tel.: +886-3-456-3171; fax: +886-3-456-3160. E-mail address: [email protected] (Miin-Shen Yang).

0165-0114/03/$ - see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0165-0114(02)00123-9


Miin-Shen Yang, Hsien-Hsiung Liu / Fuzzy Sets and Systems 135 (2003) 305 – 316

namely the degree of the fitting and the vagueness of the model. The estimation problems were then transformed into linear programming (LP) based on these two factors. This type of analysis of FLR models is called Tanaka's approach. Extensions of FLR models and different estimation methods have been proposed by many researchers. Since the measure of best fitting by residuals under fuzzy consideration is not presented in Tanaka's approach, Diamond [6] proposed the so-called fuzzy least-squares approach, which is a fuzzy extension of ordinary least squares based on a newly defined distance on the space of fuzzy numbers. According to Zadeh's construction of fuzzy sets as a basis for a theory of possibility [19], fuzzy regression analysis is also called possibility regression analysis. Thus, Tanaka's approach to possibility regression analysis uses linear programming with inclusion relations instead of the measure of best fitting by residuals. The fuzzy least-squares approach, by contrast, does not consider inclusion relations; it directly uses the best-fitting measure by residuals and the information included in the input–output data under fuzzy consideration.

Fuzzy regression models and estimation techniques have been widely studied and applied in diverse areas (see [1,4,9–11]). Generally, these fuzzy regression methods can be roughly divided into two categories. The first is based on Tanaka's LP approach (see [9–14]). The second is based on the fuzzy least-squares approach (see [1,6,16–17]).

A fuzzy number $A$ is defined as a convex normalized fuzzy set of the real line $\mathbb{R}$ so that there exists exactly one $x_0 \in \mathbb{R}$ with $\mu_A(x_0) = 1$, and its membership $\mu_A(x)$ is piecewise continuous. A fuzzy number $M$ is of the LR-type if there are $m$, $\alpha > 0$, $\beta > 0$ in $\mathbb{R}$ so that

$$\mu_M(x) = \begin{cases} L\!\left(\dfrac{m - x}{\alpha}\right) & \text{if } x \le m, \\[2ex] R\!\left(\dfrac{x - m}{\beta}\right) & \text{if } x \ge m, \end{cases}$$

where $L$ and $R$ are decreasing functions from $\mathbb{R}^+$ to $[0, 1]$, and $L(x) = R(x) = 1$ for $x \le 0$; $0$ for $x \ge 1$. $m$ is called the center value of $M$, and $\alpha$ and $\beta$ are called the left and right spreads, respectively. Symbolically, $M$ is denoted by $M = (m, \alpha, \beta)_{LR}$ (see [20]).

Let $M$ and $N$ be two LR-type fuzzy numbers with $M = (m, \alpha, \beta)_{LR}$ and $N = (n, \gamma, \delta)_{LR}$. Then, by the extension principle, the following operations are defined (see [7]):

$$M + N = (m + n, \alpha + \gamma, \beta + \delta)_{LR},$$
$$-N = (-n, \delta, \gamma)_{RL},$$
$$(m, \alpha, \beta)_{LR} - (n, \gamma, \delta)_{RL} = (m - n, \alpha + \delta, \beta + \gamma)_{LR},$$
$$\lambda (m, \alpha, \beta)_{LR} = (\lambda m, \lambda\alpha, \lambda\beta)_{LR} \quad \text{when } \lambda > 0,$$
$$\lambda (m, \alpha, \beta)_{LR} = (\lambda m, -\lambda\beta, -\lambda\alpha)_{RL} \quad \text{when } \lambda < 0.$$

For an LR-type fuzzy number $A = (a, \alpha, \beta)_{LR}$, if $L$ and $R$ are of the form $T(x) = 1 - x$ for $0 \le x \le 1$ and $0$ otherwise, $A$ is called a triangular fuzzy number, denoted by $A = (a, \alpha, \beta)_T$. If $\alpha = \beta$, $A = (a, \alpha, \beta)_T$ is called a symmetrical triangular fuzzy number, denoted by $A = (a, \alpha)_T$.

Tanaka et al. [14] considered the following FLR model:

$$Y^* = A_0 + A_1 x_1 + \cdots + A_p x_p,$$

where $x = (x_0, x_1, \ldots, x_p)$ are non-fuzzy inputs (with $x_0 = 1$) and $A_0, A_1, \ldots, A_p$ are symmetrical triangular fuzzy parameters with $A_i = (a_i, \alpha_i)_T$, $\alpha_i \ge 0$, $i = 0, 1, \ldots, p$. This model is also called a non-interactive FLR model. Then

$$Y^* = A_0 x_0 + A_1 x_1 + \cdots + A_p x_p = \left( \sum_{i=0}^{p} a_i x_i,\ \sum_{i=0}^{p} \alpha_i |x_i| \right)_T$$

is also a symmetrical triangular fuzzy output. Let $\{(x_j, Y_j),\ j = 1, \ldots, n\}$ be a data set with $Y_j = (y_j, e_j)_T$, $j = 1, \ldots, n$. Different methods for estimating the parameters $a_i, \alpha_i$, $i = 1, \ldots, p$, were studied in Tanaka's approach [11–14] and also in the fuzzy least-squares approach [6,16–17].
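As a small illustration of the non-interactive model above, the following Python sketch computes the symmetrical triangular output $(\sum a_i x_i, \sum \alpha_i |x_i|)_T$. The class name `SymTriangular`, the function `flr_output`, and the parameter values are hypothetical and chosen only for this example; they do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymTriangular:
    """Symmetrical triangular fuzzy number (center, spread)_T."""
    center: float
    spread: float

    def membership(self, x: float) -> float:
        # T(z) = 1 - z on [0, 1], applied to |x - center| / spread.
        if self.spread == 0:
            return 1.0 if x == self.center else 0.0
        return max(0.0, 1.0 - abs(x - self.center) / self.spread)


def flr_output(params, x):
    """Y* = sum_i A_i x_i = (sum_i a_i x_i, sum_i alpha_i |x_i|)_T."""
    center = sum(A.center * xi for A, xi in zip(params, x))
    spread = sum(A.spread * abs(xi) for A, xi in zip(params, x))
    return SymTriangular(center, spread)


# Illustrative (made-up) parameters A_0 = (1, 0.5)_T and A_1 = (2, 0.25)_T.
params = [SymTriangular(1.0, 0.5), SymTriangular(2.0, 0.25)]
y_star = flr_output(params, [1.0, -3.0])  # x_0 = 1, x_1 = -3
print(y_star)  # SymTriangular(center=-5.0, spread=1.25)
```

Note that the spread uses $|x_i|$, so a negative input still widens the output rather than narrowing it.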


To consider the interaction among fuzzy parameters, Tanaka and Ishibuchi [12] proposed another estimation method for the interactive FLR model $Y^* = A_0 + A_1 x_1 + \cdots + A_p x_p = \tilde{A}x$ by defining the quadratic membership functions of fuzzy parameters as

$$\mu_{\tilde{A}}(\omega) = \max\{0,\ 1 - (\omega - a)^\top C^{-1} (\omega - a)\},$$

where the fuzzy vector $\tilde{A}$ is called a quadratic fuzzy vector, $a = (a_0, a_1, \ldots, a_p)^\top$ is the center vector of $\tilde{A}$, and $C$ is a symmetrical positive definite matrix called the spread matrix of $\tilde{A}$. The spreads and interactions of the fuzzy parameters are represented by the spread matrix $C$. The membership function of $Y^* = \tilde{A}x$ can then be written as

$$\mu_{Y^*}(y) = \max\{0,\ 1 - (y - a^\top x)^2 / x^\top C x\}.$$

Tanaka and Ishibuchi [12] formulated the estimation problem as:

(1) Obtain a center vector $a^*$ of $\tilde{A}$ that minimizes $\sum_{j=1}^{n} (y_j - a^\top x_j)^2$.

(2) Find the coefficient matrix $C^*$ that minimizes $J = \sum_{j=1}^{n} x_j^\top C x_j$ subject to

$$a^{*\top} x_j + ((1 - H)\, x_j^\top C x_j)^{1/2} \ge y_j + ((1 - H)\, e_j)^{1/2},$$
$$-a^{*\top} x_j + ((1 - H)\, x_j^\top C x_j)^{1/2} \ge -y_j + ((1 - H)\, e_j)^{1/2}, \quad j = 1, \ldots, n.$$
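The membership function of the fuzzy output can be evaluated directly from a center vector and a spread matrix. The sketch below is only illustrative: the function names are hypothetical, and the numbers are the estimates that Tanaka's method yields for the non-fuzzy data of Example 1 in Section 3, evaluated at the first input $x_1 = 2$.

```python
def quad_form(x, C):
    """Return x' C x for a vector x and a square matrix C (lists of floats)."""
    n = len(x)
    return sum(x[i] * C[i][k] * x[k] for i in range(n) for k in range(n))


def output_membership(y, x, a, C):
    """mu_{Y*}(y) = max{0, 1 - (y - a'x)^2 / x'Cx}."""
    center = sum(ai * xi for ai, xi in zip(a, x))
    spread = quad_form(x, C)
    if spread <= 0.0:
        # Degenerate (non-fuzzy) output: full membership only at the center.
        return 1.0 if y == center else 0.0
    return max(0.0, 1.0 - (y - center) ** 2 / spread)


a_star = [3.79, 0.41]
C_star = [[3.77, -0.26], [-0.26, 0.05]]
x = [1.0, 2.0]  # x_0 = 1, x_1 = 2

center = sum(ai * xi for ai, xi in zip(a_star, x))
spread = quad_form(x, C_star)
print(round(center, 2), round(spread, 2))  # 4.61 2.93
print(output_membership(center, x, a_star, C_star))  # 1.0
```

The center 4.61 and spread 2.93 agree with the first column of Table 3.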

If $C^*$ is not positive semi-definite, then the orthogonal conditions $x_i^\top C x_j = 0$ for all $i, j = 1, \ldots, n$ with $i \ne j$ must be added to the optimization problem. However, there is no alternative fuzzy least-squares method for these interactive FLR models. In this paper we propose a robust fuzzy least-squares algorithm (RFLSA) for the interactive FLR models. In general, fuzzy least squares are sensitive to outliers and therefore not robust in the estimation of parameters. We use a cluster-wise type of fuzzy least squares in conjunction with the so-called noise cluster, and find that this fuzzy least-squares approach performs well for interactive FLR models. These fuzzy least-squares methods are described in Section 2. Section 3 gives the numerical results, and conclusions are drawn in Section 4.

2. Fuzzy least squares for interactive FLR models

For the (non-interactive) FLR model

$$Y^* = A_0 + A_1 x_1 + \cdots + A_p x_p,$$

Yang and Ko [16] proposed a cluster-wise fuzzy regression analysis by embedding fuzzy clustering into FLR models. Let $F_{LR}(\mathbb{R})$ be the space of all LR-type fuzzy numbers. They defined a metric $d_{LR}$ on $F_{LR}(\mathbb{R})$ as follows:

$$d_{LR}^2(A_1, A_2) = (a_1 - a_2)^2 + [(a_1 - l\alpha_1) - (a_2 - l\alpha_2)]^2 + [(a_1 + r\beta_1) - (a_2 + r\beta_2)]^2,$$

where $A_1 = (a_1, \alpha_1, \beta_1)_{LR}$ and $A_2 = (a_2, \alpha_2, \beta_2)_{LR}$ are LR-type fuzzy numbers in $F_{LR}(\mathbb{R})$, $l = \int_0^1 L^{-1}(\omega)\, d\omega$ and $r = \int_0^1 R^{-1}(\omega)\, d\omega$, provided $L^{-1}$ and $R^{-1}$ are integrable over the interval $[0, 1]$.

Fuzzy clustering is a powerful tool in pattern recognition. Fuzzy c-means (FCM) clustering is the most popular; see [3,15] for examples. Yang and Ko [16] embedded FCM clustering into FLR models and created a cluster-wise fuzzy regression analysis, described as follows. Suppose that the data set $\{(x_j, Y_j),\ j = 1, \ldots, n\}$ with $Y_j = (y_j, e_j)_T$, $j = 1, \ldots, n$, comes from $c$ FLR models,

$$Y_j^* = A_{i0} + A_{i1} x_{1j} + \cdots + A_{ip} x_{pj}, \quad i = 1, \ldots, c;\ j = 1, \ldots, n.$$
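The metric $d_{LR}$ defined above can be sketched in a few lines. In this illustration, fuzzy numbers are plain tuples `(center, left_spread, right_spread)`, and the default values $l = r = 1/2$ are an assumption that corresponds to the triangular reference function $T(x) = 1 - x$, for which $\int_0^1 T^{-1}(\omega)\, d\omega = \int_0^1 (1 - \omega)\, d\omega = 1/2$. The function name and sample numbers are hypothetical.

```python
def d2_lr(A1, A2, l=0.5, r=0.5):
    """Squared distance d_LR^2 between two LR-type fuzzy numbers.

    A1, A2: tuples (center, left_spread, right_spread).
    l, r:   integrals of L^{-1} and R^{-1} over [0, 1]
            (1/2 for triangular reference functions).
    """
    a1, alpha1, beta1 = A1
    a2, alpha2, beta2 = A2
    center_term = (a1 - a2) ** 2
    left_term = ((a1 - l * alpha1) - (a2 - l * alpha2)) ** 2
    right_term = ((a1 + r * beta1) - (a2 + r * beta2)) ** 2
    return center_term + left_term + right_term


A = (2.0, 1.0, 1.0)
B = (1.0, 3.0, 2.0)
print(d2_lr(A, B))  # 5.25
```

The three terms compare the centers and the two spread-weighted endpoints, so two fuzzy numbers with the same center but different spreads are still a positive distance apart.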


The values $\mu_{ij}$ represent the membership of $x_j$ in the $i$th class, where $\mu_{ij}$ takes values in the interval $[0, 1]$ so that $\sum_{i=1}^{c} \mu_{ij} = 1$ for all $j$. The set $\{\mu_1, \ldots, \mu_c\}$ with $\mu_i = (\mu_{i1}, \ldots, \mu_{in})$ is called a fuzzy c-partition. A cluster-wise fuzzy least-squares objective function $L(a, \alpha, \mu)$ then becomes

$$L(a, \alpha, \mu) = \sum_{i=1}^{c} \sum_{j=1}^{n} \mu_{ij}^m\, d_{LR}^2(Y_j,\ A_{i0} + A_{i1} x_{1j} + \cdots + A_{ip} x_{pj}),$$

where $m \in [1, \infty)$ is the index of fuzziness in clustering. Let $h(a, \alpha, \mu, \lambda)$ be the Lagrangian with

$$h(a, \alpha, \mu, \lambda) = L(a, \alpha, \mu) + \sum_{j=1}^{n} \lambda_j \left( \sum_{i=1}^{c} \mu_{ij} - 1 \right).$$

Setting the first derivatives of $h$ with respect to all parameters equal to zero yields the following necessary conditions for a minimizer $(a, \alpha, \mu)$ of $L$:

$$a_{is} = \frac{\sum_{j=1}^{n} \mu_{ij}^m x_{sj} \left( y_j - \sum_{k=0,\, k \ne s}^{p} a_{ik} x_{kj} \right)}{\sum_{j=1}^{n} \mu_{ij}^m x_{sj}^2}, \quad i = 1, \ldots, c;\ s = 0, 1, \ldots, p, \tag{2.1}$$

$$\alpha_{is} = \frac{\sum_{j=1}^{n} \mu_{ij}^m x_{sj} \left( e_j - \sum_{k=0,\, k \ne s}^{p} \alpha_{ik} |x_{kj}| \right)}{\sum_{j=1}^{n} \mu_{ij}^m x_{sj}^2}, \quad i = 1, \ldots, c;\ s = 0, 1, \ldots, p, \tag{2.2}$$

and

$$\mu_{ij} = \left[ \sum_{k=1}^{c} \frac{\left( d_{LR}^2(Y_j,\ A_{i0} + A_{i1} x_{1j} + \cdots + A_{ip} x_{pj}) \right)^{1/(m-1)}}{\left( d_{LR}^2(Y_j,\ A_{k0} + A_{k1} x_{1j} + \cdots + A_{kp} x_{pj}) \right)^{1/(m-1)}} \right]^{-1}, \quad i = 1, \ldots, c;\ j = 1, \ldots, n. \tag{2.3}$$
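The membership update (2.3) depends on the data only through the squared distances to each cluster's regression model. The following sketch, assuming those distances are already available as a `c x n` nested list, shows the update; the function name and the toy distances are hypothetical. The special handling of zero distances is a common convention for fuzzy c-means implementations and is an assumption here, since (2.3) is undefined in that case.

```python
def update_memberships(d2, m=2.0):
    """FCM-style update (2.3).

    d2[i][j] is the squared distance between datum j and the model of
    cluster i. Returns mu with mu[i][j] in [0, 1] and columns summing to 1.
    """
    c, n = len(d2), len(d2[0])
    expo = 1.0 / (m - 1.0)
    mu = [[0.0] * n for _ in range(c)]
    for j in range(n):
        col = [d2[i][j] for i in range(c)]
        zeros = [i for i, d in enumerate(col) if d == 0.0]
        if zeros:
            # Datum lies exactly on one or more models: split membership.
            for i in zeros:
                mu[i][j] = 1.0 / len(zeros)
            continue
        for i in range(c):
            mu[i][j] = 1.0 / sum((col[i] / col[k]) ** expo for k in range(c))
    return mu


# Two clusters, two data points (made-up squared distances).
mu = update_memberships([[1.0, 4.0], [3.0, 4.0]], m=2.0)
print(mu)
```

For the first datum the distances are 1 and 3, so with $m = 2$ its memberships are approximately 0.75 and 0.25; the second datum is equidistant and gets 0.5 in each cluster.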

Therefore, a cluster-wise fuzzy least-squares algorithm for computing a minimizer of $L(a, \alpha, \mu)$ iterates through the necessary conditions (2.1)–(2.3) with the constraints $\alpha_{is} \ge 0$, $i = 1, \ldots, c$; $s = 0, 1, \ldots, p$.

Now let us consider interactive FLR models. Let $a \in \mathbb{R}^n$ and let $A$ be an $n \times n$ positive definite matrix. A fuzzy vector $\tilde{A}$ is called a quadratic fuzzy vector on $\mathbb{R}^n$ if it has a quadratic membership function

$$\mu_{\tilde{A}}(x) = 1 - \min\{1,\ (x - a)^\top A^{-1} (x - a)\} = \max\{0,\ 1 - (x - a)^\top A^{-1} (x - a)\},$$

where $a$ is called the center vector of $\tilde{A}$ and $A$ is called the spread matrix of $\tilde{A}$. Symbolically, $\tilde{A}$ is denoted by $\tilde{A} = (a, A)_Q$ (see [12]). Let $F_Q(\mathbb{R}^{p+1})$ denote the set of all quadratic fuzzy vectors on $\mathbb{R}^{p+1}$. Consider the following interactive FLR model with interactive fuzzy parameters:

$$Y^* = A_0 + A_1 x_1 + \cdots + A_p x_p = \tilde{A}x,$$

where $x = (x_0, x_1, \ldots, x_p)^\top$ is a non-fuzzy input and $\tilde{A}$ is a fuzzy vector parameter in $F_Q(\mathbb{R}^{p+1})$. In order to derive fuzzy least-squares algorithms for the interactive FLR model $Y^* = \tilde{A}x$, a new type of distance $d_Q^2$ on the space $F_Q(\mathbb{R}^n)$ is proposed. Let $\tilde{A} = (a, A)_Q$ and $\tilde{B} = (b, B)_Q$ be any two quadratic fuzzy vectors in $F_Q(\mathbb{R}^n)$. A distance $d_Q^2$ between $\tilde{A}$ and $\tilde{B}$ is defined as

$$d_Q^2(\tilde{A}, \tilde{B}) = \|a - b\|^2 + \mathrm{tr}((A - B)^\top (A - B)).$$
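Since $\mathrm{tr}((A - B)^\top (A - B))$ is the sum of the squared entries of $A - B$, the distance $d_Q^2$ is simple to compute. The sketch below uses plain Python lists; the function name and the sample vectors are hypothetical and serve only to illustrate the definition.

```python
def d2_q(a, A, b, B):
    """d_Q^2 between quadratic fuzzy vectors (a, A)_Q and (b, B)_Q.

    a, b: center vectors (lists of floats).
    A, B: spread matrices (lists of rows).
    """
    center_term = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    # tr((A - B)'(A - B)) equals the sum of squared entries of A - B.
    spread_term = sum(
        (A[i][k] - B[i][k]) ** 2
        for i in range(len(A))
        for k in range(len(A[0]))
    )
    return center_term + spread_term


# Two-dimensional example.
print(d2_q([1.0, 2.0], [[2.0, 0.0], [0.0, 1.0]],
           [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))  # 6.0

# One-dimensional case: reduces to (center diff)^2 + (spread diff)^2.
print(d2_q([7.0], [[4.0]], [7.5], [[3.0]]))  # 1.25
```

The one-dimensional case is the form that appears when an observed fuzzy output is compared with a predicted fuzzy output.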


On the basis of the Cauchy–Schwarz inequality and Cauchy sequences, it can be proven that $(F_Q(\mathbb{R}^n), d_Q)$ constitutes a complete metric space. The proposed distance $d_Q$ is used to define a fuzzy least-squares objective function. Using the well-known extension principle, Tanaka and Ishibuchi [12] demonstrated that if $\tilde{A} = (a, D)_Q$ then $\tilde{A}x = (x^\top a, x^\top D x)_Q$. For the interactive FLR model $Y^* = \tilde{A}x$, let $\{(x_j, Y_j),\ j = 1, \ldots, n\}$ be the data set with non-fuzzy inputs $x_j = (x_{0j}, x_{1j}, \ldots, x_{pj})^\top$ and fuzzy outputs $Y_j = (y_j, e_j)_Q = (y_j, \sqrt{e_j}, \sqrt{e_j})_{LR}$, where $L(z) = R(z) = 1 - z^2$. Thus, one has

$$Y_j^* = \tilde{A}x_j = (x_j^\top a,\ x_j^\top D x_j)_Q = (x_j^\top a,\ (x_j^\top D x_j)^{1/2},\ (x_j^\top D x_j)^{1/2})_{LR}.$$

If the distance $d_{LR}$ defined by Yang and Ko [16] is used, then the fuzzy least-squares objective function is

$$J_{LR}(a, D) = \sum_{j=1}^{n} d_{LR}^2(Y_j, Y_j^*) = \sum_{j=1}^{n} \left[ (y_j - x_j^\top a)^2 + \left( \left( y_j - \tfrac{2}{3}\sqrt{e_j} \right) - \left( x_j^\top a - \tfrac{2}{3}\sqrt{x_j^\top D x_j} \right) \right)^2 + \left( \left( y_j + \tfrac{2}{3}\sqrt{e_j} \right) - \left( x_j^\top a + \tfrac{2}{3}\sqrt{x_j^\top D x_j} \right) \right)^2 \right].$$

$J_{LR}(a, D)$ therefore becomes quite complicated. If the new type of distance $d_Q$ is used, then the fuzzy least-squares objective function becomes

$$J_Q(a, D) = \sum_{j=1}^{n} d_Q^2(Y_j, Y_j^*) = \sum_{j=1}^{n} \left[ (y_j - x_j^\top a)^2 + (e_j - x_j^\top D x_j)^2 \right].$$

For the optimization of $J_Q(a, D)$, $D$ must be at least positive semi-definite. Therefore, the fuzzy least-squares estimation is the minimization of $J_Q(a, D)$ subject to the constraint that $D$ is positive semi-definite. Setting the first derivatives of $J_Q(a, D)$ with respect to $a$ and $D$ equal to zero yields the following necessary conditions for a minimizer $(a, D)$ of $J_Q$:

$$a_s = \frac{\sum_{j=1}^{n} x_{sj} \left( y_j - \sum_{i=0,\, i \ne s}^{p} x_{ij} a_i \right)}{\sum_{j=1}^{n} x_{sj}^2}, \quad s = 0, 1, \ldots, p, \tag{2.4}$$

$$D_{st} = \frac{\sum_{j=1}^{n} x_{sj} x_{tj} \left( e_j - \sum_{i=0,\, i \ne s}^{p} \sum_{k=0,\, k \ne t}^{p} x_{ij} D_{ik} x_{kj} \right)}{\sum_{j=1}^{n} x_{sj}^2 x_{tj}^2}, \quad s = 0, 1, \ldots, p;\ t = 0, 1, \ldots, p. \tag{2.5}$$
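Because $J_Q(a, D)$ separates into one least-squares problem in $a$ and one in $D$, a quick numerical check is possible on the fuzzy data of Tanaka and Ishibuchi [12] used in Example 2 (Table 4). The sketch below is not the coordinate-wise iteration (2.4)–(2.5); as a stand-in it solves the two sets of normal equations directly, assumes a symmetric $D$ so that $x_j^\top D x_j = D_{00} + 2 D_{01} x_j + D_{11} x_j^2$, and does not enforce positive semi-definiteness. All function names are hypothetical.

```python
def solve(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][k] * z[k] for k in range(r + 1, n))
        z[r] = s / aug[r][r]
    return z


def least_squares(features, targets):
    """Minimize sum_j (t_j - f_j . z)^2 via the normal equations."""
    k = len(features[0])
    G = [[sum(f[r] * f[s] for f in features) for s in range(k)]
         for r in range(k)]
    g = [sum(f[r] * t for f, t in zip(features, targets)) for r in range(k)]
    return solve(G, g)


# Table 4 data: inputs x_j and fuzzy outputs Y_j = (y_j, e_j)_Q.
xs = [5.0, 8.0, 11.0, 14.0, 17.0]
ys = [7.0, 9.0, 10.0, 11.0, 13.0]
es = [4.0, 1.0, 1.0, 4.0, 9.0]

# Centers: minimize sum (y_j - a0 - a1 x_j)^2.
a = least_squares([[1.0, x] for x in xs], ys)
# Spreads: minimize sum (e_j - D00 - 2 D01 x_j - D11 x_j^2)^2.
d00, d01, d11 = least_squares([[1.0, 2.0 * x, x * x] for x in xs], es)

print([round(v, 2) for v in a])                   # [4.87, 0.47]
print(round(d00, 2), round(d01, 2), round(d11, 2))  # 14.57 -1.44 0.15
```

These values agree with the RFLSA estimates reported for Example 2, and the resulting $D$ has positive trace and determinant, so it is positive definite for this data set.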


Based on the orthogonal conditions $x_i^\top D x_j = 0$ for all $i \ne j$ proposed by Tanaka and Ishibuchi [12], a positive semi-definite $D$ can be obtained by adding these orthogonal conditions as constraints to the optimization problem. Thus, if the solution $D$ in (2.5) is positive semi-definite, then it is a solution of the optimization problem. Otherwise, the orthogonal conditions $x_i^\top D x_j = 0$ for all $i \ne j$ are added as constraints to the optimization problem $J_Q(a, D)$.

Suppose that the data set $\{(x_j, Y_j),\ j = 1, \ldots, n\}$ with $Y_j = (y_j, e_j)_Q$ and $x_j = (x_{0j}, x_{1j}, \ldots, x_{pj})^\top$, $j = 1, \ldots, n$, comes from $c$ interactive FLR models,

$$Y_j^* = \tilde{A}_i x_j, \quad i = 1, \ldots, c;\ j = 1, \ldots, n,$$

where $\tilde{A}_i = (a_i, D_i)_Q$ are quadratic fuzzy vector parameters in $F_Q(\mathbb{R}^{p+1})$. Let $\mu_{ij}$ be the membership of $x_j$ in the $i$th class. A cluster-wise fuzzy least-squares objective function is then

$$J_Q(a, D, \mu) = \sum_{i=1}^{c} \sum_{j=1}^{n} \mu_{ij}^m\, d_Q^2(Y_j, \tilde{A}_i x_j).$$

To achieve robustness, a noise cluster is combined with the cluster-wise fuzzy least squares. The concept of a noise cluster proposed by Dave [5] is that all points have an equal a priori opportunity of belonging to a noise cluster. However, the "good" points increase their chance of being classified into a "good" cluster as the clustering algorithm progresses. It is hoped that all of the noise points (or outliers) are dumped into the noise cluster as the clustering algorithm proceeds. Therefore, Dave [5] defined a noise prototype as follows: a point $v$ is called a noise prototype if the distances $d(x_j, v)$ between the data points $x_j$ and $v$ are all equal to a constant $\delta$, i.e. $d(x_j, v) = \delta$ for $j = 1, \ldots, n$. Now the noise cluster concept is applied to the cluster-wise fuzzy least squares. Assume that cluster $(c + 1)$ is a noise cluster. Then the objective function becomes

$$J_Q^0(a, D, \mu) = \sum_{i=1}^{c+1} \sum_{j=1}^{n} \mu_{ij}^m\, d_{ij}^2,$$

where

$$d_{ij}^2 = \begin{cases} d_Q^2(Y_j, \tilde{A}_i x_j), & i = 1, \ldots, c;\ j = 1, \ldots, n, \\ \delta^2, & i = c + 1;\ j = 1, \ldots, n. \end{cases}$$

The following necessary conditions for a minimizer $(a, D, \mu)$ of $J_Q^0$ are thus obtained:

$$a_{is} = \frac{\sum_{j=1}^{n} \mu_{ij}^m x_{sj} \left( y_j - \sum_{k=0,\, k \ne s}^{p} a_{ik} x_{kj} \right)}{\sum_{j=1}^{n} \mu_{ij}^m x_{sj}^2}, \quad i = 1, \ldots, c;\ s = 0, 1, \ldots, p, \tag{2.6}$$

$$D_{ist} = \frac{\sum_{j=1}^{n} \mu_{ij}^m x_{sj} x_{tj} \left( e_j - \sum_{l=0,\, l \ne s}^{p} \sum_{k=0,\, k \ne t}^{p} x_{lj} D_{ilk} x_{kj} \right)}{\sum_{j=1}^{n} \mu_{ij}^m x_{sj}^2 x_{tj}^2}, \quad i = 1, \ldots, c;\ s = 0, 1, \ldots, p;\ t = 0, 1, \ldots, p, \tag{2.7}$$

and

$$\mu_{ij} = \left[ \sum_{k=1}^{c+1} \frac{(d_{ij}^2)^{1/(m-1)}}{(d_{kj}^2)^{1/(m-1)}} \right]^{-1}, \quad i = 1, \ldots, c + 1;\ j = 1, \ldots, n, \tag{2.8}$$

with

$$d_{ij}^2 = \begin{cases} d_Q^2(Y_j, \tilde{A}_i x_j), & i = 1, \ldots, c;\ j = 1, \ldots, n, \\ \delta^2, & i = c + 1;\ j = 1, \ldots, n, \end{cases}$$

and

$$\delta^2 = \frac{\lambda}{n} \sum_{j=1}^{n} d_Q^2(Y_j, \tilde{A} x_j).$$
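The membership update (2.8) treats the noise cluster as one extra row whose squared distance to every datum is the constant $\delta^2$. The sketch below illustrates this for $c = 1$ with made-up distances; the function names, the default $m = 2$, and the default multiplier `lam = 1.0` in the noise distance are assumptions for illustration, not settings taken from the paper.

```python
def noise_distance(d2_row, lam=1.0):
    """delta^2 = (lam / n) * sum_j d_Q^2 for a single regression cluster."""
    return lam * sum(d2_row) / len(d2_row)


def memberships_with_noise(d2, delta2, m=2.0):
    """Update (2.8): d2 is c x n; returns (c + 1) x n, last row = noise."""
    rows = [list(r) for r in d2] + [[delta2] * len(d2[0])]
    expo = 1.0 / (m - 1.0)
    n = len(rows[0])
    mu = [[0.0] * n for _ in rows]
    for j in range(n):
        col = [r[j] for r in rows]
        zeros = [i for i, d in enumerate(col) if d == 0.0]
        if zeros:
            # Convention (assumed): exact fits share the full membership.
            for i in zeros:
                mu[i][j] = 1.0 / len(zeros)
            continue
        for i in range(len(rows)):
            mu[i][j] = 1.0 / sum((col[i] / col[k]) ** expo
                                 for k in range(len(rows)))
    return mu


# One regression cluster; the last datum is far from the fitted model.
d2 = [[1.0, 1.0, 1.0, 9.0]]
delta2 = noise_distance(d2[0])          # (1 + 1 + 1 + 9) / 4 = 3.0
mu = memberships_with_noise(d2, delta2)
print([round(u, 2) for u in mu[0]])     # [0.75, 0.75, 0.75, 0.25]
print([round(u, 2) for u in mu[1]])     # [0.25, 0.25, 0.25, 0.75]
```

A datum whose squared residual is well above $\delta^2$ receives most of its membership in the noise row, which is what lowers its weight in the updates (2.6) and (2.7).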

Therefore, when $c = 1$, an RFLSA for interactive FLR models is given through iterations (2.6)–(2.8) with the orthogonal conditions $x_i^\top D x_j = 0$ for all $i \ne j$. This newly proposed RFLSA is used to estimate the interactive fuzzy parameters of the interactive FLR models. The numerical results are presented in the next section.

3. Numerical results

In Section 2, an RFLSA was constructed for the estimation of the interactive FLR models. In this section, some numerical results are presented and discussed.

Example 1 (non-fuzzy data). The input–output data from Tanaka and Ishibuchi [12] are used here and shown in Table 1. The interactive FLR model is $Y_j^* = A_0 + A_1 x_j = \tilde{A}x_j$.

Table 1. Input–output data [12]

No.                  1      2      3      4      5      6      7      8      9      10
x_j                  2      4      6      9      12     13     14     16     19     20
Y_j = (y_j, 0)_Q     4      7      5      8      7      9      12     9      14     10

Using RFLSA, one obtains $a^* = (3.79, 0.41)^\top$,

$$D^* = \begin{pmatrix} 0.00 & 0.00 \\ 0.00 & 0.00 \end{pmatrix}.$$

The predicted values $Y_j^* = (x_j^\top a^*, x_j^\top D^* x_j)_Q = (y_j^*, 0)_Q$ and the sum of squares of residuals (SSR) are shown in Table 2.

Table 2. Predicted values and SSR with RFLSA (SSR = 25.35)

No.      1      2      3      4      5      6      7      8      9      10
y_j*     4.61   5.43   6.25   7.48   8.70   9.11   9.52   10.34  11.57  11.98

Tanaka and Ishibuchi [12] indicated that $a^* = (3.79, 0.41)^\top$,

$$D^* = \begin{pmatrix} 3.77 & -0.26 \\ -0.26 & 0.05 \end{pmatrix}.$$

The predicted values $Y_j^* = (x_j^\top a^*, x_j^\top D^* x_j)_Q = (y_j^*, e_j^*)_Q$ and the SSR are shown in Table 3.

Table 3. Predicted values and SSR with Tanaka's method (SSR = 512.82)

No.      1      2      3      4      5      6      7      8      9      10
y_j*     4.61   5.43   6.25   7.48   8.70   9.11   9.52   10.34  11.57  11.98
e_j*     2.93   2.47   2.41   3.07   4.61   5.32   6.12   8.04   11.65  13.05

In this dataset, the estimate $a^* = (3.79, 0.41)^\top$ obtained by RFLSA is exactly the same as that obtained by Tanaka's method. RFLSA gives $D^* = 0$, which means that the predicted values $Y_j^* = (x_j^\top a^*, x_j^\top D^* x_j)_Q = (y_j^*, 0)_Q$ are non-fuzzy. Tanaka's method obtains

$$D^* = \begin{pmatrix} 3.77 & -0.26 \\ -0.26 & 0.05 \end{pmatrix},$$

which means that the predicted values $Y_j^* = (x_j^\top a^*, x_j^\top D^* x_j)_Q$ are fuzzy data. The two methods thus give different predicted values: one is non-fuzzy and the other is fuzzy. This is because Tanaka's method uses the vagueness indices of the model and inclusion relations, whereas the proposed RFLSA uses the best-fitting measure by residuals and takes the structure and information in the data set into account at each iteration of the algorithm.

Example 2 (fuzzy data). The input–output data from Tanaka and Ishibuchi [12] are shown in Table 4. The interactive FLR model is $Y_j^* = A_0 + A_1 x_j = \tilde{A}x_j$.

Table 4. Input–output data [12]

No.                    1        2        3         4         5
x_j                    5        8        11        14        17
Y_j = (y_j, e_j)_Q     (7, 4)   (9, 1)   (10, 1)   (11, 4)   (13, 9)

Using RFLSA, one obtains $a^* = (4.87, 0.47)^\top$,

$$D^* = \begin{pmatrix} 14.57 & -1.44 \\ -1.44 & 0.15 \end{pmatrix}.$$

The predicted values $Y_j^* = (x_j^\top a^*, x_j^\top D^* x_j)_Q = (y_j^*, e_j^*)_Q$ and the SSR are shown in Table 5.

Table 5. Predicted values and SSR with RFLSA (SSR = 0.51)

No.                        1              2              3               4               5
Y_j* = (y_j*, e_j*)_Q      (7.20, 3.92)   (8.60, 1.14)   (10.00, 1.08)   (11.40, 3.47)   (12.80, 9.12)

Table 6. Predicted values and SSR with Tanaka's method (SSR = 18.69)

No.                        1              2              3               4               5
Y_j* = (y_j*, e_j*)_Q      (7.20, 4.84)   (8.60, 1.96)   (10.00, 2.27)   (11.40, 5.47)   (12.80, 12.45)

Table 7. Input–output data with outliers

No.                  1      2      3      4      5      6      7      8      9      10     11
x_j                  2      4      6      9      12     13     14     16     19     20     3
Y_j = (y_j, 0)_Q     4      7      5      8      7      9      12     9      14     10     30

Tanaka and Ishibuchi [12] indicated that $a^* = (4.87, 0.47)^\top$,

$$D^* = \begin{pmatrix} 16.72 & -1.63 \\ -1.63 & 0.18 \end{pmatrix}.$$

The predicted values $Y_j^* = (x_j^\top a^*, x_j^\top D^* x_j)_Q = (y_j^*, e_j^*)_Q$ and the SSR are shown in Table 6. In this dataset, the estimate $a^* = (4.87, 0.47)^\top$ is the same for both RFLSA and Tanaka's method. In fact, this estimate is exactly the same as the classical regression estimate when the spreads $e_j$ are discarded. The RFLSA obtains

$$D^* = \begin{pmatrix} 14.57 & -1.44 \\ -1.44 & 0.15 \end{pmatrix}$$

and Tanaka's method obtains

$$D^* = \begin{pmatrix} 16.72 & -1.63 \\ -1.63 & 0.18 \end{pmatrix}.$$

Both estimates $D^*$ are very close.

Example 3 (outlier detection). The input–output data are shown in Table 7. This data set is derived from the dataset of Example 1 by adding an outlier $(x_{11}, Y_{11}) = (3, 30)$. For the dataset in Table 7, Tanaka's method obtains $a^* = (11.16, -0.07)^\top$. The predicted line is shown in Fig. 1. According to Fig. 1, the predicted line is greatly affected by the outlier $(x_{11}, Y_{11}) = (3, 30)$. Thus, Tanaka's method is quite sensitive to outliers.

Fig. 1. Input–output data and predicted line using Tanaka's method (axes $x$ and $y$; fitted line $y = 11.16 - 0.07x$; outlier marked at (3, 30)).

Using RFLSA, one obtains $a^* = (3.86, 0.40)^\top$,

$$D^* = \begin{pmatrix} 0.00 & 0.00 \\ 0.00 & 0.00 \end{pmatrix},$$

which is shown in Fig. 2. The predicted values $Y_j^* = (x_j^\top a^*, x_j^\top D^* x_j)_Q = (y_j^*, 0)_Q$, the memberships $\mu_j$, $j = 1, \ldots, 11$, and the SSR are shown in Table 8. According to the results in Table 8, the membership $\mu_{11}$ of the point (3, 30) is 0.07, i.e. its membership in the noise cluster is 0.93. The point (3, 30) is therefore detected as an outlier. The estimate $a^* = (3.86, 0.40)^\top$ is also close to $a^* = (3.79, 0.41)^\top$, obtained in Example 1 without the outlier (3, 30). Therefore, the proposed RFLSA is robust and also acts as an outlier detector.

Fig. 2. Input–output data and predicted line using RFLSA (axes $x$ and $y$; fitted line $y = 3.86 + 0.40x$; outlier marked at (3, 30)).
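The behaviour seen in Example 3 can be explored with a short sketch of the $c = 1$ iteration on the data of Table 7. Several points here are assumptions rather than the authors' settings: the fuzziness index `m = 2`, the noise multiplier `lam = 1`, the initial memberships of 1, the fixed iteration count, and solving the weighted normal equations jointly instead of the coordinate-wise form of (2.6). Because all $e_j = 0$ in this data set, the spread part is dropped and the squared distance reduces to the squared residual of the center line. Function names are hypothetical.

```python
def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares line y = a0 + a1 * x with weights ws."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    a1 = sxy / sxx
    return my - a1 * mx, a1


def rflsa_crisp(xs, ys, m=2.0, lam=1.0, iters=100):
    """Sketch of the c = 1 robust iteration for crisp outputs (e_j = 0)."""
    mu = [1.0] * len(xs)  # start with every point in the regression cluster
    expo = 1.0 / (m - 1.0)
    for _ in range(iters):
        # Center update: weights mu^m, as in (2.6).
        a0, a1 = weighted_line_fit(xs, ys, [u ** m for u in mu])
        # Squared distances to the fitted model and noise distance delta^2.
        d2 = [(y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys)]
        delta2 = lam * sum(d2) / len(d2)
        # Membership in the regression cluster, as in (2.8) with c = 1.
        mu = [1.0 / (1.0 + (d / delta2) ** expo) for d in d2]
    return (a0, a1), mu


# Table 7: the data of Example 1 plus the outlier (3, 30).
xs = [2.0, 4.0, 6.0, 9.0, 12.0, 13.0, 14.0, 16.0, 19.0, 20.0, 3.0]
ys = [4.0, 7.0, 5.0, 8.0, 7.0, 9.0, 12.0, 9.0, 14.0, 10.0, 30.0]

coef, mu = rflsa_crisp(xs, ys)
print("center line:", [round(v, 2) for v in coef])
print("memberships:", [round(u, 2) for u in mu])
```

Under these assumed settings the outlier ends up with a small membership in the regression cluster and the fitted slope stays positive; the exact numbers depend on `m` and `lam` and need not match Table 8.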


Table 8. Predicted values, memberships, and SSR with RFLSA (SSR = 24.32)

No.      1      2      3      4      5      6      7      8      9      10     11
μ_j      0.99   0.95   0.97   0.99   0.95   1.00   0.88   0.97   0.88   0.93   0.07
y_j*     4.46   5.45   6.25   7.44   8.63   9.03   9.43   10.22  11.42  11.81  noise

4. Conclusions

In this paper, attention was focused on the analysis of interactive FLR models. Fuzzy regression analysis is used to describe the fuzzy functional relationship between dependent and independent variables under fuzzy phenomena. Tanaka's LP approach is most commonly used to estimate the fuzzy parameters of FLR models. The fuzzy least-squares approach is another method for fuzzy parameter estimation in FLR models. However, these methods are sensitive to outliers. Therefore, a new type of robust fuzzy least-squares algorithm (RFLSA) for interactive FLR models was proposed in this paper. Several numerical examples were given. According to these results, the proposed RFLSA performs well and is also robust to outliers. In other words, these algorithms are well-defined robust estimation procedures for interactive FLR models and are recommended as estimation algorithms for the analysis of interactive FLR models.

We mention that the robust idea in this paper can also be applied to non-interactive FLR models. Suppose that the dataset presents $k$ groups of FLR models, or assume that the data come from $k$ FLR models with a fixed integer $k \ge 2$, i.e., they are assumed to follow $k$ switching FLR models. By setting $c = k$ in RFLSA, the algorithm can then be used to estimate $k$ groups of fuzzy parameters and detect outliers for this type of $k$ switching FLR models. In contrast to this, Hathaway and Bezdek [8] proposed estimation algorithms for switching ordinary (non-fuzzy) regression models in conjunction with fuzzy c-means algorithms. In all of these cases, the number $k$ is usually unknown. The problem of finding the optimal number $k$ of groups in cluster analysis is usually called cluster validity. This validity problem for switching (fuzzy or non-fuzzy) regression models is interesting and will be a topic of further study.

Acknowledgements

The authors are grateful to the anonymous reviewers for their comments.

References

[1] M. Albrecht, Approximation of functional relationships to fuzzy observations, Fuzzy Sets and Systems 49 (1992) 301–305.
[2] H. Bandemer, W. Näther, Fuzzy Data Analysis, Kluwer Academic Publishers, Dordrecht, 1992.
[3] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
[4] A. Celmiņš, A practical approach to nonlinear fuzzy regression, SIAM J. Sci. Statist. Comput. 12 (3) (1991) 521–546.
[5] R.N. Dave, Characterization and detection of noise in clustering, Pattern Recognition Lett. 12 (1991) 657–664.
[6] P. Diamond, Fuzzy least squares, Inform. Sci. 46 (1988) 141–157.
[7] D. Dubois, H. Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, New York, 1980.
[8] R.J. Hathaway, J.C. Bezdek, Switching regression models and fuzzy clustering, IEEE Trans. Fuzzy Systems 1 (3) (1993) 195–204.
[9] B. Heshmaty, A. Kandel, Fuzzy linear regression and its applications to forecasting in uncertain events, Fuzzy Sets and Systems 15 (1985) 159–191.
[10] D.T. Redden, W.H. Woodall, Properties of certain fuzzy linear regression methods, Fuzzy Sets and Systems 64 (1994) 361–375.
[11] H. Tanaka, Fuzzy data analysis by possibilistic linear models, Fuzzy Sets and Systems 24 (1987) 363–375.
[12] H. Tanaka, H. Ishibuchi, Identification of possibilistic linear systems by quadratic membership functions of fuzzy parameters, Fuzzy Sets and Systems 41 (1991) 145–160.
[13] H. Tanaka, H. Ishibuchi, S. Yoshikawa, Exponential possibility regression analysis, Fuzzy Sets and Systems 69 (1995) 305–318.
[14] H. Tanaka, S. Uejima, K. Asai, Linear regression analysis with fuzzy model, IEEE Trans. Systems Man Cybernet. 12 (1982) 903–907.
[15] M.S. Yang, A survey of fuzzy clustering, Math. Comput. Modelling 18 (1993) 1–16.
[16] M.S. Yang, C.H. Ko, On cluster-wise fuzzy regression analysis, IEEE Trans. Systems Man Cybernet. Part B: Cybernet. 27 (1) (1997) 1–13.
[17] M.S. Yang, T.S. Lin, Fuzzy least-squares linear regression analysis for fuzzy input–output data, Fuzzy Sets and Systems 126 (2002) 389–399.
[18] L.A. Zadeh, Fuzzy sets, Inform. and Control 8 (1965) 338–353.
[19] L.A. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems 1 (1978) 3–28.
[20] H.-J. Zimmermann, Fuzzy Set Theory and Its Applications, Kluwer Academic Publishers, Dordrecht, 1991.