A fuzzy case based reasoning approach to value engineering


Expert Systems with Applications 38 (2011) 9334–9339

Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

M.H. Fazel Zarandi ⇑, Zahra S. Razaee, M. Karbasian
Department of Industrial Engineering, Amirkabir University of Technology, Tehran, Iran

Keywords: Value engineering; Fuzzy case-based reasoning; Fuzzy clustering; Fuzzy data

Abstract

This paper aims to assist experts during the creativity phase of value engineering (VE) by drawing on past experience, so that they avoid repeating the same work in a specific domain. To this end, a general fuzzy case-based reasoning (CBR) system is developed. The system uses a fuzzy clustering model for fuzzy data to facilitate case retrieval and reduce its time complexity. The inherent analogical nature of CBR and its integration with fuzzy theory facilitate access to more precise and systematically classified information during a VE workshop. To test the performance of the proposed system, it is applied to suburban highway design data extracted from National Cooperative Highway Research Program (NCHRP) Report 282. © 2011 Elsevier Ltd. All rights reserved.

⇑ Corresponding author. Tel.: +98 21 641 3034; fax: +98 21 6641 3025. E-mail addresses: [email protected] (M.H. Fazel Zarandi), [email protected] (Z.S. Razaee), [email protected] (M. Karbasian).

0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.01.124

1. Introduction

Value engineering (VE) is an organized approach directed at analyzing the function of systems, facilities, services, and supplies for the purpose of achieving their essential functions at the lowest life-cycle cost consistent with required performance, reliability, quality and safety (Mandelbaum & Reed, 2006). The VE process consists of several phases: the information, function analysis, creativity, evaluation, presentation and implementation phases. Creativity depends on the human brain and cannot easily be computerized by conventional programming. Case-based reasoning (CBR), a technique from artificial intelligence (AI), can be used to improve the efficiency of this phase, since it exploits the specific knowledge of past experience by retrieving and adapting the solutions of similar past cases.

In the literature, existing models mainly involve conventional approaches, and less effort has been devoted to AI approaches. One of the earliest works was done by the US Army Corps of Engineers, which established an information retrieval system called VE-trieval. This program can be queried by keyword on a particular subject to obtain an abstract and other useful information (Degenhardt, 1985). Park (1994) developed VEPRO, a spreadsheet rule-based system with database features that consists of several models parallel to the VE job plan. Alcantara (1996) designed a support program for the information phase of VE, which assigned a data structure for representing and performing analytical tasks on rational data. A computer model for VE methodology was developed by Assaf, Jannadi, and Al-Tamimi (2000), emphasizing life cycle cost calculations. Dahim (2001) at Pittsburgh University developed an expert system for VE application in suburban highway design. It uses the analytical hierarchy process (AHP) method for the evaluation phase of VE. Naderpajouh and Afshar (2008) proposed a conceptual expert CBR framework that outlines knowledge entities and their relations in the VE workshop. It also uses a fuzzy approach to handle uncertainties in the evaluation phase of the job plan. In general, devising an expert system for a VE job plan is recommended by several researchers (Al-Yousefi, 1991; Assaf et al., 2000; Shen & Brandon, 1991).

The main objective of this study is to assist experts during the creativity phase of VE by drawing on past experience, so that the same work is not repeated in a particular domain. To this end, a comprehensive fuzzy CBR system is proposed. It involves a fuzzy representation of cases and a fuzzy clustering model for fuzzy data, which supports similarity matching and thereby facilitates case retrieval. The basic idea that motivates the use of fuzzy theory is that in the early stages of project development, where VE has the greatest payoffs (Dell'Isola, 1998), most of the parameters are uncertain (Naderpajouh, Afshar, & Mirmohammadsadeghi, 2006). In addition, many experts cannot express their judgments in accurate numerical terms and use linguistic expressions instead. In these cases, fuzzy theory may be employed to handle uncertainties and support linguistic assessments. Thus, the inherent analogical nature of a CBR model and its integration with fuzzy theory facilitate access to more precise and systematically classified information during a VE workshop.

The rest of the paper is organized as follows. Section 2 summarizes the literature survey for the related areas. We propose a distance measure for fuzzy data based on the Wasserstein metric in Section 3; by means of this distance and following Keller's approach, we propose a fuzzy clustering model for fuzzy data with outliers (Section 4). To determine the optimal number of clusters, we modify the Kwon validity index (Kown, 1998) so that it can be used in a complete fuzzy framework and also in noisy environments (Section 5). In Section 6, the main methodology is proposed. As an application, our system is tested on suburban highway design data provided in NCHRP Report 282 (NCHRP, 1986) in Section 7. Finally, conclusions and future works are presented in Section 8.

2. Background

This section briefly reviews the relevant literature in the areas of case-based reasoning, fuzzy case-based reasoning, clustering analysis, fuzzy data and metrics for fuzzy data.

2.1. Case-based reasoning

Case-based reasoning (see Watson, 1997) is a problem-solving paradigm that solves new problems by searching a database of previously solved problems (called a case library) for one or more cases whose identifying features closely resemble the current problem. When such a case is found, the solution employed in the historical case(s) is retrieved and applied to the current problem. If the retrieved case is not a close match, the solution is revised, producing a new case that can be retained. Finally, the current problem with the new solution can be added to the case library to increase its robustness. Aamodt and Plaza (1994) regarded CBR as composed of the following cycle (the CBR cycle) with four main steps:

• Retrieving previously experienced cases whose problem is judged to be similar.
• Reusing the cases by copying or integrating the solutions from the cases retrieved.
• Revising or adapting the solution(s) retrieved in an attempt to solve the new problem.
• Retaining the new solution once it has been confirmed or validated.

The procedures for a CBR are shown in Fig. 1.

2.2. Fuzzy case-based reasoning

Adding fuzzy logic to conventional CBR methods can improve CBR performance. Fuzzy logic can be used in case representation to characterize imprecise and uncertain information. In other words, fuzzy logic allows us to represent cases whose attributes have imprecise and vague values. Moreover, one of the major issues in fuzzy set theory is measuring similarities in order to design robust systems. The inherent fuzzy nature of similarity measurement in CBR is another motivation to use fuzzy theory in case retrieval (Burkhardm & Richterm, 2001). For related work in this area, see for example Hirota et al. (1998), Dvir, Langholz, and Schneider (1999), Liang and Shi (2003) and Wang (1997).

2.3. Clustering analysis

Clustering is a division of a given set of objects into subgroups or clusters, so that objects in the same cluster are as similar as possible and objects in different clusters are as dissimilar as possible. From a machine learning perspective, clustering is unsupervised learning of a hidden data concept (Berkhin, 2002). In conventional (hard) clustering analysis, each datum belongs to exactly one cluster, whereas in fuzzy clustering, data points can belong to more than one cluster, and each datum is associated with a set of membership degrees. Fuzzy data are imprecise data obtained from measurements, human judgements or linguistic assessments. In cluster analysis, when there is simultaneous uncertainty in both the partition and the data, a fuzzy clustering model for fuzzy data should be applied (D'Urso & Giordani, 2006a). In our CBR system, cases are fuzzy data. Thus, in Section 4 we propose a fuzzy clustering of fuzzy data for clustering cases, in order to reduce the number of cases that must be searched and to save time.

2.4. LR-type fuzzy data

LR-type fuzzy data represent a general class of fuzzy data. Univariate LR fuzzy data can be represented by a vector of LR-fuzzy numbers. In the more general case of multivariate analysis, we have a matrix of LR-fuzzy numbers (De Oliveira & Pedrycz, 2007). To be more specific, let $L$ (and $R$) be a decreasing shape function that maps $\mathbb{R}^+ \to [0,1]$ with $L(0) = 1$; $L(x) < 1, \forall x > 0$; $L(x) > 0, \forall x < 1$; $L(1) = 0$ or ($L(x) > 0, \forall x$ and $L(+\infty) = 0$) (Zimmermann, 2001). Then a fuzzy number $\tilde{A}$ is of LR-type if, for $c$, $l > 0$, $r > 0$ in $\mathbb{R}$,

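$$\mu_{\tilde{A}}(x) = \begin{cases} L\left(\dfrac{c - x}{l}\right) & \text{for } x \le c, \\[2mm] R\left(\dfrac{x - c}{r}\right) & \text{for } x \ge c, \end{cases} \tag{1}$$

where $c$, $l$, $r$ are the center, left spread and right spread of $\tilde{A}$, respectively. Symbolically we can write $\tilde{A} = (c, l, r)_{LR}$.

Among LR-type fuzzy numbers, the triangular fuzzy numbers (TFNs) are most commonly used. An LR-type fuzzy number $\tilde{A}$ is called a triangular fuzzy number if $L(x) = R(x) = 1 - x$; it is characterized by the following membership function:

$$\mu_{\tilde{A}}(x) = \begin{cases} 1 - \dfrac{c - x}{l} & \text{for } x \le c, \\[2mm] 1 - \dfrac{x - c}{r} & \text{for } x \ge c. \end{cases} \tag{2}$$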
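As an illustration of Eq. (2), the following minimal Python sketch evaluates the membership function of a triangular fuzzy number $(c, l, r)$ and its α-cut. The function names are ours, and clipping the membership at zero outside the support is an assumption (Eq. (2) itself only gives the two linear branches).

```python
def triangular_membership(x, c, l, r):
    """Membership degree of x in the triangular fuzzy number (c, l, r), Eq. (2).

    Assumption: values outside the support [c - l, c + r] are clipped to 0.
    """
    if l <= 0 or r <= 0:
        raise ValueError("spreads l and r must be positive")
    if x <= c:
        return max(0.0, 1.0 - (c - x) / l)
    return max(0.0, 1.0 - (x - c) / r)


def alpha_cut(alpha, c, l, r):
    """Closed interval {x : membership(x) >= alpha} for 0 < alpha <= 1."""
    return (c - (1.0 - alpha) * l, c + (1.0 - alpha) * r)


# Example: the fuzzy number "about 5" with left spread 2 and right spread 3.
print(triangular_membership(4.0, c=5.0, l=2.0, r=3.0))
print(alpha_cut(0.5, c=5.0, l=2.0, r=3.0))
```

The α-cut helper matters later in the paper, because the distance proposed in Section 3 is built by integrating an interval distance over all α-cuts.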

Fig. 1. CBR cycle (Aamodt & Plaza, 1994).

2.5. Metrics for fuzzy data

The recent literature offers several distance measures for fuzzy data. We review some of them in this section.

Definition (The Hausdorff distance). Considering two crisp sets $A, B \subseteq \mathbb{R}^k$ and a distance $d(x, y)$, where $x \in A$ and $y \in B$, the Hausdorff distance is defined as follows:

$$d_H(A, B) = \max\left\{ \sup_{x \in A} \inf_{y \in B} d(x, y),\; \sup_{y \in B} \inf_{x \in A} d(x, y) \right\}. \tag{3}$$
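A small sketch of Eq. (3) for finite point sets, where the suprema and infima reduce to maxima and minima. The function names are illustrative. The closed form for two real intervals, $\max(|a - u|, |b - v|)$, is a standard property of the Hausdorff distance and is included only as a cross-check.

```python
def hausdorff(A, B, d=lambda x, y: abs(x - y)):
    """Hausdorff distance between two finite, non-empty point sets, Eq. (3)."""
    forward = max(min(d(x, y) for y in B) for x in A)   # sup over A of inf over B
    backward = max(min(d(x, y) for x in A) for y in B)  # sup over B of inf over A
    return max(forward, backward)


def hausdorff_intervals(a, b, u, v):
    """Closed-form Hausdorff distance between the real intervals [a, b] and [u, v]."""
    return max(abs(a - u), abs(b - v))


# For intervals it is enough to compare the end points.
print(hausdorff([0.0, 1.0], [0.0, 3.0]))
print(hausdorff_intervals(0.0, 1.0, 0.0, 3.0))
```

Note that this distance depends only on the worst-matched points of the two sets, which is one motivation for the distance of Section 3 that accounts for every point of both intervals.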

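According to the concept of α-cuts, the Hausdorff metric $d_H$ can be generalized to fuzzy numbers $\tilde{F}$, $\tilde{G}$, where $\tilde{F}$ (or $\tilde{G}$): $\mathbb{R} \to [0, 1]$:

$$d_q(\tilde{F}, \tilde{G}) = \begin{cases} \left[ \displaystyle\int_0^1 \left( d_H(F_\alpha, G_\alpha) \right)^q d\alpha \right]^{1/q} & \text{if } q \in [1, \infty), \\[3mm] \displaystyle\sup_{\alpha \in [0,1]} d_H(F_\alpha, G_\alpha) & \text{if } q = \infty, \end{cases} \tag{4}$$

where the crisp set $F_\alpha \equiv \{x \in \mathbb{R}^k : \tilde{F}(x) \ge \alpha\}$, $\alpha \in [0, 1]$, is called the α-cut of $\tilde{F}$ (Näther, 2000).

Tran and Duckstein (2002) proposed the following distance between two intervals $A = [a, b]$ and $B = [u, v]$:

$$d_{TD}(A, B) = \int_{-1/2}^{1/2} \int_{-1/2}^{1/2} \left\{ \left[ \frac{a + b}{2} + x(b - a) \right] - \left[ \frac{u + v}{2} + y(v - u) \right] \right\}^2 dx\, dy = \left( \frac{a + b}{2} - \frac{u + v}{2} \right)^2 + \frac{1}{3} \left[ \left( \frac{b - a}{2} \right)^2 + \left( \frac{v - u}{2} \right)^2 \right]. \tag{5}$$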

Then, they used it to formulate their distance measure for fuzzy numbers, but $d_{TD}$ does not satisfy the reflexivity property (Irpino & Verde, 2008):

$$d_{TD}(A, A) = \left( \frac{a + b}{2} - \frac{a + b}{2} \right)^2 + \frac{1}{3} \left[ \left( \frac{b - a}{2} \right)^2 + \left( \frac{b - a}{2} \right)^2 \right] = \frac{2}{3} \left( \frac{b - a}{2} \right)^2 \ge 0. \tag{6}$$

A squared Euclidean distance between a pair of LR-type fuzzy data $\tilde{A}_1 = (c_1, l_1, r_1)$ and $\tilde{A}_2 = (c_2, l_2, r_2)$, where $c$ denotes the center and $l$, $r$ indicate, respectively, the left and right spread, is defined by Yang and Ko (1996):

$$d^2_{YK}(\lambda, \rho) = (c_1 - c_2)^2 + \left[ (c_1 - \lambda l_1) - (c_2 - \lambda l_2) \right]^2 + \left[ (c_1 + \rho r_1) - (c_2 + \rho r_2) \right]^2, \tag{7}$$

where $\lambda = \int_0^1 L^{-1}(t)\, dt$ and $\rho = \int_0^1 R^{-1}(t)\, dt$ are parameters that summarize the shape of the left and right tails of the membership function, and $L$, $R$ are the decreasing shape functions defined in Section 2.

3. The proposed distance for fuzzy data

In this section, we first present a new distance measure for interval-valued data, and then use it to formulate the distance measure for fuzzy data. Let $I_i = [a_i, b_i]$ be an interval for $i = 1, 2$. We can parameterize $I_i$ as follows:

$$I_i(t) = a_i + t(b_i - a_i), \quad 0 \le t \le 1. \tag{8}$$

If we represent $I_i$ by means of its midpoint $m_i = \frac{a_i + b_i}{2}$ and radius $d_i = \frac{b_i - a_i}{2}$, Eq. (8) can be rewritten as follows:

$$I_i(t) = m_i + (2t - 1) d_i, \quad 0 \le t \le 1. \tag{9}$$

The distance measure between $I_1$ and $I_2$ can be defined as follows:

$$d^2(I_1, I_2) = \int_0^1 \left[ I_1(t) - I_2(t) \right]^2 dt = \int_0^1 \left[ (m_1 - m_2) + (d_1 - d_2)(2t - 1) \right]^2 dt = (m_1 - m_2)^2 + \frac{1}{3} (d_1 - d_2)^2. \tag{10}$$

This distance takes into account all the points in both intervals. Irpino and Verde (2008) derived Eq. (10) from another point of view, using the Wasserstein distance. To be more specific, let $F_1$ and $F_2$ be distribution functions; the Wasserstein $L_2$ metric is defined as follows (Gibbs & Su, 2002):

$$d_{Wass}(F_1, F_2) = \left[ \int_0^1 \left( F_1^{-1}(t) - F_2^{-1}(t) \right)^2 dt \right]^{1/2}, \tag{11}$$

where $F_1^{-1}$ and $F_2^{-1}$ are the quantile functions of the two distributions. If we assume $F_i$ for $i = 1, 2$ to be the uniform distribution function on $[a_i, b_i]$, then $F_i^{-1}(t)$ is the same as the parametric representation $I_i(t)$ in Eq. (8). Thus, the Wasserstein distance coincides with the distance defined in Eq. (10).

Now we are ready to construct a distance between fuzzy data. According to α-cuts, the Wasserstein distance $d_{Wass}$ can be generalized to fuzzy numbers $\tilde{A}_1$ and $\tilde{A}_2$:

$$d(\tilde{A}_1, \tilde{A}_2) = \left[ \int_0^1 d^2_{Wass}\left( (\tilde{A}_1)_\alpha, (\tilde{A}_2)_\alpha \right) d\alpha \right]^{1/2}. \tag{12}$$

We calculate this distance for triangular fuzzy numbers. Let $\tilde{A}_i = (c_i, l_i, r_i)$, $i = 1, 2$, be triangular fuzzy numbers with α-cuts $(\tilde{A}_i)_\alpha = [\,l_i \alpha + (c_i - l_i),\; -r_i \alpha + (c_i + r_i)\,]$. The midpoint and the radius of $(\tilde{A}_i)_\alpha$ are as follows:

$$m_{(\tilde{A}_i)_\alpha} = c_i + \frac{1}{2}(1 - \alpha)(r_i - l_i), \tag{13}$$

$$d_{(\tilde{A}_i)_\alpha} = \frac{1}{2}(1 - \alpha)(r_i + l_i). \tag{14}$$

Then we have:

$$\begin{aligned} d^2(\tilde{A}_1, \tilde{A}_2) &= \int_0^1 d^2_{Wass}\left( (\tilde{A}_1)_\alpha, (\tilde{A}_2)_\alpha \right) d\alpha \\ &= \int_0^1 \left\{ \left( m_{(\tilde{A}_1)_\alpha} - m_{(\tilde{A}_2)_\alpha} \right)^2 + \frac{1}{3} \left( d_{(\tilde{A}_1)_\alpha} - d_{(\tilde{A}_2)_\alpha} \right)^2 \right\} d\alpha \\ &= \int_0^1 \left\{ \left[ (c_1 - c_2) + \frac{1}{2}(1 - \alpha) \left[ (r_1 - r_2) - (l_1 - l_2) \right] \right]^2 + \frac{1}{12}(1 - \alpha)^2 \left[ (r_1 - r_2) + (l_1 - l_2) \right]^2 \right\} d\alpha \\ &= (c_1 - c_2)^2 + \frac{1}{9} \left[ (l_1 - l_2)^2 + (r_1 - r_2)^2 - (l_1 - l_2)(r_1 - r_2) \right] - \frac{1}{2}(c_1 - c_2) \left[ (l_1 - l_2) - (r_1 - r_2) \right]. \end{aligned} \tag{15}$$

We use this distance in the next section for fuzzy clustering of fuzzy data.

4. Fuzzy clustering of fuzzy data with outliers

In this section, Keller's approach (Keller, 2000) is modified so that it can be used for fuzzy data. As in his approach, an additional weighting factor is added for each datum to identify outliers and reduce their effects. Before describing the procedure, let us introduce the following notation: $U \equiv \{u_{ik} : i = 1, \ldots, c;\ k = 1, \ldots, n\}$ is the membership matrix of order $(c \times n)$, where $c$ is the number of clusters and $n$ is the number of data vectors; $u_{ik} \in [0, 1]$ denotes the membership degree of the $k$th object to the $i$th cluster. In contrast to Keller's approach, where data elements and cluster prototypes are crisp, we define them as triangular fuzzy data. Thus, $\tilde{X} \equiv \{\tilde{x}_k^j = (c_{\tilde{x}_k^j}, l_{\tilde{x}_k^j}, r_{\tilde{x}_k^j}) : k = 1, \ldots, n;\ j = 1, \ldots, p\}$ and $\tilde{V} \equiv \{\tilde{v}_i^j = (c_{\tilde{v}_i^j}, l_{\tilde{v}_i^j}, r_{\tilde{v}_i^j}) : i = 1, \ldots, c;\ j = 1, \ldots, p\}$ are the fuzzy data and fuzzy prototype matrices, respectively. Let us now introduce the objective function:

$$J(\tilde{X}, U, \tilde{V}) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^m \cdot \frac{1}{\omega_k^q} \cdot d^2(\tilde{v}_i, \tilde{x}_k), \tag{16}$$


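under the constraints

$$\sum_{k=1}^{n} \omega_k = \omega, \tag{17}$$

$$\sum_{i=1}^{c} u_{ik} = 1, \tag{18}$$

where $m$ is the degree of fuzziness and $d^2(\tilde{v}_i, \tilde{x}_k)$ is as follows:

$$d^2(\tilde{v}_i, \tilde{x}_k) = \sum_{j=1}^{p} d^2(\tilde{v}_i^j, \tilde{x}_k^j) = \sum_{j=1}^{p} \left\{ \left( c_{\tilde{v}_i^j} - c_{\tilde{x}_k^j} \right)^2 + \frac{1}{9} \left[ \left( l_{\tilde{v}_i^j} - l_{\tilde{x}_k^j} \right)^2 + \left( r_{\tilde{v}_i^j} - r_{\tilde{x}_k^j} \right)^2 - \left( l_{\tilde{v}_i^j} - l_{\tilde{x}_k^j} \right) \left( r_{\tilde{v}_i^j} - r_{\tilde{x}_k^j} \right) \right] - \frac{1}{2} \left( c_{\tilde{v}_i^j} - c_{\tilde{x}_k^j} \right) \left[ \left( l_{\tilde{v}_i^j} - l_{\tilde{x}_k^j} \right) - \left( r_{\tilde{v}_i^j} - r_{\tilde{x}_k^j} \right) \right] \right\}. \tag{19}$$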
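The closed forms in Eqs. (10), (15) and (19) can be checked numerically. The sketch below is an illustration under our own naming: it implements the three formulas and compares Eq. (15) with a direct midpoint-rule integration of Eq. (10) over the α-cuts, as prescribed by Eq. (12). Intervals are `(a, b)` tuples and triangular fuzzy numbers are `(c, l, r)` tuples.

```python
def interval_distance_sq(i1, i2):
    """Squared distance between intervals (a1, b1) and (a2, b2), Eq. (10)."""
    (a1, b1), (a2, b2) = i1, i2
    m1, m2 = (a1 + b1) / 2, (a2 + b2) / 2   # midpoints
    d1, d2 = (b1 - a1) / 2, (b2 - a2) / 2   # radii
    return (m1 - m2) ** 2 + (d1 - d2) ** 2 / 3


def tfn_distance_sq(a1, a2):
    """Closed-form squared distance between triangular fuzzy numbers, Eq. (15)."""
    (c1, l1, r1), (c2, l2, r2) = a1, a2
    dc, dl, dr = c1 - c2, l1 - l2, r1 - r2
    return dc ** 2 + (dl ** 2 + dr ** 2 - dl * dr) / 9 - 0.5 * dc * (dl - dr)


def tfn_distance_sq_numeric(a1, a2, steps=20000):
    """Midpoint-rule integration of Eq. (10) over the alpha-cuts, as in Eq. (12)."""
    def cut(a, alpha):
        c, l, r = a
        return (c - (1 - alpha) * l, c + (1 - alpha) * r)

    total = 0.0
    for s in range(steps):
        alpha = (s + 0.5) / steps
        total += interval_distance_sq(cut(a1, alpha), cut(a2, alpha))
    return total / steps


def case_distance_sq(v, x):
    """Eq. (19): sum of the feature-wise squared distances of two fuzzy vectors."""
    return sum(tfn_distance_sq(vj, xj) for vj, xj in zip(v, x))


a1, a2 = (5.0, 2.0, 3.0), (3.0, 1.0, 1.0)
print(tfn_distance_sq(a1, a2), tfn_distance_sq_numeric(a1, a2))
print(tfn_distance_sq(a1, a1))  # reflexivity: the distance of a number to itself is 0
```

Unlike $d_{TD}$ in Eq. (6), this distance vanishes when both arguments coincide, which is the property the clustering model relies on.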

As Keller points out, the factor $\omega_k$ represents the weight for the $k$th datum and $\omega$ is a constant real-valued parameter. The constant parameter $q$ controls the influence of the outlier weighting factor. Outliers are assigned a large weight $\omega_k$, so $1/\omega_k^q$ is small in this case. The necessary conditions for minimizing the objective function are as follows:

$$c_{\tilde{v}_i^j} = \frac{\sum_{k=1}^{n} u_{ik}^m \frac{1}{\omega_k^q} \left[ 2 c_{\tilde{x}_k^j} + \frac{1}{2} \left[ \left( l_{\tilde{v}_i^j} - l_{\tilde{x}_k^j} \right) - \left( r_{\tilde{v}_i^j} - r_{\tilde{x}_k^j} \right) \right] \right]}{2 \sum_{k=1}^{n} u_{ik}^m \frac{1}{\omega_k^q}}, \tag{20}$$

$$l_{\tilde{v}_i^j} = \frac{\sum_{k=1}^{n} u_{ik}^m \frac{1}{\omega_k^q} \left[ \frac{2}{9} l_{\tilde{x}_k^j} + \frac{1}{9} \left( r_{\tilde{v}_i^j} - r_{\tilde{x}_k^j} \right) + \frac{1}{2} \left( c_{\tilde{v}_i^j} - c_{\tilde{x}_k^j} \right) \right]}{\frac{2}{9} \sum_{k=1}^{n} u_{ik}^m \frac{1}{\omega_k^q}}, \tag{21}$$

$$r_{\tilde{v}_i^j} = \frac{\sum_{k=1}^{n} u_{ik}^m \frac{1}{\omega_k^q} \left[ \frac{2}{9} r_{\tilde{x}_k^j} + \frac{1}{9} \left( l_{\tilde{v}_i^j} - l_{\tilde{x}_k^j} \right) - \frac{1}{2} \left( c_{\tilde{v}_i^j} - c_{\tilde{x}_k^j} \right) \right]}{\frac{2}{9} \sum_{k=1}^{n} u_{ik}^m \frac{1}{\omega_k^q}}, \tag{22}$$

$$\omega_k = \frac{\left[ \sum_{i=1}^{c} u_{ik}^m\, d^2(\tilde{v}_i, \tilde{x}_k) \right]^{\frac{1}{q+1}}}{\sum_{l=1}^{n} \left[ \sum_{i=1}^{c} u_{il}^m\, d^2(\tilde{v}_i, \tilde{x}_l) \right]^{\frac{1}{q+1}}}\; \omega, \tag{23}$$

$$u_{ik} = \frac{1}{\sum_{r=1}^{c} \left( \dfrac{d^2(\tilde{v}_i, \tilde{x}_k)}{d^2(\tilde{v}_r, \tilde{x}_k)} \right)^{\frac{1}{m-1}}}. \tag{24}$$

As can be observed, the membership update is left unchanged, while the cluster centers take the weights into account; points with high representativeness have more influence than outliers. On the basis of the necessary conditions, we can construct an iterative algorithm as follows:

Algorithm.
Step 1: Fix the degree of fuzziness ($m$), the number of clusters ($c$), $\omega$ and $q$. Choose an initial fuzzy c-partition $U^{(0)}$. Also choose initial spreads and weights for each datum subject to Eq. (17). Set $t = 0$.
Step 2: Calculate $\tilde{V}^{(t)} = (c_{\tilde{v}}^{(t)}, l_{\tilde{v}}^{(t)}, r_{\tilde{v}}^{(t)})$ using $U^{(t)}$, the spreads, the weights and Eqs. (20)–(22).
Step 3: Update $\omega_k^{(t)}$, $k = 1, \ldots, n$, using Eq. (23), and update $U^{(t)}$ to $U^{(t+1)}$ using $\tilde{V}^{(t)}$ and Eq. (24).
Step 4: If $\| U^{(t+1)} - U^{(t)} \| < \varepsilon$, where $\varepsilon$ is a small positive number fixed by the researcher, the algorithm has converged. Otherwise, set $t = t + 1$ and go to Step 2.

5. Cluster validity index

As Pal and Bezdek (1995) pointed out, once clusters are found, it is necessary to validate them. This is the cluster validity problem. Many validity indices can be found in the literature. Early indices such as the partition coefficient and partition entropy (Bezdek, 1974a, 1974b) can be directly applied to the fuzzy clustering of fuzzy data, but they use only fuzzy memberships, which may not have a close connection to the geometrical structure of the data (Zhang, Wang, Zhang, & Li, 2008). Another class of indices takes both the fuzzy memberships and the data structure into consideration. These indices cannot be directly applied to the fuzzy clustering of fuzzy data and should be extended to a complete fuzzy framework. The Kwon validity index (Kown, 1998) is a member of this second class. It is a modification of the Xie and Beni validity index (Xie & Beni, 1991) with the added advantage of eliminating the monotonically decreasing tendency as the number of clusters increases, but it has the disadvantage of not being robust to noise. Here, in order to obtain the number of clusters $c$ in a complete fuzzy framework and also in noisy environments, we modify the Kwon validity index as follows:

$$F_{fr}(U, \tilde{V}, \tilde{X}) = \frac{\sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^m \frac{1}{\omega_k^q}\, d^2(\tilde{v}_i, \tilde{x}_k) + \frac{1}{c} \sum_{i=1}^{c} d^2(\tilde{v}_i, \tilde{v}_f)}{\min_{i \ne k} d^2(\tilde{v}_i, \tilde{v}_k)}, \tag{25}$$

where $\tilde{v}_f = \dfrac{\sum_i \sum_k u_{ik}^m \frac{1}{\omega_k^q}\, \tilde{x}_k}{\sum_i \sum_k u_{ik}^m \frac{1}{\omega_k^q}}$.

Our goal is to find the fuzzy c-partition with the smallest value of $F_{fr}$. The differences between the modified version and the original Kwon validity index are as follows:

• The modified validity index can be used in a complete fuzzy framework.
• The weighted fuzzy mean is used instead of the crisp mean of the data.
• The factor $1/\omega_k^q$ is added to the first term of the numerator, so that the index can be used in noisy environments.
• The weighting exponent is generalized from 2 to $m$.

Thus, this modified version of the Kwon validity index is robust to noise and can be used for fuzzy clustering of fuzzy data.

6. Methodology

This section presents the methodology used to develop our system and describes its modules in detail.

6.1. Case representation and indexing

Each case is a project to which VE has been applied. One feature is the name of the part of the project on which VE studies have been conducted. We use this feature as an index, and cases are classified according to it. The other features are project characteristics, which are triangular fuzzy numbers determined by experts. These are usually domain dependent. The features are weighted through a weighting method such as fuzzy AHP. Solutions are practical ideas (or alternatives) that were generated by experts in a VE workshop. Each solution is a binary vector in which each entry corresponds to one idea: if an idea was generated, the corresponding entry is equal to one; otherwise, it is equal to zero. Fig. 2 illustrates case representation in the case library.

6.2. Case retrieval

The retrieval algorithm begins by deciding to which class the query case belongs. After determining the class of the query case, we have to search for similar cases in that class. For this purpose, the cases are clustered into several groups by the algorithm proposed in Section 4. Next, the degree of similarity between the query case and each cluster prototype $\tilde{v}_i$ is calculated using $d^{(W)}$ and substituting $\tilde{v}_i$ for $r$ in the similarity function defined as follows:

$$SM(q, r) = e^{-\beta\, d_{FR}^{(W)}(q, r)}, \tag{26}$$

where $\beta$ is a positive constant. The most similar cluster to the query case is the one satisfying $\arg\max_i SM(q, \tilde{v}_i)$ for $i = 1, \ldots, c$. The cases within this cluster are then compared to the query case according to Eq. (26).

Fig. 2. Case representation in the case library.

Fig. 3. Linguistic variables and the associated fuzzy numbers (triangular membership functions for the terms bad, poor, good and excellent; both axes run from 0 to 1).
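The retrieval step relies on the clustering algorithm of Section 4, so both can be sketched together. The following pure-Python sketch is an illustration under several assumptions of ours, not the authors' implementation: (i) for fixed memberships and weights, Eqs. (20)–(22) are solved jointly, which by our own algebra reduces to a weighted mean of the centers and spreads with weights $u_{ik}^m / \omega_k^q$; (ii) the convergence norm in Step 4 is the largest absolute change in $U$; (iii) $\omega$ defaults to $n$; (iv) a small constant guards against division by zero; (v) the similarity of Eq. (26) uses the unweighted distance of Eq. (19), and cases are assigned to the cluster with their highest membership. Each case is a list of `(c, l, r)` tuples.

```python
import math
import random


def tfn_d2(a, b):
    """Eq. (15) for two triangular fuzzy numbers (c, l, r)."""
    dc, dl, dr = a[0] - b[0], a[1] - b[1], a[2] - b[2]
    return dc * dc + (dl * dl + dr * dr - dl * dr) / 9 - 0.5 * dc * (dl - dr)


def case_d2(v, x):
    """Eq. (19): sum of the feature-wise squared distances."""
    return sum(tfn_d2(vj, xj) for vj, xj in zip(v, x))


def fuzzy_cluster(X, c, m=2.0, q=1.0, omega=None, eps=1e-6, max_iter=200, seed=0):
    n, p = len(X), len(X[0])
    omega = float(n) if omega is None else omega  # assumed default, not from the paper
    rng = random.Random(seed)
    # Step 1: random fuzzy c-partition (Eq. 18) and equal weights (Eq. 17).
    U = [[rng.random() + 1e-3 for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= s
    w = [omega / n] * n
    tiny = 1e-12  # guard against zero distances
    V = []
    for _ in range(max_iter):
        # Step 2: prototypes as weighted means (joint solution of Eqs. 20-22).
        V = []
        for i in range(c):
            g = [U[i][k] ** m / w[k] ** q for k in range(n)]
            G = sum(g)
            V.append([tuple(sum(g[k] * X[k][j][t] for k in range(n)) / G
                            for t in range(3)) for j in range(p)])
        D = [[case_d2(V[i], X[k]) + tiny for k in range(n)] for i in range(c)]
        # Step 3a: outlier weights, Eq. (23).
        S = [sum(U[i][k] ** m * D[i][k] for i in range(c)) ** (1 / (q + 1))
             for k in range(n)]
        total = sum(S)
        w = [omega * s / total for s in S]
        # Step 3b: memberships, Eq. (24).
        U_new = [[1.0 / sum((D[i][k] / D[r][k]) ** (1 / (m - 1)) for r in range(c))
                  for k in range(n)] for i in range(c)]
        # Step 4: stop when the partition no longer changes.
        change = max(abs(U_new[i][k] - U[i][k]) for i in range(c) for k in range(n))
        U = U_new
        if change < eps:
            break
    return U, V, w


def retrieve(query, X, U, V, beta=1.0):
    """Eq. (26): choose the most similar prototype, then rank that cluster's cases."""
    def sim(a, b):
        return math.exp(-beta * math.sqrt(max(case_d2(a, b), 0.0)))

    best = max(range(len(V)), key=lambda i: sim(query, V[i]))
    members = [k for k in range(len(X))
               if max(range(len(V)), key=lambda i: U[i][k]) == best]
    return best, sorted(((sim(query, X[k]), k) for k in members), reverse=True)


# Hypothetical toy library: six single-feature cases in two obvious groups.
X = [[(1.0, 0.1, 0.1)], [(1.1, 0.1, 0.2)], [(0.9, 0.2, 0.1)],
     [(5.0, 0.1, 0.1)], [(5.2, 0.2, 0.1)], [(4.9, 0.1, 0.2)]]
U, V, w = fuzzy_cluster(X, c=2)
best, ranked = retrieve([(5.0, 0.1, 0.1)], X, U, V)
print(best, ranked)
```

Only the cases of the selected cluster are compared with the query, which is where the saving in retrieval time comes from.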

6.3. Case adaptation

If one of the retrieved cases has a similarity degree equal to one, null adaptation is used: the solution of the retrieved case is applied to the query case without any modification. Otherwise, we suggest compositional adaptation, based on the fact that the retrieved cases usually differ little in their similarity degrees. This allows us to combine the corresponding solutions efficiently to obtain the final solution. This is done as follows: for each entry of the solution vectors, we take a weighted average over the retrieved cases, with weights proportional to the similarity degrees, and then apply a threshold; values above the threshold are mapped to one, the others to zero. If we denote the solutions of the $k$ retrieved cases by $S_i$, their similarity with the query case by $SM_i$ and the threshold by $h$, then the solution of the query case ($S_q$) is as follows:

$$S_q = \left\lceil \frac{\sum_{i=1}^{k} SM_i \times S_i}{\sum_{i=1}^{k} SM_i} - h\mathbf{1} \right\rceil, \tag{27}$$

where $\lceil \cdot \rceil$ is the ceiling function.

6.4. Case retainment

Eventually, if the experts find the solution (ideas/alternatives) acceptable, it is added as a new case to the case library; otherwise, they run a brainstorming session, generate ideas, and add the practical ones to the case library as the solution of the query case.

7. Application

Our system was tested on suburban highway design data extracted from the National Cooperative Highway Research Program (NCHRP) Report 282 (NCHRP, 1986). The features include the existing design, the maximum available width and the desirability of operational and safety indices. The existing design can be one of these options:

• Two-Lane Undivided, abbreviated as 2U.
• Three-Lane Divided with Center Two-Way Left-Turn Lane, abbreviated as 3T.
• Four-Lane Undivided, abbreviated as 4U.
• Four-Lane Divided with Raised Median, abbreviated as 4D.
• Five-Lane Divided with Center Two-Way Left-Turn Lane, abbreviated as 5T.
• Six-Lane Divided with Raised Median, abbreviated as 6D.
• Seven-Lane with Center Two-Way Left-Turn Lane, abbreviated as 7T.

Table 1
The average MSE for each class of the indexed cases.

Existing design    MSE
2U                 0.08
3T                 0.08
4U                 0.13
4D                 0.13
5T                 0.05
6D                 0.08
7T                 0.07
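The errors in Table 1 compare adapted solutions with known ones. As a hedged illustration of how one such comparison could be computed, the sketch below applies the compositional adaptation rule of Eq. (27) to binary solution vectors and measures the mean squared error against a reference solution. The function names, the toy vectors and the threshold value are hypothetical; the ceiling is read as being applied entry-wise to the thresholded weighted average, with $0 \le h < 1$.

```python
import math


def adapt(solutions, similarities, h):
    """Compositional adaptation, Eq. (27): thresholded similarity-weighted vote."""
    total = sum(similarities)
    adapted = []
    for j in range(len(solutions[0])):
        avg = sum(sm * s[j] for sm, s in zip(similarities, solutions)) / total
        # avg lies in [0, 1]; the ceiling gives 1 when avg > h and 0 otherwise.
        adapted.append(math.ceil(avg - h))
    return adapted


def mse(predicted, actual):
    """Mean squared error between two equally long vectors."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical example: three retrieved cases, four candidate ideas.
retrieved = [[1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0]]
similarity = [0.6, 0.3, 0.1]
solution = adapt(retrieved, similarity, h=0.5)
print(solution)
print(mse(solution, [1, 1, 1, 0]))
```

In a leave-one-out setting, each case would in turn play the role of the query, and the errors would be averaged per class.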

There are 44 feasible alternatives that we use as case solutions. We take the "existing design" feature as an index and classify the cases according to this feature. Then, the class of the query case is determined. The "maximum available width" is crisp, so we convert it to a fuzzy singleton. The other features are linguistic terms and are transformed to triangular fuzzy numbers according to Fig. 3. After retrieving the similar cases through our clustering algorithm and adapting their solutions as explained in Sections 6.2 and 6.3, the solution of the query case is generated. If this solution is acceptable, it is added to the case library.

The performance of our system was validated by leave-one-out cross-validation (LOOCV). LOOCV uses a single observation from the original sample as the validation data and the remaining observations as the training data. This is repeated so that each observation in the sample is used once as the validation data. Each time, the mean squared error (MSE) is computed. The average MSE for each class of the indexed cases is shown in Table 1. Since we wanted to develop a general system that can be used in all domains, we used compositional adaptation. For a system targeted at a specific domain, other methods of adaptation, such as transformational and derivational adaptation, may reduce the error.

8. Conclusion and future works

This paper presented a fuzzy CBR system for value engineering. The system can contribute significantly to the efficiency of the value study by providing the VE team with an extensive memory of previous experiences. Since cases are fuzzy data, a fuzzy clustering model for fuzzy data, based on a new distance, is used to reduce the number of cases that must be searched and to save time. In addition, the Kwon cluster validity index is modified to validate the number of clusters. Finally, to test the performance of our system, it is applied to suburban highway design data extracted from NCHRP Report 282.

Another problem that can be explored is the development of a rough set-based case-based reasoner for value engineering. As mentioned before, trying other methods of adaptation, such as transformational and derivational adaptation, may also reduce the error.

References

Aamodt, A., & Plaza, E. (1994). Case-based reasoning: Foundational issues, methodological variations, and system approaches. AI Communications, 7(1), 39–59. IOS Press.
Alcantara, P. Jr. (1996). Development of a computer understandable representation of a design rationale to support value engineering. Unpublished Ph.D. dissertation, School of Virginia Polytechnic Institute and State University.
Al-Yousefi, A. S. (1991). Expert system: A programmable approach to VE logic. In Proceeding of the 1991 SAVE international conference (pp. 155–167). Kansas City.
Assaf, S., Jannadi, O. A., & Al-Tamimi, A. (2000). Computerized system for application of value engineering methodology. ASCE Journal of Computing in Civil Engineering, 14(3), 206–214.
Berkhin, P. (2002). Survey of clustering data mining techniques. Accrue Software Inc.
Bezdek, J. C. (1974a). Numerical taxonomy with fuzzy sets. Journal of Mathematical Biology, 1, 57–71.
Bezdek, J. C. (1974b). Cluster validity with fuzzy sets. Journal of Cybernetics, 9, 58–72.
Burkhardm, H. D., & Richterm, M. M. (2001). On the notion of similarity in case based reasoning and fuzzy theory. Soft computing in case-based reasoning. London: Springer (chap. 2).
Dahim, H., & Mohammad A. (2001). Value engineering expert system in suburban highway design (VEESSHD). Ph.D. thesis, University of Pittsburgh.
Degenhardt, G. (1985). VE-TRIEVAL a corp of engineers value engineering information retrieval system. In Proceeding of the 1985 SAVE international conference (pp. 14–25). Texas.
Dell'Isola, A. J. (1998). Value engineering: Practical applications (BK and Disk ed.). R.S. Means Company.
De Oliveira, J. V., & Pedrycz, W. (2007). Advances in fuzzy clustering and its applications. San Francisco: Wiley.
D'Urso, P., & Giordani, P. (2006a). A weighted fuzzy c-means clustering model for fuzzy data. Computational Statistics Data Analysis, 50(6), 1496–1523.
Dvir, G., Langholz, G., & Schneider, M. (1999). Matching attributes in a fuzzy case based reasoning. Fuzzy Information Processing Society, 33–36.
Gibbs, A. L., & Su, F. E. (2002). On choosing and bounding probability metrics. International Statistical Review, 70, 419.
Hirota, K., Yoshino, H., Xu, M. Q., Zhu, Y., Li, X. Y., & Horie, D. (1998). A fuzzy case based reasoning system for the legal inference. Fuzzy systems proceedings. IEEE world congress on computational intelligence. In The 1998 IEEE international conference (Vol. 2, pp. 1350–1354).
Irpino, A., & Verde, R. (2008). Dynamic clustering for interval data using a Wasserstein-based distance. Pattern Recognition Letters, 29, 1648–1658.
Keller, A. (2000). Fuzzy clustering with outliers. In T. Whalen (Ed.), Proceedings of the 19th international conference on the North American fuzzy information processing society, NAFIPS00 (pp. 143–147).
Kown, S. H. (1998). Cluster validity index for fuzzy clustering. IEEE Electronic Letters, 34(22).
Liang, Z., & Shi, P. (2003). Similarity measures on intuitionistic fuzzy sets. Pattern Recognition Letters, 24, 2687–2693.
Mandelbaum, J., & Reed, D. L. (2006). Value engineering handbook, IDA Paper P-4114, Alexandria, VA: Institute for Defense Analysis.
Naderpajouh, N., & Afshar, A. (2008). A case-based reasoning approach to application of value engineering methodology in the construction industry. Journal of Construction Management and Economics, 26, 363–372.
Naderpajouh, N., Afshar, SA., & Mirmohammadsadeghi, A. (2006). Fuzzy decision support system for application of value engineering in construction industry. International Journal of Civil Engineering, 4(4), 261–273.
Näther, W. (2000). On random fuzzy variables of second order and their application to linear statistical inference with fuzzy data. Metrika, 51, 201–221.
National Cooperative Highway Research Program (NCHRP) Report 282 (1986). Multi-lane design alternatives for improving suburban highways. Washington, DC: Transportation Research Board.
Pal, N. R., & Bezdek, J. C. (1995). On cluster validity for fuzzy c-means model. IEEE Transactions on Fuzzy Systems, 1, 370–379.
Park, C. (1994). An integrated value engineering computer system for construction projects. Unpublished Ph.D. dissertation, School of Engineering University of Florida.
Shen, Q., & Brandon, P. S. (1991). Can expert systems improve VM implementation? In Proceedings of the 1991 SAVE international conference (pp. 168–176). Kansas City.
Tran, L., & Duckstein, L. (2002). Comparison of fuzzy numbers using a fuzzy distance measure. Fuzzy Sets and Systems, 130, 331–341.
Wang, W. J. (1997). New similarity measures on fuzzy sets and on elements. Fuzzy Sets and Systems, 85, 305–309.
Watson, I. (1997). Applying case-based reasoning: Techniques for enterprise systems. Morgan Kaufmann Publishers.
Xie, X. L., & Beni, G. (1991). A validity measure for fuzzy clustering. IEEE Transactions on Pattern Analysis Machine Intelligence, 13, 841–847.
Yang, M. S., & Ko, C. H. (1996). On a class of fuzzy c-numbers clustering procedures for fuzzy data. Fuzzy Sets and Systems, 84, 49–60.
Zhang, Y., Wang, W., Zhang, X., & Li, Y. (2008). A cluster validity index for fuzzy clustering. Information Science, 178, 1205–1218.
Zimmermann, H. J. (2001). Fuzzy Set Theory and its Applications. Dordrecht: Kluwer Academic Press.