Expert Systems with Applications 38 (2011) 9334–9339
Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa
A fuzzy case based reasoning approach to value engineering
M.H. Fazel Zarandi ⇑, Zahra S. Razaee, M. Karbasian
Department of Industrial Engineering, Amirkabir University of Technology, Tehran, Iran
Article info
Keywords: Value engineering; Fuzzy case-based reasoning; Fuzzy clustering; Fuzzy data
Abstract
This paper aims to assist experts during the creativity phase of value engineering (VE) by drawing on past experience, so that work already done in a specific domain is not repeated. To this end, a general fuzzy case-based reasoning (CBR) system is developed. The system uses a fuzzy clustering model for fuzzy data to facilitate case retrieval and reduce its time complexity. The inherently analogical nature of a CBR model, combined with fuzzy theory, facilitates access to more precise and systematically classified information during a VE workshop. To test the performance of the proposed system, it is applied to suburban highway design data extracted from National Cooperative Highway Research Program (NCHRP) Report 282. © 2011 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. Tel.: +98 21 641 3034; fax: +98 21 6641 3025. 0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.01.124

1. Introduction
Value engineering (VE) is an organized approach directed at analyzing the function of systems, facilities, services, and supplies for the purpose of achieving their essential functions at the lowest life-cycle cost consistent with required performance, reliability, quality and safety (Mandelbaum & Reed, 2006). The VE process consists of several phases: the information phase, function analysis phase, creativity phase, evaluation phase, presentation phase and implementation phase. Creativity depends on the human brain and cannot easily be computerized by conventional programming. Case-based reasoning (CBR), an artificial intelligence (AI) technique, can be used to improve the efficiency of this phase, since it exploits the specific knowledge of past experiences by retrieving and adapting the solutions of similar past cases.

In the literature, existing models mainly involve conventional approaches, and little work has been devoted to AI approaches. One of the earliest works was done by the US Army Corps of Engineers, which established an information retrieval system called VE-trieval. This program can be queried by keyword on a particular subject to obtain an abstract and other useful information (Degenhardt, 1985). Park (1994) developed VEPRO, a spreadsheet rule-based system with database features that consists of several models parallel to the VE job plan. Alcantara (1996) designed a support program for the information phase of VE, which assigned a data structure for representing and performing analytical tasks on rational data. A computer model for VE methodology was developed by Assaf, Jannadi, and Al-Tamimi (2000), emphasizing life cycle cost calculations. Dahim (2001) at the University of Pittsburgh developed an expert system for VE application in suburban highway design. It utilizes the analytical hierarchy process (AHP) method for the evaluation phase of VE. Naderpajouh and Afshar (2008) proposed a conceptual expert case-based reasoning (CBR) framework that outlines knowledge entities and their relations in the VE workshop. It also benefits from a fuzzy approach to handle uncertainties in the evaluation phase of the job plan. In general, devising an expert system for a VE job plan is recommended by several researchers (Al-Yousefi, 1991; Assaf et al., 2000; Shen & Brandon, 1991).

The main objective of this study is to assist the experts during the creativity phase of VE by utilizing past experiences, so that the same work is not repeated in a particular domain. To this end, a comprehensive fuzzy CBR system is proposed, involving a fuzzy representation of cases and a model for fuzzy clustering of fuzzy data that is used for similarity matching in order to facilitate case retrieval. The basic idea that motivates us to use fuzzy theory is that in the early stages of project development, where VE has the greatest payoffs (Dell’Isola, 1998), most of the parameters are uncertain (Naderpajouh, Afshar, & Mirmohammadsadeghi, 2006). In addition, many experts cannot express their judgments in accurate numerical terms and use linguistic expressions instead. In these cases, fuzzy theory may be employed to handle uncertainties and support linguistic assessments. Thus, the inherent analogical nature of a CBR model and its integration with fuzzy theory would facilitate access to more precise and systematically classified information during a VE workshop.

The rest of the paper is organized as follows. Section 2 summarizes the literature survey for the related areas.
We propose a distance measure for fuzzy data based on the Wasserstein metric in Section 3; by means of this distance and following Keller’s approach, we propose a fuzzy clustering model for fuzzy data with outliers (Section 4). To determine the optimal number of clusters, we modify the Kwon validity index (Kown, 1998) so that it can be used in a complete fuzzy framework and also in noisy environments (Section 5). In Section 6, the main methodology is proposed. As an application, our system is tested in Section 7 on suburban highway design data provided in NCHRP Report 282 (NCHRP, 1986). Finally, conclusions and future works are presented in Section 8.
2. Background
This section briefly reviews the relevant literature on case-based reasoning, fuzzy case-based reasoning, clustering analysis, fuzzy data and metrics for fuzzy data.

2.1. Case-based reasoning
Case-based reasoning (Watson, 1997) is a problem-solving paradigm that solves new problems by searching through a database of previously solved problems (called a case library) for one or more cases whose identifying features closely resemble the current problem. When such a case is found, the solution employed in the historical case(s) is retrieved and applied to the current problem. If the retrieved case is not a close match, the solution is revised, producing a new case that can be retained. Finally, the current problem with the new solution can be added to the case library to increase its robustness. Aamodt and Plaza (1994) regarded CBR as composed of the following cycle (CBR cycle) with four main steps:
• Retrieving similar previously experienced cases whose problem is judged to be similar.
• Reusing the cases by copying or integrating the solutions from the cases retrieved.
• Revising or adapting the solution(s) retrieved in an attempt to solve the new problem.
• Retaining the new solution once it has been confirmed or validated.
The procedures for a CBR are shown in Fig. 1.

2.2. Fuzzy case-based reasoning
Adding fuzzy logic to conventional CBR methods can improve CBR performance. Fuzzy logic can be used in case representation to characterize imprecise and uncertain information. In other words, fuzzy logic allows us to represent cases whose attributes have imprecise and vague values. Moreover, one of the major issues in fuzzy set theory is measuring similarities in order to design robust systems. The inherent fuzzy nature of similarity measurement in CBR is another motivation to use fuzzy theory in case retrieval (Burkhardm & Richterm, 2001). For related work in this area, see for example Hirota et al. (1998), Dvir, Langholz, and Schneider (1999), Liang and Shi (2003) and Wang (1997).

2.3. Clustering analysis
Clustering is a division of a given set of objects into subgroups or clusters, so that objects in the same cluster are as similar as possible, and objects in different clusters are as dissimilar as possible. From a machine learning perspective, clustering is an unsupervised learning of a hidden data concept (Berkhin, 2002). In conventional (hard) clustering analysis, each datum belongs to exactly one cluster, whereas in fuzzy clustering, data points can belong to more than one cluster, and associated with each datum is a set of membership degrees. Fuzzy data are imprecise data obtained from measurements, human judgements or linguistic assessments. In cluster analysis, when there is simultaneous uncertainty in both the partition and the data, a fuzzy clustering model for fuzzy data should be applied (D’Urso & Giordani, 2006a). In our CBR system, cases are fuzzy data. Thus, in Section 4 we propose a fuzzy clustering of fuzzy data for clustering cases in order to reduce the number of cases that must be searched and to save time.

2.4. LR-type fuzzy data
The LR-type fuzzy data represent a general class of fuzzy data. When we are dealing with univariate LR fuzzy data, this kind of data can be represented by a vector of LR-fuzzy numbers. In the more general case of multivariate analysis, we have a matrix of LR-fuzzy numbers (De Oliveira & Pedrycz, 2007). To be more specific, let $L$ (and $R$) be a decreasing shape function which maps $\mathbb{R}^+ \to [0,1]$ with $L(0)=1$; $L(x)<1$, $\forall x>0$; $L(x)>0$, $\forall x<1$; $L(1)=0$ or ($L(x)>0$, $\forall x$ and $L(+\infty)=0$) (Zimmermann, 2001). Then, a fuzzy number $\tilde A$ is of LR-type if for $c \in \mathbb{R}$, $l>0$, $r>0$,

$$
\mu_{\tilde A}(x)=
\begin{cases}
L\left(\dfrac{c-x}{l}\right) & \text{for } x\le c,\\[2ex]
R\left(\dfrac{x-c}{r}\right) & \text{for } x\ge c,
\end{cases}
\tag{1}
$$

where $c$, $l$, $r$ are the center, left and right spreads of $\tilde A$, respectively. Symbolically we can write $\tilde A=(c,l,r)_{LR}$.

Among LR-type fuzzy numbers, the triangular fuzzy numbers (TFNs) are most commonly used. An LR-type fuzzy number $\tilde A$ is called a triangular fuzzy number if $L(x)=R(x)=1-x$, characterized by the following membership function:

$$
\mu_{\tilde A}(x)=
\begin{cases}
1-\dfrac{c-x}{l} & \text{for } x\le c,\\[2ex]
1-\dfrac{x-c}{r} & \text{for } x\ge c.
\end{cases}
\tag{2}
$$
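As an illustration of Eq. (2), the following minimal Python sketch evaluates the membership degree of a triangular fuzzy number. The class name, field names and the example values are hypothetical and not part of the original system. Clipping the membership at zero outside the support $[c-l, c+r]$ is an assumption added here, since Eq. (2) as printed does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """Triangular fuzzy number (c, l, r); names are illustrative only."""
    center: float  # c
    left: float    # l, left spread (> 0 as required by Eq. (1))
    right: float   # r, right spread (> 0 as required by Eq. (1))

    def __post_init__(self):
        if self.left <= 0 or self.right <= 0:
            raise ValueError("spreads must be positive")

    def membership(self, x: float) -> float:
        # Eq. (2): linear decrease on each side of the center.
        if x <= self.center:
            value = 1.0 - (self.center - x) / self.left
        else:
            value = 1.0 - (x - self.center) / self.right
        # Assumption: membership is clipped at zero outside the support.
        return max(0.0, value)


# Hypothetical linguistic term centered at 0.7 with spreads 0.2.
term = TriangularFuzzyNumber(center=0.7, left=0.2, right=0.2)
print(term.membership(0.7), term.membership(0.6), term.membership(0.2))
```

The membership is one at the center, falls linearly with the distance from the center scaled by the relevant spread, and is zero outside the support.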
Fig. 1. CBR cycle (Aamodt & Plaza, 1994).

2.5. Metrics for fuzzy data
In the recent literature, there are some distance measures for fuzzy data. We review some of them in this section.

Definition (The Hausdorff distance). Considering two crisp sets $A, B \subseteq \mathbb{R}^k$ and a distance $d(x,y)$, where $x\in A$ and $y\in B$, the Hausdorff distance is defined as follows:

$$
d_H(A,B)=\max\left\{\sup_{x\in A}\inf_{y\in B} d(x,y),\ \sup_{y\in B}\inf_{x\in A} d(x,y)\right\}.
\tag{3}
$$

According to the concept of $\alpha$-cuts, the Hausdorff metric $d_H$ can be generalized to fuzzy numbers $\tilde F, \tilde G$, where $\tilde F$ (or $\tilde G$)$: \mathbb{R}\to[0,1]$:
$$
d_q(\tilde F,\tilde G)=
\begin{cases}
\left[\displaystyle\int_0^1 \left(d_H(F_\alpha,G_\alpha)\right)^q d\alpha\right]^{1/q} & \text{if } q\in[1,\infty),\\[2ex]
\displaystyle\sup_{\alpha\in[0,1]} d_H(F_\alpha,G_\alpha) & \text{if } q=\infty,
\end{cases}
\tag{4}
$$

where the crisp set $F_\alpha=\{x\in\mathbb{R}^k : \tilde F(x)\ge\alpha\}$, $\alpha\in[0,1]$, is called the $\alpha$-cut of $\tilde F$ (Näther, 2000).

Tran and Duckstein (2002) proposed the following distance between two intervals $A=[a,b]$ and $B=[u,v]$:

$$
\begin{aligned}
d_{TD}^2(A,B) &= \int_{-1/2}^{1/2}\int_{-1/2}^{1/2}\left\{\left[\frac{a+b}{2}+x(b-a)\right]-\left[\frac{u+v}{2}+y(v-u)\right]\right\}^2 dx\,dy\\
&=\left(\frac{a+b}{2}-\frac{u+v}{2}\right)^2+\frac{1}{3}\left[\left(\frac{b-a}{2}\right)^2+\left(\frac{v-u}{2}\right)^2\right].
\end{aligned}
\tag{5}
$$

Then, they used it to formulate their distance measure for fuzzy numbers, but $d_{TD}$ does not satisfy the reflexivity property (Irpino & Verde, 2008):

$$
d_{TD}^2(A,A)=\left(\frac{a+b}{2}-\frac{a+b}{2}\right)^2+\frac{1}{3}\left[\left(\frac{b-a}{2}\right)^2+\left(\frac{b-a}{2}\right)^2\right]=\frac{2}{3}\left(\frac{b-a}{2}\right)^2\ge 0,
\tag{6}
$$

which vanishes only when $a=b$, so the distance from a non-degenerate interval to itself is strictly positive.

A squared Euclidean distance between a pair of LR-type fuzzy data $\tilde A_1=(c_1,l_1,r_1)$ and $\tilde A_2=(c_2,l_2,r_2)$, where $c$ denotes the center and $l$, $r$ indicate, respectively, the left and right spread, is defined by Yang and Ko (1996):

$$
d_{YK}^2(\lambda,\rho)=(c_1-c_2)^2+\left[(c_1-\lambda l_1)-(c_2-\lambda l_2)\right]^2+\left[(c_1+\rho r_1)-(c_2+\rho r_2)\right]^2,
\tag{7}
$$

where $\lambda=\int_0^1 L^{-1}(t)\,dt$ and $\rho=\int_0^1 R^{-1}(t)\,dt$ are parameters that summarize the shape of the left and right tails of the membership function, and $L$, $R$ are the decreasing shape functions defined in Section 2.4.

3. The proposed distance for fuzzy data
In this section, we first present a new distance measure for interval-valued data, and then use it to formulate the distance measure for fuzzy data. Let $I_i=[a_i,b_i]$ be an interval for $i=1,2$. We can parameterize $I_i$ as follows:

$$
I_i(t)=a_i+t(b_i-a_i),\qquad 0\le t\le 1.
\tag{8}
$$

If we represent $I_i$ by means of its midpoint $m_i=\frac{a_i+b_i}{2}$ and radius $d_i=\frac{b_i-a_i}{2}$, Eq. (8) can be rewritten as follows:

$$
I_i(t)=m_i+(2t-1)d_i,\qquad 0\le t\le 1.
\tag{9}
$$

The distance measure between $I_1$ and $I_2$ can be defined as follows:

$$
d^2(I_1,I_2)=\int_0^1\left[I_1(t)-I_2(t)\right]^2 dt=\int_0^1\left[(m_1-m_2)+(d_1-d_2)(2t-1)\right]^2 dt=(m_1-m_2)^2+\frac{1}{3}(d_1-d_2)^2.
\tag{10}
$$

The last equality follows because $\int_0^1(2t-1)\,dt=0$ and $\int_0^1(2t-1)^2dt=\frac{1}{3}$. This distance takes into account all the points in both intervals. Irpino and Verde (2008) derived Eq. (10) from another point of view, using the Wasserstein distance. To be more specific, let $F_1$ and $F_2$ be distribution functions; the Wasserstein $L_2$ metric is defined as follows (Gibbs & Su, 2002):

$$
d_{Wass}(F_1,F_2)=\left[\int_0^1\left(F_1^{-1}(t)-F_2^{-1}(t)\right)^2 dt\right]^{1/2},
\tag{11}
$$

where $F_1^{-1}$ and $F_2^{-1}$ are the quantile functions of the two distributions. If we assume $F_i$, $i=1,2$, to be the uniform distribution function on $[a_i,b_i]$, then $F_i^{-1}(t)$ is the same as the parametric representation $I_i(t)$ in Eq. (8). Thus, the Wasserstein distance coincides with the distance defined in Eq. (10).

Now we are ready to construct a distance between fuzzy data. According to $\alpha$-cuts, the Wasserstein distance $d_{Wass}$ can be generalized to fuzzy numbers $\tilde A_1$ and $\tilde A_2$:

$$
d(\tilde A_1,\tilde A_2)=\left[\int_0^1 d_{Wass}^2\left((\tilde A_1)_\alpha,(\tilde A_2)_\alpha\right)d\alpha\right]^{1/2}.
\tag{12}
$$

We calculate this distance for triangular fuzzy numbers. Let $\tilde A_i=(c_i,l_i,r_i)$, $i=1,2$, be triangular fuzzy numbers with $\alpha$-cuts $(\tilde A_i)_\alpha=\left[l_i\alpha+(c_i-l_i),\ -r_i\alpha+(c_i+r_i)\right]$. The midpoint and the radius of $(\tilde A_i)_\alpha$ are as follows:

$$
m_{(\tilde A_i)_\alpha}=c_i+\frac{1}{2}(1-\alpha)(r_i-l_i),
\tag{13}
$$

$$
d_{(\tilde A_i)_\alpha}=\frac{1}{2}(1-\alpha)(r_i+l_i).
\tag{14}
$$

Then we have:

$$
\begin{aligned}
d^2(\tilde A_1,\tilde A_2)&=\int_0^1 d_{Wass}^2\left((\tilde A_1)_\alpha,(\tilde A_2)_\alpha\right)d\alpha\\
&=\int_0^1\left\{\left[m_{(\tilde A_1)_\alpha}-m_{(\tilde A_2)_\alpha}\right]^2+\frac{1}{3}\left[d_{(\tilde A_1)_\alpha}-d_{(\tilde A_2)_\alpha}\right]^2\right\}d\alpha\\
&=\int_0^1\left\{\left[(c_1-c_2)+\frac{1}{2}(1-\alpha)\left[(r_1-r_2)-(l_1-l_2)\right]\right]^2+\frac{1}{12}(1-\alpha)^2\left[(r_1-r_2)+(l_1-l_2)\right]^2\right\}d\alpha\\
&=(c_1-c_2)^2+\frac{1}{9}\left[(l_1-l_2)^2+(r_1-r_2)^2-(l_1-l_2)(r_1-r_2)\right]-\frac{1}{2}(c_1-c_2)\left[(l_1-l_2)-(r_1-r_2)\right].
\end{aligned}
\tag{15}
$$

We use this distance in the next section for fuzzy clustering of fuzzy data.

4. Fuzzy clustering of fuzzy data with outliers
In this section Keller’s approach (Keller, 2000) is modified so that it can be used for fuzzy data. Similar to his approach, an additional weighting factor is added for each datum to identify outliers and reduce their effects. Before describing the procedure, let us introduce the following notation: $U\equiv\{u_{ik}: i=1,\ldots,c;\ k=1,\ldots,n\}$ is the membership matrix of order $(c\times n)$, where $c$ is the number of clusters and $n$ is the number of data vectors; $u_{ik}\in[0,1]$ denotes the membership degree of the $k$th object in the $i$th cluster. In contrast to Keller’s approach, where data elements and cluster prototypes are crisp, we define them as triangular fuzzy data. Thus, $\tilde X\equiv\left\{\tilde x_k^j=\left(c_{\tilde x_k^j},l_{\tilde x_k^j},r_{\tilde x_k^j}\right): k=1,\ldots,n;\ j=1,\ldots,p\right\}$ and $\tilde V\equiv\left\{\tilde v_i^j=\left(c_{\tilde v_i^j},l_{\tilde v_i^j},r_{\tilde v_i^j}\right): i=1,\ldots,c;\ j=1,\ldots,p\right\}$ are the fuzzy data and fuzzy prototype matrices, respectively. Let us now introduce the objective function:

$$
J(\tilde X,U,\tilde V)=\sum_{i=1}^{c}\sum_{k=1}^{n}u_{ik}^m\,\frac{1}{\omega_k^q}\,d^2(\tilde v_i,\tilde x_k)
\tag{16}
$$

under the constraints

$$
\sum_{k=1}^{n}\omega_k=\omega,
\tag{17}
$$

$$
\sum_{i=1}^{c}u_{ik}=1,
\tag{18}
$$

where $m$ is the degree of fuzziness, and $d^2(\tilde v_i,\tilde x_k)$ is as follows:

$$
\begin{aligned}
d^2(\tilde v_i,\tilde x_k)&=\sum_{j=1}^{p}d^2\left(\tilde v_i^j,\tilde x_k^j\right)\\
&=\sum_{j=1}^{p}\left\{\left(c_{\tilde v_i^j}-c_{\tilde x_k^j}\right)^2+\frac{1}{9}\left[\left(l_{\tilde v_i^j}-l_{\tilde x_k^j}\right)^2+\left(r_{\tilde v_i^j}-r_{\tilde x_k^j}\right)^2-\left(l_{\tilde v_i^j}-l_{\tilde x_k^j}\right)\left(r_{\tilde v_i^j}-r_{\tilde x_k^j}\right)\right]\right.\\
&\qquad\left.-\frac{1}{2}\left(c_{\tilde v_i^j}-c_{\tilde x_k^j}\right)\left[\left(l_{\tilde v_i^j}-l_{\tilde x_k^j}\right)-\left(r_{\tilde v_i^j}-r_{\tilde x_k^j}\right)\right]\right\}.
\end{aligned}
\tag{19}
$$
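The closed forms in Eqs. (15), (16) and (19) translate directly into code. The sketch below is a minimal Python illustration, not the authors' implementation: the function names, the tuple layout `(c, l, r)` and the default values of `m` and `q` are assumptions made here for the example.

```python
from typing import Sequence, Tuple

Tfn = Tuple[float, float, float]  # (center c, left spread l, right spread r)


def tfn_sq_distance(a: Tfn, b: Tfn) -> float:
    """Squared distance between two triangular fuzzy numbers, Eq. (15)."""
    dc, dl, dr = a[0] - b[0], a[1] - b[1], a[2] - b[2]
    return dc * dc + (dl * dl + dr * dr - dl * dr) / 9.0 - 0.5 * dc * (dl - dr)


def vector_sq_distance(v: Sequence[Tfn], x: Sequence[Tfn]) -> float:
    """Eq. (19): sum of the per-feature squared distances over p features."""
    return sum(tfn_sq_distance(vj, xj) for vj, xj in zip(v, x))


def objective(u, weights, prototypes, data, m=2.0, q=1.0) -> float:
    """Eq. (16): u[i][k] is the membership of datum k in cluster i,
    weights[k] is the outlier weight omega_k (assumed strictly positive)."""
    total = 0.0
    for i, v in enumerate(prototypes):
        for k, x in enumerate(data):
            total += (u[i][k] ** m) / (weights[k] ** q) * vector_sq_distance(v, x)
    return total


# Hypothetical values, chosen only to exercise the functions.
a, b = (0.5, 0.1, 0.2), (0.7, 0.2, 0.1)
print(tfn_sq_distance(a, b), tfn_sq_distance(a, a))
```

For two crisp values (all spreads zero) the distance reduces to the ordinary squared difference of the centers, and the distance of a fuzzy number to itself is zero, which is the reflexivity property that $d_{TD}$ lacks.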
As Keller points out, the factor $\omega_k$ represents the weight for the $k$th datum and $\omega$ is a constant real-valued parameter. With the constant parameter $q$, the influence of the outlier weighting factor can be controlled. For this purpose, outliers are assigned a large weight $\omega_k$, so $\frac{1}{\omega_k^q}$ is small in this case. The necessary conditions for minimizing the objective function are as follows:

$$
c_{\tilde v_i^j}=\frac{\sum_{k=1}^{n}u_{ik}^m\frac{1}{\omega_k^q}\left[2c_{\tilde x_k^j}+\frac{1}{2}\left[\left(l_{\tilde v_i^j}-l_{\tilde x_k^j}\right)-\left(r_{\tilde v_i^j}-r_{\tilde x_k^j}\right)\right]\right]}{2\sum_{k=1}^{n}u_{ik}^m\frac{1}{\omega_k^q}},
\tag{20}
$$

$$
l_{\tilde v_i^j}=\frac{\sum_{k=1}^{n}u_{ik}^m\frac{1}{\omega_k^q}\left[\frac{2}{9}l_{\tilde x_k^j}+\frac{1}{9}\left(r_{\tilde v_i^j}-r_{\tilde x_k^j}\right)+\frac{1}{2}\left(c_{\tilde v_i^j}-c_{\tilde x_k^j}\right)\right]}{\frac{2}{9}\sum_{k=1}^{n}u_{ik}^m\frac{1}{\omega_k^q}},
\tag{21}
$$

$$
r_{\tilde v_i^j}=\frac{\sum_{k=1}^{n}u_{ik}^m\frac{1}{\omega_k^q}\left[\frac{2}{9}r_{\tilde x_k^j}+\frac{1}{9}\left(l_{\tilde v_i^j}-l_{\tilde x_k^j}\right)-\frac{1}{2}\left(c_{\tilde v_i^j}-c_{\tilde x_k^j}\right)\right]}{\frac{2}{9}\sum_{k=1}^{n}u_{ik}^m\frac{1}{\omega_k^q}},
\tag{22}
$$

$$
\omega_k=\frac{\left[\sum_{i=1}^{c}u_{ik}^m\,d^2(\tilde v_i,\tilde x_k)\right]^{\frac{1}{q+1}}}{\sum_{l=1}^{n}\left[\sum_{i=1}^{c}u_{il}^m\,d^2(\tilde v_i,\tilde x_l)\right]^{\frac{1}{q+1}}}\,\omega,
\tag{23}
$$

$$
u_{ik}=\frac{1}{\sum_{r=1}^{c}\left(\dfrac{d^2(\tilde v_i,\tilde x_k)}{d^2(\tilde v_r,\tilde x_k)}\right)^{\frac{1}{m-1}}}.
\tag{24}
$$

As can be seen, the membership update in Eq. (24) is not affected by the weights, while the cluster prototypes take the weights into account; points with high representativeness influence the prototypes more than outliers do. On the basis of the necessary conditions, we can construct an iterative algorithm as follows:

Algorithm.
Step 1: Fix the degree of fuzziness ($m$), the number of clusters ($c$), $\omega$ and $q$. Choose an initial fuzzy c-partition $U^{(0)}$. Also, choose initial spreads and weights for each datum subject to Eq. (17). Set $t=0$.
Step 2: Calculate $\tilde V^{(t)}=\left(c_{\tilde v}^{(t)},l_{\tilde v}^{(t)},r_{\tilde v}^{(t)}\right)$ using $U^{(t)}$, the spreads, the weights and Eqs. (20)–(22).
Step 3: Update $\omega_k^{(t)}$, $k=1,\ldots,n$, using Eq. (23) and update $U^{(t)}$ to $U^{(t+1)}$ using $\tilde V^{(t)}$ and Eq. (24).
Step 4: If $\|U^{(t+1)}-U^{(t)}\|<\varepsilon$, where $\varepsilon$ is a small non-negative number fixed by the researcher, the algorithm has converged. Otherwise, set $t=t+1$ and go to Step 2.

5. Cluster validity index
As Pal and Bezdek (1995) pointed out, once clusters are found, it is necessary to validate them. This is the cluster validity problem. Many validity indices can be found in the literature. Early indices such as the partition coefficient and partition entropy (Bezdek, 1974a, 1974b) can be applied directly to the fuzzy clustering of fuzzy data, but they use only fuzzy memberships, which may not be closely connected to the geometrical structure of the data (Zhang, Wang, Zhang, & Li, 2008). There is also another class of indices which simultaneously take fuzzy memberships and the data structure into consideration. These indices cannot be applied directly to the fuzzy clustering of fuzzy data and should be extended to a complete fuzzy framework. The Kwon validity index (Kown, 1998) is a member of this second class. It is a modification of the Xie and Beni validity index (Xie & Beni, 1991) with the added advantage of eliminating the monotonically decreasing tendency as the number of clusters increases, but it has the disadvantage of not being robust to noise. Here, in order to obtain the number of clusters $c$ in a complete fuzzy framework and also in noisy environments, we modify the Kwon validity index as follows:

$$
F_{fr}(U,\tilde V;\tilde X)=\frac{\sum_{i=1}^{c}\sum_{k=1}^{n}u_{ik}^m\frac{1}{\omega_k^q}\,d^2(\tilde v_i,\tilde x_k)+\frac{1}{c}\sum_{i=1}^{c}d^2(\tilde v_i,\tilde v_f)}{\min_{i\ne k}d^2(\tilde v_i,\tilde v_k)},
\tag{25}
$$

where $\tilde v_f=\dfrac{\sum_i\sum_k u_{ik}^m\frac{1}{\omega_k^q}\tilde x_k}{\sum_i\sum_k u_{ik}^m\frac{1}{\omega_k^q}}$.

Our goal is to find the fuzzy c-partition with the smallest value of $F_{fr}$. The differences between the modified version and the original Kwon validity index are as follows:
• The modified validity index can be used in a complete fuzzy framework.
• The weighted fuzzy mean is used instead of the crisp mean of the data.
• The factor $\frac{1}{\omega_k^q}$ is added to the first term of the numerator, so that the index can be used in noisy environments.
• The weighting exponent is generalized from 2 to $m$.
Thus, this modified version of the Kwon validity index is robust to noise and can be used for fuzzy clustering of fuzzy data.

6. Methodology
This section presents the methodology used to develop our system and describes its modules in detail.

6.1. Case representation and indexing
Each case is a project to which VE has been applied. One feature is the name of the part of the project on which VE studies have been conducted. We use this feature as an index, and cases are classified according to it. The other features are project characteristics, which are triangular fuzzy numbers determined by experts. These are usually domain dependent. The features are weighted through a weighting method such as fuzzy AHP. Solutions are practical ideas (or alternatives) which were generated by experts in a VE workshop. Each solution is a binary vector in which each entry corresponds to one idea: if an idea is generated, the corresponding entry is equal to one; otherwise, it is equal to zero. Fig. 2 illustrates case representation in the case library.

Fig. 2. Case representation in the case library.

6.2. Case retrieval
The retrieval algorithm begins by deciding to which class the query case belongs. After determining the class of the query case, we have to search for similar cases in that class. For this purpose, the cases are clustered into several groups by the algorithm proposed in Section 4. Next, the degree of similarity between the query case and each cluster prototype $v_i$ is calculated using $d^{(W)}$ and substituting $v_i$ for $r$ in the similarity function defined as follows:

$$
SM(q,r)=e^{-\beta\,d_{FR}^{(W)}(q,r)},
\tag{26}
$$

where $\beta$ is a positive constant. The most similar cluster to the query case is the one satisfying $\arg\max_i SM(q,v_i)$ for $i=1,\ldots,c$. The cases within this cluster are then compared to the query case according to Eq. (26).
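The two-stage retrieval described above (pick the closest cluster prototype, then rank the cases inside that cluster) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code. In particular, the text does not spell out the weighted distance $d^{(W)}$, so the sketch assumes it is the square root of the feature-weighted sum of the per-feature squared distances of Eq. (15); the function names, the `(case_id, features)` layout and the value of `beta` are likewise hypothetical.

```python
import math


def tfn_sq_distance(a, b):
    """Squared distance between triangular fuzzy numbers (c, l, r), Eq. (15)."""
    dc, dl, dr = a[0] - b[0], a[1] - b[1], a[2] - b[2]
    return dc * dc + (dl * dl + dr * dr - dl * dr) / 9.0 - 0.5 * dc * (dl - dr)


def weighted_distance(q, r, feature_weights):
    # Assumption: d^(W) is the root of the feature-weighted sum of squared distances.
    total = sum(w * tfn_sq_distance(qj, rj)
                for w, qj, rj in zip(feature_weights, q, r))
    return math.sqrt(max(0.0, total))  # guard against tiny negative round-off


def similarity(q, r, feature_weights, beta=1.0):
    """Eq. (26): similarity decays exponentially with the weighted distance."""
    return math.exp(-beta * weighted_distance(q, r, feature_weights))


def retrieve(query, prototypes, clusters, feature_weights, beta=1.0):
    """Return the index of the most similar cluster and its cases ranked by
    similarity. clusters[i] is a list of (case_id, features) pairs."""
    best = max(range(len(prototypes)),
               key=lambda i: similarity(query, prototypes[i], feature_weights, beta))
    ranked = sorted(((similarity(query, feats, feature_weights, beta), case_id)
                     for case_id, feats in clusters[best]),
                    key=lambda pair: pair[0], reverse=True)
    return best, ranked


# Hypothetical single-feature example.
prototypes = [[(0.2, 0.1, 0.1)], [(0.8, 0.1, 0.1)]]
clusters = [[("a", [(0.25, 0.1, 0.1)])],
            [("b", [(0.8, 0.1, 0.1)]), ("c", [(0.7, 0.1, 0.2)])]]
print(retrieve([(0.8, 0.1, 0.1)], prototypes, clusters, [1.0]))
```

Comparing the query with the $c$ prototypes first means that only the cases of one cluster are scored individually, which is where the time saving claimed for the clustering step comes from.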
Fig. 3. Linguistic variables and the associated fuzzy numbers (linguistic terms: bad, poor, good, excellent; both axes range from 0 to 1).
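Because retrieval relies on the clustering algorithm of Section 4, it may help to see two of its update steps in code. The sketch below is a partial Python illustration of Eq. (24) (memberships) and Eq. (23) (outlier weights) only; the prototype updates of Eqs. (20)–(22) are omitted. It takes a precomputed matrix `d2[i][k]` of squared distances between prototype `i` and datum `k`. The function names, the default parameters and the handling of zero distances (equal sharing among coinciding prototypes) are assumptions made here, not details given in the paper.

```python
def update_memberships(d2, m=2.0):
    """Eq. (24): membership of datum k in cluster i from squared distances."""
    c, n = len(d2), len(d2[0])
    exponent = 1.0 / (m - 1.0)
    u = [[0.0] * n for _ in range(c)]
    for k in range(n):
        zeros = [i for i in range(c) if d2[i][k] == 0.0]
        if zeros:
            # Assumption: a datum lying exactly on one or more prototypes
            # shares its full membership equally among them.
            for i in zeros:
                u[i][k] = 1.0 / len(zeros)
            continue
        for i in range(c):
            u[i][k] = 1.0 / sum((d2[i][k] / d2[r][k]) ** exponent
                                for r in range(c))
    return u


def update_weights(u, d2, m=2.0, q=1.0, omega_total=1.0):
    """Eq. (23): outlier weights that sum to omega_total (constraint (17)).
    Raises ZeroDivisionError if every datum coincides with its prototype."""
    c, n = len(u), len(u[0])
    power = 1.0 / (q + 1.0)
    a = [sum((u[i][k] ** m) * d2[i][k] for i in range(c)) ** power
         for k in range(n)]
    total = sum(a)
    return [omega_total * a_k / total for a_k in a]


# Hypothetical squared distances for c = 2 clusters and n = 2 data.
d2 = [[1.0, 4.0],
      [1.0, 1.0]]
u = update_memberships(d2)
print(u, update_weights(u, d2, omega_total=2.0))
```

A datum that is far from all prototypes receives a larger weight $\omega_k$, so its factor $1/\omega_k^q$ in Eq. (16) is smaller and it pulls the prototypes less.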
6.3. Case adaptation
If one of the retrieved cases has a similarity degree equal to one, null adaptation is used: the solution of the retrieved case is applied to the query case without any modification. Otherwise, we suggest compositional adaptation, based on the fact that cases usually have little variance in similarity degrees. This allows us to combine the corresponding solutions efficiently to obtain the final solution, as follows: for each entry of the solution vectors, we take a weighted average over the retrieved cases with weights proportional to the similarity degrees and then apply a threshold; entries above the threshold are mapped to one, the others to zero. If we denote the solutions of the $k$ retrieved cases by $S_i$, their similarities with the query case by $SM_i$ and the threshold by $\theta$, then the solution of the query case ($S_q$) is as follows:

$$
S_q=\left\lceil\frac{\sum_{i=1}^{k}SM_i\,S_i}{\sum_{i=1}^{k}SM_i}-\theta\right\rceil,
\tag{27}
$$

where $\lceil\cdot\rceil$ is the ceiling function, applied to each entry.

6.4. Case retainment
Eventually, if the experts find the solution (ideas/alternatives) acceptable, it is added as a new case to the case library; otherwise, they run a brainstorming session, generate ideas and add the practical ones to the case library as the solution of the query case.

7. Application
Our system was tested on suburban highway design data extracted from the National Cooperative Highway Research Program (NCHRP) Report 282 (NCHRP, 1986). The features include the existing design, the maximum available width and the desirability of operational and safety indices. The existing design can be one of these options:
• Two-Lane Undivided, abbreviated as 2U.
• Three-Lane Divided with Center Two-Way Left-Turn Lane, abbreviated as 3T.
• Four-Lane Undivided, abbreviated as 4U.
• Four-Lane Divided with Raised Median, abbreviated as 4D.
• Five-Lane Divided with Center Two-Way Left-Turn Lane, abbreviated as 5T.
• Six-Lane Divided with Raised Median, abbreviated as 6D.
• Seven-Lane with Center Two-Way Left-Turn Lane, abbreviated as 7T.
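Returning to the compositional adaptation of Section 6.3, the thresholded weighted vote over binary solution vectors can be sketched in a few lines of Python. The function name, the default threshold of 0.5 and the example data are hypothetical. The sketch uses a direct comparison with the threshold, which is what the prose of Section 6.3 describes, rather than an explicit ceiling.

```python
def adapt(solutions, similarities, theta=0.5):
    """Combine binary solution vectors of the retrieved cases.

    solutions[i] is the 0/1 vector of retrieved case i and similarities[i]
    its similarity to the query. An idea is kept when its similarity-weighted
    average exceeds the threshold theta.
    """
    total = sum(similarities)
    n_ideas = len(solutions[0])
    result = []
    for j in range(n_ideas):
        average = sum(s * sol[j] for s, sol in zip(similarities, solutions)) / total
        result.append(1 if average > theta else 0)
    return result


# Three hypothetical retrieved cases voting on three ideas.
retrieved = [[1, 0, 1],
             [1, 1, 0],
             [0, 0, 1]]
print(adapt(retrieved, [0.9, 0.6, 0.3]))
```

In this example the second idea is proposed only by a case carrying one third of the total similarity, so it falls below the threshold and is dropped from the adapted solution.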
Table 1
The average MSE for each class of the indexed cases.

Existing design    MSE
2U                 0.08
3T                 0.08
4U                 0.13
4D                 0.13
5T                 0.05
6D                 0.08
7T                 0.07
There are 44 feasible alternatives that we use as case solutions. We select the ‘‘existing design’’ feature as an index and classify the cases according to this feature. Then, the class of the query case is determined. The ‘‘maximum available width’’ is crisp, so we represent it as a fuzzy singleton. The other features are linguistic terms and are transformed to triangular fuzzy numbers according to Fig. 3. After retrieving the similar cases through our clustering algorithm and adapting their solutions as explained in Sections 6.2 and 6.3, the solution of the query case is generated. If this solution is acceptable, it is added to the case library.

The performance of our system was validated by leave-one-out cross-validation (LOOCV). LOOCV uses a single observation from the original sample as the validation data and the remaining observations as the training data. This is repeated so that each observation in the sample is used once as the validation data. Each time, the mean squared error (MSE) is computed. The average MSE for each class of the indexed cases is shown in Table 1. Since we wanted to develop a general system that can be used in all domains, we used compositional adaptation. When developing a system for a specific domain, other methods of adaptation such as transformational and derivational adaptation may reduce the error.

8. Conclusion and future works
This paper presented a fuzzy CBR system for value engineering. This system can contribute significantly to the efficiency of the value study, providing the VE team with an extensive memory of previous experiences. Since cases are fuzzy data, a fuzzy clustering model for fuzzy data, based on a new distance, is used to reduce the number of cases that must be searched and to save time. In addition, the Kwon cluster validity index is modified to validate the number of clusters. Finally, to test the performance of our system, it is applied to suburban highway design data extracted from NCHRP Report 282.
Another problem that can be explored is the development of a rough set-based case-based reasoner for value engineering. In addition, as mentioned before, trying other methods of adaptation such as transformational and derivational adaptation may reduce the error.
References
Aamodt, A., & Plaza, E. (1994). Case-based reasoning: Fundamental issues, methodological variations, and system approaches. AI Communications. IOS Press (Vol. 7(1), pp. 39–59).
Alcantara, P. Jr. (1996). Development of a computer understandable representation of a design rationale to support value engineering. Unpublished Ph.D. dissertation, School of Virginia Polytechnic Institute and State University.
Al-Yousefi, A. S. (1991). Expert system: A programmable approach to VE logic. In Proceeding of the 1991 SAVE international conference (pp. 155–167). Kansas City.
Assaf, S., Jannadi, O. A., & Al-Tamimi, A. (2000). Computerized system for application of value engineering methodology. ASCE Journal of Computing in Civil Engineering, 14(3), 206–214.
Berkhin, P. (2002). Survey of clustering data mining techniques. Accrue Software Inc.
Bezdek, J. C. (1974a). Numerical taxonomy with fuzzy sets. Journal of Mathematical Biology, 1, 57–71.
Bezdek, J. C. (1974b). Cluster validity with fuzzy sets. Journal of Cybernetics, 9, 58–72.
Burkhardm, H. D., & Richterm, M. M. (2001). On the notion of similarity in case based reasoning and fuzzy theory. Soft computing in case-based reasoning. London: Springer (chap. 2).
Dahim, H., & Mohammad A. (2001). Value engineering expert system in suburban highway design (VEESSHD). Ph.D. thesis, University of Pittsburgh.
Degenhardt, G. (1985). VE-TRIEVAL a corp of engineers value engineering information retrieval system. In Proceeding of the 1985 SAVE international conference (pp. 14–25). Texas.
Dell’Isola, A. J. (1998). Value engineering: Practical applications (BK and Disk ed.). R.S. Means Company.
De Oliveira, J. V., & Pedrycz, W. (2007). Advances in fuzzy clustering and its applications. San Francisco: Wiley.
D’Urso, P., & Giordani, P. (2006a). A weighted fuzzy c-means clustering model for fuzzy data. Computational Statistics Data Analysis, 50(6), 1496–1523.
Dvir, G., Langholz, G., & Schneider, M. (1999). Matching attributes in a fuzzy case based reasoning. Fuzzy Information Processing Society, 33–36.
Gibbs, A. L., & Su, F. E. (2002). On choosing and bounding probability metrics. International Statistical Review, 70, 419.
Hirota, K., Yoshino, H., Xu, M. Q., Zhu, Y., Li, X. Y., & Horie, D. (1998). A fuzzy case based reasoning system for the legal inference. Fuzzy systems proceedings. IEEE world congress on computational intelligence. In The 1998 IEEE international conference (Vol. 2, pp. 1350–1354).
Irpino, A., & Verde, R. (2008). Dynamic clustering for interval data using a Wasserstein-based distance. Pattern Recognition Letters, 29, 1648–1658.
Keller, A. (2000). Fuzzy clustering with outliers. In T. Whalen (Ed.), Proceedings of the 19th international conference on the North American fuzzy information processing society, NAFIPS00 (pp. 143–147).
Kown, S. H. (1998). Cluster validity index for fuzzy clustering. IEEE Electronic Letters, 34(22).
Liang, Z., & Shi, P. (2003). Similarity measures on intuitionistic fuzzy sets. Pattern Recognition Letters, 24, 2687–2693.
Mandelbaum, J., & Reed, D. L. (2006). Value engineering handbook, IDA Paper P-4114, Alexandria, VA: Institute for Defense Analysis.
Naderpajouh, N., & Afshar, A. (2008). A case-based reasoning approach to application of value engineering methodology in the construction industry. Journal of Construction Management and Economics, 26, 363–372.
Naderpajouh, N., Afshar, SA., & Mirmohammadsadeghi, A. (2006). Fuzzy decision support system for application of value engineering in construction industry. International Journal of Civil Engineering, 4(4), 261–273.
Näther, W. (2000). On random fuzzy variables of second order and their application to linear statistical inference with fuzzy data. Metrika, 51, 201–221.
National Cooperative Highway Research Program (NCHRP) Report 282 (1986). Multi-lane design alternatives for improving suburban highways. Washington, DC: Transportation Research Board.
Pal, N. R., & Bezdek, J. C. (1995). On cluster validity for fuzzy c-means model. IEEE Transactions on Fuzzy Systems, 1, 370–379.
Park, C. (1994). An integrated value engineering computer system or construction projects. Unpublished Ph.D. dissertation, School of Engineering University of Florida.
Shen, Q., & Brandon, P. S. (1991). Can expert systems improve VM implementation? In Proceedings of the 1991 SAVE international conference (pp. 168–176). Kansas City.
Tran, L., & Duckstein, L. (2002). Comparison of fuzzy numbers using a fuzzy distance measure. Fuzzy Sets and Systems, 130, 331–341.
Wang, W. J. (1997). New similarity measures on fuzzy sets and on elements. Fuzzy Sets and Systems, 85, 305–309.
Watson, I. (1997). Applying case-based reasoning: Techniques for enterprise systems. Morgan Kaufmann Publishers.
Xie, X. L., & Beni, G. (1991). A validity measure for fuzzy clustering. IEEE Transactions on Pattern Analysis Machine Intelligence, 13, 841–847.
Yang, M. S., & Ko, C. H. (1996). On a class of fuzzy c-numbers clustering procedures for fuzzy data. Fuzzy Sets and Systems, 84, 49–60.
Zhang, Y., Wang, W., Zhang, X., & Li, Y. (2008). A cluster validity index for fuzzy clustering. Information Science, 178, 1205–1218.
Zimmermann, H. J. (2001). Fuzzy Set Theory and its Applications. Dordrecht: Kluwer Academic Press.