Automation in Construction 28 (2012) 106–115
High-Performance Concrete Compressive Strength Prediction Using Time-Weighted Evolutionary Fuzzy Support Vector Machines Inference Model

Min-Yuan Cheng a, Jui-Sheng Chou a, Andreas F.V. Roy b,*, Yu-Wei Wu a

a Dept. of Construction Engineering, National Taiwan University of Science and Technology, #43, Sec. 4, Keelung Rd., Taipei 106, Taiwan, ROC
b Department of Civil Engineering, Parahyangan Catholic University, Indonesia
Article history: Accepted 27 July 2012. Available online 24 August 2012.

Keywords: High performance concrete; Fuzzy logic; Time series; Weighted support vector machines; Fast messy genetic algorithms
Abstract: The major difference between High Performance Concrete (HPC) and conventional concrete is essentially the use of mineral and chemical admixtures. These two admixtures cause the mechanical behavior of HPC to differ from that of conventional concrete at the microstructural level. Certain properties of HPC are not fully understood, since the relationship between ingredients and concrete properties is highly nonlinear. Therefore, predicting HPC behavior is relatively difficult compared to predicting conventional concrete behavior. This paper proposes an Artificial Intelligence hybrid system to predict HPC compressive strength that fuses Fuzzy Logic (FL), weighted Support Vector Machines (wSVM) and fast messy genetic algorithms (fmGA) into an Evolutionary Fuzzy Support Vector Machine Inference Model for Time Series Data (EFSIMT). Validation results show that the EFSIMT achieves higher performance than Support Vector Machines (SVM) and obtains results comparable with a Back-Propagation Neural Network (BPN). Hence, EFSIMT offers strong potential as a valuable predictive tool for HPC compressive strength. © 2012 Elsevier B.V. All rights reserved.
1. Introduction

High-performance concrete (HPC) is a construction material that has gained popularity over the last decade due to special characteristics that include high workability, high strength, and high durability [35]. HPC differs from conventional concrete, which is a mixture of four ingredients, namely Portland cement (PC), water, fine aggregates and coarse aggregates. HPC employs two additional ingredients, namely a mineral admixture (e.g., fly ash, blast furnace slag, silica fume) and a chemical admixture (superplasticizer) [11]. Therefore, the major difference between HPC and conventional concrete is essentially the use of mineral and chemical admixtures [5,29]. These two admixtures cause the mechanical behavior of HPC to differ from that of conventional concrete at the microstructural level [2]. The microstructure of HPC is more compact, as the mineral admixture acts as a fine filler and pozzolanic material. Moreover, the chemical
admixture reduces the water content, which at the same time reduces the level of porosity within the hydrated cement paste [2,29]. Therefore, the compressive strength of HPC is higher than that of conventional concrete, since the two admixtures decrease the porosity of the hydrated cement paste, which represents the weakest link in the concrete microstructure. Predicting HPC behavior is relatively difficult compared to predicting conventional concrete behavior. Chou et al. [15] stated that certain properties of HPC are not fully understood, since the relationship between ingredients and concrete properties is highly nonlinear. Therefore, traditional models of concrete properties are inadequate for analyzing HPC compressive strength. Mix proportioning is the process of choosing suitable ingredients of concrete and determining their relative quantities with the object of producing, as economically as possible, concrete with certain minimum properties, such as compressive strength [33]. Popular mix proportioning methods for HPC include those proposed in [1,2,32], among other methods [28]. However, the required mix proportions of HPC are most commonly obtained from trial mixes as stated in relevant standards, from experience, and from rules of thumb [3,29]. Compressive strength is a mechanical property critical to measuring HPC quality [4,34]. Twenty-eight-day compressive strength is the most widely used objective function in mixture design. However, as pointed out previously, the result depends on ingredient combinations and proportions, mixing techniques and other factors that must
be controlled during manufacturing. Kasperkiewicz et al. [24] stated that the introduction of new ingredients and technologies implies that the number of parameters for HPC mix design may extend to a 10-, 20- or even higher-dimensional decision space. Waiting 28 days to obtain the 28-day compressive strength is time-consuming and not a common practice in the construction industry. Therefore, many researchers have worked to establish prediction tools able to obtain an early determination of compressive strength, ideally well before concrete is laid down at a construction site. Prediction of concrete compressive strength is one area of active research in the civil engineering field, and a considerable number of relevant studies have been carried out over the past 30 years. Zain and Abd [42] categorized the methods into three types, i.e., those using statistical techniques, computational modeling and artificial neural networks. Akkurt et al. [4] also noted the use of fuzzy logic to predict concrete compressive strength. Statistical techniques represent a conventional approach, and are used primarily to predict conventional concrete compressive strength by establishing linear and nonlinear regression equations. The approach starts with an assumed analytical equation, followed by regression analysis that employs limited experimental data to determine the unknown coefficients. While many regression models have been suggested, obtaining a suitable regression equation is not an easy task. Moreover, in this prediction effort, the early compressive strength at 6 hours, 1 day and 3 days is usually embodied in the prediction equation, which necessitates some time delay in prediction [34]. Furthermore, for HPC, where the number of influencing factors is greater than for conventional concrete, such regression models are neither suitable nor adequate for predicting compressive strength [41]. As traditional methods handle complex, non-linear and uncertain materials (like HPC) poorly, many researchers have sought better prediction tools. Many studies have proposed artificial neural networks (ANNs) and ANN variations to map non-linear relationships among factors influencing 28-day HPC compressive strength. Kasperkiewicz et al. [24] proposed an artificial neural network of the fuzzy-ARTMAP type to predict HPC strength properties. It was found that concrete property prediction could be effectively modeled using a neural system without being affected by data complexity, incompleteness, or incoherence. In 1998, Yeh [40] demonstrated the superiority of ANNs in predicting HPC compressive strength, producing better results than regression analysis. Yeh also showed how easily ANNs could adapt to different numerical experiment settings in order to review the effect of each variable proportion on the concrete mix. Topçu and Saridemir [38] used ANNs and fuzzy logic (FL), separately, to predict 7-, 28- and 90-day compressive strengths of HPC with both high-lime and low-lime fly ash contents. Obtaining predicted values very close to actual experimental results, Topçu and Saridemir demonstrated that neural networks and fuzzy logic are practicable as predictive tools for determining the compressive strength of fly ash concrete. Zarandi et al. [43] fused fuzzy neural networks (FNNs) and polynomial neural networks (PNNs) to form fuzzy polynomial neural networks (FPNN) Type 1 and Type 2, which were also employed to predict concrete compressive strength.
Using root mean square (RMS) and correlation factors (CFs) as evaluation criteria, FPNN Type-1 delivered better results than those attained by the adaptive network-based fuzzy inference system (ANFIS). Parichatprecha and Nimityongskul [35] developed an ANN model to determine the influences of water content, the water-binder ratio, and the effect of replacing cement with fly ash and silica fume on HPC durability. In this model, ANNs were used to predict HPC durability, the results of which were then compared against regression equation results. Furthermore, as the neural network is a "black box" model, Yeh and Lien [41] proposed the genetic operation tree (GOT) as an alternative model for predicting HPC compressive
strength. GOT comprises an operation tree (OT) and a genetic algorithm (GA), and automatically produces self-organized formulas to predict strengths. However, even though GOT obtained results better than non-linear regression formulas, its prediction accuracy was inferior to that of ANNs. The success of ANNs and their variants in handling highly complex materials such as HPC opened up the possibility of using other AI approaches. The development of new AI techniques has spurred follow-on research into their adoption and utilization in the construction industry. For example, SVM, a newer AI technique, has been shown to deliver comparable or higher performance than traditional learning machines and has been introduced as a powerful tool to solve classification and regression problems [7,13]. However, SVM presents several inherent shortcomings. Firstly, SVM is unable to provide high prediction accuracy unless the penalty parameter (C) and kernel parameters are set appropriately. Secondly, SVM considers all training data points equally in establishing the decision surface. Therefore, Lin and Wang [30] proposed a modified version of SVM, known as fuzzy SVM (FSVM) or weighted SVM (wSVM), to weight all training data points in order to allow different input points to contribute differently to the learned decision surface. Such modification is also suitable for time series prediction problems, where older training points are associated with lower weights, so that the effect of older training points is reduced when the regression function is optimized. The main purpose of this research was to predict compressive strength in HPC using an AI hybrid system that fuses FL, wSVM and fast messy genetic algorithms (fmGA) into an evolutionary fuzzy support vector machine inference model for time series data (EFSIMT). Within the EFSIMT, FL is used as a fuzzy inference mechanism to handle vagueness and uncertainty due to material characteristics such as the HPC ingredient mix, workmanship, site environment, temperature, etc. wSVM handles the complex fuzzy input-output mapping relationship and focuses on the time series characteristics inherent in HPC experimental datasets, since compressive strength is measured at different testing ages. fmGA is deployed as an optimization tool to handle the FL and wSVM search parameters. This study applied HPC experimental data originally generated by Yeh [40] and posted to the University of California, Irvine machine learning repository website. To verify and validate the proposed system, EFSIMT performance was compared against original SVMs and a back-propagation neural network (BPN).

2. Brief introduction to FL, weighted SVMs, time series analysis, and fmGA

2.1. Fuzzy logic

Fuzzy logic (FL) is a popular AI technique introduced by Zadeh in the 1960s that has been used in forecasting, decision making and action control in environments characterized by uncertainty, vagueness, presumptions and subjectivity [6]. Chan et al. [9] found that, between 1996 and 2005, FL was used by many scholars in construction-related research, either as a single or a hybrid technique, in applications that may be categorized into four different types, namely decision-making, performance, evaluation/assessment, and modeling.
Cases including contractor selection in multi-criteria environments, sustainable residential building assessments, site layout planning, dynamic resource allocation, procurement selection modeling, bid/no-bid decision-making, and project selection are several example applications of FL in construction management decision-making. FL consists of four major components, namely fuzzification, rule base, inference engine and defuzzification. Fuzzification is a process that uses membership functions (MFs) to convert the value of input variables into corresponding linguistic variables. The result, which is
used by the inference engine, simulates the human decision-making process based on fuzzy implications and available rules. In the final step, the fuzzy set, as the output of the inference process, is converted into crisp output. This process, which reverses fuzzification, is called defuzzification [26]. Despite the advantages of FL, the approach has a number of problems, including identifying appropriate MFs and the number of rules for an application. This process is subjective in nature and reflects the context in which a problem is viewed. The more complex the problem, the more difficult MF construction and rule definition become [27]. Some researchers perceive this drawback as an optimization problem, because determining MF configurations and fuzzy rules is complicated and problem-oriented. Some researchers have worked to overcome the remaining difficulties by fusing FL with AI optimization techniques, such as GA and ant colony [23,31]. These optimization methods have demonstrated their ability to minimize time-consuming operations as well as the level of human intervention necessary to optimize MFs and fuzzy rules.

2.2. Weighted support vector machines

The term "weighted support vector machines" (wSVMs) was proposed by Fan and Ramamohanarao [18] as a synonym for Fuzzy Support Vector Machines (FSVMs) to draw attention to the effective weighting of fuzzy memberships at each FSVM training point (in this paper, to avoid confusion with the FL technique, the term "wSVM" is used). Fan and Ramamohanarao [18] stated that different input vectors make different contributions to the learning of the decision surface. Thus, the important issue in training weighted SVMs is how to develop a reliable weighting model that reflects the true noise distribution in the training data. Fan and Ramamohanarao [18] developed emerging patterns (EPs) to weight the training data. Lin and Wang [30] developed FSVMs to enhance the ability of support vector machines (SVMs) to reduce the effects of outliers and noise in data points. While SVMs, a recent AI paradigm developed by Vapnik [39] that has been used in a wide range of applications, treat all training points of a given class uniformly, training points in many real world applications bear different importance weightings for classification purposes. To solve this problem, Lin and Wang [30] applied a fuzzy membership to each SVM input point, thus allowing different input points to contribute differently to the learned decision surface. In time series prediction problems, older training points are associated with lower weights, so that the effect of older training points is reduced when the regression function is optimized. In sequential learning and inference settings such as time series problems, where a point from the recent past may be given greater weight than a point from further in the past, a function of time t_i can be selected as the weighting scheme s_i. Lin and Wang [30] proposed three time functions, linear, quadratic, and exponential, as shown in Eqs. (1)-(3). These three time functions were used by Khemchandani et al. [25] on financial time series forecasting problems, demonstrating their ability to deliver better results than SVM.

s_i = f_l(t_i) = \frac{1-\sigma}{t_m - t_1}\, t_i + \frac{t_m \sigma - t_1}{t_m - t_1}   (1)

s_i = f_q(t_i) = (1-\sigma)\left(\frac{t_i - t_1}{t_m - t_1}\right)^2 + \sigma   (2)

s_i = f_e(t_i) = \frac{1}{1 + \exp\!\left(\sigma - 2\sigma\,\frac{t_i - t_1}{t_m - t_1}\right)}   (3)
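For concreteness, the following is a minimal sketch of the three weighting schemes; it is an illustration rather than the authors' code, and it assumes evenly indexed time stamps t_i = i for i = 1, ..., m:

```python
import numpy as np

# Sketch of the time-weighting schemes in Eqs. (1)-(3), assuming t_i = i,
# so t_1 = 1 and t_m = m. The oldest point gets weight sigma (or close to
# it in the exponential case) and the most recent point gets weight 1.

def linear_weights(m, sigma):
    t = np.arange(1, m + 1)
    # Eq. (1): weight sigma at t_1, rising linearly to 1 at t_m
    return (1 - sigma) / (m - 1) * t + (m * sigma - 1) / (m - 1)

def quadratic_weights(m, sigma):
    t = np.arange(1, m + 1)
    # Eq. (2): weight sigma at t_1, rising quadratically to 1 at t_m
    return (1 - sigma) * ((t - 1) / (m - 1)) ** 2 + sigma

def exponential_weights(m, sigma):
    t = np.arange(1, m + 1)
    # Eq. (3): sigmoid centered mid-series, ranging from 1/(1+e^sigma)
    # at t_1 up to 1/(1+e^-sigma) at t_m
    return 1.0 / (1.0 + np.exp(sigma - 2.0 * sigma * (t - 1) / (m - 1)))

print(linear_weights(5, 0.2))   # [0.2, 0.4, 0.6, 0.8, 1.0]
```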
However, as the wSVM was developed from SVM, it presents the user with similar problems. Schölkopf and Smola (2002) noted that the SVM bandwidth and the penalty parameter C, which determines
the trade-off between margin maximization and violation error minimization, represent an issue that requires attention and handling. Another point of concern is the setting of kernel parameters, such as gamma (γ) in the radial basis function, which must also be set properly to improve prediction accuracy. In addition, using wSVM requires users to set a further parameter, i.e., the weighting data parameter σ. Therefore, three different parameters must be optimized: the penalty parameter (C), the kernel parameter (i.e., γ, if the RBF kernel is employed), and σ. To overcome this challenge, an optimization technique (e.g., fmGA) may be used to identify the best parameters simultaneously [13].

2.3. Time series analysis

Time series analysis is a powerful data analysis technique with two specific goals. The first goal is to identify a suitable mathematical model for the data, and the second is to forecast future values in a series based on established patterns and, possibly, other relevant series and/or factors [16]. Over the past several decades, much has been written in the technical literature about linear prediction in time series analysis, covering such approaches as smoothing methods, the Box-Jenkins time series model and the autoregression model. Accurate and unbiased estimation of time series data cannot always be achieved by these linear techniques, as real world applications are generally not amenable to linear prediction [37]. Real world time series applications are fraught with the highly nonlinear, complex, dynamic and uncertain conditions of the field. Thus, estimation requires the development of more advanced time series prediction algorithms, such as those achieved using an AI approach. Refenes et al. [36] described structural change as a time series data characteristic that should always be taken into account in all methodological approaches to time series analysis. In light of this characteristic, Cao et al. [8] expressed that recent data provide more relevant information than distant data. Consequently, recent data should be assigned weights relatively greater than those assigned to earlier data. Cao et al. [8] and Khemchandani et al. [25] adopted this approach effectively by applying AI techniques such as SVMs and wSVMs in financial time series forecasting applications.

2.4. Fast messy genetic algorithm

The fast messy genetic algorithm (fmGA) is a genetic algorithm-based optimization tool able to find optimal solutions to large-scale permutation problems efficiently. Goldberg et al. [21] developed fmGA as an improvement on messy genetic algorithms (mGAs). Different from simple genetic algorithms (sGAs), which describe possible solutions using fixed-length strings, fmGA applies messy chromosomes to form strings of various lengths [17,19]. A messy chromosome is a collection of messy genes. A messy gene in fmGA is represented by the paired values "allele locus" and "allele value": the allele locus indicates the gene position and the allele value represents the value of the gene in that position. Consider the following two messy chromosomes: C1: ((1 0) (2 1) (3 1) (1 1)) and C2: ((3 1) (1 0)). Both represent valid strings of length three. As this example shows, messy chromosomes may have various lengths. Moreover, messy chromosomes may be either "over-specified" or "under-specified" in terms of encoding bit-wise strings. Chromosome C1 is an over-specified string, which has two different values in the gene 1 position.
To handle this over-specified chromosome, the string may be scanned from left to right following the first-come-first-served rule. Thus, C1 represents bit string 011. On the other hand, a competitive template is employed to evaluate an underspecified chromosome, such as C2. The competitive template is a problem-specific, fixed-bit string that is either generated randomly or found during the search process. As shown in Fig. 1, if the competitive template is 111, C2 represents bit string 011 by taking the allele value for the gene 2 position from the competitive template to fill in the missing gene (see the decoding sketch below the figure caption).
Fig. 1. Evaluation of an underspecified messy chromosome: messy chromosome (3 1)(1 0) with competitive template 111 yields bit string 011.
The fmGA contains two loop types, i.e., inner and outer. The process starts with the outer loop. Firstly, a competitive template, represented by a problem-specific, fixed-bit string, is generated randomly or found during the search process. Each outer loop cycle is one "era", which iterates over the order k of the processed building blocks (BBs). A building block is a set of genes, a subset of strings that are short, low-order and high-performance. With the start of each new era, the three-phase operations of the inner loop, comprising the initialization phase, the building block filtering (BBF) or primordial phase, and the juxtapositional phase, are invoked. In the initialization phase, an adequately large population contains all possible BBs of order k. fmGA performs the probabilistically complete initialization (PCI) process at this stage, which randomly generates n chromosomes and calculates their fitness values. There are two operations in the primordial phase, namely building-block filtering and threshold selection. In the primordial phase, 'bad' genes that do not belong to BBs are filtered out, so that, in the end, the resultant population encloses a high proportion of 'good' genes belonging to BBs. In the juxtapositional phase, operations are more similar to those of sGAs. The selection procedure for good genes (BBs) is used together with a cut-and-splice operator to form a high quality generation, which may contain the optimal solution. Operations in the next outer loop begin once those in the inner loop have finished. The competitive template is substituted with the best solution found so far, which becomes the new competitive template for the next era. The whole process is repeated until the maximum order kmax is reached. The fmGA can also run over "epochs". This term describes a complete process that starts from the first era and continues until kmax. The best solution found in one complete process is passed to succeeding epochs through the competitive template. Epochs can be performed as many times as desired. The algorithm terminates once a good-enough solution is obtained or no further improvement is made.

3. Evolutionary fuzzy support vector machine inference model for time series data

The evolutionary fuzzy support vector machine inference model for time series data (EFSIMT) is a hybrid AI system that fuses three different AI techniques, namely FL, wSVM and fmGA. Based on the FL paradigm, the EFSIMT allows computer systems to solve problems intelligently by imitating human reasoning to recommend decisions with a level of accuracy similar to that attained by experts. In this complementary system, FL deals with vagueness and approximate reasoning; wSVMs act as a supervised learning tool that handles fuzzy input-output mapping and focuses on time series data characteristics; and fmGA works to optimize the FL and wSVM parameters. The ability of FL to deal with vagueness and uncertainty depends heavily on the appropriate distribution of MFs, the number of rules and the selection of proper fuzzy set operations. FL parameter construction is not easy, as the parameters are problem-oriented and rely heavily on expert knowledge. wSVMs and fmGA were introduced to resolve such issues. The fuzzy inference engine and fuzzy rules of the conventional
FL system were replaced by wSVMs. However, the generalizability and predictive accuracy of wSVMs are determined by the searched problem parameters, including the optimal penalty parameter, kernel parameters and the lower bound of the weighted data parameter. To overcome this shortcoming, EFSIMT utilizes fmGA to search simultaneously for the optimum wSVM and FL parameters. Fig. 2 illustrates the EFSIMT architecture. Nine steps must be followed to establish the EFSIMT model, as explained below:

(1) Training data. The EFSIMT uses sequential data as training data. The appropriate factors need to be identified before input and output patterns can be collected. Subsequently, these patterns, representing training data, must be normalized to avoid greater numeric ranges dominating those with smaller numeric ranges and to help avoid numerical difficulties [22]. As inference results for new problems may be greater or smaller than the desired outputs distributed in the input patterns, the normalization method was revised: the maximum and minimum output parameters were enlarged by 10% [12]. The functions used to normalize the data are shown in Eqs. (4)-(6).

X_n = \frac{X_a - X_L}{X_U - X_L}   (4)

X_U = X_{\max} + X_{\text{range}}/10   (5)

X_L = X_{\min} - X_{\text{range}}/10   (6)

where X_n is the output parameter after normalization (ranging between 0 and 1), X_a the output parameter before normalization, X_U the upper bound of the output parameter, X_L the lower bound of the output parameter, X_max the maximum of the output parameter, X_min the minimum of the output parameter, and X_range the difference between the maximum and minimum.
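As an illustration of Eqs. (4)-(6), the sketch below normalizes an output vector with bounds enlarged by one tenth of the range (illustrative code, not the authors' implementation):

```python
import numpy as np

# Sketch of the enlarged-bound normalization of Eqs. (4)-(6): the bounds are
# widened by one tenth of the range so that inference results falling slightly
# outside the training range still map inside [0, 1].

def normalize(x):
    x = np.asarray(x, dtype=float)
    x_range = x.max() - x.min()          # X_range
    x_u = x.max() + x_range / 10.0       # Eq. (5): enlarged upper bound X_U
    x_l = x.min() - x_range / 10.0       # Eq. (6): enlarged lower bound X_L
    return (x - x_l) / (x_u - x_l)       # Eq. (4): X_n strictly inside (0, 1)

print(normalize([2.33, 40.0, 82.60]))    # strengths (MPa) -> ~[0.083, 0.474, 0.917]
```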
(2) Data weighting. For time series prediction problems, certain data points are more important to the training process and others less important, based on the nearness of their date to the present and their degree of noise corruption. To deal with such issues, the model applies a weight to each input point according to one of the three types of time functions shown in Eqs. (1)-(3). In doing so, different input points can make different contributions to the learning of the approximated function, which improves the ability of the SVM to diminish the effect of outliers and noisy data. Due to this weighting process, the last data point x_m is treated as most important and is thus assigned an s_m value of 1, while the first data point x_1 is treated as least important and given a weighting value equal to σ. In this step, the value of σ is generated randomly and encoded by fmGA. In this research, the LIBSVM library developed by Chang and Lin [10] was embedded into the EFSIMT model.

(3) Fuzzification. In this step, each normalized input attribute is converted into corresponding membership grades. This mapping of crisp quantities to fuzzy quantities is carried out by membership function (MF) sets generated and encoded by fmGA. This study used trapezoidal and triangular MF shapes (see Fig. 3) that, in general, may be developed by referencing summit points and widths [23]. The summit and width representation method (SWRM) was used in this study to encode complete MF sets (see Fig. 3(c)) [27]. Fig. 4 illustrates the fuzzification process.

(4) Weighted SVM training model. In this step, wSVMs developed based on SVMs are deployed to handle fuzzy input-output mapping (a small training sketch for steps (2) and (4) is given below).
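A minimal sketch of this training step, under the assumption that scikit-learn's SVR with per-sample weights stands in for the weighted LIBSVM variant embedded in the model; the data here are random placeholders, and the quadratic weights of Eq. (2) are computed inline:

```python
import numpy as np
from sklearn.svm import SVR

# Sketch of steps (2) and (4) (illustrative; the paper embeds LIBSVM, while
# this uses scikit-learn's SVR, whose fit() accepts per-sample weights).
rng = np.random.default_rng(0)
m = 100
X = rng.random((m, 8))                    # 8 fuzzified/normalized input attributes
y = rng.random(m)                         # normalized compressive strength
sigma = 0.1                               # lower bound of the weighting parameter
t = np.arange(1, m + 1)
s = (1 - sigma) * ((t - 1) / (m - 1)) ** 2 + sigma   # quadratic weights, Eq. (2)

model = SVR(kernel="rbf", C=1.0, gamma=1.0 / X.shape[1])
model.fit(X, y, sample_weight=s)          # older points contribute less
```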
Fig. 2. EFSIMT structure, showing the data and control flow linking the nine steps: (1) training data, (2) data weighting, (3) fuzzification, (4) weighted SVM training model, (5) defuzzification, (6) fmGA parameter search (supplying σ, the SVM parameters C and γ, the MFs and the defuzzification parameter), (7) fitness evaluation, (8) termination criteria, and (9) optimal prediction model.
Fuzzification process output, in the form of membership grades, acts as fuzzy input for the wSVMs. The wSVMs train on this dataset to obtain the prediction model, using penalty (C) and kernel (γ) parameters that are randomly generated and encoded by fmGA. This study used the RBF kernel as a reasonable first choice [22].

(5) Defuzzification. Once the wSVM has finished the training process, the output numbers are expressed in terms of fuzzy output, which must be converted into crisp numbers. Employing fmGA, the EFSIMT generates a random dfp substring and encodes it to convert the wSVM output. This evolutionary approach is simple and straightforward, as it uses dfp as a common denominator for the wSVM output.

(6) fmGA parameter search. fmGA is employed to search concurrently for the fittest shapes of the MFs, dfp, the penalty parameter C, the RBF kernel parameter γ and the lower bound of the weighted data parameter σ. As fmGA works based on the concept of genetic operations, chromosome design plays a central role in achieving objectives. The chromosome that represents a possible solution for the searched parameters consists of five parts, namely the MF substring, dfp substring, penalty parameter substring, kernel parameter substring and lower bound of weighting data substring. Every substring has a specific length that should fit within certain requirements corresponding to the searched parameter, including the length of the decimal point string and the upper and lower parameter bounds, among others. The chromosome, as the model variable in EFSIMT, is encoded into a binary string. Chromosomes consist of two segments, the FL segment and the weighted SVM segment. The FL segment contains the MF and dfp substrings. The weighted SVM segment contains the penalty parameter C, the kernel parameter γ for the RBF function and the lower bound of the weighted data parameter σ. Fig. 5 illustrates the chromosome structure. As mentioned above, MF substrings are encoded using the SWRM method, which defines the distribution of uneven MFs using their summits and widths (see Fig. 3(c)). In Fig. 3(a), the trapezoidal MF summits are sm1 and sm2, whereas the left and right widths are wd1 and wd2, respectively. A triangular MF may be regarded as a special trapezoidal MF case in which sm1 = sm2. A complete MF set includes two shoulders. Fig. 3(c) shows a complete trapezoidal MF set, consisting of five summit points (sm1, sm2, sm3, sm4, sm5) and four widths (wd1, wd2, wd3, wd4). Applying the SWRM method, the required length RL^{MF} of the MF binary substring may be defined as follows:
RL^{MF} = rn^{cMF} \left( n^{sm}\, rl^{sm} + n^{wd}\, rl^{wd} \right)   (7)
where rn^{cMF} represents the required number of complete MF sets, n^{sm} the number of summits in a complete MF set, rl^{sm} the required length for a summit depending on demand, n^{wd} the number of widths in one complete MF set, and rl^{wd} the required demand-dependent width.
Fig. 3. Membership functions: (a) trapezoidal, defined by summits sm1, sm2 and widths wd1, wd2; (b) triangular, the special case sm1 = sm2; (c) a complete MF set over [X_lb, X_ub] with summit points sm1-sm5 and widths wd1-wd4 [27].
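For illustration, one trapezoidal MF in such a set can be evaluated from its summit points and widths as follows (a sketch with assumed values; a triangular MF is the special case sm1 = sm2):

```python
# Sketch of a trapezoidal membership function defined by two summit points
# and two widths, as in the SWRM encoding of Fig. 3 (illustrative values,
# not from the paper).
def trapezoid(x, sm1, sm2, wd1, wd2):
    if sm1 - wd1 < x < sm1:
        return (x - (sm1 - wd1)) / wd1     # rising edge over width wd1
    if sm1 <= x <= sm2:
        return 1.0                         # plateau between the summits
    if sm2 < x < sm2 + wd2:
        return ((sm2 + wd2) - x) / wd2     # falling edge over width wd2
    return 0.0

# Membership grade of a normalized input on the rising edge of one MF
print(trapezoid(0.25, sm1=0.3, sm2=0.5, wd1=0.2, wd2=0.2))  # 0.75
```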
Fig. 4. Fuzzification process: each scaled input pattern X^{sca}_{ij} is converted by the MF sets into membership grades x^{mg_k}_{ij}, where i indexes the cases, j the input patterns, and k the membership functions in one complete membership set.
Considering whether each input variable uses a common or an individual complete MF set to fuzzify the crisp input data, rn^{cMF} is obtained as follows:

rn^{cMF} = \begin{cases} 1, & \text{if all input variables use a common complete MF set} \\ n^{iv}, & \text{if each input variable uses its individual complete MF set} \end{cases}   (8)

where n^{iv} represents the number of input variables. dfp is a number searched by fmGA that converts fuzzy output from the inference engine into crisp output. The required length rl_x of the dfp binary substring may be defined by adapting the variable mapping function of Gen and Cheng [20] over the domain [lb^x, ub^x], as follows:

2^{rl_x - 1} < (ub^x - lb^x) \times 10^{rp} \le 2^{rl_x} - 1   (9)

where rp is the required number of places after the decimal point, and lb^x and ub^x represent the lower and upper bound values of variable x. For the weighted SVM parameter segment (containing the penalty parameter C, gamma γ and sigma σ substrings), the required length of each binary C, γ and σ substring is also computed using Eq. (9). Table 1 summarizes the parameter settings and numbers of bits required for chromosome design.

(7) Fitness evaluation. Every chromosome that represents MFs, dfp, C, γ and σ is encoded and used to train the dataset. Model accuracy is obtained once a prediction model of the training dataset is obtained. Each chromosome is further evaluated using a fitness function designed to measure model accuracy and the fitness of the generalization properties [27]. This function describes the fittest shape of the MFs, the optimized dfp number and the weighted SVM parameters. The fitness function integrates model accuracy and model complexity, as expressed in Eq. (10):

f = \frac{1}{c^{aw} s^{er} + c^{cw}\, mc}   (10)

where c^{aw} represents the accuracy weighting coefficient, s^{er} the prediction error between actual output and desired output, c^{cw} the complexity weighting coefficient, and mc the model complexity, which can be assessed simply by counting the number of support vectors.

(8) Termination criteria. The process is terminated when the termination criteria are satisfied; otherwise, the model proceeds to the next generation. As the EFSIMT uses fmGA, the termination criterion used here is either the number of eras (k) or the number of epochs (e). The loop continues while the specified criteria are not met.

(9) Optimal prediction model. The loop stops once the termination criterion is fulfilled, i.e., the prediction model has identified the input/output mapping relationship incorporating the optimal MF, C, γ, σ and dfp parameters.

Fig. 5. EFSIMT chromosome structure: an FL segment (MF substrings MF1, ..., MFn, each consisting of summit points sm1, ..., sm5 and widths wd1, ..., wd4, plus the dfp substring) followed by a weighted SVM segment (penalty parameter C, RBF kernel parameter γ, and the lower bound of the weighting data parameter σ).
Table 1
Summary of EFSIMT parameter settings.

Parameter                   Upper bound   Lower bound   Number of bits
MF set                      -             -             27 a
C                           200           0             5
γ                           1             0.0001        10
σ (linear and quadratic)    1             0.1           10
σ (exponential)             20            0.05          10
dfp                         1             0.5           9

a Number of bits required for one complete MF set.
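As a worked check of Eqs. (7) and (9) against Table 1, the sketch below computes required substring lengths; the rp values are assumptions chosen to reproduce the table, not values stated in the paper:

```python
import math

# Sketch of the bit-length rule of Eq. (9): the smallest length rl such that
# (ub - lb) * 10**rp <= 2**rl - 1, where rp is the required number of decimal
# places (the rp values below are assumptions, not stated in the paper).

def required_bits(lb, ub, rp):
    span = (ub - lb) * 10 ** rp
    return math.ceil(math.log2(span + 1))

print(required_bits(0.0001, 1, 3))   # gamma: 10 bits, matching Table 1
print(required_bits(0.5, 1, 3))      # dfp: 9 bits, matching Table 1
# Eq. (7) check: one complete MF set with 5 summits and 4 widths at 3 bits
# each gives 1 * (5*3 + 4*3) = 27 bits, matching Table 1's MF-set entry.
```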
Table 2
HPC database: input and output variables.

Input/output variables          Unit     Minimum   Maximum
Cement                          kg/m3    102.00    540.00
Blast furnace slag              kg/m3    0.00      359.40
Fly ash                         kg/m3    0.00      200.10
Water                           kg/m3    121.75    247.00
Superplasticizer                kg/m3    0.00      32.20
Coarse aggregate                kg/m3    801.00    1145.00
Fine aggregate                  kg/m3    594.00    992.60
Age of testing                  day      1.00      365.00
Concrete compressive strength   MPa      2.33      82.60

Table 4
Comparison of results among SVMs, BPN and EFSIMT.

Dataset        Measure      SVMs     BPN     EFSIMT
                                             Linear   Quadratic   Exponential
Training set   r            0.850    0.951   0.954    0.954       0.951
               MAE (MPa)    7.122    3.869   4.184    4.189       4.235
               RMSE (MPa)   8.854    5.094   5.120    5.126       5.152
               R2           0.722    0.904   0.909    0.910       0.902
Testing set    r            0.867    0.935   0.957    0.961       0.963
               MAE (MPa)    8.116    5.238   4.781    4.121       4.410
               RMSE (MPa)   10.401   6.902   5.865    5.378       5.430
               R2           0.752    0.873   0.916    0.923       0.927
4. EFSIMT for predicting HPC compressive strength

This section verifies and validates the performance of the hybrid EFSIMT system in predicting HPC compressive strength. The model proposed herein predicts the compressive strength of HPC using an experimental database originally collected by Yeh [40] from various university research labs and posted to the University of California, Irvine machine learning repository website. The database includes a total of 1030 concrete samples and covers 9 attributes, 8 of which are quantitative input variables and 1 of which is an output variable. Each instance includes the amounts of cement, fly ash, blast furnace slag, water, superplasticizer, coarse aggregate and fine aggregate, the age of testing and the compressive strength (in MPa). Table 2 shows the general details of the nine attributes used in this study. However, the database often contains unexpected inaccuracies [24]; for instance, the class of fly ash may not be indicated. Another problem relates to the superplasticizer, a chemical admixture produced by different manufacturers and thus potentially of different chemical composition [15,41]. Moreover, Chou et al. [15] identified that such inaccuracies make it difficult even to classify the compressive strength into a specific class, such as high or low. EFSIMT employs FL to manage environments characterized by uncertainty, vagueness, presumptions and subjectivity. This capability is suited to the characteristics of the HPC database. As pointed out by Kasperkiewicz et al. [24], HPC databases often contain inaccuracies due to mixing proportions, mixing techniques and ingredient characteristics (e.g., varying degrees of fineness, classes of fly ash, and types of superplasticizer). This makes prediction of HPC compressive strength a highly uncertain task. Moreover, EFSIMT is also able to deal with the time series characteristics inherent to HPC databases (e.g., compressive strength measured at 14 different testing ages ranging from 1 day to 365 days, as shown in Table 3). To develop the HPC compressive strength prediction system, the 1030 samples were divided randomly into training and testing sets.
90%, or 927 samples, were assigned to the training set and the remaining 10%, or 103 samples, to the testing set. As the EFSIMT was to be compared against SVM and BPN result accuracies, the SVM and BPN parameter setting procedures followed previous researchers' suggestions and settings. In this study, as suggested by Hsu et al. [22], the SVM parameters C and γ were set to 1 and 1/k respectively, with k representing the number of input patterns. The parameter settings for BPN followed Yeh [40] and Yeh and Lien [41], with the network architecture set to 1 hidden layer containing 8 hidden units, and the learning parameters set to 1.0 for the learning rate and 0.5 for the momentum factor. This study employed four performance measures, namely root mean square error (RMSE), correlation coefficient (r), coefficient of determination (R2) and mean absolute error (MAE), to verify and validate the accuracy of the proposed system and the other AI models (a small sketch of these measures follows below). Table 4 shows the RMSE, r, R2, and MAE results of the proposed EFSIMT system (linear, quadratic and exponential time series functions) compared against the other AI systems (SVM and BPN). Based on the four different evaluation methods, for both the training and testing datasets, SVMs provided the least satisfactory results. In comparing BPN and EFSIMT (linear, quadratic and exponential time series functions) based on RMSE and MAE, BPN performed slightly better than EFSIMT, but only on the training data (not on the testing data set). However, in terms of the correlation coefficient (r) and the coefficient of determination (R2) for the training data set, EFSIMT is comparable to BPN. Fig. 6 presents scatter diagrams of SVMs, BPN and EFSIMT (linear, quadratic and exponential time series functions) for the training data set. Better results were achieved by EFSIMT in predicting the testing dataset, which shows that the EFSIMT training data learning process provides a prediction model superior to BPN. This confirms that EFSIMT (linear, quadratic and exponential time series functions) delivers comparable or higher performance than BPN. This better learning ability demonstrates the ability of EFSIMT to cope with the uncertain characteristics inherent in HPC databases.
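A small sketch of the four measures (an illustrative helper, not the authors' code):

```python
import numpy as np

# Sketch of the four performance measures reported in Table 4.
def evaluate(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))              # root mean square error
    mae = np.mean(np.abs(err))                     # mean absolute error
    r = np.corrcoef(y_true, y_pred)[0, 1]          # correlation coefficient
    # coefficient of determination
    r2 = 1 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"RMSE": rmse, "MAE": mae, "r": r, "R2": r2}
```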
Table 3
HPC database examples.

Cement    Blast furnace   Fly ash   Water     Superplasticizer   Coarse aggregate   Fine aggregate   Age of testing   Compressive strength
(kg/m3)   slag (kg/m3)    (kg/m3)   (kg/m3)   (kg/m3)            (kg/m3)            (kg/m3)          (day)            (MPa)
540.0     0.0             0.0       162.0     2.5                1040.0             676.0            28               79.99
540.0     0.0             0.0       162.0     2.5                1055.0             676.0            28               61.89
332.5     142.5           0.0       228.0     0.0                932.0              594.0            270              40.27
332.5     142.5           0.0       228.0     0.0                932.0              594.0            365              41.05
198.6     132.4           0.0       192.0     0.0                978.4              825.5            360              44.30
168.0     42.1            163.8     121.8     5.7                1058.7             780.1            14               17.82
168.0     42.1            163.8     121.8     5.7                1058.7             780.1            28               24.24
190.0     190.0           0.0       228.0     0.0                932.0              670.0            28               40.86
485.0     0.0             0.0       146.0     0.0                1120.0             800.0            28               71.99
374.0     189.2           0.0       170.1     10.1               926.1              756.7            3                34.40
313.3     262.2           0.0       175.5     8.6                1046.9             611.8            3                28.80
425.0     106.3           0.0       153.5     16.5               852.1              887.1            3                33.40
425.0     106.3           0.0       151.4     18.6               936.0              803.7            3                36.30
375.0     93.8            0.0       126.6     23.4               852.1              992.6            3                29.00
Fig. 6. Scatter diagrams of actual vs. predicted output (MPa) on the training data set for SVMs (R2 = 0.7217), BPN (R2 = 0.9038), linear EFSIMT (R2 = 0.9088), quadratic EFSIMT (R2 = 0.9096) and exponential EFSIMT (R2 = 0.9023).
Moreover, as EFSIMT employs wSVM, the proposed model is also able to map the complex relationship between input and output variables as well as manage the time series characteristics inherent to HPC databases. While EFSIMT employed three different time series functions (linear, quadratic and exponential) to weight data points, one preferable time series function should be chosen based on the performance achieved by each function on both the training and testing datasets. As shown in Table 4, the EFSIMT using the quadratic function generally provides slightly better performance, especially on the testing data set, than the EFSIMT using the linear and exponential time series functions. However, it should be noted that the differences in performance between the three time functions were not significant. This shows that there remains room for improvement in finding a better time series function for predicting HPC compressive strength. The proposed model, EFSIMT, offers the potential to predict HPC compressive strength. Practitioners can obtain an early, applicable and reliable prediction of concrete compressive strength for pre-design and quality control, as waiting 28 days for the 28-day compressive strength, or longer for later-age compressive strength, is time-consuming. In accordance with Zain and Abd [42] and Chou et al. [14], rapid prediction would enable the adjustment of mix proportions to avoid situations where the concrete does not reach the required compressive strength, which would save time and construction costs.

5. Conclusion

This paper proposed EFSIMT as a hybrid AI system to predict HPC compressive strength, a mechanical property critical to measuring HPC quality. EFSIMT was developed by fusing FL, wSVMs and fmGA. FL was used to address uncertainties inherent in HPC; wSVMs addressed the complex relationships involved in fuzzy input-output mapping and the variations of time series data in the HPC database (e.g., compressive strength) with regard to testing age; and fmGA was used as an optimization tool to handle the FL and wSVM search parameters. In comparison with SVMs, the accuracy of the proposed EFSIMT was significantly better on four different evaluation measurements. In comparison with BPN, the proposed method achieved comparable results on the training dataset, whereas on the testing dataset EFSIMT performed better than BPN. These results demonstrate the superior ability of EFSIMT to manage 1) the time series data characteristics inherent in HPC experimental data, 2) the complex relationships between input and output variables, and 3) the uncertainties inherent in HPC databases. Therefore, EFSIMT offers strong potential as a predictive tool for HPC compressive strength.

Acknowledgments

The authors would like to thank Professor I-Cheng Yeh for providing the HPC database.

References

[1] ACI 363R-92, State-of-the-Art Report on High-Strength Concrete, ACI Manual of Concrete Practice, Part I, American Concrete Institute, 1993.
[2] P.C. Aïtcin, High Performance Concrete, E&FN SPON, New York, 1998.
[3] G. Akhras, H.C. Foo, A knowledge-based system for selecting proportions for normal concrete, Expert Systems with Applications 7 (2) (1994) 323–335.
[4] S. Akkurt, S. Ozdemir, G. Tayfur, B. Akyol, The use of GA-ANNs in the modeling of compressive strength of cement mortar, Cement and Concrete Research 33 (2003) 973–979.
[5] B.H. Bharatkumar, R. Narayanan, B.K. Raghuprasad, D.S.
Ramachandramurthy, Mix proportioning of high performance concrete, Cement and Concrete Composites 23 (2001) 71–80.
[6] G. Bojadziev, M. Bojadziev, Fuzzy Logic for Business, Finance, and Management, 2nd ed., World Scientific, Singapore, 2007.
[7] C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2 (1998) 121–167.
[8] L.J. Cao, K.S. Chua, L.K. Guan, Ascending support vector machines for financial time series forecasting, in: Proceedings of the 2003 International Conference on Computational Intelligence for Financial Engineering (CIFEr 2003), Hong Kong, 2003, pp. 317–323.
[9] A.P.C. Chan, D.W.M. Chan, J.F.Y. Yeung, An overview of the application of fuzzy techniques in construction management research, Journal of Construction Engineering and Management 135 (11) (2009) 1241–1252.
[10] C.C. Chang, C.J. Lin, LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[11] T.P. Chang, F.C. Chung, H.C. Lin, A mix proportioning methodology for high-performance concrete, Journal of the Chinese Institute of Engineers 19 (6) (1996) 646–655.
[12] M.Y. Cheng, H.S. Peng, Y.W. Wu, T.L. Chen, Estimate at completion for construction projects using evolutionary support vector machine inference model, Automation in Construction 19 (2010) 619–629.
[13] M.Y. Cheng, Y.W. Wu, Evolutionary support vector machine inference system for construction management, Automation in Construction 18 (5) (2009) 597–604.
[14] J.S. Chou, C.K. Chiu, M. Farfoura, I. Al-Taharwa, Optimizing the prediction accuracy of concrete strength based on comparison of data-mining techniques, Journal of Computing in Civil Engineering 25 (3) (2011) 242–252.
[15] J.S. Chou, C.F. Tsai, Concrete compressive strength analysis using a combined classification and regression technique, Automation in Construction 24 (2012) 52–60.
[16] J.D. Cryer, K.S. Chan, Time Series Analysis with Applications in R, 2nd ed., Springer, New York, 2008.
[17] K. Deb, D.E. Goldberg, mGA in C: A Messy Genetic Algorithm in C, IlliGAL Technical Report 91008, Department of General Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, 1991.
[18] H. Fan, K. Ramamohanarao, A weighting scheme based on emerging patterns for weighted support vector machines, in: Proceedings of the IEEE International Conference on Granular Computing, vol. 2, 2005, pp. 435–440.
[19] C.W. Feng, H.T. Wu, Integrated fmGA and CYCLONE to optimize schedule of dispatching RMC trucks, Automation in Construction 15 (2006) 186–199.
[20] M. Gen, R. Cheng, Genetic Algorithms and Engineering Design, John Wiley & Sons, New York, 1997.
[21] D.E. Goldberg, K. Deb, H. Kargupta, G. Harik, Rapid, accurate optimization of difficult problems using fast messy genetic algorithms, in: Proceedings of the Fifth International Conference on Genetic Algorithms, 1993, pp. 56–64.
[22] C.W. Hsu, C.C. Chang, C.J. Lin, A Practical Guide to Support Vector Classification, Technical Report, Department of Computer Science, National Taiwan University, Taipei, Taiwan, 2003.
[23] H. Ishigami, T. Fukuda, T. Shibata, F. Arai, Structure optimization of fuzzy neural network by genetic algorithm, Fuzzy Sets and Systems 71 (1995) 257–264.
[24] J. Kasperkiewicz, J. Racz, A. Dubrawski, HPC strength prediction using artificial neural network, Journal of Computing in Civil Engineering 9 (4) (1995) 279–284.
[25] R. Khemchandani, Jayadeva, S. Chandra, Regularized least square fuzzy support vector regression for financial time series forecasting, Expert Systems with Applications 36 (1) (2009) 132–138.
[26] G.J. Klir, B. Yuan, Fuzzy Sets and Fuzzy Logic: Theory and Applications, Prentice Hall PTR, Upper Saddle River, New Jersey, 1995.
[27] C.H. Ko, Evolutionary Fuzzy Neural Inference Model (EFNIM) for Decision-Making in Construction Management, Ph.D.
Thesis, Department of Construction Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, 2002.
[28] A.I. Laskar, S. Talukdar, A new mix design method for high performance concrete, Asian Journal of Civil Engineering (Building and Housing) 9 (1) (2008) 15–23.
[29] C.H. Lim, Y.S. Yoon, J.H. Kim, Genetic algorithm in mix proportioning of high-performance concrete, Cement and Concrete Research 34 (2004) 409–420.
[30] C.F. Lin, S.D. Wang, Fuzzy support vector machines, IEEE Transactions on Neural Networks 13 (2) (2002) 464–471.
[31] C. Martinez, O. Castillo, O. Montiel, Comparison between ant colony and genetic algorithms for fuzzy system optimization, in: Castillo, et al. (Eds.), Soft Computing for Hybrid Intelligent Systems, Springer-Verlag, Berlin Heidelberg, 2008, pp. 71–86.
[32] P.K. Mehta, P.C. Aitcin, Microstructural basis of selection of materials and mix proportion for high strength concrete, in: Proceedings of the 2nd International Symposium on High Strength Concrete, Detroit, American Concrete Institute, 1990, pp. 265–286.
[33] A.M. Neville, Properties of Concrete, 4th ed., Longman Group Limited, Essex, England, 1995.
[34] H.G. Ni, J.Z. Wang, Prediction of compressive strength of concrete by neural networks, Cement and Concrete Research 30 (2000) 1245–1250.
[35] R. Parichatprecha, P. Nimityongskul, Analysis of durability of high performance concrete using artificial neural networks, Construction and Building Materials 23 (2009) 910–917.
[36] A. Refenes, N. Refenes, Y. Bentz, D.W. Bunn, A.N. Burgess, A.D. Zapranis, Financial time series modeling with discounted least squares back-propagation, Neurocomputing 14 (1997) 123–138.
[37] N.I. Sapankevych, R. Sankar, Time series prediction using support vector machines: a survey, IEEE Computational Intelligence Magazine 4 (2) (2009) 24–38.
[38] I.B. Topçu, M. Saridemir, Prediction of compressive strength of concrete containing fly ash using artificial neural networks and fuzzy logic, Computational Materials Science 41 (2008) 305–311.
[39] V.N. Vapnik, The Nature of Statistical Learning Theory, 2nd ed., Springer-Verlag, New York, 1995.
[40] I.C. Yeh, Modelling of strength of high-performance concrete using artificial neural networks, Cement and Concrete Research 28 (12) (1998) 1797–1808.
[41] I.C. Yeh, L.C. Lien, Knowledge discovery of concrete material using genetic operation trees, Expert Systems with Applications 36 (2009) 5807–5812.
[42] M.F.M. Zain, S.M. Abd, Multiple regression model for compressive strength prediction of high performance concrete, Journal of Applied Sciences 9 (1) (2009) 155–160.
[43] M.H.F. Zarandi, I.B. Turksen, J. Sobhani, A.A. Ramezanianpour, Fuzzy polynomial neural networks for approximation of the compressive strength of concrete, Applied Soft Computing 8 (2008) 488–498.