Expert Systems with Applications 38 (2011) 10094–10098
Contents lists available at ScienceDirect
Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa
Obtaining industrial experimental designs using a heuristic technique M.C. Alonso a, A. Bousbaine b, J. Llovet a, J.A. Malpica a,⇑ a b
Mathematics Department, University of Alcala, Madrid, Spain Nestlé Research Center, Lausanne, Switzerland
a r t i c l e
i n f o
Keywords: Fractional factorial Experimental designs Heuristic techniques Expert systems
a b s t r a c t There are several methods of obtaining orthogonal fractions of symmetrical and asymmetrical factorials, including Hadamard matrices, collapsing and replacing, and orthogonal arrays. The objective is to find the right permutations of levels of each factor in order to obtain uncorrelated main effects with a minimum number of runs. This leads to a combinatorial optimization problem. Simulated annealing is an optimization technique that has been successfully used to find a good solution for combinatorial optimization problems. By applying the computational technique of simulated annealing to orthogonal fractional factorial designs, we provide a method to obtain uncorrelated main effects for symmetrical and asymmetrical factorials. Ó 2011 Elsevier Ltd. All rights reserved.
1. Introduction The system representing an experimental design is shown in the model in Fig. 1. Some of these inputs are controllable while others are not. The controllable inputs are called factors xi. Changes are made on these controllable input factors in order to ascertain the reasons for changes that may be observed in the output response. The objective is to determine the influence of the xi on the response y. The system represented in Fig. 1 can be thought of as a black box where different inputs yield different outputs. Since it is impossible, or very difficult, to find the exact functional relationship between the xi and the response y, an experiment can be conducted to determine how the xi influence the response y. It is desirable to design an experiment that explains this process in a most comprehensive way. Factorial experiments are crucial in many scientific fields (Montgomery, 2005). As the number of factors increases, the number of runs required for a complete replica of the design grows exponentially. It might not be possible to run every factor combination in many experimental situations, whether for economic or other reasons. When the number of factors is large, it is reasonable to assume that the higher-order interactions are negligible. This reduces the number of effects to those that have real interest for the investigator, and makes it possible to use fewer runs than in a complete factorial experiment. Usually a fraction of the full factorial is used. This is called a fractional factorial design. The key issue is to
⇑ Corresponding author. Tel.: +34 18856761. E-mail address:
[email protected] (J.A. Malpica). 0957-4174/$ - see front matter Ó 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2011.02.004
choose an appropriate fraction that verifies the desired properties, especially the orthogonal property. 2. Orthogonal main-effect designs The usual notation for characterizing orthogonal fractional fack torial is sk11 ; sk22 ; . . . ; shh ðnÞ where n is the number of runs, si is the number of levels of the factors, and ki is the number of factors with si levels. Let the matrix d of dimension n p be built by factors as columns, and runs as rows, with p = k1 + k2 + + kh. In experimental design literature, d is known as the design matrix. The process shown in Fig. 1 is approximated by a polynomial function and represents a good description of the relationship between the experimental variable and the responses within a limited experimental domain. It is well known that the multiple lineal regression model which represents an experimental design is
y ¼ l þ X1 b þ e;
ð1Þ
where l, the general mean; y, n 1 response values; b, p 1 vector of all main effects coefficients; X1, n p matrix of contrast coefficients for the vector of main effects; and e, n 1 vector of independent random errors. Note that p is the number of factors, n the number of runs and x X1 = (x1, . . . , xp). Let X = kxx11 k ; . . . ; kxpp k be the normalized matrix. Therefore X0 X is the correlation matrix, where X0 stands for the transpose matrix of X. The properties of the X0 X matrix give a good indication of the quality of the design. If the X0 X matrix is diagonal then the design will be advantageous, due not only to the computational simplicity but also to the fact that the estimators of all the regression coefficients are uncorrelated.
10095
M.C. Alonso et al. / Expert Systems with Applications 38 (2011) 10094–10098
Inputs x1 x2 x. 3 ..
Output Process
y Response
xp Factors Fig. 1. Model of an experimental design.
In practice, it may be relatively easy to do this. As a consequence of matrix algebra, the off-diagonal elements of X0 X are the sums of cross-products of the columns in X. Therefore, to achieve uncorrelated coefficients the dot product of the columns of X must be equal to zero; i.e. these columns must be orthogonal. Experimental designs that have this property for fitting a regression model are called orthogonal designs. Thus, an orthogonal design is a fractional factorial design which allows the estimation of all relevant effects with no correlation. Designs with all factors having the same number of levels are called symmetrical designs. The designs considered in our study are symmetrical and asymmetrical fractional factorials, but only those where all interactions are assumed to have negligible magnitudes are the ones dealt with in this paper. These designs are known in experimental design terminology as orthogonal resolution III designs. 3. Criteria for orthogonal fractions There are various methods used to calculate orthogonal fractions of symmetrical and asymmetrical factorial designs based on Galois fields, finite Euclidean and projective Geometry, Hadamard matrices, etc. But none of these is universal enough for building any experimental design. The objective is to find the right permutations of levels in each factor to obtain uncorrelated main effects. Even for a design with few factors corresponding to a limited number of levels, each new factor and new level produce a large quantity of permutations. 3.1. Optimality criteria When orthogonal designs are not possible because of excessive runs and restricted budgets, it would be desirable to obtain a design as close as possible to an orthogonal one, with just a few runs. Such designs are called nearly-orthogonal, and several criteria have been put forward in the literature. Following we give a brief review. Let D be the determinant of X0 X. It is well known that for any design D 6 1, and is equal to one if and only if the original design is orthogonal. A design is D-optimal if it maximizes D. Other criteria consist in maximizing tr(X0 X) and similar operators. Several criteria have been use: A-optimality, G-optimality, etc. (Heredia-Langner, Carlyle, Montgomery, Borror, & Runger, 2003). These are popularly known as the alphabetically-optimal designs and are often generated by an algorithm implemented using a computer. Generally, most of these alphabetically-optimal design follow the rule of favouring the design that has high values in the diagonal, and small in out-diagonal elements, i.e. matrices X0 X as close as possible to diagonal, which represents an orthogonal design. The alphabetically-optimal designs work with the model matrix X0 X. Xu (2002) proposed a criterion called J2, which works with the design matrix instead of the model matrix and is simple to program. This author applies it for the construction of nearly-orthogonal designs. We propose a criterion based on Addelman
Table 1 Design 32 22 (9) obtained using simulated annealing (SA). 1
2
3
4
1 0 2 0 2 1 2 1 0
1 0 0 2 2 0 1 2 1
1 1 0 1 1 1 1 0 0
0 1 0 0 1 1 1 1 1
0 1 2
1
–
2
0 1 1 1
1 1 1 1
2 1 1 1
1
–
3
0 1 2
0 1 1 1
1 2 2 2
(a)
(b)
2
–
3
0 1 2
0 1 1 1
1 2 2 2
(d)
–
4
0 1 2
0 1 1 1
1 2 2 2
(c)
2
–
4
0 1 2
0 1 1 1
1 2 2 2
(e)
1
3
–
4
0 1
0 1 2
1 2 4
(f)
frequencies (see below) that works also with the design matrix; this provides an efficient algorithm for constructing orthogonal designs. Recently, more criteria have been given for saturated and supersaturated designs. A supersaturated design cannot be an orthogonal design, so it is necessary to define measures of departure from orthogonality. Yamada and Lin (2002) obtained asymmetrical supersaturated designs that maximize the value of a v2 statistics, defined on a measure of dependency between two columns of a design. They applied it to the hypothesis test in a two-way contingency table. Fang, Lin, and Liu (2003) proposed a new criterion, called the E(fNOD) criterion, also for asymmetrical supersaturated designs, while Georgiou, Koukouvinos, and Mantas (2006) provided some E(fNOD) optimal supersaturated designs. 3.2. Condition of proportional frequencies According to Addelman (1962), a necessary and sufficient condition for the estimates of the main effects of any two factors be uncorrelated is that the levels of one factor occur with each of the levels of the other factor with proportional frequencies. Let xA and xB be the two factors, with r and s the levels, respectively. The estimates of the components of factors xA and xB are orthogonal to each other and to the mean if
nij ¼ ni nj =n;
ð2Þ
where ni, the number of times the i level of factor xA occurs in the design; nj, the number of times the j level of factor xB occurs in the design; n, the number of runs in the design, and nij, the number of times the i level of factor xA occurs with the j level of factor xB. 3.3. Example The design shown in Table 1 is one with an optimal solution for the experiment 32 22 (9). Since there are four factors, there will be six frequency charts. One chart for the first factor with the second factor, another chart for the second with the third, and so on, with 4 combinations in total. The combinations can be seen in the 2 charts in Table 1.
10096
M.C. Alonso et al. / Expert Systems with Applications 38 (2011) 10094–10098
The first chart, (a), represents the frequency of levels between the first factor with three levels and the second factor with three levels (1-2). The charts (b), (c), (d) and (e), represent the frequency between a factor with three levels and a factor with two levels (13, 1-4, 2-3, 2-4). Finally, chart (f) represents the frequency between the two factors with two levels each (3-4). We can see how the condition of proportional frequencies is verified. In this paper, we choose to apply this criterion in order to construct orthogonal design. 4. Symmetrical and asymmetrical designs construction algorithms To construct efficient asymmetrical designs, row and column exchange algorithms have been used; see Nguyen and Miller (1992) for a review. Actually, Xu (2002) proposed a derivation of a column exchange algorithm for minimizing the J2 criterion. Several heuristic techniques have been successfully applied to the design of experiments. For alphabetically-optimal designs, Haines (1987) applied simulated annealing (SA) to the construction of exact D-, I-, and G-optimal designs; genetic algorithms have been applied by Heredia et al. (2003) to generate D-efficient designs. Recently, Guo, Simpson, and Pignatiello (2007) used a genetic algorithm to construct nearly orthogonal designs with a new criterion based on a combination of the balance property and of J2 criterion. Using this criterion, they relax the requirement of balance and orthogonally, and hence allowing a reduced number of experimental runs in practice. The work presented in this paper deals with the application of SA to fractional factorial designs using the Addelman proportional frequencies criterion in order to obtain orthogonal designs. A brief introduction to SA is given in the next section, and the application of SA to design construction is given in Section 4.2. 4.1. Simulated annealing It is known that there are many combinatorial optimization problems for which no algorithm for finding an optimum, and an exhaustive computer search would require prohibitive amounts of computational time. SA is an optimization technique that has been successfully used on various occasions to find an acceptable solution for combinatorial optimization problems. Problems that require no polynomial execution time are known as NP. Since there is no optimization algorithm for NP problems, we therefore cannot find a global optimal solution. Consequently, it seems practical to devise some ‘‘acceptable’’ approximative solution in an admissible amount of computational time. These approximation algorithms are also called heuristics, and basically, this is what SA is about. SA was independently presented by Kirkpatrick, Gelatt, and Vecchi (1982) and Cerny (1984). The name simulated annealing derives from the analogy of the method to thermodynamics, with the way that metals anneal. A body in the state of thermal equilibrium has its energy distributed according to the Boltzmann’s distribution probability equation, also called Gibbs distribution (Feynman, Leighton, & Sands, 1963),
1 expðEi =kTÞ; PðE ¼ Ei Þ ¼ ZðTÞ
ð3Þ
where Ei is the energy of the system for the configuration i; T, is the temperature of the system, and k, is a known constant in statistical mechanics referred to as Boltzmann’s constant.
ZðTÞ ¼
X
expðEi =kTÞ;
ð4Þ
i
where Z(T) is a normalization factor. The summation runs over all possible configurations.
Metropolis was the first to put this idea into a computational form Metropolis, Rosenbluth, Rosenbluth, Teller, and Teller (1953). To carry out calculations, they proposed a Monte Carlo method, which generates sequences of energy states for a given body. Implementation of this algorithm is quite simple: a system will change from state i, with energy E1, to state j, with energy E2, when (E2–E1) is negative (i.e. if the new state has a lower energy). If (E2–E1) is positive, then the probability of shifting to new state j is given by exp (-(E2-E1)/kT). This strategy, based on always taking into consideration the state of lower energy together with an occasional state of higher energy, became known as the Metropolis algorithm. 4.2. Elements to obtain designs with SA Each experimental design can be considered as a system about to be optimized by the SA technique, where each design is a configuration of the system. For example, a system configuration would be the experiment 32 22 in 9 runs, as previously discussed. If we take at random the levels of a design for this experiment, we will get a configuration for the system, shown in the design example in Table 2. We built the corresponding frequency charts from the last design. As can be observed in charts (a), (c), (e) and (f) in Table 2, the condition of proportional frequencies is not satisfied; therefore this design is not adequate to estimate uncorrelated main effects. Following, we will show how to obtain a design that verifies proportional frequencies from the initial design where combinations of levels have been randomly obtained. From the different ways of defining frequencies for each factor level, the user needs to choose the right scheme, for instance, the one given in charts (a)–(f) in Table 1. An energy or cost function that indicates how far off a design or configuration is from the chosen one that attains proportional frequencies condition is established. This energy is calculated subtracting each position of the actual frequency charts (Table 2, charts (a)–(f)) from the corresponding scheme that satisfied the frequencies condition (Table 1(a)–(f)). The absolute values of the differences are added to obtain the desired energy function. For instance, in our example the energy function will be 16, which can be easily computed from the frequency charts.
Table 2 Design for the experiment 32 22 (9) obtained at random. 1
2
3
4
0 2 1 0 2 0 2 1 1
0 1 2 2 0 2 1 1 0
1 1 1 0 0 1 1 0 1
1 1 0 1 1 0 1 1 0
0 1 2
1
–
2
1
–
3
1
–
4
0 1 1 1
1 0 1 2
2 2 1 0
0 1 2
0 1 1 1
1 2 2 2
0 1 2
0 1 2 0
1 2 1 3
(a)
(b)
2
–
3
0 1 2
0 1 1 1
1 2 2 2
(d)
(c)
2
–
4
0 1 2
0 1 0 2
1 2 3 1
(e)
3
–
4
0 1
0 0 3
1 3 3
(f)
10097
M.C. Alonso et al. / Expert Systems with Applications 38 (2011) 10094–10098
To go from one configuration to the next, two random rows for each column are chosen. Column by column, in the matrix design, the values corresponding to two chosen rows are changed; if they are equal there will be no change, and if they are different there will be a change in that column. Therefore, we will obtain a range of changes between none and as many as the number of factors in the design. The parameter T Temperature simulates the temperature shifts. An important feature in the SA algorithm is the cooling schedule (Van Laarhoven & Aarts, 1998). After experimenting with many cooling schedule methods, the best results have been obtained with the following logarithmic approach:
T nþ1 ¼ T 0 =ð1 þ lnðnÞÞ:
ð5Þ
An application of SA to particular types of experimental designs, known as Taguchi experimental designs, can be found in Chang (2008). 5. Results and discussion Results show the versatility of SA for this kind of combinatorial problem. We have used a 3 GHz Pentium computer for our tests and subsequently, we have obtained computational time for hundreds of optimal designs applicable to several experimental situations. Table 3 includes some of those with less than 5 min of computational time. For small experiments, the computational time is very short. However, it is not always true that the number of factors, runs, levTable 3 Some designs constructed with SA.
Table 5 Design 37 2 (18) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
2 0 0 2 1 2 0 2 2 0 1 1 0 0 1 1 1 2
0 2 1 1 1 2 1 1 2 0 2 2 0 2 0 0 1 0
2 2 2 0 0 1 1 1 2 0 0 1 1 0 1 2 2 0
0 0 2 2 0 2 1 0 1 0 2 0 2 1 1 2 1 1
1 2 0 2 0 2 1 1 0 2 1 0 0 1 2 1 2 0
2 0 1 0 2 2 0 1 1 1 1 0 2 2 1 0 2 0
0 1 0 0 2 2 2 1 2 2 1 0 1 0 0 2 1 1
1 1 0 1 1 0 1 0 1 0 1 0 1 0 1 0 0 0
els, etc. generate more computational time. The more saturated the experimental design, the more computational time it may require. One major disadvantage of SA is the extremely long computational time it takes to generate some experiments. Due to the inherent property of SA, the same design can take different amounts of computational time every time it runs. An advantage of SA is that for some designs this algorithm needs a lower number of runs than with traditional methods. For instance, the design 36 22 requires 27 runs using traditional methods and 25 runs using SA (see Table 4). Another example is the design 37 2 that requires 27 runs using traditional methods and only 18 runs using SA (see Table 5), which means a save of 9 runs.
Design
6. Conclusions
4 23 (8) 3 24 (9) 4 34 (16) 37 2 (18) 4 33 24 (18) 36 22 (25) 4 33 23 (25) 43 24 (25) 46 (32)
Table 4 Design 36 22 (25). 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
1 0 2 1 1 0 1 2 0 0 2 1 0 0 0 1 2 1 1 1 2 0 0 0 1
1 0 0 1 0 0 2 1 1 0 1 0 1 2 2 1 2 0 2 1 0 0 1 1 0
0 0 0 0 1 0 2 2 2 2 0 1 1 0 1 0 1 1 0 1 1 0 1 1 2
0 2 1 1 0 0 2 0 1 1 1 1 0 0 1 2 0 1 1 0 2 0 1 2 0
1 0 0 0 0 2 0 0 2 1 1 0 0 0 1 1 1 1 2 2 2 1 0 1 1
1 0 1 0 1 0 0 2 1 0 0 2 0 1 1 1 0 0 2 0 1 2 1 2 1
0 0 1 0 0 1 1 1 0 0 0 0 0 0 1 1 0 1 0 1 0 1 1 0 0
1 0 0 0 0 0 1 0 0 1 1 1 0 1 0 0 0 0 0 1 1 1 1 0 0
This paper presents a method for obtaining fractional factorial designs with a minimum number of runs that achieves uncorrelated main effects, all based on the technique of simulated annealing. With appropriately tuned parameters for SA, it is possible to obtain optimality for experimental design for a fixed number of runs, for which could be applied the theory of proportional frequencies. For this kind of problem, SA may be considered as a general method applied to fractional factorial designs, which ensures an optimal solution. When an orthogonal design is not possible, SA gives designs with nearly orthogonal properties. Because there is no universal technique that outperforms all others in all cases, the addition of SA to the traditional design building techniques proposed here may provide practitioners with a useful alternative for dealing with problems of practical relevance. If we cannot invoke an assumption where all interactions are negligible, then a design with higher resolution is required. The SA method could be applied to higher resolutions (IV and V) with a properly generalized energy function. This must be accomplished in future work. Acknowledgements The authors would like to thank the Spanish MICINN for financial support, Project No. CGL2010-15357/BTE. References Addelman, S. (1962). Orthogonal main – Effect plans for asymmetrical factorial experiments. Technometrics, 4(1), 21–46.
10098
M.C. Alonso et al. / Expert Systems with Applications 38 (2011) 10094–10098
Cerny, V. (1984). Minimization of continuos functions by simulated annealing. Research Institute for Theoretical Physics, University of Helsinki, Preprint No. HU-TFT-84-51. Chang, H-H. (2008). A data mining approach to dynamic multiple responses in Taguchi experimental design. Expert Systems with Applications, 35, 1095–1103. Fang, KT., Lin, DKJ., & Liu, MQ. (2003). Optimal mixed-level supersaturated design. Metrika, 58, 279–291. Feynman, R. P., Leighton, R. B., Sands, M. (1963). Lectures on physics. AddisonWesley, sixth printing. Georgiou, S., Koukouvinos, C., & Mantas, P. (2006). On multi-level supersaturated designs. Journal of Statistical Planning and Inference, 136(8), 2805–2819. Guo, Y., Simpson, J. R., & Pignatiello, J. J. Jr., (2007). Construction of efficient mixedlevel fractional factorial designs. Journal of Quality Technology, 39(3), 241–257. Haines, LM. (1987). The application of the annealing algorithm to the construction of exact optimal design for linear-regression models. Technometrics, 29, 430–447. Heredia-Langner, A., Carlyle, WM., Montgomery, DC., Borror, CM., & Runger, GC. (2003). Genetic algorithm for the construction of D-optimal designs. Journal of Quality Technology, 35, 28–46.
Kirkpatrick, S. C., Gelatt, D., Vecchi, M. P. (1982). Optimization by simulated annealing, IBM Research Report RC 9355. Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., & Teller, E. (1953). Equation of state calculations by fast computing. Journal Chemical Physics, 21, 1087–1092. Montgomery, D. (2005). Design & analysis of experiments (6th ed.). New York: John Wiley & Sons. Nguyen, N-K., & Miller, AJ. (1992). A review of some exchange algorithm for constructing discrete D-optimal designs. Computational Statistics Data Analysis, 14, 489–498. Van Laarhoven, P. J. M., & Aarts, E. H. L. (1998). Simulated annealing: Theory and applications. Kluwer Academic Publishers. Xu, H. (2002). An algorithm for constructing orthogonal and nearly-orthogonal arrays with mixed levels and small runs. Technometrics, 44(4), 356–368. Yamada, S., & Lin, DKJ. (2002). Construction of mixed-level supersaturated design. Metrika, 56, 205–214.