Food Research International 26 (1993) 27-37
Random-centroid optimization for food formulation
Jinglie Dou, Sadiq Toma & Shuryo Nakai*
University of British Columbia, Department of Food Science, 6650 NW Marine Drive, Vancouver, British Columbia, Canada V6T 1Z4
Random-centroid optimization was modified by introducing a penalty function to accommodate constraints in formula optimization. The new program computed quality parameters that were not significantly different (p > 0.05) from those obtained by Forplex, the Complex (constrained simplex) method modified for formulation, and yielded similar ingredient compositions. The new program was as efficient as or more efficient than Forplex, which occasionally stalled while searching for the optimum. The new multifactor program, written to accommodate up to 20 factors, was found to be appropriate for use in research and development, especially for formulations with constraints, while Forplex is recommended for routine formulation.
Keywords: random-search, centroid-search, food formulation, optimization
*To whom correspondence should be addressed.
Food Research International 0963-9969/93/$06.00 © 1993 Canadian Institute of Food Science and Technology
INTRODUCTION
Mixture designs and linear programming are the most popular optimization techniques for food formulation. Mixture designs are a group of techniques which define combinations of ingredients (i.e. formulas) to achieve the optimum based on chosen quality parameters. These techniques use multiple regression analysis to derive regression equations modified to satisfy a composition constraint such that the sum of the ingredients equals 1.0 (Hare, 1974). Our experience shows that the reliability of the values obtained for ingredient composition is highly sensitive to errors in the response values measured during optimization experiments. These errors are inherent to optimization techniques based on curve fitting. To improve the reliability, mixture design experiments may have to be repeated using different extreme vertices to cover different search spaces. Furthermore, mixture designs cannot be applied to complicated formulations with constraints. Linear programming has been frequently used for menu formation (Harper & Wanninger, 1970)
as well as in the meat industry for least-cost formulation (Pearson & Tauber, 1984). Least-cost formulation minimizes the ingredient cost while maintaining bind values and composition within predetermined ranges (Norback & Evans, 1983). However, it is restrictive since it cannot accommodate nonlinear relationships, such as equations that predict product quality from ingredient composition. Nonlinearity of the ingredient-functionality relationships of proteins was reported by Arteaga et al. (1992). Since Morgan and Deming (1974) applied simplex optimization to the selection of analytical conditions, this method has become one of the most popular optimization techniques in chemistry. It can accommodate nonlinear equations to predict response values by including a subroutine; however, it cannot handle constraints, with the exception of a boundary constraint (Nakai, 1981). Vazquez-Arteaga (1990) modified the Complex (constrained simplex) technique of Box (1965) for application to meat formulation. The method (Forplex) searches for the best quality within an acceptable cost range, in contrast to least-cost formulation. Equations to predict quality parameters were derived as functions of the ingredient composition through small-scale experiments for frankfurter
preparation. Forplex was superior to least-cost formulation as it obtained quality parameter values closer to those of the desirable product than least-cost formulation did. In addition to its inability to handle constraints, simplex optimization suffers from the following shortcomings: (a) quick loss of efficiency during optimization and (b) difficulty in homing in on the global optimum. Random-centroid optimization was established in our laboratory (Nakai, 1990) to circumvent these shortcomings. It is possible to accommodate constraints through mapping by selecting new search scales that avoid the level values which would violate the constraints. The computer program written previously (Nakai, 1990) was subsequently modified to accommodate up to 20 factors to meet complicated quality requests for food products in the future. The objectives of this study were to write a multifactor program for random-centroid optimization useful for food formulation and to compare this method with other formula optimization techniques (other than mixture design; see the previous discussion).
METHODS
The following modifications were made to the computer program for the random-centroid optimization of Nakai (1990).

Random design algorithm
A deterministic rule used by Nakai (1990), as discussed later in the Discussion section, was modified in order to obtain a more uniform distribution of experimental points. The following $B_{ij}$ value, a coefficient describing the distance between two points, was defined:

$$B_{ij} = \frac{\sqrt{\sum_{k=1}^{n} (P_{ki} - P_{kj})^2}}{\sqrt{\sum_{k=1}^{n} (U_k - L_k)^2}} \quad (i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, i-1) \qquad (1)$$

where $n$ and $N$ are the number of factors and the number of experimental points randomly chosen, respectively, and $P_{ki}$ and $P_{kj}$ are the coordinate values of two points $P_i$ and $P_j$ for the kth factor, respectively. The numerator is the geometric distance between $P_i$ and $P_j$ in the n-dimensional space, and $L_k$ and $U_k$ are the lower and upper bounds of factor k. The denominator is the diagonal distance of the n-dimensional hypercube. $B_{ij}$ is computed for each pair of points i and j. The preset value of $t$ is the threshold of $B_{ij}$ (usually $t$ is 0.25-0.33 depending on the number of factors and the number of experimental points). If $B_{ij} \ge t$ (for $j = 1, 2, \ldots, i-1$), the new point $P_i$ is accepted; otherwise it is discarded and another point is randomly generated and tested again.

Centroid search
The centroid is the average of r vertices excluding the worst vertex among r + 1 vertices, as in the case of simplex optimization for r factors (Morgan & Deming, 1974). Centroid search is conducted by altering the vertex to be excluded in the centroid computation from the worst to the second worst and then to the third and so on, until the subsequent response becomes worse than the preceding response (Aishima & Nakai, 1986). In the multifactor program, the centroid coordinate $C_i$ for the ith factor was computed using the following formula:

$$C_i = \frac{1}{q} \sum_{j=1}^{q} x_{ij} \qquad (2)$$

where $x_{ij}$ is the level of the ith factor at the jth best vertex and $q$ is the number of the best vertices selected for computation of the centroid out of the vertices used in the random search; $q$ is usually the same as or slightly larger than the number of factors $n$.

Mapping
Mapping is an approximation of the response surface. It assists visualization of the true response surface, which is usually invisible during the sequential steps of simplex optimization. This method was first introduced by Nakai et al. (1984) and later revised (Nakai, 1990). The disadvantage of mapping, however, is that the predictability of the optimum weakens quickly as the number of factors is increased. The number of factors which could be manipulated was increased from 5, used in the previous program (Nakai, 1990), to 20 in the new program used in this study. The present program also includes a routine for the simultaneous viewing of four maps on the monitor
Table 1. Quality prediction equations -..Shrink Tmloss
= 17,13x, + 11.83x, - 104,5x,x, - 12.18x,x, = 16.43~~ + 39.69x, + 68.15x,x2 + 69.65x,x,
Twloss
= 10.1 lx, + 29.75x, + 29.11x,x,
ES
= 2.61x, ~ 0.19x3
+ 231,7x,x,
+ 16.48x,x,
Exfluid
= 13.67~) + 20.64~~ + 12.13x, - 31.93x,x,
Exwater
= 18.15~~ + 10.58~~ - 35,57x,x,
Exfat
= 15.01x, + 1.72~~ + 1.24x,
- 27.43~~~~
- 25,31x,x,
Hard 1
= 795.99x, + 212.25~~ - 8308,69x,x,
Shear
= -1,79x,
- 1131.69x,x,
+ 14410.5x,x,x,
+ 4,71x, + 7.33x,
Cohes
= 0.291x, + 0.222~~ + 0.328~~ - 0.336x,x,
Gummy
= 220.27~~ + 68,79x, - 2266,94x,x,
Chewy
=
-219,77x,
~ 56,08x,x,
- 339.77~~~~ + 4015.16x,x2x,
+ 46.17~~ + 275.3x, ~______.
x1, x2 and x3 are pork fat, mechanically deboned poultry meat and beef meat, respectively (Vazquez-Arteaga, 1990).
HARD1: Hardness at first compression (Newtons).
SHRINK: % Weight loss after processing.
TWLOSS: % Weight loss after thermal treatment.
SHEAR: Maximum shear force (Newtons).
EXFLUID: % Expressible fluid.
COHES: Cohesiveness.
TMLOSS: % Water loss per moisture content of the meat block.
ES: % Fat released after thermal treatment.
EXFAT: % Expressible fat.
GUMMY: Gumminess (Newtons).
EXWATER: % Expressible water.
CHEWY: Chewiness (Newtons).
screen as was previously reported (Nakai, 1990). It was found that this improved program facilitated narrowing of the search spaces for the subsequent search cycle. The mapping subroutine was thus modified to further expand, if necessary, the scale for grouping described previously (Nakai, 1990), despite sacrificing the reliability of the derived map.

Penalty function
Random-centroid optimization cannot, in principle, handle constraints other than the boundary constraint. In the case of the boundary constraint, the boundary of each factor level can readily be respected by setting the search space within the boundaries. For other constraints, a penalty function was introduced to restrain the search within the constraints. A penalty is imposed on experiments which violate a constraint by assigning a highly inferior response value. The use of this subroutine is, however, left as an option to the operator. Assume that the ith of $m$ constraints, $g_i(x)$, is defined as $c_{1i} \le g_i(x) \le c_{2i}$ for minimization of $f(x)$, where $x$ is a vector in the n-dimensional space and $c_{1i}$ and $c_{2i}$ are the bounds of the constraint. Then $h_{1i}$ is set to $g_i(x) - c_{1i}$ when $g_i(x) - c_{1i} < 0$ and to 0 when $g_i(x) - c_{1i} \ge 0$, and $h_{2i}$ is set to $c_{2i} - g_i(x)$ when $c_{2i} - g_i(x) < 0$ and to 0 when $c_{2i} - g_i(x) \ge 0$. The new response $\phi(x)$ is then computed as follows:

$$\phi(x) = f(x) + p \sum_{i=1}^{m} \left( h_{1i}^2 + h_{2i}^2 \right) \qquad (3)$$

where $p$ is a large value, e.g. 1000. Therefore, if $g_i(x)$ is within its bounds, both $h_{1i}$ and $h_{2i}$ equal 0 and thus $\phi(x) = f(x)$; otherwise, $\phi(x)$ becomes an extremely large value, which acts as a penalty in the case of minimization. The computer program for formulation was written in Quick Basic for IBM PC computers as an accessory to the program for the multifactor random-centroid optimization.

Optimization of frankfurter formulation
The equations shown in Table 1 were used in a comparison of random-centroid formulation and Forplex in predicting the quality parameters of meat products. To derive the equations, Vazquez-Arteaga (1990) prepared frankfurter samples from skinless pork fat ($x_1$), frozen mechanically deboned poultry meat ($x_2$) and lean beef ($x_3$). The
mixture design method of Hare (1974) was used to construct 10 different formulas. The parameters measured are shown in Table 1 (footnote). Both optimization methods searched for the best quality formula that met predetermined product specifications for (a) proximate composition, (b) ingredient levels and (c) cost range. The combined response value (R) to be minimized was computed as the sum of the absolute values of the standardized differences of the predicted quality values from the target quality values as follows:

$$R = \sum_{i} \frac{\left| \text{target } y_i - \text{predicted } y_i \right|}{d_i} \qquad (4)$$

where $y_i$ and $d_i$ are the ith quality parameter and the detectable limit of the ith quality parameter, respectively. In this study, $d_i$ was replaced by $0.01 \times (\max - \min)_i$ for each quality parameter, where $(\max - \min)_i$ was the range of predicted values.
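As an illustration of eqns (3) and (4), the following Python sketch computes the combined response and then applies the penalty. It is not the authors' Quick Basic program. The function names, the detectable limits `d` and the constraint values are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def combined_response(targets: Sequence[float],
                      predicted: Sequence[float],
                      detect_limits: Sequence[float]) -> float:
    """Eqn (4): sum of absolute standardized differences from the targets."""
    return sum(abs(t - y) / d
               for t, y, d in zip(targets, predicted, detect_limits))


def penalized_response(f_value: float,
                       constraint_values: Sequence[float],
                       bounds: Sequence[Tuple[float, float]],
                       p: float = 1000.0) -> float:
    """Eqn (3): add p * sum(h1^2 + h2^2) to the response being minimized."""
    penalty = 0.0
    for g, (c1, c2) in zip(constraint_values, bounds):
        h1 = min(g - c1, 0.0)  # non-zero only when g falls below the lower bound
        h2 = min(c2 - g, 0.0)  # non-zero only when g exceeds the upper bound
        penalty += h1 ** 2 + h2 ** 2
    return f_value + p * penalty


# Two quality parameters; the detectable limits d are hypothetical values.
targets = [132.3, 8.75]
predicted = [131.67, 8.74]
d = [0.5, 0.05]
R = combined_response(targets, predicted, d)

# One hypothetical constraint with bounds 9.5-10.5.
inside = penalized_response(R, [10.0], [(9.5, 10.5)])   # constraint satisfied
outside = penalized_response(R, [9.3], [(9.5, 10.5)])   # constraint violated
print(R, inside, outside)
```

When the constraint is satisfied the response is unchanged. A violation of 0.2 adds roughly 1000 x 0.2^2 = 40 to the response, which makes the point clearly inferior in a minimization.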
Multifactor model for 15 factors
A mathematical model for 15 factors ($x_1$-$x_{15}$) was formulated using the matrices of Bowman and Gerard (1967) to compute the response $y$ as follows:

$$
\begin{aligned}
y ={} & 3.3x_1^2 + 3.1x_2^2 + 3.2x_3^2 + 1.6x_4^2 + 2.6x_5^2 + 2.9x_6^2 + 3.2x_7^2 + 3.5x_8^2 \\
& + 2.8x_9^2 + 2.6x_{10}^2 + 2.9x_{11}^2 + 3.2x_{12}^2 + 3.5x_{13}^2 + 2.9x_{14}^2 + 2.9x_{15}^2 \\
& + 2x_1x_2 + 2.2x_2x_3 + 2.4x_3x_4 + 1.6x_4x_5 + 1.8x_5x_6 + 2x_6x_7 + 2.2x_7x_8 \\
& + 2.4x_8x_9 + 1.6x_9x_{10} + 1.8x_{10}x_{11} + 2x_{11}x_{12} + 2.2x_{12}x_{13} + 2.4x_{13}x_{14} + 1.8x_{14}x_{15} \\
& - 4.39x_1 - 3.3x_2 - 3.14x_3 - 2.4x_4 - 4.64x_5 - 3.22x_6 - 4.5x_7 - 7.22x_8 \\
& - 6.16x_9 - 5.68x_{10} - 5.14x_{11} - 4.46x_{12} - 8.9x_{13} - 7.74x_{14} - 2.78x_{15} + 21.347
\end{aligned}
\qquad (5)
$$

When the x values are 0.5, 0.3, 0.2, 0.4, 0.7, 0.2, 0.4, 0.7, 0.6, 0.7, 0.6, 0.2, 0.9, 0.9 and 0.2 for $x_1$-$x_{15}$, $y$ assumes the minimum value of 0. Equation (5) was converted to a hypothetical surimi formulation model using surimi base (28% moisture), egg white powder, soy protein concentrate, gluten, starch and fat as the ingredients. It was assumed that the prepared formula would be preheated and cooked in a specially designed extruder. The compositions of the above six ingredients were optimized. Temperature and time of preheating and extrusion cooking were simultaneously optimized. Therefore, it was a 10-factor optimization with a constraint for the protein content of the cooked product, assuming that there was no moisture loss during extrusion cooking. Search ranges of the ingredients were entered so that the total was 1.0; they were 0.15-0.25, 0-0.02, 0-0.01, 0-0.01, 0-0.02 and 0-0.03 for surimi base, dry egg white, soy protein concentrate, gluten, starch and fat, respectively. The seventh factor was water, which was computed by the program as the residual, i.e. by subtracting the total of the above six ingredients from 1.0. The ranges chosen for the processing conditions were 50-85°C for 5-10 min for preheating and 90-120°C for 0.5-2 min for extrusion cooking. The protein content of the final product was the constraint, which should have been between 0.15 and 0.25 after entering the protein content of each ingredient as 0.72, 1.0, 0.7, 0.9, 0 and 0 into the computer.

Three different computations were conducted. (1) The effectiveness of random-centroid optimization and Forplex for hypothetical frankfurter formulation was compared. (2) Random-centroid optimization was applied to the model with 15 factors [Equation (5)], followed by models with 4-13 factors derived from the 15-factor model. (3) Equation (5) was also converted to a hypothetical model for surimi preparation with 10 factors (6 ingredients and 4 processing conditions), which was optimized using random-centroid optimization. With the program written, one can readily increase the number of constraints for formula optimization.

RESULTS
Comparison of two optimization methods
Cases with two quality parameters
A comparison of Forplex (a modified Complex) and random-centroid optimization (RCO) is shown in Table 2. For model 1, Forplex was slightly better than RCO with smaller differences in Hard1 from the target value 132.3. In model 2, RCO was better than Forplex in terms of Twloss. For model 3, there was no marked difference between the two methods. For models 4, 5, and 6, Forplex was superior in ES, both Exfat and Gummy, and Chewy, respectively. However, Forplex occasionally required larger numbers of iterations, 59, 99 and 72 in models 3, 4 and 5, respectively. This may be critical for experimental optimization when the conducting of experiments is time-consuming and costly.
Table 2. Comparison of Forplex and random-centroid optimization for frankfurter formula optimization trials where two quality parameters were considered as a measure of the formulation quality

Model  Quality parameters(a)  Target values   Method           IT(b)  X1(c)  X2     X3     Response value(d)  Predicted quality
(1)    HARD1, SHRINK          132.3, 8.75     Forplex          22     0.25   0.201  0.549  2.73               131.67, 8.74
                                              Forplex          20     0.283  0.160  0.558  69.54              132.68, 8.57
                                              Random-centroid  23     0.272  0.170  0.558  27.12              134.03, 8.65
                                              Random-centroid  27     0.271  0.169  0.561  30.75              136.02, 8.69
(2)    TWLOSS, SHEAR          22.1, 4.53      Forplex          13     0.238  0.242  0.520  49.15              21.63, 4.53
                                              Forplex          22     0.241  0.233  0.562  32.0               21.7, 4.52
                                              Random-centroid  21     0.248  0.202  0.550  2.15               22.11, 4.54
                                              Random-centroid  25     0.243  0.21   0.547  2.27               22.07, 4.56
(3)    EXFLUID, COHES         9.6, 0.25       Forplex          59     0.246  0.196  0.557  11.22              9.63, 0.252
                                              Forplex          22     0.247  0.199  0.554  10.95              9.61, 0.252
                                              Random-centroid  28     0.229  0.197  0.560  8.94               9.52, 0.251
                                              Random-centroid  28     0.235  0.197  0.563  17.19              9.59, 0.252
(4)    TMLOSS, ES             38.1, 0.55      Forplex          20     0.250  0.201  0.549  2.1                38.7, 0.55
                                              Forplex          99     0.248  0.205  0.547  2.67               38.04, 0.55
                                              Random-centroid  26     0.244  0.191  0.564  16.18              38.28, 0.53
                                              Random-centroid  27     0.246  0.194  0.559  8.42               38.2, 0.536
(5)    EXFAT, GUMMY           4.78, 33.87     Forplex          72     0.252  0.199  0.549  3.57               4.81, 33.58
                                              Forplex          30     0.252  0.198  0.550  2.69               4.81, 33.75
                                              Random-centroid  23     0.258  0.187  0.555  12.54              4.88, 34.2
                                              Random-centroid  28     0.255  0.191  0.555  9.3                4.84, 34.4
(6)    EXWATER, CHEWY         4.88, 105.71    Forplex          24     0.252  0.198  0.550  2.13               4.88, 105.17
                                              Random-centroid  28     0.252  0.201  0.547  3.19               4.85, 104.49

(a) See Table 1 for definitions. (b) Iteration. (c) X1, skinless pork fat; X2, frozen mechanically deboned poultry meat; X3, lean beef. (d) Instead of the sum, the product of the absolute values of the standardized differences was used for comparisons in Tables 2-5, as it was used by Vazquez-Arteaga (1990). To avoid degeneration, r_i = (target y_i - predicted y_i)/d_i was replaced by 1.0 when r_i was less than 1.0.
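The product-form response described in footnote (d) can be sketched in Python as follows. This is a minimal illustration rather than the original program. The function name and the detectable limits are assumptions, and the floor of 1.0 is applied to the absolute standardized difference.

```python
import math
from typing import Sequence


def product_response(targets: Sequence[float],
                     predicted: Sequence[float],
                     detect_limits: Sequence[float]) -> float:
    """Product of absolute standardized differences, each floored at 1.0."""
    ratios = []
    for t, y, d in zip(targets, predicted, detect_limits):
        r = abs(t - y) / d
        # A ratio below 1.0 would shrink the product toward zero and hide
        # deviations in the other parameters, so it is replaced by 1.0.
        ratios.append(max(r, 1.0))
    return math.prod(ratios)


# Hypothetical detectable limits; targets and predictions for two parameters.
value = product_response([132.3, 8.75], [131.67, 8.74], [0.25, 0.05])
exact = product_response([132.3, 8.75], [132.3, 8.75], [0.25, 0.05])
print(value, exact)
```

With this floor, a formula that hits every target within its detectable limit scores exactly 1.0, which is the smallest value the product can take.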
Table 3. Comparison of Forplex and random-centroid optimization for frankfurter formula optimization trials where three and four quality parameters were considered as a measure of the formulation quality

Model  Quality parameters(a)             Target values               Method           IT(b)  X1(c)  X2     X3     Response value(d)  Predicted quality
(1)    EXWATER, CHEWY, GUMMY             7.32, 178.13, 53.0          Forplex          31     0.148  0.101  0.751  2.96               7.33, 179.11, 53.11
                                                                     Random-centroid  27     0.148  0.102  0.750  2.77               7.31, 178.38, 52.99
(2)    SHRINK, ES, HARD1                 10.04, 0.25, 191.4          Forplex          47     0.148  0.117  0.735  3.86               10.02, 0.250, 191.33
                                                                     Random-centroid  27     0.149  0.112  0.739  3.26               10.02, 0.25, 191.4
(3)    TMLOSS, TWLOSS, EXFLUID, EXFAT    40.27, 25.61, 10.68, 3.35   Forplex          50     0.150  0.100  0.75   2.54               40.23, 25.6, 10.66, 3.31
                                                                     Random-centroid  25     0.147  0.101  0.752  4.58               40.22, 25.65, 10.66, 3.31

(a) See Table 1 for definitions. (b) Iteration. (c) X1, skinless pork fat; X2, frozen mechanically deboned poultry meat; X3, lean beef. (d) Instead of the sum, the product of the absolute values of the standardized differences was used for comparisons in Tables 2-5, as it was used by Vazquez-Arteaga (1990). To avoid degeneration, r_i = (target y_i - predicted y_i)/d_i was replaced by 1.0 when r_i was less than 1.0.
Table 4. Comparison of Forplex and random-centroid optimization for frankfurter formula optimization trials where five quality parameters were considered as a measure of the formulation quality

Model  Quality parameters(a)                 Target values                       Method           IT(b)  X1(c)  X2     X3     Response value(d)  Predicted quality
(1)    TWLOSS, EXWATER, EXFAT, HARD1, COHES  25.61, 7.32, 3.35, 191.4, 0.27      Forplex          16     0.15   0.10   0.75   48.1               25.61, 7.32, 3.35, 191.4, 0.21
                                                                                 Forplex          56     0.152  0.101  0.747  39.0               25.56, 7.32, 3.35, 191.3, 0.27
                                                                                 Random-centroid  28     0.158  0.095  0.747  30.94              25.56, 7.3, 3.46, 190.7, 0.27
                                                                                 Random-centroid  27     0.157  0.105  0.738  44.71              25.41, 7.17, 3.45, 190.9, 0.27
(2)    SHRINK, EXFLUID, SHEAR, COHES, GUMMY  10.04, 10.68, 5.1, 0.27, 52.99      Forplex          95     0.149  0.098  0.753  51.23              10.05, 10.7, 5.1, 0.27, 53.1
                                                                                 Forplex          30     0.150  0.101  0.749  47.15              10.03, 10.66, 5.66, 0.27, [missing]
                                                                                 Random-centroid  28     0.153  0.103  0.744  36.2               10.01, 10.64, 5.66, 0.27, 52.8
                                                                                 Random-centroid  27     0.153  0.100  0.747  36.3               10.02, 10.68, 5.67, 0.27, [missing]
(3)    TMLOSS, EXFAT, HARD1, GUMMY, CHEWY    40.27, 3.35, 191.4, 53.00, 178.13   Forplex          21     0.150  0.101  0.749  2.6                40.24, 3.35, 191.4, 53.0, 171.9
                                                                                 Forplex          41     0.149  0.101  0.749  2.8                40.19, 3.34, 191.2, 53.0, [missing]
                                                                                 Random-centroid  28     0.151  0.090  0.759  9.23               40.51, 3.36, 191.2, 53.0, 179.9
                                                                                 Random-centroid  28     0.147  0.094  0.759  10.06              40.38, 3.31, 191.5, 53.2, [missing]
(4)    TWLOSS, ES, EXFAT, HARD1, COHES       25.61, 0.25, 3.35, 191.4, 0.27      Forplex          30     0.149  0.102  0.749  51.4               25.60, 0.25, 3.34, 191.47, 0.27
                                                                                 Forplex          35     0.150  0.098  0.752  51.0               25.65, 0.25, 3.35, 191.38, 0.27
                                                                                 Random-centroid  28     0.155  0.101  0.744  36.18              25.5, 0.26, 3.35, 191.04, 0.273
                                                                                 Random-centroid  28     0.154  0.102  0.744  35.8               25.5, 0.26, 3.41, 191.12, 0.273

(a) See Table 1 for definitions. (b) Iteration. (c) X1, skinless pork fat; X2, frozen mechanically deboned poultry meat; X3, lean beef. (d) Instead of the sum, the product of the absolute values of the standardized differences was used for comparisons in Tables 2-5, as it was used by Vazquez-Arteaga (1990). To avoid degeneration, r_i = (target y_i - predicted y_i)/d_i was replaced by 1.0 when r_i was less than 1.0.
Cases with three and four quality parameters
In this group of combinations (Table 3), Forplex showed slightly inferior results for model 1; otherwise both methods showed comparable results. Again, Forplex required larger numbers of iterations in certain cases.
Cases with five quality parameters
For combinations of five quality parameters (Table 4), RCO yielded slightly larger deviations in quality parameters from the target values in models 1, 3 and 4; however, it was superior to Forplex in terms of the number of iterations. More than 40 experiments were occasionally required in the case of Forplex, while fewer than 30 experiments were sufficient when RCO was used.
Cases when a quality parameter is constrained
Three formula optimization trials were conducted as shown in Table 5, using different combinations of quality parameters with one extra quality parameter assigned as a constraint. The lower and upper limits of the quality parameters used as constraints were arbitrarily chosen as shown in
Table 5. Comparison of Forplex and random-centroid optimization for frankfurter formula optimization trials where a quality parameter was considered as a constraint

Model  Quality parameters(a)          Target values            Method           IT(b)      X1(c)  X2     X3     Response value(d)  Predicted quality
(1)    SHRINK, SHEAR, COHES, GUMMY    8.1, 4.8, 0.255, 40.0    Forplex          [missing]  0.223  0.198  0.579  372.8              9.1, 4.78, 0.255, 39.8
                                                               Random-centroid  25         0.238  0.180  0.581  287.0              9.03, 4.68, 0.253, 39.15
(2)    EXWATER, EXFAT, HARD1, COHES   6.0, 4.5, 160, 0.255     Forplex          [missing]  0.233  0.164  0.602  127.8              5.49, 4.53, 160.6, 0.255
                                                               Random-centroid  21         0.231  0.149  0.620  125.3              5.7, 4.49, 167.2, 0.255
(3)    SHRINK, EXFAT, COHES           9.0, 4.5, 0.255          Forplex          [missing]  0.230  0.194  0.576  33.35              9.06, 4.5, 0.254
                                                               Random-centroid  28         0.238  0.155  0.606  64.6               9.14, 4.59, 0.254

Quality parameters used as constraints were: model 1: EXFLUID 9.50-10.50; model 2: TWLOSS 22.00-24.00; model 3: HARD1 150.0-180.0.
(a) See Table 1 for definitions. (b) Iteration. (c) X1, skinless pork fat; X2, frozen mechanically deboned poultry meat; X3, lean beef. (d) Instead of the sum, the product of the absolute values of the standardized differences was used for comparisons in Tables 2-5, as it was used by Vazquez-Arteaga (1990). To avoid degeneration, r_i = (target y_i - predicted y_i)/d_i was replaced by 1.0 when r_i was less than 1.0.
Table 5 (footnote). In all three trials, both methods successfully homed in on optimal formulas, with slightly larger deviations from the target values in the case of RCO as compared to Forplex.

Multifactor optimization
RCO was applied to the 15-factor model [eqn (5)]. Equation (5) was also used for optimization computations with 3-15 factors by replacing unused factors with their optimal level values. To optimize these models, the mapping process was automated by setting the search spaces for each subsequent search cycle to one-third the size of the search spaces of the previous cycle, around the best response values. Figure 1 shows the number of experimental points required for search convergence for mathematical models with different numbers of factors. When the number of factors is less than eight, the number of experimental points required for optimization increases slowly up to 50; however, it increases quickly when the number of factors is greater than eight. Therefore, it is recommended that the factors be divided into two groups when the number of factors is greater than eight and that two series of optimizations be conducted as shown in Figure 2. Optimization of the eight factors that are more important than the remaining seven is carried out first. Subsequently, the second series of optimization, including all factors, is conducted using narrow search spaces around the best level values for the eight factors found in the first series. As shown in Figure 2, the model optimization was carried out for the first eight factors and required 49 and 24 experiments for the first and second cycles, respectively. After selecting the search spaces from the maps, 74 and 34 experiments for the first and second cycles, respectively, were required for the second series of optimization for seven factors in addition to the eight factors, using the narrow ranges chosen in the first series. Therefore, a total of 181 experiments were required to finally home in on the optimum. This number was almost half that for the 15-factor model shown in Figure 1 without dividing the factors into two groups. A potential risk of missing the global optimum exists in this strategy as a result of narrowing the search ranges of the factors selected in the first series of optimization. As observed previously (Nakai, 1990), random search possesses high flexibility by freely extending its search spaces outside of the set ranges if required and finally homing in on the global optimum in the case of models with local optima. Therefore, the global optimum is unlikely to be overlooked frequently.

Fig. 1. Relationship between the number of factors and the number of experimental points needed to obtain convergence.

Fig. 2. Flowchart of random-centroid optimization of the 15-factor model [eqn (5)] when two series of optimizations are used. (First group: factors 1-8; second group: factors 9-15.)

Surimi formulation
Equation (5), after modification, was used for a hypothetical surimi formulation. The model was designed so that the optimum was located at the following formulation conditions: surimi base (0.2), egg white powder (0.01), soy protein isolate (0.003), gluten (0.002), starch (0.005), fat (0.01), water (computed to be 0.77 by subtracting the total of the other ingredients from 1.0), preheating temperature (70°C), preheating time (6 min), extrusion cooking temperature (110°C) and extrusion time (1.2 min). The protein constraint was set at 15-25% of the final product. The composition of the six ingredients and the four processing conditions were simultaneously optimized. After a total of 60-70 experiments, when different operators carried out three consecutive cycles, a y value of 0.28 was obtained, with the largest y of 204.4 yielded during the optimization. The best values obtained for the ingredients were 0.201 (surimi base), 0.01 (egg white protein), 0.004 (soy protein isolate), 0.002 (gluten), 0.005 (starch) and 0.01 (fat), with a value of 0.766 for water (the moisture content of the final product would be higher than this due to the contribution of moisture from the other ingredients). The best values found for processing were 76.6°C (preheating temperature), 6.2 min (preheating time), 104.1°C (extrusion cooking temperature) and 1.3 min (extrusion time). Examples of the maps drawn after 21 experiments in cycle 1 are illustrated in Figure 3(A) and (B). The arrows at the bottom of the maps show the assigned locations of the optimum which were used for computing the approximated slope curves of the response surface. For surimi base [Fig. 3(A)], map (a) was not reasonable as the curve and the hypothetical optimum location (indicated by the large shaded arrow on the x axis) did not match. Maps (c) and (d) were more likely to be the true response surface. Although maps (e) and (f) did not appear extremely unreasonable, the fact that two maps (c) and (d) demonstrated the same surface trends rather than one favored the conclusion that maps (g) and (h) [Fig. 3(B)] were likely to be a more reliable approximation of the response surface. As the number of factors increased, as in the case of surimi formulation, it became increasingly difficult to choose the search spaces for the subsequent cycle from maps. For cases such as this, selection of one-third the size of the search space around the best response found in the previous search is recommended.
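The one-third narrowing rule recommended above can be sketched in Python as follows. The function name and the way the window is shifted back inside the previous range are assumptions made for this illustration; the text notes that the original random search may also extend beyond the set ranges when required, which this sketch does not do.

```python
from typing import List, Sequence, Tuple


def narrow_search_space(bounds: Sequence[Tuple[float, float]],
                        best_point: Sequence[float],
                        factor: float = 1.0 / 3.0) -> List[Tuple[float, float]]:
    """Return new (low, high) ranges of `factor` times the old width,
    placed around the best level value found for each factor."""
    new_bounds = []
    for (low, high), best in zip(bounds, best_point):
        half = (high - low) * factor / 2.0
        new_low, new_high = best - half, best + half
        # Assumption: keep the window inside the previous range by shifting it,
        # so the width stays at one-third even when the best point is at an edge.
        if new_low < low:
            new_low, new_high = low, low + 2.0 * half
        elif new_high > high:
            new_low, new_high = high - 2.0 * half, high
        new_bounds.append((new_low, new_high))
    return new_bounds


# Ranges for surimi base and preheating temperature from the text, with the
# best level values reported for them.
bounds = [(0.15, 0.25), (50.0, 85.0)]
best = [0.201, 76.6]
narrowed = narrow_search_space(bounds, best)

# A best point near the lower edge of a hypothetical 0-1 range.
edge = narrow_search_space([(0.0, 1.0)], [0.05])
print(narrowed, edge)
```

Each cycle then repeats the random and centroid searches inside the narrowed ranges.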
Fig. 3. Examples of cycle 1 mapping of surimi formula optimization. A: surimi base (x axis: surimi base); B: extrusion temperature (x axis: extrusion temperature, °C). Maps designated with letters a, b, c and d in Fig. 3(A) and e, f, g and h in Fig. 3(B) are maps drawn according to the method of Nakai (1990) by assuming that the optimum is located where the large shaded arrow at the bottom of each map points on the x scale. A pair of small arrows on a map indicates the area on the x scale, bordered by the arrows, where the optimum may be located.
DISCUSSION
Random search is an alternative to grid methods for locating the global optimum; however, the random method is often inefficient, as was discussed by Schwefel (1981). To avoid this problem, deterministic rules are frequently introduced. The rules included in the previous program (Nakai, 1990) were (a) at least one of the randomly selected level values of each factor should lie within 20% of the search space from each of the upper and lower ends of the space and (b) the average of all level values of
each factor should be within f 10% of the span of search space from the mid-point of the search space. However, as the number of factors is increased, a more uniform distribution of the search interval is required; otherwise, an incomplete coverage of search space would result. Modification of the random design algorithm for more even distribution of search interval is an approach toward the grid search in defining the working conditions of experimental points. However, the adjusted random search in the new optimization program still exploits the advantage of random search and is relatively insensitive against stochastic perturbation. The results shown in Tables 24 indicated that there were no significant differences 0, > 0.05) in deviations predicted quality values from the target between random-centroid and Forplex, using Student’s t-test. However, it was observed that Forplex search occasionally stalled near the boundaries. In addition, the speed to reach the optimum slowed down as more constraints were introduced, due probably to the fact that Forplex is a restrained search technique which retreats from the boundaries defined by the constraints. This is inherent to Complex due to the nature of the algorithm. However, this is not a serious drawback of Complex when it is used for computational optimization (e.g. routine formulation) as the computer time required for optimization is practically inconsequential, therefore repeating the search is relatively inexpensive. The optimization efficiency of the Forplex is dependent on the initial search space chosen, therefore, by entering different search spaces it is possible to finally home-in on the true optimum. The higher optimization efficiency obtained by random-centroid search is an advantage which is critical for new product developments as the number of experiments immediately affects the costs of research and development. 
A similar advantage of random-centroid optimization over simplex optimization was discussed previously (Nakai, 1990). A penalty-function subroutine was introduced into the new computer program for random-centroid optimization to restrain the search from entering the off-limit zone. This subroutine improves the efficiency of optimization by providing a capacity to handle constraints, which the original program lacks. However, it is sometimes useful to carry out the random-centroid search without activating the penalty-function subroutine. The conditions used for setting constraints may change, and new constraints may be needed in future optimization. Maps drawn after searching without the restriction of constraints are usually useful for visualizing the entire situation, including the factor levels at which the constraints would be violated.
There are two different strategies in optimization: deterministic and probabilistic. The simplex, Complex and linear programming methods are all deterministic, while random-centroid is typical of probabilistic methods. Theoretically, deterministic methods, which are regulated by rules in searching for the optimum, are more efficient than probabilistic methods. Moreover, the rigidity of a deterministic algorithm based on a fixed internal model of the objective function is advantageous if the deterministic strategy corresponds closely enough to the model. If this is not the case, the advantage may even turn into a disadvantage. In contrast, if the objective function is stochastically perturbed, as in research and development, numerous unknown regulations or relations would be involved and the probabilistic approach may be more appropriate. Random methods, which do not fit any particular model, are best suited to cases in which the optimum lies in a particularly difficult location, such as a strongly restricted search area for objective functions with large perturbations. The Monte Carlo optimization technique of Hendrix (1980) is a frequently used random method. Although there are no data for direct comparison with the random-centroid optimization proposed in this study, it is unlikely that the Monte Carlo technique possesses efficiency and flexibility comparable to those of random-centroid optimization, which is now capable of accommodating constraints.
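The role of the penalty-function subroutine discussed above can be sketched in Python as follows. The form of the penalty (a weight multiplied by the total amount by which the constraints are violated, added to a response that is being minimized), the function name `penalized_response`, the `use_penalty` switch and the example constraint are all assumptions made for illustration; the subroutine itself is not given in this paper.

```python
def penalized_response(response, formula, constraints,
                       penalty_weight=1000.0, use_penalty=True):
    """Return the response plus a penalty for violated constraints.

    response       -- value to be minimized, e.g. deviation from the target
    formula        -- tuple of ingredient levels
    constraints    -- list of (lower, upper, func) triples, where func maps a
                      formula to the constrained quantity (hypothetical layout)
    penalty_weight -- assumed scaling factor for the violation
    use_penalty    -- set to False to search without constraints
    """
    if not use_penalty:
        return response
    violation = 0.0
    for lower, upper, func in constraints:
        value = func(formula)
        # Distance below the lower limit plus distance above the upper limit.
        violation += max(0.0, lower - value) + max(0.0, value - upper)
    return response + penalty_weight * violation


# Hypothetical example: three ingredient fractions summing to 1.0, with the
# first ingredient restricted to between 10% and 30% of the formula.
constraints = [(0.10, 0.30, lambda f: f[0])]
inside = penalized_response(2.0, (0.20, 0.50, 0.30), constraints)
outside = penalized_response(2.0, (0.40, 0.30, 0.30), constraints)
unpenalized = penalized_response(2.0, (0.40, 0.30, 0.30), constraints,
                                 use_penalty=False)
```

A formula inside the limits keeps its response unchanged, one outside the limits is made to look much worse so that the search moves away from it, and switching the penalty off returns the raw response, which corresponds to mapping the whole search space without the restriction of constraints.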
REFERENCES
Aishima, T. & Nakai, S. (1986). Centroid mapping optimization: a new efficient optimization for food research and processing. J. Food Sci., 51, 1297-300, 1310.
Arteaga, G., Li-Chan, E., Nakai, S., Cofrades, S. & Jiménez-Colmenero, F. (1992). Mixture design experimentation: a new approach to study hydrophobicity and functionality of food protein mixtures. J. Food Sci., in press.
Bowman, F. & Gerard, F. A. (1967). Higher Calculus. Cambridge University Press, London, p. 234.
Box, M. J. (1965). A new method of constrained optimization and a comparison with other methods. Computer J., 8, 42-52.
Hare, L. B. (1974). Mixture designs applied to food formulation. Food Technol., 28, 50-56, 62.
Harper, J. M. & Wanninger, L. A. Jr (1970). Process modeling and optimization. 3. Use of mathematical model to optimize product and process. Food Technol., 24, 590-95.
Hendrix, C. (1980). Through the response surface with test tube and pipe wrench. Chem. Technol., 10, 488-97.
Morgan, S. L. & Deming, S. N. (1974). Simplex optimization of analytical chemical method. Anal. Chem., 46, 1170-81.
Nakai, S. (1981). Comparison of optimization techniques for application to food product and process development. J. Food Sci., 41, 144-52, 157.
Nakai, S. (1990). Computer-aided optimization with potential application in biorheology. J. Jap. Soc. Biorheology, 4, 143-52.
Nakai, S., Koide, K. & Eugster, K. (1984). A new mapping super-simplex optimization for food product and process development. J. Food Sci., 49, 1143-8, 1170.
Norback, J. P. & Evans, S. R. (1983). Optimization and food formulation. Food Technol., 37, 73-80.
Pearson, A. M. & Tauber, F. M. (1984). Least-cost formulation and preblending of sausage. In ‘Processed Meats’, 2nd edn. AVI Publishing, Westport, CT, p. 158.
Schwefel, H.-P. (1981). Numerical Optimization of Computer Models. John Wiley & Sons, New York, p. 235.
Vazquez-Arteaga, M. C. (1990). Computer-aided formula optimization. MSc thesis, University of British Columbia, Vancouver.
(Received 4 May 1992; accepted 23 June 1992)