Chapter 2
STABILITY AND RESPONSE SURFACE METHODOLOGY
STEPHEN P. JONES
Boeing Computer Services, The Boeing Company, P.O. Box 24346, MS 7L-22, Seattle, WA 98124-0346, United States
2.1 INTRODUCTION

In recent years much attention has been focused on the use of statistics, and in particular experimental design, to improve the quality of products and processes. An important component of the quality of a product is its robustness or stability in the presence of what Taguchi has called noise variables. These noise variables can come from a variety of sources, such as environmental conditions, deterioration of components, or variation in product components and manufacturing processes. Variation due to these sources can cause variation in the key characteristics of a product or process, resulting in a product of inferior quality. This chapter will examine the application of statistical experimental design to designing a product or process that is robust to variation from environmental variables. It should be understood that the phrase "environmental variables" is to be viewed broadly and is not limited to variables such as temperature and humidity. In this context, variation from environmental variables is variation that is external to the product and that is outside of the control of the manufacturer during production. Thus, it might also include variation in the conditions in which the customer uses the product, or in the conditions in which the product is stored, or in how the product is maintained and serviced. It should be noted that experiments with this objective of robust design have been run for many years in agricultural research. For example, a paper by Yates and Cochran [1] describes experiments on crop varieties in different regions over several years, the objective being to determine a variety that will consistently produce a good yield over a range of climate
and soil conditions represented by the different regions. They used a graphical analysis of the interaction between the varieties and the regions to investigate the robustness of the varieties to the different regions. It is clear from this description that investigating crop varieties that are robust to environmental variation, whether due to climate, soil, aspect, farming practice, etc., is an application of experimental design techniques to robust design. The experiments conducted to perform ruggedness tests of measurement procedures can also be viewed as experiments to investigate robust design; see, for example, Wernimont [2], and Youden [3,4]. The objective of ruggedness tests is to determine a robust measurement procedure; that is a procedure that will give a consistent (and correct) result under a range of measurement conditions. An industrial example of the use of experimental design for robust design, given in Box and Jones [5], is the case of a manufacturer of medical packaging material who sought a method of manufacture that would yield a robust packaging material. In this context, a robust packaging material is one that can be used to seal medical equipment under a range of sealing process conditions used by its customers, the medical equipment manufacturers. The environmental conditions were the sealing process factors. The objective of the experiment was directed towards achieving a suitable product design so that the variation in the environmental conditions did not result in variation in the product's performance, that is, how well the material seals. Packaging material that would yield a good seal over a range of sealing process conditions would have a competitive advantage since medical equipment manufacturers would not have to operate their sealing process within a narrow tolerance to produce a good seal. 
Therefore the equipment manufacturer can use less precise equipment or machines that are difficult to control consistently or a less qualified workforce. The motivation for interest in designing robust products and processes is that it is frequently more cost effective to reduce the effect of the environmental variation rather than to eliminate the source of the variation by controlling the environment. Furthermore, in some situations it might be impossible to eliminate or control the environmental variation. As an example, a manufacturer cannot control the variation in the use of their product and so would prefer to design the product to be robust to a wide range of customer usage conditions rather than to impose instructions that
need to be strictly adhered to by the customer. In this way the product design is forgiving of variation beyond the control of the manufacturer. It should be noted that although it has been stated that the environmental variables are beyond the control of the manufacturer in the normal production or usage conditions, it is necessary that they can be controlled for an experiment. The objective of the experiment is to learn how to minimize the influence of the environmental variables on the product or process performance. To accomplish this objective it will be necessary to understand how variation in environmental conditions affects the product or process performance. The methodology that will be described in this chapter requires that the environmental conditions be changed in a controlled, structured manner.
2.1.1 Example

Consider the set of data given in Table 2.1. In this example a tablet formulation is desired that will retain desired properties in both tropical and temperate climates. The actual climatic conditions that will be experienced in practice are beyond the control of the manufacturer but they can be simulated in a laboratory experiment. In this example, experiments are to be run with three constituents of the tablet formulation, say, glidant, lactose, and disintegrant, which will be denoted as A, B, and C, in a $2^3$ factorial design. The two levels for each of the factors in the experiment are denoted by -1 and +1. The manufacturer wants a stable, or robust, tablet formulation so that it will retain its efficacy when stored in a range of temperatures and humidities. To yield data on this, for each of the eight tablet formulations, the storage temperature and humidity will be varied in a laboratory experiment following a $3^2$ factorial design. In this design the environmental variables are set in a climate-controlled chamber above, below, and at their nominal settings (denoted by +1, -1, and 0, respectively). A set of hypothetical data for a response of interest, say crushing strength, is shown in Table 2.1. The objective is to determine a combination of the factors glidant (A), lactose (B), and disintegrant (C) that will yield high values for crushing strength across the ranges of temperature and humidity studied in the experiment. At first glance it might appear that the formulation with A=-, B=-, and C=+ gives good values for crushing strength. Indeed at the nominal settings of temperature and humidity (0, 0) the crushing strength is 125 for this design combination, close to the largest response in the data set.
However, calculations of means and standard deviations for the response over the environmental conditions, shown in Table 2.2, reveal that the formulation with A=-, B=+, and C=+ yields an average crushing strength that is identical in magnitude but with considerably less variation as the temperature and humidity variables are changed. This formulation is robust, or stable, to storage in the range of climates represented by the changes in temperature and humidity considered in the experiment.
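The summary calculation described here can be sketched in a few lines of Python. The sketch below uses only the first two formulations of Table 2.1; the dictionary layout and the helper name `summarize` are illustrative assumptions, not part of the original analysis.

```python
from statistics import mean, stdev

# First two formulations of Table 2.1, keyed by the levels of (A, B, C).
# Each list holds the nine crushing strengths, ordered by
# (temperature, humidity) = (-,-), (-,0), (-,+), (0,-), (0,0), ..., (+,+).
responses = {
    ("-", "-", "-"): [119, 106, 97, 107, 107, 95, 87, 88, 87],
    ("+", "-", "-"): [100, 95, 87, 101, 119, 91, 107, 87, 83],
}


def summarize(values):
    """Return (mean, sample standard deviation), rounded to two decimals."""
    return round(mean(values), 2), round(stdev(values), 2)


# One (mean, standard deviation) pair per formulation.
summary = {formulation: summarize(ys) for formulation, ys in responses.items()}
```

A formulation with a high mean and a small standard deviation over the environmental conditions is the kind of stable formulation the experiment is looking for.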
TABLE 2.1
HYPOTHETICAL DATA SET FOR TABLET FORMULATION EXPERIMENT

                    Environmental Variables
Temperature      -     -     -     0     0     0     +     +     +
Humidity         -     0     +     -     0     +     -     0     +
Design Variables
A   B   C
-   -   -      119   106    97   107   107    95    87    88    87
+   -   -      100    95    87   101   119    91   107    87    83
102 91 128 103 103 104
101 102
87
105
105
100
103
85
88
96
125
97
99
107
76
94 111 102
88
96
97
107
106
107
98
97
89
80 108 102
-
+
-
116
112
119
+
+
-
109
-
-
+
115
93 108
+
-
+
112
113
-
+
+
121
95
+
+
+
104
103
100 104 90 101 89
The arrangement containing the tablet design formulations is the inner array and the arrangement containing the environmental variables is the outer array. In this chapter these two arrays will be referred to as the design and environmental arrays and the total design will be called a cross-product array. If there are $n_1$ runs in the design array and $n_2$ runs in the environmental array, and the runs are made independently, then the total experiment will require $n_1 \times n_2$ runs. Thus, except where both $n_1$ and $n_2$ are small, this could involve a large amount of experimental work. An issue that will be considered in this chapter is how the investigator can construct experimental designs that will require less work than these cross-product arrays and still be able to determine settings for the design variables that are stable (or insensitive) to variation from the environmental variables.
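To make the run count concrete, the cross-product array for the tablet example can be enumerated as in the following Python sketch. The variable names are illustrative assumptions; the levels are the coded values used in Section 2.1.1.

```python
from itertools import product

# Design array: 2^3 factorial in the coded levels of A, B, C.
design_array = list(product((-1, +1), repeat=3))

# Environmental array: 3^2 factorial in coded temperature and humidity.
environmental_array = list(product((-1, 0, +1), repeat=2))

# Cross-product array: every design run is paired with every environmental run.
cross_product_array = [d + e for d, e in product(design_array, environmental_array)]

n1, n2 = len(design_array), len(environmental_array)
total_runs = len(cross_product_array)  # equals n1 * n2
```

With eight design runs and nine environmental runs, 72 independent runs are needed, which illustrates why smaller alternatives to the cross-product array are of interest.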
TABLE 2.2
SUMMARY STATISTICS FOR DATA SET OF TABLE 2.1

Design Variables
A   B   C       Mean    Standard Deviation
-   -   -      99.22    11.21
+   -   -      96.67    11.42
-   +   -     105.22     9.61
+   +   -      96.33     7.81
-   -   +     106.56    15.66
+   -   +      97.00    10.87
-   +   +     106.56     7.14
+   +   +      98.67     6.00

The next section will present an overview of the statistical techniques associated with response surface methodology. In Section 2.3 the applicability of response surface methodology for robust design will be investigated. Section 2.4 will discuss the applicability of an alternative class of experimental designs called split-plot designs and show how the use of these designs can significantly reduce the amount of work required to conduct robust design experiments. Conclusions are given in Section 2.5.
2.2 AN OVERVIEW OF RESPONSE SURFACE METHODOLOGY

The strategy for robust design experiments that will be considered in Section 2.3 is based on the statistical techniques associated with response surface methodology. This section will give an overview of response surface methodology, presenting some of the more common experimental designs that have been developed in this area. To motivate the response surface approach, suppose that there is some response of interest (for example, crushing strength in the tablet formulation example of Section 2.1.1), and a set of quantitative, continuous design variables that are of interest to the researcher (for example, the quantities of glidant, lactose, and disintegrant for the tablet formulation example). One possible objective for the researcher might be to understand and describe the relationship between the design variables and the response. This relationship can be described mathematically by
constructing an empirical model of the response as a function of the design variables over a range of interest. In the case where there is one design variable of interest, say percentage of lactose, the model of the response can be graphed as a curve on an x-y plot, as shown in Figure 2.1. When there are two factors of interest, the model of the response can be represented as a surface, often plotted as a contour diagram, as shown in Figure 2.2. On this plot the lines are contours of constant response and indicate the predicted response for the design variable combination. A response surface can be used to determine optimum factor settings for the response or to indicate a range of factor settings that yield an approximately equivalent response. This latter use indicates a region in the factor space where the response is robust to changes in the factors. For example, in Figure 2.2, it appears that the maximum crushing strength occurs when the percentage of lactose is 23% and the percentage of disintegrant is 3.3%. However, it can also be seen that near the optimum point the crushing strength is more stable or robust to changes in the quantity of lactose than to changes in the quantity of disintegrant.
[Figure 2.1: horizontal axis Percentage Lactose (20 to 40); vertical axis ticks from 90 to 110.]
Figure 2.1 Curve showing the effect of lactose on crushing strength.
[Figure 2.2: contour plot; horizontal axis Percentage Lactose (20 to 40); vertical axis ticks from 2.0 to 4.0.]
Figure 2.2 Response surface showing the effect of lactose and disintegrant on crushing strength.

To introduce some notation, let the response of interest be denoted by $\eta$ and suppose that there are $p$ quantitative, continuous design variables, $x_1, x_2, \ldots, x_p$, such that $\eta$ is a function of the design variables, that is

$$\eta = f(x_1, x_2, \ldots, x_p) \qquad (1)$$

where the form of $f$ is unknown. If the response is measured at a particular setting of the design variables then the measured response will differ from the true response due to experimental error, that is

$$y = \eta + \varepsilon \qquad (2)$$

where $y$ is the measured response and $\varepsilon$ is the error. In response surface methodology, it is frequently assumed that $f$ can be approximated in some region of the design variables by a low-degree polynomial. For example, if $p = 2$, and a first-order model is assumed appropriate, then

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon \qquad (3)$$

where $\beta_0$, $\beta_1$, and $\beta_2$ are constant coefficients that measure the mean and
the effects of $x_1$ and $x_2$ on the response. It is assumed that the $x$'s are controlled and measured with no error in the experiment. Alternatively, the experimenter might assume that $f$ can be approximated by a second-order model so that

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon \qquad (4)$$

The rationale for using low-degree polynomials to approximate $f$ is based on a Taylor series expansion of $f$ around $x = 0$. The statistical techniques associated with response surface methodology are concerned primarily with two aspects of the experimentation process: the construction of experimental designs that yield data to permit the efficient modeling of the response surfaces, and the analysis of the experimental data and derived response surfaces. The statistical investigation of response surfaces has a history dating back to the pioneering work of George Box and his colleagues in the 1950s; see, for example, Box and Wilson [6], Box [7], Box and Youle [8]. An introduction to the concepts and techniques associated with response surface methodology can be found in Box, Hunter, and Hunter [9] (chapter 15) and Cornell [10]. For an extensive coverage of response surface methodology see Myers [11], Box and Draper [12], Khuri and Cornell [13]. The following sub-sections will describe some of the experimental designs that are commonly used to fit the first-order and second-order models. These designs will be called first-order and second-order designs, respectively.
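As a small illustration of fitting the first-order model in equation (3), the Python sketch below estimates the coefficients for a two-level factorial design in coded units. It assumes an orthogonal design with levels -1 and +1, in which case the least squares estimates reduce to simple averages; the response values are hypothetical, generated without error from y = 100 + 3x1 - 2x2, and the function name `fit_first_order` is an illustrative choice.

```python
from itertools import product


def fit_first_order(design, y):
    """Least squares estimates [b0, b1, ..., bp] for an orthogonal +/-1 design.

    For such a design the estimate of each coefficient is the sum of
    (coded level * response) divided by the number of runs.
    """
    n = len(design)
    b0 = sum(y) / n
    slopes = [
        sum(run[j] * yi for run, yi in zip(design, y)) / n
        for j in range(len(design[0]))
    ]
    return [b0] + slopes


# 2^2 factorial design in coded units and hypothetical error-free responses.
design = list(product((-1, +1), repeat=2))
y = [100 + 3 * x1 - 2 * x2 for x1, x2 in design]

coefficients = fit_first_order(design, y)
```

Because the responses contain no error, the estimates recover the coefficients used to generate them; with real data they would be subject to experimental error.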
2.2.1 First-order designs

Suppose that there are $p$ quantitative variables of interest, $x_1, x_2, \ldots, x_p$, and that in some region of interest the response can be approximated by the general first-order model

$$y = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \varepsilon \qquad (5)$$
A class of experimental designs that are appropriate for obtaining data that will permit the estimation of the coefficients in equation (5) by least squares are the two-level factorial and fractional factorial designs. A single replicate of a two-level full factorial design in $p$ variables will have $2^p$
experimental runs composed of all possible combinations of the $p$ variables. Such a design will permit the estimation of all $p$ main effects, all possible two-factor interactions, all possible three-factor interactions, ..., and the $p$-factor interaction; a total of $2^p - 1$ main effects and interactions. Frequently an experiment that required all $2^p$ experimental runs would be too costly to run, especially for $p$ not small. In these situations important information on the effects of the variables may be determined by running only a fraction of the full factorial design. With such designs, called fractional factorial designs, the ability to estimate the effect of some higher-order interactions is lost, and other effects and interactions are aliased together. This aliasing implies that the calculated effects cannot be unambiguously assigned to one of the effects or interactions that are aliased together in the design. To illustrate the concept of aliasing, consider an experiment with three variables, $x_1$, $x_2$, $x_3$, using the fractional factorial design given in Table 2.3. In a two-level fractional factorial design with the two levels of each factor coded -1 and +1, the estimate for the coefficient of a variable is calculated as half of the difference between the average response at the high and the low setting of the variable. Thus $b_1$, the estimate of $\beta_1$, the coefficient for $x_1$, will be calculated as

$$b_1 = \frac{1}{2}\left[\frac{y_2 + y_3}{2} - \frac{y_1 + y_4}{2}\right] \qquad (6)$$
Now in Table 2.3 the column headed $x_2x_3$ has been derived by multiplying together the columns for $x_2$ and $x_3$. This column can be used to calculate the interaction effect of $x_2$ and $x_3$. The interaction effect of two variables measures how the effect of one variable on the response depends on the level of the other variable. From the table it can be seen that the column headed $x_2x_3$ is identical to the column headed $x_1$ and so the estimate of the interaction effect of $x_2x_3$ will be identical to the estimate of the coefficient for $x_1$. Thus $x_1$ is aliased with $x_2x_3$ and the calculated effect in equation (6) cannot be unambiguously assigned to the effect of $x_1$ or the interaction effect $x_2x_3$. In general, the aliasing of effects occurs when the calculation of the effects uses identical columns (apart from a switching of signs).
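The idea of identical columns can be checked numerically. The Python sketch below builds a four-run half fraction of the $2^3$ design from the generator x3 = x1*x2, which is an assumption made for illustration and need not match the run order of Table 2.3, and then applies the "half the difference of averages" rule from equation (6). The response values are hypothetical.

```python
from itertools import product

# Half fraction of the 2^3 design, built with the assumed generator x3 = x1*x2.
half_fraction = [(x1, x2, x1 * x2) for x1, x2 in product((-1, +1), repeat=2)]

# Column for x1 and the derived column for the x2*x3 interaction.
x1_column = [run[0] for run in half_fraction]
x2x3_column = [run[1] * run[2] for run in half_fraction]


def coefficient_estimate(column, y):
    """Half the difference between the average response at +1 and at -1."""
    high = [yi for c, yi in zip(column, y) if c == +1]
    low = [yi for c, yi in zip(column, y) if c == -1]
    return 0.5 * (sum(high) / len(high) - sum(low) / len(low))


# Hypothetical responses for the four runs, for illustration only.
y = [10, 12, 15, 19]
b1 = coefficient_estimate(x1_column, y)
b23 = coefficient_estimate(x2x3_column, y)
```

Since the two columns coincide, the two estimates are necessarily the same number, which is exactly what aliasing means in practice.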
The degree of aliasing in a design can be summarized by stating the design's resolution. In general a design has resolution R if no effect containing k variables is aliased with any effect containing fewer than R-k variables. Resolution is denoted by the appropriate Roman numeral. A resolution III design has all main effects unaliased with other main effects but may alias them with two-factor interactions. A resolution IV design does not alias main effects with two-factor interactions, but may alias two-factor interactions with one another. A resolution V design does not alias main effects with three-factor interactions nor alias two-factor interactions with one another, but may alias three-factor interactions with two-factor interactions.

TABLE 2.3
FOUR-RUN DESIGN WITH THREE FACTORS

Run    x1    x2    x3    x2x3    Response
1      -1    -1    -1     -1     y1
2      +1    +1    -1     +1     y2
3      +1    -1    +1     +1     y3
4      -1    +1    +1     -1     y4
Although some information is lost when fractional factorial designs are used instead of full factorial designs, the advantage of these designs is that the total number of experimental runs can be reduced considerably. Furthermore, by careful choice of design and allocation of the variables to the design, and by following a sequential approach to experimentation, the experimenter can use fractional factorial designs to obtain information in an economical manner. An excellent description of fractional factorial designs and aliasing can be found in Box, Hunter, and Hunter [9]. This book also contains a description of how these designs can be blocked to remove additional sources of variation from the analysis, thereby increasing the precision of the estimates of the coefficients. There is also a discussion of how experimental designs can be run sequentially, designing the next experiment in the light of information that has been obtained, and the unresolved questions that remain, from the previous experiments. The rationale for sequential experimentation is that the best time to design an experiment is after the experiment has been run, since at that stage more is
known about the process than when the experiment was designed. Box, Hunter, and Hunter recommend that no more than 25% of the experimental budget be devoted to the first experiment, so that sufficient resources are retained to investigate questions that the data from the first experiment will raise. From the discussion of design resolution above, it should be clear that a resolution III design will permit the fitting of all the coefficients for the first-order model in equation (5). However, the resolution III design will alias main effects with two-factor interactions. The aliasing of effects in fractional factorial designs has implications for the fitting of the response surface. The aliasing in the resolution III design implies that the coefficients associated with the main effects will be biased by the presence of any interactions in the true (unknown) model. To illustrate this, consider fitting the first-order model with three variables, $x_1$, $x_2$, $x_3$,

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon \qquad (7)$$

Suppose that the experimenter runs the $2^{3-1}$ fractional factorial design shown in Table 2.3. With this design each main effect is aliased with the two-factor interaction composed of the other two factors; that is, $x_1$ is aliased with $x_2x_3$, $x_2$ is aliased with $x_1x_3$, and $x_3$ is aliased with $x_1x_2$. This can be verified by multiplying together the appropriate columns, as was done for $x_2x_3$. Suppose that the true unknown model is

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \varepsilon \qquad (8)$$

Then with the design given in Table 2.3, $b_3$, the experimenter's estimate of $\beta_3$, will be biased by the coefficient $\beta_{12}$. In fact

$$E(b_3) = \beta_3 + \beta_{12} \qquad (9)$$

Similarly, if the true model is

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{23} x_2 x_3 + \varepsilon \qquad (10)$$

then with the design in Table 2.3, $b_1$, the experimenter's estimate of $\beta_1$, will be biased by the coefficient $\beta_{23}$; that is

$$E(b_1) = \beta_1 + \beta_{23} \qquad (11)$$
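The bias in equation (9) can be reproduced with error-free responses. The following Python sketch assumes a half fraction in which the $x_3$ column equals the $x_1x_2$ column (generator x3 = x1*x2, an assumption for illustration), generates responses from the true model (8) with no error term, and then fits only the first-order model. The function name and the default coefficient values are hypothetical.

```python
from itertools import product


def estimate_b3(beta3, beta12, beta0=50.0, beta1=1.0, beta2=-1.0):
    """Estimate of beta3 from a first-order fit when the true model
    also contains the x1*x2 interaction (no experimental error)."""
    # Half fraction with x3 aliased with x1*x2 (assumed generator).
    design = [(x1, x2, x1 * x2) for x1, x2 in product((-1, +1), repeat=2)]
    y = [
        beta0 + beta1 * x1 + beta2 * x2 + beta3 * x3 + beta12 * x1 * x2
        for x1, x2, x3 in design
    ]
    # Least squares estimate for an orthogonal +/-1 column.
    return sum(x3 * yi for (_, _, x3), yi in zip(design, y)) / len(design)


biased = estimate_b3(beta3=2.0, beta12=0.5)      # interaction present
unbiased = estimate_b3(beta3=2.0, beta12=0.0)    # no interaction
```

In this sketch the estimate equals the sum of the main-effect coefficient and the aliased interaction coefficient, and it equals the main-effect coefficient alone only when the interaction is absent.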
It can be seen that the use of a fractional factorial design can lead to biases in the estimation of the first-order coefficients from any interactions that are present in the true (unknown) model and that have been aliased with the main effects of the factors. Therefore, the experimenter needs to be aware of the aliasing that occurs with the use of a fractional factorial design and understand the biases that can result in the estimation of the coefficients of the model. A more complete discussion of the biases in estimation of coefficients from using fractional factorial designs and a description of how the biasing can be calculated for larger fractional factorial designs can be found in Box and Draper [12] (pp. 65-70) and Myers [11] (pp. 110-114). Some protection against the effect of biases in the estimation of the first-order coefficients can be obtained by running a resolution IV fractional factorial design. With such a design the two-factor interactions are aliased with other two-factor interactions and so would not bias the estimation of the first-order coefficients. In fact the main effects are aliased with three-factor interactions in a resolution IV design and so the first-order effects would be biased if there were third-order coefficients of the form $x_i x_j x_k$ in the true model. Fractional factorial designs use n = 4, 8, 16, 32, 64, ... runs, and can be constructed to carry up to p = n-1 variables. (A design that has p = n-1 variables in only n runs is called a saturated design since it cannot hold any more variables.) For values of n that are multiples of 4 but not a power of 2, that is, n = 12, 20, 24, 28, 36, ..., an alternative class of first-order design that can be used are the Plackett-Burman designs; see Plackett and Burman [14]. Plackett-Burman designs may be of use in screening situations, that is, in situations when the experimenter wishes to examine many variables but believes that only a few are of importance.
Furthermore, Plackett-Burman designs are particularly useful when following a sequential experimental strategy since a resolution IV design can be constructed from a Plackett-Burman design by augmenting it with the foldover design, that is, the design where all of the runs have the signs of all the variables switched. An example of a Plackett-Burman design with 11 factors in 12 runs is given in Table 2.4.
It can be seen that this 12-run design is generated by starting with a particular row of -1's and +1's and generating the next row by shifting each sign one place to the right, with the last sign cycling round to the first position. This is repeated until the first eleven runs have been obtained, and then the final run is constructed by adding a row of -1's. The starting rows for 12-, 20-, and 24-run designs given by Plackett and Burman [14] are as follows:

n=12: +1 +1 -1 +1 +1 +1 -1 -1 -1 +1 -1
n=20: +1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1
n=24: +1 +1 +1 +1 +1 -1 +1 -1 +1 +1 -1 -1 +1 +1 -1 -1 +1 -1 +1 -1 -1 -1 -1

TABLE 2.4
PLACKETT-BURMAN DESIGN IN 12 RUNS

Run    A    B    C    D    E    F    G    H    I    J    K
 1    +1   +1   -1   +1   +1   +1   -1   -1   -1   +1   -1
 2    -1   +1   +1   -1   +1   +1   +1   -1   -1   -1   +1
 3    +1   -1   +1   +1   -1   +1   +1   +1   -1   -1   -1
 4    -1   +1   -1   +1   +1   -1   +1   +1   +1   -1   -1
 5    -1   -1   +1   -1   +1   +1   -1   +1   +1   +1   -1
 6    -1   -1   -1   +1   -1   +1   +1   -1   +1   +1   +1
 7    +1   -1   -1   -1   +1   -1   +1   +1   -1   +1   +1
 8    +1   +1   -1   -1   -1   +1   -1   +1   +1   -1   +1
 9    +1   +1   +1   -1   -1   -1   +1   -1   +1   +1   -1
10    -1   +1   +1   +1   -1   -1   -1   +1   -1   +1   +1
11    +1   -1   +1   +1   +1   -1   -1   -1   +1   -1   +1
12    -1   -1   -1   -1   -1   -1   -1   -1   -1   -1   -1
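The cyclic construction and the foldover described in the text can be sketched in Python as follows. The starting row is the n=12 row quoted from Plackett and Burman [14]; the function names are illustrative assumptions.

```python
def plackett_burman_12():
    """12-run, 11-factor design built by cyclically shifting the starting row."""
    first_row = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]
    rows = [first_row]
    for _ in range(10):
        previous = rows[-1]
        # Shift every sign one place to the right; the last sign wraps round.
        rows.append([previous[-1]] + previous[:-1])
    rows.append([-1] * 11)  # final run with every factor at its low level
    return rows


def foldover(design):
    """Augment a design with the same runs with all signs switched."""
    return design + [[-x for x in row] for row in design]


design = plackett_burman_12()
folded = foldover(design)

# Sum of products for every pair of columns; zero means the pair is orthogonal.
pairwise_sums = [
    sum(row[i] * row[j] for row in design)
    for i in range(11)
    for j in range(i + 1, 11)
]
```

In the folded design the sum over runs of any product of three columns is zero, because each run is cancelled by its sign-switched partner; this is the property that removes the aliasing between main effects and two-factor interactions.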
Plackett and Burman [14] give the method of design construction for all values of n that are multiples of 4 up to 100 except n=92. A disadvantage of Plackett-Burman designs is that the structure of the aliasing is more complex than that of the fractional factorial designs, so that it is harder to determine the effect of any biases in the estimates of the coefficients. Draper and Lin [15] indicate how, in the situation where only a few variables are important, additional runs can be added to Plackett-Burman designs to yield experimental designs with higher resolution or clearer alias structure. Hamada and Wu [16], under the assumptions that there are few significant variables and that any variable in a significant interaction is likely to have a significant main effect, show how it may be
possible to study a few interactions in Plackett-Burman designs without adding any runs. Box and Meyer [17] describe how a Bayesian analysis can reveal active variables in the complex aliasing that occurs with Plackett-Burman designs. In conclusion, Plackett-Burman designs tend to have a complex alias structure and so the presence of interactions in the true model induces a complex bias structure on the first-order coefficients. Therefore, it is recommended that these designs only be used if the assumption of no second-order interactions is reasonable, or as a part of a sequential strategy of experimentation that would generate a resolution IV design by augmenting the Plackett-Burman design with its foldover design.
2.2.2 Adding center points

When an unreplicated experiment is run, the error or residual sum of squares is composed of both experimental error and lack-of-fit of the model. Thus, formal statistical significance testing of the factor effects can lead to erroneous conclusions if there is lack-of-fit of the model. Therefore, it is recommended that the experiment be replicated so that an independent estimate of the experimental error can be calculated and both lack-of-fit and the statistical significance of the factor effects can be formally tested. In some experimental contexts, however, each experimental run is expensive. Thus it is infeasible to replicate each design point of the experiment to obtain an estimate of the experimental error. When all of the variables are quantitative, an estimate of the experimental error can be obtained by adding to the full factorial, fractional factorial or Plackett-Burman design a number of runs at the center of the design. The center of the design is the midpoint between the low and high settings of the two-level factors in the experiment. Thus, if there are $p$ variables, and the levels of the variables have been coded (-1, +1), then the center of the design is $(x_1, x_2, \ldots, x_p) = (0, 0, \ldots, 0)$. If the center point is replicated $n_0$ times in the experiment, then the variance of the response at those runs provides an estimate of the experimental error with $n_0 - 1$ degrees of freedom to statistically test both the lack-of-fit of the model and the significance of the coefficient estimates of the model. Another reason for augmenting the two-level design with center points is that these points allow for an overall test of curvature. It is clear that with only two levels for each variable it is impossible to detect any quadratic effect of the variables. Thus, the underlying model is assumed to
be linear over the experimental region. To examine the quadratic effect of all the variables requires each variable to be run with at least three levels. An overall test of the presence of quadratic effects can be obtained by comparing the average of the center point runs, $\bar{y}_0$, with the average of the cube portion of the design, $\bar{y}_c$, since the expected value of $(\bar{y}_c - \bar{y}_0)$ is

$$E(\bar{y}_c - \bar{y}_0) = \beta_{11} + \beta_{22} + \cdots + \beta_{pp} \qquad (12)$$

where $\beta_{ii}$ is the quadratic effect of factor $x_i$. A formal statistical test can be constructed by comparing

$$F = \frac{n_c n_0 (\bar{y}_c - \bar{y}_0)^2}{(n_c + n_0)\, s^2} \qquad (13)$$

with the $F_{1,(n_0 - 1)}$ distribution, where $s^2$ is the estimate of the experimental error from the variance of the $n_0$ center point runs, and $n_c$ is the number of runs in the cube portion of the design. If the F-test is significant then there is evidence of a quadratic effect due to at least one of the variables. With the present design, however, the investigator will not be able to determine which of the variables has a quadratic effect on the response. Additional experimentation, perhaps by augmenting the current design with some star points to construct a central composite design (see section on central composite designs below), will need to be conducted to fully explore the nature of the quadratic response surface.
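The curvature statistic of equation (13) is straightforward to compute. The Python sketch below is a minimal illustration; the function name and the response values are hypothetical, and the resulting statistic would still need to be compared with a tabulated critical value of the F distribution with 1 and $n_0 - 1$ degrees of freedom.

```python
from statistics import mean, variance


def curvature_f_statistic(cube_responses, center_responses):
    """F statistic of equation (13) for the overall test of curvature."""
    n_c = len(cube_responses)
    n_0 = len(center_responses)
    difference = mean(cube_responses) - mean(center_responses)
    # Sample variance of the center runs: the estimate of experimental
    # error, with n_0 - 1 degrees of freedom.
    s2 = variance(center_responses)
    return n_c * n_0 * difference ** 2 / ((n_c + n_0) * s2)


# Hypothetical data: four cube runs and three replicated center runs.
cube = [96, 104, 98, 102]
center = [105, 107, 106]
f_value = curvature_f_statistic(cube, center)
```

For these made-up numbers the cube average is 100, the center average is 106, and the center-point variance is 1, so the statistic is 4 x 3 x 36 / 7.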
2.2.3 Second-order designs

Suppose that there are $p$ variables of interest, $x_1, x_2, \ldots, x_p$, and that in some region of interest the response can be approximated by the general second-order model

$$y = \beta_0 + (\beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p) + (\beta_{11} x_1^2 + \beta_{22} x_2^2 + \cdots + \beta_{pp} x_p^2) + (\beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \cdots + \beta_{1p} x_1 x_p + \beta_{23} x_2 x_3 + \cdots + \beta_{(p-1)p} x_{p-1} x_p) + \varepsilon \qquad (14)$$

that is,

y = intercept + (first-order terms) + (quadratic terms) + (cross-product terms) + error

This section will describe some of the classes of experimental designs that are appropriate for obtaining data that will permit the estimation of the coefficients in equation (14) by least squares.
Three-level designs

It is obvious that to be able to estimate the quadratic coefficients, $\beta_{11}, \beta_{22}, \beta_{33}, \ldots, \beta_{pp}$, in equation (14), it is necessary to have at least three distinct levels or settings for the variables. This suggests that a suitable design for estimating the coefficients of the second-order model would be a single replicate of a three-level full factorial design in $p$ variables. This design will have $3^p$ experimental runs composed of all possible combinations of the $p$ variables. If there are only p=2 or p=3 variables then a full factorial design is often feasible. However, the number of runs required becomes prohibitively large as the number of variables increases. For example, with p=5 variables, the second-order model requires the estimation of 21 coefficients: the mean, five main effects, five pure quadratic terms, and ten two-factor interactions. The three-level full factorial design would require $3^5 = 243$ runs. It might be supposed that a smaller design that permitted the estimation of the coefficients of interest could be constructed by taking a fraction of the full factorial. However, the aliasing of three-level designs is very complex and so fractionating a three-level design will not be pursued. The interested reader may refer to Kempthorne [18]. Therefore, unless the number of factors is small, three-level designs are not usually feasible for response surface studies.
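The counts quoted above follow from simple arithmetic, which the short Python sketch below reproduces. The function names are illustrative assumptions.

```python
from math import comb


def second_order_coefficient_count(p):
    """Mean + p main effects + p pure quadratic terms + two-factor interactions."""
    return 1 + p + p + comb(p, 2)


def three_level_full_factorial_runs(p):
    """Number of runs in a single replicate of a three-level full factorial."""
    return 3 ** p


coefficients_for_five = second_order_coefficient_count(5)
runs_for_five = three_level_full_factorial_runs(5)
```

The gap between the number of coefficients to be estimated and the number of runs in the full three-level factorial widens rapidly as p grows, which is the motivation for more economical second-order designs.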
Central Composite Designs

An alternative approach to constructing designs for estimating second-order models is to consider building a design from those constructed for the first-order model. In Section 2.2.1, we discussed the use of fractional factorial designs to estimate the coefficients of the first-order model. It was noted that a fractional factorial design of resolution V would yield unbiased estimates of all coefficients for the main effects and two-factor interactions. To estimate the quadratic coefficients of the second-order model this design could be augmented with additional points where the variables are at additional settings to the fractional design so that each variable has at least three settings. A class of augmented designs, first proposed by Box and Wilson [6] and frequently applied in response surface work, is the central composite design. Composite designs consist of:
• a full or fractional factorial design of at least resolution V; the number of runs in this design will be $n_c = 2^{p-k}$, these runs forming a cube portion with coordinates of the form $(\pm 1, \pm 1, \ldots, \pm 1)$;
• $n_s = 2p$ star points with coordinates $(\pm\alpha, 0, 0, \ldots, 0)$, $(0, \pm\alpha, 0, \ldots, 0)$, ..., $(0, 0, \ldots, \pm\alpha)$;
• $n_0$ center points $(0, 0, 0, \ldots, 0)$.
The use of the terms cube, star and center points is descriptive of the design pattern, as is clear when there are p = 3 variables. In that case the points of the central composite design, shown in Table 2.5, can be represented by the points in Figure 2.3. In Table 2.5, runs 1-8 are the cube portion, runs 9-14 are the star portion, and runs 15-17 are the center points. In general the cube portion might be replicated $r_c$ times and the star portion might be replicated $r_s$ times. Also, it might be possible to use a fractional factorial design of resolution less than V if the experimenter is prepared to assume that certain interactions are negligible. A central composite design in four variables is shown in Table 2.6. In this table, runs 1-16 are the cube portion, runs 17-24 are the star portion, and runs 25-27 are the center points.

The central composite design has several advantages over the three-level design. Firstly, the total number of runs in a central composite design is frequently less than that required for a three-level full factorial design. For example, with p = 5 variables 243 runs would be required for the three-level full factorial design, whereas with single replicates for the cube and star portions and four center points, the total number of runs required for a central composite design would be 16 + 10 + 4 = 30 (for the cube portion a $2^{5-1}$ fractional factorial design could be used).
Figure 2.3 Central composite design with three variables (cube points, star points, and center point).
TABLE 2.5
CENTRAL COMPOSITE DESIGN WITH THREE VARIABLES

Run     A     B     C
  1    -1    -1    -1
  2    +1    -1    -1
  3    -1    +1    -1
  4    +1    +1    -1
  5    -1    -1    +1
  6    +1    -1    +1
  7    -1    +1    +1
  8    +1    +1    +1
  9    -α     0     0
 10    +α     0     0
 11     0    -α     0
 12     0    +α     0
 13     0     0    -α
 14     0     0    +α
 15     0     0     0
 16     0     0     0
 17     0     0     0
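The layout of Table 2.5 can be generated programmatically. The following is a minimal sketch for a central composite design with a full $2^p$ cube portion; the function name and its arguments are illustrative assumptions, and a fractional cube portion would require a different generator for the first block of runs.

```python
from itertools import product

def central_composite(p, alpha, n_center):
    """Runs of a central composite design with a full 2^p cube portion."""
    # Cube portion: reverse each tuple so that the first variable changes
    # fastest, which matches the run order of Table 2.5.
    cube = [tuple(reversed(pt)) for pt in product((-1, 1), repeat=p)]
    # Star portion: one pair of axial points (-alpha, +alpha) per variable.
    star = []
    for i in range(p):
        for sign in (-1, 1):
            star.append(tuple(sign * alpha if j == i else 0 for j in range(p)))
    # Center points.
    center = [(0,) * p] * n_center
    return cube + star + center

runs = central_composite(p=3, alpha=8 ** 0.25, n_center=3)
for number, run in enumerate(runs, start=1):
    print(number, run)
```

With p = 3 and three center points this produces 17 runs: runs 1-8 form the cube portion, runs 9-14 the star portion, and runs 15-17 the center points.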
A second advantage of the central composite design is that it lends itself to a sequential approach to experimentation, since the central composite design can be built in sections. For example, an experimenter might initially assume that the response surface can be adequately represented by a first-order model, possibly with the addition of some two-factor interaction terms. Thus they might initially conduct a resolution V fractional factorial design. Following the analysis the experimenter might suspect some nonlinearity and so augment the first design with some center points. If examination of the response at the center point runs indicates the presence of quadratic effects, then the experimenter might be interested in fitting a second-order model. Data to enable this to be accomplished can be obtained by augmenting the design with star points to generate the central composite design. In some situations design augmentation can be accomplished so that the designs are orthogonally blocked, thus allowing for block differences to be eliminated in the analysis and estimation of the coefficients.

The central composite design gives the experimenter the flexibility of choosing the value of $\alpha$, the distance of the star points from the center of the design. One possible criterion for $\alpha$ is to choose it so that the central composite design is rotatable. A rotatable design is one in which the precision of the predicted response is the same at all points equidistant from the center point (0, 0, ..., 0). Rotatability is a useful property for a design since it relieves the experimenter from making any assumption that the underlying response surface is oriented in a particular direction. Rotatability ensures that whatever the orientation of the response surface the precision of the predicted response will not be dependent on the direction from the center of the design, only on the distance from the center of the design.
It can be shown that for a central composite design to be rotatable the distance of the star points from the design center is

$$
\alpha = \left( 2^{p-k} \, r_c / r_s \right)^{1/4}
$$

where $2^{p-k}$ is the number of factorial points and $r_c$ and $r_s$ are the numbers of replicates of the cube and star portions. Therefore, if p = 5 and k = 1, then with single replicates the design would be rotatable if $\alpha = (2^{5-1})^{1/4} = 2$. For p = 3, $\alpha = (2^3)^{1/4} = 1.68$ generates a rotatable design.

A possible disadvantage of the central composite design is that it requires five levels of each variable $(0, \pm 1, \pm\alpha)$. In some situations it might be necessary or preferable to have only three different settings of the variables. In this case $\alpha$ can be chosen to be 1 and the design is called a face-centered composite design. These designs are not rotatable.
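The rotatability condition is easy to evaluate numerically. The short sketch below assumes the formula $\alpha = (2^{p-k} r_c / r_s)^{1/4}$ given above; the function name and default arguments are illustrative.

```python
def rotatable_alpha(p, k=0, r_c=1, r_s=1):
    """Star-point distance that makes a central composite design rotatable.

    p   : number of variables
    k   : degree of fractionation of the cube portion (2^(p-k) cube runs)
    r_c : number of replicates of the cube portion
    r_s : number of replicates of the star portion
    """
    return (2 ** (p - k) * r_c / r_s) ** 0.25

print(round(rotatable_alpha(5, k=1), 2))  # half fraction in five variables
print(round(rotatable_alpha(3), 2))       # full factorial in three variables
```

These two calls reproduce the values 2 and 1.68 quoted in the text. Replicating the star portion twice ($r_s = 2$) with p = 3 would reduce the rotatable distance to $(8/2)^{1/4} \approx 1.41$.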
TABLE 2.6
CENTRAL COMPOSITE DESIGN WITH FOUR VARIABLES (runs 1-16: cube portion; runs 17-24: star portion; runs 25-27: center points)
If certain two-factor interactions can be assumed negligible, then it might be possible to use a resolution IV design with a particular assignment of variables to columns of the design. Alternatively, if there are certain pure quadratic effects that are deemed unimportant, then star points for those variables need not be added to the design.
Box-Behnken Designs

Another alternative to the $3^p$ full factorial is the Box-Behnken design (Box and Behnken [19]). These designs are a class of incomplete three-level factorial designs that either meet, or approximately meet, the criterion of rotatability. A Box-Behnken design for p = 3 variables is shown in Table 2.7. This design will estimate the ten coefficients of the second-order model in only fifteen runs, in contrast with the $3^3 = 27$ runs required by the full factorial design. This design is shown graphically in Figure 2.4.
TABLE 2.7
BOX-BEHNKEN DESIGN FOR THREE VARIABLES

Run     A     B     C
  1    -1    -1     0
  2    +1    -1     0
  3    -1    +1     0
  4    +1    +1     0
  5    -1     0    -1
  6    +1     0    -1
  7    -1     0    +1
  8    +1     0    +1
  9     0    -1    -1
 10     0    +1    -1
 11     0    -1    +1
 12     0    +1    +1
 13     0     0     0
 14     0     0     0
 15     0     0     0
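The pattern in Table 2.7 (a $2^2$ factorial in each pair of variables with the remaining variable held at 0, followed by center points) can be sketched in code. The function name below is illustrative. The pairwise construction is an assumption that matches the three-variable design of Table 2.7 and gives the 27 runs of the four-variable design; published Box-Behnken designs for larger numbers of variables use other incomplete-block arrangements that this sketch does not attempt to reproduce.

```python
from itertools import combinations, product

def box_behnken_pairs(p, n_center):
    """Box-Behnken-type design built from 2^2 factorials in every pair of variables."""
    runs = []
    for i, j in combinations(range(p), 2):
        # product() varies its last element fastest, so unpack as (b, a)
        # to make the lower-indexed variable of the pair change fastest.
        for b, a in product((-1, 1), repeat=2):
            run = [0] * p
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs += [(0,) * p] * n_center
    return runs

bb3 = box_behnken_pairs(p=3, n_center=3)
for number, run in enumerate(bb3, start=1):
    print(number, run)
```

For p = 3 the output follows the run order of Table 2.7. Calling `box_behnken_pairs(4, 3)` gives six groups of four runs plus three center points, 27 runs in total, although the order of the groups need not match that of the published table.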
Table 2.8 gives the runs for a Box-Behnken design in four variables. In this table, the runs are grouped in sets of four, each set of four being composed of all the combinations of ±1 for the two variables indicated, the other two variables being set at 0. The design is completed with three center points, runs 25-27.
Figure 2.4 Box-Behnken design for three variables.
TABLE 2.8
BOX-BEHNKEN DESIGN FOR FOUR VARIABLES

Runs      A     B     C     D
 1-4      0    ±1    ±1     0
 5-8     ±1     0     0    ±1
 9-12    ±1    ±1     0     0
13-16     0     0    ±1    ±1
17-20     0    ±1     0    ±1
21-24    ±1     0    ±1     0
25-27     0     0     0     0
As was mentioned above for central composite designs, the experimenter can modify these designs if they believe that certain two-factor interactions can be assumed negligible. Box and Jones [20,21] show how this can be done to yield what they call a modified Box-Behnken design that requires fewer runs than the standard Box-Behnken design. A table of Box-Behnken designs for p = 3, 4, ..., 7 variables can be found in Box and Draper [12], and for p = 3, 4, ..., 7, 9, 10, 11, 12, 16 variables in Box and Behnken [19].
2.2.4 Optimal designs

Optimal design theory provides an alternative approach to the selection of an experimental design. For a description of the theory of optimal design see, for example, Atkinson and Donev [22]. To motivate this approach, suppose that an experiment with n runs will be conducted and a model with k coefficients is to be fit to the data. This model can be represented in matrix notation as

$$
y = X\beta + \varepsilon \tag{15}
$$
where y is an n × 1 column vector of response values, $\beta$ is a k × 1 column vector of unknown coefficients, X is an n × k matrix that defines the runs in the experiment, and $\varepsilon$ is an n × 1 column vector of errors. It is commonly assumed that the errors are independent and follow a normal distribution with variance $\sigma^2$. One of the questions that the experimenter needs to consider is how to choose good values for the elements of X. It can be shown that the variance-covariance matrix of the coefficient estimates, b, of $\beta$ is $\sigma^2 (X^T X)^{-1}$. Furthermore, the variance of the predicted response at any setting of the variables is also a function of $(X^T X)^{-1}$. Thus one way to choose good values for the elements of X is to choose them so that $(X^T X)^{-1}$ is, in some sense, "small". A number of criteria have been developed, the most popular of which are:
• A-optimality criterion: minimize trace $(X^T X)^{-1}$,
• D-optimality criterion: minimize det $(X^T X)^{-1}$,
• E-optimality criterion: minimize the maximum eigenvalue of $(X^T X)^{-1}$,
• G-optimality criterion: minimize the maximum value of the variance of the predicted response.

Efficient algorithms have been developed that construct D-optimal designs for a given response model, candidate design points, and number of runs (see, for example, Mitchell [23]). Optimal experimental design can be useful in situations where:
• the experimental design region is irregularly shaped due to constraints on the variables,
• it is necessary to augment an existing design,
• designs must be constructed for special models or with a limited number of runs,
• the experimenter has prior knowledge on the form of the model and desires coefficient estimates in a minimal number of runs.

There has been an extensive critique on the role of optimal design theory in practical experimental design; see, for example, Box [24], Box and Draper [12]. One of the underlying assumptions behind optimal experimental design is that, since the designs are only optimal within the region defined by the candidate design points, there is a well-defined region of interest within which experiments can be run. The assumption is that the experimenter has no interest in the response outside of the region defined by the candidate points. In typical response surface studies, however, the region of interest might be poorly defined and might change as the investigation proceeds. Thus, it might be advisable to design the experiment to obtain information about the response beyond the current region of interest defined by the candidate points.

Furthermore, optimal design theory assumes that the model is true within the region defined by the candidate design points, since the designs are optimal in terms of minimizing variance as opposed to bias due to lack-of-fit of the model. In reality, the response surface model is only assumed to be a locally adequate polynomial approximation to the truth; it is not assumed to be the truth. Consequently, the experimental design chosen should reflect doubt in the validity of the model by allowing for model lack-of-fit to be tested.

2.2.5 Other second-order designs

There are many other second-order designs that have been proposed in the statistical literature. Some of these designs are based on variants of the central composite design. For example, the designs proposed by Hartley [25] use cube portions in the central composite design of resolution less than V with no two-factor interactions aliased with one another.
Second-order designs can also be constructed using irregular fractional factorial designs for the cube portion. Irregular fractional factorial designs (see, for example, John [26] and Maclean and Anderson [27]) are non-orthogonal fractions of a full factorial design. Some second-order designs, such as the uniform shell designs (Doehlert [28]), have been proposed which are not based on the central composite design. A more thorough treatment of additional second-order designs can be found in the texts mentioned earlier: see Myers [11], Box and Draper [12], Khuri and Cornell [13].
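When alternative designs such as those above are being compared, the A-, D-, E- and G-criteria of Section 2.2.4 can be evaluated directly from the model matrix X. The sketch below uses NumPy and, to keep the arithmetic transparent, a first-order model in two variables; the function names, the "lopsided" comparison design and the candidate grid are illustrative assumptions. All criteria are reported in units of $\sigma^2$.

```python
import itertools
import numpy as np

def first_order_matrix(points):
    """Model matrix for y = b0 + b1*x1 + b2*x2 + ... (one row per run)."""
    pts = np.asarray(points, dtype=float)
    return np.column_stack([np.ones(len(pts)), pts])

def design_criteria(X, X_candidates):
    """A-, D-, E- and G-criteria for model matrix X (smaller is better)."""
    M_inv = np.linalg.inv(X.T @ X)
    # Scaled prediction variance x' (X'X)^-1 x at each candidate point.
    pred_var = np.einsum("ij,jk,ik->i", X_candidates, M_inv, X_candidates)
    return {
        "A": np.trace(M_inv),
        "D": np.linalg.det(M_inv),
        "E": np.linalg.eigvalsh(M_inv).max(),
        "G": pred_var.max(),
    }

# Candidate points: a 3 x 3 grid over the square region -1 <= x1, x2 <= +1.
candidates = first_order_matrix(list(itertools.product([-1, 0, 1], repeat=2)))
factorial = first_order_matrix([(-1, -1), (1, -1), (-1, 1), (1, 1)])
lopsided = first_order_matrix([(-1, -1), (1, -1), (-1, 1), (0, 0)])

crit_factorial = design_criteria(factorial, candidates)
crit_lopsided = design_criteria(lopsided, candidates)
for name in "ADEG":
    print(name, round(crit_factorial[name], 4), round(crit_lopsided[name], 4))
```

For the $2^2$ factorial, $X^T X = 4I$, so the four criteria take the values 0.75, 1/64, 0.25 and 0.75. Replacing one corner by a center point increases det $(X^T X)^{-1}$ from 1/64 to 1/24, so the factorial is preferred under the D-criterion.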
2.2.6 Interim Summary

This section has given an overview of some of the experimental designs that are suitable for collecting data to estimate the coefficients of the first-order and second-order models. Many of these designs are based on factorial and fractional factorial designs. It is clear that if the first-order model (equation (5)) is assumed to be valid, then a resolution III design or a Plackett-Burman design can be used since this design will estimate all of the $\beta_i$ without bias. However, it has been shown that with a resolution III design the estimates of the coefficients $\beta_i$ will be biased if the true (unknown) model contains interactions. The biases and lack-of-fit of the first-order model due to interaction effects can be examined by running a resolution IV design, which will yield unbiased estimates of all of the $\beta_i$. A resolution IV design can be obtained from a Plackett-Burman or resolution III fractional factorial design by augmenting the initial design with its foldover design. If the true model contains quadratic terms then the estimate of the intercept, $\beta_0$, of the first-order model will be biased. The lack-of-fit of the first-order model due to quadratic effects can be tested by adding center points to the design.

To construct the central composite design to estimate the coefficients of the second-order model (equation (14)), usually a fractional factorial design of at least resolution V is used. In this case, if the model is valid, then all of the estimates of the main effect coefficients, $\beta_i$, and the interaction coefficients, $\beta_{ij}$, are unbiased. An alternative to the central composite designs for estimating the coefficients of the second-order model are the Box-Behnken designs or the designs referenced in Section 2.2.5. Table 2.9 shows the minimum number of runs for a single replicate of a fractional factorial design with the desired resolution for p variables, p = 3, ..., 11.
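The effect of augmenting a resolution III design with its foldover can be illustrated numerically. The sketch below assumes a particular 8-run $2^{7-4}$ fraction with generators D = AB, E = AC, F = BC and G = ABC, and takes the foldover to be the mirror image of the design with every sign reversed; the function names are illustrative. A nonzero sum of $x_i x_j x_k$ over the runs indicates that main effect i is correlated with the interaction between variables j and k.

```python
from itertools import combinations, product

def resolution_iii_design():
    """8-run 2^(7-4) fraction with D = AB, E = AC, F = BC, G = ABC."""
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        runs.append((a, b, c, a * b, a * c, b * c, a * b * c))
    return runs

def foldover(runs):
    """Augment a design with its mirror image (all signs reversed)."""
    return runs + [tuple(-x for x in run) for run in runs]

def max_main_vs_interaction(runs):
    """Largest |sum of x_i * x_j * x_k| over main effects i and pairs j < k."""
    p = len(runs[0])
    worst = 0
    for i in range(p):
        for j, k in combinations(range(p), 2):
            if i in (j, k):
                continue
            worst = max(worst, abs(sum(r[i] * r[j] * r[k] for r in runs)))
    return worst

initial = resolution_iii_design()
combined = foldover(initial)
print(len(initial), max_main_vs_interaction(initial))
print(len(combined), max_main_vs_interaction(combined))
```

In the initial 8 runs at least one main effect is completely aliased with a two-factor interaction (for example D with AB), so the largest sum is 8. In the combined 16 runs every such sum is zero, which is the property that makes the main effect estimates free of bias from two-factor interactions.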
TABLE 2.9
MINIMUM NUMBER OF RUNS FOR DESIRED RESOLUTION

Number of           Resolution
Variables       III      IV       V
    3             4       8       8
    4             8       8      16
    5             8      16      16
    6             8      16      32
    7             8      16      64
    8            16      16      64
    9            16      32     128
   10            16      32     128
   11            16      32     128

2.3 ROBUST DESIGN AND RESPONSE SURFACE METHODOLOGY

This section will describe how the techniques associated with response surface methodology, outlined in Section 2.2, can be applied to designing a product or process that is insensitive, or robust, to variation that is difficult or impossible to control. Two alternative strategies will be outlined in this introduction and will be considered in detail in Sections 2.3.1 - 2.3.5.
In the first approach it is assumed that the effect of environmental variation on the response is investigated by running a replicated experiment. The replication enables the variation of the response to be estimated at each design point. In this scenario the environmental variation is uncontrolled during the experiment but is assumed to affect the response in a random manner and is captured in the replication. It is acknowledged that the variation that is measured at each design point will be from many sources, including the sources of the environmental variation. However, with this approach the objective of finding design variable settings that minimize the variation in the response can be achieved, although no information will be gained as to how the design variable settings might make the response robust to particular sources of environmental variation. The design and analysis of experiments with this first approach will be covered in Sections 2.3.1 and 2.3.2.

In the second approach, the environmental variation is deliberately introduced into the experiment by including in the experimental design environmental variables that are controlled at predetermined settings during the experiment. In this approach it will be possible to estimate how much of the variation is due to the environmental variables and how much is due to unassignable sources. It will be possible also to determine how particular design variable settings might make the response robust to the sources of environmental variation considered in the experiment. The design and analysis of experiments with this second approach will be covered in Sections 2.3.3 and 2.3.4, and an example will be given in Section 2.3.5. It will be seen that some of the methods for analysis of experiments conducted under the first approach can also be applied to data derived from experiments conducted under the second approach.

2.3.1 Response surface modeling of the mean and standard deviation

In Section 2.2 it was shown that response surface methodology can be applied to enable a researcher to model the effect of multiple quantitative variables on a response with a low-degree polynomial. Frequently, response surface techniques have focused on the mean response as the only response of interest. However, by regarding the variation in the response as an additional response of interest, the researcher can investigate how to achieve a mean response that is on target with minimum variation. In particular, if a researcher replicates each design point in an experiment, then an estimate of the standard deviation at each point can be calculated and used to model the effect of the variables on the variability of the response.

To illustrate this approach, suppose that in an experiment on tablet formulation a researcher is interested in understanding how three quantitative variables, pressure force, lactose quantity, and disintegrant quantity, affect crushing strength. Suppose that the objective is to have a mean crushing strength of 125 N with minimum variation. If it is believed that the effect of the variables on the crushing strength can be adequately represented by a second-order polynomial then a 17-run central composite design, shown in Table 2.5, could be run to estimate all of the terms in the second-order model
$$
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{33} x_3^2 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \varepsilon \tag{16}
$$
where the $x_i$ are coded settings for the three design variables. Now if each of the 17 design points in the central composite design is replicated five times, so that the complete design has 85 runs, then at each design point we can calculate the average response and the standard deviation of the response. The analysis techniques associated with response surface methodology can then be applied to fit separate models to the mean and the standard deviation. The researcher is then in a position to determine settings of the variables that will give a mean response that is close to target with minimum variation. (It should be noted that many authors suggest that, for theoretical reasons, the log of the standard deviation, ln(s), be modeled rather than s; see, for example, Bartlett and Kendall [29] and Box [30].) In the context of the tablet formulation example, the models of the mean and the standard deviation can be used to determine which factors affect the mean crushing strength only, which affect the variability in crushing strength only, and which affect both the mean and the variability. The researcher can then choose settings of the variables that will give a mean crushing strength that is consistently close to 125 N.

At this stage it is important to stress that the run order of the experimental design, including all replicates, should be completely randomized, since the purpose of the replicates is to provide an estimate of the total variation in the process or product at each design combination. If the replicated experiment is not completely randomized, then it is likely that the variation at each design point will be under-estimated since it will not include a component due to any variation in the set-up of the design variables. This could lead to erroneous conclusions about robust design combinations if certain design combinations have less set-up variation than others.

The advantages of using the response surface approach to study both the mean and the variability are that it is easy to apply, no new methods of analysis are required, and the standard analysis methods can be used to bring insight to bear on the dual objective of the mean response and the variability. Some of these methods of analysis are considered in Section 2.3.2.
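The steps described in this section (replicate each design point, compute the average and standard deviation at each point, and fit separate second-order models to the mean and to ln(s)) can be sketched with NumPy as follows. The data are SYNTHETIC and purely illustrative: the "true" mean and standard deviation functions, and the fixed replicate offsets used in place of random errors, are assumptions chosen so that the fitted coefficients can be checked by hand. They are not results from the tablet formulation experiment.

```python
import itertools
import numpy as np

# 17-point central composite design in three variables (layout of Table 2.5).
alpha = (2 ** 3) ** 0.25
cube = [list(pt) for pt in itertools.product([-1.0, 1.0], repeat=3)]
star = [[s * alpha if j == i else 0.0 for j in range(3)]
        for i in range(3) for s in (-1.0, 1.0)]
center = [[0.0, 0.0, 0.0]] * 3
design = np.array(cube + star + center)

def model_matrix(d):
    """Columns: 1, x1, x2, x3, x1^2, x2^2, x3^2, x1x2, x1x3, x2x3."""
    x1, x2, x3 = d.T
    return np.column_stack([np.ones(len(d)), x1, x2, x3,
                            x1**2, x2**2, x3**2, x1*x2, x1*x3, x2*x3])

# SYNTHETIC responses (hypothetical): five replicates at each design point.
true_mean = 125 + 10 * design[:, 0] - 4 * design[:, 1] ** 2
true_sd = np.exp(1.0 + 0.5 * design[:, 1])
offsets = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])   # fixed offsets, sum to zero
y = true_mean[:, None] + true_sd[:, None] * offsets[None, :]   # shape (17, 5)

# Summary statistics at each design point.
ybar = y.mean(axis=1)
log_s = np.log(y.std(axis=1, ddof=1))

# Separate least-squares fits for the mean and for ln(s).
X = model_matrix(design)
b_mean, *_ = np.linalg.lstsq(X, ybar, rcond=None)
b_logs, *_ = np.linalg.lstsq(X, log_s, rcond=None)

print(y.size)                  # total number of runs
print(np.round(b_mean, 3))     # coefficients of the mean model
print(np.round(b_logs, 3))     # coefficients of the ln(s) model
```

In this constructed example the first variable affects only the mean and the second variable affects both the mean (through its quadratic term) and the variability, which is the kind of separation of effects that the two fitted models are intended to reveal. With real data the replicates would contain random error and the fitted coefficients would be accompanied by standard errors.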
As was mentioned above, a disadvantage of this approach is that the variation that is measured at each replicated design point will be from many sources, including sources of environmental variation, and it will be impossible to attribute the variation to a particular source. Another disadvantage of this approach is that it assumes that the variation experienced at the design points during the course of the experiment is similar to that experienced in practice in the real world. Frequently an experiment will be well-controlled and so the variation experienced will be considerably less than that normally encountered.

One of the rationales for the noise arrays and cross-product designs advocated by Taguchi and discussed in Section 2.1 is to deliberately introduce into the experiment sources of variation that are more in line with what will be encountered in practice. During the experiment the noise (or environmental) variables are changed in a controlled manner that mimics the variation likely to be experienced in practice. Experiments that deliberately introduce the variation into the experiment through the experimental design (called the second approach, above) will be considered in Sections 2.3.3 and 2.3.4.

2.3.2 Analyzing the mean and standard deviation response surfaces
One analysis approach, appropriate if there are only a couple of design variables, is to construct contour plots of the mean response and the standard deviation of the response over the range of the variables. This will enable the researcher to see the constraints and trade-offs that may need to be made to achieve required values for the mean and variability of the response. A more rigorous analysis for simultaneously obtaining a target value for the mean and minimizing the variance has been discussed by Vining and Myers [31]. They propose applying the dual response approach developed by Myers and Carter [32] and state that this approach can satisfy the goals of achieving a target for the mean and for the variance within a more rigorous statistical methodology than that proposed by Taguchi. The objective of the dual response approach of Myers and Carter is to optimize a primary response subject to an appropriate equality constraint on the value of a secondary response. An application of this approach to the study of products and processes that are stable to environmental variation would involve running a response surface design, such as a central composite design or Box-Behnken design, that is replicated at each design point, as described in Section 2.3.1. Since each design point is replicated, the mean and variance can be calculated for each point in the experiment. Separate second-order models are fit to the data from the experiment that adequately describe the effect of the variables on the mean and on the standard deviation of the response. Then these two models are studied using the dual response approach of optimizing a primary response subject to an appropriate equality constraint on the value of a secondary response. The choice of whether to make the mean the primary or the secondary response will depend on the objectives of the experiment. 
For example, if the objective is to have the mean on target with minimum variation then the dual response approach would suggest minimizing the variance (or some function of the variance such as ln(s)), subject to the constraint that the mean is at its target value. In this case the variance (or ln(s)) will be the primary response and the mean will be the secondary response. Alternatively, if the objective is to maximize (or minimize) the mean response and keep the variation as small as possible then the dual response approach would suggest optimizing the mean subject to the constraint that the variance is less than some upper bound. In this case the mean will be the primary response and the variance will be the secondary response. As suggested by Vining and Myers [31], the investigator may wish to select several possible constraint values for the variance, find the corresponding optimum values for the mean response subject to these variance constraints, and select a good compromise among these values.

Details of the dual response approach can be found in the references given above. It is an extension of ridge analysis (Hoerl [33], see also Box and Draper [12]). The assumption is that there is a spherical region of interest of the design variables and that the variable combination that optimizes the primary response subject to a constraint on the secondary response is likely to be on the boundary of this region of interest. Thus an additional constraint is introduced, that the optimal value for the primary response is on the boundary of this spherical region. Lagrange multipliers are used to solve this constrained optimization problem. An example of the application of the dual response approach is given in Vining and Myers [31].

The application of the standard nonlinear programming techniques of constrained optimization to analyzing the mean and variance response surfaces has been investigated by Del Castillo and Montgomery [34]. These techniques are appropriate since both the primary and secondary responses are usually quadratic functions.
Del Castillo and Montgomery recommend the generalized reduced gradient (GRG) algorithm for the following reasons. Firstly, the GRG algorithm is a primal method, meaning that at each iteration the method searches only through the feasible region to determine a point that improves the primary response. Secondly, the GRG algorithm is one of the most robust nonlinear programming methods in that it can solve a wide variety of problems. Finally, the GRG method is known to work well unless the starting point is far from optimal and the constraints are highly nonlinear. Neither of these conditions is likely to be of concern when applying GRG methods to the dual response problem. Del Castillo and Montgomery also mention that if, in the dual response problem, the primary response is quadratic and the secondary response is linear, then a simpler method, such as quadratic programming, would be appropriate.

An explanation of the GRG algorithm and its application to the dual response problem is given in Del Castillo and Montgomery [34]. In this paper Del Castillo and Montgomery claim that the GRG methodology has an advantage over the dual response method of Vining and Myers [31] in that it allows more constraints (secondary responses, such as cost constraints) to be included in the optimization and the constraints can be of a more flexible form. Furthermore, the optimization can be conducted over non-spherical regions of interest; for example, a cuboidal region defined by design variables within the region $-1 \le x_i \le +1$.
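The structure of the dual response problem (optimize a primary response subject to a constraint on a secondary response, within a region of interest) can be illustrated with a deliberately simple sketch. The code below is neither the Lagrange multiplier method of Vining and Myers nor the GRG algorithm; it is a brute-force grid search over a cuboidal region, used here only because it is easy to verify. The two fitted models are HYPOTHETICAL, with coefficients invented for illustration.

```python
from itertools import product

TARGET = 125.0

# HYPOTHETICAL fitted models in two coded design variables.
def fitted_mean(x1, x2):
    """Secondary response: fitted mean, to be held at the target value."""
    return 120.0 + 10.0 * x1 + 5.0 * x2

def fitted_log_sd(x1, x2):
    """Primary response: fitted ln(s), to be minimized."""
    return 1.0 + 0.5 * x1 - 0.3 * x2 + 0.2 * x1 ** 2

def dual_response_grid(steps=41, tol=0.1):
    """Minimize ln(s) over a grid on -1 <= x1, x2 <= +1 with the mean on target."""
    grid = [-1 + 2 * i / (steps - 1) for i in range(steps)]
    best = None
    for x1, x2 in product(grid, repeat=2):
        # Constraint on the secondary response (mean within tol of target).
        if abs(fitted_mean(x1, x2) - TARGET) > tol:
            continue
        value = fitted_log_sd(x1, x2)
        if best is None or value < best[0]:
            best = (value, x1, x2)
    return best

log_sd, x1_opt, x2_opt = dual_response_grid()
print(round(log_sd, 3), round(x1_opt, 3), round(x2_opt, 3))
```

For these hypothetical models the on-target settings lie on the line $x_2 = 1 - 2x_1$ with $0 \le x_1 \le 1$, along which ln(s) increases with $x_1$, so the search selects $x_1 = 0$, $x_2 = 1$ with fitted ln(s) = 0.7. A grid search scales poorly with the number of variables, which is one reason that nonlinear programming methods are recommended in the literature cited above.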
2.3.3 Experimental design with environmental variables

In this section it is supposed that the environmental variation is deliberately introduced into the experiment by including in the experimental design environmental variables that are controlled at predetermined settings during the experiment. Freeny and Nair [35] considered robust design experiments with uncontrollable, but measurable, environmental variables. Their approach will not be considered here; in this chapter it will be assumed that environmental variables can be controlled during the experiment. An advantage of including environmental variables in the experimental design is that the analysis can investigate the effect of design variables on specific sources of environmental variation with the objective of understanding how particular design variable settings might affect the variation in the response due to changes in the environmental variables. This and the subsequent section will consider the application of response surface methodology to these experiments. Section 2.3.4 will show how split-plot designs can be applied to include environmental variables in the experimental design. An example of this type of experiment is the tablet formulation experiment described in Section 2.1 and given in Table 2.1.

The usual method that Taguchi advocates for introducing the environmental variation is to construct an experimental design that contains the environmental variable settings and to completely cross this design with the experimental design that contains the design variables. If there are $n_1$ runs in the design array and $n_2$ runs in the environmental array, and the runs are made independently, then there will be $n_1 \times n_2$ runs for the total experiment.
Thus, the experimental designs advocated by Taguchi can require a prohibitively large number of runs. An alternative approach is to regard the environmental variables as standard experimental variables and to apply the techniques associated with response surface methodology to the combined set of design and environmental variables (see Welch, Yu, Kang, and Sacks [36], Shoemaker, Tsui, and Wu [37], and Box and Jones [38]). This approach can result in considerably smaller and therefore cheaper experiments.

As an example of the reduction in the size of the experiment, consider the tablet formulation study of Table 2.1 which had three quantitative design variables, $x_1, x_2, x_3$, and two quantitative environmental variables, $z_1, z_2$. Suppose that all of the variables, both design and environmental, are to be studied at three settings (coded -1, 0, +1), and that each combination was to be run independently and the experiment fully randomized.
TABLE 2.10
TAGUCHI DESIGN FOR THREE DESIGN VARIABLES AND TWO ENVIRONMENTAL VARIABLES

                        Environmental variables
                  z1:   -1   -1   -1    0    0    0   +1   +1   +1
                  z2:   -1    0   +1   -1    0   +1   -1    0   +1
Design variables
  x1    x2    x3
  -1    -1    -1
  -1     0    +1
  -1    +1     0
   0    -1    +1
   0     0     0
   0    +1    -1
  +1    -1     0
  +1     0    -1
  +1    +1    +1
Taguchi's approach of using a separate design and environmental array might result in a nine run fractional factorial design for the design variables and a nine run full factorial design for the environmental variables. The complete crossed design is shown in Table 2.10. It can be seen that it would require 9 × 9 = 81 runs. This design would yield estimates of linear and quadratic effects for the variables and of the interactions between the design and the environmental variables. However, it does not yield any unbiased estimates of the two-factor interactions among the design variables.

An alternative design, based on applying response surface methodology to a combined set of design and environmental variables, could result in a smaller number of runs. One such design is the face-centered composite design described in Section 2.2.3. This design would consist of a 16-run, resolution V, fractional factorial design, augmented by a pair of star points for each factor, and a number ($n_0$) of center points. Such a design, with $n_0 = 4$ center points, is shown in Table 2.11. This design will permit the estimation of all the terms of a full second-order model
2
3
2
2
2
3
Y - ] ~ + Zi~ixi + Z ~ j z j + Zi~iix~ + Z y j j z j + Z Zi~ikXiXk i=1 j=l i=1 j=l i--lk=i+l 3 2 + ~12ZlZ2 + Z Z 4jxizj i=1 j=l
(17)
+ F_,
Thus, not only will this design estimate all of the linear and quadratic terms and interactions between the design and the environmental variables, but it will also estimate all of the two-factor interactions among the design variables and among the environmental variables. It will accomplish this in only $(26 + n_0)$ runs, compared with the 81 runs for the Taguchi design that yields less information. It might be argued that a more reasonable approach for the Taguchi-type design given in Table 2.10 would be to run a two-level design in the environmental variables since an experimenter is unlikely to be interested in estimating the quadratic effects of the environmental variables. Such a situation would permit the use of a $2^2$ full factorial design for the environmental array, and the complete design would require 9 × 4 = 36 runs. It is noted that this is still more than is required for the composite design in Table 2.11. In fact, under the assumption that the quadratic effects of the environmental variables are not of interest, the design of Table 2.11 could be reduced to $(22 + n_0)$ runs by eliminating runs 23-27, the star points for the environmental variables.
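The run counts in this comparison can be checked with a short sketch. The Python below builds a face-centered composite design in the five coded variables, under the assumption that the 16-run cube portion is the half fraction whose fifth column is the product of the first four (a common resolution V choice; the chapter does not say which fraction was used). The function name and column order are illustrative only.

```python
from itertools import combinations, product

def face_centered_ccd(n_center=4):
    """Face-centered composite design in (x1, x2, x3, z1, z2), coded units.

    Assumption: the cube portion is the 16-run half fraction with the fifth
    column equal to the product of the first four columns.
    """
    cube = [(a, b, c, d, a * b * c * d)
            for a, b, c, d in product((-1, 1), repeat=4)]
    star = []
    for k in range(5):
        for level in (-1, 1):
            run = [0] * 5
            run[k] = level  # star points lie on the faces of the cube
            star.append(tuple(run))
    center = [(0,) * 5] * n_center
    return cube, star, center

cube, star, center = face_centered_ccd(n_center=4)
n_runs = len(cube) + len(star) + len(center)  # 26 + n0

# Run counts for the crossed arrays discussed in the text.
crossed_three_level = 9 * 9    # 9-run design array x 3^2 environmental array
crossed_two_level_env = 9 * 4  # 9-run design array x 2^2 environmental array

# Resolution V check on the cube portion: every main-effect column and every
# two-factor-interaction column should be orthogonal to every other one.
columns = [[run[k] for run in cube] for k in range(5)]
columns += [[run[i] * run[j] for run in cube]
            for i, j in combinations(range(5), 2)]
orthogonal = all(
    sum(u * v for u, v in zip(columns[p], columns[q])) == 0
    for p, q in combinations(range(len(columns)), 2)
)

print(n_runs, crossed_three_level, crossed_two_level_env, orthogonal)
```

The orthogonality check is what allows the cube portion to separate all main effects and two-factor interactions; the star and center points then supply the information on the pure quadratic terms.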
[Table 2.11: face-centered composite design for three design variables and two environmental variables, with $n_0 = 4$ center points]
for the environmental array giving a complete crossed design of 9 × 27 = 243 runs. This design would yield estimates of the linear and quadratic effects for all the variables and of the interactions between the design and the environmental variables. However, it does not yield any unbiased estimates of the two-factor interactions among the design variables. An alternative design would be the seven-variable Box-Behnken design shown in Table 2.12. In this table each group of eight runs consists of all eight combinations of ±1 for the three variables indicated, the other four variables being set at their center point, 0. The design is completed with $n_0$ center points, giving a total of $(56 + n_0)$ runs.

TABLE 2.12 BOX-BEHNKEN DESIGN FOR THREE DESIGN VARIABLES AND FOUR ENVIRONMENTAL VARIABLES

Runs      x1    x2    x3    z1    z2    z3    z4
1-8        0     0     0    ±1    ±1    ±1     0
9-16      ±1     0     0     0     0    ±1    ±1
17-24      0    ±1     0     0    ±1     0    ±1
25-32     ±1    ±1     0    ±1     0     0     0
33-40      0     0    ±1    ±1     0     0    ±1
41-48     ±1     0    ±1     0    ±1     0     0
49-56      0    ±1    ±1     0     0    ±1     0
n0         0     0     0     0     0     0     0
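As a minimal sketch of how the blocks of Table 2.12 expand into individual runs, the Python below generates the eight ±1 combinations for each triple of variables and appends the center points. The triples are read from the table; the function name and the choice of four center points in the usage line are illustrative assumptions.

```python
from itertools import combinations, product

FACTORS = ("x1", "x2", "x3", "z1", "z2", "z3", "z4")

# Variables set to +/-1 in each group of eight runs (rows of Table 2.12).
BLOCKS = [
    ("z1", "z2", "z3"),
    ("x1", "z3", "z4"),
    ("x2", "z2", "z4"),
    ("x1", "x2", "z1"),
    ("x3", "z1", "z4"),
    ("x1", "x3", "z2"),
    ("x2", "x3", "z3"),
]

def box_behnken_7(n_center=0):
    """Expand the seven blocks into 56 runs and append n_center center points."""
    runs = []
    for block in BLOCKS:
        for levels in product((-1, 1), repeat=3):
            setting = dict.fromkeys(FACTORS, 0)  # all variables at the center
            setting.update(zip(block, levels))   # except the three in the block
            runs.append(tuple(setting[f] for f in FACTORS))
    runs.extend([(0,) * len(FACTORS)] * n_center)
    return runs

design = box_behnken_7(n_center=4)  # n0 = 4 chosen only for illustration

# Each pair of variables should be varied together in exactly one block,
# which is what makes every two-factor interaction estimable.
pair_counts = {}
for block in BLOCKS:
    for pair in combinations(sorted(block), 2):
        pair_counts[pair] = pair_counts.get(pair, 0) + 1
pairs_covered_once = len(pair_counts) == 21 and set(pair_counts.values()) == {1}

print(len(design), pairs_covered_once)
```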
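The design of Table 2.12 will permit the estimation of all the terms of a full second-order model

$$y = \beta_0 + \sum_{i=1}^{3}\beta_i x_i + \sum_{j=1}^{4}\gamma_j z_j + \sum_{i=1}^{3}\beta_{ii}x_i^2 + \sum_{j=1}^{4}\gamma_{jj}z_j^2 + \sum_{i=1}^{2}\sum_{k=i+1}^{3}\beta_{ik}x_i x_k + \sum_{j=1}^{3}\sum_{k=j+1}^{4}\gamma_{jk}z_j z_k + \sum_{i=1}^{3}\sum_{j=1}^{4}\delta_{ij}x_i z_j + \varepsilon \tag{18}$$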
Thus, not only will this design provide data to estimate all of the linear and quadratic terms and the interactions between the design and environmental variables, but it will also estimate all of the two-factor interactions among the design variables and among the environmental variables. As with the
previous design, the Taguchi crossed design gives less information while requiring more runs than a standard second-order design. Note that even if the environmental variables are at two levels so that a $2^{4-1}$ fractional factorial design can be run for the Taguchi environmental array, the complete crossed design has 9 × 8 = 72 runs, still more than the Box-Behnken design of Table 2.12, while providing estimates of fewer coefficients of the second-order model.

Shoemaker et al. [37] give several examples of the reduction in the number of experimental runs that can occur when it is assumed that some of the terms in the full second-order model are negligible. The reader is warned, however, that assuming a term is negligible is not an assurance that it can be ignored. The presence of terms in the true model that were assumed negligible will bias the estimates of the other coefficients. Box and Jones [20,38] showed that by considering the experimental objective, it is possible to construct smaller designs without having to assume that certain interactions are negligible. They showed that if the experimenter's objective is to find the design combination that minimizes the variance, then the second-order effects among the environmental variables (that is, the pure quadratic and interaction terms) are not of interest. Consequently, smaller designs can be constructed by aliasing together interactions among the environmental variables. These designs would still enable the unbiased estimation of all other coefficients of the second-order model, even if the interactions among the environmental variables are not negligible.

As an example, consider the experiment with three design variables and four environmental variables described above. A design based on the central composite design could be used that would only require $(38 + n_0)$ runs. This would be achieved by using as the cube portion a 32-run, resolution IV design that confounds all of the two-factor interactions among the environmental variables with one another, along with 6 star points for the design variables (two for each), and $n_0$ center points. A face-centered composite design of this form, with four center points, is shown in the example given in Section 2.3.5; see Table 2.13. It should be noted that the use of this experimental design does not require the assumption of negligible interactions among the environmental variables, only that they are not of interest. If non-negligible interactions do exist they will not bias the estimates of the other coefficients of the second-order model. To summarize, it has been shown that combining the design and environmental variables into a single set for a response surface design not
only results in experiments that frequently require fewer runs than Taguchi's designs, but also there is considerable flexibility in choosing the designs so that all of the coefficients of interest can be estimated and runs are not wasted to estimate coefficients that can be ignored.
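The aliasing idea behind the $(38 + n_0)$-run design can be illustrated with a small sketch. The chapter does not list the generators of the 32-run cube portion, so the Python below assumes one possible choice, $z_4 = z_1 z_2 z_3$ and $x_3 = x_1 x_2 z_1 z_2$, and checks numerically that it has the stated property: the effects of interest are mutually orthogonal, while the interactions among the environmental variables are confounded only with one another.

```python
from itertools import combinations, product

NAMES = ("x1", "x2", "x3", "z1", "z2", "z3", "z4")
DESIGN_VARS, ENV_VARS = NAMES[:3], NAMES[3:]

def cube_portion():
    """32-run 2^(7-2) fraction.

    Assumed (illustrative) generators: z4 = z1*z2*z3 and x3 = x1*x2*z1*z2.
    """
    runs = []
    for x1, x2, z1, z2, z3 in product((-1, 1), repeat=5):
        runs.append((x1, x2, x1 * x2 * z1 * z2, z1, z2, z3, z1 * z2 * z3))
    return runs

def column(runs, *factors):
    """Contrast column for the product of the named factors."""
    idx = [NAMES.index(f) for f in factors]
    col = []
    for run in runs:
        value = 1
        for i in idx:
            value *= run[i]
        col.append(value)
    return col

def orthogonal(u, v):
    return sum(a * b for a, b in zip(u, v)) == 0

cube = cube_portion()

# Effects of interest: main effects, design x design, design x environment.
effects = [(f,) for f in NAMES]
effects += list(combinations(DESIGN_VARS, 2))
effects += [(x, z) for x in DESIGN_VARS for z in ENV_VARS]
cols = {e: column(cube, *e) for e in effects}
effects_clear = all(orthogonal(cols[a], cols[b])
                    for a, b in combinations(effects, 2))

# Environment x environment interactions: aliased with each other only.
zz = {pair: column(cube, *pair) for pair in combinations(ENV_VARS, 2)}
zz_do_not_bias = all(orthogonal(zcol, c)
                     for zcol in zz.values() for c in cols.values())
z1z2_aliased_with_z3z4 = zz[("z1", "z2")] == zz[("z3", "z4")]

n_center = 4
total_runs = len(cube) + 2 * len(DESIGN_VARS) + n_center  # 38 + n0
print(total_runs, effects_clear, zz_do_not_bias, z1z2_aliased_with_z3z4)
```

Other generator choices with the same alias structure exist; the point of the sketch is only that such a fraction can be verified directly from its contrast columns.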
2.3.4 Analysis of experimental designs with environmental variables

Having considered the advantages of designing an experiment with a combined set of design and environmental variables, as opposed to Taguchi's crossed arrays, this section will consider the analysis of such experiments. It should be noted that in contrast to the previous section there is no pure replication of the design points from the response surface design. Consequently it is not possible to estimate the variance at each design point and to fit a model for the variance. The analysis approach in this section is based on fitting a model to the data without distinction as to whether the variables are design or environmental variables. The explicit modeling of the environmental variables has advantages over the modeling of a summary measure of variation such as the standard deviation, which can lead to erroneous conclusions (see Steinberg and Bursztyn [39]).

At this stage it is helpful to consider, in general terms, the objective of an experiment to investigate robustness. Consider an experiment with one design variable, x, and one environmental variable, z. The objective is to determine a setting of x that will yield a response that does not change as z varies. From this description it is clear that information on robustness will be contained in the interaction between x and z.
Figure 2.5 Design x Environment interaction plot (response plotted against the environmental variable, z, with one line for each of x = -1, x = 0, and x = +1)
Figure 2.5 shows a possible interaction plot of x and z. In this figure the 0 setting of x yields a response that is approximately constant as the environmental variable, z, is changed. This setting yields a response that is robust, or stable, to the environmental variation, z. In contrast, at the other settings of x the response changes as z is varied, indicating that these settings of x do not make the response robust to the environmental variation, z. A good summary of the analysis methods discussed in this section can be found in Myers, Khuri, and Vining [40]. Similar approaches have been described by Welch et al. [36], Shoemaker et al. [37], and Box and Jones [5,38]; see also Myers [41].

Suppose that a response surface design has been run with n design variables, $x_1, x_2, x_3, \ldots, x_n$, and m environmental variables, $z_1, z_2, z_3, \ldots, z_m$. During the experiment the environmental variables are controlled at fixed levels and can be regarded as fixed effects. Suppose that the x's and z's are centered and scaled around 0. In this section, several alternative models for the relationship between the design and environmental variables and the response will be considered. Suppose, initially, that the response from the experiment can be adequately modeled by a first-order model in both the design and the environmental factors.

$$y_{xz} = \beta_0 + \sum_{i=1}^{n}\beta_i x_i + \sum_{j=1}^{m}\gamma_j z_j + \varepsilon \tag{19}$$

In matrix notation,

$$y_{xz} = \beta_0 + \mathbf{x}^{T}\boldsymbol{\beta} + \mathbf{z}^{T}\boldsymbol{\gamma} + \varepsilon \tag{20}$$

where $\boldsymbol{\beta}$ and $\mathbf{x}$ are (n × 1) vectors and $\boldsymbol{\gamma}$ and $\mathbf{z}$ are (m × 1) vectors, and the $\varepsilon$ are independent $N(0, \sigma_\varepsilon^2)$. In the experiment the environmental variables are controlled at fixed levels, but in reality the environmental variables have a random effect on the response, $y_{xz}$. Thus, the actual variation in the response is

$$\operatorname{var}(y_{xz}) = \operatorname{var}(\mathbf{z}^{T}\boldsymbol{\gamma}) + \operatorname{var}(\varepsilon) = \boldsymbol{\gamma}^{T}\mathbf{V}\boldsymbol{\gamma} + \sigma_\varepsilon^2 \tag{21}$$
where $\mathbf{z}$ are random settings of the environmental variables that affect the response in reality (outside of the experiment), and $\mathbf{V}$ is the variance-covariance matrix of $\mathbf{z}$. It is clear from this formula that the variance of the response is independent of $\mathbf{x}$, the settings of the design variables. Consequently, there is no opportunity for achieving a more robust response in the presence of the environmental variation, $\mathbf{z}$, by selecting particular settings for the design variables.

Consider, now, a second example where the response from the experiment can be adequately modeled by a model that contains linear terms in the design variables, x, and the environmental variables, z, and also cross-product terms xz. Therefore, if there are n design variables, $x_1, x_2, \ldots, x_n$, and m environmental variables, $z_1, z_2, \ldots, z_m$, then the response, $y_{xz}$, can be represented by

$$y_{xz} = \beta_0 + \sum_{i=1}^{n}\beta_i x_i + \sum_{j=1}^{m}\gamma_j z_j + \sum_{i=1}^{n}\sum_{j=1}^{m}\delta_{ij}x_i z_j + \varepsilon \tag{22}$$

In matrix notation,

$$y_{xz} = \beta_0 + \mathbf{x}^{T}\boldsymbol{\beta} + \mathbf{z}^{T}\boldsymbol{\gamma} + \mathbf{z}^{T}\mathbf{D}\mathbf{x} + \varepsilon \tag{23}$$
where $\boldsymbol{\beta}$ and $\mathbf{x}$ are (n × 1) vectors, $\boldsymbol{\gamma}$ and $\mathbf{z}$ are (m × 1) vectors, and where $\mathbf{D}$ is an (m × n) matrix that contains the coefficients that measure the interactions between the design and the environmental variables. It is assumed that an experimental design has been conducted that will permit estimation of all these two-factor interactions and the main effects of the design and the environmental factors. Box and Jones [21] discuss experimental designs that accomplish this. Now let

$$g_j(\mathbf{x}) = \left[\frac{\partial y_{xz}}{\partial z_j}\right]_{\mathbf{z}=0} = \gamma_j + \sum_{i=1}^{n}\delta_{ij}x_i, \quad \text{for } j = 1, \ldots, m \tag{24}$$
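Then

$$\mathbf{g}^{T}(\mathbf{x}) = \left\{ \left[\frac{\partial y_{xz}}{\partial z_1}\right]_{\mathbf{z}=0}, \left[\frac{\partial y_{xz}}{\partial z_2}\right]_{\mathbf{z}=0}, \ldots, \left[\frac{\partial y_{xz}}{\partial z_m}\right]_{\mathbf{z}=0} \right\} \tag{25}$$

and $\mathbf{g}(\mathbf{x}) = \boldsymbol{\gamma} + \mathbf{D}\mathbf{x}$ is a measure of the change in the response, as a function of the design variables, in the direction of $\mathbf{z}$ at $\mathbf{z} = 0$. Therefore, we have

$$y_{xz} = \beta_0 + \mathbf{x}^{T}\boldsymbol{\beta} + \mathbf{z}^{T}\mathbf{g}(\mathbf{x}) + \varepsilon \tag{26}$$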
Now, as before, in reality the environmental variables have a random effect on the response, $y_{xz}$. Therefore the actual variance of the response is

$$V(y_{xz}) = \operatorname{var}(\mathbf{z}^{T}\mathbf{g}(\mathbf{x})) + \operatorname{var}(\varepsilon) = \mathbf{g}^{T}(\mathbf{x})\mathbf{V}\mathbf{g}(\mathbf{x}) + \sigma_\varepsilon^2 \tag{27}$$

where $\mathbf{g}(\mathbf{x}) = \boldsymbol{\gamma} + \mathbf{D}\mathbf{x}$ and $\mathbf{V}$ is the variance-covariance matrix of $\mathbf{z}$. From this formula, it can be seen that the variance of the response is a function of the settings of the design variables. Therefore there is an opportunity for making the response robust to the environmental variation by careful selection of the settings of the design variables. Suppose that from an experiment good estimates of the terms of $\boldsymbol{\gamma}$ and $\mathbf{D}$ are obtained and that the elements of $\mathbf{V}$ are known. Then the variance of $y_{xz}$ can be minimized as a function of the design variables, $\mathbf{x}$. Also from equation (26), the mean response level is

$$E(y_{xz}) = \beta_0 + \mathbf{x}^{T}\boldsymbol{\beta}, \tag{28}$$

under the assumption that the random environmental variables have a mean of zero. It can be seen that both $V(y_{xz})$ (equation (27)) and $E(y_{xz})$ (equation (28)) are essentially response surface models. From an experiment, estimates of $\boldsymbol{\gamma}$, $\mathbf{D}$, $\sigma_\varepsilon^2$, $\beta_0$, and $\boldsymbol{\beta}$ can be derived. Suppose, also, that the elements of $\mathbf{V}$ are known, or can be estimated. Then the search for a choice of design variables that yields a response that is robust to the environmental variation and close to target will involve an examination of these two response surfaces. At this point, the scientist might proceed by following
the dual response or constrained optimization approaches discussed in Section 2.3.2 or by simply overlaying contour plots of the mean and variance response surfaces. In practice, of course, there could be considerable uncertainty as to values for the elements of $\mathbf{V}$, although it might be possible to estimate them from historical data. If reliable estimates of the values of the elements of $\mathbf{V}$ are unavailable then several alternative guesses could be made and the sensitivity of the conclusions to these estimates could be ascertained. If there is some target value, $\tau$, for the response then a measure of closeness of the mean response to that target is

$$M(y_{xz}) = \left(\tau - (\beta_0 + \mathbf{x}^{T}\boldsymbol{\beta})\right)^2. \tag{29}$$

Box and Jones [38] discussed the use of a general robustness measure of the form

$$R(y) = \lambda V(y) + (1 - \lambda)M(y) \tag{30}$$

where $0 < \lambda < 1$. Selection of a particular value for $\lambda$ corresponds to a particular weighting of the relative importance of being close to target and having small variation.

Suppose, now, that the response from the experiment can be adequately represented by a model as in equation (22) but with the addition of pure quadratic and interaction terms for the design variables, x. For n design variables, $x_1, x_2, \ldots, x_n$, and m environmental variables, $z_1, z_2, \ldots, z_m$, it is supposed that the model for the experiment is

$$y_{xz} = \beta_0 + \sum_{i=1}^{n}\beta_i x_i + \sum_{j=1}^{m}\gamma_j z_j + \sum_{i=1}^{n}\beta_{ii}x_i^2 + \sum_{i=1}^{n-1}\sum_{k=i+1}^{n}\beta_{ik}x_i x_k + \sum_{i=1}^{n}\sum_{j=1}^{m}\delta_{ij}x_i z_j + \varepsilon \tag{31}$$

In matrix notation we have

$$y_{xz} = \beta_0 + \mathbf{x}^{T}\boldsymbol{\beta} + \mathbf{x}^{T}\mathbf{B}\mathbf{x} + \mathbf{z}^{T}\boldsymbol{\gamma} + \mathbf{z}^{T}\mathbf{D}\mathbf{x} + \varepsilon \tag{32}$$
where $\boldsymbol{\beta}$ and $\mathbf{x}$ are (n × 1) vectors, $\boldsymbol{\gamma}$ and $\mathbf{z}$ are (m × 1) vectors, $\mathbf{B}$ is an (n × n) matrix that contains the coefficients that measure the interactions and pure quadratic terms among the design variables, and $\mathbf{D}$ is an (m × n) matrix that contains the coefficients that measure the interactions between the design and the environmental variables. As before, let $\mathbf{g}^{T}(\mathbf{x})$ be as in equation (25) so that $\mathbf{g}(\mathbf{x}) = \boldsymbol{\gamma} + \mathbf{D}\mathbf{x}$ is a measure of the change in the response, as a function of the design variables, in the direction of $\mathbf{z}$ at $\mathbf{z} = 0$. Therefore, we have

$$y_{xz} = \beta_0 + \mathbf{x}^{T}\boldsymbol{\beta} + \mathbf{x}^{T}\mathbf{B}\mathbf{x} + \mathbf{z}^{T}\mathbf{g}(\mathbf{x}) + \varepsilon \tag{33}$$

Now, as before, in reality the environmental variables have a random effect on the response, $y_{xz}$. Therefore the actual variance of the response is

$$V(y_x) = \operatorname{var}(\mathbf{z}^{T}\mathbf{g}(\mathbf{x})) + \operatorname{var}(\varepsilon) = \mathbf{g}^{T}(\mathbf{x})\mathbf{V}\mathbf{g}(\mathbf{x}) + \sigma_\varepsilon^2 \tag{34}$$

where $\mathbf{g}(\mathbf{x}) = \boldsymbol{\gamma} + \mathbf{D}\mathbf{x}$. Therefore, the formula for the variance of the actual response is identical to that of the previous model (see equation (27)) and is a function of the settings of the design variables only through $\boldsymbol{\gamma}$ and $\mathbf{D}$. Therefore, as before, there is an opportunity for making the response robust to the environmental variation by careful selection of the settings of the design variables. Also from equation (33), the mean response level is

$$E(y_x) = \beta_0 + \mathbf{x}^{T}\boldsymbol{\beta} + \mathbf{x}^{T}\mathbf{B}\mathbf{x}, \tag{35}$$

under the assumption that the random environmental variables have a mean of zero. The mean response level is now a function of both the first-order and second-order terms in the design variables. Thus, it can be seen that both $V(y_x)$ and $E(y_x)$ are quadratic response surface models in $\mathbf{x}$. From an experiment, estimates of $\boldsymbol{\gamma}$, $\mathbf{D}$, $\sigma_\varepsilon^2$, $\beta_0$, $\boldsymbol{\beta}$, and $\mathbf{B}$ can be derived. Suppose, also, that the elements of $\mathbf{V}$ are known, or can be estimated. Then, as before, the search for a choice of design variables that yields a response that is robust to the environmental variation and close to target will involve an examination of these two response surfaces, equations (34) and (35).
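The two response surfaces and the robustness measure of equation (30) can be written as a few small functions. The sketch below is plain Python with no external libraries; all coefficient values are hypothetical and chosen only so the arithmetic is easy to follow. One assumption is made beyond the text: for the model with quadratic terms, the closeness measure M uses the mean of equation (35) in place of the first-order mean in equation (29).

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def mat_vec(A, v):
    """Product of a matrix (given as a list of rows) with a vector."""
    return [dot(row, v) for row in A]

def slope_in_z(x, gamma, D):
    """g(x) = gamma + D x: slope of the response in each z_j at z = 0."""
    return [g + d for g, d in zip(gamma, mat_vec(D, x))]

def response_variance(x, gamma, D, V, sigma2_eps):
    """g(x)' V g(x) + sigma_eps^2, as in equations (27) and (34)."""
    g = slope_in_z(x, gamma, D)
    return dot(g, mat_vec(V, g)) + sigma2_eps

def response_mean(x, beta0, beta, B):
    """beta0 + x' beta + x' B x, equation (35); B = 0 gives equation (28)."""
    return beta0 + dot(x, beta) + dot(x, mat_vec(B, x))

def robustness(x, lam, target, beta0, beta, B, gamma, D, V, sigma2_eps):
    """lam * V(y) + (1 - lam) * M(y), equation (30)."""
    variance = response_variance(x, gamma, D, V, sigma2_eps)
    closeness = (target - response_mean(x, beta0, beta, B)) ** 2
    return lam * variance + (1 - lam) * closeness

# Hypothetical coefficients: n = 2 design variables, m = 1 environmental variable.
beta0, beta = 10.0, [1.0, -2.0]
B = [[0.0, 0.0], [0.0, 0.0]]
gamma, D = [3.0], [[2.0, 0.0]]
V, sigma2_eps = [[1.0]], 0.5

for x in ([0.0, 0.0], [-1.0, 0.0]):
    print(x,
          response_mean(x, beta0, beta, B),
          response_variance(x, gamma, D, V, sigma2_eps))
```

With these made-up numbers, moving $x_1$ from 0 to -1 reduces the slope in z from 3 to 1, so the transmitted variance drops while the mean shifts by one unit; this is the trade-off that the measure R(y) is meant to weigh.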
2.3.5 Example

To illustrate the approach described above, consider an experiment with three design variables, $x_1$, $x_2$, $x_3$, and four environmental variables, $z_1$, $z_2$, $z_3$, $z_4$. The objective was to find a setting of the design variables that will lead to a small response with minimum variability due to the environmental variables. Suppose that it was reasonable to assume that the second-order effects (that is, the pure quadratic and interaction terms) among the noise variables were not of interest. The experimental design used was one based on the face-centered central composite design that only required $(38 + n_0)$ runs. This design, described in Section 2.3.3, has as the cube portion a 32-run, resolution IV design that confounds all of the two-factor interactions of the noise factors with one another, along with six star points for the design factors and $n_0$ center points. The design, with the responses from the experiment, is shown in Table 2.13. The following model, equation (36), was fit to the data.

$$y = \beta_0 + \sum_{i=1}^{3}\beta_i x_i + \sum_{j=1}^{4}\gamma_j z_j + \sum_{i=1}^{3}\beta_{ii}x_i^2 + \sum_{i=1}^{2}\sum_{k=i+1}^{3}\beta_{ik}x_i x_k + \sum_{i=1}^{3}\sum_{j=1}^{4}\delta_{ij}x_i z_j + \varepsilon \tag{36}$$
It can be seen that this model contains all main effects, all quadratic terms in the design variables, all interactions among the design variables, and all interactions between the design and the environmental variables. An estimate of the pure experimental error can be obtained from the replication at the four center points. The ANOVA table shown in Table 2.14 indicates that there was no significant lack-of-fit of the model. Parameter estimates and t-statistics for this model are shown in Table 2.15. The following model for the response was derived using the significant effects indicated in Table 2.15.

$$y_{xz} = 41.83 + 2.50x_1 - 3.91x_2 + 4.19z_1 - 4.38z_3 + 2.69x_1x_2 + 2.38x_1z_1 - 2.81x_2z_3 + \varepsilon \tag{37}$$
[Table 2.13: face-centered composite design for three design variables and four environmental variables, with the responses from the experiment]
TABLE 2.14 ANOVA TABLE FOR DATA IN TABLE 2.13

Source           df    Sum of Squares    Mean Square    F-ratio    p-value
Model            25    2863.16           114.526        7.614      < 0.0001
Error            16    240.67            15.042
  Lack-of-fit    13    206.67            15.898         1.403      0.4394
  Pure Error      3    34.00             11.333
Total            41    3103.83
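The entries of Table 2.14 can be reproduced from the sums of squares and degrees of freedom alone. The sketch below does this arithmetic in plain Python; it takes the model degrees of freedom as 25, the number of non-intercept terms in equation (36), which is the value consistent with the tabulated model mean square of 114.526 and with a total of 41. The p-values are not recomputed because the standard library has no F distribution.

```python
# Sums of squares from Table 2.14.
ss = {"model": 2863.16, "error": 240.67,
      "lack_of_fit": 206.67, "pure_error": 34.00}

# Degrees of freedom: 25 model terms (3 + 4 linear, 3 quadratic,
# 3 design x design, 12 design x environment), 42 runs in total.
df = {"model": 25, "error": 16, "lack_of_fit": 13, "pure_error": 3}

ms = {source: ss[source] / df[source] for source in ss}

f_model = ms["model"] / ms["error"]                   # model vs residual error
f_lack_of_fit = ms["lack_of_fit"] / ms["pure_error"]  # lack-of-fit vs pure error

for source in ss:
    print(f"{source:12s} df={df[source]:2d} MS={ms[source]:8.3f}")
print(f"F(model) = {f_model:.3f}, F(lack of fit) = {f_lack_of_fit:.3f}")
```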
Equation (37) can be re-expressed as

$$y_{xz} = 41.83 + 2.50x_1 - 3.91x_2 + 2.69x_1x_2 + (4.19 + 2.38x_1)z_1 + (-4.38 - 2.81x_2)z_3 + \varepsilon \tag{38}$$

From this we have as the estimated mean response surface

$$\hat{E}(y_x) = 41.83 + 2.50x_1 - 3.91x_2 + 2.69x_1x_2 \tag{39}$$

and, if we assume that the z's are uncorrelated, then the estimated variance response surface is

$$\hat{V}(y_x) = (4.19 + 2.38x_1)^2\hat{\sigma}_{z_1}^2 + (-4.38 - 2.81x_2)^2\hat{\sigma}_{z_3}^2 + \hat{\sigma}_\varepsilon^2 \tag{40}$$

Now from the center point runs an estimate of $\sigma_\varepsilon^2$ of 11.33 is obtained. Suppose that from previous studies or additional information it is known that good estimates for $\hat{\sigma}_{z_1}^2$ and $\hat{\sigma}_{z_3}^2$ are 1.0. Using these estimates, the estimated response surface for the variance is

$$\hat{V}(y_x) = (4.19 + 2.38x_1)^2 + (-4.38 - 2.81x_2)^2 + 11.33. \tag{41}$$

From equations (39) and (41), it can be seen that $x_3$ has no effect on either the mean response or the variation. Furthermore, it can be seen that both $x_1$ and $x_2$ have an effect on the variation of the response and that an opportunity exists to minimize the effect of the environmental variables $z_1$ and $z_3$ by a particular selection of these two design variables. It should be
noted that the analysis indicates that the other two environmental variables, $z_2$ and $z_4$, do not affect the response.
TABLE 2.15 PARAMETER ESTIMATES FOR DATA IN TABLE 2.13

Variable     Parameter Estimate    Standard Error    t        p-value
Intercept    38.58                 1.51              25.58    < 0.0001
x1            2.50                 0.67               3.76    0.0017
x2           -3.91                 0.67              -5.88    < 0.0001
x3            0.38                 0.67               0.58    0.5734
z1            4.19                 0.69               6.11    < 0.0001
z2           -0.06                 0.69              -0.09    0.9285
z3           -4.38                 0.69              -6.38    < 0.0001
z4            0.75                 0.69               1.09    0.2902
x1^2          4.34                 2.31               1.88    0.0787
x2^2          0.34                 2.31               0.15    0.8850
x3^2         -0.66                 2.31              -0.29    0.7787
x1x2          2.69                 0.69               3.92    0.0012
x1x3          1.06                 0.69               1.55    0.1408
x2x3          0.44                 0.69               0.64    0.5324
x1z1          2.38                 0.69               3.46    0.0032
x1z2         -0.88                 0.69              -1.28    0.2201
x1z3         -0.06                 0.69              -0.09    0.9285
x1z4         -0.19                 0.69              -0.27    0.7880
x2z1          0.63                 0.69               0.91    0.3755
x2z2          0.50                 0.69               0.73    0.4764
x2z3         -2.81                 0.69              -4.10    0.0008
x2z4         -0.81                 0.69              -1.19    0.2533
x3z1          0.13                 0.69               0.18    0.8576
x3z2          0.38                 0.69               0.55    0.5920
x3z3          0.69                 0.69               1.00    0.3309
x3z4         -0.94                 0.69              -1.37    0.1904
Figures 2.6 and 2.7 show contour plots of the mean response and the standard deviation. It can be seen that within the experimental region the best setting of the design variables to minimize the response is to choose $x_1 = -1$ and $x_2 = +1$. In terms of minimizing the variation due to the environmental variables the best setting would be $x_1 = -1$ and $x_2 = -1$. Similar conclusions are reached for a range of alternative choices for the estimates of $\hat{\sigma}_{z_1}^2$ and $\hat{\sigma}_{z_3}^2$, indicating that the conclusions are not oversensitive to the particular estimates chosen for the variances of the environmental variables. Clearly, a compromise between the two objectives of minimizing the mean response and the variation in the response would need to be reached. One possibility would be to minimize the variation subject to the constraint that the mean response would be less than some target value. Alternatively, one could minimize the mean response subject to the constraint that the variation would be less than some target value. If there are different costs involved in operating the process at different settings of the design variables, then a contour plot showing the operating costs can be constructed to help find an operating condition that has low operating cost and reaches a satisfactory compromise between the two objectives.
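The settings quoted above can be reproduced by evaluating the fitted surfaces of equations (39) and (41) over a grid in $x_1$ and $x_2$. The sketch below does so in plain Python and also illustrates the first compromise strategy, minimizing the variance subject to an upper limit on the mean. The grid spacing, the mean limit of 40, and the alternative variance values used in the sensitivity check are arbitrary choices made for illustration, not values from the chapter.

```python
def mean_hat(x1, x2):
    """Estimated mean response surface, equation (39)."""
    return 41.83 + 2.50 * x1 - 3.91 * x2 + 2.69 * x1 * x2

def var_hat(x1, x2, var_z1=1.0, var_z3=1.0, var_error=11.33):
    """Estimated variance response surface, equations (40) and (41)."""
    return ((4.19 + 2.38 * x1) ** 2 * var_z1
            + (-4.38 - 2.81 * x2) ** 2 * var_z3
            + var_error)

# Grid over the experimental region in steps of 0.1 (illustrative resolution).
levels = [i / 10 for i in range(-10, 11)]
points = [(a, b) for a in levels for b in levels]

best_mean = min(points, key=lambda p: mean_hat(*p))
best_var = min(points, key=lambda p: var_hat(*p))

# Sensitivity check with different (hypothetical) environmental variances.
best_var_alt = min(points, key=lambda p: var_hat(*p, var_z1=0.5, var_z3=2.0))

def min_variance_subject_to_mean(mean_limit):
    """Smallest-variance grid point whose estimated mean is at most mean_limit."""
    feasible = [p for p in points if mean_hat(*p) <= mean_limit]
    if not feasible:
        return None
    return min(feasible, key=lambda p: var_hat(*p))

compromise = min_variance_subject_to_mean(40.0)  # 40.0 is a hypothetical limit

print("minimum mean at", best_mean, "mean =", round(mean_hat(*best_mean), 2))
print("minimum variance at", best_var, "variance =", round(var_hat(*best_var), 3))
print("compromise setting", compromise)
```

Because the two squared terms in equation (41) are each minimized separately at the edge of the region, the variance-minimizing setting does not depend on the particular positive values assumed for the environmental variances, which is consistent with the insensitivity noted in the text.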
2.4 SPLIT-PLOT DESIGNS FOR ROBUST DESIGN

In Sections 2.2 and 2.3 we considered the application of response surface methodology to the investigation of the robustness of a product or process to environmental variation. The response surface designs discussed in those sections are appropriate if all of the experimental runs can be conducted independently so that the experiment is completely randomized. This section will consider the application of an alternative class of experimental designs, called split-plot designs, to the study of robustness to environmental variation. A characteristic of these designs is that, unlike the response surface designs, there is restricted randomization of the experiment. Section 2.4.1 contains a brief description of split-plot designs and describes several alternative split-plot arrangements. In Section 2.4.2 the precision of the estimates from these split-plot arrangements is considered. Section 2.4.3 contains a discussion of variants of the standard split-plot arrangements. In Section 2.4.4 there is a discussion of the analysis of split-plot designs and a comparison of the design and analysis of split-plot experiments with the design and analysis methods proposed by Taguchi.
Figure 2.6 Contour plot of the mean response surface

Figure 2.7 Contour plot of the variance response surface
2.4.1 Overview of split-plot designs

Split-plot designs occur in a wide range of applications of experimental design. One application area for split-plot designs is when there are some variables that can be applied only to experimental units that are larger than the units to which the other variables can be applied. As an example, consider an investigation on a heat treat process to determine if the temperature of the quench bath, whether parts are stacked horizontally or vertically, and two different types of fixture, affect warpage in metal castings. Suppose the bath can hold eight parts. Then within each bath we can test the effect of stacking and fixture, for example by running 8 parts according to a replicated $2^2$ design in those two factors. We would run that same set-up for quench baths of different temperatures. To investigate the effect of temperature we need larger experimental units (baths), but to investigate the effect of stacking and fixture we can use smaller experimental units (parts).

A second application area is when there is a variable that is difficult or expensive to change and so the randomization is restricted to limit the number of changes of that variable. This is accomplished by conducting the experiment in blocks with the restricted variable held constant within a block, but changed randomly between blocks. In this case the large experimental units are the blocks and the smaller experimental units are the individual runs within a block.

An excellent exposition of split-plot experimental designs can be found in D.R. Cox's book, "Planning of Experiments" [42]. He states that split-plot designs are particularly useful when one (or more) factors are what he calls classification factors. These factors are included in the experiment to determine whether they modify the effect of the other factors or indicate how the other factors work. The classification factors are included to examine their possible interaction with the other factors.
Lower precision is tolerated for comparisons of the classification factors, in order that the precision of the other factors and the interactions can be increased. In the standard terminology associated with split-plot experiments, the classification factors are called whole-plot factors and are applied to the larger experimental units. The smaller experimental units are called subplots. In the following subsections several alternative experimental arrangements of split-plot experiments will be considered. The tablet formulation data given in the example of Table 2.1 in Section 2.1.1 will be
used to illustrate the applicability of these split-plot experiments to designing robust products and processes. Recall that in that example an experiment is to be run with three constituents of the tablet formulation, the design variables, which will be denoted as A, B, and C, and two environmental variables, storage temperature and humidity. The three design variables are arranged in a $2^3$ factorial design which is crossed with a $3^2$ factorial design containing the environmental variables. A set of hypothetical data was shown in Table 2.1. The objective of the experiment is to determine a combination of the design variables that will yield high values for the response across the ranges of temperature and humidity studied in the experiment. The same set of data will be used in the following subsections to illustrate the different analyses for the alternative split-plot designs. Clearly, in practice the correct analysis of the experiment will depend on the particular experimental arrangement that was adopted.
Design (I): environmental factors as whole-plot factors

Using Cox's concept of classification factors, it seems most reasonable to have the classification factors, that is the whole-plot factors, associated with the environmental variables, since they are in fact included primarily to examine their possible interaction with the design variables. Thus, the first arrangement considered is one in which the whole plots contain the environmental variables and the subplots contain the design variables. Now, suppose that there are m levels of the environmental variables, $E_1, E_2, \ldots, E_j, \ldots, E_m$, applied to the whole plots, that there are n levels of the design variables, $D_1, D_2, \ldots, D_i, \ldots, D_n$, applied to the subplots, and that there are l replicates, $r_1, r_2, \ldots, r_k, \ldots, r_l$, with the whole plots in l randomized blocks. For the tablet formulation example given in Table 2.1 in Section 2.1.1, the environmental variables are temperature and humidity that are varied in a climate-controlled chamber and m is 9, the design variables are the quantities of A, B, and C in the tablet formulation and n is 8, and since there is only one replicate l is 1. For the tablet formulation example of Table 2.1, this split-plot arrangement would require m × n × l = 9 × 8 × 1 = 72 tablet formulation batches to be made but only m × l = 9 operations of the climate chamber. The experiment would be conducted by placing in the climate chamber a complete set of 8 different tablet formulations at the same time. A completely randomized experiment (the cross-product experiment of
Taguchi) with no replication would require not only 72 tablet batches, but also 72 operations of the chamber. It is clear, therefore, that this experimental arrangement can be considerably easier to run than the completely randomized cross-product design. The model for arrangement (I) is

$$y_{ijk} = \mu + r_k + E_j + h_{jk} + D_i + (DE)_{ij} + e_{ijk}, \tag{42}$$

where $y_{ijk}$ is the response of the kth replicate of the ith level of factor D and the jth level of factor E, $\mu$ is the overall mean, $r_k$ is the random effect of the kth replicate, with $r_k \sim N(0, \sigma_r^2)$, $E_j$ is the fixed effect of the jth level of E, $D_i$ is the fixed effect of the ith level of D, $(DE)_{ij}$ is the interaction effect of the ith level of D with the jth level of E, $h_{jk} \sim N(0, \sigma_w^2)$ is the whole-plot error, $e_{ijk} \sim N(0, \sigma_s^2)$ is the subplot error, and $h_{jk}$ and $e_{ijk}$ are independent.

The ANOVA table is shown in Table 2.16. In this table $\hat{E}_j$, $\hat{D}_i$, and $\widehat{DE}_{ij}$ are estimates of $E_j$, $D_i$, and $(DE)_{ij}$ respectively. It can be seen from the ANOVA table that the sources of variation split into two parts, those coming from the whole plots (Env and R×E) and those coming from the subplots (Design, D×E, and Error). In this case the mean square for Env would be tested against that of R×E, and the mean square for Design and for D×E would be tested against that of Error.

Now suppose that there is no replication. Then to test Env, Design, and the interaction D×E, estimates of $\sigma_s^2$ and $\sigma_s^2 + n\sigma_w^2$ would be required. One possibility is to construct two normal plots, one for whole-plot and one for subplot contrasts, and to pick out as active contrasts those that fall away from a line (see Daniel [43]). Alternatively, if the design and the environmental factors are factorial combinations it may be possible to assume that higher-order interactions are negligible. If this assumption is reasonable then the whole-plot error can be estimated by pooling the higher-order interactions among the environmental variables, and the subplot error can be estimated by pooling the higher-order interactions among the design factors and between the design and the environmental factors. For example, for the tablet formulation data of Table 2.1, assuming that all contrasts involving three or more factors are estimating error, the following ANOVA table (see Table 2.17) is obtained.
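TABLE 2.16 ANOVA TABLE FOR ARRANGEMENT (I)

Source        df               Sum of Squares                                               Expected Mean Squares
Reps (R)      l-1                                                                           $mn\sigma_r^2 + \sigma_s^2 + n\sigma_w^2$
Env (E)       m-1              $nl\sum_{j=1}^{m}\hat{E}_j^2$                                $\frac{nl}{m-1}\sum_{j=1}^{m}E_j^2 + \sigma_s^2 + n\sigma_w^2$
R×E           (l-1)(m-1)                                                                    $\sigma_s^2 + n\sigma_w^2$
Design (D)    n-1              $lm\sum_{i=1}^{n}\hat{D}_i^2$                                $\frac{lm}{n-1}\sum_{i=1}^{n}D_i^2 + \sigma_s^2$
D×E           (n-1)(m-1)       $l\sum_{i=1}^{n}\sum_{j=1}^{m}\widehat{DE}_{ij}^2$           $\frac{l}{(n-1)(m-1)}\sum_{i=1}^{n}\sum_{j=1}^{m}(DE)_{ij}^2 + \sigma_s^2$
Error         (l-1)m(n-1)                                                                   $\sigma_s^2$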
For the subplot analysis it appears that the effects due to A, and the interaction between B and Humidity are real, with some evidence of an interaction between B and Temperature. It is possible to split the two degrees of freedom for Temperature and Humidity into linear and quadratic contrasts and to construct a normal probability plot for the whole plot contrasts. This would reveal important effects due to the linear components of both Temperature and Humidity.
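The subplot tests behind these conclusions can be checked from the sums of squares in Table 2.17. The following Python sketch is illustrative only: the helper name `f_ratio` and the choice of three rows are assumptions made here, not part of the original analysis. Each effect mean square is divided by the pooled subplot error mean square (2759.5 on 45 degrees of freedom) and compared with the critical values quoted beneath the table.

```python
# Subplot F-tests for arrangement (I), using values quoted in Table 2.17.
# The helper name and the selection of rows are illustrative assumptions.

POOLED_ERROR_SS = 2759.5   # pooled higher-order interactions
POOLED_ERROR_DF = 45

# (sum of squares, degrees of freedom) for a few subplot effects
EFFECTS = {
    "A": (938.9, 1),
    "BxT": (399.0, 2),
    "BxH": (799.1, 2),
}

# Critical values F_{df,45}(alpha) as quoted beneath Table 2.17
CRITICAL = {
    1: {0.05: 4.06, 0.01: 7.23},
    2: {0.05: 3.20, 0.01: 5.11},
}


def f_ratio(ss, df, error_ss=POOLED_ERROR_SS, error_df=POOLED_ERROR_DF):
    """Return the effect mean square divided by the error mean square."""
    return (ss / df) / (error_ss / error_df)


results = {name: round(f_ratio(ss, df), 2) for name, (ss, df) in EFFECTS.items()}

for name, (ss, df) in EFFECTS.items():
    f = results[name]
    if f > CRITICAL[df][0.01]:
        verdict = "exceeds the 1% point"
    elif f > CRITICAL[df][0.05]:
        verdict = "exceeds the 5% point"
    else:
        verdict = "below the 5% point"
    print(f"{name:4s} F = {f:5.2f}  ({verdict})")
```

With these inputs A and BxH exceed their 1% points while BxT only just exceeds its 5% point, which matches the reading given above.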
Design (II): design factors as whole-plot factors

An alternative arrangement to design (I) would be to have the design variables as the whole-plot factors, with the subplots containing the environmental variables. As before, suppose that there are $m$ levels of the environmental variables, $E_1, E_2, \ldots, E_j, \ldots, E_m$, that there are $n$ levels of the design variables, $D_1, D_2, \ldots, D_i, \ldots, D_n$, and that there are $l$ replicates, $r_1, r_2, \ldots, r_k, \ldots, r_l$, with the whole plots in $l$ randomized blocks. The contrast with arrangement (I) is that in this arrangement the environmental variables are applied to the subplots, and not the whole plots, whereas the design variables are applied to the whole plots, not the subplots. For the tablet formulation example, this experimental arrangement would arise if eight tablet formulation batches were made according to the eight different design combinations and each of these batches were
divided into nine sub-batches, and each of the 72 sub-batches were placed individually in the chamber for the appropriate setting of temperature and humidity. This would require only $n \times l = 8$ tablet formulation batches to be made but would require $m \times n \times l = 72$ operations of the chamber. A completely randomized experiment in which there was no replication would have required 72 tablet batches and also 72 operations of the chamber. It is clear, therefore, that this experimental arrangement can be considerably easier to run than the completely randomized design.

TABLE 2.17 ANOVA FOR THE TABLET FORMULATION DATA OF TABLE 2.1 FOR ARRANGEMENT (I)

| Source | df | SS | MS | F-ratio |
|---|---|---|---|---|
| Whole Plot (Env) | | | | |
| Temp(T) | 2 | 1204.1 | 602.1 | |
| Humidity(H) | 2 | 1199.4 | 599.7 | |
| TxH | 4 | 350.5 | 87.6 | |
| Design | | | | |
| A | 1 | 938.9 | 938.9 | 15.31 |
| B | 1 | 60.5 | 60.5 | 0.99 |
| C | 1 | 144.5 | 144.5 | 2.36 |
| AxB | 1 | 24.5 | 24.5 | 0.40 |
| AxC | 1 | 40.5 | 40.5 | 0.66 |
| BxC | 1 | 18.0 | 18.0 | 0.29 |
| Design x Env | | | | |
| AxT | 2 | 62.1 | 31.1 | 0.51 |
| AxH | 2 | 17.0 | 8.5 | 0.14 |
| BxT | 2 | 399.0 | 199.5 | 3.25 |
| BxH | 2 | 799.1 | 399.5 | 6.52 |
| CxT | 2 | 65.3 | 32.7 | 0.53 |
| CxH | 2 | 97.6 | 48.8 | 0.66 |
| Higher Order Interactions | 45 | 2759.5 | 61.3 | |

$F_{1,45}(.05) = 4.06$, $F_{2,45}(.05) = 3.20$, $F_{1,45}(.01) = 7.23$, $F_{2,45}(.01) = 5.11$

The model for experimental arrangement (II) is

$$Y_{ijk} = m + r_k + D_i + q_{ik} + E_j + (DE)_{ij} + e_{ijk}, \tag{43}$$

where, as before, $Y_{ijk}$ is the response of the $k$th replicate of the $i$th level of factor D and the $j$th level of factor E, $m$ is the overall mean, $r_k$ is the
random effect of the $k$th replicate, with $r_k \sim N(0, \sigma_r^2)$, $E_j$ is the fixed effect of the $j$th level of E, $D_i$ is the fixed effect of the $i$th level of D, $(DE)_{ij}$ is the interaction effect of the $i$th level of D with the $j$th level of E, and $e_{ijk} \sim N(0, \sigma_s^2)$ is the subplot error. In arrangement (II), $q_{ik} \sim N(0, \sigma_w^2)$ is the whole-plot error, and $q_{ik}$ and $e_{ijk}$ are independent.

The ANOVA table is shown in Table 2.18. This table shows that the sources of variation can be split into two parts, those coming from the whole plots (Design and RxD) and those coming from the subplots (Env, DxE, and Error). In this case the mean square for Design would be tested against that of RxD, and the mean squares for Env and for DxE would be tested against that of Error.

TABLE 2.18 ANOVA TABLE FOR ARRANGEMENT (II)

| Source | df | Sum of Squares | Expected Mean Squares |
|---|---|---|---|
| Reps(R) | $l-1$ | | $mn\sigma_r^2 + \sigma_s^2 + m\sigma_w^2$ |
| Design(D) | $n-1$ | $lm\sum_{i=1}^{n}\hat{D}_i^2$ | $\frac{lm}{n-1}\sum_{i=1}^{n}D_i^2 + \sigma_s^2 + m\sigma_w^2$ |
| RxD | $(l-1)(n-1)$ | | $\sigma_s^2 + m\sigma_w^2$ |
| Env(E) | $m-1$ | $nl\sum_{j=1}^{m}\hat{E}_j^2$ | $\frac{nl}{m-1}\sum_{j=1}^{m}E_j^2 + \sigma_s^2$ |
| DxE | $(n-1)(m-1)$ | $l\sum_{i=1}^{n}\sum_{j=1}^{m}\widehat{DE}_{ij}^2$ | $\frac{l}{(n-1)(m-1)}\sum_{i=1}^{n}\sum_{j=1}^{m}(DE)_{ij}^2 + \sigma_s^2$ |
| Error | $(l-1)n(m-1)$ | | $\sigma_s^2$ |
As has already been pointed out, of course, the constructed data would only be appropriate for the model that reflected the way in which the experiment was carried out. However, to illustrate the analysis the same data from Table 2.1 will be used. The ANOVA table for the tablet formulation data, assuming that the experiment was run according to arrangement (II), is shown in Table 2.19. For this arrangement, higher-order interactions are assumed to be negligible and their sums of squares are pooled to give an estimate of error. From the ANOVA table it appears that in the sub-plot analysis there
are real effects due to both Temperature and Humidity as well as the interaction between B and Humidity. Further analysis would reveal that both the Temperature and Humidity effects are predominantly linear rather than quadratic. A normal plot of the whole-plot contrasts involving the design variables could be constructed and, in this example, would indicate the importance of factor A.
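The normal-plot idea for the whole-plot design contrasts can be sketched numerically. The Python fragment below is a minimal illustration, not the author's procedure: it takes the single-degree-of-freedom sums of squares for the design contrasts from Table 2.19 and assumes (an assumption made here) that each two-level contrast is estimated from all 72 observations, so that SS = N x effect^2 / 4. The plotting positions are one common convention for half-normal plots.

```python
# Half-normal plot coordinates for the seven whole-plot (design) contrasts
# of arrangement (II). Sums of squares are those quoted in Table 2.19.
# Assumption (not stated in the text): each two-level contrast uses all
# N = 72 observations, so that SS = N * effect**2 / 4.
from math import sqrt
from statistics import NormalDist

N_OBS = 72

SS = {"A": 938.9, "B": 60.5, "C": 144.5,
      "AxB": 24.5, "AxC": 40.5, "BxC": 18.0, "AxBxC": 72.0}

# Absolute effect implied by each single-degree-of-freedom sum of squares
abs_effect = {name: sqrt(4.0 * ss / N_OBS) for name, ss in SS.items()}

# Order the contrasts and pair them with half-normal quantiles
ordered = sorted(abs_effect.items(), key=lambda item: item[1])
k = len(ordered)
points = []
for rank, (name, effect) in enumerate(ordered, start=1):
    p = 0.5 + 0.5 * (rank - 0.5) / k       # plotting position
    quantile = NormalDist().inv_cdf(p)      # half-normal quantile
    points.append((name, effect, quantile))

for name, effect, quantile in points:
    print(f"{name:6s} |effect| = {effect:5.2f}   quantile = {quantile:4.2f}")
```

In the printed coordinates the six smaller contrasts rise gradually while A lies well above them, which is consistent with the statement above about factor A.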
TABLE 2.19 ANOVA FOR THE TABLET FORMULATION DATA OF TABLE 2.1 FOR ARRANGEMENT (II)

| Source | df | SS | MS | F-ratio |
|---|---|---|---|---|
| Whole-Plot (Design) | | | | |
| A | 1 | 938.9 | 938.9 | |
| B | 1 | 60.5 | 60.5 | |
| C | 1 | 144.5 | 144.5 | |
| AxB | 1 | 24.5 | 24.5 | |
| AxC | 1 | 40.5 | 40.5 | |
| BxC | 1 | 18.0 | 18.0 | |
| AxBxC | 1 | 72.0 | 72.0 | |
| Env | | | | |
| Temp(T) | 2 | 1204.1 | 602.1 | 9.86 |
| Humidity(H) | 2 | 1199.4 | 599.7 | 9.82 |
| TxH | 4 | 350.5 | 87.6 | 1.43 |
| Design x Env | | | | |
| AxT | 2 | 62.1 | 31.1 | 0.51 |
| AxH | 2 | 17.0 | 8.5 | 0.14 |
| BxT | 2 | 399.0 | 199.5 | 3.25 |
| BxH | 2 | 799.1 | 399.5 | 6.52 |
| CxT | 2 | 65.3 | 32.7 | 0.53 |
| CxH | 2 | 97.6 | 48.8 | 0.66 |
| Higher Order Interactions | 44 | 2687.5 | 61.1 | |

$F_{1,44}(.05) = 4.06$, $F_{2,44}(.05) = 3.21$, $F_{1,44}(.01) = 7.25$, $F_{2,44}(.01) = 5.12$

Design (III): strip-block design

Let us now consider an experimental arrangement where the subplot levels are assigned randomly in strips across each block of whole-plot levels. Such arrangements are frequently called strip-block designs. As an illustration of this arrangement, suppose that we have a whole-plot variable with three levels, $a_1$, $a_2$, and $a_3$, a subplot variable with two levels, $b_1$ and
$b_2$, and three blocks. Pictorially, the design is represented by the following figure (Figure 2.8).

Figure 2.8 Strip-block design (three blocks; within each block the whole-plot levels $a_1$, $a_2$, $a_3$ appear in random order and the subplot levels $b_1$, $b_2$ are applied in strips across them)

There is a certain symmetry in the allocation of the two variables, to the extent that it seems that both variables could equally well be designated as the whole-plot variable. Suppose, as before, that there are $m$ levels of the environmental variables, $E_1, E_2, \ldots, E_j, \ldots, E_m$, that there are $n$ levels of the design variables, $D_1, D_2, \ldots, D_i, \ldots, D_n$, and that there are $l$ replicates, $r_1, r_2, \ldots, r_k, \ldots, r_l$. For the tablet formulation example, the strip-block arrangement would arise if $n \times l = 8$ tablet formulation batches were made according to the design combinations and each of these batches were divided into $m \times l = 9$ sub-batches. One sub-batch from each of the eight batches is then selected at random and these eight are placed in the chamber at the same time at the appropriate setting of temperature and humidity. This design would require only $n = 8$ tablet formulation batches to be made and only $m = 9$ operations of the chamber. Strip-block experiments, such as the one described in this section, are clearly considerably easier to run than both the completely randomized product design and the two split-plot designs described above, that is, arrangements (I) and (II). The model appropriate for the strip-block arrangement is
$$Y_{ijk} = m + r_k + E_j + h_{jk} + D_i + q_{ik} + (DE)_{ij} + e_{ijk}, \tag{44}$$

where, as before, $Y_{ijk}$ is the response of the $k$th replicate of the $i$th level of factor D and the $j$th level of factor E, $m$ is the overall mean, $r_k$ is the random effect of the $k$th replicate, with $r_k \sim N(0, \sigma_r^2)$, $E_j$ is the fixed effect of the $j$th level of E, $D_i$ is the fixed effect of the $i$th level of D, and $(DE)_{ij}$ is the interaction effect of the $i$th level of D with the $j$th level of E. In arrangement
(III), $h_{jk} \sim N(0, \sigma_E^2)$, $q_{ik} \sim N(0, \sigma_D^2)$, $e_{ijk} \sim N(0, \sigma^2)$, and $h_{jk}$, $q_{ik}$, and $e_{ijk}$ are independent. The ANOVA table is shown in Table 2.20. This table shows that the sources of variation can be split into three parts. The mean square for the environmental variables is tested against that of RxE; the mean square for the design variables is tested against that of RxD; and the mean square for the design x environment interactions is tested against that of RxDxE.

When there is no replication and the design is sufficiently large, three normal plots can be constructed: one for the design variable contrasts, one for the environmental variable contrasts, and one for the design x environment interaction contrasts. Alternatively, $\sigma^2 + n\sigma_E^2$ could be estimated by pooling the higher-order interactions among the environmental variables, $\sigma^2 + m\sigma_D^2$ by pooling the higher-order interactions among the design variables, and $\sigma^2$ by pooling the higher-order interactions among the design x environment interactions.

TABLE 2.20 ANOVA TABLE FOR ARRANGEMENT (III)

| Source | df | Sum of Squares | Expected Mean Squares |
|---|---|---|---|
| Reps(R) | $l-1$ | | |
| Env(E) | $m-1$ | $nl\sum_{j=1}^{m}\hat{E}_j^2$ | $\frac{nl}{m-1}\sum_{j=1}^{m}E_j^2 + \sigma^2 + n\sigma_E^2$ |
| RxE | $(l-1)(m-1)$ | | $\sigma^2 + n\sigma_E^2$ |
| Design(D) | $n-1$ | $lm\sum_{i=1}^{n}\hat{D}_i^2$ | $\frac{lm}{n-1}\sum_{i=1}^{n}D_i^2 + \sigma^2 + m\sigma_D^2$ |
| RxD | $(l-1)(n-1)$ | | $\sigma^2 + m\sigma_D^2$ |
| DxE | $(n-1)(m-1)$ | $l\sum_{i=1}^{n}\sum_{j=1}^{m}\widehat{DE}_{ij}^2$ | $\frac{l}{(n-1)(m-1)}\sum_{i=1}^{n}\sum_{j=1}^{m}(DE)_{ij}^2 + \sigma^2$ |
| RxDxE | $(l-1)(m-1)(n-1)$ | | $\sigma^2$ |
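The choice of denominators for arrangement (III) follows from the expected mean squares. As a brief worked sketch (the notation $MS$ for a mean square and $\mathbb{E}$ for expectation is introduced here for illustration, with $\sigma_E^2$, $\sigma_D^2$, and $\sigma^2$ denoting the variances of $h_{jk}$, $q_{ik}$, and $e_{ijk}$), under each null hypothesis the fixed-effect term vanishes, so numerator and denominator share the same expectation:

```latex
\begin{align*}
E_j = 0 \text{ for all } j
  &\;\Rightarrow\; \mathbb{E}[MS_{E}] = \sigma^2 + n\sigma_E^2 = \mathbb{E}[MS_{R\times E}],
  & F_{E} &= \frac{MS_{E}}{MS_{R\times E}}, \\
D_i = 0 \text{ for all } i
  &\;\Rightarrow\; \mathbb{E}[MS_{D}] = \sigma^2 + m\sigma_D^2 = \mathbb{E}[MS_{R\times D}],
  & F_{D} &= \frac{MS_{D}}{MS_{R\times D}}, \\
(DE)_{ij} = 0 \text{ for all } i, j
  &\;\Rightarrow\; \mathbb{E}[MS_{D\times E}] = \sigma^2 = \mathbb{E}[MS_{R\times D\times E}],
  & F_{D\times E} &= \frac{MS_{D\times E}}{MS_{R\times D\times E}}.
\end{align*}
```

Each ratio is referred to an F distribution with the degrees of freedom of its numerator and denominator rows, for example $m-1$ and $(l-1)(m-1)$ for the environmental variables.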
To illustrate the analysis the same data given in Table 2.1 will be used, assuming that the experiment was conducted as a strip-block design. The ANOVA table for the tablet formulation data is given in Table 2.21.
As was discussed with arrangement (I), it is possible to split the two degrees of freedom for Temperature and Humidity into linear and quadratic contrasts and to construct a normal probability plot for the environmental variable contrasts. This would reveal important effects due to the linear components of both Temperature and Humidity. A normal plot for the design contrasts would indicate that there appears to be a real effect due to A. The analysis of the design x environment interactions is obtained by pooling together higher-order interactions to obtain an estimate of the error term $\sigma^2$. This analysis indicates that the interaction between B and Humidity appears to be the only real interaction effect.
TABLE 2.21 ANOVA FOR THE TABLET FORMULATION DATA OF TABLE 2.1 FOR ARRANGEMENT (III)

| Source | df | SS | MS | F-ratio |
|---|---|---|---|---|
| Temp(T) | 2 | 1204.1 | 602.1 | |
| Humidity(H) | 2 | 1199.4 | 599.7 | |
| TxH | 4 | 350.5 | 87.6 | |
| A | 1 | 938.9 | 938.9 | |
| B | 1 | 60.5 | 60.5 | |
| C | 1 | 144.5 | 144.5 | |
| AxB | 1 | 24.5 | 24.5 | |
| AxC | 1 | 40.5 | 40.5 | |
| BxC | 1 | 18.0 | 18.0 | |
| AxBxC | 1 | 72.0 | 72.0 | |
| AxT | 2 | 62.1 | 31.1 | 0.51 |
| AxH | 2 | 17.0 | 8.5 | 0.14 |
| BxT | 2 | 399.0 | 199.5 | 3.25 |
| BxH | 2 | 799.1 | 399.5 | 6.52 |
| CxT | 2 | 65.3 | 32.7 | 0.53 |
| CxH | 2 | 97.6 | 48.8 | 0.66 |
| Higher Order Interactions | 44 | 2687.5 | 61.1 | |

$F_{2,44}(.05) = 3.21$, $F_{2,44}(.01) = 5.12$
2.4.2 Precision of split-plot designs

The above descriptions of alternative split-plot arrangements show that the ease of experimentation can vary depending on the particular experimental design followed. However, in considering which experimental design to adopt, the investigator should weigh many other criteria besides ease of experimentation. One criterion of importance is the precision of the estimates of the effects that these arrangements yield. It can be shown (see, for example, Kempthorne [18], Box and Jones [5]) that for both designs (I) and (II) the whole-plot effects are determined less precisely than with the cross-product design, but that the subplot effects and the design x environment interactions are determined more precisely. For the strip-block design both the design and the environmental variable effects are determined less precisely than with the cross-product design, but the design x environment interactions are determined more precisely. When the strip-block design is compared with split-plot designs (I) and (II), the whole-plot effects are determined with the same precision, the sub-plot effects are determined with less precision, but the design x environment interactions are determined with more precision.

2.4.3 Variants of split-plot designs

Adaptations of the split-plot methodology have been suggested by many authors (see, for example, Kempthorne [18], Cochran and Cox [44]). These authors describe various blocking arrangements to control for other sources of variation in split-plot experiments. The relevance of some of these arrangements to split-plot designs that investigate the influence of environmental variation is discussed in Box and Jones [5]. One of the most useful adaptations of the split-plot design is the confounding of higher-order split-plot interactions when the split-plot treatments are in a factorial design, first suggested in Bartlett [45].
Such an experimental design requires fewer sub-plots within each whole plot, but still enables the required effects to be estimated. When the whole-plot design is a factorial design then it may be possible to reduce the number of whole plots required by confounding certain whole-plot interactions. The use of factorial and fractional factorial designs in split-plot arrangements has been investigated by Addelman [46], see also Daniel [47]. As an example of such an arrangement, consider a tablet formulation experiment with two environmental variables, temperature (T) and humidity (H), and five design variables, A, B, C, D, and E; with all of the
variables at two levels. Suppose that it has been decided that the environmental variables will be assigned to the whole plots and the design variables assigned to the sub-plots (design arrangement (I)). Suppose that the chamber can hold no more than 20 batches of tablets at a time. With this constraint it is no longer possible to use a full factorial for the design factors within each whole plot (run of the chamber) since this would require $2^5 = 32$ tablet batches for each run of the chamber. An alternative design would be to use a half-fraction of the design variables for each run of the chamber. Such a design, before randomization, is shown in Table 2.22. With this design the ABCDE five-factor interaction is confounded with the TxH whole-plot contrast. Under the assumption of negligible three-factor and higher-order interactions, all main effects and two-factor interactions can be estimated, as well as interactions between the design and the environmental variables.
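A design with this confounding pattern can be generated mechanically. The Python sketch below is illustrative and rests on an assumption made here: each chamber run receives the half-fraction whose ABCDE product equals the sign of T x H for that run (the opposite convention would serve equally well). The function name `half_fraction` is hypothetical, and the listing is in systematic order, that is, before randomization.

```python
# Sketch of a split-plot design in which each chamber run (whole plot)
# receives a half-fraction of the 2^5 design in A, B, C, D, E.
# Assumption: the half-fraction with ABCDE = T*H is assigned to each run.
from itertools import product
from math import prod

DESIGN_FACTORS = "ABCDE"


def half_fraction(sign):
    """All 2^5 sign combinations whose product ABCDE equals `sign`."""
    return [combo
            for combo in product((-1, 1), repeat=len(DESIGN_FACTORS))
            if prod(combo) == sign]


# Whole plots: the four (temperature, humidity) settings of the chamber
chamber_runs = [(-1, -1), (1, -1), (-1, 1), (1, 1)]

design = {}
for temp, humidity in chamber_runs:
    design[(temp, humidity)] = half_fraction(temp * humidity)

for (temp, humidity), subplots in design.items():
    print(f"T={temp:+d} H={humidity:+d}: {len(subplots)} tablet batches")
```

Each run then holds 16 batches, within the stated limit of 20, and every one of the 32 design combinations appears in exactly two of the four runs.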
TABLE 2.22 EXAMPLE OF A SPLIT-PLOT DESIGN USING A FRACTIONAL FACTORIAL

| | Chamber Run 1 | Chamber Run 2 | Chamber Run 3 | Chamber Run 4 |
|---|---|---|---|---|
| Temp | - | + | - | + |
| Humidity | - | - | + | + |
| Sub-plot columns | A B C D E | A B C D E | A B C D E | A B C D E |

Under each chamber run the table lists the + and - settings of A, B, C, D, and E for the tablet batches in that run, which form a half-fraction of the $2^5$ factorial.

2.4.4 Analysis of split-plot designs for robust experimentation

The appropriate analysis of data obtained from an experiment should be determined by the experimental design used to obtain those data. The fundamental characteristic of split-plot designs is that there are experimental units of different sizes and consequently multiple sources of variation. The analysis needs to take account of this structure, include multiple error terms, and test the significance of effects and interactions against the appropriate error term. This has been illustrated above with the three experimental arrangements for split-plot and strip-block designs. If there is replication of the experiment then independent estimates of the error terms can be calculated and valid statistical tests, such as those of ANOVA, can be constructed.

In split-plot and strip-block designs that are unreplicated, there is no independent estimate of the appropriate error terms available. Several alternative analysis approaches have been advocated. A number of authors, for example Mason, Gunst, and Hess [48] (p. 370), suggest estimating the error terms by combining higher-order interactions that are assumed to be negligible. An alternative is to construct separate normal or half-normal probability plots for the effects and interactions calculated from the different types of experimental units. Then, under the assumption of effect sparsity, the slopes of the lines from the inert effects can be used to estimate the separate error terms. A further approach, which also depends on the assumption of effect sparsity, suggested by Box and Jones [5], would be to construct separate Bayesian probability plots (Box and Meyer [49]) for the contrasts from different types of experimental units.

The split-plot arrangement has similarities with Taguchi's crossed designs since both arrangements divide the factors into two groups; in the split-plot terminology the factors are assigned to either whole plots or subplots, whereas in Taguchi's terminology the factors are assigned to either an inner (design) array or an outer (noise) array. Although there are similarities in the appearance of the designs, there are marked differences in the analysis of these designs. Some of these differences reflect different philosophical approaches to data analysis. Taguchi's analysis of robust design experiments is frequently conducted in terms of a performance statistic, such as a signal-to-noise ratio, that is calculated for each point of the design array using data obtained from the environmental (noise) array about that point.
The use of a single-valued summary statistic, such as a signal-to-noise ratio, has been considered in Box [30]. One of the criticisms of these signal-to-noise ratios is that they can obscure information that is contained in the data and thus limit the impact that the experimenter can have as they study the data. From an analysis of a signal-to-noise ratio, the experimenter does not know which of the environmental variables have significant interactions with the design variables and the magnitude of those interactions. This information, if it were available, might suggest ideas to the experimenter as to the underlying scientific theory which might influence the future course of the investigation.
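The point can be illustrated with a small numerical sketch. The responses below are hypothetical and are not the data of Table 2.1; the signal-to-noise ratio is taken in one commonly used form, 10 log10(mean^2 / variance), which is an assumption here since the text does not fix a particular ratio. The function names are likewise illustrative.

```python
from math import log10
from statistics import mean, variance

# Hypothetical responses (NOT the data of Table 2.1): for each setting of a
# design factor B, one observation at each point of a 2x2 noise array in
# temperature (T) and humidity (H), keyed by (T, H).
responses = {
    -1: {(-1, -1): 106.0, (1, -1): 106.0, (-1, 1): 94.0, (1, 1): 94.0},
    +1: {(-1, -1): 101.0, (1, -1): 101.0, (-1, 1): 99.0, (1, 1): 99.0},
}


def signal_to_noise(values):
    """One commonly used form: 10 * log10(mean^2 / sample variance)."""
    values = list(values)
    return 10.0 * log10(mean(values) ** 2 / variance(values))


def noise_effect(cell, index):
    """Mean at the + level minus mean at the - level of T (0) or H (1)."""
    high = [y for levels, y in cell.items() if levels[index] == 1]
    low = [y for levels, y in cell.items() if levels[index] == -1]
    return mean(high) - mean(low)


sn = {b: signal_to_noise(cell.values()) for b, cell in responses.items()}
effects = {b: {"T": noise_effect(cell, 0), "H": noise_effect(cell, 1)}
           for b, cell in responses.items()}

for b in responses:
    print(f"B={b:+d}: S/N = {sn[b]:5.2f} dB, "
          f"T effect = {effects[b]['T']:+.1f}, H effect = {effects[b]['H']:+.1f}")
```

In this constructed example the ratio correctly ranks B = +1 as the more stable setting, but only the separate temperature and humidity effects show that all of the sensitivity comes from humidity and that it is the B x humidity interaction that reduces it.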
Figure 2.9 Interaction of factor B and humidity in data from Table 2.1 (horizontal axis: Humidity at -1, 0, +1; vertical axis scaled from 90 to 110; separate lines for B = +1 and B = -1)
As was discussed in Section 2.3.4, the effects of the environmental variables on the design variables can be determined by studying the interactions between the design and environmental variables. To illustrate this, for the tablet formulation with the split-plot design arrangement (I) it was concluded that there was a significant interaction between humidity and factor B (see Table 2.17). This interaction is illustrated in Figure 2.9. From this figure it can be seen that by using the +1 setting of B the tablet is less sensitive to changes in humidity. This information could be important to the subject matter specialist who might know of similar
constituents that could be included in the tablet formulation to make it even more robust to changes in the humidity. It is the investigation of these design x environment interactions that is the key to understanding robustness and that could lead to new aspects in the investigation and to significant improvements in the robustness of products. Therefore, a preferred analysis is one that identifies the significance and magnitude of individual design x environment interactions, rather than an analysis in terms of a signal-to-noise ratio or a standard deviation that would obscure information present in the individual interactions.

We have seen that split-plot and strip-block designs can be considerably more convenient to conduct than the cross-product designs advocated by Taguchi. In particular, since in robust designs we are not specifically interested in the main effects of the environmental variables and would be prepared to accept lower precision in our estimates of these main effects, it would generally be more appropriate to have the environmental variables as whole-plot factors, as in arrangement (I). Alternatively, a strip-block design, arrangement (III), might be the most convenient design and would yield precise estimates of the key design x environment interactions.

With regard to analysis, since Taguchi's analysis is frequently conducted in terms of a single performance measure, such as a signal-to-noise ratio, that is calculated for each point of the design array, it ignores any information that might be contained in particular design x environment interactions. It is these interactions that are key to understanding the robustness of product designs. In contrast, the split-plot and strip-block designs enable efficient estimation of these interactions. Therefore, split-plot designs and strip-block designs are of tremendous value in robust design experiments, since they permit the precise estimation of the interactions of interest and can be considerably easier to run than the cross-product designs that have traditionally been advocated.
2.5 CONCLUSIONS

The concept of designing products and processes that are robust, or stable, to environmental variation is clearly very important. Robust design enables the experimenter to discover how to modify the design of a product or process to minimize the effects due to variation from environmental sources that are difficult, if not impossible, to control.
In this chapter the use of statistical experimental designs in designing products and processes to be robust to environmental conditions has been considered. The focus has been on two classes of experimental design, response surface designs and split-plot designs. The choice of an appropriate experimental design depends on the experimental circumstances. Box and Draper [12] (p. 502, 305) list a series of experimental circumstances that should be considered by the investigator when selecting a response surface design. Many of these considerations also apply to split-plot designs, and to experimental design in general. In the response surface strategy that was discussed in Section 2.3 standard response surface techniques are used to generate two response surface models, one for the mean response and one for the standard deviation of the response (or some function of the standard deviation). The standard deviation measures the stability of the response to the environmental variation. Standard analysis can reveal which factors affect the mean only, which only affect the variability, and which affect both the mean and the variability. The researcher can then apply optimization methods or construct contour plots of the mean and standard deviation response surfaces to determine settings of the design variables that will give a mean response that is close to the target with minimum variation. Taguchi's designs, a cross-product of two experimental designs, one for design variables and one for environmental variables, can require an excessive number of runs. In Section 2.3.3 it was shown how the number of runs can be substantially reduced by constructing a single experimental design that combined both the design variables and the environmental variables. The designs associated with response surface methodology offer considerable flexibility and can be built sequentially so that experimental resources can be used efficiently. 
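The two-model strategy summarized above can be sketched on a toy example. The observations below are hypothetical, chosen so that one factor moves only the mean and the other only the variability; the helper `first_order_fit` is an illustrative name and relies on the orthogonality of a $2^2$ design, so it is a sketch under those assumptions rather than a general fitting routine.

```python
from math import log
from statistics import mean, stdev

# Hypothetical observations (for illustration only): each point of a 2^2
# design in x1 and x2 is run under three environmental conditions.
observations = {
    (-1, -1): (88.0, 90.0, 92.0),
    (1, -1): (98.0, 100.0, 102.0),
    (-1, 1): (89.0, 90.0, 91.0),
    (1, 1): (99.0, 100.0, 101.0),
}

# Summaries across the environmental conditions at each design point
y_mean = {x: mean(y) for x, y in observations.items()}
y_logsd = {x: log(stdev(y)) for x, y in observations.items()}


def first_order_fit(summary):
    """Least-squares coefficients of b0 + b1*x1 + b2*x2 (orthogonal 2^2 design)."""
    n = len(summary)
    return {
        "b0": sum(summary.values()) / n,
        "x1": sum(x[0] * v for x, v in summary.items()) / n,
        "x2": sum(x[1] * v for x, v in summary.items()) / n,
    }


mean_model = first_order_fit(y_mean)
logsd_model = first_order_fit(y_logsd)
print("mean model:  ", mean_model)
print("log-sd model:", logsd_model)
```

Here x1 has a coefficient only in the mean model and x2 only in the log standard deviation model, so x2 would be set to reduce variability and x1 then used to bring the mean to target.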
The analysis proposed by Taguchi involves the construction of a signal-to-noise ratio that combines both the mean and the variability. Thus, with Taguchi's analysis there is a missed opportunity for a deeper understanding of the different variables that might affect the mean and the variability. As has been noted in this chapter, any restriction on the randomization of the experiment will lead the investigator to conduct one of the split-plot designs that were described in Section 2.4. In that section it was shown that the split-plot type designs can be a more efficient way to run robust design experiments than the cross-product arrays of Taguchi. Furthermore, the standard methods of analysis of split-plot experiments, that seek to
estimate individual design x environment interactions, will yield more information than the signal-to-noise ratios proposed by Taguchi.
ACKNOWLEDGEMENTS

The author expresses his appreciation to Denis Janky, David Rose, and Rod Tjoelker for their helpful comments on earlier versions of this chapter.
REFERENCES

[1] F. Yates and W.G. Cochran, The analysis of groups of experiments, Journal of Agricultural Science, 28 (1938) 556-580.
[2] G. Wernimont, Ruggedness evaluation of test procedures, Standardization News, 5 (1977) 13-16.
[3] W.J. Youden, Experimental design and ASTM committee, Materials Research and Standards, 1 (1961) 862-867. Reprinted in Precision Measurement and Calibration (Vol. 1, Special Publication 300), Gaithersburg, MD: National Bureau of Standards, 1969, ed. H.H. Ku.
[4] W.J. Youden, Physical measurement and experimental design. Reprinted in Precision Measurement and Calibration (Vol. 1, Special Publication 300, 1961), Gaithersburg, MD: National Bureau of Standards, 1969, ed. H.H. Ku.
[5] G.E.P. Box and S.P. Jones, Split-plot designs for robust product experimentation, Journal of Applied Statistics, 19 (1992) 3-26.
[6] G.E.P. Box and K.B. Wilson, On the experimental attainment of optimum conditions, Journal of the Royal Statistical Society, Series B, 13 (1951) 1-45.
[7] G.E.P. Box, The exploration and exploitation of response surfaces: some general considerations and examples, Biometrics, 10 (1954) 16-60.
[8] G.E.P. Box and P.V. Youle, The exploration and exploitation of response surfaces: an example of the link between the fitted surface and the basic mechanism of the system, Biometrics, 11 (1955) 287-323.
[9] G.E.P. Box, J.S. Hunter and W.G. Hunter, Statistics for Experimenters, New York, Wiley, 1978.
[10] J.A. Cornell, How to Apply Response Surface Methodology. Basic References in Quality Control: Statistical Techniques, Vol. 8. Milwaukee: American Society for Quality Control, 1985.
[11] R.H. Myers, Response Surface Methodology, Boston, Allyn and Bacon, 1971.
[12] G.E.P. Box and N.R. Draper, Empirical Model Building and Response Surfaces, New York, Wiley, 1986.
[13] A.I. Khuri and J.A. Cornell, Response Surfaces: Designs and Analyses, New York, Marcel Dekker, 1987.
[14] R.L. Plackett and J.P. Burman, The design of optimum multifactorial experiments, Biometrika, 33 (1946) 305-325.
[15] N.R. Draper and D.K.J. Lin, Projection properties of Plackett and Burman designs, Technometrics, 34 (1992) 423-428.
[16] M. Hamada and C.F.J. Wu, Analysis of designed experiments with complex aliasing, Journal of Quality Technology, 24 (1992) 130-137.
[17] G.E.P. Box and R.D. Meyer, Finding the active factors in fractional screening experiments, Journal of Quality Technology, 25 (1993) 94-105.
[18] O. Kempthorne, The Design and Analysis of Experiments, New York, Wiley, 1952.
[19] G.E.P. Box and D.W. Behnken, Some new three-level designs for the study of quantitative variables, Technometrics, 2 (1960) 455-475.
[20] G.E.P. Box and S.P. Jones, Robust product designs, part II: second-order models, Report No. 63 (1990), Center for Quality and Productivity Improvement, University of Wisconsin-Madison.
[21] G.E.P. Box and S.P. Jones, Robust product designs, part I: first-order models with design x environment interactions, Report No. 62 (1990), Center for Quality and Productivity Improvement, University of Wisconsin-Madison.
[22] A.C. Atkinson and A.N. Donev, Optimum Experimental Designs, New York, Oxford University Press, 1992.
[23] T.J. Mitchell, An algorithm for the construction of D-optimal experimental designs, Technometrics, 16 (1974) 203-210.
[24] G.E.P. Box, Choice of response surface design and alphabetic optimality, Utilitas Mathematica, 21B (1982) 11-55.
[25] H.O. Hartley, Smallest composite designs for quadratic response surfaces, Biometrics, 15 (1959) 611-624.
[26] P.W.M. John, Statistical Design and Analysis of Experiments, New York, Macmillan, 1971.
[27] R.A. Maclean and V.L. Anderson, Applied Factorial and Fractional Designs, New York, Marcel Dekker, 1984.
[28] D.H. Doehlert, Uniform shell designs, Journal of the Royal Statistical Society, Series C, 19 (1970) 231-239.
[29] M.S. Bartlett and D.G. Kendall, The statistical analysis of variance heterogeneity and the logarithm transformation, Journal of the Royal Statistical Society, Series B, 8 (1946) 128-150.
[30] G.E.P. Box, Signal-to-noise ratios, performance criteria, and transformations (with discussion), Technometrics, 30 (1988) 1-40.
[31] G.G. Vining and R.H. Myers, Combining Taguchi and response surface philosophies: a dual response approach, Journal of Quality Technology, 22 (1990) 38-45.
[32] R.H. Myers and W.H. Carter, Jr., Response surface techniques for dual response systems, Technometrics, 15 (1973) 301-317.
[33] A.E. Hoerl, Optimum solution of many variables equations, Chemical Engineering Progress, 55 (1959) 69-78.
[34] E. Del Castillo and D.C. Montgomery, A nonlinear programming solution to the dual response problem, Journal of Quality Technology, 25 (1993) 199-204.
[35] A.E. Freeny and V.N. Nair, Robust parameter design with uncontrolled noise variables, Statistica Sinica, 2 (1992).
[36] W.J. Welch, T.K. Yu, S.M. Kang and J. Sacks, Computer experiments for quality control by parameter design, Journal of Quality Technology, 22 (1990) 15-22.
[37] A.C. Shoemaker, K.L. Tsui and C.F.J. Wu, Economical experimentation methods for robust design, Technometrics, 33 (1991) 415-427.
[38] G.E.P. Box and S.P. Jones, Designing products that are robust to the environment, Total Quality Management, 3 (1992) 265-282.
[39] D.M. Steinberg and D. Bursztyn, Dispersion effects in robust-design experiments with noise factors, Journal of Quality Technology, 26 (1994) 12-20.
[40] R.H. Myers, A.I. Khuri and G.G. Vining, Response surface alternatives to the Taguchi robust parameter design approach, American Statistician, 46 (1992) 131-139.
[41] R.H. Myers, Response surface methodology in quality improvement, Communications in Statistics: Theory and Methods, 20 (1991) 457-476.
[42] D.R. Cox, Planning of Experiments, New York, Wiley, 1958.
[43] C. Daniel, Use of half-normal plots in interpreting factorial two-level experiments, Technometrics, 1 (1959) 311-341.
[44] W.G. Cochran and G.M. Cox, Experimental Design, New York, Wiley, 1957.
[45] M.S. Bartlett, Discussion of "Complex experiments", by F. Yates, Journal of the Royal Statistical Society, Series B, 2 (1935) 224-226.
[46] S. Addelman, Some two-level factorial plans with split-plot confounding, Technometrics, 6 (1964) 253-258.
[47] C. Daniel, Applications of Statistics to Industrial Experimentation, New York, Wiley, 1976.
[48] R.L. Mason, R.F. Gunst and J.L. Hess, Statistical Design and Analysis of Experiments, New York, Wiley, 1989.
[49] G.E.P. Box and R.D. Meyer, An analysis for unreplicated fractional factorials, Technometrics, 28 (1986) 11-18.