Design and Analysis of Experiments


Chapter 21

Design and Analysis of Experiments

"Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write."
H.G. Wells

21.1 INTRODUCTION

The design of experiments (DOE) is frequently used in process control to identify the factors that affect the quality of products and services. Through DOE, it is possible to improve process adjustment and product design, as well as to reduce the time needed to develop new products and processes.

Montgomery (2013) defines an experiment as a test or a series of tests in which intentional changes are made to the input variables of a process, so that the corresponding changes in the output variable can be observed and identified. The process presented in Fig. 21.1 transforms resources (inputs) into new products, goods, or services (outputs) for internal and external clients. Some process variables are controllable, while others are uncontrollable.

According to Montgomery (2013), the objectives of an experiment include:

1) Determining which input variables most influence the dependent (response) variable y;
2) Determining the set of x variables and their respective values so that y is as close as possible to the desired value;
3) Determining the set of x variables and their respective values so that the variability in y is as small as possible;
4) Determining the set of x variables and their respective values so that the effects of the uncontrollable variables z are minimized.

We will discuss the main terms used in the design of experiments based on Banzatto and Kronka (2006).

A treatment or factor is any method, element, or material that one wishes to measure, test, or assess in an experiment. An experiment may have more than one factor and more than one dependent (response) variable. Treatments or factors correspond to the model’s explanatory variables. An experiment has at least two treatments, which can be qualitative or quantitative. Examples of treatments include fertilizers, insecticides, different equipment to assess the noise level in the workplace, different equipment to measure thermal stress, different methods to assess body composition, soil treatments to evaluate the production of watermelon and melon, different types of products, different ages, different periods of time, etc.

The term experimental unit, on the other hand, corresponds to the unit, member, physical entity, or place to which the treatment is applied. It is the experimental unit that supplies the data to be analyzed. An experimental unit can be an animal, a patient, a plot of land, an engine, a piece of equipment, a customer, etc.

The variation that happens randomly in an experiment, due to uncontrollable variables, is called experimental error.

FIG. 21.1 Flowchart of a process. [Figure not reproduced: inputs (material, equipment, energy, etc.) enter the process and leave as outputs y (products, goods, services); the process is subject to controllable variables x1, x2, ..., xr and uncontrollable variables z1, z2, ..., zs.]

21.2 STEPS IN THE DESIGN OF EXPERIMENTS

Montgomery (2013) describes the necessary steps for applying the DOE technique:

1. Defining the problem: a clear definition of the problem and of the experiment’s objectives significantly helps to better understand and solve the problem.
2. Choosing the factors and levels: in an experiment, we define the factors and their respective variation ranges, and the specific levels used in the procedure.
3. Defining the response variable: we usually use the mean or the standard deviation (or both) of the characteristic being evaluated as the response variable.
4. Choosing the type of design: Section 21.4 discusses the types of design of experiments.
5. Conducting the experiment: when conducting the experiment, it is necessary to monitor the process carefully in order to ensure that the experiment will happen as planned.
6. Data analysis: statistical techniques are used to analyze the data from the experiment.
7. Conclusions and recommendations: to validate the experiment’s results and conclusions, graphical methods and also exploratory and confirmatory tests are used.

Data Science for Business and Decision Making. https://doi.org/10.1016/B978-0-12-811216-8.00021-5 © 2019 Elsevier Inc. All rights reserved.

21.3 THE FOUR PRINCIPLES OF EXPERIMENTAL DESIGN

To ensure that the data are collected correctly, four basic principles must be considered during the design of experiments. Sharpe et al. (2015) describe each one of them:

1. Randomization: this principle consists in randomly distributing the treatments among the experimental units, in such a way that each treatment has the same chance of occupying any experimental unit. This principle minimizes the effects of unknown and uncontrollable variables.
2. Replication: the number of times each treatment appears in the experiment. If the number of replications is the same for each treatment, we have a balanced experiment. Through replication, we can estimate the experimental error, increase the experiment’s precision, and even increase the robustness of the statistical tests.
3. Control: controlling extraneous sources of variation significantly reduces the variability of the response variables, making it easier to discern the differences between the experimental units or treatment groups. In test drives, for example, all of the alternatives must be offered to customers at the same time and under the same conditions. Otherwise, external variables, such as the price of gasoline, the volatility of the stock market, and fluctuations in the interest rate, among others, would make it difficult to assess the effects of the treatments.
4. Blocking: in some cases, there may be one uncontrollable factor that directly affects the response variable or the way in which the factors being studied influence the response. To minimize this effect, the experimental units are grouped into blocks (homogeneous groups), in such a way that the experiment is analyzed separately for each block. Unlike the first three principles, blocking is not necessary in all experiments.

21.4 TYPES OF EXPERIMENTAL DESIGN

Sharpe et al. (2015) and Banzatto and Kronka (2006) describe three types of design of experiments: (a) completely randomized, (b) randomized blocks, (c) factorial.

21.4.1 Completely Randomized Design (CRD)

The completely randomized design is the simplest of all experimental designs. It only uses the principles of randomization and replication. The treatments are distributed among the units in a totally random way, with either the same number or a different number of replications per treatment. The CRD considers only one explanatory variable with two or more categories.


FIG. 21.2 An example of a completely randomized design. [Figure not reproduced: 100 patients are randomly split into Group 1 (50 patients, Treatment 1) and Group 2 (50 patients, Treatment 2).] (Modified from Sharpe, N.R., de Veaux, R.D., Velleman, P.F., 2015. Business Statistics. 3rd ed. Pearson Education.)

Imagine an experiment in which one wishes to test two types of diet with two groups of patients. Thus, 100 patients are randomly divided into 2 groups of the same size and the diets are assigned to these groups in a random way, as shown in Fig. 21.2. One-way ANOVA has been widely used to analyze data coming from a completely randomized design.
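The random assignment in the diet example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patient identifiers, the labels "Diet 1" and "Diet 2", the function name, and the seed are all hypothetical and do not come from the original study.

```python
import random


def completely_randomized_assignment(units, treatments, seed=None):
    """Randomly split `units` into equal-sized groups, one group per treatment."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among the treatments")
    rng = random.Random(seed)   # local generator, so the draw can be reproduced
    shuffled = list(units)
    rng.shuffle(shuffled)       # every ordering of the units is equally likely
    size = len(shuffled) // len(treatments)
    return {
        treatment: shuffled[i * size:(i + 1) * size]
        for i, treatment in enumerate(treatments)
    }


patients = list(range(1, 101))  # hypothetical patient IDs 1..100
groups = completely_randomized_assignment(patients, ["Diet 1", "Diet 2"], seed=42)
for diet, members in groups.items():
    print(diet, len(members))   # each diet receives 50 patients
```

Because the whole list is shuffled before it is cut into groups, every patient has the same chance of receiving either diet, which is what the randomization principle requires.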

21.4.2 Randomized Block Design (RBD)

The randomized block design is the most commonly used design. Besides the principles of randomization and replication, it also considers the principle of local control by creating blocks. Thus, the units are grouped into homogeneous blocks, and the treatments are distributed randomly among the units of each block, that is, the randomization is carried out within each block. The main objective is to reduce the variability within each block and to identify the effect the factors have on the dependent or response variable. In the basic form of the design, the number of units per block is equal to the number of treatments being studied.

Imagine an experiment with 600 patients from a health care clinic divided into two groups: healthier and not so healthy. Within each group of 300 randomly selected patients, three different treatments were assigned at random, so that each subgroup of 100 patients undergoes one treatment. The main objective here is to analyze the effect of three types of food production systems on these patients’ health: (a) food from conventional production; (b) food from organic production; (c) food from biodynamic production. Fig. 21.3 describes this process in a simplified way.
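The key difference from a completely randomized design is that the shuffle is restricted to each block. The sketch below mirrors the clinic example (two blocks of 300 patients, three treatments). The patient identifiers, the function name, and the seed are hypothetical and serve only as an illustration.

```python
import random


def randomized_block_assignment(blocks, treatments, seed=None):
    """Within each block, randomly split the units equally among the treatments."""
    rng = random.Random(seed)
    design = {}
    for block_name, units in blocks.items():
        if len(units) % len(treatments) != 0:
            raise ValueError(f"block {block_name!r} cannot be split evenly")
        shuffled = list(units)
        rng.shuffle(shuffled)   # randomization never crosses block boundaries
        size = len(shuffled) // len(treatments)
        design[block_name] = {
            t: shuffled[i * size:(i + 1) * size]
            for i, t in enumerate(treatments)
        }
    return design


# Hypothetical patient IDs: "H..." for the healthier block, "N..." for the other.
blocks = {
    "healthier": [f"H{i}" for i in range(300)],
    "not so healthy": [f"N{i}" for i in range(300)],
}
treatments = ["conventional", "organic", "biodynamic"]
design = randomized_block_assignment(blocks, treatments, seed=7)
for block_name, assignment in design.items():
    print(block_name, {t: len(units) for t, units in assignment.items()})
```

Each treatment ends up with 100 patients from each block, so a comparison between treatments is not distorted by the difference between healthier and not so healthy patients.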

21.4.3 Factorial Design (FD)

When the experiment involves two or more factors, the researcher uses the factorial design. In an experiment with two factors, all the possible combinations of the levels of these factors are investigated in each replication of the experiment. Therefore, if factor A has a levels and factor B has b levels, each replication contains all a × b combinations (Montgomery, 2013). Two-way ANOVA has been broadly used to analyze data coming from a factorial design with two factors.
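As a small illustration of the a × b rule, the sketch below enumerates the runs of a two-factor factorial design. The factor levels are taken from Exercise 2 in Section 21.8 (3 types of iron ore and 3 types of converter). The figure of 9 replications is an assumption inferred from the 81 samples mentioned there (81 = 3 × 3 × 9), and the function name is hypothetical.

```python
from itertools import product


def factorial_runs(levels_a, levels_b, replications=1):
    """List every (replication, level of A, level of B) run of a two-factor design."""
    runs = []
    for rep in range(1, replications + 1):
        # Each replication contains all a x b combinations of the levels.
        for level_a, level_b in product(levels_a, levels_b):
            runs.append((rep, level_a, level_b))
    return runs


ores = ["hematite", "limonite", "magnetite"]
converters = ["Bessemer", "LD", "Siemens-Martin"]

one_replication = factorial_runs(ores, converters)
all_runs = factorial_runs(ores, converters, replications=9)
print(len(one_replication), len(all_runs))
```

In practice the order in which the runs are carried out would also be randomized, in line with the randomization principle of Section 21.3.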

FIG. 21.3 An example of a randomized block design. [Figure not reproduced: 600 selected patients are divided into two blocks, healthier (300 patients) and not so healthy (300 patients). Within each block, patients are randomly split into three subgroups of 100 patients, which receive Treatment 1 (conventional), Treatment 2 (organic), and Treatment 3 (biodynamic), respectively.]

21.5 ONE-WAY ANALYSIS OF VARIANCE

A single-factor or one-way analysis of variance (one-way ANOVA) has been widely used to analyze data obtained from a completely randomized design. These data could also be analyzed by using regression models. According to Fávero et al. (2009), one-way ANOVA allows the researcher to verify the effect a qualitative explanatory variable (factor) has on a quantitative dependent variable. Each group includes the observations of the dependent variable in one of the factor’s categories.

One-way ANOVA was discussed in detail in Section 9.8.1 of Chapter 9, which presents its concepts, hypotheses, model, and respective calculations. The application of the one-way ANOVA is described in Example 9.12, together with its solution in SPSS and Stata. In that example, the factor corresponds to the variable Supplier and the dependent variable is Sucrose.
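For readers who do not have Chapter 9 at hand, the following sketch computes the usual one-way ANOVA F statistic (mean square between groups divided by mean square within groups) in plain Python. It is a minimal illustration, not the book's SPSS or Stata solution. The function name and the sample values are hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples, one per factor level."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data: three factor levels with three observations each.
samples = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova(samples)
print(f_stat, df_b, df_w)
```

The null hypothesis of equal population means is rejected when F exceeds the critical value of the F distribution with (df_between, df_within) degrees of freedom at the chosen significance level.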

21.6 FACTORIAL ANOVA

Factorial ANOVA is an extension of the one-way ANOVA considering two or more factors. Factorial ANOVA assumes that the quantitative dependent variable is affected by more than one qualitative explanatory variable (factor). It also tests the possible interactions between the factors. For Pestana and Gageiro (2008) and Fávero et al. (2009), the main goal of factorial ANOVA is to determine whether the means for each factor level are the same (isolated effect of the factors on the dependent variable) and to verify the interaction between the factors (joint effect of the factors on the dependent variable).

Two-way ANOVA was discussed in Section 9.8.2.1 of Chapter 9, which presents its concepts, hypotheses, model, and respective calculations. The application of the two-way ANOVA is described in Example 9.13, together with its solution in SPSS and Stata. In that example, the fixed factors correspond to the variables Company and Day_of_the_week, and the dependent variable is Time.

The two-way ANOVA can be generalized to three or more factors. According to Maroco (2014), the model becomes very complex, since the effect of multiple interactions can confound the effect of the factors (Section 9.8.2.2).
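The isolated and joint effects mentioned above correspond to the standard partition of the total variation into parts due to factor A, factor B, the A × B interaction, and the error. The sketch below applies that partition to a balanced two-factor design with fixed factors. It is an illustration only: the function names and the data are hypothetical, and the balanced-design restriction is an assumption made here to keep the code short.

```python
def _mean(values):
    return sum(values) / len(values)


def two_way_anova_table(cells):
    """Sums of squares, degrees of freedom, and F ratios for a balanced design.

    cells[i][j] holds the replicates for level i of factor A and level j of factor B.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    if any(len(row) != b for row in cells) or any(
        len(cell) != n for row in cells for cell in row
    ):
        raise ValueError("the design must be balanced")
    if n < 2:
        raise ValueError("at least two replicates per cell are needed")

    grand = _mean([y for row in cells for cell in row for y in cell])
    cell_means = [[_mean(cell) for cell in row] for row in cells]
    a_means = [_mean([y for cell in row for y in cell]) for row in cells]
    b_means = [_mean([y for row in cells for y in row[j]]) for j in range(b)]

    ss = {
        "A": b * n * sum((m - grand) ** 2 for m in a_means),
        "B": a * n * sum((m - grand) ** 2 for m in b_means),
        "AB": n * sum(
            (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        "error": sum(
            (y - cell_means[i][j]) ** 2
            for i in range(a) for j in range(b) for y in cells[i][j]
        ),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "error": a * b * (n - 1)}
    ms_error = ss["error"] / df["error"]
    # Each effect is tested against the mean square of the error.
    f_ratios = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AB")}
    return ss, df, f_ratios


# Hypothetical 2 x 2 design with two replicates per cell.
cells = [
    [[1, 3], [5, 7]],
    [[2, 4], [10, 12]],
]
ss, df, f_ratios = two_way_anova_table(cells)
print(ss, df, f_ratios)
```

A large F ratio for "AB" indicates that the effect of one factor depends on the level of the other, which is the joint effect that factorial ANOVA is designed to detect.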

21.7 FINAL REMARKS

The design of experiments technique has often been used to control processes, aiming at identifying the explanatory variables or factors that affect the quality of products and services (dependent or response variable). Among all the experimental designs, the completely randomized design is the simplest and considers only one explanatory variable with two or more categories. One-way ANOVA has been widely used to analyze data coming from a completely randomized design. The randomized block design, on the other hand, is used more frequently. Finally, when the experiment considers two or more factors, we use the factorial design. Two-way ANOVA has been broadly used to analyze data that come from a design with two factors.

21.8 EXERCISES

1) An aerospace company manufactures civilian and military helicopters at its three factories. Table 21.1 shows its monthly helicopter production in the last 12 months, in each factory. Check whether there is a difference between the population means. Assume that α = 5%.

2) A steel company wants to know how the factors “Type of iron ore” and “Type of converter” affect the properties of steel, more specifically the Brinell hardness (BH), measured in kgf/mm². In order to do that, an experiment with 81 samples was carried out, with 3 types of iron ores (hematite, limonite, magnetite) and 3 types of converters (Bessemer, LD, and Siemens-Martin). For each experimental unit, the hardness was measured. The data are available in Table 21.2.

3) A gas and oil company wants to understand how petroleum refining processes and the type of petroleum impact gasoline quality parameters, more specifically its octane rating. In order to do that, an experiment with 48 samples was carried out, considering 4 petroleum refining processes (distillation, cracking, reforming, and alkylation) and 3 types of petroleum (light, naphthenic, and paraffinic). For each experimental unit, the octane rating was measured. The data are available in Table 21.3.


TABLE 21.1 Monthly Helicopter Production for Each Factory

Factory 1    Factory 2    Factory 3
24           28           29
26           26           25
28           24           24
22           30           26
31           24           20
25           27           22
27           25           22
28           29           27
30           30           20
21           27           26
20           26           24
24           25           25

TABLE 21.2 Brinell Hardness (BH) per Type of Iron Ore and Converter

                     Type of Iron Ore
Type of Converter    Hematite       Limonite       Magnetite
Bessemer             161 154 149    145 151 154    168 165 174
                     157 163 150    141 147 153    163 175 172
                     161 165 156    139 155 140    181 182 180
LD                   164 169 152    134 144 140    165 164 177
                     149 155 164    139 142 149    181 183 165
                     167 159 160    133 129 137    167 178 179
Siemens-Martin       169 165 152    135 141 148    165 166 183
                     154 163 167    130 142 129    175 178 179
                     159 151 165    137 135 141    164 183 179

TABLE 21.3 Octane Rating per Type of Petroleum and Refining Process

                     Petroleum Refining Process
Type of Petroleum    Distillation    Cracking    Reforming    Alkylation
Light                95              95          95           97
                     96              94          95           96
                     94              94          95           96
                     94              93          96           95
Naphthenic           87              86          89           90
                     86              87          89           91
                     86              87          88           90
                     87              85          90           89
Paraffinic           90              91          92           91
                     90              91          89           92
                     92              89          90           92
                     92              90          92           91