Experimental Hematology 33 (2005) 1271–1272
LETTER TO THE EDITOR
On the proper use of statistical design of experiments

In a recent article published by Yao et al. [1], the authors presented a very interesting application of the design of experiments (DOE) approach to the optimization of a serum-free medium for the expansion of cord blood (CB) hematopoietic stem cells (HSC). In the first part of their work, the authors performed screening as well as optimization experimental designs on four serum replacement additives: bovine serum albumin (BSA), insulin (I), transferrin (TF), and 2-mercaptoethanol (2-ME). In the second part, they screened and optimized a cytokine cocktail starting from 10 factors: thrombopoietin (TPO), interleukin (IL)-3, stem cell factor (SCF), Flt-3 ligand (FL), IL-6, granulocyte-macrophage colony-stimulating factor (GM-CSF), granulocyte colony-stimulating factor (G-CSF), stem cell growth factor α (SCGF), IL-11, and hepatocyte growth factor (HGF). In both parts, the optimization objective was to maximize the expansion of both white blood cells (WBC) and CD34+ cells.

Our main concerns in regard to this work are as follows: 1) the a priori assumption that second- and higher-order interactions between factors were not significant, without providing any statistical evidence to this effect, 2) a general lack of statistical significance testing, and 3) the inefficient use of some empirical optimization methods.

In the first part, a 2^4 full factorial design (16 experiments) was performed, followed by least-squares multilinear regression, to estimate the coefficients of a model based solely on the main effects (i.e., individual effects) of the four serum replacement components.
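As a minimal sketch of such a main-effects fit (not the data or code of Yao et al.), the example below builds a 2^4 design in coded units (-1/+1), fits the constant and the four main effects by least squares, and attaches a 95% confidence half-width to the coefficients. The response values are hypothetical and deliberately contain a BSA x I interaction that the main-effects model ignores. The function names and the tabulated Student t quantile (2.201 for 11 degrees of freedom) are our own choices for illustration.

```python
from itertools import product
from math import sqrt

FACTORS = ("BSA", "I", "TF", "2-ME")
# Two-sided 95% Student t quantile for 11 degrees of freedom (tabulated value).
T_975_11DF = 2.201


def full_factorial(k):
    """Return all 2^k runs of a two-level design in coded units (-1, +1)."""
    return list(product((-1, 1), repeat=k))


def fit_main_effects(design, y):
    """Least-squares fit of constant + main effects.

    The coded columns are mutually orthogonal, so each least-squares
    coefficient reduces to (column . y) / n.
    """
    n = len(design)
    coef = {"constant": sum(y) / n}
    for j, name in enumerate(FACTORS):
        coef[name] = sum(run[j] * yi for run, yi in zip(design, y)) / n
    fitted = [
        coef["constant"] + sum(coef[name] * run[j] for j, name in enumerate(FACTORS))
        for run in design
    ]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    dof = n - len(coef)        # 16 runs - 5 coefficients = 11
    se = sqrt(sse / dof / n)   # same standard error for every coefficient
    return coef, se, dof


design = full_factorial(4)
# Hypothetical responses (NOT the published data): main effects plus a
# BSA x I interaction that ends up in the residual of the main-effects model.
y = [10 + 5 * b + 1 * i + 0.5 * t + 2 * b * i for b, i, t, _ in design]

coef, se, dof = fit_main_effects(design, y)
half_width = T_975_11DF * se
significant = {name: abs(c) > half_width for name, c in coef.items()}

for name, c in coef.items():
    print(f"{name:>8}: {c:6.2f} +/- {half_width:.2f}  significant={significant[name]}")
```

In this made-up example the neglected interaction inflates the residual variance, so a real but modest main effect (here I) no longer passes the 95% test. This is one way that omitting interactions can distort conclusions.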
Based on the data provided in the paper, we could reproduce the regression coefficients for Equation 2 (effect on WBC expansion), but the coefficients we obtained for Equation 3 (effect on CD34+ cells) differ significantly from those published in the article (constant: 6.28 instead of 12.39, BSA coefficient: 5.02 instead of 9.33, I coefficient: 0.76 instead of 1.67, TF coefficient: 0.65 instead of 1.62, and 2-ME coefficient: -0.23 instead of -0.11). For both models, the authors considered all positive coefficients to be significant. However, using standard 95% confidence intervals, we found for the model described by Equation 2 that only 3 of the 5 coefficients were indeed significant (the constant term and the coefficients for BSA and I), whereas 4 of the 5 coefficients were significant for Equation 3 (the 2-ME coefficient was nonsignificant). Confidence intervals could be evaluated in this case because a design allows up to as many coefficients as experiments to be estimated: with 5 coefficients fitted to 16 experiments, 11 degrees of freedom remain.

A very interesting feature of full factorial designs, such as this 2^4, is that they allow the experimenter to simultaneously and independently estimate all main (individual) effects as well as all factor interactions. The authors did not make use of this powerful capability of the technique and instead seemed to neglect all potential interactions between factors. After reanalyzing the data and estimating interactions, we found that the BSA/TF two-factor interaction as well as the BSA/I/TF three-factor interaction had a significant negative effect on WBC expansion, whereas the BSA/I interaction had a significant positive effect on this response. In addition, the following interactions had significant positive effects on CD34+ cell expansion: TF/2-ME, BSA/I, and BSA/TF/2-ME. No negative interaction was found for CD34+ cell expansion. Omitting these interactions can be very misleading and could lead to a suboptimal component mixture (see chapter 15 of Box et al. [2] for an example). In this new analysis, significance testing was performed using normal probability plots of the estimated effects instead of confidence intervals, since no degrees of freedom are available (16 coefficients are estimated using 16 experiments).

In the second part, the authors performed a 2^(10-6) two-level fractional factorial design to screen the effects of 10 cytokines with only 16 experiments. In this design, each main effect of an individual cytokine is confounded with at least 2 two-factor interactions and numerous higher-order interactions [2]. The use of such a highly fractionated experimental design must therefore always be followed by further experiments in order to break the confounding pattern and to guarantee the identification of causal relationships between factors and responses.
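The confounding in a 16-run, 10-factor design can be made concrete with a short sketch. The generators below are an assumption (one possible choice for a resolution III 2^(10-6) design); the generators actually used by Yao et al. are not stated here, and the factor letters A to K are placeholders rather than specific cytokines. The code builds the design and lists, for each main effect, the two-factor interactions whose contrast column is identical to it.

```python
from itertools import combinations, product

BASE = "ABCD"
# ASSUMED generators for the six added factors (illustrative only; not
# taken from Yao et al.). Each added factor is a product of base factors.
GENERATORS = {"E": "ABC", "F": "BCD", "G": "ACD", "H": "ABD", "J": "ABCD", "K": "AB"}


def build_design():
    """Return the 16 runs of the assumed 2^(10-6) design as dicts of -1/+1."""
    runs = []
    for levels in product((-1, 1), repeat=len(BASE)):
        run = dict(zip(BASE, levels))
        for name, word in GENERATORS.items():
            value = 1
            for letter in word:
                value *= run[letter]
            run[name] = value
        runs.append(run)
    return runs


def column(runs, factors):
    """Contrast column of the product of the given factors over all runs."""
    col = []
    for run in runs:
        value = 1
        for f in factors:
            value *= run[f]
        col.append(value)
    return tuple(col)


def two_factor_aliases(runs):
    """Map each main effect to the two-factor interactions confounded with it."""
    factors = list(runs[0])
    main_cols = {column(runs, (f,)): f for f in factors}
    aliases = {f: [] for f in factors}
    for a, b in combinations(factors, 2):
        col = column(runs, (a, b))
        if col in main_cols:  # identical column: the two effects cannot be separated
            aliases[main_cols[col]].append(a + b)
    return aliases


runs = build_design()
aliases = two_factor_aliases(runs)
for factor, pairs in aliases.items():
    print(f"{factor} is confounded with: {', '.join(pairs)}")
```

With these assumed generators, every main-effect column coincides with the column of at least two two-factor interactions, so an apparent main effect cannot be attributed to the factor itself without further runs.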
Instead, Yao et al. [1] concluded without any significance testing that 9 of the 10 factors were significant. This conclusion cannot be supported by only 16 experiments unless all interactions are known a priori to be nonexistent, which is not the case here, as we have shown above.

The authors also used a steepest ascent method (SA) to optimize the concentration of serum substitutes and cytokines. The SA is a powerful gradient-based line search method designed to experimentally seek a global optimum by successively performing DOE to evaluate the local gradient and then moving along that direction until no further improvements are found. This procedure (DOE followed by line search) is repeated until no improvement directions can be found [2]. The use of this method by the authors is very inefficient, since they applied the line search within the initial experimental domain, where the optimum could have been found using response surface methods with no or very few additional experiments. In addition, if their linear model is adequate, which was never tested, then the optimum necessarily lies at one corner of the design. The authors should have saved these efforts and associated costs to explore outside the initial domain in search of a global optimum.

In conclusion, we firmly believe that DOE can be very useful in the field of cell physiology and the development of culture media. In that respect, Yao et al. [1] have great merit in applying these contemporary tools to the very important problems of ex vivo stem cell culture. However, we have shown that their work suffered from serious methodological flaws, which call into question the validity of their conclusions. In fact, DOE methods are often simplistically presented in commercial statistical packages, and this is often the cause of data misinterpretation. A better integration of these concepts in the science and engineering curricula could help improve this situation, which is often encountered in the literature.

François-Thomas Michaud*, Victor-Alain Parent*, Alain Garnier, Carl Duchesne
Département de génie chimique, Université Laval, Québec, QC, Canada

0301-472X/05 $–see front matter. Copyright © 2005 International Society for Experimental Hematology. Published by Elsevier Inc. doi: 10.1016/j.exphem.2005.07.012
References
1. Yao CL, Chu IM, Hsieh TB, Hwang SM. A systematic strategy to optimize ex vivo expansion medium for human hematopoietic stem cells derived from umbilical cord blood mononuclear cells. Exp Hematol. 2004;32:720–727.
2. Box GEP, Hunter WG, Hunter JS. Statistics for experimenters: An introduction to design, data analysis, and model building. New York: John Wiley & Sons Inc.; 1978. p. 206–413.
*These authors contributed equally to this work. Offprint requests to Carl Duchesne.