16th European Symposium on Computer Aided Process Engineering and 9th International Symposium on Process Systems Engineering W. Marquardt, C. Pantelides (Editors) © 2006 Published by Elsevier B.V.
Optimization-based Root Cause Analysis
Eyal Dassau and Daniel Lewin
PSE Research Group, Chemical Engineering, Technion, Haifa, Israel
Abstract
A systematic approach to yield enhancement was recently proposed by Dassau et al. [1], which combines six-sigma with design and control to improve estimated yields in the process design stage. After identifying the critical-to-quality variables, the key step involves the analysis of process measurements to identify the root cause of low quality or yield. The availability of a model of the process permits a significant improvement to this step, through the incorporation of formal optimization as a means of automating root cause analysis. The problem is formulated as a mixed-integer nonlinear program, whose system variables include the possible perturbations that lead to low quality and low yield, and whose decision variables are all of the possible process improvements. The potential of the proposed optimization-based root cause analysis to enhance yield is demonstrated on the process of penicillin production, involving both fermentation and downstream purification.
Keywords: Root Cause Analysis; Bioprocessing; Six-sigma; Yield Enhancement
1. Introduction
Batch process design is an iterative activity, involving significant capital and human resources. Identifying bottlenecks and other constraints to enhanced process performance is an important aspect of incremental improvement, which can be assisted by employing systematic root cause analysis. In multi-step processes, this calls first for the detection of the most problematic step and then for the identification of the combination of associated degrees of freedom that contribute to the problem. In practice, this is often carried out by applying Design of Experiments (DOE) to the process itself. As will be shown in this paper, a more efficient approach relies on process simulation and optimization.
As pointed out by Bogle et al. [2], a way to overcome development and manufacturing problems is to adopt a plantwide stance in the design and optimization of the process. This calls for a systematic method, such as Six-sigma (6σ), that can serve both to identify sources of poor performance and as the driving force for continuous improvement. Poor performance could be the outcome of a poorly designed process, a poorly designed control system, or a combination of the two. By addressing the most significant drawbacks, one will generally improve the process controllability and resiliency, leading to increased sigma levels and to a superior process [3].
In this paper, we extend the approach of Dassau et al. [1], showing how an optimization-based approach can be used for systematic root cause analysis. After some background on root cause analysis and 6σ, we present our proposed approach and then demonstrate its capabilities in the improvement of a process for penicillin manufacture.
Root cause analysis is routinely used to identify a failure, investigate its causes, and suggest corrective actions. This is traditionally carried out by a descriptive statement of each failure followed by the suggestion of its probable causes and their investigation,
often carried out in practice by brainstorming. Subsequently, the most probable causes are identified and recommendations are made [4, 5].
Six-sigma (6σ) is an iterative five-step procedure to progressively improve product quality. The five steps are Define, Measure, Analyze, Improve, and Control, referred to by the acronym DMAIC. Initially, the DMAIC procedure is applied to quantify the base-case conditions. Then, cycles of the procedure are implemented to iteratively improve the process [6]. In itself, 6σ is a systematic way to enhance process performance, with the key step in the DMAIC procedure being the identification of the root cause of inadequate performance. However, this will only detect the weakest link in the production sequence, whereas a more useful outcome would be to identify the degrees of freedom in a critical unit's design and/or control system that need to be manipulated to improve the process. This can be done by optimization-based root cause analysis, as described next.
2. Optimization-based root cause analysis
A new perspective on root cause analysis is introduced, in which it is posed as the solution of an optimization problem formulated as a mixed-integer nonlinear program (MINLP). Its system variables include the possible perturbations that lead to low quality (critical-to-quality, CTQ) and low yield (critical-to-productivity, CTP), some of which are decision variables. Following Lewin et al. [7], the MINLP is formally defined as:

$$
\begin{aligned}
\min_{u,\,\theta} \quad & J(x, u, d, \theta) \\
\text{subject to} \quad & \dot{x} = f(x, u, d, \theta) \\
& y = g(x, u, d, \theta), \quad y \in Y \\
& z = h(x, u, d, \theta), \quad z \in Z
\end{aligned}
\tag{1}
$$

where J is an objective function accounting for the benefits and costs associated with process improvements; x is a vector of process states; y is a vector of measured process outputs, which are required to lie inside a hypercube, Y, rather than meet specific setpoints; z is a vector of process quality variables (CTQ and CTP) or attributes of the manufactured product that need to meet specifications within the hypercube Z; u is a vector of manipulated variables; d is a vector of uncontrolled disturbances; and θ is a vector of continuous and discrete design variables. As demonstrated in the example that follows, after appropriately defining the objective function of the above MINLP, its solution identifies the combination of manipulated variables that are the root cause of inadequate process performance.
Figure 1 presents a summary of the operations that are executed to lower the DPMO level of the entire plant through the adjustment of the CTQ and CTP variables to acceptable values. First, the CTQ and CTP variables are selected, and then a preliminary simulation is invoked based on the current design and initial conditions. The process performance is diagnosed using the simulation results: the values of the CTQ and CTP variables are collected and the defects per million opportunities (DPMO) are computed for each unit, based on the number of observations of each CTQ or CTP variable that fall outside its specification window. The DPMO value can either be estimated by counting the actual off-specification measurements in the production trajectory over time, or be computed on the basis of the observed mean, μ, and standard deviation, σ, of the CTQ variable. It should be noted that the trigger for root cause analysis (RCA) is a management decision; it could be the DPMO level, as in the following example, the production time, or any other target. Next, the most problematic unit
operation is selected as a target for detailed root cause analysis. In this framework, root cause analysis involves the formulation of an MINLP and its robust solution using a Genetic Algorithm (GA) [8], which manipulates the design and control variables (both discrete and continuous) of the unit to identify the root cause or causes and to suggest improved values for those variables. This instigates the next DMAIC improvement cycle, where the improvements are implemented and performance is again diagnosed. The procedure is repeated until acceptable DPMO values or CTQ/CTP variables are obtained in each process unit, or until economic considerations dictate its termination. Note that since the GA maintains a population of potential solutions, it not only provides a solution to the MINLP of Eq. (1) but can also trace the source of the poor performance through analysis of the variance of the solution population.
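The idea of using the GA population to rank degrees of freedom can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the cost function `toy_cost` is a made-up surrogate rather than the process model, the bounds and base-case values are hypothetical, and the GA operators (elitism, uniform crossover, Gaussian mutation) are generic choices, not necessarily those of the GA in [8]. The report computes, for each variable, its relative spread among the best individuals and its relative shift from the base case; how these two quantities are combined into a ranking is left open here.

```python
import random
import statistics

# Hypothetical bounds and base case for three degrees of freedom (illustrative only).
BOUNDS = {"ca0": (0.01, 0.05), "t": (4.0, 7.0)}
BASE = {"cl": 0, "ca0": 0.01, "t": 5.0}


def toy_cost(ind):
    """Made-up surrogate for a normalized weighted objective (not the process model)."""
    ph_penalty = 0.0 if ind["cl"] == 1 else 1.0          # open loop: large defect penalty
    feed_penalty = ((0.044 - ind["ca0"]) / 0.04) ** 2    # penalty for off-target feed
    time_penalty = 0.05 * ((5.5 - ind["t"]) / 3.0) ** 2  # deliberately weak time effect
    return ph_penalty + feed_penalty + time_penalty


def random_individual(rng):
    ind = {"cl": rng.randint(0, 1)}
    for name, (lo, hi) in BOUNDS.items():
        ind[name] = rng.uniform(lo, hi)
    return ind


def mutate(ind, rng, rate=0.2):
    child = dict(ind)
    if rng.random() < rate:
        child["cl"] = 1 - child["cl"]                    # flip the discrete variable
    for name, (lo, hi) in BOUNDS.items():
        if rng.random() < rate:
            step = rng.gauss(0.0, 0.1 * (hi - lo))
            child[name] = min(hi, max(lo, child[name] + step))
    return child


def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one parent chosen at random.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in a}


def run_ga(seed=0, pop_size=40, generations=60):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=toy_cost)
        elite = pop[: pop_size // 4]                     # best quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = elite + children
    pop.sort(key=toy_cost)
    return pop


def dof_report(pop, top=10):
    """Relative spread and shift of each DOF among the best `top` individuals."""
    best = pop[:top]
    report = {}
    for name in ("cl", "ca0", "t"):
        values = [ind[name] for ind in best]
        lo, hi = BOUNDS.get(name, (0.0, 1.0))
        span = hi - lo
        report[name] = {
            "rel_std": statistics.pstdev(values) / span,
            "rel_shift": abs(statistics.mean(values) - BASE[name]) / span,
        }
    return report


final_pop = run_ga()
best = final_pop[0]
report = dof_report(final_pop)
print(best, toy_cost(best))
for name, stats in report.items():
    print(name, stats)
```

In this toy setting, a variable with a large relative shift is one that the search consistently moves away from its base-case value, which is one plausible reading of "analyzing the variance of the solution population".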
[Figure 1: flowchart with steps including "Invoking simulation" and "Invoking optimization".]
Fig. 1. Optimal root cause analysis working sequence.
The proposed methodology is demonstrated on a simplified process for the production of penicillin, considering only the fermentation and the first downstream processing step, as shown in Figure 2. Each unit operation, together with its control system, is modeled, calibrated and implemented using Matlab® and Simulink®. For details, see Dassau et al. [1].
[Figure 2: block diagram. Penicillium chrysogenum feed → Reaction / Fermentation → Primary Recovery → Intermediate Recovery → Final Purification → Penicillin (product).]
Fig. 2. Schematic of simplified penicillin process.
3. Demonstrative example
The DMAIC procedure is first applied to the penicillin simulation to define the base-case conditions, as summarized in Table 1. Subsequently, cycles of the procedure are implemented to iteratively improve the process. At each cycle, improvements are implemented in the unit that triggers the analysis, that is, the unit exhibiting the highest DPMO value or whose CTQ or CTP variables are off-specification.

Table 1 - Summary of control limits, DPMO and TY for the base case
Unit / variable                      LCL    UCL    DPMO       Production Time (hr)
Fermentor                                                     422
  pH                                 4.9    5.1    45,445
  Temperature                        22     28     465
Reactive Extractor - TY = 73%                                 5
  pH                                 4.8    5.2    462,456
  Cx (mole/liter): 6.75×10^-5
Reactive Re-Extractor - TY = 86%                              5
  pH                                 7      9      31,264
  Cx (mole/liter): 4.2×10^-5
Total Production Time (hr)                                    432
Total TY %: 63
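The two ways of estimating DPMO mentioned earlier (counting off-specification observations, or using the observed mean and standard deviation) can be sketched as follows. This is a minimal illustration under stated assumptions: each observation is treated as one opportunity, the moment-based estimate assumes a normally distributed variable, and the sample values are hypothetical. Only the fermentor pH control limits (4.9 and 5.1) are taken from Table 1.

```python
import math


def dpmo_by_counting(samples, lcl, ucl):
    """DPMO from the fraction of observations outside [lcl, ucl]."""
    off_spec = sum(1 for v in samples if v < lcl or v > ucl)
    return 1e6 * off_spec / len(samples)


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def dpmo_from_moments(mu, sigma, lcl, ucl):
    """DPMO assuming the variable is normal with mean mu and std sigma."""
    p_in_spec = normal_cdf(ucl, mu, sigma) - normal_cdf(lcl, mu, sigma)
    return 1e6 * (1.0 - p_in_spec)


# Fermentor pH limits from Table 1; the samples below are hypothetical.
LCL, UCL = 4.9, 5.1
hypothetical_ph = [4.8, 5.0, 5.0, 5.2]

counted = dpmo_by_counting(hypothetical_ph, LCL, UCL)
# Hypothetical centered process whose limits sit three standard deviations away.
three_sigma = dpmo_from_moments(mu=5.0, sigma=0.1 / 3.0, lcl=LCL, ucl=UCL)
print(counted, three_sigma)
```

The counting estimator needs no distributional assumption but requires enough observations to resolve small defect rates, whereas the moment-based estimator can report low DPMO values from short records at the price of the normality assumption.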
Cycle I. As clearly indicated in Table 1, the first unit operation that needs to be selected for improvement is the reactive extractor, since in the base-case design it exhibits the highest value of DPMO. Analysis shows that the degree of extraction reaches only 73% after 5 hours and that the pH value is clearly not at its set point of 5. Moreover, the value of Cx, which represents degradation products of penicillin, rises constantly during the batch. The total throughput yield for the sequence is 63% and the production time is 432 hours. Ideally, we would like to simultaneously reduce the values of DPMO and Cx, increase the throughput yield, and minimize the production time. As a step to overcome such poor performance, the root cause analysis mechanism is invoked, which involves the solution of the following specific MINLP:

$$
\begin{aligned}
\min_{C_a(0),\, t,\, CL} \quad & \alpha_1 \cdot DPMO + \alpha_2 \cdot C_x + \alpha_3 \cdot t \\
\text{s.t.} \quad & \dot{x} = f(t, x, u), \quad x(0) = x_0 \\
& DPMO^L \le DPMO \le DPMO^U \\
& C_x^L \le C_x \le C_x^U \\
& TY^L \le TY \\
& C_a^L(0) \le C_a(0) \le C_a^U(0), \quad t^L \le t \le t^U, \quad CL \in \{0, 1\}
\end{aligned}
\tag{2}
$$

where α1, α2 and α3 are objective function weights (in this case we do not include costs associated with improvements), the superscripts L and U denote lower and upper limits, respectively, and the manipulated variables are the initial concentration of LA-2 in the organic phase, Ca(0), the extractor processing time, t, and the status of the possible pH controller, CL (open- or closed-loop). Careful selection of manipulated variables, based on process understanding and base-case results, can help in choosing degrees of freedom (DOFs) for optimization. In this case, the DOFs were selected in two steps. First, only the design parameter CL was selected as a DOF, leading to promising results (lower DPMO in Cx). Then, the two additional variables were added as DOFs. A more general way to arrive at the optimal solution is to allow the GA to simultaneously manipulate all of the
DOFs of the system and investigate which of them are important (i.e., belong to the subset of the root cause) by analyzing their variance and the magnitude of the computed change from their initial values.
The solution of the MINLP in Eq. (2) indicates that the root cause is a combination of problems: (a) the absence of a pH control system in the reactive extractor, leading to an increase in the amount of impurities, which has a negative effect on the downstream units; (b) an insufficient amount of LA-2 in the feed; and (c) insufficient processing time. Introducing a control system to maintain the pH at its set point of 5, increasing Ca(0) from 0.01 to 0.044 mole/liter, and increasing the processing time from 5 to 5.54 hr not only achieves excellent pH control, but also reduces the amount of impurities by 84% and increases the TY of the unit from 73 to 97%, leading to an increase in the overall TY from 63 to 80%. It should be noted that each of the components of the objective function is normalized to avoid bias. Furthermore, since the GA produces a population of possible solutions, analysis of the variance in the values of the manipulated DOFs allows the key variables to be identified. In this case, it is apparent that the absence of a control system and the amount of LA-2 are more important root causes of inferior performance than the short processing time.
Cycle II. Having improved the extractor operation, the DMAIC procedure is repeated to further improve the process. We note that the improvement implemented in Cycle I improves the quality of the feed to the reactive re-extractor, which reduces the DPMO for that unit from 31,264 to 4,092 before any additional improvements are made (see Table 2). Moreover, since the fermentation section now exhibits the highest DPMO level and dominates the overall production time, reducing the fermentation time would provide a means of increasing the overall productivity of the process.
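Once again, the root cause analysis mechanism was invoked, involving the MINLP:

$$
\begin{aligned}
\min_{G_r,\, f_g,\, P_w} \quad & \alpha_1 \cdot DPMO_T + \alpha_2 \cdot DPMO_{pH} + \alpha_3 \cdot t \\
\text{s.t.} \quad & DPMO_i^L \le DPMO_i \le DPMO_i^U, \quad i = T \text{ and } pH \\
& t \le t^U, \quad C_p^s \le C_p, \quad f_g^L \le f_g \le f_g^U
\end{aligned}
\tag{3}
$$
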
where Gr is the threshold glucose concentration in the fermentor at which additional substrate is added, fg is the oxygen flow rate, Pw is the agitator power setting (which affects oxygen mass transfer), Cp is the local penicillin concentration, and Cp^s is the production specification. The solution of Eq. (3) leads to the following recommendations: (a) changing Gr from 0.3 to 49 g/l; (b) changing fg from 8.6 to 8.3 l/h; and (c) changing Pw from 29.9 to 39 W. Implementation of these recommendations reduces the fermentation time for a peak penicillin concentration of 1.5 g/l from 422 to 264 hours. This reduced production time is achieved at the price of temperature distributions with a higher variance than in the base case, with a DPMO level of 1,754; this, however, needs to be weighed against the resulting reduction in batch time of about 40%. Note also that these changes have no effect on the total TY.
Cycle III. The last cycle of the DMAIC procedure is invoked on the re-extractor to reduce penicillin degradation, which is rather high. This situation is improved by introducing a pH controller in this unit as well, with the most important outcome being a decrease of 45% in the concentration of impurities in this unit. The downside is a slight decrease in the degree of extraction from 83% to 81%, reducing the total TY to 79%. This improvement should be evaluated with respect to what is more critical to the
management: a 45% decrease in impurity level is achieved at the cost of a 1% decrease in yield. Moreover, it is important to note that the acceptable ranges for DPMO values and CTQ/CTP variables are specific to each processing unit and variable.
5. Conclusions
We have shown that formulating root cause analysis as an optimization problem and implementing it as part of the DMAIC procedure can locate the root cause and generate better and more comprehensive solutions than could be achieved by conventional brainstorming. However, automated root cause analysis cannot replace process understanding, and should be seen as a means of assisting in the generation of more creative solutions to production problems. For example, in the process analysis demonstrated here, the proposed approach achieves a 37% reduction in batch time, accompanied by a 25% increase in throughput yield and a 45% reduction in impurities. These results suggest that this systematic approach can make a substantial impact in the pharmaceutical industry, through improved overall process yield, quality and return on investment.

Table 2 - Record of improvements using the proposed procedure.
                          Base-case    Cycle 1      Cycle 2      Cycle 3
Fermentor
  DPMO - pH               45,445       45,445       22,942       22,942
  DPMO - Temperature      465          465          1,754        1,754
Reactive Extractor
  DPMO - pH               462,456      <1           <1           <1
  Cx (mole/liter)         6.75×10^-5   1.1×10^-5    1.1×10^-5    1.1×10^-5
Reactive Re-Extractor
  DPMO - pH               31,264       4,092        4,092        1,883
  Cx (mole/liter)         4.2×10^-5    2.0×10^-5    2.0×10^-5    1.1×10^-5
TY %                      63           80           80           79
Production Time (hr)      432          432.4        273.5        273.5
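The headline percentages quoted in the text can be cross-checked against the entries of Table 2 with a few lines of arithmetic. The sketch below is only a consistency check. Which pairs of table entries correspond to each quoted figure is our reading of the text, not something the source states explicitly: base case versus Cycle 3 for batch time and TY, Cycle 2 versus Cycle 3 for the re-extractor impurities, and base case versus Cycle 1 for the extractor impurities.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Values copied from Table 2 (base case and improvement cycles).
batch_time = pct_change(432.0, 273.5)             # production time, base case -> Cycle 3
throughput_yield = pct_change(63.0, 79.0)         # total TY, base case -> Cycle 3
reextractor_cx = pct_change(2.0e-5, 1.1e-5)       # re-extractor Cx, Cycle 2 -> Cycle 3
extractor_cx = pct_change(6.75e-5, 1.1e-5)        # extractor Cx, base case -> Cycle 1

for label, value in [
    ("batch time", batch_time),
    ("throughput yield", throughput_yield),
    ("re-extractor impurities", reextractor_cx),
    ("extractor impurities", extractor_cx),
]:
    print(f"{label}: {value:+.1f}%")
```

Under this reading, the rounded values agree with the 37% batch-time reduction, the 25% relative increase in throughput yield, and the 45% and 84% impurity reductions reported in the text.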
References
[1] Dassau, E., I. Zadok, and D. R. Lewin, "Combining Six-Sigma with Integrated Design and Control for Yield Enhancement in Bioprocessing," submitted to I&EC Research (2005).
[2] Bogle, I. D. L., A. R. Cockshott, M. Bulmer, N. Thornhill, M. Gregory, and M. Dehghani, "A Process Systems Engineering View of Biochemical Process Operations," Computers & Chemical Engineering, 20(6-7) 943-949 (1996).
[3] Seider, W. D., J. D. Seader, and D. R. Lewin, Product and Process Design Principles: Synthesis, Analysis, and Evaluation. 2nd ed. John Wiley and Sons, New York (2004).
[4] Diesselhorst, T. and E. Klaui, "Root Cause Analysis of Operational Induced Vibrations in a Feedwater System," Nuclear Engineering and Design, 206(2-3) 129-137 (2001).
[5] Elleithy, R. H., "Root Cause Analysis; Fundamentals and Applications," Annual Technical Conference - Society of Plastics Engineers, 60(3) 3082-3087 (2002).
[6] Rath and Strong, Six Sigma Pocket Guide. Rath & Strong Management Consultants (2000).
[7] Lewin, D. R., W. D. Seider, and J. D. Seader, "Towards Integrated Design and Control for Defect-free Products," Chapter D3 in Integration of Process Design and Control, eds. P.E. Seferlis and M.C.E. Georgiadis, Elsevier Science: San Diego. 533-554 (2004).
[8] Lewin, D. R., "Multivariable Feedforward Control Design Using Disturbance Cost Maps and a Genetic Algorithm," Computers & Chemical Engineering, 20(12) 1477-89 (1996).