Pharmaceutical Technology

Applying Nonexperimental Study Approach to Analyze Historical Batch Data

YONG CUI, XILING SONG, KING CHUANG, MINLI XIE

Small Molecule Pharmaceutical Development, Genentech, Inc., South San Francisco, California 94080

Received 30 September 2011; revised 28 December 2011; accepted 9 January 2012. Published online 30 January 2012 in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/jps.23066

ABSTRACT: One common challenge in pharmaceutical product development is mapping the potential effects of a large number of variables. Conventional experimental tools such as the design-of-experiment (DoE) approach demand study scales too large to be practical. In comparison, nonexperimental studies have the advantage of evaluating a large number of variables, but may suffer from an inability to define causal relationships. Given this situation, the current study sought to divide the mapping operation into two steps. The first step screens out potentially significant variables and confirms the causal relationships, and the second step involves DoE studies to define the design space. This report demonstrates that nonexperiments can be effectively applied in the first step. The screening task was performed on a nonexperimental dataset consisting of data collected from historical batches manufactured as clinical testing materials. A combination of statistical analysis and technical assessment was applied in the screening. By invoking the variable selection procedure embedded in multivariate regression analysis, the significance of variables to the responses was assessed. Potential technical mechanisms and variable confounding were then examined for the significant correlations identified. Experimental confirmation was performed to verify the causal relationships. The last two measures were introduced to remedy the weakness of nonexperiments in defining causal relationships.
Through this effort, the relationships among a large number of variables were quantitatively evaluated, and the variables posing potential risks to product quality and manufacturability were identified. The results effectively directed further DoE studies to the high-risk variables. Overall, the nonexperimental analysis improves the mapping efficiency and may provide a data-driven decision-making platform to enhance quality risk assessment. © 2012 Wiley Periodicals, Inc. and the American Pharmacists Association J Pharm Sci 101:1865–1876, 2012

Keywords: chemometrics; factorial design; formulation; physical characterization; solid dosage form; design of experiments; nonexperiments; historical batches; variable selection
Correspondence to: Yong Cui (Telephone: +650-467-1402; Fax: +650-467-2179; E-mail: [email protected])

Journal of Pharmaceutical Sciences, Vol. 101, 1865–1876 (2012)
© 2012 Wiley Periodicals, Inc. and the American Pharmacists Association

INTRODUCTION

In recent years, product development in the pharmaceutical industry has been undergoing a transformation advanced by the quality-by-design (QbD) paradigm, as manifested in several regulatory guidance documents.1,2 The goal of QbD is to enhance the understanding of the potential impacts of material attributes, formula factors, and process parameters on product quality and manufacturability. On the basis of this knowledge, appropriate operational ranges, also referred to as the design space, can be defined to ensure the desired product quality and manufacturability. Ideally, the design space should be constructed by
systematic mapping of the multidimensional space encompassing all relevant variables. Yet, this can be a formidable effort considering the large number of variables involved in the manufacture of a drug product. Relevant variables may include various properties of all raw materials, formula compositions, process parameters of multiple steps, scale, and equipment. A complete mapping of such a complex multidimensional space is impractical, given limited time and resources. As a result, data acquired from designed experimental studies are almost always limited in contrast to the large number of variables that need to be evaluated. This challenge prompts a search for new tools and sources of data to assist the mapping effort. In this work, we sought to apply a nonexperimental study approach to analyze data collected on historical batches. Here, "nonexperimental study" refers to a class of
research methods, the details of which are reviewed below. "Historical batches" refers to batches made to support clinical studies (i.e., clinical testing material, CTM) at various development stages and the corresponding pilot batches that were made on scale to establish process conditions for the CTM batches. Research methods, in general, can be classified into three categories: experiments, quasi-experiments, and nonexperiments.3 The first category, experiments, refers specifically to studies that deliberately and randomly vary at least one variable (independent variable, IV) in order to observe changes in other variables (dependent variables, DVs). In experimental designs (design of experiments, DoE), the IVs being studied are randomized, whereas other IVs are controlled (i.e., kept unchanged or equivalent). This allows any observed changes in DVs to be attributed to the changes in the IVs being studied. Quasi-experiments differ from experiments only in that the IVs are not randomly, or not completely randomly, varied.4 Besides the IVs being studied, other IVs may also vary, and the changes are not completely randomized between the groups. In nonexperimental designs, changes in variables are neither deliberately introduced nor randomized.3,5 Quasi-experiments and nonexperiments are widely applied in many fields, particularly in circumstances where manipulation of IVs cannot be easily performed or is not ethical.4,6 For example, characteristics of subjects or materials cannot be adjusted as readily as operational parameters. As a result, these unmanipulable variables, in general, are not amenable to DoE studies.4 Nonetheless, unmanipulable variables can also cause changes in the DVs of interest.4 In fact, unmanipulable variables are sometimes more fundamental causes of the changes than manipulable ones. An example is the impact of granule flow quality on tablet content uniformity.
Granule flow quality is a material property that cannot be directly manipulated. Granulation process parameters, by contrast, are variables directly manipulable to yield the target granule flow quality. In considering the impact on tablet content uniformity, granule flow quality is probably a more mechanistic cause than the granulation parameters, even though the former is not directly manipulable. Similar cases are widespread in product development. When DoEs are not suitable for the evaluation of these variables, nonexperimental studies may be carried out to meet this demand. The strength of DoE studies is their ability to define a causal relationship. In DoE studies, any observed changes in DVs can only be attributed to the changes in the IVs being studied, as long as all other IVs are under tight control. By contrast, quasi-experiments and nonexperiments are much weaker
in drawing definitive causal relationships without excluding all alternative explanations arising from the presence of other uncontrolled variables.4–6 On the other hand, DoEs demand randomization of all IVs being studied and, as a result, require much larger study sizes (e.g., more batches). The low efficiency of DoEs gives rise to the challenge of mapping a large number of variables. In contrast, quasi-experiments and nonexperiments have the benefit of analyzing a large number of variables within a relatively small study scope. Although the conclusions from these studies may not be definitively causal, the relationships identified may provide a clear direction as to where further experiments should focus. The comparison above indicates that DoEs, quasi-experiments, and nonexperiments are complementary research tools and, therefore, should be selected judiciously to meet different study needs. This insight gives rise to a novel approach to addressing the variable-mapping challenge. Suppose that the mapping operation can be divided into two steps, with the first step screening out statistically significant variables and confirming the causal relationships, and the second step defining the design space. It is then possible to apply the nonexperimental approach to carry out the screening task, and subsequent DoEs to define predictive models and the design space. In this manner, the overall mapping efficiency may be improved. In this context, we now consider the characteristics and significance of historical batch data. As defined earlier, historical batches include CTM batches and the corresponding pilot batches manufactured at the same scales.
The importance of historical batch data is acknowledged by International Conference on Harmonisation (ICH) guidelines and the literature.2,7,8 The benefits of historical batch data may include8 (1) batches made typically at full scale, including at the projected commercial scale, an advantage often lacking in designed experimental studies, which are generally conducted at lab scale; and (2) a rich dataset in which a large number of variables undergo significant and realistic changes. These changes arise typically from the evolution and optimization of various aspects of drug product development, which can include changes in active pharmaceutical ingredients (APIs), formula, process, scale, and manufacturing equipment. The variability truly represents what has been encountered in the development history. Therefore, the dataset is a reliable and representative source of information and may offer unique product-specific insights. Nevertheless, systematic extraction of reliable development information from historical batch data remains a challenge. The difficulty arises primarily from the lack of randomization or control of variables in historical batches. Changes in variables are typically introduced as needed at various stages of the
development and, therefore, are often neither randomized nor controlled, in contrast to multivariate DoE studies. Owing to this feature, historical batch data cannot be treated and analyzed as DoE studies. This drawback gives rise to the undesirable situation that knowledge and experience acquired from the manufacturing and testing of historical batches are often scattered, descriptive, and qualitative. By treating historical batch data as nonexperimental studies, these data may be systematically and quantitatively evaluated, with a conscious recognition of their limitations.
VARIABLE SCREENING METHOD

The screening operation was performed through a combination of statistical analysis and technical assessment. Historical batch data form a multivariate dataset, which enables us to invoke the variable selection procedure embedded in multivariate regression analysis to assess the significance of variables to the responses. Additionally, potential technical mechanisms and confounding variables were examined for the significant correlations identified. Experimental confirmation was thereafter performed to verify the causal relationships. These measures were introduced to remedy the weakness of nonexperiments in defining causal relationships. The detailed considerations are summarized below. In multivariate regression analysis, one general but key step is to identify the variables (from all variables collected) relevant or "significant" to the response factor, so that the relevant variables identified can be included in the final regression model. The irrelevant (insignificant) variables are left out of the model. This statistical procedure is termed "variable selection" in multivariate regression analysis.9,10 By excluding insignificant variables, the variable selection procedure serves to11 (1) improve the prediction performance of the predictive variables (e.g., by avoiding statistical overfitting), (2) provide more cost-effective models for prediction and control, and (3) provide a better understanding of the underlying mechanism between the variables and the response. These objectives conform to the QbD paradigm. Specifically, by excluding insignificant variables, the final predictive model is able to clearly differentiate critical and noncritical variables, which is an important goal of QbD.
As pointed out by Short et al.,12 the appendices in ICH guideline Q8(R2)2 illustrate that the design space hyperspace is "defined by process critical control parameters that thereby condition the critical quality attributes." Noncritical variables should not be included in the design space hyperspace. It is important to note that the variable selection procedure shares the same objective as the mapping
operation in this study, which is to identify variables (from a large variable pool) significant or critical to the response factor. Therefore, this procedure may provide a statistical and quantitative tool for variable criticality assessment on the historical batch data. The variable selection procedure in this study was implemented first through a systematic univariate regression analysis, termed the "variable ranking" operation in multivariate regression analysis.11,13 This ranking operation was applied to examine correlations between all variables in the historical batch dataset, and it identified the statistically significant correlations. The procedure is different from the approach used in analyzing typical multivariate DoE studies. The latter evaluates variables on the basis of their statistical significance (i.e., p values) attained from multivariate model fitting. This approach was found incompatible with historical batch data analysis mainly because of the following observations: (1) Unlike DoE designs, which often include a few preselected IVs, the historical batch dataset contains many more variables that were not intensively prescreened. In this study, for instance, essentially all data collected on material properties were included in the dataset without preselection (see further discussion on variable preselection in Relation to QRA). Consequently, in the historical batch dataset, it is highly likely that a considerable number of variables are insignificant. In such cases, the variable ranking procedure is a more appropriate and efficient approach for variable selection.13 (2) Confounding variables (i.e., two or more variables intercorrelating significantly with one another) are likely to be present in a nonexperimental dataset, particularly with regard to material properties.14 This is also different from DoE studies, wherein IVs are randomized so that confounding is unlikely to occur.
Confounding variables present a challenge to multivariate model fitting in terms of its ability to assess variables. Conventional multivariate regression is unable to establish models for confounding variables. Approaches such as principal component analysis can generate multivariate models, but can no longer yield a p value for each variable. Given these special characteristics of the historical batch dataset, the variable ranking procedure was adopted here for variable selection. Following the ranking procedure, we further introduced a review of potential physical mechanisms, and an experimental confirmation step, to help eliminate false positive relationships and to verify the causal relationships. These procedures were designed to remedy the weakness of nonexperiments in drawing causal conclusions. Generally, three conditions need to be met in order to infer causal relationships from nonexperiments:5,6 (1) the two variables must be related, (2) changes in the IV must occur before the changes in the DV, and (3) no other alternative
explanation can account for the relationship between the IV and DV. The first two conditions are readily achievable in development studies. When a significant correlation is identified via the variable ranking operation, it is evident that the variable at upstream process steps changes before the variable at the downstream steps. The difficulty, however, often arises from meeting the third condition, where all alternative explanations need to be ruled out. In this study, we introduced several measures to examine and exclude alternative explanations, which include (1) identifying confounding variables, which are the major source concealing the real causal relationship.5 This can be achieved via the systematic univariate regression analysis conducted during the variable ranking operation. (2) Reviewing potential physical mechanisms for all significant correlations, which helps rule out coincidences (false positives); and (3) designing DoE studies to further confirm all significant correlations. These procedures carried out the variable screening tasks while remedying the weakness of the nonexperimental dataset. Besides the variable screening method outlined above, two additional concerns with regard to historical batch data are worth further attention: (1) the changes in IVs may be too small to be pharmaceutically meaningful, and (2) variables may be heavily skewed toward one locus of the whole variation range, so that the conclusions are not statistically reliable. In other words, the variables may not sample the multidimensional space homogeneously. Both concerns arise from the fact that changes in variables are not deliberately introduced and thus could be too small or heterogeneously distributed. To address these concerns, we introduced a detailed data inspection procedure in the data analysis to ensure the analysis quality. The detailed procedure is described later in Data Analysis. The goal of this study was to explore the mapping capability of the nonexperimental study approach.
The variable screening procedure adopted here conformed to this goal (i.e., identifying potential critical variables) and is thus presented in detail. Defining the final model and the design space, by contrast, was not the focus of this study. The latter could be accomplished through further DoE studies once the critical variables were identified by the mapping procedure. Additionally, the mapping effort may provide a data-driven platform to enhance quality risk assessment (QRA). To highlight its significance, the variable mapping effort was treated as an independent subject in this text, without going into too much detail on the final model construction. A case study involving an immediate-release capsule product is presented below to demonstrate the methodology. We note that a small portion of the data was published earlier.14
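As a minimal illustration of the confounder-identification measure described above, the sketch below flags other IVs that intercorrelate significantly with an IV of interest. It is written in Python for brevity (the study itself used the R environment), and all variable names, data, and the significance threshold are hypothetical:

```python
# Illustrative confounder screen (hypothetical data and names, not the
# study's actual code). Before treating a significant IV -> DV correlation
# as causal, list the other IVs that also intercorrelate significantly with
# the IV of interest: each flagged variable is an alternative explanation
# that must be ruled out.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
api_d50 = rng.uniform(21, 120, 16)  # IV of interest (synthetic, 16 batches)
# Surface area made roughly inverse to particle size, so it intercorrelates.
api_surface_area = 5.0 / api_d50 + rng.normal(0, 0.005, 16)
granule_bulk_density = rng.uniform(0.60, 0.70, 16)  # independent IV

def confounders(iv, other_ivs, alpha=0.05):
    """Return names of IVs whose univariate regression on `iv` is significant."""
    flagged = []
    for name, values in other_ivs.items():
        result = stats.linregress(iv, values)
        if result.pvalue < alpha:  # two-tailed p value
            flagged.append(name)
    return flagged

print(confounders(api_d50, {"api_surface_area": api_surface_area,
                            "granule_bulk_density": granule_bulk_density}))
```

In the study, variables flagged in this way were passed on to the mechanistic review and, where needed, to principal component analysis (Ref. 14).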
EXPERIMENTAL

The case study here describes an immediate-release hard capsule formulation developed by Genentech, Inc. (South San Francisco, California). The API is a Biopharmaceutics Classification System class II compound (i.e., low aqueous solubility and high permeability), chemically and physically stable, and nonhygroscopic. The API was provided by the Genentech Small Molecule Process Group. Various API physical property data were collected, including particle size distribution, bulk and tap density, surface area, polymorphic form, and particle morphology. Other quality attributes were controlled by the relevant aspects of the API specification and are not detailed here because of their lack of relevance. The capsule product was made through a conventional wet granulation process, that is, dry mixing → high-shear wet granulation → fluid-bed drying → dry granule sizing → blending/lubrication → encapsulation. Data from 16 batches of API and their corresponding capsule products were collected, consisting of 12 CTM batches and 4 pilot batches. These batches included phase I (n = 1), phase II (n = 5), pivotal clinical trial (n = 5), and commercial-scale (n = 5) batches. The batch size ranged from approximately 2 to 90 kg, made on three equipment trains of the same operating principle. These batches incorporated all major changes encountered in the product development, including changes in formula, process, scale, and equipment. They also constituted a majority of the testing drug supplies used in clinical trials. Therefore, they were considered representative of the product development history. The product quality was assessed by conducting the final capsule release testing. Dissolution rate and uniformity of dosage units (UDU) were identified as the key quality attributes that may be affected by API physical properties and the drug product manufacturing process. Therefore, they were used in the analysis to represent product quality.
The product manufacturability was assessed by collecting various in-process data on process intermediates. These included granule particle size distribution, granule bulk density, granule tap density (GTD), granule flow quality, water content, blend uniformity (BU), capsule weight variation during encapsulation, and so on. All abovementioned material attributes of the API, product processing intermediates, and final capsule product are summarized in Table 1. The analytical methods used in data collection were reported previously.14 Although much more data were collected than those listed in Table 1, only unmanipulable variables, such as properties of raw materials, properties of process intermediates, and quality attributes of the final product, were selected to construct the
Table 1. Variables Examined in Nonexperimental Analysis and Their Observed Variation Ranges

API properties
  Particle size distribution D10 (µm)       4–46
  Particle size distribution D50 (µm)       21–120
  Particle size distribution D90 (µm)       76–433
  Surface area (m2/g)                       0.24–1.15
  Bulk density (g/mL)                       0.45–0.94
  Tap density (g/mL)                        0.76–0.99
Dry granule properties
  Granule particle size ≤ 75 µm (%, w/w)    24–90
  Granule particle size ≥ 850 µm (%, w/w)   0–14
  Bulk density (g/mL)                       0.60–0.70
  Tap density (g/mL)                        0.69–0.79
  Flow index                                0.71–1.06
  Moisture content (%, w/w)                 0.81–1.90
Final blend
  Blend uniformity (%RSD)                   0.30–1.70
Encapsulation
  Capsule weight variation (%RSD)           0.48–1.74
Capsule product
  Dissolution at 10 min (%, w/w)            28.5–84.5
  Dissolution at 20 min (%, w/w)            43.0–91.7
  Dissolution at 30 min (%, w/w)            53.1–92.5
  Dissolution at 45 min (%, w/w)            63.2–96.0
  Uniformity of dosage units (%RSD)         0.50–2.90

RSD, relative standard deviation.
nonexperimental dataset. Manipulable variables such as process parameters were disregarded. This is because (1) changes in material properties are often more reflective of the fundamental causes of changes in product quality and manufacturability than process parameters; therefore, the analysis may provide mechanistic insights into the effects. (2) Material properties are unmanipulable variables and, therefore, cannot be studied readily by DoEs. This analysis thus complements other DoE studies. (3) Historical batches consist of batches manufactured on different scales and equipment, and many process parameters (manipulable variables) are equipment and scale dependent. Analysis of changes in equipment-dependent parameters across different scales is not physically meaningful.
DATA ANALYSIS

Data from 16 batches of API and their corresponding capsule products were collected. Discounting a few missing data points, the final dataset entering the analysis contains 14–16 data points (batches), depending on the individual variables. The dataset was inspected to ensure that all IVs presented sufficient, pharmaceutically meaningful variations (see Table 1). Furthermore, all correlation plots were examined to ensure relative homogeneity of the data points across the variation ranges. The inspection of plots was performed in the variable ranking step as described below. Data analysis was conducted in four consecutive steps: (1) Variable ranking was conducted by applying systematic univariate linear regression analysis to examine correlations between all variables listed in Table 1. This operation identified all statistically
significant correlations. Furthermore, all correlation plots were inspected to ensure correct interpretation, which included identifying any significant heterogeneous distribution of data points (e.g., data clustering) and any potential higher-order (i.e., nonlinear) correlations. If data clustering was observed, correlations within each cluster could be performed. If nonlinear correlations were observed, correlations at higher orders or from an interaction of multiple variables were further examined. (2) Following the ranking analysis, a technical review was performed, wherein the pharmaceutical relevance and physical basis were evaluated for each significant correlation identified. This step helped to differentiate true causal relationships from coincidences (false positives). (3) The magnitude of the significant effects identified was further evaluated to ensure that they were also pharmaceutically significant. This operation is consistent with the current trend in experimental analysis.4 (4) Finally, potential confounding variables were examined for each significant correlation identified. If confounding variables were found, further analysis (e.g., principal component analysis, see Ref. 14) was performed on these confounding variables to identify the one with the most significant contribution. This would help suggest a true cause for the effect. The analysis procedure is depicted in the flowchart in Figure 1. All statistical analyses were conducted in the R environment for statistical computing (The R Foundation for Statistical Computing, Vienna, Austria).15 We note that the univariate regression analysis adopted in step (1) did not aim at constructing univariate regression models. Instead, it functioned as the variable ranking step for the multivariate regression analysis.9
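The variable ranking step can be sketched as follows. This is an illustrative Python rendering (the study used R), with a hypothetical three-variable dataset standing in for the 19 variables of Table 1: every ordered pair of variables is fit by univariate linear regression, and R2 and the two-tailed p value are recorded, mirroring the 19 × 19 matrices described later in the text.

```python
# Variable-ranking sketch (illustrative; column names and data are
# hypothetical stand-ins for the historical batch dataset).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Rows = batches, columns = variables (synthetic values within Table 1 ranges).
data = {
    "api_d50_um": rng.uniform(21, 120, 16),
    "granule_fines_pct": rng.uniform(24, 90, 16),
    "dissolution_45min_pct": rng.uniform(63, 96, 16),
}
names = list(data)
n = len(names)
r2 = np.full((n, n), np.nan)
pval = np.full((n, n), np.nan)

for i, iv in enumerate(names):
    for j, dv in enumerate(names):
        if i == j:
            continue
        res = stats.linregress(data[iv], data[dv])
        r2[i, j] = res.rvalue ** 2
        pval[i, j] = res.pvalue  # two-tailed p value

# Flag statistically significant correlations for the later technical review.
significant = [(names[i], names[j], r2[i, j])
               for i in range(n) for j in range(n)
               if i != j and pval[i, j] < 0.05]
print(significant)
```

Each flagged pair then proceeds through the plot inspection, mechanistic review, and confounding checks of steps (1)-(4) above.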
Figure 2. Data clustering observed for the correlation between capsule content uniformity and particle size of active pharmaceutical ingredients (APIs). Red triangles, API particle size D90; blue squares, API particle size D50; magenta diamonds, API particle size D10.
Figure 1. Decision tree for nonexperimental data analysis on historical batch data.
RESULTS AND DISCUSSION

Data Inspection

The variable ranges and data point distribution were inspected. Table 1 provides the ranges for all variables entering the analysis. It can be seen that most variables presented pharmaceutically meaningful ranges. The only exception appears to be GTD (between 0.69 and 0.79 g/mL), which is a DV relative to API properties, but an IV for downstream variables. We decided to leave it in the dataset for analysis because there was no technically reasonable method to deliberately enlarge the variation range for GTD. In other words, this variable is unmanipulable. As a result, correlations involving GTD as a DV were still pharmaceutically meaningful, whereas correlations using it as an IV were no longer meaningful. The distribution of data points was assessed via inspection of all correlation plots. Typically, data points of historical batches are not distributed as homogeneously as in DoE studies. However, we did not find this to be a great concern for the correlation quality in our analysis. In fact, in many cases, the statistical reliability of historical batch data appeared
to be even more favorable than that of DoE studies. This was mainly because all variables examined in this study were numerically unmanipulable variables, which were distributed across their variation ranges very differently from the manipulable variables in DoE designs. The latter typically sample very limited levels. For example, a two-level factorial design with center points can only sample three levels for each variable in the study. In contrast, numerically unmanipulable variables such as material properties sampled many more levels in the historical batches, as can be seen later in Figures 2–4. Essentially, each batch represented one distinct level for every variable. The correlations were therefore drawn from data points at numerous levels, which was statistically more favorable. Through inspecting all correlation plots, it was found that the current dataset sampled the range of each variable relatively evenly. The only exception was a case of data clustering, which will be discussed below. The results indicated that the multidimensional factorial space was mapped by these batches relatively homogeneously and the correlation results were statistically reliable. Although the variable ranges and data distribution were found acceptable in this case study, it should be noted that the results of inspection are indeed study and variable dependent. In cases where the inspection shows that the variable ranges are too small or the data distribution is highly heterogeneous for some critical variables, it may be necessary to remedy these deficiencies. For example, if API particle size does not vary sufficiently or data points are lacking in certain ranges of interest in the correlation plots, additional batches using API of the desired particle sizes may be
Figure 3. Three significant correlations identified between process intermediates’ properties and capsule quality attributes as well as subsequent process intermediates’ properties. (a) Correlation between capsule’s uniformity of dosage units (UDU) and capsule weight variation during encapsulation. Blue diamonds, samples taken from 16 historical batches; pink squares, samples taken at predetermined time points during encapsulation process of one batch. (b) Correlation between capsule UDU and the product of capsule weight variation during encapsulation and the blend uniformity. (c) Correlation between granule water content and capsule weight variation during encapsulation.
Figure 4. Correlations between granule particle size, capsule dissolution, and uniformity of dosage units (UDU). (a) The percentage of large (≥850 µm) granules versus capsule dissolution at 45 min; (b) the percentage of small (≤75 µm) granules versus capsule dissolution at 45 min; (c) the percentages of large (≥850 µm) and small (≤75 µm) granules versus capsule UDU. In panels a and b, there is one outlier (open symbols), which is a batch made from an unmilled active pharmaceutical ingredient (API) with a very large API particle size (D90 = 433 µm). The decrease in dissolution should be attributed to API particle size, an effect shown in Table 2.
manufactured and added to the dataset. This will strengthen the reliability of the analysis results.

Variable Ranking

Variable ranking was conducted by applying systematic univariate linear regression analysis to all variables in Table 1. Two 19 × 19 matrices were generated for R2 and two-tailed p values, respectively, in addition to correlation plots for each regression. A portion of the results with pharmaceutical significance is provided in Tables 2 and 3. Table 2 lists the results between API physical properties, capsule quality attributes, and process intermediates' properties, which can be summarized as follows: (1) Significant correlations were found between multiple API physical properties (i.e., particle size distribution, surface area, and bulk density) and capsule dissolution, UDU, granule particle size (i.e., percentage of granules ≤ 75 µm), and granule flow quality, respectively. These correlations had a sound pharmaceutical rationale, that is, smaller API particle size and larger surface area may lead to faster product dissolution, better UDU, smaller granule particle size, and poorer granule flow quality. Thus, the effects were likely causal relationships. The magnitude of the effects was pharmaceutically significant (e.g., −32.8% for dissolution at 45 min). Furthermore, it was also found that these API physical properties presented significant intercorrelations (last part of Table 2), indicating that they were confounding variables. On the basis of this finding, principal component analysis was performed to identify the variable accounting for the majority of the effects, which should be considered the root cause of the observed significant correlations. The detailed follow-up analysis is reported elsewhere.14 Additionally, further designed experiments were carried out to confirm that API particle size indeed impacted the capsule quality attributes and the properties of intermediates (data not shown).
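The principal component analysis used to disentangle the intercorrelated API properties is detailed in Ref. 14; a generic sketch of the idea, with synthetic data and hypothetical effect sizes (Python instead of the R environment used in the study), might look like this. When particle size, surface area, and bulk density confound one another, the loadings of the first principal component show which properties move together and dominate the shared variation.

```python
# PCA sketch on intercorrelated API properties (illustrative only).
import numpy as np

rng = np.random.default_rng(2)
d50 = rng.uniform(21, 120, 16)                        # particle size (synthetic)
surface_area = 5.0 / d50 + rng.normal(0, 0.01, 16)    # inversely tied to size
bulk_density = 0.4 + 0.004 * d50 + rng.normal(0, 0.02, 16)

X = np.column_stack([d50, surface_area, bulk_density])
Xs = (X - X.mean(axis=0)) / X.std(axis=0)             # standardize each variable

# PCA via singular value decomposition of the standardized data matrix.
U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)                   # variance fractions
print("PC1 loadings:", Vt[0])
print("variance explained:", explained)
```

With strongly intercorrelated inputs, the first component typically captures most of the variance, and its dominant loading suggests which property to treat as the root cause for follow-up DoE confirmation.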
(2) Capsule dissolution at earlier time points (10, 20, and 30 min) showed high correlations with dissolution at 45 min (R2 = 0.9375, 0.9712, and 0.9915, respectively; all p values < 0.0001), which indicated that they were confounding variables. Thus, dissolution at 45 min was considered representative of the entire dissolution profile. This was further confirmed by the correlations between the API properties and dissolution at 10 and 45 min, respectively (shown in the first part of Table 2); no significant differences were observed between these two sets of regressions. On the basis of these observations, only
correlations with dissolution at 45 min were provided for most variables (Tables 2 and 3). (3) Clustering of data points was observed in the correlation plots between multiple API properties and capsule UDU. An example (i.e., API particle size vs. capsule UDU) is shown in Figure 2. Essentially, as depicted by the dashed circles in Figure 2, there appeared to be no correlation between the API properties and the UDU when the UDU varied within approximately 2% relative standard deviation (RSD). Beyond this range, significant positive correlations were observed. This behavior suggested that the 2% variation was characteristic of the manufacturing process rather than an effect of the API properties. (4) Multiple correlations involving API surface area were found to be nonlinear upon inspection of the correlation plots. For example, the correlation between the API surface area and capsule dissolution at 45 min was nonlinear (R2 = 0.3675). When the square root of the surface area was used instead, the correlation improved to R2 = 0.4275. The same was true for all correlations involving API surface area; we therefore replaced the API surface area with its square root in all linear regressions (see also footnote c of Table 2). The correlation analysis results between the properties of process intermediates and capsule quality attributes, as well as the properties of subsequent process intermediates, are provided in Table 3. Two significant correlations were detected: (1) capsule weight variation during encapsulation versus capsule UDU and (2) capsule weight variation versus granule water content. The first effect (Fig. 3a) had a sound physical basis, that is, higher variation in capsule weight can reasonably be expected to produce higher variation in capsule UDU. Thus, this effect was likely a causal relationship. Furthermore, this observation prompted us to examine the interaction effect of capsule weight variation and BU on capsule UDU, because poor BU may also impact capsule UDU.
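An interaction check of this kind can be sketched by regressing the response on the product of the two variables and comparing fits. Everything below is a synthetic illustration under assumed ranges and effect sizes, not the study's batch data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical batch data: capsule weight variation (CWV) and blend
# uniformity (BU), both %RSD, with UDU driven mainly by their product.
n = 60
cwv = rng.uniform(0.5, 2.5, n)
bu = rng.uniform(0.5, 2.0, n)
udu = 1.0 + 0.8 * cwv * bu + rng.normal(0, 0.2, n)

# Univariate fit on CWV alone versus on the interaction term CWV * BU.
fit_cwv = stats.linregress(cwv, udu)
fit_int = stats.linregress(cwv * bu, udu)

print(f"R^2 (CWV alone) = {fit_cwv.rvalue ** 2:.3f}")
print(f"R^2 (CWV x BU)  = {fit_int.rvalue ** 2:.3f}")
```

When the interaction genuinely drives the response, the product term fits markedly better than either variable alone, mirroring the comparison of Figures 3a and 3b in the text.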
The result (Table 3) showed that capsule UDU had a stronger correlation with the product of capsule weight variation and BU (i.e., the interaction effect of the two variables; Fig. 3b) than with capsule weight variation alone (Fig. 3a), indicating that there was indeed an interaction effect. Note that BU alone did not present a significant correlation with capsule weight variation or capsule UDU (Table 3). This suggested that BU and capsule weight variation were not confounding variables and that BU accounted for only a minor portion of the interaction effect. In contrast, the significant correlation between granule water content and capsule weight variation
Table 2. Regression Analysis Between API Physical Properties, Product Critical Quality Attributes, and Process Intermediate Properties

API Property (X) | Capsule Quality Attribute or Intermediate Property (Y) | R2 | Two-Tailed p Value | Statistical Correlation(a)

API properties versus capsule dissolution rate(b)
Particle size distribution D10 (μm) | Dissolution at 45 min | 0.6296 | 0.0002 | Significant
Particle size distribution D10 (μm) | Dissolution at 10 min | 0.6100 | 0.0004 | Significant
Particle size distribution D50 (μm) | Dissolution at 45 min | 0.8126 | <0.0001 | Significant
Particle size distribution D50 (μm) | Dissolution at 10 min | 0.8102 | <0.0001 | Significant
Particle size distribution D90 (μm) | Dissolution at 45 min | 0.8467 | <0.0001 | Significant
Particle size distribution D90 (μm) | Dissolution at 10 min | 0.8448 | <0.0001 | Significant
Surface area (square root) (m/g^(1/2))(c) | Dissolution at 45 min | 0.4275 | 0.006 | Significant
Bulk density (g/cm3) | Dissolution at 45 min | 0.6604 | 0.0001 | Significant
Tap density (g/cm3) | Dissolution at 45 min | 0.0011 | 0.9021 | Insignificant

API properties versus uniformity of dosage units
Particle size distribution D10 (μm) | Uniformity of dosage units (%RSD) | 0.329 | 0.0254 | Significant
Particle size distribution D50 (μm) | Uniformity of dosage units (%RSD) | 0.4265 | 0.0083 | Significant
Particle size distribution D90 (μm) | Uniformity of dosage units (%RSD) | 0.4145 | 0.0096 | Significant
Surface area (square root) (m/g^(1/2)) | Uniformity of dosage units (%RSD) | 0.1122 | 0.2224 | Insignificant
Bulk density (g/cm3) | Uniformity of dosage units (%RSD) | 0.5934 | 0.0008 | Significant
Tap density (g/cm3) | Uniformity of dosage units (%RSD) | – | – | No correlation

API properties versus granule particle size
Particle size distribution D50 (μm) | % ≥850 μm of granules (w/w) | 0.0579 | 0.4074 | Insignificant
Particle size distribution D10 (μm) | % ≤75 μm (fines) of granules (w/w) | 0.4111 | 0.0135 | Significant
Particle size distribution D50 (μm) | % ≤75 μm (fines) of granules (w/w) | 0.4621 | 0.0075 | Significant
Particle size distribution D90 (μm) | % ≤75 μm (fines) of granules (w/w) | 0.5535 | 0.0023 | Significant
Surface area (square root) (m/g^(1/2)) | % ≤75 μm (fines) of granules (w/w) | 0.2149 | 0.0950 | Significant
Bulk density (g/cm3) | % ≤75 μm (fines) of granules (w/w) | 0.4175 | 0.0125 | Significant
Tap density (g/cm3) | % ≤75 μm (fines) of granules (w/w) | – | – | No correlation

API properties versus granule density
API particle size distribution D10 (μm) | Granule bulk density (g/cm3) | 0.0614 | 0.3733 | Insignificant
API particle size distribution D10 (μm) | GTD (g/cm3) | 0.0319 | 0.5240 | Insignificant
API particle size distribution D50 (μm) | Granule bulk density (g/cm3) | 0.0882 | 0.2824 | Insignificant
API particle size distribution D50 (μm) | GTD (g/cm3) | 0.0283 | 0.5489 | Insignificant
API particle size distribution D90 (μm) | Granule bulk density (g/cm3) | 0.0545 | 0.4024 | Insignificant
API particle size distribution D90 (μm) | GTD (g/cm3) | 0.0153 | 0.6604 | Insignificant
API surface area (square root) (m/g^(1/2)) | Granule bulk density (g/cm3) | 0.1285 | 0.1895 | Insignificant
API surface area (square root) (m/g^(1/2)) | GTD (g/cm3) | 0.0562 | 0.3949 | Insignificant
API bulk density (g/cm3) | Granule bulk density (g/cm3) | 0.1498 | 0.1541 | Insignificant
API bulk density (g/cm3) | GTD (g/cm3) | 0.0646 | 0.3605 | Insignificant
API tap density (g/cm3) | Granule bulk density (g/cm3) | – | – | No correlation
API tap density (g/cm3) | GTD (g/cm3) | – | – | No correlation

API properties versus granule flow index
API particle size D10 (μm) | Granule flow index | 0.2424 | 0.0622 | Significant
API particle size D50 (μm) | Granule flow index | 0.3288 | 0.0254 | Significant
API particle size D90 (μm) | Granule flow index | 0.3669 | 0.0167 | Significant
API surface area (square root) (m/g^(1/2)) | Granule flow index | 0.2878 | 0.0393 | Significant
API bulk density (g/cm3) | Granule flow index | 0.6352 | 0.0004 | Significant
API tap density (g/cm3) | Granule flow index | – | – | No correlation

Intercorrelations among API properties
API particle size distribution D10 (μm) | API particle size D90 (μm) | 0.7916 | <0.0001 | Significant
API particle size distribution D10 (μm) | API particle size D50 (μm) | 0.8509 | <0.0001 | Significant
API particle size distribution D50 (μm) | API particle size D90 (μm) | 0.9688 | <0.0001 | Significant
API surface area (square root) (m/g^(1/2)) | API particle size D90 (μm) | 0.5429 | 0.0011 | Significant
API surface area (square root) (m/g^(1/2)) | API particle size D50 (μm) | 0.6057 | 0.0004 | Significant
API surface area (square root) (m/g^(1/2)) | API particle size D10 (μm) | 0.7007 | <0.0001 | Significant
API bulk density (g/cm3) | API particle size D90 (μm) | 0.8476 | <0.0001 | Significant
API bulk density (g/cm3) | API particle size D50 (μm) | 0.8446 | <0.0001 | Significant
API bulk density (g/cm3) | API particle size D10 (μm) | 0.6910 | 0.0001 | Significant

(a) "Significant" correlation was defined by p ≤ 0.10; "insignificant" correlation referred to the observation that the scatter plot could show a trend, but p > 0.10; and "no correlation" referred to the observation that X and Y were independent.
(b) High correlations were found between dissolution at earlier time points (10, 20, and 30 min) and dissolution at 45 min, indicating that dissolution at 45 min is representative of the whole dissolution profile. Therefore, for most variables, only correlations with dissolution at 45 min are shown in the table.
(c) The square root of the API surface area was used instead of the API surface area because the former displayed a better correlation than the latter. This observation held true for other analyses throughout this study; thus, the API surface area was replaced by its square root in all analyses.
API, active pharmaceutical ingredient; GTD, granule tap density.
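The square-root substitution described in footnote c can be checked numerically. The sketch below is a synthetic illustration (assumed ranges, slope, and noise): when the response varies with the square root of surface area, the transformed regressor yields a visibly better linear fit than the raw variable.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical data: dissolution rises linearly with sqrt(surface area),
# so a linear fit against the raw surface area leaves systematic curvature.
n = 40
surface_area = rng.uniform(0.5, 16.0, n)   # m^2/g (assumed range)
dissolution = 60 + 8 * np.sqrt(surface_area) + rng.normal(0, 0.5, n)

r2_raw = stats.linregress(surface_area, dissolution).rvalue ** 2
r2_sqrt = stats.linregress(np.sqrt(surface_area), dissolution).rvalue ** 2

print(f"R^2 against raw surface area  = {r2_raw:.4f}")
print(f"R^2 against sqrt surface area = {r2_sqrt:.4f}")
```

In practice, the transform would be chosen by inspecting the correlation plots, as the authors did, rather than assumed in advance.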
Table 3. Regression Analysis on Process Intermediate Properties, Capsule Quality Attributes, and Subsequent Process Intermediate Properties

Intermediate Property (X) | Response Parameter (Y) | R2 | Two-Tailed p Value | Statistical Correlation

Granule properties versus capsule quality and encapsulation parameter
Granule particle size (% ≤75 μm) | Dissolution at 45 min | – | – | No correlation
Granule particle size (% ≤75 μm) | Uniformity of dosage units (%RSD) | 0.0432 | 0.4760 | Insignificant
Granule particle size (% ≤75 μm) | Capsule weight variation (%RSD)(a) | 0.0460 | 0.4614 | Insignificant
Granule particle size (% ≤75 μm) | BU (%RSD) | 0.0800 | 0.3273 | Insignificant
Granule particle size (% ≥850 μm) | Dissolution at 45 min | – | – | No correlation
Granule particle size (% ≥850 μm) | Uniformity of dosage units (%RSD) | 0.0954 | 0.2826 | Insignificant
Granule particle size (% ≥850 μm) | Capsule weight variation (%RSD) | 0.0400 | 0.4932 | Insignificant
Granule particle size (% ≥850 μm) | BU (%RSD) | 0.1205 | 0.2239 | Insignificant
Granule bulk density (g/mL) | Dissolution at 45 min | 0.0301 | 0.5361 | Insignificant
Granule bulk density (g/mL) | Uniformity of dosage units (%RSD) | 0.0022 | 0.8737 | Insignificant
Granule bulk density (g/mL) | Capsule weight variation (%RSD) | 0.1999 | 0.1090 | Insignificant
Granule bulk density (g/mL) | BU (%RSD) | 0.1044 | 0.2598 | Insignificant
Granule flow index | Dissolution at 45 min | 0.1861 | 0.1084 | Insignificant
Granule flow index | Uniformity of dosage units (%RSD) | – | – | No correlation
Granule flow index | Capsule weight variation (%RSD) | 0.0665 | 0.3733 | Insignificant
Granule flow index | BU (%RSD) | 0.0104 | 0.7287 | Insignificant
Granule water content (%, w/w) | Dissolution at 45 min | 0.0147 | 0.6802 | Insignificant
Granule water content (%, w/w) | Uniformity of dosage units (%RSD) | 0.2021 | 0.1068 | Insignificant
Granule water content (%, w/w) | Capsule weight variation (%RSD) | 0.6158 | 0.0009 | Significant
Granule water content (%, w/w) | BU (%RSD) | 0.0762 | 0.3394 | Insignificant

Blend properties versus capsule quality
BU (%RSD) | Granule flow index | – | – | No correlation
BU (%RSD) | Capsule weight variation (%RSD) | 0.0013 | 0.9965 | Insignificant
BU (%RSD) | Uniformity of dosage units (%RSD) | 0.1582 | 0.1590 | Insignificant
BU (%RSD) | Dissolution at 45 min | 0.1393 | 0.1886 | Insignificant

Encapsulation parameter versus capsule quality
Capsule weight variation (%RSD) | Dissolution at 45 min | 0.0260 | 0.5817 | Insignificant
Capsule weight variation (%RSD) | Uniformity of dosage units (%RSD) | 0.4412 | 0.0096 | Significant
BU × capsule weight variation (%RSD^2) | Uniformity of dosage units (%RSD) | 0.6423 | 0.0006 | Significant

(a) Capsule weight variation is collected every 15 min during the encapsulation process as an in-process check. It is an in-process parameter rather than a final-product quality attribute. BU, blend uniformity; RSD, relative standard deviation.
(Fig. 3c) did not have a sound physical basis as to why granule water content would positively impact capsule weight variation (i.e., higher granule water content relating to higher capsule weight variation). To verify the causal relationship, a DoE study was designed to compare the capsule weight variation resulting from granules of different water content levels. The granules of the lower water content level [0.7% (w/w)] were produced by overdrying and yielded a capsule weight variation of approximately 1.28% RSD. In comparison, granules with an average water content of 1.68 ± 0.19% (w/w; n = 5) yielded an average capsule weight variation of 1.24 ± 0.39% RSD, indicating no significant change in capsule weight variation. The change in water content from 0.7% (w/w) to 1.68 ± 0.19% (w/w) was reasonably large in comparison with the in-process limit for granule water content [≤2% (w/w)], and these values were in the range where significant correlation was detected in the correlation analysis (Fig. 3c and Table 3). Therefore, it was concluded that the significant correlation identified between granule water content and capsule weight variation was not definitively causal and could be a coincidence. Aside from the significant correlations discussed above, the remaining correlations were classified as
either “no correlation” or “insignificant correlation,” which led to the conclusion that, in the range observed, the independent variable (IV) had no impact on the corresponding dependent variable (DV). This conclusion is considered reliable; in other words, “false negatives” are unlikely even though variables were not randomized in the historical batches. This, again, is because the unmanipulable variables in the historical batches sampled numerous levels, and each batch essentially represented one distinct level of every variable. If two variables (A and B) had opposite impacts on a third variable C, they would have to change in a coordinated manner at each data point (batch) so that their impacts on C offset each other and produce an insignificant correlation (false negative) between A and C. If this were to happen, A and B would have to be highly correlated confounding variables and would be detected in the systematic correlation analysis described earlier. Furthermore, highly correlated material properties should have a clear physical basis and would therefore have been identified through the detailed technical review. The likelihood of a false-negative correlation is thus low, if not negligible. Among the “no-impact” conclusions, one case worth noting is that the granule physical properties (i.e., granule particle size, bulk density, and flow quality)
did not show impact on the properties of subsequent process intermediates or the final product (Table 3). As an example, Figure 4 provides the correlations of granule particle size [i.e., the percentages of large (≥850 μm) and small (≤75 μm) granules] versus capsule dissolution and UDU. It is evident that capsule dissolution and UDU are essentially independent of the percentages of large (≥850 μm) and small (≤75 μm) granules in the blends. We further confirmed this result by a designed study, wherein the final blend of a single batch was sieved and portions of small (≤75 μm) and large (≥425 μm) granules were manually filled, separately, into capsules. The dissolution profiles of capsules made with granules of different particle sizes were nearly overlapping, confirming that the analysis results from the historical batches were reliable. The discussion above laid out the procedure of the systematic nonexperimental analysis approach. We also demonstrated methods for treating confounding variables, data clustering, nonlinear correlations, interaction effects from multiple IVs, and potential coincidences (false positives). The analysis procedure was shown to be effective in analyzing historical batches. It should be stressed that statistical analysis tools (e.g., the variable-ranking operation) alone cannot ensure correct interpretation of the results, especially with regard to inferring causal relationships. Intensive data inspection, detailed technical review, and further experimental confirmation are crucial in the assessment. Overall, the nonexperimental analysis of historical batch data indeed disclosed valuable insights into the potential impact of a variety of variables and their mechanistic relationships.
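The false-negative argument above — that for one variable's effect to be masked by another, the two must move together at every batch and would therefore surface in the intercorrelation screen — can be illustrated with a small simulation. All numbers are synthetic assumptions chosen to force the offsetting behavior.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical illustration: for B to cancel A's effect on C at every
# batch, B must track A almost exactly.
n = 30
a = rng.normal(0, 1, n)
b = a + rng.normal(0, 0.05, n)              # B forced to move with A
c = 2 * a - 2 * b + rng.normal(0, 0.2, n)   # opposite effects that offset

# A vs C tends to look like "no effect" (the false negative)...
fit_ac = stats.linregress(a, c)
# ...but the required A-B confounding is glaring in the intercorrelation screen.
r2_ab = stats.linregress(a, b).rvalue ** 2

print(f"r(A, C)   = {fit_ac.rvalue:.3f}, p = {fit_ac.pvalue:.3f}")
print(f"R^2(A, B) = {r2_ab:.3f}")
```

The A–B correlation is near unity by construction, which is exactly the signature the systematic pairwise analysis would flag before the masked A–C relationship could be mistaken for a true "no impact."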
Relation to QRA

One important application of historical batch data analysis is to support QRA. QRA is a comprehensive assessment of risks to product quality and manufacturability, whose output is determined by the robustness of the underlying dataset.7 Yet the comprehensiveness of QRA demands an assessment of all variables relevant to product manufacturing, for which data from designed experiments are often inadequate. To meet this demand, historical batch data may afford an important supplement to the designed experiments. As demonstrated above, historical data analysis can effectively differentiate the significance of a large number of variables to the responses and thereby help define their criticality to product quality and manufacturability. Taking the case study presented above as an example, historical batch data analysis mapped a series of material property variables and indicated that the API physical properties (e.g., API particle size) and the encapsulation process (i.e., capsule weight variation during encapsulation) may be critical to product quality and manufacturability. Other process steps, such as granulation, drying, granule milling, and blending, were found to pose low risk to product quality and manufacturability. Although the analysis did not directly yield a design space for manipulable variables such as process parameters, it effectively differentiated areas of high and low risk, and thus provided direction and focus for subsequent DoE studies. In other words, historical batch data analysis provided a data-driven tool to identify potential risks and to prioritize development efforts. More specifically, the decision tree in Figure 1 can be readily converted into a risk assessment procedure, wherein the “no-impact” relationships are defined as potential low risk, whereas “causal relationships” are treated as potential high risk. As such, the QRA output can be directly supported by data substantiated by systematic statistical analysis. Finally, we consider the preselection of variables prior to constructing the dataset. As discussed above, the more variables included in the analysis, the less likely it is that a true cause is overlooked and, thus, the more reliable the inferences on potential causal relationships. Consequently, the preselection of variables for the dataset is an important part of the analysis. In practice, researchers need to make selections based on the technical significance and physical basis of the variables. Some variables will be purposely, and sometimes inevitably, left out. In the current study, for example, only material property variables were selected, whereas other variables that also underwent changes were not included. Note that the incompleteness of the variable list does not affect the “no correlation” or “insignificant correlation” findings. In other words, these correlations and the “no-impact” conclusions derived from them remain valid.
It does, however, have a potential impact on the significant correlations in terms of inferring true causal relationships. It is possible that a significant correlation could be attributable to a variable that was not initially included in the analysis. In such cases, the statistical variable selection procedure is unable to identify the true cause. Therefore, during data analysis, the investigator needs to stay alert to potential interference from such excluded variables and take them into consideration whenever necessary.
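The conversion of correlation classes into provisional QRA risk levels described above might be sketched as a simple decision rule. The function and category names below are hypothetical, not taken from the paper's Figure 1.

```python
# Hypothetical sketch: map each analyzed relationship's statistical class
# and causal-confirmation status onto a provisional QRA risk level.
def qra_risk(statistical_class: str, causal_confirmed: bool) -> str:
    """Return a provisional risk level for one X-Y relationship."""
    if statistical_class in ("no correlation", "insignificant"):
        return "low risk"      # no impact observed in the range studied
    if statistical_class == "significant" and causal_confirmed:
        return "high risk"     # confirmed causal relationship -> DoE focus
    return "review"            # significant but unconfirmed: follow up

print(qra_risk("significant", True))     # e.g., API particle size -> dissolution
print(qra_risk("insignificant", False))  # e.g., granule density -> dissolution
```

The "review" branch reflects the paper's caution that a significant correlation without a confirmed mechanism (e.g., granule water content vs. capsule weight variation) warrants experimental follow-up rather than an immediate risk label.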
REFERENCES

1. US Food and Drug Administration. 2006. Guidance for industry: Quality systems approach to pharmaceutical CGMP regulations. Rockville, Maryland: US Food and Drug Administration.
2. International Conference on Harmonisation (ICH). 2009. Q8(R2), pharmaceutical development. Geneva, Switzerland: International Conference on Harmonisation.
1876
CUI ET AL.
3. Trochim W. 2000. The research methods knowledge base (The design section). 2nd ed. Cincinnati, Ohio: Atomic Dog Publishing.
4. Shadish WR, Cook TD, Campbell DT. 2002. Experimental and quasi-experimental designs for generalized causal inference (chapter 1). Belmont, California: Wadsworth Cengage Learning, pp 1–31.
5. Belli G. 2009. Nonexperimental quantitative research. In Research essentials: An introduction to designs and practices; Lapan SD, Quartaroli MT, Eds. 1st ed. San Francisco, California: Jossey-Bass, pp 59–77.
6. Johnson B, Christensen L. 2004. Educational research: Quantitative, qualitative, and mixed approaches (chapter 13: Nonexperimental quantitative research). Thousand Oaks, California: SAGE Publications, Inc., pp 343–374.
7. International Conference on Harmonisation (ICH). 2005. Q9, quality risk management. Geneva, Switzerland: International Conference on Harmonisation.
8. Garcia-Munoz S. 2009. Establishing multivariate specifications for incoming materials using data from multiple scales. Chemom Intell Lab Syst 98:51–57.
9. Hocking RR. 1976. The analysis and selection of variables in linear regression. Biometrics 32:1–49.
10. Efroymson MA. 1960. Multiple regression analysis. In Mathematical methods for digital computers; Ralston A, Wilf HS, Eds. New York: Wiley, pp 191–203.
11. Guyon I, Elisseeff A. 2003. An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182.
12. Short SM, Cogdill RP, Drennen JK III, Anderson CA. 2011. Performance-based quality specifications: The relationship between process critical control parameters, critical quality attributes, and clinical performance. J Pharm Sci 100:1566–1575.
13. Hosmer DW, Lemeshow S. 2000. Applied logistic regression. 2nd ed. Hoboken, New Jersey: John Wiley & Sons, Inc., pp 117–118.
14. Cui Y, Song X, Reynolds M, Chuang K, Xie M. 2012. Interdependence of drug substance physical properties and the corresponding quality control strategy. J Pharm Sci 101:312–321.
15. Dalgaard P. 2008. Introductory statistics with R. New York: Springer Science+Business Media.