Computer Methods and Programs in Biomedicine 59 (1999) 19 – 29
A procedure for generating bootstrap samples for the validation of nonlinear mixed-effects population models John Parke a,*, Nicholas H.G. Holford b, Bruce G. Charles c b
a Pharmacy Department, Wolston Park Hospital, Wacol, Qld. 4064, Australia Department of Pharmacology and Clinical Pharmacology, Uni6ersity of Auckland, Auckland, New Zealand c School of Pharmacy. The Uni6ersity of Queensland, St. Lucia, Qld. 4072, Australia
Received 16 June 1998; received in revised form 2 November 1998; accepted 10 November 1998
Abstract A method is presented for automated preparation of bootstrap data samples and their presentation to NONMEM for use in the validation of population pharmacokinetic or pharmacodynamic models, which have been developed with relatively small numbers of subjects. The bootstrap sampling procedure involves the use of an MS-DOS batch file and a script file for use with the public-domain text processing program, AWK, which is a single EXE file (48k in size for the version used in this report). A UNIX version of AWK is also available and UNIX users can adapt the process to their needs. Its use obviates the need for expensive, high-end statistical packages and associated script files written using in-built programming languages. The method can easily be adapted to bootstrap sampling requirements in other biomedical modelling applications. © 1999 Elsevier Science Ireland Ltd. All rights reserved. Keywords: Bootstrap; AWK; Excel; NONMEM; Population modelling
1. Introduction NONMEM (NONlinear Mixed Effects Modelling) [1], currently is the most widely used software for the pharmacokinetic and/or pharmacodynamic modelling of sparse drug concentration/drug effect data obtained from a population of patients. The program uses a maximum * Corresponding author. Tel.: +61-7-32718231; fax: + 617-32718231.
likelihood criterion to simultaneously estimate population median values of fixed-effects parameters (e.g. drug clearance and volume of distribution and coefficients for demographic and clinical factors which modify these values in the population) and values of the random-effects parameters (e.g. inter-individual variance of clearance and volume and the intra-individual or unexplained variance in the data which remains after fitting the population model). The parameters of a model frequently are used for simulation or pre-
0169-2607/99/$ - see front matter © 1999 Elsevier Science Ireland Ltd. All rights reserved. PII: S 0 1 6 9 - 2 6 0 7 ( 9 8 ) 0 0 0 9 8 - 4
20
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
Fig. 1. Flow chart for bootstrap sampling with NONMEM.
diction of dosage requirements, however, nonlinear iterative search algorithms such as used in NONMEM may produce spurious parameterization of models, especially if variables in a complex population model ‘mask’ each other. Confidence in the accuracy and robustness of the model can be greatly enhanced if its predictive performance can be assessed in data which is representative of that to be analysed but which is not used to develop the models [2,3]. This may be achieved
quite simply by randomly selecting 25–50% of the data as the ‘validation’ data set and setting this aside at the outset while the model is developed on the remaining (‘test’) data. The observed responses in the validation data set and those predicted by the model may be compared by estimating the statistics of mean bias and precision and associated confidence limits [4]. Often, however, the total amount of data available may be limited to the extent that the develop-
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
21
Fig. 2. Source listing of BSNM.AWK.
ment of the model is markedly compromised if the validation data are taken out. Furthermore, assessment of predictive performance using a vali-
dation data set which is very small may lead to rejection of an otherwise ‘valid’ model. Here we describe some automated procedures for prepar-
22
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
Fig. 2 (Continued).
ing bootstrap data samples with an implementation of the public domain program, AWK. An MS-DOS
batch file controls the process and presents the bootstrap data samples to NONMEM.
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
23
Fig. 2. (Continued)
2. Principle of the bootstrap Theory and application of bootstrap methods have been reported previously in detail [5 – 8]. Briefly, bootstrapping approaches are used to assess how accurately the value of a given parameter of a sampled distribution can estimate the (unknown) true value in the wider population. The original data are assumed to be an independent and identically distributed sample of size, m, from an unknown probability distribution, G (x1, x2,…, xn ). The empirical distribution function, G. , of this sample is the discrete distribution that sets the probability of each value xi (i =1, 2,…, m) in the sample to the value of 1/m. The value of some statistic (e.g. arithmetic mean) of the empirical distribution function G. is computed and used as an estimator of the same statistic for the entire distribution G. In order to determine its
accuracy the statistic is calculated for a large number of bootstrap samples. A total of 200 iterations are suggested for the mean value and standard deviation of a statistic, whereas several thousands are required for estimating 95% confidence intervals [7]. The standard error of the statistic over all of these bootstrap samples is called the ‘bootstrap estimate of the standard error’ of the statistic. As the number of bootstrap samples approaches infinity the standard deviation of the statistic estimated from G. approaches the population standard deviation of G. The original data used to develop the model can be represented by the set of vectors Y= (x1, x2,…xi,…, xm) G. , where the vector xi represents all the data (dependent and independent variables) for the i-th patient. A bootstrap sample is generated by repeated random sampling, with replacement, of an m-sized ‘pseudosample’ from
24
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
Fig. 3. Source listing of BOOTNM.BAT
the original data set. At each sampling step, every vector xi has an equal probability of being chosen. Thus, it is possible to choose three of x1, none of
x2, five of x3 and so forth. This sampling is repeated until the bootstrap sample also consists of m vectors, Y*= (x*1 , x*2 ,…, x*i , …, x*m } G. ,
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
25
Fig. 3. (Continued)
where the vector x*i represents all the observations for the i-th randomly selected subject.
3. Methods
3.1. Generation of bootstrap samples An overview of the process is provided by the flow diagram (Fig. 1). The bootstrap-sampled data set is produced using the text file, BSNM.AWK (Fig. 2), which instructs AWK.EXE to randomly sample, with replacement, the original NONMEM data set on the basis of the patient ID number in column 1. The process continues until the last data line in the original data set. The resulting ‘bootstrap sample’ is then saved to a text file, NMSAMP.DAT. In the process, a new ID number is prepended to each of the original IDs and this must be taken into account during any subsequent modelling.
3.2. Presentation of bootstrap samples to NONMEM The MS-DOS batch file program BOOTNM. BAT (Fig. 3) controls the process by telling BSNM.AWK and AWK.exe the number of iterations to process. For each iteration, it also invokes NONMEM, for data processing according to the instructions contained in a standard NMTRAN control file and in a FORTRAN subroutine INFNOUT.FOR (Fig. 4) after compiling and linking to the NONMEM executable in the usual way [1]. The INFNOUT.FOR subroutine (which must be called from the NMTRAN control file) outputs a set of files containing the estimates and their standard errors, together with an error file (IEREC) which indicates whether the estimation terminated successfully. It also outputs a table containing observed and predicted values, which are used to find the bias and precision for the model).
26
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
Fig. 4. Source listing of subroutine INFNOUT.FOR
BOOTNM.BAT then renames these output files according to the iteration number (i.e. THETA1, …, THETA20, …, etc.). Finally, at the comple-
tion of the sequence of iterations, it concatenates these files, except for the tables, (to BIGTHETA etc.) and moves the resulting files to another
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
Fig. 4. (Continued)
27
directory. Processing of the bootstrap sample data sets takes place in batches of 50 because of limitations in the number of IDs able to be placed on one line in BATCH.BAT. To test whether everything works with a set of data and control files, BOOTNM.BAT initially processes just two iterations. If all is satisfactory, the user has the option to ‘REM’ out this line and remove the REMs on succeeding lines up to the number of iterations required. If 250 iterations are performed, then those iterations with errors in BIGIEREC can be deleted
Fig. 5. Source listing for sample NMTRANS control file (BS.CTL).
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
28
Table 1 First 25 sets of parameters from 200 bootstrap iterations THETA1
THETA2
ETA (OMEGA)
ERR1
0.24633 0.25076 0.25512 0.24835 0.24989 0.27196 0.24620 0.24679 0.26456 0.25436 0.25028 0.24894 0.25836 0.26566 0.25694 0.25051 0.26147 0.26690 0.26091 0.26596 0.25580 0.25688 0.26480 0.26903 0.25078 – – –
0.77900 0.72626 0.74527 0.76496 0.70898 0.66972 0.73949 0.71573 0.72676 0.73492 0.73143 0.73943 0.74304 0.74854 0.73113 0.75616 0.74541 0.69576 0.70940 0.69024 0.74773 0.70959 0.67502 0.68991 0.71365 – – –
0.04100 0.03906 0.04182 0.03235 0.03807 0.04422 0.04917 0.04556 0.05024 0.02965 0.03418 0.04119 0.02380 0.02610 0.04519 0.04142 0.06226 0.03579 0.04115 0.04293 0.02978 0.04174 0.03920 0.03655 0.03888 – – –
6920 4910 4860 5530 6850 5370 5630 6450 4990 4800 4650 5320 4850 5560 6480 4730 6480 4310 4840 6000 4790 4900 5810 5260 6180 – – –
and predictive performance of a pharmacokinetic model developed using NONMEM [9]. It is anticipated that the use of bootstrap validation methods will increase substantially in the future. Generation of each bootstrap data set takes just a few seconds, which is negligible compared with the time required for NONMEM to process the data. A NONMEM data set containing 9994 rows and 18 fields of data required 10 s for the generation of each bootstrap-sampled NONMEM data set with an Intel Pentium 233 system using 32 Mb of RAM. The steps described above are limited to providing the necessary bootstrap data files that are in a form ready for analysis using NONMEM. The output from NONMEM, viz. the values of the fixed effects parameters would then be subjected to further statistical assessment such as described elsewhere [4,7]. The procedure described here obviates the need to use expensive, high-end statistical program packages (e.g. S-Plus, Systat, Statistica) which often require script files written under the format of the in-built programming language. AWK.EXE is a small public-domain program (the version used for this process is only 48k in size), whereas other scripting languages, such as PERL, can occupy many megabytes of disk space. The sampling method is simple to use; the process is started at the DOS level by the command,
whilst still leaving 200 + iterations for determination of mean parameter values (in practice, it is unusual to find more than 20 estimations which terminate with error codes).
4. Discussion In its document ‘Guidance for Industry: Population Pharmacokinetics-Draft Guidance’ published in September 1997, the US Center for Drug Evaluation and Research, Department of Health and Human Services (Food and Drug Administration), suggested the use of the bootstrap procedure as a satisfactory method for the validation and checking of population models in the drug approval process. Recently the bootstrap approach has been to be used to assess the stability
.
‘bootnm data – file – name’ Table 2 Comparison of bootstrap mean THETAs with original THETAs Parameter
Original data
Mean 200
% Difference
u1 S.E. u1 u2 S.E. u2 v; 2CL S.E. v; 2CL s2 S.E. s 2 s S.E. s Bias Precision
0.2560 0.00836 0.732 0.0263 0.0405 0.00952 5690 829 75.43
0.2560 0.0082 0.7322 0.0261 0.04029 0.00911 5695 790 75.47
0 1.951 0.02732 7.66 0.521 4.5 0.0878 4.937 0.053
−9.233 109.6002
−9.471 107.9814
2.577 1.499
J. Parke et al. / Computer Methods and Programs in Biomedicine 59 (1999) 19–29
where data – file – name is the name of the original data file. The NMTRANS control file must be renamed, or copied, to ’BS.CTL’. At the finish of the process, the output files (BIGTHETA etc.) are ready for analysis to find the means of all the relevant parameters, which are then compared with corresponding values for the model developed with the original data set. Similarly, the bias and precision are obtained from each table, entered into a spreadsheet and the mean values are compared with those of the original data set. A sample NMTRANS control file is shown in Fig. 5. The first ten lines of a sample BIGTHETA file are shown in Table 1, whilst a comparison between the THETAs from the original data and the mean values for 200 bootstrap samples is shown in Table 2. Finally, it should be noted that there are different versions of AWK.EXE currently available in the public domain, but many do not produce any output with BOOT.AWK. The version used by the authors (version 3.20; 22 May, 1991; Copyright 1990, 91 by Rob Duff, Vancouver BC, Canada V5N lY9) for the present application is a 47.9 kb file. Further information on BSNM.AWK and AWK.EXE can be directed to one of us (NH) at n.holford@,auckland.ac.nz. Other programming inquiries including electronic
29
distribution of the program files should be directed to (JP) at
[email protected].
References [1] A.J. Boeckmann, L.B. Sheiner, S.L. Beal, NONMEM Users Guides. The NONMEM Project Group, Regents of the University of California, San Francisco, CA, 1992. [2] T. Lee, B. Charles, P. Steer, V. Flenady, A. Shearman, Population pharmacokinetics of intravenous caffeine in neonates with apnea of prematurity, Clin. Pharmacol. Ther. 61 (1997) 628 – 640. [3] J. Parke, B. Charles, NONMEM population modelling of orally-administered cyclosporine from routine drug monitoring data following heart transplantation, Ther. Drug Monit. 20 (1998) 284 – 293. [4] L.B. Sheiner, S.L. Beal, Some suggestions for measuring predictive performance, J. Pharmacokinet. Biopharm. 9 (1982) 503 – 512. [5] B. Efron, Bootstrap methods: another look at the jacknife, Am. Stat. 7 (1979) 1 – 26. [6] B. Efron, G. Gong, A leisurely look at the bootstrap, the jacknife and cross validation, Am. Stat. 37 (1983) 36 – 48. [7] B. Efron, R.J. Tibshirani, Bootstrap methods for standard errors, confidence intervals and other measures of statistical accuracy, Stat. Sci. 1 (1986) 54 – 77. [8] B. Efron, R.J. Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, New York, 1993. [9] E.I. Ette, Stability and performance of a population pharmacokinetic model, J. Clin. Pharmacol. 37 (1997) 486 – 495.