Heart rate correction models to detect QT interval prolongation in novel pharmaceutical development

Heart rate correction models to detect QT interval prolongation in novel pharmaceutical development

Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41 Contents lists available at ScienceDirect Journal of Pharmacological and Toxico...

3MB Sizes 2 Downloads 75 Views

Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

Contents lists available at ScienceDirect

Journal of Pharmacological and Toxicological Methods j o u r n a l h o m e p a g e : w w w. e l s ev i e r. c o m / l o c a t e / j p h a r m t ox

Original article

Heart rate correction models to detect QT interval prolongation in novel pharmaceutical development Michael Markert a,⁎, Ruijun Shen b, Thomas Trautmann a, Brian Guth a a b

Department of Drug Discovery Support, General Pharmacology Group, Boehringer Ingelheim Pharma GmbH & Co Kg, Germany Department of Bioinformatics, Eberhard-Karls University of Tübingen, Germany

a r t i c l e

i n f o

Article history: Received 4 April 2011 Accepted 5 May 2011 Keywords: Methods Conscious dog QTana Telemetry QT duration Heart rate

a b s t r a c t Introduction: The QT interval of the electrocardiogram (ECG) reflects the duration of ventricular depolarization and repolarization. A drug-induced prolongation of ventricular repolarization, and thereby QT prolongation, is recognized to be a marker for an enhanced risk for ventricular arrhythmia. The assessment of a drug's effect on the QT interval has therefore become routine within pharmaceutical research and development. However, the heart rate has a major influence on the QT interval; the QT interval shortens as heart rate increases such that one needs to account for such heart rate-dependent changes when evaluating possible drug-induced effects on the QT interval. The relationship between the QT interval and heart rate can be modeled mathematically and using this function a so-called “corrected” QT interval (QTc) can be generated to assess drug-induced effects independent from heart rate-dependent effects. In the past few years, a large number of mathematical relationship have been described that supposedly best describe the heart rate–QT relationship. In this paper we describe a novel approach for selecting the optimal mathematical function for this purpose for a given individual. Methods: Mongrel, purpose-bred dogs (16, males and females) were instrumented with radiotelemetry transmitters (ITS) for measurement of aortic pressure (AP), left ventricular pressure (LVP), the lead II ECG and body temperature. ECGs were recorded continuously without drug treatment and include a range of HRs due to spontaneous, physiological changes over the 24 h of data acquisition. Various mathematical models (N 20) were then used to evaluate the HR–QT relationship and these were compared statistically to objectively select the model best fitting the data set of each individual animal. Results: In this study a dynamic analysis algorithm was developed to find the optimal descriptor of the HR–QT relationship for a given individual animal under control conditions. The use of this optimal relationship provides the best possible approach for detecting drug-induced effects on the QT interval for compounds that also affect the heart rate. Discussion: Several numerical methods to optimize the correction functions and statistical procedures to perform significance tests were discussed and implemented in a QT/RR relationship analysis system, named QTana. Given a sample data set, QTana searches the best correction model(s) from the integrated 11 QT/RR relationship modeling functions. © 2011 Elsevier Inc. All rights reserved.

1. Introduction Since the mid 1990s both cardiac and non-cardiac drugs have been shown to be potentially proarrhythmic due to the blockade of myocardial potassium currents that repolarize the heart thereby delaying ventricular repolarization. This effect is seen as a QT interval prolongation in the electrocardiogram. QT interval prolongation is itself not clinically relevant, but it is considered to be a risk factor associated with an increased propensity for ventricular tachyarrhythmias including Torsades de Pointes (TdP), which may lead to syncope, cardiac arrest, or sudden death. Therefore, in the drug discovery process it is ⁎ Corresponding author at: Boehringer Ingelheim Pharma GmbH & Co KG J91 UG, Birkendorferstr.65 88397 Biberach, Germany. Tel.: + 49 7351 548727; fax: + 49 7351 838727. E-mail address: [email protected] (M. Markert). 1056-8719/$ – see front matter © 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.vascn.2011.05.002

very important for pharmaceutical companies to detect, as early as possible, whether a new compound or its metabolites can prolong the QT interval. (Chaves et al. 2006; Fermini & Fossa 2003) The heart rate is has a major influence on the QT interval with a QT shortening as heart rates increases and a lengthening of the QT interval as heart rate is decreased. The detection of drug-induced effects on the QT interval is therefore complicated when changes in the heart rate occur. This has been approached in the past with the use of various QT– heart rate “correction” models that attempt to remove the heart ratedependent effect on the QT interval through the calculation of a “heart rate corrected” (QTc) QT interval duration. With a QTc, the heart ratedependent effect should be removed but the possible drug-dependent effect should be maintained (Davey 1999). The success of this approach requires that the mathematical transformation applied faithfully reflects the heart rate–QT relationship for a given individual. Various algorithms that have been proposed for this purpose and it has

26

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

become apparent that there is not a single transformation that provides the best possible mathematical fit for all individuals, suggesting that some sort of objective selection process is required. In this study, a dynamic method was developed to derive data-driven heart rate correction models from any given experimental data set. Different models for correcting the QT interval duration for heart rate were assembled from the literature to comprise a library of transformation options. The new approach allows one to compare each of the available models and then select the one that best reflects the individual QT–heart rate relationship, which can then be used in further studies to differentiate heart rate-dependent from drug-dependent effects on the QT interval. As any experimental data collection may contain outlying observations, several noise detection methods were developed to filter data sets for such outlying values before applying a given set of model parameters. Specific parameters for each model were estimated by using (nonlinear) least squares regression methods. Applying these data-driven models to post-dose data groups, different model selection criteria, such as Pearson's coefficients, PRESS RMSE, Akaike's information criterion (AIC) and positive–negative tests, are considered to select the best model(s) (Akaike 1974). To evaluate the performance of the QT/RR relationship analysis system, a set of positive and negative tests are run on it. As expected, it identifies significant QT interval prolongations in positive tests and reports none in negative ones. Further factors having influences on QT intervals are also studied in the present work through following tests. Two-way ANOVAs are applied to view the effects of gender and dosage differences on QT intervals. These tests show that no interaction effect is detected but gender differences do affect the QT/RR relationship patterns. Special attention should be attracted to gender specified studies in non-clinical development of drugs prolonging QT intervals. Two-sample t-tests and one-way ANOVAs are conducted to investigate inter-individual variabilities and their influences on QT/RR relationship patterns. Some individuals present biological significant differences with one and the other. Correction formulas for the individual data sets may also differ from each other. During the QT/RR relationship analysis, a study of individual correction models is not unnecessary. The last sets of tests are run to inspect the influences of median values of different time intervals on QT/RR relationship patterns. These tests show that 10 min median values smoothen beat-to-beat measurement variability better and contain less outlying observations, whereas 5 min median values have doubled the sample size, which potentially can estimate parameters for correction models more accurately. But at the same time, these sample values contain much more outliers as reflections of biological differences and measurement variability. If the sample sizes are big enough, no obvious differences can be seen between the two types of median values except the thickness of data point clouds in corresponding plots. Filtration algorithms to prepare noise-free data, non-linear least square regression analysis methods for parameter estimation, such as Nelder–Mead Simplex algorithm, and different model selection criteria are also discussed. The practical design and implementation of these methods are presented. Tests with different design manners are performed on the implemented tool (QTana). Designs and results of these tests are demonstrated.

rate, inadequacies in this model have been apparent. Fridericia's cubic QT root correction QTc = pffiffiffiffiffiffi represents a popular alternative, which has 3

RR

been suggested to correct QT better than Bazett's function. Nevertheless, it is still unreliable at low and high heart rates. Both of these functions over-correct QT at low heart rates and under-correct at fast heart rates. In comparison to these two non-linear correction models, linear regression correction functions, like the Framingham heart study linear correction model, have been introduced subsequently in an attempt to correct QT for heart rate changes (Sarma, Sarma, Bilitch, Katz, & Song 1984). It has become apparent that a universal correction model may not be feasible. A model developed in one laboratory may not be utilized in another one. Moreover, the application of a model developed using data from a specific species even in the same laboratory is not applicable to other species (Markert, Stubhan, Mayer, Trautmann, & Guth 2008). Even a model gained through the data collected from a single animal might not work with the data collected in its later life period. In the last few years, many correction models, including parabolic, hyperbolic, exponential, polynomial functions, have been introduced and reported to have excellent performance in a given situation (Malik, Farbom, Batchvarov, Hnatkova, & Camm 2002; Satterthwaite 1946). For example, Meyners and Markert found the data-oriented Sarma's exponential correction function (QTc = QT + b (exp(c ⋅ RR) − exp(c)) had good performance in their studies using conscious Labrador dogs (Meyners & Markert 2004). One may conclude that it is necessary to develop a specific HR correction function for a given data set. This procedure is, however, very complicated and time-consuming. In this work a dynamic analysis algorithm was developed to simplify this work. First, a model pool, containing a linear function, Bazett's and Fridericia's formulas and other non-linear correction models, was set up. Then, given an input data set, different numerical and statistical methods were performed to detect the most suitable correction model(s). 2. Methods 2.1. Telemetry system The telemetry system used to measure cardiovascular parameters is manufactured by Konigsberg Instruments, Inc. (Pasadena, CA) and marketed by RMISS (Delaware). It consists of 5 major components: a) an implantable unit; b) a receiver (antenna) located in the animals cage together with an amplifier; c) ambient pressure monitor to measure atmospheric pressure; d) a PC-based “base station” to receive and process the amplified signals; e) a PC-based data acquisition system (NOTOCORD Hem 4.2) to process signals. The implantable unit (“T27” total implant) consists of (1) two high fidelity pressure transducers (4.0 mm diameter), (2) ECG cable, (3) micro-power battery-operated electronics that process and digitize the information from the pressure transducers and the ECG lead, (4) a radio-frequency transmitter that sends the signals to the telemetry receiver, and (5) a battery. A small cable projecting from the transmitter contains a magnetic switch that allows the device to be turned on and off. 2.2. Animals

1.1. Current QT correction models and problems, background QT interval correction approaches model the observed correlation between QT interval and RR with a certain mathematical function. A given measurement of the QT interval is transformed using this function to produce a so-called “corrected” QT interval (QTc). The best known correction model for QT interval is Bazett's square ffiffiffiffi) which was introduced in 1920 (Bazett root formula (QTc = pQT RR 1920). Although it is still widely used to correct QT interval for heart

Treatment of the animals followed the German Law on the Protection of Animals and was performed with permission from the Baden-Württemberg Animal Welfare Committee. Trained Labrador dogs (male or female), at least one year of age, with body weights between 22 and 30 kg bred at Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach were used. The dogs were group-housed as pairs of 2 in separate cages and had access to water ad libitum and were fed a standard dog diet once daily. They also had daily exercise periods of at least 1 h, each afternoon.

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

2.3. One-time surgical implantation The transducers of the T27 implant were calibrated and the unit was sterilized using a low pressure ethylene oxide process prior to implantation. For perioperative analgesia the dogs were given meloxicam (Metacam, 0.4 mg/kg, orally). Additionally, a second analgesic component was given: 2–4 Durogesic Smat patches (fentanyl, 25 μg/h) were attached behind the ear the day before surgery after shaving and degreasing the skin. During surgery, a supplemental fentanyl infusion (0.005 mg/kg/h) was given. For prophylactic antibiotic treatment, amoxicillin (Duphamox LA, 15 mg/kg) was given beginning the day before surgery. Dogs were anesthetized with a combination of Rompun (xylazine hydrochloride, 1 ml/10 kg, i.v.) and Ketavet (ketamine hydrochloride, 0.7 ml/10 kg, i.v.) after premedication with Atropin (0.04 mg/kg i.m.) and ventilated with 66% O2 and 1%–1.5% isofluorane (forene) at a ventilation rate of 14/min. All procedures were performed under aseptic conditions using sterilized equipment. The dogs were placed in a lateral recumbency with the left side facing the surgeon. An incision was made between the fifth and sixth intercostal space. A small pocket was opened in the abdominal wall for implantation of the transmitter, battery housing, and induction switch coil. The cables with both pressure transducers and ECG leads extending from the ventrally implanted transmitter were guided subcutaneously to the lateral incision. The antenna was guided subcutaneously from the transmitter location towards the spine and then runs parallel to the spine for ~25 cm. The initial ventral incisions required for battery and transmitter placement were closed. A left thoracotomy was performed between the fifth and sixth intercostal space to expose the left ventricle apex for insertion of the left ventricular Konigsberg transducer. The aorta pressure transducer was implanted next. The aortic transducer, which also served as one electrode of the ECG, was inserted into the thoracic aorta just below the aortic arch. The transducer was sutured into place and blood flow was restored. The lung was then inflated and the intercostal muscles were sutured closed and the pneumothorax evacuated. Chest incisions were closed. The gas anesthesia was then turned off and dogs were allowed to wake up. Analgesics and antibiotics (Temgesic and Duphamox® LA, Amoxicillin-Trihydrate) were administered for 10 days following the procedure To support a good recovery and to ensure that the dog has no postoperative pain, a transdermal plaster (fentanyl, 25 μg/h) was placed on the skin for 2 days. Dogs are allowed to recover for at least 21 days before experiments using test substances are initiated. 2.4. Experimental design 2.4.1. Data acquisition and analysis Digitized telemetry signals were processed by NOTOCORD software (Hem 4.2) on a beat-to-beat basis. The following parameters were continuously recorded for the duration of the experiment: aortic pressure (AP), left ventricular pressure (LVP), ECG lead II and body temperature. The hemodynamic and ECG parameters were calculated during the experiment and include: systolic, diastolic and mean aortic pressure, peak systolic and end-diastolic left ventricular pressure, LV dP/dt max and dP/dt min, heart rate; PQ-, QRS- and QT intervals. NOTOCORDsoftware was used for acquisition of data whereas EXCEL was used for some basic data analyses. Data were summarized at predefined time points by calculating median values± SD. At each time point, a minimum of 400 sequential beats were used to calculate the median value. The summarized data are given as mean values ± SEM. The different base levels of the individual animals were taken into account by referring the measured values after administration of the test compound to the pre-treatment values. With these values the standardized area under the curve (AUC divided by interval length) was calculated for the time intervals of interest. Comparisons between treatment and placebo were performed by one way analysis of

27

variance (ANOVA) followed by the Student t-test. Statistical significance was accepted when p b 0.05. Using the error term of the analysis of variance as an estimate for the variability many-to-one t-tests comparing the dose groups of the test compound with the control group were added. The evaluation was performed using the software package SAS 8.02 (SAS Institute Inc., Cary, USA). 2.5. Outlier filtration methods Rare or atypical data objects that do not comply with the general behavior or model of a data set are outliers. They originate from one of the following three sources: • an error occurred during the recording of data, • an error of data collection, e.g. including objects from other populations, or • an actual extreme value from an unusual subject. Data affected by errors of this type in the acquired experimental data often have a detrimental effect on the analysis performance. Therefore, before the analysis procedure, erroneous data are identified and removed to improve the accuracy of the analysis. Outliers from the last source are actually correct data and should not be removed from the analysis model. It is, however, difficult to distinguish such extreme data from erroneous ones. Specific background knowledge and experience are needed to fulfill this task. Essentially, there are two major outlier detection modes (Esbensen 2002): • Data analytical: the relative (geometrical) distribution of the data, e.g. in score plots, is all the analyst has to go by. Decisions must be based wholly on experience. • Domain, problem-specific knowledge may in other situations constitute an absolute reference with which to make the decision whether a particular data point or data set are outliers or not. Score plots are helpful for outlier detection. Compared to the other objects, an outlier will have one, or more, excessively high or low scores. An outlying object or a group of them will then be separated from the rest of the data instances in a score plot to a larger or smaller degree. Fig. 1 gives two examples of outliers (Esbensen 2002). The marked object in the left panel is a potential outlier, which nevertheless may be accepted by some observers as a correct data point. The right panel shows a clear outlier, which is distinctly separated from the rest of the objects. For data exploration purposes it may be feasible to take out outlying objects manually, whereas for automatic data analysis it is not acceptable to perform a continuous manual inspection of score plots because of the needed time and resources. In these situations it is essential to identify outliers and then delete them automatically. As discussed above, it is difficult to decide whether an outlier is really an erroneous object or a correct, but extreme value. Automatic filtration methods may exclude these extreme values unexpectedly and reduce the accuracy of the analysis system. However, it is reasonable and also important to apply these filtration methods to automatically collected data and their analysis, since removal of some correct extreme values would bring less harm to an analysis system than keeping some erroneous outlier in it. Thus, in this thesis, the following filtration methods were applied. 2.5.1. Range filtration Physiological heart rates of any species vary in a certain range, e.g. 30–200 bpm in dogs and 40–200 bpm in humans (Klumpp, Trautmann, Markert, & Guth 2006). Empirically a QT interval is considered abnormal, when it is outside the range of 150–500 ms. Any data point outside these ranges can be treated as an outlier and reasonably deleted. This is a very simple but quite efficient method to identify and delete erroneous data points in this situation.

28

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

Fig. 1. Mild and extreme outliers in score plots.

2.5.2. Filtration with box plot Another filtration approach used in this work is based on box plots. Box plots are useful to represent data visually. It can be applied to identify outliers and compare data distributions. A typical box plot has the components listed in Table 1. Values beyond an inner fence, but not an outer fence, are called “outside values” (suspect outliers), Values outside the outer fences are named “far out values” (highly suspect outliers). They usually are plotted as ‘o's and asterisks by most statistical analysis programs, respectively. The upper hinger, lower hinger and median form the box. Vertical lines, called “whiskers”, are often added at upper and lower adjacents. These lines depict not only the spread of the data but also separate outside and far out values. Fig. 2 represents a typical box plot for detecting outliers. 2.5.3. ESD filtration Another statistical filtration method inspected in present study is to detect outliers by use of Student's test. It applies a test statistic, named the extreme studentized deviate (ESD). ESD = max 1;…;n

jxi −x j s

ð2:1Þ

to identify outliers, where s is the standard variation of the sample. If a sample of size n has no outliers, the ESD value will correspond n th percentile approximately (Rosner 1995). to the 100%× n+1 Otherwise the statistic will be larger. The appropriate critical value ESDn,1 − α to imply existence of outliers is dependent on sample size n α (Rosner 1983; Rosner 1995): and the percentile p = 1 − 2n tn−2; pðn−1Þ ESDn;1−α = rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi   2 n n−2 + tn−2;p

Assume one has a sample X ~ N(μ,δ2), and it is suspected that there are some outliers. The procedure to detect an outlier in X is as follows: • Hypothesis: H0: there is no single outlier vs. H1: there exists an outlier. • Compute the test statistic ESD with Eq. (2.1) and the critical value ESDn,1 − α with Eq. (2.2). Denote the ESD corresponding sample value xi with x(n). • If ESD N ESDn,1 − α, H0 is rejected and x(n) is reported as an outlier. Otherwise H0 is accepted and no outliers exist. This procedure detects only a single outlier through one hypothesis test. Usually, there are more than one outlier in a sample. In this situation, an ESD many-outlier detection procedure will be appropriate, as it can detect either single or multiple outliers accurately. Multiple outliers can inflate the standard variation and make it difficult to detect the single most extreme sample point as an outlier, especially when the outliers are roughly of equal distance to the sample mean. Before the ESD many-outlier procedure can be performed, an upper bound  nfor the number of outliers should be set. Rosner suggested min ;5 as a reasonable upper bound (Rosner 1983; Rosner 1995). 10 If there are more than 5 outliers, the sample has very likely an underlying non-normal distribution unless the sample size is very large.

ð2:2Þ

Table 1 Terminology of box plots. Name

Definition

Upper hinger Lower hinger Median H-spread Step Upper inner fence Upper outer fence Lower inner fence Lower outer fence Upper adjacent Lower adjacent

75th percentile of the data set 25th percentile of the data set 50th percentile of the data set Upper hinger-lower hinger 1.5 × H-spread Upper hinger + 1 Step Upper hinger + 2 Steps Lower hinger − 1 Step Lower hinger − 2 Steps Largest value below upper inner fence Smallest value above lower inner fence

Fig. 2. A typical box plot to detect outliers.

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

Now, assume one has a sample X ~ N(μ,δ2), and it is suspected that n  ;5 multiple outliers. The ESD multiple-outlier there are k = min 10 detection procedure can be performed as follows: • Hypothesis: H0: there are no outliers vs. H1: there exist l outliers, 0 b l ≤ k. • Compute the test statistic ESD and denote ESD and the corresponding sample value xi with ESDn and x(n). • Remove x(n) from the sample and recompute the mean, standard variation, ESD statistic from the remaining n − 1 data. Denote this ESD statistic and the corresponding sample value with ESDn − 1 and x(n − 1). • Continue removing the most outlying data from the sample and computing ESD statistic for each step. Denote the ESD statistics and the possible outliers with ESDn,ESDn − 1, …,ESDn − k + 1 and x(n),x(n− 1), …,x(n − k + 1). • Compute the corresponding critical values ESDn,1 − α,ESDn − 1,1 − α, …, ESDn-k + 1,1 − α with Eq. (2.2).

where QTi is the value of the response or dependent variable from the i-th pair, α and β are two unknown parameters, RRi is the value of the independent variable from the i-th pair and εi is a random error term which is normally distributed with mean 0 and unknown variance σ2. The parameters α and β are called regression coefficients. α is the intercept of the straight regression line. If the scope of the independent variable includes 0, α gives the corresponding value of the dependent variable when the independent variable equals 0. β, usually named slope of the regression line, represents the increase (or decrease, if β b 0) in the dependent variable associated with an increase in the independent variable. A good estimation of both parameters can be assessed through a method called “least squares”. Given a set of data pairs, for each pair of values (RRi, QTi), the deviation from the observed value QTi to its expected value QTi' = α + β × RRi is considered. Then, the sum of squared deviations is computed:  n  2 S = ∑ QTi −ðα + β×RRi Þ i

2.6. Linear and nonlinear regression

29

ð2:4Þ

Good estimates of α and β minimize the sum S. They can be computed as follows:

The goal of QT/RR analysis is to detect changes of the QT interval caused by the administration of a new drug candidate. Significant QT interval prolongation can result in life threatening arrhythmia. However, the dependence of QT interval on heart rate masks potential drug-induced QT interval changes. Therefore, to view these changes, the right correlation between QT and RR should be defined and this relationship may then be used to remove the dependency with a corresponding correction model. Regression analysis is an appropriate technique for investigating relationships between variables. It can be applied both for assessment of association and for prediction. In most research cases, one variable usually is taken as the response or dependent variable, which is to be predicted from or explained by other variable(s), usually called predictor(s) or independent variable(s). Independent variables are often assumed to be normally distributed and a model is formulated to express the mean of this normal distribution as a function of potential independent variables under investigation. In this study, the dependent variable is QT interval, and the independent variable is RR. Since early last century different models to describe the dependence of QT on heart rates have already been introduced (Bazett 1920; Fridericia 1920). Universally applicable models, nevertheless, have never been found. QT/RR relationship may change from species to species, male to female, one animal to another of the same species and may even change in a given individual over time. Thus, in the current study, instead of finding a general model, the QT/RR relationship detection procedure is based upon a pool of possible models. With respect to the data of each experiment, a suitable model with dataspecific parameters is to be found. The candidate models, containing both linear and non-linear functions, were collected from the QT/RR analysis literatures (Bazett 1920; Fridericia 1920; Satterthwaite 1946; Malik et al. 2002). The parameters of these functions and corresponding correction functions are not fixed but are determined with the help of experimental data. The parameter estimate process is classified into linear and non-linear regression analyses, which are discussed in following sections.

where xi is a known vector of independent variables for the i-th datum (in this study, xi = RRi), θ is a vector of unknown parameters and f a known nonlinear function of x and θ. The errors εi are assumed that they are either uncorrelated with common unknown variance σ2 or independently normally distributed N(0,σ2). Similar to linear regression models, parameters of the nonlinear functions θ are to be estimated when a set of data is given. But the estimation process is different from that by linear regression. Following are the methods most commonly used.

2.6.1. Linear regression In this case, it is assumed that the relationship between the independent variable(s) (RR) and the dependent variable (QT) is linear in nature. Linear regression model describes the mean of normally distributed dependent variable as a linear function of independent variable:

2.6.2.1. Linearization. In many cases the nonlinear regression can be transformed to linear form (linearization) and the parameters can then be roughly estimated by use of least squares method. This method is very easy to use, but the estimated parameters usually are not precise enough. For instance, the Michaelis–Menten model, a very commonly used model in enzyme kinetic, is:

QTi = α + β⋅RRi + εi

ð2:3Þ

∑RR∑QT ∑ðRR × QT Þ− n β= 2 2 ð∑RRÞ ∑RR − n

ð2:5Þ

α = QT −β × RR:

ð2:6Þ

Whether the relationship between independent variables and dependent variables can be depicted with a linear regression, can be decided by the Pearson's correlation coefficient r, which will be discussed in details. Plots, e.g. scatter diagram, can also give a graphic overview of the goodness of fit. 2.6.2. Non-linear regression It is quite often that a linear regression cannot give a satisfactory solution. In QT/RR analysis literatures, most of the suggested correction functions utilize non-linear regression models, such as exponential, logarithmic, etc. These models are often data-driven, i.e. the parameters of these models are determined by a given experimental data set. Given different data set, the parameters usually are also different. The general form of a nonlinear regression is yi = f ðxi ; θÞ + εi

f ðx; θ1 ; θ2 Þ =

θ1 x θ2 + x

ð2:7Þ

ð2:8Þ

30

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

which demonstrates the relationship between the velocity of an enzymatic reaction and the substrate concentration x (Zhu 2004). The nonlinear regression is then: yi = f ðxi ; θ1 ; θ2 Þ + εi

ð2:9Þ

As the equation can be transformed to the following: 1 1 θ 1   = + 2 ; θ θ1 xi 1 f xi; θ1 ; θ2

ð2:10Þ

which is achieved by solving the following estimation equations: n ∂S ∂f = −2 ∑ ðyi −f ðxi ; θÞÞ ðx ; θÞ = 0; j = 1; …; p: ∂θj ∂θj i i

p is the dimension of the θ, i.e. number of parameters. If X denotes the ∂f ðxi ; θÞ as, the estimation equations in n × p matrix with entries (i, j) ∂θ j Eq. (2.12) are: T

ð2:11Þ

where ε′i is a random error with common variance, which however is different from that in Eq. (2.9). These two models (2.9) and (2.11) show great differences when xi is approaching 0 (Zhu 2004). The parameters estimated with this method are then not precise enough for most purpose, even though it can be taken a good start point for a regression on Eq. (2.9). The following simulation is a good example to represents the differences between the linearized and the original models (Fig. 3). The simulation is performed with xi = 0.1,0.2, …,2, θ1 = θ2 = 1 and σ = 0.1. Fig. 3 (a) plots the raw data with the true model curve of x 1 1 (solid curve), the fitted curve after regressing and (dotted y= 1+x yi xi curve) and the fitted curve by using nonlinear regression methods, e.g. Gauss–Newton method (dashed curve). Fig. 3(b) gives the raw data and 1 1 all the three curves on to scale. It can be seen that the extreme rightyi xi hand point distorts the dotted curve (linear fit) remarkably, while the dashed curve (nonlinear regression fit) runs much nearer to the solid curve (true regression fit). The right parameter estimation method for nonlinear regression models is nonlinear least squares. The distribution of random errors ε is the same as that of linear models. Thus, the appropriate parameter estimation is to minimize the nonlinear sum of squared deviations: n

2

S = ∑ ðyi −f ðxi ; θÞÞ ; i

T

X Y = X f ðxi ; θÞ:

the nonlinear regression (2.9) can be linearized as: 1 1 θ 1 + 2 + εi′ = yi θ θ1 xi

ð2:12Þ

Both X and f (xi,θ) have a dependence on the unknown parameters θ. Thus, estimates of these parameters cannot be achieved by use of linear algebra methods, and iterative techniques are needed to approximate them. Almost all nonlinear regression methods behave with following steps (Motulsky & Christopoulos 2003): 1. Start with an initial guess of the parameters (θ(0)) in the equation. 2. Generate the curve defined by the initial estimates and calculate the sum of squares, i.e. the sum of the squares of the vertical distances of the residuals from the curve. 3. Adjust the parameters (θ(m)) to make the curve closer to the data points. 4. Repeat step 3 many times and let the curve even closer to the residuals. 5. Stop the calculations until convergence is reached, e.g. the adjustments make virtually no difference in the sum of squares. 6. Report the current ˆθ as the right values for the best fit. The precise values obtained depends in part on the initial values chosen in step 1 and the convergent criteria of step 5. Therefore, repeat analysis of the same data set may report different results. Take the Michaelis–Menten model (Eq. (2.8)) for instance. First, an infinite number of curves are generated by varying the two parameters θ1 and θ2. For each pair of parameters, the sum of squares is then calculated and the goodness of fit is assessed. Fig. 4 gives a good overview of this process (Motulsky & Christopoulos 2003). As

Fig. 3. Simulated data set for Michaelis–Menten model.

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

31

Table 2 QT/RR modeling and correction functions. Model

Modeling function

Bazett

QT QTc = pffiffiffiffiffiffi RR QT 1 QTc = 1 QT = aRR3 RR3 QT = a − b exp(cRR) + ε QTc = QT + b(exp(cRR) − exp(c)) QT = a + bRR + ε QTc = QT + b(1 − RR)   b 1 +ε QTc = QT−b −1 QT = a + RR RR QT QT = aRRb + ε QTc = b RR QT = a + b log(RR) + ε QTc = QT − b log(RR)     QT −bðRR−1Þ 1000 QT=log(a+bRR)1000+ε QTc = log exp 1000 QT = a + b exp(− RR) + ε QTc = QT − b(exp(− RR) − exp(− 1))

Fridericia Sarma Exp. Linear Hyperbolic Parabolic Logarithmic Shifted Log. Exponential

Correction function

1

QT = aRR2

Þ

Polynomial 2 QT = a + bRR2 + cRR2 + ε 2

2

QTc = QT + b(1 − RR) + c(1 − RR2) + d(1 − RR3)

Polynomial 3 QT = a + bRR + cRR + dRR3 + ε

Fig. 4. Best fit search with nonlinear regression.

the bottom of the surface in the graph is reached, the optimized pair of parameters (ˆθ) is found. Due to the adjustment methods used in the iterative steps 3 and 4 to approach the best fit, there exist many nonlinear regression methods. Most commonly studied and applied techniques are steepest descent, Gauss–Newton, Nelder and Mead (Simplex) methods, which are discussed in following.

QTc = QT + b(1 − RR) + c(1 − RR2)

2.6.2.3. Gauss–Newton method. The key idea of Gauss–Newton method is the assumption that     f ðxi ; θÞ≈f xi ; θˆ + X θ− θˆ : Suppose current estimate is θ(m), then ˆθ can be approximated by solving     T T T θ−θðmÞ ; X Y = X f xi ; θˆ ≈X f ðxi ; θðmÞÞ + X ^ which leads to

2.6.2.2. Steepest descent. Like other methods, this approach starts with an initial guess θ(0), and then runs a sequence of approximations of the form: θðm + 1Þ = θðmÞ + ΔðmÞ; m = 0; 1; 2; … until convergence is reached. At the moment the current estimated value is θ(m), the residual vector e(m) is Y − f (xi,θ(m)), and the gradient of S(θ) equals −2x T e(m). The Δ (m) to direct θ(m) to θ(m + 1) is then k x T e(m), where k is a variable parameter. Usually, the start value of k is 1. It reduces as a sum of squares S(θ(m + 1)) smaller than S(θ(m)) is found. Compared to other methods, deepest descent runs always downhill, can avoid saddle points and is efficient when θ(m) is further from ˆθ. Another advantage of this method is that it always finds a better set of θ unless a local minimum of sum of squares is reached. Thus, it is a good approach to perform nonlinear regression estimations when the start values are poor. On the other hand, it performs slow as it might “zigzag” down the valleys. And it will be even more slower when θ(m) is approaching ˆθ.

 −1 T T θˆ≈θðmÞ + X X X eðmÞ: Similar to deepest descent, here the Δ (m) is k(x Tx) − 1x T e(m), where k usually starts at 1 and reduces as a set of parameters θ(m + 1) is found to have decreased the sum of squares. Gauss–Newton method performs relatively efficient and especially well when closing the minimum point. It, however, can be lost with poor initial values. 2.6.2.4. Levenberg–Marquardt method. Levenberg–Marquardt method is a variation of Gauss–Newton method. Instead of k(x Tx) − 1x T e(m), the Δ (m) used here is    −1 T T T X eðmÞ : k X X + λdiag X X diag(x Tx) is a diagonal matrix whose diagonal entries are the same as those of xTx. It is added to improve the stability of estimation procedure in cases the matrix xTx is nearly singular. This term has also the compromise effect of allowing the method to act like the steepest descent method Table 3 Positive test: results table (moxifloxacin). Models

Fig. 5. Structure of the QT interval prolongation analysis system.

Bazett Fridericia Sarma Exp. Linear Parabolic Hyperbolic Logarithmic Shifted Log. Exponential Polynomial 2 Polynomial 3

Parameter estimates

r

a

b

211.206 220.176 264.027 192.349 228.149 271.246 228.358 5.143E101 269.85 173.545 75.551

c

d

− 0.92 − 0.79 116.966 − 1.195 0.1 35.011 − 0.01 0.173 0.03 − 41.151 0.18 40.217 0.06 1.699E102 0.73 −112.821 0.08 68.131 − 13.725 0.07 342.134 − 254.18 66.935 − 0.21

PRESS AIC RMSE 56.15 35.54 15.66 16.98 15.9 14.48 15.84 10.37 15.73 16.03 21.42

468.25 415.19 322.04 330.51 322.85 311.99 322.42 273.24 321.59 324.78 359.3

32

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

when θ(m) is further from the global minimum and like the Gauss– Newton method when θ(m) is coming near to ˆθ. λ is not defined rigidly. Some nonlinear regression programs, e.g. SAS, initiate it as 0.001 and reduce it by a factor of 10 when a step reduces the sum of squares. If this step fails to do so, λ will be increased accordingly by a factor of 10. 2.6.2.5. Nelder–Mead (Simplex) method. This Method is completely different from Gauss–Newton methods. It is also quite commonly

used in nonlinear regression programs. The method starts the regression analysis by constructing a Simplex, which is a polygon with by k + 1 vertices, where k is the dimension of parameters. For a two parameter problem, the start Simplex is a triangle. For each triangle, the function values at the three vertices are compared. The unfavorable one, i.e. with largest value, is replaced by a new vertex. While the search processes till the minimum point is approached, the values at the triangle vertices become smaller and smaller and the size

Fig. 6. Positive test: result plots (moxifloxacin).

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

33

Fig. 6 (continued).

of triangle decreases too. Generally, this method performs with following rules: • Reflect the point with the highest sum of squares through centroid of the Simplex.

• If this produces the lowest sum of squares (best point), expand the Simplex and reflect further. • If this is just a good point, start at the top and reflect again. • If this is the highest sum of squares (worst point), contract the Simplex and reflect closer.

34

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

Steps following these rules are processed repeatedly until convergence is met. For problems with two parameters, the Simplex algorithm runs as following (Mathews 2003): Start triangle initialization: First, a start triangle BGW is initialized. Let f (α,β) the function to be minimized and Vk = (αk,βk) with k = 1,2,3 the three vertices of the start triangle. The three vertices are reordered according to the corresponding function values, i.e. f (α1,β1) b f (α2,β2) b f (α3,β3). Denote the best vertex (α1,β1) by B, the good vertex (α2,β2) by G and the worst (α3,β3) by W. The minimization process of Simplex method is performed with help of the midpoint of the line segment between B and G. The midpoint is denoted by M and defined as follows:

M=

1 ðB + GÞ = 2

  α1 + β1 α2 + β2 ; : 2 2

Reflection with R: The function value decreases when a point (αi,βi) moves along the triangle side from W to B as well as along the line from W to G. It is very likely that the function values will be smaller with points on the other side of the line between B and G. Thus, a new test point, denoted by R, is obtained by reflecting the triangle through the side between B and G. It is defined by drawing a line from W to M at first. Denote the line length by d. Then the line is extended further to a point with a distance of d to M. This point is the reflection point R with: R = M + ðM−W Þ = 2 M−W: Expansion with E: If the function value at R is smaller than that at W, the right direction toward minimum is run. A point farther than R perhaps is nearer to the minimum. Another test point E is then reached by extending the line from M to R with a length of d, i.e.: E = R + ðR−M Þ = 2R−M: If the function value at E is less than that at R as expected, then a better vertex is found. Contraction with C: If the reflection point R is not better than W, a new test point should be found. Perhaps the function value at the midpoint M is better. However, this point cannot be taken because the triangle form should be kept. Instead of M, the function values at the two midpoints C1 and C2 of the lines WM and MR are compared. The point of a smaller function value is denoted by C. The current triangle is contracted to a new one BGC when C is better than W. Shrinking toward B: The function value at C can also be no less than that at W. Then current triangle should be shrinked toward B.

G is replaced with M and W with the midpoint of line between W and B, denoted by S. Fig. 4 gives a schematic representation of these processes. By repeating these steps, the values of the triangle vertices become smaller and smaller and the minimum is being approached. The Nelder–Mead (Simplex) method is relatively robust and numerically less complicated. It is the regression method applied in this manuscript. 3. Results 3.1. Design of QTana The structure of this analysis system is depicted in Fig. 5. Given a set of experimental data, it starts by calculating the parameters of all build-in models with respect to the training placebo data. In second step, all these specified models are inspected under the model selection criteria with the help of the test placebo data. Once best correction model(s) are reported, their plots with test placebo data are investigated before they are applied to correct the dosed data and determine QTi prolongation. As discussed, there is no universal QT/RR correction model. Many research groups have presented different correction functions with good effects on their own data. The basic idea of QTana is to set up a candidate model collection and try to find the best correction model when a certain data set is given. In order to have a broad spectrum of correction models, linear as well as different types of nonlinear functions such as logarithmic, exponential, polynomial, etc., are included in the candidate collection ( Meyners & Markert 2004; Molnar, Weiss, Zhang, & Rosenthal 1996) . The two most famous but also mostly criticized correction functions from Bazett and Fridericia are also taken into the model pool for comparison purpose. These QT/RR modeling and the corresponding correction functions are listed in Table 2. Beside these build-in correction functions, further models can also be imported and analyzed in QTana. In the optimization unit, the least squares method is applied to optimize linear correction models and Nelder–Mead (Simplex) method to minimize nonlinear correction models. These correction models with specified parameters are studied in the model selection unit. Model(s) with minimum square form of Pearson's correlation coefficient (r 2), AIC and PRESS MRSE are reported and investigated further with the help of data and model plots. In the QT interval prolongation analysis unit, linear regression lines are drawn for all groups of transformed QT/RR data in scatter plots. By comparing dosed linear functions with the placebo function, it can be identified whether a QT interval prolongation happens and when yes, by which group(s).

Table 4 Positive test: mean values of corrected QTi and p-values from t-test for differences between the dosed groups and the control group. Placebo

Dose Gr. 3 kg/mg

Models

Mean

Mean

p

Mean

Dose Gr. 10 kg/mg p

Mean

Dose Gr. 30 kg/mg p

Bazett Fridericia Sarma Exp. Linear Parabolic Hyperbolic Logarithmic Shifted Log. Exponential Polynomial 2 Polynomial 3

213.31 ± 20.431 221.52 ± 11.974 228.6 ± 8.041 227.19 ± 9.82 228.06 ± 14.869 230.08 ± 16.887 228.28 ± 15.231 239.81 ± 26.159 228.3 ± 15.289 227.9 ± 14.96 230.03 ± 16.889

218.621 ± 20.901 224.536 ± 12.362 231.71 ± 10.659 230.04 ± 11.826 230.862 ± 15.834 233.33 ± 18.346 231.336 ± 16.502 242.638 ± 27.356 231.372 ± 16.604 230.926 ± 17.826 233.038 ± 17.826

0.1404 0.0226 0.0024 0.0050 0.0036 0.0020 0.0026 0.0194 0.0260 0.0030 0.0032

230.254 ± 27.528 236.74 ± 19.368 244.3 ± 13.427 243.06 ± 15.275 243.545 ± 19.043 245.724 ± 20.345 244.03 ± 19.333 252.535 ± 25.158 244.028 ± 19.321 243.66 ± 19.128 245.802 ± 20.634

b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001

242.57 ± 25.283 244.79 ± 16.067 248.27 ± 11.877 246.81 ± 12.784 247.47 ± 13.454 249.74 ± 13.454 248.01 ± 13.066 252.79 ± 16.771 247.98 ± 13.148 247.56 ± 13.167 249.82 ± 13.573

b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001 b 0.0001

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

3.2. Assistant software (packages) QTana is developed in JAVA (JDK1.5). It implements the numeric and statistic analytical methods discussed in chapter 2 with the help of some license-free Java-developed software and packages. The graphic user interface is realized with JAVA-Swing package and another totally free layout manager tool (tablelayout, https://tablelayout.dev.java. net/). During the implementation, different nonlinear regression analysis methods are investigated and compared. The Nelder–Mead (Simplex) method is chosen because of its less computation complexity. This method is available in a free Java package developed by MT Flanagan. By the realization of importing external user defined correction functions, various GNU/LGPL-licensed scripting packages, e.g. BeanShell, Jython and Rhino, are taken into consideration, but none gives a satisfactory performance because of long runtime. An elaborate evaluation of these tools has been performed by D. Kearns. A language recognition tool, ANTLR (ANother Tool for Language Recognition, formerly PCCTS), gives a better solution and is applied to import external functions in the format described. ANTLR can be used freely under the BSD license. It is a very powerful language tool which provides a framework for constructing recognizers, compilers, and translators from grammatical descriptions containing Java, C#, C++, or Python actions. The plot of sample data and correction models in QTana is implemented by means of another powerful and flexible LGPL-licensed Java charting library, JFreeChart (http://www.jfree.org/jfreechart/). After the analysis system is implemented, a set of positive and negative tests are designed to view its performance. Besides these tests, some other tests are also performed to explore the effects of other factors on QT/RR correlation. Theoretically, QT interval of female is known to be longer than QTi of male under the same conditions. Thus, it's worth comparing the variability of male group and female group in a sample data set to see whether they are significantly different. In some experiments, recorded data of individual animals are distributed differently. A test to investigate whether they are significantly different should also be meaningful. Furthermore, a comparison between a 10 min median sample and a 5 min median version of the same study is designed to observe the effects of different median values on correction models. In general, the data used in following tests are derived from telemetry measurements employing conscious Labrador dogs, minipigs, cynomolgus or rhesus monkeys within different drug safety studies. A cross-over study design is usually applied on a small number of animals (e.g. 4) within each study. The animals are arranged with the appropriate housing conditions discussed in (Klumpp et al. 2006) during the studies, since it has been shown that the quality of the acquired cardiovascular data is dependent on the pen configuration and group make-up during a study. Median values from 10 min or 5 min time interval are calculated for analysis in order to avoid beat-to-beat variability. QT interval and heart rate values are extracted from the measurements. The heart rate values are used further to derive RR values (60/HR), because the models included in the analysis system are based on QT interval and RR values. In some publications, heart rate values are used instead of RR, but there is no significant difference between HR-based results and RR-based results (Meyners & Markert 2004). 3.3. Positive and negative tests In positive tests, QTi prolongation is expected to be identified by the models specified with the analysis system. On the other hand, it is not expected to be seen in negative tests. The fluoroquinolone antibiotic moxifloxacin is known associated with QT interval prolongation by inhibiting the hERG (human Ether-a-go-go Related Gene) potassium K+ channel (Alexandrou et al. 2006). It has been recommended as a positive control by regulatory authorities to

35

evaluate the sensitivity of both clinical and nonclinical studies to detect small but significant increases in QT interval measurements (Chen et al. 2005). In present study, moxifloxacin administrated data from conscious Labrador dogs are also chosen as a positive control. The model selection is based only on the placebo data. These data are subdivided into a training set and a test set so as to assess the predictive performance of the correction models. The dosed data groups are then corrected by models with the data-specified parameters. Table 3 shows the data-specified parameters for each correction model as well as the corresponding values for the model selection criteria, i.e. correlation coefficients (r), PRESS RMSE and AIC. Bazett's, Fridericia's and shifted logarithmic correction models have produced largest absolute correlation coefficients and then have the poorest predictive performance. Hyperbolic function and polynomial in order 3 do not have good enough predictive performance with coefficients of 0.18 and 0.21 respectively, too. The other formulas have similar coefficients and RMSEs. The difference among them is minor. “Search best fit” function of QTana has selected linear model (QT = a + bRR + ε) as the best correction function, which has a coefficient of − 0.01. Sarma exponential, parabolic, logarithmic, exponential formulas and polynomial of order 2 produce similar results with absolute coefficients from 0.03 to 0.1. Fig. 6 gives the corresponding plots with corrected data and linear regression lines for each dose group to ease QT interval prolongation identification. As expected, Bazett and Fridericia models have under-corrected QTi by high heart rates and overcorrected by low rates. Except these two and shifted logarithmic model the other eight models give similar plots, which are consistent with Table 3. The slight difference among these plots is that except hyperbolic and Sarma exponential models the polynomial of order 3 has a trend to under-correct QTi in low heart rate region and the rest have a trend to under-correct QTi in high heart rate region. This difference may be more obvious with larger data sets (Meyners & Markert 2004). Considering both the results in Table 3 and the plots in Fig. 6 Sarma exponential correction formula can be taken as the best model. Table 4 reports the respective corrected QTi values and p-values from t-test for differences between the dosed groups and the control group. Significant p values at 5% are marked bold, indicative values (0.05 b p b 0.01) are set in italics. No QTi difference in dosed group 3 mg/kg is detected with Bazett's model, while Fridericia's and shifted logarithmic models show indicative differences. All models show there are significant QT interval prolongation in dosed groups 10 mg/ kg and 30 mg/kg. This indicates that the best model selected do not make false positive decisions. Test with other positive data sets give similar results. Besides positive tests, some negative tests are also performed with compounds which are known not associated with QT interval prolongation. For confidential reasons, the compound names will not be given here. The test procedure is analog to that of positive tests. Fig. 7, Tables 5 and 6 show the corresponding results of a negative test. With respect to the correlation coefficient, linear formula has the best predictive performance again in this situation. The corrected data have a perfect coefficient 0.0. Parabolic and exponential correction models have the next best absolute coefficients of 0.06 and 0.1. Sarma exponential, logarithmic and the two polynomial formulas perform with coefficients between 0.1 and 0.2. Visual inspection of the corrected data in Fig. 6 reveals, however, that linear function does not correct the QT interval better than hyperbolic, Sarma exponential or polynomial of order 3. It under-corrects in high heart rate region and over-corrects in low region. Same effects also take place in corrections with parabolic, exponential and logarithmic models. Polynomial of order 2 over-corrects data in low region. Bazett's, Fridericia's and shifted logarithmic models correct the data most poorly. Sarma exponential, hyperbolic functions and

36

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

polynomial of order 3 are the visually best correction models. Plots in Fig. 6 also depict that none corrected data shows any QT interval prolongation. It can be assumed that the best models selected will not make a false negative decision. Results of t-test in Table 6 show some differences among dosed and control data, which are reflection of biological differences and measurement variability (Malik 2001). Other negative tests demonstrate similar results. Therefore, the preferred models will satisfy two important criteria of a QT/HR correction algorithm, i.e. the correction model should remove the

dependence of QT interval on heart rate and maintain biological variations at the same time. 3.4. Gender groups comparison Since Bazett noted in 1920, gender influences on heart rate corrected QT interval have constantly been reported (Bazett 1920; Rautahar, Zhou, & Wong 1992; Drici, 2001, Burke et al. 1997). Feminine gender tends to have longer QT intervals and is at higher

Fig. 7. Negative test: result plots.

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

37

Fig. 7 (continued).

risk for severe cardiac adverse events like torsade de points in either drug-induced or congenital long QT syndromes. It is recommended that not only caution should be exercised when administrating drugs in women at risk of developing cardiac arrhythmias but also studies of gender specify should be performed both in preclinical and clinical development of drugs prolonging QT intervals (Drici 2001). In present

manuscript, gender influence on QT intervals and its interaction with administration of drugs developing QT prolongation, e.g. moxifloxacin, are also investigated. Fig. 8 shows the gender placebo data points as well as logarithmic regression lines for each gender. Different to the literature arguments, the values of this feminine data group are lower than those of male

38

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

Table 5 Negative test: results table. Models

Bazett Fridericia Sarma Exp. Linear Parabolic Hyperbolic Logarithmic Shifted Log. Exponential Polynomial 2 Polynomial 3

Parameter estimates

r

a

b

c

d

251.549 248.525 250.823 199.616 244.333 281.534 244.638 3.343E103 287.313 106.772 23.313

− 0.94 − 0.79 486.863 − 4.659 − 0.18 43.566 0.0 0.171 − 0.06 −35.899 − 0.22 41.819 − 0.13 1.012E103 0.62 − 116.21 − 0.1 255.571 − 113.558 − 0.14 557.065 −458.533 125.38 − 0.17

PRESS AIC RMSE 60.68 36.69 19.68 14.88 15.74 17.84 16.26 7.63 16.06 20.46 20.23

920.65 807.95 670.42 606.84 619.36 647.48 626.61 457.11 623.92 679.11 677.52

data group. The mean of male group is 242.557 ± 13.876, while the feminine group has a mean of 228.429 ± 7.295. Significant difference is reported by a t-test for the difference between the gender groups with a p-value of b 0.0001. Further differences can be detected by the distribution of the QT values considering the RR range. The values of male group distribute in a RR range of (0.494,1.967), while the QT interval values of female group are within the range of (0.583,1.463). The RR mean value of male group is 0.909 ± 0.289, whereas that of feminine group is 0.643 ± 0.482. p-value of a t-test between them is b 0.0001. Significant difference of the distributions exists. Therefore, it is reasonable to study whether gender plays an important role in QT interval prolongation, what kind of interaction it has with different dosages of a compound and whether it is important to separate them for QTi correction. A two-way ANOVA with gender and dosage as two factors is designed to answer these questions. Table 7 presents the results of the two-way ANOVA. Fig. 9 is the gender–dosage interaction plot, which show some possible reciprocal relationship between them. However, the p-value for interaction in Table 7 is 0.1121, which exceeds 0.05. The interaction null hypothesis is rejected, i.e. there is not a significant interaction between genders and dosages. When interaction is absent, the lines in Fig. 9 should be parallel. Both p-values for genders and dosages are smaller than 0.0001, which clearly mean that the QTi in dose groups as well as in gender groups are significant different. The factor plots in Fig. 10 confirm the p-values and show that QT intervals in dosage groups 10 and 30 are statistically significant longer than in dosage groups 0 and 3, and the male gender group has significantly longer QT intervals than the feminine group. This two-way ANOVA test is also run with some other data sets. Similar results are produced. Gender groups do have different QT intervals under the same dosage conditions. An inspection of gender specified heart rate correction models for QT intervals is not thriftless.

Fig. 8. Plot of gender group data points and corresponding regression lines.

For instance, polynomial order 3 model (QT = 54.972 + 358.187RR − 234.21RR2 + 53.701RR3) corrects the male data set of the moxifloxacin samples in last section best, whereas hyperbolic model (QT = 21:609 ) is the best correction formula for the female group 249:427− RR (Fig. 11). 3.5. Impact of sampling interval (10 min median vs. 5 min median) Conventionally, median values over a time interval are applied to smoothen beat-to-beat variabilities in the analysis of QT/RR relationship patterns. Median values of longer intervals can potentially avoid beat-to-beat variabilities better, but reduce the sample data size at the same time, which is quite important for the model training. Accordingly, values of shorter intervals produce larger sample data size, but are they broad enough to smoothen the beat-to-beat variability? In present study, median values of 10 and 5 min time intervals are compared to view the effect of different time intervals on QT/RR relationship patterns. The negative test in Section 3.3 is performed with 5 min median QT interval values. For comparison purposes, a 10 min version of the data set is studied here further. Fig. 12 shows the plots of the original data sets of both sample versions. As expected, the sample of 5 min median values contains more measurement variabilities than the sample of 10 min median values, especially in high heart rate regions. The QTi dispersions in both plots are similar except the thickness of the data point clouds. Furthermore, transformed with the same heart rate correction models, surely with different parameter estimates to 5 min values, the 10 min QTc values also show similar analysis results and almost the same plots in Fig. 7 except the thickness of data point clouds. For

Table 6 Negative test: mean values of corrected QTi and p-values from t-test for differences between the dosed groups and the control group 43. Models

Bazett Fridericia Sarma Exp. Linear Parabolic Hyperbolic Logarithmic Shifted Log. Exponential Polynomial 2 Polynomial 3

Placebo

Dose Gr. 3 kg/mg

Mean

Mean

p

Mean

Dose Gr. 10 kg/mg p

Mean

Dose Gr. 30 kg/mg p

257.259 ± 23.616 250.089 ± 13.979 245.801 ± 7.6177 242.616 ± 10.019 243.798 ± 9.592 245.199 ± 8.105 244.134 ± 9.010 242.533 ± 7.134 244.051 ± 9.179 248.318 ± 7.817 246.908 ± 7.559

253.077 ± 24.668 247.11 ± 14.647 244.285 ± 7.560 240.694 ± 9.828 241.929 ± 9.362 243.43 ± 7.876 242.285 ± 8.763 241.872 ± 6.295 242.205 ± 8.929 247.335 ± 8.640 245.312 ± 7.464

0.0250 0.0086 0.0102 0.0122 0.0110 0.0046 0.0068 0.1984 0.0082 0.1098 0.0060

252.241 ± 22.978 247.295 ± 12.949 245.66 ± 6.9213 241.861 ± 9.571 242.979 ± 8.883 244.59 ± 7.292 243.386 ± 8.301 243.786 ± 7.505 243.262 ± 8.520 248.166 ± 7.072 246.771 ± 6.866

0.0058 0.0086 0.7776 0.3140 0.2610 0.3010 0.2514 0.0266 0.2552 0.7702 0.7898

259.302 ± 25.858 251.461 ± 14.667 247.482 ± 7.257 243.153 ± 9.612 244.689 ± 9.104 246.464 ± 7.459 245.099 ± 8.363 243.395 ± 7.524 245.015 ± 8.556 250.979 ± 8.885 248.581 ± 7.264

0.2848 0.2078 0.0032 0.4600 0.2084 0.0306 0.1562 0.1166 0.1728 b0.0001 0.0030

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41 Table 7 ANOVA table with two factors: gender and dosage. Source of variation

SS

df

MS

F statistic

p

Dosage Gender Interaction Within Total

33031.102 31934.343 985.676 106210.240 172161.361

3 1 3 648 655

11010.3674 31934.343 328.5586 163.9047

67.1754 194.8348 2.0046

b0.0001 b0.0001 0.1121

Fig. 9. Gender–dosage interaction plot.

brevity, the plots of 10 min corrected values are not presented here. Practically, if the sample size of a study is big enough, 10 min median values will be better than the 5 min version, as variabilities of measurement will be better avoided. On the other hand, if the sample size is relatively small, 5 min median value sets will be more helpful to specify the parameters for the heart rate QT interval correction models, but the reflection of the interdependent variabilities should be removed as outliers before analysis. 4. Discussion QT interval prolongation is a very important parameter in cardiactoxicity analysis during the development of a novel pharmaceutical. QT interval has an inverse relationship with heart rate, which usually masks the drug-induced QTi changes and cumbers the direct inspection of QTi prolongation. It is then very urgent to remove the dependence of QTi on heart rates. This can be fulfilled with some heart rate correction formulas, various types of which have already been discussed and published continually. Generally, it is assumed that the relationship between QTi and RR can be described with certain mathematical functions, the parameters of which normally are not given, but are specified with given experimental samples. In the present work, different types of heart

39

rate QT interval correction models are studied. Several numeric methods to optimize the correction functions and statistic procedures to perform significance tests are discussed and implemented in a QT/ RR relationship analysis system, named “QTana”. Given a sample data set, the QTana searches the best correction model(s) from the integrated 11 QT/RR relationship modeling functions. There exist, however, more dozens of further functions modeling QT/RR relationship patterns and even more will be presented. As an improvement or extension of the analysis system, QTana facilitates the evaluation of these models by an “import function” module, which will keep the analysis system up to date. Some positive and negative tests are directed to investigate the performance of the analysis system. In positive tests, the preferred models detect QTi prolongation reliably with data-specified parameters. In negative tests, significant QTi prolongations are not identified. The analysis system makes neither false positive nor false negative decisions. What's more, it keeps and presents biological differences and measurement variabilities. Thus, it satisfies the both criteria of heart rate correction algorithms to detect QTi prolongation, i.e. elimination of the dependence of QT interval on heart rates and maintaining biological differences at the same time. It's known that there are differences between male QT interval and female QT interval under the same conditions. Two-way ANOVA tests with gender and dosage of a compound are performed to investigate their effects on QT intervals. In a positive study (moxifloxacin), no significant effect of the interaction of both factors is detected. However, the significant differences between the two gender groups are identified. The QTana analysis system determines polynomial of order 3 as the best correction model for male group and hyperbolic model as the preferred for feminine group. Combining data sets from both groups, Sarma's exponential function performs best heart rate correction for QT intervals. From these results, it can be concluded that it might be reasonable to perform gender specified studies in pre-clinical development of drugs with potential QTi prolongation effects. A further set of studies to test the significant differences among individuals report that QT/RR relationship in healthy individuals do exhibit substantial inter-individual variabilities. During the analysis of QT/RR relationship patterns, a view of individual preferred correction models is not meaningless. The influences of median values of different time intervals on QT/RR relationship patterns are also inspected in a set of 5-min/10-min comparison studies. The results report as expected that more “noisy” data points appear in sample values of shorter time interval. These “noisy” information can be interpreted as reflection of biological differences and measurement variabilities (Table 8). QT values of longer time interval, i.e. 10 min, present significantly less outlying observations. Other significant differences between the two kinds of sample values cannot be identified except the thickness of data point clouds in plots. Therefore, 10 min median values are preferable with less outlier, when the sample size of a study is big enough. Otherwise, median values of

Fig. 10. Dosage and gender factor plots.

40

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

Fig. 11. Separate plots of corrected data points of the four individuals and corresponding linear regression lines for each dosage group.

Fig. 12. Plots of 5 min median and 10 min median QT interval values.

5 min time interval will be more helpful to get accurate parameter estimates for data-driven correction models. What would also be very interesting is to set up a database containing all placebo data for each animal. This database can facilitate the comparisons of historical placebo data with current ones to view whether significant differences exist among different life times of an animal and how long is the time interval in which the QT/ RR relationship patterns remain relatively stable. A life time interval over many years with stable QT/RR patterns means that these animals can be applied in drug safety studies for a long time. Thus, the number of animals in a laboratory will be reduced, hence, also the cost for a study and concerns about animals protection laws. Another worthwhile study not performed in current work is the test of inter-species variability. These tests will answer the question whether a correction model derived from one species can be transferred to another species directly, e.g. from Labrador dogs to minipigs, under the same

Table 8 One-way ANOVA table for inter-individual variability test. Source of variations

Sum of squares

df

MS

F statistic

p

Between Within Total

3173 63712 66785

3 324 327

1057.533 196.333

5.3864

0.0013

laboratory conditions, even though QT and RR ranges of these species exhibit obvious differences. 5. Conclusion The QT interval has a reverse relationship with heart rate, which may mask the drug-induced QTi changes and thereby limits the direct inspection of QTi prolongation. It is therefore essential to remove the dependence of QTi on heart rates. This can be achieved with heart rate correction formulas, various types of which have already been discussed and published. Generally, it is assumed that the relationship between QTi and RR can be described with certain mathematical functions, the parameters of which normally are not given, but are specified with given experimental data sets. Different types of heart rate QT interval correction models were compared. Several numerical methods to optimize the correction functions and statistical procedures to perform significance tests were discussed and implemented in a QT/RR relationship analysis system, named QTana. Given a sample data set, QTana searches the best correction model(s) from the integrated 11 QT/RR relationship modeling functions. There exist, however, dozens of additional functions modeling QT/RR relationship patterns and even more will be presented in the future. As an improvement or extension of the analysis system, QTana allows the evaluation of these new models by an “import function” module, which will keep the analysis system up to date.

M. Markert et al. / Journal of Pharmacological and Toxicological Methods 64 (2011) 25–41

References Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. Alexandrou, A. J., Duncan, R. S., Sullivan, A., Hancox, J. C., Leishman, D. J., Witchel, H. J., et al. (2006). Mechanism of herg k+ channel blockade by the fluoroquinolone antibiotic moxifloxacin. British Journal of Pharmacology, 147(8), 905–916. Bazett, H. C. (1920). An analysis of the time-relations of electrocardiograms. Heart, 7, 353–370. Burke, J. H., Ehlert, F. A., Kruse, J. T., Parker, M. A., Goldberger, J. J., & Kadish, A. H. (1997). Gender-specific differences in the QT interval and the effect of autonomic tone and menstrual cycle in healthy adults. The American Journal of Cardiology, 79(2), 178–181. Chaves, A. A., Keller, W. J., Sullivan, S. O., Williams, M. A., Fitzgerald, L. E., McPherson, H. E., et al. (2006). Cardiovascular monkey telemetry: Sensitivity to detect QT interval prolongation. Journal of Pharmacological and Toxicological Methods, 54(2), 150–158. Chen, X., Cass, J. D., Bradley, J. A., Dahm, C. M., Sun, Z., Kadyszewski, E., et al. (2005). Qt prolongation and proarrhythmia by moxifloxacin: Concordance of preclinical models in relation to clinical outcome. British Journal of Pharmacology, 146, 792–799. Davey, P. (1999). A new physiological method for heart rate correction of QT interval. Heart, 82, 183–186. Drici, M. D. (2001). Influence of gender on drug-acquired long QT syndrome. European Heart Journal Supplements, 3(Supplement K), K41–K47. Esbensen, K. H. (2002). Multivariate Data Analysis — In practice: An introduction to multivariate data analysis and experimental design (5 edition). : CAMO Process AS. Fermini, A., & Fossa, A. A. (2003). The impact of drug-induced QT interval prolongation on drug discovery and development. Nature Reviews Drug Discovery, 2, 439–447. Fridericia, L. S. (1920). Die systolendauer im elektrokardiogram bei normalen menschen und bei herzkranken. Acta Medica Scandinavica, 53, 469–486. Klumpp, A., Trautmann, T., Markert, M., & Guth, B. (2006). Optimizing the experimental environment for dog telemetry studies. Journal of Pharmacological and Toxicological Methods, 54(2), 141–149.

41

Malik, M. (2001). Problems of heart rate correction in assessment of drug-induced qt interval prolongation. Journal of Cardiovascular Electrophysiology, 12(4), 411–420. Malik, M., Farbom, P., Batchvarov, V., Hnatkova, K., & Camm, A. J. (2002). Relation between QT and RR intervals is highly individual among healthy subjects: Implications for the heart rate correction of the QT interval. Heart, 97, 220–228. Markert, M., Stubhan, M., Mayer, K., Trautmann, T., & Guth, B. (2008). Validation of the normal, freely moving Göttingen minipig for pharmacological safety testing. Journal of Pharmacological and Toxicological Methods, 58(2), 79–87. Mathews, J. H. (2003). Module for Nelder–Mead search for a minimum. Available from: http://math.fullerton.edu/mathews/n2003/NelderMeadMod.html Meyners, M., & Markert, M. (2004). Correcting the QT interval for changes in hr in preclinical drug development. Methods of Information in Medicine, 43, 445–450. Molnar, J., Weiss, J., Zhang, F., & Rosenthal, J. E. (1996). Evaluation of five QT correction formulas using a software-assisted method of continuous QT measurement from 24-hour holter recordings. The American Journal of Cardiology, 78, 920–926. Motulsky, H. J., & Christopoulos, A. (2003). Fitting models to biological data using linear and nonlinear regression. A practical guide to curve fitting. San Diego CA: GraphPad Software Inc. Rautahar, P. M., Jr., Zhou, S. H., & Wong, S. (1992). Sex differences in the evolution of the electrocardiographic QT interval with age. Canadian Journal of Cardiology, 8, 690–695. Rosner, B. (1983). Percentage points for a generalized esd many-outlier procedure. Technometrics, 25(2), 165–172. Rosner, B. (1995). Fundamentals of biostatistics (4 edition). : Duxbury Press. Sarma, J. S. M., Sarma, R. J., Bilitch, M., Katz, D., & Song, S. L. (1984). An expontential formula for heart rate dependence of QT interval during exercise and cardiac pacing in humans: Reevaluation of Bazett's formula. The American Journal of Cardiology, 54, 103–108. Satterthwaite, F. W. (1946). An approximate distribution of estimates of variance components. Biometrics Bulletin, 2, 110–114. Zhu, Z. (u). Applied statistics I: Lecture notes. Available at:http://www.unc.edu/˜zhuz/ teaching/Stat174/Notes/