Sample size calculations for single group post-marketing cohort studies


J Clin Epidemiol Vol. 47, No. 4, pp. 435-439, 1994. Copyright © 1994 Elsevier Science Ltd. 0895-4356(93)E0014-3. Printed in Great Britain. All rights reserved.


P. TUBERT-BITTER,1 B. BÉGAUD,2* Y. MORIDE3 and L. ABENHAIM3

1INSERM U.169, Villejuif, France, 2Département de Pharmacologie Clinique, Université de Bordeaux 2, 33076 Bordeaux Cedex, France and 3Department of Epidemiology and Biostatistics, McGill University and Centre for Clinical Epidemiology and Community Studies, The Sir Mortimer B. Davis Jewish Hospital, Montreal, Canada

(Received in revised form 26 May 1993)

Abstract-In pharmacoepidemiology, single group cohort is the most frequently proposed design to determine if the incidence rate of an adverse drug reaction among the exposed differs from a reference value. In many situations, the number of events expected in the cohort is too small to conduct sample size calculations based on the normal distribution. This paper proposes, for a single group cohort study, calculations and tables derived from the Poisson distribution. The results are based on a one-sided test with a 0.05 significance level and a power of 0.9 and 0.8. Two parameters have to be specified a priori: the expected incidence of the event under the null hypothesis and the minimum risk ratio to be detected. The required sample size and the critical number of events to reject the null hypothesis are directly derived from the tables. Results show that the normal approximation may lead to an underestimation of the required sample size.

Pharmacoepidemiology    Adverse drug reactions    Cohort study    Sample size    Poisson distribution    Methodology

*Author for correspondence.

INTRODUCTION

Post-marketing cohort studies are becoming frequently proposed when a drug is suspected to be associated with a given adverse event, or to compare the tolerance of two or more drugs. The number of events observed in the exposed group must be compared to a reference value. The reference value is either drawn from a control group or is fixed a priori by the manufacturer or the regulatory authorities, as an acceptable or unacceptable threshold. The core problem is then to estimate how many subjects would be necessary to detect, with a specified power, a given difference at the specified significance level.

If the expected incidence of the event is not too low, type A reactions [1], the expected number of cases, in a cohort ranging from 1000 to 10,000 exposed subjects, may be larger than 20 or 30 which allows doing statistical calculations based on the normal distribution. Many references and tables have been published from which can be read sample sizes for one- or two-group design [e.g. 2-4]. Unfortunately, safety issues after marketing often involve rare reactions (type B); the expected number of cases in the exposed group may become small, for instance below 15, which questions the use of the normal approximation [5]. Other approaches, based on the binomial [6] or Poisson distributions have to be used [7]. For a two-group design, Gail [8] then Brown and Green [9] proposed power calculations based on the total number of cases assumed to follow a Poisson distribution. To our knowledge, such tables have not been published for single-group cohorts. For a single group design, Gordon [10] recommended sample size calculations based on the confidence intervals of a Poisson parameter. However, this approach does not take into account the power specification.

This paper proposes a simple approach to sample size calculations for a single group design used to test if the risk of a given rare reaction is higher or lower than a reference value p0. Examples of applications are: a regulatory agency asks a manufacturer to carry out a study to demonstrate that the risk of liver injury associated with a newly marketed drug remains below 1/1000, or the quantification of the relative risk associated with the use of a given drug.

METHOD

Let X be the number of adverse drug reactions (ADRs) observed in a cohort of size n. We hypothesise that X follows a Poisson distribution with parameter m; m = np is the expected number of cases given the incidence of the event p in the exposed group and the considered period of time after initiation of treatment. Specifying the null hypothesis H0: p = p0 (the reference value), two one-sided alternative hypotheses are considered: H1: p < p0 and H2: p > p0. In pharmacoepidemiology, the context favours a one-sided alternative hypothesis because the direction of the effect is often imposed, e.g. to demonstrate that the risk of adverse event associated with a new drug is lower than the reference value, which will maintain the drug on the market. In order to calculate the sample size n, four parameters have to be specified a priori:

(i) the significance level α, which is the probability of unjustly rejecting H0 when it is true,
(ii) the power of the test, 1 - β, i.e. the probability of detecting a difference if one exists,
(iii) the reference value, which is the expected incidence of the adverse event p0 under the null hypothesis H0,
(iv) the minimum value of the ratio of the two risks one wants to detect with probability 1 - β. This ratio is referred to as R1 = p0/p for the left-sided alternative hypothesis (H1: p < p0) and as R2 = p/p0 for the right-sided alternative hypothesis (H2: p > p0).

The process to construct the tables starts with the variable x0. For the left-sided hypothesis (H1: p < p0), x0 is the maximum number of cases allowed in the study group to reject the null hypothesis H0: p = p0 in favour of the alternative hypothesis. Using the following equation derived from the Poisson formula,

$$P(X \le x_0) = \sum_{k=0}^{x_0} \frac{\exp(-m)\, m^k}{k!}$$

it is possible, for each integer value of x0, to obtain m0, the smallest m value for which P(X ≤ x0) ≤ α, and m1, the largest m value for which P(X ≤ x0) ≥ 1 - β. m0 is the expected number of cases under H0: m0 = n p0, corresponding to the chosen value x0 for the specified α level; for a given p0 value, one can derive the minimum number of subjects to be followed: n = m0/p0. m1 is the expected number of cases, for the specified β level, when considering the assumed value of the risk ratio R1; m1 = m0/R1, thus R1 = m0/m1.

Table 1(a). Sample size calculation for a one-sided test at 0.05 significance level and a power of 0.9 when the incidence rate of the event in the cohort, p, is expected to be lower than a reference value p0

R1 (rounded values)   R1 (exact values)   m0        x0
1.2                   1.20                283.926   256
1.3                   1.30                142.867   123
1.4                   1.40                89.791    74
1.5                   1.50                64.402    51
1.6                   1.60                49.809    38
1.7                   1.69                40.691    30
1.8                   1.80                33.753    24
1.9                   1.89                29.063    20
2.0                   1.99                25.500    17
2.1                   2.08                23.098    15
2.2                   2.19                20.669    13
2.3                   2.25                19.443    12
2.4                   2.33                18.208    11
2.5                   2.42                16.963    10
2.6                   2.53                15.706    9
2.7 and 2.8           2.66                14.435    8
2.9 and 3.0           2.83                13.149    7
3.1                   3.05                11.843    6
3.5                   3.34                10.514    5
4.0                   3.77                9.154     4
5.0                   4.45                7.754     3
                      5.72                6.296     2
                      8.94                4.744     1

R1 = p0/p is the minimum risk ratio one wants to detect with the specified power. For the selected R1 value, the required sample size n is obtained by dividing the corresponding m0 by p0. For example, for p0 = 1/500 and R1 = 2, n = 25.5 × 500 = 12,750. x0 is the maximal number of events in the cohort allowing to reject the null hypothesis in favour of the alternative hypothesis H1: p < p0; in the above example, a maximum of 17 cases among the followed subjects could be accepted.

Table 1(b). As for Table 1(a) but for a power of 0.8

R1 (rounded values)   R1 (exact values)   m0        x0
1.2                   1.20                205.806   182
1.3                   1.30                103.978   87
1.4                   1.40                66.629    53
1.5                   1.50                47.541    36
1.6                   1.60                36.077    26
1.7                   1.69                30.241    21
1.8                   1.78                25.500    17
1.9                   1.88                21.887    14
2.0                   1.97                19.443    12
2.1                   2.08                16.963    10
2.2                   2.16                15.706    9
2.3                   2.25                14.435    8
2.4 and 2.5           2.36                13.149    7
2.6                   2.51                11.843    6
2.7, 2.8 and 2.9      2.70                10.514    5
3.0                   2.97                9.154     4
3.5 and 4.0           3.38                7.754     3
4.5 and 5.0           4.11                6.296     2
6.0                   5.76                4.744     1
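The left-sided construction (m0, m1 and R1 for a given x0) can be sketched numerically. The Python sketch below is not the authors' program: it assumes a direct summation of the Poisson terms and a simple bisection search, and the function names `poisson_cdf`, `solve_m` and `left_sided_row` are illustrative only.

```python
import math


def poisson_cdf(x, m):
    """P(X <= x) for X ~ Poisson(m): sum of exp(-m) m^k / k! for k = 0..x."""
    if x < 0:
        return 0.0
    term = math.exp(-m)          # k = 0 term
    total = term
    for k in range(1, x + 1):
        term *= m / k            # term k obtained from term k - 1
        total += term
    return total


def solve_m(x, target, lo=0.0, hi=1000.0):
    """Bisection for the m at which P(X <= x | m) equals target.

    Uses the fact that the CDF decreases as m increases. hi=1000 is an
    assumed upper bound that covers the range of the published tables.
    """
    for _ in range(100):
        mid = (lo + hi) / 2
        if poisson_cdf(x, mid) > target:
            lo = mid             # CDF still above target: m must grow
        else:
            hi = mid
    return (lo + hi) / 2


def left_sided_row(x0, alpha=0.05, power=0.9):
    """Return (m0, m1, R1) for the left-sided test with critical value x0."""
    m0 = solve_m(x0, alpha)      # smallest m with P(X <= x0) <= alpha
    m1 = solve_m(x0, power)      # largest m with P(X <= x0) >= 1 - beta
    return m0, m1, m0 / m1


if __name__ == "__main__":
    p0 = 1 / 500                 # reference incidence from the Table 1(a) example
    m0, m1, r1 = left_sided_row(17)
    print(f"m0={m0:.3f}  m1={m1:.3f}  R1={r1:.2f}  n={m0 / p0:.0f}")
```

For x0 = 17 the computed m0 should be close to the 25.500 tabulated in Table 1(a). The computed ratios may differ slightly in the last digit from the tabulated exact values, depending on the numerical precision used.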

In the same way, the rejection region corresponding to H2 is {X ≥ x0}, where x0 is the minimum number of cases that allows rejection of the null hypothesis in favour of the alternative hypothesis H2: p > p0. Using the following equation,

$$P(X \ge x_0) = 1 - \sum_{k=0}^{x_0 - 1} \frac{\exp(-m)\, m^k}{k!}$$

it is possible, for each integer x0 value, to obtain m0, the largest m value for which P(X ≥ x0) ≤ α, and m1, the smallest m value for which P(X ≥ x0) ≥ 1 - β. As for H1, we derive: n = m0/p0 and R2 = m1/m0.

RESULTS

The results corresponding to α = 0.05 and 1 - β = 0.9 or 0.8 are summarized in Tables 1(a) and (b) for H1 (p < p0) and in Tables 2(a) and (b) for H2 (p > p0). For each risk ratio value to be detected, the following parameters can be read from the tables:

(i) the corresponding value of m0 = n p0, from which can be calculated the required sample size: n = m0/p0,
(ii) the critical number of events in the exposed group, x0, necessary to reject the null hypothesis H0: p = p0. When the alternative hypothesis is H1: p < p0, H0 can be rejected if the number of events remains under or equal to x0. If the alternative hypothesis is H2: p > p0, H0 can be rejected if this number is larger than or equal to x0.
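The right-sided construction, the sample size n = m0/p0 and the two rejection rules can be sketched in the same spirit. This is a minimal Python illustration under stated assumptions, not the authors' code: the names `right_sided_row`, `sample_size` and `reject_h0` are hypothetical, the search is a plain bisection, and rounding n up to the next integer is a choice made here rather than one stated in the paper.

```python
import math


def poisson_cdf(x, m):
    """P(X <= x) for X ~ Poisson(m), summed term by term."""
    if x < 0:
        return 0.0
    term = math.exp(-m)
    total = term
    for k in range(1, x + 1):
        term *= m / k
        total += term
    return total


def solve_m(x, target, lo=0.0, hi=1000.0):
    """Bisection for the m at which P(X <= x | m) equals target."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if poisson_cdf(x, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def right_sided_row(x0, alpha=0.05, power=0.9):
    """Return (m0, m1, R2) for the right-sided test with critical value x0 >= 1."""
    # P(X >= x0) = 1 - P(X <= x0 - 1)
    m0 = solve_m(x0 - 1, 1 - alpha)   # largest m with P(X >= x0) <= alpha
    m1 = solve_m(x0 - 1, 1 - power)   # smallest m with P(X >= x0) >= 1 - beta
    return m0, m1, m1 / m0


def sample_size(m0, p0):
    """n = m0 / p0, rounded up to a whole subject (assumed rounding rule)."""
    return math.ceil(m0 / p0)


def reject_h0(observed, x0, alternative):
    """Decision rule: 'lower' tests H1 (p < p0), 'higher' tests H2 (p > p0)."""
    if alternative == "lower":
        return observed <= x0
    if alternative == "higher":
        return observed >= x0
    raise ValueError("alternative must be 'lower' or 'higher'")


if __name__ == "__main__":
    m0, m1, r2 = right_sided_row(19)
    print(f"m0={m0:.3f}  R2={r2:.2f}  n={sample_size(m0, 1 / 500)}")
    print(reject_h0(observed=20, x0=19, alternative="higher"))
```

With x0 = 19 and a power of 0.9, the computed m0 should be close to the 12.441 listed in Table 2(a) for R2 = 2, which leads to a sample size of about 6221 for p0 = 1/500.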

The following guidelines should be applied to use the tables. The first step is to fix the reference value p0 for the incidence rate corresponding to the selected period of time, and to decide whether the incidence p in the cohort is expected to be lower (alternative hypothesis H1, Table 1) or larger (alternative hypothesis H2, Table 2) than p0. Depending on the power selected, the reading in the table will be made in Section a (0.9) or b (0.8). The second step requires an assumption on the value of the risk ratio associated with drug exposure. Intermediate values should not be interpolated. If the selected risk ratio cannot be found in the second column of the table, the closest rounded value (first column of the table) that results in a greater m0 should be retained. For example, in Table 2(a), m0 is 3.980 for R2 = 3. The minimum sample size is obtained by dividing m0 by the reference value p0. Assuming a complete follow-up of the n subjects, it could be concluded that p is greater than p0 if the observed number of events is greater than or equal to 8.

DISCUSSION

Tables 1 and 2 show that, for a given p0, the minimal sample size, n, depends on:

(i) The assumed ratio between p and p0: the smaller the ratio, the larger the n value. For instance, for p0 = 1/1500, α = 0.05 and 1 - β = 0.8, Table 2(b) gives n = 16.549 × 1500 = 24,824 subjects if R2 = 1.7 and n = 1.970 × 1500 = 2955 if R2 ≥ 3.5. The assumption of the magnitude of the ratio is then a key issue before designing a post-marketing cohort study. Let us imagine a new antidepressant drug for which one wants to quantify the association with liver injury; considering the value of p0 = 1/10,000 proposed by Strom [11] for the background incidence of cholestatic jaundice, 301,950 subjects would be required if R2 is assumed to be 1.5 (Table 2(b)), compared to 13,660 for R2 ≥ 5.

(ii) The specified power. In the first example, the minimum sample size, for R2 ≥ 1.7, is 23.297 × 1500 = 34,945 for a power of 90% (Table 2(a)), which is 41% more subjects than for a power of 80%.

As put forward in the introduction, sample size calculations based on the normal distribution are questionable when the number of events expected in the cohort group under H0, m0, becomes as small as 15 [5] or 5 [7]. For example, consider p0 = 1/1000 and a specified power of 0.8 (Table 2(b)). For R2 = 1.2, a sample size of 168,850 is obtained with the Poisson distribution (m0 = 168.8) and 164,503 with the normal approximation; an underestimation of 2.6% (4347 subjects). The magnitude of the underestimation becomes more important for a greater R2. For R2 = 2, sample sizes of 9246 and 8026 are obtained, respectively, for the Poisson (m0 = 9.2) and the normal approximation; this represents an underestimation of 13.2% (1220 subjects). Especially in the second case, the sample size derived from the normal distribution may lead to failure to reject the null hypothesis because of insufficient power. This issue is central in post-marketing surveillance where the expected number of events is always low. One deals with events that are too rare to have been detected in randomised controlled trials, which generally include between 500 and 5000 patients [12].

Sample size calculations based on the Poisson distribution usually apply to small m0. However, when m0 becomes larger, the Poisson results closely follow the normal approximation. The Poisson approach is recommended in both situations, provided that n is large (> 100). Finally, another practical advantage of using the Poisson approach is that it is not necessary to compute one table for each possible value of p0; all sample sizes for the 2 one-sided hypotheses and 2 power values can be obtained from 4 tables.

Acknowledgements-The comments of Joseph Lellouch Ph.D. and L. Rachid Salmi M.D., Ph.D. on this manuscript were greatly appreciated.

Table 2(a). Sample size calculation for a one-sided test at 0.05 significance level and a power of 0.9 when the incidence rate of the event in the cohort, p, is supposed to be larger than a reference value p0

R2 (rounded values)   R2 (exact values)   m0        x0
1.2                   1.20                234.058   260
1.3                   1.30                109.049   127
1.4                   1.40                64.063    78
1.5                   1.50                42.507    54
1.6                   1.60                30.195    40
1.7                   1.70                23.297    32
1.8                   1.80                18.218    26
                      1.87                15.719    23
1.9                   1.90                14.893    22
                      1.93                14.072    21
                      1.96                13.254    20
2.0                   1.99                12.441    19
                      2.03                11.634    18
2.1                   2.08                10.832    17
                      2.13                10.035    16
2.2                   2.18                9.246     15
2.3                   2.25                8.463     14
                      2.32                7.689     13
2.4                   2.40                6.924     12
2.5 and 2.6           2.50                6.169     11
2.7                   2.62                5.425     10
2.8 and 2.9           2.77                4.695     9
3.0                   2.96                3.980     8
3.5                   3.21                3.285     7
4.0                   3.55                2.613     6
4.5                   4.06                1.970     5
5.0                   4.90                1.366     4
7                     6.52                0.817     3
11                    10.96               0.355     2
50 and over           45.16               0.051     1

R2 = p/p0 is the minimum risk ratio one wants to detect with the specified power. For the selected R2 value, the required sample size is obtained by dividing m0 by p0. For example, for p0 = 1/500 and R2 = 2, n = 12.441 × 500 = 6221. x0 is the minimum number of events in the study group allowing to reject the null hypothesis in favour of the alternative hypothesis H2: p > p0; in the above example, a minimum number of 19 cases is required.

Table 2(b). As for Table 2(a) but for a power of 0.8

R2 (rounded values)   R2 (exact values)   m0        x0
1.2                   1.20                168.850   191
1.3                   1.30                77.726    93
1.4                   1.40                45.175    57
1.5                   1.50                30.195    40
1.6                   1.60                21.593    30
1.7                   1.70                16.549    24
                      1.72                15.719    23
                      1.74                14.893    22
                      1.76                14.072    21
1.8                   1.79                13.254    20
                      1.82                12.441    19
                      1.85                11.634    18
1.9                   1.88                10.832    17
                      1.92                10.035    16
2.0                   1.97                9.246     15
                      2.02                8.463     14
2.1                   2.07                7.689     13
2.2                   2.14                6.924     12
2.3                   2.22                6.169     11
2.4                   2.31                5.425     10
2.5                   2.43                4.695     9
2.6 and 2.7           2.58                3.980     8
2.8, 2.9 and 3.0      2.77                3.285     7
3.1                   3.03                2.613     6
3.5 and 4.0           3.42                1.970     5
4.5 and 5.0           4.04                1.366     4
5.5                   5.24                0.817     3
9                     8.44                0.355     2
40 and over           31.57               0.051     1
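The comparison with the normal approximation made in the Discussion can be checked with a short script. The paper does not state which normal-approximation formula was used, so the sketch below assumes the usual one-sample, one-sided formula with binomial variances under H0 and H1; the function name `normal_sample_size` is illustrative. The Poisson m0 values are copied from Table 2(b) rather than recomputed.

```python
import math
from statistics import NormalDist


def normal_sample_size(p0, ratio, alpha=0.05, power=0.8):
    """One-sided, one-sample normal-approximation sample size.

    Assumed formula (not given in the paper):
    n = (z_alpha * sqrt(p0 q0) + z_beta * sqrt(p1 q1))^2 / (p1 - p0)^2
    with p1 = ratio * p0.
    """
    p1 = ratio * p0
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_beta * math.sqrt(p1 * (1 - p1)))
    return (numerator / (p1 - p0)) ** 2


def shortfall_percent(n_poisson, n_normal):
    """Underestimation of the normal-based n relative to the Poisson-based n."""
    return 100 * (n_poisson - n_normal) / n_poisson


if __name__ == "__main__":
    p0 = 1 / 1000
    table_2b_m0 = {1.2: 168.850, 2.0: 9.246}   # m0 values read from Table 2(b)
    for ratio, m0 in table_2b_m0.items():
        n_poisson = m0 / p0
        n_normal = normal_sample_size(p0, ratio)
        print(f"R2={ratio}: Poisson n={n_poisson:.0f}, "
              f"normal n={n_normal:.0f}, "
              f"shortfall={shortfall_percent(n_poisson, n_normal):.1f}%")
```

Under this assumed formula the R2 = 2 case comes out at about 8026 subjects, in line with the figure quoted in the Discussion, and the R2 = 1.2 case falls within roughly 0.1% of the quoted 164,503. The small remaining gap suggests the authors may have used slightly different rounding or a slightly different variant of the formula.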

REFERENCES

1. Rawlins MD. Post-marketing surveillance of adverse drug reactions to drugs. Br Med J 1984; 1: 879-880.
2. Schlesselman JJ. Sample size requirements in cohort and case control studies of disease. Am J Epidemiol 1974; 99: 381.
3. Machin D, Campbell MJ. Statistical Tables for the Design of Clinical Trials. Oxford: Blackwell; 1987.
4. Lemeshow S, Hosmer DW, Klar J, Lwanga SK. Adequacy of Sample Size in Health Studies. Chichester: Wiley; 1990.
5. Snedecor GW, Cochran WG. Statistical Methods, 6th edn. Ames: The Iowa State University Press; 1967: 223.
6. Lewis JA. How many patients? Trends Pharmacol Sci 1981: 93-94.
7. Fleiss JL. Statistical Methods for Rates and Proportions, 2nd edn. New York: Wiley; 1981.
8. Gail M. Power computations for designing comparative Poisson trials. Biometrics 1974; 30: 231-237.
9. Brown CC, Green SB. Additional power computations for designing comparative Poisson trials. Am J Epidemiol 1982; 115: 752-758.
10. Gordon T. Sample size estimation in occupational mortality studies with use of confidence interval theory. Am J Epidemiol 1987; 125: 158-162.
11. Strom BL. Sample size considerations for pharmacoepidemiologic studies in pharmaco-epidemiology. In: Strom BL, Ed. Pharmacoepidemiology. New York: Churchill Livingstone; 1989: 27-37.
12. Rawlins MD, Jefferys DB. Study of United Kingdom product licence applications containing active substances, 1987-9. Br Med J 1991; 302: 223-225.