Mutation Research, 64 (1979) 195--204
© Elsevier/North-Holland Biomedical Press
THE MICRONUCLEUS TEST: STATISTICAL DESIGN AND ANALYSIS
BRUCE E. MACKEY and JAMES T. MACGREGOR U.S. Department of Agriculture, Science and Education Administration, Western Regional Research Center, Berkeley, CA 94710 (U.S.A.)
(Received 8 June 1978) (Revision received 28 November 1978) (Accepted 11 December 1978)
Summary
Alternative statistical procedures are discussed which may be employed to compare the incidences among treatment groups of micronucleated polychromatic and normochromatic erythrocytes and their ratios. Comparison of incidences of micronucleated polychromatic erythrocytes using a sequential sampling strategy based on the negative binomial distribution is shown to require fewer animals for the same sensitivity of test than a similar procedure based on the binomial distribution. The sequential test is superior, both in power and number of animals required, to an alternative 1-stage test based on the same distribution. The procedure described permits the investigator to optimize the number of animals in each test group and the number of cells counted per animal to detect a predetermined increase in the incidence of micronucleated cells over that observed in the control population within chosen limits of type I and type II error. An alternative sequential approach based on the binomial distribution is presented, which is applicable when the number of cells analyzed per animal is variable.
The micronucleus test developed by Schmid and coworkers [7] has become a widely used screening procedure for the in vivo detection of chromosome breakage or chromosome loss induced in bone-marrow erythroblasts by chemical mutagens. The technical methodology of the test has been described in detail [7,8], but little attention has been given to the statistical design of experiments employing this test. We describe below a sequential statistical analysis based on the negative binomial distribution, which permits the investigator to optimize the number of animals per test group and the number of cells analyzed per animal. The test will detect a predetermined increase in the incidence of micronucleated cells over that observed in the control population within chosen limits of probability of type I ($\alpha$) and type II ($\beta$) error. In this case, $\alpha$ is the probability of declaring that a compound is mutagenic when it is not, and $\beta$ is the probability of not declaring that a compound is mutagenic when it in fact is. The relative merits of our sequential approach are assessed in comparison with a corresponding single-stage method and an alternative treatment of data which is presently in use.
The micronucleus test -- experimental design
The most common experimental design of the micronucleus test [8] consists of: (1) dosing groups of animals with several doses of test material and appropriate positive and negative control substances, (2) preparing and staining bone-marrow smears from each animal at an appropriate interval after dosing, and (3) scoring the incidence of polychromatic and normochromatic erythrocytes with micronuclei. The parameters to be compared among treatment groups are: (1) the incidence of micronucleated polychromatic erythrocytes, (2) the incidence of micronucleated normochromatic erythrocytes, and (3) the ratio of normochromatic to polychromatic erythrocytes.
Choice of distribution
The primary variable of interest is the incidence of micronucleated polychromatic erythrocytes, which in our laboratory averages about 2/1000 for the negative controls. If approximately the same number of cells, e.g., 1000, are counted for each animal, the number of micronucleated cells takes on a discrete distribution of small counts. One might consider a Poisson distribution as a logical choice except that variances in count are considerably greater than their means for positive controls. The negative binomial distribution is the usual choice in this situation. A discussion of fitting the negative binomial distribution to our data is given in the Appendix, along with some sampling plan formulae.
Sequential sampling plan construction
Oakland [5] used the sequential sampling theory of Wald [10] to derive formulae specific to the negative binomial for decision limits, operating characteristic curves and expected number of samples to reach a decision. Figs. 1, 2, and 3, respectively, give the results of applying Oakland's formulae to our data. As can be seen from the operating characteristic curve (Fig. 2), the sampling plan was constructed such that a sample would have a 1% chance of being declared mutagenic ($\alpha = 0.01$) if the "true" incidence was 2 micronucleated polychromatic erythrocytes/1000 in the population from which it was drawn. (The 2/1000 incidence was chosen as the approximate average of negative control counts from our laboratory.) Conversely, a 3-fold increase in "true" incidence results in a 90% chance of the mutagenic decision ($\beta = 1 - 0.90$). Similarly, 0.95 and 0.99 probability levels are attained for 3.6- and 5.3-fold increases, respectively. It should be emphasized that the above test properties are only intended to be used as an example and are not intended as a general recommendation. In theory, the increase in mutagenicity that would be of practical importance to detect and the degree of certainty desired would vary with each experiment. Using the method presented here and the formulae in the Appendix, an investigator can achieve different test sensitivities for limited numbers of animals and cells counted per animal.
Fig. 1. Decision limits for 2 sequential sampling plans, 1000 (solid lines) and 500 (broken lines) cells counted per animal, based on the negative binomial distribution. (Abscissa: number of animals; the region between the limits is labeled "continue sampling".)
Sequential sampling plan application
Application of the plan is based on decision limits given in Fig. 1. Counts/1000 cells are accumulated over a number of animals until the plot of the cumulative count versus the number of animals falls above the upper line or below the lower line of the plan based on counts/1000. One must continue treating animals and making counts as long as total counts fall between the 2 lines. If a plot crosses the upper line, the associated unknown is declared mutagenic, while crossing the lower line leads to the non-mutagenic decision.
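The stopping rule just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program. The function name `sequential_decision` and the example counts are hypothetical. The limit values (lower intercept about -4.91, upper intercept about 9.64, slope about 3.44) were obtained by evaluating the Appendix formulae for k = 2.54, incidences of 2 and 6 per 1000, $\alpha$ = 0.01 and $\beta$ = 0.10; they were not read from Fig. 1.

```python
def sequential_decision(counts, a_lower, a_upper, slope):
    """Apply a sequential plan to per-animal counts (one count per animal).

    The decision lines are a_lower + slope * n and a_upper + slope * n,
    where n is the number of animals scored so far.
    """
    total = 0
    for n, count in enumerate(counts, start=1):
        total += count
        if total > a_upper + slope * n:
            return "mutagenic", n          # crossed the upper line
        if total < a_lower + slope * n:
            return "non-mutagenic", n      # crossed the lower line
    return "continue sampling", len(counts)


# Assumed limits for 1000 cells/animal (see the lead-in for their origin).
A_LOWER, A_UPPER, SLOPE = -4.91, 9.64, 3.44

# Hypothetical counts of micronucleated cells per 1000 cells, one per animal.
print(sequential_decision([1, 2, 0, 3], A_LOWER, A_UPPER, SLOPE))
print(sequential_decision([9, 12], A_LOWER, A_UPPER, SLOPE))
print(sequential_decision([3, 4], A_LOWER, A_UPPER, SLOPE))
```

With these assumed limits, a lower-line crossing cannot occur before the second animal, because the lower line is still negative at n = 1.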
Fig. 2. Operating characteristic curves for sequential and 1-stage sampling plans. (Curves: sequential negative binomial and one-stage negative binomial. Abscissa: "true" frequency for an unknown, micronucleated cells/1000 cells.)
Fig. 3. Average numbers of animals required for termination of 2 sequential sampling strategies, 1000 and 500 cells counted per animal, based on the negative binomial distribution. (Abscissa: "true" frequency for an unknown, micronucleated cells/1000 cells.)
The test presented is designed to detect an increased incidence of micronucleated polychromatic erythrocytes over that observed among historical negative controls. As a prerequisite for applying this test to a given experiment, it is essential that both the negative and positive controls fall into their appropriate classification categories (i.e. non-mutagenic and mutagenic, respectively). If these criteria are not met, either the experiment should be rejected or the decision limits should be recalculated based on the responses of the current controls. (These modifications would not be necessary for the binomial procedure described below.) The animal treatment and slide preparation would actually be done in batches of 5 or 6. In order to satisfy the randomness assumptions of the test, slides should be scored blind and the results given to a tabulator. The tabulator would sort out the treatments by code and determine whether a decision could be made or whether data from more animals should be added. Thus, employing a sequential sampling approach results in considerable savings in effort since a minimum number of slides needs to be counted. Further, the number of animals per treatment group can be set such that there is a high probability of being able to make decisions after the first run. This would result in additional savings through optimizing the number of animals treated in a given experiment.
Alternative use of binomial
A statistical analysis of the mutagenicity test based on the binomial distribution and the tables of Kastenbaum and Bowman [3] was used by Maier and Schmid [4]. Their binomial parameter (p) is the ratio of micronucleated polychromatic erythrocyte counts for the unknown group divided by the sum of those for the unknown plus negative control groups. Thus the expected value of p would be 0.5 if there is no mutagenic effect, and the test is constructed to detect a significant increase in p. Formulae given by Onsager [6] were used to construct a sequential procedure based on the binomial. Decision limits and expected counts needed to make a decision are given in Figs. 4 and 5, respectively. The test was constructed such that the operating characteristic curve would be essentially the same as the one for the procedure based on the sequential negative binomial (Fig. 2). For this test, the number of micronucleated polychromatic erythrocytes in the unknown is plotted against the total for the unknown and the negative control and compared with the decision limits given in Fig. 4. Expected sample size comparisons show that at an average of 2/1000, 3.9 animals are required on the average to reach a decision (Fig. 5). Correspondingly, about 3 animals would be required for the test based on the negative binomial (Fig. 3). Thus, to reach a decision at the same confidence level at the approximate incidence of the negative controls, 25% fewer animals are required using the procedure based on the negative binomial.
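A sequential binomial plan of this kind can be illustrated with Wald's general formulae for a binomial proportion [10]. The sketch below is not a reconstruction of the limits in Fig. 4, which were derived from Onsager's formulae [6] and may differ. It assumes p = 0.5 under the null hypothesis and p = 0.75 under the alternative (a 3-fold increase with equal numbers of cells scored in both groups gives 3/(3 + 1)), and it assumes $\alpha$ = 0.01 and $\beta$ = 0.10 to mirror the negative binomial example. The function names are hypothetical.

```python
import math


def binomial_sprt_limits(p0, p1, alpha, beta):
    """Intercepts and slope of Wald's decision lines for a binomial proportion."""
    denom = math.log(p1 * (1.0 - p0) / (p0 * (1.0 - p1)))
    lower = math.log(beta / (1.0 - alpha)) / denom
    upper = math.log((1.0 - beta) / alpha) / denom
    slope = math.log((1.0 - p0) / (1.0 - p1)) / denom
    return lower, upper, slope


def binomial_decision(unknown_count, combined_count, limits):
    """Compare the unknown's cumulative count with the lines at the combined count."""
    lower, upper, slope = limits
    if unknown_count > upper + slope * combined_count:
        return "mutagenic"
    if unknown_count < lower + slope * combined_count:
        return "non-mutagenic"
    return "continue sampling"


# Assumed design values; see the lead-in.
limits = binomial_sprt_limits(p0=0.5, p1=0.75, alpha=0.01, beta=0.10)
print(limits)
print(binomial_decision(20, 24, limits))  # hypothetical cumulative counts
```

Because the decision depends only on the two cumulative counts, the number of cells scored per animal does not enter the rule, which is why this approach tolerates variable cell counts.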
Fig. 4. Decision limits for a sequential sampling plan based on the binomial distribution. (Abscissa: cumulative count for the negative control plus an unknown, micronucleated cells; the region above the upper limit is labeled "mutagenic".)
Fig. 5. Average numbers of animals required for termination of a sequential sampling strategy based on the binomial distribution. (Abscissa: "true" frequency for an unknown, micronucleated cells/1000 cells.)
Effect of counting a smaller number of cells
Negative binomial distributions are determined by two parameters, k and the mean, and we have assumed that k is a constant (see Appendix). Therefore, to consider distributions for variable numbers of cells counted, we simply make appropriate shifts in the range of the counts. An incidence of 2/1000 for the negative controls is equivalent to 1/500, and a 3-fold increase would become 3/500. Fig. 1 also gives a sampling plan based on the negative binomial that would be appropriate if 500 rather than 1000 cells per animal were counted. The operating characteristic curve is again essentially identical to that for the 1000 cells/animal sequential negative binomial (Fig. 2), but the abscissa values would be divided by 2. The same adjustment is necessary for the expected sample size graph (Fig. 3). Comparison of expected sample sizes reveals a requirement of 1 to 2 additional animals for the savings of counting half as many cells per animal.
1-stage test
The negative binomial distribution is of course not limited to sequential procedures. Total counts for n animals would follow negative binomial distributions with dispersion parameter nk [1]. Fig. 2 also gives the operating characteristic curve for a single-stage test for total counts from 5 animals, nk = 5(2.54) = 12.7. A total of greater than 22 micronucleated polychromatic erythrocytes out of 5000 cells from 5 animals would lead to rejection of the non-mutagenicity hypothesis at $\alpha$ less than 0.01. However, the sequential test is slightly more powerful than the 1-stage test for 5 animals, and the sequential procedure would require fewer animals on the average.
Normochromatic erythrocytes
The same sequential procedure could be used for normochromatic as for polychromatic counts if the number of cells counted were as consistent. However, the procedure in this laboratory has been to count until a certain number of polychromatic cells has been reached, resulting in a highly variable number of normochromatic cells counted per mouse. In this case the sequential negative binomial approach is not applicable because the random variable is defined on the basis of a fixed number of cells counted per mouse. The sequential approach based on the binomial distribution (Fig. 4) would be applicable, and we therefore suggest this alternative when counts are variable.
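Returning to the 1-stage test, its attained significance level can be checked numerically. The sketch below uses hypothetical names. It assumes that the total count from 5 animals follows a negative binomial distribution with dispersion nk = 12.7 and mean 5 x 2 = 10 under the null hypothesis, and mean 5 x 6 = 30 under a 3-fold increase. Probabilities are built up with the recurrence P(r + 1) = P(r) (k + r)/(r + 1) x m/(m + k), starting from P(0) = (k/(m + k))^k.

```python
def nb_upper_tail(c, mean, k):
    """Return P(X > c) for a negative binomial with the given mean and dispersion k."""
    ratio = mean / (mean + k)
    prob = (k / (mean + k)) ** k          # P(X = 0)
    cumulative = prob
    for r in range(c):
        prob *= (k + r) / (r + 1) * ratio  # P(X = r + 1) from P(X = r)
        cumulative += prob
    return 1.0 - cumulative


N_ANIMALS = 5
K_TOTAL = N_ANIMALS * 2.54                # nk = 12.7, as in the text

# Chance of exceeding 22 at the control incidence (2/1000 per animal).
alpha_attained = nb_upper_tail(22, N_ANIMALS * 2.0, K_TOTAL)
# Chance of exceeding 22 after a 3-fold increase (6/1000 per animal).
power_threefold = nb_upper_tail(22, N_ANIMALS * 6.0, K_TOTAL)
print(alpha_attained, power_threefold)
```

Under these assumptions the first probability comes out below 0.01, in agreement with the critical value of 22 quoted above.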
Ratios of normochromatic/polychromatic cells
One may also wish to test whether or not the average ratio of the 2 types of erythrocytes is affected by the treatments. It has been our experience that the distribution of such ratios is not normal and in particular that extreme values are quite common. Thus, a non-parametric procedure such as the Kruskal-Wallis test [9] would be appropriate.
Appendix
Fitting the negative binomial distribution to the data
Two parameters define the negative binomial, m and k, where m is the mean and k is related to the degree of clumping. The probability of observing any count r (r = 0, 1, 2, ...) is given by
$$P(r) = \binom{k+r-1}{r} \left(\frac{m}{m+k}\right)^{r} \left(\frac{k}{m+k}\right)^{k}$$
where m and k are greater than zero, and the variance is
$$m + \frac{m^{2}}{k} \qquad (*)$$
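The probability function and the variance (*) can be checked against each other numerically. The following sketch uses hypothetical function names. Because k need not be an integer, it evaluates the binomial coefficient through the gamma function, i.e. as $\Gamma(k+r)/[r!\,\Gamma(k)]$, which is the usual generalization. It uses m = 2 and k = 2.54, the values discussed in the text.

```python
import math


def nb_pmf(r, m, k):
    """Negative binomial probability of count r for mean m and parameter k."""
    log_coef = math.lgamma(k + r) - math.lgamma(r + 1) - math.lgamma(k)
    log_prob = log_coef + r * math.log(m / (m + k)) + k * math.log(k / (m + k))
    return math.exp(log_prob)


def nb_moments(m, k, r_max=500):
    """Total probability, mean and variance summed over counts 0 .. r_max - 1."""
    probs = [nb_pmf(r, m, k) for r in range(r_max)]
    total = sum(probs)
    mean = sum(r * p for r, p in enumerate(probs))
    variance = sum((r - mean) ** 2 * p for r, p in enumerate(probs))
    return total, mean, variance


total, mean, variance = nb_moments(m=2.0, k=2.54)
print(total, mean, variance, 2.0 + 2.0 ** 2 / 2.54)
```

The last two printed values should agree, since the summed variance is being compared with m + m^2/k.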
The Poisson distribution is the limiting form as $k \to \infty$. The objective prior to sampling plan construction is to see if the data can be adequately fit with a single value of k for all values of m. The methods of Bliss and Owen [2] were used to examine the validity of a common k and to calculate it. (A Fortran program for estimating k is available on request.) The moment method for estimating k is based on the following argument. First, we define $x' = \bar{x}^{2} - s^{2}/n$ and $y' = s^{2} - \bar{x}$, where $\bar{x}$ and $s^{2}$ are the mean and variance of a sample of size n, respectively. Since $E(x') = m^{2}$ and, by a substitution from (*), $E(y') = m^{2}/k$, 1/k can be estimated as the slope of a regression of y' on x'. ($E(x')$ is the expectation of x', and is defined as the average value for the probability distribution of x'.) The precision of our k estimate depends on the variance of $y' - x'/k$,
$$V = \frac{2x'(\bar{x} + k')^{2}\,[k'(k'+1) - (2k'-1)/n - 3/n^{2}]}{(n-1)\,{k'}^{4}}$$
which varies among samples. Therefore, a weighted regression is used, the weights being the reciprocals of these variances, $w = 1/V$. Since the weights depend on what we are trying to estimate, the calculation proceeds in an iterative fashion. An initial estimate of k is taken to be
[equation lost in extraction; the surviving fragment is a sum over $i = 1, \ldots, g$]
where g is the number of samples. Successive approximations are then made by
$$k_c = \frac{\sum_{i=1}^{g} w_i {x_i'}^{2}}{\sum_{i=1}^{g} w_i x_i' y_i'}$$
with the w's being recalculated at each stage from the k of the previous stage. The procedure terminates when the difference between k's of the final two iterations is sufficiently small.
Fig. 6. Negative binomial distributions with a common k would be represented by points following a straight line with a slope of one. $y' = s^{2} - \bar{x}$ and $x' = \bar{x}^{2} - s^{2}/n$. (Log-log plot of y' against x'; both axes run from 0.1 to 1000.)
Fig. 7. The relationship between the means and variances of the data vs. those of the Poisson distribution and a series of negative binomial distributions with k = 2.54. (Log-log plot of variance against mean; plotted groups include negative controls and colchicine.)
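The iterative moment estimate of a common k described above can be sketched as follows. This is an illustration under stated assumptions, not the Fortran program mentioned in the text. The function name `common_k` is hypothetical. The starting value (an unweighted regression of y' on x' through the origin) is an assumption, because the initial estimate is not legible in the source. The weights use the variance expression V as read from the text, with k' taken as the current value of k.

```python
def common_k(groups, tol=1e-6, max_iter=100):
    """Estimate a common k from (mean, variance, n) summaries of each sample.

    Assumes every group has x' > 0 and variance > mean, so that the weights
    and the regression slope are positive.
    """
    points = [(mean, mean ** 2 - var / n, var - mean, n) for mean, var, n in groups]

    # Assumed starting value: unweighted regression through the origin.
    k = sum(x * x for _, x, _, _ in points) / sum(x * y for _, x, y, _ in points)

    for _ in range(max_iter):
        weights = []
        for mean, x, _, n in points:
            bracket = k * (k + 1) - (2 * k - 1) / n - 3 / n ** 2
            v = 2 * x * (mean + k) ** 2 * bracket / ((n - 1) * k ** 4)
            weights.append(1.0 / v)            # w = 1/V
        numerator = sum(w * x * x for w, (_, x, _, _) in zip(weights, points))
        denominator = sum(w * x * y for w, (_, x, y, _) in zip(weights, points))
        k_new = numerator / denominator
        if abs(k_new - k) < tol:               # successive estimates agree
            return k_new
        k = k_new
    return k


# Synthetic summaries constructed so that y' = x'/k holds exactly for k = 2.54
# (not real data): variance = (m + m^2/k) / (1 + 1/(n k)) with n = 6.
K_DEMO = 2.54
demo = [(m, (m + m * m / K_DEMO) / (1 + 1 / (6 * K_DEMO)), 6) for m in (2.0, 6.0, 20.0)]
print(common_k(demo))
```

On real data the groups for which the mean exceeds the variance would have to be set aside first, as is done for Fig. 6.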
For illustrative purposes we will consider data from 8 separate micronucleus tests carried out at our laboratory. The data to be considered consist of 31 treatment groups (of, in most cases, 6 animals) of which 8 groups were negative controls and 6 positive controls. The positive control substance was in one case colchicine and in the others was either 0.2 or 0.25 mg/kg triethylenemelamine (TEM). Fig. 6 is a graph of log y' versus log x'. In this case, the sample means and variances are of micronucleated erythrocyte counts/1000 for each treatment group. Fig. 6 does not include data for which $\bar{x}$ is greater than $s^{2}$ because log y' would then be undefined. Omission of these data does not impair goodness of fit comparisons since we are primarily interested in samples that would fall somewhere between the negative and positive controls. Negative binomials with a common k show a straight line dispersion between log y' and log x' with a slope of one, because if y' = x'/k, log y' = log x' - log k. We can see from Fig. 6 that the positive controls fall slightly below the line, but in view of the variability, the points are encouragingly close to satisfying the required conditions. Since k is a function of y'/x', log (y'/x') should be independent of log ($\bar{x}$) if k is not correlated with the mean. The slight negative correlation (r = -0.353) is reflective of the result for the positive controls in Fig. 6, i.e., points to the right falling below the line. However, the correlation is not significantly different from zero, and therefore we accept the hypothesis of a constant k. Fig. 7 shows log variances versus log means for the data against those for negative binomial distributions with k = 2.54, estimated from the data, and against those for Poisson distributions.
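Sampling plan formulae
The null and alternative hypotheses are $H_0: \bar{x}' \le \bar{x}_1$ and $H_A: \bar{x}' \ge \bar{x}_2$, respectively. First, p and q are calculated as follows.

Parameter            $H_0$                      $H_A$
$p = \bar{x}/k$      $p_1 = \bar{x}_1/k$        $p_2 = \bar{x}_2/k$
$q = 1 + p$          $q_1 = 1 + p_1$            $q_2 = 1 + p_2$

Then the intercepts are calculated as
$$a_1 = \frac{\log[\beta/(1-\alpha)]}{\log[p_2 q_1/(p_1 q_2)]} \quad \text{and} \quad a_2 = \frac{\log[(1-\beta)/\alpha]}{\log[p_2 q_1/(p_1 q_2)]}$$
The slope is
$$b = \frac{k \log(q_2/q_1)}{\log[p_2 q_1/(p_1 q_2)]}$$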
In order to get rough ideas of the O-C and expected sample size curves, a few points on each can be calculated as follows [6]:

$\bar{x}'$        O-C curve, $L(p)$        Expected sample size, $E(n)$
0                 1                        $-a_1/b$
$\bar{x}_1$       $1-\alpha$               *
$\bar{x}_2$       $\beta$                  *
$b$               $a_2/(a_2-a_1)$          $-a_1 a_2/(b^{2}/k + b)$

* $E(n) = [a_2 + (a_1 - a_2)L(p)]/(\bar{x}' - b)$
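The sampling plan quantities can be evaluated directly for the example used in the text. The sketch below uses hypothetical names. It assumes that the intercepts and slope take the standard Wald-type form for the negative binomial: $a_1 = \log[\beta/(1-\alpha)]/D$, $a_2 = \log[(1-\beta)/\alpha]/D$ and $b = k\log(q_2/q_1)/D$, with $D = \log[p_2 q_1/(p_1 q_2)]$. It then computes the four tabulated points of the O-C and expected sample size curves for k = 2.54, $\alpha$ = 0.01 and $\beta$ = 0.10.

```python
import math


def nb_plan(mean0, mean1, k, alpha, beta):
    """Lower intercept a1, upper intercept a2 and slope b of the decision lines."""
    p1, p2 = mean0 / k, mean1 / k
    q1, q2 = 1.0 + p1, 1.0 + p2
    denom = math.log(p2 * q1 / (p1 * q2))
    a1 = math.log(beta / (1.0 - alpha)) / denom
    a2 = math.log((1.0 - beta) / alpha) / denom
    b = k * math.log(q2 / q1) / denom
    return a1, a2, b


def curve_points(mean0, mean1, k, alpha, beta):
    """(true mean, L, E(n)) at the four tabulated points."""
    a1, a2, b = nb_plan(mean0, mean1, k, alpha, beta)

    def expected_n(true_mean, accept_prob):
        # E(n) = [a2 + (a1 - a2) L] / (true mean - b), valid away from the slope b
        return (a2 + (a1 - a2) * accept_prob) / (true_mean - b)

    return [
        (0.0, 1.0, -a1 / b),
        (mean0, 1.0 - alpha, expected_n(mean0, 1.0 - alpha)),
        (mean1, beta, expected_n(mean1, beta)),
        (b, a2 / (a2 - a1), -a1 * a2 / (b * b / k + b)),
    ]


# 1000 cells/animal: means 2 and 6; 500 cells/animal: means 1 and 3.
for cells, (m0, m1) in {1000: (2.0, 6.0), 500: (1.0, 3.0)}.items():
    print(f"{cells} cells per animal")
    for true_mean, accept_prob, animals in curve_points(m0, m1, 2.54, 0.01, 0.10):
        print(f"  mean {true_mean:.2f}  L {accept_prob:.3f}  E(n) {animals:.2f}")
```

Under these assumptions the sketch gives about 3.3 animals at the control incidence for 1000 cells per animal and about 4.6 for 500 cells per animal, in line with the "about 3 animals" and "1 to 2 additional animals" quoted in the main text.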
Acknowledgement
The authors are grateful to Nancy Lynne Ullner for providing the computer analyses.
References
1 Anscombe, F.J., The statistical analysis of insect counts based on the negative binomial distribution, Biometrics, 5 (1949) 165--173.
2 Bliss, C.I., and A.R.G. Owen, Negative binomial distributions with a common k, Biometrika, 45 (1958) 37--58.
3 Kastenbaum, M.A., and K.O. Bowman, Tables for determining the statistical significance of mutation frequencies, Mutation Res., 9 (1970) 527--549.
4 Maier, P., and W. Schmid, Ten model mutagens evaluated by the micronucleus test, Mutation Res., 40 (1976) 325--337.
5 Oakland, G.B., An application of sequential analysis to whitefish sampling, Biometrics, 6 (1950) 59--67.
6 Onsager, J.A., The rationale of sequential sampling with emphasis on its use in pest management, U.S. Dept. Agric. Tech. Bull., 1526 (1976) 19 pp.
7 Schmid, W., The Micronucleus Test for Cytogenetic Analysis, in: A. Hollaender (Ed.), Chemical Mutagens, Vol. 4, Plenum, New York, 1976, pp. 31--53.
8 Schmid, W., The micronucleus test, Mutation Res., 31 (1975) 9--15.
9 Siegel, S., in: Nonparametric Statistics for the Behavioral Sciences, McGraw-Hill, New York, 1956, pp. 184--193.
10 Wald, A., Sequential tests of statistical hypotheses, Ann. Math. Stat., 16 (1945) 117--186.