Tubercle, Lond., (1958), 39, 143
Reliability of the Multiple Puncture Tuberculin Test Compared with the Mantoux Test
By C. J. STEWART, R. G. CARPENTER and P. McCAULEY, from the Ipswich and East Suffolk Chest Department, the Department of Human Ecology, University of Cambridge and St. Audry's Hospital, Melton
With the assistance of M. BARRY, D. P. F. EMBLETON, D. VAN ZWANENBERG and R. DENT
Heaf (1951) described a method of performing the tuberculin test using a multiple puncture apparatus. This test is being increasingly used in Britain as a diagnostic test, for routine group testing and for epidemiological surveys. Some of the practical advantages over the Mantoux test in field work have been described by Stott (1955) and Irvine (1954). More recently it has been claimed that a cutaneous or intradermal test using either killed or live BCG might be a more specific test of previous virulent tuberculous infection (Assis and Carvalho, 1942; Ustvedt and Aanonsen, 1949; Ustvedt, 1950, 1953; Bueno, 1947; Frappier and Guy, 1949, 1950; Frappier and others, 1955). Heaf (1955) suggested that an intradermal injection of 0.01 mg. of live BCG might be substituted for the pre-BCG vaccination tuberculin test. This paper presents the results of a trial of the technical errors of administering the 5 TU OT Mantoux test and Heaf multiple puncture tests using PPD and BCG. Other factors that have a bearing on the choice of tests are also briefly discussed. The terms used in this paper, and in particular the definitions of positivity, which were decided before the analysis was started, are given in the Appendix.
Plan of the Survey
Persons Tested. Three tests, the 5 TU Mantoux, the multiple puncture MP/PPD and the MP/BCG, were each performed on 405 long-stay patients at a mental hospital. There were 180 males, aged between 20 and 80 years, the average age being 54, and 225 females, between 20 and 90 years old, the average age being 59 years. The percentage of positives is high in both sexes. For males it varies according to the test used and the day of reading between 77.8 and 91.1; for females it is significantly lower, and varies between 67.1 and 84.4. These high percentages are what is expected for a mental hospital population (Snell and others, 1943). No special facilities were provided in the wards on which the testing and reading were done. The lighting conditions varied from very good to bad. Generally the conditions were much the same as those met with in field surveys when school buildings and village halls are used as testing centres.
Selection of Tuberculin and Materials Used. The 5 TU Mantoux test was chosen because of the claim that it is the most suitable single test for the selection of candidates for BCG vaccination (Palmer, 1953; Edwards and others, 1953). Pure Old Tuberculin (supplied commercially by Burroughs Wellcome) was diluted 1 : 2,000 in borate buffered solution in sealed sterile ampoules.
Two new tuberculin syringes and needles supplied at the commencement of the investigation were used throughout. A separate needle was used for each Mantoux test. After each session syringes and needles were washed in boiled water and autoclaved. The multiple puncture apparatus described by Heaf (1951) was used to administer glycerated PPD (Weybridge), 2 mg. per ml., and the freeze dried BCG vaccine of the Pasteur Institute (batch No. BCG 5x9/I556/OOOSIV/x I42), which was reconstituted to contain 75 mg. per ml. (Heaf, personal communication). All antigens were stored at 3° C. until the day of use. The BCG was reconstituted at the commencement of each session and all unused antigens were discarded at the end of a session. The skin at each test site was cleaned with acetone. BCG and PPD were applied with separate glass rods. Separate new multiple puncture apparatus was supplied for the PPD and for the BCG tests, and the two were kept apart. A 1 mm. depth of puncture was used throughout. The plates and puncture needles were sterilized by flaming between each application. Preliminary testing of tuberculous patients with BCG antigen suggested that severe reactions would be unlikely.
Procedure and Order of Testing. The subjects tested were divided into nine groups of 45, the procedure being the same for each group. The three tests were performed at the same session on each patient. Each test was performed by a different tester. The sites used for the tests were the upper and lower third of the right forearm and lower third of the left forearm. The sites of the testing were changed after every fifth patient, and the tester performing the tests changed after every fifteenth patient. These rotations were arranged by means of a set of instruction sheets, one for each tester. The sheets listed the subjects by reference number in the order in which they were to come for testing, and showed which test the tester should give and where it should be given. The instruction sheet a tester received depended on which test he chose to do first. Subjects were allocated to different testers at random as follows. A list of subjects who would form the next group to be tested was given to us. Reference numbers were randomly allocated to the subjects on this list, the randomization being done by means of random sampling numbers. When the tests were given and read, the subjects were arranged in an order determined by their reference numbers. Of the nine groups tested in this way there were 4 groups of male and 5 groups of female subjects. Hence each tester gave each test to 135 subjects, 60 of them male and 75 female. Two of the three workers giving the tests, testers I and II, were members of the staff of a chest clinic, and had experience of both the Mantoux and multiple puncture tests. Occasionally they were replaced by experienced deputies from the clinic. Tester III had only limited previous experience with Mantoux testing and none with the multiple puncture technique. No training was done to standardize the work of giving or reading the reactions.
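The allocation and rotation just described can be sketched in code. This is only a minimal illustration, not the authors' instruction sheets, which are not reproduced in the paper. It assumes a group of 45 subjects, assumes that each tester moves on to the next test after every fifteenth patient, and assumes that the three sites are permuted cyclically after every fifth patient. All names (`build_group_schedule`, the site labels, the seed) are hypothetical.

```python
import random
from collections import Counter

TESTS = ("Mantoux", "MP/PPD", "MP/BCG")
SITES = (
    "right forearm, upper third",
    "right forearm, lower third",
    "left forearm, lower third",
)
TESTERS = ("I", "II", "III")


def build_group_schedule(subjects, seed=None):
    """Return (reference number, subject, tester, test, site) rows for one group.

    Assumed rotation (not taken from the paper): testers shift to the next
    test after every 15th patient, and sites shift after every 5th patient.
    """
    rng = random.Random(seed)
    order = list(subjects)
    rng.shuffle(order)  # random allocation of reference numbers
    schedule = []
    for position, subject in enumerate(order):
        reference_number = position + 1
        tester_shift = position // 15       # changes after every fifteenth patient
        site_shift = (position // 5) % 3    # changes after every fifth patient
        for tester_index, tester in enumerate(TESTERS):
            test_index = (tester_index + tester_shift) % 3
            site_index = (test_index + site_shift) % 3
            schedule.append(
                (reference_number, subject, tester, TESTS[test_index], SITES[site_index])
            )
    return schedule


# One group of 45 hypothetical subject identifiers.
schedule = build_group_schedule(range(1, 46), seed=1958)
tests_per_tester = Counter((tester, test) for _, _, tester, test, _ in schedule)
for (tester, test), count in sorted(tests_per_tester.items()):
    print(f"tester {tester} gives {test} to {count} subjects in this group")
```

Under these assumptions each tester gives each test to 15 subjects per group, which over nine groups yields the 135 tests per tester and test reported in the text.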
The readings of the reactions reported here were all made by one experienced observer, tester II, on the third or fourth and on the seventh day, except that in reading one group of patients on the third day he was replaced by his deputy. The reactions were also read by another experienced observer, tester I, or his deputy, and by two other less experienced observers. Each reaction was thus read independently by four observers on the third or fourth and on the seventh day. When reading, the experienced observers read 15 reactions of one kind and on one site in the first group of 15 subjects, and then changed to a second and third site, reading reactions to a second and third test for the next two groups of subjects. Three circuits were made by each observer reading in this way, so as to read three reactions on each subject. The order in which the sites were inspected was set out on the reading sheets, which were prepared beforehand. It is probable that an observer would only remember a few of the subjects upon whom he performed a particular test. Also, after having read 45 other reactions, an observer would not be likely to remember what he had recorded when he read the other tests, nor would he see it, as the readings made on each circuit were recorded on separate reading sheets. This material made possible a detailed study of the effects of observer error; but only a few of these results are included here. For simplicity in presenting the results, the work of a deputy has been included with that of the worker for whom he was deputizing unless otherwise stated. The effects of this are considered below.
Fig. 1. - Showing the effect of tester on the distribution of 5 TU Mantoux reactions, when read on the third or fourth and seventh day. [Panels: 3rd or 4th day and 7th day Mantoux reactions (OT) for testers I, II and III; horizontal axis, size of reaction in mm.]
Findings
THE EFFECT OF DIFFERENT TESTERS
The effects of the tester on the Mantoux reactions at the third or fourth day are shown in Fig. 1. Each distribution is based on 135 Mantoux tests. Differences in the percentages of positives will be seen, the maximum difference being 14 per cent between testers I and II. A χ² test shows the variations in the percentages of positives are significant (0.05 > P > 0.01). The mean sizes of the Mantoux reactions read on the third or fourth day are shown in Table I. Statistical analysis* shows that reactions to tests given by tester I tend to be smaller than those of the other two testers (0.01 > P > 0.001). The mean size of reactions to tests given by tester II is very close to that of tests given by tester III, and the difference could easily have occurred by chance.
* The mean sizes of the reactions of the three testers may be compared by an overall analysis of variance followed by t tests. Such tests will not be much affected by the fact that the distributions are not normal, nor by any differences there may be in their spread, as each mean is based on the same number of observations (Box and Andersen, 1955). The significance of the differences was not altered if one reaction read as 63 mm. (see Fig. 1) was discarded from the analysis. Alternatively the locations of the distributions may be compared by non-parametric Mann-Whitney tests (Mann and Whitney, 1947; Siegal, 1956). The results of the two methods of analysis are the same.
TABLE I. - SHOWING THE MEAN SIZE OF THE REACTIONS TO THE MANTOUX TESTS GIVEN BY THREE TESTERS, AND READ ON THE THIRD OR FOURTH AND ON THE SEVENTH DAY BY A SINGLE READER
                                      Mean size (mm.)
Tester   No. of subjects tested   3rd or 4th day*   7th day†
I                 135                   8.1            8.0
II                135                  10.8           10.5
III               135                  10.7           10.4
* The standard error of these means, based on a pooled estimate of variance of the third or fourth day readings, is ± 0.66.
† The standard error of these means, based on a pooled estimate of variance of the seventh day readings, is ± 0.55.
However, although the mean sizes of two groups of reactions may be the same, the distributions can be different. A χ² test shows that the distributions of testers II and III are significantly different (0.05 > P > 0.01). Inspection of Fig. 1 shows that the difference is in the number of reactions with between 2 and 14 mm. of induration. A χ² test also shows significant differences (0.01 > P > 0.001) between the distributions of testers I and III. When the distributions of testers I and II are compared by means of a χ² test, the difference is not significant. This is because this test is not very sensitive to differences in the location of the distributions. But it also draws attention to the fact that the significant differences in the percentage of positives recorded at the beginning of this section are due to the level at which a reaction is classed as positive. If the dividing line had, a priori, been taken anywhere between 4 mm. and 10 mm. inclusive, the results of the test for differences in the percentages of positives would have been the same. If, however, the dividing line had been taken at 1 or 2 mm. the differences in the percentages of positives between all three testers would not have been significant. If indurations of 3 mm. or more had been regarded as positive the χ² test would have bordered on significance.
Fig. 1 also compares the distributions of the 3 testers when the Mantoux reactions are read after seven days. Each distribution shows fewer negative reactions than on the third or fourth day.
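As a rough arithmetical check on Table I, the standard errors quoted under the table can be turned into a test of the difference between two testers' means. The sketch below is an illustration under stated assumptions, not the authors' analysis (which was an analysis of variance followed by t tests). It takes each mean to have the pooled standard error given in the table footnotes, treats the three means as independent, and uses a normal approximation in place of the t distribution, which should be close with 135 readings per tester. The function names are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided tail probability of a standard normal deviate."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def compare_means(mean_a, mean_b, se_of_each_mean):
    """Return (z, P) for the difference of two independent means sharing one SE."""
    se_of_difference = se_of_each_mean * math.sqrt(2.0)
    z = (mean_a - mean_b) / se_of_difference
    return z, two_sided_p(z)


# Mean sizes (mm.) and pooled standard errors taken from Table I.
DAY_3_OR_4 = {"I": 8.1, "II": 10.8, "III": 10.7}
DAY_7 = {"I": 8.0, "II": 10.5, "III": 10.4}
SE_DAY_3_OR_4 = 0.66
SE_DAY_7 = 0.55

for label, means, se in (
    ("3rd or 4th day", DAY_3_OR_4, SE_DAY_3_OR_4),
    ("7th day", DAY_7, SE_DAY_7),
):
    for a, b in (("II", "I"), ("III", "I"), ("II", "III")):
        z, p = compare_means(means[a], means[b], se)
        print(f"{label}: tester {a} vs tester {b}: z = {z:.2f}, P = {p:.4f}")
```

Under these assumptions the third or fourth day comparisons of testers II and III with tester I both fall in the range 0.01 > P > 0.001, while the comparison of tester II with tester III is far from significant, which agrees with the statements in the text.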
The mean sizes of the reactions of the three distributions are shown in Table I. The differences between the distributions of the seventh day readings follow an exactly similar pattern to the differences between the distributions of the third or fourth day readings.
Fig. 2 compares the reactions resulting from the 135 MP/PPD tests performed by each tester. The variation in the percentages of positives between the testers is much smaller than with the Mantoux test. On the third or fourth day readings the maximum difference is less than 5 per cent and on the seventh day it is less than 4 per cent. Differences in the percentages of positives between the testers of up to 9 per cent could easily occur by chance. Differences are, however, seen in the distribution of the sizes of the MP/PPD reactions in Fig. 2. There is little difference between testers I and II when the tests are read either at the third or fourth day or at the seventh day. Since the reactions are not classified on a linear scale there is no such thing as a mean size of reaction, but the distributions may be compared by means of χ² tests and Mann-Whitney U tests (Mann and Whitney, 1947; Siegal, 1956). The differences between testers I and II are not significant. Tester III's reactions tend to be larger than those of testers I and II. The difference between tester III and tester I is statistically significant (0.01 > P > 0.001). The difference between tester III and tester II is also significant (0.01 > P > 0.001). By the seventh day the differences between tester III's results and those of the other two testers are slightly more marked. Among those tested by tester III there are on the seventh day 27 per cent more plaques than among subjects tested by the other testers.
Fig. 2. - Showing the effect of tester on the distribution of MP/PPD reactions, when read on the third or fourth and seventh day. [Panels: 3rd or 4th day and 7th day MP (PPD) reactions for testers I, II and III; horizontal axis, number of papules, ring, or diameter of plaque in mm.]
Fig. 3. - Showing the effect of tester on the distribution of MP/BCG reactions, when read on the third or fourth and seventh day. [Panels: 3rd or 4th day and 7th day MP (BCG) reactions for testers I, II and III; horizontal axis, number of papules, ring, or diameter of plaque in mm.]
Fig. 3 is a similar study of the effect of tester on the MP/BCG test. Again, each tester did 135 of the tests. The variations in the percentages of positive reactions are larger than for the MP/PPD test, the maximum differences being 6 per cent at the third or fourth day and 7 per cent on the seventh day, but are not significant. The distributions suggest that tester I produces more reactions of six papules than the other two testers, that tester II produces more rings than the other two testers, and that tester III produces more plaques than the other two testers. These differences are rather more marked on the seventh day, when χ² tests show they are significant - that is, they are unlikely to be due to chance. Mann-Whitney U tests show that the differences between testers I and II amount only to differences in certain parts of the distribution, and do not represent a significant tendency for tester II to produce larger reactions than tester I. Differences between tester III and the other testers consist of a general increase in the size of the reactions, an effect which is significant on both the third or fourth and seventh days, the largest of the four values of P for these comparisons being less than 0.01.
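Because the MP reactions are graded rather than measured, the Mann-Whitney comparison used above depends only on the ordering of the grades. The following sketch shows how the U statistic could be computed for two testers' graded reactions. The grades listed are invented purely for illustration and are not data from the trial. The numerical coding (0 to 6 for the number of papules, 7 for a ring, 8 for a plaque) is an assumption made here, as are the function and variable names.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: pairs (a, b) with a > b, ties counting one half."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical ordinal coding: 0-6 = number of papules, 7 = ring, 8 = plaque.
# These grades are made up for the example and are NOT taken from the trial.
grades_tester_x = [0, 2, 4, 6, 6, 7, 8, 8]
grades_tester_y = [0, 1, 3, 4, 6, 6, 7, 7]

u_x = mann_whitney_u(grades_tester_x, grades_tester_y)
n_pairs = len(grades_tester_x) * len(grades_tester_y)

# u_x / n_pairs estimates the chance that a reaction from tester X is graded
# higher than one from tester Y (ties split equally); 0.5 means no tendency.
print(f"U = {u_x}, proportion of pairs favouring tester X = {u_x / n_pairs:.3f}")
```

A value of U near half the number of pairs indicates no general tendency for one tester's reactions to be larger, even where a χ² test finds differences in particular parts of the distribution. Judging significance requires tables of U or a normal approximation with a correction for ties, which this sketch omits.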
The findings may be summarized by saying that only in the Mantoux test are the percentages of positive and negative reactors significantly affected by the tester. In the Mantoux test testers II and III gave larger reactions than tester I, and differences also appear between the reactions of testers II and III. In both MP tests, tester III gives larger reactions than the other two testers. The MP/PPD test shows little difference between testers I and II, but with the MP/BCG test differences occur in the distributions of reactions of 6 papules or larger. Although tester III's lack of experience may be partly responsible for some of the results, important differences also occur between the experienced testers I and II, particularly with the Mantoux test. Before finally accepting the results as indicating real differences in testing technique it is worth considering the influence of other factors. The differences between testers I and II were slightly reduced, and not increased, by amalgamating their tests with those performed by their deputies. Tester III was present at all sessions. The readings made by the other experienced observer give results that are similar to those presented, and analysis did not suggest that an observer's reading was prejudiced when he was the one who had given the test. It seemed possible that in giving 45 tests at a session the testers might become fatigued or the instruments affected, for example by antigen adhering to the points of the multiple puncture needles in increasing concentrations, or by the plunger of the Mantoux syringe becoming more difficult to operate owing to the adhesion of tuberculin. The trial was not specifically designed to investigate these points, but the order in which the testers did the tests was not the same for each group and so they could to some extent be investigated.
Analysis does not suggest that either of these factors affects the number of positive and negative reactions or their size; in particular they do not explain the association of small Mantoux reactions with tester I. A third possible explanation is that differences between the testers represent real differences between the subjects. The design of the trial balanced a number of factors. These were: tests, the site on which the test was given, fatigue on reading or the order in which the tests were read, the groups of patients (which includes sex), and the testers. As a result any of these factors which affect the reactions will affect equal proportions of tests given by each tester. For example, each tester tested 60 men and 75 women, so that sex cannot be the cause of the differences. The systematic design did not include the effects of age, which were found to be as follows. For males the percentage of positive reactions to the MP tests does not show a trend with age. The Mantoux reactions read at the third or fourth day show a trend but it is not statistically significant (P > 0.1) and is less marked by the seventh day. In contrast, the percentage of positive reactions among female subjects falls fairly steadily with age with all the tests. The trend is statistically significant (0.05 > P > 0.01). This trend in the percentage of positives with increasing age is coupled with a tendency for the reactions to become smaller. To safeguard the results against the effects of age and other unforeseen or unknown factors, in particular differences between the patients, the allocation of subjects to testers was done by means of random numbers. This procedure is unlikely to have given rise to real differences between the groups tested by different people, but the method has been known to fail (Hamilton and others, 1953). As a check, the age distribution of the subjects was investigated to see if it was similar for the three testers. This was found to be very satisfactory. The maximum difference in mean ages was less than two and a half years and the variation in the means was slightly less than expected. Furthermore, each worker tested roughly equal proportions in each age group and there is no evidence that differences between the testers affected one age group more than another, although the numbers of subjects in the more extreme age groups were rather small. The most important tester-effects on the percentage of positives are on the Mantoux test between testers I and II.
The subjects all received two multiple puncture tests, which could be used as a cross-check on whether the difference in the percentages of positives to the Mantoux test was real. The multiple puncture tests did suggest that a small difference might exist, of between 4 and 5 per cent. The estimated difference varies slightly according to whether corrections are introduced for possible tester effects with the multiple puncture tests. The difference is not significant. The most convincing evidence that the size of the reactions and to a certain extent the number of reactions depends on the tester is that the subjects were each given one test by each tester. Thus those given the Mantoux test by tester III were given the MP/PPD and MP/BCG tests by testers I or II, and so on. Thus any trend between the testers observed in one test would, if it were due to the subjects, be expected to rotate in the others. In fact in all three tests the effect of a tester is similar.
COMPARISONS OF THE TESTS IN OTHER RESPECTS
Reading Errors. Besides the foregoing results the trial also included an observer error study. Under the conditions of this trial experienced observers were found to be biased, besides making random errors in interpreting the reactions. The differences between independent readings of the tests vary with the test, the size of the reaction, and the day of reading. When positive reactions are defined as in this investigation (see Appendix), the reading errors for the Mantoux test are 3.1 and 6.7 per cent on the third or fourth and seventh days respectively. The corresponding figures for the MP/PPD tests are 3.1 per cent and 0.7 per cent, and for the MP/BCG tests they are 3.1 and 3.0 per cent respectively. Each of these percentages is based on over 200 independent readings by the two experienced observers made at sessions when they were both present. Observer error was therefore found to be least for the MP/PPD reactions when these were read on the seventh day. Observer error is an additional factor to tester error, and the two factors do not cancel each other out. It is estimated that the combined effects of errors in testing and reading are such that the experienced workers would have differed in about 18 per cent of cases as to whether these subjects were positive or negative to the 5 TU Mantoux test, if they had both tested and read these subjects independently. With the MP/PPD test, when read on the seventh day, an upper limit for such differences is estimated as 5 per cent; and for the MP/BCG test it is about 7 per cent.
Day of Reading. Table II compares the third or fourth day readings with the seventh day readings of the three tests. To each test there are approximately 5 per cent of delayed positive reactions. The increases in the percentages of positives between the third or fourth and seventh day are statistically significant. These, together with other findings to be published later, are in conflict with claims that BCG reactions can be read early (Frappier and Guy, 1949). It will also be seen in Table II that the Mantoux and BCG tests each gave a small proportion of transient positive reactions. These transient reactions occurred in both male and female subjects, and are too numerous to be dismissed as merely a manifestation of observer error. As there were no transient reactions to the MP/PPD test, it seems only necessary to read these reactions on the seventh day. To detect all the positives to the other two tests it would be necessary to read the reactions on the third or fourth day and to read those that were negative again on the seventh day.
TABLE II. - REACTIONS READ ON THE THIRD OR FOURTH DAY COMPARED WITH THOSE READ ON THE SEVENTH DAY
(a) 5 TU MANTOUX TEST
                           7th day
3rd or 4th day   Negative   Positive   Total
Negative             91         23      114
Positive             11        280      291
Total               102        303      405
(b) MP/PPD TEST
                           7th day
3rd or 4th day   Negative   Positive   Total
Negative             52         19       71
Positive              0        334      334
Total                52        353      405
(c) MP/BCG TEST
                           7th day
3rd or 4th day   Negative   Positive   Total
Negative             47         19       66
Positive              7        332      339
Total                54        351      405
Comparison of Reactions. Positive and negative reactors to each of the tests are compared in Table III, transient and delayed positive reactions being included with those positive at both the third or fourth and seventh day. It will be seen that of the 314 subjects who were Mantoux positive only 2 were negative to the MP/PPD test. Of the 91 Mantoux negative, 41 were positive to the MP/PPD test. Of these Mantoux tests, 21 were given by tester I, 9 by tester II, and 11 by tester III. This means that 16 per cent of the 135 subjects Mantoux tested by tester I were Mantoux negative and MP/PPD positive, compared with 7 per cent tested by tester II and 8 per cent tested by tester III. Thus, the agreement between the results of these two tests varies somewhat according to the testers. Table III also compares the two MP tests. The two tests were given by different workers yet the results are very similar.
TABLE III. - THE NUMBER OF SUBJECTS POSITIVE TO THE THREE TESTS
(The category 'positive' here includes those with reactions positive on the third or fourth days but not the seventh and those positive on the seventh but not the third or fourth, as well as those positive on both occasions.)
(a) MANTOUX - MP/PPD
                       Mantoux
MP/PPD       Negative   Positive   Total
Negative         50          2       52
Positive         41        312      353
Total            91        314      405
(b) MP/PPD - MP/BCG
                       MP/PPD
MP/BCG       Negative   Positive   Total
Negative         40          7       47
Positive         12        346      358
Total            52        353      405
Discussion
We have shown that the technique of individual testers can influence the resulting reactions to both of the MP tests and to the Mantoux test. Studies of the effects of variations in testing technique have been made with the Mantoux test. Meyer and others (1951) compared two tests given on the same patient. Palmer and Edwards (1953) have shown that the depth at which the point of the needle is placed in the skin slightly affects the size of BCG reactions. A reflux of tuberculin into the syringe due to elastic recoil of the tissues can reduce the dose of tuberculin given (Griep and Bleiker, 1957). However, in none of these experiments has the work of two different testers been directly compared, and no evidence has previously been available of the sort of errors likely to occur in common practice. Nor have any previous studies been published of the technical variations of applying MP tests. We find the effect of technique upon the reactions to the MP tests is not so marked as for the Mantoux test. This may be because the depth and duration of puncture of the needles is predetermined and cannot be influenced by the operator.
There must, however, be other factors such as the degree of pressure applied by the plate of the MP apparatus, the quantity and evenness of application of allergen to the skin, the degree of heating of the plate, and the thoroughness of sterilizing and cleaning the needles. One or all of these might influence the degree of reaction. Our observer error findings will be reported in greater detail elsewhere. It may be said, however, that they are in general agreement with those already reported by Meyer and others (1951) for the Mantoux test, and by Gillis and Stradling (1957) for the multiple puncture test.
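The variation between testers in the number of subjects who were Mantoux negative but MP/PPD positive (21, 9 and 11 of the 135 Mantoux tests given by testers I, II and III respectively, as reported under Comparison of Reactions) can be examined with a χ² test of the kind used elsewhere in the paper. The calculation below is a sketch only: the authors do not report this particular test, and the function names are hypothetical. The P value uses the exact tail formula for two degrees of freedom, so no statistical library is needed.

```python
import math


def chi_square_k_groups(counts_with_feature, group_sizes):
    """Pearson chi-square for k groups each split into with/without a feature."""
    total_with = sum(counts_with_feature)
    total_n = sum(group_sizes)
    chi2 = 0.0
    for observed_with, n in zip(counts_with_feature, group_sizes):
        expected_with = n * total_with / total_n
        expected_without = n - expected_with
        observed_without = n - observed_with
        chi2 += (observed_with - expected_with) ** 2 / expected_with
        chi2 += (observed_without - expected_without) ** 2 / expected_without
    degrees_of_freedom = len(group_sizes) - 1
    return chi2, degrees_of_freedom


def p_value_two_df(chi2):
    """Upper tail probability of chi-square with exactly 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)


# Mantoux negative but MP/PPD positive, by the tester who gave the Mantoux test.
discordant = [21, 9, 11]      # testers I, II, III (from the text)
tested = [135, 135, 135]

chi2, df = chi_square_k_groups(discordant, tested)
print(f"chi-square = {chi2:.2f} on {df} d.f., P = {p_value_two_df(chi2):.3f}")
```

On these figures the value of χ² lies between the 5 per cent and 1 per cent points for two degrees of freedom, which is in keeping with the statement that agreement between the two tests varies with the tester.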
Conclusion
The Heaf multiple puncture technique is recognized as having the advantages of simplicity of equipment, ease of administration and relatively low cost of materials and replacements. We have shown that it has the additional advantage that it is less affected than the Mantoux test by variations in the technique of administration. When used to administer PPD tuberculin it gave, on the average, 10 per cent more positive tuberculin type reactions than the 5 TU OT Mantoux test. The reactions are better read on the seventh than on the third or fourth day. Of the three tests examined in this survey it also gave the lowest observer error, provided all reactions of three or more papules were taken as positive. (Further evidence in support of these last two points will be published later.) It therefore seems an ideal method for use in epidemiological surveys and for routine tuberculin testing. The multiple puncture test using live BCG, as in this trial, although giving results that are in close agreement with the PPD test, has no apparent advantage. The reactions tend to be somewhat smaller, which may inflate errors in reading, and transient reactions also occur. Where ample facilities are available and speed is of no great importance the Mantoux technique has the advantage that the dilutions can be serialized, and that both dosage and the resulting reactions can be measured. These may make it more suitable than MP tests for certain research studies. However, if it is used, careful attention must be paid to errors due to variation in technique of administration and reading.
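The preference for reading on the seventh day rests in part on the paired counts of Table II, where each subject is classified on both days. One way to examine such paired data is McNemar's test, which uses only the subjects whose classification changed: the delayed positives (negative, then positive) and the transient positives (positive, then negative). The sketch below applies the uncorrected form of that test to the Table II counts. The paper does not say which test the authors used, so this is an assumed method, and the names are hypothetical.

```python
import math

N_SUBJECTS = 405

# (delayed positives, transient positives) read off Table II.
TABLE_II_CHANGES = {
    "5 TU Mantoux": (23, 11),
    "MP/PPD": (19, 0),
    "MP/BCG": (19, 7),
}


def mcnemar_uncorrected(delayed, transient):
    """McNemar chi-square (1 d.f., no continuity correction) and its P value."""
    chi2 = (delayed - transient) ** 2 / (delayed + transient)
    # For 1 d.f., chi2 is the square of a standard normal deviate.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


for test_name, (delayed, transient) in TABLE_II_CHANGES.items():
    chi2, p = mcnemar_uncorrected(delayed, transient)
    delayed_pct = 100.0 * delayed / N_SUBJECTS
    print(
        f"{test_name}: delayed {delayed} ({delayed_pct:.1f} per cent), "
        f"transient {transient}, chi-square = {chi2:.2f}, P = {p:.4f}"
    )
```

With this uncorrected form the increase in positives by the seventh day is significant at the 5 per cent level for each of the three tests, and the delayed positives amount to roughly 5 per cent of the 405 subjects, as stated under Day of Reading. Applying a continuity correction would give somewhat larger values of P.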
Summary
A carefully designed trial revealed that unsuspected variations in the technique of giving the 5 TU Mantoux test can affect the size of the reactions and the percentages of positive reactions. Different testers did not significantly affect the percentages of positives to a Heaf multiple puncture test, when used to administer PPD or BCG, but the size of the reactions was affected. The multiple puncture technique using PPD tuberculin in a concentration of 2 mg. per ml. is the most satisfactory of the three tests for routine use. It is better read on the seventh day than on the third or fourth days.
In a work of this nature the persons who, in one way or another, give their help are too numerous to mention individually. We wish particularly to thank Professor F. R. G. Heaf, who unwittingly stimulated this investigation, contributed invaluable advice to it and arranged for the supply of the French freeze-dried BCG vaccine. Dr I. P. Davies, the Physician Superintendent, and the medical staff of St. Audry's Hospital willingly gave us every assistance and access to their patients. Dr T. Shaw and Dr A. B. Lintott, pathologists to the Ipswich Group Hospitals, gave us help and advice on the technical aspects of the trial as well as providing some of the material. The sisters and the nursing staff of the Hospital, the technical and secretarial staff, particularly Miss M. Green, of the Chest Clinic, and the Department of Human Ecology all gave us help which contributed to the successful completion of this investigation. We are also indebted to the East Anglian Regional Hospital Board for punching the necessary Hollerith cards.
APPENDIX
DEFINITION OF TERMS AND ABBREVIATIONS
1. MP/PPD test - a test applied with the Heaf multiple puncture apparatus using a concentration of purified protein derivative equal to 2 mg. per ml.
2. MP/BCG test - a test applied with the Heaf multiple puncture apparatus using freeze dried BCG reconstituted to a concentration of 75 mg. per ml.
3. The reaction to the Mantoux test was recorded as the average of the maximum diameter of induration and the diameter at right angles to it. The reaction to the MP tests was recorded as (1) the number of papules, or (2) a ring, or (3) a plaque of induration. The diameter of a plaque of induration was recorded as for the Mantoux test.
4. (a) Positive Mantoux reaction - an area of induration of the skin whose mean diameter (defined above) is 5 mm. or larger.
(b) Negative Mantoux reaction - induration less than 5 mm. diameter, or no palpable reaction.
(c) Positive MP/PPD or MP/BCG reaction - a reaction with three or more papules.
(d) Negative MP/PPD or MP/BCG reaction - no papules or a reaction giving one or two papules.
(e) Transient positive reaction - a reaction positive (as defined) at the third or fourth day but negative at the seventh day.
References
Assis, A. de, and Carvalho, A. de (1942) Hospital, Rio de J., 22, 173.
Box, G. E. P., and Andersen, S. L. (1955) J. Roy. Statist. Soc., Ser. B., 17, 1.
Bueno, M. M. (1947) Amer. Rev. Tuberc., 55, 250.
Edwards, L. B., Palmer, C. E., and Magnus, K. (1953) BCG Vaccination, WHO Monograph Series No. 12, Geneva.
Frappier, A., and Guy, R. (1949) Canad. med. Ass. J., 61, 18.
Frappier, A., and Guy, R. (1950) Canad. J. publ. Hlth., 41, 7~.
Frappier, A., Guy, R., Desjardins, R., Roy, O., and Painschaud, C. (1955) Rev. Hyg. med. Soc., 3, 95.
Gillis, S., and Stradling, P. (1957) Tubercle, Lond., 38, 27.
Griep, W. A., and Bleiker, M. A. (1957) Tubercle, Lond., 38, 259.
Hamilton, M., Wilson, G. M., Armitage, P., and Boyd, J. T. (1953) Lancet, i, 367.
Heaf, F. R. G. (1951) Lancet, ii, 151.
Heaf, F. R. G. (1955) Lancet, i, 315.
Irvine, K. N. (1954) BCG and Vole Vaccination, p. 47, NAPT, London.
Mann, H. B., and Whitney, D. R. (1947) Ann. math. Statist., 18, 50.
Meyer, S. N., Hougen, A., and Edwards, P. (1951) Publ. Hlth. Rep. Wash., 66, 561.
Palmer, C. E. (1953) Amer. Rev. Tuberc., 68, 678.
Palmer, C. E., and Edwards, L. B. (1953) Brit. med. J., i, 363.
Siegal, S. (1956) Nonparametric Statistics, McGraw-Hill, New York.
Snell, W. E., MacMahon, J. F., and Heaf, F. R. G. (1943) Lancet, ii, 636.
Stott, H. (1955) Tubercle, Lond., 36, EE9.
Ustvedt, H. J. (1950) Bull. World Hlth. Org., %, 355.
Ustvedt, H. J., and Aanonsen, A. (1949) Acta tuberc. scand., 23, 1.
Ustvedt, H. J. (1953) Report of WHO Committee on Vaccination against Tuberculosis, WHO/Exp. Vacc. TBC/6, Copenhagen.