Zent.bl. Bakteriol. 287, 241-247 © Gustav Fischer Verlag 1998
Zentralblatt für Bakteriologie
Research paper
European Interlaboratory Comparison of Lyme Borreliosis Serology
E. C. Guy, J. N. Robertson, M. Cimmino, L. Gern, Y. Moosmann, S. G. T. Rijpkema, V. Sambri, and G. Stanek
Summary
Serological testing for Lyme borreliosis was compared in 5 European reference laboratories with a total of 79 sera in order to determine variations in laboratory performance. A considerable range of methods was used and several laboratories employed 2 or 3 genomospecies of Borrelia burgdorferi sensu lato. No laboratory relied routinely on a single test and each weighted the significance of the findings of the various tests differently. A difference in strategy between laboratories in high and low prevalence areas was apparent: laboratories in low prevalence areas emphasised specificity over sensitivity and therefore produced fewer false positives, but also missed some cases. Overall agreement between the laboratories was poor and it was concluded that there is a need for a quality assurance scheme within Europe.
Introduction
The diagnosis of Lyme borreliosis by serological testing is difficult. False positive findings occur because antibodies that cross-react with Borrelia burgdorferi antigens are present in the normal population (1), and this problem is compounded by the low levels of specific antibody typically associated with early infection, which may give rise to false negative results (2). An additional factor that might contribute to such problems is the reported antigenic heterogeneity among B. burgdorferi strains isolated from different regions within Europe (3).

The first serological tests introduced for Lyme borreliosis were based on indirect immunofluorescence assays (4). A number of different tests have been introduced subsequently in order to improve serodiagnosis, including enzyme immunoassays (EIA) and Western blotting (WB). Antigen preparations employed in these tests have included sonicated or unsonicated whole cell preparations of B. burgdorferi, purified flagellum (5) and recombinant proteins (6). Also, as a result of recent work identifying the genomospecies B. afzelii and B. garinii, in addition to B. burgdorferi sensu stricto, there has been much emphasis on determining the most appropriate species to include as antigens in serological tests.

At present there appears to be no single test available for the unequivocal serodiagnosis of Lyme borreliosis and many laboratories rely on a combination of tests. Due to the wide variety of qualitative and quantitative serological tests employed, and the different combinations of these used in achieving a final laboratory diagnosis, standardisation and comparison of overall performance of Lyme borreliosis testing is difficult. One of the key roles of the working group on laboratory diagnosis within the European Union Concerted Action on Lyme Borreliosis (EUCALB) was to identify appropriate serological methods for use in epidemiological studies of Lyme borreliosis in Europe. An important first step, therefore, was to compare the performance of five European reference laboratories in different countries to determine the level of agreement in serological data provided from these different sources.

This study, which is presented in full below, indicated significant variations in performance and in the types of tests employed by the different laboratories and led to initiation of a quality assurance (QA) study designed to promote wider standardisation of serological testing. The QA study involved 42 laboratories in 22 European countries and 4 laboratories in the USA (including the Centers for Disease Control). It demonstrated both the feasibility of and the need for a European quality assurance scheme, and will be reported fully elsewhere.
A further finding of the first study was that immunoblotting, which is widely considered to have a role in confirmatory testing, varied among the participating centres both in terms of the antigen used (B. burgdorferi s.s., B. afzelii and/or B. garinii) and the precise criteria for seropositivity. A third study on immunoblot standardisation, involving the testing of 240 sera by laboratories in 7 countries, was therefore undertaken and is still in progress at the time of writing.
Materials and Methods

Serum samples
Five participating European centres, located in Austria, England, Italy, Switzerland and The Netherlands, each contributed comparable numbers of sera from patients with erythema migrans (EM) or with symptoms of late Lyme borreliosis. The diagnoses of EM and late Lyme borreliosis were made by local clinicians experienced in these conditions. A total of 53 EM sera were contributed. The late Lyme borreliosis samples consisted of sera from patients with arthritis (n = 4), acrodermatitis chronica atrophicans (ACA; n = 4) and neuroborreliosis (n = 2), and had been found seropositive for B. burgdorferi s.l. by the contributing laboratory. One laboratory supplied sera from 16 healthy women attending an antenatal clinic to serve as normal controls.

Distribution of samples and compilation of data
The 5 laboratories were coded A to E. Sera were sent to laboratory E by each of the other laboratories, and were aliquotted and coded by staff not involved in the study. Samples, together with a questionnaire requiring information about the tests used, were sent to each laboratory to be tested according to local protocols. All results were returned to laboratory E for decoding.

Immunoassays
Tests employed were those used routinely for serodiagnosis in each laboratory, where cut-off criteria for positive and equivocal results had been established previously. Three types of conclusion on the samples were agreed: negative, equivocal and positive.
Results
The overall strategy for testing differed among the five laboratories and a variety of serodiagnostic tests were used to test the sera (Table 1). No laboratory relied on a single serological test. Laboratory A employed only EIA but included two different assays based on different antigen preparations (a whole cell sonicate and a recombinant antigen). Only laboratory B employed a single genomospecies, B. burgdorferi s.s., as test antigen. All laboratories tested for both IgM and IgG antibodies with the exception of laboratory C, which tested for IgG only.

The number of EM sera detected ranged from 4/53 (8%) to 28/53 (53%) when only samples considered positive were counted, and from 4/53 (8%) to 32/53 (60%) when equivocal findings were also included (Table 2). However, in two of the three laboratories where sensitivity increased on inclusion of sera considered equivocal, there was a concomitant decrease in specificity. When the results for individual EM sera were compared, complete agreement among all five laboratories was achieved with only 12/53 (23%) of EM sera (3 were positive and 9 negative). Agreement between laboratories B and D, which had identified the largest number of EM sera, was found in 35/53 (66%) of samples.

All laboratories displayed a higher level of sensitivity with sera from late Lyme borreliosis patients, but no single laboratory found all 10 samples positive. When equivocal findings were included, laboratory A succeeded in identifying all 10 sera. Overall, better agreement was reached with these sera than with the EM sera.
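Agreement figures such as the 35/53 (66%) reported for laboratories B and D are simple percent agreement. The sketch below shows how that proportion, and a chance-corrected alternative (Cohen's kappa, which the study itself does not report), could be computed from per-sample calls. The per-sample results are not given in the paper, so the two call lists and the function names are hypothetical and serve only as illustration.

```python
from collections import Counter


def percent_agreement(calls_a, calls_b):
    """Fraction of samples on which two laboratories gave the same call."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty call lists of equal length")
    matches = sum(a == b for a, b in zip(calls_a, calls_b))
    return matches / len(calls_a)


def cohen_kappa(calls_a, calls_b):
    """Agreement between two laboratories corrected for chance agreement."""
    n = len(calls_a)
    observed = percent_agreement(calls_a, calls_b)
    freq_a = Counter(calls_a)
    freq_b = Counter(calls_b)
    # Agreement expected if each laboratory assigned its calls independently
    # at its own observed rates.
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both laboratories gave one identical call throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical calls for six sera (NOT data from the study).
lab_x = ["pos", "neg", "neg", "pos", "equiv", "neg"]
lab_y = ["pos", "neg", "pos", "pos", "neg", "neg"]

print(f"percent agreement: {percent_agreement(lab_x, lab_y):.2f}")
print(f"Cohen's kappa:     {cohen_kappa(lab_x, lab_y):.2f}")
```

Percent agreement is easy to read but can look high when most sera are negative for both laboratories; a chance-corrected statistic is one common way to account for that.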
Table 1. Tests employed by the participating laboratories

Laboratory  Test format and Ig class detected  Antigen preparation                         Isolate
A           ELISA (IgG) & ELISA (IgM)          Whole cell sonicate                         Ba, Bg
            ELISA (IgG) & ELISA (IgM)          Recombinant (p100, p41, p31, p20, p41-i*)
B           IFA (IgG)                          Whole cell                                  Bb, B31
            ELISA (IgG)                        Whole cell sonicate                         Bb, B31
            Capture ELISA (IgM)                Flagellum                                   Bb, B31
            Immunoblot (IgG)                   Whole cell                                  Bb, B31
C           IFA (IgG)                          Whole cell                                  Bb, IRS
            Immunoblot (IgG)                   Whole cell                                  Bb, IRS; Ba, VS461; Bg, PBi
D           Inhibition ELISA (IgG)             Flagellum                                   Bb, B31
            Immunoblot (IgG)                   Whole cell                                  Bb, B31
            Immunoblot (IgM)                   Whole cell                                  Ba, A39s
E           ELISA (IgG & IgM)                  Whole cell sonicate                         Bb, B31
            Immunoblot (IgG)                   Whole cell sonicate                         Ba, ACA-1

Ba = Borrelia afzelii, Bg = Borrelia garinii, Bb = Borrelia burgdorferi sensu stricto.
* Internal fragment of the flagellin protein.
The number of normal control sera considered positive ranged from 0/16 (laboratories A, C and E) to 3/16 in laboratory D. Laboratories D and E considered a further 3/16 and 2/16, respectively, to be equivocal.

Since the participating laboratories each provided sera to the study, it is possible that bias may have been introduced if any laboratory was more efficient at identifying its own local sera than those contributed by the other centres. In order to investigate this question, an analysis of variance (ANOVA) was undertaken of each EM result obtained by each laboratory against the source of each sample. The analysis indicated that for all five centres there was no significant difference (p = 0.40) between the proportions of EM sera found positive and the source laboratory.
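The sensitivity and specificity percentages in Table 2 are plain proportions of the EM, late Lyme borreliosis and control panels. As a minimal sketch, the snippet below recomputes them from the counts of sera called positive (equivocal results excluded). The counts are those reported in Table 2; the variable and function names are illustrative, and clinical diagnosis is taken as the reference standard, as in the study.

```python
# Panel sizes and counts of sera called positive, taken from Table 2
# (positive calls only; equivocal results are not counted here).
EM_TOTAL, LATE_TOTAL, CONTROL_TOTAL = 53, 10, 16
em_positive = {"A": 17, "B": 24, "C": 4, "D": 28, "E": 10}
late_positive = {"A": 9, "B": 7, "C": 7, "D": 8, "E": 7}
control_positive = {"A": 0, "B": 1, "C": 0, "D": 3, "E": 0}


def sensitivity(detected, patient_total):
    """Proportion of patient sera that the laboratory called positive."""
    return detected / patient_total


def specificity(false_positives, control_total):
    """Proportion of control sera that the laboratory did not call positive."""
    return (control_total - false_positives) / control_total


for lab in "ABCDE":
    print(
        f"Laboratory {lab}: "
        f"EM sensitivity {sensitivity(em_positive[lab], EM_TOTAL):.0%}, "
        f"late LB sensitivity {sensitivity(late_positive[lab], LATE_TOTAL):.0%}, "
        f"specificity {specificity(control_positive[lab], CONTROL_TOTAL):.0%}"
    )
```

With only 16 control sera, each false positive shifts specificity by about 6 percentage points, so the specificity estimates are necessarily coarse.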
Table 2. Sensitivity and specificity of overall serodiagnosis of early and late Lyme borreliosis (LB) calculated on the basis of two different criteria for seropositivity

Number of samples found positive by laboratory

Sera and criterion                       A          B         C         D         E
EM (n = 53)
  positive (sensitivity)                 17 (32%)   24 (45%)  4 (8%)    28 (53%)  10 (19%)
  positive or equivocal (sensitivity)    17 (32%)   28 (53%)  4 (8%)    32 (60%)  26 (49%)
Late LB (n = 10)
  positive (sensitivity)                 9 (90%)    7 (70%)   7 (70%)   8 (80%)   7 (70%)
  positive or equivocal (sensitivity)    10 (100%)  8 (80%)   7 (70%)   9 (90%)   9 (90%)
Normal controls (n = 16)
  positive (specificity)                 0 (100%)   1 (94%)   0 (100%)  3 (81%)   0 (100%)
  positive or equivocal (specificity)    0 (100%)   1 (94%)   0 (100%)  6 (62%)   2 (87%)

Discussion
There was little consensus among the five laboratories participating in this study regarding the most appropriate serological tests for the detection of antibody specific for B. burgdorferi s.l., and the combinations of tests for
achieving the most accurate overall serodiagnosis. For example, one laboratory does not routinely include an EIA, another does not use WB and one does not test for IgM. One laboratory incorporates three genomospecies of B. burgdorferi s.l. into testing while one laboratory uses B. burgdorferi s.s. only. Criteria for WB positivity varied both in the number and molecular size of key antigens. As a result, direct comparison of overall laboratory serodiagnostic performance is difficult, and comparisons of single tests might be misleading.

Due to the wide variety of protocols it was decided to compare only the overall serodiagnosis achieved using the combination of tests preferred by each laboratory, and not results from particular tests. Values for overall specificity and sensitivity of testing were calculated primarily in order to permit direct comparison between the participating laboratories. It is important to note that the absolute values obtained cannot be compared directly with values from other studies that relate to the performance of individual serological tests.

In the serodiagnosis of early Lyme borreliosis, three of the five laboratories achieved a significant finding (either positive or equivocal) in approximately 50% of the EM sera. However, the two laboratories detecting the fewest EM sera had the highest overall specificity, 100% in both cases. Interestingly, laboratory C, the only laboratory to incorporate all three genomospecies of B. burgdorferi s.l. into serological testing, detected the fewest EM sera as positive. Laboratory B, which detected the second highest number of EM sera, used only B. burgdorferi s.s. as test antigen. Thus, while it is likely that the performance of some serological tests might be optimised by selecting as antigen the genomospecies most closely related to that infecting the local population, the present findings underline the point that other factors may have a greater effect on sensitivity. Indeed, laboratory C was the only laboratory not to test for IgM, and this might have contributed to the poor sensitivity observed in that case.

Where equivocal results are excluded, the two laboratories (C and E) detecting the fewest EM sera also had the highest apparent specificity (100%). These two laboratories are in regions of Europe with the lowest reported incidence of disease among the regions represented in the study, and were using a strategy to optimise exclusion of false positive findings, despite a significant decrease in sensitivity. This is because, as specificity decreases, the ratio of false positive to true positive results becomes less favourable in populations with a lower incidence of infection. Using this strategy, clinical management of Lyme borreliosis is based on the premise that negative results exclude suspected late, but not early, Lyme borreliosis. Seronegative cases of early Lyme borreliosis may be subsequently identified by the detection of rising antibody levels in a later blood sample. However, antibiotic treatment may prevent seroconversion.
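The argument that imperfect specificity is more costly where infection is less common can be made concrete with the positive predictive value (PPV), the fraction of positive results that are true positives. The sketch below applies the standard Bayes relation to the sensitivity and specificity reported for laboratory D (positive calls only). The prevalence values are hypothetical, chosen only to contrast settings, and stand in for the proportion of truly infected patients among those tested rather than for any incidence figure from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of positive results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive results expected; PPV is undefined")
    return true_pos / (true_pos + false_pos)


# Laboratory D, positive calls only (Table 2): sensitivity 53%, specificity 81%.
# The prevalence values below are hypothetical illustrations.
for prevalence in (0.30, 0.10, 0.02):
    ppv = positive_predictive_value(0.53, 0.81, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.0%}")
```

Under these assumptions the same test yields a markedly lower PPV as prevalence falls, which is consistent with the emphasis on specificity in the laboratories serving low-incidence regions.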
No laboratory was able to identify as unequivocally seropositive all 10 sera from late Lyme borreliosis and agreement among the 5 laboratories was poor. One explanation might be that some samples were not in fact from true cases of Lyme borreliosis but were the result of an initial false positive in the particular laboratory contributing to the study.

The normal control sera proved to be a problem for laboratory D, the laboratory with the most sensitive tests for detecting EM. Specificity fell to 62% in this laboratory when positive and equivocal results on the normal controls were counted together.

A number of previous interlaboratory studies have compared the performance of particular serological tests for Lyme borreliosis (7, 8). While important in providing insight into the reliability and relative accuracy of tests, and also facilitating improvements in these, such studies can provide only limited insight into the overall performance of laboratories in achieving a correct serological diagnosis and in supporting the most appropriate clinical management. Any study aimed at investigating overall performance must take account of the wide variety and combinations of tests, the weighting placed on the result of each test in achieving an overall diagnosis, and also the significance placed on positive, equivocal and negative findings in determining appropriate management in suspected early and late Lyme borreliosis. The requirement for future studies in assessment of overall serodiagnostic performance is underlined by recent recommendations that combinations of serological tests, such as screening by EIA with confirmation by WB, can result in more accurate laboratory diagnosis than single-test strategies (9).
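How a screen-then-confirm combination changes overall accuracy can be illustrated with a simple serial-testing model, in which a serum is reported positive only if both the screening EIA and the confirmatory WB are positive. The sketch below assumes the two tests err independently given true infection status, an idealisation that real assays using related antigens may not satisfy. The input values and the function name are hypothetical and are not taken from the study or from reference 9.

```python
def serial_two_tier(se_screen, sp_screen, se_confirm, sp_confirm):
    """Overall sensitivity and specificity when a serum is reported positive
    only if both the screening and the confirmatory test are positive.

    Assumes the two tests are conditionally independent given true status.
    """
    sensitivity = se_screen * se_confirm
    # A false positive now requires both tests to be falsely positive.
    specificity = 1.0 - (1.0 - sp_screen) * (1.0 - sp_confirm)
    return sensitivity, specificity


# Hypothetical single-test characteristics (illustration only).
se, sp = serial_two_tier(se_screen=0.90, sp_screen=0.80,
                         se_confirm=0.80, sp_confirm=0.95)
print(f"two-tier sensitivity {se:.0%}, specificity {sp:.0%}")
```

In this model the combination gains specificity at some cost in sensitivity, which mirrors the trade-off between laboratories discussed above.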
Acknowledgements
We thank Dr. P. Anda, Dr. G. Bigaignon, Prof. M. Granström, Dr. O. Lesnyak and PD Dr. B. Wilske for fruitful discussions.
References
1. Shrestha, M., R. L. Grodzicki, and A. C. Steere: Diagnosing early Lyme disease. Am. J. Med. 78 (1985) 235-240
2. Grodzicki, R. L. and A. C. Steere: Comparison of immunoblotting and indirect enzyme-linked immunosorbent assay using different antigen preparations for diagnosing early Lyme disease. J. Inf. Dis. 157 (1988) 790-797
3. Hauser, U., G. Lehnert, R. Lobentanzer, and B. Wilske: Interpretation criteria for standardized western blots for three European species of Borrelia burgdorferi sensu lato. J. Clin. Microbiol. 35 (1997) 1433-1444
4. Wilkinson, H. W.: Immunodiagnostic tests for Lyme disease. Yale J. Biol. Med. 57 (1984) 567-572
5. Guy, E. C.: The laboratory diagnosis of Lyme borreliosis. Rev. Med. Microbiol. 4 (1993) 89-96
6. Burkert, S., D. Rössler, P. Münchhoff, and B. Wilske: Development of enzyme-linked immunosorbent assays using recombinant borrelial antigens for serodiagnosis of Borrelia burgdorferi infection. Med. Microbiol. Immunol. 185 (1996) 49-57
7. Bakken, L. L., K. L. Case, S. M. Callister, N. J. Bourdeau, and R. F. Schell: Performance of 45 laboratories participating in a proficiency testing program for Lyme disease serology. JAMA 268 (1992) 891-895
8. Bakken, L. L., S. M. Callister, P. J. Wand, and R. F. Schell: Interlaboratory comparison of test results for detection of Lyme disease by 516 participants in the Wisconsin State Laboratory of Hygiene/College of American Pathologists Proficiency Testing Program. J. Clin. Microbiol. 35 (1997) 537-543
9. Anonymous: Recommendations for test performance and interpretation from the second national conference on serologic diagnosis of Lyme disease. Morb. Mortal. Wkly. Rep. 44 (1995) 590-591