Human leukocyte differentiation antigens: monoclonal antibody computer databases as a tool for the future


Molecular and Cellular Probes (1987) 1, 61-72


Steve Cobbold Division of Immunology, Cambridge University Department of Pathology, Tennis Court Road, Cambridge CB2 1QP, UK (Received 15 January 1987, Accepted 22 January 1987)

The immunofluorescence results of the Third International Workshop on Human Leukocyte Antigens have been used as the basis of a microcomputer database system. As well as providing a convenient means to access data on individual antibodies and the antigens identified by the Workshop (designated as 'clusters of differentiation' or CD antigens), it can be used to help characterize and identify new monoclonal antibodies. The ability to continue adding to this database means that it should become an increasingly powerful tool for both immunologists and haematologists. The philosophy behind the methods described should also be of interest to anyone making monoclonal antibodies to differentiation antigens.

KEYWORDS: monoclonal antibody, leukocyte, differentiation antigen, computer database, cluster analysis.

© 1987 Academic Press Inc. (London) Ltd

THE THIRD INTERNATIONAL LEUKOCYTE ANTIGEN WORKSHOP

The use of antisera to distinguish cell lineages on the basis of 'differentiation antigens' expressed on the surface membrane was already widespread before Köhler and Milstein described a method of producing monoclonal antibodies [1]. The exquisite specificity, ease of production and virtually unlimited supply of monoclonal antibodies mean that there are now thousands of reagents available against what is probably of the order of a few hundred different molecules on the surface of human leukocytes. The aim of the International Workshops on Human Leukocyte Differentiation Antigens has been to identify groups of monoclonal antibodies which react with and define a series of unique differentiation antigens expressed on a variety of human haemopoietic cells. By the Third Workshop, held in Oxford in 1986 (results to be published in May 1987 [2]), at least 45 different antigen specificities had been defined using over 600 monoclonal antibodies (reviewed by Franklin et al. [3]). However, approximately 200 further antibodies remained unclassified, and there must be an even larger number of reagents which were not submitted for analysis. This means that, although the Workshops held so far have made an enormous contribution to the identification and classification of human leukocyte antigens, they may represent only a small part of the overall picture. What is needed is a simple means to compare any monoclonal antibody with those tested in the Workshops, on similar lines to the gene sequence databases and comparison programs already familiar to the molecular biologist. The development of such a system, starting from the immunofluorescence data submitted for antibody cluster analysis, became a subsidiary aim of the Oxford Workshop, and some of the results of this effort are reviewed here.

ANTIBODY CLUSTER ANALYSIS: HOW IT WORKS

Biological basis

The basic idea behind the type of cluster analysis performed for the Leukocyte Antigen Workshops is very simple: it assumes that each differentiation antigen within the blood system will have a unique pattern of expression during the process of differentiation, from the multi-potential stem cell, through intermediate progenitor cells, to the variety of mature leukocytes. In theory, to identify monoclonal antibodies reacting with the same differentiation antigen we simply have to test their ability to bind to a wide range of haemopoietic cells and find those which show the same pattern of reactivity. A homogeneous group of different antibodies can then be said to form a 'cluster' and, if biochemical and any other additional information are consistent, be classified according to the CD (cluster of differentiation) nomenclature [4].

In practice, this means blind testing a large number of antibodies (as two or preferably more are required against each antigen to form a cluster) using a relatively non-subjective measure of reactivity such as the fluorescence activated cell sorter. It is not sufficient just to score antibodies as positive or negative for each cell type, because nearly all haemopoietic cell populations (even cell lines) are heterogeneous with respect to antigen expression, and many preparations will contain a variable proportion of different cell types (e.g., the peripheral blood mononuclear fraction). Therefore, the two most important pieces of information collected by the Workshop for each antibody/cell combination were: first, the proportion of cells which bound more antibody than a negative control, and second, whether any positive cells were clearly separated from the negatives or represented a more complex pattern including weakly positive cells. This inevitably led to a mass of data which was best analysed statistically by computer.

Cluster analysis by computer

The computer is programmed to group the antibodies in essentially the same way as a 'manual' comparison: by comparing all the combinations pairwise in turn and then grouping them into bigger clusters. The computer is simply more reliable when starting from roughly 200,000 experimental results. The only problem is how to measure exactly how similar two antibodies are to each other, bearing in mind the possible sources of variation in the cells (heterogeneity), the antibodies themselves (affinity), and their detection in different laboratories (fluorescent antiglobulin, cell sorter sensitivity, etc.).

The simplest way to show how this is done is by using what is termed a 'scatter plot' of one antibody against another (Fig. 1a). The x and y co-ordinates of each point represent the median reactivity (percentage of cells which were positive) for the first and second antibody, respectively, with a single cell type on which they were both tested. It can be seen that, if the two antibodies react similarly with all the different cells tested, the best line through these points would be at 45 degrees passing through the origin. This line can also become curved, as shown in Fig. 1b (for reasons of affinity or detection as discussed above), while random variations will give small amounts of scatter round the best fitted curve [5]. Therefore, to measure how different two antibodies are, we simply have to fit the best curve and measure the average distance of the points from this line (Fig. 1c): the smaller the distance, the more similar the antibodies.

The overall result from all the possible pairwise comparisons can be represented by a 'dendrogram' (produced by a program such as 'CLUSTAN2' [6]), a small part of which is shown in Fig. 2. The length of each horizontal 'branch' shows the maximum distance, as measured above, between clusters of antibodies, rather like a family tree. The antibodies contained within each cluster (hatched in Fig. 2) are all similar to each other (a distance of less than 5.0) and are, therefore, likely to bind to the same differentiation antigen. The final decision as to what designates a single cluster must be somewhat arbitrary, and in practice depends on additional data, such as immunoprecipitation or inhibition of a specific function.
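
The distance measure described above can be illustrated with a minimal Python sketch. It is not the Workshop's statistic (which is described in reference 5 and not reproduced here), and several choices are assumptions made purely for illustration: the curve is taken to be a quadratic through the origin, the distance is a weighted root-mean-square of the vertical residuals, and the extra penalty for strongly curved fits mentioned in the Fig. 1 caption is not modelled. The half weight for 'complex' shapes follows the Fig. 5 caption. The names `fit_curve` and `profile_distance` are hypothetical.

```python
from math import sqrt


def fit_curve(xs, ys, ws):
    """Weighted least-squares fit of y = a*x + b*x**2 (a curve through the origin)."""
    sxx = sum(w * x * x for x, w in zip(xs, ws))
    sx3 = sum(w * x ** 3 for x, w in zip(xs, ws))
    sx4 = sum(w * x ** 4 for x, w in zip(xs, ws))
    sxy = sum(w * x * y for x, y, w in zip(xs, ys, ws))
    sx2y = sum(w * x * x * y for x, y, w in zip(xs, ys, ws))
    det = sxx * sx4 - sx3 * sx3
    if abs(det) <= 1e-9 * sxx * sx4:
        # Too little spread in x to estimate curvature: use a straight line.
        return (sxy / sxx if sxx else 1.0), 0.0
    a = (sxy * sx4 - sx2y * sx3) / det
    b = (sxx * sx2y - sx3 * sxy) / det
    return a, b


def profile_distance(profile_x, profile_y):
    """Distance between two profiles of the form {cell_code: (percent_positive, shape)}.

    Only cell types tested with both antibodies are compared. A point gets
    half weight if either shape is 'C' (complex).
    """
    shared = sorted(set(profile_x) & set(profile_y))
    if len(shared) < 3:
        raise ValueError("need at least three cell types tested with both antibodies")
    xs = [profile_x[c][0] for c in shared]
    ys = [profile_y[c][0] for c in shared]
    ws = [0.5 if "C" in (profile_x[c][1], profile_y[c][1]) else 1.0 for c in shared]
    a, b = fit_curve(xs, ys, ws)
    squared = sum(w * (y - (a * x + b * x * x)) ** 2 for x, y, w in zip(xs, ys, ws))
    return sqrt(squared / sum(ws))


# Invented example values, not Workshop data.
reference = {"PBT": (95, "P"), "PBB": (3, "N"), "PBM": (2, "N"),
             "THY": (60, "C"), "PHA3": (90, "P"), "EBVB": (1, "N")}
similar = {"PBT": (92, "P"), "PBB": (4, "N"), "PBM": (3, "N"),
           "THY": (65, "C"), "PHA3": (88, "P"), "EBVB": (2, "N")}
different = {"PBT": (5, "N"), "PBB": (90, "P"), "PBM": (10, "N"),
             "THY": (2, "N"), "PHA3": (8, "N"), "EBVB": (97, "P")}
```

Because the residuals are measured vertically, this sketch is not symmetric in its two arguments; a symmetric variant would need a different fitting criterion.
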
I would now like to concentrate on what can be done once this overall cluster analysis has been performed. In particular, how can new antibodies be compared with the identified clusters (CD groups), and how can we best use the information we have accumulated about the expression of the 'CD antigens'? The simplest solution is to combine the Workshop results with suitable comparison programs and create a simple-to-use database system which can be used on a microcomputer. Two such preliminary systems, one for the IBM PC/AT and another for the Acorn BBC Micro, were made available at the Workshop conference in Oxford for all to use, and final software packages should be available in 1987.

A MONOCLONAL ANTIBODY DATABASE ON MICROCOMPUTER

A leukocyte antigen database: what should it contain?

The initial aim in producing a database was to allow easy access to summarized data from the Oxford Workshop, for each antibody or designated cluster of antibodies (CD group), for a common set of cell types. Access to these data is provided in the form of profiles, where for each cell type the median percentage of positive cells and the shape of the fluorescence histogram (see below) are depicted graphically (Fig. 3). All the individual blind-coded antibodies in the Workshop, as well as consensus profiles for each CD group, are available. However, it is the ability to edit data and add further results or new antibodies to the database, combined with suitable analysis programs, which potentially makes the computer database such a powerful tool.
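
As a rough illustration of what one such profile record might hold, the following Python sketch stores a percentage and a shape code per cell type and prints a simple text histogram. The class and field names (`Reaction`, `Profile`, `render`) are hypothetical and are not taken from the Workshop software; only the two recorded quantities and the P/C/N shape codes come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Reaction:
    percent_positive: float  # median % of cells brighter than the negative control
    shape: str               # 'P' (positive), 'C' (complex) or 'N' (negative)


@dataclass
class Profile:
    name: str                                      # antibody code or CD group
    reactions: dict = field(default_factory=dict)  # cell code -> Reaction

    def add(self, cell_code, percent_positive, shape):
        """Add or overwrite the result for one cell type."""
        if shape not in ("P", "C", "N"):
            raise ValueError("shape must be 'P', 'C' or 'N'")
        if not 0 <= percent_positive <= 100:
            raise ValueError("percent_positive must lie between 0 and 100")
        self.reactions[cell_code] = Reaction(percent_positive, shape)

    def render(self, width=20):
        """Return a text profile: one row per cell type with a bar of '#'."""
        lines = [self.name]
        for code, r in self.reactions.items():
            bar = "#" * round(r.percent_positive / 100 * width)
            lines.append(f"{code:<5}{r.shape} {bar:<{width}} {r.percent_positive:5.1f}%")
        return "\n".join(lines)


# Invented example values, not Workshop data.
profile = Profile("NEWT")
profile.add("PBT", 90, "P")
profile.add("EBVB", 2, "N")
print(profile.render())
```

Keeping the record editable, as here, reflects the point made above that the value of the database lies in being able to add results and new antibodies over time.
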



[Figure 1: three scatter plots, both axes running from 0 to 100% positive cells. (a) Horizontal axis CD3, vertical axis three CD3 antibodies; distance = 0.9. (b) Horizontal axis CD45, vertical axis CD45 (low affinity); distance = 10.1. (c) Horizontal axis CD18, vertical axis CD45R; distance = 42.0.]

Fig. 1. Scatter plots between pairs of antibody profiles showing how the distance measure is obtained. (a) Consensus CD3 (horizontal) compared with combined data for three different CD3 antibodies (vertical): distance 0.9 (very similar). (b) Consensus CD45 (horizontal) compared with a low affinity antibody of the same specificity (vertical), showing how the best fitted line can be curved. The distance measure (10.1) allows for the fact that increasingly curved lines also increase the uncertainty that the two antibodies are really the same. (c) A comparison of two completely different antibodies (CD18 vs. CD45R), showing how the points are scattered far from the best fitted line, giving a large distance of 42.0.



[Figure 2: dendrogram with the clusters CD2, CD3, CD4 and CD8 listed on the left; horizontal axis distance, marked at 0, 5 and 10.]

Fig. 2. Part of the CD antigen dendrogram. The final designated clusters of antibodies are shown on the left-hand side, and in this example all react with different T-cell antigens (CD2, CD3, CD4 and CD8). The horizontal branches extending from each cluster join at a point corresponding to the distance (as measured on scatter plots) to the next most similar cluster or 'family' of clusters. The horizontal length of the individual clusters (hatched) shows the maximum distance between antibodies within that cluster. For example, the cluster labelled CD2 has three very similar antibodies (maximum distance 3.5), while the CD3 cluster contains four antibodies (maximum distance 4.5). The CD2 and CD3 clusters are also more similar to each other (maximum distance 7.7) than to CD4 or CD8 (with individual distances of between 8.9 and a maximum of 13.0).
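
The grouping that such a dendrogram summarizes can be sketched in a few lines of Python. This is an illustration only: it assumes complete-linkage merging (the distance between two clusters is the maximum pairwise distance, in keeping with the 'maximum distance' read off the branches) and stops at the within-cluster limit of 5.0 quoted in the text. Whether CLUSTAN was run in exactly this way is not stated in the source, and the function name and the toy distance are hypothetical.

```python
def complete_linkage_clusters(names, distance, threshold=5.0):
    """Merge the two closest clusters repeatedly until none is closer than threshold.

    'distance' is any function of two antibody names returning a number; the
    distance between two clusters is taken as the largest pairwise distance.
    """
    clusters = [[name] for name in names]

    def cluster_distance(c1, c2):
        return max(distance(a, b) for a in c1 for b in c2)

    while len(clusters) > 1:
        d, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d >= threshold:
            break  # the closest remaining pair is too dissimilar to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy example: each antibody is placed at an invented position on a line and
# the distance is the absolute difference (not Workshop data).
positions = {"a1": 0.0, "a2": 2.0, "a3": 3.5, "b1": 10.0, "b2": 12.0}


def toy_distance(p, q):
    return abs(positions[p] - positions[q])


clusters = complete_linkage_clusters(list(positions), toy_distance)
```

As the text notes, the final choice of cut-off is somewhat arbitrary, so `threshold` is best treated as a parameter to be reviewed against biochemical or functional data rather than as a fixed constant.
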

[Figure 3: histogram profile for CDw29. Vertical axis 0 to 100% positive cells; horizontal axis one bar per cell code, with the shape code (P, C or N) printed beneath each bar.]

Fig. 3. A consensus reaction profile for CDw29. The percentage of cells which were positive is shown as a histogram for each cell type tested (horizontal axis). The type of reaction (shape: P = positive, C = complex, N = negative; see text) is also recorded in each case. The cell codes used are: PBL, peripheral blood (PB) lymphocytes; PBT, PB T-cells; PBB, PB non-T-cells; PBG, PB granulocytes; PBM, PB monocytes; PLAT, PB platelets; TONB, tonsil non-T-cells; LNT, lymph-node T-cells; THY, thymocytes; PHA3, PHA 3-day blasts; PMA3, PMA-activated B-cells; CONA, Concanavalin A-blasts; IL2T, IL2-dependent T-cell lines/clones; MIST, miscellaneous T-cell lines; HSB2, HSB2 cell line; HPBA, HPBALL cell line; JURK, JURKAT cell line; CEM, CEM cell line; PREB, pre-B-cell lines (NALM-1, NALM-6); U937, U937 cell line; K562, K562 cell line; EBVB, EB virus lymphoblastoid cell lines; CALL, common acute lymphoblastic leukaemia (ALL); TALL, T-cell ALL; AML, acute myeloid leukaemia; AMML, acute myelomonocytic leukaemia; BCLL, chronic B-cell lymphocytic leukaemia; TCLL, chronic T-cell lymphocytic leukaemia; ATL, acute T-cell lymphoma (HTLV-1 positive); SEZL, Sezary leukaemia; TLYM, T-cell lymphoma.

A complete database system: what can it do?

(i) Characterizing new monoclonal antibodies

To illustrate how such a database can be used, I shall describe how to compare a new monoclonal antibody with the 45 designated CD groups. Figure 4a shows six example (hypothetical) immunofluorescence histograms, together with the recorded percentage positive and shape (as in Fig. 3: Negative, Complex or Positive mode). These can be added to the computer database and retrieved as a profile (Fig. 4b). The next step is for the computer to perform pairwise comparisons with all the consensus CD groups in turn, and then to print out the scatter plots for those with the smallest distance (i.e., the most similar), as shown in Fig. 5. Visual inspection of these should then reveal the best match, bearing in mind that some cell types will give less consistent results than others (for example, leukaemic cells vary widely in the proportion of blasts and levels of antigen expression, so that differences based only on results such as these should be treated with caution). In the example shown, there are insufficient data from the six cells tested to decide between CD3, CD5 and CD7 (CD18, which is very different, is shown for comparison). However, the use of scatter plots between these CD groups (Fig. 6) shows that the following cell types are most likely to discriminate between them: HSB2, CEM and Sezary leukaemia (SEZL). Obviously this type of analysis cannot be as rigorous as that familiar to molecular biologists for identifying DNA sequences, but the results can be extremely useful as a guide to further biochemical or functional experiments.
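
The comparison step described above amounts to ranking the consensus CD groups by their distance from the new antibody. The Python sketch below shows that ranking under a deliberately simplified distance: a weighted root-mean-square difference from the 45 degree line, with half weight for complex shapes, rather than the fitted-curve measure used by the Workshop. All names and all numerical values are invented for illustration.

```python
from math import sqrt


def simple_distance(profile_x, profile_y):
    """Weighted RMS difference in % positive over the cell types tested with both.

    Profiles have the form {cell_code: (percent_positive, shape)}.
    A point gets half weight if either shape is 'C' (complex).
    """
    shared = set(profile_x) & set(profile_y)
    if not shared:
        raise ValueError("no cell types in common")
    total = weights = 0.0
    for cell in shared:
        (x, shape_x), (y, shape_y) = profile_x[cell], profile_y[cell]
        w = 0.5 if "C" in (shape_x, shape_y) else 1.0
        total += w * (x - y) ** 2
        weights += w
    return sqrt(total / weights)


def rank_cd_groups(new_profile, consensus, top=3):
    """Return the 'top' most similar CD groups as (name, distance), closest first."""
    scored = [(simple_distance(group, new_profile), name)
              for name, group in consensus.items()]
    return [(name, d) for d, name in sorted(scored)[:top]]


# Invented example values, not Workshop data.
consensus = {
    "CD3": {"PBT": (95, "P"), "PBB": (5, "N"), "PBM": (3, "N"),
            "THY": (60, "C"), "PHA3": (95, "P"), "EBVB": (1, "N")},
    "CD5": {"PBT": (95, "P"), "PBB": (15, "C"), "PBM": (5, "N"),
            "THY": (90, "P"), "PHA3": (95, "P"), "EBVB": (2, "N")},
    "CD18": {"PBT": (98, "P"), "PBB": (97, "P"), "PBM": (99, "P"),
             "THY": (90, "P"), "PHA3": (98, "P"), "EBVB": (99, "P")},
}
newt = {"PBT": (94, "P"), "PBB": (14, "C"), "PBM": (8, "C"),
        "THY": (88, "C"), "PHA3": (96, "P"), "EBVB": (2, "N")}

ranking = rank_cd_groups(newt, consensus)
```

The ranked list is only a starting point; as the text stresses, the scatter plots for the closest groups still need visual inspection before a match is accepted.
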

(ii) Identifying cell types and leukaemias

As well as being able to analyse the database with regard to the antibodies, it is also possible to make use of the information about antigen expression on different cell types. First, CD antigen profiles can be obtained in a way analogous to the antibody profiles (Fig. 7). Second, it is possible to identify cell types from the expression of a selection of the CD antigens in a similar way to that described above for antibodies. This could have a number of potential applications for checking the identity of new and established cell lines and long-term cultures and, perhaps with an expanded database in the future, for routine phenotyping and identification of leukaemic cells.
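
Viewing the same data by cell type rather than by antibody is essentially a transposition of the database. The Python sketch below shows one way this could look, together with a naive nearest-match lookup for an unknown cell. It is a sketch under stated assumptions: shapes are ignored, the match score is a plain mean absolute difference, and the function names and all percentages are invented rather than taken from the Workshop.

```python
def cell_profile(consensus, cell_code):
    """Collect the % positive of every CD group for one cell type.

    'consensus' has the form {cd_group: {cell_code: percent_positive}}.
    """
    return {cd: cells[cell_code] for cd, cells in consensus.items() if cell_code in cells}


def identify_cell(unknown, consensus):
    """Return (cell_code, score) for the known cell type closest to 'unknown'.

    'unknown' has the form {cd_group: percent_positive}. The score is the mean
    absolute difference over the CD groups available for both.
    """
    cell_codes = {code for cells in consensus.values() for code in cells}
    best = None
    for code in sorted(cell_codes):
        known = cell_profile(consensus, code)
        shared = set(known) & set(unknown)
        if not shared:
            continue
        score = sum(abs(known[cd] - unknown[cd]) for cd in shared) / len(shared)
        if best is None or score < best[1]:
            best = (code, score)
    if best is None:
        raise ValueError("unknown cell shares no CD groups with the database")
    return best


# Invented example values, not Workshop data.
consensus = {
    "CD3": {"PBT": 95, "EBVB": 1, "PBM": 3},
    "CD19": {"PBT": 2, "EBVB": 95, "PBM": 2},
    "CD14": {"PBT": 1, "EBVB": 1, "PBM": 90},
}
unknown_line = {"CD3": 4, "CD19": 88, "CD14": 3}
match = identify_cell(unknown_line, consensus)
```

A practical version would need the caution the text attaches to leukaemic samples, where the proportion of blasts varies widely between specimens.
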

[Figure 4: (a) six fluorescence histograms, panels A to F; horizontal axis fluorescence intensity (linear channel number). (b) Profile for NEWT; vertical axis 0 to 100% positive cells; horizontal axis cell codes PBT, PBB, PBM, THY, PHA3 and EBVB, with the shape code beneath each bar.]

Fig. 4. (a) Six example fluorescence histograms from a cytofluorographic analysis of a T-cell monoclonal antibody (coded as NEWT). These show the type of data produced by the majority of cell sorter/analyser machines. The solid curves are the negative control (PBS) samples, and the open curves are the antibody-labelled samples. The gate (depicted by the horizontal bar) on each histogram was set such that the negative controls gave less than 5% positive cells. The percentage of cells within the gate is recorded for each labelled sample, together with the shape (see text) in parentheses. The following cell types were tested: A, peripheral blood T-cells (PBT) obtained by E-rosetting; B, peripheral blood non-T-cells (PBB) obtained by E-rosette depletion; C, peripheral blood monocytes (PBM) as gated on forward and 90-degree scatter; D, thymocytes (THY); E, PHA-activated (3-day) T-cell blasts (PHA3); F, EB virus lymphoblastoid cell line (EBVB). (b) The same data as in (a) presented as a profile after entry in the database.
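
The gating rule in the Fig. 4 caption (negative control below 5% positive) can be written out as a short Python sketch. It assumes each histogram is available as a list of event counts per linear channel; the function names and the example counts are hypothetical, and real analyser software may set gates differently.

```python
def gate_channel(control_counts, max_control_positive=0.05):
    """Lowest channel g such that the control events in channels >= g
    make up less than max_control_positive of all control events."""
    total = sum(control_counts)
    if total == 0:
        raise ValueError("negative control histogram is empty")
    above = total  # events in channels >= current channel
    for channel, count in enumerate(control_counts):
        if above / total < max_control_positive:
            return channel
        above -= count
    return len(control_counts)  # gate beyond the last channel


def percent_positive(sample_counts, gate):
    """Percentage of sample events falling in channels >= gate."""
    total = sum(sample_counts)
    if total == 0:
        raise ValueError("sample histogram is empty")
    return 100.0 * sum(sample_counts[gate:]) / total


# Invented 8-channel histograms (events per channel), not Workshop data.
control = [50, 30, 15, 4, 1, 0, 0, 0]
labelled = [5, 5, 5, 5, 10, 30, 30, 10]

gate = gate_channel(control)
positive = percent_positive(labelled, gate)
```

The shape code (P, C or N) is not derived here, since the text gives only a qualitative description of how it was assigned.
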

[Figure 5: four scatter plots, both axes running from 0 to 100% positive cells, with NEWT on the vertical axis in each. Horizontal axes: (a) CD5, distance = 2.3; (b) CD7, distance = 5.4; (c) CD3, distance = 5.7; (d) CD18, distance = 28.0.]

Fig. 5. Scatter plot comparisons of the example new antibody with consensus CD groups. The unknown antibody (NEWT) is plotted on the vertical axis in each case, and compared with the three most similar CD groups: (a) CD5, distance 2.3; (b) CD7, distance 5.4; and (c) CD3, distance 5.7. A plot against CD18 (d) is also shown (distance 28.0) for comparison (for cell codes, see Fig. 3). Data points are plotted as `, |, - or + according to whether the shapes of the two reactivities are P or N vs. P or N; C (vertical) vs. P or N (horizontal); P or N (vertical) vs. C (horizontal); or C vs. C, respectively. A complex shape is associated with increased uncertainty in the true percentage reactivity, which is taken into account in the distance measure by giving half the weight to any data point where one of the shapes is complex [5].



[Figure 6: three scatter plots, both axes running from 0 to 100% positive cells. (a) Horizontal axis CD5, vertical axis CD7; distance = 16.9. (b) Horizontal axis CD3, vertical axis CD5; distance = 15.3. (c) Horizontal axis CD3, vertical axis CD7; distance = 17.8. Outlying points are labelled with their cell codes.]

Fig. 6. Scatter plots between the CD5, CD7 and CD3 groups. The highlighted cell codes are those which are far from the best fitted line (which is close to 45 degrees for these plots). It can be seen that HSB2 and SEZL are most discriminating in (a), CEM in (b), and HSB2 and CEM in (c). See Fig. 3 for details of cell codes.
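
Picking out the most discriminating cell types, as done by eye in Fig. 6, can be approximated by ranking cells by how far their point lies from the 45 degree line. The short Python sketch below assumes a straight 45 degree reference (which the caption says is close to the fitted line for these plots) and uses invented percentages; the function name is hypothetical.

```python
def discriminating_cells(profile_x, profile_y, top=2):
    """Cell codes whose points lie farthest from the line y = x.

    Profiles have the form {cell_code: percent_positive}. The perpendicular
    distance from the 45 degree line is |x - y| / sqrt(2), so ranking by
    |x - y| gives the same order. Ties are broken alphabetically.
    """
    shared = set(profile_x) & set(profile_y)
    ranked = sorted(shared, key=lambda c: (-abs(profile_x[c] - profile_y[c]), c))
    return ranked[:top]


# Invented example values, not Workshop data.
group_one = {"PBT": 95, "HSB2": 5, "SEZL": 90, "CEM": 92, "THY": 90}
group_two = {"PBT": 93, "HSB2": 95, "SEZL": 20, "CEM": 90, "THY": 85}

best_cells = discriminating_cells(group_one, group_two)
```

Testing a new antibody on the cells returned by such a ranking is the practical use suggested in the text: they are the cell types on which the candidate CD groups disagree most.
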



[Figure 7: histogram profile for EBVB. Vertical axis 0 to 100% positive cells; horizontal axis one bar per CD group, with the majority shape code printed beneath each bar.]

Fig. 7. A consensus CD antigen profile for EB virus lymphoblastoid cell lines. The mean reactivity for each CD group of antibodies is given in the form of a histogram, with the majority shape of the immunofluorescence data. Note that CD11a, CD11b, CD11c and CD45R are given as abbreviated codes C11A, C11B, C11C and C45R, respectively.

THE FUTURE

Expanding the database

In order to exploit the full potential of the Leukocyte Workshop database it will be necessary to overcome some of the current limitations. In particular, this means filling in the gaps for antibody/cell combinations which were not tested in the Oxford Workshop, and improving the quality of the data where results were based on very small numbers of experiments. The latter is particularly important with respect to leukaemic cells, where the present data represent very few examples of most types of leukaemia. It is hoped that the expansion and improvement of the database will be an integral part of the next International Leukocyte Antigen Workshop, to be held in Vienna in early 1989.




Clusters, sub-clusters and epitope mapping

Until now we have been analysing antibody binding to the native antigen molecules on the surface of different leukocytes, relying on specific changes in expression during haemopoiesis to cluster antibodies such that each cluster should generally represent activity against a single differentiation antigen. However, by extending this to cells with deliberately modified antigens it is possible to 'sub-cluster' antibodies by epitopes on a single antigen. There are numerous ways of changing the different epitopes selectively before analysis, such as treating the cells with a range of enzymes, using cross-reactions with non-human primate cells, screening on panels of mouse × human hybrids expressing different chromosomes, or, perhaps most elegantly, looking at a series of transfected cell lines. Initial experiments of this type have been described for CD45 antibodies (against the 'leukocyte common antigen' or T200). There are a number of forms of the CD45 antigen (probably due to alternative RNA splice sites) which are expressed on different haemopoietic cells, so that by using 'sub-cluster' analysis in combination with cross-inhibition studies, at least nine epitopes can be identified [7].

Beyond the human leukocyte

There is no reason why the complete database system should be limited to monoclonal antibodies to human leukocytes. Many of the antigens (the majority?) normally thought of as haemopoietic cell antigens are already known to be expressed on other tissues in the body. For example, CDw29, an antigen present on helper T-cells (a subset of the CD4+ve T-cells), is also expressed on a range of other haemopoietic (Fig. 3) and non-haemopoietic cells, including endothelium, renal tubules, nerve cells, adipocytes and connective tissue [8]. Perhaps the database could eventually include a more complete range of monoclonal antibodies to human differentiation antigens of other tissues.

In parallel to the effort directed to human leukocyte antigens, there is also a large pool of monoclonal antibodies to the mouse haemopoietic system. In many cases the functional and molecular equivalents of the human CD antigens have been identified: for example, in the mouse L3T4, Lyt-2, Lyt-1 and Ly5 (T200) are the exact equivalents of CD4, CD8, CD5 and CD45, respectively. Initial efforts are now being made towards creating a database for xenogeneic (rat) anti-mouse leukocyte monoclonal antibodies. It may be that, by creating a mouse antigen database which is compatible with the human one, it will become easier to relate some of the murine experimental systems directly to man.

AVAILABILITY OF DATABASES

The complete database system should be available for IBM PC-compatible microcomputers soon after May 1987 from Oxford University Press (Electronic Publishing), UK. The simplified version which runs only on the Acorn BBC Micro ('LAWS' v0.5), which was used to produce the examples of Figs 3-7, is also available.




REFERENCES

1. Köhler, G. & Milstein, C. (1975). Continuous cultures of fused cells secreting antibody of predefined specificity. Nature 256, 495.
2. McMichael, A. J. et al., eds (1987). Leucocyte Typing III. Oxford: Oxford University Press.
3. Franklin, W. A., Hogg, N. & Mason, D. (1987). Human Leucocyte Differentiation Antigens: Review of the Third International Workshop. Molecular and Cellular Probes 1, 55-60.
4. Bernard, A., Boumsell, L. & Hill, C. (1984). Joint report of the First International Workshop on Human Leucocyte Differentiation Antigens by the Investigators of the Participating Laboratories. In Leucocyte Typing: Human Leucocyte Differentiation Antigens Detected by Monoclonal Antibodies: Specification-Classification-Nomenclature. (Bernard, A. et al., eds). New York: Springer-Verlag.
5. Spiegelhalter, D. & Gilks, W. (1987). Statistical methods. In Leucocyte Typing III. (McMichael, A. J. et al., eds). Oxford: Oxford University Press.
6. Wishart, D. (1982). Clustan User Manual. 3rd edn. Edinburgh: Program Library Unit, University of Edinburgh.
7. Cobbold, S., Hale, G. & Waldmann, H. (1987). Summary of studies performed on the non-lineage, LFA-1 family, and leucocyte common antigen panels of antibodies. In Leucocyte Typing III. (McMichael, A. J. et al., eds). Oxford: Oxford University Press.
8. McMichael, A. J. & Gotch, F. M. (1987). T lymphocyte antigens: report on the collaborative experiments on the T cell panel of antibodies. In Leucocyte Typing III. (McMichael, A. J. et al., eds). Oxford: Oxford University Press.