Data compression: 8-dimensional flow cytometric data processing with 28K addressable computer memory


Journal of Immunological Methods, 113 (1988) 205-214


Elsevier JIM 04899

James V. Watson 1, Terence S. Horsnell 2 and Paul J. Smith 1

MRC Laboratories of 1 Clinical Oncology and 2 Molecular Biology, The Medical School, Hills Road, Cambridge CB2 2QH, U.K.

(Received 27 January 1988, revised received 21 April 1988, accepted 26 April 1988)

A method of data analysis for flow cytometry is presented which enables up to eight-dimensional data to be handled by a microcomputer with 28K addressable plus a further 32K non-addressable memory. The multi-parameter coordinates are coded into single numbers using a minimal modification of the array vector mapping equation. These code numbers, each of which corresponds to a given set of coordinates, are then ranked in ascending order according to magnitude and the frequency of each code number is found. Following this step the code is decoded by integer arithmetic into its original coordinates, which are then packed, together with the frequency, into two, three or four 16-bit words depending on the dimensionality of the data set. A five-dimensional data set is used as the illustration. Three regions were set on one two-dimensional data space and the five mono-dimensional histograms, plus a different bivariate distribution, were extracted in a single pass through the processed data file. In addition to considerable space saving, the technique has two further attributes, namely, increased speed with which the user can appreciate multiparameter data and the ability to analyse such data sets with a microcomputer.

Key words: Flow cytometry; Data processing; Multiparameter data

Correspondence to: J.V. Watson, MRC Laboratories of Clinical Oncology, The Medical School, Hills Road, Cambridge CB2 2QH, U.K.

0022-1759/88/$03.50 © 1988 Elsevier Science Publishers B.V. (Biomedical Division)

Introduction

Flow cytometers offer unique opportunities to make multiple simultaneous measurements of both biochemical and physical properties from individual cells at very rapid rates. Many commercial instruments now possess four detector/data acquisition channels whilst research and development instruments have acquired data on up to 32 detectors simultaneously (Salzman et al., 1979). Our instrument quantitates pulse height, width (time of flight through the laser beam) and area under each pulse for up to eight detectors simultaneously and additionally incorporates the computer time-stamp in the data base (Watson, 1980, 1981, 1985). Thus, a total of 25 individual numbers can be collected from each cell at through-put rates of about 350 cells/s. Typically, our analytical runs measure forward and 90° light scatter plus two fluorescence channels simultaneously. Examples of such multiparameter procedures in routine use in our department include DNA measurements using propidium iodide (red fluorescence) and fluoresceinated monoclonal antibody probes (green fluorescence) with forward and 90° scatter (Rabbits et al., 1985; Watson et al., 1985b). Further examples of multiparameter measurements include fluorescence emission spectrum analysis with Hoechst 33342 on five fluorescence and two light scatter channels (Smith et al., 1985; Watson et al., 1985a). These assays necessarily impose considerable interpretative stress, as multiple runs through a list-mode data set have to be made with conventional computing procedures before the data can be fully appreciated. The work described in this paper was undertaken to accelerate the interpretative process by enabling the investigator to handle up to 8-dimensional data in a single pass through a processed data set on a microcomputer with only 28K addressable memory. Furthermore, we have attempted to present the technique and processes involved in the computations in a manner which could be implemented by the enthusiastic nonspecialist without recourse to professional assistance.

Data processing with arrays

Mono-variate data

A single parameter data processing procedure assigns storage space for a data array where the number of storage locations corresponds to the number of digitization steps of the analogue-to-digital converters (ADC) in the system. Hence a 1024-channel ADC would require an array of 1024 locations, and all cells with the same value would be summed into that particular array location. The most obvious universally appreciated data set of this type is the DNA histogram, and with a G1 peak recorded in channel 200 about 70% of the array will contain no data and represents wasted space.

Bi-variate data

A similar two parameter processing procedure would have to assign a two-dimensional array of 1024 × 1024 storage locations. Immediately we have a major problem, as this represents 2 Mbytes of storage, which is beyond the capacity of most main-frame computers to which the majority of users could have access. Fig. 1 shows a bivariate distribution (90° scatter versus DNA, where three regions R1, R2 and R3 have been set, see later) and it is obvious that the vast majority of the array contains no data. With a data set of 10000 cells in which no two cells have the same coordinates, over 99% of the 1024 × 1024 array contains zeros and represents wasted space.

Tri-variate data

Extending this type of data processing system to three parameters could not even be contemplated, as this would require the assignment of a 1024 × 1024 × 1024 array, i.e., 2K Mbytes of memory.
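The storage figures quoted above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration and not part of the original programs; it assumes one 16-bit (2-byte) word per array location, which is what the 2 Mbyte figure for the bivariate case implies, and the function names are hypothetical.

```python
BYTES_PER_WORD = 2  # assumption: one 16-bit word per array location


def dense_array_bytes(n_dims, channels=1024):
    """Bytes needed for a full n-dimensional array with `channels` steps per axis."""
    return channels ** n_dims * BYTES_PER_WORD


def max_occupied_fraction(n_cells, n_dims, channels=1024):
    """Upper bound on the fraction of array locations that can hold any data."""
    return min(1.0, n_cells / channels ** n_dims)


if __name__ == "__main__":
    for n_dims in (1, 2, 3):
        mbytes = dense_array_bytes(n_dims) / 2 ** 20
        print(f"{n_dims}-D array: {mbytes:g} Mbytes")
    print(f"occupied by 10000 cells in 2-D: at most "
          f"{max_occupied_fraction(10000, 2):.2%}")
```

With 10000 cells at most about 1% of a 1024 × 1024 array can be occupied, which is the basis for storing only the coordinate sets that actually occur.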

Fig. 1. Bivariate contour plot display of 90° scatter (pulse width) on the ordinate versus DNA on the abscissa. Three elliptical regions R1, R2 and R3 were set as shown in this data space.

Multiparameter data processing

Two major problems arise in handling multiparameter data sets. Firstly, the standard FORTRAN language, in which many programs are written, cannot assign arrays with greater than three dimensions. Even if it could, we would not be able to use them with 10-bit (1024) resolution for the reason outlined above. Secondly, the frequency of each set of identical n-dimensional coordinates within the data base must be found. For example, we have to find the number of times that, say, x = 480, y = 210 and z = 863 occurs in the list-mode data set and how many times the set of x = 23, y = 493 and z = 126 occurs. A coordinate-by-coordinate comparison could be made, but this results in a massive computation for three-dimensional data with 10000 sets of coordinates, one set for each cell, and becomes prohibitive as the dimensionality of the data set increases further. We have routine applications in our laboratory which require four- and five-dimensional analyses and for some applications this has been extended to seven dimensions. Hence, methods have been developed to handle data sets with up to eight dimensions, which is the maximum that can be handled rapidly by a 16-bit computer.

Single number coordinate coding

Multidimensional arrays are not stored as such in computer memory but as a monodimensional linear array, and the positions of the array elements are located using the array vector mapping equation. This equation, with the addition of 1, can be used as a means of coding multidimensional coordinates as a single number, CODE. It was introduced for data reduction by one of the authors (JVW) at the Meeting of the Society for Analytical Cytology, Schloss Elmau in 1982, and has subsequently been used by Mann (1987). The form of this equation is as follows:

$$\mathrm{CODE} = 1 + (x - 1) + X(y - 1) + XY(z - 1) \qquad (1)$$

where CODE is unique for a given set of the three-dimensional coordinates x, y and z and where X and Y are the maximum values of the x and y measurements. The equation is written in this form as the terms (x - 1), (y - 1) and (z - 1) must be handled as the whole of each expression within the parentheses. The reason for this will be apparent in the worked example in appendix A, where the decoding procedure is presented, and a little practice will convince the reader that all sets of three-dimensional coordinates can be coded as a single number and decoded to give the original coordinates. Furthermore, equation 1 can be expanded for n dimensions. In practice this coding procedure is limited by the magnitude of a number that can be handled by a particular computer. The maximum code is $2^{48}$ using 64-bit precision (double precision floating point) with FORTRAN in a 16-bit computer (PDP 11/40 and LSI 11/73), as with values larger than this rounding down problems are encountered during decoding. Hence with 10-bit data precision we can code four-dimensional data as a single number, as the maximum code will be $2^{40}$. If data sets with more than four dimensions are needed the data precision must be reduced. Thus, six-dimensional data must be reduced to 8-bit precision (0-255), and eight-dimensional data must be reduced to 6-bit precision (0-63), as in both cases the maximum code will be $2^{48}$.

Ranking the coded data

Each set of n-dimensional coordinates is coded as a single number and stored sequentially in a monodimensional array. The maximum dimension of this 64-bit precision coded array is the number of cells in the data set, and in our PDP system this is stored in virtual memory. The array is now ranked in ascending order according to the magnitude of the value in each element of the array. It is now a simple process to find the frequency with which a given code occurs in a single pass through the array, with concomitant reduction in storage requirements, as the frequency with which a given code appears is stored in a 16-bit precision array.

Decoding, data packing and unpacking

The data now exist in the form of two arrays, one of 64-bit precision and one of 16-bit precision. The latter contains the frequency of a given code in the array location corresponding to its code value in the former. Thus, 80 bits are needed for each record in order to store a given set of coded coordinates together with the frequency. This can be relatively costly in terms of storage and the following space saving procedures have been developed. These use 'bit-shifting' routines which move a specified number of bits from one location in a first variable to a specified location in a second variable. VAX FORTRAN 77 contains such a subroutine, MVBITS, and an assembler language 'look-alike' routine, also called MVBITS, has been written for PDP 16-bit computers using RT-11. This routine is given in appendix B together with two FORTRAN routines which perform similar operations. The data are decoded according to the procedures given in appendix A and packed, using the MVBITS routine, into two, three or four 16-bit integer packed data words (IPDW) depending on the data resolution and data set dimensionality. A schematic summary for 6-bit data resolution is shown in Fig. 2.

Taking four-dimensional data as an example, the 4th and 3rd coordinates are stored in bit locations 10 through 15 and 4 through 9 respectively (bit positions are numbered from the right starting at zero) in IPDW(1). The second coordinate straddles IPDW(1) and IPDW(2), where the top four bits are located in bit positions 0 through 3 in IPDW(1) and the bottom two bits are located in bits 14 and 15 of IPDW(2). The first coordinate and the frequency are located in bit positions 8 through 13 and 0 through 7 in IPDW(2).

10-bit resolution data (up to four dimensions) and 8-bit resolution data (up to six dimensions) are similarly packed, but obviously into different bit positions. With four-dimensional data and 10-bit resolution the 4th coordinate is packed into bit positions 6 through 15 in IPDW(1) and the highest six bits of coordinate 2 are packed into bit positions 0 through 5. IPDW(2) contains the lowest four bits of coordinate 2, coordinate 3 and the highest two bits of coordinate 1 in bit positions 12 through 15, 2 through 11 and 0 through 1 respectively. The lowest eight bits of coordinate 1 and the frequency are now packed into bit positions 8 through 15 and 0 through 7 in IPDW(3). The data can be unpacked by reversing these procedures.

Fig. 2. Summary of the locations of the 6-bit coordinates within the IPDWs for three- through eight-dimensional data. 'F' represents the space available for the frequency of a given set of coordinates.
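The four-dimensional, 6-bit layout described above can be sketched in a few lines. The Python below is an illustration only and is not the authors' routine: the function names are hypothetical, words are treated as unsigned values in the range 0-65535 (the FORTRAN routines in appendix B work with signed 16-bit integers), and coordinates are assumed to lie in the range 0-63 with a frequency of 0-255.

```python
def pack4d_6bit(c1, c2, c3, c4, freq):
    """Pack four 6-bit coordinates and an 8-bit frequency into two 16-bit words."""
    for c in (c1, c2, c3, c4):
        if not 0 <= c <= 63:
            raise ValueError("coordinate outside 6-bit range")
    if not 0 <= freq <= 255:
        raise ValueError("frequency outside 8-bit range")
    # Word 1: coordinate 4 in bits 10-15, coordinate 3 in bits 4-9,
    # top four bits of coordinate 2 in bits 0-3.
    w1 = (c4 << 10) | (c3 << 4) | (c2 >> 2)
    # Word 2: bottom two bits of coordinate 2 in bits 14-15,
    # coordinate 1 in bits 8-13, frequency in bits 0-7.
    w2 = ((c2 & 0x3) << 14) | (c1 << 8) | freq
    return w1, w2


def unpack4d_6bit(w1, w2):
    """Reverse of pack4d_6bit: recover (c1, c2, c3, c4, freq)."""
    c4 = (w1 >> 10) & 0x3F
    c3 = (w1 >> 4) & 0x3F
    c2 = ((w1 & 0xF) << 2) | ((w2 >> 14) & 0x3)
    c1 = (w2 >> 8) & 0x3F
    freq = w2 & 0xFF
    return c1, c2, c3, c4, freq


if __name__ == "__main__":
    words = pack4d_6bit(1, 2, 3, 4, 5)
    print(words, unpack4d_6bit(*words))
```

Each record occupies 32 bits in this form, compared with 80 bits for the coded value plus its frequency.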

Results

The space saving afforded by these various procedures comprises two components. Firstly, all data sets will tend to contain a variable number of cells with identical coordinates. The quantity of memory saved by storing the values of those coordinates only once, together with their frequency, is variable and totally dependent on the tightness of data clustering. Conventional storage methods would require N + 1 16-bit words after the frequency of a given set of N-dimensional coordinates has been found. The method used here for finding the frequency involves coding the data as a single number, and either this or the IPDW 'packing' procedure can be used as the second component for saving memory. Comparisons of the conventional method (N + 1 16-bit words) with both the IPDW and coded storage systems are shown in Fig. 3. The relative quantity of store required per coordinate-frequency record for the coded coordinates and for IPDW packed 10-bit, 8-bit and 6-bit data resolution compared with N + 1 16-bit words is plotted on the ordinate versus data set dimensionality on the abscissa. This figure demonstrates the increased efficiency of the data packing procedure, in which relatively fewer storage bits are wasted.

Fig. 3. Relative store required per cell (ordinate) versus data set dimension (abscissa). Solid squares (■) give the ratio comparing single number coded coordinates, 80 bits including frequency, versus conventional requirements, 16 × (N + 1) bits, where N is the number of dimensions. Circles and triangles show a similar comparison for data packing procedures versus conventional methods. Solid circles (●), 10-bit data resolution (up to four dimensions); open circles (○), 8-bit data resolution (up to six dimensions); and triangles (△), 6-bit data resolution (up to eight dimensions).

The data processing procedure is illustrated with five-dimensional data from ovarian carcinoma nuclei extracted from archival material (Hedley et al., 1983) and stained for DNA with propidium iodide (PI) and for the nuclear associated oncoprotein p62c-myc using a monoclonal antibody (Watson et al., 1985a). Forward and 90° light scatter were also recorded simultaneously with the red (DNA) and green (oncoprotein) signals. Pulse height, width (time of flight through the beam) and pulse area were digitized from each detector, as we use a crossed cylindrical lens pair to focus the 488 nm excitation line from the Innova 70-5 argon laser (Coherent, Palo Alto, CA), producing partial slit scan illumination. The data were collected in list mode on a 450 Mbyte RP07 disk via dedicated LSI 11/23 and time sharing PDP 11/40 computers. After collection, selected data were recalled from disk and processed on an LSI 11/73 computer (all DEC, Maynard, MA). Five parameters, red fluorescence (pulse area), green fluorescence (pulse area), forward scatter (pulse width) and 90° scatter (pulse width and area), were extracted from the list mode data set and processed as above. Fig. 1 shows 90° scatter pulse width (ordinate) versus DNA (abscissa) as a contour plot, and elliptical gates were set on three putative subsets (R1, R2 and R3). The histograms of the five extracted parameters (DNA, oncoprotein, forward scatter pulse width, and 90° scatter width and area) were generated for all regions in a single pass through the processed data set. These data are shown in Fig. 4, where the columns correspond to subsets 1, 2 and 3 respectively and the rows (from the top) show DNA, oncoprotein, 90° scatter pulse width, forward scatter pulse width and 90° scatter pulse area respectively.

Clearly, subset 3 is a discrete entity, and from inspection of the data in columns 1 and 2 we can see that subsets 1 and 2 are also discrete entities. Subset 1 shows tighter forward scatter compared with subset 2 (row 4); it exhibits considerably less total 90° scatter (row 5), less DNA (row 1) and lower oncoprotein content (row 2). The p62c-myc content (ordinate) versus DNA (abscissa) is shown for each subset in Fig. 5.

Fig. 4. Histograms of the five parameters associated with the three regions of Fig. 1. Columns 1, 2 and 3 correspond respectively to R1, R2 and R3. The rows from top show DNA, p62c-myc, 90° scatter (pulse width), 3° to 9° forward scatter (pulse width) and 90° scatter (pulse area).

Fig. 5. p62c-myc (ordinate) versus DNA (abscissa) as bivariate contour displays for the three regions (R1, R2 and R3) defined in Fig. 1.
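The single-pass extraction of gated histograms can be sketched as follows. This Python is a hypothetical illustration rather than the program used for the figures: it assumes the processed data set is available as (coordinates, frequency) records, that each region is an axis-aligned ellipse given by a centre and two radii, and that the gate is applied to two chosen coordinates; all names and values are invented for the example.

```python
from collections import Counter


def in_ellipse(px, py, cx, cy, rx, ry):
    """True if (px, py) lies inside an axis-aligned ellipse."""
    return ((px - cx) / rx) ** 2 + ((py - cy) / ry) ** 2 <= 1.0


def gated_histograms(records, regions, gate_axes):
    """One pass over (coords, freq) records.

    Returns {region name: [Counter per parameter]}, where each Counter maps
    a channel number to the number of cells recorded in that channel.
    """
    hists = {}
    ax, ay = gate_axes
    for coords, freq in records:
        for name, ellipse in regions.items():
            if in_ellipse(coords[ax], coords[ay], *ellipse):
                per_param = hists.setdefault(name, [Counter() for _ in coords])
                for axis, value in enumerate(coords):
                    per_param[axis][value] += freq
    return hists


if __name__ == "__main__":
    # Hypothetical three-parameter records: (coordinates, frequency)
    records = [((10, 20, 5), 3), ((11, 20, 6), 2), ((40, 50, 7), 4)]
    regions = {"R1": (10, 20, 2, 2), "R2": (40, 50, 3, 3)}
    print(gated_histograms(records, regions, gate_axes=(0, 1)))
```

Because each record already carries its frequency, every histogram for every region is updated while the record is in hand, so no second pass through the data is required.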

Discussion

There are three major advantages of the data handling system described here. Firstly, a considerable saving of memory has been achieved, with a compression of eight-dimensional list mode data requiring 450 blocks into about 30 blocks, although the size of a compacted data set depends on the 'tightness' of data clustering. Secondly, multi-dimensional data can be appreciated in a single pass through the processed data set with very considerable time saving for the user. The user 'data appreciation' time is short not only because the data set is compacted and hence can be read from disc more rapidly, but also because the data set is 'structured'. This structuring derives from the processing step where the CODE is arranged in ascending order according to magnitude, with specific CODE values corresponding to specific coordinate values. For example, in a 3-parameter data set with 3 × 3 × 3 dimensionality, CODE numbers of 1 through 9 correspond to a z-coordinate of 1, and CODE numbers of 19 through 27 correspond to a z-coordinate of 3. Similarly, CODE numbers of 4 through 6, 13 through 15 and 22 through 24 correspond to a y-coordinate of 2. Thus, specific combinations of coordinates and their frequencies can be found rapidly within the structured data set. This can be best appreciated by the data shown in Figs. 1, 4 and 5, which were all obtained in a single pass through the structured processed data set. Finally, the data can be processed and handled in a microcomputer with only 28K addressable memory, but obviously, virtual storage is needed. Although the procedures were originally developed on our VAX 8600, and are now routinely implemented with this computer, the algorithms were run on the LSI 11/73 with 32K memory (28K addressable) plus an extra 32K non-addressable memory to produce the results presented here. This last attribute means that all users, not just those with access to a large main frame computer, can acquire and readily appreciate data sets containing up to eight parameters.
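The ranking step and the resulting structure of the CODE values can be illustrated with a short sketch. The Python below is not the authors' implementation; it simply sorts a list of codes, counts repeats in a single pass, and enumerates the CODE values of equation 1 that share a given y-coordinate. The function names and the sample codes are hypothetical.

```python
def rank_and_count(codes):
    """Rank codes in ascending order, then count repeats in one pass."""
    records = []  # [code, frequency] pairs, ascending by code
    for code in sorted(codes):
        if records and records[-1][0] == code:
            records[-1][1] += 1      # same code as the previous element
        else:
            records.append([code, 1])  # first occurrence of a new code
    return [tuple(record) for record in records]


def codes_for_y(y, X=3, Y=3, Z=3):
    """All CODE values (equation 1) whose y-coordinate equals `y`."""
    return [1 + (x - 1) + X * (y - 1) + X * Y * (z - 1)
            for z in range(1, Z + 1)
            for x in range(1, X + 1)]


if __name__ == "__main__":
    print(rank_and_count([83, 7, 83, 12, 7, 83]))
    print(codes_for_y(2))
```

After ranking, identical codes are adjacent, so the frequency of each coordinate set falls out of one sequential scan instead of a coordinate-by-coordinate comparison of every pair of cells.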

Appendix A

The coordinates x = 3, y = 2 and z = 4, where X, Y and Z are all 5, have a CODE of 83 according to equation 1 and can be decoded as follows. Rearranging equation 1 and dividing through by XY gives:

$$\frac{\mathrm{CODE} - 1}{XY} = \frac{x - 1}{XY} + \frac{X(y - 1)}{XY} + (z - 1)$$

Using integer arithmetic in this equation we can see that the first two terms on the right-hand side (RHS), (x - 1)/XY and X(y - 1)/XY, will be zero, as XY = 25 and the largest possible numerator in these terms, X(y - 1), will have a maximum value of 20 (the combined numerator, (x - 1) + X(y - 1), cannot exceed 24), hence:

$$\frac{\mathrm{CODE} - 1}{XY} = z - 1 \qquad (2)$$

Thus, the z coordinate is given by:

$$z = \frac{\mathrm{CODE} - 1}{XY} + 1 = \frac{82}{25} + 1$$

Again using integer arithmetic the term (82/25) = 3, hence z = 4. The y coordinate can now be decoded. As the z coordinate has been obtained, the last term in equation 1, XY(z - 1), is 75. This is subtracted from the original CODE to give:

$$\mathrm{CODE} - 1 - XY(z - 1) = (x - 1) + X(y - 1)$$

$$82 - 75 = 7 = (x - 1) + X(y - 1)$$

Dividing through by X, which is equal to 5, gives:

$$\frac{7}{5} = \frac{x - 1}{5} + y - 1$$

With integer arithmetic this reduces to 1 = 0 + y - 1, hence y = 2. The x coordinate can now be extracted by inserting the decoded z and y coordinates into equation 1, giving 83 = 1 + (x - 1) + 5(2 - 1) + 25(4 - 1), hence x = 3.
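The coding and decoding steps of this worked example generalise directly to n dimensions. The following Python sketch is an illustration under stated assumptions rather than the authors' FORTRAN: coordinates are 1-based as in equation 1, `maxima` holds the maximum value of each coordinate in order (X, Y, Z, ...), and Python's unbounded integers are used, so the floating point limit on the size of CODE discussed in the text does not arise here. The function names are hypothetical.

```python
def encode(coords, maxima):
    """Equation 1 extended to n dimensions: coordinates -> single CODE."""
    code, stride = 1, 1
    for c, m in zip(coords, maxima):
        if not 1 <= c <= m:
            raise ValueError("coordinate outside 1..maximum")
        code += (c - 1) * stride
        stride *= m  # 1, X, XY, XYZ, ...
    return code


def decode(code, maxima):
    """Recover the coordinates, highest dimension first, by integer division."""
    strides = [1]
    for m in maxima[:-1]:
        strides.append(strides[-1] * m)
    remainder = code - 1
    coords = [0] * len(maxima)
    for axis in reversed(range(len(maxima))):
        quotient, remainder = divmod(remainder, strides[axis])
        coords[axis] = quotient + 1
    return tuple(coords)


if __name__ == "__main__":
    code = encode((3, 2, 4), (5, 5, 5))
    print(code, decode(code, (5, 5, 5)))
```

As in the appendix, z is obtained first from the integer quotient by XY, the remainder is then divided by X to give y, and what is left gives x.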

Appendix B

(1) Assembler language bit-shifting routine for RT-11. SOURCE is the integer variable from which NBITS bits are to be shifted. SOURCESTARTBIT is the location of the first bit of the NBITS string in SOURCE. Bits are numbered from the right starting at zero. DEST is the integer variable to which the NBITS string is to be shifted. DESTSTARTBIT is the starting bit position in DEST.

        .TITLE  MVBITS
;CALL MVBITS(SOURCE,SOURCESTARTBIT,NBITS,DEST,DESTSTARTBIT)
MVBITS::
        MOV     @2(R5),R0       ;get source word
        MOV     @4(R5),R1       ;source-start-bit
        NEG     R1              ;prepare for RIGHT shift
        MOV     @6(R5),R2       ;nbits
        ASL     R2              ;make it a word-index
        ASH     R1,R0           ;right-justify data
        BIC     SRCMSK(R2),R0   ;remove junk
        MOV     @12(R5),R1      ;dest-start-bit
        MOV     DSTMSK(R2),R3   ;get destination mask
        ASH     R1,R3           ;position it
        BIC     R3,@10(R5)      ;make space in dest
        ASH     R1,R0           ;position new data
        BIS     R0,@10(R5)      ;insert it
        RTS     PC
DSTMSK: ^B0                     ;0 bits (dummy)
        ^B0000000000000001      ;1 bit
        ^B0000000000000011      ;2 bits
        ^B0000000000000111
        ^B0000000000001111
        ^B0000000000011111
        ^B0000000000111111
        ^B0000000001111111
        ^B0000000011111111
        ^B0000000111111111
        ^B0000001111111111
        ^B0000011111111111
        ^B0000111111111111
        ^B0001111111111111
        ^B0011111111111111
        ^B0111111111111111
        ^B1111111111111111      ;16 bits
SRCMSK: ^B0                     ;0 bits (dummy)
        ^B1111111111111110      ;1 bit
        ^B1111111111111100      ;2 bits
        ^B1111111111111000
        ^B1111111111110000
        ^B1111111111100000
        ^B1111111111000000
        ^B1111111110000000
        ^B1111111100000000
        ^B1111111000000000
        ^B1111110000000000
        ^B1111100000000000
        ^B1111000000000000
        ^B1110000000000000
        ^B1100000000000000
        ^B1000000000000000
        ^B0000000000000000      ;16 bits
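For readers without access to a PDP-11 assembler, the effect of the bit-shifting routine can be sketched in a higher-level language. The Python below is an illustrative equivalent, not a translation verified against the original routine: it treats words as unsigned 16-bit values and returns the updated destination word instead of modifying it in place, and the constant `WORD_MASK` is introduced here for the example.

```python
WORD_MASK = 0xFFFF  # assumption: unsigned 16-bit words


def mvbits(source, source_start, nbits, dest, dest_start):
    """Copy `nbits` bits of `source`, starting at bit `source_start`, into
    `dest` starting at bit `dest_start`; bits are numbered from the right
    starting at zero. Returns the new value of the destination word."""
    mask = (1 << nbits) - 1
    field = (source >> source_start) & mask       # right-justify, remove junk
    cleared = dest & ~(mask << dest_start)        # make space in dest
    return (cleared | (field << dest_start)) & WORD_MASK  # insert new data


if __name__ == "__main__":
    # Place a 6-bit coordinate (value 63) in bits 10 through 15 of an empty word.
    print(hex(mvbits(0x3F, 0, 6, 0, 10)))
```

The three steps mirror the comments in the assembler listing: the source field is right-justified and masked, the destination field is cleared, and the new bits are inserted.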

(2) FORTRAN subroutines which pack and unpack 6-bit resolution data. NDIM is the number of dimensions in the data set. ICRD(s) are packed into IPDW(s) in BITPAK. The reverse operations are carried out in UNPACK.

      SUBROUTINE BITPAK(NDIM, ICRD, IPDW)
C...  FORTRAN "bit-packing" routine for 6-bit data resolution
C...  NDIM (3 through 8) is data set dimensionality
C...  ICRD is data input, 6-bit resolution
C...  IPDW is packed data word output
      IMPLICIT INTEGER*2 (A-Z)
      DIMENSION ICRD(9), IPDW(4)
      REAL RDUM
      IPDW(1)=0
      RDUM=FLOAT(ICRD(NDIM+1))*1024.
      IF (RDUM.GE.32768.) RDUM=RDUM-65536.
      IPDW(1)=IFIX(RDUM).OR.(16*ICRD(NDIM)).OR.(ICRD(NDIM-1)/4)
      IPDW(2)=0
      RDUM=FLOAT((ICRD(NDIM-1).AND.3))*16384.
      IF (RDUM.GE.32768.) RDUM=RDUM-65536.
      IPDW(2)=IFIX(RDUM)
      IF (NDIM.EQ.3) GO TO 10
      IPDW(2)=IPDW(2).OR.(ICRD(NDIM-2)*256)
      IF (NDIM.EQ.4) GO TO 10
      IPDW(2)=IPDW(2).OR.(ICRD(NDIM-3)*4)
      IF (NDIM.EQ.5) GO TO 20
      IPDW(2)=IPDW(2).OR.(ICRD(NDIM-4)/16)
      IPDW(3)=0
      RDUM=FLOAT((ICRD(NDIM-4).AND."17))*4096.
      IF (RDUM.GE.32768.) RDUM=RDUM-65536.
      IPDW(3)=IFIX(RDUM)
      IF (NDIM.EQ.6) GO TO 30
      IPDW(3)=IPDW(3).OR.(ICRD(NDIM-5)*64)
      IF (NDIM.EQ.7) GO TO 30
      IPDW(3)=IPDW(3).OR.(ICRD(NDIM-6))
      IPDW(4)=ICRD(1)
      RETURN
   10 IPDW(2)=IPDW(2).OR.ICRD(1)
      RETURN
   20 IPDW(3)=ICRD(1)
      RETURN
   30 IPDW(3)=IPDW(3).OR.ICRD(1)
      RETURN
      END

      SUBROUTINE UNPACK(NDIM, IPDW, ICRD)
C...  FORTRAN routine for "unpacking" data with 6-bit resolution
C...  NDIM (3 through 8) is data set dimensionality
C...  IPDW is packed data word input
C...  ICRD is 6-bit resolution data output
      IMPLICIT INTEGER*2 (A-Z)
      DIMENSION IPDW(4),ICRD(9)
      GO TO (110,90,70,30,20,10) NDIM-2
   10 ICRD(1)=IPDW(4)+1
   20 ICRD(NDIM-6)=(IPDW(3).AND."77)+1
      ICRD(NDIM-5)=((IPDW(3).AND."7777)/64)+1
      GO TO 40
   30 ICRD(1)=(IPDW(3).AND."7777)+1
   40 JDUM=(IPDW(3).AND."170000)/4096
      IF (JDUM) 50,60,60
   50 JDUM=16+JDUM
   60 ICRD(NDIM-4)=(JDUM.OR.((IPDW(2).AND."3)*16))+1
      GO TO 80
   70 ICRD(1)=IPDW(3)+1
   80 ICRD(NDIM-3)=((IPDW(2).AND."377)/4)+1
      GO TO 100
   90 ICRD(1)=(IPDW(2).AND."377)+1
  100 ICRD(NDIM-2)=((IPDW(2).AND."37400)/256)+1
      GO TO 120
  110 ICRD(1)=(IPDW(2).AND."37777)+1
  120 JDUM=(IPDW(2).AND."140000)/16384
      IF (JDUM) 130,140,140
  130 JDUM=4+JDUM
  140 ICRD(NDIM-1)=(JDUM.OR.(IPDW(1).AND."17)*4)+1
      ICRD(NDIM)=((IPDW(1).AND."1760)/16)+1
      JDUM=(IPDW(1).AND."176000)/1024
      IF (JDUM) 150,160,160
  150 JDUM=64+JDUM
  160 ICRD(NDIM+1)=JDUM+1
      RETURN
      END

Acknowledgement

We thank Miss Cordelia Munn for the artwork.

References

Hedley, D.W., Friedlander, M.I., Taylor, I.W., Rugg, C.A. and Musgrove, E.A. (1983) Method for analysis of cellular DNA content of paraffin-embedded pathological material using flow cytometry. J. Histochem. Cytochem. 31, 1333-1335.
Mann, R.C. (1987) On multi-parameter data analysis in flow cytometry. Cytometry 8, 184-189.
Rabbits, P.H., Watson, J.V., Lamond, A., Fischer, W., Forester, A., Stinton, M.A., Evan, G.I., Atherton, E., Sheppard, R.C. and Rabbits, T.H. (1985) Metabolism of c-myc gene products: c-myc mRNA and protein expression in the cell cycle. EMBO J. 4, 2009-2015.
Salzman, G.C., Mullaney, P.F. and Price, B.J. (1979) Light-scattering approaches to cell characterization. In: M. Melamed, P.F. Mullaney and M. Mendelsohn (Eds.), Flow Cytometry and Sorting. John Wiley, New York, chapter 6.
Smith, P.J., Nakeff, A. and Watson, J.V. (1985) Flow-cytometric detection of changes in fluorescence emission spectrum of a vital DNA-specific dye in human tumour cells. Exp. Cell Res. 159, 37-46.
Watson, J.V. (1980) Enzyme kinetic studies in cell populations using fluorogenic substrates and flow cytometric techniques. Cytometry 1, 143-151.
Watson, J.V. (1981) Dual laser beam focussing for flow cytometry through a single crossed cylindrical lens pair. Cytometry 2, 14-19.
Watson, J.V. (1985) A method for improving light collection by 600% from square cross section flow cytometry chambers. Br. J. Cancer 51, 433-435.
Watson, J.V., Nakeff, A., Chambers, S.H. and Smith, P.J. (1985a) Fluorescence emission spectrum analysis of chicken thymocytes using Hoechst-33342. Cytometry 6, 310-315.
Watson, J.V., Sikora, E.K. and Evan, G.I. (1985b) A simultaneous flow cytometric assay for c-myc oncoprotein and cellular DNA in nuclei from paraffin embedded material. J. Immunol. Methods 83, 179-192.