Measurement of C-bands in human chromosomes


Comput. Biol. Med. Pergamon Press 1975. Vol. 5, pp. 179-201. Printed in Great Britain.



MEASUREMENT OF C-BANDS IN HUMAN CHROMOSOMES

D. MASON, I. LAUDER, D. RUTOVITZ and G. SPOWART

MRC Clinical and Population Cytogenetics Unit, Western General Hospital, Crewe Road, Edinburgh EH4 2XU

(Received 26 June 1974 and in revised form 9 December 1974)

Abstract. A technique is presented for obtaining quantitative measurements of C-bands in human chromosomes using a computer-controlled microscope and scanner. Relative efficiencies of various procedures for C-band and chromosome area and integrated optical density measurement, and the sources and extent of variance in C-band measurements, whether arising from the preparation or measurement techniques or from biological factors, are investigated. Statistical methods for determining whether or not differences in homologue band size can be detected are discussed. Measurements were taken from 50 cells of the same blood culture. Results are given for the number 1 and 16 chromosome C-bands, which appeared heteromorphic on visual inspection. In the case of the number 1 the mean size of each band population was determined to within 8% at the 95% confidence level and the means were found to be significantly different (p < 0.01), the ratio of the smaller to the larger being about 0.78. The number 16 C-bands in the sample were smaller and visually less obviously heteromorphic than the number 1 chromosome bands and it was not possible to detect a significant heteromorphism with the sample used.

C-bands   Computer   Pattern recognition   Operator interaction   Statistics

1. INTRODUCTION

A number of different techniques have recently been developed for producing characteristic patterns of light and dark bands on the arms of chromosomes. One of these techniques brings out a pronounced band in the centromere region of certain chromosomes, notably chromosomes 1, 9 and 16, and these bands are known as C-bands [1]. Usually they are located on the long arms but in about 3% of cases may be wholly or partly on the short arms, giving different band or chromosome shapes.

Each homologue in an individual is contributed by one parent. If there is a difference, of either size or shape, between the C-bands, this can be used to determine from which parent each chromosome of a pair was inherited. It may therefore be possible to follow the inheritance of this region through several generations, which could be of interest in work on gene localisation. In the longer term, it would also be of interest to know whether the distribution of C-band sizes is discrete or continuous.

Currently C-bands are allotted to size categories by visual inspection. We have attempted to devise a system of scanning and measurement to provide non-subjective quantitative estimates of band size. The undertaking is only just feasible with an optical microscope (we did not consider electron microscopy), for band areas are typically 1 μm² or less. Indeed we do not claim to measure area, strictly speaking (or integrated O.D.), but attempt only to obtain a reasonably repeatable relative size measurement of some kind.


The following steps are involved:

(1) blood samples are collected and cultured according to a modification of the method of Hungerford [2];
(2) the cells are spread on slides and treated to produce the banding patterns by the method of Sumner [3];
(3) metaphase cells in which the chromosomes of interest are sufficiently clearly visible for our purposes are located on the slides (Fig. 1);
(4) the specified chromosomes are found;
(5) a line of demarcation between the chromosomes and the cell background is determined;
(6) the band is demarcated from the chromosome;
(7) measurements of band and chromosome are obtained;
(8) results are collated and statistically interpreted.

The entire process of cell location, chromosome identification, demarcation and measurement is carried out using a computer controlled microscope and scanning system described in [4]. The metaphase cells are located by an auxiliary scanning and processing set-up constructed for this purpose, and then further assessed by an operator before final acceptance. We will describe in some detail in Section 2 the procedures used for chromosome and band demarcation and measurement.

The aims of the project described in this paper are as follows:

(1) to examine the relative efficiency of various tactics for area and integrated optical density demarcation and measurement, and especially to test a number of different approaches to the problem of threshold setting;
(2) to determine the sources and extent of variance in C-band measurements, whether arising from biological factors, from the preparation technique or from the measurement procedure;
(3) to find normalisation procedures leading to specimen-independent measurement of band size;
(4) to find suitable statistical methods for determining whether or not differences of homologue band size can be detected.

2. EQUIPMENT AND PROGRAMS USED

The instrumentation consists of a computer controlled microscope, an image dissector scanner (which is addressable in the same way as a cathode ray tube (C.R.T.)), a PDP 9 computer with 16 K memory, a 1 Mbyte disc and a link to a larger machine, a display screen with vector and character modes, a camera and the metaphase finder mentioned above [5]. Special equipment (supplied by Information International Inc.) is provided for interaction, namely 4 incremental knobs, 2 eight-position button racks, 6 two-position switches, 2 foot pedals and a light pen, and of course there is a teletype. Normally the operator simply presses the space bar of the teletype to proceed from one step to another; only if it seems that some correction is necessary to the machine action, e.g. if the chromosome must be trimmed because of too close an association with a nearby object (see Fig. 2), does the wider range of controls have to be used. Pressing the space bar always causes continuation to the next stage. With this arrangement there is little difficulty in training technicians to use the system.

The C-Band program

For some years now we have been developing a PDP 9 based support package known as “JSCAN” for just such applications as this. To cope with the C-band problem a suite of variants of JSCAN modules and a number of special purpose assembly code routines were compiled. On entry to the program the operator is at the first of a series of “inspection windows”, in this case for purposes of accepting or correcting the cell identification. Depressing the teletype space bar leads to a video display of the cell as seen by the scanner.

Fig. 1. A typical cell used in the C-band study. The number 1 and 16 chromosome pairs are indicated. The C-bands of both these chromosome pairs appear heteromorphic on inspection.

Fig. 2. Variable brightness display of the digitised optical density structure of the largest above-threshold object found within the steerable ring. If the chromosome of interest within the steerable ring is not separated from neighbouring chromosomes by the initial thresholding procedure a trimming device may be used to remove unwanted material. In (a) the chromosome to the left has not been completely isolated. In (b) the trimmer is positioned at the required position and angle in the field. In (c) the trimming action has been performed, leaving the isolated chromosome.

Fig. 3. Display of the chromosome and C-band boundaries determined as a result of the final thresholding procedure, superimposed on a variable brightness display of the optical densities within the chromosome envelope. The trimmer may be used again to remove portions of boundaries felt to be incorrect, as in the chromosome boundary in (a). In (b) the boundary to be trimmed is first selected by pointing the light-pen at it. In (c) the trimmer is brought to the required position. In (d) the trim has been completed and the C-band boundary restored ready for measurement.


A steerable ring is superimposed on this picture to enable the operator to choose the chromosome he next wants to measure; by means of the console knobs the ring can be positioned at any point in the field and its size adjusted appropriately. It is not necessary for the operator to fit the ring exactly around the chromosome: the program will select the largest connected above-threshold region within the area covered by the ring. On the next push of the space bar the region which has been selected is shifted, by movement of the microscope stage, to a portion of the image dissector photocathode known to be blemish free. Another inspection slot is provided to make sure that the ring has been correctly located after stage movement. (The stage moves in steps of 4 μm with a standard deviation of + μm; a scanning and imaging operation is required to locate and centre the chromosome precisely after movement.) At the next step the selected region is scanned and a threshold procedure (see below) is used to set an initial cut-off point between chromosome and background. The results are displayed to the operator at the next inspection slot (Fig. 2). The only amending action allowed for is the removal of intrusive parts of neighbouring chromosomes; the threshold strategy ensures that recombination of fragments will not be necessary. Next an envelope about 1 μm wide is thrown around the chromosome outline so far determined, to provide a standard amount of surrounding background for the purposes of a more precise second threshold. The focal plane is optimised over this envelope using an automatic focusing technique described elsewhere [6]. This ensures that C-band size variation due to misfocus is negligible. After this a final scan is carried out at the optimum focal position and a more refined threshold strategy invoked to find a cut-off point for the chromosome and another for the C-band.
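The ring-selection step (picking the largest connected above-threshold region inside the operator's ring) can be illustrated with a short sketch. This is not the JSCAN implementation, which the text does not give; the function name, the circular shape of the ring, the use of 8-connectivity and the list-of-lists image are all assumptions made for illustration.

```python
from collections import deque


def largest_region_in_ring(image, threshold, centre, radius):
    """Return the pixels of the largest connected above-threshold region
    inside a circular ring, as a set of (row, col) pairs.

    Assumptions (not from the paper): the ring is a circle given by
    `centre` and `radius`, and pixels are 8-connected.
    """
    rows, cols = len(image), len(image[0])
    cr, cc = centre

    def eligible(r, c):
        # A pixel counts only if it is inside the ring and above threshold.
        inside = (r - cr) ** 2 + (c - cc) ** 2 <= radius ** 2
        return inside and image[r][c] > threshold

    seen = set()
    best = set()
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or not eligible(r, c):
                continue
            # Breadth-first flood fill from this seed pixel.
            region = {(r, c)}
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and (ny, nx) not in seen
                                and eligible(ny, nx)):
                            seen.add((ny, nx))
                            region.add((ny, nx))
                            queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    return best
```

Because only the largest region is kept, the ring need not be fitted tightly: small above-threshold fragments elsewhere inside it are simply ignored.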
As an aid to monitoring the proceedings, the sequential boundaries of the chromosome and C-band are determined, and a variable brightness reconstruction of the digital values of the optical densities within the chromosome envelope is presented to the operator, with the boundaries, at the next inspection slot (Fig. 3). Again the only correcting action available to the operator is total rejection or trimming off of portions of the band or chromosome boundary felt to be unsuitable. Once the operator has accepted the boundaries, measurements are found and displayed, after which a final push on the space bar causes reversion to the identification phase: the program automatically adjusts the chromosome identification and repetition number, so as to guide the operator in his next selection. The whole measurement process takes about a minute for a straightforward chromosome, but up to 14 times longer if trimming is required, as it was for about 10% of the chromosomes in our experiment. Considerable improvement could be obtained by using specially written code to bypass the general purpose (JSCAN) programs at some points, and by removing certain known inefficiencies in the latter.

3. METHODS OF CHROMOSOME AND BAND DEMARCATION AND MEASUREMENT

The main points to be considered in evaluating algorithms for determining C-band and chromosome magnitude are:

(1) errors introduced by the measuring system and the software must be minimal in relation to those due to biological factors or the method of culture;
(2) the nature of the response of the algorithm to over-all stain level is important, as the level can vary a great deal between different cells, especially when they are on different slides;
(3) since we are concerned with the possibility of distinguishing homologues, the decisive factor in choice of algorithm is the efficiency of homologue separation. A measure of this is the dispersion

$$\frac{\mu}{\sigma}$$

where μ is the mean difference between homologous bands and σ the standard deviation of this difference.

As will appear in more detail later, μ and σ cannot be estimated directly, as, using cells with a normal karyotype, we have no means of identifying a particular member of a homologous pair in any one cell. Also there is systematic variation of homologue pair measurements due to factors such as stain level and cell contraction, and so standardization procedures have to be employed before these estimates can be arrived at. Nevertheless, this is essentially the statistic upon which our algorithm selection is based.

Fig. 4. (a) The optical density histogram of the points in the bordered chromosome domain, showing a peak due to the background (at B) and a shallow peak due to the relatively flat top of the chromosome (at T). (b) The weighted optical density histogram, with each point weighted by 1/(∇g² + 1). The weighting enhances the two peaks, thus increasing the accuracy to which they may be found (see text). [Figure labels: border limit; mean background value, B; mean top-of-chromosome value, T; C-band; area above chromosome threshold.]

The algorithms tested were mainly dependent on the traditional thresholding approach in which points having an optical density (O.D.) reading exceeding a certain value are considered to belong to the object of interest, and points below to the background. Optimal determination of the cut-off value is the main and crucial requirement. However, an alternative procedure was also implemented. Instead of trying to fix on a particular O.D. value for separating objects from their background, the manner in which the area varies with choice of threshold may be considered. If there is a range
of separation values over which the relationship is roughly linear, there is an opportunity of obtaining consistent measurements by using the intercept of the corresponding fitted line. The size/threshold function of a banded chromosome displays two approximately linear areas suitable for this purpose: a slowly falling portion corresponding to the O.D. values of its edge, and a similar portion corresponding to the edge of the band. Lines can be fitted to these parts of the curves. It was thought that extrapolation to the background O.D. value should yield estimates of the required parameters which would not be much perturbed by small variations in O.D. measurement. The background-value intercept of the chromosome-edge line, in particular, could reasonably be held to give a measure of whole chromosome size which would include that portion normally lost due to the fact that one must separate above, not at, background level. As such it bears some affinity with Mendelsohn's methods of integrated optical density (I.O.D.) estimation [7]. Similar considerations apply to C-band measurement: details are given in the description of Method 4 and Method 5, below.

All our threshold-setting procedures are based on properties of weighted distributions of optical density values in the regions concerned. The weights employed are either unity or some function of the optical density gradient, defined in the following way: if g0 denotes the O.D. at the current point and g1 to g8 are the O.D. values of its 8 neighbours on a conventionally diagonally connected square grid, the squared gradient ∇g² is approximately proportional to:

$$(g_1 - g_5)^2 + (g_3 - g_7)^2 + \tfrac{1}{2}\left\{(g_2 - g_6)^2 + (g_4 - g_8)^2\right\}$$

[Figure: neighbours of a point.]
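As a rough illustration of the neighbour-difference form of the gradient, the sketch below computes a quantity proportional to ∇g² at one interior pixel. Several things here are assumptions rather than statements from the text: the neighbours g1 to g8 are taken to run consecutively around the point, so that (g1, g5) and (g3, g7) are the axial opposite pairs and (g2, g6) and (g4, g8) the diagonal ones; the diagonal terms are given a weight of one half (diagonal neighbours being √2 further away); and the function name and list-of-lists image are hypothetical.

```python
def squared_gradient(od, r, c):
    """Quantity proportional to the squared O.D. gradient at interior
    pixel (r, c) of the optical-density image `od` (list of lists).

    Assumed neighbour numbering: g1..g8 run consecutively around the
    point, so opposite neighbours differ by 4 in index.
    """
    g1, g5 = od[r][c + 1], od[r][c - 1]          # horizontal pair
    g3, g7 = od[r - 1][c], od[r + 1][c]          # vertical pair
    g2, g6 = od[r - 1][c + 1], od[r + 1][c - 1]  # one diagonal pair
    g4, g8 = od[r - 1][c - 1], od[r + 1][c + 1]  # other diagonal pair

    axial = (g1 - g5) ** 2 + (g3 - g7) ** 2
    diagonal = (g2 - g6) ** 2 + (g4 - g8) ** 2
    # Diagonal differences span a longer baseline, hence the assumed 1/2.
    return axial + 0.5 * diagonal
```

With this weighting a unit ramp gives the same value whether it runs along the rows or the columns, and a ramp along the diagonal (whose true gradient is √2 times steeper) gives exactly twice that value, which is the behaviour one would want from a direction-independent gradient estimate.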

In a normal chromosome preparation one would expect the regions of high gradient to be concentrated around the edges of the chromosomes, so that taking the mean of the gradient-weighted density distribution would give a cut-off value more or less centrally situated on the chromosome edge (see Fig. 6). This is in fact the method employed to determine the threshold for the initial separation of the chromosome from the background.

To demarcate the C-band from the chromosome, we first find the O.D. level of the top of the chromosome arms, so that O.D. values above this point can be used in a subsequent determination of the C-band threshold while those lying below can be used in an accurate re-estimation of the chromosome threshold. We have found that an efficient way of proceeding is to construct an optical density histogram of the points in the bordered chromosome domain, in which each point in the histogram is weighted by 1/(∇g² + 1). This histogram (Fig. 4b) typically exhibits two main peaks, one (at B) due to the relatively flat background in the envelope around the chromosome, the other (at T) due to the density plateau of the chromosome arms outside the band. This double peak structure may often be seen in the simple O.D. frequency histogram (Fig. 4a), but the technique of weighting by 1/(∇g² + 1) enhances the flatter regions in the background and the top of the chromosome relative to its steeply-sloping sides, thus making the maxima sharper and more pronounced. The weighting increases both the reliability and the accuracy of finding T.

Fig. 5. Calculation of the positions of the two maxima in the 1/(∇g² + 1) histogram involves both discrete and continuous peak-finding logic. (a) Initially the trend of the curve is considered to be rising and this state is assumed to continue until the current point has fallen more than a pre-assigned tolerance below the highest point reached (in the current rising segment). When this occurs a local maximum is recorded at the highest point, the trend is switched to falling, and the process repeated inversely, and so on. (b) Next, a 5-point parabola is fitted to the points in the region of each maximum, and the position of the maximum taken as the abscissa of the parabola's axis of symmetry. Fractional parts of the peak positions are kept for increased accuracy. [Figure labels: tolerance exceeded, accept; tolerance not exceeded, reject; discrete maximum; continuous maximum.]

The error in our determination of T increases when chromosomes have uneven or rounded tops, or if extraneous material is present in the border outside the chromosome. Errors due to the latter cause are avoided by allowing into the 1/(∇g² + 1) histogram only those points with grey values above the initial chromosome threshold level which lie inside the original unbordered chromosome domain. The positions of the two maxima are calculated by finding the two peaks in the 1/(∇g² + 1) histogram using a combination of discrete and continuous peak-finding logic (see Fig. 5 for details). When the weighting factor is used, exactly two peaks have always been found in the banded chromosomes examined to date.
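The two-stage peak search can be sketched as follows. This is a minimal illustration of the logic described for Fig. 5, not the original assembly-code routine: the function names are hypothetical, the histogram is assumed to be a plain list of weights indexed by integer O.D. level, and the parabola is fitted by ordinary least squares to the five points centred on the discrete maximum (so the peak must lie at least two bins from either end).

```python
def discrete_maxima(hist, tolerance):
    """Indices of local maxima found with a rise/fall tolerance rule."""
    peaks = []
    rising = True
    extreme = 0  # highest point so far when rising, lowest when falling
    for i in range(1, len(hist)):
        if rising:
            if hist[i] > hist[extreme]:
                extreme = i
            elif hist[extreme] - hist[i] > tolerance:
                # Fallen far enough below the high point: record a maximum.
                peaks.append(extreme)
                rising = False
                extreme = i
        else:
            if hist[i] < hist[extreme]:
                extreme = i
            elif hist[i] - hist[extreme] > tolerance:
                # Risen far enough above the low point: trend is rising again.
                rising = True
                extreme = i
    return peaks


def refine_peak(hist, peak):
    """Fractional peak position from a least-squares 5-point parabola.

    Uses the closed-form fit for abscissae -2..2 about `peak`; assumes the
    five points are not collinear (non-zero curvature).
    """
    offsets = (-2, -1, 0, 1, 2)
    ys = [hist[peak + u] for u in offsets]
    s_y = sum(ys)
    s_uy = sum(u * y for u, y in zip(offsets, ys))
    s_uuy = sum(u * u * y for u, y in zip(offsets, ys))
    a = (s_uuy - 2.0 * s_y) / 14.0  # quadratic coefficient
    b = s_uy / 10.0                 # linear coefficient
    return peak - b / (2.0 * a)     # abscissa of the axis of symmetry
```

In use, `discrete_maxima` would be applied to the 1/(∇g² + 1)-weighted histogram and each returned index passed to `refine_peak`, giving fractional estimates of B and T. The tolerance is a tuning parameter whose value the text does not state.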


Once the values of T and B have been determined, various methods may be used for finding C-band and chromosome thresholds (in what follows C-band and chromosome threshold methods have been rather arbitrarily paired together but there is no necessary association of like-numbered methods).

Fig. 6. The ∇g²-weighted O.D. histogram of the points in the bordered chromosome domain. The C-band (upper) threshold is taken as the median of the section of the distribution lying above T, while the chromosome (lower) threshold is the median of the section lying between B and T (see text for details).
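A median threshold of this kind, kept to a fractional position, might be computed along the following lines. The sketch assumes the gradient-weighted histogram is available as a list of accumulated weights per integer O.D. level, and it interpolates within a bin by treating the bin's weight as spread uniformly over the interval [g, g + 1); that interpolation convention and the function name are assumptions, not details given in the text.

```python
def weighted_median(weights, lo, hi):
    """Fractional position of the median of the section weights[lo:hi].

    weights[g] is the total weight accumulated at integer O.D. level g.
    Each bin is assumed to spread its weight uniformly over [g, g + 1).
    """
    section = weights[lo:hi]
    total = float(sum(section))
    if total <= 0:
        raise ValueError("section of the histogram carries no weight")
    half = total / 2.0
    running = 0.0
    for offset, w in enumerate(section):
        if w > 0 and running + w >= half:
            # The median falls inside this bin: interpolate linearly.
            return lo + offset + (half - running) / w
        running += w
    return float(hi)
```

With integer estimates `b` and `t` of the background and top-of-chromosome levels, the chromosome threshold would be `weighted_median(weights, b, t)` and the C-band threshold `weighted_median(weights, t, len(weights))`.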

Method 1: threshold at median of gradient-weighted O.D. histogram

Weighting the optical density histogram by the square of the gradient at each point amplifies the steeply-sloping sides of the chromosome and C-band relative to points in flatter regions. The upper (C-band) threshold may then be taken as the median of the section of this distribution lying above T, while the lower (chromosome) threshold is the median of the section of the distribution lying between B and T. The lower threshold L is actually calculated from the distribution between B + C and T (Fig. 6), where C is a small positive offset used to make the chromosome area determination insensitive to the width of the background envelope. A similar technique is used for the upper threshold U, with the proviso that if the median value is very close to T we instead set U = T + K, where K is another small offset; this avoids spurious results with very small C-bands.

For a unit step in threshold the average changes in chromosome area and I.O.D. are about 5 and 3%, and for band area and I.O.D. about 8 and 6%, respectively. Therefore, to eliminate quantisation error, fractional parts of the thresholds are kept, and the corresponding C-band and chromosome area and I.O.D. measurements found by interpolation (this applies to all the other methods as well).

Method 2

(a) Chromosome method: threshold as average of B and T. The average of B and T may not be as strongly influenced by extraneous material in the chromosome envelopes as the threshold calculated by Method 1.

(b) C-band method: median and envelope. A refinement of the median method for finding the C-band threshold involves first calculating it as above and then enveloping the largest connected domain lying above this threshold (in practice always the C-band) by a fixed border width of 1 μm. The ∇g²-weighted density histogram is then recalculated, and its new median position taken as the new threshold. This procedure reduces the effect on the threshold of irregularities in the top of the chromosome lying far from the C-band.

Method 3: constant offset

A constant offset is simply added to B to get the lower threshold, and another offset to T to get the upper threshold. The optimum values of these offsets have to be determined empirically. These thresholds again should be less dependent on extraneous material in the region of chromosome and C-band than the thresholds of Method 1.

Method 4: area by extrapolation

This method and the following one, due to Dr. A. Sumner, are used only for measurement of areas. The histogram of "area above grey value" versus grey value is first compiled (Fig. 7). For the lower threshold, a straight line is fitted to the region of the histogram around (B + T)/2. The line is then extrapolated back to B and the corresponding area read off. Similarly, for the upper threshold, a straight line is fitted to the region of the histogram above the upper threshold calculated from Method 3, the line is extrapolated to T and the C-band area read off.

Method 5: extrapolation to base-line

As Method 4, but the line fitted for the upper threshold is extrapolated back to B rather than T (Fig. 7). This method differs from all the others in that it is fairly independent of the value chosen for T.

Fig. 7. Histogram of the number of points in the bordered chromosome domain having O.D. greater than value g, plotted against g. To estimate the chromosome area, a straight line is fitted to the region of the histogram around (B + T)/2, then extrapolated back to B, and the corresponding area read off. The band area is determined in a similar way by fitting a line to the points in the region of the upper threshold calculated by Method 3, and extrapolating either to T (Method 4) or B (Method 5). [Figure labels: chromosome threshold; C-band threshold (Method 2).]
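The extrapolation idea behind Methods 4 and 5 can be sketched in a few lines: build the "area above grey value" curve, fit a straight line over a window of grey values, and read the line off at the chosen base level. The width of the fitting window is not stated in the text, so `half_window` below is a hypothetical parameter, as are the function names; grey values are assumed to be integers and the fit is ordinary least squares.

```python
def area_above(od_values, g):
    """Number of points whose O.D. exceeds grey value g."""
    return sum(1 for v in od_values if v > g)


def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys): (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolated_area(od_values, centre, half_window, target):
    """Fit the area/threshold curve around `centre` and evaluate the
    fitted line at grey value `target`.

    `half_window` (in grey levels) is an assumed tuning parameter.
    """
    levels = list(range(centre - half_window, centre + half_window + 1))
    areas = [area_above(od_values, g) for g in levels]
    slope, intercept = fit_line(levels, areas)
    return slope * target + intercept
```

Under these assumptions the chromosome area would be `extrapolated_area(points, round((B + T) / 2), w, B)`, and the band area would use a window placed just above the upper threshold with `target` set to T for Method 4 or to B for Method 5.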

4. ANALYSIS OF C-BAND AND CHROMOSOME VARIANCE

The results of this and the following sections were obtained from an experiment in which two slides, L and H, from the same culture were used, slide H being more heavily stained than slide L. Measurements were obtained from 25 cells on each slide. The number 1 and 16 chromosomes and C-bands were measured. According to observers, the C-bands of both the number 1 and 16 chromosomes in this culture exhibited some degree of heteromorphism (Fig. 1). The number 2 chromosomes (which had almost no C-band) were also measured to provide independent data for normalisation of band sizes with respect to variations in chromosome staining and contraction. The battery of algorithms described in the previous section was applied to the same data for each measurement.

Each object was measured twice, the second measurement immediately following the first, and the readings averaged. This was to reduce the effect of short-term drift in the sensitivity of the image dissector over the scan time of the object, which unfortunately becomes significant at the levels of accuracy required for C-band work. Where the difference between the successive readings exceeded a certain pre-assigned level, a third reading was taken and the two readings in closest agreement used.

Repeatability of measurements by different methods

The taking of two readings for each object allowed a repeatability study to be made on the different measurement methods. The replication error due to uncorrected machine and (in some cases) operator performance was examined by calculating the coefficient of variation in replicated area and I.O.D. measurement of chromosomes and C-bands, using all five methods. The results for the coefficient of variation of replication (C.V.), defined by the formula

$$\mathrm{C.V.} = 100 \times \frac{\sqrt{\sum (\text{measurement}_1 - \text{measurement}_2)^2 / 2N}}{\sum (\text{measurement}_1 + \text{measurement}_2) / 2N}$$

where the sums run over the N objects measured, are shown in Table 1. The standard deviation of the coefficient of variation is also given.

Table 1. Coefficients of variation of replicated area and I.O.D. measurements. N = 50 (slide L only)

Method       Chromosome 1     Chromosome 2     Chromosome 16    C-band 1         C-band 16
             Area    I.O.D.   Area    I.O.D.   Area    I.O.D.   Area    I.O.D.   Area    I.O.D.
1            1.6     2.6      2.0     3.5      1.9     3.3      7.5     7.4      7.6     6.8
2            2.3     3.0      2.4     3.4      2.4     3.3      9.0     8.3      7.5     7.1
3            1.7     2.8      2.0     3.4      3.3     4.1      9.5     9.1      11.1    9.5
4/5          2.2     -        2.3     -        3.3     -        ?       -        ?       -
C.V. error   ±0.2    ±0.3     ±0.2    ±0.3     ±0.25   ±0.3     ±0.9    ±0.8     ±0.9    ±0.8

Methods 4 and 5 measure area only. Entries shown as ? are not legible in this copy.
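For readers who want to reproduce the replication statistic, the coefficient of variation of replication (the standard deviation estimated from paired readings, expressed as a percentage of the overall mean) can be computed as in the sketch below. The function name and the use of two parallel lists of readings are illustrative choices only.

```python
import math


def replication_cv(first, second):
    """Coefficient of variation of replication, in percent.

    `first` and `second` are the first and second readings of the same
    N objects, in the same order.
    """
    n = len(first)
    # Replication variance estimated from paired differences: sum(d^2)/2N.
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(first, second))
    sd = math.sqrt(sum_sq_diff / (2 * n))
    # Mean over all 2N readings.
    mean = sum(a + b for a, b in zip(first, second)) / (2 * n)
    return 100.0 * sd / mean
```

As a small worked case, readings of (100, 102) and (102, 100) on two objects give a replication standard deviation of √2 about a mean of 101, that is, a C.V. of roughly 1.4%.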


The table shows that, as one might expect, replication errors are considerably greater for C-band areas than for chromosomes, the range being from about 1½% for chromosome 1 to about 8% for the bands. The table also shows a greater consistency in area measurements for chromosomes than for their I.O.D., but there is little to choose between the two for the bands. The repeatability of chromosome measurements does not differ significantly between the different methods, but Method 1 seems marginally better than any of the others for band measurement. The magnitude of the replication error relative to the total C-band variance is discussed in Section 8.

Table 2. Coefficients of variation of chromosome size ratios

Method   Chromosome 1 : chromosome 2   Chromosome 16 : chromosome 2
         Area     I.O.D.               Area     I.O.D.
1        4.9      10.3                 8.6      15.6
2        4.9      10.5                 9.0      16.1
3        5.8      12.0                 8.8      16.8
4/5      5.4      -                    8.6      -

The ratios considered are those for the sum of area or I.O.D. measurements of the homologous pairs.

Between-cell variation of chromosome measurements

In an attempt to obtain a first indication of the sensitivity of the different methods of chromosome measurement to differences of stain level, configuration and contraction between different cells, we calculated the coefficient of variation of the ratios of chromosomes 1 and 16 to chromosome 2 for all methods. The average of the two corresponding homologues was used as the measure of chromosome size in each cell. Table 2 indicates a slight preference for Methods 1 and 2 for chromosome area and I.O.D., but the results are not statistically significant.

Contraction effects

One of the first questions to be asked in any study involving dimensions of chromosomes or parts of chromosomes is the extent to which they are affected by the degree of contraction of the cell. Unfortunately, it turns out that the apparent C-band size is far more critically dependent on chromosome stain level than on chromosome size, so that in order to see the effect of chromosome size on band size, data have to be selected from a narrow stain range. Using the slide L data, which satisfied this requirement, we have determined the linear regression of C-band area against corrected chromosome area (the chromosome area minus the band), for chromosomes 1 and 16. For both parameters, averaged data from the two homologues of each type were used.

Table 3. Regression intercepts for C-band area against corrected chromosome area

CB1 v. CH1-CB1
Method   Intercept   S.E.   Mean band area
1        *36         14     89
2        *45         14     88
3        *100        19     95
4        *104        23     135
5        *265        70     278

The results pertain to a linear regression of C-band area against corrected area of the chromosome containing the band. Units are sq. μm × 10⁻². A significantly non-zero regression coefficient (at the 5% level) is marked "*". The slide L data were used. The significantly non-zero regression intercepts for the number 1 band would seem to indicate that the band is proportionately less compressible than the chromosome. The picture is less clear for the number 16 chromosome. The corresponding columns for CB16 v. CH16-CB16 are not legible in this copy.

It can be seen from Table 3 that for the number 1 bands, the lines have non-zero intercepts on the axis of the dependent variable (the band size). Thus for a given change in chromosome size a smaller proportional change in band size occurs; this seems to indicate that the band is a relatively inextensible object which does not expand and contract in sympathy with the chromosome. The same effect is observed if the whole chromosome area (or the independent number 2 chromosome area) rather than the corrected area is used as the independent variable, so that whether or not the band area is subtracted
from the chromosome area, the result is not affected. This also means that the ratio of band size to chromosome size is not a constant which is independent of contraction, and that taking a ratio is not the most accurate way of normalising the band size. A better normalisation can be effected using the regression line (see Section 5). The picture is less clear for the number 16 chromosomes, since some methods give significantly non-zero intercepts while others do not. However, for consistency we have normalised both chromosome types in the same way.

Stain effects

Table 4 shows the heavy dependence of our C-band measurements on the stain level in the cell. The table shows that, except in the case of Method 5, over half the variance is stain related, so much so that there is no significant advantage in taking contraction as well as staining into account. The measure of stain level used was the average optical density (A.O.D.) of the number 2 chromosomes in the cell.† The negative regression coefficients indicate that the apparent band size decreases as the stain level increases. Detailed examination of our measurements shows that this is because the stain level of the already darkly-stained band rises at a much slower rate than that of the relatively lightly-stained chromosome over our stain range. An oversimplified picture in terms of the stain-density surface can be obtained by regarding the band as a roughly conical object which has become saturated and stays constant as the stain level is increased, while the top of the chromosome gradually rises up the band. Thus, for those methods (all except Method 5) which in effect use the top of the chromosome as their band measurement base line, the band area decreases as the stain level increases. Method 5, however, makes measurements not from the top of the chromosome but from the background surrounding the chromosome, so that its measurements are almost independent of chromosome stain level.

The apparent band shrinkage at high stain level is also a feature of estimates made by the human eye, as we presumably judge C-band area by comparing density levels in the band more with levels in the surrounding chromosome than with the background level.

† A.O.D. = ½(I₁/A₁ + I₂/A₂), where, for i = 1, 2, Aᵢ and Iᵢ are the area and integrated optical density of the number 2 chromosomes.
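The stain measure and the kind of straight-line regression used in this section (band size against A.O.D., or band area against corrected chromosome area) can be sketched as below. The code is a generic ordinary least-squares fit with the usual standard errors, offered as an illustration and not as the authors' own procedure; the function names and the returned dictionary keys are hypothetical. The `r_squared` value is the proportion of the variance of the fitted quantity accounted for by the line, which may not be computed on exactly the same basis as the proportions the paper tabulates.

```python
import math


def average_optical_density(i1, a1, i2, a2):
    """A.O.D. of the two number 2 homologues: mean of I.O.D./area."""
    return 0.5 * (i1 / a1 + i2 / a2)


def linear_regression(xs, ys):
    """Least-squares line y = intercept + slope * x with standard errors.

    Needs at least three points; returns a dict (keys are illustrative).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Guard against tiny negative values caused by rounding.
    residual_ss = max(0.0, syy - slope * sxy)
    residual_var = residual_ss / (n - 2)
    return {
        "slope": slope,
        "intercept": intercept,
        "slope_se": math.sqrt(residual_var / sxx),
        "intercept_se": math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx)),
        "r_squared": 1.0 - residual_ss / syy if syy > 0 else 0.0,
    }
```

An intercept several standard errors away from zero in a fit of band area against corrected chromosome area corresponds to the situation described for the number 1 bands, and a negative slope in a fit of band size against A.O.D. corresponds to the apparent shrinkage of the band at high stain level.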

Table 4. Dependence of measured band area and I.O.D. on chromosome stain level

Columns give, for the type 1 and type 16 chromosome bands, the regression coefficient, its standard error (S.E.) and the proportion of individual band variance accounted for.

| Method | Type 1: regression coefficient | Type 1: S.E. | Type 1: proportion | Type 16: regression coefficient | Type 16: S.E. | Type 16: proportion |
|---|---|---|---|---|---|---|
| Area 1 | *-50 | 12 | 0.55 | *-38 | 6 | [illegible] |
| Area 2 | *-47 | 6 | 0.55 | *-37 | 6 | 0.33 |
| Area 3 | *-90 | 10 | 0.50 | *-53 | 14 | 0.33 |
| Area 4 | *-83 | [illegible] | 0.32 | *-59 | 11 | 0.28 |
| Area 5 (area and A.O.D. coefficients) | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] |
| I.O.D. 1 | *-63 | 28 | 0.07 | *-39 | 14 | 0.09 |
| I.O.D. 2 | *-30 | 13 | 0.08 | -13 | 11 | 0 |
| I.O.D. 3 | *-74 | 22 | 0.15 | -21 | 25 | 0.14 |

The dependence of band size upon chromosome stain level is determined by regressing the average band size of the two homologues against the A.O.D. of the No. 2 chromosomes. The data are taken from both slides. A significantly non-zero regression coefficient is marked “*”. The band area regression coefficients are in units of 10 × sq. μm/O.D. unit; the band I.O.D. coefficients are in units of sq. μm × 10⁻². Except in the case of area Method 5, the amount of variance accounted for by the regression is not significantly increased if a multiple regression on chromosome stain level and chromosome area is used instead. For this method only, both coefficients from the multiple regression are given. The A.O.D. of the No. 2 chromosome calculated by Method 2 is used in conjunction with area Methods 4 and 5, since these do not measure I.O.D.
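The kind of regression summarised in Table 4 (average band size against A.O.D., with a slope, its standard error and the proportion of variance accounted for) can be sketched as an ordinary least-squares fit. This is a minimal sketch, not the authors' program; the data below are hypothetical and the function name is an assumption.

```python
import math


def linear_regression(x, y):
    """Least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, standard error of slope, proportion of
    variance in y accounted for by the regression).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = max(syy - slope * sxy, 0.0)  # guard against rounding
    se_slope = math.sqrt(residual_ss / (n - 2) / sxx)
    proportion = (slope * sxy) / syy
    return slope, intercept, se_slope, proportion


# Hypothetical cells: A.O.D. of the number 2 chromosomes and the
# average band area of the two homologues (arbitrary units).
aod = [0.050, 0.075, 0.100, 0.125, 0.150]
band_area = [1.00, 0.90, 0.78, 0.66, 0.55]
slope, intercept, se_slope, proportion = linear_regression(aod, band_area)
print(slope, se_slope, proportion)
```

A negative slope here corresponds to the apparent band shrinkage with increasing stain level described in the text.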

Figure 8(a) is a scatter diagram of the average band area of the two number 1 homologues, plotted against average O.D. of the number 2 chromosomes in the same cell, using Method 2. The diagram shows the considerable variability of these measurements and also the way they cluster differently for cells from different slides. Indeed the mean band areas are quite different on the two slides and it is only by taking into account the stain dependence that we can reconcile these differences. Figure 8(b) shows the same results for Method 5. These show much less stain dependence, but unfortunately more variability than Method 2, even allowing for their difference in magnitude.

Fig. 8. (a) Scatter diagram of the average band area of the two number 1 homologues versus the A.O.D. of the number 2 chromosomes in the cell, using Method 2. The points cluster differently for cells from different slides, and the mean band areas are quite different on the two slides. $s_0$ is the absolute value of stain intensity to which we eventually correct the band areas. (b) A similar scatter diagram for the band areas using Method 5. These show less stain dependence, but relatively more variability, than the Method 2 results. [Axes in both panels: average band area against average O.D. of the No. 2 chromosomes (O.D. units).]

5. NORMALISATION OF C-BAND MEASUREMENTS

As a consequence of the effects remarked on above we have normalised the C-band data by correcting it to absolute values of chromosome stain and area. In considering the precise form which the correction should take, a simple model which suggests itself is one which preserves the absolute difference between homologue band sizes. Although this may be a good correction over a narrow range of stain and contraction, it is clearly wrong over a wide range. A form of correction we consider to be more realistic is one which ensures that the ratio of homologue band sizes remains unaltered by the correction. Such a correction is obtained if, instead of correcting the absolute band size, we correct its logarithm, i.e.

$$\log \mathrm{CBC} = \log \mathrm{CB} - k_s (s - s_0) - k_a (a - a_0),$$

where CBC and CB are the corrected and uncorrected band sizes respectively, $k_s$ and $k_a$ are the regression coefficients determined from the multiple regression of the logarithm of average band size against chromosome stain and area, and $s$ and $a$ are the values of the average number 2 chromosome stain and area in the cell. $s_0$ and $a_0$ are the absolute values of stain and area to which we correct. We have chosen to correct to a number 2 chromosome A.O.D. of 0.075 O.D. units and area of 10 μm². These values are in the mid-ranges of stain and area of the number 2 chromosomes from slide L, the lighter of our slides. In cases where the band size is principally dependent on stain (in practice for all methods other than Method 5), this correction can be reduced to

$$\log \mathrm{CBC} = \log \mathrm{CB} - k'_s (s - s_0),$$

where $k'_s$ is the linear regression coefficient of log CB against stain.
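The logarithmic correction can be illustrated with a short sketch. The reference values 0.075 O.D. units and 10 μm² are those quoted in the text; the regression coefficients, the band sizes and the use of natural logarithms are assumptions made only for this example.

```python
import math

S0 = 0.075  # reference A.O.D. of the number 2 chromosomes (from the text)
A0 = 10.0   # reference area of the number 2 chromosomes, sq. um (from the text)


def correct_band_size(cb, stain, area, k_s, k_a, s0=S0, a0=A0):
    """Apply log CBC = log CB - k_s (s - s0) - k_a (a - a0).

    Natural logarithms are assumed; k_s and k_a must have been fitted
    with the same logarithm base.
    """
    if cb <= 0:
        raise ValueError("band size must be positive")
    return math.exp(math.log(cb) - k_s * (stain - s0) - k_a * (area - a0))


# Hypothetical coefficients and one hypothetical cell.
k_s, k_a = -8.0, 0.02
stain, area = 0.120, 11.5
larger, smaller = 0.98, 0.76  # uncorrected band sizes of the two homologues

larger_c = correct_band_size(larger, stain, area, k_s, k_a)
smaller_c = correct_band_size(smaller, stain, area, k_s, k_a)
print(larger_c / smaller_c, larger / smaller)
```

Because both homologues in a cell are multiplied by the same factor, the ratio of their band sizes is unchanged, and the correction needs no knowledge of which homologue is which.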


It is important to note that the correction can be made to members of homologous pairs without knowledge of the parentage of particular homologues. The disadvantage of this correction is that it might induce inhomogeneities in the variances of corrected band populations, possibly leading to bias in the estimated band sizes. For our results, however, this appears to be a small effect (see Section 7).

6. ESTIMATION OF INDIVIDUAL HOMOLOGUE SIZES

Care has to be exercised in interpreting the data obtained for the C-bands on pairs of homologous chromosomes, because although we know that each cell contains homologues contributed by each parent we cannot match up corresponding homologues between cells; indeed the object of the exercise is to see whether differentiation is possible on a quantitative basis. The situation can be described as follows. There are two variates, $x$ and $y$, distributed according to $p_1(x)$ and $p_2(y)$, say. We take samples from these distributions in pairs, but we cannot associate our observations directly with their originating distributions. The problem is to estimate the means $\mu_1$ and $\mu_2$ of the distributions $p_1$ and $p_2$. Hinkley [8] and Moore [9] give some tests of significance to determine whether there are two distinct mean values when data are sampled in this manner and when there are no between-pair effects. In our situation between-pair effects due to contraction and stain are present. If it is not possible to remove these effects, then inference can be made only on the modulus of the difference in the means $|\mu_1 - \mu_2|$ [8]. We have attempted to remove the between-pair effects using the regression methods referred to in the previous section. It is then possible to estimate the means $\mu_1$, $\mu_2$ which are of genetic interest. Two methods of estimation have been attempted: (i) that of maximum likelihood; (ii) that due to Moore [9].

Maximum likelihood

The general two-distribution solution is attempted, where it is assumed that the pairs derive from two normal distributions $p_1$ and $p_2$ with means $\mu_1$, $\mu_2$ and standard deviations $\sigma_1$, $\sigma_2$. In this notation the probability of obtaining a sample pair $(x, y)$, where $x$, $y$ are assumed independent and of unknown “parentage”, is

$$p_1(x)\,p_2(y) + p_1(y)\,p_2(x).$$

The likelihood $\mathcal{L}$ for a sample of $n$ such independent pairs $(x_i, y_i)$ is

$$\mathcal{L} = \prod_{i=1}^{n} \bigl( p_1(x_i)\,p_2(y_i) + p_1(y_i)\,p_2(x_i) \bigr),$$

which may be written as a function of the means and standard deviations, $\mathcal{L}(\mu_1, \mu_2, \sigma_1, \sigma_2)$. The maximum likelihood estimates for $\mu_1$, $\mu_2$, $\sigma_1$, $\sigma_2$ (denoted $\hat\mu_1$, $\hat\mu_2$, $\hat\sigma_1$, $\hat\sigma_2$) are the values which maximise $\mathcal{L}$, or equivalently $\log(\mathcal{L})$, with respect to the four parameters. These values are determined by the standard method as the solution to the four simultaneous equations

$$\frac{\partial \log \mathcal{L}}{\partial \mu_i} = 0, \quad i = 1, 2; \qquad \frac{\partial \log \mathcal{L}}{\partial \sigma_i} = 0, \quad i = 1, 2.$$
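The log-likelihood for unordered pairs written above can be evaluated directly. The following is a minimal sketch under the stated normal model; the function names and the sample pairs are hypothetical, and no maximisation is attempted here.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and s.d. sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(pairs, mu1, mu2, sigma1, sigma2):
    """log of prod_i [p1(x_i) p2(y_i) + p1(y_i) p2(x_i)]."""
    total = 0.0
    for x, y in pairs:
        term = (normal_pdf(x, mu1, sigma1) * normal_pdf(y, mu2, sigma2)
                + normal_pdf(y, mu1, sigma1) * normal_pdf(x, mu2, sigma2))
        total += math.log(term)
    return total


# Hypothetical corrected band sizes for three cells.
pairs = [(98.0, 74.0), (71.0, 101.0), (90.0, 80.0)]
ll = log_likelihood(pairs, 97.0, 76.0, 13.0, 15.0)
print(ll)
```

The value is unchanged if the two members of any pair are swapped, which is the property that makes the model suitable when the parentage of each homologue is unknown.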


The values are found to the numerical precision of the computer using Newton-Raphson iteration from appropriate starting values. That the iteration converges to a maximum is verified by standard mathematical results on stationary points applied to this four-dimensional situation (see, e.g. [10]). The important question is whether the convergence is to a local maximum or a global maximum within relevant ranges for the four parameters. Factors such as starting values, the iteration procedure and the log (likelihood) surface itself are important here. Well-chosen starting values can reduce this problem. In the present situation we can use the means and standard deviations of the larger and smaller bands in a pair. The value for the larger mean will tend to overestimate $\mu_1$ and the other three values will tend to underestimate $\mu_2$, $\sigma_1$, $\sigma_2$. The iteration procedure should converge “inwards” from the starting values for the means and “outwards” to the values of the standard deviations. This procedure, combined with a scan of the log (likelihood) surface and iteration from different sets of starting values if necessary, should guarantee convergence to the global maximum. Note also that the convergence is numerically sufficiently accurate since the statistical errors in the estimates are appreciably larger than any due to computational error.

Once the maximum likelihood solution is obtained, the question of whether there is a detectable difference in the mean size of homologous bands can be answered by considering the estimated difference $\hat\mu_1 - \hat\mu_2$. If it differs significantly from zero, we may assume the means are distinct. Now

$$\operatorname{var}(\hat\mu_1 - \hat\mu_2) = \operatorname{var}\hat\mu_1 + \operatorname{var}\hat\mu_2 - 2\operatorname{cov}(\hat\mu_1, \hat\mu_2).$$

Estimates of the terms on the right-hand side of this equation are obtained from the information matrix at the point $(\hat\mu_1, \hat\mu_2, \hat\sigma_1, \hat\sigma_2)$.
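The starting values described above (means and standard deviations of the larger and of the smaller band in each pair) can be sketched as follows. The use of the sample standard deviation with divisor n - 1 is an assumption, as are the function name and the example pairs.

```python
import statistics


def starting_values(pairs):
    """Return (mu1_start, mu2_start, sigma1_start, sigma2_start).

    mu1/sigma1 come from the larger band of each pair and mu2/sigma2
    from the smaller band, so the means start too far apart and the
    standard deviations start too small, as noted in the text.
    """
    if len(pairs) < 2:
        raise ValueError("need at least two pairs")
    larger = [max(x, y) for x, y in pairs]
    smaller = [min(x, y) for x, y in pairs]
    return (statistics.mean(larger), statistics.mean(smaller),
            statistics.stdev(larger), statistics.stdev(smaller))


# Hypothetical corrected band sizes for three cells.
pairs = [(100.0, 80.0), (70.0, 95.0), (90.0, 85.0)]
mu1_0, mu2_0, sigma1_0, sigma2_0 = starting_values(pairs)
print(mu1_0, mu2_0, sigma1_0, sigma2_0)
```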
To test for significant departures of $\hat\mu_1 - \hat\mu_2$ from zero, we have considered the statistic $R$, where

$$R = |\hat\mu_1 - \hat\mu_2| / \hat\sigma_{\hat\mu_1 - \hat\mu_2},$$

and $\hat\sigma^2_{\hat\mu_1 - \hat\mu_2}$ is the estimate of $\operatorname{var}(\hat\mu_1 - \hat\mu_2)$ obtained from the information matrix. Monte Carlo calculations show significant departures of $R$ from a folded normal distribution at the sample sizes we consider. Percentage points for $R$ based on 3000 samples of 25 pairs from an N(0, 1) distribution give $R_{0.05} = 4.0$, $R_{0.01} = 5.4$. There is no detectable convergence of $R$ towards normality for samples of size up to 100 pairs. Further, the percentage points do not alter if common variance is not assumed. This was ascertained by considering pairs drawn from an N(0, 1) and an N(0, 1.5) population. The simulations show however that the estimated means $\hat\mu_1$, $\hat\mu_2$ are normally distributed to a good approximation.

The likelihood ratio test can also be employed. The test is of the hypothesis $H_0$: $\mu_1 = \mu_2$, $\sigma_1 = \sigma_2$ against the general four-parameter alternative $H_1$: $\mu_1, \mu_2, \sigma_1, \sigma_2$, where it is not necessary that $\mu_1 = \mu_2$ or $\sigma_1 = \sigma_2$. If we denote the likelihood of the maximum likelihood solution under $H_0$ as $\lambda_0$ and under $H_1$ as $\lambda_1$, then large sample theory gives, under $H_0$,

$$-2 \ln(\lambda_0/\lambda_1) \sim \chi^2_2.$$

However, a significant result from this test could mean that the variances as well as the means could be significantly different. The simulation confirmed the goodness of the $\chi^2_2$ approximation for the likelihood ratio statistic.
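Both test statistics are simple to compute once a fit is available. The sketch below assumes the variances and covariance of the estimated means have already been read from the inverse information matrix; the function names and input numbers are hypothetical. The percentage points for R are those quoted in the text for samples of 25 pairs, and the p-value uses the fact that the upper tail of a chi-squared distribution with 2 degrees of freedom is exp(-x/2).

```python
import math

# Percentage points of R quoted in the text (Monte Carlo, 25 pairs).
R_CRITICAL = {0.05: 4.0, 0.01: 5.4}


def r_statistic(mu1_hat, mu2_hat, var_mu1, var_mu2, cov_mu1_mu2):
    """R = |mu1 - mu2| / sqrt(var mu1 + var mu2 - 2 cov(mu1, mu2))."""
    var_diff = var_mu1 + var_mu2 - 2.0 * cov_mu1_mu2
    if var_diff <= 0:
        raise ValueError("estimated variance of the difference must be positive")
    return abs(mu1_hat - mu2_hat) / math.sqrt(var_diff)


def likelihood_ratio_test(loglik_h0, loglik_h1):
    """Return (-2 ln(lambda0/lambda1), p-value on chi-squared with 2 d.f.)."""
    statistic = -2.0 * (loglik_h0 - loglik_h1)
    p_value = math.exp(-0.5 * statistic) if statistic > 0 else 1.0
    return statistic, p_value


# Hypothetical inputs.
r = r_statistic(10.0, 4.0, 5.0, 6.0, 1.0)
stat, p = likelihood_ratio_test(-10.0, -7.0)
print(r, r > R_CRITICAL[0.05], stat, p)
```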


Moore's method

Moore developed a method to cope with a problem similar to the one discussed here, i.e. the detection of differences between homologous chromosomes. The method is summarised here; for the details readers are referred to the original paper [9]. It has the advantage that it is much simpler to implement than the maximum likelihood method. The method assumes two normal distributions with means $\mu_1$, $\mu_2$ and common variance $\sigma^2$. From the sums and differences of homologous pairs we form $S_1$ and $S_2$, with

$$\sum_{i=1}^{n} (x_i - y_i)^2 = 2S_1, \qquad \sum_{i=1}^{n} \bigl(x_i + y_i - (\bar{x} + \bar{y})\bigr)^2 = 2S_2, \qquad E[2S_2] = 2(n - 1)\sigma^2.$$

Now under the hypothesis $\mu_1 = \mu_2$, $S_1 \sim \sigma^2 \chi^2_n$ and, for $n > 1$, $S_2 \sim \sigma^2 \chi^2_{n-1}$ regardless of hypothesis. The statistic

$$f^* = \frac{S_1/n}{S_2/(n - 1)}$$

gives a test of the null hypothesis, since $f^* \sim F_{n,\,n-1}$ under the null hypothesis and has a non-central F distribution otherwise. A more powerful statistic to test the null hypothesis is

$$f' = \frac{D^2/E_n}{S_2/(n - 1)},$$

where

$$D = \frac{1}{n}\sum_{i=1}^{n} |x_i - y_i|, \qquad E_n = 4/\pi + [2(\pi - 2)/n\pi],$$

and $f' \sim F_{n-1,\,n-1}$ under the null hypothesis. The difference in the means, $d$, is estimated by $\hat{d}$ as given in [9]. From this the means of the two distributions are estimated by

$$\hat\mu_1 = \bar{x} + \tfrac{1}{2}\hat{d}, \qquad \hat\mu_2 = \bar{x} - \tfrac{1}{2}\hat{d},$$

where $\bar{x} = \sum (x_i + y_i)/2n$.

Note that no errors are given for the means and that the assumption of common variance is made. A check on the maximum likelihood solution is given by comparing its likelihood with that of Moore's solution, noting that the likelihood of the former should be greater than or equal to that of the latter.
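The two test statistics of Moore's method can be sketched as below. This is an illustration rather than a transcription of [9]: it assumes S1 is half the sum of squared within-pair differences and S2 is half the sum of squared deviations of the pair sums from their mean, which is what the stated chi-squared distributions require. The function name and the example pairs are hypothetical, and the estimate of the difference in means is not reproduced.

```python
import math


def moore_statistics(pairs):
    """Return (f_star, f_prime) for a list of unordered pairs (x, y)."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    diffs = [x - y for x, y in pairs]
    sums = [x + y for x, y in pairs]
    mean_sum = sum(sums) / n

    s1 = 0.5 * sum(d * d for d in diffs)                # assumed definition
    s2 = 0.5 * sum((s - mean_sum) ** 2 for s in sums)   # assumed definition
    if s2 == 0:
        raise ValueError("pair sums are all equal; statistics undefined")

    f_star = (s1 / n) / (s2 / (n - 1))

    d_bar = sum(abs(d) for d in diffs) / n              # D in the text
    e_n = 4.0 / math.pi + 2.0 * (math.pi - 2.0) / (n * math.pi)
    f_prime = (d_bar ** 2 / e_n) / (s2 / (n - 1))
    return f_star, f_prime


# Hypothetical pairs.
f_star, f_prime = moore_statistics([(1.0, 0.0), (0.0, 2.0), (3.0, 1.0)])
print(f_star, f_prime)
```

Under the null hypothesis f_star would be referred to the F distribution with (n, n - 1) degrees of freedom and f_prime to that with (n - 1, n - 1), as stated in the text.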

7. RESULTS OF ANALYSIS

As stated previously, the data were obtained from 50 cells of the same culture, 25 being on slide L which was the more lightly stained and 25 from slide H which was more heavily stained. The number 1 and 16 chromosomes and C-bands were measured, and also the number 2 chromosomes for normalisation purposes. Visually, both the number 1 and the number 16 C-bands exhibited heteromorphism (Fig. 1). Prior to analysis, the C-band data were corrected using regression coefficients for C-band size versus number 2 chromosome area and average optical density (A.O.D.) which were calculated from all the data. A problem with the data was that the measuring algorithms would occasionally fail consistently for a particular chromosome, and produce inaccurate results for its C-band.

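Table 5. Results for number 1 C-bands by maximum likelihood and Moore's methods for all measurement methods

Area Methods A1 to A5: area in sq. μm × 10⁻². I.O.D. Methods I1 to I3: I.O.D. in sq. μm × O.D. units × 10⁻³.

| | A1 | A2 | A3 | A4 | A5 | I1 | I2 | I3 |
|---|---|---|---|---|---|---|---|---|
| Maximum likelihood results | | | | | | | | |
| Means and standard deviations of larger and smaller bands (starting values of iteration) | | | | | | | | |
| $\mu_{10}$ | 97.8 | 99.0 | 110.0 | 152 | 303 | 167 | 165.4 | 180 |
| $\mu_{20}$ | 73.6 | 75.3 | 77.6 | 113 | 235 | 120 | 120 | 122 |
| $\sigma_{10}$ | 12.4 | 11.8 | 21.0 | 23.6 | 50 | 25.8 | 23 | 40 |
| $\sigma_{20}$ | 14.2 | 13.6 | 21.3 | 25.6 | 46 | 27.4 | 25.3 | 39.6 |
| Fitted means and standard deviations | | | | | | | | |
| $\hat\mu_1$ | 96.2 | 97.8 | 99.5 | 135 | 269 | 161 | 161.4 | - |
| $\hat\mu_2$ | 75.3 | 76.4 | 88.2 | 130 | 269 | 126 | 124 | - |
| $\hat\sigma_1$ | 13.2 | 12.6 | 24.9 | 19.3 | [illegible] | 28.5 | 24.7 | - |
| $\hat\sigma_2$ | 15.7 | 14.6 | 26.8 | 39.5 | 50 | 32.4 | 29.1 | - |
| $\hat\mu_1 - \hat\mu_2$ | 20.9 | 21.4 | 11.3 | 5.8 | 0 | 35 | 37.4 | - |
| Standard deviation $\hat\sigma_{\hat\mu_1 - \hat\mu_2}$ | 3.9 | 3.5 | 26.5 | 12.4 | 50 | 12.4 | 8.3 | - |
| $R = \lvert\hat\mu_1 - \hat\mu_2\rvert / \hat\sigma_{\hat\mu_1 - \hat\mu_2}$ | **5.4 | **6.2 | 0.4 | 0.5 | 0 | 2.8 | *4.5 | - |
| Likelihood ratio statistic | *6.8 | *8.4 | 0.1 | 0 | 5.2 | 1.7 | 4.2 | - |
| Dispersion | 1.0 | 1.1 | 0.3 | 0.1 | 0 | 0.8 | 1.0 | - |
| Stationary point | Max | Max | Max | Max | Not Max | Max | Max | - |
| Moore's methods results | | | | | | | | |
| $\hat\mu_a$ | 94.4 | 96.1 | 95.5 | 142 | 272 | 153 | 154.6 | - |
| $\hat\mu_b$ | 76.8 | 78.1 | 92.5 | 123 | 266 | 134 | 130.8 | - |
| $f^*$ | 1.6 | *1.8 | 1.0 | 1.2 | 1.0 | 1.1 | 1.3 | - |
| $f'$ | *1.8 | **2.0 | 1.2 | 1.3 | 1.0 | 1.4 | *1.7 | - |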

The table gives the maximum likelihood results for the number 1 C-bands, using data from 46 cells. Results for all area and I.O.D. methods are shown. Results from Moore’s methods are given below the maximum likelihood results. “**” and “*” indicate a heteromorphism significant at the 1% and 5% levels respectively.


This might occur, for instance, if there was a particularly shallow stain transition from C-band to chromosome. The data were first screened for such outliers by doing an initial maximum likelihood fit to estimate the C-band means and standard deviations. Then, for each chromosome number, any cell which had a larger C-band more than three standard deviations away from the larger mean, or a smaller band more than three standard deviations from the smaller mean, was classed as an outlier for this homologue. The fit was then repeated without the outliers (as was the prior calculation of the regression coefficients). For both number 1 and number 16 chromosomes, four cells (not the same four for each type) were rejected from the slide H data. This contained poorer cells than slide L, with smaller bands.

The results from 46 cells for the number 1 chromosome C-band are given in Table 5, for all methods. For the maximum likelihood analysis the different methods may be compared by their R factors, where $R = |\hat\mu_1 - \hat\mu_2| / \hat\sigma_{\hat\mu_1 - \hat\mu_2}$, and by their likelihood ratio statistics. Method A2 (i.e. Area Method 2) especially, and Method A1, show highly significant R factors. However, Method A5 does not, which is unfortunate, since it is the least dependent on the chromosome stain level. This is due to the relatively higher band variance under Method 5, even taking into account differences in magnitudes. From these results, the likelihood ratio statistic does not appear to be as powerful a test as the R factor (we intend to investigate this in further Monte Carlo simulations). The C-band area results appear to be somewhat better than those for C-band optical density. Using starting values of the means and standard deviations of the larger and smaller bands to begin the iteration for the maximum likelihood solution, the calculation has for all methods (except A5) converged to a true maximum, generally not far from the starting values.
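The three-standard-deviation screening rule described above can be sketched as follows. The function names are hypothetical, and the fitted means and standard deviations in the example are illustrative numbers of the same order as those in Table 5, not a rerun of the analysis.

```python
def is_outlier(pair, mu_large, sd_large, mu_small, sd_small, n_sd=3.0):
    """True if the larger band is more than n_sd standard deviations from
    the larger fitted mean, or the smaller band is more than n_sd standard
    deviations from the smaller fitted mean."""
    larger, smaller = max(pair), min(pair)
    return (abs(larger - mu_large) > n_sd * sd_large
            or abs(smaller - mu_small) > n_sd * sd_small)


def screen_cells(pairs, mu_large, sd_large, mu_small, sd_small):
    """Keep only the cells that pass the screening rule."""
    return [p for p in pairs
            if not is_outlier(p, mu_large, sd_large, mu_small, sd_small)]


# Hypothetical cells; the second has an implausibly large band.
cells = [(100.0, 80.0), (75.0, 140.0), (92.0, 70.0)]
kept = screen_cells(cells, 97.8, 12.6, 76.4, 14.6)
print(kept)
```

In the procedure described in the text, the regression coefficients and the maximum likelihood fit would then be recomputed on the retained cells.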
The results for Moore's methods are also given in Table 5 (see Section 6). $f'$ is generally larger than $f^*$, so $f'$ appears to be a better significance test at these sample sizes, which is in accordance with Moore's findings. Although agreement is generally quite good, particularly for A1 and A2, the significance levels for Moore's methods appear to be slightly lower than for maximum likelihood; also, the difference in the means is somewhat less, perhaps indicating a greater sensitivity to the small within-cell correlation which unfortunately exists even after the regression correction.

We next examine differences between the two slides. Table 6 shows the results for the number 1 C-bands using the data from each slide separately. Results for Method A2 only are shown, as this seems the best method according to Table 5. The slide L data show highly significant results, but the slide H results are not significant. This is partly due to the poorer quality of the data on slide H and partly to the large corrections applied to the slide H data to correct them to the mid-range of the slide L data. Any deficiency in the correction model should tend to reduce the significance level. For the degree of heteromorphism to be the same on both slides, the estimates of the larger C-band mean $\hat\mu_a$ (= max($\hat\mu_1$, $\hat\mu_2$)), the smaller mean $\hat\mu_b$ (= min($\hat\mu_1$, $\hat\mu_2$)), and the difference between the means ($\hat\mu_a - \hat\mu_b$), should not differ between the slides after correction. Neither $\hat\mu_b$ nor ($\hat\mu_a - \hat\mu_b$) showed significant differences between slides, but, if only statistical error is considered, $\hat\mu_a$ almost differs significantly (at the 5% level) between the two slides. However, $\hat\mu_a$ and $\hat\mu_b$ are also affected by error in the regression coefficients. The errors in $\hat\mu_a$ and $\hat\mu_b$ due to regression coefficient error were derived using a regression against stain alone (stain is the dominant independent variable in the regression), and by making small changes in the regression coefficient about its estimated value. From Table 6,


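Table 6. Number 1 C-band areas by Method A2, for slides L and H separately

| | Slide L | Slide H |
|---|---|---|
| Means and standard deviations of larger and smaller bands (starting values for iteration) | | |
| $\mu_{10}$ | 103.0 | 94.0 |
| $\mu_{20}$ | 76.0 | 74.2 |
| $\sigma_{10}$ | 10.6 | 11.5 |
| $\sigma_{20}$ | 12.2 | 15.2 |
| Fitted values | | |
| $\hat\mu_1$ | [illegible] | [illegible] |
| $\hat\mu_2$ | [illegible] | [illegible] |
| $\hat\sigma_1$ | 10.6 | 11.9 |
| $\hat\sigma_2$ | 12.4 | 17.2 |
| $\hat\mu_1 - \hat\mu_2$ | 26.3 | 14.4 |
| Standard deviation $\hat\sigma_{\hat\mu_1 - \hat\mu_2}$ | 3.4 | 6.2 |
| $R$ | **7.8 | 2.3 |

Heteromorphism has easily been detected from slide L, but the slide H results are not significant. “**” and “*” indicate a heteromorphism significant at the 1% and 5% levels respectively. Area units are sq. μm × 10⁻².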

Table 7. Confidence limits on number 1 C-band mean areas and the difference in areas (Method A2), with and without regression error-derived term

| | Limits for statistical errors only | Limits for statistical + regression errors |
|---|---|---|
| $\hat\mu_a$ | [illegible] | [illegible] |
| $\hat\mu_b$ | [illegible] | [illegible] |
| $\lvert\hat\mu_a - \hat\mu_b\rvert$ | 7.4 to 35.4 | 7.2 to 35.6 |

The approximate 95% confidence limits are given for $\hat\mu_a$, $\hat\mu_b$ and $\hat\mu_a - \hat\mu_b$, for the A2 results on number 1 bands from 46 cells, first considering only the statistical errors, and then including the effect of the regression coefficient error. It has been assumed that the statistical and regression error-derived terms can be combined in Gaussian fashion. Using the Monte Carlo results, the limits for $\hat\mu_a$, $\hat\mu_b$ have been derived assuming they are normal variates, while the limit for $\hat\mu_a - \hat\mu_b$ has been taken as being $4\hat\sigma_{\hat\mu_1 - \hat\mu_2}$ from the mean. Area units are sq. μm × 10⁻².
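The limits on the difference can be checked with a few lines of arithmetic. The sketch below combines independent error terms in quadrature (the "Gaussian fashion" of the note) and places the limits four standard deviations either side of the estimate. The inputs 21.4 and 3.5 are the Method A2 difference and its standard deviation as read from Table 5; the regression error term of 0.6 is a purely illustrative value, and the function names are hypothetical.

```python
import math


def combine_errors(statistical_sd, regression_sd):
    """Combine two independent error terms in quadrature."""
    return math.hypot(statistical_sd, regression_sd)


def difference_limits(difference, sd, n_sd=4.0):
    """Approximate 95% limits taken n_sd standard deviations from the
    estimate, following R_0.05 = 4.0 from the Monte Carlo results."""
    return difference - n_sd * sd, difference + n_sd * sd


# Statistical error only.
low, high = difference_limits(21.4, 3.5)
print(round(low, 1), round(high, 1))

# With an illustrative (hypothetical) regression error term added.
total_sd = combine_errors(3.5, 0.6)
print(difference_limits(21.4, total_sd))
```

With statistical error alone this reproduces the 7.4 to 35.4 interval quoted for the difference in Table 7.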


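the C-band means from slide L are almost independent of the regression coefficient error, since we correct to the mid-range for slide L. However, the means for slide H show errors due to regression which are comparable with the statistical errors on the means. This is due to the larger corrections necessary for slide H. If this regression error term is included, no significant difference can be detected in the larger C-band means from either slide.

Table 8. Results for number 16 C-bands by maximum likelihood for all measurement methods

Area Methods A1 to A5: units are sq. μm × 10⁻². I.O.D. Methods I1 to I3: units are sq. μm × O.D. units × 10⁻³.

| | A1 | A2 | A3 | A4 | A5 | I1 | I2 | I3 |
|---|---|---|---|---|---|---|---|---|
| Maximum likelihood results | | | | | | | | |
| Means and standard deviations of larger and smaller bands (starting values for iteration) | | | | | | | | |
| $\mu_{10}$ | 61.8 | 65.8 | 63.8 | 104.5 | 233 | 97 | 98 | 99 |
| $\mu_{20}$ | 44.5 | 49.4 | 38.6 | 67.5 | 164 | 66 | 71 | 58 |
| $\sigma_{10}$ | 14.3 | 12.8 | 24.1 | 36.6 | 68 | 25 | 22.5 | 44 |
| $\sigma_{20}$ | 13.7 | 11.8 | 18.9 | 31.8 | 69 | 22.5 | 19.2 | 30 |
| Fitted values | | | | | | | | |
| $\hat\mu_1$ | 53.6 | 58.9 | 55.0 | 88.5 | 198 | 84 | 87.8 | 86 |
| $\hat\mu_2$ | 52.7 | 56.4 | 47.1 | 83.0 | 198 | 79.9 | 81.9 | 70.6 |
| $\hat\sigma_1$ | 17.6 | 16.8 | 28.0 | 43.5 | 75 | 31.8 | 28.5 | 50 |
| $\hat\sigma_2$ | 14.8 | 12.0 | 20.3 | 32.6 | 75 | 23.6 | 19.7 | 31 |
| $\hat\mu_1 - \hat\mu_2$ | 0.9 | 2.5 | 7.9 | 5.5 | 0 | 4.1 | 5.9 | 15.4 |
| Standard deviation $\hat\sigma_{\hat\mu_1 - \hat\mu_2}$ | 6.6 | 6.3 | 7.8 | 12.1 | 25.3 | 9.7 | 9.1 | 11.4 |
| $R = \lvert\hat\mu_1 - \hat\mu_2\rvert / \hat\sigma_{\hat\mu_1 - \hat\mu_2}$ | 0.1 | 0.4 | 1.0 | 0.5 | [illegible] | 0.4 | 0.6 | 1.3 |
| Dispersion | 0.1 | 0.1 | 0.2 | 0.1 | [illegible] | [illegible] | [illegible] | [illegible] |
| Stationary point | Max | Max | Max | Max | [illegible] | [illegible] | [illegible] | [illegible] |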

The table gives the maximum likelihood results for the No. 16 C-bands, using data from 46 cells. Results for all area and I.O.D. methods are shown. Significant heteromorphism was not detected by any method using the maximum likelihood R factor. Moore’s methods and the likelihood ratio statistic also did not detect any heteromorphism.

Table 7 gives approximate 95 per cent confidence limits for $\hat\mu_a$, $\hat\mu_b$ and $\hat\mu_a - \hat\mu_b$, for the number 1 bands from all 46 cells (by Method A2), considering first only the maximum likelihood statistical error and then combining this with the stain regression coefficient error. The confidence limits are unfortunately quite wide, though hopefully the regression error components at least should be reducible in the future. It should be noted that $\hat\mu_a$ and $\hat\mu_b$ both have the same form of linear dependence on the stain regression coefficient $k_s$ (with the same gradient for both lines), while $\hat\mu_a - \hat\mu_b$ is almost completely independent of $k_s$. The logarithmic model which we have taken as our example of a constant ratio model (for its simplicity as much as anything) is only one of many possible models exhibiting the constant ratio property. With the scant data available to us it is not possible to judge the relative merits of other constant ratio models at this stage. The logarithmic model may well require modification in the light of further information.

Although the analysis techniques have detected a heteromorphism in the number 1 chromosome C-bands, we have had no such success with the number 16 chromosome C-bands, even though a small size difference appears to exist by eye. The results from all cells for the number 16 bands are shown in Table 8, for all methods. None of the methods has managed to detect any significant difference between the band means. Assuming that heteromorphism is present, a rough upper limit to its size can be obtained using the means and variances of the larger and smaller bands. These indicate that, while the band sizes are smaller for the number 16 chromosomes than the number 1 chromosomes, the ratio of smaller to larger band is about the same (~ ¾), so that the difference between the means has reduced proportionately. However, the variances have not, with the result that the dispersion of the number 16 bands is probably no more than 70% that of the number 1 bands. Unfortunately we do not yet know the power (i.e. the ability to detect heteromorphism) of the maximum likelihood test as a function of dispersion, but (based upon powers of $f'$ presented in [9], and assuming maximum likelihood is similar), we feel that there is an appreciable chance that we would fail to detect a heteromorphism having this dispersion at this sample size.

8. DISCUSSION

The results of the experiment show that the measuring method with the highest resolution (Method A2) is also one with a fairly strong stain dependence. In this situation, an important requirement of a practical measuring system is that the same stain dependence should be exhibited at least by all C-bands of a particular chromosome type, for all preparations and all individuals and hopefully even by all C-bands of all chromosome types.
Some support for the idea of such a “universal” stain dependence comes from the fact that the regression coefficient for the number 16 C-bands from our data is not significantly different from the number 1 C-band regression coefficient, even though the number 16 bands are smaller than the number 1 bands. The effect of stain dependence could be minimised by accepting only cells that lie in a fairly narrow stain range. This would make the exact form of the regression model chosen of less importance, and reduce the effect of imprecision in the regression coefficients. The disadvantage of this strategy is that the harvest of acceptable cells would be reduced. To overcome this problem, we are trying to make the chromosome arm staining more reproducible.

The analysis technique has had some success in detecting the heteromorphism in the number 1 C-bands. However, the resolution of the method is currently such that, at the sample sizes considered, we will probably be limited to detecting medium to large heteromorphism among C-bands that are not too small. To increase resolution without increasing sample size, it is necessary to reduce C-band variance. A significant component of the C-band variance is caused by drift in the image dissector. Short term drift (over the scan time) manifests itself in the repetition error. From Table 1, this is about 9% for number 1 bands using Method A2. Given that the C-band standard deviation is about 15% of the mean C-band area and that two repeated measurements are made on each object, the variance due to repetition error is about 18% of the total variance. Long term drift (over several hours) probably contributes to the remainder of the variance (and to within-cell correlation), but the amount is not known.

Future work will include attempts to reduce the C-band measurement variance and to examine whether or not C-bands of a particular type have the same stain dependence. Before attempting to examine between-person differences, it must also be established that there are no differences between different cell cultures from the same individual.

SUMMARY

A computer-based system for obtaining quantitative measurements of C-bands in human chromosomes is described, together with various procedures for C-band and chromosome area and integrated O.D. (optical density) measurement. These involve determination of O.D. values for the local background and the chromosome arm density plateau using a histogram of O.D. weighted by 1/(∇g² + 1). Band size is seen to depend on the state of contraction of the metaphase, though the band appears to be a relatively inextensible object compared to the chromosome (at least for the number 1 bands studied). When bands are measured with reference to the chromosome arm density plateau, the size is strongly dependent on the level of staining in the cell, though this is not the case when the background adjacent to the chromosome is used as a baseline. A normalisation procedure is used which corrects band size to absolute values of chromosome stain and area, and which conserves the ratio of homologue band sizes. Statistical techniques for determining whether a difference in homologue band sizes can be detected are also considered. The problem is unusual in that observations are taken in pairs in which the ordering of each pair is unknown. The various band measurement procedures and the statistical techniques were compared using data from visually heteromorphic number 1 and 16 chromosome C-bands from 50 cells.
Heteromorphism was detected in the number 1 bands by several methods, one (Method A2) which uses the chromosome arm density plateau as a band measurement baseline being most sensitive. The mean size of each band population was determined to within 8% at the 95% confidence level and the means were found to be significantly different (p < 0.01), the ratio of the smaller to the larger being about 0.78. The band population standard deviations were about 15% of the average number 1 band size. None of the methods detected heteromorphism in the number 16 bands, due, we believe, to the reduced power of the statistical tests at their lower dispersion. Detection of heteromorphism in the number 1 bands provides quantitative confirmation of the common visual impression that homologous chromosomes may have systematically different C-band sizes.

Acknowledgements-We should like to thank Prof. N. E. Morton for suggesting the maximum likelihood analysis technique, Dr. A. Sumner for his method of C-band and chromosome area determination and Miss K. Buckton and Prof. H. J. Evans for advice regarding biological aspects of the project. Also Mrs. M. Stark and A. Chisholm for carrying out the measurements, sometimes in difficult enough video conditions, and Mrs. V. Gibb for her careful work on a much redrafted manuscript.

REFERENCES

1. Paris Conference (1971): Standardisation in Human Cytogenetics. Birth Defects: Original Article Series, VIII, 7, The National Foundation, New York (1972).
2. D. A. Hungerford, Leukocytes cultured from small inocula of whole blood and the preparation of metaphase chromosomes by treatment with hypotonic KCl, Stain Tech. 40, 333 (1965).
3. A. T. Sumner, A simple technique for demonstrating centromeric heterochromatin, Exp. Cell Res. 76, 304 (1972).
4. D. Rutovitz, J. Cameron, A. S. J. Farrow, R. Goldberg, D. K. Green and C. J. Hilditch, Instrumentation and organisation for chromosome measurement and karyotype analysis, 5th Pfizer Inc. Symp. (Edinburgh), Edinburgh University Press, Edinburgh (1969).
5. D. K. Green and J. Cameron, Metaphase cell finding by machine, Cytogenetics 11, 475 (1972).
6. D. C. Mason and D. K. Green, Automatic focusing of a computer-controlled microscope, Trans. IEEE BME-22, 4 (1975).
7. M. L. Mendelsohn, B. H. Mayall and B. H. Perry, Generalised grayness profiles as applied to edge detection and the organization of chromosome images, Advances in Medical Physics (J. S. Laughlin and E. W. Webster, Eds.), p. 327. 2nd Int. Conf. Medical Physics, Inc., Boston (1971).
8. D. V. Hinkley, Two-sample tests with unordered pairs, J. R. Statist. Soc. (Series B) 35, (2) 337 (1973).
9. D. H. Moore, Do homologous chromosomes differ?, Cytogenetics and Cell Genetics 12, (5) 305 (1974).
10. H. S. W. Massey and H. Kestelman, Ancillary Mathematics, Pitman, London (1964).

About the Author-DAVID C. MASON was born in Abertillery, Wales, in 1942. He received an honours degree in Physics (1963) and a Ph.D. in Elementary Particle Physics (1968), from Imperial College, University of London. He joined the staff of the Medical Research Council in 1968 and has been working on pattern recognition applied to human chromosomes and related problems, in the Clinical and Population Cytogenetics Unit since then.

About the Author-IAN JAMES LAUDER was born in North Berwick, Scotland, in 1947. He graduated from Edinburgh University in 1970 with a B.Sc. honours in Mathematics and Theoretical Physics. He received an M.Sc. in Statistics from the University of Kent at Canterbury, England, in 1972. Since then he has been working as a statistician with the Medical Research Council and has interests in both theoretical and applied aspects of statistics and mathematics in Biology, Genetics and Medicine.

About the Author-DENIS RUTOVITZ received the degrees of B.Sc., M.Sc. and Ph.D. (1956) in Pure Mathematics from the University of Cape Town and later completed a second Ph.D. at Cambridge University. After teaching in the Mathematics Departments of the Universities of Berkeley, Manchester, Nairobi and Sussex, he joined the staff of the Medical Research Council in 1965 and has been working on pattern recognition and related problems in the Clinical and Population Cytogenetics Unit since then.

About the Author-GEORGE SPOWART was born in Dunfermline, Scotland, on 16th January 1936. He was on the technical staff of the Scottish Marine Biological Association for 11 years before joining the Medical Research Council in 1969. He is a member of the Cytogenetics Section and part of his work is to liaise with the Pattern Recognition Section.