Statistical analysis of cell directions in applied current experiments


Computer Methods and Programs in Biomedicine 20 (1985) 189-199

Elsevier

CPB 00706


Stephen M. Ross

MRC Group in Periodontal Physiology, Room 4384, Medical Sciences Building, University of Toronto, Toronto, Ontario, Canada M5S 1A8

A program has been developed to analyze cell directions sampled during the course of experiments in which uniform electrical currents are passed over cells grown in tissue culture. The program can test data sets for statistically significant orientation, i.e. clustering about a certain mean angle, and can also perform a pairs test between two data sets to determine whether there is a significant difference in orientation between them.

Keywords: Applied current; Directional data; Statistics

1. Introduction

From time to time in biological studies data are sampled in the form of directional data, either as angles on the circle or in the form of times of occurrence of a periodic variable. A researcher might wish to calculate a mean angle for the data, analogous to the arithmetic mean calculated from linear data. In the case in which the angles are dispersed over a range of values one might also wish to perform a statistical test to address the question of whether there is significant orientation, or tendency of the sampled values to cluster about an expected direction. There are some unique problems involved in analyzing angular data [1], and performing statistical analysis on such data will require some modification of the procedures used with linear data.

Consider a set of angular data values, $\theta_i$, $i = 1, \ldots, n$. They might be sampled as angles relative to a fixed direction, as in animal homing experiments or the experiment to be described below or, if they represent times of occurrence, $t_i$, sampled from some process known to have period $T$, they can be converted to angles by the operation $\theta_i = 2\pi t_i/T$. Calculation of the arithmetic mean can produce inconsistent results; for example the arithmetic mean of the angles 40°, 0°, -3°, 10°, -80° is -6.6° while the arithmetic mean of the equivalent angles 40°, 0°, 357°, 10°, 280° is 137.4°. We could avoid this problem by using the convention of folding all angles into a specified range, either [-180°, +180°] ($[-\pi, +\pi]$) or [0°, 360°] ($[0, 2\pi]$), but even then we would have the problem of deriving useful tests of hypotheses concerning the mean, or the concentration of data about such a mean. This is not a very fruitful approach. One would expect that better results would be produced by an approach making use of trigonometric functions, which operate modulo $2\pi$, and it turns out that a convenient way to do this is to use vector arithmetic, as will be described below.

Clinical researchers have discovered that application of electrical current can cause healing of, for example, intractable skin ulcers [2] and bone fracture non-unions [3]. Concomitant with the description of in vivo effects of electrical current [4], interest has been generated among cell biologists in the effects of exogenously applied electrical currents on cells grown in tissue culture. When uniform fields of electrical current are applied to cells growing on a flat glass or plastic tissue culture dish substrate they have almost universally been observed to align with their long axis perpendicular to the direction of current flow [5-7]. Fig. 1 illustrates this phenomenon.

This article presents a program for the statistical analysis of cell directions when low-level currents, resulting in incomplete cell alignment, are applied. The program can be used to discern threshold levels for the applied current effect, and can be used to make comparisons between low-level current effects on different cell types.

0169-2607/85/$03.30 © 1985 Elsevier Science Publishers B.V. (Biomedical Division)
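The inconsistency of the arithmetic mean noted in this section, and the conversion of times of occurrence to angles, can be illustrated with a short sketch. It is written in Python purely for illustration (the program described in this article is written in FORTRAN IV), and the function names are hypothetical. The vector mean it computes anticipates the method of Section 3.1.

```python
import math

def arithmetic_mean_deg(angles):
    """Ordinary arithmetic mean of angles given in degrees."""
    return sum(angles) / len(angles)

def vector_mean_deg(angles):
    """Direction, in degrees, of the sum of the unit vectors."""
    c = sum(math.cos(math.radians(a)) for a in angles)
    s = sum(math.sin(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c))

def time_to_angle(t, period):
    """Convert a time of occurrence to an angle: theta = 2*pi*t/T."""
    return 2.0 * math.pi * t / period

signed = [40, 0, -3, 10, -80]      # angles folded into [-180, +180]
folded = [40, 0, 357, 10, 280]     # the same directions in [0, 360]

# The arithmetic means disagree (about -6.6 and 137.4 degrees) ...
print(arithmetic_mean_deg(signed), arithmetic_mean_deg(folded))
# ... while the vector mean is the same for both representations.
print(vector_mean_deg(signed), vector_mean_deg(folded))
```

The two representations describe identical directions, so any usable definition of the mean should return the same direction for both, which only the vector mean does.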

Since Lord Rayleigh's pioneering work [8], the theory of analysis of directional data has been developed to the point that fairly reliable methodologies are now available for performing statistical tests. Two monographs are available on the subject of circular statistics: Mardia's [1] is a more rigorous work, while that of Batschelet [9] is in a "cookbook" style, designed for practical use by biologists. These books are oriented towards the older, hand calculation style of analysis in which a statistic is calculated and then compared to a corresponding significance value in a table. The method presented here is, in contrast, the result of returning to the original literature to obtain the approximations to the distribution functions that allow the computation of significance values for the circular statistics. These values can be printed along with a plotted display illustrating the data values and the data analysis.

There does not seem to be any uniformity of technique in the analysis of alignment data among those working in the field of applied current experiments. Cooper and Keller [6] simply calculated the projected length/width ratios, while Erickson and Nuccitelli [5] used the analysis technique described by Curray [10]. Durand and Greenwood [11] pointed out that there is a better approximation to the distribution of the Rayleigh test statistic than the simple exponential expression originally developed by Lord Rayleigh and used by Curray, and that there is a more powerful test than the Rayleigh test when the true angle of orientation is known. It is hoped that this article will promote the use of more consistent analytical techniques among biologists working with orientation data, and especially the use of the V-test to be described below, which, under certain circumstances, is a uniformly most powerful test [11].

2. Hardware and software specifications

Fig. 1. A. Photomicrograph of human gingival fibroblast cells showing random orientation in the absence of applied electrical current. B. Human gingival fibroblast cells have been treated for 6 h by applying 158.7 mA/cm² electrical current, resulting in a voltage drop in the culture medium of 10 V/cm. Current flowed horizontally from left to right during the experiment. Photographed via phase contrast microscopy; the scale bar denotes 100 μm.

2.1. Hardware description

The program has been implemented on a DEC LSI-11/23 minicomputer with 192 K of main memory, a 5 M Winchester hard disk and a floppy disk drive which can use single- or double-sided 8-inch floppies in either RX01 or RX02 (single- or double-density) formats. A VT-100 terminal is connected as the console and an Integral Data Systems Prism 80 printer is connected to the printer port, but controlled via the LS rather than the LP handler. The printer can be switched into graphics mode under software control to produce plotted output. A digitizing tablet, the Summagraphics Bit Pad One, is connected via a serial port to allow data sampling directly from photographs.

2.2. Data acquisition

In the author's experiments cell angles and lengths are measured for subsequent analysis. If these have already been sampled manually the numerical data can simply be entered on the console. Usually, however, the experimental results are recorded by means of micrographs (prints or 35 mm slides) as illustrated in Fig. 1. In this case photographs are laid or projected onto the surface of the bitpad, and the head and tail of each cell are sampled using the bitpad cursor. From this information the cell length and the angle made by the cell's long axis with the direction of uniform current flow are calculated. These data can then be stored in a disk file if desired. After sampling of a particular photograph has been completed a plot will be produced showing the cell lengths and angles relative to the direction of applied current (which is taken as 0°); this is illustrated in Fig. 2.
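The conversion of a sampled head and tail position into a cell length and angle can be sketched as follows. This is an illustration in Python rather than the program's own FORTRAN code. The function name, the `scale` parameter and the assumption that current flows along the positive x axis of the tablet are hypothetical choices made for the example.

```python
import math

def cell_length_and_angle(head, tail, scale=1.0):
    """Return (length, angle in degrees) of a cell's long axis.

    head, tail: (x, y) tablet coordinates of the two sampled points.
    scale: distance units per tablet unit (hypothetical parameter).
    The current is assumed to flow along +x, which is taken as 0 degrees.
    """
    dx = head[0] - tail[0]
    dy = head[1] - tail[1]
    length = math.hypot(dx, dy) * scale
    # atan2 keeps the correct quadrant, giving an angle in (-180, 180].
    angle = math.degrees(math.atan2(dy, dx))
    return length, angle

length, angle = cell_length_and_angle(head=(3.0, 4.0), tail=(0.0, 0.0))
print(length, angle)
```

When the head and tail of a cell cannot be distinguished, the sign of the angle obtained this way is arbitrary, which is why the angles are later treated as axial data.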

2.3. Program organization

Fig. 2. A. Sample of the graphics output produced when the cells illustrated in Fig. 1A are sampled using the digitizing tablet. The long axis of each cell is plotted. At the tail of each cell a line is drawn parallel to the direction of current flow to facilitate visual judgement of cell angles. The direction of current flow is taken as 0°. After the plot of cell axes is finished, data values of the angles made by the cells' long axes with the direction of current flow can be printed. B. Output from sampling of the cells illustrated in Fig. 1B.

The entire program has been written in DEC FORTRAN IV, except for the handlers for the digitizing tablet, which have been written in DEC MACRO-11. The main program is organized as a "monitor" which has the function of reading a two-letter mnemonic command corresponding to a desired operation and then calling the appropriate subroutine to perform that function; in this respect it is similar to another program described previously by the author [12]. A list of the commands implemented is given in Table 1. Data can be thought of as residing in one of ten memory areas, or "buffers", numbered 0 to 9. Operations on different buffers can be specified by appending numbers to the two-letter commands. For example, the command VT2 will perform the V-test (described below) on the angles contained in buffer 2. 512 × 10 REAL*4 arrays store the cell lengths and angles, and a 108 × 260 INTEGER*2 array is used to store a "bit map" of the plot to be produced after analysis. The entire program, with these large data arrays, fits into our computer's memory, but on a machine with less main memory one might be forced to link the program in overlaid format as described in [12] so that only the subprogram being called by the main program is in memory at any one time, with the other parts of the program residing on disk.

The plotting portions of the program have been designed for easy interconvertibility. The plotting subroutines have a format similar to the Calcomp subroutine formats that many incremental plotters use. Rather than generating pen control codes, however, bits are set in an array MAP(108, 260) in (X, Y) positions corresponding to the desired dark areas in the final plot. The Prism 80 printer has a 7-pin print head which scans 1/120" to the right when a byte is sent in graphics mode, and pins 1-7 are actuated according to whether bits 0-6 are set high or low. Once a plot has been set up in MAP the array is dumped to the printer byte by byte, with the appropriate carriage return-line feed codes at the end of each row. It should therefore be a simple matter to adapt the program, either to incremental plotters which use Calcomp-like subroutine calls simply by linking in those alternative subroutines, or to other printers by setting up the dump of the array MAP in a format compatible with the printer in question.

TABLE 1
Command functions implemented in the program

Command   Function
AV        Performs analysis of variance on linear data
DA        Allows doubling of angles in the Rayleigh test, V-test and pairs test
EX        Exits from the program
FF        Sends a form feed to the printer
FR        Reads a data file from disk onto the specified buffer
FW        Writes the data contained in the specified buffer into a disk file
PR        Prints out the data contained in the specified buffer
PT        Performs the pairs test on angular data contained in two specified buffers
RD        Allows entry of data on the console
SA        Allows sampling of cell lengths and angles on the digitizing tablet
SC        Allows calculation of distance scale on the digitizing tablet either from a scale bar or from a photograph of a stage micrometer
TY        Types out the data contained in the specified buffer on the console

3. Computational methods

3.1. The Von Mises distribution

As was mentioned in Section 1, we must switch to vector arithmetic when analyzing directional data. We define unit vectors, $\mathbf{D}_i$, having angles $\theta_i$ and components $(\cos\theta_i, \sin\theta_i)$ and define a mean vector, $\bar{\mathbf{D}}$:

$$\bar{\mathbf{D}} = \frac{1}{n}\sum_{i=1}^{n}\mathbf{D}_i = \frac{1}{n}(C, S) \qquad (1)$$

where

$$C = \sum_{i=1}^{n}\cos\theta_i, \qquad S = \sum_{i=1}^{n}\sin\theta_i$$

The mean angle, $\bar{\theta}$, is simply the direction of the mean vector:

$$\bar{\theta} = \tan^{-1}(S/C) \qquad (2)$$

and the length of the mean vector, $D$:

$$D = \frac{1}{n}\left(C^2 + S^2\right)^{1/2} \qquad (3)$$

will be determined by the amount of "dispersion" in the angles (Fig. 3). It is this mean vector that we use in our statistical tests for concentration of angles about an expected value.

Fig. 3. An illustration of how the length of the mean vector is determined by the amount of dispersion in angular data. Both vectors have mean angle 30°, but the unit vectors on the left tend to cancel each other out under vector addition, while the resultant of the vectors on the right is very close in length to the sum of the unit vector lengths, 5 units.

Several distributions for angular data have been proposed; see [1] or [13] for details and references. The one which has proved most generally useful and which will concern us here is the Von Mises distribution. An angular variable, $\theta$, can be said to have a Von Mises distribution if it has the following probability density function (pdf):

$$p(\theta) = \frac{\exp[k\cos(\theta - \mu)]}{2\pi I_0(k)} \qquad (4)$$

where $\mu$ is the population mean angle, $k$ is called the concentration parameter and $I_0(k)$ is the modified Bessel function of the first kind and order zero:

$$I_0(k) = \sum_{i=0}^{\infty}\frac{(k/2)^{2i}}{(i!)^2} \qquad (5)$$

One can see that the pdf peaks when $\theta = \mu$ and tapers off to a minimum at $\theta = \mu + \pi$, the direction opposite to $\mu$. The parameter $k$ determines how "sharply peaked" the pdf is, i.e. how quickly $p(\theta)$ declines as $\theta$ is shifted from $\mu$. As $k$ declines $p(\theta)$ becomes a progressively shallower curve until at $k = 0$ the uniform pdf of equal probability for all angles around the circle is obtained. When $k$ is large sampled angles will tend to cluster around the mean angle. Gumbel, Greenwood and Durand [14] attempted to coin the name "circular normal distribution" for the pdf given above, but the name could be applied as well to other distributions which are analogous to the normal distribution, and most workers refer to it by the name of its inventor today. It can be shown that $\bar{\theta}$ is a maximum likelihood estimator of $\mu$ [1,9].
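The mean vector of eqs. (1)-(3) and the Von Mises pdf of eqs. (4) and (5) can be sketched as below. This is an illustrative Python rendering under stated assumptions, not the author's FORTRAN source: the function names are hypothetical, angles are taken in radians, and `atan2` is used in place of $\tan^{-1}(S/C)$ so that the mean angle falls in the correct quadrant.

```python
import math

def mean_vector(angles):
    """Return (mean angle, mean vector length D) for angles in radians."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles)     # C of eq. (1)
    s = sum(math.sin(a) for a in angles)     # S of eq. (1)
    return math.atan2(s, c), math.hypot(c, s) / n   # eqs. (2) and (3)

def bessel_i0(k, tol=1e-12):
    """I0(k) from the series of eq. (5), summed until terms are negligible."""
    term, total, i = 1.0, 1.0, 0
    while term > tol * total:
        i += 1
        term *= (k / 2.0) ** 2 / (i * i)     # ratio of successive terms
        total += term
    return total

def von_mises_pdf(theta, mu, k):
    """Probability density of eq. (4)."""
    return math.exp(k * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(k))

tight = [math.radians(a) for a in (25, 28, 30, 32, 35)]
spread = [math.radians(a) for a in (-60, -10, 30, 70, 120)]
print(mean_vector(tight))    # D close to 1
print(mean_vector(spread))   # D noticeably smaller
```

With $k = 0$ the density reduces to the uniform value $1/2\pi$, and the two sample sets show how dispersion shortens the mean vector, as in Fig. 3.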

3.2. The Rayleigh test

We may want to test the hypothesis that a sample drawn from a Von Mises-distributed population of angles has a uniform distribution, i.e. $k = 0$, indicating that there is no significant orientation of angles in the sample. To do this we may apply the Rayleigh test, as illustrated in Fig. 4, in which we compare the calculated mean vector length, $D$, to the mean vector length, $E$, expected for a given significance such as 5% or 1%. $E$ is calculated from the distribution of $D$, given the number of samples, $n$. If $D$ is smaller than $E$ we accept the hypothesis that the angles are randomly oriented, while if $D$ is greater than $E$ we accept the alternative hypothesis that there is significant orientation of the sampled angles. This makes sense in the light of the concept illustrated in Fig. 3; a set of sampled vectors will never exactly cancel, and we compare the length of the resultant to the value which could be expected to occur purely by chance, based on the assumption that the samples are drawn from a Von Mises-distributed population.

It may be that there is no significance to the head or tail of a specimen being measured, or that they are indistinguishable. Arbitrarily assigning a positive or negative direction to the cells may bias the sample in this situation, so we must treat the angles as axial data, which are confined to the [0°, 180°] domain, and then apply the Rayleigh test after doubling all the angles to map the data onto the [0°, 360°] range. In the program this function can be specified by the DA command. Note that when we double the angles the calculated mean angle will also be twice that of the mean angle calculated from the raw data.

The distribution of $D$ has been derived, but is not tractable analytically, and even numerical solution would not be practical for routine use within an analysis program. Lord Rayleigh derived an exponential approximation to the distribution which was not very accurate, but Greenwood and Durand [15] produced a very good approximation which provides enough accuracy for statistical testing purposes, even for sample sizes as small as $n = 6$ [11], and which is easily computed within a program. For the approximation we calculate the new quantity $Z = nD^2$ and use $Z$ in the following expression to calculate the significance value, $Q$:

$$Q = 1 - P(D) = \exp(-Z)\left[1 + \frac{Z - Z^2/2!}{2n} + \frac{-Z + 11Z^2/2! - 19Z^3/3! + 9Z^4/4!}{12n^2} + \frac{-2Z - 4Z^2/2! + 69Z^3/3! - 163Z^4/4! + 145Z^5/5! - 45Z^6/6!}{24n^3}\right] \qquad (6)$$

where $P(D)$ is the probability distribution value of $D$. We may then see whether $Q$ calculated by eq. (6) is greater or less than our desired significance level. For example, if $Q = 0.33$ we would conclude that there is no significant orientation in the data whereas if $Q = 0.041$ we would conclude that the data display orientation at better than the 5% significance level. While eq. (6) can be calculated quite rapidly on a computer, those persons implementing the test on a programmable calculator will find that it can take quite a while to compute the entire expression. If only larger sample sizes ($n > 12$) will be used, and no values of $Q$ in the extreme tail of the distribution ($Q < 0.01$) will be required, then one can safely ignore the $1/n^2$ and $1/n^3$ terms above to obtain a large saving of execution time while retaining sufficient accuracy [11].

Eq. (6) can be inverted so that the mean vector length, $E$, corresponding to a specified significance level, $Q$, may be calculated:

$$E = (Z/n)^{1/2} \qquad (7)$$

where

$$Z = Y + \frac{2Y - Y^2}{4n} + \frac{12Y - 3Y^2 - Y^3}{72n^2} + \frac{-12Y + 42Y^2 - 8Y^3 - Y^4}{288n^3}$$

and where $Y = -\ln Q$.

Fig. 4. A. Graphics output from the Rayleigh test, which was applied to the sampled angles from Fig. 2A. The angles are plotted on the outer circle, which represents the unit vector radius. The inner concentric circle in the plot represents the mean vector radius required for the 5% significance level when the number of cells is as given. If the mean vector, plotted in the center of the circles, does not extend past the inner circle, we may accept the hypothesis of uniformity of the angular data. If the mean vector extends past the inner circle we may accept the alternative hypothesis that the data display significant orientation at the 5% significance level. If the mean vector extends past the outer concentric circle, then we may accept the alternative hypothesis at the 1% level. Calculated parameters are printed out below the plot. B. Graphics output from the Rayleigh test as applied to the sampled angles from Fig. 2B. Highly significant orientation is clearly indicated.
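The Rayleigh test calculations of this section can be sketched in Python as follows. The sketch assumes the reading of eq. (6) in which $Z = nD^2$, and of eq. (7) with $Y = -\ln Q$; the function names are hypothetical and the code is an illustration, not the FORTRAN routine used in the program.

```python
import math

def rayleigh_q(d, n):
    """Significance Q of mean vector length d for n angles, eq. (6)."""
    z = n * d * d
    t1 = (z - z**2 / 2.0) / (2.0 * n)
    t2 = (-z + 11*z**2/2.0 - 19*z**3/6.0 + 9*z**4/24.0) / (12.0 * n**2)
    t3 = (-2*z - 4*z**2/2.0 + 69*z**3/6.0 - 163*z**4/24.0
          + 145*z**5/120.0 - 45*z**6/720.0) / (24.0 * n**3)
    return math.exp(-z) * (1.0 + t1 + t2 + t3)

def rayleigh_critical_length(q, n):
    """Mean vector length E required for significance q, eq. (7)."""
    y = -math.log(q)
    z = (y + (2*y - y**2) / (4.0 * n)
         + (12*y - 3*y**2 - y**3) / (72.0 * n**2)
         + (-12*y + 42*y**2 - 8*y**3 - y**4) / (288.0 * n**3))
    return math.sqrt(z / n)

def double_axial(angles_deg):
    """Double axial angles so that [0, 180] maps onto [0, 360]."""
    return [(2.0 * a) % 360.0 for a in angles_deg]

e5 = rayleigh_critical_length(0.05, 50)
print(e5, rayleigh_q(e5, 50))   # the second value should be close to 0.05
```

Feeding the critical length from eq. (7) back into eq. (6) recovers the requested significance level to within the accuracy of the approximations, which is a convenient check on an implementation.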

3.3. The V-test

In certain situations the population mean angle about which sampled angles will cluster is known in advance. For example, in the author's experiments when high enough currents are applied for a long enough time all the cells can be observed to align at right angles to the direction of current flow. This phenomenon of perpendicular alignment relative to the direction of uniform current flow has been observed in all cells which align in currents [5-7], and therefore, if we define the direction of positive current flow to be 0°, the population mean angle will be 90°. In the case of Von Mises-distributed data for which the population mean angle ($\mu$) is known, the V-test to be described below is a "uniformly most powerful" test for uniformity of directions on the circle [1,11]. The Rayleigh test should be resorted to only when $\mu$ is unknown. Durand and Greenwood [11] opined that the V-test is preferable to the Rayleigh test at typical significance levels whenever $\mu$ can be specified to within 30°. The V-statistic is simply calculated as the projection of the mean vector, $\bar{\mathbf{D}}$, onto the axis of the population mean angle, $\mu$:

$$V = D\cos(\bar{\theta} - \mu) \qquad (8)$$

If $\mu = 0$, then a faster way to calculate $V$ is:

$$V = C/n \qquad (9)$$

rather than calculating both the $C$ and $S$ components of the vector $\bar{\mathbf{D}}$ using eq. (1). It is always possible to define the population mean angle to be 0°. For example, in the cell alignment experiments we could take the direction of cell alignment to be 0°, so that the current would be passing at -90°. However, the angular data are easier to visualize when cell directions are calculated relative to the direction of current flow and an equivalent operation, the one used in this program, is to simply subtract $\mu$ from each sampled angle before computing $V$ using eq. (9) above. Another advantage to subtracting $\mu$ from the data and using eq. (9) is that we do not have to double $\mu$ if the data are doubled, as we must when using eq. (8).

We can calculate the significance, $Q$, of the statistic $V$ by the following procedure [16]. First a new variable $U = V\sqrt{2n}$ is calculated, after which an approximately normal variate, $W$, is calculated:

$$W = U + \frac{U^3 - 3U}{16n} + \frac{71U^5 - 224U^3 - 15U}{4608n^2} + \frac{385U^7 - 1323U^5 - 981U^3 + 1575U}{73728n^3} \qquad (10)$$

$Q(W) = 1 - P(W)$, where $W$ is distributed N(0, 1), may now be readily calculated. A N(0, 1) variable has the following distribution:

$$P(W) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{W}\exp(-x^2/2)\,dx \qquad (11)$$

which is very similar to the error function (erf):

$$\mathrm{erf}(Y) = \frac{2}{\sqrt{\pi}}\int_{0}^{Y}\exp(-x^2)\,dx \qquad (12)$$

The error function can also be expressed as an infinite series:

$$\mathrm{erf}(Y) = \frac{2}{\sqrt{\pi}}\sum_{i=0}^{\infty}\frac{(-1)^i\,Y^{2i+1}}{i!\,(2i+1)} \qquad (13)$$

Series (13) can be iterated, summing successive terms until the desired accuracy is reached. Note that if one wishes to sum terms until the 7-digit accuracy of a REAL*4 variable is achieved one should use double precision arithmetic in computing the terms of eq. (13). Now since $P(0) = 0.5$, $P(W)$ can be calculated as:

$$P(W) = 0.5 + \tfrac{1}{2}\,\mathrm{erf}(W/\sqrt{2}) \quad\text{or}\quad Q(W) = 0.5 - \tfrac{1}{2}\,\mathrm{erf}(W/\sqrt{2}) \qquad (14)$$

The value of $U$ corresponding to a given significance level can also be calculated [16]. If we now define $W$ to be a N(0, 1) variate corresponding to the desired level of significance then:

$$U = W - \frac{W^3 - 3W}{16n} - \frac{17W^5 - 8W^3 - 177W}{4608n^2} - \frac{33W^7 + 165W^5 - 1989W^3 + 999W}{73728n^3} \qquad (15)$$

Eq. (15) is used to calculate the position of ticks plotted along the population mean angle axis, as illustrated in Fig. 5.

3.4. A pairs test for the concentration parameter

If we have two sets of angular data, $\theta_{1i}$, $i = 1, \ldots, n_1$ and $\theta_{2i}$, $i = 1, \ldots, n_2$, having mean vectors $\bar{\mathbf{D}}_1$ and $\bar{\mathbf{D}}_2$, we may want to address the question of whether there is more clustering of angles about the mean in one data set than in the other. In other words we may want to test whether a difference in the concentration parameters, $k_1$ and $k_2$, or equivalently between the mean vector lengths, $D_1$ and $D_2$, is significant or not. As will be described below, we calculate a statistic to test the hypothesis $H_0$: $k_1 = k_2$ and either accept it or reject it, in which case we accept the alternative hypothesis that one data set, the one with the longer mean vector length, is significantly more oriented than the other. This test can be used to compare treated cells to controls, to check for differences due to different levels of treatment, as when the amount of applied current is changed, or to check for saturation of treatment effects.

To determine how the test should be computed we first need to calculate the length, $R$, of the total mean vector, $\bar{\mathbf{R}}$ [1]:

$$\bar{\mathbf{R}} = \frac{n_1\bar{\mathbf{D}}_1 + n_2\bar{\mathbf{D}}_2}{n_1 + n_2} \qquad (16)$$

and

$$R = \frac{1}{n_1 + n_2}\left[(C_1 + C_2)^2 + (S_1 + S_2)^2\right]^{1/2} \qquad (17)$$

Fig. 5. Graphics output from the V-test. A line has been drawn at the expected angle of orientation, 90°, and the 1% and 5% significance values of V are indicated by the upper and lower cross-lines, respectively, on this line. The angles are indicated by means of ticks on the circle as in Fig. 4. The mean vector is plotted as well as its projection onto the 90° axis, which represents the value of V. If V is below the 5% level on the line then we accept the hypothesis of uniformity of the angular distribution. If V is above either the 5% or 1% bars then we accept the alternative hypothesis that there is significant clustering of the angles about 90° at the corresponding significance level. This analysis has been performed on a set of angles sampled from a population of cells which were treated for 6 h with 79.4 mA/cm² electrical current, resulting in a voltage drop in the culture medium of 5 V/cm. One may see simply by examination of the ticks around the circle that the angles are much less obviously clustered than those of Fig. 4B, although the test indicates significant orientation.

Now if $R < 0.45$ we calculate the following statistic, $W$:

$$W = \frac{2}{\sqrt{3}}\,\frac{\sin^{-1}\!\left(D_1\sqrt{3/2}\right) - \sin^{-1}\!\left(D_2\sqrt{3/2}\right)}{\left[1/(n_1 - 4) + 1/(n_2 - 4)\right]^{1/2}} \qquad (18)$$

which is distributed N(0, 1). Significance can be calculated as described in Section 3.3. The critical region for $W$ consists of the equal tails of the distribution, so that we double the value of $Q$ calculated by eq. (14). If $0.45 \le R \le 0.70$ then the statistic $W$ is calculated differently:

$$W = \frac{\sinh^{-1}[(D_1 - a)/b] - \sinh^{-1}[(D_2 - a)/b]}{0.89325\left[1/(n_1 - 3) + 1/(n_2 - 3)\right]^{1/2}} \qquad (19)$$

which is also distributed N(0, 1), with the critical region consisting of the equal tails, and where $a = 1.08940$ and $b = 0.25789$. Many versions of FORTRAN implemented on micro- and minicomputers do not have the inverse hyperbolic sine as a built-in function. In this case the programmer can take advantage of the following identity:

$$\sinh^{-1}x = \ln\left[x + (x^2 + 1)^{1/2}\right] \qquad (20)$$

If $R > 0.70$ we calculate a new statistic, $F$:

$$F = \frac{(n_1 - n_1 D_1)(n_2 - 1)}{(n_2 - n_2 D_2)(n_1 - 1)} \qquad (21)$$

where $n_1 D_1$ and $n_2 D_2$ are the resultant (unnormalized) vector lengths of the two samples. $F$ is distributed $F(n_1 - 1, n_2 - 1)$, and again the critical region consists of the equal tails. Note that if $F < 1$, then we should invert $F$: $F(n_2 - 1, n_1 - 1) = 1/F(n_1 - 1, n_2 - 1)$. The critical values of $F$ can be calculated using the Cornish-Fisher approximation described in [17]. If $m_1$ and $m_2$ are the degrees of freedom of a variable with distribution $F(m_1, m_2)$ and $Q$ is our desired significance level then we define $W$ to be a N(0, 1) variate corresponding to significance $Q/2$ and we also define the new variables $q = 1/m_1 - 1/m_2$, $r = 1/m_1 + 1/m_2$ and $s = \sqrt{r/2}$. We now use $q$, $r$, $s$ and $W$ in the following expression to calculate the critical values of $F$:

$$F = \exp\Big\{2\Big[sW - \frac{q(W^2 + 2)}{6} + s\Big(\frac{r(W^3 + 3W)}{24} + \frac{q^2(W^3 + 11W)}{72r}\Big) - \frac{qr(W^4 + 9W^2 + 8)}{120} + \frac{q^3(3W^4 + 7W^2 - 16)}{3240r} + s\Big(\frac{r^2(W^5 + 20W^3 + 15W)}{1920} + \frac{q^2(W^5 + 44W^3 + 183W)}{2880} + \frac{q^4(9W^5 - 284W^3 - 1513W)}{155520r^2}\Big)\Big]\Big\} \qquad (22)$$
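The selection of the pairs test statistic by the value of $R$, and the Cornish-Fisher critical value of $F$, can be sketched as follows. This Python sketch rests on several assumptions that should be checked against [1] and [17] before use: the arcsine argument in eq. (18) is taken as $D\sqrt{3/2}$, the $F$ statistic of eq. (21) is written with the resultant lengths $n_1 D_1$ and $n_2 D_2$, and the coefficients of eq. (22) are taken in the Fisher-Cornish form. The function names are hypothetical, and `statistics.NormalDist` from the Python standard library supplies the N(0, 1) variate.

```python
import math
from statistics import NormalDist

A, B = 1.08940, 0.25789        # constants a and b of eq. (19)

def combined_length(c1, s1, n1, c2, s2, n2):
    """Length R of the total mean vector, eq. (17)."""
    return math.hypot(c1 + c2, s1 + s2) / (n1 + n2)

def pairs_statistic(d1, n1, d2, n2, r):
    """Return (kind, value): 'W' is N(0, 1), 'F' is F(n1 - 1, n2 - 1)."""
    if r < 0.45:               # eq. (18); assumes d*sqrt(3/2) <= 1
        g = lambda d: math.asin(math.sqrt(1.5) * d)
        num = (2.0 / math.sqrt(3.0)) * (g(d1) - g(d2))
        return "W", num / math.sqrt(1.0 / (n1 - 4) + 1.0 / (n2 - 4))
    if r <= 0.70:              # eq. (19); asinh is the identity of eq. (20)
        g = lambda d: math.asinh((d - A) / B)
        den = 0.89325 * math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
        return "W", (g(d1) - g(d2)) / den
    # eq. (21), written with resultant lengths n*D (an assumption)
    return "F", (n1 * (1.0 - d1) * (n2 - 1)) / (n2 * (1.0 - d2) * (n1 - 1))

def f_critical(m1, m2, q_level):
    """Upper critical value of F(m1, m2) for two-tailed level q_level, eq. (22)."""
    w = NormalDist().inv_cdf(1.0 - q_level / 2.0)
    q = 1.0 / m1 - 1.0 / m2
    r = 1.0 / m1 + 1.0 / m2
    s = math.sqrt(r / 2.0)
    z = (s * w - q * (w**2 + 2) / 6.0
         + s * (r * (w**3 + 3*w) / 24.0 + q**2 * (w**3 + 11*w) / (72.0 * r))
         - q * r * (w**4 + 9*w**2 + 8) / 120.0
         + q**3 * (3*w**4 + 7*w**2 - 16) / (3240.0 * r)
         + s * (r**2 * (w**5 + 20*w**3 + 15*w) / 1920.0
                + q**2 * (w**5 + 44*w**3 + 183*w) / 2880.0
                + q**4 * (9*w**5 - 284*w**3 - 1513*w) / (155520.0 * r**2)))
    return math.exp(2.0 * z)

print(pairs_statistic(0.9, 20, 0.8, 20, r=0.85))
print(f_critical(10, 10, 0.05))
```

A useful check on any transcription of eq. (22) is to compare a few critical values with a printed F table: for 10 and 10 degrees of freedom at the two-tailed 5% level the tabulated upper value is about 3.72.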


Fig. 6. Graphics output from the pairs test for the concentration parameter, k. The data sampled from cells treated with 10 V/cm and illustrated in Fig. 4B have been compared to the data sampled from cells treated with 5 V/cm and illustrated in Fig. 5. The two data sets have been plotted by means of the outer and inner ticks, respectively, around the circle. Mean vectors are also plotted. In this test we compare the value of the calculated statistic with the critical values printed below. If the value of the statistic is smaller than the 5% critical value we accept the hypothesis that there is no significant difference in concentration of angles between the two populations. If the value of the statistic is greater than either the 5%, 1% or 0.1% critical values then we accept the alternative hypothesis that one population displays significantly more concentration of angles about the mean.

The test statistic calculated from the data according to eq. (21) can now be compared to the critical value computed from eq. (22). If the test statistic is smaller than the critical value we accept the null hypothesis; if the test statistic is larger we accept the alternative hypothesis that there is a significant difference in orientation between the two populations. Fig. 6 illustrates the results from a typical analysis comparing two sets of data as described above.
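Returning to the V-test of Section 3.3, the calculation of $V$, its significance and the critical projections used for the tick marks can be sketched as below. The Python function names are hypothetical. One substitution is made deliberately: the significance is computed with `math.erfc` from the standard library rather than by summing series (13), because the alternating series loses accuracy for large arguments; the series itself is shown separately for comparison.

```python
import math
from statistics import NormalDist

def erf_series(y, tol=1e-12):
    """Error function summed from the series of eq. (13)."""
    total = term = power = y           # the i = 0 term is y itself
    fact, i = 1.0, 0
    while abs(term) > tol:
        i += 1
        power *= -y * y                # (-1)**i * y**(2i + 1)
        fact *= i                      # i!
        term = power / (fact * (2 * i + 1))
        total += term
    return 2.0 / math.sqrt(math.pi) * total

def v_statistic(angles_deg, mu_deg=90.0):
    """V of eq. (9): subtract mu from each angle, then V = C/n."""
    c = sum(math.cos(math.radians(a - mu_deg)) for a in angles_deg)
    return c / len(angles_deg)

def v_significance(v, n):
    """Significance Q of V from eqs. (10) and (14)."""
    u = v * math.sqrt(2.0 * n)
    w = (u + (u**3 - 3*u) / (16.0 * n)
         + (71*u**5 - 224*u**3 - 15*u) / (4608.0 * n**2)
         + (385*u**7 - 1323*u**5 - 981*u**3 + 1575*u) / (73728.0 * n**3))
    # 0.5 - 0.5*erf(w/sqrt(2)) is identical to 0.5*erfc(w/sqrt(2)).
    return 0.5 * math.erfc(w / math.sqrt(2.0))

def v_critical(q_level, n):
    """Projection V required for significance q_level, eq. (15)."""
    w = NormalDist().inv_cdf(1.0 - q_level)
    u = (w - (w**3 - 3*w) / (16.0 * n)
         - (17*w**5 - 8*w**3 - 177*w) / (4608.0 * n**2)
         - (33*w**7 + 165*w**5 - 1989*w**3 + 999*w) / (73728.0 * n**3))
    return u / math.sqrt(2.0 * n)

print(v_critical(0.05, 63), v_critical(0.01, 63))   # about 0.147 and 0.207
print(v_significance(0.260, 63))
```

Because eq. (15) is the inverse of eq. (10), passing a critical projection back through `v_significance` should return the requested level, which gives a simple consistency check.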

3.5. ANOVA for testing length data

The linear length measurements are analyzed by means of a one-way analysis of variance (ANOVA) with two data groups. This is a pairs test analogous to the circular pairs test described in Section 3.4. Usually cell lengths after treatment with applied current are compared with controls which have had no current applied to see whether there has been a significant change in average length. The ANOVA procedure has been frequently described (e.g. [18]). The t-test seems to be more commonly used by biologists when comparing pairs of data sets, but the t-test is mathematically equivalent to an ANOVA with two data groups [18]. Since a subroutine to calculate the critical points of the F-distribution had already been implemented it seemed best to use the ANOVA calculation and calculate the critical points with the same subroutine. The distribution of $t^2(m)$, where $m$ is the number of degrees of freedom, is equivalent to that of $F(1, m)$ [17]. An alternative, if one wished to use the t-test, would be to calculate the critical value of $F$ for $2Q$, where $Q$ is the desired level of significance, and take the square root to obtain the critical value of $t$.

The ANOVA calculation for two data groups is performed as follows. Let us say that we have two sets of linear data, $d_{1i}$, $i = 1, \ldots, n_1$ and $d_{2i}$, $i = 1, \ldots, n_2$. We now calculate:

$$G = \frac{\left(\sum_{i=1}^{n_1} d_{1i} + \sum_{i=1}^{n_2} d_{2i}\right)^2}{n_1 + n_2} \qquad (23)$$

then we calculate the total sum of squares, $SS_t$:

$$SS_t = \sum_{i=1}^{n_1} d_{1i}^2 + \sum_{i=1}^{n_2} d_{2i}^2 - G \qquad (24)$$

Next the sum of squares between groups, $SS_b$, is calculated:

$$SS_b = \frac{1}{n_1}\left(\sum_{i=1}^{n_1} d_{1i}\right)^2 + \frac{1}{n_2}\left(\sum_{i=1}^{n_2} d_{2i}\right)^2 - G \qquad (25)$$

and then the sum of squares within groups, $SS_w$:

$$SS_w = SS_t - SS_b \qquad (26)$$

An ANOVA table is then constructed as illustrated in Table 2. The value of $F$, with degrees of freedom 1 and $n_1 + n_2 - 2$, may then be compared to the 5%, 1% or 0.1% critical values, which are calculated as described in Section 3.4 and printed below the ANOVA table during output of the analysis results.
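The calculation in eqs. (23)-(26) and the F ratio of Table 2 can be sketched in a few lines. The following Python fragment is an illustration only, not the program's own listing; the function names `anova_two_groups` and `pooled_t` and the sample values are hypothetical. The second function computes the ordinary pooled-variance t statistic so that the equivalence $t^2 = F$ noted above can be checked numerically. Critical values of $F$ are not computed here; they would still have to come from a separate routine or from tables.

```python
from math import fsum, isclose, sqrt


def anova_two_groups(d1, d2):
    """One-way ANOVA for two groups of linear data, following eqs. (23)-(26)."""
    n1, n2 = len(d1), len(d2)
    if n1 < 1 or n2 < 1 or n1 + n2 < 3:
        raise ValueError("both groups must be non-empty, with at least 3 values in total")
    s1, s2 = fsum(d1), fsum(d2)
    g = (s1 + s2) ** 2 / (n1 + n2)                                # eq. (23)
    ss_t = fsum(x * x for x in d1) + fsum(x * x for x in d2) - g  # eq. (24)
    ss_b = s1 ** 2 / n1 + s2 ** 2 / n2 - g                        # eq. (25)
    ss_w = ss_t - ss_b                                            # eq. (26)
    df_w = n1 + n2 - 2
    ms_b = ss_b            # between-groups mean square (1 degree of freedom)
    ms_w = ss_w / df_w     # within-groups mean square
    # Note: ms_w is zero if every group is constant, in which case F is undefined.
    return {"SS_t": ss_t, "SS_b": ss_b, "SS_w": ss_w,
            "MS_b": ms_b, "MS_w": ms_w, "df": (1, df_w), "F": ms_b / ms_w}


def pooled_t(d1, d2):
    """Two-sample t statistic with pooled variance, for comparison with F."""
    n1, n2 = len(d1), len(d2)
    m1, m2 = fsum(d1) / n1, fsum(d2) / n2
    ss1 = fsum((x - m1) ** 2 for x in d1)
    ss2 = fsum((x - m2) ** 2 for x in d2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))


# Made-up example values (not data from the paper).
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
table = anova_two_groups(control, treated)
t = pooled_t(control, treated)
print(table)
print("t^2 equals F:", isclose(t * t, table["F"]))
```

The sums are accumulated with `math.fsum` to limit rounding error, since eqs. (24) and (25) subtract quantities of similar magnitude.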

4. Sample run

Figs. 2, 4, 5 and 6 illustrate the output obtained from sampling and analysis during typical program runs.

TABLE 2
The ANOVA table

Source of variation   Degrees of freedom   Sum of squares           Mean square                       $F(1, n_1 + n_2 - 2)$
Between               1                    $SS_b$                   $MS_b = SS_b$                     $MS_b/MS_w$
Within                $n_1 + n_2 - 2$      $SS_w = SS_t - SS_b$     $MS_w = SS_w/(n_1 + n_2 - 2)$
Total                 $n_1 + n_2 - 1$      $SS_t$


5. Program availability

Program source listings are available from the author on request. Workers who have floppy disk drives compatible with ours (see Section 2.1) may send a blank, preformatted disk if they wish to receive the program on magnetic medium. In either case, it would be appreciated if researchers in North America and western Europe could send a cheque or money order for $5 to cover our costs.

References

[1] K. Mardia, Statistics of Directional Data (Academic Press, New York, 1972).
[2] P. Wheeler, L. Wolcott, J. Morris and M. Spangler, Neural considerations in the healing of ulcerated tissue by clinical electrotherapeutic application of weak direct current: findings and theory, in: Neuroelectric Research: Electroneuroprosthesis, Electroanesthesia and Nonconvulsive Electrotherapy, eds. D. Reynolds and A. Sjoberg, pp. 83-99 (Charles C. Thomas, Springfield IL, 1971).
[3] C. Brighton, The treatment of non-unions with electricity, J. Bone Joint Surg. 63-A (5) (1981) 847-851.
[4] L. Jaffe and R. Nuccitelli, Electrical controls of development, Annu. Rev. Biophys. Bioeng. 6 (1977) 445-476.
[5] C. Erickson and R. Nuccitelli, Embryonic fibroblast motility and orientation can be influenced by physiological electric fields, J. Cell Biol. 98 (1984) 296-307.
[6] M. Cooper and R. Keller, Perpendicular orientation and directional migration of amphibian neural crest cells in DC electrical fields, Proc. Natl. Acad. Sci. U.S.A. 81 (1984) 160-164.
[7] S. Ross and J. Ferrier, Applied current effects on cellular morphology, alignment and proliferation, Biophys. J. 45 (2) (1984) 97a.
[8] J. Strutt (Lord Rayleigh), On the resultant of a large number of vibrations of the same pitch and of arbitrary phase, Phil. Mag. 10 (1880) 73-78.
[9] E. Batschelet, Circular Statistics in Biology (Academic Press, London, 1981).
[10] J. Curray, The analysis of two-dimensional orientation data, J. Geol. 64 (1956) 117-131.
[11] D. Durand and J. Greenwood, Modifications of the Rayleigh test for uniformity in analysis of two-dimensional orientation data, J. Geol. 66 (1958) 229-238.
[12] S. Ross, NOISE: an interactive program for time series analysis of physiological data, Comput. Programs Biomed. 15 (1982) 217-232.
[13] E. Batschelet, Statistical Methods for the Analysis of Problems in Animal Orientation and Certain Biological Rhythms (American Institute of Biological Sciences, Washington DC, 1965).
[14] E. Gumbel, J. Greenwood and D. Durand, The circular normal distribution: theory and tables, J. Amer. Stat. Assn. 48 (1952) 131-152.
[15] J. Greenwood and D. Durand, The distribution of length and components of the sum of n random unit vectors, Ann. Math. Stat. 26 (1955) 233-246.
[16] D. Durand and J. Greenwood, Random unit vectors. II. Usefulness of Gram-Charlier and related series in approximating distributions, Ann. Math. Stat. 28 (1957) 978-986.
[17] M. Kendall and A. Stuart, The Advanced Theory of Statistics, Vol. 1 (Charles Griffin and Co. Ltd., London, 1977).
[18] R. Sokal and F. Rohlf, Introduction to Biostatistics (W.H. Freeman and Co., San Francisco CA, 1973).