Understanding Distributions and Data Types
Rose D. Sheats and V. Shane Pankratz
To comprehend some of the fundamental concepts of statistical analysis, it is important to appreciate how data points are distributed in the sample that is drawn to represent the population. Furthermore, one must understand the classification of data types. The data type and the distribution pattern of the values influence the choice of appropriate statistical tests. Emphasis will be placed on the normal, or Gaussian, distribution. This is an important distribution to understand because the assumption of this distribution underlies the use of many common statistical tests. The purpose of this article is to familiarize the clinician with some basic biostatistical concepts without delving into the statistical theory or debate associated with these concepts. There will be no presentation of mathematical formulas. The interested reader is referred to several texts, which will provide greater depth and rigor for those seeking additional knowledge. These texts were chosen specifically for their general readability and clarity and are referenced at the end of the article. (Semin Orthod 2002;8:62-66.)
Copyright 2002, Elsevier Science (USA). All rights reserved.
From the Mayo Clinic Rochester, Departments of Dental Specialties and Health Sciences Research, Rochester, MN. Address correspondence to Rose D. Sheats, DMD, MPH, Department of Dental Specialties, Mayo Clinic, 200 First Street SW, Rochester, MN 55902.
Seminars in Orthodontics, Vol 8, No 2 (June), 2002: pp 62-66. 1073-8746/02/0802-0004$35.00/0. doi:10.1053/sodo.2002.32075
Let us begin by describing the types of data that are collected and analyzed. Data may be quantitative or qualitative. Quantitative data measure something with a number. In orthodontics, we make myriad measurements. We quantify such traits as the amount of crowding, overjet, incisor inclination, and maxillomandibular skeletal discrepancy. However, we also make qualitative assessments. We record attributes that are not meaningfully summarized by a number but that may influence our diagnostic and treatment decisions, such as the sex of the patient, judgments regarding the presence or absence of additional craniofacial growth, severity of mandibular plane angle (high, normal, low), likelihood of compliance with headgear or elastics (yes/no), and so forth.
In biostatistics, data are classified as either continuous or categorical (Table 1). Continuous data are actual measurements of some variable such as millimeters of overjet, degrees of mandibular plane angle, or megapascals of shear bond force. Categorical data are observations that fall into specified levels such as Angle's classification, Adhesive Remnant Index scores,¹ or Apical Root Resorption scores.² Quantitative data are often continuous but may be categorical, whereas qualitative data are always categorical.
Categorical data may be further classified into ordinal or nominal (Table 1). Ordinal data have an order to their levels of assignment, each more (or less) severe than the previous. The Apical Root Resorption index² is an example of categorical quantitative data. It is measured by an ordinal scale from 1 to 4 that assigns a severity level to the amount of apical root resorption that has occurred. Each increasing score level reflects an increasing severity of apical root resorption. Another example is the Adhesive Remnant Index,¹ which classifies a tooth according to the amount of adhesive that remains on the tooth after a bracket is removed. Each successive score from 1 to 5 indicates lesser amounts of adhesive left on the tooth. Nominal data, on the other hand, have no inherent order to their levels of assignment. Race and sex are common data that are nominal in nature. Angle's classification of malocclusion is another example of nominal data, which do not arrange themselves into a hierarchy of severity. Angle's Class III malocclusions may not necessarily be worse than Angle's Class II or, indeed, some Angle's Class I malocclusions.
Table 1. Data Types
Data Type | Description | Examples
Continuous | Variables that are measured and can take on any value along a continuum | Overjet in mm; mandibular plane angle in degrees; shear bond force in megapascals
Categorical | Variables whose values fall into distinct categories or defined levels |
Categorical, ordinal | Variables for which an order exists in the levels assigned | Adhesive Remnant Index; Apical Root Resorption Index
Categorical, nominal | Variables for which there is no hierarchical order to the category level | Race; sex; Angle's classification of malocclusion (Class I, II, or III)
The classification of data type is important because it is a factor in determining the statistical tests that are appropriate for analyzing the data. Data types themselves are ordered: continuous data provide the most information, followed by ordinal data, and finally by nominal data.³ Sometimes data of a higher level can be rescaled to a lower level such as when continuous data are categorized into ordinal or nominal levels.³ An example is to group mandibular plane angle cases into categories using cutoff values of the continuous measurement in degrees. Categories such as high, normal, or low angle cases may be created by grouping them according to their mandibular plane angle measured in degrees (Table 2). This continuous variable is rescaled to a categorical variable in which each succeeding category from low to high represents increasing mandibular plane angle. Because of this ordered progression, this is an ordinal variable. Higher orders of data can be rescaled to lower orders of data, but the reverse is not possible. When data are rescaled to a lower order, information is lost, which can reduce the power of the study to detect differences. There are, however, instances when it is beneficial to convert continuous data to ordinal or nominal types. The study statistician will advise when it may be advantageous to "collapse" the data.
Raw data that are continuous can be summarized by calculating means and standard deviations (SDs), statistics that characterize the sample and estimate parameters of the larger population from which the sample was drawn. Data that are categorical cannot be summarized by these parameters. Be wary of converting categorical data into numeric scores and then subjecting them to arithmetic manipulations to obtain means and SDs. Such calculations may be meaningless (eg, when trying to determine the mean sex or mean race of a sample). Percentages of observed response categories are useful summaries of nominal data.
The importance of correctly identifying data type will be apparent from a subsequent article in this issue. In brief, its significance lies in the potential ability to use parametric statistics if the sample can be summarized with parameters (means and standard deviations, percentages, and so on). Nonparametric statistics are available to analyze data for which the appropriate parameters cannot be calculated or for those data that do not meet other requisite assumptions for parametric analysis. These concepts will be addressed further elsewhere in this issue.
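The difference between summarizing continuous and nominal data can be sketched in a few lines of Python. This is only an illustration: the function names and the measurements are hypothetical and are not taken from any study. Note also that `statistics.stdev` divides by n - 1 (the usual sample SD).

```python
from collections import Counter
from statistics import mean, stdev


def summarize_continuous(values):
    """Return (mean, sample SD) for a list of continuous measurements."""
    return mean(values), stdev(values)


def summarize_nominal(labels):
    """Return the percentage of observations falling in each category."""
    counts = Counter(labels)
    total = len(labels)
    return {category: 100 * n / total for category, n in counts.items()}


# Hypothetical overjet measurements in millimeters (continuous data).
overjet_mm = [2.0, 3.5, 4.0, 2.5, 6.0, 3.0]
overjet_mean, overjet_sd = summarize_continuous(overjet_mm)

# Hypothetical Angle classifications (nominal data): a mean would be meaningless,
# so the categories are summarized as percentages instead.
angle_class = ["I", "II", "II", "I", "III", "I", "II", "I"]
class_percentages = summarize_nominal(angle_class)

print(f"Overjet: mean {overjet_mean:.2f} mm, SD {overjet_sd:.2f} mm")
for category, pct in sorted(class_percentages.items()):
    print(f"Angle Class {category}: {pct:.1f}%")
```

The two summaries are deliberately kept in separate functions so that a mean is never computed on category labels.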
Table 2. Rescaling Continuous Data to Categorical Data
Continuous | Categorical
MPA < 30° | Low angle
30° ≤ MPA ≤ 38° | Normal angle
MPA > 38° | High angle
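The rescaling in Table 2 can be sketched as a small Python function. The cutoffs of 30° and 38° come from the table; the function name and the sample measurements are hypothetical.

```python
def classify_mpa(mpa_degrees):
    """Rescale a continuous mandibular plane angle (degrees) to an ordinal category.

    Cutoffs follow Table 2: below 30 is low, 30 through 38 is normal, above 38 is high.
    """
    if mpa_degrees < 30:
        return "Low angle"
    if mpa_degrees <= 38:
        return "Normal angle"
    return "High angle"


# Hypothetical measurements in degrees.
mpa_sample = [24.5, 30.0, 33.0, 38.0, 41.5]
categories = [classify_mpa(mpa) for mpa in mpa_sample]
print(categories)
```

In this sketch, measurements of 30°, 33°, and 38° all become "Normal angle", which illustrates the information that is lost when continuous data are collapsed: the category cannot be converted back to the original measurement.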
Distributions
If you have a collection of data points, begin your initial analysis by plotting them on a graph to see how they are distributed. Often these points can be seen to follow some recognized pattern or distribution. Many patterns of distributions occur in nature. Frequently, these patterns can be described by mathematical functions, which then enable you to determine the likelihood that a data point will fall under a specific area of the distribution curve.
Let's say that you place brackets on all your patients' teeth including first and second molars, and you would like to know if there is a difference in the failure rate of brackets according to tooth type. You examine your records of the last 50 completed patients and plot the number of bracket failures by tooth type (assume you only count a tooth once, even if the bracket comes off more than once). Your data may look like Figure 1. This is an example of a uniform distribution. Approximately the same number of brackets came off for each tooth type. Regardless of which tooth type it was (x-axis), the number of failures (y-axis) was almost the same (uniform).
To see another type of distribution, let's assume you would like to know at what age the second permanent molar typically erupts. You have access to data from a large study of children and plot the presence of second molars versus the age of the child. Figure 2 is a hypothetical representation of how your data may look. You notice two peaks in your data and, on further analysis, determine that these peaks correspond to the different ages at which most girls and most boys have attained their second molars. This type of distribution, with two peaks, is known as a bimodal distribution.
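As a rough sketch of this first plotting step, the Python below tallies values into a text histogram and flags the peaks. The ages are invented purely for illustration, and the function names are hypothetical. In practice, a plotting package would normally be used instead.

```python
from collections import Counter


def text_histogram(values):
    """Return one text line per distinct value, with one '#' per observation."""
    counts = Counter(values)
    return [f"{value:>3} {'#' * counts[value]}" for value in sorted(counts)]


def find_peaks(values):
    """Return the values whose count exceeds that of both neighbors in sorted order."""
    counts = Counter(values)
    keys = sorted(counts)
    peaks = []
    for i, key in enumerate(keys):
        left = counts[keys[i - 1]] if i > 0 else 0
        right = counts[keys[i + 1]] if i < len(keys) - 1 else 0
        if counts[key] > left and counts[key] > right:
            peaks.append(key)
    return peaks


# Hypothetical ages (whole years) at which second molars were first present.
ages = [10] * 2 + [11] * 6 + [12] * 3 + [13] * 7 + [14] * 2

for line in text_histogram(ages):
    print(line)
print("Peaks at ages:", find_peaks(ages))
```

With these made-up ages, the tally shows two separate peaks, which is the pattern the text describes as bimodal.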
Figure 2. Bimodal distribution (two peaks): hypothetical age at which second molars erupt (x-axis: age in years, 8 to 17). The peak on the left represents the mean age of eruption for girls; the peak on the right represents the mean age of eruption for boys.
Figure 1. Uniform distribution. Hypothetical number of bracket failures by tooth type (x-axis: tooth type). Abbreviations: M2, 2nd molars; M1, 1st molars; PM2, 2nd premolars; PM1, 1st premolars; C, canine; I2, lateral incisors; I1, central incisors.
The Normal Distribution
In many biologic systems, the distribution of data points for a particular factor (also known as a variable) takes, at least approximately, the form of the normal distribution or Gaussian distribution. This bell-shaped curve is shown in Figure 3. Many of you know the bell-shaped curve from the distribution of scores on a national examination. It can be seen that the data cluster around a central point and spread symmetrically around this center point. In the normal distribution, the central point is the mean of the sample. This central point in a symmetric distribution such as the normal distribution is also the median and mode, which are discussed in greater detail in another article in this issue.
Figure 3. Normal (Gaussian) distribution.
The width of the bell-shaped curve depends on how much variability there is in the data. One way to estimate the amount of variability is to calculate the SD, the square root of the average squared deviation of each data point from the mean value of all the data points. This seemingly contorted calculation is necessary to account for deviations both above and below the mean. If the deviations were not squared, the positive and negative deviations would cancel, and the mean of the deviations about the mean would always equal 0. The formula for making this calculation can be found in any general statistics book and will not be presented here. Suffice it to say that the larger the SD is, the greater the variability in the data. The greater the variability is, the wider the shape of the curve.
A characteristic of the normal distribution is that data points that fall under the curve within 1 SD of the mean encompass approximately 68% of all the data. Thus, the amount of data that fall into the tails of the curve beyond 1 SD from the mean is 32%, or 16% in each tail because the curve is symmetric (Fig 4A). The interval defined by the mean ± 2 SD encompasses approximately 95% of the data, with 5% of the data occurring in the tails or 2.5% in each tail (Fig 4B). Finally, the interval formed by the mean ± 3 SD includes approximately 99.7% of the data.
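For readers who want to check these figures, the sketch below computes an SD as described above and the proportion of a normal distribution lying within 1, 2, and 3 SD of the mean. The function names are hypothetical. It divides by n, matching the "average squared deviation" wording, whereas many statistical packages divide by n - 1 when working with a sample.

```python
import math


def standard_deviation(values):
    """Square root of the average squared deviation from the mean (divides by n)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))


def proportion_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))


# Illustrative values only: mean 5, squared deviations sum to 32 over 8 points.
example = [2, 4, 4, 4, 5, 5, 7, 9]
print("SD of example data:", standard_deviation(example))

for k in (1, 2, 3):
    print(f"within {k} SD: {100 * proportion_within(k):.2f}%")
```

The printed proportions come out near 68%, 95%, and 99.7%, which is why those three numbers are usually quoted as rounded rules of thumb.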
Figure 4. Standardized normal (Gaussian) distribution: areas under the curve (x-axis: standard deviations from the mean, -3 to +3). (A) Mean ± 1 SD. (B) Mean ± 2 SD, enclosing 95% of the data.
Figure 5. Hypothetical normal distribution of lower incisor to mandibular plane angle (x-axis: IMPA from 80° to 100°, with the mean, the mean ± 1 SD, and the mean ± 2 SD marked).
By using such knowledge, one can calculate the proportion of data points that would fall under the normal curve above or below the mean if you know the value of the mean and the size of the SD. For example, pretend that you have a sample of 100 untreated subjects, and you have measured the angle of lower incisor inclination to the mandibular plane (IMPA) in these subjects. Let's assume that the IMPA is normally distributed and that the mean inclination in this sample is 90° with a standard deviation of 5°. Figure 5 shows what the normal distribution of these data would look like.
If you want to estimate the proportion or percentage of subjects who have an IMPA greater than 100°, how would you calculate this? It suffices to determine how many SDs from the mean this specific value is so that you can use your knowledge of the normal distribution and the proportion of data encompassed by various intervals of the curve. Because the mean is 90° and the SD is 5° in this example, you know that your value of 100° is 2 SD greater than the mean. The calculation that you have just made is the z transformation, an arithmetic technique for converting your data to have a mean of 0 and an SD of 1. Subjects with IMPA > 100° are more than 2 SD above the mean. From the normal distribution curve, we know that the probability that data will occur beyond the mean + 2 SD is approximately 2.5% (Fig 4B). You conclude that the proportion of subjects with an IMPA > 100° is approximately 2.5% (Fig 5).
What proportion of your sample will have an IMPA between 85° and 95°? The calculation to determine the number of SDs above and below the mean at which your specified values fall is again simple arithmetic. You know that 85° is 5°, or 1 SD, below the mean of 90°, and that 95° is 5°, or 1 SD, above the mean of 90°. This interval between 85° and 95° represents that proportion of data encompassed by the mean ± 1 SD under the normal distribution curve. In the normal distribution, we know that approximately 68% of the data fall in the interval encompassed by 1 SD above and 1 SD below the mean (Fig 4A). Thus, you conclude that approximately 68% of your sample subjects have an IMPA between 85° and 95° (Fig 5).
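The IMPA example can be worked through in a short Python sketch. The mean of 90° and SD of 5° are the hypothetical values from the example; the function names are assumptions made for illustration.

```python
import math


def z_score(value, mean, sd):
    """Number of SDs by which a value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


def normal_cdf(z):
    """Proportion of a standard normal distribution lying at or below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


mean_impa, sd_impa = 90.0, 5.0  # hypothetical sample values from the example

# Proportion of subjects expected to have an IMPA greater than 100 degrees.
z_100 = z_score(100.0, mean_impa, sd_impa)
upper_tail = 1 - normal_cdf(z_100)

# Proportion of subjects expected to have an IMPA between 85 and 95 degrees.
z_85 = z_score(85.0, mean_impa, sd_impa)
z_95 = z_score(95.0, mean_impa, sd_impa)
between = normal_cdf(z_95) - normal_cdf(z_85)

print(f"z for 100 degrees: {z_100:.1f}")
print(f"IMPA > 100 degrees: {100 * upper_tail:.1f}%")
print(f"IMPA 85 to 95 degrees: {100 * between:.1f}%")
```

The tail computed this way is a little under the 2.5% quoted in the text. The rounded rule assigns 95% of the data to the mean ± 2 SD, whereas exactly 95% corresponds to about ± 1.96 SD. For the purposes of the example, the two answers agree closely.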
Importance of Distributions
Why is it important to evaluate the distribution of data values? Many statistical tests are based on parametric assumptions (ie, the data are assumed to follow a distribution that can be summarized by parameters) and require that the data be normally distributed (bell-shaped). Many parametric statistical tests are insensitive to mild departures of the data from normality, but severe departures from the normal distribution mandate the use of distribution-free tests. Such distribution-free tests are called nonparametric statistics and will be discussed in greater detail elsewhere in this issue. Parametric statistics tend to be more powerful than nonparametric statistics. This means that they are more likely than nonparametric statistics to detect a significant difference between samples when the difference is real, but use of a parametric test when its assumptions are violated is incorrect.
Confirming Normality
How do you know if your data are normally distributed? You can inspect them visually from a plot of your data, but it probably comes as no surprise to learn that there are statistical tests to check for departure from normality. Tests such as the Shapiro-Wilk W test can provide information to tell you if your data were not likely to have been sampled from a normal distribution.
If your data are not normal in distribution, it may be possible to transform your data into a normal distribution. A log transformation, for example, may result in a normal distribution of the transformed data points. Appropriate statistical tests can then be applied to the transformed data, although interpretation of the transformed data may be more difficult. This level of statistical analysis will most likely require assistance from a statistician.
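The effect of a log transformation can be illustrated with a minimal sketch. It does not perform the Shapiro-Wilk test, which is available in statistical packages (for instance as `scipy.stats.shapiro` in SciPy). Instead, it uses skewness, a simple measure of asymmetry, as a stand-in. The data are artificial and constructed so that their logarithms are exactly symmetric. Symmetry alone does not establish normality.

```python
import math


def skewness(values):
    """Mean cubed deviation divided by the SD cubed; 0 for symmetric data."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Artificial right-skewed data: a few large values stretch the upper tail.
raw = [math.exp(x) for x in (-2, -1, -1, 0, 0, 0, 1, 1, 2)]

# Log transformation of every data point.
logged = [math.log(x) for x in raw]

print(f"skewness before log transformation: {skewness(raw):.2f}")
print(f"skewness after log transformation:  {skewness(logged):.2f}")
```

Any conclusions drawn after such a transformation apply to the logged values, which is one reason interpretation may be more difficult.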
Summary
There are many more distributions that are beyond the scope of this issue and that figure importantly in statistical analyses. The normal distribution is one of the most central to statistical analysis because of its role in the appropriate use of parametric statistics.
The type of data being analyzed is also important. Continuous data in general contain more information than categorical data. Of the latter, ordinal data are more desirable than nominal data. The statistical tests selected for data analysis depend on the type of data available and the distribution pattern of the data. A review of the most common statistical tests is presented in another article in this issue.
The goal of this essay was to familiarize the clinician with the need to comprehend the concepts of data type and data distribution patterns. Understanding these concepts will help you in determining if an appropriate statistical analysis was applied to a study you find of interest. In this way, you can decide how confident you are in the conclusions of the study and judge whether the evidence is worth considering in future diagnostic or treatment decisions in the care of your individual patients. Gaining familiarity in the area of statistics will help you to optimize treatment outcomes based on knowledge you gain from thoughtful evaluation of patient-oriented, clinical research. This practice of striving to provide patients the best treatment possible using the best available evidence is the goal of evidence-based medicine.⁴
References
1. Artun J, Bergland S: Clinical trials with crystal growth conditioning as an alternative to acid-etch enamel pretreatment. Am J Orthod 85:333-340, 1984
2. Malmgren O, Goldson L, Hill C, et al: Root resorption after orthodontic treatment. Am J Orthod 89:487-491, 1989
3. Riegelman R, Hirsch R: Studying a Study and Testing a Test (ed 3). Boston, MA, Little, Brown, 1996, pp 259-970
4. Sackett D, Strauss SE, Richardson WS, et al: Evidence-based Medicine: How to Practice and Teach EBM (ed 2). New York, NY, Churchill Livingstone, 2000
Suggested Additional References
Brunette DM: Critical Thinking: Understanding and Evaluating Dental Research. Chicago, IL, Quintessence Publishing Co, Inc, 1996
Dawson-Saunders B, Trapp R: Basic & Clinical Biostatistics. Norwalk, CT, Appleton & Lange, 1994
Riegelman R, Hirsch R: Studying a Study and Testing a Test. Boston, MA, Little, Brown, 1996
Sall J, Lehman A: JMP Start Statistics. Belmont, CA, Duxbury Press, Wadsworth Publishing Company, 1996