J. theor. Biol. (1984) 108, 131-142
Determination of Relative Size: The "Criterion of Subtraction" Problem in Allometry

RICHARD J. SMITH
Department of Orthodontics, University of Maryland Dental School, Baltimore, Maryland 21201, U.S.A.

(Received 19 April 1983, and in final form 14 November 1983)

By providing a predicted value against which to judge observed values, allometric equations are often used as a "criterion of subtraction" to calculate measurements corrected for the effects of overall size. The observed and predicted values have been used to calculate several different versions of a size-adjusted measurement, two of the more common being (observed/predicted) and (log observed - log predicted). Using data on brain size, tooth size, metabolic rate, and long bone shape, it is found that the manner in which the relative size value is calculated can alter interpretations and statistical results. Some of the assumptions underlying use of a criterion of subtraction calculated from empirical data are reviewed. It is suggested that predicted values determined from a priori theoretical equations often have several advantages over those from empirical equations.
In an average adult human, brain weight is approximately 2.0% of total body weight (1330 g brain weight, 65 kg body weight), while in the squirrel monkey, Saimiri sciureus, the brain is approximately 3.6% of total body weight (24 g brain weight, 0.66 kg body weight) (Stephan, Frahm & Baron, 1981). Thus, squirrel monkeys have relatively larger brains than humans. Yet, it is quite widely agreed that humans have exceptionally large brains, which, relative to body weight, are the largest of all animals. The apparent contradiction between the above two statements is due to the use of different measures of body weight against which the brain weights are compared. Among diverse mammalian species, empirical data suggest that brain weight is related to the 0.67 or 0.75 power of body weight (Martin, 1981). If this power function relationship, $\text{brain weight} = b\,(\text{body weight})^{0.67}$, is made linear by conversion to logs, $\log(\text{brain weight}) = \log b + 0.67\,\log(\text{body weight})$, the brain weight of humans falls further above this regression line than the brain weight of squirrel monkeys. Whether humans or squirrel monkeys have relatively larger brains depends upon whether brain weight is compared to body weight or to $(\text{body weight})^{0.67}$.

0022-5193/84/090131 + 12 $03.00/0
© 1984 Academic Press Inc. (London) Ltd.
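The contrast in the opening example can be checked numerically. The following is a minimal sketch using only the brain and body weights quoted above; the function names are illustrative, and the constant b is deliberately left out (an assumption made here), so the scaled index is meaningful only for comparing one species with another.

```python
# Brain and body weights quoted in the text, in grams.
human = {"brain_g": 1330.0, "body_g": 65_000.0}
squirrel_monkey = {"brain_g": 24.0, "body_g": 660.0}

def simple_proportion(animal):
    """Brain weight as a fraction of body weight (the geometric-similarity baseline)."""
    return animal["brain_g"] / animal["body_g"]

def scaled_index(animal, exponent=0.67):
    """Brain weight divided by body weight raised to an allometric exponent.

    The constant b is omitted (illustrative assumption), so only the
    ordering of species is informative, not the absolute value.
    """
    return animal["brain_g"] / animal["body_g"] ** exponent

for name, animal in [("human", human), ("squirrel monkey", squirrel_monkey)]:
    print(name,
          round(100 * simple_proportion(animal), 1),  # percent of body weight
          round(scaled_index(animal), 3))             # index at exponent 0.67
```

With the simple proportion the squirrel monkey ranks higher; with the 0.67 exponent the human ranks higher, which is the reversal described in the text.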
This example illustrates a fundamental concept of comparative biology. To determine whether a species or individual has a structure (or physiological rate) that is relatively large, relatively small, or exactly what is to be expected, simple proportions between that feature and some reference parameter will not do. Such a proportion incorrectly assumes that structural or functional equivalence in animals of different sizes is represented by geometric similarity. In order to make statements about the relative size of structures, it is first necessary to establish the appropriate baseline for equivalence. In practice, this problem is solved by measuring the feature in question and the reference parameter for a number of relevant species or individuals, converting these data to logarithms, and then fitting a linear equation. This allometric equation is considered to represent the changes in the y-axis variable required by changes in the general size measurement on the x-axis, and deviations from the regression line are considered to be individual adaptations unrelated to general size. The regression line acts as a "criterion of subtraction" for the interpretation of individual points (Gould, 1975a, b). In spite of the importance and wide application of this approach, there has been little critical discussion of the methodology for calculating size-adjusted values ("relative sizes") or the theoretical consequences of using a criterion of subtraction in comparative biology.

Measurements of Relative Size
The allometric equation has been used as a criterion of subtraction to compute at least four different versions of a relative size measurement. Two of these require transformation of data back to a linear scale, and two use data remaining in logarithmic form:
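$$\frac{\text{observed}}{\text{predicted}} \tag{1}$$

$$\frac{\text{observed} - \text{predicted}}{\text{predicted}} \tag{2}$$

$$\log \text{observed} - \log \text{predicted} = \text{log residual} \tag{3}$$

$$\frac{\log \text{observed} - \log \text{predicted}}{\text{standard error of estimate}} = \frac{\text{log residual}}{\text{S.E.E.}} \tag{4}$$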
In all cases observed and predicted values refer to the variable plotted on the ordinate (y in the allometric equation).
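The four definitions can be written as a single function. This is a minimal sketch, assuming common (base 10) logarithms and a standard error of estimate supplied by the caller; the function name and the numerical values in the usage lines are hypothetical.

```python
import math

def relative_sizes(observed, predicted, see_log):
    """Return the four relative size measurements for one specimen.

    observed, predicted: y values on the original (linear) scale.
    see_log: standard error of estimate of the log-log equation,
             in common-log units (hypothetical value supplied by the caller).
    """
    ratio = observed / predicted                               # measurement (1)
    pct_diff = (observed - predicted) / predicted              # measurement (2)
    log_resid = math.log10(observed) - math.log10(predicted)   # measurement (3)
    std_resid = log_resid / see_log                            # measurement (4)
    return ratio, pct_diff, log_resid, std_resid

# Hypothetical specimens five times and one fifth of a predicted value of 100.
large = relative_sizes(500.0, 100.0, see_log=0.35)
small = relative_sizes(20.0, 100.0, see_log=0.35)
print(large)
print(small)
```

Only the fourth value depends on `see_log`; the first three are fixed by the observed and predicted values alone.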
These four definitions of relative size do not take into account numerical re-expression. As described by Simpson, Roe & Lewontin (1960), a ratio of 5:10, for example, can be re-expressed as a fraction (1/2), a quotient (0.5), a percentage (50%), or a quotient multiplied by a constant (0.5 × 100 = 50). Each of the four basic forms can be numerically re-expressed in a variety of ways.

The first relative size measurement (observed/predicted) is the form of Jerison's (1973) encephalization quotient, and has also been used by Emerson & Radinsky (1980) and Radinsky (1981). An observed value equal to the predicted value results in a ratio of 1.0. The value of the ratio is theoretically unlimited in the positive direction while the smallest observed values could result in a ratio approaching zero. No negative values are possible. An animal with a y value five times the predicted value would have a ratio of 5.0 while one with an observed value one fifth of the predicted value would have a ratio of 0.20.

The second relative size value ((observed - predicted)/predicted) has been used often, including Kay (1975) as the "percentage difference" and Smith (1980, 1981a) as the "percentage prediction error". An observed value equal to the predicted value results in a ratio of zero. Values larger than predicted lead to positive ratios and smaller than predicted to negative ratios. However, because the numerator indicates the difference between observed and predicted, an observation five times predicted would result in a value of 4.0, e.g. ((500 - 100)/100), and a value one fifth of predicted in -0.8, e.g. ((20 - 100)/100). The negative values can approach but not be less than -1.0. There is a simple relationship between the ratios calculated by these first two methods:

$$\frac{\text{observed} - \text{predicted}}{\text{predicted}} = \frac{\text{observed}}{\text{predicted}} - 1.$$
Thus, these two relative size values should be considered no more than numerical re-expressions of each other.

The third relative size measurement, log observed - log predicted, is the residual of the allometric equation. It has been used by Clutton-Brock & Harvey (1980) and Goldstein, Post & Melnick (1978). Since log (observed/predicted) = log observed - log predicted, relative size measurement (1) (observed/predicted) is the antilog of this residual. Log observed - log predicted will be zero when the observed value equals the predicted value. Unlike the first two definitions of relative size, proportionately equal positive and negative deviations from predicted result in values which are symmetrical about zero, so that a specimen five times predicted size would have a log residual of 0.6990, and one a fifth of predicted would be -0.6990.
Values outside of the range +1.0 to -1.0 will be unusual, since they would require specimens more than ten times larger or smaller than predicted (with common logarithms).

The second form of relative size measurement using logarithmic units is the standardized residual, in which log observed - log predicted is divided by the standard error of estimate for the equation (the standard deviation of the residuals). As with the log residual, the standardized residual will be zero when the observed value equals the predicted value. Values larger than predicted will be positive and those smaller than predicted will be negative, and proportionally equal positive and negative residuals will be symmetrical about the mean of zero. Assuming the log residuals to be normally distributed (this assumption will be examined in detail subsequently), approximately 68% of the standardized residuals would have values between +1.0 and -1.0 and 95% between +2.0 and -2.0, based on the frequency distribution of the normal curve.
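The 68% and 95% figures follow from the standard normal curve and can be reproduced directly. A minimal sketch using Python's standard library; the helper name `coverage` is illustrative.

```python
from statistics import NormalDist

def coverage(z):
    """Probability that a standard normal value lies between -z and +z."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(z) - std_normal.cdf(-z)

print(round(coverage(1.0), 3))  # proportion of standardized residuals within +/-1.0
print(round(coverage(2.0), 3))  # proportion within +/-2.0
```

These proportions hold only to the extent that the log residuals are in fact normally distributed.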
Interpretation of Relative Size Measurements

Three of the four relative size measurements, observed/predicted, (observed - predicted)/predicted, and log observed - log predicted, provide identical information. Given any one, the other two can be calculated from a table of logarithms. These three values all indicate how the observed value compares in proportion to the predicted value, such as that the observed value is 20% larger or 50% smaller than expected.

The standardized residual provides an entirely different kind of information. First, it cannot be calculated from the others unless additional information specific to the equation is known: the standard error of estimate. Unlike the first three relative size values, this measurement gives no information about how much larger or smaller than expected an individual observation is. Rather, it indicates, relative to all other points, whether the particular deviation is uncommonly large or small. In one data set, 20% larger than predicted might be an unusually high deviation, while in another data set, it might be of much less significance because many individual points are 80-100% greater than predicted. Only the standardized residual indicates this type of information. Although the correlation coefficient is often used to judge the "tightness" of points about the regression line, this interpretation of a correlation is incorrect (Smith, 1980, 1981c).

In any allometric study, it is important to report whether data were transformed to common (base 10) or natural (base e) logarithms. Although three of the four relative size values will be the same with either transformation, log observed - log predicted will not. Also, the standard error of the equation and the value for b (the intercept) will be different with base 10 and base e logs.
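The conversions described in this section can be sketched in a few lines. The helper names and the standard error value below are hypothetical; the only facts used are that the ratio is the antilog of the log residual and that natural and common logarithms differ by the constant factor ln 10.

```python
import math

def from_log_residual(log_resid, base=10.0):
    """Recover measurements (1) and (2) from a log residual taken in the given base."""
    ratio = base ** log_resid          # observed / predicted
    return ratio, ratio - 1.0          # (observed - predicted) / predicted

def common_to_natural(value_log10):
    """Re-express a quantity in common-log units (a residual or an S.E.E.) in natural-log units."""
    return value_log10 * math.log(10.0)

resid10 = math.log10(1.2)              # specimen 20% larger than predicted
see10 = 0.35                           # hypothetical standard error of estimate, common logs

print(from_log_residual(resid10))
print(resid10, common_to_natural(resid10))                      # log residual depends on the base
print(resid10 / see10,
      common_to_natural(resid10) / common_to_natural(see10))    # standardized residual does not
```

Because the residual and the standard error are rescaled by the same factor, their quotient is unchanged, which is why the standardized residual is among the three values unaffected by the choice of base.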
Statistical Properties of Relative Size Measurements

Relative size measurements have been used as data in a variety of statistical tests, including principal co-ordinates analysis (Kay, 1975), principal components analysis (Kay, 1975; Radinsky, 1981), multivariate distance statistics (Susman & Creel, 1979), discriminant function analysis (Susman & Creel, 1979; Radinsky, 1981), t-tests (Mace & Eisenberg, 1982) and various models of analysis of variance (Clutton-Brock & Harvey, 1980; Harvey, Kavanaugh & Clutton-Brock, 1978). An important consideration in selecting from among the different forms of relative size measurement is whether they differ in how well they meet the assumptions of normality and homogeneity of variance required for some of these statistical tests. Even procedures which do not assume normality, such as Pearson product-moment correlations, can be affected by unusual distributions (Haldane, 1949). Interpretation of the standard error of the allometric equation also assumes that the errors of prediction are normally distributed and homoscedastic (Roscoe, 1975).

In addition to assumptions required by statistical tests, questions concerning relative size measurements arise because several of them appear to be ratios. Whether or not ratios create statistical difficulties, and how to deal with these proposed difficulties, has been the subject of extensive debate (Atchley, Gaskins & Anderson, 1976; Hills, 1978; Albrecht, 1978; Bollen & Ward, 1979). One concern is that ratios typically are positively correlated with their numerator and negatively correlated with their denominator. Also, although the purpose of calculating a relative size value is to "eliminate" size-correlated variation, it is not entirely clear that these size-corrected values are uncorrelated with body weight. Ratios also tend to be distributed non-normally.
Radinsky (1981) suggests that the observed/predicted relative size measurement appears to be a ratio, but that it is instead the antilog of the residual, which is something different. While this is correct for each individual point, transformations alter the distribution of a set of points, so that observed/predicted might have some of the distributional properties more common to ratios while the log residual might not.

In order to test some of these questions about different forms of relative size measurements, four sets of data were analyzed. All were from previously published interspecific studies dealing with some of the most common subjects of allometric research:
(1) Brain weights against body weights for 76 species of insectivores and primates. Data from Tables 1-3 by Stephan et al. (1981).
(2) Mesiodistal length of the mandibular first molar against body weight for 42 primate species. Males and females were considered separately, resulting in 81 data points. Data from Table 1 of Gingerich, Smith & Rosenberg (1982).
(3) Femur diameter against femur length for 69 bovids. Data from Table 1 of McMahon (1975).
(4) Metabolic rate against body weight for 72 nonpasserine birds. Data from Table 2 of Lasiewski & Dawson (1967).

For each of these data sets, least squares regression equations were calculated with log transformed data (all correlations between variables were greater than 0.96, so that reduced major axis slopes would be quite similar), and all four forms of relative size measurement were calculated for each point of each data set. For all analyses in this section, the results were identical for the two relative size measurements in linear form, and for the two in logarithmic form. Results are reported for these two categories only, rather than separately for each of the four relative size values.

To examine how well logarithmic and linear relative size measurements conform to the assumption of normality, the third and fourth moments about the mean (skewness and kurtosis) were calculated (Table 1). Examination of these data indicates that except for logarithmic forms of the relative brain weight measurements, there were statistically significant deviations from normality with all forms of size-adjusted measurements. However, for every data set, and for both skewness and kurtosis, relative size measurements left in logarithmic form deviated less from normality than those transformed back to linear scale.

Testing for homogeneity of variance requires dividing each data set into groups. The number of groups and species included in each group depends upon specific hypotheses such as tooth size in different diet categories, or brain weight in species with different social structures.
For the purposes of this study, groups were defined in a more arbitrary manner to provide a stringent test of homoscedasticity. Each data set was divided into four groups, corresponding to each quartile of body weight. Without a correction for weight, the y-axis variable would have much greater variance in a group of large animals compared with a group of small animals. Cochran's C-test (Roscoe, 1975) was used to determine significant differences in within-group variance. This test was selected because it is more accurate than most tests of homogeneity when distributions are skewed or leptokurtic. Since the test assumes equal sample sizes within groups, one subject was dropped from data sets (2) and (3). The point selected from each set had an x-axis value close to the median.
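The normality and homogeneity checks described above can be sketched as follows. This is not the original computation: the skewness and kurtosis coefficients are written here as simple moment ratios (an assumption, since the exact estimator used is not stated), Cochran's C is taken as the largest group variance divided by the sum of the group variances, and all function names are illustrative.

```python
from statistics import mean, variance

def skewness_kurtosis(values):
    """Moment coefficients g1 (skewness) and g2 (kurtosis, zero for a normal curve)."""
    n = len(values)
    centre = mean(values)
    m2 = sum((v - centre) ** 2 for v in values) / n   # second moment about the mean
    m3 = sum((v - centre) ** 3 for v in values) / n   # third moment
    m4 = sum((v - centre) ** 4 for v in values) / n   # fourth moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def quartile_groups(x, values):
    """Split values into four equal-sized groups ordered by x.

    Assumes len(x) is divisible by 4, mirroring the equal-sample-size
    requirement mentioned in the text.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    size = len(order) // 4
    return [[values[i] for i in order[g * size:(g + 1) * size]] for g in range(4)]

def cochran_c(groups):
    """Cochran's C: largest within-group variance over the sum of all group variances."""
    variances = [variance(g) for g in groups]
    return max(variances) / sum(variances)
```

With four groups, C ranges from 0.25 (identical variances) towards 1.0 (one group carrying nearly all of the variance), which is why lower values indicate more homogeneous within-group variances.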
TABLE 1
Tests of normality and homogeneity of variance for relative size measurements

                        Normality                                   Homogeneity of variance
                        Skewness (g1)        Kurtosis (g2)          Cochran's C test
Data set                log       linear     log       linear      log       linear
(1) Brain weight        -0.45     0.73       -0.01     -0.03       0.452     0.379
(2) Tooth size           0.83     1.39        1.79      4.23       0.370     0.449
(3) Femur shape          0.50     1.59        5.16      9.15       0.528     0.594
(4) Metabolic rate       2.54     6.58       13.55     49.35       0.523     0.852

log: log residual or log standardized residual. linear: observed/predicted or (observed - predicted)/predicted.
Skewness: P0.05 = 0.46. Kurtosis: P0.05 = 3.87 (upper tail), 2.27 (lower tail). C test: P0.05 = 0.345, P0.01 = 0.370.
The results are listed in Table 1. Lower values for the C statistic indicate more homogeneous within-group variances. Except for brain weight, the logarithmic relative size measurements had lower C values than the linear measurements. Nevertheless, the results indicate that statistically significant heteroscedasticity can exist among subsets of a sample regardless of the form of the size-corrected values.

Correlations between the size-adjusted measurements and the observed and predicted values used to compute the ratios are listed in Table 2. Also included are the correlations with the x-axis measurement. All relative size measurements have a modest positive correlation (ranging from 0.13 to 0.27) with the observed (y-axis) values. However, the correlations with the x-axis measurements and predicted values are very close to zero, indicating that all forms of the size-corrected measurements are quite satisfactory in eliminating size-correlated variation.

For each of the four data sets, a one-way analysis of variance was run with groups defined as the same four quartiles used to evaluate homogeneity of variance. As shown in Table 3, defining relative size with log values or with data converted to antilogs can alter the F ratio between groups. While the differences are relatively small with these data, the results suggest that it might be possible to get both statistically significant and nonsignificant results in an analysis of data that are identical except for the manner in which size-adjusted values are calculated.

TABLE 2
Correlations between relative size measurements and original data

                                Data set
Correlation with:               (1)      (2)      (3)      (4)
log residual or log standardized residual
  log x-axis variable           0.00     0.00     0.00     0.00
  log y-axis variable           0.26     0.27     0.26     0.19
  log y predicted              -0.02    -0.04    -0.03     0.01
observed/predicted or (observed - predicted)/predicted
  log x-axis variable           0.02    -0.01    -0.01    -0.04
  log y-axis variable           0.26     0.26     0.25     0.13
  log y predicted              -0.02    -0.06    -0.05    -0.03

P0.05 = 0.23, P0.01 = 0.30.
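For the logarithmic forms, part of the pattern in the Table 2 correlations is a property of least squares fitting rather than of any particular data set: residuals from a least squares line are uncorrelated with the x variable, and their correlation with the y variable equals the square root of (1 - r^2), where r is the correlation between x and y. The sketch below illustrates this with hypothetical log-transformed values (not data from this study); the function names are illustrative.

```python
import math

def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    saa = sum((p - mean_a) ** 2 for p in a)
    sbb = sum((q - mean_b) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

def ols_log_residuals(log_x, log_y):
    """Residuals (log observed - log predicted) from a least squares line."""
    n = len(log_x)
    mean_x, mean_y = sum(log_x) / n, sum(log_y) / n
    slope = (sum((p - mean_x) * (q - mean_y) for p, q in zip(log_x, log_y))
             / sum((p - mean_x) ** 2 for p in log_x))
    intercept = mean_y - slope * mean_x
    return [q - (intercept + slope * p) for p, q in zip(log_x, log_y)]

# Hypothetical log-transformed data, chosen only for illustration.
log_x = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
log_y = [0.8, 1.1, 1.6, 1.8, 2.3, 2.5]

resid = ols_log_residuals(log_x, log_y)
r = pearson_r(log_x, log_y)
print(pearson_r(resid, log_x))                          # zero apart from rounding error
print(pearson_r(resid, log_y), math.sqrt(1 - r * r))    # these two agree
```

With r greater than 0.96, as in the four data sets, the square root of (1 - r^2) is at most about 0.28, which is consistent with the modest positive correlations reported for the y-axis variable.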
TABLE 3
One-way analysis of variance for each data set divided into four quartiles: difference in F ratio and probability with different definitions of relative size

                                   log residual or log         observed/predicted or
                                   standardized residual       (observed - predicted)/predicted
Data set                df         F-ratio    probability      F-ratio    probability
(1) Brain weight        3, 72      0.745      0.53             0.145      0.93
(2) Tooth size          3, 76      3.546      0.02             3.652      0.02
(3) Femur shape         3, 64      0.226      0.88             0.391      0.76
(4) Metabolic rate      3, 68      0.456      0.71             0.511      0.68
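The dependence of the F ratio on the form of the relative size value can be illustrated with a small example. The sketch below computes a one-way F ratio for hypothetical log residuals in three groups (not data from this study, which used four quartile groups) and again after converting the same values to observed/predicted by taking antilogs; the function name is illustrative.

```python
def one_way_f(groups):
    """F ratio for a one-way analysis of variance: between-group over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical log residuals (common logs) in three groups.
log_groups = [[-0.3, -0.1, 0.1], [0.0, 0.2, 0.4], [0.1, 0.3, 0.5]]
# The same specimens expressed as observed/predicted (antilogs of the residuals).
linear_groups = [[10 ** v for v in g] for g in log_groups]

f_log = one_way_f(log_groups)
f_linear = one_way_f(linear_groups)
print(round(f_log, 3), round(f_linear, 3))
```

The two F ratios differ even though the underlying specimens are identical, because the antilog transformation stretches positive residuals more than negative ones and so changes both the group means and the within-group variances.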
Selection of a Relative Size Measurement
The choice from among the four forms of relative size measurement will vary with specific data sets and the preferences of individual workers. Nevertheless, some recommendations can be made. First, the log standardized residual provides information different from the other measurements and will often be of value. Of the three remaining relative size measurements, the log residual is usually closer to being normally distributed, and may have a tendency to have more homogeneous within-group variances than either of the linear relative size values. There may be data sets in which the linear forms have superior statistical properties. For many workers, the log residual may not be as readily interpretable as observed/predicted or (observed - predicted)/predicted. This is a valid consideration. Of the two values computed with antilogs, observed/predicted eliminates both negative values and zero values obtained with (observed - predicted)/predicted. This will often be helpful in computations.

The a priori Predicted Value as an Alternative Criterion of Subtraction
Throughout the preceding discussion and analysis, the conventional approach of using an empirical allometric line as the criterion of subtraction has been followed. However, with a criterion of subtraction determined by data, relative size values are specific to the data set. Elimination or addition of one species, or alteration of the specimens used within species, will change the slope and intercept of the equation and therefore change all predicted values. But the predicted value is supposed to indicate the size of the y-axis variable that should be expected because of the overall size of the animal. Does it make sense for this biological value to change when two data sets differ in perhaps a single species?
In addition, the method for fitting a line to an allometric data set has been the subject of debate for over 30 years (Kermack & Haldane, 1950). At present, there remains active debate among proponents of least squares regression (Goldstein et al., 1978; Gould, 1975a; Smith, 1981b; Wolpoff, 1982), reduced major axis (Ricker, 1973; Clarke, 1980; Leamy & Bradley, 1982; Harvey & Mace, 1982) and major axis (Jolicoeur, 1975) techniques. As is well-known, the different methods of line fitting usually result in different equations. Thus, with the same data set, relative size values will differ depending upon the criterion used to fit the equation. Because of these methodological difficulties, relative size measurements from different studies are not generally comparable. Results are specific to the sample and line fitting technique used. Differences in results are often due to methodology rather than to differences in biologically-relevant parameters. Analyses often do not build upon or refine previous work, but instead duplicate each other after arguing that the wrong species or wrong method of line fitting had been used previously. In addition to methodological concerns, there are also fundamental theoretical problems with an empirically determined criterion of subtraction. To summarize briefly previous conclusions (Smith, 1980), the method makes the clearly incorrect assumption that correlation is causation, in that all variation in the y-axis variable correlated with size is considered to be due to size. It also overlooks the fact that the biological role of the y-axis variable may change as a direct result of a change in its size, regardless of whether or not it deviates from the overall regression. While an empirical criterion of subtraction can result in residuals which have no clear biological meaning, it is evident that in some form, the criterion of subtraction is a valid concept. 
It is necessary, for example, for the perfectly reasonable biological statement that humans have relatively larger brains than squirrel monkeys. Thus, it is a useful concept for which we do not currently have an adequate methodology.

A partial solution to this problem may be found in the use of strictly a priori nonempirical equations to calculate predicted values. It is not suggested that these replace current methodologies, but only that they could be a valuable additional tool. Both the assumptions and interpretation of an analysis are altered by use of a theoretical criterion of subtraction instead of an empirical one. Consider, as a typical case, McMahon's (1975) demonstration of elastic similarity in long bones. McMahon considers slopes such as 0.640 and 0.624 (overall forelimb and hindlimb slopes for Bovidae) to support the theoretically expected slope of 0.67. So why are the actual slopes different? Either because elastic similarity is modified by some other effect, or because elastic similarity is the only factor and the 0.640 and 0.624 slopes represent sampling error. In either case, the relative size of individual specimens should be calculated from a slope of 0.67. If some other effect is involved, calculation of relative size to a slope of 0.67 will allow study of the remaining factor, since the data will be known to be corrected purely for elastic similarity. On the other hand, if the slope represents only sampling error from a value of 0.67, then the slopes of 0.640 and 0.624 have no biological interpretation as a criterion of subtraction.

In order to proceed with a relative size determination using an a priori predicted value, it is necessary to select both an intercept and slope for the allometric equation (which correspond to the constant and exponent of the allometric power function, $y = bx^k$). It is often the case that there is a theoretical model for a slope but not for an intercept. When this occurs, the intercept can be simply handled by defining the constant as 1.0 in equations which are in the form of power functions. A relative size value would be calculated as $y / (1.0 \cdot x^k)$, where y is the observed value, x is the body weight or other reference parameter, and k is the selected theoretical exponent. With transformation to logs, the constant becomes the y-intercept of the equation. Since log 1.0 = 0, use of 1.0 for b sets the intercept of an a priori allometric line at the origin. The relative size value is calculated as $\log y - k \log x$. Alternatively, the intercept could be defined by an iterative routine which minimizes residual variance. Of course, when a theoretical basis for the intercept exists, it can be used.

With an a priori equation, relative size for a specimen can be determined with only that single value, and is not relative to any particular data set. Relative size can be calculated without reference to the difficult problem concerning criteria for fitting the equation. Finally, relative sizes can be computed and contrasted using different criteria.
For example, are there differences in tooth size between frugivores and folivores assuming geometric similarity (tooth area plotted against body weight with a slope of 0.67) and how do these compare to differences assuming metabolic equivalence (same data at a slope of 0.75)? It should be noted that workers interested in relative brain size have applied a standardized equation (Jerison's encephalization quotient) with success. Expanded application of the concept to other problems should prove useful.
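The a priori calculation described above needs only one specimen and a chosen exponent. The following is a minimal sketch, assuming common logarithms and the constant b set to 1.0 unless a theoretical value is supplied; the function name is illustrative, and the brain and body weights are those quoted in the opening example.

```python
import math

def a_priori_relative_size(y, x, k, b=1.0):
    """Log-form relative size against a theoretical line: log y - (log b + k log x).

    k: theoretical exponent chosen in advance (not fitted to data).
    b: theoretical constant; 1.0 places the intercept at the origin.
    """
    return math.log10(y) - (math.log10(b) + k * math.log10(x))

# Brain and body weights in grams, as quoted earlier in the text.
species = {"human": (1330.0, 65_000.0), "squirrel monkey": (24.0, 660.0)}

for k in (0.67, 0.75):  # the two exponents mentioned for brain weight
    values = {name: round(a_priori_relative_size(brain, body, k), 3)
              for name, (brain, body) in species.items()}
    print(k, values)
```

Because no line is fitted, adding or removing species leaves each value unchanged, and the same specimens can be compared under each theoretical exponent in turn.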
This study was undertaken to answer a question raised by Gene Albrecht during the symposium on Size and Scaling in Primate Biology at the 9th Congress of the International Primatological Society, Atlanta, Georgia, August 11, 1982. I thank William Jungers and Mark Teaford for comments on an earlier draft of the manuscript, and Steve E. Hartman for a draft of an unpublished study which led to the idea to check for differences in F ratios with different definitions of relative size. Computer time was provided by the University of Maryland Computer Center. I thank David Gipe for assistance with the calculations and Barbara Bass for preparation of the manuscript.

REFERENCES

ALBRECHT, G. H. (1978). Syst. Zool. 27, 67.
ATCHLEY, W. R., GASKINS, C. T. & ANDERSON, D. (1976). Syst. Zool. 25, 137.
BOLLEN, K. A. & WARD, S. (1979). Soc. Meth. Res. 7, 431.
CLARKE, M. R. B. (1980). Biometrika 67, 441.
CLUTTON-BROCK, T. H. & HARVEY, P. H. (1980). J. Zool., Lond. 190, 309.
DODSON, P. (1978). Syst. Zool. 27, 62.
EMERSON, S. B. & RADINSKY, L. (1980). Paleobiology 6, 295.
GINGERICH, P. D., SMITH, B. H. & ROSENBERG, K. (1982). Am. J. phys. Anthropol. 58, 81.
GOLDSTEIN, S., POST, D. & MELNICK, D. (1978). Am. J. phys. Anthropol. 49, 517.
GOULD, S. J. (1975a). Am. Zool. 15, 351.
GOULD, S. J. (1975b). Contrib. Primatol. 5, 244.
HALDANE, J. B. S. (1949). Biometrika 36, 467.
HARVEY, P. H., KAVANAUGH, M. & CLUTTON-BROCK, T. H. (1978). Nature 276, 817.
HARVEY, P. H. & MACE, G. M. (1982). In: Current Problems in Sociobiology (Kings College Sociobiology Group, eds). Cambridge: Cambridge University Press.
HAYAMI, I. & MATSUKUMA, A. (1970). Palaeontology 13, 588.
HILLS, M. (1978). Syst. Zool. 27, 61.
JERISON, H. J. (1973). Evolution of the Brain and Intelligence. New York: Academic Press.
JOLICOEUR, P. (1975). J. Fish. Res. Bd Can. 32, 1491.
KAY, R. F. (1975). Am. J. phys. Anthropol. 43, 195.
KERMACK, K. A. & HALDANE, J. B. S. (1950). Biometrika 37, 30.
LASIEWSKI, R. C. & DAWSON, W. R. (1967). Condor 69, 13.
LEAMY, L. & BRADLEY, D. (1982). Evolution 36, 1200.
MACE, G. M. & EISENBERG, J. F. (1982). Biol. J. Linn. Soc. 17, 243.
MARTIN, R. D. (1981). Nature 293, 57.
MCMAHON, T. A. (1975). Am. Nat. 109, 547.
RADINSKY, L. B. (1981). Biol. J. Linn. Soc. 15, 369.
RICKER, W. E. (1973). J. Fish. Res. Bd Can. 30, 409.
ROSCOE, J. T. (1975). Fundamental Research Statistics for the Behavioral Sciences. New York: Holt, Rinehart & Winston.
SIMPSON, G. G., ROE, A. & LEWONTIN, R. C. (1960). Quantitative Zoology. New York: Harcourt, Brace & World.
SMITH, R. J. (1980). J. theor. Biol. 87, 97.
SMITH, R. J. (1981a). Am. J. phys. Anthropol. 55, 323.
SMITH, R. J. (1981b). J. hum. Evol. 10, 165.
SMITH, R. J. (1981c). Growth 45, 291.
STEPHAN, H., FRAHM, H. & BARON, G. (1981). Folia Primatol. 35, 1.
SUSMAN, R. L. & CREEL, N. (1979). Am. J. phys. Anthropol. 51, 311.
WOLPOFF, M. H. (1982). J. hum. Evol. 10, 151.