FERTILITY AND STERILITY® VOL. 82, NO. 2, AUGUST 2004
LETTERS TO THE EDITOR
Copyright ©2004 American Society for Reproductive Medicine Published by Elsevier Inc. Printed on acid-free paper in U.S.A.
Paul G. McDonough, M.D. Associate Editor
Measurement error! Communication between content experts and statisticians

To the Editor:

Dr. Meseguer and colleagues used the correlation coefficient as a measure of linear association between sperm parameters and the post-thaw recovery rate (PTRR) of motile sperm (1). They found correlations ranging from 0.05 to 0.18 with nonsignificant P values and concluded that there is no association between these parameters. Although their conclusion might be correct, the lack of association can also be attributed to two other factors: statistical power and random measurement error.

In their study, the investigators recruited 122 study participants. We performed a post-hoc power analysis and found 51.8% power to yield a statistically significant result. This computation assumes that the correlation in the study is 0.18 (the largest correlation estimated in their article). A sample size of 240 is required to achieve 80% power to detect the observed correlation. Moreover, we estimated that, given the study sample size, the minimal detectable significant correlation with 80% power is 0.25, which is above any estimated correlation in their study.

Another possible explanation for their findings is random measurement error. Random measurement error results when intraindividual variation or imprecision in measurement leads to differences between observed values and the true values of interest. This can generate inaccurate estimates of correlation coefficients, particularly when only one measurement per individual is used for estimation.

If Y is an individual's true mean for some factor (e.g., sperm concentration) and V is the true mean for another factor (PTRR of motile sperm), then the correlation coefficient $\rho_{VY}$ represents the linear association between the true means, Y and V, of the two variables for individuals within a given population. The imprecision of these two factors makes it necessary to consider an error term, which indicates the influence that random measurement error has on this estimate.
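The post-hoc power figures quoted earlier in this letter (51.8% power, a required sample of 240, and a minimal detectable correlation of 0.25) can be approximated with a short calculation. The sketch below assumes a two-sided test at alpha = 0.05 and uses the Fisher z approximation for the correlation coefficient; the letter does not say which method or software was used, so the function names are illustrative and small differences from the quoted values are to be expected.

```python
from math import atanh, ceil, sqrt, tanh
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def correlation_power(r: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power to detect a correlation r with n subjects.

    Assumption: Fisher z approximation, atanh(r_hat) ~ Normal(atanh(r), 1/(n - 3)).
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = atanh(r) * sqrt(n - 3)  # true effect in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def required_sample_size(r: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Smallest n giving at least the requested power (far tail ignored)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(((z_crit + z_power) / atanh(r)) ** 2 + 3)


def minimal_detectable_correlation(n: int, power: float = 0.80, alpha: float = 0.05) -> float:
    """Smallest correlation detectable with the requested power for a given n."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return tanh((z_crit + z_power) / sqrt(n - 3))


if __name__ == "__main__":
    # Inputs taken from the letter: n = 122 participants, largest r = 0.18.
    print("power at n=122, r=0.18:", round(correlation_power(0.18, 122), 3))
    print("n for 80% power at r=0.18:", required_sample_size(0.18))
    print("minimal detectable r at n=122:", round(minimal_detectable_correlation(122), 3))
```

Under these assumptions the three results should land close to the values reported in the letter; an exact method could shift the power estimate by a percentage point or so.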
Let us assume, for simplicity, that only one of the two variables, Y, has random measurement error. Then the estimate is $\hat{\rho}_{yv}$, equal to the true correlation coefficient, $\rho_{VY}$, multiplied by an error term:

$$\hat{\rho}_{yv} = \rho_{VY} \times \sqrt{\frac{\sigma_y^2}{\sigma_e^2 + \sigma_y^2}}$$
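One way to see where an error term of the form $\sqrt{\sigma_y^2 / (\sigma_e^2 + \sigma_y^2)}$ comes from is the short derivation below. It is a sketch under the classical measurement error model, which is an assumption here rather than something spelled out in the letter: the observed value is $y = Y + e$, the error $e$ is independent of both $Y$ and $V$, and $\sigma_V^2$ (a symbol introduced only for this illustration) is the variance of $V$.

```latex
% Assumed model: y = Y + e, with e independent of Y and V,
% Var(Y) = \sigma_y^2, Var(e) = \sigma_e^2, Var(V) = \sigma_V^2.
\begin{align*}
\operatorname{Cov}(y, V) &= \operatorname{Cov}(Y + e,\, V) = \operatorname{Cov}(Y, V), \\
\operatorname{Var}(y)    &= \sigma_y^2 + \sigma_e^2, \\
\rho_{yV} &= \frac{\operatorname{Cov}(Y, V)}{\sqrt{(\sigma_y^2 + \sigma_e^2)\,\sigma_V^2}}
           = \frac{\operatorname{Cov}(Y, V)}{\sigma_y\,\sigma_V}
             \times \sqrt{\frac{\sigma_y^2}{\sigma_e^2 + \sigma_y^2}}
           = \rho_{VY} \times \sqrt{\frac{\sigma_y^2}{\sigma_e^2 + \sigma_y^2}}.
\end{align*}
```

The error leaves the covariance untouched but inflates the variance of the observed variable, which is why the correlation computed from observed values is pulled toward zero.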
Furthermore, if the correlation coefficient is estimated from a small sample, as in this case, the sampling error may also be quite large. The result may be a grossly inaccurate estimate of the true correlation (3). The effect of the error term decreases as the number of measurements per individual (n) is increased. Thus, if only one factor has random measurement error and replicates are measured, the value being estimated is

$$\hat{\rho}_{yv} = \rho_{VY} \times \sqrt{\frac{n\sigma_y^2}{\sigma_e^2 + n\sigma_y^2}} = \rho_{VY} \times \sqrt{\frac{1}{1 + \dfrac{\sigma_e^2}{n\sigma_y^2}}}$$
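The attenuation factor with n replicates, $\sqrt{n\sigma_y^2 / (\sigma_e^2 + n\sigma_y^2)}$, is easy to tabulate. The following is a minimal sketch, assuming the n replicates are averaged for each individual; the function names and the numeric inputs are hypothetical and are not taken from the study.

```python
from math import sqrt


def attenuation_factor(var_between: float, var_error: float, n_replicates: int = 1) -> float:
    """Factor that multiplies the true correlation when one variable has random error.

    var_between  -- interindividual variance of the true values (sigma_y^2)
    var_error    -- random measurement error variance (sigma_e^2)
    n_replicates -- number of measurements per individual (n)
    """
    if var_between <= 0 or var_error < 0 or n_replicates < 1:
        raise ValueError("need var_between > 0, var_error >= 0, n_replicates >= 1")
    return sqrt(n_replicates * var_between / (var_error + n_replicates * var_between))


def expected_observed_correlation(true_rho: float, var_between: float,
                                  var_error: float, n_replicates: int = 1) -> float:
    """True correlation shrunk by the attenuation factor."""
    return true_rho * attenuation_factor(var_between, var_error, n_replicates)


if __name__ == "__main__":
    # Hypothetical example: true correlation 0.5, error variance equal to
    # the interindividual variance, increasing numbers of replicates.
    for n in (1, 2, 4, 8, 16):
        print(n, round(expected_observed_correlation(0.5, 1.0, 1.0, n), 3))
```

With equal variances and a single measurement the factor is $\sqrt{1/2} \approx 0.71$, so a true correlation of 0.5 would be expected to appear as roughly 0.35; the printed values rise toward 0.5 as n grows.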
Clearly, as n increases, the denominator approaches 1 and the estimated correlation coefficient, $\hat{\rho}_{yv}$, approaches the true correlation, $\rho_{VY}$.

With a single measurement per individual, the error term is $\sqrt{\sigma_y^2 / (\sigma_e^2 + \sigma_y^2)}$: the square root of the ratio of $\sigma_y^2$, the interindividual variation of y, to the sum of $\sigma_y^2$ and $\sigma_e^2$, the random measurement error. It is known in the social science literature as "the reliability index" (2). Because the ratio is a proportion, the index ranges from 0 to 1, with values close to 1 indicating that the measurement error is relatively small and values close to 0 indicating a large relative measurement error. As exemplified in Figure 1, the effect of random measurement error is attenuation of correlation coefficient estimates. As random measurement error increases, so does the amount of attenuation, and the estimated correlation coefficient approaches zero.

FIGURE 1. Effect of random measurement error on true correlation coefficients. Schisterman. Letter to the Editor. Fertil Steril 2004.

That the attenuation shrinks as n grows suggests one possible way to reduce the effect of measurement error: increasing the number of replicates per individual. Another possible solution is to estimate the effect such error might have on the estimates through a reliability study. This entails selecting a random sample of individuals and taking repeated measurements to generate estimates of total measurement error. Regardless, it is necessary for investigators to consider the role of measurement error in the design phase, as well as in the analysis phase, of studies. Failure to account for measurement error, which is present to some degree in every study, and for its impact on study power may hamper the ability to draw proper inferences.

Enrique F. Schisterman, Ph.D.
Brian W. Whitcomb, B.A.
Division of Epidemiology, Statistics and Prevention Research
National Institute of Child Health and Human Development
Bethesda, Maryland
April 13, 2004
References
1. Meseguer M, Garrido N, Martínez-Conejero JA, Simón C, Pellicer A, Remohi J. Role of cholesterol, calcium, and mitochondrial activity in the susceptibility for cryodamage after a cycle of freezing and thawing. Fertil Steril 2004;81:588–94.
2. Carmines EG, Zeller R. Reliability and validity series: quantitative applications in social sciences. Newbury Park: Sage Publications, 1979.
3. Liu K, Stamler J, Dyer A, McKeever J, McKeever P. Statistical methods to assess and minimize the role of intra-individual variability in obscuring the relationship between dietary lipids and serum cholesterol. J Chronic Dis 1978;31:399–418.
doi:10.1016/j.fertnstert.2004.05.017
Reply of the Authors:

We wish to thank Dr. Schisterman for the critical comments regarding our publication (1). We think they are very thoughtful and helpful. After reading the comments, we also think that it would be interesting for all investigators in the future to consider the role of measurement error in the design phase, as well as in the analysis phase, of their studies. As commented in the letter, if the correlation coefficient is estimated from a relatively small sample, the sampling error could be quite large.

In that work, one of our several aims was to study the sperm factors that influence post-thawing sperm quality. We also aimed to find a predictive parameter of post-thawing results. To do that, we performed a retrospective analysis of 1,197 semen samples from patients attending our facilities from January 1990 until December 2001, and 832 samples from sperm donors within the same period. The main outcome measurements that we studied were the mean basic semen parameters of the 2,029 semen samples before and after the freezing and thawing process.
We did not observe any significant correlation between the different semen parameters and the post-thaw recovery rate (PTRR) of motile sperm, and no semen feature was able to predict the percentage of motile sperm after thawing in a ROC curve analysis. Bearing this in mind, we remain quite sure that our information is correct. We also disagree with one of the comments of our reviewer: 0.18 is not the largest correlation estimated in our article. We observed a negative correlation of −0.360 between intracellular sperm Ca²⁺ concentrations and PTRR of motile sperm.

Marcos Meseguer, Ph.D.
Nicolás Garrido, Ph.D.
Ernesto Bosch-Bastida, M.D.
Instituto Valenciano de Infertilidad
Valencia, Spain
April 20, 2004
Reference
1. Meseguer M, Garrido N, Martínez-Conejero JA, Simón C, Pellicer A, Remohi J. Role of cholesterol, calcium, and mitochondrial activity in the susceptibility for cryodamage after a cycle of freezing and thawing. Fertil Steril 2004;81:588–94.
doi:10.1016/j.fertnstert.2004.05.018
Editorial Commentary
Measurement error! Communication between content experts and statisticians

The authors of the paper by Meseguer and colleagues are well known and well published in the reproductive sciences (1). They have made many significant contributions to the literature. In this study, the outcome variable, which is the mean progressive post-thaw recovery rate (PTRR) of sperm, and the predictor variables (cholesterol, calcium, mitochondrial activity) are measured with error. This is especially true for the estimates of PTRR. As indicated by the correspondents, all of the estimates of correlation will be biased unless the analysis corrects for measurement error. These adjustments must be considered at the outset of the study. Doctor Schisterman's comments are especially relevant.

The authors state in the Methods section and in the legends above the figures that they used linear regression analysis followed by analysis of variance (ANOVA) to analyze the data. The X variable in linear regression analysis is considered to be fixed and measured without error, whereas the Y variable is dependent and subject to measurement error. In linear regression, the estimates of the slope of the line and of the intercept are affected entirely by the variability in the Y variable (2). However, the summary measures in the article by Meseguer and colleagues indicate that correlation analysis was used rather than linear regression analysis.