The Importance of Attending to Underlying Statistical Assumptions

Hongwei Yang, PhD and Schuyler W. Huck, PhD

Statistical tools are delicate instruments. When used properly, they help quantitative researchers illuminate relationships important to both practitioners and theorists; when used carelessly, they can bring about unjustified, distorted, and/or misleading claims. Almost every statistical tool has underlying assumptions (ie, prerequisite conditions) that supposedly govern its use. In this article, four questions concerning such assumptions are raised: What is an underlying statistical assumption? What are the consequences, if any, of violating them? Do applied researchers (and particularly those who publish articles in Newborn and Infant Nursing Reviews) pay attention to the assumptions that underlie the statistical procedures they use? What new insights into statistical assumptions have come about during the last 10 to 15 years?

Keywords: Statistical tools; Assumptions; Quantitative research

From the Department of Educational Policy Studies and Evaluation, University of Kentucky, Lexington, KY, and the Department of Educational Psychology and Counseling, University of Tennessee, Knoxville, TN. Address correspondence to Schuyler W. Huck, PhD, Department of Educational Psychology and Counseling, University of Tennessee, 1122 Volunteer Blvd, CC446 Claxton Complex, Knoxville, TN 37996-3452. E-mails: [email protected], [email protected]. URL: http://web.utk.edu/~edpsych/f_s/huck.html.

Newborn and Infant Nursing Reviews, March 2010, Volume 10, Number 1. www.nainr.com
© 2010 Elsevier Inc. All rights reserved. 1527-3369/09/1001-0339$36.00/0 doi:10.1053/j.nainr.2009.12.005

In the opening article of the inaugural issue of Newborn and Infant Nursing Reviews (NAINR), the editors stated that this journal's purpose was to "provide current education and research on new trends in newborn and infant nursing care."1 They went on to say that "whenever possible, content is based on solid scientific knowledge" and "we know quality is important to our readers and our authors." In the most recent issue of the journal (Volume 9, No. 3), the current editor reiterated this concern by saying "infants and families are relying on us to provide the best care possible based on the latest rigorous science."2

Even a casual review of any of this journal's many issues reveals that a host of individuals have worked hard to provide the "education and research" referred to by the journal's initial editors in 2001. Authors, members of the Editorial Board, and Guest Editors have succeeded in populating the journal with articles that are impressive in their content, appearance, and usefulness. Without question, NAINR is a top-flight, high-quality scholarly journal. But can it get better? We think so.

The majority of this journal's research articles involve either reviews of previous research or presentations of authors' primary research findings. (Articles of both types have appeared mainly in sections called Research Articles, the Research Corner, or simply Articles.) A large number of these articles contain findings and recommendations that are based on some form of statistical analysis. Can these statistically based studies be improved? We believe so. If this improvement occurs, it will help NAINR achieve its goal of sharing solid scientific research based on the latest rigorous science.

A variety of evaluative criteria ought to be considered when examining the merits of any statistically based investigation. Many of them are considered in other parts of this special issue of NAINR. The focus of this particular article is limited to just one facet of such research studies: statistical assumptions. Our thesis is that everyone connected to this journal—authors, editorial reviewers, and the practitioners who read what's published—will be better off if they understand what an assumption is, if they realize how a statistical result can be distorted if an assumption is violated, and if they comprehend what the term robust means. Simply stated, we believe that NAINR's content will be better grounded in "solid scientific knowledge" if greater attention is devoted to underlying statistical assumptions.

It should be noted that the problem of not attending to statistical assumptions is characteristic of many fields, not just nursing. For example, Keselman et al3 provide evidence that "the vast majority of educational researchers are conducting their statistical analyses without taking into account the distributional assumptions of the procedures they are using." In psychology, its journals have been "littered with nonsignificant results [due to low power caused by violations of assumptions] that would have been significant if a more modern method had been used."4 In the field of sports science, the reliability of measurements is a major concern; however, a review of published articles revealed that "the important assumption regarding the relationship between error and the magnitude of the measured value is rarely explored by reliability researchers."5

The remaining portion of this article is divided into four sections.
First, we clarify what a statistical assumption is. Second, we show how statistical results can be distorted if an assumption is violated. Third, we look inside NAINR to see if assumptions are being attended to by the authors of statistically based research articles. Finally, we review what new knowledge about statistical assumptions has been gained over the past 10 to 15 years.

Statistical Assumption: Definition, Analogy, Examples

A statistical assumption is a "prerequisite condition" that must exist for a statistical tool to do its job. Referred to at times as an underlying assumption or a background assumption, a statistical assumption deals with something other than where the main statistical focus will be. In other words, a statistical assumption is a supposition about one thing that, if true, allows something else, the main statistical tool, to operate effectively.

Perhaps an analogy will help. An upright vacuum cleaner is designed to remove things such as dust, lint, and food particles from the floor. The vacuum will function as intended, however, only if certain "prerequisite conditions" exist. The vacuum must be set properly for the flooring material (carpet or wood) on which it is used, its internal belts must not be broken, the electric cord must be plugged in, and the vacuum's dust bag must not be full. If any of these conditions does not exist, the vacuum will not work very well—or at all.

Most statistical procedures can be placed into one of two categories: those that are purely descriptive and those that are inferential. The statistical tools that fall into the first of these categories have three main assumptions. First, the level of measurement in the data should meet or exceed the demands of the statistical tool. (Thus, the mean requires data that are interval or ratio in nature.) Second, data should be reliable and valid. (Pearson's r, for example, is attenuated by unreliability in the X and/or Y scores.) Third, certain descriptive techniques (such as the mean and r) provide accurate summaries only if there are no outliers in the data.

Statistical procedures that are inferential in nature have the same prerequisite conditions that descriptive techniques have: proper level of measurement, reliable and valid data, and the absence of outliers. In addition, however, inferential procedures—whether dealing with estimation or hypothesis testing—have additional assumptions. (Even nonparametric procedures—which often involve data in the form of ranks—have assumptions, despite the widespread yet incorrect belief that they are "assumption-free.") An obvious assumption of all inferential procedures is that any sample is a random subset of the population to which an inference will be made. Beyond randomness, many other assumptions are associated with the statistical tools applied researchers use to make sample-to-population inferences. Certain of these assumptions (such as normality and homogeneity of variance) are common to a wide variety of inferential procedures; others (such as sphericity and the absence of collinearity) are tethered to only a few.

In the remaining portion of this article, our focus is on the assumptions associated with inferential procedures. This focus seems reasonable for two reasons. First, the findings of most statistically based studies rely on inferential techniques. Second, as pressure builds on applied researchers to use more advanced techniques in their studies (so as to have their investigations appear "elegant" and "sophisticated" in the eyes of journal editors), there is a danger that researchers will pay less attention to, or neglect totally, the assumptions of their chosen statistical procedures.
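The vulnerability of descriptive tools to outliers, mentioned above, is easy to demonstrate. The short Python sketch below uses invented scores (not data from any study discussed here) to show a single aberrant pair of values pulling the mean away from the bulk of the data and reversing the sign of a previously perfect Pearson r:

```python
import numpy as np

# Ten well-behaved paired scores: y is identical to x, so r = 1.0
# and the mean is a faithful summary of the typical score.
x = np.arange(1.0, 11.0)          # 1, 2, ..., 10
y = x.copy()

mean_clean = x.mean()                      # 5.5
r_clean = np.corrcoef(x, y)[0, 1]          # 1.0

# One wild pair of scores (a data-entry error, say) is appended.
x_out = np.append(x, 100.0)
y_out = np.append(y, -100.0)

mean_out = x_out.mean()                    # the mean jumps from 5.5 to about 14.1
r_out = np.corrcoef(x_out, y_out)[0, 1]    # a perfect positive r turns negative

print(mean_clean, r_clean)
print(mean_out, r_out)
```

The median of the contaminated x scores, by contrast, is barely moved, which is why outlier screening (or outlier-resistant summaries) is a prerequisite for trusting the mean and r.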

Consequences of Violating Assumptions

If an assumption is violated, it is possible that one's statistical tool will malfunction badly. To illustrate, four examples will be considered involving statistical tests used by researchers who have presented their findings in NAINR. These statistical procedures are (a) an independent-samples t test, (b) a two-way mixed analysis of variance (ANOVA), (c) the analysis of covariance, and (d) a multiple regression.

One of the assumptions of an independent-samples t test is equality of population variances. If this assumption is violated, the Type I error rate can be substantially different from the nominal level of significance. To provide proof of this, the authors conducted a Monte Carlo simulation study involving equal population means but differing population variances and sample sizes. With α set equal to .05, a total of 100,000 trials of the simulation were run. Results indicated that the incidence of Type I errors (stated like a probability) ranged from .0004 to .38, depending on whether the larger sample came from the more homogeneous or heterogeneous population. (When the larger sample came from the less variable population, the Type I error rate was higher than .05; in contrast, the true null hypothesis was incorrectly rejected less often than should have been the case when the larger sample came from the more heterogeneous population.) In this investigation, the .05 level of significance was "nominal" in the sense that it was truthful "in name only."

A two-factor mixed ANOVA (ie, a two-way ANOVA with repeated measures on one factor) has the same assumptions as a fully between-subjects two-way ANOVA—normality, homogeneity of variance, independence of observations—plus a special assumption called sphericity. This assumption is relevant to the F ratios for the main effect of the repeated measures factor and the interaction, the Fs of most interest to many researchers. It deals with the population values of the variance-covariance matrix, and the assumption holds if the variances of the differences between values of the dependent variable are the same for all bivariate combinations of the repeated-measures factor.6–8 Violations of this assumption can seriously affect results.9 This is because the computed F ratios for the interaction and the main effect of the repeated measures are positively biased (ie, too large) when sphericity does not hold.10 Thus, Fs that ought to be declared nonsignificant may appear to be significant (because P < .05).

Our third example concerns the analysis of covariance (ANCOVA). More than a decade ago, nursing researchers were reminded of the "uses and abuses" of this powerful, yet delicate, statistical tool.11 Analysis of covariance has several important assumptions, one of which states that the regression slopes in the various populations are equal. Because the adjusted means compared by ANCOVA are calculated using a pooled regression slope, the adjusted mean for one or more groups will be inaccurate if the assumption of equal regression slopes is violated. Moreover, a Monte Carlo simulation study has revealed "large discrepancies between the empirical alpha levels for ANCOVA and the corresponding nominal alpha" when regression slopes differ in the presence of unequal sample sizes.12 Simply put, the analysis of covariance is inappropriate if the assumption of equal regression slopes is violated.13

A multiple regression involves many assumptions, one of which is homoscedasticity. This assumption is analogous to the homogeneity of variance assumption of other test procedures (eg, t tests, ANOVA, ANCOVA), for it states that the variance of residuals is constant across different values of the independent variables. When this assumption is violated, a condition called heteroscedasticity is said to exist. The estimates of regression coefficients remain unbiased when the homoscedasticity assumption is violated; however, heteroscedasticity causes the standard errors for these estimates to be incorrect, thus making the significance test focused on each independent variable inaccurate.14

It should be noted that we have discussed just one assumption for each of the four statistical procedures considered above. These procedures have other assumptions, and our failure to discuss them should not be interpreted to mean that these other assumptions are unimportant. Our goal has not been to present a complete examination of each procedure's full set of underlying assumptions. Instead, our purpose has been to illustrate, via a consideration of one assumption per procedure, why it is important to consider a statistical tool's "prerequisite conditions."

One final point needs to be made in our discussion of the "consequences" of violating assumptions. Under certain conditions, a statistical procedure can be "robust" to one or more of its assumptions. This means that the statistical procedure will function as intended even if the assumption is violated. For example, equal and large sample sizes allow an independent-samples t test to work properly even if the equal variance assumption is violated. Equal and large sample sizes, however, do not make all statistical procedures robust to all of their assumptions. Accordingly, conscientious researchers should become familiar with (and discuss) the assumptions of the statistical tools they use. As indicated in our four illustrative examples, violations of assumptions can have severely adverse consequences.
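A small simulation in the spirit of the one described above can be run by any interested reader. The sketch below is our own illustration (not a reproduction of the 100,000-trial study): both samples come from populations with identical means, the larger sample is paired with the smaller variance, and we tally how often the true null hypothesis is rejected at α = .05 by the usual pooled-variance t test and by Welch's unequal-variance version:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2010)
alpha = 0.05
n1, sd1 = 40, 1.0   # larger sample from the LESS variable population
n2, sd2 = 10, 4.0   # smaller sample from the MORE variable population
trials = 20_000

reject_pooled = reject_welch = 0
for _ in range(trials):
    a = rng.normal(0.0, sd1, n1)   # equal population means, so H0 is true
    b = rng.normal(0.0, sd2, n2)
    if stats.ttest_ind(a, b, equal_var=True).pvalue < alpha:
        reject_pooled += 1
    if stats.ttest_ind(a, b, equal_var=False).pvalue < alpha:
        reject_welch += 1

rate_pooled = reject_pooled / trials   # far above the nominal .05
rate_welch = reject_welch / trials     # close to the nominal .05
print(rate_pooled, rate_welch)
```

Reversing the pairing (larger sample drawn from the more variable population) pushes the pooled-variance rejection rate below .05, matching the conservative side of the pattern described above; the Welch test stays near the nominal level in both cases.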

Concern for Assumptions Among NAINR Researchers

If statistical assumptions are important and should be considered by applied researchers, it seems reasonable to ask the obvious question: Do published research articles in NAINR indicate that the researchers who conducted these studies attended to the assumptions of the statistical tools upon which the findings were based? To answer this question, we went to the primary source and examined NAINR articles.


We focused our investigation on the most recent 5 years of NAINR: Volumes 5 through 9. We examined 19 issues of the journal (four per year, except for Issue 4 of Volume 9, which had not yet been distributed at the time this article was written). We restricted our survey to statistically based studies, and we looked into two kinds of articles: (a) those that summarized the authors' own research studies and (b) those that summarized the work of other researchers.

Initially, all 142 research articles in the five volumes of NAINR were examined; of these, 15 represented original research and 127 were "review" pieces. Of the articles in the latter category (wherein others' research results were reviewed), 41 clearly summarized the results of one or more statistically based studies. Thus, we looked closely at 15 + 41 = 56 of the 142 articles. We suspect that many, and perhaps most, of the 86 "review" articles not included in our sample also summarized statistically based studies. However, our selection criteria demanded that a review article, to be included in our investigation, had to contain the notation P < .05—or something similar—or the word significant used to summarize the findings of a statistical test. (Phrases such as significant issue, significant strides, and significant implications did not cause an article to enter our sample.)

Our examination of the 15 articles that summarized original research produced mixed results. Some of these authors (n = 4) attended to at least one statistical assumption; others (n = 11) did not. For example, the normality assumption of an independent-samples t test was dealt with by the authors of one study, who normalized their data via a logarithmic transformation before conducting the t test.15 In a different article, the results of a two-way mixed ANOVA were presented with nothing said about the sphericity assumption or the application of any adjustment (eg, the Huynh-Feldt correction) to compensate for nonsphericity.16 In a third article, the analysis of covariance was used as the primary statistical tool, yet no mention was made of the assumption of equal regression slopes (or any of ANCOVA's other assumptions).17 In a fourth article, a multiple regression was used with attention devoted to multicollinearity, a situation that distorts multiple regression; however, nothing was said about other important assumptions (normality, linearity, homoscedasticity, and the absence of outliers).18

Regarding the 41 "review" articles we examined, most statistically based studies were summarized in one of two ways. In most instances, the review article's author provided a one- to two-sentence summary of someone else's research finding. Here is an example:

Furman and Minich (2004) explored the efficiency of breastfeeding vs bottlefeeding in two groups of comparable preterm infants at 35 weeks of GA. They found that when infants breastfed or bottlefed for the same amount of time, the breastfed infants took in significantly less volume than the bottlefeeding infants (6.5 vs 30.5 mL), fed less efficiently (0.6 vs 2.2 mL/min), and spent less time with sucking bursts (33% vs 55%).19


In a smaller number of instances, many of the details of the original study were provided by the author of the review article. The following is an example of this kind of presentation:

A recently reported RCT supported an intervention which used both problem-focused and emotion-focused strategies. The Mother-Infant Transaction Program, an early intervention program to reduce parental stress, was evaluated in a Norwegian study of preterm infants (71 intervention preterm infants, 67 control preterm infants, 72 control term infants). This intervention involves 8 sessions during hospitalization and 4 home visits by specially skilled nurses in which parents are given education and training on infant development and "reading" behavioral cues (facilitating problem-focused coping). In the initial visit, parents were encouraged to express their feelings related to grief or blame and their experiences during hospitalization (supporting emotion-focused coping). The study demonstrated that parents of preterm babies who received the intervention had significantly lower stress than the parents of preterm babies who did not receive the intervention.20

Many, many studies were summarized, either briefly or in a more detailed fashion, in the 41 review articles that were examined. Collectively, these articles had 1897 references (M = 46.3, SD = 24.2). In a few of the review articles, earlier research was evaluated on the basis of considerations such as the kind of research design used, population heterogeneity, sample size, and statistical power.21,22 However, in no case was any comment made about a previous study's statistical analysis being trustworthy—or untrustworthy—due to a consideration of statistical assumptions.

To illustrate why it is important to attend to statistical assumptions when reviewing others' research, consider this passage from a particular review article in NAINR:

Some studies are reporting increased vocalization with preterm infants, but these behaviors are less likely to be contingent on maternal behavior.… One exception was a study conducted by Forcada-Guex et al, which reported a controlling style in their maternal sample that resulted in more compulsive compliant behavior in infants who already are more passive when interacting.23

We examined the original study (published in Pediatrics) referred to in this passage.24 In the original study, the researchers used an independent-samples t test to compare their study's two groups, saying "Student's t tests were used to compare groups on demographic variables, gestational age, PERI score, and SES." One sample size was nearly double the other (47 vs 25), and the groups' sample variances on the Perinatal Risk Inventory (.25 and 12.96) had a ratio of about 50 to 1. The study's data suggest strongly that the t test's assumption of equal population variances was violated, and the different sample sizes did not make the t test robust.


However, no mention was made of the t test's assumptions or of invoking an unequal-variance version of the t test to compare the sample means on the PERI inventory.

In discussing the above example, we are not denigrating Pediatrics (a top-flight research journal) or the authors who conducted the original study we considered. Instead, our goal is to make a simple yet important point: the conclusions reached in earlier studies, if based on inferential statistical procedures, should not be passed along to others (by the author of a review article) unless the researchers of the original studies indicated that they attended to the assumptions of their statistical tools.

In an ideal world, a concern for statistical assumptions would be raised at the time research reports are initially submitted to journals for possible publication. In reality, however, editors and reviewers sometimes focus their attention so much on the goals, methodology, and findings of research investigations that little or no concern is aimed at statistical assumptions. This being the case, it is prudent for authors of review articles to carefully screen others' studies before passing along their claimed "discoveries."
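An unequal-variance (Welch) t test of the kind just mentioned can even be run from reported summary statistics. The sketch below uses the sample sizes and PERI variances quoted above, but the two group means are hypothetical placeholders, and the pairing of the larger variance with the smaller group is our assumption for illustration (the pairing is not stated above):

```python
from scipy import stats

# n's and variances as quoted above; the MEANS (1.0 and 2.2) are invented
# purely to make the sketch runnable, and the variance-to-group pairing
# is assumed, not taken from the original report.
n1, var1, m1 = 47, 0.25, 1.0
n2, var2, m2 = 25, 12.96, 2.2

pooled = stats.ttest_ind_from_stats(m1, var1 ** 0.5, n1,
                                    m2, var2 ** 0.5, n2, equal_var=True)
welch = stats.ttest_ind_from_stats(m1, var1 ** 0.5, n1,
                                   m2, var2 ** 0.5, n2, equal_var=False)

print(pooled.pvalue, welch.pvalue)
```

With the larger variance sitting in the smaller group, the pooled-variance standard error is too small, so the ordinary t test yields the smaller (overly optimistic) p value; Welch's version gives the more defensible answer.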

Recent Insights Into Statistical Assumptions

Many insights into statistical assumptions have been gained over the past 10 to 15 years. Space limitations preclude a discussion of all or even most of these insights. However, three that are especially relevant to applied researchers need to be shared with those who read or contribute to NAINR. These three insights concern the use of preliminary tests of assumptions, the appropriateness of using nonparametric tests when assumptions are violated, and the development of new techniques for assessing assumptions.

For decades, applied researchers have been advised to follow a two-stage approach to their data analysis. In stage 1, important assumptions are tested; in stage 2, the central hypotheses of the study are tested by a test procedure deemed "proper" by the outcome of the stage 1 investigation. For example, in a two-group comparison of means, an initial check of the sample variances (eg, via Levene's test) might or might not suggest that the equal variance assumption is violated. If the assumption seems tenable, a t test would then be used in stage 2. However, if the initial test leads to a rejection of the equal variance null hypothesis, stage 2 might involve the application of a data transformation (to "stabilize" the variances) or the application of a nonparametric test.

Recent research has shown that this two-stage approach is improper. This is because the theoretical distribution of the test conducted in stage 2 is conditional on the results of the initial check on the assumption(s). This situation can so substantially alter the actual Type I and Type II error rates for the stage 2 test (as compared with their nominal rates) as to render the stage 2 test invalid.25–27 There are recommended ways to circumvent this problem, and the reader is urged to examine the available advice.28
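The two-stage ritual just described is easy to express in code, which also makes its conditional character obvious: which stage 2 test gets run depends on the data themselves. A minimal sketch (the function and variable names are ours):

```python
import numpy as np
from scipy import stats

def two_stage_t(a, b, alpha=0.05):
    """The traditional (now discouraged) two-stage procedure:
    stage 1 checks the equal-variance assumption with Levene's test;
    stage 2 runs whichever t test stage 1 'approved'."""
    if stats.levene(a, b).pvalue < alpha:
        return "welch", stats.ttest_ind(a, b, equal_var=False).pvalue
    return "student", stats.ttest_ind(a, b, equal_var=True).pvalue

# Grossly unequal spreads: Levene rejects, so Welch's test is chosen.
a = np.arange(20) * 0.01
b = np.arange(20) * 1.00
choice_unequal, _ = two_stage_t(a, b)

# Identical spreads: Levene does not reject, so Student's test is chosen.
c = np.arange(20) * 1.0
d = np.arange(20) * 1.0 + 5.0
choice_equal, _ = two_stage_t(c, d)

print(choice_unequal, choice_equal)
```

Because the stage 2 test is selected by the data, its sampling distribution is no longer the one its p value assumes, which is the root of the problem documented in the work cited above; one commonly recommended alternative is to skip the preliminary test entirely and use the unequal-variance (Welch) test unconditionally.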


The second recent insight to be discussed here concerns the long-held view that nonparametric tests are superior to parametric procedures in those situations where assumptions seem to be violated. This line of thinking, for example, might prompt one to use a Wilcoxon-Mann-Whitney U test rather than an independent-samples t test (or to use the Kruskal-Wallis test rather than a one-way ANOVA) if the comparison groups have grossly unequal sample sizes and quite disparate sample variances. Thinking that the nonparametric test is "distribution-free," many applied researchers would choose to use U rather than t (or H rather than F).

The results of a Monte Carlo simulation study challenge the notion that nonparametric procedures are uniformly better than parametric ones when assumptions are violated.29 In this study, the actual Type I error rates and power values for Student's t test and the Wilcoxon-Mann-Whitney test were compared using combinations of 10 different distributional shapes, seven different degrees of variance heterogeneity, and various ratios of sample size inequality. A total of 10,000 trials were run for each combination of conditions. Results indicated that the Wilcoxon-Mann-Whitney test performed worse than the t test when the variance and normality assumptions were both violated and when the larger variance was paired with the smaller sample size.

The third insight we want to mention concerns new techniques that have been developed to tackle violations of standard statistical assumptions. Among them are multiple imputation for handling missing data,30 weighted least squares estimation for stabilizing heterogeneity of regression error variance,31 two-stage least squares for dealing with correlated predictors and regression errors (ie, violation of recursivity),32 optimal scaling for quantifying nominal and ordinal data,33 and the use of the generalized linear model and robust statistics for dealing with violations of the normality assumption.34,35
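To make one of the remedies just listed concrete, the sketch below (simulated data, plain NumPy) fits the same heteroscedastic regression by ordinary and by weighted least squares. Consistent with the earlier discussion, both slope estimates are unbiased; what weighting repairs is the precision of the estimates and the validity of their standard errors. The variance function used for the weights is known here by construction; in practice it would have to be modeled:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400
x = rng.uniform(1.0, 10.0, n)
# Error spread grows with x: classic heteroscedasticity.
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

X = np.column_stack([np.ones(n), x])

# Ordinary least squares: coefficients remain unbiased despite the
# heteroscedasticity, but the usual standard errors would be wrong.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Weighted least squares: weight each case by 1 / variance (here
# proportional to 1/x**2), which stabilizes the error variance.
w = 1.0 / x**2
Xw = X * np.sqrt(w)[:, None]
yw = y * np.sqrt(w)
beta_wls = np.linalg.lstsq(Xw, yw, rcond=None)[0]

print(beta_ols, beta_wls)   # both intercept/slope pairs sit near (2, 3)
```

With enough data, both fits recover slopes close to the true value of 3, illustrating the point made above for multiple regression: heteroscedasticity corrupts the standard errors and significance tests, not the coefficient estimates themselves.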

Discussion

Twenty-five years ago, two authors writing in the Western Journal of Nursing Research commented on the degree to which nursing researchers pay attention to statistical assumptions. They said:

Nursing scientists have increasingly used more sophisticated statistical techniques in data analysis in order to explore and confirm theoretical relationships.… However, researchers seldom proceed with the analysis of their data by assessing [possible] violations in model assumptions.36

To achieve the goal, expressed in 2001 and 2009 by the editors of this journal, that the content of NAINR will be based on "solid scientific knowledge" and its recommendations to families will come from "rigorous science," researchers who conduct their own original research or who prepare review articles that summarize others' research investigations should attend to the underlying assumptions of the statistical tools used to generate research findings. Doing this is critical to the presentation of trustworthy information, and it most certainly is not a waste of time.

As indicated earlier in this article, statistical tools may not function as intended if assumptions are violated. Type I error rates and power can be corrupted in the presence of things like nonsphericity, heteroscedasticity, and unequal regression slopes. To some, it may seem trivial that the actual probability of a Type I error is higher or lower than the nominal α level by a small amount. However, our review of published articles in NAINR shows that there is a widespread acceptance of .05 as the criterion for establishing statistical significance. Relationships are said to exist and differences are said to be found if P < .05. This is not the case if P > .05. But if important assumptions are violated, a result that's tagged as significant might actually deserve to be labeled as not significant (and vice versa). Such mislabeling of research outcomes can have an enormous impact on the evolving body of presumed knowledge in a discipline.

It should be noted, of course, that paying attention to the underlying assumptions of one's chosen statistical tools is only one of many markers of a well-trained researcher. Such researchers will set up carefully designed studies that address important issues. The phrase carefully designed covers a host of important concerns—population specification, sampling procedures, instrument selection/development, sample size determination, control over extraneous variables, blindedness, and ethics, to name but a few. Clearly, a focus on assumptions cannot cause an investigation to become good if it has an unimportant goal or a deficient methodology. It is also clear, however, that studies with worthy goals and sound methodology can be rendered useless by improper data analysis. Proper data analysis itself is multifaceted.

As this article has shown, we hope, a critical part of the data analysis phase of high-quality quantitative studies requires that researchers know about, and attend to, the underlying assumptions of their investigations' statistical procedures.


References

1. Kenner C, Lott JL, Strodtbeck F. Introduction. Newborn Infant Nurs Rev. 2001;1:1.
2. Altimier L. Newborn and infant nursing reviews. Newborn Infant Nurs Rev. 2009;9:129.
3. Keselman HJ, Huberty CJ, Lix LM, et al. Statistical practices of educational researchers: an analysis of their ANOVA, MANOVA, and ANCOVA analyses. Rev Educ Res. 1998;68:350-386. p. 353.
4. Wilcox RR. How many discoveries have been lost by ignoring modern statistical methods? Am Psych. 1998;53:300-314.
5. Atkinson G, Nevill AM. Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine. Sports Med. 1998;26:217-238. p. 235.
6. Quinn GP, Keough MJ. Experimental design and data analysis for biologists. Cambridge: Cambridge University Press; 2002. p. 281.
7. GraphPad. Repeated measures ANOVA, sphericity and compound symmetry. Available at: http://graphpad.com/faq/viewfaq.cfm?faq=1500 [Accessed November 4, 2008].
8. Baguley T. An introduction to sphericity. Available at: http://homepages.gold.ac.uk/aphome/spheric.html [Accessed November 12, 2009].
9. Leech NL, Barrett KC, Morgan GC. SPSS for intermediate statistics: use and interpretation. 3rd ed. New York: Routledge; 2007. p. 147.
10. Landau S, Everitt B. A handbook of statistical analyses using SPSS. Boca Raton, FL: CRC Press; 2003. p. 174.
11. Owen SV, Froman RD. Uses and abuses of the analysis of covariance. Res Nurs Health. 1998;21:557-562.
12. Hamilton B. An empirical investigation of the effects of heterogeneous regression slopes in the analysis of covariance. Educ Psych Meas. 1977;37:701-712.
13. Tabachnick BG, Fidell LS. Using multivariate statistics. 5th ed. Boston: Pearson; 2007. p. 213.
14. Cohen J, Cohen P, West SG, et al. Applied multiple regression/correlation for the behavioral sciences. 3rd ed. Mahwah, NJ: Lawrence Erlbaum; 2003. p. 120.
15. McCain GC, Fuller EO, Gartside PS. Heart rate variability and feeding bradycardia in healthy preterm infants during transition from gavage to oral feeding. Newborn Infant Nurs Rev. 2005;5:124-132.
16. Hill A. The effects of nonnutritive sucking and oral support on the feeding efficiency of preterm infants. Newborn Infant Nurs Rev. 2005;5:133-141.
17. Ludwig S, Steichen J, Khoury J, et al. Quality improvement analysis of developmental care in infants less than 1500 grams at birth. Newborn Infant Nurs Rev. 2008;8:94-100.
18. Brown LF, Pridham K. The effect of maternal depressive symptoms and early maternal feeding behavior on later infant feeding behavior. Newborn Infant Nurs Rev. 2007;7:56-63.
19. Breton S, Steinwender S. Timing introduction and transition to oral feeding in preterm infants: current trends and practice. Newborn Infant Nurs Rev. 2008;8:153-159. p. 155.
20. Howland LC. Preterm birth: implications for family stress and coping. Newborn Infant Nurs Rev. 2007;7:14-19. p. 17-18.


21. Otova M. Heparin safety in the neonatal intensive care unit: are we learning from mistakes of others? Newborn Infant Nurs Rev. 2009;9:53-59.
22. Hurst N. Assessing and facilitating milk transfer during breastfeeding for the premature infant. Newborn Infant Nurs Rev. 2005;1:19-26.
23. Bozzette M. A review of research on premature infant-mother interaction. Newborn Infant Nurs Rev. 2007;7:49-55. p. 54.
24. Forcada-Guex M, Pierrehumbert B, Borghini A, et al. Early dyadic patterns of mother–infant interactions and outcomes of prematurity at 18 months. Pediatrics. 2006;118:e107-e114.
25. Zimmerman DW. A note on preliminary tests of equality of variances. Brit J Math Stat Psych. 2004;57:173-181.
26. Rasch D, Kubinger KD, Moder K. The two-sample t test: pre-testing its assumptions does not pay off. Statistical Papers. Available at: http://www.springerlink.com/content/m66w1517r47m4000/ [Accessed October 28, 2009].
27. Shuster JJ. Diagnostics for assumptions in moderate to large simple clinical trials: do they really help? Stat Med. 2005;24:2431-2438.
28. Wells CS, Hintze JM. Dealing with assumptions underlying statistical tests. Psych Schools. 2007;44:495-502.
29. Zimmerman DW. Invalidation of parametric and nonparametric statistical tests by concurrent violation of two assumptions. J Exp Educ. 1998;67(1):55-68.
30. Rubin DB. Multiple imputation for nonresponse in surveys. Hoboken, NJ: Wiley; 1987.
31. White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980;48:817-838.
32. Pearl J. Causality: models, reasoning, and inference. Cambridge, United Kingdom: Cambridge University Press; 2000.
33. Gifi A. Nonlinear multivariate analysis. Hoboken, NJ: Wiley; 1990.
34. Hoffmann JP. Generalized linear models. Boston: Allyn and Bacon; 2003.
35. Huber PJ, Ronchetti EM. Robust statistics. 2nd ed. Hoboken, NJ: Wiley; 2009.
36. Verran JA, Ferketich SL. Residual analysis for statistical assumptions of regression equations. West J Nurs Res. 1984;6:27-40.
