Clinical Radiology (1996) 51, 47-50
A Review of the Statistical Analysis Used in Papers Published in Clinical Radiology and British Journal of Radiology
J. GOLDIN*, W. ZHU† and J. W. SAYRE*†
Departments of *Radiological Sciences and †Biostatistics, University of California, LA, USA
Statistical analyses such as significance testing have become essential features of published medical studies. This has resulted in an increased frequency with which statistics are used, making the interpretation of scientific publications more difficult. There is an extensive array of tests and techniques. The aim of this study is to identify which statistical tests are used in radiology publications. All major articles published in Clinical Radiology and British Journal of Radiology in one year were reviewed. The frequency of statistical methods used was as follows: no statistical method or descriptive statistics only 103 (47%), one type of statistical method 67 (31%), and two or more methods 47 (22%). Statistics dealing with basic inference, decisions, contingency tables or correlation/regression techniques were found in 124 (53%) of the articles in which a procedure had been used. Advanced statistics including receiver operating characteristics (ROC), odds ratio, regression techniques, multiway ANOVA, and nonparametric ANOVA studies accounted for only 41 (19%) of the articles in which a procedure had been used. We conclude that descriptive analysis and basic statistical techniques account for most of the statistical tests reported. Physicians should concentrate on improving their understanding of basic statistics, but advice should be sought from professionals in the fields of biostatistics and epidemiology as to whether the use of more advanced techniques would be more appropriate.
Goldin, J., Zhu, W. & Sayre, J.W. (1996) Clinical Radiology 51, 47-50. A Review of the Statistical Analysis Used in Papers Published in Clinical Radiology and British Journal of Radiology
Accepted for Publication 31 August 1995
In the reporting of studies, methods of statistical analysis such as significance testing have become essential features with the aim of ensuring that conclusions are based on evidence rather than opinions. This has resulted in an increased frequency with which statistical tests are used, making the interpretation of scientific publications more difficult [1-7]. As a consequence it has become important for investigators, reviewers and editors to ensure the validity of statistical tests used and the correct interpretation of their significance. This is particularly important given the poor statistical knowledge of physicians demonstrated in several recent studies [8-10]. There is an extensive array of available statistical tests and techniques. Two surveys of the statistical content of medical journals have demonstrated that just under 50% of the articles used either descriptive statistics or no statistics, and that an additional 30% of the articles used only basic statistics [11,12]. To date only one other investigation has reported the statistical content in specialist radiology publications [12]. Descriptive analysis, simple inference (t-tests), contingency table analysis (χ²-tests) and basic correlation/regression techniques are the most common statistical tests reported in the medical literature [12-27]. The aim of this investigation was to identify what methods of statistical analysis are used in radiology investigations.
Correspondence to: Dr Jonathan Goldin, Department of Radiological Sciences, University of California, Los Angeles, 10833 Le Conte Avenue, BL-721 CHS, Los Angeles, CA 90095-1721, USA. © 1996 The Royal College of Radiologists.
METHODS
We examined all articles published between January and December 1994, inclusive, in two dedicated general radiology journals, Clinical Radiology (CR) and British Journal of Radiology (BJR). Only original articles were included, defined for the purposes of this investigation as studies describing imaging findings in at least 10 patients or experimental studies in which at least 10 measurements of any kind had been obtained. Smaller studies, technical notes, case reports, review articles, abstracts and letters were excluded from this analysis. Each article was reviewed to determine the types of statistical methods contained. Statistical tests were classified according to categories based on those used in other similar investigations [7,12] and are summarized in Table 1. The cumulative accessibility, defined as the percentage of articles to which a reader would have full access by being familiar with a given statistical procedure as well as all simpler procedures, was also calculated [12] (see Table 2). Investigations were also subdivided into the following organ system or radiologic subspecialties: abdominal, breast, cardiac, chest, contrast media, genitourinary, interventional, musculoskeletal, neuroradiology, paediatric, vascular, and miscellaneous. To reduce the number of categories, closely related subjects were grouped together. Equipment evaluations and subspecialty sections in which fewer than five major articles had been published over the year were included in the miscellaneous category.
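The cumulative accessibility defined above can be illustrated with a minimal sketch. It assumes each article is represented as the set of statistical categories it uses (an empty set for articles with no statistics or descriptive statistics only) and that the categories are supplied in order from simplest to most advanced. The function name, the toy article list and the category ordering are hypothetical illustrations, not the study data or the authors' procedure.

```python
from typing import List, Sequence, Set, Tuple


def cumulative_accessibility(
    articles: Sequence[Set[str]],
    ordered_methods: Sequence[str],
) -> List[Tuple[str, float]]:
    """Return, for each method in order, the percentage of articles whose
    methods are all contained in the methods known up to that point."""
    known: Set[str] = set()
    rows: List[Tuple[str, float]] = []
    for method in ordered_methods:
        known.add(method)
        # An article is fully accessible when every method it uses is known.
        accessible = sum(1 for used in articles if used <= known)
        rows.append((method, 100.0 * accessible / len(articles)))
    return rows


# Hypothetical toy data: ten articles, not taken from the study.
toy_articles = [
    set(), set(), set(), set(),                # none/descriptive only
    {"t- and z-tests"},
    {"t- and z-tests"},
    {"contingency tables"},
    {"t- and z-tests", "contingency tables"},
    {"t- and z-tests", "ROC"},
    {"survival analysis"},
]

# Assumed ordering from simplest to most advanced category.
order = [
    "none/descriptive",
    "t- and z-tests",
    "contingency tables",
    "ROC",
    "survival analysis",
]

table = cumulative_accessibility(toy_articles, order)
for method, pct in table:
    print(f"{method:20s} {pct:5.1f}%")
```

Because an article counts as accessible only when all of its methods are known, the cumulative percentage can rise by less than the share of articles that use the newly added method.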
Table 1 - Classification of statistical procedures [12]

| Category | Level | Description |
|---|---|---|
| None/descriptive | | No statistical procedures employed or statistical inferences made; data organized and summarized in such a way as to make them readily comprehensible (e.g. mean, percentages, standard errors, histograms) |
| Decision statistics | basic | Sensitivity, specificity, positive predictive value, accuracy |
| Decision statistics | advanced | Relative risk, odds ratio, log odds; Bayesian estimation; Berkson's analysis |
| Decision statistics | ROC | Advanced decision statistics based on analysis of ROC curves |
| t- and z-tests | basic | One-sample, matched pair, and two-sample t-tests and z-tests; confidence intervals |
| Contingency tables | basic | χ²-tests, Fisher's exact test, McNemar test |
| Contingency tables | advanced | Multiway tables; Mantel-Haenszel test, log-linear models |
| Correlation/regression | basic | Pearson correlation coefficient and testing, two-variable (simple linear) regression, least-squares modeling |
| Correlation/regression | nonparametric | Spearman's ρ, Kendall's τ; tests for trend; monotone regression |
| Correlation/regression | advanced | Multiple linear regression; polynomial regression; nonlinear regression, estimation, and modeling; stepwise regression; logistic regression; discriminant analysis |
| ANOVA | basic | Single and repeated measures one-way ANOVA; multiple comparison procedures (e.g. Newman-Keuls, Bonferroni, Tukey, Duncan tests); simple linear contrasts; F-tests |
| ANOVA | advanced | Multiway ANOVA or analysis of covariance; block designs (e.g. Latin square, factorial) |
| Nonparametric | basic | Sign test, Wilcoxon signed-rank test, Mann-Whitney U-test; median/range test |
| Nonparametric | advanced | Nonparametric ANOVA (Kruskal-Wallis, Friedman tests); goodness-of-fit tests (Kolmogorov-Smirnov type) |
| Survival analysis | advanced | Life table (survival function, Kaplan-Meier product limit analysis), regression for survival (Mantel-Cox) proportional hazards model, rate adjustment |
| Transformation | advanced | Use of data transformation (e.g. logarithmic, exponential, reciprocal) |
| Other | advanced | Any statistical method not fitting the above headings (e.g. cluster analysis, autocorrelation, quality control charts, pharmacokinetic modeling) |

ROC, receiver-operating characteristics; ANOVA, analysis of variance.

RESULTS
A total of 218 major articles were included in the study. The frequency of statistical methods used was as follows: no statistical methods or descriptive statistics only 103 (47%), one type of statistical method 67 (31%), two or more methods 47 (22%). Statistics dealing with inference, decisions, contingency tables or correlation/regression techniques were found in 53% of the articles in which a statistical procedure had been used (Table 2). The listing of statistical tests in Table 3 is given in order of their decreasing usage in the articles analysed. Advanced statistics including correlation/regression analysis, odds ratio, regression techniques, multiway ANOVA, nonparametric ANOVA studies and receiver operating characteristic (ROC) analysis accounted for 19%. The most
frequently used advanced statistical analyses were advanced correlation/regression techniques. The cumulative accessibility column takes into account the fact that many articles contained more than one statistical method. The value indicates the percentage of articles an individual reader could understand given their knowledge of statistics (e.g. a reader with knowledge of descriptive statistics and basic t-tests only would have informed access to 56% of articles in both journals). The use of statistical methods in investigations divided by radiologic subspecialties (see Table 3) demonstrates that statistical techniques were most likely to be found in articles dealing with abdominal (64% of articles used statistics), musculoskeletal (64%), and breast radiology (60%). Statistical techniques were least likely to be used in articles dealing with interventional radiology (14%) and contrast media (33%). Advanced statistical methods were most likely to be used in articles dealing with musculoskeletal (28%), breast (20%) and chest radiology (25%) and were not used at all in articles classified as genitourinary or interventional radiology. There was some subspecialty difference in use of advanced statistics. Advanced correlation/regression analysis was most widely used in paediatric radiology, nonparametric correlation techniques were most prominent in neuroradiology and musculoskeletal radiology, and survival analysis was principally used in articles of vascular and interventional radiology.

Table 2 - Statistical content by category and accessibility

| Statistical category | No. (%) of articles containing methods (n = 218) | Cumulative accessibility by method (%) |
|---|---|---|
| None/descriptive | 103 (47) | 47 |
| Basic statistics | | |
| t- and z-tests | 43 (20) | 56 |
| Contingency tables | 31 (14) | 65 |
| Decision statistics | 21 (10) | 72 |
| Correlation/regression | 20 (9) | 78 |
| Nonparametric | 5 (2) | 80 |
| Analysis of variance (ANOVA) | 7 (3) | 81 |
| Advanced statistics | | |
| Correlation/regression: nonparametric | 7 (3) | 84 |
| Decision statistics: receiver-operating characteristics | 2 (1) | 85 |
| Transformation | 9 (4) | 88 |
| Correlation/regression | 17 (8) | 92 |
| Decision statistics | 1 (0.5) | 93 |
| Contingency tables | 1 (0.5) | 93 |
| Nonparametric | 2 (1) | 94 |
| Survival analysis | 8 (4) | 97 |
| Study design and analysis | 4 (2) | 99 |
| ANOVA: advanced | 0 (0) | 99 |
| Other | 2 (1) | 100 |

ROC, receiver-operating characteristics; ANOVA, analysis of variance.

Table 3 - Distribution and complexity of statistical methods according to radiology subspecialty classification

| Classification | No. of articles | No statistics or only descriptive statistics, No. (%) | > One method, No. (%) | Advanced methods, No. (%) | Most frequently used advanced statistical methods (No. of articles) |
|---|---|---|---|---|---|
| Abdominal | 22 | 8 (36) | 4 (18) | 2 (9) | Correlation/regression: advanced (1); Nonparametric: advanced (1); Study design (1) |
| Breast | 20 | 8 (40) | 5 (25) | 4 (20) | Correlation/regression: (2) |
| Chest | 16 | 7 (44) | 5 (31) | 4 (25) | Transformation (2) |
| Contrast | 6 | 4 (67) | 0 (0) | 0 (0) | |
| Genitourinary | 12 | 7 (58) | 1 (8) | 0 (0) | |
| Interventional | 7 | 6 (86) | 1 (14) | 0 (0) | |
| Miscellaneous | 57 | 26 (45) | 12 (20) | 13 (23) | Correlation/regression: advanced (5) |
| Musculoskeletal | 25 | 9 (36) | 10 (40) | 7 (28) | Correlation/regression: nonparametric (2); Correlation/regression: advanced (2) |
| Neuroradiology | 15 | 9 (60) | 2 (13) | 3 (20) | Transformation (2) |
| Paediatric | 21 | 13 (62) | 1 (5) | 2 (10) | Correlation/regression: advanced (2) |
| Radiotherapy | 5 | 0 (0) | 3 (75) | 4 (100) | Correlation/regression: advanced (3); Survival analysis (3) |
| Vascular | 12 | 6 (50) | 3 (25) | 2 (17) | Transformation (1); Correlation/regression: advanced (1); Survival analysis (1) |
| Totals | 217 | 103 (47) | 49 (22) | 41 (19) | |

DISCUSSION
The pattern of statistical usage in Clinical Radiology and British Journal of Radiology is similar to that
reported in a recent review of two other general radiology journals, Radiology and American Journal of Roentgenology [12]. In both studies, descriptive analysis, simple inference, contingency tests and basic correlation/regression techniques account for most of the statistical tests reported. There was some difference in the use of advanced statistics, with correlation/regression analysis being the most frequently used advanced technique in CR and BJR, while ROC analysis was the most commonly used in AJR and Radiology. These slight differences may in part be due to the difference in the size of the two studies. In both studies, knowledge of basic statistics would only allow a reader to understand 80% of all papers published [12]. Radiologists should, therefore, concentrate on improving their understanding of basic statistics in order to have a complete understanding of the majority of papers published in these journals. Basic statistical tests are most often used to determine statistical significance in the differences demonstrated between groups. Statistical significance and clinical significance are not necessarily synonymous [28,29]. While statistical tests demonstrate statistical significance, they do not necessarily measure clinical significance. Further, a study's quality cannot be judged by the frequent use of statistical tests nor their degree of complexity. The use of the appropriate study design and the use of careful methods to avoid the many sources of bias are as important as the choice of a particular statistical test. It is important to realize that even descriptive studies require careful attention to study design to ensure the validity of the observations. The intended size of the sample requires mathematical justification (e.g. power calculations) and should be specified in the Methods section. Any discrepancy between the actual and intended number of patients should be explained. The
statistical results published must be interpreted in the context of the study design used and the efforts taken to avoid bias [11,30,31].
CONCLUSIONS
The purpose of medical research is to enable physicians to extrapolate the results of studies on groups of patients to the management of individual patients. It is, therefore, important that investigators, readers and editors improve their statistical knowledge in order to understand the accuracy and validity of the statistical results reported. This study has identified the commonly encountered statistical tests used in radiology publications and it is hoped that the accompanying review articles [32,33] will offer a platform from which more specific advice can be sought from professionals in the related fields of biostatistics and epidemiology.
Acknowledgement. We would like to thank Ms Judy Ho for her assistance in the preparation of this manuscript.
REFERENCES
1 Altman DG. Statistics in medical journals: developments in the 1980s. Statistics in Medicine 1991;10:1897-1913.
2 Hanley JA. The place of statistical methods in radiology (and in the bigger picture). Investigative Radiology 1989;24:10-16.
3 Squires BP. Statistics in biomedical manuscripts: what editors want from authors and peer reviewers. Canadian Medical Association Journal 1990;142:213-214.
4 Gardner MJ, Machin D, Campbell MJ. Use of check lists in assessing the statistical content of medical studies. British Medical Journal 1986;292:810-812.
5 Salsburg DS. The religion of statistics as practiced in medical journals. American Statistics 1985;39:220-223.
6 Mainland D. Statistical ritual in clinical journals: is there a cure? Part I. British Medical Journal 1984;288:841-843.
7 Pocock SJ, Hughes MD, Lee RJ. Statistical problems in the reporting of clinical trials: a survey of three medical journals. New England Journal of Medicine 1987;317:426-432.
8 Weiss ST, Samet JM. An assessment of physician knowledge of epidemiology and biostatistics. American Journal of Medical Education 1980;55:692-703.
9 Berwick DM, Feinberg HV, Weinstein MC. When doctors meet numbers. American Journal of Medicine 1981;71:991-998.
10 Wulff HR, Andersen B, Brandenhoff P, Guttler F. What do doctors know about statistics? Statistics in Medicine 1987;6:3-10.
11 Emerson JD, Colditz GA. Use of statistical analysis in the New England Journal of Medicine. New England Journal of Medicine 1983;309:709-713.
12 Elster AD. Use of statistical analysis in the AJR and Radiology: frequency, methods and subspecialty differences. American Journal of Roentgenology 1994;163:711-715.
13 Hokanson JA, Bryant SG, Gardner R Jr, Luttman DJ, Guernsey BG, Bienkowski AC. Spectrum and frequency of use of statistical techniques in psychiatric journals. American Journal of Psychiatry 1986;9:1118-1125.
14 Colditz GA, Emerson JD. The statistical content of published medical research: some implications for biomedical education. Medical Education 1985;19:248-255.
15 Emerson JD, McPeek B, Mosteller F. Reporting clinical trials in general surgical journals. Surgery 1984;95:572-579.
16 Gore SM, Jones IG, Bytter EC. Misuse of statistical methods: critical assessment of articles in BMJ from January to March 1976. British Medical Journal 1977;1:85-87.
17 White SJ. Statistical errors in papers in the British Journal of Psychiatry. British Journal of Psychiatry 1979;135:336-342.
18 Avram MJ, Shanks CA, Dykes MHM, Ronia AK, Stiers WM. Statistical methods in anesthesia articles: an evaluation of two American journals during two six-month periods. Anesthesia and Analgesics 1985;64:607-611.
19 MacArthur RD, Jackson GG. An evaluation of the use of statistical methodology in the Journal of Infectious Diseases. Journal of Infectious Diseases 1984;149:349-354.
20 Hokanson JA, Ladoulis CT, Quinn FB Jr, Bienkowski AC. Statistical techniques reported in pathology journals during 1983-1985: implications for pathology educators. Archives in Pathological Laboratory Medicine 1987;111:202-207.
21 Badgley RG. An assessment of research methods reported in 103 scientific Canadian medical journals. Canadian Medical Association Journal 1961;85:246-249.
22 Feinstein AR. Clinical biostatistics: XXV. A survey of the statistical procedures in general medicine journals. Clinical Pharmacology and Therapeutics 1974;15:97-107.
23 Menegazzi JJ, Yealy DM, Harris JS. Methods of data analysis in the emergency medicine literature. American Journal of Emergency Medicine 1991;9:225-227.
24 Fromm BS, Snyder VL. Research design and statistical procedures used in The Journal of Family Practice. Journal of Family Practice 1986;23:564-566.
25 Greenland S. Quantitative methods in the review of epidemiologic literature. Epidemiology Review 1987;9:1-30.
26 Morris RW. A statistical study of papers in The Journal of Bone and Joint Surgery (Br) 1984. Journal of Bone and Joint Surgery Br 1986;70-B:242-246.
27 Selvin S, White MC. Description and reporting of statistical methods. American Journal of Infection Control 1993;21:210-215.
28 Rothman KJ. Significance testing. Annals of Internal Medicine 1987;105:445-447.
29 Sterling TD. Publication decisions and their possible effects on inferences drawn for tests of significance - or vice versa. Journal of the American Statistical Association 1959;54:30-34.
30 Sackett DL, Haynes RB, Tugwell P. Clinical epidemiology: a basic science for clinical medicine. Boston: Little, Brown, 1985.
31 Fletcher RH, Fletcher SW, Wagner EH. Clinical epidemiology: the essentials. Baltimore: Williams and Wilkins, 1985.
32 Goldin JG, Sayre JW. Guide to clinical epidemiology: part I: study design and research methods. Clinical Radiology, accepted for publication.
33 Goldin JG, Sayre JW. Guide to clinical epidemiology: part II: statistical analysis. Clinical Radiology, accepted for publication.