Editorial
A Historical Perspective of Statistics
The word "statistics" means many things to many people. Its origin lies in the compilation and recording of information to be summarized and presented numerically, primarily for use by the state. To the modern professional statistician, the word refers to a body of knowledge systematically built on a foundation of philosophic principles by using sophisticated mathematics. To the student, it often refers to a hated required course recalled only for its difficulty and obtuseness. To the modern clinician, it refers to a variety of numerical techniques encountered more and more frequently in the medical literature.
The profession of statistics has a rather brief history. Aside from the collection of basic mortality and morbidity data in the 17th century and an 18th century interest in the theory of probability stimulated by gambling, the origins of most of the statistical tools in current use can be found in the last half of the 19th century. Thus, the formal philosophic and mathematical underpinnings of the profession are relatively new. Despite the fact that some of the original data-collection systems pertained to information of interest to clinicians, clinical scientists were slow to embrace this new discipline. Indeed, social scientists, agronomists, and astronomers were among the first to recognize the inherent variability in their data and to see the value of a systematic approach to analyzing and reporting such data. Consequently, statistical vocabulary is replete with terms whose origins make sense only when viewed from a historical perspective.
As in so many areas of clinical science, the Mayo Clinic was a leader in the introduction of statistical methods to clinical investigation. It is a matter of historical record that the Mayo brothers and other early Mayo staff members frequently presented summaries of large series of patients of a certain type to state, national, and international meetings.
Indeed, the unit record system and diagnostic coding scheme introduced by Dr. Henry S. Plummer facilitated and stimulated such assessment of clinical practice.
Address reprint requests to Dr. W. M. O'Fallon, Section of Biostatistics, Mayo Clinic, Rochester, MN 55905. Mayo Clin Proc 63:952-954, 1988
In the early 1930s, Joseph Berkson, M.D., D.Sc., joined the Mayo Clinic staff and shortly thereafter founded the Department of Biometry. During his more than 30 years at Mayo, Dr. Berkson collaborated with many of his Mayo clinical colleagues to produce numerous publications of clinical significance. Dr. Berkson's most lasting contributions, however, are clearly in the areas of statistical philosophy, study design, and data analysis. His insistence on clear thinking, precise statements of goals and objectives, and appropriate methods of analysis ensured the quality of studies on which he collaborated. Furthermore, his influence extends beyond such studies because he published his methodologic work and many of his ideas have been adopted and espoused by others. Dr. Berkson became a major force in the international biostatistics community, which he served in several leadership roles, culminating in his election to the National Academy of Sciences.
In the May 24, 1950, issue of The Proceedings of the Staff Meetings of the Mayo Clinic, Berkson and his colleague Robert P. Gage published a manuscript entitled "Calculation of Survival Rates for Cancer."¹ Although the methodology espoused therein was not new, his adaptation of it, which provided an appropriate approach to analysis of data compiled from the follow-up of patients after a diagnosis or a procedure, was of major importance. Today it is referred to as the actuarial method or, in a more generic sense, as survival analysis.
As Dr. Berkson neared the end of his career at Mayo (although not the end of his productive professional life), Drs. Leonard T. Kurland, Lila R. Elveback, and William F. Taylor joined the Mayo statistical staff and began collaborating with their clinical colleagues.
They also published methodologic manuscripts on such topics as epidemiology, normal-value determinations, and survival analysis.
Today the staff of the Section of Biostatistics, as we are now known, is considerably larger than during the days of Dr. Berkson. Nonetheless, our interest in the highest principles of sound thinking, careful experimental design, and appropriate data analysis has continued. The statistical staff at Mayo continues to publish articles that describe new methods of designing clinical trials, propose alternative analytic strategies, illustrate new methods of handling the large data sets obtained through multiple clinical trials and epidemiologic studies, establish early stopping rules for clinical trials, and so forth.
Our primary activity, however, has always been collaboration with clinical colleagues as they design studies, collect and analyze data, and prepare the resultant manuscripts. An integral part of this process is educational, as we learn something about the clinical question under investigation and our colleagues learn something about statistics. The ultimate collaborative effort involves investigators who share enough of each other's disciplines to participate interactively in the process of conceiving and conducting an experiment.
Of course, knowledge of statistics serves all clinicians because the medical literature is replete with statistical references. Thus, some knowledge of terminology and concepts has become almost essential if one is to be a critical consumer of the literature.
Following in Berkson's footsteps and in line with our section's educational philosophy, Dr. Peter C. O'Brien of the Section of Biostatistics and Dr. Marc A. Shampo of the Section of Publications published in the Proceedings a series of articles² on basic statistical concepts that was very well received. In that series, they attempted to lay a foundation of understanding of what the profession of statistics can do for medical scientists. Their goal was to introduce concepts and terms, especially those likely to be encountered in the literature, through brief presentations and discussions of actual research projects conducted by Mayo investigators in collaboration with Mayo statisticians.
Mayo Clin Proc, September 1988, Vol 63
The model that produced that basic series has now been used again to produce a new series of articles on more sophisticated topics. The first contribution in this new series was published in the August 1988 issue of the Proceedings. Drs. O'Brien and Shampo have set themselves to a much more difficult, but no less important, task. Again following Berkson, they are challenging us to think about some of the more difficult concepts involved in the analysis and understanding of data. Their particular emphasis in
this series will be on statistical implications in the analysis of data generated by obtaining many observations on each of the subjects involved in a study, a common occurrence. Because such multiple observations can be obtained in a variety of ways, some structured and some unstructured, they have written several articles in which they discuss alternative methods. In some of these articles, Dr. O'Brien presents methodology that he has developed and for which he is perceived as a leader in the statistical world.
The basic problem is simple. If you do something often enough, things that might otherwise be perceived as unusual will be observed. This fact of life brings us back to the card table to play bridge or poker. We know that if we play often enough, we will see some good hands and perhaps even some day will hold 13 spades or a royal flush. Yet, if I sat down to play bridge this evening and on the first deal found myself holding 13 spades, I would immediately be suspicious: someone stacked the deck, purposely or inadvertently, and the shuffle and deal did not produce a random allocation of the cards.
The paradox is exquisite. Something is held to be special and worthy of our attention because it is believed to occur only rarely. In statistics, the observation of such a "rare" event causes us to reject our hypothesis, that is, to change our belief. Yet we know that rare events do happen eventually, and if we look diligently enough and long enough, we will observe one. Under such circumstances, when such an event is observed, is it fair for us to behave as though it were rare?
Thus, as Drs. O'Brien and Shampo pointed out in their first article, if 100 statistical tests are performed on 100 null hypotheses, each of which is true, and each test requires P<0.05 for significance, 5 of the 100 tests would be expected to be significant.
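The arithmetic behind this expectation, and behind the chance that a set of such tests yields at least one spurious result, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the second calculation assumes that the tests are statistically independent, an assumption the editorial does not state.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of 'significant' results when every null hypothesis is true."""
    # Each test has probability alpha of a false positive, so the expected
    # count is n_tests * alpha (this part does not require independence).
    return n_tests * alpha


def per_experiment_error_rate(n_tests: int, alpha: float) -> float:
    """Probability of at least one false positive among n_tests true nulls.

    ASSUMPTION (not from the editorial): the tests are independent, so the
    chance that none of them is significant is (1 - alpha) ** n_tests.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    n_tests, alpha = 100, 0.05
    print("expected false positives:", expected_false_positives(n_tests, alpha))
    print("per-experiment error rate:", per_experiment_error_rate(n_tests, alpha))
```

With 100 tests at the 0.05 level, the expected count is 5, and under the independence assumption the probability of at least one false positive is about 0.994.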
In their terminology, the per-comparison error rate is 5%, but if the 100 tests considered as a whole constituted a study (or experiment), the per-experiment error rate could be almost 100%; that is, almost certainly, at least one of the tests would (erroneously) have significant results.
In conclusion, I would like to emphasize two points. First, computer packages are readily available, and they perform most tests known to humankind that might possibly apply to the data on hand. In part 4 of their current series of articles, Drs. O'Brien and Shampo offer some
specific insight of value to all who might find themselves with such a plethora of computer output from which to choose. The second point focuses on the conduct of randomized clinical trials, which has become the hallmark of experimental design for comparing therapeutic efficacy. Such trials accrue patients over time, and it is frequently morally necessary that multiple interim assessments be made while the study continues. In their final contribution in the series, Drs. O'Brien and Shampo discuss the pitfalls associated with such assessments if they are not conducted carefully. In the literature, studies have been described that were terminated early and apparently erroneously because of lack of such care. Furthermore, theoretic studies have demonstrated a high likelihood of erroneous decisions when multiple tests are used inappropriately. Because such clinical trials frequently involve critical questions of relative therapeutic value and their outcome may lead to the discontinuation of a potentially valuable therapy, it is
essential that they be assessed properly. Dr. O'Brien has published important and influential results on this topic, and his insights are valuable.
The series of articles begun in the August 1988 issue of the Proceedings continues an important and long-standing tradition of this journal and of the staff of the Section of Biostatistics of publishing methodologic and educational material pertaining to statistical issues. I commend them to the readers.
W. Michael O'Fallon, Ph.D.
Section of Biostatistics
REFERENCES
1. Berkson J, Gage RP: Calculation of survival rates for cancer. Proc Staff Meet Mayo Clin 25:270-286, 1950
2. O'Brien PC, Shampo MA: Statistics for clinicians. Mayo Clin Proc 56:45-46; 47-49; 126-128; 196-197; 274-276; 324-326; 393-394; 452-454; 513-515; 573-575; 639-640; 709-711; 753-754; 755-756, 1981