Basics of Research (Part 11): Compendium of Research Terms

SPECIAL COMMUNICATION

Edward A. Panacek, MD¹

1. Department of Emergency Medicine, University of California, Davis Medical Center, Sacramento, Calif.

Key Words: biostatistics, research, research design, statistics

Address for correspondence and reprints: Edward A. Panacek, MD, UC Davis Medical Center, Department of Emergency Medicine, 2315 Stockton Blvd., PSSB, Ste. 2100, Sacramento, CA 95817

Copyright © 1998 by the Air Medical Journal Associates

1067-991X/98/$5.00 + 0

Reprint no. 74/1/87564

Introduction

Prior parts in this series have covered elements of research study design (Parts 3 and 4) and quantitative data analysis, including the use of statistics (Part 6). Although those parts can serve as useful references for the beginning researcher, a brief, descriptive compendium of terms is sometimes helpful as a quick reference. Some journals¹ provide a glossary of terms within their “information for authors” section. Many statistics texts also provide abbreviated guides to selecting and best applying individual tests.² However, these guides are not always oriented toward the beginning researcher and sometimes can be difficult to understand. In addition, they do not always include all the terms or statistical tests most commonly used in acute care clinical research. This compendium is meant to serve as a stand-alone quick reference guide to appropriate use of methodology terms and research study design and is a very brief summary of the most common statistical tests. This glossary should allow easier comparisons of terms for the novice researcher and, it is hoped, will facilitate selecting appropriate statistical tests.

Terms, Methodology in Research Design

Before-After Trial. Investigation of therapeutic alternatives or other interventions in which individuals of one period and under one treatment or condition are compared with individuals at a subsequent time, treated in a different fashion, or existing under different conditions. If the disorder is not fatal, and the “before” treatment is not curative, the same individuals may be studied in the before and after periods, strengthening the design through increased group comparability for the two periods. See also Historical controls.

Blind or Blinded. Masked or unaware. The term may be modified according to the purpose of the blinding. For example, clinicians or patients can be blind to the treatments patients are receiving, and observers can be blind to each other’s assessments, so that their observations are not influenced by one another (see also Double blind). To avoid confusion, the term masked is preferred in studies in which vision loss is an outcome of interest.

Case-Control Study (Case-Referent or Case-Comparison Study). A study, generally used to test possible causes of a disease or disorder, in which individuals who have a designated disorder are compared with individuals who do not have the disorder with respect to previous or current exposure to a putative causal or predictive factor. For example, persons with hepatic cancer (cases) are compared with persons without hepatic cancer (controls), and history of hepatitis B is determined for the two groups. A case-control study usually is referred to as a retrospective study (even if patients are recruited prospectively) because the logic of the design leads from effect retrospectively back to cause or prior findings. See Nested case-control.

Air Medical Journal 17:1 January-March 1998

Case Series. A series of patients with a defined disorder. The term typically is used to describe a study reporting on a consecutive collection of patients treated in a similar manner, without a concurrent control group. For example, a surgeon might describe the characteristics of and outcomes for 100 consecutive patients with cerebral ischemia who received a revascularization procedure. See also Consecutive sample.

Cohort. A group of persons with a common characteristic or set of characteristics. Often refers to a group followed for a specified period to determine the incidence of a disorder or complications of an established disorder (that is, prognosis), as in an inception cohort or cohort longitudinal design. When comparing two cohort groups, generally is referred to as a cohort study or cohort analytic study. Usually prospective but can be a retrospective design.

Cohort Analytic Study. Prospective investigation of the factors, etiologies, or therapies that have an effect on outcome in which a cohort of individuals who do not have evidence of an outcome of interest but who are exposed to the agent of interest or have the variable of interest are compared with a concurrent cohort also free of the outcome but not exposed or without the variable. Both cohorts then are followed to compare the incidence of the outcome of interest.

Confounder, Confounding Variable. A factor that distorts the true relationship of the study variables of central interest by virtue of being related to the outcome of interest or having the variable of interest but extraneous to the study question and unequally distributed among the groups being compared. For example, age might confound a study of the effect of a toxin on longevity if individuals exposed to the toxin were older than those not exposed.

Consecutive Sample. Sample in which the units are chosen on a strict “first come, first chosen” basis. All individuals who are eligible should be included as they are seen.

Convenience Sample. Individuals or groups selected at the convenience of the investigator or primarily because they were available at a convenient time or place.

Cost-Benefit Analysis. A form of economic assessment, usually from society’s perspective, in which the costs of medical care are compared with the economic benefits of the care; both costs and benefits are expressed in units of currency. The benefits typically include reductions in future health care costs and increased earnings as a result of the improved health of those receiving the care.

Cost-Effectiveness Analysis. An economic evaluation in which alternative programs, services, or interventions are compared in terms of the cost per unit of clinical effect (for example, cost per life saved, cost per millimeter of mercury of blood pressure lowered, or cost per quality-adjusted life-year gained). The last form of measuring outcomes (and equivalents such as “healthy days of life gained”) gives rise to what also is referred to as cost-utility analysis.

Criterion Standard. Preferred term to gold standard. A method having established or widely accepted accuracy for determining a diagnosis, providing a standard with which a new screening or diagnostic test can be compared. The method need not be a single or simple procedure but could include follow-up of patients to observe the evolution of their conditions or the consensus of an expert panel of clinicians. Criterion standard also can be used in studies of the quality of care to indicate a level of performance, agreed to by experts or peers, with which the performance of individual practitioners or institutions can be compared.

Crossover Trial. A method of comparing two or more treatments or interventions in which subjects or patients, on completion of the course of one treatment, are switched to another. Typically, allocation to the first treatment is by random process. Participants’ performance in one period is used to judge their performance in others, usually reducing variability.

Data Set. Raw data gathered by investigators.

Double Blind or Double Mask. (1) Neither the subject nor the study staff (those responsible for patient treatment and data collection) are aware of the group or intervention to which the subject has been assigned. (2) Any condition in which two groups of persons are purposely denied access to information to keep that information from influencing some measurement, observation, or process.

Economic Evaluation. Comparative analysis of alternative courses of action in terms of both their costs and consequences.

End Point. See Outcomes.

Gold Standard. See Criterion Standard.

Historical Controls. The use of a similar patient group from prior experience as the comparison for a new group or for the ongoing treatment, rather than using concurrent randomized controls. Generally used only when the expected mortality/morbidity rates are very high (nearly 100%). More prone to biased results than designs using concurrent controls.

Inception Cohort. A designated group of persons, assembled at a common time early in the development of a specific clinical disorder (for example, at the time of first exposure to the putative cause or at the time of initial diagnosis), who are followed thereafter longitudinally (see also Cohort).

Likelihood Ratio. For a screening or diagnostic test (including clinical signs or symptoms), expresses the relative odds that a given test result would be expected in a patient with (as opposed to one without) a disorder of interest.

Masked. See Blind.

Matching. The deliberate process of making a study group and a comparison group comparable with respect to factors extraneous to the purpose of the investigation but that might interfere with the interpretation of the study’s findings (for example, in case-control studies, individual cases might be matched or paired with a specific control on the basis of comparable age, gender, clinical features, or a combination).

Nested Case-Control. The use of a case-control design from within an existing and complete database (i.e., using data “nested” within a previous study data set). The database usually is from a comprehensive registry or a prior clinical study. Has advantages over other retrospective designs by having much more complete and accurate data collection.

Nonrandomized Control Trial. Experiment in which assignment of patients to the intervention groups is at the convenience of the investigator or according to a preset plan that does not conform to the definition of random. Similar to an RCT but with nonrandom assignment to study groups rather than randomization. See also Randomized trial.

Outcomes. All possible changes in health status that may occur while following subjects or that may stem from exposure to a causal factor or from preventive or therapeutic interventions. The narrower term end points refers to health events that lead to completion or termination of follow-up of an individual in a trial or cohort study, for example, death or major morbidity, particularly related to the study question.

Prospective Study. Design in which study outcomes have not yet occurred at the time of study initiation (e.g., cohort, RCT).

Random. Governed by a formal chance process in which the occurrence of previous events is of no value in predicting future events. The probability of assignment of, for example, a given subject to a specified treatment group is fixed and constant (typically 0.50), but the subject’s actual assignment cannot be known until it occurs.

Random Sample. A sample derived by selecting sampling units (e.g., individual patients) such that each unit has an independent and fixed (generally equal) chance of selection. Whether a given unit is selected is determined by chance (e.g., by a table of randomly ordered numbers).

Randomization, Random Allocation. Allocation of individuals to groups by chance, usually done with the aid of a table of random numbers. Not to be confused with systematic allocation (e.g., on even and odd days of the month) or allocation at the investigator’s convenience or discretion.

Randomized Trial (Randomized Control Trial, Randomized Clinical Trial, RCT). Experiment in which individuals are randomly allocated to receive or not receive an experimental preventive, therapeutic, or diagnostic procedure and then followed to determine the effect of the intervention.

Retrospective Study. Design in which the study variables of interest already have occurred before study initiation or subject enrollment.

Sensitivity. The sensitivity of a diagnostic or screening test is the proportion of people who truly have a designated disorder who are so identified by the test, which may consist of or include clinical observations.

Sequential Sample. See Consecutive sample.

Specificity. The specificity of a diagnostic or screening test is the proportion of people who are truly free of a designated disorder who are so identified by the test, which may consist of or include clinical observations.

Survey. Observational or descriptive, nonexperimental study in which individuals are examined or questioned systematically for the absence or presence (or degree of presence) of characteristics of interest.

Table 1. Common Statistical Tests: Most Common Applications

Parametric Tests (tests for difference between groups or samples)
  t-test: Two groups (subtests: single, paired, independent)
  ANOVA: At least three sets (e.g., groups) of data
    One-way: one factor (= one independent variable)
    Two-way: at least two factors (= two or more independent variables)
    Repeated measures: at least three nonindependent groups
  ANCOVA: Controls statistically for confounding variables
  MANOVA: Analyzes multiple dependent variables simultaneously (corrects for multiple comparisons)

Nonparametric Tests (tests for differences)
  Chi-square: Categorical data (at least five expected/cell)
  Fisher’s Exact: No more than five expected/cell
  Wilcoxon’s Rank Sum: Two sets of data (independent), any data, any variance
  Wilcoxon’s Signed Rank: Two sets of data (dependent)
  Kruskal-Wallis: At least three groups (independent)
  McNemar: Nominal data (two nonindependent samples)
  Binomial Probability: Compares sample with population
  Sign Test: Compares dependent samples

Association Tests
  Spearman Rank: Ordinal data (association only)
  Relative Risk: Nominal data
  Pearson Correlation: Numerical data (association analysis only)
  Bivariate Regression: Continuous dependent variable on one independent (predictive) variable
  Multiple Regression: Continuous dependent variable on at least two independent (predictive) variables
  Stepwise Regression: Sequential analysis of independent (predictive) variables on dependent variable
  Logistic Regression: Categorical dependent variable, evaluating multiple independent (predictive) variables in combination

Multiple Comparison Corrections
  Bonferroni Adjustment: Most extreme correction
  Sidak Adjustment
  Scheffe’s Test: Most flexible, can use for all types of comparisons
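The Sensitivity, Specificity, and Likelihood Ratio entries in the glossary all reduce to simple arithmetic on a 2 x 2 table that compares a test result with the criterion standard. The following is a minimal sketch of that arithmetic in Python. The function name `diagnostic_metrics` and the patient counts are hypothetical, chosen only for illustration, and the likelihood ratios use the usual definitions for a test with two possible results.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summarize a 2 x 2 table of test result versus criterion standard.

    tp: disorder present, test positive    fn: disorder present, test negative
    fp: disorder absent,  test positive    tn: disorder absent,  test negative
    """
    sensitivity = tp / (tp + fn)  # proportion with the disorder identified by the test
    specificity = tn / (tn + fp)  # proportion free of the disorder identified by the test
    # Likelihood ratios: how much more likely a given result is in a patient
    # with the disorder than in one without it. (Undefined when the
    # denominator is zero; this sketch does not handle that case.)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Hypothetical counts: 100 patients with the disorder, 200 without.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts the test identifies 90 of 100 patients with the disorder and 170 of 200 without it, so a positive result is roughly six times as likely in a patient with the disorder as in one without.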

A Brief Summary of the Most Common Statistical Tests

For novice researchers, one of the most common difficulties is selecting appropriate statistical tests during the data analysis phase of a research project. The following compendium of tests starts from the premise that the investigator has an understanding of the type of data collected, the number of study groups, and the desired types of comparisons. However, one danger of such a brief summary is that the user may develop a false sense of confidence in the choice and application of statistical tests. This summary should not be used to replace appropriate consultation with a statistician. Usually multiple tests can be applied to a set of data. Additionally, not all data will clearly fit the categories as described. When in doubt, consult a statistician.

A very brief listing of the most common applications of the most common tests is provided in Table 1. Statistical tests to perform comparisons between groups generally are classified as either parametric or nonparametric. Parametric tests involve data that have a normal distribution and involve continuous types of measurements. Nonparametric statistical tests most commonly are used for variables of the categorical type, which do not have a normal distribution. Tests of association or correlation are used to attempt to demonstrate links or relationships between variables or groups.

Last, when multiple statistical comparisons are performed using the same data set, adjustments for multiple comparisons must be performed. Although an in-depth understanding of these corrections is beyond most novice researchers, some of the common correction tests or adjustments also are listed in the table. All these issues are explained in greater detail in Part 6 of the “Basics of Research” series or in general introductory statistics textbooks.²

Parametric Tests

Student’s test (t-test). Tests whether or not the means of two sets of continuous data are equal. The t-test assumes the data are distributed normally (parametric) and the data from both treatment groups have equal variances. Different forms of the t-test are used depending on whether or not the samples (groups) are paired:
• t-test for one sample (compares with a population or theoretical value)
• t-test for dependent (paired) samples
• t-test for independent samples (i.e., no overlap or pairing of groups)

One-way ANOVA. Tests the null hypothesis that three or more sets of continuous data (for a single independent variable) are drawn from populations with equal means. This test assumes the data are distributed normally (parametric) and the data from all groups have identical variances. (The one-way ANOVA is an extension of the t-test for three or more groups.)

Two-way ANOVA. Same as one-way ANOVA but tests for at least two independent variables (factors).

Repeated measures ANOVA. Same but tests at least three paired groups.

ANCOVA. An extension of ANOVA that includes a regression analysis to control for confounding variables that could not be controlled by exclusion, matching, or randomization. Can be used for any number of groups and one or more covariates.

MANOVA. An extension of ANOVA to compare multiple (two or more) dependent variables simultaneously while also taking correlations between dependent variables into account and adjusting for multiple comparisons. May show predictive value of combinations of variables.

Nonparametric Tests

Chi-square test. Used with categorical variables (i.e., ordinal or nominal data) to test the null hypothesis that no effect of treatment exists on outcome. Although most commonly used with two treatments and two outcomes (i.e., to analyze a 2 by 2 contingency table), the chi-square test easily is generalized to any number of treatments and outcomes. The chi-square test uses an assumption that requires at least five expected observations for each combination of treatment and outcome under the null hypothesis (i.e., five expected observations per cell [square] in the table).

Corrections to the chi-square test sometimes are suggested (e.g., Yates’ correction).

Fisher’s exact test. Used in an analogous manner to the chi-square test for the same type of data. However, it may be used even when fewer than five observations are expected in one or more cells. Fisher’s exact test is more difficult to calculate than the chi-square test, and some commercial statistical packages will not perform the test for cases with more than two treatments and/or more than two outcomes.

Wilcoxon’s Rank Sum (or Mann-Whitney U) Test. Tests whether two independent groups have the same median. These tests do not assume anything about the normality or variance of the data. Generally used for ordinal (ranked) data or for continuous data that are not normally distributed, therefore nonparametric. Can be thought of as the nonparametric analog of the t-test for independent samples.

Wilcoxon’s Signed Ranks Test. Same test except used for two dependent (paired) groups.

Kruskal-Wallis. Nonparametric test analogous to the one-way ANOVA. No assumption is made regarding normality or variance of the data. The data may be continuous or ordinal because the test relies on the ranks of the data.

McNemar Test. Tests the association between proportions obtained from dependent samples by contrasting the frequencies in cells in 2 x 2 tables that reflect incongruent responses.

Binomial Probability Test. Compares a sample proportion with a population proportion or theoretical value, or compares two proportions obtained from two samples.

Sign Test. Extension of the binomial test to use on nonparametric data for dependent samples.

Association Tests

Spearman Rank Correlation. Determines the degree of monotonic relationship between two ordinal variables or between numerical variables that have skewed distributions.

Relative Risk Test. Calculates the risk of the dependent variable, given the presence versus the absence of the independent variable.

Pearson Correlation (aka Pearson Product-Moment). Generates a correlation coefficient to indicate the degree of linear relationship between two continuous (numeric) variables. Coefficients range from -1.0 to +1.0 and are an index, not a measurement. Based on a “best fit” line. Examines for possible association, not predictive relationship.

Bivariate Regression Analysis. Procedure to attempt prediction of a dependent variable by an independent variable.

Multiple Regression Analysis. Determines the combination of independent variables that predicts a numerical dependent variable.

Stepwise Regression Analysis. Sequentially enters independent variables to attempt to identify associations. The order of inclusion of variables usually is structured. Most often used in exploratory studies to identify factors worthy of further investigation.

Logistic Regression Analysis. Determines the combination of independent variables that predicts a categorical dependent variable. Independent variables may be numerical or categorical.
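The distinction between the Pearson correlation (an index of association) and bivariate regression (a “best fit” line used for prediction) may be easier to see in a small worked example. The following sketch computes both from their textbook definitions in plain Python. The function names and the five paired observations are hypothetical, and a real analysis would normally rely on a statistics package that also reports P values and confidence intervals.

```python
from math import sqrt


def _sums_of_squares(x, y):
    """Return mean of x, mean of y, and the sums Sxy, Sxx, Syy about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient, from -1.0 to +1.0."""
    _, _, sxy, sxx, syy = _sums_of_squares(x, y)
    return sxy / sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the line predicting y (dependent) from x (independent)."""
    mean_x, mean_y, sxy, sxx, _ = _sums_of_squares(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired measurements on five subjects.
x = [1, 2, 3, 4, 5]  # independent (predictive) variable
y = [2, 4, 5, 4, 5]  # dependent variable

r = pearson_r(x, y)
slope, intercept = least_squares_line(x, y)
print(f"r = {r:.3f}")
print(f"predicted y = {intercept:.2f} + {slope:.2f} * x")
```

The coefficient r summarizes only how closely the points follow a straight line, whereas the slope and intercept give the equation used to predict the dependent variable from the independent variable.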

Conclusion

As already discussed, this brief summary is meant simply to provide a quick reference for the investigator who is confused regarding appropriate methodology terms and proper application of the most common statistical tests. If any question exists regarding the appropriate use of these terms and tests, consult a research mentor or a biostatistician. The reader also is referred back to prior articles in the series that discuss many of these topics in greater detail.

References

1. Instructions for authors. JAMA 1995;273:2930.
2. Dawson-Saunders B, Trapp RG. Basic and clinical biostatistics. 2nd ed. Norwalk (CT): Appleton & Lange; 1994.