Supportive Cancer Therapy
Comprehensive Review

A Primer in Prognostic and Predictive Models: Development and Validation of Neutropenia Risk Models
Gary H. Lyman, Nicole M. Kuderer

Key Words: Explanatory models, Predictive performance, Risk factors

Abstract
Prognostic or risk models can be used to identify risk factors that are associated with the outcome of interest or to accurately predict response to therapy. Increasingly, such models are playing a role in clinical oncology when selecting patients for more targeted therapy. Multivariate risk models require careful attention to accurate measurements, a priori selection of the primary outcomes and risk factors of interest, specification of data collection including any missing values, and assessment of collinearity and any confounding or effect modification. Recommendations for the design, conduct, analysis, and reporting of risk model studies in oncology are provided herein. Procedures for assessing model predictive performance or accuracy are discussed, including model discrimination, calibration, and power. An illustration of model development and validation is provided for a neutropenia risk model. The modeling procedure includes characterization of the population being studied, identification in advance of dependent or outcome variables and independent or risk factors, specification of the modeling plan including univariate and multivariate analyses, evaluation of the model predictive performance, and the procedure for validation of the model. Finally, there should be a plan for implementation and evaluation of the model in actual clinical practice. Prognostic or risk models are being developed and applied in a broad range of areas in clinical oncology. When properly developed and validated, such models are able to guide clinicians in the selection of patients who are at the greatest risk in order to target potentially toxic or expensive interventions in a more effective and cost-effective manner.
Introduction
A risk model studies the association between an outcome (dependent variable) and ≥ 1 predictive or prognostic factors (independent variables). Prognostic models or risk models are used for a number of purposes, including (1) to improve our understanding of a disease process; (2) to improve the design
Health Services and Outcomes Research Program, James P. Wilmot Cancer Center, University of Rochester Medical Center, Rochester, NY
Address for correspondence: Gary H. Lyman MD, MPH, FRCP(Edin), James P. Wilmot Cancer Center, University of Rochester Medical Center, 601 Elmwood Ave, Box 704, Rochester, NY 14642; fax: 585-276-1885; e-mail: [email protected]
Submitted: Jan 3, 2005; Revised: Mar 14, 2005; Accepted: Mar 14, 2005
Supportive Cancer Therapy, Vol 2, No 3, 168-175, 2005
and analysis of clinical trials; (3) to stratify patients by risk; (4) to assist in comparing the outcome between treatment groups in nonrandomized studies by allowing adjustment for case mix; (5) to define risk groups based on prognosis; (6) to predict disease outcome more accurately or parsimoniously; and (7) to guide clinical decision making, including treatment selection and patient counseling.1 In general, there are 2 major types of models, depending upon the purpose of the study: explanatory models and predictive models.
Figure 1. Confounding. The treatment effect is falsely obscured or falsely observed because of an association with both treatment and outcome. (The diagram depicts a confounding factor associated with both the prognostic factor and the outcome.)

Explanatory Models
Explanatory models generally serve one of several purposes. They are often used to identify risk factors that are associated with the outcome of interest in an exploratory study. In other cases, the primary focus is on the relationship of the outcome with a specific risk factor when there is concern that the relationship between the risk factor and outcome may be confounded by other factors or the strength of the relationship may be modified by an interaction term. Clinical experience and expertise are of considerable importance in the modeling process, particularly in identifying potential confounding factors and effect modifiers in advance of the study.

Confounding
Confounding occurs when the apparent association between a risk factor and an outcome is altered by the relationship of each to a third factor (Figure 1). Such confounding factors may obscure a real relationship or create an apparent relationship when one does not actually exist. Multivariate methods permit the study of the risk factor/outcome relationship adjusted for other known and measured cofactors, which may confound the relationship. Although confounding can also be addressed to some extent through a stratified analysis, multivariate models have the advantage of dealing with multiple confounders, potentially incorporating all of the subjects simultaneously. It must be noted that such analyses can adjust only for potential confounding factors that are actually considered, measured, and included in the analysis. Confounding may still exist with factors that are unknown, unmeasured, or otherwise left out of the model. When the focus is on the identification or evaluation of specific risk factors, the model permits the study of the independent contribution of each factor to the outcome. When the analysis is focused on one main predictor, its relationship with outcome can be estimated while being adjusted for all of the potential confounding factors included in the model.

Interaction
Dealing with potential interaction can be one of the most challenging aspects of developing risk models. Interaction or effect modification represents a situation in which the interacting variable alters the level of association between a risk factor and an outcome. For instance, sex would represent an effect modifier if the relationship between age and risk of febrile neutropenia were different for men than for women. It is important that any potential interaction terms be considered a priori, because the evaluation for interaction involves subgroup analyses and multiple testing issues. Although an interaction factor may also be a confounding term, confounding and interaction must be dealt with differently. As discussed earlier, including such a factor in the risk model and thereby adjusting for its effect can often adequately deal with confounding. Interaction can be assessed in a stratified analysis or by incorporating an interaction term, generally the product of the effect modifier and the risk factor, into the model. Essentially, if there is no significant interaction, the interaction term can be ignored. In situations in which there is significant interaction, such as with sex, a single model without an interaction term will provide an "average" effect that may not be correct for men or women. To properly deal with interaction in risk models, the investigator has 2 choices. The model can be presented with the interaction term as well as the effect modifier and risk factor in the model. Unfortunately, such models are often complex and difficult to explain, because relative risk estimates such as odds ratios or hazard ratios are no longer fixed numbers based on a single coefficient but rather represent an equation in which the relative risk is a function of the interaction term and changes as the value of the interaction term changes. In the example cited previously, the relative risk for febrile neutropenia with age would not be a discrete number but a function of sex and would differ for men and women. As an alternative, the investigator may choose to present separate models for different levels of the interaction term, eg, different models for men and women. Outcome differences between subgroups should be assessed by testing the interaction between the risk factor and the interaction variable rather than by separate analyses within subgroups, which may have limited power because of smaller sample sizes.
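The age-by-sex example can be sketched on the log-odds scale; the coefficients below are invented for illustration only, not taken from any fitted neutropenia model:

```python
import math

# Hypothetical logistic-model coefficients (illustration only):
# log-odds = b0 + b_age*age + b_sex*sex + b_int*age*sex,
# with sex coded 0 = female, 1 = male
b0, b_age, b_sex, b_int = -4.0, 0.03, 0.5, 0.02

def log_odds(age, sex):
    """Linear predictor of the model, including the interaction term."""
    return b0 + b_age * age + b_sex * sex + b_int * age * sex

# With a nonzero interaction coefficient, the odds ratio per 1-year
# increase in age is no longer a single fixed number: it differs by sex.
or_age_female = math.exp(b_age)           # exp(b_age)
or_age_male = math.exp(b_age + b_int)     # exp(b_age + b_int)

print(f"OR per year of age, women: {or_age_female:.3f}")
print(f"OR per year of age, men:   {or_age_male:.3f}")
```

This is exactly the situation described above: presenting one model requires reporting the relative risk as a function of sex, whereas fitting separate models for men and women yields a fixed odds ratio within each stratum.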
Table 1. Considerations in the Design of Risk Model Studies
1. The primary and secondary hypotheses should be clearly stated, including any subgroup analyses planned in advance of the study.
2. Consider prognostic factors for which there is sufficient evidence to warrant further investigation based on:
   a) Previous studies;
   b) Biologic and clinical plausibility; and
   c) Relevance and importance to the understanding or treatment of the disease.
3. The study population should be defined with specified inclusion/exclusion criteria and methods to judge evaluability.
4. Patient treatment should be standardized or assigned by randomization.
5. Assays should be reproducible and should be performed without knowledge of the clinical data and patient outcome.
6. Estimate the sample size keeping in mind the following:
   a) The desired power to detect meaningful differences in outcome for the major endpoints and to reject such differences with reasonable confidence if they are not found;
   b) The relationship of sample size to the number of outcome events; these will be less frequent in favorable prognostic groups; and
   c) The desirability of large prospective studies of a single prognostic factor.
7. Specify the planned analysis, including any cutpoints for continuous variables, proposed hypothesis testing on subgroups, and anticipated interactions, in advance of the study.
8. Key study features, including the above information, should be fully detailed in a formal written protocol.
Risk Factors
The selection of risk factors for inclusion in a model represents perhaps the most challenging issue in risk modeling. Ideally, such models should include all relevant and important risk factors and their potential confounders. Variable selection often appears to be more of an art than a science yet should be guided by the knowledge and experience of the clinician. Often, risk factors are identified with exploratory techniques such as forward or backward stepwise variable selection. Because such an approach selects variables to fit the specific set of data, it is fraught with potential problems, including model instability and exaggeration of relative risk estimates and their associated levels of significance. A more valid approach is the selection of risk factors based on a fundamental understanding of the pathophysiology of the disease and the pharmacology of treatment as well as previous knowledge and experience. Evaluation of a model based on risk factors identified a priori is less likely to lead the investigator astray. In addition, multiple comparison issues must be considered when selecting many risk factors or cutpoints.

Statistical Models
The specific model or mathematical relationship between risk factors and the chosen outcome depends on the type of outcome being studied and the anticipation that the data will approximately follow the relationship of the model. In multivariate analyses, a coefficient or "slope" is estimated for each independent variable by fitting a certain model to the data while adjusting for all of the other variables. Commonly used models include the following: (1) linear regression, modeling the mean of a normally distributed continuous outcome measure, such as heart rate, as a linear function of the independent variables; (2) logistic regression, modeling the probability of a dichotomous (yes/no) outcome as the multiplicative product of the individual predictors; and (3) proportional hazards regression, modeling the instantaneous risk or hazard of a discrete event over time (survival) as the product of the individual predictors.
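The logistic case can be sketched in a few lines; the intercept and coefficients below are invented for illustration and do not come from any published model:

```python
import math

def predicted_probability(intercept, coefs, x):
    """Logistic regression: p = 1 / (1 + exp(-(b0 + sum(bi * xi)))).
    On the odds scale this is the multiplicative product of the
    individual predictor effects: exp(b0) * prod(exp(bi * xi))."""
    linear = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical model: intercept plus coefficients for 2 dichotomous risk factors
b0 = -2.0
coefs = [0.8, 1.2]

p = predicted_probability(b0, coefs, [1, 0])  # only the first factor present
print(f"Predicted risk: {p:.3f}")
```

The same linear predictor appears inside the other 2 model families: untransformed in linear regression, and inside exp(·) multiplying a baseline hazard in the Cox model.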
Table 2. Considerations in the Analysis of Risk Model Studies
1. Base analyses, including any hypothesis testing, on the primary and major secondary outcomes specified before the study.
2. Consider possible bias because of missing data.
3. Consider the issue of multiple comparisons when evaluating many prognostic factors or cutpoints, and adjust tests of significance accordingly.
4. Beware of the problems associated with the interpretation of stepwise multiple regression models, including model instability and likely exaggeration of coefficient estimates and their associated P values.
5. Adjust the effect of new prognostic factors for existing prognostic factors of recognized and accepted importance.
6. Outcome differences between subgroups should be assessed by testing the interaction between the prognostic factor and the variable defining the subgroups rather than by separate analyses within subgroups.
7. Interpret with caution apparent outcome or prognostic marker differences between subgroups (many such differences arise from multiple testing or small sample size within subgroups).
8. Analysis of subgroups defined only during or after completion of the study should be acknowledged as exploratory.
9. When reporting the results of a prognostic factor study, do the following:
   a) Clearly state the study design: exploratory/confirmatory, prospective/retrospective, treatment (eg, randomized or standardized), blinding, main outcomes, etc.
   b) Report the number of patients excluded because of missing data.
   c) Specify study duration, including criteria for study termination (if relevant).
   d) Report methods of measurement of prognostic markers, with information about reproducibility if possible.
   e) Define clearly all study endpoints.
   f) Summarize outcomes as quantitative estimates and confidence intervals.
   g) Emphasize the outcome differences observed for all patients more than those found among subgroups.
   h) Discuss any weaknesses of the study, especially those related to subgroup analyses and multiple comparisons.
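The adjustment called for in item 3 can take several forms; one minimal sketch is a simple Bonferroni correction, shown here with made-up P values:

```python
# Bonferroni correction: with k tests, compare each P value against
# alpha / k (equivalently, multiply each P value by k, capped at 1).
# The P values below are invented for illustration.
p_values = [0.004, 0.030, 0.200, 0.650]
alpha = 0.05
k = len(p_values)

adjusted = [min(p * k, 1.0) for p in p_values]
significant = [p_adj < alpha for p_adj in adjusted]

for p, p_adj, sig in zip(p_values, adjusted, significant):
    label = "significant" if sig else "not significant"
    print(f"P = {p:.3f} -> adjusted {p_adj:.3f} ({label})")
```

Bonferroni is the most conservative choice; stepwise variants (eg, Holm) preserve the family-wise error rate with somewhat more power.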
Like a slope when there is only one independent variable, the risk factor coefficient reflects the rate of change in the outcome variable for every unit change in the predictor variable, adjusted for any other variables in the model. It can be shown that, in logistic regression models, the exponential function (antilogarithm) of the coefficient is equal to the odds ratio. Likewise, it can be demonstrated that, for proportional hazards regression models, the exponential function of the coefficient is equivalent to the hazard ratio, which represents the immediate risk of an event. Generally, the coefficients or the derived ratios are reported as a point estimate along with a measure of precision, such as the standard error or 95% confidence limits on the estimate, reflecting the variability in measures and the sample size of the population. Like variable coefficients, the derived estimates of relative risk are automatically adjusted for other variables included in the model. All models are based on specific assumptions that must be satisfied for the results to be interpretable and valid. For instance, all models assume that the observations are independent of one another, ie, from separate patients. Collinearity exists when a sufficient correlation exists between ≥ 2 variables to prevent reliable estimation of their individual regression coefficients. Collinearity may reduce the power of the model and complicates the interpretation of the contributions of the correlated predictors. Extreme examples of collinearity are when a variable is included twice or when one variable is essentially a surrogate for the other with near-perfect correlation. For lesser degrees of collinearity, investigators may choose to run the risk of ignoring it, include only the more relevant of 2 correlated factors, create a composite variable, or use another type of analysis or regression; however, none of these choices are optimal.

Predictive (Risk) Models
Often, the primary purpose of a model is not focused on the individual risk factors but on the overall prediction or estimation of the risk of the outcome of interest. Such predictive models may also be used to group or classify subjects into discrete risk categories (risk models). Factors to consider in the design and analysis of risk model studies are summarized in Tables 1 and 2. The simplest risk model classifies patients into 'high' and 'low' risk categories. The development and evaluation of predictive models or risk models are similar to those used to develop and evaluate clinical tests. In its simplest form, a predictive model attempts to predict a single dichotomous (yes/no) outcome from a single dichotomous risk factor. For instance, one may wish to predict the risk of febrile neutropenia from a single factor such as sex. As with any test, perfect prediction of the actual outcome, such as a true positive or true negative result, is seldom achieved. Rather, some subjects predicted to be at low risk may actually experience the event (ie, false negative), whereas some at high risk may not (ie, false positive). Therefore, subjects can be grouped according to the predicted model results and/or the actual outcome observed. As with clinical test performance, model predictive performance can be assessed in a variety of ways, including sensitivity, specificity, and predictive value (Figure 2; Appendix I). The sensitivity of a model is the probability of being correctly classified if an event occurred, and specificity is the probability of being correctly classified if an event does not occur. Obviously, from a clinical and modeling perspective, greater interest lies in the predictive value or post-test probability, ie, the probability of an event occurring based on the result predicted by the model.

Figure 2. Measures of Risk Model Accuracy

                      Observed Results
Predicted Results     Event      No Event
High Risk             A          B            A + B
Low Risk              C          D            C + D
                      A + C      B + D

Sensitivity = A / (A + C)
Specificity = D / (B + D)
PPV = A / (A + B)
NPV = D / (C + D)
LR+ = sensitivity / (1 – specificity)
LR– = (1 – sensitivity) / specificity
Diagnostic OR = (A × D) / (C × B)

Abbreviations: LR = likelihood ratio; NPV = negative predictive value; OR = odds ratio; PPV = positive predictive value

Model Accuracy
Model predictive performance or accuracy is based on discrimination, calibration, and power.

Model Discrimination
The predictive accuracy of a model is often summarized on the basis of the likelihood ratio, representing the likelihood or odds of the event based on the predicted outcome divided by the likelihood of the event in the overall population. The full characterization of model performance is similar to that of assessing the performance of a test by quantifying sensitivity and specificity. In the context of a prognostic model, sensitivity represents the probability of individuals experiencing events being correctly predicted to be at high risk by the model. Similarly, specificity represents the probability of individuals not experiencing events being correctly predicted to be at low risk by the model (Figure 2). Overall model discrimination is then reflected in the test performance over the full range of possible criteria for defining high versus low risk, ie, all pairs of values for the true positive rate (sensitivity) and the false positive rate (1 minus specificity; Appendix II). The graph defined by these pairs of points is termed a receiver operating characteristic (ROC) curve (Figure 3). For dichotomous outcomes, as in logistic regression analysis, the area under the ROC curve (AUC) is equivalent to the C statistic as a measure of concordance.2 For reasonable models, the AUC or C statistic varies between 0.5 and 1.0, with the best models approaching 1. The diagonal line defining the pairs of points where the true positive and false positive rates are equal represents the curve expected if the model conveys no information and has no discriminating capability. The farther the ROC curve deviates from the diagonal toward the upper left corner of the graph, the greater the model discrimination. Overall model accuracy or discrimination performance can be estimated on the basis of the model diagnostic odds ratio. In the simple dichotomous situation, this represents the ratio of the post-test odds in the high-risk group to the post-test odds in the low-risk group, or the likelihood ratio positive divided by the likelihood ratio negative (Figure 2; Appendix II). The ratio becomes greater as the likelihood ratio positive becomes larger and the likelihood ratio negative becomes smaller. Importantly, the model or diagnostic odds ratio often remains relatively constant over the range of possible cutpoints.
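The Figure 2 quantities follow directly from the 4 cells of the 2 × 2 table; the counts below are invented for illustration:

```python
# Cells of the 2 x 2 table in Figure 2 (hypothetical counts):
# A = high risk & event, B = high risk & no event,
# C = low risk & event,  D = low risk & no event
A, B, C, D = 40, 60, 10, 190

sensitivity = A / (A + C)                  # P(high risk | event)
specificity = D / (B + D)                  # P(low risk | no event)
ppv = A / (A + B)                          # P(event | high risk)
npv = D / (C + D)                          # P(no event | low risk)
lr_pos = sensitivity / (1 - specificity)   # likelihood ratio positive
lr_neg = (1 - sensitivity) / specificity   # likelihood ratio negative
diagnostic_or = (A * D) / (C * B)          # equals lr_pos / lr_neg

print(f"Sensitivity {sensitivity:.2f}, Specificity {specificity:.2f}")
print(f"PPV {ppv:.2f}, NPV {npv:.2f}")
print(f"LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}, Diagnostic OR {diagnostic_or:.2f}")
```

Note that, unlike sensitivity and specificity, the predictive values depend on the prevalence of the event in the population studied.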
Model Calibration
Evaluation of model calibration, or its fit to the data, is generally based on the agreement between the predicted risk score probabilities based on the model and the actual observed probabilities. In residual analysis, this error in estimation can be evaluated using residual plots. For risk models, a weighted average global score can be generated, and the accuracy of the scoring system can be evaluated by plotting the observed versus predicted outcomes. Prediction rules can then be generated based on the most accurate scoring system. Sensitivity, specificity, predictive value, and likelihood ratio can be estimated for each model and prediction rule. Calibration can be quantified on the basis of the slope of the prognostic index.3 The slope of the prognostic index, or calibration slope, represents the regression coefficient of the prognostic index as the only covariate in a logistic regression model. If observed and predicted probabilities agreed perfectly, the calibration slope would equal 1.

Figure 3. Empirical ROC Curve. (The curve plots the true positive rate against the false positive rate, each ranging from 0 to 1.)

Model Power
The overall power of a model to predict the outcome of interest can be based on a variety of measures:
1) The average prediction error or Brier score, ranging from 0 (perfect) to 0.25 (worthless);
2) The D statistic, which represents a scaled version of the model χ2 statistic, (χ2 – 1) / n, where n is the number of subjects; and
3) The model R2, representing the proportion of the variation in outcome that is explained by the independent variables in the model.4
The R2 increases as the number of predictive variables in the model increases, and the predictive factors in a model with an R2 close to 1 accurately predict the outcome. Another way to view the R2 is as the potential to further improve outcome prediction with consideration of additional variables.

Missing Data
Complete data are critical to powerful and unbiased risk estimation in multivariate models. Most modeling methods remove from the calculations subjects who are missing any of the variables included in the model, potentially biasing the model results. Therefore, all reasonable efforts should be taken to minimize missing data in a study. Any data missing despite these efforts should be evaluated for any relationship to the various outcomes and covariates. When a limited amount of data is missing completely at random, various imputation techniques are available to provide reasonable 'guesses' for the missing values.
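As the simplest possible sketch of imputation (a single mean imputation, defensible only when values are missing completely at random; the data are invented, and multiple imputation is generally preferred because it better preserves the variance structure):

```python
# Hypothetical baseline laboratory values with missing entries (None):
values = [4.2, None, 3.8, 5.1, None, 4.6]

# Mean of the observed (non-missing) values
observed = [v for v in values if v is not None]
mean = sum(observed) / len(observed)

# Replace each missing value with the observed mean
imputed = [v if v is not None else mean for v in values]
print(imputed)
```

Single mean imputation understates variability, which is why the multiple imputation techniques mentioned above are preferred in practice.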
Model Validation
Ideally, model validation is based on an independent population of individuals not included in the original developmental sample. Internal validation can be assessed using an intensive resampling technique, or bootstrapping, which permits one to make inferences about the underlying population from which the sample originated by drawing samples with replacement from the original sample.5 The procedure recommended by Steyerberg and colleagues includes the following: (1) drawing a random bootstrap sample of the same size with replacement from the original; (2) selecting the same covariates and estimating the logistic regression coefficients and performance indices as described earlier; and (3) evaluating the model by comparing the performance in the bootstrap sample with that in the original sample, which yields an unbiased estimate of the over-optimism expected for the model based on the original sample.6

Development of Neutropenia Risk Models
Patient Population
The population to be used as the developmental data set must be fully characterized, including type and stage of disease, treatment given, and demographic characteristics including age, sex, and race. The source of the population must be fully defined. Patients, rather than encounters, represent the primary unit of analysis.

Data Set Characteristics
Important variables and any cutpoints should be defined in advance.

Dependent Variables
The primary outcome variables sought should include various measures of myelosuppression, such as the incidence, severity, and duration of neutropenia; various infectious complications, including febrile neutropenia; and the impact on dose reduction, treatment delay, or overall dose intensity. Secondary outcome variables may include quality of life assessment and economic measures, most notably related to resource utilization. Outcomes should be assessed separately for each cycle of therapy as well as cumulatively across the entire course.

Independent Variables
Predictive and prognostic variables should include patient and practice demographics, comorbidities and recognized prognostic measures, disease status, baseline counts, and treatment-related factors including regimen, schedule, and dose intensity, as well as supportive care interventions. When measures have been recorded as continuous variables or multiple categories, the analysis should initially be based on such measures rather than arbitrarily imposed cutpoints. When incorporated, cutpoints should be based on standard and referenced parameters rather than convenient or arbitrary ones. Possible dependent and independent variables for analysis are listed in Table 3.

Table 3. Dependent and Independent Variables of Interest in Neutropenia Risk Modeling

Dependent Variables/Outcomes
Primary outcomes (cycle 1; overall all cycles; initial occurrence)
  Neutropenic events
    Severe or febrile neutropenia
    Febrile neutropenia hospitalization
    Documented infection
  Mortality
    Infection-related mortality
    All deaths
  Reduced dose intensity
    Dose reductions
    Treatment delays
    Relative dose intensity
Secondary outcomes
  Quality of life
    Summary health profile measures
    Quality-adjusted life years
  Economic
    Direct costs
      Cost of hospitalization (bed costs, length of stay, diagnostic tests, medications [antibiotics, growth factors], transfusions)
      Ambulatory costs of treating or preventing febrile neutropenia
    Indirect costs
      Out-of-pocket expenses, lost wages

Independent Variables: Prognostic/Predictive Factors
  Demographic factors: age, sex, race
  Comorbidities
  Previous treatment
  Body surface area
  Performance status
  Treatment information
    Regimen and drugs
    Dosage and schedule
    Supportive care (growth factors, antibiotics)

Data Analysis

Descriptive
Each study measure should be assessed individually for completeness (missing data), consistency, and quality. Missing data must also be evaluated for any relationship with the primary outcomes or any of the prognostic/predictive variables. If any variable to be used in further analysis has > 5% missing data, consideration should be given to using one of the available imputation techniques. Multiple imputation techniques that provide unbiased estimates and reasonably preserve the variance structure are available. Summary measures should be generated for each variable, including estimates of central tendency (mean, median, proportion, etc) and variability (standard error, confidence limits, etc).

Analytic
The relationship between each outcome and each covariate should then be studied in univariate and multivariate models. Formal hypothesis testing should be limited to relationships considered a priori based on the literature or clinical understanding.

Modeling
Categoric or composite outcomes, which can be expressed dichotomously, may be studied using multivariate logistic regression analysis. Alternatively, outcomes representing time to an event or the duration of events may be studied using the proportional hazards regression method of Cox. Covariates to be considered in the models should be specified a priori, based on previous models generated on smaller retrospective data sets or on a firmly established biologic or clinical rationale. Models should include as covariates each treatment agent and its possible interactions as well as factors of known prognostic significance. Models based on the initial (developmental) population should subsequently be validated, ideally on an independent population (validation). The predictive performance of each model, including discrimination, calibration, and power, should be evaluated. If useful and appropriate, construction of a prognostic index or score, in which the regression coefficients are used as weights, may be considered. Formal tests for interaction should be applied rather than hypothesis testing between subgroups that may have limited power. Additional issues that should be addressed during the modeling process include the following: (1) bias because of missing data; (2) the potential impact of multiple comparisons when evaluating multiple prognostic factors or data cutpoints; and (3) the limitations of stepwise regression models, including model instability and the exaggeration of coefficient estimates. Modeling should be based on fixed models incorporating variables of accepted importance and others with an underlying biologic or clinical rationale, and differences in subgroups should be evaluated by testing for interaction. Although the development and validation of the model constitute critical steps in the risk model process, several additional factors must be considered, including the target user and use of the model, the ease of implementation and compliance, as well as the ultimate evaluation of the impact of the model on patient care and clinical outcomes. Properly developed and validated risk models have the potential not only to provide clinicians with a better understanding of important disease processes but also to facilitate the selection of patients for the targeted application of potentially toxic or costly interventions in a more effective and cost-effective manner.
Appendix I. Measures of Model Performance

Sensitivity: the probability of high risk in those experiencing the outcome of interest
False-positive rate: the probability of high risk in those not experiencing the outcome of interest
Specificity: the probability of low risk in those not experiencing the outcome of interest
False-negative rate: the probability of low risk in those experiencing the outcome of interest
Positive predictive value: the probability of the outcome of interest in those designated as high risk
Negative predictive value: the probability of no outcome in those designated as low risk
Accuracy: the probability of correct prediction (true positive and true negative)
Appendix II. Measures of Model Discrimination

Likelihood ratio: ratio of the odds of the outcome based on the model over the odds of the outcome in all subjects (post-test odds / pretest odds)
- Positive: odds of the outcome in those predicted high risk / odds of the outcome in all subjects; alternatively, likelihood ratio positive = sensitivity / (1 – specificity)
- Negative: odds of the outcome in those predicted low risk / odds of the outcome in all subjects; alternatively, likelihood ratio negative = (1 – sensitivity) / specificity
Diagnostic odds ratio: ratio of the odds of the outcome in those predicted by the model over the odds of the outcome in those not predicted by the model
- (TP × TN) / (FP × FN)
- Likelihood ratio positive / likelihood ratio negative
C statistic: area under the ROC curve plotting the true positive rate (sensitivity) versus the false positive rate (1 – specificity) for different thresholds

References
1. Altman DG, Lyman GH. Methodological challenges in the evaluation of prognostic factors in breast cancer. Breast Cancer Res Treat 1998; 52:289-303.
2. Harrell FE, Califf RM, Pryor DB, et al. Evaluating the yield of medical tests. JAMA 1982; 247:2543-2546.
3. Cox DR. Two further applications of a model for binary regression. Biometrika 1958; 45:562-565.
4. Steyerberg EW, Harrell FE, Borsboom GJ, et al. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 2001; 54:774-781.
5. Efron B, Tibshirani R. An Introduction to the Bootstrap. Monographs on Statistics and Applied Probability. New York: Chapman & Hall; 1993.
6. Steyerberg EW, Eijkemans MJC, Harrell FE, et al. Prognostic modeling with logistic regression analysis: in search of a sensible strategy in small data sets. Med Decis Making 2001; 21:45-56.