Principles of Successful Cancer Screening


SCREENING FOR CANCER

1055-3207/99 $8.00 + .00

PRINCIPLES OF SUCCESSFUL CANCER SCREENING

Robert A. Smith, PhD

In 1999, approximately 1.22 million Americans will be diagnosed with invasive cancer. This estimate excludes basal and squamous cell skin cancers, of which more than 1 million new cases are expected, and in situ cancers (except bladder) of the breast, cervix, and melanoma. In 1999, more than 500,000 Americans are expected to die from cancer. Overall, cancer is the second leading cause of death in the United States, and among individuals who do not die from their disease, the lasting effects of treatment may adversely and permanently affect the quality of remaining years of life.

The control of cancer is a formidable public health challenge, although the accumulation of knowledge about disease etiology and natural history has provided some clear pathways toward reductions in disease burden. Current disease control strategies include prevention, early detection, and treatment. Without question, the prevention of cancer offers the greatest potential for reductions in disease burden. The best-known example of a prevention-related cancer control strategy, and the major focus of anticancer public health initiatives, is discouraging the use of tobacco, which the American Cancer Society estimates accounts for nearly one third of cancer deaths. Other lifestyle factors associated with cancer risk, such as nutrition, physical activity, and sun protection, offer less certain paths to cancer prevention for individuals, but epidemiologic evidence indicates that the widespread adoption of healthier lifestyles would reduce future disease incidence and thus attendant morbidity and mortality. Cancer prevention strategies at the individual and population level are long-term investment strategies, for which future dividends will be gained through continuous risk reduction.

From the Department of Cancer Control, American Cancer Society, Atlanta, Georgia

SURGICAL ONCOLOGY CLINICS OF NORTH AMERICA, VOLUME 8, NUMBER 4, OCTOBER 1999


Prevention is not an option for the several million individuals who will be diagnosed this year with invasive and in situ cancers of the body and skin, nor is it an option for the millions of individuals who currently have undiagnosed disease. Furthermore, preventive strategies do not exist for all cancers, and where they exist they hold the possibility, but not the certainty, of cancer prevention. For these individuals, the control of cancer depends on successful treatment. Yet cancer statistics provide a stark reminder that treatment does not always assure the cure of disease. One common feature of most cancers is that prognosis generally is better and treatment more successful if cancer is detected early (i.e., while it is still localized).48 This is less true for digestive system cancers (esophagus, stomach, liver, gallbladder, and pancreas) and brain cancers. However, for some of the more common cancers, including cancers of the skin, breast, cervix, endometrium, oral cavity, ovary, colon and rectum, prostate, and lung, the more favorable prognosis among cases detected early has resulted in significant public health effort to devise successful secondary prevention strategies. Not all of these efforts have been successful, but nearly half of the expected new cases diagnosed this year are among cancers for which early cancer detection recommendations exist.2

Secondary prevention of cancer is distinguished from primary prevention in that it is an intermediate intervention that targets disease in process.59 Secondary prevention cancer control interventions can reduce cancer morbidity and mortality by (1) diagnosing invasive disease at an earlier, more favorable prognostic stage and (2) detecting precursor lesions associated with some cancers, the elimination of which prevents progression to invasive disease. Screening for cancer is a secondary prevention strategy.
Screening is "the application of various tests to apparently healthy individuals to sort out those who probably have risk factors or are in the early stages of specified conditions." For any given cancer site, the potential of secondary prevention strategies to reduce morbidity and mortality is based on well-defined criteria for the evaluation of screening efficacy and the manner in which screening is implemented in clinical practice. These are two separate issues. In order to justify mass screening or an offer of cancer screening to an individual patient, there must be general agreement that certain minimal criteria related to disease burden, the value of early detection, test characteristics (accuracy, safety), and cost have been demonstrated. If these criteria are met, the potential of screening to contribute to cancer control outcomes depends on the degree to which screening programs can achieve high quality, high levels of participation by health care providers and the at-risk population, minimal adverse consequences, and appropriate follow-up for patients with abnormal test results.

CRITERIA FOR SCREENING

The decision to screen an at-risk population for preclinical signs of cancer is based on well-established criteria related to the disease in question and the screening tests that are used to identify individuals who may have occult disease. Although the overall objective of a screening program is to reduce morbidity and mortality from cancer, the goal of screening per se is the "application of a relatively simple, inexpensive test to a large number of persons in order to classify them as likely, or unlikely, to have the cancer which is the object of the screen."10 The emphasis on likelihood underscores the limits of what should be expected from screening (i.e., screening tests are not diagnostic tests). A person with an abnormal screening test does not have a definitive diagnosis until additional, more sophisticated diagnostic tests are completed. The emphasis on likelihood also is important because screening tests are inherently limited in their accuracy, which varies by test, cancer site, and individual characteristics. Although most screening interpretations are accurate, it is inevitable that some individuals are identified as possibly having cancer when they do not, and screening tests fail to identify some individuals who do have the disease.52 The balance of accuracy versus error cannot be considered in absolute terms but rather should be evaluated in terms of the relative consequences of one or the other kind of error, as discussed later.

Before large numbers of asymptomatic individuals undergo routine testing for cancer, certain conventional criteria should be met.10,20,40,59 First, the disease should be an important health problem. Second, there should be a period when the disease is detectable in an asymptomatic individual (i.e., there should be a detectable preclinical phase). Third, treatment of disease diagnosed before the appearance of symptoms should offer advantages compared with treatment of symptomatic disease. Fourth, a screening test should be affordable and achieve an acceptable level of accuracy in the population undergoing screening.
Finally, the test must be acceptable not only to targeted individuals but also to health care providers because they are the primary gatekeepers to the use of screening tests by the population. Although accuracy is important, the consequences of false-positive or false-negative test results and the acceptability of these consequences to providers and the public can influence the acceptability of a screening program and thus its potential to contribute to cancer control efforts.37 Similar considerations pertain to financial costs. Apart from the situation in which any one factor nullifies all others (e.g., the absence of a treatment advantage for early disease), generally none of these criteria alone, no matter how favorable or unfavorable in the matrix of relevant factors, is sufficient to accept or reject offering screening to the population. Rather, they must be considered collectively.

Characteristics of the Disease

Disease Burden

Among all chronic diseases, cancer is a leading cause of morbidity and mortality. Among men, the lifetime risk of developing invasive cancer is 44.7%, and the lifetime risk of dying from cancer is 23.61%; among women, the lifetime risk of developing invasive cancer is 38%, and the lifetime risk of dying from cancer is 20.5%.48 Cancer is also a leading cause of premature mortality, measured as the expected longevity (i.e., years of potential life left) from the age at death had the person not died of cancer. However, despite this overall disease burden, we do not screen the public for any sign of cancer; rather, public health initiatives focused on cancer screening generally include only cancers of the breast, cervix, colon and rectum, prostate, and skin, and screening recommendations are limited to specific age groups.

In order to justify cancer screening, the disease must be recognized as an important health problem. Importance can be gauged by incidence, morbidity, and associated mortality. However, there are no threshold values of "importance" for any one or combination of these measures. Rather, the more salient factor is that the public judges the disease to be an important health problem. A disease may account for significant mortality, but disease onset and death may occur late in life. On the other hand, a cancer with a lower mortality rate but early age at onset may account for many more years of premature mortality compared with a cancer with higher mortality but late age at onset. Finally, overall survival for individuals diagnosed with the cancer may be high, but the associated morbidity may be sufficient to consider screening for the disease.

Identifiable Population at Risk

A factor related to the importance of the disease as a public health problem is the identification of the population at risk. In other words, there should be sufficient prevalence of preclinical disease in the population undergoing screening to make screening worthwhile. Eligibility criteria for cancer screening among average-risk adults are commonly defined by age, based on an annual age-specific incidence rate determined to represent a threshold of disease burden. In addition, screening recommendations may be based on the estimated prevalence of lesions regarded as preinvasive that, if detected and treated, would prevent the eventual progression to invasive disease.

For subgroups of the population at appreciably higher than average risk, more aggressive routine surveillance may be recommended (i.e., beginning screening at an earlier age and screening more frequently). Under these circumstances, screening at an earlier age can be recommended if the age-specific incidence rates in the high-risk group at a younger age are equal to or exceed those of the average-risk population at the time routine screening should commence. For example, the Cancer Genetics Studies Consortium recommended that individuals who have tested positive for a mutation of known significance on a breast cancer susceptibility gene or who have a family history suggestive of significantly elevated risk could reasonably begin breast cancer screening at age 25.8 The ACS recommends that women begin screening for cervical cancer at age 18 or after the onset of sexual activity, whichever comes first.49 For individuals at higher risk of a diagnosis of colorectal cancer at an earlier age due to an individual or family history of colorectal cancer or polyps, known inheritance of familial adenomatous polyposis (FAP) or hereditary nonpolyposis colorectal cancer (HNPCC), or a history of inflammatory bowel disease, both the ACS and a multidisciplinary panel convened by the Agency for Health Care Policy and Research (AHCPR) have recommended that screening should begin at an earlier age and occur at a greater frequency than is recommended for average-risk adults.

Greater cost effectiveness in screening programs could be achieved if the population could be segmented on the basis of risk factors into a group from which all or at least most incident cases would likely derive. If this were possible, those who are not at risk for a particular cancer would be spared the requirement and expense of routine screening. One obvious example, although not currently public health policy, would be to screen for lung cancer only among former and current smokers, individuals with significant life-long exposure to secondary tobacco smoke, or individuals with high-risk occupational exposures. In general, however, apart from broad categorical selection factors such as gender and age, efforts to segment the population on the basis of risk factors to improve screening efficacy have not been successful. For example, numerous efforts have been made to identify a higher risk group of women to target for breast cancer screening, but in no instance has this segmentation based on conventional risk factors had the potential to identify more than half of new cases.

Detectable Preclinical Phase

The detectable preclinical phase, which is also known as the sojourn time, is the period during which preclinical disease is detectable by a screening test.13,19 The duration of the detectable preclinical phase depends on the characteristics of the disease and the screening test. Both knowledge of the detectable preclinical phase and the extent to which it varies within the population at risk are important for establishing screening protocols. The detectable preclinical phase is often confused with the lead time, which is the amount of time by which the diagnosis is advanced by the screening test. In an ideal world, the lead time and the sojourn time always would be equivalent or nearly so, meaning that the beginning of the detectable preclinical phase coincided with the occasion of a screening test. Of course, this is unrealistic, but it is true that the shorter the screening interval, the greater the likelihood of achieving lead times that are more similar to sojourn times. An important consideration when determining whether screening for a cancer is feasible is that the sojourn time must be sufficiently long to ensure that periodic screening will detect occult disease at a more favorable stage. If the sojourn time is short, then screening may be impractical due to the requirement for testing at an unrealistic frequency. If sojourn times vary by age or other factors, then a "one size fits all" program may offer adequate protection to one age group and be inadequate for another.
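The relationship between sojourn time, screening interval, and lead time can be made concrete with a deliberately simplified sketch. It assumes, purely for illustration, that every case has the same fixed sojourn time, that the test is perfectly sensitive, and that the start of the detectable preclinical phase falls at a uniformly random point within the screening interval. The function name `screening_yield` is hypothetical, and real sojourn times vary between individuals, so the numbers it produces are not estimates for any actual program.

```python
from typing import Tuple


def screening_yield(sojourn_years: float, interval_years: float) -> Tuple[float, float]:
    """Return (fraction of cases screen-detected, mean lead time among those cases).

    Simplifying assumptions (not from the article): fixed sojourn time,
    perfect test sensitivity, and onset of the detectable preclinical phase
    uniformly distributed across the screening interval.
    """
    if sojourn_years <= 0 or interval_years <= 0:
        raise ValueError("sojourn time and interval must be positive")

    # The wait from onset of detectability to the next screen is
    # Uniform(0, interval_years); lead time = sojourn time - wait.
    if interval_years <= sojourn_years:
        # Every case meets a screen while still preclinical.
        return 1.0, sojourn_years - interval_years / 2.0

    # Only cases whose wait is shorter than the sojourn time are caught;
    # the remainder surface clinically as interval cancers.
    return sojourn_years / interval_years, sojourn_years / 2.0


if __name__ == "__main__":
    for sojourn in (1.7, 3.3):  # illustrative values, in years
        for interval in (1.0, 2.0):
            detected, lead = screening_yield(sojourn, interval)
            print(f"sojourn={sojourn} interval={interval} "
                  f"detected={detected:.2f} mean_lead={lead:.2f}")
```

Under these assumptions the mean lead time can approach, but never exceed, the sojourn time, and an interval longer than the sojourn time leaves a share of cases to appear as interval cancers.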


For this reason, screening intervals must be tailored to estimated sojourn times, which may vary by both disease histology and age. Knowledge of sojourn time is important for determining screening intervals because the sojourn time defines the upper limit of the lead time that might be gained. When a screening interval equals or exceeds the mean sojourn time, there is increased potential for a higher rate of interval cancers and thus poorer prognosis in that subset of the incident cases. Screening for breast cancer is a good example of this axiom. Data from the Swedish Two-County Study have shown that the detectable preclinical phase for women aged 40 to 49 is shorter (1.7 years) compared with women aged 50 to 59 (3.3 years) and women aged 60 to 69 (3.8 years). Screening programs that offered breast cancer screening every 24 months to all women over age 40 have tended to show poorer results among the under-50 age group compared with screening programs that offer screening at a shorter interval. Although annual breast cancer screening for women over age 50 improves sensitivity somewhat compared with screening every 2 years, annual screening is essential to the design of a breast cancer screening program for women aged 40 to 49 due to faster tumor growth rates in that age group.

Screening-Detected Cancers Should Have Better Prognosis

Treatment of screen-detected cancers should offer advantages measured by lower mortality, lower morbidity, and improved quality of life compared with treatment of symptomatic disease. It is important to avoid equating detection of occult disease with better prognosis because this may not always be the case.39 If a significant proportion of screen-detected cancers in asymptomatic individuals is no longer localized, then screening may not offer sufficient benefits to be justified.
On the other hand, shortening the screening interval may produce more favorable benefits; therefore, if a shorter but still practical interval were possible, it might be considered. Screening, however, may not be justified under these circumstances, but ultimately a more sensitive test may emerge that would offer the potential to diagnose a malignancy at a time when there was still potential for early treatment to alter the natural history of the disease. Nevertheless, if treatment of screen-detected cancers offers no advantage over treatment after symptoms develop, then screening is pointless. However, whereas the value of screening is most commonly measured in terms of lower mortality, a significant improvement in disease-associated morbidity would be an acceptable advantage over treatment of later-stage disease.

Characteristics of the Screening Test

Ultimately there are four possible outcomes to a screening test (Table 1). An individual without disease may be correctly labeled as not having the disease or incorrectly labeled as possibly having the disease. Conversely, an individual with disease may be correctly labeled as possibly having the disease or incorrectly labeled as not having the disease. Although these categories seem straightforward, the actual measurement is rather complex for the simple reason that screening tests are not in themselves definitive, and final resolution of the accuracy of the original interpretation may take weeks or months or may not occur at all if follow-up testing is rejected.

Table 1. CONVENTIONAL MEASURES OF SCREENING EFFICACY

                            Disease Status
Screening Test Results      Yes       No
positive                    a         b
negative                    c         d

Sensitivity = a/(a + c). Specificity = d/(b + d). Positive predictive value = a/(a + b).

By convention the outcome of a screening test is measured as follows:

True positive: Cancer or precursor lesion diagnosed within a certain period of time after an abnormal screening examination.

True negative: No known cancer or precursor lesion diagnosed within a certain period of time after a normal screening examination.

False negative: Cancer or precursor lesion diagnosed within a certain period of time after a normal screening examination. False-negative results generally become apparent when symptoms develop during the interval between regularly scheduled examinations (hence the label "interval cancers").

False positive: No known cancer or precursor lesion diagnosed within a certain period of time after an abnormal screening examination.

In practice, an individual who undergoes screening may receive an indeterminate interpretation, which may be resolved with further testing at the time of screening, within several weeks of the test, or after follow-up testing recommended at an intermediate interval (e.g., 6 months). For purposes of evaluation, individuals with an indeterminate finding may be classified on the basis of the original interpretation or the subsequent interpretation, but as a general rule one strategy or the other should be followed consistently.
It could be argued, however, that all indeterminate cases that are not resolved on the day of screening and not determined to have cancer within the specified interval (as described previously) should be labeled as false positives, because the recommendation for interim testing is tantamount to an abnormal finding and has costs for both individuals (anxiety) and the health care system (additional testing). For purposes of program evaluation, these summary categories described previously are used to estimate measures of screening validity and efficacy, in particular sensitivity, specificity, and positive predictive value (see Table 1).
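The three measures defined under Table 1 can be computed directly from the four cell counts. The following is a minimal sketch; the function name `screening_measures` is an illustrative assumption, and the counts in the usage lines are the base-example figures this article gives with Table 2 (400 true positives, 995 false positives, 100 false negatives, and 99,500 - 995 = 98,505 true negatives).

```python
from typing import Dict


def screening_measures(a: int, b: int, c: int, d: int) -> Dict[str, float]:
    """Compute the Table 1 measures from the 2 x 2 cell counts.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    if a + c == 0 or b + d == 0 or a + b == 0:
        raise ValueError("each denominator must be non-zero")
    return {
        "sensitivity": a / (a + c),  # detected cases / all diseased
        "specificity": d / (b + d),  # normal results / all non-diseased
        "ppv": a / (a + b),          # cancers / all positive screens
    }


if __name__ == "__main__":
    measures = screening_measures(a=400, b=995, c=100, d=98_505)
    for name, value in measures.items():
        print(f"{name}: {value:.1%}")
```

Note that the result depends entirely on how each individual was assigned to a cell, so the classification rule for indeterminate findings discussed above changes the computed values.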


Sensitivity

Sensitivity is the probability that the screening test will detect cancer among all asymptomatic individuals with the disease in a population undergoing screening. Sensitivity is estimated as the proportion of all individuals with the disease (true positives and false negatives) who were correctly identified by the screening test (true positives only) within a specified period of time. Sensitivity is calculated as follows: true positive/(true positive + false negative). In practice, measuring sensitivity is a challenge because false-negative results are not easily identified. Measuring program sensitivity based on the method described earlier depends not only on clear histologic criteria for a precursor lesion or invasive disease but also on the ability to follow a population over a prolonged period of time in order to identify cancers diagnosed within the follow-up period after a normal examination. Some error is inherent in this method because a cancer diagnosed within the follow-up period after a normal examination is assumed to have been detectable during the initial examination; conversely, a cancer that was truly missed but not diagnosed within the follow-up period is not labeled as false negative. These two misclassifications may partly offset each other. The practical limits of this method (i.e., the challenge of identifying all false-negative results at some future time) have led to alternative strategies to estimate program sensitivity. Day and Walter14 have proposed an alternative approach to estimating sensitivity, which is based on the proportional incidence of interval cancers, that is, the ratio of the number of observed interval cancers to the incidence expected in the absence of screening. The complement of this fraction is the program sensitivity.
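The proportional incidence approach described here reduces to a one-line calculation. The sketch below follows only the verbal definition in this article (one minus the ratio of observed interval cancers to the incidence expected without screening); it is not a reproduction of the published estimator, which may include refinements not described here. The function name and the counts in the usage lines are hypothetical.

```python
def program_sensitivity(observed_interval_cancers: float,
                        expected_cancers_without_screening: float) -> float:
    """Estimate program sensitivity as the complement of proportional incidence.

    Both arguments must refer to the same population and the same
    post-screening follow-up period.
    """
    if expected_cancers_without_screening <= 0:
        raise ValueError("expected incidence must be positive")
    proportional_incidence = (observed_interval_cancers
                              / expected_cancers_without_screening)
    return 1.0 - proportional_incidence


if __name__ == "__main__":
    # Hypothetical counts: 30 interval cancers observed where 100 cancers
    # would have been expected in the absence of screening.
    print(f"{program_sensitivity(30, 100):.0%}")
```

The estimate is only as good as the expected incidence used in the denominator, which must itself be taken from a comparable unscreened population or from pre-screening rates.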

Specificity

Specificity is the probability of correctly identifying a patient as normal when no cancer exists. Specificity is estimated as the proportion of all individuals without the disease (true negative and false positive) who were correctly identified by the screening test (true negative only) within a specified period of time. Specificity is calculated as follows: true negative/(true negative + false positive). The estimate of specificity depends on the definition of a false-positive result. Specificity is lower if the definition of false-positive is the result of the initial screening examination, which may be resolved to a normal interpretation with some additional testing, compared with defining false-positive results based on biopsy results. For example, the specificity of mammography can be calculated based on the initial interpretation of a mammogram, although most women with this interpretation are returned to the normal screening pool after additional imaging. Specificity also can be calculated on the basis of whether the patient had cancer based on recommendation for surgical consultation or, more definitively, only if the recommendation actually resulted in a biopsy.5 Because specificity varies based on the definition of a false-positive result, it is important that the criteria be specified and that the definition be truly relevant for program evaluation. Achieving a low false-positive rate may be important in one screening program because of significant financial and human costs associated with a diagnostic evaluation; alternatively, it may be less important in another screening program because these costs are more modest in terms of financial costs and patient acceptance.

Positive Predictive Value

The positive predictive value (PPV) of a screening test is the proportion of all positive screening cases that result in a diagnosis of cancer, and like specificity, it varies according to the criteria for a false-positive interpretation. The PPV is calculated as follows: true positive/(true positive + false positive). The PPV is influenced by the sensitivity and specificity of the screening test as well as by the magnitude of the underlying prevalence of disease.10 Overall, however, specificity and disease prevalence have the greatest effect on PPV. Consider the following examples, as shown in Table 2. In the base example, the hypothetical disease has a prevalence at screening of 500 cases per 100,000 individuals screened. The test sensitivity is 80% because 400 cases were correctly identified by the screening test. Because 100 cases were missed by screening, the false-negative rate was 20%. Most individuals who did not have the disease tested normal, but 995 individuals (1%) had positive test results, resulting in a 99% specificity. The example in Table 2 is designed to demonstrate the degree to which performance parameters influence the PPV. In the first example, if sensitivity improves from 80% to 90%, this results in only a 2 percentage point improvement in the PPV. In the second example, one can see that the influence of specificity on the PPV is more dramatic. A decline of 1 percentage point in specificity results in a decline in PPV from 29% to 17%; a decline in specificity to 95% results in a PPV of 8%. The influence of specificity on PPV becomes clearer when one considers that small changes in the false-positive rate influence large numbers of individuals who must then have additional testing. Example three shows a similar influence of disease prevalence. The influence of disease prevalence is entirely on the numerator. Even if the test is highly accurate, PPV varies with the prevalence rate.
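The dependence of PPV on sensitivity, specificity, and prevalence can be recomputed for the scenarios of Table 2 with a short sketch. The function name `ppv` and the scenario labels are illustrative assumptions; the parameter values are those stated in the text. Because the published figures are rounded, the computed values may differ from them by a percentage point in places.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from test characteristics and prevalence.

    All three arguments are proportions between 0 and 1.
    """
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    if true_positive + false_positive == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_positive / (true_positive + false_positive)


# (label, sensitivity, specificity, prevalence per 100,000)
SCENARIOS = [
    ("base example", 0.80, 0.99, 500),
    ("sensitivity 90%", 0.90, 0.99, 500),
    ("sensitivity 70%", 0.70, 0.99, 500),
    ("specificity 98%", 0.80, 0.98, 500),
    ("specificity 95%", 0.80, 0.95, 500),
    ("prevalence doubled", 0.80, 0.99, 1000),
    ("prevalence halved", 0.80, 0.99, 250),
]

if __name__ == "__main__":
    for label, sens, spec, per_100k in SCENARIOS:
        value = ppv(sens, spec, per_100k / 100_000)
        print(f"{label:20s} PPV = {value:.1%}")
```

Varying one argument at a time, as the scenario list does, isolates the effect of each parameter while the other two are held at their base values.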
If the prevalence of disease were doubled in this example, PPV would increase to 44%, and if disease prevalence were halved, PPV would decline to 17%. When evaluating the PPV and comparative rates of PPV, it is important to consider the underlying goal of screening for the disease in question. A low PPV may indicate lower specificity, lower disease prevalence, or a combination of these two influences. A high PPV may be similarly influenced.10 Obviously, a high PPV is preferred, but a high PPV alone may not be indicative of good performance. If disease prevalence is high and specificity is high, a test with relatively poor sensitivity may still have a better PPV than a screening test with high sensitivity for a disease of lower prevalence. Whether a low PPV is acceptable or whether PPV should be improved should be considered by weighing the human and program costs of a false-positive result. Alternatively, focusing screening on a higher risk subpopulation may increase PPV due to higher disease prevalence, but this may be at the cost of the overall detection rate.

Table 2. CHANGES IN THE PREDICTIVE VALUE OF A POSITIVE TEST DUE TO VARIATIONS IN SENSITIVITY, SPECIFICITY, AND DISEASE PREVALENCE*

                            Disease Status
Screening Test Results      Yes              No                  Total
positive                    a = 400          b = 995             a + b = 1,395
negative                    c = 100          d = 98,505          c + d = 98,605
Total                       a + c = 500      b + d = 99,500      100,000

Base example. In the example above, sensitivity a/(a + c) = 80%; specificity d/(b + d) = 99%; and positive predictive value a/(a + b) = 29%.

Variations in sensitivity. If disease prevalence (500 per 100,000) and specificity (99%) remain the same, an increase in sensitivity to 90% (450/500) increases PPV from 29% to 31%; a decline in sensitivity to 70% (350/500) decreases PPV from 29% to 26%.

Variations in specificity. If disease prevalence (500 per 100,000) and sensitivity (80%) remain the same, a decline in specificity to 98% (97,510/99,500) decreases PPV from 29% to 17%; a decline in specificity to 95% (94,525/99,500) decreases PPV from 29% to 8%.

Variations in disease prevalence. If sensitivity (80%) and specificity (99%) remain the same, an increase in disease prevalence to 1,000 per 100,000 increases PPV from 29% to 44%; a decrease in disease prevalence to 250 per 100,000 results in a decline in PPV from 29% to 17%.

*Adapted from Morrison A: Screening in Chronic Disease. New York, Oxford University Press, 1992; with permission.

Sensitivity, Specificity, and PPV

These summary categories are the basic elements for measuring the performance of screening programs. As noted previously, they may be measured directly or estimated on the basis of modeling assumptions. Because most individuals who undergo screening examinations do not have cancer, nearly all normal interpretations are accurate (true negative). True-positive results are measured in the near term by biopsy results. False-negative or false-positive results are based on the assumption that cancer would have been detected, or was not present, at the time of the initial screening examination on the basis of the presence or absence of histologic confirmation of disease within the specified evaluation period (usually 1 year). PPV is a comparative measure of program efficiency and cost effectiveness.

The sensitivity and specificity of a screening test may be influenced by factors other than simply the ability of the test to detect or rule out cancer. These factors can be thought of as falling along a continuum. At one end of the continuum are factors that may be outside of the influence of programmatic modifications or improvements in quality assurance. For the most part these factors are idiosyncratic. Sojourn times that are considerably shorter than average mean that at least some portion of the interval cancer rate probably is unavoidable, because these cases are not attributable to any source of error during the previous screening examination and appear as interval cancers simply due to faster tumor growth rates. Other individual characteristics also may make screening a greater challenge in some individuals (e.g., mammography is less useful in women who have such significant breast density that little mammographic contrast is discernible in the breast parenchyma).
Finally, some disease characteristics challenge the limits of the current technology (e.g., it is a greater challenge to identify precursor signs of adenocarcinoma of the cervix compared with squamous cell carcinoma of the cervix because the former originates higher in the transformation zone and commonly escapes usual cell collection techniques).

At the other end of the continuum are factors that are entirely within reach of programmatic and quality assurance influences. These factors include tailoring screening programs to estimated sojourn times, adherence to test quality control recommendations, efforts to improve the quality of test interpretation and reduce human error, and appropriate interpretative thresholds for a positive test. The latter is especially important, because there are trade-offs between sensitivity and specificity. A program that attempts to maximize sensitivity may find that small gains are achieved at the cost of specificity, the consequences of which include increased burden on health care resources, financial costs, anxiety among individuals with false-positive interpretations, and erosion of the acceptability of screening by patients and providers. Conversely, a program that focuses on maximizing specificity (i.e., avoiding false-positive interpretations) may contribute little to the reduction in the rate of advanced disease because only the most highly suspicious cases are segregated for further evaluation.11 A screening program with low sensitivity, due either to poor quality control or an excessive desire to avoid false-positive results, may only diagnose tumors near the end of the detectable preclinical phase and thus accomplish little compared with no screening at all.

Finally, while reasonably high sensitivity in a screening program is necessary to achieve disease control objectives, it is not perfectly associated with program performance because screening programs with similar sensitivity still may vary in the rate of reduction in advanced disease due to differences in the composition of cases missed by screening.46 Because the goal of screening is to reduce the rate of advanced disease in a population, screening performance measures should be evaluated in the context of what they contribute to the distribution of prognostic factors that foretell eventual mortality.15 In the report from the Organizing Committee of the Falun Meeting on Breast Cancer Screening with Mammography in Women Aged 40 to 49 Years, held in March 1996, an update of follow-up data and screening performance measures was provided for all the world's breast cancer screening trials. Among women invited to screening, the relative risk of being diagnosed with a node-positive tumor was associated highly with the relative risk of dying from breast cancer, whereas program sensitivity was less positively associated with mortality reductions. Because measures of screening efficacy are summary measures, they are largely uninformative about the relative contributions of these many factors to estimates of sensitivity, specificity, and PPV.
It is also easy to uncritically accept these estimates as measures of maximum test performance. Although perfect sensitivity and specificity are not achievable due to variations in individuals, improvements in screening program performance are generally achievable with greater attention to quality assurance.

THE EVALUATION OF SCREENING PROGRAMS

The efficacy of a screening test is best evaluated with a population-based, randomized clinical trial with a mortality endpoint. In the evaluation of the efficacy of a screening test, it is important to distinguish the actual improvements in survival from apparent improvements, because often persons with screen-detected cases have better survival than persons diagnosed after the onset of clinical symptoms.40 A randomized trial eliminates the potential biases in observational studies that may influence survival comparisons among screen-detected and non-screen-detected cases, most notably lead time bias, length bias sampling, and selection bias. This is ideally achieved by randomly assigning individuals in the desired population to a group invited to screening and a group not
invited to screening. After a predetermined number of screening rounds and period of continued follow-up, cumulative disease-specific mortality trends between the two groups can be compared. As noted previously, the goal of screening is to gain lead time. If treatment before the onset of symptoms offers greater benefits, then improvements in survival should be associated with lead time gained, and mortality should be lower in cases diagnosed by screening. On the other hand, if the lead time gained only advances the time of diagnosis and life is not extended, then there is only the appearance of greater survival due to the longer duration from the time of diagnosis to the time of death. In this instance, death occurs at the same point in the natural history of the disease among both screen-detected and non-screen-detected cases. Lead time bias occurs when increased survival is a function only of the time gained before the point at which diagnosis would have occurred in the absence of screening. A survival benefit also may be due to the tendency of screening to detect more slow-growing, indolent cancers. Length bias sampling refers to the tendency for screening to detect more slow-growing, less aggressive disease and to be less successful at detecting more aggressive, faster growing disease. If screening selectively identifies cases at a lower risk of death, length bias sampling may influence end results. Participation in screening also may be influenced by the tendency for healthier individuals to participate in preventive health programs. Selection bias refers to the tendency for individuals who are healthier or more health conscious and with a different probability of developing and dying from cancer to participate in the program. In a population-based randomized trial, these biases are eliminated effectively.
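Lead time bias and length bias sampling, as defined above, can be illustrated with simple arithmetic. The following is a minimal sketch under stated assumptions, not an analysis of real data: the ages, sojourn times, and tumor mix are hypothetical, screening is assumed to have perfect sensitivity, and the length bias calculation relies on the standard steady-state relationship that the number of cases in the detectable preclinical phase at a single screen is proportional to incidence multiplied by mean sojourn time. The function names are illustrative.

```python
from typing import Dict


def survival_from_diagnosis(age_at_diagnosis: float, age_at_death: float) -> float:
    """Years lived after diagnosis."""
    return age_at_death - age_at_diagnosis


def screen_detected_mix(
    incidence_share: Dict[str, float],
    mean_sojourn_years: Dict[str, float],
) -> Dict[str, float]:
    """Expected share of each tumor type among cases found at one screen.

    Assumes steady state and perfect test sensitivity, so the pool of
    detectable preclinical cases is proportional to
    incidence x mean sojourn time.
    """
    weights = {
        kind: incidence_share[kind] * mean_sojourn_years[kind]
        for kind in incidence_share
    }
    total = sum(weights.values())
    return {kind: weight / total for kind, weight in weights.items()}


# Lead time bias (hypothetical patient): death occurs at age 65 whether
# or not the cancer is found early, so screening changes nothing but
# the date of diagnosis.
AGE_AT_DEATH = 65.0
clinical = survival_from_diagnosis(60.0, AGE_AT_DEATH)   # symptomatic diagnosis
screened = survival_from_diagnosis(57.0, AGE_AT_DEATH)   # screen detection
lead_time = screened - clinical
print(f"survival: clinical={clinical} y, screen-detected={screened} y, "
      f"lead time={lead_time} y")

# Length bias sampling (hypothetical tumor mix): slow and fast tumors
# arise equally often, but slow tumors remain detectable four times longer.
mix = screen_detected_mix(
    incidence_share={"slow": 0.5, "fast": 0.5},
    mean_sojourn_years={"slow": 4.0, "fast": 1.0},
)
print(f"share of slow-growing tumors among screen-detected cases: {mix['slow']:.0%}")
```

In this sketch, apparent survival rises from 5 to 8 years although the age at death is unchanged, and slow-growing tumors make up 80% of screen-detected cases despite being only half of all incident cases. Neither effect reflects a true benefit, which is why a comparison of disease-specific mortality between randomized groups is preferred over survival comparisons.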
With a carefully designed study, randomization should result in equal distributions of confounding factors in the groups invited and not invited to several rounds of screening. Lead time bias is eliminated because disease-specific mortality in the group invited to screening is compared with disease-specific mortality in the group not invited to screening. Because the comparisons are based on disease-specific mortality rates at some interval after a common study starting point in both groups, lead time bias is avoided. Length bias sampling and selection bias are eliminated because randomization should ensure a similar distribution of individuals with underlying probabilities of developing cancer, overall health status, and with faster and slower growing tumors into each group in the study. Because the analysis is based on mortality differences in the two groups and not on the basis of the subgroups that were and were not screened, most known and unknown biases that influence mortality comparisons are minimized. Although the results of screening trials are commonly described as comparisons between "screened" and "unscreened" groups, this casual description of end results comparisons is incorrect. For this same reason, there are methodologic limitations to modeling screening benefits in cost-effectiveness analyses based on trial relative risks because they invariably underestimate the true magnitude of the benefit. Thus, whereas a randomized trial may
be the best method for evaluating the benefit of screening compared with usual care, it may not be a good method for estimating the true magnitude of the screening benefit due to noncompliance in the group invited to screening and contamination (i.e., participating in screening) in the usual care group. Contamination and noncompliance among study participants pose two fundamental challenges to trials. First, to the extent that either or both groups deviate from the intention of their original assignment, the statistical power of the trial is reduced. Second, factors that lead to noncompliance and contamination are generally not ascertainable as part of the randomization scheme. Because noncompliers may have different baseline risks, it is not statistically valid to perform subgroup analyses and compare only a screened group and an unscreened group. Cuzick et al12 have proposed a method for addressing this problem. Their method estimates the true magnitude of the treatment (or screening) effect that is not diluted by lack of compliance or contamination while respecting the original randomization and retaining approximately the same power as the intent-to-treat analysis.

CRITERIA FOR SUCCESSFUL SCREENING PROGRAMS

Ultimately the goal of cancer screening is to reduce the incidence of advanced disease and, for some conditions, to reduce the incidence of invasive disease by diagnosing and treating precursor lesions. If screening for disease can meet the criteria outlined earlier, then achievement of the full potential of a screening program depends on the degree to which the early detection program is systematized, that is, organized around a set of rules, roles, and relationships. Rules imply that screening must follow protocols related to screening intervals, quality assurance, follow-up of abnormal cases, and program monitoring.
Roles and relationships imply that a successful screening program requires a degree of compliance and cooperation between asymptomatic individuals, key health care professionals, and health care providers. If these elements are not organized into a system, or if any one of the elements is not fully contributory, then the fullest potential of screening is not realized.

Screening Guidelines

The importance of review, synthesis, and decision making related to the general principles for determining the value of cancer screening has led some leading organizations to issue guidelines, most notably the American Cancer Society (ACS), the United States Preventive Services Task Force, the American Medical Association, and the American College of Physicians. Many other organizations have issued disease-specific guidelines and endorsed the guidelines of others. Periodically, expert
groups are assembled to address clinical practice dilemmas that have been identified as a persistent source of uncertainty among health care providers and patients.45 Guidelines help patients, practitioners, and policy makers reach decisions about appropriate health care.26 Guidelines have obvious value to policy makers, the practicing clinician, and the public on the presumption that (1) an expert group has evaluated an extensive body of literature, (2) an evidence-based process was followed, and (3) the values and evidence-based criteria underlying the end results are explicit. The fact that organizations and expert groups often reach different conclusions about the worth of screening, the target population, the test, when it should begin, and how often it should take place is testament to both the complexity of the process and the degree to which different values and approaches to evidence-based medicine are factors in the process. These differences can prove to be an obstacle to cancer screening if they lead to inaction on the part of providers and the public (i.e., policy and participation are postponed until "the experts make up their minds"). For this reason, it is particularly important that the process, evidence, and rationale that form the underpinnings of a guideline are well articulated. Simply describing the process as evidence-based is overly presumptive because it implies that only one scientific conclusion could have been reached. Furthermore, values undoubtedly play a role in the selection of evidence for review as well as the degree to which different kinds of evidence form the basis for a guideline statement. Lack of evidence, sometimes expressed as no scientific evidence, may be explained by solidly designed studies that show no benefit, poorly designed studies that show no benefit or mixed benefit and thus are not a sound basis for establishing policy, or simply the absence of any study or any study with a particular design to guide policy.
Too frequently the lack of a solid body of evidence to inform screening policy is invoked as if there were solid evidence that screening was ineffective. Uncertainty is still a reasonable basis for not recommending screening, but favorable as well as unfavorable aspects of the uncertainty should be detailed because dismissive summary statements about the lack of scientific evidence do little to advance understanding. For these reasons, end users are better served with detailed descriptions of the evidence, how it was reviewed, and how it shaped the final guideline. Organizations also can benefit from these details because different organizations may reach different conclusions based on important considerations that, if reconciled in a deliberative manner, could benefit screening overall. An additional basis for differences in guidelines relates to timing. The release of new clinical practice guidelines is always a reactive process based on the accumulation of evidence that may be judged to be sufficient to warrant recommending screening or modifying existing screening guidelines. However, organizations are generally not poised to react at the first sign of new evidence; thus, guidelines may be out of sync simply due to differences between organizations in the periodicity and internal processes for considering new evidence related to cancer screening. An organization's guideline may be out of sync with new evidence for years, and although originally evidence based, it no longer may be based on the most current evidence. Individuals who puzzle over differences in guidelines should appreciate that timing, values, and criteria for how evidence is considered are all factors that explain differences in current recommendations between organizations. One value that ought to be common among all organizations, however, is the importance of quickly responding to new evidence that could improve the public's health. The National Guideline Clearinghouse is a source of full-text guideline statements for health professionals. This organization was established by the Agency for Health Care Policy and Research, the American Medical Association, and the American Association of Health Plans and can be found on the World Wide Web at http://www.guideline.gov.

Quality Assurance

Quality assurance is an all-embracing term that includes the quality control of the screening test and all other factors in the screening and follow-up process that ensure state-of-the-art outcomes. As noted previously, the success of a screening program depends greatly on attention to detail along a chain of events, from recruitment to follow-up. Of particular importance, however, is the quality control of the screening test. Poor quality contributes to less than optimal specificity and sensitivity, or, put another way, avoidable diagnostic tests and missed opportunities for early detection. For individuals who undergo screening, as well as their referring providers, poor quality also represents an institutional failure to live up to the promise implied by public health initiatives to encourage participation in screening. The insidious nature of quality control failures is that they are of little consequence to most individuals, who do not have the disease, at least on any given occasion of screening. Poor quality may influence the screening process at several key levels. Personnel may fail to administer the screening test properly, tailor it to individual requirements, or ensure that the test is accompanied by important patient data. The test (i.e., specimen, image) may be processed improperly, so that important information is not available to the interpreter. The individual who interprets the test may not be fully competent or may be overworked or simply uninterested. Finally, the manner in which results are reported to the referring physician and the patient may be confusing or misunderstood and thus may fail to prompt appropriate action. In the last decade there has been a growing appreciation among all parties for the importance of high-quality screening.
Shortcomings in the quality of cervical cancer screening were the source of numerous media exposés in the mid to late 1980s and led to new rules and regulations targeting cervical cytology in the 1988 update of the Clinical Laboratory Improvement Act. Likewise, quality shortcomings identified early in the period when mammography use was increasing led the American College of Radiology to establish the voluntary Mammography Accreditation Program, which ultimately laid the groundwork for passage of the Mammography Quality Standards Act of 1992.27 Currently, all facilities that offer mammography must be certified by the Food and Drug Administration, which requires an annual on-site inspection. There are extensive standards governing quality control procedures, personnel, and safeguards for the consumer.22 In addition, the importance of peer review of mammographic images that was established by the American College of Radiology was recognized by Congress, which required in the law that all facilities must be accredited by a nonprofit accrediting body as a requirement for certification. There is no inherent reason why screening quality must be assured by a regulatory process, but in each of these two instances quality control standards had been established but were not being followed by most providers, either because they were not fully understood or their importance was not fully appreciated. Because the public is not in a good position to scrutinize the quality of a screening provider, there is limited potential for an informed consumer to drive improvements in quality. Regulation may not always be necessary, but when providers are reluctant, uninterested, or slow to make changes, regulation may be required to hasten and ensure the implementation of standards that lead to a minimum standard of quality. As onerous as regulation in medicine may seem, the advantage of these standards and oversight is that they establish a common minimum standard of quality and provide the public with confidence that the tests they receive meet these standards.
It is critically important that quality control factors are monitored regularly against known quality control measures, and when shortcomings are identified, responsible parties must act immediately to address the source of the problem.

The Role of the Referring Provider

Studies have shown consistently that the single most important factor in whether an individual has ever had a screening test or had a recent screening test is a recommendation from the primary care provider.53 Even in the presence of ongoing public health campaigns touting the importance of screening, an individual may wait for his or her doctor to initiate the process; he, she, or they (a third-party provider) are in the most credible position to legitimize the importance of routine screening. As a gatekeeper, the provider can explain the importance of screening if the patient is uncertain, or he or she may play an important role in helping the patient reach an informed decision in the context of benefits and harms associated with some screening tests. The referring provider also can serve as a point of reminder for periodic cancer screening, something that is
more readily achieved if patients are enrolled in a reminder system. Finally, the referring provider is in the key position to help the patient understand screening results that are abnormal and to manage follow-up care. One of the most difficult barriers to screening participation is a lack of provider referral. However, attempting to educate providers to include recommendations for cancer screening in their routine encounters with patients is not as simple as it seems. The average office visit does not provide for a great deal of preventive health counseling, and other factors such as physician to patient ratios, the organization of the practice, lack of preventive health orientation and reminder tools (flowsheets, patient data systems), neglect, and physician specialty all have been identified as structural barriers to screening.53 Furthermore, the average office visit is for an acute care episode, and in some respects the situational context of this kind of visit may not be conducive to cancer screening or discussions about cancer screening. Finally, patients may see a specialist physician who takes responsibility for a specific health care problem but not for preventive health and therefore does not feel responsible for screening referrals, although he or she may be the individual's only source of health care. These factors combined lead to a situation in which screening commonly occurs opportunistically (i.e., by virtue of a combination of situational coincidences). When screening is organized, however, individuals have a greater likelihood of receiving routine screening.
Tools that have been shown to enhance screening include flowsheets, chart reminders, computerized tracking and reminder systems, and group practices.16,17,24,25,38

The Role of the Individual

In the United States, considerable resources are dedicated to educating individuals about the importance of early cancer detection based on the underlying assumption that an informed consumer is likely to participate in screening. These messages normally follow a health belief model approach, which emphasizes that individuals take action when they perceive the importance of a particular behavior for their health.23 Based on this model, health education related to screening generally communicates risk, explains the value of screening, and concludes with the recommendation to talk with one's doctor about getting tested. Because the primary care provider either conducts the test or provides the referral, this approach acknowledges not only the importance of the health care provider's basic role but also the "demand side" of this process. Because most individuals are not reminded to have regular screening examinations through a centralized surveillance system, compliance with screening often requires some initiation on the part of the individual. Even with a comprehensive tracking and monitoring system in place, individuals must comply with routine screening opportunities and, more importantly, recommendations for follow-up of abnormal tests.5 In addition to compliance, individuals play an important role in contributing to the quality of screening tests. For example, on the day a woman is scheduled to have a mammogram, she should not use deodorant or cosmetics on or near the breast. Colorectal cancer screening requires a considerable patient contribution, from dietary modification before fecal occult blood testing to bowel preparation before endoscopy. Individuals also have a responsibility to inform their physician if they have had any symptoms, because this experience could result in being scheduled for different testing. Finally, follow-up testing should be viewed as a joint responsibility between physician and patient. Currently, there is no shared standard for a division of responsibility between physician and patient regarding follow-up testing, and in many instances it is entirely up to the patient to be mindful of when to schedule follow-up examinations. Inevitably, this can lead to follow-up failures. Individuals always will have an important role in compliance with routine screening and follow-up. Some individuals have the benefit of computerized tracking systems managed by their primary care provider or health maintenance organization that notify them when they are due for screening, communicate important preexamination information, communicate results after the examination, and notify them for interim follow-up care. However, most individuals are not covered by such a system, and for this reason greater attention should be dedicated to assisting individuals to understand fully their contribution to high-quality screening.

Monitoring the Performance of Screening Programs

Once a screening program has been implemented in the community, its performance should be monitored constantly. This commonly neglected element of a screening program is vital to its success in that it affords program managers access to data to identify quickly any emerging problems with quality, to measure human performance in the program, and to communicate accomplishments to health care providers. Evaluation and monitoring programs also can serve as sentinel surveillance tools and provide a source of new observations that may improve the quality of screening for unique subpopulations. The ongoing evaluation of a screening program includes both near-term and long-term measures, each of which relates to the other. In the near term, programs should monitor the rate of compliance with invitations to screening and evaluate carefully basic parameters of screening performance, including the detection rate, staging elements, sensitivity, specificity, incidence of interval cancers, and predictive value. Over time, programs should monitor change in the rate of advanced disease and trends in the mortality rate.15 Additional performance indicators include compliance with follow-up recommendations, the duration of time involved in resolving indeterminate and abnormal screening test interpretations, and rates of rescreening. At the practice level, medical audits have been recommended as a way for the practice and interpreting clinicians
to monitor and evaluate their performance and to identify areas in which improvement is needed.34,35,51,54
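The near-term performance parameters listed above can be derived from a handful of counts that a program or practice audit would typically collect. The sketch below is one possible way to organize such an audit, not a prescribed method: the class name `AuditPeriod`, its fields, and all counts are hypothetical, and it adopts the simplifying assumption that interval cancers (cancers diagnosed clinically after a negative screen and before the next scheduled screen) are counted as the program's false negatives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditPeriod:
    """Counts for one audit period of a screening program (hypothetical)."""
    screened: int                 # individuals screened
    positive_screens: int         # screens interpreted as abnormal
    screen_detected_cancers: int  # cancers confirmed after a positive screen
    interval_cancers: int         # cancers surfacing after a negative screen

    def recall_rate(self) -> float:
        """Proportion of screens interpreted as abnormal."""
        return self.positive_screens / self.screened

    def detection_rate_per_1000(self) -> float:
        """Screen-detected cancers per 1,000 individuals screened."""
        return 1000 * self.screen_detected_cancers / self.screened

    def ppv(self) -> float:
        """Proportion of abnormal screens that led to a cancer diagnosis."""
        return self.screen_detected_cancers / self.positive_screens

    def sensitivity(self) -> float:
        """Screen-detected cancers over all cancers (assumption: interval
        cancers are treated as false negatives)."""
        total_cancers = self.screen_detected_cancers + self.interval_cancers
        return self.screen_detected_cancers / total_cancers

    def specificity(self) -> float:
        """Negative screens among individuals without cancer."""
        false_pos = self.positive_screens - self.screen_detected_cancers
        without_cancer = (
            self.screened - self.screen_detected_cancers - self.interval_cancers
        )
        return (without_cancer - false_pos) / without_cancer


# Hypothetical audit year.
audit = AuditPeriod(
    screened=10_000,
    positive_screens=500,
    screen_detected_cancers=40,
    interval_cancers=10,
)
print(f"recall rate        {audit.recall_rate():.1%}")
print(f"detection rate     {audit.detection_rate_per_1000():.1f} per 1,000")
print(f"PPV                {audit.ppv():.1%}")
print(f"sensitivity        {audit.sensitivity():.1%}")
print(f"specificity        {audit.specificity():.1%}")
```

Because interval cancers can only be counted after follow-up and linkage to a cancer registry or comparable source, the sensitivity and specificity figures in such an audit lag behind the recall rate, detection rate, and PPV, which is one reason the text distinguishes near-term from long-term measures.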

CONCLUSION

In the United States, the diverse elements necessary for successful screening programs exist in varying degrees depending on the target disease and population, but generally they are not organized and therefore do not constitute a system. Thus, discussions about screening programs pertain more to the performance parameters of programmatic elements than to programs per se. The fullest potential for screening's contribution to cancer control depends on the systemization of screening, as described previously. When screening is organized, rather than opportunistic, compliance rates are higher. When programs are centralized, there is greater opportunity for ongoing evaluation of screening performance and end results. If screening programs are linked to surveillance systems, there is potential to measure screening's contribution to disease control efforts. Although billions of health care dollars have been spent on screening and millions of dollars have been spent on the promotion of screening to the public and health care providers, our current surveillance systems do not provide us with an opportunity to track the proportion of incident cases that are screen detected. There are additional advantages to the systemization of screening, which pertain to readiness for new cancer control opportunities with early detection. Although solid evidence exists to support the value of screening for colorectal cancer and has existed for some time, screening rates among average-risk men and women are low. It is reasonable to conjecture that if screening were more systemized then the apparatus and natural intent to integrate colorectal cancer screening into routine care would have contributed to higher rates of participation than we currently see. As it stands, health care institutions, providers, and individuals face the same uphill challenge to reap the great potential offered by mass screening for colorectal cancer that was faced for mammography in the early 1980s.
Although no organization recommends mass screening for lung cancer, new approaches to the use of low-dose spiral CT to image the lung and molecular approaches to sputum cytology may reopen the opportunity for screening high-risk adults for lung cancer. If these new technologies can be shown to be efficacious, then there will be the potential to add another screening examination to the current armamentarium. Because many persons must be tested to identify the few, without a system in place to integrate screening into routine care, the arrival of new testing opportunities adds just another competing element to the demand side burden on health care today. The fullest contribution of screening to cancer control will be achieved if systems are established to ensure compliance and provide the reassurance of value through monitoring program performance and outcomes.


References 1. Alexander F, Roberts M, Huggins A: Risk factors for breast cancer with applications to selection for the prevalence screen. J Epidemiol Community Health 41:101,1987 2. American Cancer Society:American Cancer Society Facts and Figures. Atlanta, American Cancer Society, 1999 3. American Cancer Society: Guidelines for the cancer-related checkup: Recommendations and rationale. CA Cancer J Clin 30:4,1980 4. Andersson I, Janzon L: Reduced breast cancer mortality in women under age 50: Updated results from the Malmo Mammographic Screening Program. Monogr Natl Cancer Inst 22:63, 1997 5. Bassett LW, Hendrick RE, Bassford TL, et al: Quality determinants of mammography. Clinical Practice Guidelines, No. 13, Publication No. 95-00632. Rockville, Agency for Health Care Policy and Research Public Health Service, US Department of Health and Human Services, 1994 6. Bjurstam N, Bjorneld L, Duffy SW, et al: The Gothenburg breast screening trial: First results on mortality, incidence, and mode of detection for women ages 39-49 years of randomization [see comments]. Cancer 80:2091,1997 7. Brinton LA, Bernstein L, Colditz GA: Summary of the workshop: Workshop on physical activity and breast cancer, November 13-14,1997. Cancer 83:595,1998 8. Burke W, Daly M, Garber J, et al: Recommendations for follow-up care of individuals with an inherited predisposition to cancer. 11. BRCAl and BRCA2. Cancer Genetics Studies Consortium [see comments]. JAMA 277997,1997 9. Byers T, Levin B, Rothenberger D, et al: American Cancer Society guidelines for screening and surveillance for early detection of colorectal polyps and cancer: Update 1997. CA Cancer J Clin 47154,1997 10. Cole P, Morrison AS: Basic issues in cancer screening. In Miller AB (ed): Screening in Cancer. Geneva, International Union Against Cancer, 1978, p 7 11. Curpen B, Sickles E, Sollitto R: The comparative value of mammographic screening for women 40-49 years old versus women 50-64 years old. 
American Journal of Radiology 164:1099,1995 12. Cuzick J, Edwards R, Segnan N: Adjusting for non-compliance and contamination in randomized clinical trials. Stat Med 16:1017, 1997 13. Day NE: Quantitative approaches to the evaluation of screening programs. World J Surg 13:3, 1989 14. Day NE, Walter SD: Simplified models for screening: Estimation procedures from mass screening programmes. Biometrics 40:1, 1984 15. Day NE, Williams DR, Khaw KT: Breast cancer screening programmes: The development of a monitoring and evaluation system. Br J Cancer 59:954,1989 16. Dietrich JJ, Woodruff CB, Carney PA: Changing office routines to enhance preventive care: The preventive GAPS approach. Arch Fam Med 3:176,1994 17. Dietrich JJ, O'Conner GT, Keller A, et al: Cancer: Improving early detection and prevention. A community practice randomised trial. BMJ 304:91, 1992 18. Duffy SW, Chen HH, Tabar L, et al: Estimation of mean sojourn time in breast cancer screening using a Markov chain model of both entry to and exit from the preclinical detectable phase. Stat Med 14:1531, 1995 19. Duffy SW, Chen HH, Tabar L, et al: Sojourn time, sensitivity and positive predictive value of mammography screening for breast cancer in women aged 40-49. Int J Epidemi01 25:1139, 1996 20. Eddy D: ACS report on the cancer-related health checkup. CA Cancer J Clin 30:193,1980 21. Eifel PJ, Berek JS, Thigpen JT: Cancer of the cervix, vagina, and vulva. In Devita VT, Hellman S, Rosenberg SA (eds): Cancer: Principles and Practice of Oncology, ed 5. Philadelphia, Lippincott-Raven, 1997, pp 1433-1478 22. Food and Drug Administration: Quality mammography standards; final rule. Federal Register (ed 208). Rockville, Department of Health and Human Services, 1997, p 55851 23. Fulton JP, Buechner JS, Scott HD, et al: Predictors of breast cancer screening among women ages 40 and older: A study guided by the health belief model. Public Health Rep 106:410, 1991

608

SMITH

24. Gann P, Melville SK, Luckmann R: Characteristics of primary care office systems as predictors of mammography utilization. Ann Intern Med 118:893, 1993
25. Garr DR, Ornstein SM, Jenkins RG, et al: The effect of routine use of computer-generated preventive reminders in a clinical practice. Am J Prev Med 9:55, 1993
26. Gaus CR: Guideline development and use. In Bassett LW, Hendrick RE, Bassford TL, et al (eds): Quality Determinants of Mammography. Clinical Practice Guidelines, No. 13, Publication No. 95-00632. Rockville, MD, Agency for Health Care Policy and Research, Public Health Service, US Department of Health and Human Services, 1994
27. Hendrick RE, Smith RA, Wilcox PA: ACR accreditation and legislative issues in mammography. In Haus AG, Yaffee MJ (eds): Syllabus: A Categorical Course in Physics: Technical Aspects of Breast Imaging. Chicago, RSNA, 1993, pp 137
28. Henschke CI, Investigators E: Early lung cancer action project: Overall design of baseline screening. Cancer, in press
29. Horton JA, Cruess DF, Romans MC: Compliance with mammography screening guidelines: 1995 mammography attitudes and usage study report. Womens Health Issues 6:239, 1996
30. Horton JA, Romans MC, Cruess DF: Mammography attitudes and usage study, 1992. Womens Health Issues 2:180, 1992
31. Huang Z, Hankinson SE, Colditz GA, et al: Dual effects of weight and weight gain on breast cancer risk [see comments]. JAMA 278:1407, 1997
32. Institute of Medicine Committee for the Study of the Future of Public Health: The Future of Public Health. Washington, DC, National Academy Press, 1988
33. Koss LG: The Papanicolaou test for cervical cancer detection: A triumph and a tragedy [see comments]. JAMA 261:737, 1989
34. Linver M: Meet high expectations for mammography with high-quality services. Diagnostic Imaging 20:89, 1998
35. Linver MN, Rosenberg RD, Smith RA: Mammography outcomes analysis: Potential panacea or Pandora's box? [comment]. AJR Am J Roentgenol 167:373, 1996
36. Mandel JS, Bond JH, Church TR, et al: Reducing mortality from colorectal cancer by screening for fecal occult blood. Minnesota Colon Cancer Control Study. N Engl J Med 328:1365, 1993 [Erratum appears in N Engl J Med 329:672, 1993]
37. Marteau TM: Toward an understanding of the psychological consequences of screening. In Croyle RT (ed): Psychosocial Effects of Screening for Disease Prevention and Detection. New York, Oxford University Press, 1995, p 1433
38. McPhee SJ, Bird JA, Jenkins CN, et al: Promoting cancer screening: A randomized, controlled trial of three interventions. Arch Intern Med 149:1866, 1989
39. Miller AB: Fundamental issues in screening for cancer. In Schottenfeld D, Fraumeni JF (eds): Cancer Epidemiology and Prevention, ed 2. New York, Oxford University Press, 1996
40. Miller AB: Fundamentals of screening. In Screening for Cancer. Orlando, Academic Press, 1985, p 3
41. Trends in Cancer Screening: United States 1987 and 1992. MMWR Morb Mortal Wkly Rep 45:57, 1996
42. Morrison A: Screening in Chronic Disease. New York, Oxford University Press, 1992
43. Mulshine JL, DeLuca LM, Dedrick RL, et al: Considerations in developing successful population-based molecular screening and prevention of lung cancer. Cancer, in press
44. National Cancer Institute Breast Cancer Screening Consortium: Screening mammography: A missed clinical opportunity? Results of the NCI Breast Cancer Screening Consortium and National Health Interview Survey studies. JAMA 264:54, 1990
45. National Institutes of Health Consensus Development Panel: National Institutes of Health Consensus Development Conference Statement: Breast cancer screening for women ages 40-49, January 21-23, 1997. National Institutes of Health Consensus Development Panel. J Natl Cancer Inst 89:1015, 1997
46. Organizing Committee and Collaborators: Breast cancer screening with mammography in women aged 40-49 years. Report of the Organizing Committee and Collaborators, Falun, Sweden. Int J Cancer 68:693, 1996
47. Public Law 102-539: The Mammography Quality Standards Act of 1992. Washington, DC, 1992


48. Ries L, Kosary C, Hankey B, et al: SEER Cancer Statistics Review, 1973-1995. Bethesda, MD, National Cancer Institute, 1998
49. Shingleton HM, Patrick RL, Johnston WW, et al: The current status of the Papanicolaou smear. CA Cancer J Clin 45:305, 1995
50. Sickles EA: Breast cancer screening outcomes in women ages 40-49: Clinical experience with service screening using modern mammography. Monogr Natl Cancer Inst 22:99, 1997
51. Sickles EA: Quality assurance: How to audit your own mammography practice. Radiol Clin North Am 30:265, 1992
52. Smith RA: Screening fundamentals. Monogr Natl Cancer Inst 22:15, 1997
53. Smith RA, Haynes S: Barriers to screening for breast cancer. Cancer 69:1968, 1992
54. Smith RA, Osuch JR, Linver MN: A national breast cancer database. Radiol Clin North Am 33:1247, 1995
55. Smith-Warner SA, Giovannucci E: Fruit and vegetable intake and cancer. In Heber D, Blackburn GL, Go VLW (eds): Nutritional Oncology. San Diego, Academic Press, 1999
56. Solin L, Schwartz G, Feig S, et al: Risk factors as criteria for inclusion in breast cancer screening programs. In Ames F, Blumenschein G, Montague E (eds): Current Controversies in Breast Cancer. Austin, University of Texas Press, 1984
57. Tabar L, Chen HH, Fagerberg G, et al: Recent results from the Swedish Two-County Trial: The effects of age, histologic type, and mode of detection on the efficacy of breast cancer screening. Monogr Natl Cancer Inst 22:43, 1997
58. Tabar L, Fagerberg G, Chen HH, et al: Efficacy of breast cancer screening by age: New results from the Swedish Two-County Trial. Cancer 75:2507, 1995
59. US Preventive Services Task Force: Guide to Clinical Preventive Services, ed 2. Baltimore, Williams & Wilkins, 1996
60. Wilson JMG, Jungner G: Principles and Practice of Screening for Disease. Geneva, World Health Organization, 1968
61. Winawer SJ, Fletcher RH, Miller L: Colorectal cancer screening: Clinical guidelines and rationale. Gastroenterology 112:594, 1997

Address reprint requests to Robert A. Smith, PhD, American Cancer Society, 1599 Clifton Road, NE, Atlanta, GA 30329; e-mail: [email protected]