Screening for Lung Cancer*

Screening for Lung Cancer*

Screening for Lung Cancer* A Review of the Current Literature Peter B. Bach, MD; Michael J. Kelley, MD; Ramsey C. Tate, BS; and Douglas C. McCrory, MD...

108KB Sizes 4 Downloads 132 Views

Screening for Lung Cancer* A Review of the Current Literature Peter B. Bach, MD; Michael J. Kelley, MD; Ramsey C. Tate, BS; and Douglas C. McCrory, MD, MHS

Study objectives: To review the available data on the early detection of lung cancer, with a focus on three technologies: chest x-ray (CXR), sputum cytology, and low-dose CT (LDCT) scanning. Design, setting, participants: Review of published clinical studies of early detection technologies. The best available evidence on each topic was selected for analysis. Randomized trials were used to evaluate CXR and sputum cytology. Cohort studies, as well as studies providing evidence regarding rates of overdiagnosis and efficacy of initial treatment, were considered in evaluation of LDCT. Study design and results were summarized in evidence tables. Statistical analyses of combined data were not performed. Measurement and results: Five randomized trials of CXR with or without sputum cytology have been conducted, each which reports disease-specific mortality as well as other end points. None of these studies provide support for the use of either CXR or sputum cytology for the early detection of lung cancer in asymptomatic individuals. Eight completed and ongoing trials of LDCT were identified. All of these studies report the frequency and stage distribution of lung cancers found during initial (“prevalence”) screening, and several studies also report rates of detection at the time of annual follow-up. No outcome data on survival or treatment are available. A number of studies support the hypothesis of “overdiagnosis”—that some lung cancers detected by LDCT may behave in an indolent manner. Conclusions: The use of either CXR or sputum cytology for the early detection of lung cancer is not supported by the published evidence. The evidence for LDCT appears promising, in that the technology typically identifies lung cancer at an early stage, although corollary studies suggest that these findings in isolation may be misleading. Further high-quality research is needed to better define the role of LDCT in the evaluation of asymptomatic high-risk individuals. (CHEST 2003; 123:72S– 82S) Key words: evidence-based medicine; lung neoplasms; mass chest x-ray; mass screening; review; sputum; tomography, x-ray computed Abbreviations: CXR ⫽ chest x-ray; ELCAP ⫽ Early Lung Cancer Action Project; LDCT ⫽ low-dose CT; MSKCC ⫽ Memorial Sloan-Kettering Cancer Center; NCI ⫽ National Cancer Institute; RCT ⫽ randomized controlled trial

cancer accounts for approximately 6% of all L ung deaths in the United States each year, and is the leading cause of cancer death for both men and women.1 At present, most patients who receive an initial diagnosis of lung cancer have advanced stage disease, making cure with currently available therapies unlikely. In contrast, individuals with early stage disease can achieve cure through surgical resection. Because of this dichotomy in outcome associated with stage at *From the Health Outcomes Research Group, Department of Epidemiology and Biostatistics (Dr. Bach and Ms. Tate), and the Department of Medicine, Memorial Sloan-Kettering Cancer Center, New York, NY; the Department of Medicine (Dr. Kelley), and the Center for Clinical Health Policy Research (Dr. McCrory), Duke University, Durham, NC; the Durham Veterans Affairs Medical Center, Durham, NC. Correspondence to: Peter B. Bach, MD, MAPP, Health Outcomes Research Group, Memorial Sloan-Kettering Cancer Center, 1275 York Ave, Box 221, New York, NY 10021; e-mail: [email protected] 72S

diagnosis, there has been persistent interest in designing and testing methods for early detection of lung cancer. To place these studies in context, we articulate the critical elements of successful screening tests and describe the research methods that can be used to evaluate them. We then focus on lung cancer screening studies with chest x-ray (CXR), sputum cytology, and low-dose CT (LDCT) scanning. For the first two modalities, we review published randomized controlled trials (RCTs). For LDCT, we review published and ongoing studies, discuss the evidence that has fueled the recent debate over efficacy, and discuss future plans of assessment. Elements of a Beneficial Screening Test The benefit of a screening test is typically assessed in two domains. First, the screening test must proLung Cancer Guidelines

vide benefit to individuals who have the illness, typically through increasing life expectancy. In order to prolong life expectancy, the test must be capable of detecting the disease at a point in its natural history at which the course can be altered through treatment, typically earlier than is encountered in sporadic practice. Patients survive longer after diagnosis of early stage lung cancer than they do after diagnosis of more advanced stage lung cancer; therefore, it is widely believed that early detection followed by definitive treatment should alter the natural history of the disease and thereby reduce lung cancer mortality. Second, the screening test should not be dangerous or painful, nor should it have numerous falsepositive results that cause anxiety or necessitate invasive or dangerous follow-up tests. From a societal perspective, the screening test should also not be harmful to the large fraction of the population who do not have the disease, either by consuming vast amounts of resources or by directly affecting the capacity of the health-care system to provide for others.

Research Methods for Evaluating Screening Tests Three general methods have been advocated for the evaluation of screening tests: randomized trials of screening, population-based studies of screening, and observational studies of screening in select cohorts. Traditionally, the scientific community ranks RCTs at the top of this hierarchy of study designs. In a randomized trial, individuals are subjected to different intensities of screening, and the outcome that is observed is diseasespecific mortality. The efficacy of fecal-occult blood testing for the detection of colon cancer was established in this manner.2– 4 As another example, the Mayo Lung Project randomized individuals to receive regularly scheduled screening (intervention arm) or routine care; the primary outcome of interest was lung cancer mortality.5–7 An alternative highly indicative design is the populationbased study. In studies of this design, the impact of a broadly implemented screening program is assessed through changes in disease-specific mortality rates within the population. The efficacy of cervical cancer screening was established through multiple studies that documented large declines in the rates of cervical cancer mortality in the 1960s and 1970s in Sweden, Finland, and a number of other countries.8,9 In the third design— observational studies of screening in select cohorts—the efficacy of the screening test is inferred from the frequency with www.chestjournal.org

which it can detect early stage cancers. The subsequent impact on disease-specific mortality is then ascertained from historical data on the outcomes of individuals with early stage disease, or through the documentation of survival for the individuals enrolled in the study. Readers should be aware that all of these study designs have potential weaknesses. Negative results of randomized trials may be viewed with skepticism either because of defects in design (such as inadequate power), or defects in study conduct (such as having excessive crossover between arms). For example, despite seven separate RCTs, the efficacy of mammography for the prevention of death from breast cancer remains uncertain. Some investigators believe that significant design flaws in the majority of mammography studies limit the ability of those studies to measure the impact of mammography. These investigators have pointed out that the best designed studies show no discernible impact on overall mortality,10,11 an assertion that has led an independent panel of US National Cancer Institute (NCI) advisers to revise their information summaries about screening with mammography.12,13 Similarly, population-based studies can be confounded by numerous factors, including the implementation of a screening program itself. For example, there is a substantial amount of evidence suggesting that the act of screening for prostate cancer with the prostatespecific antigen test may artifactually raise incidence and mortality rates from that disease.14,15 A study from Black et al16 suggested that screening interventions in RCTs may also influence the ascertainment of cause of death. Observational studies are subject to particular biases, including those resulting from “lead time” and “length time,” and those resulting from the recruitment and selection of volunteer cohorts. Sputum Cytology and CXR Sputum cytology and CXR have each been assessed in RCTs. The first study was begun in 1960, in which 55,034 London men were randomized to CXR every 6 months for 3 years, or to CXR at the beginning and the end of the 3-year period.17,18 Follow-up exceeded 99% in both study arms. In this study, a total of 36 additional cases of lung cancer were identified in the frequently screened group (132 cases) than in the group screened only at the study inception and study completion (96 cases). Of the cancers detected during the 3-year study period, 44% were resected in the screened group, while only 29% were resected in the control group. During the 3-year study period, lung cancer mortality was similar: 62 individuals in the frequently screened group died CHEST / 123 / 1 / JANUARY, 2003 SUPPLEMENT

73S

of lung cancer, compared to 59 individuals in the less frequently screened group. This study, as well as several observational studies, did not support the hypothesis that CXR screening would be associated with a mortality benefit, and widespread screening was not pursued.19,20 Improvements in CXR and sputum cytology technologies led the NCI to readdress this question in the 1970s. The NCI funded the Cooperative Early Lung Cancer Group,21 which conducted three randomized studies of sputum cytologic examination and CXR. During the same time period, a randomized study of CXR and sputum cytology was conducted in Czechoslovakia.22 Two of the NCI-funded RCTs (the Johns Hopkins Lung Project and the Memorial Sloan-Kettering Cancer Center [MSKCC] Lung Cancer Screening Program) randomized subjects to either CXR alone or CXR plus sputum cytology. In neither study were there differences in the number of cancers detected, the fraction of cancers that were “resectable,” or the lung cancer mortality rate (Table 1).23–25 These studies were primarily designed to assess the incremental impact of sputum cytology in a CXRscreened cohort. Thus, the absence of a mortality benefit in these studies may have been due to the insensitivity of sputum cytology as it was implemented. The NCI-funded Mayo Lung Project focused on the combined impact of CXR and sputum cytology for screening.5–7 This study recruited male smokers ⬎ 45 years old between 1971 and 1976 who had at least one pack-per-day cigarette use, an estimated survival of at least 5 years, no evidence of lung cancer on an initial evaluation, and sufficient lung function to tolerate lobectomy. Among the 10,933 subjects initially recruited, 91 prevalent cases of lung cancer were detected on screening studies (0.83%), 51 cases by CXR, 17 cases by sputum cytology alone, and 15 cases by a combination of both tests. Subsequently, individual subjects with negative and satisfactory prevalence screens were randomized to one of two groups. The “screening” group had CXRs and sputum cytology every 4 months for 6 years; the “control” group received a recommendation at the start of the study to receive a yearly CXR and sputum cytologic examination (that being the standard Mayo Clinic recommendation at that time). Subjects were then followed up from 1 to 5.5 years (median, 3 years). Compliance in the screened group was 85% during the first year and fell to 75% by the end of the study. More than one half of the control group also underwent CXR during the study period. Lung cancer was detected more often in the 74S

screened group (206 cases) than in the control group (160 cases; Table 1). Forty-eight percent of the cancers detected in the screened group were localized, and thus amenable to surgical resection with curative intent, while only 32% of the cancers detected in the control group were localized, suggesting that screening may increase the cure rate. However, at the first analysis, there were no substantive differences in lung cancer-specific mortality rates: 3.2 per 1,000 person-years in the screened group and 3.0 per 1,000 person-years in the control group. All-cause mortality was also similar (Table 1). The investigators of the Mayo Lung Project therefore concluded that the combination of CXR and sputum cytology performed every 4 months does not decrease lung cancer mortality compared to the recommendation of annual screening.6 An analysis of the same cohort performed 15 years after study completion yielded a similar finding: no substantive difference in lung cancer mortality rate between the screened group (4.4 per 1,000 person-years) and the control group (3.9 per 1,000 person-years).6,26 The contemporaneous study conducted in Czechoslovakia randomly assigned individuals to CXR and sputum cytology every 6 months or to initial screening followed by screening again after 3 years had elapsed. Subjects in both arms were then screened annually for an additional 3 years. The results of this study at 3 years were similar to the Mayo study; a greater number of cancers had been detected in the intervention group than the control group (39 cases vs 27 cases), including a greater number of cases with early stage disease (20 cases vs 10 cases; Table 1). However, there was effectively no difference between the number of individuals in the screened group and the control group with late-stage cancer (19 patients vs 17 patients), nor was there a difference in the number of patients dying of lung cancer.22,27

Conclusions: CXR and Sputum Cytology The results of the five RCTs suggest that neither CXR nor sputum cytology satisfy the primary criterion of a beneficial screening test. Neither test appears to prolong the life expectancy of an individual with the disease. Whether either test fits the second criterion—that the test is not particularly painful or harmful—was not addressed in sufficient detail in any of these studies to determine. Readers should appreciate that despite the consistency of the findings in these studies, there is still skepticism about their interpretation. One observaLung Cancer Guidelines

www.chestjournal.org

CHEST / 123 / 1 / JANUARY, 2003 SUPPLEMENT

75S

Control

Control All Intervention

All Intervention

Control

All Intervention

Control

All Intervention Control All Intervention

Study Arm

CXR annually and sputum cytology every 4 mo CXR annually

CXR annually and sputum cytology every 4 mo CXR annually

CXR and sputum cytology every 6 mo for 3 yr CXR and sputum cytology at end of 3 yr

CXR and sputum cytology every 4 mo for 6 yr Advised to have CXR and sputum cytology annually

CXR every 6 mo for 3 yr CXR at end of 3 yr

Intervention

5,161

5,072 10,386 5,226

10,040 4,968

3,174†

6,364 3,172†

4,593†

55,034 29,723 25,311 10,933 4,618†

Sample Size

40

23 79 39

53 30

NA

18 NA

NA

51 31 20 91 NA

Cancers, No.

7.8

4.5 7.6 7.5

5.3 6.04

NA

2.8 NA

NA

0.9 1.0 0.8 8.3 NA

Per 1,000 Persons

202

121 396§ 194

235 114

27

66 39

160

177 101 76 366 206

Cancers, No.

NR

3.8 NR NR

NR 3.7

NR

NR NR

4.3

NR NR NR NR 5.5

Per 1,000 Person-Years

Repeat Screening

*NA ⫽ not available; NR ⫽ not reported. †Randomization occurred subsequent to baseline screening; therefore, the sample sizes of the study arms do not sum to the number of total enrollees. ‡Randomization occurred prior to baseline screening; therefore, the total number of deaths may include individuals found to have lung cancer during the baseline screen. §Includes 379 cancer detected during screening period and 17 cancers detected after the end of screening.

Johns Hopkins 1973–198221,23

MSKCC 1974–198221,25

Czechoslovakia 1976–198022,27

Mayo 1971–19835–7

London 1960–196417,18

Study Site

Baseline Screen

Table 1—RCTs of CXR With or Without Sputum Cytology*

3.8‡

3.4‡

2.7‡

NR 2.7‡

2.6

NR 3.6

3.0

2.2 2.1 2.4 NR 3.2

Per 1,000 Person-Years

Lung Cancer Mortality

tion, initially provided by the International Union Against Cancer,28 is that many of these studies may be invalid because they did not include a “noscreening” study arm and thus no determination of true efficacy could be made. Other authors have argued that the sample size of these studies was inadequate. Both the Mayo Lung Project and the Czechoslovakian studies were powered to detect a 50% reduction in lung cancer mortality in the screened group compared with the control group.27 In contrast, the power to detect a smaller but clinically meaningful effect, such as a 10% reduction in mortality, was much lower (only 0.21 and 0.16, respectively).7,27,29 –31 Adding to the controversy, case-control studies have suggested that there might still be a benefit from CXR screening.32 The ongoing NCI-sponsored Prostate, Lung, Colorectal, and Ovarian Trial is a randomized trial designed to further determine the role of CXR screening in a low-risk population of both genders, but the results from this study have not yet been made available.33,34

LDCT Scanning LDCT scanning is a technique that allows a low-resolution image of the entire thorax to be obtained in a single breath-holding with low radiation exposure. There is enormous enthusiasm for LDCT as a screening test. To date, LDCT has only been evaluated in observational studies of volunteer cohorts (Table 2).

Studies of LDCT Screening A prevalence screen using a single-spiral CT scan in 5,483 persons aged 40 to 74 years was performed in 1996 in Matsumoto, Japan.35 Most of the subjects had undergone yearly CXRs and sputum cytologic screening, and 3,967 subjects underwent CXR concurrently with spiral CT scanning. Smokers also underwent sputum cytology. Among subjects who had abnormalities prompting further evaluation were 59 subjects (1%) with noncancerous but suspicious lung lesions, 84 subjects (2%) with lesions suspicious for lung cancer, and 80 subjects (2%) with indeterminate small lung nodules. Nineteen lung cancers were diagnosed (8.5% of subjects with abnormal findings prompting further evaluation). The initial radiographic appearance of these lesions was suspicious for lung cancer in 14 cases, benign but suspicious in 3 cases, and indeterminate in 2 cases. Eighteen of the 19 cases were surgically confirmed, and 1 case was clinically diagnosed. Pathologic stag76S

ing in 16 of 19 tumors (84%) was stage I, while the remainder (3 of 19 tumors) were stage IV. Four tumors were ⱕ 1 cm, and 14 tumors were between 1 cm and 2 cm. Only 1 of the 17 tumors that were ⱕ 2 cm was seen on CXR. One lung cancer that was not seen on CT scan was diagnosed by sputum cytology. Overall, the rate of lung cancer diagnosis was 0.48%, which was significantly higher than the rate of 0.03 to 0.05% in the same area prior to spiral CT screening. A historical comparison of two screening strategies performed in Japan within the Anti-Lung Cancer Association was updated at the annual meeting of American Society of Clinical Oncology in May 1999.36,37 The report compared the outcome of screening with CXR and sputum cytology done from September 1975 to August 1993 (a total of approximately 26,000 screenings) to the same screening strategy plus spiral CT scan done from September 1993 to December 1998. Forty-three patients with primary lung cancer were found prior to CT scanning, compared to 36 patients with primary lung cancer in the CT-scan period. In addition, the percentage of stage IA tumors increased from 42 to 81%, and the 5-year survival improved from 48 to 82%. The results of this nonrandomized, historical comparison suggest that screening with CT scans can increase the ability to diagnose lung cancers at earlier stages but does not constitute strong evidence of a mortality benefit due to the potential for lead time and other biases. The Early Lung Cancer Action Project (ELCAP) performed yearly low-dose spiral CT scans and CXR in a single-arm study of 1,000 smokers in New York City.38 One to 6 noncalcified lung nodules were found at baseline (“prevalence”) in 233 participants; 27 nodules (2.7%) were malignant. Among these 27 malignant nodules, 20 nodules (74%) were not found on standard CXR; in contrast, no malignant nodules were detected by CXR that were not also seen on CT scanning. The majority of these malignancies (23 of 27; 85%) were stage I, and all but one nodule (97%) were resectable. In the second year of screening (“incidence”), seven cancers were detected, and six of these were stage I. For relevance, in the United States, only approximately 20% of sporadically diagnosed lung cancer is stage I.39 The findings of these and related studies are summarized in Table 3. All of the studies had a high rate of false-positive results and a preponderance of stage I lesions among found cancers. Conclusions: LDCT It is difficult to determine whether LDCT meets either of the two criteria of a beneficial screening Lung Cancer Guidelines

www.chestjournal.org

CHEST / 123 / 1 / JANUARY, 2003 SUPPLEMENT

77S

5,483 8,546

2001

2001

919

2001 1,611

493

2001

2002

1,520

1,000

1999, 2001

2002

614

2001

NR

676 (12)

186 (11.5)

NR

NR

782 (51)

233 (23)

NR

35 (0.41)

22 (0.40)

14 (0.87)

17 (1.85)

11 (2.23)

22 (1.4)

27 (2.70)

4 (0.65)

Malignancies Detected, No. (% Total)

97

100

77

76

18

59

81

100

Detected Malignancies Stage I, %

7,434

8,303

7,891

NR

91

1,464

1,184

NR

Patients Screened, No.

*NR ⫽ not reported. †Unpublished proceedings of the 4th International Conference on Screening for Lung Cancer; February 23–25, 2001; New York, NY.

Mayo Clinic, United States†42 Moffitt Cancer Center, United States† Muenster University, Germany† National Cancer Center Hospital, Japan43 Shinshu University, Japan44 Hitachi Health Care Center, Japan†

Hadassah University, Israel40 Cornell University, United States38,40,41

Study Site

Patients Screened, No.

Date of Publication

Abnormal Results, No. (% of Total)

Baseline Screen

Table 2—LDCT Screening Studies*

NR

518 (6)

721 (9.1)

NR

NR

191 (13)

63 (5)

NR

Newly Identified Abnormal Results, No. (% of Total)

7 (0.09)

34 (0.41)

22 (0.28)

2 (NR)

0 (0)

3 (0.20)

7 (0.59)

NR

Malignancies Detected, No. (% Total)

Annual Repeat Screening

100

86

82

100

0

0

85

NR

Detected Malignancies Stage I, %

Table 3—Characteristics of Incident Cancers Detected in RCTs of CXR and Sputum Cytology Incident Cancers Study Site Mass Radiography Service, London17,18 Mayo Clinic, United States5–7

Research Institute of Tuberculosis and Respiratory Diseases, Czechoslovakia22,27 Memorial Sloan-Kettering Cancer Center, United States21,25 Johns Hopkins, United States21,23

Intervention

Early Stage

Late Stage

Screen Detected

CXR every 6 mo for 3 yr CXR at end of 3 yr CXR and sputum cytology every 4 mo for 6 yr Advised to have CXR and sputum cytology annually CXR and sputum cytology every 6 mo for 3 yr CXR and sputum cytology at end of 3 yr CXR annually and sputum cytology every 4 mo CXR annually CXR annually and sputum cytology every 4 mo㛳 CXR annually㛳

44* 22* 99

57* 54* 107

65 17 90

36 59 116†

51

109

0

160‡

20

19

31

8

10

17

8

19§

50

64

70

44

58 139¶

63 240

65 186

56 193

Interval Detected

*Excluding cancers detected by the initial screening. †Includes 73 cases detected by symptoms and 43 cases detected by screening outside of study protocol. ‡Includes 112 cases detected by symptoms and 48 cases detected by screening outside of study protocol. §Includes 15 cases detected by symptoms, 1 case detected by screening outside of study protocol, and 3 cases detected by autopsy. 㛳Not reported by study arm. ¶Calculated value based on reported proportions among screen-detected and interval cases.

test. Some have argued that the data from the observational studies are so compelling that we can conclude that screening with LDCT will be associated with a reduction in lung cancer mortality (the first criterion).40,41,45 Others have argued that the findings in the observational studies may reflect the impact of biases, rather than a true effect.30,46 –50 Whether LDCT meets the second criterion—that it is not harmful or painful—is also the subject of a contentious debate. Those arguing that LDCT will reduce lung cancer mortality point to very consistent findings in observational studies: LDCT detects far more lung cancers than CXR, and the vast majority of those detected are early stage. However, those arguing that the finding of early stage disease does not necessarily equate to an improvement in life expectancy raise two concerns. LDCT may be detecting nodules of histologic cancer that are often indolent in their behavior (ie, “overdiagnosis”), so treating these individuals cannot increase their life expectancy. Also, there are questions about whether surgical treatment of screen-detected cancers alters the natural history of the disease. Critics addressing the second criterion of a beneficial screening test have also pointed out that the aggregate harm to screened individuals (and to society) may exceed the benefit conferred to those who have lung cancer detected. In this case, these harms would be due to the high costs of screening and follow-up, and the morbidity and 78S

anxiety associated with false-positive results. We review the evidence substantiating each of these claims below.

Evidence Supporting the Claim That LDCT May Be Associated With Overdiagnosis Much of the work on determining whether LDCT may overdiagnose lung cancer has focused on establishing whether indolent lung cancer exists. Some evidence from the randomized trials of CXR and sputum cytology has been advanced in support of the hypothesis that indolent lung cancer can be detected by a screening intervention. To understand this argument, consider that in a randomized trial of screening, if all screen-detected cancers grow, spread, and cause disease, one would expect that an equal number of cancers would be detected in the screened and unscreened groups after a sufficiently long period of follow-up. In the screened group, cancers would be found at earlier, presymptomatic stages while cancers in the nonscreened group would more typically be found at a later, symptomatic stage (stage shift). In contrast, if screening led to the identification of cancers that did not grow, spread, or cause death, these cancers would be detected in the screened group, but would never manifest with symptoms and therefore would be missed in the nonscreened group. The detection of small lesions Lung Cancer Guidelines

that do not grow, spread, or cause death has been referred to as overdiagnosis. It is important to note that a higher percentage of early stage tumors is also found in the overdiagnosis scenario, so that the detection of a higher percentage of early stage tumors in the screened group cannot by itself distinguish stage shift from overdiagnosis. In both the Mayo and Czechoslovakian randomized trials of CXR, substantially more cancers were detected in the screened than the unscreened group. On extended follow-up, the excess cancers detected by screening appear to have been overdiagnosed. Marcus et al26 reported on follow-up of all cases diagnosed during the enrollment and study period in the Mayo study (206 in the screened group and 160 in the control group). Nearly all of the 20% excess cancers detected in the screened group were early stage cancers. Failure to detect an equivalent number of early stage cancers in the control group was without apparent ill effect, because the control group experienced no excess number of lung cancer deaths, even when follow-up extended beyond 17 years for all participants.26 Kubik et al27 presented long-term follow-up data for the screened and control groups from the Czechoslovakian study. After the third year of this study, both the screened and control groups were entered into a program of annual screening. Despite the fact that substantially more lung cancers were detected in the screened group than in the control group during the first years of the study (year 1 to year 3), the rate of lung cancer detection in the two groups in the latter years (year 4 to year 6), while they were being screened in a similar fashion, was the same. In other words, greater detection of lung cancer in the screened group early in the study did not lead to a compensatory fall in detection rates later on, suggesting that screening was detecting excess lesions that would not have progressed to more advanced disease.27 A number of investigators have also looked to data from autopsy series for evidence of indolent lung cancers that could be overdiagnosed by a sensitive screening test. These investigators argue that if indolent lung cancer exists, then numerous people who have died of other causes should have small nodules of lung cancer found during their autopsies. A study based at the Yale-New Haven hospital documented that of all patients who at autopsy had primary lung cancers, 28% were not diagnosed during their lifetime. The overall prevalence of undiagnosed lung cancer among all autopsies was 0.09%.52 A study from Duke University recently suggested that some lung cancers may not be detected at postmortem examination. Only 19 of 28 nodules (68%) seen on CT scanning within 2 months prior to death were found at autopsy.52 These findings are www.chestjournal.org

provocative but somewhat hard to interpret. Diagnosed lung cancer is relatively rare: the incidence in 50- to 55-year-olds is only 82 per 100,000 (0.08%), and only rises to about 340 per 100,000 (0.34%) in 75- to 79-year-olds (Surveillance, Epidemiology, and End Results Program, 1973 to 1998 Public Use Data).1 Also, nodules seen by CT scan that were undetected at autopsy may not have been primary lung cancers. As a result, it is difficult to know what rate of discovery would constitute strong evidence for or against the hypothesis that indolent lung cancer is a substantial concern in screening.53 If screen-detected cancers are found at rates in excess of those expected based on population incidence statistics, or are discovered in patterns inconsistent with well established risk factors, this would suggest that some of the screen detected cancers are “atypical” (or overdiagnosed). Evidence that LDCT is detecting atypical lung cancer can be seen in two studies. In a large Japanese study of mobile CT scanning, the rate of detection of lung cancer was equivalent in smokers and nonsmokers, strongly contravening the epidemiologic data on sporadically diagnosed cancer.44 In the New York-based ELCAP study, the number of cancers detected in the annual follow-up LDCT is far less than that detected during the initial scan, even though the size of the detected lesions is similar. This difference in rates of detection is inconsistent with what one would expect if all of the lesions grew at similar rates (ie, all were aggressive malignancies).40,41 Instead, that the size of the prevalent lesions detected at initial scan are consistent with those found at subsequent scans but the rates of detection are substantially higher suggests that a meaningful proportion of cancers detected during the initial scan would have behaved in an indolent manner.

Evidence Supporting a Limited Impact of Surgical Resection on Outcome of Screen-Detected Disease The intervention universally advocated for early stage lung cancer is surgical resection.54 Studies in which survival is assessed within a standardized population of individuals with early stage disease strongly support the efficacy of this intervention.55,56 However, there are no RCTs in which surgical resection has been compared to no treatment, and such a trial will probably never be done. Some indirect evidence raises doubts about the benefit of surgery in screen-detected disease. In the Mayo RCT of CXR, for example, more patients in the screened group had early stage lung cancer and underwent surgery, but there was no statistically CHEST / 123 / 1 / JANUARY, 2003 SUPPLEMENT

79S

significant difference in lung cancer mortality (in fact, lung cancer mortality rates were marginally higher in the screened group). In other words, increased rates of surgery did not confer a benefit to the group as a whole.57 Similar results were seen in the RCT of CXR conducted in Czechoslovakia.22

Evidence that LDCT May Result in Aggregate Harm to Screened Individuals Distinct from the issues of lung cancer mortality, there are a number of concerns about the potential harms that could be caused by LDCT if implemented as a screening tool. First among these is the identification of large numbers of individuals with abnormalities on LDCT that do not require treatment, but may require serial monitoring and biopsies. These individuals likely experience anxiety when an abnormality is found during their screening test for lung cancer. The likelihood of such events appears high. In the ELCAP study, 23% of participants had at least one noncalcified nodule, and only 2.9% had diagnosed lung cancer, although it should be noted that all of the individuals who were identified as requiring biopsies because their lesions were highly suspicious did in fact have histologically confirmed lung cancer.41 In a parallel study conducted at the Mayo Clinic in Rochester, MN, 51% had at least one noncalcified nodule on CT scan, and only 1.4% had diagnosed lung cancer.58 In other words, the apparent false-positive to true-positive ratio is probably in the range of 10 to 30. A corollary source of harm may result from some histologically proven lung cancers being in fact overdiagnosed. Unnecessary treatments may include thoracotomy, lung resection, chemotherapy, and radiation.59 The cost of mass screening with LDCT is unknown. A large potential population of high-risk subjects, combined with a high rate of findings that necessitate follow-up CT scanning, subspecialist consultation, and other testing, all endorse the expectation of a high total cost. An additional source of excess cost will come from the fact that costs of care for individuals with early stage lung cancer exceed costs of care for individuals with more advanced stage disease.60 Whether the LDCT will be cost-effective, meaning that the incremental improvement in survival per dollar spent is reasonable, is a related question that cannot be answered until effectiveness is established. The few published cost-effectiveness analyses on this topic rely on assumptions about effectiveness that are unproven.61– 63 80S

Summary Sputum cytology screening has been evaluated in the context of RCTs and does not appear to meet the criteria of a beneficial screening test. It appears to detect only a minority of all lung cancers, and does not appear to reduce lung cancer mortality, although it has been primarily assessed in combination with other screening evaluations. More sensitive analyses of sputum for abnormalities suggestive of malignancy are being investigated currently. CXR screening every 4 months or 6 months in combination with sputum cytology has been assessed in two RCTs, and CXR alone every 6 months as a single modality was assessed in a third RCT. In none of these studies was the control arm a no-screening intervention, and the power of these studies to detect small reductions in mortality was limited. Yet, in none of these studies was CXR associated with a reduction in lung cancer mortality. CXR is being further evaluated in the nearly completed Prostate, Lung, Colorectal, and Ovarian Trial. LDCT has only been evaluated in the context of observational studies with volunteer cohorts. The vast majority of cancers detected by LDCT scanning in these studies are stage I at the time of discovery. This suggests that the preponderance of them might be cured through surgical intervention. However, evidence from numerous sources casts doubt on the extent to which these screen-detected lesions represent aggressive malignancies, and therefore the extent to which surgical treatment will alter outcome. Readers should note that the finding of larger numbers of early stage cancers also characterizes all of the RCTs of sputum cytology and CXR, and in none of these studies was there an alteration in lung cancer mortality. The impact of LDCT on lung cancer mortality to date is consequently unknown. References 1 Ries LAG, Eisner MP, Kosary CL, et al. SEER Cancer Statistics Review, 1973–1997. Bethesda, MD: National Cancer Institute, 2000 2 Mandel J, Bond J, Church T. Reducing mortality from colorectal cancer by screening for fecal occult blood. N Engl J Med 1993; 328:1365–1371 3 Hardcastle J, Chamberlain J, Robinson M, et al. Randomized controlled trial of faecal-occult blood screening for colorectal cancer. Lancet 1996; 348:1472–1477 4 Kronborg O, Fenger C, Olsen J, et al. Randomised study of screening for colorectal cancer with faecal-occult blood test. Lancet 1996; 348:1467–1471 5 Fontana R, Sanderson DR, Woolner LB, et al. Lung cancer screening: the Mayo program. J Occup Med 1986; 28:746 – 750 6 Fontana RS, Sanderson DR, Taylor WF, et al. Early lung cancer detection: results of the initial (prevalence) radiologic and cytologic screening in the Mayo Clinic study. Am Rev Respir Dis 1984; 130:561–565 Lung Cancer Guidelines

7 Fontana RS, Sanderson DR, Woolner LB, et al. Screening for lung cancer: a critique of the Mayo Lung Project. Cancer 1991; 67(4 Suppl):1155–1164 8 Gatta G, Capocaccia R, Hakulinen T, et al. Variations in survival for invasive cervical cancer among European women, 1978 –1989. Cancer Causes Control 1999; 10:575–581 9 Kjellgren O. Mass screening in Sweden for cancer of the uterine cervix: effect on incidence and mortality. Gynecol Obstet Invest 1986; 22:57– 63 10 Gotzsche P, Olsen O. Is screening for breast cancer with mammography justifiable? Lancet 2000; 355:129 –134 11 Olsen O, Gotzsche P. Screening for breast cancer with mammography (Cochrane Review). The Cochrane Library. Oxford, UK: Update Software, 2002. 12 Charatan F. US panel finds insufficient evidence to support mammography. BMJ 2002; 324:255 13 Kolata G. Expert panel cites doubts on mammogram’s worth. New York Times. January 24, 2002;A16 14 Feuer EJ, Merrill RM, Hankey BF. Interpreting trends in prostate cancer: part II. Cause of death misclassification and the recent rise and fall in prostate cancer mortality. Cancer Surveillance Series. J Natl Cancer Inst 1999; 91:1025–1032 15 Hankey BF, Feuer EJ, Clegg LX, et al. Interpreting trends in prostate cancer: part I. Evidence of the effects of screening in recent prostate cancer incidence, mortality, and survival rates. Cancer Surveillance Series. J Natl Cancer Inst 1999; 91:1017– 1024 16 Black WC, Haggstrom DA, Welch HG. All-cause mortality in randomized trials of cancer screening. J Natl Cancer Inst 2002; 94:167–173 17 Brett GZ. Earlier diagnosis and survival in lung cancer. BMJ 1969; 4:260 –262 18 Brett GZ. The value of lung cancer detection by six-monthly chest radiographs. Thorax 1968; 23:414 – 420 19 Boucot KR, Weiss W. Is curable lung cancer detected by semiannual screening? JAMA 1973; 224:1361–1365 20 Lilienfeld A, Archer PG, Burnett CH, et al. An evaluation of radiologic and cytologic screening for the early detection of lung cancer: a cooperative pilot study of the American Cancer Society and the Veterans Administration. Cancer Res 1966; 26:2083–2121 21 Berlin N. Overview of the NCI Cooperative Early Lung Cancer Detection Program. Cancer 2000; 89(11 Suppl): 2349 –2351 22 Kubik A, Polak J. Lung cancer detection: results of a randomized prospective study in Czechoslovakia. Cancer 1986; 57: 2427–2437 23 Frost J, Ball WC, Levin M, et al. Early lung cancer detection: results of the initial (prevalence) radiologic and cytologic screening in the Johns Hopkins Study. Am Rev Respir Dis 1984; 130:549 –554 24 Tockman MS. Lung cancer screening: the Johns Hopkins study. Chest 1986; 89(suppl):324 –324 25 Melamed MR. Lung cancer screening results in the National Cancer Institute New York study. Cancer 2000; 89(11 Suppl): 2356 –2362 26 Marcus PM, Bergstralh EJ, Fagerstrom RM, et al. Lung cancer mortality in the Mayo Lung Project: impact of extended follow-up. J Natl Cancer Inst 2000; 92:1308 –1316 27 Kubik AK, Parkin DM, Zatloukal P. Czech Study on Lung Cancer Screening: post-trial follow-up of lung cancer deaths up to year 15 since enrollment. Cancer 2000; 89(11 Suppl): 2363–2368 28 Prorok PC, Chamberlain J, Day NE, et al. UICC workshop on the evaluation of screening programmes for cancer. Int J Cancer 1984; 34:1– 4 www.chestjournal.org

29 Eddy DM. Screening for lung cancer. Ann Intern Med 1989; 111:232–237 30 Patz EF Jr, Goodman PC, Bepler G. Screening for lung cancer. N Engl J Med 2000; 343:1627–1633 31 Bailar JC. Screening for lung cancer: where are we now? Am Rev Respir Dis 1984; 130:541–542 32 Okamoto N, Suzuki T, Hasegawa H, et al. Evaluation of a clinic-based screening program for lung cancer with a casecontrol design in Kanagawa, Japan. Lung Cancer 1999; 25:77– 85 33 Gohagan J, Prorok PC, Hayes RB, et al. The Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial of the National Cancer Institute: history, organization, and status. Control Clin Trial 2000; 21(6 Suppl):251S–272S 34 Prorok PC, Andriole GL, Bresalier RS, et al. Design of the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. Control Clin Trial 2000; 21(6 Suppl):273S– 309S 35 Sone S, Takashima S, Li F, et al. Mass screening for lung cancer with mobile spiral computed tomography scanner. Lancet 1998; 351:1242–1245 36 Kaneko M, Eguchi K, Ohmatsu H, et al. Peripheral lung cancer: screening and detection with low-dose spiral CT vs radiography. Radiology 1996; 201:798 – 802 37 Kakinuma R, Ohmatsu H, Kaneko M, et al. Detection failures in spiral CT screening for lung cancer: analysis of CT findings. Radiology 1999; 212:61– 66 38 Henschke CI, McCauley DI, Yankelevitz DF, et al. Early Lung Cancer Action Project: overall design and findings from baseline screening. Lancet 1999; 354:99 –105 39 Feinstein MB, Bach PB. Epidemiology of lung cancer. Chest Surg Clin N Am 2000; 10:653– 661 40 Henschke CI, McCauley DI, Yankelevitz DF, et al. Early Lung Cancer Action Project: a summary of the findings on baseline screening. Oncologist 2001; 6:147–152 41 Henschke CI, Naidich DP, Yankelevitz DF, et al. Early Lung Cancer Action Project: initial findings on repeat screenings. Cancer 2001; 92:153–159 42 Swenson SJ, Jett JR, Sloan JA, et al. Screening for lung cancer with low-dose spiral computed tomography. Am J Respir Crit Care Med 2002; 165:433– 434 43 Sobue T, Morigama N, Kaneko M, et al. Screening for lung cancer with low-dose helical computed tomography: antilung cancer association project. J Clin Oncol 2002; 20:911– 920 44 Sone S, Li F, Yang ZG, et al. Results of three-year mass screening programme for lung cancer using mobile low-dose spiral computed tomography scanner. Br J Cancer 2001; 84:25–32 45 Miettinen OS, Yankelevitz DF, Henschke CI. Screening for lung cancer [letter]. N Engl J Med 2001; 344:935 46 Jett JR. Spiral computed tomography screening for lung cancer is ready for prime time. Am J Respir Crit Care Med 2001; 163:812– 815 47 Patz EFJ, Goodman PC. Low-dose spiral computed tomography screening for lung cancer: not ready for prime time. Am J Respir Crit Care Med 2001; 163:813– 814 48 Lung screening study ties up advisors, board sends it back to NCI for revision. Cancer Lett 2001; 27[26]:1– 4 49 Grannis FWJ. Lung cancer overdiagnosis bias: “the gyanousa am loose!” Chest 2001; 119:322–323 50 Heffner JE, Silvestri G. CT screening for lung cancer. Am J Respir Crit Care Med 2002; 165:433– 434 51 McFarlane MJ, Feinstein AR, Wells CK. Clinical features of lung cancers discovered as a postmortem “surprise.” Chest 1986; 90:520 –523 52 Dammas S, Patz EF Jr, Goodman PC. Identification of small CHEST / 123 / 1 / JANUARY, 2003 SUPPLEMENT

81S

53 54 55 56 57 58

lung nodules at autopsy: implications for lung cancer screening and overdiagnosis bias. Lung Cancer 2001; 33:11–16 Cancer rates and risks. Bethesda, MD: National Cancer Institute, Cancer Statistics Branch, 1996 Ettinger DS, Cox JD, Ginsberg RJ, et al. NCCN (National Comprehensive Cancer Network) non-small cell lung cancer practice guidelines. Oncology 1996; 10:S81–S111 Bach PB, Cramer LD, Warren JL, et al. Racial differences in the treatment of early-stage lung cancer. N Engl J Med 1999; 341:1198 –1205 Flehinger BJ, Kimmel M, Melamed MR. The effect of surgical treatment on survival from early lung cancer. Chest 1992; 101:1013–1018 Lederle FA, Niewoehner DE. Lung cancer surgery: a critical review of the evidence. Arch Intern Med 1994; 154:2397–2400 Swenson SJ, Jett JR, Sloan JA, et al. Screening for lung cancer

82S

59 60 61 62 63

with low-dose spiral computed tomography. Am J Respir Crit Care Med 2002; 165:508 –513 Black WC. Overdiagnosis: an underrecognized cause of confusion and harm in cancer screening. J Natl Cancer Inst 2000; 92:1280 –1282 Fireman BH, Quesenberry CP, Somkin CP, et al. Cost of care for cancer in a health maintenance organization. Health Care Financ Rev 1997; 18:51–76 Marshall D, Simpson KN, Earle CC, et al. Potential costeffectiveness of one-time screening for lung cancer (LC) in a high risk cohort. Lung Cancer 2001; 32:227–236 Marshall D, Simpson KN, Earle CC, et al. Economic decision analysis model of screening for lung cancer. Eur J Cancer 2001; 37:1759 –1767 Miettinen OS. Screening for lung cancer: can it be costeffective? Can Med Assoc J 2000; 162:1431–1436

Lung Cancer Guidelines