Specific Keynote: Ovarian Cancer Risk Assessment and the Potential for Early Detection


Gynecologic Oncology 88, S75–S79 (2003) doi:10.1006/gyno.2002.6689

Nicole Urban, Sc.D.
Fred Hutchinson Cancer Research Center

RATIONALE FOR OVARIAN CANCER SCREENING

In the United States, ovarian cancer is the sixth most common cancer among women, accounting for 4% of all cancers in women, and the fifth most common cause of cancer death among women. It causes more deaths than any other cancer of the female reproductive system. The American Cancer Society estimates that about 23,000 new cases of ovarian cancer and about 14,000 deaths from ovarian cancer occur in the United States each year [1].

The facts about ovarian cancer suggest that screening of the postmenopausal population would reduce mortality, if a cost-effective, efficacious screening strategy could be identified. Ovarian cancer occurs most frequently in women aged 50–79; over 70% of the cancers occur after age 50. Ninety percent of women with ovarian cancer have no family history of the disease [2]. Thus, screening programs targeted at all postmenopausal women are most likely to have an important impact on mortality.

About 50% of women diagnosed with ovarian cancer survive 5 years after diagnosis. Survival is excellent in early-stage disease but poor in late-stage disease, regardless of histology. If the cancer is diagnosed and treated before it has spread outside the ovary, the 5-year survival rate is over 90%. Even for the most aggressive tumors, the serous tumors that are not well-differentiated, 5-year survival rates are over 80%. However, early-stage diagnosis is rare. Only 25% of all ovarian cancers are found at an early stage [3], suggesting that there is much room for improvement in the early detection of ovarian cancer.

Screening presents challenges, however. The incidence of ovarian cancer is low, and the disease typically progresses quickly. Among all women, the lifetime risk of ovarian cancer diagnosis is estimated to be 1.8%, or 1 in 56. Definitive diagnosis of ovarian cancer requires laparoscopy or laparotomy, either of which carries significant risk to the patient, as well as cost to the health care system. Because screening is imperfect, many healthy women will have to undergo these procedures only to learn that they do not have cancer. Appropriate screening technologies and populations must be selected to balance the benefits and costs of ovarian cancer screening.

FEASIBILITY OF OVARIAN CANCER SCREENING

The efficacy of screening for ovarian cancer must be demonstrated in a randomized controlled trial (RCT). To be feasible for general use, costs and quality-of-life (QOL) effects must also be considered. Prior to testing ovarian cancer screening in an RCT, a potentially cost-effective strategy should be identified. Several strategies have been proposed, and some are currently being evaluated for efficacy or QOL effects. Each candidate strategy makes use of one or more screening tests used according to a defined protocol. The protocol includes the sequence of tests to be used, the criteria for positivity of each test, and the screening interval. A strategy employs a specified protocol in asymptomatic women of specified age and risk status.

Required Performance of a Screening Strategy

Most critics agree that a screening strategy for ovarian cancer must have sensitivity of at least 80% in early-stage disease and a positive predictive value (PPV) of at least 10% (i.e., a maximum of 10 surgeries for every cancer found). Because ovarian cancer is so rare, the latter implies near perfect specificity. For example, in the general population a specificity of 99.6% is needed [4].

Availability of Screening Tests

Two screening tests are currently in use: the CA125 serum tumor marker and imaging using transvaginal sonography (TVS). TVS has been studied for several years by various investigators. Early results were summarized by Karlan [5], who reported sensitivity as high as 100% and specificity of about 98%. Specificity of TVS may be improved through the use of color Doppler imaging or a morphology index, usually with some reduction in sensitivity. For example, DePriest and his colleagues [6] reported on a series of women that included high-risk premenopausal as well as average-risk postmenopausal women. They reported a PPV of 6.7% and a sensitivity of 86% using a morphology index.
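The link between rarity, PPV, and specificity stated under Required Performance can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the prevalence of 40 cases per 100,000 women screened and the perfectly sensitive test are assumptions chosen for the example, not figures taken from the text or from reference [4].

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def required_specificity(sensitivity, prevalence, target_ppv):
    """Smallest specificity that still reaches the target PPV."""
    max_false_pos_rate = (
        sensitivity * prevalence * (1.0 - target_ppv)
        / (target_ppv * (1.0 - prevalence))
    )
    return 1.0 - max_false_pos_rate


if __name__ == "__main__":
    # ASSUMPTION: 40 cancers per 100,000 women screened (hypothetical input).
    assumed_prevalence = 40 / 100_000
    spec = required_specificity(1.0, assumed_prevalence, 0.10)
    print(f"required specificity: {spec:.4f}")
    print(f"PPV at that specificity: {ppv(1.0, spec, assumed_prevalence):.3f}")
```

Under these assumed inputs the required specificity comes out close to the 99.6% quoted above; a lower sensitivity or a rarer disease pushes the requirement higher still.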
0090-8258/03 $35.00 © 2003 Elsevier Science (USA) All rights reserved.

HELENE HARRIS MEMORIAL TRUST SUPPLEMENT

CA125 is a high-molecular-weight glycoprotein, detectable in serum, which has been used for many years in surveillance for ovarian cancer recurrence and has been studied more recently for use in early detection. In its most common use, a single CA125 level exceeding 35 U/ml determines a positive test for ovarian cancer. Over 85% of all advanced ovarian cancer cases have CA125 levels exceeding this threshold, and 50% of cases with disease confined to the ovary exceed this threshold. However, specificity is poor when CA125 is used in this way: nearly 6% of women without cancer have levels of CA125 exceeding 35 U/ml [7]. CA125 may also be useful as an indicator of risk, as it is over 30 U/ml in 50% of ovarian cancer patients more than 18 months prior to clinical detection and in 23% of patients more than 5 years before diagnosis [8].

Strategies employing TVS and/or CA125 as a first-line screen have been proposed and are being tested in trials. In the Prostate, Lung, Colon and Ovary trial [9] in the United States, CA125 (single threshold elevation above 35 U/ml) and TVS are used together annually as a first-line screen; if either is positive the woman is referred for surgical consultation. In this two-arm RCT, 74,000 women aged 55–74 have been randomized to a screening arm (annual screening for ovarian, lung, and colon cancer) or to a standard-care control arm. Ten centers are collaborating in this efficacy trial, which requires 10 years average follow-up.

New markers are also under investigation, including most notably the M-CSF serum marker [10] and the LPA plasma tumor marker [11]. M-CSF is a hematopoietic cytokine involved in the activation of macrophages, detectable in serum. When used alone, M-CSF detects 61–68% of all cases, and its specificity for benign tumors is 93% [12]. It may be valuable as a marker to be used in combination with CA125, as in cases with CA125 < 35 U/ml its sensitivity is 50%. Using elevation of either CA125 or M-CSF as the criterion for positivity of a blood test, it has been reported that 96–98% of cases are found, including 81% of early-stage cases.
However, it is difficult to compare this performance directly to the use of CA125 alone because its specificity is poor, with 20% of women without cancer exceeding at least one of these thresholds [10].

LPA is a factor composed of various species of lysophosphatidic acids, detectable in plasma. Its sensitivity in advanced disease is reported to be 100%; in disease confined to the ovary it is 90%. Its specificity for ovarian cancer is reported to be near 90% [11]. In a series of stage I and II cancers in which both CA125 and LPA were assessed, LPA was elevated in over 90% while CA125 was elevated in fewer than 50% [13].

Use of a panel of markers is a potentially promising strategy. Toward this end, new markers are being identified and evaluated with respect to their ability to complement existing markers. For example, novel technologies such as high density array hybridization (HDAH) [14] and SEREX [15] are being used to identify genes and proteins that are more highly expressed in the tissue and serum of women with ovarian cancer than in healthy women. HDAH is used to identify cancer-related genes, while SEREX immunoscreening is used to identify cancer-related proteins that elicit an immune response. Assays are being developed to measure these tumor markers in serum, and statistical methods are being developed to evaluate the new markers.

The first step in evaluating new markers is to compare their levels in women with and without cancer. It is important for these analyses that blood samples are obtained prior to diagnosis of cancer in the cases and that blood is processed identically in cases and controls. Characteristics of the woman that may affect the levels of the markers as well as the risk of cancer, such as age, menopausal status, and family history, should be controlled in the analysis. Similarly, characteristics of the cancer must be considered, including stage at diagnosis and histology. The markers should not be evaluated in isolation, but considered for their potential contribution to a panel of markers that might be used sequentially or together. For example, investigators in the Pacific Ovarian Cancer Research Consortium are evaluating mesothelin and antibodies to p53 and her2/neu for their contribution to a marker panel that includes CA125. The predictive ability of a marker is assessed in a logistic regression model that can be extended to multiple markers. The model can then be used to obtain ROC curves by determining the sensitivity and specificity of a marker panel for each positivity criterion and graphing sensitivity (the true-positive rate, TP) against 1 − specificity (the false-positive rate, FP). The results of preliminary analyses are shown in Fig. 1.

FIG. 1. (A) ROC curve for CA125 and mesothelin; (B) ROC curve for CA125, Ab to H2N, and p53.

A limitation of the analyses conducted as a first step in marker evaluation is that the markers are measured at a single point in time, close to the time of diagnosis when symptoms may be present. A second step in marker evaluation is therefore to evaluate the behavior of markers over time in healthy women, considering characteristics of the woman such as age, menopausal status, and family history. Within-woman variability over time in each marker is evaluated. Markers that signal cancer by a change in their values over time in response to developing cancer are needed [16]. This is discussed further below in the context of the criteria for the positivity of tests.

Using Tests Sequentially

When used alone, and in a conventional way, none of the available tests achieves the needed performance. However, performance of a screening strategy, rather than an individual test, should be considered in judging the potential usefulness of ovarian cancer screening. One way to improve performance is to use tests sequentially. A reasonably sensitive and specific first-line test is used to select women for screening by another independent second-line test in order to improve specificity while retaining sensitivity. To improve cost-effectiveness, a less costly test or tests can be used to select women for a more costly test(s) [17].

For example, an approach that appears to hold promise is to select women for TVS screening on the basis of a CA125 test. Jacobs has tested this multimodal strategy in a pilot RCT in the United Kingdom [18]. In the Jacobs strategy, TVS was performed only if CA125 exceeded 30 U/ml. CA125 was performed at 12-month intervals in 22,000 women. Results of the prevalence screen suggest sensitivity of 85% at the 1-year follow-up and 58% at the 2-year follow-up and specificity of 99.6%. Results of Jacobs' two-arm trial (11,000 per arm) were that survival was better in the screened group (72.9 months vs 41.8 months) and that the PPV was 20%.
The difference in survival, while not statistically significant, is impressive because it is not biased by lead time: survival was measured from the time of randomization rather than diagnosis. These results suggest that a sequential, multimodal approach may be efficacious even using CA125, an imperfect marker, as the first-line test. If a panel of complementary markers can be identified, their sequential use may prove to be a particularly cost-effective strategy.

Criteria for Positivity of Tests

Another way to improve performance is to adjust the criteria for test positivity. For example, there are various ways to define the positivity of an imaging test. Criteria might make use of the size of certain dimensions of the visualized ovary, presence of any abnormality, morphology of a visualized abnormality, and presence of increased blood flow suggestive of angiogenesis. Use of a morphology index (rather than ovarian volume or flow imaging) has been reported to improve performance, particularly specificity [19].
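The trade-off behind using tests sequentially, as opposed to referring on either of two positive tests, can be sketched numerically. The example below assumes that the two tests err independently given true disease status, which is a simplifying assumption and not a claim made in the text. The false-positive rates loosely echo figures quoted earlier (about 6% for CA125 above 35 U/ml, about 2% for TVS), while the sensitivities and the function names are hypothetical.

```python
def serial(sens1, spec1, sens2, spec2):
    """Second test given only to first-line positives; refer if both are positive.

    ASSUMPTION: test errors are independent given disease status.
    """
    sensitivity = sens1 * sens2
    specificity = 1.0 - (1.0 - spec1) * (1.0 - spec2)
    return sensitivity, specificity


def parallel(sens1, spec1, sens2, spec2):
    """Both tests given to everyone; refer if either is positive.

    ASSUMPTION: test errors are independent given disease status.
    """
    sensitivity = 1.0 - (1.0 - sens1) * (1.0 - sens2)
    specificity = spec1 * spec2
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical sensitivities; specificities loosely based on the text.
    marker = (0.85, 0.94)   # first-line serum marker
    imaging = (0.90, 0.98)  # second-line imaging test
    print("serial  :", serial(*marker, *imaging))
    print("parallel:", parallel(*marker, *imaging))
```

Under these assumptions the serial rule trades some sensitivity for a large gain in specificity, whereas the either-positive rule does the opposite, which is consistent with the qualitative argument made above for sequential use.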

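Changing the positivity criterion of a marker moves a test along its ROC curve, which is how the curves described earlier for marker panels are traced. The following is a minimal sketch of that construction for a single score, which could be one marker value or a fitted panel score. The function names and the toy data are hypothetical and do not reproduce any analysis from the text.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs.

    One point per candidate threshold; a case is called positive when its
    score is at or above the threshold. labels use 1 for cancer, 0 for none.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of ROC points ordered by threshold."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical scores for two cases and two controls.
    toy_scores = [0.9, 0.4, 0.6, 0.1]
    toy_labels = [1, 1, 0, 0]
    curve = roc_points(toy_scores, toy_labels)
    print(curve)
    print(area_under_curve(curve))
```

Each point on the curve corresponds to one candidate positivity criterion, so choosing a criterion amounts to choosing an operating point with an acceptable false-positive rate.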

Several strategies have been proposed for defining the criterion for positivity of a serum marker. A limitation of the marker analyses described above is that marker measurement is at a single point in time. The conventional approach is to specify a single threshold based on a single measure, such as 30 or 35 U/ml for CA125. However, it is likely that sensitivity can be improved without a loss in specificity by controlling for screening history when determining a positive test [20]. For example, doubling since last screen [17], excessive deviation from historical mean level [16], or the determination that an exponential rise has been observed [21] may be more useful than the single threshold rule.

In ongoing work, Jacobs is using a longitudinal risk-of-ovarian-cancer (ROC) algorithm developed by Skates and Pauler to improve the performance of his multimodal strategy [22]. Based on an assumption that CA125 rises exponentially as cancer develops, change-point analysis is used to select women for screening by TVS. Multiple callbacks for repeat CA125 are used to triage women at each screen such that 2% of women go directly to TVS, 13% are given early recall for repeat CA125 in 6 weeks to 6 months, and the remainder return in a year for their next annual screen. Women with early recall undergo the same procedure and so may be sent to TVS, rescheduled for their annual visit, or recalled again as many as three times. Because only about 2.3% are referred for TVS each year, the strategy achieves high specificity.

It is being tested in a three-arm trial in unselected postmenopausal women aged 50–74. A total of 200,000 women are being randomized to the multimodal strategy, TVS as a first-line screen, or a control condition. Twelve centers in the United Kingdom are collaborating to recruit the requisite number of women. Women will be enrolled over a 3-year period, screened annually six times, and followed for an average of 7 years [23].
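The difference between a single-threshold rule and history-based rules can be made concrete with a small sketch. The three functions below are hypothetical illustrations, not the published algorithms of references [16], [17], or [21]. The third uses a textbook normal-normal shrinkage model on log marker values as a stand-in for "deviation from historical mean level", and every parameter value shown is an assumption made for the example.

```python
from math import sqrt
from statistics import mean


def single_threshold(current, cutoff=35.0):
    """Conventional rule: positive if the current level exceeds a fixed cutoff (U/ml)."""
    return current > cutoff


def doubled_since_last(history, current):
    """Positive if the level has at least doubled since the previous screen."""
    return bool(history) and current >= 2.0 * history[-1]


def personal_threshold(log_history, pop_mean, between_var, within_var, z=2.0):
    """Woman-specific cutoff on the log scale under a normal-normal model.

    ASSUMPTION: a woman's log levels vary around her own mean with variance
    within_var, and women's means vary around pop_mean with variance
    between_var. Her estimated mean is shrunk toward pop_mean when she has
    few prior screens.
    """
    n = len(log_history)
    if n == 0:
        shrink = 0.0
        centre = pop_mean
    else:
        shrink = between_var / (between_var + within_var / n)
        centre = pop_mean + shrink * (mean(log_history) - pop_mean)
    predictive_var = within_var + (1.0 - shrink) * between_var
    return centre + z * sqrt(predictive_var)


if __name__ == "__main__":
    history = [8.0, 10.0, 9.0, 11.0]  # hypothetical prior CA125 values, U/ml
    print(single_threshold(24.0), doubled_since_last(history, 24.0))
    print(personal_threshold([2.0, 2.0, 2.0], 2.5, 0.25, 0.09))
```

In this toy case a rise from about 10 to 24 U/ml is missed by the fixed 35 U/ml cutoff but flagged by the doubling rule, and a woman with a consistently low baseline receives a lower personal cutoff than a woman with no screening history.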
Screening Interval

An important way to influence the effectiveness and cost-effectiveness of screening is by varying the screening interval. Frequency of screening can be defined for all women in the screening program, or it can be adjusted based on the risk level of the woman. Frequent screening is likely to be useful if the disease progresses rapidly on average, if there is significant variation among women in the rate of progression of disease, and/or if risk justifies frequent screening based on cost-effectiveness considerations. Although the large efficacy trials are employing annual screening, more frequent screening is being investigated in smaller studies, particularly in high-risk women.

One approach is to alternate the use of TVS and CA125 so that women are screened every 6 months. Quality-of-life effects of this approach are being evaluated in 592 women aged 30–70 without a significant family history of breast or ovarian cancer. In an RCT with a 2 × 2 factorial design, QOL effects of both risk education and screening are being evaluated. The risk education is conducted in groups of 6 women, in 4 weekly sessions with a booster session 1 year later. Screening employs a multimodal strategy in which CA125 and TVS are used alternately every 6 months as first-line screens; when one is positive the other is performed.

The criterion for positivity of CA125 is based on a Parametric Empirical Bayes (PEB) rule developed by McIntosh and Urban [16]. It uses historical data to tailor the use of markers to the individual woman. At each screen, a woman's CA125 is tested for deviation from her own normal CA125 value. At each screen, 4% of women are referred for TVS. The PEB rule has some advantages, including that it directly controls the false-positive rate, it distributes false positives uniformly among women screened, it does not require data describing behavior of a marker during the preclinical phase of the disease, and it can be generalized to a marker panel. A retrospective evaluation of the PEB rule using a marker panel is being conducted by McIntosh and Karlan in a cohort of high-risk women using stored blood.

Choice of Population to Be Screened

It may also be possible to select women for screening based on risk factors. Among women with a significant family history (at least one first-degree relative with ovarian cancer), the lifetime risk of ovarian cancer rises to 9.4%, or 1 in 11. If a woman has no family history, 3+ pregnancies, and 4+ years of oral contraceptive use, her risk falls to 0.6%, or 1 in 167 [24]. The use of serum or plasma markers to refine risk algorithms holds much promise. For example, CA125 has been shown in two studies to predict ovarian cancer 5 years prior to diagnosis [8, 25]. Specimen repositories from large trials such as the Women's Health Initiative offer an excellent opportunity to evaluate risk factors and marker panels for this purpose.

BARRIERS TO PROGRESS

Significant progress is being made in identifying and evaluating ovarian cancer screening strategies.
However, progress is impeded by the fact that precursor lesions and premalignant conditions have not been identified. In addition, the duration of the preclinical phase of the disease is uncertain, and data describing the behavior of tumor markers during the preclinical phase are rare. Many other aspects of ovarian cancer screening are still poorly understood, including whether benign masses are a risk factor for subsequent malignant disease and to what extent TVS and CA125 detect different cancers [26]. Quality-of-life effects of screening should also be considered, including long-term effects of false positives and oophorectomy. Our understanding of ovarian cancer disease progression and detection will be improved by analysis of data from a large, well-designed RCT. It is exciting that an international trial or set of trials will soon be underway to test the hypothesis that screening and early detection prevent mortality from ovarian cancer.

Efforts are ongoing to identify new markers such as LPA that may be useful in combination with CA125 and TVS. It is likely that changes over time in one or more markers will be useful for identifying women with developing or early-stage ovarian cancer. It is possible, too, that these or other markers will be useful in identifying women who are likely to develop ovarian cancer in the future. If good markers of risk can be identified, it may eventually be possible to perform risk-based screening in which the frequency of screening is a function of the woman's risk based on her family history and other risk factors, the levels of her markers, and even her baseline TVS.

REFERENCES

1. American Cancer Society, American Cancer Society cancer facts and figures—2000. Atlanta (GA): American Cancer Society Institute, 2000.
2. Schildkraut JM, Thompson WD. Familial ovarian cancer: a population-based case-control study. Am J Epidemiol 1988;128(3):456–66.
3. Ries LAG, Kosary CL, Hankey BF, Miller BA, Clegg L, Edwards BK, editors. SEER cancer statistics review, 1973–1996. Bethesda: National Cancer Institute, 1999.
4. Jacobs I. Overview—progress in screening for ovarian cancer. In: Sharp F, Blackett A, Berek J, Bast R, editors. Ovarian Cancer 5. Oxford: Isis Medical Media, 1998.
5. Karlan BY. The status of ultrasound and color Doppler imaging for the early detection of ovarian carcinoma. Cancer Invest 1997;15(3):265–9.
6. DePriest PD, Gallion HH, Pavlik EJ, Kryscio RJ, van Nagell JR. Transvaginal sonography as a screening method for the detection of early ovarian cancer. Gynecol Oncol 1997;65(3):408–14.
7. Jacobs I, Bast RC. The CA-125 tumour associated antigen: a review of the literature. Hum Reprod 1989;4(1):1–12.
8. Jacobs IJ, Skates S, Davies AP, Woolas RP, Jeyerajah A, Weidemann P, Sibley K, Oram DH. Risk of diagnosis of ovarian cancer after raised serum CA 125 concentration: a prospective cohort study. BMJ 1996;313(7069):1355–8.
9. Kramer BS, Gohagan J, Prorok PC, Smart C. A National Cancer Institute sponsored screening trial for prostatic, lung, colorectal, and ovarian cancers. Cancer 1993;71:589–93.
10. Suzuki M, Ohwada M, Aida I, Tamada T, Hanamura T, Nagatomo M. Macrophage colony-stimulating factor as a tumor marker for epithelial ovarian cancer. Obstet Gynecol 1993;82(6):946–50.
11. Xu Y, Shen Z, Wiper D, Wu M, Morton RE, Elson P, Kennedy AW, Belinson J, Markman M, Casey G. Lysophosphatidic acid as a potential biomarker for ovarian and other gynecologic cancers. JAMA 1998;280(8):719–23.
12. Xu FJ, Ramakrishnan S, Daly L, Soper JT, Berchuck A, Clarke-Pearson D, Bast RC. Increased serum levels of macrophage colony-stimulating factor in ovarian cancer. Am J Obstet Gynecol 1991;165(5 Pt 1):1356–62.
13. Bast RC, Xu F, Yu Y, Barnhill S, Zhang Z, Mills GB. CA125; the past and the future. Int J Biol Markers 1998;13:179–87.
14. Schummer M, Ng WV, Bumgarner RE, Nelson PS, Schummer B, Bednarski DW, Hassell L, Baldwin RL, Karlan BY, Hood L. Comparative hybridization of an array of 21,500 ovarian cDNAs for the discovery of genes overexpressed in ovarian carcinomas. Gene 1999;238(2):375–85.
15. Stone B, Schummer M, Paley PJ, Crawford M, Ford M, Urban N, Nelson BH. MAGE-F1, a novel ubiquitously expressed member of the MAGE superfamily. Gene 2001;267:173–82.
16. McIntosh MW, Urban N. A parametric empirical Bayes method for cancer screening using longitudinal observations of a biomarker. Biostatistics 2002, in press.
17. Urban N, Drescher C, Etzioni R, Colby C. Use of a stochastic simulation model to identify an efficient protocol for ovarian cancer screening. Control Clin Trials 1997;18:251–70.
18. Jacobs I, Skates SJ, MacDonald N, et al. Outcome of a pilot randomized controlled trial of ovarian cancer screenings. Lancet 1999;353:1207–10.
19. van Nagell JR, DePriest PD. The efficacy of transvaginal sonographic screening in asymptomatic women at risk for ovarian cancer. Gynecol Oncol 2000;77:350–6.
20. Urban N, McIntosh M, Clarke L, Karlan B, Anderson G, Drescher C. In: Socioeconomics of ovarian cancer screening. Stockholm: 7th Helene Harris Memorial Trust Forum on Ovarian Cancer, Apr 19, 1999.
21. Skates S, Jacobs I, Knapp R. Quantifying risk of ovarian cancer using longitudinal CA125 levels. In: Sharp F, editor: Ovarian cancer 5. Oxford: Isis Medical Media 1998;187–97.
22. Skates S, Pauler DK. Screening based on risk of cancer calculation from Bayesian hierarchical change-point models of longitudinal markers. JASA 2001;96:429–39.
23. Jacobs, I. UK Collaborative Trial of Ovarian Cancer Screening: full proposal for MRC grant. July 14, 1999.
24. Hartge P, Whittemore AS, Itnyre J, McGowan L, Cramer D. Rates and risks of ovarian cancer in subgroups of white women in the United States. The Collaborative Ovarian Cancer Group. Obstet Gynecol 1994;84(5):760–4.
25. Zurawski VR, Orjaseter H, Andersen A, Jellum E. Elevated serum CA-125 levels prior to diagnosis of ovarian neoplasia: relevance for early detection of ovarian cancer. Int J Cancer 1988;42:677–80.
26. Karlan BY, Baldwin RL, Lopez-Luevanos E, Raffel LJ, Barbuto D, Narod S, Platt LD. Peritoneal serous papillary carcinoma, a phenotypic variant of familial ovarian cancer: implications for ovarian cancer screening. Am J Obstet Gynecol 1999;180(4):917–28.