Gynecologic Oncology 110 (2008) 374 – 382 www.elsevier.com/locate/ygyno
Evaluation of biomarker panels for early stage ovarian cancer detection and monitoring for disease recurrence
Laura J. Havrilesky a,⁎, Clark M. Whitehead b, Jennifer M. Rubatt a, Robert L. Cheek b, John Groelke b, Qin He b, Douglas P. Malinowski b, Timothy J. Fischer b, Andrew Berchuck a
a Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, Duke University Medical Center, Durham, NC, USA
b BD-TriPath, Durham, NC, USA
Received 20 March 2008
Available online 27 June 2008
Abstract
Objective. To determine the utility of novel combinations of biomarkers, using both a one-step and two-step assay format, to distinguish serum of early ovarian cancer patients from that of healthy controls and to discern the utility of these biomarkers in a monitoring capacity.
Methods. For ovarian cancer detection, HE4, Glycodelin, MMP7, SLPI, Plau-R, MUC1, Inhibin A, PAI-1, and CA125 were evaluated in a cohort of 200 women with ovarian cancer and 396 healthy age-matched controls. Each biomarker was assessed by serum-based immunoassays utilizing novel monoclonal antibody pairs or commercial kits. For detection of disease recurrence, HE4, Glycodelin, MMP7 and CA125 were evaluated in 260 samples from 30 patients with OC monitored longitudinally after diagnosis.
Results. Based upon ROC curve analysis, the sensitivity/specificity of specific biomarker combination algorithms ranged from 59.0%/99.7% to 80.5%/96.5% for detection of early stage ovarian cancer and 76.9%/99.7% to 89.2%/97.2% for detection of late stage cancer. In monitoring evaluation of 27 patients who experienced recurrence of OC, sensitivity for predicting recurrence was 100% for the biomarker panel and 96% for CA125. At least one of the panel biomarkers was elevated earlier (range 6–69 weeks) than CA125 and prior to clinical evidence of recurrence in 14/27 (52%) patients.
Conclusions. We have developed and demonstrated the utility of several one- and two-step multi-marker combinations with acceptable test characteristics for possible use in an ovarian cancer screening population. A subset of this panel may also provide adjunctive information to rising CA125 levels in disease monitoring.
© 2008 Elsevier Inc. All rights reserved.
Keywords: Ovarian cancer; Tumor marker; Early detection; Monitoring; CA125; HE4; Glycodelin; MMP7; MUC1; Plau-R
Introduction
Ovarian cancer is the fourth leading cause of cancer deaths among women in the United States [1]. Early stage ovarian cancer has an excellent prognosis if treated, but advanced stage ovarian cancer, which is diagnosed in approximately 70% of patients, is associated with a poor survival rate of only 10–30% [2]. Given the limitations of treatment for advanced ovarian cancer and the success of treatment for early stage disease, a screening test is intuitively appealing. The ability to
⁎ Corresponding author. Box 3079 DUMC, Durham NC 27710, USA. Fax: +1 919 684 8719. E-mail address:
[email protected] (L.J. Havrilesky). 0090-8258/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.ygyno.2008.04.041
accurately detect early stage disease would potentially improve ovarian cancer survival dramatically. However, the low prevalence of ovarian cancer (30–50 cases/100,000 women) limits the achievable sensitivity and specificity of any screening test. Prior attempts to establish population-based screening protocols for ovarian cancer have employed CA125, ultrasound, and novel molecular and statistical approaches. CA125 is useful for discriminating benign from malignant pelvic masses and can be used to assess response to treatment, but is not sensitive or specific enough to justify screening [3–5]. Ultrasound-based screening has resulted in a positive predictive value (PPV) of 9.4% [6], while a screening algorithm employing both longitudinal CA125 patterns and ultrasound imaging has achieved a PPV of 19% in clinical trials [7].
L.J. Havrilesky et al. / Gynecologic Oncology 110 (2008) 374–382
The use of marker panels to improve sensitivity and specificity has been extensively investigated; some of the most promising reported markers include CA72-4, M-CSF, OVX1, LPA, Prostacin, Osteopontin, Inhibin and Kallikrein [7–15]. Zhang et al. recently reported combining multi-panel analysis with artificial neural network (ANN) modeling and attained a sensitivity of 71% with a specificity of 98% for the detection of early stage ovarian cancer [16]. No marker panel reported to date, however, has achieved adequate performance characteristics to use as an ovarian cancer screening test. There remains a need for validated biomarkers to improve the ability to diagnose ovarian cancer at an early stage in a format amenable to routine clinical immunoassay laboratories.
The objective of this study was to evaluate a panel of candidate biomarkers and to determine whether novel combination(s) of these biomarkers, used either as a standard panel or as part of a two-step algorithm, could distinguish serum of early ovarian cancer patients from that of healthy controls without the use of specialized instrumentation, such as mass spectrometry, or complex computational algorithms, such as neural networks. We further wished to discern the utility of these biomarkers in a monitoring capacity to detect disease persistence or recurrence.
Methods
Screening study
Study population
A total of 700 serum samples were collected for the ovarian cancer detection study, comprising three separate cohorts (Table 1): (1) Ovarian cancer: equally distributed between stages I, II, and III (n = 200). (2) Normals: Healthy women (n = 500), further subdivided into: (a) Normal (n = 396), median age 55; comparison group for calculation of test characteristics; (b) Reference (n = 104, mean age 63), 98.1% from women >55 years, used to establish baseline levels of each marker and calculate appropriate threshold cutoff values.
Ovarian cancer sera were obtained under IRB approval from the following sources: DSS (West Barnstable, MA), Asterand (Detroit, MI), Virginia Biologicals (Virginia Beach, VA), Duke University Medical Center (Durham, NC), Lake Arrowhead (Lake Arrowhead, CA), Gynecology, Oncology, and Pelvic Surgery Associates (Columbus, OH), Marshfield Clinic (Marshfield, WI), Genomics Collaborative (Cambridge, MA), and Miami Valley Hospital (Dayton, OH). The normal sera utilized for both the threshold reference and normal comparison cohorts were collected from healthy donors through Community Blood Center (Kansas City, MO). 97% of OC samples originated from two sites, DSS and Lake Arrowhead. Sera were collected from women whose ovarian cancers had the following histological subtypes: clear cell (10), endometrioid (37), serous (38), mucinous (36), papillary (25), stromal sarcoma (2), and unclassified (3). Forty-nine additional specimens were classified only as adenocarcinomas.
For the ovarian cancer monitoring study, 260 serum samples obtained from 30 patients with advanced ovarian cancer diagnosed from 1990–1995 at Duke University were identified under an IRB-approved protocol. Eligibility for inclusion in this study required the availability of sera over the period encompassing chemotherapy treatment and subsequent clinical events, a known CA125 value for each serum sample, and available clinical data to correlate the date of serum collection with the clinical status of the patient's disease. Patients were excluded from the study if the available clinical data were insufficient to establish disease status.
Sample preparation
All sera were acquired following a standard collection protocol. Briefly, sera were collected in a Red Top Vacutainer (no additives, Cat #366430, Becton Dickinson, Franklin Lakes, NJ), clotted 60–90 min and centrifuged 10 min at 1300 ×g. The serum fraction was removed and stored on dry ice and/or at −80 °C until use.
Table 1
Demographics and clinical characteristics of study population

Characteristics                             Ovarian cancer        Ovarian cancer        Normal        Reference group
                                            stage I/II (n = 133)  stage III (n = 67)    (n = 396)     (n = 104)
Age (years)
  Mean (std)                                54.8 (11.0)           56.7 (12.2)           55.3 (9.7)    63.0 (6.9)
  Range                                     19–78                 22–81                 40–84         41–80
Age distribution
  <55                                       77 (58%)              23 (34%)              201 (50.8%)   2 (1.9%)
  >55                                       56 (42%)              44 (66%)              195 (49.2%)   102 (98.1%)
Race
  African-American                          6 (4.6%)              4 (6%)                3 (1%)        0
  Caucasian                                 121 (91.7%)           62 (92.5%)            389 (99%)     98 (100%)
  Other                                     5 (3.8%)              1 (1.5%)              1 (<1%)       0
  Missing                                   1 (0.8%)              0                     3             6
Ovarian cancer stage                                                                    NA            NA
  Stage I                                   67 (51%)
  Stage II                                  66 (49%)
  Stage III                                                       67 (100%)
Histology
  Clear cell                                0 (0%)                10 (15.0%)
  Endometrioid                              30 (22.6%)            7 (10.4%)
  Mucinous                                  27 (20.3%)            9 (13.4%)
  Serous                                    35 (26.3%)            3 (4.4%)
  Papillary                                 15 (11.2%)            10 (15.0%)
  Adenocarcinoma, not otherwise specified   23 (17.3%)            26 (38.8%)
  Sarcoma                                   0 (0%)                2 (3.0%)
  Unclassified                              3 (2.3%)              0 (0%)
Biomarker selection
Candidate biomarkers meeting the following inclusion criteria were selected: (1) over-expression of candidate gene in epithelial ovarian cancer relative to normal ovarian epithelium; (2) over-expression of encoded protein in ovarian tissue; (3) localization of encoded proteins to extra-cellular compartment as membrane protein or secreted protein; (4) discriminated ovarian cancer from normal sera utilizing prototype immunological assays. A literature review was conducted for investigations describing transcriptional profiling of epithelial ovarian cancer. Genes identified were further reviewed for supporting data to confirm over-expression using either RT-PCR or immunohistochemical analysis of tissue specimens. We selected HE-4, Glycodelin (PAEP), MMP-7, MUC-1, PAI-1, and SLPI as over-expressed genes in epithelial ovarian cancer based upon transcriptional profiling [7,17–21]. A subset had supporting data using RT-PCR or tissue-based immunohistochemistry methods [7,17,18] or protein analysis of serum samples from epithelial ovarian cancer patients [22–24]. Secondary review of literature describing more traditional protein analysis of serum samples from ovarian cancer patients identified Plau-R and Inhibin A as candidate markers for inclusion [18,25,26]. CA125 values were acquired and included as a panel member.
Biomarker assays
Biomarker assays were developed and performed at BD-TriPath Oncology.
Novel ELISA assay development and processing. Recombinant proteins to HE4, PAI1, MUC1 and SLPI were expressed and injected into mice via conventional or rapid methods [27]. The resulting monoclonal antibodies were screened to identify specific antibody pairs with utility in an ELISA sandwich format. The following novel antibody clones were selected, purified and utilized in this study: HE4 90.1.6, 71.1.1.13, PAI1 1D6.1.34, 4E2.6, MUC1 16E3.3, SLPI 6A9.07, 5G6.20.
Assay plate preparation.
96-well ELISA plates (Immulon 2HB, ThermoLabSystems, Milford, MA) were coated with 100 µl/well of appropriate capture
Fig. 1. ROC curves for CA125, HE4, Glycodelin and Plau-R. Comparison is between normal and ovarian cancer cohorts.
antibody (HE4-90.1.6, PAI1-1D6.1.34, MUC1-16E.3.3, SLPI Goat anti-Mouse Fc (Jackson ImmunoResearch, West Grove, PA)) diluted in PBS pH 7.4 (EMD Chemicals, Gibbstown, NJ) to 2 µg/ml. Plates were incubated at 4 °C overnight, washed with PBS, and blocked with 250 µl/well 1× PBS/3% BSA (Pierce, Rockford, IL) for 2 h at 30 °C. Plates were emptied, vacuum dried for 2 h at RT, heat-sealed in mylar foil packs, and stored at 4 °C. HE4. Serum samples were diluted 1:4 with sample diluent (PBS, 1% Bovine Serum (EquitechBio, Kerrville, TX)), 0.05% Tween 20 (Sigma, St. Louis, MO) and supplemented with 1 mg/ml mouse IgG (BioCheck, Foster City, CA). Standard curve samples, 100–0.78 ng/ml, were prepared by dilution in sample diluent; 100 μl of diluted test samples was incubated at 30 °C for 2 h, and washed 5 times in 0.05% Tween-20 in PBS pH 7.4. HRP conjugated detection
HE4 antibody clone 71.1.1.13 was diluted 1:16,000 with sample diluent (final concentration 0.166 µg/ml). 100 μl was added to each well and the plate was incubated at 30 °C for 1 h. PAI1 and MUC1. Serum samples were diluted 1:4 (PAI1) or 1:2 (MUC1) with sample diluent. Standard curve samples, 100–0.78 ng/ml (PAI1) or 600–0.27 U/ml (MUC1), were prepared using sample diluent. 100 μl was added to each assay well, incubated 2 h at 30 °C, and washed 5 times. Biotinylated detection antibody PAI1 4E2.6 was diluted to 0.1 μg/ml, and MUC1 M2C5 (Abcam, Cambridge, MA, Ab #8323) was diluted 1:5000 with sample diluent. 100 μl/well was added to plates and incubated 1 h at 30 °C. Wells were washed 5 times prior to addition of diluted (1:5000 in conjugate diluent) Streptavidin-HRP (Pierce). 100 μl/well was added to plates, followed by incubation for 1 h at 30 °C.
Table 2
Sensitivity and specificity of each tested marker

              Best cutoff                                        Mean + 2SD cutoff
Marker        Sensitivity     Sensitivity     Specificity       Sensitivity     Sensitivity     Specificity
              stage I/II      stage III       (n = 396) (%)     stage I/II      stage III       (n = 396) (%)
              (n = 133) (%)   (n = 67) (%)                      (n = 133) (%)   (n = 67) (%)
HE4           82.7            92.5            86.3              62.4            74.6            96
Glycodelin    63.2            64.2            79.0              53.4            44.8            85.3
MMP7          45.9            62.7            89.4              23.3            44.8            97.7
SLPI          54.2            71.6            71.7              9.8             25.4            98.3
Plau-R        69.9            74.6            75.8              44.4            52.2            90.1
Muc-1         53.4            55.2            73.0              6.8             10.5            98.5
Inhibin A     45.1            34.3            70.3              39.1            23.9            75.8
PAI-1         65.4            68.7            76.0              14.3            16.4            96.2
CA125         45.9            58.5            98.2              45.9            58.5            98.5
Specificity for both early and late stage cancer was calculated using the entire normal population.
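As an illustration of how the two cutoff types behind Table 2 can be derived, the following minimal Python sketch computes a mean + 2SD cutoff from a reference cohort and a "best" cutoff that maximizes sensitivity plus specificity. The function names and all marker values are hypothetical and are not study data; the exhaustive scan over observed values is an assumed implementation, since the paper states only the selection criterion.

```python
from statistics import mean, stdev

def two_sd_cutoff(reference_values):
    """Upper reference limit: mean + 2 * sample standard deviation."""
    return mean(reference_values) + 2 * stdev(reference_values)

def sens_spec(cases, controls, cutoff):
    """A value strictly above the cutoff counts as a positive test."""
    sensitivity = sum(v > cutoff for v in cases) / len(cases)
    specificity = sum(v <= cutoff for v in controls) / len(controls)
    return sensitivity, specificity

def best_cutoff(cases, controls):
    """Scan every observed value; keep the one maximizing sensitivity + specificity."""
    candidates = sorted(set(cases) | set(controls))
    return max(candidates, key=lambda c: sum(sens_spec(cases, controls, c)))

# Made-up marker levels for illustration only (not study data).
reference = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8]
controls = [0.9, 1.1, 1.0, 1.4, 1.2, 2.1]
cases = [1.3, 2.5, 3.0, 1.9, 0.95, 4.2]

cut_2sd = two_sd_cutoff(reference)
cut_best = best_cutoff(cases, controls)
print("2SD cutoff:", cut_2sd, sens_spec(cases, controls, cut_2sd))
print("Best cutoff:", cut_best, sens_spec(cases, controls, cut_best))
```

When several cutoffs tie on sensitivity plus specificity, this sketch returns the lowest one; the paper does not say how ties were resolved.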
SLPI. Primary antibody 6A9.07 was diluted to 2 µg/ml in sample diluent. 100 µl was added per well, followed by incubation at 30 °C for 2 h. Serum samples were diluted 1:40 with supplemented sample diluent. A standard curve was prepared from 305–2.38 ng/ml by dilution in sample diluent. 100 µl/well diluted test samples were incubated for 2 h at 30 °C, and washed 5 times. Secondary antibody conjugated to HRP, 5G6.20, was diluted 1:16,000 (0.076 µg/ml) with supplemented sample diluent, 100 µl was added per well, followed by incubation for 1 h at 30 °C. Assay plate development. After final antibody incubation, plates were washed 5 times in PBS/0.05% Tween-20, and developed with 100 μl/well TMB (3,3′,5,5′-tetramethylbenzidine; Sigma, St. Louis, MO) for 10 min. After incubation with 2N H2SO4 (Mallinckrodt, Hazelwood, MO) for 10 min at room temperature, plates were read at 450 nm (reference wavelength 650 nm) on a Spectramax Plus 384 (Molecular Devices, Sunnyvale, CA).
from 30 patients with advanced ovarian cancer diagnosed 1990–1995 at Duke University were identified under IRB-approved protocol. A positive biomarker panel was defined as elevation of any of three biomarkers above the upper 2SD cutoff at each specific time point tested. A negative panel consisted of values of all three biomarkers below the cutoff. Clinical disease status was defined by physical examination, all available radiographic studies and available biopsy confirmation.
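The panel definition above can be sketched in Python as follows. This is an illustrative sketch only: the three panel cutoffs and the patient series are hypothetical placeholders (the paper does not list the 2SD cutoff values), while the CA125 cutoff of 35 U/ml and the cap of 5 on plotted ratios come from the text and the Fig. 3 legend.

```python
# Hypothetical mean + 2SD cutoffs; the paper does not report these numbers.
PANEL_CUTOFFS = {"HE4": 2.0, "Glycodelin": 10.0, "MMP7": 5.0}
CA125_CUTOFF = 35.0  # U/ml, the cutoff stated in the paper
RATIO_CAP = 5.0      # Fig. 3 caps plotted ratios at 5

def ratio_to_cutoff(value, cutoff, cap=RATIO_CAP):
    """Marker level relative to its cutoff; a ratio above 1 is elevated."""
    return min(value / cutoff, cap)

def panel_positive(sample, cutoffs=PANEL_CUTOFFS):
    """Positive if any panel marker exceeds its cutoff at this time point."""
    return any(sample[marker] > cutoff for marker, cutoff in cutoffs.items())

def first_positive_week(series, is_positive):
    """Week of the first positive time point, or None if never positive."""
    for week, sample in series:
        if is_positive(sample):
            return week
    return None

# Made-up longitudinal series for one patient: (week, marker levels).
series = [
    (0, {"HE4": 1.0, "Glycodelin": 4.0, "MMP7": 2.0, "CA125": 12.0}),
    (12, {"HE4": 2.6, "Glycodelin": 5.0, "MMP7": 2.5, "CA125": 20.0}),
    (24, {"HE4": 3.1, "Glycodelin": 12.0, "MMP7": 3.0, "CA125": 48.0}),
]

panel_week = first_positive_week(series, panel_positive)
ca125_week = first_positive_week(series, lambda s: s["CA125"] > CA125_CUTOFF)
lead_weeks = ca125_week - panel_week  # weeks by which the panel preceded CA125
print(panel_week, ca125_week, lead_weeks)
```

In this made-up series the panel turns positive one visit before CA125, which is the kind of lead time the monitoring study set out to measure.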
Results
Screening study
Demographics and clinical characteristics of individuals providing serum are listed in Table 1. ROC curves were constructed
Commercial assays. Commercial assays were executed per manufacturer's protocols: Glycodelin (BIOSERV Diagnostics, Rostock, Germany, Cat. # BS30-20); MMP7 (R&D Systems, Minneapolis, MN, Cat. # DMP700); PLAU-R (uPAR) (R&D Systems, Minneapolis, MN, Cat. # DUP00); Inhibin A (Diagnostic Systems Laboratories, Webster, TX, Cat. # DSL-10-28100). CA125 levels for serum samples included in the screening study were determined by testing performed by LabCorp (Burlington, NC) utilizing the ADVIA Centaur CA125II assay. A cutoff of 35 U/ml was utilized for all CA125 calculations. In the longitudinal monitoring study, review of clinical records determined CA125 levels corresponding to dates for which serum was analyzed.
Statistical analysis
Sensitivity and specificity for individual marker performance were determined utilizing two approaches: (1) as binary variables using the reference cohort with upper 2× Standard Deviation limit (2SD) as cutoff per current CLSI recommendations [28] and (2) as binary variables using the best cutoff point as determined utilizing receiver operating characteristic (ROC) curve analysis as follows: As the decision threshold of the test is varied (that is, as the cutoff point that separates normal subjects from disease subjects is changed), the sensitivity and specificity of the test also change. The best cutoff point, defined as the largest value of sensitivity plus specificity, was selected from each individual marker ROC curve. The ROC-GLM regression model described by Pepe [29] was performed using Stata (StataCorp, College Station, TX) to compare the sensitivity of each marker across cancer stage. Marker combinations were selected by AND/OR rules that resulted in the greatest non-redundant performance in either a one- or two-step assay format. In the two-step format, a first assay is performed to detect expression of a biomarker/panel of biomarkers.
The first assay step is designed to enrich disease “prevalence” in the patient population under review by eliminating a large percentage of true negatives. If the first biomarker/panel of biomarkers is over-expressed, a second assay step is performed to detect expression of a second biomarker/panel of biomarkers in the enriched population to more accurately identify true positive results. In the one-step assay, evaluated markers are analyzed simultaneously, with no prior enrichment. Sensitivity and specificity of multiple markers were determined for various marker combinations by logistic regression analysis using SAS (Cary, NC). Logistic regression analysis was performed by linearly combining measurements from each marker (1) using 2SD limit as cutoff (2SD); and (2) using the ROC curve Best cutoff. The ‘area under the curve’ comparison was performed as described [30]. A bootstrap validation procedure was conducted to assure a robust estimate of sensitivity and specificity given that a unique set of cancer sera was not available to validate the model. The bootstrap procedure consisted of artificially creating 1000 datasets, each of the same sample size as the original and drawn from it by selection with replacement. The 2-step assay format using the Best cutoff threshold reported in Table 3 was selected as the “Test” algorithm to distinguish either early or late stage ovarian cancer from normal.
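The two-step "Test" rule and the bootstrap procedure described above can be sketched together in Python. This is a minimal sketch under stated assumptions: the HE4 threshold (1.8 ng/ml) and the CA125 threshold (35 U/ml) are taken from the paper, whereas the Glycodelin, MUC1 and Plau-R cutoffs, the sample values, and the percentile method for the confidence interval are hypothetical choices made for illustration.

```python
import random

HE4_STEP1_CUTOFF = 1.8  # ng/ml, from the paper
CA125_CUTOFF = 35.0     # U/ml, from the paper
# Hypothetical cutoffs: the paper does not list Best cutoffs for these markers.
STEP2_CUTOFFS = {"Glycodelin": 10.0, "MUC1": 50.0, "PlauR": 3.0}

def two_step_positive(sample):
    """Step 1: HE4 must be elevated. Step 2: CA125 positive, or any two of
    Glycodelin, MUC1 and Plau-R positive."""
    if sample["HE4"] <= HE4_STEP1_CUTOFF:
        return False  # screened out at step 1; step 2 is never run
    if sample["CA125"] > CA125_CUTOFF:
        return True
    hits = sum(sample[m] > c for m, c in STEP2_CUTOFFS.items())
    return hits >= 2

def bootstrap_interval(flags, n_boot=1000, seed=0):
    """Percentile 95% interval for a proportion, resampling with replacement
    at the original sample size (the interval method is an assumption)."""
    rng = random.Random(seed)
    n = len(flags)
    rates = sorted(sum(rng.choices(flags, k=n)) / n for _ in range(n_boot))
    return rates[int(0.025 * n_boot)], rates[int(0.975 * n_boot) - 1]

# Made-up cancer sera for illustration only (not study data).
cases = [
    {"HE4": 3.0, "CA125": 80.0, "Glycodelin": 4.0, "MUC1": 20.0, "PlauR": 1.0},
    {"HE4": 2.5, "CA125": 20.0, "Glycodelin": 15.0, "MUC1": 60.0, "PlauR": 1.0},
    {"HE4": 1.2, "CA125": 90.0, "Glycodelin": 15.0, "MUC1": 60.0, "PlauR": 4.0},
    {"HE4": 2.2, "CA125": 10.0, "Glycodelin": 15.0, "MUC1": 20.0, "PlauR": 1.0},
]

flags = [two_step_positive(s) for s in cases]
sensitivity = sum(flags) / len(flags)
low, high = bootstrap_interval(flags)
print(sensitivity, (low, high))
```

Applying the same functions to control sera, and counting negatives rather than positives, would give the corresponding specificity estimate.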
Advanced ovarian cancer monitoring study
An evaluation of the utility of monitoring for disease recurrence was carried out using a subset of our marker panel, specifically HE4, MMP7 and Glycodelin. These markers were chosen based on their sensitivity for detection of Stage III ovarian cancer, with the least redundancy, in our screening study. This panel was evaluated in a longitudinal monitoring cohort for its ability to predict cancer recurrence following initial response to primary chemotherapy. 260 serum samples obtained
Fig. 2. Multi-marker ROC plots. Each data point reflects a unique marker combination.
Table 3
Sensitivity and specificity of multi-marker combinations

Assay    Cutoff   Step       Stage I/II (n = 133)         Stage III (n = 67)
format                       Sensitivity   Specificity    Sensitivity   Specificity
2 Step   2SD      Step 1     93.2%         73.7%          98.5%         73.7%
                  Step 2     79.0%         76%            86.0%         77%
                  Combined   73.7%         93.7%          84.6%         94.0%
2 Step   Best     Step 1     93.2%         73.7%          98.5%         73.7%
                  Step 2     82.3%         89.4%          85.9%         89.4%
                  Combined   76.7%         97.2%          84.6%         97.2%
1 Step   2SD      One step   78.9%         93.2%          86.2%         94.4%
1 Step   Best     One step   80.5%         96.5%          89.2%         97.2%

Marker combinations:
2 Step, 2SD: Step 1: HE4 > 1.8 ng/ml. Step 2: Test = ‘+’ if any one of CA125, Glycodelin, Plau-R = ‘+’.
2 Step, Best: Step 1: HE4 > 1.8 ng/ml. Step 2: Test = ‘+’ if CA125 = ‘+’ or any 2 of Glycodelin, Muc-1 and Plau-R = ‘+’.
1 Step, 2SD: Combination of CA125, HE4, Glycodelin, Plau-R, MMP7.
1 Step, Best: CA125, HE4, Glycodelin, Plau-R, Muc-1, PAI-1.

for each individual marker; curves for HE4, Glycodelin, CA125 and Plau-R are depicted in Fig. 1. The cutoff for each individual marker was applied to both early (I/II) and late (III) stage samples and the data are presented in Table 2. Individual marker performance for the detection of early ovarian cancer versus normal using the 2SD cutoff was: sensitivity 6.8–62.4%; specificity 75.8–98.5%. Using the Best cutoff, single marker performance for detection of early stage ovarian cancer was: sensitivity 45.1–82.7%; specificity 70.3–98.2%. HE4 demonstrated the highest sensitivity in identifying both early (62.4–82.7%) and late stage (74.6–92.5%) ovarian cancer regardless of which cutoff was used. A ROC-GLM regression model was applied to the data to determine whether any of the markers were significantly associated with a specific cancer stage. In this model, a positive coefficient with an accompanying positive 95% confidence interval indicates a statistically significant association. Two markers showed this association: SLPI with a coefficient value of 0.242 (95% CI 0.045–0.439), and CA125 with a coefficient value of 0.301 (95% CI 0.100–0.502). Thus, the sensitivity of these two markers significantly increases between early and late stage ovarian cancer patients. Using the data from individual biomarker analysis, various combinations of the evaluated markers were next investigated using both one-step and two-step analysis algorithms. Numerous marker combinations were evaluated comparing both one- and two-step assay algorithms using both the Best and 2SD cutoffs. ROC plots were constructed to illustrate the varying sensitivity and specificity couplings associated with various marker combinations (Fig. 2). Each data point on the multi-marker ROC curve denotes a specific marker combination. The performance of CA125 alone was plotted for reference. Multiple
marker combinations in both a one- or two-step assay format improved the identification of early stage ovarian cancer compared to CA125 alone. The sensitivity/specificity of specific biomarker combination algorithms ranged from 59.0%/99.7% to 80.5%/96.5% for detection of early stage ovarian cancer and from 76.9%/99.7% to 89.2%/97.2% for detection of late stage cancer. A representative example from each testing paradigm is shown in Table 3. Based on the observed sensitivity of HE4 in the above individual marker analysis, it was selected as the first-step test employed in the two-step assay format. A level of HE4 >1.8 ng/ml was considered positive and initiated step two testing. For the second assay step, various marker combinations were utilized to identify those providing the best results with the least redundancy. For example, using the 2SD cutoff, the second step was considered positive if CA125, Glycodelin, or Plau-R tested positive. The sensitivity for detection of early stage ovarian cancer was 73.7% with an accompanying specificity of 93.7%. For detection of late stage disease, the sensitivity and specificity of the two-step assay increased to 84.6% and 94.0%, respectively. With the Best cutoff, the second step of the algorithm was considered positive if CA125 was positive (>35 U/ml) or if any two of Glycodelin, MUC-1 or Plau-R tested positive. This algorithm gave a sensitivity of 76.7% with a specificity of 97.2% for the detection of early stage ovarian cancer, with sensitivity increasing to 84.6% for the detection of late stage ovarian cancer. For the one-step analysis, using the 2SD cutoff, a biomarker combination of CA125, HE4, Glycodelin, Plau-R and MMP7 resulted in a sensitivity of 78.9% with a specificity of 93.2% for the detection of early stage OC (Table 3). This biomarker panel showed an increased sensitivity and specificity of 86.2% and
Table 4
AUC comparison of multi-marker one-step model versus CA125

Cutoff   Marker combinations   Stage I/II (n = 133)                 Stage III (n = 67)
                               AUC with 95% CI          P-value     AUC with 95% CI          P-value
2 SD     One-step model        0.903 (0.8683, 0.9213)   0.0002      0.955 (0.9213, 0.9879)   0.0032
         CA125                 0.813 (0.7683, 0.8575)               0.897 (0.8534, 0.9404)
Best     One-step model        0.959 (0.9422, 0.9768)   <0.0001     0.965 (0.9278, 0.9996)   0.0028
         CA125                 0.813 (0.7683, 0.8575)               0.897 (0.8534, 0.9404)
94.4%, respectively, for the detection of late stage disease. For the Best cutoff, a panel containing CA125, HE4, Glycodelin, Plau-R, MUC-1 and PAI-1 gave a sensitivity and specificity of 80.5% and 96.5%, respectively, for the detection of early stage ovarian cancer. For the detection of late stage disease, this biomarker panel showed an increase in sensitivity to 89.2% with an accompanying increase in specificity to 97.2%. The increased performance of the one step, multi-marker panel over CA125
alone is further demonstrated by a significant increase in the area under each respective ROC curve (Table 4). A bootstrap method of analysis was conducted to ensure robust estimates of test characteristics. The Bootstrap validation analysis of the “Test” algorithm (Best cutoff threshold, 2-step assay as reported in Table 3) in early stage cancer resulted in sensitivity of 76.7%, with a standard deviation of 3.06 (95% CI 71.5–81.5%) and a specificity of 97.2%, with a
Fig. 3. Levels of HE4, Glycodelin, MMP7 and CA125 in all advanced ovarian cancer monitoring patients with available pre-treatment serum (n = 10). Each graph denotes relative biomarker level from pre-treatment through clinical recurrence. Week 0 = pre-operative serum levels; Blue vertical line = second look laparotomy, no pathologic evidence of disease; yellow vertical line = second look laparotomy, pathologic evidence of disease (always followed by further chemotherapy treatment); black vertical line = clinical recurrence based on physical exam, CT findings, and/or biopsy. For each marker, any ratio value greater than 1 is above its mean plus 2SD cutoff. Any ratio value greater than 5 is capped at 5 for graphing purposes only.
standard deviation of 0.83 (95% CI 95.7–98.5%). In late stage cancer the Bootstrap validation analysis resulted in a sensitivity of 84.2% (95% CI 75.0–92.4%) and a specificity of 97.2% (95% CI 95.4–98.7%). This analysis provides the range of performance for a specific multi-marker algorithm supporting the performance characteristics obtained from our test cohort.
Advanced ovarian cancer monitoring study
Sera and longitudinal data were available for 30 patients with advanced OC. Two patients were excluded because of insufficient clinical information regarding cancer status. A third patient with recurrent disease never achieved a response to chemotherapy. She subsequently developed both elevated CA125 and an elevated biomarker panel in the course of failed salvage regimens. Among 27 patients who experienced recurrence following initial response to treatment, sensitivity for predicting recurrence was 100% for the biomarker panel and 96% for CA125. In 15/27 (56%) patients, one or more panel biomarkers was elevated earlier (range 6–69 weeks) than CA125 and prior to other clinical evidence of recurrence. In 11 (41%) patients, CA125 rose within an equivalent time frame to the biomarker panel, and in one patient (4%), CA125 rose in advance of the biomarker panel. Fig. 3 depicts levels of CA125, HE4, MMP7 and Glycodelin in all patients (n = 10) whose serum was available prior to primary surgical exploration. Twelve patients underwent second look laparotomy procedures with normal CA125 levels and no clinical evidence of disease following primary chemotherapy. Eight of twelve had biopsy-proven persistence of cancer (positive second look) while four had no evidence of disease on multiple biopsies (negative second look). Among eight patients with positive second look, five (62%) had a positive biomarker panel predicting cancer persistence. Among four patients with negative second look, all had elevated biomarker panels.
Three patients with negative second look had clinical recurrence within ten months after the procedure, while the fourth recurred clinically 33 months later.
Discussion
Current limitations of biomarkers for ovarian cancer screening relate to the relatively poor sensitivity and specificity for detection of early stage disease. The influx of data from genomic and proteomic-based profiling studies of ovarian cancer may increase the likelihood that new biomarker combinations capable of detecting early stage disease will be identified [31–35]. The use of transcriptional profiling and proteomics has identified genes over-expressed in ovarian cancer at the mRNA level with detectable increase in serum protein concentration [36]. Candidate biomarkers evaluated in this study represent 3 major classes of protein function related to normal ovarian physiology and altered expression in cancer biology: (i) hormonal control of cellular proliferation/ovarian physiology (PAEP-glycodelin and Inhibin) [26,37–39]; (ii) extra-cellular mucins/epithelial cell signaling (CA125; MUC-1) [40,41] and (iii) control of proteolysis/
coagulation/fibrinolytic pathways (MMP-7; SLPI; HE-4; PAI-1 and Plau-R) [22,23,42–45]. Prior studies have investigated combinations of biomarkers and statistical models for predicting ovarian cancer. For example, Skates et al. [31] reported that a mixture discriminant analysis model using the combination of CA125II, CA72-4, and M-CSF distinguished serum of patients with stage I/II ovarian cancer from that of healthy controls with 70% sensitivity and 98% specificity. Likewise, Zhang et al. [16] developed a biomarker panel based on proteomic data with sensitivity 74% and specificity 97% to detect stage I/II ovarian cancer. While these and other potential screening paradigms with comparable sensitivities and specificities have been previously reported, these assays often require specialized equipment not routinely utilized in the clinical immunoassay laboratory or rely on complex computational algorithms to generate adequate assay performance. The multi-marker data summarized in our ROC plots indicate comparable and, with certain marker combinations, superior performance. Our study utilized only standard ELISA assays and routine cutoff threshold determinations including the Upper 2× Standard Deviation limit consistent with current CLSI C28 recommendations. Investigation of these marker combinations and testing paradigms in additional patient cohorts, coupled with further assay optimization, may further refine this testing approach. While all samples were collected and processed using the same procedure, an inherent limitation with any multi-center retrospective study is potential variation of patient state at the time of collection. A prospective study that includes patients with benign gynecologic conditions is currently underway to further assess the robustness of our test paradigm and marker combinations. Our data demonstrate the potential utility of employing either a one- or two-step assay design in conjunction with various marker combinations.
The multi-marker ROC curves indicate that both of these assay approaches demonstrate similar performance in our test cohort. However, the two-step assay design may help to overcome the performance hurdles associated with the low prevalence of ovarian cancer. The ideal two-step algorithm maximizes sensitivity and negative predictive value in the first step, eliminating many true negatives. The second step maximizes specificity, reducing false positives and raising overall positive predictive value. Additionally, in the two-step model the vast majority of women would test as true negatives after the first test, substantially decreasing the number of patients who would require the second test. This two-step approach would provide various economic and logistical advantages to such a screening modality. In the current ELISA-based format, the analysis of a multi-marker one-step assay would be logistically and economically inferior to the two-step assay format. The use of a multiplex analysis platform, however, would make all multi-marker analysis more feasible for clinical laboratories. We utilized HE4 as the first step of the two-step screening test due to the relatively high sensitivity of this biomarker for detection of ovarian cancer using our in-house developed antibody. HE4 has previously been reported to have an advantage over CA125 [23] and to complement the expression of CA125 [46] in the detection of epithelial ovarian cancer from
serum. More recently, HE4 was reported to have the highest sensitivity of a panel of nine biomarkers for the detection of ovarian cancer, especially Stage I disease, in patients with a pelvic mass [47]. With an ovarian cancer prevalence of only 1 in 2500 among postmenopausal women in the United States, an effective screening strategy for the general population needs to attain a sensitivity of 75% and a specificity of about 99.7% to achieve a minimally acceptable positive predictive value (PPV) of 10% for the detection of all stages of ovarian cancer. No single biomarker reported to date has met these thresholds, nor did any of the individual biomarkers examined in this study. We analyzed various combinations of the nine biomarkers for their ability to detect early stage ovarian cancer. Using this approach, we came close to these benchmarks, attaining a specificity of 96.5% with a high sensitivity of 80.5% for the detection of early stage ovarian cancer, with higher values obtained for the detection of late stage ovarian cancer. Lower specificity may be acceptable in higher risk populations, such as women with a strong family history of ovarian cancer, or within a population of women presenting with a pelvic mass. To estimate the potential PPV and negative predictive value (NPV) of our data, we utilized a target high risk population prevalence of 0.25%, approximately five times the prevalence in the general population. This high risk asymptomatic population is defined by any of the following criteria: breast and ovarian cancer syndrome, known BRCA1/2 carrier, first/second degree relative with ovarian cancer, personal history of breast cancer, or hereditary non-polyposis colorectal cancer (HNPCC). Two multi-marker combinations that resulted in early stage cancer detection with sensitivity/specificity of 59.0%/99.7% and 76.7%/97.2%, respectively, were used to estimate PPV and NPV in our hypothetical high risk population.
Under these conditions, the early stage cancer PPV would range from 6.42% to 33.02% and the NPV would range from 97.20% to 99.70%. We feel that the addition of subsequent transvaginal sonography for our test positives may further increase the overall PPV for the detection of ovarian cancer. In our pilot assessment of three candidate biomarkers for the monitoring of ovarian cancer disease status, the subset biomarker panel (HE4, MMP7, Glycodelin) predicted disease recurrence prior to elevation of CA125 in 56% of cases and in an equivalent time frame to CA125 in 41% of cases. Lead time compared with CA125 elevation appeared to be favorable at 6 to 69 weeks. While there are no data to suggest that earlier detection of recurrence leads to improved survival among patients with ovarian cancer, salvage therapies may improve progression-free survival (PFS), an important endpoint in the care of patients with recurrent ovarian cancer. We found that among 6 patients with positive second-look procedures, the biomarker panel predicted residual disease in 50%, compared to 0% for CA125. A potential weakness is our inability to evaluate the specificity of the biomarker panel in this setting, due to the lack of available true negatives. As such, we consider these results preliminary and subject to further evaluation. The efficacy of longitudinal serum monitoring using a biomarker panel would ideally be evaluated prospectively to determine the role of such tests in the setting of possible disease recurrence.
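As a rough check on the predictive-value estimates discussed above, PPV can be computed from sensitivity, specificity, and prevalence using Bayes' rule. The following Python sketch is illustrative only and is not part of the study's analysis; the function name `ppv` and the variable names are assumptions introduced here. It takes the general-population benchmark (prevalence 1 in 2500, sensitivity 75%, specificity 99.7%) and the two early stage operating points at the assumed high risk prevalence of 0.25%.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule.

    All arguments are fractions in [0, 1]. Hypothetical helper for
    illustration; not taken from the study's analysis code.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# General-population benchmark quoted in the text:
# prevalence 1 in 2500, sensitivity 75%, specificity 99.7%.
general_ppv = ppv(0.75, 0.997, 1 / 2500)

# Hypothetical high risk population (prevalence 0.25%) with the two
# reported early stage sensitivity/specificity operating points.
high_risk_ppv = {
    "59.0%/99.7%": ppv(0.590, 0.997, 0.0025),
    "76.7%/97.2%": ppv(0.767, 0.972, 0.0025),
}

if __name__ == "__main__":
    print(f"General population PPV: {general_ppv:.2%}")
    for label, value in high_risk_ppv.items():
        print(f"High risk PPV at {label}: {value:.2%}")
```

Under these inputs the sketch gives approximately 9% for the general-population benchmark and approximately 33% and 6.4% for the two high risk operating points, in line with the PPV range quoted above. The large gap between the two high risk values illustrates how strongly PPV depends on specificity when prevalence is low.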
In conclusion, we have evaluated unique combinations of nine candidate biomarkers using both novel monoclonal antibody clones in sandwich ELISAs and commercially available tests, established normal cutoffs for these markers in a healthy reference population, and derived individual test characteristics for distinguishing the serum of patients with ovarian cancer from healthy controls. We have developed and demonstrated the utility of several one- and two-step multi-marker algorithms with acceptable test characteristics for possible use in an ovarian cancer screening population. We have also identified a subset of the biomarker panel that may have utility to monitor disease status in patients with recurrent ovarian cancer.

Conflict of interest statement
This study was sponsored by BD-TriPath Oncology. The following authors are employees of BD-TriPath: CW, RC, JG, QH, DM, and TF. AB has performed consultant services for BD-TriPath for which he received honoraria. LH has served on the BD-TriPath gynecologic scientific advisory board without compensation and is the PI for a clinical trial sponsored by BD-TriPath.
Acknowledgments
The authors would like to acknowledge Malena Sansbury for her technical assistance and Charlotte Brown for her editing contributions.

References
[1] Jemal A, Siegel R, Ward E, Murray T, Xu J, Smigal C, et al. Cancer statistics, 2006. CA Cancer J Clin Mar–Apr 2006;56(2):106–30. [2] Schink JC. Current initial therapy of stage III and IV ovarian cancer: challenges for managed care. Sem Oncol Feb 1999;26(1 Suppl 1):2–7. [3] Bast Jr RC, Feeney M, Lazarus H, Nadler LM, Colvin RB, Knapp RC. Reactivity of a monoclonal antibody with human ovarian carcinoma. J Clin Invest 1981;68(5):1331–7. [4] Rustin GJ, Bast Jr RC, Kelloff GJ, Barrett JC, Carter SK, Nisen PD, et al. Use of CA-125 in clinical trial evaluation of new therapeutic drugs for ovarian cancer. Clin Cancer Res 1 Jun 2004;10(11):3919–26. [5] Jacobs I, Oram D, Fairbanks J, Turner J, Frost C, Grudzinskas JG. A risk of malignancy index incorporating CA 125, ultrasound and menopausal status for the accurate preoperative diagnosis of ovarian cancer. Br J Obstet Gynaecol Oct 1990;97(10):922–9. [6] van Nagell Jr JR, DePriest PD, Reedy MB, Gallion HH, Ueland FR, Pavlik EJ, et al. The efficacy of transvaginal sonographic screening in asymptomatic women at risk for ovarian cancer. Gynecol Oncol 2000;77(3):350–6. [7] Menon U, Skates SJ, Lewis S, Rosenthal AN, Rufford B, Sibley K, et al. Prospective study using the risk of ovarian cancer algorithm to screen for ovarian cancer. J Clin Oncol 1 Nov 2005;23(31):7919–26. [8] Bast Jr RC. Status of tumor markers in ovarian cancer screening. J Clin Oncol 15 May 2003;21(10 Suppl):200s–5s. [9] Gadducci A, Ferdeghini M, Prontera C, Moretti L, Mariani G, Bianchi R, et al. The concomitant determination of different tumor markers in patients with epithelial ovarian cancer and benign ovarian masses: relevance for differential diagnosis. Gynecol Oncol Feb 1992;44(2):147–54. [10] Xu FJ, Ramakrishnan S, Daly L, Soper JT, Berchuck A, Clarke-Pearson D, et al.
Increased serum levels of macrophage colony-stimulating factor in ovarian cancer. Am J Obstet Gynecol Nov 1991;165(5 Pt 1):1356–62. [11] Xu Y, Shen Z, Wiper DW, Wu M, Morton RE, Elson P, et al. Lysophosphatidic acid as a potential biomarker for ovarian and other gynecologic cancers. JAMA 26 Aug 1998;280(8):719–23. [12] Kim JH, Skates SJ, Uede T, Wong KK, Schorge JO, Feltmate CM, et al. Osteopontin as a potential diagnostic biomarker for ovarian cancer. JAMA 3 Apr 2002;287(13):1671–9.
[13] Healy DL, Burger HG, Mamers P, Jobling T, Bangah M, Quinn M, et al. Elevated serum inhibin concentrations in postmenopausal women with ovarian tumors. N Engl J Med 18 Nov 1993;329(21):1539–42. [14] Yousef GM, Polymeris ME, Yacoub GM, Scorilas A, Soosaipillai A, Popalis C, et al. Parallel overexpression of seven kallikrein genes in ovarian cancer. Cancer Res 1 May 2003;63(9):2223–7. [15] Xu FJ, Yu YH, Daly L, Anselmino L, Hass GM, Berchuck A, et al. OVX1 as a marker for early stage endometrial carcinoma. Cancer 1 Apr 1994;73(7): 1855–8. [16] Zhang Z, Yu Y, Xu F, Berchuck A, van Haaften-Day C, Havrilesky LJ, et al. Combining multiple serum tumor markers improves detection of stage I epithelial ovarian cancer. Gynecol Oncol Dec 2007;107(3):526–31. [17] Shridhar V, Lee J, Pandita A, Iturria S, Avula R, Staub J, et al. Genetic analysis of early-versus late-stage ovarian tumors. Cancer Res 1 Aug 2001;61(15):5895–904. [18] Begum FD, Hogdall CK, Kjaer SK, Christensen L, Blaakaer J, Bock JE, et al. The prognostic value of plasma soluble urokinase plasminogen activator receptor (suPAR) levels in stage III ovarian cancer patients. Anticancer Res May–Jun 2004;24(3b):1981–5. [19] Hough CD, Sherman-Baust CA, Pizer ES, Montz FJ, Im DD, Rosenshein NB, et al. Large-scale serial analysis of gene expression reveals genes differentially expressed in ovarian cancer. Cancer Res 15 Nov 2000;60(22): 6281–7. [20] Hough CD, Cho KR, Zonderman AB, Schwartz DR, Morin PJ. Coordinately up-regulated genes in ovarian cancer. Cancer Res 15 May 2001;61(10): 3869–76. [21] Schwartz DR, Kardia SL, Shedden KA, Kuick R, Michailidis G, Taylor JM, et al. Gene expression in ovarian cancer reflects both morphology and biological behavior, distinguishing clear cell from other poor-prognosis ovarian carcinomas. Cancer Res 15 Aug 2002;62(16):4722–9. [22] Drapkin R, von Horsten HH, Lin Y, Mok SC, Crum CP, Welch WR, et al. 
Human epididymis protein 4 (HE4) is a secreted glycoprotein that is overexpressed by serous and endometrioid ovarian carcinomas. Cancer Res 15 Mar 2005;65(6):2162–9. [23] Hellstrom I, Raycraft J, Hayden-Ledbetter M, Ledbetter JA, Schummer M, McIntosh M, et al. The HE4 (WFDC2) protein is a biomarker for ovarian carcinoma. Cancer Res 1 Jul 2003;63(13):3695–700. [24] Tsukishiro S, Suzumori N, Nishikawa H, Arakawa A, Suzumori K. Use of serum secretory leukocyte protease inhibitor levels in patients to improve specificity of ovarian cancer diagnosis. Gynecol Oncol Feb 2005;96(2): 516–9. [25] Sier CF, Stephens R, Bizik J, Mariani A, Bassan M, Pedersen N, et al. The level of urokinase-type plasminogen activator receptor is increased in serum of ovarian cancer patients. Cancer Res 1 May 1998;58(9): 1843–9. [26] Robertson DM, Stephenson T, Pruysers E, Burger HG, McCloud P, Tsigos A, et al. Inhibins/activins as diagnostic markers for ovarian cancer. Mol Cell Endocrinol 31 May 2002;191(1):97–103. [27] Kilpatrick KE, Wring SA, Walker DH, Macklin MD, Payne JA, Su JL, et al. Rapid development of affinity matured monoclonal antibodies using RIMMS. Hybridoma Aug 1997;16(4):381–9. [28] NCCLS. How to define and determine reference intervals in the clinical laboratory; approved guideline-second edition. NCCLS. 2000(C28-A2 [ISBN 1-56238-406-6]). [29] Sullivan Pepe M. Modeling covariate effects on ROC curves. The statistical evaluation of medical tests for classification and prediction. Oxford University Press; 2003. p. 151–63. [30] DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under
two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics Sep 1988;44(3):837–45. [31] Skates SJ, Horick N, Yu Y, Xu FJ, Berchuck A, Havrilesky LJ, et al. Preoperative sensitivity and specificity for early-stage ovarian cancer when combining cancer antigen CA-125II, CA 15-3, CA 72-4, and macrophage colony-stimulating factor using mixtures of multivariate normal distributions. J Clin Oncol 15 Oct 2004;22(20):4059–66. [32] Kozak KR, Amneus MW, Pusey SM, Su F, Luong MN, Luong SA, et al. Identification of biomarkers for ovarian cancer using strong anion-exchange ProteinChips: potential use in diagnosis and prognosis. Proc Natl Acad Sci U S A 14 Oct 2003;100(21):12343–8. [33] Ardekani AM, Liotta LA, Petricoin III EF. Clinical potential of proteomics in the diagnosis of ovarian cancer. Expert Rev Mol Diagn 2002;2(4):312–20. [34] Lancaster JM, Dressman HK, Whitaker RS, Havrilesky L, Gray J, Marks JR, et al. Gene expression patterns that characterize advanced stage serous ovarian cancers. J Soc Gynecol Investig Jan 2004;11(1):51–9. [35] Hibbs K, Skubitz KM, Pambuccian SE, Casey RC, Burleson KM, Oegema Jr TR, et al. Differential gene expression in ovarian carcinoma: identification of potential biomarkers. Am J Pathol Aug 2004;165(2):397–414. [36] Meinhold-Heerlein I, Bauerschlag D, Zhou Y, Sapinoso LM, Ching K, Frierson Jr H, et al. An integrated clinical-genomics approach identifies a candidate multi-analyte blood test for serous ovarian carcinoma. Clin Cancer Res 15 Jan 2007;13(2 Pt 1):458–66. [37] El-Shalakany A, Abou-Talib Y, Shalaby HS, Sallam M. Preoperative serum inhibin levels in patients with ovarian tumors. J Obstet Gynaecol Res Apr 2004;30(2):155–61. [38] Seppala M, Taylor RN, Koistinen H, Koistinen R, Milgrom E. Glycodelin: a major lipocalin protein of the reproductive axis with diverse actions in cell recognition and differentiation. Endocr Rev Aug 2002;23(4):401–30. [39] Seppala M, Koistinen H, Koistinen R. Glycodelins. Trends Endocrinol Metab Apr 2001;12(3):111–7. [40] Taylor-Papadimitriou J, Burchell J, Miles DW, Dalziel M. MUC1 and cancer. Biochim Biophys Acta 8 Oct 1999;1455(2–3):301–13. [41] Duffy MJ, Bonfrer JM, Kulpa J, Rustin GJ, Soletormos G, Torre GC, et al. CA125 in ovarian cancer: European Group on Tumor Markers guidelines for clinical use. Int J Gynecol Cancer Sep–Oct 2005;15(5):679–91. [42] Wang FQ, Smicun Y, Calluzzo N, Fishman DA. Inhibition of matrilysin expression by antisense or RNA interference decreases lysophosphatidic acid-induced epithelial ovarian cancer invasion. Mol Cancer Res Nov 2006;4(11):831–41. [43] Rosen DG, Wang L, Atkinson JN, Yu Y, Lu KH, Diamandis EP, et al. Potential markers that complement expression of CA125 in epithelial ovarian cancer. Gynecol Oncol Nov 2005;99(2):267–77. [44] Bouchard D, Morisset D, Bourbonnais Y, Tremblay GM. Proteins with whey-acidic-protein motifs and cancer. Lancet Oncol Feb 2006;7(2):167–74. [45] Zacharski LR, Memoli VA, Ornstein DL, Rousseau SM, Kisiel W, Kudryk BJ. Tumor cell procoagulant and urokinase expression in carcinoma of the ovary. J Natl Cancer Inst 4 Aug 1993;85(15):1225–30. [46] Scholler N, Crawford M, Sato A, Drescher CW, O'Briant KC, Kiviat N, et al. Bead-based ELISA for validation of ovarian cancer early detection markers. Clin Cancer Res 1 Apr 2006;12(7 Pt 1):2117–24. [47] Moore RG, Brown AK, Miller MC, Skates S, Allard WJ, Verch T, et al. The use of multiple novel tumor biomarkers for the detection of ovarian carcinoma in patients with a pelvic mass. Gynecol Oncol Feb 2008;108(2):402–8.