Vol. 160, 430-436, August 1998 Printed in U.S.A.
NOVEL STAGING TOOL FOR LOCALIZED PROSTATE CANCER: A PILOT STUDY USING GENETIC ADAPTIVE NEURAL NETWORKS
ASHUTOSH TEWARI AND PERINCHERY NARAYAN
From the Division of Urology, University of Florida and Department of Veterans Affairs Medical Center, Gainesville, Florida
ABSTRACT
Purpose: An estimated $1.5 billion is spent annually for direct medical expenses and an additional $2.5 billion for indirect costs for the management of prostate cancer. Today there are several procedures for staging prostate cancer, including lymph node dissection. Despite these procedures, the accuracy of predicting extracapsular disease remains low (range 37 to 63, mean 45%). Use of multiple staging procedures adds significantly to the costs of managing prostate cancer. Recently artificial intelligence based neural networks have become available for medical applications. Unlike traditional statistical methods, these networks do not assume linearity or homogeneity of variance and, thus, they are more accurate for clinical data. We applied this concept to staging localized prostate cancer and devised an algorithm that can be used for prostate cancer staging.
Materials and Methods: Our study comprised 1,200 men with clinically organ confined prostate cancer who underwent preoperative staging using serum prostate specific antigen, systematic biopsy and Gleason scoring before radical prostatectomy and lymphadenectomy. The performance of the neural network was validated for a subset of patients and network predictions were compared with actual pathological stage. Mean patient age was 62.9 years, mean serum prostate specific antigen 8.1 ng./ml. and mean biopsy Gleason 6. Of the patients 55% had organ confined disease, 27% positive margins, 8% seminal vesicle involvement and 7% lymph node disease. Of margin positive patients 30% also had seminal vesicle involvement, while of seminal vesicle positive patients 50% also had positive margins.
Results: The sensitivity of the network was 81 to 100%, and specificity was 72 to 75% for various predictions of margin, seminal vesicle and lymph node involvement.
The negative predictive values tended to be relatively high for all 3 features (range 92 to 100%). The neural network missed only 8% of patients with margin positive disease, and 2% with lymph node and 0% with seminal vesicle involvement.
Conclusions: Our study suggests that neural networks may be useful as an initial staging tool for detection of extracapsular extension in patients with clinically organ confined prostate cancer. These networks preclude unnecessary staging tests for 63% of patients with clinically organ confined prostate cancer.
KEY WORDS: neural networks (computer), artificial intelligence, prostatic neoplasms, cost-effective analysis
Accepted for publication February 13, 1998.
Presented at annual meeting of Southeastern Section, American Urological Association, Naples, Florida, March 8-11, 1997 and annual meeting of American Urological Association, New Orleans, Louisiana, April 13-17, 1997.
Prostate cancer is 1 of the most common cancers in men, comprising approximately 33% of all cancers. In 1997, 334,500 new patients were estimated to be diagnosed with prostate cancer.1 Of these patients an estimated 50% (167,250) will undergo radical prostatectomy.1,2 Statistics indicate that $1.5 billion are spent for direct medical expenses and an additional $2.5 billion for indirect costs for the management of prostate cancer. The costs were even greater in 1997 since the number of diagnosed cases increased, and costs are anticipated to increase further in the next few years due to the steep rise in population of patients older than 65 years. Inaccurate staging adds significantly to the cost of managing prostate cancer. Therefore, it is imperative to construct accurate and cost-effective staging tools.1,2
Today radical prostatectomy is clearly the most effective treatment for prostate cancer, resulting in disease-free survival at 10 years for up to 85% of patients. However, cure is more likely if cancer is organ confined.5,6 Several long-term studies of radical prostatectomy show that biochemical failure at 10 years occurs in as many as 58% of patients with positive margins, and 57% with seminal vesicle positive and 100% with lymph node disease.7-15 Therefore, it is essential to stage prostate cancer accurately and avoid radical interventions in potentially incurable patients. However, understaging now occurs in 40 to 60% of clinically localized prostate cancer patients. Margin positive disease remains the biggest dilemma that confronts clinicians who attempt surgical management of prostate cancer. In recent studies the incidence of margin positive disease after radical prostatectomy was high (range 37 to 63, mean 45%).16-18 Mean incidence of margin positivity and seminal vesicle involvement is 17 to 47%, while nodal involvement is 2 to 10.5%.7,11,16-18
Although an overwhelming number of staging studies are currently available to the clinician, including digital rectal examination, prostate specific antigen (PSA), prostate specific antigen density, sextant prostate biopsy, molecular staging (polymerase chain reaction PSA), endorectal magnetic resonance imaging (MRI), pelvic computerized tomography (CT), CT guided lymph node biopsy, antigen directed monoclonal antibodies, bone scintigraphy and staging lymph node dissection, none is accurate individually for making critical management decisions. Due to this lack of sensitivity and specificity many patients
undergo multiple staging procedures, which add significantly to the cost of managing prostate cancer. Because of the need for strategies to reduce staging cost several investigators have studied this issue. Oesterling reported on the role of serum PSA in reducing the need for bone scintigraphy.2,19 Several nomograms have been proposed to predict probabilities of extracapsular and metastatic disease.20-25 However, these nomograms are variable and not highly accurate.26 We developed a neural network for staging prostate cancer, which in initial studies appears to be superior to current modalities.27,28 We present the results of prostate cancer staging using a neural network analysis of 1,200 patients with clinically localized prostate cancer.
PATIENTS AND METHODS
Patients and preoperative evaluation. The study population comprised 1,200 patients with prostate cancer who underwent pelvic lymphadenectomy and radical prostatectomy at 2 university hospitals and 2 Veterans Affairs medical centers. Preoperatively all patients underwent 3 or more biopsies from each prostate lobe using ultrasound guidance in addition to lesion directed biopsy if suspicious areas were noted. All patients underwent serum PSA estimation before biopsy using the Hybritech Tandem R* 2-site radioimmunoassay. If more than 1 PSA was available the highest serum PSA before biopsy was used for analysis. Patients were excluded from study if they did not undergo transrectal ultrasound guided biopsy, or if they had received preoperative hormonal therapy, radiotherapy or cryotherapy.
Assessment of Clinical Extent of Cancer: Based on biopsy results patients were classified as having T1a disease if cancer was diagnosed on transurethral prostatic resection and occupied less than 5% of tissue, T1b if diagnosed incidentally and occupied greater than 5% of tissue, T1c when diagnosed only on the basis of elevated PSA, T2a-b if biopsies showed 1 lobe positive, T2c if biopsies from both lobes were positive and T3 if seminal vesicle biopsy was positive for prostate cancer. Biopsy findings were reported using the Gleason grading system. The highest biopsy score was assigned if the field contained more than 1 Gleason score.
Operative Protocol and Histopathology: Radical retropubic prostatectomy was performed on all patients using the standard technique of Walsh.29-31 Nerve sparing prostatectomy was reserved for patients with unilateral cancers. Frozen section analyses of lymph nodes were performed at the discretion of the operating surgeon when nodes appeared clinically suspicious intraoperatively. If the lymph nodes were positive on frozen section biopsy, radical prostatectomy was abandoned.
All lymph node samples were subjected to permanent section. After removal specimens were coated with india ink, weighed and measured in the anteroposterior, cephalocaudal and transverse dimensions. Prostates were embedded in their entirety and fixed in 10% formalin for 18 to 24 hours. After fixation the distal and proximal urethral margins were removed for histological examination. The prostate and seminal vesicles were step sectioned at 3 to 5 mm. intervals perpendicular to the long axis of the gland, and each section was examined histologically. Pathological stage was reported as organ confined, extracapsular extension with or without positive surgical margins, and/or seminal vesicle involvement, and/or involvement of lymph nodes. Gleason score was assigned based on the area of most aggressive cancer.
* Hybritech, San Diego, California.
Neural network methodology. The data were collected using a spreadsheet on an IBM compatible personal computer. Standard statistical computations of mean, median and standard deviations were done. Based on the need to classify the noisy clinical data into groups, several neural network architectures were tested to determine appropriateness of prediction model.
After experimenting with several simpler back propagation networks a probabilistic neural network was chosen.32-45 Probabilistic neural networks are based on Bayes decision theorem and nonparametric statistics to calculate the probability density function.46 These networks learn pattern statistics from a training set, which may be in terms of global or local basis functions. The global basis function is defined as nonlinear (usually sigmoidal) functions of the distance of the pattern vector from a hyperplane. The function to be approximated is defined as a combination of these sigmoidal functions. Probabilistic neural networks learn in basic and adaptive forms. The basic forms are characterized by 1 pass learning and the use of the same width for the basis function for all dimensions of the measurement space. On the other hand, adaptive probabilistic neural networks are characterized by adapting separate widths for the basis function for each dimension. Because this adaptation is iterative it sacrifices the 1 pass learning of the basic forms but achieves better generalization accuracy.45,47,48 Probabilistic neural networks are essentially a 3-layer network in which the training patterns are presented to the input layer and the output layer has 1 neuron for each possible category. There must be as many neurons in the hidden layer as there are training patterns. The network produces activations in the output layer, which correspond to the probability density function estimate for that category. The highest output represents the most probable category. Probabilistic neural networks are known for their ability to separate data into a specified number of output categories.47-49
Genetic Adaptive Algorithm: We used a genetic algorithm to find appropriate individual smoothing factors for each input as well as an overall smoothing factor.
The input smoothing factor is an adjustment used to modify the overall smoothing factor to provide a new value for each input.36,50 Training, which uses a genetic adaptive algorithm, proceeds in 2 parts. The first part trains the network with the data in the training set and the second uses calibration to test a whole range of smoothing factors to home in on the 1 combination that works best on the test set with the network created in the first part. The genetic algorithm looks for a smoothing factor multiplier for each input variable.36,50,51
Training, Testing and Validation: Of 1,200 patients we used data of an average of 40% for training, 30% for testing and 20% as production sets for validation. These sets were selected on the basis of random numbers generated by the computer. The training was done using commercially available software. During training the network calculated the best smoothing factor and tested its performance against the test set data. Overtraining of the network was avoided by calculating the root-mean-square error after every generation.32,35,37,38,55-59 The accuracy of the network was assessed on the basis of its performance with the production set. The validation set data (production set) were never presented to the neural network during training. Based on the input variables the network classified the output variable as positive or negative for a pathological feature. The output variables included margin positivity, seminal vesicle involvement and lymph node disease. Network output was compared with actual pathological stage and results were grouped as right or wrong.
The accuracy, sensitivity, specificity, and positive and negative predictive values were calculated based on this production set.37,38,58,59
Variable Selection: The preoperative information used for neural network training included race, age, rectal examination findings, size of tumor on ultrasound, serum PSA, biopsy Gleason score, systematic biopsy based staging information, such as bilaterality of cancer and number of positive cores (out of 6), and perineural infiltration data. Outcome variables were margin, seminal vesicle and lymph node positivity.
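The adaptive probabilistic neural network and the genetic search for smoothing factors described above can be sketched as follows. This is a minimal illustration and not the commercial software used in the study: the Gaussian kernel, the equal class priors, the crossover and mutation operators, the population settings and the synthetic 2-input data are all assumptions introduced for the example, and every function name is hypothetical.

```python
import math
import random


def pnn_scores(x, train_x, train_y, sigmas):
    """Parzen-window score per class for input x.

    Every training pattern is one pattern-layer unit with a Gaussian kernel.
    sigmas holds one smoothing factor per input (the adaptive form); equal
    class priors are assumed, so each class score is a mean over its patterns.
    """
    totals, counts = {}, {}
    for pattern, label in zip(train_x, train_y):
        d2 = sum(((a - b) / s) ** 2 for a, b, s in zip(x, pattern, sigmas))
        totals[label] = totals.get(label, 0.0) + math.exp(-0.5 * d2)
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def pnn_predict(x, train_x, train_y, sigmas):
    """Output layer: pick the category with the highest score."""
    scores = pnn_scores(x, train_x, train_y, sigmas)
    return max(scores, key=scores.get)


def accuracy(sigmas, train_x, train_y, test_x, test_y):
    """Fraction of test patterns classified correctly."""
    hits = sum(pnn_predict(x, train_x, train_y, sigmas) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)


def evolve_sigmas(train_x, train_y, test_x, test_y,
                  pop_size=12, generations=8, seed=0):
    """Toy genetic search for per-input smoothing factors (fitness = test set accuracy)."""
    rng = random.Random(seed)
    n_inputs = len(train_x[0])

    def fitness(sigmas):
        return accuracy(sigmas, train_x, train_y, test_x, test_y)

    population = [[rng.uniform(0.1, 3.0) for _ in range(n_inputs)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]      # fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mom, dad)]   # uniform crossover
            child = [max(0.05, s * math.exp(rng.gauss(0.0, 0.2)))  # multiplicative mutation
                     for s in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


def make_toy(n, rng):
    """Synthetic 2-input data (not patient data): input 0 separates the classes, input 1 is noise."""
    xs, ys = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        xs.append([rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0)])
        ys.append(label)
    return xs, ys


data_rng = random.Random(1)
train_x, train_y = make_toy(40, data_rng)
test_x, test_y = make_toy(30, data_rng)
best_sigmas = evolve_sigmas(train_x, train_y, test_x, test_y)
print(best_sigmas, accuracy(best_sigmas, train_x, train_y, test_x, test_y))
```

Because the smoothing factors are tuned on the test set, the accuracy printed here is optimistic. As in the study design, an unbiased estimate would come from a separate production set that plays no part in the search.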
Statistical methods for assessing diagnostic performance of neural network predictions of prostate pathology. Data were collected using computer software. A 120 MHz. Pentium* Gateway† personal computer with 32 megabytes random access memory and 1 gigabyte hard drive was used for network training, which took about 12 to 14 hours in a genetic breeding pool of 300 patients. For each pathological feature a trained neural network algorithm was used to generate binary predictions (based on Bayes theorem) and continuous probability predictions in regard to the presence of the feature in each patient within a group of patients of known status (that is positive or negative for the feature). Diagnostic performance summary statistics, including sensitivity, specificity, sample specific positive and negative predictive values, and overall percentage of patients correctly classified (Appendix 1), were computed for the binary predictions with 1-tailed lower 95% confidence intervals for sensitivity and specificity. Rank correlation was used to estimate the area under the observed receiver operating characteristics (ROC) curve. The area under the curve, expressed as a proportion of the area under the ROC curve of a perfectly performing predictor, can be considered a relative measure of the information content provided by a continuous predictor of binary outcomes.
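The summary statistics named above can be illustrated with a short sketch. The text does not specify the exact rank-based computation, so the pairwise (Mann-Whitney) estimator of the area under the ROC curve shown here is an assumption, as are the function names and the 8 hypothetical patients. The sketch also assumes every denominator is nonzero.

```python
def diagnostic_summary(predicted, actual):
    """Sensitivity, specificity, predictive values and accuracy for binary labels (1 = feature present)."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == 1 and a == 1 for p, a in pairs)   # true positives
    tn = sum(p == 0 and a == 0 for p, a in pairs)   # true negatives
    fp = sum(p == 1 and a == 0 for p, a in pairs)   # false positives
    fn = sum(p == 0 and a == 1 for p, a in pairs)   # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(pairs),
    }


def rank_auc(scores, actual):
    """Area under the ROC curve as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, a in zip(scores, actual) if a == 1]
    neg = [s for s, a in zip(scores, actual) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical continuous predictions for 8 patients of known status.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1, 0.05]
predicted = [1 if s >= 0.5 else 0 for s in scores]   # binary call at a 0.5 cutoff

summary = diagnostic_summary(predicted, actual)
auc = rank_auc(scores, actual)
print(summary, auc)
```

For this toy sample the binary predictions give 2 true positives, 1 false negative, 1 false positive and 4 true negatives, and 13 of the 15 positive/negative pairs are ranked correctly.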
* Intel Corp., Santa Clara, California.
† Gateway 2000, North Sioux City, South Dakota.
RESULTS
Mean patient age was 62.9 years, mean serum PSA 8.1 ng./ml. and mean biopsy Gleason 6. Table 1 summarizes the prevalence of various clinical stages and corresponding pathological features. Of the patients 55% (660) had organ confined disease, 27% (318) positive margins, 8% (98) seminal vesicle involvement and 7% (88) lymph node disease. Of margin positive patients 30% also had seminal vesicle involvement, while of seminal vesicle positive patients 50% also had positive margins. Appendix 2 summarizes input variables used for construction and training of the neural network.
Detailed results are listed in table 2. Sample specific negative predictive values tended to be relatively high for all features (86 to 100%). Sample specific positive predictive values tended to be relatively low (13 to 54%), particularly for lymph node (19%) and seminal vesicle (13%) involvement.
Prevalence of margin positivity in the sample was 27% (table 2). Using the neural network sensitivity and specificity for this feature were 81.3 and 75%, respectively. Sample specific positive and negative predictive values were 54 and 92%, respectively. The area under the curve was 0.7940.
Prevalence of seminal vesicle positivity in the sample was 8%. Using the neural network sensitivity and specificity were 100.0 and 72.1%, respectively. Sample specific positive and negative predictive values were 12.8 and 100.0%, respectively. The area under the curve was 0.804.
Prevalence of lymph node positivity in the sample was 7.1%. Using the neural network sensitivity and specificity were 83.3 and 71.8%, respectively. Sample specific positive and negative predictive values were 18.5 and 98.2%, respectively. The area under the curve was 0.768. Thus, 81.5% of cases designated as positive by the neural network did not require further testing, while 1.8% of those designated as negative did require further testing. The network designated 63% of all cases as negative for margin positivity, seminal vesicle involvement and lymph node disease. The margin of error was less than 15% for margin positive disease, 2% for lymph node disease and 0% for seminal vesicle involvement.
TABLE 1. Prevalence of clinical stages and pathological features in entire data set
Stage           No. Pts.   No. Organ Confined   No. Pos. Margins*   No. Pos. Seminal Vesicle*   No. Pos. Lymph Node
T1a             10         9                    0                   0                           1
T1b             18         14                   1                   1                           2
T1c             86 184 0 26 4
T2a-2b          765 273 200 62 19
T2c             231 104 51 1R A6
T3              9 0 7 6 40 17 1 7
Total No. (%)   1,200      660 (55)             318 (26.5)          98 (8)                      88 (7)
* Of margin positive patients 30% also had seminal vesicle involvement, while of seminal vesicle positive patients 50% also had positive margins.
TABLE 2. Accuracy of staging neural network
Parameters                  Margin    Seminal Vesicle   Lymph Nodes
% Accuracy                  76.7      73.2              72.6
% Sensitivity               81.3      100               83.3
% Specificity               75        72.1              71.8
% Pos. predictive value     54        12.8              18.5
% Neg. predictive value     92        100               98
Area under the curve        0.7940    0.804             0.768
DISCUSSION
There are several important findings of our study. A neural network with high sensitivity (81 to 100%) and specificity (72 to 75%) was constructed to stage preoperatively cases of clinically localized prostate cancer. The high negative predictive values (92 to 100%) made the network useful for initial screening and avoided further testing in 63% of patients. The cost saving of this approach will exceed an estimated $150 million a year. The currently available staging modalities for prostate cancer are inadequate for accurate staging. Only 60% of clinically confined cancers are organ confined on final pathological analysis.18,60-62 Accurate preoperative staging is important since patients with positive margins with or without seminal vesicle involvement and lymph node spread have a distinctly poorer prognosis than those with organ confined disease.18,60-62 Radical prostatectomy in patients with organ confined disease on final pathological analysis results in survival comparable to that of age matched controls without prostate cancer. Therefore, it is essential to make a distinction between patients with organ confined versus extracapsular and regional disease for management decisions.18,60-62 The most common tests to stage prostate cancer are digital rectal examination, PSA, transrectal ultrasound guided systematic biopsy to determine unilateral and bilateral cancers, and pathological grade of biopsies. Among these tests PSA and systematic biopsy are relatively reproducible while digital rectal examination (certainly) and Gleason (to a lesser degree) are subject to interpretational error.
Several investigators have attempted to construct probability tables and nomograms by combining the results of these and other variables.20-24,61,63,64 Among them the tables of Partin et al are the most familiar.20 While they use the large database of well documented prostate cancer patients at The Johns Hopkins Institute, the major drawback is the use of digital rectal examination to differentiate unilateral from bilateral cancers (T2a/b and T2c) and lack of reported accuracy of these models. The intent of these nomograms was to provide clinicians with tables that would be easy to understand and use in day-to-day practice. Recently Kattan et al evaluated the use of original Partin nomograms for 697 patients who underwent radical prostatectomy.26 Predictions made by the nomogram were applied to the data set, and the predictions were compared with actual outcomes of the patients. A localized least squares regression model was used to determine whether the nomogram was accurately calibrated for the
data and whether it discriminated across a full spectrum of patient characteristics. The authors observed that, although nomograms did discriminate well between organ confined and nonorgan confined cancers (concordance index value 0.758), the concordance index was 0.742, 0.750 and 0.768 for prediction of extracapsular extension, seminal vesicle and lymph node spread, respectively. There was significant departure from predicted probabilities as the prediction percentages increased. The performance was especially poor for seminal vesicle and lymph node involvement. If the nomogram predicted 75% chance of lymph node metastasis, the actual incidence was only 20%. Therefore, they concluded that the Partin nomograms might not be totally applicable to general urological practice until further validation and modifications are performed.26
More recently Partin et al combined the clinical data from 3 academic institutions and developed a multi-institutional model with serum PSA level, clinical stage and Gleason score to predict pathological stage for 4,133 men with clinically localized prostate cancer.21 Multinomial log linear regression was performed for the simultaneous prediction of organ confined disease, isolated capsular penetration, and seminal vesicle or pelvic lymph node involvement. Bootstrap estimates of the predicted probabilities were used to develop nomograms to predict pathological stage. For the validation patients (selected by bootstrap model from original pool of patients) 72.4% of the time the nomograms correctly predicted the probability of a pathological stage within 10% (67.3% organ confined disease, 59.6% isolated capsular penetration, 79.6% seminal vesicle involvement and 82.9% pelvic lymph node involvement). At a cutoff of p >0.50 for organ confined disease the nomogram performance was 70% sensitivity, 64% specificity, 63% positive predictive value and 70% negative predictive value.
Using a cutoff of p = 0.05 the accuracy results for lymph node spread were 72% sensitivity, 77% specificity, 15% positive predictive value and 98% negative predictive value.21 Narayan et al used preoperative PSA, Gleason grade and
transrectal ultrasound guided systematic biopsy data to determine T2a/b versus T2c stages in construction of nomograms for 813 patients.22 Bostwick et al used the biopsy parameters of nuclear grade, perineural invasion and percent biopsy positive for cancer in addition to PSA and Gleason score.23 Although the typical urologist does not routinely perform nuclear grade and calculation of percentage cancer in a biopsy, the addition of these parameters appears promising. Badalament et al have gone further and constructed a backwards stepwise logistic regression algorithm using nuclear imaging, deoxyribonucleic acid ploidy, nuclear grade, serum PSA, percent of tumor involvement, number of positive sextant cores, preoperative Gleason score and more than 5% of base and/or apex involvement data to stage prostate cancer.24 They report 86% sensitivity, 71% specificity, 73% positive predictive value and 85% negative predictive value. Again, their input data required several specialized investigations such that day-to-day use and cost-effectiveness remain to be established. They also have a relatively small number of patients (210), and so the use needs to be tested in large numbers.
Artificial intelligence based neural networks are new tools that are user-friendly and already used in several branches of medicine, including cardiology and radiology. Use of this modality in urology has just started, particularly for decision making for infertility and calculus disease.65-67 For the management of prostate cancer a pilot study was done by Snow et al, who evaluated a back propagation artificial neural network for diagnosis and prognosis of prostate cancer. However, to our knowledge there are no published reports on the use of neural networks to stage prostate cancer.
Our study suggests that artificial intelligence based neural networks may be a useful initial tool for staging of clinically organ confined prostate cancer.
The sensitivity of the network was 81 to 100% and specificity 72 to 75% for various predictions of margin, seminal vesicle and lymph node involvement. The negative predictive values tended to be relatively high for all 3 features (92 to 100%). The neural network
FIG. Cost-effective staging algorithm using PNN. Clinically localized prostate cancer -> tests required: serum PSA, systematic biopsy, biopsy Gleason -> PNN processing.
Negative prediction for margin, S.V. and L.N. involvement (63% of total patients fall in this group): no further test required; proceed with definitive therapy. 8% chance of missing positive margin, 0% of S.V. and 2% of L.N. disease. Up to 63% reduction in imaging and LND cost. Estimated saving to U.S. health care $150 million/year.
Positive prediction for even one parameter: PSA >10 ng./ml. and/or Gleason ≥7, bone scan*, MRI/CT*, LND*; PSA ≤10 ng./ml. and/or Gleason <7, minilap or open LND**.
* The indications for bone scan, MRI and LND will vary depending upon the PSA, Gleason and patient and physician bias.
** Minilap is more cost effective for patients undergoing RRP than laparoscopic LND; open LND does not add significantly to the cost.
Cost-effective staging algorithm shows detailed staging protocol. Note that 63% of patients did not require additional testing. L.N., lymph node. LND, lymph node dissection. Minilap, minilaparoscopy. RRP, radical retropubic prostatectomy. S.V., seminal vesicle. PNN, probabilistic neural networks.
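The branching in the figure can be written out as a short decision function. This is only a reading of the flowchart, not a clinical rule. The class and function names are hypothetical, and the split of the ambiguous "and/or" thresholds (higher risk when PSA is above 10 ng./ml. or biopsy Gleason is 7 or greater, lower risk otherwise) is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class PnnPrediction:
    """Binary network outputs for the 3 pathological features (hypothetical container)."""
    margin_positive: bool
    seminal_vesicle_positive: bool
    lymph_node_positive: bool


def next_staging_step(pred, psa_ng_ml, biopsy_gleason):
    """Return the additional staging tests suggested by the figure for one patient."""
    any_positive = (pred.margin_positive
                    or pred.seminal_vesicle_positive
                    or pred.lymph_node_positive)
    if not any_positive:
        # Negative for all 3 features: the group reported as 63% of patients.
        return ["no further staging test"]
    if psa_ng_ml > 10 or biopsy_gleason >= 7:
        # Assumed reading of "PSA >10 ng./ml. and/or Gleason >=7".
        return ["bone scan", "MRI/CT", "LND"]
    return ["minilap or open LND"]


print(next_staging_step(PnnPrediction(False, False, False), 6.5, 6))
print(next_staging_step(PnnPrediction(True, False, False), 12.0, 6))
```

As the figure footnote states, the actual indications for bone scan, MRI and LND vary with PSA, Gleason score and patient and physician preference, so the fixed lists returned here are placeholders.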
missed only 8% of patients with margin positive disease, and 2% with lymph node and 0% with seminal vesicle involvement. The accuracy of the network was 73 to 77%. Based on these findings and published literature we suggest a staging algorithm (see figure) that may be a cost-effective tool in the management of localized prostate cancer.
Today it is important to design paradigms that use simple, reproducible universal tests which are also cost-effective. Previous studies have shown the limited role of bone scintigraphy in the staging of prostate cancer with PSA less than 15 ng./ml. This finding has resulted in substantial savings of more than $50 million a year to the United States health care system but studies that address costs of other staging tests are not as well discussed.2,19
Our report provides an approach which can result in cost-effective staging of prostate cancer. In 1996, 317,100 new patients were diagnosed with prostate cancer. It is estimated that 50% of these patients (158,550) will undergo radical prostatectomy and an additional 20% (63,420) will undergo radiation therapy for organ confined cancer. Therefore, approximately 221,970 patients will be candidates for staging evaluations of margin positivity, and seminal vesicle and lymph node disease.1,2 Currently about 50% of these patients (110,985) are at a sufficiently high risk (PSA greater than 10 ng./ml. and Gleason greater than 7) for lymph node and extraprostatic disease, which will require that a pelvic CT, endorectal MRI and/or lymph node dissection be performed. The conservative cost estimates of 110,985 pelvic CTs ($400 a case) are $44,394,000, pelvic MRIs ($800 a case) $88,788,000 and lymph node dissections (open $2,700, laparoscopic $8,700 a case) $299,659,500 to $965,569,500. With open lymph node dissection the total ranges from $344,053,500 (if CT is used) to $388,447,500 (if MRI is used).70-7* With the use of neural networks one can avoid this additional testing for 63% of patients (234,150), thus, saving approximately $150 million a year.
CONCLUSIONS
We have staged prostate cancer by training a genetic adaptive probabilistic neural network. We suggest a staging algorithm that uses this network, which will miss less than 10% of patients with margin positive disease, and 2% with lymph node and 0% with seminal vesicle involvement but will preclude unnecessary staging tests in 63% of patients. Using this approach cost savings of greater than $150 million a year are anticipated.
Drs. Angela Karmerer and Erica Schalow helped with data organization, Mr. Jorg Mager of the College of Engineering offered his expertise in neural networks, and Mr. Chris Barnett, Mr. Richard K. Goede and Mr. David Twombley of the University of Florida, Computer Sciences Department helped with computing needs.
APPENDIX 1: DEFINITIONS
Sensitivity: proportion of patients with the feature of interest who were correctly classified by the neural network.
Specificity: proportion of patients without the feature of interest who were correctly classified by the neural network.
Positive predictive value: proportion of patients classified by the neural network as positive for the feature that, in fact, had the feature.
Negative predictive value: proportion of patients classified by the neural network as negative for the feature that, in fact, did not have the feature.
APPENDIX 2: INPUT VARIABLES FOR NEURAL NETWORK
Serial No.   Input Variable                       Comments
1            Serum PSA                            Pre-biopsy serum PSA (ng./ml.)
2            Systematic biopsy data               Unilat./bilat./No. core pos.
3            Biopsy Gleason score                 2-10
4            Digital rectal examination staging   1, organ confined + unilat.; 2, organ confined + bilat.; 3, extracapsular extension suspected; 4, seminal vesicle involvement suspected
5            Perineural infiltration              Yes/no
6            Age                                  Yr.
7            Race                                 Black American vs. white American vs. others
REFERENCES
1. Parker, S. L., Tong, T., Bolden, S. and Wingo, P. A.: Cancer statistics, 1997. CA, 47: 5, 1997.
2. Oesterling, J. E.: Using prostate-specific antigen to eliminate unnecessary diagnostic tests: significant worldwide economic implications. Urology, 46: 26, 1995.
3. Mettlin, C. J., Murphy, G. P., McGinnis, L. S. and Menck, H. R.: The National Cancer Data Base report on prostate cancer. American College of Surgeons Commission on Cancer and the American Cancer Society. Cancer, 76: 1104, 1995.
4. Mettlin, C. J., Murphy, G. P., Ho, R. and Menck, H. R.: The National Cancer Data Base report on longitudinal observations on prostate cancer. Cancer, 77: 2162, 1996.
5. Zincke, H., Oesterling, J. E., Blute, M. L., Bergstralh, E. J., Myers, R. P. and Barrett, D. M.: Long-term (15 years) results after radical prostatectomy for clinically localized (stage T2c or lower) prostate cancer. J. Urol., 152: 1850, 1994.
6. Murphy, G. P.: Prostate cancer: here and now (editorial). CA, 45, 1995.
7. Epstein, J. I., Partin, A. W., Sauvageot, J. and Walsh, P. C.: Prediction of progression following radical prostatectomy. A multivariate analysis of 721 men with long-term follow-up. Amer. J. Surg. Path., 20: 286, 1996.
8. Linzer, D. G., Stock, R. G., Stone, N. N., Ratnow, R., Ianuzzi, C. and Unger, P.: Seminal vesicle biopsy: accuracy and implications for staging of prostate cancer. Urology, 48: 757, 1996.
9. Soh, S., Kattan, M. W., Berkman, S., Wheeler, T. M. and Scardino, P. T.: Has there been a recent shift in the pathological features and prognosis of patients treated with radical prostatectomy? J. Urol., 157: 2212, 1997.
10. Lerner, S. E., Blute, M. L. and Zincke, H.: Risk factors for progression in patients with prostate cancer treated with radical prostatectomy. Sem. Urol. Oncol., 14: 12, discussion 21, 1996.
11. Ohori, M., Wheeler, T. M., Dunn, J. K., Stamey, T. A. and Scardino, P. T.: The pathological features and prognosis of prostate cancer detectable with current diagnostic tests. J. Urol., 152: 1714, 1994.
12. Mettlin, C., Murphy, G. P., Lee, F., Littrup, P. J., Chesley, A., Babaian, R., Badalament, R., Kane, R. A. and Mostofi, F. K.: Characteristics of prostate cancers detected in a multimodality early detection program. The Investigators of the American Cancer Society-National Prostate Cancer Detection Project. Cancer, 72: 1701, 1993.
13. Epstein, J. I., Pizov, G. and Walsh, P. C.: Correlation of pathologic findings with progression after radical retropubic prostatectomy. Cancer, 71: 3582, 1993.
14. Epstein, J. I., Carmichael, M. J., Pizov, G. and Walsh, P. C.: Influence of capsular penetration on progression following radical prostatectomy: a study of 196 cases with long-term followup. J. Urol., 150: 135, 1993.
15. Krongrad, A., Lai, H. and Lai, S.: Survival after radical prostatectomy. J.A.M.A., 278: 44, 1997.
16. Epstein, J. I.: Incidence and significance of positive margins in radical prostatectomy specimens. Urol. Clin. N. Amer., 23: 651, 1996.
435
NOVEL STAGING TOOL FOR LOCALIZED PROSTATE CANCER
17. Menon, M., Parulkar, B. G. and Baker, S.: Should we treat ral Networks in Computer Intelligence. New York: McGrawlocalized prostate cancer? An opinion. Urology, 4 6 607,1995. Hill, Inc., pp. 1-61, 1994. 18. Menon, M.: Editorial: predicting biological aggressiveness in 39. Jakobsen, E., Kruse-Andersen, S. and Kolberg, J.: Neural netprostate cancer-desperately seeking a marker. J. Urol., 157: work for automatic analysis of motility data. Methods. Inf. 228,1997. Med., 3 3 157,1994. 19. Oesterling, J. E.: Using PSA to eliminate the staging radionu- 40. Kattan, M. W., Cowen, M. E. and Miles, B. J.: Computer modelclide bone scan. Significant economic implications. Urol. Clin. ing in urology. Urology, 47: 14,1996. N. Amer., 2 0 705, 1993. 41. Lamb, D. J. and Niederberger, C. S.: Artificial intelligence in 20. Partin, A. W., Yoo, J., Carter, H. B., Pearson, J. D., Chan, D. W., medicine and male infertility. World J . Urol., 11: 129,1993. Epstein, J . I. and Walsh, P. C.: The use of prostate specific 42. Tetko, I. V., Villa, A. E. and Livingstone, D. J.: Neural network antigen, clinical stage and Gleason score to predict pathologstudies. 2.Variable selection. J. Chem. Inf.Comput. Sci., 3 6 ical stage in men with localized prostate cancer. J. Urol., 150 794, 1996. 110,1993. 43. Takahashi, Y.: A mathematical solution to a network designing 21. Partin, A. W., Kattan, M. W., Subong, E. N., Walsh, P. C., Wojno, problem. J . Comput. Biol., 3 97,1996. K. J., Oesterling, J. E., Scardino, P. T. and Pearson, J . D.: 44. Wasserman, P.: Neural Computing: Theory and Practice and Combination of prostate-specific antigen, clinical stage, and SubNeural Computing: Theory and Practice, New York Van Gleason score to predict pathological stage of localized prosNostrand Reinhold, pp. 1-11, 1993. tate cancer. A multi-institutional update. J.A.M.A., 277:1445, 45. Hamilton, P. W., Montironi, R., Abmayr, W., Bibbo, M., 1997. Anderson, N., Thompson, D. and Bartels, P. H.: Clinical appli22. 
Narayan, P., Gajendran, V., Taylor, S. P., Tewari, A., Presti, cations of Bayesian belief networks in pathology. Pathologica, J. C., Jr., Leidich, R., Lo, R., Palmer, K., Shinohara, K, and 87:237,1995. Spaulding, J. T.: The role of transrectal ultrasound-guided 46. Orr, R. K.: Use of a probabilistic neural network to estimate the biopsy-based staging, preoperative serum prostate-specific anrisk of mortality after cardiac surgery. Med. Decision Making, tigen, and biopsy Gleason score in prediction of final patho17: 178,1997. logic diagnosis in prostate cancer. Urology, 4 6 205,1995. 47. Korning, P. G.: Training neural networks by means of genetic 23. Bostwick, D. G., Qian, J., Bergstralh, E., Dundore, P., Dugan, J., algorithms working on very long chromosomes. Int. J. Neural Myers, R. P. and Oesterling, J. E.: Prediction of capsular Systems, 6 299,1995. perforation and seminal vesicle invasion in prostate cancer. 48. Livingstone, D. J., Mandack, D. T. and Tetko, I. V.: Data J. Urol., 155 1361,1996. modelling with neural networks: advantages and limitations. J. Comput. Aided Mol. Des., 11: 135,1997. 24. Badalament, R. A., Miller, M. C., Peller, P. A., Young, D. C., Bahn, D. K., Kochie, P., O'Dowd, G. J . and Veltri, R. W.: An 49. Eisenstein, E. L.and Alemi, F.: A comparison of three techniques for rapid model development an application in patient riskalgorithm for predicting nonorgan confined prostate cancer using the results obtained from sextant core biopsies with stratification. Proc. AMIA Ann. Fall Symposium, 443, 1996. 50. Hatjimihail, A. T.: Genetic algorithms-based design and optimiprostate specific antigen level. J. Urol., 156 1375,1996. zation of statistical quality-control procedures, Clin. Chem., 25. DAmico, A. V., Whittington, R., Malkowiez, S. B., Schultz, D., 3 9 1972,1993. Schnall, M., Tomaszewski, J. E. and Wein, A.: Combined modality staging of prostate carcinoma and its utility in predict. 51. Husbands, P., Harvey, I., Cliff, D. 
and Miller, G.: Artificial evolution: a new path for artificial intelligence? Brain Cognition, ing pathologic stage and postoperative prostate specific anti34: 130,1997. gen failure. Urology, 4 9 23,1997. 26. Kattan, M. W., Stapleton, A. M., Wheeler, T. M. and Scardino, 52. Jefferson, M. F.,Pendleton, N., Lucas, S. B. and Horan, M. A,: Comparison of a genetic algorithm neural network with logisP. T.: Evaluation of a nomogram used to predict the pathologic tic regression for predicting outcome after surgery for patients stage of clinically localized prostate carcinoma. Cancer, 7 9 with nonsmall cell lung carcinoma. Cancer, 7 9 1338,1997. 528,1997. 27. Tewari, A. and Narayan, P.: Cost effective staging of prostate 53. Levin, M.: Use of genetic algorithms to solve biomedical problems. Methods Comput., 1 2 193,1995. cancer using genetic adaptive neural networks. Presented at annual meeting of Southeastern Section of American Urolog- 54. Saravanan, N., Fogel, D. B. and Nelson, K. M.: A comparison of methods for self-adaptation in evolutionary algorithms. ical Association, Naples, Florida, March 8-11, 1997. Biosystems, 3 6 157,1995. 28. Tewari, A., Mager, J., Kamerer, A., Shukla, A. and Narayan, P.: Genetic adaptive probabilistic neural network models in pre- 55. Brier, M. E. and Aronoff, G. R.: Application of artificial neural networks to clinical pharmacology. Int. J. Clin. Pharmacol. diction of pathological stage in management of localized prosTher., 3 4 510, 1996. tate cancer: a pilot study. J. Urol., part 2, 157: 293,abstract 56. Burke, H. B., Goodman, P. H., Rosen, D. B., Henson, D. E., 1142,1997. Weinstein, J. N., Harrell, F. E., Jr., Marks, J. R., Winchester, 29. Walsh, P. C. and Partin, A. W.: Treatment of early stage prostate D. P. and Bostwick, D. G.: Artificial neural networks improve cancer: radical prostatectomy. Important Adv. Oncol., p. 211, the accuracy of cancer survival prediction. Cancer, 79: 857, 1994. 1996. 30. Walsh, P. 
C.: Editorial: the status of radical prostatectomy in the United States in 1993:where do we go from here? J . Urol., 152 57. Buscema, M.: A general presentation of artificial neural networks. I. Subst. Use Misuse, 3 2 97,1997. 1816,1994. 31. Walsh, P. C.: Re: Potency-sparing radical retropubic prostatec- 58. Frye, K. E., Izenberg, S. D., Williams, M. D. and Luterman, A,: Neural networks: what are they? J . Burn Care Rehab., 1s: 72, tomy: a simplified anatomical approach. J . Urol., 155: 294, 1997. 1996. 32. Bergeron, B. P., S h i f i a n , R. S. and Rouse, R. L.: Data qualifi- 59. Minor, J. M. and Namini, H.: Analysis of clinical data using neural nets. J. Biopharm. Statistics, 6 83,1996. cation: logic analysis applied toward neural network training. 60. Barry, M. J., Fowler, F. J., Jr., Bin, L. and Oesterling, J. E.: A Comput. Biol. Med., 2 4 157,1994. nationwide survey of practicing urologists: current manage33. Borisyuk, G. N., Borisyuk, R. M., Khibnik, A. I. and Roose, D.: ment of benign prostatic hyperplasia and clinically localized Dynamics and bifurcations of two coupled neural oscillators prostate cancer. J. Urol., 158:488,1997. with different connection types. Bull. Math. Biol., 57: 809, 61.Bostwick, D. G.: Staging prostate cancer-1997: current methods 1995. and limitations. Eur. Urol., 3 2 2, 1997. 34. Burke, H. B.: Artificial neural networks for cancer research 62. Murphy, W. M.: Prostate cancer. The problem of prognostic outcome prediction. Sem. Surg. Oncol., 10: 73,1994. factors. Amer. J. Clin. Path., suppl., 106: S45, 1996. 35. Dassen, W. R. and Mulleneers, R. G.: The value of artificial neural network techniques to develop diagnostic systems in 63. DAmico, A. V., Whittington, R., Schultz, D., Malkowicz, S. B., Tomaszewski, J . E. and Wein, A.: Outcome based staging for cardiology. Pacing. Clin. Electrophysiol., 17: 1672,1994. clinically localized adenocarcinoma of the prostate. J. Urol., 36. 
Forrest, S.:Genetic algorithms: principles of natural selection 158: 1422,1997. applied to computation. Science, 261: 872,1993. 37. Forsstrtjm, J. J. and Dalton, K. J.: Artificial neural networks for 64. Narayan, P., Foumier, G., Gajendran, V.,Leidich, R., Lo, R., Wolf, J. S., Jr., Jacob, G., Nicolaisen, G., Palmer, K. and ~ ~ clinical medicine. Ann. Med., 27:509,1995. decision S U D D O in Freiha. F.: of Dreowrative serum ~ m ~ t a t e - ~ . o e canifi~ 38. Fu,L.: Neuril-Networks in Computer Intelligence and SubNeu- Utilitv - -a
-
1
436
NOVEL STAGING TOOL FOR LOCALIZED PROSTATE CANCER
tigen concentration and biopsy Gleason score in predicting risk of pelvic lymph node metastases in prostate cancer. Urology, 44:519,1994. 65. Niederbereer. C. S..LiDshdtz. L. I. and Lamb. D. J.: A neural network-to'analyze flrtility data. Fertil. Steril., 60:324,1993. 66. Niederberger, C. S.:Commentary on the use of neural networks in clinical urology. J. Urol., 153: 1362,1995. 67. Niederberger, C.: Computational tools for the modem andrologist. J. Androl., l?:462,1996. 68. Snow, P. B., Smith, D. S. and Catalona, W. J.: Artificial neural networks in the hagnosis and prognosis of prostate cancer: a pilot study. J. Urol., 152: 1923,1994. 69. Tewari, A, Calvanese, C., Carlson, G., Kahane, H. and Narayan, P.: An artificial intelligence based genetic adaptive neural network model to predict pathological stage of prostate cancer in patients undergoing radical prostatectomy. J. Urol., part 2, 159: 112,abstract 431,1998. 70. Wolf, J. S.,Jr., Cher, M., Dall'era, M., Presti, J. C., Jr., Hricak, H. and Carroll, P. R.: The use and accuracy of cross-sectional imaging and fine needle aspiration cytology for detection of pelvic lymph node metastases before radical prostatectomy. J. Urol., 153: 993,1995. 71. Tempany, C. M.: MR staging of prostate cancer. How we can improve our accuracy with decisions aids and optimal techniques. Magnetic Resonance Imaging Clin. N. h e r . , 4 519, 1996. 72. Milestone, B. N. and Seidman, E. J.: Endorectal coil magnetic resonance imaging of prostate cancer. Sem. Urol., 1 3 113, 1995.
EDITORIAL COMMENT

A highly accurate method of predicting which prostate cancers are organ confined would go a long way toward solving the quandary of which patient to treat with what therapy. Despite the screening power of PSA, a single test is not likely to provide such a method. Instead, a combination of tests and clinical features considered together provides the best hope to develop a decision making model that will preoperatively predict the stage of prostate cancer.

The authors use a modern computational approach to develop a prostate cancer staging model. Interestingly, this model is a hybrid of several computational approaches including neural computation, statistical pattern recognition and a genetic algorithm. As such, it is highly complex and presents intriguing difficulties for the physician who wishes to use the model in clinical practice.

One difficulty confronting the urologist who evaluates this model is replicating it. The description of the computational tool in the Patients and Methods section is insufficient to repeat the experiment. Given the complexity of modern computational modeling schemes in general, and the hybrid approach used in this report in particular, it is unlikely that printed journals will be able to accommodate a description replete enough to replicate these types of computational experiments. One solution may be for authors to publish modeling results in printed journals but provide descriptions of the models electronically on the World Wide Web.

Another difficulty the urologist will encounter is using the computational model. Nowhere in the article can a means to access the trained computational tool be found, relegating the report to one of great interest but no current use. Again, the World Wide Web provides a useful means to make computer code widely available.

Yet another difficulty facing the urologist who contemplates the model is evaluating its accuracy. The authors rightly use ROC curve analysis to report the accuracy of their tool. ROC area provides a single value that essentially combines sensitivity and specificity. ROC area values range from 0.5 for an absolutely worthless classifier to 1.0 for a perfect one. The model ROC areas were 0.79 for margin, 0.80 for seminal vesicle and 0.77 for lymph node positivity. These are promising values for a preliminary study but are not likely to leave a practicing urologist comfortable deciding, for example, whether a patient should undergo radical prostatectomy.

For these reasons it is premature for the authors to suggest that their tool is ready to save hundreds of millions of dollars a year. Nonetheless, as a preliminary study this report is promising. Replicated, improved and distributed to practicing urologists, it can become a real clinical tool of the future.

Craig Niederberger
College of Medicine
Department of Urology
University of Illinois at Chicago
Chicago, Illinois

REPLY BY AUTHORS

Individual tests for staging prostate cancer fall short in accuracy and clinical usefulness. Most currently available unified models for staging use logistic regression and do not present accuracy as ROC curves and area under the curves. The few that do present accuracy data use variable cutoff points (ranging from 0.05 for organ confined disease to 0.005 for lymph node spread) which are difficult to use in clinical practice. In addition, when more than 3 parameters are used in predicting stage, the number of tables grows exponentially. The integrative model that we presented can calculate the probability for any number of variables. If new tests become available, they can be incorporated in the existing pool of tests. The current accuracy of our model is 75 to 80% compared with 49% incidence of understaging in contemporary series. As and when more accurate input variables become available, the accuracy is likely to increase.

Artificial intelligence based models are difficult to train but easy to use. Once tested and fine tuned they can be made available in a user-friendly computer interface in which individual patient data can be entered and predictions obtained in a few seconds. We agree that these models should be made available on the World Wide Web or as computer software, and are currently working on this aspect of the project.
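The ROC area and the cutoff points discussed in the editorial comment and the reply can be illustrated with a small Python sketch. It makes several assumptions and is not the authors' method: the ROC area is computed with the standard rank-based (Mann-Whitney) formulation, the function names are hypothetical, and the six scores are invented for illustration.

```python
def roc_area(labels, scores):
    """ROC area as the chance that a random positive case outscores a
    random negative case, counting ties as one half (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sens_spec_at_cutoff(labels, scores, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y and s < cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if not y and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities (not from the study); 1 = feature present.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

print("ROC area:", roc_area(labels, scores))
for cutoff in (0.5, 0.05):
    print("cutoff", cutoff, "->", sens_spec_at_cutoff(labels, scores, cutoff))
```

The ROC area does not depend on any single cutoff, which is why it summarizes the whole curve, whereas each cutoff yields one sensitivity and specificity pair. Lowering the cutoff raises sensitivity at the expense of specificity.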