UPDATE
ARTIFICIAL NEURAL NETWORKS IN CLINICAL UROLOGY
PETER B. SNOW, DAVID M. RODVOLD, AND JEFFREY M. BRANDT

From Xaim, Incorporated, Colorado Springs, Colorado
Reprint requests: Peter B. Snow, M.D., Xaim, Incorporated, Suite 504D, 6660 Delmonico Drive, Colorado Springs, CO 80918
Submitted: April 1, 1999, accepted (with revisions): July 9, 1999
© 1999, ELSEVIER SCIENCE INC. ALL RIGHTS RESERVED

In urologic practice there exists a need to make clinical predictions for individual patients. Predictions may involve the stratification of patients into risk groups, diagnosis, prediction of cancer stage, prediction of treatment outcomes, or likelihood of disease recurrence. Traditionally, statistical classification models have addressed these predictions. These models assume, at best, fixed statistical relationships that allow only limited types of relatively simple, nonlinear, intervariable interactions and, at worst, assume linear relationships among all variables. Because medical data are inherently “noisy,” have wide variability, are not usually normally distributed, and often exhibit significant nonlinear intervariable relationships, statistical models often fall short of the desired accuracy when used in clinical urologic practice.1

Artificial neural networks (ANNs) are nonlinear, computational, mathematical models for information processing that are loosely based on biologic nervous systems. With the help of learning algorithms, ANN-based systems form nonlinear classification decision boundaries on the basis of information provided by a set of clinical training examples. Our repeated experiences using medical data indicate that properly trained and tested ANNs are as accurate as, and usually significantly more accurate than, regression analyses when measured by receiver operating characteristic (ROC) curve areas generated for validation data or for prospective patients not available to the training algorithms (a representative example is discussed below).

Although the use of ANNs in clinical medicine is a recent phenomenon, many applications have been or are being developed. The availability of clinical technologies for the practicing urologist continues to increase. This includes diagnostic, prognostic, and staging tools based on ANNs. Some ANN-based tools are currently available, and current research indicates that a larger number are
entering mature stages of development. There are a number of channels available for deploying urologic ANNs. Including the ANN result with the result(s) of the laboratory test is one avenue, although some important inputs may not be included with the laboratory data. Stand-alone computer applications or Internet-based applications appear to be the most useful methods of delivery for direct use by practicing clinicians. In particular, Internet-based access provides positive version control, ensuring that outdated applications are not used.

One factor that must be addressed in any discussion of deployed tools is requisite approval by governmental oversight organizations. In the United States, this is the Food and Drug Administration (FDA), which has taken an active role in defining the level of regulation required for medical ANNs. The FDA regulations2 are applicable to “. . . medical devices containing software and for software products considered by themselves to be medical devices.” ANNs are addressed explicitly in these regulations, with an emphasis placed on formal software engineering practices. To address these requirements, Rodvold3 presents a rigorous software development process model for ANNs in critical applications.

The remainder of this report discusses current ANN applications in the field of urology, important variables and markers that are used in the ANNs, recent applicable advances in neural network technology, and current research directions.

APPLICATIONS

PROSTATE CANCER DIAGNOSIS

Most of the current effort using ANNs in clinical urology is in the diagnosis, staging, and treatment outcomes for prostate cancer. The earliest reported work applying ANNs to prostate cancer was by Snow et al.1 In their study, a data base of 1787 men with prostate-specific antigen (PSA) levels greater than 4 ng/mL was used to predict the outcome of the first biopsy on the basis of the PSA, digital rectal examination (DRE), and transrectal ultrasound (TRUS) findings.
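As a rough illustration of the kind of model described above, the following minimal sketch trains a small feed-forward network by back-propagation on three inputs analogous to PSA, DRE, and TRUS findings. It is not the network of Snow et al.; the synthetic cases, the invented risk rule in `make_synthetic_cases`, the hidden-layer size, the learning rate, and all function names are assumptions chosen only to make the example run.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_synthetic_cases(n, rng):
    """Generate hypothetical (inputs, biopsy_positive) pairs. NOT patient data."""
    cases = []
    for _ in range(n):
        psa = rng.uniform(4.0, 20.0)   # ng/mL, assumed range
        dre = rng.choice([0.0, 1.0])   # 1 = suspicious examination
        trus = rng.choice([0.0, 1.0])  # 1 = suspicious ultrasound
        # Invented nonlinear rule (DRE x TRUS interaction) plus random noise.
        risk = 0.15 * (psa - 10.0) + 1.5 * dre * trus - 0.5
        label = 1.0 if rng.random() < sigmoid(risk) else 0.0
        cases.append(([psa / 20.0, dre, trus], label))
    return cases


class TinyNetwork:
    """One hidden layer of sigmoid units feeding a single sigmoid output."""

    def __init__(self, n_inputs, n_hidden, rng):
        # The last weight in each row is the bias term.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        hidden = [sigmoid(sum(w * v for w, v in zip(ws[:-1], x)) + ws[-1])
                  for ws in self.w_hidden]
        out = sigmoid(sum(w * h for w, h in zip(self.w_out[:-1], hidden))
                      + self.w_out[-1])
        return hidden, out

    def train_case(self, x, y, rate):
        """One back-propagation step on a single case (cross-entropy loss)."""
        hidden, out = self.forward(x)
        delta_out = out - y  # error signal at the output unit
        # Update hidden weights first so the old output weights are used.
        for j, ws in enumerate(self.w_hidden):
            delta_h = delta_out * self.w_out[j] * hidden[j] * (1.0 - hidden[j])
            for i, v in enumerate(x):
                ws[i] -= rate * delta_h * v
            ws[-1] -= rate * delta_h
        for j, h in enumerate(hidden):
            self.w_out[j] -= rate * delta_out * h
        self.w_out[-1] -= rate * delta_out


def mean_log_loss(net, cases):
    total = 0.0
    for x, y in cases:
        p = min(max(net.forward(x)[1], 1e-12), 1.0 - 1e-12)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(cases)


rng = random.Random(0)
training = make_synthetic_cases(400, rng)
validation = make_synthetic_cases(200, rng)  # never shown to the training loop
net = TinyNetwork(n_inputs=3, n_hidden=4, rng=rng)

loss_before = mean_log_loss(net, training)
for epoch in range(100):
    rng.shuffle(training)
    for x, y in training:
        net.train_case(x, y, rate=0.1)
loss_after = mean_log_loss(net, training)

print("training loss before/after:", loss_before, loss_after)
print("validation loss:", mean_log_loss(net, validation))
```

The validation cases are kept apart from the training loop, which mirrors the separation between training and validation data emphasized in this report.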
UROLOGY 54: 787–790, 1999 • 0090-4295/99/$20.00 PII S0090-4295(99)00327-1

Since that initial report, at least five recent studies have been published that used
ANNs in the diagnosis of prostate cancer.4–8 Variables that are typically available for predicting the first biopsy findings are age, race, PSA, DRE, clinical stage, clinical TNM scores, and, in the past year or so, free PSA (fPSA). Not as commonly used because of the expense are the variables associated with TRUS, such as prostate volume and grading (eg, suspicious, abnormal). In addition, one ANN application involves the use of prostatic acid phosphatase and total creatine kinase combined with PSA and age to produce the ProstAsure Index.9 The most important variables tend to be fPSA (or the percent fPSA = fPSA/PSA × 100) and TRUS.

The accuracy of diagnostic ANNs, as measured by the area under the ROC curve, is generally between 70% and 85% for validation on prospective patients. Accuracy much in excess of 80% should not be expected, since approximately 20% of all biopsy outcomes are falsely negative (the needles miss the cancer).10 At a sensitivity of 90%, these predictive accuracy ranges produce specificities between approximately 30% and 65%. Area-weighted microvessel density from a negative biopsy has been demonstrated using an ANN to give an ROC area of 83.4% (91% sensitivity at 40% specificity) for predicting the presence of cancer despite the negative biopsy outcome.11

PROSTATE CANCER STAGING

Relatively little work using ANNs to predict pathologic stage has been published. This is an important area, since treatment outcome is related to disease stage.12 Prediction of stage is usually accomplished by adding the results of the Gleason biopsy scores to the abovementioned variables for diagnosis. The first published work was by Tewari et al.,13 who trained an ANN to predict stage. For positive margins, they obtained a sensitivity of 76% at a specificity of 65%; for positive lymph nodes, a sensitivity of 83% at a specificity of 72%; and for seminal vesicle invasion, a sensitivity of 100% at a specificity of 72%.
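The ROC areas and sensitivity/specificity pairs quoted in this section can be computed from a model's output scores and the known outcomes. A minimal sketch follows; the ten scores and labels are hypothetical values invented for illustration, and the function names are not taken from any of the cited studies.

```python
def roc_area(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case scores higher;
    ties count as half a pair."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(scores, labels, threshold):
    """Call a case positive when its score is at or above the threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def specificity_at_sensitivity(scores, labels, target):
    """Best specificity over all thresholds that reach the target sensitivity."""
    best = 0.0
    for threshold in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, threshold)
        if sens >= target:
            best = max(best, spec)
    return best


# Hypothetical validation cases: model score and biopsy outcome (1 = cancer).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

print(roc_area(scores, labels))                         # 0.84
print(sensitivity_specificity(scores, labels, 0.5))     # (0.8, 0.6)
print(specificity_at_sensitivity(scores, labels, 0.9))  # 0.6
```

Fixing the sensitivity and reporting the resulting specificity, as in the last call, is the same convention used when this report quotes specificities "at a sensitivity of 90%."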
In a more recent publication, Tewari and Narayan14 reported a network sensitivity of 81% to 100%, with a specificity of 72% to 75% for various predictions of margin, seminal vesicle, and lymph node involvement. Crawford et al.15 demonstrated a neural network/rule base combination that partitions patients into low- and high-risk groups for pelvic lymph node disease. Multicenter validation cases have demonstrated an accuracy of 98% to 99% in predicting a node-negative condition in the low-risk group (approximately 57% of the population) and a 25% node-positive predictive value at 95% sensitivity in the high-risk group.

PROSTATE CANCER PROGNOSIS

Treatment outcome or prognosis involving the use of ANNs to predict outcome also has relatively
little published material. This also is an important area for investigation, since recent studies have shown that in other cancers, ANNs are more accurate at predicting prognosis than conventional TNM staging.16 Treatment outcome is usually defined as status (dead from disease, alive with disease, disease free) at a prescribed time after treatment.

Variables available for predicting treatment outcome are the same as those found in diagnosis and staging predictions. If a prostatectomy was performed, the pathologic stage and TNM scores are also available. If a lymphadenectomy was performed, the pelvic lymph node status is available.

In an early study, Snow et al.17 reported an overall accuracy of 90% in predicting clinical recurrence after prostatectomy. Douglas et al.18 and Douglas and Moul19 have reported an overall accuracy of 97% in predicting chemical recurrence after prostatectomy. Neither of these studies attempted to predict the time after treatment of the recurrence.

In a seminal study by Ragde et al.,20 an ANN was used to predict the likelihood of percutaneous prostate brachytherapy treatment success. Success was defined as a disease-free status 10 years after seed implantation. The ANN predicted the likelihood of success with and without prior external beam pelvic radiation. The ANN produced a success predictive value of 82% and a failure predictive value of 76% at a sensitivity of 90% for the validation patients not used in the training or testing of the neural network. A multivariable regression analysis produced predictive results that were 10% less accurate than the ANN.

OTHER APPLICATIONS

ANNs have been used to interpret urologic imaging. Prostate ultrasound images have been investigated by Prater and Richard21 and Loch et al.22 Maclin et al.23 investigated renal ultrasound images using ANNs. Hurst et al.24 used ANNs to analyze images of bladder cells for malignancy.
Staging of testicular cancers using ANNs has been accomplished by Moul et al.25 Krongrad et al.26 used ANNs to model emotional components involving quality of life in patients with prostate cancer or benign prostatic hyperplasia.

APPLICATIONS AND REGULATORY APPROVAL

At the time of this writing, two urologic ANN products have fulfilled FDA regulatory requirements. Horus Therapeutics (Savannah, Ga) has produced an ANN-based system to define its proprietary prostate cancer diagnostic, the ProstAsure Index.9 Xaim (Colorado Springs, Colo) has more recently completed the regulatory process for brachytherapy success prediction software, the first of its prostate cancer diagnostic and prognostic tools.20
ARTIFICIAL NEURAL NETWORK DIRECTIONS

From the preceding section, it is clear that researchers have been diligently constructing ANNs for urologic applications. At a different level, other ANN researchers have been improving the tools that ANN application developers use. Only a few years ago, ANNs were synonymous with back-propagation training and feed-forward architecture.27 Recently, however, a number of useful ANN architectures and training algorithms have emerged from academic laboratories into the mainstream. Several ANN architectures based on multidimensional geometric grouping, such as Kohonen, general regression, and probabilistic networks, have become popular recently, both for direct predictions and data preprocessing.28 Even the common feed-forward multilayer perceptron has been the subject of significant improvement, with the emergence of efficient and convergent training techniques such as conjugate gradient descent and the Levenberg-Marquardt algorithm.29 Although the well-known back-propagation algorithm is still widely used, these new training methods are quietly emerging as the favored techniques.

Auxiliary to the methods used to train the ANNs (ie, actually adjusting the synaptic weights) are the techniques for selecting input variables and determining optimal network architectures. A typical urologic data base might include several tens of potential predictive variables, with only a handful needed to specify the desired output quantity. Determining which of these factors to use as input variables, and how to scale and/or combine them, is one of the most important and daunting tasks facing the ANN developer (as shown by Rodvold and LeKang30). Similarly, once the inputs and outputs are determined, and the training data base has been processed as necessary, determining the optimal ANN topology is also a challenge. Trial-and-error has been a necessarily popular approach to these and other similar structural problems associated with ANNs.
One solution to these configuration problems can be found using optimization techniques such as genetic algorithms or simulated annealing.31 These methods automate much of this otherwise manual process and now appear in some commercially available tools. Although these techniques are relatively efficient, for complex problems they may impose heavy burdens on the computers used to train the ANNs. Furthermore, it is important to review any optimized solutions these automated techniques yield to ensure that they do not violate any logical problem constraints or statistical significance limitations.
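The use of simulated annealing to choose input variables can be sketched as follows. This is an illustration under stated assumptions, not a method taken from the cited references: the candidate variable names echo those discussed earlier in this report, but the `INFORMATIVE` subset and the `toy_validation_score` function are hypothetical stand-ins for the expensive step of training an ANN on each candidate subset and measuring its validation ROC area. The cooling schedule and step count are likewise arbitrary.

```python
import math
import random

CANDIDATES = ["age", "race", "PSA", "fPSA", "DRE",
              "TRUS volume", "clinical stage", "acid phosphatase"]

# Hypothetical "truly useful" inputs, used only to build the toy objective.
INFORMATIVE = {"PSA", "fPSA", "TRUS volume"}


def toy_validation_score(mask):
    """Stand-in for: train an ANN on the chosen inputs, return validation
    ROC area. Rewards useful inputs and penalizes superfluous ones."""
    chosen = {name for name, keep in zip(CANDIDATES, mask) if keep}
    return (0.5 + 0.1 * len(chosen & INFORMATIVE)
            - 0.02 * len(chosen - INFORMATIVE))


def anneal(score, n_vars, rng, steps=2000, t_start=0.1, t_end=0.001):
    """Simulated annealing over include/exclude masks of the input variables."""
    current = [rng.random() < 0.5 for _ in range(n_vars)]
    current_score = score(current)
    best, best_score = list(current), current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / (steps - 1))
        candidate = list(current)
        flip = rng.randrange(n_vars)
        candidate[flip] = not candidate[flip]  # add or drop one variable
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls (Metropolis criterion).
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best, best_score


rng = random.Random(1)
best_mask, best_score = anneal(toy_validation_score, len(CANDIDATES), rng)
selected = [name for name, keep in zip(CANDIDATES, best_mask) if keep]
print("selected inputs:", selected)
print("toy score:", best_score)
```

In a real application each call to the scoring function would involve training and validating a network, which is the source of the heavy computational burden noted above, and the selected subset would still need review against clinical and statistical constraints.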
COMMENT

In the past year or two, the development of ANN applications in clinical urology has accelerated, particularly in the diagnosis, staging, and prognosis of prostate cancer. Properly trained ANNs demonstrate predictive accuracy equivalent to and, in many examples, superior to that of multivariate regression analyses. Problems that contain more than a few variables, particularly when these variables are related in nonlinear ways to one another or to an outcome, favor ANN analysis in terms of both accuracy and development effort. The accuracy of any analysis model, ANN or otherwise, should be measured by the ROC curve area generated by running validation or prospective cases through the developed model. These cases must not be used to develop or evaluate network weights or biases, evolve network architecture, or determine regression coefficients. Equivalent ROC curve areas for training and validation cases indicate that the ANN has not been overfitted.

As new serum, tissue, and genetic markers are developed, ANNs will play a major role in determining the value of each new marker in diagnosis, staging, and prognosis. As new markers that are important to outcome predictions become commercially available, ANNs will play a significant role in the development of clinically useful models for improved predictions.

REFERENCES

1. Snow PB, Smith DS, and Catalona WJ: Artificial neural networks in the diagnosis and prognosis of prostate cancer: a pilot study. J Urol 152: 1923–1926, 1994.
2. FDA: Guidance for FDA reviewers and industry: guidance for the content of premarket submissions for software contained in medical devices. U.S. Department of Health and Human Services, Food and Drug Administration, Center for Devices and Radiological Health, Office of Device Evaluation, 1998.
3. Rodvold D: A software development process model for artificial neural networks in critical applications.
Presented at the 1999 International Joint Conference on Neural Networks (IJCNN’99), Washington, DC, 1999.
4. Stamey TA, Barnhill SD, Zhang Z, et al: Effectiveness of ProstAsure™ in detecting prostate cancer (PCa) and benign prostatic hyperplasia (BPH) in men age 50 and older. J Urol 155: 436A, 1996.
5. Barnhill S, Stamey T, Zhang Z, et al: The ability of the ProstAsure™ index to identify prostate cancer patients with low cancer volumes and a high potential for cure. J Urol 157: 63–67, 1997.
6. Babaian R, Fritsche H, and Goldman M: A comparison of ProstAsure™ index and free/total PSA ratio in the diagnosis of prostate cancer (abstract). Presented at the Association of Clinical Scientists, Salt Lake City, Utah, May 1997.
7. Tisman G, Strum S, and Scholz M: Pre-therapy prediction of the duration of post-therapy non-detectable PSA for prostate cancer patients considering intermittent combined hormonal blockade by use of computerized neural net modeling. Presented at the American Society of Clinical Oncology Meeting, May 1997.
8. Snow PB, Crawford ED, DeAntoni EP, et al: Prostate cancer diagnosis from artificial neural networks using the prostate cancer awareness week database. J Urol 157: 365, 1997.
9. Wei JT, Zhang Z, Barnhill SD, et al: Understanding artificial neural networks and exploring their potential applications for the practicing urologist. Urology 52: 161–172, 1998.
10. Brawer MK, Gold M, and Meyer G: Area-weighted microvessel density aids in the diagnosis of carcinoma after negative prostate needle biopsy. Eur Urol 35(suppl 2), No. 79, 1999.
11. Snow PB, Brandt JM, and Brawer MK: Prediction of missed carcinoma on prostate biopsy is enhanced with artificial neural networks (abstract). American Urological Association 94th Annual Meeting (unmoderated poster session 9), Dallas, Texas, 1999.
12. Murphy GP: Prostate cancer: here and now (editorial). CA Cancer J Clin 45: 133, 1995.
13. Tewari A, Mager J, and Kamerer A: Genetic adaptive probabilistic neural network model in prediction of pathological stage in localized prostate cancer: a pilot study. J Urol 157: 293, 1997.
14. Tewari A, and Narayan P: Novel staging tool for localized prostate cancer: a pilot study using genetic adaptive neural networks. J Urol 160: 430–436, 1998.
15. Crawford ED, Snow PB, Lynch J, et al: Artificial intelligence system in prediction of lymph node involvement in radical prostatectomy patients (abstract). Presented at the American Urological Association 94th Annual Meeting (unmoderated poster session 8), Dallas, Texas, 1999.
16. Burke HB, Goodman PH, Rosen DB, et al: Artificial neural networks improve the accuracy of cancer survival prediction. Cancer 79: 857–862, 1997.
17. Snow PB, Smith DS, and Catalona WJ: Artificial neural networks in the diagnosis and prognosis of prostate cancer: a pilot study. J Urol 152: 1923–1926, 1994.
18. Douglas TH, Connelly RR, McLeod DG, et al: Neural network analysis of pre-operative and post-operative variables to predict pathologic stage and recurrence. J Urol 155: 487A, 1996.
19. Douglas TH, and Moul JW: Applications of neural networks in urologic oncology. Semin Urol Oncol 16: 35–39, 1998.
20. Ragde H, Elgamal AA, Snow PB, et al: Ten-year disease-free survival after transperineal sonography-guided iodine-125 brachytherapy with or without 45Gy external beam irradiation in the treatment of patients with clinically localized, low to high Gleason grade prostate carcinoma. Cancer 83: 989–1001, 1998.
21. Prater JS, and Richard WD: Segmenting ultrasound images of the prostate using neural networks. Ultrason Imaging 142: 159–185, 1992.
22. Loch T, Leuschner I, and Bruske T: Neural network analysis of subvisual transrectal ultrasound data: improved prostate cancer detection. J Urol 157: 364, 1997.
23. Maclin PS, Dempsey J, Brooks J, et al: Using neural networks to diagnose cancer. J Med Syst 15: 11–19, 1991.
24. Hurst RE, Bonner RB, Ashenayi K, et al: Neural net-based identification of cells expressing the p300 tumor-related antigen using fluorescence image analysis. Cytometry 27: 36–42, 1997.
25. Moul JW, Snow PB, Fernandez EB, et al: Neural network analysis of quantitative histological factors to predict pathological stage in clinical stage 1 nonseminomatous testicular cancer. J Urol 153: 1674–1677, 1997.
26. Krongrad A, Granville LJ, and Burke MA: Predictors of general quality of life in patients with benign prostate hyperplasia or prostate cancer. J Urol 157: 534–538, 1997.
27. Wasserman PD: Neural Computing: Theory and Practice. New York, Van Nostrand Reinhold, 1989.
28. Lawrence J: Introduction to Neural Networks: Computer Simulations of Biological Intelligence. Grass Valley, California, California Scientific Software, 1991.
29. Bishop CM: Neural Networks for Pattern Recognition. Oxford, England, Oxford University Press, 1995.
30. Rodvold DM, and LeKang DE: Applying neural networks in the modeling and simulation of environments. Presented at the 1994 TECOM Artificial Intelligence Technology Symposium, Aberdeen, Maryland, 1994.
31. Davis L, and Steenstrup M: Genetic algorithms and simulated annealing: an overview, in Davis L (Ed): Genetic Algorithms and Simulated Annealing. London, Pitman, 1987, pp 6–10.