WOMEN'S HEALTH
A Review of the Concordance of Diagnoses Made After Multi-Channel Urodynamics and Video Urodynamics in Women With Urinary Incontinence
Jane A. Schulz, MD,1 Kevin M. Smith, MB BCh,2 Harold P. Drutz, MD2
1 Department of Obstetrics and Gynecology, University of Alberta, Edmonton AB
2 Division of Urogynaecology, Department of Obstetrics and Gynaecology, University of Toronto, Toronto ON
Abstract
Objective: Symptoms of urinary incontinence often do not correlate well with the diagnosis provided by urodynamic investigations. Video urodynamics has been described as the “gold standard” investigation for patients with lower urinary tract symptoms. The aim of our study was to determine the concordance of diagnoses made on multi-channel and video urodynamics in women presenting with urinary incontinence to a tertiary care urogynaecology unit.
Materials and Methods: We performed a retrospective chart review of 38 women who had video urodynamics and a multi-channel study completed within a one-year period. All patients had a complete history and pelvic floor assessment. The investigators completing each study were blinded to the clinical diagnoses, the physical findings, and the other urodynamics diagnoses. International Continence Society standards were followed for completion of all urodynamics investigations. Multi-channel studies were completed with the patient lying supine and video studies were performed with the patient sitting on a commode. The level of agreement of the diagnoses was calculated using a kappa (κ) statistic with 95% confidence intervals (CI).
Results: The median age of subjects was 61 years (range 14–79), with a median duration of lower urinary tract symptoms of 6.0 years (range 0.5–41). Patients had had a median of two previous bladder surgeries (range 0–5). The level of concordance of the two diagnoses gave a kappa of 0.16 (95% CI 0.06–0.26).
Conclusions: There was poor concordance between the diagnoses made on multi-channel and video urodynamics when the two tests were performed on the same patient. Prospective studies are required to evaluate the reproducibility of diagnoses made on cystometry.
Key Words: Urinary incontinence, video urodynamics, multi-channel urodynamics
Competing Interests: None declared.
Received on March 19, 2008
Accepted on August 6, 2008
l FEBRUARY JOGC FÉVRIER 2009
J Obstet Gynaecol Can 2009;31(2):156–160
INTRODUCTION
When assessing patients with lower urinary tract symptoms, urogynaecologists, gynaecologists, and urologists often rely on the results of urodynamic investigations to make diagnoses of bladder dysfunction. Many bladder conditions require the use of cystometry for correct diagnosis. The International Continence Society definitions of detrusor overactivity (DO) and genuine stress incontinence (GSI) at the time of this study depended on the cystometric presence or absence, respectively, of detrusor contractions, and the objective demonstration of urinary loss.1 The bladder has been described as “an unreliable witness”2; it is not always possible to make a diagnosis based on symptoms alone. DO can present with symptoms of stress incontinence, and GSI can present with irritative symptoms. Therefore, the results of cystometric testing are an important adjunct to diagnosis. Despite advances in the development of incontinence symptom questionnaires, it has been shown that no symptom yet has high enough specificity and sensitivity to replace urodynamic testing in the assessment of women with urinary incontinence.3 Many sources feel that women with complaints of urinary incontinence, especially those for whom surgery is contemplated, should undergo complete urodynamic evaluation when it is available.4
ABBREVIATIONS
DO: detrusor overactivity
GSI: genuine stress incontinence
OUSI: occult urodynamic stress incontinence
USI: urodynamic stress incontinence

Although we have developed increasing reliance on these advances in technology, the true sensitivity and specificity of urodynamic testing remain unknown. We know that the sensitivity of cystometry is not perfect; there are limitations in detecting detrusor contractions due to the artificial nature of the test. Also, there are limitations in testing patients with significant genitourinary prolapse. Previous studies have shown that only 55% of patients presenting with urgency incontinence had cystometric evidence of DO.5 In 1980, Jarvis et al. found that only 68% of patients presenting with symptoms of stress incontinence had GSI on urodynamic testing.6 These authors were only able to confirm the diagnosis of detrusor overactivity in 51% of patients with symptoms of urge incontinence.6 Video urodynamics has been described as the “gold standard” testing modality for the evaluation of incontinence and lower urinary tract symptoms.7 However, there have been concerns about masked findings in patients with genitourinary prolapse.8,9

There is little information about the reproducibility of urodynamic testing. One study on uroflowmetry suggested that to minimize the boundaries of a confidence interval around a single maximal flow result, 25 measurements should be taken to provide adequate representation of the true result.10 However, another study on the short-term repeatability of pressure flow studies found that 85% of patients reproduced their test results accurately when the tests were repeated within a month.11 Other studies have shown limited repeatability of urodynamics in healthy female volunteers,12 whereas smaller studies have shown good reproducibility of the test over a two-year time interval.13 Lower urinary tract diagnoses of stress urinary incontinence from both clinical and urodynamic data have demonstrated substantial reliability and interobserver agreement. However, by conventional interpretation of kappa statistics, the reliability of diagnoses of DO or voiding dysfunction was only moderate, and interobserver agreement on these diagnoses was no better than fair.
Urodynamic interpretations may not be satisfactorily reproducible for voiding dysfunction and DO.14

The International Consultation on Incontinence has stated, after a review of all literature on assessment of incontinence and lower urinary tract symptoms, that urodynamic testing is currently the best tool available to assist in the assessment of lower urinary tract symptoms and incontinence and to guide our management decisions.9 Urodynamic investigations have been recommended for incorporation in the standard diagnostic workup of patients undergoing surgical correction of genital prolapse.9,15 Stress incontinence is reported by 40% of patients with genital prolapse; urodynamic stress incontinence (USI) is diagnosed in 70% to 75% of these patients; and latent or occult urodynamic stress incontinence (OUSI) is diagnosed in about 50% of the patients with genital prolapse and no symptoms of stress incontinence before surgery.9,15 Performing urodynamic investigation in patients undergoing prolapse surgery may be valuable if diagnosing USI or OUSI results in the selection of the optimal treatment strategy. This treatment strategy is either a combination of prolapse and stress incontinence surgery, or prolapse surgery first with re-evaluation of possible stress incontinence afterwards. The combination of prolapse and stress incontinence surgery has the advantage of potentially solving two problems at the same time, but carries an increased risk of unwanted side effects, of which voiding dysfunction and detrusor overactivity are the most important.9,15

The aims of our study were to compare the diagnoses made on multi-channel and video urodynamics in women with urinary incontinence, voiding dysfunction, or both, and to determine the level of agreement between these diagnoses.
Table 1. Comparison of the methods used in performing the standard multi-channel urodynamics and the video urodynamics

| | Multi-channel UDS | Video UDS |
|---|---|---|
| Position | Supine | Sitting |
| Flow rate | 50 mL/min (medium filling) | 50–100 mL/min (medium filling) |
| Filling medium | Saline (room temp) | Saline (room temp) |
| Provocation | Cough | Cough, VLPP |

UDS: urodynamic studies; VLPP: Valsalva leak point pressure
Those involved in the study had had their video urodynamics at the urology centre, and their multi-channel studies were subsequently performed at the tertiary care urogynaecology unit. Due to the assessments occurring in two different subspecialty units, there was a time delay between tests, but the patients had no therapeutic intervention.
Table 2. Concordance of diagnoses made from multi-channel UDS and video UDS performed on the same patient

Rows: diagnosis by multi-channel UDS. Columns: diagnosis by video UDS.

| Multi-channel UDS \ Video UDS | Normal | GSI | DI | Voiding dysfunction | Total |
|---|---|---|---|---|---|
| Normal | 2 | 2 | 1 | 3 | 8 |
| GSI | 1 | 10 | 2 | 12 | 25 |
| DI | 0 | 1 | 0 | 2 | 3 |
| Voiding dysfunction | 2 | 2 | 1 | 9 | 14 |
| Total | 5 | 15 | 4 | 26 | 50 |

UDS: urodynamic studies; DI: detrusor instability
Note: Some patients had more than one diagnosis
We also wished to examine the ability of our testing to detect incontinence in women with urogenital prolapse (latent or occult stress urinary incontinence).

MATERIALS AND METHODS
We completed a retrospective chart review of all women who had been referred for video urodynamics to the Department of Urogynecology and Pelvic Reconstructive Surgery at the Mount Sinai Hospital, Toronto, over a one-year period. All of the patients had presented to this tertiary care unit with lower urinary tract symptoms; each patient had a full initial assessment including history and physical examination. Physical examination was performed independently by two physicians. Women were included in the review if they had multi-channel studies completed within one year of the video urodynamics. The exclusion criteria were an interval of greater than 12 months between studies, and any intervention between studies which might affect bladder function. The interventions that could lead to exclusion from the study were behavioural therapy, medications, or surgery. The patient population was a cohort of women with urinary symptoms requiring tertiary care.
The multi-channel urodynamics and video urodynamics, although performed in different centres, both conformed to International Continence Society guidelines and used the same flow rate and same temperature filling medium (Table 1). For the video studies, a small amount of contrast was mixed with saline. Multi-channel studies were completed with the patient supine and video studies were performed with the patient sitting on a commode. The only other difference between the two tests was the addition of fluoroscopic imaging of the bladder for the video studies. The investigators performing the multi-channel urodynamics were blinded to the results of the video urodynamics and vice versa. Both sets of investigators were blinded to the clinical findings.

The diagnoses made following multi-channel urodynamics (performed at Mount Sinai Hospital) and video urodynamics (performed at Toronto Western Hospital) were noted. The concordance between the diagnoses made after the two tests had been performed on the same patient was calculated and compared to the concordance expected by chance by calculating a kappa (κ) value.

RESULTS
After a review of all records, 55 women were found to have had video urodynamics within the study period. Seventeen cases were excluded by the given criteria. Therefore, 38 patients were included in the final review. The median patient age was 61 years (range 14–79). The women had a median duration of lower urinary tract symptoms of 6.0 years (range 0.5–41) and a median number of two prior bladder surgeries (range 0–5). Their cases were complicated, with long-standing problems, and most had had multiple prior surgeries.
The diagonal line of numbers in Table 2 indicates the number of diagnostic agreements between the two tests when performed on the same patient. Out of a total of 50 diagnoses made in the 38 patients (some patients had more than one diagnosis), 21 were the same; thus, the agreement in diagnosis made by the two tests was 42%. However, this simple calculation of agreement between the two tests is weak, because we would expect some agreement to occur by chance, even if the diagnoses were random. Using the values in Table 2, the number of agreements expected by chance alone is 15.7 (31%). The maximum agreement is 100%, so the agreement of the two tests can be expressed as a proportion of the possible scope for doing better than chance (kappa). Thus:

$$\kappa = \frac{0.42 - 0.31}{1 - 0.31} = 0.16 \quad (95\%\ \mathrm{CI}\ 0.06\text{–}0.26)$$

The findings on physical examination when compared to the results of urodynamic testing are documented in Table 3. Video urodynamics did not detect some of the cases of latent stress incontinence that were detected on multi-channel studies.

DISCUSSION
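As a rough check on the kappa reported in the Results, the calculation can be reproduced from the counts in Table 2. The sketch below is illustrative only and is not the authors' code: the function and variable names are hypothetical, rows are taken to be the multi-channel diagnoses and columns the video diagnoses, and the treatment of values falling exactly on a Table 4 band boundary is an assumption. It does not attempt the confidence interval, because the paper does not state how that interval was derived.

```python
# Counts from Table 2 (assumed orientation: rows = multi-channel UDS,
# columns = video UDS). Category order: Normal, GSI, DI, Voiding dysfunction.
TABLE_2 = [
    [2, 2, 1, 3],
    [1, 10, 2, 12],
    [0, 1, 0, 2],
    [2, 2, 1, 9],
]


def cohen_kappa(table):
    """Return (observed agreements, agreements expected by chance, kappa)."""
    size = len(table)
    n = sum(sum(row) for row in table)
    # Agreements are the diagonal cells of the cross-tabulation.
    observed = sum(table[i][i] for i in range(size))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    # Chance agreement: sum over categories of (row total * column total) / n.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n
    p_observed = observed / n
    p_expected = expected / n
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return observed, expected, kappa


def strength_of_agreement(kappa):
    """Map kappa to the labels printed in Table 4.

    Assumption: a value exactly on a boundary (e.g. 0.20) goes to the
    lower band, since Table 4 leaves those points unassigned.
    """
    if kappa <= 0.20:
        return "Poor"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Good"
    return "Very good"


if __name__ == "__main__":
    observed, expected, kappa = cohen_kappa(TABLE_2)
    print(f"observed agreements: {observed}")
    print(f"expected by chance:  {expected:.2f}")
    print(f"kappa:               {kappa:.2f} ({strength_of_agreement(kappa)})")
```

Run on the unrounded Table 2 counts, this gives about 15.8 agreements expected by chance and a kappa of about 0.15, close to the reported 15.7 and 0.16; the small gap presumably reflects rounding of intermediate values.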
The maximum value for kappa is 1, when agreement is perfect. A value of 0 indicates no agreement better than chance; negative values show worse than chance agreement. While there are no absolute definitions, guidelines set out by Landis and Koch help to interpret kappa values between 0 and 1 (Table 4).16,17 Our calculated kappa of 0.16 shows poor agreement between the diagnoses made on video urodynamics and the multi-channel diagnoses made on the same patient. This creates some concern about the reproducibility of one of the most commonly used testing modalities in the field of lower urinary tract assessment.

Previous studies have shown variable reproducibility of urodynamics; in the short term, at least, the results seem quite reproducible.18,19 Studies by Digesu et al.18 and Sorensen19 have shown good test reproducibility within two weeks to two months of initial testing in healthy postmenopausal female volunteers and in women with lower urinary tract symptoms presenting to a tertiary referral unit. Beyond this, however, the length of time for which urodynamics testing remains valid is not well established. Another prior study in healthy female volunteers showed good reproducibility of urodynamics over a two-year period13; however, this study population was small and subjects did not have lower urinary tract symptoms. Certainly, our study results suggest that a 12-month time period does not allow good test reproducibility. Further prospective studies are required to determine the length of time for which tests remain reliable, and at what point repeat testing would be required prior to implementation of management decisions.

Table 3. Comparison of physical findings to urodynamics results

| On examination (N = 36*) | SUI on M-UDS | SUI on V-UDS |
|---|---|---|
| Overt SUI (n = 25) | 16 | 13 |
| Latent SUI (n = 5) | 4 | 1 |
| No SUI (n = 6) | 5 | 1 |

M-UDS: multi-channel urodynamic studies; V-UDS: video urodynamics studies
*2 patients had voiding dysfunction and no incontinence

Table 4. Interpretation of kappa value17

| Value of kappa | Strength of agreement |
|---|---|
| < 0.20 | Poor |
| 0.21–0.40 | Fair |
| 0.41–0.60 | Moderate |
| 0.61–0.80 | Good |
| 0.81–1.00 | Very good |

Detection of latent stress urinary incontinence has always been difficult. Multiple techniques including reduction of prolapse by a pessary, cotton swab, or tampon have been tried, but none are ideal.20 We demonstrated that detection of latent stress incontinence was not perfect with either of our testing modalities; however, video urodynamic studies were particularly deficient in this regard (1 of 5 cases detected, compared with 4 of 5 on multi-channel studies; Table 3). Although our tests were performed in different positions, all other variables were the same. Previous reports have shown no difference in urodynamic measurements when women were tested in the supine and sitting positions21; therefore, we do not feel that this difference alone should contribute to such poor concordance in results. Other reports have shown that although prolapse reduction significantly decreases maximum urethral closure pressure, it does not alter intrinsic neuromuscular activity of the striated urethral sphincter. Prolapse reduction does not alter any other filling or pressure flow parameter.22

We acknowledge that we still have no “gold standard” testing modality in the assessment of lower urinary tract symptoms, and that urodynamic studies are a diagnostic tool to assist in patient management. Our results suggest that the addition of video fluoroscopy to the urodynamic study does not offer any advantage over regular multi-channel studies. Urodynamic studies play a more significant role in complicated tertiary care patients. Results from urodynamic testing
are a snapshot of what happened on the day of the test and must be added to the overall picture of complete patient assessment (including a thorough history and physical examination) when making decisions regarding management. Guidelines provided by the Society of Obstetricians and Gynaecologists of Canada23,24 and the International Consultation on Incontinence25 are helpful in guiding practitioners in their assessment of urinary incontinence. Current Canadian guidelines recommend urodynamic assessment in complicated patients and those with prior pelvic or incontinence surgery.24
We acknowledge that this is a retrospective pilot study with small patient numbers, but feel that prospective studies are required to evaluate the reproducibility of diagnoses made on cystometry and better elucidate its role in assessment of the female lower urinary tract.

REFERENCES
1. Abrams P, Blaivas JG, Stanton SL, Andersen JT. Standardisation of terminology of lower urinary tract function. Neurourol Urodyn 1988;17:403–27.
2. Bates CP, Whiteside CG, Turner-Warwick R. Synchronous urine pressure flow cystourethrography with special reference to stress and urge incontinence. Br J Urol 1970;42:714–23.
3. Khan MS, Chaliha C, Leskova L, Khullar V. The relationship between urinary symptom questionnaires and urodynamic diagnoses: an analysis of two methods of questionnaire administration. BJOG 2004;111(5):468–74.
4. Summitt RL Jr, Stovall TG, Bent AE, Ostergard DR. Urinary incontinence: correlation of history and brief office evaluation with multichannel urodynamic testing. Am J Obstet Gynecol 1992;166(6 Pt 1):1835–40; discussion 1840–4.
5. Cantor TJ, Bates CP. A comparative study of symptoms and objective urodynamic findings in 214 incontinent women. Br J Obstet Gynaecol 1980;87:889–92.
6. Jarvis GJ, Hall S, Stamp S, Miller DR, Johnson A. An assessment of urodynamic examination in incontinent women. Br J Obstet Gynaecol 1980;87:893–6.
7. McGuire EJ, Cespedes RD, Cross CA, O’Connell HE. Videourodynamic studies. Urol Clin North Am 1996;23(2):309–21.
8. Hextall A, Boos K, Cardozo L, Toozs-Hobson P, Anders K, Khullar V. Videocystourethrography with a ring pessary in situ. A clinically useful preoperative investigation for continent women with urogenital prolapse? Int Urogynecol J Pelvic Floor Dysfunct 1998;9:205–9.
9. Rosenzweig BA, Pushkin S, Blumenfeld D, Bhatia NN. Prevalence of abnormal urodynamic test results in continent women with severe genitourinary prolapse. Obstet Gynecol 1992;79:539–42.
10. Sonke GS, Kiemeney LA, Verbeek AL, Kortmann BB, Debruyne FM, de la Rosette JJ. Low reproducibility of maximum urinary flow rate determined by portable flowmetry. Neurourol Urodyn 1999;18:183–91.
11. Hansen F, Olsen L, Atan A, Nordling J. Pressure-flow studies: short-time repeatability. Neurourol Urodyn 1999;18:205–14.
12. Gupta A, Defreitas G, Lemack GE. The reproducibility of urodynamic findings in healthy female volunteers: results of repeated studies in the same setting and after short-term follow-up. Neurourol Urodyn 2004;23(4):311–6.
13. Sorensen S, Gregersen H, Sorensen SM. Long term reproducibility of urodynamic investigations in healthy fertile females. Scand J Urol Nephrol Suppl 1988;114:35–41.
14. Whiteside JL, Hijaz A, Imrey PB, Barber MD, Paraiso MF, Rackley RR, et al. Reliability and agreement of urodynamics interpretations in a female pelvic medicine center. Obstet Gynecol 2006;108(2):315–23.
15. Roovers JP, Oelke M. Clinical relevance of urodynamic investigation tests prior to surgical correction of genital prolapse: a literature review. Int Urogynecol J Pelvic Floor Dysfunct 2007;18(4):455–60.
16. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33(1):159–74.
17. Landis JR, Koch GG. An application of hierarchical Kappa type statistics in the assessment of majority agreement among multiple observers. Biometrics 1977;33(2):363–74.
18. Digesu GA, Hutchings A, Salvatore S, Selvaggi L, Khullar V. Reproducibility and reliability of pressure flow parameters in women. BJOG 2003;110(8):774–6.
19. Sorensen S. Urodynamic investigations and their reproducibility in healthy postmenopausal females. Scand J Urol Nephrol Suppl 1988;114:42–7.
20. Dionysios K, Veronikis MD, Nichols DH, Wakamatsu MD. The incidence of low-pressure urethra as a function of prolapse-reducing technique in patients with massive pelvic organ prolapse (maximum descent at all vaginal sites). Am J Obstet Gynecol 1997;177:1305–14.
21. Shukla A, Johnson D, Bibby J. Impact of patient position on filling phase of urodynamics in women. Int Urogynecol J Pelvic Floor Dysfunct 2006;17(3):231–3.
22. Mueller ER, Kenton K, Mahajan S, FitzGerald MP, Brubaker L. Urodynamic prolapse reduction alters urethral pressure but not filling or pressure flow parameters. J Urol 2007;177(2):600–3.
23. Farrell SA, Epp A, Flood C, Lajoie F, MacMillan B, Mainprize T, Robert M. The evaluation of stress incontinence prior to primary surgery: SOGC Clinical Practice Guideline No. 127, April 2003. J Obstet Gynaecol Can 2003;25(4):313–8.
24. Drutz HP, Farrell SA, Lemieux MC, Mainprize T, Wilkie D. Guidelines for evaluation and treatment of urinary incontinence following pelvic floor or incontinence surgery. SOGC Policy Statement No. 74, August 1998. J Soc Obstet Gynaecol Can 1998;20(8):778–81.
25. Staskin D, Hilton P, Emmanuel A, Goode P, Mills I, Shull B, et al. Initial assessment of incontinence. In: Abrams P, Cardozo L, Khoury S, Wein A, eds. Volume 1 of 3rd International Consultation on Incontinence. Health Publications Ltd. 2005;485–518.