ORIGINAL ARTICLE
Image documentation of endoscopic findings in ulcerative colitis: photographs or video clips?
Thomas de Lange, MD, Stig Larsen, DR SC, Lars Aabakken, MD, PhD
Oslo, Norway
Background: Previous studies have shown deficiencies in endoscopy reports and substantial interobserver variation in the assessment of endoscopic findings. The aim of this study was to determine how to perform systematic digital image documentation in ulcerative colitis and to evaluate whether mucosal inflammation is assessed equally on a still image and on a video clip. Methods: Eighteen video clips and their corresponding photographs that visualize different severities of ulcerative colitis were shown in randomized order to 20 experienced endoscopists. They assessed the mucosal inflammation of each image twice on a visual analog scale. Three comparisons were performed: between the video clips, between the photographs, and between the video clips and the photographs. Results: The mean score of the inflammation of the video clips at tape 1 and tape 2 was 4.74: 95% confidence interval (CI)[4.41, 5.08] and 4.90: 95% CI[4.56, 5.24], respectively, and of the photographs 4.53: 95% CI[4.19, 4.88] and 4.43: 95% CI[4.09, 4.77], respectively. The first answer explained 83% of the variation in the second answer for all comparisons, and the agreement index ranged from 0.38 to 0.42. Conclusions: The mucosal inflammation might be documented nearly as well with a still image as with a video clip. Systematic use of still images probably improves endoscopy reports by adding more objective information about the mucosal inflammation. (Gastrointest Endosc 2005;61:715-20.)
Ulcerative colitis (UC) is a common disease encountered in a gastroenterology practice. The assessment of disease activity has an important impact on therapeutic decisions, and improved diagnostic tools would benefit patients as well as society. Previous reports have emphasized the importance of endoscopic findings in the evaluation of disease activity.1,2 However, studies of endoscopy reports have shown deficiencies, and our previous study showed this specifically for UC, both in the description of various statements and in the features of the endoscopic activity index (EAI) for UC.3-8 The interobserver variation in assessing features of ulcerative colitis is also substantial, particularly in mild and moderate disease.9 The endoscopy reports might be improved by introducing systematic image documentation with 8 standard images in both upper-GI endoscopy and colonoscopy, as recommended by the European Society of Gastrointestinal Endoscopy (ESGE).10 Nevertheless, to our knowledge, no previous study has evaluated the use of systematic image documentation.
The main purpose of this study was to determine how to perform systematic image documentation in ulcerative colitis and to evaluate whether mucosal inflammation in ulcerative colitis, assessed with the EAI, might be evaluated as well on a still image from the most distal part of a colonic segment as on the video clip from the entire segment. The study was also designed to assess both the repeatability of the scores and the intraobserver agreement.
PATIENTS AND METHODS
Copyright © 2005 by the American Society for Gastrointestinal Endoscopy 0016-5107/2005/$30.00 + 0 PII: S0016-5107(05)00337-8
The study was approved by the local ethics committee and performed according to the Helsinki Declaration. The study enrolled 4 patients admitted for a scheduled colonoscopy. They had various degrees of UC. Three of the patients received 20 mg Buscopan IV (hyoscine butylbromide; Boehringer Ingelheim International GmbH, Ingelheim am Rhein, Germany) to optimize the visibility. The retraction of the colonoscope from the cecum was digitally recorded on a Sony digital video (DV) CAM DSR-20 MDP (Sony, Oslo, Norway), which permitted lossless editing and copying of the images. After the examination, the recordings were imported to Adobe Premiere 6.5 (Adobe
www.mosby.com/gie
Volume 61, No. 6 : 2005 GASTROINTESTINAL ENDOSCOPY 715
Image documentation of ulcerative colitis
Systems Nordic AB, Kista, Sweden), edited, and divided into 5 segments: right colon, transverse colon, left colon, sigmoid colon, and rectum. From the distal part of each segment, a still image was selected (Fig. 1A to D). Two clips were left out of the demonstrations to the observers: one to avoid too many images of normal mucosa and one for quality reasons. The remaining 18 video clips and their corresponding still images were exported in randomized order to two different DV tapes. These videotapes were presented randomly in 6 endoscopy units in two separate sessions with an interval of 4 weeks or more. In the first session, 10 observers from 3 different units evaluated tape 1, and the other 10 observers evaluated tape 2. The videotapes were shown on the endoscopy monitor in the respective unit. The mean duration of the video clips was 38.4 seconds (range 17-59 seconds), and the still images were shown for 15 seconds. The observers were experienced gastroenterologists, each having performed more than 1000 colonoscopies. They were asked to score the EAI in the most inflamed area of each video clip and still image on a 10-cm visual analog scale (VAS) according to the modified EAI.11-14 Endoscopic lesions might be assessed in different ways but are most frequently assessed on discrete scales. However, we used the VAS because previous studies have shown its benefits in the assessment of endoscopic lesions, and it was felt that the continuous outcome measure would yield a higher study power.15-18 At tape 1, the video clips and the photographs were referred to as V1 and P1, respectively, and at tape 2 as V2 and P2, respectively. In total, 1437 observations were recorded.
Statistical analysis
All continuously distributed variables are expressed as mean values with 95% confidence intervals (CI).19 All tests carried out in the analysis were 2-tailed, with a significance level of 5%. It is recognized that there was multiple testing of outcome data arising from individual endoscopist observers. Because the focus of the statistical analysis was to highlight differences, there has been no correction of p values to adjust for multiple testing of data; however, it is noted that all findings of nominal statistical significance in individual tests of hypotheses would have been removed by the Bonferroni method of correcting for the multiple testing of data. To analyze the interobserver agreement, the following agreement procedure was used. The mean difference between two measurements (DiffAB) is calculated and tested against zero by using the Hotelling T² statistic.20 Thereafter, the regression line between the two measurements is drawn and is tested to detect significant deviation from the line of equality (A = B).21 The standard deviation (SD) of the differences is referred to as SDdiff and the mean of the measurements as MeanAB. The standardized agreement index (AI), defined as AI = 1 − (2 SDdiff/MeanAB), is determined.22 A positive AI supports agreement, and a value larger than 0.5 indicates
de Lange et al
Capsule Summary
What is already known on this topic
• Endoscopic assessment of ulcerative colitis is important for disease activity evaluation and management.
• There is significant interobserver variation in the endoscopic assessment of mild to moderate ulcerative colitis.
• Systematic still-image documentation and video may minimize interobserver disagreement.
What this study adds to our knowledge
• In a prospective study, the mucosal inflammation of ulcerative colitis could be assessed nearly as well with still images representing each colonic segment as with videos.
good agreement. The agreement limits are defined as DiffAB ± 2 SDdiff, and an agreement plot of the difference against the mean of the two measurements is used to spot outliers, defined as differences lying outside of the agreement limits. Finally, the correlation coefficient between the mean of the measurements and the absolute value of the difference is calculated and tested to reveal whether the difference changes with increasing measurement values.
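The descriptive part of the agreement procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and the example scores are hypothetical, the sample standard deviation is assumed for SDdiff, and the Hotelling test and the test of the regression line against the line of equality are omitted.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def agreement_summary(a, b):
    """Summarize agreement between paired scores a (session 1) and b (session 2)."""
    diffs = [x - y for x, y in zip(a, b)]
    pair_means = [(x + y) / 2 for x, y in zip(a, b)]
    diff_ab = mean(diffs)        # DiffAB: mean difference
    sd_diff = stdev(diffs)       # SDdiff: sample SD of the differences (assumed)
    mean_ab = mean(pair_means)   # MeanAB: mean of all measurements
    lower, upper = diff_ab - 2 * sd_diff, diff_ab + 2 * sd_diff
    n_outliers = sum(1 for d in diffs if d < lower or d > upper)
    return {
        "diff_ab": diff_ab,
        "sd_diff": sd_diff,
        "mean_ab": mean_ab,
        "agreement_index": 1 - 2 * sd_diff / mean_ab,
        "limits": (lower, upper),
        "percent_outliers": 100 * n_outliers / len(diffs),
        # Does the size of the disagreement grow with the score level?
        "corr_mean_absdiff": pearson(pair_means, [abs(d) for d in diffs]),
    }


# Hypothetical VAS scores (0-10 cm) from one observer in two sessions.
session1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
session2 = [1.2, 1.6, 3.0, 4.5, 4.6, 6.1]
for name, value in agreement_summary(session1, session2).items():
    print(name, value)
```

Returning every quantity in one dictionary mirrors the rows of an agreement table, so the same function could be applied to each of the three comparisons.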
RESULTS
Comparisons of the mean score of inflammation
There was no significant difference in the mean score of inflammation between P1 and P2. However, there was a slight, though statistically significant, difference between V1 and V2 and between V1 and P2. The mean score of inflammation was 5% to 10% lower when using the photographs (Table 1).
Repeatability of assessments between two sessions
The scatter diagrams in Figure 2 appear to show identical patterns in the comparisons V1 to V2 (Fig. 2A), P1 to P2 (Fig. 2B), and V1 to P2 (Fig. 2C). In all 3 regression models relating two sets of measurements (on the X and Y axes), 83% of the total variance of one measurement (on the Y axis) is explained by the other measurement (on the X axis) by the same observer. The regression lines are approximately the identity line, Y = X, which indicates repeatability of the two measurements. This corresponds to a simple correlation of 0.91 between measurements (approximating the slopes) in each of the 3 regressions.
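The step from the explained variance to the correlation coefficient follows from the standard relation for a simple linear regression with one predictor. The symbols $R^2$ (explained share of variance) and $r$ (Pearson correlation) are notation introduced here for illustration:

```latex
\[
r = \sqrt{R^{2}} = \sqrt{0.83} \approx 0.91
\]
```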
Intraobserver agreement
The results of the agreement analysis demonstrated a small mean difference (range –0.16 to 0.21) between the scores at tape 1 and tape 2 (Table 2). The agreement index
Figure 1. Endoscopic still images of ulcerative colitis. A, Normal mucosa. B, Mild inflammation. C, Moderate inflammation. D, Severe inflammation.
TABLE 1. The mean score of inflammation of the colonic mucosa assessed on a 10-cm VAS, comparing video tapes and photographs

| | Video tape 1 | Video tape 2 | Photograph 1 | Photograph 2 |
|---|---|---|---|---|
| No. of observations | 359 | 358 | 360 | 360 |
| Mean score (95% CI) | 4.74 (4.41, 5.08) | 4.90 (4.56, 5.24) | 4.53 (4.19, 4.88) | 4.43 (4.09, 4.77) |
| Total range | 0-9.9 | 0-10.0 | 0-10.0 | 0-10.0 |

VAS, Visual analog scale; CI, confidence interval.
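A mean score with a 95% confidence interval of the kind reported in Table 1 can be computed along the following lines. This is a simplified sketch, not the authors' procedure: it assumes a normal approximation and treats all observations as independent, whereas the study's scores are clustered within observers and clips, and the article does not state how that was handled. The function name and the example scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_with_ci(scores, level=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    n = len(scores)
    m = mean(scores)
    se = stdev(scores) / sqrt(n)               # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return m, m - z * se, m + z * se


# Hypothetical VAS scores in cm.
m, lower, upper = mean_with_ci([2.0, 4.0, 6.0, 8.0])
print(f"{m:.2f} ({lower:.2f}, {upper:.2f})")
```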
The present study clarifies a major concern and determines the role of still images compared with video
clips in systematic image documentation. It clearly demonstrates that the mucosal inflammation in ulcerative colitis might be assessed nearly as well with still images as with video clips with both an identical repeatability between the observations and a good intraobserver agreement. However, when graded on a VAS, the inflammation was scored slightly lower on the photographs. Nevertheless, this slight difference probably does not have any clinical consequences. This result might have an important clinical impact. It indicates that the shortcomings of the present text
was relatively good (range 0.38-0.42) for all 3 comparisons, and the percentage of outliers ranged from 6.4% to 7.2% (Table 2). The agreement plots compared the video clips (V1/V2) (Fig. 3A), the photographs (P1/P2) (Fig. 3B), and the video clips (V1) to the photographs (P2) (Fig. 3C).
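As a worked check, the agreement index defined in the statistical analysis can be recomputed from the rounded SDdiff and mean values given in Table 2. The subscripts are labels introduced here for illustration. The third value comes out at about 0.41 rather than the tabulated 0.40, a gap that is plausibly due to rounding of the tabulated inputs:

```latex
\begin{align*}
\mathrm{AI}_{\text{video}} &= 1 - \frac{2 \times 1.39}{4.83} \approx 0.42 \\
\mathrm{AI}_{\text{photo}} &= 1 - \frac{2 \times 1.39}{4.48} \approx 0.38 \\
\mathrm{AI}_{\text{video/photo}} &= 1 - \frac{2 \times 1.37}{4.64} \approx 0.41
\end{align*}
```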
DISCUSSION
TABLE 2. Intraobserver agreement analysis

| | Video score | Photo score | Video & photo score |
|---|---|---|---|
| Mean of score tape 1 & tape 2 | 4.83 | 4.48 | 4.64 |
| Meandiff score tape 1 − tape 2 | −0.16 | −0.10 | 0.31 |
| SDdiff | 1.39 | 1.39 | 1.37 |
| Agreement index | 0.42 | 0.38 | 0.40 |
| Percent outliers | 6.72% | 7.22% | 6.41% |
| Correlation* | 0.05 | 0.17 | 0.11 |

SD, Standard deviation. *Correlation between the mean and the absolute difference of two measurements.
documentation systems might be alleviated by systematic recordings of still images. This also is a strong argument to follow the recommendations of ESGE.10 The increasing opportunity of integrated digital images in the electronic
endoscopy reports at a low cost also makes it feasible to follow these recommendations. In routine practice, still images present major advantages over video clips. First of all, they are recorded instantly, with no later need for editing. The files are relatively small, and the storage and the transfer of these files are very convenient and will probably facilitate both supervision and access to second opinions. Finally, they can be presented on paper in a regular patient record. It is still unclear to what extent it is possible to compress the image files without compromising the image quality, even if it seems that image compression in JPEG format from full size 600 Kb to 30 Kb does not affect the visual and diagnostic quality in a significant way.23,24 The recently developed compression algorithm, JPEG 2000, aims to allow further lossless compression. Video clips frequently need time-consuming editing after the examination. This problem was illustrated in the study of Wurnig et al.25 Video clips also require substantial storage space; a full size, 30-second video clip requires approximately 700 MB of storage capacity and needs 2.5 minutes to be transferred on a 2 MB/second broadband network and approximately 30 minutes on an integrated services digital network connection. However, video clips are in many situations very suitable for educational purposes. Further progress in video compression algorithms, such as MPEG 7 and 21, would probably facilitate both the storage and the transmission of video clips. However, to our knowledge, no study has determined the degree of compression possible without compromising the diagnostic and the visual quality; further studies are needed. The high study power made it possible to detect small differences as statistically significant. However, these differences are clinically quite insignificant, and, therefore, we considered an agreement analysis to be justified.
This study showed a relatively good agreement index, but we observed a borderline excess in the percentage of
Figure 2. Correlation between the assessment of inflammation on video clips. A, At tape 1 and tape 2. B, Photographs at tape 1 and tape 2. C, Between video clips at tape 1 and photographs at tape 2. The lines of regression are drawn with the 95% confidence interval.
outliers. However, some investigators26 consider this interpretation too conservative and hold that the standard deviation should be multiplied by the square root of 2. If we had chosen those agreement limits, the percentage of outliers would have ranged from 1% to 1.5%, which is quite acceptable. Methodologic concerns also have to be discussed, because comparable studies have not previously been performed. We chose to compare two different methods of image documentation that can be applied in clinical practice. We postulated that the video clips are the criterion standard. Nevertheless, in the live situation, the endoscopists receive diagnostic information both at the insertion and at the retraction of the endoscope. However, at least in colonoscopy, this frequently requires long video clips to be recorded and stored. It also would be particularly impractical to reexamine these tapes before a new examination. In studies where the observers are asked to perform a detailed evaluation and description of the mucosa, the live situation would be ideal. However, it would be difficult to have the necessary number of observers attending the same examination live, but recording both the insertion and the withdrawal of the endoscope with a playback possibility would probably reproduce the live situation sufficiently. The observers in the present study had the opportunity to examine the images for a limited duration; therefore, further studies are needed to evaluate whether the time of observation influences the results. To our knowledge, this is the first study of systematic image documentation; however, it deals only with a diffuse segmental disease. The requirements of image documentation may be different with other endoscopic findings. The current method of evaluating images is time consuming. A more efficient methodology has to be developed; an Internet solution is desirable.
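The remark on outliers and on widening the agreement limits by the square root of 2 can be put into perspective with a short calculation. The sketch below assumes that the differences are approximately normally distributed, which the article does not state, and the function name is hypothetical. Under that assumption, roughly 4.6% of differences are expected to fall outside limits of ± 2 SDdiff, which may be the benchmark against which the observed 6.4% to 7.2% appears as a borderline excess.

```python
from math import sqrt
from statistics import NormalDist


def expected_percent_outside(k):
    """Percent of normally distributed differences beyond mean ± k SD."""
    return 100 * 2 * (1 - NormalDist().cdf(k))


narrow = expected_percent_outside(2)          # limits of ± 2 SDdiff
wide = expected_percent_outside(2 * sqrt(2))  # SDdiff multiplied by sqrt(2)
print(f"expected outside ± 2 SD: {narrow:.2f}%")
print(f"expected outside ± 2*sqrt(2) SD: {wide:.2f}%")
```

The wider limits leave well under 1% expected outside, so an observed 1% to 1.5% would still sit slightly above the normal-theory expectation while being small in absolute terms.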
In conclusion, mucosal inflammation of UC might be assessed nearly as well by a still image from the most distal part of a colonic segment as by the entire video clip from the segment. It is therefore strongly recommended to record at least 5 still images from a colonoscopy to document mucosal inflammation and to record two supplementary images to prove the completeness of the examination; this is easily and quickly done in clinical practice.
ACKNOWLEDGMENTS
Figure 3. The intraobserver agreement plot comparing video scores. A, Tape 1 and tape 2. B, Photographs at tape 1 and tape 2. C, Video scores at tape 1 to photograph scores at tape 2, expressed as the mean difference with the agreement limits.
We thank the 20 observers: E. Aadland, Z. Konopski, O. C. Lunde (Aker University Hospital); J. Løvik, A. Rydning, T. Sandanger, J. Skötö (Akershus University Hospital); E. Lind, Ø. Dyrhaug, H. Torvik (Asker and Bærum Central Hospital); V. Skar, F. Strøm, R. Tronstad (Lovisenberg Diakonale Hospital); F. Lerang, T. Hauge, B. Moum, P. Sandvei (Østfold Central Hospital); B. Hofstad, I. Lygren, C. Solberg (Ullevål University Hospital).
REFERENCES
1. The role of colonoscopy in the management of patients with inflammatory bowel disease. American Society for Gastrointestinal Endoscopy. Gastrointest Endosc 1998;48:689-90.
2. Marteau P. Inflammatory bowel disease. Endoscopy 2002;34:63-8.
3. Gouveia-Oliveira A, Raposo VD, Salgado NC, Almeida I, Nobre-Leitao C, de Melo FG. Longitudinal comparative study on the influence of computers on reporting of clinical data. Endoscopy 1991;23:334-7.
4. Kuhn K, Gaus W, Wechsler JG, Janowitz P, Tudyka J, Kratzer W, et al. Structured reporting of medical findings: evaluation of a system in gastroenterology. Meth Inf Med 1992;31:268-74.
5. Kuhn K, Swobodnik W, Johannes RS, Zemmler T, Stange EF, Ditschuneit H, et al. The quality of gastroenterological reports based on free text dictation: an evaluation in endoscopy and ultrasonography. Endoscopy 1991;23:262-4.
6. Moorman PW, van Ginneken AM, van der Lei J, Siersema PD, van Blankenstein M, Wilson JH. The contents of free-text endoscopy reports: an inventory and evaluation by peers. Endoscopy 1994;26:531-8.
7. Moorman PW, van Ginneken AM, Siersema PD, van der Lei J, van Bemmel JH. Evaluation of reporting based on descriptional knowledge. J Am Med Inform Assoc 1995;2:365-73.
8. de Lange T, Moum BA, Tholfsen JK, Larsen S, Aabakken L. Standardization and quality of endoscopy text reports in ulcerative colitis. Endoscopy 2003;35:835-40.
9. de Lange T, Larsen S, Aabakken L. Inter-observer agreement in the assessment of endoscopic findings in ulcerative colitis. BMC Gastroenterol [serial online]. 2004;4:9. Available at: http://www.biomedcentral.com/1471-230X/4/9.
10. Rey JF, Lambert R. ESGE recommendations for quality control in gastrointestinal endoscopy: guidelines for image documentation in upper and lower GI endoscopy. Endoscopy 2001;33:901-3.
11. Gomes P, du Boulay C, Smith CL, Holdstock G. Relationship between disease activity indices and colonoscopic findings in patients with colonic inflammatory bowel disease. Gut 1986;27:92-5.
12. Holmquist L, Ahren C, Fallstrom SP. Clinical disease activity and inflammatory activity in the rectum in relation to mucosal inflammation assessed by colonoscopy. A study of children and adolescents with chronic inflammatory bowel disease. Acta Paediatr Scand 1990;79:527-34.
13. Roseth AG, Aadland E, Jahnsen J, Raknerud N. Assessment of disease activity in ulcerative colitis by faecal calprotectin, a novel granulocyte marker protein. Digestion 1997;58:176-80.
14. Baron JH, Connell AM, Lennard-Jones JE. Variation between observers in describing mucosal appearances in proctocolitis. Br Med J 1964;1:89-92.
15. Aabakken L, Larsen S, Osnes M. Visual analogue scales for endoscopic evaluation of nonsteroidal antiinflammatory drug-induced mucosal damage in the stomach and duodenum. Scand J Gastroenterol 1990;25:443-8.
16. Aabakken L, Olaussen B, Mowinckel P, Osnes M. Gastroduodenal lesions associated with two different piroxicam formulations. An endoscopic comparison. Scand J Gastroenterol 1992;27:1049-54.
17. Larsen S, Aabakken L, Lillevold PE, Osnes M. Assessing soft data in clinical trials. Pharmaceutical Medicine 1991;5:29-36.
18. Osnes M, Larsen S, Eidsaunet W, Thom E. Effect of Diclofenac and Naproxen on gastroduodenal mucosa. Clin Pharmacol Ther 1979;26:399-405.
19. Altman DG. Relation between two continuous variables. Practical statistics for medical research. 1st ed. London: Chapman and Hall; 1991. p. 277–321.
20. Anderson TW. The generalized T2-statistic. An introduction to multivariate statistical analysis. 1st ed. New York: Wiley; 1984. p. 101–22.
21. Kleinbaum DG, Kupper LL, Muller KE, Nizam A. Testing hypotheses in multiple regression. Applied regression analysis and other multivariable methods. 3rd ed. Boston: PWS-Kent; 1998. p. 136–59.
22. Aaras A, Veierod MB, Larsen S, Ortengren R, Ro O. Reproducibility and stability of normalized EMG measurements on musculus trapezius. Ergonomics 1996;39:171-85.
23. Kim CY. Compression of color medical images in gastrointestinal endoscopy: a review. Medinfo 1998;9:1046-50.
24. Maycon ZR, Korman LY, Kim CY. Application of image compression to digitized gastrointestinal (GI) endoscopic color images: a pilot study [abstract]. Gastrointest Endosc 1997;45:AB34.
25. Wurnig PN, Hollaus PH, Wurnig CH, Wolf RK, Ohtsuka T, Pridun NS. A new method for digital video documentation in surgical procedures and minimally invasive surgery. Surg Endosc 2003;17:232-5.
26. Brennan P, Silman A. Statistical methods for assessing observer variability in clinical measures. BMJ 1992;304:1491-4.
Received July 7, 2004. Accepted December 18, 2004. Current affiliations: Department of Gastroenterology, Ullevaal University Hospital, Oslo, Norway, Department of Epidemiology, Norwegian School of Veterinary Medicine, Oslo, Norway, Department of Gastroenterology, Rikshospitalet, Oslo, Norway. This work was a poster at Digestive Disease Week, May 15-20, 2004, New Orleans, Louisiana (Gastrointest Endosc 2004;59:AB272). Oral presentation at the Nordic meeting of Gastroenterology, June 2-5, 2004, Oslo, Norway (Scand J Gastroenterol 2004;39[Suppl 240]:34). Reprint requests: Thomas de Lange, MD, Department of Gastroenterology, Ullevaal University Hospital, Kirkeveien 166 0407, Oslo, Norway.