A methodology and implementation for annotating digital images for context-appropriate use in an academic health care environment


Journal of the American Medical Informatics Association, Application of Information Technology, Volume 11, Number 1, Jan/Feb 2004

PATRICIA A. GOEDE, BS, JASON R. LAUMAN, BS, CHRISTOPHER COCHELLA, MS, GREGORY L. KATZMAN, MD, DAVID A. MORTON, PHD, KURT H. ALBERTINE, PHD

Abstract

Use of digital medical images has become common over the last several years, coincident with the release of inexpensive, mega-pixel quality digital cameras and the transition to digital radiology operation by hospitals. One problem that clinicians, medical educators, and basic scientists encounter when handling images is the difficulty of using business and graphic arts commercial-off-the-shelf (COTS) software in multicontext authoring and interactive teaching environments. The authors investigated and developed software-supported methodologies to help clinicians, medical educators, and basic scientists become more efficient and effective in their digital imaging environments. The software that the authors developed provides the ability to annotate images based on a multispecialty methodology for annotation and visual knowledge representation. This annotation methodology is designed by consensus, with contributions from the authors and physicians, medical educators, and basic scientists in the Departments of Radiology, Neurobiology and Anatomy, Dermatology, and Ophthalmology at the University of Utah. The annotation methodology functions as a foundation for creating, using, reusing, and extending dynamic annotations in a context-appropriate, interactive digital environment. The annotation methodology supports the authoring process as well as output and presentation mechanisms. The annotation methodology is the foundation for a Windows implementation that allows annotated elements to be represented as structured eXtensible Markup Language and stored separate from the image(s).

J Am Med Inform Assoc. 2004;11:29–41. DOI 10.1197/jamia.M1247.

Affiliations of the authors: Department of Radiology, University of Utah Health Sciences Center, Salt Lake City, UT (PAG, JRL, CC, GLK); Department of Pediatrics, University of Utah Health Sciences Center, Salt Lake City, UT (PAG, JRL, CC, KHA); Department of Neurobiology and Anatomy, University of Utah Health Sciences Center, Salt Lake City, UT (DAM, KHA). Research and development for this project was funded partially by a State of Utah Center of Excellence Grant. Partial funding was also from the George S. and Dolores Dore Eccles Foundation and a Benning Research Grant from the Department of Radiology. Additional support was provided by the Department of Pediatrics and the Department of Neurobiology and Anatomy, University of Utah Health Sciences Center. Correspondence and reprints: Patricia Goede, University of Utah Health Sciences Center, Department of Pediatrics, Program in Imaging, Communication, and Collaboration, Electronic Medical Education Resource Group, 30 N 1900 E, Salt Lake City, UT 84132-2202. Received for publication: 09/10/02; accepted for publication: 07/27/03.

Annotating digital images with symbols and text is a fundamental task that a clinician, medical educator, or basic scientist must perform when preparing material for academic use.1–3 Image annotation, in the broad sense, includes any means that allows an author to label, point to, or otherwise indicate some feature of the image that is to be the focus of attention.1,4–7 Authors should be able to perform the task of annotation quickly and easily to optimize utility, workflow, and time management. Unfortunately, annotation is made difficult by the lack of tools for annotation of digital media for
use in a context-appropriate setting (i.e., colleagues, students, patients) that promotes reuse of the annotated material. Commercial-off-the-shelf (COTS) software is commonly used to create and present material for academic use and is general enough to handle most tasks, but it does not support the author’s conceptual framework or workflow.1,8 Difficulties arise for several reasons, including lack of an optimal file format that supports reuse, lack of a methodology for annotating digital material in a hierarchical fashion that does not embed the annotations within the raster-based image, and lack of mechanisms to index and catalogue annotated material for reuse. Consequently, clinicians, medical educators, and basic scientists amass enormous numbers of images, many of which are duplicated.

Two unanswered questions that clinicians, medical educators, and basic scientists have are (1) how to annotate digital material in a widely accepted manner with a clearly defined set of rules (methodology) that supports reuse of the annotated images and (2) how to create material for context-appropriate reuse in lectures, case conferences, and publications, without having to maintain multiple copies or file formats for different uses.9,37 For example, when an image from the clinical Picture Archiving and Communications System (PACS) is acquired, the radiologist has the option of saving key images as Tagged Image File Format (TIFF), Joint Photographic Experts Group (JPEG), or bitmapped image (BMP) files (depending on vendor platform and capability). After the image is acquired, the radiologist typically uses raster-based COTS software to annotate the image for a single use within a given presentation (e.g., applying annotations in Photoshop for use in a PowerPoint presentation). Basic science teachers, such as gross anatomists, encounter the same problems because they use enormous numbers of labeled images to teach human anatomy.7,10,11 Each image may contain several annotations or groups of annotations that are necessary to convey a certain point.

Most software applications do not have the ability to manage visual annotations and text in the output, which leads to a cluttered, overly annotated image. Moreover, many software applications provide a feature to manage annotations (i.e., visible/invisible) within the application but do not provide the ability to manage the annotations in the output. The annotations are embedded in the flattened, permanently altered image. Each flattened image has one specific use within a single context, which necessitates multiple duplications of the same annotated image, as opposed to annotating a single image that can be presented with some, all, or none of the annotations, thereby allowing reuse for multiple purposes in multiple contexts. Flattened image annotations result in a variety of undesirable side effects: repetition of work, increased authoring effort, increased organization requirements, increased complexity, difficulty automating image cataloging, and reduced instructional capability.

The solution is to define an annotation methodology that is the foundation for development of a software implementation that facilitates annotation of digital image data, tracks the inherent structure, and identifies relationships or intellectual groupings of the annotations.7,10 The annotation methodology forms the basis of an eXtensible Markup Language (XML)12 schema that defines a platform-independent annotation exchange format.
The annotations are stored as vector information in the Scalable Vector Graphics (SVG) format and remain linked to the original image.13 This methodology keeps annotations accessible as vector data, not embedded in the image, and maintains links between the image and annotation information. Because the annotations are vector based, they can be overlaid and linked to images with similar features that are generated from other imaging modalities. Therefore, the annotations and images become reusable. Moreover, with vector-based image annotations, management of multiple original versions in a proprietary format or distribution of multiple copies of the same image is not necessary.

We identified requirements for and implemented a simple, yet effective, image annotation tool based on the annotation methodology. The tool was designed to be simple to use, to use vector-based annotations, and to represent annotation groupings. These requirements created the need for a standardized methodology to effectively communicate visual annotations in a consistent and congruent manner while preserving images for reuse. Therefore, the focus of this report is to present the annotation methodology. We also define the schema for development of a software implementation for multispecialty visual annotation of digital images. The software implementation also facilitates visual annotation interactivity and context-appropriate viewing of the visually annotated images. The annotation implementation was developed specifically for clinicians, medical educators, and basic scientists who required the ability to annotate images with visual expert knowledge for viewing in an interactive, context-appropriate, digital environment.
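The separation of vector annotations from the raster image can be illustrated with a minimal Python sketch. It is not the authors' schema, which is not reproduced in this section: the function name `build_overlay`, the dictionary keys, and the file name `ct_neck.png` are hypothetical. The sketch writes an SVG document that links to the image by reference and layers a region-of-interest polygon and a label on top of it, so the image file itself is never modified.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def build_overlay(image_href, width, height, annotations):
    """Return SVG text that references a raster image and draws vector
    annotations over it. The image is linked, not embedded or altered."""
    ET.register_namespace("", SVG_NS)
    ET.register_namespace("xlink", XLINK_NS)
    svg = ET.Element(f"{{{SVG_NS}}}svg", {
        "width": str(width),
        "height": str(height),
        "viewBox": f"0 0 {width} {height}",
    })
    # Link to the untouched original image.
    ET.SubElement(svg, f"{{{SVG_NS}}}image", {
        f"{{{XLINK_NS}}}href": image_href,
        "width": str(width),
        "height": str(height),
    })
    for ann in annotations:
        # One group per annotation so it can be shown or hidden as a unit.
        group = ET.SubElement(svg, f"{{{SVG_NS}}}g", {"id": ann["id"]})
        points = " ".join(f"{x},{y}" for x, y in ann["roi"])
        ET.SubElement(group, f"{{{SVG_NS}}}polygon", {
            "points": points,
            "fill": "none",
            "stroke": ann.get("stroke", "white"),
        })
        text = ET.SubElement(group, f"{{{SVG_NS}}}text", {
            "x": str(ann["label_at"][0]),
            "y": str(ann["label_at"][1]),
            "fill": ann.get("stroke", "white"),
        })
        text.text = ann["label"]
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    print(build_overlay("ct_neck.png", 512, 512, [{
        "id": "cyst",
        "label": "Cyst",
        "roi": [(10, 10), (40, 10), (40, 40)],
        "label_at": (50, 20),
    }]))
```

Because the overlay only references the image, the same annotation file could in principle be pointed at a different rendition of the image, which is the reuse property the text describes.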

Design Objectives: Methods

This project had three objectives. First, we conducted a multispecialty requirements assessment, consisting of clinicians, medical educators, and basic scientists, to establish a guideline for annotation and visual knowledge representation. The second objective was to establish a methodology for annotation and visual knowledge representation and further define a specification to be translated into compiled code (software). The third objective was to develop, implement, and evaluate a prototype software application for annotation and knowledge representation that would be used by the participants in the requirements analysis.

Requirements Assessment

Based on the desire to develop a software implementation for annotation, we organized a requirements assessment to define what is involved in the process of annotation, what elements and features are needed for annotation, and how the output would be used and in what context (e.g., is the output going to be viewed via the World Wide Web [WWW], is the output targeted for publication in a peer-reviewed journal, is the output to be viewed primarily by a group of investigators, or will the output need to be viewed by patients or other health professionals?). This assessment established the basic design requirements for the annotation software implementation. The requirements assessment was conducted within radiology (5 radiologists) and was extended to query other medical science professionals, including two anatomists, a dermatologist, two ophthalmologists, and three medical students, to establish guidelines and a foundation for annotation (Table 1). The methods for collecting the requirements began as an open-ended discussion with the participants, who described their requirements for annotating digital images.

All of the radiologists who participated in the requirements analysis are practicing clinicians. One of the five radiologists also participates in a clinical research project. The two anatomists direct and teach the human gross anatomy course for first-year medical students. One of the anatomists is also involved in basic science lung and cancer research. The dermatologist is a clinician. Of the two ophthalmologists, one is a clinician, and the other is a basic science vision investigator. The three medical students were enrolled in the human gross anatomy course. We invited clinical and basic science investigators who were not part of the formal requirements assessment to review and give feedback on what was established in the formal requirements assessment.

We selected participants based on use of images to convey information (e.g., imaging results, clinical information, or teaching) as part of their clinical, educational, or research activities. The participants were familiar with annotation and had extensive experience annotating images, ranging from traditional Letraset to sophisticated image processing software such as Adobe Photoshop,14 Adobe Illustrator,15 and Microsoft PowerPoint.16 Only individuals who expressed the need for a simple tool to visually annotate an image with pointers and labels to convey information and/or multiple output formats and reuse of the archive image participated in the requirements assessment.

Table 1. Summary of Participants in the Requirements Assessment

Column headings: Specialty; Clinical Research; Basic Science Research; Clinical Activities; Teaching (Formal Instruction); Number of Participants
Specialty: Radiology; Ophthalmology; Neurobiology and anatomy; Dermatology; Ophthalmology; Second-year medical students
Number of Participants: 5; 2; 2; 1; 2; 3
Activity counts (row and column alignment not recoverable): 1 1 1; 4 1; 1; 2* 1 1 3
*One of the participants performs basic science research (lung and cancer) as well as teaches human gross anatomy.

Annotation Methodology for Visual Knowledge Representation

Concurrent with the requirements assessment was the definition of a methodology for annotation and visual knowledge representation. To define the annotation methodology, the participants were asked to describe how they use annotated images, including what software they most commonly use, in what contexts, and what types of output (format) they require. Additionally, participants were asked to provide detailed information on how they store annotated images, their process for locating and retrieving digital images, problems they encounter retrieving images and visual knowledge, and what type of storage they use to store images (i.e., database, fileserver, backup media). Interface design questions directly related to the software implementation, such as icon use, placement of icons, and drop-down menus, were posed to the participants to get feedback on those features.

Results

Part of the requirements assessment was to determine if the need for annotation transcends the boundaries of clinical, basic science research, and teaching efforts. For example, two open-ended questions, “what is a region of interest?” and “what is the definition of a pointer?” were asked to determine if the concepts were different for clinicians compared with basic scientists and educators. Examples of the top five questions, specific to annotation, that participants responded to in the formal requirements analysis are shown in Table 2. The results of the requirements assessment identified the following fundamental annotation requirements:

• Accurately identify a feature on an image
• Identify a region of interest (ROI) with a point, edge, or polygon on the image
• Label a feature or ROI with alphanumeric symbols, labels, and captions
• Adjust visible contrast and color of annotations
• Organize annotations into hierarchical groupings
• Define arbitrary groups of annotations for special-purpose and context-appropriate viewing
• Support third-party lexicons and nomenclature
• Annotate images in any file format

Table 2. Example of the Five Most Frequently Responded to Questions in Requirements Assessment

How would you identify a region of interest (ROI)?
How would you label a region of interest (ROI)?
How can you group annotations for a particular image?
Is it necessary to define arbitrary groups of annotations for special-purpose and context-appropriate viewing?
What about inclusion and support for third-party lexicons and nomenclature?

The most frequently identified requirements were to define and outline a region of interest; label it with a symbol, label, or caption; set the visible contrast of an annotation (e.g., black on white or white on black); group the annotated regions of interest in a hierarchical fashion (e.g., similar to a table of contents); and provide support for third-party lexicons. The participants also identified the need for a software application that would allow them to annotate images in different file formats and not be constrained by a single image format. All participants in the requirements analysis identified the need to have annotated material in a format that promotes reuse (e.g., cross-media publishing in journals, conference presentations, lectures, and Web viewing). Another desirable feature was the ability to reuse annotated material for context-appropriate viewing (e.g., clinician–clinician, clinician–basic scientist, medical educator–student, and eventually clinician–patient) as an interactive resource. Most important to the participants was the need for a tool that provided all of the functional annotation requirements and reduced the undesirable side effects: repetition of work, excessive authoring effort, organizational requirements, complexity, difficulty automating image cataloging, and limited instructional capability. Finally, participants were asked to identify categories into which the annotation requirements should be placed.
The resultant four categories are (1) visual annotation, (2) textual information, (3) presentation attributes, and (4) interactive features. The results are summarized in Table 3 and discussed in the following sections.

Model for Annotating Digital Images: The Process of Annotation

Actual output from the annotation software implementation that maps how an image may be annotated for context-appropriate viewing (multiple audiences) and reused in academic activities (journal, conference, teaching, Web viewing) is shown in Figure 1. The annotation requirements of the individual author included selectable options for style, color, size, orientation, visual elements used to indicate features, and presentation methods. Authors expressed a fundamental need to visually annotate areas of an image that are relevant to their instructional audience and goal.4,7,11,17,18 Equally important was the need to establish a method to share or discuss their images in a given context. Additionally, investigators entering into the task of annotation wanted to accomplish the annotation task without being too constrained by the methodology. Lastly, authors did not want this task to be hindered by complexity or by cumbersome software.

The process of annotation is illustrated in Figure 2. Our annotation methodology is divided into four sections, as defined by the participants during the requirements analysis. The first section is the definition of the data that are used to construct the visual parts of the annotation, which are the region of interest (ROI) and the pointer. The second section is the textual data that are tied to the visual annotation and include the symbol, label, and caption. The textual data also include the order, which is used to prioritize and group the annotations logically. The third section contains the presentation attributes that direct the size, color, shape, and visibility of the annotation. The fourth section defines the interactive features that facilitate presentation to multiple audiences in a context-appropriate manner. Combination of the four sections in the annotation methodology established a simple and flexible information model that gives the author precise means to add meaningful annotations to an image.

Table 3. Categories of the Annotation Methodology

Visual Annotation: Region of interest; Pointer; Symbol
Textual Information: Order (list); Label(s); Caption(s); Lexicon(s)
Presentation Attributes: Size (small, default, large); Pointer type (none, line, edge, arrow); Color (light, default, dark); Pointer location; Visibility
Interactive Features: Hierarchical grouping; Arbitrary views; Rollovers; Zooming

Figure 1. Reuse and context-appropriate viewing of an annotated image. The annotated image is a computed tomogram (CT) of a cross-sectional slice across the neck of a patient who had a cyst on the left side of the neck. The cyst is annotated with a light gray polygon to create a region of interest, a light gray pointer (arrow), and a light gray label (Cyst). Other annotated structures are the carotid sheath (dark gray annotations) and carotid artery (light gray annotations). The pop-up boxes show the hierarchical structure of the annotations. Reuse is indicated by directing the annotated image to three outputs. Context-appropriate views are available to many users because the annotation structure enables the users to select the annotations that are displayed or hidden.

Figure 2. Process of annotating raster images with vector annotations. The left side of the figure shows the steps to annotate an image. The author opens an image, adds annotations through an iterative process, and saves the annotated image. The right side of the figure shows an example of a computed tomographic (CT) image of the head of a patient who had hemorrhage metastasis in the brain. Grouping of the annotations is shown in the box below the annotated CT image.

Section 1: Visual Annotation

Region of Interest

The visible portion of the annotation includes a region of interest (ROI). The participants defined the ROI as a feature or structure on an image (e.g., pathology, tumor, nerve) that conveys a clinical or research finding. The participants defined the need to be able to highlight (draw a point, line, or polygon) to indicate an ROI. The ROI is most often accompanied by a pointer and symbol that allow the author to identify features that convey relevant information about an image (Fig. 3).5,17

Figure 3. Display of regions of interest on the base of the skull. The base of the skull has three depressions (called cranial fossae) on its inside surface. The fossae are named anterior cranial fossa (ACF), middle cranial fossa (MCF), and posterior cranial fossa (PCF). Each fossa is highlighted by a region of interest polygon. The region of interest polygon in the ACF has a thin line and numerous white circles, both of which designate an active polygon. When a polygon (or other type of pointer) is active, the white circles serve as nodes to adjust the position of the region of interest line relative to landmarks in the image. For this figure, the contents of the MCF are enabled (turned on). The annotation methodology was used to annotate points, pointers, and symbols to identify holes (foramina) through which nerves and blood vessels exit or enter the MCF. Symbols in the MCF are FS (foramen spinosum), FO (foramen ovale), FR (foramen rotundum), SOF (superior orbital fissure), and FL (foramen lacerum).

Pointer

The pointer for the annotation is partially defined by the author and partially computed based on where the author initially places it. For example, the author selects where the tail of the pointer should appear, and an algorithm calculates the closest point on the ROI to place the pointer tip.10 This dual mechanism for anchoring the pointer allows the author to make choices about the layout of visual information on the image without relying on a totally automated and potentially unpredictable layout algorithm.19
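The tip placement described for the pointer can be sketched as a closest-point search over the edges of the ROI. The text does not give the authors' algorithm, so the following Python is only one plausible reading: it assumes the ROI is a closed polygon given as a list of (x, y) vertices, and the function names are hypothetical.

```python
def closest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is nearest to point p."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a  # degenerate segment: a and b coincide
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def pointer_tip(tail, roi):
    """Return the point on the ROI boundary closest to the pointer tail.

    `roi` is a list of (x, y) vertices of a closed polygon; a single
    vertex is treated as a point ROI.
    """
    best = None
    best_dist_sq = float("inf")
    for i in range(len(roi)):
        a = roi[i]
        b = roi[(i + 1) % len(roi)]  # wrap around to close the polygon
        c = closest_point_on_segment(tail, a, b)
        dist_sq = (c[0] - tail[0]) ** 2 + (c[1] - tail[1]) ** 2
        if dist_sq < best_dist_sq:
            best, best_dist_sq = c, dist_sq
    return best


if __name__ == "__main__":
    square = [(0, 0), (10, 0), (10, 10), (0, 10)]
    print(pointer_tip((15, 5), square))
```

The author still chooses the tail position; only the tip is computed, which matches the dual mechanism described above.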

Symbol

The symbol that is customarily associated with a visual piece of the annotation is taken from the textual information that is derived from a lexicon or free text entry. In the annotation software implementation, the symbol is an abbreviation, typically derived from the label, that is six characters or shorter. The character length of the symbol allows it to be drawn on the image with numerous sets of other annotations, without obscuring visual information or interfering with the other annotations. When the symbol is used in this manner, it may be used as a key to link the visual annotation to the textual information. These presentation options are defined in the presentation attributes section, below.

Section 2: Textual Information

The textual information that is defined by the annotation methodology includes the symbol (described in the previous section), label, and caption (Fig. 4). Providing the ability to add textual information about the annotation enables the author to comment or add his or her expert knowledge on contents of an image in the form of a symbol, label, and caption. The comments may refer to a detail of the image or the annotated image as a whole.20 The symbol, label, and caption are a set of information commonly used across many fields but may have specialty-specific terminology.17,18

Figure 4. Illustration of the textual attributes enabled in Figure 3 for the middle cranial fossa. The annotated image in Figure 3 was zoomed for Figure 4 to show that the annotations remain anchored to their anatomic structure. Moreover, the annotations did not pixelate despite zooming. Both advantageous outcomes are the result of using scalable vector graphics.

Label

The label is the word or phrase that defines the visual annotation. For medical purposes, this label may also be taken from a lexicon or vocabulary, which enables dictionary-style lookup in the software implementation. The lexicon-specific piece of textual information allows the annotation to be linked to a larger body of information outside the image. For authors who do not use lexicons during the authoring process, the symbol may be enough to match the annotation with external information. The annotation methodology does not restrict or define lexicons because use of lexicons is the author’s preference or institution’s policy. If the label is drawn from a defined lexicon, it should at least be consistent across the author’s work.

Caption

The caption is defined as a sentence or paragraph that describes the annotation. The description may include references to other pieces of information that may be part of an index or hypertext system. The caption should not contain information about the image as a whole, which is handled through a constant nonvisual annotation (i.e., image metadata21).

Order or Grouping

The order is a character sequence that allows the annotations of the image to be organized in an outline format, allows the annotations to be grouped (or nested) logically, and may impart priority (e.g., the first annotation in the outline is the most important). The order is not treated as an annotation but is used to identify and set up the hierarchy that the visual annotations fall into. This piece of textual information is an invisible annotation that links the pieces of textual information consisting of the symbol, label, or caption to the image. The ordered or grouped textual information is linked with the image, much like the chunks of data that are embedded within the Portable Network Graphics (PNG) format.22 This practice is similar to the concept of a table of contents. The textual information that defines the order or grouping of the visual annotations is a constant, nonvisual annotation that always exists at the first position in the outline and is a part of the information used to create the metadata of the image.
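The outline behavior of the order can be sketched in Python. The text says only that the order is a character sequence; the dotted numbering used here ("1", "1.1", "2") is an assumption for illustration, as are the function names and dictionary keys. The root node stands in for the constant, nonvisual annotation at the first position in the outline.

```python
def build_outline(annotations):
    """Nest annotations under their parents using a dotted order string.

    Assumes order strings such as "1", "1.2", "1.2.1" (a hypothetical
    format). An annotation whose parent is missing is attached to the root.
    """
    root = {"order": "", "label": None, "children": []}
    nodes = {"": root}

    def sort_key(ann):
        return [int(part) for part in ann["order"].split(".")]

    # Sorting numerically guarantees that a parent precedes its children.
    for ann in sorted(annotations, key=sort_key):
        node = {"order": ann["order"], "label": ann["label"], "children": []}
        parent_order = ann["order"].rpartition(".")[0]
        nodes.get(parent_order, root)["children"].append(node)
        nodes[ann["order"]] = node
    return root


def outline_lines(node, depth=0):
    """Render the outline as indented text, like a table of contents."""
    lines = []
    for child in node["children"]:
        lines.append("  " * depth + f"{child['order']} {child['label']}")
        lines.extend(outline_lines(child, depth + 1))
    return lines


if __name__ == "__main__":
    outline = build_outline([
        {"order": "1.1", "label": "Foramen ovale"},
        {"order": "1", "label": "Middle cranial fossa"},
        {"order": "2", "label": "Posterior cranial fossa"},
    ])
    print("\n".join(outline_lines(outline)))
```

Keeping the order as plain text alongside each annotation means the hierarchy can be rebuilt by any consumer of the annotation file without a separate grouping structure.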

Section 3: Presentation Attributes

The presentation attributes of the annotation methodology define how annotations should be drawn when rendered through presentation software.10,17 The visible parts of the presentation attributes may also be interpreted differently, depending on the medium (e.g., laser print, journal article, or Web browser) or the context (e.g., clinician–basic scientist, clinician–patient, or medical educator–student). The presentation attributes that are currently defined are size, color, pointer type, and tip location. To accommodate context-appropriate attributes, the annotation software implementation gives the author the ability to add annotations as visual elements with the textual information in groups that can be viewed independently. Annotations can also be visible or invisible (turned on/off) to optimize presentation and management of annotated structures. Figure 5 illustrates the presentation output from the annotation software implementation. Visible parts of an annotation can be changed for viewing on the Web, printed in a journal or textbook, or used in a professional presentation. Each presentation attribute has only three or four options to provide better control over presentation and annotation reuse. All presentation attributes in the annotation methodology are guidelines for the rendering and reuse of visual characteristics, including fonts, sizes, and colors. Hypertext Markup Language (HTML)23 has used this approach with success.

Annotation Size

The options for the annotation size attribute are “small,” “default,” and “large.” The size options control the size and line width of the pointer and associated text rendered with the visual annotation. The algorithm for determining actual pixel dimensions is processed by the software implementation of the annotation methodology.

Annotation Color

The options for annotation color are “light,” “default,” and “dark.” The color options control the color of the polygon, line, or point that indicates an ROI and the pointer and text that are rendered as part of the visual annotation. The light, default, and dark options for annotating radiographic grayscale images present the author with three options that optimize contrast on the image. Authors who annotate full color images may use a color palette from which to pick the color for individual annotations, if the standard light, default, and dark options are insufficient. Moreover, consideration is integrated into the annotation software implementation for individuals with color deficiencies. Such individuals (e.g., those with incomplete dichromatopsia, most commonly red/green color deficiency, which affects 8% of white men, or achromatopsia, with no color differentiation or reduced visual acuity) have difficulty distinguishing between colors and the contrast of annotations on images that are annotated with the light, default, and dark options.24 The color deficiency may cause the author/viewer to not see an annotation because of insufficient contrast.25 With this in mind, the annotation methodology has an option that gives the author the ability to change the color of an annotation on an individual basis. The color to which each of the three color attributes (light, default, and dark) maps must be defined in a separate style sheet, offering style control to the author.

Font

The annotation methodology has an option that gives the author the ability to select font and font size. The font is selected from the fonts that are available in the system. The size options are controlled and rendered similarly to the visual annotations. The algorithm for determining actual font size is processed by the software implementation of the annotation methodology.

Pointer Type

The pointer type options are “none,” “line,” “wedge,” and “arrow.” Other pointer types may be added, but these four options form the foundation for the types of pointers that may appear with an ROI. The style sheet and software implementation control the appearance of these pointers. Another pointer option is the “tip” option, which controls where the tip of the pointer appears relative to the ROI. The options are “center” and “edge.” Using this attribute, the software implementation determines the actual pixel location of the pointer tip.
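How the keyword attributes in this section might be resolved to concrete drawing values can be sketched as a small style-sheet lookup in Python. The keywords (light/default/dark, small/default/large, the four pointer types, and the two tip locations) come from the text; the hex colors, scale factors, base pixel sizes, default choices, and the function name `resolve_presentation` are assumptions, since the authors' style sheet is not shown.

```python
# Hypothetical style sheet: keyword names follow the text, but the colors
# and scale factors are illustrative assumptions.
STYLE_SHEET = {
    "color": {"light": "#FFFFFF", "default": "#BFBFBF", "dark": "#000000"},
    "size": {"small": 0.75, "default": 1.0, "large": 1.5},
}
POINTER_TYPES = ("none", "line", "wedge", "arrow")
TIP_LOCATIONS = ("center", "edge")


def resolve_presentation(attrs, style_sheet=STYLE_SHEET,
                         base_stroke_px=2.0, base_font_px=12.0):
    """Translate keyword presentation attributes into drawing values."""
    color = attrs.get("color", "default")
    size = attrs.get("size", "default")
    pointer = attrs.get("pointer", "line")  # assumed default
    tip = attrs.get("tip", "edge")          # assumed default

    if color in style_sheet["color"]:
        stroke = style_sheet["color"][color]
    elif color.startswith("#"):
        # Per-annotation override, e.g. for color-deficient viewers.
        stroke = color
    else:
        raise ValueError(f"unknown color option: {color!r}")
    if size not in style_sheet["size"]:
        raise ValueError(f"unknown size option: {size!r}")
    if pointer not in POINTER_TYPES:
        raise ValueError(f"unknown pointer type: {pointer!r}")
    if tip not in TIP_LOCATIONS:
        raise ValueError(f"unknown tip location: {tip!r}")

    scale = style_sheet["size"][size]
    return {
        "stroke": stroke,
        "stroke_width": base_stroke_px * scale,
        "font_size": base_font_px * scale,
        "pointer": pointer,
        "tip": tip,
    }


if __name__ == "__main__":
    print(resolve_presentation({"color": "light", "size": "large",
                                "pointer": "arrow", "tip": "center"}))
```

Swapping in a different style sheet for print or Web output changes the rendering without touching the stored annotations, which is the kind of medium-dependent interpretation the section describes.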

Section 4: Knowledge Representation: Interactive Features, Context-appropriate Viewing, and Reuse The participants in the requirements analysis identified the need for a software application that achieved three goals: (1) annotation on any type of image file format (e.g., TIFF, JPEG, or PNG), (2) an interactive feature that provides the ability to turn on and off sets of annotations for context-appropriate viewing, and (3) reuse of annotated images. Therefore, these goals were included in the software implementation of the annotation methodology. We used existing standards whenever possible. We incorporated the Annotation and Collaboration Working Group (W3C) (www.W3C.org/annotation) standards into the definition framework. An annotation and related textual information (i.e., label or caption) consist of discrete pieces of information that, when viewed over the WWW, are interactive. Interactivity in this sense is defined as giving the viewer the ability to turn on/off annotated groups on the image. Annotations and associated textual information are viewed and controlled independently from the image.

Figure 5. Example of the output of the fully annotated image of the base of the skull. All of the annotations for each of the cranial fossae are enabled (turned on).

Context-appropriate viewing of an image and related annotations is a feature that allows the annotations on an image to be turned on or off for a particular audience or presentation. The annotation view attribute controls the visibility of an annotation because the annotations are separate from the image. Thus, the view attribute can turn
annotations on/off in a context-appropriate manner. The options for view presentation are ‘‘all,’’ ‘‘ROI only,’’ ‘‘ROI and symbol,’’ ‘‘ROI and label,’’ ‘‘ROI with pointer and symbol,’’ ‘‘ROI with pointer and label,’’ ‘‘pointer and symbol,’’ ‘‘pointer and label,’’ and ‘‘none.’’ Depending on the context, portions of annotations may be viewed in a presentation, while other portions remain hidden. An example of how an annotated image can be viewed in an interactive manner is shown in Figure 6. Reuse is facilitated by providing an open ‘‘hook’’ to link the image and related annotations to larger cataloging systems. Participants in the requirements analysis, with the help of the developers, identified the need to be able to reuse annotated images for different purposes (i.e., publication, Web viewing, or professional conferences).37 The software implementation, based on the annotation methodology, gives the author the ability to annotate an image once and reuse the annotations or the image with the annotations. Authors can store the archived image with the linked annotations. The images remain unaltered because the annotations are not embedded into the image. Therefore, the image remains in an archival format and can be reused for other purposes or applications.
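The nine view options listed above can be read as selecting which parts of an annotation are drawn. The sketch below encodes that reading as a lookup table. The part names (`roi`, `pointer`, `symbol`, `label`) and the function names are assumptions made for illustration, as is the interpretation that “all” shows every part.

```python
# Each view option from the text, mapped to the annotation parts it displays.
# The decomposition into four parts is an assumed reading of the option names.
VIEW_OPTIONS = {
    "all": frozenset({"roi", "pointer", "symbol", "label"}),
    "ROI only": frozenset({"roi"}),
    "ROI and symbol": frozenset({"roi", "symbol"}),
    "ROI and label": frozenset({"roi", "label"}),
    "ROI with pointer and symbol": frozenset({"roi", "pointer", "symbol"}),
    "ROI with pointer and label": frozenset({"roi", "pointer", "label"}),
    "pointer and symbol": frozenset({"pointer", "symbol"}),
    "pointer and label": frozenset({"pointer", "label"}),
    "none": frozenset(),
}


def visible_parts(view):
    """Return the set of annotation parts shown for a given view option."""
    try:
        return VIEW_OPTIONS[view]
    except KeyError:
        raise ValueError(f"unknown view option: {view!r}") from None


def is_visible(view, part):
    """True if the named part (e.g., "label") is drawn under this view."""
    return part in visible_parts(view)
```

Because the view setting lives with the annotation rather than in the pixels, the same image can be presented with different settings to different audiences.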

Software Implementation Based on the above methodology for visual annotation and knowledge representation, we developed a software implementation for the Windows16 Desktop environment.

Figure 6. Examples of two annotation groups (panels A and B) for the same image to show context-appropriate presentation and use. Annotation groupings can be enabled/disabled, depending on the features that are viewed. Panel A shows the annotation hierarchy in pop-up windows for all of the annotations. The check marks to the left of each line in the pop-up windows mean that the annotation is enabled. Panel B shows the annotation hierarchy when only the annotations for the PCF are enabled. The user selects which annotations are enabled.

Although the current prototype implementation was written in Tool Command Language (Tcl/Tk)26 for Microsoft Windows,16 Tcl/Tk provides a cross-platform scripting environment and facilitates rapid development, prototyping, and testing for Macintosh27 and Linux.28 Thus, the initial software implementation for visual annotation and knowledge representation is a platform-specific application. On the other hand, the output from the software implementation is not platform specific. Rather, the output uses the Scalable Vector Graphics (SVG) format, which is an application of the eXtensible Markup Language (XML) specification.12 The SVG format provides flexibility and interactivity when viewed through a Web browser and can be used for output to print material since the annotations remain as vector information overlaid onto the image. The output includes metadata about the image, visual annotations, author, lexicons, and authoring sessions (such as revision control); these metadata are stored within the XML output file. SVG facilitates extensibility, interactive Web viewing, and reuse. SVG also allows the annotations and visual expert knowledge (i.e., labels and captions) to remain linked to the image, as opposed to embedding the annotations in the image.13 To facilitate the interactivity of the annotated images, we leveraged Adobe’s freely available SVG plug-in (Adobe Systems, San Jose, CA) 14 for viewing annotated images over the WWW. The flexibility of using XML allows
powerful graphics editing programs similar to Adobe Photoshop 14 and presentation programs similar to Microsoft PowerPoint 16 to consume the output for further editing and other uses. Currently, Adobe Illustrator 15 can consume the SVG output from the annotation implementation without changing any of the visual annotations and their attributes.

The annotation methodology and the resulting annotation exchange format, based on XML/SVG, support linking of images through annotation. The attributes that are linked to regions of interest on one image can be linked to corresponding regions of interest on other images and remain persistent. Linked images could be composed of serial sections generated from a single imaging modality or images that are generated from different imaging modalities. All of the information regarding linking between images through their annotation sets is stored within the XML output file. An example of how the annotations link images from two imaging modalities is shown in Figure 7.

By adopting open standards such as XML and SVG in the software implementation, the annotation methodology provides authors with the ability to save images with the annotations linked to the images, in a structured format of XML (SVG).12,13 The open and extensible features of SVG promote indexing of the image with associated annotations and textual information, thus allowing images and annotations to be catalogued in a database or asset management system. An example of the structured output is shown in Figure 8, which illustrates an annotated image of the posterior cranial fossa of the skull.
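To make the idea of linked, rather than embedded, annotations concrete, the following sketch writes a small SVG document in which the image is referenced by file name and each annotation group is a separate `<g>` element whose visibility can be toggled. It is an illustration only and does not reproduce the authors' schema or the output in Figure 8. The function name, input dictionary keys, group identifier, and file name are hypothetical; the `image`, `g`, `polygon`, and `text` elements and the `visibility` attribute are standard SVG.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_svg(image_href, width, height, groups):
    """Serialize annotation groups as SVG layered over a linked image.

    The raster image is referenced by href and never modified; every
    annotation stays as vector data in its own group.
    """
    svg = ET.Element(
        f"{{{SVG_NS}}}svg",
        {"width": str(width), "height": str(height),
         "viewBox": f"0 0 {width} {height}"},
    )
    ET.SubElement(
        svg, f"{{{SVG_NS}}}image",
        {f"{{{XLINK_NS}}}href": image_href, "x": "0", "y": "0",
         "width": str(width), "height": str(height)},
    )
    for group in groups:
        g = ET.SubElement(
            svg, f"{{{SVG_NS}}}g",
            {"id": group["id"],
             "visibility": "visible" if group["enabled"] else "hidden"},
        )
        for ann in group["annotations"]:
            points = " ".join(f"{x},{y}" for x, y in ann["roi"])
            ET.SubElement(
                g, f"{{{SVG_NS}}}polygon",
                {"points": points, "fill": "none", "stroke": ann["color"]},
            )
            label = ET.SubElement(
                g, f"{{{SVG_NS}}}text",
                {"x": str(ann["label_at"][0]), "y": str(ann["label_at"][1]),
                 "fill": ann["color"]},
            )
            label.text = ann["label"]
    return ET.tostring(svg, encoding="unicode")


# Hypothetical example data: one enabled group with a single labeled ROI.
example = build_svg(
    "skull_base.jpg", 640, 480,
    [{"id": "pcf", "enabled": True,
      "annotations": [{"roi": [(300, 200), (340, 200), (340, 260), (300, 260)],
                       "label_at": (350, 230), "color": "#FFD700",
                       "label": "Foramen magnum"}]}],
)
```

Because the output is ordinary XML, a cataloging system could index the label text or group identifiers without ever decoding the image itself.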

Lessons Learned The requirements analysis to define an annotation methodology identified users of different abilities and with specific requirements. Individuals participating in the requirements assessment appreciated the need for a methodology but at the same time did not like the constraints of a methodology. For example, participants did not want to be limited to the three color choices (gold, white, or black) and preferred a full range of colors to pick from, as in a color palette. The individuals who annotate grayscale images eventually decided that the additional color options were not necessary and settled on the original three color choices. Balancing these competing interests presents a unique challenge for the software implementation. From the software developers’ and integrators’ viewpoint, the annotation methodology and software implementation must remain simple and extensible for authoring and, at the same time, generate structured output that is in a standard format, is flexible, and adheres to open standards. Annotated image content eventually will require integration into a larger enterprise cataloging system. By adopting open standards for the software implementation and structured output, we can develop a software solution that generates annotated collections of images that can be integrated into or consumed by cataloging systems.

Figure 7. Example of two images that are linked through their annotations (panels A and B). The regions of interest on the base of the skull (panel A) are applied to the computed tomography (CT; panel B) that shows corresponding soft tissue structures. The annotation groupings applied to the CT can be enabled/disabled, depending on the features that are viewed. The presentation and use attributes allow the user to select which annotations are enabled on either image. The images remain linked through the annotation sets.

Figure 8. Structured output (XML) of the annotated image of the posterior cranial fossa and the foramen magnum (FM) shown in Figure 6. The output shows the annotations in a structured XML format. The XML output is flexible and contains additional information related to the annotations such as style, the metadata fields that are generated from the annotation session, and links to the images. The flexibility of XML facilitates indexing and cataloging, Web presentation, and consumption by other systems such as a database or file system.

Discussion The overall goal of the project was to define a methodology for visual annotation of digital images that functions as the foundation for a software implementation that fulfills the image annotation and visual knowledge representation requirements of clinicians, medical educators, and basic scientists. After completing a requirements analysis, we defined a methodology and developed a software implementation prototype that provides users with the ability to visually annotate an image in a way that preserves the original image, links the visual annotations and expert knowledge to the image, enables reuse of the images and annotations, provides interactive viewing, and supports context-appropriate presentation of annotated images. A workflow-crippling issue identified by clinicians, medical educators, and basic scientists is the lack of software solutions to create visual annotations on digital images, with associated expert knowledge, that can be shared or reused either together or separately.3,19,29 An ideal solution is to enable clinicians, medical educators, and basic scientists to annotate their digital images with a software package that is simple to use, facilitates locating and retrieving images and/or their
associated annotations, and can be integrated into the workflow. Additionally, a solution must provide the ability to output annotated images for context-appropriate presentation to the WWW, print, or other digital presentation used for professional conferences. Unlike the analog, or hard-copy, environment (rub-on labels), the digital environment requires a formal definition (methodology) to handle digital image annotation. The author is not able to just “do it” on a computer without a formal definition of the data he or she is handling and a user-friendly, easy-to-learn software implementation to support the definition. Caruso et al.30 proposed using Adobe Photoshop (Adobe Systems, San Jose, CA) for annotation of digital image data, such that the digital image data become suitable for publication. This approach is used widely for annotation of print-quality images; however, because the final output is a raster image file with the annotations also rasterized and therefore embedded in the image, the visually annotated images are only suitable for print and no other media (i.e., WWW
viewing and interactivity). Further, the process of annotating with Adobe Photoshop does not link annotated information to the image and does not lend itself to reuse of the source material (image and annotations) for other purposes or interaction with the annotations and context-appropriate presentations, especially in the digital environment.4,29,30 Because raster output does not separate the visual annotations from the underlying image, the annotations cannot be manipulated separately from the image. It is this manipulation that permits interactivity, reuse for multiple purposes, multiple publishing targets (print, Web), and multiple contexts as illustrated in the output from the annotation implementation.

The annotation methodology began several years ago as a set of style decisions for a raster-based, image annotation, Web application named ArrowMagick 17 that was developed using the ImageMagick libraries.31 The application itself was not deployed, but the requirements and concepts that were built into the software were extracted and reconfigured as the first draft of the annotation methodology. Development of the annotation methodology continued by analyzing existing analog and digital annotation mechanisms. Traditional annotation methods, including photographic annotation with Letraset, physical annotation with marking utensils, the concept of “pin-and-string” annotation,10 digitally annotated magnetic resonance (MR) images,1 and map labeling in the Geographical Information Systems (GIS) field 5,6 were used to form the foundation for the annotation methodology. Individual user habits, with general-purpose image manipulation software (e.g., Adobe Photoshop [Adobe Systems]14), were observed and taken into consideration while defining a medical-based method for annotating digital images. The annotation methodology constrains the author to a set of artistically clean choices for presentation and authoring.
In addition, the methodology defines how the annotation information is captured in a structured manner for reuse and stores the annotated information in a vector format for visual clarity.19 The annotation methodology forms the basis of an XML schema that defines a platform-independent annotation exchange format.12,13 The annotations are stored as vector information in the SVG format and remain linked to the original image. The annotation exchange format ensures that the annotations are accessible as vector data linked to the image, not embedded in the image. The design decisions behind the annotation methodology have been derived from traditional analog annotating methods and a consensus of common practices of clinicians, medical educators, and basic scientists. The basic annotation element of pointer and label exists across professional fields and conveys the meaning of ‘‘this is the focus of attention.’’ However, each professional field deviates from the basic annotation elements to handle field-specific data. For example, in the field of academic radiology, radiologists often identify pathology as an ROI and want the ability to point with an associated label or caption, as often is the case in a clinical teaching conference. On the other hand, in the field of human gross anatomy, detailed assignment of labels is customary to teach students relevant normal structures in a region.11 Multiple colors and shapes of labels are needed because the color and contour of human anatomic structures are variable.

Another example is the field of basic science research for which the ability to identify relevant structures on a digital photograph for publication in a peer-reviewed journal, including electronic journals, is of paramount importance. The annotation methodology was defined to establish a schema for software development that would provide the author the ability to annotate an image similar to commercially available tools such as Adobe Illustrator 15 and Adobe Photoshop,14 in which the author identifies a region of interest on an image, then places an arrow or pointer to draw attention to a region of interest. These features were noted, since many authors are familiar with such applications. The annotation methodology has been used to develop software that enables authors to layer annotations on images while retaining the original format of the image and linking annotations to the image file. The annotation methodology and implementation allow the author to add, track, and retain visual annotations and associated expert knowledge as separate layers of information within the software environment and in the output. Additionally, the structured nature of the visual annotations and associated textual knowledge allows the image and visual information to be indexed and cataloged, thus facilitating location of the annotated image for other uses.

Standards that define annotation elements (i.e., symbols, labels, and captions) on digital images have yet to be adopted. Associated standards for applying annotation to digital medical images likewise do not exist. Without a standard for digital image annotation, every image becomes a custom annotation job. The annotation methodology presented here is an assembled and evolved annotation exchange format that addresses the problem of standardizing digital image annotation in health care yet has applicability to other disciplines.32 The shortcomings of establishing the annotation methodology are in the complexity of the methodology.
What started as a definition for placing arrows and labels on digital images grew into a complex definition of requirements for annotation of digital images produced in a university medical center community.11 The definition included the ability to reuse annotated images for multiple outputs and to create context-appropriate presentation for interaction with clinicians, medical educators, basic scientists, and, in the future, patients. The shortcomings should become less of a hindrance, however, as new software applications implement methodologies that give users the flexibility described in the annotation methodology. A limitation of the assessment is that only a modest usability study was conducted to collect feedback and gather user statistics on the implementation. An extensive usability study is needed to determine outcomes of developing and using a tool for visual annotation and will be the focus of a follow-up report. The annotation methodology, through modification of earlier versions of the software implementation, has been presented at several national conferences, including the Radiological Society of North America InfoRAD (RSNA),33 Society for Computer Applications in Radiology (SCAR),34 American Telemedicine Association (ATA),35 and Federation of American Societies for Experimental Biology (FASEB).36 Each of these professional meetings has provided a forum
outside the University of Utah Health Sciences Center to collect feedback from medical academicians who are participating in complementary and competing projects. This feedback was also evaluated and incorporated into the current annotation methodology.

Conclusions The annotation methodology presented in this report provides several key solutions for creating interactive digital material. Because the annotation methodology is the culmination of multidisciplinary consensus, the methodology is a robust standard that fulfills medical annotation requirements, at least currently. Clinicians, medical educators, and basic scientists can collect and annotate digital images and assemble relevant annotation groups for use in a context-appropriate environment, with colleagues, students, or patients, using the same annotated image without modifying the original image. Furthermore, because the annotation software implementation uses a standard, open annotation exchange format (XML, SVG), the annotated images can be reused in a cross-media publishing environment (e.g., print, Web, and database), interactive teaching, and context-appropriate collaboration and communication (e.g., clinician–clinician, clinician–basic scientist, medical educator–student). By creating a user-friendly tool that promotes standardization of annotations, indexing, cataloging, and vector-based interactivity, the annotation methodology functions as the foundation for new and important solutions for annotating digital images.

References

1. Caruso R, Postel G, McDonald C, Aronson B, Christensen J. Software-annotated, digitally photographed, and printed MR images: suitability for publication. Acad Radiol. 2002;9:346–51.
2. Marshall C. Annotation: From Paper Books to the Digital Library. Proceedings of the 1997 ACM International Conference on Digital Libraries (DL97). Philadelphia, PA: ACM Press on Digital Libraries, pp 131–40.
3. Davidson H, Lauman J, Goede P, Harnsberger HR. CAT: A methodology for annotating digital teaching file images. Scientific Program Proceedings in Radiology. 2000:698.
4. Goede P. CAT: An Annotation Methodology for the Medical Image Annotation Tool. National Center for Research Resources (NCRR) Sponsored BioInformatics Approaches to Neuroimaging in Clinical Research. Seattle, WA: January 25–27, 2002.
5. Wagner F, Wolff A. Map labeling heuristics: provably good and practically useful. ACM, Annual Symposium on Computational Geometry. 2001;3:109–11.
6. Wagner F, Tycho S, Wolff A, Kapoor V. Three rules suffice for good label placement. Algorithmica. 2001;30(2):334–49.
7. Brinkley JF, Rosse C. The digital anatomist distributed framework and its applications to knowledge-based medical imaging. J Am Med Inform Assoc. 1997;4:165–83.
8. Chronaki C, Zabulis X, Orphanoudakis S. I2Cnet Medical Image Annotation Service. Med Inform, Special Issue. 1997;22:337–47.
9. Albertine KA. Use and Re-use of Content from an Imager’s Perspective. National Center for Research Resources (NCRR) Sponsored BioInformatics Approaches to Neuroimaging in Clinical Research. Seattle, WA: January 25–27, 2002.
10. Lober B, Brinkley J. A portable image annotation tool for Web-based anatomy atlases. Proc Am Med Inform Assoc. 1999.
11. Morton DA, Goede PA, Lauman JR, Albertine KH. Annotation tool for images for human gross anatomy. FASEB J. 2002;16:A1090.
12. World Wide Web Consortium (W3C) eXtensible Markup Language (XML) Working Group. . Accessed October 31, 2003.
13. World Wide Web Consortium (W3C) Scalable Vector Graphics (SVG) Working Group. . Accessed October 31, 2003.
14. Adobe Photoshop®, San Jose, Calif. . Accessed October 31, 2003.
15. . Accessed October 31, 2003.
16. Microsoft Corp., Redmond, WA. . Accessed October 31, 2003.
17. Heaps N, Davidson H, Lauman J, Harnsberger H. Arrow Magick: Labeling digital radiological images on the Web. Telemed J. 1999;5:95.
18. Albertine K, Morton D, Peterson K, Dalton M, Schultz R. Radiologic holograms as teaching tools for human gross anatomy. FASEB J. 2001;15:A65.
19. Lieberman H, Rosenweig E, Push S. Aria: an agent for annotating and retrieving images. IEEE Computer. 2001;34(7):57–62.
20. Chronaki C, Zabulis X, Orphanoudakis S. I2Cnet medical image annotation service. Medical Informatics, Special Issue. 1997;22(4):337–47.
21. Marshall C. Making metadata: a study of metadata creation for a mixed physical-digital collection. 1998 ACM International Conference on Digital Libraries (DL98).
22. Wiggins RH, Davidson HC, Harnsberger HR, Lauman JR, Goede PA. Image file formats: past, present and future. Radiographics. 2001;21:789–98.
23. The Hypertext Markup Language (HTML). . Accessed October 31, 2003.
24. Joshi VG. Brightness contrast as source of error in the Ishihara test for colour blindness. J All India Ophthalmol Soc. 1965;13(3):83–7.
25. Aarnisalo E. Screening of red-green defects of colour vision with pseudoisochromatic tests. Acta Ophthalmol (Copenhagen). 1979;57(3):397–408.
26. Active Static TCL Developer Exchange. . Accessed October 31, 2003.
27. Apple Computer, Cupertino, CA. . Accessed October 31, 2003.
28. LINUX Online, Ogdensburg, NY. . Accessed October 31, 2003.
29. Lauman J. Image annotation and re-use issues in medical academia. Proceedings of American Society for Experimental Biology. FASEB J. 2001;15:A67.
30. Caruso R, Postel G. Image editing with Adobe Photoshop 6.0. Radiographics. 2002;22:993–1002.
31. ImageMagick © 1998, E. I. Du Pont de Nemours and Co., Inc., John Cristy.
32. Lober B. Personal Annotated Image Server (PAIS). . Accessed October 31, 2003.
33. Radiological Society of North America (RSNA). . Accessed October 31, 2003.
34. Society for Computer Applications in Radiology (SCAR). . Accessed October 31, 2003.
35. American Telemedicine Association (ATA). . Accessed October 31, 2003.
36. Federation of American Societies Proceedings (FASEB). . Accessed October 31, 2003.
37. Cochella C, Lauman JR, Goede P, Harnsberger HR, Katzman GL. A simple mechanism for sharing and transporting medical digital case information across disparate computer language and data storage environments. J Dig Imaging. 2001;14(2 suppl 1):187–9.