Assessing software usability through heuristic evaluation
Titus K.L. Schleyer, DMD, PhD
Usability, as defined by the International Organization for Standardization, is “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.”1 Various methods have been developed to assess software usability. Formal usability testing using a “think-aloud” protocol is considered the gold standard for measuring usability.2,3 In this approach, participants are given a set of tasks to complete using a software application. The participants are asked to verbalize their thoughts (think aloud) while they try to complete each successive task. An observer or a screen capture software package (for example, Camtasia Studio 4, TechSmith, Okemos, Mich.) records the participant’s actions and verbalizations. The results of the experiment then are analyzed and coded systematically to identify usability problems.

While usability testing assesses users’ interactions with a system, other methods rely on experts’ evaluations of the interface to identify potential problems. One of these methods is heuristic evaluation,4 in which reviewers judge whether the user interface and system functionality conform to established principles of usability and good design. An example of such a principle (a “heuristic”) is Error Prevention, which stipulates that the program should help the user avoid errors as much as possible. Thus, a program that allows the user to enter a date in the past when making a patient appointment would violate this heuristic; such a failure is termed a “heuristic violation.”
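To make the Error Prevention example concrete, a scheduling routine might reject past dates outright. The following Python fragment is a minimal sketch; the function name and parameters are invented for illustration and are not drawn from any of the systems discussed here.

```python
from datetime import date

def schedule_appointment(patient_id: str, appointment_date: date) -> None:
    """Hypothetical booking routine illustrating the Error Prevention heuristic."""
    # Error Prevention: reject invalid input before it can become an error.
    if appointment_date < date.today():
        raise ValueError(
            f"Appointment date {appointment_date.isoformat()} is in the past; "
            "choose today or a future date."
        )
    # Booking logic would follow here (omitted in this sketch).
```

A compliant interface would surface this constraint as the user works, for example by disabling past dates in a date picker, rather than reporting a failure after the fact.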
Evaluators used a set of 10 heuristics published by Nielsen and Mack4 to evaluate the practice management systems (PMSs) in our study. While several different sets of heuristics exist,5,6 those popularized by Nielsen and Mack are used widely and address high-level usability concepts applicable to all types of software. These heuristics emerged as particularly effective in a factor analysis and ranking of the explanatory coverage of 101 previously proposed heuristics.7

During a heuristic evaluation, experts evaluate the target system or interface using the list of heuristics. To focus the evaluators on a particular aspect of the system, one or more tasks can be provided to frame the system review (for example, making an online purchase on a Web site). The evaluators “step through,” or inspect, the user interface and record (sometimes with the help of an observer) violations of the specific heuristics. The resulting violations then are compiled into a single list. A common recommendation from the literature is to have between three and five people assess a user interface independently.8 People who have expertise both in the domain in which the system would be used and in usability assessment tend to identify more violations than do people who have no previous experience or whose experience is limited to only one of these areas.2
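As a rough sketch of the bookkeeping this process implies, the Python fragment below merges independently recorded violation lists into a single deduplicated list and flags violations reported by more than one evaluator. The record structure and sample entries are hypothetical, not taken from the study.

```python
from collections import Counter

# Each evaluator independently records violations as (heuristic, description)
# pairs. These example entries are invented for illustration.
evaluator_reports = [
    [("Error Prevention", "Allows appointment dates in the past"),
     ("Consistency and Standards", "Button placement varies by dialog")],
    [("Error Prevention", "Allows appointment dates in the past")],
    [("Match Between System and the Real World", "Unfamiliar tooth numbering")],
]

# Merge all reports and count how often each violation was reported.
violation_counts = Counter(v for report in evaluator_reports for v in report)

# The compiled, deduplicated list of violations.
compiled = list(violation_counts)

# Violations found by more than one evaluator warrant particular attention.
corroborated = [v for v, n in violation_counts.items() if n > 1]
print(corroborated)
```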
When analyzing the results of a heuristic evaluation, developers should review each reported violation carefully and, if possible, validate it using other measures, such as usability tests and problem frequencies extracted from manufacturers’ help desk databases. A thorough review can eliminate items that violate a heuristic in theory but do not cause an actual usability problem. On the other hand, problems identified by means of more than one evaluation method may be serious enough to require a significant correction in the software design.

Heuristic evaluation does not generate solutions to the potential usability problems it identifies; however, the nature of each violation can provide some guidance on how to fix the problem. For some violations cited in our study, the necessary corrections, such as changing a label, relocating a button or increasing the window size, are easy to determine and implement. Other violations, such as the common separation of the hard-tissue and periodontal charts, are more difficult to address.

In conclusion, heuristic evaluation is a valuable method of assessing usability problems in software interfaces. Combined with other approaches, such as usability testing, it can generate valuable insights into how to improve the quality of a software interface. ■

Dr. Schleyer is an associate professor and the director, Center for Dental Informatics, School of Dental Medicine, University of Pittsburgh.
In our study, two postgraduate students and one faculty member in dental informatics evaluated each of four PMSs: Dentrix (Version 10.0.36.0, Dentrix Dental Systems, American Fork, Utah), EagleSoft (Version 10.0, Patterson Dental, St. Paul, Minn.), SoftDent (Version 10.0.2, Kodak, Rochester, N.Y.) and PracticeWorks (Version 5.0.2, Kodak). We installed the systems on a personal computer workstation (operating system: Windows XP Professional, Microsoft, Redmond, Wash.; processor speed: 2.4 gigahertz; random access memory: 512 megabytes). After installation, we evaluated all programs in their default configuration.

All of the evaluators were dentists with a significant background in informatics and information systems. The faculty member was an expert in heuristic evaluation, and the postgraduate students had completed a course in human-computer interaction evaluation methods, including heuristic evaluation. All of the evaluators were familiar with PMSs in general but had not used any of them routinely. Before the study, a faculty member with expertise in human-computer interaction and medical informatics (V.M.) conducted a refresher tutorial about heuristic evaluation with the evaluators, using examples from a PMS that was not evaluated in the study.

To focus the heuristic evaluation on key clinical functions of the software applications, we chose three common clinical documentation tasks for the evaluators to perform. We asked the evaluators to record a mesio-occlusal carious lesion on the maxillary left central incisor, a porcelain-fused-to-metal crown on the maxillary left first
molar and the periodontal status of one quadrant of the dentition. While completing the tasks, the evaluators told the observer which heuristics they considered to have been violated. The observer (T.T.) wrote down the violations and, when necessary, helped record illustrative screen shots using a recorded macro function in Microsoft Word (Microsoft). While the evaluation was grounded in the three documentation tasks, the evaluators were free to explore other clinical (but not administrative) program functions to increase the coverage of the heuristic evaluation. We limited the sessions to approximately one hour per evaluator and software application. All of the evaluation sessions were conducted over a two-week period in February 2005, and each evaluator assessed the systems in a different order.

We compiled the heuristic violations identified by each evaluator in a spreadsheet and summarized them for each program. We identified the heuristic violations found by more than one evaluator, and totaled the violations by heuristic across all PMSs. We submitted a draft of this article to all three PMS vendors for verification of the reported findings. Both Dentrix Dental Systems and Patterson Dental replied, and their relevant comments appear in the Results section.
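As a rough illustration of the tabulation step described above, the Python sketch below totals violations by heuristic across systems and lists them in descending order of frequency, as in the summary reported under Results. The record layout and sample entries are hypothetical and are not drawn from the study data.

```python
from collections import Counter

# Hypothetical compiled records: (evaluator, system, violated heuristic).
records = [
    ("A", "System 1", "Consistency and Standards"),
    ("B", "System 1", "Consistency and Standards"),
    ("A", "System 2", "Error Prevention"),
    ("C", "System 3", "Match Between System and the Real World"),
]

# Total violations for each heuristic across all PMSs.
totals = Counter(heuristic for _, _, heuristic in records)

# List heuristics by total violations in descending order.
for heuristic, total in totals.most_common():
    print(f"{heuristic}: {total}")
```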
RESULTS

Table 2 summarizes the number of heuristic violations for the four PMSs. We sorted the data in descending order by the total number of violations for each heuristic across the PMSs. We found that the heuristics Consistency and Standards, Match Between System and the Real World, and Error Prevention were violated relatively frequently.

1. ISO standards: Standards in usability and user-centred design, ISO 9241-11 (1998), guidance on usability. Available at: “www.usabilitypartners.se/usability/standards/shtml”. Accessed Nov. 27, 2006.
2. Nielsen J. Usability engineering. San Francisco: Morgan Kaufmann; 1994.
3. UI guidelines vs. usability testing. Available at: “http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwui/html/uiguide.asp”. Accessed Jan. 5, 2007.
4. Nielsen J, Mack RL. Executive summary. In: Nielsen J, Mack RL, eds. Usability inspection methods. New York: John Wiley & Sons; 1994:1-24.
5. Gerhardt-Powals J. Cognitive engineering principles for enhancing human-computer performance. Int J Hum Comput Interact 1996;8(2):189-211.
6. Shneiderman B. Use the eight golden rules of interface design. In: Shneiderman B, ed. Designing the user interface: Strategies for effective human-computer interaction. 3rd ed. Boston: Addison-Wesley Professional; 1998:74-6.
7. Nielsen J. Enhancing the explanatory power of usability heuristics. In: Proceedings of the SIGCHI conference on human factors in computing systems: Celebrating interdependence; Boston; April 24, 1994. New York: ACM Press; 1994:152-8.
8. Nielsen J, Molich R. Heuristic evaluation of user interfaces. In: Proceedings of the ACM CHI ’90 conference; Seattle, Wash.; April 1, 1990. New York: ACM Press; 1990:249-56.