Symbiosis of Human and Artifact
Y. Anzai, K. Ogawa and H. Mori (Editors)
© 1995 Elsevier Science B.V. All rights reserved.
A Teaching Method as an Alternative to the Concurrent Think-Aloud Method for Usability Testing

P. R. Vora a and M. G. Helander b

a US WEST Technologies, 1475 Lawrence St., Suite 304, Denver, CO 80202, USA
b Dept. of Mechanical Engineering, Div. of Industrial Ergonomics, Linköping Institute of Technology, 58133 Linköping, Sweden
In this paper, we propose a teaching method as an alternative to the concurrent think-aloud (CTA) method for usability evaluation. In the teaching method, the test participant, after becoming familiar with the system, demonstrates it to a seemingly naive user (a confederate) and describes how to accomplish certain tasks. In a study that compared the teaching and CTA methods for evaluating the usability of human-computer interactive tasks, the results indicated that the number of verbalizations elicited using the teaching method far exceeded the number elicited using the CTA method. Moreover, the concurrent verbalizations were dominated by the participants' interactive behavior and provided little insight into their thought processes or search strategies, which were easily captured using the teaching method.

1. INTRODUCTION

The goal of usability evaluation is to quickly identify and rectify usability deficiencies in human-computer interfaces. The Concurrent Think-Aloud (CTA) method, in which participants verbalize their thoughts while interacting with the system, is considered one of the most useful methods for practical usability evaluation (Nielsen, 1993a). The strength of the CTA method is "to show what the users are doing and why they are doing it while they are doing it in order to avoid later rationalizations" (Nielsen, 1993b; p. 195). Thinking aloud, however, is not a normal practice for most people using a computer, and it is often intrusive and distracting to users of unfamiliar software (Holleran, 1991; Nielsen, 1993b). The following methods have been suggested to overcome the "unnaturalness" of the concurrent think-aloud method:
1. Constructive interaction or codiscovery learning (O'Malley, Draper, and Riley, 1984; Kennedy, 1989), where two test users work together to solve a problem (see Bainbridge, 1990). The disadvantage, however, is that the two users may have different strategies for learning and using computers (Nielsen, 1993b).

2. Question-answer method or coaching method (Kato, 1986; Mack and Burdett, 1991), where participants are asked to complete a given task by obtaining information from a tutor (a coach), whose responses are based on a policy of limited assistance. This method is aimed at discovering the information needs of users in order to improve training, documentation, and interface design.

We propose a teaching method as an alternative to the concurrent think-aloud method: a non-intrusive and more natural way to elicit verbalizations from test participants. In the teaching method, the participants interact with the system first, so that they become familiar with it and acquire some expertise in accomplishing tasks using the system. At the end of the task, the test participant is introduced to a naive user (a confederate); the confederate is briefed by the experimenter to limit his/her active participation and not to become an active problem-solver. The test participant describes to the confederate how the system works and demonstrates a set of tasks pre-determined by the experimenter. Because the confederate is introduced as a novice, the test participants can be expected to feel more comfortable describing and explaining how to accomplish tasks with the system. They may therefore freely reveal useful information such as their understanding of the system components, the organization of the information, their strategies for accomplishing tasks, and their mental model of the system. In contrast, if they were asked to describe the system to the experimenter, who is assumed to be an expert, they might feel uncomfortable and withhold their verbalizations.

This approach of introducing a "non-threatening" participant in the study can be compared with Gelman and Gallistel's (1978) counting study, in which 2- to 5-year-old children were introduced to a puppet and asked to answer questions or do things for the puppet. The presence of the "non-threatening" puppet helped the experimenters both to elicit the children's responses and to increase their attention span and tolerance for a lengthy testing procedure. Though this technique has not been used for usability testing, we think that introducing a puppet may be useful for evaluating systems designed for young children.

In this paper, we describe a study that compared the teaching method and the CTA method for human-computer interactive tasks. We were particularly interested in identifying differences in the content of the verbalizations elicited by the two methods and in determining whether the teaching method is a useful alternative for evaluating usability.
2. METHOD

2.1. Test-bed
To compare the two methods we used an experimental database on nutrition referred to as NutriText. NutriText was designed as a hypertext system, in which related pieces of information were connected by "hot buttons," or links.
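The paper does not describe NutriText at the implementation level. The following Python sketch is a hypothetical illustration of the kind of node-and-link structure a hypertext test-bed such as NutriText implies, with "hot buttons" modeled as named links between text nodes; all class, node, and link names are assumptions made for illustration only.

```python
# Hypothetical sketch of a hypertext structure in the style of NutriText:
# text nodes connected by named "hot buttons" (links). Not the original system.

class Node:
    def __init__(self, title, text):
        self.title = title
        self.text = text
        self.links = {}          # hot-button label -> target Node

    def add_link(self, label, target):
        self.links[label] = target

    def follow(self, label):
        """Follow a hot button and return the target node."""
        return self.links[label]


# Build a tiny fragment of a vitamin hierarchy (illustrative content only).
vitamins = Node("Vitamins", "Vitamins are organic nutrients...")
fat_soluble = Node("Fat-Soluble Vitamins", "Vitamins A, D, E, and K...")
vitamin_e = Node("Vitamin E", "Sources include wheat germ...")

vitamins.add_link("Fat-soluble vitamins", fat_soluble)
fat_soluble.add_link("Vitamin E", vitamin_e)

# A navigation path like the search behavior quoted in Section 3.
current = vitamins.follow("Fat-soluble vitamins").follow("Vitamin E")
print(current.title, "->", current.text)
```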
2.2. Participants
Nine participants (1 female and 8 males), graduate and undergraduate students at the State University of New York at Buffalo, took part in the study. Their ages ranged from 20 to 32 years (average = 23.4 years). On a pre-experiment questionnaire, all participants categorized themselves as computer users and indicated that they were familiar with using a mouse to interact with computers.

2.3. Procedure
After receiving training on NutriText, the participants answered 12 search questions and thought aloud concurrently for the 1st, 6th, and 12th questions. At the end of the search task, they were introduced to a confederate, to whom they described NutriText and demonstrated how to access information to answer 3 search questions given by the experimenter.
3. RESULTS & DISCUSSION
The participants' concurrent and teaching verbalizations were transcribed and their content analyzed; see Table 1. The teaching method appeared more natural to the users: the number of verbalizations elicited with the teaching method far exceeded the number elicited with the concurrent think-aloud method (72.8 vs. 59.4 verbalizations per participant over 3 questions). The differences were not only in the total number of verbalizations but also in their content. The concurrent verbalizations were dominated by the participants' interactive behavior (26.4% of the verbalizations were Actions) and provided little insight into their thought processes or search strategies (10.8% were Explanations for Actions and 3.1% were Search Strategies). For example,

S1: "...that means that I have to find which vitamin corresponds to wheat germ [State Goal]... and I'll start with fat-soluble vitamins [Action]... I'll start with vitamin A [Action]... backtrack [Action]... vitamin D [Action]... backtrack [Action]... vitamin E [Action]... here I found wheat germ [Match Goal]... so I think the person is lacking vitamin E [Answer]..."

In contrast, in the teaching method, the participants' thought processes and search strategies were more pronounced in their verbalizations (only 15.0% of the verbalizations were Actions, whereas 18.0% were Explanations for Actions and 11.2% were Search Strategies). For example,

S3: "...so the key word here probably is biotin [State Goal]... you have to look under all the various categories of various vitamins A, B, C, D, and find if we can find the word biotin... and then work backwards [Strategy]..."
Table 1
Content analysis of Concurrent and Teaching protocols

                                        CTA method                Teaching method
                                   Avg. # verbal./  % of     Avg. # verbal./  % of
Protocol Type                      participant      Subtotal participant      Subtotal

Answering Questions (3)
  Read Question                        3.71           6.3        3.13           4.3
  State Goal                           3.71           6.3        3.25           4.5
  Assumptions                          0.71           1.2        3.50           4.8
  Info. from Past Knowledge            0.43           0.7        1.25           1.7
  Strategy                             1.86           3.1        8.13          11.2
  Information Organization             0.00           0.0        1.50           2.1
  Explanation for Action               6.43          10.8       13.13          18.0
  Action                              15.71          26.4       10.88          15.0
  Analyze Action                       0.57           1.0        3.25           4.5
  Reading Screen                       5.14           8.7        1.75           2.4
  Match Goal                           9.14          15.4       15.13          20.8
  Answer                               3.86           6.5        4.25           5.8
  Interaction with Experimenter        6.29          10.6        1.88           2.6
  Others                               1.86           3.1        1.75           2.4
  Subtotal:                           59.43                     72.75

Describing NutriText
  Purpose of NutriText                  n/a                      1.00           3.1
  Identify Features                     n/a                      5.13          16.1
  Explain Use of Features               n/a                      8.00          25.1
  Information Organization              n/a                      6.13          19.2
  Explanation by Example                n/a                      6.13          19.2
  Difficult Characteristic              n/a                      0.50           1.6
  Interaction with Confederate          n/a                      0.63           2.0
  Interaction with Experimenter         n/a                      2.13           6.7
  Others                                n/a                      2.25           7.1
  Subtotal:                             n/a                     31.88

Total:                                59.43                    104.63
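As a check on the table entries, the "% of Subtotal" columns follow directly from the per-participant averages. The short Python sketch below recomputes a few of the CTA percentages from the Table 1 values; it is provided only as an illustration of the calculation and is not part of the original analysis.

```python
# Recompute a few "% of Subtotal" entries of Table 1 (CTA method)
# from the reported per-participant averages.
cta_averages = {
    "Explanation for Action": 6.43,
    "Action": 15.71,
    "Match Goal": 9.14,
}
cta_subtotal = 59.43

for category, avg in cta_averages.items():
    print(f"{category}: {100 * avg / cta_subtotal:.1f}% of subtotal")
# -> Explanation for Action: 10.8%, Action: 26.4%, Match Goal: 15.4%
```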
Further, when the participants were describing NutriText, their mental model of the information organization and the difficulties they had experienced while interacting with it were evident in their verbalizations. The participants also used their experiences to give hints and suggestions to facilitate the confederate's future interaction with NutriText. For example,

S5: "This basically gives the information about vitamins [Purpose of NutriText]... and it is arranged in a sort of hierarchical way [Information Organization]..."

S2: "...and when you click on vitamin B, you get a whole list of words, and unless you are careful, you don't realize these are vitamins [Difficult Characteristic]..."

The participants may have experienced difficulty in verbalizing concurrently while searching for information because of their lack of familiarity with both the domain and hypertext technology. Verbalizing therefore imposed an additional workload on them, causing them to verbalize their actions rather than their thoughts. Since the participants' actions and navigation data can be collected in the background by the system, the CTA method did not provide additional useful information.
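The observation that actions and navigation data can be collected by the system itself is straightforward to operationalize. The sketch below is a minimal, hypothetical example of such background logging; the class, method, and event names are illustrative assumptions and not part of the original NutriText software.

```python
# Hypothetical sketch of background logging of actions and navigation,
# as suggested in the discussion above. Illustrative only.
import time

class NavigationLogger:
    def __init__(self):
        self.events = []

    def log(self, event, node_title):
        # Timestamped record of each hot-button selection or backtrack.
        self.events.append((time.time(), event, node_title))

logger = NavigationLogger()
logger.log("follow_link", "Fat-Soluble Vitamins")
logger.log("follow_link", "Vitamin A")
logger.log("backtrack", "Fat-Soluble Vitamins")

for timestamp, event, node in logger.events:
    print(f"{timestamp:.0f}  {event:12s} {node}")
```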
4. CONCLUSION

In sum, the data suggest that the teaching method makes it easier both to elicit verbalizations and to capture the participants' thought processes (including search strategies and interaction difficulties). We believe that the teaching method is particularly suitable for interactive tasks that require extensive navigation (moving between screens or windows), where the think-aloud method may reveal the participants' action history rather than their thoughts. Other benefits of the teaching method are:

1. A more natural approach. The teaching method avoids situations where the experimenter unwittingly constrains the participants' behavior (Kirakowski and Corbett, 1990).

2. No effect on task performance. The teaching method does not facilitate the initial stages of learning, which is a problem with the CTA method (Ahlum-Heath and Di Vesta, 1986). Berry and Broadbent (1990) have also shown that instructions to verbally report justifications for actions can improve performance of a task.

3. No effect on the time to complete the task. Since the teaching method is used at the end of the experimental task, time-based interaction data can be collected during the use of the system without being confounded by concurrent verbalizations. Ericsson and Simon (1984) have summarized evidence indicating that verbalizations during performance may have a significant effect on the time taken to complete a task.
REFERENCES

1. Nielsen, J. (1993a). Evaluating the thinking-aloud technique for use by computer scientists. In H. Rex Hartson and D. Hix (Eds.), Advances in Human-Computer Interaction, Vol. 3. Norwood, NJ: Ablex.
2. Nielsen, J. (1993b). Usability Engineering. New York: Academic Press.
3. Holleran, P. A. (1991). A methodological note on pitfalls in usability testing. Behaviour & Information Technology, 10, 345-357.
4. O'Malley, C. E., Draper, S. W., and Riley, M. S. (1984). Constructive interaction: A method for studying human-computer-human interaction. Proc. IFIP INTERACT '84 First Intl. Conf. Human-Computer Interaction (London, U.K., 4-7 September), 269-274.
5. Kennedy, S. (1989). Using video in the BNR usability lab. ACM SIGCHI Bulletin, 21, 92-95.
6. Bainbridge, L. (1990). Verbal protocol analysis. In J. R. Wilson and E. N. Corlett (Eds.), Evaluation of Human Work: A Practical Ergonomics Methodology. New York: Taylor and Francis.
7. Kato, T. (1986). What "question-asking protocols" can say about user interface. International Journal of Man-Machine Studies, 25, 659-673.
8. Mack, R. L. and Burdett, J. M. (1991). When novices elicit knowledge: Question-asking in designing, evaluating, and learning to use software. In R. Hoffman (Ed.), The Cognition of Experts: Empirical Approaches to Knowledge Elicitation. New York: Springer-Verlag.
9. Gelman, R. and Gallistel, C. R. (1978). The Child's Understanding of Number. Cambridge, MA: Harvard University Press.
10. Kirakowski, J. and Corbett, M. (1990). Effective Methodology for the Study of HCI. New York: North-Holland.
11. Ahlum-Heath, M. E. and Di Vesta, F. J. (1986). The effect of consciously controlled verbalization of a cognitive strategy on transfer in problem solving. Memory and Cognition, 14, 281-285.
12. Berry, D. C. and Broadbent, D. E. (1990). The role of instruction and verbalization in improving performance on complex search tasks. Behaviour & Information Technology, 9, 175-190.
13. Ericsson, K. A. and Simon, H. A. (1984). Protocol Analysis: Verbal Reports as Data. Cambridge, MA: The MIT Press.