Interacting with Computers 11 (1998) 147–172

The effect of interaction style and training method on end user learning of software packages

Sid Davis a,*, Susan Wiedenbeck b

a Department of Decision Sciences, School of Business Administration, East Carolina University, Greenville, NC 27858, USA
b Faculty of Computer Science, Dalhousie University, Halifax, NS, Canada B3J 2X4

Received 8 January 1997; received in revised form 16 August 1997; accepted 18 February 1998

Abstract

This paper reports two studies of software learning by individuals who use packages as a tool but never become experts. Using assimilation theory, we studied the effect of three interaction styles (direct manipulation, menu, and command) and two training methods (instruction and exploration) on the initial learning of a package and the subsequent learning of functionally equivalent packages. Results suggest that direct manipulation aids initial learning and that previous experience is a moderate aid in learning a subsequent package, but only when the interaction styles are similar. Exploration training does not appear to aid learners in a short training period. © 1998 Elsevier Science B.V. All rights reserved

Keywords: Interface style; Exploration-based training; Instruction-based training; End users

1. Introduction

We are concerned with the ability of end users to learn software packages and become productive with them. Today, vast numbers of people use packages in work or school, and many use them for personal tasks as well. These end users wish to learn software to support their own professional needs. There are many patterns of use among end users. They may use the software frequently or intermittently, but in general they do not use it intensively for many hours a day over long stretches of time, as do clerical workers. They rarely become experts, or power users, of the software. Frequently, they use several functionally equivalent packages as their job needs change or as they perform similar tasks in different environments. Thus, we see two situations for end users learning software packages: the initial learning of a software package and the subsequent learning of functionally equivalent packages.

These two situations are quite distinct. In initial learning, the end user has limited knowledge of what the software can do and how to manipulate it. In subsequent learning, the user has preconceptions about the capabilities of a package and about using it. Such prior knowledge may aid in learning a subsequent package; nevertheless, end users still do not approach the learning of a subsequent package from the position of strength of the true expert. A number of researchers have studied experts or highly practiced users as they transfer from one package to another [7,39]. However, research has yet to investigate less highly skilled users as they learn subsequent packages.

Clearly, many factors influence the initial and subsequent learning of software packages. We chose to study the interaction style and the training method because they appear to be two elements which play a prominent role in the process of software learning (e.g. Refs. [12,14]). The interaction style may have a strong impact on learning, particularly for users who are not computer professionals and who are characterized by an irregular or less intense pattern of use [33]. Styles which are difficult for learners to master may prevent them from attaining the level of skill which is necessary to achieve their minimal computing goals. Even if they learn to carry out a minimal set of tasks adequately, they may be discouraged from learning efficient methods or from expanding their skills to new tasks.

Effective computer learning can be facilitated through training. However, it appears that much training is itself ineffective. An examination of the literature on computer training reveals an array of problems experienced by learners, including false analogies drawn from non-computer experience [16], training materials which are voluminous, hard to understand, and hard to use [10], difficulty remembering syntax and semantics [6], and inability to use the training materials to recover from errors [10].

In the work reported here, three word processing systems which differed in interaction style were investigated for their effect on initial and subsequent learning. We use the term ‘interaction style’ broadly to refer not just to the interface type, but to a whole set of software and hardware features which define the user’s experience of interacting with the computer. For convenience, we use the labels direct manipulation, menu-based, and command-based to refer to the interaction styles, with the understanding that the terms are used in a broader sense than computer interface alone. With respect to initial learning, we wanted to find out whether people learn better with a direct manipulation interaction style (DMI) that mimics real world operations, whether the guidance of a menu-based interaction style is sufficient to aid learning, or whether the interaction style is unimportant, given careful training. In subsequent learning, we wanted to find out if the history of previous learning affects the learning of a functionally equivalent package for non-expert users. In learning a menu-based system after a DMI, we expected the similarities in the way of interacting to facilitate subsequent learning.

The command-based system, on the other hand, gave us the opportunity to explore the impact of underlying functionality versus surface similarities on subsequent learning. Two training methods were studied: instruction-based and exploration-based. We investigated the question of which training method leads to better initial learning. We also studied how training methods affect subsequent learning, both in the case where there are substantial similarities between the old and new system and in the case where there are fewer similarities.

Two experimental designs (reported as Part 1 and Part 2 below) were embedded into a single large scale experiment to investigate both initial and subsequent learning. The next section reviews the results of previous studies of learning software packages. The two following sections present Part 1 on initial learning and Part 2 on subsequent learning. The implications of this research for end user training are discussed in the final section.

2. Previous research

We approach an understanding of the factors involved in learning of software packages in terms of assimilation theory [2]. According to assimilation theory, meaningful learning leads to success in problem solving. Meaningful learning occurs when a person draws connections between the new information to be learned and related information already in long-term memory, sometimes referred to as an ‘assimilative context’ [2]. To do this, the learner must search long-term memory for appropriate anchoring concepts, retrieve them to short-term memory, then manipulate them actively to make the connections to the new information. Furthermore, the connections must be of a substantive rather than an arbitrary sort; so, for instance, similarity in word length would not be likely to result in meaningful learning of verbal material. Meaningful learning implies an understanding of basic concepts of the material. After meaningful learning, a person should be equipped to solve problems and even extend their knowledge to situations somewhat different from the context in which it was originally learned [23]. Thus, meaningful learning can be measured by determining the extent to which learners are able to apply learned elements in problem solving situations.

2.1. Initial and subsequent learning

Integration of existing and new knowledge is viewed as crucial in meaningful learning. However, in the initial learning of a software package, the learner often lacks an appropriate assimilative context and thus has few anchoring concepts to use as a basis for integration. The learner may be able to make use of some analogies to past non-computer experience, but in many instances non-computer analogies may be fairly sparse and perhaps not wholly appropriate [16,19]. Thus, the lack of relevant past experience may hinder learning by diminishing the learner’s ability to draw analogies between a known domain and the new domain. For example, the learner with little relevant experience may be unable to make connections between existing knowledge and models explicitly or implicitly presented in training or by the interface.

We focus on end users who have some competence as defined by their own work needs, but are not true experts on their initial software package. Anderson [1] presents evidence that proceduralization occurs in complex skills, such as computer programming, through extensive practice of the productions that describe the skill. Recent research on transfer has shown that, for participants trained to an expert criterion on one software system, the amount of time to carry out the corresponding tasks in another unknown software system may be modeled accurately by the degree of overlap of the productions describing performance in the two systems [7,28,39]. However, this does not describe well the situation of our population of interest, who are not experts on their initial software package. They know imperfectly or forget certain procedures in their initial software. When necessary, they reconstruct how to carry out these procedures using whatever resources are available to them in their own memory, the interface, and help sources. Furthermore, the foremost issue for them is not time to carry out a task in the unknown software, but ability to accomplish it at all. While some proceduralization, as described by Anderson [1], may occur in initial learning and aid in the learning of a subsequent, functionally equivalent package, we expect it to be fairly minor given the low experience of our population of end users. We argue that for these users learning a subsequent package is likely to be more a question of making ‘mindful generalizations’ from the initial to the subsequent package than of transfer of common elements which are automatically elicited [32]. Thus, we argue that subsequent learning by these users fundamentally involves problem solving and is better described by assimilation theory than by automatic processes of transfer. In terms of assimilation theory, learners of subsequent packages should possess useful anchoring concepts from their initial software, which will aid meaningful learning of the subsequent package. We expect better overall outcomes in subsequent learning than in initial learning.

2.2. Interaction style

Three general styles of interaction are commonly in use: direct manipulation, menu, and command. Direct manipulation allows users to carry out computer operations as if they were working on the actual objects of interest in the real world. The gap between the user’s intention and the actions necessary to carry it out is small. These two characteristics of direct manipulation are referred to as engagement and distance by Hutchins, Hollan, and Norman [21]. They argue that high engagement and small distance lead to a feeling of directness in an interaction. Many sources may contribute to the feeling of directness: continuous visibility of the object of interest, representation of objects in a familiar form (often implemented through icons), manipulation of objects by physical actions (pointing, clicking, touching, dragging) or labeled button presses rather than complex syntax, rapid incremental operations, reversibility of actions, and immediately visible feedback about the result of actions [38].

The menu-based interaction style represents objects and possible actions by a list of choices, usually presented through text. Menus are similar to direct manipulation in that they provide guidance to the user; thus, the burden on memory is reduced, interface constraints aid the user in structuring the task, and many kinds of syntactic errors, such as misspelled command names or illegal command arguments, are impossible [37]. However, menu-based interfaces are on the whole less direct than DMIs because the actions of users are mediated by the syntax and semantics embodied in the menus, and pointing devices are often replaced by keyboards. Thus, the user tends to lack a sense of performing actions directly on the objects of interest.

In a command-based interaction style the user types a command string in the vocabulary and syntax recognized by the system. The burden of remembering the commands is on the user, as is the burden of structuring a sequence of actions correctly to obtain a desired result [14]. Interactions are carried out via a keyboard, rather than by pointing, clicking, and dragging. The results of actions are often not as visible as in DMI or menu systems. Shneiderman [38] argues that in a command-based system there is a relatively large distance between the user’s intentions and the actions needed to carry them out, and the sense of working on the objects themselves is reduced dramatically.

The descriptions above characterize DMI, menu, and command-based systems generally. However, the question of how to operationalize the three interaction styles for research purposes is a difficult one. Some researchers have taken a reductionist approach: creating pure instantiations of different interaction styles for the purpose of comparison by developing carefully crafted ‘toy’ programs (e.g., Ref. [4]). This approach maximizes the differences between styles, thus allowing for a clean comparison. Its disadvantage is low ecological validity [42] because pure interaction styles, particularly a pure DMI style, are difficult to adapt to complex, multi-function software. In fact, most commercial software, with the exception of some games, consists of a combination of styles. Whiteside, Jones, Levy, and Wixon ([44], p. 185) argue that a reductionistic experimental approach, which simplifies systems and decomposes them into component variables, is probably futile because performance differences are due to an “inextricably complex interaction of causes”. They argue instead for holistic comparisons in which real systems are studied. In a holistic comparison, the largest sources of variance are identified and controlled, but minor sources of variance are allowed to exist as the price of studying complex systems.

We are concerned with training individuals to use realistic software systems. Therefore, we have chosen to increase ecological validity by carrying out this research using complex commercial software in a holistic comparison. One result of this decision is that the software packages we used for testing were hybrids, and the differences in interface style among them were a matter of degree. We see them as occupying points on a continuum where direct interfaces would be at one end and indirect interfaces at the other end. Viewed in this way, the interaction style that we refer to as direct manipulation is clearly the closest to the direct end of the continuum, even though it uses menus for certain operations. Likewise, the interaction style that we refer to as command-based is closest to the indirect end of the continuum, even though a few operations have an immediately visible effect on the object of interest, which is more characteristic of DMI systems.

Some prior research has investigated the influence of interface type on learning or performance, although the research is not as extensive as one might expect. Several studies have found an advantage for direct manipulation, usually implemented with icons, over commands [14,24,25,29,41,43,46] or menus [4]. Other studies have failed to find an advantage or have found decreases in performance [17,18,30,44]. Many of these studies are difficult to interpret because of the lack of a theory guiding the experimental manipulations, a problem also noted by Benbasat and Todd [4] and Davis and Bostrom [14]. To overcome this problem we base our predictions and interpretations on assimilation theory. Our work also differs from the majority of past research in studying systems at three points along the continuum from direct to indirect, rather than concentrating only on the polar extremes.
In fact, it may be argued that today the most interesting comparisons are between the extremes and the middle ground.

Assimilation theory argues that linking new information to prior knowledge in long-term memory is essential for meaningful learning. Mayer [23] showed that participants who were given prior appropriate concepts to which they could link programming knowledge did better at a programming task which required problem solving. Davis and Bostrom [14] point out that a direct manipulation interface can provide the basic anchoring concepts to which new knowledge can be assimilated, for example, the analogy of a desktop model to a manual file system. The small distance between the user’s intentions and the method of implementing them in the system may also lead to more experimentation in a direct manipulation interface. This is important because it is stressed in assimilation theory that the learner must work with the concepts actively in order to integrate them with other information in long-term memory. Based on assimilation theory, we argue that direct manipulation provides the elements necessary for meaningful learning. We expect better learning on a direct manipulation interface than on a command-based interface, as measured by problem solving.

A command language does not give the user an explicit set of anchoring concepts, although eventually, through use of a command language, users may develop their own model of the system. However, without anchoring concepts to use for assimilation, this model will not be easily built in early learning, when it is badly needed. Command languages also add distance between the user and the system because they force the user to express commands using the abstraction of an artificial language. The user’s intentions are less readily expressed because of the translation into the vocabulary and syntax of the command language. Feedback received must also be translated into meaningful terms because it is usually not given in the form of immediately visible changes in a set of domain objects.

The predictions of assimilation theory with respect to learning on a menu-based system are less clear. A menu system does not appear to provide a context for assimilation of new concepts in the same direct way as a DMI, because the interface does not support analogy well by providing such a highly visible, manipulable model. On the other hand, a menu-based interaction style does offer some potential advantages. Unlike a command language, the menu system displays the choices available on the screen in an organized way. The availability of an organized view of the functionality of the system may make it easier for users to construct their own understanding of the system. Also, the display of menu choices may make the gap between intentions and the actions to carry them out smaller than in a command language. The user still has to know the system terminology to initiate an action, but only needs to recognize terms rather than recall them.

In this research we are concerned with the effect of the computer interface in subsequent as well as initial learning. Most of the research on this topic does not apply directly because it was done in the context of expert attainment in the initial software [7,39]. In one relevant study, Streitz, Spijkers and van Duren [40] found that participants who learned a menu-based system first and then a command-based system performed no better than participants who used only the command-based system, although they did have a better overview of the capabilities of the system from having used the menus. Using assimilation theory, we argue that, given underlying equivalent functionality, the learner is likely to be facilitated because of the ability to make analogies to past experience with another interface type.
If the interaction styles are similar, the assimilation process is aided because the analogies between the prior and current system are more direct, and less processing is required. On the other hand, if the interaction styles are dissimilar, the process of assimilation is more difficult because analogies are not obvious, given the surface dissimilarities.


2.3. Training methods

Two training approaches have been discussed in the literature and are of interest in this study: instruction-based training and exploration-based training. A fundamental difference between the two is that exploration-based training tends to be more learner directed, while instruction-based training tends to be more formally structured by the learning materials. Davis and Bostrom [14] present a useful taxonomy in which they identify two major categories of features that distinguish these two approaches: process features and structural features.

Process features refer to the manner in which learning is carried out. These include the reasoning process used by learners, the level of programming, and the control of learning. In exploration learning the reasoning process tends to be inductive, where learners work from specific examples in an attempt to derive general rules or principles. This implies that the level of programming, or external control over learning, tends to be low. Learners engage in trial-and-error exploration of the software and, as such, exercise high control over the type and sequence of interactions with the system. Instruction-based learning, on the other hand, is deductive. Learners are first provided with general rules describing software functions, then work through examples that illustrate those rules. Instruction-based training also tends to be highly programmed. It leads a learner step-by-step through the learning process, leaving very little to the individual’s discretion. As such, the learner has low control over the learning process.

The second major category distinguishing instruction from exploration learning is the structure of learning materials. These structural features include the level of completeness of the learning materials and the orientation of the materials toward system features or holistic tasks. Exploration-based training materials tend to be incomplete. The few examples that they provide are intended to stimulate the learner’s curiosity to explore and experiment with concepts in order to derive new relationships and to resolve inconsistencies. Because of the incompleteness, exploration training materials are sometimes referred to as ‘minimal manuals’ [11,12]. Exploration learning materials also tend to focus on typical user tasks rather than the features of the system. This is in response to the production focus of learners [11,12]. That is, researchers have found that learners frequently want to accomplish something meaningful with the software (e.g., build a spreadsheet or format a document) rather than simply learn a collection of specific features (e.g., entering cell formulae or character formatting). Instruction-based training, alternatively, uses more complete learning materials that leave the learner little responsibility for discovering new knowledge. It also tends to focus on features of the software (e.g., how to copy text) rather than whole task accomplishment. Thus, instruction-based training materials tend to lack the focus of exploration training on integration of skills and real world tasks [12].

In terms of assimilation theory, a training approach will be successful if it promotes meaningful learning, and meaningful learning is achieved when the learner links new information to anchoring concepts in long-term memory. In assimilation theory, it is seen as essential that the learner actively works with the concepts to achieve integration of the new information with the old.
This description suggests that exploration-based training may promote meaningful learning. While Ausubel [3] was skeptical of exploration training for adults, Bruner [8] argues that exploration training helps learners organize information, making it more readily available for later application in problem solving. Applying ideas of assimilation theory to computer training, it may be argued that exploration learners are not simply provided with rules but are forced to devise and work with their own examples to induce the rules. Thus, the active orientation is present. Furthermore, the incompleteness of the materials may promote organization of the information and integration with prior knowledge, since the user must draw on all resources possible to fill in the gaps. The focus on real tasks may give the user a context for activities, and thus promote integration to the extent that the user has prior relevant experiences. Instruction-based training, lacking these characteristics, would not be expected to facilitate meaningful learning as well as exploration-based training.

Past studies outside the domain of computer training have compared instruction-based and exploration-based learning. These are reviewed well in Blake [5]. Generally, exploration training has been found to be superior for meaningful learning, as measured by problem-solving tasks. Interest in exploratory computer training has been expressed in the practitioner literature since the early 1980s [9]. Both positive and negative results have been reported in the literature. In a series of studies, Carroll and his colleagues found that in a word processing environment exploration-based training was superior by several measures, including time spent in learning and in performing evaluation tasks and successful completion of tasks [11,12]. Lazonder and van der Meij [22] also report on studies in which exploration-based training was more effective in learning a word processor. However, Olfman [27] found no difference between instruction-based and exploration-based training in the learning of a spreadsheet, nor did Davis and Bostrom [14] in learning a computer operating system. Other studies have suggested a need for guidance in exploration [35,45] and possible ways to provide such guidance [31]. The work on training wheels interfaces [13], which guard exploratory learners from the consequences of certain errors committed during exploration, also suggests a need for limitations on the exploration-based training paradigm.

In initial learning, assimilation theory does not predict an advantage for exploration-based training because the new user has few useful anchoring concepts in long-term memory. In subsequent learning, however, we argue that an exploration-based training approach will facilitate performance. Active assimilation of new to prior information is essential for learning to occur because, without it, the learner is not likely to see the applicability of the prior knowledge. Thus, a training method that promotes working with the material and actively trying out procedures will support the conditions necessary for learning. On the other hand, a training method that presents the user with a set of rules to follow may not provide enough encouragement for the user to draw a meaningful connection with prior knowledge, instead promoting rote learning, or memorization.

3. Part 1: initial learning

The purpose of Part 1 was to assess the effects of interface and training method on initial learning by participants who had little or no computer and word processing experience.


3.1. Participants

One hundred and seventy-three participants were selected for the study from several sections of an undergraduate computer tools course. The participants selected for the study were screened from a pool of about 450 volunteers, and were selected on the basis of having little or no knowledge of computers or word processing. The participants received course credit for participating. The average age of the participants was 21.38 years, and the average college grade point average was 3.08 on a scale from 0 to 4. There were 106 male and 67 female participants. On average the participants had first been introduced to computers in high school. They had taken an average of one previous course about computers in high school or college, and they had taken one previous course in high school or college which involved some computer use (e.g., using a spreadsheet in a course assignment). The participants’ self-reported previous use of word processing was 1.82 on a scale from 1 to 5 (SD = 0.68). Thus, it fell between the designated usage categories of ‘not at all’ and ‘a little’. This supports our contention that the participants had little prior word processing experience. Also, evidence taken from a separate protocol analysis suggests that participants had not had enough prior experience to develop biases or strong opinions about different word processing systems, or about the interaction styles represented in this study.

3.2. Materials

The domain selected for this study, word processing, offers several advantages for a study of this type. First, since it is a common application domain, the findings of the study should be applicable to a wide variety of users. Second, given the large number of applications available, word processing allows us to control the functionality of the software while selecting applications that vary in their interaction style.

As discussed previously, commercial software packages represent a mixture of interaction styles. The packages which we used can be seen as occupying places along a continuum extending from a direct style of interaction at one end to an indirect style at the other. The direct manipulation interface was the closest of our packages to the direct end of the continuum. The command-driven interface was closest to the indirect end of the continuum, and the menu-driven interface was intermediate. Thus, the DMI and menu-based systems can be considered more similar to each other, based on their relative positions on the continuum, than the DMI and command-based systems. All three systems had equivalent word processing functionality with respect to the features and tasks used in this experiment. Similarities and differences between the three systems are summarized in Table 1 and discussed in the following paragraphs.

The DMI, Word for the Macintosh, had a number of features which led us to classify it as the most direct of our interaction styles. The features which make it direct are:

1. continuous visibility of objects being worked on;
2. representation of objects in familiar form to promote analogy, text for a document or descriptive icons for operations;


3. actions carried out directly on objects of interest, either text or icons, without the intermediary of a command-language or menu choices;
4. use of a mouse for selection of text and command icons, allowing direct actions on objects by pointing, clicking, and dragging;
5. immediate feedback on the result of operations by a change in the visible features of screen objects.

On the other hand, this DMI, like virtually all DMIs, is a hybrid in terms of interaction style, using menus for some editing and formatting operations. This feature detracts from directness because direct action on text or objects is replaced by the intermediary of the menus. However, participants were taught to use menu options only in a very few operations for which direct manipulation options did not exist. Command shortcuts were also available for some operations, but our participants were not taught them. Close monitoring of the participants during training and evaluation showed that they did not discover or use any command shortcuts on their own. It is important to note, with regard to the hybrid nature of the software, that the authors do not view direct manipulation as a purely nonlinguistic phenomenon. What makes an interface direct manipulation is its ability to give the user the overall feeling of operating directly on the objects of interest.

Table 1
Comparison of computer environments

Continuous visibility. DMI: high. Menu: high. Command: medium (text continuously visible but available actions not so).
Analogy (text, icon). DMI: both. Menu: text only. Command: text only.
Direct manipulation of objects. DMI: high (mouse and cursor for pointing, clicking, and dragging). Menu: medium (cursor for selection). Command: limited (most object manipulation by commands without direct action on objects).
Reliance on menus. DMI: low/medium. Menu: high. Command: none.
Use of command language. DMI: none. Menu: none. Command: high (used for editing and formatting operations).
Immediacy of feedback. DMI: high. Menu: medium/high. Command: low.
Hardware. DMI: Macintosh. Menu: PC. Command: Macintosh as terminal to mainframe.
Display. DMI: 10 inch screen. Menu: 10 inch screen. Command: 10 inch screen.
Keyboard. DMI: non-extended. Menu: non-extended with function keys (function keys not used in experiment). Command: non-extended.

The command-based software represented the least direct of the three interaction styles. It consisted of a full screen editor, Vi, used in combination with a text formatting program, TROFF. This word processing system ran on a mainframe and was accessed by using a Macintosh as a terminal (with the mouse removed). In this system, the text of a document and formatting commands are placed in a file created with the editor. Then the file is processed by the formatting program to produce a file containing the formatted document, ready to be printed. In our experiment, participants worked in a single window environment, so they could not run the processor to see the formatted document side by side with the file they were creating. This system is very indirect for several reasons:

1. actions are expressed by artificial and often non-mnemonic command codes;
2. references to a document are expressed indirectly, for example, to select multiple text lines for copying one must go to the first line of the segment to be operated on, count the number of lines, and issue a command to copy that number of lines;
3. the results of actions are not immediately visible, since formatting commands are only applied to the document when it is processed by the formatting program.

(An illustrative fragment of this style of interaction is shown at the end of this discussion.)

Even though the system is very indirect, it should be pointed out that it too is a hybrid, with a few features that can be considered more direct. For example, a whole screen of text is visible at once, not just isolated lines, as in line editors or small window electronic typewriters. Also, to correct or add to the text in a full screen editor one goes to the location in the text where it is to appear and types. Nevertheless, we argue that overall the features of this software give it a distinctly greater feeling of indirectness than the other two systems.

The menu interface used in this study was Word for the PC. The version of PC Word used in this experiment runs under the DOS operating system. While versions of PC Word running under the Windows operating system are very similar to Macintosh Word in their incorporation of direct manipulation features, this DOS-based version is strongly menu-driven because all operations are carried out by making choices in hierarchical menus. Compared with Macintosh Word, the actions of users are mediated to a much greater extent by the linguistic representation of operations embodied in the textual menus. A mouse may be used optionally in this menu-based software but was not used in our experimental environment. Rather, specification of operations was done by using the arrow keys to select operations from lists and using the keyboard to fill in parameters. Also, all text selection in this version was done by using arrow keys to move the cursor over the desired text, not clicking or dragging with a mouse. Because of the linguistic representation of operations and the less direct keyboard selection, we argue that users of this menu-based software experienced less of a sense of directness than participants using the DMI. Nevertheless, several features of the interface are direct manipulation-like. For example, while selection is done using the keyboard, it does involve action on the text itself, not an indirect reference to the text. Also, the results of most actions are immediately visible. As in the DMI, some command shortcuts were available, but participants were not taught them and our monitoring showed that they did not discover and use them on their own. Based on the above features, we consider the menu interface to occupy the middle place of our three interfaces but to be closer to the direct manipulation interface than to the command-based interface.

As is clear from the previous discussion, our three word processing systems ran on different hardware platforms. Thus, our experiment should be seen as a ‘holistic’ comparison, similar to those done by previous researchers whose goal was to study realistic systems [14,44].

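To give a concrete feel for the indirectness described above, the fragment below shows the general flavor of copying lines with a count-prefixed yank in Vi and of embedding TROFF formatting requests in the document file. It is an illustrative sketch of how such systems are conventionally used, not an excerpt from the study's training materials, and the sample text is invented.

    3yy              Vi: yank (copy) three lines, starting at the cursor line
    p                Vi: put the yanked lines below the current line

    .ce 1            TROFF request: center the next input line
    Quarterly Report
    .sp 2            TROFF request: insert two blank lines
    .in 0.5i         TROFF request: indent subsequent text by half an inch

None of the formatting requests has any visible effect while the file is being edited; their results appear only after the file is run through the formatter.
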
The largest sources of variance were controlled because: (1) the functionality of the three systems was equivalent for the features used in the study, (2) participants used common training materials which differed only in the specific commands and procedures of the system they were learning, and (3) participants’ learning was evaluated using the same set of evaluation tasks. With respect to software, participants used only the word processor and had no exposure to other applications or the system software. With respect to hardware, the DMI and command participants used the same kind of 10 inch Macintosh monitors. The menu participants used a different monitor but of the same size. All participants worked in a single window environment, in a window set up to hold the same number of lines of text, using a non-extended keyboard. The PC keyboard contained function keys, but they were not used in this experiment. There were no obvious differences in system response time, and participants were never slowed by waiting for a system response.

Given these similarities, we argue that the principal differences experienced by the participants were the striking differences in the interaction styles of the word processing systems, including the method of input. The DMI system used a mouse and keyboard, while the menu and command systems used keyboard only. While this is a hardware difference, we consider it to form an integral part of the interaction styles that we were testing in our experiment, based on the argument that manipulation of objects by physical actions of pointing, clicking, and dragging is a fundamental component of the directness of DMI systems [4,14,38]. A questionnaire after training and a subsequent separate protocol analysis supported our arguments that participants were not strongly impacted by extraneous hardware differences; no participants reported perceptual or mechanical problems related to the different platforms.

The second independent variable was training method, consisting of instruction-based and exploration-based training. Training manuals were created for both the exploration and instruction-based treatments. Both types of manuals were structured as self-study tutorials. Altogether, there were six training manual/interface combinations. Each manual conveyed the same basic material about how to perform basic word processing operations. The instruction-based manual focused on features of the software. It contained sections explaining how to: position the cursor in the text, insert text, format characters, format lines, format paragraphs, indent paragraph margins, copy text, set tab stops, and insert page breaks. The exploration manual, on the other hand, focused on tasks. It grouped the same material under more task-based headings: typing text, rearranging and altering the appearance of text, and formatting a document. All manuals contained similar elaboration of concepts when necessary and provided instructions to participants about what to do if something went wrong. The examples and wording used across manuals were as similar as possible.

The differences between the exploration and instruction-based versions reflected the basic characteristics of the two training approaches. Exploration manuals never provided step-by-step procedures or rules to follow. Instead, they provided a few basic examples of each concept, then required participants to create their own examples to explore through trial-and-error. Instruction-based manuals, on the other hand, first provided specific rules for each word processing feature, then gave participants the same basic examples to work through following the step-by-step rules.
Instruction-based participants worked only with the rules and examples given. They did not create examples of their own to explore the system. Learning was measured by participants’ performance in a set of ten hands-on tasks and was operationalized in two ways: simple problem solving and complex problem solving.


The tasks were carried out in a five page document different from the training document. The seven simple problem solving tasks included inserting text, deleting text, changing line spacing, adjusting margins, inserting page breaks, bolding, centering, and underlining. The tasks represent basic word processing functionality, but they were a problem solving situation for our beginner participants. Understanding the nature of the formatting to be done was fairly easy, as the task statement indicated unequivocally what needed to be done (e.g., set left margin of the second paragraph to 1.5 inches). The problem solving entered into the tasks from two sources: (1) tasks occurring in a different context in the evaluation than the context encountered during training, and (2) error recovery while carrying out the tasks.

The context problem occurred when participants applied operations to different text states. For example, during training participants inserted a page break on a blank line in the training document. In the evaluation they had to insert a page break, but there was no blank line on which to place it. Participants had to figure out how to create a blank line, which required understanding how line wrap works, something they had not explicitly dealt with in training. Also, participants sometimes had to apply two or more different operations to the same text object, and this could require selecting the object using different selection methods depending on the operations. While participants were familiar with the operations from the training, they had to problem solve to deal with the seeming inconsistency that arose from their juxtaposition. For example, to center a heading in the text the subject could place the cursor anywhere in the heading, whereas in a subsequent task the subject had to highlight the whole heading in order to bold it. If participants failed to distinguish these situations correctly they risked failing in their operation or creating unwanted effects in the rest of the document.

Finally, in learning software there is a great potential for errors in carrying out tasks. Since we were dealing with non-experts, we were not concerned with unnecessary keystrokes, but rather with actions which could, if left uncorrected, lead to whole or partial failure to complete a task. During training, participants had to be able to recognize when an action led to an undesired state and recover from it. Much problem solving in simple word processing tasks was introduced by the need for such error recovery. A separate videotaped protocol analysis showed errors in the execution of more than 60% of the simple tasks. However, through error recovery the participants corrected a large proportion of their errors.

The complex problem solving tasks required participants to combine several simple operations to achieve a more complex goal: a formatted section of text. In the task instructions, participants were presented with a formatted section of text, then were asked to create a similar looking section of text in their document. Although they could see what the target text should look like, they were not told specifically what types of formatting to include nor how to organize the formatting into a set of steps.
These complex tasks required participants to: (1) identify the types of formatting needed to produce the desired outcome, (2) decide on the steps for carrying out the formatting, (3) determine the correct order of the steps, and (4) carry out the steps, dealing with any errors or unforeseen interactions arising during execution. For example, participants were presented with a sheet of paper showing a table in four columns with a heading centered at the top of each column and numeric data lined up on the decimal point inside. They were asked to reproduce it in the word processor. First the participants had to notice features of the table, such as the centering and the alignment on the decimal point. Then the participant had to decide to use margins, centering, and tab stops together to format the table. Next the participant had to determine the order in which to apply the operations because certain orderings made it extremely difficult to achieve the task successfully (e.g., entering the text of the table first and afterwards trying to set the margins, centering, and tab stops). Finally the participant had to enter the data and make changes if it did not look exactly like the paper copy. By requiring participants to identify and solve problems, complex tasks also measured meaningful learning but at a deeper level than the simple tasks. Complex problem solving has been similarly operationalized in related studies [14,36]. The seven simple and three complex tasks were arranged in a pseudo-random order.

3.3. Experimental procedure

Half the participants were randomly assigned to the DMI and one-quarter each to the menu and command-driven software. Training for each interaction style was held separately to avoid participants becoming aware of the different styles during the experiment. Training sessions included from 10 to 12 participants each. Eight such sessions were held to train participants assigned to the DMI and four each to train those assigned to the menu and command-based systems. Each training session was 2 hours long. Two experimenters conducted training sessions following a standard script.

In the first 10 minutes of each session the experimenter presented a brief overview of the activities that would follow, and participants filled out some short questionnaires. Next, participants were given 55 minutes to work through the training manual on their own. Each word processing operation in the manual and its related example task referred to sections of a four-page document which participants worked with and modified on their computers. They also had at their disposal a paper copy of the document in which tasks were number keyed to the appropriate sections of the manual. Participants worked through the training manual from beginning to end. If they encountered problems which they could not solve for themselves, the experimenter referred them to the sections of the manual that offered solutions. If participants were still unable to resolve their problems using the manual, the experimenter gave brief hints. Participants were not allowed to communicate among themselves.

Exploration training participants were encouraged by the manual to explore concepts on their own as they worked through the manual. Each section contained a subsection called ‘on your own’ dedicated to experimentation and trial-and-error. Also, a section at the end of each exploration manual encouraged participants to go back to explore any topics they wanted, time permitting. Again, they were encouraged to create their own examples or to experiment with any of the previous examples. Instruction-based participants worked through the training manual by first reading general rules for performing operations, then carrying out examples following the rules. They were not allowed to explore or experiment on their own with the concepts. If participants finished working through the manual before their time was up, they were allowed to re-read any of its sections, but were not allowed to create and explore their own examples. Participants were monitored by the experimenter to ensure that they worked on the manual examples without doing exploration going beyond the examples.
This measure was taken in order to avoid the potentially confounding effects of allowing both exploration and instruction-based participants to explore the system.


Table 2
Means and standard deviations of interaction style and training method in Part 1

                DMI           Menu          Command       Explore       Instruct      Grand mean
Simple (a)      34.67 (7.09)  30.61 (7.86)  26.70 (8.68)  31.35 (9.10)  31.88 (7.56)  31.61 (8.36)
  Explore       34.37 (7.88)  30.18 (8.85)  26.83 (9.67)
  Instruct      34.98 (6.21)  31.05 (6.91)  26.57 (7.69)
Complex (b)      9.16 (3.10)   8.25 (3.07)   5.05 (2.68)   7.94 (3.65)   7.82 (3.20)   7.88 (3.43)
  Explore        9.33 (3.34)   8.32 (3.33)   5.00 (2.83)
  Instruct       9.00 (2.87)   8.18 (2.86)   5.10 (2.57)

(a) Maximum score 42. (b) Maximum score 18.

The hands-on evaluation lasted 35 minutes. This was enough time for a large majority of participants in all conditions to attempt each task. Participants worked on a document which was different in content but similar in length and format to the training document. They were not allowed to use the training manual. However, they were supplied with a paper copy of the online document in which they were to carry out the evaluation tasks, a brief command/operation summary sheet, and the evaluation task list containing the tasks to be performed. The command/operation summary sheets provided help in recalling the basic operations by providing simple clues for each one (e.g., menu hierarchies in the menu-based system). This was done in order to prevent the evaluation from becoming a memory test. Participants worked through the tasks in the order in which they were given. They were instructed that if they became stuck on a task (i.e., spending more than 5 minutes on it) they should continue to the next task and come back to it if time allowed. The experimenters provided no help during this hands-on task phase, nor did they allow participants to communicate with one another.

3.4. Results

Two independent judges who were not aware of the experimental questions or design of the study scored the task sets based on the final work of the participants. Thus, errors committed and corrected while working on the task set were not visible to the judges. The judges used specific guidelines to grade each task on a scale from 0 to 3. Tasks that were completely correct received a score of 3. Those that were completely incorrect or were not attempted received a score of 0. Tasks that were mostly correct but contained minor errors were assigned a score of 2. Tasks that were mostly incorrect were scored 1. The final score assigned to each task was the sum of the scores of the two judges and ranged from 0 to 6. Analysis of the judges’ scoring yielded an inter-rater reliability of 0.86. Also, tests of inter-item reliability yielded a Cronbach’s alpha of 0.76, within the limits recommended by Nunnally [26].

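The paper reports these reliability coefficients without formulas. As a rough illustration, the sketch below computes an inter-rater coefficient (here taken as the Pearson correlation between the two judges' scores; the paper does not state which coefficient was used) and Cronbach's alpha over the ten task items. The score arrays are randomly generated stand-ins, not the study's data.

    import numpy as np

    # Stand-in data: 173 participants x 10 tasks, each task scored 0-3 by
    # two judges (illustrative only; not the study's actual scores).
    rng = np.random.default_rng(0)
    judge1 = rng.integers(0, 4, size=(173, 10))
    judge2 = np.clip(judge1 + rng.integers(-1, 2, size=(173, 10)), 0, 3)

    # One common operationalization of inter-rater reliability: the
    # correlation between the two judges' scores across all graded tasks.
    inter_rater = np.corrcoef(judge1.ravel(), judge2.ravel())[0, 1]

    # Final item score per task: sum of the two judges' scores (0-6).
    items = judge1 + judge2

    # Cronbach's alpha over the k = 10 task items:
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals).
    k = items.shape[1]
    alpha = (k / (k - 1)) * (1 - items.var(axis=0, ddof=1).sum()
                             / items.sum(axis=1).var(ddof=1))

    print(f"inter-rater r = {inter_rater:.2f}, Cronbach's alpha = {alpha:.2f}")
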


The data were analyzed using two-way between-participants Analysis of Variance. The independent variables were interaction style with three levels (DMI/menu/command) and training method with two levels (exploration/instruction). The dependent variables were: (1) the correctness score on simple tasks, which was the sum of the scores on the seven individual tasks and ranged from 0 to 42, and (2) the correctness score on complex tasks, which was the sum of the scores on the three complex tasks and ranged from 0 to 18. Table 2 shows the means and standard deviations of the dependent variables. (A sketch showing how this style of analysis can be reproduced with standard statistical software appears at the end of Section 3.5.)

The ANOVA for simple tasks showed that there was a significant effect of interaction style (F(2, 167) = 15.68, P < 0.001). However, training was not significant (F(1, 167) = 0.146, P < 0.703), nor was the interaction style by training interaction (F(2, 167) = 0.065, P < 0.937). The effect of interaction style on simple tasks was further analyzed with a Tukey’s test. It showed that the simple task scores of the DMI group were significantly higher (P < 0.05) than the scores of the menu or command-based group. Also, the menu group was significantly higher than the command-based group (P < 0.05).

For complex tasks the effect of interaction style was significant (F(2, 167) = 27.54, P < 0.001). A follow-up Tukey’s test showed that the DMI and menu-based groups scored significantly higher (P < 0.05) than the command-based group on complex tasks. However, there was no significant difference between the DMI and menu-based groups. Training was not significant (F(1, 167) = 0.14, P < 0.710), and the interaction of training and interaction style was not significant (F(2, 167) = 0.071, P < 0.931).

3.5. Discussion

In initial learning both the DMI and menu participants performed better than the command-based participants on simple and complex problem solving tasks. In general, the pattern of performance on the different interaction styles was similar across both task types. However, the scores of all groups on the complex problem solving tasks were lower as a percentage of the total possible score than on the simple tasks, reflecting the increased difficulty of these tasks.

The superior performance of the DMI participants over command-based participants on both task types suggests that the DMI provided a model in the interface that was more conducive to assimilation of word processing knowledge. There are two possible explanations for this. One is that the appearance of the direct manipulation interface provided more clues about the capabilities of the system and about how to perform certain functions. For example, the DMI displayed a ruler bar across the top of the screen with arrows at each end to indicate the edges of the margins, icons for creating different line alignments and spacings, arrows for marking points along the ruler line, and scroll bars along the edges of the window. The command-based system had none of these. These clues, we argue, bear more similarity to how non-computer users might think of working with a document than the ‘clues’ offered by the command-based system (i.e., simply a screenful of text).

A second, and perhaps more important, explanation is the way in which each interface allowed users to interact with the system. Assimilation theory says that an individual must actively work with information in a new domain in order to link it with existing knowledge.
By working with this new information the user discovers the similarities and dissimilarities between the old and new domains and, eventually, is able to generalize existing knowledge structures to incorporate the information from the new domain. The DMI provided a much more direct way of interacting with the system and interpreting its output than the command-based system did. Protocols which we collected from a different group of subjects carrying out the same training suggest that it encouraged users to work with the system with far less fear of performing operations that they did not understand or of producing errors. DMI users performed many operations by simply pointing and clicking, with feedback from these operations immediately visible in the form of changes in manipulated objects. Command-based users, on the other hand, had to use cryptic syntax whose effects on the objects of interest were often not apparent. This required them to filter their intentions through the command language and interpret output by understanding the impacts that certain commands were supposed to have on the text. The lack of feedback was further magnified by the fact that subjects did not have the ability to print their documents in any of the experimental conditions.

The command-based participants made more errors during the practice than the DMI subjects, according to our observations and protocols, but help was available in the manual and, if that failed, from the experimenter. In spite of making more errors, our observations show that the command-based subjects were able to complete the examples in the training manual. Thus, their difficulties cannot be attributed to lack of time to finish the training. Instead, we believe that the mode of interacting with the command-based system placed a very heavy burden on cognitive processing. It was more difficult to understand and interpret the results of each operation, and this made it very difficult to understand the relationship between different types of operation, a requirement for meaningful learning. This had an adverse effect on performance. Also, in motivational terms we hypothesize that a non-intuitive mode of interacting may have failed to encourage users to work with the system. Our experiment did not test this hypothesis, but future work to test it is planned.

The difference between the DMI and menu groups on simple problem solving tasks can be explained in terms of the lack of a concrete, manipulable model in the menu-based interface. The menu-based system lacked the manipulable interface objects of the DMI. This may have been important, since the beginner participants did not have a well-developed internal model of word processing for assimilation. Thus, they had to rely on the model presented in the interface as the assimilative context to support problem solving, and it did not support them as well as the DMI model. Furthermore, as argued in the literature on direct manipulation [14], the less direct method of interacting using the keyboard, rather than clicking and dragging with a mouse, may also have played a role in the difference in performance between DMI and menu-based systems by reducing the feeling of working directly on objects. The difference in simple problem solving between menu and DMI systems supports the results of Benbasat and Todd [4] that a DMI is superior to a menu-based system in performance of problem solving tasks.

The lack of difference between the DMI and menu systems on complex tasks may indicate that the complex tasks required a level of integration of knowledge which was difficult to achieve given participants' modest experience with simple tasks and their weak context of related information in long-term memory.

It appears that the menu-based system offered advantages in learning over the command-based system for simple and complex tasks. We argue that the superiority of the menu system is a direct result of two phenomena.


One is the reduction in cognitive processing load associated with selecting commands from menus rather than having to retrieve them from memory, as in the command-based system. The other is the ability to receive direct and immediate feedback in the form of visible changes in the objects being manipulated, similar to the DMI. In addition, it is possible that the visibility provided by the lists of commands in the menus may have led menu users to try out solutions that would not have occurred to them in a command-based environment.

The results related to training methods were inconclusive. Since past results have been mixed, it was not obvious what to expect with regard to the effect of training on the dependent variables. Training method had no significant effect; in fact, none of the results even approached significance. Previously we argued that an exploratory training approach would encourage learners to actively manipulate the material they were working with, and that this in turn would aid assimilation of new material. This argument may not be very relevant for initial learning, in that learners have little pre-existing knowledge in long-term memory and thus a weak basis for assimilation of new information. Under these conditions, exploratory activities may not have an advantage over other training methods. Another possibility is that beginners are so bombarded by new information and procedures in initial learning that the differences between instruction and exploratory training appear rather subtle and do not have an impact. Also, our college-age participants may not have had good ideas of what kinds of tasks would be useful to explore. In Carroll et al.'s [11,12] studies of exploratory learning the participants were office personnel who knew a great deal about document preparation in a non-computer environment, and this knowledge may have helped them to make the best use of open-ended training.

4. Part 2: subsequent learning

The purpose of Part 2 was to assess subsequent learning of a functionally equivalent software package by participants who had previous experience using a different interaction style. The participants who were trained on the direct manipulation interface in Part 1 received a second training session, either on the menu-based system or the command-based system. Their performance on the set of evaluation tasks after training was compared with that of participants who learned the menu or command-based systems in their initial training session. This design allowed us to assess how the previous basic training in word processing, as well as interaction style and training method, affected their attainment of skill in learning the subsequent software package.

4.1. Participants

The same 173 individuals who participated in Part 1 also participated in Part 2. The participants who received a second training session for this study had received initial training on the DMI in Part 1. They had achieved a moderate level of competence but were not experts in the initial software. This is shown by the scores in Part 1, in which these participants achieved 83% correct on simple tasks. Thus, this group allowed us to pursue our objective of studying non-experts as they went on to learn subsequent functionally equivalent software.


4.2. Materials

The same materials were used as in Part 1, except that the DMI materials were not used in this part.

4.3. Procedure

The DMI participants from Part 1 were randomly assigned to a second training session on either the menu or the command-based word processing software. Participants remained in the same training method to which they had been assigned in the first session. The training session took place 2 or 3 days after the initial training on the DMI. This time lag was chosen to be short enough that participants would likely retain much of their knowledge from the initial training with the DMI. The DMI was chosen as the initial software to be learned because the characteristics of direct manipulation, notably the concrete physical model and visible feedback, were conjectured to be most likely to facilitate the construction of a base of knowledge for assimilation, as previously supported by Davis and Bostrom [14] and Benbasat and Todd [4]. The procedure in the second training session was the same as in the first session, consisting of instruction or exploration-based training on the assigned interface followed by the set of ten evaluation tasks comprising simple and complex problem solving.

4.4. Results

The same scoring procedure was followed as in Part 1. The inter-rater reliability was 0.86. The test for inter-item reliability yielded a Cronbach's alpha of 0.74, again within the limits recommended by Nunnally [26].

The data were analyzed using a three-way between-participants ANOVA. The independent variables were interaction style with two levels (menu vs. command), session with two levels (session 1 initial learning vs. session 2 subsequent learning), and training method with two levels (instruction vs. exploration). The dependent variables were the total score on simple problem solving tasks and the total score on complex problem solving tasks.

The ANOVA for simple tasks showed significant effects of interaction style (F(1, 165) = 28.07, P < 0.001) and of session (F(1, 165) = 7.39, P < 0.001). The means in Table 3 show that performance was better with the menu-based interface than with the command-based interface, and better in session 2 than in session 1. The effect of training method was not significant (F(1, 165) = 0.013, P < 0.909). The interaction style by session interaction was significant (F(1, 165) = 4.24, P < 0.041) and is analyzed further below. The other two-way interactions and the three-way interaction were not significant: interaction style by training (F(1, 165) = 1.48, P < 0.22), training by session (F(1, 165) = 0.018, P < 0.892), interaction style by training by session (F(1, 165) = 0.85, P < 0.358).

The ANOVA for complex tasks indicated a significant main effect of interaction style (F(1, 165) = 86.90, P < 0.001). Training was not significant (F(1, 165) = 0.13, P < 0.718), and session fell marginally short of significance (F(1, 165) = 3.19, P < 0.076). The two-way interaction between session and interaction style also fell short of significance (F(1, 165) = 3.02, P < 0.084).
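For completeness, the inter-item reliability figure reported above can be computed directly from the participants-by-tasks score matrix. The short sketch below is illustrative only; the function name and the layout of the input are our own assumptions, not artifacts of the original analysis.

    # Illustrative sketch: Cronbach's alpha for a participants x tasks score matrix.
    # alpha = k/(k - 1) * (1 - sum(item variances) / variance(total scores))
    import numpy as np

    def cronbach_alpha(scores):
        """scores: 2-D array; rows are participants, columns are evaluation tasks."""
        k = scores.shape[1]                         # number of tasks (items)
        item_vars = scores.var(axis=0, ddof=1)      # variance of each task's scores
        total_var = scores.sum(axis=1).var(ddof=1)  # variance of participants' totals
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # The paper reports alpha = 0.74 for the ten Part 2 evaluation tasks.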

Table 3
Means and standard deviations of interaction style and training method in Part 2

                        Simple a                                   Complex b
              Mean          Explore       Instruct      Mean         Explore      Instruct
Session 1
  Menu        30.61 (7.86)  30.18 (8.85)  31.05 (6.91)  8.25 (3.07)  8.32 (3.33)  8.18 (2.86)
  Command     26.70 (8.68)  26.83 (9.67)  26.57 (7.69)  5.05 (2.68)  5.00 (2.83)  5.10 (2.57)
  Grand mean  28.66 (8.46)  28.47 (9.32)  28.86 (7.56)  6.65 (3.28)  6.62 (3.48)  6.64 (3.11)
Session 2
  Menu        36.31 (6.59)  37.14 (4.87)  35.48 (8.00)  9.74 (2.52)  9.62 (2.38)  9.86 (2.71)
  Command     27.51 (7.73)  26.73 (7.27)  28.33 (8.28)  5.07 (2.68)  5.46 (2.70)  4.67 (2.65)
  Grand mean  31.86 (8.41)  31.81 (8.09)  31.90 (8.82)  7.38 (3.49)  7.49 (3.28)  7.26 (3.73)
Grand mean    30.23 (8.56)                              7.01 (3.40)

a Maximum score 42.
b Maximum score 18.


Fig. 1. Comparison of computer environments.

The other two-way interactions and the three-way interaction were not significant: interaction style by training (F(1, 165) = 1.11, P < 0.348), training by session (F(1, 165) = 0.09, P < 0.760), interaction style by training by session (F(1, 165) = 0.557, P < 0.457).

We further investigated the interaction style by session interaction by performing tests of simple main effects. Huck, Cormier and Bounds [20] recommend the use of simple main effect tests in the event of interactions in order to clearly identify the types of effects that are present. The procedure for the simple main effects analysis of a two-way interaction is to hold the level of one factor constant while comparing the levels of the other factor. We began by holding the interaction style constant as command (N = 87) and analyzing the difference between sessions. The results showed that there was no significant difference between sessions for this group for simple tasks (F(1, 85) = 0.21, P < 0.648) or for complex tasks (F(1, 85) = 0.002, P < 0.966). However, going on to consider the menu-based group only (N = 86), we found that there was a significant difference between sessions for simple tasks (F(1, 84) = 13.20, P < 0.001) as well as complex tasks (F(1, 84) = 6.02, P < 0.016). The interaction is graphed in Fig. 1, which shows clearly that the group which used the menu-based interface in session 2, after experience with the DMI, was superior to the group that learned it in the initial session. A corresponding improvement was not seen for the group which learned the command-based interface after DMI experience. This interaction takes precedence over the main effect of session reported earlier, in that the difference between sessions was confined to the menu group.
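In code, this slicing procedure amounts to subsetting on one factor and testing the other within each subset. The fragment below is a sketch under our own assumptions (hypothetical file and column names); for simplicity it uses a separate one-way ANOVA within each slice rather than the pooled error term from the omnibus analysis.

    # Illustrative sketch: simple main effects for the style x session interaction.
    # Hold interaction style constant and compare session 1 with session 2.
    import pandas as pd
    from scipy.stats import f_oneway

    df = pd.read_csv("part2_scores.csv")  # hypothetical: style, session, simple_score

    for style in ("command", "menu"):
        group = df[df["style"] == style]
        s1 = group.loc[group["session"] == 1, "simple_score"]
        s2 = group.loc[group["session"] == 2, "simple_score"]
        f_stat, p_val = f_oneway(s1, s2)  # one-way ANOVA within the slice
        print(f"{style}: F = {f_stat:.2f}, P = {p_val:.3f}")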


4.5. Discussion

The principal finding of Part 2 is that prior experience with the DMI aided learning only for the menu group. The menu users with prior DMI experience scored higher than the menu users without prior DMI experience. There was no corresponding effect among the command-based users, for whom the scores were very much the same whether or not they had prior DMI experience. The results do not support an across-the-board argument that DMI experience facilitates users' learning of subsequent functionally equivalent software packages.

The question then becomes: why did DMI experience help menu participants but not command participants? As discussed earlier, the features of the menu-based system placed it closer to the DMI on a continuum from direct to indirect interaction styles. This suggests that the similarity between the DMI and menu systems established a stronger basis for using analogy to learn menu system concepts. Since participants moving to the menu system found both its functionality and its outputs very similar to those of the DMI 'model', they are likely to have devoted the bulk of their cognitive processing effort to generalizing that model by assimilating menu choices to their previous knowledge of DMI procedures and learning to use keyboard input. The command-based system, in contrast, differed from the DMI not only in its use of commands instead of direct manipulation, but also in the form of its output. Most of the changes in the text as a result of editing were not immediately visible, as they were in the DMI. Therefore, participants had to devote more cognitive effort to deciphering the meaning of both input and output, using that meaning to build a mental model of the command-based system, and then drawing analogies between it and the model they already had of the DMI. Certainly, this would impede the process of meaningful learning and lead to reduced performance.

From the lack of facilitation among command participants who had prior DMI experience, it appears that knowledge of the functionality of one package is not a sufficient condition for meaningful learning to occur. The 'appropriate assimilative context' needs to include strong similarities of surface features and operating procedures between applications. Apparently users need to recognize explicitly how the operations in a new application map to those in applications that they already know. In the case of the DMI and command-based systems these similarities were not strong enough. This supports the findings of Schumacher and Gentner [34] and Dixon and Gabrys [15] on the role of similar surface and operating features in subsequent learning.

The menu participants with prior DMI experience performed better on simple problem solving than their counterparts who used the menu system without DMI experience. There was a trend in that direction for complex problem solving as well. This suggests that participants possessed an assimilative context from the DMI experience that was sufficiently robust to aid learning of simple tasks on a relatively similar system. While the previous DMI experience was somewhat less helpful for complex tasks than for simple tasks, there was evidence of some facilitation on the complex tasks too, which may also be attributed to the assimilative context provided by the DMI.


It should be noted that the facilitation gained by the menu participants from previous DMI experience was relatively modest for both simple and complex problem solving. The amount of facilitation, if any, is represented by the higher scores of participants who learned a particular package in their second session compared with their counterparts who used the same package in their first session. The studies on expert transfer in software [7,28,39] are not exactly comparable because they measured the time to carry out tasks, not correctness. Nevertheless, it seems that the effect of prior knowledge of the earlier system was greater in those studies. It seems clear that the weaker effect in the present study is related to the level of expertise of the participants in the first software package. The participants in the Polson et al. [28] study were trained to a high criterion of correctness on the first software (error-free performance of ten edits for each function to be learned) before going on to the next software. Thus, they can be considered surrogates for experts on the first software. For them it may be assumed that the facilitation of learning observed in the subsequent software was the result of automatic processes, i.e., well-known productions being fired when the conditions to elicit them were present. Our interest is in users who go on to use subsequent functionally equivalent packages when they are not experts in the first package. For them some proceduralization may occur in the learning of an earlier package. However, the facilitation from prior experience with another package is unlikely to be explained entirely or even mostly in terms of automatic processes. Instead, it probably comes from effortful generalizations, as suggested by Salomon and Perkins [32] and Mayer [23]. This may explain the difference in the degree of facilitation observed between this study and the previous studies of experts.

The results related to training methods were inconclusive, as in Part 1. In subsequent learning, we hypothesized that participants who used exploration training would perform better, particularly in complex problem solving, than those who used instruction-based training. This prediction was based on assimilation theory, which says that working with new concepts facilitates their integration with existing knowledge structures. The fact that this did not occur suggests several alternative explanations. One is that the tutorial sections of the exploration training manual, along with the specific error recovery information, may have given participants a false sense of security in learning new word processing operations. After working through the examples and getting the desired results, participants may have felt that they understood the operations and found no need to explore most of them on their own. Thus, they may have failed to take advantage of opportunities to work with the new information and integrate it with their knowledge of the previous word processor. The preliminary results of the separate protocol analysis referred to earlier support this view in showing that many participants made little effort to practice further with, or explore, the software after working through the given examples. A second explanation for the nonsignificance of training is that the time restrictions of the experiment prevented participants from spending as much time as they needed to explore the system.
It is interesting to note that in the studies in which exploration training proved superior to instruction-based training, participants were given more time to work with the systems than was allowed in this study. In the Carroll et al. [11,12] studies, for example, participants received between one and several days of training. Future research needs to determine the nature of this time versus learning tradeoff for both exploration and instruction-based training approaches.


A third explanation for the results of our study concerns the backgrounds of the participants. Most of them were second or third year university students. Many of these individuals may have had little experience with unstructured learning environments and, thus, may have been more accustomed to situations in which learning was very structured, similar to many classroom environments. This may have lessened the impact of exploration learning and led to the lack of significance of training methods.

5. Conclusion

The results of this study lead to several conclusions. The first is that in initial learning computer novices can benefit from the use of direct manipulation interfaces rather than command-based interfaces. Perhaps more interesting, a DMI with its concrete model embedded in the interface leads to better performance than a menu-based system, suggesting the importance of the model and of direct action on objects. A conclusion that one may draw is that aiding the interaction must, to be most effective, go beyond memory aids in the form of menu lists and feedback on operations. It is significant that this argument, which was made initially by Hutchins et al. [21], was supported in a complex, real word processing environment.

Second, when end users learn a new package they can benefit at least moderately from prior experience with a functionally equivalent package, even though their skills with the prior software are not at the expert level. We assume that, as suggested by assimilation theory, the benefit of previous experience came from consciously working with the material and making links between the new and the prior software. However, the benefit was only seen in learning a package that exhibits some surface and operational similarities to the prior software. Thus, the consistency of interfaces, as well as the order in which they are learned, appears to be important in learning. Further study is warranted to disentangle the effects of order and consistency. Nevertheless, it seems reasonable to suggest that training which aids the learner in detecting similarities may facilitate learning. These results argue in favor of interface consistency within a training context.

Third, in short-term training situations there is little evidence to suggest that an open-ended, exploration-based approach leads to better learning. Future research needs to test the effects of providing exploration users with the opportunity to practice the skills they learn in the form of integrative exercises as opposed to more open-ended exploration [31]. Also, longer training sessions may lead to more favorable results. A longer training session might also modify the effect of the interface on learning, although it is an open question whether it would reduce or accentuate that effect. Further testing which manipulates the length of the training and the precise way in which exploration and instruction-based training are operationalized may also give researchers a clearer picture of the relationship of less structured training situations to the form of the computer interface. These and related studies will lead us to a more complete understanding of the process of initial and subsequent learning of computer software by end users.

References

[1] J.R. Anderson, Rules of the Mind, Erlbaum, Hillsdale, NJ, 1995.
[2] D.P. Ausubel, Educational Psychology: A Cognitive View, Holt, Rinehart and Winston, New York, 1968.


[3] D.P. Ausubel, The Psychology of Meaningful Verbal Learning, Grune and Stratton, New York, 1963.
[4] I. Benbasat, P. Todd, An experimental investigation of interface design alternatives: icon vs. text and direct manipulation vs. menus, International Journal of Man–Machine Studies 38 (1993) 369–402.
[5] R.S. Blake, Discovery versus expository instructional strategies and their implications for instruction of hearing-impaired post-secondary students, in: A.H. Areson, J.J. DeCaro (Eds.), Educational Teaching, Learning, and Development, Vol. 1, National Institute for the Deaf, Rochester, NY, 1984, pp. 269–316.
[6] C.L. Borgman, The user's mental model of an information retrieval system: an experiment on a prototype on-line catalog, International Journal of Man–Machine Studies 24 (1986) 47–64.
[7] S. Bovair, D.E. Kieras, P.G. Polson, The acquisition and performance of text-editing skill: a cognitive complexity analysis, Human–Computer Interaction 5 (1990) 1–48.
[8] J.S. Bruner, The act of discovery, Harvard Educational Review 31 (1961) 21–32.
[9] J.M. Carroll, Minimalist training, Datamation 30 (1984) 125–136.
[10] J.M. Carroll, R.L. Mack, Learning to use a word processor: by doing, by thinking, and by knowing, in: J.C. Thomas, M.L. Schneider (Eds.), Human Factors in Computer Systems, Ablex, Norwood, NJ, 1984, pp. 13–51.
[11] J.M. Carroll, R.L. Mack, C.L. Lewis, N.L. Grischkowski, S.R. Robertson, Exploring a word processor, Human–Computer Interaction 1 (1985) 283–307.
[12] J.M. Carroll, P.L. Smith-Kerker, J.R. Ford, S.A. Mazur-Rimetz, The minimal manual, Human–Computer Interaction 3 (1987) 123–153.
[13] R. Catrambone, J. Carroll, Learning a word processing system with training wheels and guided exploration, in: J.M. Carroll, P.P. Tanner (Eds.), CHI + GI 1987 Conference Proceedings: Human Factors in Computing Systems and Graphics Interface, ACM, New York, 1987, pp. 169–174.
[14] S.A. Davis, R.P. Bostrom, Training end users: an experimental investigation of the roles of the computer interface and training methods, MIS Quarterly 17 (1993) 61–85.
[15] P. Dixon, G. Gabrys, Learning to operate complex devices: effects of conceptual and operational similarity, Human Factors 33 (1991) 103–120.
[16] S. Douglas, T.P. Moran, Learning text editor semantics by analogy, in: A. Janda (Ed.), CHI'83 Conference Proceedings: Human Factors in Computing Systems, ACM, New York, 1983, pp. 207–211.
[17] S.T. Dumais, W.P. Jones, A comparison of symbolic and spatial filing, in: L. Borman, B. Curtis (Eds.), CHI'85 Proceedings: Human Factors in Computing Systems, ACM, New York, 1985, pp. 127–130.
[18] C. Egidio, J. Patterson, Picture and category labels as navigational aids for catalog browsing, in: E. Soloway, D. Frye, S.B. Sheppard (Eds.), CHI'88 Conference Proceedings: Human Factors in Computing Systems, ACM, New York, 1988, pp. 127–132.
[19] F. Halasz, T.P. Moran, Analogy considered harmful, in: Proceedings of the Human Factors in Computer Systems Conference, National Bureau of Standards, Gaithersburg, MD, 1982.
[20] S.W. Huck, W.H. Cormier, W.G. Bounds, Reading Statistics and Research, Harper and Row, Evanston, IL, 1974.
[21] E.L. Hutchins, J.D. Hollan, D.A. Norman, Direct manipulation interfaces, Human–Computer Interaction 1 (1985) 311–338.
[22] A.W. Lazonder, H. van der Meij, The minimal manual: is less really more?, International Journal of Man–Machine Studies 39 (1993) 729–752.
[23] R.E. Mayer, A psychology of how novices learn computer programming, Computing Surveys 13 (1981) 121–141.
[24] A. Michard, A graphical presentation of boolean expressions in a database query language: design notes and an ergonomic evaluation, Behaviour and Information Technology 1 (1982) 279–288.
[25] K. Morgan, R.L. Morris, S. Gibbs, When does a mouse become a rat? or ... comparing the performance and preferences in direct manipulation and command line environments, Computer Journal 34 (1991) 265–271.
[26] J.C. Nunnally, Psychometric Theory, McGraw-Hill, New York, 1967.
[27] L.A. Olfman, A comparison of applications-based and construct-based training methods for DSS generator software, Doctoral Dissertation, Indiana University, Bloomington, IN, 1987 (unpublished).
[28] P.G. Polson, S. Bovair, D. Kieras, Transfer between text editors, in: J.M. Carroll, P.P. Tanner (Eds.), CHI + GI 1987 Conference Proceedings: Human Factors in Computing Systems and Graphics Interface, ACM, New York, 1987, pp. 27–32.


[29] G. Rohr, Understanding visual symbols, in: IEEE Computer Society Workshop on Visual Languages, IEEE, New York, 1984.
[30] G. Rohr, E. Keppel, Iconic interfaces: where to use and how to construct?, in: H.W. Hendrick, O. Brown (Eds.), Human Factors in Organizational Design and Management, Elsevier, New York, 1984, pp. 269–275.
[31] M.B. Rosson, J.M. Carroll, R.K.E. Bellamy, Smalltalk scaffolding: a case study of minimalist instruction, in: J.C. Chew, J. Whiteside (Eds.), Human Factors in Computing Systems Empowering People: CHI'90 Conference Proceedings, ACM, New York, 1990, pp. 423–429.
[32] G. Salomon, D.N. Perkins, Transfer of cognitive skills from programming: when and how?, Journal of Educational Computing Research 3 (1987) 149–169.
[33] R. Santhanam, S. Wiedenbeck, Neither novice nor expert: the discretionary user of software, International Journal of Man–Machine Studies 38 (1993) 201–229.
[34] R.M. Schumacher, D. Gentner, Transfer of training as analogical mapping, IEEE Transactions on Systems, Man, and Cybernetics 18 (1988) 592–600.
[35] M.M. Sebrechts, R.L. Marsh, Components of computer skill acquisition: some reservations about mental models and discovery learning, in: G. Salvendy, J.M. Smith (Eds.), Designing and Using Human–Computer Interfaces and Knowledge-Based Systems, Elsevier, Amsterdam, pp. 168–173.
[36] M.K. Sein, R.P. Bostrom, Individual differences and conceptual models in training novice users, Human–Computer Interaction 4 (1989) 197–229.
[37] B. Shneiderman, Designing the User Interface: Strategies for Effective Human–Computer Interaction, 2nd edn., Addison-Wesley, Reading, MA, 1992.
[38] B. Shneiderman, The future of interactive systems and the emergence of direct manipulation, Behaviour and Information Technology 1 (1982) 237–256.
[39] M.K. Singley, J.R. Anderson, The Transfer of Cognitive Skill, Harvard University Press, Cambridge, MA, 1989.
[40] N.A. Streitz, A.C. Spijkers, L.L. van Duren, From novice to expert user: a transfer of learning experiment on different interaction modes, in: H.-J. Bullinger, B. Shackel (Eds.), Human–Computer Interaction—INTERACT'87, Elsevier, Amsterdam, 1987, pp. 841–846.
[41] D. Te'eni, Direct manipulation as a source of cognitive feedback: a human–computer experiment with a judgement task, International Journal of Man–Machine Studies 33 (1990) 453–466.
[42] J.C. Thomas, W.A. Kellogg, Minimizing ecological gaps in interface design, IEEE Software 22 (1988) 78–86.
[43] E. Ulich, M. Rauterberg, T. Moll, T. Greutmann, O. Strohm, Task orientation and user-oriented dialogue designs, International Journal of Human–Computer Interaction 3 (1991) 117–144.
[44] J. Whiteside, S. Jones, P.S. Levy, D. Wixon, User performance with command, menu, and iconic interfaces, in: L. Borman, B. Curtis (Eds.), CHI'85 Proceedings: Human Factors in Computing Systems, ACM, New York, 1985, pp. 185–191.
[45] S. Wiedenbeck, P.L. Zila, Hands-on practice in learning to use software: a comparison of exercise, exploration, and combined formats, ACM Transactions on Computer–Human Interaction 4 (1997) 169–196.
[46] H.S. Woodgate, The use of graphical symbolic commands (icons) in application programs, in: Proceedings of the European Graphics Conference and Exhibition, Elsevier, Amsterdam, 1985, pp. 25–36.