Int. J. Man-Machine Studies (1989) 31, 269-292

An attempt to incorporate expertise about users into an intelligent interface for Unix

JENNIFER JERRAMS-SMITH

Artificial Intelligence Group, Philips Research Laboratories, Cross Oak Lane, Redhill, Surrey, RH1 5HA, UK

(Received 5 January 1987 and in revised form June 1988)

Knowledge acquisition about novice users of Unix was undertaken in order to provide an improved interface to an existing interactive system. This included a behavioural study involving observations of a large group of novice users of Unix, and analysis of logs of their keyboard input and of verbal protocols. The requisite expertise for the analysis was provided by tutors who were expert users of Unix. Their expertise involved two domains, the first of which was Unix itself. The second type of expertise, gained from their experience of tutoring, involved the tutors' understanding of the meaning of the behaviour of novices who were learning how to use Unix. The tutors' expertise was used to define categories into which user errors could be classified. The term error has here an extended meaning: it covers commands which, though valid, make inefficient use of the system or do not produce the desired result. The tutors provided the knowledge which allowed the detection of the category to which each error belonged. This information was encapsulated within the interface in the form of production rules within the knowledge base of a knowledge-based system (KBS). It is these production rules which provide the user modelling component of the interface. The resultant interface, a Smart User-System Interface (SUSI), provides a first attempt at a supportive environment in which novices can easily learn to use the Unix operating system efficiently. It offers advice when it detects an error but is otherwise transparent. The results of testing are reported and these indicate the usefulness of the SUSI interface. Further work is now in progress to use the ideas it embodies in other interactive systems.

0020-7373/89/030269 + 24 $03.00/0 © 1989 Academic Press Limited

1. Introduction

People often find it difficult to use the interactive computer facilities which are intended to help them to complete their tasks quickly and effectively (Hayes, Ball & Reddy, 1981; Schneider & Thomas, 1983). They are often reluctant to make use of all the facilities provided (Nickerson, 1981) and those that are used are often used inefficiently (Jerrams-Smith, 1986). This suggests that a lack of knowledge of the system on the part of the user may waste machine resources and, more importantly, may waste users' own time and effort. There is thus good reason to attempt to build improved user-system interfaces which can provide the required knowledge. There are numerous systems already in existence which would be improved by a more helpful user-system interface. With this in mind, the Smart User-System Interface (SUSI) was developed in an attempt to provide an example of a helpful interface to an existing interactive system. The Unix operating system was chosen


because it is well known that novices find it difficult to use. Additionally, a large user group was available for study and Unix experts were easily available. One of the aims of the development of SUSI was to find a method of providing a supportive interface which ensures that novices can learn about Unix by exploring its capabilities: they can try out ideas without the fear of "getting lost" or unintentionally causing damage. Such exploration-based learning is believed to be educationally preferable to the more usual instruction-based learning (Carroll & Mack, 1984; Kamouri, Kamouri & Smith, 1986), and has been provided for an informal learning activity by Burton & Brown (1979). The "smart" interface should be able to recognise and attempt to rectify the user's initial misconceptions about Unix as part of the learning process, otherwise such misconceptions can lead to further misunderstandings. Janosky, Smith & Hildreth (1986), in an investigation of first-time users of an online library catalogue system, found there could be a snowballing effect as the user tried to find and interpret additional information in an attempt to recover from an error. Secondly, the interface must also be able to protect the novice user by being capable of recognising when the user is giving a command which could cause damage. In most cases the damage would be to the user's own files, and so such a command should not be passed immediately to Unix. In order to achieve these aims, the SUSI interface provides active help (Fischer, Lemke & Schwab, 1984) by detecting when help is needed and then providing it, rather than waiting for the user to request it. This is particularly important because users do not always request help when they need it.
For instance, they may not know that they are using the system inefficiently, and even when they do know, they cannot always access help successfully because they cannot describe the problem or they do not know the words which refer to it in this system. Such an interface should enable novices to complete more of their tasks successfully. It should also reduce the number of repeated errors that users make and reduce the number of errors caused by the same underlying misconception or lack of knowledge. The method used for achieving these objectives in the development of SUSI was to consider the types of errors users typically make. The meaning of the term "error" is here extended to include not only errors which are recognised as such by the system and that result in a system error message, but additionally to include commands which, although valid, either make inefficient use of the system or do not produce the result intended by the user. Knowledge about the possible reasons for each type of error was embodied in the interface. The knowledge represented within the SUSI interface enables it to recognise the underlying cause of an error which appears in an individual's input so that suitable tutorial help can be provided. This involves modelling the user, in terms of misconceptions the user may have about the system, and relates to other work in the field of Intelligent Tutorial Systems including that of Self (1974), Clancey (1986), Brown & Burton (1978), Sleeman & Smith (1981), and Johnson & Soloway (1984). Although the term "user model" is used in a variety of ways, as described in Johnson & Johnson (1987), the user model is here taken to be the interface's model of the user's knowledge about Unix; the user modeller is the component of the

DEVELOPMENT

OF AN INTELLIGENT

INTERFACE

271

interface which builds and alters the user model. The user's mental model is taken to be the representation that the user has of Unix; it equates to a version of the user conceptual model (Young, 1981). Knowledge about typical user errors and the underlying misconceptions which cause them is encapsulated in production rules which comprise the user modeller component. This component enables the interface to make hypotheses about the user's mental model of Unix. That is, it enables a user model to be constructed and enables suitable remedial help to be provided. The tutorial information output by the interface provides missing knowledge and corrects misconceptions so that the user can be helped to develop a sufficiently good mental model of Unix to enable her/him to use it effectively. SUSI is a "transparent" interface, situated between the user and the system, which enables users to feel that they are interacting directly with Unix. In fact, they are interacting with Unix via the SUSI interface, which detects when help and guidance are needed and provides immediate instruction so that errors do not become ingrained. The SUSI interface has been reported previously in outline descriptions of an early version (Smith, 1985) and of the current version (Jerrams-Smith, 1987). The following sections describe the process of knowledge acquisition, and the implementation and evaluation of the resultant SUSI interface.

2. Knowledge acquisition

2.1. THE BEHAVIOURAL STUDY

The first stage of the design process consisted of the collection of data which could be used to elicit knowledge about novice Unix users from people who had expertise about such users. The data were provided during observations of the behaviour of 55 novice users of Unix. They were all undergraduates at Birmingham University who had just completed the first year of a course in software engineering and were thus familiar with other operating systems, although new to Unix. The observations were carried out over four weeks and consisted of logging all input to the shell from the subjects and collecting verbal protocols from them while they were learning to use Unix. The procedure was as follows. The undergraduates were given a short lecture on the capabilities of Unix. They were then asked to get to know Unix in order to write, format and print a short essay and also to write a short program in the pal3 assembly language. There were no constraints as to how they should use the system. They were provided with documentation and a demonstrator was available for a few hours each week to answer their questions. However, many of the students could not access the machines at the times when the demonstrator was available and so they were dependent to a large extent on their own problem solving abilities. The user log for each subject consisted of the names of files in the current directory, the date and time of each command plus the command itself (only shell commands were studied). Verbal protocols were obtained by asking subjects to speak into tape recorders as they worked. They were asked to describe what they were attempting to do, how they intended to achieve their objective and to give


their comments on the response of the Unix shell to the commands they gave. However, the provision of such protocols was a difficult task for the subjects because they were attempting to complete two different tasks at the same time (working out what to do and describing it verbally).

2.2. USING THE LOGS AND PROTOCOLS IN ORDER TO ELICIT EXPERTISE

2.2.1. Introduction

The log for each user was subjected to a detailed error analysis, and this provided the basis for elicitation of knowledge about the interaction of novices with Unix. This analysis was carried out by an expert user of Unix who also had experience in teaching novices how to use Unix. This expert studied the logs in order to discover the categories of errors; all errors were then classified into the existing categories. From the errors visible in the logs the expert had to decide why the user made each error and thus how to classify it. In cases where the expert had difficulty in classifying an error, other similar experts were consulted and a consensus judgement obtained. The process of classifying errors in the logs into their respective categories provided the knowledge about the type of input that was typical of each of the error categories, and this knowledge was later encapsulated in production rules. These production rules could then be used by the interface to interpret the behaviour of a specific user by determining to which category an error belonged and hence to decide what tutorial guidance to provide.

2.2.2. Error categories

The following error categories were obtained from the first series of observations, and are the categories referred to in Tables 1 and 3, and in Figs 3 and 4. They are numbered in the order of their occurrence. It is probable that this categorisation is still incomplete, since it was dependent upon the content of the protocols and the judgement of a single expert in most cases. For this first attempt at providing a user modeller, the major emphasis was not upon what the errors were or how they were related to each other, but rather upon the knowledge that the expert used in order to assign an error to a particular category. Thus, no attempt was made to do other than a "coarse-grained" detection of errors, nor has the classification scheme been structured in any way.
This classification scheme provided the basis for the elicitation of knowledge about the underlying cause of the users' errors. The examples quoted in italic type are taken from the user logs:

(1) Unable to give a command correctly. For instance, because of the inclusion of metasymbols as in (cat f1) or because of the omission of a delimiting space between the argument and command as in cat/manual/summary-passwd.

(2) Failed try for help. These commands usually contained help, h or ? as in help erase, cat heip and h delete.

(3) Giving an inappropriate command for the part of Unix currently in use. For instance, attempting to list a file which is in some other directory, or giving editor commands such as w or q to the shell.

(4) Giving an incorrect command suggested by previous learning and knowledge


of other computer systems, as in the use of pip (which is available under CP/M), logo and ty (which are available under Tops20 on a Dec20). The attempted use of previous knowledge is also shown in some errors which have been assigned to category 2 (trying to find help).

(5) Mistyped commands (these account for a large proportion of all errors). For example, miul (mail), maual (manual) and dummury (summary).

(6) Giving a command whose action has been prevented by the system or system administrator. For example, the use of write and games on this particular system.

(7) Misunderstanding. The user appears to have an incorrect mental model of the system. (This category was later incorporated into the misconceptions category, see below.)

(8) Inefficient use of the system, although the given commands are correct. For instance, the use of rm f1 to remove a file followed on the next line by rm f2, which could have been effected in a single command.

(9) Misread documentation: errors may have been caused by unclear printed symbols. For example, the use of ir where the obvious intention is ls.

(10) An unknown command for which the intention cannot be discovered. This may have been caused in some cases by characters which the user had attempted to delete. Some examples are helogout, djrzf, yoo.

(11) User has a misconception: an incorrect mental model of the system. For example, nroff file1|file2 where file2 is not executable and the user appears to be confused between pipes, which pass the output of a process on to a further process, and redirection of output to another file. The probable intention was nroff file1 >file2. (Some other categories, such as 3, 4 and 13, might be considered as subcategories of this one.)

(12) Obscenities: these seldom occurred but seemed to indicate the user's frustration at inability to control the system.

(13) User guesses and tries anything, however unlikely, that might give the required result.
For instance, the user might try logoff, go away, finish or exit in an attempt to logout. This differs from category 4 in that this category is not prompted by previous learning of a specific, identifiable computer system.

(14) Loss of attention: it is clear from previous actions in the log that the user knows the required action, but does something else, probably because of an attention lapse. Example: the use of nforr followed on the next line by nroff. This differs from a simple spelling or typing error, but it is almost impossible to predict what such an attention lapse will look like.

(15) Forgot own filename and used a non-existent one. For instance, cat essay when the file in the current directory was essay.txt.

(16) An easier option is available (this was later incorporated into category 8). For instance, using the ed line editor when the easier editor em is also available.

(17) Known to be wrong: the user appears to notice that part of the command line is wrong and sends it to the shell without completing it. Possibly the user considers this is easier than deleting the line.

(18) The user has forgotten how to effect an action, although the log shows it being successfully completed previously.

(19) Documentation errors: two of these were apparent when users attempted to use @ and # which do not delete in the version of Unix which was in use.


The above categories were later modified when further observations and interpretations indicated that they were not completely accurate, since some of the categories were discovered to be so similar that they could be collapsed together. Thus category 7 (misunderstanding) was incorporated into category 11 (misconception) and category 16 (easier option) was incorporated into category 8 (inefficient usage).

2.2.3. Design features derived from the study

The behavioural study indicated the necessity of the following features of the SUSI interface. Associated with each feature there are examples of the problems which became obvious from the empirical studies and which the SUSI interface attempts to alleviate.

(1) The interface should be able to detect the majority of the most prevalent errors. Table 1 indicates the relative importance, in terms of prevalence, of the different types of errors which the interface should be able to detect. However, it is unlikely that the interface would be able to detect all errors, and the appearance of new error categories during the second two weeks indicates that more might have appeared if the study had continued for a longer time.

(2) The interface should cope with different sorts of errors at different times. Table 1 indicates that different behaviours were shown during the first two-week session when compared with the second two-week session. Table 1 also indicates that a larger total of errors was found during the second two weeks. However, the error rate was not noticeably greater.

(3) The interface should differentiate between trivial errors, such as typing errors for which help is not required, and the more serious ones caused by lack of

TABLE 1
Errors and percentages for 1st and 2nd 2-weeks, and totals for both

Error  Category                      1st   % 1st   2nd   % 2nd   Total
1      Unable to give command         40    29      14     3       54
2      Failed try for help            17    12      39     9       56
3      Wrong mode or directory         7     5      18     4       25
4      Using previous knowledge       21    15      32     7       53
5      Mistype                        20    14      77    18       97
6      Prevented                       7     5      10     2       17
7      Misunderstanding                5     3       5     1       10
8      Inefficient use                 6     4      55    13       61
9      Misread documents               5     3      11     2       16
10     Unknown                         0     0      57    13       57
11     Misconception                   0     0      45    10       45
12     Obscenities                     0     0       2     0.5      2
13     Guesses                         0     0       2     0.5      2
14     Loss of attention               0     0      26     6       26
15     Forgot own file name            0     0       3     0.5      3
16     Easier option available         0     0       8     2        8
17     Known to be wrong               0     0       9     2        9
18     Forgotten                       0     0       3     0.5      3
19     Documentation errors            7     5       0     0        7


understanding, for which help should be provided. For instance, Table 1 indicates that in the second two weeks the major error category was for mistyping, for which the interface does not provide advice.

(4) Help should be provided only when it is really needed. Whereas some users made very few errors and might be alienated by offers of unwanted help, others made a large number of errors and were apparently in a state of extreme confusion, as illustrated by the transcripts in Figs 1 and 2. The system responses were not meaningful to these users, as is shown at 13.50 in Fig. 2.

(5) The interface should detect and remedy the misconceptions which are common to this type of user. Guidance should be offered as quickly as possible, since the same underlying error can cause a sequence of incorrect commands which wastes the user's time and effort. Subject C (Fig. 3) attempts to set a password and later attempts to find help on nroff. For both of these operations there is a combination of misconceptions and syntactic errors. For example, subject C gives eight commands in order to set the password, of which five are failed attempts to receive information from the on-line help.

(6) Similarly, the interface should attempt to detect the user's underlying goal and explain how to achieve it. For Subject B (Fig. 4) there is a long session of failed attempts to find information and set the password.

(7) The interface should detect some occasions when the system is being used inefficiently, even though valid commands are being given, and provide an explanation of how to use it more efficiently. Figure 3 shows a typical sequence of

June 14th, 1983. 15.31
Just dabbling to see if I can do anything.
It would appear that @ doesn't work and removing underlines.
Don't know why that is.
Just put in my password.
Think I'll leave it and come back after I've looked at the literature.

June 15th, 1983. 13.23
Objective today to try and understand what this thing does.
Since yesterday I failed to do anything.
Actually understand some of the text handling... files and things.
Couldn't manage to write to terminal G.
Don't know why that was.
Well, so far we have achieved nothing again today.
So I'm leaving off for today.

June 16th, 1983. 13.32
Well, I've now got half an hour on this machine.
We'll call it junk.txt (using the ned screen editor)
We appear to have - what about ^F?
If we can find the right spot in our Unix papers we can actually do something.
Totally stuck on the literature.
Well, we can honestly say it's hopeless finding anything in this manual.
[Consults friends. One explains how to use the em line editor]
Well, I think I'll start again.
End of session three (note of despair or disgust in his voice).

FIG. 1. Transcript for one subject's first three sessions.


Subject A, June 16th, 1983

13.48  Login
       Typed my name and got through
       Now want to set my password
       Going to call help for the password
13.49  help passwd
       All instructions are coming up
       Doesn't actually tell me how to put a password in
       Now going to use cat
13.50  cat/manual/summary/passwd
       Reply was that it was not found
       Now not sure what to do
       Going to read the notes
       Couldn't find information on how to set password so asked another person
13.56  passwd
       (Set password correctly)
13.56  logout
13.57  Login
       Tried logout and login, password OK
13.57  logout
       More information should have been given on how to use the command passwd.

FIG. 2. User log and associated transcript for a first session.

such inefficient usage where the subject moves in and out of the editor, because he is unaware that the shell escape character, !, allows shell commands such as nroff to be given from within the editor. (Satisficing often occurs. That is, the user is unwilling to spend an unknown amount of additional effort in order to find an easier method. However, users may accept an easier and more efficient method which is explained to them and where no additional effort is involved in discovering it.)

(8) The interface should prevent the repetition of mistakes. For example, in Fig. 4, the same error is made five times in cat/manual..., where the subject does not realise that the missing space is the cause of the problem. Additionally, in four of the incorrect versions of this command the metasymbol < is wrongly present.

(9) The interface should act as a supportive assistant. During the first two weeks some enthusiastic students quickly found out how the system worked, but others waited until this had been done so that there would be a friend to consult about any problems. Students that started later showed fewer of the typical errors of novices; the interface should attempt to provide a similar environment.

(10) Classification was a relatively difficult task as some errors could have belonged to more than one category, and it was not always obvious from the log precisely why the error had been made. Additional insight into this problem was

Command                          Error     Type
passwd
(catha-4                         1         unable...
cat/passwd                       1+11      ditto + misconception
(cat / passwd)                   1+11      ditto
(cat)                            1+11      ditto
help passwd                      1         unable...
cat/passwd                       1         unable...
cat /manual/summary/passwd       1+11      ditto + misconception
passwd .                                   (Password set correctly)
^D
em
em exercise 1.text
em ex0ne.t                       5 or 11   see next
Iogout                           5+11      mistype + misconcept.
cat / manual/summary/nroffint    11        misconception
cat / nroffint                   19        documentation error
nroffint
treat                            5         mistype
em
em text1
nroff-mtha text1
nroff -mtha text1                8         inefficient (see above)
em text1
nroff -mtha text1                8         ditto
em text1
nroff -mtha text1                8         ditto
em text1
nroff -mtha text1                8         ditto
em text1
nroff -mtha text1                8         ditto
em text1
nroff -mtha text1
nroff -mtha text1(lf
games                            6         prevented

FIG. 3. Log of shell commands and error classification (for subject C).

gained from a study of user protocols, but this additional help will not be available to the interface and thus its deductions will result in more than one hypothesis. If the SUSI interface is unable to choose between two or more hypotheses then it should offer advice for each and allow the user to choose which advice to use.

2.3. THE USER MODELLER

The process of the classification of user errors produced information on those aspects of user behaviour which allowed the expert to recognise to which category an error belonged. This information provides the user modelling component which is encapsulated within the production rules. The behaviour of specific users can be compared with it and their errors correctly classified so that they then receive the most suitable tutorial information. Frequency of appearance of error types has been assumed to be an indication of their importance, although it could be the case that some important error types

Command                          Error     Type
elp                              5         mistype
help passwd                      1         unable...
robins                           10        unknown
help passwd                      1         unable...
passwd                           5         mistype
logot                            1         unable...
logout
passwd                           1         unable...
help passwd                      1         unable...
(cat/manual/summary/passwd)      10        unknown
passwd
login                            7         misunderstanding
login                            7         misunderstanding
passpasswd                       14        loss of attention
passwd
help                             7+2       misunderstanding + failed help
help passwd
passwd (v)                       7         misunderstanding
passwd (u)                       7         misunderstanding
help passwd                      1         unable...
passwd
aaaaaa                           10        unknown
aaaaaaa                          10        unknown
hoqrr                            10        unknown
uoou                             10        unknown
(cat/manual/summary/passwd)      1         unable...
login                            7         misunderstanding
cat/manual/summary/passwd        1         unable...
help passwd
(cat/manual/summary/passwd       1         unable...
why not help                     13        guess
goto help passwd                 2+13      failed help + guess
help passwd
help login
/etc/passwd                      13+11     guess + misconception
(cat/etc/passwd                  1         unable...

FIG. 4. Log of shell commands and error classification (for subject B).

occur infrequently. The categories detected by the interface include all those which appear frequently (for which 20 or more examples appear in the logs), except for category 10 (unknown) and category 14 (loss of attention). For these two categories there is no way of specifying in advance what the error will be like and therefore no way of recognising it. It is important to note that what appears to be the same error may have different causes depending on the particular user and the context in which the error occurs. For example, for the same invalid command, the rules are able to differentiate between an expert user's mistype and an error which indicates that a novice has an incorrect understanding of the system (misconception).

2.3.1. Expertise in the user modeller: examples of rules

This section gives examples of rules, presented in an extended form, provided in the user modeller. These examples give an indication of the knowledge represented


within the rules which allows the correct classification of an individual user's error, and the detection of misconceptions or of inefficient behaviour. This leads to one or more hypotheses about the user's state of knowledge. Throughout these examples, a mistype of a command or argument is defined as a token which is similar to one which would be valid and could have been caused by one of the four most common typing errors. These are relatively easy to detect and comprise addition of one character, omission of one character, inversion of two adjacent characters, and substitution of one character for another.

The first group of rules contains some of those used when a pipe is present in the input line. The knowledge encapsulated here enables the interface to decide whether there is an error of category 11 (a misconception) and thus whether the user requires some tutorial help. Note that the first rule results in the conclusion that the user knows how to use a pipe; this is the sort of information that would be used later by other rules.

Example 1. This rule detects that the user has used the pipe correctly.

Rule 29
IF    Valid command follows a pipe
THEN  The hypothesis is that the user knows how to use pipes

Example 2. This rule detects that the user may not understand how to use pipes.

Rule 34
IF    Invalid command follows pipe
THEN  The hypothesis is that the user may not know how to use pipes

Example 3. This rule detects that the user may know how to use a pipe even though the command which follows is invalid.

Rule 35
IF    Invalid command follows a pipe
      User is an expert
      Command could have been mistyped
THEN  The hypothesis is that the expert has not made an error about the use of pipes but has mistyped the command which follows

The next group is part of a set of rules which is called when the redirection-of-output symbol, >, is present in an input line.
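The four single-character slips defined above (addition, omission, adjacent inversion, substitution) are straightforward to test mechanically. The paper does not give SUSI's implementation, so the following Python sketch is illustrative only; the function name and the quoted examples of valid commands are assumptions:

```python
def is_mistype(token: str, valid: str) -> bool:
    """True if `token` could arise from `valid` by one of the four common
    typing errors: addition of one character, omission of one character,
    inversion of two adjacent characters, or substitution of one character."""
    if token == valid:
        return False
    # Addition: token has one extra character somewhere.
    if len(token) == len(valid) + 1:
        return any(token[:i] + token[i + 1:] == valid for i in range(len(token)))
    # Omission: token is missing one character of the valid form.
    if len(token) == len(valid) - 1:
        return any(valid[:i] + valid[i + 1:] == token for i in range(len(valid)))
    if len(token) == len(valid):
        diffs = [i for i in range(len(token)) if token[i] != valid[i]]
        # Substitution: exactly one position differs.
        if len(diffs) == 1:
            return True
        # Inversion: two adjacent positions differ and are swapped.
        if (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and token[diffs[0]] == valid[diffs[1]]
                and token[diffs[1]] == valid[diffs[0]]):
            return True
    return False

print(is_mistype("maual", "manual"))  # omission of 'n' -> True
print(is_mistype("mial", "mail"))     # adjacent inversion -> True
```

A predicate of this kind is what a condition such as "Command could have been mistyped" (Rule 35) would call on, checked against each valid command or existing filename in turn.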
These rules allow the interface to differentiate between an error of category 5 (a mistype) and an error of category 11 (a misconception).

Example 4. This rule detects that the user may have mistyped the file being used for output.


Rule 63
IF    > is present
      > is not part of >>
      > is followed by a mistype of an existing filename
THEN  The hypothesis is that the user may understand how to use >, but may have mistyped the following filename

Example 5. This rule detects that a more serious error, caused by a misconception, may exist rather than the type shown in example 4.

Rule 64
IF    > is present
      > is not followed by the name of a current file
      > is not followed by a mistype of a current file
      > is followed by a Unix command
THEN  The hypothesis is that the user may be confusing > and pipe

The following rules are from a set which detects that an error of category 8 may be present and that the user is behaving inefficiently, although the given commands are correct. However, it should be noted that apparently inefficient behaviour could in some circumstances be intentional because of some desired side effect.

Example 6. This rule detects that a user may not yet know that it is possible to use repeated files with some commands, such as cat and rm.

Rule 70
IF    The current line has the same command as the previous line
      The arguments on the two lines are equivalent
      One of the arguments is of the repeated-file type
      It is this argument which provides the only difference between the two lines
THEN  The hypothesis is that the user was unaware that both filenames could be supplied on the same line

Example 7. This rule detects that the user may not know that it is possible to call a command from within another (for instance, calling nroff from within an editor).

Rule 71
IF

      The command is one from within which others can be called
      The same command was used two lines previously
      The command was not used in the previous line
      The current and the two preceding lines all have the same filename as argument
THEN  The hypothesis is that the user is alternating between two commands when instead one could be called from the other

Example 8. This rule also indicates that an error of category 8 may be present.

DEVELOPMENT OF AN INTELLIGENT INTERFACE

It states that if the user has given a command line such as cat text.dat essay.dat memo.dat then the user may need information that this could be expressed more efficiently by using cat *.dat, as long as no other file in the current directory also ends in the same way.

Rule 75
IF   The command would take repeated files as arguments
     The same substring is present in all instances of a repeated file
     That substring is absent from files which are not in the command but are present in the current directory
THEN The hypothesis is that the user does not know about the use of wildcard.

The following rules are used to detect errors in category 2, when the user is trying to find help. The first detects that help is required on a specific command, the second that more general help information should be provided.

Example 9.

Rule 30
IF   Command = help or man
     Given arguments do not match the command
     One of the given tokens is a possible substitute for a command
THEN The hypothesis is that the user is trying to guess the name of the command for which information is required.

Example 10.

Rule 36
IF   Command is invalid
     The string help appears somewhere in the line
THEN The hypothesis is that the user needs some general information on how to find help.
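The substring test behind Rule 75 can be sketched in Python (the interface itself was written in Franz Lisp; the function name and the longest-common-suffix heuristic are illustrative assumptions, not the paper's implementation):

```python
# Sketch of the check behind Rule 75: if every file argument shares a
# common ending that no other file in the current directory has,
# hypothesise that the user does not know about wildcards.
import os

def wildcard_opportunity(args, directory="."):
    """Return a shared suffix usable as *suffix, or None."""
    if len(args) < 2:
        return None
    # Longest common suffix of the given file arguments.
    suffix = args[0]
    for name in args[1:]:
        while suffix and not name.endswith(suffix):
            suffix = suffix[1:]
    if not suffix:
        return None
    # The suffix must be absent from the directory's other files.
    others = set(os.listdir(directory)) - set(args)
    if any(f.endswith(suffix) for f in others):
        return None
    return suffix
```

For a directory holding text.dat, essay.dat, memo.dat and notes.txt, the call `wildcard_opportunity(["text.dat", "essay.dat", "memo.dat"])` would support the hypothesis that `*.dat` could have been used.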

The following rule attempts to detect what the user's real intention is, rather than what it might at first appear to be.

Example 11. This rule detects that the novice might have made a spelling or typing error when typing the command, although the given command appears to be valid. It is derived not from the error categories but from the finding that novices generally use a relatively small subset of all possible commands, this subset of commands being known to the interface. This rule would not apply to an expert user.

Rule 26
IF   The user is known to be a novice
     The given valid command is not present in the list of the 40 most frequently used commands


     The given command could be a possible mistype of some other command
     The given arguments match that other command
     The other command is one which is frequently used
THEN The hypothesis is that the valid command is a mistype of the command which the arguments actually match.
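The spirit of Rule 26 can be sketched as follows. This is a hedged illustration in Python rather than the interface's Lisp rules, and the small command table, its argument counts and the similarity cutoff are invented for the example:

```python
# Sketch of Rule 26: a valid but rarely used command whose spelling is
# close to a frequent command that matches the given arguments is
# hypothesised to be a mistype of that frequent command.
from difflib import get_close_matches

# Illustrative stand-in for the interface's 40-command frequency list:
# command -> expected number of arguments.
FREQUENT = {"ls": 0, "cat": 1, "rm": 1, "cp": 2, "mv": 2}

def mistype_hypothesis(cmd, args):
    if cmd in FREQUENT:           # a frequent command is taken at face value
        return None
    candidates = get_close_matches(cmd, FREQUENT, n=1, cutoff=0.6)
    if candidates and FREQUENT[candidates[0]] == len(args):
        return candidates[0]      # hypothesis: the user meant this command
    return None
```

For instance, `mistype_hypothesis("cst", ["file1"])` hypothesises `cat`, because "cst" is close to a frequent one-argument command.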

3. The SUSI interface

3.1. OVERVIEW

The SUSI interface is written in Franz Lisp and runs on a Vax 530 (development started in 1982 and was completed in 1985, and at that time no better machine was available for this project). When the user types a Unix command this is accepted and interpreted by the interface before being passed to Unix. The command line is subjected to spelling checking and the production rules are used in constructing a user model which determines whether tutorial help should be provided.

The current situation is that unless the command is likely to cause harm (such as accidental removal of files) it is passed directly to the system and the system response is returned before any tutorial information appears. Thus users learn the meaning of the system error messages and can eventually use Unix in other situations where the interface may be absent. However, the potentially harmful effect of hostile or confusing error messages is minimised because of the additional tutorial information. Inappropriate learned behaviour is also minimised by prompt detection of errors and subsequent tutorial output.

The interface consists of modular components which enabled it to be altered easily during testing and development. These are: the user modeller (the production rules of the Knowledge Based System), the user model, the interpretation module, the domain model and the tutorial module.

3.2. THE KNOWLEDGE BASED SYSTEM (KBS)

The

KBS is based on that suggested by Winston & Horn (1981). Its knowledge base contains the knowledge of the expert in the form of causal explanations of users' errors. It is the major source of information about the general user of Unix. The knowledge base consists of modules of knowledge about Unix users in the form of production rule sets grouped according to function and accessed when required for solving particular parts of a problem. There are approximately 70 rules altogether.

For the SUSI interface, deduction is carried out by forward chaining only. Forward chaining has been shown to be useful for modelling some cognitive processes and for solving problems which require very broad but shallow knowledge. The backward chaining paradigm, where a set of hypotheses is worked through until one is found to be true, is less useful for this particular application since the database is added to before the first call to the inference engine.

The KBS is able to explain its activities to the user who can then verify the reasoning strategy for a solution: the user might ask how a fact was derived. This function was available during the pre-testing stage of development, with output


being in pseudo-natural language; coding in LISP allows chosen parts of the rule to be directly output to the screen.

3.3. THE USER MODEL: STATIC COMPONENT

A general (static) user model contains published information on the use of Unix, particularly command usage frequency information (Hanson, Kraut & Farber, 1984). It also contains information about likely previous knowledge of other operating systems, since it is well known that people will attempt to apply such knowledge, and that it can often be misleading.

3.4. THE USER MODEL: DYNAMIC COMPONENT

The user model includes what is known about the current user's mental model of the system, the user's behaviour patterns, the user's expertise level and the trace of the user's behaviour for the current session. The record of previous activity is held on property lists: attributes for each input line include original command, deduced command, intended command, original arguments, deduced arguments, intended arguments. Sequences which may indicate the occurrence of inefficient use are maintained. The frequency of usage of commands and files is also recorded. In future, as part of the planning operation, the dynamic user model should also maintain possible sequences which might indicate a command which is likely to be used next, or a file which is currently being operated upon (example: edit file1; cat file1; lf file1).

3.5. THE INTERPRETATION MODULE

This receives the user's command and divides it into component items, such as command and filenames. Each item is subjected to intelligent spelling checking (Jerrams-Smith, 1986). A number of different possible interpretations may be passed to the production rules of the KBS.

3.6. THE DOMAIN MODEL

The domain model consists of declarative knowledge such as that held about Unix commands. The interface knows all valid commands and is able to check the options and other arguments of a set of the most frequently used commands; a simple network holds information for about 40 of these. This information includes details of each command and the relevant arguments for the command, such as whether the argument is repeated, what type it should be (for instance, username, file, string) and whether it is optional. Information is also provided on how many alternative forms exist, the usage frequency of each command and the substitutes likely to be used for each command. The substitutes provided are those which the user might guess when attempting to find a command. For example, substitutes for logout or ^D might be (logo logoff stop finish exit). In the current system such substitutes are not acceptable as replacements for the correct command, but they provide a signal that tutorial information is required. In future it would also be useful for the interface to be able to recognise aliases, which are likely to be used by


the more expert user. Each possible command for the particular input line is taken to be the starting node for traversal through the network until one of the stopping conditions is reached (for instance, no tokens left in the input line).

3.7. THE TUTORIAL MODULE

When an error is detected, this module decides what information would most aid the user, based on the hypotheses produced by the KBS. An interactive dialogue is carried on with the user and the user's response to the tutorial messages provides additional information for the user model. Suitable instruction is offered after confirmation by the user that the correct interpretation has been made. Some of this instruction is generated from the information about Unix held in the domain model and uses files in the current directory as examples. However, many of the explanations are static and are based upon the model of the "typical" user derived from the behavioural studies. This was a satisfactory arrangement because the interface was mainly intended for, and tested on, novices who were similar to those originally studied.

The tutorial module allows the user to input a correct token to replace one that it has detected as incorrect. Also, guidance is provided when a valid command has been given but there are indications that the user lacks some knowledge which would enable more efficient use of the system.

3.8. THE INTERFACE IN ACTION

A major problem for novices is that they cannot find help about an operation they need because they do not know its name in the particular system they are using. For beginners, a menu of a small number of the most frequently used commands is output above the prompt. It also contains instructions for switching off the menu and finding more help. A special command, info, has been included with which users can call up a highly simplified version of the help file. The information it displays is generated from the network of information about Unix commands. The info file provides one-line descriptions of the forty most frequently used commands, and is sufficiently small to allow the subjects to read through and find the name of the command they need.

Unix users frequently give the ls command in order to check which files are currently available; this necessity is removed because the interface provides, above the prompt, a continuously updated list of the files in the current directory. This also provides feedback and facilitates learning because the subjects know immediately the result of giving a command such as cp or rm.

The command line is checked for mistyping and the rules are then activated. The following provides an example for which a number of rules, including rules 63 and 64 (above), are found to be true. The user types

spell f2 > lf

The files in the current directory include lf5 and f2, and in this version of Unix lf lists a file on a printer. Thus the given command is valid. However, the interface determines which of three possibilities may have occurred, two of which would be errors. Firstly, the user might be a novice and is confusing pipes with re-direction of


output. This novice would receive the following response, which is suited to a student of software engineering who is a novice user of Unix:

The > symbol is used to redirect output to a file
It cannot be used to transform output into input of some other process
The | symbol is used to pipe output of a process into another process

Secondly, the user might be an expert and have mistyped lf5. Thirdly, an expert user might have intended to create a file in the current directory which has the same name as a Unix file. The interface would be unable to choose the more likely of the second and third possibilities, and so would not interrupt the expert user by offering possibly unwanted advice.

3.9. RECOMMENDATIONS FOR FUTURE RE-DESIGN

In any future implementation it would be useful to include an additional facility so that the user would be able to initiate a dialogue with the tutorial component in order to ask for help. The tutorial component should also be able to offer a suggested correction of an incorrect token; this correction would be effected only after a positive reply had been received from the user and then the complete corrected line would be passed to the system.
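The proposed confirm-then-correct flow might look like this minimal Python sketch; the function name, prompt wording and injected `ask` callable are assumptions made for illustration:

```python
# Sketch of the proposed dialogue: a suggested correction is applied
# only after a positive reply, and the corrected line is then returned
# for passing to the system.
def confirm_and_correct(line, bad_token, suggestion, ask=input):
    """Replace bad_token only after the user gives a positive reply."""
    reply = ask(f"Did you mean '{suggestion}' rather than '{bad_token}'? (y/n) ")
    if reply.strip().lower().startswith("y"):
        return line.replace(bad_token, suggestion, 1)
    return line  # user declined: pass the line through unchanged
```

Injecting `ask` keeps the dialogue testable and would let the tutorial module substitute its own interaction style for the bare prompt.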

4. Evaluation

The behavioural study described in section 2 indicated that novice users of Unix make numerous errors, waste a good deal of time and effort and are still unable to complete their required tasks. SUSI was designed to enable novice users in particular to carry out their required tasks more successfully and to waste less of their own time and effort than would otherwise have been the case. This should be indicated by the following: (1) successful completion of more of a set of given operations; (2) giving fewer commands to complete the same number of operations; (3) making fewer errors, and in particular, making fewer repeated errors and making fewer of the errors which, although apparently different, are all caused by the same underlying lack of knowledge or misconception; and (4) making fewer repetitions of inefficient usage of the system.

The following study was undertaken and, on its completion, post-hoc analyses were performed to test the above predictions.

4.1. METHOD

A group of 13 undergraduates who had no previous experience of Unix was divided into an experimental group using SUSI (six subjects) and a control group who interacted directly with the Unix shell (seven subjects). A check was made to ensure that the total population was divided as randomly as possible between the experimental group and the control group. The Mann-Whitney U test was used in order to test the hypothesis that members of the control group and the experimental group are not part of the same population when their


first year examination results are under consideration. The results of this test are summarised in Table 2. This indicates that the experimental and control groups are part of the same population when their examination results are considered. It can therefore be assumed that any observed difference between the two groups was not caused by differing computing ability but by the use of SUSI by the experimental group.

The subjects were set a task which could be carried out in their normal working environment, a large laboratory with a terminal for each subject. They were told to take as long as they needed to complete the task, and that they need not necessarily complete it in a single session. Each subject was given the sequence of operations shown in the Appendix and required to complete them in the given order. They were also asked to perform only the given operations and no others until the sequence had been completed.

The operations which they attempted involved frequently-used Unix commands and concepts. For example, operation 18 could be most efficiently carried out by using a pipe. Some operations are included more than once in the sequence and are part of an endeavour to discover whether or not the subject is effecting the operations with maximum efficiency (from the point of view of the user). For instance, a wildcard could be used for operation 28. Likewise, a single command, cat t3 t4, can be used for operation 24, rather than giving two commands; similarly, operations 29 and 30 need only one command and can act as a test of whether the tutorial information provided by the interface has been understood/remembered by the subject.

The performance of the subjects was recorded in the following ways:

(1) A log of their input to the Unix shell was collected, as in the previous study;
(2) Verbal protocols were collected, as previously; and
(3) On completion of the task, the subjects were asked for their comments on each of the operations in the sequence, including any problems they found.
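The between-groups comparisons below rest on the Mann-Whitney U test; for samples this small the statistic can be computed directly, as in this illustrative Python sketch (significance would still be read from tables):

```python
# Minimal Mann-Whitney U: count the pairs where a value from one group
# exceeds a value from the other (ties count half), and report the
# smaller of the two U statistics, as is conventional for tables.
def mann_whitney_u(xs, ys):
    """Return the smaller of the two U statistics for samples xs, ys."""
    u = sum(1 for x in xs for y in ys if x > y) \
        + 0.5 * sum(1 for x in xs for y in ys if x == y)
    return min(u, len(xs) * len(ys) - u)
```

With six experimental and seven control subjects there are 42 pairs in total, so the two U values always sum to 42.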

4.2. RESULTS AND CONCLUSIONS

Table 3 provides the errors in each category for the control and experimental groups. The results are summarised in Table 4. They indicate the following differences between subjects in the control group and those in the experimental group: (1) More of the operations were attempted and more were successfully completed

TABLE 2
Summarized results of first year examinations (Mann-Whitney U = 21, P = 0.527; con = control group, exp = experimental group)

       Median examination result    Mean    S.D. (n-1)
con    65.5                                 16.40
exp    57                                   10.40

TABLE 3
Evaluation: control and experimental error means

Error  Category                    Control    Experiment
1      Unable to give command      1          3
2      Failed try for help         3          0.57
3      Wrong mode or directory     0          0.14
4      Using previous knowledge    0.5        0.14
5      Mistype                     2.5        0.43
6      Prevented                   0.83       0
7      Misunderstanding            0          0
8      Inefficient use             3.1        0
9      Misread documents           0          0.29
10     Unknown                     1.67       0.29
11     Misconception               9.67       0.71
12     Obscenities                 0          0
13     Guesses                     0.67       0
14     Loss of attention           1.17       0.29
15     Forgot own file name        0.17       0
16     Easier option available     0          0
17     Known to be wrong           0.33       0
18     Forgotten                   0.17       0
19     Documentation errors        1.17       0.43

TABLE 4
Evaluation study: additional results

                                   Median   Mean    S.D. (n-1)   MWU   P
Total of commands          con     116      111.5   38.31        7     0.026
                           exp     55       56      7.93
Operations attempted       con     22       23.5    4.18         4     0.007
                           exp     30       29.6    0.54
Operations successful      con     18       19.3    5.28         6     0.001
                           exp     29       28.3    1.11
Total of all errors        con     25.5     29      14.63        3     0.002
                           exp     4        6.7     4.42
Total of repeated errors   con     4        4.5     4.55         8     0.057
                           exp     0        0.1     0.38
Total of misconceptions    con     11       9.7     5.61         11    0.002
                           exp     1        0.7     0.49
Repeated inefficiency      con     0        1.5     2.49         0     0.001
                           exp     0        0       0
Errors with same cause     con     7        9       9.09         5     0.011
                           exp     0        0.57    1.19


by members of the experimental group than those of the control group. Most members of the experimental group completed the sequence successfully whereas most of the control group did not.

(2) Although they attempted and successfully completed more operations, members of the experimental group gave many fewer commands than members of the control group. (This is a good indication that the SUSI users are having less difficulty. However, the mode of activity may be different for learners where help is not easily available: they may adopt a more experimental approach and hence use more commands to effect the same results.)

(3) Members of the experimental group made fewer errors than did those of the control group.
(3.1) They did not repeat errors as often as did those in the control group. (Members of the experimental group were immediately given the information which allowed them to learn from their mistake, whereas members of the control group might try the same invalid command again, sometimes more than once.)
(3.2) Members of the control group would more often continue to issue invalid commands which, although different from each other, were all caused by the same underlying misconception or lack of knowledge. (Members of the experimental group received tutorial guidance about each misconception or lack of knowledge as soon as it became apparent.)

(4) Members of the experimental group showed no repetitions of inefficient use, whereas some repetitions were shown by members of the control group.

4.3. COMMENTS

When compared with the control group, the experimental group showed a significantly improved ability to use Unix easily and efficiently. However, the following points should be noted.

(1) There are two different aspects of SUSI which were provided in an attempt to produce a supportive interface: first, the provision of remedial instruction, the necessity for which was deduced from the production rules; secondly, the provision of additional facilities (the info file, directory listing and menu). The relative contribution of these two aspects of the interface to the improved performance of the experimental group is unclear. It is possible, for example, that access to information about the contents of the current directory was a contributing factor to the reduced error rate of the experimental group. However, there are strong indications that the remedial instruction was useful, in that the experimental group made fewer repetitions of errors and made fewer errors caused by an already evidenced misconception or lack of knowledge. In future it would be important to test the two aspects separately in order to discover their relative contribution to the improved performance.

(2) There was a slower system response time for users of SUSI. This probably did not significantly affect the results, but there is a possibility that it may have encouraged users in the experimental group to make more effort to get it right first time, or, because they actually had more thinking time, they may have worked out the next command while waiting. The observations on the control group make


this seem unlikely: they spent a considerable amount of time and effort, and were still less successful at completing all the operations than were members of the experimental group. However, future testing should ensure that both the control and the experimental system provide similar response times.

(3) There may have been a stronger Hawthorne effect for the experimental group. That is, they may have made more effort to do well because the extensive and immediate feedback they received (the tutorial guidance) was a reminder that they were part of an experiment. Again, it seems unlikely that this was the sole cause of the improvement for the experimental group, because the control group members were aware of being observed and certainly gave every appearance of trying all possible ways to complete their task. However, for future testing it might be possible to provide some means by which the members of the control groups are reminded that they are being observed to the same extent as the members of the experimental group (by providing additional feedback of some sort).

5. Concluding remarks

The development of the SUSI interface was based upon an extensive series of observations of user behaviour. Unix experts who were also experienced tutors provided the knowledge about the causal classification of user errors. This was used as the basis for detection of individual users' problems by SUSI. Evaluation of the resultant SUSI design indicates that it helps novices to overcome the initial difficult stage of using Unix and enables them to use it more efficiently. The interface allows experts to interact with Unix in almost their usual way, but offers advice and guidance when it detects a novice error. Novices show much less confusion because their underlying problems are usually solved quickly and, although there are some occasions when the interface is unable to help, the novice is no worse off in that situation than is the usual novice user of Unix.

The methodology for the development of an intelligent interface consists of carrying out studies of user behaviour which are used to detect the categories of errors which occur, and then to discover from experts what characteristics of a user's input allow them to classify each error. This knowledge is then formulated as production rules which provide a user modeller which interprets the current user's behaviour so that suitable guidance and instruction can be provided. Additionally, the studies of users enabled the discovery of further facilities which the interface should provide (such as the "info" file for SUSI).

If there had been no time constraint on the project it would have been useful to improve the testing and to extend some of the features, particularly in the areas of user modelling and tutorial interaction. It could be of particular value to users if the interface were able to infer possible plans and intentions as part of the user model. It was clear from the behavioural studies that some plans are fairly easy to detect.
This was so particularly for the subjects in the evaluation study because their goals had been explicitly provided; they had received a list of operations to be performed in the given order and so it was possible to detect which actions contributed to the given goals.
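This kind of plan detection, straightforward when the goals are given explicitly as an ordered list, can be sketched in Python (the function name and goal predicates are invented for illustration):

```python
# Sketch: with an explicit ordered task list, each command can be
# credited to the next goal it satisfies; commands matching no pending
# goal are left unattributed.
def attribute_actions(commands, goals):
    """goals: list of (name, predicate) pairs in required order."""
    attribution, i = [], 0
    for cmd in commands:
        if i < len(goals) and goals[i][1](cmd):
            attribution.append((cmd, goals[i][0]))
            i += 1
        else:
            attribution.append((cmd, None))
    return attribution
```

A fuller plan recogniser would also have to cope with goals attempted out of order and with commands that only partly advance a goal.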


Future effort might also go into providing the interface with detailed knowledge of more commands. However, since most novices use far fewer commands than those already known by the interface, this may not be particularly useful. Also, as users gain expertise, they are able to use the online help more efficiently. It would probably be more useful to provide successive levels of explanation about the known commands and concepts, which would then be chosen and output by the interface in order to fit the user's mental model of the system.

Further aspects for consideration are that the interface was designed as a result of behavioural studies of a particular type of user: undergraduates who had previous experience of other operating systems. It would be useful to discover what changes might be required to enable the interface to help naive users of computers, to help users who had previous experience of earlier versions of Unix and also to help intermittent users to regain lost expertise.

I should like to thank Bob Hendley and Peter Dodd for their contribution to the work described in this paper and Thomas Green, Donia Scott and Peter Johnson for their helpful advice on its presentation.

References

BROWN, J. S. & BURTON, R. R. (1978). A paradigmatic example of an artificially intelligent instructional system. International Journal of Man-Machine Studies, 10, 323-339.
BURTON, R. R. & BROWN, J. S. (1979). An investigation of computer coaching for informal learning activities. International Journal of Man-Machine Studies, 11, 5-24.
CARROLL, J. M. & MACK, R. L. (1984). Learning to use a word processor: by doing, by thinking and by knowing. In J. C. THOMAS & M. SCHNEIDER, Eds. Human Factors in Computers. Norwood: Ablex.
CLANCEY, W. J. (1986). Qualitative student models. Stanford Knowledge Systems Laboratory Report No. 86-15.
FISCHER, G., LEMKE, A. & SCHWAB, T. (1984). Active help systems. In G. C. VAN DER VEER, M. J. TAUBER, T. R. G. GREEN & P. GORNY, Eds. Readings in Cognitive Ergonomics: Mind and Computers. Berlin: Springer-Verlag.
HANSON, S. J., KRAUT, R. E. & FARBER, J. M. (1984). Interface design and multivariate analysis of Unix command use. Association for Computing Machinery Transactions on Office Information Systems, 2, 42-57.
HAYES, P. J., BALL, J. E. & REDDY, R. (1981). Breaking the man-machine communication barrier. Institute of Electrical and Electronics Engineers Computer, 14, 19-30.
JANOSKY, B., SMITH, P. J. & HILDRETH, C. (1986). Online library catalogue systems: an analysis of user errors. International Journal of Man-Machine Studies, 25, 573-W.
JERRAMS-SMITH, J. (1986). The application of expert systems to the design of human-computer interfaces. PhD thesis, Computer Science Department, Birmingham University, UK.
JERRAMS-SMITH, J. (1987). An expert system within a supportive interface for Unix. Behaviour and Information Technology, 6, 37-41.
JOHNSON, P., JOHNSON, H., WADDINGTON, R. & SHOULS, A. (1988). Task related knowledge structures: analysis, modelling and application. In D. JONES, Ed. People and Computers 4. Cambridge: Cambridge University Press.
JOHNSON, W. L. & SOLOWAY, E. (1984). Intention-based diagnosis of programming errors. Proceedings of the National Conference on Artificial Intelligence, 162-168.
KAMOURI, A. L., KAMOURI, J. & SMITH, K. H. (1986). Training by exploration: facilitating procedural knowledge through analogical reasoning. International Journal of Man-Machine Studies, 24, 171-192.


NICKERSON, R. S. (1981). Why interactive computer systems are sometimes not used by people who might benefit from them. International Journal of Man-Machine Studies, 15, 469-483.
NORMAN, D. A. (1983). Design rules based on analyses of human error. Communications of the Association for Computing Machinery, 26, 254-258.
SELF, J. A. (1974). Student models in computer aided instruction. International Journal of Man-Machine Studies, 6, 261-276.
SHNEIDER, M. & THOMAS, J. C. (1983). The humanization of computer interfaces. Communications of the Association for Computing Machinery, 26, 252-253.
SHNEIDERMAN, B. (1979). Human factors experiments in designing interactive systems. Institute of Electrical and Electronics Engineers Computer, 12, 9-19.
SLEEMAN, D. H. & SMITH, M. J. (1981). Modelling student's problem solving. Artificial Intelligence, 16, 171-188.
SMITH, J. JERRAMS (1985). SUSI: a smart user-system interface. Proceedings of the Conference of the British Computer Society Human Computer Interaction Group, P. JOHNSON & S. COOK, Eds. 211-220. Cambridge: Cambridge University Press.
WINSTON, P. H. & HORN, B. K. P. (1981). LISP. Reading, MA: Addison-Wesley.
YOUNG, R. M. (1981). The machine inside the machine: users' models of pocket calculators. International Journal of Man-Machine Studies, 15, 81-85.

Appendix

The standard sequence of operations for the evaluation study, which all subjects were asked to complete:
(1) Find out how to use the help facility
(2) Set a password
(3) Create a file "text" of the text supplied
(4) Copy the contents of "text" to "try"
(5) Display contents of "try" on screen
(6) Rename file "try" to "test2"
(7) Print contents of "test2" on screen with headings and page numbers
(8) Delete file "text"
(9) List contents of current directory on screen
(10) Copy contents of "test2" to "test3"
(11) Edit "test3" to include commands recognised by a text formatter
(12) Use the spelling checker (British version) to find mis-spelled words in file "test3"
(13) Output the mis-spelled words to a file "test.sp"
(14) Use an editor to look at file "test.sp"
(15) Format "test3"
(16) Edit "test3" again to include corrections and some additional commands
(17) Format "test3" again and output newly formatted version to file "t3"
(18) Use spell and wc to discover total number of spelling errors in file "t3"
(19) Discover who else is currently using the system
(20) Send mail to js: include date, time, own username
(21) Change access rights on "t3" to be readable, writable and executable by owner only

(22) List contents of current directory, showing access rights (23) Copy file “t3” to “t4” (24) Display files “t3” and “t4” on screen

(25) Show date and time on screen
(26) Rename "test2" as "testtwo"
(27) Rename "test3" as "testthree"
(28) Display files "testtwo" and "testthree" on screen
(29) Delete "t4"
(30) Delete "testtwo"