Int. J. Man-Machine Studies (1983) 18, 215-252
A theoretical basis for the representation of on-line computer systems to naive users
A. P. JAGODZINSKI
Radcliffe Science Library, Oxford University, Parks Road, Oxford, U.K.
(Received 24 October 1981)
Computer science has now recognized that in holistic systems design the designer must include not just the terminal, but also the user within the boundaries of the system. Users, particularly if they are computer naive, require a conceptual model of the computer system so that they can form a clear idea of what the system is doing and what it can do. This model is communicated to the user by the representation of the system which appears at his terminal. Existing techniques for the design of terminal dialogues do not include methods for representing conceptual models, so that new techniques are needed. If these are to be reliable they must be based on theory rather than just on the intuitions of individual designers. This article examines current theory and practice in psychology, computer science and process control, and seeks a consensus for the design of representations suitable for describing the operations of on-line computer systems via their terminal interfaces.
0020-7373/83/030215+38$03.00/0 © 1983 Academic Press Inc. (London) Limited
1. Introduction
1.1. THE NEED FOR COMPUTER SYSTEMS TO BE REPRESENTED TO THEIR USERS
In his guest editorial in ACM Computing Surveys, Moran (1981a) provided a comprehensive introduction to "The Psychology of the Computer User". In this he made it clear that the responsibility of the designer of computer systems encompasses not just the terminals, but users as well. The design of the interface between the system and the user must therefore satisfy not only the hardware and software constraints of the system, but also the psychology of the user.
The whole conceptual organisation of the computer system from the user's point of view--the user's conceptual model of the system--is an integral part of the user interface. The conceptual model is the knowledge that organises how the system works and how it can be used to accomplish tasks . . . . The conceptual model must be taught to the user and must be reinforced by the behaviour of the system.
In the case of systems for dedicated operators or programming users the designer can, to a large extent, teach the users their conceptual model as part of their training, or rely on their computer education to provide them with a reasonable facsimile of the system's operations. A consequence of the falling cost of computer hardware is the rapid growth of interactive information and decision-support systems with an increasing proportion of their users having no experience or training in the use of computers, i.e. "casual" (Martin, 1973), "general" (Miller & Thomas, 1977), or "naive" (Kennedy, 1975). For these users a problem is created by the fact that their use of the computer forms only a very small part of their total jobs. Consequently they do not develop the computer-familiarity of dedicated operators or a high degree of fluency with the computer dialogue and task structure. Also, they are likely to be given little or no training (especially if they are members of the public) and virtually no explanation of how the system works (Jagodzinski, 1980). Clearly, then, systems with naive users must provide a conceptual model as part of the terminal dialogue, and this must be pitched at the right level for them. Fitter (1979) states the aim:
(i) the underlying process which the computer is performing should model the processes which are directly pertinent to the user in a manner compatible with the user's own model of the process;
(ii) the communication language (or user interface) should be designed so as to reveal underlying processes as vividly as possible.
To summarize, the terminal displays must provide a coherent representation of the processes being carried out by the system and the representation must be easily assimilable by the user.
1.2. THE NEED FOR A THEORETICAL BASIS FOR REPRESENTATION
Having established that there is a need for computer systems to be represented to naive users, it would be tempting to go ahead and provide representations on VDU screens on the basis of the experience and intuition of the systems designer and for the needs of a particular computer system and set of users. Mumford (1980) cautions against this approach in discussing the effect of sociology on the design of information systems:
Many articles on the design of computer systems start with a check list of steps or procedures which are put forward as a guide to eventual success. Other, more sophisticated writings, talk about taxonomy and categorize different aspects of the problem or the system so that it can be broken down into small, manageable units. Few, however, go deeper than this and try to identify the intellectual theories that underpin the recipes that are recommended.
Yet every method or form of classification is based on a theory of some kind and it can be argued that faster progress would be made in developing useful methods and taxonomies if these underlying theories were made explicit and their validity examined and discussed. If theory is not made explicit then the assumptions behind current practice remain hidden and may not be recognized as false. Also practice becomes based on empiricism, on what seems to work rather than on understanding and knowledge of why it works.
Moran, too, points out the danger of an approach that generates guidelines which are specific to particular systems and which are empirically rather than theoretically based:
As a consequence, the corpus of guidelines will have to be enormous; and many of them will contradict each other--resulting in muddled design guidance. The reason for this complexity is that a collection of features motivated by design choices does not add up to a coherent psychological picture.
In accordance with these warnings, the references made to specific computer systems (mainly a system for the control of serials in a science library) will be purely to illustrate possible interpretations of theory, rather than concrete proposals for systems designers. The value of using this particular system as an example is that most of its concepts will be easily understood by readers with an academic background. In order to illustrate the concepts which arise from the investigation the system has to be big enough to have independent modules dealing with different aspects of a real
world problem. The processes chosen are:
Registration: the process of recording the arrival of a new issue of a journal. It assumes that details of title, etc., and possibly other issues are already held by the system.
Claiming: the process of detecting the non-arrival of an expected journal issue and the generation of a claim to the supplier.
Holdings enquiries: the process of finding out whether particular issues of particular titles are held by the system.
These processes share subroutines for logging-on and for identifying particular titles (Dictionary look-up), and holdings within titles (Holdings key look-up) (see Fig. 1).
[Figure 1 diagram: "Log-on, Dictionary and Holdings key look-up" linked to "Registration".]
FIG. 1. The serials control system.
Although a fairly elaborate example is needed, the investigation will be driven by theoretical considerations rather than led by the desire for immediate practical solutions to a particular system design. This emerges in the objectives.
1.3. OBJECTIVES OF THE INVESTIGATION
1. To identify within a rather diverse range of existing theories a consensus of ideas for the design of representations for naive users of on-line computer systems.
2. To derive and illustrate the general rules which the consensus suggests.
2. The sources
Three areas of knowledge have emerged as bases for the design of a method of representation and there is considerable overlap in both the ground they cover and the explanations they propose.
2.1. PSYCHOLOGY
Cognitive psychology provides the greatest number of ideas. The large volume of published material about the theory of representation has made it necessary to confine the scope of this study to those authors in the field who are cited most frequently as having a fundamental contribution to make. The role of psychology in the investigation of the cognitive activity of computer users is well defined by Green (1980) in his paper "Programming as a cognitive activity". First of all he dismisses the notion that there may be in existence a comprehensive "Theory of Thinking" which can be applied to the simplification of
all cognitive activities. Then he points to three ways in which psychology can help the investigator. First, the experimental methods of psychology can be used to test the truth of intuitions. Green suggests that for problems which have two or more contradictory solutions, each one seeming intuitively satisfactory, theories and experiments of psychologists can help in the choice of the best solution. Secondly, they can be used to indicate by how much and under what conditions method A is better than method B. Thirdly, psychology can suggest ways of doing things which are not intuitively obvious, even to the experts in the field. An example, given by Green from Sime, Arblaster & Green (1977), illustrates these points. In experiments on programming languages conditional statements of the form "if P then do A else do B" were examined. It was demonstrated that the form "if P: do A; not P: do B; end P" was handled more easily by novice and professional programmers. For example, program mistakes by novices were corrected 10 times faster in the second form. This finding arose from what Green calls "unremarkable psychological thoughts", even though there were absolutely no standard languages using the form at the time. In this way the theory of psychology can be seen as a useful source of ideas which, as yet, have not been applied to improving the representation of computer processes to naive operators. The experimental method of psychology provides hard evidence to help in the choice of methods of representation, and to quantify and qualify the advantages of the chosen methods. The range of processes for which a natural form of representation is being sought forms only a very small proportion of the total range considered by psychologists. 
For example, this study is not concerned with specific findings about vision systems or schema for stories, but only with the general conclusions about methods of representation which can be drawn from them and applied to the particular problem of representing computer processes. A useful consequence of this position is that the minutiae of much of the theory are not specifically relevant to this research and need not be discussed here.
2.2. COMPUTER SCIENCE
Since the inception of computers there has been a continuous movement in programming away from methods which reflect the working of the computer towards methods which correspond more closely with human cognitive processes. For example, Dijkstra (1976), in "an essay on the notion: 'the scope of variables'", discusses the evolution of high level programming languages from the techniques of circuit design. In this he recognizes that some facilities available in a machine, such as the freedom to jump to addresses anywhere in memory, are not well suited to the way in which the human mind solves problems. Hoare's (1972) ideas in "Notes on data structuring", later extended by Jackson (1980), take the trend to its logical conclusion, suggesting that programmers should solve problems in a "programming language" which is entirely free from any consideration of the characteristics of the machine. An "execution language" should then be used to specify how the program should be compiled and executed to achieve reasonable machine efficiency.
Clearly, computer science recognizes that there are advantages in insulating the programmer as a human problem-solver from the characteristics of the computer. The same principle can be extended to include the naive user of on-line systems. However, naive users are quite different from programmers, and need to be considered separately. Obviously, the rigorous methods of algorithmic problem-solving are not appropriate either to the level of problem (usually simple data-manipulation) or to the people (by definition untrained) encountered as on-line data processing system users. Nevertheless, computer science makes a very significant contribution. Its most important role is as a paradigm of the systematic, disciplined approach to the solution of complex problems. In this way it sets a standard and provides operational guidelines for the investigation to follow, although the problem under consideration is more amorphous and fuzzy than most tackled by the discipline. Secondly, it provides excellent examples of the way in which a formal organization of ideas helps to clarify real-world problems, for instance, the benefits provided by structured programming in its hierarchical decomposition of problems.
2.3. PROCESS CONTROL
The third source has been the findings of process control. The particular value of this contribution is that it deals with practical applications of the representation of unseen processes and provides hard evidence of the relative merits of different methods in working environments. Much of this has been provided by the work of Rasmussen, principally from his (1980) paper "The human as a systems component". His findings are derived from study of operators and engineers during routine process monitoring and fault-finding in nuclear power stations. The analogy with computer systems is provided by the fact that both systems are on-line, designed by man and with internal processes which cannot be viewed directly and which therefore must be represented to their users. The rigour of Rasmussen's methods and conclusions is probably enhanced by the serious implications of operator error in nuclear installations.
2.4. A CLASSIFICATION OF THE SUBJECT
With its broad range of sources, this investigation is probably best classified as cognitive science. This is described by Bobrow & Collins (1975) as a new field which includes elements of psychology, computer science, linguistics, philosophy and education. Certainly the investigation shares the aim of cognitive science of producing solutions to "problems that did not appear to be solvable from within any single discipline".
3. The medium for representation
Traditionally, when introducing a new computer system a designer relies heavily on the users' existing knowledge of the pre-computer system to help their formulation of a representation of the computer system. In this way simple analogies can be drawn, for example between Kardex files and computer files, invoices and VDU screen "forms". However, the computer system may not do things in the old way, or it may do some completely new things, or it may be for use by people who have no pre-computer experience of this particular part of the real world.
The second traditional method of providing the user with an effective representation of the actions of the computer system is by a programme of training. The problem in the case of naive users is that training is often not desirable or not possible. The third method of informing the user of the computer's actions is the medium of the user interface in its narrowest sense, i.e. the terminal dialogue and displays. The only physical device which the user is obliged to use is the computer terminal, so that it is the only medium which can be relied on to provide a representation of the computer's operations. In practice it may be necessary to include other media such as instruction manuals or rudimentary training, but in principle it will be assumed that these should not be relied on. However, it will also be assumed that the devices available as terminals are the richest and most flexible available with current technology. At present this probably means high-resolution, colour graphics VDU/keyboard terminals, although again in practice compromise should be possible. In the near future it is likely that links between video technology and computer systems will permit unlimited variety and scope for imagination in terminal displays.
4. The nature of representation
(A description of representation and its place in computer system design and use)
The first step in a discussion of representation must be to describe what a representation is and the properties which are necessary for it to be of service to its owner. A simple, robust model is provided by Bobrow (1975). He takes the position of the designer of an "understander system" which uses knowledge to achieve some goal. Representations are then seen as the result of a selective mapping of aspects of the world. Obviously, the model has to take account of changes in the world, and so needs to be dynamic in operation [see Figs 2(a) and (b)].
[Figure 2 diagram: (a) the simple model, in which a World state maps to a Knowledge state; (b) the dynamic model, in which World state 1 maps to Knowledge state 1 and World state 2 maps to Knowledge state 2.]
FIG. 2. (a) Representation: the mapping of world states to knowledge states, (b) representation: the dynamic model.
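The dynamic model of Fig. 2(b) can be read as a consistency condition: changing the world and then mapping it should give the same knowledge state as mapping first and then applying the corresponding change to the knowledge. The Python sketch below illustrates that reading only. The class names, the fields and the functions `represent`, `register_issue` and `note_registration` are hypothetical, chosen to fit the serials example, and are not taken from Bobrow (1975).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WorldState:
    """Hypothetical state of the computer system (the 'world' for the user)."""
    issues_held: frozenset      # issue numbers recorded in the database
    file_overflowed: bool       # technical detail the user never sees


@dataclass(frozen=True)
class KnowledgeState:
    """Hypothetical state of the user's representation."""
    issues_held: frozenset


def represent(world: WorldState) -> KnowledgeState:
    """Selective mapping: keep the aspects that matter, drop the rest."""
    return KnowledgeState(issues_held=world.issues_held)


def register_issue(world: WorldState, issue: int) -> WorldState:
    """A change in the world; the overflow flag is set purely for illustration."""
    return replace(world,
                   issues_held=world.issues_held | {issue},
                   file_overflowed=True)


def note_registration(knowledge: KnowledgeState, issue: int) -> KnowledgeState:
    """The corresponding change in the knowledge state."""
    return KnowledgeState(issues_held=knowledge.issues_held | {issue})


def is_consistent(world: WorldState, issue: int) -> bool:
    """Check that both routes through the dynamic model agree."""
    via_world = represent(register_issue(world, issue))
    via_knowledge = note_registration(represent(world), issue)
    return via_world == via_knowledge


before = WorldState(issues_held=frozenset({775, 776}), file_overflowed=False)
print(is_consistent(before, 777))
```

The mapping is deliberately selective: the overflow flag changes in the world but never reaches the knowledge state, so the two routes still agree.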
The effectiveness of the understander system depends on the consistency with which the representations (i.e. Knowledge states) map the World states. In the case of computer systems the World states are largely invisible to the user so that his Knowledge states depend on a model operation which corresponds reliably with the system's operations. There are other models of representation, for example the one proposed by Hoare (1972) in which representation is part of the process of abstraction. However, Bobrow's model seems particularly well suited to naive computer users with its orientation towards an understander system. Nevertheless, the model will not be followed slavishly
because computer systems do not present all the difficulties of understanding found in the entire real world. In particular, they are distinguishable by the fact that they can be closely defined; a real world (the computer system) which is specified from a relatively simple abstract representation (the system design) of a more complex real world (the business system). The design of the computer processes which we are considering has been filtered to remove much of the variety of intangible or indefinable features of the business system (Fig. 3). (Consequently, it is possible to avoid those parts of Bobrow's scheme which are concerned with the sort of exceptions that are excluded by rules from commercial computer data processing.)
[Figure 3 diagram. Boxes: real-world business system (a set of changing world states); systems design (a set of abstract symbols and axioms describing the real world); computer system (a set of changing computer states); user's internal representation of the real-world business system (a set of changing knowledge states); user's internal representation of the computer system (a set of changing knowledge states). Labelled mappings: by systems analysis; by program; by interview; by training; from pre-computer experience (if any); via the computer's user interface. Key: mapping which always occurs; mapping which may occur; visual representation of computer states to assist mapping.]
FIG. 3. The relationships between the real world, the computer system, the designer and the user in commercial data processing system design.
To sum up, it is proposed that the terminal interfaces for naive users of on-line computer systems should be explicitly designed to assist the users in the job of modelling the systems' processes with their internal representation of those processes. To do this the interfaces should be designed to display representations on which the users may base their own internal representations (see Fig. 4). Note that Fig. 4 introduces an extra set of representations, the terminal interface. These are for the benefit of the user and should therefore be designed not in the form most convenient for the computer system, but according to the same principles and design criteria that cognitive science has identified as being used by human "understander systems".
[Figure 4 diagram: Real-world systems (a set of changing world states); mapping by systems analysis, design and programming; Computer system (a set of changing computer states representing real-world states); mapping by techniques of cognitive science; Terminal interface (a set of representations changing to match the computer states); mapping by user's cognitive faculties; User's internal representation of the computer system (a set of changing knowledge states).]
FIG. 4. The role of the terminal interface: to represent the parts of the real world modelled by the computer system.
Note also that the purpose of the user in operating the computer system is ultimately to control real-world events, not computer states for their own sake. The terminal representation should reflect this purpose and minimize the intrusion of computer concerns.
5. The dimensions of representation: a framework for the investigation
In an expansion of his simple model, Bobrow (1975) lists a set of "design issues" for his Understander system. These define the different facets or dimensions of conceptual models which the representation must provide, and thus give the designer a set of objectives which he must achieve if the representation is going to be complete. In this way Bobrow's design issues become central to the investigation and provide its framework. They are as follows.
1. Domain and Range: What is being represented? How do objects and relationships in the world correspond to units and relations in the model?
2. Operational Correspondence: In what ways do the operations in the representation correspond to actions in the world?
3. Process of Mapping: How can knowledge in the system be used in the process of mapping?
4. Inference: How can facts be added to the knowledge state without further input from the world?
5. Access: How are units and structures linked to provide access to appropriate facts?
6. Matching: How are two structures compared for equality and similarity?
7. Self-awareness: What knowledge does a system have explicitly about its own structure and operation?
In the remainder of this article the serials control system, a typical data processing system, is examined under these headings. Modifications are suggested to bring the system some way towards meeting the requirements of the theories of cognitive science.
5.1. DOMAIN AND RANGE
5.1.1. Units and relations
Here the concern is the correspondence between real-world objects and their relationships, and the units and relations in the model.
As we have seen, technical devices such as the internal representation of characters by bits are irrelevant to the needs of the user, so that no useful purpose is served by subdividing the units in his representation below the level of a single character. This is the smallest parcel of data that he can receive on a CRT or send on a keyboard, and as defined by Bobrow (1975), a unit can be used without knowing anything about its internal structure. Thus the lowest level of unit in the representation is established. Obviously the use of multi-character fields, e.g. "date" and "volume number", is essential and provides a higher level of unit. For familiar fields such as these, the structure of the characters they contain, e.g. day, month, year, is also obvious. However, the principles of structuring data are worth stating for the light they may shed on the representation of fields whose structure is not known to the user. For data of this type it is advantageous to structure the representation so that the relationships between the units are evident. Rasmussen (1980), discussing the use of displays for process control operators, describes how operators "chunk" several variables into higher level abstractions as states to counteract the low capacity of conscious reasoning. Newell & Simon (1972), discussing the human as an information processing system, suggest that conscious processing can cope with only two or three chunks at one time. Thus it is helpful to the operator if unfamiliar data is represented in a way which suggests a logical and consistent pattern for "chunking", and if the "chunk" is provided with a single descriptor. The structure and description of the data chunk must then be used consistently throughout the system, as shown in Fig. 5.
[Figure 5 diagram: three levels. The chunk: Issue description. The fields: Volume number, Part number, Sub-part number, Edition. The characters: the individual characters within each field.]
FIG. 5. Representation of data in a "chunked" form.
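The chunk of Fig. 5 can be sketched as a single named structure whose fields are always shown under the same descriptor. The Python below is a minimal illustration, not a proposed implementation. The field names follow the figure, while the field types, the optional fields and the `display` method are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IssueDescription:
    """One chunk with a single descriptor; field types are assumed."""
    volume_number: int
    part_number: int
    sub_part_number: Optional[int] = None   # assumed optional
    edition: Optional[str] = None           # assumed optional

    def display(self) -> str:
        """Render the chunk the same way wherever it appears."""
        parts = [f"Vol. {self.volume_number}", f"Part {self.part_number}"]
        if self.sub_part_number is not None:
            parts.append(f"Sub-part {self.sub_part_number}")
        if self.edition is not None:
            parts.append(f"Edition {self.edition}")
        return "Issue description: " + ", ".join(parts)


print(IssueDescription(volume_number=18, part_number=3).display())
```

Keeping the rendering in one place is one way to meet the requirement that the structure and description of the chunk be used consistently throughout the system.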
In some cases the design of the computer system will suggest a suitable grouping of data for chunking, but this should not be confused with the possibility of representing all the computer system's data structures to the user. In many cases these structures will be manifestations of technical constraints, rather than of the real-world relationships of the data.
5.1.2. Exhaustiveness
The principle inherent in choosing units in the representation (the terminal dialogue) to correspond to objects in the real world and the computer system is to give the user the optimum amount of structural information. The representation is not an exhaustive map of the computer system. It lacks detail of the fine structure within the chosen units, and by the same principle need not show the details of the physical structure within which the units fit. For example, data relating to a specific issue of a journal may be held on three separate files in the computer system, one for the journal's title, one for the holding to which the issue belongs and one for the issue itself. This quirk of the database structure, designed to avoid redundancy of data, will be totally
irrelevant to the user who wishes to register the part when it arrives. As far as he is concerned, the most convenient analogy for the structure of the data would probably be one based on the real world, i.e. the journal issue itself--a simple block of data corresponding directly with the front cover of the issue.
Computer system data structure:
Dictionary records: Title keyword (key); Control number.
Bibliographic records: Control number (key); Bibliographic data.
Holdings records: Control number and Holding number (2-level key); Source; Data specific to holding.
Issues records: Control number, Holding number and Issue description (3-level key); Data specific to issue.
Representation in terminal dialogue:
Title, e.g. "Computer Weekly".
Holding (evident from source), e.g. "Copyright".
Issue description, e.g. Number 777.
Issue data (if any), e.g. "contains index".
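The contrast above can be sketched in a few lines of Python: the keyed record types stay inside the system, and the user is handed one flat block that mirrors the front cover of the issue. This is an illustrative sketch under stated assumptions. The dictionaries stand in for the files, the control number 1001, the holding number 1 and the function name `issue_as_seen_by_user` are hypothetical, and only the example values ("Computer Weekly", "Copyright", Number 777, "contains index") come from the text.

```python
# Hypothetical stand-ins for the four record types; keys follow the listing above.
DICTIONARY = {"COMPUTER": 1001}                        # title keyword -> control number
BIBLIOGRAPHIC = {1001: "Computer Weekly"}              # control number -> bibliographic data
HOLDINGS = {(1001, 1): "Copyright"}                    # (control, holding) -> source
ISSUES = {(1001, 1, "Number 777"): "contains index"}   # (control, holding, issue) -> issue data


def issue_as_seen_by_user(keyword: str, holding_number: int, issue: str) -> dict:
    """Follow the keys through the record types and return one flat block."""
    control_number = DICTIONARY[keyword]
    return {
        "Title": BIBLIOGRAPHIC[control_number],
        "Holding": HOLDINGS[(control_number, holding_number)],
        "Issue description": issue,
        # Issue data is optional in the dialogue, so a missing record yields "".
        "Issue data": ISSUES.get((control_number, holding_number, issue), ""),
    }


print(issue_as_seen_by_user("COMPUTER", 1, "Number 777"))
```

The control number and the multi-level keys never appear in the returned block, which is the sense in which the representation is not exhaustive.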
5.1.3. Verbal mediation
One problem which occupies a large proportion of the literature of cognitive science is that of word-sense in natural language systems. To a large extent this problem is avoided in computer data processing systems because individually they deal with very small subsets of the whole world. Within these subsets the sense of important words takes on a specialized meaning, even for naive users. For example, in library serials systems the word "volume" refers to a set of issues which have the same prefix characters. This would be true for the system even if it were used by the general public. In most data processing systems the specialized use of words is made even more reliable by the fact that the population of users may be computer-naive but have some other training in common, for example librarianship or banking. Even in the case of systems designed for use by the general public, e.g. a bank account enquiry service, it is reasonable to use words in a specialized way without the need for explanation. Because the customer is focusing on a particular subset of the world he knows that "balance" means the amount of money in his account, rather than a pair of scales or an acrobatic performance. In both cases there is an unwritten acknowledgement that particular words symbolize a specific class or type of objects in the real world. However, it would be unrealistic to require rigorous or complicated definitions of new concepts to be known by the users simply for the convenience of the computer system. At the same time, if the real-world system deals in complicated concepts then it would not be realistic to expect the computer system to protect the user from the need to understand them, for example the complex semantic relationships required for information retrieval by subject keywords.
5.2. OPERATIONAL CORRESPONDENCE
In this section the issue changes from the representation of data to the representation of procedures. Bobrow (1975) states the design aim simply as the need to have correspondence between actions and structures in the physical world (in this case the computer system) and operations and representational forms in the model (the terminal displays).
5.2.1. Updating and consistency
It is clearly not desirable for the representation to map the computer's operations exhaustively, any more than it was desirable to have an exhaustive mapping of computer files to representational units. Again, an economical form of representation is desirable to avoid unnecessarily overloading the human information processor's capacity. Thus, the designer has two questions to answer. First, in representing the operations of a data processing system what is the optimum amount of detail which users should be given if they are to have a consistent model of the system? Secondly, what are the different types or styles of representation that are appropriate for the different types of operations required of the users of on-line data processing systems?
In deciding how much detail to show the user, it seems reasonable for the designer to apply a criterion similar to the one which emerged in the discussion of domain and range. That is, to provide him with a representation which models processes to the same level of detail as it is within his power to affect, but no more. For example, if the process is updating the database with details of a serial part which is being registered, the operator does not need to know that the file is indexed-sequential, or that overflow has taken place. However, he does need some unambiguous representation to confirm that his intention has been carried out and that the database does now hold a record of the serial part. Rasmussen (1980) points out that operators' capacity to absorb the details of a representation, and their grasp of the process, is improved if the representation is structured to permit concepts to be grouped together into a single concept. (There is a strong parallel here with the "chunking" of data described in the previous section.)
For example, in a representation of the registration of newly-arrived serial parts the following steps are essential as they require data inputs from the operator (the process is described in context in section 1.2):
1. Type in keyword derived from title.
2. Select 1 holding from up to 5 alternatives.
3. Examine part for imperfections.
4. Select 1 of the following: next expected part, later than next expected part, earlier than next expected part, special issue or supplement.
5. Type in number of issues in this physical part.
6. Check details input so far.
7. Change details if necessary and return to 6.
8. If satisfied accept part as registered.
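The eight steps above, together with the four-stage grouping that the text goes on to propose, can be held as plain data and checked mechanically. The sketch below is illustrative only. The names `STEPS`, `STAGES` and `grouping_is_sound` are hypothetical, step 4 is abbreviated, and the limit of three steps per stage is the maximum stage size noted in the text rather than a general rule.

```python
# The eight operator steps, keyed by their number in the list above.
STEPS = {
    1: "Type in keyword derived from title",
    2: "Select 1 holding from up to 5 alternatives",
    3: "Examine part for imperfections",
    4: "Select position of part relative to next expected part",  # abbreviated
    5: "Type in number of issues in this physical part",
    6: "Check details input so far",
    7: "Change details if necessary and return to 6",
    8: "If satisfied accept part as registered",
}

# The four-stage grouping proposed in the text that follows.
STAGES = [
    ("Identify journal", [1, 2]),
    ("Identify sequence of issue", [4]),
    ("Physical check of issue", [3, 5]),
    ("Check/accept input data", [6, 7, 8]),
]


def grouping_is_sound(steps, stages, max_steps_per_stage):
    """True if every step appears exactly once and no stage is too large."""
    grouped = sorted(n for _, members in stages for n in members)
    every_step_once = grouped == sorted(steps)
    small_enough = all(len(members) <= max_steps_per_stage
                       for _, members in stages)
    return every_step_once and small_enough


print(grouping_is_sound(STEPS, STAGES, max_steps_per_stage=3))
```

A check of this kind only confirms that the grouping is complete and compact. Whether the stage names are meaningful to the operator remains a design judgement.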
To make the representation of this process easier to grasp, its stages could be grouped as follows:
Registration:
1. Identify journal: steps 1, 2
2. Identify sequence of issue (relative to existing holdings): step 4
3. Physical check of issue: steps 3, 5
4. Check/accept input data: steps 6, 7, 8
Conceptually the process of registration now has four stages, rather than eight. Within each stage there is a maximum of three steps. According to Miller (1968) the capacity of short-term memory is seven symbols, plus or minus two. If a simple processing task is interposed (e.g. counting backwards in threes) between presentation and recall of data then the capacity is reduced to about two symbols. In its structured form the process fits more easily within these limits. In addition its purpose and direction become more evident.
The types of representation which need to be provided are also identified by Rasmussen (1980) in his work on the role of human operators in process control in nuclear generating plants. He distinguishes between models which represent the environment by a set of interacting objects or components and models which represent it by a set of variables connected by a network of rules, and points out that the distinction is similar to Bertrand Russell's distinction between causal commonsense reasoning and deterministic, scientific reasoning. According to Rasmussen's (1980) findings the operators in his experiments with process control work preferred the former type of model, while system designers preferred the second type. Rasmussen describes the causal, commonsense model as an associative chaining of events through the system from the initiating event through to its ultimate effect. Events are changes in object states representing the physical variables of the system. In the more formal type of model, preferred by systems designers, system states are represented by the magnitude of measurements or observable variables.
Functional properties of the system are represented by rules or axioms describing the interrelationships among the variables. These rules can be basic physical laws or symbolic algorithms, depending on the type of system being modelled. Rasmussen comments that the causal type of model is generally characterized as ambiguous and fuzzy in comparison with the more formal model. However, he goes on to say that as a basis for communication and action in a specific system context where the world being modelled is closely limited and defined, the simpler model can be extremely accurate. Furthermore, the rigour of the formal model is not only unnecessary, but will tend to be counter-productive at the data manipulation level of operation, or what he calls the "data domain".
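The grouping of the eight registration steps into four stages, described earlier in this section, can be written down as a small data structure. The following is only an illustrative sketch: the names STEPS, STAGES, largest_stage and covers_all_steps are hypothetical and do not come from the library system, and the step wording is abbreviated from the list above.

```python
# Hypothetical sketch: the eight registration steps and their four stages.
STEPS = {
    1: "Type in keyword derived from title",
    2: "Select 1 holding from up to 5 alternatives",
    3: "Examine part for imperfections",
    4: "Select position of part in sequence",
    5: "Type in number of issues in this physical part",
    6: "Check details input so far",
    7: "Change details if necessary and return to 6",
    8: "If satisfied accept part as registered",
}

# Stage name -> step numbers, following the grouping given in the text.
STAGES = {
    "Identify journal": [1, 2],
    "Identify sequence of issue": [4],
    "Physical check of issue": [3, 5],
    "Check/accept input data": [6, 7, 8],
}


def largest_stage(stages):
    """Return the number of steps in the largest stage."""
    return max(len(members) for members in stages.values())


def covers_all_steps(stages, steps):
    """True if every step belongs to exactly one stage."""
    grouped = sorted(n for members in stages.values() for n in members)
    return grouped == sorted(steps)


if __name__ == "__main__":
    for number, (stage, members) in enumerate(STAGES.items(), start=1):
        print(f"{number}. {stage}")
        for n in members:
            print(f"     step {n}: {STEPS[n]}")
```

The two helper functions simply confirm the claims made for the grouping: no stage holds more than three steps, and no step is lost or duplicated.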
Formal, rigorous modelling of processes is more appropriate to what Rasmussen calls the "functional domain". This is concerned with the internal causal processes of the system. It can be used to generate non-observable data, to predict the response of the system to untried operations and to select the proper means of action to reach a specific system state. Operations within the data domain, such as association, recognition, matching of sets of observations to stored data, etc. can be performed by the parallel, high-capacity, unmonitored functions if all the data are available, especially if the operations can be performed directly on the presented symbols rather than on their functional meaning. For the majority of tasks in routine commercial data processing, for example the Library's serials control system, Rasmussen's data domain is clearly appropriate. The profile of tasks in organizations carrying out large volume data processing is such that a high proportion of human contact with the computer system is confined to a relatively small proportion of the range of operations available. For example, the input of data to the system via the Registration option will account for at least 80% of operators' hands-on time (i.e. registering 180 issues/day on average) in the library system, even though Registration forms only 10% of the options available. During operations of this type it is obviously desirable for the operator to be able to work in the relatively fast, high-capacity, unmonitored mode of Rasmussen's data domain as much as possible. Bobrow & Norman (1975) also recognize the distinction between conscious "high-level" and automatic "low-level" types of human information processing [the appropriateness of hierarchical levels to processing structures will be discussed in section 5.3 (the Mapping Process)].
In describing memory schemata they make the point that context can be used as part of the address specification of objects so that descriptions within a specific context can be short and efficient. They also distinguish between event-driven and concept-driven processing. In the first type, a flow of data is automatically processed at a low level. Only if data cannot be accommodated by the automatic processing facilities does it become necessary to pass it up the hierarchy to the slower conscious levels of concept-driven processing. An explanation of the relatively low rate of conscious processing is given by Newell & Simon (1972). Automatic processes may be parallel while conscious processing is serial and is limited by the capacity of short-term memory (two or three "chunks" during processing). Complex processes either involve breaking the problem down into simple steps, or access to long-term memory for a solution. Newell & Simon also support Rasmussen's (1980) view that processing is made easier if it can be carried out by manipulating symbols in some form of external representation (such as the CRT) or external memory which relieves the pressure on short-term memory. Thus there is a clear consensus that distinguishes between automatic and conscious processing. The consensus ascribes a larger capacity for data to automatic processing, whereas conscious processing is necessary for activities such as choosing and maintaining goals, fault finding and exception handling. In a typical commercial data-processing system both types of processing will be used by the operators: automatic, for handling the large proportion of data manipulation, and conscious, for moving between system functions, handling exceptions and predicting the operation of the system. The forms of representation for these two types of process are suggested in section 5.3, the Mapping Process.
5.2.2. History and planning
This is Bobrow's (1975) second design issue under the heading of operational correspondence. As he states, a simple representation changes to match changes in the world states. A more sophisticated representation would allow its user to look backwards in time to earlier world states (history) and to look forwards to predict the results of future changes in world states, resulting either from alternative operator actions or from independent system functions (planning). Obtaining a representation of the history of an on-line computer system's processes is fairly easy. Terminal screen displays can be stored for as much of the past as is wanted, simply by keeping a sequential file of everything that appears on the screen. Recall of the representation is simply a matter of scrolling backwards through the display file, although in an on-line system it would not be possible for the operator to modify earlier transactions. In practice it is likely that operators would not be interested in transactions older than, say, the one-before-last, so the size of the historical representation file need probably be no greater than two or three screens-full. This amount of buffering is often provided in standard VDU terminals so that no additional facilities would be necessary. Providing a representation to help planning is technically not so straightforward. Conventionally, transaction dialogues in on-line systems are driven by the operator's input data, which in turn are directed by prompts from the system. Without the input of key data, for example, it would not be possible for records to be retrieved for display. In most conventional on-line systems a user performing a particular task, such as the registration of a serial issue, cannot leave that process without completing it or abandoning it. In situations of this sort, two possible types of planning representation would be useful to the operator.
First, he might want to see the effect of different actions or data inputs on the transaction he is processing without them effecting a permanent change in the database. To facilitate this type of planning the system should allow the user to freeze his existing operation at any point and to switch to a reconnoitre mode. This would allow him to simulate the continuation of the current transaction processing without causing irrevocable change or updating any files. On switching out of reconnoitre mode, the terminal dialogue would return him to the point in the transaction where he left off. This facility would be designed primarily to help the user in selecting from the available small-scale, short-term options in the current process. For example, use of the reconnoitre mode in serials registration:

    "Identify Journal
    Please input EITHER the keyword for the journal's title OR the journal's ISSN
    /* AAPGBU
    1. DEFINITIVE TITLE: "AAPG bulletin"
       SPONSORING BODY: "American association of petroleum geologists"
    The keyword you have input references the above title. You may EITHER input the number of the title you select, OR re-input the keyword or ISSN
    /* 1
    SELECTED TITLE: "AAPG bulletin"
    HOLDING (1) SOURCE: Copyright Via Bodley   LIBRARY: RSL   STATUS: Ordered
                COPY: 1   SHELFMARK: PER 1253 d 392   RANGE: 1980-
    HOLDING (2) SOURCE: Purchase Direct   LIBRARY: RSL   STATUS: Dead and all available
                COPY: 2   SHELFMARK: Per 1253 d 627   RANGE: 1952-1963
    HOLDING (3) SOURCE: Exchange Via Bodley   LIBRARY: RSL   STATUS: Live and all available
                COPY: 3   SHELFMARK: PER 1253 d 960   RANGE: 1950-
    Please select one of the preceding holdings by inputting its number
    /* 3
    SELECTED HOLDING: LIBRARY: RSL   SOURCE: Exchange Via Bodley   STATUS: Live and all available
                      COPY: 3   SHELFMARK: PER 1253 d 960   RANGE: 1950-
    Identify Sequence
    TITLE: "AAPG bulletin"   LATEST ISSUE: Volume: 10 Part: 11
    Please examine the part which you are registering and choose one of the following options by inputting its number:
    (1) the part is next in sequence after the one on the screen.
    (2) the part is later than next in sequence.
    (3) the part is the same as or earlier than the one on the screen.
    (4) the part is a special issue, index or supplement.
    /* 1
    The issue description of the last part registered is:
    (1) VOLUME 10
    (2) PART 11
    (3) SUB-PART
    (4) EDITION
    Please change this to the issue description of the part you wish to register by inputting the line number followed by the new entry. Press the ESCAPE key after each line. When all the details are correct input "F".
    /* 2 12
    Physical Check of Issue
    Please inspect the journal for damage or faults. Please EITHER input "D" if it is imperfect OR input "N" if it is all right
    /* N
    Please input the number of issues which are contained in this part."

    [Request "reconnoitre mode" (by soft key or special command).]

    "/* 2
    Check/Accept Input Data
    You have now set up the following details. Please check their accuracy.
    (1) VOLUME               10
    (2) PART                 12
    (3) SUB-PART
    (4) EDITION
    (5) START ISSUE NUMBER   12
    (6) END ISSUE NUMBER     13
    (7) DATE REGISTERED      30th January 1981
    (8) PERSON REGISTERING   Systems Analyst
    Please make any necessary changes to these details by inputting the line number followed by the new entry. Press the ESCAPE key after each line. When all the details are correct input "F"."

    [Request return to current processing (by soft key or special command).]

    "/* 1
    Check/Accept Input Data
    You have now set up the following details. Please check their accuracy.
    (1) VOLUME               10
    (2) PART                 12
    (3) SUB-PART
    (4) EDITION
    (5) START ISSUE NUMBER   12
    (6) END ISSUE NUMBER     12
    (7) DATE REGISTERED      30th January 1981
    (8) PERSON REGISTERING   Systems Analyst
    Please make any necessary changes to these details by inputting the line number followed by the new entry. Press the ESCAPE key after each line. When all the details are correct, input "F".
    /* F
    This transaction is now complete."
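The behaviour of reconnoitre mode in the dialogue above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of the library system's implementation: the class Transaction, its methods and the field names start_issue and end_issue are all hypothetical, and the database is stood in for by a plain list.

```python
import copy


class Transaction:
    """Hypothetical registration transaction with a reconnoitre mode."""

    def __init__(self, details):
        self.details = dict(details)  # live, uncommitted transaction state
        self._frozen = None           # snapshot taken on entering reconnoitre mode

    @property
    def reconnoitring(self):
        return self._frozen is not None

    def enter_reconnoitre(self):
        """Freeze the live transaction; later inputs are only simulated."""
        if self._frozen is None:
            self._frozen = copy.deepcopy(self.details)

    def leave_reconnoitre(self):
        """Discard simulated inputs and return to the point of freezing."""
        if self._frozen is not None:
            self.details = self._frozen
            self._frozen = None

    def set_issue_count(self, count):
        """Record how many issues the physical part contains."""
        start = self.details["start_issue"]
        self.details["end_issue"] = start + count - 1

    def commit(self, database):
        """Write the transaction to the database; refused while reconnoitring."""
        if self.reconnoitring:
            raise RuntimeError("cannot update files in reconnoitre mode")
        database.append(dict(self.details))


if __name__ == "__main__":
    database = []
    part = Transaction({"volume": 10, "part": 12, "start_issue": 12})
    part.enter_reconnoitre()
    part.set_issue_count(2)   # simulated: two issues in one physical part
    print(part.details)
    part.leave_reconnoitre()  # back to the frozen point, nothing recorded
    part.set_issue_count(1)   # the real input
    part.commit(database)
    print(database)
```

The design choice illustrated is that simulated inputs act on the same transaction object as real ones, so the user sees exactly the screens he would see in live processing, while the snapshot guarantees that nothing reaches the files.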
It would be clearer if dialogue in reconnoitre mode could be shown, e.g. in red. If a colour VDU was not available the dialogue could be shown in the inverse form to normal displays, i.e. black on white rather than white on black. Note that by using reconnoitre mode the user is shown the effect of assigning two issues to one physical part. Having decided that this does not correspond with the part he is registering he reverts to the more normal form of one issue per part. The use of a reconnoitre mode has implications beyond short-term planning during live processing. It would also be an extremely useful aid to learning the various facilities of the system and the routes through them. For example, novice users could start using the system entirely in reconnoitre mode (perhaps as a restriction generated by their code number during LOGIN). As their confidence increased they could gradually start processing real transactions, until eventually they would use reconnoitre mode only for a quick reminder of the result of a particular action. In effect reconnoitre mode would be used as a flight simulator, with the advantage that it would still be available during a real flight. As a second aid to planning the user should be provided with the facility to have an overview of the system, showing him where he is in it and the routes available to him. Again, by switching into overview mode the user should be able to freeze the current transaction and be presented with a map of the system. For large systems it may be useful for the operator to be able to change the scale of the map so that he
could have successively wider, less-detailed views of the system, in effect stepping back from the representation. In the following example, the user is engaged in the dialogue for the registration of a journal part:

    "Identify Journal
    Please input EITHER the keyword for the journal's title OR the journal's ISSN
    /* aaspho
    1. DEFINITIVE TITLE: "AAS photo-bulletin"
       SPONSORING BODY: "American astronautical society"
    The keyword you have input references the above title. You may EITHER input the number of the title you select, OR re-input the keyword or ISSN
    /* 1
    SELECTED TITLE: "AAS photo-bulletin"   LATEST ISSUE: VOLUME: 10 PART: 12
    Identify Sequence
    Please examine the part which you are registering and choose one of the following options by inputting its number:
    (1) the part is next in sequence after the one on the screen.
    (2) the part is later than next in sequence.
    (3) the part is the same as or earlier than the one on the screen.
    (4) the part is a special issue, index or supplement."
[Figure: tree diagram of the system functions. "Choose process required" branches to 1 Registration (Identify title; Identify sequence; Physical check of issue; Check/accept input data), 2 Holdings enquiries (2.1 Find a particular issue; 2.2 Review publication patterns) and 3 Claiming (3.1 Review existing "to be claimed" list; 3.2 Make a new claim). A blinking "You are here" marker shows the user's current position.]
FIG. 6. System map providing an overview of systems functions.
At this point the user is unsure whether the part he has, Volume 11, Part 1, is next in sequence or later than next in sequence. He would like to look at the holdings for previous volumes to see if there are usually more than 12 parts per volume, but does not know how to find this information. By calling up the system map facility he obtains a quick overview of the system function which he is currently in, and of the functions which it is possible to use without abandoning his current transaction or needing to login again (see Fig. 6). From the map the user can select the function he wishes to perform, ideally by light pen but possibly with an identifying code. The dialogue for the selected function is continued for as long as the user requires. He then returns to his original transaction, which he is able to complete with the information he found from his enquiry about previous holdings (i.e. that there are usually 12 parts per volume).

    [SELECT FUNCTION 2.2 (by soft key, special command or light pen)]

    "2. HOLDINGS ENQUIRIES
    2.2 REVIEW PUBLICATION PATTERNS
    Do you wish to review the pattern of publication of the journal you are registering?
    /* Y
    Volume   Part   Registered
    9         1     31.01.80
              2     01.03.80
              3     31.03.80
              4     28.04.80
              5     30.05.80
              6     30.06.80
              7     30.07.80
              8     28.08.80
              9     29.09.80
             10     30.10.80
             11     01.12.80
             12     30.12.80
    10        1     31.01.81
              2     02.03.81
              3     02.04.81
              4     03.05.81
              5     31.05.81
              6     01.07.81
              7     30.07.81
              8     30.08.81
              9     30.09.81
             10     30.10.81
             11     30.11.81
             12     01.01.82
    Do you wish to review the publication pattern for earlier volumes?
    /* N"

    [RETURN TO CURRENT TRANSACTION (by soft key or special command)]

    "SELECTED TITLE: "AAS photo-bulletin"   LATEST ISSUE: VOLUME: 10 PART: 12
    Identify Sequence
    Please examine the part which you are registering and choose one of the following options by inputting its number:
    (1) the part is next in sequence after the one on the screen.
    (2) the part is later than next in sequence.
    (3) the part is the same as or earlier than the one on the screen.
    (4) the part is a special issue, index or supplement.
    /* 1
    The issue description of the last part registered is:
    (1) VOLUME 10
    (2) PART 12
    (3) SUB-PART
    (4) EDITION
    Please change this to the issue description of the part you wish to register by inputting the line number followed by the new entry. Press the ESCAPE key after each line. When all the details are correct input "F".
    /* 1 11
    /* 2 1
    Physical Check of Issue
    Please inspect the journal for damage or faults. Please EITHER input "D" if it is imperfect OR input "N" if it is all right.
    /* n
    Please input the number of issues which are contained in this part.
    /* 1
    You have now set up the following details. Please check their accuracy
    (1) VOLUME               11
    (2) PART                 1
    (3) SUB-PART
    (4) EDITION
    (5) START ISSUE NUMBER   1
    (6) END ISSUE NUMBER     1
    (7) DATE REGISTERED      30th January 1981
    (8) PERSON REGISTERING   Systems Analyst
    Please make any necessary changes to these details by inputting the line number followed by the new entry. Press the ESCAPE key after each line. When all the details are correct input "F".
    /* F
    This transaction is now complete."
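A system map with a "you are here" marker and an adjustable scale, as used in the dialogue above, might be sketched as follows. The tree mirrors the functions named in Fig. 6, but the data structure, the function render_map and its parameters are assumptions made purely for illustration.

```python
# Hypothetical system map: each node is (name, list of child nodes).
SYSTEM_MAP = ("Choose process required", [
    ("1 Registration", [
        ("Identify title", []),
        ("Identify sequence", []),
        ("Physical check of issue", []),
        ("Check/accept input data", []),
    ]),
    ("2 Holdings enquiries", [
        ("2.1 Find a particular issue", []),
        ("2.2 Review publication patterns", []),
    ]),
    ("3 Claiming", [
        ('3.1 Review existing "to be claimed" list', []),
        ("3.2 Make a new claim", []),
    ]),
])


def render_map(node, here, max_depth, depth=0):
    """Return the map as indented lines, cut off below max_depth.

    A smaller max_depth gives a wider, less detailed view of the system.
    """
    name, children = node
    marker = "  <-- you are here" if name == here else ""
    lines = ["    " * depth + name + marker]
    if depth < max_depth:
        for child in children:
            lines.extend(render_map(child, here, max_depth, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render_map(SYSTEM_MAP, "Identify sequence", max_depth=2)))
    print()
    print("\n".join(render_map(SYSTEM_MAP, "Identify sequence", max_depth=1)))
```

Calling render_map with max_depth=2 shows every function; max_depth=1 shows only the three main processes, which corresponds to stepping back from the representation.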
The closest approach to this facility in conventional data processing systems is the use of menus to show the operator the range of choices available to him at a particular stage of a transaction. However, menu techniques have two shortcomings. First, they give only brief descriptions of the options for the next stage of processing without putting them in the context of the whole, or at least a significant part, of the system. Secondly, menus are displayed by the system at specific stages of a transaction dialogue and are not available to the user at other times. Thus the function of menus is not to enlighten the user but to obtain control commands from him.

    "9.46pm 30th January, 1981
    SERIALS CONTROL SYSTEM. Version 1.0
    Please Login:
    /* 7
    Enter Option No. or "D" to display the options:
    /* D
    OPTION NO.   PROCESS
     1           Registration
     2           Claiming
     3           Subscription
     4           Binding
     5           New Titles
     6           Cataloguing
     7           Ordering
     8           Circulation Control
     9           Holdings Enquiries
    10           File Maintenance
    Enter Option No. or "Q" to log out:
    /* 1"
(The dialogue now proceeds with Option 1.) The user's decision about his choice of system functions could be helped significantly if this type of menu was replaced by a map representation of the available functions showing briefly what each one entails, for example, the map displayed in the last sample of dialogue.
5.3. THE MAPPING PROCESS
This, in Bobrow's (1975) scheme, is the process by which the constraints on world states are represented to the user so that he can adjust his data inputs and actions to suit them. Bobrow identifies two levels at which this takes place. The more elementary level is that of the local, small-scale processes such as the input of individual data items. The constraints on this type of process are expressible as formulae containing variables; substituting valid data for these variables gives a relationship between data items which is valid in terms of the real world (in this case the computer system's version of the real world). For example, a constraint of the system is that journal parts must be filed in sequence. This is expressed by the formula
    sequence of issues = sub-part within part within volume.
If we substitute real data for these variables then it is constrained to conform with the formula. Thus
    issue description: "Volume 6 Part 10 Sub-part 0"
comes after
    issue description: "Volume 4 Part 5 Sub-part 02"
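The issue-sequence formula lends itself to a short worked example. The sketch below assumes that issue descriptions are written exactly as in the two examples above and that each field is a whole number; the function names sequence_key and comes_after are hypothetical. Ordering by sub-part within part within volume is then simply the ordering of (volume, part, sub-part) tuples.

```python
import re

# Hypothetical parser for descriptions such as "Volume 6 Part 10 Sub-part 0".
_PATTERN = re.compile(r"Volume\s*(\d+)\s+Part\s*(\d+)\s+Sub-part\s*(\d+)")


def sequence_key(description):
    """Return (volume, part, sub_part), so that tuple order is filing order."""
    match = _PATTERN.fullmatch(description.strip())
    if match is None:
        raise ValueError(f"not a valid issue description: {description!r}")
    volume, part, sub_part = (int(field) for field in match.groups())
    return (volume, part, sub_part)


def comes_after(later, earlier):
    """True if `later` is filed after `earlier` under the sequence formula."""
    return sequence_key(later) > sequence_key(earlier)


if __name__ == "__main__":
    print(comes_after("Volume 6 Part 10 Sub-part 0",
                      "Volume 4 Part 5 Sub-part 02"))
```

Because tuples are compared element by element, the volume decides the order first, then the part, and the sub-part only when both of the others are equal.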
The higher level of representation is concerned with larger structures which provide a conceptual framework on which the elementary formulae can be hung.
For example, the formula expressing the sequence of issues is an implicit or explicit part of much larger blocks of processing which in turn interact with other blocks which together belong to the overall system as shown in Fig. 7.
[Figure: block diagram of the system. Login/logout leads to Select process, which leads to Registration (Identify title; Identify sequence; Physical check of issue; Check/accept input data), Holdings enquiries (Find a particular issue; Review publication pattern) and Claiming (Review existing "to be claimed" list; Make a new claim).]
FIG. 7. The structure of the serials control system. Modules marked * make use of the formula for issue sequence.
This distinction between elementary operations and higher level structures forms what is probably the most significant principle in the design of representation, for two reasons. First, it implies that the human operator uses two different modes of thought during mental processing. These correspond with Rasmussen's (1980) data domain and functional domain, Russell's causal reasoning and deterministic reasoning, and the low-level and high-level thinking of Bobrow & Norman (1975), referred to later in this section and in section 5.2.1. Secondly, acceptance of the distinction has implications for the entire shape and structure of the representation, rather than just some of its individual components. Almost inevitably it dictates an overall structure based on a hierarchical tree or network, with elementary processes taking place at nodes. This idea is already well established in computing with the techniques of structured programming, but as yet has not been formally applied to the representation to users of on-line processing. Because of its fundamental effect on the structure of representations the arguments in support of the model will be briefly examined. Simon (1969), restating an earlier work entitled "The architecture of complexity" (1962), examines the properties which are common to diverse kinds of complex systems.
He reviews the way in which science has classified the natural world hierarchically in order to be able to cope with its complexity, and the way in which artificial systems, e.g. empires, manufacturing processes and music, are capable of reaching a greater size and becoming more complex if they are organized hierarchically. Simon goes on to distinguish between interactions among the subsystems of hierarchies and the interactions within subsystems. In an analogy with the rare gases he points out that interactions between subsystems (intermolecular forces) are much less powerful than those within subsystems (the forces binding the molecules) so that the system can be regarded as "decomposable" into its particles. As the gas is compressed, the intermolecular forces become more significant but still not as strong as those within molecules. From this analogy he proposes a class of systems which are "nearly decomposable", i.e. in which interactions among subsystems are weak but not negligible. The significance of this proposal is that it greatly facilitates the understanding of complex systems. In interactions between subsystems, subparts of subsystems only act in an aggregative fashion so that the detail of their interaction may be ignored. In studying the interaction of two nations, we do not need to study in detail the interaction of each citizen of the first with each citizen of the second . . . . If there are important systems in the world that are complex without being hierarchic, they may to a considerable extent escape our observation and our understanding. Analysis of their behaviour would involve such detailed knowledge and calculation of the interactions of their elementary parts that it would be beyond our capacities of memory or computation. Becker (1975), concentrating on the particular case of behavioural systems, questions the prevailing practice of describing them hierarchically when they occur as strictly linear sequences in time.
He concludes that within these systems any linear sequence of events can be seen as a set of significant recurring patterns. Process-concepts naturally nest to form part-whole hierarchies, e.g. "grasping an object" is part of "picking up an object", just as "pink clouds" are a part of "sunset". Becker also makes the point that hierarchical organization is not necessarily inherent in the system being observed (which may, for example, be organized with a very high level of parallelism) but does seem to be essential as a step in a human's understanding of the system. Bobrow & Norman (1975) consider in more detail the structure of the human information-processing system. They propose a system based on memory structures comprising a set of active schemata, each capable of evaluating information passed to it and capable of passing information and requests to other schemata. Of particular interest in the design of representations is their hypothesis that the processing structure of the memory system does itself operate hierarchically. The lower levels of the hierarchy deal with "event-driven" processing, which is bottom-up, and seeks structures in which to embed sensory inputs. The higher levels perform "concept-driven", top-down processing and are driven by goals and motives. Sensory inputs are processed by the lowest possible level of the hierarchy unmonitored by conscious thought, and only passed upwards if they cannot be handled at a particular level. The higher levels of processing can continue until interrupted by an input which is not dealt with at the lower levels. For example, an experienced car driver can drive a car quite adequately with event-driven, low-level processing while
his higher-level concept-driven processing copes with a searching conversation about philosophy. However, when the driver is required to navigate a new route while driving, sensory inputs (new road signs, new scenery) need to be passed up to the higher levels to be processed. Under these conditions the serial, conscious mode of concept-driven processing cannot also cope with a demanding conversation. Bobrow & Norman summarize their theory as follows: The automatic, active schemata of memory and perception provide a bottom-up, data-driven set of parallel, sub-conscious processes. Conscious processes are guided by high-level hypotheses and plans. Thus consciousness drives the processing system from the top down, in a slow, serial fashion. Both the automatic and the conscious processes must go on together; each requires the other. The implications for the structuring of representations now begin to emerge clearly. If the human operator can process routine data in a parallel (and thus fast) "subconscious" mode then the system should allow this to happen without the need to invoke conscious processing. This implies that the various elementary operations of data processing should take place within their own self-contained (decomposed) nodes of the system. Only if non-standard situations arise should the operator need to pass upwards through the system's and his own processing hierarchies. For top-down, motive-driven processing, for example, when the operator is choosing from a range of alternative processes, the interactions between nodes in the hierarchy become of primary interest to the user and therefore must appear in the representation. The exact relationship between these two modes of information processing in humans is of concern here only insofar as it affects the design of a method of representation. Clearly, if the representation is to be valuable to the system user, and seem natural to him, it must reflect both the modes in which he works.
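The principle that routine inputs should be dealt with inside a self-contained node, and passed upwards only when they cannot be handled there, can be illustrated with a small sketch. It is an analogy in code rather than a model of human processing: the class Node, the predicates and the example inputs ("abandon", "review publication patterns") are all hypothetical.

```python
class Node:
    """Hypothetical processing node in a hierarchy of system operations."""

    def __init__(self, name, can_handle, parent=None):
        self.name = name
        self.can_handle = can_handle  # predicate: is this input routine here?
        self.parent = parent

    def process(self, item):
        """Handle item at the lowest capable level; return that node's name."""
        node = self
        while node is not None:
            if node.can_handle(item):
                return node.name
            node = node.parent        # pass the input up the hierarchy
        raise LookupError(f"no level can handle {item!r}")


# The top level accepts anything; lower levels accept only their routine inputs.
select_process = Node("Select process", lambda item: True)
registration = Node("Registration", lambda item: item == "abandon",
                    parent=select_process)
identify_sequence = Node("Identify sequence",
                         lambda item: item in {"1", "2", "3", "4"},
                         parent=registration)

if __name__ == "__main__":
    for operator_input in ("1", "abandon", "review publication patterns"):
        print(operator_input, "->", identify_sequence.process(operator_input))
```

Here the four routine options are absorbed at the bottom node, while the other two inputs travel up one and two levels respectively before they are dealt with.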
The balance between the two modes and the way in which they interact are currently the subject of speculation and controversy. Winograd (1975) identifies the two leading contenders as the declarativists (supporting a theory emphasizing the independence of processing nodes) and the proceduralists (supporting a theory emphasizing the interaction between processes). Winograd's tentative pact for the controversy is the hypothesis that both types of processing are necessary [also stated by Bobrow & Norman (1975)]. In Winograd's terminology processing is carried out by a set of hierarchically related "frames", each of which represents a class of objects (a declarative view). Classes within frames are owned by super-classes outside the frame, giving the element of interaction between processing nodes (the procedural view). Thus the concept "day" may have separate frames for different contexts such as "calendar objects" or "event sequence". Within each frame the processing elements ("IMPS") are appropriate for the particular context. While this explanation is unlikely to be the last word on the subject, it does contain some valuable ideas for schemes of representation. The facility to shift the context of a particular process or concept would be useful in many aspects of a data processing system. It is unlikely to apply to individual terms (see "Verbal mediation", section 5.1.3) which are narrowly defined for the system, but may well apply to the way in which sub-processes are regrouped depending on context. For example, in the library system the sub-routines for identifying a journal's title and holding (Dictionary Look-up and Holdings Key Look-up) may appear in the context of several other processes such as Registration and Holdings Enquiries.
A hierarchical structure of nearly autonomous modules as the backbone of a system of representation is thus strongly supported by the theories of Simon (1969), Becker (1975), Bobrow (1975), Bobrow & Norman (1975) and Winograd (1975). Equally strong practical evidence of the value of this scheme as an aid to the human problem solving process and a boost to the capacity to comprehend large, complex systems, is provided by the exponents of structured programming. The basic principles of this method are described by Dahl, Dijkstra & Hoare (1972). Their experience led them to the conclusion, first, that computing processes are too complex to be understood without hierarchical decomposition because precise thinking is only possible in terms of a small number of elements; and secondly, if a system is going to be decomposed into modules, then these must be chosen so that there is relatively little interaction between modules, otherwise it becomes impossible to write and modify a module without extensive knowledge of the others. These conclusions, which are exactly in step with the theories above, are followed by a set of practical proposals for their implementation in the design of computer programs, by Jackson (1975). Problems should be decomposed into hierarchical structures of parts, with an accompanying dissection of the programs into corresponding structures and parts. At each level of decomposition we should limit ourselves to the use of three structural forms: concatenation (sequential flow), repetition (DO WHILE or REPEAT UNTIL) and selection (IF THEN ELSE or CASE). The GO TO statement should be avoided completely or so far as possible.
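The three structural forms can be seen together in a sketch of the check/accept stage of registration (steps 6 to 8 of the process described in section 5.2.1). The function check_and_accept and the representation of operator input as a list of entries are assumptions for the purpose of illustration; the sketch is not taken from Jackson or from the library system.

```python
def check_and_accept(details, edits):
    """Apply operator edits until "F" is input, then return the accepted details.

    `details` maps field names to values, in screen order.
    `edits` is a list of operator inputs: either "F" or (line_number, new_value).
    """
    # Concatenation: the statements below are carried out in sequence.
    fields = list(details)
    accepted = dict(details)
    index = 0
    finished = False

    # Repetition: keep reading inputs until the operator is satisfied.
    while not finished and index < len(edits):
        entry = edits[index]
        index += 1
        # Selection: either accept the details or change one line.
        if entry == "F":
            finished = True
        else:
            line_number, new_value = entry
            accepted[fields[line_number - 1]] = new_value

    if not finished:
        raise ValueError('input ended before "F" was received')
    return accepted


if __name__ == "__main__":
    screen = {"volume": 10, "part": 11, "sub_part": None, "edition": None}
    print(check_and_accept(screen, [(2, 12), "F"]))
```

The loop has a single entry and a single exit, and no jump leaves it from the middle, which is the discipline the three forms are intended to enforce.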
Jackson goes on to state that the only practicable way to achieve these aims is to base the decomposition of the problem on the structure of the data being processed, so that there is a correspondence between data types and the modules. This practical advice provides a valuable development of the purely theoretical approach and has important implications for the design of representations of computer processes.
Rasmussen's (1980) work (also cited in sections 5.1 and 5.2) distinguishes clearly between the nature of operations at the different levels of hierarchical structure of human information processing. Operations at the bottom-level nodes, what he calls "the data domain", are typically based on processes such as association, recognition, matching of sets of observations to stored state models, etc.
Such data processes depend on stored data sets, state models, used as reference sets and labelled according to their operational significance as system states, properties, tasks, etc. The operations in the data domain can be performed by the high-capacity subconscious functions if all information is available in parallel and the operations can be performed directly on the presented symbols constituting a time-space configuration rather than on their functional meaning.
The operations required to navigate through the structure of the system he classifies as belonging to the "functional domain".
Operations in the functional domain are related to the internal causal processes of the system, and data presentation should be related to the structure of mental models of the functional category. Typical processes implied in operations at the functional level are those of abduction, deduction and induction (Fogel 1961). Abduction and deduction are respectively the processes of backward and forward cause-and-effect tracing through a system having known internal properties and structure. Such processes are used to generate non-observable data; to predict the response to intended actions upon the system; to predict the course of events in a system subject to known disturbances; and to select the proper means of action to reach a specific system state.
Rasmussen also gives the following advice about the structuring of tasks:
. . . the guiding principle should be that no mental task should be forced into a level of consciousness higher than the task in itself justifies (due to some inappropriate coding of information or choice of strategy in the computer). If this principle is not followed, the operator may have to time-share the main task with the extra, irrelevant task of data recoding. This is perhaps one of the great problems in conventional systems where the magnitudes of process variables--selected not only in accordance with their importance to the operator's tasks, but largely according to the availability of reliable measuring probes--are displayed individually by meters or indicators. When the system state, therefore, cannot be identified by direct perception but by the cognitive process of diagnosis, skilled operators will typically replace functional reasoning by the use of single cues as signs for system stages. This is a very efficient solution during a normal work situation, but becomes somewhat of a trap in not-seen-before situations.
He states the designer's problem in resolving this issue as being:
to match the code used in the information display to the mode of human data processing preferred in the specific task. It is not, as frequently stated, to reduce the amount of information displayed.
As a guide to a solution he quotes Newman (1966):
[it was] found that properly formatted displays allowed people to tolerate and absorb much more information than would normally be expected. There seems to be an important principle operating here, one of considerable generality. People don't mind dealing with complexity if they have some way of controlling or handling it . . . . [If] a person is allowed to structure a complex situation according to his perceptual and conceptual needs, sheer complexity is no bar to effective performance.
and comments:
This is another way of saying that inappropriate coding of information should not force a task from the level of high-capacity perceptive functions to the level of low-capacity cognitive functions.
Brooks (1977) provides another bridge between the clearly parallel developments in the theory and practice of hierarchical structuring of problems and complex processes. From detailed studies of the cognitive processes in computer programming he found that the Newell & Simon (1972) model of human problem-solving was appropriate. He concluded that structured programming methods were effective:
The beneficial effect of imposing a hierarchical cognitive organization is a fairly direct consequence of assuming an STM [Short Term Memory] with a capacity of a fixed number of units. Since the hierarchical structuring would permit more information about the program to be packed into each unit, fewer, separate units would be needed and the possibility of losing a unit, causing errors, would be reduced.
From these arguments the following implications for the design of representations emerge.
1. Two modes of operation should be catered for:
(i) Top-down, hypothesis-driven, "functional domain", e.g. for selecting the best combination of operations for achieving a particular result.
(ii) Bottom-up, data-driven, "data-domain", e.g. for processing large volumes of standard data such as periodicals to be registered.
2. Structuring of data elements and processing routines should reduce complexity by grouping sets of concepts under single headings. The maximum number of components in a group should be about seven for simple operations, or three if problem solving is to be carried out.
3. Structuring of data elements and processing routines should be consistent within the system and conform with the structure of the task being performed. Processing of standard data should not normally require incursions into "functional domain" processing.
4. The structure of grouped data elements under their heading concept should be evident.
5. Operations (at elementary level) which involve data manipulation should be represented by displays in which values or positions of data change to correspond with the actions performed on them. Functional descriptions of processes (e.g. "field updated") should not be used to represent actions at this level.
6. The interrelationships between operations at elementary and group level should be evident, in, for example, a tree structure.
7. In moving from one group of processes to another the identity of the groups must be represented. Only one entry point and one exit point should be used for any group.
8. Subprocesses of processing groups should be limited to a small, familiar range of alternatives, e.g. iterations of a single process, a selection of one of a range of alternative processes, or a sequence of individual processes. In any case, on completion of processing by component elements of groups, control should be seen to pass back to their owning block.
5.4. INFERENCE
5.4.1. Formal inference
Discussing representation systems in general, Bobrow (1975) defines formal inference as the process of deriving implicit facts from the initial set of explicit formulae, according to some fixed rules, without interaction with the real world in question. Put another way, it is the process of applying formal axioms to arrive at conclusions about the real world. As discussed in section 5.1, the user should not need to be aware of the formal inferences made by the computer programmer in creating the computer system, but does need to be able to follow the inferential processes of the real world system which it represents if he is to be able to make judgements based on the information provided by the system. For example, in the case of the library serials system records for individual journal issues are held only for fairly recent issues as the data they contain may be needed to answer suppliers' queries or to calculate arrival patterns. When issues are two or three years old the information about the issues on file is summarized by increasing a range-of-issues field to encompass the issue numbers and then the individual records are deleted. If an issue record is missing from the sequence of deleted issues then a record for the missing issue is added to the unavailable-issues file. If there is an enquiry as to whether the library holds a particular issue then the system goes through the steps shown in Fig. 8.
[Flowchart: search the individual issue records; if the issue is found, it is held by the library. If not, check the range-of-issues field; if the issue is outside the range, it is not held by the library. If it is within the range, check the unavailable-issues file; if the issue is listed there, it is not held by the library; otherwise it is held by the library.]
FIG. 8. Formal inference in the computer system. Although there may be no explicit record in the system to identify a particular issue, the system can infer that the issue is there.
Thus, although there may be no explicit record in the system to identify a particular issue, the system can infer that the issue is there. The explicit record of the issue being held by the library (the first route) is different in quality to the inference that, as there is no record of it being missing, it must be present. If the user is to be given an accurate model of the system's formal inference then the two possibilities should be presented differently:
1. "This issue is held. Its shelfmark is: 18612 d. 776".
2. "This issue is within the range of issues held and there is no record of its being missing. Its shelfmark is: 18612 d. 776".
Version 2 gives the user more room to doubt the system's information and this accurately reflects the status of the information. It may be the case that the data are unreliable, so that another inference is possible, i.e. "the issue is missing but there is no record of this". If this is so, then the two possible conclusions should be presented to the operator, perhaps with an indication of the probability for each. The question of representing doubtful data is tackled in section 5.7 "Self-Awareness".
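The inference route of Fig. 8 and the two differently worded replies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Holding`, `Basis`, `check_issue` and `describe`, and the use of plain integers for issue numbers, are hypothetical and do not come from the library system itself.

```python
from dataclasses import dataclass, field
from enum import Enum


class Basis(Enum):
    EXPLICIT = "explicit record"      # an individual issue record exists
    INFERRED = "inferred from range"  # in range, not recorded as missing
    NOT_HELD = "not held"


@dataclass
class Holding:
    shelfmark: str
    issue_records: set = field(default_factory=set)  # recent issues, one record each
    issue_range: tuple = (0, 0)                      # summarised older issues (first, last)
    unavailable: set = field(default_factory=set)    # issues recorded as missing


def check_issue(holding, issue):
    """Follow the three checks of Fig. 8 and report the basis of the answer."""
    if issue in holding.issue_records:
        return Basis.EXPLICIT
    first, last = holding.issue_range
    if first <= issue <= last and issue not in holding.unavailable:
        return Basis.INFERRED
    return Basis.NOT_HELD


def describe(holding, issue):
    """Word the reply so that an inferred holding is not presented as a fact."""
    basis = check_issue(holding, issue)
    if basis is Basis.EXPLICIT:
        return f"This issue is held. Its shelfmark is: {holding.shelfmark}"
    if basis is Basis.INFERRED:
        return ("This issue is within the range of issues held and there is "
                f"no record of its being missing. Its shelfmark is: {holding.shelfmark}")
    return "This issue is not held by the library."
```

Returning the basis of the answer separately from the wording keeps the distinction between an explicit record and an inference available to whatever produces the display.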
5.4.2. Informal inference
There is another level of inference which will inevitably take place, even with a comprehensive scheme of representation. This is described by Bobrow (1975) as the use of meta-inferential techniques. These enable the individual to find facts which are not necessarily derivable in a formal way from the set already present, but which are consistent with them and may be useful. One of these techniques is inductive inference, which uses a set of facts to form the basis for a general rule for expressing relations. An example is the use of indentation to show the grouping of data into single concepts (see Fig. 9).
[Figure: a "Date:" heading grouping the fields Day, Month and Year.]
FIG. 9. Inductive inference.
This type of relationship would not be described formally in the representation, but having been encountered a few times is likely to be taken as a general rule. Another meta-inferential technique is inference by analogy. This works on the assumption that if a new situation, Y, has certain criteria of similarity to a known situation, X, then the result of Y will be the same as the result of X. Thus, if in a process X the operations A, B and C were always performed in that order, then inference by analogy would predict that in an operation represented as having a similar structure, Y, operations P, Q and R would be performed in the same order (see Fig. 10).
[Figure: two processes, each shown as a sequence of three steps numbered 1, 2, 3.]
FIG. 10. Inference by analogy.
The significant difference between formal inference and meta-inferential techniques in the design of a scheme of representation is that the former can and should be avoided whereas the latter will inevitably occur. In other words, the naive user should not be asked to perform formal inference (beyond that inherent in the real-world system) but will naturally perform inductive inference and inference by analogy of the sort described in the preceding examples. Clearly the scheme of representation should be designed to take advantage of the users' natural inclination to use meta-inferential techniques. Thus the designer must ensure that representations of data or processes which look the same behave in the same way, and vice versa. With this support users can venture into previously untried parts of the system confident that their inductive and analogical inferences are valid.
5.5. ACCESS
This is Bobrow's (1975) fifth major issue in the design of representations and it identifies the essence of the designer's task as being to provide the right piece of knowledge at the right time. This task can be divided into two concerns: first, the philosophy of which elements of knowledge are grouped together, and second, the mechanisms which are used for access between groups. Using the nearly-decomposable system as the model for the scheme of representation, the problem of identifying and grouping the elements with strong interactions into unitary structures has to be solved. Taking a top-down view of this problem, it is also necessary to design the links, corresponding to the weaker interactions of the system, between the unitary structures. Moran (1981b) suggests principles which enable the systems designer to structure on-line computer systems so that their components correspond with the user's mental model of the tasks being performed. This methodology, based on his Command Language Grammar (CLG), takes as its starting point a user's view of the task structure as a hierarchy of tasks and subtasks. The elements within this structure are identified as:
Task entities are the conceptual objects involved in the task environment. Each entity is defined by enumerating its important properties, parts, and relations to other entities. The entities define the terms used in the statement of the tasks.
Tasks are the specific goals that the user will set for himself and will attempt to accomplish, usually with the help of the system. (Actually, a task is a goal plus an initial state.) Each task may have input parameters, an output result, and a failure condition. Other characteristics of a task may also be noted: its frequency or importance, its response time requirements, its error tolerance, etc.
Task procedures are procedures composed of tasks. When a procedure is called to do a task, the tasks in the procedure become the subtasks of that task. The procedures explicitly define the task structure hierarchy.
Task methods associate tasks with task procedures.
Note that procedures are not themselves linked to tasks; it is the methods that carry the linking information. Thus, CLG makes an explicit distinction between tasks (ends), procedures (means), and methods (means-ends links).
The initial description of the system is progressively refined through "semantic", "syntactic" and "interaction" levels, the end-product being a description of the system at the level of physical actions of the user and the system, in effect the structure of the dialogue between user and system. Moran does not specify how the initial "task level" model should be elicited from the user, but this is recognized as a problem of system analysis and should be amenable to an existing solution such as De Marco's (1978) technique of user-drawn data-flow diagrams. The size of processing nodes under the CLG model is decided by function rather than by time, but a typical operation, given as an example by Moran, would contain the following operations at the "interaction level".
"(AN INTERACTION-METHOD FOR READ-NEW-MAIL
   DO (REPEAT UNTIL (* End of MAILBOX)
        DOING (SEQ: (READ (THE CURRENT-MESSAGE) IN MESSAGE-AREA)
                    (CHOICE: (KEY: "N")
                             (KEY: "D")))))"
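For readers unfamiliar with CLG notation, the control structure of Moran's READ-NEW-MAIL example (a repetition containing a sequence, which ends in a selection) can be paraphrased in Python. This is a loose sketch, not a translation of CLG: the function name and its three parameters are hypothetical, and because the excerpt does not say what the "N" and "D" keys do, the sketch only records which key was pressed for each message.

```python
def read_new_mail(mailbox, next_key, show):
    """Paraphrase of the interaction method: repetition, sequence, selection."""
    actions = []
    for message in mailbox:          # REPEAT UNTIL end of MAILBOX
        show(message)                # READ the current message in the message area
        key = next_key()             # CHOICE: only "N" or "D" is accepted
        while key not in ("N", "D"):
            key = next_key()
        actions.append((message, key))
    return actions
```

The sketch uses only the three structural forms recommended in section 5.3, with a single entry and a single exit for the whole method.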
Some guidance of the time scale of operations at nodes is given by the "Goals, Operators, Methods and Selection rules" (GOMS) model of Card, Moran & Newell (1980a) and their "keystroke level model of user performance" (1980b). According to these models, the time taken to execute a task has two parts: (1) acquisition of the task; (2) execution of the task. During acquisition the user sets up a mental representation of the task, and during execution he calls on system facilities to accomplish it.
$$T_{\text{task}} = T_{\text{acquire}} + T_{\text{execute}}$$
The acquisition time for a unit task depends on the characteristics of the larger task situation in which it occurs. In a manuscript interpretation situation, in which unit tasks are read from a marked-up page or from written instructions, it takes about 2 to 3 seconds to acquire each unit task. In a routine design situation, in which unit tasks are generated in the user's mind, it takes about 5 to 30 seconds to acquire each unit task. In a creative composition situation, it can take even longer. The execution of a unit task involves calling the appropriate system commands. This rarely takes over 20 seconds (assuming the system has a reasonably efficient command syntax). If a task requires a longer execution time, the user will likely break it into smaller unit tasks.
From this analysis of the components of execution time it is apparent that $T_{\text{acquire}}$ (and presumably the user's effort, too) is usually at least as great as $T_{\text{execute}}$, and is therefore worth the attention of the system designer. The work of Card, Moran & Newell (1980a, b) focuses primarily on ways of improving $T_{\text{execute}}$. However, the principle in Moran's (1981b) CLG model of basing the structure of the dialogue on the user's conceptual model of the task is equally relevant to shortening $T_{\text{acquire}}$ by improving the way in which the system is represented to the user. Moran's CLG model thus provides a means of structuring the computer system in accordance with the user's conceptual model of the task. Using his methods the unitary structures at the terminal nodes of the hierarchy become apparent in terms of data objects, the processes by which they are manipulated and the tasks' objective. The requirements of the task also provide a useful guide to the links which should exist between nodal processes.
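The relative weight of the two components can be made concrete with a small calculation. The sketch below simply adds the two terms and reports the share of the total taken by acquisition; the function names are hypothetical, and the sample figures (15 seconds to acquire, 10 seconds to execute) are illustrative values chosen from within the ranges quoted above, not measurements.

```python
def task_time(acquire, execute):
    """Total time for a unit task: acquisition time plus execution time (seconds)."""
    return acquire + execute


def acquisition_share(acquire, execute):
    """Fraction of the total unit-task time spent on acquisition."""
    return acquire / task_time(acquire, execute)


# Illustrative routine design task: 15 s to acquire, 10 s to execute.
total = task_time(15, 10)          # 25 seconds in all
share = acquisition_share(15, 10)  # acquisition is three fifths of the total
```

With figures of this order, any shortening of acquisition through better representation has at least as much effect on the total as the same saving in execution.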
In many cases these will not correspond neatly with the sort of strictly functional system design which can be produced if the systems analyst is allowed to work in isolation from the user. For example, a functional approach to the design of the Library's serials control system produced the structure shown in Fig. 11.
[Figure: structure diagram with "Login and select process" at the top.]
FIG. 11. A functional description of the serials control system.
These categories of processing group highly-interrelated functions, which correspond broadly with operational divisions within the library's manual serials system, and even with the separation of physical locations at which the operations are performed.
However, an analysis based on task, rather than existing management divisions, revealed that the function of Holdings Enquiries, that is the facility to examine records of previously received issues, is used not just in response to readers' enquiries but also extensively during the Registration and Claiming processes. Consequently the paths of access through the system should be as in Fig. 12.
[Figure: structure diagram in which "Login and select process" leads to the processes, including Registration.]
FIG. 12. A task-based description of the serials system.
In practice it may not always be possible for the software to follow the user's conceptual model closely. However, the scheme of representation can and, as Moran says, should shield the user from the underlying realities of the system in these cases.
5.6. MATCHING
This is the process by which unknown structures are compared with known structures for equality and similarity, or in other words, the way in which the objects and processes which make up the users' task are fitted to their representations in the system's interface.
5.6.1. Forms of matching
Bobrow (1975) identifies three forms of matching, all of which are relevant to representations of computer data processing systems. The three forms are:
Syntactic Matches: In syntactic matching, the form of one unit is compared with the form of another, and the two forms must be identical. In a slight generalization, a unit may have variables, which can match any constant in the other. Further complications involve putting restrictions on the types of constants a variable can match. A common use of syntactic matching procedures is to find appropriate substructures by matching variables which fit into parts of larger structures. Another step in the generalization is to allow the pattern matcher to be recursive, so that the matcher is called to determine if a subpiece of a pattern matches a subpiece of a unit.
Parametric Matches: In a syntactic match, a binary decision is made. A pattern either does or does not match. In a parametric match, a parameter specifies the goodness of any match. In such a match, certain features of a pattern may be considered essential, others typical and hence probably should be there, and others just desirable in an element to be matched. A goodness parameter can account for how many of which features can be found. Ripps, Shoben & Smith (1974) hypothesize that people use a parametric match using levels of feature comparison. For example, they claim a person would classify a particular picture of an animal as a bird if sufficient features presented in that picture match those of a "typical" bird.
Semantic Matches: In a semantic match, the form of elements are not specified. The function of each element in the structure is specified; then the system must engage in a problem-solving process to find elements which can serve that function. For example, a table could be specified to be a horizontal surface on top of a support which keeps the surface at a height of about 30 inches. This does not at all specify the form of the support, which could be anything from a box to a cantilever from a wall. This type of specification, separating form from function, seems necessary to allow the flexible definitions that humans seem capable of handling.
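A parametric match of the kind Bobrow describes can be illustrated with a small scoring function. The sketch is an assumption-laden example rather than Bobrow's own formulation: the function name `goodness`, the weights given to typical and desirable features, and the bird features are all invented for illustration. Missing any essential feature gives a score of zero; otherwise the score rises from 0.5 towards 1.0 as more typical and desirable features are found.

```python
def goodness(pattern, observed):
    """Score how well a set of observed features fits a pattern (0.0 to 1.0)."""
    if not pattern["essential"] <= observed:
        return 0.0                              # an essential feature is absent
    weights = {"typical": 2, "desirable": 1}    # assumed relative importance
    total = sum(w * len(pattern[kind]) for kind, w in weights.items())
    if total == 0:
        return 1.0                              # nothing beyond the essentials to check
    found = sum(w * len(pattern[kind] & observed) for kind, w in weights.items())
    return 0.5 + 0.5 * found / total


# Hypothetical pattern for a "typical" bird.
bird = {
    "essential": {"feathers"},
    "typical": {"flies", "beak"},
    "desirable": {"sings"},
}
```

A caller would compare the score with a threshold of its own choosing, which is what distinguishes this from the binary decision of a syntactic match.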
It is a characteristic of computer data processing systems that they can only perform tasks and handle data inputs which have been anticipated in detail by their designers. Tasks and data which fall outside these limits are classified as exceptions to the system. The role of syntactic matching is to provide the means by which data inputs and processing can be located in one of the system's predefined categories or as exceptions. For example, the library system allows journals' issue descriptions to be in one of three formats:
1. Volume, Part number, Sub-part number
2. Year, Quarter, Month 1 (from), Month 2 (to), Week, Day
3. Description
The function of the representation is to enable the user who is registering the first issue of a new title to select the appropriate format for issue descriptions or to classify the transaction as an exception, not amenable to processing by the system. Parametric matching is not sufficiently precise for the needs of data input, although it does have a role in enabling the user to match his processing requirements with the representations of the processes available to him. For example, during Registration he may find it useful to look at previous issues of a particular holding, so that his processing needs match some, but not all, the facilities shown in the representation of Holdings Enquiries. Semantic matching, where function but not form of structural elements is specified, is probably most relevant to operations in which the user is trying to navigate his way through the system by referring to the abbreviated descriptions of available functions provided in the representation. Conventional "menu" representations give a good example of this:
"Enter Option No. or "D" to display the options:
OPTION NO.   PROCESS
1            Registration
2            Claiming
3            Subscriptions
4            Binding
5            New Titles
6            Cataloguing
7            Ordering
8            Circulation control
9            Holdings enquiries
10           File maintenance
Enter Option No. or "Q" to logout."
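The role of syntactic matching in locating an issue description in one of the three predefined formats, or else classifying it as an exception, can be sketched as follows. The field names, the `ISSUE_FORMATS` table and the rule that any non-empty subset of a format's fields counts as a match are assumptions made for the example; the article does not say which fields are compulsory.

```python
# The three issue-description formats listed above, with assumed field names.
ISSUE_FORMATS = {
    1: ("volume", "part", "sub_part"),
    2: ("year", "quarter", "month_from", "month_to", "week", "day"),
    3: ("description",),
}


def match_format(fields):
    """Return the number of the format that fits the supplied fields.

    A description fits a format when every field it supplies belongs to
    that format. None is returned for an exception, i.e. a description
    that fits no predefined format.
    """
    supplied = set(fields)
    if not supplied:
        return None
    for number, names in ISSUE_FORMATS.items():
        if supplied <= set(names):
            return number
    return None
```

Returning None rather than forcing a best guess keeps the exception category explicit, in line with the point that data outside the anticipated limits should be recognised as such.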
5.6.2. Uses for matching
As well as describing the different forms of the matching process, Bobrow (1975) discusses four basic purposes for which matching is used, namely classification, confirmation, decomposition and correction. An examination of these purposes suggests several implications for the design of representations. Classification is the process of matching an unknown input against a number of known patterns to find the one which it fits best. The implication of this for the design of the representation is that the criteria for inclusion in any of the classes, i.e. the known pattern, should be clearly evident to the user. In a piece of conventional system design the library system asked the user to classify newly arrived serial parts with the following dialogue:
"TITLE: "AAPG Bulletin"
LATEST ISSUE: PART: 200 SUB-PART: 2
Please examine the part you are registering and choose one of the following options by inputting its number:
(1) the part is next in sequence after the one on the screen.
(2) the part is later than the next in sequence.
(3) the part is the same as or earlier than the one on the screen.
(4) the part is a special issue, index or supplement."
This has several faults. First, it confuses classification of regular issue/special issue with classification of next/later/earlier. Secondly, it does not provide the exception option of not being able to classify the part. Thirdly, it does not suggest the criteria on which the classifications are to be made. An approach which takes account of the theory of matching in representation is as follows:
"TITLE: "AAPG Bulletin"
LATEST ISSUE: PART: 200 SUB-PART: 2
Which of the following descriptions applies to the part you are registering?
(1) a regular issue with PART SUB-PART etc.
(2) an issue, also with the title "AAPG Bulletin", but with no issue description.
(3) a journal not related to the title above.
Please input the number of your chosen description:"
The sequence of the part, if it is a regular issue, could now be treated separately if necessary (in practice the operator does not need to make decisions about the part's sequence as this can be done by the computer from the part number, etc.).
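A dialogue built on these lines can be sketched as a pair of small functions: one composes a prompt for a single classification whose options state their own criteria, and the other interprets the reply while always allowing the answer that none of the descriptions applies. The names `build_prompt`, `classify` and `UNCLASSIFIABLE`, and the wording of the final option, are hypothetical.

```python
UNCLASSIFIABLE = "unclassifiable"


def build_prompt(question, options):
    """Compose one classification question with an explicit exception option."""
    lines = [question]
    for number, text in enumerate(options, start=1):
        lines.append(f"({number}) {text}")
    lines.append(f"({len(options) + 1}) none of these descriptions applies")
    return "\n".join(lines)


def classify(reply, options):
    """Return the chosen description, or UNCLASSIFIABLE for the last option."""
    last = len(options) + 1
    if not reply.strip().isdecimal() or not 1 <= int(reply) <= last:
        # Show the form of the desired input, not merely that the input is wrong.
        raise ValueError(f"input should be a number in the range 1-{last}")
    number = int(reply)
    return UNCLASSIFIABLE if number == last else options[number - 1]
```

Questions about the sequence of the part (next, later, earlier) would be asked, if at all, through a second call, so that each prompt carries only one classification.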
The principles contained in this example can be generalized as follows.
1. Require the user to perform only one classification at a time.
2. State the criteria on which classification is to be made.
3. Provide the option of the unknown being unclassifiable.
Bobrow's second purpose for matching is confirmation that a correct classification has been made. From the users' viewpoint this is a similar operation to classification, and the same general rules will apply. Typically, confirmation of correct data inputs takes place as the final operation in processing a transaction, as in the following example.
"You have now set up the following details. Please check their accuracy:
(1) VOLUME:
(2) PART: 200
(3) SUB-PART: 3
(4) EDITION:
(5) START ISSUE NUMBER: 200
(6) END ISSUE NUMBER: 200
(7) DATE REGISTERED: 30th January 1981
(8) PERSON REGISTERING: Systems analyst"
The task of confirming the correctness of the operator's various choices would be eased if the display were broken down into groups which correspond with the groupings of the data in the "unknown", i.e. one group for the data about the journal part and one group for the rest of the data. Bobrow's third purpose for matching is for patterns with substructure to be matched against structured unknowns so that the unknown can be decomposed into subparts corresponding to those in the pattern. This process may take place at many levels, but is likely to be particularly useful in enabling users to put some shape into their vague ideas about the facilities provided by the computer system. For example, the semantic matching (section 5.6.1) provided by conventional menu displays may often provide a match which is unsatisfactory because it does not give the user a framework with which to structure his ideas about the operation of any particular function. A more satisfactory representation would show a hierarchical tree of subfunctions within each function. The general rules to be derived here are: 1. do not use semantic matching except for familiar or simple concepts; 2. if a concept can be decomposed then represent its structure when this may help the user make a decision. Bobrow's fourth purpose for matching is for the correction of earlier actions. The implication for the representation is that when the user makes a mistake it is preferable to correct him by showing him the nature of the desired solution, rather than simply telling him he is wrong. A simple example is provided by the operator inputting a month number of 21. The error message "month number out of range" does not allow the user the opportunity to match his input with the desired input. The message "month number should be in the range 1-12" is to be preferred.
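The month-number example translates directly into a validation routine whose error message states the acceptable range. This is a minimal sketch; the function name `check_month` and its return convention are assumptions.

```python
def check_month(text):
    """Validate a month number typed by the operator.

    Returns (month, None) when the input is acceptable, or (None, message)
    where the message shows the nature of the desired input.
    """
    if text.strip().isdecimal() and 1 <= int(text) <= 12:
        return int(text), None
    return None, "month number should be in the range 1-12"
```

The same message is returned for every kind of invalid input, so the operator always sees the form the correct input should take.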
5.7. SELF-AWARENESS
This is the last of Bobrow's dimensions of representation and is concerned with whether the system has "explicit knowledge of its own workings". Bobrow identifies knowledge about facts and knowledge about process as the two components of self-awareness.
5.7.1. Knowledge about facts
In representation systems in general, knowledge about facts should, if the representation is to be useful, include measures of the relevance and the expected degree of validity of the facts being represented. These measures are necessary for the individual to be able to assess and select from the mass of facts present in representations of the real world. In the case of representations of computer systems most of this filtering and selection will already have been done by the system designer. The facts present in the system can largely be assumed to be both relevant and, if the computer system is sound, valid. However, there may be a need to qualify some of the facts in the representation if they are not to mislead the user. For example, in the library serials system there is a field in the record for each journal title which indicates the frequency of publication of the title (for instance, 12 per year). For a high proportion of titles in the collection this fact is reasonably reliable for regular issues, although it may not take account of occasional special issues and supplements. At best it can only be regarded as a guide, and for the minority of irregular titles may have little value. This is in sharp contrast to the majority of facts in the system and representation which the user can treat as completely reliable. If the "frequency of publication" data is treated as reliable it could easily lead to system users making mistakes, such as claiming issues which have not in fact been published. Clearly, the representation should indicate exceptions to the general rule that facts are relevant and valid. This could be done by a simple description such as "approximate frequency of publication" or, more elaborately, by building in a reliability indicator which would be set for each title as follows:
Title                                      Frequency      Reliability of Frequency
"Computer weekly"                          52 per year    absolute
"Beyjing University Computing Review"      4 per year     very low
As a general principle of design it can be stated that the representation of facts should include an indication of reliability and relevance when they are exceptions to the implicit status of facts in the representation.
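The reliability indicator described above can be sketched as a small record structure. This is only an illustrative sketch, not the serials system's actual record layout: the class name `SerialTitle`, the field names, the function `frequency_label` and the set `QUALIFIED_LEVELS` are all assumptions introduced here, and only the two reliability values given in the example ("absolute" and "very low") are taken from the text.

```python
from dataclasses import dataclass

# Reliability values that must be shown to the user as exceptions.
# Assumption: only "very low" is listed, since the text names no other levels.
QUALIFIED_LEVELS = {"very low"}


@dataclass
class SerialTitle:
    """Hypothetical journal record holding a frequency and its reliability."""
    title: str
    issues_per_year: int
    # "absolute" stands for the implicit status of facts in the representation.
    frequency_reliability: str = "absolute"


def frequency_label(record: SerialTitle) -> str:
    """Return the frequency text for display, qualified only when it is an exception."""
    base = f"{record.issues_per_year} per year"
    if record.frequency_reliability in QUALIFIED_LEVELS:
        return f"approximately {base} (reliability: {record.frequency_reliability})"
    return base


if __name__ == "__main__":
    regular = SerialTitle("Computer weekly", 52)
    irregular = SerialTitle("Beyjing University Computing Review", 4, "very low")
    print(frequency_label(regular))
    print(frequency_label(irregular))
```

The design choice illustrated is that reliable facts are displayed without comment, so that a qualification stands out to the user precisely because it is an exception.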
5.7.2. Knowledge about process
This is well explained by Bobrow (1975):

In modelling interactions with the outside world the system needs to predict its own capabilities to plan a strategy in which information gathering cannot all be done before starting an action sequence. For example, in planning a route, it must be able to realise that at a certain intersection it will be able to look for a street sign.

There are two implications in this for the design of the representation. First, the representation must provide guidance of the kind required by the user en route. The forms this should take have already been discussed in detail in section 5.2.2, "History and Planning", in which a "reconnoitre mode" and overview maps of the system were suggested as aids to the user. The second implication is that the user must be able to rely on the availability of such aids in all parts of the system, and that the method of invoking the "street signs" should be consistent throughout.
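The second implication, that the "street signs" should be available everywhere and invoked in the same way, can be illustrated with a minimal command dispatcher. This is a sketch under stated assumptions rather than a description of any implemented system: the screen names, the command words `OVERVIEW` and `RECONNOITRE`, and the function `dispatch` are hypothetical and chosen only for illustration.

```python
# Aids that must behave identically on every screen (hypothetical command words).
GLOBAL_COMMANDS = {
    "OVERVIEW": "show overview map of the system",
    "RECONNOITRE": "enter reconnoitre mode",
}

# Screen-specific commands (hypothetical screens of a serials system).
SCREEN_COMMANDS = {
    "check-in": {"RECEIVE": "record receipt of an issue"},
    "claims": {"CLAIM": "claim a missing issue"},
}


def dispatch(screen: str, command: str) -> str:
    """Resolve a command; the global aids are checked first on every screen."""
    if screen not in SCREEN_COMMANDS:
        raise KeyError(f"unknown screen: {screen}")
    key = command.strip().upper()
    if key in GLOBAL_COMMANDS:
        # Same word, same result, whatever the current screen.
        return GLOBAL_COMMANDS[key]
    local = SCREEN_COMMANDS[screen]
    if key in local:
        return local[key]
    return "command not recognised; type OVERVIEW for a map of the system"


if __name__ == "__main__":
    for name in SCREEN_COMMANDS:
        print(name, "->", dispatch(name, "overview"))
```

Because the global aids are resolved before any screen-specific command, no individual screen can omit or redefine them, which is one simple way of guaranteeing the consistency the text calls for.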
6. Conclusions
A number of general principles for the representation of on-line computer systems to naive users have emerged in this discussion. They fall into three categories.

First, there are those which have already been applied to this, or other, aspects of computer systems design, for example the principle of structuring systems hierarchically. It may be argued that no useful purpose is served by justifying theoretically a principle which is well established in practice. However, it is worth repeating the arguments of section 1.2: if theory is not made explicit then the assumptions behind current practice remain hidden and may not be recognised as false (Mumford, 1980), and, as Moran (1981a) warns, design guidelines developed without a theoretical basis may be contradictory.

The second category of findings includes those principles which are not in current use but which are not dramatically novel, and which might easily have arisen out of simple intuition--statements of what is almost obvious, e.g. that it is better to break dialogues into small, recognisable "chunks" of processing rather than run them as continuous flows. Again, the search for a theoretical basis for findings of this type can be justified by the arguments above. In addition it can be fairly claimed that the process of a theoretical investigation has stimulated the production of ideas which may or may not have arisen out of the process of conventional systems design, and has given clear guidance (e.g. the size of "chunks") on the implementation of the ideas.

The third category of findings includes the design principles which are evidently innovative and more clearly the direct result of an examination of the theory of representation, e.g. the use of "reconnoitre" and "overview" modes. The contents of this last category provide a more exciting justification for turning to the theories of cognitive science.
Clearly, findings of this type give the systems designer insights into the functioning of the user which could only have been guessed at without the help of theory. Nevertheless, the value of the findings should not be judged on their novelty alone, but on the contribution they make to improving the representation of systems to their users. This will be established during the next stages of the research when they are tested in practical implementations. A few at least should be proved fit to be included in a corpus of reliable and well-founded principles of systems design.
The author gratefully acknowledges the help of the Science and Engineering Research Council and International Computers Ltd. in funding this research. Thanks are also due to Dr D. F. Shaw, Keeper of Scientific Books at the Radcliffe Science Library in Oxford, for making available the serials control system which forms a vehicle for the research, and to Dr David Clarke of the Department of Experimental Psychology of the University of Oxford for his comments and encouragement.
References
BECKER, J. D. (1975). Reflections on the formal description of behaviour. In BOBROW, D. G. & COLLINS, A., Eds, Representation and Understanding. New York: Academic Press.
BOBROW, D. G. (1975). Dimensions of representation. In BOBROW, D. G. & COLLINS, A., Eds, Representation and Understanding. New York: Academic Press.
BOBROW, D. G. & COLLINS, A. (1975). Representation and Understanding. New York: Academic Press.
BOBROW, D. G. & NORMAN, D. A. (1975). Some principles of memory schemata. In BOBROW, D. G. & COLLINS, A., Eds, Representation and Understanding. New York: Academic Press.
BROOKS, R. (1977). Towards a theory of the cognitive processes in computer programming. International Journal of Man-Machine Studies, 9(6), 737-752.
CARD, S. K., MORAN, T. P. & NEWELL, A. (1980a). Computer text-editing: an information processing analysis of a routine cognitive skill. Cognitive Psychology, 12, 32-74.
CARD, S. K., MORAN, T. P. & NEWELL, A. (1980b). The keystroke-level model for user performance time with interactive systems. Communications of the Association for Computing Machinery, 23(7), 396-410.
DAHL, O.-J., DIJKSTRA, E. W. & HOARE, C. A. R. (1972). Structured Programming. London: Academic Press.
DE MARCO, T. (1978). Structured Analysis and System Specification. New York: Yourdon.
DIJKSTRA, E. W. (1976). A Discipline of Programming. Englewood Cliffs, New Jersey: Prentice-Hall.
FITTER, M. (1979). Towards more "natural" interactive systems. International Journal of Man-Machine Studies, 11(3), 339-350.
GREEN, T. R. G. (1980). Programming as a cognitive activity. In SMITH, H. T. & GREEN, T. R. G., Eds, Human Interaction with Computers. London: Academic Press.
HOARE, C. A. R. (1972). Notes on data structuring. In DAHL, O.-J., DIJKSTRA, E. W. & HOARE, C. A. R., Structured Programming. London: Academic Press.
JACKSON, M. A. (1975). Principles of Program Design. London: Academic Press.
JACKSON, M. A. (1980). The design and use of conventional programming languages. In SMITH, H. T. & GREEN, T. R. G., Eds, Human Interaction with Computers. London: Academic Press.
JAGODZINSKI, A. P. (1980). Proposals for the study of the reactions of users to the introduction of an on-line computer system for serials management at the Radcliffe Science Library: a report. Radcliffe Science Library Report RSL/ENG673/9.
KENNEDY, T. C. S. (1975). Some behavioural factors affecting the training of naive users of an interactive computer system. International Journal of Man-Machine Studies, 7, 817-834.
MARTIN, J. (1973). Design of Man-Computer Dialogues. Englewood Cliffs, New Jersey: Prentice-Hall.
MILLER, G. A. (1968). The magical number seven, plus or minus two: some limits on our capacity for processing information. The Psychology of Communication. London: Allen Lane.
MILLER, L. A. & THOMAS, J. C. (1977). Behavioural issues in the use of interactive systems. International Journal of Man-Machine Studies, 9(5), 509-536.
MORAN, T. P. (1981a). An applied psychology of the user. ACM Computing Surveys, 13(1), 1-12.
MORAN, T. P. (1981b). The command language grammar: a representation for the user interface of interactive computer systems. International Journal of Man-Machine Studies, 15(1), 3-50.
MUMFORD, E. (1980). The design of computer-based information systems: an example of practice without theory: a report. Manchester Business School.
NEWELL, A. & SIMON, H. A. (1972). Human Problem Solving. Englewood Cliffs, New Jersey: Prentice-Hall.
NEWMAN, R. J. (1966). Extension of human capability through information processing and display systems. AD-645-435.
RASMUSSEN, J. (1980). The human as a systems component. Human Interaction with Computers. London: Academic Press.
SIME, M. E., ARBLASTER, A. T. & GREEN, T. R. G. (1977). Structuring the programmer's task. Journal of Occupational Psychology, 50, 205-216.
SIMON, H. A. (1969). The Sciences of the Artificial. Cambridge, Massachusetts: M.I.T. Press.
WINOGRAD, T. (1975). Frame representations and the declarative-procedural controversy. In BOBROW, D. G. & COLLINS, A., Eds, Representation and Understanding. New York: Academic Press.