Pattern classification for diagnostic purposes


OMEGA The Int. Jl of Mgmt Sci., Vol. 11, No. 4, pp. 385-394, 1983. Printed in Great Britain. All rights reserved

0305-0483/83 $3.00+0.00 Copyright © 1983 Pergamon Press Ltd

Pattern Classification for Diagnostic Purposes

DAN ZAKAY, DOV TEENI, MOSHE BEN-BASSAT
Tel Aviv University, Israel

(Received September 1982)

Behavioral information acquisition strategies in a diagnosis problem are formulated and examined and the effects on these strategies of both information presentation and user training are tested. Twenty-five subjects participated in an experiment in which the decision maker was required to solve a set of medical diagnosis problems. A matrix of 7 illnesses and 11 possible tests was presented on a micro-computer; the task was to diagnose the correct illness with the minimum number of tests. The results were analyzed using verbal protocols and a computerized record of the user's responses. The results confirm the use of certain behavioral strategies, in particular concentration on a small subset of classes in a regular pattern, and a positional effect on the choice of tests was also detected. The implications of these results on the design of management information systems are discussed.

INTRODUCTION

DECISION AIDS for classification (diagnostic) tasks have been widely discussed (e.g. [24, 13]). The purpose of these aids is to assist humans in overcoming their cognitive limitations. Although extensive reviews of the field have been provided in the past decade,² there follow a few comments which are particularly relevant to the ensuing discussion. Hogarth [11, p. 27] states that:

"The study of judgmental processes has produced at least two firm conclusions. First, man has only limited information processing capacity. Second, since the human is an adaptive organism, the nature of the judgmental task with which he is faced determines to a large extent the possible strategies he may employ for dealing with that task".

The present discussion is concerned with nontrivial problems in which the information processing capacity presents a real problem to the decision maker (DM). Individual strategies are employed in these situations and we can detect in them elements of human characteristics, such as concentrating only on part of the available data (see, for example, [17]). For given environmental conditions the DM's choice of strategy is influenced by his cognitive style and by the task characteristics. Both influences are considered in determining the elements of human characteristics in the acquisition strategies.

Levit [13] defines cognitive styles as "characteristic modes of functioning that we show throughout our perceptive and intellectual activities in a highly consistent and pervasive way". He classifies cognitive styles as described in Fig. 1. To illustrate the effect on decision support systems (DSS), we would expect, for example, that an active style will call for an active source of information which, in turn, requires that the DM actively seeks the information he desires.

FIG. 1. Cognitive styles [13]. [Diagram with poles labelled ACTIVE, PASSIVE, INTUITIVE, LOGICAL, CONCRETE and ABSTRACT.]

Moskowitz et al. [15] consider several factors that might affect human judgment. Regarding the effect of task characteristics they note that the source complexity may induce the DM to concentrate on one item only. Concentrating on part of the available data is one element of human information acquisition strategies which we shall examine.³ The amount of information acquired depends on individual styles [1]. In some cases it may be determined as a reaction to a state of stress caused by an overload of information [18].

Once the DM has concentrated on a subset of the information, he performs some manipulation on it in order to reach a decision. In a complex problem this double stage process may recur sequentially. Edwards [5] introduces the notion of conservatism, suggesting that man adjusts his beliefs in the correct direction (according to decision theory), but does so slowly in comparison with the Bayesian model. We use the notion of conservatism to consider the possibility that the DM tends to anchor around the original subset first chosen, even though decision theory may suggest a more dynamic attitude toward the subsets.

The strong relationship between cognitive styles and man-machine systems has already been established. For example Benbasat and Taylor [2] discuss the impact of cognitive styles on management information systems (MIS) and determine three relevant factors: (1) cognitive complexity, (2) field dependence or independence, (3) analytic vs heuristic. These three factors, in turn, affect both output format and, more relevant to our discussion, the use of decision aids.

To enhance acceptance of DSS we should attempt to design systems equipped with strategies compatible with human characteristics [13]; we can do so by using adaptive systems.⁴ Supplying the information in a manner that is harmonious with the DM's cognitive style when acquiring the information effects a more efficient utilization of that information (see, for example, [8]).⁵

Henceforth our discussion will be in the context of a medical classification problem in which prior probabilities of illnesses (classes) are revised by Bayes theorem according to sequential test results (each test-feature is presented as a vector of conditional probabilities; see Fig. 2 for an example). Note that the probability adjustment is calculated and presented by the computer. The DM cannot override these probabilities even though he may not accept them as valid. Note that in this experimental setup we avoid calculation mistakes and biases due to conservatism.

Regarding the specific context mentioned above two potential presentational effects must be considered. The first is the notion of concreteness; Slovic [21, p. 14] states that:

"Concreteness represents the general notion that a decision maker tends to use only the information displayed in the stimulus object and will use it only in the form in which it is displayed. Information that has to be stored in the memory, inferred from the explicit display, or transformed, tends to be discounted or ignored".

† This research was supported in part by the US Army Research Institute for Behavioral and Social Sciences, Contract No. DAJA-37-81-0065.
² See [7, 11, 12, 22, 23].
³ Several researchers have suggested that humans consider only a subset of available information [4, 17]. A conceptual framework for one type of problem is Tversky's EBA theory [26].
⁴ Adaptive systems are usually associated with systems that adapt themselves to the decision maker's cognitive style.
⁵ Ein-Dor and Segev [6] suggest that, "Users with different cognitive styles will react differently to a given MIS, thus affecting the likelihood of success" (proposition No. 9.5, see also [25]). Other researchers who report experimental results are: Bariff and Lusk [1] reach these conclusions from a set of experiments and strongly suggest the use of psychological tests to determine cognitive styles for the design of MIS; Mason and Mitroff [14] argue that various cognitive styles would require different types of information systems.

The second notion is the tendency towards verification rather than falsification. Note that the features are presented in the problem as conditional probabilities for positive results. Wason and Johnson-Laird [28] suggest a model in which increased insight is correlated with increased awareness of the importance of falsification. This model is relevant to the identification problem and, if it holds, it would follow that untrained decision makers will tend to seek evidence that supports their initial premises by way of verification, even though falsifying evidence could be more effective. Thus the preference of a feature with relatively high conditional probabilities may be due to one of the above effects rather than to a rational choice of features.

Concluding the short review above, the DM, dealing with non-trivial problems, is expected to employ individual acquisition strategies that reduce the cognitive load imposed on them. It would seem that the main characteristics of these strategies in the above context center around the following:

(1) considering a subset of the available information, usually concentrating on targets that seem important;
(2) tending to simplify the problem, e.g. considering only recent history or immediate future;
(3) effects of information presentation format are involved.

The purpose of this paper is to identify and analyze behavioral strategies used in information acquisition in the context of a sequential classification problem. If such strategies are established it may be possible to construct DSS which adapt to the user's individual style. Simultaneously, research is being carried out to incorporate elements of these strategies into analytic strategies and to test their impact on efficiency [3].⁶

The following hypotheses were constructed on the basis of a pilot experiment conducted with 15 postgraduates in Tel-Aviv University:

(1) DMs tend to restrict the amount of cognitive processes needed. Due to the limitations of the cognitive system DMs tend to restrict themselves to a limited aspect of the problem, reflected by the following forms of behavior:
    (a) when the number of classes is above a certain threshold, a limited number of them is considered,
    (b) when a large number of features is available only a limited number of them is selected for evaluation.
(2) DMs tend to consider limited history on the one hand and myopic policies on the other.
(3) In some cases DMs exhibit strong reactions to significant changes in class probabilities, in others they anchor around a preferred subset of classes.

The authors assume that these characteristics of behavioral strategies will be evident in untrained classifiers as well as trained classifiers, but that well formed individual strategies will be found only with trained classifiers. In any event these characteristics should be established beyond any presentational effects [4]. We therefore investigated two additional hypotheses:

(4) The training process will not uproot the above tendencies, moreover training will proceed easily as long as the extra cognitive load imposed is not too great.
(5) The existence of possible effects of information presentation format upon the order of feature selection and the choice of classes considered.

METHOD

Subjects

The subjects (Ss) were 25 students from Tel-Aviv University who participated in the experiment in partial fulfillment of course requirements. They consisted of two groups:

Group A: 12 second-year psychology students with a background in behavioral statistics, including Bayes' formula and its applications. Participation in the experiment was mandatory. Prior to the experiment, this group received a lecture describing the classification problem and were guided to Bayes' solution. The Ss were informed that their results would be submitted to their lecturer for assessment. This group will be termed hereafter 'Bayesians'.

Group B: 13 first-year students in an introductory psychology course. Participation in this experiment was voluntary. These Ss had no formal statistics education and did not receive a preliminary lecture as did Group A. This group will be termed hereafter 'Laymen'.

⁶ There is reason to suspect that human simplifications may result in suboptimal efficiency [26, 20].

Apparatus and experimental task

The experiment was conducted in a detached room in which the Ss were seated in front of a micro-computer PDT incorporating a VT100 terminal. A standard cassette recorder was placed on the table with its microphone attached to the Ss. The actual classification problem consisted of five missions which became progressively harder. Each mission was a sequential classification problem with seven possible classes (illnesses) and eleven available features (tests). The classes were given as prior probabilities, the features as the conditional probabilities for a positive test result for every class. The Ss were presented with two alternating screens (see Fig. 2). Each S was confronted with the same problem. The task consisted of the following decisions: (a) on which classes to concentrate, (b) in which order to select features, (c) when to stop. The problem dimensions, seven classes and eleven features, ensured a non-trivial problem [17, 9].

In order to detect any effects due to the way in which the information was presented to the Ss, the same information was presented in three different forms (form 3 is shown in Fig. 2). In all three forms the prior probabilities were drawn top down on the right-hand side. In form 1 the features ran from left to right, so that feature 11 was nearest to the prior probabilities. In form 2, the original classes 1 to 3 were renamed 5 to 7 and positioned accordingly. In form 3 the features ran from right to left. Features 5 and 7 were nearly equivalent insofar as one was the complement of the other. Feature 7 had mostly high conditional probabilities, while feature 5 had low probabilities. Preference of feature 7 may be due to a tendency to prefer affirmative evidence.

    Illness number     1       3       5       7
    Illness prob.      0.07    0.06    0.04    0.25

    Test   1           0.35    0.50    0.60    0.55
         + 2           0.10    0.05    0.11    0.02
           3           0.50    0.50    0.75    0.55
           4           0.02    0.99    0.05    0.09
           5           0.41    0.45    0.45    0.65
           6           0.75    0.82    0.20    0.99
           7           0.60    0.55    0.55    0.35
           8           0.01    0.08    0.32    0.28
           9           0.80    0.85    0.80    0.20
          10           0.95    0.45    0.95    0.45
          11           0.90    0.95    0.89    0.97

    Your options are to:
    (1) Decide now on the correct class without tests.
    (2) Choose a test for the above selected classes.
    (3) Concentrate on a different set of classes.

FIG. 2. Main screen after concentrating on classes 1, 3, 5 and 7 (translated mirror image of the original screen in Hebrew from right to left).

Procedure

Each S was given a written set of instructions, which was also played on the tape recorder. The instructions consisted of a short example made up of three illnesses and two tests. The example was explained, step by step, showing the effect of the test results on the revision of prior to posterior probabilities. The technical options were explained and the S was urged to exercise the instructions. Finally the scoring system was explained. The explanation lasted approximately 20 min. At the end of the explanation the experimenter (E) made sure the S fully understood the problem. The first task was given as a demonstration and technical questions were answered. The S proceeded to the next task only when the E was content that the Ss had reached a common starting point. The Ss were told that their target was to classify the true illness correctly, with a minimum number of tests. The scoring system is presented in the following formula:

$$\sum_{\text{missions}=1}^{5} y\,(11 - \text{number of tests used in mission}),$$

where $y = 1$ if correct classification and $y = 0$ if incorrect classification.

The S was not limited in time. The entire experiment lasted from 1 to 2 hr. A considerable monetary prize (150 IS) was promised to the three best scorers.
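The Bayesian revision carried out by the computer and the scoring rule quoted in the Procedure can be illustrated with a short sketch. It is not the program used in the experiment: the function names are hypothetical, the priors of the four displayed classes are renormalised within the subset (an assumption, since the paper does not say how the screen handled this), and the five-mission run at the end is invented purely to exercise the scoring formula. The conditional probabilities are those shown in Fig. 2.

```python
from typing import List, Sequence, Tuple


def bayes_update(priors: Sequence[float],
                 p_positive: Sequence[float],
                 positive: bool) -> List[float]:
    """Revise class probabilities after one binary test result (Bayes' theorem)."""
    # Likelihood of the observed result under each class.
    likelihood = [p if positive else 1.0 - p for p in p_positive]
    joint = [prior * like for prior, like in zip(priors, likelihood)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("test result has zero probability under every class")
    return [j / total for j in joint]


def total_score(missions: Sequence[Tuple[bool, int]], n_tests: int = 11) -> int:
    """Sum of y * (n_tests - tests used), with y = 1 for a correct classification."""
    return sum((n_tests - used) if correct else 0 for correct, used in missions)


# Values read from Fig. 2 for the four displayed classes (1, 3, 5, 7).
PRIORS = [0.07, 0.06, 0.04, 0.25]
TEST_5 = [0.41, 0.45, 0.45, 0.65]
TEST_7 = [0.60, 0.55, 0.55, 0.35]

# Assumption: renormalise the priors within the displayed subset.
subset_priors = [p / sum(PRIORS) for p in PRIORS]

# Features 5 and 7 are near complements, so a positive result on 7
# should revise the probabilities almost like a negative result on 5.
after_7_positive = bayes_update(subset_priors, TEST_7, positive=True)
after_5_negative = bayes_update(subset_priors, TEST_5, positive=False)

# Hypothetical run of five missions: (correct?, number of tests used).
example_score = total_score([(True, 3), (False, 2), (True, 5), (True, 11), (True, 0)])
```

Under these assumptions the two posterior vectors differ by less than 0.01 in every class, which is the sense in which features 5 and 7 carry nearly the same information, and a wrong classification contributes nothing to the score however few tests were used.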

Measurements

The S's performance was recorded in three ways. The first measurement was to tape record vocal protocols. These tapes were later used to perform protocol analyses. A comprehensive discussion on protocol analysis is given in [10] and its successful use is cited in [16]. Instructions to the Ss stressed the need to speak out loud while performing their mental tasks and, indeed, the E made sure that they remembered to do so [10]. To ensure the correct use of protocol analysis two steps were taken: first, the protocol analysis was cross-validated with the next two measurements (in fact, the process tracing (protocol analysis) was reexamined in the light of the input-output evidence (computer output)); second, to examine the objectivity of the protocol analysis, a sample of the protocols was reexamined by a second judge. Both judges were in full agreement.

For the second measurement all the S's man-machine interactions were coded by the computer and stored on tape. This tape enabled statistical analysis and helped decode the protocols. The use of a single user micro-computer ensured an immediate account of the S's responses.

For the third measurement a short inquiry was made immediately after the experiment. The Ss were asked to explain their performance on four issues:

(a) what he thought was a 'good' test,
(b) his use of subsets,
(c) the stopping rules he employed,
(d) his attitude towards the experiment, evaluated on a scale from one to three. This self-evaluation was compared with the E's evaluation and together produced a joint attitude scale of poor, fair and good.

RESULTS

Twenty-two tapes (out of the original 25) were converted into working sheets in order to analyze the verbal protocols. The three missing tapes were technically incomprehensible. The protocol analysis covered three issues: feature selection, the use of subsets of classes and the stopping rule employed. General remarks were noted separately (see Table 1 for part of an example).

The analysis enabled the E to categorize the method of feature selection as one of seven methods, described in order of sophistication (see Table 2). In addition, another method, termed the "hypothesis method", was often employed parallel to one of the above seven methods.

Method A is characterized by deciding on a feature solely on account of its position: for instance, choosing features from left to right. Method B is characterized by choosing some particular class and then selecting the feature with the highest probability, conditional upon the particular class chosen. In method C, the S also first chooses a particular class or a small number of classes (a subset) and then seeks a feature which either confirms a subset of high probabilities or eliminates a subset of low probabilities. Method D is similar to method C, but takes account of effects of both positive and negative results on the prior probability. In method E the S looks for a feature with maximum distance between its conditional probabilities (discrimination). Method F is similar to method E but takes into account both positive and negative results. Method G is based on the analytic strategy (in myopic policies) of distance measures, where the distances of all features (taking account of both positive and negative results) are weighted with prior probability and the most discriminating feature is chosen.

Method X, which may be used in conjunction with any of the above seven methods, has a non-Bayesian orientation. The DM first sets a hypothesis, for example, class 5 is correct. He then seeks a feature which could reject his hypothesis. If it is not rejected he chooses a more severe test and continues doing so until he is satisfied that his hypothesis holds.

TABLE 1. AN EXAMPLE OF PROTOCOL ANALYSIS ON FEATURE SELECTION

    Time    Mission   Hypothesis and recording
    37.02   1         Hypothesis: Searches for discrimination power
                      Recording:  I want a feature with distant probabilities
    39.20             Hypothesis: High positive likelihood for a certain class
                      Recording:  Feature 10 will confirm class 2
    43.29             Hypothesis: High positive likelihood for a certain class
                      Recording:  The feature with the highest probability opposite the class is feature 3. I want highest probability for the class
    47:12             Hypothesis: High positive likelihood to confirm or low positive to eliminate
                      Recording:  I will choose feature 11 because it gives most of the classes a high probability to classify the class
    48:45   2         Hypothesis: High positive likelihood to confirm or low positive to eliminate
                      Recording:  Because of the above reasoning

TABLE 2. FREQUENCY OF USE OF FEATURE SELECTION METHODS AT START AND END OF TASK, DIVIDED BY GROUPS

    Method
    A.  Relating to position.
    B.  High probabilities conditional upon a specific class.
    C.  High conditional probabilities to confirm or low probabilities to eliminate one or more classes.
    D.  As in C, taking account of both positive and negative results.
    E.  Distance between certain conditional probabilities, taking account of positive results.
    F.  As in E, taking account of both positive and negative results.
    G.  Forming a weighted distance of conditional with prior probabilities.
    X.  Hypothesis method.

                Bayesians        Laymen
    Method      Start   End      Start   End
    C           3       3        5       3
    D           1       1        1       5

    [Frequencies for methods A, B, E, F, G and X cannot be assigned to cells with confidence in this copy. Values as extracted, in order: 3 (adjacent to method B); then --, --, 1, --, --, --, --, 4, --, 1, 2, 3, 5, 4, 5.]

Table 2 indicates that more than 50% of the Ss improved their method of feature selection during the experiment. On this issue the protocol analysis agreed with the debriefs and thus added to the credibility of our analysis. Table 3 presents the same data showing that the improvement of the Ss during the sessions was evident. A McNemar test (on the rightmost column of Table 3) showed a significant result, χ² = 9.8 at P < 0.01 with 1 d.f.

Tables 4 to 6 present the results regarding the use of subsets. All but two Ss concentrated on subsets of classes on most of their missions. The range of classes in a subset was from one to six, although over 60% concentrated on two to three classes.

TABLE 3. NUMBER OF SS WHO PASSED FROM ONE STRATEGY TO ANOTHER

                Bayesians   Laymen   All
    Improved    4           7        11
    Static      4           5        9
    Receded     --          --       --
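As a rough check on the McNemar statistic quoted for Table 3, the textbook form of the test can be computed from the 'All' column. This is a sketch under stated assumptions, not a reconstruction of the authors' calculation: the paper does not give its computation, the helper name is hypothetical, and treating 'Improved' and 'Receded' as the two discordant cells is our reading of the table.

```python
def mcnemar_chi2(b: int, c: int, continuity: bool = True) -> float:
    """McNemar chi-square statistic (1 d.f.) from the two discordant counts."""
    n = b + c
    if n == 0:
        raise ValueError("no discordant pairs")
    diff = abs(b - c)
    if continuity:
        # Yates-style continuity correction.
        diff = max(diff - 1, 0)
    return diff * diff / n


# 'All' column of Table 3; the nine static Ss do not enter the statistic.
IMPROVED, RECEDED = 11, 0

# Upper 1% point of the chi-square distribution with 1 d.f.
CRITICAL_1DF_P01 = 6.635

chi2_corrected = mcnemar_chi2(IMPROVED, RECEDED)                      # 100 / 11
chi2_uncorrected = mcnemar_chi2(IMPROVED, RECEDED, continuity=False)  # 121 / 11
```

With these counts the corrected statistic is about 9.09 and the uncorrected one is 11. Neither equals the reported 9.8 exactly, but both exceed the 1% critical value, which agrees with the significance level stated in the text.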

TABLE 4. NUMBER OF CLASSES IN SUBSET

    Number of classes in subset    Frequency of users
    1-3                            4
    2-3                            10
    2-4                            2
    3-6                            2
    1-4                            2

TABLE 5. FREQUENCY OF USERS ACCORDING TO REASONS OF SUBSET FORMATION

        Reason                                                              Frequency
    A.  Does not form a subset in order not to exclude the correct class   2
    B.  Forms a subset because he functions better with fewer classes      7
    C.  Concentrates on different classes in different ways                7

TABLE 6. TECHNIQUES OF SUBSET FORMATION

            Technique                                                Frequency
    A.      2-4 extreme probabilities                                9
    B.      Class or classes with probability over some threshold   6
    A + B.  Use of both techniques                                   5
    C.      S anchors around original subset                         8

Table 7 presents the reasons used for stopping. It shows that the dominant reason for stopping the testing procedure and making a decision was the level of the probability of error, which is dominant in reasons A and B in Table 7 (16 out of 22). This stopping rule is usually used in myopic policies, but we also noted that Ss did consider historical information not included in the final class probability vector (which is sufficient for Bayesian analysis). This analysis did not take account of the division between groups, as there was no difference found nor expected.

TABLE 7. REASONING FOR STOPPING RULES

        Reason
    A.  One class has a probability above some threshold
    B.  One class has a probability above some threshold and the rest under a minimum level
    C.  A series of tests reaffirming results in a class
    D.  Roughly a fixed number of tests
    E.  Consider data not in probability vector (beyond reasons C, D)

    [The frequency column is not legible in this copy.]

A measure of performance was devised to enable order comparison of strategy efficiency. The scoring rule presented to the Ss was inappropriate for measuring performance; a good solution is one that attained low levels of probability of error, while correct classification (employed in the scoring rule) is only indirectly expected to correlate with low levels of probability of error. Performance was defined as a function of two factors: probability of error attained in the final stage and the number of tests used. Three referees individually constructed indifference charts between the two factors. These charts were then combined by trial and error into a scale ranging from one to eight sketched in Fig. 3. (The correlation between the three charts was high.)

FIG. 3. Performance map relating Pe and number of tests taken. [Chart: vertical axis 1 - Pe in bands 0-40, 40-59, 60-74, 75-84, 85-89, 90-94 and 95-100; horizontal axis number of tests, 1 to 11; regions scored on the one-to-eight performance scale.]

Based on the above performance scale we examined performance in relation to various dependent variables, measured or constructed, such as latency time, attitude, etc. The results are presented in Tables 8 to 12. In Table 8 we examine the performance of feature selection methods grouped into two categories, less sophisticated or more sophisticated, as detailed in Table 2. The more sophisticated methods of feature selection resulted, on average, in better performance. A t-test for independent samples showed a tendency toward this relationship (t (d.f. = 18) = 1.622, P < 0.10).

TABLE 8. PERFORMANCE SCORES ACCORDING TO FEATURE SELECTION METHODS

    Feature selection method      n     Average performance    Standard deviation
    Less sophisticated (B-D)      12    22.50                  28.5?5
    More sophisticated (E-F)      8     18.38                  26.73

Table 9 confirms that anchoring around a fixed subset is an element of behavior worth considering. The table indicates that anchoring improved performance (t (d.f. = 20) = 45.59, P < 0.005).

TABLE 9. PERFORMANCE SCORES ACCORDING TO WHETHER OR NOT ANCHORING WAS USED

    Use of anchoring    n     Average performance    Standard deviation
    No anchoring        14    24.21                  23.88
    Anchoring           8     16.38                  7.73

Table 10 shows no difference at all between performance of Groups A and B, i.e. Bayesians and Laymen, which was contrary to our expectations. Similarly, no difference was detected in the number of stages between groups.

TABLE 10. PERFORMANCE SCORES ACCORDING TO GROUPS

    Group        n     Average performance    Standard deviation
    Bayesians    10    21.80                  26.16
    Laymen       10    21.80                  37.00

Table 11 shows better performance for those with a more positive attitude toward the experiment. Although this result was not highly significant, a t-test showed differences at (t (d.f. = 14) = 0.84). A breakdown of performance back into stages and Pe level showed a higher average of stages for Ss having a more positive attitude; this would seem reasonable, reflecting their willingness to invest more in the experiment.

TABLE 11. PERFORMANCE SCORES ACCORDING TO ATTITUDE

    Attitude    n     Average performance    Standard deviation
    Good        6     18.83                  29.81
    Fair        10    21.50                  34.65

Table 12 presents the use of features 1 and 11 in five tasks, for every S. Each S performed one of three forms. Table 12 indicates a significant effect of position on the choice of features. In forms 1 and 2 feature 11 was farthest right and nearest to the prior probabilities; in form 3 it was farthest left. For 22 Ss we noted the number of times each feature was tested (out of a maximum of 5 missions). The use of feature 11 was found to be significantly greater when it was in forms 1 or 2, using the Kolmogorov-Smirnov test [19], χ² = 6.24 for 2 d.f., P < 0.05.

TABLE 12. FREQUENCY OF USING FEATURES 1 AND 11 ACCORDING TO THEIR POSITION

    Columns: Form 1 and 2 (11 on RHS), features 1 and 11; Form 3 (11 on LHS), features 1 and 11.
    [Per-subject rows are not recoverable from this copy. Column totals legible in the source: 16, 12 and 44; one total is illegible.]

Table 13 presents the use of features 5 and 7, which are equivalent in information content. However, feature 7 has mostly high positive conditional probabilities while feature 5 has mostly low probabilities. No difference was found in their use, neither was any positional effect detected. The latter was expected because both features were positioned, intentionally, in the center of the screen.

TABLE 13. USE OF FEATURES 5 AND 7 ACCORDING TO THEIR POSITION

    Columns: Form 1 and 2, features 5 and 7; Form 3, features 5 and 7.
    [Per-subject rows are not recoverable from this copy. Totals legible in the source: 14 and 14.]

DISCUSSION

It is evident from the data collected that, as hypothesized, subsets were frequently used to ease the decision tasks. Furthermore, it was found that Ss most frequently employed two or three classes in a subset (given the experimental setup). The reason for concentrating on a subset is not clear cut; there appear, however, to be two dominant reasons for this tendency: (1) it is easier to deal with fewer classes, (2) Ss deal with highly probable or improbable classes differently, e.g. eliminate all classes with low probabilities. The technique for including classes in the subset is straightforward, namely to include a small number of classes with either the highest or lowest probabilities, or else all classes above or below a certain threshold. No presentational effects on the choice of subsets were detected.

Beyond these rules a meaningful number of Ss (8 out of 20) exhibited a tendency to anchor around a subset of classes, e.g. to include a certain class that had already been included at the prior stages even though it would not be included at the present stage according to the above techniques. The reason for this tendency was not investigated (although earlier we suggested that it was due to conservatism). This tendency has been detected in different situations (see, for example, [27]). Note that anchoring had no negative effect on performance. However, in a set of simulations performed on a large range of problems [3], anchoring had a significant negative effect on performance.

In addition, no evidence was found to support our hypothesis that Ss would give special attention to classes that grew dramatically in their probabilities. It is possible that this tendency holds only for trained classifiers in certain tasks. Further research is required in order to explore this possibility in realistic situations.

Table 3 indicates that more than half of the Ss progressed in their method of feature selection from methods that seemed initially to be correct but were in fact wrong (methods B, C, D) or inadequate (method E), to method F which is reasonable although still inferior to method G. No S employed method G which is highly complex insofar as it requires the comparison of a number of weighted distances.⁷ It seems therefore that a distinction can be made between the low and high stages of the learning process:

(1) Progression in the range of methods A to F is easily taught. At these stages the S learned the importance of discrimination and the consideration of both results, positive and negative. However the amount of cognitive processing imposed did not appear to be drastic.
(2) Progression from method F to G would seemingly impose a high cognitive load. In method G the S has to weight a number of distances with the appropriate prior probabilities; this step is apparently not easily made.

The effective direction in building man-machine systems may well be to: first, train the DM up to a certain point that is relatively easy to learn and second, illustrate these cognitive impositions to the DM and then leave the remaining computations to the machine.

⁷ Interestingly, the use of subsets serves as an easy way out because, although the Ss were not aware of the fact, subsetting can be viewed as weighting the resultant distance vector with a binary vector of zeros and ones.
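The step from the unweighted discrimination of method E to the prior-weighted distances of method G, and the reading of a subset as a binary weight vector, can be made concrete with a small sketch. The paper does not define its distance measure, so everything below is an assumption made for illustration: the method E score is taken as the range of the positive conditional probabilities, the method G score as a sum of pairwise absolute differences weighted by the product of the priors, and all function names are hypothetical. The data are the Fig. 2 values for classes 1, 3, 5 and 7.

```python
from itertools import combinations
from typing import Callable, Dict, Sequence

# Fig. 2: priors of the displayed classes and P(positive | class) per test.
PRIORS = [0.07, 0.06, 0.04, 0.25]
TESTS: Dict[int, Sequence[float]] = {
    1: [0.35, 0.50, 0.60, 0.55], 2: [0.10, 0.05, 0.11, 0.02],
    3: [0.50, 0.50, 0.75, 0.55], 4: [0.02, 0.99, 0.05, 0.09],
    5: [0.41, 0.45, 0.45, 0.65], 6: [0.75, 0.82, 0.20, 0.99],
    7: [0.60, 0.55, 0.55, 0.35], 8: [0.01, 0.08, 0.32, 0.28],
    9: [0.80, 0.85, 0.80, 0.20], 10: [0.95, 0.45, 0.95, 0.45],
    11: [0.90, 0.95, 0.89, 0.97],
}


def spread_score(cond: Sequence[float], mask: Sequence[int]) -> float:
    """Method E-like score (assumed): range of the conditionals over attended classes."""
    vals = [p for p, m in zip(cond, mask) if m]
    return max(vals) - min(vals) if len(vals) >= 2 else 0.0


def weighted_distance_score(cond: Sequence[float],
                            priors: Sequence[float],
                            mask: Sequence[int]) -> float:
    """Method G-like score (assumed): pairwise distances weighted by the priors.

    The subset enters only as a 0/1 mask on the weights.
    """
    w = [prior * m for prior, m in zip(priors, mask)]
    return sum(w[i] * w[j] * abs(cond[i] - cond[j])
               for i, j in combinations(range(len(cond)), 2))


def best_test(score: Callable[[Sequence[float]], float]) -> int:
    """Return the test number with the highest score."""
    return max(TESTS, key=lambda t: score(TESTS[t]))


ALL_CLASSES = [1, 1, 1, 1]
CLASSES_1_AND_3 = [1, 1, 0, 0]  # hypothetical subset chosen by the DM

choice_e = best_test(lambda c: spread_score(c, ALL_CLASSES))
choice_g = best_test(lambda c: weighted_distance_score(c, PRIORS, ALL_CLASSES))
choice_g_subset = best_test(lambda c: weighted_distance_score(c, PRIORS, CLASSES_1_AND_3))
```

Under these assumed scores the unweighted rule selects test 4, which isolates class 3, whereas the prior-weighted rule selects test 9, which separates the most probable class (7) from the rest. Restricting attention to classes 1 and 3 through the mask brings the weighted rule back to test 4. The sketch only illustrates why the weighting step adds computation that is more naturally left to the machine; it makes no claim about the measure used in [3].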

Consider now a computer-based decision aid whose information acquisition strategies incorporate human heuristics and are goal oriented. Since such a system would allow the user to specify his preferred strategies for the next steps the information requests which would be generated by the system are likely to be in line with those which would have been generated by the

394

Zakav. Teeni. Ben-Bassat--Pattern Classification jigr Diagnostic Purposes

user under ideal conditions, without time constraints and environmental stress and with plenty of computational resources. Such compatibility is expected to increase acceptance of the system's recommendations and to enhance the effect of learning on the user, who will gradually adapt himself to practicing his own style in an optimal fashion.

Examining the use of features 1 and 11 on matrices 1 and 2 against matrix 3 reveals the influence of presentation on feature choice: Ss scanned the features from right to left. This is due either to the fact that Ss usually proceed from right to left (the experiment was conducted in Hebrew) or, more likely, to the possibility that Ss begin the scanning process from the prior probabilities, which are the baseline of the process. This question also requires further research.

The tendency to consider positive results was evident in the use of selection methods B, C and E. As discussed in the review, this phenomenon may be due to the human tendency towards concreteness.

The work done so far strongly suggests that DMs do employ information acquisition strategies in the face of non-trivial classification problems. The possible negative effect of these strategies on performance should be weighed against the gain in enhanced acceptance of the DSS through adaptiveness. This tradeoff should influence both the design of MIS and the training programs for DSS users. Further research is needed to determine the real gain achieved by adapting the machine to the DM's cognitive style with individually tailored acquisition strategies. This would require building adaptive DSS and testing them under real world conditions.

REFERENCES

1. BARIFF ML and LUSK EJ (1977) Cognitive and personality tests for the design of MIS. Mgmt Sci. 23(8), 820-829.
2. BENBASAT I and TAYLOR N (1978) The impact of cognitive styles on information systems design. MIS Q. 2, 43-54.
3. BEN-BASSAT M and TEENI D (1982) Human oriented information acquisition in sequential pattern classification. Israel Institute Business Research, WP No. 744/82. Submitted for publication.
4. BETTMAN J and KAKKAR P (1977) Effects of information presentation format on consumer information acquisition strategies. Jl Consumer Res. 3, 233-240.
5. EDWARDS W (1968) Conservatism in human information processing. In Formal Representation of Human Judgement (Ed. Kleinmuntz B). John Wiley, New York.

6. EIN-DOR P and SEGEV E (1981) A Paradigm for Management Information Systems. Praeger, New York.
7. EINHORN HJ and HOGARTH RM (1981) Behavioral decision theory: processes of judgment and choice. A. Rev. Psychol. 32, 53-88.
8. GRACE GL (1966) Application of empirical methods to computer based system design. Jl appl. Psychol. 50, 442-450.
9. HAYES JR (1962) Human data processing limits in decision making. US Dept. Commerce, AD 283384, July.
10. HAYES JR (1981) Issues in protocol analysis. In Proceedings Conference on New Directions in Decision Making, March. Oregon University (in preparation).
11. HOGARTH RM (1975) Cognitive processes and the assessment of subjective probability distributions. Jl Am. statist. Assoc. 70, 271-294.
12. KAHNEMAN D, SLOVIC P and TVERSKY A (1982) Judgement under Uncertainty. Cambridge University Press, Massachusetts.
13. LEVIT RA (1974) Development and application of a decision aid for tactical control of battlefield operations, Vol. 1. Honeywell Inc.
14. MASON RO and MITROFF II (1973) A program for research on management information systems. Mgmt Sci. 19, 475-487.
15. MOSKOWITZ H, SCHAEFER RE and BORCHERDING K (1976) 'Irrationality' of managerial judgements: implications for information systems. Omega 4(2), 125-140.
16. PAYNE JW (1976) Task complexity and contingent processing in decision making: an information search and protocol analysis. Hum. Perform. Organ. Behav. 16, 366-387.
17. REED SK (1972) Pattern recognition and categorization. Cog. Psychol. 3, 382-407.
18. SHAPIRO HB (1974) Crisis management: psychological and sociological factors in decision making. Special Report Contract No. N00014.75, ARPA Order No. 2819.
19. SIEGEL S (1956) Nonparametric Statistics for the Behavioral Sciences, pp. 127-136. McGraw-Hill, New York.
20. SIMON HA (1957) Models of Man. John Wiley, New York.
21. SLOVIC P (1972) From Shakespeare to Simon: speculations and some evidence about man's ability to process information. ORI Res. Monogr. 12(2).
22. SLOVIC P, FISCHHOFF B and LICHTENSTEIN S (1977) Behavioral decision theory. A. Rev. Psychol. 28, 1-39.
23. SLOVIC P and LICHTENSTEIN S (1971) Comparison of Bayesian and regression approaches to the study of information processing in judgment. Hum. Perform. Organ. Behav. 6, 649-744.
24. STEEB R, WELTMAN G and FREEDY A (1976) Man machine interaction in adaptive computer aided control. Proceedings NATO symposium on monitoring behavior and supervisory control. Berchtesgaden, West Germany.
25. SWANSON EB (1982) Measuring user attributes in MIS research: a review. Omega 10(2), 157-165.
26. TVERSKY A (1972) Elimination by aspects: a theory of choice. Psych. Rev. 79, 281-299.
27. TVERSKY A and KAHNEMAN D (1974) Judgment under uncertainty: heuristics and biases. Science 185, 1124-1131.
28. WASON P and JOHNSON-LAIRD PN (1972) Psychology of Reasoning: Structure and Content. Harvard University Press, Cambridge, Massachusetts.

ADDRESS FOR CORRESPONDENCE: Dr Dan Zakay, Department of Psychology, Tel-Aviv University, Ramat Aviv, 69 978, P.O. Box 39040, Israel.
