Human factor assessment of the legibility of five numeric visual displays


Applied Ergonomics, 4.3, 144-149

B. Copping, V. D. Alexander and Jacqueline J. Hunter
Post Office Research Department, Dollis Hill, London.

Experiments have been carried out, designed to find, on human factor considerations alone, which of five available displays would be most suitable for the next generation of Post Office telephone exchange switchboard.

Telephone exchange switchrooms which use manual boards with tall multiple jack fields and cords tend to be dark and noisy. The introduction in 1956 of the experimental Cordless Switchboard No 1 (CSS1) (Missen, Snook and Mitchen, 1955) gave the Post Office the opportunity to create office-type environments for telephonists. The experimental CSS1 system (Cheesebrough, 1961) design has been developed (Anon; Wyeth and Dickinson, 1969; Heptinstall and Keitch, 1969) and is now installed in about 18 exchanges, with from 24 to 96 operator positions. A further 50 are planned, including new Auto-Manual Centres for London (Crooks, 1971).

Changes in technology, system organisation, traffic patterns and customers' requirements for facilities have led to consideration of a new generation of cordless switchboard, designated CSS2 (Heptinstall and Frame, 1970; Anon, 1971). The main feature of the proposed system is that it will use stored programme control. These computer-type techniques, incorporating a visual display system, should remove the need to write billing information, and so streamline switchboard operation.

The experiments described later were aimed at discovering which of five numerical displays was the most legible to telephonists sitting at a switchboard with a display behind the keyshelf. The whole arrangement could be considered a good approximation to an office desk, and hence the results may be more generally applicable.

The limitation to numerics removed the majority of perceptual confusions which could occur due to visual similarity of form (0 and O, or B and 8) and the confusions which arise in memory due to the acoustic similarity of names (B, C, D and E) within a larger set of symbols. This limitation restricts the application of the results of the experiments; however, the authors can see no objection to a similar experimental approach using alphanumerics, provided a similar application were to be made of the alphanumeric data displayed.

Based on a paper presented at the 6th International Symposium on Human Factors in Telecommunications.

Applied Ergonomics September 1973

Displays

The displays were basically commercial items:

(a) A display composed of discrete indicators 20 mm x 13 mm, each of which operated on a projection principle, ie, an individual lamp and mask was selected for each character for projection onto a screen (henceforth designated type A).

(b) A display comprising two small cathode ray tubes each equipped with 10 electron guns. The beam from the selected gun, when deflected through an etched metal mask, gave the required character form. By the use of multiplexing techniques each tube was capable of displaying up to four characters simultaneously across the face of the phosphor screen (type B).

(c) An 80 mm x 170 mm cathode ray display, employing a laminar beam tube and character generating circuitry, with the capability of displaying up to 576 stroke formed characters. Stroke lengths of 0.65 mm or 1.3 mm for horizontal or vertical lines and 1.45 mm or 1.84 mm for diagonals were drawn on a 5 x 4, 5 x 5 or 5 x 7 dot matrix depending on the character required (type C).

(d) A display which used a multi-cell flat discharge tube. The cells form a ? x 192 matrix with each column having a common cathode, and each row a common rear and front anode. Potentials applied to a particular row and column cause a gas discharge at the crosspoint. Alphanumeric characters are formed on a 7 x 5 matrix of crosspoints. Up to 8 rows of 32 characters per row can be accommodated on the 80 mm x 190 mm viewing area (type D).

(e) A raster scanned television type unit on which a flicker-free display is provided by placing the normally interleaved scan of the 625 line set on top of the primary scan, effectively giving a 3000 line display at twice the normal refresh rate. Character size and spacing can be varied over a wide range, subject to a maximum capacity of 28 rows of 80 characters on the 125 mm x 100 mm viewing area (type E).

For easy comparison, the important details are shown in Table 1.

Table 1 Experimental displays

Display  8-digit sequence  Character  Character  Colour      Character   Background
type     length            height     width                  luminance   luminance
A        130 mm            9.0 mm     6.5 mm     White       170 cd/m²   17 cd/m²
B        86 mm             8.9 mm     4.5 mm     Green       240 cd/m²   17 cd/m²
C        60 mm             5.3 mm     4.0 mm     Green       170 cd/m²   14 cd/m²
D        50 mm             6.6 mm     4.6 mm     Red-Orange  170 cd/m²   17 cd/m²
E        27 mm             4.0 mm     3.0 mm     Green       170 cd/m²   14 cd/m²
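The paper reports character and background luminance but not contrast. As a minimal sketch, the luminance figures in Table 1 can be turned into a luminance ratio and a simple contrast value. The two formulas used here (character/background, and (character - background)/background) are common conventions chosen for illustration and are an assumption, not a measure used by the authors; the function and variable names are hypothetical.

```python
# Character and background luminance in cd/m^2, transcribed from Table 1.
LUMINANCE = {
    "A": (170, 17),
    "B": (240, 17),
    "C": (170, 14),
    "D": (170, 17),
    "E": (170, 14),
}


def luminance_ratio(character, background):
    """Ratio of character luminance to background luminance."""
    return character / background


def contrast(character, background):
    """Contrast as (character - background) / background (assumed definition)."""
    return (character - background) / background


for display, (character, background) in LUMINANCE.items():
    ratio = luminance_ratio(character, background)
    value = contrast(character, background)
    print(f"Display {display}: ratio {ratio:.1f}:1, contrast {value:.1f}")
```

On these figures the five displays fall within a fairly narrow band of luminance ratios, which suggests that the comparison is driven more by character size, spacing and construction than by contrast.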

Table 2 Allocation of subjects and sequences

Experiment            Display  Subjects                             Number of  Digits per  Visual
                      type     Total  Telephonists  Research staff  sequences  sequence    noise
1 Keying              A        20     20            -               60         8           absent
                      B        20     20            -               60         8           absent
                      C        20     20            -               60         8           absent
2 Keying with noise   A        42     25            17              60         10          present
                      C        42     25            17              60         10          present
                      D        44     25            19              60         10          present
                      E        44     25            19              60         10          present
3 Reading             A        19     -             19              60         10          present
                      C        19     -             19              60         10          present
                      D        19     -             19              60         10          present
                      E        20     -             20              60         10          present
4 Comparison          A        15     -             15              120        10          absent
                      C        15     -             15              120        10          absent
                      D        15     -             15              120        10          absent
                      E        15     -             15              120        10          absent

Experiments

General

The experiments conducted fell into three categories in which the subjects were asked to:

(a) key telephone numbers displayed
(b) read aloud the displayed numeric sequences
(c) compare two groups of digits and decide whether they were identical or not.

All three experimental methods were used for each display. Keying, reading and pattern comparison times of the subjects, and the accuracy with which they performed, were the measures used in assessing the readability of the displays. All experiments were conducted in a room with no natural light but controlled artificial illumination of 300 lux at desk height.

A total of 369 subjects were used in the tests: 160 telephonists and 209 members of the Post Office Research Department staff. Table 2 shows the allocation of subjects to the experiments, a separate block of subjects being used for each. This approach posed considerable problems in matching the groups but avoided the problem of asymmetrical transfer effects which can arise in factorial designed experiments. Columns 6 and 7 of the table give the total number of sequences presented to each subject and the number of digits per sequence respectively, while column 8 denotes whether or not 'visual noise' was present. Visual noise took the form of two rows of figures, one above and one below the displayed number; the intention was to make the numbers more difficult to read by distracting the subjects, thus accentuating any differences in legibility which may exist between the displays.
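The group sizes in Table 2 can be cross-checked against the totals quoted in the text (369 subjects, 160 telephonists, 209 research staff). The short sketch below does that arithmetic. It assumes that the groups in Experiments 3 and 4, for which the table gives only one subject column, were drawn from research staff; that reading is an inference from the quoted totals rather than something the paper states, and the dictionary name is hypothetical.

```python
# experiment -> display -> (telephonists, research staff), transcribed from Table 2.
# Assumption: Experiments 3 and 4 used research staff only.
ALLOCATION = {
    1: {"A": (20, 0), "B": (20, 0), "C": (20, 0)},
    2: {"A": (25, 17), "C": (25, 17), "D": (25, 19), "E": (25, 19)},
    3: {"A": (0, 19), "C": (0, 19), "D": (0, 19), "E": (0, 20)},
    4: {"A": (0, 15), "C": (0, 15), "D": (0, 15), "E": (0, 15)},
}

telephonists = sum(t for groups in ALLOCATION.values() for t, _ in groups.values())
research = sum(r for groups in ALLOCATION.values() for _, r in groups.values())
total = telephonists + research

print(f"telephonists: {telephonists}, research staff: {research}, total: {total}")
```

Under that assumption the sums come to 160 telephonists and 209 research staff, 369 in all, in agreement with the figures given in the text.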

Subject matching tests

In allocating subjects to balance the experimental groups, the following factors were taken into account:

(a) sex and age
(b) left or right handedness
(c) visual acuity
(d) digit span

Each subject's age and keying experience were established verbally, but the last three items were quantified by special tests.

Handedness test

Subjects were asked to arrange a row of randomly presented numbered blocks into either an ascending or descending order using each hand. Four timed trials covering all combinations of hand and order were completed for each subject.

Visual acuity test

Two simple tests were performed in which subjects were asked to read:

(a) a Snellen test card at a reduced viewing distance of 30 in, with the size of the card reduced pro rata
(b) a passage of randomised upper case letters, crossing out as many M's, C's and T's as they could in two minutes.

Digit span test

The subjects were asked to repeat a series of digit sequences back to the experimenter from memory, starting with five digit sequences and progressing to a maximum of 12 digits per sequence. Three successive wrongly repeated sequences were taken as signifying the upper limit of a subject's digit memory span.
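The stopping rule of the digit span test can be sketched as a small scoring function. The paper does not say how many sequences were given at each length, nor exactly which value was recorded as the span, so the sketch makes two assumptions: trials arrive as (length, correct) pairs in presentation order, and the span is the longest correctly repeated sequence before three successive failures. The function name and the sample trials are hypothetical.

```python
def digit_span(trials, max_failures=3):
    """Score a digit span test.

    trials: iterable of (sequence_length, repeated_correctly) pairs in the
    order presented. Testing stops at `max_failures` successive failures.
    Returns the longest correctly repeated length (0 if none was correct).
    """
    span = 0
    successive_failures = 0
    for length, correct in trials:
        if correct:
            span = max(span, length)
            successive_failures = 0
        else:
            successive_failures += 1
            if successive_failures == max_failures:
                break  # upper limit of the subject's span reached
    return span


# Hypothetical record for one subject, starting at five digits.
example_trials = [
    (5, True), (6, True), (7, True),
    (8, False), (8, True),
    (9, False), (9, False), (10, False),
    (11, True),  # never presented: the test has already stopped
]
print(digit_span(example_trials))
```

A single failure followed by a success resets the count, so only an unbroken run of three failures ends the test.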

Experimental procedure

The block diagram (Fig 1) shows the basic layout for all experiments. Operation of the clear-down key caused the computer to output a digit sequence to the display under test. The displays, mounted in a mock-up of a Post Office cordless switchboard, were suitably interfaced to the computer outputs. In the keying experiments, subjects seated at a switchboard keyed the displayed digits on a keyset mounted on the right hand side of the keyshelf. The keyset was connected to the computer inputs via an interface which included contact bounce eliminating circuits.

Fig 1 Basic layout for experiments (block diagram linking the computer in the computer room to the cordless switchboard mock-up in the subject room)

In the reading experiment, subjects read aloud the digits displayed. Immediately on completion of reading, both subject and the experimenter pushed buttons which the computer interpreted as 0 and 1 respectively. The experimenter's action was designed to give corroborative evidence of a somewhat imprecise measure.

Subjects in the comparison experiment were asked for a 'same' or 'different' decision on two simultaneously displayed groups of digits. The 'same' button was recognised as digit 1 and the 'different' button as 0.

On completion of each number, subjects in all three experiments were asked to operate the clear-down key, which had the effect of erasing the display. After an inbuilt delay of five seconds, a further number was then displayed from the computer store. A complete time-event history of all keying and control operations was punched on paper tape, enabling a detailed analysis to be made later.

Times and errors were used as a measure of legibility. The important time measures calculated were:

(a) Stimulus Time: the time from commencement of the number display to the time of the first key operation
(b) Overall Time: the time from commencement of the display to the time of operation of the clear-down key after completion of keying one number.

Errors were calculated on the basis of the number of 8 or 10 digit sequences which contained at least one digit error.

Experimental results

Experiment 1

Table 3 shows the overall mean time and error rates obtained from the first keying experiment. An analysis of variance calculated from each subject's mean performance showed no significant difference between the three displays, although a slight advantage of display C was apparent.

Table 3 First experiment mean time and error rate

Experimental  Display  Stimulus    Overall     % Errors
task          type     time (sec)  time (sec)
1 Keying      A        2.19        8.44        8.5
              B        2.23        8.38        8.3
              C        2.11        7.95        6.6

Experiment 2

This was a repeat of Experiment 1, which used ten digit numbers with visual noise, providing extra stress on the subjects. The results are shown in Table 4. Equipment to provide the noise facility was not available for display B, and the display was omitted from all further experiments.

Table 4 Results of Experiment 2

Experimental         Display  Overall     99% Confidence  % Errors
task                 type     time (sec)  limits
2 Keying with noise  A        10.49       0.13            9.2
                     C        10.21       0.17            9.0
                     D        9.95        0.12            8.2
                     E        10.34       0.15            8.5

The level of statistical significance (Copping, Alexander and Hunter, 1971) of these results was raised by changing the form of analysis. The calculations for these and all the remaining results were on the basis of the individual keyed number times and not, as in Experiment 1, on the basis of subject mean performance. Statistical significance was taken at the 95% level, highly significant at the 99% level and very highly significant at the 99.9% level. Column 4 of Table 4 shows the 99% confidence limits for each display. Display D was highly significantly better than the other displays; only C was significantly better than the remaining display A. Furthermore, when the subjects were divided into the categories (a) telephonists with keying experience, (b) telephonists without keying experience, and (c) research department staff, it was found that differences were generally consistent over each category.
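The paper does not state how the 99% confidence limits were computed. As an illustration only, the sketch below derives a mean and a confidence half-width from a set of individual times using a normal approximation (mean plus or minus z times the standard error). The method, the function name and the sample times are all assumptions; the times are invented and are not data from the experiments.

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_limits(times, level=0.99):
    """Return (mean, half_width) for a normal-approximation confidence interval."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two observations")
    # Two-sided interval: put (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * stdev(times) / math.sqrt(n)
    return mean(times), half_width


# Hypothetical individual keying times in seconds (illustrative values only).
sample_times = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.3]
centre, half_width = confidence_limits(sample_times)
print(f"mean = {centre:.2f} s, 99% limits = +/- {half_width:.2f} s")
```

Because the half-width shrinks with the square root of the number of observations, treating each keyed number as an observation, rather than each subject's mean, narrows the limits considerably. That is one plausible reading of why the change of analysis raised the level of significance.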

Experiment 3

It was felt that the variance of the keying operation could have masked differences in display legibility, so a simple reading test was devised to test this hypothesis. The results are shown in Table 5 in terms of mean reading time per sequence.

Table 5 Keying operation and display legibility

Experimental  Display  Overall     99% Confidence  % Errors
task          type     time (sec)  limits
3 Reading     A        4.45        0.10            1.85
              C        4.99        0.13            1.31
              D        4.78        0.09            0.79
              E        3.94        0.07            1.84

Considering the overall reading times, Display E was highly significantly better than all other displays. A was highly significantly better than the remaining pair and D was significantly better than C.

Experiment 4

The mean scanning reaction time for each display is shown in Table 6. The errors have been calculated on wrong decisions. Display E was once again the best, and very highly significantly so. The differences between the remainder were not significant. The differences in the error rates were not statistically significant, but Table 7 shows the principal errors or confusions, based on the percentage of different sequences undetected per occurrence of the first digit (column 3) and per occurrence of the confused pair (column 4).

Table 6 Mean scanning reaction time

Experimental           Display  Overall     99% Confidence  % Errors
task                   type     time (sec)  limits
4 Scanning comparison  A        2.36        0.04            1.8
                       C        2.33        0.04            1.6
                       D        2.32        0.04            0.9
                       E        2.01        0.05            3.0

Table 7 Principal errors or confusions

First sequence  Second sequence  % Error per first  % Error per occurrence
digit           digit            digit occurrence   of confused pair
4               7                0.25               15.0
2               9                0.19               11.7
8               6                0.14               8.3
9               5                0.11               6.6
5               8                0.08               5.0
9               6                0.08               5.0
0               4                0.08               5.0
4               1                0.14               4.1
1               5                0.11               3.3
9               0                0.11               3.3

Discussion

In general, the error rates for the various tasks were within the range expected of the particular type of laboratory task concerned. The differences in error rates between the displays cannot be regarded as statistically significant, although Display D had the lowest error rate in all types of the experiments.

The most interesting result worthy of examination was that Display D was best for a keying type task and Display E for the reading and scanning types of task. Locational difficulties between the position of the keys on the keypad and the user's place within the displayed number were undoubtedly exaggerated by the need of some subjects to flick their eyes from one to the other. Any hesitation to find either the appropriate place in the number or the appropriate key should have increased the likelihood of memory error and slowed down the task. Indeed, it could be argued that a fatigue effect should show itself earlier on the display where place location was a larger problem.

Table 8 shows the constituent parts of the overall mean keying times for each display for Experiment 2. The small differences between the figures in column 5 of Table 8 and those shown in column 3 of Table 4 are due to the inclusion of all the events up to an error within a sequence, which were omitted from the earlier calculation. The figures in Table 8 indicate that Display E was initially scanned more quickly than the other displays but had a higher mean interdigit pause than the other displays; this suggests that subjects found it more difficult to locate their place in the display. The shorter interdigit time on Display D suggests that the subjects were more confident that the right key had been pressed; also they did not tend to keep it depressed while beginning to look for the next digit to be keyed.

Table 8 Mean keying times

Display  Stimulus  Button depress  Interdigit  Overall
         (sec)     (msec)          (msec)      (sec)
A        2.437     166             619         10.519
C        2.360     160             619         10.309
D        2.241     135             608         9.98
E        2.135     155             646         10.309

An inspection of the mean time event histograms for each display, Fig 2 for A, Fig 3 for C, Fig 4 for D and Fig 5 for E, is relevant. All reflect the telephone number format 4 + 6, with the tendency of subjects to group the numbers in twos and threes. The last 'interdigit pause' was the time taken between the release of the button and the operation of the clear-down key. Display C showed a greater division of the last six digits into a pair of threes than A. In general, the length of Display A and its large characters seem to have encouraged subjects to plod along at a steady rate. Displays D and E show a greater tendency to a 4 + 3 + 3 format, but Fig 4 reflects a generally quicker pace for D than E (Fig 5) except for the sixth button depression. There was a general quickening towards the ends of the numbers on all displays, no doubt because the end of a number was near and the remaining digits could be held in memory.

Fig 3 Histogram for Display C (time against digit order)

Fig 4 Histogram for Display D (time against digit order)

The keying of 60 ten-digit numbers may be expected to show some short-term fatigue effects which may be projected as reflecting possible long-term fatigue effects. A series of curves were plotted against the order in which the numbers were keyed, and quadratic curves fitted. The Stimulus time curves, which are not included here, showed very little except a possible tendency for the Stimulus time to increase earlier on Display E than D (number order 34 against 41), but the goodness of fit of the curves was very poor. Similarly, button depress time curves were poor fits statistically, but again the Display E curve started to rise early compared with the other displays (not shown). The mean interdigit curves are more reliable; Fig 6 shows that for Display D and Fig 7 that for Display E. The first four numbers have been omitted from the curve fitting because they may be regarded as representing the learning part of the task. The minimums on Figs 6 and 7 correspond to number order 41 and 50 respectively, which compare with 51 and 45 for the respective displays A and C. The actual curves in the figures show D to be consistently faster than E, but the two displays show a smaller difference after a 'training effect' due to the previous 40 to 50 numbers. Hence increased fatigue did not operate against Display E.
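The curve-fitting step described above (a quadratic fitted to mean time against number order, with the first four numbers omitted and the position of the minimum read off) can be sketched as an ordinary least-squares fit. The paper does not give its fitting procedure, so the normal-equations approach below is an assumption, the function names are hypothetical, and the data are synthetic values built around a known minimum, not the experimental results.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    centre = sum(xs) / len(xs)
    us = [x - centre for x in xs]  # centre x to keep the sums well scaled
    s = [sum(u ** k for u in us) for k in range(5)]
    t = [sum(y * u ** k for u, y in zip(us, ys)) for k in range(3)]
    matrix = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(matrix)
    solved = []
    for col in range(3):  # Cramer's rule, one coefficient per column
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solved.append(det3(replaced) / d)
    a, b_u, c_u = solved
    # Convert from the centred variable u = x - centre back to x.
    b = b_u - 2 * a * centre
    c = a * centre ** 2 - b_u * centre + c_u
    return a, b, c


def minimum_position(a, b):
    """x position of the minimum of a*x**2 + b*x + c (requires a > 0)."""
    if a <= 0:
        raise ValueError("curve has no minimum")
    return -b / (2 * a)


# Synthetic interdigit times (msec) with a minimum placed at number order 41.
orders = list(range(5, 61))  # numbers 1 to 4 omitted as the learning phase
times = [600 + 0.05 * (n - 41) ** 2 for n in orders]
a, b, c = fit_quadratic(orders, times)
print(f"fitted minimum near number order {minimum_position(a, b):.1f}")
```

With real, noisy data the position of the minimum is only as trustworthy as the fit itself, which is consistent with the authors' caution about the poorly fitting Stimulus time and button depress curves.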

Fig 2 Histogram for Display A (time against digit order)

Conclusions

It is very difficult to explain why Display E is best for a reading task and Display D for a keying task. The poorer performance of E in the latter can only be attributed to locational difficulties, in which the important factors may be the smallness of the character, the smallness of the spaces between characters, and the possible tendency of the horizontal line construction to carry the eye forward when scanning. The authors favour the second factor, putting the first as improbable, but clearly further study is required on optimum human requirements from visual display systems to suit the task concerned. The overall conclusion of this set of experiments is that the choice of display is clearly dependent on the task for which it is to be used.


Acknowledgement


The authors wish to thank the Director of Research of the United Kingdom Post Office for permission to publish the information contained in this paper.

Fig 5 Histogram for Display E (time against digit order)

Fig 6 Interdigit mean time for Display D (interdigit mean time against number order)

Fig 7 Interdigit mean time for Display E (interdigit mean time against number order)

References

Anon, The Cordless Switchboard No 1, Post Office Telecommunications (CHQ/PRD)

Anon 1971, Post Office Telecommunications Journal, 23.3, Autumn, The Cordless Revolution

Cheesbrough, J. 1961, Post Office Electrical Engineers Journal, 54, p 201, Oct. Stafford Cordless Switchboard

Copping, B., Alexander, V.D., and Hunter, J.J. 1971, Human Factor Assessment of some Numeric Visual Displays

Crooks, K.R. 1971, The Post Office Electrical Engineers Journal, 63.4, Jan. A New Generation of Auto-Manual Centres for London

Heptinstall, D.L., and Frame, P.B. 1970, A.M.C. Cordless Switchboards. Paper presented to Institution of Post Office Electrical Engineers

Heptinstall, D.L., and Keitch, E.H. 1969, Post Office Telecommunications Journal, 21.2, Summer, The Queuing System for the New Cordless Switchboard

Missen, L.A., Snook, R.A., and Mitchen, E.J.T. 1955, Post Office Electrical Engineers Journal, 48, p 102, July, Thanet Cordless Switchboard

Wyeth, F.A., and Dickinson, C.G. 1969, Post Office Telecommunications Journal, 21.2, Summer, The New Cordless Switchboard