Evaluation
S A GREGORY
22 Crescent Road, Stafford, Staffs ST17 9AL, UK
Synthesis and evaluation are the natural driving couple within design work. This is shown by experience as well as by field and laboratory study. Attempts have been made to derive a foundation for evaluation in terms of objectives or procedures. Such attempts range from Aristotle to Hall, the latter arguing for systems engineering. Although he made an apparently comprehensive approach he did not cover some possible sources of insights nor did he make useful suggestions about the application of principles to the detailed work of design. Design has now to deal with even more complex situations and perhaps there should be a policy for evaluation. Possible policies are discussed.
Keywords: evaluation, decision making, analysis
Practising designers take an activity, here called evaluation, in all its aspects, as a central part of their work. Either they are weighing up alternative ways of doing something, or some other person is trying to see what the designed object means to him. Designers seem to be solving the immediate and direct problems of the design upon which they are working. To some, a suggestion that, somehow, what they are doing involves matters of policy seems almost an irrelevance. They are trying to get a design worked out; they are trying to produce an artefact which will be successful; but what further? In everyday life our breathing activity is largely unconscious: we inhale and exhale without thinking about it, and the speed and depth of breathing respond almost automatically to the body's needs. Only when something unusual overtakes us do we become aware of sudden breathing, or, if there is a special need we prepare our breathing accordingly. In the latter case we may undertake particular training to get our breathing right. Within the realm of design work our synthesis and evaluation are rather like breathing. They provide a fundamental activity which usually goes on unconsciously, yet, for special occasions, can be closely directed and, for such, training and experience can be of significance. A feature of our design culture--however divided this may be--is an emphasis upon the synthesis end of this two-faced activity. Lack of emphasis upon evaluation as an integral part of design work may arise from a number of causes. Synthesis is more spectacular or mysterious: the expression of creativity. The design techniques which are taught tend to concentrate upon synthesis and to neglect evaluation. This may be connected in part with certain classes of difficulties, eg the finding of appropriate costs, in the case of financial evaluation. There is a lack of objectivity about much of evaluation, and personal whim and arbitrariness may well prevail. 
In the language used about design, evaluation does not always appear.
SYNTHESIS-EVALUATION MODEL
The classical problem-solving pair 'synthesis-analysis' which, according to Polya 1, came principally from Pappus, dealt with the treatment of mathematical problems where analysis was itself a heuristic way of getting at the answer. The synthesis-analysis scheme passed over into mechanical engineering where, although it may carry a mathematical overtone, it is remote from the heuristic of Pappus. In much of present-day parlance problem analysis is largely directed at the identification and grasping of a problem. Only in the sense of preparing the ground for synthesis does it appear to carry any of the earlier meaning. For much of what is still called 'analysis' we should be using the term 'evaluation'. In evaluation we attempt to find a value for a particular proposal arrived at by synthesis. Where there are multiple proposals the values need to be compared in some way. This comparison may dig into the foundations of the proposals or attempt to examine their likely future consequences or anything else which seems likely to help in choice. For those who prefer to call upon field observations and laboratory studies, rather than to rely upon the biddings of experience, it may be stated that two independent surveys of the available reports by Gregory 2 and Lera 3 showed that relevant studies supported the synthesis-evaluation picture. A diagram from the 1977 survey is given
vol 3 no 3 july 1982
0142-694X/82/030147-06 $03.00 © Butterworth & Co (Publishers) Ltd
Figure 1. Design activity (diagram labels: the design task; requirements; experience; synthesis in design activity; evaluation; macro and micro negotiation span; product, catalyst, process, plant unit, control system and layout synthesis; fabrication and erection; negotiations)
in Figure 1. This puts synthesis-evaluation as the core activity within design. A difficulty with this particular representation is that it suggests that evaluation is done internally by the designer. There is no doubt that a great deal of detailed, working level evaluation is done by the designer. However, respect has to be given to the majority of situations, in which what is being designed is for the use or enjoyment of other people rather than for the designer's own ends. Design in the applied arts and the useful arts fields is directed at other people. There is an external evaluation structure which involves objectives, their interrelationships, ways of appreciating and measuring the associated values, and the interrelationships of the values. The nature of this structure, the way it works, and the norms it carries are not always obvious. Yet, wherever design is valued externally and the purpose of design is to satisfy other people in some way, the designer has to make himself familiar with the goals and values which obtain and the duties of evaluation, declared or undeclared. At many points in producing a design the designer, if he is striving to do the best job, has to adopt the role of proxy for others. The location of evaluation activity may be set externally, as in Figure 2. It is possible to interpret this picture in several ways. Thus, the designer may refer every evaluation task to someone else, or he may refer only those evaluation tasks which seem to him to be critical. In the former model the designer is reduced to what is an 'ideas man'; in the latter he has to make judgments about what is critical for the external evaluation. Taking on the role of others is difficult. It may well be that the persons responsible for
Figure 2. Synthesis versus evaluation (in-company negotiation) (diagram labels: commissioning; production; specification for detailing; evaluation criteria including uncertainty, plant availability, maintainability, operability, automatic/computer aid level, potential hazard, prospects, environmental posture, resources concern, responsible innovation, personnel motivation, company image in community and society, portfolio pattern, stability, flexibility)
making the external evaluation face great difficulty in achieving an adequate view of the situation in which the artefact is to be used. This is particularly likely in the case of innovation. What is the fair way in which their difficulties should be reflected in the interchanges with the designer?
CAN WE GET AN ADEQUATE FOUNDATION FOR EVALUATION?
We might think, perhaps, that somewhere there is an age-old solution to the problem of how to go about evaluation. What about the philosophers? Aristotle 4, in his Nicomachean ethics, presented something which might serve as a starting point:
Every art and every inquiry, and likewise every action and pursuit, is believed to aim at some good. The good has therefore been rightly declared to be 'that at which all things aim'. But ends differ to some extent among themselves: some are activities, others are products distinct from these; and in the latter case the products are naturally more worth having than the activities. Now since there is a great variety of actions, arts and sciences, there are correspondingly numerous ends: health is the end of medical science, a vessel that of shipbuilding, victory that of strategy, wealth that of economics. Wherever such arts, etc can be classified as members of a single skill, the ends of the master arts are preferable to all the subordinate ends; for it is with a view to the former that the latter are pursued. It makes no difference whether the activities are ends in themselves, or something distinct from them (as in the case of the sciences mentioned in the foregoing note). If then there is some end of the things we do, an end which we desire for its own sake and for which all else is desired--that is to say, if we do not do everything for the sake of something else--obviously this end will be the Good, ie the best of all things. Knowledge of it must accordingly exercise a profound influence upon life; if, like archers, we have a target, we shall be more likely to hit upon what is right.
DESIGN
STUDIES
A belief in a specific way of tackling designs--the systems approach--has expanded in the last 20 years. This strategy, of a comprehensive nature, has revealed itself in most technical areas, often with local variants, eg structured programming in the computer world. Among the best known of the exponents of systems engineering--relevant to our purpose--is Hall 5, who formerly worked for Bell Telephone Laboratories. In his view systems engineering links research and detailed design and attempts to shorten the time lags between scientific discoveries and their applications, and between the appearance of human needs and the production of new systems to satisfy these needs. Systems engineering is very much 'front end' work. It is concerned with the handling of complexity, even if this is difficult to define. Hall, in developing his view of a systems approach, was, not surprisingly, greatly concerned about the value system. He saw the interaction between means and ends, listed what were expected to be recurring kinds of objectives (profit, market, cost, quality, performance, dealing with competitors, compatibility, adaptability, permanence, simplicity, safety, and time), and gave some suggestions that followed from the state of knowledge about value system design. Some suggestions from him are set out in Table 1. Hall was driven to accept the practical evidence that human decisions are largely made by some kind of intuitive response. He believed that 'decision makers can avoid complete reliance on intuition by reference to a number of ethical systems'. His position was that 'engineering is always concerned with a value system. The ethics and mores of our culture form an integral part of that
Table 1. Hints on setting objectives (Hall 5)
Put the objectives on paper. Get agreement that the words used are neutral and free of bias.
Identify means and ends. If this results in several chains of means and ends, 'position' the chain in order to locate objectives on the same hierarchical level, and to identify the different dimensions of the value system.
Test to see that the objectives at one level are consistent with higher level objectives. This is necessary to decide the relative importance of various subsets of objectives.
Test that the subset of objectives at each level is logically consistent. Inconsistent objectives signal the existence of trade-off relations. All of these relations will eventually need careful specification to allow for compromises.
Define the terms of trade for related variables. Sometimes all that is necessary is to find the derivative of one variable with respect to another, and to state the limits of the variables within which the trade is valid.
Make the set of objectives complete. The use of experience in similar problems is one way to satisfy this criterion. However, this operation generally continues to the end of design because it is impossible to foresee all consequences of the physical system and cover them with objectives.
Give each objective the highest possible level of measurement. Recognize that some objectives are not measurable on the highest level of measurement scales. Usually the members of certain subsets of objectives can at least be ranked by importance.
Check the objectives to see if each is physically, economically, and socially feasible. State the limiting factors.
Allow for risks and uncertainties by various available techniques and by selecting the appropriate decision criterion.
As a step in settling value conflicts, isolate logical and factual questions from purely value questions. This frequently calls for the use of experts.
Settle value conflicts. Have all interests represented. Use tentativeness. Avoid dogmatism, dictatorial methods and premature voting.
system ... there is nothing in engineering itself, or even in the model of statistical decision theory, that provides a value system ... It is therefore necessary that ethics help to provide the standards for deciding if certain objectives are good in any meaningful sense.' His own choice in the ethical field was for a kind of stoicism: to want what we can get rather than worrying about what is beyond our grasp. But, in addition to examining what might come from philosophy, he was careful to trace the boundaries of those subjects which might provide some possible extra insight. He considered economics, psychology, and statistical decision theory. Finally, 'having failed to find any complete and general method for choosing the best objectives, it is worthwhile to turn to yet another source of wisdom'. This turns out to be the art and science of using past decisions to guide present actions: to take laws, customs, policies, etc as standing plans. This is rather contrary in spirit to that important aspect of evaluation in which attempts are made to anticipate likely future outcomes.
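The hint in Table 1 about defining the terms of trade (find the derivative of one variable with respect to another, and state the limits within which the trade is valid) can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration and does not come from Hall: the function names, the cost relation, its coefficients, and the limits of 2 kg and 8 kg are all hypothetical.

```python
def terms_of_trade(relation, x, lower, upper, step=1e-6):
    """Rate of exchange d(relation)/dx at x, refused outside the stated limits."""
    if not lower <= x <= upper:
        raise ValueError(
            f"x={x} lies outside the valid trading range [{lower}, {upper}]"
        )
    # Central finite difference as a numerical stand-in for the derivative.
    return (relation(x + step) - relation(x - step)) / (2 * step)


def component_cost(weight_kg):
    """Hypothetical relation: a lighter component costs more to make."""
    return 100.0 + 400.0 / weight_kg


# Terms of trade between cost and weight at a 4 kg design point,
# assumed valid only between 2 kg and 8 kg.
rate = terms_of_trade(component_cost, x=4.0, lower=2.0, upper=8.0)
print(f"Cost changes by about {rate:.1f} units per extra kg at 4 kg")
```

For this assumed relation the analytic derivative is -400/w^2, so the numerical rate at 4 kg should be close to -25 cost units per kilogram. A request outside the stated limits raises an error rather than returning a rate, which mirrors the hint that a trade is only valid within declared bounds.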
ONWARD FROM HALL
Hall identified himself as a caring engineer. It is not known how far Hall's considerations reflected opinions and practices in the company where he worked. Did every telecommunication scheme have an adequate evaluation of the social consequences? The professional designer will wonder how some of the general recommendations were to be carried out in practice. How were social considerations carried over into the design of nuts and bolts? This is not as trivial and stupid a question as it might at first appear. Then there is the class of questions which look at changes which might have occurred since Hall's time and which may considerably affect what he proposed. Philosophers, for example, in looking at objectives, are likely to split them into two groups: arbitrary and instrumental. Arbitrary objectives are those which are imposed in some way, whereas instrumental objectives are essentially subobjectives, providing the necessary infrastructure of the principal objectives. The associated value schemes are with respect to the arbitrary or the instrumental. A task which the designer faces here is to take on arbitrary objectives with which he may not necessarily agree. Hall had a specific model of evaluation in mind. The language which he uses reflects the model. Thus: 'The decision-making model ... has several clearly distinguishable parts: (1) a list of objectives, (2) a list of alternatives, (3) methods of predicting the consequences of these alternatives, (4) some method of assigning probabilities (if any) to the consequences, (5) a value system, implicit in the objectives, for attaching values to the consequences, and (6) a decision criterion, included in the value system, which states how to operate upon the other five parts to specify the best alternative'. In spite of this he recognized the more general use of other methods. Hall was assuming a prescriptive model in which evaluation occurred. He did not apparently have access to the field studies on design decisions or management behaviour which were coming forward. Although he knew that evaluation in practice was different from his model he did not attempt to compare and contrast.
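The six parts of the decision-making model quoted above can be laid out as a small sketch. This is an illustration under stated assumptions, not Hall's own procedure: the class and function names, the two alternatives, and every probability and value are hypothetical, and a single objective is assumed to be implicit in the values (part 1). Two decision criteria are shown so that the effect of part (6) is visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consequence:
    description: str
    probability: float  # part (4): probability assigned to the consequence
    value: float        # part (5): value attached by the value system


def expected_value(consequences):
    """Decision criterion: probability-weighted sum of values."""
    return sum(c.probability * c.value for c in consequences)


def worst_case(consequences):
    """Alternative decision criterion: value of the least favourable outcome."""
    return min(c.value for c in consequences)


def best_alternative(predictions, criterion):
    """Part (6): apply a decision criterion to pick one alternative."""
    return max(predictions, key=lambda name: criterion(predictions[name]))


# Parts (2) and (3): alternatives with their predicted consequences.
# All figures are made up for illustration.
predictions = {
    "adapt existing design": [
        Consequence("meets the need", 0.9, 10.0),
        Consequence("falls short", 0.1, -5.0),
    ],
    "novel design": [
        Consequence("meets the need", 0.6, 30.0),
        Consequence("falls short", 0.4, -20.0),
    ],
}

print(best_alternative(predictions, expected_value))
print(best_alternative(predictions, worst_case))
```

With these assumed figures the expected-value criterion favours the novel design while the worst-case criterion favours adapting the existing design, which shows why the choice of decision criterion belongs to the value system and is not a neutral calculation.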
BEHAVIOURAL APPROACHES AND FIELD STUDIES
The acceptance of 'satisficing' behaviour by management rather than optimizing behaviour tends to upset the beauty of the statistical decision theory model. Another aspect of management behaviour, observed by Mintzberg 6, is that decisions tend not to be made in a 'clean' manner--they are negotiated and start and stop and start again. Field studies of design behaviour have similar and other indications. Of the studies generally available the most useful appear to be those from Marples 7 and from Bessant and McMahon 8. These two cases are at opposite extremes of design work. Marples deals with the way in which decisions are made about the design of components of engineering equipment, whereas Bessant and McMahon deal with the manner in which top management decisions are made about a major investment. What Marples found was that low-level decisions were made by 'engineering judgment' (however that may be defined) of which an important feature was the estimation of future adaptability or developability. In addition Marples showed that decision-making changed when something of a critical nature emerged. People higher up in the group were involved. In the Bessant and McMahon case what was obvious was a complex starting and stopping, with opportunities for thinking afresh, amid ongoing and intermittent negotiations at national and international level. Further, this was notionally linked with the operation of a formal evaluation and review procedure which encompassed original research concepts or market suggestions and continued onwards. Although the behavioural observations depart from the formalistic prescriptive models of decision making it can still be argued that route maps are useful. It would be expected, for example, that the kinds of advances made in associated practical fields might also contribute to thinking about design evaluation.
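The contrast between satisficing and optimizing behaviour mentioned at the start of this section can be made concrete with a minimal sketch. It assumes the usual reading of satisficing as accepting the first proposal that reaches an aspiration level; the function names, proposals, scores and aspiration level are hypothetical and are not taken from the field studies cited.

```python
def optimizing_choice(proposals, score):
    """Evaluate every proposal and return the highest-scoring one."""
    return max(proposals, key=score)


def satisficing_choice(proposals, score, aspiration):
    """Return the first proposal whose score reaches the aspiration level."""
    for proposal in proposals:
        if score(proposal) >= aspiration:
            return proposal
    return None  # nothing was good enough; the search would have to continue


# Hypothetical proposals in the order they were generated, with made-up scores.
scores = {"scheme A": 5, "scheme B": 7, "scheme C": 9}
proposals = list(scores)

print(optimizing_choice(proposals, scores.get))
print(satisficing_choice(proposals, scores.get, aspiration=6))
```

Under these assumptions the optimizing rule selects scheme C after evaluating all three proposals, whereas the satisficing rule stops at scheme B, the first that is good enough. The result of satisficing therefore depends on the order in which proposals arise, which is one reason it sits uneasily with the statistical decision theory model.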
Hall's book came at the end of a period which had seen a concentration of effort upon psychological aspects of decision making which, in its turn, had succeeded many years of unimpeded development of economic theories. With the emergence of statistical decision theory most of its obvious implications for design were expounded by Starr 9. In R&D a great emphasis has been put upon evaluation or its equivalent. Baines et al 10 place considerable importance upon the use of 'screens'. These have their most important properties defined by six parameters: target definition; width of spectrum; mesh size; quantitative content; degree of abstraction; sufficiency. Beattie and Reader 11 provide a handbook of the formal methods of evaluating and selecting projects. The position is that very few formal methods get used as such. Notwithstanding this relatively unproductive cultivation of the field of decision theory, every now and then ideas emerge which are seized upon with interest to see what new insight might be obtained.
EFFECT OF EXTERNAL PRESSURES UPON EVALUATION
In the late 1960s social pressures began to play an increasingly obvious part in the evaluation of technical designs. At first some of these external matters were listed under the heading of dealing with uncertainty, for which a place already existed among the categories used in evaluation. Safety considerations lent themselves particularly to this kind of treatment because risks could be expressed in probabilistic terms. But it was not long before hazard analyses of various kinds became part of the more or less standard evaluation packages relating to a specific technological domain. In a few years it was clear that any well-defined area, particularly one of public concern, might rapidly have an evaluation scheme attached to it. If the evaluation scheme has a more or less technical base to it and, in addition, can be treated numerically, then it may be taken up with some enthusiasm. Some of these new procedures can be developed in great detail, and the time needed for full treatment takes up a large fraction of design time. As an example, Figure 3 shows what is termed a typical reliability programme as given in BSI Handbook 22 12. This is only one kind of evaluation scheme and competes for time with others. No doubt there will be further candidates and claimants.
QUOT HOMINES, TOT SENTENTIAE
To compound the accumulation of difficulties there is the range of personal opinions which must be accommodated. Kelly's personal construct theory, as expounded by Bannister and Fransella 13, emphasizes the uniqueness of each person. Further complication may be added by taking account of organizational requirements and roles. From such considerations it is easy to argue that what we need is some kind of effective policy on evaluation in order to make it easier for designers and others to work constructively together: to make it possible to attain reasonable agreement upon the kinds of evaluation which should be carried out and the different levels of importance to be assigned to particular facets of evaluation.
POLICIES AND POLICY MAKING
Alternative policies are readily found. One obvious policy on evaluation is to do nothing and declare no policy. This could be construed to mean that not only should there be no general policy on evaluation but that there should be no individual case policy, either overtly or covertly. It is unlikely that such a position would prevail for an individual case, either because of the choices of the designer or of his client. Another extreme policy is for a national evaluation scheme to be drawn up. In view of the infinite number of possible design situations this would be difficult to achieve and unlikely to have practical usefulness. An alternative approach is to lay upon the designer some responsibility, by an ethical code, for example, to do all that is reasonable in a given set of circumstances to use or bring to the notice of the client the relevant modes of evaluation which might be applied. Another possibility is that there should be made generally available the 'building blocks' of evaluation schemes so that for most design/client variants it would be possible to set up practicable and relevant forms of evaluation.
PRACTICAL WAYS OF EXECUTING POLICIES
Although Hall gave much consideration to general principles, he provided little in the way of guides to specific
Figure 3. Typical reliability programme (diagram labels: product life cycle; reliability definition phase; design and development (including initial manufacturing) phase; reliability demonstration phase, including assessment, analysis, evaluation and review; production phase; function phase; reliability achievement measurement phase; design process is iterative through all phases)
applications. In particular it is difficult to find what his thinking was with respect to the lower levels of design. Thus, cost might be seen as a major consideration in overall design, but nothing was offered to help the evaluation or control of cost at any working level of design. Evaluation gives information about the way a design is proceeding and suggests the direction in which change should be made in order that the complex of policies behind the design should be fulfilled in a satisfactory manner. The evaluation procedures should be considered in some way, not necessarily in fine detail, when a major design is about to begin. In formulating policies it is important to be sure that they are intrinsically practicable, that some means exists in principle whereby their progress towards implementation by way of design may be monitored and measured, and that, for the designer, practicable means are available, or may be developed, by which he and his associates may deal with the finer details of design work.
CONCLUSIONS
What appears to be emerging is a need, not yet expressed in much detail, for some broad principles of evaluation which might be acceptable to most designers and provide at least action guidelines which have a wide range of applicability. In addition there is possibly a place for evaluation elements or building bricks, particularly those having potential for use at most levels of design work.
REFERENCES
1 Polya, G How to solve it Princeton University Press, Princeton, NJ, USA (1945)
2 Gregory, S A 'Towards design science' Radziejowice Symposium PAN, summarized in Nauka i projektowaniu PAN, Warsaw (1978)
3 Lera, S 'Empirical and theoretical studies of design judgement: a review' Design Studies Vol 2 No 1 pp 19-26
4 Aristotle Nicomachean ethics edited by J Warrington J M Dent and Sons, London (1963)
5 Hall, A D A methodology for systems engineering Van Nostrand, Princeton, NJ, USA (1962)
6 Mintzberg, H et al 'The structure of unstructured decision processes' Admin. Sci. Q. Vol 21 No 4 (1976)
7 Marples, D L The decisions of engineering design IED, London (1960)
8 Bessant, J R and McMahon, B J 'Participant observation of a major design decision in industry' Design Studies Vol 1 No 1 (1979) pp 21-25
9 Starr, M K Product design and decision theory Prentice-Hall, Englewood Cliffs, NJ, USA (1963)
10 Baines, A, Bradbury, F R and Suckling, C W Research in the chemical industry Elsevier (1969)
11 Beattie, C J and Reader, R D Quantitative management in R&D Chapman and Hall, London (1971)
12 British Standards Institution Quality assurance: handbook 22 London (1981)
13 Bannister, D and Fransella, F Inquiring man Penguin, Harmondsworth, Middx, UK (1971)