Applied Ergonomics Vol. 29, No. 1, pp. 55-58, 1998
© 1997 Elsevier Science Ltd. Printed in Great Britain. All rights reserved
0003-6870/98 $19.00 + 0.00
Ergonomics in consumer product evaluation: an evolving process

Lindsey M. Butters and R. Tetra Dixon

Consumers' Association Research and Testing Centre, Davy Avenue, Knowlhill, Milton Keynes MK5 8NL, UK

(Received 31 July 1996; in revised form 22 December 1996)
As part of its commitment to empowering people to make informed consumer decisions, the Consumers' Association investigates the convenience aspects of a vast range of products, from cars to garden spades. Evaluation approaches include user trials, convenience checklists and expert appraisals. Our methodology is subject to constant review and refinement to ensure the highest levels of reliability, validity and auditability. We have a distinctive approach: our tests are designed to reflect consumer usage and to provide comparative data which is absolutely fair to all products. This paper discusses the evolving nature of that methodology within the "lifetime" of a product. Reasons for choosing each method are given as practical guidance for those involved in comparative testing. © 1997 Elsevier Science Ltd

Keywords: consumer products, evaluation methods, validation of methods
Introduction
At the Consumers' Association Research and Testing Centre, technical performance testing, including safety checking and reliability research, is complemented by assessments of the ease of use or convenience of a product. The broad term "convenience" encompasses all aspects of living with one's choice, including familiarisation, the appropriateness of the functionality, comfort, anthropometric fit, adjustability and ease of maintenance.

The changing priority of ergonomics

When a new product type first hits the market-place, the fact that it does something different, which was not possible before, is often enough to sell it. However, as competition increases and the technical performance of products improves, ergonomics becomes one of the main differentiating factors. For example, reliability differences aside, the performance of washing machines has improved considerably in recent years. Some are better than others but, almost without exception, they wash clothes to an acceptable standard, as results of performance tests demonstrate (Which? 1995a). However, some wash well, adhere to ergonomics principles and use less water and energy, whereas others do not. Only a minority consider the needs of all users: small and tall, young and old, able-bodied and those with special needs. In our experience, an increasing number of manufacturers are recognising not only the importance of ergonomics but also its potential for market advantage.

Despite this, products still exist which people find inconvenient, complicated or uncomfortable to use. To return to the example of washing machines, it is disappointing to report that, when it comes to ease of use, the Consumers' Association has yet to find a machine that is completely free of design problems (Which? 1995a). We therefore advise our readers on the types of program controls, handles and dispensers to look out for, and highlight those machines that are particularly good or bad in each area of design. We also regard it as part of our mission, which is to enable people to make informed consumer decisions, to distinguish between genuine advances in ease of use and cosmetic sales techniques. The general public is becoming more discerning about ease of use, but guidance is still required if they are not to be swayed by cosmetic factors and advertising hype.

In arriving at our "Best Buys", we calculate a "total test score" for each product, comprising a number of weighted factors from our test results. For example, in a recent report on dishwashers, ease of use accounted for 20% of the total test score, washing and drying performance for 35% and 15% respectively, with the other factors being water and energy consumption and program time (Which? 1996a).
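The weighted "total test score" described above can be sketched in a few lines. This is a minimal illustration using the dishwasher weights quoted in the text (ease of use 20%, washing 35%, drying 15%); the split of the remaining 30% across water consumption, energy consumption and program time is an assumption made purely for illustration, as the actual split is not given here.

```python
# Sketch of a weighted "total test score". The first three weights are
# those quoted for the dishwasher report; the last three are ASSUMED,
# splitting the remaining 30% evenly for illustration only.
WEIGHTS = {
    "ease_of_use": 0.20,
    "washing": 0.35,
    "drying": 0.15,
    "water_consumption": 0.10,   # assumed
    "energy_consumption": 0.10,  # assumed
    "program_time": 0.10,        # assumed
}

def total_test_score(ratings):
    """Combine per-factor ratings (0-100) into one weighted score."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights must sum to 100%
    return sum(WEIGHTS[factor] * ratings[factor] for factor in WEIGHTS)

# Hypothetical ratings for one dishwasher.
ratings = {
    "ease_of_use": 70,
    "washing": 85,
    "drying": 60,
    "water_consumption": 90,
    "energy_consumption": 80,
    "program_time": 75,
}
print(total_test_score(ratings))
```

The weighting makes the trade-off explicit: a product that washes superbly but is awkward to use cannot fully compensate for a poor ease-of-use rating.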
Choice of evaluation methods

A number of evaluation methods are employed, including: user trials with lay and trained panels (both in the user's home and at the CA testing centre and
gardening sites), convenience checklists administered by testing staff and expert appraisals by an ergonomist. The choice of methodology is governed by a number of factors, including the intended focus of the magazine article, constraints of time and cost, the characteristics of the product and the trade-off between experimental control and realism. Our primary aim is to test the products in a way which accurately reflects consumer usage behaviour. Our evaluation methods vary not only from product to product, but also between different assessments of the same product. We take into account the convenience “history” and look at which methods have been used in the past, at where the product is in its “life cycle” and at our frequency of testing.
User trials

Wherever possible, it is important to collect the first information about a new product from a user trial, to gather data about the experiences of representative consumers. Trials may be carried out by users in their own homes or at the CA testing centre and gardening sites. Ideally, home user trials should be conducted when the domestic realism of the testing is considered to be an important determining factor. However, choice of venue is also influenced by a number of other considerations, including the frequency of normal use, possible dangers during use, expertise required, the size of the product (for ease of transporting and housing) and the cost of samples.

A home-user trial is most appropriate for small products requiring no installation, which can be easily delivered to, and collected from, the user's home. Products such as kettles and vacuum cleaners are ideally suited, as they are used regularly in the home and we do not have to ask the users to do much in addition to what they would be doing normally. Users are given ample time to familiarise themselves with the product in their own surroundings (usually one week per sample over a four to five week period) and we normally ask them to assess it after using it several times rather than giving a first impression.

Trials conducted at a CA site are favoured if there is a particular need to observe users in a carefully controlled environment or to video the trial for future reference. An on-site trial is also chosen if the product has practical drawbacks for testing at home, such as infrequency of normal usage. If people were given DIY eye protectors or workbenches to try out at home they would be unlikely to have sufficient suitable tasks to give them a fair test. Other reasons include safety concerns (e.g. compost shredders) and the difficulty of installing the product in the user's home (e.g. washing machines).
On-site user trials inevitably create an artificial situation but every effort is made to maximise realism. For example, the users testing eye protectors performed a series of controlled tasks, including painting and sanding walls, to simulate real-life use. The time of year has an impact when devising our evaluation methods for gardening products. On one occasion, scheduling demands meant that we were assessing weeding tools in the winter, ready for reporting in the spring. There are fewer weeds to dig
up in the winter and it is not the right time to ask people to be out in their gardens. Therefore, weeds were grown under a polythene tunnel at a garden test site and keen gardeners were able to test the tools under cover.

Usually, data from user trials are statistically analysed using analysis of variance techniques. Occasionally, only anecdotal information is required. Testing is conducted according to an experimental design to ensure a randomised, balanced order of testing. If there are a large number of products being compared, it may not be reasonable to expect a user to test all of them. In this case, balanced or partially balanced incomplete block designs are employed to reduce the effect of differences between users. These designs are produced by statisticians. Unless the trial is designed properly, only very limited conclusions can be drawn from it.

Users fill in a self-completion questionnaire for each product tested. Where the trial is to be statistically analysed, numerical rating scales with a mid-point are used. Usually a five-point scale is adopted. Space for comments is allowed, and users are encouraged to list any particularly good or bad features which contributed towards their overall assessment.
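The balance property that such designs provide can be illustrated with a textbook example. The sketch below is not one of CA's actual designs (those are produced by statisticians for each trial); it simply verifies the two defining properties of a balanced incomplete block design, using the classic (7, 3, 1) design for seven hypothetical products tested three at a time.

```python
from itertools import combinations
from collections import Counter

# A classic balanced incomplete block design: 7 products, 7 users, each
# user (block) testing 3 products. This is the well-known (7,3,1) design;
# the product numbering is illustrative only.
blocks = [
    (0, 1, 3), (1, 2, 4), (2, 3, 5), (3, 4, 6),
    (4, 5, 0), (5, 6, 1), (6, 0, 2),
]

# Property 1: every product is tested by the same number of users,
# so no product is advantaged by extra exposure.
replication = Counter(p for block in blocks for p in block)
assert set(replication.values()) == {3}

# Property 2: every pair of products is tested together by exactly one
# user, so between-user differences bear equally on every comparison.
pair_counts = Counter(pair for block in blocks
                      for pair in combinations(sorted(block), 2))
assert set(pair_counts.values()) == {1}

print("products tested per user:", len(blocks[0]))
print("replications per product:", replication[0])
```

In a real trial the order of products within each block would additionally be randomised, as the text describes.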
User panels: lay, specialist or trained

We have a panel of nearly 2000 users in just over 1000 households, all of whom are subscribers to one or more of our magazines. They all live within 20 miles of the Research and Testing Centre and have volunteered to participate in trials. Detailed information concerning their household appliances, their skills and interests, together with their availability to test for us, is held on a database. This enables us to select the most appropriate users for each individual trial. We need a large panel as we may be carrying out several user trials at the same time and we do not want to overburden the users or turn them into experts. In addition, we have a small panel of keen gardeners living in the vicinity of one of the sites used by Gardening Which? at Capel Manor in Enfield. Although the panels are drawn from relatively small geographical areas, this is not thought to be a serious limitation because CA's product testing produces data which is comparative, not absolute.

The panels contain men and women of all ages, with a wide diversity of interests and abilities, including people with special needs. Careful recruitment is needed to select users with a suitable level of knowledge and expertise. For example, the more specialist wood-working tools can only be properly assessed by people with appropriate skills. On the other hand, complex products such as video recorders need to be usable by a wide range of people, not just those who are particularly skilled in the use of high technology. However, it would be inappropriate to recruit completely naive users with no experience of using the product. Occasionally, a specialist group of users is needed (e.g. a cycling club to try out racing bicycles or very young babies to try out disposable nappies). We need to look outside our established panels on these occasions.
In some cases, trained panels are used. For example, car testing is carried out by an in-house panel of drivers. The panel is a group of 12 ordinary motorists and consumers, all shapes and sizes with assorted patterns of use and driving styles, who are experienced in assessing the cars for a huge number of factors, including ergonomics issues such as dashboard layout, driving position, visibility, etc. This approach is adopted because an untrained user would find it difficult to be consistent in the assessment of the many features we examine (typically over 80 questions per car). The assessments are carried out as if the car were their own (e.g. for travelling to and from work, trips to the supermarket, weekend outings, etc.). Having driven the car for several days, on all road types, they fill out a detailed questionnaire covering all aspects of the car. The results are statistically analysed and combined with technical measurements and track tests to give overall ratings. A detailed profile of the characteristics of the drivers enables us to interpret their comments in the light of anthropometric considerations.
Convenience checklists administered by laboratory staff

In some cases, when we believe we have enough information collected from user trials, a convenience checklist is devised for use on future projects. Incorporating relevant standards, our checklists include many of the questions that may be asked of a user in a self-completion questionnaire. They are administered by a team of trained laboratory staff, who each rate the product separately before discussing their findings to reach a consensus. Our policy is for the checklists to be completed by both male and female staff and for a left-handed assessment to be included.

Sometimes, it is not practical to conduct a user trial as the first source of information, usually because the product is considered to be too complex. In these cases, our checklists are drawn up by ergonomists in the light of existing guidelines and in consultation with the staff responsible for performance testing. A checklist must be prescriptive and precise enough to help the assessors award a fair and consistent rating and for all testers to arrive at similar, if not identical, answers. A delicate balancing act is called for: checklists must be detailed enough to capture all relevant data and to give sufficient guidance on how the checklist is to be administered, without being too long-winded and time-consuming.

It is often easier for testers to make comparative assessments by collecting objective data (e.g. measuring the size of a control, the force required to turn it, etc.). However, interpretation of such measurements is often difficult because much of the available data and recommendations in ergonomics guidelines are incomplete, out-of-date and do not reflect developments in modern domestic products. For this reason, the emphasis in the majority of our checklists is on subjective assessment.
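The rate-separately-then-discuss step can be sketched as follows. The checklist items, rating scale and disagreement threshold here are all hypothetical, chosen only to show how divergent independent ratings might be flagged for the consensus discussion:

```python
# Hypothetical checklist-item ratings (1-5 scale) from three assessors
# who rate independently before meeting to agree a consensus. Items
# where ratings diverge by more than one scale point are flagged for
# discussion; the threshold of 1 point is an assumption.
ratings = {
    "control_legibility": [4, 4, 5],
    "door_handle_force":  [2, 4, 3],
    "dispenser_access":   [3, 3, 3],
}

to_discuss = [item for item, scores in ratings.items()
              if max(scores) - min(scores) > 1]
print(to_discuss)
```

Items on which the assessors already broadly agree can be settled quickly, focusing the consensus meeting on the genuinely contested ratings.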
Checklists must be regularly revised and updated to ensure that all relevant changes in legislation and standards are included and that they reflect the state of the art of the product concerned. Design tweaks and
seemingly minor model changes may have a significant cumulative effect; a case of the whole being much greater than the sum of its parts. Changes to the rating scale may be required if products improve (to allow results to be carried forward from previous projects and compared fairly). With the shelf-life of products becoming ever shorter and an increasing emphasis on speed of testing, we also investigate changes in format which will reduce assessment time whilst maintaining or improving the quality of results.
Periodic user trials to validate and update checklists

Laboratory staff, who are very familiar with the products, having tested them for technical performance, do not relate to those products in the same way as the ordinary consumer. Their expert knowledge puts them in an excellent position to give detailed advice on what is good or poor about a product. However, given CA's strong commitment to the philosophy of testing products as the consumer would use them, it is important to submit our checklists to regular reviews. The principal method of investigating changing patterns of consumer behaviour is by means of supplementary user trials. For example, lawnmowers are a product traditionally assessed by checklist, but periodically a user trial is conducted in parallel. The degree of correlation between the two sets of results is analysed and any necessary adjustments to the checklist are made. Validatory user trials also have their place in checking our checklist testing of complex products which have a steep learning curve, such as video recorders. By concentrating on a subset of the most commonly used functionality within a user trial, we can make sure that we are getting the basics right and continuing to report on the issues which are most important to consumers.
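A correlation check of this kind might look like the sketch below. The ratings are invented, and a plain Pearson coefficient stands in for whatever analysis CA's statisticians actually apply; the point is only that a weak correlation would signal that the checklist needs revision.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical mean ratings for six lawnmowers (1-5 scale): one value
# per mower from the staff checklist, one from the parallel user trial.
checklist = [4.2, 3.1, 2.5, 4.8, 3.6, 2.9]
user_trial = [4.0, 3.3, 2.2, 4.6, 3.9, 2.7]

r = pearson(checklist, user_trial)
# A high r suggests the checklist tracks real user experience; a low r
# means the checklist is measuring something users do not care about.
print(r)
```

A rank-based coefficient could equally be used if the rating scales are treated as ordinal rather than interval data.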
Complementary and supplementary methods

In some cases, it is necessary to employ several types of assessment to gather extra data about a product. For example, a standard user trial is often supplemented by an appraisal by a left-handed member of laboratory staff. This is often more appropriate than including left-handedness as a variable in the main trial, where it is often impossible for every user to try out every product.

Expert appraisals

An expert appraisal by CA's Ergonomist may be used to predict how a product is likely to suit a range of users over time and to identify design flaws which may lead to problems. It is an approach which is used less frequently, but can be the most appropriate method if an assessment of compliance with standards is required, or if resources do not permit a full-scale user trial. The usefulness of the method hinges on the expert's experience with the product and knowledge of the user's likely behaviour. However, this experience can be a double-edged sword. As with the administration of convenience checklists, it can be very difficult to set aside one's expertise and assume the role of a user.
For these reasons, it is a method which is rarely used in isolation but is more commonly used to complement other methods. For example, the usability element of a recent report on electronic route-finders (Which? 1996b) comprised a user trial plus an evaluation of the user interface by the Ergonomist.

Discussion groups

Discussion groups are usually run in conjunction with other methods. They can be a useful means of helping to decide which aspects of the product should be concentrated on. In this case, the group discussion would be held before the user trial or after a pilot trial has taken place. Sometimes it is held as a debriefing exercise after the user trial has been completed. It needs careful handling by a trained facilitator to get the most out of it and to ensure that all opinions are represented.

Convenience diaries
The learning curve is an aspect of convenience testing with widespread implications. It can be difficult to judge whether the aspects that seem inconvenient on first acquaintance with a sample would remain inconvenient on prolonged use. Conversely, inconveniences which at first sight seem minor can become more irritating with time. This is one of the advantages of home-user trials over on-site trials because the user can spend much more time with the product. Convenience diaries, kept during performance testing as a back-up source of information, can also be very useful in assessing those aspects which are likely to be a source of long-term annoyance.
Attending to the needs of all consumers

Consumer products have, by their very nature, to cater for the broadest of user groups. Which? articles regularly include specific information for less-able consumers. Over six million adults in Britain have at least one disability. As we get older, many of us start to experience difficulties with our sight, hearing, mobility and the use of our hands. Moreover, it is usually the case that those products which are the easiest to use for less-able consumers will also be the most convenient for the able-bodied. Recent examples of information to appear in Which? include VCRs which can record teletext subtitles (Which? 1995b), knives for people with weak hands (Which? 1995c), washing machines with adaptations for users with impaired vision (Which? 1995a) and which camera to buy if you wear spectacles (Which? 1995d).

Much of our work on products for less-able users is conducted in collaboration with the Research Institute for Consumer Affairs (RICA). RICA is a sister charity to Consumers' Association specialising in research
on behalf of consumers with disabilities. We recently carried out a large joint project on the design of domestic appliances for elderly and disabled consumers (Which? 1996c). The approach is broadly the same as that described above, albeit fine-tuned to take account of a broad range of disabilities. Convenience checklists are based upon exploratory user trials and are administered by laboratory staff who have undergone specific training to assess products for those with special needs. Our work with, and on behalf of, consumer organisations in Europe and beyond also means that we need to take into account the effects of national consumer habits and style of living upon product usage.
Consumer surveys

A further element in the iterative improvement of our methodology involves employing the resources of CA's Survey Centre to improve our understanding of people's use of consumer products and the problems they encounter. For example, questions on the ease of use of vacuum cleaners were included in the 1995 Which? Members' Annual Survey. This survey is sent to 50,000 members, randomly selected to be representative of our membership. Typically, it achieves a response rate of between 40% and 50%. It is used to generate subsamples according to particular topics for further surveys during the course of the following year.

In the case of vacuum cleaners, respondents were asked to choose the five features, from a prescribed list of 18 items, which were the most important in their choice of vacuum cleaner. The list included performance, aesthetic and convenience factors. In a second question, respondents were presented with the same list and were asked to choose those five features which they felt were most in need of improvement, from their experience of using the vacuum cleaner. By comparing the two sets of responses, we could look at people's attitude to convenience issues before and after purchase, influences upon the buying decision and the main problems encountered since purchase. This information feeds into our principles for the collecting and reporting of usability data within CA and acts as a check that we are reporting on the most important convenience issues.
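The before-and-after comparison of feature choices can be sketched as a simple tally. The feature names and responses below are invented (not the actual 18-item Which? list); the sketch only illustrates how the two question sets might be contrasted:

```python
from collections import Counter

# Hypothetical survey responses: each respondent picks five features
# that most influenced their purchase, and five most in need of
# improvement. Feature names are illustrative, not the Which? list.
purchase_picks = [
    ["suction", "price", "weight", "capacity", "noise"],
    ["suction", "price", "cord_length", "weight", "attachments"],
    ["price", "suction", "noise", "attachments", "capacity"],
]
improvement_picks = [
    ["noise", "cord_length", "weight", "dust_emptying", "attachments"],
    ["dust_emptying", "noise", "weight", "cord_length", "capacity"],
    ["noise", "weight", "dust_emptying", "attachments", "cord_length"],
]

bought_for = Counter(f for picks in purchase_picks for f in picks)
wants_fixed = Counter(f for picks in improvement_picks for f in picks)

# Features that rarely drive the buying decision but dominate the
# improvement list are prime candidates for convenience reporting.
gap = {f: wants_fixed[f] - bought_for[f] for f in wants_fixed}
for feature, score in sorted(gap.items(), key=lambda kv: -kv[1]):
    print(feature, score)
```

A positive gap marks a feature whose shortcomings only become apparent after purchase, which is exactly the kind of issue convenience testing should surface.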
References

Which? (1995a) Finding a machine that's easy to use, January, p. 53
Which? (1995b) VCRs on test, May, p. 50
Which? (1995c) Sharp practices, July, p. 8
Which? (1995d) SLR cameras on test, June, p. 44
Which? (1996a) Dishwashers on the rack, December, p. 43
Which? (1996b) Route planning made easy, July, p. 39
Which? (1996c) Design of domestic appliances, August, p. 38