A comparison of sensory methods in quality control




Food Quality and Preference 13 (2002) 341–353 www.elsevier.com/locate/foodqual

A comparison of sensory methods in quality control
E. Costell*
Instituto de Agroquímica y Tecnología de Alimentos (CSIC), PO Box 73, 46100 Burjassot, Valencia, Spain
Accepted 8 February 2002

Abstract

Many different types of sensorial methods have been proposed and used to evaluate and control the sensory quality of foods. However, not all of them are suitable for incorporation into quality control programmes. To simplify comparison, a distinction is proposed between methods that can be used to define sensory specifications or to select a product quality standard and those that can be used to check whether a product complies with stated requirements. With this approach, the appropriateness and limitations of different methods and their practical applicability, according to their use with or without a previously selected or developed standard (product, mental or written), are discussed. © 2002 Elsevier Science Ltd. All rights reserved.

Keywords: Sensory quality; Sensory analysis; Quality standards; Quality specifications

1. Introduction

The term “quality” has been used so much and in so many contexts that its meaning is frequently unclear. A number of definitions have been proposed, always with reference to the situation or problem to be solved in each case. They range from simple expressions such as “Fitness for use” (Juran, 1974) to more detailed ones like that proposed by Molnar (1995): “The quality of food products, in conformity with consumers’ requirements and acceptance, is determined by their sensory attributes, chemical composition, physical properties, level of microbiological and toxicological contaminants, shelf-life, packaging and labelling”. Any of these or many other definitions could be useful in a certain context but none of them is always satisfactory. Quoting Fisken (1990), “quality is a fuzzy and relative term and it is in a constant motion”. Due to this lack of conceptual definition, any specification, method or group of methods designed to control the quality of a certain product may be applicable in a particular situation but is subject to constant evolution. The changes arise, on the one hand, from methodological advances in each area (chemical analysis, microbiology, toxicology, etc.) and, on the other, from changes undergone

* Tel.: +34-96-3900022; fax: +34-96-3636301. E-mail address: [email protected] (E. Costell).

by market requirements and degree of commercial competition for the particular food product. Notwithstanding the lack of conceptual definition of quality, food quality control and assurance is evidently a top subject both in industry and in public and private control institutions (Gould & Gould, 1988; Herschdorfer, 1984; Juran, 1974; Kramer & Twigg, 1970; Stauffer, 1988) and continues to be a matter of discussion in both academic and industrial forums. Some of the most relevant items, advances and problems related to the definition and measurement of quality of foods have been discussed in a Special Issue of Food Quality and Preference published in 1995. Basically the utilisation of any type of method in food quality control follows a common approach: first, definition of specifications or quality standards and second, development and testing of methods to evaluate, in a reliable manner, whether or not a product complies with the requirements of the quality standards. Two questions arise with this approach: (1) Which food characteristics or properties should be included in the standard? (2) Which methods are to be used for their analysis or evaluation? The answer to the first question is usually conditioned by the need to establish a compromise between two extreme alternatives: either consider a large number of the food samples or a long list of food properties, leading to a rather complete specification but difficult to apply in practice, or else, select only those characteristics of higher incidence on quality that

0950-3293/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0950-3293(02)00020-4


E. Costell / Food Quality and Preference 13 (2002) 341–353

make it possible to decide if the food fulfils the requirements of a certain quality grade in a simpler manner. A similar situation holds for the second question: the most precise and costly methods are not always the most suitable. In general, the selection is based on the capacity of the method to measure, with sufficient precision, variations in each of the characteristics that influence product quality.

The implementation of food quality control and assurance systems, in the areas of chemical composition, microbiological and toxicological safety, and nutritional characteristics, brings up problems related to the selection of properties or characteristics to be measured and to the methods to be used. These problems are much more numerous when the system is designed to control what is known as “sensory quality”. Sensory quality is even more difficult to define because it is linked not only to food properties or characteristics but also to the result of an interaction between the food and the consumer. Besides, sensory evaluation is a rather recent discipline compared with others such as chemical or microbiological analysis. It emerged and slowly developed its methodology during the second half of the twentieth century (Costell & Durán, 1981; Costell, 2000; Larmond, 1994; Moskowitz, 1993). As a consequence, not all methods developed and used by different research teams at different times can today be considered adequate to evaluate and control the sensory aspect of quality.

The concept of sensory quality has changed with time since it was defined by Kramer in 1959 as “the composite of those characteristics that differentiate among individual units of a product and have significance in determining the degree of acceptability of that unit by the user”. Some authors centre their attention on the first part of this definition. For them sensory quality is product oriented. 
Others emphasise the second part and consider that sensory quality is consumer oriented. In the first case, quality is considered as a convention developed by experts and it may therefore be considered as constant over a limited period only (Molnar, 1995). With the second approach, quality is mainly related to consumer acceptance and is context dependent (Cardello, 1995). The product oriented approach may, in some cases, render results of doubtful practical validity since it is assumed that the opinion of a group of experts is representative of the reaction of the potential consumers of the product in question. But the second approach is not totally satisfactory either because if a specification or standard has to be established in order to define the sensory quality of a certain food product, it is not sufficient to collect acceptability data that merely give statistically significant results (Booth, 1995). In relation to the latter point it should be considered that, according to Stone and Sidel (1993), when fixing acceptable deviations of the magnitude of an attribute with

reference to a standard, one should consider not only their direct effect on acceptability but also to what extent such variability may affect consumer confidence in the product. It is evident that a constantly changing product is certain to affect consumer confidence. For example, in a recent study (Costell, unpublished data) to determine the influence of some sensory attributes on the acceptance of commercial chocolate milk beverages, no relationship was found between perceptible colour differences and consumer acceptance. Products of clearly different colours were equally acceptable. However, it is evident that manufacturers should define their colour standard and control variability between lots to avoid the negative effect that perceptible differences will have on the confidence of regular consumers.

The recentness and slow development of the discipline of sensory analysis is perhaps the cause of the lack of immediate responses to the real need for methods to measure and control sensory quality, both in industry and in control organizations. Consequently, many methods or systems, with variable scientific base, have been proposed for sensory quality control. Some of them are still used at the present time simply by inertia or habit. Every quality control technician or group has tried to solve problems on their own in the best possible way because, although some books on this matter are available (ASTM, 1996; Pangborn & Roessler, 1965; Lawless & Heymann, 1998; Moskowitz, 1983; Stone & Sidel, 1993), it was not until 1992 that Muñoz, Civille & Carr published the book Sensory Evaluation in Quality Control, the first one exclusively dedicated to this specific topic. 
For these reasons, when analysing the different alternatives proposed and used by control organizations and by different industries, the first impression is that there is a great diversity of approaches, requirements, levels of strictness and practical applicability. Still today it is generally considered that the correct application of sensory methods requires a lot of time to carry out the tests and to analyse the data, and that enough qualified assessors are not always available. To simplify comparison between methods, it is important to distinguish the sensory methodology to be used in the development of standards and specifications from that to be used to check if a product complies with their requirements. On the one hand, when establishing standards and specifications, collecting data sets from different tests and relating them, which allows for powerful modelling of the relationships between physical process and ingredient variables, the perceived attributes from descriptive profiling and, ultimately, consumer appeal, presents no problems (Muñoz & Chambers, 1993). In this case it is neither necessary nor convenient to use fast methods or to take quick decisions. On the other hand, in practical quality control, fast methods using a few assessors are needed in order to take quick decisions at a given


moment. For these reasons, the author proposes to differentiate between methods that can be used to define specifications or to select a product quality standard and those destined to decide whether a particular food item fulfils the requirements of the appropriate standard.

2. Preliminary steps

2.1. Selection or establishment of sensory quality standards or specifications

2.1.1. Sensory quality standards

The establishment or definition of the quality standards is the critical point in the implementation of a quality control programme. In practice, each company or institution must define the quality level to be controlled in a certain product and then develop a standard that fits its objectives. When dealing with foods and with sensory quality it is difficult, and often practically impossible, to obtain a product or a series of products whose sensory characteristics remain unaltered for long enough to permit their use as reference items in subsequent comparisons. Fortunately, for some attributes, such as colour or appearance, quality standards (photographs or reproductions of the food in materials like plastics or ceramic) have been used successfully when the product itself cannot be used, generally for reasons of sensory variation or alterations. In the majority of cases even this is not possible. This problem has traditionally been solved in two ways: either relying on the mental standard created by one or several experts, or developing a written standard, in which a description of the main attributes is commonly included.

2.1.1.1. Product standard. As indicated above, the use of the same product as a standard in the evaluation of the quality of raw or processed foods is almost always difficult or impossible. However, it is more frequently used in quality control of ingredients or of some raw materials. According to Muñoz et al. (1992), a control standard selected for quality purposes is a product that is used as a representation of certain characteristics (not necessarily the “optimal”) and that can easily be obtained, maintained or reproduced. The criteria for choosing a product as a control standard can be arbitrary or deliberate. 
In any case, before its selection, information must be obtained on the product variability and on its incidence on the sensory quality of the final food item. This implies the identification and quantification of the sensory attributes of the studied ingredient or raw material by using sensory descriptive techniques (profiles, QDA, Spectrum) and the determination of those attributes that influence the final product quality assessment by consumers. Acceptable variation limits for each of the attributes should


also be established. An interesting point to consider when a real product is used as a quality standard is the establishment of a clear methodology to substitute it when necessary. This need arises when the standard product is running out of stock or when the end of its shelf-life is getting near. The new standard must be sensorially identical to the previous one. This similarity should be ascertained by means of sensory discrimination or difference tests, such as the triangle test. An important consideration here is that the objective of the sensory test is not to detect differences between samples but rather to establish that they are sensorially equivalent. In this case the analyst must determine what constitutes a meaningful difference by selecting the proportion of distinguishers and then select a small value of β-risk to ensure that there is only a small chance of missing that difference if it really exists (Ennis, 1993; Meilgaard, Civille & Carr, 1999; Schlich, 1993).

2.1.1.2. Mental standard. One of the most controversial strategies in sensory quality control is to assign a quality level to a product with reference to a mental standard developed by one or a group of experts or panellists. Criticisms have been based mainly on two aspects: the lack of concordance between experts as to the mental standard applicable to a certain product, and the fallacy of assuming that the opinion of the experts represents that of the consumers. Mental standards should only be criticized if each expert or panellist operates under different criteria or mental standards, and if panellists do not evaluate products uniformly. Therefore, when a mental standard is to be used, panellists need to be trained on the criteria and product attributes that are to form the mental standard. This training provides validity and reliability to the use of mental standards and thus contributes to a sound evaluation method. 
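The equivalence check described above for replacing a product standard can be illustrated numerically. The following is a minimal sketch, assuming an exact binomial model of the triangle test in which non-distinguishers guess correctly one time in three. The panel size, the maximum tolerated proportion of distinguishers and the β-risk are hypothetical values chosen for illustration and are not taken from the cited references.

```python
from math import comb


def prob_correct(pd):
    """Probability of a correct triangle response when a proportion pd of
    assessors truly distinguishes the samples and the rest guess (1/3)."""
    return pd + (1.0 - pd) / 3.0


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(x + 1))


def max_correct_for_similarity(n, pd_max, beta):
    """Largest number of correct responses out of n that still supports
    declaring the two samples sensorially equivalent.

    If the true proportion of distinguishers were pd_max, observing this
    few correct answers (or fewer) would happen with probability <= beta.
    Returns None when n is too small to ever support that conclusion.
    """
    pc = prob_correct(pd_max)
    limit = None
    for x in range(n + 1):
        if binom_cdf(x, n, pc) <= beta:
            limit = x
        else:
            break
    return limit


# Hypothetical settings: 60 assessors, at most 30% distinguishers, beta = 0.05.
limit = max_correct_for_similarity(60, pd_max=0.30, beta=0.05)
print("Declare equivalence if correct responses <=", limit)
```

With very small panels the function returns None, which reflects the practical point that similarity testing generally needs more assessors than difference testing.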
In addition, criteria and/or products that form the mental standard should periodically be reviewed to strengthen the principles learned and the reliability of this practice. Muñoz et al. (1992) describe the procedures to form and teach sound mental standards to panellists for some of the QC/sensory methods that use these standards (e.g. “in/out” method).

Consumer opinion is affected by the context in which the food is examined and by the expectations that some external factors, such as brand or price, create (Cardello, 1995; Lawless, 1995). In principle, it is understood that experts are those individuals who possess a highly developed ability to recognize and evaluate sensory properties and detailed technical information about their companies’ products. For the most part, these individuals have been successful and their activities have constituted one of the earliest organised efforts in sensory evaluation (Stone & Sidel, 1993). One of the main problems posed by the use of experts in the evaluation of product sensory quality is that the


qualification of “expert” is not well defined. As a result, some people are considered “experts” when in fact they are not, and their personal opinion on a product is erroneously identified as a valid mental standard for assessing its quality. The period of education and training is not clearly specified and the criteria used to select experts may vary from case to case and from time to time. Consequently, the concept of quality defined by each individual's mental standard usually varies from one expert to another. Recently Guinard, Yip, Cubero, and Mazzucchelli (1999) confirmed this fact by analysing the quality scores of different types of beer given by several “experts” and observing that they clearly differed in their concept of quality. Perhaps one of the reasons for these results lies in the lack of uniformity in the training of what the authors call “brewing experts to various degrees”. As with other terms used in sensory analysis, the fight against ambiguity should start by clearly defining what “expert” means and establishing the minimum requirements in education and training that a person must possess to be qualified as such. It is advisable to consult first the ISO Standard (1994) on this subject. This Standard defines what a selected assessor, an expert assessor and a specialised expert assessor are (Table 1) and describes criteria for choosing people with particular sensory skills. It also offers principles and procedures for expanding their knowledge and abilities to the levels required for expert and for specialized expert assessors. The main contribution of this Standard is that the experts, as defined in it, will not only have the experience to be

considered a specialist in the product and/or the process and/or marketing but they will also have a verified physiological sensitivity and a wide knowledge of sensory evaluation techniques. Having these characteristics, the experts are expected to use more homogeneous quality criteria and to communicate their ratings in more precise terms. However, these advantages do not solve the other problem, that of the lack of concordance between the experts' opinion and that of consumers. In practice, the significance of this lack of concordance depends on the type of product and the level of quality that is to be assessed. To rely on a mental standard to decide the quality level (acceptable or not, good or better) of a widely consumed food product continues to be risky. Yet, when it is a matter of differentiating between good quality and exceptional quality in certain products (wine, coffee, etc.), the assessment of quality by real experts, in accordance with their mental standards, is still considered to be a valid alternative.

2.1.1.3. Written standard. The elaboration of written sensory standards to be used as references when determining a product's quality must include definitions for both the critical attributes, whose perceptible variations depend on the raw materials and on the manufacturing process, and the attributes that drive consumer acceptance. The type of standard will depend on the quality level to be controlled, on the objective of the control and on the type of product. The quality level is important: it is not the same to develop a standard designed to distinguish between acceptable

Table 1. Definition and characteristics of selected assessors, expert assessors and specialized expert assessors (a)

| Type of assessor | Definition | Characteristics |
| --- | --- | --- |
| Selected assessor | Assessor chosen for his/her ability to perform a sensory test | |
| Expert assessor | Selected assessor with a high degree of sensory sensitivity and experience of sensory methodology, who is able to make consistent and repeatable sensory assessments of various products | Good consistency of judgements, both within a session and from one session to another; good long-term sensory memory |
| Specialized expert assessor | Expert assessor who has additional experience as a specialist in the product and/or the process and/or marketing, and who is able to perform sensory analysis of the product and to evaluate or predict effects of variations relating to raw materials, recipes, processing, storage, ageing, etc. | Extensive experience in the relevant specialist field; highly developed ability to recognize and evaluate sensory properties; mental retention of reference standards; recognition of key attributes; deductive skills which may be applied to problem solving; good ability to describe and communicate conclusions or to take appropriate action |

(a) Based on ISO 8586-2 (1994).


and unacceptable products as it is to do so in order to differentiate between two acceptable products (which one is of higher quality) or even to set a standard applicable to differentiate between high quality products and those of optimal or exceptional quality. Evidently, the difficulties increase as the quality level increases because the number of critical attributes grows and their selection becomes more complex. Frequently it is difficult to locate and describe the difference between a good quality product and a better quality one, mainly because the differences found are small (Powers, 1981). This problem has not yet been satisfactorily solved but it is undoubtedly of the highest interest. According to Cardello (1997), “establishing the relationship between sensory responses and the pleasure associated with food is one of the most important and practical contributions that sensory science can make to the study of food”.

In the development of a quality standard the specific objective of the quality control programme should be taken into account. The approach differs depending on whether the objective is to design sensory quality control within a public or private control organisation, to control the quality of products to be included in a specific Designation of Origin, or to control the quality of an industrial food product that competes with other products in the market. In the first case the aim of the programme is to ensure that no inadequate products reach the consumer. Here the term “quality” is equivalent to “absence of defects”. Regulatory quality should reflect the minimum acceptable quality and form a base from which individual companies can develop their standards. 
The appropriate standard must then include a description of the most common defects in the product, comprising those defects due to inadequate characteristics of the raw materials used or to the process conditions, those resulting from incorrect or prolonged storage or even those derived sporadically from accidental causes. The description of the standards used by the Inspection Branch of the Canadian Department of Fisheries and Oceans to determine the sensory quality of fish, as reported by York (1995), constitutes a good example of criteria to be applied in the development of this type of standard. In this case, the regulatory definition of quality should ensure food safety and should be a reflection of consumer expectation of minimum acceptable quality. This regulatory sensory quality is reduced to three specific measurable characteristics: taint, decomposition and unwholesomeness. As stated by York, “Consumers have other concepts of quality such as product form, species and processing conditions (e.g. fresh vs. frozen) which are outside the mandate of regulatory quality”. A different approach has been used by various European Fisheries Research Institutes to develop an accurate and objective method for the determination of fish freshness considering that freshness is a critical quality parameter


of this product. The Quality Index Method (QIM) is based on objective evaluation of the key sensory attributes of each fish species using a points scoring system (from 0 to 3). The lower the total score, the fresher the fish. QIM procedures for 12 fish species have now been developed. As an example, a QIM scheme for cod is shown in Fig. 1. It is expected that the QIM will become the leading reference method for the quality assessment of fresh fish within the European Community (www.qim.eurofish.com). Another option is that chosen by the International Olive Oil Council (COI) to define the quality standard for virgin olive oil. This organization has recently proposed a revised method for the organoleptic assessment of virgin olive oil (COI, 1996) with the purpose of determining the criteria needed to assess the flavour characteristics of this product and developing the methodology for its classification according to the intensity of the perceptible defects. In this case, a profile sheet was defined including negative (defects) and positive attributes and unstructured continuous scales for measuring their intensity were incorporated (Fig. 2). When the objective is to control the sensory quality of products of a Designation of Origin (EEC Council, 1992), the approach is different. According to Bertozzi (1995), the denomination of a product marked with the geographic name of the zone in which it is produced includes information on the manufacturing process and on product characteristics. In this context, it is necessary to furnish objective methods to certify the typicity of every production in such a way that it can be differentiated in comparison with imitations. 
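As a rough illustration of the Quality Index Method scoring described in this section, the sketch below sums demerit points (0 to 3 per attribute) into a total index, with the lower total indicating the fresher fish. The attribute names and scores are hypothetical placeholders and do not reproduce the published cod scheme of Fig. 1.

```python
MAX_POINTS = 3  # each attribute is scored from 0 (freshest) to 3


def quality_index(scores):
    """Sum the demerit points of all attributes; a lower total means fresher fish."""
    for attribute, points in scores.items():
        if not isinstance(points, int) or not 0 <= points <= MAX_POINTS:
            raise ValueError(
                f"{attribute}: score must be an integer from 0 to {MAX_POINTS}"
            )
    return sum(scores.values())


# Hypothetical attribute names and scores, for illustration only.
sample_a = {"skin appearance": 0, "eye form": 1, "gill odour": 0, "texture": 1}
sample_b = {"skin appearance": 2, "eye form": 2, "gill odour": 3, "texture": 2}

print(quality_index(sample_a))  # 2
print(quality_index(sample_b))  # 9
```

Under these invented scores, sample_a would be judged fresher than sample_b because its total is lower.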
For these reasons, in this case, apart from considering the possible presence of common or sporadic defects in the product, the standard must include not only the attributes defining its sensory profile and those affecting acceptability but also the attributes which can establish differences with other similar products from other designations of origin. The latter additional attributes may not be necessary because frequently the differences between designations lie in differences in intensity of the same attributes rather than in different attributes. The development of this type of standard involves a lot of time consuming work including the collection and initial screening of a great number of different samples, representative of the variability of the products belonging to the designation and also the generation and selection of the attribute descriptors. From the results of the required descriptive analysis a sensory profile is finally defined which serves as the specific standard for the designation. An example of this type of standard is the Guide to the sensory evaluation of the texture of hard and semi-hard ewes’ milk cheeses (EUR, 1999). This guide includes the attributes to be evaluated, their physical and sensory definitions, their evaluation techniques and seven point intensity scales, of which three points are fixed by a standard


Fig. 1. Quality Index Method (QIM) scheme for cod.

reference product. Definitions, evaluation technique and scale for friability evaluation can be seen in Fig. 3. Finally, the development of a quality standard for commercial food products can follow a scheme almost similar to the above mentioned type of standard. The difference lies in the fact that in this case defining the quality standard requires the consideration of several points such as marketing objectives, production variability, attributes that vary, attributes that drive consumer acceptance, manufacturing conditions and available resources.

2.1.2. Sensory specifications

Broadly speaking, a sensory specification is designed to determine the acceptable or tolerable variation in a product with reference to a previously selected product or an established written standard. In the latter case the specification defines the range of intensities accepted or tolerated for each of the attributes or the range of defects included in the written standard. Specifications can be set based on management’s criteria alone and/or on consumer response. The second option provides


Fig. 2. Profile sheet for the organoleptic assessment and classification of virgin olive oil.

more realistic specifications because here the influence of the variability of the product or of each considered attribute on consumer acceptance is taken into account. A product test for establishing a sensory specification includes:

1. Selection of a group of samples showing different sensory properties and representing the actual variability in the marketplace. In some cases it is convenient to add samples showing an especially important defect.
2. Evaluation of the perceptible difference/s between each sample and the standard, either by direct comparison or by means of descriptive analyses in which the magnitude of the defects and/or attributes is evaluated.
3. Evaluation of the acceptability of samples by a large consumer panel.
4. Analysis of the relationship between the variability of the attributes or the product and the variability in consumer acceptability.

The main information thus obtained shows which attributes have a variability that influences consumer acceptance. It must be accepted that variability in the intensity of some attributes may not affect acceptability. Furthermore, the extent of the variability in an attribute is not necessarily related to the magnitude of its effect on acceptability. With this information and the particular criteria used by the institution or company, a definite sensory specification can be established. This specification includes not only those attributes affecting acceptability but also all those proposed by the responsible organisation according to its particular understanding of quality for the product studied.
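Step 4 of the procedure above, relating attribute variability to consumer acceptability, can be approached in many ways. The sketch below uses a plain Pearson correlation per attribute as one simple possibility; it is not the method of any cited author, and a linear measure will miss non-linear effects. The attribute names, panel intensities, acceptability means and the 0.8 threshold are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data for five samples: mean panel intensity per attribute
# and mean consumer acceptability of the same samples.
intensities = {
    "sweetness": [2.0, 3.0, 4.0, 5.0, 6.0],
    "colour": [1.0, 6.0, 2.0, 5.0, 3.0],
}
acceptability = [4.1, 5.0, 6.2, 6.9, 8.0]

THRESHOLD = 0.8  # assumed cut-off for calling an attribute influential

influential = [
    name
    for name, values in intensities.items()
    if abs(pearson(values, acceptability)) >= THRESHOLD
]
print(influential)  # ['sweetness']
```

In this invented data set colour varies widely without tracking acceptability, so it would enter the specification only if the organisation chose to control it for other reasons, such as consumer confidence.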


Fig. 3. Definitions, evaluation technique, scale and standard reference products for friability evaluation of hard and semi-hard ewes’ milk cheeses.

In any case, the development of standards and specifications is neither an easy nor a quick task. On many occasions the results obtained in the first study are not satisfactory and the initially proposed standard or specification must be modified. On the other hand, variations produced in the market because of changes in consumer preferences or habits, level of demand, new trends, or even changes produced in the market structure when new products are introduced, must be followed. The validity of standards or specifications may vary with time, and they must then be periodically updated to adapt them to market variations.

2.2. Selection of methods

Following the described procedure, the application of sensory methods to the development of standards and the establishment of sensory quality specifications presents no special problems. The objectives, the experimental designs, the testing conditions, the number of assessors and their level of training, the criteria for the selection of consumer panels and the statistical analysis of data are well defined in many recent texts (ASTM, 1996, 1997; MacFie & Thomson, 1994; Moskowitz, 1994; Lawless & Heymann, 1998; Meilgaard et al., 1999). The problem arises when, once the quality standard has been established and the specification of a product defined, it is necessary to use sensorial methods in order to decide if the product meets the requirements set or not. In principle, the most suitable sensorial methods are those which make it possible to measure the magnitude of variability between a product and a previously defined standard (intensity scales, quality rating or difference from control method), while difference or affective tests are not appropriate for the routine evaluation of product quality. Difference tests are too sensitive to relatively small differences and do not determine the extent of the difference, while the preferences of a small group of assessors do not represent the consumer population. In practice, the selection of the method to be used will depend on the objectives set and the characteristics of the products to be evaluated. For example, as Muñoz et al. (1992) comment, if the variable attributes are limited to five to ten key attributes, the comprehensive descriptive method is feasible, but when the product variability is not easily defined by specific sensory attributes and can be more readily reflected in the broad sensory parameters (appearance, flavor, texture), the quality rating method is a likely method of choice. In those cases when variation cannot be specifically defined by sensory attributes or when examples of unacceptable product cover a multitude of sensory conditions, the in/out method is recommended. To sum up, in each particular case, the choice of sensorial method should be made taking the following criteria into account:

1. The objective of the quality control programme.
2. The type of standard previously established.
3. Whether or not the perceptible variability of a product can be defined by specific sensory attributes and, if so, the number of parameters or sensorial attributes necessary to do so.
4. The magnitude of perceptible variability that must be detected.
5. The level or levels of quality to be assessed.


Applying these criteria it is possible to select the most suitable sensorial method to obtain the sensorial information needed in a timely and cost-effective way.

3. Sensory methods in quality control

As has been commented above, various publications have proposed different types of sensorial methods which can be used in sensory quality control of food products. According to Muñoz et al. (1992), they can be classified into eight groups: Overall difference tests; Difference from control; Attribute or descriptive tests; In/out of specifications; Preference and other consumer tests; Typical measurements; Qualitative description of typical production; and Quality grading. The problems associated with using some of these methods, such as the difference or consumer tests, have already been mentioned, but other test types also present serious limitations in their approach. This is the case, for example, with methods which have the objective of classifying a product as typical or atypical while giving no information as to the reason for the classification given. But apart from the particular limitations of each of the methods, in many cases the lack of validity of results may not be attributable to the method itself but to a defective execution of the test and/or to an incorrect analysis of the information obtained. The same sensorial method may render results of different validity depending on whether it is used with a correctly established and well-defined standard or applied directly, relying on the individual criteria regarding product quality held by a small group of company employees or a group of selected and trained assessors.

3.1. Methods involving a comparison to a standard

The objective of these methods is to evaluate the differences between the product and the corresponding standard. This involves the clear definition of terms used and of the experimental test conditions, the design of a score card, the selection and training of a panel and the selection of the method to be used to analyse the data obtained.

3.1.1. Difference from a standard or control product

There are several ways to examine the differences from a standard product.
The simplest one is to evaluate the overall degree of difference using a single scale (rating, category or unstructured) with the extremes labelled ‘‘no difference’’ and ‘‘extreme difference’’ (Fig. 4a). It is an easy and fast method, useful when the analysed product does not have complex sensory characteristics. Its objective is to distinguish between the samples showing a tolerable difference from the standard and those for which the difference is greater than the tolerance established in the corresponding specification. It is recommended that


the final decision be taken by the person responsible for quality control, according to the scores given by the judges. The judges should centre their attention on the magnitude of the perceptible differences, because the responsibility to decide may have a psychological influence on the evaluation of the differences. Another source of influence may be the knowledge of the specification by the judges. To compensate for the latter it is common practice to introduce a blind sample of the standard product to be compared with the declared standard. This is a useful method in public or government organisations, where the objective is to separate samples of low quality. In industrial quality control, this method lacks the capability of giving information on the nature of the difference, which is necessary to identify the cause and correct the difference (Aust, Gacula, Beard, & Washam, 1985; Muñoz et al., 1992).

A more informative method consists in selecting the most important attributes in the product sensory quality and evaluating the differences from the standard for each attribute (Fig. 4b). A difference higher than the specified tolerance in any attribute will mean that the product is out of specification. With this information, corrections can be introduced when necessary. Still, this method detects the magnitude of the differences in attributes but not their direction. A possible alternative, successfully used in some cases (Costell, unpublished data), is to design a scale similar to the “just-right scale”, in which the central point corresponds to the standard product (Fig. 4c). With this scale, information is obtained not only on the magnitude but also on the direction of the differences from the standard in the different attributes. This procedure may be of interest when the objective is to evaluate the effect of a change in the formulation of a certain product on its sensory quality and the direction of the possible change in any quality attribute is not predictable.
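The attribute-by-attribute comparison with a directional scale can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute names, the signed scale centred on the standard (0 means no difference), the tolerance values and the use of the panel mean as the summary are all hypothetical and are not taken from the paper. The mean also presumes interval-type data; for an ordinal scale another summary would be needed.

```python
from statistics import mean

# Hypothetical signed ratings on a directional scale centred on the standard:
# 0 = same as the standard, negative = less intense, positive = more intense.
panel_scores = {
    "sweetness": [1, 0, 1, 2, 1],
    "firmness": [-3, -2, -3, -2, -2],
}

# Hypothetical tolerances: maximum tolerated absolute mean deviation.
tolerances = {"sweetness": 1.5, "firmness": 1.0}


def compare_with_standard(scores, limits):
    """Return (in_spec, report) for per-attribute ratings against a standard."""
    report = {}
    for attribute, ratings in scores.items():
        deviation = mean(ratings)
        if deviation > 0:
            direction = "higher than standard"
        elif deviation < 0:
            direction = "lower than standard"
        else:
            direction = "same as standard"
        report[attribute] = {
            "mean_deviation": deviation,
            "direction": direction,
            "within_tolerance": abs(deviation) <= limits[attribute],
        }
    # The product is out of specification if any attribute exceeds its tolerance.
    in_spec = all(item["within_tolerance"] for item in report.values())
    return in_spec, report


in_spec, report = compare_with_standard(panel_scores, tolerances)
for attribute, item in report.items():
    print(attribute, item)
print("in specification:", in_spec)
```

With these made-up figures the product is flagged because of firmness alone, and the sign of the deviation shows in which direction a correction would be needed.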
Besides the type of scale used, the quality of the information obtained depends on the degree of training of the judges and their knowledge of the product, on the test conditions and on the correct analysis of data as a function of the type of scale (ordinal or interval scale) used. A large trained panel (30–40) is appropriate when only the degree of difference from the control is to be evaluated. When additional information about differences on specific attributes is required, a smaller and more highly trained panel is recommended.

3.1.2. Difference from a mental standard

As commented above, the use of a mental standard by one or several experts to define the quality of a food product presents two serious problems, derived from the possible difference between the mental standards used by the experts and from the fact that their opinions are not representative of consumers’ opinion. In principle, based on these considerations, it would not be advisable



Fig. 4. Different types of scales for: (a) overall difference from standard product ratings; (b) difference from control for selected attributes and (c) directional difference from control for selected attributes.

to use this method to evaluate the quality of certain products. But this position must be reconsidered. When an expert or a group of experts with a recognised ability to evaluate the magnitude of the perceptible differences between products and a profound knowledge of the product and its manufacturing process is available, there are situations in which their performance is not only admissible but advisable. One of these situations is when the characteristics of the product will not be directly evaluated by the consumer, such as when dealing with raw materials or ingredients, or when only preliminary information is sought on the effects of formulation, process or storage conditions on the product quality. Another situation where quality evaluation by experts is appropriate is when the objective is to evaluate differences between quality grades of products of exceptional sensorial characteristics, such as wine, coffee or olive oil, in which small differences between high quality levels may be decisive in their

market price. These differences can hardly be detected by naive consumers.

3.1.2.1. In/Out method. This is the simplest method to compare product quality with a mental standard by experts. It is mainly used to identify products that show clear deviations (presence of off-notes or other defects) from the normal production. It can be recommended for the evaluation of raw materials or relatively simple finished products. Its advantages are the simplicity and the direct use of results obtained. The main disadvantage is its inability to provide descriptive information and therefore its lack of direction and actionability to fix problems (Muñoz et al., 1992). The validity of the information provided depends on whether the “experts” are indeed genuine experts. If this method is used by one or a small group of people in a company who do not possess the necessary expertise, each of them makes decisions based on his individual experience and on his


product knowledge. This situation leads to highly variable and subjective information.

3.1.2.2. Overall quality rating method. This method consists in assessing the quality of a product according to established quality criteria. Samples are rated using a single quality scale anchored with “very poor” and “excellent”. A product is rejected when the quality ratings are low. Initially it was considered that a quality rating test represents a combination of affective and descriptive tasks (Sidel, Stone, & Bloomquist, 1981). Besides this duality, other problems arise in data treatment because the quality scores are not clearly based on psychophysical measurements (Lawless, 1994). Another approach consists in considering the product quality as an integrated impression, like the acceptability or pleasure experienced when consuming a food or drink. Using this criterion, the evaluation of product quality with a unidimensional scale may appear logical. This implies accepting that quality and acceptability are not concepts of an exclusively sensorial character (Costell, 2001). As stated by Cardello (1997): “In psychological terms, pleasure and displeasure, liking and disliking, are not sensory phenomena, although they accompany most sensory stimuli. Rather, pleasure and displeasure are emotional experiences. They are conscious cognitions that accompany the somatic effects of emotions”. Based on this approach it can be accepted that a group of experts, sharing a common mental standard, may successfully judge a product quality grade. From a quality control standpoint, this method has the disadvantage of producing an integrated judgement that may not be actionable and useful for product documentation or guidance. In an effort to overcome the above-mentioned problem, other methods have been proposed, in which a scale to evaluate overall quality and other scales to evaluate the attributes’ quality or their intensities are included in the same scorecard (Muñoz et al., 1992).
This scheme is apparently similar to that described above to evaluate the perceptible differences from a control product but the situation is not the same. Even assuming that the experts possess a solid mental standard of the product quality, it is hard to make them pay attention to the overall quality and to the quality or intensity of attributes in the same session. They are thus obliged to perform functions that require different mental attitudes, which can produce erroneous results. The use of this type of scorecard by experts to evaluate product quality is therefore not recommended.

3.1.3. Difference from a written standard

These types of tests are among the most frequently used in quality control. Essentially they consist in the evaluation of the intensity of different attributes and/or defects or the evaluation of quality grade using a scorecard. The


information gathered during the previous development of the standard and the establishment of the specification is collected in the scorecard, according to different criteria and in different ways. Several alternative procedures have been used but, practically, only one of them is in use at present: the quality grading test.

3.1.3.1. Quality grading method. This method has been one of the most popular sensory tests used in quality control and consists in developing a scorecard that includes a scoring system with points assigned for each grade and a description of sensory characteristics defining quality for each grade. The scorecard is composed of ordinal scales using discrete numbers and contains the description of the characteristics. The scale amplitude may be 3, 6 or 9 points. The upper third of the scale includes a detailed description of the intensity of each attribute corresponding to a high level of quality, the middle third the description corresponding to an acceptable quality and the lower third that corresponding to rejectable quality. Frequently a scale is designed for each basic sensory attribute, e.g. appearance, colour, flavour and texture. The judges give scores to each attribute and when a product is assigned a score in the lower third of the scale, it should be rejected (ISO, 1987). This test allows for a rapid grading of the product and for the detection of the possible causes of rejection. However, this test requires a group of very well trained judges who can correctly interpret the descriptions corresponding to each of the quality grades for each of the selected attributes. An important problem here is that the judges are obliged to carry out an analytical task simultaneously with a grading task, which may produce deviations in the results, as commented above.
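The decision rule of the quality grading scorecard (a score in the lower third of the scale leads to rejection) can be illustrated with a short Python sketch. The scores below are invented, the scale is assumed to run from 1 to 9 in whole points, and summarising each attribute by the panel median is a choice made here for illustration, not something prescribed by the text or by ISO (1987).

```python
from statistics import median_low

SCALE_POINTS = 9  # the text mentions 3, 6 or 9 points; 9 is assumed here


def grade_band(score, scale_points=SCALE_POINTS):
    """Map a score on a 1..scale_points ordinal scale to its third."""
    if scale_points % 3 != 0:
        raise ValueError("scale must divide into three equal thirds")
    if not 1 <= score <= scale_points:
        raise ValueError("score outside the scale")
    third = scale_points // 3
    if score <= third:
        return "rejectable"
    if score <= 2 * third:
        return "acceptable"
    return "high"


# Hypothetical scores from five judges for each basic sensory attribute.
panel_scores = {
    "appearance": [8, 7, 8, 9, 7],
    "colour": [6, 5, 6, 6, 7],
    "flavour": [3, 2, 3, 4, 3],
    "texture": [5, 5, 6, 4, 5],
}

# median_low always returns one of the observed scores, so the summary
# stays on the ordinal scale instead of producing an in-between value.
results = {
    attribute: grade_band(median_low(scores))
    for attribute, scores in panel_scores.items()
}
rejection_causes = [a for a, band in results.items() if band == "rejectable"]
rejected = bool(rejection_causes)

print(results)
print("rejected:", rejected, "because of:", rejection_causes)
```

In this invented case the product is rejected and flavour is identified as the cause, which is the kind of rapid diagnosis the scorecard is meant to give.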
Finally, it should be taken into account that the data from this test are of an ordinal nature, which calls for the use of non-parametric statistical methods for analysing the data obtained.

3.2. Methods of evaluation without a standard

3.2.1. Descriptive method

This method consists in having a well-trained sensory panel provide data on a set of the product’s sensory attributes. During the initial development of the sensory standard a number of attributes are selected. Some of them are selected because their variations affect product acceptability by consumers and others may be introduced by the industrial company on the basis of their interest in connection with the identity and/or desired image of the product in the market. In the establishment of the corresponding specification their tolerable variability is fixed. Evaluation of quality with this descriptive method consists basically in the evaluation of the intensity of each attribute by a trained panel using descriptive profiling (conventional



profile, QDA, Spectrum). The person responsible for quality control then studies the results obtained from the statistical analysis of experimental data and makes the final decision based on the sensory specification previously established. In this case, the specifications are represented by the range of intensities tolerated for each attribute. Products whose intensity on any given attribute falls outside specifications are considered unacceptable. For example, suppose a company wants to assess the quality of the virgin olive oil it produces, in accordance with the standard proposed by the COI (Fig. 2). The first step should be the selection and training of the panel. The training comprises the definition, evaluation procedures and magnitude scoring of each of the attributes and of the defects included in the scorecard (COI, 1996). Once it has been ensured that the panel works well, the comparison between the panel results and the specifications set for each of the product attributes can be used to make decisions regarding the product quality. As stated by Muñoz et al. (1992), the two main advantages of this approach are the absence of any subjectivity in the evaluation and the quality of the data obtained. The main disadvantages are the time and cost necessary to train and calibrate the panel and the time necessary to perform the test and to analyse the data. This test and the corresponding data analysis can be simplified by using the software available (Punter, 1994). The method described is not suitable for solving some particular problems that require an immediate decision. In this case one possibility may be to perform a reduced version of the profile. Once the panel is trained on the whole profile (10 to 15 attributes), a small group of judges may be selected to evaluate the most important attributes (4–5). This simplification may allow its use in daily quality control.

3.2.2. Other methods

Many of the merited criticisms of sensory methods used in quality control originate from the lack of a previously developed standard or an established specification. It should also be considered that the standards and specifications are developed for a specific situation (industry, public or private organisation, etc.). The use of some methods (In/Out, Quality Rating, etc.) without a previous standard or specification affords results of doubtful validity. It is especially important to note that the development of a quality grading system without a previous study of the relations between the variations in attributes and product acceptance may lead to the construction of scorecards without any practical value. It is also important to point out that the use of these methods in food research to compare products or to study the effects of processing conditions must be avoided. The evaluation of the sensory quality of any product by a group of 10 to 20 more or less trained panellists in a laboratory without a standard or specification has

the same doubtful value as their opinion on product acceptance.

Finally, something must be said about the quality evaluation methods based on what is known as the “complete scorecard”. These scorecards include evaluations for different sensory categories such as appearance, flavour, texture, etc., as well as for some specific attributes like sourness or viscosity, and a variable number of “quality points” is assigned to each one of them. The sum of points obtained determines the product quality. An alternative method consists in assigning scores to the intensity of different attributes, multiplying them by different factors according to their importance and adding them up to get a product score. One of the better known methods of this type is the U.C. Davis 20-point wine scoring system described in 1981 by Amerine and Roessler (Lawless, 1995). These methods were once very popular and were adopted by some industrial firms and control organisations. They would appear to make it possible to express the quality of a product with a single number. But in practice they present several problems. They have been criticised on many occasions because the weight of each attribute has been arbitrarily assigned and the product quality is taken as the sum of the scores given to a limited number of attributes. On the other hand, the scales used to evaluate the intensity of the different attributes do not always have sensorially equivalent magnitudes. This also means that the validity of the information obtained when each score is multiplied according to a previously established weighting factor is questionable, even when this factor has not been established in an arbitrary manner. For all of these reasons this type of test is not recommended for assessing the quality of a product.
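The weakness of single-number scorecards can be made concrete with a small Python sketch. Everything in it is hypothetical: the two products, their attribute scores and both sets of weights are invented for illustration, and the code does not reproduce the U.C. Davis system or any other published scorecard.

```python
def scorecard_total(points):
    """Unweighted 'complete scorecard': the quality is the sum of the points."""
    return sum(points.values())


def weighted_total(intensities, weights):
    """Weighted variant: each attribute score is multiplied by its factor."""
    return sum(weights[attribute] * score for attribute, score in intensities.items())


# Two hypothetical products with clearly different sensory profiles.
product_a = {"appearance": 4, "flavour": 2, "texture": 4}
product_b = {"appearance": 2, "flavour": 5, "texture": 3}

# Two equally arbitrary sets of weighting factors.
flavour_led = {"appearance": 1, "flavour": 2, "texture": 1}
appearance_led = {"appearance": 2, "flavour": 1, "texture": 1}

total_a = scorecard_total(product_a)
total_b = scorecard_total(product_b)
print("unweighted totals:", total_a, total_b)

flavour_led_scores = (
    weighted_total(product_a, flavour_led),
    weighted_total(product_b, flavour_led),
)
appearance_led_scores = (
    weighted_total(product_a, appearance_led),
    weighted_total(product_b, appearance_led),
)
print("flavour-led weights:", flavour_led_scores)
print("appearance-led weights:", appearance_led_scores)
```

The unweighted totals are identical although the profiles differ, and the ranking of the two products reverses when one arbitrary set of weights is swapped for the other, which is the practical form of the criticism made above.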

4. Concluding remarks

In accordance with what has been stated above, we can conclude that not all methods proposed for evaluating the sensorial quality of food products are suitable for incorporation in quality control programmes. Difference or preference tests, typical measurements and those methods based on “complete scorecards” are the least appropriate, while difference from control methods and descriptive methods are the most sound sensory tests for quality control purposes. Other methods such as In/Out, Quality rating and Quality grading may be used in particular situations. The characteristics of each product, the degree or level of quality that it is wished to control and the resources available condition the choice of method to be used. On the other hand, it should be borne in mind that designing an effective programme for testing the sensorial quality of a product is based on the following points: (a) The selection of the


sensory quality standard; (b) The establishment of the sensory specification; (c) The selection of a method to evaluate differences between the product and the corresponding standard; and (d) The selection, training and maintenance of the panel. The practical value of the information obtained will be determined by the correct fulfilment of these requirements.

Acknowledgements

To Ministerio de Ciencia y Tecnología of Spain (Project AGL 2000–1590). The author acknowledges Dr. Luis Durán for revision of the manuscript and helpful observations and Alejandra Muñoz for constructive comments.

References

ASTM. (1996). Sensory testing methods: Second Edition. MNL 26. Philadelphia: American Society for Testing and Materials.
ASTM. (1997). Relating consumer, descriptive and laboratory data. MNL 30. Philadelphia: American Society for Testing and Materials.
Aust, L. B., Gacula, M. C., Beard, S. A., & Washam II, R. W. (1985). Degree of difference test method in sensory evaluation of heterogeneous product types. Journal of Food Science, 50, 511–513.
Bertozzi, L. (1995). Designation of origin: quality and specification. Food Quality and Preference, 6, 143–147.
Booth, D. A. (1995). The cognitive basis of quality. Food Quality and Preference, 6, 201–205.
Cardello, A. V. (1995). Food quality: conceptual and sensory aspects. Food Quality and Preference, 6, 163–168.
Cardello, A. V. (1997). Pleasure from food: its nature and role in sensory science. Cereal Foods World, 42, 550–552.
COI. (1996). Organoleptic assessment of virgin olive oil. COI/T.20/Doc. no 15/Rev. 1. International Olive Oil Council.
Costell, E., & Durán, L. (1981). El análisis sensorial en el control de calidad de los alimentos. Revista de Agroquímica y Tecnología de Alimentos, 21, 1–10.
Costell, E. (2000). Análisis sensorial: evolución, situación actual y perspectivas. Industria y Alimentos Internacional, 2, 34–39.
Costell, E. (2001). La aceptabilidad de los alimentos. Nutrición y placer. Arbor, 661, 65–85.
EEC Council. (1992). Council Regulation 2081/92. 14 July 1992. Official Journal of the European Community. Luxembourg.
Ennis, D. M. (1993). The power of sensory discrimination methods. Journal of Sensory Studies, 8, 353–370.
EUR. (1999). A guide to the sensory evaluation of the texture of hard and semi-hard ewes’ milk cheeses. No 18829. Official Publications of the European Communities. DG XII, Brussels.
Fisken, D. (1990). Sensory quality and the consumer: viewpoints and directions. Journal of Sensory Studies, 5, 203–209.
Gould, W. A., & Gould, R. W. (1988). Total quality assurance for the food industry. Baltimore: CTI Publications Inc.
Guinard, J. X., Yip, D., Cubero, E., & Mazzucchelli, R. (1999). Quality ratings by experts, and relation with descriptive analysis ratings: a case study with beer. Food Quality and Preference, 10, 59–67.
Herschdoerfer, S. M. (1986). Quality control in the food industry. New York: Academic Press.
ISO. (1987). Sensory analysis. Methodology. Evaluation of food products by methods using scales. International Standard no. 4121. Switzerland: International Organization for Standardization.
ISO. (1994). Sensory analysis. General guidance for selection, training and monitoring of assessors. Part 2: Experts. International Standard no. 8586–2. Switzerland: International Organization for Standardization.
Juran, J. M. (1974). Quality Control Handbook (3rd ed.). New York: McGraw-Hill Book Co.
Kramer, A. (1959). Glossary of some terms used in the sensory (panel) evaluation of foods and beverages. Food Technology, 13, 733–738.
Kramer, A., & Twigg, B. A. (1970). Quality control for the food industry (3rd ed.). Westport, Connecticut: The AVI Publishing Co.
Lawless, H. T. (1994). Getting results you can trust from sensory evaluation. Cereal Foods World, 39, 809–814.
Lawless, H. T. (1995). Dimensions of sensory quality: a critique. Food Quality and Preference, 6, 191–199.
Lawless, H. T., & Heymann, H. (1998). Sensory evaluation of food. New York: Chapman & Hall. International Thomson Publishing.
Larmond, E. (1994). Is sensory evaluation a science? Cereal Foods World, 39, 804–808.
MacFie, H. J. H., & Thomson, D. M. H. (1994). Measurement of food preferences. London: Blackie Academic & Professional.
Meilgaard, M., Civille, G. V., & Carr, B. T. (1999). Sensory evaluation techniques (3rd ed.). Boca Raton, Florida: CRC Press.
Molnar, P. J. (1995). A model for overall description of food quality. Food Quality and Preference, 6, 185–190.
Moskowitz, H. R. (1983). Product testing and sensory evaluation of foods. Westport, Connecticut: Food & Nutrition Press.
Moskowitz, H. R. (1993). Sensory analysis procedures and viewpoints: intellectual history, current debates, future outlooks. Journal of Sensory Studies, 8, 241–256.
Moskowitz, H. R. (1994). Food concepts and products. Just-in-time development. Trumbull, Connecticut: Food & Nutrition Press, Inc.
Muñoz, A. M., & Chambers IV, E. (1993). Relating sensory measurements to consumer acceptance of meat products. Food Technology, 47, 128–131, 134.
Muñoz, A. M., Civille, G. V., & Carr, B. T. (1992). Sensory evaluation in quality control. New York: Van Nostrand Reinhold.
Powers, J. J. (1981). Multivariate procedures in sensory research: scope and limitations. MBAA Technical Quarterly, 18, 11–21.
Punter, P. H. (1994). Software for data collection and processing. In J. R. Piggott, & A. Paterson (Eds.), Understanding natural flavors (pp. 97–111). London: Blackie Academic & Professional.
Schlich, P. (1993). Risk tables for discrimination tests. Food Quality and Preference, 4, 141–151.
Stauffer, J. E. (1988). Quality assurance of foods. Westport, Connecticut: Food & Nutrition Press, Inc.
Sidel, J. L., Stone, H., & Bloomquist, J. (1981). Use and misuse of sensory evaluation in research and quality control. Journal of Dairy Science, 64, 2296–2302.
Stone, H., & Sidel, J. L. (1993). Sensory evaluation practices (2nd ed.). New York: Academic Press, Inc.
York, R. K. (1995). Quality assessment in a regulatory environment. Food Quality and Preference, 6, 137–141.