Chapter 7
Triangle Test
Cécile Sinkinson
JTI (Japan Tobacco International), Geneva, Switzerland
The triangular or triangle test is a discrimination test designed primarily to determine whether or not a perceptible sensory difference exists between two products. It uses trial sizes and processes to generate sufficient good quality data for statistical analyses. From these analyses firm conclusions can be drawn, and unbiased information provided to allow the correct business decisions to be made. The triangle test in its original form was developed in the 1940s, in the laboratories of Joseph E Seagram & Sons, to monitor the production quality of whiskeys (Peryam, 1950). It was used and reported by Helm and Trolle (1946) as a method to select panel members for the assessment of beers. Since then it has been used for different research objectives and for a multitude of products from the food and drink industry such as broccoli (Jacobsson et al., 2004), mustard (Rousseau et al., 1999), wine (Sauvageot et al., 2012), and vinegar (Tesfaye et al., 2002), but also from the household and personal care industry, such as fragrances (Allen et al., 2015) and many more. It is used in most fast-moving consumer goods companies as the standard test method, often as a matter of course, even when it is not the most suitable choice for the objective or action standard. There has been much interest in the procedure, power, and replications for the triangle test (MacRae, 1995; Brockhoff and Schlich, 1998; Angulo et al., 2005; Lee and O’Mahony, 2006; Ennis and Jesionka, 2011). Essential reading for anyone using the triangle test on a regular basis is the paper entitled “Who Told You the Triangle Test was Simple” (O’Mahony, 1995), which describes the analysis and the many pitfalls in this test.
1. TEST PRINCIPLE Three samples, two of which are the same, are presented simultaneously to each panelist who is then required to identify the one that is different from the other two. There are six possible serving orders: AAB, ABA, BAA, BBA, BAB, and ABB, which should be randomized across all panelists to prevent
Discrimination Testing in Sensory Science. http://dx.doi.org/10.1016/B978-0-08-101009-9.00007-1 Copyright © 2017 Elsevier Ltd. All rights reserved.
154 PART j II Methods and Analysis in Discrimination Testing: Practical Guidance
psychological errors due to position, as often the temptation is to perceive the second sample as the different one. This test has a statistical advantage over the “directional difference” test when differences are small (or there is no difference) because the panelist can guess correctly only one-third of the time (p = 0.333). The appropriate statistical test to use is a one-tailed binomial test (O’Mahony, 1986). Interpretation is based on the minimum number of correct answers required for significance at a predetermined significance level, given the total number of responses received. The minimum number of correct answers is found in statistical tables (BS ISO 4120:2004; ASTM E1885-04, 2007; Meilgaard et al., 2015; Stone and Sidel, 2013), which are also available in Appendix 2. The three samples are marked with three-digit random codes, and the panelists are asked to taste/evaluate the products in the order presented from left to right and identify the odd one. Panelists could be asked to further describe the nature of the difference they perceived. If comments are gathered, only those from assessors who answered correctly should be used. These comments should not be analyzed but may be used as qualitative information for the sensory scientist to identify trends.
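The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration and not a substitute for the published tables: the function names `upper_tail` and `min_correct_for_difference` are hypothetical, and the sketch assumes independent responses with a guessing probability of 1/3.

```python
from math import comb

def upper_tail(n, x, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def min_correct_for_difference(n, alpha=0.05, p_guess=1/3):
    """Smallest number of correct answers that is significant at level alpha
    in a one-tailed binomial test against chance performance (p_guess)."""
    for x in range(n + 1):
        if upper_tail(n, x, p_guess) <= alpha:
            return x
    return None  # no count reaches significance (very small n)

print(min_correct_for_difference(24, alpha=0.05))
```

With 24 responses and alpha = 0.05 the sketch returns 13, the critical value used in Case Study 1 of this chapter.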
2. WHY AND WHEN TO USE IT The test is useful to evaluate overall product differences, to determine the effect of a change in ingredients, packaging, processing, handling or storage conditions. It is also useful as an initial tool for screening/validating panelists during the recruitment process. The method is only applicable if there is a small or subtle difference between the products; if the difference is large or easily noticeable, a discrimination test is not the appropriate methodology. Therefore, the sensory professional or project team should screen the products first. In addition, the only differences between the products should be due to the modification that is being studied; if there are other variations, e.g., different production methods, different raw materials, lack of homogeneity within products, etc., it will not be possible to accurately identify whether the differences perceived between the products are due to the designed change being studied or these other variations. Of course, if the project objective involves a different production method or a change in a raw material, then this is acceptable but should be the only difference. The overriding aim is to maximize the chances of finding a significant difference between two products for the factors of interest. For food and non-food products, product characteristics and performance may change over time, and the triangle test may be difficult to set up for practical reasons. For example, the application of two deodorants can be compared, but their performance may differ over time, and repeated assessments over time are limited for practical reasons.
Triangle Test Chapter j 7
For non-food products the use of the triangle test remains limited. The application of the test has some practical constraints for products such as lipsticks, mascaras, or shampoos, which need to be applied at the same time. The triangle test should not be used for products that cause excessive sensory fatigue, for example, products that decrease the ability to perceive, such as products containing alcohol or nicotine. It should not be used for products that require high concentration to evaluate, such as minty products, or for products that give rise to carryover, such as mentholated or spicy products with a lingering taste or sensation. For these types of products, either the rest time between product evaluations can be increased or another discrimination test can be chosen. Using the same-different test is recommended rather than extending the rest time. The same-different test is discussed in Chapter 5 of this book.
3. ADVANTAGES AND DISADVANTAGES This test is considered one of the basic discrimination tests and is widely used in the industry. Also referred to as “the unspecified triangle test”, the triangle test does not give any indication of the direction of the difference, such as the identification of a specific sensory attribute (BS ISO 4120:2004), or of the extent or magnitude of that difference. The sensory professional should not be tempted to draw conclusions about the magnitude of the difference from the significance level or the probability (p-value) from the analysis (Lawless and Heymann, 2013). The test can encompass a large number of panelists and a large quantity of products. Ennis and Rousseau (2012) introduced the tetrad method as a way of reducing costs. Some major companies such as General Mills have decided to move from the triangle test method to the tetrad test to reduce the cost (Gelski, 2013). Based on currently published research, the tetrad method also possesses statistical advantages over the triangle and would require fewer panelists, reduced testing time, and would use less product material (ASTM, 2011; Ennis and Rousseau, 2012; Ennis, 2013). However, the tetrad test also introduces a fourth item into the testing procedure, thereby increasing the risk of adaptation (reduced sensitivity resulting from repeated presentation) and sensory fatigue, and therefore it is not always the most sensible choice for some products.
4. TERMS AND DEFINITIONS (BS ISO 4120) Difference: When products can be differentiated because of perceptible sensory characteristics between them. Similarity: When products cannot be differentiated because the difference in sensory characteristics between them is too small to be perceived.
4.1 Risks Analysis of the data from the triangle test is based on probability, and the conclusion that can be drawn depends on the risk the test requester is prepared to take. The risks are as follows:
4.1.1 Alpha Risk (α Risk) Alpha risk is the risk of concluding that a perceptible difference exists between the two products, when in truth they are the same (false positive). This risk should be minimized when your objective is to determine “a difference” between two products. Its generally and arbitrarily used target is α = 0.05 (5%). However, some sensory professionals are more cautious and choose 1% as their level of significance, particularly if any changes made, based on the test result, might have a major effect on the product/business. The alpha risk is also referred to as a type I error, significance level, or false positive rate. BS ISO 4120 provides a table to determine the number of correct answers to conclude that there is a perceptible difference between products (Table A2.13 in Appendix 2).
4.1.2 Beta Risk (β Risk) Beta risk is the risk of concluding that no perceptible difference exists between the two products, when in truth they are different (false negative). This risk should be minimized when your objective is to determine the similarity of two products. The usual level of acceptable risk for a false negative is set at β < 0.05 (5%) and arises from a compromise between the robustness of the test and the number of panelists used (Table A2.13, in Appendix 2). The beta risk is also referred to as type II error or false negative rate (ISO 4120). BS ISO 4120 provides a table to determine the number of correct answers to conclude that no meaningful difference exists between the products (Table A2.12, in Appendix 2).
4.2 N N is the total number of independent results required. If there is no replication used, then this is also the minimum number of panelists. The minimum number given is the absolute minimum needed to meet the confidence levels set for the test.
4.3 Pd For similarity, Pd is the maximum proportion of “distinguishers” that the sensory professional can tolerate being able to detect a difference between products.
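To make the link between Pd and the two risks concrete, the following Python sketch assumes the usual guessing model: distinguishers always pick the odd sample, and everyone else guesses correctly one-third of the time, so the expected proportion of correct answers is pc = Pd + (1 - Pd)/3. Under that assumption the beta risk of a difference test can be estimated for a given N, alpha, and Pd. The function names are hypothetical and the sketch is illustrative only.

```python
from math import comb

def upper_tail(n, x, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

def proportion_correct(pd, p_guess=1/3):
    """Expected proportion of correct answers when a proportion pd of
    assessments truly detect the difference and the rest are guesses."""
    return pd + (1 - pd) * p_guess

def beta_risk(n, alpha, pd, p_guess=1/3):
    """Probability of missing the difference (fewer correct answers than
    the critical count) when the true proportion of distinguishers is pd."""
    x_crit = next((x for x in range(n + 1)
                   if upper_tail(n, x, p_guess) <= alpha), None)
    if x_crit is None:
        return 1.0  # significance can never be reached with this n
    return 1 - upper_tail(n, x_crit, proportion_correct(pd, p_guess))

print(round(proportion_correct(0.30), 3))
print(beta_risk(24, 0.05, 0.30))
```

A larger Pd or a larger N lowers the beta risk, which is why similarity tests, where a small Pd must be ruled out, need more assessors.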
4.4 Action Standard An unambiguous statement explaining what action will be taken based on the results of the study.
5. SETTING UP THE TEST
5.1 Panelists’ Instructions The panelists need to be instructed on how to evaluate the products. It is usually better to perform the evaluation quickly so that the three samples can be compared effectively. However, it is recommended to take a small break between consecutive products, especially for tastes and fragrances that linger in the mouth or nose. As an example, for products with a mild flavour a one to two minute rest may be appropriate, and for flavoured gums a three minute chew followed by a three minute rest could be a regime to adopt. For each project, it is advisable to assess the rest time necessary between products.
5.2 Palate Cleansers For many food tests, panelists are required to palate cleanse with water and unsalted crackers before commencing tasting and between each product. For some specific food products, water and crackers may be insufficient to clear the palate. In these cases it may be necessary to experiment with other suitable palate cleansers, for example, warm water and a slice of unsweetened apple. This combination can be efficient to remove the mouth coating after a fatty product. For a sniff test, a two minute rest would be advised with the instruction of resetting the sense of smell by smelling the panelist’s own skin.
5.3 “No-Difference” Option The triangle test is a “forced-choice” method, and panelists are required to respond even if they guess. They are not allowed to report “no difference.”
5.4 Retasting Products For food tests, most sensory professionals leave panelists the freedom to retaste products as many times as necessary before choosing the odd product. It was verified that for some products, retasting allowed for better discrimination by panelists (Caroselli, 2012; Ishii et al., 2013; Rousseau and O’Mahony, 2000). When carrying out the test, some panelists may find retasting useful to confirm their first impression. However, by nature, retasting products increases the true number of tested samples in the test and therefore increases the psychological strain associated: fatigue and adaptation occur as a
result of retasting. Retasting of products should be done in the order of the original design; otherwise the integrity of the test may be compromised. For some non-food tests reassessment can be difficult. For example, with cosmetic creams, three similar skin sites are necessary to conduct the assessment, making it difficult to reassess the products. Conversely, when evaluating something such as crockery shine (if testing washing-up liquids) or cloth foldability (if testing fabric conditioners), the products can be reassessed very easily.
5.5 Additional Information Given to Panelists During the evaluation session, any information about product identity, expectations of the outcome, or individual feedback must be avoided until the test is complete. This is vital if the panelist is likely to replicate the test, and it minimizes the risk of panelists sharing information, thereby avoiding psychological errors and biases due to expectations. For example, in a test examining product differences resulting from a bean or nut roasting process, information on the roasting process itself may bias the panelist’s conclusions.
5.6 Testing Environment Panelists should be carrying out the test in a comfortable and relaxed environment free from external stimuli, such as extraneous odors and noise that could distract and bias the panelists. It is advised to follow some general guidance for the design of the test rooms (ISO 8589:2007, Sensory analysis).
5.7 Action Standard This should be agreed with the client before the test is run. The project objective must be known because this affects the action standard; the following guidance should help. If the outcome of the trial product will be promoted to the consumer, for example a new or improved recipe, either on packaging or in advertising, then we want to find “a difference” and also want the consumer to be able to notice the difference. In this case our action standard is, “if there is a significant difference between Product A and Product B, then the improved product will be launched.” If the outcome of the trial product is not to be promoted to the consumer, for example in the case of value optimization, a substituted ingredient, or a different supplier, then we want to find “no difference” and do not want the consumer to notice. In this case our action standard is, “if there is no significant difference between Product A and Product B, then the modified product can be launched.”
This must be an unambiguous statement so that at the end of the analysis it can be clearly stated that the action standard has or has not been met.
Case Study 1: Testing for “Difference”
Issue: A company produces chocolate bars with inclusions. The inclusions have been changed from the current one to be of better quality. From previous work, no specific attributes have been identified as perceivably different, but the client wishes to know if there is an overall difference.
Project Activity: The test product with improved inclusions was evaluated against the control product. The three samples were blind coded; two of the three samples were the same, and the third was different. The panelists were given the three samples to test in a set order and instructed to select the odd sample.
Assessors: Twenty-four panelists were available. Alpha = 0.05 is chosen to keep the risk of concluding a difference when it does not exist low.
Action standard: If there is a perceivable difference between the two products the new chocolate bar with inclusions will be launched.
Results: Out of 24 responses, 14 were correct. Using Table A2.13, in Appendix 2, we can see that the minimum number of correct responses is 13 to establish the difference between the products at a significance level of 5%. Therefore, there is evidence that the chocolate bar with improved inclusions is perceivably different from the control. The action standard has been met and therefore the product can be launched.
Collecting the Comments: Comments are only collated for panelists who have correctly identified that the products are different and need to be summarized and presented in a table. In this particular case comments relating to the standard product and those relating to the test product were split as shown below.
In addition, the attributes were listed in modality (appearance, aroma, flavor, texture/mouth feel, and aftertaste) and frequency as shown below:
Standard:
Weaker flavor (2)
More bitter (2)
More metallic (1)
Harder to chew (3)
Less gritty, crunchy (1)
More body (1)
Test:
Higher flavor (3)
More bitter (1)
Artificial, plastic flavor (1)
More crunchy inclusions (3)
More body (1)
Tongue burning (1)
If highlighting comments in the summary, only use those with a net score greater than 1 and report in terms of the test product. For example, the test product had a stronger flavor (5), was easier to chew (3), and had more crunchy inclusions (4).
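The net-score arithmetic behind this summary can be sketched as follows. This is only an illustration: how each comment is mapped onto a common attribute phrased for the test product (for instance, treating “less gritty, crunchy” for the standard as support for “more crunchy inclusions” in the test) is a judgment call inferred from the example totals, and the names `net_scores` and `highlights` are hypothetical.

```python
from collections import defaultdict

# Each entry: (attribute phrased in terms of the test product, signed count).
# Positive counts support the phrasing; negative counts point the other way.
comments = [
    ("stronger flavor", +2),             # Standard: weaker flavor (2)
    ("stronger flavor", +3),             # Test: higher flavor (3)
    ("more bitter", -2),                 # Standard: more bitter (2)
    ("more bitter", +1),                 # Test: more bitter (1)
    ("more metallic", -1),               # Standard: more metallic (1)
    ("easier to chew", +3),              # Standard: harder to chew (3)
    ("more crunchy inclusions", +1),     # Standard: less gritty, crunchy (1)
    ("more crunchy inclusions", +3),     # Test: more crunchy inclusions (3)
    ("more body", -1),                   # Standard: more body (1)
    ("more body", +1),                   # Test: more body (1)
    ("artificial, plastic flavor", +1),  # Test: artificial, plastic flavor (1)
    ("tongue burning", +1),              # Test: tongue burning (1)
]

def net_scores(entries):
    """Sum the signed counts for each attribute."""
    totals = defaultdict(int)
    for attribute, signed_count in entries:
        totals[attribute] += signed_count
    return dict(totals)

def highlights(entries, threshold=1):
    """Keep only attributes whose net score exceeds the threshold in size."""
    return {a: s for a, s in net_scores(entries).items() if abs(s) > threshold}

print(highlights(comments))
```

Bitterness nets to a score of size 1 and body to 0, so neither is highlighted, while flavor strength (5), ease of chewing (3), and crunchy inclusions (4) are.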
Case Study 2: Testing for “No difference”
Issue: A company is changing supplier for the tomato flavor of its sauce. The change was driven by a cost optimization objective. From previous work no specific attributes have been identified as perceivably different, but the client wishes to know if there is a perceivable overall difference.
Project activity: The test product is evaluated against the control product. The three samples are blind coded; two of the three samples are the same, and the third is different. The panelists are given the three samples to test in a set order and instructed to select the odd sample.
Test conditions: Before commencing the test, the level of statistical significance and the appropriate minimum sample size have been determined by the sensory professional.
α = 0.10
β = 0.10
Pd = 30%
The sensory professional raised alpha to 10% because she/he is less concerned about the risk of concluding that the two products are different when they are not. Beta is set to 0.10 because the team does not want to accept a higher risk of missing a difference that 30% of the population can detect. This level of Pd is a medium-sized value according to BS ISO 4120.
Assessors: Table A2.14 in Appendix 2 gives the number of panelists needed for the test: 43. N = 42 panelists.
Action standard: “If there is no significant difference between the original product and the new product, the new tomato sauce will be launched.”
Results: Nine panelists answered the test correctly, which is well below the maximum of 17 correct answers allowed to conclude similarity (Table A2.12 in Appendix 2). Therefore there is no evidence to suggest a significant difference between the two products. There is no significant difference between product A and product B. The action standard has been met, and the new tomato sauce can be launched.
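The similarity decision in this case study can be sketched in Python. The sketch assumes an exact binomial criterion with the guessing model pc = Pd + (1 - Pd)/3: similarity is concluded when the number of correct answers is so low that it would occur with probability of at most beta if the true proportion of distinguishers were Pd. The function names are hypothetical, and the published table remains the reference.

```python
from math import comb

def lower_tail(n, x, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, x + 1))

def max_correct_for_similarity(n, beta, pd, p_guess=1/3):
    """Largest number of correct answers that still supports similarity,
    i.e. the largest x with P(X <= x | pc) <= beta."""
    pc = pd + (1 - pd) * p_guess  # expected proportion correct at Pd
    best = None
    for x in range(n + 1):
        if lower_tail(n, x, pc) <= beta:
            best = x
        else:
            break  # the lower tail only grows with x
    return best

limit = max_correct_for_similarity(42, beta=0.10, pd=0.30)
correct = 9
print(limit, correct <= limit)
```

For N = 42, beta = 0.10, and Pd = 30% the sketch gives 17, in line with the table value quoted in the case study, so nine correct answers support similarity.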
Example of triangle test report
Project Name: XXX
Project Number: XXX
Triangle Test on Caramel Candy Covered in Milk Chocolate
Issued on: xxx
Distributed to: xxx
Background: The company has to handle two chocolate coatings in the process area for two different confectionery products. To facilitate the process on the production line, it is proposed to keep only one new chocolate coating for the two products manufactured on this line. This project looks at substituting the current milk chocolate coating with the new milk chocolate on our caramel candy. Having tasted the products before the test date, it was deemed appropriate to compare the two versions of the product with a triangle discrimination test.
Test Objective: The objective of this test is to determine whether there is a statistically significant difference between a control caramel (made with the current milk chocolate coating) and a trial product (made with the new milk chocolate).
Action Standard: If no statistically significant difference exists between the control and trial product, then the new chocolate coating will substitute the current one.
Summary of Results: The results from this study indicate that there is no significant difference between the control and trial product; the action standard has been met. Sixty assessments were made by sixty panelists. Seventeen assessments were correct answers of the triangle test.
Conclusion: Based on these results, there is evidence to suggest that substituting the current chocolate coating with the new milk chocolate does not affect the sensory characteristics at a significant perceptible level on caramel candy.
Products:
Product identification
Requestor's Formula Code | Product Name and Description | Production Site | Production Date
Control: caramel candy with milk chocolate, control | Candy with current milk chocolate coating | Factory A | Week 38/DD/MM/YY
Trial: caramel candy with milk chocolate, trial | Candy with new milk chocolate coating | Factory A | Week 38/DD/MM/YY
Test Date: The products were evaluated in week 44, DD/MM/YY.
Methodology Summary:
Nature of participants: Trained panelists
Discrimination test: Triangle test
Number of participants: Sixty panelists participated in one tasting session.
Test design: Three samples were presented to the respondent in a balanced order. The samples were marked with three-digit random codes.
Sensory test method: For each triangle test, the subjects evaluated the products, following one of six possible presentation orders, which were presented an equal number of times. For each triangle test, the three samples were presented under blind conditions; two of the three samples were the same, and one sample was different.
Environmental condition: Sensory booths, yellow monochrome lights.
Product size and cup size: One candy in coded opaque plastic cup.
Panelist instruction: Panelists were asked to rest for three minutes between tasting products and then asked to identify the odd sample and describe the nature of the difference(s) perceived; they were instructed to cleanse their palate with water and unsalted cracker biscuits between tastings. Answers were collected via a data capture system.
Tabulated Results
Number of subjects: 60
Number of correct responses: 17
Maximum number of correct answers needed for no difference (as found in Table A2.12, in Appendix 2): 22
The consumer science department has set up the following parameters:
Alpha = 0.20: risk of concluding that products are different when they are not.
Beta = 0.10: risk of concluding that products are similar when they are not.
Pd = 20%: maximum proportion of discriminators.
In this case the critical number of correct responses is 22. The 17 correct answers in the test are fewer than the critical value of 22, so there is no significant difference between the two products and the action standard has been met.
6. ASSESSORS 6.1 Health All panelists should be healthy and should not carry out the evaluation if their health impairs their normal sensory ability. Colds, allergies, medications, and pregnancy are well-known conditions that can affect taste and smell ability. A confidential record of each panelist’s allergies, likes, and dislikes should be kept so as not to invite panelists who do not like the products or may react to them. It is assumed that panelists who like a product will spend more time carrying out the test and will be more accurate (Loucks et al., 2017).
6.2 Motivation Most often companies will use their own employees as panelists. However, they should participate on a voluntary basis and have a desire to partake, as an interested panelist is likely to be more efficient and reliable. Maintaining a high level of motivation among panelists requires constant and regular effort from the sensory professional by communicating individual results after
completion of the study, communicating the importance of the evaluation, and always proceeding in a rigorous and efficient manner. Incentive schemes (gift vouchers, tombola, etc.) can be put in place to encourage panelists, but monetary rewards have not been shown to improve panelists’ performance in the tests (Loucks et al., 2017).
6.3 Experience Versus Inexperience Many companies use already established panels for discrimination testing. These panelists are very familiar with tests such as the triangle test, and familiarity with the method improves the ability to discriminate (BS ISO 4120; Dacremont and Sauvageot, 1997). Before being included in the panel for discrimination testing, assessors can be screened, i.e., selected according to their sensory ability to discriminate products. This is done by preparing several exercises of increasing difficulty, e.g., discrimination of samples with large to very small differences. Assessors become “known discriminators” when they have successfully passed these tests and shown higher performance than untrained assessors. Using unscreened or untrained consumers requires the use of a large N (total number of assessors). Using known discriminators reduces the N, which is more practical and less costly, as fewer resources and less material are required to set up the test.
6.4 Training All panelists should be trained in the triangle test methodology and be familiar with the questionnaire before participating in any formal testing. Panelists can easily be trained in the methodology of the triangle test by carrying out tests with products from the chosen product category that have been selected or prepared with known differences. By using products with a large difference in the test, the panelist can be introduced to what is required in the triangle test, and he/she can then be trained to discriminate finely between products by progressively reducing the difference between the samples. Repeated trials and tests on a variety of minor differences can significantly improve a panelist’s ability to identify differences between products. For example, spiked products, in which the differences are known, are used for triangle test training. When working on soft drinks, for example, the addition of sugar to one product or its dilution will impart a small change to the drink. Product developers can also help prepare products with a known difference in the pilot plant. A series of repeated trial tests will then be given to panelists. By carrying out these trial tests, they become familiar with the instructions and the methodology. They also improve their ability to discriminate. Their results are monitored by the sensory professional, who keeps records and decides when the person can join the panel. Once in the panel, the performance of each panelist should be followed by keeping a record of his/her success rate in each test. However, it
is important to bear in mind that there is no point penalizing the panelist for not selecting the odd product if the products are very similar; in effect the panelist might be right in his/her choice.
6.5 Information Given Familiarity and experience with the test material can influence the performance and the likelihood of perceiving a significant difference. Familiarity with the product category is good, but familiarity with the project and study objective can negatively impact results. Therefore caution should be taken when recruiting panelists. Employees directly involved in the development of the products and employees with knowledge of the test objectives should be excluded. This will minimize the expectation error that occurs when panelists expect a difference, e.g., an employee involved in the project may know that the structural characteristics of a product have been changed and will expect some textural differences.
6.6 Number of Assessors to Invite The minimum number of panelists is determined by the target alpha risk. A larger sample size gives a greater degree of confidence but comes at a higher resource cost. The minimum number given is the absolute minimum needed to be sufficiently confident in the conclusions of the test. BS ISO 4120 table (Table A2.14, in Appendix 2) gives a guide to the number of assessors so as to reach the sensitivity required for the test. When running a test for “a difference” with a target of α = 0.05, a minimum of N = 24 independent results are generally used in the industry. When running a test for “similarity” with a target of α = 0.25, a minimum of N = 60 independent results are generally required (BS ISO 4120 [Table A1.13, in Appendix 2]).
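How alpha, beta, and Pd together drive the number of assessors can be illustrated with a small search in Python. This sketch assumes an exact binomial model with the guessing relation pc = Pd + (1 - Pd)/3 and returns the first N at which both risks are met; because the binomial distribution is discrete, the result may not coincide with every entry of the published table, which should remain the reference. All function names are hypothetical.

```python
from math import comb

def pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def critical_count(n, alpha, p_guess=1/3):
    """Smallest x with P(X >= x | guessing) <= alpha, or None if none exists."""
    tail, crit = 0.0, None
    for x in range(n, -1, -1):  # accumulate the upper tail from the top down
        tail += pmf(n, x, p_guess)
        if tail > alpha:
            break
        crit = x
    return crit

def beta_risk(n, alpha, pd, p_guess=1/3):
    """P(fewer correct answers than the critical count) when the true
    proportion of distinguishers is pd."""
    crit = critical_count(n, alpha, p_guess)
    if crit is None:
        return 1.0
    pc = pd + (1 - pd) * p_guess
    return sum(pmf(n, k, pc) for k in range(crit))

def smallest_n(alpha, beta, pd, n_max=200):
    """First N (searched from 3 upward) whose beta risk is at most beta."""
    for n in range(3, n_max + 1):
        if beta_risk(n, alpha, pd) <= beta:
            return n
    return None  # not reachable within n_max

print(smallest_n(alpha=0.05, beta=0.10, pd=0.50))
```

Lowering Pd or either risk in this sketch increases the required N, which mirrors why similarity testing calls for larger panels than difference testing.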
6.7 Replication Often in the industry, the availability of panelists is limited, and the use of replications is necessary, but if used, the number of replications should be the same for each panelist. However, duplicated assessments for this method cannot be considered as independent, and therefore using replication does not greatly reduce the total number of panelists to use. Any overdispersion caused by replications must be taken into account when analyzing the data. Publications on replicated discrimination tests suggest alternative analyses for replicated triangle tests. Ennis and Bi (1998) discuss the use of the beta-binomial model for replicated difference and preference tests. They provide tables that can be easily used to establish a difference between products at alpha 5%. Lee and O’Mahony (2006) also used the beta-binomial statistical analysis and discussed
the computation methodology. However, the beta-binomial tables cannot be used to test similarity between products (Ennis and Bi, 1998). Brockhoff and Schlich (1998) provide a method based on an adjusted overdispersion approach that can be used for both difference and similarity testing.
7. PRODUCT PREPARATION AND SERVING Products must be homogeneous and be prepared and presented together in an identical manner; the same quantity should be served, and samples should be of the same piece size and same temperature. This is to avoid the stimulus error, by which the panelists would be influenced by other characteristics not related to the test, e.g., a difference in portion size, color, texture, or temperature. No visual differences should be apparent; if there is a difference in product appearance, it will need to be masked using colored lights. However, if visual differences are the modification to be tested then no further action is required in terms of masking. Sensory scientists should evaluate the need/benefit of running the test if any visual cues are already introducing a bias in the evaluation. This methodology does not lend itself to products where extreme intensities and/or carryover of flavor/sensation are common, in mentholated products, for example, or if products cannot be consumed in large quantities (e.g., exceeding the acceptable daily intake for certain ingredients). This test is also not recommended for products that have known large batch-to-batch variation within a product variant. Knowledge of the batch-to-batch variation in production is required before deciding on the appropriate discrimination methodology.
8. TEST LAYOUT 8.1 Practical Example of Procedure to Set Up the Test 1. Start to choose and code appropriate numbers of containers; i.e., white paper plate, glass, opaque plastic cup, etc. Label the containers with random three-digit codes. Ensure there are no visual differences imparted by the label itself; i.e., difference in writing, for instance. Ensure the given codes are not confusing, for example giving two similar codes in the same test such as 256 and 356 could be confusing for the sensory professional when preparing the products. Products could be poured into the wrong container by mistake. Ensure that the four numbers chosen have the least number of repeats, and no numbers are same at beginning and at ends, for example: 247, 951, 803, and 625. The use of sensory software to provide codes is not a “safe” solution either: always check. For example, random three-digit product codes: 767, original product 189, original product
312, modified product
570, modified product
Twelve panelists need at least the following:
- 18 original products; therefore, 9 containers should be labeled 767 and 9 containers should be labeled 189
- 18 modified products; therefore, 9 containers should be labeled 312 and 9 containers should be labeled 570
2. Prepare the order of presentation for each panelist: half the panelists should receive two samples of the modified product and one of the original product, and the other half should receive two samples of the original product and one of the modified product. An example presentation design for 12 panelists is shown below (A = original product, B = modified product; one row per panelist):
A-767  B-312  B-570
B-312  A-189  B-570
B-570  B-312  A-189
A-767  A-189  B-570
A-767  B-312  A-189
A-767  A-189  B-312
A-767  B-312  B-570
B-312  A-767  B-570
B-570  B-312  A-767
A-767  A-189  B-570
A-767  B-312  A-189
A-767  A-189  B-570
3. Prepare a sensory questionnaire (see Fig. 7.1) for each panelist.
- Photocopy enough questionnaires for each panelist, and write the product set number given for each on the blank lines as shown in Fig. 7.1.
The test can also be set up in data capture software. In this case, the sensory professional should ensure that the same guidelines as above are followed.
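The set-up steps above can also be sketched in code. The following is a minimal illustration, not the procedure used by any particular data capture software: the function name `build_design`, the `seed` argument, and the rule for alternating the code of the odd sample are all assumptions made for this example. It spreads the six serving orders evenly across panelists, gives the two identical samples different codes, and counts how many containers need each label.

```python
import itertools
import random
from collections import Counter

# The six possible serving orders of a triangle test.
SERVING_ORDERS = ["AAB", "ABA", "BAA", "BBA", "BAB", "ABB"]


def build_design(n_panelists, codes, seed=None):
    """Return one coded triad per panelist (hypothetical helper).

    codes maps each product ("A", "B") to exactly two three-digit codes.
    """
    rng = random.Random(seed)
    # Use each serving order equally often, then randomize across panelists.
    orders = [SERVING_ORDERS[i % 6] for i in range(n_panelists)]
    rng.shuffle(orders)
    # Alternate the code used for the odd sample so both codes are used evenly.
    odd_code = {product: itertools.cycle(c) for product, c in codes.items()}
    design = []
    for order in orders:
        odd = "A" if order.count("A") == 1 else "B"
        pair = "B" if odd == "A" else "A"
        # The two identical samples receive both codes, in random positions.
        pair_codes = iter(rng.sample(codes[pair], 2))
        triad = []
        for product in order:
            code = next(odd_code[odd]) if product == odd else next(pair_codes)
            triad.append(f"{product}-{code}")
        design.append(triad)
    return design


codes = {"A": ["767", "189"], "B": ["312", "570"]}
design = build_design(12, codes, seed=1)
for panelist, triad in enumerate(design, start=1):
    print(panelist, *triad)

# Number of containers to label with each code.
containers = Counter(sample for triad in design for sample in triad)
print(containers)
```

With 12 panelists and two codes per product, this sketch requires 9 containers per code, in line with the container counts given above. As with codes supplied by sensory software, the generated design should still be checked by the sensory professional before use.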
9. ANALYSIS AND REPORTING
The test report must give the background and objectives, full details of the product set (nature, batch number, age, etc.), the preparation method, the method used, the action standard that was set prior to testing, and the significance of the test results. There are two possible outcomes, depending on the objective of the project: we expect either “a difference” or “no difference” between the products. “A difference” is looked for when, for example, ingredients, packaging, processing, or storage are changed and the objective is for consumers to notice the change.
Date:____________ Name:____________ Product Set:_____________
Rinse your mouth with water before beginning. You are presented with three coded products. Two of these products are the same and one is different. Please evaluate the products in the order shown on your form/on the screen, from left to right. Select the code of the product that is different. Rinse your mouth using water and plain crackers between products.
_____________
____________
_____________
Please comment: _____________________________________________________________________ Thank you for your participation
FIGURE 7.1 Example of questionnaire for triangle test set up for a food test.
“No difference” testing, also referred to as similarity or equivalence testing, applies when a product is being value optimized or when the objective is for consumers not to notice the difference, for example, when a flavor is sourced from an alternative supplier. These two objectives require the data to be processed in different ways. The reason is to prevent the desired answer from becoming the default position rather than the one for which evidence has to be gathered. Otherwise, a poorly designed or under-resourced experiment will provide the default answer (whether or not it is true).
9.1 Difference Testing
When looking for “a difference,” the default position is that there is no difference between the products. The sensory professional can use Table A2.13 in Appendix 2 and seek positive evidence of a difference, using a type I error (alpha) of 0.05, in order to conclude that the products are different. If the number of correct responses is greater than or equal to the number given (corresponding to the number of panelists and the alpha-risk level chosen for the test), the conclusion is that a perceptible difference exists between the products. Setting alpha at 5% means that, when no perceptible difference exists, there is a 5% risk of wrongly concluding that one does. Five percent is an arbitrary cutoff point, and a very cautious sensory professional will try to reduce alpha before affirming a difference.
Most triangle tests in the industry are actually conducted to ascertain that two products are not perceptibly different. Failing to conclude that a difference exists does not prove that products are similar. Therefore a very different approach is used to test for no perceptible difference.
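The decision rule for difference testing can be illustrated with a short calculation. The sketch below assumes the usual exact binomial model, in which a panelist who perceives no difference picks the odd sample by chance with probability 1/3; the function names are hypothetical, and the published table in Appendix 2 remains the reference for actual decisions.

```python
from math import comb


def upper_tail(x, n, p=1 / 3):
    """P(X >= x) for X ~ Binomial(n, p): chance of x or more correct answers."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


def min_correct_for_difference(n, alpha=0.05):
    """Smallest number of correct responses whose chance probability is <= alpha.

    Returns None when even n correct out of n cannot reach significance.
    """
    for x in range(n + 1):
        if upper_tail(x, n) <= alpha:
            return x
    return None


# With 12 panelists, 7 correct answers occur by chance about 6.6% of the time
# and 8 correct about 1.9% of the time, so 8 is the minimum at alpha = 0.05.
print(min_correct_for_difference(12))  # 8
```

Lowering `alpha` raises the required number of correct responses, which is the sense in which a cautious sensory professional demands stronger evidence before affirming a difference.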
9.2 Similarity Testing
When looking for “similarity,” the sensory professional wants to have confidence that the products are not perceivably different. This is done by choosing a small value of the type II error, beta, which reduces the risk of concluding that two products are similar when they are not. Beta is often arbitrarily fixed at 10% for practical reasons (Schlich, 1993). When testing for similarity, the sensory specialist wants to demonstrate that the proportion of panelists who perceive the difference is not larger than a critical proportion Pd (BS ISO 4120). Pd is the maximum acceptable proportion of the population that can distinguish between the products. Ideally, Pd should be as low as possible, but this means that a very large number of panelists are required to take part in the test. For practical reasons, Pd is often chosen between 20% and 30%. If the number of correct responses is less than or equal to the number given in Table A2.12 in Appendix 2 (corresponding to the number of panelists, the beta-risk level, and the value of Pd chosen for the test), we conclude that no meaningful difference exists between the products.
Another approach to establishing “similarity” is to calculate the confidence interval on the proportion of distinguishers, giving reassurance to the sensory professional that the proportion of distinguishers is low enough. BS ISO 4120 gives the method to calculate the 90% confidence interval on the actual proportion of distinguishers. MacRae (1995) provides graphs with the 90% confidence bounds for the triangle test with alpha set at 5% and demonstrates that a very large amount of data is needed to give reassurance about similarity.
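Both similarity approaches can be sketched numerically. The example below rests on assumptions that should be checked against the standard and the appendix tables before use: it takes the probability of a correct response to be 1/3 + (2/3)Pd when a proportion Pd of the population can distinguish the products, uses an exact binomial lower tail for the decision rule, and uses a normal approximation for the confidence interval. The function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def lower_tail(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def max_correct_for_similarity(n, pd, beta=0.10):
    """Largest number of correct responses still compatible with similarity.

    Assumes distinguishers always answer correctly and everyone else guesses,
    so the probability of a correct response is 1/3 + (2/3) * pd.
    Returns None when n is too small for any result to support similarity.
    """
    pc = 1 / 3 + (2 / 3) * pd
    best = None
    for x in range(n + 1):
        if lower_tail(x, n, pc) <= beta:
            best = x
        else:
            break
    return best


def distinguisher_interval(x, n, level=0.90):
    """Approximate two-sided confidence interval on the proportion of distinguishers."""
    pc = x / n                          # observed proportion of correct responses
    pd_hat = 1.5 * pc - 0.5             # chance-corrected estimate of distinguishers
    se = 1.5 * sqrt(pc * (1 - pc) / n)  # normal-approximation standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return pd_hat - z * se, pd_hat + z * se


print(max_correct_for_similarity(60, pd=0.30, beta=0.10))
print(distinguisher_interval(20, 60))
```

In this sketch, 20 correct responses out of 60 (exactly the chance rate) still give an upper 90% bound of about 15% distinguishers, which illustrates why a large amount of data is needed before similarity can be claimed with a small Pd.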
REFERENCES
Allen, C., Havlíček, J., Roberts, S., 2015. Effect of fragrance use on discrimination of individual body odour. Frontiers in Psychology 6.
Angulo, O., Lee, H.S., O’Mahony, M., 2005. Sensory difference tests: overdispersion and warmup. Food Quality and Preference 18, 190-195.
ASTM International E1885-04, Standard Test Method for Sensory Analysis - Triangle Test.
ASTM WK32980, 2011. New Test Methods for Sensory Analysis - Tetrad Test. ASTM International, West Conshohocken, PA.
British Standard BS ISO 4120:2004. Sensory Analysis Methodology - Triangle Test.
British Standard BS ISO 8589:2007. Sensory Analysis - General Guidance for the Design of Test Rooms.
Brockhoff, P., Schlich, P., 1998. Handling replications in discriminations tests. Food Quality and Preference, 303-312.
Caroselli, A., 2012. Investigation of the Effect of Allowed and Forced within Trial Retasting on Judge Performance in the 2-AFC (MS thesis). University of California, Davis.
Dacremont, C., Sauvageot, F., 1997. Are replicate evaluations of triangle tests during a session a good practice? Food Quality and Preference 8, 367-393.
Ennis, J.M., 2013. The year of the tetrad test. Journal of Sensory Studies 28 (4), 257-258.
Ennis, D.M., Bi, J., 1998. The beta-binomial model accounting for inter-trial variation in replicated difference and preference tests. Journal of Sensory Studies 13, 389-412.
Ennis, J.M., Jesionka, V., 2011. The power of sensory discrimination methods revisited. Journal of Sensory Studies 26, 371-382.
Ennis, J.M., Rousseau, B., 2012. Reducing costs with tetrad testing. IFPress 15 (1), 4-5.
Gelski, J., 2013. Switching sensory test protocol benefits General Mills. Food Business News. http://www.foodbusinessnews.net/.
Helm, E., Trolle, B., 1946. Selection of a taste panel. Wallerstein Laboratories Communications 9, 181-194.
Ishii, R., O’Mahony, M., Rousseau, B., July 2013. Triangle and tetrad protocols: small sensory differences, resampling and consumer relevance. Food Quality and Preference 31 (1), 49-55.
Jacobsson, A., Nielsen, T., Sjoholm, I., Wendin, K., 2004. Influence of packaging material and storage condition on the sensory quality of broccoli. Food Quality and Preference 15, 301-310.
Lawless, H.T., Heymann, H., 2013. Types of Discrimination Tests. Sensory Evaluation of Food: Principles and Practices.
Lee, H.S., O’Mahony, M., 2006. Sensory difference testing: the problem of overdispersion and the use of beta binomial statistical analysis. Food Science Biotechnology 15 (4), 494-498.
Loucks, J.N., Eggett, D.L., Dunn, M.L., Steele, F.M., Jefferies, L.K., 2017. Effect of monetary reward and food type on accuracy and assessment time of untrained sensory panellists in triangle tests. Food Quality and Preference 56, 119-125.
MacRae, A.W., 1995. Confidence intervals for the triangle test can give reassurance that products are similar. Food Quality and Preference 6 (2), 61-67.
Meilgaard, M.C., Carr, T.B., Civille, G.V., 2015. Sensory Evaluation Technique, fifth ed. CRC Press, Boca Raton, FL.
O’Mahony, M., 1986. Sensory Evaluation of Food: Statistical Methods and Procedures. Marcel Dekker Inc.
O’Mahony, M., 1995. Who told you the triangle test was simple? Food Quality and Preference 6, 227-238.
Peryam, D.R., 1950. Quality control in the production of blended whiskey. Industrial Quality Control 7, 17-21.
Rousseau, B., O’Mahony, M., 2000. Investigation of the effect of within-trial retasting and comparison of the dual-pair, same-different and triangle paradigms. Food Quality and Preference 11, 457-464.
Rousseau, B., Rogeaux, M., O’Mahony, M., 1999. Mustard discrimination by same-different and triangle tests: aspect of irritation memory and teta criteria. Food Quality and Preference 10, 173-184.
Sauvageot, F., Herbreteau, V., Berger, M., Dacremont, C., 2012. A comparison between nine laboratories performing triangle tests. Food Quality and Preference 24, 1-7.
Schlich, P., 1993. Risk tables for discrimination tests. Food Quality and Preference 4, 141-151.
Stone, H., Sidel, J.L., 2013. Sensory Evaluation Practices, fourth ed. Academic Press, Inc., Orlando, Florida, USA.
Tesfaye, Garcia Parrillo, M.C., Troncoso, A.M., 2002. Sensory evaluation of sherry wine vinegar. Journal of Sensory Studies 17, 133-144.
FURTHER READING
Ennis, D.M., 1990. Relative power of difference testing methods in sensory evaluation. Food Technology 44, 1147.
Ennis, D.M., 1993. The power of sensory discrimination methods. Journal of Sensory Studies 8, 353-370.
Ennis, J.M., 2012. Guiding the switch from triangle testing to tetrad testing. Journal of Sensory Studies 27, 223-231.