BIOLOGICAL PSYCHOLOGY ELSEVIER
Biological Psychology 37 (1994) 269-273
Comments on “Has odour conditioning been demonstrated?”
M.D. Kirk-Smith
Department of Marketing and Organisational Studies, University of Ulster, Newtownabbey, BT37 0QB, UK
Abstract
I have read Black and Smith’s comments on our conditioning study (Kirk-Smith, Van Toller and Dodd, 1983) with interest. I was previously aware of a number of the issues; indeed, I had alerted Black in general terms to the existence of further flaws of which, up to that point, he appeared to be unaware (Kirk-Smith, personal communication, 1993). As well as replying to the main points, I would also like to suggest an improved design and to provide necessary corrections to the original paper.
1. Design
Many designs could have been used in this study. The design selected was the simplest, and perhaps most elegant, that we could devise. We fully appreciated the need for a control group to confirm, not prove, the effects of the stressor.¹ Our reason for not running this group was simply a lack of time and resources (indeed, I would have liked to have run at least twice the number of participants in the design published). Our rationale was that, if we got effects, we would be able to infer that the stressing session must have been stressful, since it would be difficult to imagine any other interpretation for the effects, given that the second session was double blind and odour was the only variable present at conditioning and test sessions.
¹ This method of stressing participants had been developed by Dr Van Toller and used by him since 1974 to produce stress in participants, e.g. in psychophysiological studies (Van Toller, 1982; Van Toller, personal communication, 1993).
0301-0511/94/$07.00 © 1994 Elsevier Science B.V. All rights reserved
SSDI 0301-0511(94)00938-T
The selection of a perfectly appropriate control group may not be possible. Repeating the stressing session with before and after measures, e.g. the mood scales used previously, would not be a true control, since participants would be alerted to what to expect by the terms used in the mood scales, even if control terms were used. We could see no easy way around this problem. Also, participants in the stress confirmation would have to have identical expectations to the other participants, i.e. they must also be required to attend a second session some days later to get payment.
Black and Smith also question the assumption that the task intended to induce anxiety did, in fact, do so. Their criticism merely offers an alternative interpretation, with no supporting evidence beyond the argument that this alternative seemed more likely. The grounds for their counter-assertion are clearly similar in quality to the grounds they criticise in the original paper. Without supplemental evidence that can be offered and agreed upon, therefore, the two opposing arguments are effectively indistinguishable in terms of their plausibility. However, the report mentioned at the start of this paper (see note 1) does in fact provide additional supporting evidence that the task was likely to induce anxiety. It would be up to Black and Smith to provide similar or more persuasive evidence that supports the alternative interpretation they have offered.
A confirmatory study could be run as follows. The design should be as it is, but with three times as many participants. Two thirds would do the study as published. A further third would act as a stress confirmation group. They would do exactly the same as the other participants, with half in the odour first session and half in the control first session; all would complete the second session.
The only difference is that they would complete stress mood scales before and after the first session. This manipulation would allow confirmation of the stressful effects, for participants with and without odour. It would also allow an estimation of the memory carry-over effect of participants encountering similar mood scales in the second session. Finally, this further study would have twice the number of participants as in the original study. Re-running the study with an improved design, as outlined, is crucially important, given the interest shown in this area. I would like to take this opportunity to invite the authors of the commentary to collaborate with me in executing this new study as outlined above.
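The allocation proposed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the cell labels, the random assignment and the requirement that the total be a multiple of six are mine, not part of the proposal, and the text does not specify how participants would be assigned to cells.

```python
import random

def allocate(participant_ids, seed=0):
    """Split participants into the cells of the proposed confirmatory design.

    Two thirds run the study as published; one third form the stress
    confirmation group, half with odour in the first session and half
    with the control (no odour) first session. All names are hypothetical.
    """
    ids = list(participant_ids)
    if len(ids) % 6 != 0:
        raise ValueError("total must be a multiple of 6 for equal cells")

    rng = random.Random(seed)   # seeded so the allocation is reproducible
    rng.shuffle(ids)            # random assignment is an assumption

    third = len(ids) // 3
    confirmation = ids[:third]  # completes mood scales before and after session 1
    as_published = ids[third:]  # two thirds: design exactly as published
    half = third // 2

    return {
        "as_published": as_published,
        "confirmation_odour_first": confirmation[:half],
        "confirmation_control_first": confirmation[half:],
    }

cells = allocate(range(36), seed=1)
for name, members in cells.items():
    print(name, len(members))
```

With a hypothetical original sample of 12, tripling gives 36 participants: 24 follow the published design (twice the original number) and 12 form the stress confirmation group, 6 in each first-session condition.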
2. Statistical analyses
I agree that planned comparisons or post hoc tests should have been included. I do not understand why they were not, since this was our regular practice (e.g. Kirk-Smith, 1978; Kirk-Smith, Booth, Carroll & Davies, 1978). However, in the absence of these tests I would argue that by inspection the female experimental group means are different from the other means. Since they are qualitatively different (i.e. in direction or sign), this is quite likely to be the source of the
differences found in the analyses, given the assumptions underlying the use of ANOVA.
The second paragraph of the commentary states that the analysis and design were flawed and unsound, while later our conclusions are said to be unfounded. The assertion that the design and analysis were flawed suggests that there is a right and a wrong way to analyse any experimental data. The statistical model could be said to be incorrect only if it were not isomorphic with the underlying substantive model (see Coombs, 1983, for a fuller explanation). This is not the case here; the assumptions made to justify the application of the statistical model can therefore only be said to be supported by the data to a greater or lesser degree, and are not justified or unjustified by assertion or counter-assertion alone. In short, the application of a statistical test assumes a particular underlying statistical model, which can be justified to a greater or lesser extent. And, unless something remarkable has happened, the choice of type of test can only be said to be justifiable or unjustifiable, or, alternatively, appropriate or inappropriate.
Furthermore, some comment needs to be made about the assertion that the conclusion was unfounded. An earlier draft of the criticism used the word “invalid” applied to the argument, instead of “unsound” or “unfounded” applied to the conclusion, but the same basic difficulty applies in both situations. For the conclusions of an argument to be unfounded, this implies that there are no reasons for arriving at the conclusion (Flew, 1975). This is self-evidently false, as the original paper does include arguments. The description of the original experiment in the criticism by Black and Smith actually states the steps used that provided the reasons for arriving at the conclusion. An assertion that conclusions were given, with no reasons, implies some error in the peer-reviewing process of Biological Psychology.
What Black and Smith could have said was that there were arguments that led to a different conclusion than that offered, and that they thought they were more persuasive. However, the form of their criticism as it stands suggests that a greater flaw in the original study can be inferred than they can supply evidence for. In the case of the critique offered, the assumption that inter-participant variability and intra-participant variability could be pooled is actually criticised as being wrong rather than unjustifiable, and the failure in the published paper to make post hoc comparisons is an omission, rather than the application of an incorrect statistical model.
3. Corrections
Two changes were made to the Discussion section of the original manuscript, which I, as first author, did not authorize. The third author handled the final editorial work on the manuscript with the editors of Biological Psychology. It is important for the understanding of the paper that I correct these statements now. They should be omitted since the first is untrue and the second is misleading. Both are also inconsistent with other statements in the paper.
The first statement is “It is of interest to note that the concentration of TUA odour used during the sessions seemed very ‘obvious’ to the experimenters who felt sure that participants would detect it”. The statement in the Stimulus-odour section (p. 222) “The concentration of TUA used was not noticed by participants in the pilot study unless their concentration was specifically drawn to it” is accurate. This concentration was obtained through systematic pilot trials with various concentrations before the conditioning study. The selection of an unnoticeable level was considered crucial to the study, since any expectation or “Rosenthal”-type effects resulting from recognition of the odour’s presence would have resulted in unpredictable effects.
The second statement is “The role of perfumers in odour association is emphasised by pointing out that the TUA and related aldehydes are the key impact odorants in perfumes belonging to the important aldehydic-floral family of perfumes”. This statement is irrelevant and misleading. TUA was chosen from several hundred odorants for the sole reason that it was the “greyest” odour, i.e. the odour with least (if any) associations. The first sentence of the Stimulus-odour section makes this clear (the pilot study cited was the last of many). The rationale was that an odour without associations would be more easily conditioned than an odour that had pre-existing associations, support for this being drawn from various studies (e.g. Engen & Ross, 1973; Lawless & Cain, 1975; Lawless & Engen, 1977).
4. Conclusion
The points raised in the commentary tend to be technical, as the title suggests, rather than forming a critique of the theoretical position held. The commentary states that the paper is highly cited. This corpus of work must rely on the results of this study to some extent. Now, suppose that the study did not exist, i.e. the explanations presented were removed from the literature.
How then would the interpretation of all this later research have to be modified if the results of our paper were not robust? Also, if the study did not exist, to what extent would alternative explanations create incoherence in interpretation of the results of these later papers? If other interpretations would then not form a coherent or consistent whole, then the way to avoid this problem is to accept provisionally the results of this study. From this point of view, the paper still stands as a valid first study of unconscious odour conditioning in human beings. But given the problems that are correctly pointed out in the critique, a new study, along the lines outlined above, could be carried out to confirm the results. Once again, I invite the authors to collaborate with me in carrying out this important confirmatory study.
Acknowledgements
I would like to thank my colleague, Dr. David D. Stretch, University of Leicester Medical School, for his helpful advice and comments in the preparation of this paper.
References
Coombs, C.H. (1983). Psychology and mathematics: An essay on theory. Ann Arbor: University of Michigan Press.
Engen, T., & Ross, B.M. (1973). Long term memory of odours with and without verbal descriptions. Journal of Experimental Psychology, 100, 221-227.
Flew, A. (1975). Thinking about thinking. (Or do I sincerely want to be right?). Glasgow, UK: Fontana Press.
Kirk-Smith, M.D. (1993). Personal communication (letter to Black, S., 16 June 1993).
Kirk-Smith, M.D., Booth, D.A., Carroll, D., & Davies, P. (1978). Human social attitudes affected by androstenol. Research Communications in Psychology, Psychiatry and Behaviour, 3, 379-384.
Kirk-Smith, M., Van Toller, C., & Dodd, G. (1983). Unconscious odour conditioning in human subjects. Biological Psychology, 17, 221-231.
Lawless, H.T., & Cain, W.S. (1975). Recognition memory for odours. Chemical Senses and Flavour, 1, 331-337.
Lawless, H., & Engen, T. (1977). Associations to odours: Interference, mnemonics and verbal labelling. Journal of Experimental Psychology, 3(1), 52-59.
Van Toller, S. (1993). Personal communication (telephone conversation, 17 November 1993).
Van Toller, S. (1982). A simple and reliable stressor. Internal Rep., University of Warwick, 1982.