0005-7967/83/010087-02$03.00/0 Copyright © 1983 Pergamon Press Ltd
CASE HISTORIES AND SHORTER COMMUNICATIONS
A note on statistical inference in meta-analysis
Meta-analysis, the technique of combining the results of independent studies, has recently become a popular feature of literature reviews (Burger, 1981; Cooper, 1979; Mullen and Suls, 1982). A number of workers (Andrews and Harvey, 1981; Shapiro, 1980; Smith and Glass, 1977; Smith et al., 1980) have applied meta-analytic techniques to the psychotherapy outcome literature in order to examine the relative effectiveness of different forms of therapy. They have attempted to quantify the relationship between treatment and improvement by applying a slightly modified version of the d-index of ‘effect size’ as described by Cohen (1977). (‘d’ is the difference between the mean scores of the treated and control groups divided by the standard deviation of the control group scores.) The index is scale-free (i.e. expressed in unit normal deviates) and has the further advantage of being easily translated into overlap scores. These show what percentage of patients’ scores in the group with the smaller mean is exceeded by the average patient in the group with the larger mean.
The appearance of the Smith and Glass (1977) meta-analytic study was followed by a series of critical assessments (Eysenck, 1978a, b; Rachman and Wilson, 1980; see also Kazrin et al., 1979). Several of the charges levelled by these critics have since been answered by supporters of the meta-analytic approach. Shapiro, for instance, demonstrated that there was no support for the view, expressed by Eysenck, that those studies which produced relatively high effect sizes for verbal therapies were also those that were high in reactivity and low in internal validity. Andrews and Harvey (1981) demonstrated that the exclusion of studies using analogue patients did not in fact lead to a significant revision of Smith et al.’s conclusions, as had also been intimated. The latter authors also extended the literature reviewed by Smith and Glass in an attempt to answer a number of criticisms concerning its selectiveness.
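The d-index and its translation into an overlap score, as defined above, may be illustrated with a minimal Python sketch. The function names and the scores are hypothetical and invented purely for illustration; the sketch assumes that the sample standard deviation of the control group is used and that scores are normally distributed, neither of which is confirmed by the studies cited.

```python
from statistics import NormalDist, mean, stdev


def glass_effect_size(treated, control):
    """Difference between group means divided by the control-group SD.

    Assumption: the sample SD (n - 1 denominator) is used.
    """
    return (mean(treated) - mean(control)) / stdev(control)


def overlap_percentage(d):
    """Percentage of scores in the lower-mean group that are exceeded by
    the average member of the higher-mean group.

    Assumption: scores in both groups are normally distributed, so the
    percentage is the standard normal CDF evaluated at |d|.
    """
    return 100.0 * NormalDist().cdf(abs(d))


# Hypothetical outcome scores, not taken from any study.
treated = [14.0, 16.0, 15.0, 18.0, 17.0]
control = [12.0, 13.0, 11.0, 15.0, 14.0]

d = glass_effect_size(treated, control)
print("effect size d:", round(d, 2))
print("overlap score (%):", round(overlap_percentage(d), 1))
```

An effect size of zero corresponds to an overlap score of 50%, i.e. the average treated patient is no better off than the average control patient.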
Other criticisms of the meta-analytic approach are perhaps unanswerable, e.g. the claim that it represents a “resurrection of uniformity assumption myths” (Paul and Licht, 1978) in patients, treatments and outcome measures. Nevertheless it appears that the current mood within psychotherapy evaluation research is one of resigned acceptance of the ‘Dodo bird verdict’ (‘Everybody has won and all must have prizes’) and of the spirit of compromise with which it is associated (Frank, 1979). It would be premature, however, to accept this widely cited conclusion without first examining a number of methodological questions related to the use of effect sizes as a means of hypothesis testing.
In the first place, effect size is an unweighted indicator of the strength of relationship between treatment and outcome. When effect sizes are averaged across studies it therefore places undue emphasis on studies with smaller sample sizes. If Stouffer’s method (see Rosenthal, 1978) of combining z-scores from several studies (derived from the probability level reported by each study) is used, sample size weighting may be achieved by multiplying each z-score by the number of subjects in the study and dividing the sum of these by the square root of the squared and summed individual study sizes. This simple method of weighting is not appropriate, however, to the measure used by Smith and Glass, based on the magnitude of treatment effects. A simple and effective expedient would be to divide the differences between treated and untreated group means by the combined standard error of the means (rather than the SD). This would provide a score adjusted for sample size. Using the t-distribution it would still be possible to calculate overlap scores and thus state overall values of psychotherapy outcome in terms of the percentage of the treated group who have better outcomes than the average patient in the untreated group.
There is a further issue concerning weighting which is not taken care of by the use of an adjustment for sample size.
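The two sample-size adjustments described above may be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive procedure: the function names and all numerical values are hypothetical, and the ‘combined standard error of the means’ is taken here to be the unpooled standard error of the difference between two independent means, which is only one possible reading.

```python
import math


def weighted_stouffer_z(z_scores, sample_sizes):
    """Sample-size weighted Stouffer combination:
    sum(n_i * z_i) / sqrt(sum(n_i ** 2))."""
    if len(z_scores) != len(sample_sizes):
        raise ValueError("one sample size is needed per z-score")
    numerator = sum(n * z for z, n in zip(z_scores, sample_sizes))
    denominator = math.sqrt(sum(n * n for n in sample_sizes))
    return numerator / denominator


def se_scaled_difference(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Difference between group means divided by the standard error of
    that difference.

    Assumption: the standard error is sqrt(sd_t^2/n_t + sd_c^2/n_c).
    """
    standard_error = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    return (mean_t - mean_c) / standard_error


# Hypothetical studies: a small study with a large z, a large study with a small z.
z_scores = [2.0, 0.5]
sample_sizes = [10, 100]

unweighted = sum(z_scores) / math.sqrt(len(z_scores))
weighted = weighted_stouffer_z(z_scores, sample_sizes)
print("unweighted Stouffer z:", round(unweighted, 3))
print("weighted Stouffer z:", round(weighted, 3))

# The same mean difference and SDs give a larger score in a larger study.
print(round(se_scaled_difference(16.0, 13.0, 2.0, 2.0, 25, 25), 3))
print(round(se_scaled_difference(16.0, 13.0, 2.0, 2.0, 100, 100), 3))
```

With equal study sizes the weighted formula reduces to the ordinary Stouffer combination; with unequal sizes the large study dominates the combined z.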
Smith and Glass, and all subsequent workers, have calculated separate effect sizes for each of the outcome measures applied in a single study as well as for every point in the study at which outcome was assessed. They went on to use effect sizes as their basic unit of statistical analysis. Unfortunately they offer no justification for this. Moreover, when the effects of particular groups of treatments are compared, the mean effect sizes are stated for each treatment without disclosing the number of such effect sizes which were derived from the same individuals. A more appropriate procedure might have been to average effect sizes within each of the studies (if more than one measure of outcome was used) or to preserve their separate identity and treat them as variables nested within a study. As it is, the impact of any one study on the overall mean for the treatment approach it represents may simply be a function of the number of outcome measures it used. This shortcoming greatly reduces the illustrative value of the average effect size measure since the mean values may be quite misleading.
The problem becomes a great deal worse if inferential statistical procedures are applied to the effect size data. Andrews and Harvey, for example, applied the analysis of variance model to test the null hypothesis that there were no differences between effect sizes for verbal as opposed to behavioural psychotherapies. This procedure is inconsistent with the assumption of independence of observations which is fundamental to the ANOVA model. Effect sizes derived from the same study may rightly be expected to be highly correlated as they were provided by the same patients, who had been treated by the same therapists. Similarly Shapiro (1980), using inferential statistics, demonstrates that the differences in effect sizes between behavioural and verbal psychotherapy increase with decreasing reactivity, showing “a significant linear trend (P < 0.01)” (p. 4).
Shapiro does not disclose what statistical procedure yielded this estimate of the likelihood of such a pattern of results being obtained by chance alone. It seems highly probable, however, that it also fails to meet the assumptions of the test.
Finally, the overall appropriateness of the meta-analytic approach to psychotherapy research may be questioned. The statistical combination of a number of studies is likely to enhance the significance of small effects as long as they occur in the same dimension across studies. It is hardly necessary to invoke the classical distinction between clinical
and statistical significance to realise that in order for psychotherapy to be a useful part of a clinician’s repertoire it must be seen to produce significant changes in individual cases rather than for such changes to become apparent only after the treatment of 25,000 control and experimental subjects (Smith and Glass, 1977).
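The suggestion made earlier, that effect sizes be averaged within each study before treatment means are computed, may be illustrated by the following Python sketch. The study labels, effect sizes and function names are hypothetical and invented purely for illustration; the sketch shows only how a study reporting many outcome measures can dominate a mean taken over individual effect sizes.

```python
from collections import defaultdict
from statistics import mean


def pooled_mean(effects):
    """Mean over every effect size, ignoring which study it came from."""
    return mean(d for _, d in effects)


def study_level_mean(effects):
    """Average effect sizes within each study first, then average the
    resulting study means, so that each study counts once."""
    by_study = defaultdict(list)
    for study, d in effects:
        by_study[study].append(d)
    return mean(mean(values) for values in by_study.values())


# Hypothetical data: study A reports four outcome measures, B and C one each.
effects = [
    ("A", 1.2), ("A", 1.0), ("A", 1.1), ("A", 1.1),
    ("B", 0.2),
    ("C", 0.3),
]

print("mean over effect sizes:", round(pooled_mean(effects), 3))
print("mean over study means:", round(study_level_mean(effects), 3))
```

In this invented example the mean over individual effect sizes is pulled towards study A simply because A used more outcome measures, whereas the mean of study means weights the three studies equally.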
REFERENCES
ANDREWS G. and HARVEY R. (1981) Does psychotherapy benefit neurotic patients? Archs gen. Psychiat. 38, 1203-1208.
BURGER J. M. (1981) Motivational biases in attribution of responsibility for an accident: a meta-analysis of the defensive-attribution hypothesis. Psychol. Bull. 90, 496-512.
COHEN J. (1977) Statistical Power Analysis for the Behavioural Sciences, 2nd edn. Academic Press, New York.
COOPER H. M. (1979) Statistical combining of independent studies: a meta-analysis of sex differences in conformity research. J. Person. soc. Psychol. 37, 131-146.
EYSENCK H. J. (1978a) An exercise in mega-silliness. Am. Psychol. 33, 517.
EYSENCK H. J. (1978b) Correspondence. Bull. Br. psychol. Soc. 31, 56.
FRANK J. D. (1979) The present status of outcome studies. J. consult. clin. Psychol. 47, 310-316.
KAZRIN A., DURAC J. and AGTEROS T. (1979) Meta-analysis: a new method for evaluating therapy outcome. Behav. Res. Ther. 17, 397-399.
MULLEN B. and SULS J. (1982) The effectiveness of attention and rejection as coping styles: a meta-analysis of temporal differences. J. psychosom. Res. 26, 43-49.
PAUL G. L. and LICHT M. M. (1978) Resurrection of uniformity assumption myths and the fallacy of statistical absolutes in psychotherapy research. J. consult. clin. Psychol. 46, 1531-1534.
RACHMAN S. J. and WILSON G. T. (1980) The Effects of Psychological Therapy, 2nd edn. Pergamon Press, Oxford.
ROSENTHAL R. (1978) Combining results of independent studies. Psychol. Bull. 85, 185-193.
SHAPIRO D. A. (1980) Science and psychotherapy: The state of the art. Br. J. med. Psychol. 53, 1-10.
SMITH M. L. and GLASS G. (1977) Meta-analysis of psychotherapy outcome studies. Am. Psychol. 32, 752-760.
SMITH M. L., GLASS G. V. and MILLER T. I. (1980) The Benefits of Psychotherapy. The Johns Hopkins Press, Baltimore.
* Maudsley Hospital, Denmark Hill, London SE5, England