Journal of Psychosomatic Research, Vol. 38, No. 8, pp. 823-835, 1994
Copyright © 1994 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
0022-3999/94 $7.00 + .00
Pergamon
0022-3999(94)0007%6

THE EVALUATION OF LIFE EVENT DATA

SIEGFRIED GEYER,* MATTHIAS BROER,* HORST HALTENHOF,† KARL-ERNST BUHLER† and URSULA MERSCHBACHER†

(Received for publication 20 June 1994)
Abstract-Data from three life event studies are compared. The interviews covered events that occurred within a period of 2 years before interview. The same inventory was used in each of the studies. Samples were drawn from depressives, myocardial infarction patients and an industrial worker population. The patient groups were interviewed twice within 4 weeks. Fewer than 50% of the total number of events reported in the retests were recorded twice. For events reported in both interviews the correlations of subjective appraisals were only moderate. There is considerable fall-off for reports of events occurring more than 6 months before interview. It was expected that the severest events would have the lowest, and the least severe ones the highest frequencies. Instead, inappropriate labels of the rating scales led to clusterings of severity ratings at their extreme points. Numbers of events and severity ratings were positively correlated with measures of depression.
INTRODUCTION
Research on stressful life events and illness has produced four different measurement approaches. (1) In self-report inventories respondents go through a list of events and check whether one or more of them have occurred within a defined time period. A life change score is assigned to each event type, indicating the adaptational effort it requires. Individual life change is computed by adding up the scores for the reported event categories. Holmes and Rahe’s Social Readjustment Rating Scale (SRRS) [1] and Dohrenwend’s Psychiatric Epidemiology Research Interview (PERI) [2] are well-known examples. (2) A variation of the first approach is to leave out the weights and to analyse events as separate categories, e.g. deaths, divorces or accidents [3]. (3) In inventories with respondent-based distress ratings subjects again go through a list, and each reported event is judged according to its stressfulness. This can be done for one dimension, e.g. an overall distress measure [4], or for specific characteristics such as controllability and predictability [5]. (4) Events can be assessed by semi-structured and tape-recorded interviews. Severity ratings are performed by interviewers using the reported contextual event characteristics as a basis. Well-trained raters skilled in the use of a set of detailed
* Institute of Medical Sociology, Heinrich-Heine University, Dusseldorf.
† Clinic of Psychiatry, Philipps-University, Marburg.
Address correspondence to: Dr Siegfried Geyer, Institute of Medical Sociology, Faculty of Medicine, Heinrich-Heine University Dusseldorf, Postfach 101007, 40001 Dusseldorf, Germany.
event examples and interpretational rules are required. Brown and Harris’ Life Events and Difficulties Scale (LEDS) is the best-known method of this sort [6]. These different approaches have been assessed in several studies [7-16]. Comparisons between checklist inventories and the LEDS show that there is little overlap in reported events, and that the proportion recollected with both of them is below 30% [14, 16]. This suggests that different phenomena are being measured, even if the instruments cover similar types of events and time spans [8, 12, 16]. With increasing severity of events both methods show converging results, i.e. severe loss events are recorded regardless of the type of instrument applied [8, 9, 15]. In contrast to the considerable number of studies comparing different methods, reliability analyses with the same instrument are rare. In earlier studies reliability measures were not determined on an event basis. Instead, a more indirect procedure was used: test-retest associations were computed for life change scores derived from the SRRS. These correlations vary between r = 0.34 and r = 0.67 [17]. Using sum indices may conceal that the same individual score at two measurements can be composed of dissimilar events. When comparisons are made event by event the concordances drop considerably [16]. These results are confirmed when more than two measurements are performed: Raphael et al. [18] used Dohrenwend’s PERI over 12 interviews within a year. Sixty per cent of the events reported in an interview were omitted 6 weeks later. If subjective event ratings are collected in consecutive interviews the stability of appraisals is also at issue. However, in the literature reviewed for this paper this problem has not been considered. The principal reason for inconsistencies is forgetfulness. Its magnitude depends upon characteristics of events and assessment methods.
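The point made above, that identical sum scores can hide dissimilar events, can be illustrated with a minimal Python sketch. The event names, the weights and the concordance measure (events reported at both interviews divided by all events reported at either one) are illustrative assumptions, not SRRS values and not a procedure taken from the cited studies.

```python
# Hypothetical life-change weights for a few event types (not SRRS values).
WEIGHTS = {"job_loss": 40, "illness": 40, "move": 20, "marriage": 20}


def life_change_score(events):
    """Sum the weights of the reported event types."""
    return sum(WEIGHTS[e] for e in events)


def event_concordance(first, second):
    """Share of all events reported at either interview that appear in both."""
    first, second = set(first), set(second)
    union = first | second
    # Two empty reports are treated as fully concordant (an assumption).
    return len(first & second) / len(union) if union else 1.0


t1 = ["job_loss", "move"]      # hypothetical report at the first interview
t2 = ["illness", "marriage"]   # hypothetical report at the second interview

print(life_change_score(t1), life_change_score(t2))  # 60 60
print(event_concordance(t1, t2))                     # 0.0
```

In this constructed case the two sum scores agree perfectly although not a single event was reported twice, which is why an event-by-event comparison gives the stricter test of reliability.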
The probability of an event being reported diminishes with increasing temporal distance between the date of the event and the interview. When all types of negative events are considered, a fall-off rate of 4% per month is reported for listing inventories (i.e. the first and second type of instruments described above) [19-21]. For events with a highly stressful impact (e.g. loss experiences) the forgetting rate is low; for positive events it is about 9% per month. The differences in reporting are much smaller when lengthy and in-depth interviews are performed. Studies using the LEDS consistently find a lower fall-off [19, 22, 23]. Regardless of how life change is measured, highly stressful conditions are rare, and less severe ones are reported more often. This distribution also corresponds to everyday experience and should be reproducible with any type of instrument, even when minor events are easily forgotten. In addition to differences in the reporting of events, distress ratings may differ between two measurements for a number of reasons: the appraisal of an event occurring shortly before an interview may be different from a judgement of that same event observed from a more distant time perspective. In an earlier study it was found that distress was rated as more severe for events that occurred close to the time of interview than for more distant ones [24]. This phenomenon is due to the success or failure of coping with the consequences of events [25]. Further situational and personal factors may influence reporting behaviour: experiments have shown that remembering may be dependent on moods [26-29]. Events connected to a certain mood state, e.g. death events and being depressed, are more likely to be remembered when subjects are in the corresponding frame of mind. Thus, when groups with different mood states are interviewed, greater numbers of events will be reported by
those who are depressed. This may occur when chronically ill patients and healthy controls are compared, and may lead to false conclusions about event effects on health status. In addition, severity ratings may be influenced by situational conditions: respondents may rate the amount of distress associated with an event in line with their current frame of mind. This may apply when respondents do not have enough time to recollect the information needed to form a rating. Mood states may then serve as a cognitive anchor for judgements. This is especially true of listing inventories, which leave little time for recalling events. This source of bias is excluded when lengthy interviews are conducted (e.g. the LEDS) that call a lot of contextual information to mind. These factors may influence other aspects of reporting behaviour not considered further in this paper, e.g. the precision of dating events [28, 30], estimating the duration of events [31], the predictive value of different weighting schemes [7, 32] and problems of ambiguity of event categories. This paper illustrates the problems with data from three studies with repeated measurements of events. The same respondent-based inventory is used throughout; it combines a list administered by an interviewer with subjective ratings performed for each reported event. Four main issues are considered. (1) The reliability of event reports over different measurements will be analysed. For events reported in consecutive interviews concordances in subjective appraisals will be considered. (2) Forgetting of events will be presented as fall-off curves for each study. In a second step similarity measures are computed to compare the data structure over measurements within and between studies. These analyses provide information on whether the inventory stimulates recollection consistently over different samples.
(3) The distribution of the severity ratings should show a structure that was assumed to be a naturally occurring pattern: the most severe category should have the lowest frequency and the least severe category the highest. (4) Finally, situational factors influencing remembering and rating events will be examined. It will be investigated whether events close to the date of interview are rated as more severe than more distant ones, and whether depressivity has an impact on recalling and rating events.
METHOD
Study 1 was conducted to explore the reliability of life event reports in myocardial infarction patients and took place in a rehabilitation clinic. Fifty-six (17 female, 39 male) subjects were interviewed twice: immediately after admission and before discharge (interval between T1 and T2: 4 weeks). The mean age was 52.1 yr (SD = 6.2) and the average interval between infarction and admission into the rehabilitation clinic was 9 weeks.

Study 2 was also a retest, with the same goals as the first one. Fifty depressive patients (31 female, 19 male) were interviewed at the onset and at the end of their stay in a psychiatric hospital. The time span between the interviews was again 4 weeks. The mean age was 51.2 yr (SD = 14.2). The periods between the onset of a depressive episode and hospitalization were difficult to determine. The inaccuracy of the information gleaned from patients and relatives rendered it unusable.

Study 3 comprised data that were collected as part of a prospective study by Siegrist et al. [33]. A sample of blue-collar workers was interviewed four times over 6.5 yr. The study dealt with chronic and acute psychosocial conditions causing increased risks for myocardial infarction. Life events were assessed at the first three interview waves. The analyses below include only the first two, but the results can be
generalized to the third one. The overlap of the covered time periods was so small that reliability analyses were not possible here. The sample size at the first interview (T1) was N = 416, at the second (T2) N = 356. The mean age (T1) was 40.8 yr (SD = 13.9). The data of this study were included to have a comparison sample of healthy subjects. It has been recognized that the interviewer instructions between the studies (myocardial infarction patients and depressive patients on the one hand and the industrial worker samples on the other) may have differed, although it is not clear to what extent.

Measures

All studies used the Inventory of Life Changing Events (ILE) [5]. It combines an event list with respondent-based ratings. The interviewer reads through the list of 32 potentially stressful events, e.g. divorce, death events, accidents, illnesses, and two open categories. For every reported event a series of ten descriptive dimensions is completed by the respondents (predictability, controllability, situative vulnerability, subjective importance of an event, disruption of daily routines, active coping behaviour, two items on social support, ‘psychological costs’ of event coping, previous experience with an event, and the degree of event-related distress present at the date of interview). Every dimension is operationalized as a statement. Subjects give their responses on five-point scales (1...5) whose ends are labelled ‘true’ and ‘not true’. ‘1’ denotes the lowest, ‘5’ the highest degree of controllability. An index for single events is computed by adding up the scores over all event dimensions. Individual scores are computed as the mean values of all reported events. In the Introduction it was explained that using sum indices as a basis for reliability measures might underestimate error rates. Thus in the following examinations the focus of attention will be on single events.
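The index construction described above (sum over the ten dimensions for each event, mean over all reported events for each respondent) can be sketched as follows. This is a minimal sketch under the assumption that all ten ratings enter the sum as given; whether the original ILE scoring reverses any dimension is not stated in the text, and the data are hypothetical.

```python
DIMENSIONS = 10  # the ILE rates each reported event on ten dimensions


def event_index(ratings):
    """Index for one event: sum of its ten ratings, each on the 1..5 scale."""
    if len(ratings) != DIMENSIONS or any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("expected ten ratings between 1 and 5")
    return sum(ratings)


def individual_score(events):
    """Score for one respondent: mean of the indices of all reported events."""
    if not events:
        raise ValueError("respondent reported no events")
    return sum(event_index(r) for r in events) / len(events)


# Hypothetical respondent with two reported events.
respondent = [
    [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],  # event index 50
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # event index 10
]
print(individual_score(respondent))  # 30.0
```

The example also shows why the analyses focus on single events: a respondent with two extreme events and one with two middling events (index 30 each) would receive the same individual score.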
As the ILE has ten dimensions, considering all of them would take considerable space. Instead, it was decided to use only one. Controllability was chosen because of its central role in stress theory [34, 35]. A high degree of controllability is assumed to be connected with a low distress level. The recommended time span to be covered with the ILE is 2 yr before an interview. For the analyses below the 24 months were divided into 3-month intervals.

Besides the ILE, depression inventories were applied in studies 1 and 2 to analyse whether differences in event reporting and in rating event stress are dependent on the degree of depression. In study 1 the German version of the Center of Epidemiological Studies Depression Scale (CES-D) [36] was applied. This questionnaire consists of 20 statement items, e.g. ‘During the last week I felt worried about things that usually do not affect me’. Each item has four response alternatives specifying the frequency of occurrence of the described state (assigned values: 0...3; verbal labels: not at all/sometimes/often/most of the time). Item values are totalled and higher scores are interpreted as higher degrees of depressivity. Owing to external constraints the CES-D was administered only once. In study 2 the German version of the Self-Rating Depression Scale (SDS) [37] was used. The SDS consists of 20 items with statements like ‘Sometimes I suddenly start to weep or I feel like weeping’. These statements are also to be judged for frequency of occurrence on four-point scales (assigned scores: 0...3; verbal labels: not at all/sometimes/often/most of the time). The item values are summed, with higher scores indicating higher degrees of depressivity. In study 3 no data on potentially confounding factors were collected.
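The division of the 24-month recall period into 3-month intervals can be sketched as below. The representation of event dates as "months before the interview" and the function names are assumptions made for illustration; interval 1 is taken to be the one closest to the interview.

```python
from collections import Counter

N_INTERVALS = 8          # 24 months split into 3-month intervals
MONTHS_PER_INTERVAL = 3


def interval_of(months_before_interview):
    """Map months elapsed before the interview (0 <= m < 24) to interval 1..8."""
    if not 0 <= months_before_interview < N_INTERVALS * MONTHS_PER_INTERVAL:
        raise ValueError("event lies outside the 24-month window")
    return int(months_before_interview // MONTHS_PER_INTERVAL) + 1


def relative_frequencies(months_list):
    """Percentage of reported events falling into each of the eight intervals."""
    total = len(months_list)
    if total == 0:
        return [0.0] * N_INTERVALS
    counts = Counter(interval_of(m) for m in months_list)
    return [100.0 * counts.get(i, 0) / total for i in range(1, N_INTERVALS + 1)]


# Hypothetical event dates, in months before the interview.
months = [0.5, 1, 2, 2.5, 4, 5, 7, 13, 20, 23]
print(relative_frequencies(months))
# [40.0, 20.0, 10.0, 0.0, 10.0, 0.0, 10.0, 10.0]
```

A profile of this kind, computed per study and interview, is what a fall-off curve plots: if events were reported without forgetting and occurred at a constant rate, each interval would hold roughly 12.5% of the events.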
RESULTS
Reliability of life event reports

Reliability measures could only be computed for the first and second study. The third one was not a retest. The rate of reliably recalled events was lower than 50% for both studies (Table I). For the first study events reported only once were fairly evenly distributed over the two interviews. The distribution was unbalanced for the second study, since 38% of all events were reported only at the first interview. Since only a few events were reported twice, the computation of reliabilities for controllability ratings is based on these few (Tables II and III). The highest reliability would be obtained if all event ratings were consistent over both measurements. In this case all cells outside the main diagonal in Tables II and III would be zero. The greatest stabilities in the data were found in the category indicating the highest degree of controllability (65% and 75%); for the remaining categories the rates of agreement were much lower. This is reflected in
Table I. Frequencies and percentages for events reported in the two retest studies

                                                Study 1         Study 2
Total number of reported events                 173 (100%)      246 (100%)
Events reported in both interviews               76 (43.9%)     120 (48.8%)
Events reported only in the first interview      56 (32.4%)      94 (38.2%)
Events reported only in the second interview     41 (23.6%)      32 (13%)

Table II. Event ratings for the controllability dimension for both interviews for study 1

Cell entries: frequency (row %, column %). Rows: first interview rating. Columns: second interview rating.

First interview    Second interview rating
rating             1                2                3                4                5                 Row total
1                  5 (29.4, 41.7)   3 (17.6, 37.5)   3 (17.6, 42.9)   0                6 (35.3, 15.0)    17 (22.4)
2                  2 (28.6, 16.7)   2 (28.6, 25.0)   0                1 (14.3, 11.1)   2 (28.6, 5.0)      7 (9.2)
3                  1 (25.0, 8.3)    1 (25.0, 12.5)   0                0                2 (50.0, 5.0)      4 (5.3)
4                  0                1 (12.5, 12.5)   0                3 (37.5, 33.3)   4 (50.0, 10.0)     8 (10.5)
5                  4 (10.0, 33.3)   1 (2.5, 12.5)    4 (10.0, 57.1)   5 (12.5, 55.6)   26 (65.0, 65.0)   40 (52.6)
Column total       12 (15.8)        8 (10.5)         7 (9.2)          9 (11.8)         40 (52.6)         76 (100.0)

Chi² = 24.30; df = 16; p = 0.08.
Kendall’s tau-b, an association measure that is sensitive to deviations from the main diagonal. For study 1 it was tau = 0.49, for study 2 it was tau = 0.42, which is only moderate.

Forgetting of events
The second question listed in the Introduction referred to the forgetting of events, and whether there are differences over the three studies. The fall-off rates are
Table III. Event ratings for the controllability dimension for both interviews for study 2

Cell entries: frequency (row %, column %). Rows: first interview rating. Columns: second interview rating.

First interview    Second interview rating
rating             1                 2                3                4                5                 Row total
1                  13 (43.3, 61.9)   5 (16.7, 55.6)   3 (10.0, 23.1)   0                9 (30.0, 13.2)     30 (25.0)
2                  0                 1 (25.0, 11.1)   1 (25.0, 7.7)    0                2 (50.0, 2.9)       4 (3.3)
3                  1 (7.7, 4.8)      2 (15.4, 22.2)   4 (30.8, 30.8)   0                6 (46.2, 8.8)      13 (10.8)
4                  1 (16.7, 4.8)     0                2 (33.3, 15.4)   2 (33.3, 22.2)   1 (16.7, 1.5)       6 (5.0)
5                  6 (9.0, 28.6)     1 (1.5, 11.1)    3 (4.5, 23.1)    7 (10.4, 77.8)   50 (74.6, 73.5)    67 (55.8)
Column total       21 (17.5)         9 (7.5)          13 (10.8)        9 (7.5)          68 (56.7)         120 (100.0)

Chi² = 55.42; df = 16; p = 0.00.
Fig. 1. Relative frequencies of remembered life events of both interview series for the first study (x-axis: three-month intervals 1-8; series: first interview, second interview).
presented in Figs 1-3, and events are plotted in percentages. The 2 yr covered were divided into 3-month intervals, with the first one embracing the date of interview. In the reliability studies (studies 1 and 2) the major proportions of events fell into the first and second time interval, then they showed a sharp decline. It can be
Fig. 2. Relative frequencies of remembered life events of both interview series for the second study (x-axis: three-month intervals 1-8).

Fig. 3. Relative frequencies of remembered life events of both interview series for the third study (x-axis: three-month intervals; series: first interview, second interview).
assumed that the probability of occurrence should be the same over the 2 yr span, so most events that happened before the first or second period remained unreported. The pattern observed in the third study (Fig. 3) was different as the highest proportion of events appeared in the second period. In order to get measures for similarities between the distributions, we used an approach derived from cluster analysis [38].
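A similarity measure of this kind can be sketched in Python under the assumption that it is a Pearson correlation between two fall-off profiles, each profile being the percentage of reported events per 3-month interval. The profiles below are hypothetical and are not the study's data; the exact procedure of Ref. [38] is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical percentages of events per 3-month interval.
steep = [40, 25, 10, 8, 6, 5, 3, 3]     # most events in the first interval
delayed = [15, 30, 20, 12, 9, 6, 5, 3]  # peak in the second interval

print(round(pearson(steep, steep), 2))    # 1.0
print(round(pearson(steep, delayed), 2))  # lower, because the peaks differ
```

Because the correlation is computed on the shape of the profile rather than on absolute event counts, two samples of different size can still be compared directly.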
Table IV. Similarities between fall-off rates over all studies (expressed as standardized correlations)

              St. 1/I   St. 1/II   St. 2/I   St. 2/II   St. 3/I   St. 3/II
Study 1/I     1
Study 1/II    0.87      1
Study 2/I     0.82      0.73       1
Study 2/II    0.81      0.75       0.95      1
Study 3/I     0.36      0.30       0.74      0.82       1
Study 3/II    0.34      0.36       0.76      0.80       0.96      1

The Roman numerals denote the interview: I, first interview; II, second interview.

Table V. Distributions of event ratings for the controllability dimension

Cell entries: n (%).

                  Study 1                    Study 2                    Study 3
Rating            T1           T2            T1           T2            T1            T2
1 (‘True’)        33 (25.0)    21 (17.9)     55 (25.7)    28 (18.4)     154 (26.6)    155 (27.3)
2                 14 (10.6)    15 (12.8)      6 (2.8)     10 (6.6)       64 (11.0)     80 (14.1)
3                  8 (6.1)      8 (6.8)      27 (12.6)    20 (13.2)      68 (11.7)     87 (15.3)
4                 11 (8.3)     14 (12.0)     13 (6.1)     10 (6.6)       64 (11.0)     56 (9.9)
5 (‘Not true’)    66 (50.0)    59 (50.4)    113 (52.8)    84 (55.3)     232 (39.9)    189 (33.3)
N                132          117           214          152            582           567
Standardized correlations were computed between fall-off rates over interview points and over groups to assess similarities between studies (Table IV). The associations within studies were at least r = 0.80. Comparisons between studies showed considerable dissimilarities, depending on the pair of interviews considered. Studies 2 and 3 on the one hand and studies 1 and 2 on the other were quite similar. The greatest difference was between the first and the third study. Earlier research examined whether fall-off rates are dependent on event severity. Owing to low reliabilities this analysis cannot be performed with the data in hand, as will be shown in the next section. A methodologically correct procedure requires ratings that are stable over both interviews, but the respective frequencies are not sufficient to perform statistical analyses.

Distribution of severity ratings
The third problem listed in the Introduction dealt with the question of whether severity ratings assigned to events reflected what was assumed to be a naturally occurring pattern, i.e. most events should be rated into the least severe category, and the smallest number should receive the most severe rating on the controllability dimension. Table V shows the distribution for all three studies; the figures for the remaining
Table VI. Correlations between measures of depression, frequencies of events and severity ratings of events for the controllability dimension

Study      Interview   Depression measure     Event characteristics   Correlation
Study 1    -           Depressivity (CES-D)   Number of events        0.22
                                              Rating of events        0.02
Study 2    T1          Depressivity (SDS)     Number of events        0.04
                                              Rating of events        0.25
Study 2    T2          Depressivity (SDS)     Number of events        0.34
                                              Rating of events        0.11
dimensions showed the same distributional pattern (further details about the remaining event dimensions are available on request). The distributions of responses on the controllability scale were similar for all three studies. The extreme points attracted the majority of responses, while the categories in between were chosen less frequently. At first glance it seemed that the trend towards extreme responses was stronger in studies 1 and 2 (especially for the category ‘not true’) than in study 3. A discriminant analysis [39] was performed to obtain a measure for the similarity of the response patterns presented in Table V. Wilks’ Lambda was chosen as an appropriate statistic to distinguish between them. It is an inverse measure that gives the unexplained variance after discriminating variables have been introduced. Here, the controllability ratings were taken as discriminating factors and sample membership served as a grouping variable. For the initial series of interviews of all groups, Lambda was 0.99 (less than 1% of explained variance), i.e. there was no group-specific response pattern. This was again reflected in the classification analysis: 100% of the cases were assigned to the sample with the highest a priori probability (study 3). The results for the second interview series were similar. Here Lambda was 0.97, and again 100% of the cases were classified into the largest sample.

Effects of situational factors on reporting and rating events
In the Introduction we postulated that severity ratings may also be time dependent: events having occurred close to the date of interview might be evaluated as more severe than more distant ones as the coping process is still in progress. There is no evidence for this in the present data. The findings of an earlier study [24] could not be replicated, and the corresponding analyses will therefore not be presented. Nevertheless, there are associations between measures of depression and event reports (Table VI). The correlations are not consistently substantial, but they are at least as strong as those reported in the literature for associations between events and illness. For the depression study the mean difference in depressivity between interviews is significant (M(T1) = 54.8, SD = 8.5; M(T2) = 39.2, SD = 11.5; t(43) = 8.01; p = 0.01). For study 1 this cannot be analysed since depression was assessed only once. The depressive patients reported more events at the first interview than at the second (n(T1) = 214 vs n(T2) = 152). The average numbers per individual are 4.1 (T1) and 3.0 (T2). For the myocardial infarction patients the corresponding numbers per individual are 2.4 (T1) and 1.6 (T2), respectively.
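The comparison of depression scores between the two interviews is reported as a t statistic with 43 degrees of freedom. Assuming this was a paired (within-person) t test, which the text does not state explicitly, the statistic can be sketched as follows; with df = n - 1, a value of 43 would correspond to 44 respondents with scores at both interviews. The scores below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for the mean within-person difference first - second."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need paired scores for at least two respondents")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / sqrt(len(diffs))), len(diffs) - 1


# Hypothetical depression scores of six respondents at T1 and T2.
scores_t1 = [58, 61, 49, 55, 63, 52]
scores_t2 = [41, 50, 45, 38, 47, 40]

t, df = paired_t(scores_t1, scores_t2)
print(round(t, 2), df)
```

The paired form uses each respondent as his or her own control, so stable between-person differences in depressivity do not inflate the standard error of the change score.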
DISCUSSION
The first section of the results was concerned with reliability, i.e. whether events were reported and rated consistently in two consecutive interviews. It turned out that the probability for events to be reported twice was less than 50% after 4 weeks. For the minority for which controllability ratings could be compared, test-retest associations were again only moderate. A consistent result in life event research is that an increased risk of illness can only be expected for major life events. As these are rarely forgotten [19-21], inconsistent reporting affects the less severe events, which in turn does not necessarily impair the predictive value of the data. However, Tables II and III show that it is not easy to identify the most severe events (category ‘1’) when the controllability ratings are taken as a criterion: several events rated ‘1’ in the first interview got a ‘5’ in the second, and vice versa. Thus the meaning of the ratings is not clear. If not much time can be devoted to the assessment of events it seems to be a better solution to analyse the separate categories [3] instead of forming indices. Our second question concerns the forgetting of events. The fall-off curves were considered and compared over studies and interviews. Overall, there is considerable forgetting over the period covered by the life event interview. In studies 1 and 2 most events are reported to have occurred in the two intervals preceding the interview, while in study 3 peaks appear in the second and third interval. Two explanations are possible for the different distributions. The first is that a number of events are reported that might have contributed to the onset of myocardial infarction (study 1) or depression (study 2). This has not been tested with the present data, but in Brown and Harris’ study [6] the period between event and onset of a depressive episode was about 9 weeks.
In an earlier case-control study myocardial infarction patients reported increased event-related distress within 12 weeks before disease onset [40]. These events might appear as peaks in the first and the second time interval (cf. Figs 1 and 2) with a fall-off from the first to the following periods. In the third study (a healthy sample) the number of potential illness-provoking events should be much lower. This should explain the lack of marked clusters in the first two intervals. The other explanation is directed towards differences in interviewer instructions. When the criteria for recording an event are restrictive, less severe events will not be included in the data. Since these are forgotten within short time periods, the fall-off rates vary between studies with different criteria for what is to be counted as an event. In the first and the second study the instructions were the same, but it was not possible to reconstruct whether they were alike in the third study. The third theme of the analysis is the question of whether the data reflected a naturally occurring pattern: the largest proportion of events should have been classified into the least severe category, and the smallest proportion should have obtained the lowest controllability rating [23, 43]. In the three studies such a distribution could not be replicated. The severity ratings consistently clustered at the two ends of the scale. Such a pattern is unusual in life event research and it was concluded that the labelling of response categories may be a crucial determining factor. In scaling studies it was demonstrated that the type of scale (i.e. its verbal and numeric description) strongly determines subjects’ response behaviour
[44, 45]. In the ILE, five-point scales are used. Only their extreme ends, ‘1’ and ‘5’, are labelled: ‘true’ and ‘not true’. Usually these expressions are used in items with two response alternatives. In the present context the extreme points should not have caused interpretation problems, but the intermediate categories ‘2’, ‘3’ and ‘4’ may have done so since the respondents had to generate the lacking verbal meanings by themselves. It was assumed that this was not very successful. Instead, uncertainty about the (subjectively) correct response arose. The conflict was resolved by preferring the extreme categories, with the consequence that for the researcher their substantive meaning is unclear. For the present data not much can be done about this, but for further studies a solution may be to assign verbal labels to every point on the scale. These expressions will have to be chosen with equal semantic distances between them [cf. Ref. 46]. The last main issue is the role of situational influences on reporting and rating events. As mentioned in the Introduction, depression changes the probability that information will be remembered. More events should be reported that carry an affective tone consistent with the current frame of mind. In the psychiatric sample the depression scores diminished between the first and second interview; the rate of events reported only at one interview dropped from 38.2% to 13% (all reported events at both interviews taken as 100%). In the same way the rate of reported events per person decreased. Since different depression questionnaires were used in the patient samples, it was not possible to compare depressive states between the studies. Nevertheless, it can be assumed that the heart patients (study 1) were less depressive than the psychiatric respondents (study 2). This should explain the lower rate of reported events in the infarction sample.
There was no association between depression and the number of events in the first interview series of the depression study. This is an unexpected result, but there is evidence that remembering may be impaired in higher degrees of depression [27]. The association between mood states and memory may therefore not be linear. We need to consider whether life event inventories such as the instrument used here yield valid information. A general conclusion is difficult, but a solution for life event studies is to take explanatory factors and confounders simultaneously into account and to apply multivariate methods instead of bivariate analyses. If effects of life events on disease outcomes persist, the associations between events and illness can then be interpreted. To sum up: collecting data on life changing events is faced with a number of problems, beginning with the problem of recollection. In cross-sectional studies with a single interview per respondent forgetting processes are largely obscured. Conclusions about them can only be drawn on the aggregate level by inspecting fall-off rates. Since even divorce and death events can become subject to forgetting, suppression or denial, important personal landmarks may remain unreported. This can lead to erroneous conclusions about event-illness links. The rating scale was given unsuitable labels that led subjects to make extreme event assessments. However, improvements are possible. Verbal labels for every point of the scales should enhance the quality of the data. On the whole, the application of more detailed interviewing techniques like the LEDS is the best choice, but this is often not feasible in large-scale studies. Finally, situational factors, especially the current state of mind, have an influence
on recollection of events. Although associations between event appraisals and depression are moderate, taking mood states into account is essential since relationships between events and illness are similar in magnitude. The present studies did not enable us to determine whether correlations between events and illness may disappear when confounders are taken into account, because appropriate comparison groups were missing. This question has to be left open, but there is hope, since much unexplained variance remains in the data.

Acknowledgements-We are indebted to Johannes Siegrist for giving us the opportunity to reanalyse the life event data from the industrial worker study and also for helpful comments on an earlier draft of this manuscript. Karin Siegrist’s support is greatly appreciated, and Christa Mettler conducted a number of interviews for the myocardial infarction study.
REFERENCES

1. HOLMES TH, RAHE RH. The Social Readjustment Rating Scale. J Psychosom Res 1967; 11: 213-218.
2. DOHRENWEND BS, KRASNOFF L, ASKENASY AR, DOHRENWEND BP. Exemplification of a method for scaling life events: the PERI life events scale. J Health Soc Behav 1978; 19: 205-229.
3. BRUGHA T, BEBBINGTON P, TENNANT C, HURRY J. The list of threatening experiences: a subset of 12 life event categories with considerable long-term contextual threat. Psychol Med 1985; 15: 189-194.
4. SARASON IG, JOHNSON JA, SIEGEL JM. Assessing the impact of life changes: development of the Life Experiences Survey. J Consult Clin Psychol 1978; 46: 932-946.
5. SIEGRIST J, DITTMANN K. Inventar lebensverändernder Ereignisse (Inventory of Life Changing Events). In ZUMA-Handbuch sozialwissenschaftlicher Skalen (Edited by Wegener B, Allmendinger J, Schmidt P). Mannheim: Zentrum für Umfragen, Methoden und Analysen, 1983.
6. BROWN GW, HARRIS T. Social Origins of Depression. London: Tavistock, 1978.
7. ROSS CE, MIROWSKY J. A comparison of life event weighting schemes: change, undesirability, and effect-proportional indices. J Health Soc Behav 1979; 20: 166-177.
8. FARAVELLI C, AMBONETTI A. Assessment of life events in depressive disorders. A comparison of three methods. Soc Psychiatry 1983; 18: 51-56.
9. BEBBINGTON P, TENNANT C, STURT E, HURRY J. The domain of life events: a comparison of two techniques of description. Psychol Med 1984; 14: 219-222.
10. ZIMMERMAN M, PFOHL B, STANGL D. Life events assessment of depressed patients: a comparison of self report and interview formats. J Human Stress 1986; 12: 13-19.
11. ZUCKERMAN LA, OLIVER JM, HOLLINGSWORTH HH, AUSTRIN HA. A comparison of life events scoring methods as predictors of psychological symptomatology. J Human Stress 1986; 12: 64-70.
12. KATSCHNIG H. Measuring life stress. A comparison of two methods. In The Suicide Syndrome (Edited by Farmer R, Hirsch S). London: Croom-Helm, 1980.
13. KATSCHNIG H. Measuring life stress: a comparison of the checklist and the panel technique. In Life Events and Psychiatric Disorders (Edited by Katschnig H). Cambridge University Press, 1986.
14. OEI TI, ZWART FM. The assessment of life events: self administered questionnaire versus interview. J Affective Disord 1986; 10: 185-190.
15. COSTELLO CG, DEVINS GM. Two-stage screening for stressful life events and chronic difficulties. Can J Behav Sci 1988; 20: 85-92.
16. MCQUAID JR, MONROE SM, ROBERTS JR, JOHNSON SI, GARAMONI GL, KUPFER DJ, FRANK E. Toward the standardization of life events assessment: definitional discrepancies and inconsistencies in methods. Stress Med 1992; 8: 47-56.
17. NEUGEBAUER R. The reliability of life-event reports. In Stressful Life Events and their Contexts (Edited by Dohrenwend BS, Dohrenwend BP). New Brunswick: Rutgers University Press, 1984.
18. RAPHAEL K, CLOITRE M, DOHRENWEND BP. Problems of recall and misclassification with checklist methods of measuring stressful events. Health Psychol 1991; 10: 62-74.
19. BROWN GW, HARRIS T. Fall-off in the reporting of life events. Soc Psychiatry 1982; 17: 23-28.
20. UHLENHUTH E, BALTER MD, LIPMAN RS, HABERMAN SJ. Remembering life events. In The Origins and Course of Psychopathology (Edited by Strauss JS, Babigian HM, Roff M). London: Plenum, 1977.
21. FUNCH DP, MARSHALL DR. Measuring life stress: factors affecting fall-off in the reporting of life events. J Health Soc Behav 1984; 25: 453-464.
22. NEILSON E, BROWN GW, MARMOT M. Myocardial infarction. In Life Events and Illness (Edited by Brown GW, Harris T). London: Unwin Hyman, 1989.
23. GEYER S. Lebensverändernde Ereignisse und Brustkrebs. Bern: Huber, 1991.
24. GEYER S. Methodische Aspekte der Erfassung lebensverändernder Ereignisse. In Jugend '92. Lebenslagen, Orientierungen und Entwicklungsperspektiven im vereinigten Deutschland, Vol. 4 (Jugendwerk der Deutschen Shell), pp. 27-39. Opladen: Leske & Budrich, 1992.
25. LAZARUS RS, FOLKMAN S. Stress, Appraisal, and Coping. New York: Springer, 1984.
26. FOGARTY SJ, HEMSLEY DR. Depression and the accessibility of memories. Br J Psychiatry 1982; 142: 232-237.
27. CLARK DM, TEASDALE JD. Diurnal variation in clinical depression and accessibility of memories of positive and negative experiences. J Abnorm Psychol 1982; 91: 87-95.
28. SALOVEY P, SINGER J. Mood congruency effects in recall of childhood versus recent memories. In Mood and Memory (Edited by Kuiken D). London: Sage, 1991.
29. BOWER G. Mood and memory. Am Psychologist 1981; 36: 129-148.
30. RUBIN DC, BADDELEY AD. Telescoping is not time compression: a model of the dating of autobiographical events. Memory Cognition 1989; 17: 653-661.
31. BURT CD, KEMP S. Retrospective duration estimation of public events. Memory Cognition 1991; 19: 252-262.
32. ZIMMERMAN M. Weighted versus unweighted life event scores: is there a difference? J Human Stress 1983; 9: 30-35.
33. SIEGRIST J, PETER R, JUNGE A, CREMER P, SEIDEL D. Low status control, high effort at work and ischemic heart disease: prospective evidence from blue-collar men. Soc Sci Med 1990; 31: 1127-1134.
34. GLASS DC, SINGER JE. Urban Stress: Experiments on Noise and Social Stressors. New York: Academic Press, 1972.
35. MIROWSKY J, ROSS CE. Social Causes of Psychological Distress. New York: Aldine de Gruyter, 1989.
36. HAUTZINGER M. Die CES-D Skala. Ein Depressionsinstrument für Untersuchungen in der Allgemeinbevölkerung. Diagnostika 1988; 34: 167-173.
37. ZUNG WW. A self-rating depression scale. Arch Gen Psychiatry 1965; 16: 543-541.
38. GOWER JC. A comparison of some methods of cluster analysis. Biometrics 1967; 23: 623-638.
39. KLECKA WR. Discriminant Analysis. Beverly Hills: Sage, 1980.
40. SIEGRIST J, DITTMANN K, RITTNER K, WEBER I. Soziale Belastungen und Herzinfarkt. Stuttgart: Enke, 1980.
41. SCHMID I, SCHARFETTER C, BINDER J. Lebensereignisse in Abhängigkeit von soziodemographischen Variablen. Soc Psychiatry 1981; 16: 63-68.
42. GEYER S. Life events, chronic difficulties, and vulnerability factors preceding breast cancer. Soc Sci Med 1993; 37: 1545-1555.
43. GLICKMAN L, HUBBARD M, LIVERIGHT T, VALCIUKAS JA. Fall-off in the reporting of life events: effects of life change, desirability, and anticipation. Behav Med 1990; 16: 31-37.
44. HIPPLER HJ, SCHWARZ N. Response effects in surveys. In Social Information Processing and Survey Methodology (Edited by Hippler HJ, Schwarz N, Sudman S). Heidelberg: Springer, 1987.
45. SCHWARZ N, HIPPLER HJ. What response scales may tell our respondents: informative functions of response alternatives. In Social Information Processing and Survey Methodology (Edited by Hippler HJ, Schwarz N, Sudman S). New York: Springer, 1987.
46. ROHRMANN B. Empirische Studien zur Entwicklung von Antwortskalen für die sozialwissenschaftliche Forschung. Z Sozialpsychol 1978; 9: 222-245.