The assessment of daytime sleep propensity: a comparison between the Epworth Sleepiness Scale and a newly developed Resistance to Sleepiness Scale


Clinical Neurophysiology 114 (2003) 1027–1033 www.elsevier.com/locate/clinph

C. Violani (a,*), F. Lucidi (a), E. Robusto (b), A. Devoto (a), M. Zucconi (c), L. Ferini Strambi (c)

(a) Department of Psychology, University of Rome ‘La Sapienza’, Via dei Marsi, 78-00185 Rome, Italy
(b) Department of General Psychology, University of Padua, Padua, Italy
(c) Sleep Disorders Center, IRCCS S. Raffaele of Milan, Milan, Italy

Accepted 24 February 2003

Abstract

Introduction: The Epworth Sleepiness Scale (ESS) is widely used to measure subjective sleep propensity in research and clinical practice. Psychometric studies do not rule out the presence of more than one latent dimension underlying the items.

Objective: The aims of the present study were to: (a) evaluate the psychometric properties of the ESS by means of classic psychometric techniques; (b) compare them with those of a newly developed Resistance to Sleepiness Scale (RSS); (c) evaluate, following the latent trait theory, whether the items of both the ESS and the RSS could be conceptualized as different levels of an interval variable representative of a single latent trait related to sleep propensity.

Methods: One hundred and forty-six inpatients suffering from different sleep disorders filled in both the RSS and the ESS in a sleep disorders centre.

Results: Indexes of fit derived from the application of the extended logistic model are consistent with the idea that each ESS item can be conceptualized as a different level of an interval variable representative of a single latent trait. However, most of the ESS items are located at the opposite extremes of this continuum.

Conclusions: The under-representation in the ESS of situations with an intermediate soporific nature could limit the sensitivity of the ESS in detecting intermediate variations of sleep propensity.

© 2003 International Federation of Clinical Neurophysiology. Published by Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Subjective assessment; Daytime sleep propensity; Epworth Sleepiness Scale; Latent trait theory

* Corresponding author. Tel.: +39-06-49917646; fax: +39-06-4451667. E-mail address: [email protected] (C. Violani).
1388-2457/03/$30.00 © 2003 International Federation of Clinical Neurophysiology. Published by Elsevier Science Ireland Ltd. All rights reserved. doi:10.1016/S1388-2457(03)00061-0

1. Introduction

The assessment of daytime sleepiness is relevant both in sleep medicine and in the prevention of somnolence-related accidents. The 8-item Epworth Sleepiness Scale (ESS; Johns, 1991) is widely used to measure subjective sleep propensity in research and clinical practice because of its brevity and simplicity (most subjects complete the one-page form in less than 5 min). The ESS questionnaire requires subjects to rate the chance that they might doze or sleep, on a 4-point scale (0 = ‘would never doze’; 3 = ‘high chance of dozing’), with respect to 8 different everyday situations. The sum of the 8 item responses gives the Epworth score, which can range from 0 to 24.

The ESS has been concurrently validated by data indicating that it can distinguish groups of patients with various sleep disorders from normal subjects (e.g. Johns, 1991; Smolley et al., 1993; Johns, 1994). Patients with obstructive sleep apnoea syndrome (OSAS; ICSD, 1993) were found (Johns, 1992) to have reduced ESS scores after 3 or more months of treatment with nasally applied continuous positive airways pressure (CPAP; Sullivan and Grunstein, 1989). Recently, based on receiver operating characteristic curves, Johns (2000) reported data indicating that ESS scores assign normal and narcoleptic subjects to their correct groups with nearly perfect accuracy.

Linear correlations between the ESS and objective measures of sleep propensity, i.e. the Multiple Sleep Latency Test (MSLT) or the Maintenance of Wakefulness Test (MWT), were found to be weak but significant in different studies (e.g. Johns, 1991, 1994; Briones et al., 1996; Chervin et al., 1997; US Modafinil in Narcolepsy Multicenter Study Group, 1998; Sangal et al., 1999a). Other studies (Chervin and Aldrich, 1999; Benbadis et al., 1999) did not find significant correlations between the ESS and the MSLT or MWT in patients with sleep-disordered breathing. Sangal et al. (1999b) pointed out that cubic models were better at explaining the relationship between the MWT and the ESS: when the MWT shows mild sleepiness, the ESS score is low; within a wide range of intermediate MWT sleepiness, there is no variation in the ESS scores; and when the MWT shows profound sleepiness, the ESS score is very high.

Johns has repeatedly argued that the lack of a relevant correlation between the ESS and objective sleepiness tests is due to the fact that the former concerns sleep propensity in 8 different situations while the latter measure it in only one situation. Johns (1994) emphasized that the different daily life situations considered by the ESS could be defined as ‘highly soporific’ (involving prolonged inactivity, with little or no body movement or interaction with other people) or ‘less soporific’. The ranked mean item scores in different groups of patients suggest that items 5 (lying down to rest in the afternoon when circumstances permit), 2 (watching TV) and 1 (sitting and reading) represent the most soporific situations; items 6 (sitting and talking to someone) and 8 (in a car, while stopped for a few minutes in the traffic) the least soporific; and items 3 (sitting, inactive in a public place), 4 (as a passenger in a car for 1 h without a break) and 7 (sitting quietly after a lunch without alcohol) intermediate ones. Unfortunately, the average scores are not adequate for establishing a stable ranking.
In fact, the mean scores depend on the sample considered and are affected by extreme scores. It should be noted that, even if there are consistencies in the rankings reported from different studies (Johns, 1992, 1994; Izquierdo-Vicario et al., 1997), they may not be considered as universal.

Recently, a different ranking approach has been used. Following the Mokken scale procedure (Debets and Brouwer, 1989), Kingshott et al. (1998) assessed whether the 8 ESS items could be conceptualized as a cumulative, hierarchical and uni-dimensional pattern of sleep-related situations in 129 patients with excessive daytime sleepiness (EDS). Results showed that only items 2, 1, 7 and 6 (in descending order from the most soporific) contributed to a significant cumulative hierarchical scale along a single continuum.

The ESS test–retest reliability was assessed in 87 normal subjects (Johns, 1992) with satisfactory results (Pearson’s r = 0.82); the internal consistency of the 8 ESS items was shown to be good (Cronbach’s alpha = 0.88) in patients suffering from EDS, while it was fair (alpha = 0.73) in subjects without EDS (Johns, 1992). Taken together, these psychometric data allow the use of the ESS as a uni-dimensional scale. Factor analyses also show a primary factor that explains most of the scale variance (44–57%); nevertheless, items 6 and 8 have low loadings, particularly when factor analyses are performed on data from normal subjects (e.g. Johns, 1992).

In summary, psychometric studies show that the internal consistency of the ESS is high when patients are considered, but lower when normal subjects are considered. Results both from factor analyses and from procedures assessing uni-dimensional scalability (i.e. the Mokken procedure) do not rule out the presence of more than one latent dimension underlying the items. Various hypotheses can be put forward concerning the dimensionality of the ESS:

1. The ESS items refer to situations that could be conceptualized as different levels of an interval variable representative of the self-perceived sleep propensity in daily life, but the different situations represented could cluster at the opposite extremes with respect to their soporific nature. In other words, situations with an intermediate soporific nature could be under-represented in the scale. In factor analysis, when intermediate items are few or absent, items clustering at the extremes of the same continuum can result in different factors; unfortunately, there is no way of knowing from factor analysis alone whether each of these factors is a dimension per se or a part of the same dimension (Linacre, 1998).

2. The ESS items could measure two different but correlated aspects of subjective sleep propensity: the ability to fall asleep in appropriate and soporific circumstances and the inability to resist falling asleep even in less soporific situations in which sleeping is inappropriate. The ESS instructions do not clarify whether subjects should evaluate the likelihood of dozing off voluntarily or of falling asleep involuntarily. This could cause some confusion in subjects, who might consider some items in one way and some items in another.
Some categories of subjects might tend to understand the instructions in terms of sleep-ability (probably the normal subjects) and others in terms of sleep-resistance (probably the clinical subjects). Even if the scale were bi-dimensional, the 8 ESS items could be too few to permit factor analyses to reveal the existence of two factors.

In order to clarify whether there is more than one dimension in the subjective assessment of sleepiness in terms of likelihood of falling asleep, we developed, as a term of comparison, a new questionnaire, the Resistance to Sleepiness Scale (RSS; described in Table 2), which differs from the ESS in two respects: (a) the RSS instructions clearly state that the respondent should assess the likelihood of falling asleep involuntarily; (b) the RSS contains 6 items depicting situations in which falling asleep would be appropriate and 6 situations in which it would be inappropriate.

We assessed the dimensionality and the internal consistency of the ESS and RSS by means of classic psychometric techniques such as Principal Component Analysis (PCA) and Cronbach’s alpha. Furthermore, following the latent trait theory (Weiss, 1983), we evaluated whether the items of both the ESS and the RSS could be conceptualized as different levels of an interval variable representative of a single latent trait. For this purpose, we adopted the ‘extended logistic model’ (e.g. Andrich, 1982, 1989), which generalizes the simple logistic model (e.g. Rasch, 1980) to ordinal rating scales with more than two points. In Rasch models for polytomous responses, it is presumed that the measured variable is uni-dimensional, and the goal is to identify a model defined by a parameter describing the subject’s attitude or ability (β) and a parameter concerning intensity (affective value or difficulty) for each item (δ). When there is a good fit between the model and the data set: (1) item parameter estimates are independent of the group of subjects used from the population of subjects for whom the test was designed; (2) subject parameter estimates are independent of the particular subset of items used (Hambleton and Swaminathan, 1985). In the case of the ESS and RSS, both aimed at evaluating the general level of daytime sleepiness, parameter β describes the self-perceived propensity of each subject to fall asleep in the situations presented, while parameter δ describes how much the situation represented in each item is perceived as liable to induce sleep. The lower the value of parameter δ, the more the situation depicted is perceived as sleep inducing.
In summary, the aims of the present study were to: (a) evaluate the psychometric properties of the ESS by means of classic psychometric techniques; (b) compare them with those of a newly developed RSS; (c) evaluate, following the latent trait theory, whether the items of both the ESS and the RSS could be conceptualized as different levels of an interval variable representative of a single latent trait, identifying the location of each item along this continuum. From a methodological point of view, combining the two analysis models provides an overall interpretation of the data that neither perspective could provide on its own.
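To make the roles of the person and item parameters concrete, the following is a minimal sketch of category probabilities under a rating-scale formulation of the extended logistic model. It is an illustration under stated assumptions, not the computation performed in the study: the threshold values in `TAUS` are hypothetical (no thresholds are reported here), the function names are invented for the example, and only the two item locations are taken from Table 3.

```python
import math

def category_probabilities(beta, delta, thresholds):
    """Return P(X = 0..m) for one person and one item.

    beta: person location; delta: item location;
    thresholds: m category thresholds (hypothetical values in this sketch).
    """
    # Cumulative sums of (beta - delta - tau_k); the empty sum for x = 0 is 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (beta - delta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(beta, delta, thresholds):
    """Model-expected response (0..m) for one person on one item."""
    probs = category_probabilities(beta, delta, thresholds)
    return sum(x * p for x, p in enumerate(probs))

TAUS = [-1.0, 0.0, 1.0]  # assumed thresholds for a 0-3 response scale

# Same person (beta = 0) on the most and least soporific ESS items (Table 3).
easy = expected_score(0.0, -1.86, TAUS)  # item 5, lowest delta
hard = expected_score(0.0, 1.81, TAUS)   # item 6, highest delta
print(round(easy, 2), round(hard, 2))
```

For a fixed β, the expected response is higher for the item with the lower δ, which is the sense in which a lower δ marks a more sleep-inducing situation.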

2. Methods

One hundred and forty-six inpatients (mean age = 49; range 18–71 years) suffering from different sleep disorders filled in both the RSS and the ESS in the sleep disorders centre (San Raffaele Hospital in Milan). Half of the patients filled in the RSS first and the ESS afterwards, while the others filled in the ESS first and the RSS afterwards. Forty-six patients suffered from insomnia or parasomnias (mean age = 46; 38 males and 8 females), while 100 patients suffered from OSAS or narcolepsy or hypersomnia (mean age 50.3 years; 88 males and 22 females). According to the International Classification of Sleep Disorders (ICSD, 1993), the former are sleep disorders not characterized by objective sleepiness, while the latter 3 primarily produce EDS.

2.1. Data analyses

The dimensionality of both the ESS and the RSS was assessed by two separate PCAs on the items of each scale. The internal consistency (Cronbach’s alpha) of each scale was assessed. The ESS and RSS scores were separately submitted to one-way analyses of variance (ANOVAs) with Group (patients with EDS vs. patients without EDS) as a factor. These analyses were performed with the SPSS 10.0 package.

Finally, the ESS and RSS scores were separately analyzed following the extended logistic model. After estimating the model defined by parameters β and δ, we tested whether or not the data agreed with this model. To this end, two procedures were performed: one considering item–person interaction and the other item–trait interaction. The item–person interaction involves predicting the scores for each person in each item, given the model. Tests of fit for specific persons and items are then obtained by summing and transforming standardized residuals across items and persons, respectively. The item–trait interaction focuses on the items and tests the condition that the parameter estimates are invariant across different data groupings. Following this procedure, subjects are clustered into two or more internally homogeneous groups on the basis of their β values. In the present study, 3 groups were generated for both the ESS and the RSS, characterized by lower, intermediate or higher β values (i.e. lower, intermediate or higher self-perceived general attitude to falling asleep in the situations represented). The item parameter estimates were calculated separately for each group. The discrepancy between observed and expected responses was summarized in a chi-square value. A high and significant chi-square value indicates that the misfit is a function of the persons’ position on the continuum. These analyses were carried out with the RUMM 2.7 software (Andrich et al., 1996).

3. Results

The factor analysis (PCA and oblimin rotations, delta = 0, corresponding to a direct quartimin solution) carried out on the ESS yielded two factors with eigenvalues greater than 1, explaining 52.9 and 12.9% of the scale variance, respectively. The component correlation matrix indicates that these two factors are correlated (r = 0.429).


The most representative items for the first factor refer to 3 low soporific situations in which falling asleep is generally inappropriate (items 3, 6 and 8), while those loading on the second factor represent two highly soporific situations in which dozing could be appropriate (items 5 and 7) (Table 1). Factor analysis on the RSS reveals that a single factor explains 59.6% of the variance (Table 2).

The internal consistency of the RSS is higher (alpha = 0.94) than that of the ESS (alpha = 0.86). This difference cannot be attributed entirely to the fact that there are more items in the RSS than in the ESS. In fact, by adding 4 items, according to the Spearman–Brown formula, the estimate of the ESS internal consistency should be 0.909.

Comparing patients with and without EDS, both scales showed the expected significant differences (ESS: F(1,124) = 30.5, P < 0.0001, η² = 0.20; RSS: F(1,124) = 38.2, P < 0.0001, η² = 0.23).

The analyses performed following the extended logistic model showed that both the ESS and the RSS have satisfactory general item–trait fit indexes (ESS: χ² = 16.76, d.f. = 14, P = n.s.; RSS: χ² = 31.89, d.f. = 22, P = n.s.). The location parameter (δ) estimates and the statistics related to the item–person and item–trait fit procedures for the ESS and RSS are reported in Tables 3 and 5, respectively. Indexes of fit for both scales are consistent with the idea that both reflect a single latent trait. None of the ESS items deviates significantly from it, although item 5 approaches significance (Table 3), whereas two RSS items (numbers 6 and 7) are not coherent with a uni-dimensional model (Table 5). Tables 4 and 6 cross-tabulate the patients’ diagnoses with the 3 groups characterized by lower, intermediate or higher β values (i.e. lower, intermediate or higher self-perceived propensity to fall asleep in the situations represented) generated by the item–trait fit procedure on the ESS and RSS scores, respectively.

Table 1
Factor loadings for the ESS items (oblimin solution)

| Item | Content | Factor 1 | Factor 2 |
|---|---|---|---|
| 1 | Sitting and reading | 0.507 | 0.437 |
| 2 | Watching TV | 0.479 | 0.423 |
| 3 | Sitting inactively in a public place | 0.863* | −0.027 |
| 4 | As a passenger in a car for 1 h without a break | 0.447 | 0.393 |
| 5 | Lying down to rest in the afternoon when circumstances permit | −0.187 | 0.946* |
| 6 | Sitting and talking to someone | 0.907* | −0.109 |
| 7 | Sitting quietly after lunch without alcohol | 0.302 | 0.605* |
| 8 | In a car, while stopped for a few minutes in the traffic | 0.827* | −0.042 |
| | % Variance | 52.4 | 12.9 |
| | Eigenvalue | 4.26 | 1.19 |

The item markers for each factor, i.e. the items with a primary loading > 0.4 and a ratio between primary and secondary loading higher than 2, are marked with an asterisk.

Table 2
Factor loadings for the RSS items

| Item | Content | Factor loading |
|---|---|---|
| 1 | Lying down, reading a book or a magazine | 0.812 |
| 2 | Sitting in the stalls, at a theatre or cinema | 0.803 |
| 3 | Sitting to watch TV | 0.765 |
| 4 | At home, after dinner, meeting friends | 0.740 |
| 5 | As a passenger in a car, after travelling for over 1 h | 0.732 |
| 6 | Driving at night, on a motorway | 0.643 |
| 7 | In the afternoon, in an armchair | 0.783 |
| 8 | Sitting in a waiting-room | 0.830 |
| 9 | Sitting, after a lunch, without having drunk any alcohol | 0.793 |
| 10 | Sitting, listening to someone | 0.760 |
| 11 | Sitting in a train, bus or plane for more than 1 h | 0.815 |
| 12 | In the afternoon, studying or working, sitting at a writing-desk | 0.766 |
| | % Variance | 59.6 |
| | Eigenvalue | 7.2 |
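The internal-consistency comparison reported in the Results can be retraced with a short sketch. The block below is illustrative only: the function names and the toy response matrix are assumptions made for the example, and it does not reproduce the SPSS computation. It implements the standard Cronbach's alpha formula and the Spearman–Brown projection for a scale lengthened from 8 to 12 items.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-respondent item-score lists."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Variance of the total (summed) score across respondents.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def spearman_brown(reliability, length_factor):
    """Projected reliability when a scale is lengthened by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

# Hypothetical responses of four people to three 0-3 items (toy data).
toy_rows = [[0, 1, 0], [1, 1, 2], [2, 3, 3], [3, 3, 2]]
print(round(cronbach_alpha(toy_rows), 3))

# ESS alpha of 0.86 projected from 8 to 12 items.
projected = spearman_brown(0.86, 12 / 8)
print(round(projected, 3))
```

With the rounded alpha of 0.86 the projection comes out at roughly 0.90, slightly below the 0.909 reported above; the small gap presumably reflects rounding of alpha. Either way the projected value stays below the RSS alpha of 0.94, which is the point of the argument.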

4. Discussion

PCA on the ESS revealed two correlated factors. The item markers for the first factor refer to less soporific life situations in which falling asleep is generally inappropriate, while the item markers for the second factor refer to highly soporific situations in which falling asleep is generally appropriate. The second factor explains a small amount of variance, but it still has an eigenvalue greater than 1, which is not irrelevant considering that the ESS has only 8 items. With the RSS, we probed uni-dimensionality by adding just 4 items and specifying in the instructions that the respondent should assess the likelihood of falling asleep involuntarily.

Table 3
ESS: location parameter (δ) estimates and statistics related to item–person and item–trait fit procedures

| Item | Location (δ) | Item–person fit: Z | Item–person fit: P (one-tailed) | Item–trait fit: χ² | Item–trait fit: d.f. | Item–trait fit: P |
|---|---|---|---|---|---|---|
| 5 | −1.86 | 0.816 | 0.207 | 5.01 | 2 | 0.057 |
| 2 | −0.776 | −0.203 | 0.419 | 0.51 | 2 | 0.769 |
| 1 | −0.580 | −0.802 | 0.211 | 4.17 | 2 | 0.101 |
| 7 | −0.306 | −0.049 | 0.480 | 2.01 | 2 | 0.349 |
| 4 | −0.272 | 0.956 | 0.169 | 1.69 | 2 | 0.413 |
| 3 | 0.336 | −0.183 | 0.427 | 0.40 | 2 | 0.812 |
| 8 | 1.65 | −0.511 | 0.305 | 0.71 | 2 | 0.694 |
| 6 | 1.81 | −0.661 | 0.254 | 2.25 | 2 | 0.307 |
| Total | | | | 16.757 | 14 | 0.27 |

Table 4
Cross-tabulation between diagnosis and internally homogeneous clusters of subjects with respect to their level of self-perceived general attitude to falling asleep (β value) in the situations represented in the ESS

| Groups | Lower β | Intermediate β | Higher β | Total |
|---|---|---|---|---|
| Insomniacs or parasomniacs | 38 (79.2%) | 8 (16.3%) | 0 (0%) | 46 |
| OSAS, hypersomniacs or narcoleptics | 10 (20.8%) | 41 (83.7%) | 49 (100%) | 100 |
| Total | 48 (100%) | 49 (100%) | 49 (100%) | 146 |

Column percentages are reported in parentheses.

The internal consistency of the RSS (alpha = 0.94) was higher than that of the ESS (alpha = 0.86). According to the Spearman–Brown formula, this difference, although limited, cannot be attributed entirely to the fact that there are more items in the RSS than in the ESS.

Analyses guided by the latent trait theory indicate that the items of both scales can be considered as situations placed in different positions along a uni-dimensional continuum, which refers to the ease with which one falls asleep, independent of the consequences. It may be noted that the ESS items are located at very different hierarchical levels of this uni-dimensional continuum; in fact, the δ values reported in Table 3 show an abrupt jump that corresponds to items 6 and 8. These items are indicated as representative of the less soporific situations considered in the ESS (e.g. Johns, 1994), and it has been shown that the ranked mean scores for these two items are lower both among ‘sleepy’ patients (Johns, 1992, 1994) and among normal subjects (Johns, 1992, 1994; Izquierdo-Vicario et al., 1997). Not surprisingly, items 6 and 8 are the principal markers of the factor related to the less soporific situations revealed by the PCA in our data. The fact that, in our results, items 6 and 8 load on a separate factor, even though they belong to the same latent trait when analyzed following the extended logistic model, is not contradictory. It is an artefact due to the fact that, in factor analysis, items clustering at different levels of the same continuum can result in different factors (Linacre, 1998).

The abrupt jump between items 6 and 8 and the other ESS items could help to explain the presence of a cubic relationship between the ESS and objective measures of sleep propensity, i.e. the MWT, reported by Sangal et al. (1999a). At a mild level of MWT sleepiness, all ESS items should be low.
It is possible that patients in a large MWT range of moderate to severe sleepiness choose a high value for items 1–4 and 7 and a low value for items 6 and 8, and that only patients with profound sleepiness in the MWT choose high scores for the non-soporific items. Olson et al. (1998) suggested that one of the factors potentially limiting the relationship between the ESS score and sleep latency on the MSLT is that the former could underestimate sleepiness in patients with mild or moderate daytime sleepiness.

The present data indicate that the ESS refers to a single latent trait of self-perceived daytime sleepiness; item 5 can be considered to depict the most soporific situation, while item 6 depicts the situation in which falling asleep is evaluated as least likely. Since δ values represent the location order of each item along a uni-dimensional and progressive continuum, our results indicate that if a person gives a low score to item 6, it is likely that he will also give a low score to each of the other items, while a high score given to item 4 is sufficient to predict a high score on items 5, 2, 7 and 1, but not on items 3, 8 and 6. The present study formalizes in a statistical model of hierarchy of responses the indications otherwise provided by the ranked mean item scores obtained in different groups of patients and normal subjects (Johns, 1992, 1994; Izquierdo-Vicario et al., 1997). Unlike the ranked means, which depend on the sample considered, the statistical model used in the present study tests the condition that the parameter estimates are invariant across different groups of subjects, in particular across subjects characterized by different levels of self-perceived general attitude to falling asleep in the situations represented; when this is not true, the fit indexes are significant.

The index for ESS item 5 approaches significance in the item–trait fit procedure. This indicates that the location of this item is not independent of the subjects’ self-perceived sleepiness. In particular, the comparison between observed and estimated responses shows that subjects classified in the group with the lowest self-perceived sleepiness (β-mean = −2.25) produce a number of high scores (value 4 on the scale) that is greater than the number estimated by the model (standardized residual = 3.06). In our study, the subjects with the lowest sleep-ability are mainly insomniacs, and it is reasonable to think that they rarely ‘lie down to rest in the afternoon when circumstances permit’ if they want to resist dozing off. As suggested by Kingshott et al. (1998), the situation depicted by ESS item 5 may occur too infrequently to represent a valid assessment of their daytime sleep propensity. There are some other indications that item 5 of the ESS has some problems. Johns (1994) compared the responses to the ESS items given by a group of patients with those given by their spouses and found significant differences for item 5 (and item 4). Furthermore, item 5 was the only one showing a normalized factor loading < 0.10 in one of the 4 factor analyses performed in the same study (Johns, 1994).

Table 5
RSS: location parameter (δ) estimates and statistics related to item–person and item–trait fit procedures

| Item | Location (δ) | Item–person fit: Z | Item–person fit: P (one-tailed) | Item–trait fit: χ² | Item–trait fit: d.f. | Item–trait fit: P |
|---|---|---|---|---|---|---|
| 7 | −1.27 | −1.555 | 0.060 | 7.88 | 2 | <0.001 |
| 1 | −1.06 | 0.109 | 0.457 | 1.41 | 2 | 0.479 |
| 3 | −1.03 | 0.663 | 0.254 | 3.43 | 2 | 0.158 |
| 11 | −0.72 | 0.892 | 0.186 | 0.54 | 2 | 0.758 |
| 9 | −0.30 | −0.443 | 0.329 | 0.16 | 2 | 0.922 |
| 5 | −0.25 | 1.153 | 0.124 | 2.72 | 2 | 0.237 |
| 2 | 0.298 | 0.205 | 0.419 | 0.93 | 2 | 0.619 |
| 12 | 0.304 | 0.188 | 0.425 | 1.51 | 2 | 0.457 |
| 8 | 0.459 | −0.424 | 0.336 | 1.66 | 2 | 0.422 |
| 6 | 0.973 | 2.571 | 0.005 | 6.07 | 2 | 0.023 |
| 4 | 1.04 | 0.673 | 0.250 | 2.20 | 2 | 0.316 |
| 10 | 1.55 | −0.649 | 0.258 | 3.39 | 2 | 0.161 |
| Total | | | | 31.89 | 22 | 0.08 |

Table 6
Cross-tabulation between diagnosis and internally homogeneous clusters of subjects with respect to their level of self-perceived general attitude to falling asleep (β value) in the situations represented in the RSS

| Groups | Lower β | Intermediate β | Higher β | Total |
|---|---|---|---|---|
| Insomniacs or parasomniacs | 41 (85.4%) | 5 (10.2%) | 0 (0%) | 46 |
| OSAS, hypersomniacs or narcoleptics | 7 (14.6%) | 44 (89.8%) | 49 (100%) | 100 |
| Total | 48 (100%) | 49 (100%) | 49 (100%) | 146 |

Column percentages are reported in parentheses.

The RSS items are set along a more homogeneous continuum so that, probably because intermediate items were introduced, only one factor emerged from the PCA. Nevertheless, two items (6 and 7) exhibit fit values reaching statistical significance and should therefore be placed outside the hierarchical uni-dimensional continuum represented in the scale. Item 7 of the RSS (in the afternoon, in an armchair) depicts a situation very similar to item 5 of the ESS and, similarly, the item–trait fit procedure highlighted a relevant difference between expected and observed frequencies in the group with lower perceived sleep propensity. Both fit procedures show a significant coefficient for this item, thus confirming that this situation does not allow a valid assessment of self-perceived daytime sleep propensity.
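The comparison of observed and model-expected responses within groups of subjects can be sketched as follows. This is one common way of forming an item–trait statistic, given here as an assumption: the exact computation in RUMM 2.7 may differ, the thresholds in `TAUS` and the group data are hypothetical, and a 0-3 response coding is assumed. Only the item location and the β-mean of the lowest group are borrowed from the text.

```python
import math

def category_probabilities(beta, delta, thresholds):
    """P(X = 0..m) under a rating-scale model (thresholds are assumed)."""
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (beta - delta - tau))
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def score_moments(beta, delta, thresholds):
    """Model-expected score and its variance for one person on one item."""
    probs = category_probabilities(beta, delta, thresholds)
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, var

def class_interval_residual(observed, betas, delta, thresholds):
    """Standardized residual of the summed scores in one group of persons."""
    moments = [score_moments(b, delta, thresholds) for b in betas]
    expected = sum(m for m, _ in moments)
    var = sum(v for _, v in moments)
    return (sum(observed) - expected) / math.sqrt(var)

def item_trait_chi_square(class_intervals, delta, thresholds):
    """Sum of squared standardized residuals over the groups."""
    return sum(class_interval_residual(obs, betas, delta, thresholds) ** 2
               for obs, betas in class_intervals)

TAUS = [-1.0, 0.0, 1.0]  # hypothetical thresholds
DELTA = -1.86            # location of ESS item 5 (Table 3)

# Hypothetical observed scores and person locations for three groups.
low = ([3, 3, 2, 3], [-2.25] * 4)
mid = ([2, 3, 2, 3], [0.0] * 4)
high = ([3, 3, 3, 3], [2.0] * 4)

z_low = class_interval_residual(*low, DELTA, TAUS)
chi2 = item_trait_chi_square([low, mid, high], DELTA, TAUS)
print(round(z_low, 2), round(chi2, 2))
```

In this toy setup the low-β group scores far above what the model expects, so its standardized residual is large and positive, mirroring the pattern described for ESS item 5 and RSS item 7.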
Like item 8 of the ESS, item 6 of the RSS (driving at night, on a motorway) refers to driving but, unlike ESS item 8, it specifies that subjects should evaluate their chances of falling asleep while driving. For this item too, the fit statistics are significant in both the item–person and the item–trait fit procedures. In this case, the residual analysis of the item–trait fit reports a high standardized residual for the subjects with the lowest sleep-ability (the insomniacs). Their observed frequencies on scale level 3 are greater than expected. It is possible that insomniacs are the only patients who have reasonable experience of driving at night, while patients with EDS (i.e. patients with OSAS or narcolepsy) may avoid this dangerous situation, so that it may not occur frequently enough to allow a valid assessment of their self-perceived daytime sleep propensity.

In summary, our results indicate that the ESS measures a single latent trait related to the continuum of self-perceived sleep propensity in daily life. ESS items can be considered as representative of different situations that can be hierarchically located as a function of their relative soporific level. However, their location on this continuum is altered by an abrupt jump, which could limit the sensitivity of the ESS in detecting intermediate variations of sleepiness. Furthermore, the placement of ESS item 5 along this continuum is not independent of the subjects’ self-perceived sleepiness. In particular, subjects with lower self-perceived sleepiness produce a number of high scores that is greater than expected. To improve the ESS, it may be worth replacing item 5 with another item describing a highly sleep-inducing situation. Furthermore, the introduction of intermediate soporific situations should be considered.
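The abrupt jump in the ESS item locations can be read directly from the δ estimates in Table 3. The short sketch below, with variable names chosen for the example, sorts the items by location and lists the gaps between neighbours.

```python
# ESS item locations (delta) as reported in Table 3, keyed by item number.
ESS_LOCATIONS = {5: -1.86, 2: -0.776, 1: -0.580, 7: -0.306,
                 4: -0.272, 3: 0.336, 8: 1.65, 6: 1.81}

# Order the items from most to least soporific (lowest to highest delta).
ordered = sorted(ESS_LOCATIONS.items(), key=lambda pair: pair[1])

# Gap between each pair of neighbouring items on the continuum.
gaps = [((item_a, item_b), round(loc_b - loc_a, 3))
        for (item_a, loc_a), (item_b, loc_b) in zip(ordered, ordered[1:])]

largest = max(gaps, key=lambda gap: gap[1])
for pair, gap in gaps:
    print(pair, gap)
print("largest gap:", largest)
```

The largest gap (about 1.31 logits) separates item 3 from item 8, so items 8 and 6 sit well apart from the rest; the second largest (about 1.08) isolates item 5 at the other end, leaving few items to cover the intermediate range.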

Acknowledgements

The authors wish to thank Antonio Massimo for his help in the collection of data.

References

Andrich D. An extension of the Rasch model for rating. Psychometrika 1982;47:105–13.
Andrich D. Constructing fundamental measurement in social psychology. In: Keats A, Taft R, Heath RA, Lovibond SH, editors. Mathematical and theoretical systems. North-Holland: Elsevier Science Publishers B.V., 1989. p. 17–26.
Andrich D, Luo G, Sheridan B. RUMM: Rasch Unidimensional Measurement Models. User’s guide. (available from David Andrich, Murdoch University, Murdoch, Western Australia 6150) 1996.
Benbadis SR, Mascha E, Perry MC, Wolgamuth BR, Smolley LA, Dinner DS. Association between the Epworth sleepiness scale and the multiple sleep latency test in a clinical population. Ann Intern Med 1999;16:289–92.
Briones B, Adams N, Strauss M. Relationship between sleepiness and general health status. Sleep 1996;19:193–7.
Chervin RD, Aldrich MS. The Epworth sleepiness scale may not reflect objective measures of sleep apnea. Neurology 1999;52:125–31.
Chervin RD, Aldrich MS, Pickett R, Guilleminault C. Comparison of the result of the Epworth sleepiness scale and the multiple sleep latency test. J Psychosom Res 1997;42(2):145–55.
Debets P, Brouwer E. MSP: a program for Mokken scale analysis for polychotomous items. Groningen: Iec ProGAMMA, 1989.
Hambleton RK, Swaminathan H. Item response theory. Boston, MA: Kluwer-Nijhoff Publishing, 1985.
ICSD. International classification of sleep disorders (ICSD): diagnostic and coding manual. Rochester, MN: American Sleep Disorders Association, 1993.
Izquierdo-Vicario Y, Ramos-Platon MJ, Conesa Paraleja D, Lozano-Parra AB. Epworth sleepiness scale in a sample of the Spanish population. Sleep 1997;20(8):676–7.
Johns MW. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep 1991;14:540–5.
Johns MW. Reliability and factor analysis of the Epworth sleepiness scale. Sleep 1992;15:376–81.
Johns MW. Sleepiness in different situations measured by the Epworth sleepiness scale. Sleep 1994;17:703–10.
Johns MW. Sensitivity and specificity of the multiple sleep latency test (MSLT), the maintenance of wakefulness test and the Epworth sleepiness scale: failure of the MSLT as a gold standard. J Sleep Res 2000;9:5–11.
Kingshott R, Douglas N, Deary I. Mokken scaling of the Epworth sleepiness scale items in patients with the sleep apnea/hypopnoea syndrome. J Sleep Res 1998;7:293–4.
Linacre A. Rasch first or factor first? Rasch Meas Trans 1998;11(4):603.
Olson LG, Cole MF, Ambrogetti A. Correlations among Epworth sleepiness scale scores, multiple sleep latency test and psychological symptoms. J Sleep Res 1998;7(4):248–53.
Rasch G. Probabilistic models for some intelligence and attainment tests. Chicago, IL: The University of Chicago Press, 1980. (original work published 1960).
Sangal RB, Sangal JM, Belisle C. Subjective and objective indices of sleepiness (ESS and MWT) are not equally useful in patients with sleep apnea. Clin Electroencephalogr 1999a;30:73–5.
Sangal RB, Mitler MM, Sangal JM. Subjective sleepiness ratings (Epworth sleepiness scale) do not reflect the same parameter of sleepiness as objective sleepiness (maintenance of wakefulness test) in patients with narcolepsy. Clin Neurophysiol 1999b;110:2131–5.
Smolley LA, Ivey C, Farkas M, Faucette E, Murphy S. Epworth sleepiness scale is useful for monitoring daytime sleepiness. Sleep Res 1993;22:389.
Sullivan C, Grunstein R. Continuous positive airways pressure in sleep-disordered breathing. In: Kryger MH, Roth T, Dement W, editors. Principles and practice of sleep medicine. Philadelphia, PA: W.B. Saunders, 1989. p. 559–70.
US Modafinil in Narcolepsy Multicenter Study Group. Randomized trial of modafinil for the treatment of pathological somnolence in narcolepsy. Ann Neurol 1998;43:88–97.
Weiss DJ, editor. New horizons in testing. New York, NY: Academic Press, 1983.