International Emergency Nursing (2009) 17, 143–148
available at www.sciencedirect.com
journal homepage: www.elsevierhealth.com/journals/aaen
Manchester Triage in Sweden – Interrater reliability and accuracy
Pia Olofsson BSc, RN (Leader of Research and Development Team) a, Martin Gellerstedt PhD (Senior Lecturer) b, Eric D. Carlström PhD, MA, RN (Senior Lecturer, Director of Research) c,*
a Department of Emergency Medicine, Nu-sjukvården, Trollhättan, Sweden
b Statistics/Informatics, University West, Department of Informatics, Trollhättan, Sweden
c Health Management, Policy and Economics, Director of Research, University West, Department of Nursing, Health and Culture, SE-461 86 Trollhättan, Sweden
Received 30 July 2008; received in revised form 29 November 2008; accepted 30 November 2008
KEYWORDS Triage; Reliability; Emergency care; Sweden
Abstract Introduction: This study investigates the interrater reliability and the accuracy of Manchester Triage (MTS) at emergency departments in Western Sweden. Methods: A group of 79 nurses from seven emergency departments assessed simulated patient cases and assigned triage categories using the same principles as in their daily work. κ statistics, accuracy, over-triage and under-triage were then analyzed. The nurses performed 1027 triage assessments. Results: The result showed an unweighted κ value of 0.61, a linear weighted κ value of 0.71, and a quadratic weighted κ value of 0.81. The determined accuracy was 92% and 91% for the two most urgent categories, but significantly lower for the less urgent categories. Conclusions: Patients in need of urgent care were identified in more than nine out of 10 cases. The high level of over-triage and under-triage in the less urgent categories resulted in low agreement and accuracy. This may suggest that the resources of emergency departments can be overused for non-urgent patients.
© 2008 Elsevier Ltd. All rights reserved.
Introduction
* Corresponding author. Tel.: +46 702738126; fax: +46 520223099. E-mail address: [email protected] (E.D. Carlström).
1755-599X/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.ienj.2008.11.008
Triage, i.e., prioritizing and sorting at emergency departments, has been attracting increasing attention. In particular, older patients with multiple diagnoses are becoming more common (Goodacre et al., 1999; Beveridge et al., 1999; Palmquist and Lindell, 2000; Göransson, 2006). The prioritization carried out has, therefore, also become more complex. More parameters have been introduced for the order in which patients are treated (Baldursdottir and Jonsdottir, 2002; Fernandes et al., 2005; Göransson, 2006). The triage models developed during the 1990s have been refined and have become national standards in some countries (Fernandes et al., 2005; Worster et al., 2004).
Australia was the first to introduce a triage model, “The National Triage Scale”. The model was developed by The Australasian College for Emergency Medicine in 1993. At the beginning of the 21st century its name was changed to “The Australasian Triage Scale” (ATS). In Canada, a triage model based on the Australian model was developed in the mid-1990s: “The Canadian Emergency Department Triage and Acuity Scale” (CTAS). In the USA another triage model, “The Emergency Severity Index” (ESI), has been in existence since the end of the 1990s (Beveridge et al., 1998; Gilboy et al., 2005; McCallum Pardey, 2006).
This study focuses on a fifth national standard model, “Manchester Triage” (MTS) (Mackway-Jones, 1997), which is accepted as a standard at emergency departments in Great Britain, Holland and Portugal (Lipley, 2005). In this article we examine the interrater reliability and accuracy of MTS at emergency departments in Western Sweden, where the model has been implemented. There have been no earlier studies of the method applicable to Western Sweden. There was, therefore, a need to research the case.
Background
At the beginning of the 21st century, discussions began at emergency departments in Western Sweden with regard to implementing a common triage model. At this time most departments used the Swedish National Board of Health and Welfare criteria document, which provided only three prioritization levels (Socialstyrelsen, 1994). When waiting times increased due to an increased patient influx at the emergency departments in Western Sweden, discussions led to the conclusion that the criteria document did not satisfy the requirements of emergency departments. In order to find a structured triage model that met the needs of the emergency departments more effectively, foreign triage models were studied, and MTS was chosen. It was hoped that patient safety would increase for patients needing immediate care, and that an order of treatment based on clinical need could be achieved for the remaining patients (Dann et al., 2005). The aim of this study was to find out whether patients in need of urgent care were identified by means of MTS.
MTS
Prioritization times in MTS are associated with colours. The model has five triage categories with time intervals specified in minutes. Each time corresponds to the longest recommended wait based on clinical indicators. Patients triaged in the highest category (red) are in need of immediate care. The next two categories (orange and yellow) have longer recommended time allowances (10 and 60 min, respectively). The two lowest categories (green and blue) have the longest recommended time allowances, 120 and 240 min from the patient’s arrival to seeing a physician. The decision model for MTS consists of 52 flow charts based on the most common reasons for emergency visits. The flow charts are designed so that the most urgent category is presented first, in order for the most serious clinical indicators to be identified as fast as possible (Mackway-Jones, 1997; Widfeldt, 2005).
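As a compact illustration of the category scheme described above, the five MTS colours and their longest recommended waiting times can be held in a small lookup table. This is a sketch only: the names `TriageCategory`, `MTS_CATEGORIES` and `max_wait` are hypothetical, and coding “immediate care” for red as 0 minutes is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriageCategory:
    level: int         # 1 = most urgent, 5 = least urgent
    colour: str
    max_wait_min: int  # longest recommended time before seeing a physician


# Waiting times as given in the text; red ("immediate care") is coded as
# 0 min, which is an assumption of this sketch.
MTS_CATEGORIES = (
    TriageCategory(1, "red", 0),
    TriageCategory(2, "orange", 10),
    TriageCategory(3, "yellow", 60),
    TriageCategory(4, "green", 120),
    TriageCategory(5, "blue", 240),
)


def max_wait(colour: str) -> int:
    """Return the longest recommended wait in minutes for an MTS colour."""
    for category in MTS_CATEGORIES:
        if category.colour == colour.lower():
            return category.max_wait_min
    raise ValueError(f"unknown MTS colour: {colour!r}")


if __name__ == "__main__":
    for category in MTS_CATEGORIES:
        print(category.level, category.colour, category.max_wait_min)
```

Coding the categories as ordered levels (1 to 5) also makes it straightforward to measure how far an assessment lies from a reference category, which is what the weighted agreement statistics discussed later rely on.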
Measuring reliability and validity
Existing triage studies principally measure the model’s reliability or validity (Gilboy et al., 2005). The validity corresponds to the model’s sensitivity and specificity. One way to measure this is to compare the triage category assessed by the nurse with a standard value. This can include the resources used and the end result of the triage, such as the hospitalization of a patient, or biological markers such as test results and fatalities (Altman, 1999; Cooke and Jinks, 1999; Stenstrom et al., 2003; Dong et al., 2007). If the triage model is reliable, the end result of the triage will be the same, independent of which nurse makes the assessment. Reliability is often expressed as the kappa value (κ), which measures the degree of agreement between obtained and predicted values (Altman, 1999; Jakobsson and Westergren, 2005). The scale runs from 0 to 1, where 0 means that there is no agreement beyond chance and 1 means complete agreement. The range 0.8–1.0 is considered excellent agreement, 0.6–0.8 good agreement, 0.4–0.6 moderate agreement, 0.2–0.4 fair agreement, and less than 0.2 poor agreement (Altman, 1999; Goodacre et al., 1999; Jakobsson and Westergren, 2005).
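The chance correction behind κ and the interpretation bands listed above can be sketched in a few lines. The standard definition is κ = (p_o − p_e) / (1 − p_e), where p_o is the observed and p_e the chance-expected proportion of agreement. The function names below are hypothetical, and because the published ranges share their end points, assigning a boundary value such as 0.8 to the higher band is an assumption of this sketch.

```python
def kappa_from_proportions(p_observed: float, p_expected: float) -> float:
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    if not 0.0 <= p_expected < 1.0:
        raise ValueError("p_expected must lie in [0, 1)")
    return (p_observed - p_expected) / (1.0 - p_expected)


def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to the agreement bands quoted in the text.

    Boundary values are assigned to the higher band (an assumption).
    """
    if kappa >= 0.8:
        return "excellent"
    if kappa >= 0.6:
        return "good"
    if kappa >= 0.4:
        return "moderate"
    if kappa >= 0.2:
        return "fair"
    return "poor"


if __name__ == "__main__":
    for value in (0.46, 0.61, 0.81):
        print(value, interpret_kappa(value))
```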
Weighted and unweighted κ values
Two kinds of κ values, weighted and unweighted, appear in the studies. The difference is that unweighted κ values only count identical assessments as agreement. Weighted κ values are more “flexible”, since assessments deviating somewhat from the predicted category are considered partially acceptable (Altman, 1999; Jakobsson and Westergren, 2005). There are relatively few studies of MTS reliability, and those in existence are limited. Two studies are presented here (Goodacre et al., 1999; Versloot and Luitse, 2007). The first of these was retrospective and involved four experienced emergency physicians with no prior MTS experience. They analyzed the notes of the triage nurses and then assessed the triage category according to MTS flow charts. The agreement between the four physicians was measured by calculating the κ values between different pairs of assessments. The κ values presented varied between 0.31 and 0.63. It was not specified whether the κ values were weighted or unweighted (Goodacre et al., 1999). The second study examined interrater reliability using simulated patient cases. Eight nurses with MTS experience each assessed 50 patient cases. The obtained unweighted κ value was 0.76 and the obtained quadratic weighted κ value was 0.82 (Versloot and Luitse, 2007).
Two kinds of weighted κ values, quadratic and linear, appear in the studies. With quadratic weighting, the agreement weights given to near misses are higher than with linear weighting. Consequently, the quadratic κ value is more lenient than the linear (Cohen, 1968). Kappa with quadratic weights is easier to interpret since it is equivalent to the intraclass correlation coefficient (ICC) (Fleiss and Cohen, 1973).
In one of the few triage studies performed in Sweden, nurses assessed simulated patient cases according to CTAS time priorities (Göransson et al., 2005). CTAS is similar to MTS but with shorter waiting times (Beveridge et al., 1998). When the study was carried out, only a few emergency departments were using a full-scale triage model. The nurses had no CTAS training and their actions had no medical basis. The study reports an unweighted κ value of 0.46 and a weighted κ value of 0.71, i.e., a moderate to good agreement. It was not specified whether the weighted κ value was linear or quadratic (Göransson et al., 2005).
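The difference between unweighted, linear weighted and quadratic weighted κ can be made concrete with a minimal two-rater sketch of Cohen's weighted kappa. This is an illustration under stated assumptions, not the procedure used in any of the cited studies: the function name `weighted_kappa`, the coding of categories as 0 to k−1, and the example ratings are all hypothetical, and studies with many raters need a multi-rater extension that is not shown here.

```python
from typing import Sequence


def weighted_kappa(rater_a: Sequence[int], rater_b: Sequence[int],
                   n_categories: int, weighting: str = "unweighted") -> float:
    """Cohen's kappa for two raters with optional agreement weights.

    Ratings are ordinal codes 0 .. n_categories - 1.
    weighting is "unweighted", "linear" or "quadratic".
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    k, n = n_categories, len(rater_a)

    def weight(i: int, j: int) -> float:
        distance = abs(i - j) / (k - 1)
        if weighting == "unweighted":
            return 1.0 if i == j else 0.0
        if weighting == "linear":
            return 1.0 - distance
        if weighting == "quadratic":
            return 1.0 - distance ** 2
        raise ValueError(f"unknown weighting: {weighting!r}")

    # Observed joint proportions and the marginals of each rater.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    p_o = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k))
    p_e = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if p_e == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical ratings of eight cases on a four-level scale.
    nurse_1 = [0, 1, 1, 2, 2, 2, 3, 3]
    nurse_2 = [0, 1, 2, 2, 2, 3, 3, 2]
    for scheme in ("unweighted", "linear", "quadratic"):
        print(scheme, round(weighted_kappa(nurse_1, nurse_2, 4, scheme), 3))
```

In this made-up example all disagreements are one category apart, so the value rises from the unweighted to the linear to the quadratic version, which is the same ordering reported in the studies above.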
Accuracy, over-triage and under-triage
Accuracy is another measure used in triage studies (Considine et al., 2000, 2004; Grafstein et al., 2003; Göransson et al., 2005; Manos et al., 2002). It expresses how often the assessments match a predicted triage category. An expert panel chooses triage categories for a number of simulated patient cases. If the assessments carried out in the study are in agreement with these predicted values, it suggests a high accuracy. In the above-mentioned Swedish study, the accuracy did not exceed 57.7% (Göransson et al., 2005). Other measures often presented in studies are over-triage and under-triage. These are triage categories that are higher or lower than the predicted category, which is considered the reference value. If patients are given high priority without reason, the increased resources applied to these patients can mean long waiting times for other patients. Over-triage and under-triage are presented as percentages. Göransson et al. (2005) present, for example, an over-triage of 28.4% and an under-triage of 13.9%. In another study, performed at a paediatric emergency department using ATS (similar to CTAS time priorities), an over-triage of 22.6% and an under-triage of 15.7% were obtained (Crellin and Johnston, 2003).
Method
The research design was a prospective and descriptive survey based on simulated patient cases. The data were collected by asking triage nurses (n = 79) to assess simulated patient cases and determine a triage category according to the same principles used in their daily triage work. The cases had been given reference values by two expert panels. The κ values, the accuracy, the over-triage and the under-triage were subsequently analyzed. Nine emergency departments in Western Sweden use the MTS triage model. These all had trained instructors who in turn had trained nurses in using MTS. All the emergency departments used flow charts that had been translated to Swedish from the original version. Only one of the departments declined to participate in the study. The capacity varied from 22,000 to 50,000 visitors per year and the number of nurses employed was between 30 and 60 per emergency department. One of the emergency departments was excluded since it only accepted children. The rest had mixed specialties, such as internal medicine, surgery, and orthopedics. A random day was selected for data collection at the remaining seven emergency departments. Only nurses normally working in triage participated in the study. An inclusion criterion was that at least ten nurses with at least six months of triage work experience should participate from each department during the selected day. Simulated patient cases were presented in a survey consisting of two parts. The first part included questions about the nurse’s background, while the second part listed patient cases similar to those the nurse experiences in her daily routine.
Patient cases
Nine of the patient cases were taken from an Australasian study (Considine et al., 2000). They were revised, however, to resemble patient cases at Swedish emergency departments. An additional five patient cases were developed from authentic scenarios extracted through studies of patient records. A total of 14 simulated patient cases were used in the study, all designed to reflect the distribution of triage categories normal at emergency departments. Special attention was given to categories at risk of both over-triage and under-triage, namely the orange, yellow, and green categories. Therefore, only one of the patient cases was reported as red, while as many as three were orange, six were yellow, and three were green. The blue triage category was not represented by any of the patient cases in the study. This was because a “blue patient” did not match the criteria of an emergency department. The majority of the patients visiting an emergency department are in the orange, yellow and green categories (Gilboy et al., 2005). Those constructing the simulated patient cases assessed each case and predicted the triage category through a consensus discussion. Two independent expert panels were subsequently formed. One panel consisted of two chief physicians with considerable experience in emergency care and an introduction to MTS applications. One of these physicians had been responsible for the Swedish translation of MTS. The other panel consisted of three nurses who were trained and had long experience in MTS. These two panels assessed (1) the triage category for each patient case, (2) whether the patient cases were suitable for emergency departments in Western Sweden, and (3) whether the presentation was understandable. They recorded the degree to which they reached consensus. The expert panels were in disagreement about the predicted outcome of one of the cases, and it was therefore excluded (Twomey et al., 2007). There were no changes required for the remaining patient cases.
The survey
The seven emergency departments participating in the study used different nomenclature for flow chart, selection criterion, and triage category. Because of this, a pilot study was performed in which representatives of the various emergency departments gave their opinions on the choice of words, so that all participants would understand the formulations. Minor adjustments were made to the survey in accordance with the pilot group’s opinions. The survey contained an introductory sheet where the background of the study was described. It also contained detailed instructions and information stating that participation was voluntary. The study was considered exempt from formal ethics review, as it did not impact on patient care. Only the researchers had access to the survey results, which were subsequently coded. The following analysis was performed: the interrater reliability was given as κ values, whereas accuracy, over-triage, under-triage, and distribution of triage categories were given in percentages. The κ values were presented as unweighted and weighted values. Weighted κ values were further divided into linear and quadratic values (Altman, 1999).
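The percentage measures named in the analysis plan can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `triage_summary`, the coding of categories as 1 (red) to 5 (blue), and the example data are hypothetical. An assessment counts as over-triage when the assigned category is more urgent than the reference, and as under-triage when it is less urgent.

```python
from typing import Dict, Sequence


def triage_summary(assigned: Sequence[int],
                   predicted: Sequence[int]) -> Dict[str, float]:
    """Accuracy, over-triage and under-triage as percentages.

    Categories are coded 1 (most urgent) to 5 (least urgent), so a
    smaller assigned number than predicted means over-triage.
    """
    if len(assigned) != len(predicted) or not assigned:
        raise ValueError("need two non-empty lists of equal length")
    n = len(assigned)
    exact = sum(a == p for a, p in zip(assigned, predicted))
    over = sum(a < p for a, p in zip(assigned, predicted))
    under = sum(a > p for a, p in zip(assigned, predicted))
    return {
        "accuracy": 100.0 * exact / n,
        "over_triage": 100.0 * over / n,
        "under_triage": 100.0 * under / n,
    }


if __name__ == "__main__":
    # Hypothetical assessments of eight cases against reference values.
    assigned = [1, 2, 2, 3, 4, 3, 3, 4]
    predicted = [1, 2, 3, 3, 3, 3, 4, 4]
    print(triage_summary(assigned, predicted))
```

The three percentages always sum to 100, which gives a quick consistency check on any table built from them.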
Results
Seven emergency departments took part in the study. The nurses (n = 79) assessed 13 patient cases each. A total of 1027 triage assessments were analyzed. Of the participants, 82% were female and 18% male, and 91% were older than 25 years of age. The majority of the nurses had worked more than two years after completing their basic training. The participating nurses from the seven emergency departments together presented an unweighted κ value of 0.61 (95% CI 0.57–0.65), a linear weighted κ value of 0.71, and a quadratic weighted κ value of 0.81. The κ values, across departments, thus indicate a good (0.6–0.8) to excellent (0.8–1.0) agreement. Between the various emergency departments, the unweighted κ value varied between 0.56 and 0.65, the linear weighted κ value between 0.68 and 0.75, and the quadratic weighted κ value between 0.78 and 0.85 (Table 1). Two of the departments thus showed a moderate agreement, i.e., an unweighted κ of just under 0.6. The rest showed good agreement. The quadratic weighted κ values, however, indicate excellent agreement at six of the emergency departments.
Table 2 shows the distribution of over-triage (14%) and under-triage (13%). This meant that almost as many cases were given higher priority as were given lower priority in comparison with the predicted outcome. The mean accuracy for the emergency departments was 73%. It was particularly high for the red (92%) and the orange (91%) categories, but significantly lower for the less urgent categories. Over-triage and under-triage were greater for the yellow and green categories than for the more urgent red and orange categories. Within the green category, some assessments were over-triaged by two levels.

Table 1
Presentation of κ values for each emergency department.
Emergency department   Unweighted κ value   Linear weighted κ value   Quad. weighted κ value
A                      0.56                 0.68                      0.78
B                      0.59                 0.69                      0.80
C                      0.60                 0.70                      0.81
D                      0.62                 0.70                      0.80
E                      0.62                 0.71                      0.82
F                      0.64                 0.74                      0.83
G                      0.65                 0.75                      0.85
Mean                   0.61                 0.71                      0.81

Table 2
The distribution of triage to predicted values. Rows give the predicted category; the colour columns give the triage done by nurses, with the accuracy vs. predicted on the diagonal.
Predicted   Red (%)   Orange (%)   Yellow (%)   Green (%)   Blue (%)   Over-triage (%)   Under-triage (%)   Accuracy (%)
Red         92        8            -            -           -          -                 8                  92
Orange      5         91           4            -           -          5                 4                  91
Yellow      -         11           66           22          -          11                22                 66
Green       -         4            30           63          3          34                3                  63
Mean        -         -            -            -           -          14                13                 73

Table 3
Current study compared to Versloot and Luitse (2007).
                             Olofsson et al.   Versloot and Luitse
Unweighted κ value           0.61              0.76
Quadratic weighted κ value   0.81              0.82
Number of participants       79                8
Number of assessments        1027              400

Discussion
Deviation from predicted category
Nearly one in 10 assessments deviated from the predicted category for the red and orange cases. These results suggest that patients in need of urgent care were identified as such at the emergency departments covered in the study. However, there is still a group of severely ill patients who are not assigned to the predicted categories. The high percentages of over-triage and under-triage in the yellow and green categories imply that the order of priority was disrupted. It can be significant for a patient whether they are assigned the yellow category (60 min maximum waiting time) or the green category (120 min maximum waiting time). It can mean that the resources at the emergency departments are used for non-urgent patients, forcing patients in need of urgent care to wait. These two triage categories need further study.
The study does not indicate any clear connections between level of training and triaging. A more extensive data collection of training levels should clarify whether such connections exist. There was, however, a difference in the results from the various emergency departments. This can point to varying work habits, cultures, internal training, or other differences that may affect the result. This, however, needs to be studied in more detail.
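The summary columns of Table 2 can be recomputed from its rows as a plausibility check. The sketch below makes two assumptions that are not stated in the paper: blank cells are treated as 0%, and the mean row is taken to be weighted by the number of simulated cases per category (one red, three orange, six yellow, three green). All names are hypothetical. Under these assumptions the weighted means come out at about 14% over-triage, 12.4% under-triage and 73% accuracy, close to the reported 14%, 13% and 73%; the small gap for under-triage may reflect rounding in the published percentages.

```python
ORDER = ["red", "orange", "yellow", "green", "blue"]

# Rows of Table 2: predicted category -> percentages assigned by the nurses,
# in the column order given by ORDER. Blank cells are coded as 0 (assumption).
TABLE_2 = {
    "red": [92, 8, 0, 0, 0],
    "orange": [5, 91, 4, 0, 0],
    "yellow": [0, 11, 66, 22, 0],
    "green": [0, 4, 30, 63, 3],
}

# Number of simulated patient cases per predicted category (from the text).
CASES = {"red": 1, "orange": 3, "yellow": 6, "green": 3}


def row_summary(predicted, row):
    """Return (over-triage, under-triage, accuracy) for one table row."""
    i = ORDER.index(predicted)
    return sum(row[:i]), sum(row[i + 1:]), row[i]


def weighted_means():
    """Case-weighted mean over-triage, under-triage and accuracy."""
    total = sum(CASES.values())
    over = under = accuracy = 0.0
    for colour, row in TABLE_2.items():
        o, u, a = row_summary(colour, row)
        over += CASES[colour] * o
        under += CASES[colour] * u
        accuracy += CASES[colour] * a
    return over / total, under / total, accuracy / total


if __name__ == "__main__":
    for colour, row in TABLE_2.items():
        print(colour, row_summary(colour, row))
    print([round(value, 1) for value in weighted_means()])
```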
Simulated patient cases
The orange category, which had 91% accuracy, was represented by three patient cases, and the red category, which had 92% accuracy, was represented by only one patient case. This could be a weakness in the study. On the other hand, an over-representation of red patient cases can lead to unreasonably high κ values because over-triaging is not possible, which reduces the probability of mistakes. Since most of the simulated patient cases were in the categories with the highest risk of incorrect assessment, the study was carried out under more demanding conditions than would have been the case if the majority of the patient cases had been red or blue. The yellow triage category contained most of the simulated patient cases. Gilboy et al. (2005) estimate that 1–3% of the patients are assigned to category one (red), 20–30% to category two (orange), 30–40% to category three (yellow), and 20–35% to categories four (green) and five (blue).
This study used the opinion of the expert panel as the predicted result (“gold standard”). The mean accuracy was 73% compared to the predicted category. It may be questioned whether a five-person expert panel is more accurate in predicting the category than a group of 79 nurses who assess patients in their daily triage work. The high accuracy, however, implies that both the expert panel and the nurses had a relatively similar perception of how MTS should be applied. Simulated patient cases have limitations since the nurse is unable to question or examine the patient in person. A prospective study, i.e., following the daily work of a triage nurse, would have ethical limitations and also be more time consuming. The advantage of simulated patient cases is that all those involved in the study receive similar information and it is easier to obtain a large sample (Considine et al., 2000).
It is the authors’ opinion that, of the three different κ values discussed (unweighted, linear weighted, and quadratic weighted), only the unweighted κ value can be compared with models with similar triage categories. With a weighted κ value, partial agreement is also credited when the assessment falls one or two categories from the predicted value. In such a case, an orange patient being assessed as yellow can result in a relatively high agreement. In practice, a critically ill patient who should be seen by a physician within 10 min instead has to wait 60 min. This means that a statistical analysis can point to a deviation that may seem negligible. In a clinical context, however, this can be of great importance for the patient. Over-triage and under-triage can thus have greater consequences than a statistical study would suggest. For triage models like CTAS and ATS, which have shorter distances between categories than MTS, a weighted κ value could be more suitable.
A weakness in comparing κ values is that the result depends on (1) the number of nurses participating in the study, (2) the characteristics of the simulated patient cases, and (3) the total number of triage assessments (Altman, 1999; Jakobsson and Westergren, 2005). The recent MTS study by Versloot and Luitse (2007) is the only one that examined simulated patient cases and therefore could be compared with the current study (Table 3). The unweighted κ values 0.61 and 0.76 could be used in this comparison. The criterion for choosing the eight nurses, or their level of experience, is not specified in the study by Versloot and Luitse (2007). The distribution of patient cases is another variable that could have contributed to the difference in κ values. The majority of the patient cases in our study were distributed across categories where an incorrect assessment could be made in two directions (over-triage and under-triage). Only in the one patient case of category red could an incorrect assessment have been made in just one direction (under-triage).
The study by Versloot and Luitse (2007) does not specify the distribution of patient cases. In a study containing a large number of red cases, the restricted range of possible errors will lead to higher κ values.
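The point about an orange patient assessed as yellow can be illustrated numerically. The sketch below shows how much credit a one-category miss receives under each weighting scheme, assuming the common agreement weights 1 − d/(k−1) (linear) and 1 − (d/(k−1))² (quadratic) on a five-category scale. The function name `agreement_weight` and the 1 to 5 coding are hypothetical, and the paper does not state which weight definitions its software used.

```python
def agreement_weight(level_a: int, level_b: int,
                     n_categories: int = 5, scheme: str = "linear") -> float:
    """Credit given to a pair of ratings under a kappa weighting scheme.

    Levels are coded 1 (red) to n_categories (blue). A weight of 1 means
    full agreement and 0 means no credit.
    """
    for level in (level_a, level_b):
        if not 1 <= level <= n_categories:
            raise ValueError(f"level out of range: {level}")
    distance = abs(level_a - level_b) / (n_categories - 1)
    if scheme == "unweighted":
        return 1.0 if level_a == level_b else 0.0
    if scheme == "linear":
        return 1.0 - distance
    if scheme == "quadratic":
        return 1.0 - distance ** 2
    raise ValueError(f"unknown scheme: {scheme!r}")


if __name__ == "__main__":
    ORANGE, YELLOW = 2, 3
    for scheme in ("unweighted", "linear", "quadratic"):
        print(scheme, agreement_weight(ORANGE, YELLOW, scheme=scheme))
```

Under these assumptions the orange-to-yellow miss earns no credit in the unweighted case, 0.75 with linear weights and 0.9375 with quadratic weights, even though the recommended maximum wait changes from 10 to 60 min.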
Conclusions
In this MTS study we found good interrater reliability at emergency departments in Western Sweden. The accuracy was high (73%). The triage categories red and orange showed the highest values (92% and 91%), which implies that MTS could identify patients in need of early intervention in more than nine out of 10 cases. This study indicates that the implementation of a structured clinical decision support system could increase the interrater reliability and the accuracy while decreasing over-triage. A recommendation to clinical practice, therefore, is to use structured clinical decisions. One weakness, however, is that the yellow (60 min maximum waiting time) and green (120 min maximum waiting time) triage categories resulted in low agreement and accuracy. This implies that they are difficult to differentiate. Consequently, a limitation in clinical practice is that the resources of the emergency department can be overused for non-urgent patients. This suggests a need to further develop the model, focusing on these categories.
References
Altman, D.G., 1999. Practical Statistics for Medical Research. Chapman & Hall, London.
Baldursdottir, M.S., Jonsdottir, H., 2002. The importance of nurse caring behaviours as perceived by patients receiving care at an emergency department. Heart and Lung 31, 67–75.
Beveridge, R., Clarke, B., Janes, L., Savage, N., Thompson, J., Dodd, G., 1998. Implementation Guidelines for the Canadian Emergency Department Triage and Acuity Scale (CTAS). CAEP, Ottawa.
Beveridge, R., Ducharme, J., Janes, L., Beaulieu, S., Walter, S., 1999. Reliability of the Canadian emergency department triage and acuity scale: interrater agreement. Annals of Emergency Medicine 34, 155–159.
Cohen, J., 1968. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin 70, 213–220.
Considine, J., Ung, L., Thomas, S., 2000. Triage nurses’ decisions using the national triage scale for Australian emergency departments. Accident and Emergency Nursing 8, 201–209.
Considine, J., Le Vasseur, S.A., Villanueva, E., 2004. The Australasian triage scale: examining emergency department nurses’ performance using computer and paper scenarios. Annals of Emergency Medicine 44, 516–523.
Cooke, M.V., Jinks, S., 1999. Does the Manchester triage system detect the critically ill? Journal of Accident & Emergency Medicine 16, 179–181.
Crellin, D.J., Johnston, L., 2003. Poor agreement in application of the Australasian triage scale. Contemporary Nurse 15, 49–60.
Dann, E., Jackson, R., Mackway-Jones, K., 2005. Appropriate categorisation of mild pain at triage: a diagnostic study. Emergency Nurse 13, 28–32.
Dong, S.L., Bullard, M.J., Meurer, D.P., Blitz, S., Akhmetshin, E., Ohinmaa, A., 2007. Predictive validity of a computerized emergency triage tool. Academic Emergency Medicine 14, 16–21.
Fernandes, C., Tanabe, P., Gilboy, N., Johnson, L., McNair, R., Rosenau, A., 2005. Five-level triage: a report from the ACEP/ENA five-level triage task force. Journal of Emergency Nursing 31, 39–50.
Fleiss, J.L., Cohen, J., 1973. The equivalence of weighted kappa and the intraclass correlation coefficient as measure of reliability. Educational and Psychological Measurement 33, 613–619.
Gilboy, N., Tanabe, P., Travers, D., Rosenau, A., Eitel, D., 2005. Emergency Severity Index, Version 4: Implementation Handbook. Agency for Healthcare Research and Quality, Rockville. Accessed 15.11.07.
Goodacre, S.W., Gilett, M., Harris, R.D., Houlihan, K.P.K., 1999. Consistency of retrospective triage decisions as a standardised instrument for audit. Journal of Accident & Emergency Medicine 16, 322–324.
Göransson, K., 2006. Registered Nurse-led Emergency Department Triage: Organisation, Allocation of Acuity Ratings and Triage Decision Making. Dissertation, Hälsovetenskapliga Institutionen, Örebro universitet, Örebro.
Göransson, K., Ehrenberg, A., Marklund, B., Ehnfors, M., 2005. Accuracy and concordance of nurses in emergency department triage. Scandinavian Journal of Caring Science 19, 432–438.
Grafstein, E., Innes, G., Westman, J., Christenson, J., Thorne, A., 2003. Inter-rater reliability of a computerized presenting-complaint-linked triage system in an urban emergency department. Canadian Journal of Emergency Medicine 5, 323–329.
Jakobsson, U., Westergren, A., 2005. Statistical methods for assessing agreement for ordinal data. Scandinavian Journal of Caring Science 19, 427–431.
Lipley, N., 2005. The Manchester triage system is proving ideal for the emergency care system in Portugal. Emergency Nurse 13, 5.
Mackway-Jones, K. (Ed.), 1997. Emergency Triage. BMJ, London.
Manos, D., Petrie, D.A., Beveridge, R.C., Walter, S., Ducharme, J., 2002. Inter-observer agreement using the Canadian emergency department triage and acuity scale. Canadian Journal of Emergency Medicine 4, 16–22.
McCallum Pardey, T.G., 2006. The clinical practice of department triage: Application of Australasian Triage Scale – an extended literature review. Part 1: Evolution of the ATS. Australasian Emergency Journal 9, 155–162.
Palmquist, I., Lindell, G., 2000. Emergency departments in Sweden – today and in the future. Vård i Norden 20, 28–31.
Socialstyrelsen, 1994. Akut omhändertagande – Ett underlag för kompetensutveckling vid omhändertagande av akut sjuka och skadade. Rapport 6, Socialstyrelsen, Stockholm.
Stenstrom, R., Grafstein, E.J., Innes, G.D., Christenson, J.M., 2003. The predictive validity of the Canadian triage and acuity scale (CTAS). Academic Emergency Medicine 10, 512.
Twomey, M., Wallis, L.A., Myers, J.E., 2007. Limitations in validating emergency department triage scales. Emergency Medicine Journal 24, 477–479.
Versloot, S.-M., Luitse, J., 2007. The agreement of the Manchester triage system and emergency severity index in terms of agreement: a comparison. Academic Emergency Medicine 14, 57.
Widfeldt, N., 2005. Arbetsmaterial: Svensk översättning av Manchester Triage Flödesschema. Prehospitalt och Katastrofmedicinskt centrum, Göteborg.
Worster, A., Gilboy, N., Fernandes, C., Eitel, D., Kevin, E., Geisler, R., 2004. Assessment of inter-observer reliability of two five-level triage and acuity scales: a randomized controlled trial. Canadian Journal of Emergency Medicine 6, 240–245.