Peer and Expert Opinion and the Reliability of Implicit Case Review

Curtis E. Margo, MD, MPH

Department of Ophthalmology, Watson Clinic, Lakeland, Florida. Originally received: November 6, 2000. Accepted: June 13, 2001. Manuscript no. 200765. Supported by a grant from the Watson Clinic Foundation Inc. Reprint requests to Curtis E. Margo, MD, MPH, Ophthalmology, Watson Clinic, 1600 Lakeland Hills Blvd., Lakeland, FL 33805.

Objective: To compare the interrater and intergroup agreement in judging physician maloccurrence and compliance with standards of care using the implicit case review process.

Design: Mail survey with questionnaire.

Participants: Case reviews and questionnaires were mailed to 140 board-certified ophthalmologists and 140 board-certified ophthalmologists with fellowship training.

Main Outcome Measure: Agreement in judging maloccurrence and compliance with the standard of care within each group and between general ophthalmologists and specialists.

Results: Ninety-seven (35%) questionnaires were returned. Overall, 35% of respondents believed that the ophthalmologists in the case reviews committed an error of either commission or omission. Forty-five percent of reviewers believed that the physicians did not meet the standard of care. There was good within-group agreement between a finding of clinical error in management and a finding that the standard of care was not met for all groups (kappa coefficient range, 0.55–0.83; P < 0.004) except retina specialists (kappa coefficient = 0.12; P = 0.2).

Conclusions: Unstructured implicit case review is not a reliable method for determining physician error or for measuring compliance with standards of care. The process is susceptible to bias, and results may vary with reviewer background or training. Unstructured implicit case review needs to be regarded as a rough screening tool and used accordingly.

Ophthalmology 2002;109:614–618 © 2002 by the American Academy of Ophthalmology.

Case review is a fundamental and frequently used method to determine the appropriateness and effectiveness of clinical care. Both formal and informal case reviews are standard methods of judging quality of care in hospitals and clinics. Managed care organizations, insurers, and state regulatory agencies rely on case reviews to make decisions about quality of care and medical coverage.1–3 Case review also plays a fundamental role in the tort system in deciding whether a patient has been injured by medical care and in determining compensation for injury.4 The retrospective case review process depends on peer review and expert opinion. Despite the crucial role of peer review and expert opinion, the reliability and validity of these interpretations are not well documented in ophthalmology. This study was conducted to determine the reliability of generalists and specialists in ophthalmology in judging clinical culpability and in assessing compliance with standards of care using an unstructured implicit case review.

Material and Methods

Case review material and questionnaires were mailed to 280 ophthalmologists practicing in the United States.

The case reviews were based on material from a quality assurance program. The cases were disguised so that physician and patient identities could not be determined. For the purposes of this study, board-certified ophthalmologists were regarded as "peers," and board-certified ophthalmologists with an additional minimum of 1 year of clinical fellowship training were regarded as "experts." Case 1 was mailed to 140 board-certified ophthalmologists: 70 who practiced general ophthalmology and 70 who were fellowship trained in retina. Case 2 was mailed to 140 board-certified ophthalmologists: 70 who practiced general ophthalmology and 70 who were fellowship trained in glaucoma. General ophthalmologists were selected from the 1998 Membership Directory of the American Academy of Ophthalmology. Fellowship-trained retina specialists were selected from the 1998 Combined Membership Directory of the Vitreous, Retina and Macula Societies. Fellowship-trained glaucoma specialists were identified from the 1998 Membership Directory of the American Academy of Ophthalmology.

A cover letter to potential participants explained the purpose of the study. Participants were told that the enclosed case material represented the complete medical record; no clinical information was omitted from the case review. The same questionnaire was mailed to all participants. After carefully reviewing the material, the participants were asked to answer three questions: (1) Was the clinical outcome in the case attributable to the medical care the patient received from the ophthalmologist labeled "A" (an error of either commission or omission)? (2) Did ophthalmologist A meet the standard of care expected of him or her in this clinical situation? and (3) If the standard of care was not met, was the deviation minor, moderate, or major? Comments concerning the case were encouraged but not required. The answers were coded so they could not be traced to individual participants.

Agreement among observers was expressed as proportions and tested for statistical significance with the chi-square test.

The kappa coefficient was used to measure agreement between judgments of culpability and compliance with standards of care.5 The complete case reviews can be obtained from the author.

Case 1 involved a 38-year-old man who was seen by his ophthalmologist (ophthalmologist A) with a 3-day history of flashing lights and floaters in his right eye. Ocular history was negative. Corrected vision was 20/20 in each eye. The patient's refraction was approximately −3.00 in each eye. The eye examination was normal except for pigment "dust" in the right vitreous and lattice degeneration of the retina in both eyes. No retinal holes were identified. The ophthalmologist diagnosed "symptomatic lattice degeneration in the right eye" and recommended laserpexy. This was completed without difficulty. The patient was instructed to call if any new symptoms occurred. Follow-up was scheduled for 3 weeks. One week later, the patient returned because of decreased right eye vision for 2 days. Ophthalmologist A found counting fingers vision and a macula-off retinal detachment and immediately referred the patient to a retina specialist. The retina specialist found a superior horseshoe tear not associated with lattice degeneration. His retinal drawing showed three patches of laser-treated lattice degeneration in the right eye and two that were untreated. There were no atrophic holes associated with any of the lattice lesions. The patient underwent uneventful repair of his retinal detachment.

Case 2 involved a 55-year-old woman who was seen by her ophthalmologist (ophthalmologist A) with a 3-day history of decreased vision and pain in the right eye. There was some nausea and right-sided headache. On the day of the examination, the pain was better, but vision remained blurred. A similar episode had occurred a year earlier, but the patient did not seek medical help. Ocular history was negative. On examination, vision was 20/60 in the right eye and 20/20 in the left eye. Intraocular pressures were 19 mmHg in the right eye and 14 mmHg in the left eye. On slit-lamp examination, a "disciform" haze was present in the right central cornea. The anterior chamber was deep with no cell or flare. Atrophy of the right iris was noted, and the lens was clear. The left anterior segment was normal. Gonioscopy of the right eye showed 360° of small peripheral anterior synechiae, but the view was described as poor. There was no mention of gonioscopy of the left eye. A nondilated fundus examination was normal. Ophthalmologist A diagnosed disciform keratitis without epithelial involvement and treated the patient with topical Viroptic, 0.2% prednisolone acetate every 6 hours, and oral acyclovir. Follow-up was scheduled for 5 days. The patient was instructed to call the office if any problems developed. Two days later, the patient went to the emergency department when right eye pain, nausea, and headache recurred.

Table 1. Response Rate to Mailed Questionnaire

Category                           Number Returned (%)
Case 1, general ophthalmologist    22 (31)
Case 1, retinal specialist         27 (39)
Case 2, general ophthalmologist    24 (34)
Case 2, glaucoma specialist        24 (34)
Total                              97 (35)

The emergency department physicians called the ophthalmologist on call (ophthalmologist B), who examined the patient and found that her right vision was 20/200 with a mid-dilated right pupil. Intraocular pressures were 26 mmHg in the right eye and 17 mmHg in the left eye. The right cornea was edematous, and the peripheral anterior chamber seemed shallow. Gonioscopy of the left eye showed a "slit open" angle. Ophthalmologist B diagnosed acute angle-closure glaucoma and treated the woman medically with intramuscular acetazolamide, topical pilocarpine, and timolol. An hour later, intraocular pressure was 14 mmHg, and the patient was discharged from the emergency department. The next day, the patient had an uneventful laser iridotomy of the right eye.

Results

Ninety-seven (35%) of the 280 questionnaires were returned (Table 1). In case 1, 45% of general ophthalmologists attributed the clinical outcome to either an error of commission or omission by ophthalmologist A compared with 7% of the retina specialists (P = 0.002) (Table 2). Fifty-nine percent of general ophthalmologists believed that ophthalmologist A did not meet the standard of care expected in this clinical situation compared with 26% of retina specialists (P = 0.02). Forty-five percent of general ophthalmologists and 26% of retina specialists considered the deviation from the standard of care to be in the range of moderate to major (Table 2).

In the second case, 46% of both general ophthalmologists and glaucoma specialists attributed the clinical outcome to either an error of commission or omission (P = 1.0) (Table 3). Forty-six percent of general ophthalmologists thought that ophthalmologist A did not meet the standard of care compared with 54% of glaucoma specialists (P = 0.8). Twenty-one percent of general ophthalmologists and 42% of glaucoma specialists characterized the care of ophthalmologist A as deviating from the standard of care in the range of moderate to major (Table 3).

Table 2. Judging Error and Compliance by Generalist and Specialist, Case 1

Case 1                                       Generalist                  Specialist                  P Value*
Outcome attributable to physician error?     Yes 10 (45%); No 12 (55%)   Yes 2 (7%); No 25 (93%)     0.002
Meets standards of care?                     Yes 9 (41%); No 13 (59%)    Yes 20 (74%); No 7 (26%)    0.02
Amount of deviation from standard of care†
  Minor                                      1                           0
  Moderate                                   7                           5
  Major                                      3                           2

*Chi-square test.
†Some questions not answered by all participants.


Table 3. Judging Error and Compliance by Generalist and Specialist, Case 2

Case 2                                       Generalist                  Specialist                  P Value*
Outcome attributable to physician error?     Yes 11 (46%); No 13 (54%)   Yes 11 (46%); No 13 (54%)   1.0
Meets standards of care?                     Yes 13 (54%); No 11 (46%)   Yes 11 (46%); No 13 (54%)   0.6
Amount of deviation from standard of care†
  Minor                                      5                           3
  Moderate                                   4                           7
  Major                                      1                           3

*Chi-square test.
†Some questions not answered by all participants.
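For readers who wish to verify the between-group comparisons, the following is a minimal sketch, not the study's analysis code (the analysis was done with Epi Info 6), showing how the case 1 error-judgment counts in Table 2 reproduce the reported P value of 0.002 when an uncorrected Pearson chi-square is applied to the 2 × 2 table. The choice of the uncorrected statistic is an assumption; Epi Info reports several variants.

```python
# Hedged sketch, not the authors' code: recompute the chi-square P value for the
# case 1 "outcome attributable to physician error?" comparison shown in Table 2.
# Assumes the uncorrected Pearson chi-square was the statistic reported.
from scipy.stats import chi2_contingency

# Rows: general ophthalmologists, retina specialists; columns: "yes, error", "no error".
case1_error = [[10, 12],
               [2, 25]]

chi2, p, dof, expected = chi2_contingency(case1_error, correction=False)
print(f"chi-square = {chi2:.2f}, df = {dof}, P = {p:.3f}")  # P is approximately 0.002

# The same call on the case 1 standard-of-care counts ([[9, 13], [20, 7]])
# gives P of roughly 0.02, matching the other value reported in the text.
```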

There was good within-group agreement between a judgment of physician error and a judgment of not meeting the standard of care in all groups except retina specialists (Table 4). The average kappa coefficient for all groups except retina specialists was 0.68 (range, 0.55–0.83; P < 0.004). Among retina specialists, however, 7% found an error of omission or commission, whereas 26% considered medical care to deviate from the standard of care by a moderate to major degree (kappa coefficient = 0.12; P = 0.2) (Table 4).
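The agreement statistic can be made concrete with a minimal sketch, again not the study's analysis code, that recovers the case 1 generalist kappa in Table 4 from the published marginals and observed agreement. The count of 17 of 22 agreeing responses is an inference consistent with the reported observed agreement of 0.77; the individual-level cross-tabulation was not published, so expected agreement is taken from the marginal proportions, as in Cohen's kappa.

```python
# Hedged sketch: reproduce the case 1 generalist kappa (Table 4) from published
# summary figures. The exact agreeing count (17/22) is an assumption consistent
# with the reported observed agreement of 0.77.
def cohen_kappa(observed_agreement: float, p_yes_1: float, p_yes_2: float) -> float:
    """Kappa = (po - pe) / (1 - pe), with pe derived from the two marginal proportions."""
    p_e = p_yes_1 * p_yes_2 + (1 - p_yes_1) * (1 - p_yes_2)
    return (observed_agreement - p_e) / (1 - p_e)

po = 17 / 22          # observed agreement (reported as 0.77)
p_error = 10 / 22     # proportion judging the outcome attributable to error
p_not_met = 13 / 22   # proportion judging the standard of care unmet
print(round(cohen_kappa(po, p_error, p_not_met), 2))  # prints 0.55, as in Table 4
```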

Discussion

The peer and expert opinions solicited in this study are examples of implicit review of medical records and have been the community standard for making final judgments about quality of care.1 With implicit case reviews, the criteria for determining judgments are not expressly stated; each reviewer uses his or her own undefined criteria to assess quality of care. The basic assumption is that reviewers are able, by virtue of their professional training, to distinguish standard from substandard medical care. The subjective nature of implicit review contrasts with an explicit assessment, which relies on a priori fixed criteria. In an explicit review process, criteria can be based on consensus opinion from a panel of experts or on information abstracted from clinical outcomes research. An explicit structured case review of the management of angle-closure glaucoma, for example, might use a checklist that looks for documentation in the medical record of a chief complaint, review of ocular history, measurement of visual acuity, intraocular pressure, description of angle anatomy, and an assessment and plan based on the clinical findings.
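To make the contrast concrete, the sketch below encodes such a documentation checklist as fixed, a priori criteria. It is purely illustrative: the item names and the pass/fail logic are assumptions, not drawn from the study or from any published guideline.

```python
# Hypothetical illustration only: the study did not use or publish such a checklist.
# It sketches how explicit, a priori criteria might be encoded, in contrast to
# implicit review, where each reviewer applies his or her own undefined criteria.
from dataclasses import dataclass, fields

@dataclass
class ChartReview:
    chief_complaint_documented: bool
    ocular_history_reviewed: bool
    visual_acuity_measured: bool
    intraocular_pressure_measured: bool
    angle_anatomy_described: bool
    assessment_and_plan_recorded: bool

def explicit_review(chart: ChartReview) -> list[str]:
    """Return the checklist items missing from the medical record."""
    return [f.name for f in fields(chart) if not getattr(chart, f.name)]

# Example: a record lacking gonioscopic documentation of the angle.
missing = explicit_review(ChartReview(True, True, True, True, False, True))
print(missing)  # ['angle_anatomy_described']
```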

Table 4. Agreement between Judgments of Physician Error and Lack of Compliance with Standards of Care

Group                              Observed Agreement   Kappa Coefficient   P Value (one-tailed)
General ophthalmologist, case 1    0.77                 0.55                0.0040
Retina specialist, case 1          0.74                 0.12                0.2000
General ophthalmologist, case 2    0.83                 0.66                0.0005
Glaucoma specialist, case 2        0.92                 0.83                0.0001


Relatively few studies have been designed to assess the reliability and validity of the implicit review process. Those that have been published show that physician agreement on the appropriateness of care is generally poor, and they cast doubt on the validity of the peer review process.6–15 The implicit review process, which serves a prominent role in adjudicating malpractice claims and in investigating charges of substandard care, may be particularly prone to bias when used to evaluate patients with poor clinical outcomes.16

Validity and reliability are interrelated concepts that concern the trustworthiness of a measurement, or the confidence that can be placed in it. The validity of the implicit case review process rests on the premise that it can do what it purports to do: consistently identify deviations from the standard of care or errors in clinical practice. Reliability refers to how reproducible the measurement is and depends on the extent to which the implicit case review process is free of random and systematic errors.

Goldman6 performed a computerized search of the English-language literature through 1990 looking for studies of interreviewer agreement in implicit medical record review and found 12 with documentation adequate for analysis. Most of these studies showed that physician peer judgments of quality of care were poorly reliable, with agreement only slightly better than that predicted by chance.6 The results of structured implicit reviews are generally considered more reliable than those of unstructured reviews, but even when guidelines are used, assessments of maloccurrence can vary.17 In a large study of adverse events from 51 inpatient facilities in New York State, researchers found that two physicians more frequently had extreme disagreement about the occurrence of an adverse event (12.9%) than two physician experts agreed that one had occurred (10%).9 That study involved 7533 pairs of structured implicit reviews of medical records performed by 127 physicians working independently of one another. Individual physicians' rates of finding adverse events varied widely, raising concerns about the validity of structured implicit reviews.9

When medical records were used to determine the appropriateness of 64 hospital admissions, two physicians disagreed on the need for hospitalization in 48% of cases.11 This discrepancy of opinion occurred despite the application of well-defined criteria for the appropriateness of hospital admission. Implicit case review has a particularly contentious role in medical malpractice litigation, where it is used in an adversarial manner. In an expert medical opinion review of 103 malpractice claims by 30 anesthesiologists, reviewers agreed on 62% of claims and disagreed on 38%.15 This level of agreement, when corrected for chance, is rated fair to poor.15

A variety of explanations have been offered for the high levels of interrater disagreement in the implicit review process. Reviewer bias reflects the tendency of some physicians to be consistently lenient in their judgments of medical care, whereas others are consistently strict.9,18 There may also be a tendency to judge medical care more harshly if reviewers know that a patient had a serious adverse outcome or a permanent disability. Physician judgments tend to be inversely related to the severity of permanent disability: the worse the permanent disability, the more likely physicians are to assess the quality of medical care negatively.12 Observer discordance has also been associated with clinical conditions for which there is a lack of conclusive evidence of effective therapy.19 Thus, in situations in which clinical effectiveness is less clear, physician reviewer judgments of the appropriateness of care show greater variation. The general lack of formal training in the case review process also may contribute to poor performance and reliability.20 Other possible factors include the complex nature of medical practice itself, the large and almost unmanageable amount of material involved in some cases, the inability to read handwritten records, and the lack of available practice guidelines for many disorders.

This study differs from previous ones in several ways. First, legibility of medical records was not an issue, because the case reviews were typed and abbreviations (other than widely recognized ones such as "OD" and "OS") were not used. Second, the clinical histories were short, and the therapeutic interventions were limited and straightforward. Unlike many general medical and surgical reviews that involve patients with prolonged hospitalizations or clinical courses, these two cases transpired over a few days and revolved around a single clinical issue. Third, the reviewers were not biased by knowledge of final long-term outcomes or of whether permanent visual disability occurred. Each case was reviewed by multiple ophthalmologists, which diminishes the likelihood that the results were influenced by a few physicians who might have misinterpreted material or who held nonconventional clinical opinions. Finally, this study looked at two strata of review: peer and expert. Although the definitions of peer and expert are inexact, this study used a commonly accepted standard for an expert as a board-certified physician with an additional year of fellowship training.4,21 The two case reviews in this study were selected after careful consideration that the diagnoses and managements were within the scope of practice of general ophthalmologists.

Despite the brevity of the written case material and the absence of knowledge about long-term visual disability, there were divergent opinions about practitioner culpability and adherence to standards of care among both peer and expert reviewers. A number of participant reviewers commented that these cases were "ambiguous" or difficult to evaluate, because they could not determine from the medical record which physical findings were actually present during each visit or which clinical interpretation was correct. Such situations, in which conflicting findings are present in the medical record, are precisely the types of cases that end up being reviewed or litigated.

This study and others demonstrate serious limitations of implicit medical record review. Previous investigators have recommended several remedies, including greater use of structured implicit review, more reliance on explicit review, and formal training for physicians who become involved in medical review.9 Although explicit reviews are considered more reliable than implicit reviews, explicit review methods are more time consuming and costly, and they cannot be developed to address all possible clinical situations. Tests of reliability of the explicit review process also have had mixed results.22 In one comprehensive review, physicians overturned primary decisions for hospitalization on the basis of explicit criteria 28% to 74% of the time.23 One of the most respected methods of explicit review for defining appropriate medical care is the Delphi panel process developed in the 1980s by RAND and the University of California, Los Angeles.24 With this method, physicians develop hundreds of case scenarios on the basis of combinations of clinical factors that could affect patient outcome. A multidisciplinary panel of expert clinicians independently rates the scenarios on a 9-point scale and then rates them again after discussing areas of disagreement. The Delphi process, however, is probably too cumbersome and costly to replace most implicit review programs.22,25

The process of accurately and consistently identifying substandard medical care is complex and may be as elusive as agreeing on what constitutes good medical care. This study and others in the literature show that an unstructured implicit case review is unreliable in judging physician error and compliance with standards of care. When seemingly straightforward implicit reviews provoke divergent opinions about physician culpability, the process seems capricious. To a large extent, implicit case reviews reflect personal values, individual training, and background.

Physician judgments about the quality of medical care are necessary to maintain professional accountability.26 Although the term "standard of care" is familiar, the concept is remarkably complex and steeped in medicolegal connotation. When the term is used in a legal context, it usually refers to the level of skill, care, and diligence exercised by members of the same profession in a similar clinical setting. In this study, peers and experts were asked to make implicit judgments about standards of care without any predetermined guidelines on how to measure them. This type of subjective independent review is widely used in hospital quality assurance settings, by disciplinary review boards, and in malpractice proceedings.12


Although some managed care organizations and quality improvement programs use explicit structured formats, formalized standards often do not exist for many clinical situations. In ophthalmology, at least two theoretical standards of care exist: those for generalists and those for subspecialists. With increasing specialization, courts have disregarded geographic considerations in establishing standards and instead hold physicians to the standards of their board-certified specialty. In this study, the reviewers did not have specific knowledge of the previous training of the ophthalmologists in the case reviews, but it was apparent that these physicians practiced general ophthalmology, because their peer group was identified as general ophthalmologists. Conversely, because this was an implicit review, participants were free to apply any standard of care they deemed appropriate. Although this particular laxity may seem troublesome, there are, in fact, no satisfactory guidelines that differentiate the quality or standard of care provided by a generalist and a subspecialist in ophthalmology.21

The purpose of this study was to test the reliability of the unstructured implicit case review process. An implicit case review is a commonly used tool in a variety of professional and legal activities. The lack of reliability of the implicit review process casts doubt on the validity of any professional or legal proceeding that uses it. The merits and flaws of the peer review process, standards of care evaluation, and medical tort law are not the principal concern of this study, but their outcomes are affected by the use of implicit case reviews. What this study points out is the discrepancy between the important professional and legal roles given to the implicit case review process and its dismal performance in tests of reliability.6,8,9,11,13–15,27 The implicit case review needs to be regarded as a rough screening test, and its limitations must be recognized. Serious questions about compliance with standards of care, medical maloccurrence, or malpractice that arise from an unstructured implicit case review must be verified by other methods. At minimum, a second level of review should have an explicit format, be based on established clinical practice guidelines if they exist, and be conducted by persons with experience in the clinical review process. The optimal method of identifying and verifying substandard medical performance, however, has not been determined.6,9,14,22,25 More research is needed in this area to establish confidence in the process and to guarantee its validity.

References

1. Dans PE, Weiner JP, Otter SE. Peer review organizations. Promises and potential pitfalls. N Engl J Med 1985;313:1131–7.
2. Smits HL. The PSRO in perspective. N Engl J Med 1981;305:253–9.
3. Joint Commission on Accreditation of Healthcare Organizations. Accreditation Manual for Hospitals, 1991. Oakbrook Terrace, IL: Joint Commission on Accreditation of Healthcare Organizations, 1990:220.
4. Piorkowski JD. Medical testimony and the expert witness. In: Sanbar SS, Gibofsky A, Firestone MH, LeBlang TR, eds. Legal Medicine. St. Louis: Mosby, 1998:132–46.
5. Epi Info 6 [computer program]. Atlanta: Epidemiology Program Office, Centers for Disease Control and Prevention, 1995.


6. Goldman RL. The reliability of peer assessments of quality of care. JAMA 1992;267:958–60.
7. Rubin HR, Rogers WH, Kahn KL, et al. Watching the doctor-watchers. How well do peer review organization methods detect hospital care quality problems? JAMA 1992;267:2349–54.
8. Wilson DS, McElligott J, Fielding P. Identification of preventable trauma deaths: confounded inquiries? J Trauma 1992;32:45–51.
9. Localio AR, Weaver SL, Landis JR, et al. Identifying adverse events caused by medical care: degree of physician agreement in a retrospective chart review. Ann Intern Med 1996;125:457–64.
10. Park RE, Fine A, Brook RH, et al. Physician ratings of appropriate indications for six medical and surgical procedures. Am J Public Health 1986;76:766–72.
11. Dippe SE, Bell MM, Wells MA, et al. A peer review of a peer review organization. West J Med 1989;151:93–6.
12. Caplan RA, Posner KL, Cheney FW. Effect of outcome on physician judgments of appropriateness of care. JAMA 1991;265:1957–60.
13. MacKenzie EJ, Steinwachs DM, Bone LR, et al. Inter-rater reliability of preventable death judgments. J Trauma 1992;33:292–302.
14. Hayward RA, McMahon LF Jr, Bernard AM. Evaluating the care of general medicine inpatients: how good is implicit review? Ann Intern Med 1993;118:550–6.
15. Posner KL, Caplan RA, Cheney FW. Variation in expert opinion in medical malpractice review. Anesthesiology 1996;85:1049–54.
16. Schroeder SA, Kabcenell AI. Do bad outcomes mean substandard care? [editorial]. JAMA 1991;265:1995.
17. Rubenstein LV, Kahn KL, Reinisch EJ, et al. Changes in quality of care for five diseases measured by implicit review, 1981 to 1986. JAMA 1990;264:1974–9.
18. Richardson FM. Peer review of medical care. Med Care 1972;10:29–39.
19. Lomas J, Anderson G, Enkin M, et al. The role of evidence in the consensus process: results from a Canadian consensus exercise. JAMA 1988;259:3001–5.
20. Brook RH, Lohr KN. Monitoring quality of care in the Medicare program. Two proposed systems. JAMA 1987;258:3138–41.
21. Fung WE. Opportunities for maloccurrence in delivery of specialized care in the managed care environment [case report]. Surv Ophthalmol 1998;43:280–2.
22. Naylor CD. What is appropriate care? N Engl J Med 1998;338:1918–20.
23. Strumwasser I, Paranjpe NV, Ronis DL, et al. Reliability and validity of utilization review criteria: appropriateness evaluation protocol, standardized medreview instrument, and intensity-severity-discharge criteria. Med Care 1990;28:95–111.
24. Ayanian JZ, Landrum MB, Normand SLT, et al. Rating the appropriateness of coronary angiography: do practicing physicians agree with an expert panel and with each other? N Engl J Med 1998;338:1896–904.
25. Shekelle PG, Kahan JP, Bernstein SJ, et al. The reproducibility of a method to identify the overuse and underuse of medical procedures. N Engl J Med 1998;338:1888–95.
26. Millenson ML. Demanding Medical Excellence: Doctors and Accountability in the Information Age. Chicago: University of Chicago Press, 1999.
27. Rutkow IM, Gittelsohn AM, Zuidema GD. Surgical decision making. The reliability of clinical judgment. Ann Surg 1979;190:409–19.