Causality assessment methods in drug induced liver injury: Strengths and weaknesses


Review

Miren García-Cortés1,a, Camilla Stephens2,a, M.I. Lucena2,*,a, Alejandra Fernández-Castañer1, Raúl J. Andrade1,a, on behalf of the Spanish Group for the Study of Drug-Induced Liver Disease (Grupo de Estudio para las Hepatopatías Asociadas a Medicamentos, GEHAM)

1Unidad de Hepatología, Servicio de Aparato Digestivo, Hospital Universitario Virgen de la Victoria, Campus Universitario de Teatinos s/n, 29010 Málaga, Spain; 2Servicio de Farmacología Clínica, Grupo de Estudio para las Hepatopatías Asociadas a Medicamentos, Co-ordinating Centre, Hospital Universitario Virgen de la Victoria, Facultad de Medicina, Campus Universitario de Teatinos s/n, Málaga, Spain

Summary

Diagnosis of drug-induced liver injury (DILI) remains a challenge and eagerly awaits the development of reliable hepatotoxicity biomarkers. Several methods have been developed in order to facilitate hepatotoxicity causality assessments. These methods can be divided into three categories: (1) expert judgement, (2) probabilistic approaches, and (3) algorithms or scales. The last category is further divided into general and liver-specific scales. The Council for International Organizations of Medical Sciences (CIOMS) scale, also referred to as the Roussel Uclaf Causality Assessment Method (RUCAM), although cumbersome and difficult to apply by physicians not acquainted with DILI, is used by many expert hepatologists, researchers, and regulatory authorities to assess the probability of suspected causal agents. However, several limitations of this scale have been brought to light, indicating that a number of adjustments are needed. This review is a detailed, timely criticism to alert readers to the limitations and give insight into what would be needed to improve the scale. Instructions on how to approach DILI diagnosis in practice are provided, using CIOMS as an aid to emphasize the topics to be addressed when assessing DILI cases. Amendments of the CIOMS scale in the form of applying authoritative evidence-based criteria, a simplified scoring system, and appropriate weighting given to individual parameters based on statistical evaluations with large databases will provide wider applicability in the clinical setting.

© 2011 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.

Keywords: Drug induced liver injury; Hepatotoxicity; CIOMS/RUCAM scale; Causality assessment methods; Diagnosis; Risk factors; Limitations.

Received 22 December 2010; received in revised form 11 February 2011; accepted 14 February 2011

* Corresponding author. Address: Departamento de Farmacología, Facultad de Medicina, Boulevard Louis Pasteur, 32 Campus de Teatinos s/n, 29071 Málaga, Spain. Tel.: +34 952 131572; fax: +34 952 131568. E-mail address: [email protected] (M.I. Lucena).

a Centro de Investigación Biomédica en Red: Enfermedades Hepáticas y Digestivas (CIBERehd).

Abbreviations: ADR, adverse drug reaction; ALT, alanine aminotransferase; ALP, alkaline phosphatase; DILI, drug induced idiosyncratic liver injury; CIOMS, Council for International Organizations of Medical Sciences; RUCAM, Roussel Uclaf Causality Assessment Method.

Introduction

Drug induced liver injury (DILI) is today a leading global health problem. Idiosyncratic DILI and acetaminophen overdose together rank as the leading cause of acute liver failure in the US and Sweden and are the main reasons for postmarketing regulatory decisions, such as withdrawal of drugs from the market [1–3]. Almost any pharmaceutical or xenobiotic compound may induce liver injury, rendering the diagnosis of DILI a challenging task for health care professionals. In addition, robust DILI biomarkers specific and sensitive enough to distinguish DILI from other causes of liver injury are absent. The development of such biomarkers would not only facilitate DILI diagnosis, but would also enhance DILI research and human risk assessments.

Efforts to enhance the identification of adverse hepatic reactions and to obtain reliable information about their epidemiology and pathogenesis are being made worldwide. The Spanish DILI Registry, set up in 1994, is a multicenter collaborative network with a large database of prospectively recorded DILI cases [4]. The Drug Induced Liver Injury Network (DILIN) was established in the US in 2003 to conduct research into the causes of DILI, while DILIGEN is operating in England [5,6]. Collections of hepatotoxicity cases identified through pharmacovigilance systems have also been set up, such as the Swedish adverse drug reactions advisory committee (SADRAC) [7] and the EUDAGRENE project [8]. These collaborative networks provide opportunities to agree upon common definitions, diagnostic criteria, terminologies, and improvements in causality assessment.

This review presents a summary of the most commonly used causality assessment methods for adverse drug reactions (ADRs), focusing on their strengths and drawbacks in determining causality in potential DILI cases, and gives insight into what would be needed to improve the CIOMS scale. This review is also intended to give the reader detailed instructions on how to approach the diagnosis of DILI in practice, using the CIOMS as a reminder of the topics that need to be addressed when approaching DILI cases.

Journal of Hepatology 2011 vol. 55 | 683–691

Key Points

• Until specific biomarkers for DILI become available, the diagnosis relies on a systematic approach where chronology of drug administration and dechallenge with regard to the hepatic disease and careful exclusion of potential competing causes are crucial.
• It is unlikely that a single instrument would accommodate all forms of DILI presentation unless there is a dynamic weighting of the component variables.
• Among the available scoring methods for assessing DILI in clinical practice, the CIOMS scale, although cumbersome and with flaws in intra- and inter-rater reliability, is currently the preferred method when approaching a suspicion of DILI. However, it does not substitute for clinical judgment.
• Complex instructions, selection and weighting of the criteria, and discrimination among concomitant drugs need to be addressed more adequately in the CIOMS scale. Likewise, liver biopsy findings and immunoallergic features should be incorporated in the scoring system.
• A future "tailored" computerised assessment scale that could incorporate information on known signatures, specific risk factors, and authoritative evidence-based criteria for a given drug could be feasible through the analyses of current large databases of DILI patients.

Diagnosis of DILI

Prompt recognition of a culprit drug as the cause of liver injury is the most important aspect in hepatotoxicity management, since it appears to decrease the risk of progression to acute liver failure or chronic liver injury [9]. Several aspects of DILI complicate its diagnosis. Firstly, DILI may resemble any acute or chronic liver disease, and the "signature" (consistent clinical, pathological, and latency presentations) for a given drug can vary. Secondly, there is currently no "gold standard" for DILI verification. The diagnosis, therefore, depends heavily on exclusion of other causes of liver injury, with no end to the possible exclusions that could be sought.

Diagnosis of idiosyncratic DILI requires a high level of suspicion and is usually made after a retrospective review of the available data. Evaluation of cases is not a homogeneous process, owing to work-up variability resulting from differences in clinical approaches and patient data availability. Polypharmacy and the presence of concomitant diseases can further impede DILI diagnosis. Gathering the necessary information while the illness is unfolding improves the chances of accurate diagnosis and, consequently, patient outcome. A careful step-by-step approach to diagnosis is recommended (Fig. 1).

Fig. 1. Approaching a suspected drug-induced hepatotoxicity case.

Liver injury (clinical hepatitis or abnormal increase in liver tests) → Suspicion of DILI → Retrieve history of drug and xenobiotic use → Exposure to potential DILI agent with compatible temporal sequence?

NO: Search for non-drug causes:
• Alcohol abuse
• Viral hepatitis (A, B, C, D, and E)
• EBV and CMV
• Bacterial or fungal sepsis
• Autoimmune hepatitis
• Inherited diseases (e.g. Wilson's disease)
• Congestive heart failure, ischemic liver
• Primary or metastatic liver
• Biliary tract or pancreatic carcinoma
• Benign biliary obstruction

YES: Search for culprit drug:
• Hepatotoxic potential
• Typical latency signature
• Features strengthening DILI suspicion: drug allergy features; rapid improvement upon dechallenge; inadvertent rechallenge/previous drug exposure; suggestive liver biopsy features; high CIOMS score

Initiatives to harmonize DILI assessments have been taken, including a recently held international clinical research workshop focused on standardizing the nomenclature and terminology used in DILI research [10]. Likewise, the International Severe Adverse Events Consortium (iSAEC) Phenotyping Standardization Project (PSP) is currently working towards achieving consistency, homogeneity, and objectivity in the definition, characterization, and classification of clinical DILI syndromes [11].

Causality assessment methods

DILI causality assessment has been a subject of interest and debate for years. Several standardized systems have been proposed to assess the relationship between drugs and the appearance of adverse events. Those systems belong to three categories: expert judgement, probabilistic methods, and algorithms or scales.


Table 1. Strengths and weaknesses of standardized causality assessment scales and algorithms in evaluation of suspected DILI cases.

| Strengths | Weaknesses |
| --- | --- |
| Provide better reproducibility between evaluators | Often complex and time consuming |
| Enhance objectivity in case assessments | Do not provide certainty of case diagnosis |
| Grade the strength of probability in broad categories | Do not substitute clinical judgement |
| Excellent teaching tool for DILI workers | Lack of case information or follow-up data leads to reduced probability |
| Act as a checklist for information needed from the case | Do not discriminate among concomitant drugs |
| Valuable in assessment of complex cases or in research settings | Evaluation of fatal or atypical cases remains challenging |
| Can provide early detection of pharmaceutical warning signs for regulatory measurements | Restrictive criteria and arbitrary weighting of factors may lead to incorrect evaluations |

Expert judgement

Expert judgement relies on professional opinions on causality after considering all available and relevant data on the case [12]. This approach is similar to clinical diagnosis. Hence, it suffers the same limitation: subjectivity [13]. However, some authors, such as Macedo et al., defend the expert judgement method because it showed greater specificity than several general algorithms [14]. Today, the most detailed and standardized expert opinion method is used by the DILIN study group [15]. This group attempts to minimize individual biases and inter-rater variability by producing a consensus from three expert opinions. An obvious problem with this system, however, is the absence of an expert panel in daily clinical practice.

Probabilistic methods

Most probabilistic approaches are derived from Bayes' theorem, a mathematical representation of the relationship between one conditional probability and its inverse [13]. This procedure respects the basic rule of probability theory, which states that in the absence of any relevant information one should obtain a neutral estimate. This is clearly advantageous in DILI causality assessments, where an incomplete work-up is a common occurrence in clinical practice. However, the requirement of precise data from a large number of cases to model the probability distribution for each parameter limits the applicability of this approach, as DILI is a relatively rare condition. Nevertheless, probabilistic methods have been tested in hepatotoxicity [16].

Scales and algorithms

Scales and algorithms are useful ways of weighting the evidence concerning adverse drug reactions (ADRs). They provide not only a diagnostic tool but also a checklist to help gather the information required, by reminding clinicians of the topics to be addressed when a case is assessed.
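The Bayesian reasoning behind the probabilistic methods described above can be illustrated with a short calculation. The following is a minimal sketch in Python and is not taken from any of the cited methods: the function name, the prior, and the likelihood ratios are hypothetical, and the pieces of evidence are assumed to be conditionally independent. The prior probability of drug causation is converted to odds, multiplied by the likelihood ratio of each available finding, and converted back to a probability. With no findings supplied, the posterior equals the prior, which corresponds to the neutral estimate mentioned above.

```python
from math import prod
from typing import Iterable


def posterior_probability(prior: float, likelihood_ratios: Iterable[float]) -> float:
    """Odds form of Bayes' theorem: posterior odds = prior odds x product of likelihood ratios.

    `prior` is the probability of drug causation before considering case findings.
    Each likelihood ratio is P(finding | drug cause) / P(finding | other cause).
    Findings are assumed conditionally independent (an illustrative simplification).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    # prod() of an empty iterable is 1, so missing evidence leaves the odds unchanged.
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical inputs: prior of 0.2 and a single finding with likelihood ratio 4.
    print(posterior_probability(0.2, [4.0]))
    # No findings available: the estimate stays at the prior.
    print(posterior_probability(0.2, []))
```

The practical obstacle noted in the text is visible here: every likelihood ratio has to be estimated from a large number of well-documented cases, which is difficult for a rare condition such as DILI.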
Algorithmic approaches are widely used for operational assessments of ADRs, not only because of their appealing simplicity (successive evaluation of criteria, sum of scores, or decision trees), but also because they require fewer subjective judgements. However, assessments with a given algorithm strongly depend on the weight given to the criteria. The validity of assessment scales will, therefore, drop if the correct weights have not been allocated to the parameters. A major problem currently hindering the validation of ADR causality algorithms in DILI assessments is the absence of a diagnostic "gold standard" to define the truth. Advantages and disadvantages of standardized causality assessment scales and algorithms in DILI evaluation are outlined in Table 1. The scale and algorithm methods can be further divided into general systems for causality assessment of non-specific ADRs and liver-specific scales for DILI.

General scales and algorithms

Karch and Lasagna were the first to develop operational diagnostic criteria for identification of ADRs, in 1977 [17]. Their approach has two major limitations. Firstly, it is impossible to identify the first case of ADR as definitive for a drug not previously implicated in ADRs. Secondly, subjective judgement is needed for many of the steps, making the method more prone to bias. An alternative algorithm, the Naranjo Adverse Drug Reaction Probability Scale (1981), offers the advantage of simplicity and wide applicability [18]. This scale involves ten "yes", "no", or "unknown/inapplicable" questions that provide a probability category based on the total score. The scale was validated and demonstrated improved assessment reproducibility. However, the Naranjo scale showed low sensitivity and negative predictive value in a recent comparison with the Council for International Organizations of Medical Sciences (CIOMS) method in 225 suspected hepatotoxicity cases [19]. This study concluded that the Naranjo scale lacks validity and reproducibility when evaluating DILI. Indeed, the Naranjo method was not designed for assessing hepatotoxicity but for assessing ADRs in general. It therefore contains questions not relevant to idiosyncratic reactions and is consequently not recommended for hepatotoxicity assessments.
Specific causality assessment scales for DILI

Given the low validity and reproducibility of general methods when assessing hepatotoxic events, several groups have attempted to develop objective, systematic methodologies for evaluating this type of ADR [20–23]. The first causality assessment method for drug-induced liver injury was the decision tree developed by Stricker in 1992 [20]. This model assesses the degree of certainty on a scale of several levels. Unfortunately, Stricker's decision tree is a complex and perhaps overly subjective method for use in routine clinical practice.


Table 2. The most representative liver-specific causality assessment scales.

| Axis | CIOMS score | Maria & Victorino score | DDW-J score |
| --- | --- | --- | --- |
| Chronological criteria: from drug intake until onset | +1 to +2 | +1 to +3 | +1 to +2 |
| Chronological criteria: from drug withdrawal until onset | 0 to +1 | -3 to +3 | 0 to +1 |
| Chronological criteria: course of the reaction | -2 to +3 | 0 to +3 | -2 to +3 |
| Risk factors | 0 to +2 | – | 0 to +1 |
| Concomitant therapy | -3 to 0 | – | – |
| Exclusion of other causes | -3 to +2 | -3 to +3 | -3 to +2 |
| Previous information | 0 to +2 | – | 0 to +1 |
| Known reaction | – | -3 to +2 | – |
| Rechallenge | -2 to +3 | +3 | 0 to +3 |
| Extrahepatic manifestations | – | 0 to +3 | 0 to +1 |
| DLST | – | – | 0 to +2 |

| Causality category | CIOMS (points) | Maria & Victorino (points) | DDW-J (points) |
| --- | --- | --- | --- |
| Definite | >8 | >17 | >4 |
| Probable | 6–8 | 14–17 | 3–4 |
| Possible | 3–5 | 10–13 | – |
| Unlikely | 1–2 | 6–9 | <3 |
| Excluded | ≤0 | <6 | – |

CIOMS, Council for International Organizations of Medical Sciences; DDW-J, Digestive Disease Week-Japan; DLST, drug lymphocyte stimulation test.
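The score bands in Table 2 amount to a simple lookup from a total score to a causality category. The sketch below, in Python, is only an illustration of that mapping: the function and dictionary names are hypothetical, scores are assumed to be whole numbers, and the lowest CIOMS band is taken as 0 or less so that a score of exactly 0 is classified (an assumption about the intended boundary).

```python
# Lower bounds (inclusive) for each category, highest band first.
# Values are read from the score bands of Table 2; integer scores are assumed.
BANDS = {
    "CIOMS": [(9, "definite"), (6, "probable"), (3, "possible"), (1, "unlikely")],
    "M&V": [(18, "definite"), (14, "probable"), (10, "possible"), (6, "unlikely")],
    "DDW-J": [(5, "definite"), (3, "probable")],
}

# Category assigned when the score is below every listed lower bound.
# For CIOMS this treats a score of 0 or less as "excluded" (assumption).
FALLBACK = {"CIOMS": "excluded", "M&V": "excluded", "DDW-J": "unlikely"}


def causality_category(scale: str, score: int) -> str:
    """Map a total score to the causality category of the given scale."""
    if scale not in BANDS:
        raise ValueError(f"unknown scale: {scale!r}")
    for lower_bound, label in BANDS[scale]:
        if score >= lower_bound:
            return label
    return FALLBACK[scale]


if __name__ == "__main__":
    for scale, score in [("CIOMS", 7), ("M&V", 10), ("DDW-J", 2)]:
        print(scale, score, causality_category(scale, score))
```

Writing the bands out this way makes it easy to see that the same numerical score means different things on different scales; for instance, 10 points is above the top CIOMS threshold but only reaches "possible" on the M&V scale.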

Beginning in 1989, the CIOMS coordinated several meetings with a panel of experts to develop an objective and consistent DILI causality assessment algorithm. This led to the publication of the Roussel Uclaf Causality Assessment Method (RUCAM) in 1993 [21]. This method, which we will refer to as the CIOMS scale, is based on the international DILI consensus criteria that were published a few years earlier [24]. The scale applies numerical weighting to key features in seven different domains: time to onset (latency), course of the reaction after drug withdrawal (dechallenge), risk factors, concomitant drug use, search for other aetiologies, existing information on the drug's hepatotoxic potential, and response to rechallenge. The numerical weights given to the key features are summed to generate an overall score that reflects the causality probability (highly probable, probable, possible, unlikely, or excluded). The system handles hepatocellular and cholestatic/mixed reactions differently, recognizing that the latter may occur after a longer post-cessation period and resolve more slowly.

This method, when first validated, demonstrated 86% sensitivity, 89% specificity, and positive and negative predictive values of 93% and 78%, respectively, using a cut-off point of 5 [25]. However, the validation of this system can be questioned as there was, and still is, no true diagnostic "gold standard" available to correctly verify DILI. Nevertheless, the DILI cases included in this validation were cases with a positive drug rechallenge, which is currently the strongest evidence recognized for confirming DILI. The reproducibility of the scale was evaluated by having four experts apply the CIOMS to 50 suspected DILI cases. Agreement between two, three, and four experts was 99%, 74%, and 37%, respectively, indicating worrying discrepancies as the number of raters increases [21].
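For readers less familiar with the validity measures quoted above, the following minimal Python sketch shows how sensitivity, specificity, and predictive values follow from applying a cut-off to a set of scored cases with a known reference diagnosis. It does not reproduce the validation study: the function name and the example data are hypothetical, and treating a score strictly above the cut-off as positive is an assumption, since the text does not state how a score equal to the cut-off was handled.

```python
from typing import Dict, Sequence


def diagnostic_metrics(
    scores: Sequence[int],
    reference_positive: Sequence[bool],
    cutoff: int = 5,
) -> Dict[str, float]:
    """Compare 'score > cutoff' against a reference diagnosis.

    Assumes both reference classes and both predicted classes occur,
    otherwise one of the ratios below is undefined (division by zero).
    """
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, reference_positive):
        predicted = score > cutoff  # assumption: strictly above the cut-off counts as positive
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),  # reference-positive cases detected
        "specificity": tn / (tn + fp),  # reference-negative cases correctly rejected
        "ppv": tp / (tp + fp),          # positive results that are truly positive
        "npv": tn / (tn + fn),          # negative results that are truly negative
    }


if __name__ == "__main__":
    # Hypothetical scores and reference diagnoses, for illustration only.
    example_scores = [9, 7, 6, 4, 8, 2, 5, 6]
    example_truth = [True, True, True, True, True, False, False, False]
    print(diagnostic_metrics(example_scores, example_truth, cutoff=5))
```

Because the predictive values depend on the proportion of true DILI cases in the sample, figures obtained in a cohort enriched for rechallenge-positive cases would not necessarily carry over to unselected clinical practice.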
The CIOMS scale also demonstrated limited reliability in a recent DILIN study, where 40 DILI cases were assessed by three independent raters and reassessed by the same raters five months later. Complete agreement was only seen in 26% of the cases and the intra- and inter-rater reliability were 0.54 and 0.45, respectively [26].

In 1997, Maria and Victorino developed a simplified scoring system in an attempt to overcome the complexity of the CIOMS system. This system, referred to as the Clinical Diagnostic Scale (CDS) or the M&V scale, uses several features of the CIOMS scale, but focuses on fewer components [22]. The overall score corresponds to five probability degrees: definite, probable, possible, unlikely, and excluded. The M&V scale was validated using real and fictitious cases and compared with the classification of three external experts. The comparison showed 84% agreement between the scale and expert opinions [22]. However, the fact that the authors themselves performed the scale assessments might have enhanced the scale's performance.

The M&V scale also has limitations. The scale classifies cases as definite only when "positive rechallenge" and hypersensitivity features are present, despite these features being comparatively infrequent in DILI [4]. Besides, the instrument performs poorly in atypical cases, such as those with unusually long latency periods or those leading to chronic evolution after drug withdrawal [27]. In addition, drugs with more than five years on the market and no documented hepatotoxicity potential are given a lower score, and criteria for exclusion of alternative causes are poorly described. Taken together, these limitations make it difficult to generate high scores for most hepatic reactions and to ascertain a diagnosis of drug-related hepatotoxicity with confidence.

A new diagnostic scale, the Digestive Disease Week-Japan (DDW-J) scale, was recently proposed in Japan. This scale was derived from the CIOMS scale, but with modifications in the items concerning chronologic criteria, concomitant drug(s), and extra-hepatic manifestations [23]. The DDW-J scale has been shown to accurately diagnose DILI and to be superior to the CIOMS and M&V scales [28]. The DDW-J scale includes an in vitro drug lymphocyte stimulation test (DLST) evaluation criterion. Limited access and lack of standardization have prevented generalized clinical use of the DLST outside Japan and, consequently, wider application of the DDW-J scale. However, recent findings of specific HLA allele associations with DILI, suggesting an important role for the adaptive immune response in DILI pathogenesis [29], have highlighted the DLST's potential value and promoted new interest in lymphocyte-based tests in DILI assessments [30]. A comparison of domains and corresponding score allocations in the three most representative liver-specific causality assessment scales is featured in Table 2.

Various studies have examined the level of agreement between causality assessment methods, in particular the CIOMS and the M&V systems. A comparison of these two methods in 215 suspected DILI cases demonstrated low consistency between the two systems [27]. Absolute agreement between the scales was only observed in 42 cases (18%). The CIOMS scale showed better discriminative power and produced assessments closer to those of specialists. The M&V scale was also compared with the international consensus criteria in 135 reported hepatic ADRs [31]. Reports classified as drug-related with the consensus criteria scored higher with the M&V scale than those classified as drug-nonrelated or indeterminate. Aithal and co-workers suggested that a cut-off score >9 could identify all drug-related liver injuries, unless alternative diagnoses are strongly suspected. This seems a precarious conclusion considering that the proposed cut-off score falls into the "possible" category in the M&V scale, which makes it unreliable for decision making.

A comparison between the DILIN group's structured expert opinion method and a five-category CIOMS scale in 187 suspected DILI cases (each assessed by three independent reviewers) showed initial complete agreement in 27% and 19% of cases, respectively, with modest correlation between the two methods (Spearman's r = 0.42). The evaluation also demonstrated a shift of the causality likelihood towards lower probabilities with the CIOMS method [15]. The flaw in intra- and inter-rater reliability demonstrated for the CIOMS scale is a concern and casts doubt on the validity of this method in its current state. Besides, a single instrument could hardly accommodate all the complexities posed by DILI. A summary of the different comparative studies focused on in this review is shown in Table 3.

Table 3. Summary of validity and comparative studies between different causality assessment methods.

| Studies | Compared methods | Number of cases | Agreement and validity | Comments |
| --- | --- | --- | --- | --- |
| Lucena et al. [27] | CIOMS vs. M&V | 215 | 18% and Kw 0.28; M&V: S 37%, Sp 100%, PPV 100%, NPV 25% | Higher agreement in HS cases; lower agreement in cases with longer latency period, death, and liver transplant |
| Aithal et al. [31] | M&V vs. International Consensus Criteria | 135 | M&V: S 88%, Sp 92% | The authors concluded that the M&V scale correlated well with consensus classification. A large proportion of the cases presented HS features. |
| Watanabe et al. [28] | DDW-J vs. CIOMS vs. M&V | 127 | DDW-J vs. CIOMS: Kw 0.03; CIOMS vs. M&V: Kw 0.35; DDW-J: >S but … | DDW-J and CIOMS are better than M&V |
| García-Cortés et al. [19] | CIOMS vs. Naranjo | 225 | 24% and Kw 0.15; Naranjo: S 54%, Sp 88%, PPV 95%, NPV 29%; Naranjo: interrater 45%, Kw 0.17; CIOMS: interrater 72%, Kw 0.71 | The Naranjo scale lacks validity and reproducibility in DILI cases. |
| Rockey et al. [15] | SEO vs. CIOMS | 187 | Spearman's r: 0.42; SEO: interrater 27%; CIOMS: interrater 19% | The SEO produced higher interrater agreement and likelihood scores than the CIOMS. |

ALF, acute liver failure; CIOMS, Council for International Organizations of Medical Sciences; DDW-J, Digestive Disease Week-Japan; HS, hypersensitivity; Kw, weighted kappa; M&V, Maria and Victorino scale; NPV, negative predictive value; PPV, positive predictive value; S, sensitivity; SEO, structured expert opinion method; Sp, specificity.

The way in which hepatotoxicity diagnosis is reached was evaluated in 61 DILI cases reported to the PubMed database during the last decade. While the CIOMS was found to be the most frequently used scale (16.4%), followed by the Naranjo (13.1%), more than 62% of the reports did not use any causality assessment method [32]. This highlights the fact that diagnostic scales do not substitute for clinical judgement. Indeed, such methods only translate the clinician's suspicion into a quantitative score and are designed for supporting rather than excluding causality.
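Several of the comparisons above are reported as weighted kappa (Kw) values. As an aid to interpreting them, here is a minimal sketch in Python of a weighted kappa for two sets of ordinal ratings. It is not the computation used in the cited studies: the function name is hypothetical, the categories are assumed to be coded as integers from 0 to k-1 (for example, 0 = excluded up to 4 = definite), and linear disagreement weights are an assumption, since the weighting scheme used in the studies is not stated here.

```python
from typing import Sequence


def weighted_kappa(ratings_a: Sequence[int], ratings_b: Sequence[int], n_categories: int) -> float:
    """Weighted kappa with linear disagreement weights |i - j| / (k - 1).

    Returns 1.0 for perfect agreement and about 0 for chance-level agreement.
    Undefined (division by zero) if both raters place every case in one same category.
    """
    n = len(ratings_a)
    k = n_categories
    if n == 0 or n != len(ratings_b):
        raise ValueError("both raters must rate the same non-empty set of cases")
    if k < 2:
        raise ValueError("at least two categories are required")

    # Observed proportion of cases in each (rater A category, rater B category) cell.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n

    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[i][j]
            # Expected cell proportion if the two ratings were independent.
            expected_disagreement += weight * row_totals[i] * col_totals[j]

    return 1.0 - observed_disagreement / expected_disagreement


if __name__ == "__main__":
    # Hypothetical category codes assigned to the same six cases by two methods.
    method_1 = [4, 3, 3, 2, 1, 0]
    method_2 = [3, 3, 2, 2, 2, 1]
    print(weighted_kappa(method_1, method_2, n_categories=5))
```

Unlike the raw percentage of identical classifications, this statistic discounts agreement expected by chance and penalizes a two-category disagreement more than a one-category disagreement, which is why both figures are usually reported side by side.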

Strengths and limitations of the CIOMS scale Due to heterogeneous DILI symptoms and variability among drugs and individuals, it is unlikely that a unique assessment

Journal of Hepatology 2011 vol. 55 j 683–691

687

Review instrument would fit all forms of presentations. The CIOMS scale offers a reasonable balance between the demands for absolute scientific objectivity and the necessity of having a simple enough method for practical use, and is consequently used by many experts in the field. Nevertheless, the CIOMS scale on its own is flawed for DILI diagnosis, an observation also made in recent additional reviews on the topic [33,34]. The CIOMS scale has, however, room for improvements. Difficulties in how to apply the scale are a serious setback and none of the risk factors focused on in the CIOMS scale has been proven to play a conclusive role in determining hepatotoxicity causality. In addition, the significance of the numerical weight given to each domain is questionable. Incomplete patient work-ups, atypical presentations of DILI, and concomitant drugs are other parameters that are managed inadequately in the scale. The emerging new data on DILI and the practical experience with the CIOMS method over the last decade have resulted in a need for a CIOMS scale ‘facelift’. Complex instructions The initial step when applying the CIOMS system is to define the type of liver injury based on the pattern of serum enzymes by calculating the ‘‘R ratio’’ [24].

these items to the total sum, generating a greater numerical value. In addition, the CIOMS does not assign points for reactions that start more than 15 or 30 days after ceasing drug administration in hepatocellular and cholestatic/mixed cases, respectively. This aspect should be modified to take into account known data on delayed appearance of hepatotoxicity for certain drugs, such as amoxicillin–clavulanate [35]. Course of the reaction Often the time taken to restore normal liver test values is not clear, especially when blood tests are not available at day 8 or 30 for hepatocellular injury or at day 180 for cholestatic/mixed type of liver damage, as requested by the CIOMS scale. Besides, for many drugs the liver injury may worsen for days or weeks after stopping the responsible drug, leading to a lower score. Adaptation and normalization of liver tests can also occur despite the continuation of the offending drug (so-called ‘‘tolerance’’) and is not compatible with the current CIOMS criteria. In cases with aberrant reaction course, such as long term cholestasis or cases of acute liver failure leading rapidly to death or liver transplantation, the CIOMS scale tends to give lower probability scores and consequently under-diagnoses cases in this category [27].

R ¼ ðALT value=ALT ULNÞ=ðALP value=ALP ULNÞ

Risk factors

An R value greater or equal to 5 defines hepatocellular damage, smaller or equal to 2 defines cholestatic damage, and cases with an R value between 2 and 5 are considered as having mixed type of liver injury. Depending on the pattern of liver damage, some items in the scale will score differently. Importantly, mixed type of damage fall into the cholestatic damage category although no data support strict resemblance in disease manifestation, severity, and progression. Furthermore, the type of liver injury may change along the course of the illness [9]. Due to variations in the serum enzyme profile during disease progression, the time point for calculating the R value is important. Many clinicians use enzyme values from the first analytical test showing elevations above normal to establish the R value, while others use peak values, which may or may not coincide with the initial analytical values. The lack of clear instructions for how the R value should be calculated can lead to differences among users in defining the type of liver injury.

In the CIOMS scale, old age is considered a risk factor and a patient age over 55 years scores an extra point. However, recent data from a large cohort indicate that older age does not predispose to overall DILI, but can be a predictor of cholestatic damage while younger age seems to be related to heptocellular pattern of injury [36]. Separate studies have also shown that age may be a risk factor for particular forms of DILI, such as flucloxacillin and isoniazide-induced liver injury [37,38]. Alcohol is another risk factor in the CIOMS scale. Alcohol use per se is not known to increase the risk of idiosyncratic DILI development. Moreover, evidence that chronic alcohol intake increases DILI susceptibility is based on findings with limited number of drugs, and may not be universally applicable [39]. Hence, the point given for alcohol intake in the CIOMS scale, without differentiating time, amount, and duration of consumption, is by no means justified. The diversity and complexity of DILI is reflected in the emerging data of specific risk factors, such as enhanced risk of valproate hepatotoxicity with concomitant use of other anticonvulsants or increased antiretroviral hepatotoxicity in patients co-infected with HBV or HCV and HIV [39]. Such risk factors are disregarded in the CIOMS scale. Most experts agree today that genetic variations are probably the greatest risk factor for DILI [40]. Recent studies have shown, for example, that presence of HLA alleles B⁄5701, DRB1⁄0701, and DRB1⁄1501 are associated with flucloxacillin, ximelagatran, and amoxicillin–clavulanate hepatotoxicity, respectively [6,40]. As new data unfold, extending the risk factor domain in the CIOMS scale will be crucial to validate DILI.

Selection and weighting of criteria Another important limitation is represented by the domains and the corresponding weightings. The CIOMS domains were selected based on expert opinions and, therefore, lack scientific evidence supporting a role in hepatotoxicity. Likewise, the weighting of the criteria has not been derived from appropriate statistical approaches, but was determined relatively arbitrarily by a group of experts. This can potentially lead to incorrect assessment results.

Time to onset

The latent period for DILI can vary considerably. The CIOMS scale contains two items in this domain: time to onset from the beginning and cessation of the drug. Since it is not stated that the items are mutually exclusive some clinicians add the score from both of

Journal of Hepatology 2011 vol. 55 j 683–691

Concomitant drugs

A major limitation of the CIOMS scale is the inability to ascertain causality when two or more drugs are taken concomitantly, especially if they share the same temporal sequence. In DILI cases with concomitant drugs the CIOMS scale requires individual assessments for each of the drugs. Concomitant use of another drug with compatible or suggestive time to onset subtracts one point, but two points if the concomitant drug is known to be hepatotoxic. Should we subtract one point from the score achieved by a known hepatotoxic drug, such as amoxicillin–clavulanate, for the concomitant use of a bronchodilator? Moreover, the term hepatotoxic is often subject to variable interpretations, which may lead to bias towards certain drugs. Likewise, the CIOMS scale disregards differences in metabolic pathways used by concomitant drugs and consequent potential pharmacokinetic drug–drug or drug–disease interactions [41].
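The concomitant-drug deductions described in this section can be written as a small function. This is a minimal sketch covering only the two deductions mentioned in the text. The function name, its parameters, and the zero score when the concomitant drug has no compatible timing are assumptions made for illustration.

```python
def concomitant_drug_points(compatible_timing: bool, known_hepatotoxin: bool) -> int:
    """Illustrative deduction for one concomitant drug (hypothetical helper).

    Assumption: a concomitant drug whose timing is not compatible or
    suggestive leads to no deduction.
    """
    if not compatible_timing:
        return 0
    # One point off for compatible or suggestive timing,
    # two points off if the concomitant drug is also a known hepatotoxin.
    return -2 if known_hepatotoxin else -1


if __name__ == "__main__":
    # Suspected drug: amoxicillin-clavulanate, taken together with a bronchodilator
    # (not regarded as hepatotoxic here) that has a compatible time to onset.
    print(concomitant_drug_points(compatible_timing=True, known_hepatotoxin=False))
```

In this example the score of a well-known hepatotoxic drug is lowered by one point solely because an unrelated drug was taken at the same time. The outcome also hinges on how the evaluator interprets "known hepatotoxin", which the scale leaves open.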

Search for non-drug causes

The CIOMS scale includes a list of common disorders that should be excluded in order to more confidently ascribe hepatic injury to the drug. However, this list is not exhaustive. In practical terms, one would expect the clinician to rule out at least acute hepatitis A, B, and C, in addition to biliary problems, autoimmune and alcoholic hepatitis, and circumstances that produce ischaemic–hypoxic liver injury. There are liver diseases less frequently considered that should preferably be included in the list, such as hepatitis E; acute exacerbation of chronic hepatitis B, C, or D; worsening of non-alcoholic fatty liver disease; metabolic disorders, e.g. Wilson’s disease; vascular pathology; and oncologic liver disease [42]. In the case of hepatitis E, Dalton and co-workers have shown that 21% of patients with criterion-referenced DILI had autochthonous hepatitis E. Based on their findings, they suggest that a diagnosis of DILI is not secure without testing for HEV [43].

Previous information on hepatotoxicity of the drug

Scoring previous knowledge of drug hepatotoxicity potentials will only be reproducible if there is a common data source with continuous updating of the information, or the judgement will be largely dependent on each evaluator’s state of information [21]. This CIOMS score can be particularly difficult to assign, as the summaries of product characteristics of drugs may not be easily accessible and are not always informative enough. Many drugs have notes in their summary of product characteristics indicating some form of liver enzyme alterations. However, the degree of hepatic effects is often not detailed. The meaning of ‘hepatotoxicity potential’ is, therefore, left to the evaluator to interpret, inevitably leading to variations between assessors. Herbal and dietary supplements are even less likely to be complemented with hepatotoxicity information and the task can be more difficult due to lack of information on the active ingredient(s). For drugs with no previous information on hepatotoxicity, no points are given in the CIOMS scale. This, however, does not accurately reflect the drug’s hepatotoxicity potential. The development of an international database with an up-to-date list of hepatotoxins would facilitate DILI diagnosis and enhance its accuracy. An initial attempt was made recently, whereby a list of drugs associated with hepatotoxicity in three major DILI registries in Spain, the US, and Sweden was assembled [44].

Response to rechallenge

Drug rechallenge is arguably the most definitive means of DILI diagnosis, but is rarely performed due to the risks involved. A recent analysis of the performance of the CIOMS scale in the Spanish DILI Registry showed that rechallenge data were absent in >95% of all cases. While immunoallergic DILI reactions often produce a rapid response, drugs triggering metabolic idiosyncrasy can have highly variable and delayed responses to readministration. Interpretation of such reactions can, therefore, be difficult and potentially lead to erroneous assumptions [45]. Due to the infrequency of rechallenge, a simplification of the four response categories in the CIOMS scale’s rechallenge domain seems feasible and should be addressed.

Histological findings

The role of liver biopsy in DILI diagnosis is a controversial issue. This technique is not routinely performed for diagnostic purposes. Liver biopsy should, however, be performed when the patient may have an underlying liver injury, or to characterize injury patterns from drugs not previously implicated in DILI and to identify more severe or residual lesions that could have prognostic significance [46]. Histological information may also allow more confident exclusion of common liver conditions and subsequently strengthen a DILI diagnosis. The presence of biopsy findings suggestive of DILI, such as demarcated perivenular (acinar zone 3) necrosis, minimal hepatitis with canalicular cholestasis, poorly developed portal inflammatory reaction, abundant neutrophils, abundant eosinophils, epithelioid-cell granulomas, microvesicular steatosis, and the presence of drug deposits (vitamin A autofluorescence) should upgrade the causality assessment score [47,48].

Future prospects

A focussed effort on developing stricter definitions and conveniently standardized means of evaluating hepatotoxicity through large scale collaborative approaches would greatly improve DILI causality assessments. It is unlikely that a single instrument will be accurate and reliable for all patients with suspected toxic liver disease from any agent unless there is a dynamic weighting of the component variables. Although difficult to apply and not user-friendly, the CIOMS method is the most frequently used DILI causality scale and is applied by many DILI experts, researchers, and regulatory authorities to assess the probability of suspected causal agents. To overcome difficulties in DILI causality assessment and to more reliably ascertain causal agents, the CIOMS scale needs to be refined. Amendments in the form of applying authoritative evidence-based criteria, a simplified scoring system, and appropriate weighting given to individual parameters based on statistical evaluations with large databases, are devised to provide wider applicability in the clinical setting. This will now be feasible thanks to the many large DILI registries with large data sets established around the world. This invaluable source has brought to light the variability in DILI manifestation and consequent need for ‘tailored’ CIOMS scales. The ability to incorporate information on the known signature of a given drug and specific risk factors could strengthen the assessment. In particular, accommodating the emerging data on genetic predisposition to DILI is highly desirable. However, to manage and incorporate the ever growing data on DILI a computerized assessment scale would be the preferred option. DILIN has expressed interest in developing a web-based causality assessment system in conjunction with a comprehensive DILI database in collaboration with the National Library of Medicine [5].

The efficiency of the CIOMS scale is dependent on the amount of available data for the case. Consequently, early suspicion of DILI often leads to a low CIOMS scale probability score due to lack of data, such as response to dechallenge. The development of an abridged scale, requiring limited data, for early assessments would be an important goal in terms of faster DILI recognition and subsequent treatment for the patient [49]. Work in this area is in progress.

Conflict of interest

The authors declare that they do not have anything to disclose regarding funding or conflict of interest with respect to this manuscript.

Financial support

This study was supported, in part, by research grants from the Agencia Española del Medicamento and Fondo de Investigación Sanitaria (FIS 07-0980). CIBERehd is funded by Instituto de Salud Carlos III.

Appendix

On behalf of the Spanish Group for the Study of Drug-Induced Liver Disease. Participating clinical centres: Hospital Universitario Virgen de la Victoria, Málaga (coordinating centre): R.J. Andrade, M.I. Lucena, M. García-Cortés, A. Fernández-Castañer, Y. Borraz, E. Ulzurrun, M. Robles, C. Stephens, I. Moreno, D.L. Gualteros. Hospital Torrecárdenas, Almería: M.C. Fernández, G. Peláez, R. Daza, M. Casado, J.L. Vega, F. Suárez, M. Torres, M. González-Sánchez, J. Esteban. Hospital Universitario Virgen de Valme, Sevilla: M. Romero-Gómez, L. Grande, M. Jover, B. Prado. Hospital de Mendaro, Guipúzkoa: A. Castiella, E.M. Zapata. Hospital Germans Trias i Puyol, Barcelona: R. Planas, J. Costa, A. Barriocanal, E. Montané, S. Anzola, N. López, F. García-Góngora, A. Borras, E. Gallardo, A. Vaqué, A. Soler. Hospital Virgen de la Macarena, Sevilla: J.A. Durán, I. Carmona, A. Melcón de Dios, M. Jiménez, J. Alanís-López, M. Villar. Hospital Central de Asturias, Oviedo: R. Pérez-Álvarez, L. Rodrigo-Sáez. Hospital Universitario San Cecilio, Granada: J. Salmerón, A. Gila. Hospital Costa del Sol, Marbella (Málaga): J.M. Navarro, J.F. Rodríguez, I.M. Mendez Sánchez. Hospital Universitario Virgen de las Nieves, Granada: R. Martín-Vivaldi, F. Nogueras. Hospital Sant Pau, Barcelona: C. Guarner, G. Soriano, E.M. Román. Hospital Puerta del Mar, Cádiz: F. Díaz, M.J. Soria, P. Rendón, M. Macías. Hospital Morales Meseguer, Murcia: H. Hallal, E. García Oltra. Hospital 12 de Octubre, Madrid: T. Muñoz-Yagüe, J.A. Solís-Herruzo. Hospital Marqués de Valdecilla, Santander: F. Pons, M. Arias. Hospital Xeral-Calde, Lugo: S. Ávila-Nasi. Hospital de Donosti, San Sebastián: M. García-Bengoechea, J. Arenas, M.I. Gomez Osua. Hospital de Basurto, Bilbao: S. Blanco, P. Martínez-Odriozola. Hospital Carlos Haya, Málaga: M. Jiménez, R. González-Grande. Hospital de Puerto Real, Cádiz: J.M. Pérez-Moreno, M. Puertas. Hospital del Mar, Barcelona: R. Solá. Hospital General Básico de Vélez, Málaga: F. Santalla, C. Sánchez-Robles. Hospital General Universitario Gregorio Marañón, Madrid: R. Bañares. Hospital de Sagunto, Valencia: J. Primo, J.R. Molés. Hospital de Laredo, Cantabria: M. Carrascosa. Hospital Clínic, Barcelona: M. Bruguera, P. Gines, S. Lens. Hospital Universitario de Canarias, La Laguna, Tenerife: M. Hernandez-Guerra. Hospital Infanta Cristina, Badajoz: J.L. Montero. Hospital Puerta de Hierro, Madrid: J.L. Calleja, J. de la Revilla. Hospital Del Tajo, Aranjuez, Madrid: O. Lo Iacono. Hospital La Fe, Valencia: M. Berenguer. Hospital Doctor Peset, Valencia: A. del Val. Hospital La Línea, Algeciras, Cádiz: E. García Ruiz. Hospital Universitario Virgen de la Arrixaca, Murcia: M. Miras Lopez. Hospital de Albacete, Albacete: J.M. Moreno.

References

[1] Guidance for Industry. Drug-Induced Liver Injury: Premarketing Clinical Evaluation, July 2010.
[2] Ostapowicz G, Fontana RJ, Schiødt FV, Larson A, Davern TJ, Han SH, et al. Results of a prospective study of acute liver failure at 17 tertiary care centers in the United States. Ann Intern Med 2002;137:947–954.
[3] Wei G, Bergquist A, Broomé U, Lindgren S, Wallerstedt S, Almer S, et al. Acute liver failure in Sweden: etiology and outcome. J Intern Med 2007;262:393–401.
[4] Andrade RJ, Lucena MI, Fernández MC, Pelaez G, Pachkoria K, García-Ruiz E, et al. Drug-induced liver injury: an analysis of 461 incidences submitted to the Spanish registry over a 10-year period. Gastroenterology 2005;129:512–521.
[5] Fontana RJ, Watkins PB, Bonkovsky HL, Chalasani N, Davern T, Serrano J, et al. Drug-induced liver injury network (DILIN) prospective study rationale, design and conduct. Drug Saf 2009;32:55–68.
[6] Daly AK, Donaldson PT, Bhatnagar P, Shen Y, Pe’er I, Floratos A, et al. HLA-B*5701 genotype is a major determinant of drug-induced liver injury due to flucloxacillin. Nat Genet 2009;41:816–819.
[7] Björnsson E, Olsson R. Outcome and prognostic markers in severe drug-induced liver disease. Hepatology 2005;42:481–489.
[8] Molokhia M, McKeigue P. EUDRAGENE: European collaboration to establish a case-control DNA collection for studying the genetic basis of adverse drug reactions. Pharmacogenomics 2006;7:633–638.
[9] Andrade RJ, Lucena MI, Kaplowitz N, García-Muñoz B, Borraz Y, Pachkoria K, et al. Outcome of acute idiosyncratic drug-induced liver injury: long-term follow-up in a hepatotoxicity registry. Hepatology 2006;44:1581–1588.
[10] Fontana RJ, Seeff LB, Andrade RJ, Björnsson E, Day CP, Serrano J, et al. Standardization of nomenclature and causality assessment in drug-induced liver injury: summary of a clinical research workshop. Hepatology 2010;52:730–742.
[11] Aithal GP, Watkins PB, Andrade RJ, Larrey D, Molokhia M, Takikawa H, et al. Case definition and phenotype standardization in drug-induced liver injury. Clin Pharmacol Ther 2011;89:806–815.
[12] Kramer MS. Assessing causality of adverse drug reactions: global introspection and its limitations. Drug Info J 1986;20:433–437.
[13] Arimone Y, Bégaud B, Miremont-Salamé G, Fourrier-Réglat A, Moore N, Molimard M, et al. Agreement of expert judgment in causality assessment of adverse drug reactions. Eur J Clin Pharmacol 2005;61:169–173.
[14] Macedo AF, Marques FB, Ribeiro CF. Can decisional algorithms replace global introspection in the individual causality assessment of spontaneously reported ADRs? Drug Saf 2006;29:697–702.
[15] Rockey DC, Seeff LB, Rochon J, Freston J, Chalasani N, Bonacini M, et al. Causality assessment in drug-induced liver injury using a structured expert opinion process: comparison to the Roussel-Uclaf causality assessment method. Hepatology 2010;51:2117–2126.
[16] Zapater P, Such J, Pérez-Mateo M, Horga JF. A new Poisson and Bayesian-based method to assign risk and causality in patients with suspected hepatic adverse drug reactions: a report of two new cases of ticlopidine-induced hepatotoxicity. Drug Saf 2002;25:735–750.
[17] Karch FE, Lasagna L. Toward the operational identification of adverse drug reactions. Clin Pharmacol Ther 1977;21:247–254.
[18] Naranjo CA, Busto U, Sellers EM, Sandor P, Ruiz I, Roberts EA, et al. A method for estimating the probability of adverse drug reactions. Clin Pharmacol Ther 1981;30:239–245.
[19] García-Cortés M, Lucena MI, Pachkoria K, Borraz Y, Hidalgo R, Andrade RJ, et al. Evaluation of Naranjo adverse drug reactions probability scale in causality assessment of drug-induced liver injury. Aliment Pharmacol Ther 2008;27:780–789.
[20] Stricker BHC. Drug-induced hepatic injury. 2nd ed. Elsevier: Amsterdam; 1992.
[21] Danan G, Benichou C. Causality assessment of adverse reactions to drugs-I. A novel method based on the conclusions of international consensus meetings: application to drug-induced liver injuries. J Clin Epidemiol 1993;46:1323–1330.
[22] Maria VA, Victorino RM. Development and validation of a clinical scale for the diagnosis of drug-induced hepatitis. Hepatology 1997;26:664–669.
[23] Takikawa H, Takamori Y, Kumagi T, Onji M, Watanabe M, Shibuya A, et al. Assessment of 287 Japanese cases of drug induced liver injury by the diagnostic scale of the International Consensus Meeting. Hepatol Res 2003;27:192–195.
[24] Benichou C. Criteria of drug-induced liver disorders. Report of an international consensus meeting. J Hepatol 1990;11:272–276.
[25] Benichou C, Danan G, Flahault A. Causality assessment of adverse reactions to drugs-II. An original model for validation of drug causality assessment methods: case reports with positive rechallenge. J Clin Epidemiol 1993;46:1331–1336.
[26] Rochon J, Protiva P, Seeff LB, Fontana RJ, Liangpunsakul S, Watkins PB, et al. Reliability of the Roussel Uclaf Causality Assessment Method for assessing causality in drug-induced liver injury. Hepatology 2008;48:1175–1183.
[27] Lucena MI, Camargo R, Andrade RJ, Pérez-Sánchez C, Sánchez De La Cuesta F. Comparison of two clinical scales for causality assessment in hepatotoxicity. Hepatology 2001;33:123–130.
[28] Watanabe M, Shibuya A. Validity study of a new diagnostic scale for drug-induced liver injury in Japan-comparison with two previous scales. Hepatol Res 2004;30:148–154.
[29] Stephens C, Lopez-Nevot MI, Lucena MI, Ruiz-Cabello F, Ulzurrun E, Cabello MC, et al. The HLA class I B*1801 allele influences hepatocellular expression of amoxicillin–clavulanate liver damage and outcome in Spanish patients. J Hepatol 2010;52:S439.
[30] Watkins PB. Biomarkers for the diagnosis and management of drug-induced liver injury. Semin Liver Dis 2009;29:393–399.
[31] Aithal GP, Rawlins MD, Day CP. Clinical diagnostic scale: a useful tool in the evaluation of suspected hepatotoxic adverse drug reactions. J Hepatol 2000;33:949–952.
[32] Tajiri K, Shimizu Y. Practical guidelines for diagnosis and early management of drug-induced liver injury. World J Gastroenterol 2008;14:6774–6785.
[33] Hayashi PH. Causality assessment in drug-induced liver injury. Semin Liver Dis 2009;29:348–356.
[34] Shapiro MA, Lewis JH. Causality assessment of drug-induced hepatotoxicity: promises and pitfalls. Clin Liver Dis 2007:477–505.
[35] Lucena MI, Andrade RJ, Fernández MC, Pachkoria K, Pelaez G, Durán JA, et al. Determinants of the clinical expression of amoxicillin–clavulanate hepatotoxicity: a prospective series from Spain. Hepatology 2006;44:850–856.
[36] Lucena MI, Andrade RJ, Kaplowitz N, García-Cortes M, Fernández MC, Romero-Gomez M, et al. Phenotypic characterization of idiosyncratic drug-induced liver injury: the influence of age and sex. Hepatology 2009;49:2001–2009.
[37] Fairley CK, McNeil JJ, Desmond P, Smallwood R, Young H, Forbes A, et al. Risk factors for development of flucloxacillin associated jaundice. BMJ 1993;306:233–235.
[38] Nolan CM, Goldberg SV, Buskin SE. Hepatotoxicity associated with isoniazid preventive therapy: a 7-year survey from a public health tuberculosis clinic. JAMA 1999;281:1014–1018.
[39] Chalasani N, Björnsson E. Risk factors for idiosyncratic drug-induced liver injury. Gastroenterology 2010;138:2246–2259.
[40] Andrade RJ, Robles M, Ulzurrun E, Lucena MI. Drug-induced liver injury: insights from genetic studies. Pharmacogenomics 2009;10:1467–1487.
[41] Lucena MI, Andrade RJ, Vicioso L, González FJ, Pachkoria K, García-Muñoz B. Prolonged cholestasis after raloxifene and fenofibrate interaction: a case report. World J Gastroenterol 2006;12:5244–5246.
[42] Verma S, Kaplowitz N. Diagnosis, management and prevention of drug-induced liver injury. Gut 2009;58:1555–1564.
[43] Dalton HR, Fellows HJ, Stableforth W, Joseph M, Thurairajah PH, Warshow U, et al. The role of hepatitis E virus testing in drug-induced liver injury. Aliment Pharmacol Ther 2007;26:1429–1435.
[44] Suzuki A, Andrade RJ, Björnsson E, Lucena MI, Lee WM, Yuen NA, et al. Drugs associated with hepatotoxicity and their reporting frequency of liver adverse events in VigiBase™: unified list based on international collaborative work. Drug Saf 2010;33:503–522.
[45] Andrade RJ, Robles M, Lucena MI. Rechallenge in drug-induced liver injury: the attractive hazard. Expert Opin Drug Saf 2009;8:709–714.
[46] Andrade RJ, Robles M, Fernández-Castañer A, López-Ortega S, López-Vega MC, Lucena MI. Assessment of drug-induced hepatotoxicity in clinical practice: a challenge for gastroenterologists. World J Gastroenterol 2007;13:329–340.
[47] Kleiner DE. The pathology of drug-induced liver injury. Semin Liver Dis 2009;29:364–372.
[48] Larrey D. Epidemiology and individual susceptibility to adverse drug reactions affecting the liver. Semin Liver Dis 2002;22:145–155.
[49] Kaplowitz N. Causality assessment versus guilt-by-association in drug hepatotoxicity. Hepatology 2001;33:308–310.
