Computational predictive programs (expert systems) in toxicology


ELSEVIER

Toxicology 119 (1997) 213-225


Emilio Benfenati a,*, Giuseppina Gini b

a Istituto di Ricerche Farmacologiche ‘Mario Negri’, Via Eritrea 62, 20157 Milano, Italy
b Dipartimento di Elettronica e Informazione, Politecnico di Milano, 20133 Milano, Italy

Received 19 June 1996; accepted 24 January 1997

Abstract

The increasing number of pollutants in the environment raises the problem of evaluating the toxicological risk of these chemicals. Several so-called expert systems (ES) have been claimed to be able to predict the toxicity of certain chemical structures. These ES currently follow different approaches: some are based on explicit rules derived from the knowledge of human experts who compiled lists of toxic moieties - for instance the programs called HazardExpert and DEREK - while others rely on statistical approaches, as in the CASE and TOPKAT programs. Here we describe and compare these and other intelligent computer programs because of their utility in obtaining at least a first rough indication of the potential toxic activity of chemicals. © 1997 Elsevier Science Ireland Ltd.

Keywords: Toxicity; Expert systems; Artificial intelligence; Computer systems; QSAR

Abbreviations: AI, artificial intelligence; CASE, Computer Automated Structure Evaluation (a research program); COMPACT, Computer Optimized Molecular Parametric Analysis of Chemical Toxicity (a research program); DEREK, Deductive Estimation of Risk from Existing Knowledge (a commercial program); ES, expert systems; HazardExpert, a commercial program; MULTICASE, the improved version of CASE; NMR, nuclear magnetic resonance; NTP, National Toxicology Program (US); QSAR, quantitative structure-activity relationship; TOPKAT, Toxicity Prediction by Komputer Assisted Technology (a commercial program).

* Corresponding author. Tel.: +39 2 39014420; fax: +39 2 39001916; e-mail: [email protected].

1. Introduction

1.1. The difficulties of establishing the toxicity of chemicals

A large number of chemicals are currently used, and many others are in development. An urgent question concerns their involvement in potential negative effects on human health, also when they enter the environment. This is a complex question requiring careful consideration on account of the risk to exposed populations.

0300-483X/97/$17.00 © 1997 Elsevier Science Ireland Ltd. All rights reserved. PII S0300-483X(97)03631-7


Furthermore, compounds undergo transformations during their life, generating new compounds, which means that new information must be obtained about the toxicity of the transformation products. Toxicity can be defined as any harmful effect of a chemical on a target organism. A key point is that the extent of toxicity is dose-linked. A large battery of different toxicological studies is needed to assess toxicity, and there are many variables, such as the adverse effects considered, the animals used for the study, the dose and the route of exposure. Biochemically the picture is equally complex, because toxicity includes different mechanisms and activities. Thus, a toxic substance may directly reach the target site, or the toxic effect may be mediated by one or more steps, involving activation of a protein receptor, directly or after the transformation of the compound into a specific metabolite.

To identify chemicals capable of inducing toxicity, and possibly to limit the incidence of human cancers and other diseases, rodent bioassays are today the mainstay. However, this approach raises several problems (Omenn, 1995):
1. the cost of the assay (> 1 million US dollars per chemical);
2. the time needed for the tests (3-5 years);
3. public pressure to reduce or eliminate the use of animals in research and testing.

Surrogate tests to predict carcinogens are often used to overcome these problems, even for regulatory purposes, but questions have been raised as to their validity.

1.2. The computer approach

Fig. 1. A. The classical relationship between rules and molecules. B. The relationship between empirical physicochemical parameters and toxic activities. C. The direct relationship between theoretical molecular descriptors and toxic properties.

Decades ago chemists investigated the effects of particular groups in a family of molecules on particular properties of members of this family. A famous example is the influence of substituents on the pKa of benzoic acids: the values vary in relation to the group present in the ring beside the carboxy group, and it may be possible to predict them with reasonable confidence. Spectral characteristics of molecules, such as UV and NMR data, can be inferred similarly on the basis of substituents on a particular structure. These relationships are illustrated by Fig. 1A. In the seventies the rapid development of ecotoxicology was supported by the discovery - actually originating from older data - that a certain amount of the toxicity observed in animals and plants may be explained on the basis of the coefficient of partition between octanol and water (Nendza et al., 1990). The rationale for this stems from the fact that a compound must enter the cell

in order to show biological activity, and compounds that tend to stay in octanol have more affinity for the membrane and enter the cell more easily than more hydrophilic compounds. Many algorithms to explain toxic effects have been proposed for different situations where homogeneous classes of chemicals produced different activities (examples are given in the book edited by Karcher and Devillers, 1990). However, most of these algorithms are suitable only for predictions within the structure space spanned by the set of compounds used to build the model: thus, to quote Alexandre Dumas, all generalizations are dangerous, even this one. The relationships between chemical and biological properties can often be sketched out as in Fig. 1B.

Theoretical descriptors of chemicals have also been proposed, with more general aims. Some examples are molecular volume, dipole moment, molecular shape, steric factors and electronegativity (Dearden, 1990). Weighted holistic invariant molecular descriptors have also been proposed (Todeschini et al., 1995). This parallels other applications, as previously mentioned, where empirical parameters were used. This approach can be visualized in Fig. 1C. For theoretical descriptors the interest lies in parameters that also underlie some of the empirical parameters considered in many toxicological and ecotoxicological studies. In the future theoretical descriptors are likely to be used more extensively. All these studies will help researchers working on toxicity and the environment.

There are thus three parts to the question of predicting the toxicity of chemicals:
1. the chemical part;
2. the biological part (activities);
3. the relationship between 1 and 2 (algorithm).

Attempts have been made to model non-homogeneous sets of compounds, which is of course more difficult than explaining the properties of analogues. As regards the chemical part of the problem there is a shift from empirical towards intrinsic, theoretical parameters.
This is more apparent in the field of quantitative drug design, and can be easily seen by reading scientific journals such as the ‘Journal of Medicinal Chemistry’ and ‘Quantitative Structure-Activity Relationships’. Computer programs are widely used in this field; they differ, however, from those described here in at least two main respects: programs used for drug design are generally deterministic, and they are applied mainly to the study and prediction of pharmaceutical activities. A list and a description of many programs for computer chemistry can be found at the address given in the notes.

Another shift is from a view that emphasizes the presence of certain molecular fragments towards a global approach. In the first case it is assumed that the contributions of the individual parts of the molecule are additive, while in the second case holistic characteristics are considered. This is reflected in the different computer programs, as we discuss later.

As regards the biological part, many different activities have been considered, involving different mechanisms. As regards the relationships between the chemical and the biological part of the problem, the tendency is towards more complex algorithms. We use the word ‘relationship’ not necessarily to mean causality, as might be suggested if we identify the chemical part as ‘properties’ causing a biological effect: to determine causality we have to know the exact mechanism of the biological activity. This has been pointed out by other authors (Mumtaz et al., 1995; Enslein et al., 1994). We focus our discussion here on some of these more or less complex algorithms. For further discussion the reader is referred to a few recent reviews (Combes and Judson, 1995; Richard, 1994; Lewis, 1992).
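As a minimal sketch of the kind of structure-activity relationship discussed in this section, the following Python fragment fits a straight line relating a toxicity measure to log P and declines to predict outside the range spanned by the training compounds. The data values, the toxicity scale and all function names are illustrative assumptions; they are not taken from any of the programs or studies cited.

```python
from statistics import mean

# Hypothetical training data (assumed for illustration, NOT measured values):
# log P of each compound and a toxicity score on an arbitrary scale.
LOG_P = [1.0, 2.0, 3.0, 4.0]
TOXICITY = [1.5, 2.5, 3.5, 4.5]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


SLOPE, INTERCEPT = fit_line(LOG_P, TOXICITY)


def in_domain(log_p):
    """True if log P lies inside the range spanned by the training set."""
    return min(LOG_P) <= log_p <= max(LOG_P)


def predict(log_p):
    """Predict the toxicity score, or return None outside the training range."""
    if not in_domain(log_p):
        return None
    return SLOPE * log_p + INTERCEPT
```

Returning `None` outside the training range is one simple way to express the warning above that such models should not be generalized beyond the compounds used to build them.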

1.3. The use of ES

The use of ES to predict toxicity can provide an indication as to whether pollutants have the potential for inducing toxicity. An ES is a computer program that provides solutions to important problems similar to those obtained by human experts. Generally ES have different sections: data and rule acquisition, a reasoning system, and a rule generator. An ES is
1. heuristic - it can reason both with theories and with expert knowledge;
2. transparent - it explains its line of reasoning;
3. flexible - it can integrate new knowledge into its existing base.

In the standard ES knowledge is incorporated in the so-called ‘rules’ in the form of logical expressions:

if < condition > then < action >

A complete set of rules covering the entire domain of a given problem (problem space) should be sufficient to answer all questions raised within that domain. During the operation of the ES the set of rules (which is understandably large) is inspected by the inference engine. Using different programming and logical mechanisms, the inference engine in theory enables the ES to detect all possible connections between the rules, the facts, and the solutions (answers). In practice, the search may be too wide and suitable search strategies have to be chosen; this affects the results of the program. The advantage of rule-based systems is that the set of rules - which are independent - can be extended to deal with incomplete knowledge. The main disadvantage of rule-based ES is that it is very hard to obtain a complete set of rules and that control knowledge is not easily integrated. The priority of rules can be artificially set by the programmer. If no priority is set the search through the entire knowledge base can be prohibitively long, but if priorities are established there is always the risk that solutions may be biased by outside knowledge forced on the rules.

ES have been used in biomedical and chemical applications since the beginning of artificial intelligence (AI) (Lindsay et al., 1980; Buchanan and Shortliffe, 1984; Gray, 1988). In the last decade or so a few ES for toxicological studies have been reported. We summarize here the major studies published so far, examining the ES proposed and comparing them. Table 1 presents major characteristics of these ES.
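The rule format and the role of priorities described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Rule` class, the `infer` function, the fact names and the two example rules are hypothetical and do not reproduce the inference engine of any ES discussed in this paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, bool]


@dataclass
class Rule:
    name: str
    condition: Callable[[Facts], bool]  # the "if" part
    action: str                         # the "then" part (a conclusion)
    priority: int = 0                   # higher values are inspected first


def infer(rules: List[Rule], facts: Facts) -> List[str]:
    """Inspect rules in priority order; report every rule that fires."""
    explanation = []
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if rule.condition(facts):
            # Keeping the rule name makes the line of reasoning transparent.
            explanation.append(f"{rule.name}: {rule.action}")
    return explanation


# Hypothetical rule base, for illustration only.
RULES = [
    Rule("R2", lambda f: f.get("epoxide", False),
         "possible carcinogenicity", priority=1),
    Rule("R1", lambda f: f.get("aromatic_nitro", False),
         "possible mutagenicity", priority=2),
]

report = infer(RULES, {"aromatic_nitro": True, "epoxide": False})
```

Because the rules are independent, new knowledge can be added by appending another `Rule` to the list without touching the existing ones, which is the flexibility mentioned in point 3.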


2. The ‘philosophy’ of the different ES for toxicology

2.1. Rule-based human-derived ES

Classically ES create rules from human knowledge, obtained through experts. There is an example of this process in the paper by Jelovsek et al. (1990) presenting the procedure they used to formulate rules from the expertise of selected toxicologists. A major problem is how to make explicit the implicit knowledge that experts have developed. Jelovsek et al. (1990) used the interview, the most frequently used method for knowledge acquisition. One important feature of rule-based ES is that they are well-suited to account for uncertain or unknown information, which is common in toxicology.

A typical easy approach, used by human experts when evaluating an unknown chemical, is to look at its similarity with other molecules. However, the concept of similarity is subjective and what is mainly picked up is the presence of certain reactive groups. For instance, Ashby and Paton (1993) listed many toxic residues responsible for adverse activity. CompuDrug started from this approach and encoded into its ES, called HazardExpert, the behaviour of selected residues based on a report by the U.S. Environmental Protection Agency. The system searches for these structural alerts and, on finding them, gives an overall possibility of toxicity. To enhance the efficiency of the system, there are built-in modules which can predict the dissociation constant (pKa) and the distribution coefficient (log P). These can be used to predict the bioabsorption and accumulation of xenobiotics in living organisms, in addition to oncogenicity, mutagenicity, teratogenicity, irritation, sensitization, immunotoxicity and neurotoxicity. HazardExpert examines the compound itself as well as potential metabolites, using modules that generate the potential metabolites. This may help in assessing the overall activity.
Sanderson and Earnshaw (1991) used the relationship ‘if ... then’, introducing a series of substructures known to be toxic into the rule base of a system called DEREK (Deductive Estimation of Risk from Existing Knowledge), which then recognizes any such residues in the compound examined. DEREK is updated by the DEREK Collaborative Group, which is made up of representatives from agrochemical, pharmaceutical and regulatory organisations. An update of the program has been presented (Ridings et al., 1996). DEREK makes qualitative rather than quantitative predictions. It looks for previously characterized toxicophores, which are highlighted in the display together with their associated toxic activity. The presence of several toxicophores in the molecule means there are more risks, but whether the risks are additive or not is decided by the user. DEREK also takes into account physico-chemical properties such as log P and pKa. There are several toxicological endpoints including mutagenicity, carcinogenicity, skin sensitization, irritation, reproductive effects, neurotoxicity and others.

Table 1
The characteristics and differences between the ES
Systems compared (columns): HazardExpert; DEREK; By Nakadate; TOPKAT; CASE; By Malacarne; COMPACT
Source of knowledge: Human; Statistical; Mechanistic
Molecular elements considered: Toxic residues; Inhibiting residues; Global parameters; Metabolites
Activities: Any; Carcinogenicity; Mutagenicity; Other
Advantages: Any activity; No training set required; Information on mechanism
Disadvantages: Fixed activities; Training set required
(The X marks assigning each characteristic to a system cannot be recovered from the flattened layout.)

Payne and Walsh (1994) developed quantitative structure-activity relationships (QSAR) for dermal toxicity, to be incorporated as rules into the DEREK system. The sources of the DEREK rules are varied and have been updated to include structural features identified by other ES, such as CASE and TOPKAT (Ridings et al., 1996), making an interesting case of off-line rule generation and exchange between ES. Other information on DEREK is available through the Internet (see Notes).

Nakadate et al. (1991) used a system in which the ES itself produces the rules on toxicity. The starting point of this system is a fact database, storing data on mutagenicity (mainly), carcinogenicity and teratogenicity. Data are processed in a data modification module, which also generates substructural fragments. The modified data are then used by the rule-making support module.
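The search for structural alerts described for HazardExpert and DEREK can be illustrated, very roughly, as a lookup of known fragments in a textual description of the molecule. The sketch below is an assumption-laden toy: it matches substrings in a SMILES string, which is not true substructure matching (a real system works on the molecular graph), and the alert table and endpoint labels are hypothetical examples rather than rules from either program.

```python
# Hypothetical alert table: fragment text -> endpoint flagged.
# Entries are illustrative only, not rules from HazardExpert or DEREK.
ALERTS = {
    "N(=O)=O": "mutagenicity",
    "N=N": "carcinogenicity",
}


def find_alerts(smiles: str) -> dict:
    """Return {fragment: endpoint} for every alert text found in the SMILES.

    Plain substring search is an assumption made for brevity; it misses
    fragments written with a different atom order or ring notation.
    """
    return {frag: endpoint for frag, endpoint in ALERTS.items() if frag in smiles}


# Nitrobenzene written as a SMILES string.
hits = find_alerts("c1ccccc1N(=O)=O")
```

The function only reports which alerts were found, so the output is qualitative; how several alerts should be combined is left to the user, as in DEREK.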


2.2. ES using statistical procedures

Other methods use statistical approaches. The TOPKAT (Toxicity Prediction by Komputer Assisted Technology) program uses QSAR principles, involving statistical methods such as multiple linear regression equations and two-group linear discriminant functions (Enslein et al., 1994; Gombar et al., 1993). Structural descriptors such as electronic, connectivity, shape and substructure descriptors are used. It is a modular integrated package, each module consisting of a QSAR model and a database. It performs automatic rule induction. The program uses two functions, SEARCH and COVER. SEARCH is used to scan the training database for molecules with substructures overlapping with the compound to be studied. COVER then reports the fragments considered. The scores for the descriptors of interest for a particular compound are algebraically added up, so negative values indicate a low probability of toxic effect. TOPKAT can estimate the statistical significance of the results, improving the robustness of the program. The authors claim that TOPKAT enables users to establish whether the predicted value is meaningful or not. This test is essential since TOPKAT, like the more common QSAR methods, relies on the original database used for development of the model.

The approach used by Klopman and coworkers considers a list of chemicals, characterized as regards their toxic effects (Klopman, 1984; Rosenkranz and Klopman, 1990a,b; 1993; Klopman and Rosenkranz, 1992; 1994). Their system, initially called CASE (Computer Automated Structure Evaluation), was then upgraded to a more powerful system called MULTICASE, improving the statistical treatment of the data. CASE generates all the possible substructures of a defined dimension (number of non-hydrogen atoms) in each compound and finds substructures characteristic of the toxic activity on the basis of the recurrence of the residue within the series of toxic compounds.
Similarly, inactivating substructures present in inactive chemicals are identified. Given an original training set, the descriptors at the basis of CASE are generated automatically in an unbiased way. There is a single CASE program, but many prediction models, using the different training sets chosen by the user. This means that the knowledge derived depends on the training set. CASE makes a Bayesian determination of total probability, presents confidence levels, and reports the fragments generated; prediction is based on the ‘known’ fragments, obtained from the training set; however, a warning message identifies any unknown residues. CASE does not consider molecular descriptors or other holistic parameters, whereas TOPKAT relies more on physico-chemical descriptors. Richard (1994) thoroughly compared TOPKAT and CASE.

Klopman and coworkers have presented tens of papers on applications of CASE, making it the most widely described ES for toxicity predictions. CASE, with its general applicability, has been used for several different classes of chemicals and for various biological and toxicological activities. Table 2 lists several examples. Unfortunately, almost all these papers are from the same group, which owns the program. This prompted an Italian group to partially and independently replicate the CASE approach (Malacarne et al., 1993).

2.3. ES encoding mechanistic processes

Another system is COMPACT (Computer Optimized Molecular Parametric Analysis of Chemical Toxicity), which is based on specific molecular characteristics that should indicate whether the chemical will interact with specific families of the cytochrome P450 superfamily (Lewis, 1992; Lewis et al., 1993). The rationale is that much of the metabolic activation of chemicals to toxic electrophiles is the result of metabolic oxygenation by these enzymes. COMPACT has been designed to identify indirectly-acting carcinogens, and this specificity is a limitation.
COMPACT recognizes potential toxic action mainly from the following molecular parameters, alone or combined: a high degree of planarity, the difference between the energies of the lowest empty molecular orbital and the highest occupied molecular orbital, and the collision diameter. Fig. 2 shows the scheme of the COMPACT process and the importance of the mechanistic approach according to COMPACT. Compared to the other ES, this is a conceptually different approach, which tends to consider the mechanism of the toxic action. Such approaches are related to the known characteristics of biochemical processes, so these systems should greatly improve as knowledge of the biochemical pathways and of the structure of the macromolecules involved increases. The other ES try to derive the activities of chemicals directly from their structure, ignoring the biochemical processes underlying the activity.

[Fig. 2, panel A: flow scheme with boxes ‘Electronic structure’, ‘Comparison with training set’ and ‘Prediction’]

Table 2
Chemicals and biological and toxicological activities studied with CASE

Chemicals                           References
Polycyclic aromatic hydrocarbons    Klopman, 1984
Flavonoids                          Klopman and Dimayuga, 1988
Phenylazoaniline dyes               Rosenkranz and Klopman, 1989
Retinoids                           Klopman and Dimayuga, 1990
Phytoalexins                        Rosenkranz and Klopman, 1990b
Natural pesticides                  Rosenkranz and Klopman, 1990c
Dioxins                             Rannug et al., 1991
Nitroarenofurans                    Mersh-Sundermann et al., 1994

Biological and toxicological activities    References
Carcinogenicity                            Rosenkranz and Klopman, 1990a,b,c; Klopman and Rosenkranz, 1994
Mutagenicity                               Rosenkranz and Klopman, 1989; 1990b; Klopman and Rosenkranz, 1992; 1994
Clastogenicity                             Rosenkranz and Klopman, 1992
Nephrotoxicity                             Rosenkranz et al., 1991
Cytoprotection                             Klopman and Srivastava, 1990
Cytogenotoxicity                           Rosenkranz et al., 1991
Enzymatic activities                       Klopman and Buyukbingol, 1988; Klopman and Dimayuga, 1988

Fig. 2. A. The scheme of the COMPACT process. B. Major importance for COMPACT relies on planarity and electronic structure. The molecule, drawn as a rectangle, has to fulfil dimensional and electronic criteria.

The assumption is that a chemical’s activity is written in its formula, implying that the biochemical process is constant. This may be an oversimplification, of course, especially in cases where the same chemical can induce different activities in different animal species, due to different biochemical processes.

3. Structural characteristics and robustness

3.1. The program architecture and validation

Generally all the ES used in toxicology lack a description of the software. Indirectly, this can be judged from the fact that most papers on these systems appeared in journals on toxicology and, in some cases, in chemistry. On the other hand, a major difference between the various ES is the internal structure of the inference engine, and this is not generally described. Thus, comparison must be based only on claims of performance. Most of the papers have been presented by the authors of the systems themselves, and this has given rise to some scepticism in the scientific community about the results. However, some of these systems have been bought by companies or are used by environmental agencies.

Validation is sometimes described by the authors (see Table 3). For ES based on statistical analysis it is easier to evaluate the system internally. This has been done by Enslein et al. (1994). To verify the performance of the ES, CASE considers a set of compounds left out of the training set: for instance, five polycyclic aromatic hydrocarbons were in the test set, while the training set comprised 38 compounds (Klopman, 1984). The results were good. Similarly, Malacarne et al. (1993) used 80% of the chemicals as a training set and 20% as a test set. Nakadate et al. (1991) evaluated their system by the elimination method. DEREK was assessed by comparing the results given by the program with those available from the NTP (44 compounds) (Sanderson and Earnshaw, 1991). The results presented in papers by the authors give a general idea of the performances of the different programs, and only in a few cases was prediction poor (Sanderson and Earnshaw, 1991; Enslein et al., 1994). Generally the results were good.

Table 3
Validation procedures used for the ES

ES                  Validation procedure                        References
TOPKAT              Resubstitution test and cross-validation    Enslein et al. (1994)
CASE                Test set                                    Klopman, 1984
Malacarne et al.    Test set                                    Malacarne et al. (1993)
DEREK               External test set, NTP                      Sanderson and Earnshaw, 1991

Results can be evaluated in terms of accuracy, sensitivity and specificity. Accuracy is the ratio between the number of correct assignments (active and inactive) and the total number of compounds checked. Sensitivity is the ratio between the active compounds correctly assigned and the total active compounds, and specificity is the corresponding ratio for inactive compounds (Omenn, 1995; Lewis, 1992). Very few comparisons have been made of different ES in toxicology. Most of the papers presented by the authors of the different ES claim good predictions, often better than 90%. Omenn (1995) reported the results of predictions on 44 chemicals made by some human experts and different computer programs. Table 4 compares the results of the ES with the best human results. Some systems did little better than random. This also demonstrates that for the time being no ES can do better than a good human expert. However, this in itself is a challenge to improve our approach.

Table 4
ES and human expert predictions for the toxicology of 44 chemicals

Expert           Accuracy    Percentage
Human experts    30/40       75
DEREK            22/37       59
TOPKAT           14/24       58
COMPACT          19/35       54
CASE             17/35       49

3.2. Hardware

Some ES in toxicology run on personal computers, while others need more powerful machines. Updated details can be obtained from the authors and companies (see Notes).
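The accuracy, sensitivity and specificity measures defined in Section 3.1 can be computed directly from the counts of correct and incorrect assignments. The following sketch assumes a simple two-class outcome (active or inactive); the function name and the example counts are hypothetical and are not taken from any of the validation studies cited.

```python
def evaluate(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, sensitivity and specificity from prediction counts.

    tp: active compounds predicted active     fn: active compounds predicted inactive
    tn: inactive compounds predicted inactive fp: inactive compounds predicted active
    Assumes at least one active and one inactive compound were checked.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,     # correct assignments / all compounds
        "sensitivity": tp / (tp + fn),     # correct actives / all actives
        "specificity": tn / (tn + fp),     # correct inactives / all inactives
    }


# Hypothetical counts for a test set of 40 compounds (25 active, 15 inactive).
metrics = evaluate(tp=15, tn=10, fp=5, fn=10)
```

Reporting all three values matters because a system can reach a respectable accuracy while still missing many active compounds, which shows up only in the sensitivity.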

4. Differences between ES

4.1. Learning or human knowledge

A first difference between various ES is that some are able to generate rules themselves, while others rely on human experts, judged to be more reliable. The advantage of the first approach is that it offers a more general tool, able to process chemicals whose activity has been characterized without needing the presence of specific identified toxic moieties. Thus, for instance, CASE could be used to inspect a new class of compounds, eliciting rules itself. On the other hand, the other systems contain knowledge (from experts) that is presumably deeper and more varied than the simple statement on toxicity, and includes, for instance, the risk from the potential presence of toxic impurities (though these are special cases, difficult to catalogue), or the possible release or formation of toxic degradation products.

4.2. Metabolism

More recent versions of CASE (Klopman and Rosenkranz, 1994) are able to deal with biodegradation, a feature also offered, in a more sophisticated way, by an ES developed by Darvas, called MetabolExpert, which is able to predict metabolites, whose toxicity can then be investigated. HazardExpert also considers metabolites using a simpler version of MetabolExpert, which can be used to obtain more information, such as the second generation of metabolites.

4.3. Inactivating fragments

Another difference is related to inactivating moieties. The general rule stated above, if < presence of toxic residue > then < toxic activity >, may be enough to describe the risk, and at first sight there are only two categories: activity or lack of activity (due to the absence of the active residue). However, a more sophisticated analysis detects inactivating moieties: for instance, the introduction of sulphonyl residues in particular sites of azo dyes reduces their carcinogenicity and mutagenicity. The CASE approach introduces this possibility through the biophobe, a fragment that inhibits activity. TOPKAT too introduces fragments with a negative value in the algorithm, which counteract the toxic activity. The DEREK system, on the other hand, does not consider inactivating residues. This is probably one of the reasons for the false positives reported by Sanderson and Earnshaw (1991) for the initial version of DEREK.

4.4. The holistic approach

As seen above, some systems consider only specific active fragments in the molecule, while others consider antagonistic residues as well. A further problem is how to deal with activities from more than one fragment. Some systems simply use the highest activity; DEREK, however, leaves the choice to the user. Another approach is to consider the whole molecule. An example of this holistic view of the problem is COMPACT, where the surface of the compound is evaluated, as well as other parameters. Another parameter found to be related to toxicity in many cases is log P. Of course, it is difficult to believe that any single parameter can disclose the different mechanisms underlying the different toxic activities. Toxicologists are alerted by the presence of certain moieties and organic chemists know that certain groups can produce a specific reaction. However, theoretical chemists reply that different molecules behave in different ways and that specific reactivities can be encoded and described by specific holistic parameters, such as charge distribution.

4.5. Quantification of the activities

Another point is quantification of the toxic activity. Sanderson and Earnshaw were interested in a purely qualitative system (Sanderson and Earnshaw, 1991). Similarly, Malacarne et al. considered only qualitative results because their software can process only categorical outcomes (Malacarne et al., 1993). The first release of CASE could not assess the potency of chemicals and did not provide a quantitative correlation between activity and descriptors (Klopman, 1984); however, more recent papers do contain quantitative results (Rosenkranz and Klopman, 1994). HazardExpert is also able to provide a quantitative assessment of the different activities, or at least to indicate several categories of toxicity.
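The different ways of combining fragment contributions discussed in Sections 4.3 and 4.4 (adding positive and negative values, or simply taking the highest activity) can be contrasted in a short sketch. The fragment names echo the azo dye example above, but the numerical weights and function names are hypothetical assumptions chosen for illustration; they are not values used by CASE, TOPKAT or any other program.

```python
# Hypothetical fragment contributions (illustrative values only):
# positive = activating fragment, negative = inactivating fragment.
CONTRIBUTIONS = {
    "azo": 2.0,
    "nitro": 1.5,
    "sulphonyl": -1.0,
}


def additive_score(fragments):
    """Algebraic sum of contributions; inactivating fragments lower the score."""
    return sum(CONTRIBUTIONS.get(f, 0.0) for f in fragments)


def highest_activity(fragments):
    """Use only the most active fragment, ignoring all the others."""
    return max((CONTRIBUTIONS.get(f, 0.0) for f in fragments), default=0.0)


sulphonated_azo_dye = ["azo", "sulphonyl"]
additive = additive_score(sulphonated_azo_dye)    # inactivating fragment counted
highest = highest_activity(sulphonated_azo_dye)   # inactivating fragment ignored
```

With these assumed weights the additive scheme gives a lower score than the highest-activity scheme for the same molecule, which illustrates how a system that ignores inactivating residues may tend towards false positives.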


4.6. The dimension of the substructure

5. Discussion

The dimension of the substructure considered is another difference. Malacarne et al. (1993) limited the maximum size of the fragments considered to eight heavy atoms, to simplify the computational task. They argued that bigger fragments are statistically less significant because they are found in too few compounds. CASE, however, in one of its first examples, considered subunits containing between three and 12 interconnected heavy atoms; the authors stated that it was not unusual to find that relevant keys extend over eight to ten atoms (Klopman, 1984). More recently, studying PAH, a 14-atom fragment was identified (Rannug et al., 1991). DEREK too considers structures bigger than eight heavy atoms; for instance tetrachloro-dibenzodioxin has 18 non-hydrogen atoms.

A variety of different approaches are used for ES on toxicology. Some systems are quite developed, while others are still in their early stages, and this is reflected in their extension. A major drawback is the validity of the database itself. This may affect the results, as the authors sometimes noticed when they used different databases (Malacarne et al., 1993). Even in simpler experiments, such as determination of the log P, results found in the literature may differ by two units, which is a high value, since this is a logarithmic number (Fielding, 1992). ES can be a valid tool to compensate experimental uncertainty in the case of log P, that relies directly on the chemico-physical characteristics of the molecule. More difficult is the case of biological activity, where many factors are involved in the final activity. Toxicology is still hampered by a basic lack of knowledge and there are many unanswered questions. For instance, is there a threshold for carcinogenic compounds? How can we calculate the effects at low doses? How can we extrapolate results from animals to humans, and from microorganisms to humans? These uncertainties still restrict the potency of the ES. A particular problem is the nature and evaluation of the information. In several cases experts pay special attention to some data and overlook others, because they know from experience which data are most reliable. Sometimes their experience is concentrated on certain aspects of the problem. As a consequence, different experts will give different answers. There are numerous debates in toxicology, like in other fields. Once the limits of current knowledge in toxicology are recognized, the challenge is open to study the open questions. In any case, the knowledge is partial but not necessarily incorrect one, so it does contain ‘pieces’ of true knowledge which are currently valid. 
From a more methodological point of view, given these limitations, the different databases may contain some contradictory data, as Malacarne et al. (1993) did indeed find in a limited number of cases.
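A check for such contradictions between two databases can be sketched as follows. This is a minimal illustration, assuming each database is a simple mapping from a compound identifier to an activity label; the function name `find_conflicts`, the identifiers and the labels are all invented for the example and do not come from the NTP or CPDB databases.

```python
def find_conflicts(db_a, db_b):
    """Return compounds present in both databases with different labels."""
    shared = db_a.keys() & db_b.keys()
    return {
        name: (db_a[name], db_b[name])
        for name in sorted(shared)
        if db_a[name] != db_b[name]
    }


# Made-up entries, used only to exercise the function.
database_a = {
    "compound_1": "carcinogen",
    "compound_2": "non-carcinogen",
    "compound_3": "carcinogen",
}
database_b = {
    "compound_2": "carcinogen",
    "compound_3": "carcinogen",
    "compound_4": "non-carcinogen",
}

conflicts = find_conflicts(database_a, database_b)
for name, (label_a, label_b) in conflicts.items():
    print(f"{name}: database A says {label_a}, database B says {label_b}")
```

Compounds flagged in this way could be reviewed by an expert or excluded before a statistical model is built; which option is appropriate depends on the study.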

4.7. The data- and rule-bases

Another important point is the data considered for the knowledge. CASE, with its general applicability, has been used for several different classes of chemicals. In some cases the database was quite large (more than 2000 molecules), while in others it was much smaller. Of course, for CASE and for the other statistics-based approaches it may be dangerous to extrapolate the rules outside the class of chemicals described; on the other hand, for similar compounds a more powerful description may be expected using this kind of approach. Malacarne et al. (1993) considered 826 chemicals out of over 1000, selected partly from the NTP database and partly from the CPDB database; some molecules were discarded because they were administered in mixtures or because of problems in introducing the formula into the computer. DEREK compiled about 50 rules based on expert judgement and literature information (e.g. from the US Food and Drug Administration). Now the system uses heterogeneous rules from different sources, factual or predicted. These rules cover a wide range of activity, such as irritation, mutagenicity, effect on thyroid function, chloracne.
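The warning about extrapolating statistics-based rules outside the class of chemicals described can be turned into a simple screening step. The sketch below assumes that each molecule has already been reduced to a set of fragment keys and flags a query compound when too few of its fragments occur in the training set. The function names, the fragment labels and the 0.8 threshold are hypothetical choices made for illustration, not features of CASE, TOPKAT or any other program discussed here.

```python
def fragment_coverage(query_fragments, training_fragments):
    """Fraction of the query's distinct fragments seen in the training set."""
    query = set(query_fragments)
    if not query:
        return 0.0
    return len(query & set(training_fragments)) / len(query)


def within_domain(query_fragments, training_fragments, threshold=0.8):
    """True if enough query fragments are covered (threshold is an assumption)."""
    return fragment_coverage(query_fragments, training_fragments) >= threshold


# Hypothetical fragment keys; real systems would derive them from structures.
training = {"frag_a", "frag_b", "frag_c", "frag_d"}
similar_query = ["frag_a", "frag_b", "frag_c"]
distant_query = ["frag_a", "frag_x", "frag_y", "frag_z"]

print(fragment_coverage(similar_query, training), within_domain(similar_query, training))
print(fragment_coverage(distant_query, training), within_domain(distant_query, training))
```

A prediction for a compound that fails such a check is not necessarily wrong, but it rests on structural features the model has never seen.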

Considering all these problems, it is wise to estimate the uncertainty associated with the results of ES. None of the ES considered places adequate emphasis on this aspect. However, some databases do contain uncertainty values, and the measurement of uncertainties is a recognized issue in risk analysis. In particular, in cases where uncertainty is linked to statistical variability and inefficient use of the available data, ES are useful for addressing the problem correctly. Of the three parts of the process mentioned in Section 1, the chemical, biological and computational ones, we believe that studies on the biological activities are the most difficult. For the chemical part, efforts are being made to define general descriptors of molecules. For instance, some examples of these descriptors have been proposed by Todeschini et al. (1995) and by Novič and Zupan (1995). In both cases these descriptors consider the whole molecule. Todeschini’s appear more familiar to chemists, because they contain known parameters. As a result, they are more transparent and the results are more easily understood in a classical way. Zupan’s descriptors sound familiar to chemists, because they look like spectra, but their meaning is not so obvious, with the partial exception of the Mulliken charges. Another difference is that the Zupan descriptors are biunivocal (one-to-one), and these authors were particularly interested in the reversibility of the information from structure to descriptor. Todeschini expressly avoided this reversibility, fearing it might introduce a division into fragments in order to trace back the way from structure to descriptor. These examples illustrate the critical process that will probably open up new avenues in this research field. As regards algorithms, there are promising possibilities of increasing the power of the prediction, but this will require new strategies.
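One simple way to attach an uncertainty to a statistics-based result is to report an interval rather than a bare proportion. The sketch below computes the Wilson score interval for the fraction of active compounds among those containing a given fragment. The choice of this interval and the function name `wilson_interval` are assumptions made for illustration; none of the ES discussed here is known to report it. The counts are invented.

```python
import math


def wilson_interval(active, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = active / total
    z2 = z * z
    denom = 1 + z2 / total
    centre = (p + z2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)


# Invented counts: a fragment found in 3 compounds, all active,
# versus a fragment found in 300 compounds, all active.
rare_low, rare_high = wilson_interval(3, 3)
common_low, common_high = wilson_interval(300, 300)

print(f"rare fragment:   {rare_low:.2f} to {rare_high:.2f}")
print(f"common fragment: {common_low:.2f} to {common_high:.2f}")
```

Both fragments show 100% activity in the data, but the interval for the rare fragment is far wider, which mirrors the argument that fragments found in too few compounds are statistically less significant.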
Lewis (1992) pointed out that the different ES should not be considered antagonistic since they may be complementary in some cases: for instance, he recognized this possibility for COMPACT and HazardExpert. DEREK, as we said, used some of the rules obtained through CASE and TOPKAT. Richard (1994) too indicated that TOPKAT might be profitably employed to use the results of CASE. This shows that no single system is satisfactory at the moment, and that further efforts are needed for the development of ES in toxicology, possibly by hybrid methods.

In conclusion, several ES are useful for predicting the toxicity of chemicals. However, further efforts are needed to improve these systems, as regards their relationship with the available data and expert knowledge.

Acknowledgements

We acknowledge the financial support of the European Commission (CP94-1029).

Notes. HazardExpert is supplied by CompuDrug Chemistry Ltd., H-1395 Budapest 62, P.O. Box 405; e-mail: [email protected]; telefax + 36 1 1322574. DEREK is supplied by LHASA UK, University of Leeds, Leeds, LS2 9JT, UK. It has a www address: http://chem.leeds.ac.uk/luk. TOPKAT is supplied by Health Design Inc., Rochester, NY 14604, USA. Computer chemistry: a list of programs can be found at: http://www.cray.com/PUBLIC/APPS/DAS/chemistry.html.

References

Ashby, J. and Paton, D. (1993) The influence of chemical structure on the extent and sites of carcinogenesis for 522 rodent carcinogens and 55 different human carcinogen exposures. Mutat. Res. 286, 3-74.
Buchanan, B.G. and Shortliffe, E.H. (1984) Rule-based expert systems: The MYCIN experiments of the Stanford Heuristic Programming Project, Addison-Wesley, Reading, MA.
Combes, R.D. and Judson, P. (1995) The use of artificial intelligence systems for predicting toxicity. Pestic. Sci. 45, 179-194.
Dearden, J.C. (1990) Physico-chemical descriptors. In: W. Karcher and J. Devillers (Eds), Practical Applications of Quantitative Structure-Activity Relationships (QSAR) in Environmental Chemistry and Toxicology, Kluwer, Dordrecht, pp. 25-29.
Enslein, K., Gombar, V.K. and Blake, B.W. (1994) Use of SAR in computer-assisted prediction of carcinogenicity and mutagenicity of chemicals by the TOPKAT program. Mutat. Res. 305, 47-61.
Fielding, M. (Ed.) (1992) Pesticides in ground and drinking water. Commission of the European Communities. E. Guyot SA, Brussels (Water Pollution Research Report 27).
Gombar, V.K., Enslein, K. and Blake, B.W. (1993) Carcinogenicity of azathioprine: an S-AR investigation. Mutat. Res. 302, 7-12.
Gray, N.A.B. (1988) Artificial intelligence in chemistry. Anal. Chim. Acta 210, 9-32.
Jelovsek, F.R., Mattison, D.R. and Young, J.F. (1990) Eliciting principles of hazard identification from experts. Teratology 42, 521-533.
Karcher, W. and Devillers, J. (Eds) (1990) Practical Applications of Quantitative Structure-Activity Relationships (QSAR) in Environmental Chemistry and Toxicology, Kluwer, Dordrecht.
Klopman, G. (1984) Artificial intelligence approach to structure-activity studies. Computer automated structure evaluation of biological activity of organic molecules. J. Am. Chem. Soc. 106, 7315-7321.
Klopman, G. and Buyukbingol, E. (1988) An artificial intelligence approach to the study of the structural moieties relevant to drug-receptor interactions in aldose reductase inhibitors. Mol. Pharmacol. 34, 852-862.
Klopman, G. and Dimayuga, M.L. (1988) Computer-automated structure evaluation of flavonoids and other structurally related compounds as glyoxalase I enzyme inhibitors. Mol. Pharmacol. 34, 218-222.
Klopman, G. and Dimayuga, M.L. (1990) Computer automated structure evaluation (CASE) of the teratogenicity of retinoids with the aid of a novel geometry index. J. Comput. Aided Mol. Des. 4, 117-130.
Klopman, G. and Rosenkranz, H.S. (1992) Testing by artificial intelligence: computational alternatives to the determination of mutagenicity. Mutat. Res. 272, 59-71.
Klopman, G. and Rosenkranz, H.S. (1994) Approaches to SAR in carcinogenesis and mutagenesis. Prediction of carcinogenicity/mutagenicity using MULTI-CASE. Mutat. Res. 305, 33-46.
Klopman, G. and Srivastava, S. (1990) Computer-automated structure evaluation of gastric anti-ulcer compounds: study of cytoprotective and antisecretory imidazo[1,2-a]pyridines and -pyrazines. Mol. Pharmacol. 37, 958-965.
Lewis, D.F.V. (1992) Computer-assisted methods in the evaluation of chemical toxicity. In: K.B. Lipkowitz and D.B. Boyd (Eds.), Reviews in Computational Chemistry III, VCH Publishers, Inc., New York, pp. 173-222.
Lewis, D.F.V., Ioannides, C. and Parke, D.V. (1993) Validation of a novel molecular orbital approach (COMPACT) for the prospective safety evaluation of chemicals, by comparison with rodent carcinogenicity and Salmonella mutagenicity data evaluated by the U.S. NCI/NTP. Mutat. Res. 291, 61-77.
Lindsay, R.K., Buchanan, B.G., Feigenbaum, E.A. and Lederberg, J. (1980) Applications of artificial intelligence for organic chemistry. The DENDRAL Project, McGraw-Hill, New York.
Malacarne, D., Pesenti, R., Paolucci, M. and Parodi, S. (1993) Relationship between molecular connectivity and carcinogenic activity: a confirmation with a new software program based on graph theory. Environ. Health Perspect. 101, 332-342.
Mersch-Sundermann, V., Rosenkranz, H.S. and Klopman, G. (1994) The structural basis of the genotoxicity of nitroarenofurans and related compounds. Mutat. Res. 304, 271-284.
Mumtaz, M.M., Knauf, L.A., Reisman, D.J., Peirano, W.B., DeRosa, C.T., Gombar, V.K., Enslein, K., Carter, J.R., Blake, B.W., Huque, K.I. and Ramanujam, V.M.S. (1995) Assessment of effect levels of chemicals from quantitative structure-activity relationship (QSAR) models. I. Chronic lowest-observed-adverse-effect level (LOAEL). Toxicol. Lett. 79, 131-143.
Nakadate, M., Hayashi, M., Sofuni, T., Kamata, E., Aida, Y., Osada, T., Ishibe, T., Sakamura, Y. and Ishidate Jr., M. (1991) The expert system for toxicity prediction of chemicals based on structure-activity relationship. Environ. Health Perspect. 96, 77-79.
Nendza, M., Volmer, J. and Klein, W. (1990) Risk assessment based on QSAR estimates. In: W. Karcher and J. Devillers (Eds), Practical Applications of Quantitative Structure-Activity Relationships (QSAR) in Environmental Chemistry and Toxicology, Kluwer, Dordrecht, pp. 213-240.
Novič, M. and Zupan, J. (1995) A new general and uniform structure representation. Presented at ‘Information und Wissen am Arbeitsplatz des Chemikers’, Jahrestagung, CIC-Workshop, Hochfilzen/Tirol, Austria, 19-21 November.
Omenn, G.S. (1995) Assessing the risk assessment paradigm. Toxicology 102, 23-28.
Payne, M.P. and Walsh, P.T. (1994) Structure-activity relationships for skin sensitization potential: development of structural alerts for use in knowledge-based toxicity prediction systems. J. Chem. Inf. Comput. Sci. 34, 154-161.
Rannug, U., Sjögren, M., Rannug, A., Gillner, M., Toftgård, R., Gustafsson, J.-Å., Rosenkranz, H. and Klopman, G. (1991) Use of artificial intelligence in structure-affinity correlations of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) receptor ligands. Carcinogenesis 12, 2007-2015.
Richard, A.M. (1994) Application of SAR methods to noncongeneric data bases associated with carcinogenicity and mutagenicity: issues and approaches. Mutat. Res. 305, 73-97.
Ridings, J.E., Barratt, M.D., Cary, R., Earnshaw, C.G., Eggington, C.E., Ellis, M.K., Judson, P.N., Langowski, J.J., Marchant, C.A., Payne, M.P., Watson, W.P. and Yih, T.D. (1996) Computer prediction of possible toxic action from chemical structure; an update of the DEREK system. Toxicology 106, 267-279.
Rosenkranz, H.S. and Klopman, G. (1989) Structural basis of the mutagenicity of phenylazoaniline dyes. Mutat. Res. 221, 217-234.
Rosenkranz, H.S. and Klopman, G. (1990a) New structural concepts for predicting carcinogenicity in rodents: an artificial intelligence approach. Teratog. Carcinog. Mutag. 10, 73-88.
Rosenkranz, H.S. and Klopman, G. (1990b) The structural basis of the carcinogenic and mutagenic potentials of phytoalexins. Mutat. Res. 245, 51-54.
Rosenkranz, H.S. and Klopman, G. (1990c) Natural pesticides present in edible plants are predicted to be carcinogenic. Carcinogenesis 11, 349-353.
Rosenkranz, H.S. and Klopman, G. (1992) 1,4-Dioxane: prediction of in vivo clastogenicity. Mutat. Res. 280, 245-251.
Rosenkranz, H.S. and Klopman, G. (1993) Structural relationships between mutagenicity, maximum tolerated dose, and carcinogenicity in rodents. Environ. Mol. Mutagen. 21, 193-206.
Rosenkranz, H.S. and Klopman, G. (1994) Structural implications of the ICPEMC method for quantifying genotoxicity data. Mutat. Res. 305, 99-116.
Rosenkranz, H.S., Zhang, Y.P. and Klopman, G. (1991) Implications of newly recognized relationships between mutagenicity, genotoxicity and carcinogenicity of molecules. Mutat. Res. 250, 25-33.
Sanderson, D.M. and Earnshaw, C.G. (1991) Computer prediction of possible toxic action from chemical structure; the DEREK system. Hum. Exp. Toxicol. 10, 261-273.
Todeschini, R., Gramatica, P., Provenzani, R. and Marengo, E. (1995) Weighted holistic invariant molecular descriptors. Part 2. Theory development and applications on modeling physicochemical properties of polyaromatic hydrocarbons. Chemom. Intell. Lab. Syst. 27, 221-229.