Journal Pre-proofs
Integrating in silico models for the prediction of mutagenicity (Ames test) of botanical ingredients of cosmetics
Giuseppa Raitano, Alessandra Roncaglioni, Alberto Manganaro, Masamitsu Honma, Laurent Sousselier, Quoc Tuan Do, Eric Paya, Emilio Benfenati
PII: S2468-1113(18)30116-6
DOI: https://doi.org/10.1016/j.comtox.2019.100108
Reference: COMTOX 100108
To appear in: Computational Toxicology
Received Date: 7 October 2018
Revised Date: 5 May 2019
Accepted Date: 21 August 2019
Please cite this article as: G. Raitano, A. Roncaglioni, A. Manganaro, M. Honma, L. Sousselier, Q.T. Do, E. Paya, E. Benfenati, Integrating in silico models for the prediction of mutagenicity (Ames test) of botanical ingredients of cosmetics, Computational Toxicology (2019), doi: https://doi.org/10.1016/j.comtox.2019.100108
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
© 2019 Published by Elsevier B.V.
Integrating in silico models for the prediction of mutagenicity (Ames test) of botanical ingredients of cosmetics
Giuseppa Raitano1*, Alessandra Roncaglioni1, Alberto Manganaro1, Masamitsu Honma2, Laurent Sousselier3, Quoc Tuan Do4, Eric Paya4, Emilio Benfenati1.
1. Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Giuseppe La Masa, 19, 20156 Milano, Italy
2. Division of Genetics and Mutagenesis, National Institute of Health Sciences, 3-25-26 Tonomachi, Kawasaki-ku, Kanagawa 210-9501, Japan
3. UNITIS, 24 rue Marbeuf F - 75008 PARIS
4. Greenpharma S.A.S 3, Allée du Titane, 45100 Orléans, France
*Corresponding author: Giuseppa Raitano, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Giuseppe La Masa 19, 20156 Milano, Italy. E-mail: [email protected]
Abstract
Plant extracts are widely used as cosmetic ingredients and have to be investigated to guarantee consumer safety. However, these natural products are often complex mixtures of chemicals. No animal tests can be done, in compliance with cosmetic regulations; therefore, non-testing methods (NTM) could be useful for preliminary screening to address the safety of finished cosmetic products. We developed an integrated strategy (IS) to assess the genotoxic potential of ~18000 molecules present in natural cosmetics ingredients by combining several quantitative structure-activity relationship (QSAR) models. This IS consists of a sequence of steps to formalize the expert reasoning. We also developed a new classification model based on a large dataset of compounds to clarify the outcomes that remain equivocal after the application of this strategy.
Highlights
The aim was to assess the genotoxic potential of a large dataset of compounds by integrating several in silico models
The integration strategy was tuned and applied to ~ 18000 molecules found in plant extracts used in cosmetics
A new SAR model was developed in Python, based on active and inactive rules
The in silico methods were refined
Keywords
Cosmetics, plant extracts, integrated strategy, in silico methods, mutagenicity, quantitative structure–activity relationship (QSAR)
Abbreviations
AD Applicability Domain, ADI Applicability Domain Index, ES Evaluation Set, FN false negative, FP false positive, IRFMN Istituto di Ricerche Farmacologiche Mario Negri, IS Integrated Strategy, ISS Istituto Superiore di Sanità, LR Training Likelihood Ratio, MCC Matthews Correlation Coefficient, NTM non-testing methods, PPV Positive Predictive Value, (Q)SAR Quantitative Structure Activity Relationship, SA Structural Alert, SVM Support Vector Machine, TN true negative, TP true positive, T.E.S.T. Toxicity Estimation Software Tool, VEGA virtual models for property evaluation of chemicals.
INTRODUCTION
Lipsticks, skin cleansers, body lotions, shampoos and haircare products are just a few examples of how cosmetics are used in our daily life. In view of their growing diffusion, the European authorities have regulated the production of cosmetic products and their ingredients in order to ensure a high level of protection for human and environmental health without, at the same time, interfering with their market [1]. Plant extracts are widely used as cosmetic ingredients because consumers frequently demand them in place of synthetic constituents. However, these natural products are often complex mixtures of chemicals and can induce adverse reactions [2]. No animal tests can be conducted, in compliance with cosmetic regulations [1], so non-testing methods (NTM) like read-across, quantitative structure-activity relationship (QSAR) models and weight-of-evidence approaches are useful for preliminary screening of the safety of ingredients or finished cosmetic products. NTM cut costs and time and can be part of more complex testing strategies, as recommended by legislation in many contexts [3, 4-10]. Indeed, international authorities recommend using more than one model, considering both expert-based and statistical approaches [11-12]. Mutagens, carcinogens and toxicants for reproduction (CMR) arouse particular concern because of their adverse impact on human health. Among others, the bacterial reverse mutation test (Ames test) [13] is widely used during the initial screening for genotoxicity to address mutagenicity. The wide availability of consistent experimental data means this in vitro test is often employed to develop (Q)SAR models [14]. We made an in silico assessment of ~18000 molecules found in plant extracts used in cosmetics, applying an integrated strategy (IS) of several (Q)SAR models. Integrated strategies for mutagenicity assessment (Ames test) are already available and give good performance [15].
This IS consists of a sequence of steps to mimic expert reasoning in a formalized and reproducible way. Several tools/elements were considered as input for the IS, for instance the concordance between the individual models’ predictions and their reliability, the positive predictive value (PPV) of structural alerts (SAs) when available, and the presence of specific exception rules. We also developed a new structure-activity relationship (SAR) model based on a large dataset of compounds (with unbalanced distribution between positive and negative compounds) to resolve any assessments that remain equivocal after the use of other tools.
2. MATERIALS AND METHODS
2.1 Models used
We used ten freely available models belonging to different platforms. Table 1 sketches them out, focusing on the underlying approaches and their current level of availability.
Table 1. Models used and some details about their approach and availability.
| Platform | Model/profiler used | Approach | Status |
|---|---|---|---|
| VEGA v.1.1.4 | Mutagenicity (Ames test) CONSENSUS model 1.0.2 | Weighted combination of the four individual VEGA models depending on the individual models' ADI values | already available |
| VEGA v.1.1.4 | Mutagenicity (Ames test) model (CAESAR) 2.1.13 | Statistical SVM model + expert-based SAs | already available |
| VEGA v.1.1.4 | Mutagenicity (Ames test) model (SarPy/IRFMN) 1.0.7 | Statistical SAs | already available |
| VEGA v.1.1.4 | Mutagenicity (Ames test) model (ISS) 1.0.2 | Expert-based SAs | already available |
| VEGA v.1.1.4 | Mutagenicity (Ames test) model (KNN/Read-Across) 1.0.0 | Read-across approach | already available |
| T.E.S.T. v. 4.2.1 | CONSENSUS method | Average prediction of the individual models in the domain of applicability | already available |
| T.E.S.T. v. 4.2.1 | Hierarchical clustering method | Statistical | already available |
| T.E.S.T. v. 4.2.1 | FDA method | Statistical | already available |
| T.E.S.T. v. 4.2.1 | Nearest neighbour method | Statistical (Read-across approach) | already available |
| OECD QSAR toolbox v. 4.2 | DNA alerts for AMES by OASIS v.1.4 | Expert-based SAs | already available |
| IRFMN group | New SARpy model (stepwise) | Statistical SAs | newly developed |
We used the following VEGA platform (version 1.1.4) models [16]:
- Mutagenicity AMES model (CAESAR) v. 2.1.13 [17], based on the Bursi mutagenicity dataset [18], with a training set of 4204 compounds and a validation set of 837 compounds. This hybrid model integrates a trained Support Vector Machine (SVM) classifier and an additional model based on matching Structural Alerts (SAs).
- Mutagenicity AMES model (SARpy/IRFMN) v. 1.0.7 [19], built as a set of rules extracted automatically by the SARpy (SAR in python) software from the same dataset as the CAESAR model [18]. The model includes specific rules for mutagenic and non-mutagenic activity.
- Mutagenicity AMES model (ISS) v. 1.0.2 [21], which implements the mutagenicity SAs of ToxTree version 2.6 [20]. The ISS model gives a mutagenic prediction if at least one SA is matched in the target compound; otherwise the compound is predicted as non-mutagenic. The dataset of the model comprises 670 compounds.
- Mutagenicity AMES model (KNN/Read across) v. 1.0.0 [22], which runs a read-across on a dataset of 5770 chemicals, including a benchmark dataset compiled by Hansen et al. [14] and a collection of data (positive results) made available by the Japan Health Ministry within their Ames QSAR project [23].
- Consensus model v. 1.0.2, which makes an overall assessment based on the predictions of the previous four VEGA mutagenicity models (CAESAR, SARpy, ISS and KNN) and their applicability domain index (ADI) values. The consensus prediction is influenced by the concordance of the individual models’ predictions, together with their reliability in terms of the ADI values. The outcome of the VEGA consensus comes with a score that measures its confidence (consensus score). The score reaches its maximum value (1) only if one or more models find experimental values in their training sets and all available values are concordant. In all other cases, the score is lower.
VEGA individual models provide an ADI as a measure of the reliability of each prediction. The ADI is calculated for each model by grouping other indices, each considering an element of the AD. The ADI aggregates information about the similarity and common structural features with compounds in the training set, the concordance in the experimental values for similar compounds and the accuracy in their prediction. ADI values range from 0 (worst case) to 1 (best case). In general, mutagenicity predictions with an ADI <0.6 are outside the AD of the model.
From the T.E.S.T. platform (version 4.2.1) [24] we included the following:
- The Hierarchical clustering method produces a series of clusters from the training set. Clusters are subsets of chemicals from the whole set, which have similar characteristics. A genetic algorithm-based selection was used to generate models for each cluster.
- The U.S. Food and Drug Administration (FDA) method makes predictions using a single cluster (constructed at runtime) which contains structurally similar chemicals selected from the overall training set to build a model. This contrasts with the Hierarchical method, where the predictions are made using one or more clusters constructed a priori.
- In the nearest neighbour approach, the predicted toxicity is the average of the toxicities of the three most similar chemicals (structural analogues) in the training set if their similarity exceeds a given threshold.
- The Consensus method estimates the mutagenic activity by taking an average of the predicted toxicities from the above QSAR methods (hierarchical clustering, FDA and nearest neighbour) provided their prediction is valid.
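The averaging step of the Consensus method can be illustrated with a minimal Python sketch. This is not the T.E.S.T. implementation: the function name `consensus_prediction`, the use of `None` to mark a prediction that is not valid, and the 0.5 cutoff used to turn the averaged score into a class are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def consensus_prediction(
    scores: Sequence[Optional[float]], cutoff: float = 0.5
) -> Optional[str]:
    """Average the valid per-method scores and threshold the mean.

    `scores` holds one value per method (hierarchical clustering, FDA,
    nearest neighbour); None marks a method without a valid prediction.
    The 0.5 cutoff is an assumption of this sketch.
    """
    valid = [s for s in scores if s is not None]
    if not valid:
        # No method produced a valid prediction, so no consensus exists.
        return None
    mean = sum(valid) / len(valid)
    return "mutagenic" if mean >= cutoff else "non-mutagenic"


# Example: the FDA method gives no valid prediction for this compound.
print(consensus_prediction([0.8, None, 0.4]))
```

Under these assumptions, a compound with no valid individual prediction receives no consensus outcome rather than a default class.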
All the T.E.S.T. models are built on a benchmark dataset compiled by Hansen et al. [14] that consists of 5743 chemicals, and each of those models makes a prediction only when the compound is included in its AD.
From the OECD QSAR toolbox platform (version 4.2) [25] we used the profiler for DNA alerts for AMES by OASIS. It comprises 85 SAs responsible for the interaction of chemicals with DNA, extracted from the Ames Mutagenicity model, part of the OASIS TIMES system [26-27].
2.2 New SARpy model
In addition to the ten available models, we developed a new SARpy model based on a stepwise approach. This model was used only when the IS was not able to give a reliable outcome. Generally, the SARpy software extracts each possible fragment from a set of molecular structures and correlates these substructures with the activity of the molecules that contain them. As the last step, it selects fragments suitable to become SAs, on the basis of their prediction performance on the training set. Each SA, or rule, is associated with a Training Likelihood Ratio (LR) as a measure of its statistical power. The SARpy model already available in the VEGA platform comprises 112 rules specific for mutagenicity and 93 for non-mutagenicity. Here we collected a bigger training set (TS), mining several data sources:
- A benchmark database compiled by Hansen and colleagues from the scientific literature [14].
- A set of ~12000 compounds from the Ames/QSAR international project, National Institute of Health Sciences in Japan [23].
- A set of more than 700 molecules selected from the ECHA CHEM database [28] during the European project CALEIDOS [29].
The results of Ames tests for both Japanese substances (except for those classified as “strong positive”) and ECHA compounds are confidential and cannot be disclosed. Therefore, no information about their chemical identity is provided here.
We pruned the data, removing duplicate structures and incongruent experimental results, neutralizing salts and removing counter ions (manually or with in-house software). After this, the TS was composed of 18338 compounds (5025 mutagens and 13313 non-mutagens). From the first application of the SARpy software on the whole dataset, we extracted more than 1000 rules specific for detecting either active or inactive compounds. The fragment length was set to range from 2 to 18 atoms. Those new rules were compared with the old ones (from the previous SARpy model) to exclude any in common. Afterwards, new rules with the best LR were added to the old ruleset. We matched this new ruleset with the TS compounds using in-house software (Istmolbase), which looks for the presence of a list of fragments (as SMARTS strings) in a set of molecular structures. In this way we checked the correctness of the formalism of the rules and verified their accuracy. Since several fragments (with different output labels) could be present in one molecule, the rules selected had to meet the following thresholds:
- Inactive rules: %TN ≥ 80% and %NewTN ≥ 70%
- Active rules: %TP ≥ 70% and %NewTP ≥ 60%
Here %TN (True Negative) is the percentage of negative compounds where the inactive rule was detected (generic accuracy), and %NewTN is the percentage of negative compounds where the inactive rule was detected, not already matched by rules giving higher accuracy. %TP (True Positive) is the percentage of positive compounds where the active rule was detected, and %NewTP is the percentage of positive compounds where the active rule was detected, not already matched by rules giving higher accuracy. %NewTN and %NewTP were considered during the selection phase since they show how much the rule contributes to the correctness of the predictive model, independently from the other rules. The final list of rules was reduced to 725 (158 active and 567 inactive) after the application of the quality criteria explained above.
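The selection thresholds above can be sketched in Python as follows. This is an illustrative reading, not the Istmolbase procedure: it assumes that %TN (or %TP) is the share of compounds matched by a rule that carry the rule's label, that the "new" percentage is the same share computed only on matched compounds not yet covered by a previously accepted rule of higher accuracy, and that rules are visited in order of decreasing accuracy. The function name `select_rules` and all compound identifiers are hypothetical.

```python
from typing import Dict, List, Set


def select_rules(
    matches: Dict[str, Set[str]],
    target: Set[str],
    min_pct: float,
    min_new_pct: float,
) -> List[str]:
    """Keep rules meeting both the overall and the 'new' accuracy thresholds.

    matches: rule name -> ids of compounds containing the fragment.
    target:  ids of compounds whose experimental class equals the rule label
             (negatives for inactive rules, positives for active rules).
    """

    def pct(ids: Set[str]) -> float:
        # Share (in %) of the given compounds that carry the rule's label.
        return 100.0 * len(ids & target) / len(ids) if ids else 0.0

    # Visit rules from the most to the least accurate (assumption).
    ranked = sorted(matches, key=lambda r: pct(matches[r]), reverse=True)
    covered: Set[str] = set()
    kept: List[str] = []
    for rule in ranked:
        hit = matches[rule]
        new = hit - covered  # compounds not matched by an accepted rule
        if pct(hit) >= min_pct and pct(new) >= min_new_pct:
            kept.append(rule)
            covered |= hit
    return kept


# Toy inactive-rule example with hypothetical compound ids (n* = negative).
negatives = {f"n{i}" for i in range(1, 10)}
toy_matches = {
    "A": {"n1", "n2", "n3", "n4", "n5"},
    "B": {"n1", "n2", "n3", "n4", "p1"},  # adds only a positive compound
    "C": {"n6", "n7", "n8", "p2"},        # overall accuracy below 80%
    "D": {"n6", "n7", "n8", "n9", "p3"},
}
print(select_rules(toy_matches, negatives, min_pct=80, min_new_pct=70))
```

In this toy case rule B passes the overall threshold but contributes nothing new, which is the situation the %NewTN and %NewTP criteria are meant to filter out.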
The molecules were classified on the basis of the rule with the greatest accuracy found in the molecule by Istmolbase. If several fragments with the same accuracy were detected in a molecule, the positive prediction was preferred. The 725 rules did not match about 6000 compounds (34%); among the predicted compounds, 748 were wrongly predicted as negative (False Negative, FN) and 335 were wrongly predicted as positive (False Positive, FP). To improve the low coverage and reduce the number of FNs, two additional extractions with SARpy were done using the same settings and criteria as above. We extracted and selected 11 rules for activity to detect, among the 9709 compounds predicted as inactive (TN and FN), those that were actually FN. Similarly, 201 active rules were extracted from the 6144 compounds originally not matched by the set of 725 rules. However, since these rules are based on fewer substances, we defined the outcome of these rules as “possible” (Figure 1). In total, this new SARpy model comprises 370 active and 567 inactive rules and makes predictions applying them in three steps. The decision tree underlying the new model is summarized in Fig. 1 and was applied on the TS. Accuracy, sensitivity and specificity [30] of the prediction results are specified below.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FN + FP + TN} \qquad \text{(Eq. 1)}$$
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \qquad \text{(Eq. 2)}$$
$$\mathrm{Specificity} = \frac{TN}{TN + FP} \qquad \text{(Eq. 3)}$$
Since the training set is unbalanced toward the negative compounds, we also computed the Matthews correlation coefficient (MCC) [31], as follows:
$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad \text{(Eq. 4)}$$
The MCC ranges from −1 to +1: +1 indicates perfect prediction while −1 indicates total disagreement between predicted and observed values. A value of 0 indicates prediction no better than random.
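Equations 1-4 can be checked with a short Python sketch. The function name `confusion_metrics` is hypothetical; the counts passed in the example are the confusion-matrix values reported for the new SARpy model in Table 3, and returning an MCC of 0 when the denominator vanishes is a convention adopted here, not something stated in the text.

```python
import math
from typing import Dict


def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and MCC from confusion counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)          # Eq. 1
    sensitivity = tp / (tp + fn)                        # Eq. 2
    specificity = tn / (tn + fp)                        # Eq. 3
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Eq. 4; a zero denominator is mapped to 0 (assumption of this sketch).
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Counts of the new SARpy model on its training set (Table 3).
new_sarpy = confusion_metrics(tp=4017, tn=11319, fp=1994, fn=1008)
print({name: round(value, 2) for name, value in new_sarpy.items()})
```

Because accuracy is dominated by the majority (negative) class in an unbalanced set, the MCC line is the one that changes most when the two error types are not symmetric.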
Figure 1. Decision tree of new SARpy model.
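The first classification step described in the text (the rule with the greatest accuracy decides, and a tie favours the positive prediction) can be sketched in Python. This covers only that step, not the full decision tree of Figure 1; the function name `classify` and the representation of a matched rule as a `(label, accuracy)` pair are assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple

# A matched rule is represented as (label, accuracy); hypothetical format.
Rule = Tuple[str, float]


def classify(matched: Iterable[Rule]) -> Optional[str]:
    """Return the label of the most accurate matched rule.

    Ties in accuracy are resolved in favour of "mutagenic".
    None is returned when no rule matches the molecule.
    """
    best: Optional[Rule] = None
    for label, accuracy in matched:
        better = best is None or accuracy > best[1]
        tie_to_positive = (
            best is not None and accuracy == best[1] and label == "mutagenic"
        )
        if better or tie_to_positive:
            best = (label, accuracy)
    return best[0] if best else None


# Two fragments with equal accuracy: the positive prediction is preferred.
print(classify([("non-mutagenic", 0.90), ("mutagenic", 0.90)]))
```

Returning `None` for unmatched molecules mirrors the situation in which the first ruleset gives no prediction and the later extraction steps are needed.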
2.3 Positive predictive value for the OASIS and ISS SAs
Structural alerts for toxicity are features of concern in the assessment of chemicals for human health. Most of them identify well-known classes of substances, characterized by their mechanisms of action [32-33], and several predictive programs are based on this human expert knowledge. We used the OECD QSAR Toolbox profiler for DNA alerts for AMES by OASIS and the Mutagenicity AMES model (ISS). We collected SAs and validated them through their mechanism of action; they indicate classes of chemicals that potentially cause interaction with DNA. Since this potential is modulated by the rest of the structure in each molecule, not all compounds that contain an SA are necessarily toxic. Furthermore, the list of alerts is not complete: not all toxic compounds are "explained" by an alert [34]. Therefore, to avoid false positive predictions with this approach and to compare the alerts objectively, we tested the reliability of the ISS and OASIS alerts by calculating their positive predictive value (PPV), as in equation 5, on a bigger set of compounds, the set used to build the new SARpy model (TS).
$$\mathrm{PPV} = \frac{TP}{TP + FP} \qquad \text{(Eq. 5)}$$
2.4 Evaluation set (ES)
We used this IS to evaluate a set of 17954 compounds found in plants (natural complex substances – plant extracts), used as ingredients in cosmetics. Those compounds were from the NCStox project (http://www.unitis.org/en/ncs-tox-project-presentation,379.html), which aimed to develop a predictive database to determine the toxicological profile of natural complex substances by integrating data mining and in silico methods. The database was created by extracting from the Greenpharma database GPDB (http://www.greenpharma.com/services/greepharma-core-database-gpdb) at least ten representatives for each molecular group (Table 2). The molecules had to be of plant/mushroom origin. They were identified from the scientific literature and can be determined by different methods. As for the TS, this ES was inspected for data cleaning. Table 2 shows the distribution of the compounds in 92 molecular groups.
Table 2. Compounds in the evaluation set (ES) for each molecular group.
| Molecular group | No. of comp. | Molecular group | No. of comp. | Molecular group | No. of comp. |
|---|---|---|---|---|---|
| Abietane | 244 | Indole | 336 | Polycyclic_Diterpene | 260 |
| Acetylenic_Derivative | 617 | Indolizidine | 28 | Polyinsaturated_Fatty_Acid | 60 |
| Acridone | 146 | Iridoid | 558 | Proanthocyanidin | 195 |
| Amaryllidaceae | 58 | Isoflavane | 73 | Protoberberine | 143 |
| Anthocyanidin | 196 | Isoflavone | 437 | Pterocarpan | 167 |
| Anthraquinone | 530 | Isothiocyanate | 25 | Purine | 84 |
| Apocarotenoid | 28 | Kaurene | 517 | Pyranocoumarin | 150 |
| Aporphine | 318 | Labdane | 560 | Pyrethrinoid | 7 |
| Aurone | 43 | Lignane | 447 | Pyridine | 73 |
| Benzoquinone | 121 | Limonoid | 72 | Pyrrolizidine | 193 |
| Benzylisoquinoline | 67 | Macrocyclic_Diterpene | 40 | Quassinoid | 165 |
| Cannabinoid | 29 | Monoinsaturated_Fatty_Acid | 68 | Quinazoline | 17 |
| Carotenoid | 114 | Monosaccharide | 129 | Quinoline | 54 |
| Chalcone | 207 | Monoterpene | 172 | Quinolizidine | 74 |
| Coumarin | 571 | Monoterpene_Acid_And_Ester | 17 | Rotenoid | 92 |
| Coumestan | 46 | Monoterpene_Alcohol_And_Ether | 227 | Saturated_Fatty_Acid | 72 |
| Cyclitol | 23 | Monoterpene_Aldehyde | 20 | Sesquiterpene | 866 |
| Dianthrone | 29 | Monoterpene_Ketone | 50 | Sesquiterpene_Lactone | 490 |
| Dihydroflavonol | 235 | Morphinane | 72 | Steroidal_Alkaloid | 953 |
| Diterpene_Acid | 109 | Naphthoquinone | 202 | Sterol | 846 |
| Diterpene_Alcohol | 455 | Neoflavonoid | 60 | Stilbenoid | 106 |
| Diterpene_Lactone | 161 | Nucleoside | 9 | Styrylpyrone | 28 |
| Ellagitannin | 27 | Oligosaccharide | 99 | Sulfur_Derivative | 27 |
| Flavanone | 125 | Phenethylamine | 133 | Tetrahydroisoquinoline | 376 |
| Flavone | 915 | Phenol | 21 | Tocopherol | 16 |
| Flavonol | 910 | Phenolic_Acid | 95 | Triterpene | 225 |
| Furane | 53 | Phenolic_Acid_Ester | 83 | Tropane | 122 |
| Furanocoumarin | 53 | Phenols | 1 | Tropolone | 76 |
| Gallotannin | 115 | Phenyl_Ether | 112 | Tryptamine | 31 |
| Heteroside | 45 | Phloroglucinol | 66 | Xanthone | 453 |
| Imidazole | 63 | Piperidine | 151 | | |
2.5 Exceptions to rules
Taking account of the PPV analysis of SAs, we found that some of them had low statistical power on the TS. Some fragments were present both in mutagenic and non-mutagenic compounds of the TS, showing little ability to distinguish the two. Furthermore, in some cases mutagenicity alerts were found in non-mutagenic compounds. We focused on the most populated molecular groups in the ES that contained molecules triggering an SA with a low PPV, and investigated the structural reasons for the incongruence. Once the conditions for the exceptions to the rule (of the SA) were recognized, we verified them statistically by calculating their accuracy on the TS (number of correctly predicted negative compounds/number of compounds detected). We thoroughly investigated the mechanism of action (or biological reason) for that detoxifying behavior.
2.6 AMES mutagenicity workflow
Different factors were considered to decide the order and strategy for combining the outcomes of the in silico tools. In view of the large number of substances to be screened, we decided to include in the first step ready-to-use information that does not require manual inspection of the results (consensus from T.E.S.T. and VEGA and SAs from OASIS). We included in the evaluation both statistical tools (VEGA and T.E.S.T.) and expert-based SAs (from OASIS), reflecting the approach suggested in the ICH M7 guideline [11] of using the two types of information to boost the reliability of the assessment. The overall statistical quality of the proposed IS and some of its steps were also checked on the TS at the basis of the model described in section 2.2. The strategy integrates all the predictions made by the 11 models and provides an outcome with its own confidence, starting from the agreement in the evaluation of the two consensus models and the assessment of the OASIS profiler. The confidence can be labelled from “low” to “very good” depending on several conditions. Very good confidence was assigned to those outcomes that reported the experimental results in the Ames test of compounds found either in the training sets of the models or in the OECD QSAR toolbox databases. If no experimental data were available, the predictions of the two consensus models and the presence of the OASIS DNA alert were considered as the first step of the IS. If the two consensus models agree, the outcome has “good” or “moderate” confidence depending on the OASIS profiler result and on the number of individual models’ predictions.
If the two consensus models disagree, the confidence in the outcome decreases to “low”, down to cases where the IS cannot provide any reliable outcome; only in this last case are the predictions from the new SARpy model used. The ADI value of the predictions, the presence of ISS alerts specific for mutagenicity together with their PPV assessment, and the presence of exceptions to rules may affect the confidence or in some cases change the outcomes of the IS. Since OASIS SAs are already considered in the first step, in this second step of the workflow ISS SAs were used to avoid FNs by inspecting their statistical accuracy. We classified ISS alerts as “not reliable” if the PPV was ≤0.5, “weakly reliable” if it was 0.5-0.7 and “reliable” if it was ≥0.7. Predictions with a low ADI value (<0.65) were considered untrustworthy during the IS assessment because they fall outside the AD of the VEGA models. Figure 2 shows the IS workflow.
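The PPV of an alert and its reliability class can be sketched in Python as below. The function names `ppv` and `alert_reliability` are hypothetical, as is the "not assessed" label for alerts never detected in the TS. Because the stated ranges touch at 0.7, this sketch assumes that a PPV of exactly 0.7 counts as “reliable” and a PPV of exactly 0.5 as “not reliable”. The example counts are those reported for SA9, SA66 and SA11 in Table 4.

```python
from typing import Optional


def ppv(pos: int, tot: int) -> Optional[float]:
    """PPV = TP / (TP + FP): positives among all compounds firing the alert."""
    return pos / tot if tot else None


def alert_reliability(value: Optional[float]) -> str:
    """Map a PPV to the reliability classes used for the ISS alerts."""
    if value is None:
        # Alert never detected in the TS; label is an assumption.
        return "not assessed"
    if value >= 0.7:   # boundary assigned to "reliable" (assumption)
        return "reliable"
    if value > 0.5:
        return "weakly reliable"
    return "not reliable"


# (POS, TOT) counts taken from Table 4.
for name, pos, tot in [("SA9", 7, 7), ("SA66", 113, 190), ("SA11", 71, 323)]:
    print(name, alert_reliability(ppv(pos, tot)))
```

Keeping the PPV computation separate from the class assignment makes it easy to test other cut-offs on the same alert counts.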
Figure 2. AMES workflow at the basis of the IS.
Figure 3 reports some examples of the outcomes, and their confidence levels.
Figure 3. Possible outcomes of the IS and their confidence levels.
For example, the IS outcome will be “mutagenic” with moderate confidence if:
- The two consensus models agree about mutagenicity,
- Four of the seven individual models agree too,
- OECD QSAR Toolbox does not find any alert of mutagenicity.
Conversely, the IS outcome will be “non-mutagenic” with moderate confidence if:
- The two consensus models agree about non-mutagenicity,
- Four of the seven individual models agree too,
- OECD QSAR Toolbox does not find any mutagenicity alert,
- ISS detects a statistically consistent mutagenicity alert but its prediction is not reliable since it is outside the AD of the model (ADI=0).
Table S1 in the supplementary material shows the possible combinations used in the ES assessment.
3. RESULTS and DISCUSSION
3.1 IS on the evaluation set
Most of the ES compounds (14,894) were predicted as non-mutagenic by the integrated strategy. About two thirds (9606/14,894; 65%) of negative classifications have good confidence (Figure 4).
Figure 4. Results on 17954 compounds in the plant extracts.
The new SARpy model was used for the assessment of 2695 molecules since the other IS steps did not provide any reliable outcome. Taking into account the molecular groups’ distribution, triterpenes, chalcones, proanthocyanidins and stilbenoids, accounting for more than 100 compounds each, were mainly classified as negative (all were non-mutagens). On the other hand, among the most populated molecular groups, xanthones, aporphines, anthraquinones and acridones were largely classified as positive (more than 96%).
Considering IS negative outcomes, monoterpene_aldehyde was the molecular group with the highest percentage of outcomes with very good confidence (40%); heteroside had the highest percentage of outcomes with good confidence (98%); anthocyanidin had the highest percentage of outcomes with moderate confidence (84%); and coumestan had the highest percentage of outcomes with low confidence (76%). Apocarotenoid had the highest percentage of positive outcomes with very good confidence (18%), acridone with good confidence (85%), isothiocyanate with moderate confidence (32%) and quinazoline with low confidence (35%). Only very few compounds (reported as negative or positive with very good confidence) had an experimental value in the literature sources (Figure 4). This reflects the fact that substances in natural extracts are experimentally poorly characterized in terms of Ames test data and may also occupy a chemical space that does not overlap the one used as the basis for the models.
3.2 New SARpy model
We compared the results from the new SARpy model on its dataset (TS) with those of the existing model developed on the Bursi mutagenicity dataset. The statistics were obtained by merging “possible non-mutagenic” predictions with “non-mutagenic”, and “possible mutagenic” predictions with “mutagenic” (Table 3).
Table 3. Comparison of the predictions of the existing model (Mutagenicity AMES model-SARpy/IRFMN, 1.0.7) and the new one (New SARPy) on their datasets.
| | New SARPy | Existing SARPy |
|---|---|---|
| TP | 4017 | 2011 |
| TN | 11319 | 1431 |
| FP | 1994 | 425 |
| FN | 1008 | 337 |
| TOT | 18338 | 4204 |
| Accuracy | 0.84 | 0.82 |
| Sensitivity | 0.80 | 0.86 |
| Specificity | 0.85 | 0.77 |
| MCC | 0.62 | 0.63 |
Generally, the “in fitting” statistics of the models are similar: accuracy ranges from 0.82 for the available SARpy to 0.84 for the new one, while the MCC is respectively 0.63 and 0.62. The new SARpy model gives proportionally fewer false positives, raising the specificity of the predictions to 0.85, while the available SARpy model has higher sensitivity. This can be explained by the composition of the training sets of the models. The former Mutagenicity AMES model (SARpy/IRFMN) is based on a more balanced dataset (with 56% of mutagens), while the training set of the new model has more negative compounds than positive ones (the prevalence of mutagens is only 27%); therefore it contains more rules for inactive compounds: 370 for actives and 567 for inactives.
3.3 Positive predictive value of SAs
During PPV analysis 45 ISS alerts were detected in the TS and then investigated: 19 had a PPV > 0.70 (Table 4). No compound had the SA67 triphenylimidazole fragment in its structure.
Table 4. ISS SAs ordered by PPV: TOT indicates the total number of compounds where the SA is detected in TS, POS indicates the number of positive compounds where the SA is detected.
| Structural Alerts | TOT | POS | PPV |
|---|---|---|---|
| SA9 Alkyl nitrite | 7 | 7 | 1.00 |
| SA57 DNA intercalating agents with a basic side chain | 9 | 9 | 1.00 |
| SA58 Haloalkene cysteine S-conjugates | 7 | 7 | 1.00 |
| SA62 N-acyloxy-N-alkoxybenzamides | 37 | 37 | 1.00 |
| SA65 Halofuranones | 20 | 20 | 1.00 |
| SA64 Hydroxamic acid derivatives | 13 | 12 | 0.92 |
| SA68 9,10-dihydrophenanthrenes | 94 | 84 | 0.89 |
| SA21 Alkyl and aryl N-nitroso groups | 186 | 166 | 0.89 |
| SA25 Aromatic nitroso group | 46 | 41 | 0.89 |
| SA22 Azide and triazene groups | 91 | 81 | 0.89 |
| SA5 S or N mustard | 35 | 30 | 0.86 |
| SA61 Alkyl hydroperoxides | 21 | 18 | 0.86 |
| SA63 N-aryl-N-acetoxyacetamides | 6 | 5 | 0.83 |
| SA18 Polycyclic aromatic hydrocarbons | 674 | 521 | 0.77 |
| SA6 Propiolactones and propiosultones | 409 | 305 | 0.75 |
| SA7 Epoxides and aziridines | 459 | 338 | 0.74 |
| SA69 Fluorinated quinolines | 15 | 11 | 0.73 |
| SA27 Nitro aromatic | 1419 | 1040 | 0.73 |
| SA19 Heterocyclic polycyclic aromatic hydrocarbons | 491 | 349 | 0.71 |
| SA66 Anthrones | 190 | 113 | 0.59 |
| SA12 Quinones | 235 | 138 | 0.59 |
| SA23 Aliphatic N-nitro | 20 | 11 | 0.55 |
| SA24 alfa,beta unsaturated alkoxy | 20 | 11 | 0.55 |
| SA3 N-methylol derivatives | 11 | 6 | 0.55 |
| SA8 Aliphatic halogens | 932 | 499 | 0.54 |
| SA14 Aliphatic azo and azoxy | 45 | 24 | 0.53 |
| SA28 Primary aromatic amine, hydroxyl amine and its derived esters (with restrictions) | 1167 | 620 | 0.53 |
| SA59 Xanthones, thioxanthones, acridones | 50 | 26 | 0.52 |
| SA28ter Aromatic N-acyl amine | 273 | 122 | 0.45 |
| SA1 Acyl halides | 225 | 100 | 0.44 |
| SA28bis Aromatic mono- and dialkylamine | 263 | 113 | 0.43 |
| SA60 Flavonoids | 7 | 3 | 0.43 |
| SA13 Hydrazine | 293 | 125 | 0.43 |
| SA37 Pyrrolizidine alkaloids | 12 | 5 | 0.42 |
| SA30 Coumarins and furocoumarins | 51 | 21 | 0.41 |
| SA2 Alkyl (C<5) or benzyl ester of sulphonic or phosphonic acid | 103 | 42 | 0.41 |
| SA29 Aromatic diazo | 496 | 200 | 0.40 |
| SA4 Monohaloalkene | 43 | 17 | 0.40 |
| SA26 Aromatic ring N-oxide | 51 | 20 | 0.39 |
| SA11 Simple aldehyde | 323 | 71 | 0.22 |
| SA15 Isocyanate and isothiocyanate groups | 60 | 13 | 0.22 |
| SA10 alfa, beta unsaturated carbonyls | 805 | 167 | 0.21 |
| SA16 Alkyl carbamate and thiocarbamate | 349 | 49 | 0.14 |
| SA38 Alkenylbenzenes | 123 | 16 | 0.13 |
| SA39 Steroidal estrogens | 3 | 0 | 0 |
| SA67 Triphenylimidazole | 0 | 0 | N/A |
SA9, SA57, SA58, SA62 and SA65 had the best PPVs while SA26, SA11, SA15, SA10, SA16 and SA38 gave a low statistical reliability with PPV<0.40.
In the case of OASIS, 78 structural alerts were detected and investigated: 49 had a PPV > 0.70 and 13 gave a PPV < 0.40 (Table 5). When possible, the comparison of the PPVs showed many OASIS SAs were more accurate than the ISS ones.

Table 5. OASIS SAs ordered by PPV: TOT indicates the total number of compounds where the SA is detected in TS, POS indicates the number of positive compounds where the SA is detected.

| OASIS Structural Alerts | TOT | POS | PPV |
|---|---|---|---|
| Acyclic triazenes | 6 | 6 | 1.00 |
| Alkylnitrites | 7 | 7 | 1.00 |
| Aminoacridine DNA intercalators | 9 | 9 | 1.00 |
| Coumarins | 11 | 11 | 1.00 |
| Flavonoids | 3 | 3 | 1.00 |
| Haloalkene cysteine s-conjugates | 4 | 4 | 1.00 |
| Haloepoxides and halooxetanes | 1 | 1 | 1.00 |
| Halofuranones | 18 | 18 | 1.00 |
| Haloisothiazolinones | 1 | 1 | 1.00 |
| Organic diselenides and ditellurides | 1 | 1 | 1.00 |
| Perfluoroalkyl hypohalites | 1 | 1 | 1.00 |
| Peroxyacyl nitrates | 1 | 1 | 1.00 |
| Propyne derivatives | 1 | 1 | 1.00 |
| Quinoxaline-type 1,4-dioxides | 2 | 2 | 1.00 |
| Sulfonyl azides | 3 | 3 | 1.00 |
| N-acyloxy(alkoxy) arenamides | 33 | 32 | 0.97 |
| Organic azides | 62 | 60 | 0.97 |
| P-aminobiphenyl analogs | 18 | 17 | 0.94 |
| Vicinal dihaloalkanes | 14 | 13 | 0.93 |
| Haloalkanes containing heteroatom | 45 | 41 | 0.91 |
| N-nitroso compounds | 87 | 79 | 0.91 |
| Organic peroxy compounds | 32 | 29 | 0.91 |
| Conjugated nitroalkenes and five-membered aromatic nitroheterocyclics | 191 | 173 | 0.91 |
| Fused-ring nitroaromatics | 226 | 203 | 0.90 |
| Nitrogen and sulfur mustards | 9 | 8 | 0.89 |
| Epoxides and aziridines | 233 | 206 | 0.88 |
| C-nitroso compounds | 34 | 30 | 0.88 |
| Polynitroarenes | 50 | 44 | 0.88 |
| Polycyclic aromatic hydrocarbon and naphthalenediimide derivatives | 315 | 276 | 0.88 |
| Nitrophenols, nitrophenyl ethers and nitrobenzoic acids | 45 | 39 | 0.87 |
| Sulfonates and sulfates | 36 | 31 | 0.86 |
| DNA intercalators with carboxamide and aminoalkylamine side chain | 19 | 16 | 0.84 |
| Fused-ring primary aromatic amines | 95 | 80 | 0.84 |
| Nitro azoarenes and p-substituted azobenzenes | 53 | 43 | 0.81 |
| Alpha,beta-Unsaturated Aldehydes | 5 | 4 | 0.80 |
| Anthrones | 5 | 4 | 0.80 |
| Nitroalkanes | 5 | 4 | 0.80 |
| Quinoline derivatives | 10 | 8 | 0.80 |
| Thiols | 5 | 4 | 0.80 |
| Dicarbonyl compounds | 14 | 11 | 0.79 |
| Nitrobiphenyls and bridged nitrobiphenyls | 44 | 34 | 0.77 |
| P-substituted mononitrobenzenes | 92 | 71 | 0.77 |
| Arenediazonium salts | 12 | 9 | 0.75 |
| Haloalcohols | 36 | 27 | 0.75 |
| Haloalkane derivatives with labile halogen | 104 | 78 | 0.75 |
| N-hydroxylamines | 90 | 67 | 0.74 |
| Polarized haloalkene derivatives | 51 | 37 | 0.73 |
| Nitroarenes with other active groups | 60 | 43 | 0.72 |
| N-aryl-N-acetoxy(benzoyloxy) acetamides | 7 | 5 | 0.71 |
| Single-ring substituted primary aromatic amines | 65 | 44 | 0.68 |
| N,N-dialkyldithiocarbamate derivatives | 3 | 2 | 0.67 |
| Nitroaniline derivatives | 69 | 45 | 0.65 |
| Quinones and trihydroxybenzenes | 101 | 63 | 0.62 |
| Acridone, thioxanthone, xanthone and phenazine derivatives | 47 | 29 | 0.62 |
| Monohaloalkanes | 15 | 9 | 0.60 |
| Amino anthraquinones | 17 | 10 | 0.59 |
| Hydrazine derivatives | 105 | 61 | 0.58 |
| Acyl halides | 14 | 8 | 0.57 |
| Four- and five-membered lactones | 9 | 5 | 0.56 |
| Triarylimidazole and structurally related DNA intercalators | 15 | 8 | 0.53 |
| Diazoalkanes | 17 | 9 | 0.53 |
| Alpha-haloethers | 31 | 16 | 0.52 |
| Haloalkenes with electron-withdrawing groups | 6 | 3 | 0.50 |
| Sulfonyl halides | 33 | 16 | 0.48 |
| Alkylphosphates, alkylthiophosphates and alkylphosphonates | 40 | 16 | 0.40 |
| Specific imine and thione derivatives | 70 | 27 | 0.39 |
| Sultones | 13 | 5 | 0.38 |
| Specific acetate esters | 92 | 32 | 0.35 |
| Diazenes and azoxyalkanes | 3 | 1 | 0.33 |
| N-methylol derivatives | 3 | 1 | 0.33 |
| Quinoneimines | 16 | 5 | 0.31 |
| N-acetoxyamines | 7 | 2 | 0.29 |
| Quinolone derivatives | 34 | 9 | 0.26 |
| Quinone methides | 6 | 1 | 0.17 |
| Geminal polyhaloalkane derivatives | 315 | 52 | 0.17 |
| Alpha-beta conjugated alkene derivatives with geminal electron-withdrawing groups | 8 | 1 | 0.13 |
| Pyrrolizidine derivatives | 1 | 0 | 0.00 |
| Specific 5-substituted uracil derivatives | 2 | 0 | 0.00 |
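The PPV reported for each alert is the fraction of compounds matching the alert that are experimentally positive (POS/TOT). The following is a minimal Python sketch of that calculation, using three rows of Table 5 and the 0.70 and 0.40 cut-offs mentioned in the text. The function names and the reliability labels are illustrative assumptions, not part of the original workflow.

```python
from typing import Optional


def ppv(pos: int, tot: int) -> Optional[float]:
    """Positive predictive value: share of alert-matching compounds that are positive."""
    if tot == 0:
        return None  # the alert never fired, so the PPV is undefined (N/A)
    return pos / tot


def reliability(value: Optional[float]) -> str:
    """Hypothetical labels built on the 0.70 and 0.40 cut-offs quoted in the text."""
    if value is None:
        return "undefined"
    if value > 0.70:
        return "high"
    if value < 0.40:
        return "low"
    return "intermediate"


# (alert name, TOT, POS) as listed in Table 5
rows = [
    ("Flavonoids", 3, 3),
    ("Hydrazine derivatives", 105, 61),
    ("Specific 5-substituted uracil derivatives", 2, 0),
]

for name, tot, pos in rows:
    value = ppv(pos, tot)
    print(f"{name}: PPV = {value:.2f} ({reliability(value)})")
```

Returning `None` for an alert that never fires keeps the undefined case separate from a true PPV of zero.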
The flavonoid family is well represented in the ES: there were 2632 compounds, comprising anthocyanidins, aurones, chalcones, dihydroflavonols, flavanones, flavones and flavonols. SA60 Flavonoids matched only the quercetin-type flavonoids (flavonols molecular group). Matching all flavonoids in the TS through SMARTS strings (supplementary Table S2) showed that there were 17 flavonoids in the TS, four of them mutagenic, so the positive rate for this class of compounds is low (0.24). Considering how the SA for flavonoids is coded in the ISS and OASIS lists, OASIS appeared to identify mutagenic flavonoids better (PPV 1.00), while the ISS alert included several FPs and had a lower PPV (0.43).

3.4 Exception rules

Coumarins and flavonoids are widely used in cosmetics and personal care products, respectively as fragrance ingredients and for their antioxidant properties. Although several expert-based SAR predictive methods, such as the ISS SAs, consider these compounds potentially genotoxic (in the Ames test), our analysis showed low PPVs for the corresponding alerts (SA30 and SA60). We therefore explored possible explanations, including the biological conditions that might be responsible for detoxification of coumarins and flavonoids.

3.4.1 Specific exception rule for coumarins

The ISS model detected SA30 (coumarins and furocoumarins) in 51 compounds of the TS and predicted them as mutagenic, but only 21 are experimentally mutagenic (Table 4). We analysed the structures of mutagenic and non-mutagenic coumarins and found that all 18 molecules with an O or N atom at position 7, not included in an aromatic ring, were experimentally non-mutagenic (Figure 5).
Figure 5. Structure of the exception rule for coumarins.
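To illustrate how the coumarin exception rule affects the reliability of SA30 in the TS, the short Python sketch below recomputes the PPV from the counts given in the text (51 compounds matching SA30, 21 of them mutagenic, 18 non-mutagenic compounds matching the exception). It assumes that all 18 exception compounds are among the 51 flagged by SA30. The variable names are illustrative only.

```python
def ppv(pos: int, tot: int) -> float:
    """Positive predictive value: positives among compounds flagged by the alert."""
    return pos / tot


sa30_total = 51      # TS compounds flagged by SA30
sa30_mutagenic = 21  # of these, experimentally mutagenic
exception_hits = 18  # O or N at position 7, all experimentally non-mutagenic

# Assumption: the 18 exception compounds are a subset of the 51 flagged ones,
# so applying the rule removes 18 false positives and no true positives.
remaining_total = sa30_total - exception_hits

ppv_before = ppv(sa30_mutagenic, sa30_total)
ppv_after = ppv(sa30_mutagenic, remaining_total)

print(f"PPV of SA30 alone:               {ppv_before:.2f}")
print(f"PPV of SA30 with exception rule: {ppv_after:.2f}")
```

Under that assumption the PPV rises from about 0.41 to about 0.64, because the rule only removes compounds that were false positives.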
This could be explained biologically: 3,4-epoxidation of coumarins could lead to DNA damage through covalent binding, whereas if the coumarin can be 7-hydroxylated, its metabolites are excreted in the urine as glucuronide and sulphate conjugates. The ISS model predicts 845 molecules in the ES as mutagenic because of the presence of SA30, but only 252 of these have a mutagenic outcome according to the IS. The remaining 593 compounds have a negative outcome, which for 473 of them is justified by the presence of the detoxifying fragment. Benfenati et al. (2015) had already encoded this rule and included it among those implemented in the ToxRead software [35]. The current analysis of coumarins not only confirms the rule but also boosts its statistical reliability, since it is now based on a dataset three times larger, and clarifies the biological reasons for this behaviour.

3.4.2 Glucoside fragment

In our assessment of the ES, the glucoside fragment (GF) was present in many compounds (2816): flavonols, iridoids, flavones, steroidal alkaloids, anthocyanidins, isoflavones and monosaccharides were the molecular groups most involved, each having more than 100 compounds with that fragment. We investigated the role of this fragment in the TS. Seventeen of the 94 compounds with at least one GF in their structure are experimentally mutagenic. Among molecules with more than one GF, the proportion of mutagenic compounds decreases: only three out of 33 compounds are mutagenic (a prevalence of 0.09). In particular, if we consider only flavonoids, all five compounds meeting this condition (GF>1) are experimentally non-mutagenic (Figure 6).
Figure 6. Glucoside fragment characterization: more than one ring, even not directly linked. R group could be an oxygen atom, an aliphatic carbon or an aromatic carbon.
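The proportions of mutagenic compounds for the glucoside fragment can be recomputed from the counts reported for the TS. The sketch below does so in Python, printing each value both as a fraction and as a percentage. The group labels are illustrative wording, not identifiers from the original analysis.

```python
# (group description, mutagenic compounds, total compounds) as reported for the TS
groups = [
    ("at least one GF", 17, 94),
    ("more than one GF", 3, 33),
    ("flavonoids with more than one GF", 0, 5),
]

# Fraction of experimentally mutagenic compounds in each group
rates = {label: mutagenic / total for label, mutagenic, total in groups}

for label, rate in rates.items():
    print(f"{label}: {rate:.2f} ({rate:.0%})")
```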
In the whole ES, GF>1 was found in 585 compounds: 29 of them were predicted mutagenic, mostly with low confidence, and 556 were predicted non-mutagenic, mostly with good confidence. The ISS model found SA60 in 518 compounds, while the IS gave negative outcomes for 461 of these 518 compounds, 118 of them with GF>1.

4. CONCLUSIONS

We describe a new integrated strategy combining several in silico methods to assess the (Ames) mutagenicity potential of ~18000 molecules found in plant extracts and used as ingredients in cosmetics. Most of them (14,894) were classified as negative, about two thirds (65%) with good confidence. In the workflow of this strategy, we combined several tools and optimized their integration. We calculated the positive predictive value of expert-based structural alerts on a large database of substances; this parameter could serve as one element for assessing their reliability in this specific context. Toxicological assessment based on this kind of SAs is biased toward the identification of toxic effects and tends to neglect non-toxic ones. We identified and statistically tested possible exception rules for two structural alerts (SA30 coumarins and furocoumarins, and SA60 flavonoids) that fired in a large number of compounds of the ES but gave a low PPV in the TS. This refinement was very useful to improve the outcomes of the Ames integrated assessment strategy and could be valuable to avoid false positives in the detection of substances for further scrutiny. We also developed a new classification model based on a large dataset of compounds to resolve any assessments that remain equivocal after application of the IS.

This information about the individual constituents of natural mixtures can be used to prioritize extracts whose chemical composition calls for thorough investigation of their mutagenic potential because of the presence of suspect constituents according to the in silico screening strategy. This approach is in line with what has been proposed in other contexts, such as the EFSA statement on genotoxicity assessment of chemical mixtures [36], where a component-based approach is recommended in the case of chemically fully defined mixtures. Read-across and (Q)SAR outcomes are mentioned there among the options to assess the mixture components individually and can assist in defining further strategies to conclude on the genotoxicity assessment.
Acknowledgements

We would like to thank Prof. Sylvie Michel and Dr. Hanh Dufat from the Laboratory of Pharmacognosy-UMR of the University Paris Descartes for their precious work defining molecular groups.

References

1. Regulation (EC) No 1223/2009 of the European Parliament and of the Council of 30 November 2009 on cosmetic products. http://data.europa.eu/eli/reg/2009/1223/oj
2. Plant Extracts in Skin Care Products, Special Issue Editors Beatriz P.P. Oliveira, Francisca Rodrigues. https://doi.org/10.3390/books978-3-03897-161-0
3. Regulation (EC) No 1907/2006 of the European Parliament and of the Council of 18 December 2006 concerning the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH), establishing a European Chemicals Agency, amending Directive 1999/45/EC and repealing Council Regulation (EEC) No 793/93 and Commission Regulation (EC) No 1488/94 as well as Council Directive 76/769/EEC and Commission Directives 91/155/EEC, 93/67/EEC, 93/105/EC and 2000/21/EC. http://data.europa.eu/eli/reg/2006/1907/oj
4. EFSA report: Guidance on tiered risk assessment for plant protection products for aquatic organisms in edge-of-field surface waters. https://www.efsa.europa.eu/it/efsajournal/pub/3290
5. EFSA Supporting Publications: Evaluation of the applicability of existing (Q)SAR models for predicting the genotoxicity of pesticides and similarity analysis related with genotoxicity of pesticides for facilitating of grouping and read across. DOI: 10.2903/sp.efsa.2019.EN-1598
6. Mombelli E, Ringeissen S. The computational prediction of toxicological effects in regulatory contexts. L’Actualité chimique. 2009; 335:52–59.
7. Fjodorova N, Novich M, Vrachko N et al (2008) Directions in QSAR modeling for regulatory uses in OECD member countries, EU and in Russia. J Environ Sci Health C 26:201–236.
8. Guidance on information requirements and chemical safety assessment. Chapter R.7a: Endpoint specific guidance. https://echa.europa.eu/documents/10162/13632/information_requirements_r7a_en.pdf/e4a2a18f-a2bd-4a04-ac6d-0ea425b2567f
9. Guidance on information requirements and chemical safety assessment. Chapter R.6: QSARs and grouping of chemicals. https://echa.europa.eu/documents/10162/13632/information_requirements_r6_en.pdf/77f49f81-b76d-40ab-8513-4f3a533b6ac9
10. OECD Environment Health and Safety Publications Series on Testing and Assessment No. 69 “OECD GUIDANCE DOCUMENT ON THE VALIDATION OF (QUANTITATIVE) STRUCTURE-ACTIVITY RELATIONSHIPS [(Q)SAR] MODELS”. http://www.oecd.org/officialdocuments/publicdisplaydocumentpdf/?doclanguage=en&cote=env/jm/mono(2007)2
11. ICH M7 (R1), 2017. Assessment and Control of DNA Reactive (Mutagenic) Impurities in Pharmaceuticals to Limit Potential Carcinogenic Risk. http://www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Multidisciplinary/M7/M7_R1_Addendum_Step_4_31Mar2017.pdf
12. Amberg, A., Beilke, L., Bercu, J., Bower, D., Brigo, A., Cross, K. P., Custer, L., Dobo, K., Dowdy, E., Ford, K. A., Glowienke, S., Van Gompel, J., Harvey, J., Hasselgren, C., Honma, M., Jolly, R., Kemper, R., Kenyon, M., Kruhlak, N., Leavitt, P., Miller, S., Muster, W., Nicolette, J., Plaper, A., Powley, M., Quigley, D. P., Reddy, M. V., Spirkl, H. P., Stavitskaya, L., Teasdale, A., Weiner, S., Welch, D. S., White, A., Wichard, J. and Myatt, G. J. (2016) Principles and procedures for implementation of ICH M7 recommended (Q)SAR analyses. Regul. Toxicol. Pharmacol. 77, 13–24.
13. OECD (1997), Test No. 471: Bacterial Reverse Mutation Test, OECD Guidelines for the Testing of Chemicals, Section 4, OECD Publishing, Paris, https://doi.org/10.1787/9789264071247-en
14. Hansen, K.; Mika, S.; Schroeter, T.; Sutter, A.; ter Laak, A.; Steger-Hartmann, T.; Heinrich, N.; Müller, K.-R. Benchmark Data Set for in Silico Prediction of Ames Mutagenicity. J. Chem. Inf. Model. 2009, 49, 2077–2081; Benchmark, T. http://doc.ml.tu-berlin.de/toxbenchmark/ (accessed 4/30/10)
15. Antonio Cassano, Giuseppa Raitano, Enrico Mombelli, Alberto Fernández, Josep Cester, Alessandra Roncaglioni, Emilio Benfenati. Evaluation of QSAR models for the prediction of Ames genotoxicity: a retrospective exercise on the chemical substances registered under the EU REACH regulation. J Environ Sci Health C Environ Carcinog Ecotoxicol Rev. 2014; 32(3): 273–298. doi: 10.1080/10590501.2014.938955
16. https://www.vegahub.eu/
17. T. Ferrari and G. Gini, An open source multistep model to predict mutagenicity from statistical analysis and relevant structural alerts, Chem. Cent. J. 4 Suppl. 1, S2 (2010), pp. 1–6.
18. J. Kazius, R. McGuire, and R. Bursi, Derivation and validation of toxicophores for mutagenicity prediction, J. Med. Chem. 48 (2005), pp. 312–320.
19. T. Ferrari, D. Cattaneo, G. Gini, N. Golbamaki Bakhtyari, A. Manganaro, and E. Benfenati, Automatic knowledge extraction from chemical structures: The case of mutagenicity prediction, SAR QSAR Environ. Res. 24 (2013), pp. 365–383.
20. http://toxtree.sourceforge.net
21. R. Benigni and C. Bossa, Structure alerts for carcinogenicity, and the Salmonella assay system: A novel insight through the chemical relational databases technology. Mutat. Res. 659 (2008), pp. 248–261.
22. A. Manganaro, F. Pizzo, A. Lombardo, A. Pogliaghi, and E. Benfenati, Predicting persistence in the sediment compartment with a new automatic software based on the k-Nearest Neighbor (k-NN) algorithm, Chemosphere 144 (2016), pp. 1624–1630.
23. Ames/QSAR international project, National Institute of Health Sciences in Japan. http://www.nihs.go.jp/dgm/amesqsar.html
24. T. Martin, User’s Guide for T.E.S.T. (Toxicity Estimation Software Tool), U.S. EPA/National Risk Management Research Laboratory/Sustainable Technology Division, Cincinnati, OH (2016). Available at https://www.epa.gov/sites/production/files/2016-05/documents/600r16058.pdf
25. https://www.qsartoolbox.org/it/
26. Mekenyan, O., Dimitrov, S., Serafimova, R., Thompson, E., Kotov, S., Dimitrova, N., and Walker, J. (2004) Identification of the structural requirements for mutagenicity by incorporating molecular flexibility and metabolic activation of chemicals I: TA100. Chem. Res. Toxicol. 17, 753–766.
27. Serafimova, R., Todorov, M., Pavlov, T., Kotov, S., Jacob, E., Aptula, A., and Mekenyan, O. (2007) Identification of the structural requirements for mutagenicity, by incorporating molecular flexibility and metabolic activation of chemicals. II. General Ames mutagenicity model. Chem. Res. Toxicol. 20, 662–676.
28. European Chemicals Agency. Registered substances; 2014. http://echa.europa.eu/web/guest/information-on-chemicals/registered-substances
29. http://www.life-caleidos.eu/pages/project.php
30. Cooper JA, Saracci R, Cole P. Describing the validity of carcinogen screening tests. Br J Cancer. 1979;39:87–89.
31. Dao P, Wang K, Collins C, Ester M, Lapuk A, Sahinalp SC. Optimally discriminative subnetwork markers predict response to chemotherapy. Bioinformatics. 2011;27:205–213.
32. Romualdo Benigni and Cecilia Bossa, Mechanisms of Chemical Carcinogenicity and Mutagenicity: A Review with Implications for Predictive Toxicology, Chem. Rev. 2011, 111, 2507–2536. dx.doi.org/10.1021/cr100222q
33. Romualdo Benigni and Cecilia Bossa, Structural Alerts of Mutagens and Carcinogens, Current Computer-Aided Drug Design, 2006, 2(2), 169–176.
34. Floris, M., Raitano, G., Medda, R., & Benfenati, E. Fragment Prioritization on a Large Mutagenicity Dataset. Molecular Informatics. DOI: 10.1002/minf.201600133
35. Emilio Benfenati, Serena Manganelli, Sabrina Giordano, Giuseppa Raitano & Alberto Manganaro (2015) Hierarchical Rules for Read-Across and In Silico Models of Mutagenicity, Journal of Environmental Science and Health, Part C, 33:4, 385–403, DOI: 10.1080/10590501.2015.1096881
36. Genotoxicity assessment of chemical mixtures, EFSA Journal 2019;17(1):5519, doi: 10.2903/j.efsa.2019.5519 https://www.efsa.europa.eu/en/efsajournal/pub/5519
Supplementary Material

Table S1. All combinations of the integrated strategy used during the ES assessment; if the IS outcome was not reliable, a prediction from the new SARpy model was provided.

| 2 consensus (TEST&VEGA) | Single models | OASIS OECD QSARToolbox | Outcome | Confidence |
|---|---|---|---|---|
| agree about nonmutagenicity | all models agree about nonmutagenicity | no alert found | NONMutagenic | good |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity | no alert found | NONMutagenic | good |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, exception rule | no alert found | NONMutagenic | good |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, ISS finds alert not reliable | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, ISS finds alert reliable (acc>=0.7) | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, no exception rule | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 5/7 models agree about nonmutagenicity, exception rule | no alert found | NONMutagenic | good |
| agree about nonmutagenicity | 5/7 models agree about nonmutagenicity, alert not reliable | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 5/7 models agree about nonmutagenicity, an alert not reliable, glucoside fragments | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 5/7 models agree about nonmutagenicity | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 5/6 models agree about nonmutagenicity | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 5/7 models agree about nonmutagenicity but ISS finds alert reliable (acc>=0.7) | no alert found | SARpy | low |
| agree about nonmutagenicity | 5/7 models agree about nonmutagenicity but ISS finds alert reliable (acc>=0.7) but ADI=0 or very low | no alert found | NONMutagenic | moderate or low |
| agree about nonmutagenicity | 4/7 models agree about nonmutagenicity, exception rule | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 4/7 models agree about nonmutagenicity but ISS finds alert reliable (acc>=0.7) | no alert found | SARpy | low |
| agree about nonmutagenicity | 4/7 models agree about nonmutagenicity but ISS finds alert reliable (acc>=0.7), ADI 0 or very low | no alert found | NONMutagenic | moderate or low |
| agree about nonmutagenicity | 4/7 models agree about nonmutagenicity | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 4/7 models agree about nonmutagenicity, ISS finds alert not reliable | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 4/6 models agree about nonmutagenicity | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 3/7 models agree about nonmutagenicity | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 3/7 models agree about nonmutagenicity but ISS finds alert reliable (acc>=0.7) | no alert found | SARpy | low |
| agree about nonmutagenicity | 3/7 models agree about nonmutagenicity but ISS finds alert reliable (acc>=0.7), ADI 0 or very low | no alert found | NONMutagenic | moderate or low |
| | two models do not provide a prediction | no alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | all models agree about nonmutagenicity | alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity | alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, ISS finds an alert not reliable | alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, ISS finds alert reliable but the ADI value is 0 | alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, exception rule | alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, no exception rule | alert found | NONMutagenic | low |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, glucoside fragments | alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 6/7 models agree about nonmutagenicity, ISS finds alert reliable (acc>=0.7) | alert found | SARpy | low |
| agree about nonmutagenicity | 5/7 models agree about nonmutagenicity | alert found | NONMutagenic | low |
| agree about nonmutagenicity | 5/7 models agree about nonmutagenicity, ISS finds an alert not reliable and ADI 0 or glucoside fragments or exception rule | alert found | NONMutagenic | moderate |
| agree about nonmutagenicity | 5/7 models agree about nonmutagenicity, ISS finds alert reliable (acc>=0.7) | alert found | SARpy | low |
| agree about nonmutagenicity | 4/7 models agree about nonmutagenicity | alert found | SARpy | low |
| agree about nonmutagenicity | 4/7 models agree about nonmutagenicity, ISS finds alert reliable (acc>=0.7) | alert found | Mutagenic | moderate or low |
| agree about nonmutagenicity | 3/7 models agree about nonmutagenicity | alert found | Mutagenic | low |
| agree about nonmutagenicity | 3/7 models agree about nonmutagenicity, ISS finds alert reliable (acc>=0.7) | alert found | Mutagenic | moderate or low |
| TEST provides no prediction | all models of VEGA agree about non-mutagenicity | alert found | SARpy | low |
| TEST provides no prediction | all models of VEGA agree about non-mutagenicity | no alert found | NONMutagenic | moderate |
| TEST provides no prediction, VEGA consensus predicts mutagenic | 3/4 models of VEGA predict mutagenic (ISS finds alert reliable) | no alert found | Mutagenic | moderate |
| TEST provides no prediction, VEGA consensus predicts mutagenic | 2/4 models of VEGA predict mutagenic (ISS finds alert reliable) | no alert found | SARpy | low |
| TEST provides no prediction, VEGA consensus predicts non-mutagenic | 2/4 models of VEGA predict mutagenic (ISS finds alert reliable) | no alert found | SARpy | low |
| TEST provides no prediction, VEGA consensus predicts non-mutagenic | 3/4 models of VEGA predict non-mutagenic, the other model provides no prediction | no alert found | NONMutagenic | low |
| TEST provides no prediction, VEGA consensus predicts non-mutagenic | 3/4 models of VEGA predict non-mutagenic (ISS finds alert reliable) | no alert found | SARpy | low |
| TEST provides no prediction, VEGA consensus predicts non-mutagenic | 3/4 models of VEGA predict mutagenic | no alert found | SARpy | low |
| TEST provides no prediction, VEGA consensus predicts non-mutagenic | 1/4 models of VEGA predict mutagenic | no alert found | NONMutagenic | low |
| agree about mutagenicity | all models agree about mutagenicity | alert found | Mutagenic | good |
| agree about mutagenicity | 6/7 models agree about mutagenicity | alert found | Mutagenic | good |
| agree about mutagenicity | 5/6 models agree about mutagenicity | alert found | Mutagenic | good |
| agree about mutagenicity | 5/7 models agree about mutagenicity | alert found | Mutagenic | good |
| agree about mutagenicity | 4/7 models agree about mutagenicity | alert found | Mutagenic | good |
| agree about mutagenicity | all models agree about mutagenicity | no alert found | Mutagenic | good |
| agree about mutagenicity | 5/6 models agree about mutagenicity (ISS finds alert reliable, acc>=0.7) | no alert found | Mutagenic | good |
| agree about mutagenicity | 5/7 models agree about mutagenicity (ISS finds alert reliable, acc>=0.7) | no alert found | Mutagenic | good |
| agree about mutagenicity | 6/7 models agree about mutagenicity (ISS finds alert reliable) | no alert found | Mutagenic | good |
| agree about mutagenicity | 6/7 models agree about mutagenicity | no alert found | Mutagenic | moderate |
| agree about mutagenicity | 6/7 models agree about mutagenicity, exception rule | no alert found | Mutagenic | moderate |
| agree about mutagenicity | 5/7 models agree about mutagenicity | no alert found | Mutagenic | moderate |
| agree about mutagenicity | 5/7 models agree about mutagenicity (ISS finds alert reliable) | no alert found | Mutagenic | good |
| agree about mutagenicity | 5/7 models agree about mutagenicity (ISS finds alert reliable but the ADI value is 0) | no alert found | Mutagenic | moderate |
| agree about mutagenicity | 5/7 models agree about mutagenicity | no alert found | Mutagenic | moderate |
| agree about mutagenicity | 5/7 models agree about mutagenicity, no exception rule | no alert found | Mutagenic | good |
| agree about mutagenicity | 4/7 models agree about mutagenicity, exception rule, ISS find alert reliable but ADI=0 | no alert found | Mutagenic | moderate |
| agree about mutagenicity | 4/7 models agree about mutagenicity, no exception rule | no alert found | Mutagenic | good |
| agree about mutagenicity | 4/7 models agree about mutagenicity (ISS finds alert reliable acc>=0.7) | no alert found | Mutagenic | good |
| agree about mutagenicity | 4/6 models agree about mutagenicity | no alert found | Mutagenic | moderate |
| agree about mutagenicity | 3/7 models agree about mutagenicity | no alert found | Mutagenic | moderate |
| agree about mutagenicity | 3/7 models agree about mutagenicity; exception rule | no alert found | Mutagenic | low |
| disagree | 6/7 models agree about nonmutagenicity | no alert found | NONMutagenic | moderate |
| disagree | 1/7 model predicts mutagenic | no alert found | NONMutagenic | low |
| disagree | 1/6 model predicts mutagenic | no alert found | SARpy | low |
| disagree | 4/6 models predict mutagenicity (ISS finds alert reliable) | alert found | Mutagenic | moderate |
| disagree | 4/6 models predict mutagenicity (ISS finds alert reliable) | no alert found | SARpy | low |
| disagree | 3/7 models predict mutagenicity, exception rule | no alert found | NONMutagenic | low |
| disagree | 2/7 models predict mutagenic | alert found | SARpy | low |
| disagree | 6/7 models predict mutagenic (ISS finds alert reliable) | alert found | Mutagenic | moderate |
| disagree | 5/7 models predict mutagenic (ISS finds alert reliable, acc>=0.7) | no alert found | Mutagenic | moderate |
| disagree | 5/7 models predict mutagenic (ISS finds alert reliable) but ADI=0 or exception rule | alert found | Mutagenic | low |
| disagree | 6/7 models predict mutagenic (ISS finds alert reliable) | no alert found | Mutagenic | moderate |
| disagree | 5/7 models predict mutagenic | no alert found | SARpy | low |
| disagree | 3/7 models predict mutagenic (ISS finds alert reliable) | alert found | Mutagenic | low |
| disagree | 3/7 models predict mutagenic (ISS finds an alert not reliable) | alert found | SARpy | low |
| disagree | 4/7 models predict mutagenic | alert found | Mutagenic | low |
| disagree | 4/7 models predict mutagenic; ISS finds an alert not reliable | alert found | Mutagenic | low |
| disagree | 4/7 models predict mutagenic (ISS finds alert reliable) | alert found | Mutagenic | moderate |
| disagree | 4/7 models predict mutagenic, no exception rule | alert found | Mutagenic | moderate |
| disagree | 4/7 models predict mutagenic | no alert found | SARpy | low |
| disagree | 4/7 models predict mutagenic (ISS finds alert reliable, acc>=0.7) | no alert found | Mutagenic | low |
| disagree | 4/7 models predict mutagenic, exception rule, no alert found by QSAR Toolbox | no alert found | SARpy | low |
| disagree | 3/6 models predict mutagenic | no alert found | SARpy | low |
| disagree | 3/7 models predict mutagenic, no exception rule | no alert found | SARpy | low |
| disagree | 3/7 models predict mutagenic | alert found | SARpy | low |
| disagree | 3/7 models predict mutagenic (ISS finds alert reliable) | alert found | Mutagenic | low |
| disagree | 3/7 models predict mutagenic (ISS finds alert reliable) | no alert found | SARpy | low |
| disagree | 3/7 models predict mutagenic | no alert found | SARpy | low |
| disagree | 3/7 models predict mutagenic, exception rule | no alert found | NONMutagenic | low |
| disagree | 2/7 models predict mutagenic | no alert found | SARpy | low |
| disagree | 2/7 models predict mutagenic, no exception rule | no alert found | SARpy | low |
| disagree | 2/7 models predict mutagenic, exception rule | no alert found | NONMutagenic | low |
| disagree | 2/6 models predict mutagenic | no alert found | SARpy | low |
| disagree | two models do not provide a prediction | no alert found | SARpy | low |
| Only VEGA prediction is available, low score | | | SARpy | low |
| Only one model of TEST predicts. VEGA consensus and that single model agree about mutagenicity | 2/4 VEGA models predict mutagenic (ISS finds alert reliable) | no alert found | Mutagenic | low |
| Only one model of TEST predicts. VEGA consensus and that single model agree about nonmutagenicity | All VEGA models predict nonmutagenic | no alert found | NONMutagenic | moderate |
| Only one model of TEST predicts. VEGA consensus and that single model disagree | | no alert found | SARpy | low |
| agree about mutagenicity | 2/7 models agree about mutagenicity | no alert found | SARpy | low |
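The combinations in Table S1 act as a lookup from the agreement pattern of the models to an outcome and a confidence level, with the SARpy model used where the table lists "SARpy" as the outcome. The following is a minimal Python sketch of that idea. It encodes only four rows of the table, and the names `RULES`, `Assessment` and `assess`, as well as the boolean encoding of the OASIS alert column, are illustrative assumptions rather than the authors' implementation.

```python
from typing import NamedTuple


class Assessment(NamedTuple):
    outcome: str
    confidence: str


# Key: (consensus of TEST and VEGA, single-model summary, OASIS alert found?)
# Only four of the combinations listed in Table S1 are encoded in this sketch.
RULES = {
    ("agree about nonmutagenicity", "all models agree about nonmutagenicity", False):
        Assessment("NONMutagenic", "good"),
    ("agree about nonmutagenicity", "all models agree about nonmutagenicity", True):
        Assessment("NONMutagenic", "moderate"),
    ("agree about mutagenicity", "6/7 models agree about mutagenicity", True):
        Assessment("Mutagenic", "good"),
    ("agree about nonmutagenicity", "4/7 models agree about nonmutagenicity", True):
        Assessment("SARpy", "low"),
}


def assess(consensus: str, single_models: str, alert_found: bool,
           sarpy_prediction: str) -> Assessment:
    """Look up one combination; use the SARpy prediction where the table defers to it."""
    result = RULES.get((consensus, single_models, alert_found))
    if result is None:
        raise KeyError("combination not encoded in this sketch")
    if result.outcome == "SARpy":
        # The table defers to the SARpy model but keeps the listed confidence
        return Assessment(sarpy_prediction, result.confidence)
    return result


print(assess("agree about nonmutagenicity",
             "4/7 models agree about nonmutagenicity", True, "Mutagenic"))
```

A plain dictionary keeps each rule traceable to one row of the table, which matters more here than compactness.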
Table S2. SMARTS strings used to detect all flavonoids in TS

| Molecular groups | SMARTS |
|---|---|
| Flavones + flavonols | Oc1cc(O)c2C(=O)C=C(Oc2c1)c3ccc(O)[$([cH]),$(cO)]c3 |
| Flavanones + dihydroflavonols + flavan-3-ols + flavan-3,4-diols | Oc1cc(O)c2[$([CH2]),$(C=O),$(CO)]CC(Oc2c1)c3ccc(O)[$([cH]),$(cO)]c3 |
| Chalcones | Oc1ccc(C(=O)C=Cc2ccc(O)[$([cH]),$(cO)]c2)c(O)c1 |
| Aurones | Oc1ccc(C=C2Oc3cc(O)ccc3C2=O)cc1 |
| Anthocyanidins | Oc1cc(O)c2C=CC(=[O+]c2c1)c3ccc(O)c(Cl)c3 |
Declaration of interests
☐ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
☒The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:
The employers of some authors (Mario Negri Institute, Greenpharma and UNITIS) received payment from the Botanical alliance consortium since they participated in the first phase of the NCStox project. This made it possible to develop a predictive database determining the toxicological profile of more than 18000 natural compounds used in cosmetics.
Graphical abstract
Highlights
The aim was to assess the genotoxic potential of a large dataset of compounds by integrating several in silico models
The integration strategy was tuned and applied to ~ 18000 molecules found in plant extracts used in cosmetics
A new SAR model was developed in Python, based on active and inactive rules
The in silico methods were refined