Expert QSAR system for predicting the bioconcentration factor under the REACH regulation


Environmental Research 148 (2016) 507–512

Contents lists available at ScienceDirect

Environmental Research journal homepage: www.elsevier.com/locate/envres

Short communication

Francesca Grisoni a,b,*, Viviana Consonni a,b, Marco Vighi a,1, Sara Villa a, Roberto Todeschini a,b

a University of Milano-Bicocca, Dept. of Earth and Environmental Sciences, Milano, Italy
b Milano Chemometrics and QSAR Research Group, Milano, Italy

Article info

Article history: Received 1 February 2016. Received in revised form 20 April 2016. Accepted 25 April 2016. Available online 4 May 2016.

Keywords: QSAR, Bioaccumulation, BCF, Expert system, TGD, VEGA

Abstract

Expert systems are a rational integration of several models that generally aim to exploit their advantages and overcome their drawbacks. This work is founded on our previously published Quantitative Structure-Activity Relationship (QSAR) classification scheme, which detects compounds whose Bioconcentration Factor (BCF) is (1) well predicted by the octanol-water partition coefficient (KOW), (2) underestimated by KOW or (3) overestimated by KOW. The classification scheme served as the starting point to identify and combine the best BCF model for each class among three VEGA models and one KOW-based equation. The rationalized model integration showed stability and surprising performance on unknown data when compared with benchmark BCF models. Model simplicity, transparency and mechanistic interpretation were fostered in order to allow for its application and acceptance within the REACH framework. © 2016 Elsevier Inc. All rights reserved.

1. Introduction

Quantitative Structure-Activity Relationship (QSAR) methodology has gained much attention since the advent of the European REACH Regulation (EC 1907/2006), as it defines a relationship between the structural features of molecules, encoded within molecular descriptors (Todeschini and Consonni, 2009), and their biological properties. REACH places QSAR on the same footing as experimental tests, provided that it supplies the same level of information. Furthermore, as stated by the European Chemicals Agency, a QSAR “is associated with an underlying dataset. As a representation of this dataset, the model averages the uncertainty over all chemicals. Thus, it is possible for an individual model estimate to be more accurate than an individual measurement” (ECHA, 2008), which further encourages the use of QSAR as an alternative to animal testing.

Some hazardous compounds, such as Persistent Organic Pollutants (POPs), can bioaccumulate within organisms, reaching higher concentrations than those measured in the environment (Gobas and Morrison, 2000). Bioaccumulation at each trophic level (biomagnification) exposes top predators to high concentrations of xenobiotics, whose effects may manifest only in later phases of life or after generations (van der Oost et al., 2003). Thus, even without detectable acute or chronic effects, bioaccumulation is a hazard in itself, which is regulated by many national and international frameworks. Authorities mainly rely on the Bioconcentration Factor (BCF), a laboratory measure of the ratio between the concentration within the organism (usually fish) and that in water at steady state, when only non-dietary exposure occurs. Since BCF measurement is expensive and requires more than 100 animals (HESI, 2006), experimental values are available for only 4% of compounds (Arnot and Gobas, 2006). In this scenario, QSAR methods play a crucial role in data-gap filling.

QSARs for BCF mainly rely on the octanol-water partition coefficient (KOW) or related parameters. KOW predicts lipid-driven bioconcentration well, but it underestimates or overestimates the BCF when interactions with non-lipid tissues or metabolism occur (Grisoni et al., 2015b). In our previous work (Grisoni et al., 2016), we developed a QSAR scheme to classify compounds as (1) predicted well by KOW because of lipid storage, (2) underestimated by KOW because of additional storage within non-lipid tissues and/or specific interactions, or (3) overestimated by KOW because of biotransformation/elimination.

In this work, the classification scheme was applied to predict the BCF for regulatory purposes, as the basis to choose the optimal model for each class. Four BCF models, one based on KOW (TGD, Technical Guidance Document - European Commission) and three well-established VEGA models (IRFMN, 2015), were analysed for their performance on each predicted class and then rationally combined. Attention was paid to the factors that influence the regulatory acceptance of models, such as (1) proper validation, (2) simplicity and (3) transparency (Jaworska et al., 2003; OECD, 2007).

* Corresponding author at: University of Milano-Bicocca, Dept. of Earth and Environmental Sciences, Milano, Italy. E-mail address: [email protected] (F. Grisoni).
1 Present address: IMDEA Water Institute, Alcalà de Henares, Madrid, Spain.

http://dx.doi.org/10.1016/j.envres.2016.04.032
0013-9351/© 2016 Elsevier Inc. All rights reserved.

2. Materials and methods

2.1. Dataset

The starting point was a dataset of fish BCF values for 1056 chemicals (Grisoni et al., 2015b), which collects the training and test sets of the models plus additional external data. In particular, 473 compounds derive from the CAESAR dataset, 662 compounds from the Meylan dataset and 832 compounds from the Read-Across dataset, while 45 additional compounds were gathered from highly reliable literature sources. The final dataset was manually curated in terms of molecular structures and experimental BCF values, as explained in the original manuscript.

2.2. Class prediction

The class was predicted using the consensus classification scheme proposed in the original paper (Grisoni et al., 2016). The scheme combines two classification trees: the first one identifies compounds that are underestimated by KOW, while the second one identifies compounds that are overestimated by KOW. Each model has its own applicability domain (AD), calculated on the basis of the range of the molecular descriptors. Predictions were combined in a consensus manner, i.e., by assigning a compound to a class if and only if both models agreed. No class was assigned to compounds that (a) were outside the AD of at least one model, or (b) were predicted with disagreement by the two sub-models. The following notation will be used throughout the manuscript: (1) class 1 - compounds well predicted by KOW, (2) class 2 - compounds underestimated by KOW (specific interactions/non-lipid storage), (3) class 3 - compounds overestimated by KOW (metabolism/elimination).
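The consensus rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the boolean inputs and the way "agreement" is encoded (a compound flagged by both trees counts as a disagreement) are hypothetical and are not taken from the original classification scheme, whose trees and descriptors are described in Grisoni et al. (2016).

```python
from typing import Optional


def consensus_class(
    tree1_underestimated: bool,  # hypothetical output of the first tree
    tree2_overestimated: bool,   # hypothetical output of the second tree
    in_ad_tree1: bool,           # compound inside the AD of the first tree
    in_ad_tree2: bool,           # compound inside the AD of the second tree
) -> Optional[int]:
    """Return the mechanistic class (1, 2 or 3) or None if no class is assigned."""
    # (a) no class outside the applicability domain of at least one tree
    if not (in_ad_tree1 and in_ad_tree2):
        return None
    # (b) assumed encoding of disagreement: both trees flag the compound
    if tree1_underestimated and tree2_overestimated:
        return None
    if tree1_underestimated:
        return 2  # underestimated by KOW
    if tree2_overestimated:
        return 3  # overestimated by KOW
    return 1      # well predicted by KOW


if __name__ == "__main__":
    print(consensus_class(False, False, True, True))
    print(consensus_class(True, True, True, True))
```

Returning None rather than a default class mirrors the text: compounds outside the AD or predicted with disagreement simply receive no class.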

3. Results and discussion

3.1. BCF prediction

In our previous work, the lipid-driven BCF was predicted from experimental KOW values using the Technical Guidance Document (TGD) approach (European Commission, 2003). The TGD approach combines four equations (three linear and one quadratic) on the basis of the KOW range (see Supporting Information, SI). Since experimental KOW values are often unavailable for new chemicals, here the values were predicted by the VEGA KOWWIN model (Meylan and Howard, 1995). VEGA (IRFMN, 2015) is a freely available platform of QSAR models for regulatory purposes. It assesses the applicability domain (i.e. the chemical space where the model predictions are reliable) through the Applicability Domain Index (ADI), which ranges from 0 (predictions unreliable) to 1 (maximum reliability). VEGA KOW models are very accurate (Cappelli et al., 2015) and, among them, KOWWIN proved to be the best alternative in a preliminary analysis (see SI).

In addition to TGD, we focused on three well-established VEGA models for BCF: (1) VEGA CAESAR (Benfenati, 2010; Zhao et al., 2008), which combines two sub-models based on theoretical molecular descriptors, (2) VEGA Meylan (Meylan et al., 1999), a KOW-based model with correction factors according to molecular fragments, and (3) VEGA Read-Across (Floris et al., 2014), which relies on a similarity approach. A detailed model description is provided in Grisoni et al. (2015b). We preferred VEGA models to other benchmarks, such as the Meylan model implemented in EPI Suite (U.S. EPA, 2000), because they integrate an AD assessment, which is essential for the regulatory application of QSARs (OECD, 2007) and improves the prediction accuracy (Grisoni et al., 2015b).

Each model was tested on (1) its training and test sets and (2) an external set, obtained by excluding from the starting dataset the molecules used to train/validate the model. The external set served to compare the model performance on unknown data. We acknowledge that using compounds external to all the models would allow for an objective comparison. However, only 45 compounds are external to all the models and, in the majority of cases, only a few of them are within the model AD. Furthermore, to the best of our knowledge, the dataset used is among the largest available in the literature and additional data are limited. Since a large number of compounds is fundamental for obtaining robust and reliable statistics, we decided to use all the external molecules available. This choice was supported by a comparison of the external sets in terms of their structural characteristics and BCF value distributions, which revealed that they are very similar, thus allowing for a less biased comparison (see SI).

Only reliable predictions (ADI > 0.75) were considered. For the TGD equation, which lacks a proper training/test set and AD assessment, the training and test sets of the classification scheme were used and the ADI of the KOW model was considered (ADI > 0.75). Prediction accuracy was quantified through the Root Mean Square Error (RMSE):

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2} \tag{1} \]

where $y_i$ and $\hat{y}_i$ are the experimental and predicted logBCF values of the i-th compound, respectively, and n is the number of compounds. RMSE represents the mean model error and is expressed in the same units as logBCF. For all the datasets, the model accuracy was tested on (a) all the compounds and (b) each class (Table 1).

The results of TGD confirm that class 1 compounds are predicted well by KOW, even when predicted KOW values are used. The RMSE is appreciably lower on class 1 than on all compounds and on classes 2 and 3. Moreover, on the external set, TGD on class 1 is always more accurate than the VEGA models on all compounds. Finally, as TGD lacks a proper training and test set, its stability on all the class 1 sets is surprising. This suggests that the classification scheme can serve as a preliminary filter to detect class 1 compounds, for which KOW can be used without the need for added complexity.

For VEGA models, in all cases, the predictions on at least two of the classes are better than on all compounds, justifying the development of an expert system. On a class basis, no model always outperforms the others. Nonetheless, some models behave similarly on all the datasets, keeping a constantly low RMSE: TGD on class 1 (RMSE from 0.52 to 0.54), Meylan on class 2 (from 0.43 to 0.52) and Read-Across on class 3 (from 0.46 to 0.55). In all of the other cases, the RMSE has a larger variance.

On class 1 compounds, the TGD approach was chosen as the optimum, as its simplicity outweighs the slightly better performance of the other models on their training/test sets. For classes 2 and 3, the model comparison was refined through a recently proposed multi-criteria decision-making technique, the wR-Hasse (Grisoni et al., 2015a). Hasse diagrams set an order of preference between alternatives on the basis of their criteria values and represent the ordering with a graph (see Brüggemann and Carlsen, 2006). The wR-Hasse method allows (1) weighting the criteria according to their relevance and (2) overcoming some representation problems in the case of many conflicting criteria. We considered the predictivity (i.e., RMSE) towards new compounds as the most important feature for regulatory application, and set the relevance as External > Test > Training. Moreover, in addition to the RMSE, we also took into account the percentage of compounds within the AD as a secondary parameter, which should be as high as possible. Details about the weighting scheme are provided in SI. The obtained wR-Hasse diagrams (Fig. 1) identify the best models as the most stable ones: Meylan for class 2 and Read-Across for class 3. These outcomes served as the starting basis for the expert system development.

Table 1. Statistics of the BCF models on each predicted class (All: all compounds; 1: well predicted by KOW; 2: underestimated by KOW; 3: overestimated by KOW; Out: not predicted by the classification scheme); n = number of compounds, nin = number of compounds within the AD.

                   Training            Test                External
Model        Class     n   nin  RMSE       n   nin  RMSE       n   nin  RMSE
TGD          All     584   495  0.82     195   159  0.84     277   104  1.18
             1       247   230  0.52      77    64  0.54      54    27  0.54
             2       114    91  0.74      30    27  0.71      30    11  0.97
             3       165   125  1.13      60    47  0.98     149    57  1.46
             Out      58    49  1.12      28    21  1.27      44     9  0.77
CAESAR       All     378   334  0.43      95    86  0.63     583   162  1.33
             1       166   160  0.39      49    41  0.43     164    71  1.23
             2        64    53  0.42      10     8  0.56     101    29  1.36
             3       124   103  0.50      33    25  0.41     217    53  1.49
             Out      24    18  0.46       3     1  0.70     101     9  0.53
Meylan       All     516   389  0.41     146   101  0.45     394    97  0.64
             1       222   195  0.37      73    61  0.44      88    32  0.56
             2        71    55  0.43      34    20  0.51      73    20  0.45
             3       171   104  0.46      29    17  0.46     175    35  0.74
             Out      52    35  0.39      10     3  0.35      58    10  0.76
Read-Across  All     686   584  0.50     173   146  0.53     197    98  0.66
             1       253   230  0.45      62    61  0.47      65    44  0.73
             2       112    94  0.46      25    18  0.52      39    12  0.65
             3       233   195  0.53      63    51  0.55      78    36  0.51
             Out      88    65  0.60      23    16  0.68      15     6  1.06

Fig. 1. wR-Hasse Diagrams for the BCF models on: (a) class 2, (b) class 3. Graph vertices are the BCF models, while the edges represent their ordering, from the best alternative to the worst.

3.2. Expert system development

The previous results allowed us to rationalize the expert system. It comprises a first step, in which each compound is assigned to one of the three classes using the proposed classification scheme. Then, according to the predicted class, the optimal model (identified previously) is used: (1) TGD approach for
class 1, (2) VEGA Meylan for class 2, and (3) VEGA Read-Across for class 3 (Fig. 2). The expert system is more stable than the individual models (Table 2). Using a complex model on class 1 could lead to a more satisfying performance on training and test sets; however, one of our goals was to keep the system as simple and interpretable as possible. As class 1 is associated with lipid-driven bioconcentration, the use of KOW is reasonable and easy to explain.

Our major objective was to provide a model that is accurate on unknown data, as the relevance of QSARs is at its highest for filling data gaps. The obtained expert system showed remarkable accuracy on external data, with a lower RMSE (up to 0.82 log units lower) than the original models (Table 2, Fig. 3). When considering the 45 compounds external to all the models, the expert system has a very low RMSE (0.56). The Meylan model is the only one with a (slightly) better RMSE (0.53), but it has more than 50% fewer compounds within the AD.

Moreover, a major advantage of the expert system is the mechanistic interpretation associated with each assigned class. In particular, class 1 compounds (e.g. polybrominated biphenyls) are stored within lipid tissues and are generally predicted well on a KOW basis. On the contrary, class 2 compounds (e.g. perfluorinated alkyl acids) are likely to show a BCF larger than expected from their affinity for lipids, due to increased interactions with organism tissues. Thus, they should be regarded as a class of increased hazard to humans and ecosystems. Finally, class 3 compounds (e.g. synthetic pyrethroids) could be overestimated due to metabolism/excretion processes and have, consequently, a reduced hazard. In our opinion, when dealing with class 3 compounds, other weight-of-evidence information should be considered to pursue a cautionary approach. Interested readers can find a detailed analysis and mechanistic interpretation of the classes in Grisoni et al. (2016).

Fig. 2. Simplified scheme of the final expert system. After assigning a mechanistic class, a model is used as follows: (1) class 1 compounds are predicted with a KOW-based model (TGD), (2) class 2 compounds (having an increased BCF with respect to that based on KOW) are predicted using VEGA Meylan, (3) class 3 compounds (having a BCF lower than expected on a KOW basis) are predicted using Read-Across.

Table 2. Statistics of the expert system in comparison with the benchmark models; n = number of compounds external to the model, nin = number of compounds within the model AD. Numbers in parentheses represent the statistics on the 45 compounds that are external to all the tested methods. For the expert system, the training/test/external sets were built in agreement with the sub-models used, i.e. by summing (1) TGD molecules belonging to class 1, (2) Meylan molecules belonging to class 2 and (3) Read-Across molecules belonging to class 3.

                Training            Test                External
Model               n   nin  RMSE       n   nin  RMSE   n          nin        RMSE
TGD               584   495  0.82     195   159  0.84   277 (45)   104 (25)   1.18 (1.03)
CAESAR            378   334  0.43      95    86  0.63   583 (45)   162 (10)   1.33 (1.05)
Meylan            516   389  0.41     146   101  0.45   394 (45)    97 (4)    0.64 (0.53)
Read-Across       686   584  0.50     173   146  0.53   197 (45)    98 (13)   0.66 (0.65)
Expert system     551   480  0.52     174   135  0.54   205 (45)    83 (9)    0.51 (0.56)

Fig. 3. Experimental vs predicted logBCF for (a) the expert system, (b) TGD, (c) CAESAR, (d) Meylan, (e) Read-Across. The solid line represents the perfect fit (Experimental = Predicted); dashed lines represent ± 0.4 log units from the perfect fit (comparable with the experimental variability). For each model, only the external data within the AD were plotted (Table 2).

3.3. Regulatory application

From a regulatory point of view, it is essential to classify new chemicals in terms of safety. In this sense, the classification of compounds as non-bioaccumulative (nB), bioaccumulative (B) and very bioaccumulative (vB) substances is very useful; REACH separates these categories through the logBCF thresholds of 3.3 and 3.7 log units, respectively. In particular, the correct classification of chemicals with BCF values near the border is a critical issue, especially in the case of underestimated bioaccumulative
compounds. For this reason, each benchmark model and the expert system were evaluated for their ability to identify bioaccumulative and very bioaccumulative compounds according to the above-mentioned thresholds. In particular, for each class, the classification accuracy was quantified through Sensitivity (Sn), Specificity (Sp) and Non-Error Rate (NER), defined as follows:

\[ \mathrm{Sn} = \frac{TP}{TP + FN}, \qquad \mathrm{Sp} = \frac{TN}{TN + FP}, \qquad \mathrm{NER} = \frac{\mathrm{Sn} + \mathrm{Sp}}{2} \tag{2} \]

where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives for each class, respectively. Moreover, we calculated the frequency of two types of errors: (1) type I, corresponding to the assignment of a non-bioaccumulative compound to the B or vB classes, and (2) type II, corresponding to the classification of a B or vB compound as nB. Theoretically, for a regulatory assessment, an optimal model should be characterized by (1) a high capability of correctly identifying compounds not belonging to the nB class (high SpnB) and (2) a high capability of correctly identifying B and vB compounds (high SnB and SnvB). In particular, the correct classification of B compounds is a critical issue, as they are close to the border.

Table 3. Statistics of the models on external data for non-bioaccumulative (nB), bioaccumulative (B) and very bioaccumulative (vB) compounds. Sensitivity (Sn), Specificity (Sp), Non-Error Rate (NER), and the percentage of compounds belonging to each class (%) are reported. Type I error = percentage of nB compounds classified as B or vB; type II error = percentage of B and vB compounds classified as nB.

Statistics              TGD   CAESAR   Meylan   Read-Across   Expert system
n                       104      162       97            98              83
RMSE                   1.18     1.33     0.64          0.66            0.51
nB   %                   70       69       60            65              63
     Sn                0.62     0.97     0.88          0.84            0.79
     Sp                0.84     0.51     0.74          0.68            0.81
     NER               0.73     0.74     0.81          0.76            0.80
B    %                   18        7       12            13              12
     Sn                0.16     0.08     0.25          0.38            0.50
     Sp                0.92     0.98     0.91          0.82            0.79
     NER               0.54     0.53     0.58          0.60            0.65
vB   %                   12       24       28            21              25
     Sn                0.92     0.49     0.81          0.62            0.67
     Sp                0.64     0.95     0.96          1.00            0.97
     NER               0.78     0.72     0.89          0.81            0.82
Type I error (%)         38        3       12            16              21
Type II error (%)        16       49       26            32              19

When observing the benchmark models (Table 3), one can note that they are generally characterized by unbalanced performance. CAESAR, Meylan and Read-Across have high Sn values for nB compounds (SnnB > 0.84), but their ability to recognize B and vB compounds is limited. On the contrary, the TGD has high Sp values for nB and vB compounds, but its performance on bioaccumulative compounds is low. When observing the ability to correctly identify hazardous compounds (B and vB) and non-hazardous compounds (nB), as represented by type I and II errors, one can note that the benchmark models are unbalanced towards one of the two errors. The TGD model tends to misclassify many nB compounds (38%) into the B and vB classes. This large number of false positives (52% of the compounds identified as bioaccumulative are actually not) hampers the usefulness of TGD for identifying small sets of chemicals to be tested with priority and for rationalizing the use of animals.
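To make the definitions of Eq. (2) and of the two error types concrete, the following Python sketch labels logBCF values with the 3.3 and 3.7 thresholds quoted in the text and computes Sn, Sp, NER and the type I/II error percentages. It is a minimal illustration, not the authors' code: the function names are hypothetical, the treatment of values exactly on a threshold (assigned to the higher category) is an assumption, and the example values are invented solely to exercise the functions.

```python
from typing import Dict, List, Tuple

B_THRESHOLD = 3.3   # logBCF threshold between nB and B (from the text)
VB_THRESHOLD = 3.7  # logBCF threshold between B and vB (from the text)


def label(log_bcf: float) -> str:
    """Assign nB, B or vB; boundary values go to the higher class (assumption)."""
    if log_bcf >= VB_THRESHOLD:
        return "vB"
    if log_bcf >= B_THRESHOLD:
        return "B"
    return "nB"


def class_stats(true: List[str], pred: List[str], cls: str) -> Dict[str, float]:
    """Sn, Sp and NER for one class, treating that class as the positive one."""
    pairs = list(zip(true, pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    tn = sum(t != cls and p != cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    sn = tp / (tp + fn)  # assumes at least one compound belongs to cls
    sp = tn / (tn + fp)  # assumes at least one compound does not belong to cls
    return {"Sn": sn, "Sp": sp, "NER": (sn + sp) / 2}


def error_rates(true: List[str], pred: List[str]) -> Tuple[float, float]:
    """Type I (% of nB predicted as B/vB) and type II (% of B/vB predicted as nB)."""
    pairs = list(zip(true, pred))
    n_nb = sum(t == "nB" for t in true)
    n_hazardous = len(true) - n_nb
    type_1 = 100 * sum(t == "nB" and p != "nB" for t, p in pairs) / n_nb
    type_2 = 100 * sum(t != "nB" and p == "nB" for t, p in pairs) / n_hazardous
    return type_1, type_2


if __name__ == "__main__":
    # Invented logBCF values, for illustration only
    experimental = [2.0, 3.5, 4.0, 1.0]
    predicted = [3.4, 3.5, 3.0, 1.2]
    y_true = [label(v) for v in experimental]
    y_pred = [label(v) for v in predicted]
    print(class_stats(y_true, y_pred, "B"))
    print(error_rates(y_true, y_pred))
```

Computing the statistics one class at a time, with all other classes pooled as negatives, matches the per-class layout of Table 3.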


Furthermore, as KOW-based approaches account for lipid-driven bioaccumulation, they can strongly underestimate the BCF in the case of non-lipid storage and specific interactions with tissues (Beek et al., 2000; Landrum et al., 1996), such as for perfluorinated alkyl acids (Ng and Hungerbühler, 2013). Their application is, thus, sub-optimal when no mechanistic knowledge is available (Endo et al., 2011; Hermens et al., 2013). The other benchmarks assign many B and vB compounds to the nB class (from 32% to 49%), which is worse than the former case and non-optimal for regulatory applications.

The expert system overcomes the drawbacks of the single models and is characterized by (1) a very high Sp for nB compounds (0.81), (2) the largest Sn for B compounds (0.50), and (3) a very high Sn for vB compounds (0.67). When looking at type I and II errors, the expert system is the model with the best compromise, as it misclassifies only 21% and 19% of non-hazardous and hazardous compounds, respectively. In conclusion, the proposed model shows the best compromise between accuracy (RMSE), classification performance, and type I and type II errors.

Finally, as the major advantage of the expert system lies in the mechanistic interpretation associated with each class, one could use this information to pursue a cautionary approach for regulatory applications. In particular, for class 2 compounds (those with the highest hazard potential), the largest predicted BCF amongst the expert system sub-models (TGD, VEGA Read-Across and VEGA Meylan) could be considered.
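The class-based model selection of Fig. 2, together with the cautionary option just described for class 2, can be summarized in a short Python sketch. All names here are hypothetical: the sub-models are represented as plain callables returning a logBCF, which is an assumption for illustration and does not reflect the actual VEGA or TGD interfaces, and the ADI check on each prediction is omitted for brevity.

```python
from typing import Callable, Dict, Optional

# Hypothetical type: a sub-model maps a structure identifier (e.g. a SMILES
# string) to a predicted logBCF. Real VEGA/TGD calls are not shown here.
Model = Callable[[str], float]

# Class-to-model routing as described in the text and in Fig. 2
ROUTING = {1: "TGD", 2: "Meylan", 3: "Read-Across"}


def expert_log_bcf(
    structure: str,
    mech_class: Optional[int],
    models: Dict[str, Model],
    cautionary: bool = False,
) -> Optional[float]:
    """Predict logBCF with the sub-model selected by the mechanistic class."""
    if mech_class is None:
        # The classification scheme assigned no class: no expert prediction
        return None
    if cautionary and mech_class == 2:
        # Cautionary option: largest prediction among the three sub-models
        return max(models[name](structure) for name in ROUTING.values())
    return models[ROUTING[mech_class]](structure)


if __name__ == "__main__":
    # Stub sub-models returning fixed values, for illustration only
    stubs: Dict[str, Model] = {
        "TGD": lambda s: 2.0,
        "Meylan": lambda s: 3.1,
        "Read-Across": lambda s: 3.5,
    }
    print(expert_log_bcf("CCO", 2, stubs))
    print(expert_log_bcf("CCO", 2, stubs, cautionary=True))
```

Keeping the routing in a single dictionary makes the system easy to inspect, in line with the transparency goal stated in the text.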

4. Conclusions

This work applied our recently proposed classification scheme to develop an expert system for BCF prediction under REACH. The simplicity of a KOW-based approach (TGD) was combined with the advantages of two more sophisticated VEGA models (Read-Across and Meylan), which are freely available and integrate the OECD principles for QSAR validity, such as the applicability domain assessment. Our major goal was to provide a simple and transparent expert system with a satisfactory performance on unknown data, because QSAR models are needed most for data-gap filling. Since reaching high performance was secondary to keeping the model as simple and as transparent as possible, the TGD equation was chosen for compounds with lipid-driven bioconcentration, whose BCF can be reliably predicted by KOW. For the other mechanistic classes, more sophisticated approaches were applied (Meylan for specifically-interacting chemicals, Read-Across for metabolized/eliminated chemicals).

The resulting expert system performed better on external data than the analysed models, with an RMSE improvement of up to 0.82 log units. Moreover, it showed the best ability to recognize hazardous (bioaccumulative/very bioaccumulative) and non-hazardous compounds. The expert system can be applied within the REACH framework, as (1) it models a defined endpoint through an unambiguous algorithm, (2) it is properly validated, and (3) it integrates an applicability domain assessment (OECD, 2007). Moreover, a mechanistic explanation is available for each class (Grisoni et al., 2016). As the use of QSAR also hinges on model availability and transparency, the approach will soon be implemented as a KNIME (Berthold et al., 2007) workflow.
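As a quick check of the reported improvement, the sketch below implements the RMSE of Eq. (1) and recomputes the differences between the external-set RMSE of each benchmark model and that of the expert system, using the values given in Table 2. The function and variable names are illustrative choices, not part of the original work.

```python
import math
from typing import Dict, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root Mean Square Error between experimental and predicted logBCF."""
    n = len(y_true)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n)


# External-set RMSE values (compounds within the AD) as reported in Table 2
EXTERNAL_RMSE: Dict[str, float] = {
    "TGD": 1.18,
    "CAESAR": 1.33,
    "Meylan": 0.64,
    "Read-Across": 0.66,
}
EXPERT_RMSE = 0.51

# Improvement of the expert system over each benchmark, in log units
improvement = {
    name: round(value - EXPERT_RMSE, 2) for name, value in EXTERNAL_RMSE.items()
}

if __name__ == "__main__":
    for name, delta in improvement.items():
        print(f"{name}: {delta:.2f} log units")
```

The largest difference is obtained against CAESAR (1.33 versus 0.51), which corresponds to the 0.82 log units quoted in the text.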

Software

The expert system will soon be available as a free KNIME workflow on the Milano Chemometrics website (http://michem.disat.unimib.it/chm/download/softwares.htm). Contact FG for further information.


Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at http://dx.doi.org/10.1016/j.envres.2016.04.032.

References

Arnot, J.A., Gobas, F.A., 2006. A review of bioconcentration factor (BCF) and bioaccumulation factor (BAF) assessments for organic chemicals in aquatic organisms. Environ. Rev. 14, 257–297.
Beek, B., Böhling, S., Bruckmann, U., Franke, C., Jöhncke, U., Studinger, G., 2000. The assessment of bioaccumulation. In: Beek, B. (Ed.), Bioaccumulation – New Aspects and Developments, the Handbook of Environmental Chemistry. Springer, Berlin Heidelberg, pp. 235–276.
Benfenati, E., 2010. The CAESAR project for in silico models for the REACH legislation. Chem. Cent. J. 4, I1. http://dx.doi.org/10.1186/1752-153X-4-S1-I1.
Berthold, M.R., Cebron, N., Dill, F., Gabriel, T.R., Kötter, T., Meinl, T., Ohl, P., Sieb, C., Thiel, K., Wiswedel, B., 2007. KNIME: The Konstanz Information Miner. Springer.
Brüggemann, R., Carlsen, L., 2006. Partial Order in Environmental Sciences and Chemistry. Springer.
Cappelli, C.I., Benfenati, E., Cester, J., 2015. Evaluation of QSAR models for predicting the partition coefficient (log P) of chemicals under the REACH regulation. Environ. Res. 143, 26–32. http://dx.doi.org/10.1016/j.envres.2015.09.025.
ECHA, 2008. European Chemicals Agency, Guidance on Information Requirements and Chemical Safety Assessment. Chapter R.6: QSARs and Grouping of Chemicals.
Endo, S., Escher, B.I., Goss, K.-U., 2011. Capacities of membrane lipids to accumulate neutral organic chemicals. Environ. Sci. Technol. 45, 5912–5921. http://dx.doi.org/10.1021/es200855w.
European Commission, 2003. Technical Guidance Document (TGD) on risk assessment in support of Commission Directive 93/67/EEC on risk assessment for new notified substances and Commission Regulation (EC) No 1488/94 on risk assessment for existing substances and Directive 98/8/EC of the European Parliament and of the Council concerning the placing of biocidal products on the market. European Community, Brussels, Belgium.
Floris, M., Manganaro, A., Nicolotti, O., Medda, R., Mangiatordi, G.F., Benfenati, E., 2014. A generalizable definition of chemical similarity for read-across. J. Cheminf. 6, 1–7. http://dx.doi.org/10.1186/s13321-014-0039-1.
Gobas, F., Morrison, H.A., 2000. Bioconcentration and biomagnification in the aquatic environment. In: Boethling, R.S., Mackay, D. (Eds.), Handbook of Property Estimation Methods for Chemicals: Environmental and Health Sciences. CRC Press, Boca Raton, USA.
Grisoni, F., Consonni, V., Nembri, S., Todeschini, R., 2015a. How to weight Hasse matrices and reduce incomparabilities. Chemom. Intell. Lab. Syst. 147, 95–104. http://dx.doi.org/10.1016/j.chemolab.2015.08.006.
Grisoni, F., Consonni, V., Vighi, M., Villa, S., Todeschini, R., 2016. Investigating the mechanisms of bioconcentration through QSAR classification trees. Environ. Int. 88, 198–205. http://dx.doi.org/10.1016/j.envint.2015.12.024.
Grisoni, F., Consonni, V., Villa, S., Vighi, M., Todeschini, R., 2015b. QSAR models for bioconcentration: Is the increase in the complexity justified by more accurate predictions? Chemosphere 127, 171–179. http://dx.doi.org/10.1016/j.chemosphere.2015.01.047.
Hermens, J.L.M., de Bruijn, J.H.M., Brooke, D.N., 2013. The octanol–water partition coefficient: Strengths and limitations. Environ. Toxicol. Chem. 32, 732–733. http://dx.doi.org/10.1002/etc.2141.
HESI, 2006. JRC/SETAC-EU. pp. 5–6.
IRFMN, 2015. VEGA Non-Interactive Client, Version 1.1.0, http://www.vega-qsar.eu/, Istituto di Ricerche Farmacologiche Mario Negri, Milano.
Jaworska, J.S., Comber, M., Auer, C., Van Leeuwen, C.J., 2003. Summary of a workshop on regulatory acceptance of (Q)SARs for human health and environmental endpoints. Environ. Health Perspect. 111, 1358–1360.
Landrum, P.F., Harkey, G.A., Kukkonen, J., 1996. Evaluation of organic contaminant exposure in aquatic organisms: The significance of bioconcentration and bioaccumulation. In: Newman, M.C., Jagoe, C.H. (Eds.), Ecotoxicology: A Hierarchical Treatment. Lewis, Boca Raton, FL, USA, p. 85.
Meylan, W.M., Howard, P.H., 1995. Atom/fragment contribution method for estimating octanol–water partition coefficients. J. Pharm. Sci. 84, 83–92.
Meylan, W.M., Howard, P.H., Boethling, R.S., Aronson, D., Printup, H., Gouchie, S., 1999. Improved method for estimating bioconcentration/bioaccumulation factor from octanol/water partition coefficient. Environ. Toxicol. Chem. 18, 664–672. http://dx.doi.org/10.1002/etc.5620180412.
Ng, C.A., Hungerbühler, K., 2013. Bioconcentration of perfluorinated alkyl acids: how important is specific binding? Environ. Sci. Technol. 47, 7214–7223. http://dx.doi.org/10.1021/es400981a.
OECD, 2007. Guidance Document on the Validation of (Quantitative) Structure-Activity Relationships [(Q)SAR] Models [WWW Document]. URL http://www.oecd.org/env/ehs/risk-assessment/validationofqsarmodels.htm (accessed 05.21.13).
Todeschini, R., Consonni, V., 2009. Molecular Descriptors for Chemoinformatics (2 volumes). Wiley-VCH.
U.S. EPA, 2000. Estimation Program Interface (EPI) Suite, Version 1.68. © 2000. www.epa.gov/opptintr/exposure/pubs/episuite.
van der Oost, R., Beyer, J., Vermeulen, N.P., 2003. Fish bioaccumulation and biomarkers in environmental risk assessment: a review. Environ. Toxicol. Pharmacol. 13, 57–149. http://dx.doi.org/10.1016/S1382-6689(02)00126-6.
Zhao, C., Boriani, E., Chana, A., Roncaglioni, A., Benfenati, E., 2008. A new hybrid system of QSAR models for predicting bioconcentration factors (BCF). Chemosphere 73, 1701–1707. http://dx.doi.org/10.1016/j.chemosphere.2008.09.033.