In silico prediction of the mutagenicity of nitroaromatic compounds using a novel two-QSAR approach

In silico prediction of the mutagenicity of nitroaromatic compounds using a novel two-QSAR approach

Accepted Manuscript In silico prediction of the mutagenicity of nitroaromatic compounds using a novel two-QSAR approach Yi-Lung Ding, You-Chen Lyu, M...

1MB Sizes 0 Downloads 44 Views

Accepted Manuscript In silico prediction of the mutagenicity of nitroaromatic compounds using a novel two-QSAR approach

Yi-Lung Ding, You-Chen Lyu, Max K. Leong PII: DOI: Reference:

S0887-2333(16)30269-7 doi: 10.1016/j.tiv.2016.12.013 TIV 3900

To appear in:

Toxicology in Vitro

Received date: Revised date: Accepted date:

31 July 2016 13 November 2016 21 December 2016

Please cite this article as: Yi-Lung Ding, You-Chen Lyu, Max K. Leong , In silico prediction of the mutagenicity of nitroaromatic compounds using a novel two-QSAR approach. The address for the corresponding author was captured as affiliation for all authors. Please check if appropriate. Tiv(2016), doi: 10.1016/j.tiv.2016.12.013

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

In Silico Prediction of the Mutagenicity of

CR

IP

QSAR Approach

T

Nitroaromatic Compounds Using a Novel Two-

Department of Chemistry, National Dong Hwa University, Shoufeng, Hualien 97401, Taiwan Department of Life Science and Institute of Biotechnology, National Dong Hwa University,

M

b

AN

a

US

Yi-Lung Dinga, You-Chen Lyua, and Max K. Leonga,b*

ED

Shoufeng, Hualien 97401, Taiwan

PT

*Corresponding author: Max K. Leong, phone: +886-3-863-3609; fax: +886-3-863-3570; e-mail:

AC

CE

[email protected].

1

ACCEPTED MANUSCRIPT Abstract. Certain drugs are nitroaromatic compounds, which are potentially toxic. As such, it is of practical importance to assess and predict their mutagenic potency in the process of drug discovery. A classical quantitative structure-activity relationship (QSAR) model was developed using the linear partial least square (PLS) scheme to understand the underline mutagenic

T

mechanism and a non-classical QSAR model was derived using the machine learning-based

IP

hierarchical support vector regression (HSVR) to predict the mutagenicity of nitroaromatic

CR

compounds based on a series of mutagenicity data (TA98 − S9). It was observed that HSVR performed better than PLS as manifested by the predictions of the samples in the training set, test

US

set, and outlier set as well as various statistical validations. A mock test designated to mimic real

AN

challenges also confirmed the better performance of HSVR. Furthermore, HSVR exhibited superiority in predictivity, generalization capabilities, consistent performance, and robustness

M

when compared with various published predictive models. PLS, conversely, revealed some

ED

mechanistically interpretable relationships between descriptors and mutagenicity. Thus, this twoQSAR approach using the predictive HSVR and interpretable PLS models in a synergistic

PT

fashion can be adopted to facilitate drug discovery and development by designing safer drug

AC

CE

candidates with nitroaromatic moiety.

Keywords: nitroaromatic compounds, mutagenicity, 2-QSAR, partial least square, hierarchical support vector regression

2

ACCEPTED MANUSCRIPT Introduction Nitroaromatic compounds (NACs), which are also termed aromatic nitros or nitroarenes, are a class of compounds with one or more nitro substituents attached to mono- or polycyclic aromatic hydrocarbons, viz. arenes, and can be derivatized from polycyclic aromatic hydrocarbons (PAHs)

T

(Pederson and Siak, 1981). Many NACs show mutagenic and carcinogenic properties since they

IP

can be converted to highly reactive electrophilic nitroanion radical, nitroso intermediates, and N-

CR

hydroxy derivative, which, in turn, can form adducts with DNA, tissue proteins, and serum albumins and hemoglobins (Sabbioni, 1994), leading to various forms of toxicity such as

US

mutagenicity (Debnath et al., 1991), immunotoxicity (Li et al., 1999), hepatotoxicity (Aßmann et

AN

al., 1997), skin sensitization (Cronin et al., 1998), nephrotoxicity (Travlos et al., 1996),

carcinogenicity (Debnath et al., 1991).

M

neurotoxicity (Philbert et al., 2000), teratogenicity (Kovacic and Somanathan, 2006), or

ED

Generally, NACs are considered structural alerts in the process of drug discovery and

PT

development due to their potential toxic risks (Benigni and Bossa, 2006; Stepan et al., 2011) despite the fact that an appreciable number of drugs contain the nitroaromatic structure

CE

(Boelsterli et al., 2006). For example, nitroaromatic flutamide, which is a therapeutic agent for

AC

treating metastatic prostate cancer by blocking both endogenous and exogenous testosterone, is considered potentially hepatotoxic (Martelli et al., 2000). It has been attempted to replace the nitro group by cyano to reduce its toxicity while maintaining its efficacy (Coe et al., 2007). Furthermore, it is of great importance to accurately and reliably assess toxicity of drug candidates during drug discovery and development (Dobo et al., 2012; Escobar et al., 2013; McCarren et al., 2011; Valerio, 2009; Valerio et al., 2007; Venkatapathy et al., 2009). Of various assay systems (Rao et al., 2004), the in vitro Ames test, which uses a histidine-free medium with

3

ACCEPTED MANUSCRIPT an engineered Salmonella typhimurium bacteria to detect the sensitivity of mutagenics, is the most predictive and prevalent one (Ames et al., 1975; Benigni et al., 2010; Hornberg et al., 2014). Additionally, it has been shown that mutagenicity in S. typhimurium is closely related to carcinogenicity in rodent and human bioassays (Mortelmans and Zeiger, 2000). In fact, Ames

T

tests have been extensively adopted by the European Union’s Registration, Evaluation,

IP

Authorization, and Restriction of Chemicals (REACH) legislation for predicting toxicity of novel

CR

chemicals (Claxton et al., 2010). In addition, different S. typhimurium strains have been engineered to test for different types of mutations and metabolic pathways. The TA100 and

US

TA98 strains, for instance, are developed to detect base-pair substitution mutations and frame-

AN

shift mutations, respectively. Of various bacterial strains, TA100 and TA98 are the most effective for mutagenicity testing because they are more sensitive to mutagenic activity

M

(Hornberg et al., 2014).

ED

The mechanisms of nitroaromatic mutagenesis are rather complex, involving in penetration

PT

into the cellular systems, diffusion and binding to the active site of the specific enzyme, reduction reaction in the presence of certain enzymes to form an aromatic hydroxylamine

CE

intermediate, which, in turn, can further produce a nitrenium ion, viz. an electrophilic

AC

intermediate per se, to further react with nucleophiles such as proteins or DNA to form an adduct (Hakimelahi and Khodarahmi, 2005). Aromatic amines also go through the same mechanism of adduct reaction except that the aromatic hydroxylamine intermediates arise from oxidation. Thus, the formation of nitrenium intermediate can take place at nitro or amine moiety in those aminecontaining NACs. This can yield a complicated situation since there is no compelling evidence to support the predominance of both functional groups (Fu, 1990). Nevertheless, it has been indicated that the TA98 strain consists of a full complement of nitroreductases required to

4

ACCEPTED MANUSCRIPT activate the reduction reaction of NACs, whereas aromatic amines demand the presence of an exogenous metabolic activation system, viz. S9 mix, to initiate the oxidation reaction, suggesting that mutagenicity can only take place at the nitro moiety in nitroaromatic amines in S. typhimurium strain TA98 without S9 mix (Fu, 1990). Accordingly, such ambiguity can be

T

eliminated once the mutagenicity study of NACs is carried out using S. typhimurium strain TA98

IP

without S9 microsomial activation.

CR

In addition to in vitro and in vivo assay systems, in silico approaches play an increasing role in regulatory toxicology and drug safety assessment (Benigni et al., 2013; Cherkasov et al., 2014;

US

Fioravanzo et al., 2012; Greene and Pennie, 2015; Simon-Hettich et al., 2006; Valerio, 2013). In

AN

fact, various quantitative structure-activity relationship (QSAR) techniques, which are a mathematic means to establish the relationship between biological activity and chemical

M

characteristics, have been adopted to develop predictive models, and more importantly, they

ED

have been extensively employed by REACH, the Seventh Amendment (of the Cosmetics Directive), and the Screening Information Data Set (SIDS) to predict toxicity of untested

PT

compounds as a supplement to experimental assessments (Bakhtyari et al., 2013). It has been

CE

suggested that in silico approach can be a good alternative to Ames tests (Naven et al., 2012). In fact, various packages and theoretical models have been proposed to quantitatively predict

AC

mutagenicity of Ames test (Hillebrecht et al., 2011; Xu et al., 2012). Also, a number of comparative molecular field analysis (CoMFA), molecular similarity indices analysis (CoMSIA), genetic function approximation (GFA), hologram QSAR (HQSAR), and classical QSAR models have been developed to predict mutagenicity of NACs (Caliendo et al., 1995; Debnath et al., 1991; Fan et al., 1998; King et al., 1996; Klein et al., 2000c; Lopez de Compadre et al., 1998;

5

ACCEPTED MANUSCRIPT Lopez de Compadre et al., 1990; Lopez De Compadre et al., 1988; Maynard et al., 1986; Nair and Sobhia, 2008; Wang et al., 2005). Linear predictive models, such as linear partial least square (PLS), are useful to interpret the relationship between descriptor and biological activity (Hasegawa and Funatsu, 2010). Such

T

advantage, however, can become an insurmountable difficulty for developing a sound predictive

IP

model in case that structure-activity data are not linear per se as illustrated by the study

CR

conducted by Debnath et al. (Debnath et al., 1991), in which the nonlinear relationship between log P and log TA98 was observed. Machine learning (ML) methods, conversely, can extract the

US

nonlinear relationship between input and output that are otherwise often mishandled by linear

AN

approaches since it has been demonstrated that ML-based models generally perform better than their conventional linear counterparts (Hou et al., 2009). Nevertheless, ML-based models are

M

normally considered as “black box” approaches since they are hard to interpret the relationship

ED

between input and output. In another word, the advantages of PLS can be the limitations of MLbased schemes and vice versa. Nevertheless, both seemingly contradictory approaches can be

PT

carried out in a synergistic fashion to produce high predictivity while the interpretability can still

CE

be maintained at the same time as recently proposed by Fujita and Winkler (Fujita and Winkler, 2016). It has been suggested that linear SVM can be an alternative to resolve the conflict

AC

between predictivity and interpretability (Cherkasov et al., 2014). In reality, linear SVM is not the ultimate solution to this issue since it has been shown that linear SVM models cannot properly address the nonlinear relationship between input and output as compared with SVM (Ren et al., 2007). Herein, the objective of this investigation was to use the novel two-QSAR approach by combing the predictive ML-based model and the interpretable linear model to accurately predict

6

ACCEPTED MANUSCRIPT the mutagenicity of NACs and to understand the underline complex mechanism based on the most comprehensive data collection (TA 98 – S9) from the literature, respectively. The former was carried out by the hierarchical support vector regression (HSVR) scheme (Leong et al., 2009) since it can simultaneously possess the advantageous characteristics of local model and global

T

model, viz. broader coverage of applicability domain (AD) and higher level of predictivity,

IP

respectively, and the latter was executed by the PLS scheme to offer straightforward

US

CR

interpretation.

AN

Materials and methods

M

Data compilation

The sample data play a crucial role in determining the quality of a predictive model

ED

(Cherkasov et al., 2014). As such, it is of necessity to collect a great number of samples with

PT

broad ranges of chemical structures and biological activities to develop a predictive model. A comprehensive literature search was carried out to compile data from a variety of sources (André

CE

et al., 1997; André et al., 1995; Debnath et al., 1991; El-Bayoumy et al., 1981; Hooberman et al.,

AC

1994; Juneja et al., 1991; Juneja et al., 2001; Jung et al., 1991; Klein et al., 2000b; Klein et al., 2000c; LaVoie et al., 1981; Ludolph et al., 2001; Rosenkranz et al., 1985; Rosenkranz and Mermelstein, 1985; Takamura-Enya et al., 2006; Tokiwa et al., 1981; Tokiwa et al., 2003; Vance et al., 1987; von Tungeln et al., 1994; Watanabe et al., 1994; You et al., 1993; Yu et al., 1992), whose mutagenicity values were expressed by the logarithm of the number of revertants per nanomole and assayed by S. typhimurium TA98 − S9 microsomial preparation. All chemical structures were carefully scrutinized to ensure their integrity and certainty since compounds with

7

ACCEPTED MANUSCRIPT different chirality can exert different activities. The SMILES strings, CAS registry numbers, TA98 values, and literature references of all molecules collected in this study are listed in Table

T

S1 (supporting information).

IP

Molecular descriptors

CR

All selected molecules were first subjected to full geometry optimization using the density

US

functional theory (DFT) B3LYP method with the basis set 6-31G(d,p) by the Gaussian 09 package (Gaussian, Wallingford, CT) in the dimethyl sulfoxide (DMSO) solvent using the

AN

polarizable continuum model (PCM) (Cammi and Tomasi, 1995; Miertuš et al., 1981) to mimic

M

the real assay conditions. The atomic charges of optimized structures were calculated using the molecular electrostatic potential (ESP) model of Merz and Kollman (Besler et al., 1990).

ED

Furthermore, the energies of frontier orbitals, viz. HOMO energy (EHOMO) and LUMO energy

PT

(ELUMO), and dipole moment, were retrieved from optimization calculation. The energy-related descriptors, namely hardness (η), softness (σ), chemical potential (μ), and electrophilic index (ω),

CE

were also determined because of their implication in the formation of electrophile-nucleophile

AC

adduct (LoPachin et al., 2012). η = (ELUMO − EHOMO) / 2

(Equation 1)

σ=1/η

(Equation 2)

μ = (ELUMO + EHOMO) / 2

(Equation 3)

ω = μ2 / 2η

(Equation 4)

8

ACCEPTED MANUSCRIPT In addition to those energy- and charge-related descriptors, the other more than 200 one-, two-, and three-dimensional molecular descriptors of those optimized molecules were calculated by the Discovery Studio package (Accelrys, San Diego, CA) and E-Dragon (available at the web site http://www.vcclab.org/lab/edragon/). These descriptors can be categorized as electronic

T

descriptors, spatial descriptors, structural descriptors, thermodynamic descriptors, topological

CR

IP

descriptors, and E-state indices.

Data purification was initiated by discarding those descriptors missing for at least one

US

compound or showing little or no discrimination against all compounds. It has been suggested by Topliss and Edwards that only one descriptor should be retained among those descriptors with

AN

intercorrelation values of r2 > 0.8 for reducing the probability of spurious correlations (Topliss

ED

ensure the quality of developed models.

M

and Edwards, 1979). The more conservative level of r2 ≥ 0.64 was taken in this study to further

It sometimes can be observed that those descriptors with broader ranges will outweigh the ones

PT

with narrower ranges because of their substantial variations in magnitudes. Therefore, the non-

CE

descriptive descriptors, viz. real variable descriptors, were normalized with the following



AC

equation (equation 5) (Kettaneh et al., 2005)

ij  xij  x j

  x n

i 1

ij

 xj



2

n  1

(Equation 5)

where xij and  ij are the original and normalized jth descriptors of the ith compound, respectively,

x j is the mean value of the original jth descriptor, and n is number of samples. It has been observed that descriptor selection plays an essential role in determining the performance of predictive models (Tseng et al., 2012). An over-fitting predictive model, for

9

ACCEPTED MANUSCRIPT example, can be yielded if there are too many descriptors selected (Burden et al., 2000). The descriptor selection was initially carried out by GFA using the QSAR module of the Discovery Studio package because of its effectiveness and efficiency (Rogers and Hopfinger, 1994). Further selection was executed by the recursive feature elimination (RFE) method, in which the

T

predictive model was repeatedly developed by all but one of descriptors; the descriptors were

IP

ranked according to their contributions to the predictive performance; and the descriptor with

CR

least contribution was discarded (Guyon et al., 2002). The remaining descriptors were treated as the input of support vector regression (SVR) (vide infra), and different descriptor combinations

US

constituted different SVR models. PLS models (vide infra) were built by combining those

M

AN

descriptors adopted by those SVR models in the ensemble.

ED

Data sets

PT

The collected compounds were randomly divided into the training set and test set with the ratio of ca. 4:1 to develop and to verify the predictive model, respectively (Tropsha et al., 2003). The

CE

high levels of similarity in both sample sets were scrutinized by inspecting their biological and

AC

chemical distributions as suggested by Tropsha (Tropsha, 2012). Furthermore, it is not trivial to detect the outliers and to remove them from the sample collections in the course of model development since outliers represent the extrapolated predictions (Gramatica et al., 2013). As such, most of predictive models are very sensitive to outliers (Casalegno et al., 2008). Nevertheless, eight molecules assayed by Juneja et al. (Juneja et al., 1991) and Jung et al. (Jung et al., 1991) were deliberately chosen as the outlier set to assess the robustness of the developed models because of their substantial distinctions from those in the training set (see the “Results”

10

ACCEPTED MANUSCRIPT section). In addition to the samples adopted in the training set, test set, and outlier set, a number of NACs with the same assay performed by other research groups were also used to test the

T

derived predictive models to mimic real world challenges.

IP

Partial least square

CR

Partial least square is a generalization of regression, which can handle data with collinearity

US

among the independent variables, viz. descriptors. As such, the advantageous characteristic of PLS is that PLS can process data where the number of descriptors is larger than that of

AN

observations (Wold, 1995). To minimize the chance correlations, cross-validation is a reliable

M

and commonly adopted method for testing its complexity (Clark and Cramer, 1993). The PLS model development was carried out using the Partial Least Square module in the Discovery

PT

ED

Studio package.

CE

Hierarchical support vector regression

AC

SVM, which was proposed by Vapnik et al. (Cortes and Vapnik, 1995), was originally designed for classification and subsequently modified for regression problems by nonlinearly mapping the input data into a higher-dimension space, in which a linear regression is carried out (Vapnik et al., 1997). SVM regression takes into consideration not only the training error but the model complexity, whereas traditional regression algorithms build predictive models by minimizing the training error. As such, SVM generally performs better than traditional regression methods because of its advantageous characteristics, namely dimensional

11

ACCEPTED MANUSCRIPT independence, limited number of freedom, excellent generalization capability, global optimum, and little effort to implement (Schölkopf and Smola, 2002). Nevertheless, SVM, like any other linear or ML-based QSAR techniques, has to trade off between the characteristics of a global model, viz. broader coverage of applicability domain

IP

T

(AD), and a local model, viz. higher level of predictivity (Netzeva et al., 2005). This perplexing

CR

situation, nevertheless, can be plausibly resolved using the hierarchical support vector regression (HSVR) scheme, which was initially proposed by Leong et al. and was derived based on SVM

US

(Leong et al., 2009). The most advantageous characteristic of HSVR is that HSVR can simultaneously take into account seemingly mutually exclusive characteristics, namely coverage

AN

of AD and level of predictivity, respectively (Netzeva et al., 2005). More importantly, it has been

M

demonstrated that HSVR models maintained very high levels of predictivity even when applied to compounds with very diverse chemical structures (Leong et al., 2009; Leong et al., 2010), and

ED

HSVR outperformed artificial neural network (ANN), genetic algorithm (GA), and SVM (Leong

PT

et al., 2010).

were

initially

CE

The detail of HSVR has been described elsewhere (Leong et al., 2009). Briefly, SVR models developed

using

the

LIBSVM

package

(software

available

at

AC

http://www.csie.ntu.edu.tw/~cjlin/libsvm) based on a variety of descriptor combinations to represent various local models. The model development and verification were carried out using the implemented modules svm-train and svm-predict, respectively. The regression modes, namely, ε-SVR and γ-SVR, were selected, and radial basis function (RBF) was used as the kernel because of its simplicity and great performance as compared with the others (Kecman, 2001). The runtime parameters, namely regression modes ε-SVR and ν-SVR, the associated ε and ν,

12

ACCEPTED MANUSCRIPT cost C, and the kernel width γ, were systemically scanned in a parallel fashion using an in-house Perl script (Ding et al., 2014). Initially, two SVR models were selected to develop an SVR ensemble (SVRE), which, in turn, was further subjected to regression by another SVR to yield the final HSVR model, viz. a global

T

model per se. The construction of SVRE was continuously carried out until the HSVR model

IP

performed well. Otherwise, the three- or even four-member ensembles were derived by adopting

CR

one or more SVR models, respectively, in case of poor performance shown by all two-member ensembles. The descriptor selection and ensemble construction were mainly governed by the

US

principle of Occam’s razor by adopting the least numbers of descriptors and SVR models

AN

(Dearden et al., 2009).

M

Predictive evaluation

ED

The predictivity of a derived model was determined by several statistics. The coefficients r2 and q2 in the training set and external set, respectively, for the linear least square regression were

n

i 1

2

n

 y 

CE

r 2 , q 2  1    yˆi  yi 

PT

computed by the following equation (equation 6)

i 1

i





2

AC

where yi and yˆ i are the observed and predicted values, respectively; and

(Equation 6)

yˆ and n are the

average of predicted values and the number of samples in the data set, respectively. Furthermore, the difference between yi and yˆ i , viz. the residual Δi, was computed for every prediction

i  yi  yˆi

(Equation 7)

13

ACCEPTED MANUSCRIPT The root mean square error (RMSE) and the mean absolute error (MAE) for n samples in the data set were calculated by the following equations n



RMSE 

(Equation 8)

n

T

1 n  i n i 1

(Equation 9)

IP

MAE 

i 1

2 i

CR

The developed model was further subjected to 10-fold cross-validation instead of the widely

US

used leave-one-out due to its better performance (Breiman and Spector, 1992), giving rise to the 2 correlation coefficient of 10-fold cross validation qCV . In addition, a variety of modified r2



r 2  ro2

rm2  r 2 1 



M

r 2  ro2



ED



rm2  r 2 1 

AN

versions proposed by Ojha et al. (Ojha et al., 2011) were also calculated.

PT

rm2   rm2  rm2  2

(Equation 11) (Equation 12) (Equation 13)

CE

rm2  rm2  rm2

(Equation 10)

where ro2 and k are the correlation coefficient and the slope of the regression line (predicted vs.

AC

observed values) through the origin, whereas ro2 is the correlation coefficient of the regression line (observed vs. predicted values) through the origin. 2 2 2 Moreover, the correlation coefficients qF1 , qF2 , and qF3 and concordance correlation coefficient

(CCC) proposed by Golbraikh et al. (Golbraikh et al., 2003), Schüürmann et al. (Schüürmann et al., 2008), Consonni et al. (Consonni et al., 2009), and Chirico and Gramatica (Chirico and Gramatica, 2011) were also adopted to measure the predictivity of developed models.

14

ACCEPTED MANUSCRIPT

nEXT

2 qF2  1    yi  yˆi 

i 1

y  i 1

i

 nEXT  2 2 qF3  1     yi  yˆi  nEXT   i 1 

i 1

i

yEXT

2

(Equation 15)

   yˆ 2

i

 yˆ

 yˆ EXT

i



2



2

 nTR  

 yˆ EXT



(Equation 16)

 nEXT  yEXT  yˆ EXT



(Equation 17)

2

AN

i 1

(Equation 14)

 nTR   yi  yTR  i 1

nEXT

nEXT



yEXT

2   yi  yEXT

y 

2

nEXT

2

i 1

CCC 



T

i 1

yTR

i

IP

y 

CR

nEXT

2

US

nEXT

2 qF1  1    yi  yˆi 

where nTR and nEXT are the numbers of samples in the training set and external set, respectively, is the average of predicted values in the training set, and yEXT and yˆ EXT are the

M

yˆ TR

ED

averages of observed and predicted values in the external set, respectively. In fact, the parameter

PT

2 has been adopted by Organization for Economic Co-operation and Development (OECD) to qF2

assess the external predictivity of QSAR models (Schüürmann et al., 2008).

CE

Most importantly, a predictive model should meet the most stringent criteria proposed by

AC

Golbraikh et al. (Golbraikh et al., 2003), Ojha et al. (Ojha et al., 2011), Roy et al. (Roy et al., 2012), and Chirico and Gramatica (Chirico and Gramatica, 2012). 2 2 r 2 , qCV , q 2 , qFn  0.70

(Equation 18)

2 r 2  qCV  0.10

(Equation 19)

r

2

 ro2  r 2  0.10 and 0.85  k  1.15

ro2  ro2  0.30

(Equation 20) (Equation 21)

15

ACCEPTED MANUSCRIPT rm2  0.65

(Equation 22)

rm2  0.65 and rm2  0.20

(Equation 23)

CCC ≥ 0.85

(Equation 24)

CR

IP

set, respectively, and qFn in equation (18) stands for qF1 , qF2 , and qF3 .

T

where r in equations (20)–(23) represents the parameters r and q in the training set and external

US

Results

AN

Data partition

Of all molecules adopted in this study, 226 and 56 molecules were randomly assigned to the

M

training set and test set, respectively, with an exact 4:1 ratio. Figure 1 shows the projection of all

ED

molecules enrolled in this investigation in chemical space, spanned by the first three principal components (PCs), explaining 97.9% of the variance in the original data. As displayed, both data

PT

sets exhibited high levels of similarity in the chemical space. Furthermore, the high levels of

CE

biological and chemical similarity between both sets can also be observed from Figure S1, which displays the histograms of log TA98, log P, molecular weight (MW), surface area, and molecular

AC

volume (Vm) in density form for all molecules in the training set and test set. Hence, the unbiased partition of data samples can be asserted (Scott, 2010).

16

AN

US

CR

IP

T

ACCEPTED MANUSCRIPT

M

Figure 1. Molecule distribution for the molecules selected for this study in the training set (solid

ED

circle), the test set (open triangle), the outlier set (open square), and the mock test (gray diamond)

PT

in the chemical space spanned by three principal components.

CE

Conversely, molecules in the outlier set should be clearly distinct from those in the training set

AC

that can be examined by inspecting the chemical space (Reymond et al., 2010). In fact, those outliers designated in this study are very dissimilar from those in the training set as illustrated in Figure 1, in which all of outliers are completely positioned outside the perimeter of the training set, suggesting that they are far away from the AD of model development and serve as a good metric of robustness of a predictive model. The structural distinctions between outliers and the others can actually be realized by the fact that all of those designated outliers contained the

17

ACCEPTED MANUSCRIPT thioether group as compared with the other molecules, leading to high levels of dissimilarity between those samples in the training set and outlier set.

HSVR

T

Of all generated SVR models using various combinations of descriptors and runtime

IP

parameters, two SVR models, denoted by SVR A and SVR B were assembled to construct the

CR

SVR ensemble, which was further subjected to regression by another SVR to generate the HSVR

US

model. SVR A and SVR B adopted 7 and 11 descriptors, respectively (Table 1).

x

ω x

Dipole FH2O

x

AC

I(diNO2)

An electrophilic index combines softness and chemical potential Sum of molar refractivity of substituents in the orthopositions of aromatic nitro.

x

The partial atomic charge on the C2 atom attached to the nitro group (NO2)

x

I1 is set equal to 1 for all compounds containing three or more fused rings; 0 otherwise

x

I(diNO2) is set equal to 1 for all compounds containing two or more nitro groups; 0 otherwise

x

Dipole moment of molecule

x

R8p+ Mor32p

x x

CE

qC2 IL

Log of the octanol/water partition coefficient

PT

MRo

x

M

log P

SVR A SVR B Description

ED

Descriptor

AN

Table 1. Descriptors selected as the input of SVR models in the ensemble and their descriptions.

Aqueous solvation free energy x

x

R maximal autocorrelation of lag 8/weighted by atomic polarizabilities 3D-MoRSE - signal 32/weighted by atomic polarizabilities

18

ACCEPTED MANUSCRIPT

Jurs-WPSA-1

x

The Jurs-WPSA-1 descriptor describes the surfaceweighted charged partial surface areas; an indication that the positive charge distribution plays a role in receptor recognition

Jurs-RNCG

x

Relative negative charge: charge of most negative atom divided by the total negative charge

S_sCH3

S value for a single-bonded methyl group

Eigenvalue 04 from edge adj. matrix weighted by edge degrees

CR

x

US

EEig04x

x

T

A topological index, whose value does not substantially increase with molecule size or the number of rings present.

x

IP

JX

The optimal runtime parameters of SVR A, SVR B, and HSVR, are listed in Table S2. Both

AN

SVR models in the ensemble adopted different combinations of descriptors, suggesting that they

M

are local models per se. As such, HSVR generally produced the medium residuals as compared with its counterparts in the ensemble (Table S1). Furthermore, it can be observed from Figures 2

ED

and 3, which display the scatter plots of observed vs. predicted log TA98 values in the training

PT

set and test set, respectively, that the distances between the predictions by HSVR and regression line are generally between those yielded by both SVR models. Nevertheless, HSVR yielded the

CE

smallest deviations in some cases. The predictions of compound 21 by SVR A, SVR B, and

AC

HSVR, for example, gave rise to absolute residuals of 0.63, 0.38, and 0.06, respectively. Statistically, HSVR executed better than any individual model in the ensemble in the training set and test set as manifested by those parameters listed in Tables 2 and 3. For instance, HSVR 2 yielded the largest r2, qCV , and q2 (0.91, 0.90, and 0.83) as well as the smallest differences 2 between r2 and qCV (0.01) and between r2 and q2 (0.08), indicating that HSVR is highly

predictive and well trained.

19

AN

US

CR

IP

T

ACCEPTED MANUSCRIPT

M

Figure 2. Observed log TA98 versus the log TA98 predicted by SVR A (open circle), SVR B

ED

(open inverted triangle), HSVR (gray square), and PLS (open diamond) for the molecules in the training set. The solid line, dashed lines, and dotted lines correspond to the HSVR regression of

AC

CE

prediction, respectively.

PT

the data, 95% confidence interval for the HSVR regression, and 95% confidence interval for the

20

AN

US

CR

IP

T

ACCEPTED MANUSCRIPT

M

Figure 3. Observed log TA98 versus the log TA98 predicted by SVR A (open circle), SVR B (open inverted triangle), HSVR (gray square), and PLS (open diamond) for the molecules in the

ED

test set. The solid line, dashed lines, and dotted lines correspond to the HSVR regression of the

CE

prediction, respectively.

PT

data, 95% confidence interval for the HSVR regression, and 95% confidence interval for the

AC

Table 2. Statistic evaluations, namely correlation coefficient (r2), maximal absolute residual (ΔMax), mean absolute error (MAE), standard deviation (s), RMSE, and 10-fold cross-validation 2 correlation coefficient ( qCV ) evaluated by SVR A, SVR B, HSVR, and PLS in the training set.

r

2

ΔMax MAE s

SVR A 0.91

SVR B 0.87

HSVR 0.91

PLS 0.72

2.92 0.32 0.50

2.25 0.59 0.36

1.94 0.50 0.37

3.33 0.79 0.65

21

ACCEPTED MANUSCRIPT RMSE 2 qCV

0.59 0.65

0.69 0.44

0.63 0.90

1.02 0.69

T

2 2 2 Table 3. Statistic evaluations, correlation coefficients q2, qF1 , qF2 , and qF3 , concordance

IP

correlation coefficient (CCC), maximal absolute residual (ΔMax), mean absolute error (MAE),

SVR B 0.74 0.61

HSVR 0.83 0.82

PLS 0.70 0.68

2 qF2

0.68

0.61

0.82

0.68

2 F3

0.74 0.85 2.90 0.74 0.65 0.98

0.86 0.91 1.61 0.59 0.44 0.73

0.74

2

q

ED

0.83 2.66 0.76 0.63 0.99

PT

ΔMax MAE s RMSE

0.69 0.84 3.88 0.81 0.73 1.08

M

q CCC

US

2 qF1

SVR A 0.72 0.68

AN

CR

standard deviation (s), and RMSE evaluated by SVR A, SVR B, HSVR, and PLS in the test set.

When applied to the molecules in the outlier set, HSVR even showed more pronounced

CE

predominance as manifested by those statistical parameters listed in Table 4 as well as Figure 4,

AC

which displays the scatter plot of observed vs. predicted log TA98 values in the outlier set. For instance, SVR A, SVR B, and HSVR gave rise to MAE values of 0.47, 0.54, and 0.24, respectively, indicting the better accuracy of HSVR. The performance predominance of HSVR in the outlier set, in fact, can be realized by the fact that SVR A and SVR B are local models per se, which are generally lack of extrapolation capacity, whereas HSVR is more robust, which is of critical importance to a predictive model (Gnanadesikan and Kettenring, 1972).

22

ACCEPTED MANUSCRIPT 2 2 2 Table 4. Statistic evaluations, correlation coefficients q2, qF1 , qF2 , and qF3 , concordance

correlation coefficient (CCC), maximal absolute residual (ΔMax), mean absolute error (MAE), standard deviation (s), and RMSE evaluated by SVR A, SVR B, HSVR, and PLS in the outlier set.

T

PLS 0.45 0.55

SVR B 0.74 0.82

HSVR 0.84 0.95

2 qF2

0.21

0.00

0.77

−1.51

2 qF3 CCC

0.92 0.68 1.01 0.47 0.30 0.55

0.90 0.73 1.14 0.54 0.32 0.61

0.97 0.88 0.69 0.24 0.23 0.32

0.75

CR

US

0.38 1.33 0.99 0.27 0.97

AC

CE

PT

ED

M

ΔMax MAE s RMSE

AN

q

IP

2 qF1

SVR A 0.72 0.86

2

Figure 4. Observed log TA98 versus the log TA98 predicted by SVR A (open circle), SVR B (open inverted triangle), HSVR (gray square), and PLS (open diamond) for the molecules in the

23

ACCEPTED MANUSCRIPT outlier set. The solid line, dashed lines, and dotted lines correspond to the HSVR regression of the data, 95% confidence interval for the HSVR regression, and 95% confidence interval for the prediction, respectively.

IP

T

PLS

CR

The linear PLS model as shown in equation 25 was built by combining those descriptors adopted by SVR A and SVR B in the SVR ensemble (Table 1). Table S1 lists the prediction

corresponding statistical assessments, respectively.

US

results of the molecules in the training set, test set, and outlier set, and Tables 2−4 summarize the

AN

log TA98 = 0.120331 +1.47418 × IL − 0.317731 × JX + 0.230309 × Mor32p + 0.0042358 × FH2O

M

+ 0.013219 × log P − 0.227749 × MRo + 0.379951 × EEig04x + 0.282496 × ω −

ED

0.176672 × Jurs-RNCG − 0.203552 × Jurs-WPSA-1 − 0.066267 × R8p + 0.03860 × Dipole + 0.235772 × I(diNO2) − 0.122489 × qC2 − 0.365117 × S_sCH3 (Equation 25)

PT

The PLS model yielded an r2 value of 0.72, denoting its modest performance in the training set.

CE

Nevertheless, some great prediction errors produced by PLS can still be observed, resulting in the fact that most of points predicted by PLS generally are the longest distances from the

AC

regression line as compared with SVR A, SVR B, and HSVR (Figure 2). As such, PLS gave rise to the largest ΔMax (3.33), MAE (0.79), s (0.65), and RMSE (1.02) (Table 2), suggesting that PLS 2 was the worst predictor in the training set. Nevertheless, PLS produced a qCV of 0.69, which is

not only slightly smaller than its r2 but better than both ones yielded by SVR A and SVR B, suggesting that this PLS was a well trained model as compared with SVR A and SVR B, which 2 were over trained as manifested by their marked differences between r2 and qCV .

24

ACCEPTED MANUSCRIPT When applied to the samples in the test set, PLS did not show substantial performance variations from the training set, especially when compared with SVR A and SVR B, which showed various levels of over-training. For instance, PLS gave rise to a q2 of 0.70, which is not only very similar to r2 (0.72) generated in the training set but is very close to both values

T

obtained by SVR A and SVR B (0.72 and 0.74, respectively). In addition, the similar

IP

performance levels observed in both data sets suggest that PLS was well trained and the

CR

collected samples were fairly partitioned into the training set and test set chemically and biologically because it will otherwise produce at least one significant difference.

US

Like any other predictive models, which are generally very sensitive to the outliers, the

AN

performance of PLS greatly decreased from the training set to the outlier set (Tables 2 and 4). For example, q2 generated by PLS was 0.45, showing decreases of 0.27 and 0.25 from the

M

training set and test set, respectively. Both SVR models in the ensemble, conversely, were much

ED

less sensitive to the outliers as manifested by their smaller differences between q2 in the outlier set and r2 in the training set, although PLS is a global model and SVR A and SVR B are local

CE

PT

models per se that, in turn, are normally lack of extrapolation capacity (Leong et al., 2010).

Predictive evaluations

AC

Various statistical parameters have been proposed to gauge the predictivity of a QSAR model (Gramatica and Sangion, 2016), and different model systems or observations can lead to different conclusions. As such, a QSAR model can be considered predictive if it can meet all of the most stringent requirements. Accordingly, the generated HSVR and PLS models were further assessed by those validation requirements proposed by Golbraikh et al. (Golbraikh et al., 2003), Ojha et al. (Ojha et al., 2011), Roy et al. (Roy et al., 2012), and Chirico and Gramatica (Chirico

25

ACCEPTED MANUSCRIPT and Gramatica, 2012) in the training set, test set, and outlier set (equations (18)−(24)). The results are summarized in Table 5, from which it can be found that HSVR not only produced large statistical values but also fulfilled all validation requirements, suggesting that this theoretical model is highly accurate, predictive, and robust. For instance, HSVR met the

IP

T

requirements of rm2 and rm2 , which were considered by Roy et al. to be the best validation parameters (Roy et al., 2012). Nevertheless, Todeschini et al. have concluded the importance of

CR

2 2 (Todeschini et al., 2016) and Chirico and Gramatica have regarded qF3 and CCC as the most qF3

US

stringent metrics (Chirico and Gramatica, 2011) and they have even raised the threshold of all q2

AN

parameters from 0.60 to 0.70 and that of rm2 from 0.50 to 0.65 (Chirico and Gramatica, 2012). Regardless of those facts, HSVR still met those strict requirements. In addition, little

M

performance variations among different data sets were produced by HSVR, asserting its

ED

consistent performance regardless of the data sets that is of pivotal importance to a predictive

PT

model.

CE

Table 5. Validation verification of HSVR and PLS based on prediction performance of the

2 o

r k

AC

molecules in the training set, test set, outlier set and mock test.

ro2 rm2 rm2 rm2 rm2 Eq. (18)

Training set HSVR PLS 0.90 0.72 1.09 1.00 0.89 0.70 0.84 0.72 0.79 0.61 0.82 0.66 0.05 0.11 x

Test set HSVR 0.83 0.94 0.81 0.79 0.71 0.75 0.08 x

PLS 0.70 0.87 0.66 0.69 0.57 0.63 0.13

Outlier set HSVR PLS 0.79 0.35 0.87 0.35 0.73 −0.77 0.66 0.30 0.79 −0.05 0.73 0.13 0.13 0.35 x

Mock test HSVR PLS 0.80 0.55 1.08 −0.14 0.73 −0.19 0.76 −0.78 0.76 0.28 0.76 −0.25 0.00 1.05 x

26

ACCEPTED MANUSCRIPT x x x x x N/A

N/A x x x x x

N/A x x x x

N/A x x x x x

N/A

N/A x x x x x

N/A

IP

T

Eq. (19) x Eq. (20) x Eq. (21) x Eq. (22) x Eq. (23) x Eq. (24) N/A† † Not applicable

PLS, on the other hand, met all of validation criteria except equation (18) in the training set,

CR

and equations (18) and (24) in the test set, showing its modest predictivity. However, PLS failed

US

to satisfy any of requirements when applied to the outliers, rendering its vulnerability to the outliers. In fact, it is not a trivial task to remove the potential outliers prior to model development

M

AN

(Casalegno et al., 2008).

Mock test

ED

The developed HSVR and PLS models were further tested by those 5 compounds measured by

PT

Takamura-Enya et al. (Takamura-Enya et al., 2002) and Watanabe et al. (Watanabe et al., 2005) to mimic real-world challenges since they were assayed with the same conditions. The predicted

CE

values are listed in Table S1 and the corresponding statistical parameters and predictive

AC

evaluations are listed in Table 6. It can be observed that PLS showed great performance deterioration in the mock test. For example, PLS produced a ΔMax of 6.90 and an RMSE value of 2 2 4.97. Statistically, PLS performed coarsely according to its small q2 (0.55) and negative qF1 , qF2 ,

2 and qF3 . HSVR, conversely, executed as well as it did in the training set. For instance, HSVR

produced a q2 of 0.80, which is not only very close to its r2 in the training set (0.91) but its q2 values in the test set and outlier set (0.83 and 0.84, respectively). Additionally, HSVR completely fulfilled all statistical validation requirements, whereas PLS failed to comply any one

27

ACCEPTED MANUSCRIPT of them. More importantly, HSVR did not show any pronounced performance variations between the training set and mock test as illustrated by the s values, for example, which were 0.37 and 0.36 in the training set and mock test, respectively. Thus, it can be asserted that HSVR outperformed PLS in the mock test in every aspect as further demonstrated by Figure 5, which

T

displays the scatter plot of observed vs. predicted log TA98 values for the compounds in the

IP

mock test. Thus, this mock test unequivocally ensured the predictivity of HSVR. The substantial

CR

performance discrepancies between HSVR and PLS in the outlier set can be plausibly attributed to the fact that the samples in the mock test are extremely dissimilar to those in the training set

US

(Figure 1), and HSVR is robust to the outliers while PLS is very vulnerable to the outliers (vide

AN

supra). As such, it is plausible to expect the derived HSVR is very useful in practical

M

applications even when applied to novel compounds.

ED

2 2 2 Table 6. Statistic evaluations, correlation coefficients q2, qF1 , qF2 , and qF3 , concordance

PT

correlation coefficient (CCC), maximal absolute residual (ΔMax), mean absolute error (MAE),

AC

the outlier set.

CE

standard deviation (s), and RMSE as well as validation criteria evaluated by HSVR and PLS in

2

q

2 qF1 2 qF2 2 qF3 CCC

ΔMax MAE s RMSE Eq. (18)

HSVR 0.80 0.93 0.77 0.86 0.86 1.27 0.90 0.36 0.95 x

PLS 0.55 −2.16 −10.14 −5.58 0.19 6.90 4.62 2.04 4.97

28

ACCEPTED MANUSCRIPT x x x x x

PT

ED

M

AN

US

CR

IP

T

Eq. (20) Eq. (21) Eq. (22) Eq. (23) Eq. (24)

Figure 5. Observed log TA98 versus the log TA98 predicted by HSVR (gray square) and PLS

CE

(open diamond) for the molecules in the mock test. The solid line, dashed lines, and dotted lines

AC

correspond to the HSVR regression of the data, 95% confidence interval for the HSVR regression, and 95% confidence interval for the prediction, respectively.

Comparison with previous models Numerous theoretical models have been proposed to predict the mutagenicity of NACs (Patlewicz et al., 2003; Vogt et al., 2010), and it is not possible to compare both models derived in this study directly with previous models mainly due to the availability of those programs to

29

ACCEPTED MANUSCRIPT generate those predictive models or the reproducibility of those proposed models. An unbiased model comparison, nevertheless, can still be performed by comparing the predictions of the molecules common to most of predictive models, which were directly excerpted and their statistical assessments were further subjected to comparisons. Accordingly, seven predictive

T

models (Debnath et al., 1991; Klein et al., 2000b; Lopez de Compadre et al., 1998; Nair and

IP

Sobhia, 2008), including two PLS, two CoMFA, one CoMSIA, one HQSA, and one GFA

CR

models, were selected along with PLS and HSVR of this study and the most common 172

US

molecules were chosen. The statistical evaluations are summarized in Table 7.

AN

Table 7. Statistical evaluations, namely correlation coefficient (r2), maximal absolute residual (ΔMax), mean of absolute error (MAE), standard deviation (s), and root-mean square error

M

(RMSE), evaluated by various predictive models based on those common molecules enlisted by

ED

models (n = 172). The ratios depict the percentages between the common molecules included in the original training set and the original data sizes (n), and the numbers within the parentheses

Ratio (n)

s

RMSE

1.68

0.53

0.38

0.65

x

x

x

x

x

x

(226) (0.91) (1.94) (0.50) (0.38) 59% 0.74 3.14 0.80 0.65

(0.63) 1.03

x

x

x

x

x

x

(226) (0.72) (3.33) (0.79) (0.65)

(1.02) x

x

x

x

x

x

x

x

x

x

x

x

0.90

AC

PLS

MAE

59%

a

PLSb

91%

0.81

1.89

0.73

0.48

(188) (0.81) (1.89) (0.72) (0.48) c

CoMFA

67%

0.74

2.94

0.82

0.62

(257) (0.72) (2.61) (0.90) (0.66) PLSd

Equation (18) (20) (21) (22) (23) (24)

ΔMax

CE

HSVRa

r2

PT

denote the statistics in the original data.

63%

0.63

3.74

1.37

0.90

(200) (0.58) (0.74) (1.34) (0.87)

0.87 (0.87) 1.03 (1.11) 1.64

x

(1.60)

30

ACCEPTED MANUSCRIPT 88%

0.58

6.79

0.75

1.21

(142) (0.96) (4.10) (0.19) (0.36) CoMSIAe

88%

0.55

7.19

0.81

HQSAR

88%

0.76

6.63

0.62

GFA

88%

0.71

3.84

0.81

x

1.03

x

x

x

x

x

x

x

x

x

x

(0.50) 1.10

CR

Model proposed by Debnath et al., 1991

Model proposed by Klein et al., 2000

M

AN

e

Model proposed by Nair and Sobhia, 2008

2  r m

IP

(0.89)

Model proposed by Lopez de Compadre et al., 1998

d

x

US

c

x

This work

b

x

(0.34)

0.74

(142) (0.78) (3.03) (0.69) (0.57) a

1.43

0.83

(142) (0.93) (1.59) (0.39) (0.32) e

x

(0.40)

1.17

(142) (0.97) (1.06) (0.25) (0.23) e

1.42

T

CoMFAe

Of nine predictive models, only the PLS model of Debnath et al. (Debnath et al., 1991), the

ED

CoMFA model of Lopez de Compadre et al. (Lopez de Compadre et al., 1998), the HQSAR

PT

model of Nair and Sobhia (Nair and Sobhia, 2008), and the PLS and HSVR models of this study fulfilled all statistical validation requirements when applied to the common 172 molecules.

CE

Furthermore, among these five theoretical models, only PLS of Debnath et al. and HSVR of this

AC

study produced the r2 values of more than 0.80 (0.81 and 0.90, respectively) and the ΔMax values of less than 2.00 (1.89 and 1.68, respectively), suggesting that both models are the best predictors as affirmed by all statistic parameters except MAE, which produced by the PLS model of Debnath et al. (0.73) is slightly larger than that by HQSAR (0.62). HSVR yielded larger r2 and smaller ΔMax, MAE, s, and RMSE than the PLS model of Debnath et al., suggesting that HSVR performed even better than the PLS model of Debnath et al.

31

ACCEPTED MANUSCRIPT Furthermore, the PLS model of Klein et al., the CoMFA, CoMSIA, HQSA, and GFA models of Nair and Sobhia exhibited pronounced performance discrepancies between the original data and the common 172 molecules as manifested by the substantial difference in at least one of those statistical parameters, suggesting that they were not well-trained as compared with PLS

T

and HSVR of this study, PLS of Debnath et al., and CoMFA of Lopez de Compadre et al. since

IP

they barely showed any performance variations between both data sets. Thus, it can be confirmed

CR

from the model comparison results that HSVR is superior to any other selected theoretical

AN

US

models in terms of predictive performance and consistency.

Discussion

M

The mutagenicity of NACs takes place through a series of processes, such as the initial cell

ED

penetration and final frame-shit mutation, that, in turn, are governed by a number of factors (Purohit and Basu, 2000; Vogt et al., 2010). Hydrophobicity, which can be estimated by the

PT

logarithmic partition coefficient between n-octanol and water, viz. log P, is of great importance

CE

to describe initial cell penetration. Additionally, reduction, which can be delineated by ELUMO,

AC

plays a considerably pivotal role in mutagenicity of NACs (Fu and Herreno‐Saenz, 1999). As such, both descriptors were selected by most of published models (Benigni, 2005). Nevertheless, it is not uncommon to observe the inconsistency in lipophilicity dependence among published models as manifested by the linear, bilinear, and independent relationships between log P and log TA98 (Debnath and Hansch, 1993; Debnath et al., 1991; Lopez De Compadre et al., 1988). When taking into account all compounds enlisted in this study, it can be observed that the average log TA98 value for each histogram bin of log P initially increased with

32

ACCEPTED MANUSCRIPT log P and then subsequently decreased more rapidly after maximum (Figure 6). When histogram curve fitting was carried out, a third-order polynomial as shown in equation (26) gave rise to the highest r2 with a value of 0.99, log TA98 = 1.27 + 0.80 log P − 0.25 (log P)2 − 0.11 (log P)3

(26)

T

indicating its high accuracy in relating hydrophobicity to mutagenicity. More importantly, the

IP

very complex relationship between hydrophobicity and mutagenicity is neither linear nor bilinear

CR

per se. Such peculiar relationship can be plausibly attributed to the fact that only certain range of hydrophobic NACs and their metabolites can cross the cell membrane and/or interact with the

US

activating enzyme (Benigni, 2005). More specifically, too hydrophilic compounds cannot cross

AN

the cell membrane because of hydrophobic membrane lipids, whereas too hydrophobic ones can stay trapped in the cell membrane (Sugano et al., 2010). The PLS model developed in this study,

M

which can only address the linear relationship between log P and mutagenicity, produced a small

ED

positive coefficient for log P (0.013219). As such, only QSAR models that can take into

AC

CE

mutagenicity of NACs.

PT

consideration the nonlinearity between descriptor and activity are suitable to predict the

33

M

AN

US

CR

IP

T

ACCEPTED MANUSCRIPT

ED

Figure 6. Average log TA98 versus the distribution of log P. The solid curve represents the

PT

histogram curve-fitting with a third-order polynomial.

CE

The significant role played by ELUMO can be depicted by the fact that the reduction reaction at

AC

the nitro moieties can be a rate-determining step in producing the subsequent intermediates as demonstrated by the predictive model proposed by Maynard et al. since they only adopted ELUMO based on 20 congeneric NACs to produce a modest r2 value of 0.67 (Maynard et al., 1986). Such linear relationship between ELUMO and mutagenicity is not always true (Purohit and Basu, 2000). For instance, Takamura-Enya et al. observed that the mutagenic potency of mono-nitro aromatic compounds was a function of reduction potential, whereas such dependence was not observed for poly-nitro aromatic compounds (Takamura-Enya et al., 2006), suggesting that ELUMO can play

34

ACCEPTED MANUSCRIPT different roles in different types of NACs. Accordingly, ELUMO is not always the best option, especially when the data samples are highly structurally diverse. The descriptor electrophilic index (ω), which is not only a function of ELUMO (equation 4) but of covalent bond formation between electrophile and nucleophiles as well (LoPachin et al., 2012), was included in this study

T

instead. As such, ω not only takes into account the initial reduction but the adduct formation with

IP

nucleophilic DNA, suggesting that ω is a better selection than ELUMO to describe the complex

CR

mechanisms of nitroaromatic mutagenesis. In fact, its significant role in mutagenicity can be illustrated by its relatively large coefficient in the PLS model (0.282496).

US

It has been observed by Klein et al. that the mutagenicity of NACs decreases with an increase

AN

in substituent size ortho to the nitro group since the more steric hindrance at the ortho position can produce the larger dihedral angle between NO2 and aromatic ring, viz. the larger distortion of

M

coplanarity of NO2 with respect to aromatic ring. As such the stability of metabolite nitrenium

ED

ion can be altered and the mutagenicity can be reduced consequently (Klein et al., 2000a). In fact, Takamura-Enya et al. also found that mutagenicity was associated with a decreased dihedral

PT

angle (Takamura-Enya et al., 2006). As a result, the sum of molar refractivity of substituents at

CE

the ortho positions (MRo) was adopted to take into account such relationship and a negative coefficient was given by the PLS model derived in this study consequently. Additionally,

AC

Debnath and Hansch observed that the partial atomic charge on the carbon attached to the nitro group (qC2) plays a significant role in mutagenicity of a group of compounds (Debnath and Hansch, 1993) since the more electron-rich aromatic ring is unfavorable for the initial reduction reaction and/or the final adduct formation with nucleophilic DNA. Consequently, qC2 was enrolled in the model development and was given a negative coefficient by PLS that is consistent with the PLS model developed by Debnath and Hansch.

35

ACCEPTED MANUSCRIPT It should be noted that both MRo and qC2 are associated with the nitro group attached to an aromatic ring, suggesting that the predictive models developed in this study can only be applied to NACs. Otherwise, substantial deviations may be yielded when other predictive models without the selection of both descriptors are applied to the samples included in this study. In

T

addition to using the more sophisticated DFT method and atomic charge calculation scheme,

IP

those quantum mechanics descriptors, namely ω, qC2, and dipole, were calculated by taking into

CR

account the DMSO solvent effects, which are of critical importance since the solvent effects can exert profound impacts on the quantum mechanics descriptors (Andzelm et al., 1995). As such,

US

those quantum mechanics descriptors selected in this study are more realistic as compared with

AN

those adopted by the other predictive models (Yan et al., 2005), yielding better predictive models accordingly.

M

Debnath et al. have observed that molecules with three or more fused aromatic rings showed

ED

more mutagenic potency than the others in TA98 strain, whereas such observation was absent in TA100 strain, suggesting that this structural character can induce the frame-shift mutations since

PT

both strains differ by frame-shift mutation vs. base-pair substitution (vide supra) (Debnath et al.,

CE

1992). It was hypothetically attributed to the intercalation of NACs into DNA that, in fact, is consistent with a recent observation that the stability of intercalated DNA adduct increases with

AC

the size of aromatic ring (Sproviero et al., 2014). As such, the descriptor IL was adopted in this study and was given by a positive coefficient by the PLS model. Moreover, the maximum absolute coefficient of IL among all descriptors in the PLS model signifies its pivotal role in governing the mutagenicity of NACs. Accordingly, the predictive models proposed in this study were specifically aimed at predicting the mutagenicity of NACs that takes place through frameshift mutations, viz. TA 98 strain, and substantial prediction errors can be expected once they are

36

ACCEPTED MANUSCRIPT applied to other types of mutagenicity that takes place through the other mutations such as TA100. It has been observed that NACs are more mutagenic in case of the presence of an additional nitro group (World Health Organization, 2003). For example, the log TA98 values of 1-

T

nitropyrene (25), 1,6-dinitropyrene (164), and 1,3,6-trinitropyrene (99) are 2.76, 5.06, and 4.99,

IP

respectively, that was postulated that the additional nitro group can facilitate the initial reduction

CR

reaction (Spain, 1995). Accordingly, the descriptor I(diNO2) was included to take into account the contribution of an additional nitro group to the mutagenicity of NACs. Furthermore,

US

Librando et al. suggested that the differences in dipole moment play a pivotal role in

AN

mutagenicity of nitrobenzo[a]pyrenes since dipole moment is closely related to substrate– enzyme intermolecular interactions (Librando et al., 2008) that, in fact, is completely consistent

M

with the positive coefficient given by the PLS model, indicating that the larger dipole moment,

ED

the more mutagenic. Accordingly, it is of necessity to select dipole moment to render the substrate–enzyme interaction. Liu et al. have studied the permeability of various compounds

PT

using 18 different membranes and it was found that the aqueous solvation free energy, viz. FH2O,

CE

nonlinearly correlated with permeation coefficient of solutes well (Liu et al., 2005). Furthermore, it has been demonstrated that increased permeability can enhance the mutagenicity (Ichikawa et

AC

al., 1986) that, actually, is in good agreement with the positive coefficient associated with FH2O produced by the PLS model. Thus, the descriptor FH2O was included to take into account the contribution of permeability to mutagenicity. Thus, it can be asserted that those selected descriptors mentioned above were purported to describe various parts of the complex mechanisms of nitroaromatic mutagenesis−from the initial cellular penetration to the final frame-shift mutation. Furthermore, it should be noted that those

37

ACCEPTED MANUSCRIPT descriptors selected in this investigation were normalized, suggesting that those weights associated with the corresponding descriptors in the PLS model can actually reflect their relative contributions to the mutagenicity of those NACs showing linear relationships with mutagenicity. Nevertheless, it is not uncommon to observe that various predictive models employed different

T

combinations of descriptors to predict the mutagenicity of NACs (Patlewicz et al., 2003; Vogt et

IP

al., 2010) that is mainly due to the variations in rate-determining steps associated with different

CR

structural classes (Patlewicz et al., 2003).

Furthermore, Debnath et al. employed the descriptor Ia as an indicator for the presence of

US

acenthrylenes based on empirical observation instead of mechanistic hypothesis (Debnath et al.,

AN

1991), suggesting that it is of necessity to adopt an extra descriptor to describe a special group of molecules, whose rate-determining step can be different from the others and this peculiar

M

characteristic can be even more pronounced when more structurally diverse training samples are

ED

included (Basak and Majumdar, 2015). Accordingly, a number of Dragon descriptors were selected in this study, which, have never been adopted by any published models before (Table 1)

PT

since they were highly related to the mutagenicity of specific types of NACs. For instance,

CE

substantial prediction errors were produced for the nitrodibenzo-1,4-dioxins once R8p+ and Mor32p were not included (data not shown). In addition, the correlation coefficients between

AC

R8p+ and log TA98 and between Mor32p and log TA98 were 0.85 and 0.78, respectively, for the nitrodibenzo-1,4-dioxins, whereas they were 0.33 and 0.40, respectively, for the others. The structural distinction between nitrodibenzo-1,4-dioxins and the others can be illustrated by their distributions in the chemical space as shown in Figure 7. It can be observed that they are completely isolated from the others. Similar observation can also be found for Jurs-WPSA-1, Jurs-RNCG, JX, S_sCH3, and EEig04x, which gave rise to r values of 0.76, 0.82, 1.00, 0.90, and

38

ACCEPTED MANUSCRIPT 1.00

for

those

nitrofluorenes,

nitrodibenzofuran,

nitrobenzidines,

nitrostilbenes,

and

nitroimidazoles, respectively, whereas they yielded r values of 0.14, 0.50, 0.54, 0.31, and 0.67 for the others (Table 8). Furthermore, the coefficients associated with EEig04x, JX, and S_sCH3 in the PLS model suggest their importance in mutagenicity. As of the present, there is no

T

reasonable explanation has been found for the selection of those Dragon descriptors since it is

IP

not straightforward to interpret Dragon descriptors. Further investigation will be carried out in

AC

CE

PT

ED

M

AN

US

CR

the future to elucidate the relationships between those Dragon descriptors and mutagenicity.

Figure 7. The molecular distributions of nitrodibenzo-1,4-dioxins (solid circle) and the others (open square) in the chemical space, which is delineated in Figure1.

Table 8. The correlation coefficients (r) between selected descriptors and log TA98 for specific chemotypes and the other molecules enlisted in this study.

39

ACCEPTED MANUSCRIPT r

Molecules Nitrodibenzo-1,4-dioxins (n = 5) 0.85 0.78

R8p+ vs. log TA98 Mor32p vs. log TA98

The others 0.14

JX vs. log TA98

Nitrobenzidines (n = 3) 1.00

The others 0.54

S_sCH3 vs. log TA98

Nitrostilbenes (n = 17) 0.90

The others 0.31

EEig04x vs. log TA98

Nitroimidazoles (n = 3) 1.00

The others 0.67

The others 0.50

ED

M

AN

CR

IP

Jurs-RNCG vs. log TA98

Nitrodibenzofuran (n = 6) 0.82

US

T

Nitrofluorenes (n = 19) Jurs-WPSA-1 vs. log TA98 0.76

The others 0.33 0.40

Collectively, fifteen descriptors were adopted in this study, giving rise to the sample-to-

PT

descriptor ratio of ca. 15:1, which is significantly larger than 5, viz. the minimal requirement to

CE

lessen the probability of chance correlations in a predictive model (Topliss and Edwards, 1979). However, the selection of suitable descriptors cannot always warrant the development of an

AC

accurate model if the complex nonlinear relationships between descriptors and mutagenicity cannot be fully addressed. The PLS developed in this study, for instance, still produced some marked prediction errors despite of the fact that both HSVR and PLS adopted the same descriptors. For instance, PLS gave rise to an absolute residual of 3.33 when applied to 228, whereas HSVR only yielded a value of 0.51 (Table S1). Such predictive discrepancies between both models can be plausibly attributed to the inadequate rendering of nonlinear relationship between some descriptors and mutagenicity. Even machine learning-based models that cannot

40

ACCEPTED MANUSCRIPT justify the variations in mutagenic dependence of different descriptors for different types of compounds can only accurately predict the mutagenicity of specific types of NACs or substantial deviations may be yielded. HSVR, conversely, included those descriptors not only to render the complete complex mutagenic mechanisms but to take into account the special characteristics

T

associated with specific types of NACs since its unique architecture can simultaneously possess

IP

the advantageous characteristics of local model and global model, viz. the high level of

CR

predictivity and the broader coverage of AD, respectively, which are two seemingly contradictory characteristics hard to be compromised by traditional QSAR methods or even by

AN

US

machine learning techniques.

M

Conclusions

ED

A novel two-QSAR scheme was adopted in this study by combining hierarchical support vector regression and partial least square with the descriptor selections to comprehensively cover

PT

the complex mutagenic process to predict the mutagenicity of nitroaromatic compounds in S.

CE

typhimurium strain TA98 without S9 mix. The built HSVR model performed extremely well in the training set, test set, and even outlier set, whereas the PLS model showed modest

AC

performance in the former two data sets and coarse performance in the latter one. The accuracy and predictivity of HSVR were assured by a variety of statistical evaluations and validation criteria. When mock tested by a group of NACs to mimic the real challenge, HSVR executed far better than PLS. Unbiased comparison results with various published predictive models unequivocally asserted the generalization capabilities, consistent performance, and robustness of HSVR that is plausibly due to its unique architecture to combine the advantageous characteristics

41

ACCEPTED MANUSCRIPT of local model and global model, viz. higher predictivity and broader applicability domain, as well as appropriate descriptor selections. PLS, conversely, revealed the interpretable relationships between some selected descriptors and mutagenicity that can hardly be done by “black box” approaches. Accordingly, it can be assured that this two-QSAR approach by using

T

predictive HSVR and interpretable PLS in a synergistic fashion to predict the mutagenicity of

IP

nitroaromatic compounds and to interpret relationships between descriptors and mutagenicity,

CR

respectively, can be adopted to facilitate drug discovery and development by designing safer

AN

US

drug candidates.

Acknowledgments

M

This work was supported by the Ministry of Science and Technology, Taiwan. Parts of

ED

calculations were performed at the National Center for High-Performance Computing, Taiwan.

PT

The authors are grateful to Prof. Paola Gramatica and Dr. Sean Ekins for providing helpful

AC

CE

suggestions.

42

ACCEPTED MANUSCRIPT References

AC

CE

PT

ED

M

AN

US

CR

IP

T

Aßmann, N., Emmrich, M., Kampf, G., Kaiser, M., 1997. Genotoxic activity of important nitrobenzenes and nitroanilines in the Ames test and their structure-activity relationship. Mutat. Res.-Genet. Toxicol. 395, 139-144. Ames, B.N., McCann, J., Yamasaki, E., 1975. Methods for detecting carcinogens and mutagens with the Salmonella/mammalian-microsome mutagenicity test. Mutat. Res. 31, 347-364. André, V., Boissart, C., Sichel, F., Gauduchon, P., Le Talaër, J.Y., Lancelot, J.C., Mercier, C., Chemtob, S., Raoult, E., Tallec, A., 1997. Mutagenicity of nitro- and amino-substituted carbazoles in Salmonella typhimurium. III. Methylated derivatives of 9H-carbazole. Mutat. Res. Genet. Toxicol. Environ. Mutagen. 389, 247-260. André, V., Boissart, C., Sichel, F., Gauduchon, P., Le Talaër, J.Y., Lancelot, J.C., Robba, M., Mercier, C., Chemtob, S., Raoult, E., Tallec, A., 1995. Mutagenicity of nitro- and aminosubstituted carbazoles in Salmonella typhimurium. II. Ortho-aminonitro derivatives of 9Hcarbazole. Mutat. Res.-Genet. Toxicol. 345, 11-25. Andzelm, J., Kölmel, C., Klamt, A., 1995. Incorporation of solvent effects into density functional calculations of molecular energies and geometries. J. Chem. Phys. 103, 9312-9320. Bakhtyari, N.G., Raitano, G., Benfenati, E., Martin, T., Young, D., 2013. Comparison of In Silico Models for Prediction of Mutagenicity. J. Environ. Sci. Health Pt. C-Environ. Carcinog. Ecotoxicol. Rev. 31, 45-66. Basak, S.C., Majumdar, S., 2015. Prediction of Mutagenicity of Chemicals from Their Calculated Molecular Descriptors: A Case Study with Structurally Homogeneous versus Diverse Datasets. Curr. Comput. Aided Drug Des. 11, 117-123. Benigni, R., 2005. Structure-activity relationship studies of chemical mutagens and carcinogens: Mechanistic investigations and prediction approaches. Chem. Rev. 105, 1767-1800. Benigni, R., Bossa, C., 2006. Structural Alerts of Mutagens and Carcinogens. Curr. Comput.Aided Drug Des. 2, 169-176. Benigni, R., Bossa, C., Tcheremenskaia, O., 2013. Nongenotoxic Carcinogenicity of Chemicals: Mechanisms of Action and Early Recognition through a New Set of Structural Alerts. Chem. Rev. 113, 2940-2957. Benigni, R., Bossa, C., Tcheremenskaia, O., Giuliani, A., 2010. Alternatives to the carcinogenicity bioassay: in silico methods, and the in vitro and in vivo mutagenicity assays. Expert Opin. Drug Metab. Toxicol. 6, 809-819. Besler, B.H., Merz, K.M.J., Kollman, P.A., 1990. Atomic charges derived from semiempirical methods. J. Comput. Chem. 11, 431-439. Boelsterli, U.A., Ho, H.K., Zhou, S., Yeow Leow, K., 2006. Bioactivation and Hepatotoxicity of Nitroaromatic Drugs. Curr. Drug Metab. 7, 715-727. Breiman, L., Spector, P., 1992. Submodel Selection and Evaluation in Regression: The XRandom Case. Int. Statist. Rev. 60, 291-319. Burden, F.R., Ford, M.G., Whitley, D.C., Winkler, D.A., 2000. Use of automatic relevance determination in QSAR studies using Bayesian neural networks. J. Chem. Inf. Comput. Sci. 40, 1423-1430. Caliendo, G., Fattorusso, C., Greco, G., Novellinor, E., Perissutti, E., Santagada, V., 1995. Shape-Dependent Effects in a Series of Aromatic Nitro Compounds Acting as Mutagenic Agents on S. Typhimurium TA98. SAR QSAR Environ. Res. 4, 21-27.

43

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN

US

CR

IP

T

Cammi, R., Tomasi, J., 1995. Remarks on the use of the apparent surface charges (ASC) methods in solvation problems: Iterative versus matrix-inversion procedures and the renormalization of the apparent charges. J. Comput. Chem. 16, 1449-1458. Casalegno, M., Sello, G., Benfenati, E., 2008. Definition and Detection of Outliers in Chemical Space. J. Chem. Inf. Model. 48, 1592-1601. Cherkasov, A., Muratov, E.N., Fourches, D., Varnek, A., Baskin, I.I., Cronin, M., Dearden, J., Gramatica, P., Martin, Y.C., Todeschini, R., Consonni, V., Kuz’min, V.E., Cramer, R., Benigni, R., Yang, C., Rathman, J., Terfloth, L., Gasteiger, J., Richard, A., Tropsha, A., 2014. QSAR Modeling: Where Have You Been? Where Are You Going To? J. Med. Chem. 57, 4977-5010. Chirico, N., Gramatica, P., 2011. Real External Predictivity of QSAR Models: How To Evaluate It? Comparison of Different Validation Criteria and Proposal of Using the Concordance Correlation Coefficient. J. Chem. Inf. Model. 51, 2320-2335. Chirico, N., Gramatica, P., 2012. Real External Predictivity of QSAR Models. Part 2. New Intercomparable Thresholds for Different Validation Criteria and the Need for Scatter Plot Inspection. J. Chem. Inf. Model. 52, 2044-2058. Clark, M., Cramer, R.D., 1993. The Probability of Chance Correlation Using Partial Least Squares (PLS). Quant. Struct.-Act. Relat. 12, 137-145. Claxton, L.D., Umbuzeiro, G.d.A., DeMarini, D.M., 2010. The Salmonella mutagenicity assay: The stethoscope of genetic toxicology for the 21st century. Environ. Health Perspect. 118, 1515. Coe, K.J., Jia, Y., Ho, H.K., Rademacher, P., Bammler, T.K., Beyer, R.P., Farin, F.M., Woodke, L., Plymate, S.R., Fausto, N., Nelson, S.D., 2007. Comparison of the Cytotoxicity of the Nitroaromatic Drug Flutamide to Its Cyano Analogue in the Hepatocyte Cell Line TAMH: Evidence for Complex I Inhibition and Mitochondrial Dysfunction Using Toxicogenomic Screening. Chem. Res. Toxicol. 20, 1277-1290. Consonni, V., Ballabio, D., Todeschini, R., 2009. Comments on the Definition of the Q2 Parameter for QSAR Validation. J. Chem. Inf. Model. 49, 1669-1678. Cortes, C., Vapnik, V., 1995. Support-Vector Networks. Mach. Learn. 20, 273-297. Cronin, M.T.D., Gregory, B.W., Schultz, T.W., 1998. Quantitative Structure−Activity Analyses of Nitrobenzene Toxicity to Tetrahymena pyriformis. Chem. Res. Toxicol. 11, 902-908. Dearden, J.C., Cronin, M.T.D., Kaiser, K.L.E., 2009. How not to develop a quantitative structure–activity or structure–property relationship (QSAR/QSPR). SAR QSAR Environ. Res. 20, 241-266. Debnath, A.K., Debnath, G., Shusterman, A.J., Hansch, C., 1992. A QSAR investigation of the role of hydrophobicity in regulating mutagenicity in the Ames test: 1. Mutagenicity of aromatic and heteroaromatic amines in Salmonella typhimurium TA98 and TA100. Environ. Mol. Mutagen. 19, 37-52. Debnath, A.K., Hansch, C., 1993. Mechanistic Interpretation of the Genotoxicity of Nitrofurans (Antibacterial Agents) Using Quantitative Structure-Activity Relationships and Comparative Molecular Field Analysis. J. Med. Chem. 36, 1007-1016. Debnath, A.K., Lopez de Compadre, R., Debnath, G., Shusterman, A.J., Hansch, C., 1991. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. J. Med. Chem. 34, 786-797. Ding, Y.-L., Shih, Y.-H., Tsai, F.-Y., Leong, M.K., 2014. In Silico Prediction of Inhibition of Promiscuous Breast Cancer Resistance Protein (BCRP/ABCG2). PLoS ONE 9, e90689. Dobo, K.L., Greene, N., Fred, C., Glowienke, S., Harvey, J.S., Hasselgren, C., Jolly, R., Kenyon, M.O., Munzner, J.B., Muster, W., Neft, R., Vijayaraj Reddy, M., White, A.T., Weiner, S., 2012.

44

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN

US

CR

IP

T

In silico methods combined with expert knowledge rule out mutagenic potential of pharmaceutical impurities: An industry survey. Regul. Toxicol. Pharmacol. 62, 449-455. El-Bayoumy, K., Lavoie, E.J., Hecht, S.S., Fow, E.A., Hoffmann, D., 1981. The influence of methyl substitution on the mutagenicity of nitronaphthalenes and nitrobiphenyls. Mutat. Res.Fund. Mol. M. 81, 143-153. Escobar, P.A., Kemper, R.A., Tarca, J., Nicolette, J., Kenyon, M., Glowienke, S., Sawant, S.G., Christensen, J., Johnson, T.E., McKnight, C., Ward, G., Galloway, S.M., Custer, L., Gocke, E., O’Donovan, M.R., Braun, K., Snyder, R.D., Mahadevan, B., 2013. Bacterial mutagenicity screening in the pharmaceutical industry. Mutat. Res.-Rev. Mutat. Res. 752, 99-118. Fan, M., Byrd, C., Compadre, C.M., Compadre, R.L., 1998. Comparison of CoMFA models for Salmonella typhimurium TA98, TA100, TA98 + S9 and TA100 + S9 mutagenicity of nitroaromatics. SAR QSAR Environ. Res. 9, 187-215. Fioravanzo, E., Bassan, A., Pavan, M., Mostrag-Szlichtyng, A., Worth, A.P., 2012. Role of in silico genotoxicity tools in the regulatory assessment of pharmaceutical impurities. SAR QSAR Environ. Res. 23, 257-277. Fu, P.P., 1990. Metabolism of Nitro-Polycyclic Aromatic Hydrocarbons. Drug Metab. Rev. 22, 209-268. Fu, P.P., Herreno‐Saenz, D., 1999. Nitro‐polycyclic aromatic hydrocarbons: A class of genotoxic environmental pollutants. J. Environ. Sci. Health Pt. C-Environ. Carcinog. Ecotoxicol. Rev. 17, 1-43. Fujita, T., Winkler, D.A., 2016. Understanding the Roles of the “Two QSARs”. J. Chem. Inf. Model. 56, 269-274. Gnanadesikan, R., Kettenring, J.R., 1972. Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28, 81-124. Golbraikh, A., Shen, M., Xiao, Z.Y., Xiao, Y.D., Lee, K.H., Tropsha, A., 2003. Rational selection of training and test sets for the development of validated QSAR models. J. Comput.Aided Mol. Des. 17, 241-253. Gramatica, P., Chirico, N., Papa, E., Cassani, S., Kovarich, S., 2013. QSARINS: A new software for the development, analysis, and validation of QSAR MLR models. J. Comput. Chem. 34, 2121-2132. Gramatica, P., Sangion, A., 2016. A Historical Excursus on the Statistical Validation Parameters for QSAR Models: A Clarification Concerning Metrics and Terminology. J. Chem. Inf. Model. 56, 1127-1131. Greene, N., Pennie, W., 2015. Computational toxicology, friend or foe? Toxicol. Res. 4, 11591172. Guyon, I., Weston, J., Barnhill, S., Vapnik, V., 2002. Gene Selection for Cancer Classification using Support Vector Machines. Mach. Learn. 46, 389-422. Hakimelahi, G.H., Khodarahmi, G.A., 2005. The identification of toxicophores for the prediction of mutagenicity, hepatotoxicity and cardiotoxicity. J. Iran. Chem. Soc. 2, 244-267. Hasegawa, K., Funatsu, K., 2010. Non-Linear Modeling and Chemical Interpretation with Aid of Support Vector Machine and Regression. Curr. Comput.-Aided Drug Des. 6, 24-36. Hillebrecht, A., Muster, W., Brigo, A., Kansy, M., Weiser, T., Singer, T., 2011. Comparative Evaluation of in Silico Systems for Ames Test Mutagenicity Prediction: Scope and Limitations. Chem. Res. Toxicol. 24, 843-854. Hooberman, B.H., Brezzell, M.D., Das, S.K., You, Z., Sinsheimer, J.E., 1994. Substituent effects on the genotoxicity of 4-nitrostilbene derivatives. Mutat. Res. 341, 57-69.

45

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN

US

CR

IP

T

Hornberg, J.J., Laursen, M., Brenden, N., Persson, M., Thougaard, A.V., Toft, D.B., Mow, T., 2014. Exploratory toxicology as an integrated part of drug discovery. Part II: Screening strategies. Drug Discov. Today 19, 1137-1144. Hou, T., Li, Y., Zhang, W., Wang, J., 2009. Recent developments of in silico predictions of intestinal absorption and oral bioavailability. Comb. Chem. High Throughput Screen. 12, 497506. Ichikawa, M., Yamamoto, K., Tanaka, A., Swaminathan, S., Hatcher, J.F., Erturk, E., Bryan, G.T., 1986. Mutagenicity of 3,4-diphenyl-5-nitrofuran analogs in Salmonella typhimurium. Carcinogenesis 7, 1339-1344. Juneja, T.R., Kaur, B., Gupta, R.L., 1991. Mutagenicity of 4-nitrodiphenyl thioether-derived products and their potential metabolites. Mutat. Res. Lett. 263, 13-19. Juneja, T.R., Talukdar, A., Mehta, N., Gupta, R.L., 2001. Effect of various alkyl and unsaturated substituents on the mutagenicity of some nitrophenyl thioethers. Mutat. Res. Genet. Toxicol. Environ. Mutagen. 495, 97-102. Jung, H., Heflich, R.H., Fu, P.P., Shaikh, A.U., P. E. Hartman, 1991. Nitro group orientation, reduction potential, and direct-acting mutagenicity of nitro-polycyclic aromatic hydrocarbons. Environ. Mol. Mutagen. 17, 169-180. Kecman, V., 2001. Learning and Soft Computing : Support Vector Machines, Neural Networks, and Fuzzy Logic Models. MIT Press. Kettaneh, N., Berglund, A., Wold, S., 2005. PCA and PLS with very large data sets. Comput. Stat. Da. An. 48, 69-85. King, R.D., Muggleton, S.H., Srinivasan, A., Sternberg, M.J., 1996. Structure-activity relationships derived by machine learning: the use of atoms and their bond connectivities to predict mutagenicity by inductive logic programming. Proc. Natl. Acad. Sci. USA 93, 438-442. Klein, M., Erdinger, L., Boche, G., 2000a. From mutagenic to non-mutagenic nitroarenes: effect of bulky alkyl substituents on the mutagenic activity of nitroaromatics in Salmonella typhimurium: Part II. Substituents far away from the nitro group. Mutat. Res. 467, 69-82. Klein, M., Erdinger, L., Boche, G., 2000b. From mutagenic to non-mutagenic nitroarenes: effect of bulky alkyl substituents on the mutagenic activity of nitroaromatics in Salmonella typhimurium: Part II. Substituents far away from the nitro group. Mutat. Res. Genet. Toxicol. Environ. Mutagen. 467, 69-82. Klein, M., Voigtmann, U., Haack, T., Erdinger, L., Boche, G., 2000c. From mutagenic to nonmutagenic nitroarenes: effect of bulky alkyl substituents on the mutagenic activity of 4nitrobiphenyl in Salmonella typhimurium: Part I. Substituents ortho to the nitro group and in 2'position. Mutat. Res. 467, 55-68. Kovacic, P., Somanathan, R., 2006. Mechanism of teratogenesis: Electron transfer, reactive oxygen species, and antioxidants. Br. Defects Res. Part A: Embryo Today: Reviews 78, 308-325. LaVoie, E.J., Govil, A., Briggs, G., Hoffmann, D., 1981. Mutagenicity of aminocarbazoles and nitrocarbazoles. Mutat. Res. Genet. Toxicol. 90, 337-344. Leong, M.K., Chen, Y.-M., Chen, T.-H., 2009. Prediction of Human Cytochrome P450 2B6Substrate Interactions Using Hierarchical Support Vector Regression Approach. J. Comput. Chem. 30, 1899-1909. Leong, M.K., Lin, S.-W., Chen, H.-B., Tsai, F.-Y., 2010. Predicting Mutagenicity of Aromatic Amines by Various Machine Learning Approaches. Toxicol. Sci. 116, 498-513.

46

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN

US

CR

IP

T

Li, Q., Minami, M., Hanaoka, T., Yamamura, Y., 1999. Acute immunotoxicity of pchloronitrobenzene in mice: II. Effect of p-chloronitrobenzene on the immunophenotype of murine splenocytes determined by flow cytometry. Toxicology 137, 35-45. Librando, V., Alparone, A., Tomaselli, G., 2008. Electronic properties of some nitrobenzo[a]pyrene isomers: a possible relationship to mutagenic activity. J. Mol. Model. 14, 489-497. Liu, J., Li, Y., Pan, D., Hopfinger, A.J., 2005. Predicting permeability coefficient in ADMET evaluation by using different membranes-interaction QSAR. Int. J. Pharm. 304, 115-123. LoPachin, R.M., Gavin, T., DeCaprio, A., Barber, D.S., 2012. Application of the Hard and Soft, Acids and Bases (HSAB) Theory to Toxicant–Target Interactions. Chem. Res. Toxicol. 25, 239251. Lopez de Compadre, R.L., Byrd, C., Compadre, C.M., 1998. Comparative QSAR and 3D-QSAR Analysis of the Mutagenicity of Nitroaromatic Compounds, In: Devillers, J. (Ed.), Comparative QSAR. Taylor & Francis, Washington, DC, pp. 111-136. Lopez de Compadre, R.L., Debnath, A.K., Shusterman, A.J., Hansch, C., 1990. LUMO Energies and hydrophobicity as determinants of mutagenicity by nitroaromatic compounds in Salmonella typhimurium. Environ. Mol. Mutagen. 15, 44-55. Lopez De Compadre, R.L., Shusterman, A.J., Hansch, C., 1988. The role of hydrophobicity in the Ames test. The correlation of the mutagenicity of nitropolycyclic hydrocarbons with partition coefficients and molecular orbital indices. Int. J. Quantum. Chem. 34, 91-101. Ludolph, B., Klein, M., Erdinger, L., Boche, G., 2001. The effects of 4'-alkyl substituents on the mutagenic activity of 4-amino- and 4-nitrostilbenes in Salmonella typhimurium. Mutat. Res. Genet. Toxicol. Environ. Mutagen. 491, 195-209. Martelli, A., Brambilla Campart, G., Carrozzino, R., Ghia, M., Mattioli, F., Mereto, E., Orsi, P., Porta Puglia, C., 2000. Evaluation of Flutamide Genotoxicity in Rats and in Primary Human Hepatocytes. Pharmacol. Toxicol. 86, 129-134. Maynard, A.T., Pedersen, L.G., Posner, H.S., McKinney, J.D., 1986. An Ab initio study of the relationship between nitroarene mutagenicity and electron affinity. Mol. Pharmacol. 29, 629-636. McCarren, P., Springer, C., Whitehead, L., 2011. An investigation into pharmaceutically relevant mutagenicity data and the influence on Ames predictive potential. J. Cheminformatics 3, 51. Miertuš, S., Scrocco, E., Tomasi, J., 1981. Electrostatic interaction of a solute with a continuum. A direct utilizaion of AB initio molecular potentials for the prevision of solvent effects. Chem. Phys. 55, 117-129. Mortelmans, K., Zeiger, E., 2000. The Ames Salmonella/microsome mutagenicity assay. Mutat. Res. 455, 29-60. Nair, P.C., Sobhia, M.E., 2008. Comparative QSTR studies for predicting mutagenicity of nitro compounds. J. Mol. Graph. Model. 26, 916-934. Naven, R.T., Greene, N., Williams, R.V., 2012. Latest advances in computational genotoxicity prediction. Expert Opin. Drug Metab. Toxicol. 8, 1579-1587. Netzeva, T.I., Worth, A., Aldenberg, T., Benigni, R., Cronin, M.T.D., Gramatica, P., Jaworska, J.S., Kahn, S., Klopman, G., Marchant, C.A., Myatt, G., Nikolova-Jeliazkova, N., Patlewicz, G., Perkins, R., Roberts, D.W., Schultz, T.W., Stanton, D.W., van de Sandt, J.J., Tong, W., Veith, G., Yang, C., 2005. Current status of methods for defining the applicability domain of (Quantitative) structure-activity relationships : The report and recommendations of ECVAM workshop 52. Altern. Lab. Anim. 33, 19.

47

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN

US

CR

IP

T

Ojha, P.K., Mitra, I., Das, R.N., Roy, K., 2011. Further exploring rm2 metrics for validation of QSPR models. Chemometrics Intell. Lab. Syst. 107, 194-205. Patlewicz, G., Rodford, R., Walker, J.D., 2003. Quantitative structure-activity relationships for predicting mutagenicity and carcinogenicity. Environ. Toxicol. Chem. 22, 1885-1893. Pederson, T.C., Siak, J.S., 1981. The role of nitroaromatic compounds in the direct-acting mutagenicity of diesel particle extracts. J. Appl. Toxicol. 1, 54-60. Philbert, M.A., Billingsley, M.L., Reuhl, K.R., 2000. Mechanisms of Injury in the Central Nervous System. Toxicol. Pathol. 28, 43-53. Purohit, V., Basu, A.K., 2000. Mutagenicity of nitroaromatic compounds. Chem. Res. Toxicol. 13, 673-692. Rao, K., Xu, Y., Shaw, E., Parton, J.W., 2004. Mutagenicity testing applied for regulation of developing products. Curr. Separations 20, 141-144. Ren, Y., Liu, H., Yao, X., Liu, M., 2007. Prediction of ozone tropospheric degradation rate constants by projection pursuit regression. Anal. Chim. Acta 589, 150-158. Reymond, J.-L., van Deursen, R., Blum, L.C., Ruddigkeit, L., 2010. Chemical space as a source for new drugs. MedChemComm 1, 30-38. Rogers, D., Hopfinger, A.J., 1994. Application of Genetic Function Approximation to Quantitative Structure-Activity Relationships and Quantitative Structure-Property Relationships. J. Chem. Inf. Comput. Sci. 34, 854-866. Rosenkranz, H.S., McCoy, E.C., Frierson, M., Klopman, G., 1985. The role of DNA sequence and structure of the electrophile on the mutagenicity of nitroarenes and arylamine derivatives. Environ. Mutagen. 7, 645-653. Rosenkranz, H.S., Mermelstein, R., 1985. The genotoxicity, metabolism and carcinogenicity of nitrated polycyclic aromatic hydrocarbons. J. Environ. Sci. Health Pt. C-Environ. Carcinog. Ecotoxicol. Rev. 3, 221-272. Roy, K., Mitra, I., Kar, S., Ojha, P.K., Das, R.N., Kabir, H., 2012. Comparative Studies on Some Metrics for External Validation of QSPR Models. J. Chem. Inf. Model. 52, 396-408. Sabbioni, G., 1994. Hemoglobin Binding of Nitroarenes and Quantitative Structure-Activity Relationships. Chem. Res. Toxicol. 7, 267-274. Schüürmann, G., Ebert, R.-U., Chen, J., Wang, B., Kühne, R., 2008. External Validation and Prediction Employing the Predictive Squared Correlation Coefficient-Test Set Activity Mean vs Training Set Activity Mean. J. Chem. Inf. Model. 48, 2140-2145. Schölkopf, B., Smola, A., 2002. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, 1st ed. MIT Press, Cambridge, MA. Scott, D.W., 2010. Averaged shifted histogram. Wiley Interdisciplinary Reviews: Computational Statistics 2, 160-164. Simon-Hettich, B., Rothfuss, A., Steger-Hartmann, T., 2006. Use of computer-assisted prediction of toxic effects of chemical substances. Toxicology 224, 156-162. Spain, J.C., 1995. Biodegradation of nitroaromatic compounds. Annu. Rev. Microbiol. 49, 523555. Sproviero, M., Verwey, A.M.R., Rankin, K.M., Witham, A.A., Soldatov, D.V., Manderville, R.A., Fekry, M.I., Sturla, S.J., Sharma, P., Wetmore, S.D., 2014. Structural and biochemical impact of C8-aryl-guanine adducts within the NarI recognition DNA sequence: influence of aryl ring size on targeted and semi-targeted mutagenicity. Nucleic Acids Res. 42, 13405-13421. Stepan, A.F., Walker, D.P., Bauman, J., Price, D.A., Baillie, T.A., Kalgutkar, A.S., Aleo, M.D., 2011. Structural Alert/Reactive Metabolite Concept as Applied in Medicinal Chemistry to

48

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN

US

CR

IP

T

Mitigate the Risk of Idiosyncratic Drug Toxicity: A Perspective Based on the Critical Examination of Trends in the Top 200 Drugs Marketed in the United States. Chem. Res. Toxicol. 24, 1345-1410. Sugano, K., Kansy, M., Artursson, P., Avdeef, A., Bendels, S., Di, L., Ecker, G.F., Faller, B., Fischer, H., Gerebtzoff, G., Lennernaes, H., Senner, F., 2010. Coexistence of passive and carriermediated processes in drug transport. Nat. Rev. Drug Discov. 9, 597-614. Takamura-Enya, T., Suzuki, H., Hisamatsu, Y., 2006. Mutagenic activities and physicochemical properties of selected nitrobenzanthrones. Mutagenesis 21, 399-404. Takamura-Enya, T., Watanabe, T., Tada, A., Hirayama, T., Nukaya, H., Sugimura, T., Wakabayashi, K., 2002. Identification of a New Mutagenic Polychlorinated Biphenyl Derivative in the Waka River, Wakayama, Japan, Showing Activation of an Aryl Hydrocarbon ReceptorDependent Transcription. Chem. Res. Toxicol. 15, 419-425. Todeschini, R., Ballabio, D., Grisoni, F., 2016. Beware of Unreliable Q2! A Comparative Study of Regression Metrics for Predictivity Assessment of QSAR Models. J. Chem. Inf. Model. 56, 1905-1913. Tokiwa, H., Nakagawa, R., Ohnishi, Y., 1981. Mutagenic assay of aromatic nitro compounds with Salmonella typhimurium. Mutat. Res. Lett. 91, 321-325. Tokiwa, H., Sera, N., Fukuhara, K., Utsumi, H., Sasaki, S., Miyata, N., 2003. Structural activity relationship between Salmonella-mutagenicity and nitro-orientation of nitroazaphenanthrenes. Chem.-Biol. Interact. 146, 19-25. Topliss, J.G., Edwards, R.P., 1979. Chance factors in studies of quantitative structure-activity relationships. J. Med. Chem. 22, 1238-1244. Travlos, G.S., Mahler, J., Ragan, H.A., Chou, B.J., Bucher, J.R., 1996. Thirteen-Week Inhalation Toxicity of 2- and 4-Chloronitrobenzene in F344/N Rats and B6C3F1 Mice. Toxicol. Sci. 30, 7592. Tropsha, A., 2012. Recent Trends in Statistical QSAR Modeling of Environmental Chemical Toxicity, In: Luch, A. (Ed.), Molecular, Clinical and Environmental Toxicology: Volume 3: Environmental Toxicology. Springer Basel, pp. 381-411. Tropsha, A., Gramatica, P., Gombar, Vijay K., 2003. The Importance of Being Earnest: Validation is the Absolute Essential for Successful Application and Interpretation of QSPR Models. QSAR Comb. Sci. 22, 69-77. Tseng, Y.J., Hopfinger, A.J., Esposito, E.X., 2012. The great descriptor melting pot: mixing descriptors for the common good of QSAR models. J. Comput.-Aided Mol. Des. 26, 39-43. Valerio, L.G., Jr., 2009. In silico toxicology for the pharmaceutical sciences. Toxicol. Appl. Pharmacol. 241, 356-370. Valerio, L.G., Jr., 2013. Predictive Computational Toxicology to Support Drug Safety Assessment, In: Reisfeld, B., Mayeno, A.N. (Eds.), Computational Toxicology. Humana Press, New York, pp. 341-354. Valerio, L.G., Jr., Arvidson, K.B., Chanderbhan, R.F., Contrera, J.F., 2007. Prediction of rodent carcinogenic potential of naturally occurring chemicals in the human diet using high-throughput QSAR predictive modeling. Toxicol. Appl. Pharmacol. 222, 1-16. Vance, W.A., Wang, Y.Y., Okamoto, H.S., 1987. Disubstituted amino-, nitroso-, and nitrofluorenes: a physicochemical basis for structure-activity relationships in Salmonella typhimurium. Environ. Mutagen. 9, 123-141. Vapnik, V., Golowich, S., Smola, A., 1997. Support vector method for function approximation, regression estimation, and signal processing, In: Mozer, M., Jordan, M.I., Petsche, T. (Eds.),

49

ACCEPTED MANUSCRIPT

AC

CE

PT

ED

M

AN

US

CR

IP

T

Advances in Neural Information Processing Systems 9. MIT Press, Cambridge, MA, USA, pp. 281-287. Venkatapathy, R., Wang, C.Y., Bruce, R.M., Moudgal, C., 2009. Development of quantitative structure-activity relationship (QSAR) models to predict the carcinogenic potency of chemicals: I. Alternative toxicity measures as an estimator of carcinogenic potency. Toxicol. Appl. Pharmacol. 234, 209-221. Vogt, R.A., Rahman, S., Crespo-Hernández, C.E., 2010. Structure–Activity Relationships in Nitro-Aromatic Compounds, In: Leszczynski, J., Shukla, M.K. (Eds.), Practical Aspects of Computational Chemistry: Methods, Concepts and Applications. Springer, New York, pp. 217240. von Tungeln, L.S., Ewing, D.G., Weitkamp, R., Cheng, E., Herreno-Saenz, D., Evans, F.E., Fu, P.P., 1994. Metabolic Activation of the Potent Mutagen and Tumorigen 2-Nitrobenzo[a]pyrene. Polycyclic Aromat. Compd. 7, 91-98. Wang, X.D., Lin, Z.F., Yin, D.Q., Liu, S.S., Wang, L.S., 2005. 2D/3D-QSAR comparative study on mutagenicity of nitroaromatics. Sci. China Ser. B. 48, 246-252. Watanabe, T., Hasei, T., Takahashi, T., Asanoma, M., Murahashi, T., Hirayama, T., Wakabayashi, K., 2005. Detection of a Novel Mutagen, 3,6-Dinitrobenzo[e]pyrene, as a Major Contaminant in Surface Soil in Osaka and Aichi Prefectures, Japan. Chem. Res. Toxicol. 18, 283-289. Watanabe, T., Kaji, H., Kasai, T., Hirayama, T., 1994. Metabolic activation of nitrodibenzofurans by rat liver in Salmonella/mutagenicity test. Mutat. Res. Lett. 325, 11-19. Wold, S., 1995. PLS for Multivariate Linear Modeling, In: van de Waterbeemd, H. (Ed.), Chemometric Methods in Molecular Design. VCH, Weinheim, pp. 195-218. World Health Organization, 2003. Selected nitro and nitro-oxy-polycyclic aromatic hydrocarbons, Environmental Health Criteria (EHC) Monographs. International Programme on Chemical Safety (IPCS), WHO, Geneva. Xu, C., Cheng, F., Chen, L., Du, Z., Li, W., Liu, G., Lee, P.W., Tang, Y., 2012. In silico Prediction of Chemical Ames Mutagenicity. J. Chem. Inf. Model. 52, 2840-2847. Yan, X.-F., Xiao, H.-M., Gong, X.-D., Ju, X.-H., 2005. Quantitative structure–activity relationships of nitroaromatics toxicity to the algae (Scenedesmus obliguus). Chemosphere 59, 467-471. You, Z., Brezzell, M.D., Das, S.K., Espadas-Torre, M.C., Hooberman, B.H., Sinsheimer, J.E., 1993. ortho-Substituent effects on the in vitro and in vivo genotoxicity of benzidine derivatives. Mutat. Res. Genet. Toxicol. 319, 19-30. Yu, S., Herreno-Saenz, D., Miller, D.W., Kadlubar, F.F., Fu, P.P., 1992. Mutagenicity of nitropolycyclic aromatic hydrocarbons with the nitro substituent situated at the longest molecular axis. Mutat. Res. Lett. 283, 45-52.

50

ACCEPTED MANUSCRIPT Highlights Nitroaromatic compounds are potentially toxic



The selected descriptors can comprehensively render the complicated mutagenicity



PLS model provides interpretability



HSVR model provides predictivity.



Two-QSAR approach can be used to facilitate drug discovery and development

AC

CE

PT

ED

M

AN

US

CR

IP

T



51