Mechanistic studies of congener-specific adsorption and bioaccumulation of polycyclic aromatic hydrocarbons and phthalates in soil by novel QSARs


Environmental Research 179 (2019) 108838

Contents lists available at ScienceDirect

Environmental Research journal homepage: www.elsevier.com/locate/envres


Jun Cai a,b, Chenggang Gu a,∗, Qingqing Ti a,b, Chang Liu a,b, Yongrong Bian a, Cheng Sun c, Xin Jiang a

a Key Laboratory of Soil Environment and Pollution Remediation, Institute of Soil Science, Chinese Academy of Sciences, Nanjing, 210008, PR China
b University of the Chinese Academy of Sciences, Beijing, 100049, PR China
c State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, 210023, PR China

ARTICLE INFO

Keywords: PAHs; PAEs; Soil adsorption; Bioaccumulation; Quantitative structure-activity relationships

ABSTRACT

Polycyclic aromatic hydrocarbons (PAHs) and phthalic acid esters (PAEs), which are structurally characterized by one or more aromatic skeletons, are regarded as two important groups of organic pollutants because of their widespread distribution and notorious toxic effects in soils. Relative to the great number of structural analogues or congeners detected in soil, however, the soil adsorption of PAHs/PAEs and their bioaccumulation by plants are far less studied, owing to insufficient experimental determinations and a lack of insight into the inherent structural requirements. To mechanistically evaluate the congener-specific soil adsorption and bioaccumulation of PAHs/PAEs, quantitative structure-activity relationships (QSARs) were developed by density functional theory (DFT) computation and partial least squares (PLS) analysis. As verified by the high cumulative variance coefficients and cross-validated correlation coefficients, which indicate strong stability, interpretability and predictability, the QSARs could be used to predict unknown adsorption potency or bioavailability within their respective applicability domains. The QSARs indicated that the structural requirements of PAHs/PAEs for strengthening soil adsorption were mainly attributable to molecular polarizability and the associated dispersion interaction with soil. As regards bioaccumulation by carrot, the aggravation of spherical polarity change of molecules and the involved electrostatic interaction with soil entities, or electron transfer from the highest occupied molecular orbital (HOMO) of PAHs/PAEs, was implied to be inherently decisive for the variance of bioavailability among congeners. Based on the holistic view of the negative correlation, soil adsorption seemed to act as a forceful constraint that decreases the bioaccumulation of PAHs/PAEs, and it could also serve as a preliminary gauge of bioavailability and risks to the soil ecosystem. The results would thus help to better understand soil adsorption and bioaccumulation through informative mechanistic insights and provide data support for the ecological risk assessment of PAHs/PAEs in soils.

1. Introduction

With the rapid development of the social economy and modern agriculture, a large number of organic pollutants have entered the soil via different pathways, such as atmospheric deposition, sewage irrigation, livestock manure, fertilization, pesticide spraying and straw burning. Among them, the polycyclic aromatic hydrocarbons (PAHs) and phthalic acid esters (PAEs), which are structurally characterized by one or more aromatic skeletons, are typified as two important groups of organic pollutants because of their ubiquitous distribution and high occurrence in farmland soil (Jiang et al., 2011; Kong et al., 2012; Xu et al., 2016). A large-scale survey of 38 agricultural greenhouse soils in Spain revealed di-(2-ethylhexyl) phthalate (DEHP) at 1000–63000 μg kg−1, which was significantly higher than the US control levels for phthalate pollutants (Vidal, 2012). In the surface soil of farmland in Changchun City, China, the average content of PAHs was reported to be as high as 2954.9 μg kg−1, and all samples exceeded the standards set by the Canadian Minister of Agriculture and Environment (Chen et al., 2016). In previous research, the pollution profile of different farmland soils in China also demonstrated that the detected levels of PAHs and PAEs were much higher than those of co-existing organic pollutants, e.g. organochlorine pesticides (OCPs) and polychlorinated biphenyls (PCBs), so that special concern over PAHs and PAEs pollution is warranted (Sun et al., 2018). As

Corresponding author. Institute of Soil Science, Chinese Academy of Sciences, No.71 East Beijing Road, PR China. E-mail address: [email protected] (C. Gu).

https://doi.org/10.1016/j.envres.2019.108838 Received 19 August 2019; Received in revised form 12 October 2019; Accepted 17 October 2019 Available online 22 October 2019 0013-9351/ © 2019 Elsevier Inc. All rights reserved.


generally characterized by high hydrophobicity and bioaccumulativity, PAHs and PAEs can readily enter body tissue and impose strong adverse impacts on human health and wildlife. A solid body of in vivo and in vitro investigations has verified the endocrine disrupting effect and the typical toxicity related to carcinogenicity, teratogenicity and mutagenicity for PAHs and PAEs (Bansal and Kim, 2015). More specifically, PAHs were considered to interfere with the functionality of cell membranes and membrane-related enzymes, and to affect the germination rate of plants as an effective immunosuppressant (Abdel-Shafy and Mansour, 2016). Besides the strong reproductive and developmental toxicity, a reduction of vitamin C and capsaicin in pepper was also indicated on exposure to PAEs and their metabolites in soil (Yin et al., 2003). In view of the ecological risks and potential hazards to food security, PAHs and PAEs have already been listed as priority pollutants in many countries and have attracted much concern from the scientific community. To understand the pollution occurrence, potential effects and environmental fate, it is of great significance to pay more attention to the primary behaviors, e.g. adsorption, migration and bioaccumulation, of PAHs and PAEs in soil.

When hydrophobic aromatic pollutants enter the soil matrix, partition interactions with the water phase, soil particles or organisms are first involved through a string of physicochemically coupled processes, such as adsorption/desorption, uptake/release or dissolution/precipitation, and finally the equilibrium partition among different phases is reached in conformity with the equilibrium partitioning theory (EPT) (Sijm et al., 2000).
By virtue of immobilization, either through noncovalent contacts with soil organic matter or through deep residence in mineral lattices or multilayers, hydrophobic aromatic pollutants can be effectively adsorbed on soil particles, whereas a small portion of them is re-dissolved into soil pore water via desorption. Adsorption and desorption play a key role in regulating the interphase equilibrium, which greatly affects the subsequent transport, accumulation, degradation and bioavailability of pollutants (Yang et al., 2013). In practice, the distribution between the fraction adsorbed by soil particles and that dissolved in soil pore water is quantitatively described by the well-defined soil adsorption index, namely KOC, which is regarded as a basic property of pollutants for evaluating adsorption capacity and transfer potency. A higher KOC signifies stronger adsorption capacity and lower mobility. Besides its structural relevance, the soil adsorption index, which bears a strong resemblance to the n-octanol/water partition coefficient (KOW), is closely associated with hydrophobicity, as clearly evidenced in previous studies (Sabljić et al., 1995; Wen et al., 2012).

As an event accompanying adsorption, the transfer of freely dissolved and weakly bound species across soil pore water or through direct contact with soil particles could greatly promote accumulation by organisms. Besides their liability to accumulate in lipid bodies, hydrophobic compounds such as PAHs and PAEs in rhizosphere soil are also likely to be activated by root exudates and adsorbed on the root surface, and are further actively transported within the plant from the root to the stem, leaf and fruit compartments with the transpiration stream in the xylem (Undeman et al., 2009). A large number of studies have suggested the massive uptake of PAHs and PAEs by crops in different regions (Fismes et al., 2002; Mo et al., 2009).
The accumulation of pollutants in organisms can thus ultimately endanger human health through the food chain. It is well recognized that risk assessment demands quantitative descriptions of the bioaccumulation potency in organisms rather than the apparent total amount of pollutants. The bioavailability, expressed as the biota-to-soil accumulation factor (BSAF), i.e. the ratio of the pollutant accumulated in the organism to the residual in soil, describes the bioaccumulation potency to a great extent and has been strongly recommended as a basic criterion for the screening of priority pollutants and risk evaluation. Similar to KOC, the bioavailability was also related to the soil properties and the structural character of the pollutant (Dowdy and McKone, 2009), whereas it was independent of KOW when the bioavailability was theoretically defined by EPT as the ratio between the bioconcentration factor (BCF) and KOC (Sijm et al., 2000). In general, the bioaccumulation and bioavailability of a pollutant are not only biologically determined but also closely associated with the pollutant species distribution, which in turn hinges on the primary adsorption on soil. Because of these inseparable links and mutual influences, a holistic view of bioaccumulation and adsorption, which represent the activation and sequestration of pollutants respectively, is critical for understanding the successive behaviors in soil, and the integrative study of KOC and BSAF is of great significance for assessing exposure risks.

However, soil adsorption and bioaccumulation remain insufficiently studied across the variety of structural analogues of PAHs and PAEs. Though structural characters are known to be the inherent determinants of environmental behavior, their relevance to the variance of adsorption and bioaccumulation has not yet been clearly addressed. More experimental determinations of soil adsorption and bioaccumulation are essential to assess the potential risks of pollutants. Because of the huge time consumption and high experimental cost, however, experimentally determined KOC and BSAF values remain quite limited relative to the large number of pollutants. Encouragingly, in silico techniques involving model development, such as quantitative structure-activity relationships (QSARs), have been proposed as powerful supplementary tools for the reliable prediction and mechanistic evaluation of structural relevance for pollutants in soil. According to the guidelines on the development and validation of QSARs issued by the Organisation for Economic Co-operation and Development (OECD) in 2007, a great number of QSARs focused merely upon KOC have been developed through various strategies, e.g.
the linear solvation energy relationships (LSERs) and simple correlations with KOW, water solubility (Sw), topological indices and van der Waals volume (dos Reis et al., 2014; Wang et al., 2009). Despite the rigorous external validation conducted under the OECD guidelines for certain groups of pollutants, the developed QSARs for adsorption might not facilitate the interpretation of structural relevance because of the obscurity of the descriptors involved. Relative to adsorption, moreover, the plant uptake of pollutants in soil and the associated BSAFs have been far less studied from the point of view of QSARs, let alone for the typical PAHs/PAEs in soil.

The objective of this study was to mechanistically evaluate the congener-specific soil adsorption and bioaccumulation of PAHs and PAEs, and to identify the structural requirements inherently decisive for their variance by developing QSARs. Based on homogeneous experimental determinations of KOC and BSAF respectively, the soil adsorption and bioaccumulation of PAHs and PAEs were evaluated at the molecular level. In conformity with the protocols mentioned above, density functional theory (DFT), within the ab initio framework, was employed for geometry optimization and property computation to improve the interpretability of descriptors. To overcome the multicollinearity among different descriptors, the partial least squares (PLS) regression method was selected (Geladi, 1988; Wold et al., 2001). With the quantum chemical calculations and PLS analysis, the QSARs of KOC and BSAF were developed and validated with specific applicability domains (AD), and the structural characteristics underlying the variance of soil adsorption and bioaccumulation were clearly shown.
With the predicted data, the relationship between KOC and BSAF was revealed, and the usefulness of KOC as a surrogate for BSAF in the expeditious risk assessment of PAHs and PAEs was thus confirmed. The results would help to better understand the molecular mechanism of soil adsorption and bioaccumulation of PAHs and PAEs in soil, and provide theoretical guidance for assessing environmental behaviors and ecological risks.


2. Materials and methods

PLS analysis was carried out in SIMCA-P (Demo Version 11.0, Umetrics AB) (Wold et al., 2001). The optional computational conditions were left at their defaults. Prior to analysis, the leave-one-out cross-validation technique was used to determine the effective dimensionality, i.e. the number of principal components (A). The calibration and prediction performance of the model was evaluated using the following statistical parameters. The overall performance of the QSAR was primarily evaluated by the cumulative variance coefficient ($R^2_{y,\mathrm{cum(adj)}}$), the correlation coefficient between the observed and predicted lgKOC or BSAF ($R^2$), the generalized standard error (SE) and the significance level (p). The mathematical formula for $R^2_{y,\mathrm{cum(adj)}}$ is given in Eq. (1).

2.1. Data compilation

The experimental datasets on soil adsorption and bioaccumulation of PAHs and PAEs in soil were taken from previous research (Yang et al., 2013; Chen et al., 2005). The soils used in the experiments had organic matter contents of 0.6%–3.78%, pH of 4.88–8.08 and water contents of 1.73%–3.8%. Because the organic matter content, which is believed to dominate the adsorption of organic pollutants, often varies markedly among soils, the distribution ratio (Kd) of a pollutant at equilibrium between soil and solution was normalized to the organic carbon fraction of the soil (foc) (Huang et al., 2003), i.e. KOC = Kd/foc. For clarity, the KOC values were transformed to the logarithmic scale, namely lgKOC, which homologically spanned about 2 orders of magnitude.

Soil-plant migration was chosen as an important transport and transformation pathway of aromatic hydrocarbon pollutants in soil in order to further evaluate their bioavailability. Generally, the accumulation of organic pollutants in plants stems mainly from soil-root transfer and foliar absorption from air (Fismes et al., 2002; Lin et al., 2007), and it is widely believed that atmospheric-foliar transport is an important way for foliage plants to accumulate organic pollutants (Barber et al., 2004). To evaluate the bioavailability of pollutants in soil more accurately, carrot, whose edible part grows underground, was selected as the accumulator so that the accumulated PAHs and PAEs originated from soil rather than air to the greatest possible extent (Kipopoulou et al., 1999; Wu et al., 2015). The bioaccumulation of PAHs and PAEs in the underground edible part of carrot was quantified as BSAF.
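The organic-carbon normalization described above, KOC = Kd/foc followed by a base-10 logarithm, can be illustrated with a minimal sketch. The function name `lg_koc` and the input values are hypothetical and are not taken from the cited datasets; foc is assumed to be expressed as a mass fraction between 0 and 1.

```python
import math


def lg_koc(kd: float, foc: float) -> float:
    """Return lg(KOC), where KOC = Kd / foc.

    kd  : soil-solution distribution ratio (must be positive)
    foc : organic carbon fraction of the soil, as a fraction in (0, 1]
    """
    if kd <= 0:
        raise ValueError("kd must be positive")
    if not 0 < foc <= 1:
        raise ValueError("foc must be a fraction in (0, 1]")
    return math.log10(kd / foc)


# Hypothetical example: the same Kd measured in two soils with different foc.
# 50 / 0.02 = 2500, so lg KOC is about 3.40; 50 / 0.005 = 10000, so lg KOC = 4.
print(round(lg_koc(50.0, 0.02), 2))
print(round(lg_koc(50.0, 0.005), 2))
```

The example only shows why normalization matters: an identical Kd corresponds to a higher KOC in the soil that contains less organic carbon.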

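$$R^2_{y,\mathrm{cum(adj)}} = 1 - \prod_{j=1}^{A}\left[\frac{\sum_{i=1}^{n}\left(y_i^{\mathrm{fit}} - y_i\right)^2}{\sum_{i=1}^{n}\left(y_i - y_{\mathrm{mean}}\right)^2}\right]_j \qquad (j = 1,\ldots,A) \tag{1}$$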

where $y_i$ and $y_i^{\mathrm{fit}}$ are the observed and fitted values of lgKOC or BSAF of the ith compound respectively, and $y_{\mathrm{mean}}$ is the average observed value of the n compounds in the training set. The cumulative cross-validated correlation coefficient ($Q^2_{\mathrm{cum}}$) measures the goodness of fit and the internal predictability of QSARs (de Campos and de Melo, 2014; Davis et al., 2016); a value greater than 0.5 signifies good stability and internal predictability (Golbraikh and Tropsha, 2002). $Q^2_{\mathrm{cum}}$ is derived from the following equation.

$$Q^2_{\mathrm{cum}} = 1 - \prod_{j=1}^{A}\left[\frac{\sum_{i=1}^{n}\left(y_i - y_i^{\mathrm{pred}}\right)^2}{\sum_{i=1}^{n}\left(y_i - y_{\mathrm{mean}}\right)^2}\right]_j \qquad (j = 1,\ldots,A) \tag{2}$$

where $y_i^{\mathrm{pred}}$ is the predicted value for the ith compound in the training set. However, internal validity alone is insufficient to describe the predictability of QSARs, and external validation is essential. To perform the external validation, the original data sets were divided into training sets (80%) and test sets (20%) by the Y-ranking method (Golbraikh and Tropsha, 2000): the individual datasets of KOC or BSAF were ranked by numerical value, and compounds were selected at fixed intervals to make up the test set, as marked in Table 1. The Y-ranking method largely guarantees that the training set and the test set cover the data uniformly. As a key index of predictability, the external predictive correlation coefficient ($Q^2_{\mathrm{EXT}}$) of the test set is defined in Eq. (3).
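The Y-ranking split described above can be sketched as follows. The text does not state the sampling interval or the starting rank, so `every=5` and `offset=1` are assumptions; with these values the sketch happens to reproduce the asterisked lgKOC test compounds of Table 1. The function name `y_ranking_split` is hypothetical, and the data are the observed lgKOC values of Table 1.

```python
def y_ranking_split(responses, every=5, offset=1):
    """Split compounds into (training, test) lists by ranked response.

    responses : dict mapping compound name -> observed response
    every     : assumed sampling interval (not given in the text)
    offset    : assumed zero-based rank of the first test compound
    """
    # Ascending rank; sorted() is stable, so ties keep their input order.
    ranked = sorted(responses, key=responses.get)
    test = ranked[offset::every]
    chosen = set(test)
    training = [name for name in ranked if name not in chosen]
    return training, test


# Observed lgKOC values of the 27 measured compounds in Table 1.
LG_KOC = {
    "DMP": 1.50, "DEP": 1.76, "Nap": 2.06, "DAP": 2.40, "DPrP": 2.49,
    "BBEP": 2.60, "Fle": 3.24, "DBP": 3.38, "Phe": 3.58, "Ant": 3.65,
    "BBP": 3.70, "Pyr": 3.90, "B[a]A": 4.08, "Chr": 4.08, "B[b]F": 4.45,
    "B[k]F": 4.45, "B[a]P": 4.45, "DNPP": 4.57, "D[a]A": 4.72,
    "I[1]P": 4.73, "B[g]P": 4.77, "DPhP": 4.96, "DChP": 5.08,
    "DEHP": 6.51, "DnOP": 7.11, "DIDP": 7.21, "DNP": 7.57,
}

training, test = y_ranking_split(LG_KOC)
print(len(training), len(test))  # 21 6
print(test)
```

Picking every fifth ranked compound gives roughly the 80%/20% proportion while spreading the test compounds over the whole response range.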

2.2. Molecular structural parameter computations

The quantum chemical parameters were calculated with the Gaussian 03 program (Frisch et al., 2003). The semi-empirical molecular orbital method AM1 was first used to roughly locate the energetic minima of PAHs and PAEs. For higher accuracy, the three-parameter hybrid functional B3PW91 of DFT with the 6-311G(d,p) basis set was then employed to optimize the initial structures of PAHs and PAEs without a priori symmetry restriction. Based on the optimized electronic structures, the structural descriptors supposed to be associated with soil adsorption and bioaccumulation were calculated. The pool of structural descriptors primarily involved the molecular volume (V, within the isocontour of charge density of 0.001 e Bohr−3) (Wong et al., 1995), the frontier orbital energies, and polarity-related terms, e.g. the dipole moment (μ) and the diagonal components of the quadrupole moment tensor (Qxx, Qyy, Qzz). The average molecular polarizability (α), which indicates the overall deformability of the molecule under the action of an external electric field or the electric field of an adjacent molecule, was calculated from its three tensor components. To describe the molecular reactivity and electron transfer potential, the absolute electronegativity (χ), molecular hardness (η) and maximum electron transfer (ΔNmax) (Parr et al., 1999) were also included. Moreover, thermodynamic properties from vibrational analysis, including the total molecular energy (ET), heat capacity at constant volume (CV), Gibbs free energy (G) and entropy (S), were considered. All the derived structural descriptors are listed in Table S1 in the supplementary material.
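As a minimal sketch of how some of these descriptors follow from the raw quantum chemical output, the snippet below averages the three diagonal polarizability components and derives χ and η from the frontier orbital energies. The text does not give the working formulas, so the Koopmans-type finite-difference expressions used here are an assumption (one common convention; some authors define η without the factor 1/2), and ΔNmax is left out because its form depends on that convention. All function names and numerical inputs are hypothetical.

```python
def mean_polarizability(a_xx: float, a_yy: float, a_zz: float) -> float:
    """Average polarizability from the three diagonal tensor components."""
    return (a_xx + a_yy + a_zz) / 3.0


def electronegativity(e_homo: float, e_lumo: float) -> float:
    """Absolute electronegativity, assumed chi = -(E_HOMO + E_LUMO) / 2."""
    return -(e_homo + e_lumo) / 2.0


def hardness(e_homo: float, e_lumo: float) -> float:
    """Molecular hardness, assumed eta = (E_LUMO - E_HOMO) / 2."""
    return (e_lumo - e_homo) / 2.0


# Hypothetical inputs in atomic units, for illustration only.
alpha = mean_polarizability(150.0, 120.0, 60.0)
chi = electronegativity(-0.22, -0.04)
eta = hardness(-0.22, -0.04)
print(alpha, round(chi, 3), round(eta, 3))
```

Both χ and η stay in the energy unit of the orbital energies, so they should be converted consistently before being mixed with descriptors reported in other units.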

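$$Q^2_{\mathrm{EXT}} = 1 - \frac{\sum_{i=1}^{n_{\mathrm{EXT}}}\left(y_i - y_i^{\mathrm{pred}}\right)^2}{\sum_{i=1}^{n_{\mathrm{EXT}}}\left(y_i - y_{\mathrm{mean}}^{\mathrm{EXT}}\right)^2} \tag{3}$$

where $n_{\mathrm{EXT}}$ is the number of compounds in the test set and $y_{\mathrm{mean}}^{\mathrm{EXT}}$ is the average response value in the test set. Besides $Q^2_{\mathrm{EXT}}$, the standard error of prediction (SEP) of the test set was also used to assess the external predictability, as given by Eq. (4) (dos Reis et al., 2014).

$$\mathrm{SEP} = \sqrt{\frac{\sum_{i=1}^{n_{\mathrm{EXT}}}\left(y_i - y_i^{\mathrm{pred}}\right)^2}{n_{\mathrm{EXT}}}} \tag{4}$$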
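As a worked check of Eqs. (3) and (4), the sketch below recomputes the external statistics from the test-set rows of Table 1. It assumes that SEP is the root of the mean squared prediction error and that the mean in Eq. (3) is taken over the test set. Because Table 1 lists values rounded to two decimals, the results agree with the reported figures only to within rounding. The function names are hypothetical.

```python
import math


def q2_ext(observed, predicted):
    """External predictive correlation coefficient, Eq. (3)."""
    mean_obs = sum(observed) / len(observed)  # test-set mean
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def sep(observed, predicted):
    """Standard error of prediction, Eq. (4), taken as an RMS error."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))


# Test-set rows of Table 1.
# lgKOC (asterisks): DEP, Fle, Pyr, B[a]P, DPhP, DNP
LGKOC_OBS = [1.76, 3.24, 3.90, 4.45, 4.96, 7.57]
LGKOC_PRED = [2.71, 2.37, 2.97, 4.94, 4.04, 7.49]
# BSAF (hashtags): DMP, B[a]P, B[g]P, DnOP
BSAF_OBS = [8.02, 0.19, 0.13, 1.62]
BSAF_PRED = [6.52, 0.23, 0.07, 3.56]

print(q2_ext(LGKOC_OBS, LGKOC_PRED), sep(LGKOC_OBS, LGKOC_PRED))
print(q2_ext(BSAF_OBS, BSAF_PRED), sep(BSAF_OBS, BSAF_PRED))
```

The recomputed values lie close to the reported statistics (0.809 and 0.775 for lgKOC, 0.857 and 1.227 for BSAF), which supports reading Eq. (4) with the square root.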

2.4. Description of applicability domain

To describe the predictive reliability of the QSARs, the applicability domain (AD) was determined graphically through the Williams plot, in which the standardized cross-validated residuals (σ) are plotted against the leverage values (hi) of pollutants. The Williams plot allows immediate detection of response outliers and structurally influential pollutants (Tropsha et al., 2003). When the standardized residual fell outside the interval (−2.5, +2.5), the experimental datum of the pollutant was categorized as an outlier on account of the large prediction bias. The leverage hi of the ith (i = 1, …, n) pollutant was calculated as the diagonal element of the hat matrix, i.e. $x_i^{T}(X^{T}X)^{-1}x_i$ in the original variable space, where xi is the descriptor vector of the ith pollutant and X the matrix derived from

2.3. QSARs development and validation

In this work, many quantum chemical descriptors were calculated as potential predictors of the soil adsorption and bioaccumulation of PAHs/PAEs, and PLS regression analysis was employed to overcome the possible multicollinearity among them.


Table 1 Observed and predicted adsorption indexes and bioavailabilities of PAHs and PAEs by DFT-derived QSARs.

| Compound a | CAS | lgKOC observed | lgKOC predicted | lgKOC residual b | BSAF observed | BSAF predicted | BSAF residual b |
|---|---|---|---|---|---|---|---|
| DMP# | 131-11-3 | 1.50 | 1.76 | −0.26 | 8.02 | 6.52 | 1.50 |
| DEP* | 84-66-2 | 1.76 | 2.71 | −0.95 | 5.07 | 5.41 | −0.34 |
| Nap | 91-20-3 | 2.06 | 2.05 | 0.01 | – | 2.01 | – |
| DAP | 131-17-9 | 2.40 | 3.19 | −0.79 | – | 4.02 | – |
| DPrP | 131-16-8 | 2.49 | 2.98 | −0.49 | – | 4.52 | – |
| BBEP | 117-83-9 | 2.60 | 3.74 | −1.14 | – | 3.22 | – |
| Fle* | 86-73-7 | 3.24 | 2.37 | 0.87 | 1.11 | 1.52 | −0.41 |
| DBP | 84-74-2 | 3.38 | 3.67 | −0.29 | – | 3.85 | – |
| Phe | 85-01-8 | 3.58 | 3.06 | 0.52 | 3.20 | 1.53 | 1.67 |
| Ant | 120-12-7 | 3.65 | 2.48 | 1.17 | 1.40 | 0.97 | 0.43 |
| BBP | 85-68-7 | 3.70 | 4.09 | −0.39 | 2.97 | 2.34 | 0.63 |
| Pyr* | 129-00-0 | 3.90 | 2.97 | 0.93 | 1.40 | 0.87 | 0.53 |
| B[a]A | 203-33-8 | 4.08 | 4.07 | 0.01 | 0.17 | 0.33 | −0.16 |
| Chr | 218-01-9 | 4.08 | 3.95 | 0.13 | 0.23 | 0.89 | −0.66 |
| B[b]F | 205-99-2 | 4.45 | 4.53 | −0.08 | 0.12 | 0.86 | −0.74 |
| B[k]F | 207-08-9 | 4.45 | 4.24 | 0.21 | 0.14 | 0.44 | −0.30 |
| B[a]P*# | 50-32-8 | 4.45 | 4.94 | −0.49 | 0.19 | 0.23 | −0.04 |
| DNPP | 131-18-0 | 4.57 | 3.70 | 0.87 | – | 3.69 | – |
| D[a]A | 5385-75-1 | 4.72 | 4.86 | −0.14 | – | 0.37 | – |
| I[1]P | 193-39-5 | 4.73 | 5.66 | −0.93 | 0.16 | 0.18 | −0.02 |
| B[g]P# | 191-24-2 | 4.77 | 4.40 | 0.37 | 0.13 | 0.07 | 0.06 |
| DPhP* | 84-62-8 | 4.96 | 4.04 | 0.92 | – | 1.02 | – |
| DChP | 84-61-7 | 5.08 | 4.55 | 0.53 | – | 2.95 | – |
| DEHP | 117-81-7 | 6.51 | 6.12 | 0.39 | 2.60 | 2.89 | −0.29 |
| DnOP# | 117-84-0 | 7.11 | 6.91 | 0.20 | 1.62 | 3.56 | −1.94 |
| DIDP | 26761-40-0 | 7.21 | 7.12 | 0.09 | – | 1.20 | – |
| DNP* | 84-76-4 | 7.57 | 7.49 | 0.08 | – | 3.31 | – |
| Acy | 208-96-8 | – | 1.79 | – | – | 2.03 | – |
| Flu | 206-44-0 | – | 3.42 | – | – | 1.19 | – |
| B[e]P | 192-97-2 | – | 4.92 | – | 0.18 | 0.52 | −0.34 |

a Full names are shown in Table S1; compounds marked with asterisks (*) and hashtags (#) make up the test sets of lgKOC and BSAF respectively.
b Residual refers to the difference between the observed and predicted lgKOC or BSAF.

excluded from the descriptor pool in the next round of QSAR development because its effect on the variability of lgKOC or BSAF is the weakest. By gradually culling the weakest descriptors until only two descriptors were left, a series of QSAR models with different performance were developed. The overall comparison of the QSARs developed for lgKOC verified that retaining four descriptors, namely the molecular polarizability α and its tensor component, the total molecular energy ET and the molecular volume V, made the model perform best and gave the largest $R^2_{y,\mathrm{cum(adj)}}$ and $Q^2_{\mathrm{cum}}$, which was indicative of the best correlation of the structural descriptors with soil adsorption. With three effective principal components extracted by PLS, 86.3% of the overall variance of lgKOC was explained, and SE (0.568) was small. As an important parameter characterizing the model performance, $Q^2_{\mathrm{cum}}$ was calculated as 0.801, much larger than the commonly used critical value of 0.5, so that it sufficiently represents good internal predictability. The difference of less than 0.3 between $R^2_{y,\mathrm{cum(adj)}}$ and $Q^2_{\mathrm{cum}}$ indicates the absence of over-fitting in the model. The predicted values of lgKOC are shown in Table 1, and the small residuals corroborated the good correlation with the experimental lgKOC in the training set. The external predictive correlation coefficient $Q^2_{\mathrm{EXT}}$ was 0.809 and the standard error of prediction SEP of the test set was 0.775, consistent with the performance exhibited in the training set. In Fig. 1(a), the predicted lgKOC values in the test set were well linearly correlated with the observed values, with R2 up to 0.831. The good stability and external predictability of the developed QSAR for KOC of PAHs and PAEs in soil were thus confirmed.
For the bioaccumulation of PAHs and PAEs by carrot, the structural characteristics were found to exert a great influence and to intrinsically underlie the variance of bioavailability of different analogues in the same soil matrix. Through the successive development of QSARs and performance comparison, the background noise from the multicollinearity among descriptors was diminished as far as possible, and the

training set. The leverage measures the structural influence of a pollutant. In general, consistency between the predicted and the observed data is ascertained when the leverage is lower than the warning value h*, which is three times the average leverage and is numerically equal to three times the ratio of the number of predictors plus one to n. When the leverage of a pollutant in the training set is larger than the warning leverage, i.e. hi > h*, its structural characteristics are generally considered to be much more influential on the stability of the QSAR, and the inclusion of such a pollutant would enhance the interpretability. If the leverage of a pollutant in the test set is likewise higher than the warning leverage (hi > h*) and its standardized residual is large, its structural characteristics might not be as relevant to the variance of properties as those derived from the training set. However, if the standardized cross-validated residual of the pollutant is small, the QSAR is considered to have a certain degree of extrapolation ability, and the pollutant cannot be regarded as an outlier.

3. Results and discussion

3.1. Development and validation of QSARs

The relationship between the structural descriptors, namely the DFT-calculated electronic and thermodynamic parameters, and lgKOC or BSAF was established by PLS analysis for PAHs and PAEs. To reduce the background noise of the PLS analysis, QSARs were developed with successive removal of the less influential structural descriptors and iterative comparison of model quality. To identify the least influential descriptor, the variable importance in the projection (VIP) indices, which reflect the ability to interpret the variance of soil adsorption or bioaccumulation, were sorted numerically. The descriptor with the lowest VIP would be


fitting and cross-validation was generated. When Y′ is identical to the correctly assigned lgKOC or BSAF, the correlation coefficient $R^2_{(Y,Y')}$ equals 1, and the statistics $R^2_{\mathrm{cum}(Y',x)}$ and $Q^2_{\mathrm{cum}(Y',x)}$ are the true cumulative explained variance and cumulative predicted fraction of the model, respectively. Conversely, the values of $R^2_{\mathrm{cum}(Y',x)}$ and $Q^2_{\mathrm{cum}(Y',x)}$ at $R^2_{(Y,Y')} = 0$ are taken as the intercepts of the fitted lines of $R^2_{\mathrm{cum}(Y',x)}$ vs. $R^2_{(Y,Y')}$ and $Q^2_{\mathrm{cum}(Y',x)}$ vs. $R^2_{(Y,Y')}$ respectively, as shown in Fig. S1 of the Supplementary Material. If the intercepts of the regression lines are greater than 0.3 and 0.05 respectively, the original QSARs may be over-fitted. In this study, the intercepts were 0.068 and −0.366 for lgKOC, and −0.053 and −0.254 for BSAF, which were far below the warning values and indicative of the absence of over-fitting, over-prediction or chance correlation in the QSARs obtained here. As written below, the scaled and centered regression coefficients and constants obtained by PLS analysis lead to the quantitative relationships for soil adsorption and bioaccumulation of PAHs and PAEs respectively.

$$\lg K_{\mathrm{OC}} = 2.708 - 0.866\,V + 0.814\,\alpha + 0.881\,E_{\mathrm{T}} + 0.096\,\alpha_{yy} \tag{5}$$

n (training set) = 21, A = 3, $R^2_{y,\mathrm{cum(adj)}}$ = 0.863, $Q^2_{\mathrm{cum}}$ = 0.801, SE = 0.568, n (test set) = 6, $R^2_{\mathrm{EXT}}$ = 0.831, $Q^2_{\mathrm{EXT}}$ = 0.809, SEP = 0.775, p = 0.000

$$\mathrm{BSAF} = 0.915 - 0.511\,Q_{yy} - 0.483\,E_{\mathrm{HOMO}} \tag{6}$$

n (training set) = 13, A = 1, $R^2_{y,\mathrm{cum(adj)}}$ = 0.824, $Q^2_{\mathrm{cum}}$ = 0.821, SE = 0.635, n (test set) = 4, $R^2_{\mathrm{EXT}}$ = 0.871, $Q^2_{\mathrm{EXT}}$ = 0.857, SEP = 1.227, p = 0.000
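The permutation test used in this section to screen for chance correlation can be illustrated with a simplified sketch. It is not the SIMCA-P procedure: a one-descriptor least-squares fit stands in for the PLS model, and only the R2 intercept is computed (the Q2 intercept would need a cross-validation loop). The function names, the seed and the data are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation; equals R^2 of a one-descriptor fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def permutation_intercept(x, y, n_perm=250, seed=0):
    """Intercept of R^2(Y', x) regressed on R^2(Y, Y') over permutations."""
    rng = random.Random(seed)
    # The unpermuted model sits at R^2(Y, Y') = 1.
    points = [(1.0, r_squared(x, y))]
    for _ in range(n_perm):
        y_perm = list(y)
        rng.shuffle(y_perm)  # randomly reassign the responses
        points.append((r_squared(y, y_perm), r_squared(x, y_perm)))
    n = len(points)
    mu = sum(u for u, _ in points) / n
    mv = sum(v for _, v in points) / n
    slope = (sum((u - mu) * (v - mv) for u, v in points)
             / sum((u - mu) ** 2 for u, _ in points))
    return mv - slope * mu


# Hypothetical descriptor and response with a genuine linear trend plus noise.
x = [float(i) for i in range(12)]
y = [2.0 * a + 1.0 + d for a, d in zip(x, [0.3, -0.2, 0.1, 0.4, -0.5, 0.2,
                                           -0.1, 0.3, -0.4, 0.0, 0.2, -0.3])]
print(permutation_intercept(x, y))
```

A small intercept means that models fitted to shuffled responses explain little variance, which is the behavior expected when the original correlation is not accidental.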

Since the QSARs were derived from limited experimental data, it was necessary to define the AD for the prediction of PAH and PAE analogues. Each model has a specific AD, which is determined by the pollutants and the descriptors used to develop it. Different choices of structural descriptors for the same training set can produce many QSARs, whose applicability domains differ with the descriptors used. As can be seen from the Williams plot in Fig. 2(a), all compounds fell within the area bounded by the solid lines, indicating that the model AD contained all of the PAHs and PAEs in this study. In Fig. 2(b), the leverages of DEHP and diethyl phthalate (DEP) in the training set exceeded h*, implying that the structural characteristics of the ester chain are more sensitive and could exert significant influence on the QSAR. It is worth noting that the leverages of some pollutants in the test set exceeded h*, whereas the absolute values of their standardized residuals were less than 2.5σ and no abnormal value was detected. Such test-set compounds with large leverages are therefore deemed good leverage chemicals (Gramatica et al., 2007), which indicate accurate extrapolation by the QSARs. For both lgKOC and BSAF, notably, the leverages of most compounds in the test set were positioned to the right of the leverage lines, while the absolute standardized residuals were less than 2.5σ. The presence of pollutants with higher leverages and lower residuals in the test set supports the extrapolative ability of the QSARs for accurate prediction.
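The reading of the Williams plot described above can be summarized in a short sketch. It assumes the warning leverage is three times the number of predictors plus one, divided by the number of training compounds, with the predictor counts taken from Eqs. (5) and (6); the resulting h* values are derived here and are not quoted from the paper. The function names and zone labels are hypothetical.

```python
def warning_leverage(n_predictors: int, n_training: int) -> float:
    """Warning leverage h* = 3 * (number of predictors + 1) / n."""
    return 3.0 * (n_predictors + 1) / n_training


def williams_zone(leverage: float, std_residual: float,
                  h_star: float, limit: float = 2.5) -> str:
    """Classify one compound by its position in the Williams plot."""
    if abs(std_residual) > limit:
        return "response outlier"
    if leverage > h_star:
        # High leverage but small residual: influential, yet well predicted.
        return "good leverage point"
    return "inside AD"


# Descriptor counts and training-set sizes from Eqs. (5) and (6).
h_koc = warning_leverage(4, 21)   # 15/21, about 0.714
h_bsaf = warning_leverage(2, 13)  # 9/13, about 0.692
print(round(h_koc, 3), round(h_bsaf, 3))

# Hypothetical leverage and residual pairs, for illustration only.
print(williams_zone(0.20, 0.8, h_koc))
print(williams_zone(0.90, -1.1, h_koc))
print(williams_zone(0.30, 3.0, h_koc))
```

Checking the residual before the leverage mirrors the text: a compound beyond h* counts against the model only when its standardized residual is also large.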

Fig. 1. Linear relationships between the observed and predicted lgKOC (a) and BSAF (b) of PAHs and PAEs in soil.

preserved descriptors, viz. the energy of the highest occupied molecular orbital EHOMO and the component of the quadrupole moment along the Y-axis Qyy, ultimately supported the higher performance of the finally accepted QSAR. It can be seen from Fig. 1(b) that the experimental BSAF values in the training set were close to the predicted values. The cumulative variance coefficient $R^2_{y,\mathrm{cum(adj)}}$ was 0.824 with a small SE. The high cumulative cross-validated correlation coefficient $Q^2_{\mathrm{cum}}$ (0.821) verified the satisfactory internal predictability. The difference between $R^2_{y,\mathrm{cum(adj)}}$ and $Q^2_{\mathrm{cum}}$ was much less than 0.3, indicating that the model is not over-fitted (Golbraikh and Tropsha, 2002). In the test set, the SEP of 1.227 and the predictive correlation coefficient of 0.857 further reinforced the good stability and predictability of the QSAR for the bioaccumulation of PAHs/PAEs.

To further evaluate their validity and reliability, the derived QSARs were subjected to a permutation test. The lgKOC or BSAF values, denoted as the Y variable, were randomly shuffled among compounds. The permutation was repeated 250 times, and each time a set of incorrectly assigned lgKOC or BSAF values, denoted as Y′, for model

3.2. Comparison with previous QSARs

To develop structure-activity relationships for soil adsorption, various techniques, such as multiple linear regression (MLR), best multilinear regression (BMLR) (Kahn et al., 2005), ordinary least squares (OLS), topological sub-structural molecular design (TOPS-MODE) and PLS, have been proposed and employed in the successful derivation of QSARs, as illustrated in Table 2. The pivotal statistics of previous QSARs could verify their good performance and applicability for prediction; however, these studies were not particularly


conducted for the typical group of PAHs and PAEs in soil. On the whole, the QSAR established by the classical PLS algorithm could explain more of the soil adsorption variance than previous studies, as validated by the higher R2 of 0.863 and the relatively small error. The introduction of more accurate structural descriptors by DFT computation possibly accounts for its improved quality. Relative to the work by Wang et al. (2015), our QSAR shows slightly improved or comparable performance, in addition to the indispensable external validation and a clearly specified AD. Although the SE value was reduced to a greater extent by TOPS-MODE, the absence of external validation and AD is expected to restrict its predictive application, and the involvement of obscure codes from topological sub-structure, i.e. the molecular connectivities, might not facilitate interpretation of the underlying mechanism for soil adsorption of pesticides (Gonzalez et al., 2006). By MLR, the soil adsorption of PAEs was indicated to be structurally affected by molecular polarizability (Yang et al., 2013), which is consistently regarded here as the structural determinant for soil adsorption of PAHs and PAEs. Moreover, the electronic and thermodynamic descriptors calculated by DFT were considered more fully and could help provide a better understanding of the molecular mechanism of soil adsorption. As parameterized for bioaccumulation in fish, the BCFs of some organic pollutants in water were reported to be related not only to KOW but also to the structural descriptors of the pollutants (Jackson et al., 2009; Qin et al., 2009). In contrast, the correlation of BSAF with peripheral factors for organic pollutants in soil is much more complicated, because mass transfer by soil adsorption/desorption is involved in bioaccumulation in addition to the uptake/release between soil pore water and plant.
A few QSAR studies on BSAFs of organic pollutants in soil-plant systems, such as zucchini, have been published, and the energy of the lowest unoccupied molecular orbital ELUMO was regarded as one of the most significant descriptors for predicting BSAF (Bordás et al., 2011). Most studies on bioaccumulation of pollutants in plants simply associated the BSAF with KOW and did not provide the inherent structural determinants for interpreting its variance in soil (Bordás et al., 2011; Trapp and Legind, 2011; Zohair et al., 2006). In contrast, the quantum chemical calculation allowed the introduction of the relevant electronic properties and energy levels, e.g. Qyy and EHOMO, that were necessary for predicting BSAF in the QSAR. It was thus believed that the QSAR derived from the accurate structural computation by DFT could deepen the understanding of the molecular mechanism of bioaccumulation of PAHs and PAEs.

Fig. 2. Williams plots of PLS-derived QSAR for lgKOC (a) and BSAF (b). The solid lines for outlier detection are drawn at ±2.5 units of standardized residuals and the dashed lines mark the warning leverage.

3.3. QSAR analysis for soil adsorption and bioavailability

In order to clarify the structural influences on the variation of lgKOC of PAHs and PAEs, the effective loading weights of each structural descriptor in the principal components were calculated by PLS analysis. Given the larger cumulative explained variances, the weights in the first principal component (w*c[1]) were plotted against those in the second one (w*c[2]), as shown in Fig. 3(a). By

Table 2 Comparison of QSAR performance with previous studies on soil adsorption.

| Algorithms | Reference | Descriptors (a) | R2 | Q2 | SE | Q2EXT | SEP | AD (b) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MLR | Wen et al. (2012) | lgKOW | 0.79 | – | 0.49 | – | 0.61 | N |
| OLS | Gramatica et al. (2007) | VED1, nHAcc, MAXDP, CIC0 | 0.82 | 0.80 | 0.523 | 0.78 | 0.56 | Y |
| MLR | Wang et al. (2015) | MLOGP2, α, O-058, ATSC8v, nN, nROH, P-117, SpMaxA_G/D, Mor16u | 0.854 | 0.850 | 0.472 | 0.761 | 0.558 | Y |
| TOPS-MODE | Gonzalez et al. (2006) | μ15, μ4, μ1, μ5, μ7P, μ10 (superscripts Dip, Dip, Dist, H, H) | 0.838 | 0.812 | 0.370 | | | N |
| BMLR | Kahn et al. (2005) | η | 0.760 | 0.730 | 0.439 | – | 0.500 | N |
| PLS | This study | V, α, ET, αyy | 0.863 | 0.801 | 0.568 | 0.809 | 0.775 | Y |

(a) Descriptors of Weighted Holistic Invariant Molecular employed in previous work by Gramatica et al. (2007), Wang et al. (2015) and Gonzalez et al. (2006), respectively.
(b) Y or N signifies whether AD is presented or not.


lgKOC, as illustrated in the mathematical formulation of the QSAR proposed above. The Y-axis tensor component of polarizability, αyy, likewise stresses the significance of dispersion or Debye forces for soil adsorption, as αyy was positively correlated with the mean polarizability (R2 = 0.786). The total energy of the molecule ET provides information about the molecular stability of PAHs and PAEs. An increase of ET often signifies lower inertness of the molecule and a higher energy penalty for cavity formation in water, and conversely an increased opportunity to contact soil organic matter through van der Waals attraction. These may be the main reasons for the positive connection of ET with soil adsorption. By PLS analysis, the VIPs of the structural descriptors were obtained to quantitatively evaluate their impact on the soil adsorption of PAHs and PAEs. If the VIP value is greater than 1, the relevant parameter can be considered to play a more important role in model interpretation. As shown by the histograms in Fig. S2 of the Supplementary Material, the maximum VIP of 1.068 belonged to the molecular polarizability, followed by ET and αyy in turn; these were considered the key structural factors related to the variance of soil adsorption. The scores (t[1] vs. t[2]) of observations scattered along the axes of the first and second principal components highlight the typical structural influences on soil adsorption of PAHs and PAEs in Fig. 3(b), and no outliers are situated outside the ellipse of the 95% confidence interval. In the first principal component, complex structural information beyond the number of aromatic rings and the length of ester chains could be decoded, while the discrepancy between the PAH and PAE groups is manifested by the double-layered arrangement of scores in the second principal component.
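The VIP criterion used in this paragraph can be written out explicitly. The source does not state which formula its PLS software applied, so the expression below is a commonly used definition offered as a sketch. The notation is assumed for this illustration: p is the number of descriptors, A the number of extracted components, w_{ja} the weight of descriptor j in component a, and SSY_a the sum of squares of the response explained by component a.

```latex
\mathrm{VIP}_j = \sqrt{\frac{p \sum_{a=1}^{A} \mathrm{SSY}_a \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^2}{\sum_{a=1}^{A} \mathrm{SSY}_a}},
\qquad
\sum_{j=1}^{p} \mathrm{VIP}_j^2 = p .
```

Because the squared VIPs sum to p under this definition, their mean is 1, which is why a VIP above 1 is read as an above-average contribution to the model.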
As regards the bioavailability of PAHs and PAEs in soil, the explained variance of BSAF was raised as high as possible by the QSAR, although only one principal component was finally extracted by PLS. In this principal component, the concentrated structural information on the electronic properties, viz. the quadrupole moment tensor Qyy and the energy of the highest occupied molecular orbital EHOMO, was clearly decoded by the effective loading weights (w*c) of −0.687 and −0.727, respectively, and the negative correlation of the two descriptors with BSAF was shown. The quadrupole moment describes the deviation of the overall charge distribution from spherical symmetry (Mhin et al., 2002). A larger Qyy means increased polarity along the Y axis, so that the electron density distribution is more uneven. During the dynamic equilibrium partitioning of pollutants among soil, soil pore water and plant, the electrostatic interaction with soil would be enhanced as the spherical deviation of the overall molecular charge increases, so that the bioavailable species in soil pore water are much reduced and cannot promptly satisfy the need of uptake by carrot. The significance of the frontier molecular orbital energy EHOMO rests with the possibility of pollutants interacting with soil by electron donation (Saçan et al., 2004). Unlike the uptake of polychlorinated dioxins, furans and biphenyls in zucchini (Bordás et al., 2011), PAHs/PAEs preferred to act as electron donors in interacting with soil. In Fig. 4, the electron populations of the HOMO showed the preferable arrangement of outer-sphere electrons, particularly on the whole aromatic rings for PAHs and on the phenyl and ester groups for PAEs, respectively, and by comparison the reactive sites of PAHs are more evenly spread across the molecule than those of PAEs.
As regards the relatively higher EHOMO of PAHs, the electrons in the meshed contours of phenanthrene (Phe, −578.61 kJ/mol) or benzo[b]fluoranthene (B[b]F, −577.11 kJ/mol) are more likely to be donated than those of dimethyl phthalate (DMP, −693.32 kJ/mol) or di-n-octyl phthalate (DnOP, −702.19 kJ/mol) in Fig. 4, so that the soil contact with the aromatic motifs of PAHs would be much more reinforced through electron delocalization. Because soil adsorption becomes more favorable for PAHs through either strengthened electrostatic interactions or electron transfer, the bioaccumulation by carrot would hardly be accelerated, as a result of the reduction of bioavailable species. The overall lower BSAFs of PAHs in Table 1 might be partially attributed to their higher electron donation capacities compared with those of PAEs. Through PLS analyses, the VIPs of different

Fig. 3. (a) Loading plot of descriptors; (b) scores plot of PAHs/PAEs, where the solid legends are marked for PAEs with different lengths of side chains and the hollow for PAHs with different aromatic ring numbers.

contrast, the first principal component mainly concentrates on the structural information yielded from the total molecular energy and polarizability-related terms, for which the absolute weights are 0.480, 0.510 and 0.575, respectively. In correspondence with the largest weight, the molecular polarizability should be the structural factor most responsible for the variance of soil adsorption of PAHs and PAEs. Because the molecular polarizability describes the deformation of the three-dimensional electron density and distribution under the influence of an external electric field, London dispersion forces (induced dipole-induced dipole interactions) and Debye forces (dipole-induced dipole interactions) (Abraham et al., 2004) are supposed mechanistically to regulate the soil adsorption behavior and the magnitude of lgKOC. The larger the mean polarizability, the larger the charge deformation, which makes the molecules more accessible to the soil organic phase by dispersion interaction and consequently strengthens the soil adsorption (Liu and Yu, 2005). Therefore, there is a significant positive correlation between molecular polarizability and


Fig. 4. Electron arrangement surface of HOMO of PAHs and PAEs calculated at B3PW91/6-311G** level.

descriptors were calculated to quantitatively examine their leading roles in the bioaccumulation of PAHs and PAEs. As Qyy was depicted with a maximum VIP value of 1.028, the spherical polarity change of molecules might be the most important structural requirement for the bioavailability of PAHs and PAEs.
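For a PLS model with a single component, the VIP reduces to a function of the normalized weights alone, which makes the reported numbers easy to check by hand. The sketch below assumes that reduction, VIP_j = sqrt(p · w_j² / Σ w²), and uses the two effective loading weights quoted in the text (−0.687 and −0.727) as stand-ins for the PLS weights; the function name is hypothetical, and the authors' software may compute VIP differently.

```python
import math


def vip_one_component(weights):
    """VIP values for a one-component PLS model (assumed formula).

    With a single component, the explained-variance terms cancel and
    VIP_j = sqrt(p * w_j**2 / sum(w**2)), where p is the number of descriptors.
    """
    p = len(weights)
    norm_sq = sum(w * w for w in weights)
    return [math.sqrt(p * w * w / norm_sq) for w in weights]


# Weights in the order the text lists them; the mapping of values to
# descriptors follows that order and is an assumption of this example.
vips = vip_one_component([-0.687, -0.727])
print([round(v, 3) for v in vips])
```

Under these assumptions the two values come out at about 0.971 and 1.028, the latter belonging to the weight of larger magnitude, so the assignment of values to descriptors is worth checking against the original PLS output.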

3.4. Correlation between soil adsorption and bioavailability

Soil adsorption is an important driving factor that determines the equilibrium distribution of pollutants between the soil particle and soil pore water phases, whereas bioaccumulation is more complicated and involves the equilibrium distribution among the soil particle, soil pore water and biota phases. The release of persistent organic pollutants (POPs) from soil particles was previously suggested as the key step for uptake into plant roots (Inui et al., 2008). In general, soil adsorption depends strongly on molecular hydrophobicity, as illustrated by the R2 of 0.809 here. Moreover, a linear correlation between the bioavailability and hydrophobicity, namely lgKOW, was also established. The negative correlation indicates that the bioavailability of aromatic hydrocarbon pollutants in soil appears to be superficially governed by hydrophobicity. Because the hydrophobicity can be decomposed into terms of electronic property and molecular shape (Li et al., 1999), the negative correlation with lgKOW corroborates the structural requirements deduced from the QSAR. The correlation between soil adsorption and bioavailability was analyzed by linear regression. As plotted on a logarithmic scale in Fig. 5, a significant negative correlation with a moderate R2 of 0.526 was shown between the bioavailability and soil adsorption. As suggested by Inui et al. (2008), the critical stages, i.e. (i) desorption of hydrophobic pollutants from soil particles into soil pore water, (ii) adsorption into roots and (iii) translocation into aerial parts, cooperatively decide the bioaccumulation efficacy of hydrophobic compounds from soil into plants. By QSAR analysis the bioavailability variance of PAHs and PAEs in soil is structurally attributed to the close contact with soil, so the soil adsorption seems to act as the forceful

Fig. 5. Linear relationship between lgBSAF and lgKOC.

constraint to hinder the bioaccumulation of PAHs and PAEs in soil. Owing to this negative correlation, the soil adsorption index could thus feasibly serve as a preliminary gauge of bioavailability and of the risks imposed on the soil ecosystem.

4. Conclusion

In summary, the soil adsorption and bioaccumulation of PAHs and PAEs in soil were mechanistically evaluated by PLS-derived QSAR analysis, and the key structural relevance with the variance of soil


adsorption and bioavailability was highlighted. The QSARs established in this study had strong stability, interpretability and predictability. The molecular polarizability and the associated dispersion interaction were predicted to be the most necessary structural requirement for the regulation of soil adsorption. Because the transpiration stream carries hydrophobic pollutants into the plant, stronger electrostatic interactions with the soil entity may be deduced to hinder the freely dissolved species in soil pore water and, in turn, the bioaccumulation. The bioavailability was more strongly correlated with the spherical polarity change of the PAH and PAE molecules. The mutual relationship between soil adsorption and bioaccumulation demonstrated that adsorption should be the forceful constraint on the bioavailability of PAHs and PAEs. This study provides a mechanistic explanation and data support for the ecological risk assessment of hydrophobic aromatic pollutants in soil.

or air using octanol/water and octanol/air partition ratios and a molecular connectivity index. Environ. Toxicol. Chem. 16, 2448–2456. https://doi.org/10.1002/etc.5620161203.
Fismes, J., Perrin-Ganier, C., Empereur-Bissonnet, P., Morel, J.L., 2002. Soil-to-Root transfer and translocation of polycyclic aromatic hydrocarbons by vegetables grown on industrial contaminated soils. J. Environ. Qual. 31, 1649–1656. https://doi.org/10.2134/jeq2002.1649.
Frisch, M.J., Trucks, G.W., Schlegel, H.B., Scuseria, G.E., Robb, M.A., Cheeseman, J.R., et al., 2003. Gaussian 03, Revision B.03. Gaussian Inc., Pittsburgh, PA.
Geladi, P., 1988. Notes on the history and nature of partial least squares (PLS) modelling. J. Chemom. 2, 231–246. https://doi.org/10.1002/cem.1180020403.
Golbraikh, A., Tropsha, A., 2000. Predictive QSAR modeling based on diversity sampling of experimental datasets for the training and test set selection. Mol. Divers. 5, 231–243. https://doi.org/10.1023/A:1021372108686.
Golbraikh, A., Tropsha, A., 2002. Beware of q2! J. Mol. Graph. Model. 20, 269–276. https://doi.org/10.1016/S1093-3263(01)00123-1.
Gonzalez, M.P., Helguera, A.M., Collado, I.G., 2006. A topological substructural molecular design to predict soil sorption coefficients for pesticides. Mol. Divers. 10, 109–118. https://doi.org/10.1007/s11030-005-9004-2.
Gramatica, P., Giani, E., Papa, E., 2007. Statistical external validation and consensus modeling: a QSPR case study for Koc prediction. J. Mol. Graph. Model. 25, 755–766. https://doi.org/10.1016/j.jmgm.2006.06.005.
Huang, W., Peng, Pa, Yu, Z., Fu, J., 2003. Effects of organic matter heterogeneity on sorption and desorption of organic contaminants by soils and sediments. Appl. Geochem. 18, 955–972. https://doi.org/10.1016/S0883-2927(02)00205-6.
Inui, H., Wakai, T., Gion, K., Kim, Y.-S., Eun, H., 2008. Differential uptake for dioxin-like compounds by zucchini subspecies. Chemosphere 73, 1602–1607. https://doi.org/10.1016/j.chemosphere.2008.08.013.
Jackson, S.H., Cowan-Ellsberry, C.E., Thomas, G., 2009. Use of quantitative structural analysis to predict fish bioconcentration factors for pesticides. J. Agric. Food Chem. 57, 958–967. https://doi.org/10.1021/jf803064z.
Jiang, Y., Wang, X., Wu, M., Sheng, G., Fu, J., 2011. Contamination, source identification, and risk assessment of polycyclic aromatic hydrocarbons in agricultural soil of Shanghai, China. Environ. Monit. Assess. 183, 139–150. https://doi.org/10.1007/s10661-011-1913-1.
Kahn, I., Fara, D., Karelson, M., Maran, U., Andersson, P.L., 2005. QSPR treatment of the soil sorption coefficients of organic pollutants. J. Chem. Inf. Model. 45, 94–105. https://doi.org/10.1021/ci0498766.
Kipopoulou, A.M., Manoli, E., Samara, C., 1999. Bioconcentration of polycyclic aromatic hydrocarbons in vegetables grown in an industrial area. Environ. Pollut. 106, 369–380. https://doi.org/10.1016/S0269-7491(99)00107-4.
Kong, S., Ji, Y., Liu, L., Chen, L., Zhao, X., Wang, J., et al., 2012. Diversities of phthalate esters in suburban agricultural soils and wasteland soil appeared with urbanization in China. Environ. Pollut. 170, 161–168. https://doi.org/10.1016/j.envpol.2012.06.017.
Li, X.H., Zhu, L.G., Yu, Q.S., 1999. Study on hydrophobic parameters of carboxylic acids by molecular connectivity. Comput. Appl. Chemistry 10–12.
Lin, H., Tao, S., Zuo, Q., Coveney, R.M., 2007. Uptake of polycyclic aromatic hydrocarbons by maize plants. Environ. Pollut. 148, 614–619. https://doi.org/10.1016/j.envpol.2006.11.026.
Liu, G., Yu, J., 2005. QSAR analysis of soil sorption coefficients for polar organic chemicals: substituted anilines and phenols. Water Res. 39, 2048–2055. https://doi.org/10.1016/j.watres.2005.03.030.
Mhin, B.J., Lee, J.E., Choi, W., 2002. Understanding the congener-specific toxicity in polychlorinated dibenzo-p-dioxins: chlorination pattern and molecular quadrupole moment. J. Am. Chem. Soc. 124, 144–148. https://doi.org/10.1021/ja016913q.
Mo, C.-H., Cai, Q.-Y., Tang, S.-R., Zeng, Q.-Y., Wu, Q.-T., 2009. Polycyclic aromatic hydrocarbons and phthalic acid esters in vegetables from nine farms of the pearl river delta, south China. Arch. Environ. Contam. Toxicol. 56, 181–189. https://doi.org/10.1007/s00244-008-9177-7.
Parr, R.G., Szentpaly, L.V., Liu, S., 1999. Electrophilicity index. J. Am. Chem. Soc. 121, 1922–1924.
Qin, H., Chen, J., Wang, Y., Wang, B., Li, X., Li, F., et al., 2009. Development and assessment of quantitative structure-activity relationship models for bioconcentration factors of organic pollutants. Chin. Sci. Bull. 54, 628–634. https://doi.org/10.1007/s11434-009-0053-2.
Saçan, M.T., Erdem, S.S., Özpınar, G.A., Balcıoglu, I.A., 2004. QSPR study on the bioconcentration factors of nonionic organic compounds in fish by characteristic root index and semiempirical molecular descriptors. J. Chem. Inf. Comput. Sci. 44, 985–992. https://doi.org/10.1021/ci0342167.
Sabljić, A., Güsten, H., Verhaar, H., Hermens, J., 1995. QSAR modelling of soil sorption. Improvements and systematics of log KOC vs. log KOW correlations. Chemosphere 31, 4489–4514. https://doi.org/10.1016/0045-6535(95)00327-5.
Sijm, D., Kraaij, R., Belfroid, A., 2000. Bioavailability in soil or sediment: exposure of different organisms and approaches to study it. Environ. Pollut. 108, 113–119. https://doi.org/10.1016/S0269-7491(99)00207-9.
Sun, J., Pan, L., Tsang, D.C.W., Zhan, Y., Zhu, L., Li, X., 2018. Organic contamination and remediation in the agricultural soils of China: a critical review. Sci. Total Environ. 615, 724–740. https://doi.org/10.1016/j.scitotenv.2017.09.271.
Trapp, S., Legind, C.N., 2011. Uptake of organic contaminants from soil into vegetables and fruits. In: Swartjes, F.A. (Ed.), Dealing with Contaminated Sites: from Theory towards Practical Application. Springer Netherlands, Dordrecht, pp. 369–408.
Tropsha, A., Gramatica, P., Gombar, V.K., 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb. Sci. 22, 69–77. https://doi.org/10.1002/qsar.200390007.
Undeman, E., Czub, G., McLachlan, M.S., 2009. Addressing temporal variability when

Declaration of competing interest

The authors declare no competing financial interest.

Funding

This work was supported by National Key Research and Development Program of China (2016YFD0800204, 2018YFC1801005), National Natural Science Foundation of China (21377138, 41977356), Frontier Project of Knowledge Innovation Engineering Field and “135” plans of CAS (ISSASIP1618) and Key Research Program of Frontier Sciences, CAS (QYZDJ-SSW-DQC035).

Declaration of competing interest

Authors claim no conflict of interest.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.envres.2019.108838.

References

Abdel-Shafy, H.I., Mansour, M.S.M., 2016. A review on polycyclic aromatic hydrocarbons: source, environmental impact, effect on human health and remediation. Egypt. J. Pet. 25, 107–123. https://doi.org/10.1016/j.ejpe.2015.03.011.
Abraham, M.H., Ibrahim, A., Zissimos, A.M., 2004. Determination of sets of solute descriptors from chromatographic measurements. J. Chromatogr. A 1037, 29–47. https://doi.org/10.1016/j.chroma.2003.12.004.
Bansal, V., Kim, K.H., 2015. Review of PAH contamination in food products and their health hazards. Environ. Int. 84, 26–38. https://doi.org/10.1016/j.envint.2015.06.016.
Barber, J.L., Thomas, G.O., Kerstiens, G., Jones, K.C., 2004. Current issues and uncertainties in the measurement and modelling of air–vegetation exchange and within-plant processing of POPs. Environ. Pollut. 128, 99–138. https://doi.org/10.1016/j.envpol.2003.08.024.
Bordás, B., Bélai, I., Kőmíves, T., 2011. Theoretical molecular descriptors relevant to the uptake of persistent organic pollutants from soil by zucchini. A QSAR study. J. Agric. Food Chem. 59, 2863–2869. https://doi.org/10.1021/jf1038772.
Chen, J., Wang, X.J., Hu, J.D., Tao, S., Liu, W.X., 2005. Adsorption of polycyclic aromatic hydrocarbons (PAHs) in sand soils. J. Agro-Environ. Sci. 69–73.
Chen, Y., Zhang, J., Ma, Q., Sun, C., Ha, S., Zhang, F., 2016. Human health risk assessment and source diagnosis of polycyclic aromatic hydrocarbons (PAHs) in the corn and agricultural soils along main roadside in Changchun, China. Human and Ecological Risk Assessment. Int. J. 22, 706–720. https://doi.org/10.1080/10807039.2015.1104627.
Davis, J.M., Ekman, D.R., Teng, Q., 2016. Linking field-based metabolomics and chemical analyses to prioritize contaminants of emerging concern in the great lakes basin. Environ. Toxicol. Chem. 35, 2493–2502.
de Campos, L.J., de Melo, E.B., 2014. Modeling structure-activity relationships for prodiginines with antimalarial activity using GA/MLR and OPS/PLS. J. Mol. Graph. Model. 54, 19–31.
dos Reis, R.R., Sampaio, S.C., de Melo, E.B., 2014. An alternative approach for the use of water solubility of nonionic pesticides in the modeling of the soil sorption coefficients. Water Res. 53, 191–199. https://doi.org/10.1016/j.watres.2014.01.023.
Dowdy, D.L., McKone, T.E., 2009. Predicting plant uptake of organic chemicals from soil


Wong, M.W., Wiberg, K.B., Frisch, M.J., 1995. Ab initio calculation of molar volumes: comparison with experiment and use in solvation models. J. Comput. Chem. 16, 385–394. https://doi.org/10.1002/jcc.540160312.
Wu, S., Li, B., Liang, J.M., Peng, S.Q., Zhang, T.B., Tang, C., et al., 2015. Distribution characteristics of phthalic acid esters in soils and vegetables in vegetable producing areas of shantou city, China. J. Agro-Environ. Sci. 34, 1889–1896.
Xu, P., Tao, B., Ye, Z., Zhao, H., Ren, Y., Zhang, T., et al., 2016. Polycyclic aromatic hydrocarbon concentrations, compositions, sources, and associated carcinogenic risks to humans in farmland soils and riverine sediments from Guiyu, China. J. Environ. Sci. 48, 102–111. https://doi.org/10.1016/j.jes.2015.11.035.
Yang, F., Wang, M., Wang, Z., 2013. Sorption behavior of 17 phthalic acid esters on three soils: effects of pH and dissolved organic matter, sorption coefficient measurement and QSPR study. Chemosphere 93, 82–89. https://doi.org/10.1016/j.chemosphere.2013.04.081.
Yin, R., Lin, X.G., Wang, S.G., Zhang, H.Y., 2003. Effect of DBP/DEHP in vegetable planted soil on the quality of capsicum fruit. Chemosphere 50, 801–805. https://doi.org/10.1016/S0045-6535(02)00222-9.
Zohair, A., Salim, A.-B., Soyibo, A.A., Beck, A.J., 2006. Residues of polycyclic aromatic hydrocarbons (PAHs), polychlorinated biphenyls (PCBs) and organochlorine pesticides in organically-farmed vegetables. Chemosphere 63, 541–553. https://doi.org/10.1016/j.chemosphere.2005.09.012.

modeling bioaccumulation in plants. Environ. Sci. Technol. 43, 3751–3756. https://doi.org/10.1021/es900265j.
Plaza-Bolaños, P., Padilla-Sánchez, J.A., Garrido-Frenich, A., Romero-González, R., Martínez-Vidal, J.L., 2012. Evaluation of soil contamination in intensive agricultural areas by pesticides and organic pollutants: south-eastern Spain as a case study. J. Environ. Monit. 14, 1181–1188.
Wang, B., Chen, J., Li, X., Wang, Y-n, Chen, L., Zhu, M., et al., 2009. Estimation of soil organic carbon normalized sorption coefficient (koc) using least squares-support vector machine. QSAR Comb. Sci. 28, 561–567. https://doi.org/10.1002/qsar.200860065.
Wang, Y., Chen, J., Yang, X., Lyakurwa, F., Li, X., Qiao, X., 2015. In silico model for predicting soil organic carbon normalized sorption coefficient (KOC) of organic chemicals. Chemosphere 119, 438–444. https://doi.org/10.1016/j.chemosphere.2014.07.007.
Wen, Y., Su, L.M., Qin, W.C., Fu, L., He, J., Zhao, Y.H., 2012. Linear and non-linear relationships between soil sorption and hydrophobicity: model, validation and influencing factors. Chemosphere 86, 634–640. https://doi.org/10.1016/j.chemosphere.2011.11.001.
Wold, S., Sjöström, M., Eriksson, L., 2001. PLS-regression: a basic tool of chemometrics. Chemometr. Intell. Lab. Syst. 58, 109–130. https://doi.org/10.1016/S0169-7439(01)00155-1.
