Available online at www.sciencedirect.com
ScienceDirect Materials Today: Proceedings 3 (2016) 847 – 854
12th International Conference on Nanosciences & Nanotechnologies & 8th International Symposium on Flexible Organic Electronics
QSAR model for cytotoxicity of silica nanoparticles on human embryonic kidney cells
Serena Manganelli a,*, Caterina Leone a, Andrey A. Toropov a, Alla P. Toropova a, Emilio Benfenati a
a IRCCS-Istituto di Ricerche Farmacologiche Mario Negri, Via Giuseppe La Masa 19, Milan 20156, Italy
Abstract
A predictive model for the cytotoxicity of 20 and 50 nm silica nanoparticles has been built using so-called optimal descriptors, which are mathematical functions of size, concentration and exposure time. These parameters have been encoded into 31 'concentration-exposure-size' combinations. The calculation was carried out with the CORAL software (http://www.insilico.eu/coral/) using three random splits of the resulting systems into training and test sets. The endpoint is the cell viability (%) of cultured human embryonic kidney cells (HEK293) exposed to different concentrations of silica nanoparticles, measured by MTT assay; the statistical quality of the best model for this endpoint is satisfactory.
© 2016 Elsevier Ltd. All rights reserved. Selection and peer-review under responsibility of the Conference Committee Members of NANOTEXNOLOGY2015 (12th International Conference on Nanosciences & Nanotechnologies & 8th International Symposium on Flexible Organic Electronics).
Keywords: nanoQSAR; CORAL software; silica nanoparticle; cell viability; quasi-SMILES
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike License, which permits non-commercial use, distribution, and reproduction in any medium, provided the original author and source are credited. * Corresponding author. Tel.: +39 02 3901.4396; fax: +39-02-39014735. E-mail address:
[email protected]
2214-7853 © 2016 Elsevier Ltd. All rights reserved. Selection and peer-review under responsibility of the Conference Committee Members of NANOTEXNOLOGY2015 (12th International Conference on Nanosciences & Nanotechnologies & 8th International Symposium on Flexible Organic Electronics) doi:10.1016/j.matpr.2016.02.018
Serena Manganelli et al. / Materials Today: Proceedings 3 (2016) 847 – 854
1. Introduction
Given the delicate structure of the kidney filtration system, along with the major role that this organ plays in the filtration of bodily fluids and the excretion of waste products, an inappropriate exposure to nanoparticles (NPs) may affect renal cell structure and function [1]. To investigate this possibility, the well-characterized human embryonic kidney (HEK293) cell line has been chosen as an in vitro test system in different studies, given the widespread use of these cells to evaluate the cytotoxic effects of chemicals [2,3].
Computational models for toxicity prediction, such as (quantitative) structure–activity relationships ((Q)SARs), are increasingly important to support the risk assessment of nanoparticles [4]. The aim of the present study is to examine whether a QSAR model can be built for the viability of this cell line exposed to silica NPs, measured by the MTT [3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide] assay, using experimental in vitro data from the literature [5]. The model is based on the understanding that both the physicochemical properties of nanoparticles and the experimental conditions can be directly responsible for the cytotoxic effect. The combination of nanoparticle size (a physicochemical property) with the experimental conditions (here, nanoparticle concentration and exposure time) constitutes the 'eclectic information' expressed by the so-called 'optimal descriptors' or 'quasi-SMILES'. The calculation was carried out with the CORAL software, which has provided satisfactory results for nanomaterials in different studies [6,7,8].
2. Model
The eclectic descriptors for the nanoQSAR analysis, also called 'quasi-SMILES', were calculated with the CORAL software. The experimental MTT results, expressed as percentage of cell viability (%) of human embryonic kidney cells (HEK293), were taken from the literature [5].
Cells were exposed to 20 and 50 nm silica nanoparticles at 25, 50, 100 and 200 μg mL⁻¹ for 12, 24, 36 and 48 h; Table 1 also encodes a 0 h time point (code a). Nanoparticle size, concentration and exposure time, which together define the eclectic information, were encoded (Table 1) and combined to obtain quasi-SMILES. For example, the code '3dy' is a quasi-SMILES that results from the combination of 100 μg mL⁻¹ (3), 36 h (d) and 50 nm (y); the respective cell viability (%) is 56.070 (Table 2). The 31 resulting combined systems (particle concentration, cell exposure time, particle size) were randomly split into a training set of 22 systems and an internal test set of 9 systems. The training and test sets are 'visible' sets, both used while building the model. An external ('invisible') validation set of 9 combinations was used to check the predictability of the model.

Table 1. Codes for different sizes, concentrations and exposure times.
Feature                   Value   Code
Particle size, nm         20      x
                          50      y
Concentration, μg mL⁻¹    25      1
                          50      2
                          100     3
                          200     4
Exposure time, hours      0       a
                          12      b
                          24      c
                          36      d
                          48      e
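The encoding in Table 1 can be illustrated with a minimal Python sketch. The dictionaries are transcribed from Table 1; the function names `encode_quasi_smiles` and `decode_quasi_smiles` are hypothetical helpers introduced here for illustration and are not part of the CORAL software.

```python
# Code tables transcribed from Table 1.
CONC_CODES = {25: "1", 50: "2", 100: "3", 200: "4"}          # concentration, ug/mL
TIME_CODES = {0: "a", 12: "b", 24: "c", 36: "d", 48: "e"}    # exposure time, h
SIZE_CODES = {20: "x", 50: "y"}                              # particle size, nm


def encode_quasi_smiles(conc_ug_ml, time_h, size_nm):
    """Build a three-character code in the order concentration, time, size."""
    return CONC_CODES[conc_ug_ml] + TIME_CODES[time_h] + SIZE_CODES[size_nm]


def decode_quasi_smiles(code):
    """Invert the encoding: return (concentration, time, size)."""
    conc = {v: k for k, v in CONC_CODES.items()}[code[0]]
    time = {v: k for k, v in TIME_CODES.items()}[code[1]]
    size = {v: k for k, v in SIZE_CODES.items()}[code[2]]
    return conc, time, size


if __name__ == "__main__":
    # 100 ug/mL, 36 h, 50 nm is the example discussed in the text.
    print(encode_quasi_smiles(100, 36, 50))
    print(decode_quasi_smiles("2ex"))
```

A lookup-table design keeps each character of the quasi-SMILES tied to exactly one experimental feature, which is what later allows a correlation weight to be assigned per code.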
Table 2. Some examples of quasi-SMILES and relative values of cell viability (%).
Quasi-SMILES   Cell viability (%)
3ay            100.110
3dy            56.070
2ex            62.750
3by            104.160
The encoded features and their correlation weights were used to calculate the so-called 'optimal descriptors' for nanomaterials as follows:

$\mathrm{DCW}(\mathrm{Threshold}, N_{\mathrm{epoch}}) = \sum_k \mathrm{CW}(C_k)$   (1)
where CW(Ck) are the correlation weights for the codes of size, concentration and exposure time, calculated by the Monte Carlo method and listed in Table 3. A CW below 1 indicates an increase of the cytotoxic effect (lower cell viability), and vice versa. It is worth noting that the codes behave similarly across the three splits.

Table 3. Correlation weights of codes of size (x and y), concentration (1-4) and exposure time (a-e), calculated by the Monte Carlo method, for splits 1, 2 and 3.
Ck   CW(Ck) Split 1   CW(Ck) Split 2   CW(Ck) Split 3
1    1.45441          1.30189          1.40074
2    1.3952           1.17842          1.37172
3    0.83741          0.70096          0.84833
4    0.80139          0                0.69856
a    1.49645          1.59805          1.7231
b    1.59694          1.60415          1.9011
c    0.69776          0.69898          0.498
d    0.70343          0.69814          0.49633
e    0.0              0.0              0.0
x    0.83454          0.95436          0.97374
y    1.09842          1.25187          1.14738
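As a worked illustration of Eq. (1), the following Python sketch sums the correlation weights of the three codes in a quasi-SMILES. The weights are the split 1 column of Table 3; the function name `dcw` is a hypothetical helper, and the sketch only evaluates the descriptor from published weights rather than reproducing how CORAL derives them.

```python
# Correlation weights for split 1, transcribed from Table 3.
CW_SPLIT1 = {
    "1": 1.45441, "2": 1.3952, "3": 0.83741, "4": 0.80139,   # concentration codes
    "a": 1.49645, "b": 1.59694, "c": 0.69776, "d": 0.70343,  # exposure-time codes
    "e": 0.0,
    "x": 0.83454, "y": 1.09842,                              # size codes
}


def dcw(quasi_smiles, weights):
    """Eq. (1): the descriptor is the sum of the weights of all codes."""
    return sum(weights[code] for code in quasi_smiles)


if __name__ == "__main__":
    for qs in ("3ay", "3dy", "4ex"):
        print(qs, round(dcw(qs, CW_SPLIT1), 5))
```

For '3ay' the sum is 0.83741 + 1.49645 + 1.09842 = 3.43228, which agrees with the DCW(3,3) value of 3.432 listed for this system in Table 4.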
The optimal descriptors DCW are functions of the threshold (T) and the number of epochs (Nepoch), which are parameters of the Monte Carlo optimization used by the CORAL software. The threshold classifies codes as either rare (and thus likely less reliable features that probably introduce noise into the model) or not rare; the latter are used by the model and labeled as active. Nepoch is the number of optimization cycles, where one cycle is a sequence of modifications of the correlation weights of all codes involved in model development [8]. The endpoint depends on the optimal descriptor as follows:

$\mathrm{Endpoint} = C_0 + C_1 \times \mathrm{DCW}(T, N_{\mathrm{epoch}})$   (2)
where C0 and C1 are the intercept and the slope, respectively, given separately for the training and test sets. To obtain a model with good predictive potential, the preferable parameters of the Monte Carlo optimization, namely the threshold (T*) and the number of epochs (N*), should be selected as those that maximize the correlation coefficient between experimental and calculated endpoint values for the test set. In this case the preferable values providing the best model statistics were N*=3 and T*=3 for split 1, N*=4 and T*=3 for split 2, and N*=4 and T*=5 for split 3. The three random splits into training and test sets for the considered systems are shown in Table 4, together with their experimental and predicted values of cell viability (%).

Table 4. Quasi-SMILES and relative experimental and predicted values of cell viability (%) for splits 1, 2, and 3.
Split 1
Set        Quasi-SMILES   DCW(3,3)   Expr      Calc      Expr-Calc
training   3ay            3.432      100.110   89.405    10.705
training   3dy            2.639      56.070    58.093    -2.023
training   2ex            2.230      62.750    41.923    20.827
training   3by            3.533      104.160   93.373    10.787
training   1cy            3.251      92.740    82.232    10.508
training   1by            4.150      101.740   117.736   -15.996
training   2cy            3.191      74.600    79.894    -5.294
training   4cx            2.334      32.980    46.028    -13.048
training   2ax            3.726      99.780    101.010   -1.230
training   2bx            3.827      98.350    104.978   -6.628
training   2dy            3.197      74.630    80.118    -5.488
training   1ax            3.785      100.040   103.348   -3.308
training   4bx            3.233      95.900    81.532    14.368
training   4by            3.497      100.900   91.951    8.949
training   1ey            2.553      71.080    54.681    16.400
training   3cx            2.370      34.670    47.450    -12.780
training   1bx            3.886      101.730   107.316   -5.586
training   3dx            2.375      34.000    47.674    -13.674
training   4cy            2.598      39.810    56.447    -16.637
training   2dx            2.933      67.150    69.698    -2.548
training   3cy            2.634      56.370    57.869    -1.499
training   3bx            3.269      96.150    82.954    13.196
test       1ex            2.289      75.010    44.261    30.749
test       3ex            1.672      26.720    19.899    6.821
test       1dy            3.256      78.400    82.456    -4.056
test       2ay            3.990      99.950    111.430   -11.480
test       3ax            3.168      100.040   78.986    21.054
test       2by            4.091      102.070   115.398   -13.328
test       1ay            4.049      99.950    113.768   -13.818
test       2cx            2.928      70.190    69.474    0.716
test       4ey            1.900      26.680    28.896    -2.216
Split 2
Set        Quasi-SMILES   DCW(4,3)   Expr      Calc      Expr-Calc
training   3dy            2.651      56.070    65.729    -9.659
training   2ex            2.133      62.750    49.801    12.949
training   3by            3.557      104.160   93.578    10.582
training   1cy            3.253      92.740    84.226    8.514
training   1by            4.158      101.740   112.049   -10.309
training   2cy            3.129      74.600    80.431    -5.831
training   3ex            1.655      26.720    35.124    -8.404
training   2ax            3.731      99.780    98.922    0.858
training   2bx            3.737      98.350    99.109    -0.759
training   2dy            3.128      74.630    80.405    -5.775
training   3ax            3.253      100.040   84.245    15.795
training   1ax            3.854      100.040   102.717   -2.677
training   4bx            2.559      95.900    62.887    33.013
training   4by            2.856      100.900   72.032    28.868
training   1ey            2.554      71.080    62.741    8.339
training   3cx            2.354      34.670    56.610    -21.940
training   1bx            3.860      101.730   102.905   -1.175
training   3dx            2.353      34.000    56.584    -22.584
training   1ay            4.152      99.950    111.862   -11.912
training   4cy            1.951      39.810    44.209    -4.399
training   2dx            2.831      67.150    71.260    -4.110
training   3cy            2.652      56.370    65.755    -9.385
test       3ay            3.551      100.110   93.390    6.720
test       1ex            2.256      75.010    53.596    21.414
test       4cx            1.653      32.980    35.064    -2.084
test       1dy            3.252      78.400    84.200    -5.800
test       2ay            4.028      99.950    108.067   -8.117
test       2by            4.034      102.070   108.254   -6.184
test       2cx            2.832      70.190    71.286    -1.096
test       4ey            1.252      26.680    22.723    3.957
test       3bx            3.259      96.150    84.433    11.717
Split 3
Set        Quasi-SMILES   DCW(4,5)   Expr      Calc      Expr-Calc
training   3dy            2.492      56.070    54.510    1.560
training   2ex            2.345      62.750    50.115    12.635
training   3by            3.897      104.160   96.635    7.525
training   2ax            4.069      99.780    101.785   -2.005
training   2bx            4.247      98.350    107.123   -8.773
training   2dy            3.015      74.630    70.205    4.425
training   3ax            3.545      100.040   86.090    13.950
training   1ax            4.098      100.040   102.655   -2.615
training   4bx            3.573      95.900    86.937    8.963
training   4by            3.747      100.900   92.144    8.756
training   1ey            2.548      71.080    56.192    14.889
training   3cx            2.320      34.670    49.353    -14.683
training   1bx            4.276      101.730   107.993   -6.263
training   2by            4.420      102.070   112.330   -10.260
training   3dx            2.318      34.000    49.303    -15.303
training   1ay            4.271      99.950    107.862   -7.912
training   2cx            2.843      70.190    65.048    5.142
training   4ey            1.846      26.680    35.136    -8.456
training   4cy            2.344      39.810    50.069    -10.259
training   2dx            2.842      67.150    64.998    2.152
training   3cy            2.494      56.370    54.560    1.810
training   3bx            3.723      96.150    91.428    4.722
test       3ay            3.719      100.110   91.297    8.813
test       1ex            2.374      75.010    50.985    24.025
test       1cy            3.046      92.740    71.125    21.615
test       1by            4.449      101.740   113.200   -11.460
test       2cy            3.017      74.600    70.255    4.345
test       3ex            1.822      26.720    34.420    -7.700
test       4cx            2.170      32.980    44.862    -11.882
test       1dy            3.044      78.400    71.075    7.325
test       2ay            4.242      99.950    106.992   -7.042
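The role of the threshold and of the number of epochs in the Monte Carlo optimization described above can be illustrated with a deliberately simplified Python sketch. This is not the CORAL algorithm: the acceptance rule (keep a random change of one weight only if the training-set correlation improves), the step size, the starting weights of 1.0 and the blocking of rare codes with a weight of 0 are all assumptions made for illustration, and `optimise_weights` is a hypothetical name. The data are eight training systems from split 1 of Table 4.

```python
import random
from collections import Counter

# A few training systems (quasi-SMILES, experimental viability) from Table 4, split 1.
CODES = ["3ay", "3dy", "2ex", "3by", "1cy", "4cx", "2ax", "1ey"]
VIABILITY = [100.110, 56.070, 62.750, 104.160, 92.740, 32.980, 99.780, 71.080]


def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 when either variable has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def optimise_weights(codes, endpoints, threshold, n_epochs, step=0.1, seed=0):
    """Simplified Monte Carlo search for correlation weights (illustrative only)."""
    rng = random.Random(seed)
    counts = Counter(ch for code in codes for ch in code)
    # Assumption: codes seen fewer than `threshold` times are rare and get weight 0.
    weights = {ch: (1.0 if n >= threshold else 0.0) for ch, n in counts.items()}
    active = sorted(ch for ch, n in counts.items() if n >= threshold)

    def score(w):
        descriptors = [sum(w[ch] for ch in code) for code in codes]
        return pearson_r(descriptors, endpoints)

    best = score(weights)
    for _ in range(n_epochs):            # one epoch = one pass over all active codes
        for ch in active:
            old = weights[ch]
            weights[ch] = old + rng.uniform(-step, step)
            new = score(weights)
            if new > best:
                best = new               # keep the improving modification
            else:
                weights[ch] = old        # otherwise restore the previous weight
    return weights, best


if __name__ == "__main__":
    w, r = optimise_weights(CODES, VIABILITY, threshold=1, n_epochs=50)
    print(round(r, 3), {k: round(v, 3) for k, v in sorted(w.items())})
```

In such a scheme, repeating the search over a grid of threshold and epoch values and retaining the pair that gives the highest test-set correlation would correspond to the selection of T* and N* described in the text.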
Cell viability (%) as a function of the optimal descriptors used for building the model was calculated as follows; the statistical parameters obtained for each split are also given.

Split 1
Training set: Cell viability (%) = -46.1177394 (± 4.0947512) + 39.4848711 (± 1.2424487) × DCW(3,3)   (3)
n = 22; $R^2$ = 0.8019; $Q^2$ = 0.7495; s = 11.4 %; MAE = 9.61; F = 81
Test set: Cell viability (%) = -14.74777 (± 4.0947512) + 29.68753 (± 1.2424487) × DCW(3,3)   (4)
n = 9; $R^2$ = 0.8250; $Q^2$ = 0.7189; s = 15.6 %; MAE = 11.6; F = 33; $\overline{R_m^2}$ = 0.6652
Split 2
Training set: Cell viability (%) = -15.7569693 (± 2.9599261) + 30.7381576 (± 0.8499385) × DCW(4,3)   (5)
n = 22; $R^2$ = 0.7096; $Q^2$ = 0.6584; s = 14.1 %; MAE = 10.8; F = 49
Test set: Cell viability (%) = -3.47906 (± 2.9599261) + 27.29326 (± 0.8499385) × DCW(4,3)   (6)
n = 9; $R^2$ = 0.8996; $Q^2$ = 0.8430; s = 10 %; MAE = 7.45; F = 63; $\overline{R_m^2}$ = 0.8184
Split 3
Training set: Cell viability (%) = -20.2188822 (± 2.1712065) + 29.9870360 (± 0.6140297) × DCW(4,5)   (7)
n = 22; $R^2$ = 0.8778; $Q^2$ = 0.8503; s = 9.23 %; MAE = 7.87; F = 144
Test set: Cell viability (%) = -9.30096 (± 2.1712065) + 27.46875 (± 0.6140297) × DCW(4,5)   (8)
n = 9; $R^2$ = 0.7734; $Q^2$ = 0.6031; s = 14.0 %; MAE = 11.6; F = 24; $\overline{R_m^2}$ = 0.6854
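To show how the regression equations connect to the tabulated values, the Python sketch below applies the split 1 training equation, Eq. (3), to the DCW(3,3) values of the split 1 test systems in Table 4 and computes the mean absolute error. The helper names `predict` and `mean_absolute_error` are hypothetical. Because the tabulated descriptors are rounded to three decimals, the results should only approximately match the Calc column and the reported test-set MAE of 11.6.

```python
# Intercept and slope of Eq. (3), split 1 training set.
C0, C1 = -46.1177394, 39.4848711

# Split 1 test systems from Table 4: (quasi-SMILES, DCW(3,3), experimental viability).
TEST_SPLIT1 = [
    ("1ex", 2.289, 75.010), ("3ex", 1.672, 26.720), ("1dy", 3.256, 78.400),
    ("2ay", 3.990, 99.950), ("3ax", 3.168, 100.040), ("2by", 4.091, 102.070),
    ("1ay", 4.049, 99.950), ("2cx", 2.928, 70.190), ("4ey", 1.900, 26.680),
]


def predict(dcw_value):
    """Eq. (2) with the split 1 coefficients: viability = C0 + C1 * DCW."""
    return C0 + C1 * dcw_value


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


if __name__ == "__main__":
    observed = [expr for _, _, expr in TEST_SPLIT1]
    predicted = [predict(d) for _, d, _ in TEST_SPLIT1]
    for (qs, _, expr), calc in zip(TEST_SPLIT1, predicted):
        print(f"{qs}  expr={expr:8.3f}  calc={calc:8.3f}  diff={expr - calc:8.3f}")
    print("MAE =", round(mean_absolute_error(observed, predicted), 2))
```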
In Eqs. (3)–(8), n is the number of nanoparticle systems in each set; $R^2$ is the squared correlation coefficient; $Q^2$ is the leave-one-out cross-validated correlation coefficient; s is the standard error of estimation; MAE is the mean absolute error; F is the variance ratio [9]; and $\overline{R_m^2}$ is a metric of predictability. According to the rules of QSAR/QSPR approaches, a developed model has predictability if $\overline{R_m^2}$ > 0.5 [10,11]. One more set of nine quasi-SMILES was used for the external validation, providing the predicted values listed in Table 5 for each model built on the three random splits.

Table 5. Codes of systems and relative experimental and predicted values of cell viability (%) for the external validation set.
Quasi-SMILES   Expr       DCW(3,3)   Calc1     DCW(4,3)   Calc2     DCW(4,5)   Calc3
3ey            53.0100    1.93583    30.3181   1.95282    44.2692   1.99571    39.6264
4dx            27.0600    2.33936    46.2517   1.65250    35.0378   2.16864    44.8121
4ax            100.0400   3.13238    77.5640   2.55241    62.6994   3.39541    81.5993
4dy            36.0800    2.60324    56.6710   1.95000    44.1826   2.34227    50.0189
1dx            78.3000    2.99238    72.0361   2.95439    75.0555   2.87081    65.8681
1cx            89.6400    2.98671    71.8123   2.95523    75.0814   2.87247    65.9181
2ey            70.7600    2.49362    52.3425   2.43029    58.9456   2.51910    55.3215
4ex            20.3000    1.63593    18.4767   0.95436    13.5783   1.67231    29.9286
4ay            99.8600    3.39626    87.9833   2.84991    71.8442   3.56905    86.8062
The average $\overline{R_m^2}$ values for the external validation set, tested for each random split, were equal to 0.5945, 0.5945 and 0.5339, respectively.
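The text does not spell out how the average $R_m^2$ metric is computed, so the Python sketch below uses one common formulation associated with the cited work of Roy and co-workers [9-11], stated here as an assumption: $r_m^2 = r^2 (1 - \sqrt{|r^2 - r_0^2|})$, where $r_0^2$ comes from a regression through the origin, averaged with its counterpart obtained by swapping observed and predicted values. The function names are hypothetical, and whether this reproduces the values reported above depends on implementation details not given in the paper. The data are the Expr and Calc1 columns of Table 5.

```python
# External validation set, Table 5: experimental values and split 1 predictions (Calc1).
EXPR = [53.0100, 27.0600, 100.0400, 36.0800, 78.3000, 89.6400, 70.7600, 20.3000, 99.8600]
CALC1 = [30.3181, 46.2517, 77.5640, 56.6710, 72.0361, 71.8123, 52.3425, 18.4767, 87.9833]


def squared_pearson(xs, ys):
    """Squared Pearson correlation coefficient between two sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


def r0_squared(y, x):
    """Assumed definition: determination coefficient of y = k * x (no intercept)."""
    k = sum(a * b for a, b in zip(y, x)) / sum(b * b for b in x)
    mean_y = sum(y) / len(y)
    ss_res = sum((a - k * b) ** 2 for a, b in zip(y, x))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def average_rm2(observed, predicted):
    """Average of r_m^2 and its reverse-axes counterpart (assumed formulation)."""
    r2 = squared_pearson(observed, predicted)
    rm2 = r2 * (1.0 - abs(r2 - r0_squared(observed, predicted)) ** 0.5)
    rm2_rev = r2 * (1.0 - abs(r2 - r0_squared(predicted, observed)) ** 0.5)
    return (rm2 + rm2_rev) / 2.0


if __name__ == "__main__":
    print("average rm2 (split 1 model, external set):", round(average_rm2(EXPR, CALC1), 4))
```

The metric penalizes models whose predictions correlate with the observations but deviate from the identity line, which is why it is used alongside $R^2$ with the 0.5 acceptance threshold quoted above.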
3. Conclusions
This is the first attempt to build a model for predicting the cell viability (%) of human embryonic kidney cells exposed to different concentrations of nanoparticles using quasi-SMILES. Thanks to the flexibility of the approach, the optimization can be modified in the future to improve the statistical results and thus increase the robustness of the model. Moreover, any intrinsic property or external condition can easily be introduced once it has been encoded into a 'quasi-SMILES'. The present paper shows how easily descriptors of heterogeneous nature can be introduced into the model, which is a great advantage in the case of nanomaterials, given the multiple features used to characterize them and the lack of standardization of experimental protocols. Reasoning about the possible involvement of a feature in the endpoint to be modeled is also facilitated, since each attribute has an explicit meaning. This makes it possible to highlight relevant factors affecting the toxicity, as well as other properties of interest, of nanomaterials. Based on the present statistics, this model confirms that CORAL can provide satisfactory models for nanomaterials.

Acknowledgments
We thank the EC Project PreNanoTox (Project Reference 309666) and NanoPUZZLES (Project Reference 309837).

References
[1] V. Selvaraj, S. Bodapati, E. Murray, K.M. Rice, N. Winston, T. Shokuhfar, Y. Zhao, E. Blough, Int. J. Nanomedicine 9 (2014) 1379–1391.
[2] A.M. Florea, F. Splettstoesser, D. Büsselberg, Toxicology and Applied Pharmacology 220 (2007) 292–301.
[3] L.L. Ji, Y. Chen, Z.T. Wang, Experimental and Toxicologic Pathology 60 (2008) 87–93.
[4] A.N. Richarz, J.C. Madden, R.L. Marchese Robinson, Ł. Lubiński, E. Mokshina, P. Urbaszek, V.E. Kuz'min, T. Puzyn, M.T.D. Cronin, Perspectives in Science 3 (2015) 27–29.
[5] F. Wang, F. Gao, M. Lan, H. Yuan, Y. Huang, J. Liu, Toxicol. In Vitro 23 (2009) 808–815.
[6] A.P. Toropova, A.A. Toropov, Chemosphere 124 (2015) 40–46.
[7] A.A. Toropov, A.P. Toropova, E. Benfenati, G. Gini, T. Puzyn, D. Leszczynska, J. Leszczynski, Chemosphere 89 (2012) 1098–1102.
[8] A.P. Toropova, A.A. Toropov, E. Benfenati, R. Korenstein, J. Nanopart. Res. 16 (2014) 2282.
[9] K. Roy, P.P. Roy, Chem. Biol. Drug Des. 72 (2008) 370–382.
[10] K. Roy, I. Mitra, S. Kar, P.K. Ojha, R.N. Das, H. Kabir, J. Chem. Inf. Model. 52 (2012) 396–408.
[11] P.K. Ojha, I. Mitra, R.N. Das, K. Roy, Chemometr. Intell. Lab. Syst. 107 (2011) 194–205.