Artificial neural network modeling of biomarkers to infer characteristics of contaminant exposure in Clarias gariepinus

Ecotoxicology and Environmental Safety 77 (2012) 28–34

Contents lists available at SciVerse ScienceDirect

Ecotoxicology and Environmental Safety journal homepage: www.elsevier.com/locate/ecoenv

Ali Karami a,1,*, Annie Christianus a, Behzad Bahraminejad b, François Gagné c, Simon C. Courtenay d

a Department of Aquaculture, Faculty of Agriculture, Universiti Putra Malaysia, 43400 Selangor, Malaysia
b Institute of Advanced Technology, Universiti Putra Malaysia, 43400 Selangor, Malaysia
c Environment Canada, Fluvial Ecosystem Research, Science and Technology Branch, 105 McGill Street, Montreal, Quebec, Canada
d Fisheries and Oceans Canada at the Canadian Rivers Institute, Department of Biology, University of New Brunswick, Fredericton, New Brunswick, Canada

Article info

Article history: Received 26 July 2011; received in revised form 21 October 2011; accepted 25 October 2011; available online 17 November 2011.

Keywords: Artificial neural network (ANN); Modeling; Benzo[a]pyrene (BaP); Fish biomarkers; Biliary fluorescent aromatic compounds (FACs); Glutathione S-transferase (GST)

Abstract

This study examined the potential of artificial neural network (ANN) modeling to infer timing, route and dose of contaminant exposure from biomarkers in a freshwater fish. Hepatic glutathione S-transferase (GST) activity and biliary concentrations of BaP, 1-OH BaP, 3-OH BaP and 7,8D BaP were quantified in juvenile Clarias gariepinus injected intramuscularly or intraperitoneally with 10–50 mg/kg benzo[a]pyrene (BaP) 1–3 d earlier. A feedforward multilayer perceptron (MLP) ANN resulted in more accurate prediction of timing, route and exposure dose than a linear neural network or a radial basis function (RBF) ANN. MLP sensitivity analyses revealed contribution of all five biomarkers to predicting route of exposure but no contribution of hepatic GST activity or one of the two hydroxylated BaP metabolites to predicting time of exposure and dose of exposure. We conclude that information content of biomarkers collected from fish can be extended by judicious use of ANNs. © 2011 Elsevier Inc. All rights reserved.

Abbreviations: i.m., intramuscular; i.p., intraperitoneal; FACs, fluorescent aromatic compounds; 7,8D BaP, 7,8-dihydrodiolbenzo[a]pyrene; 1-OH BaP, 1-hydroxybenzo[a]pyrene; 3-OH BaP, 3-hydroxybenzo[a]pyrene; GST, glutathione S-transferase; PAHs, polycyclic aromatic hydrocarbons; BaP, benzo[a]pyrene; ANN, artificial neural network; MLP, multilayer perceptron; RBF, radial basis function.

* Corresponding author. Present address: Institute of Bioscience, Universiti Putra Malaysia, 43400 Selangor, Malaysia. E-mail address: [email protected] (A. Karami).
1 Mailing address: P.O. Box 283, Universiti Putra Malaysia, 43400 Selangor, Malaysia.
0147-6513/$ - see front matter © 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.ecoenv.2011.10.026

1. Introduction

Chemical or physical analyses of water and sediment can quantify contaminant concentrations in aquatic environments, but they cannot provide information about the ecotoxicological impacts of the contaminants. To address this gap, fish biomarkers have been used to signal the influence of environmental contaminants on aquatic organisms (Peakall, 1994; Budka et al., 2010). Forecasting models have been used to predict water contamination levels and help in making timely decisions to prevent adverse effects to aquatic ecosystems.

Artificial neural networks (ANNs) are one of the main artificial intelligence approaches to forecasting, often providing faster, easier and more precise information than classic methods. ANNs compute data in a process analogous to biological neural systems consisting of highly interconnected single cells called neurons (nodes), which are responsible for computations. The most important characteristic of ANNs is that they may be trained and developed to be compatible with complex input datasets. A training (learning) process updates the system through relations between the input and output datasets, which increases the predictive properties of the network. ANN models have two major applications: classification and forecasting (Khan et al., 2001; Sahoo et al., 2006). In classification, input variables are categorized into different classes based on the inherent differences among the values. Forecasting consists of using the known input and output variables to find the most reasonable relationship between them for precise prediction of the outcomes of future input datasets.

Neural networks have found wide application in different fields of study such as fisheries (e.g., Robotham et al., 2010), genetics (e.g., Yan and Wu, 2010), oceanography (e.g., Friedrich and Oschlies, 2009), weather forecasting (e.g., Monfared et al., 2009), robotics (e.g., Eski et al., 2011), medicine (e.g., Zini, 2005; Cucchetti et al., 2010; Smyser et al., 2010), psychology (e.g., Bowers, 2009), environmental science (e.g., Bayar et al., 2009) and ecotoxicology (e.g., Gagné and Blaise, 1997; Gagné et al., 2008; Meng and Lin, 2008). Recently, Budka et al. (2010) classified coastal water pollution levels through mussel biomarkers. However, to date, ANNs have rarely been applied to the field of fish biomarkers. In the present study we investigated three ANN models: linear-based network,
feedforward multilayer perceptron (MLP) with back-propagation learning algorithm and radial basis function (RBF) neural network. The last two ANNs are considered universal approximators (Hartman et al., 1990; Hornik et al., 1990), meaning that, given enough hidden neurons, they can approximate any continuous function to an arbitrary degree of accuracy.

Polycyclic aromatic hydrocarbons (PAHs) are widespread environmental contaminants (Soclo et al., 2000; Doong and Lin, 2004), resulting from incomplete combustion of organic materials, wastewaters and accidental spills. Benzo[a]pyrene (BaP) is among the most carcinogenic and mutagenic of PAH compounds (Huberman et al., 1976; Wislocki et al., 1976) and is frequently used as a model in ecotoxicological studies. PAHs are predominantly biotransformed by the liver or other internal organs of fish, and both unbiotransformed parent compounds and metabolites are stored in the gall bladder before excretion. Many researchers have highlighted the importance of biliary fluorescent aromatic compounds (FACs) as a biomarker of PAH exposure in fish (Hellou and Payne, 1987; Pettersson et al., 2006; Ortiz-Delgado et al., 2007). Glutathione S-transferase (GST) is a phase II biotransformation enzyme that catalyzes the conjugation of glutathione (GSH) with a wide range of hydrophobic electrophiles (Gibbs et al., 1996; Sun et al., 2007; Ma et al., 2008).

Intraperitoneal (i.p.) and intramuscular (i.m.) injections are two major routes of contaminant exposure in laboratory toxicological experiments. As the first objective of this study, we examined the ability of different ANN models to predict which of the two routes of exposure had been used to produce a given set of biomarker responses. Secondly, there is value in being able to determine when fish were exposed to a given contaminant and at what concentration. Therefore, the second objective of this study was to predict the concentration and time of BaP injection.
The contribution of each individual biomarker (input parameters) for determining injection method, time and exposure dosage (output parameters) was examined through sensitivity analysis.


2. Materials and methods


2.1. Bioassays

Details on treatments of fish and collection of biomarker data are provided in Karami et al. (2011). Briefly, juvenile African catfish were injected either i.p. or i.m. with BaP (dissolved in corn oil) at a dose of 10, 30 or 50 mg/kg body weight. Control groups received injections of corn oil only. Samples of liver and bile, obtained from groups of four fish per treatment 24, 48 or 72 h post-injection, were analyzed for GST activity and for concentrations of BaP and three of its metabolites: 1-hydroxybenzo[a]pyrene (1-OH BaP), 3-hydroxybenzo[a]pyrene (3-OH BaP) and 7,8-dihydrodiolbenzo[a]pyrene (7,8D BaP).

2.2. Ethical statement

Animals were used according to Malaysian legislation and the Code of Practice and accreditation criteria of the Universities Federation for Animal Welfare, UK (UFAW) (Hubrecht and Kirkwood, 2010).


2.3. ANN modeling

The detailed architecture of ANN models has been reviewed in other studies (Krogh, 2008; Bishop, 2009), so only a brief description of the three ANN models used is presented here: the linear-based, MLP and RBF neural networks. In a linear network, input data are propagated directly onto the output layer (computational nodes; Fig. 1a). In the MLP design, one or more hidden layers between the input and output layers provide additional learning or generalization capacity. Hidden layers process the input data and link the input and output datasets, and prediction errors are minimized using a back-propagation algorithm. Back-propagation consists of a forward pass (input → output), which computes the network output, and a backward pass (output → input), which propagates the prediction error back through the network so that the weights of each neuron can be adjusted. The scheme of the MLP neural network applied is outlined in Fig. 1b. More hidden layers increase the number of connections, which results in a more complex ANN architecture (Birikundavyi et al., 2002). However, excessive hidden layers increase the risk of over-fitting and also increase computation time (Karunanithi et al., 1994; Kaiser et al., 1997).


Fig. 1. Structure of the (a) linear neural network, (b) MLP neural network with one hidden layer and (c) RBF neural network. Time: time of injection; Dosage: dose of injection; Method: method of injection.

In contrast, the small number of degrees of freedom associated with few hidden layers may result in insufficient system training. The MLP neural network is the most widely used model in different fields of study, including water contamination
research (Almasri and Kaluarachchi, 2005; Sahoo et al., 2006; El Tabach et al., 2007). In an RBF neural network, a model with only one hidden layer is able to predict the output datasets due to the strongly non-linear activation function of the hidden layer neurons (Gaussian function; Girosi and Poggio, 1990; Fig. 1c). In the present study, input parameters (selected biomarkers) were presented to the linear, MLP and RBF neural networks to develop predictive models for the output parameters (when, how and at what dosage C. gariepinus were exposed to BaP).

2.3.1. Data analysis

Various methods of data normalization have been described for improving performance of ANN models (Cabreira et al., 2009). In this study, input values were either scaled into the interval 0–1 or log-transformed, and the better method of transformation was selected based on performance criteria. Models were developed using the neural network toolbox of MATLAB (R2009a; MathWorks Inc., Natick, MA). Descriptive analyses were performed with Statistix software (V. 8, 2007; Analytical Software, USA).

2.3.2. Activation functions and training algorithms

ANNs are adaptive systems, which change their structure based on the complex relationships between the input and output datasets during the training phase. Training consists of optimizing thresholds, synaptic weights, learning rates and epoch size so as to minimize the model error. During the training process, synaptic weights of the important connections are reinforced while those of less important connections are weakened. In this research, 70% and 30% of the whole dataset were randomly partitioned into training and testing sets, respectively (Cabreira et al., 2009). The linear activation function was used for output neurons of all architectures because of its wide range of applications in ANN studies (Rankovic et al., 2010):

$$ n = \sum_{j=1}^{m} x_j w_{jk} + \theta_k \tag{1} $$

$$ f(n) = n \tag{2} $$

$$ \mathrm{Output} = f(n) \tag{3} $$

where $x_j$ is the output of the jth neuron of the hidden layer; $w_{jk}$ is the weight between the jth neuron of the prior layer (hidden layer) and the kth neuron of the output layer; $\theta_1, \theta_2, \ldots, \theta_k$ are biases; m is the number of inputs of neuron k; and f(·) is the activation function. The hyperbolic tangent (Haykin, 1999) was the activation function of neurons in the hidden layer(s) of the MLP neural network. It is defined by

$$ f(n) = \frac{1 - e^{-n}}{1 + e^{-n}} \tag{4} $$
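Taken together, Eqs. (1)–(4) describe one forward pass through a three-layer MLP such as the 5-8-3 architecture examined in this study. The Python sketch below illustrates that computation under stated assumptions: the function names, the nested-list weight layout and all numerical values are hypothetical placeholders, not the fitted network, and the study itself was carried out with the MATLAB neural network toolbox.

```python
import math


def hidden_activation(n):
    """Hidden-layer activation as written in Eq. (4); algebraically equal to tanh(n / 2)."""
    return (1.0 - math.exp(-n)) / (1.0 + math.exp(-n))


def linear_activation(n):
    """Output-layer activation, Eq. (2): f(n) = n."""
    return n


def layer_output(inputs, weights, biases, activation):
    """Apply Eq. (1) to every neuron k of a layer, then the activation.

    weights[j][k] is the weight from input j to neuron k; biases[k] is theta_k.
    """
    outputs = []
    for k, theta_k in enumerate(biases):
        n = sum(x_j * weights[j][k] for j, x_j in enumerate(inputs)) + theta_k
        outputs.append(activation(n))
    return outputs


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs -> hidden layer (Eq. 4) -> linear outputs (Eqs. 2-3)."""
    hidden = layer_output(x, w_hidden, b_hidden, hidden_activation)
    return layer_output(hidden, w_out, b_out, linear_activation)


# Hypothetical 5-8-3 network with arbitrary placeholder weights (not fitted values).
N_IN, N_HIDDEN, N_OUT = 5, 8, 3
w_hidden = [[0.1] * N_HIDDEN for _ in range(N_IN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [[0.5] * N_OUT for _ in range(N_HIDDEN)]
b_out = [1.0] * N_OUT

biomarkers = [0.2, 0.4, 0.6, 0.8, 1.0]  # five made-up, already-transformed inputs
prediction = mlp_forward(biomarkers, w_hidden, b_hidden, w_out, b_out)
```

Here `prediction` holds three values, one per output neuron (route, time and dosage in the study's setting). Training, that is, choosing the weights and biases, is outside the scope of this sketch.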

In RBF neural networks, the activation function of the hidden layer was the Gaussian function:

$$ f(n) = a \, e^{-\lVert x - c \rVert^{2} / (2\sigma^{2})} \tag{5} $$

where x is the input vector, a is the height of the curve's peak, c is the center and σ is the spread of the curve. RBF neural networks are multilayer feedforward artificial neural networks, but with a hybrid training algorithm (Devillers, 2009). In this model, there are no weights between the input and hidden layers; the hidden layer therefore receives the input variables unweighted (Arbib, 2003). Among the different types of RBF functions, the Gaussian is generally preferred (Er et al., 2002) and was used in this study. The back-propagation (BP) algorithm is a common error-correction gradient descent technique in ANNs. Among the various types of BP algorithm, Levenberg–Marquardt was used in this study to train the MLP network because it converges quickly for networks with up to a few hundred neurons (Wilamowski et al., 2001).
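As an illustration of Eq. (5), the following minimal Python sketch evaluates the Gaussian activation of each hidden neuron in an RBF layer. The function names, centers and spread are hypothetical and chosen only for the example; how centers and spreads are selected during training is not shown.

```python
import math


def gaussian_rbf(x, center, spread, height=1.0):
    """Eq. (5): height * exp(-||x - center||^2 / (2 * spread^2))."""
    squared_distance = sum((x_i - c_i) ** 2 for x_i, c_i in zip(x, center))
    return height * math.exp(-squared_distance / (2.0 * spread ** 2))


def rbf_hidden_layer(x, centers, spread):
    """Activations of all hidden neurons; the inputs reach each neuron unweighted."""
    return [gaussian_rbf(x, center, spread) for center in centers]


# Made-up two-dimensional example with two hidden neurons.
activations = rbf_hidden_layer([1.0, 2.0], centers=[[1.0, 2.0], [1.0, 3.0]], spread=1.0)
```

An input that coincides with a center gives the peak value a, and the activation decays with distance from the center at a rate set by the spread.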

2.3.3. Performance criteria

In this study, a suite of performance criteria was applied to select the best architecture and model. Model precision and generalization ability were assessed based on the standard error (SE) around the mean of thirty replications of each trial (Forster, 2000). To determine the performance of the three ANN models, mean square error (MSE), mean error (ME) (Singh et al., 2009) and Pearson's correlation (r) of the testing datasets were used. Models with strong correlation between computed and measured outputs are considered reliable and applicable (Zhao et al., 2007). Sum of square error (SSE) was considered as the error goal (Chakraborty et al., 1992) in the training datasets and is defined as

$$ \mathrm{SSE} = \sum_{j=1}^{n} (y_{cj} - y_{mj})^{2} \tag{6} $$

where n represents the number of observations, and $y_{cj}$ and $y_{mj}$ represent the computed and measured output of the jth observation, respectively. Mean square error (MSE) provides a good index for the average error of different ANN models. It is defined by

$$ \mathrm{MSE} = \frac{1}{n} \sum_{j=1}^{n} (y_{cj} - y_{mj})^{2} \tag{7} $$

Mean error (ME) is an index of the overestimation or underestimation of the output values. It is defined as

$$ \mathrm{ME} = \frac{1}{n} \sum_{j=1}^{n} (y_{cj} - y_{mj}) \tag{8} $$

In MLP and RBF methods, through several trial and error approaches (Sahoo et al., 2006), the optimum epoch size and the preferred SSE_train value were set to 1000 and 10^−4, respectively. The best ANN model was the model with the highest r value and the lowest SE, MSE and ME values in the testing dataset. When two architectures resulted in similar performance, priority was given to the simpler architecture (Rojek, 2008).

Table 1. Twenty-six scenarios examined during sensitivity analysis, showing biomarkers included (+) and excluded (−).

Scenario | GST | 7,8D BaP | 1-OH BaP | 3-OH BaP | BaP
1 | + | + | − | − | −
2 | + | − | + | − | −
3 | + | − | − | + | −
4 | + | − | − | − | +
5 | − | + | + | − | −
6 | − | + | − | + | −
7 | − | + | − | − | +
8 | − | − | + | − | +
9 | − | − | + | + | −
10 | − | − | − | + | +
11 | + | + | + | − | −
12 | + | + | − | + | −
13 | + | + | − | − | +
14 | − | + | + | + | −
15 | − | + | + | − | +
16 | − | − | + | + | +
17 | + | − | + | + | −
18 | + | − | − | + | +
19 | + | − | + | − | +
20 | − | + | − | + | +
21 | + | + | + | + | −
22 | + | + | + | − | +
23 | + | + | − | + | +
24 | + | − | + | + | +
25 | − | + | + | + | +
26 | + | + | + | + | +
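The performance criteria of Section 2.3.3 (SSE, MSE, ME, Pearson's r, and the mean and SE over replicate runs) can be computed as in the Python sketch below. This is an illustrative sketch only: the function names and example numbers are invented, and the SE is assumed here to be the sample standard deviation divided by the square root of the number of replications, which the text does not state explicitly.

```python
import math
import statistics


def sse(computed, measured):
    """Sum of square error, Eq. (6)."""
    return sum((c - m) ** 2 for c, m in zip(computed, measured))


def mse(computed, measured):
    """Mean square error, Eq. (7)."""
    return sse(computed, measured) / len(measured)


def mean_error(computed, measured):
    """Mean error, Eq. (8); positive values indicate overestimation."""
    return sum(c - m for c, m in zip(computed, measured)) / len(measured)


def pearson_r(computed, measured):
    """Pearson's correlation between computed and measured outputs."""
    mean_c = statistics.mean(computed)
    mean_m = statistics.mean(measured)
    cov = sum((c - mean_c) * (m - mean_m) for c, m in zip(computed, measured))
    var_c = sum((c - mean_c) ** 2 for c in computed)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_c * var_m)


def mean_and_se(values):
    """Mean and standard error across replicate runs (assumed: sample SD / sqrt(n))."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


# Made-up dosages (mg/kg): network outputs versus the doses actually injected.
computed = [12.0, 28.0, 51.0]
measured = [10.0, 30.0, 50.0]
example_sse = sse(computed, measured)
example_mse = mse(computed, measured)
example_me = mean_error(computed, measured)
```

In this toy example the errors are 2, −2 and 1 mg/kg, so SSE is 9, MSE is 3 and ME is 1/3; positive and negative errors partly cancel in ME, which is why it is reported alongside MSE rather than instead of it.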

2.3.4. Sensitivity analysis

Selecting the most influential input vectors helps to develop a strong and accurate model that is able to produce the desired outputs (ASCE Task Committee, 2000). Different methods have been suggested for understanding the relative importance and influence of the input variables on outputs. Sensitivity analysis was applied in this study to trim the less important input vector(s) (Maier and Dandy, 1996; Dogan et al., 2008), following the procedure proposed by Sahoo et al. (2006). Different combinations of input variables (26 scenarios; Table 1) were created by omitting 1–3 variables from the five-variable input vector (25 reduced sets) and adding the complete set (scenario 26). The scenario that resulted in the highest model performance was selected as the best combination of input variables for that particular output variable.
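The scenario count can be checked by enumeration: keeping 2, 3, 4 or all 5 of the five biomarkers gives 10 + 10 + 5 + 1 = 26 input sets. The Python sketch below generates them; the function name is hypothetical, and the order in which `itertools.combinations` yields the sets is not the scenario numbering of Table 1.

```python
from itertools import combinations

BIOMARKERS = ("GST", "7,8D BaP", "1-OH BaP", "3-OH BaP", "BaP")


def build_scenarios(variables=BIOMARKERS, max_omitted=3):
    """Return every input set obtained by omitting 0 to max_omitted variables."""
    scenarios = []
    for n_kept in range(len(variables) - max_omitted, len(variables) + 1):
        scenarios.extend(combinations(variables, n_kept))
    return scenarios


scenarios = build_scenarios()
```

In a full sensitivity analysis, each of these input sets would be used to retrain and test the network, and the set giving the best test performance for a given output variable would be retained.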

3. Results

Overall descriptive statistics of the input variables (selected biomarkers) in i.p.- and i.m.-injected C. gariepinus are presented in Table 2. Compared to data scaling, log transformation resulted in faster training of the models and generally better model performance (e.g., MLP, 5-8-3; Table 3). Different trials revealed that a three-layer MLP neural network with eight neurons in the hidden layer (5-8-3) had the highest performance among the MLP architectures tested. An RBF model with an average of 34 neurons in the hidden layer reached the desired SSE_train value but showed higher variation of performance criteria (i.e., SE) among model replications in the testing dataset than the other models (Table 3). If SE values are not considered, both MLP and RBF neural networks resulted in acceptable performance for prediction of dosage and route of exposure. However, the MLP neural network provided stronger r_test values and smaller MSE_test and ME_test values (Table 3). Among the ANN models, only the MLP model produced acceptable prediction of injection (exposure) time. The weaker performance of the linear network relative to the MLP and RBF models was demonstrated by lower r_test values (Table 3), especially for the prediction of time and route of exposure.

Sensitivity analysis was applied to determine the influence of each biomarker on the output variables of the MLP neural network, the best performing ANN model. For brevity, only the results for the best performing scenarios, with a few other scenarios for comparison, are presented in Table 4. Prediction of dosage of BaP injection showed the highest system performance with scenario 15, which relied on biliary concentrations of BaP and its two metabolites 1-OH BaP and 7,8D BaP (Table 4). The best performance for prediction of time of injection was achieved by scenario 20, using biliary concentrations of BaP and its two metabolites 3-OH BaP and 7,8D BaP. In contrast, for predicting the route of exposure (i.m. vs. i.p. injection), all five of the selected biomarkers contributed (scenario 26); removing any of the input variables decreased the predictive precision of the system (Table 4).

Table 2. Basic statistics of GST activities (mmol/min/mg protein) and concentrations (mM) of bile metabolites after intramuscular (i.m.) or intraperitoneal (i.p.) injection in all the treatments; n = 48.

Biomarker | Injection method | Min | Max | Mean | SE
GST | i.p. | 0.86 | 4.08 | 2.33 | 0.13
GST | i.m. | 1.37 | 5.66 | 2.63 | 0.17
7,8D BaP | i.p. | 2.1 | 180.37 | 68.06 | 9.69
7,8D BaP | i.m. | 1.58 | 102.95 | 18.64 | 3.79
1-OH BaP | i.p. | 0 | 861.27 | 150.42 | 44.37
1-OH BaP | i.m. | 0 | 47.83 | 10.73 | 2.76
3-OH BaP | i.p. | 26.1 | 1034.1 | 198.34 | 47.75
3-OH BaP | i.m. | 25.04 | 141.68 | 56.82 | 5.59
BaP | i.p. | 5.49 | 16.68 | 9.17 | 0.45
BaP | i.m. | 5.05 | 13.59 | 8.61 | 0.46

Table 3. Summary of the performance parameters of the RBF, selected MLP (5-8-3, after data scaling or log transformation) and linear ANNs. Data are shown as mean (±SE), n = 30. ME: mean error; MSE: mean square error; r: Pearson's correlation; Time: time of injection; Dosage: dose of injection; Route: route of exposure (i.p. and i.m. injection).

Model | Set | ME Route | ME Time | ME Dosage | MSE Route | MSE Time | MSE Dosage | r Route | r Time | r Dosage
RBF | Train | 3.66×10^−4 (5×10^−4) | 1.07×10^−4 (0.1×10^−4) | 1.2×10^−4 (8.7×10^−5) | 5.6×10^−6 (2.8×10^−6) | 7.21×10^−8 (3.5×10^−8) | 4.9×10^−8 (2.8×10^−8) | 1 (5.7×10^−6) | 1 (0) | 1 (0)
RBF | Test | 0.106 (0.32) | −4.36 (3.85) | −0.769 (1.49) | 2.54 (2.21) | 4.78 (5.61) | 4.89 (2.26) | 0.73 (0.05) | 0.595 (0.07) | 0.721 (0.06)
MLP (log transformation) | Train | −0.009 (0.005) | 1.17 (0.78) | −0.06 (0.97) | 0.008 (0.005) | 9.37 (4.36) | 8.86 (3.81) | 0.984 (0.01) | 0.907 (0.04) | 0.92 (0.03)
MLP (log transformation) | Test | −0.01 (0.01) | 1.08 (1.24) | −0.22 (1.45) | 0.05 (0.01) | 2.38 (0.537) | 1.95 (0.478) | 0.91 (0.02) | 0.754 (0.05) | 0.784 (0.05)
MLP (scaling) | Train | −0.027 (0.005) | 0.175 (0.032) | −0.349 (0.063) | 0.03 (0.005) | 12.52 (2.42) | 24.01 (13.51) | 0.945 (0.172) | 0.925 (0.168) | 0.931 (0.17)
MLP (scaling) | Test | −0.023 (0.004) | 0.28 (0.51) | 0.544 (0.099) | 0.072 (0.013) | 6.68 (11.07) | 2.92 (35.1) | 0.864 (0.157) | 0.667 (0.121) | 0.771 (0.14)
Linear | Train | −9.4×10^−7 (5.6×10^−5) | −1.5×10^−4 (1.9×10^−4) | 6.2×10^−6 (7.4×10^−5) | 0.16 (0.003) | 3.05 (0.074) | 1.5 (0.319) | 0.58 (0.01) | 0.499 (0.01) | 0.762 (0.006)
Linear | Test | 0.06 (0.038) | 1.13 (1.22) | 0.992 (0.71) | 0.38 (0.073) | 1.99 (0.097) | 5.71 (1.42) | 0.25 (0.03) | 0.28 (0.04) | 0.653 (0.02)

Table 4. Sensitivity analysis of the selected MLP neural network. Performance parameters for some of the scenarios (scenarios 21–25) and the optimum scenario for each of the output variables (scenarios 15, 20 and 26 for dosage, time and route of exposure, respectively). Data are shown as mean (±SE), n = 30. Column labels as in Table 3.

Scenario | Set | ME Route | ME Time | ME Dosage | MSE Route | MSE Time | MSE Dosage | r Route | r Time | r Dosage
21 | Train | −0.021 (0.014) | 0.08 (0.537) | 0.04 (0.976) | 0.028 (0.015) | 1.47 (0.52) | 5.17 (0.414) | 0.947 (0.02) | 0.842 (0.051) | 0.915 (0.048)
21 | Test | −0.062 (0.027) | 0.201 (1.148) | −0.159 (1.174) | 0.081 (0.025) | 3.42 (0.664) | 2.01 (0.566) | 0.852 (0.048) | 0.642 (0.07) | 0.785 (0.057)
22 | Train | −0.032 (0.013) | 1.51 (0.74) | 0.43 (0.87) | 0.039 (0.016) | 1.15 (0.477) | 6.22 (0.478) | 0.93 (0.028) | 0.884 (0.048) | 0.91 (0.05)
22 | Test | −0.016 (0.017) | 2.53 (1.56) | 0.198 (1.56) | 0.09 (0.017) | 2.9 (0.748) | 2.17 (0.613) | 0.82 (0.035) | 0.699 (0.077) | 0.79 (0.05)
23 | Train | −0.016 (0.006) | 0.505 (0.32) | −2.18 (1.16) | 0.016 (0.006) | 0.828 (0.386) | 11.58 (0.521) | 0.970 (0.012) | 0.912 (0.04) | 0.817 (0.048)
23 | Test | −0.029 (0.016) | 0.518 (0.92) | −2.86 (1.32) | 0.072 (0.014) | 2.04 (0.459) | 2.55 (0.548) | 0.86 (0.02) | 0.792 (0.04) | 0.706 (0.05)
24 | Train | −0.007 (0.007) | 0.101 (0.331) | −0.261 (0.96) | 0.007 (0.007) | 0.756 (0.371) | 9.815 (0.38) | 0.937 (0.012) | 0.919 (0.039) | 0.905 (0.033)
24 | Test | 0.001 (0.011) | 0.053 (1.338) | −0.786 (1.42) | 0.088 (0.021) | 2.85 (0.678) | 3.15 (0.547) | 0.819 (0.044) | 0.683 (0.077) | 0.715 (0.05)
25 | Train | −0.0034 (0.009) | 0.313 (0.472) | −0.05 (0.88) | 0.0252 (0.014) | 0.537 (0.308) | 1.30 (0.431) | 0.951 (0.02) | 0.932 (0.03) | 0.945 (0.04)
25 | Test | 0.031 (0.01) | −0.177 (0.96) | −0.076 (1.17) | 0.084 (0.019) | 1.19 (0.439) | 1.82 (0.475) | 0.845 (0.034) | 0.783 (0.048) | 0.82 (0.044)
15 | Train | – | – | −0.021 (0.505) | – | – | 3.86 (0.358) | – | – | 0.947 (0.037)
15 | Test | – | – | −0.132 (0.866) | – | – | 1.44 (0.466) | – | – | 0.837 (0.05)
20 | Train | – | 0.195 (0.679) | – | – | 0.923 (0.384) | – | – | 0.901 (0.039) | –
20 | Test | – | −0.33 (1.108) | – | – | 2.01 (0.474) | – | – | 0.803 (0.044) | –
26 | Train | −0.009 (0.005) | – | – | 0.008 (0.005) | – | – | 0.984 (0.01) | – | –
26 | Test | −0.01 (0.01) | – | – | 0.05 (0.01) | – | – | 0.91 (0.02) | – | –

4. Discussion

One of the main applications of biomarkers in fish is the detection and quantification of environmental contaminants to which higher levels of the ecosystem have been exposed. In this study, the dosages of injected BaP were predicted with high precision (ME_test associated with the MLP model was just −0.13 mg/kg; scenario 15, Table 4). This study also demonstrated the power of ANN in predicting the route of exposure in fish biomarker studies, since small differences were observed between the computed and measured injection methods (Table 4). Route of exposure strongly influences fish biomarkers (James and Bend, 1980; Jonsson et al., 2004; Karami et al., 2011). Though natural routes of contaminant exposure (i.e., food, water, sediment) were not examined in the present study, the ability of ANN to differentiate different exposure routes in the lab (i.m. vs. i.p. injection) is encouraging. Follow-up studies should examine the ability of ANN to differentiate natural routes of exposure.

The ability to forecast the time at which fish were exposed to a contaminant has a vital role to play in mitigation and management decisions. In this study, consideration of selected biomarkers resulted in close agreement between the computed and measured time of exposure (ME_test associated with the MLP model = −0.33 h; scenario 20, Table 4), suggesting potential of ANNs to determine how long aquatic organisms have been carrying particular contaminants.

Results of the present study indicate different biomarker requirements (input variables) for predicting each parameter of interest (output variable). Injection method was the most dependent output variable, since removing any of the input variables worsened the model performance. Some studies with fish have shown GST activity to have strong potential as a biomarker (e.g., Shailaja and D'Silva, 2003; Lu et al., 2009) though other studies have not (e.g., Best et al., 2002; Mikula et al., 2009). Gagné et al. (2009) showed, by applying an ANN model, highest power of GST activity among several biomarkers for predicting clam (Mya arenaria) population characteristics. The present study revealed weak importance of GST activity in predicting dose and time of exposure to BaP through ANN modeling approaches for the freshwater fish C. gariepinus (scenarios 15 and 20; Table 4). It could indicate that bivalves rely more on xenobiotic conjugation than hydroxylation mechanisms as compared to fish. In addition to removing GST activity from the list of input variables, removing one of the phenolic derivatives (1-OH BaP or 3-OH BaP) was also required. This may suggest that accurate prediction of time and dose of exposure requires relatively uncorrelated input variables. In addition to being an important compound to detect biologically, our results suggest that 7,8D BaP can be used to predict the dose, time and route of exposure because its presence was necessary for accurate prediction of all three output variables. This metabolite is the penultimate carcinogenic compound which, during the last biotransformation step, may be converted into the ultimate carcinogenic metabolite, 7,8-diol-9,10-epoxide BaP (Jung et al., 2009).

A central component of environmental risk assessment (ERA) processes is the measurement and prediction of contaminant levels in environmental compartments (Van der Oost et al., 2003). Numerous studies have investigated the influences of various environmental contaminants on fish biomarkers but none has applied ANNs to predict the time and route of contaminant exposure. However, ANN has been applied to other aspects of fish biology. Adams et al. (1996) used ANN to predict growth rate of wild sunfish (Mola mola) from EROD activity (which contributed little), fish biomass, invertebrate biomass and stream depth.
In another study, an ANN architecture with one hidden layer and a back-propagation algorithm was able to correctly classify 70% of dab (Limanda limanda) into "tumor" or "no abnormality detected" groups based on their plasma proteomic profile (Ward et al., 2006).

Performance of ANN models is task-dependent. For example, Mateo et al. (2011) concluded that RBF neural networks outperformed MLP neural networks in forecasting deoxynivalenol accumulation in barley seeds, and Kişi (2009) found acceptable performance by both MLP and RBF neural networks in estimation of daily pan evaporation. Therefore, ecotoxicological studies, including consideration of different types of contaminants, should consider a range of different ANN models.

In ecotoxicological studies, biological responses to stress are often non-linear (U- or J-shaped) (Calabrese et al., 2007). Previous studies have also shown that a comprehensive suite (battery) of biomarkers is most powerful in detecting the presence and harmful effects of contaminants (e.g., Forbes et al., 2006; Codi King et al., 2011). These two features have complicated modeling of ecotoxicological studies. ANN is particularly well suited to addressing these attributes because of its capacity to go beyond linearity and to use large sets of input variables. ANNs are designed to handle large, "noisy" datasets and perform better as sample size increases. In the present study, relatively higher SE values showed the weaker ability of RBF than MLP networks in handling data with small sample size and/or high inter-individual variation. Due to genetic and physiological differences among individual fish, high inter-individual variation is common in physiological responses to environmental contaminants (Christiansen et al., 1998; Schlenk, 1999; George et al., 2004; Webb and Gagnon, 2009). This study demonstrated acceptable performance of MLP networks even with small sample size, though long training time was required.

Classical statistical methods such as multiple regression or correlation coefficients are among other tools used in modeling studies (e.g., Brian et al., 2005). While they are well established, they may be too general, may require more input data and may miss some relations between the input and output variables (Palani et al., 2008). To our knowledge, this study was the first attempt to apply ANNs to the study of the behavior of environmental contaminants in fish through biomarkers, and the results are promising. ANN modeling offers the potential of extracting more information from biomarkers measured in wild-sampled fish, including when fish were exposed to contaminants, the concentration of those contaminants and the route through which fish were exposed. Because different contaminants behave differently in fish, and there are differences in how different species of fish react to contaminants, next steps should include the development of ANN models for single contaminants and mixtures in species of interest. The models developed will then add a great deal to field monitoring programs.


5. Conclusion

The results of this study indicated the potential of ANN models to predict the method, time and dosage of BaP exposure in C. gariepinus. Among the developed ANN models, the MLP model proved superior in forecasting performance. Sensitivity analysis revealed that the dosage of BaP injection was better predicted when 7,8D BaP, 1-OH BaP and biliary BaP were considered as the input variables than when all the input variables were considered. Time of BaP injection was best predicted by inclusion of 7,8D BaP, 3-OH BaP and biliary BaP. However, all input variables were required to optimally predict method of BaP injection (i.m. vs. i.p.). Among the biomarkers, 7,8D BaP proved the most effective input variable to predict the method, time and dosage of BaP injection.

Acknowledgments

Special thanks go to Dr. Shinji Fukuda for his helpful comments on an earlier version of this paper. The authors appreciate the helpful arrangements made by Prof. Mad Nasir Shamsudin and Prof. Ahmad Makmom b. Hj Abdullah for HPLC utilization, and thank Dr. Majid Masoumian for his helpful comments during HPLC analyses. Special thanks must also go to Mr. Jasni bin md. Yusoff, the manager of Puchong's aquaculture research station.

References

Adams, S., Jaworska, J., Ham, K., 1996. Influence of ecological factors on the relationship between MFO induction and fish growth: bridging the gap using neural networks. Mar. Environ. Res. 42, 197–201.
Almasri, M., Kaluarachchi, J., 2005. Modular neural networks to predict the nitrate distribution in ground water using the on-ground nitrogen loading and recharge data. Environ. Model. Software 20, 851–871.
Arbib, M.A.E., 2003. The Handbook of Brain Theory and Neural Networks. A Bradford Book/The MIT Press.
ASCE Task Committee, 2000. Artificial neural networks in hydrology. II: hydrologic applications. J. Hydrol. Eng. 5, 124–137.
Bayar, S., Demir, I., Engin, G.O., 2009. Modeling leaching behavior of solidified wastes using back-propagation neural networks. Ecotoxicol. Environ. Saf. 72, 843–850.
Best, J.H., Pflugmacher, S., Wiegand, C., Eddy, F.B., Metcalf, J.S., Codd, G.A., 2002. Effects of enteric bacterial and cyanobacterial lipopolysaccharides, and of microcystin-LR, on glutathione S-transferase activities in zebra fish (Danio rerio). Aquat. Toxicol. 60, 223–231.
Birikundavyi, S., Labib, R., Trung, H., Rousselle, J., 2002. Performance of neural networks in daily streamflow forecasting. J. Hydrol. Eng. 7, 392.
Bishop, C.M., 2009. Neural networks and their applications. Rev. Sci. Instrum. 65, 1803–1832.
Bowers, J., 2009. On the biological plausibility of grandmother cells: implications for neural network theories in psychology and neuroscience. Psychol. Rev. 116, 220–251.
Brian, J.V., Harris, C.A., Scholze, M., Backhaus, T., Booy, P., Lamoree, M., Pojana, G., Jonkers, N., Runnalls, T., Bonfa, A., Marcomini, A., Sumpter, J.P., 2005. Accurate prediction of the response of freshwater fish to a mixture of estrogenic chemicals. Environ. Health Perspect. 113, 721–728.
Budka, M., Gabrys, B., Ravagnan, E., 2010. Robust predictive modelling of water pollution using biomarker data. Water Res. 44, 3294–3308.
Cabreira, A.G., Tripode, M., Madirolas, A., 2009. Artificial neural networks for fish-species identification. ICES J. Mar. Sci. 66, 1119–1129.
Calabrese, E.J., Bachmann, K.A., Bailer, A.J., Bolger, P.M., Borak, J., Cai, L., Cedergreen, N., Cherian, M.G., Chiueh, C.C., Clarkson, T.W., 2007. Biological stress response terminology: integrating the concepts of adaptive response and preconditioning stress within a hormetic dose–response framework. Toxicol. Appl. Pharmacol. 222, 122–128.
Chakraborty, K., Mehrotra, K., Mohan, C., Ranka, S., 1992. Forecasting the behavior of multivariate time series using neural networks. Neural Networks 5, 961–970.
Christiansen, L.B., Pedersen, K.L., Korsgaard, B., Bjerregaard, P., 1998. Estrogenicity of xenobiotics in rainbow trout (Oncorhynchus mykiss) using in vivo synthesis of vitellogenin as a biomarker. Mar. Environ. Res. 46, 137–140.
Codi King, S., Conwell, C., Haasch, M., Mondon, J., Mueller, J., Zhu, S., Howitt, L., 2011. Field evaluation of a suite of biomarkers in an Australian tropical reef species, Stripey Seaperch (Lutjanus carponotatus): assessment of produced water from the Harriet a platform. In: Lee, K., Neff, J. (Eds.), Produced Water: Environmental Risks and Advances in Mitigation Technologies, Springer, New York, pp. 261–294.
Cucchetti, A., Piscaglia, F., Grigioni, A.D.E., Ravaioli, M., Cescon, M., Zanello, M., Grazi, G.L., Golfieri, R., Grigioni, W.F., Pinna, A.D., 2010. Preoperative prediction of hepatocellular carcinoma tumour grade and micro-vascular invasion by means of artificial neural network: a pilot study. J. Hepatol. 52, 880–888.
Devillers, J., 2009. Artificial neural network modeling of the environmental fate and ecotoxicity of chemicals. In: Devillers, J. (Ed.), Ecotoxicology Modeling, Springer, pp. 1–28.
Dogan, E., Ates, A., Yilmaz, E., Eren, B., 2008. Application of artificial neural networks to estimate wastewater treatment plant inlet biochemical oxygen demand. Environ. Prog. 27, 439–446.
Doong, R., Lin, Y., 2004. Characterization and distribution of polycyclic aromatic hydrocarbon contaminations in surface sediment and water from Gao-ping River, Taiwan. Water Res. 38, 1733–1744.
El Tabach, E., Lancelot, L., Shahrour, I., Najjar, Y., 2007. Use of artificial neural network simulation metamodelling to assess groundwater contamination in a road project. Math. Comput. Model. 45, 766–776.
Er, M.J., Wu, S., Lu, J., Toh, H.L., 2002. Face recognition with radial basis function (RBF) neural networks. IEEE Trans. Neural Networks 13, 697–710.
Eski, I., Erkaya, S., Savas, S., Yildirim, S., 2011. Fault detection on robot manipulators using artificial neural networks. Robot. Comput-Int. Manuf. 27, 115–123.
Forbes, V.E., Palmqvist, A., Bach, L., 2006. The use and misuse of biomarkers in ecotoxicology. Environ. Toxicol. Chem. 25, 272–280.
Forster, M.R., 2000. Key concepts in model selection: performance and generalizability. J. Math. Psychol. 44, 205–231.
Friedrich, T., Oschlies, A., 2009. Neural network-based estimates of North Atlantic surface pCO2 from satellite data: a methodological study. J. Geophys. Res. 114, C03020.
Gagné, F., Blaise, C., 1997.
Predicting the toxicity of complex mixtures using artificial neural networks. Chemosphere 35, 1343–1363. Gagne´, F., Blaise, C., Pellerin, J., Fournier, M., Durand, M.J., Talbot, A., 2008. Relationships between intertidal clam population and health status of the soft-shell clam Mya arenaria in the St. Lawrence Estuary and Saguenay Fjord (Que´bec, Canada). Environ. Int. 34, 30–43. Gagne´, F., Blaise, C., Pellerin, J., Fournier, M., Gagnon, C., Sherry, J., Talbot, A., 2009. Impacts of pollution in feral Mya arenaria populations: the effects of clam bed distance from the shore. Sci. Total Environ. 407, 5844–5854. George, S., Gubbins, M., MacIntosh, A., Reynolds, W., Sabine, V., Scott, A., Thain, J., 2004. A comparison of pollutant biomarker responses with transcriptional responses in European flounders (Platicthys flesus) subjected to estuarine pollution. Mar. Environ. Res. 58, 571–575. Gibbs, J.P., Czerwinski, M., Slattery, J.T., 1996. Busulfan-glutathione conjugation catalyzed by human liver cytosolic glutathione S-transferases. Cancer Res. 56, 3678. Girosi, F., Poggio, T., 1990. Networks and the best approximation property. Biol. Cybern. 63, 169–176. Hartman, E.J., Keeler, J.D., Kowalski, J.M., 1990. Layered neural networks with Gaussian hidden units as universal approximations. Neural Comput. 2, 210–215. Haykin, S., 1999. Neural Networks: A Comprehensive Foundation, 2nd edition Prentice-Hall, New Jersy. Hellou, J., Payne, J.F., 1987. Assessment of contamination of fish by water-soluble fractions of petroleum: a role for bile metabolites. Environ. Toxicol. Chem. 6, 857–862. Hornik, K., Stinchcombe, M., White, H., 1990. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Networks 3, 551–560. Huberman, E., Sachs, L., Yang, S.K., Gelboin, V., 1976. Identification of mutagenic metabolites of benzo(a)pyrene in mammalian cells. Proc. Natl. Acad. Sci. USA 73, 607. Hubrecht, R., Kirkwood, J., 2010. 
The UFAW Handbook on the Care and Management of Laboratory and Other Research Animals, Wiley-Blackwell. James, M.O., Bend, J.R., 1980. Polycyclic aromatic hydrocarbon induction of cytochrome P-450-dependent mixed-function oxidases in marine fish. Toxicol. Appl. Pharmacol. 54, 117–133. Jonsson, G., Taban, I.C., Jorgensen, K.B., Sundt, R.C., 2004. Quantitative determination of de-conjugated chrysene metabolites in fish bile by HPLC-fluorescence and GC-MS. Chemosphere 54, 1085–1097. Jung, D., Cho, Y., Collins, L.B., Swenberg, J.A., Di Giulio, R.T., 2009. Effects of benzo[a]pyrene on mitochondrial and nuclear DNA damage in Atlantic killifish (Fundulus heteroclitus) from a creosote-contaminated and reference site. Aquat. Toxicol. 95, 44–51. Kaiser, K., Niculescu, S., Schuurmann, G., 1997. Feed forward backpropagation neural networks and their use in predicting the acute toxicity of chemicals to the fathead minnow. Water Qual. Res. J. Can. 32, 637–657. Karami, A., Christianus, A., Ishak, Z., Syed, M.A., Courtenay, S., 2011. The effects of intramuscular and intraperitoneal injections of benzo[a]pyrene on selected biomarkers in Clarias gariepinus. Ecotoxicol. Environ. Saf. 74, 1558–1566. Karunanithi, N., Whitley, D., Bovee, K., 1994. Neural networks for river flow prediction. J. Comput. Civ. Eng. 8, 201. Khan, J., Wei, J.S., Ringne´r, M., Saal, L.H., Ladanyi, M., Westermann, F., Berthold, F., Schwab, M., Antonescu, C.R., Peterson, C., 2001. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat. Med. 7, 673–679.

34

A. Karami et al. / Ecotoxicology and Environmental Safety 77 (2012) 28–34

¨ ., 2009. Daily pan evaporation modelling using multi-layer perceptrons and Kis-i, O radial basis neural networks. Hydrol. Processes 23, 213–223. Krogh, A., 2008. What are artificial neural networks? Nat. Biotechnol. 26, 195–197. Lu, G., Wang, C., Zhu, Z., 2009. The dose–response relationships for EROD and GST induced by polyaromatic hydrocarbons in Carassius auratus. Bull. Environ. Contam. Toxicol. 82, 194–199. Ma, J.J., Xu, Z.R., Shao, Q.J., Xu, J.Z., Hung, S.S.O., Hu, W.L., Zhuo, L.Y., 2008. Effect of dietary supplemental l carnitine on growth performance, body composition and antioxidant status in juvenile black sea bream, Sparus macrocephalus. Aquacult. Nutr. 14, 464–471. Maier, H.R., Dandy, G.C., 1996. The use of artificial neural networks for the prediction of water quality parameters. Water Resour. Res. 32, 1013–1022. Mateo, F., Gadea, R., Mateo, E.M., Jime´nez, M., 2011. Multilayer perceptron neural networks and radial-basis function networks as tools to forecast accumulation of deoxynivalenol in barley seeds contaminated with Fusarium culmorum. Food Control 22, 88–95. Meng, Y., Lin, B.-L., 2008. A feed-forward artificial neural network for prediction of the aquatic ecotoxicity of alcohol ethoxylate. Ecotoxicol. Environ. Saf. 71, 172–186. Mikula, P., Blahova, J., Kruzikova, K., Havelkova, M., Nemethova, D., Hulak, M., Svobodova, Z., 2009. Effects of the herbicide LASSO MTX (alachlor 42% W/V) on biometric parameters and liver biomarkers in the common carp (Cyprinus carpio). Pestic. Biochem. Physiol. 93, 13–17. Monfared, M., Rastegar, H., Kojabadi, H., 2009. A new strategy for wind speed forecasting using artificial intelligent methods. Renew. Energy 34, 845–848. Ortiz-Delgado, J.B., Segner, H., Arellano, J.M., Sarasquete, C., 2007. Histopathological alterations, EROD activity, CYP1A protein and biliary metabolites in gilthead seabream Sparus aurata exposed to benzo(a)pyrene. Histol. Histopathol. 22, 417–432. Palani, S., Liong, S.Y., Tkalich, P., 2008. 
An ANN application for water quality forecasting. Mar. Pollut. Bull. 56, 1586–1597. Peakall, D., 1994. The role of biomarkers in environmental assessment (1). Introduction. Ecotoxicology 3, 157–160. ¨ Pettersson, M., Adolfsson-Erici, M., Parkkonen, J., Forlin, L., Asplund, L., 2006. Fish bile used to detect estrogenic substances in treated sewage water. Sci. Total Environ. 366, 174–186. Rankovic, V., Radulovic, J., Radojevic, I., Ostojic, A., Comic, L., 2010. Neural network modeling of dissolved oxygen in the Gruza reservoir, Serbia. Ecol. Model. 221, 1239–1244. Robotham, H., Bosch, P., Gutie´rrez-Estrada, J., Castillo, J., Pulido-Calvo, I., 2010. Acoustic identification of small pelagic fish species in Chile using support vector machines and neural networks. Fish. Res. 102, 115–122.

Rojek, I., 2008. Neural networks as prediction models for water intake in water supply system. Lect. Notes. Artif. Intell., 1109–1119. Sahoo, G., Ray, C., Mehnert, E., Keefer, D., 2006. Application of artificial neural networks to assess pesticide contamination in shallow groundwater. Sci. Total Environ. 367, 234–251. Schlenk, D., 1999. Necessity of defining biomarkers for use in ecological risk assessments. Mar. Pollut. Bull. 39, 48–53. Shailaja, M.S., D’Silva, C., 2003. Evaluation of impact of PAH on a tropical fish, Oreochromis mossambicus using multiple biomarkers. Chemosphere 53, 835–841. Singh, K., Basant, A., Malik, A., Jain, G., 2009. Artificial neural network modeling of the river water quality—a case study. Ecol. Model. 220, 888–895. Smyser, C., Inder, T., Shimony, J., Hill, J., Degnan, A., Snyder, A., Neil, J., 2010. Longitudinal analysis of neural network development in preterm infants. Cereb. Cortex 20, 2852–2862. Soclo, H.H., Garrigues, P.H., Ewald, M., 2000. Origin of polycyclic aromatic hydrocarbons (PAHs) in coastal marine sediments: case studies in Cotonou (Benin) and Aquitaine (France) areas. Mar. Pollut. Bull. 40, 387–396. Sun, Y., Yin, Y., Zhang, J., Yu, H., Wang, X., 2007. Bioaccumulation and ROS generation in liver of freshwater fish, goldfish Carassius auratus under HC Orange No. 1 exposure. Environ. Toxicol. 22, 256–263. Van der Oost, R., Beyer, J., Vermeulen, N.P.E., 2003. Fish bioaccumulation and biomarkers in environmental risk assessment: a review. Environ. Toxicol. Pharmacol. 13, 57–149. Ward, D.G., Wei, W., Cheng, Y., Billingham, L.J., Martin, A., Johnson, P.J., Lyons, B.P., Feist, S.W., Stentiford, G.D., 2006. Plasma proteome analysis reveals the geographical origin and liver tumor status of Dab (Limanda limanda) from UK marine waters. Environ. Sci. Technol. 40, 4031–4036. Webb, D., Gagnon, M.M., 2009. The value of stress protein 70 as an environmental biomarker of fish health under field conditions. Environ. Toxicol. 24, 287–295. 
Wilamowski, B., Iplikci, S., Kaynak, O., Efe, M., 2001. An Algorithm for Fast Convergence in Training Neural Networks. Citeseer (pp. 1778–1782). Wislocki, P.G., Wood, A.W., Chang, R.L., Levin, W., Yagi, H., Hernandez, O., Jerina, D.M., Conney, A.H., 1976. High mutagenicity and toxicity of a diol epoxide derived from benzo[a]pyrene. Biochem. Biophys. Res. Commun. 68, 1006–1012. Yan, S., Wu, G., 2010. Prediction of mutation positions in H5N1 neuraminidases from Influenza A Virus by means of neural network. Ann. Biomed. Eng. 38, 984–992. Zhao, Y., Nan, J., Cui, F., Guo, L., 2007. Water quality forecast through application of BP neural network at Yuqiao reservoir. J. Zhejiang Univ. Sci. A 8, 1482–1487. Zini, G., 2005. Artificial intelligence in hematology. Hematology 10, 393–400.