Evaluating potential refinements to existing Threshold of Toxicological Concern (TTC) values for environmentally-relevant compounds


Journal Pre-proof

Mark D. Nelms, Prachi Pradeep, Grace Patlewicz

PII: S0273-2300(19)30269-7
DOI: https://doi.org/10.1016/j.yrtph.2019.104505
Reference: YRTPH 104505
To appear in: Regulatory Toxicology and Pharmacology
Received Date: 20 August 2019
Revised Date: 12 October 2019
Accepted Date: 15 October 2019

Please cite this article as: Nelms, M.D., Pradeep, P., Patlewicz, G., Evaluating potential refinements to existing Threshold of Toxicological Concern (TTC) values for environmentally-relevant compounds, Regulatory Toxicology and Pharmacology (2019), doi: https://doi.org/10.1016/j.yrtph.2019.104505. This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain. © 2019 Published by Elsevier Inc.

Mark D. Nelms (a,b), Prachi Pradeep (a,b), and Grace Patlewicz (b,*)

(a) Oak Ridge Institute for Science and Education, Oak Ridge, TN 37830, USA
(b) Center for Computational Toxicology & Exposure (CCTE), U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC 27709, USA

(*) Corresponding author: Grace Patlewicz. Address: Center for Computational Toxicology & Exposure (CCTE), US EPA, 109 TW Alexander Dr, RTP, NC 27711, USA
Tel: +1 919 541 1540 Email: [email protected]

Running title: Refining TTC values for environmentally-relevant substances

Word counts: Abstract (199), Text (5949), References (1007)

Abstract

The Toxic Substances Control Act (TSCA) mandates that the US EPA perform risk-based prioritisation of chemicals in commerce and then, for high-priority substances, develop risk evaluations that integrate toxicity data with exposure information. One approach being considered for data-poor chemicals is the Threshold of Toxicological Concern (TTC). Here, TTC values derived using oral (sub)chronic No Observable (Adverse) Effect Level (NO(A)EL) data from the EPA’s Toxicity Values database (ToxValDB) were compared with published TTC values from Munro et al. (1996). A total of 4554 chemicals with structures present in ToxValDB were assigned to their respective TTC categories using the Toxtree software tool, of which toxicity data were available for 1304 substances. The TTC values derived from ToxValDB were similar, but not identical, to the Munro TTC values: Cramer I (37.3 (ToxValDB) cf. 30 (Munro) µg/kg-day), Cramer II (34.6 cf. 9.1 µg/kg-day) and Cramer III (3.9 cf. 1.5 µg/kg-day). Only the Cramer III 5th percentile values were found to be statistically different. Chemical features of the two Cramer III datasets were evaluated to account for the differences. TTC values derived from this expanded dataset substantiated the original TTC values, reaffirming the utility of TTC as a promising tool in a risk-based prioritisation approach.

Keywords

Threshold of Toxicological Concern (TTC); Toxicity Values database (ToxValDB); Toxtree; risk-based prioritisation

Highlights

• Substances present in ToxValDB were assigned to their respective TTC categories
• Used ToxValDB toxicity values to derive new Cramer TTC values
• Evaluated whether the Cramer TTC values derived from the ToxValDB and Munro datasets were statistically equivalent
• Compared and contrasted the chemistry of the two datasets to rationalise any (dis)similarities in TTC values
• Study provides increased confidence in the existing TTC values based on the Munro dataset

Abbreviations

acetylcholinesterase inhibitors (AChE inhibitors); cumulative distribution functions (CDFs); European Food Safety Authority (EFSA); US Environmental Protection Agency (EPA); European Chemicals Agency (ECHA); US Food and Drug Administration (FDA); high-throughput exposure (HTE); Kolmogorov-Smirnov (K-S); lowest-observed (adverse) effect levels (LO(A)ELs); no-observed (adverse) effect levels (NO(A)ELs); odds ratio (OR); point of departure (POD); SMILES arbitrary target specification (SMARTS); Simplified Molecular-Input Line-Entry System (SMILES); Toxic Substances Control Act (TSCA); Threshold of Toxicological Concern (TTC); Toxicity Values database (ToxValDB); World Health Organisation (WHO)

1. Introduction

The Toxic Substances Control Act (TSCA) mandates that the US Environmental Protection Agency (EPA) perform risk-based prioritisation of chemicals in commerce and then, for high-priority substances, develop risk evaluations that integrate toxicity data with exposure information (EPA, 2008). For chemicals with limited chemical-specific toxicity data, one approach being considered is a Threshold of Toxicological Concern (TTC)-to-Exposure ratio. In an earlier manuscript (Patlewicz et al., 2018), a proof-of-concept study using a dataset of 7986 substances was undertaken to integrate TTC with heuristic high-throughput exposure (HTE) modelling to rank-order chemicals for further evaluation. In this study, we sought to evaluate whether the established TTC values used in Patlewicz et al. (2018) were applicable to the types of chemicals of interest to EPA by analysing an expanded toxicity dataset.

The TTC approach establishes different levels of human exposure below which there is expected to be a low probability of risk to human health (Kroes et al., 2004; WHO/EFSA, 2016; EFSA, 2019). Kroes et al. (2004) presented a tiered TTC approach that established several human exposure thresholds spanning several orders of magnitude, from 0.0025 µg/kg-day to 30 µg/kg-day. The exposure limit established for each TTC tier was based on an evaluation of existing toxicity data for chemicals in that tier. It should be noted that the TTC approach was initially developed for use in specific cases where exposure is expected to be low and where no or limited hazard data are available. Moreover, certain chemicals are excluded from the TTC approach because they were not represented in the original toxicity databases supporting TTC (e.g. metals or metal-containing compounds, organosilicons, proteins) or because standard risk assessment approaches are more appropriate (e.g. 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) and its analogues, and high-potency carcinogens such as N-nitroso compounds).

The lowest TTC tier is 0.0025 µg/kg-day, which applies to substances that raise a concern for genotoxicity on the basis of structural alerts for genotoxicity/mutagenicity. For substances without such structural alerts, there is a series of non-cancer TTC tiers based on the Cramer et al. (1978) decision tree. Derivation of TTC values for each of the three Cramer Classes stems from the work of Munro et al. (1996). Munro and colleagues compiled a database of NOELs for 613 substances that had been tested in repeat-dose oral toxicity studies, including subchronic, chronic, reproductive and developmental toxicity studies. In cases where there were multiple NOELs for a given substance, the lowest one was selected (there were a total of 2941 NOELs for the 613 substances). The substances were then assigned to the appropriate Cramer structural class, and cumulative distributions of the logarithms of NOELs were plotted separately for each structural class. Adjustments were made to extrapolate subchronic NOELs to chronic, and LOELs to NOELs, as appropriate. The 5th percentile NOEL was estimated for each structural class and then converted into its respective TTC value by applying a safety factor of 100 (10X to account for extrapolation from animals to humans and 10X for human variability). The TTC values established were 30 µg/kg-day for Cramer Class I, 9 µg/kg-day for Cramer Class II, and 1.5 µg/kg-day for Cramer Class III substances. Kroes et al. (2004) evaluated whether chemicals shown to be neurotoxicants, immunotoxicants and teratogens needed to be considered as a separate category. They concluded that, with the exception of organophosphate pesticides (OPs) and carbamates, such substances were adequately represented by the TTC approach for systemic toxicity endpoints. A TTC value of 0.3 µg/kg-day was derived for organophosphates and carbamates. NOELs for the OPs and carbamates were not subsequently removed from the original Cramer Class III distribution, and a number of publications have suggested that this distribution should be re-evaluated without these substances. Indeed, Munro et al. (2008) suggested that the new limit for Cramer Class III would be at least 3 µg/kg-day if OPs were removed and an even higher level of 10 µg/kg-day if both OPs and organohalogen compounds were removed; however, neither of these refined values has yet been adopted in practice.
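
As a worked illustration of the conversion described above, the relationship between a 5th percentile NOEL and its TTC value can be written out explicitly. The symbol $\mathrm{NOEL}_{5\%}$ is notation introduced here for convenience, and the 5th percentile NOELs shown are simply back-calculated from the TTC values quoted in the text rather than taken from the cited papers:

$$\mathrm{TTC} = \frac{\mathrm{NOEL}_{5\%}}{10 \times 10} = \frac{\mathrm{NOEL}_{5\%}}{100}$$

$$\text{Cramer Class I: } \mathrm{NOEL}_{5\%} = 100 \times 30\ \text{µg/kg-day} = 3\ \text{mg/kg-day}$$

$$\text{Cramer Class III: } \mathrm{NOEL}_{5\%} = 100 \times 1.5\ \text{µg/kg-day} = 0.15\ \text{mg/kg-day}$$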

Although the Munro et al. (1996) dataset was intended to cover a broad chemical domain, the dataset is now over 20 years old, which raises the question of whether the TTC values derived from it ought to be updated if an expanded dataset were used. Indeed, there has been a wealth of studies building upon the work of Munro et al. (1996), including identifying groups of chemicals for which the TTC approach is not appropriate, proposing additional TTC values for specific endpoints, or utilising additional datasets to re-evaluate the original TTC values. Many of these studies have been cited in the WHO/EFSA (2016) and EFSA (2019) reports. Examples include Cheeseman et al. (1999), Munro et al. (1999), Blackburn et al. (2005), van Ravenzwaay et al. (2011), Kalkhof et al. (2012), Dewhurst and Renwick (2013), Leeman et al. (2014), Boobis et al. (2017) and Yang et al. (2017). One evaluation of particular interest, and which inspired our own case study, was that undertaken by Yang et al. (2017), who enriched the original Munro et al. (1996) dataset to capture cosmetics-related substances. The case study here was structured to consider an expanded dataset that was more representative of the chemicals of interest to the EPA. Specifically, the objectives were as follows:

1. Verify TTC values from Munro et al. (1996)
2. Extract data from the US EPA’s Toxicity Values database (ToxValDB), available via the US EPA’s CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard; Williams et al. (2017))
3. Use the Kroes et al. (2004) workflow implemented in Toxtree to assign substances present in ToxValDB to their respective TTC categories
4. Derive TTC values using the toxicity data extracted from ToxValDB for Cramer class chemicals
5. Evaluate whether the newly derived TTC values were statistically equivalent to those derived from the Munro et al. (1996) dataset
6. Derive confidence intervals for the 5th percentile values underpinning the newly derived TTC values
7. Compare and contrast the chemistry of the two datasets to rationalise any (dis)similarities in the TTC values
8. Profile a large inventory of ~45,000 chemicals, taking into account insights gained from the preceding objectives

2. Materials and Methods

2.1 Toxicity Data Sources

Two sources of toxicity data were utilised in this study: 1) the TTC dataset from Munro et al. (1996), referred to as the ‘Munro dataset’, and 2) the US EPA’s Toxicity Values (ToxVal) database (version 7), referred to as ToxValDB.

The Munro dataset was downloaded as an Excel file from the European Food Safety Authority (EFSA) website (http://www.efsa.europa.eu/en/supporting/pub/en-159). This was converted to a comma-separated value (csv) file to facilitate use within the R scripting environment (https://www.r-project.org) (R Core Team, 2018).

ToxValDB consists of a collection of summary-level in vivo test data from a variety of study types typically used in risk assessments. It comprises point of departure (POD) values such as no-observed (adverse) effect levels and lowest-observed (adverse) effect levels (NO(A)ELs and LO(A)ELs). These data have been aggregated from over 40 publicly available sources, including US Federal and State agencies (e.g. US EPA, US Food and Drug Administration (FDA), and California EPA) alongside international organisations (e.g. World Health Organisation (WHO)), as well as data submitted under regulatory frameworks such as the European Union’s REACH regulation (e.g. non-confidential registration data submitted to the European Chemicals Agency (ECHA) by industry registrants). The entire ToxValDB was downloaded for subsequent filtering and processing (Supplementary Table 1).

2.2 Chemical structure data

2.2.1 Profiling of substances through the Kroes et al. (2004) workflow within Toxtree

Chemicals with defined structures (e.g. encoded as SMILES: Simplified Molecular-Input Line-Entry System) were needed for profiling through the TTC decision tree within Toxtree. QSAR-ready SMILES strings were extracted through a batch search using the US EPA’s CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard; Williams et al. (2017)) for the chemicals present in ToxValDB. Of the 15,960 unique substances present in ToxValDB, QSAR-ready SMILES were available for 4,554 chemicals. These were subsequently profiled through Toxtree (v3.1.0) (IdeaConsult Ltd) using two of the original modules, namely the Cramer rules (Patlewicz et al., 2007) and the Kroes TTC decision tree, as well as 3 custom modules developed ad hoc by Patlewicz et al. (2018) to identify carbamates, organophosphates (OPs), and steroids.

SMILES strings provided in the Munro dataset from the EFSA website were converted to their corresponding Kekulé form using the ChemAxon Standardizer (v17.13.0) software. The Munro Cramer class chemicals were also processed through Toxtree to address objective 6.

2.3 Verification of Munro et al. (1996) TTC values

The column with the header “NOEL_calculated_Munro_mg/kg/day” in the Munro dataset was used to calculate the 5th percentile values associated with each Cramer class. However, when calculating the thresholds using this dataset, it became clear that there were discrepancies between the published 5th percentile values in Munro et al. (1996) and those calculated using the Munro dataset retrieved from the EFSA website. Upon investigation, it was found that an adjustment discussed in Munro et al. (1996) still needed to be applied, namely the 3-fold safety factor for NOELs originating from either: 1) a subchronic study or 2) one of several ad hoc reproductive/teratology studies. After applying this adjustment, the published and calculated thresholds were equal, confirming the validity of the Munro dataset.

2.4 New threshold calculations for Cramer class substances using ToxValDB

Data that met the study criteria outlined in Munro et al. (1996) were extracted from ToxValDB for each chemical assigned to the 3 Cramer classes. The ToxValDB study criteria were as follows: 1) study type: subchronic, chronic, reproductive, developmental, or multigenerational (short-term and acute studies were not considered); 2) route of exposure: oral (other routes were excluded); 3) species: rodents; 4) point of departure: no-observed (adverse) effect level (NO(A)EL); and 5) point of departure units: mg/kg-day. As per Munro et al. (1996), all NO(A)ELs from subchronic studies were divided by a factor of 3 to approximate the NO(A)EL that would likely be generated by a chronic study. For chemicals with multiple NO(A)EL values, the minimum NO(A)EL was used once extreme outliers had been identified and removed. Tukey’s fences (Tukey, 1977) were calculated for each chemical using the following method:

$$\text{Lower bound} = Q_1 - 1.5\,(Q_3 - Q_1)$$

$$\text{Upper bound} = Q_3 + 1.5\,(Q_3 - Q_1)$$

where Q1 is the lower quartile value and Q3 is the upper quartile value. Extreme outliers were identified as NO(A)EL values that were either less than the lower bound value or greater than the upper bound value.
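
The per-chemical processing described above (subchronic adjustment, outlier removal with Tukey's fences, then taking the minimum) can be sketched as follows. This is a minimal illustration and not the authors' R code: the function names and the `(value, study_type)` record layout are hypothetical, the fences are assumed to be computed on the untransformed NO(A)EL values, and the quartiles use the "inclusive" method of Python's `statistics.quantiles`, which may differ from the quartile definition used in the study.

```python
from statistics import quantiles

SUBCHRONIC_FACTOR = 3.0  # subchronic-to-chronic adjustment described in the text


def tukey_fences(values):
    """Return (lower, upper) fences: Q1 - 1.5*(Q3 - Q1) and Q3 + 1.5*(Q3 - Q1)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def minimum_noael(records):
    """Minimum NO(A)EL for one chemical after adjustment and outlier removal.

    records: list of (noael_mg_per_kg_day, study_type) tuples (hypothetical layout).
    """
    # Divide subchronic values by 3 to approximate a chronic NO(A)EL.
    adjusted = [v / SUBCHRONIC_FACTOR if t == "subchronic" else v
                for v, t in records]
    if len(adjusted) < 2:
        return adjusted[0]  # a single value cannot be screened for outliers
    lower, upper = tukey_fences(adjusted)
    # Keep only values inside the fences, then take the most conservative one.
    kept = [v for v in adjusted if lower <= v <= upper]
    return min(kept)
```

Note that, with this rule, a single implausibly low value is discarded rather than becoming the chemical's minimum NO(A)EL, which is the practical reason for screening outliers before taking the minimum.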

The empirical cumulative distributions of the (minimum) NO(A)ELs for every chemical were plotted and fitted with a lognormal distribution for each Cramer class. The 5th percentile NO(A)EL values were calculated and converted to their corresponding TTC values by applying a safety factor of 100 as discussed earlier.
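
A minimal sketch of the last two steps (a lognormal fit, then conversion of the 5th percentile to a TTC) is given below. It assumes the lognormal is fitted by taking the mean and sample standard deviation of the log10 NO(A)ELs; the fitting procedure actually used in the study is not specified in this section, and the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def lognormal_p5(noaels_mg_per_kg_day):
    """5th percentile of a lognormal fitted to the NO(A)ELs (moment fit on log10)."""
    logs = [math.log10(v) for v in noaels_mg_per_kg_day]
    mu, sigma = mean(logs), stdev(logs)
    z05 = NormalDist().inv_cdf(0.05)  # standard normal 5% quantile, about -1.645
    return 10 ** (mu + z05 * sigma)


def ttc_ug_per_kg_day(p5_noael_mg_per_kg_day, safety_factor=100.0):
    """Convert a 5th percentile NO(A)EL (mg/kg-day) to a TTC (ug/kg-day)."""
    return p5_noael_mg_per_kg_day * 1000.0 / safety_factor
```

For example, `ttc_ug_per_kg_day(3.0)` gives 30 µg/kg-day, which matches the Cramer Class I value quoted in the Introduction.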

2.5 Pairwise Comparison of the TTC values derived from ToxValDB and Munro datasets

2.5.1 Comparison of NO(A)EL distributions and their fifth percentile values

Pairwise comparisons of the NO(A)EL distributions of the Cramer classes from the ToxValDB dataset were performed using the non-parametric Kolmogorov-Smirnov (K-S) test (Conover, 1999). The K-S test was also used to compare the distributions of each Cramer class between the ToxValDB and Munro datasets (i.e. to determine whether the Cramer class distributions were statistically different between the two datasets). The 5th percentiles derived for each Cramer class were also compared between the two datasets to determine whether there was a statistically significant difference. This was performed using the qcomhd function from the WRS2 package available in R, which utilises a Harrell-Davis estimator in conjunction with bootstrapping (i.e. random sampling with replacement) (Mair and Wilcox, 2018). Briefly, two groups (e.g. Cramer class I from ToxValDB and Munro) were independently bootstrapped n times. For each bootstrap sample, the 5th percentile for each dataset and the difference between the 5th percentiles were calculated. Once the bootstrapping was complete, the 95% confidence interval of the difference between the two samples was used to identify whether the 5th percentiles were statistically different (i.e. whether the difference in 5th percentile between the two datasets was significantly different from zero). In this study, 5000 bootstrap samples were run to calculate the difference between the 5th percentiles. Bootstrap sampling using 5000 samples was further used to calculate the 95% confidence intervals around the 5th percentile NO(A)EL values for each dataset and Cramer class.
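
The bootstrap comparison described above can be sketched as follows. This is not a reimplementation of `qcomhd`: for simplicity the Harrell-Davis estimator is swapped for a plain linear-interpolation quantile, and the confidence interval is a simple percentile interval, so the numbers it produces would not exactly match those from WRS2. Function names and the fixed seed are hypothetical choices made for reproducibility.

```python
import random


def percentile(values, q):
    """Linear-interpolation quantile for 0 <= q <= 1 (not Harrell-Davis)."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def bootstrap_p5_difference(a, b, n_boot=5000, seed=0):
    """Return ((ci_low, ci_high), significant) for the difference in 5th percentiles."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group independently, with replacement.
        ra = rng.choices(a, k=len(a))
        rb = rng.choices(b, k=len(b))
        diffs.append(percentile(ra, 0.05) - percentile(rb, 0.05))
    ci = (percentile(diffs, 0.025), percentile(diffs, 0.975))
    # The 5th percentiles are judged different when the 95% CI excludes zero.
    significant = not (ci[0] <= 0.0 <= ci[1])
    return ci, significant
```

Calling `bootstrap_p5_difference(toxval_noaels, munro_noaels)` for one Cramer class (two hypothetical lists of minimum NO(A)ELs) would mirror the per-class comparison described in the text.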

2.6 Characterisation of the chemical landscape

2.6.1 Investigation of differences in Cramer class chemical landscapes

To provide an overview of the differences in the chemical landscape between the Munro and ToxValDB datasets, bar graphs comparing the frequency of ToxPrint chemotypes were generated for each Cramer class. First, a binary molecular fingerprint was generated for each chemical in the Munro and ToxValDB datasets utilising the publicly available ToxPrint chemotype feature set (https://toxprint.org) and the ChemoTyper software (https://chemotyper.org/). ToxPrint chemotypes consist of a predefined library of 729 sub-structural features designed to encapsulate a broad range of chemical atoms and scaffolds, which were developed by Altamira and Molecular Networks under contract to the US Food and Drug Administration (Yang et al., 2015). Next, the full 729-bit ToxPrint fingerprint was condensed to a length of 70 bits so that any differences between the two datasets could be more readily visualised. To do this, the ToxPrints were condensed based upon the root of the ToxPrint name. For example, the ToxPrints “bond:C#N_cyano_cyanamide”, “bond:C#N_nitro_isonitrile”, and “bond:C#N_nitrile_generic” were merged under the more generalised name “bond:C#N” that is common amongst these ToxPrints. The frequency of these condensed fingerprints in each dataset was then calculated and plotted.

2.6.2 Chemotype enrichment analysis

A chemotype enrichment analysis was conducted to further investigate what impact, if any, the difference in chemical landscape between the two datasets had on the differences in 5th percentile NO(A)EL values. A more comprehensive explanation of the chemotype enrichment analysis workflow is available in Wang et al. (2019). Briefly, chemotype enrichment analysis identifies sub-structural features that are over-represented with respect to a given endpoint. Typically, this endpoint may be activity in a particular (suite of) assay(s); however, in this study the “endpoint” was the presence/absence of the chemical in the given Munro Cramer class chemical list. Chemicals present in the Munro set were indicated by a value of 1, whilst chemicals originating from the corresponding Cramer class in the ToxValDB dataset were indicated by a value of 0.

To conduct this analysis, the full 729-bit ToxPrint fingerprints generated above were annotated with an additional binary column representing which dataset the chemical originated from: either Munro (1) or ToxValDB (0). Chemicals present in both datasets were retained as duplicates to avoid modifying the chemical space of either dataset. This combined dataset was subsequently processed using the chemotype enrichment workflow developed by NCCT researchers and implemented in Python. The odds ratios (ORs) and p-values generated by the workflow were utilised to identify those ToxPrints that were more highly enriched in the Munro dataset compared to the ToxValDB dataset. For the purposes of the analysis, enriched ToxPrints were defined as having an OR ≥3, a p-value ≤0.05 and a number of true positives (TP) ≥3. For this study, only ToxPrints with an OR of infinity were carried forward, as this signified that the ToxPrint was not present in any chemical in the ToxValDB dataset. Chemicals within the Munro set that contained one or more of the ToxPrints with an OR of infinity were excluded and the 5th percentile NO(A)EL values were re-calculated. This enabled an exploration of the impact that chemicals containing these chemotypes had on the 5th percentile NO(A)EL value.
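
The enrichment criteria above can be illustrated with a small sketch built on a 2x2 contingency table for a single ToxPrint. The mapping of the table cells to TP/FP/FN/TN and the use of a one-sided Fisher's exact test for the p-value are assumptions made for this example; the actual workflow is described in Wang et al. (2019) and may differ. All function names are hypothetical.

```python
from math import comb, inf


def odds_ratio(tp, fp, fn, tn):
    """OR for one chemotype; infinite when no ToxValDB chemical contains it.

    Assumed cell mapping: tp = Munro with chemotype, fp = ToxValDB with chemotype,
    fn = Munro without chemotype, tn = ToxValDB without chemotype.
    """
    if fp * fn == 0:
        return inf if tp * tn > 0 else float("nan")
    return (tp * tn) / (fp * fn)


def fisher_one_sided(tp, fp, fn, tn):
    """P(at least tp Munro chemicals carry the chemotype) under the hypergeometric null."""
    n_with = tp + fp       # chemicals containing the chemotype
    n_munro = tp + fn      # chemicals from the Munro set
    total = tp + fp + fn + tn
    upper = min(n_with, n_munro)
    tail = sum(comb(n_with, k) * comb(total - n_with, n_munro - k)
               for k in range(tp, upper + 1))
    return tail / comb(total, n_munro)


def is_enriched(tp, fp, fn, tn, min_or=3.0, alpha=0.05, min_tp=3):
    """Apply the thresholds quoted in the text: OR >= 3, p <= 0.05, TP >= 3."""
    return (odds_ratio(tp, fp, fn, tn) >= min_or
            and fisher_one_sided(tp, fp, fn, tn) <= alpha
            and tp >= min_tp)
```

A ToxPrint found in, say, 10 Munro chemicals and in no ToxValDB chemicals yields an infinite OR under this definition, which corresponds to the subset of chemotypes carried forward in the study.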

2.7 Software

Data processing and analysis were conducted in the R scripting language v3.5.2 (R Core Team, 2018) unless otherwise stated. Code and datasets are available as supplementary information.

3. Results and Discussion

3.1 Verification of previously published fifth percentile NO(A)EL values for Cramer class assigned chemicals

To ensure the previously published Cramer class TTC values could be reproduced, the 3-fold adjustment factor was applied to relevant chemicals in the Munro dataset and the 5th percentile NO(A)EL values were then calculated. The numbers of chemicals present in the overall Munro dataset, as well as in the individual Cramer classes, provided on the EFSA website were the same as those published in the Munro et al. (1996) article. Minor discrepancies were found between the 5th percentile values calculated and those originally published in Munro et al. (1996) (Table 1). These discrepancies were most likely due to rounding differences in terms of the number of significant figures used in the calculations of the 5th percentile NO(A)ELs. Allowing for rounding, the 5th percentile NO(A)ELs calculated were equivalent to those reported by Munro et al. (1996), thus confirming the validity of the Munro dataset.

[TABLE 1 HERE]

3.2 Calculation of fifth percentile NO(A)ELs from ToxValDB for Cramer class assigned chemicals

Of the 4,554 chemicals present in ToxValDB with QSAR-ready SMILES, 1,241 (27%) were excluded because the chemical was either: 1) not applicable for TTC, i.e. compound-specific toxicity data were required (114 chemicals); 2) considered a potential genotoxicant based upon the presence of a structural alert, thus requiring the use of the TTC of 0.0025 µg/kg-day (1025 chemicals); or 3) considered to be an organophosphate (OP) or carbamate (102 chemicals). Two additional chemicals were excluded from further analysis since they could not be properly profiled through the Cramer workflow implemented in Toxtree. As the decision tree laid out in Kroes et al. (2004) was being followed, substances that presented a structural alert for genotoxicity or were considered to be an OP or carbamate were excluded from further analysis. They will be evaluated separately as part of ongoing work. Of the remaining 3,311 chemicals, 1,476 were assigned to Cramer class I, 162 were assigned to Cramer class II, and 1,673 were assigned to Cramer class III (Table 2).

Upon separating the chemicals into their respective Cramer classes, associated toxicity data that satisfied the study criteria set out in Munro et al. (1996) were extracted from ToxValDB, i.e. only subchronic, chronic, reproductive, developmental, or multigenerational studies conducted in rodents with an oral route of exposure and generating a NO(A)EL in mg/kg-day were utilised. This decreased the number of chemicals to 565, 39, and 700 for Cramer class I, II, and III, respectively (Table 2).

[TABLE 2 HERE]

The 5th percentiles calculated for each Cramer class are provided in Table 3. The expected trend in TTC values, with more conservative values for Cramer III relative to Cramer I, was observed. However, there was only a minimal separation in 5th percentile values between the Cramer I and Cramer II chemicals.

[TABLE 3 HERE]

Figure 1 shows that the lognormal distributions and empirical cumulative distribution functions (CDFs) for the ToxValDB class I and II chemicals are poorly separated; this is especially true below the 10% quantile, where the distributions overlap. The K-S test was utilised to evaluate whether the distributions were significantly different between the Cramer classes. The distributions of Cramer classes I and II (n = 604) and of Cramer classes II and III (n = 739) were not statistically different at a significance level of 0.05; in contrast, the difference between the Cramer class I and III (n = 1,265) distributions was significant. The K-S test tends to be more sensitive near the centre of the distribution, whereas Figure 1 shows that the main differences between the distributions are at the lower quantiles. The fact that the Cramer class II distribution was not significantly different from either Cramer class I or III may therefore not be wholly surprising, especially given the few substances it contains.
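
For readers unfamiliar with the two-sample K-S test used for these comparisons, a minimal sketch is given below. It computes the K-S statistic D (the largest gap between the two empirical CDFs) and an approximate p-value from the asymptotic Kolmogorov distribution with a commonly used small-sample correction. This is an assumption-laden illustration: the p-value calculation used by R's `ks.test` may differ, and the function name is hypothetical.

```python
import math
from bisect import bisect_right


def ks_two_sample(a, b):
    """Return (D, approximate p-value) for the two-sample K-S test."""
    xs, ys = sorted(a), sorted(b)
    n, m = len(xs), len(ys)
    # D is the maximum absolute difference between the two empirical CDFs,
    # which is attained at one of the observed data points.
    d = max(abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
            for v in xs + ys)
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d  # small-sample correction (approximation)
    if lam < 0.3:
        return d, 1.0  # the series below converges slowly here; p is ~1
    # Asymptotic Kolmogorov tail probability (alternating series).
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)
```

Because D is the largest gap between the CDFs anywhere along their range, differences confined to the extreme lower tail contribute little to it, which is consistent with the reduced sensitivity at low quantiles noted above.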

[FIGURE 1 HERE]

3.3 Pairwise Comparison between ToxValDB and Munro et al. (1996)

For each Cramer class, the distribution for the ToxValDB dataset was compared to its corresponding distribution for the Munro dataset (e.g. the ToxValDB Cramer class I distribution was compared to the Munro Cramer class I distribution). As can be seen in Figure 2, the empirical CDFs and fitted distributions for Cramer class II and III between the ToxValDB and Munro datasets are visually more distinct than those for Cramer class I, where the lognormal distributions intersect below the 20% quantile. To investigate whether the distributions were significantly different, the K-S test was employed. The ToxValDB and Munro distributions were not statistically different (at a significance level of 0.05) for Cramer class I (n = 702) or Cramer class II (n = 67), whilst the Cramer class III (n = 1,148) distributions were observed to be significantly different. Furthermore, the level of overlap between the two datasets was examined and a total of 219 chemicals were found to be in common. Of these, 82 chemicals (37%) had a difference in NO(A)EL between the two datasets of at least ±0.5 log units, whereas 32 chemicals (15%) had a difference in NO(A)EL of at least ±1 log unit (Supplementary Table 2, Supplementary Figure 1). Therefore, for the majority of the overlapping chemicals (85% differed by less than 1 log unit), any discrepancy in NO(A)EL values was considered to be captured by the variability that is inherent to in vivo studies (Pham et al., 2019; Pham et al. in prep). Additionally, the overall NO(A)EL distributions for the intersecting chemicals were assessed; whilst the distribution of ToxValDB NO(A)ELs was marginally left-shifted compared to that of the Munro NO(A)ELs (Supplementary Figure 2), the distributions were not significantly different according to the K-S test (n = 438).
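
The overlap analysis above amounts to taking the difference in log10 NO(A)EL for each shared chemical and counting how many differences exceed a threshold. A minimal sketch is shown below; the dictionary layout (chemical identifier mapped to NO(A)EL in mg/kg-day) and the function names are hypothetical.

```python
import math


def log10_differences(toxval, munro):
    """log10(ToxValDB NO(A)EL) - log10(Munro NO(A)EL) for chemicals in both sets."""
    shared = toxval.keys() & munro.keys()
    return {c: math.log10(toxval[c]) - math.log10(munro[c]) for c in shared}


def fraction_at_least(diffs, threshold):
    """Fraction of shared chemicals whose absolute log10 difference >= threshold."""
    return sum(abs(d) >= threshold for d in diffs.values()) / len(diffs)
```

Applied to the figures reported in the text, 82 of 219 shared chemicals at a threshold of 0.5 log units corresponds to 37%, and 32 of 219 at 1 log unit corresponds to 15%.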

[FIGURE 2 HERE]

The ToxValDB 5th percentile NO(A)EL value was greater than that for the Munro dataset across the three Cramer classes (Table 3). This was especially true for Cramer class II, where the ToxValDB 5th percentile value was almost 4-fold larger than the Munro value. However, there were relatively few chemicals present in the Cramer class II category for both datasets: 28 and 39 chemicals for Munro and ToxValDB, respectively. Thus, a small shift towards less potent NO(A)ELs could have a strong impact on the resulting 5th percentile value. Indeed, this seemed to be the case for the Cramer class II datasets. Only 6 of the 28 chemicals (21%) in the Munro Cramer class II set had a NO(A)EL ≥100 mg/kg-day, whereas 19 of the 39 chemicals (49%) in the ToxValDB set had a NO(A)EL ≥100 mg/kg-day. Furthermore, the Munro Cramer class II set contained only one chemical with a NO(A)EL ≥1000 mg/kg-day, whilst the ToxValDB Cramer class II set had five chemicals with a NO(A)EL ≥1000 mg/kg-day. In addition, 39% of the chemicals in the Munro set had a NO(A)EL ≤10 mg/kg-day, whilst this was the case for only 18% of the ToxValDB set.

To identify whether the 5th percentile values between the two datasets were statistically different, bootstrapping was utilised to calculate the 95% confidence intervals for each Cramer class (Figure 3). As shown in both Figure 3 and Table 3, the 5th percentile values for Cramer class I and II were not statistically different. Therefore, even though the unprocessed 5th percentile values for Cramer class II appeared different, there was actually a relatively large overlap in their 95% confidence intervals, likely due to the small number of chemicals assigned to this class. In contrast, the 5th percentile values for Cramer class III differed significantly between the two datasets. The TTC values are shown in Table 3 for completeness.
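The bootstrap comparison can be sketched as follows. This is a hedged illustration rather than the authors' code: it assumes an empirical percentile with linear interpolation and a percentile-bootstrap interval, whereas the original analysis may have derived the 5th percentile from a fitted lognormal distribution. The default of 5000 resamples follows the Figure 3 legend; the function names and the input values are hypothetical.

```python
import random


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def bootstrap_ci(values, q=5, n_boot=5000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the q-th percentile."""
    rng = random.Random(seed)
    # Resample with replacement and recompute the percentile each time.
    stats = [
        percentile(rng.choices(values, k=len(values)), q)
        for _ in range(n_boot)
    ]
    return (
        percentile(stats, 100 * alpha / 2),
        percentile(stats, 100 * (1 - alpha / 2)),
    )


def intervals_overlap(ci_a, ci_b):
    """True when two (low, high) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# Hypothetical NO(A)EL values (mg/kg-day), for illustration only.
small_class = [0.5, 1, 2, 5, 8, 10, 25, 40, 100, 150, 300, 1000]
ci = bootstrap_ci(small_class)
print(f"5th percentile = {percentile(small_class, 5):.2f}, 95% CI = {ci}")
```

With few chemicals in a class, as for Cramer class II, the resampled 5th percentiles vary widely, which is why the intervals are broad and can overlap even when the point estimates differ several-fold.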

[FIGURE 3 HERE]

3.4 Investigation of Cramer class III

Since the Cramer class III 5th percentile values differed significantly, we sought to investigate whether this was due to an underlying difference in the types of chemicals represented in the ToxValDB and Munro Cramer class III datasets. Comparison of the frequency of ToxPrints present in chemicals in both ToxValDB and Munro Cramer class III datasets provided an initial overview of the differences in chemical landscape. Figure 4 illustrates that the Munro dataset contained a higher frequency of certain ToxPrints, including, but not limited to: aromatic and heterocyclic ring structures, phosphate and phosphonate bonds, (amino) carbonyl structures, and halogen-containing chemicals. On the other hand, the ToxValDB dataset contained a higher frequency of chemicals possessing a linear alkane chain.

[FIGURE 4 HERE]

A chemotype enrichment analysis was utilised to provide a more detailed assessment of which specific ToxPrints differed between the two datasets. A total of 63 chemotypes were calculated to be enriched (OR ≥3 and p-value ≤0.05) in the Munro Cramer class III set compared to the corresponding ToxValDB set. Of these, 29 ToxPrints were only observed in the Munro Cramer class III set of chemicals (OR “Inf”, Supplementary Table 3). These 29 ToxPrints were used to investigate what impact, if any, removal of chemicals containing at least one of these ToxPrints had on the Munro Cramer class III 5th percentile, specifically whether a re-derived value remained statistically different from the ToxValDB class III 5th percentile value. After filtering out chemicals that contained one or more of these 29 ToxPrints, 306 chemicals remained with a re-calculated 5th percentile NO(A)EL of 0.22 mg/kg-day (Table 4); meanwhile, the 142 chemicals that contained at least one of the 29 ToxPrints and were removed had a 5th percentile value of 0.075 mg/kg-day. After bootstrapping, the re-calculated 5th percentile value was not statistically different to that of ToxValDB Cramer class III, although the K-S metrics still reflected a difference in the distributions. This suggested that at least some of the potent chemicals contained some of these 29 ToxPrints. Accordingly, the K-S test was utilised to investigate whether these 29 ToxPrints were actually separating out the more potent chemicals from the Munro class III set. The 306 Munro class III chemicals that did not contain any of the 29 ToxPrints were compared with the 142 chemicals from Munro class III that did contain at least one of the 29 ToxPrints. The two distributions were shown not to be statistically different. As observed in Figure 5A, the two distributions are very closely aligned to one another and even intersect at approximately the mean. Therefore, although the 29 ToxPrints used to filter the Munro class III chemicals raised the 5th percentile NO(A)EL value, the chemicals extracted were comparable in potency to chemicals that did not contain one of these ToxPrints. The 29 ToxPrints were therefore not able to account for the difference in 5th percentile values between the two datasets.
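The enrichment criterion (OR ≥3 and p-value ≤0.05) can be illustrated with a 2x2 contingency table per ToxPrint. The sketch below is an assumption-laden illustration, not the authors' procedure: it assumes a one-sided Fisher's exact test, which this section does not name, and the counts in the usage example are hypothetical. An odds ratio of infinity corresponds to the OR “Inf” case, where a ToxPrint is absent from the comparison set.

```python
from math import comb, inf


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    a, b: chemicals in the first dataset with / without the ToxPrint
    c, d: chemicals in the second dataset with / without the ToxPrint
    """
    if b * c == 0:
        return inf if a * d > 0 else float("nan")
    return (a * d) / (b * c)


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null (enrichment in row one)."""
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(total, col1)


def is_enriched(a, b, c, d, min_or=3, max_p=0.05):
    """Apply the OR and p-value thresholds quoted in the text."""
    return odds_ratio(a, b, c, d) >= min_or and fisher_one_sided(a, b, c, d) <= max_p


# Hypothetical counts: a ToxPrint found in 12 of 448 chemicals in one set
# and in none of 700 chemicals in the other.
print(odds_ratio(12, 436, 0, 700), is_enriched(12, 436, 0, 700))
```

As the text goes on to show, enrichment of a ToxPrint in one dataset says nothing by itself about potency, so an enriched chemotype need not explain a shift in the 5th percentile.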

[FIGURE 5 HERE]

Earlier studies have suggested that the Cramer class III value ought to be re-evaluated excluding OP and carbamate insecticides (EFSA, 2012; Leeman et al. 2014; Kroes et al. 2000; Kroes et al. 2004; Munro et al., 2008). This is due to the fact that both of these insecticide classes are neurotoxicants that act via inhibiting acetylcholinesterase (AChE) either reversibly (e.g. carbamates) or irreversibly (e.g. OPs) (Colovic et al., 2013). Inhibition of AChE leads to acetylcholine accumulating in the nerve synapse, culminating in overstimulation of the nicotinic and muscarinic acetylcholine receptors and, therefore, increased neurotransmitter activity (Colovic et al., 2013, Naughton and Terry, 2018). Given the results of these earlier studies, OPs and carbamates were identified using Toxtree and the modules developed by Patlewicz et al. (2018), and removed to determine whether this would instead account for the differences in 5th percentile values found for the two datasets.

A total of 62 Cramer class III chemicals from the Munro dataset were re-assigned as OPs/carbamates. Excluding these chemicals from the Munro Cramer class III dataset resulted in a re-derived 5th percentile value of 0.2 mg/kg-day (based on 386 NO(A)ELs) (Table 4). This re-calculated 5th percentile NO(A)EL value was not statistically different from that of the ToxValDB Cramer class III chemicals. Again, the cumulative distributions of this filtered Munro Cramer class III set and the ToxValDB Cramer class III set remained statistically different. The 5th percentile NO(A)EL value of the OPs/carbamates excluded from the Munro Cramer class III set was 0.056 mg/kg-day, corresponding to a TTC value of 0.56 μg/kg-day (c.f. reported TTC value of 0.3 μg/kg-day). Furthermore, the K-S test and a CDF plot were used to investigate whether the distribution of the AChE inhibitors extracted from Munro class III was distinct from that of the chemicals that remained in the Munro class III. According to the K-S test, the two distributions were shown to be statistically different (p-value = 4.404 × 10⁻⁵) (Figure 5B). Taken together, these results provide a plausible explanation for the difference in 5th percentile NO(A)EL values between the original Munro and ToxValDB class III datasets.
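The relationship between a 5th percentile NO(A)EL and its TTC value, as quoted in this section (e.g. 0.056 mg/kg-day corresponding to 0.56 μg/kg-day), can be written as a short conversion. The sketch assumes the conventional 100-fold uncertainty factor of the Munro approach, which is not restated in this section but is consistent with the quoted values; the function name is hypothetical.

```python
def ttc_ug_per_kg_day(noael_p5_mg_per_kg_day, uncertainty_factor=100):
    """Convert a 5th percentile NO(A)EL (mg/kg-day) to a TTC (ug/kg-day).

    Assumes the conventional 100-fold uncertainty factor; the conversion
    from mg to ug contributes a factor of 1000.
    """
    ug_per_mg = 1000
    return noael_p5_mg_per_kg_day * ug_per_mg / uncertainty_factor


# 5th percentile values quoted in the text and tables (mg/kg-day).
for label, p5 in [
    ("OPs/carbamates (original modules)", 0.056),
    ("Munro class III without OPs/carbamates", 0.2),
    ("ToxValDB class III", 0.39),
]:
    print(f"{label}: TTC = {ttc_ug_per_kg_day(p5):.2f} ug/kg-day")
```

Under this assumption the TTC in μg/kg-day is simply ten times the 5th percentile NO(A)EL in mg/kg-day.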

[TABLE 4 HERE]

Previous work by Leeman et al. (2014) recalculated the Cramer class III 5th percentile NOEL and associated TTC threshold after manual inspection, and removal, of OP/carbamate insecticides from the original Munro dataset. Leeman et al. (2014) identified a total of 40 chemicals as being either an OP or carbamate insecticide. After removing these chemicals, they reported an increase in Cramer class III 5th percentile value from 0.15 mg/kg-day to 0.22 mg/kg-day (i.e. to a TTC value of 2.2 µg/kg-day). In our study, the number of chemicals identified as AChE inhibitors by Toxtree was greater than that published by Leeman et al. (2014), thus suggesting that the SMARTS patterns contained within the original Toxtree modules were too broad. Therefore, the chemicals identified as OPs and carbamates by Toxtree were more closely inspected to determine what refinements could be made to the SMARTS patterns to make them more specific.

Preliminary inspection revealed that some of the identified OP/carbamates were not OP/carbamate insecticides. For example, albendazole is an anti-helminthic that contains a carbamate moiety attached to a benzimidazole and whose mode of action is inhibiting the polymerisation of β-tubulin into microtubules rather than AChE activity. Furthermore, there were several chemicals identified by Munro et al. (1999) and/or EFSA (2012) as either an OP or carbamate insecticide with AChE activity, yet these were not identified by the Toxtree modules, e.g. diethyldithiocarbamate, merphos, and glufosinate-ammonium. The list of 40 OP and carbamate AChE inhibitors identified by Munro et al. (1999) and EFSA (2012) (as referenced by Leeman et al. (2014)) was utilised to generate more specific SMARTS patterns (Supplementary Table 4), which were implemented in updated OP and carbamate Toxtree modules. The entire set of Munro Cramer class III chemicals was reprocessed through Toxtree using the updated OP and carbamate modules, resulting in 51 chemicals identified as AChE inhibitors. The 5th percentile NO(A)EL value of the remaining 397 chemicals was 0.23 mg/kg-day, which is not statistically different from the ToxValDB Cramer class III 5th percentile NO(A)EL (Table 4). However, the distributions of the updated Munro and ToxValDB Cramer class III datasets still differed significantly. The updated OP/carbamate chemicals excluded from Munro Cramer class III resulted in a 5th percentile value of 0.037 mg/kg-day, which corresponds to a TTC value of 0.37 μg/kg-day (c.f. reported TTC value of 0.3 μg/kg-day). The updated 5th percentile values calculated for 1) the Munro Cramer class III chemicals without AChE inhibitors and 2) the OPs/carbamates identified with the updated Toxtree modules were comparable with those reported by Leeman et al. (2014).

These analyses demonstrate that the most plausible explanation for the differences in 5th percentile NO(A)EL values between the ToxValDB and Munro Cramer class III chemicals was the presence of the OP and carbamate insecticides within Munro Cramer class III. Once these substances were excluded from the Munro Cramer class III dataset, the 5th percentile NO(A)EL value increased from 0.15 mg/kg-day to 0.23 mg/kg-day. Given that multiple previous studies have also suggested removing the OPs and carbamates from Cramer class III, and that the Kroes workflow already provides a separate TTC value for these chemicals, it does seem appropriate, based on this study, to consider an update to the TTC value for Cramer class III substances (Kroes et al. 2004, Munro et al. 1999, 2008, EFSA 2012, Leeman et al. 2014).

3.5 Practical impact

To illustrate the practical consequence of this change, we processed a publicly available inventory of chemicals (~45,000 substances) along with their TTC category assignments that were reported by Scitovation as part of ACC LRI supported research (accessed 28th June 2019). Since the reported assignments were carried out using an earlier version of Toxtree, the substances were re-profiled using the Kroes workflow in the same manner as the ToxValDB chemicals in this study. To investigate what effect the refined OP/carbamates module had on the TTC category assignments, the chemicals were profiled using both the original and the refined OP/carbamates module developed through this study. Of the ~45,000 chemicals in the publicly available list, the vast majority were assigned to the same TTC category irrespective of the OP/carbamate module utilised. However, there were a total of 654 chemicals (approximately 1.5%) with a discrepancy in their TTC category assignment between the two OP/carbamate modules (Figure 6, Supplementary Table 5).
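The comparison of category assignments between the two module versions amounts to a cross-tabulation, which is what a tile plot such as Figure 6 summarises. The sketch below is illustrative only: the function name and data structures are hypothetical, and the toy inventory contains three reassignments named in the text plus one made-up unchanged chemical.

```python
from collections import Counter


def compare_assignments(original, refined):
    """Cross-tabulate TTC categories from two profiling runs.

    Both arguments map a chemical identifier to its assigned category.
    Returns (crosstab, discrepant_ids, percent_discrepant).
    """
    shared = original.keys() & refined.keys()
    crosstab = Counter((original[c], refined[c]) for c in shared)
    discrepant = sorted(c for c in shared if original[c] != refined[c])
    percent = 100 * len(discrepant) / len(shared) if shared else 0.0
    return crosstab, discrepant, percent


# Toy inventory: "hypothetical_x" is a made-up entry for illustration.
original_run = {
    "propoxur": "Cramer class I",
    "terbucarb": "Cramer class II",
    "aldicarb": "Cramer class III",
    "hypothetical_x": "Cramer class III",
}
refined_run = {
    "propoxur": "OP/carbamate",
    "terbucarb": "OP/carbamate",
    "aldicarb": "OP/carbamate",
    "hypothetical_x": "Cramer class III",
}
table, changed, pct = compare_assignments(original_run, refined_run)
for (before, after), count in sorted(table.items()):
    print(f"{before} -> {after}: {count}")
```

Each key of the cross-tabulation corresponds to one tile, and the off-diagonal keys identify the chemicals whose applicable TTC value changes.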

[FIGURE 6 HERE]

Between the original and refined OP/carbamate module, 49 chemicals were reassigned from Cramer class I, one chemical (terbucarb) was reassigned from Cramer class II, and 99 chemicals were reassigned from Cramer class III to an OP/carbamate (Supplementary Table 5). Upon investigation of the chemicals that were reassigned, a number of them were OP/carbamate insecticides that had previously been missed. For example, fenobucarb, formparanate, and propoxur were previously categorised in Cramer class I; terbucarb was previously categorised as Cramer class II; and aldicarb, carbofuran, carbaryl, and thiodicarb were categorised as Cramer class III. However, all of these chemicals are known insecticides that act via AChE inhibition (Colovic et al. 2013, Knowles and Ahmad 1972). Meanwhile, the updated OP/carbamate module also identifies some chemicals it perhaps should not, such as bambuterol, ladostigil tartrate, pyridostigmine, and rivastigmine, which are pharmaceuticals; nevertheless, these chemicals act via AChE inhibition (Colovic et al. 2013, Feldman and Karalliedde 1996, Weinstock et al. 2006). For example, rivastigmine is a reversible inhibitor of AChE activity that is used in the treatment of Alzheimer’s disease (Colovic et al. 2013).

Given that only 40 chemicals were utilised in the development of the SMARTS in the updated OP/carbamate module, these false positives are probably to be expected. Additionally, there are still likely some OP/carbamate-containing insecticides that are currently being missed because their chemical structure was outside the domain of applicability for the chemicals used in the development of the updated module. Notwithstanding these limitations, the increased identification of OP and carbamate insecticides by the updated module provides an advantage over the original module by further limiting the number of chemicals that were mis-classified. Future work could involve the use of in vitro high-throughput screening data to further refine the SMARTS patterns identified in this study or to generate additional SMARTS patterns for OP/carbamate insecticides.

4. Conclusions

Overall, this analysis demonstrates that the TTC values published by Munro et al. (1996) remain consistently below the thresholds derived from the expanded dataset of chemicals of relevance to the EPA (ToxValDB). We were able to utilise bootstrap sampling to calculate 95% confidence intervals surrounding the 5th percentile NO(A)EL values for the Munro and ToxValDB datasets. Even though the Munro 5th percentile NO(A)EL values for Cramer class I-III were lower than those using the data from ToxValDB, only the Cramer class III values were significantly different between the two datasets. Chemotype ToxPrint enrichments were explored to identify plausible explanations to account for the variation in 5th percentile values, but the discrepancies in chemical features were not sufficient to rationalise these differences. Rather, identification and removal of OP and carbamate insecticides from the Munro Cramer class III set was able to account for the differences, thus lending further support to previous work by Munro et al. (2008) and others who have proposed updating the TTC value for Cramer class III. The insights were used to refine the existing SMARTS that are used in Toxtree to identify OPs and carbamates. The updated module for OPs/carbamates was used to process a large inventory of substances (~45,000) to illustrate the impact these changes had on how substances are assigned and the consequence this has on the TTC values that are applicable. This showed that the updated module was better equipped to identify OP/carbamate insecticides that act via AChE inhibition than the original module. That said, the TTC values derived from this expanded dataset of toxicity values offer additional support for the original TTC values derived by Munro et al. (1996), reaffirming the utility of TTC as a promising tool in a risk-based prioritisation approach.

Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Data Statement

All the data used in this manuscript are available either in the paper, in the Supplementary Information, or from the URLs provided within the manuscript.

Funding Sources

M.D.N. and P.P. were supported by an appointment to the Research Participation Program of the U.S. Environmental Protection Agency, Office of Research and Development, administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and the U.S. EPA.

Disclaimer and Conflicts of Interest

The authors declare no competing financial interests. The contents of this manuscript are solely the responsibility of the authors and do not necessarily reflect the views or policies of their employers. The views expressed in this article are those of the authors and do not necessarily reflect the views or policies of the U.S. Environmental Protection Agency. Mention of tradenames or commercial products does not constitute endorsement or recommendation for use.

References

Blackburn, K., Stickney, J. A., Carlson-Lynch, H. L., McGinnis, P. M., Chappell, L., Felter, S. 2005. Application of the threshold of toxicological concern approach to ingredients in personal and household care products. Regul Toxicol Pharmacol 43, 249-259.

Boobis, A., Brown, P., Cronin, M. T. D., Edwards, J., Galli, C. L., Goodman, J., Jacobs, A., Kirkland, D., Luijten, M., Marsaux, C., Martin, M., Yang, C., Hollnagel, H. M. 2017. Origin of the TTC values for compounds that are genotoxic and/or carcinogenic and an approach for their re-evaluation. Crit Rev Toxicol 47, 705-727.

Cheeseman, M. A., Machuga, E. J., Bailey, A. B. 1999. A tiered approach to Threshold of Regulation. Food and Chemical Toxicology. 37, 387-412.

Colovic, M. B., Krstic, D. Z., Lazarevic-Pasti, T. D., Bondzic, A. M., Vasic, V. M. 2013. Acetylcholinesterase inhibitors: Pharmacology and toxicology. Curr Neuropharmacol. 11, 315-335.

Conover, W. J. 1999. Practical Nonparametric Statistics. Third Edition, John Wiley & Sons, New York.

Cramer, G. M., Ford, R. A., Hall, R. L. 1978. Estimation of toxic hazard - a decision tree approach. Food Cosmet. Toxicol. 16, 255-276.

Dewhurst, I., Renwick, A. G. 2013. Evaluation of the Threshold of Toxicological Concern (TTC) - challenges and approaches. Regul Toxicol Pharmacol 65(1), 168-177.

US EPA. Overview: Office of Pollution Prevention and Toxics Laws and Programs. 2008 https://archive.epa.gov/oppt/pubs/oppt101_tscalaw_programs_2008.pdf [accessed 3 June 2019].

Feldman, S., Karalliedde, L. 1996. Drug interactions with neuromuscular blockers. Drug Saf. 15, 261-273.

Kalkhof, H., Herzler, M., Stahlmann, R., Gundert-Remy, U. 2012. Threshold of toxicological concern values for non-genotoxic effects in industrial chemicals: re-evaluation of the Cramer classification. Arch Toxicol 86, 17-25.

Knowles, C. O., Ahmad, S. 1972. Mode of Action studies with formetanate and formparanate acaricides. Pesticide Biochemistry and Physiology. 1, 445-452.

Kroes, R., Galli, C. L., Munro, I., Schilter, B., Tran, L.-A., Walker, R., Wurtzen, G. 2000. TTC for chemical substances present in the diet, A practical tool for assessing the need for toxicity testing. Food Chem. Toxicol. 38, 255-312.

Kroes, R., Renwick, A. G., Cheeseman, M., Kleiner, J., Mangelsdorf, I., Piersma, A., Schilter, B., Schlatter, J., van Schothorst, F., Vos, J. G., Würtzen, G. 2004. European branch of the International Life Sciences Institute, Structure-based Thresholds of Toxicological Concern (TTC): Guidance for application to substances present at low levels in the diet, Food Chem. Toxicol. 42, 65-83.

Kroes, R., Renwick, A. G., Feron, V., Galli, C. L., Gibney, M., Greim, H., Guy, R. H., Lhuguenot, J. C., van de Sandt, J. J. M. 2007. Application of the threshold of toxicological concern (TTC) to the safety evaluation of cosmetic ingredients. Food Chem Toxicol, 45, 2533-2562.

Leeman, W. R., Krul, L., Houben, G. F. 2014. Reevaluation of the Munro dataset to derive more specific TTC thresholds. Regul Toxicol Pharmacol. 69, 273-278.

Mair, P., Wilcox, R. 2018. WRS2: Wilcox robust estimation and testing.

Munro, I. C., Ford, R. A., Kennepohl, E., Sprenger, J. G. 1996. Correlation of a structural class with no-observed-effect levels: a proposal for establishing a threshold of concern, Food Chem Toxicol. 34, 829-867.

Munro, I. C., Kennepohl, E., Kroes, R. 1999. A procedure for the safety evaluation of flavouring substances. Food Chem. Toxicol. 37, 207-232.

Munro, I. C., Renwick, A. G., Danielewska-Nikiel, B. 2008. The Threshold of Toxicological Concern (TTC) in risk assessment, Toxicol Lett. 180, 151-156.

Patlewicz, G., Jeliazkova, N., Safford, R. J., Worth, A. P., Aleksiev, B. 2008. An evaluation of the implementation of the Cramer classification scheme in the Toxtree software, SAR QSAR Environ. Res. 19, 495-524.

Patlewicz, G., Wambaugh, J. F., Felter, S. P., Simon, T. W., Becker, R. A. 2018. Utilising Threshold of Toxicological Concern (TTC) with high throughput exposure predictions (HTE) as a risk based prioritization approach for thousands of chemicals. Computational Toxicology 7, 58-67. doi: 10.1016/j.comtox.2018.07.002

Pham, L. L., Sheffield, T. Y., Pradeep, P., Brown, J., Haggard, D. E., Wambaugh, J., Judson, R. S., Paul Friedman, K. 2019. Estimating uncertainty in the context of new approach methodologies for potential use in chemical safety evaluation. Curr. Opin. Toxicol. 15, 40-47.

Pham, L. L., Watford, S., Pradeep, P., Martin, M. T., Judson, R., Setzer, R. W., Paul Friedman, K. (in prep) Variability in in vivo toxicity studies: Defining the upper limit of predictivity for models of systemic effect levels.

R Core Team. 2018. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (https://www.R-project.org/).

van Ravenzwaay, B., Dammann, M., Buesen, R., Schneider, S. 2011. The threshold of toxicological concern for prenatal developmental toxicity. Regul Toxicol Pharmacol, 59, 81-90.

Tukey, J. W. 1977. Exploratory Data Analysis. Addison-Wesley. ISBN 978-0-201-07616-5.

Wang, J., Hallinger, D. R., Murr, A. S., Buckalew, A. R., Lougee, R. R., Richard, A. M., Laws, S. C., Stoker, T. E. 2019. High-throughput screening and chemotype-enrichment analysis of ToxCast phase II chemicals evaluated for human sodium-iodide symporter (NIS) inhibition. Environment International 126, 377-386. doi: 10.1016/j.envint.2019.02.024

Weinstock, M., Luques, L., Bejar, C., Shoham, S. 2006. Ladostigil, a novel multifunctional drug for the treatment of dementia co-morbid with depression. In: Riederer P., Reichmann H., Youdim M. B. H., Gerlach M. (eds) Parkinson’s Disease and Related Disorders. Journal of Neural Transmission. Supplementa, vol 70. Springer, Vienna.

Williams, A. J., Grulke, C. M., Edwards, J., McEachran, A. D., Mansouri, K., Baker, N. C., Patlewicz, G., Shah, I., Wambaugh, J. F., Judson, R. S., Richard, A. M. 2017. The CompTox Chemistry Dashboard: a community data resource for environmental chemistry. J. Cheminform. 9, 61. doi: 10.1186/s13321-017-0247-6.

Yang, C., Tarkhov, A., Marusczyk, J., Bienfait, B., Gasteiger, J., Kleinoeder, T., Magdziarz, T., Sacher, O., Schwab, C. H., Schwoebel, J., Terfloth, L., Arvidson, K., Richard, A., Worth, A., Rathman, J. 2015. New publicly available chemical query language, CSRML, to support chemotype representations for application to data mining and modelling. J Chem Inf Model. 55, 510-528. doi: 10.1021/ci500667v.

Yang, C., Barlow, S. M., Muldoon Jacobs, K. L., Vitcheva, V., Boobis, A. R., Felter, S. P., Arvidson, K. B., Keller, D., Cronin, M. T. D., Enoch, S. J., Worth, A., Hollnagel, H. M. 2017. Thresholds of Toxicological Concern for Cosmetics-Related Substances: New Database, Thresholds, and Enrichment of Chemical Space. Food Chem. Toxicol. 109, 170-193. doi: 10.1016/j.fct.2017.08.043

Table 1. Verification of Munro dataset provided by EFSA. Comparing the original 5th percentile NOEL values published by Munro et al (1996) to the recalculated 5th percentile values.

Cramer class | Number of chemicals | Original 5th percentile NOEL from Munro (mg/kg bw/day) | Recalculated 5th percentile NOEL from Munro (mg/kg bw/day)
Cramer class I | 137 | 3.0 | 2.9
Cramer class II | 28 | 0.91 | 0.95
Cramer class III | 448 | 0.15 | 0.15

Table 2. Number of chemicals from ToxValDB with QSAR-ready SMILES assigned to each TTC category. Two chemicals could not be properly profiled through the Cramer (original) module in Toxtree and were additionally removed.

TTC category | Number of chemicals profiled | Number of chemicals with toxicity data
Not applicable for TTC | 114 | NA
Presence of genotoxicity alert | 1025 | NA
OPs/Carbamates | 102 | NA
Cramer class I | 1476 | 565
Cramer class II | 162 | 39
Cramer class III | 1673 | 700
Could not be profiled | 2 | NA
Total | 4554 | 1304

Table 3. Comparison of the 5th percentile NO(A)EL and TTC values calculated using the ToxValDB and Munro data for Cramer class I-III. Note that the 95% confidence intervals calculated using bootstrapping are in parentheses.

Cramer Class | ToxValDB 5th percentile (mg/kg-day) | ToxValDB TTC value (µg/kg-day) | Munro 5th percentile (mg/kg-day) | Munro TTC value (µg/kg-day)
Class I | 3.73 (2.97 – 4.79) | 37.3 (29.7 – 47.9) | 3.0 (1.71 – 5.31) | 30 (17.1 – 53.1)
Class II | 3.46 (1.5 – 8.63) | 34.6 (15 – 86.3) | 0.91 (0.32 – 3.02) | 9.1 (3.2 – 30.2)
Class III | 0.39 (0.3 – 0.53) | 3.9 (3 – 5.3) | 0.15 (0.11 – 0.22) | 1.5 (1.1 – 2.2)

Table 4. Comparison of the 5th percentile values for the Munro Cramer class III chemicals that were retained and removed after utilising different methods. This investigation was undertaken to identify potential reasons behind the discrepancy in 5th percentile values between the Munro and ToxValDB class III chemicals.

Method used for separation | Number of chemicals | Re-derived 5th percentile (mg/kg-day) | Number of chemicals removed | Removed chemical 5th percentile (mg/kg-day) | Statistically different 5th percentile from ToxValDB class III
Chemotype enrichment | 306 | 0.22 | 142 | 0.075 | No
Original SMARTS | 386 | 0.2 | 62 | 0.056 | No
Updated SMARTS | 397 | 0.23 | 51 | 0.037 | No

Figure Legends

Figure 1. Cumulative distribution function and fitted lognormal distribution of NO(A)EL values from ToxValDB for chemicals in Cramer class I (in green), II (in orange), and III (in red). The distributions for Cramer classes I and III were seen to differ significantly, whilst the distributions for classes I and II and classes II and III did not differ significantly (p > 0.05).

Figure 2. Comparison of the cumulative and fitted lognormal distributions for the ToxValDB and Munro NO(A)EL data for each Cramer class. The ToxValDB and Munro Cramer class I and II distributions were not significantly different (p > 0.05). Meanwhile, the Cramer class III distributions were significantly different between the two datasets (p < 0.05).

Figure 3. Fifth percentile values and associated 95% confidence intervals calculated using 5000 bootstrap samples for each Cramer class from ToxValDB (in red) and Munro (in blue). Only the 5th percentile values for the Cramer class III chemicals from the two datasets were seen to be significantly different (p < 0.05).

Figure 4. Comparison of the frequency of the ToxPrints (after being condensed to the 70 level 2 ToxPrints) for the chemicals in Cramer class III for both ToxValDB (in red) and Munro (in blue).

Figure 5. Cumulative distribution function and fitted lognormal distribution for the Munro Cramer class III chemicals after being split using A) ToxPrints identified using chemotype enrichment analysis; and B) the OP/carbamates modules developed by Patlewicz et al (2018). After utilising the enriched ToxPrints to separate the Munro class III chemicals, the two distributions were not statistically different; however, after removing chemicals identified as being AChE inhibitors the distributions were significantly different.

Figure 6. Tile plot comparing the frequency (in log10 space) of TTC assignments for the ~45,000 chemicals in the publicly available dataset using the original OP and carbamate modules and the updated OP and carbamate modules developed in this study. NB: The values present in each tile display the raw number of chemicals.


Author Contributions

The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript. All work was conducted in the course of the authors’ employment.

Funding Sources

M.D.N. and P.P. were supported by an appointment to the Research Participation Program of the U.S. Environmental Protection Agency, Office of Research and Development, administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and the U.S. EPA. The funding source(s) had no involvement in the study design, data collection, execution or manuscript preparation and submission.


Declaration of interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. No conflicts to declare from all 3 authors.