Regulatory Toxicology and Pharmacology 55 (2009) 353–360
The French approach to deriving toxicity reference values: An example using reprotoxic effects

Frédéric Dor a,*, Luc Multigner b, Blandine Doornaert c, Dominique Lafon d, Cédric Duboudin e, Pascal Empereur-Bissonnet a, Patrick Lévy f, Nathalie Bonvallot e

a Institut de veille sanitaire, département santé environnement, 12 rue du Val d’Osne, 94415 Saint-Maurice Cedex, France
b French Institute of Health and Medical Research, INSERM U 625, Research Group on Human and Mammalian Reproduction, University of Rennes 1, France
c National Institute for the Industrial Environment and Risks (INERIS), Verneuil en Halatte, France
d National Research and Safety Institute for Occupational Accidents Prevention (INRS), Paris, France
e French Agency for Environmental and Occupational Health Safety (AFSSET), Maisons-Alfort, France
f French Chemical Industries Association (UIC), Paris La Défense, France
Article info

Article history: Received 2 March 2009; Available online 22 August 2009

Keywords: Toxicity reference value methodology; Reprotoxic substances; Benchmark dose; Uncertainty factors; Framework
Abstract

Following the French health and environment action plan, the French Agency for Environmental and Occupational Health and Safety set up a workgroup to standardise a method of deriving toxicity reference values (TRVs). Over the last few decades, there has been increasing concern about the effect of exposure to chemicals on reproductive function, leading the group to take an interest in reprotoxic effects. This article presents the recommendations of the workgroup regarding specific reprotoxic effects. Abnormal development of foetuses and infants, together with impairment of reproduction, were considered to be critical effects. Where critical windows of exposure were concerned, quantitative analysis suggested the need for several types of toxicity reference value, as a function of exposure duration: reprotoxic effects may result from acute or chronic exposure at any time of life, whilst developmental effects may occur after exposure during pregnancy or during the lactation period. The choice of a critical study is based on epidemiological or toxicological quality criteria. The working group recommends the use of the benchmark dose approach in estimating the critical dose. Finally, the working group considered the application of uncertainty factors typically used to take into account the variability between animals and humans, the variability between individuals, and the availability of the data. © 2009 Elsevier Inc. All rights reserved.
* Corresponding author: F. Dor. Fax: +33 1 41 79 67 68.
0273-2300/$ - see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.yrtph.2009.08.006

1. Introduction

In France, the health risk assessment approach is used for: (i) prioritisation, (ii) regulation and (iii) health recommendations. The main objective is the protection of public health. The French Health and Environment Action Plan recommended in 2003 that risk assessment tools should be improved and that the approach to hazard recognition should be standardised, particularly in relation to deriving toxicity reference values (TRVs). These values are equivalent to the reference dose/concentration (RfD/RfC), minimal risk level (MRL), or reference exposure level (REL) in the United States.¹ They are estimates of levels of exposure (inhalation or oral) of the human population, including sensitive subgroups, which are not likely to cause any appreciable risk of deleterious effects over a specified time. In France, these values are referred to as "toxicity reference values" and, in this paper, relate to health-based acceptable daily intakes or concentrations. The target population is the general population.

¹ The RfD or RfC is the reference dose or concentration derived by the US Environmental Protection Agency, the MRL the minimal risk level derived by the US Agency for Toxic Substances and Disease Registry, and the REL the reference exposure level derived by the California EPA Office of Environmental Health Hazard Assessment.

Any agent damaging reproductive function may have dramatic consequences for the continuation of the species. Exposure in the early stages of life can result in long-term effects (Grandjean and Weihe, 2008; Grandjean et al., 2008). There has been concern about the possible role of environmental factors in various reproductive disorders for more than 15 years, as the incidence of some disorders has risen over the last 50 years (Toppari et al., 1996; Thomas and Thomas, 2003; Andersson et al., 2008). A first response to these concerns has been to draw up a series of European regulations classifying reprotoxic substances (chemical substances that are toxic to reproduction). Recent changes in French legislation also reflect concerns relating to reprotoxic substances. French Decree No. 2001-97 of 1 February 2001, amending the labour code, extended the provisions of European Directive 98/24 on carcinogens and mutagens to include
substances classified as category 1 and 2 reprotoxic. This requires specific prevention measures to be implemented, including the search for alternatives and the reassignment of pregnant and nursing women to avoid exposure to such substances.

A second response has been to set TRVs for these reprotoxic effects, even if other effects occur at lower levels than those producing reprotoxic effects. Specific methods for deriving TRVs for reprotoxic substances are proposed in the literature: the US Environmental Protection Agency (US EPA) has developed a method relating to developmental effects in particular (US EPA, 1991). The California EPA developed safe harbour levels, which are maximum allowable dose levels for chemicals causing reproductive and developmental toxicity under the Proposition 65 Regulation (OEHHA, 2001). The WHO suggests defining various values for developmental effects because of the shorter, or indeed one-off, exposure windows (IPCS, 2001). Some international organisations use an additional safety factor of 10 when the critical effect is reprotoxic (IPCS, 1999). However, in practice, TRVs based on reprotoxic endpoints are rarely developed.

In this context, and to help public health decision-making, the French Agency for Environmental and Occupational Health Safety (AFSSET) set up a working group to define a method for TRV derivation (AFSSET, 2007). This group was made up of French institutions involved in risk assessment research: Inserm (French Institute of Health and Medical Research), Ineris (French National Institute for Industrial Environment and Risks), INRS (French national institute for occupational health and safety), AFSSET (French Agency for Environmental and Occupational Health Safety), InVS (French Institute for Public Health Surveillance) and the UIC (French Chemical Industries Association). This article describes the methodological work carried out by the expert working group.
The first goal was to reach a consensus on the steps involved in the derivation of TRVs related to adverse effects on humans, using reproductive and developmental effects as examples.

2. Methods

Based on an initial analysis of the literature, and of the Medline and Toxline databases in particular, the workgroup divided its work into four methodological stages:
– A review of the assumptions underlying the establishment of TRVs;
– The definition of reprotoxic effects and their use as critical effects in establishing a TRV;
– The definition of criteria for selecting toxicological or epidemiological studies for use as sources of data for establishing TRVs;
– The definition of critical dose and uncertainty factors.

The method was developed using six examples from a prioritised list of 450 reprotoxic substances (phthalates (DEHP, BBP, DBP), toluene, EGEE, Linuron). This list was compiled according to informative national or international regulations and classifications (Bonvallot et al., 2009). This information is available on the AFSSET website (http://www.afsset.fr).

3. Results

3.1. Default assumptions used in the establishment of a TRV

The working group used three major default assumptions for the derivation of a TRV. These default assumptions can be defined as generic approaches, based on general scientific knowledge and policy judgment, that are applied to various elements of the risk
assessment process when specific scientific information is not available. When mechanistic data showing the contrary are available, these default assumptions should not be used.

The first default assumption was that the dose–effect or dose–response relationship is monotonic. This assumption is based on the simple idea that the first interaction of a chemical with a biological target leads to an increase or a decrease of a biomarker over the entire dose range. In the absence of data, this simple assumption may be used as a conservative approach. However, based on current knowledge, this assumption is questioned, and the shape of the dose–response curve is better defined by describing toxicokinetic and toxicodynamic processes and underlying mechanisms of action (Conolly and Lutz, 2004).

The second default assumption was that reprotoxic effects (excluding mutagenic effects on germ cells, which are not covered by this work) are considered to occur above a threshold exposure dose. This reflects the generally accepted opinion within the scientific community (US EPA, 1991, 1996; Moore et al., 1995). However, it is now possible to detect early biomarkers, which leads to lower values of the NOAEL (no observed adverse effect level) and LOAEL (lowest observed adverse effect level).

The third assumption specified that effects observed in animals may also occur in humans, in the absence of evidence to the contrary. It is further assumed that, in the absence of toxicokinetic and toxicodynamic data, humans are more sensitive than animals (Sharpe, 1994).
3.2. Selection of reprotoxic critical effects

When establishing a TRV, the chosen effect is the critical effect. It is the adverse effect (or its known precursor in terms of biological response) that occurs at the lowest dose in the most sensitive species. Reprotoxic effects are conventionally of two types:

(1) Effects on male and female reproduction (reproductive organs and the endocrine system) resulting from exposure at any point in the lifetime of the individual. They are called "reproductive effects" in this paper. These include, in particular, effects on the onset and progress of puberty and sexual maturity, sexual behaviour, fertility and fertilisation (European Commission, 1967; US EPA, 1996).

(2) Developmental effects appearing either during the gestation period or from birth. They occur in infants or children and result from parental exposure prior to conception, or from exposure during embryofoetal development, the lactation period and up to sexual maturation. The effects observed include, for example, perinatal death, early or late miscarriage, teratogenic effects, intra-uterine growth retardation, impairment in the weight or size of newborns and organs, and functional deficiencies resulting from a failure or retardation in the capacity of organs (for example, in neurological terms, behavioural or neurocognitive disorders). They are referred to as "developmental effects" in this paper.

The working group considered that all these effects, whether reversible or irreversible, could be defined as critical and could be used to establish a TRV. However, it is often difficult to identify precisely the long-term consequences of an experimental observation: could a histological change in rodent gonads lead to an impairment of human fertility? What variation in the blood concentration of a hormone can be responsible for a change in endocrine function?
Is the anogenital distance a good indicator of exposure to endocrine disrupters, and how well does it predict neonatal or long-term reproductive disorders? Could a simple variation be considered as an adverse developmental effect? These questions are difficult to answer, and the answers depend on the quantity and quality of the data available for each substance. The working group suggests using expert judgment in a case-by-case analysis.
Finally, the different types of reprotoxic effects occur under various conditions of exposure. For example, low-dose chronic exposure of adult males may result in a decrease in the number of spermatozoa, eventually leading to impaired male fertility; short-term exposure (several days) of pregnant females during the critical window of organogenesis may result in developmental defects. Therefore, when studying reprotoxic effects, exposure durations should be taken into account and evaluated in the derivation of a TRV for each situation. The population to be protected by the TRV should also be defined (general population, pregnant women, or men, for example) in relation to the effect studied (male or female reproductive effects, developmental effects after an exposure during pregnancy).

3.3. Selecting the critical study

The critical, or pivotal, study, as defined by the International Union of Pure and Applied Chemistry (IUPAC) (Duffus et al., 2007), is an investigation yielding the NOAEL used by health risk organizations as the basis of the TRV. The scientific quality of the study is an important factor in establishing TRVs. Two types of study are usually taken into account: epidemiological studies of human populations and experimental studies conducted mainly in animals. In both cases, it is important to define criteria for measuring the validity of study data.

The key factors for selecting an epidemiological study are described in a reference document published by the ADELF (association of French-speaking epidemiologists): (i) measurement of exposure, which aims to determine the levels of exposure to the pollutant in question, the existence of multiple exposures, possible changes in exposure over time and the time when exposure occurred; (ii) choice of the effect analysed; (iii) statistical power of the study, i.e.
ability to detect specific excess risks; (iv) management of bias and confounding factors, including, for reprotoxic effects, the age of the parents and their occupational and medical history and, for developmental effects, any events occurring during pregnancy (ADELF, 2008). In fact, many factors may adversely affect epidemiological studies and reduce their ability to demonstrate the causal link between exposure to a chemical substance and the occurrence of a reprotoxic effect. Finally, not all types of epidemiological study are suitable for establishing a TRV. In general, ecological studies are not suitable as they are not based on individual measurements and do not identify bias and confounding factors at an individual level; however, cross-sectional, cohort or case-control analytical studies can be very useful.

TRVs are usually based on animal experiments. The Food and Drug Administration (FDA) developed a series of standardised toxicological protocols in 1966. Since then, several international organisations (US EPA, International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH), Organisation for Economic Co-operation and Development (OECD)) have developed these types of protocol. The OECD, in particular, has developed guidelines for testing the effects of chemicals on health, which are included in European regulations. Hence, standardised protocols have been developed specifically for studying reprotoxic effects: the prenatal development toxicity study (test 414) was designed to provide information on the effect on the developing organism (growth alteration, teratogenesis, etc.); one- and two-generation reproduction toxicity studies (tests 415 and 416) provide information about the reproductive system.
Two other tests are used for reprotoxicity screening: the reproduction/developmental toxicity screening test (test 421) and the repeated dose toxicity study combined with the reproduction/developmental toxicity screening (test 422). Finally, guideline 426 relates to the study of effects on the nervous system in offspring after maternal exposure during gestation or lactation. Other types of toxicity study, such as toxicokinetics (test 417), repeated dose toxicity (tests 407, 408 and 409) and mechanistic studies, may be useful to strengthen the association observed and improve knowledge of the endpoint, histopathological changes and the mechanism of action. The OECD has also developed new and revised test guidelines for detecting endocrine disrupters involved in reprotoxic effects. These include updates of test 407, the rat uterotrophic assay, the Hershberger bioassay, the 21-day fish screening assay and the stably transfected transcriptional activation assay.

The principle of Good Laboratory Practice (GLP) was also introduced to enable laboratories to guarantee the traceability of studies and to ensure comparability between laboratories. However, not all experimental studies comply with the OECD’s standardised protocols or with GLP. Therefore, the quality of these studies and their relevance in assessing the dose–effect relationship has been widely discussed (Klimisch et al., 1997; Squire, 1984). The Klimisch system is the most widely used system for assessing data quality in the regulatory evaluation of chemical substances (OECD, 2007). The reliability of a study is determined by analysing, among other things, the following data: the type, number and gender of the animals tested; the purity and source of the toxic substance; the accuracy of the description of the lesions observed; the description of the animal exposure methods, in particular the route and doses administered; and the identification of a dose–response relationship. Studies can be rated 1–4, with 1 being reliable without restriction, 2 reliable with restrictions, 3 not reliable and 4 not assignable due to lack of information. Only studies rated 1 and 2 are considered of sufficient scientific quality (Table 1). Finally, the critical study selected must describe the critical effect chosen in the first step.
The workgroup suggested the following procedure: (i) give preference to epidemiological studies if they are of adequate quality (according to the criteria defined by the ADELF) and, especially, if the exposures are sufficiently well-characterised; (ii) use experimental studies that comply with OECD-standardised protocols or that are rated 1 based on the Klimisch classification; (iii) use experimental studies that are rated 2 based on the Klimisch system.

3.4. Establishing the critical dose

The establishment of a critical dose is most often based on the identification of a LOAEL (Lowest Observed Adverse Effect Level) and a NOAEL (No Observed Adverse Effect Level) in animal toxicology studies. The LOAEL is the lowest dose in the experimental protocol to be associated, in a statistically significant manner, with the occurrence of the chosen critical effect. The NOAEL is the tested dose just below the LOAEL, that is, by definition, the highest dose at which the critical effect is not significantly observed.

The scientific community has questioned the relevance of these dose descriptors in recent years (US EPA, 2000). From a practical standpoint, there are several reasons for this: these critical doses are highly dependent on experimental protocol, particularly the number of doses tested, the interval between doses and the number of animals or people included in the study, as the observation of effects is based on a statistically significant difference between exposed groups and a control group. Furthermore, a confidence interval is not given for these values and the degree of uncertainty is not quantified. The NOAEL, which is based on a statistical comparison, is not necessarily an effect-free dose. Moreover, Allen et al. have shown that, as far as developmental effects are concerned, the response percentage associated with a NOAEL may vary from 5% to 20% (Allen et al., 1994).
The procedure thus ensures that the LOAEL produces an effect, but gives no guarantee that the NOAEL is harmless.
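The definitions above can be illustrated with a minimal sketch, assuming each tested dose group has already been compared statistically with the control group. The `DoseGroup` type, the function name and the dose values are hypothetical and are not taken from the working group's report.

```python
from dataclasses import dataclass


@dataclass
class DoseGroup:
    dose: float        # administered dose, e.g. in mg/kg/day (hypothetical unit)
    significant: bool  # True if the critical effect differs significantly from controls


def noael_loael(groups):
    """Return (NOAEL, LOAEL) among the tested doses; either may be None."""
    ordered = sorted(groups, key=lambda g: g.dose)
    # LOAEL: lowest tested dose with a statistically significant effect.
    loael = next((g.dose for g in ordered if g.significant), None)
    if loael is None:
        # No significant effect anywhere: the highest tested dose is the NOAEL.
        return (ordered[-1].dose if ordered else None), None
    # NOAEL: highest tested dose just below the LOAEL, if one was tested.
    below = [g.dose for g in ordered if g.dose < loael]
    return (below[-1] if below else None), loael


# Hypothetical study with three dose groups; only the top dose is significant.
groups = [DoseGroup(10, False), DoseGroup(50, False), DoseGroup(250, True)]
noael, loael = noael_loael(groups)
print(noael, loael)
```

Because both values can only be doses that were actually tested, changing the dose spacing or the group size (and hence the statistical power) changes the result, which is the dependence on protocol criticised above.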
Table 1
Klimisch rating criteria (according to Klimisch et al., 1997).

Rating: Reliability category

1: Reliable without restriction
  1a: GLP study compliant with standardised tests (OECD, EC, EPA, FDA, etc.)
  1b: Comparable with standardised tests ("guidelines")
  1c: Protocol compliant with national standardised method (AFNOR, DIN, etc.)
  1d: Protocol that complies with other standardised, scientifically accepted methods, and is sufficiently detailed

2: Reliable with restrictions
  2a: Standardised study without detailed documentation
  2b: Standardised study with acceptable restrictions
  2c: Comparable to a standardised study with acceptable restrictions
  2d: Protocol compliant with national standardised methods, with acceptable restrictions
  2e: Well-documented study compliant with scientific principles, acceptable for evaluation
  2f: Accepted calculation method
  2g: Data deriving from reference works and data collection

3: Not reliable
  3a: Document inadequate for evaluation
  3b: Significant methodological shortcomings
  3c: Unrealistic protocol

4: Not assignable
  4a: Summary
  4b: Secondary literature
  4c: Original reference not available
  4d: Original reference in a language different from the international language
  4e: Document inadequate for evaluation
More recently, other approaches have been put forward to reduce the level of uncertainty and to avoid the theoretical problems related to the use of a LOAEL or NOAEL. The maximum safe dose (MAXSD), like the LOAEL, is also determined statistically (Tamhane et al., 2001). However, the purpose of the statistical test is not to show that an adverse effect exists, as in the NOAEL/LOAEL approach, but rather to demonstrate that there is no adverse effect. In this approach, the level of response regarded as adverse must be defined; this may be, for example, a percentage deviation from the response observed in the control group. This definition is based, first and foremost, on toxicological considerations. The MAXSD is the maximum tested dose at which it is statistically established that the effect produced is lower than this percentage. Nevertheless, the MAXSD is, like the LOAEL and NOAEL, one of the doses tested in experiments. The MAXSD cannot be identified if the possibility of an adverse effect cannot be eliminated for any of the doses tested. No one has so far suggested using the MAXSD to establish a TRV (Hothorn and Hauschke, 2000; Tamhane et al., 2006).

The purpose of defining a benchmark dose (BMD) or a benchmark dose lower bound (BMDL) is to estimate the dose corresponding to a predefined change in response compared with the control group (usually 5% or 10%). This response level or percentage is called the "benchmark response level" (BMR). All experimental data are taken into account when calculating this dose, which is based on biologically acceptable mathematical or statistical models (Cal-EPA, 2004b; Parham and Portier, 2005; Slob, 2002). Developments specific to reprotoxic effects have been implemented to take into account the litter effect and indicators such as prenatal death and malformation rates in survivors (Chen, 1991; Fung et al., 1998; Gaylor et al., 1998).
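As an illustration of the BMD idea, the sketch below fits a one-hit (quantal-linear) model to quantal data and solves for the dose giving a chosen BMR expressed as extra risk. The model choice, the data, the crude grid-search fit and all names are assumptions made for this example; the litter effect is ignored and no BMDL is computed, since that would require a confidence bound on the fitted dose.

```python
import math


def one_hit(dose, p0, k):
    """Assumed one-hit model: P(d) = p0 + (1 - p0) * (1 - exp(-k * d))."""
    return p0 + (1.0 - p0) * (1.0 - math.exp(-k * dose))


def log_likelihood(data, p0, k):
    """Binomial log-likelihood over (dose, n_animals, n_affected) rows."""
    total = 0.0
    for dose, n, affected in data:
        p = min(max(one_hit(dose, p0, k), 1e-9), 1.0 - 1e-9)
        total += affected * math.log(p) + (n - affected) * math.log(1.0 - p)
    return total


def fit_k(data, p0, k_max=0.1, steps=10000):
    """Crude maximum-likelihood estimate of k by grid search on (0, k_max]."""
    candidates = (k_max * i / steps for i in range(1, steps + 1))
    return max(candidates, key=lambda k: log_likelihood(data, p0, k))


def bmd(k, bmr):
    """Dose at which extra risk 1 - exp(-k * d) equals the BMR."""
    return -math.log(1.0 - bmr) / k


# Hypothetical data: (dose, animals per group, animals affected).
data = [(0, 20, 1), (10, 20, 3), (30, 20, 7), (100, 20, 15)]
p0 = data[0][2] / data[0][1]  # background response taken from the control group
k = fit_k(data, p0)
bmd05, bmd10 = bmd(k, 0.05), bmd(k, 0.10)
print(f"k = {k:.5f}, BMD05 = {bmd05:.2f}, BMD10 = {bmd10:.2f}")
```

Unlike a NOAEL, the resulting dose is not restricted to the tested doses, and every dose group contributes to the fit.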
The choice of a model for fitting the data must therefore be carefully thought out (Cal-EPA, 2004a). A scientific consensus recommends using the lower bound of a 95% confidence interval on the BMD (BMDL). This interval takes into account uncertainties arising from the more or less
random response of each individual in the test group to a toxic dose.

After analysing the various methods for establishing a critical dose, the working group decided to:
– Give preference to the BMD/BMDL approach wherever possible; otherwise,
– Propose a MAXSD if such a dose exists;
– Use the traditional approach with the two dose descriptors LOAEL and NOAEL;
– Use either a NOAEL or a LOAEL alone, as a last resort. This approach would increase the level of uncertainty associated with the value of the TRV. This should therefore be considered when deriving and presenting the value.

The choice of a critical dose must always be discussed by a group of specialists (toxicologists, physicians, biomathematicians and risk assessors).

Further adjustments may be made when defining the critical dose. Allometric adjustments allow for variation in size, kinetics or metabolism between species. An adjustment coefficient is applied to the body surface or the inhalation rate, or to the blood/air partition coefficients, depending on the type of exposure and the characteristics of the substance. Some organisations (such as the US EPA) also apply a temporal adjustment based on Haber’s rule, which, in its simplified form, states that the product of the concentration and duration of exposure corresponds to a constant in terms of toxicity. This aims to take into account the difference in duration of exposure between laboratory animals (which are generally exposed for 4–8 h a day, 5 days a week) and humans, where the assumed exposure time is 24 h a day, 7 days a week (US EPA, 1994).

3.5. Choosing uncertainty factors and associated values

Uncertainty factors reflect scientific uncertainty involving extrapolation from one species or one individual to another, extrapolation from one type of exposure to another, and the knowledge available when establishing the TRV.
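A minimal sketch of how uncertainty factors are conventionally combined is given here, assuming the usual form in which the critical dose is divided by the product of the individual factors; the text does not spell this formula out, so it is stated as an assumption. The function name, the factor labels and the numerical values are hypothetical.

```python
from math import prod


def derive_trv(critical_dose, uncertainty_factors):
    """Divide the critical dose by the product of the uncertainty factors.

    Assumes the conventional form TRV = critical dose / (UF1 * UF2 * ...).
    """
    for name, value in uncertainty_factors.items():
        if value < 1:
            raise ValueError(f"{name} must be >= 1, got {value}")
    return critical_dose / prod(uncertainty_factors.values())


# Hypothetical example: a LOAEL of 30 mg/kg/day from an animal study.
factors = {
    "interspecies": 10,      # animal-to-human extrapolation
    "interindividual": 10,   # variability within the human population
    "loael_to_noael": 3,     # critical dose is a LOAEL rather than a NOAEL
}
trv = derive_trv(30.0, factors)
print(f"TRV = {trv:.3f} mg/kg/day")
```

Keeping the factors in a named mapping makes each choice visible, which fits the recommendation that the decisions taken be clearly presented and explained.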
The workgroup decided that the five uncertainty factors typically used by the risk assessment organisations could be taken into account when establishing the TRV for reprotoxic effects (AFSSET, 2007). These factors are described below.

The interspecies variability factor, UFA, is applied when an animal study is used to establish the TRV; it takes into account the toxicokinetic and toxicodynamic differences between the test species and humans. Its numerical value is usually 10; it may be reduced to 3 when available data show the absence of toxicokinetic differences. In rare cases, it may be reduced to 1 if animal and human kinetics and dynamics are identical or if humans are known to be less sensitive than the laboratory animal. In the case of the US EPA TRV for ethylene glycol butyl ether, for example, the toxicokinetic differences are taken into account by applying a PBPK model (so the UFA can be reduced to 3). In vitro studies show that animal erythrocytes are more sensitive than human erythrocytes, and the UFA was therefore reduced to 1 (Dor and Bonvallot, 2007). However, organisations responsible for establishing TRVs do not all apply this rule in the same way.

The interindividual variability factor, UFH, is designed to take into account variations in sensitivity in the human population as a whole, when the study is conducted on a small group or on laboratory animals with limited interindividual variability. This factor should allow for the presence of sensitive subgroups in the general population (children, elderly people, pregnant women, asthmatics, etc.) depending on the chemical and its adverse effect. In the past, a factor of 10 has been justified from analysis of the dose–effect relationships obtained from a large number of acute lethality studies in animals (LD50); in 92% of cases the factor takes account of the most sensitive individuals in the experiments (Dourson and Stara, 1983). Nevertheless, genetic homogeneity, uniform living conditions and the absence of specific diseases in laboratory animals result in only a very small variation in effects from one tested animal to another, compared with the variations seen in humans, which arise from genetic polymorphism, varying lifestyles from one population group to another, associated risk factors, the presence of specific diseases, different hormonal states, the existence of sensitive individuals, etc. This factor of 10 is therefore highly theoretical. Many authors have suggested making scientific adjustments to this safety factor (Hattis et al., 1987; Renwick et al., 2000; Dorne et al., 2001; Walton et al., 2001; KEMI, 2003). However, there is no scientific consensus and, in practice, its numerical value is usually 10.

The uncertainty factor UFL or UFB is applied when the critical dose is a LOAEL or a BMD (or BMDL), because these dose descriptors correspond to doses which give a response in the study population. Where the critical dose is a LOAEL, the numerical values are usually 3 or 10, although there are no explicit rules for using one or the other. In practice, the reasons for reducing the values are the severity of the effect or the shape of the dose–response curve. In the case of a BMD or BMDL, the values are more likely to be 1 or 3, although a value of 10 may sometimes be used. Once again, there are no clear rules in the literature. In most cases, where the BMDL or BMD is based on a benchmark response level of 5%, it is considered as a NOAEL and a factor of 1 is applied. A case-by-case analysis is required.

Extrapolation from sub-chronic exposure to chronic exposure assumes that an effect observed during sub-chronic exposure will also occur, at a lower dose, during chronic exposure.
This extrapolation can also be used when the severity of the effect increases with exposure time. This assumption is based on Haber’s Rule (Bunce and Remillard, 2003). In practice, this factor has a value of 3 or 10 depending on expert judgment. This factor is probably not used much for reproductive and developmental toxicity, as critical studies are chosen according to exposure duration. For example, if a prenatal toxicity study is used, the TRV will be a short-term value because the pregnant animals are exposed for a few days, which corresponds to an exposure between the third and the ninth month in human pregnancy. Moreover, this short-term TRV is applicable to the entire female reproductive age group because women may be unaware they are pregnant in the first months. If a one-generation study is used, the TRV will be a long-term value because males and females are exposed for approximately a hundred days, corresponding to as much as ten percent of the lifespan. Thus, the TRV is able to protect the general population.

A factor that takes the severity of the critical effect into account is sometimes applied. Organisations use this factor for developmental effects in particular and its value is usually 10. Other organisations use a factor of 3, for example the OEHHA for the neurotoxic effects of lead. This uncertainty factor seems to be applied more as a policy than for scientific reasons. The working group recommends using this value as a component of the "UFD", on a case-by-case basis.

The European Union also recommends applying uncertainty factors when establishing Margins of Safety (MOS) (European Union, 2003), and when calculating DNELs (Derived No Effect Levels, which correspond to TRVs) in line with the new European regulation on chemical substances (REACH) (ECHA, 2008). In both cases, the use of default uncertainty factors is recommended to take account of the variability and uncertainty mentioned above.
The European Union therefore suggests using a default uncertainty factor of 10 for UFH where no specific data on the substance are available. To take interspecies variability into account (transposition from animals to humans), default factors based on allometric scaling according to size of species are recommended for oral exposure (for example, a factor of 7 is recommended if the toxicological study was performed on mice, and of 1.4 if the animals tested were dogs). However, for exposure through inhalation, only a temporal adjustment is required to take into account the frequency of animal exposure (usually 6 h a day, 5 days a week) compared with that of human exposure (usually considered to be 24 h a day, 7 days a week), based on Haber’s law.

Regarding critical doses, application of a factor of 1 is recommended when the dose is a NOAEL and a factor of 3 or 10 when the dose is a LOAEL (depending on the severity of the effect and the percentage of effects or of animals affected). A BMDL05 is usually considered as a NOAEL, and a factor of 1 is therefore recommended. A factor of 3 is usually applied where the critical dose is a BMDL10. The European Union proposes that a factor of 3 or 10 should be applied for reprotoxicity effects, depending on the severity of the effect.

In all cases, the workgroup strongly recommends that UF values be defined on a scientific basis and discussed by specialists whenever data are available. The decisions taken must be clearly presented and explained. The default values proposed by the working group are summarised in Table 2.

Table 2 Uncertainty factor values to be applied when establishing a reprotoxic TRV (taken from AFSSET, 2007).
Acronym
Interpretation of uncertainty factors (UF)
UF values
UFA
Interspecies variability
3 1–3
Toxicokinetic data
Toxicodynamic data
UFB/L
Toxicokinetic data Toxicodynamic data BMD to NOAEL/LOAEL to NOAEL
UFS
Sub-chronic to chronic
UFD
Adequacy of data (in terms of quality and quantity ± effect severity)
UFH
If no data If toxicokinetics identical, or if using a dose adjustment coefficient If PBPK model completed If using a study in humans If no data If toxicodynamics identical If humans less sensitive than animals tested If using a study in humans
Interindividual variability
1 – 3 1 1
– 3 3
Case-dependent (expert judgment) Case-by-case analysis, probably not used Case-dependent (expert judgment)
1, 3 or 10
1, 3 or 10
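As a rough illustration of the European default adjustments described in this section, the sketch below computes an allometric scaling factor for oral exposure and a temporal adjustment for inhalation exposure. It assumes the commonly used form (human body weight / animal body weight) raised to the power 0.25. The exponent and the body weights (70 kg for humans, 0.03 kg for mice, 18 kg for dogs) are assumptions chosen because they reproduce the factors of 7 and 1.4 quoted in the text, and the function names are hypothetical.

```python
# Illustrative sketch only: the body weights and the 0.25 exponent are
# assumptions, not values taken from the article.

HUMAN_BW_KG = 70.0  # assumed default adult body weight


def allometric_factor(animal_bw_kg, human_bw_kg=HUMAN_BW_KG, exponent=0.25):
    """Oral-route scaling factor: (human BW / animal BW) ** exponent."""
    if animal_bw_kg <= 0 or human_bw_kg <= 0:
        raise ValueError("body weights must be positive")
    return (human_bw_kg / animal_bw_kg) ** exponent


def temporal_adjustment(hours_per_day=6.0, days_per_week=5.0):
    """Fraction converting an intermittent animal exposure into a continuous
    one (24 h a day, 7 days a week), assuming Haber's law (C x t = K)."""
    if not (0 < hours_per_day <= 24 and 0 < days_per_week <= 7):
        raise ValueError("exposure schedule out of range")
    return (hours_per_day / 24.0) * (days_per_week / 7.0)


mouse_factor = allometric_factor(0.03)        # close to 7
dog_factor = allometric_factor(18.0)          # close to 1.4
continuous_fraction = temporal_adjustment()   # 6/24 * 5/7, about 0.18

print(f"mouse: {mouse_factor:.1f}, dog: {dog_factor:.1f}, "
      f"temporal: {continuous_fraction:.3f}")
```

Multiplying an inhalation critical concentration by the temporal fraction gives the continuous-exposure equivalent, whereas the allometric factor is applied as a divisor to an oral critical dose.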
4. Discussion

A number of factors have generated increased interest in the determination of specific TRVs for reprotoxic effects: the high sensitivity of the reproductive function to certain xenobiotics, the fragility of the human reproductive function, increasing concern about the suspected deterioration in human reproduction over the last few decades and the possible causes of this deterioration. New initiatives are thus taking shape; the expert appraisal organised in France in 2003 enabled the development of a rigorous method for establishing TRVs. Reprotoxic effects were identified.
F. Dor et al. / Regulatory Toxicology and Pharmacology 55 (2009) 353–360
The method developed by the working group includes the usual steps involved in TRV derivation (critical study, critical dose and uncertainty factors). The characteristics of TRVs for reprotoxic substances are reflected in the recommendations and explanations made by the working group. At each step, the pooled knowledge of the experts was studied and organised to highlight both strengths and weaknesses of the method.

Firstly, the difficulty of selecting the critical effect becomes apparent, as all types of effect, whether reproductive or developmental, can be regarded as adverse. The description of developmental effects must distinguish between those that occur in the foetus alone or in infants, and those that result from maternal toxicity and are harmful to the foetus or the child. This distinction may not always be evident from study findings because the effects observed correspond generally to a biological continuum. The second difficulty relates to the measurement of biological responses at an individual level, and the difficulty of classifying them as adverse effects, due to differences in interpretation. These two issues underline the importance of explaining the reasons for selecting a particular critical effect.

The description of effects revealed the need to establish various types of TRV, depending on the exposure windows likely to generate the effects. However, it is difficult to estimate the exposure window, especially when the studies involve several generations. The observed reproductive effect may be caused by the direct exposure of an individual, by exposure of the parents during their lifetime, or by a combination of parental and child exposure. Regarding developmental effects, the exposure dose may be only the dose administered during the gestation period, or a combination of a prior accumulation of the substance in the organism and the dose administered during the study.
This is a key problem relating to bio-persistent substances in the human population. These issues highlight the difficulties encountered when analysing and validating information on effects observed in humans or laboratory animals. Therefore, these exposure-related problems need to be settled once and for all when establishing a TRV. In view of the current state of knowledge, it is probably best to rely on straightforward and simple reasoning as a matter of policy (the above difficulties can only be resolved in a few cases). Thus, effects on embryofoetal development result from short periods of exposure during pregnancy; effects on post-natal development result from sub-chronic exposure during pregnancy and lactation; whilst reproductive effects result from chronic exposure at any time of life. On this basis, the conditions for validating TRVs become clearer.

The experimental study protocols were not initially designed to establish TRVs. This was an a posteriori objective. Consequently, the critical doses proposed all have an element of uncertainty that needs to be addressed. Based on the examples developed by the working group, it emerges that the statistical data used to define a LOAEL and a NOAEL are not always explained, particularly the statistical test used to establish the LOAEL, the associated level of significance and the probability of erroneously accepting the null hypothesis (beta-error). The calculation of a BMDL requires in-depth discussions between toxicologists and statisticians, not only to clarify the extent to which dose–response relationship models are able to take into account relevant biological data, but also to define an adverse effect. An interval of several orders of magnitude may be observed between the BMD and the BMDL (lower bound of the 95% confidence interval of the BMD), revealing the significant uncertainties surrounding the value of the resulting TRV. These uncertainties are due to the limits of the database rather than the method.
TRVs are often established on the basis of such data. The decision to use a BMD rather than a LOAEL or a NOAEL does not change these uncertainties but if a BMD is used, the uncertainties will be more explicit. It is therefore possible to have a more in-depth discussion on the value of the TRV and to let the user decide whether to apply it or not, depending on the context. The key question here is whether the available data are adequate to establish a TRV. Where there are significant uncertainties, the BMD and the LOAEL/NOAEL do not have the same meaning. The greater the uncertainty, the higher the LOAEL (and hence the NOAEL); conversely, the BMDL value drops as the level of uncertainty rises. Therefore, uncertainty increases the value of the critical dose when a LOAEL and NOAEL are used, but reduces it when a BMDL is used. For this reason, using a BMD instead of a NOAEL/LOAEL is more protective for public health and suits the precautionary principle.

Allometric adjustments are the subject of much debate. Some organisations use these factors to establish inhalation TRVs, by applying a coefficient based on the physiological characteristics of the individual and the physico-chemical properties of the substance for effects occurring above a certain dose threshold. Regarding oral exposure, although a coefficient is proposed for non-threshold TRVs based on carcinogenic effects, according to differences in animal and human body surfaces, allometric adjustment is not used for threshold TRVs (as it is implicit in the uncertainty factor UFA). Given the differences between non-threshold and threshold TRVs on the one hand and between oral and inhalation TRVs on the other, we need to improve our understanding of the transposition of data from animal to human and to continue our work in this field to enable us to make more efficient use of our knowledge of allometry. Likewise, considering the temporal adjustments, Haber’s equation is a specific version of the more general relation $C \times t^{m} = K$, where m = 1, and some authors have expressed reservations regarding its suitability for all types of substance (Bunce and Remillard, 2003).
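The general relation mentioned above can be made concrete with a short sketch. It assumes the form C × t^m = K quoted in the text; the function name and the numerical inputs are hypothetical and purely illustrative.

```python
def equivalent_concentration(c_ref, t_ref, t_target, m=1.0):
    """Concentration giving the same K = C * t**m at a different duration.

    With m = 1 this reduces to Haber's rule (C * t = K).
    """
    if c_ref <= 0 or t_ref <= 0 or t_target <= 0:
        raise ValueError("concentration and durations must be positive")
    k = c_ref * t_ref ** m       # constant K fixed by the reference exposure
    return k / t_target ** m     # solve C * t_target**m = K for C


# Hypothetical example: 100 concentration units for 6 h, rescaled to 24 h.
haber = equivalent_concentration(100.0, 6.0, 24.0)             # m = 1
flatter = equivalent_concentration(100.0, 6.0, 24.0, m=0.5)    # m = 0.5
print(haber, flatter)
```

With m = 1 the adjusted concentration is a quarter of the reference value, whereas with m = 0.5 it is about half, which shows how strongly the temporal adjustment depends on the exponent assumed for a given substance.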
Further research should show whether the general application of this adjustment to all dose–response relationships is relevant.

Recommendation of a critical dose is a complex task and it is essential to analyse the quality of source data. The systematic and explicit establishment of a BMD/BMDL raises questions concerning the experimental protocol, the critical effect being analysed and the uncertainties or variability associated with estimating and interpreting the critical dose.

Analysis of the data used to establish uncertainty factor values reveals differences between the theoretical values and those used in practice. For UFA and UFH, the theory emphasises the toxicokinetic and toxicodynamic differences between species and individuals. In particular, it argues that interspecies transposition of the doses administered is proportional to body surface area, which is itself highly correlated with body weight. This suggests that different species are equally sensitive to a given dose per body surface unit, justifying the reduction of the UFA value from 10 to 3. In practice, the uncertainty factor is not always reduced and differs from one organisation to another because of the absence of an international consensus on allometric scaling; for example, for vinyl acetate, the US EPA uses a value of 3 whereas the ATSDR uses a value of 10. As for UFH, it is not yet possible to get an accurate idea of the sensitivity of different populations to reprotoxic substances. Therefore, the use of a default value (10) is recommended.

The uncertainty factor UFB is applied when the TRV is established on the basis of a BMD/BMDL, a critical dose known to produce an adverse effect. The most widespread BMDLs are the lower bound of the 95% confidence interval of the dose corresponding to a 10% or 5% change in the response (BMD10L95 and BMD05L95). According to the literature, it seems that the NOAEL is generally used alongside these two BMDLs.
The working group thus recommends a value of 3 for BMD10L95 and 1 for BMD05L95. However, this uncertainty factor has not been adequately tried and tested; it should be re-assessed after further use. Where only a LOAEL is available, a TRV should not be suggested. Experience has shown that there are, however, exceptions. The UFL should only be used under specific circumstances, following joint discussion.

A lack of data for an in-depth reprotoxicity assessment (lack of studies in several species, lack of multigenerational studies, etc.) undermines conclusions and a UF for data inadequacy may be applied. This uncertainty factor may have a value ranging from 3 to 10, depending on available data. However, the severity of the effect is not taken into consideration, as it is impossible to determine the equivalence of severity in animals and humans.

5. Conclusion

The establishment of TRVs based on reprotoxic effects results from concerns raised by the dissemination into the general and professional environment of substances likely to damage the reproductive function. The nature of the expected effects and the large number of possible exposure conditions require several TRVs relating, on the one hand, to adult reproduction and, on the other hand, to development during pregnancy and/or the lactation period. Regardless of type, TRVs can be determined by following a series of standard steps. If the effects can all be regarded as critical, special attention must be paid to the establishment of the critical dose (with preference being given to a BMD as far as possible) and the determination of uncertainty factors. Data quality analysis is a key step in establishing a relevant TRV.

Methodological developments should focus, first and foremost, on the definition of the critical dose. They should include the definition and standardisation of toxicology protocols suited to establishing TRVs (especially TRVs based on reprotoxic effects). Rules for determining a BMD should be outlined and allometric and temporal adjustments taken into account. Thus, to address the issue of the critical dose better, we need to analyse and exploit data more efficiently. This would probably lead to a reduction in uncertainty factor values.
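The standard steps summarised above (a critical dose, then a set of uncertainty factors) can be sketched as follows. The UFB/L defaults for BMD05L95 and BMD10L95 are those recommended by the working group, and the value of 1 for a NOAEL is the European recommendation quoted earlier. A LOAEL has no single default in the text (3 or 10, by expert judgment), so it must be supplied explicitly. Dividing the critical dose by the product of the factors is the usual convention for threshold TRVs and is an assumption here, as are the function name and the example numbers.

```python
from math import prod

# UF_B/L defaults quoted in the text: 1 for a NOAEL or a BMD05L95, 3 for a
# BMD10L95. A LOAEL is deliberately absent (3 or 10, expert judgment).
UF_BL_DEFAULTS = {"NOAEL": 1.0, "BMD05L95": 1.0, "BMD10L95": 3.0}


def derive_trv(critical_dose, dose_type, uf_a, uf_h, uf_s=1.0, uf_d=1.0,
               uf_bl=None):
    """Return critical_dose / (UF_A * UF_H * UF_B/L * UF_S * UF_D)."""
    if uf_bl is None:
        if dose_type not in UF_BL_DEFAULTS:
            raise ValueError(f"no default UF_B/L for {dose_type!r}; pass uf_bl")
        uf_bl = UF_BL_DEFAULTS[dose_type]
    factors = [uf_a, uf_h, uf_bl, uf_s, uf_d]
    if critical_dose <= 0 or any(f < 1 for f in factors):
        raise ValueError("dose must be positive and each factor >= 1")
    return critical_dose / prod(factors)


# Hypothetical example: BMD10L95 of 9 mg/kg/day, UF_A = 3, UF_H = 10.
trv = derive_trv(9.0, "BMD10L95", uf_a=3.0, uf_h=10.0)
print(f"TRV = {trv:.2f} mg/kg/day")  # 9 / (3 * 10 * 3) = 0.10
```

Keeping each factor as a separate argument makes the justification for every value visible, in line with the recommendation that the decisions taken be clearly presented and explained.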
Conflict of interest statement

The authors declare that there are no conflicts of interest.

Acknowledgments

The work described in this article was coordinated by AFSSET. The authors thank all the members of the workgroup and the expert committee on “Assessing the risks associated with chemical substances”, whose advice was key to the success of this project. The French Ministry of Health funded this work, as part of the National Health and Environment Action Plan.

References

AFSSET (French Agency for Environmental and Occupational Health Safety), 2007. Valeurs toxicologiques de référence (VTR). Méthode de construction de VTR fondées sur des effets toxiques pour la reproduction et le développement.
Allen, B.C., Kavlock, R.J., Kimmel, C.A., Faustman, E.M., 1994. Dose–response assessment for developmental toxicity. II. Comparison of generic benchmark dose estimates with NOAELs. Fundamental and Applied Toxicology 23, 487–495.
Andersson, A.M., Jørgensen, N., Main, K.M., Toppari, J., Rajpert-De Meyts, E., Leffers, H., Juul, A., Jensen, T.K., Skakkebaek, N.E., 2008. Adverse trends in male reproductive health: we may have reached a crucial ‘tipping point’. International Journal of Andrology 31 (2), 74–80.
Association des épidémiologistes de langue française (ADELF), 2008. Recommendations for professional standards and good epidemiological practices.
Bonvallot, N., Mullot, J.-U., Solal, C., Dor, F., 2009. Method for identifying and prioritizing the reprotoxic chemicals for which human toxicity values must be derived. Environnement Risques et Santé 8 (2), 119–131.
Bunce, N.J., Remillard, R.B.J., 2003. Haber’s rule: the search for quantitative relationships in toxicology. Human and Ecological Risk Assessment 9, 973–985.
California Environmental Protection Agency (Cal-EPA), 2004a. Guidance for Benchmark Dose (BMD) Approach – Quantal Data. DPR MT-1.
California Environmental Protection Agency (Cal-EPA), 2004b. Guidance for Benchmark Dose (BMD) Approach – Continuous Data. DPR MT-2.
Chen, J.J., 1991. Analysis of trinomial responses from reproductive and developmental toxicity experiments. Biometrics 47, 1049–1058.
Conolly, R.B., Lutz, W.K., 2004. Nonmonotonic dose–response relationships: mechanistic basis, kinetic modelling, and implications for risk assessment. Toxicological Sciences 77, 151–157.
Dor, F., Bonvallot, N., 2007. Hazard identification: improving this stage of risk assessment. Environnement Risques et Santé 6 (4), 279–287.
Dorne, J.L., Walton, K., Renwick, A.G., 2001. Uncertainty factors for chemical risk assessment. Human variability in the pharmacokinetics of CYP1A2 probe substrates. Food and Chemical Toxicology 39, 681–696.
Dourson, M.L., Stara, J.F., 1983. Regulatory history and experimental support of uncertainty (safety) factors. Regulatory Toxicology and Pharmacology 3, 224–238.
Duffus, J.H., Nordberg, M., Templeton, D.M., 2007. Glossary of Terms Used in Toxicology, second ed. IUPAC Recommendations 2007. International Union of Pure and Applied Chemistry. Chemistry and Human Health Division. Pure and Applied Chemistry 79 (7), 1153–1344.
European Chemical Agency (ECHA), 2008. Guidance for the implementation of REACH. Guidance on information requirements and chemical safety assessment. Part B: Hazard assessment.
European Commission, 1967. Council Directive 67/548/EEC of 27 June 1967 on the approximation of laws, regulations and administrative provisions relating to the classification, packaging and labelling of dangerous substances. Official Journal of the European Communities 196, 1.
European Union, 2003. Technical Guidance Document on Risk Assessment. Part I. Risk Assessment for Human Health. Institute for Health and Consumer Protection, European Chemical Bureau, second ed. (Chapter 2).
Fung, K., Marro, L., Krewski, D., 1998. A comparison of methods for estimating the benchmark dose based on overdispersed data from developmental toxicity studies. Risk Analysis 18, 329–342.
Gaylor, D., Ryan, L., Krewski, D., Zhu, Y., 1998. Procedures for calculating benchmark doses for health risk assessment. Regulatory Toxicology and Pharmacology 28, 150–164.
Grandjean, P., Weihe, P., 2008. Developmental origins of environmentally induced disease and dysfunction. In: International Conference on Foetal Programming and Developmental Toxicity, Tórshavn, Faroe Islands, 20–24 May 2007. Basic and Clinical Pharmacology and Toxicology 102 (2), 71–273.
Grandjean, P., Bellinger, D., Bergman, A., Cordier, S., Davey-Smith, G., et al., 2008. The faroese statement: human health effects of developmental exposure to chemicals in our environment. Basic and Clinical Pharmacology and Toxicology 102 (2), 73–75.
Hattis, B., Erdreich, L., Ballew, M., 1987. Human variability in susceptibility to toxic chemicals – a preliminary analysis of pharmacokinetic data from normal volunteers. Risk Analysis 7, 415–426.
Hothorn, L.A., Hauschke, D., 2000. Identifying the maximum safe dose: a multiple testing approach. Journal of Pharmaceutical Statistics 10 (1), 15–30.
International Programme on Chemical Safety (IPCS), 1999. Principles for the assessment of risks to human health from exposure to chemicals. Environmental Health Criteria 210.
International Programme on Chemical Safety (IPCS), 2001. Principles for the assessment of risks to reproduction associated with exposure to chemicals. Environmental Health Criteria 225.
Kemikalieinspektionen (KEMI, the Swedish Chemical Agency), 2003. Human health risk assessment: proposals for the use of assessment (uncertainty) factors – application to risk assessment for plant protection products, industrial chemicals and biocidal products within the European Union.
Klimisch, H.J., Andreae, M., Tillmann, U., 1997. A systematic approach for evaluating the quality of experimental toxicological and ecotoxicological data. Regulatory Toxicology and Pharmacology 25, 1–5.
Moore, J.A., Daston, G.P., Faustman, E., Golub, M.S., Hart, W.L., et al., 1995. An evaluative process for assessing human reproductive and developmental toxicity of agents. Reproductive Toxicology 9, 61–95.
OECD, 2007. Manual for Investigation of HPV Chemicals.
Office of Environmental and Health Hazard Assessment (OEHHA), 2001. Proposition 65. Process for Developing Safe Harbor Numbers. Reproductive and Cancer Hazard Assessment Section. California Environmental Protection Agency. 22 CCR Safe Harbor Regulations.
Parham, F., Portier, C.H., 2005. Benchmark dose approach. In: Edler, L., Kistos, C. (Eds.), Recent Advances in Quantitative Methods in Cancer and Human Health Risk Assessment. Wiley and Sons, pp. 239–256.
Renwick, A.G., Dorne, J.L., Walton, K., 2000. An analysis of the need for additional uncertainty factor for infants and children. Regulatory Toxicology and Pharmacology 31, 286–296.
Sharpe, R.M., 1994. Regulation of spermatogenesis. In: Knobil, E., Neill, J.D. (Eds.), The Physiology of Reproduction, second ed. Raven Press, New York, pp. 1363–1464.
Slob, W., 2002. Dose–response modeling for continuous endpoints. Toxicological Sciences 66, 298–312.
Squire, R.A., 1984. Carcinogenic potency and risk assessment. Food Additives and Contaminants 1, 221–231.
Tamhane, A.C., Kunyang, S.H.I., Strassburger, K., 2006. Power and sample size determination for a stepwise test procedure for finding the maximum safe dose. Journal of Statistical Planning and Inference 136, 2163–2181.
Tamhane, A.C., Dunnett, C.W., Green, J.W., Wetherington, J.D., 2001. Multiple test procedure for identifying the maximum safe dose. Journal of the American Statistical Association 96, 835–843.
Thomas, J.M., Thomas, J.A., 2003. Toxic response of the reproductive system. In: Klaassen, C.D., Watkins, J.B., 3rd (Eds.), Essentials of Toxicology. Casarett and Doull’s. McGraw Hill, pp. 301–315.
Toppari, J., Larsen, J.C., Christiansen, P., Giwercman, A., Grandjean, P., et al., 1996. Male reproductive health and environmental xenoestrogens. Environmental Health Perspectives 104 (Suppl. 4), 741–803.
US Environmental Protection Agency (US EPA), 1991. Guidelines for developmental toxicity risk assessment.
US Environmental Protection Agency (US EPA), 1994. Methods for derivation of inhalation reference concentrations and application of inhalation dosimetry.
US Environmental Protection Agency (US EPA), 1996. Guidelines for reproductive toxicity risk assessment.
US Environmental Protection Agency (US EPA), 2000. Benchmark dose technical guidance document. External review draft.
Walton, K., Dorne, J.L.C.M., Renwick, A.G., 2001. Default factors for interspecies differences in the major routes of xenobiotic elimination. Human and Ecological Risk Assessment 7, 181–201.