Alternatives in cosmetics testing


Toxic. in Vitro Vol. 9, No. 6, pp. 827-838, 1995. Elsevier Science Ltd. Printed in Great Britain


N. LOPRIENO*, L. H. BRUNER†, G. J. CARR‡, M. CHAMBERLAIN§, M. COTTIN||, O. DE SILVA|| and S. KATO¶

*Department of Environmental Sciences, University of Pisa, 56123 Pisa, Italy, †The Procter & Gamble Company, Health and Beauty Care Ltd, Regulatory and Clinical Development, Egham, Surrey TW20 9NW, UK, ‡The Procter & Gamble Company, Biometrics and Statistical Sciences Department, Regulatory and Clinical Development, Miami Valley Laboratories, Cincinnati, OH 45239, USA, §Environmental Safety Laboratory, Unilever Research and Engineering, Colworth House, Sharnbrook, Bedford MK44 1LQ, UK, ||Central Department of Product Safety, Recherche Avancée, L’Oréal, 93601 Aulnay-sous-Bois, France and ¶Cutaneous Biology Research Center, Harvard Medical School, Department of Dermatology, Massachusetts General Hospital, Charlestown, MA 02129-2060, USA

Summary: This paper represents a summary of presentations made during a round-table discussion at the ECVAM Opening Symposium. After introductory comments on the cosmetic industry’s use of alternative methods in the safety assessment process, the use of alternative methods by L’Oréal and by the Japanese cosmetic industry is outlined, current validation studies in Japan are noted, and the involvement of COLIPA, the European Cosmetic, Toiletry and Perfumery Association, in promoting the use of alternative methods is discussed. Two final sections deal with the effect of data variability on the performance of alternative methods in validation studies and with the integrated use of quantitative structure-activity relationship (QSAR) analysis with other approaches in the safety assessment process.

Introduction: the cosmetic industry’s use of alternatives in the safety assessment process

In order to appreciate how alternatives are used by the cosmetics industry, it is necessary to understand the safety assessment process typically used by the industry. The knowledge and judgement of experienced toxicologists are central to the process. Such individuals typically have several years’ experience in the safety assessment of cosmetics products and have extensive knowledge of the toxic effects of cosmetics ingredients. The safety assessment process is characterized by relative assessments, that is, assessments relative to existing products previously or currently marketed. Feedback from market experience is also an important component of the assessment. Conducting many safety assessments over time leads to the development of (often implicit) algorithms, which are used in the process. Since the experience gained is dependent on the product type and on marketing experience, the algorithms developed tend to be company specific. Despite overlap between companies, the details of the safety assessment process (and of the toxicology data and other knowledge used) will therefore be different. Thus, the challenge for cosmetics companies is to define the generic framework of the safety evaluation process in which in vitro assays are used.

In vitro tests are not used in isolation. Typically, they are used against a background of an in vivo ingredient database. Such databases tend to span many years and contain results from single substances, probably tested at different concentrations. In addition, they may contain data on formulations, but such data will often be more than 10-15 years old. The data will almost certainly have been produced using several variants of the in vivo test, with relatively few tests being conducted according to full OECD guidelines.

An example that illustrates many of the features of conducting a safety assessment using in vitro methods was recently published by O’Brien et al. (1994). This paper reports the process used to support the clearance to market of a new shampoo. The process was characterized by the use of several in vitro assays, and relevant benchmark standards were used for comparison. Rank order methods were used for analysis. Finally, a combination of historical in vivo data, in vitro data and knowledge of market experience was used in the process.

When using in vitro assays in safety assessment, it is important to remember that the identity of the test substance is known. This permits a review of existing data on this type of chemical. If testing is required, then in vitro tests are conducted and the results are compared with results obtained from similar substances of known toxicity or market acceptability. Only as a last resort, that is, when there is no other way of obtaining the data necessary for an adequate safety assessment, will a limited in vivo study be conducted on the raw material(s); in vivo testing of products is now extremely rare and some companies have an explicit policy against this, since they consider it unnecessary.

Abbreviations: CAM = chorioallantoic membrane; CIpred = 95% prediction level; CV = coefficient of variation; HET-CAM = hen’s egg test-chorioallantoic membrane; MAS = maximum average score; NRU = neutral red uptake; QSAR = quantitative structure-activity relationship; RBC = red blood cell.

L’Oréal’s use of alternative methods (M. Cottin and O. De Silva)

L’Oréal has been using alternative methods for more than 25 years, beginning with the use of skin cultures to study finished products. Many methods have since been introduced (Fig. 1), leading to the current widespread in-house use of alternative methods. At the same time, the use of animal models has diminished, ending with a total ban on animal testing of cosmetics some years ago.

Fig. 1. L’Oréal’s background in the use of alternative methods for the assessment of cosmetics. [Figure labels: alternative methods; skin culture; reconstituted skin; HET-CAM; silicon microphysiometer.]

Figure 2 shows the various uses of alternative methods at L’Oréal. There is obviously still a great need for basic research before relevant alternative methods can be made available for use in most areas of toxicology. The focus at L’Oréal is on three main areas: percutaneous absorption, skin sensitization and ocular irritation. As soon as a promising method is developed, it is assessed with various products for which historical in vivo eye or skin data are available. When a method proves to be relevant to a defined set of applications, it is used as a screening tool. A large number of such screening methods are now available. The final step in the development of a given alternative method is the prevalidation/validation process, in which in vitro methods are used, not for in-house purposes, but to help the scientific community to evaluate these methods.

For example, reconstructed skin models are used in these various steps (Fig. 3). Before deciding what type of reconstructed skin to use, various commercial systems were compared, together with a ‘de-epidermized dermis’ model, which is considered to be one of the best reconstructed skin systems (Roguet et al., 1994a). The characterization of these models was based on microscopic examination and biochemical analysis. The relevance of the reconstructed skin model to ocular and skin irritation (Roguet et al., 1993), photoirritation (Cohen et al., 1994), percutaneous absorption and metabolism was also studied. These studies helped to determine for which applications the methods were most appropriate, taking into account availability and economic factors. The reconstructed skin model is one in vitro method selected for the evaluation of eye irritation.

Figure 4 summarizes L’Oréal’s practical expertise in the use of alternative methods. Several hundred samples have been evaluated with some of these methods, and the results have been compared with in vivo historical data, when such data were available. Various statistical methods were used to select the best methods for studying various categories of products. This approach led us to abandon some irrelevant tests. L’Oréal’s experience strongly suggests that it is not possible to replace the Draize eye test with a single method, but that various sets of alternative

Fig. 2. L’Oréal’s progress in the development of alternative methods (October 1994). [Figure labels: assays; basic research; percutaneous absorption, genotoxicity, ocular irritation, immunotoxicity; ocular irritation (F, EC/HO and COLIPA), phototoxicity (EC/COLIPA), cutaneous irritation (F).]

Fig. 3. L’Oréal’s expertise with regard to reconstructed skin. [Figure labels: reconstructed skin (MATTEK, SKIN2, EPISKIN, DED, LSE); histology, electron microscopy, immunofluorescence, keratins analysis, lipids analysis; ocular irritancy (mascaras, creams); cutaneous irritancy; photoirritation/photoprotection; transepidermal passage and metabolization.]

methods will be needed according to the product category under investigation (Rougier et al., 1992).

These selected methods are now in use as part of a tier-testing scheme for the safety evaluation of new cosmetics (Cottin et al., 1994) (Fig. 5). The first step is the use of all available data on the ingredients and on similar existing cosmetics. This step is of great importance, and correct safety evaluation of cosmetic formulations requires complete and sound dossiers on the individual ingredients. The second step involves consideration of the conditions of use of the cosmetic under development. Safety evaluation of an eye make-up remover obviously differs from that of a day cream. Alternative methods are also used in some cases to check the tolerability of the new product, in which case they are used in combination, by comparing the new product with well-characterized control products.

For new chemicals (Fig. 6), during the preliminary project stage the chemist and the toxicologist attempt to identify any potential toxicological problems that can be approached by looking into structure-activity relationships (Shahin, 1989). This may lead to the synthesis of another, less reactive chemical, and thus avoid the need for testing. As soon as the chemical has been synthesized, it is evaluated in the first phase with alternative methods for assessing genotoxicity and acute effects in vitro. Selected products will then pass through the next phases, with in vitro and in vivo tests as part of the toxicology assessment. The use of in vitro methods permits the elimination of 90% of new chemicals, which means that few compounds will have to be assessed in vivo. About 1000 new ingredients have to date been tested in the Ames test and about 400 ingredients in the HET-CAM and cytotoxicity assays.

In conclusion, the two most important points to be made are that combinations of alternative methods for specific categories of cosmetic products have been developed, but that there is a need for basic research to devise more universal tests, based on in vivo mechanisms of toxicity, for the in vitro safety evaluation of new chemicals.

Fig. 4. L’Oréal’s expertise in the use of alternatives to the Draize eye test (October 1994). [Figure labels: fluorescein leakage test; silicon microphysiometer; reconstructed skin; EYTEX; agarose diffusion method; BCOP test; 2000 samples; 400 samples.]

Fig. 5. L’Oréal’s approach to safety for a new cosmetic product. [Figure labels: data on ingredients and close formulations (physicochemical data; in vivo and in vitro toxicological data; in-use data (human tolerance); cosmetovigilance); use (route of exposure, quantity, frequency, misuse); comparative studies with reference products (positive and negative).]

Experience of the Japanese industry in the use of alternatives (S. Kato)

Research on alternative test methods and their application in Japan, conducted by cosmetic companies, seems to differ in emphasis from validation programmes designed by academic researchers and/or the Ministry of Health and Welfare. Whereas cosmetic companies probably try to establish alternatives to predict hazardous effects as the first priority, academic scientists tend to concentrate on basic research, using in vitro techniques as tools to examine the mechanisms of toxicological and pharmacological actions. The role of the regulatory agencies is to establish suitable rules for deciding whether applications for approval of chemicals to be used as ingredients for cosmetics will be accepted (Table 1).

Shiseido Research Center has been developing a range of in vitro tests and examining their application. We believe that it is important to be able to predict many facets of toxicological phenomena by means of in vitro tests, for both animal welfare and economic reasons. We have focused especially on in vitro tests to replace the Draize eye irritation test, which has been heavily criticized from the animal welfare point of view. At first sight, it might seem that a detailed scientific understanding is not necessary, as long as in vitro test data correspond to the results of in vivo tests. However, what we have been trying to do is to give the in vitro tests scientific credibility. We feel it necessary that what we have done should be exposed to scientific review, and so we have presented what we have accomplished at scientific meetings or in peer-reviewed journals. Our experience is that research on alternatives or in vitro techniques also contributes greatly to our understanding of the mechanisms of toxicological action of chemicals and can generate new ideas for basic research as well.

Fig. 6. Alternative methods as part of the safety evaluation procedure of a new ingredient. [Figure labels: new ingredient; physicochemical properties, molecular structure (QSAR), potential reactivity sites; synthesis; assessment committee; rejection; in vitro screening (acute potential, genotoxicity); screening; adjunct.]

Table 1. Toxicity studies required in Japan for applications for approval to manufacture cosmetics

Toxicity study: 1. Acute toxicity; 2. Primary skin irritation; 3. Cumulative skin irritation; 4. Skin sensitization; 5. Phototoxicity; 6. Photosensitization; 7. Eye irritation; 8. Mutagenicity; 9. Human patch testing
Ingredients: + + + + +I+/+ + +
Products: +I_ _ _ - +I-

+ = required; - = not required; +/- = conditional.

The Japanese Society of Alternatives to Animal Experiments (JSAAE) is the forum for scientists in Japan to communicate research on alternatives. A total of 216 papers have been presented at the annual meetings of the Society during the last 7 years, one-sixth of them by members of the Japan Cosmetic Industry Association (JCIA) (Table 2). One of the special programmes that the JSAAE has introduced to encourage scientists to present their research is the so-called ‘Golden Presentation Award’, instituted 4 years ago. The award is given for the best poster presentation, based on the votes of participants or of the members of the Council of the Society.

One of the key factors to be borne in mind in establishing alternatives is that the techniques used must be scientifically acceptable. Based on research of our own, coupled with literature surveys and the results of validation programmes, we consider that the eye irritation potential of water-soluble chemicals and products can be predicted by means of cytotoxicity tests. The chorioallantoic membrane (CAM) test, which involves the use of fertilized hen’s eggs, or a modified technique incorporating the use of trypan blue, which stains damaged tissue and gives more objective values than the original CAM technique, is useful for water-insoluble chemicals or final products (Hagino et al., 1993).

On the basis of multivariate factorial analysis of data obtained in seven in vitro tests for predicting eye irritation (including the EYTEX, SIRC and HeLa cytotoxicity tests, CAM test, liposome, red blood cell and haemoglobin denaturation), Hayashi et al. (1994) have reported that cellular plasma membrane destruction and protein denaturation contribute significantly to Draize eye irritation scores. This finding suggests that a combination of two or more tests can be used as an effective alternative system, and also supports the view that the CAM test is a promising candidate for use as an alternative to the eye test, since the CAM test shows the effects of both phenomena.

With regard to alternatives to in vivo skin irritation tests, cytotoxicity tests and tests on artificial skin are among the most useful approaches. As for the alternatives to phototoxicity tests for predicting the hazardous effects of fragrances or chemicals that absorb in the UV range, the combination of a photohaemolysis test using red blood cells and a yeast growth inhibition assay is very useful (Sugiyama et al., 1994).

Research on the use of alternatives to predict contact allergenicity is at an early stage. Assays for measuring the binding of chemicals to proteins, and a local lymph node assay using mice, have been reported; the latter method is based on tritium-labelled thymidine incorporation. We have tried to develop a technique that does not use radioactive materials, which would be more practical. Hatao et al. (1995) looked at the changes in interleukin-2 production by cells derived from lymph nodes in the presence of various chemicals, and reported that this technique was potentially useful as an alternative for predicting contact sensitization.

Table 2. Number of presentations at the meetings of the Japanese Society of Alternatives to Animal Experiments (JSAAE)

Columns (meeting no. and date): 1 (Feb 1988), 2 (Jan 1989), 3 (Oct 1989), 4 (Oct 1990), 5 (Nov 1991), 6 (Dec 1992), 7 (Dec 1993), Total.
Rows (subject): Acute toxicity, Reproductive toxicity, Mutagenicity, Carcinogenicity, Eye irritation, Skin irritation, Phototoxicity, Sensitization, Other, Total.
Grand total: 216 (35)*. [Individual cell values are not legible in the source.]

*Numbers in parentheses show the number of presentations from member companies of the Japan Cosmetic Industry Association (JCIA).

Validation programmes in Japan (S. Kato)

Two validation programmes in Japan have been sponsored by the Ministry of Health and Welfare (MHW) and the JSAAE. Cosmetics companies have participated in both studies. The MHW project, headed by Dr Yasuo Ohno, Director of the National Institute of Health Sciences, has reached the end of its second phase and is entering its third phase (Ohno et al., 1994); 22 organizations, including 14 cosmetic companies, have participated in the project; 11 techniques are being validated with over 50 chemicals. An outline of the programme was presented by Dr Ohno at an international conference held in the United States in 1993 and at other conferences. Dr Tadao Ohno, of the Institute of Physical and Chemical Research (RIKEN), described the JSAAE programme at the INVITOX Meeting, which was held in Switzerland in September 1994 (Ohno et al., 1995). Essentially, the programme is the first-stage validation of five in vitro cytotoxicity assays on five chemicals using five cell lines; 44 laboratories are participating in the programme. A specific feature of both programmes is seminars on technology transfer, to standardize the techniques to be used in order to minimize variation among the laboratories involved and to confirm the detail of the protocols.

Alternative testing methods are likely to achieve widespread acceptance if they have a well-defined scientific basis and if they do not require difficult or complicated procedures. If methods meeting these criteria were properly validated, they would have a good chance of being accepted by the general public as well. Keeping these points in mind, the following can be stated:

Alternatives to the Draize eye irritation test are at the final stage of validation, and, in the near future, it is possible that several methods or combinations of methods will be accepted to replace the Draize eye irritation test or at least part of its use.

Methods for testing phototoxicity and skin irritation are now at the validation stage.

Methods for predicting effects such as contact allergenicity, where the underlying mechanism is still not clearly understood or the techniques used are rather complicated, are still at the research stage, and alternatives are unlikely to be practically available in the near future.

Research on alternative test methods is progressing at a rapid rate in Japan, and validation programmes are under way there, as in various other countries. I hope that ECVAM will play a significant role as the centre for co-ordinating validation programmes in Europe, and that, by sharing information worldwide, we can speed the development and acceptance of useful and reliable alternatives.

COLIPA’s involvement in promoting the use of alternative methods (M. Cottin)

COLIPA is the French abbreviation for the European Cosmetic, Toiletry and Perfumery Association. This Association has been asked by its national members to encourage and co-ordinate work in the development and validation of alternative methods to animal testing. To meet that challenge, COLIPA has created a Steering Committee for Alternatives to Animal Testing (SCAAT), which is directly responsible to the Association’s International Companies Council (ICC) (Fig. 7). SCAAT brings together toxicological R&D experts from several major cosmetics companies. They guide the strategy for leading the cosmetics industry’s efforts to replace animal tests with alternative methods. Various task forces on alternative methodology have been established, and are composed of cosmetics company scientists with specific experience in the use and development of non-animal testing strategies. These task forces are responsible for developing working plans dealing with non-animal testing in the safety assessment of cosmetics.

Fig. 7. COLIPA organization to co-ordinate efforts to develop alternative methods. [Figure labels: BOD (Board of Directors); SCAAT (Steering Committee for Alternatives to Animal Testing); TF (Task Forces): methodology alternatives, human skin compatibility, percutaneous absorption, in vitro photoirritation.]

COLIPA is fully aware that there will be little progress in the field of alternatives without communication among the various interested parties. Therefore, SCAAT has organized meetings at the European level with representatives of the European Commission (DGXXIV), the European Parliament (STOA), the Scientific Committee for Cosmetology (SCC) and ECVAM, to present the efforts made by the cosmetics industry and to discuss the activities being co-ordinated by COLIPA (Fig. 8). Furthermore, in order to establish a global basis for its work, in November 1993 in Baltimore, SCAAT organized a meeting between European experts and representatives from the American (CTFA) and Japanese (JCIA) trade associations. This meeting provided the opportunity to find new means of collaboration. As a direct result of this, both American and Japanese companies are now joining COLIPA in validation exercises.

Fig. 8. Main COLIPA interactions to promote the validation of alternative methods (joint meeting JCIA/CTFA/COLIPA, Baltimore, November 1993). [Figure labels: Commission; SCC; STOA; CTFA; ECVAM; JCIA.]

This commitment of the European cosmetics industry to the development of alternative methods also focuses on technical work. COLIPA is supporting a validation project on in vitro alternatives and eye irritation and, together with ECVAM, a validation project on in vitro tests for photoirritation. In addition, two specific task forces have been set up under the umbrella of SCAAT, one on human skin compatibility and the other on percutaneous absorption.

Figure 9 shows the main features of the in vitro eye irritation project. This COLIPA study has been designed to complement and extend the EC/HO validation study, which has now finished. Six alternative methods and some test substances are therefore common to the two studies.

Fig. 9. COLIPA eye irritation validation project. [Figure content: Goal: ‘To determine if the results from a selected set of in vitro methods are valid replacements for animal tests in determining the eye irritation potential of cosmetics formulations and ingredients’. Methods (10): neutral red release, neutral red uptake, fluorescein leakage, silicon microphysiometer, red blood cell assay, pollen tube assay, HET-CAM, CAMVA, EYTEX, tissue equivalent assay. Test substances: 23 ingredients and 32 formulations. Laboratories: 33. End of study: July 1995.]

The first phase of the photoirritation project is now complete, with the testing of 20 phototoxic and non-phototoxic chemicals successfully completed (Liebsch et al., 1994; Spielmann et al., 1994). The main features of Phase II are shown in Fig. 10. The neutral red uptake (NRU) and red blood cell (RBC) haemolysis assays are the core tests under validation in Phase II of the study, whereas the other tests involved are at the prevalidation phase.

The goal of the Task Force on Human Skin Compatibility is to discuss testing cosmetic products for skin compatibility in human subjects. This means examining how a ‘consensus’ testing protocol could be developed and also defining the criteria needed to cover the ethical aspects involved in making use of human subjects. The outcome of this discussion should be the establishment of common guidelines on the use of human subjects for skin compatibility testing, without the use of animals at all.

The second task force is discussing the in vitro measurement of percutaneous absorption, together with in vivo/in vitro comparisons. The objective is to agree common guidelines for the in vitro study of percutaneous absorption of cosmetic ingredients.

In summary, COLIPA’s contribution to the development and validation of alternative methods is to promote the state-of-the-art validation of promising alternative methods, to make available to EU regulatory bodies and their scientific advisers the data generated in these validation exercises, and, finally, to support efforts at obtaining regulatory acceptance of alternative methods demonstrated to be reliable and relevant.

Fig. 10. COLIPA/ECVAM in vitro photoirritation validation project (Phase II). [Figure content: Goal: ‘To determine if currently selected in vitro methods are capable of properly predicting the photoirritation potential to humans of chemicals applied via the systemic route or topically to the skin’. Methods shown: histidine oxidation, complement assay, Skin2 1350, keratinocytes, SOLATEX PI, protein binding. Test substances: 30. Laboratories: 10. End of study: 28 February 1995.]

Effect of data variability on alternative methods’ performance in validation studies (L. H. Bruner and G. J. Carr)

Developers of eye irritation alternative methods often prepare scatterplots showing the relationship between in vivo irritation scores [such as the Maximum Average Eye Irritation Score (MAS)] and the corresponding alternative method scores when assessing the performance of the new methods (Bagley et al., 1992; Gettings et al., 1991). This is done in order to view the degree of association between the two datasets, and to develop prediction models that define how to convert the results of an alternative method into a prediction of in vivo toxicity. Such plots are useful, because they provide a clear visual representation of the actual data points and their distribution, the error bars on each data point, the results of a regression analysis, a correlation coefficient, and the 95% confidence interval for any in vivo toxicity prediction (Snedecor and Cochran, 1980). If the relationship between the in vivo and alternative method data is strong enough, it may be possible to develop prediction models that can be tested in validation studies.

It would be most useful to have alternative methods that produced data points clustered tightly around a regression line, had correlation coefficients approaching either 1 or -1, and had narrow confidence intervals associated with future predictions (a prediction interval). Unfortunately, there is an important factor that prevents alternative methods from reaching this performance ideal. This factor is the variability in measurements obtained from currently available in vivo toxicity tests. Variation in the results from the alternative method can be a further limiting factor.

In order to study how the results from a validation study may be affected by variability in the data, we used a computer simulation based on the assessment of eye irritation alternative methods. For this simulation, we used an algorithm having the simple linear relationship of equation (1):

y = (1.1)x    (1)

where x is taken to be an alternative method score that has values in the range 0 to 100, thus implying that y, the in vivo score, ranges from 0 to 110, as for the MAS. We introduced different levels of error in both measures in order to assess the effects of variability on the correlation coefficient. The coefficient of variation applied to the alternative method response, x, was held constant over the full range 0 to 100. The coefficient of variation applied to the in vivo response, y, was based on the minimum distance of a particular y value from either end of the 110-point Draize test scale. This follows from the observation that variability in eye irritation scores decreases as the score approaches either extreme of the scale (De Sousa et al., 1984; Talsma et al., 1988).

Data from Weil and Scala (1971) provided a basis for assigning a level of variability to the y values. The MAS as defined by Draize et al. (1944) was computed by using data given in the original Weil and Scala publication. The coefficient of variation (CV) for the MAS was also calculated for each test substance. The degree of variation among the laboratories conducting the Draize eye irritation test on the same substances was strikingly large, ranging between 40 and 60% for a six-animal rabbit eye irritation test. The variability in alternative method data is typically less than that in the in vivo test, with CVs ranging between 10 and 25% (R. Curren, personal communication, Microbiological Associates, Rockville, MD, USA).

After a large number of points were simulated for each case, the Pearson correlation statistic was derived, in order to determine the correlation between the x and y values. A second set of x values ranging from 0 to 40 was also run, to simulate results for eye irritation scores that might be observed with a more restricted class of test substances, such as cosmetics products. A second set of simulations was conducted in order to determine the 95% prediction interval (CIpred) of an in vivo maximum average score of 55. A tight linear relationship with nearly perfect correlation and a small 95% CIpred will exist when there are negligible levels of error in both x and y. As error is introduced into either x or y, the expected level of correlation will be reduced.

Table 3. Expected Pearson correlation coefficients when the errors in in vivo and alternative method data are considered

Imposed CV                        Expected Pearson’s correlation coefficient
Alternative method   In vivo      Full range (x = 1-100)   Restricted range (x = 1-40)
0.05                 0.05         0.994                    0.990
0.1                  0.1          0.975                    0.960
0.2                  0.4          0.860                    0.719
0.2                  0.5          0.828                    0.652
0.2                  0.6          0.803                    0.608

Computer simulations were used to assess the effects of variability in eye irritation test and alternative method data on the correlation coefficients expected between the datasets. The model used in the simulation assumed that the algorithm, y = (1.1)x, describes the relationship between the in vivo and alternative method data. The simulations were conducted with test substances having the full range of response (x = 1-100) and for a restricted range representing the least irritating part of the eye irritation scale (x = 1-40). Each of the tabled correlation coefficients is based on 10,000 runs of the simulation. Results are shown for the simulation where the variability is set relatively low (in vivo and alternative method CV = 0.05-0.1), and where the variability was set at a level consistent with performance of currently available alternative methods and the in vivo test (in vivo CV = 0.4-0.6 and alternative method CV = 0.2). The results of these simulations demonstrate that variability in the datasets can have a significant effect on the correlation between the two datasets. This must be taken into account when the performance of an alternative method is assessed.

The effects of variability on the correlation coefficients between sets of in vivo and alternative method data are summarized in Table 3 and illustrated in Fig. 11. The effects on the size of the Pearson correlation coefficient due to error imposed on the in vivo and alternative method responses are shown under two conditions. In the first case (Table 3, Fig. 11a), the imposed variation is set relatively low (in vivo and alternative method CV = 0.05-0.1). As expected, Pearson correlation coefficients are large, ranging between 0.97 and 0.99. Restricting the alternative method results to the least irritating portion of the Draize scoring scheme (x = 0-40) has little

[Figure 11: panels (a) and (b); horizontal axis: in vitro score.]

Fig. 11. Effects of variability on the correlation between results from the Draize test and alternative methods. Computer simulations were used to assess the effects of variability in the eye irritation test and alternative method data on the correlation between the two data sets. The model used in the simulation assumed that the algorithm, y = (1.1)x, describes the relationship between the in vivo and alternative method data. Values for x = 0-100 were used to simulate responses across the entire Draize eye irritation scale. Different levels of variability were added to the alternative method and in vivo scores in each run of the simulation. The x and y values generated in 1000 runs of the simulation are plotted on the figures. (a) The expected relationship between the maximum average score (MAS) and the alternative method results when the variability is relatively low. In this case, the CVs applied to both the in vivo and alternative method data were 0.05. (b) The expected relationship between the MAS and alternative method results under typical conditions. The CVs applied to the in vivo and alternative method data were 0.5 and 0.2, respectively.


Table 4. 95% Confidence intervals for the prediction of an in vivo eye irritation score of 55 from an alternative method (95% CIpred). Sample size n = 50

CV (alternative method)   CV (in vivo)   95% CIpred   SD of 95% CIpred
0.05                      0.05           ±7.1         0.8
0.1                       0.1            ±14.2        1.5
0.2                       0.4            ±34.9        3.4
0.2                       0.5            ±40.0        4.1
0.2                       0.6            ±45.6        4.6

The mean 95% CIpred is shown for a reference set of test substances (RSTS) where n = 50. The variability in the 95% CIpred is indicated as the standard deviation (SD) of the 95% CIpred. Each of the values shown is based on 1000 runs of the simulation. As the variability in the in vivo and alternative method data increases, the 95% CIpred becomes wider, ultimately encompassing a large proportion of the 110-point Draize eye irritation scale.

effect on the correlation coefficients (Table 3). In the second case (Table 3, Fig. 11B), the CVs applied to the data are consistent with those observed in typical Draize tests and current alternative methods (in vivo CV = 0.5, alternative method CV = 0.2). Under these circumstances, the correlation is still high (> 0.8) with the full set of test substances (x = 0-100). Restricting the range of alternative method responses to the least irritating materials (x = 0-40) results in a decrease in the correlation coefficient to an approximate range of 0.6-0.7.

The effect of variability on the 95% CIpred is shown in Table 4. Under ideal conditions, where the variability is set relatively low (CV = 0.05-0.1), the 95% CIpred represents a relatively small proportion of the entire Draize scoring scheme. When the variability is set at levels more consistent with current method performance, the 95% CIpred is significantly wider, encompassing a large proportion of the 110-point Draize eye irritation scale.

It is important to note that these simulation studies were conducted under idealized conditions. The underlying assumption is that the relationship between the in vivo and alternative method data is linear. In addition, the number of substances included in the simulations was large (10,000), so that the resulting correlation would be close to the true value. In any one validation study in which all of our assumptions were true (hypothetically speaking, of course), the correlation coefficient would be above or below that average value to some extent, owing to variability in the data. Deviations from the conditions assumed in these simulations, such as non-linearity or non-uniform distribution of responses, may reduce the level of correlation. Since it is unlikely that the results from an alternative method are so simply related to a particular in vivo toxicity, it can be expected that validation studies will result in lower correlation coefficients, even for those alternative methods that may actually be reasonably predictive of the in vivo response.
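The prediction-interval simulation summarized in Table 4 can be illustrated in the same spirit. The sketch below is an assumption-laden illustration rather than the authors' procedure: it fits an ordinary least-squares line to n = 50 simulated substances, computes the textbook 95% prediction-interval half-width for a new in vivo score at the x value corresponding to a score of 55, and repeats this 1000 times. The uniform spread of x over 0-100, the proportional normal error and the names `prediction_half_width` and `simulate_ci_pred` are all assumptions introduced for the example.

```python
import math
import random
import statistics

# Two-sided 95% Student t quantile for 48 degrees of freedom (n = 50, n - 2).
T_975_DF48 = 2.0106


def prediction_half_width(xs, ys, x0, t_crit):
    """Half-width of the least-squares prediction interval for a new y at x0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    return t_crit * s * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)


def simulate_ci_pred(cv_alt, cv_in_vivo, runs=1000, n=50, slope=1.1,
                     target_y=55.0, t_crit=T_975_DF48, seed=0):
    """Mean and SD of the 95% prediction half-width over repeated studies.

    Assumptions (not taken from the source): x is uniform on [0, 100] and
    errors are normal with SD = CV * true value. t_crit is valid for n = 50.
    """
    rng = random.Random(seed)
    x0 = target_y / slope  # alternative method score that maps to target_y
    widths = []
    for _ in range(runs):
        xs, ys = [], []
        for _ in range(n):
            x_true = rng.uniform(0.0, 100.0)
            y_true = slope * x_true
            xs.append(rng.gauss(x_true, cv_alt * x_true))
            ys.append(rng.gauss(y_true, cv_in_vivo * y_true))
        widths.append(prediction_half_width(xs, ys, x0, t_crit))
    return statistics.mean(widths), statistics.stdev(widths)


low_mean, low_sd = simulate_ci_pred(cv_alt=0.05, cv_in_vivo=0.05)
typ_mean, typ_sd = simulate_ci_pred(cv_alt=0.2, cv_in_vivo=0.5)
print(f"low variability:     +/-{low_mean:.1f} (SD {low_sd:.1f})")
print(f"typical variability: +/-{typ_mean:.1f} (SD {typ_sd:.1f})")
```

The numbers printed depend on the assumed error model and should not be expected to agree with Table 4; the point of the sketch is only that the interval widens sharply as the CVs move from the low to the typical setting.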


Variability can have a significant effect on the overall relationship between the two data sets evaluated in a validation study. This includes variability in results from both the alternative and the in vivo tests. These effects must be factored into each evaluation of data obtained in a validation study, and the expectations placed on the performance of the alternative methods must take such variability into account.

QSAR (M. Chamberlain)

Traditionally, the selection of test chemicals for inclusion in validation studies has essentially been a haphazard process. In particular, among the different classes of chemicals that have been selected, there has often been a preponderance of a single chemical type, particularly with respect to mechanism of action. Quantitative Structure-Activity Relationships (QSARs) can often provide a systematic rationale for the selection of test chemicals for inclusion in a validation study. The basis for the QSAR approach has been provided by Barratt (1995a): ‘The principles of QSAR are based on the premise that the properties of a chemical are implicit in its molecular structure. It therefore follows that, if the mechanism for the activity of a group of chemicals can be elucidated and relevant parameters measured or calculated, then, in principle, a structure-activity relationship can be established.’ Recently, Barratt (1995b) conducted a QSAR analysis of an eye irritation dataset. This work has clearly demonstrated the potential applications of QSAR, which could lead to the following:

1. Identification of molecular features associated with eye irritation;
2. Prediction of eye irritation potential;
3. Definition of chemicals that can be used to validate an in vitro test;
4. Understanding the importance of uncertainty that arises from biological variability;
5. Evaluation of the appropriateness of an in vitro method in terms of operation of relevant mechanisms.
Principal components analysis is a method that has proved particularly useful for visualizing the relationship between mechanistically relevant features and biological activity (in this case, in relation to eye irritation). The position of a chemical on the principal components plot is determined by its molecular features. If the molecular features relevant to the mechanism of action of the set of test chemicals have been included in the plot, then the active (irritant) chemicals will cluster in a chemical parameter space that is separate from that occupied by the inactive (non-irritant) chemicals. This plot can then be used to predict the potential toxicity of an untested chemical, as long as it operates by means of the same mechanism as the chemicals used to generate the original plot. Different regions of chemical parameter space can be explored, to test the model by identification of specific chemicals in that space.

Classification of chemicals imposes artificial boundaries on continuous biological data. The position of the boundary can be visualized and understood in terms of chemical parameter space; that is, those chemicals that are near the boundary between irritant and non-irritant can be readily identified. Predictions of the eye irritation potential of chemicals well away from the boundary can be made with a high degree of certainty, as long as there is adequate mapping of the chemical parameter space. However, predictions for chemicals near the boundary may be less certain. This would be manifest, for example, in two Draize rabbit eye irritation tests on the same chemical leading to the classification of ‘irritant’ in one case and ‘non-irritant’ in the other.

This has implications for the selection of chemicals that are included in validation studies. Clearly, if only those chemicals that are well away from the classification boundary are included in the study, then there is a lower probability of obtaining ambiguous data (poor correlations). However, those chemicals that are near or at the boundary could well lead to ambiguous data being produced by the in vitro tests that are being validated. This would occur, not because of any intrinsic failure of the in vitro test, but because of high variability in assignment of classification from the animal test. Thus, such QSAR modelling can help to minimize the probability of producing ambiguous data in validation studies.
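The point about chemicals near the classification boundary can be made concrete with a small calculation. The sketch below is purely illustrative: it assumes that a single animal test score is normally distributed around a chemical's true score with SD = CV times that score (CV = 0.5, as quoted earlier for typical Draize tests), and it uses a hypothetical irritant cut-off of 25 that is not taken from the source. The function name `discordance_probability` is likewise hypothetical.

```python
from statistics import NormalDist


def discordance_probability(true_score, cutoff, cv):
    """Probability that two independent tests classify a chemical differently.

    Assumption (not from the source): one test score is normal with mean
    true_score and SD = cv * true_score; scores above cutoff are 'irritant'.
    """
    dist = NormalDist(mu=true_score, sigma=cv * true_score)
    p_irritant = 1.0 - dist.cdf(cutoff)
    # One test says 'irritant' and the other 'non-irritant', in either order.
    return 2.0 * p_irritant * (1.0 - p_irritant)


CUTOFF = 25.0  # hypothetical classification boundary on the scoring scale
for score in (5.0, 15.0, 25.0, 40.0, 60.0):
    p = discordance_probability(score, CUTOFF, cv=0.5)
    print(f"true score {score:5.1f}: P(discordant classification) = {p:.3f}")
```

Under these assumptions the chance of two tests disagreeing peaks for a chemical whose true score sits on the boundary and falls away on either side, which is the behaviour the text attributes to chemicals near the boundary in chemical parameter space.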

REFERENCES

Bagley D. M., Bruner L. H., De Silva O., Cottin M., O’Brien K. A. F., Uttley M. and Walker A. P. (1992) An evaluation of five potential alternatives in vitro to the rabbit eye irritation test in vivo. Toxicology in Vitro 6, 275-284.

Barratt M. D. (1995a) Quantitative structure activity relationships for skin corrosivity of organic acids, bases and phenols. Toxicology Letters 75, 169-176.

Barratt M. D. (1995b) A quantitative structure activity relationship for eye irritation potential of neutral organic chemicals. Toxicology Letters. In press.

Cohen C., Dossou K. G., Rougier A. and Roguet R. (1994) Episkin: an in vitro model for the evaluation of phototoxicity and sunscreen photoprotective properties. Toxicology in Vitro 8, 669-671.

Cottin M., Dossou K. G., De Silva O., Tolle M., Roguet R., Cohen C., Catroux P., Delabarre I., Sicard C. and Rougier A. (1994) Relevance and reliability of in vitro methods in ocular safety assessment. In Vitro Toxicology 7, 277-282.

DeSousa D. J., Rouse A. A. and Smolon W. J. (1984) Statistical consequences of reducing the number of rabbits utilized in eye irritation testing: data on 67 petrochemicals. Toxicology and Applied Pharmacology 76, 234-242.

Draize J. H., Woodard G. and Calvery H. O. (1944) Methods for the study of irritation and toxicity of substances applied topically to the skin and mucous membranes. Journal of Pharmacology and Experimental Therapeutics 82, 377-390.

Gettings S. D., Bagley D. M., Demetrulias J. L., Dipasquale L. C., Hintze K. L., Rozen M. G., Teal J. J., Weise S. L., Chudkowski M., Marenus K. D., Pape W. J. W., Roddy M. T., Schnetzinger R., Silber P. M., Glaza S. M. and Kurtz P. J. (1991) The CTFA Evaluation Alternatives Program: an evaluation of in vitro alternatives to the Draize primary eye irritation test. (Phase I) Hydro-alcoholic formulations; (part 2) Data analysis and biological significance. In Vitro Toxicology 4, 247-288.

Hagino S., Itagaki H., Kato S. and Kobayashi T. (1993) Further evaluation of the quantitative chorioallantoic membrane test using trypan blue stain to predict the eye irritancy of chemicals. Toxicology in Vitro 7, 35-39.

Hatao M., Hariya T., Katsumura Y. and Kato S. (1995) A modification of the local lymph node assay for contact allergenicity screening: measurement of interleukin-2 as an alternative to radioisotope-dependent proliferation assay. Toxicology 98, 15-22.

Hayashi T., Itagaki H., Fukuda T., Tamura U. and Kato S. (1994) Multivariate factorial analysis of data obtained in seven in vitro test systems for predicting eye irritancy. Toxicology in Vitro 8, 215-220.

Liebsch M., Spielmann H., Balls M., Brandt M., Döring B., Dupuis J., Holtzhütter H. G., Klecak G., L’Eplattenier H., Lovell W., Maurer T., Moldenhauer F., Moore L., Pape W., Pfannenbecker U., Potthast J., De Silva O., Steiling W. and Willshaw A. (1994) First results of the EC/COLIPA validation project ‘in vitro phototoxicity testing’. In Alternative Methods in Toxicology. Vol. 10 (B6), pp. 243-251. Mary Ann Liebert, New York.

O’Brien K. A. F., Basketter D. A., Jones P. A. and Dixit M. (1994) An in vitro study of the eye irritation potential of new shampoo formulations. Toxicology in Vitro 8, 257-261.

Ohno T., Itagaki H., Tanaka N. and Ono H. and the Working Group for the 1st Validation Study on Cytotoxicity. (1995) Validation study on five different cytotoxicity assays in Japan: an intermediate report. Toxicology in Vitro 9, 571-576.

Ohno Y., Kaneko T., Kobayashi T., Inoue T., Kuroiwa Y., Yoshida T., Momma J., Hayashi M., Akiyama J., Atsumi T., Chiba K., Endo T., Fujii A., Kakishima H., Kojima H., Takano K. and Takanaka A. (1994) First-phase validation of the in vitro eye irritation tests for cosmetic ingredients. In Vitro Toxicology 7, 89-94.

Roguet R., Cohen C., Dossou K. G. and Rougier A. (1994a) Episkin: a reconstituted human epidermis for assessing in vitro the irritancy of topically applied compounds. Toxicology in Vitro 8, 283-291.

Roguet R., Régnier M., Cohen C., Dossou K. G. and Rougier A. (1994b) The use of in vitro reconstituted human skin in dermatotoxicity testing. Toxicology in Vitro 8, 635-639.

Rougier A., Cottin M., De Silva O., Roguet R., Catroux P., Toufic A. and Dossou K. G. (1992) In vitro methods: their relevance and complementarity in ocular safety assessment. Lens and Eye Toxicity Research 9, 229-245.

Shahin M. M. (1989) The importance of analyzing structure-activity relationships in mutagenicity studies. Mutation Research 221, 165-180.

Snedecor G. W. and Cochran W. G. (1980) Statistical Methods. 7th Ed. Iowa State University Press, Ames, IA.

Spielmann H., Balls M., Brandt M., Döring B., Holtzhütter H. G., Kalweit S., Klecak G., L’Eplattenier H., Liebsch M., Lovell W. W., Maurer T., Moldenhauer F., Moore L., Pape W. J. W., Pfannenbecker U., Potthast J., De Silva O., Steiling W. and Willshaw S. (1994) EEC/COLIPA project on in vitro phototoxicity testing: first results obtained with the Balb/c 3T3 phototoxicity assay. Toxicology in Vitro 8, 793-796.

Sugiyama M., Itagaki H. and Kato S. (1994) Photohemolysis test and yeast growth inhibition assay to assess phototoxic potential of chemicals. In Alternative Methods in Toxicology. Vol. 10 (B3), pp. 213-221. Mary Ann Liebert, New York.

Talsma K. M., Leach C. L., Hatoum N. S., Gibbons R. D., Roger J. C. and Garvin P. J. (1988) Reducing the number of rabbits in the Draize eye irritancy test: a statistical analysis of 155 studies conducted over 6 years. Fundamental and Applied Toxicology 10, 146-153.

Weil C. A. and Scala R. A. (1971) Study of intralaboratory and inter-laboratory variability in the results of rabbit eye and skin irritation tests. Toxicology and Applied Pharmacology 19, 276-360.