Toxicology in Vitro 16 (2002) 557–572 www.elsevier.com/locate/toxinvit
Predictive ability of reconstructed human epidermis equivalents for the assessment of skin irritation of cosmetics§
C. Faller a,*, M. Bracher a, N. Dami b, R. Roguet b
a Cosmital SA (Research Company of Wella AG, Germany), Rte de Chésalles 21, CH-1723 Marly, Switzerland
b L’Oréal, Life Sciences Research, Centre C. Zviak, rue du Général Roguet 90, 92583 Clichy Cedex, France
Accepted 22 February 2002
Abstract
The aim of this study was to examine the concordance between human in vivo and in vitro skin irritation classifications of cosmetic products and to evaluate the correlations between the different parameters. For that purpose, 22 formulations from product development test series, covering the full range of in vivo scores and representing different cosmetic product classes, were tested in vivo (modified Frosch-Kligman Soap Chamber Patch Test with repetitive occlusive application) and in vitro using two epidermis equivalents commercially available as kits (EpiDerm™ and EPISKIN™) and one in-house model (Cosmital). In vivo, skin reactions (erythema, dryness and fissures) were visually evaluated and, in addition, skin redness and transepidermal water loss (TEWL) were measured instrumentally. The parameters measured in vitro were the percent cell viability in the MTT reduction assay, with ET50 determination, and the extracellular release of the pro-inflammatory mediator IL-1α and of the cytosolic enzyme lactate dehydrogenase (LDH) into the culture medium collected after topical application of the products for different exposure times (time-course assay). In general, good Spearman rank correlations were observed between the different in vivo parameters (with the exception of TEWL and dryness at day 2). Furthermore, high correlation coefficients were obtained when comparing the different in vitro parameters (except for LDH release) and the different models, which indicates that the results obtained with the different reconstructed epidermis models were very similar. A comparison between in vivo and in vitro parameters gave the best rank correlation for ET50, followed in decreasing order by the percent MTT viability at 16 h, the IL-1α release and finally LDH release, for which the correlation was generally low. A direct comparison of the mean total scores (sum of erythema, dryness and fissures at day 5) of the 22 products with the best predictor, ET50 obtained with the three reconstructed epidermis models, using simple linear regression analysis resulted in a coefficient of correlation R=0.94 for EpiDerm, R=0.90 for Cosmital and R=0.84 for EPISKIN. Multivariate descriptive statistics showed that the in vitro parameters, MTT viability evaluated after the 16-h exposure and ET50, as well as the in vivo parameters, sum of visual scores at day 5 and chromameter value, were the best endpoints to discriminate between irritant and non-irritant products. Using the in vivo mean total scores at day 5 with a cut-off value at 2 and the in vitro percent MTT viability after the 16-h exposure with a cut-off value at 50% to classify the products, the same two-by-two contingency table was obtained for all three reconstructed epidermis models, with sensitivity=92%, specificity=100% and observed concordance=95% (κ=0.91; 95% confidence interval 0.74–1.08). This classification system was a satisfactory and relevant approach to discriminate the “irritant” from the “non-irritant” cosmetic products in this study. In conclusion, this study demonstrated the usefulness of reconstructed human epidermis equivalents for the in vitro assessment of the irritation potential of a series of cosmetic products. These models allow the measurement of quantifiable and objective endpoints relevant to in vivo irritative phenomena. © 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Alternative methods; EpiDerm™; EPISKIN™; In vitro testing; In vivo/in vitro comparison (concordance); Reconstructed human epidermis models (equivalents); Skin irritation
§ This study is part of the European project SMT4-CT 97-2174: “Testing and improvement of reconstructed skin kits in order to elaborate European standards”.
Abbreviations: ET50, effective time of exposure required to reduce the viability of treated cultures to 50% of controls; IL, interleukin; LDH, lactate dehydrogenase; MTT, 3-[4,5-dimethylthiazol-2-yl]-2,5-diphenyltetrazolium bromide; PCA, principal component analysis; PLS, partial least squares; SLS, sodium lauryl sulphate; TEWL, transepidermal water loss.
* Corresponding author. Tel.: +41-26-4352555; fax: +41-264352666. E-mail address: [email protected] (C. Faller).
0887-2333/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0887-2333(02)00053-X
C. Faller et al. / Toxicology in Vitro 16 (2002) 557–572
1. Introduction
During the development of new cosmetic formulations, the skin irritation potential is investigated in order to identify chemicals which might induce adverse skin reactions. To replace the animal systems used in the Draize dermal irritation test for product safety testing and to avoid exposing human volunteers to potentially irritant products in human clinical testing, various alternatives have been developed, such as monolayer keratinocyte cultures, explant cultures (skin organ cultures) and, most importantly, in vitro reconstructed human skin equivalents. These three-dimensional skin models are generated by growing keratinocyte cultures at the air–liquid interface on de-epidermized dermis, or on acellular or fibroblast-populated dermal substrates such as inert filters or collagen matrix (Prunieras et al., 1979; Bell et al., 1981; Noser and Limat, 1987; Rosdy and Clauss, 1990; Tinois et al., 1991; Cannon et al., 1994; Prunieras, 1994; Wahlen et al., 1994; Limat and Hunziker, 1997). The cultures exhibit a well-stratified and cornified epidermis, with basal, spinous and granular layers along with a functional stratum corneum, mimicking the architecture of normal human skin and allowing the direct topical application of finished products, although some deviations in tissue homeostasis and barrier properties in comparison to normal human epidermis still need to be optimized (Boelsma et al., 2000; Ponec et al., 2000, in press).
Various test protocols using these models for acute cutaneous toxicity screening have been proposed and evaluated (Bell et al., 1991; Osborne and Perkins, 1991; Gay et al., 1992; Slivka and Zeigler, 1993; Müller-Decker et al., 1994; Ponec, 1994; Roguet et al., 1994a,b; Ponec and Kempenaar, 1995; Botham et al., 1998; Demetrulias et al., 1998; de Brugerolle de Fraissinette et al., 1999; Roguet, 1999; van de Sandt et al., 1999), but their acceptance as valid in vitro alternatives to in vivo studies depends on their reliability and relevance to the in vivo situation. Thus, a prevalidation study on alternative methods for skin irritation testing of chemicals was funded by ECVAM (European Centre for the Validation of Alternative Methods) during 1999 and 2000 (Fentem et al., 2001).
The aim of our study was to examine the concordance between human in vivo and in vitro skin irritation classifications of cosmetic products, to evaluate the correlations between the different parameters and, if possible, to develop a prediction model based on the combination of the in vitro parameters recognized as the most predictive of in vivo skin irritancy potential. The evaluation of the reproducibility of data obtained from in vitro irritation testing using reconstructed human epidermis models commercially available as kits (EpiDerm™, EPISKIN™ and SkinEthic®), and one “in-house” model developed at Cosmital, was performed as the first part of the European project SMT4-CT 97-2174. Based on the reproducibility results obtained in that phase of the project (Roguet et al., 2001; Faller and Bracher, in press) and for time and cost reasons, it was decided to evaluate the concordance between in vivo and in vitro data using only two out of the three industrial models. Twenty-two formulations from product development test series, covering the full range of in vivo scores and representing different cosmetic product classes, were tested in vivo (modified Frosch-Kligman Soap Chamber Patch Test with repetitive occlusive application) and in vitro (EpiDerm, EPISKIN and the in-house epidermis equivalent of Cosmital).
A common in vitro protocol was established for the measurement of cytotoxicity in the MTT reduction assay and of the extracellular release of the pro-inflammatory mediator IL-1α and the cytosolic enzyme LDH into the assay medium, after a range of exposure times to the test products (time-course assay). The MTT assay is a colorimetric cell viability determination, based on the reduction of the yellow tetrazolium salt, MTT, to a blue formazan dye by various dehydrogenase enzymes in active mitochondria. The parameters measured were the percent MTT cell viability relative to the corresponding negative control and the ET50, which is the effective time of exposure required to decrease the MTT reduction capacity of treated cultures to 50% of the negative control, as determined from the MTT cytotoxicity curve, as well as the amount of the cytokine IL-1α and of the enzyme LDH released into the culture media collected at the end of the different exposure periods.
A statistical analysis of the in vivo and in vitro experimental data was carried out using simple linear regression, univariate and multivariate descriptive statistics, a multivariate predictive method [partial least squares (PLS) regression] and contingency tables for comparing the in vivo and in vitro classifications, in order to evaluate the agreement between in vivo and in vitro, and to find out which of the in vitro parameters were the most predictive of the in vivo human skin irritancy potential.
2. Materials and methods
2.1. Human reconstructed epidermis equivalents
(1) EpiDerm™ (Cannon et al., 1994) from MatTek Corporation, Ashland (MA, USA), EPI-200-HCF (HCF indicating hydrocortisone-free cultures and media: hydrocortisone was omitted from the growth medium for the final 3 days of culture, from the agarose gels on which the tissues were packaged and stored, and from the assay medium). According to a general description provided on the data sheet from the MatTek Corporation, the EpiDerm skin model (area of 0.6 cm²) consisted of “normal, human derived epidermal keratinocytes (NHEK) which have been cultured to form a multilayered, highly differentiated model of the human epidermis”.
(2) EPISKIN™ (Tinois et al., 1991) from Episkin SNC (L’Oréal; Chaponost, France), consisted of “type I bovine collagen matrix, representing the dermis, surfaced with a film of type IV human collagen, upon which is laid after 13 days in culture stratified differentiated epidermis derived from second passage human keratinocytes” (culture area 1.1 cm²).
(3) Cosmital: in-house model developed at Wella/Cosmital (culture area 0.8 cm²), generated with interfollicular epidermal keratinocytes derived from human skin (first or second subculture) seeded in cell culture inserts (Snapwell™ from Costar) which carry postmitotic human fibroblasts on the under-surface of their microporous membrane (Noser and Limat, 1987; Limat and Hunziker, 1997). The air-lifted epidermis equivalents were cultured for 14 days in six-well plates, the first week with FAD2 medium (DMEM/Ham’s F12, 3:1) and the second week with keratinocyte growth medium (serum-free KGM from Clonetics).
2.2. Test materials and reagents
Sodium lauryl sulphate (SLS, from Sigma Ref. L-4509, batch 57H1242) was dissolved in water and used at a concentration of 1% as positive control. Assay medium and maintenance medium for the industrial models were supplied by the kit manufacturers. Phosphate buffered saline (if not provided with the kits) was obtained from BioConcept (3–05K00-I) and MTT from Sigma (Ref. M-2128). The 22 test materials selected for in vivo and in vitro testing are described in Table 1.
2.3. In vivo protocol
The human skin irritation potential was evaluated at proDERM, Institute for Applied Dermatological Research GmbH, Schenefeld/Hamburg, using the Modified Frosch-Kligman Soap Chamber Patch Test and according to the proDERM Standard Protocol 01.03.
Briefly, 50 µl of the test material were applied four times (the first time for 24 h, followed by three applications of 6 h each on the next 3 days) under occlusive conditions (epicutaneous patch test system; 11 mm Ø filter paper pads) to the forearm of a minimum of 20 human panelists. Skin reactions were visually evaluated at days 2, 3, 4 (before re-application of the test material) and at day 5. Erythema and dryness scores ranged from 0 to 4, those of fissures from 0 to 3, resulting in a possible maximum total score of 11. In addition, measurements of skin redness (Chromameter CR 300, Minolta) and transepidermal water loss (dual probe Tewameter TM 210 and TM 215, Courage and Khazaka, Cologne) were performed before the first application and at the end of the study (day 5). The chromameter values were reported as differences from the pretreatment values and TEWL as the measurement at day 5.
2.4. In vitro protocol
A common protocol, with some adaptations for each model due to the particular shape and size of the different reconstructed epidermis kits, was established based on the measurement of the cytotoxicity in the MTT reduction assay and of the extracellular release of the pro-inflammatory mediator IL-1α and of the cytosolic enzyme LDH, after a range of exposure times to the test products (time-course assay). A first time-range-finding assay was performed with all test materials with exposure times of 1, 4, 7 and 16 h using one single culture per time point. Negative controls (without any treatment, exposure times 4 and 16 h) and a positive control (1% SLS, exposure times 1, 2 and 4 h) were run in parallel in an identical way to the dosed cultures. An MTT test was considered acceptable if the SLS ET50 value was greater than 2 h (i.e. between 2 and 4 h) for EpiDerm, and greater than 1 h (i.e. between 1 and 2 h) for EPISKIN and Cosmital. The non-treated controls were set to represent 100% viability and used as reference.
Based on the results of the time-range-finding assay, the exposure times of the definitive tests, carried out in duplicate using three different batches of reconstructed epidermis kits, were selected for each test product among the exposure times of 0.5, 1, 2, 4, 7 and 16 h in order to produce a time–effect curve and to determine the ET50 value more accurately from the graph. The 16-h exposure time was always included in order to have a common time point for all the products. Negative and positive controls were included as described above. At the end of the incubation, in addition to the MTT reduction test on the tissue samples, the test media underneath were collected to determine the release of the cytokine IL-1α and of LDH.
2.5. Preparation of the skin cultures
Upon receipt of the industrial models, the tissue kits and the solutions were stored according to the manufacturer’s directions for unpacking and storage.
EpiDerm: the sealed 24-well plates containing the cultures and the assay medium were stored in the refrigerator (4–8 °C) overnight. Before starting the experiment, 0.9 ml of assay medium was dispensed into each well of six-well plates, the EpiDerm tissue cultures were transferred into the wells and the plates were preincubated at 37 °C and 5% CO2 for at least 1 h. Just prior to dosing, the assay medium was renewed.
EPISKIN: the tissues were transferred into 12-well assay plates filled with 2 ml of maintenance medium per well and incubated at 37 °C and 5% CO2 overnight. Prior to dosing, the maintenance medium was replaced by assay medium.
Cosmital: the epidermis equivalents were cultivated at 37 °C and 5% CO2 in six-well plates for 2 weeks, and prior to dosing, the culture medium KGM (2.5 ml per well) was renewed.
Table 1
Test formulations for in vivo/in vitro comparison

Product category | Designation and content (if known) | Code
Surfactant-based simple formulations | Base 2 + Benzalkonium chloride 3 mg/g (GAL/06/01/99) | 2B3
Surfactant-based simple formulations | Base 2 + Benzalkonium chloride 10 mg/g (GAL/07/01/99) | 2B10
Surfactant-based simple formulations | Base 2 + SLS 10 mg/g (GAL/12/01/99) | 2S10
Surfactant-based simple formulations | Base 2 + SLS 30 mg/g (GAL/13/01/99) | 2S30
Surfactant-based simple formulations | Base 2: Unguentum hydrophilicum PM Ill (GAL/05/01/99) | 2B0
Surfactant-based simple formulations | Base 1: Propylenglycol + Glycerol (GAL/01/01/99) | IB0
Shampoos tested neat | Muster ORIEN 20 B FTP 19.01.99 | ORIEN 20 B
Shampoos tested neat | Muster ORIEN 20 C FTP 19.01.99 | ORIEN 20 C
Shampoos tested neat | Muster ORIEN 20 G FTP 19.01.99 | ORIEN 20 G
Shampoos tested at 8% | EWR. 8 53 04 128A 20. JANUAR 1999 | EWR. 85304128A
Shampoos tested at 8% | EWR. 8 53 040 135A 20. JANUAR 1999 | EWR. 853040135A
Mascaras | M#33 | M#33
Mascaras | M#34 | M#34
Emulsions | E#5 | E#5
Emulsions | E#21 | E#21
Gels | G#3 | G#3
Gels | G#16 | G#16
Oils | O#10 | O#10
Oils | O#24 | O#24
Creams | 00 0843 002501 3 BS GH DIV 25 RÜCK 1 05.02.99 | BS GH DIV 25
Creams | 00 0843 0029 011 BS GH DIV 29 RÜCK 1 05.02.99 | BS GH DIV 29
Water | H2O | H2O

Base 1, propylenglycol + glycerol, GAL/01/01/99:
Propylenglycol (Ondal, art. 21848 A010/2D) | 376 g
Glycerol 86% (Ondal, art. 21694 A010/2D) | 124 g

Base 2, unguentum hydrophilicum PM Ill, GAL/05/01/99:
Cetylane Ph.H.VII (lanette N, Hänseler lot 402112) | 5.50 g
Arachidis Oleum Hydrogenatum Ph.H.VII (Hänseler lot 60713) | 33.00 g
Propylenglycol (Ondal, art. 21848 A010/2D) | 11.00 g
Aqua dist., sterile | 60.50 g
2.6. Dosing protocol
Twenty-six µl (or mg) of test products were applied onto a filter paper (FINN CHAMBER, 8 mm diameter), with the exception of EPISKIN, where 50 µl (or mg) were applied on larger filters (11 mm diameter). The filters were then placed inverted into the tissue insert atop the epidermis. Negative and positive controls were run in parallel in an identical way to the dosed cultures. The cultures were incubated at 37 °C and 5% CO2 for the desired exposure periods. At the end of the incubation, the different test media were collected in polypropylene cryotubes, immediately frozen in aliquots and stored at −70 °C for further analysis of soluble pro-inflammatory mediators or cellular enzymes, and the MTT test was carried out with the tissue cultures.
2.7. MTT assay (Mosmann, 1983)
A fresh MTT solution was prepared for each testing day by dissolving 2 mg MTT per ml in the appropriate assay medium. Two ml per well of the MTT solution were pipetted into 12-well plates for EPISKIN, 300 µl per well into 24-well plates for EpiDerm cultures and 2.5 ml per well into six-well plates for the Cosmital model.
At the end of the exposure period, the tissue cultures were removed from the assay plates and gently rinsed with phosphate buffered saline to eliminate the filter paper and any residual test material. Excess liquid was shaken off prior to placing the tissue samples into the plates containing the MTT solution. After incubation at 37 °C for 2 h, the inserts were removed from the MTT plates and gently blotted on tissue paper before either completely submerging them in 2 ml of extraction solution (isopropanol) per well, for EpiDerm, or punching a biopsy (10 mm diameter for EPISKIN and 8 mm diameter for Cosmital) and transferring the biopsies into a 12-well extraction plate filled with 2.6 ml of extraction solution per well for EPISKIN or into a 24-well plate filled with 2 ml of the extraction solution per well for Cosmital. The extraction plates were sealed with Parafilm to reduce evaporation and gently shaken for at least 2 h at room temperature to extract the reduced MTT. At the end of the extraction period, the solution within each EpiDerm insert was pooled with the solution underneath the insert in the well. For all the models, the extraction solution was pipetted up and down to ensure complete mixing, and finally, 200 µl were transferred into a 96-well microtiter plate for absorbance measurement (OD=optical density) at 570 nm using 200 µl of isopropanol as a
blank. After subtracting the blank OD from all the raw absorbance data, the mean OD and standard deviation were calculated for each test condition. The mean OD of the negative control (non-treated cultures) was set to represent 100% viability. The result at each exposure time point was expressed as percent viability relative to the negative control ± standard deviation. The cytotoxicity data were plotted as viability vs exposure time, and the ET50 value, which is the effective time of exposure required to decrease the MTT reduction capacity (viability) of treated cultures to 50% of the negative control, was determined for each batch from the exposure time–response curves by linear interpolation.
2.8. Determination of IL-1α and LDH
The interleukin IL-1α present in the assay medium was measured by means of a commercially available enzyme-linked immunosorbent assay kit (ELISA kit from Endogen, EH2-IL1A-10), following the instructions for use provided by the manufacturer. The detection limit of the assay was 2 pg/ml. The amounts of IL-1α released into the assay medium underneath the culture inserts, expressed in pg per ml, were multiplied by the total number of ml of assay medium for each model (0.9 ml for EpiDerm, 2 ml for EPISKIN and 2.5 ml for Cosmital) and reported as pg per culture. The activity of the cytosolic enzyme LDH, based on the conversion of pyruvate to lactate, was determined using an automatic apparatus for medical analysis (COBAS INTEGRA) and reported in units per liter (U/l). The determination of LDH release was not possible for EPISKIN because the EPISKIN assay medium produced high background values.
2.9. Statistics
A partial evaluation of some of the experimental data was carried out using simple linear regression analysis. A more sophisticated statistical analysis of the in vivo and in vitro experimental data was performed by Nadia Dami (L’Oréal), using four statistical methods:
1. Univariate descriptive statistics: Spearman rank correlations were used to determine the correlation coefficients (rho) between each pair of the different parameters (for a good correlation ρ should be > 0.8).
2. Contingency tables for the estimation of the κ coefficient were used to evaluate the agreement between the in vivo and in vitro classifications.
3. Multivariate descriptive statistics: The principal component analysis (PCA), a multivariate projection method (Jackson, 1991), was used to evaluate the underlying dimensionality (complexity) of the data, to get an overview of the dominant patterns and major trends and to show the correlation structure in the data and the possible existence of subgroups among the 22 test products.
4. Multivariate predictive method: The PLS regression was used to connect the in vivo and in vitro data by a mathematical model in order to establish a prediction model (algorithm) based on the best predictors (Höskuldsson, 1996).
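As an illustration of the MTT data reduction described in Section 2.7, the following Python sketch computes percent viability from OD readings and reads the ET50 off the time–viability curve by linear interpolation. It is a minimal sketch and not the authors' procedure: the function names, the example readings and the choice to anchor the curve at 100% viability at time zero are assumptions made here, and returning 17 when viability never falls to 50% simply mirrors the reporting convention used in the legend of Table 2.

```python
def percent_viability(od_treated, od_negative_control, od_blank):
    """Blank-corrected OD of a treated culture as percent of the negative control."""
    return 100.0 * (od_treated - od_blank) / (od_negative_control - od_blank)


def et50(times_h, viability_pct, censored_value=17.0):
    """Exposure time (h) at which viability falls to 50%, by linear interpolation.

    times_h and viability_pct are paired observations from one time-course assay.
    If viability never reaches 50%, censored_value is returned instead.
    """
    # Assumption of this sketch: untreated tissue defines 100% viability at t = 0.
    points = [(0.0, 100.0)] + sorted(zip(times_h, viability_pct))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if v0 > 50.0 >= v1:
            # Straight line between the two points that bracket 50% viability.
            return t0 + (v0 - 50.0) * (t1 - t0) / (v0 - v1)
    return censored_value


# Hypothetical time course (not data from the study): exposure times in hours
# and the corresponding percent viabilities.
example_et50 = et50([1, 4, 7, 16], [95.0, 80.0, 40.0, 5.0])  # 6.25 h
```

In this hypothetical example the curve crosses 50% between the 4-h point (80%) and the 7-h point (40%), so the interpolated ET50 is 4 + 30 × 3/40 = 6.25 h.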
3. Results and discussion
In this study, 22 cosmetic formulations were tested in vivo (modified Frosch-Kligman Soap Chamber Patch Test with repetitive occlusive application) and in vitro (MTT time-course assay on EpiDerm, EPISKIN and the in-house epidermis equivalent of Cosmital). In vivo, skin reactions (erythema, dryness and fissures) were visually evaluated at days 2, 3, 4 (before re-application of the test material) and at day 5. In addition, measurements of skin redness and TEWL were performed before the first application and at the end of the study (day 5). The parameters determined in vitro were the percent MTT cell viability at the end of each exposure time and the ET50, derived from the cytotoxicity curves of percent MTT viability vs exposure times, as well as the amount of the cytokine IL-1α and of the enzyme LDH released into the culture media collected at the end of each exposure period. The times of exposure were selected for each formulation on the basis of the results obtained in the first time-range-finding assay, but always included 16 h as a common time point for all the formulations. Because of the large amount of experimental data, only a selection was included in the statistical analysis: in vitro, the percent MTT viability and the amount of IL-1α and LDH (LDH only for EpiDerm and Cosmital; not possible for EPISKIN, see Materials and methods) released after the 16-h exposure (the only common time point available for all the test formulations), as well as the ET50; and in vivo, the data obtained at day 2 (i.e. 24 h after the first application of the test products) and at day 5 (i.e. at the end of the test), as well as the Chromameter and TEWL values (Table 2). To illustrate the relationship between cytotoxicity and release of cytokine or cytoplasmic enzyme, the experimental results obtained with one batch of EpiDerm are presented in Fig. 1 as an example.
The cytotoxicity determined as percent MTT viability, the extracellular release of LDH and the inflammatory response evaluated as the release of IL-1α are presented as a function of the time of exposure to one of the test products (means of duplicates with standard deviations indicated as error bars). Fig. 1 shows that the enzyme LDH and the interleukin IL-1α were released regularly
Table 2
Overview of the in vivo and in vitro experimental data selected for statistical analysis (arranged in decreasing order of Sd5)

In vivo
Product | Ed2 | Dd2 | Sd2 | Ed5 | Dd5 | Fd5 | Sd5 | Chrom | TEWL
Orien20B | 2.35 | 0.83 | 3.18 | 4.00 | 4.00 | 3.00 | 11.00 | 7.19 | 35.52
Orien20G | 2.10 | 0.63 | 2.73 | 3.95 | 3.85 | 2.85 | 10.65 | 7.96 | 37.15
E21 | 0.78 | 0.50 | 1.28 | 3.30 | 3.40 | 2.40 | 9.10 | 3.97 | 12.29
Orien20C | 0.65 | 0.65 | 1.30 | 3.48 | 3.10 | 2.30 | 8.88 | 7.08 | 27.63
2B10 | 1.69 | 0.02 | 1.71 | 3.10 | 2.55 | 2.12 | 7.77 | 5.26 | 12.06
2S30 | 0.67 | 0.00 | 0.67 | 2.83 | 1.81 | 1.71 | 6.35 | 8.57 | 59.43
O10 | 0.18 | 0.05 | 0.23 | 2.25 | 2.35 | 1.35 | 5.95 | 2.88 | 17.95
M33 | 0.38 | 0.60 | 0.98 | 1.98 | 2.05 | 1.05 | 5.08 | 2.31 | 13.05
G3 | 1.20 | 0.45 | 1.65 | 1.58 | 1.85 | 0.90 | 4.33 | 3.00 | 12.93
2S10 | 0.26 | 0.02 | 0.28 | 1.64 | 0.95 | 0.50 | 3.09 | 5.21 | 39.53
EWR128 | 0.35 | 0.00 | 0.35 | 1.25 | 0.98 | 0.20 | 2.43 | 3.30 | 24.09
M34 | 0.38 | 0.40 | 0.78 | 0.83 | 1.20 | 0.15 | 2.18 | 1.23 | 12.88
E5 | 0.03 | 0.10 | 0.13 | 0.33 | 0.85 | 0.00 | 1.18 | 0.44 | 12.94
H2O | 0.18 | 0.13 | 0.31 | 0.17 | 0.40 | 0.10 | 0.67 | 0.11 | 9.46
G16 | 0.08 | 0.15 | 0.23 | 0.03 | 0.60 | 0.00 | 0.63 | 0.45 | 8.40
IB0 | 0.05 | 0.02 | 0.07 | 0.18 | 0.30 | 0.14 | 0.62 | 0.48 | 10.81
EWR135 | 0.33 | 0.00 | 0.33 | 0.18 | 0.10 | 0.00 | 0.28 | 1.02 | 12.53
BSGH29 | 0.11 | 0.11 | 0.22 | 0.09 | 0.11 | 0.00 | 0.20 | 0.31 | 8.53
2B3 | 0.10 | 0.00 | 0.10 | 0.02 | 0.14 | 0.00 | 0.16 | 0.34 | 8.38
2B0 | 0.10 | 0.00 | 0.10 | 0.10 | 0.00 | 0.00 | 0.10 | 0.40 | 8.16
O24 | 0.00 | 0.05 | 0.05 | 0.05 | 0.00 | 0.00 | 0.05 | 0.41 | 10.82
BSGH25 | 0.02 | 0.09 | 0.11 | 0.02 | 0.00 | 0.00 | 0.02 | 0.11 | 10.45

In vitro
Product | ET50C | ET50ED | ET50ES | ViabC | ViabED | ViabES | IL1aC | IL1aED | IL1aES | LDHC | LDHED
Orien20B | 2.0 | 2.1 | 0.7 | 0.91 | 2.68 | 1.32 | 191.56 | 153.95 | 179.94 | 2.33 | 3.00
Orien20G | 2.0 | 2.3 | 0.8 | 0.11 | 4.97 | 2.29 | 249.07 | 205.54 | 277.20 | 140.33 | 190.83
E21 | 10.6 | 4.2 | 10.5 | 14.72 | 15.52 | 32.67 | 15.61 | 33.17 | 42.23 | 243.00 | 293.00
Orien20C | 2.2 | 3.2 | 0.9 | 1.36 | 7.43 | 1.13 | 264.17 | 292.03 | 356.27 | 2.17 | 13.17
2B10 | 4.2 | 4.0 | 6.2 | 1.29 | 8.30 | 9.15 | 166.68 | 163.58 | 135.78 | 334.17 | 393.83
2S30 | 3.9 | 3.3 | 9.8 | 2.98 | 10.10 | 32.64 | 166.60 | 157.66 | 144.28 | 216.50 | 207.83
O10 | 9.7 | 7.8 | 5.8 | 6.42 | 9.90 | 3.08 | 100.58 | 308.72 | 295.92 | 41.33 | 582.67
M33 | 10.9 | 13.7 | 14.3 | 20.98 | 40.19 | 43.55 | 29.99 | 64.82 | 151.29 | 88.33 | 140.50
G3 | 4.9 | 5.6 | 3.0 | 1.07 | 8.21 | 7.02 | 89.30 | 106.50 | 142.38 | 48.50 | 0.33
2S10 | 8.0 | 14.3 | 11.6 | 18.40 | 48.27 | 42.89 | 84.89 | 81.42 | 139.18 | 243.33 | 310.17
EWR128 | 10.7 | 10.7 | 4.6 | 8.10 | 21.92 | 12.45 | 220.05 | 357.59 | 388.95 | 728.50 | 835.67
M34 | 17.0 | 17.0 | 17.0 | 95.50 | 94.92 | 103.79 | 3.24 | 17.73 | 23.76 | 15.00 | 28.50
E5 | 17.0 | 17.0 | 17.0 | 93.02 | 92.40 | 113.51 | 4.31 | 10.22 | 3.30 | 13.67 | 38.33
H2O | 17.0 | 17.0 | 17.0 | 95.27 | 84.78 | 95.85 | 3.17 | 12.00 | 3.41 | 17.33 | 30.31
G16 | 17.0 | 17.0 | 17.0 | 85.42 | 89.18 | 99.42 | 0.37 | 10.33 | 6.55 | 6.00 | 21.17
IB0 | 17.0 | 17.0 | 17.0 | 82.46 | 93.29 | 89.55 | 12.46 | 43.90 | 49.75 | 10.25 | 31.00
EWR135 | 17.0 | 17.0 | 14.8 | 77.13 | 87.40 | 50.91 | 39.01 | 52.79 | 56.31 | 69.83 | 49.00
BSGH29 | 17.0 | 17.0 | 17.0 | 83.84 | 88.73 | 97.66 | 0.00 | 20.75 | 6.04 | 11.50 | 15.83
2B3 | 17.0 | 17.0 | 17.0 | 77.49 | 81.00 | 85.26 | 2.31 | 13.05 | 5.60 | 8.33 | 18.50
2B0 | 17.0 | 17.0 | 17.0 | 85.91 | 87.02 | 95.55 | 2.87 | 9.72 | 3.17 | 11.20 | 25.60
O24 | 17.0 | 17.0 | 17.0 | 103.87 | 91.95 | 99.72 | 2.94 | 7.69 | 4.88 | 13.00 | 30.00
BSGH25 | 16.6 | 17.0 | 17.0 | 64.61 | 82.11 | 94.76 | 12.85 | 10.29 | 2.54 | 10.50 | 10.67

Abbreviations: In vivo data (means of 20–22 panelists): Ed2=visual score of erythema at day 2 (0–4); Dd2=visual score of dryness at day 2 (0–4); Fd2 not indicated as no fissure was observed at day 2; Sd2=sum of the visual scores at day 2 (0–11); Ed5=visual score of erythema at day 5 (0–4); Dd5=visual score of dryness at day 5 (0–4); Fd5=visual score of fissures at day 5 (0–3); Sd5=sum of the visual scores at day 5 (0–11); Chrom=chromameter value (difference to the pretreatment value); TEWL=transepidermal water loss value at day 5. In vitro data (means of 3 tests): ET50=ET50 values [h]; ET50 > 16 h are reported as 17; C=Cosmital; ED=EpiDerm™; ES=EPISKIN™; Viab=percent MTT viability after the 16-h exposure to the test material; IL1a=IL-1α release after the 16-h exposure to the test material [pg per culture]; LDH=LDH release after the 16-h exposure to the test material [U/l].
Fig. 1. Time-dependent response of EpiDerm™ tissue cultures (one batch as an example) after topical application of the test formulation 2B10: cell viability determined as MTT reduction capacity, release of the pro-inflammatory mediator IL-1α and leakage of the cytoplasmic enzyme LDH. Each value represents the mean ± SD of two replicate cultures. *Volume of assay medium per culture.
into the assay medium in the well below the culture inserts, in parallel with the decrease of viability in the MTT assay. However, amounts clearly different from the negative control could be detected only at exposure times showing a strongly decreased viability (< 50%). Furthermore, the IL-1α and LDH release curves of the whole series of test formulations make it clear that the IL-1α and LDH release after the 16-h treatment with the test material, which was chosen as a common time point for all the products, was not always appropriate for comparing the products. In some cases, the amounts of these two markers were lower at that late time point than at earlier times of exposure, owing to a high degree of cytotoxicity or to a possible interaction of the test material with the cytokine or the enzyme. A shorter time of exposure, for example 7 h, would probably have been better for comparison; 4 h, however, would be too early to detect a significant amount of IL-1α or LDH, even in the presence of irritant products.
With the main aim of evaluating the predictive ability of the in vitro test systems, the experimental in vivo and in vitro data were then submitted to statistical analyses, starting with a simple approach using direct linear regression analysis for comparing only two parameters, and then undertaking a more extensive analysis on the data taken as a whole (univariate and multivariate descriptive statistics, contingency tables, PLS regression).
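The contingency-table comparison mentioned above can be illustrated with the cut-offs quoted in the abstract (mean total score at day 5 above 2 for “irritant” in vivo, 16-h MTT viability below 50% for “irritant” in vitro). The Python sketch below applies them to the Sd5 and viability columns of Table 2 and derives sensitivity, specificity, concordance and Cohen's κ. It is a sketch under stated assumptions rather than the authors' computation: the function names are invented here, the strictness of the inequalities at the cut-offs is assumed (no value in the table lies exactly on a cut-off), and the simple large-sample standard error used for the confidence interval is one common approximation that the paper does not specify.

```python
from math import sqrt

# Mean total visual score at day 5 (Sd5), 22 products in Table 2 order.
SD5 = [11.00, 10.65, 9.10, 8.88, 7.77, 6.35, 5.95, 5.08, 4.33, 3.09, 2.43,
       2.18, 1.18, 0.67, 0.63, 0.62, 0.28, 0.20, 0.16, 0.10, 0.05, 0.02]

# Percent MTT viability after the 16-h exposure, same product order.
VIABILITY_16H = {
    "Cosmital": [0.91, 0.11, 14.72, 1.36, 1.29, 2.98, 6.42, 20.98, 1.07, 18.40,
                 8.10, 95.50, 93.02, 95.27, 85.42, 82.46, 77.13, 83.84, 77.49,
                 85.91, 103.87, 64.61],
    "EpiDerm": [2.68, 4.97, 15.52, 7.43, 8.30, 10.10, 9.90, 40.19, 8.21, 48.27,
                21.92, 94.92, 92.40, 84.78, 89.18, 93.29, 87.40, 88.73, 81.00,
                87.02, 91.95, 82.11],
    "EPISKIN": [1.32, 2.29, 32.67, 1.13, 9.15, 32.64, 3.08, 43.55, 7.02, 42.89,
                12.45, 103.79, 113.51, 95.85, 99.42, 89.55, 50.91, 97.66, 85.26,
                95.55, 99.72, 94.76],
}


def contingency(sd5, viability, sd5_cutoff=2.0, viability_cutoff=50.0):
    """Return (tp, fn, fp, tn), taking the in vivo classification as reference."""
    tp = fn = fp = tn = 0
    for score, viab in zip(sd5, viability):
        irritant_in_vivo = score > sd5_cutoff       # assumed strict inequality
        irritant_in_vitro = viab < viability_cutoff  # assumed strict inequality
        if irritant_in_vivo and irritant_in_vitro:
            tp += 1
        elif irritant_in_vivo:
            fn += 1
        elif irritant_in_vitro:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def agreement(tp, fn, fp, tn):
    """Sensitivity, specificity, concordance, Cohen's kappa and an approximate 95% CI."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    # Simple large-sample standard error of kappa (an assumption of this sketch).
    se = sqrt(observed * (1 - observed) / (n * (1 - expected) ** 2))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "concordance": observed,
        "kappa": kappa,
        "ci95": (kappa - 1.96 * se, kappa + 1.96 * se),
    }


tables = {model: contingency(SD5, viab) for model, viab in VIABILITY_16H.items()}
summary = agreement(*tables["EpiDerm"])
```

With these data all three models give the same table (11 true positives, 1 false negative, 0 false positives, 10 true negatives), so sensitivity is 11/12, specificity 10/10, concordance 21/22 and κ about 0.91, in line with the figures in the abstract. The approximate interval from this sketch, roughly 0.74 to 1.08, is also close to the one quoted there; its upper bound exceeds 1 only because the normal approximation is not truncated.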
3.1. Simple linear regression analysis
To visualize the concordance between in vivo and in vitro irritation data, we first compared, as a preliminary and partial approach, the in vivo mean total scores (sum of erythema, dryness and fissures at day 5=Sd5) of the 22 products with the in vitro ET50 values of the three epidermis equivalents (Fig. 2a–c). These two parameters were selected because they are often used and provide a good overview of the experimental data on the in vivo and in vitro irritation potential of the test formulations: the mean total score at day 5 represents the sum of all the in vivo visual observations, and ET50 takes into account the in vitro cytotoxicity at all the experimental time points. Simple linear regression analysis allows the comparison of the predictive ability of the different epidermis models and resulted in the best correlation for EpiDerm (R=0.94), followed by Cosmital (R=0.90) and EPISKIN (R=0.84). From Fig. 2(a–c) and Table 2, it is obvious that the quality of the correlations was favoured by the “clustering” of the data points at the extremes of the irritation scale, especially by those situated at the non-irritating end. However, the analysis of the 12 data points that fall in the middle of the scale (i.e. products with 1 < Sd5 < 11) still results in a correlation coefficient of 0.87 for EpiDerm. Despite these observed restrictions, all the three epidermis models were able to discriminate
Fig. 2. Comparison of ET50 and clinical total score at day 5 of all the 22 test products: (a) for EpiDerm™, (b) for EPISKIN™, (c) for Cosmital; (d) comparison of percent MTT viability after 4 h of exposure and clinical total score at day 5 for nine irritant products, for EpiDerm™.
severe irritants from products with a good skin compatibility, the highest ET50 values (≥16 h) corresponding to the lowest in vivo scores for non-irritant products, and the lowest ET50 values to the highest in vivo scores for irritant ones. Clear dose–response relationships could be observed with the two types of surfactant-based formulations, containing different surfactant concentrations, either the cationic benzalkonium chloride (2B3 and 2B10) or the anionic SLS (2S10 and 2S30; see Experimental data in Table 2). As half the products had an ET50 greater than 16 h, and as they could not be further discriminated, they were classified as non-irritant. A further analysis of the remaining irritant products showed a relatively good linear correlation when the percent viability in MTT after the 4-h exposure time was compared directly with the in vivo score at day 5 (see Fig. 2d, where the results of EpiDerm are given as an example). Consequently, the equation of the linear regression line could be used as a prediction model. However, these results were obtained by using the data of nine products only (for which the experimental percent viability data at 4 h were available) and should therefore be considered with caution.

3.2. Spearman rank correlations

The direct comparison of two selected parameters using linear regression analysis as described above was too simple and incomplete a way of analysing the results. Therefore, the experimental data (Table 2) were submitted to a more extensive statistical analysis. First, the in vivo parameters were compared with the in vitro parameters of all three reconstructed epidermis models (Table 3); the Spearman rank correlation coefficients obtained were generally higher for ET50 than for the percent MTT viability after the 16-h exposure or for the IL-1a release. In contrast, the correlation between LDH release and the in vivo parameters was generally poor.
The best correlations were obtained between ET50 and the clinical scores evaluated at day 5 (erythema, dryness, fissures and their sum) or the chromameter value. There was only a low correlation between the in vitro parameters and either the visual score of dryness at day 2 or the TEWL. Although the Spearman rank correlation coefficients obtained by comparing the ET50 of EpiDerm with the different in vivo parameters were in almost all the cases higher than those obtained with EPISKIN or Cosmital, the results of the correlation analysis with the three different in vitro epidermis models were very close. The same analysis was performed by comparing only the in vitro data obtained with the three epidermis models (comparison vitro/vitro; Table 4). The good correlations between the different in vitro parameters (except for LDH release; correlation coefficient < 0.5)
as well as between the different models allowed us to conclude that the performance of the three different reconstructed epidermis models was very similar. Finally, good correlations could be observed in general between the different in vivo parameters (Table 5), with the exception of TEWL, whose coefficients of rank correlation with the other parameters were generally less than 0.8, and the visual score of dryness at day 2 (Dd2), which showed the weakest correlations (< 0.6) with the other parameters. One possible explanation for the poor performance of Dd2 may be that the evaluation of this parameter only 1 day after the first application of the test products is too early. The best correlations (> 0.9) were observed by comparing the sum of the visual scores at day 2 with the visual score of erythema at day 2, by comparing the different visual scores at day 5 among each other and with their sum at day 5, and by comparing the chromameter value with the visual score of erythema at day 5.

3.3. Comparison of in vivo/in vitro classifications

Another way of evaluating the relationship between in vivo and in vitro data was to classify the 22 cosmetic products into “irritant” and “non-irritant” and to compare the two classifications by constructing contingency tables and evaluating the agreement between the two classifications by means of the kappa (κ) coefficient. As no predefined classification system was applied in the in vivo protocol used in this study, the test products were only evaluated in comparison with reference materials run in parallel (benchmark), but not classified as skin irritant or non-irritant.
Nevertheless, based on the authors’ practical experience with many cosmetic formulations and the acceptability of such formulations in the market place, it was decided to use the in vivo mean total scores at day 5 with a cut-off value of 2 to classify the products (final score ≥ 2 = irritant; final score < 2 = non-irritant), which is probably relevant for one type of product such as care products, but may be too severe for others. In fact, this threshold value was chosen as a compromise, applicable on the one hand to all the test products of this study and on the other hand allowing a high security level for the human volunteers and the consumers in general. In a similar way, the percent MTT viability after the 16-h exposure and a cut-off value of 50% (viability at 16 h ≤ 50% = irritant; viability at 16 h > 50% = non-irritant) were used in vitro. The results of the in vivo/in vitro comparisons (Table 6) were identical for all three reconstructed epidermis equivalents and confirmed previous results with the EPISKIN model (Roguet et al., 1998). This classification system based on the percent MTT viability at the 16-h exposure time, which in practice is less time-consuming than the ET50 determination, was a relevant approach for categorizing the cosmetic products of this study as “irritant” (sensitivity=92%) and “non-irritant” (specificity=100%). The only in vitro false
Table 3
Spearman rank correlation analysis between the in vitro data obtained with the three reconstructed epidermis models and the in vivo data for the 22 products

          In vivo
In vitro  ED2    DD2    SD2    ED5    DD5    FD5    SD5    CHROM  TEWL
ET50C     0.79   0.36   0.77   0.85   0.81   0.89   0.85   0.90   0.76
          0.00   0.10   0.00   0.00   0.00   0.00   0.00   0.00   0.00
ET50ED    0.85   0.40   0.82   0.92   0.89   0.94   0.92   0.91   0.74
          0.00   0.06   0.00   0.00   0.00   0.00   0.00   0.00   0.00
ET50ES    0.83   0.34   0.81   0.87   0.82   0.86   0.85   0.87   0.76
          0.00   0.12   0.00   0.00   0.00   0.00   0.00   0.00   0.00
VIABC     0.78   0.26   0.74   0.77   0.75   0.81   0.77   0.81   0.64
          0.00   0.25   0.00   0.00   0.00   0.00   0.00   0.00   0.00
VIABED    0.77   0.33   0.74   0.75   0.75   0.81   0.77   0.81   0.57
          0.00   0.14   0.00   0.00   0.00   0.00   0.00   0.00   0.01
VIABES    0.75   0.23   0.69   0.78   0.74   0.83   0.76   0.78   0.63
          0.00   0.31   0.00   0.00   0.00   0.00   0.00   0.00   0.00
IL1AC     0.72   0.20   0.72   0.84   0.73   0.81   0.77   0.85   0.82
          0.00   0.38   0.00   0.00   0.00   0.00   0.00   0.00   0.00
IL1AED    0.72   0.11   0.67   0.77   0.73   0.77   0.75   0.78   0.72
          0.00   0.62   0.00   0.00   0.00   0.00   0.00   0.00   0.00
IL1AES    0.70   0.23   0.66   0.77   0.75   0.77   0.76   0.74   0.78
          0.05   0.31   0.00   0.00   0.00   0.00   0.00   0.00   0.00
LDHC      0.43   0.24   0.37   0.39   0.29   0.35   0.36   0.50   0.43
          0.45   0.28   0.09   0.07   0.18   0.11   0.10   0.02   0.05
LDHED     0.17   0.37   0.11   0.36   0.26   0.30   0.33   0.33   0.39
          0.45   0.09   0.64   0.10   0.24   0.18   0.14   0.13   0.07

1st line: correlation coefficient (Rho). 2nd line: probability (P); if P < 0.05, the coefficient is significantly different from zero. Abbreviations: see Table 2.
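As an illustration of how a Spearman rank correlation coefficient of the kind listed in Table 3 is obtained, the following minimal sketch ranks both series (tied values receive the mean of their positions) and computes the Pearson correlation of the ranks. The function names and the numerical values are hypothetical and are not data from this study; the sketch returns a signed coefficient and does not compute the associated P value.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho = Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical illustration values (NOT data from this study):
# ET50 in hours (capped at 16 h) and in vivo total score at day 5.
et50_hours = [1.5, 3.0, 6.0, 9.0, 16.0, 16.0]
sd5_score = [12.0, 9.5, 6.0, 4.0, 1.0, 0.5]
rho = spearman_rho(et50_hours, sd5_score)
print(f"rho = {rho:.2f}")
```

With these invented values the coefficient is strongly negative, since a longer ET50 goes with a lower in vivo score; the two products capped at 16 h are treated as a tie.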
Table 4
Spearman rank correlation analysis of the in vitro data obtained with the three reconstructed epidermis models for the 22 products

          In vitro
In vitro  ET50C   ET50ED  ET50ES  VIABC   VIABED  VIABES  IL1AC   IL1AED  IL1AES  LDHC    LDHED
ET50C     1.00    0.96    0.92    0.93    0.92    0.90    0.88    0.79    0.75    0.31    0.18
          –       0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.16    0.43
ET50ED    0.96    1.00    0.94    0.91    0.92    0.90    0.86    0.80    0.79    0.33    0.23
          0.00    –       0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.14    0.31
ET50ES    0.92    0.94    1.00    0.93    0.91    0.94    0.92    0.88    0.88    0.35    0.23
          0.00    0.00    –       0.00    0.00    0.00    0.00    0.00    0.00    0.11    0.31
VIABC     0.93    0.91    0.93    1.00    0.93    0.95    0.86    0.84    0.77    0.29    0.15
          0.00    0.00    0.00    –       0.00    0.00    0.00    0.00    0.00    0.19    0.52
VIABED    0.92    0.92    0.91    0.93    1.00    0.94    0.80    0.75    0.70    0.26    0.11
          0.00    0.00    0.00    0.00    –       0.00    0.00    0.00    0.00    0.25    0.64
VIABES    0.90    0.90    0.94    0.95    0.94    1.00    0.88    0.88    0.84    0.25    0.19
          0.00    0.00    0.00    0.00    0.00    –       0.00    0.00    0.00    0.26    0.40
IL1AC     0.88    0.86    0.92    0.86    0.80    0.88    1.00    0.87    0.84    0.40    0.35
          0.00    0.00    0.00    0.00    0.00    0.00    –       0.00    0.00    0.06    0.11
IL1AED    0.79    0.80    0.88    0.84    0.75    0.88    0.87    1.00    0.95    0.43    0.42
          0.00    0.00    0.00    0.00    0.00    0.00    0.00    –       0.00    0.05    0.05
IL1AES    0.75    0.79    0.88    0.77    0.70    0.84    0.84    0.95    1.00    0.35    0.37
          0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    –       0.11    0.09
LDHC      0.31    0.33    0.35    0.29    0.26    0.25    0.40    0.43    0.35    1.00    0.80
          0.16    0.14    0.11    0.19    0.25    0.26    0.06    0.05    0.11    –       0.00
LDHED     0.18    0.23    0.23    0.15    0.11    0.19    0.35    0.42    0.37    0.80    1.00
          0.43    0.31    0.31    0.52    0.64    0.40    0.11    0.05    0.09    0.00    –

1st line: correlation coefficient (Rho). 2nd line: probability (P); if P < 0.05, the coefficient is significantly different from zero. Abbreviations: see Table 2.
Table 5
Spearman rank correlation analysis of the in vivo data for the 22 products

          In vivo
In vivo   ED2    DD2    SD2    ED5    DD5    FD5    SD5    CHROM  TEWL
ED2       1.00   0.42   0.96   0.85   0.84   0.86   0.87   0.87   0.60
          –      0.05   0.00   0.00   0.00   0.00   0.00   0.00   0.00
DD2       0.42   1.00   0.57   0.45   0.59   0.49   0.52   0.29   0.24
          0.05   –      0.01   0.04   0.00   0.02   0.01   0.19   0.28
SD2       0.96   0.57   1.00   0.83   0.86   0.83   0.87   0.84   0.60
          0.00   0.01   –      0.00   0.00   0.00   0.00   0.00   0.00
ED5       0.85   0.45   0.83   1.00   0.93   0.95   0.97   0.90   0.79
          0.00   0.04   0.00   –      0.00   0.00   0.00   0.00   0.00
DD5       0.84   0.59   0.86   0.93   1.00   0.94   0.98   0.82   0.68
          0.00   0.00   0.00   0.00   –      0.00   0.00   0.00   0.00
FD5       0.86   0.49   0.83   0.95   0.94   1.00   0.96   0.86   0.71
          0.00   0.02   0.00   0.00   0.00   –      0.00   0.00   0.00
SD5       0.87   0.52   0.87   0.97   0.98   0.96   1.00   0.88   0.74
          0.00   0.01   0.00   0.00   0.00   0.00   –      0.00   0.00
CHROM     0.87   0.29   0.84   0.90   0.82   0.86   0.88   1.00   0.84
          0.00   0.19   0.00   0.00   0.00   0.00   0.00   –      0.00
TEWL      0.60   0.24   0.60   0.79   0.68   0.71   0.74   0.84   1.00
          0.00   0.28   0.00   0.00   0.00   0.00   0.00   0.00   –

1st line: correlation coefficient (Rho). 2nd line: probability (P); if P < 0.05, the coefficient is significantly different from zero. Abbreviations: see Table 2.

Fig. 3. Principal component analysis: projection of the in vivo and in vitro parameters on the plane 1–2; 86% of the variance explained by the first two principal components. Abbreviations: see Table 2.
Table 6
Contingency table of in vivo and in vitro classifications (this table was identical for all three reconstructed epidermis models)

                 In vivo
In vitro (b)     Irritant (a)   Non-irritant   Total
Irritant         11             0              11
Non-irritant     1              10             11
Total            12             10             22

Sensitivity = percentage of irritant products recognized correctly = 92%. Specificity = percentage of non-irritant products recognized correctly = 100%. Observed concordance = percentage of all products classified correctly = 95%. Kappa = 0.91. 95% confidence interval = [0.74, 1.08].
(a) In vivo irritant = mean total score at day 5 ≥ 2.
(b) In vitro irritant = percent MTT viability at 16 h ≤ 50%.
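The agreement statistics quoted under Table 6 can be recomputed from the four cell counts. The sketch below is an illustration only: the function name is hypothetical, and the confidence interval uses a simple large-sample standard error for kappa, which is an assumption on our part, since the formula actually used is not stated here. Under that assumption the interval comes out close to the one reported.

```python
def agreement_stats(tp, fn, fp, tn):
    """Agreement between in vitro and in vivo classes from a 2x2 table.

    tp: irritant in vivo and in vitro      fn: irritant in vivo, missed in vitro
    fp: non-irritant in vivo, irritant in vitro      tn: non-irritant in both
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    # Simple large-sample standard error of kappa (assumed, see lead-in).
    se = (observed * (1 - observed) / (n * (1 - expected) ** 2)) ** 0.5
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "concordance": observed,
        "kappa": kappa,
        "ci95": (kappa - 1.96 * se, kappa + 1.96 * se),
    }


# Cell counts taken from Table 6.
stats = agreement_stats(tp=11, fn=1, fp=0, tn=10)
for name, value in stats.items():
    print(name, value)
```

The upper confidence limit exceeds 1 because the normal approximation is not bounded; in practice it can be read as 1.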
negative result produced was a rather thick mascara (M#34 in Table 1), difficult to apply onto the skin surface, with a borderline in vivo score at day 5 of 2.2. In conclusion, the κ value (0.91; 95% confidence interval 0.74–1.08) indicates a good agreement between the in vitro and the in vivo classifications (observed concordance=95%) for the three reconstructed epidermis models, which classified the products in an identical way.

3.4. Multivariate descriptive statistics

A PCA, which shows the correlation between variables, was performed on all the parameters taken as a
Fig. 4. Principal component analysis: projection of the 22 test products on the plane 1–2 (in vivo irritant = mean total visual score ≥ 2). Abbreviations: see Table 2.
whole. In the PCA plots, a good correlation between parameters is indicated by the fact that these parameters lie far from the origin (0,0). Moreover, parameters that lie on a straight line from the origin in the same quadrant are strongly positively correlated, whereas those that lie on a straight line through the origin in opposite quadrants are negatively correlated. The PCA including all the parameters of this study (in vitro and in vivo) showed that the most important parameters to discriminate between irritant and non-irritant products were, in vitro, the percent MTT viability at 16 h and the ET50 values of all three epidermis equivalents analysed, and the IL-1a release of the Cosmital model, and, in vivo, the sum of the visual scores at day 5 and the chromameter value (Fig. 3). Furthermore, as the PCA also showed that the products were clearly divided into “irritants” and “non-irritants” (Fig. 4), the correlation analyses presented in the following sections were carried out for the two classes of products separately. As already mentioned, the test
product M#34 again showed a particular behaviour by its location in the group of non-irritants, probably partly due to its sticky physical aspect (under-prediction in vitro).

3.5. Analysis of the 12 irritant products with univariate descriptive statistics

The following conclusions could be drawn from the univariate descriptive statistics (Table 7):
- The percent MTT viability at 16 h and the ET50 obtained with EpiDerm were better correlated with the in vivo parameters (except dryness at day 2 and TEWL) than the corresponding data obtained with the two other epidermis models analysed;
- IL-1a release measured at the end of the 16-h exposure to the test material in the three models did not correlate with the in vivo parameters;
- LDH release seems to be correlated only with the in vivo dryness visual score at day 2, which is not sufficient to allow the predictive evaluation of the irritation potential of the test product after repeated application.
In conclusion, the release of IL-1a and LDH measured after 16 h of treatment with the test products did not allow accurate differentiation of the irritant products. These two parameters generally confirmed the cytotoxicity already measured in the MTT assay, the release of IL-1a or LDH being always linked to a significant
decrease in MTT reduction capacity. The experimental data showed in fact that the late time point of 16 h was not optimal for measuring the amount of cytokine released and accumulated in the culture medium below the tissue insert, since the amount of protein measured at that time was in several cases already reduced compared to the earlier time points, probably due to high cytotoxicity caused by irritant products or, in some other cases, due to a possible interaction between high doses of test product and the cytokine or the enzyme (which can be easily verified). It could be observed that products which were highly cytotoxic (ET50 < 4 h) released smaller amounts of IL-1a and LDH at 16 h than products which were less cytotoxic in the MTT assay. It is, however, not possible either to measure the release of these irritation markers too early in the time course of the in vitro study, because they are not yet detectable in the culture medium. Therefore, an adequate time point is still to be found in order to be able to use the parameters of release to better discriminate among the irritant products (for example, the time at which the release of one marker would be two times the background level), and to combine these parameters with the MTT viability to improve the predictive ability of the assay system.

3.6. Analysis of the 12 irritant products with multivariate descriptive statistics

The PCA on the whole of the in vivo and in vitro parameters demonstrated that three principal components
Table 7
Spearman rank correlation analysis between the in vitro data obtained with the three reconstructed epidermis models and the in vivo data for the 12 irritant products

          In vivo
In vitro  ED2    DD2    SD2    ED5    DD5    FD5    SD5    CHROM  TEWL
ET50C     0.67   0.37   0.61   0.79   0.59   0.75   0.75   0.88   0.51
          0.02   0.23   0.03   0.00   0.04   0.00   0.00   0.00   0.09
ET50ED    0.77   0.47   0.71   0.90   0.80   0.90   0.90   0.83   0.30
          0.00   0.12   0.01   0.00   0.00   0.00   0.00   0.00   0.34
ET50ES    0.54   0.44   0.57   0.59   0.59   0.59   0.59   0.55   0.31
          0.07   0.16   0.05   0.04   0.04   0.04   0.04   0.06   0.32
VIABC     0.75   0.36   0.71   0.66   0.63   0.68   0.68   0.66   0.23
          0.00   0.25   0.01   0.02   0.03   0.02   0.02   0.02   0.47
VIABED    0.71   0.55   0.73   0.77   0.78   0.78   0.78   0.62   0.19
          0.01   0.07   0.01   0.00   0.00   0.00   0.00   0.03   0.56
VIABES    0.44   0.46   0.50   0.66   0.64   0.64   0.64   0.53   0.26
          0.15   0.13   0.10   0.02   0.02   0.02   0.02   0.08   0.42
IL1AC     0.32   0.17   0.36   0.53   0.38   0.47   0.47   0.66   0.45
          0.31   0.59   0.25   0.08   0.23   0.12   0.12   0.02   0.14
IL1AED    0.09   0.18   0.08   0.22   0.12   0.17   0.17   0.35   0.31
          0.78   0.58   0.81   0.48   0.71   0.59   0.59   0.27   0.32
IL1AES    0.16   0.15   0.07   0.23   0.16   0.17   0.17   0.22   0.48
          0.62   0.64   0.83   0.47   0.62   0.59   0.59   0.48   0.12
LDHC      0.12   0.70   0.26   0.24   0.38   0.22   0.22   0.10   0.06
          0.72   0.01   0.42   0.44   0.22   0.48   0.48   0.76   0.85
LDHED     0.50   0.70   0.59   0.20   0.31   0.22   0.22   0.06   0.03
          0.10   0.01   0.04   0.54   0.32   0.50   0.50   0.86   0.91

1st line: correlation coefficient (Rho). 2nd line: probability (P); if P < 0.05, the coefficient is significantly different from zero. Abbreviations: see Table 2.
explained 83% of the variation of the data. The projection of the parameters on the most informative plane 1–2 explained 75% of the variance and on the plane 1–3, 61% of the variance (Figs. 5 and 6). The following conclusions could be drawn from the PCA:
- The clinical evaluations (erythema, dryness, fissures and their sum at days 2 and 5) as well as the chromameter value were positively correlated among each other and negatively with the in vitro parameters (percent MTT viability at 16 h and ET50);
- The percent MTT viability at 16 h and ET50 of the epidermis models were positively correlated among each other;
- The chromameter value was positively correlated with the TEWL value;
- LDH generally did not show any good correlation with the other parameters, either in vivo or in vitro.
These correlations were not specific for the irritant products and did not bring new perspectives in the establishment of a prediction model for skin irritation. The absence of correlation of the parameter LDH (measured at 16 h) confirmed the conclusion already mentioned and discussed above.
Fig. 5. Principal component analysis: projection of the in vivo and in vitro data on the plane 1–2 for the 12 irritant products. Abbreviations: see Table 2.
3.7. Analysis of the 12 irritant products with PLS regression

The purpose of this type of analysis is to connect the in vivo and in vitro test results in order to try to find a linear multivariate model based on the best predictive parameters. The predictive ability of the regression model is characterized by means of the values R2y (percent of variation of the parameters to be predicted explained by the model) and Q2 (predictive power). A model is considered acceptable if R2y > 0.8 and Q2 > 0.5. The PLS analysis on the in vivo and in vitro parameters of this study showed, however, that the predictive ability of the regression model obtained using all of the parameters (vivo vs vitro) together was not satisfactory: it did not allow a prediction of skin irritation potential in good agreement with the in vivo data (R2y=0.62, Q2=0.31; geometrical representations not shown). Therefore, different models were evaluated by using only the three most pertinent in vivo parameters (sum of visual scores at day 5, chromameter value and TEWL). The different associations of parameters did not allow a good predictive model to be established either. However, two models, which were able to predict two in vivo parameters but nevertheless did not completely meet the statistical acceptance criteria, can be mentioned:

1. The sum of visual scores at day 5 can be relatively well predicted by the EpiDerm model using the percent MTT viability at 16 h and the ET50 value (R2y=0.74, Q2=0.66):

$$\text{Sum of visual scores at day 5} = 10.494 - 0.686 \times \text{ET50} + 0.0418 \times (\%\ \text{viability at 16 h})$$

2. The chromameter value can be relatively well predicted by the Cosmital model using the percent MTT viability at 16 h and the ET50 value (R2y=0.75, Q2=0.65):

$$\text{Chrom} = 8.395 - 0.544 \times \text{ET50} + 0.024 \times (\%\ \text{viability at 16 h})$$
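For illustration, the two regression models and the acceptance criteria above can be written as a short sketch. The function names are hypothetical, the input values in the usage lines are invented, and the minus sign before the ET50 term of the chromameter model is an assumption made by analogy with the first model, since the operator is not legible in the printed equation.

```python
def predict_sd5_epiderm(et50_h, viability_16h_pct):
    """Sum of visual scores at day 5 predicted from EpiDerm data."""
    return 10.494 - 0.686 * et50_h + 0.0418 * viability_16h_pct


def predict_chrom_cosmital(et50_h, viability_16h_pct):
    """Chromameter value predicted from Cosmital data.

    The minus sign before 0.544 is an assumption (operator not legible
    in the printed equation).
    """
    return 8.395 - 0.544 * et50_h + 0.024 * viability_16h_pct


def meets_acceptance_criteria(r2y, q2):
    """Acceptance rule stated in the text: R2y > 0.8 and Q2 > 0.5."""
    return r2y > 0.8 and q2 > 0.5


# Invented inputs for illustration only: ET50 = 4 h, 20% viability at 16 h.
print(predict_sd5_epiderm(4.0, 20.0))
print(predict_chrom_cosmital(4.0, 20.0))

# Reported model statistics: both Q2 values pass, both R2y values fall short.
print(meets_acceptance_criteria(0.74, 0.66))  # EpiDerm model
print(meets_acceptance_criteria(0.75, 0.65))  # Cosmital model
```

Both models clear the Q2 threshold but not the R2y threshold, which is why they are described as not completely meeting the acceptance criteria.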
In conclusion, besides their limited predictive ability, these models have the principal disadvantage of being tied to one particular reconstructed epidermis model; that is, the use of another epidermis model would require a new evaluation at high expense. A somewhat better prediction could be obtained by using the same in vitro parameters of the three models all together, but this is not feasible in practice.

Fig. 6. Principal component analysis: projection of the in vivo and in vitro data on the plane 1–3 for the 12 irritant products. Abbreviations: see Table 2.

3.8. Analysis of the 10 non-irritant products using univariate descriptive statistics
The univariate descriptive statistics for the non-irritant products (Table 8) showed that there were few
Table 8
Spearman rank correlation analysis between the in vitro data (except ET50, which was identical (> 16 h) in almost all the cases) obtained with the three reconstructed epidermis models and the in vivo data for the 10 non-irritant products

          In vivo
In vitro  ED2    DD2    SD2    ED5    DD5    FD5    SD5    CHROM  TEWL
VIABC     0.18   0.37   0.15   0.27   0.23   0.13   0.31   0.15   0.07
          0.63   0.29   0.68   0.44   0.53   0.72   0.38   0.68   0.85
VIABED    0.38   0.21   0.22   0.58   0.35   0.23   0.42   0.26   0.54
          0.28   0.55   0.54   0.08   0.32   0.52   0.23   0.47   0.11
VIABES    0.46   0.67   0.07   0.09   0.31   0.20   0.27   0.05   0.18
          0.18   0.03   0.84   0.80   0.38   0.58   0.45   0.88   0.63
IL1AC     0.11   0.33   0.09   0.43   0.14   0.28   0.02   0.27   0.70
          0.76   0.36   0.82   0.21   0.70   0.44   0.96   0.45   0.03
IL1AED    0.64   0.13   0.44   0.23   0.24   0.37   0.30   0.24   0.07
          0.04   0.72   0.21   0.53   0.51   0.29   0.40   0.51   0.85
IL1AES    0.41   0.08   0.25   0.23   0.26   0.22   0.32   0.09   0.15
          0.24   0.83   0.49   0.53   0.46   0.55   0.37   0.80   0.68
LDHC      0.34   0.03   0.41   0.62   0.08   0.03   0.22   0.73   0.60
          0.34   0.93   0.24   0.05   0.83   0.92   0.53   0.02   0.07
LDHED     0.19   0.20   0.19   0.87   0.34   0.35   0.59   0.22   0.65
          0.60   0.57   0.60   0.00   0.34   0.31   0.07   0.53   0.04

1st line: correlation coefficient (Rho). 2nd line: probability (P); if P < 0.05, the coefficient is significantly different from zero. Abbreviations: see Table 2.
relevant and usable correlations between the different in vitro and in vivo parameters. The ET50 values obtained with the three epidermis models were excluded from the analysis as they were practically all greater than 16 h for that group of test products. The highest Spearman rank correlation (ρ = 0.87) was found between LDH/EpiDerm and the erythema score at day 5 (in comparison, ρ = 0.62 for the Cosmital model). Correlations with a coefficient between 0.6 and 0.8 could occasionally be detected with one of the three in vitro epidermis models (e.g. LDH/Cosmital vs chromameter value; IL-1a/Cosmital vs TEWL). Nevertheless, taking into account the overall relatively low quality of these correlations and the small number of pairs of parameters which are correlated, further multidimensional methods were not applied for this group of products. From these results, it can be concluded that it is almost impossible to discriminate among the “non-irritants”; that is, to differentiate sub-classes among the “non-irritant” products (e.g. mild, very mild and innocuous).
4. Conclusions

The concordance between the in vivo and in vitro classifications as “irritant” and “non-irritant” of all the cosmetic products analysed was very good, and the three reconstructed epidermis models, EpiDerm, EPISKIN and Cosmital, classified the products in the same way. The most discriminating parameters to separate the irritant products from the non-irritant ones were: in vitro, the percent MTT viability after 16 h of exposure and ET50, and in vivo, the sum of visual scores at day 5 as well as the chromameter value. The analysis carried out separately for “irritant” and “non-irritant” products (univariate and multivariate descriptive statistics; PLS regression) did not yield a satisfactory prediction model, based on the combination of various parameters, for further discriminating inside these two groups of products, except for the sum of visual scores at day 5, which can be predicted for irritants with EpiDerm, and the chromameter value, which can be predicted with the Cosmital model, by means of the in vitro parameters percent MTT viability and ET50. However, these proposed prediction models did not meet all the statistical acceptance criteria. Although the extensive statistical analysis of all the in vitro and in vivo parameters evaluated in this study did not allow the establishment of a general mathematical prediction model for skin irritation testing for all the models, it showed that a discrimination between irritant and non-irritant formulations was possible based on the above-mentioned in vitro parameters identified as good predictors. The practical usefulness of the in vitro approach resides on the one hand in screening formulations before human in vivo dermatological evaluation (protection of the volunteers), and on the other hand, during the development of new formulations with a certain irritation potential (such as undiluted, surfactant-containing cleansing products), in early selection of the best ones. Therefore, we propose as a general strategy for
screening purposes to test all new products with an exposure time of 16 h and to classify those with MTT viability greater than 50% as non-irritant. If necessary, for certain product categories, the remaining formulations not yet classified are tested with an exposure time of 4 h and, based on the in vivo/in vitro correlation demonstrated in Fig. 2(d) for EpiDerm, they are then ranked according to their MTT viability. In conclusion, this study demonstrated the usefulness of reconstructed human epidermis equivalents for the in vitro assessment of the irritation potential of a series of cosmetic products. These models allow the measurement of quantifiable and objective endpoints relevant to in vivo irritative phenomena.
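The screening strategy proposed above can be sketched as a two-step procedure. The sketch is illustrative only: the function names, product labels and viability values are hypothetical, and ordering the remaining products from lowest to highest 4-h viability (most to least irritant) is our assumption about how the ranking would be read.

```python
def classify_16h(viability_16h_pct, cutoff=50.0):
    """Step 1: MTT viability above the cut-off after 16 h means non-irritant."""
    return "non-irritant" if viability_16h_pct > cutoff else "irritant"


def rank_by_4h_viability(viability_4h_by_product):
    """Step 2: rank the remaining products by MTT viability after 4 h.

    Lowest viability first, read here as most irritant first (assumption).
    """
    return sorted(viability_4h_by_product, key=viability_4h_by_product.get)


# Hypothetical products and viabilities (NOT data from this study).
viability_16h = {"P1": 88.0, "P2": 12.0, "P3": 50.0, "P4": 3.0}
viability_4h = {"P2": 61.0, "P3": 83.0, "P4": 20.0}

classes = {name: classify_16h(v) for name, v in viability_16h.items()}
remaining = {name: viability_4h[name]
             for name, cls in classes.items() if cls == "irritant"}
ranking = rank_by_4h_viability(remaining)
print(classes)
print(ranking)
```

A product with exactly 50% viability at 16 h falls on the irritant side here, in line with the cut-off definition used for Table 6.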
Acknowledgements

This work was part of the European Commission Project “Testing and improvement of reconstructed skin kits in order to elaborate European standards”, which was supported by a grant from the European Committee DGXII (Standards, Measurements and Testing SMT4-CT97–2174) and in Switzerland by the Federal Office for Education and Science (BBW-Nr. 97.0255). The authors wish to thank the partners of the project: C. Lotte and F. Dreher (L’Oréal, Paris, France), I. R. Harris and U. Pfannenbecker (Beiersdorf, Hamburg, Germany), M. Ponec and E. Boelsma (Leiden University Medical Center, Leiden, The Netherlands), and H. Beck and M.-N. Python (Wella/Cosmital, Marly, Switzerland) for fruitful discussions and good co-operation. The excellent technical assistance of E. Knobel, J. Rolle and J. Chassot was highly appreciated.
References

Bell, E., Ehrlich, H.P., Buttle, D.J., Nakatsuji, T., 1981. Living tissue formed in vitro and accepted as skin-equivalent tissue of full thickness. Science 211, 1051–1054.
Bell, E., Parenteau, N., Gay, R., Nolte, C., Kemp, P., Bilbo, P., Ekstein, B., Johnson, E., 1991. The living skin equivalent: its manufacture, its organotypic properties and its responses to irritants. Toxicology in Vitro 5, 591–596.
Boelsma, E., Gibbs, S., Faller, C., Ponec, M., 2000. Characterization and comparison of reconstructed skin models: morphological and immunohistochemical evaluation. Acta Dermato-Venereologica 80, 82–88.
Botham, P.A., Earl, L.K., Fentem, J.H., Roguet, R., van de Sandt, J.J.M., 1998. Alternative methods for skin irritation testing: the current status. ECVAM Skin Irritation Task Force report 1. ATLA 26, 195–211.
Cannon, C.L., Neal, P.J., Southee, J.A., Kubilus, J., Klausner, M., 1994. New epidermal model for dermal irritancy testing. Toxicology in Vitro 8 (4), 889–891.
de Brugerolle de Fraissinette, A., Picarles, V., Chibout, S., Kolopp, M., Medina, J., Burtin, P., Ebelin, M.E., Osborne, S., Mayer, F.K., Spake, A., Rosdy, M., De Wever, B., Ettlin, R.A., Cordier, A., 1999. Predictivity of an in vitro model for acute and chronic skin irritation (SkinEthic) applied to the testing of topical vehicles. Cell Biology and Toxicology 15, 121–135.
Demetrulias, J., Donnelly, T., Morhenn, V., Jessee, B., Hainsworth, S., Casterton, P., Bernhofer, L., Martin, K., Decker, D., 1998. Skin21 – an in vitro human skin model: the correlation between in vivo and in vitro testing of surfactants. Experimental Dermatology 7, 18–26.
Faller, C., Bracher, M. Reconstructed skin kits: reproducibility of cutaneous irritancy testing. Skin Pharmacology and Applied Skin Physiology (in press).
Fentem, J.H., Briggs, D., Chesné, C., Elliott, G.R., Harbell, J.W., Heylings, J.R., Portes, P., Roguet, R., van de Sandt, J.J.M., Botham, P.A., 2001. A prevalidation study on in vitro tests for acute skin irritation: results and evaluation by the Management Team. Toxicology in Vitro 15, 57–93.
Gay, R., Swiderek, M., Nelson, D., Ernesti, A., 1992. The living skin equivalent as a model in vitro for ranking the toxic potential of dermal irritants. Toxicology in Vitro 6 (4), 303–315.
Höskuldsson, A., 1996. Prediction Methods in Science and Technology. Thor Publishing, Copenhagen, Denmark.
Jackson, J.E., 1991. A User’s Guide to Principal Components. John Wiley, New York.
Limat, A., Hunziker, T., 1997. Epidermal equivalents generated from cultured outer root sheath cells. In Vitro Toxicology 10 (1), 33–38.
Mosmann, T., 1983. Rapid colorimetric assay for cellular growth and survival: application to proliferation and cytotoxicity assays. Journal of Immunological Methods 65, 55–63.
Müller-Decker, K., Furstenberger, G., Marks, F., 1994. Keratinocyte-derived pro-inflammatory key mediators and cell viability as in vitro parameters of irritancy: a possible alternative to the Draize skin irritation test. Toxicology and Applied Pharmacology 127, 99–108.
Noser, F.K., Limat, A., 1987. Organotypic culture of outer root sheath cells from human hair follicles using a new culture device. In Vitro Cellular and Developmental Biology 23 (8), 541–545.
Osborne, R., Perkins, M.A., 1991. In vitro skin irritation testing with human skin cultures. Toxicology in Vitro 5, 563–567.
Ponec, M., 1994. The use of in vitro skin recombinants to evaluate cutaneous toxicity. In: Rougier, A., Goldberg, A., Maibach, H.I. (Eds.), In Vitro Skin Toxicology, Irritation, Phototoxicity, Sensibilization. Mary Ann Liebert, New York, pp. 107–116.
Ponec, M., Boelsma, E., Gibbs, S., Mommaas, M. Characterization of reconstructed skin models. Skin Pharmacology and Applied Skin Physiology (in press).
Ponec, M., Boelsma, E., Weerheim, A., Mulder, A., Bouwstra, J., Mommaas, M., 2000. Lipid and ultrastructural characterization of reconstructed skin models. International Journal of Pharmaceutics 203, 211–225.
Ponec, M., Kempenaar, J., 1995. Use of human skin recombinants as an in vitro model for testing the irritation potential of cutaneous irritants. Skin Pharmacology 8, 49–59.
Prunieras, M., 1994. Skin and epidermal equivalents: a review. In: Rougier, A., Goldberg, A., Maibach, H.I. (Eds.), In Vitro Skin Toxicology, Irritation, Phototoxicity, Sensibilization. Mary Ann Liebert, New York, pp. 97–105.
Prunieras, M., Regnier, M., Schloetterer, M., 1979. A new method to culture human epidermal cells on allogeneic or xenogeneic dermis: preparation of recombined grafts. Annales de Chirurgie Plastique 24, 357–362.
Roguet, R., 1999. Use of skin cell cultures for in vitro assessment of corrosion and cutaneous irritancy. Cell Biology and Toxicology 15, 63–75.
Roguet, R., Cohen, C., Dossou, K.G., Rougier, A., 1994a. Episkin™, a reconstituted human epidermis for assessing in vitro the irritancy of topically applied compounds. Toxicology in Vitro 8 (2), 283–291.
Roguet, R., Cohen, C., Robles, C., Courtellemont, P., Tolle, M., Guillot, J.P., Pouradier Duteil, X., 1998. An interlaboratory study of the reproducibility and relevance of Episkin™, a reconstructed human epidermis, in the assessment of cosmetics irritancy. Toxicology in Vitro 12, 295–304.
Roguet, R., Faller, C., Dreher, F., Lotte, C., Harris, I., Bracher, M., Pollet, D., Pfannenbecker, U., Dami, N., Ponec, M., 2001. Evaluation of reconstructed human epidermis kits for the in vitro assessment of cosmetic safety. Proceedings, 2001 IFSCC Conference, Stockholm, pp. 103–124.
Roguet, R., Régnier, M., Cohen, C., Dossou, K.G., Rougier, A., 1994b. The use of in vitro reconstituted human skin in dermotoxicity testing. Toxicology in Vitro 8 (4), 635–639.
Rosdy, M., Clauss, L.C., 1990. Terminal epidermal differentiation of human keratinocytes grown in chemically defined medium on inert filter substrates at the air-liquid interface. Journal of Investigative Dermatology 95, 409–414.
Slivka, S.R., Zeigler, F., 1993. Use of an in vitro skin model for determining epidermal and dermal contributions to irritant responses. Journal of Toxicology: Cutaneous and Ocular Toxicology 12 (1), 49–57.
Tinois, E., Tiollier, J., Gaucherand, M., Dumas, H., Tardy, M., Thivolet, J., 1991. In vitro and post-transplantation differentiation of human keratinocytes grown on the human type IV collagen film of a bilayered dermal substitute. Experimental Cell Research 193, 310–319.
van de Sandt, J., Roguet, R., Cohen, C., Esdaile, D., Ponec, M., Corsini, E., Barker, C., Fusenig, N., Liebsch, M., Benford, D., de Brugerolle de Fraissinette, A., Fartasch, M., 1999. The use of human keratinocytes and human skin models for predicting skin irritation. The report and recommendations of ECVAM workshop 38. ATLA 27, 723–743.
Wahlen, E., Donnelly, T.A., Naughton, G., Rheins, L.A., 1994. The development of three-dimensional in vitro human tissue models. Human and Experimental Toxicology 13, 853–859.