Towards a functional proteomics approach to the comprehension of idiopathic pulmonary fibrosis, sarcoidosis, systemic sclerosis and pulmonary Langerhans cell histiocytosis


Journal of Proteomics 83 (2013) 60–75

Available online at www.sciencedirect.com

www.elsevier.com/locate/jprot

C. Landi a,⁎, E. Bargagli b, L. Bianchi a, A. Gagliardi a, A. Carleo a, D. Bennett b, M.G. Perari b, A. Armini a, A. Prasse c, P. Rottoli b, L. Bini a,⁎

a Functional Proteomic Section, Department of Life Sciences, University of Siena, Siena, Italy
b Respiratory Diseases Section, Department of Clinical Medicine and Immunological Sciences, University of Siena, Siena, Italy
c Department of Pneumology, Ludwig University, Freiburg, Germany

ARTICLE INFO

Article history:
Received 4 December 2012
Accepted 9 March 2013
Available online 23 March 2013

Keywords:
Sarcoidosis
Idiopathic pulmonary fibrosis
Systemic sclerosis
Langerhans cell histiocytosis
Smoke
Functional proteomics

ABSTRACT

Bronchoalveolar lavage fluid of patients with four interstitial lung diseases (sarcoidosis, idiopathic pulmonary fibrosis, pulmonary Langerhans cell histiocytosis, fibrosis associated with systemic sclerosis) and of smoker and non-smoker control subjects was compared in a proteomic study. Principal component analysis was used to statistically verify the association between differentially expressed proteins and the conditions analyzed. Pathway and functional analysis by MetaCore and DAVID software revealed possible regulatory factors involved in specific “process networks” like regulation of stress and inflammatory responses. Immune response by alternative complement pathways, protein folding, Slit-Robo signaling and blood coagulation were “pathway maps” possibly associated with the pathogenesis of interstitial lung diseases. Four proteins of interest, plastin 2, annexin A3, 14-3-3ε and S10A6 (calcyclin), were validated by Western blot analysis. In conclusion, we identified proteins that could be directly or indirectly linked to the pathophysiology of the different interstitial lung diseases. Multivariate analysis allowed us to classify samples into groups corresponding to the different conditions analyzed, based on their differential protein expression profiles. Finally, functional and pathway analysis defined the potential function and relations among identified proteins, including low abundance molecules present in the MetaCore database.

Abbreviations: PLCH, pulmonary Langerhans cell histiocytosis; Sar, sarcoidosis; IPF, idiopathic pulmonary fibrosis; SSc, systemic sclerosis; sc, smoker control; nsc, non-smoker control; BAL, bronchoalveolar lavage; HRCT, high resolution computed tomography; PFT, pulmonary function test; 2DE, two-dimensional electrophoresis; MS, mass spectrometry.
⁎ Corresponding authors at: Laboratorio di Proteomica Funzionale, Dipartimento di Scienze della Vita, Università degli studi di Siena, via Fiorentina 1, 53100 Siena. Tel.: +39 0577 234938; fax: +39 0577 234903.
E-mail addresses: [email protected] (C. Landi), [email protected] (L. Bini).
1874-3919/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.jprot.2013.03.006


Biological significance

This is the first study in which different interstitial lung diseases (sarcoidosis, idiopathic pulmonary fibrosis, pulmonary Langerhans cell histiocytosis and fibrosis associated with systemic sclerosis) and smoker and non-smoker control subjects were compared in a proteomic study to highlight their common pathways. We decided to report not only principal component analysis, used to statistically verify the association between differentially expressed proteins and the conditions analyzed, but also the general results of functional analysis, considering all differential proteins potentially involved in these conditions, to speculate about possible common pathogenetic pathways involved in fibrotic lung damage. © 2013 Elsevier B.V. All rights reserved.

1. Introduction

Sarcoidosis (Sar), idiopathic pulmonary fibrosis (IPF), fibrosis associated with systemic sclerosis (SSc) and pulmonary Langerhans cell histiocytosis (PLCH) are interstitial lung diseases (ILDs) with different etiopathogenesis, clinical course and prognosis. Sarcoidosis is a chronic multisystemic granulomatous disease of unknown etiology characterized by non-caseating epithelioid granulomas distributed around lymphatic vessels, bronchi and pulmonary veins, in alveolar spaces and in the pleura [1]. Idiopathic pulmonary fibrosis is a progressive fibroproliferative disorder characterized by fibroblast and myofibroblast deposition in the alveolar walls and overproduction of extracellular matrix, which reduce gas exchange. This severe lung disorder is associated with a radiological and histological pattern of usual interstitial pneumonia (UIP) [2,3]. The etiopathogenesis of the disease is not completely understood. Pulmonary Langerhans cell histiocytosis is a rare lung disorder characterized by unchecked proliferation and infiltration of immature dendritic cells (also called Langerhans cells) in the lungs. The etiopathogenesis of PLCH is unknown; the bronchiolar distribution of pathological lesions and epidemiological data suggest that an inhaled antigen, such as cigarette smoke, may be crucial for its development [4,5]. Nodular lesions in the lungs may disappear spontaneously, often on stopping smoking, while cysts are generally irreversible [6]. Systemic sclerosis is a heterogeneous disorder characterized by endothelial dysfunction and collagen over-production. Lung involvement occurs in 70% of patients and is associated with a poor prognosis. Pulmonary manifestations of SSc include pulmonary vascular diseases such as arterial hypertension and veno-occlusive disease, interstitial lung diseases and an increased risk of malignancy [7].
Previous proteomic studies demonstrated that application of two-dimensional electrophoresis to bronchoalveolar lavage (BAL) is extremely useful for detailed analysis of the pathogenesis of interstitial lung diseases [8–10]. Groups of proteins expressed differentially in single ILDs were identified. Here we focused on BAL protein profiles from patients with Sar, IPF, SSc and PLCH and two groups of healthy controls (smokers and non-smokers, sc and nsc), using proteomic analysis and a systems biology approach to obtain information about the pathways involved and to discover potential prognostic/diagnostic biomarkers as well as new therapeutic targets. The first step was to compare BAL protein profiles from patients with ILDs and controls by 2D-electrophoresis, image analysis and mass spectrometry; multivariate analysis was performed on the differentially expressed proteins by principal component analysis (PCA) in order to validate the data and evaluate the reproducibility of biological replicates. Western blot analysis enabled validation of some differentially expressed proteins, while pathway analysis was performed using MetaCore software to establish potential correlations among proteins. To obtain insights into biological significance, the Database for Annotation, Visualization and Integrated Discovery (DAVID) (http://david.abcc.ncifcrf.gov/) was used to define known functions of the proteins of interest. In conclusion, this study made it possible to compare BAL from different ILDs and from healthy smoker and non-smoker controls on the basis of their differential protein profiles, providing protein-interaction network maps as well as insights into biological responses and potential pathways.

2. Materials and methods

2.1. Population

The population of patients and controls consisted of 9 patients with sarcoidosis (5 female, mean age 60.3 ± 9.4), 7 with systemic sclerosis (3 female, mean age 65.7 ± 5.2), 7 with idiopathic pulmonary fibrosis (1 female, mean age 63.5 ± 2.3), 9 with PLCH (3 female, mean age 59.9 ± 7), 10 non-smoker controls (5 female, mean age 65.3 ± 8) and 8 smoker controls (4 female, mean age 64 ± 5.4). Diagnosis of ILDs was conducted according to international criteria [10–12]. Selection criteria for the inclusion and exclusion of patients were: none had been treated with steroids or other immunosuppressants at the time of the bronchoscopy; they were monitored regularly at our Regional ILD Referral Centre in Siena from onset for at least 12 months; they had no history of concomitant pathology (e.g. patients with pulmonary hypertension associated with ILD were excluded). All patients and controls gave their written informed consent to the study, which was also approved by the local ethics committee. All patients underwent chest HRCT at onset and once a year, documenting fibrosis ranging from mild to moderate; patients with end-stage fibrotic involvement were excluded.

Sarcoidosis was diagnosed according to international ATS/ERS/WASOG criteria. A detailed medical history, including occupational exposure and pharmacological history, was obtained. Sarcoidosis patients included in the study were consecutive cases of sarcoidosis not on therapy when BAL samples were obtained and they were all subacute chronic patients. Patients with acute Löfgren's syndrome (bihilar lymphadenopathy, erythema nodosum and arthritis) were excluded. IPF was diagnosed according to international ATS/ERS criteria. All patients showed a typical HRCT UIP/IPF pattern characterized by honeycombing with predominant basal/subpleural distribution. Diagnosis of SSc was based on clinical, radiological and immunological findings and histopathological features. Diagnosis of PLCH was conducted according to international criteria [10–12]; six patients had a diagnosis based on histological examination of transbronchial biopsies showing tissue positivity for anti-CD1a and S100 protein staining; the other three had a diagnosis based on clinical–radiological findings and BAL features (including CD1a positivity).

Control subjects were matched for age and gender, had no history of asthma or allergy and were not receiving any kind of therapy. They had normal lung function parameters and chest X-ray. Lung function test parameters and bronchoalveolar lavage findings in patients and controls are shown in Table 1, together with demographic features, blood gas analysis PaO2 levels, desaturation in the 6-minute walking test and major symptoms at onset.

Table 1 – Sex, age, FVC, FEV1, DLCO, BAL macrophage percentage (M %), BAL lymphocyte percentage (L %), BAL polymorphonuclear cell percentage (PMN %), BAL eosinophil percentage (EOS %), PaO2 levels on blood gas analysis, desaturation on a 6 min walking test and the main symptoms at onset in patients with idiopathic pulmonary fibrosis, sarcoidosis, pulmonary fibrosis associated with systemic sclerosis and pulmonary Langerhans cell histiocytosis.

Parameter | IPF (n = 7) | Sarcoidosis (n = 9) | SSc (n = 7) | PLCH (n = 9)
Sex (male) | 6 | 4 | 4 | 6
Age (m ± sd) | 63.5 ± 2.3 | 60.3 ± 9.4 | 65.7 ± 5.2 | 59.9 ± 7
FVC % | 66.9 ± 17.8 | 97.3 ± 18.2 | 76.5 ± 17.4 | 80.3 ± 14
FEV1 % | 72.1 ± 19.5 | 95.7 ± 11.1 | 81.1 ± 16.8 | 76.2 ± 19.2
DLCO % | 54.4 ± 19.6 | 87.4 ± 12.3 | 69.3 ± 24 | 78.1 ± 23.8
M % | 73.2 ± 12.7 | 61.4 ± 15 | 76.3 ± 16 | 81.4 ± 12.6
L % | 11.3 ± 13.6 | 35.3 ± 21.9 | 17.9 ± 1.8 | 13.8 ± 2.7
PMN % | 10.1 ± 1 | 2.3 ± 1.6 | 4.8 ± 1.7 | 3.9 ± 2.1
EOS % | 6.1 ± 2.1 | 1.9 ± 1.7 | 2.6 ± 1.5 | 1.1 ± 0.2
PaO2 (mm Hg) (m ± sd) | 68.2 ± 11.3 | 89.2 ± 9.6 | 70.9 ± 17.2 | 71.3 ± 10.5
Desaturation (n.pt./tot.) | 2/7 | 0/9 | 1/7 | 1/9
Main symptoms | Dry cough, dyspnea | Asthenia, cough, malaise | Dry cough, dyspnea | Cough, chest pain, pneumothorax

2.2. Bronchoscopy and bronchoalveolar lavage

Bronchoscopy with BAL was performed in all patients for diagnostic reasons as previously reported [11]. The patients were not on therapy at the time it was performed. BAL cell composition was analyzed and lymphocyte phenotyping was done by flow cytometry (Facs-Calibur, Becton Dickinson) using anti-CD3, -CD4, -CD8 and -CD1a monoclonal antibodies (Table 1).

2.3. Preparation of BAL for 2DE analysis

BAL samples were centrifuged at 800 ×g for 5 min to separate BAL fluid (BALF) from the cell component. Dialysis for 12 h against four changes of distilled water was performed to eliminate salts. The resulting samples were lyophilized and dissolved in lysis buffer (8 M urea, 4% w/v CHAPS, 40 mM Tris base, 65 mM dithioerythritol (DTE) and trace amounts of bromophenol blue). Before adding bromophenol blue, protein concentration was determined by the Bradford method [13]. Samples were diluted with lysis buffer to obtain 60 μg of protein in 100 μl of solution for the analytical run and 800 μg of protein in 200 μl of solution for the preparative run.

2.4. High resolution 2D-electrophoresis

2DE was carried out using the Immobiline polyacrylamide system on a preformed immobilized nonlinear pH gradient, from pH 3 to 10, 18 cm in length, from GE Healthcare (Uppsala, Sweden). Analytical runs were carried out using the Ettan™ IPGphor™ system (Amersham Biosciences) at 16 °C under the following electrical conditions: 0 V for 1 h, 30 V for 8 h, 200 V for 1 h, from 300 to 3500 V in 30 min, 3500 V for 3 h, from 3500 to 8000 V in 30 min, then 8000 V up to a total of 80,000 Vh. Preparative strips were rehydrated with 350 μL of 8 M urea, 4% w/v CHAPS, 1% w/v DTE and 2% v/v carrier ampholyte at room temperature for 12 h. Samples were applied by cup loading, with the cup applied at the cathodic and anodic ends of the strip. MS-preparative runs were obtained using the Multiphor™ II electrophoresis system and the following voltage steps at 16 °C: 200 V for 6 h, 600 V for 1 h, 1200 V for 1 h, 3500 V for 3 h, and 5000 V for 14 h. After the first dimension run, the IPG gels were equilibrated in 6 M urea, 2% w/v SDS, 2% w/v DTE, 30% v/v glycerol and 0.05 M Tris–HCl pH 6.8 for 12 min and for a further 5 min in 6 M urea, 2% w/v SDS, 2.5% w/v iodoacetamide, 30% v/v glycerol, 0.05 M Tris–HCl pH 6.8 and a trace of bromophenol blue. After the two equilibration steps, the second dimensional separation was performed on 9–16% SDS polyacrylamide linear gradient gels (18 × 20 cm × 1.5 mm) at 40 mA/gel constant current and 9 °C until the dye front reached the bottom of the gel [14]. Analytical gels were stained with ammoniacal silver nitrate [15,16]. MS-preparative gels were stained with SYPRO Ruby (Bio-Rad, Hercules, California) according to the manufacturer's instructions. Bind-silane (γ-methacryloxypropyltrimethoxysilane) (LKB-Produkter AB, Bromma, Sweden) was used to attach the polyacrylamide gels undergoing SYPRO Ruby staining covalently to a glass surface [17].

Ammoniacal silver-nitrate-stained gels were then digitized with a Molecular Dynamics 300S laser densitometer (4000 × 5000 pixels, 12 bits/pixel; Sunnyvale, CA, USA). Preparative gels stained with SYPRO Ruby were digitized with a Typhoon 9400 laser densitometer (GE Healthcare). Computer-aided 2D image analysis was carried out with the Image Master Platinum 7.0 computer system (GE Healthcare). Spot detection was achieved after defining and saving a set of detection parameters, enabling filtering and smoothing of the original gel scans to clarify spots, and removal of vertical and horizontal streaks and speckles. The analysis was performed by matching all gels of each group with a reference gel for the same condition, chosen by the user as the gel with the best resolution and the greatest number of spots and named “master” by the software. The six master reference gels were then matched with each other. By this procedure, the Image Master Platinum algorithm matched all the gels to find quantitative differences. Spots were considered differentially expressed between two conditions when the ratio of the mean percentage relative volumes (%V) was greater than twofold. A t-test (p ≤ 0.05) was performed to determine whether the two sets of data differed significantly from each other.
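The two-part selection rule described above (a ratio of mean %V greater than twofold, plus a t-test with p ≤ 0.05) can be illustrated with a minimal sketch. The text does not state which t-test variant the image analysis software applies; the code below assumes a two-sample Student's t-test with pooled variance, and all function names and %V values are hypothetical.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, integrating the density on [0, |t|] with Simpson's rule."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3)


def student_t_test(x, y):
    """Assumed variant: equal-variance two-sample t-test. Returns (t, p)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    if pooled == 0:  # degenerate case: no within-group spread
        return 0.0, (1.0 if mean(x) == mean(y) else 0.0)
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, two_sided_p(t, df)


def is_differential(x, y, min_fold=2.0, alpha=0.05):
    """True when the mean %V ratio exceeds min_fold and the t-test gives p <= alpha."""
    hi, lo = max(mean(x), mean(y)), min(mean(x), mean(y))
    fold = math.inf if lo == 0 else hi / lo
    _, p = student_t_test(x, y)
    return fold > min_fold and p <= alpha


# Hypothetical %V values of one spot in two conditions
group_a = [0.10, 0.12, 0.11, 0.09, 0.10]
group_b = [0.30, 0.28, 0.33, 0.31, 0.29]
print(is_differential(group_a, group_b))
```

Both conditions must hold: a spot with a large fold change but high within-group scatter is rejected by the t-test, and a statistically significant but small change is rejected by the fold threshold.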


2.5. Statistical analysis by non-parametric Kruskal–Wallis test and multiple comparison by Dunn's test

The differentially expressed spots that were significant by t-test were subjected to statistical analysis across the six groups using the non-parametric Kruskal–Wallis test (p ≤ 0.05) (GraphPad Prism 5 for Windows). Significant results were followed by comparisons of mean ranks by Dunn's test. We applied a false discovery rate (FDR) correction to the p-values from Kruskal–Wallis testing in relation to the average number of spots, i.e. 1000. The corresponding q-values are listed in Table S1 (see Supplemental material).
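The FDR step can be sketched as follows. The text does not name the FDR procedure; this example assumes the Benjamini–Hochberg step-up method and takes the total number of tests to be the average number of spots (1000), as stated above. The function name and the p-values are hypothetical.

```python
def bh_q_values(p_values, n_tests=None):
    """Benjamini-Hochberg q-values (assumed procedure).

    n_tests is the total number of hypotheses tested; it defaults to the number
    of p-values supplied. When only the smallest p-values of a larger family are
    passed in (e.g. the significant spots out of ~1000), the result is an upper
    bound on the true q-values.
    """
    m = n_tests if n_tests is not None else len(p_values)
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    q = [0.0] * len(p_values)
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(len(order), 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical Kruskal-Wallis p-values for three spots, corrected for 1000 tests
print(bh_q_values([0.00001, 0.0002, 0.00005], n_tests=1000))
```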

2.6. Protein identification by MALDI-ToF/ToF-MS

Protein identification was carried out by Peptide Mass Fingerprinting (PMF) on an ultrafleXtreme™ MALDI-ToF/ToF instrument (Bruker Corporation, Billerica, MA, United States). Electrophoretic spots from SYPRO Ruby stained gels were excised mechanically with an Ettan Spot Picker (GE Healthcare), destained in 2.5 mM ammonium bicarbonate and 50% acetonitrile (ACN), and dehydrated in acetonitrile. They were then rehydrated in trypsin solution and digested overnight at 37 °C. A 0.75 μL aliquot of each digested protein was spotted onto the MALDI target and allowed to dry. Then 0.75 μL of matrix solution (saturated solution of CHCA in 50% v/v ACN and 0.5% v/v TFA) was applied to the dried sample, which was dried again. After acquiring the peptide masses, a mass fingerprinting search was carried out in the Swiss-Prot/TrEMBL and NCBInr databases using the on-line MASCOT software (Matrix Science Ltd., London, UK, http://www.matrixscience.com). Taxonomy was limited to Mammalia, mass tolerance was 100 ppm, and the acceptable number of missed cleavage sites was set at one. Alkylation of cysteine by carbamidomethylation was assumed and oxidation of methionine was considered as a possible modification.
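To make the 100 ppm mass tolerance concrete, the following minimal sketch shows how a relative mass error in parts per million is computed and compared with the tolerance. It only illustrates the arithmetic, not the search engine's internal matching; the function names and m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=100.0):
    """True when the observed peak lies within +/- tol_ppm of the theoretical mass."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical peptide peak: a 0.05 Da deviation at m/z 1000 is 50 ppm
print(ppm_error(1000.05, 1000.0), within_tolerance(1000.05, 1000.0))
```

Because the tolerance is relative, the accepted absolute window widens with mass: 100 ppm corresponds to ±0.1 Da at m/z 1000 and ±0.2 Da at m/z 2000.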

2.7. Protein identification by LC–MS/MS analysis

Trypsin digests that did not produce unambiguous MALDI-ToF/ToF identifications subsequently underwent peptide sequencing on a nanoscale LC–ESI/MS-MS, as described in detail by Meiring [18]. All the analyses were carried out on an LC–MS system consisting of a PHOENIX 40 (ThermoQuest Ltd., Hemel Hempstead, U.K.) and an LCQ DECA IonTrap mass spectrometer (Finnigan, San Jose, CA, USA). The peptides, after manual injection (5 μL) in a six-port valve, were trapped in a C18 trapping column (20 mm × 100 μm ID × 360 μm OD, Nanoseparations, Nieuwkoop, NL) using 100% solvent A (HPLC grade water + 0.1% v/v formic acid) at a flow rate of 5 μL/min for 10 min. A linear gradient up to 60% solvent B (acetonitrile + 0.1% v/v formic acid) over 30 min was used for analytical separation, and a pre-column splitter restrictor gave a column flow rate of 100–125 nL/min on a C18 analytical column (30 cm × 50 μm ID × 360 μm OD, Nanoseparations). Before injection of the next sample, the trapping and analytical columns were equilibrated for 10 min in 100% solvent B and for 10 min in 100% solvent A. The ESI emitter of gold-coated fused silica (5 cm × 25 μm ID × 360 μm OD, Nanoseparations) was heated to 195 °C. A high voltage of 2 kV was applied for stable spray operation. The LC pump, the mass spectrometer and the automatic mass spectra acquisitions were controlled using Xcalibur™ 1.2 software (Thermo). The MS/MS ion search was carried out in Swiss-Prot/UniprotKB databases using MASCOT. Taxonomy was limited to Homo sapiens, peptide precursor charge was set at 2+ or 3+, mass tolerance at ±1.2 Da for precursor peptides and ±0.6 Da for fragment peptides, and the acceptable number of missed cleavage sites was set at one. Alkylation of cysteine by carbamidomethylation was taken as a fixed modification, while oxidation was considered as a possible modification. We considered as significant those peptides with individual ion scores (−10 × Log[P], where P is the probability that the observed match is a random event) that indicated identity (p < 0.05).
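The ion score formula quoted above, −10 × Log[P], can be illustrated numerically. This sketch only converts between a score and the probability P that the match is a random event, taking Log as the base-10 logarithm; it does not reproduce how the search engine derives its identity threshold for a given search, and the function names are hypothetical.

```python
import math


def ion_score(p_random):
    """Score = -10 * log10(P), with P the probability of a random match."""
    return -10.0 * math.log10(p_random)


def probability_from_score(score):
    """Inverse relation: P = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)


# A random-match probability of 0.05 corresponds to a score of about 13
print(round(ion_score(0.05), 2))
# A score of 20 corresponds to P = 0.01
print(probability_from_score(20.0))
```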

2.8. Validation by Western blot

Western blot (WB) analysis was performed on 5 BAL samples for each condition to verify the reliability of the 2DE results. It was performed for plastin 2, 14-3-3ε, annexin A3 and S10A6. Aliquots of sample, prepared as previously described, were diluted in Laemmli buffer (100 mM Tris–HCl pH 6.8, 2% (w/v) SDS, 20% (v/v) glycerol, 4% (v/v) β-mercaptoethanol) [19] and heated at 95 °C for 5 min. For each condition we loaded 25 μg of protein, which was separated on a 12% polyacrylamide gel and transferred onto a nitrocellulose membrane (Hybond ECL, GE Healthcare) according to Towbin [20,21]. Before immunodetection the nitrocellulose membrane was reversibly stained with Ponceau red (0.2% w/v Ponceau S in 3% w/v trichloroacetic acid). Rabbit anti-plastin 2, anti-14-3-3ε and anti-S10A6 and mouse monoclonal anti-annexin A3 (Sigma Aldrich, St. Louis, USA) were the primary antibodies for immunodetection, used at the dilutions indicated by the manufacturer. Goat anti-mouse and goat anti-rabbit antibodies were used as secondary antibodies (Sigma Aldrich). Hybridization with primary antibodies was performed overnight at room temperature, while incubation with HRP-conjugated secondary antibodies was performed for 2 h at room temperature. Immunostained bands were visualized by chemiluminescence with an Image Quant LAS 4000 (GE Healthcare) using ECL detection reagents (GE Healthcare). The integrated density value of the bands was quantified using ImageJ software.

2.9. Statistical analysis by principal component analysis

Principal component analysis (PCA) was used to perform multivariate analysis with STATAsoft 7.0 software. For this analysis, we used the %V of each differentially expressed spot in the gels of the six different conditions. Differential analysis data were organized in a matrix in which the columns represented gel maps and the rows represented differentially expressed proteins. The matrix was loaded into the software, which produced a graphic representation of the results. The aim of PCA was to simplify the large amount of data (%V variables) by a linear transformation that projects the original variables into a new Cartesian system in which the new variables are ordered by decreasing variance: the variable with the highest variance is projected on the first axis, the second on the second axis and so on. Complexity is reduced by limiting analysis to the principal (in terms of variance) new variables. By this simplification it is possible to observe the distribution of each sample in a two-dimensional plane and easily visualize any experimental groups on the basis of protein spot expression in BAL (spot maps).

2.10. Network analysis by MetaCore

Differentially expressed protein spots, found by image analysis and then identified by MALDI ToF/ToF and LC–MS/MS, were further processed by pathway analysis using the MetaCore 6.8 network building tool (GeneGo, St. Joseph, MI, USA). MetaCore includes a manually annotated database of protein interactions and metabolic reactions obtained from the scientific literature. The gene names of the differentially expressed proteins were uploaded into the MetaCore network analysis software, version 6.8 (http://portal.genego.com) and processed using the shortest-path algorithm. The networks were visualized graphically as “nodes” representing proteins, connected by “arcs” representing protein interactions. The shortest-path algorithm makes it possible to link two uploaded experimental proteins through a single node. Using this process, hypothetical networks were built between the experimental proteins and the MetaCore database proteins. Enrichment analysis of the biological processes was based on the hypergeometric distribution algorithm and relevant pathway maps were then prioritized according to their statistical significance.

2.11. Functional classification by DAVID

The list of gene IDs of the identified differentially expressed spots was used to perform functional analysis with DAVID 6.7. The list of gene IDs was loaded into the online tool (http://david.abcc.ncifcrf.gov/) by clicking on Functional annotation clustering and selecting gene ID as identifier and gene list as list type. After submission of the list, functional classification was performed on the basis of Gene Ontology.

3. Results

3.1. Proteomic analysis

Differential proteomic analysis carried out by 2D electrophoresis and subsequent gel matching by Image Master 2D Platinum 7.0 were performed to highlight specific protein patterns in ILD patients and controls. An average of 1000 spots was detected in each gel from patients and healthy subjects and 800 of them were matched in at least 80% of the total number of spot maps. Image analysis was performed by matching the gels of each condition with their own “master gel” (Supplemental Fig. 1). The master gels were then matched among themselves to extract quantitative protein differences. We identified the spots differentially expressed between pairs of conditions on the basis of a ratio of mean percentage relative volumes (%V) greater than twofold, verified by t-test (p ≤ 0.05). A total of 15 matching groups were analyzed (PLCH-SSc, PLCH-IPF, Sar-sc, Sar-nsc, IPF-nsc, IPF-sc, PLCH-Sar, Sar-SSc, Sar-IPF, IPF-SSc, SSc-sc, SSc-nsc, PLCH-nsc, PLCH-sc, sc-nsc), revealing a total of 339 spots up- or down-regulated with a valid t-test (p < 0.05) in at least one of the 15 comparisons. These spots were processed by Kruskal–Wallis test (p ≤ 0.05). One hundred and fifty-four spots satisfying the Kruskal–Wallis test (p < 0.05) were submitted to Dunn's test for multiple comparison. Spots that differed significantly according to the non-parametric Kruskal–Wallis test were analyzed by mass spectrometry (MALDI-ToF/ToF-MS and LC–MS/MS). Seventy-seven protein spots were identified and are shown in Fig. 1, where all differentially expressed protein spots are indicated by circles and numbers. They are also reported in Table 2, which gives the spot numbers (corresponding to the numbers in Fig. 1) and other information, including Mascot search results with score, number of matched peptides and sequence coverage. Peptide sequences for proteins identified by LC–MS/MS are also included. Supplemental Table 1 reports the results of statistical analysis (Kruskal–Wallis p value, H value, FDR, and Dunn's test) for the identified protein spots, highlighting statistically significant differences between spots of patients and between spots of patients and controls.
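The figure of 15 matching groups follows from taking every unordered pair of the six conditions, which can be checked with a short sketch (condition labels as used in the text; the variable names are illustrative):

```python
from itertools import combinations

conditions = ["Sar", "IPF", "SSc", "PLCH", "sc", "nsc"]

# Every unordered pair of the six conditions: C(6, 2) = 15 comparisons
pairs = list(combinations(conditions, 2))

for first, second in pairs:
    print(f"{first}-{second}")
print(len(pairs))
```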

Fig. 1 – 2D master gel of a BALF control sample showing the identified differentially expressed spots among the six conditions (Sar, PLCH, SSc, IPF, nsc, sc). Numbers correspond to the spot numbers in Table 2. [Gel image; horizontal axis: non-linear IPG, pH 3.5 to pH 10; vertical axis: Mr (kDa).]


Table 2 – Differentially expressed protein spots identified by mass spectrometry. Peptide sequences obtained by LC–MS/MS or MALDI ToF/ToF for unambiguous identification are also included. Spot numbers correspond to those reported in Master gel in Fig. 1. No. spot

Protein name

2 3 4 5 6 7 8 12 13 14 15 17

Actin cytoplasmic 1 Actin cytoplasmic 1 Alpha-1-antichymotrypsin Alpha-1-antichymotrypsin Alpha 1 antitrypsin Alpha 1 antitrypsin Alpha 1 antitrypsin Annexin A2 Annexin A2 Annexin A3 Ig alpha-2 chain C region Immunoglobulin J chain

19 20 21 22 23 24

Pulmonary surfactant-associated Pulmonary surfactant-associated Pulmonary surfactant-associated Pulmonary surfactant-associated Alpha-2-HS-glycoprotein Alpha-2-HS-glycoprotein

25

AC

Experimental Theoretical pI/MW (kDa) pI/MW (kDa)

P60709 P60709 P01011 P01011 P01009 P01010 P01009 P07355 P07355 P12429 P01877 P01591

4.92/24.02 5.15/43.60 4.58/58.37 4.63/57.56 4.85/49.03 4.81/50428 4.78/53.95 7.37/36.15 6.86/35.44 5.58/32.12 5.20/62.68 5.38/196.14

5.2/42.05 5.29/42.05 5.33/47.80 5.33/47.80 5.37/ 46.88 5.37/46.88 5.37/46.88 7.57/38.81 7.57/38.81 5.63/36.52 5.71/37.30 4.62/15.60

Q8IWL1 Q8IWL1 Q8IWL2 P35247 P02765 P02765

4.76/30.17 4.70/30.26 4.62/34.49 6.52 /43.25 4.60/54.70 4.57/55.14

5.07/26.62 5.07/26.62 5.07/26.62 6.97/35.50 5.43/40.10 5.43/40.10

Alpha-2-HS-glycoprotein

P02765

5.14/62.18

5.43/40.10

26 27

Complement C3 alpha chain Complement C3

P01024 P01024

6.65/65.94 4.77/40.70

6.02/188.57 6.02/188.57

28 29 30 31 33 34 35 36 37 38 39 41 42 43 44 46 47

Complement Factor H Complement Factor H Complement factor B Complement factor I Protein S10A6 Antithrombin-III Antithrombin-III Angiotensinogen Apolipoprotein AI Apolipoprotein AI Apolipoprotein AI Apolipoprotein AI Apolipoprotein AI Haptoglobin Haptoglobin Serotransferrin Serotransferrin

P08603 P08603 P00751 P05156 P06703 P01008 P01008 P01019 P02647 P02647 P02647 P02647 P02647 P00738 P00738 P02787 P02787

5.46/192.01 5.55/189.30 6.44/58.47 5.59/53.12 4.93/9.51 5.16/56.94 5.20/56.78 4.97/58.47 5.18/21.94 5.03/22.27 5.08/21.94 5.21/23.10 5.10/22.89 5.31/41.13 5.67/16.46 6.28/61.35 6.43/52.55

6.21/143.68 6.21/143.68 6.67/86.85 7.38/62.49 5.33/10.23 6.32/53.03 6.32/53.03 5.87/53.41 5.56/30.76 5.56/30.76 5.56/30.76 5.56/30.76 5.56/30.77 6.13/45.86 6.13/45.86 6.81/79.30 6.81/79.29

48 51 52

Transthyretin Fatty acid-binding protein Zinc-alpha-2-glycoprotein

P02766 P15090 P25311

5.52/ 13.77 6.40/11.83 4.89/41.56

5.52/15.99 6.59/14.82 5.71/ 34.47

proteinA2 proteinA2 proteinA2 protein D

Mascot search results No. matched peptides

Sequence Score coverage (%)

7 20 8 28 17 38 17 45 13 38 14 41 8 22 15 50 14 44 12 37 6 23 SSEDPNEDIVER FVYHLSDLCK IVLVDNK 8 44 7 39 9 43 TAGFVKPFTEAQLLCTQAGGQLASPR 5 18 HTLNQIDEVK FSVVYAK CNLLAEK EHAVEGDCDFQLLK HTFMGVVSLGSPSGEVSHPR HTLNQIDEVK EHAVEGDCDFQLLK 22 21 SEETKENEGFTVTAEGK VTIKPAPETEK FYHPEKEDGK VSHSEDDCLAFK GQGTLSVVTMYHAK AKDQLTCNK ENEGFTVTAEGK SGSDEVQVGQQR VYAYYNLEESCTR VHQYFNVELIQPGAVK 15 16 11 11 14 21 HGNTDSEGIVEVK 6 64 15 41 13 38 9 23 14 45 8 29 10 36 19 53 14 46 6 16 6 19 23 35 EDPQTFYYAVAVVK DSGFQMNQLR SVIPSDGPSVACVK 7 64 6 59 8 39

101 104 186 195 184 195 111 220 209 143 78

141 125 114 73

176

119 82 135 133 168 155 106 167 112 137 232 178 76 94 263

141 110 114

67

JO U R N A L OF P ROTE O MI CS 8 3 ( 20 1 3 ) 6 0–7 5

Table 2 (continued)

No. spot | Protein name | AC | Experimental pI/MW (kDa) | Theoretical pI/MW (kDa)
53 | Zinc-alpha-2-glycoprotein | P25311 | 4.83/41.13 | 5.71/34.47
55 | Retinol-binding protein 4 | P02753 | 5.25/19.48 | 5.76/23.34
56 | 14-3-3 protein epsilon | P62258 | 4.59/29.73 | 4.63/29.33
57 | Selenium-binding protein | Q13228 | 6.04/56.78 | 5.93/52.93
58 | Calcyphosin | Q13938 | 4.66/19.10 | 4.74/21.07
59 | Peroxiredoxin-1 | Q06830 | 4.83/40.87 | 8.27/22.32
61 | Glutathione S-transferase P | P09211 | 5.24/22.66 | 5.43/23.57
62 | Glutathione S-transferase P | P09211 | 5.53/22.49 | 5.43/23.57
63 | Ceruloplasmin | P00450 | 5.05/132.12 | 5.44/122.98
64 | Ceruloplasmin | P00450 | 5.11/130.70 | 5.44/122.98
65 | Ceruloplasmin | P00450 | 5.16/130.00 | 5.44/122.99
66 | Ceruloplasmin | P00450 | 5.23/166.62 | 5.44/122.98
68 | Ceruloplasmin | P00450 | 5.27/165.44 | 5.44/122.98
69 | Albumin | P02768 | 5.50/40.87 | 5.92/71.36
70 | Albumin | P02768 | 6.38/51.03 | 5.92/71.32
71 | Albumin | P02768 | 5.38/36.12 | 5.92/71.32
72 | Albumin | P02768 | 6.03/53.40 | 5.92/71.32
73 | Albumin | P02768 | 5.18/47.23 | 5.92/71.32
74 | Albumin | P02768 | 5.89/60.54 | 5.92/71.32
75 | Albumin | P02768 | 5.38/45.74 | 5.92/71.32
76 | Albumin | P02768 | 5.44/40.87 | 5.92/71.32
77 | Albumin | P02768 | 6.00/60.22 | 5.92/71.32
79 | Albumin | P02768 | 5.65/37.66 | 5.92/71.32
80 | Albumin | P02768 | 6.09/31.55 | 5.92/71.32
81 | Albumin | P02768 | 5.69/50.63 | 5.92/71.32
82 | Albumin | P02768 | 5.50/37.66 | 5.92/71.32
83 | Albumin C-term | P02768 | 5.84/46.11 | 5.92/71.32
84 | Albumin C-term | P02768 | 5.65/47.61 | 5.92/71.32
86 | Albumin C-term | P02768 | 4.40/47.10 | 5.92/71.32
88 | Cystatin-B | P04080 | 7.44/16.17 | 6.96/11.14
89 | Plastin 2 | P13796 | 7.76/65.94 | 5.20/70.82
90 | Macrophage mannose receptor 1 | P22897 | 5.47/194.75 | 6.08/164.12
91 | Pancreatic alpha-amylase | P04746 | 6.26/37.43 | 6.45/55.89
92 | Beta-2 microglobulin | P61769 | 6.03/11.12 | 6.06/13.82
93 | Lysozyme C | P61626 | 9.22/11.57 | 9.38/16.98
95 | Serpin B3 | P29508 | 6.43/39.22 | 6.35/44.59

Mascot search results No. matched peptides

Sequence Score coverage (%)

9 39 9 45 7 34 10 27 8 47 6 30 6 39 6 48 7 8 9 11 8 10 12 16 12 18 VPQVSTPTLVEVSR KVPQVSTPTLVEVSR 5 10 10 16 9 15 9 14 10 17 AVMDDFAAFVEK YLYEIAR FQNALLVR YICENQDSISSK VPQVSTPTLVEVSR KVPQVSTPTLVEVSR DVFLGMFLYEYAR FQNALLVR RHPDYSVVLLLR LDELRDEGK AVMDDFAAFVEK YICENQDSISSK VPQVSTPTLVEVSR AVMDDFAAFVEK FQNALLVR VPQVSTPTLVEVSR KVPQVSTPTLVEVSR 10 14 13 21 9 16 4 8 6 11 11 20 VHVGDEDFVHLR SQLEEKENK SQVVAGTNYFIK VFQSLPHENKPLTLSNYQTNK GDEEGVPAVVIDMSGLR IGNFSTDIK TENLNDDEK AECMLQQAER GSVSDEEMMELR AYYHLLEQVAPK FSLVGIGGQDLNEGNR GEPSHENNR EKETMDNAR TGSGDIENYNDATQVR SGNEDEFR SNFLNCYVSGFHPSDIEVDLLK 8 39 7 24

124 114 103 110 117 100 101 100 92 120 105 126 126

73 117 100 123 103

125 148 111 61 92 125

120 101


3.2. Principal component analysis

Multivariate analysis by PCA was performed to obtain an overview of overall trends in the BAL expression dataset and to identify possible outliers among Sar, SSc, IPF and PLCH patients and non-smoker and smoker controls. Gel maps were grouped according to the variance of their protein expression; Fig. 2 shows their spatial distribution. PCA highlighted six distinct circles describing the spatial distribution of the 50 gel maps, corresponding to the six groups studied. The first principal component (PC1) explained 21% of the variance and the second (PC2) 14.61%. The plot of PC1 against PC2 (Fig. 2A) suggests that smoker control samples overlapped with those of Sar patients, indicative of a similar differential protein pattern. To determine whether the two conditions are really similar in their protein expression pattern, we performed a more detailed PCA. First, the graph of eigenvalues suggests that the first six principal components can be considered significant (Fig. S2B). We therefore plotted the spot map distribution on the first and third principal components, which still did not bring out the distinction between groups (Fig. S2A, Supplemental material). We then plotted the spot map distribution in the 2D plane defined by the first and fourth principal components (PC1 = 21% and PC4 = 11.11% explained variance). This different spatial view of the group distribution highlights the distinction between the sarcoidosis and smoker-control groups (Fig. 2B).

Fig. 2 – (A) Principal component analysis (PCA) clustered the 50 spot maps obtained from 50 BAL samples into six groups (group distribution on PC1, 21.00%, and PC2, 14.61%). Each group corresponds to one of the conditions studied: the red circle shows the PCA distribution of the smoker control gel maps (sc), the green circle the non-smoker control gel maps (nsc), the blue circle the idiopathic pulmonary fibrosis gel maps (IPF), the orange circle the systemic sclerosis gel maps (SSc), the yellow circle the pulmonary Langerhans cell histiocytosis gel maps (PLCH) and the black circle the sarcoidosis gel maps (Sar). (B) Principal component analysis graph reporting the group distribution on PC1 (21.00%) and PC4 (11.11%).

3.3. Network analysis by MetaCore

To further explore the probable role of the differentially expressed proteins, their gene names were loaded into MetaCore and processed by the shortest-path algorithm. The MetaCore database allowed us to build a biological network representing proteins potentially involved in disease pathogenesis. Fig. 3 shows the resulting network, in which proteins significantly related to the ILDs are reported. Software processing produced a network with alpha 1-antitrypsin, alpha 1-antichymotrypsin (SERPINA3), glutathione S-transferase P1 (GSTP1), 14-3-3 epsilon and albumin as “central hubs” (i.e. proteins with five or more edges in the network). Transcription factors, in particular NF-kB, p53, c-myc, PPAR-alpha and PPAR-gamma, were involved in the network through potential interactions with many of the identified proteins. Moreover, the shortest-path algorithm highlighted the involvement of other molecules in the MetaCore database, such as MIF, amyloid beta and many proteases directly inhibited by alpha 1-antitrypsin and alpha 1-antichymotrypsin, such as chymotrypsin B, leukocyte elastase, cathepsin G, thrombin, chymase, matriptase and myeloblastin. Interestingly, 14-3-3ε protein regulates transcription factors such as heat shock factor 1 (HSF1) and NF-AT4, as well as c-myc, and directly inhibits protein kinase C. The canonical pathway maps and GeneGo process networks, validated by statistical values, were evaluated by MetaCore and are reported in Table 3 as the top-ten ranking for each processing system.
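The hub criterion used for the MetaCore network (a protein with five or more edges) and the shortest-path idea can be illustrated with a small sketch. This is not MetaCore's implementation: the edge list below is a toy, undirected subset built only from relationships named in the text, and the function names `build_graph`, `find_hubs` and `shortest_path` are hypothetical. Breadth-first search is used here as one simple way to obtain a shortest path in an unweighted network.

```python
from collections import defaultdict, deque

# Toy, undirected edge list restricted to interactions named in the text.
# It is NOT the full MetaCore network and carries no direction or sign.
EDGES = [
    ("A1AT", "leukocyte elastase"),
    ("A1AT", "cathepsin G"),
    ("A1AT", "thrombin"),
    ("A1AT", "chymase"),
    ("A1AT", "matriptase"),
    ("A1AT", "myeloblastin"),
    ("14-3-3 epsilon", "HSF1"),
    ("14-3-3 epsilon", "NF-AT4"),
    ("14-3-3 epsilon", "c-myc"),
    ("14-3-3 epsilon", "protein kinase C"),
    ("PRDX1", "c-myc"),
]


def build_graph(edges):
    """Return an adjacency map {node: set(neighbours)} for an undirected graph."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def find_hubs(graph, min_edges=5):
    """Nodes with at least `min_edges` neighbours (the 'central hub' criterion)."""
    return sorted(node for node, nbrs in graph.items() if len(nbrs) >= min_edges)


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in sorted(graph[node]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    g = build_graph(EDGES)
    print("hubs:", find_hubs(g))
    print("path:", shortest_path(g, "PRDX1", "HSF1"))
```

In this toy subset only A1AT reaches the five-edge threshold; in the full network reported in Fig. 3 the other hubs would have additional edges that are not listed here.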

Fig. 3 – Major signaling network by MetaCore analysis, named “Alpha 1-antitrypsin, SERPINA3 (AACT), GSTP1, 14-3-3 epsilon, Albumin”, associated with the proteins differentially expressed among Sar, PLCH, SSc, IPF, nsc and sc. Network proteins are shown with symbols that specify the functional nature of each protein (legend categories: generic enzyme, generic kinase, protein kinase, lipid kinase, generic phosphatase, protein phosphatase, lipid phosphatase, generic phospholipase, metalloprotease, generic protease, compound, receptor ligand, transcription factor, protein, cell membrane glycoprotein, transporter, generic receptor, reaction, generic binding protein; link types: positive effect, negative effect, unspecified effect, technical link). The arcs define the relationships between individual proteins, and the arrowheads represent the direction of the interaction. The line color represents the nature of the interaction: red = negative effect, green = positive effect, gray = unspecified effect.

According to the pathway maps, immune response by the alternative complement pathway and protein folding and maturation by angiotensin system maturation (p < 0.001), as well as Slit-Robo signaling and blood coagulation (p < 0.01), were the most significant pathways involved in ILDs, while according to GeneGo process networks, regulation of response to stress, response to hormone stimulus and regulation of inflammatory response were highly significant processes (p < 0.001).
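The enrichment p values quoted for pathway maps and process networks express how unlikely the observed overlap between the identified proteins and a pathway would be by chance. The text does not describe the exact statistic used by MetaCore; the sketch below assumes a hypergeometric test, which is a common choice for this kind of over-representation analysis. The function name `enrichment_p_value` and all numbers in the usage example are hypothetical.

```python
from math import comb


def enrichment_p_value(n_universe, n_pathway, n_hits, n_overlap):
    """Upper-tail hypergeometric probability.

    Probability of seeing at least `n_overlap` pathway members when
    `n_hits` proteins are drawn at random, without replacement, from a
    universe of `n_universe` proteins that contains `n_pathway` members.
    """
    max_overlap = min(n_pathway, n_hits)
    total = comb(n_universe, n_hits)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_hits - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only: a 20,000-protein
    # background, a 40-protein pathway, 38 identified proteins, 6 of
    # which belong to the pathway.
    print(enrichment_p_value(20000, 40, 38, 6))
```

A small p value means the pathway is over-represented among the identified proteins relative to the assumed background; the result depends strongly on how that background is defined.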

3.4. Functional classification by DAVID

Functional mapping of differentially expressed proteins was performed with DAVID. This tool provides exploratory visualization that promotes discovery through information about biological processes, molecular functions and cell components, and remains linked to rich sources of biological annotation such as Gene Ontology (GO). DAVID is also useful for understanding the biological context of proteins, their involvement in various physiological pathways and their association with disease pathogenesis. Functional pathway analysis of the differentially expressed proteins suggested modulation of multiple vital pathophysiological pathways, including defense (p = 2.1E−10), inflammatory (p = 4.6E−10) and acute inflammatory responses (p = 5.9E−9). Matching the functions of the identified proteins with information in the DAVID disease database showed that many proteins were involved in lung function in general (p = 9.6E−10) and specifically in lung diseases such as COPD (p = 8.1E−5), tuberculosis (p = 1.9E−3), asthma (p = 8.3E−3) and pulmonary fibrosis (p = 2E−2).


Table 3 – Top ten-ranking GeneGo process networks and pathway maps obtained by enrichment analysis of the deregulated proteins found in the proteomic analysis.

Rank | GeneGo process network | p value
1 | Regulation of response to stress | 4.646E−21
2 | Response to hormone stimulus | 6.243E−17
3 | Regulation of inflammatory response | 9.904E−17
4 | Response to stress | 1.665E−16
5 | Response to wounding | 2.599E−16
6 | Regulation of response to external stimulus | 2.652E−16
7 | Response to organic substance | 8.690E−16
8 | Acute inflammatory response | 1.139E−15
9 | Response to stimulus | 1.766E−15
10 | Defense response | 2.520E−15

Rank | GeneGo pathway map | p value
1 | Immune response_Alternative complement pathway | 4.114E−18
2 | Protein folding and maturation_Angiotensin system maturation\Human version | 1.920E−10
3 | Protein folding and maturation_Angiotensin system maturation\Rodent version | 4.876E−10
4 | Immune response_Lectin induced complement pathway | 1.980E−08
5 | Immune response_Classical complement pathway | 3.040E−08
6 | Development_Slit-Robo signaling | 8.764E−04
7 | LRRK2 in neurons in Parkinson's disease | 1.162E−03
8 | Blood coagulation_Blood coagulation | 1.895E−03
9 | wtCFTR and delta508 traffic/Clathrin coated vesicles formation (norm and CF) | 6.344E−03
10 | Cytoskeleton remodeling_Regulation of actin cytoskeleton by Rho GTPases | 9.235E−03

DAVID made it possible to define GO biological processes, molecular functions and cell components for 38 unique proteins (Supplemental material: Table S2).

3.5. Western blot validation

To confirm some of the differences obtained by proteomics, Western blots were performed on independent BAL samples from 5 PLCH, 5 SSc, 5 IPF and 5 Sar patients and from 5 non-smoker and 5 smoker controls, different from the cohort used for proteomic analysis. Validation was performed for annexin A3, plastin 2, S10A6 and 14-3-3ε, which were considered relevant because of their direct interaction with some important transcription factors highlighted by MetaCore analysis and because of their functional characteristics. For these four proteins, we confirmed the 2DE data by 1D Western blot using normalized relative integrated density values of the detected bands (Fig. 4A, B). The bands were normalized to the integrated density value of a protein band visualized on the nitrocellulose membrane stained with Ponceau S reagent. The reference protein used for the normalization step was not identified, but it is constantly and equally expressed in the six conditions, as shown in the Fig. 4A panels. The histograms in the Fig. 4B panels show the normalized mean relative integrated density ± standard deviation. The significance of changes in expression was analyzed by the Mann–Whitney test: * = p ≤ 0.05; ** = p ≤ 0.01. The trends of annexin A3, plastin 2, S10A6 and 14-3-3ε expression obtained by proteomic analysis are also shown graphically in the Fig. 4C panels. The y-axis shows the spot percentage of relative volume and the x-axis shows the conditions under which the protein spot was analyzed. Dunn's test p values are reported: * = p ≤ 0.05; ** = p ≤ 0.01; *** = p ≤ 0.001.
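The normalization and significance test described for the Western blots can be sketched as follows. This is an illustrative reconstruction, not the authors' software: the function names `normalize`, `u_statistic` and `mann_whitney_exact` and the density values are hypothetical, and the p value is computed exactly by enumerating all relabelings of the pooled samples, which is practical only for small groups such as 5 versus 5.

```python
from itertools import combinations


def normalize(band_densities, reference_densities):
    """Divide each band density by the Ponceau S reference band of its lane."""
    return [band / ref for band, ref in zip(band_densities, reference_densities)]


def u_statistic(x, y):
    """Mann-Whitney U: number of (x_i, y_j) pairs with x_i > y_j; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def mann_whitney_exact(x, y):
    """Two-sided exact p value from all ways of splitting the pooled data."""
    pooled = list(x) + list(y)
    n_x, n = len(x), len(pooled)
    mean_u = n_x * len(y) / 2
    observed = abs(u_statistic(x, y) - mean_u)
    extreme = 0
    total = 0
    for idx in combinations(range(n), n_x):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Count relabelings at least as far from the null mean as observed.
        if abs(u_statistic(group_x, group_y) - mean_u) >= observed - 1e-12:
            extreme += 1
    return extreme / total


if __name__ == "__main__":
    # Hypothetical integrated densities for five lanes per group.
    ssc = normalize([420, 455, 390, 470, 440], [100, 105, 98, 110, 102])
    sar = normalize([150, 170, 140, 165, 160], [101, 99, 100, 104, 97])
    print(mann_whitney_exact(ssc, sar))
```

With five samples per group there are 252 possible splits, so the smallest two-sided exact p value is 2/252 (about 0.008) when there are no ties. This is consistent with the Western blot panels reporting significance only at the * and ** levels.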

Annexin A3 was validated, confirming its statistically significant up-regulation in BAL from SSc and PLCH patients with respect to Sar patients. In the same way, protein S10A6 was validated confirming its over-expression in SSc patients with respect to Sar and non-smoker controls. WB of 14-3-3ε confirmed the 2DE data regarding its up-regulation in SSc with respect to non-smoker controls and in PLCH with respect to IPF, smoker controls and non-smoker controls. Plastin 2 validation confirmed its up-regulation in BAL from smoker controls with respect to IPF and Sar patients (Fig. 4).

4. Discussion

An interesting application of proteomics is the analysis of the protein composition of fluid recovered by BAL. BAL is safe, minimally invasive, repeatable at different times, and can also be performed in patients with partial respiratory function deficit. Application of BAL to the study of ILD has contributed greatly to knowledge of the immunopathological mechanisms of these diseases, providing insights for diagnosis and prognosis. In this study, BAL samples from ILD patients were evaluated by a proteomic approach combined with multivariate, functional and pathway analysis to address the complexity of ILD pathogenesis. BAL from 50 subjects (9 with Sar, 7 with SSc, 7 with IPF, 9 with PLCH, 10 non-smoker controls and 8 smoker controls) was resolved by 2D electrophoresis and compared by image analysis. Proteomic analysis detected several differentially expressed spots among the six groups of subjects, 77 of which were unambiguously identified by mass spectrometry. The spots were studied by multivariate statistical analysis using PCA to obtain an overview of the proteomic data and to identify possible outliers and trends in the BAL expression data of the six groups. PCA demonstrated that PLCH, Sar, IPF, SSc, smoker control and non-smoker control gel maps clustered into six distinct groups, highlighting the consistent reproducibility of the biological samples of each condition and the six differential protein patterns (see the spatial view reported in Fig. 2B for Sar and sc). In particular, it was interesting to note that all IPF spot maps clustered near each other in the upper right quadrant, far from the other conditions, suggesting that the differential protein pattern of IPF is characteristic of the pathology and not related to genetic differences between IPF patients. The position of the IPF cluster probably reflects the severity of fibrotic involvement in this disease with respect to the other ILDs, while the two lung granulomatoses (PLCH and Sar) showed small variance and similar behavior, both falling in the bottom left quadrant. The smoker control group clustered near PLCH and Sar, probably reflecting the involvement of smoking in the onset of several granulomatoses [22].
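The percentages attached to each principal component (for example PC1 = 21%) are explained variance ratios: each eigenvalue of the covariance matrix of spot volumes divided by the sum of all eigenvalues. The sketch below shows that calculation in plain Python. It is not the software used for the analysis; the eigenvalues are obtained here by power iteration with deflation, a simple substitute chosen to keep the example dependency-free, and the function names and the small data matrix are hypothetical.

```python
def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (one row per gel map)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def dominant_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    p = len(matrix)
    vec = [1.0 / (k + 1) for k in range(p)]  # arbitrary deterministic start
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sum(v * v for v in nxt) ** 0.5
        if norm < 1e-12:  # no variance left to extract
            return 0.0, vec
        vec = [v / norm for v in nxt]
        # Rayleigh quotient of the current unit vector.
        value = sum(vec[i] * matrix[i][j] * vec[j]
                    for i in range(p) for j in range(p))
    return value, vec


def explained_variance_ratios(rows, n_components=2):
    """Fraction of total variance carried by each of the leading components."""
    cov = covariance_matrix(rows)
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))
    if total == 0:
        raise ValueError("data have zero variance")
    ratios = []
    for _ in range(n_components):
        value, vec = dominant_eigenpair(cov)
        ratios.append(value / total)
        # Deflation: subtract the component just found.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    return ratios


if __name__ == "__main__":
    # Hypothetical relative spot volumes: 4 gel maps x 3 spots.
    volumes = [[0.8, 1.2, 0.3], [1.1, 0.9, 0.4], [2.0, 0.2, 0.9], [1.9, 0.3, 1.1]]
    print(explained_variance_ratios(volumes, n_components=2))
```

Power iteration can converge slowly when two eigenvalues are close, and the fixed tolerance assumes data of moderate scale, so a dedicated linear algebra routine would be preferable for real spot-volume matrices.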

Non-smoker controls revealed a distinct expression profile clearly separated from ILD patients and smoker controls. Based on these results, PCA highlighted differential protein patterns characteristic of each group. Pathway analysis by MetaCore is considered important for extracting notions and hypotheses from the vast amount of proteomic data. In this preliminary work, we included all the identified proteins because we wanted to highlight the pathways common to the four ILDs considered. Recent findings suggest a potential common pathway for different interstitial lung diseases, such as accelerated senescence of the pulmonary parenchyma determined by telomere dysfunction and/or a variety of genetic predisposing factors and by the noxious activity of cigarette smoke-induced oxidative damage [23–26]. This study therefore focused on potential common pathogenetic pathways characteristic of different ILDs.

Fig. 4 panels (first part): annexin A3 (32 kDa) and plastin 2 (66 kDa).

Fig. 4 – (A) Validation by Western blot of four differentially expressed proteins: annexin A3, S10A6, plastin 2 and 14-3-3 epsilon. BAL samples from PLCH, Sar, SSc, IPF, sc and nsc were used to perform WB. (B) Histograms show normalized mean relative integrated density ± standard deviation values. The significance of expression changes was assessed by the Mann–Whitney test: * = p ≤ 0.05; ** = p ≤ 0.01. (C) Graphical representation of the protein expression trend by proteomic analysis. The y-axis shows the spot percentage of relative volume and the x-axis shows the conditions under which the protein spot was analyzed. Dunn's test p values are reported in the figure: * = p ≤ 0.05; ** = p ≤ 0.01; *** = p ≤ 0.001.

Fig. 4 (continued). Panels: 14-3-3ε (30 kDa) and S10A6 (10 kDa).

Pathway analysis is based on the concept that the function of a protein depends directly on the context in which it acts, and MetaCore correlates proteins identified by 2DE/MS with the cellular pathways hypothetically affected by different ILDs. In our study, most of the differentially expressed proteins identified in BAL (82%) were present in the MetaCore network obtained (Fig. 3). Alpha 1-antitrypsin (A1AT), serpinA3 (AACT), glutathione S-transferase P1 (GSTP1), 14-3-3ε and albumin were the “functional hubs” of the map. A1AT proved to be a crucial anti-protease that directly inhibits different proteins, such as leukocyte elastase, cathepsin G, thrombin, chymase, matriptase, myeloblastin and amyloid beta. It is well known that A1AT plays a protective role against inflammatory lung damage, pulmonary fibrosis [27] and autoimmune-mediated tissue injury [28]. Similarly, AACT, another anti-protease of crucial interest, proved to be linked to amyloid beta and other proteases. The potential role of protease/antiprotease imbalance in the pathogenesis of ILDs has already been demonstrated, especially for PLCH and IPF [8,9]. According to MetaCore analysis, glutathione S-transferase P1, a protein involved in antioxidant defenses, and 14-3-3ε, a regulatory protein with multiple functions, are directly linked to many transcription factors such as NF-kB, p53 and c-myc.

In particular, 14-3-3ε proved to be a central molecule linked to heat shock factor 4 and playing a role in the regulation of cell differentiation, proliferation and transformation [29]. Proteomic analysis revealed that 14-3-3ε was up-regulated in SSc and PLCH with respect to non-smoker controls and in PLCH with respect to IPF and smoker controls. Notably, these findings were confirmed by WB, supporting our hypothesis that 14-3-3ε could play a key role in the pathogenesis of ILD. Interestingly, altered expression of 14-3-3 proteins was also found in lung adenocarcinoma [30].

An important feature of MetaCore analysis is the possibility of extrapolating key molecules (e.g. transcriptional and regulatory factors) that are involved in specific pathways but are difficult to visualize directly by 2DE and computer analysis. Accordingly, almost 40% of the proteins identified and loaded into MetaCore proved to be related to NF-kB, a crucial transcription factor. These NF-kB-related proteins included several antioxidant molecules, such as peroxiredoxin 1 (PRDX1) and GSTP1, already studied as proteins involved in antioxidant defenses in IPF, SSc and Sar [31]. The potential pathogenetic role of the peroxiredoxin family in the regulation of immunoinflammatory mechanisms occurring in ILDs is also suggested by our previous proteomic studies [8,10]. Cytoplasmic PRDX1 is known to suppress NF-kB activation by eliminating peroxides, while nuclear PRDX1 enhances its activity [32]. In turn, NF-kB activation is able to regulate lung inflammation and injury [33]. In the network produced by MetaCore, PRDX1 was linked to MIF or cyclophilin A, another immunoinflammatory protein with interesting multiple functions (including regulation of NK cell activity) that we found to be over-expressed in BAL and tissues of patients with ILDs, in particular IPF patients [34]. NF-kB is also a critical transcription factor in our network analysis because it is connected with proteins that DAVID functional analysis showed to be involved in inflammatory response, such as factor B and fetuin A, in immune response, such as beta 2 microglobulin and plastin 2, and in cellular proliferation, such as S10A6, all highlighted in our study. In particular, plastin 2 is a protein identified for the first time in BAL in our previous work, in which it was found up-regulated in sc with respect to PLCH [9]. In the present study, plastin 2 was also found up-regulated in sc with respect to ILDs such as Sar and IPF. These results are supported by the WB validation, and we believe that plastin 2 could play an important role in the pathophysiology of ILDs. This protein is a member of a large family of actin filament cross-linkers that trigger immune response, cell migration, proliferation and cell adhesion [35], and its role in actin cytoskeleton rearrangement and T-cell activation is crucial. Another function of plastin 2 is protection against TNF cytotoxicity [36]. As cigarette smoke may induce production of tumor necrosis factor-alpha (TNF-α) by alveolar macrophages [37], up-regulation of plastin 2 in BAL of smokers may have a protective role against this pro-inflammatory cytokine. S10A6, or calcyclin, is an interesting protein found significantly up-regulated in SSc with respect to Sar and nsc. These results are supported by WB validation, in which S10A6 showed a weak signal that was nevertheless considerably more intense in SSc samples than in the others. Our interest is due to its particular role in various stressful conditions such as oxidative stress and hypoxia. This protein is expressed mostly in fibroblasts and epithelial cells. It has been shown that S10A6 over-expression leads to a higher cell proliferation rate and sensitizes cells to apoptosis [38].

On the other hand, PPARγ, another transcription factor highlighted by our MetaCore analysis, blocks expression of inflammatory genes and promotes expression of anti-inflammatory genes by interacting with NF-kB [33,39]. PPARγ is linked to proteins involved in defense responses (haptoglobin, alpha 1-antichymotrypsin) and positively induces proteins implicated in immune responses (apolipoprotein A1, retinol binding protein 4), as revealed by our analysis. PPARγ ligands are fatty acid derivatives [40]. In our MetaCore network, for example, fatty acid binding protein 4 interacted with this transcription factor. Annexin A3 (ANXA3) is another interesting protein that MetaCore analysis showed to be connected with PPARγ. The proteomic analysis showed that ANXA3 was up-regulated in SSc, PLCH and IPF with respect to Sar. WB validation confirmed its up-regulation in SSc and PLCH with respect to Sar. Our interest in ANXA3 stems from its anticoagulant role and its involvement in cell proliferation, motility, invasiveness and signaling pathways [41–43]. Interestingly, in our previous study it was found up-regulated in BAL of PLCH with respect to sc and nsc [9]. Up-regulation of ANXA3 was found to be correlated with the metastatic process of lung adenocarcinoma and hepatocarcinoma [44].


C-myc and p53 are two other important transcription factors brought out by MetaCore analysis. Interestingly, PRDX1 interacts with c-myc, suppressing the regulation of some c-myc targets and inhibiting cell growth. C-myc oncogene products influence many cell processes, including growth, cell cycle progression, apoptosis and differentiation [32]. Gene products revealed by MetaCore analysis, including NF-kB, p53, c-myc, PPARγ and PPARα, are crucial molecules that cannot be detected by proteomic analysis. We therefore evaluated expression of these transcription factors in BAL by WB but, as they are mainly cellular factors, they were undetectable in BAL samples (data not shown). For this reason, their real pathogenetic value can only be evaluated using cellular samples.

MetaCore also made it possible to characterize the biological attributes of protein sets by enrichment analysis. The GO dataset provides a central collection of such attributes, already known and assigned to groups of proteins. According to our results, the significant canonical pathway maps that could be involved in ILDs were: immune response by the alternative complement pathway, protein folding and maturation by the angiotensin system, Slit-Robo signaling and blood coagulation (Table 3, Fig. S4A, B). Activation of the alternative complement pathway and of blood coagulation mechanisms is the subject of recent interest in the field of ILD pathogenesis. Researchers have evaluated abnormalities in alveolar coagulation in acute and chronic lung injury, especially systemic sclerosis and IPF. A pro-coagulant state has been suggested in patients with IPF in stable or acute phase [45–47], and a potential indication for heparin treatment was also recently proposed in this ILD [48]. It is recognized that the physiological function of the coagulation cascade extends beyond blood coagulation. Indeed, this cascade plays a pivotal role in inflammatory and repair responses to tissue injury. Uncontrolled coagulation activity contributes to the pathophysiology of several conditions, including inflammatory diseases and acute and chronic lung injury [46].

Protein folding is an interesting field of study in view of recent findings that protein folding in the endoplasmic reticulum (ER) is affected by cigarette smoke [49,50]. Moreover, Korfei et al. showed that in the lungs of sporadic IPF patients, specifically in type II alveolar epithelial cells, several proteins involved in the unfolded protein response were significantly up-regulated. This mechanism acts to overcome accumulation of unfolded proteins in the ER, inducing the stress-mediated apoptosis pathway when the accumulation cannot be resolved. These findings suggested the involvement of unfolded proteins in the induction of apoptosis in type II alveolar epithelial cells, leading to onset of IPF [51]. Interestingly, Slit/Robo involvement has been observed in embryonic lung development, but its exact role is still unknown [52]. The main function of Slit-Robo signaling is to regulate migration of leukocytes, eosinophils, endothelial cells and cancer cells during lung inflammation. Other studies have also described involvement of Slit and its receptor in angiogenesis [53,54]. Again, enrichment analysis opened a wide range of new possibilities that need to be validated. MetaCore software made it possible to extract GeneGo process networks from the differentially expressed proteins we identified. Regulation of stress, hormone stimulus and inflammatory responses showed high statistical significance (Table 3).


It is interesting that stress responses are particularly involved in ILD pathogenesis. In IPF, it has been shown that oxidative stress and the consequent generation of free radicals induce epithelial-to-mesenchymal transition in lung epithelium via a TGF-β-dependent mechanism, promoting fibrotic repair [55,56]. Up-regulated proteins involved in stress have been highlighted in sporadic IPF patients; they include not only DNA damage binding proteins and heat shock proteins but also proteins involved in the unfolded protein response. In IPF, the ER-stress response seems to be fully activated in type II alveolar cells, significantly disturbing the protein synthesis and degradation machinery [51]. Moreover, if accumulation of unfolded or misfolded proteins cannot be overcome, ER-stress-mediated apoptosis is induced in the cells concerned. Different responses to different stressors at lung level can induce fibrosis, deranged protein synthesis and apoptosis.

5. Conclusions

Proteomic analysis of BAL from patients with different interstitial lung diseases (sarcoidosis, idiopathic pulmonary fibrosis, pulmonary Langerhans cell histiocytosis and fibrosis associated with systemic sclerosis) and from smoker and non-smoker controls allowed us to identify a large number of differentially expressed proteins with different biological functions (such as immune response, inflammation, coagulation, oxidative stress and anti-protease activity), potentially involved in ILD pathogenesis. To corroborate our results, principal component analysis confirmed the high reproducibility of biological replicates and distinct pattern profiles, primarily in IPF, the severest of all progressive fibrotic ILDs. Based on the concept that the function of a protein depends directly on the biological context in which it acts, MetaCore network analysis extracted new knowledge, hypotheses and emerging properties from the proteomic data. Interestingly, the transcription factors NF-kB, PPARγ, PPAR-α, c-myc and p53 emerged, as well as a group of functional hubs. These results confirm the importance of using pathway and functional analyses to correlate protein variations in biological fluids with cellular factors that could be directly or indirectly involved in the perturbed biological state. Enrichment analysis suggested that GeneGo process networks and pathway maps involved in ILDs included coagulation, defective protein folding and Slit-Robo signaling. In conclusion, the combination of proteomic data with systems biology platforms allowed us to amplify the information obtained by processing the results and indicated the principal pathways involved. This information can point to potential biomarkers and new therapeutic targets, opening the way for further analysis.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2013.03.006.

Acknowledgment

This work was supported by the FIRB project “Italian Human ProteomeNet” (BRN07BMCT_013) from the MIUR. The authors are grateful to Ms. Helen Hampt for careful English revision.

REFERENCES

[1] Jain V, Hasselquist S, Delaney MD. PET scanning in sarcoidosis. Ann N Y Acad Sci 2011;1228:46–58.
[2] O'Connell OJ, Kennedy MP, Henry MT. Idiopathic pulmonary fibrosis: treatment update. Adv Ther 2011;28(11):986–99.
[3] Raghu G. Idiopathic pulmonary fibrosis: guidelines for diagnosis and clinical management have advanced from consensus-based in 2000 to evidence-based in 2011. Eur Respir J 2011;37(4):743–6.
[4] Ling CH, Ji C, Raymond DP, Bourne PA, Xu HD. Uncommon features of pulmonary Langerhans' cell histiocytosis: analysis of 11 cases and a review of the literature. Chin Med J 2010;123(4):498–501.
[5] Sundar KM, Gosselin MV, Chung HL, Cahill BC. Pulmonary Langerhans cell histiocytosis: emerging concepts in pathobiology, radiology, and clinical evolution of disease. Chest 2003;123:1673–83.
[6] Tazi A. Adult pulmonary Langerhans cell histiocytosis. Eur Respir J 2006;27:1272–85.
[7] Hassoun PM. Lung involvement in systemic sclerosis. Presse Med 2011;40(1 Pt 2):3–17.
[8] Magi B, Bargagli E, Bini L, Rottoli P. Proteome analysis of BAL in lung diseases. Proteomics 2006;6(23):6354–69.
[9] Landi C, Bargagli E, Magi B, Prasse A, Muller-Quernheim J, Bini L, et al. Proteome analysis of bronchoalveolar lavage in pulmonary Langerhans cell histiocytosis. J Clin Bioinform 2011;1:31.
[10] Rottoli P, Magi B, Perari MG, Liberatori S, Nikiforakis N, Bargagli E, et al. Cytokine profile and proteome analysis in bronchoalveolar lavage of patients with sarcoidosis, pulmonary fibrosis associated with systemic sclerosis and idiopathic pulmonary fibrosis. Proteomics 2005;5(5):1423–30.
[11] Bargagli E, Bigliazzi C, Leonini A, Nikiforakis N, Perari MG, Rottoli P. Tryptase concentrations in bronchoalveolar lavage from patients with chronic eosinophilic pneumonia. Clin Sci (Lond) Mar 2005;108(3):273–6.
[12] Rottoli P, Magi B, Cianti R, Bargagli E, Vagaggini C, Nikiforakis N, et al. Carbonylated proteins in BAL of patients with sarcoidosis, pulmonary fibrosis associated with systemic sclerosis and idiopathic pulmonary fibrosis. Proteomics Jul 2005;5(10):2612–8.
[13] Bradford MM. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 1976;72:248–54.
[14] Bjellqvist B, Pasquali C, Ravier F, Sanchez JC, Hochstrasser D. A nonlinear wide-range immobilized pH gradient for two-dimensional electrophoresis and its definition in a relevant pH scale. Electrophoresis 1993;14:1357–65.
[15] Hochstrasser DF, Harrington MG, Hochstrasser AC, Miller MJ, Merril CR. Methods for increasing the resolution of two-dimensional protein electrophoresis. Anal Biochem Sep 1988;173(2):424–35.
[16] Oakley BR, Kirsch DR, Morris NR. A simplified ultrasensitive silver stain for detecting proteins in polyacrylamide gels. Anal Biochem 1980;105:361–3.
[17] Hochstrasser DF, Patchornik A, Merril CR. Development of polyacrylamide gels that improve the separation of proteins and their detection by silver staining. Anal Biochem 1988;173:412–23.
[18] Cañas B, Piñeiro C, Calvo E, López-Ferrer D, Gallardo JM. Trends in sample preparation for classical and second generation proteomics. J Chromatogr A 2007;1153(1–2):235–58.
[19] Meiring HD, Van der Heeft E, Ten Hove GJ, De Jong A. Nanoscale LC–MS(n) technical design and applications to peptide and protein analysis. J Sep Sci 2002;25:557–68.
[20] Gallagher SR. One-dimensional SDS gel electrophoresis of proteins. Curr Protoc Protein Sci 2012;10:44.


[21] Minter M, Towbin J, Harter J, McCabe ER. Enzyme product blot for nondestructive assay of protein catalytic function in polyacrylamide gels. Anal Biochem 1989;178(1):22–6.
[22] Vassallo R. Diffuse lung diseases in cigarette smokers. Semin Respir Crit Care Med Oct 2012;33(5):533–42 [Epub 2012 Sep 21].
[23] Antoniou KM, Margaritopoulos GA, Proklou A, Karagiannis K, Lasithiotaki I, Soufla G, et al. Investigation of Telomerase/telomeres system in bone marrow mesenchymal stem cells derived from IPF and RA-UIP. J Inflamm (Lond) Jul 2 2012;9(1):27.
[24] Chilosi M, Poletti V, Rossi A. The pathogenesis of COPD and IPF: distinct horns of the same devil? Respir Res Jan 11 2012;13:3.
[25] Königshoff M, Kramer M, Balsara N, Wilhelm J, Amarie OV, Jahn A, et al. WNT1-inducible signaling protein-1 mediates pulmonary fibrosis in mice and is upregulated in humans with idiopathic pulmonary fibrosis. J Clin Invest Apr 2009;119(4):772–87.
[26] Wang Y, Huang C, Reddy Chintagari N, Bhaskaran M, Weng T, Guo Y, et al. miR-375 regulates rat alveolar epithelial cell trans-differentiation by inhibiting Wnt/β-catenin pathway. Nucleic Acids Res Feb 8 2013. http://dx.doi.org/10.1093/nar/gks1460 [Epub ahead of print].
[27] Song JS, Kang CM, Rhee CK, Yoon HK, Kim YK, Moon HS, et al. Effects of elastase inhibitor on the epithelial cell apoptosis in bleomycin-induced pulmonary fibrosis. Exp Lung Res 2009;35(10):817–29.
[28] Ludwicka-Bradley A, Silver RM, Bogatkevich GS. Coagulation and autoimmunity in scleroderma interstitial lung disease. Semin Arthritis Rheum 2011;41(2):212–22.
[29] Qi W, Liu X, Qiao D, Martinez JD. Isoform-specific expression of 14-3-3 proteins in human lung cancer tissues. Int J Cancer 2005;113(3):359–63.
[30] Bortner Jr JD, Das A, Umstead TM, Freeman WM, Somiari R, Aliaga C, et al. Down-regulation of 14-3-3 isoforms and annexin A5 proteins in lung adenocarcinoma induced by the tobacco-specific nitrosamine NNK in the A/J mouse revealed by proteomic analysis. J Proteome Res 2009;8(8):4050–61.
[31] Montaldo C, Cannas E, Ledda M, Rosetti L, Congiu L, Atzori L. Bronchoalveolar glutathione and nitrite/nitrate in idiopathic pulmonary fibrosis and sarcoidosis. Sarcoidosis Vasc Diffuse Lung Dis 2002;19(1):54–8.
[32] Ishii T, Warabi E, Yanagawa T. Novel roles of peroxiredoxins in inflammation, cancer and innate immunity. J Clin Biochem Nutr 2012;50(2):91–105.
[33] Gea-Sorlí S, Guillamat R, Serrano-Mollar A, Closa D. Activation of lung macrophage subpopulations in experimental acute pancreatitis. J Pathol 2011;223(3):417–24.
[34] Bargagli E, Olivieri C, Nikiforakis N, Cintorino M, Magi B, Perari MG, et al. Analysis of macrophage migration inhibitory factor (MIF) in patients with idiopathic pulmonary fibrosis. Respir Physiol Neurobiol 2009;167(3):261–7.
[35] Janji B, Giganti A, De Corte V, Catillon M, Bruyneel E, Lentz D, et al. Phosphorylation on Ser5 increases the F-actin-binding activity of L-plastin and promotes its targeting to sites of actin assembly in cells. J Cell Sci 2006;119(Pt 9):1947–60.
[36] Janji B, Vallar L, Al Tanoury Z, Bernardin F, Vetter G, Schaffner-Reckinger E, et al. The actin filament cross-linker L-plastin confers resistance to TNF-alpha in MCF-7 breast cancer cells in a phosphorylation-dependent manner. J Cell Mol Med 2010;14(6A):1264–75.
[37] Petrescu F, Voican SC, Silosi I. Tumor necrosis factor-alpha serum levels in healthy smokers and nonsmokers. Int J Chron Obstruct Pulmon Dis 2010;5:217–22.


[38] Słomnicki LP, Leśniak W. S100A6 (calcyclin) deficiency induces senescence-like changes in cell cycle, morphology and functional characteristics of mouse NIH 3T3 fibroblasts. J Cell Biochem 2010;109(3):576–84.
[39] Kulkarni AA, Woeller CF, Thatcher TH, Ramon S, Phipps RP, Sime PJ. Emerging PPARγ-independent role of PPARγ ligands in lung diseases. PPAR Res 2012;2012:705352.
[40] Reddy RC. Immunomodulatory role of PPAR-gamma in alveolar macrophages. J Investig Med 2008;56(2):522–7.
[41] Rescher U, Gerke V. Annexins — unique membrane binding proteins with diverse functions. J Cell Sci 2004;117(Pt 13):2631–9.
[42] Park JE, Lee DH, Lee JA, Park SG, Kim NS, Park BC, et al. Annexin A3 is a potential angiogenic mediator. Biochem Biophys Res Commun 2005;337(4):1283–7.
[43] Liu YF, Xiao ZQ, Li MX, Li MY, Zhang PF, Li C, et al. Quantitative proteome analysis reveals annexin A3 as a novel biomarker in lung adenocarcinoma. J Pathol 2009;217(1):54–64.
[44] Wu N, Liu S, Guo C, Hou Z, Sun MZ. The role of annexin A3 playing in cancers. Clin Transl Oncol 2012;15(2):106–10.
[45] Markart P, Nass R, Ruppert C, Hundack L, Wygrecka M, Korfei M, et al. Safety and tolerability of inhaled heparin in idiopathic pulmonary fibrosis. J Aerosol Med Pulm Drug Deliv 2010;23(3):161–72.
[46] Scotton CJ, Krupiczojc MA, Königshoff M, Mercer PF, Lee YC, Kaminski N, et al. Increased local expression of coagulation factor X contributes to the fibrotic response in human and murine lung injury. J Clin Invest 2009;119(9):2550–63.
[47] Collard HR, Calfee CS, Wolters PJ, Song JW, Hong SB, Brady S, et al. Plasma biomarker profiles in acute exacerbation of idiopathic pulmonary fibrosis. Am J Physiol Lung Cell Mol Physiol 2010;299(1):3–7.
[48] Tuinman PR, Dixon B, Levi M, Juffermans NP, Schultz MJ. Nebulized anticoagulants for acute lung injury — a systematic review of preclinical and clinical investigations. Crit Care 2012;16(2).
[49] Kenche H, Baty CJ, Vedagiri K, Shapiro SD, Blumental-Perry A. Cigarette smoking affects oxidative protein folding in endoplasmic reticulum by modifying protein disulfide isomerase. FASEB J Mar 2013;27(3):965–77.
[50] Kelsen SG. Respiratory epithelial cell responses to cigarette smoke: the unfolded protein response. Pulm Pharmacol Ther Dec 2012;25(6):447–52.
[51] Korfei M, Schmitt S, Ruppert C, Henneke I, Markart P, Loeh B, et al. Comparative proteomic analysis of lung tissue from patients with idiopathic pulmonary fibrosis (IPF) and lung transplant donor lungs. J Proteome Res 2011;10(5):2185–205.
[52] Nasarre P, Potiron V, Drabkin H, Roche J. Guidance molecules in lung cancer. Cell Adh Migr 2010;4(1):130–45.
[53] Ye BQ, Geng ZH, Ma L, Geng JG. Slit2 regulates attractive eosinophil and repulsive neutrophil chemotaxis through differential srGAP1 expression during lung inflammation. J Immunol 2010;185(10):6294–305.
[54] Dickinson RE, Duncan WC. The SLIT-ROBO pathway: a regulator of cell function with implications for the reproductive system. Reproduction 2010;139(4):697–704.
[55] Gorowiec MR, Borthwick LA, Parker SM, Kirby JA, Saretzki GC, Fisher AJ. Free radical generation induces epithelial-to-mesenchymal transition in lung epithelium via a TGF-β1-dependent mechanism. Free Radic Biol Med 2012;52(6):1024–32.
[56] Ha B, Kim EK, Kim JH, Lee HN, Lee KO, Lee SY, et al. Human peroxiredoxin 1 modulates TGF-β1-induced epithelial-mesenchymal transition through its peroxidase activity. Biochem Biophys Res Commun 2012;421(1):33–7.