Functional roles and substrate specificities of twelve cytochromes P450 belonging to CYP52 family in n-alkane assimilating yeast Yarrowia lipolytica


Accepted Manuscript

PII: S1087-1845(16)30034-2
DOI: http://dx.doi.org/10.1016/j.fgb.2016.03.007
Reference: YFGBI 2957
To appear in: Fungal Genetics and Biology
Received Date: 9 December 2015
Revised Date: 16 March 2016
Accepted Date: 29 March 2016

Please cite this article as: Iwama, R., Kobayashi, S., Ishimaru, C., Ohta, A., Horiuchi, H., Fukuda, R., Functional Roles and Substrate Specificities of Twelve Cytochromes P450 Belonging to CYP52 family in n-Alkane Assimilating Yeast Yarrowia lipolytica, Fungal Genetics and Biology (2016), doi: http://dx.doi.org/10.1016/j.fgb.2016.03.007

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Ryo Iwama1, Satoshi Kobayashi1, Chiaki Ishimaru1, Akinori Ohta2, Hiroyuki Horiuchi1, and Ryouichi Fukuda1*

1 Department of Biotechnology, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan
2 Department of Biological Chemistry, College of Bioscience and Biotechnology, Chubu University, 1200 Matsumoto-cho, Kasugai, Aichi 487-8501, Japan

* To whom correspondence should be addressed: Department of Biotechnology, The University of Tokyo, Yayoi 1-1-1, Bunkyo-ku, Tokyo, 113-8657, Japan. Tel.: +81-3-5841-5178; Fax: +81-3-5841-8015; E-mail: [email protected]

Running title: CYP52 family P450s in Yarrowia lipolytica


Abstract

Yarrowia lipolytica possesses twelve ALK genes, which encode cytochromes P450 in the CYP52 family. In this study, using a Y. lipolytica strain from which all twelve ALK genes had been deleted, strains individually expressing each of the ALK genes were constructed and their roles and substrate specificities were determined by observing their growth on n-alkanes and analyzing fatty acid metabolism. The results suggested that the twelve Alk proteins can be categorized into four groups based on their substrate specificity: Alk1p, Alk2p, Alk9p, and Alk10p, which have significant activities to hydroxylate n-alkanes; Alk4p, Alk5p, and Alk7p, which have significant activities to hydroxylate the ω-terminal end of dodecanoic acid; Alk3p and Alk6p, which have significant activities to hydroxylate both n-alkanes and dodecanoic acid; and Alk8p, Alk11p, and Alk12p, which showed faint or no activities to oxidize these substrates. The involvement of Alk proteins in the oxidation of fatty alcohols and fatty aldehydes was also analyzed by measuring the viability of the mutant deleted for twelve ALK genes in medium containing dodecanol and by analyzing growth on dodecanal of a mutant strain in which twelve ALK genes were deleted along with four fatty aldehyde dehydrogenase genes. It was suggested that ALK gene(s) is/are involved in the detoxification of dodecanol and the assimilation of dodecanal. These results imply that genes encoding CYP52-family P450s have undergone multiplication and diversification in Y. lipolytica for assimilation of various hydrophobic compounds.

Keywords: Yarrowia lipolytica; cytochrome P450; n-alkane; fatty acid


1. Introduction

Cytochromes P450 (P450s or CYPs) are hemoproteins that are conserved across organisms in every kingdom, and they constitute a large protein superfamily (Nelson, 2009). P450s catalyze a wide variety of oxidative reactions of various compounds. They are involved in various cellular processes, including eukaryotic sterol biosynthesis, xenobiotic degradation, and secondary metabolite production (Nelson, 2013).

P450s belonging to the CYP52 family are widely distributed in various yeasts that can assimilate n-alkanes, e.g. Candida maltosa (Ohkuma et al., 1995; Ohkuma et al., 1991), Candida tropicalis (Seghezzi et al., 1992), Debaryomyces hansenii (Yadav and Loper, 1999), Candida albicans (Kim et al., 2007; Panwar et al., 2001), Lodderomyces elongisporus, Meyerozyma guilliermondii, Scheffersomyces stipitis, Starmerella bombicola (Van Bogaert et al., 2009), and Yarrowia lipolytica (Fickers et al., 2005; Hirakawa et al., 2009; Iida et al., 1998; Iida et al., 2000) (Fig. 1A). n-Alkanes are hydroxylated to fatty alcohols by CYP52-family P450s in the endoplasmic reticulum (ER) membrane. Fatty alcohols are oxidized via fatty aldehydes to fatty acids, which are metabolized through β-oxidation in the peroxisome or used to synthesize membrane or storage lipids. Interestingly, multiple paralogs encoding P450s in the CYP52 family are present in the genomes of n-alkane-assimilating yeasts (Fig. 1B). The substrate specificities of a few of the CYP52-family P450s, mainly those in C. maltosa and C. tropicalis, have been examined; they were found to show distinct substrate preferences (Fig. 1B) (Eschenfeldt et al., 2003; Kim et al., 2007; Ohkuma et al., 1998; Van Bogaert et al., 2009; Zimmer et al., 1996). Some CYP52-family P450s showed strong substrate preferences for n-alkanes, whereas others showed strong preferences for the ω-terminal ends of fatty acids. In addition, a subset of the CYP52-family P450s exhibited significant hydroxylation activities to both n-alkanes and fatty acids. The structures of the P450s that have significant ω-hydroxylation activities to fatty acids (CYP52A7, CYP52A8, and CYP52A17 of C. tropicalis; CYP52A9, CYP52A10, and CYP52A11 of C. maltosa; and CYP52A21 of C. albicans) are similar, but the structures of other P450s are relatively divergent (Fig. 1).

The oleaginous yeast Y. lipolytica has an outstanding ability to utilize a variety of hydrophobic compounds, including n-alkanes, fatty alcohols, fatty aldehydes, and fatty acids, as sole carbon and energy sources (Fickers et al., 2005; Fukuda, 2013; Fukuda and Ohta, 2013; Nicaud, 2012; Tenagy et al., 2015). In Y. lipolytica, n-alkanes are hydroxylated to fatty alcohols by P450ALKs belonging to the CYP52 family, as in other n-alkane-assimilating yeasts. Y. lipolytica possesses twelve ALK genes (ALK1–ALK12) (Hirakawa et al., 2009; Takai et al., 2012), which are deduced to encode P450ALKs. Alk1p to Alk10p and Alk12p belong to the CYP52F subfamily and appear to constitute a monophyletic clade in the phylogenetic tree of the CYP52-family P450s (Fig. 1B), whereas Alk11p is classified as CYP52S1 (Nelson, 2009). Thus, the roles and substrate specificities of the CYP52-family P450s of Y. lipolytica are of particular interest in terms of the evolution of the CYP52-family P450s.

Deletion mutants of these ALK genes have been constructed and analyzed to determine their substrates. An ALK1 deletion mutant showed a severe growth defect on n-decane, and additional deletion of ALK2 resulted in profound growth defects on n-hexadecane (Iida et al., 1998; Iida et al., 2000), indicating that ALK1 and ALK2 play key roles in the metabolism of n-alkanes. When heterologously expressed in the plant Nicotiana benthamiana, Alk3p, Alk5p, and Alk7p hydroxylated the ω-terminus of dodecanoic acid, whereas Alk1p, Alk2p, Alk4p, and Alk6p did not (Hanley et al., 2003). These results suggest that the Alk proteins of Y. lipolytica have distinct substrate specificities. However, the detailed molecular functions and physiological roles of the individual Y. lipolytica Alk proteins remain to be elucidated. This is largely due to the presence of multiple paralogs of ALK genes in the Y. lipolytica genome, which makes it difficult to determine the function of each ALK gene in vivo by analyzing deletion mutants. The substrate specificities of Alk proteins may be determined by heterologous expression of ALK genes in E. coli, S. cerevisiae, or other hosts, but the in vivo functions of Alk proteins cannot be examined in those systems. In addition, it is often difficult to stably express full-length P450s in these heterologous expression systems.

We previously constructed a deletion mutant, the alk1-12 strain, in which all twelve ALK genes have been deleted. The n-alkane-inducible P450 was not detected in the alk1-12 strain, and the alk1-12 strain could not grow using n-alkanes as a sole carbon source (Takai et al., 2012). In this study, each of the twelve ALK genes was expressed in the alk1-12 strain and the substrate specificities of the Alk proteins were determined. In addition, the involvement of Alk proteins in the oxidation of a fatty alcohol or fatty aldehydes was examined. Our results suggest that Y. lipolytica possesses multiple ALK genes that encode P450s with distinct substrate specificities.


2. Materials and methods

2.1. Yeast strains and growth conditions

Yeast strains used in this study are shown in Table 1. Y. lipolytica strains CXAU1 and CXAU/A1, derived from CX161-1B (ATCC32338, ade1), were used as wild-type strains (Iida et al., 1998; Yamagami et al., 2004). Deletion of HFD1, HFD2, HFD3, and HFD4 was performed using the pop-in/pop-out method as described previously (Iwama et al., 2014; Iwama et al., 2015; Takai et al., 2012). An appropriate carbon source was added to YNB [0.17% yeast nitrogen base without amino acids and ammonium sulfate (Difco), 0.5% ammonium sulfate] as follows: 2% (w/v) glucose (SD medium); 2% (v/v) n-dodecane; 0.5% (w/v) or 2% (w/v) dodecanoic acid; 0.5% (w/v) tetradecanoic acid; 0.5% (w/v) hexadecanoic acid; 0.1% (w/v) tetradecanal. Uracil (24 mg l⁻¹) and/or adenine (24 mg l⁻¹) was added, if necessary. For solid media, 2% agar was added. n-Alkanes (C10, n-decane; C11, n-undecane; C12, n-dodecane; C13, n-tridecane; C14, n-tetradecane; C15, n-pentadecane; C16, n-hexadecane; C17, n-heptadecane; and C18, n-octadecane) were supplied in the vapor phase to YNB solid media as described previously (Endoh-Yamagami et al., 2007). Dodecanoic acid, tetradecanoic acid, hexadecanoic acid, and tetradecanal were added to medium with 0.5% (v/v) Triton X-100. Yeast cells were grown at 30°C.

2.2. Plasmids

Plasmids used in this study are shown in Table S1, and the sequences of the primers used are listed in Table S2.

The plasmid pS4ARR, an expression vector that contains four tandem repeats of ARR1, an upstream activating sequence responsible for the n-alkane response, was constructed as follows: Primers Throm-6xHis-F and Throm-6xHis-R were annealed, phosphorylated, and inserted into the KpnI site of pSUT5 (Yamagami et al., 2001) to obtain pSUT5-TH. The ALK1 core promoter with four tandem repeats of ARR1 was amplified by PCR using primers XbaI-4xARR-F and 4xARR-EcoRIStuI-R from pSUT-4xARR1 (Sumita et al., 2002). The obtained DNA fragment was digested with XbaI and EcoRI and cloned into the XbaI-EcoRI sites of pSUT5-TH to obtain pS4ARR.

The plasmid pSUTEF1, an expression vector that harbors the Y. lipolytica TEF1 promoter for constitutive expression, was constructed as follows: The TEF1 promoter region was amplified from CXAU1 total DNA by PCR using primers XbaI-TEF1p-F and TEF1p-StuIEcoRI-R. The amplified DNA fragment was digested with XbaI and EcoRI and cloned into the XbaI-EcoRI sites of pSUT5 to obtain pSUTEF1.

The plasmids p4ARR-lacZ and pTEF1p-lacZ, used to test the activities of the 4xARR1 promoter and the TEF1 promoter, respectively, were constructed as follows: pS4ARR was digested with ApaI and StuI and pSUTEF1 was digested with KpnI and StuI to obtain DNA fragments carrying the 4xARR1 promoter and the TEF1 promoter, respectively. These fragments were cloned into the ApaI-StuI sites and KpnI-StuI sites of pSUT5lacZ (Sumita et al., 2002) to obtain p4ARR-lacZ and pTEF1p-lacZ, respectively.

The plasmids to express ALK1, ALK2, ALK3, ALK4, ALK5, ALK6, ALK7, ALK8, and ALK12 under the control of the 4xARR1 promoter or the TEF1 promoter were constructed as follows: The open reading frames (ORFs) were amplified from CXAU1 total DNA by PCR using primers ALK1-F and ALK1-R for ALK1, ALK2-F and ALK2-R for ALK2, ALK3-F and ALK3-R for ALK3, ALK4-F and ALK4-R for ALK4, ALK5-F and ALK5-R for ALK5, ALK6-F and ALK6-R for ALK6, ALK7-F and ALK7-R for ALK7, ALK8-F and ALK8-R for ALK8, and ALK12-F and ALK12-R for ALK12. The amplified fragments were digested with BglII and ApaI for ALK1, ALK2, ALK3, and ALK8, EcoRI and ApaI for ALK4, ALK5, ALK6, and ALK12, and BglII and SalI for ALK7 and cloned into the corresponding sites of pS4ARR to obtain pS4RALK1, pS4RALK2, pS4RALK3, pS4RALK4, pS4RALK5, pS4RALK6, pS4RALK7, pS4RALK8, and pS4RALK12. These plasmids were digested with StuI and KpnI for pS4RALK1, pS4RALK2, pS4RALK3, and pS4RALK6 and EcoRI and KpnI for pS4RALK4, pS4RALK5, and pS4RALK12 to obtain DNA fragments harboring ALK genes. These fragments were cloned into the corresponding sites of pSUTEF1 to obtain pSUTEF1-ALK1, pSUTEF1-ALK2, pSUTEF1-ALK3, pSUTEF1-ALK4, pSUTEF1-ALK5, pSUTEF1-ALK6, and pSUTEF1-ALK12. The ORFs of ALK7 and ALK8 were amplified from CXAU1 total DNA by PCR using primers ALK7-F2 and ALK7-R for ALK7 and ALK8-F2 and ALK8-R2 for ALK8. The amplified fragments were digested with HindIII and SalI for ALK7 and EcoRI and KpnI for ALK8, and cloned into the HindIII-SalI sites and EcoRI-KpnI sites of pSUTEF1, respectively, to obtain pSUTEF1-ALK7 and pSUTEF1-ALK8.

The plasmids to express ALK9, ALK10, and ALK11 under the control of the 4xARR1 promoter or the TEF1 promoter were constructed as follows: The ORFs were amplified from CXAU1 total DNA by PCR using primers ALK9-F and ALK9-R for ALK9, ALK10-F and ALK10-R for ALK10, and ALK11-F and ALK11-R for ALK11. The amplified fragments were digested with EcoRI and KpnI and cloned into the EcoRI-KpnI sites of pSUTEF1 to obtain pSUTEF1-ALK9, pSUTEF1-ALK10, and pSUTEF1-ALK11. These plasmids were digested with EcoRI and KpnI, and the DNA fragments containing ALK genes were cloned into the EcoRI-KpnI sites of pS4ARR to obtain pS4RALK9, pS4RALK10, and pS4RALK11.

2.3. Transformation of Y. lipolytica

Y. lipolytica was transformed by electroporation as described previously (Iida et al., 1998).

2.4. Measurement of reduced CO-difference spectra

Reduced CO-difference spectra were measured as described previously (Takai et al., 2012). P450 was quantified by the absorbance at 450 nm of CO-difference spectra using an extinction coefficient of 91 mM⁻¹ cm⁻¹ (Omura and Sato, 1964).
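The quantification above amounts to applying the Beer-Lambert law to the difference spectrum. The following is a minimal sketch of that arithmetic, not the authors' analysis script. The function name, the 1 cm path length, the optional reference-wavelength subtraction (a 490 nm baseline is commonly used but is not stated in the text), and the example absorbance are all assumptions made for illustration; only the extinction coefficient comes from the text.

```python
# Sketch: P450 concentration from a reduced CO-difference spectrum (Beer-Lambert law).
# Only the extinction coefficient is taken from the text; everything else is assumed.

EXTINCTION_MM = 91.0  # mM^-1 cm^-1 (Omura and Sato, 1964), as cited in the text


def p450_concentration_um(delta_a450, delta_a_ref=0.0, path_cm=1.0):
    """Return the P450 concentration in micromolar.

    delta_a450  -- difference-spectrum absorbance at 450 nm
    delta_a_ref -- absorbance at a reference wavelength (assumption: e.g. 490 nm)
    path_cm     -- optical path length in cm (assumption: 1 cm cuvette)
    """
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    delta = delta_a450 - delta_a_ref
    # c [mM] = dA / (epsilon * l); multiply by 1000 to convert mM to micromolar
    return delta / (EXTINCTION_MM * path_cm) * 1000.0


# Hypothetical reading, chosen only to make the arithmetic easy to follow
example_um = p450_concentration_um(0.0455)
print(f"{example_um:.2f} uM P450")
```

Normalizing this concentration to cell density or protein amount would be a further step that depends on how the whole-cell spectra were recorded.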

2.5. Extraction of dicarboxylic acid (DCA) from culture and analysis by GC-MS

Cells were inoculated into SD medium and incubated for 2 days. These precultured cells were seeded into SD medium containing 0.5% (w/v) fatty acids with 0.5% (v/v) Triton X-100 at an initial OD600 of 0.1 and incubated for 1–5 days. Five ml of the culture was harvested, and 160 µl of 1000 ppm undecanoic acid in methanol was added to the culture as an internal standard. Then, 0.5 ml of 1 M KOH was added and the mixture was vortexed for 20 sec. Two ml of ethyl acetate and 1 ml of 1 M HCl were added and the mixture was vortexed for 30 sec. The ethyl acetate phase was recovered and centrifuged at 15,000 rpm for 1 min at room temperature. Approximately 1 ml of the supernatant was recovered and evaporated under nitrogen gas. The extract was dissolved in 600 µl hexane and trimethylsilylated with 200 µl of HMDS + TMCS + Pyridine, 3:1:9 (Sylon HTP) Kit (SUPELCO).

Metabolites of dodecanoic acid were quantified using a gas chromatography-mass spectrometry (GC-MS) QP2010 SE system (Shimadzu) with a DB5 capillary column (30 m × 0.25 mm, J & W Scientific). The experiments were carried out with injector and interface temperatures of 270°C. The oven temperature was held at 100°C for 2 min, increased linearly to 300°C at a rate of 10°C min⁻¹, and held at 300°C for 20 min. Quantification of trimethylsilylated derivatives of undecanoic acid, dodecanedioic acid, and 12-hydroxydodecanoic acid was performed by calculating the peak areas of the target ions (undecanoic acid, m/z 243.0; dodecanedioic acid, m/z 359.0; 12-hydroxydodecanoic acid, m/z 345.0). Standard curves were constructed using dodecanedioic acid (Sigma-Aldrich) and 12-hydroxydodecanoic acid (Sigma-Aldrich). Five ml of solution containing 50, 200, or 500 ppm dodecanedioic acid and 12-hydroxydodecanoic acid was prepared, and 160 µl of 1000 ppm undecanoic acid in methanol was added to the solution. These compounds were extracted as described above.
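An internal-standard calibration of this kind can be sketched as a linear fit of the analyte/internal-standard peak-area ratio against the standard concentration. The code below is an illustration under stated assumptions rather than the procedure actually used: the linear model, the function names, and every peak-area value are hypothetical. Only the three standard concentrations (50, 200, and 500 ppm) are taken from the text.

```python
# Sketch: internal-standard calibration for GC-MS quantification.
# The area ratios below are hypothetical placeholders, not measured data.


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify_ppm(area_analyte, area_internal_std, slope, intercept):
    """Convert a sample's peak-area ratio into a concentration via the fitted line."""
    ratio = area_analyte / area_internal_std
    return (ratio - intercept) / slope


standards_ppm = [50.0, 200.0, 500.0]  # standard concentrations named in the text
area_ratios = [0.25, 1.0, 2.5]        # hypothetical analyte / undecanoic acid ratios

slope, intercept = fit_line(standards_ppm, area_ratios)
sample_ppm = quantify_ppm(area_analyte=1.5e6, area_internal_std=1.0e6,
                          slope=slope, intercept=intercept)
print(f"slope={slope:.4f} per ppm, sample={sample_ppm:.0f} ppm")
```

Dividing by the internal-standard area corrects for differences in extraction recovery and injection volume between samples, which is the usual reason for spiking every sample and standard with the same amount of undecanoic acid.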

2.6. β-Galactosidase activity assay

Measurement of β-galactosidase activity was performed as described previously (Mori et al., 2013).

2.7. Quantitative real time PCR (qRT-PCR)

Quantitative real time PCR was performed using primers specific for each ALK gene as described previously (Hirakawa et al., 2009), except for the ALK9-specific primers (5’-GAACACCTTCCATCTGTTAATCG-3’ and 5’-GTAGTAGAGTGGCTTTCCGCACT-3’).

2.8. Rapid amplification of cDNA ends (RACE)

RACE analysis was performed as described previously (Iwama et al., 2014). The primers used were ALK9_GSP1 and ALK9_NGSP1 for 5’-RACE of ALK9 and ALK9_GSP2 and ALK9_NGSP2 for 3’-RACE of ALK9 (Table S2).

2.9. Measurement of viable cells

Cell viability was determined by phloxine B staining as described previously (Mori et al., 2013).

2.10. Sequence accession number

The nucleotide sequence reported in this paper has been submitted to the DDBJ/GenBank™/EMBL Data Bank under accession number LC070676.


3. Results

3.1. N-terminal sequence of Alk9p

The amino acid sequences of all Alk proteins, deduced by the Génolevures project, were examined. A Kyte-Doolittle hydropathy plot indicated that all Alk proteins except Alk9p possess predicted N-terminal transmembrane domains (TMDs) (Fig. 2A), which are characteristic features of eukaryotic microsomal P450s. The deduced Alk9p sequence contains an extra amino acid extension upstream of the predicted TMD (Fig. 2A). RACE analysis was used to determine the transcription initiation site of ALK9. The results suggested that the start codon of ALK9 is 99 bp downstream of that predicted by the Génolevures project (Fig. 2B) and that ALK9 encodes a protein of 512 amino acids. A Kyte-Doolittle hydropathy plot of the Alk9p amino acid sequence determined by RACE analysis suggested the presence of an N-terminal TMD in Alk9p, as in the other Alk proteins (Fig. 2C).
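A Kyte-Doolittle hydropathy profile is a sliding-window average of per-residue hydropathy values. The sketch below illustrates the calculation. The scale values are the published Kyte-Doolittle indices, whereas the 19-residue window, the 1.6 threshold for a TMD-like segment, and the toy sequence are assumptions for illustration (the window used for Fig. 2 is not stated in the text, and the toy sequence is not an Alk protein).

```python
# Sketch: Kyte-Doolittle hydropathy profile with a sliding window.
# Window size, threshold, and the toy sequence are illustrative assumptions.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    if window > len(scores):
        return []
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


# Toy sequence: a hydrophobic N-terminal stretch followed by charged residues
toy_sequence = "M" + "L" * 20 + "KRDE" * 5
profile = hydropathy_profile(toy_sequence)
has_tmd_like_segment = max(profile) > 1.6  # commonly used cutoff, an assumption here
print(len(profile), round(max(profile), 2), has_tmd_like_segment)
```

Applied to a real sequence, a peak near the N-terminus that exceeds the cutoff would correspond to the kind of predicted TMD described above.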

3.2. Expression of each ALK gene in the alk1-12 strain

To express each ALK gene in the alk1-12 strain, two expression vectors were constructed. One used an artificial promoter, the 4xARR1 promoter, which is highly inducible by n-alkanes. The 4xARR1 promoter contains four tandem repeats of the alkane-responsive ARR1 region upstream of the ALK1 core promoter (Sumita et al., 2002). The other expression vector used the constitutive promoter of the TEF1 gene (Müller et al., 1998). In both systems, the promoters were cloned into a low-copy vector of Y. lipolytica. In addition, to allow expression of a target protein as a fusion with a C-terminal 6xHis tag, a DNA fragment containing a nucleotide sequence encoding a 6xHis tag and a stop codon was inserted into the expression vector along with the 4xARR1 promoter. In this study, Alk proteins were expressed without a 6xHis tag to evaluate their inherent functions and substrate specificities.

We evaluated the activities of these two promoters in the presence of an n-alkane and/or a fatty acid using a lacZ reporter gene. Plasmids carrying lacZ under either the 4xARR1 or the TEF1 promoter were constructed and introduced into the wild-type Y. lipolytica strain. Cells precultured to logarithmic phase in SD medium were shifted to YNB medium supplemented with glucose, n-dodecane, and/or dodecanoic acid. Cells were incubated for 3 h and collected. Crude cell extracts were prepared from these cells and subjected to a β-galactosidase activity assay (Fig. 3A). The 4xARR1 promoter was highly activated in the presence of n-dodecane alone and showed lower activity in the presence of both glucose and n-dodecane, whereas its activity was low in medium containing glucose, dodecanoic acid, or glucose and dodecanoic acid. Dodecanoic acid strongly repressed the activity of the 4xARR1 promoter in the presence of n-dodecane with glucose. These results suggest that n-alkanes highly activate the 4xARR1 promoter and that dodecanoic acid represses promoter activity. The activity of the TEF1 promoter was high regardless of the carbon source, and no significant differences in its activity were observed.

Next, plasmids carrying each of the twelve ALK genes under either the TEF1 promoter or the 4xARR1 promoter were constructed and introduced into the alk1-12 strain. The alk1-12 strains containing ALK genes under the TEF1 promoter were cultured to logarithmic phase in SD medium and the mRNA levels of the ALK genes were analyzed by qRT-PCR (Fig. 3B). Significant amounts of mRNA were detected in all twelve strains expressing one of the ALK genes, although the levels varied. Next, the strains harboring plasmids to express ALK genes under the TEF1 promoter were cultured in SD medium for 30 h, and the P450 contents were determined from the CO-difference spectra measured using whole cells (Fig. 3C and Fig. S1). No significant peak was detected at 450 nm in the alk1-12 strain harboring an empty vector. In contrast, similar amounts of P450s were produced in the alk1-12 strains carrying one of the ALK genes, except the strain expressing ALK6, which exhibited a higher P450 content than the strains expressing other ALK genes, and the strains expressing ALK2 and ALK11, which exhibited lower P450 contents. No significant correlation was observed between the expression levels of the ALK genes and the P450 contents (Fig. 3D).

The wild-type strain and the alk1-12 strains harboring plasmids to express ALK genes under the 4xARR1 promoter were incubated in medium containing n-dodecane and glucose for 6 h, and the P450 contents were also determined (Fig. 3E and Fig. S2). Glucose was added with n-dodecane to support growth of the alk1-12 strains, since glucose does not repress transcription of the ALK genes induced by n-alkanes (Iida et al., 1998; Iida et al., 2000). A peak was detected at 450 nm in the wild-type strain. In the alk1-12 strains expressing ALK genes, peaks at 450 nm were also observed, but the P450 contents varied. The strain expressing ALK3 exhibited a higher P450 content than the other strains, while the strain expressing ALK8 showed a much lower P450 content. In addition, the P450 content in the strain expressing ALK2, ALK9, ALK10, ALK11, or ALK12 was approximately half of that in the strain expressing ALK1, ALK4, ALK5, ALK6, or ALK7. No significant correlation was observed between the P450 contents in the strains expressing ALK genes under the 4xARR1 promoter and those in the strains expressing ALK genes under the TEF1 promoter (Fig. 3F). These results indicated that each Alk protein was produced in both systems. In the following experiments, the 4xARR1 promoter was used when n-alkanes were used as substrates, and the TEF1 promoter was used when fatty acids or other carbon sources were used as substrates.
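The comparisons of mRNA level against P450 content, and of P450 content under one promoter against the other, are correlation analyses. The text does not state which statistic was used, so the sketch below assumes a Pearson correlation coefficient purely for illustration. The input values are hypothetical placeholders with no biological meaning and are not taken from Fig. 3.

```python
# Sketch: Pearson correlation between two paired measurements.
# Assumption: the statistic and all values are illustrative, not from the paper.
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical relative mRNA levels and P450 contents for four strains
relative_mrna = [1.0, 2.0, 3.0, 4.0]
p450_content = [2.0, 1.0, 1.0, 2.0]

r = pearson_r(relative_mrna, p450_content)
print(f"r = {r:.2f}")
```

A coefficient near zero, as in this placeholder example, is what "no significant correlation" would look like numerically; a formal conclusion would also need a significance test, which is omitted here.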

3.3. Oxidation of n-alkanes by Alk proteins

The alk1-12 strain grew on glucose but not on 10–16-carbon n-alkanes, and introduction of ALK1 into the alk1-12 strain restored growth on n-alkanes (Takai et al., 2012). The ability of each Alk protein to hydroxylate n-alkanes in vivo was evaluated by investigating the growth on n-alkanes of the alk1-12 strain expressing each ALK gene under the 4xARR1 promoter. ALK1 restored the growth of the alk1-12 strain on n-alkanes of various lengths, and ALK2 restored growth on longer-chain n-alkanes (Fig. 4 and Table 2). These results are consistent with the observations that the alk1 strain showed severe growth defects on 10–15-carbon n-alkanes and that the deletion of ALK2 in the alk1 strain led to an additional growth defect on n-hexadecane (Iida et al., 2000; Takai et al., 2012). The alk1-12 strains expressing ALK3, ALK6, ALK9, and ALK10 also grew on n-alkanes, but they showed distinct preferences for n-alkanes of particular chain lengths (Fig. 4 and Table 2). It was suggested that Alk3p oxidizes n-alkanes of various carbon numbers and that Alk10p oxidizes n-alkanes of a wide range of lengths but prefers shorter-chain n-alkanes. In contrast, Alk6p and Alk9p were suggested to prefer longer-chain n-alkanes. When the TEF1 promoter was used to express each ALK gene, similar results were obtained, although growth was slightly slower than under the 4xARR1 promoter, consistent with the lower activity of the TEF1 promoter on n-alkanes compared with the 4xARR1 promoter (data not shown). These results suggest that Alk proteins show distinct substrate specificities for n-alkanes.

3.4. Oxidation of ω-termini of fatty acids by Alk proteins

The activities of Alk proteins to oxidize the ω-termini of fatty acids were evaluated by the production of ω-hydroxy fatty acids or dicarboxylic acids from fatty acids by alk1-12 strains expressing ALK genes. Since fatty acids repress the activity of the 4xARR1 promoter (Fig. 3A), cells expressing each ALK gene under the constitutive TEF1 promoter were used. First, time-dependent production of ω-hydroxydodecanoic acid and dodecanedioic acid from dodecanoic acid by the alk1-12 strains expressing Alk3p, Alk5p, and Alk7p was examined. The results of experiments using the plant expression system had previously suggested that these Alk proteins show ω-hydroxylation activities against dodecanoic acid (Hanley et al., 2003). These strains were cultured in SD medium containing dodecanoic acid for 1–5 days. Metabolites were extracted from the whole culture and analyzed by GC-MS. The alk1-12 strains expressing Alk3p, Alk5p, and Alk7p produced substantial amounts of ω-hydroxydodecanoic acid and dodecanedioic acid, reaching maximum levels on the third to fourth day, followed by a decrease, probably due to assimilation of these compounds by Y. lipolytica. These compounds were not detected in the whole culture of the alk1-12 strain containing the empty vector. These results suggested that Alk3p, Alk5p, and Alk7p have hydroxylation activities against the ω-terminus of dodecanoic acid in Y. lipolytica, concordant with the results obtained using the plant expression system (Hanley et al., 2003).

Next, production of ω-hydroxydodecanoic acid and dodecanedioic acid from dodecanoic acid by the alk1-12 strains expressing other Alk proteins was examined (Fig. 5B and C). Cells were cultured in SD medium containing dodecanoic acid for 3 days and the metabolites were analyzed. It was suggested that Alk4p and Alk6p have ω-oxidation activities against dodecanoic acid, although their activities were much lower than those of Alk3p, Alk5p, and Alk7p. In addition, trace amounts of ω-hydroxydodecanoic acid and/or dodecanedioic acid were detected in the whole culture of cells expressing ALK1, ALK2, ALK8, ALK9, and ALK11.

In the whole culture of the alk1-12 strain expressing Alk3p, Alk5p, or Alk7p, a small peak was detected at approximately 14.8 min (Fig. 5B, black arrow). The mass spectrum pattern of this peak is consistent with that of a trimethylsilylated derivative of (ω-1)-hydroxydodecanoic acid reported by Kim et al. (data not shown) (Kim et al., 2007), raising the possibility that Alk3p, Alk5p, and Alk7p catalyzed oxidation at the ω-1 position of dodecanoic acid.

When tetradecanoic acid was used as a substrate, a small amount of tetradecanedioic acid was detected in the whole culture of the alk1-12 strain expressing Alk5p, but not of those expressing other Alk proteins (data not shown). When hexadecanoic acid was used as a substrate, the ω-oxidation product of hexadecanoic acid was not detected in the whole culture of the alk1-12 strain expressing any Alk protein (data not shown).

3.5. Involvement of Alk proteins in oxidation of fatty alcohols and fatty aldehydes

CYP52A3 of C. maltosa has been proposed to catalyze the oxidation of fatty alcohol and fatty aldehyde, in addition to hydroxylation of n-alkane (Scheller et al., 1998). We explored the possibility that the Alk proteins of Y. lipolytica oxidize fatty alcohol and fatty aldehyde in vivo.

In the genome of Y. lipolytica, eight genes encoding alcohol dehydrogenases (ADH1–ADH7 and FADH) and a gene encoding a fatty alcohol oxidase (FAO1) have been identified (Gatter et al., 2014; Iwama et al., 2015). We have previously shown that a mutant strain in which ADH1–ADH7, FADH, and FAO1 had been deleted showed no growth defect on n-alkanes of more than 12 carbons, but that it was sensitive to exogenous fatty alcohols and showed severe growth defects on fatty alcohols, probably due to a decrease in its activity for oxidation of fatty alcohols (Iwama et al., 2015). We examined whether deletion of ALK genes in the wild-type strain confers sensitivity to 1-dodecanol (Fig. 6A). The wild-type and alk1-12 strains were incubated for 6 h in SD medium containing 5 mM, 0.5 mM, or 0.05 mM 1-dodecanol with 0.1% Triton X-100 (added to solubilize the 1-dodecanol), and cell viability was determined by phloxine B staining. Triton X-100 was slightly toxic to the wild-type and alk1-12 strains. In the presence of 5 mM or 0.5 mM 1-dodecanol, the viability of these strains was markedly decreased, but no significant difference in viability was observed between them. In the presence of 0.05 mM 1-dodecanol, the viability of the alk1-12 strain was lower than that of the wild-type strain. These results raise the possibility that Alk proteins are involved in the detoxification of fatty alcohols.

Y. lipolytica possesses four genes that encode fatty aldehyde dehydrogenases, HFD1–HFD4, which are involved in the oxidation of fatty aldehydes to fatty acids in the n-alkane assimilation pathway. The hfd1-4 strain, in which HFD1–HFD4 had all been deleted, could not grow on n-alkanes, but grew on 1-dodecanal (Iwama et al., 2014). These results suggest that Y. lipolytica possesses other enzyme(s) that can oxidize extracellular fatty aldehydes. To examine the involvement of Alk proteins in the oxidation of exogenous fatty aldehydes, the alk1-12hfd1-4 strain, which lacks all twelve ALK genes and four HFD genes, was constructed and its growth on fatty aldehydes was analyzed (Fig. 6B). The alk1-12 strain grew on dodecanal and tetradecanal as well as the wild-type strain did. The hfd1-4 strain showed partially defective growth on dodecanal and tetradecanal in a spot assay. Additional deletion of all twelve ALK genes from the hfd1-4 strain exacerbated the growth defect of the hfd1-4 strain on dodecanal, but not on tetradecanal. These results raise the possibility that Alk protein(s) are involved in the oxidation of shorter-chain fatty aldehydes incorporated from the culture medium.

4. Discussion

In this study, we characterized the functions and substrate specificities of all twelve CYP52-family P450s encoded by ALK genes in Y. lipolytica. Because our assay employed expression systems using Y. lipolytica, instead of heterologous systems, we were able to characterize the proteins under physiological conditions.

4.1. Substrate Specificities of Alk proteins Our results suggest that Alk1p and Alk3p oxidize n-alkanes of various carbon numbers, whereas Alk2p, Alk6p, and Alk9p prefer longer-chain n-alkanes and Alk10p prefers shorter-chain n-alkanes (Fig. 4 and Table 2). Alk1p–Alk9p and Alk11p showed the ability to oxidize the ω-terminus of dodecanoic acid, although the activities of Alk1p, Alk2p, Alk8p, Alk9p, and Alk11p were very weak (Fig. 5). Oxidation of the ω-terminus of dodecanoic acid has been reported to be catalyzed in vitro by Alk3p, Alk5p, and Alk7p expressed in N. benthamiana, but not by Alk1p, Alk2p, Alk4p, or Alk6p (Hanley et al., 2003). The ω-oxidation activities of Alk1p, Alk2p, Alk4p, and Alk6p might not be detectable when expressed in a plant, due to weak activity or low levels of expression. Alternatively, Alk1p, Alk2p, Alk4p, and Alk6p may function only in Y. lipolytica. Significant amounts of P450s were produced in the strains expressing ALK genes under the 4xARR1 promoter and the TEF1 promoter. However, when ALK genes were expressed using the 4xARR1 promoter in cells cultured in the medium containing n-dodecane and glucose, the P450 content in the strain expressing ALK3 was higher than those in the other strains (Fig. 3E). In addition, the P450 content in the strain expressing ALK2, ALK9, or ALK10 was approximately half of that in the strain expressing ALK1 or ALK6. Therefore, the activity of Alk3p in hydroxylating n-alkanes might be overestimated and those of Alk2p, Alk9p, and Alk10p might be underestimated. Furthermore, the defective growth of the alk1-12 strain expressing ALK8 on n-alkanes might be due to its low P450 content. When ALK genes were expressed using the TEF1 promoter, the strain expressing ALK6 exhibited a higher P450 content than the other strains, while the strains expressing ALK2 and ALK11 exhibited lower P450 contents (Fig. 3C). The activity of Alk6p in hydroxylating the ω-terminus of dodecanoic acid might therefore be overestimated and those of Alk2p and Alk11p might be underestimated. The reason for the lack of correlation between the P450 contents in the strains expressing ALK genes under the 4xARR1 promoter (Fig. 3E) and those in the strains expressing ALK genes under the TEF1 promoter (Fig. 3F) is currently unknown. Phylogenetic analysis of the Y. lipolytica Alk proteins has suggested that structurally similar Alk proteins have similar substrate preferences; Alk1p, Alk9p, Alk2p, and Alk10p, which have significant preferences for n-alkanes, are closely related, although they show different chain-length specificities for n-alkanes. In addition, Alk5p and Alk7p, which show strong activity in oxidizing the ω-terminus of dodecanoic acid, are also closely related (Fig. 7). To quantitatively compare the activities of Alk proteins for hydroxylation of n-alkanes or fatty acids, accumulation of fatty alcohols or ω-hydroxy fatty acids must be quantified in a strain that expresses an ALK gene but that is defective in the downstream oxidation of fatty alcohols or ω-hydroxy fatty acids. 
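The caveat raised above, that apparent activities may be over- or underestimated when P450 contents differ between strains, can be illustrated by dividing a measured product amount by the P450 content of the same strain. The following is a minimal sketch under stated assumptions: the function name, the strain labels, and all numbers are hypothetical placeholders rather than data from this study, and a simple ratio ignores differences in substrate uptake and downstream metabolism.

```python
def normalize_by_p450(product, p450_content):
    """Return product formed per unit of P450 content for each strain.

    Both arguments are dicts keyed by strain label (arbitrary units).
    """
    normalized = {}
    for strain, amount in product.items():
        content = p450_content[strain]
        if content <= 0:
            raise ValueError(f"P450 content must be positive for {strain}")
        normalized[strain] = amount / content
    return normalized


# Hypothetical placeholder values (arbitrary units), NOT measurements
# from this study.
product = {"strainA": 120.0, "strainB": 60.0}
p450_content = {"strainA": 40.0, "strainB": 10.0}

ratios = normalize_by_p450(product, p450_content)
for strain, ratio in ratios.items():
    print(strain, ratio)
```

In this toy case the strain with less total product has the higher product-to-P450 ratio, which is the kind of reversal the caveat warns about.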
We have previously shown that the alcohol dehydrogenase genes ADH1 and ADH3 and the fatty alcohol oxidase gene FAO1 are involved in the oxidation of exogenous fatty alcohols (Iwama et al., 2015). However, the ALCY02 strain, in which ADH1–ADH7, FADH, and FAO1 were deleted, did not show any defects in growth on n-alkanes of more than 12 carbons (Iwama et al., 2015), and the gene(s) involved in the oxidation of fatty alcohols generated during metabolism of n-alkanes remain to be identified. Therefore, in this study the activities of Alk proteins were evaluated by observing the growth of the alk1-12 strains expressing ALK genes on n-alkanes or by analyzing fatty acid metabolites.

4.2. Substrate recognition by Alk proteins Because Alk proteins show high levels of amino acid similarity and because their substrates, n-alkanes and fatty acids, are structurally similar, we expected to be able to determine the residues involved in substrate recognition in Alk proteins by comparing the sequences of Alk proteins that show a preference for n-alkanes with those that show a preference for fatty acids. Thus, the amino acid sequence of Alk1p, which strongly prefers n-alkanes, was compared with that of Alk5p, which prefers fatty acids. In addition, chimeric Alk proteins or Alk proteins containing amino acid substitutions in Alk1p and Alk5p were expressed in the alk1-12 strain to determine the regions or residues that recognize n-alkanes or fatty acids. However, domain exchanges or amino acid substitutions did not alter the substrate preference of Alk1p from n-alkanes to dodecanoic acid, nor that of Alk5p from fatty acids to n-alkanes (data not shown). Thus, multiple domains or residues of Alk proteins might be involved in substrate recognition. The crystal structure of S. cerevisiae Erg11p, a CYP51-family P450, has been determined, and the transmembrane domain of ScErg11p has been suggested to interact with the catalytic domain, two regions of which lie inside the lipid bilayer (Monk et al., 2014). In addition, the hydrophobic surface regions of the catalytic domains of microsomal P450s of the CYP2 and CYP16 families have been suggested to interact with the membrane, and substrate-access channels that are open to the membranes have been suggested to be present (Johnson and Stout, 2013). In order to understand how Alk proteins recognize their substrates, the crystal structures of full-length Alk proteins must be solved.

4.3. Physiological roles of Alk proteins in Y. lipolytica In previous studies of ALK gene deletion mutants, ALK1 was suggested to play a pivotal role in the metabolism of n-alkanes of various lengths and ALK2 and ALK6 were suggested to be involved in the oxidation of longer-chain n-alkanes (Iida et al., 1998; Iida et al., 2000; Takai et al., 2012). In addition, transcription of ALK1, ALK2, and ALK6 is highly induced by n-alkanes (Hirakawa et al., 2009). In accordance with these results, overproduction of Alk1p supported growth of the alk1-12 strain on n-alkanes of various carbon numbers and overproduction of Alk2p and Alk6p supported growth on long-chain n-alkanes (Fig. 4 and Table 2). The alk1alk2alk4alk6 strain could not grow on n-alkanes of 10 to 18 carbons (Takai et al., 2012). This result suggests that the other ALK genes are not involved in n-alkane assimilation. However, overproduction of Alk3p and Alk10p supported growth of the alk1-12 strain on n-alkanes (Fig. 4 and Table 2). These results imply that Alk3p and Alk10p, at least, can oxidize n-alkanes, but that their expression under their native promoters is insufficient to support growth on n-alkanes (Hirakawa et al., 2009). Although Alk9p shows the highest similarity to Alk1p (Fig. 7) and transcription of ALK9 is significantly induced by n-alkanes (Hirakawa et al., 2009), the growth of the alk1-12 strain producing Alk9p is very weak (Fig. 4 and Table 2). An alteration in the structure of Alk9p might have resulted in loss of activity or altered its substrate preference. Overproduction of Alk3p, Alk5p, and Alk7p leads to production of significant amounts of dodecanedioic acid and ω-hydroxydodecanoic acid and a trace amount of (ω-1)-hydroxydodecanoic acid in Y. lipolytica (Fig. 5), although those compounds cannot be detected in whole culture of the wild-type strain (data not shown). Thus, the physiological significance of ω-oxidation and (ω-1)-oxidation of fatty acids by Alk3p, Alk5p, and Alk7p remains unclear. In S. bombicola, ω-hydroxy fatty acids are used for synthesis of sophorolipid (Shah et al., 2007; Van Bogaert et al., 2013). Huang et al. also showed that cis-9-octadecene-1,18-dioic acid is used as a component of a glucolipid in S. bombicola (Huang et al., 2014). In contrast to dodecanoic acid, the ω-oxidation or (ω-1)-oxidation products of tetradecanoic acid or hexadecanoic acid were not detected. Y. lipolytica might preferentially utilize longer-chain fatty acids as energy sources or components of membrane or storage lipids. Alternatively, Alk3p, Alk5p, and Alk7p might prefer shorter-chain fatty acids as substrates. ALK3, ALK5, and ALK7 exhibited distinct transcription profiles in terms of both responses to n-alkanes and effects of deletion of YAS2 or YAS3, which encode critical transcriptional regulators involved in the response to n-alkanes (Endoh-Yamagami et al., 2007; Hirakawa et al., 2009). It would be of interest to investigate whether transcription of these genes is upregulated in the presence of fatty acids and whether it is regulated by Por1p, a Zn2Cys6 transcription factor involved in the transcriptional activation of a subset of these genes by fatty acids (Poopanitpan et al., 2010).

The results of this study raise the possibility that the Alk protein(s) of Y. lipolytica can oxidize 1-dodecanol and dodecanal. The mutant strain from which ADH1–ADH7, FADH, and FAO1 were deleted showed severe growth defects on fatty alcohols, but it grew normally on n-alkanes of more than 12 carbons (Iwama et al., 2015). Alk proteins might be involved in the oxidation of fatty alcohols produced during metabolism of n-alkanes. In addition, the hfd1-4 strain did not grow on n-alkanes of 12 to 18 carbons, but it grew on n-decane and n-undecane (Iwama et al., 2014). It is possible that Alk proteins are involved in the oxidation of fatty aldehydes generated during metabolism of n-decane and n-undecane. Alk4p, Alk8p, and Alk11p, which showed weak activities in oxidizing the ω-terminus of dodecanoic acid, and Alk12p, which did not show activity toward n-alkanes or fatty acids, may have other substrates. The system developed here, in which individual ALK genes are expressed in the alk1-12 strain, will facilitate identification of their physiological functions and substrates.

4.4. Evolution of Y. lipolytica in terms of P450s In fungi, the number of P450 genes varies from 1 to more than 100 (Chen et al., 2014). P450 genes are thought to have undergone multiplication and diversification in fungi to meet metabolic needs in divergent environments (Ichinose, 2012; Kelly and Kelly, 2013). One of the most striking features of P450 genes is the CYP “bloom”, in which the genes encoding a specific CYP subfamily increase greatly in number while genes encoding other CYP subfamilies do not (Feyereisen, 2011). Y. lipolytica appears to have undergone a CYP bloom, as eleven of its twelve ALK genes, ALK1–ALK10 and ALK12, encode P450s in the CYP52F subfamily (Nelson, 2009). The CYP52 family is distributed in n-alkane-assimilating yeasts, but CYP52F-subfamily P450s in Y. lipolytica appear to constitute a monophyletic clade (Fig. 1). Y. lipolytica is phylogenetically distant from other n-alkane-assimilating yeasts, including C. maltosa, C. tropicalis, D. hansenii, and S. stipitis (Fig. 1A). Therefore, genes encoding CYP52-family P450s must have undergone multiplication in Y. lipolytica after it diverged from ancestral n-alkane-assimilating yeasts carrying one or a small number of gene(s) encoding CYP52-family P450s. S. bombicola shows a similar pattern (Fig. 1). Expansion of CYP52 genes was probably required for adaptation to an environment where various hydrophobic compounds were present. In addition, some CYP52F-subfamily P450s in Y. lipolytica appear to have acquired activities similar to those of other CYP52-family P450s, despite the independent expansion of CYP52F-subfamily P450s in Y. lipolytica. The similar activities of CYP52-family P450s may be a result of convergent evolution, in response to pressure for efficient oxidation of various hydrophobic substrates by n-alkane-assimilating yeasts. It is also possible that the present diversity of the CYP52F subfamily in Y. lipolytica is transient. Investigation of species closely related to Y. lipolytica may shed light on the evolution of the CYP52F subfamily.

Acknowledgments This work was partly supported by JSPS KAKENHI Grant Numbers 24380045 and 25•7284. R. I. and S. K. were Research Fellows of the Japan Society for the Promotion of Science (DC-1). This work was performed using the facilities of the Biotechnology Research Center of The University of Tokyo. The authors declare that they have no conflicts of interest with the contents of this article.

Footnotes The abbreviations used are: CYP or P450, cytochrome P450; DCA, dicarboxylic acid; ER, endoplasmic reticulum; GC-MS, gas chromatography-mass spectrometry; ORF, open reading frame; RACE, rapid amplification of cDNA ends; TMD, transmembrane domain.

References
Chen, W., et al., 2014. Fungal cytochrome p450 monooxygenases: their distribution, structure, functions, family expansion, and evolutionary origin. Genome Biol Evol. 6, 1620-1634.
Endoh-Yamagami, S., et al., 2007. Basic helix-loop-helix transcription factor heterocomplex of Yas1p and Yas2p regulates cytochrome P450 expression in response to alkanes in the yeast Yarrowia lipolytica. Eukaryot Cell. 6, 734-743.
Eschenfeldt, W. H., et al., 2003. Transformation of fatty acids catalyzed by cytochrome P450 monooxygenase enzymes of Candida tropicalis. Appl Environ Microbiol. 69, 5992-5999.
Feyereisen, R., 2011. Arthropod CYPomes illustrate the tempo and mode in P450 evolution. Biochim Biophys Acta. 1814, 19-28.
Fickers, P., et al., 2005. Hydrophobic substrate utilisation by the yeast Yarrowia lipolytica, and its potential applications. FEMS Yeast Res. 5, 527-543.
Fukuda, R., 2013. Metabolism of hydrophobic carbon sources and regulation of it in n-alkane-assimilating yeast Yarrowia lipolytica. Biosci Biotechnol Biochem. 77, 1149-1154.
Fukuda, R., Ohta, A., 2013. Utilization of hydrophobic substrate by Yarrowia lipolytica. In: Barth G (ed). Yarrowia lipolytica Genetics, Genomics, and Physiology. Heidelberg: Springer, 2013, 111-119.
Gatter, M., et al., 2014. A newly identified fatty alcohol oxidase gene is mainly responsible for the oxidation of long-chain ω-hydroxy fatty acids in Yarrowia lipolytica. FEMS Yeast Res. 14, 858-872.

Hanley, K., et al., 2003. Development of a plant viral-vector-based gene expression assay for the screening of yeast cytochrome p450 monooxygenases. Assay Drug Dev Technol. 1, 147-160.
Hirakawa, K., et al., 2009. Yas3p, an Opi1 family transcription factor, regulates cytochrome P450 expression in response to n-alkanes in Yarrowia lipolytica. J Biol Chem. 284, 7126-7137.
Huang, F. C., et al., 2014. Expression and characterization of CYP52 genes involved in the biosynthesis of sophorolipid and alkane metabolism from Starmerella bombicola. Appl Environ Microbiol. 80, 766-776.
Ichinose, H., 2012. Molecular and functional diversity of fungal cytochrome P450s. Biol Pharm Bull. 35, 833-837.
Iida, T., et al., 1998. Cloning and characterization of an n-alkane-inducible cytochrome P450 gene essential for n-decane assimilation by Yarrowia lipolytica. Yeast. 14, 1387-1397.
Iida, T., et al., 2000. The cytochrome P450ALK multigene family of an n-alkane-assimilating yeast, Yarrowia lipolytica: cloning and characterization of genes coding for new CYP52 family members. Yeast. 16, 1077-1087.
Iwama, R., et al., 2014. Fatty Aldehyde Dehydrogenase Multigene Family Involved in the Assimilation of n-Alkanes in Yarrowia lipolytica. J Biol Chem. 289, 33275-33286.
Iwama, R., et al., 2015. Alcohol dehydrogenases and an alcohol oxidase involved in the assimilation of exogenous fatty alcohols in Yarrowia lipolytica. FEMS Yeast Res. 15, fov014.
Johnson, E. F., Stout, C. D., 2013. Structural diversity of eukaryotic membrane cytochrome p450s. J Biol Chem. 288, 17082-17090.

Kelly, S. L., Kelly, D. E., 2013. Microbial cytochromes P450: biodiversity and biotechnology. Where do cytochromes P450 come from, what do they do and what can they do for us? Philos Trans R Soc Lond B Biol Sci. 368, 20120476.
Kim, D., et al., 2007. Functional expression and characterization of cytochrome P450 52A21 from Candida albicans. Arch Biochem Biophys. 464, 213-220.
Monk, B. C., et al., 2014. Architecture of a single membrane spanning cytochrome P450 suggests constraints that orient the catalytic domain relative to a bilayer. Proc Natl Acad Sci U S A. 111, 3865-3870.
Mori, K., et al., 2013. Transcriptional repression by glycerol of genes involved in the assimilation of n-alkanes and fatty acids in yeast Yarrowia lipolytica. FEMS Yeast Res. 13, 233-240.
Müller, S., et al., 1998. Comparison of expression systems in the yeasts Saccharomyces cerevisiae, Hansenula polymorpha, Klyveromyces lactis, Schizosaccharomyces pombe and Yarrowia lipolytica. Cloning of two novel promoters from Yarrowia lipolytica. Yeast. 14, 1267-1283.
Nelson, D. R., 2009. The cytochrome p450 homepage. Hum Genomics. 4, 59-65.
Nelson, D. R., 2013. A world of cytochrome P450s. Philos Trans R Soc Lond B Biol Sci. 368, 20120430.
Nicaud, J. M., 2012. Yarrowia lipolytica. Yeast. 29, 409-418.
Ohkuma, M., et al., 1995. CYP52 (cytochrome P450alk) multigene family in Candida maltosa: identification and characterization of eight members. DNA Cell Biol. 14, 163-173.
Ohkuma, M., et al., 1991. CYP52 (cytochrome P450alk) multigene family in Candida maltosa: molecular cloning and nucleotide sequence of the two tandemly arranged genes. DNA Cell Biol. 10, 271-282.
Ohkuma, M., et al., 1998. Isozyme function of n-alkane-inducible cytochromes P450 in Candida maltosa revealed by sequential gene disruption. J Biol Chem. 273, 3948-3953.
Omura, T., Sato, R., 1964. The carbon monoxide-binding pigment of liver microsomes. J Biol Chem. 239, 2379-2385.
Panwar, S. L., et al., 2001. CaALK8, an alkane assimilating cytochrome P450, confers multidrug resistance when expressed in a hypersensitive strain of Candida albicans. Yeast. 18, 1117-1129.
Poopanitpan, N., et al., 2010. An ortholog of farA of Aspergillus nidulans is implicated in the transcriptional activation of genes involved in fatty acid utilization in the yeast Yarrowia lipolytica. Biochem Biophys Res Commun. 402, 731-735.
Scheller, U., et al., 1998. Oxygenation cascade in conversion of n-alkanes to α,ω-dioic acids catalyzed by cytochrome P450 52A3. J Biol Chem. 273, 32528-32534.
Seghezzi, W., et al., 1992. Identification and characterization of additional members of the cytochrome P450 multigene family CYP52 of Candida tropicalis. DNA Cell Biol. 11, 767-780.
Shah, V., et al., 2007. Utilization of restaurant waste oil as a precursor for sophorolipid production. Biotechnol Prog. 23, 512-515.
Sumita, T., et al., 2002. YlALK1 encoding the cytochrome P450ALK1 in Yarrowia lipolytica is transcriptionally induced by n-alkane through two distinct cis-elements on its promoter. Biochem Biophys Res Commun. 294, 1071-1078.

Takai, H., et al., 2012. Construction and characterization of a Yarrowia lipolytica mutant lacking genes encoding cytochromes P450 subfamily 52. Fungal Genet Biol. 49, 58-64.
Tenagy, et al., 2015. Involvement of acyl-CoA synthetase genes in n-alkane assimilation and fatty acid utilization in yeast Yarrowia lipolytica. FEMS Yeast Res. 15, fov031.
Van Bogaert, I. N., et al., 2009. Importance of the cytochrome P450 monooxygenase CYP52 family for the sophorolipid-producing yeast Candida bombicola. FEMS Yeast Res. 9, 87-94.
Van Bogaert, I. N., et al., 2013. The biosynthetic gene cluster for sophorolipids: a biotechnological interesting biosurfactant produced by Starmerella bombicola. Mol Microbiol. 88, 501-509.
Yadav, J. S., Loper, J. C., 1999. Multiple p450alk (cytochrome P450 alkane hydroxylase) genes from the halotolerant yeast Debaryomyces hansenii. Gene. 226, 139-146.
Yamagami, S., et al., 2001. Isolation and characterization of acetoacetyl-CoA thiolase gene essential for n-decane assimilation in yeast Yarrowia lipolytica. Biochem Biophys Res Commun. 282, 832-838.
Yamagami, S., et al., 2004. A basic helix-loop-helix transcription factor essential for cytochrome p450 induction in response to alkanes in yeast Yarrowia lipolytica. J Biol Chem. 279, 22183-22189.
Zimmer, T., et al., 1996. The CYP52 multigene family of Candida maltosa encodes functionally diverse n-alkane-inducible cytochromes P450. Biochem Biophys Res Commun. 224, 784-789.

Figure Legends Fig 1. n-Alkane assimilating yeasts and CYP52-family P450s. (A) Phylogenetic tree of D1/D2 regions of 26S ribosomal DNA in n-alkane assimilating yeasts was constructed using ClustalW (DDBJ, v2.1) and drawn using Njplot. D1/D2 region of S. pombe was used as an outgroup. The scale bar indicates 0.02 substitutions per site. The bootstrap values by 1000 repetitions are indicated. D1/D2 regions are derived from type strains of respective yeasts, and the accession numbers of sequences of D1/D2 regions from GenBank are as follows: C. albicans (U45776), C. maltosa (U45745), C. tropicalis (U45749), D. hansenii (U45808), L. elongisporus (U45763), M. guilliermondii (U45709), S. stipitis (U45741), S. pombe (U40085), S. bombicola (U45705), and Y. lipolytica (U40080). (B) Phylogenetic tree of CYP52-family P450s in n-alkane assimilating yeasts was constructed using ClustalW (DDBJ, v2.1) and drawn using Njplot. CYP51 of Y. lipolytica was used as an outgroup. The scale bar indicates 0.05 substitutions per site. The bootstrap values by 1000 repetitions are indicated. The accession numbers of sequences from UniProtKB are as follows: Alk1p (O74127), Alk2p (O74128), Alk3p (O74129), Alk4p (O74130), Alk5p (O74131), Alk6p (O74132), Alk7p (O74133), Alk8p (O74134), Alk9p (A0A0K2S2A7), Alk10p (Q6CDW4), Alk11p (Q6CCE5), Alk12p (Q6CGD9), CYP51 of Y. lipolytica (Q6CFP4), CYP52A21 (Q59K96), CYP52A22 (Q5AAH7), CYP52A23 (Q5AAH6), CYP52A24 (Q5A8M1), CYP52C3 (Q5AGW4), CYP52A3 (P1649), CYP52A5 (Q12581), CYP52A4 (P16141), CYP52A9 (Q12586), CYP52A10 (Q12588), CYP52A11 (Q12589), CYP52C2 (Q12587), CYP52D1 (Q12585), CYP52A1 (P10615), CYP52A2 (P30607), CYP52A6 (P30608), CYP52A7 (P30609), CYP52A8 (P30610), CYP52B1 (P30611), CYP52C1 (P30612), CYP52D2 (Q874J0), CYP52A43 (Q6BVP2), CYP52A44 (Q6BVH7), CYP52A45 (Q6BNW0), CYP52A46 (Q6BNV9), CYP52A47 (Q6BNV8), CYP52A48 (A5E5R8), CYP52A49 (A5E1L9), CYP52A50 (A5E1L8), CYP52A51 (A5E122), CYP52A52 (A5DRQ8), CYP52C7 (A5H2Q3), CYP52A39 (A5DD87), CYP52A40 (A5DRF4), CYP52A41 (A5DL54), CYP52A42 (A5DQW9), CYP52A53 (A3LRT5), CYP52A54 (A3LR60), CYP52A55 (A3LS01), CYP52A56 (A3LZV9), CYP52A57 (A3LSP0), CYP52E3 (B8QHP3), CYP52M1 (B8QHP1), and CYP52N1 (B8QHP5). Species are indicated in parentheses as follows: C. albicans (Ca), C. maltosa (Cm), C. tropicalis (Ct), D. hansenii (Dh), L. elongisporus (Le), Meyerozyma guilliermondii (Mg), Scheffersomyces stipitis (Ss), and Starmerella bombicola (Sb). The gray box indicates the monophyletic group of the Y. lipolytica CYP52F subfamily. P450s catalyzing the oxidation of n-alkanes or of the ω-termini of fatty acids are indicated as A or F, respectively.

Fig 2. N-terminal sequence of Alk9p. (A) Kyte-Doolittle hydropathy plots of Alk proteins. The accession numbers of sequences from UniProtKB are as follows: Alk1p (O74127), Alk2p (O74128), Alk3p (O74129), Alk4p (O74130), Alk5p (O74131), Alk6p (O74132), Alk7p (O74133), Alk8p (O74134), Alk9p (Q6CFK2), Alk10p (Q6CDW4), Alk11p (Q6CCE5), and Alk12p (Q6CGD9). Bars indicate the predicted transmembrane domains. (B) Nucleotide sequence of the 5’-coding and non-coding regions of ALK9 and its deduced amino acid sequence. Arrows indicate transcription start sites predicted by RACE analysis. The start codon ATG predicted by the Génolevures project is highlighted with thick underline. (C) Kyte-Doolittle hydropathy plot of Alk9p sequence predicted by RACE analysis. Bar indicates the predicted transmembrane domain.
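The hydropathy plots in this figure rest on a sliding-window average of per-residue hydropathy values. The following is a minimal sketch of that calculation using the published Kyte-Doolittle scale. The window length of 19 residues and the toy sequence are assumptions chosen for illustration, since the window used for these plots is not stated here, and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD_SCALE[residue] for residue in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(sequence) - window + 1)
    ]


# Toy sequence (hypothetical, NOT an Alk protein): a 20-residue leucine
# stretch flanked by charged residues, mimicking a membrane anchor.
toy_sequence = "MKK" + "L" * 20 + "DDRK"
profile = hydropathy_profile(toy_sequence, window=19)
print(len(profile), round(max(profile), 2))
```

A sustained stretch of high window averages is what a predicted transmembrane domain looks like in such a plot; any cutoff applied to the profile would be a further assumption.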

Fig 3. Expression of each Alk protein in Y. lipolytica cells. (A) Reporter assays of the 4xARR1 promoter and the TEF1 promoter. The wild-type CXAU/A1 strain containing p4ARR-lacZ or pTEF1p-lacZ was grown in the SD medium to logarithmic phase, transferred to YNB medium containing 2% glucose (Glc), 2% n-dodecane (C12), 2% dodecanoic acid (C12FA), 2% glucose with 2% n-dodecane (Glc + C12), 2% glucose with 2% dodecanoic acid (Glc + C12FA), or 2% glucose with 2% n-dodecane and 2% dodecanoic acid (Glc + C12 + C12FA), and incubated for 3 h. β-Galactosidase activities in the crude extracts were measured as described in the “Materials and methods” section. Each result represents an average of three independent experiments ± S.E. (B) mRNA levels of ALK genes in the alk1-12 strains expressing each ALK gene. The alk1-12 strains harboring the plasmids containing ALK1–ALK12 (1–12) under the TEF1 promoter were cultured to logarithmic phase in the SD medium. The copy number of mRNA was determined by qRT-PCR using specific primers for each ALK gene. Each result represents an average of three independent experiments ± S.E. There was a significant difference among these expression levels (P < 0.001, by Kruskal-Wallis test). (C) P450 contents in the alk1-12 strains expressing ALK genes under the TEF1 promoter. The alk1-12 strains harboring empty vector pSUT5 (vec) or plasmids containing ALK1–ALK12 (1–12) under the TEF1 promoter were cultured in the SD medium for 30 h. Reduced CO-difference spectra of whole yeast cells were measured, and the P450 contents were determined. Each result represents an average of three independent experiments ± S.E. *, statistically significant difference relative to the result of the alk1-12 strain containing pSUT5 (P < 0.05, unpaired t test, two-tailed). There was a significant difference among the P450 contents in the alk1-12 strains harboring plasmids containing ALK genes (P < 0.001, by ANOVA). (D) Scatter plot of the mRNA levels of ALK genes versus the P450 contents. Data obtained from (B) and (C) were plotted. The Pearson correlation coefficient was -0.316. (E) P450 contents in the alk1-12 strains expressing ALK genes under the 4xARR1 promoter. The wild-type strain (WT) and the alk1-12 strains harboring plasmids containing ALK1–ALK12 (1–12) under the 4xARR1 promoter were cultured in the SD medium for 24 h, after which they were incubated in the SD medium containing 2% n-dodecane for 6 h. Reduced CO-difference spectra of whole yeast cells were measured, and the P450 contents were determined. Each result represents an average of three independent experiments ± S.E. There was a significant difference among these expression levels (P < 0.001, by Kruskal-Wallis test). (F) Scatter plot of the P450 contents in the strains expressing ALK genes under the 4xARR1 promoter versus those in the strains expressing ALK genes under the TEF1 promoter. Data obtained from (C) and (E) were plotted. The Pearson correlation coefficient was 0.350.
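The Pearson correlation coefficients quoted for the scatter plots in (D) and (F) follow the standard definition: the covariance of the two variables divided by the product of their standard deviations. The following is a minimal sketch of that formula. The function name is hypothetical and the input lists are toy values, not the measurements plotted in this figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy inputs (hypothetical): perfectly correlated and anti-correlated series.
r_positive = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_negative = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(r_positive, r_negative)
```

Coefficients near zero, such as those reported in (D) and (F), indicate at most a weak linear relationship between the two quantities.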

Fig 4. Growth of the alk1-12 strain expressing each ALK gene under the 4xARR1 promoter. The wild-type CXAU1 strain harboring pSUT5 (WT) and the alk1-12 strains carrying pSUT5 (alk1-12), pS4RALK1 (ALK1), pS4RALK2 (ALK2), pS4RALK3 (ALK3), pS4RALK4 (ALK4), pS4RALK5 (ALK5), pS4RALK6 (ALK6), pS4RALK7 (ALK7), pS4RALK8 (ALK8), pS4RALK9 (ALK9), pS4RALK10 (ALK10), pS4RALK11 (ALK11), and pS4RALK12 (ALK12), were cultured at 30°C for 7 days on n-decane or n-hexadecane.

Fig 5. Production of dodecanedioic acid and 12-hydroxydodecanoic acid by the alk1-12 strains producing Alk proteins in the medium containing dodecanoic acid. (A) The alk1-12 strain expressing ALK3, ALK5, or ALK7 under the TEF1 promoter was cultured in the SD medium containing 0.5% dodecanoic acid for 1, 2, 3, 4, or 5 days. Dodecanedioic acid and 12-hydroxydodecanoic acid were extracted from the whole culture and analyzed by GC-MS as described in the “Materials and methods” section, and their amounts were quantified. Each result represents an average of three independent experiments ± S.E. (B) The alk1-12 strains harboring empty vector pSUT5 or plasmids containing ALK genes under the TEF1 promoter were cultured in the SD medium containing 0.5% dodecanoic acid for 3 days. Dodecanedioic acid, 12-hydroxydodecanoic acid, and 11-hydroxydodecanoic acid were extracted and analyzed by GC-MS as described in the “Materials and methods” section. Peaks corresponding to trimethylsilylated derivatives of dodecanedioic acid, 12-hydroxydodecanoic acid, and undecanoic acid (an internal standard) are indicated by black, gray, and white arrowheads, respectively. Black arrows indicate a trimethylsilylated derivative of 11-hydroxydodecanoic acid. (C) Quantitation of dodecanedioic acid and 12-hydroxydodecanoic acid produced in the alk1-12 cells expressing each Alk protein. The alk1-12 strains harboring empty vector pSUT5 (No) or plasmids containing ALK genes under the TEF1 promoter (1-12) were cultured as in B. The lower panel shows a magnified view of part of the upper panel. Each result represents an average of three independent experiments ± S.E. *, † statistically significant difference relative to the result of the alk1-12 strain expressing no Alk protein (P < 0.05, unpaired t-test, one-tailed, *; dodecanedioic acid, †; 12-hydroxydodecanoic acid).

Fig 6. The roles of Alk proteins in the oxidation of fatty alcohols and fatty aldehydes in Y. lipolytica. (A) Sensitivity of the alk1-12 strain to 1-dodecanol. The wild-type CXAU1 and alk1-12 strains were incubated in the SD medium containing 0 mM, 5 mM, 0.5 mM, or 0.05 mM 1-dodecanol with 0.1% Triton X-100 for 6 h. Cell viability was measured by phloxine B staining. Each result represents an average of three independent experiments ± S.E. *P < 0.05 (unpaired t-test, two-tailed). (B) The growth of the alk1-12hfd1-4 strain on dodecanal or tetradecanal. The wild-type CXAU1, alk1-12, hfd1-4, and alk1-12hfd1-4 strains were cultured overnight in the SD medium, and then spotted onto plates containing the indicated carbon sources in 5-fold serial dilutions starting with 5 μl of 5 OD600 units/ml or 2 μl of 12.5 OD600 units/ml. Cells were grown at 30°C for 2 days on glucose or 4 days on dodecanal or tetradecanal.
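The two starting conditions given for the spot assay in (B) deposit the same amount of cells, which a short calculation makes explicit. The following is a minimal sketch, assuming the spotted volumes are in microliters. The function name and the number of spots per series (five) are assumptions for illustration, since the legend does not state how many dilutions were spotted.

```python
def spot_series(volume_ul, od_units_per_ml, fold=5, n_spots=5):
    """OD600 units deposited in each spot of a serial dilution series."""
    # Convert microliters to milliliters, then multiply by the cell density.
    first_spot = volume_ul / 1000.0 * od_units_per_ml
    return [first_spot / fold ** i for i in range(n_spots)]


# The two starting conditions named in the legend.
series_a = spot_series(5, 5)      # 5 ul of 5 OD600 units/ml
series_b = spot_series(2, 12.5)   # 2 ul of 12.5 OD600 units/ml

for a, b in zip(series_a, series_b):
    print(f"{a:.6f}  {b:.6f}")
```

Both series start at 0.025 OD600 units per spot and fall 5-fold at each step, so plates prepared either way should be directly comparable.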

Fig 7. Phylogenetic trees of the CYP52-family P450s. A, Phylogenetic tree of Alk proteins in Y. lipolytica and substrates of Alk proteins. The phylogenetic tree of Alk proteins in Y. lipolytica was constructed using ClustalW (DDBJ, v2.1) and drawn using Njplot. CYP51 of Y. lipolytica was used as an outgroup. The scale bar indicates 0.05 substitutions per site. The bootstrap values by 1000 repetitions are indicated. The accession numbers of Alk proteins and CYP51 in Y. lipolytica from UniProtKB are given in the legend to Fig. 1B. Substrates determined in this study are indicated. Alk1p, Alk2p, Alk8p, Alk9p, and Alk11p have weak activities toward fatty acids, as shown in grey.

Table 1 Yeast strains used in this study.

Strain          Genotype                                                                 Source or reference
CXAU1           MATA ade1 ura3                                                           Iida et al. (1998)
CXAU/A1         CXAU1 ade1::ADE1                                                         Yamagami et al. (2004)
alk1-12         CXAU1 alk1 alk2 alk3 alk4 alk5 alk6 alk7 alk8 alk9 alk10 alk11 alk12     Takai et al. (2012)
hfd1-4          hfd1 hfd2 hfd3 hfd4                                                      Iwama et al. (2014)
alk1-12hfd1-4   alk1-12 hfd1 hfd2 hfd3 hfd4                                              This study

Table 2 Growth of the alk1-12 strain harboring each ALK gene on n-alkanes of various carbon lengths. Strain

Glc

C10

C11

C12

C13

C14

C15

C16

C17

C18

WT

+++

+++

+++

+++

+++

+++

+++

+++

+++

+++

alk1-12

+++

-

-

-

-

-

-

-

-

-

ALK1

+++

+++

+++

+++

+++

+++

+++

+++

+++

+++

ALK2

+++

±

±

±

+

±

++

+++

+++

+++

ALK3

+++

+++

+++

++

++

+++

+++

+++

+++

+++

ALK4

+++

-

-

-

-

-

-

-

-

-

ALK5

+++

-

-

-

-

-

-

-

-

-

ALK6

+++

+

+

++

++

+++

+++

+++

+++

+++

ALK7

+++

-

-

-

-

-

-

-

-

-

ALK8

+++

-

-

-

-

-

-

-

-

-

ALK9

+++

±

+

±

±

+

+

++

++

++

ALK10

+++

+++

+++

+++

+++

+++

+++

++

++

++

ALK11

+++

-

-

-

-

-

-

-

-

-

ALK12

+++

-

-

-

-

-

-

-

-

-

The wild-type CXAU1 carrying pSUT5 (WT) and the alk1-12 strains carrying pSUT5 (alk1-12) or plasmid harboring each ALK gene (ALK1 – ALK12) were cultivated on indicated n-alkanes for 15 days. Growth equivalent to the wild-type cells was indicated as ‘+++’.
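The chain-length preferences read off Table 2 can be summarized by converting the growth marks into numbers and comparing short and long n-alkanes. The following is a minimal sketch under stated assumptions: the numeric values assigned to each mark, the grouping into short (C10 to C12) and long (C16 to C18) chains, and the function name are all illustrative choices that are not defined in this study. Only strains that grew on at least one n-alkane are included, and the marks are transcribed from Table 2.

```python
# Assumed numeric mapping for the semi-quantitative growth marks.
SCORE = {"-": 0.0, "±": 0.5, "+": 1.0, "++": 2.0, "+++": 3.0}

# Growth marks on C10..C18, transcribed from Table 2 (Glc column omitted).
GROWTH = {
    "ALK1":  ["+++"] * 9,
    "ALK2":  ["±", "±", "±", "+", "±", "++", "+++", "+++", "+++"],
    "ALK3":  ["+++", "+++", "++", "++", "+++", "+++", "+++", "+++", "+++"],
    "ALK6":  ["+", "+", "++", "++", "+++", "+++", "+++", "+++", "+++"],
    "ALK9":  ["±", "+", "±", "±", "+", "+", "++", "++", "++"],
    "ALK10": ["+++"] * 6 + ["++"] * 3,
}


def chain_bias(marks):
    """Mean score on C16-C18 minus mean score on C10-C12.

    Positive values suggest a longer-chain preference, negative values a
    shorter-chain preference, and zero no difference at this resolution.
    """
    scores = [SCORE[mark] for mark in marks]
    short_mean = sum(scores[:3]) / 3
    long_mean = sum(scores[-3:]) / 3
    return long_mean - short_mean


bias = {gene: chain_bias(marks) for gene, marks in GROWTH.items()}
for gene, value in bias.items():
    print(f"{gene}: {value:+.2f}")
```

With these assumed scores, ALK2, ALK6, and ALK9 come out positive and ALK10 negative, in line with the preferences described in the Discussion; the small positive value for ALK3 reflects only its ++ mark on C12.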

Highlights
> Twelve CYP52-family P450s encoded by ALK genes of Yarrowia lipolytica were characterized.
> Subsets of the Alk proteins oxidize n-alkane(s) and/or the ω-terminus of dodecanoic acid.
> The ALK genes have multiplied and functionally diversified in Y. lipolytica.
