Sequence analysis of novel CYP4 transcripts from Mytilus galloprovincialis

Accepted Manuscript

Title: Sequence analysis of novel CYP4 transcripts from Mytilus galloprovincialis

Authors: Sanda Ravlić, Jurica Žučko, Mirta Smodlaka Tanković, Maja Fafanđel, Nevenka Bihari

PII: S1382-6689(15)30006-5
DOI: http://dx.doi.org/doi:10.1016/j.etap.2015.06.005
Reference: ENVTOX 2273

To appear in: Environmental Toxicology and Pharmacology

Received date: 14-3-2015
Revised date: 29-5-2015
Accepted date: 2-6-2015

Please cite this article as: Ravlić, S., Žučko, J., Tanković, M.S., Fafanđel, M., Bihari, N., Sequence analysis of novel CYP4 transcripts from Mytilus galloprovincialis, Environmental Toxicology and Pharmacology (2015), http://dx.doi.org/10.1016/j.etap.2015.06.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Abstract

Cytochrome P450 enzymes (CYPs) are essential components of the cellular detoxification system. We identified and characterized seven new cytochrome P450 gene transcript clusters in populations of the bivalve mollusc Mytilus galloprovincialis from three different locations. The phylogenetic analysis identified all transcripts as clusters within the CYP4 branch. The identified clusters, each comprising a number of transcript variants, were designated CYP4Y1, Y2, Y3, Y4, Y5, Y6 and Y7. Transcript clusters CYP4Y2 and Y7, and CYP4Y5 and Y6 showed site specificity, while transcript clusters CYP4Y1, Y3 and Y4 were present at all investigated locations. Comparison of the deduced amino acid sequences of the transcripts with CYP4s from vertebrate and invertebrate species showed high conservation of the residues and domains essential to the putative functions of the enzyme, such as terminal ω-hydroxylation and prostaglandin hydroxylation. Our results suggest a great expansion of CYP4Y cDNAs, indicative of CYP4 proteins, in the mussel M. galloprovincialis, presumably as a response to different environmental conditions.

Keywords: cytochrome P450, cDNA cloning, phylogenetic analysis, environment, bivalvia


Corresponding author:
Name and surname: Sanda Ravlić
Address: Ruđer Bošković Institute, Center for Marine Research, Giordano Paliaga 5, 52210 Rovinj, Croatia
Telephone: +38552804716
Fax: +38552813496
Email: [email protected]

 

1. Introduction

The cytochrome P450 family (CYPs) is, by number and function, the largest and most diverse protein superfamily found in nature (Estabrook, 2003). CYP enzymes use molecular oxygen to modify numerous substrates involved in a huge number of physiological, ecological and toxicological processes. CYP substrates include exogenous compounds such as polycyclic aromatic hydrocarbons (PAH), pesticides and plant allelochemicals, as well as endogenous compounds such as steroids, fatty acids and eicosanoids (Feyereisen, 1999; James and Boyle, 1998; Whalen et al., 2010).

The presence of CYPs in all living beings, ranging from bacteria to plants and animals, implies the existence of a common ancestral gene, which has undergone consecutive gene duplications and subsequent divergent evolution, resulting in the formation of a supergene family (Nebert and Gonzalez, 1987; Nelson and Strobel, 1987). Based on phylogenetics and sequence identity, CYP genes are classified into clans, families and subfamilies (Nelson et al., 1996). To date, 12 456 CYPs have been named, with about 6000 more that are known but not yet named, all belonging to over 1000 families that have been identified in all three domains of life. In animals alone, 11 distinct clades have been described, encompassing 4088 named sequences placed in 156 CYP gene families. Vertebrate CYPs are distributed in 19 families and a growing number of subfamilies, e.g. 57 genes in humans and 102 genes in mouse (Nelson, 2011; Nelson et al., 2004). Increasing genomic information available for invertebrates already data-mined for their P450s, such as the sea anemone Nematostella vectensis (Goldstone, 2008), the purple sea urchin Strongylocentrotus purpuratus (Goldstone et al., 2006) and the owl limpet Lottia gigantea (Gotoh, 2012), points to a greater diversity of invertebrate CYPs than that observed in vertebrates.

The induction of CYP1A transcripts, protein or enzyme activity has long been employed as a biomarker of exposure of fish and other vertebrates to anthropogenic contaminants in aquatic environments. A number of studies have searched for CYP responses in bivalve molluscs similar to those found in vertebrates. Interpretation of the results of studies on molluscs has been complicated by cross-reactivity of antibodies to vertebrate CYP1As with non-P450 molluscan proteins, or by low enzymatic activity with known CYP substrates (Livingstone, 1991; Livingstone et al., 2000; Peters et al., 1998; Shaw et al., 2004; Snyder, 2000). While evidence of deuterostome (e.g. tunicate and sea urchin) CYP1A-like proteins and CYP1A-like genes exists (Goldstone et al., 2007), to date no full-length CYP1-like sequences have been reported for molluscs (Rewitz et al., 2006). However, studies of insect CYP4 genes indicated their involvement in some forms of insecticide resistance, and their induction by alkaloids and phenobarbital suggests that CYP4 isoforms of cytochrome P450 are responsible for the metabolism of xenobiotics in invertebrates (Danielson et al., 1998).

The CYP4 family is considered one of the most ancient CYP families, having evolved from the steroid-synthesizing CYPs and since diverged into an array of subfamilies and genes encoding enzymes acting on diverse substrates (Simpson, 1997). In vertebrates, CYP4 genes are predominantly fatty acid ω-hydroxylases, engaged in preventing lipotoxicity by hydroxylating eicosanoids, prostaglandins and leukotrienes (Hardwick, 2008). Although little is known about the functions of the CYP4 family in bivalves, complete CYP4 cDNAs have been identified in Chlamys farreri (Miao et al., 2011) and Venerupis philippinarum (Pan et al., 2011), whereas partial sequences have been cloned in Unio tumidus (Chaty et al., 2004), Mytilus galloprovincialis (Snyder, 1998) and Perna viridis (Zhou Chi, 2010).

There is a need for new knowledge of CYP genes and their function in bivalves, partly to understand CYP gene evolution in this old invertebrate class, and partly because these animals possess biological features that make them excellent for pollution monitoring. Bivalve molluscs are sessile filter feeders known to accumulate foreign organic chemicals (Bihari et al., 2007; Stegeman and Teal, 1973) and metals (Peric et al., 2012), and have been employed extensively as sentinels in monitoring programs, e.g. the Mussel Watch Program in the USA and MEDPOL programs in Europe.

In the present study we describe novel cDNA sequences coding for CYP4 proteins in the mussel M. galloprovincialis. Furthermore, we investigated whether environmental conditions and seasonal changes influence the presence of CYP4 transcript variants and their distribution. The identification and characterization of novel CYP4 transcript variants would contribute to current knowledge of detoxification metabolism in bivalve molluscs, while analysis of the distribution of CYP4 genes expressed in mussels could become useful in programs for detecting environmental pressures on a local, regional or even worldwide scale.

2. Materials and methods

2.1. Animal collection

Mussels M. galloprovincialis Lamarck, 1819 (Mollusca: Bivalvia) of average mass 10 ± 2 g and length 4 ± 1 cm were collected from natural populations at three different locations in the Northern Adriatic Sea (Figure 1). Ten mussel specimens were collected from each sampling site every second month, from September 2012 until May 2013. Sampling site Budava (the uncontaminated sampling site) is a protected area known for mariculture production. Sampling site Pula is in the inner part of the Pula harbor, in the close vicinity of a highly urbanized area and industrial facilities, and is influenced by industrial and/or urban runoff from a shipyard and urban waste. Sampling site Dina is in the close vicinity of the organic petrochemical industry; according to the Croatian National Institute of Public Health, no impact of the industry on the quality of seawater and sediment was determined („Report on monitoring the impacts of the DINA Petrokemija d.d. Omišalj on the environment in 2008“; in Croatian). Mussels were transported in seawater tanks to the laboratory, and the digestive glands were dissected within 1 h following collection and then immediately processed further.

2.2. RNA extraction and cDNA synthesis

Total RNA was isolated from 100 mg of individual mussel digestive gland tissue using TRI Reagent® Solution (Ambion, Austin, USA) according to the manufacturer's protocol. RNA quantity, purity and integrity were verified by native RNA electrophoresis on a 1% agarose gel in 1X TAE (Tris-acetate-EDTA, pH 8.6) buffer and by UV absorbance ratios (A260/A280, A260/A230) measured using the NanoPhotometer™ Pearl (Implen GmbH, München, Germany). First strand cDNA was reverse transcribed from 2 µg of total RNA using anchored-oligo(dT)18 and random hexamer primers according to the Roche protocol for the Transcriptor First Strand cDNA Synthesis Kit (Roche, Basel, Switzerland).
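As a rough illustration of the spectrophotometric check described above, the following sketch estimates RNA concentration from A260, evaluates the two purity ratios, and computes the volume holding the 2 µg of RNA used for reverse transcription. The factor of 40 µg/mL per A260 unit is the usual convention for single-stranded RNA; the acceptance thresholds are common rules of thumb and are assumptions here, not values stated in the protocol. The function name and the example readings are hypothetical.

```python
def rna_quality(a260, a280, a230, dilution=1.0, input_ug=2.0):
    """Summarize RNA yield and purity from UV absorbance readings."""
    conc_ug_per_ml = a260 * 40.0 * dilution      # 40 µg/mL per A260 unit (ssRNA convention)
    ratio_280 = a260 / a280                      # protein contamination indicator
    ratio_230 = a260 / a230                      # salt / phenol contamination indicator
    # Assumed rule-of-thumb thresholds, not taken from the manuscript.
    acceptable = 1.8 <= ratio_280 <= 2.1 and ratio_230 >= 1.8
    volume_ul = input_ug / (conc_ug_per_ml / 1000.0)   # µL containing input_ug of RNA
    return {
        "conc_ug_per_ml": conc_ug_per_ml,
        "A260/A280": ratio_280,
        "A260/A230": ratio_230,
        "acceptable": acceptable,
        "volume_ul_for_input": volume_ul,
    }


# Hypothetical reading of a 50-fold diluted sample.
report = rna_quality(a260=0.5, a280=0.25, a230=0.25, dilution=50)
print(report)
```

Gel electrophoresis remains necessary for integrity, since absorbance ratios say nothing about degradation.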

2.3. Rapid amplification of cDNA ends (RACE)

In order to obtain a full-length CYP4 cDNA, 5' and 3' RACE-PCRs (polymerase chain reactions) were carried out on the total RNA using the GeneRacer™ Kit (Invitrogen, Waltham, USA) according to the manufacturer's protocol. Gene-specific primers (Frace, Rrace; Table 1) were designed based on the partial sequence of the M. galloprovincialis CYP4Y1 (AF072855.1). For 3' RACE of CYP4, amplification of first strand cDNA was conducted using Frace, the forward gene-specific primer (Table 1), and the GeneRacer™ 3' Primer. Amplification conditions for touchdown PCR were the following: 2 min denaturation at 94 °C; 10 cycles of 94 °C for 30 s, 65 °C for 30 s, 72 °C for 1 min; 15 cycles of 94 °C for 30 s, 62 °C for 30 s, 72 °C for 1 min; 10 cycles of 94 °C for 30 s, 58 °C for 30 s, 72 °C for 1 min; followed by an additional extension at 72 °C for 10 min. The amplified PCR product was analyzed by electrophoresis on a 1% agarose gel. The target gene band was purified using the MinElute™ Gel Extraction Kit (Qiagen, Hilden, Germany) and ligated into the pGEM®-T Easy vector (Promega, Madison, USA). Vectors containing cloned inserts were transformed into Subcloning Efficiency™ DH5α™ Competent Cells (Invitrogen, Waltham, USA) and incubated overnight at 37 °C. Positive clones were identified by blue/white screening and PCR screening with pUC/M13 forward and pUC/M13 reverse sequencing primers (Table 1) and sequenced in both directions (Macrogen Inc., Seoul, Korea).

For 5' RACE of CYP4, after selective ligation of a GeneRacer™ RNA Oligo to the 5' ends of decapped mRNA, first strand cDNA was reverse transcribed using R2, the reverse gene-specific primer (Table 1). Amplification of first strand cDNA was conducted using Rrace, the reverse gene-specific primer (Table 1), and the GeneRacer™ 5' Primer. Amplification conditions for touchdown PCR were the following: 3 min denaturation at 94 °C; 5 cycles of 94 °C for 30 s, 65 °C for 30 s, 72 °C for 1 min 30 s; 10 cycles of 94 °C for 30 s, 62 °C for 30 s, 72 °C for 1 min 30 s; 10 cycles of 94 °C for 30 s, 60 °C for 30 s, 72 °C for 1 min 30 s; 5 cycles of 94 °C for 30 s, 58 °C for 30 s, 72 °C for 1 min 30 s; then an additional extension at 72 °C for 10 min. The PCR product was gel purified, cloned and sequenced as described above. All PCR primers were obtained from Sigma-Aldrich (St. Louis, USA).

2.4. Cloning of the CYP4 cDNA

Once the initiation and termination codons of M. galloprovincialis CYP4 were identified, primers were designed to amplify the full-length CYP4 cDNAs in a C1000™ Thermal Cycler (BioRad, Hercules, USA). The PCR reactions were performed in a total volume of 25 µL, with the PCR mixture containing reaction buffer with 20 mM MgCl2, 0.5 U of DreamTaq Green DNA Polymerase (Thermo Scientific, Waltham, USA), 1 mM MgCl2 (Applied Biosystems, Foster City, USA), 0.8 mM dNTP mix (Fermentas, Thermo Scientific, Waltham, USA), 0.4 µM of the sequence-specific primers 5-1 and 3-1 (Table 1) and template cDNA (first strand cDNA, described in section 2.2). For full-length CYP4 amplification, PCR was conducted at 94 °C for 2 min, followed by 30 cycles of 94 °C for 30 s, 52 °C for 30 s, 72 °C for 1 min 40 s, followed by an additional extension at 72 °C for 10 min. The resulting PCR products were purified, cloned and sequenced as previously described. A minimum of 8 clones were sequenced for every PCR fragment.
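The three cycling programs described in sections 2.3 and 2.4 can be written down as data, which makes it easy to compare them. The sketch below is only an illustration: the representation, the function names and the variable names are hypothetical, and the times are programmed hold times that ignore instrument ramping.

```python
def build_program(initial_s, cycle_blocks, extension_s, final_s=600):
    """Return a PCR program as a list of (repeats, [(temp_C, seconds), ...]).

    cycle_blocks is a list of (number_of_cycles, annealing_temp_C).
    """
    program = [(1, [(94, initial_s)])]                       # initial denaturation
    for cycles, anneal_c in cycle_blocks:
        program.append((cycles, [(94, 30), (anneal_c, 30), (72, extension_s)]))
    program.append((1, [(72, final_s)]))                     # final extension
    return program


def summarize(program):
    """Total number of amplification cycles and total hold time in seconds."""
    cycles = sum(n for n, steps in program if len(steps) > 1)
    hold_s = sum(n * sum(sec for _, sec in steps) for n, steps in program)
    return cycles, hold_s


# Values taken from the text; annealing temperature steps down in the touchdown runs.
RACE_3 = build_program(120, [(10, 65), (15, 62), (10, 58)], extension_s=60)
RACE_5 = build_program(180, [(5, 65), (10, 62), (10, 60), (5, 58)], extension_s=90)
FULL_LENGTH = build_program(120, [(30, 52)], extension_s=100)

for name, prog in [("3' RACE", RACE_3), ("5' RACE", RACE_5), ("full-length", FULL_LENGTH)]:
    cycles, hold_s = summarize(prog)
    print(f"{name}: {cycles} cycles, {hold_s / 60:.0f} min of programmed hold time")
```

Written this way, the 3' RACE program comes to 35 cycles and the other two to 30 cycles each.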

2.5. Sequence analysis and structural alignment

Sequences were aligned using the MUSCLE algorithm implemented in MEGA 5. Homology searches of nucleotide and protein sequences were conducted with the BLAST program (http://www.ncbi.nlm.nih.gov/blast). Open reading frames were defined using the NCBI ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html).
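To make the idea of open reading frame detection concrete, here is a minimal sketch that scans the three forward reading frames for the longest ATG-to-stop stretch. It is not the NCBI ORF Finder: it assumes the standard genetic code, ignores the reverse strand, and uses a short made-up sequence rather than any sequence from this study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, orf) for the longest ATG..stop ORF on the forward strand.

    Coordinates are 0-based and half-open; the stop codon is included in the ORF.
    """
    seq = seq.upper()
    best = (0, 0, "")
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                          # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > len(best[2]):
                    best = (start, i + 3, seq[start:i + 3])
                start = None                       # look for the next ORF in this frame
    return best


# Hypothetical toy sequence: 2 nt of 5' flank, a 15 nt ORF, 2 nt of 3' flank.
start, end, orf = longest_orf("CCATGAAATTTGGGTAACC")
print(start, end, orf)
```

Applied to a cDNA, the same scan separates the 5' UTR (everything before `start`) from the ORF and the 3' UTR (everything from `end` onward).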

To infer phylogenetic relationships among the newly sequenced clones of MgCYP4Y gene transcripts and the RACE MgCYP4Y product, phylogenetic reconstruction using the maximum likelihood (ML) method with 1000 bootstraps was carried out on nucleotide data. The Tamura 3-parameter model with gamma distribution was used as the substitution model, based on the results of the model selection function implemented in MEGA 5. The initial tree was obtained using the neighbor-joining (NJ) method on a matrix of pairwise distances estimated using the Maximum Composite Likelihood approach.
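The bootstrap step mentioned above rests on resampling alignment columns with replacement and rebuilding the tree from each pseudo-replicate. The sketch below shows only the resampling part, under the assumption of an alignment stored as a dictionary of equal-length strings; tree inference itself is left to a program such as MEGA. The function name and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]   # sampled column indices
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment, not data from this study.
toy = {"seqA": "ACGT", "seqB": "ACGA", "seqC": "TCGA"}
rng = random.Random(0)                      # fixed seed so the run is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

The bootstrap support of a clade is then the fraction of replicate trees in which that clade appears.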

Multiple sequence alignment of the RACE MgCYP4Y deduced protein sequence with representative protein sequences of other CYP families was created using ClustalW version 2.12. Representative protein sequences from CYP1, CYP2, CYP3 and CYP4 were retrieved from the GenBank or Ensembl databases. Representative protein sequences from the CYP4 family, used for the phylogenetic analysis of the evolutionary relationship between M. galloprovincialis and other invertebrate/vertebrate CYP4s, were downloaded from the Cytochrome P450 Homepage (http://drnelson.uthsc.edu/CytochromeP450.html). All phylogenetic analyses were carried out using MEGA 5. Maximum likelihood (ML), maximum parsimony (MP) and neighbor-joining (NJ) methods with 1000 bootstraps were used to generate the respective trees. All ambiguous positions were removed for each sequence pair. For the NJ method the JTT model with gamma distribution was used, while for the ML method the WAG model with gamma distribution was used. The initial tree for ML was obtained with the NJ method using the JTT model for pairwise distance estimation. For the MP method default settings were used.

Identification of putative substrate recognition sites (SRS) of MgCYP4Ys was accomplished by using the bacterial CYP102 (Ravichandran et al., 1993) as a template to highlight (putative) active site residues. MEGA 5 and ClustalW were used to align the RACE MgCYP4Y deduced protein sequence with CYP102 and selected mammalian CYPs whose SRSs have previously been determined (Gotoh, 1992; Kalsotra et al., 2004; Loughran et al., 2000). The alignment was visualized in BioEdit v7.2.0 (Hall, 1999), and SRSs for M. galloprovincialis CYP4Ys were generated by copying the backbone coordinates from CYP102. Because of the high sequence identities among all MgCYP4Y transcripts, and to keep the overview of the sequence analysis simple, only the deduced protein sequence of RACE MgCYP4Y was used in the alignment.
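Transferring annotated positions from a template to a target sequence, as done here with CYP102, amounts to walking along a gapped pairwise alignment and keeping two residue counters. The sketch below illustrates that bookkeeping only; the function name is hypothetical and the aligned strings are toy examples, not the CYP102 or MgCYP4Y sequences.

```python
def map_positions(template_aln, target_aln, template_positions):
    """Map 1-based template residue numbers to 1-based target residue numbers.

    Both inputs are rows of the same alignment ('-' marks a gap). A template
    residue aligned to a gap in the target maps to None.
    """
    wanted = set(template_positions)
    mapping = {}
    t_pos = q_pos = 0
    for t_char, q_char in zip(template_aln, target_aln):
        if t_char != "-":
            t_pos += 1                       # advance template numbering
        if q_char != "-":
            q_pos += 1                       # advance target numbering
        if t_char != "-" and t_pos in wanted:
            mapping[t_pos] = q_pos if q_char != "-" else None
    return mapping


# Hypothetical toy alignment: the target has an insertion and a deletion.
print(map_positions("AC-DEF", "ACGD-F", [1, 3, 4, 5]))
```

Numbering offsets between template and target, such as those between CYP102 and MgCYP4Y residues, arise exactly from the gaps that this loop steps over.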

To study sequence variability among all MgCYP4Y transcripts, the alignment of 41 sequences generated in MEGA was submitted to the WebLogo software (http://weblogo.berkeley.edu/). WebLogo depicts an alignment as a sequence logo (Schneider and Stephens, 1990), in which each alignment position is represented as a stack of amino acid letters. The height of each stack corresponds to the amino acid conservation at that position. When the amino acid residue is invariant only one letter is shown, and the substitutions are shown when the residue is variable. Amino acid substitutions were analyzed using the 250 PAM transmembrane protein exchange matrix (Jones et al., 1994).
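The stack height in a sequence logo is the information content of the alignment column, and each letter is scaled by its frequency. A minimal sketch of that calculation for protein columns follows; it omits the small-sample correction of Schneider and Stephens, simply skips gap characters (an assumption about gap handling), and uses made-up columns.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content in bits of one alignment column (no small-sample correction)."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total)
                   for n in Counter(residues).values())
    return math.log2(alphabet_size) - entropy


def letter_heights(column, alphabet_size=20):
    """Height of each letter in the stack: frequency times column information."""
    residues = [c for c in column if c != "-"]
    info = column_information(column, alphabet_size)
    return {aa: info * n / len(residues) for aa, n in Counter(residues).items()}


# Hypothetical columns: one invariant, one with a single substitution.
print(column_information("WWWW"))
print(letter_heights("WWWR"))
```

An invariant column reaches the maximum of log2(20) bits, which is why it appears as a single tall letter, while a variable column yields a shorter stack split among the observed residues.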

3. Results

3.1. Identification of M. galloprovincialis CYP4

The complete nucleotide sequence of the M. galloprovincialis CYP4Y cDNA obtained by the RACE-PCR method (RACE MgCYP4Y) was 1782 bp long, containing a partial 5' untranslated region (UTR) of 36 bp, a 3' UTR of 210 bp with a poly(A) tail, and an open reading frame (ORF) of 1536 bp. The RACE MgCYP4Y nucleotide sequence matched the 447 bp of the previously reported partial sequence of M. galloprovincialis CYP4Y1 (AF072855.1; Snyder, 1998), with 95% identity over the whole length of the fragment. BlastX analysis identified CYP4 genes from several molluscs as the most similar to RACE MgCYP4Y, sharing up to 55% identity. The RACE MgCYP4Y deduced protein sequence was analyzed together with other CYP amino acid sequences to determine the phylogenetic relationship among the CYP subfamily genes from mammals, fish and invertebrates (Supplementary File 1). Phylogenetic analysis identified the RACE MgCYP4Y sequence as a member of cytochrome P450 clan 4. The RACE MgCYP4Y sequence was submitted to NCBI GenBank (accession number: KJ364531).

To define the range of distinct CYP4Y transcript sequences present in M. galloprovincialis specimens (MgCYP4Y) living in different environmental conditions, mussels were collected at three different locations (Budava, Dina and Pula; Figure 1) and subsequently subjected to RNA extraction, cDNA synthesis and cloning. Total RNA isolated from the individual mussels sampled in summer and autumn, from all three sampling areas, yielded no or very low quantities of MgCYP4Y transcripts (results not shown). Sufficient amounts of MgCYP4Y transcripts for the subsequent experiments were obtained from the total RNAs isolated from the individual mussels sampled in November and January at a sea temperature of 11 °C, and also in April from location Budava at a sea temperature of 13 °C. In total, RACE and cDNA cloning efforts generated 42 different cDNA sequences from 24 specimens (12 from Budava, 4 from Dina and 8 from Pula). An alignment of the obtained nucleotide sequences was used to construct a maximum likelihood tree that assisted in the identification of possible distinct MgCYP4Y transcripts (Figure 2). The P450 Nomenclature Committee assigned the MgCYP4Y transcripts to seven distinct clusters (CYP4Y1, CYP4Y2, CYP4Y3, CYP4Y4, CYP4Y5, CYP4Y6 and CYP4Y7) within the CYP4Y subfamily, each comprising a number of transcript variants. Although sequence CYP4Y3v5 is an outgroup to the sister clades CYP4Y2 and CYP4Y3, it was assigned to the CYP4Y3 cluster by the P450 Nomenclature Committee based on pairwise comparison. All identified sequences have been submitted to NCBI GenBank with accession numbers KJ364532 through KJ364572.

3.2. MgCYP4Y sequence analysis

The open reading frames of the MgCYP4Y cDNA sequences encoded deduced protein sequences of 510 amino acids. Sequence alignment showed that the distance between MgCYP4Y deduced protein sequences ranged from 1.5 to 4.2% (Figure 3). Analysis of all seven clusters revealed the CYP4Y6 cluster as the most diverse one. It comprised three putative transcript variants (CYP4Y6v1-v3) that share 98.6% amino acid identity and one pseudogene transcript (CYP4Y6v4p), in which a mutation at position 381 bp has led to a stop codon (and a loss of 383 amino acids) truncating the protein, likely resulting in a nonfunctional P450 protein lacking its heme-binding domain. Within the CYP4Y7 cluster there were three distinct transcript variants (CYP4Y7v1-v3) sharing 99.5% amino acid identity, while the three transcript variants within the CYP4Y5 cluster (CYP4Y5v1-v3) shared an amino acid identity of 98.8%. CYP4Y1 is by number the largest cluster, comprising 12 transcripts: 11 transcript variants (CYP4Y1v1-v7, v9-v12) that share 99% identity and one pseudogene transcript (CYP4Y1v8p) characterized by a mutation at position 835 bp and a loss of 230 amino acids, presumably resulting in a nonfunctional protein lacking its heme-binding domain. CYP4Y2 is the second largest cluster within the CYP4Y subfamily, comprising 8 transcript variants all characterized by a one amino acid deletion at position 424 in the protein sequence. Although the CYP4Y3 (CYP4Y3v1-v5) and CYP4Y4 (CYP4Y4v1-v6) clusters have an amino acid identity higher than 98%, based on phylogenetic analysis they form two separate groups, both sharing 99% in-group identity.
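The identity values quoted above are pairwise comparisons over aligned sequences. As a hedged illustration of how such a figure is obtained, the sketch below counts matching residues over the alignment columns in which neither sequence has a gap; the exact gap treatment used by MEGA may differ, and the sequences shown are invented.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned fragments differing at one of ten positions.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(identity, 100.0 - identity)   # identity and the corresponding p-distance in percent
```

The uncorrected distance (p-distance) is simply the complement of the identity over the same columns.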

The seven clusters identified in this study differed in their distribution among sampling sites. The CYP4Y1, CYP4Y3 and CYP4Y4 clusters were present at all locations. Location specificity was found for the CYP4Y2, Y5, Y6 and Y7 clusters: the CYP4Y2 and CYP4Y5 clusters were present only at sampling site Budava, whereas the CYP4Y6 and CYP4Y7 clusters were specific for sampling site Pula.
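The distribution described above can be summarized as a presence table and queried with set operations. The sketch below only restates what the text reports; the variable names are hypothetical.

```python
SITES = {"Budava", "Dina", "Pula"}

# Presence of each transcript cluster per sampling site, as reported in the text.
presence = {
    "CYP4Y1": {"Budava", "Dina", "Pula"},
    "CYP4Y2": {"Budava"},
    "CYP4Y3": {"Budava", "Dina", "Pula"},
    "CYP4Y4": {"Budava", "Dina", "Pula"},
    "CYP4Y5": {"Budava"},
    "CYP4Y6": {"Pula"},
    "CYP4Y7": {"Pula"},
}

ubiquitous = sorted(c for c, sites in presence.items() if sites == SITES)
site_specific = {c: next(iter(sites))
                 for c, sites in sorted(presence.items()) if len(sites) == 1}
at_dina = sorted(c for c, sites in presence.items() if "Dina" in sites)

print("all sites:", ubiquitous)
print("site-specific:", site_specific)
print("found at Dina:", at_dina)
```

Laid out this way, it is apparent that no cluster was specific to Dina, where only the three ubiquitous clusters were found.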

3.3. CYP4 structural alignment

Multiple alignment of the MgCYP4Y deduced protein sequences with CYP4 sequences from several invertebrates and vertebrates, and also with CYP102 from Bacillus megaterium, demonstrated high similarity in the conserved regions among all CYP4 sequences. Because of the high sequence identities among all MgCYP4Y transcripts, and to keep the overview of the sequence analysis simple, only the RACE MgCYP4Y deduced protein sequence was used in the alignment (Figure 4). The following conserved regions were identified, starting from the C-terminus: i) the absolutely conserved CYP signature sequence, FxxGxxxCxG (F461SAGPRNCIG470), corresponding to the heme-binding domain and serving as a fifth ligand to the heme iron; ii) the consensus sequence ExxR (E388GMR391), needed to stabilize the core structure of the protein (Werck-Reichhart and Feyereisen, 2000); iii) the conserved GxxT motif (E323VDTFMFEGHDTT335), corresponding to the proton transfer groove during monooxygenation, with an absolutely conserved glutamic acid residue that is unique to the CYP4 family and absent in other P450 families (Williams et al., 2000); iv) the highly conserved WxxxR (W152ARSR156) domain towards the N-terminus, which neutralizes the charge of one of the propionate side chains of the heme group through the tryptophan nitrogen and the basic arginine (Graham and Peterson, 2002).
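Motifs written in the "x" notation used above translate directly into regular expressions, where "x" stands for any residue. The following sketch locates each motif in the short fragments quoted in the text; the full RACE MgCYP4Y protein sequence is not reproduced here, so positions are relative to each fragment, and the helper names are hypothetical.

```python
import re


def motif_to_regex(motif):
    """Turn a motif such as 'FxxGxxxCxG' into a regex string; 'x' matches any residue."""
    return "".join("." if ch == "x" else re.escape(ch) for ch in motif)


def find_motif(motif, protein):
    """Return 1-based start positions of all (possibly overlapping) motif matches."""
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")   # lookahead allows overlaps
    return [m.start() + 1 for m in pattern.finditer(protein)]


# Fragments quoted in the text, keyed by motif; the number is the position of
# the first fragment residue in RACE MgCYP4Y.
fragments = {
    "FxxGxxxCxG": ("FSAGPRNCIG", 461),
    "ExxR": ("EGMR", 388),
    "GxxT": ("EVDTFMFEGHDTT", 323),
    "WxxxR": ("WARSR", 152),
}

for motif, (fragment, offset) in fragments.items():
    hits = [offset + pos - 1 for pos in find_motif(motif, fragment)]
    print(motif, "starts at protein position(s)", hits)
```

Within the GxxT fragment the match begins at the ninth residue, so the glycine of the motif sits just after the conserved glutamic acid at position 330.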

The bacterial CYP102 (CYPBM3; Ravichandran et al., 1993) backbone was used as a template to define putative substrate recognition sites (SRS) within the RACE MgCYP4Y deduced protein sequence. Amino acid identity between bacterial CYP102 and MgCYP4Y sequences ranged from 18.9 to 21.9%. In addition, human CYP4F11 and rabbit CYP4A4 and CYP4A7 were included in the alignment to aid in comparing previously annotated active site residues between mammalian fatty acid hydroxylases and MgCYP4Y forms. Despite the low overall sequence similarity, residues responsible for protein activity are absolutely conserved among all the sequences within the alignment, including T261, F262, G266 and T269 from CYP102. Three residues located near the heme core in CYP102 (L76, A265 and A329) are replaced by bulkier residues in the RACE MgCYP4Y deduced protein sequence (Y133, E330 and V396), in concordance with the bulkier residues also found in human CYP4A11. Y204 and F252 in rabbit CYP4A4, one of which is located outside of the putative M. galloprovincialis SRS sites defined here, match the same residues found in M. galloprovincialis CYP4Y sequences.

To show the sequence peculiarities of the transcripts, we used WebLogo to generate sequence logos, which are graphical representations of patterns within a multiple sequence alignment. The distinct features that characterized the novel MgCYP4Y sequences (Supplementary File 2) within the conserved regions defined here were: substitutions in the ExxR consensus sequence, 378M/V in the CYP4Y2 gene cluster and 379R/G in the CYP4Y7v3 transcript variant; a 322T/S substitution in the GxxT domain in the isoform CYP4Y3v4; a 140W/R substitution in the transcript variants CYP4Y7v3 and CYP4Y1v10 (WxxxR domain); and a 144R/H substitution in the pseudogene transcript CYP4Y1v8p (WxxxR domain). Analysis with the 250 PAM transmembrane protein exchange matrix (Jones et al., 1994) revealed that all amino acid substitutions indicated above could be classified as neutral substitutions. As a peculiarity, a deletion of the amino acid at position 424 was found in the whole CYP4Y2 cluster.

3.4. Phylogenetic analysis

A maximum likelihood phylogenetic tree constructed using the 41 M. galloprovincialis CYP4Y deduced amino acid sequences and 56 representative CYP4 protein sequences from numerous vertebrate, mollusc, crustacean and insect species revealed the relationship among CYP4 subfamilies (Figure 5). In general, the tree showed that the sequences within the CYP4Y subfamily fall within the larger clade containing CYP4 sequences from other molluscs and annelids, whose most closely related vertebrate homologs are found within CYP4 subfamilies F, A, Z, X and T (Figure 5). BlastX searches revealed full-length MgCYP4Y sequences as most similar to the Ruditapes philippinarum CYP4 coding sequence (ACM16804.2), sharing 55% identity, to the Crassostrea gigas CYP4F22 coding sequence (EKC34228.1), sharing 53% identity, and to the Meretrix meretrix CYP4 coding sequence (AGC92781.1), sharing 51% identity, all situated in the molluscan CYP4 branch of the phylogenetic tree (Figure 5).

4. Discussion

Over the past decade, PCR-based cloning used to study the diversity of the cytochrome P450 superfamily has successfully identified numerous new CYP gene sequences. This study provides the first evidence linking the presence of specific CYP4 gene transcripts in the marine mussel M. galloprovincialis (MgCYP4Ys) to different environmental conditions. The MgCYP4Y isoforms described here are the first full-length CYP4 protein sequences reported in Mytilidae, an economically and ecologically very important family of bivalve molluscs. These CYP sequences are most homologous to CYP4 enzymes and contain the glutamic acid residue found only among members of the CYP4 family (Colas and Ortiz de Montellano, 2003). This highly conserved residue has previously also been reported as invariant among CYP4 family members (Liu and Zhang, 2004; Miao et al., 2011; Pan et al., 2011). Apart from the absolutely conserved glutamic acid residue, alignment of the deduced RACE MgCYP4Y sequence with CYP4s from other species shows conservation along the whole length of the MgCYP4Y sequences. This indicates that conserved regions in the CYP proteins are involved in functions essential for the enzymes and are therefore highly conserved, despite the relatively low overall sequence similarity (Jorgensen et al., 2005). Even though MgCYP4Y sequences share less than 40% amino acid sequence identity with full-length vertebrate CYP4 members, they have been included in the CYP4 family, in recognition of the ancient origin of the CYP4 family and the view that phylogenetic relationships need to be considered in assigning nomenclature (Nelson et al., 2004).

In this study, seven distinct MgCYP4Y transcript clusters were inferred, each comprising a number of transcript variants, and all sharing high nucleotide sequence identity, some greater than 97%, the conventional cutoff for considering two sequences to be alleles at one locus (Nebert and Gonzalez, 1987). Given these high sequence identities among all transcript variants, it is not possible to determine the exact number of transcript variants vs. MgCYP4Y gene transcripts present in a single organism. Moreover, there are examples of distinct P450 genes sharing greater than 97% amino acid identity, further complicating allelic variant vs. putative paralogue assignment (Nebert et al., 1989). Even though MgCYP4Y sequences are highly similar to one another within the clusters defined here, variation of only one residue can have a profound effect on enzymatic function (Lindberg and Negishi, 1989).

The great diversity of P450 reactions stems from the structural arrangement of the P450 proteins, which allows for the discrimination and orientation of substrates in close proximity to the activated oxygen species located in the heme core. MgCYP4Y deduced protein sequences were compared with characterized vertebrate fatty acid hydroxylases to help define the possible function and range of substrates for MgCYP4Y forms. MgCYP4Y deduced protein sequences were also aligned with the crystal structure coordinates of the bacterial fatty acid monooxygenase CYP102 (CYPBM3) (Ravichandran et al., 1993) to better define putative SRS regions and highlight differences between active site and substrate binding residues among M. galloprovincialis P450 forms. Bacterial CYP102 was used in the analysis because it shows only marginal primary structure homology with P450s from other bacteria (15-20%) and because, in the P450 phylogenetic tree, it segregates with the eukaryotic families 4 and 52 (Nelson et al., 1993). Apart from this phylogenetic evidence, an evolutionary relationship between the bacterial CYP102 enzyme and members of the CYP4A subfamily is suggested by the fact that long-chain fatty acids are substrates for CYP102 as well as for vertebrate CYP4As (Lewis and Lake, 1999). Although bacterial CYP102 and mammalian CYP4A11 isozymes share a common function as fatty acid hydroxylases, they have distinctly different preferred sites of oxidation, with CYP102 performing non-terminal hydroxylation or epoxidation and CYP4A11 enzymes performing terminal ω-hydroxylation (Chang and Loew, 1999). Despite the modest overall sequence identity between MgCYP4Ys and human CYP4A11 amino acid sequences, all of the residues near the heme active site are fully conserved between human CYP4A11 and MgCYP4Y transcript sequences. Within the active site of mammalian CYP4A, CYP4B and CYP4F forms, a glutamate residue (E330 in RACE MgCYP4Y) binds to the heme and positions the fatty acid in the orientation favoring ω-hydroxylation (Stark et al., 2005). The potential importance of sequence variations in defining the substrate range and hydroxylation position has also been shown by site-directed mutational analysis, which highlighted residues Y204 and F252 in rabbit CYP4A4 as being important in prostaglandin E1 (PGE1) hydroxylation (Loughran et al., 2000). Both of these residues are conserved in MgCYP4Y transcript sequences. When these residues are replaced with H206 and S252, as seen in rabbit CYP4A7, metabolism of PGE1 is decreased.

Eicosanoid (prostaglandin and leukotriene) concentrations show marked seasonal variability linked to the reproductive cycle of bivalves, with increases observed in the gonad and digestive gland in the pre-spawning and spawning period of the scallop Mizuhopecten yessoensis (Lukyanova and Khotimchenko, 1995). Since most marine animals express seasonal variations in their basic physiology and biochemistry, seasonal variations in P450 activity are to be expected. A sufficient amount of the gene-specific mRNAs required for the cloning procedures was obtained from the M. galloprovincialis digestive gland samples collected during the winter months. Observations in Northern Adriatic Sea mussels confirmed the large peak of spawning mussels in December, when the water temperature fell below 16 °C (Hrs-Brenko, 1971). Total RNA sampled in April from the location Budava (sea temperature 13 °C) also contained a sufficient amount of the gene-specific mRNA, which could be related to prolonged restoration of gonads due to low winter temperatures. It is likely that the amount of CYP4 gene transcripts present in the mussel is related to gonad development, i.e. the reproductive cycle, which appears to be closely related to seasonal changes in water temperature. It is not possible to know whether the results would be the same if total RNA were isolated from M. galloprovincialis tissues other than the digestive gland (e.g. mantle, gonads, gills).


The inference drawn from sequence analysis and comparison of the highlighted residues agrees with the results of the phylogenetic analysis, in which MgCYP4Y transcripts were most closely related to vertebrate CYP4A and CYP4F forms, well known for their ability to metabolize prostaglandins (Kalsotra et al., 2004; Okita and Okita, 2001). As shown in Figure 5, the CYP4Y subfamily forms a group with other mollusc and annelid CYP4 subfamilies, indicating a clear separation from the vertebrate CYP4 subfamilies. The sequence homology and phylogenetic analysis suggested that the MgCYP4Ys, along with CYP4 sequences from molluscs and annelids, share the highest amino acid sequence identities with the vertebrate homologs found within CYP4 subfamilies F, A, B and T. Similar phylogenetic clustering was obtained in the analysis of CYP4s from Nereis virens (Rewitz et al., 2004) and Chlamys farreri (Miao et al., 2011), which also separated annelid and mollusc CYP4 sequences from vertebrate CYP4V subfamily members. In general, the phylogenetic tree displays CYP4 sequences in two major clusters, in concordance with the CYP4 clustering previously described by Kirischian and Wilson (2012), where the diversification of CYP4 genes in invertebrate and vertebrate species was explained as the result of a common and strongly supported duplication event. The separation of invertebrate CYP4 sequences in the phylogenetic analysis could arise from the high diversity of invertebrate CYP4s. Given the low identity percentages between MgCYP4Ys and other CYP4 sequences, the phylogenetic analysis raises the possibility of novel CYP4 roles besides those already described for this family of cytochromes P450 (Hardwick, 2008; Whalen et al., 2010).

d

The expansion of the MgCYP4Y clusters identified in this study suggests that positive selection may be acting to enhance the diversity of this subfamily, likely through repeated gene duplication and divergence (Nelson and Strobel, 1987). Pollution is a common stress in the marine environment and one of today's most powerful agents of selection, yet we have little understanding of how anthropogenic toxicants influence mechanisms of adaptation in marine populations. Pollution is, however, only one possible factor within the complex composition of both coastal seawater and tissue matrices. The great expansion of gene transcripts in the CYP4Y subfamily observed at all three locations can be partially explained by differing environmental pressures (natural and anthropogenic) and thus by the need for various MgCYP4Y isoforms that provide a greater ability to break down the range of contaminants present in the environment. The different environmental niches selected in this study could be an important factor contributing to the diversity of MgCYP4Y gene transcripts present in natural populations of mussels living in the selected areas. Presumably, MgCYP4Y clusters present at all locations could have a physiological role in the metabolism of endogenous substrates, whereas the pattern of the site-specific MgCYP4Y clusters could be influenced by the specific conditions in a particular environment. A significantly greater expansion of CYP4 genes has been found in rodents in the CYP4A, CYP4B and CYP4F subfamilies (Nelson et al., 2004). Rodent CYP4F gene amplification has been associated with increased functional diversity in metabolizing both endogenous and exogenous compounds (Cui et al., 2001; Hardwick, 2008).


5. Conclusion

This study provided the first description of the full-length cytochrome P450 family 4 genes in the mussel M. galloprovincialis, one of the most widely used species in marine ecotoxicological surveys. From phylogenetic analysis and pairwise comparisons of novel cDNA sequences, the existence of multiple genes / gene transcripts for the CYP4Y locus in the mussel was deduced. Owing to the high sequence identity of MgCYP4Y transcripts, it was impossible to accurately determine the number of genes / gene transcript variants present in a single organism. Homology searches of nucleotide and protein sequences revealed that the CYP4Y genes from this study are most similar to CYP4s from other molluscs and annelids. Sequence analysis further demonstrated that MgCYP4Ys and vertebrate CYP4A/4F forms share identical amino acid residues at key positions within the fatty acid substrate recognition site. It can be assumed that positive selection is acting to increase the diversity of cytochrome P450 gene transcripts, given the possibility that the different environmental niches selected in this study contributed to the diversity of expressed CYP4Y genes. Moreover, our data suggest that the quantity of CYP4Y mRNA in the mussel is closely linked to the mussel's reproductive cycle, which in turn follows seasonal changes. This large gene expansion may be associated with increased functional diversity in metabolizing both endogenous and exogenous compounds, which calls for future studies to determine the function of the CYP4Y subfamily of enzymes. A possible step towards understanding the real function of CYP4Ys in the mussel M. galloprovincialis could be activity measurements based on substrate transformation by the CYP4Ys.

Financial support

This study was financially supported by the Croatian Ministry of Science, Education and Sport, Project No 098-0982705-2725.


References

Bihari, N., Fafanđel, M., Piškur, V., 2007. Polycyclic Aromatic Hydrocarbons and Ecotoxicological Characterization of Seawater, Sediment, and Mussel Mytilus galloprovincialis from the Gulf of Rijeka, the Adriatic Sea, Croatia. Arch Environ Contam Toxicol 52, 379-387.
Chang, Y.-T., Loew, G.H., 1999. Homology modeling and substrate binding study of human CYP4A11 enzyme. Proteins: Structure, Function, and Bioinformatics 34, 403-415.
Chaty, S., Rodius, F., Vasseur, P., 2004. A comparative study of the expression of CYP1A and CYP4 genes in aquatic invertebrate (freshwater mussel, Unio tumidus) and vertebrate (rainbow trout, Oncorhynchus mykiss). Aquatic toxicology (Amsterdam, Netherlands) 69, 81-94.
Colas, C., Ortiz de Montellano, P.R., 2003. Autocatalytic Radical Reactions in Physiological Prosthetic Heme Modification. Chemical Reviews 103, 2305-2332.
Cui, X., Kawashima, H., Barclay, T.B., Peters, J.M., Gonzalez, F.J., Morgan, E.T., Strobel, H.W., 2001. Molecular Cloning and Regulation of Expression of Two Novel Mouse CYP4F Genes: Expression in Peroxisome Proliferator-Activated Receptor α-Deficient Mice upon Lipopolysaccharide and Clofibrate Challenges. Journal of Pharmacology and Experimental Therapeutics 296, 542-550.
Danielson, P.B., Foster, J.L., McMahill, M.M., Smith, M.K., Fogleman, J.C., 1998. Induction by alkaloids and phenobarbital of Family 4 Cytochrome P450s in Drosophila: evidence for involvement in host plant utilization. Molecular & general genetics: MGG 259, 54-59.
Estabrook, R.W., 2003. A passion for P450s (rememberances of the early history of research on cytochrome P450). Drug metabolism and disposition: the biological fate of chemicals 31, 1461-1473.
Feyereisen, R., 1999. Insect P450 enzymes. Annual review of entomology 44, 507-533.
Goldstone, J.V., 2008. Environmental sensing and response genes in cnidaria: the chemical defensome in the sea anemone Nematostella vectensis. Cell Biol Toxicol 24, 483-502.
Goldstone, J.V., Goldstone, H.M., Morrison, A.M., Tarrant, A., Kern, S.E., Woodin, B.R., Stegeman, J.J., 2007. Cytochrome P450 1 genes in early deuterostomes (tunicates and sea urchins) and vertebrates (chicken and frog): origin and diversification of the CYP1 gene family. Mol Biol Evol 24, 2619-2631.
Goldstone, J.V., Hamdoun, A., Cole, B.J., Howard-Ashby, M., Nebert, D.W., Scally, M., Dean, M., Epel, D., Hahn, M.E., Stegeman, J.J., 2006. The chemical defensome: Environmental sensing and response genes in the Strongylocentrotus purpuratus genome. Developmental Biology 300, 366-384.
Gotoh, O., 1992. Substrate recognition sites in cytochrome P450 family 2 (CYP2) proteins inferred from comparative analyses of amino acid and coding nucleotide sequences. Journal of Biological Chemistry 267, 83-90.
Gotoh, O., 2012. Evolution of cytochrome p450 genes from the viewpoint of genome informatics. Biological & pharmaceutical bulletin 35, 812-817.
Graham, S.E., Peterson, J.A., 2002. Sequence alignments, variabilities, and vagaries. Methods in enzymology 357, 15-28.
Hall, T.A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl. Acids. Symp. Ser. 41, 95-98.
Hardwick, J.P., 2008. Cytochrome P450 omega hydroxylase (CYP4) function in fatty acid metabolism and metabolic diseases. Biochemical Pharmacology 75, 2263-2275.
Hrs-Brenko, M., 1971. The reproductive cycle of the Mytilus galloprovincialis Lamk in the Northern Adriatic Sea and Mytilus edulis L. at Long Island Sound. Thalassia Jugoslavica 7, 533-542.


James, M.O., Boyle, S.M., 1998. Cytochromes P450 in crustacea. Comparative Biochemistry and Physiology Part C: Pharmacology, Toxicology and Endocrinology 121, 157-172.
Jones, D.T., Taylor, W.R., Thornton, J.M., 1994. A mutation data matrix for transmembrane proteins. FEBS Letters 339, 269-275.
Jorgensen, A., Rasmussen, L.J., Andersen, O., 2005. Characterisation of two novel CYP4 genes from the marine polychaete Nereis virens and their involvement in pyrene hydroxylase activity. Biochemical and biophysical research communications 336, 890-897.
Kalsotra, A., Turman, C.M., Kikuta, Y., Strobel, H.W., 2004. Expression and characterization of human cytochrome P450 4F11: Putative role in the metabolism of therapeutic drugs and eicosanoids. Toxicology and Applied Pharmacology 199, 295-304.
Kirischian, N.L., Wilson, J.Y., 2012. Phylogenetic and functional analyses of the cytochrome P450 family 4. Molecular Phylogenetics and Evolution 62, 458-471.
Lewis, D.F.V., Lake, B.G., 1999. Molecular modelling of CYP4A subfamily members based on sequence homology with CYP102. Xenobiotica 29, 763-781.
Lindberg, R.L.P., Negishi, M., 1989. Alteration of mouse cytochrome P450coh substrate specificity by mutation of a single amino-acid residue. Nature 339, 632-634.
Liu, N., Zhang, L., 2004. CYP4AB1, CYP4AB2, and Gp-9 gene overexpression associated with workers of the red imported fire ant, Solenopsis invicta Buren. Gene 327, 81-87.
Livingstone, D., 1991. Organic Xenobiotic Metabolism in Marine Invertebrates, Advances in Comparative and Environmental Physiology. Springer Berlin Heidelberg, pp. 45-185.
Livingstone, D.R., Chipman, J.K., Lowe, D.M., Minier, C., Pipe, R.K., 2000. Development of biomarkers to detect the effects of organic pollution on aquatic invertebrates: recent molecular, genotoxic, cellular and immunological studies on the common mussel (Mytilus edulis L.) and other mytilids. International Journal of Environment and Pollution 13, 56-91.
Loughran, P.A., Roman, L.J., Aitken, A.E., Miller, R.T., Masters, B.S.S., 2000. Identification of Unique Amino Acids That Modulate CYP4A7 Activity. Biochemistry 39, 15110-15120.
Lukyanova, O.N., Khotimchenko, Y.S., 1995. Lipid peroxidation in organs of the scallop Mizuhopecten yessoensis and sea-urchin Strongylocentrotus intermedius during the reproductive cycle. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 110, 371-377.
Miao, J., Pan, L., Liu, N., Xu, C., Zhang, L., 2011. Molecular cloning of CYP4 and GSTpi homologues in the scallop Chlamys farreri and its expression in response to Benzo[a]pyrene exposure. Marine genomics 4, 99-108.
Nebert, D.W., Gonzalez, F.J., 1987. P450 genes: structure, evolution, and regulation. Annual review of biochemistry 56, 945-993.
Nebert, D.W., Nelson, D.R., Adesnik, M., Coon, M.J., Estabrook, R.W., Gonzalez, F.J., Guengerich, F.P., Gunsalus, I.C., Johnson, E.F., Kemper, B., et al., 1989. The P450 superfamily: updated listing of all genes and recommended nomenclature for the chromosomal loci. DNA (Mary Ann Liebert, Inc.) 8, 1-13.
Nelson, D.R., 2011. Progress in tracing the evolutionary paths of cytochrome P450. Biochimica et Biophysica Acta (BBA) - Proteins and Proteomics 1814, 14-18.
Nelson, D.R., Kamataki, T., Waxman, D.J., Guengerich, F.P., Estabrook, R.W., Feyereisen, R., Gonzalez, F.J., Coon, M.J., Gunsalus, I.C., Gotoh, O., et al., 1993. The P450 superfamily: update on new sequences, gene mapping, accession numbers, early trivial names of enzymes, and nomenclature. DNA and cell biology 12, 1-51.
Nelson, D.R., Koymans, L., Kamataki, T., Stegeman, J.J., Feyereisen, R., Waxman, D.J., Waterman, M.R., Gotoh, O., Coon, M.J., Estabrook, R.W., Gunsalus, I.C., Nebert, D.W., 1996. P450 superfamily: update on new sequences, gene mapping, accession numbers and nomenclature. Pharmacogenetics 6, 1-42.
Nelson, D.R., Strobel, H.W., 1987. Evolution of cytochrome P-450 proteins. Molecular Biology and Evolution 4, 572-593.
Nelson, D.R., Zeldin, D.C., Hoffman, S.M., Maltais, L.J., Wain, H.M., Nebert, D.W., 2004. Comparison of cytochrome P450 (CYP) genes from the mouse and human genomes, including nomenclature recommendations for genes, pseudogenes and alternative-splice variants. Pharmacogenetics 14, 1-18.
Okita, R.T., Okita, J.R., 2001. Cytochrome P450 4A Fatty Acid Omega Hydroxylases. Current Drug Metabolism 2, 265-281.
Pan, L., Liu, N., Xu, C., Miao, J., 2011. Identification of a novel P450 gene belonging to the CYP4 family in the clam Ruditapes philippinarum, and analysis of basal- and benzo(a)pyrene-induced mRNA expression levels in selected tissues. Environmental Toxicology and Pharmacology 32, 390-398.
Peric, L., Fafandel, M., Glad, M., Bihari, N., 2012. Heavy metals concentration and metallothionein content in resident and caged mussels Mytilus galloprovincialis from Rijeka bay, Croatia. Fresen Environ Bull 21, 2785-2794.
Peters, L.D., Nasci, C., Livingstone, D.R., 1998. Immunochemical investigations of cytochrome P450 forms/epitopes (CYP1A, 2B, 2E, 3A and 4A) in digestive gland of Mytilus sp. Comparative Biochemistry and Physiology Part C: Pharmacology, Toxicology and Endocrinology 121, 361-369.
Ravichandran, K., Boddupalli, S., Hasermann, C., Peterson, J., Deisenhofer, J., 1993. Crystal structure of hemoprotein domain of P450BM-3, a prototype for microsomal P450's. Science (New York, N.Y.) 261, 731-736.
Rewitz, K.F., Kjellerup, C., Jørgensen, A., Petersen, C., Andersen, O., 2004. Identification of two Nereis virens (Annelida: Polychaeta) cytochromes P450 and induction by xenobiotics. Comparative Biochemistry and Physiology Part C: Toxicology & Pharmacology 138, 89-96.
Rewitz, K.F., Styrishave, B., Løbner-Olesen, A., Andersen, O., 2006. Marine invertebrate cytochrome P450: Emerging insights from vertebrate and insect analogies. Comparative Biochemistry and Physiology Part C: Toxicology & Pharmacology 143, 363-381.
Schneider, T.D., Stephens, R.M., 1990. Sequence logos: a new way to display consensus sequences. Nucleic Acids Research 18, 6097-6100.
Shaw, J.P., Large, A.T., Donkin, P., Evans, S.V., Staff, F.J., Livingstone, D.R., Chipman, J.K., Peters, L.D., 2004. Seasonal variation in cytochrome P450 immunopositive protein levels, lipid peroxidation and genetic toxicity in digestive gland of the mussel Mytilus edulis. Aquatic Toxicology 67, 325-336.
Snyder, M.J., 1998. Cytochrome P450 Enzymes Belonging to the CYP4 Family from Marine Invertebrates. Biochemical and biophysical research communications 249, 187-190.
Snyder, M.J., 2000. Cytochrome P450 enzymes in aquatic invertebrates: recent advances and future directions. Aquatic toxicology (Amsterdam, Netherlands) 48, 529-547.
Stark, K., Wongsud, B., Burman, R., Oliw, E.H., 2005. Oxygenation of polyunsaturated long chain fatty acids by recombinant CYP4F8 and CYP4F12 and catalytic importance of Tyr-125 and Gly-328 of CYP4F8. Archives of biochemistry and biophysics 441, 174-181.
Stegeman, J.J., Teal, J.M., 1973. Accumulation, release and retention of petroleum hydrocarbons by the oyster Crassostrea virginica. Marine Biology 22, 37-44.
Werck-Reichhart, D., Feyereisen, R., 2000. Cytochromes P450: a success story. Genome biology 1, 1-9.
Whalen, K., Starczak, V., Nelson, D., Goldstone, J., Hahn, M., 2010. Cytochrome P450 diversity and induction by gorgonian allelochemicals in the marine gastropod Cyphoma gibbosum. BMC Ecology 10, 24.
Williams, P.A., Cosme, J., Sridhar, V., Johnson, E.F., McRee, D.E., 2000. Mammalian microsomal cytochrome P450 monooxygenase: structural adaptations for membrane binding and functional diversity. Molecular cell 5, 121-131.


Zhou Chi, L.C.-h., Zhang Wei-min, Jia Xiao-ping, 2010. CYP4 gene cloning and expression level analysis of Perna viridis. Journal of Tropical Oceanography 29, 82-88.


FIGURE LEGENDS

FIG. 1. Mytilus galloprovincialis sampling sites in the Northern Adriatic Sea, Croatia. Black circles indicate the locations where animals were collected.

FIG. 2. Cluster analysis of the clones within the MgCYP4Y subfamily visualized by the Maximum Likelihood method. Circles indicate putative gene clusters, each represented by several transcript variants. Sequence RACE MgCYP4Y was added to the analysis in order to see its clustering. Scale bar indicates the number of nucleotide base changes.


FIG. 3. Heat map depicting distance between Mytilus galloprovincialis CYP4Y cDNAs. 41 cDNA sequences were grouped in MEGA according to their affiliation to a particular gene cluster, and percentage of diversity was calculated between groups. Values are shaded according to their percentage of diversity with dark gray and white indicating the highest and lowest degree of diversity, respectively. Nucleotide percentage of diversity is shown in the lower triangle while amino acid percentage of diversity is shown in the upper triangle.

FIG. 4. Amino acid alignment depicting conserved regions and putative residues involved in substrate recognition among CYP4 protein sequences. Conserved regions are marked with light gray highlighting. The distinct glutamic acid residue present only in the CYP4 family is marked by “*”. Boxed regions represent the six substrate recognition sites (SRS). Residues shaded in dark gray, responsible for protein activity (Chang and Loew, 1999), are identical in all sequences in the alignment. Residues lining the substrate binding channel as described in Chang and Loew (1999) are indicated by a “●” in the alignment. Residues affecting fatty acid hydroxylase activity in rabbit CYP4A forms as described in Loughran et al. (2000) are indicated by a “◊” in the alignment. Gaps in the alignment are indicated by “-”. Protein sequences used in the alignment include: Meretrix meretrix (MmCYP4BK4; AGC92781), Ruditapes philippinarum (RpCYP4BK3; ACM16804.2), Crassostrea gigas (CgCYP4F22; EKC34228.1), Mus musculus (AF233644), rabbit CYP4A4 (P10611) and CYP4A7 (P14581), human CYP4F11 (Q9HB16), bacterial CYP102 (2HPD) and RACE MgCYP4Y4 (KJ364531).

FIG. 5. Maximum Likelihood tree presenting the evolutionary relationship between Mytilus galloprovincialis and other invertebrate/vertebrate CYP4s. All methods used for phylogenetic reconstruction gave trees with the same overall topology; therefore, only the Maximum Likelihood tree is shown. Sequences used in the phylogenetic analysis, with accession numbers, are listed in Supplementary File 3.
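The between-group percentage of diversity summarized in the Fig. 3 legend can be illustrated with a minimal sketch. It assumes an uncorrected p-distance (percentage of differing sites) averaged over all cross-group sequence pairs; the substitution model and gap treatment actually used in MEGA are not stated in the legend, so these choices are assumptions. The function names and the toy sequences are hypothetical and are not MgCYP4Y data.

```python
from itertools import product


def p_distance(a: str, b: str) -> float:
    """Percentage of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped
    (an assumed pairwise-deletion treatment).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * differing / compared


def between_group_diversity(group_a, group_b) -> float:
    """Mean percentage diversity over all pairs drawn across two groups."""
    pairs = list(product(group_a, group_b))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment; these are NOT sequences from this study.
clusters = {
    "cluster_A": ["ATGGCTAAGC", "ATGGCTAAGT"],
    "cluster_B": ["ATGACTAAGC", "ATGACTCAGC"],
}
value = between_group_diversity(clusters["cluster_A"], clusters["cluster_B"])
print(f"{value:.1f} % diversity between cluster_A and cluster_B")
```

Running the same function on nucleotide and on deduced amino acid alignments would fill the lower and upper triangles of such a matrix, respectively.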

 


TABLE

Primer name            Primer sequence (5'-3')
Frace                  CAGAATATCAGAAAATGTGTCAGAATGAA
Rrace                  GGAGACCATATATGTTGATACCGAAAA
R2                     AAGAGTCCATCTTTGTAGCATTG
3-1                    GTTCCTTTCTGCGTTTTGC
5-1                    GGCGAACATAAGCTTTTTGTC
GeneRace™ 3' Primer    GCTGTCAACGATACGCTACGTAACG
GeneRace™ 5' Primer    CGACTGGAGCACGAGGACACTGA
pUC/M13 forward        CCCAGTCACGACGTTGTAAAACG
pUC/M13 reverse        AGCGGATAACAATTTCACACAGGAA

Table 1. Oligonucleotide primers used in this study
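As a quick consistency check on the primer list, the following minimal sketch computes the length and GC content of the five gene-specific primers from Table 1. The sequences are copied from the table; the helper name `gc_content` and the choice of these two summary statistics are illustrative assumptions rather than part of the original methods.

```python
# Gene-specific primers as listed in Table 1 (5'-3').
PRIMERS = {
    "Frace": "CAGAATATCAGAAAATGTGTCAGAATGAA",
    "Rrace": "GGAGACCATATATGTTGATACCGAAAA",
    "R2": "AAGAGTCCATCTTTGTAGCATTG",
    "3-1": "GTTCCTTTCTGCGTTTTGC",
    "5-1": "GGCGAACATAAGCTTTTTGTC",
}


def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    if set(seq) - set("ACGT"):
        raise ValueError("sequence contains non-ACGT characters")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Print one summary line per primer: name, length in nt, GC percentage.
for name, seq in PRIMERS.items():
    print(f"{name:6s} {len(seq):3d} nt  GC {gc_content(seq):5.1f} %")
```

A check of this kind only flags transcription errors or unusual composition; it says nothing about primer specificity or annealing behaviour.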

 

Highlights:

• Seven new CYP4 transcript clusters were identified in Mytilidae
• CYP4 transcript clusters showed site-specific distribution
• CYP4 transcript sequences suggested involvement in terminal ω-hydroxylation
• CYP4 transcript sequences suggested involvement in prostaglandin hydroxylation
