Biochimica et Biophysica Acta 1340 (1997) 253–267
Computer analysis of phytochrome sequences and reevaluation of the phytochrome secondary structure by Fourier transform infrared spectroscopy
Jürgen Sühnel a, Gudrun Hermann b,*, Utz Dornberger c, Hartmut Fritzsche c
a Institute of Molecular Biotechnology, Beutenbergstraße 11, D-07745 Jena, Germany
b Department of Biochemistry and Biophysics, Friedrich-Schiller-University Jena, Philosophenweg 12, D-07743 Jena, Germany
c Institute of Molecular Biology, Friedrich-Schiller-University Jena, Winzerlaer Straße 10, D-07745 Jena, Germany
Received 20 February 1997; accepted 6 March 1997
Abstract
A repertoire of various methods of computer sequence analysis was applied to phytochromes in order to gain new insights into their structure and function. A statistical analysis of 23 complete phytochrome sequences revealed regions of non-random amino acid composition, which are supposed to be of particular structural or functional importance. All phytochromes other than phyD and phyE from Arabidopsis have at least one such region at the N-terminus between residues 2 and 35. A sequence similarity search of current databases indicated striking homologies between all phytochromes and a hypothetical 84.2-kDa protein from the cyanobacterium Synechocystis. Furthermore, scanning the phytochrome sequences for the occurrence of patterns defined in the PROSITE database detected the signature of the WD repeats of the β-transducin family within the functionally important 623–779 region (sequence numbering of phyA from Avena) in a number of phytochromes. A multiple sequence alignment performed with 23 complete phytochrome sequences is made available via the IMB Jena World-Wide Web server (http://www.imb-jena.de/PHYTO.html). It can be used as a working tool for future theoretical and experimental studies. Based on the multiple alignment, striking sequence differences between phytochromes A and B were detected directly at the N-terminal end, where all phytochromes B have an additional stretch of 15–42 amino acids. There is also a variety of positions with totally conserved but different amino acids in phytochromes A and B. Most of these changes are found in the sequence segment 150–200. It is, therefore, suggested that this region might be of importance in determining the photosensory specificity of the two phytochromes. The secondary structure prediction based on the multiple alignment resulted in a small but significant β-sheet content. This finding is confirmed by a reevaluation of the secondary structure using FTIR spectroscopy. © 1997 Elsevier Science B.V.
Keywords: Phytochrome; Photomorphogenesis; Multiple sequence alignment; Sequence comparison; Secondary structure; FTIR spectroscopy
Abbreviations: CD, circular dichroism; FTIR, Fourier transform infrared spectroscopy; Pfr, far-red light-absorbing form of phytochrome; PHD, Profile network from HeiDelberg; Pr, red light-absorbing form of phytochrome; ZPRED, secondary structure prediction by the method of Zvelebil and co-workers
* Corresponding author.
0167-4838/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved.
PII S0167-4838(97)00050-2
J. Sühnel et al. / Biochimica et Biophysica Acta 1340 (1997) 253–267
1. Introduction
Phytochromes represent a family of regulatory photoreceptors in plants that control a wide variety of growth and developmental processes. They exist in two photoreversible forms, a red light-absorbing form (Pr) with an absorption maximum at 666 nm and a far-red light-absorbing form (Pfr) with an absorption maximum at 730 nm. Based on the interconversion between the Pr and Pfr form they function as a light-regulated switch which is activated by red light and attenuated by far-red light. Understanding the phytochrome action is still hampered by the fact that the three-dimensional structure of this protein is not known. In the absence of an experimentally determined tertiary structure, numerous biochemical, physicochemical and mutational studies have been carried out to obtain structural information in a more or less indirect manner (for more details see [1–3]). Computer sequence analysis can provide structural information as well and identify proteins with related functions. Previous attempts include the generation of three-dimensional atomic models of a short sequence region around the chromophore binding site starting out with the secondary structure [4] or using the already known three-dimensional structure of C-phycocyanin as a template [5], secondary structure predictions [6–8] and the derivation of the putative domain structure [9]. In addition, similarity searches of phytochrome sequences against sequence databases were performed. A weak homology between the C-terminal domain of phytochrome sequences and bacterial sensor proteins was reported [10,11]. Also, for the C-terminus of the moss Ceratodon purpureus a homology to the catalytic domain of protein kinases was found [12,13]. This was one of the arguments in favor of the idea that phytochromes might be light-regulated protein kinases.
In a more recent study Jones and Edgerton [14] identified a conserved amino acid repeat within the central hinge region of phytochromes that could be important for dimerization and/or phototransformation. Profile analysis of this repeat indicated a similarity to the photoactive yellow protein from the purple bacterium Ectothiorhodospira halophila [15]. The very latest results of a sequence similarity search with a regulatory protein of the chromatic adaptation in cyanobacteria suggest a structural relatedness of this protein to the amino-terminal half of phytochromes [16,17]. Finally, sequence sets of complete phytochrome sequences were used to derive phylogenetic trees [18,19].
Over the last few years, improved methods of computer sequence analysis have been developed. Moreover, the databases have considerably increased in size. For these reasons, we have applied the new computer sequence analysis tools to the current databases in order to update the information about the structure and function of phytochrome available from sequence analysis. The results obtained include findings from statistical analysis of the phytochrome sequences, search for local and global similarities to other proteins, search for potential motifs of known function and prediction of the secondary structure. The results of the secondary structure prediction are complemented by experimental data obtained from a Fourier transform infrared (FTIR) analysis.
2. Materials and methods
2.1. Preparation of phytochrome
Phytochrome was isolated from 5-day-old etiolated oat seedlings (Avena sativa L. cv. Pirol) as described by Grimm and Rüdiger [20] with the following modifications: after the last resolubilization step phytochrome was pelleted by polyvinylpyrrolidone (PVP-40), subsequently fractionated by ammonium sulfate and desalted by gel filtration through HighLoad Superdex 200 (Pharmacia, Uppsala). The specific absorbance ratio (A665/A280) of this preparation was above 1.0. SDS electrophoresis on polyacrylamide gradient gels (8–18%) gave only one band corresponding to a monomer molecular weight of 124 kDa. Phytochrome samples used for the FTIR measurements were dissolved in a 20 mM potassium phosphate buffer at pH 7.8 containing 0.3% (v/v) glycerol. They were adjusted by ultrafiltration through Centricon-30 centrifugal concentrators (Amicon, Beverly, MA, USA) to a final concentration of 1 mg/ml. Glycerol was added to the phytochrome solutions in order to guarantee photoreversibility between Pr and Pfr after rehydration of the hydrated film samples [21].
2.2. FTIR spectroscopy
FTIR measurements were carried out on a Bruker IFS-66 FTIR spectrophotometer equipped with an MCT detector. The spectral resolution was 2 cm⁻¹. For obtaining the IR spectra 100 μl of a phytochrome solution with a concentration of 1 mg/ml were deposited onto a CaF2 infrared window. The solvent was removed by evaporation under a gentle stream of nitrogen. The film was rehydrated by placing about 30 μl D2O into the infrared cell which was sealed with another CaF2 infrared window. The samples were photoconverted within the FTIR spectrometer by irradiation with red and far-red light. Actinic light was supplied by a 6-V tungsten lamp equipped with interference filters of either 658- or 720-nm peak transmission. The FTIR spectra were recorded at 298 K using identical scanning parameters for Pr and Pfr. 256 or 512 interferograms were accumulated, coadded and subsequently Fourier-transformed using a Happ–Genzel apodization function. The unsmoothed spectra were used for further analysis. To estimate the secondary structure content in percentages the FTIR measurements were repeated five times with separately isolated phytochrome samples. Photoreversibility of phytochrome in the rehydrated film samples was checked by recording the visible absorption spectra between 500 and 800 nm on a Perkin Elmer Lambda 19 UV/VIS spectrometer. The FTIR spectra were analyzed by standard procedures of Fourier derivation and deconvolution [22]. The derivation was performed using a power of 3 and breakpoint of 0.3. For the Fourier deconvolution a Lorentzian bandwidth of 12–15 cm⁻¹ and a resolution enhancement factor of 2 were used. Quantitative information on protein secondary structure elements was obtained by decomposition of the amide I′ band into its constituents.
The number of bands and their positions were taken from the Fourier derivative and deconvoluted spectra applying a center of gravity algorithm as described previously [23]. The initial full width at half height of the Gaussian/Lorentzian bands was fixed at 8–15 cm⁻¹ depending on the width of the corresponding band in the Fourier derivative spectrum. For the most intensive bands and those in
the wings the initial band height was set at 90% of the original intensity and at 70% for the other bands. The original spectra were fitted by an iteration algorithm reported recently [24]. The relative contents of the distinct secondary structure elements were estimated by dividing the areas of the individual bands assigned to particular secondary structures by the whole area of the resulting amide I′ band. The component band around 1612 cm⁻¹, which is due to vibrations of the amino acid side-chains, was not included in this procedure.
2.3. Computer sequence analysis
Twenty-three complete phytochrome sequences were retrieved from protein databases and used for computer sequence analysis. They are characterized by the SwissProt code [25]. The sequences include 11 phytochromes A: phyA of Nicotiana (PHY1_TOBAC), Avena (PHY3_AVESA, PHY4_AVESA), Arabidopsis (PHYA_ARATH), Cucurbita (PHYA_CUCPE), Zea (PHYA_MAIZE), Oryza (PHYA_ORYSA), Pisum (PHYA_PEA), Solanum (PHYA_SOLTU), Glycine (PHYA_SOYBN), Petroselinum (PHYA_PETCR), five phytochromes B: phyB of Arabidopsis (PHYB_ARATH), Oryza (PHYB_ORYSA), Solanum (PHYB_SOLTU), Nicotiana (PHYB_TOBAC), Glycine (PHYB_SOYBN) and further phyC, phyD and phyE of Arabidopsis (PHYC_ARATH, PHYD_ARATH, PHYE_ARATH), phytochrome of Adiantum (PHY_ADICA), phytochrome of Ceratodon (PHY_CERPU) as well as phytochrome of Physcomitrella (PHY1_PHYPA) and Selaginella (PHY1_SELMA). Segments of non-random amino acid compositions were identified by the program SEG [26]. Similarity searches were performed using the BLAST interface of SwissProt pointing to the web site at the École Polytechnique Fédérale de Lausanne (http://expasy.hcuge.ch/cgi-bin/BLASTEPFL.pl) [27]. The search for potential motifs available via the PROSITE database was carried out using the ExPASy web server (http://expasy.hcuge.ch/sprot/scnpsite.html) [28].
The multiple alignment was calculated by means of the AMPS package and the MULTAL program which is part of the CAMELEON sequence analysis tool
of Oxford Molecular Ltd. [29,30]. For the estimation of the secondary structure two different alignment-based prediction methods were applied: the PHD method [31,32] and the ZPRED approach [33].
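As a rough illustration of how segments of non-random amino acid composition can be flagged, the following Python sketch scores sliding windows by Shannon entropy and merges the low-scoring windows into segments. It is a simplified stand-in and not the SEG algorithm [26] itself; the window length, the entropy threshold, the function names and the toy sequence used to exercise it are assumptions chosen for the example.

```python
import math
from collections import Counter


def window_entropy(segment: str) -> float:
    """Shannon entropy (bits per residue) of the amino acid composition."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_segments(seq: str, window: int = 12, threshold: float = 2.2):
    """Return 1-based inclusive (start, end) ranges covered by low-entropy windows.

    window and threshold are illustrative values, not SEG defaults.
    """
    flagged = [False] * len(seq)
    for i in range(len(seq) - window + 1):
        if window_entropy(seq[i:i + window]) <= threshold:
            for j in range(i, i + window):
                flagged[j] = True

    # Merge consecutive flagged residues into segments.
    segments = []
    start = None
    for pos, flag in enumerate(flagged):
        if flag and start is None:
            start = pos
        elif not flag and start is not None:
            segments.append((start + 1, pos))
            start = None
    if start is not None:
        segments.append((start + 1, len(seq)))
    return segments


# Hypothetical toy sequence: a serine/threonine-rich N-terminus
# followed by a compositionally diverse stretch.
toy = "M" + "SSSSTSSSSTSS" + "ACDEFGHIKLMNPQRVWY"
print(low_complexity_segments(toy))
```

A composition-only score of this kind ignores the statistical refinements of SEG, so it should be read only as a way to see why serine-rich N-terminal tracts stand out against the rest of a sequence.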
3. Results and discussion
3.1. Computer sequence analysis
The results obtained from computer sequence analysis of 23 complete phytochrome sequences include new findings and confirmatory facts. To make all the data obtained available we have prepared a phytochrome web site (http://www.imb-jena.de/PHYTO.html). In the following paragraphs only some special findings are reported.
3.1.1. Identification of segments with non-random amino acid composition
For a statistical analysis of the phytochrome sequences the SEG algorithm [26] was used. It identifies segments of non-random amino acid composition, or so-called ‘low-complexity’ regions. ‘Low-complexity’ regions are suggested to correspond with elongated non-globular protein domains. Many of these are likely to be relatively mobile and especially suitable for conformational adaptability in interactions within molecular complexes [26]. The analysis of the 23 phytochrome sequences reveals that apart from phyD and phyE of Arabidopsis all the other ones have a ‘low-complexity’ segment of a length between 10 and 30 amino acids at the N-terminus between residues 2 and 35 (local sequence numbering). These segments consist primarily of polar amino acids like serine, for instance. PhyA of Avena exhibits three additional ‘low-complexity’ regions at sequence positions 343–364, 755–766 and 1087–1098. While all the other phytochromes A exhibit a very similar pattern with either one additional or one missing ‘low-complexity’ segment, all non-phytochromes A have only the segment at the N-terminus (except for phyB of Oryza). The ‘low-complexity’ segments identified in PhyA of Avena correspond with regions that have definite functions. The N-terminal segment 2–21 agrees with the three serine/threonine-rich tracts within the first 20 amino acids which are obviously essential for the
control of the phytochrome activity [34–36]. The segments 343–364 and 755–766 match the sites around E-354 and K-753 which are exposed in Pfr but not in Pr [37] and the segment 1087–1098 coincides with the subunit contact sites of dimerization [38,39]. It is therefore reasonable to speculate that the N-terminal ‘low-complexity’ region 2–35 common to all phytochromes except for phyD and phyE of Arabidopsis is of general functional importance and a promising region for genetic engineering and mutagenesis experiments.
3.1.2. Sequence similarities between phytochromes and other proteins
To find proteins which share sequence homologies with the phytochromes a sequence similarity search was performed using the BLASTP algorithm. The results obtained indicate a strong similarity of all phytochromes to the hypothetical 84.2-kDa protein from the cyanobacterium Synechocystis sp. This protein is suggested to be the product of a gene within the genome of Synechocystis sp. that encodes a 748 amino acid open reading frame [16]. It was already reported that it shows similarities to the NH2-terminal domain of phyE from Arabidopsis and phytochrome 1 from Selaginella [16,17]. According to our BLASTP analysis the hypothetical 84.2-kDa protein is the non-phytochrome sequence with the highest score for all phytochromes. Nearly identical amino acid regions are involved in the overlap with the distinct phytochromes. They are primarily located within the first 500 N-terminal amino acids. The sequence identity varies between 25% and 60%, depending on the aligned region. An interesting difference occurs between phytochromes A and B. The region comprising residues 118–129 of the hypothetical protein overlaps only with phytochromes B in a sequence region that is different between the two types of phytochrome (see below).
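Because the quoted identity depends on the aligned region, it may help to make the calculation explicit. The Python sketch below computes the percentage of identical residues over the gap-free columns of a pairwise alignment. The function name and the choice of denominator (gap-free columns only) are assumptions, since several conventions for sequence identity exist; the two 15-residue fragments are the patterns listed for PHY1_TOBAC and PHYA_ARATH in Table 1 and serve only as a convenient test case, not as the phytochrome/Synechocystis comparison itself.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only alignment columns without a gap in either sequence.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Fragments taken from Table 1 (already aligned, no gaps).
tobac = "IFAVDVDGQLNGWNT"  # PHY1_TOBAC, residues 632-646
arath = "ILAVDSDGLVNGWNT"  # PHYA_ARATH, residues 633-647
print(round(percent_identity(tobac, arath), 1))
```

Restricting the calculation to a different window of the same alignment changes the result, which is the effect referred to in the text when the identity is said to vary with the aligned region.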
From the high degree of resemblance between phytochromes and the hypothetical 84.2-kDa protein it appears that both possess the same structural prerequisites for determining their photosensory function. It remains to be seen whether this fact indicates a common evolutionary origin. In addition to the hypothetical 84.2-kDa protein there are further deduced proteins within the genome of Synechocystis sp. which show differing degrees of
similarity to all phytochromes. Of particular interest are local homologies between the C-terminal domain of the phytochromes and the sensory transduction histidine kinases [40]. In the list of the matched sequences these kinases are positioned around the bacterial two-component protein kinases whose homology to the phytochromes was already noted in earlier database searches [10,11,15]. Furthermore, the BLASTP analysis reveals striking similarities with histidine kinase A, a two-component histidine kinase of Dictyostelium discoideum [41] and with the sensor kinase of Pseudomonas fluorescens [42]. The fact that phytochromes share substantial homologies with sensor kinases of the histidine type which include
several other sequence families besides the bacterial sensor proteins further strengthens the possibility of an evolutionary relationship between phytochromes and sensor kinases. This speculation is of some interest in that the histidine kinase mechanism of the bacterial sensor proteins turns out to play a central role even in the process of signal transmission in eukaryotes [43].
3.1.3. Identification of annotated functional motifs within the phytochrome sequences
In order to identify potential functionally important motifs the phytochrome sequences were scanned for annotated motifs of the PROSITE pattern database. The
Table 1
PROSITE signatures identified by a motif search in phytochromes

Phytochromes A
Phytochrome    PROSITE signature a     Sequence range    Sequence pattern
PHY1_TOBAC     G_BETA_REPEATS          632–646           IFAVDVDGQLNGWNT
PHY3_AVESA     ZINC_PROTEASE           873–882           VASHELQHAL
PHY4_AVESA     ZINC_PROTEASE           873–882           VASHELQHAL
PHYA_ARATH     G_BETA_REPEATS          633–647           ILAVDSDGLVNGWNT
PHYA_CUCPE     G_BETA_REPEATS          631–645           ILAVDLDGLINGWNT
PHYA_MAIZE     –
PHYA_ORYSA     ZINC_PROTEASE           876–885           VPSHELQHAL
PHYA_PEA       G_BETA_REPEATS          632–646           ILAVDVDGTVNGWNI
PHYA_SOLTU     G_BETA_REPEATS          632–646           IFAVDVDGQVNGWNT
PHYA_SOYBN     G_BETA_REPEATS          639–653           ILAVDVDGLVNGWNI
PHYA_PETCR     G_BETA_REPEATS          637–651           IFAVDADEIVNGWNT

Other phytochromes, in table order: PHYB_ARATH, PHYB_ORYSA, PHYB_SOLTU, PHYB_TOBAC, PHYB_SOYBN, PHY1_PHYPA, PHY1_SELMA, PHYC_ARATH, PHYD_ARATH, PHYE_ARATH, PHY_CERPU
PROSITE signature a     Sequence range    Sequence pattern
DNA_LIGASE_A1           571–579           EDKDDGQRM
DNA_LIGASE_A1           580–588           EDKDDGQRM
G_BETA_REPEATS          676–690           IFAVDTDGCINGWNA
DNA_LIGASE_A1           545–553           EDKDDGQRM
DNA_LIGASE_A1           547–555           EDKDDGQRM
G_BETA_REPEATS          661–675           IFAVDVDGHVNGWNA
AA_TRNA_LIGASE_I        812–822           PSNENVTVGGV
G_BETA_REPEATS          625–639           ILAVDSNGMINGWNA
–
–
DNA_LIGASE_A1           575–583           EDKDDGQRM
G_BETA_REPEATS          671–685           IFAVDIDGCINGWNA
ATP_GTP_A               634–641           ASEAMGKS
AA_TRNA_LIGASE_II_2     782–791           TGSVERLDLY
PROTEIN_KINASE_ATP      1010–1031         IIHRDLKSMNILV
PROTEIN_KINASE_ST       1123–1135         IIHRDLKSMNILV

The pattern of the chromophore attachment site in phytochromes and the often occurring phosphorylation and glycosylation sites are omitted.
a G_BETA_REPEATS = beta-transducin family WD signature; DNA_LIGASE_A1 = ATP-dependent DNA ligase signature; ZINC_PROTEASE = zinc-binding region signature; ATP_GTP_A = ATP/GTP-binding motif A (P loop); AA_TRNA_LIGASE_I = aminoacyl-transfer RNA synthetases class-I signature; AA_TRNA_LIGASE_II_2 = aminoacyl-transfer RNA synthetases class-II signature; PROTEIN_KINASE_ATP = protein kinase signature (ATP binding); PROTEIN_KINASE_ST = protein kinase signature (S/T binding). References of these signature sequences can be found on the PROSITE web site: http://expasy.hcuge.ch/sprot/prosite.html
Fig. 1. Condensed representation of the multiple alignment of 23 complete phytochrome sequences using the MULTAL program [30]. The numbers above and below the schematic representation of the phytochrome sequence depict the amino acid residue positions according to the MULTAL alignment (upper line) and the amino acid sequence of PHY3_AVESA (lower line), indicating gaps by ‘–’. The upper case letters in the line below mark residues that are totally conserved in phytochromes A and those in the next line in phytochromes B. The following line represents the conservation pattern of all phytochrome sequences aligned. Upper case letters characterize totally conserved amino acids and lower case letters those that are identical by more than 70%. The boxes filled with double slashes ‘//’ and ‘__’ indicate sequence stretches with no conservation within the phytochrome A and phytochrome B sets and within the complete sequence set, respectively. The last two lines compile the results of the secondary structure prediction. The line before last represents the results of the PHD analysis [31,32] and the last one of the ZPRED approach [33]. β-sheet structure elements are indicated by ‘s’ and helix elements by ‘a’.
results are summarized in Table 1 omitting the pattern of the chromophore attachment site in phytochromes and other often occurring patterns like phosphorylation and glycosylation sites. A common motif which is found in 11 out of the 23 phytochromes is the signature of the WD-repeats of the β-transducin family (G_BETA_REPEATS). The occurrence of this motif is remarkable since WD-repeat proteins are known to have regulatory functions and to play a role in protein–protein recognition [44]. It is very interesting that the WD-repeat signature is located within the internal repeat of the hinge region [14]. Upon closer inspection it turns out that it occurs exclusively in the region of the first repeat part. Two questions arise immediately: why is the motif not found in all phytochromes and why has it been identified only once even though it is part of a repeat. The signature represents a 15 amino acid stretch and reads [LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x-x-[DN]-x-x-[LIVMWSTC]-x-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN]. The symbol ‘x’ stands for any amino acid and one of the amino acids given in square brackets is required for a particular sequence position. As can be seen from the multiple alignment discussed below, the amino acids in positions 7 and
15 of the motif are apparently crucial for the lack of this pattern in a few phytochromes, even though they match the remaining part of the motif. Further, there are significant differences between the repeating sequence parts (Fig. 1). The replacement of the totally conserved stretch GWN (residues 644–646 of the
PHY3_AVESA numbering) in the first repeat part by the stretch EWN (residues 777–779 of the PHY3_AVESA numbering), which is conserved in all phytochromes other than phytochrome of Ceratodon (PHY_CERPU), is most striking. The WD-motif tolerates 9 different amino acids in position 12, but not E. Thus, the question arises whether the minor deviation from the WD-repeat signature in a few phytochromes and the differences between the two repeat parts are a problem of motif definition or whether they are of real biological significance. Nonetheless, this motif is identified in a substantial number of phytochromes, therefore suggesting that it might be important for the signal transducing function. It is worth mentioning in this context that the COP1 protein which was characterized as a signaling component downstream of multiple phytochromes and a blue light receptor [45] contains the repeated WD-motif as well [46,47]. Apart from the WD-repeat another common sequence pattern is the Zn-binding domain signature of
Zn-dependent proteases [48,49]. It is identified on the C-terminal half of some phytochromes A and suggests DNA-binding activity, although there is no evidence for this effect as yet. More intriguing is the result that phytochromes B contain the sequence motif of the ATP-dependent DNA ligase since recent experimental evidence indicates the nuclear localization of phytochromes B [50]. The presence of this motif therefore raises the possibility of a direct function of phytochromes B in controlling the activity of light regulated genes.
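The way a PROSITE signature accepts or rejects a sequence stretch can be made concrete with a short Python sketch. It translates the WD-repeat signature quoted in this section into a regular expression and applies it to the first repeat part of PHY1_TOBAC (Table 1) and to a hypothetical variant in which the glycine at signature position 12 is replaced by glutamate, mimicking the GWN to EWN difference between the two repeat parts. The helper names are assumptions, and the translation handles only the elements that occur in this signature (single residues, bracketed sets and ‘x’), not the full PROSITE syntax.

```python
import re

# WD-repeat signature as quoted in the text (15 positions).
WD_SIGNATURE = (
    "[LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x-x-[DN]-x-x-"
    "[LIVMWSTC]-x-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN]"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE pattern into a regular expression.

    Only 'x' (any residue), bracketed sets and single residues are supported.
    """
    return "".join("." if element == "x" else element
                   for element in pattern.split("-"))


def find_signature(seq: str, pattern: str):
    """Return (start, end, match) tuples with 1-based inclusive positions."""
    # A lookahead is used so that overlapping matches are also reported.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.start() + len(m.group(1)), m.group(1))
            for m in regex.finditer(seq)]


first_repeat = "IFAVDVDGQLNGWNT"   # PHY1_TOBAC, Table 1
ewn_variant = "IFAVDVDGQLNEWNT"    # hypothetical G -> E change at position 12

print(find_signature(first_repeat, WD_SIGNATURE))
print(find_signature(ewn_variant, WD_SIGNATURE))
```

The variant fails only because E is not among the nine residues allowed at position 12, which illustrates how a single substitution can decide whether a repeat part is reported by the motif search.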
3.1.4. Multiple sequence alignment The 23 complete phytochrome sequences given in Section 2.3 were subjected to a multiple sequence alignment with the aim of supplementing existing alignments. It is expected that the quality of former alignments is clearly improved with the larger number of sequences included in the present study. The multiple alignment was performed with the
AMPS package and the MULTAL program [29,30]. It was carried out with the complete set of the 23 sequences as well as with the 11 phytochromes A and 5 phytochromes B alone. The detailed results are available via the IMB Jena World-Wide Web server (http://www.imb-jena.de/PHYTO.html). They can serve as a working tool for future experimental studies on phytochromes. For readers who have no easy access to the World Wide Web, Fig. 1 compiles the results of the MULTAL alignment in a very condensed manner. We have used the multiple alignment to deduce the domain structure of phytochromes and to explore structural differences between phytochromes A and B.
3.1.5. Domain structure
The structural domains were defined through an analysis of the conservation pattern. The sequence stretches with no conservation delimiting conserved patterns were assumed to reflect boundaries for the structural domains. Two different approaches were applied for their identification. The first one identifies sequence regions of 10 or more amino acids without conserved patterns within the complete sequence set and the second one corresponding regions within the phytochrome A and phytochrome B sets. The following regions were found by the two methods (the data of the second approach are given in parentheses): 2–26, 49–67 (50–61), 344–364 (345–357), 586–606 (589–599), 702–712, 984–1008, 1031–1055. Accordingly, structural domains can be predicted for the regions: 27–48, 68–343 (62–344), 365–585 (365–588), 607–701 (600–1129), 713–983, 1009–1030 and 1056–1129. This domain structure agrees fairly well with the functionally important domains derived indirectly from proteolytic mapping studies [37,51]. It is also in close correspondence with the structural pattern proposed by Romanowsky and Song [9] from a sequence–structure correlation study of six different phytochromes. However, our analysis does not indicate a domain boundary around position 462 as it was postulated by those authors. In this respect our results are in better agreement with experimental studies which did not recognize any proteolytic cleavage site at this position [37,51].
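The step from non-conserved boundary stretches to predicted domains amounts to taking the complement of the boundary intervals along the sequence. The Python sketch below reproduces this for the boundaries of the first approach. The function name and the minimum domain length are assumptions introduced for the example (the latter only to skip the single residue preceding the first boundary), and the sequence length of 1129 is taken from the end of the last domain quoted in the text.

```python
def domains_from_boundaries(boundaries, seq_length, min_length=2):
    """Return the stretches between boundary regions as 1-based inclusive ranges.

    boundaries: iterable of (start, end) non-conserved stretches.
    min_length: shortest stretch still reported as a domain (assumed value).
    """
    domains = []
    cursor = 1  # first residue not yet assigned to a boundary or domain
    for start, end in sorted(boundaries):
        if start - cursor >= min_length:
            domains.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    # Stretch between the last boundary and the C-terminus.
    if seq_length - cursor + 1 >= min_length:
        domains.append((cursor, seq_length))
    return domains


# Boundary regions of the first approach (PHY3_AVESA numbering).
boundaries = [(2, 26), (49, 67), (344, 364), (586, 606),
              (702, 712), (984, 1008), (1031, 1055)]

for start, end in domains_from_boundaries(boundaries, seq_length=1129):
    print(f"{start}-{end}")
```

Applied to the seven boundary regions above, the complement gives the seven domains listed for the first approach.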
3.1.6. Sequence differences between phytochromes A and B
Phytochromes A and B are reported to have distinct photosensory functions towards continuous red and far-red light. Experimental evidence indicates that the determinants of the photosensory specificity reside in the N-terminal domains. The C-terminal domains are likely to carry the determinants of the regulatory activity which appear to be common to the two types of phytochrome [3]. We have, therefore, tried to identify differences between phytochromes A and B on the sequence level. Striking sequence differences occur within the first N-terminal amino acids. As it is inferred from the multiple alignment of all phytochrome sequences, the first totally conserved amino acid after M(1) is F(43) (PHY3_AVESA numbering). The very same amino acid corresponds to sequence positions 58–85 in the five phytochromes B. This means that phytochromes B compared with phytochromes A have an additional stretch of 15–42 amino acids in this region. In the remaining part of the sequence phytochromes A and B align in a very similar way. However, there are positions in which they have different totally conserved amino acids. They are indicated in Fig. 1 with a grey background. Fig. 2 shows the distribution of these amino acids over the sequence. It turns out that there are no positions of different totally conserved amino acids in the regions 440–610, 700–760, 870–1010 and 1020–1100. Further, their number is higher at the N-terminal half compared to the C-terminal half with a significant peak in the sequence region 150–200 (PHY3_AVESA numbering). From these
Fig. 2. Distribution of sequence positions with different totally conserved amino acids in phytochromes A and B (PHY3_AVESA numbering).
data it is tempting to speculate that the latter sequence segment 150–200 and/or the additional sequence stretch directly at the N-terminus might play a role as determinants of the photosensory specificity of phytochromes A and B.
3.1.7. Secondary structure prediction
The secondary structure of phytochrome was determined both by theoretical prediction methods [6–9] and experimental CD measurements [52,53] over the past years. However, while the theoretical methods calculated a significant amount of β-sheet structure, the CD analysis did not estimate any β-sheet content which led to the conclusion that the main features of the phytochrome structure are an extensive α-helical folding and an apparent lack of β-sheet structure. To address this obvious discrepancy between the predicted results and the experimentally determined CD data we have recalculated the secondary structure content of phyA from Avena. Two prediction methods which are based on the multiple alignment of homologous sequences were used for this purpose, the PHD method [31,32] and the ZPRED approach [33]. These alignment-based methods were employed since they yield more reliable results than the standard approaches of secondary structure prediction. The reason is that the sequence variation within a multiple alignment provides much more structural information than a single sequence [54,55]. To assess the predictive ability of these methods in the case of tetrapyrrole-containing proteins the secondary structure of the biliprotein C-phycocyanin from Fremyella was first recalculated by use of the PHD method and compared with actual X-ray values. The PHD method gave a structure content of 65.9% helix, 4% β-sheet and 30.1% loop structure and a classification of this protein as all-helix type. These data are in excellent agreement with the X-ray crystallographic structure [56]. The agreement does not only hold for the number and type of secondary structure elements but also for their localization.
The prediction results obtained for phyA of Avena (PHY3_AVESA) from the PHD and ZPRED method are presented in Fig. 1. In both cases only secondary structure elements consisting of at least four subsequent residues are depicted. Furthermore, for the PHD approach only the high reliability predictions are shown. The PHD method yields a secondary
structure content of 44.3% helix, 13.9% β-sheet and 41.8% loop structure. The ZPRED approach results in a very similar structural assignment, estimating 48.4% helix, 17.7% β-sheet and 33.9% loop structure. (Note that these fractions are calculated on a residue-by-residue basis.) While the length of the secondary structure elements varies slightly between the two methods, the locations of the helix and β-sheet segments agree fairly well. Differences with β-sheet replacing helix and vice versa only occur in the sequence region around positions 240, 1015–1055 and 1115. The two prediction methods clearly reveal that the secondary structure of phyA from Avena contains primarily α-helical segments and in addition a small but significant amount of β-sheet structure. Even if the dominant structural element is α-helix, the secondary structure is not of the all-helix type as in the case of phycocyanin. With regard to the β-sheet content of the phytochrome structure our prediction results are qualitatively in line with those from earlier prediction methods [7–9]. However, those methods were not able to predict the correct secondary structure of C-phycocyanin. It was therefore argued that they did not take into account interactive forces between the phytochrome chromophore and its binding domains which would result in a false prediction. This reasoning is not valid for the PHD and ZPRED approaches which reproduced the X-ray crystallographic structure of C-phycocyanin almost identically. A residue-by-residue discussion of the whole phytochrome sequence in terms of secondary structure is beyond the scope of this work. However, a few comments are appropriate. Periodicities of particular types of amino acids often indicate a definite secondary structure type. For example, surface strands are ideally characterized by a pattern hxhxhx and ideal surface helices by hxxhhxxh, where h stands for a conserved hydrophobic and x for any amino acid.
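The residue-by-residue fractions quoted above can be illustrated with a small Python sketch that counts the states in a per-residue prediction string. The state labels (H for helix, E for β-sheet, L for loop), the function names and the toy prediction string are assumptions made for the example; neither PHD nor ZPRED output is parsed here. The optional filter mirrors the convention used for Fig. 1, where only elements of at least four consecutive residues are depicted.

```python
from collections import Counter
from itertools import groupby


def drop_short_elements(states: str, min_len: int = 4, loop: str = "L") -> str:
    """Reassign helix/sheet runs shorter than min_len residues to loop."""
    out = []
    for label, run in groupby(states):
        n = len(list(run))
        out.append((loop if label != loop and n < min_len else label) * n)
    return "".join(out)


def structure_fractions(states: str, labels=("H", "E", "L")) -> dict:
    """Percentage of residues in each state, counted residue by residue."""
    if not states:
        raise ValueError("empty prediction string")
    counts = Counter(states)
    return {label: 100.0 * counts[label] / len(states) for label in labels}


# Hypothetical 23-residue prediction string.
toy_prediction = "LLHHHHHHLLEELLEEEEELHHL"
filtered = drop_short_elements(toy_prediction)
for label, percent in structure_fractions(filtered).items():
    print(label, round(percent, 1))
```

Counting residues rather than elements means that a few long helices can dominate the percentages even when the number of strands and helices is similar.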
Much more difficult is the differentiation between buried helices and strands, which both are often represented by short runs of conserved hydrophobic residues. The latter structural element is found, for instance, in the sequence regions around 170 and 180, between 260 and 270, around 370, between 630 and 640 and around 730 (PHY3_AVESA numbering). On the other hand, a typical pattern of an exposed
strand can be recognized around sequence position 380 and that of an exposed helix around 620.
3.2. FTIR analysis of the phytochrome A secondary structure
The discrepancy between the theoretical prediction results and the experimentally determined data from CD measurements [52,53] prompted us to reevaluate the phytochrome secondary structure by means of the FTIR technique. Compared with the CD method, the FTIR method seems to be superior in terms of its sensitivity to sheet structure. The FTIR spectra of both the Pr and Pfr form of native phytochrome from Avena sativa were measured on hydrated films. In order to preserve phytochrome during the preparation of the films and to retain its photoreversibility after rehydration, 0.3% glycerol was added to the samples. When no glycerol was added, drying of the samples onto the infrared window caused an irreversible denaturation, indicated by a strong band at 1618 cm⁻¹ in the IR spectra (data not shown) [57]. Fig. 3 shows the absorption spectra of Pr and Pfr after rehydration of
Fig. 3. Visible absorption spectra of the Pr and Pfr form of phytochrome after rehydration of the film samples in D2O. The upper trace of the Pr spectrum was obtained after back conversion of the photostationary Pfr/Pr mixture by saturating red light irradiation. The differences between the original and the back-reverted Pr spectrum are due to a baseline drop which occurred because the hydrated film sample had to be removed from the spectrophotometer for the photoconversion.
Fig. 4. Original infrared spectra of typical Pr and Pfr film samples hydrated with D2O in the 1760–1280 cm⁻¹ spectral region with intensity on the ordinate.
the film samples. They clearly reveal the full photoreversibility of the phytochrome samples used for the FTIR experiments. The maintenance of photoreversibility strongly suggests that phytochrome retained its native secondary and tertiary structure in the hydrated film samples. Fig. 4 shows the IR spectra of the Pr and Pfr form hydrated with D2O in the frequency region between 1760 and 1280 cm⁻¹. The absorbance bands centered at around 1637 and 1450 cm⁻¹ correspond to the amide I′ and amide II′ bands, respectively. For secondary structure analysis the amide I′ band shape was decomposed into subspectra of individual spectral components by Fourier derivation and deconvolution as described in Section 2.2. Fig. 5 shows the Fourier derivative spectra of the Pr and Pfr forms. At least seven spectral component bands are resolved within the amide I′ band contour. Several of them can be ascribed to specific secondary structure elements [58–61]. The two strongest bands at about 1654 and 1628 cm⁻¹ can be assigned to α-helix and β-sheet structure, respectively, while the band at about 1668 cm⁻¹ can be attributed to turns and the bands at 1677 and 1687 cm⁻¹ to turns with minor contributions of antiparallel β-sheet structure [62]. Further, the band at 1641 cm⁻¹ can be assigned to irregular (random) and unknown structure. It cannot unambiguously be related to irregular structure
solely since several other types of secondary structure (3₁₀-helix, open loops and β-sheet) are known to contribute to this band [57,63–65]. For an accurate calculation of the irregular structure content, it is necessary to compare the individual component bands in H2O and D2O [66]. Unfortunately, the IR spectra in H2O cannot be used for such an analysis, as the strong absorption of the water bending mode in this spectral region prevented a reliable subtraction of the water contribution. Finally, the band at 1612 cm⁻¹ most likely originates from vibrations of the amino acid side chains. For a quantitative estimation of the secondary structure content the original amide I′ band was curve-fitted with seven Gaussian/Lorentzian component bands using the band positions of the Fourier derivative or deconvoluted spectra as initial input frequencies. Fig. 6 represents the result of this fitting procedure for the experimentally measured amide I′ band of the Pr form. As can be seen, the analysis with seven subspectral components produced a reasonably good fit. There is virtually no difference between the fitted and the experimentally measured band shape. From the relative areas of the fitted component bands a secondary structure content of 32–36% α-helix, 28–32% β-sheet, 10–15% turns and 20–24% irregular and unknown structure was determined for
Fig. 5. Fourier derivative infrared spectra of typical Pr and Pfr film samples hydrated with D2O in the 1700–1600 cm⁻¹ spectral region with intensity on the ordinate. Fourier derivation was performed with a power of 3, breakpoint 0.3.
Fig. 6. Spectral decomposition of the amide I′ band of the Pr film sample hydrated with D2O. The fitted Gaussian/Lorentzian band shapes as well as the calculated and experimentally measured amide I′ absorption band are shown. The calculated and experimentally measured amide I′ absorption bands are identical and are shown superimposed here.
the Pr form of phytochrome. Similar values within the error limits were also obtained for the Pfr form. No significant differences with respect to the band positions, band widths and peak heights of the amide I′ band components were observed between Pr and Pfr (Figs. 4 and 5). These findings suggest that the photoconversion between Pr and Pfr is not accompanied by a significant change in the secondary structure. In accordance with the theoretical results, the FTIR data provide further evidence for a β-sheet fraction in the secondary structure of phytochrome. The differences found between the β-sheet estimates of the prediction methods and the FTIR analysis are not surprising if the limitations of either method are taken into account. For example, the FTIR analysis is highly sensitive to sheet structure and particularly useful for a qualitative estimation of average secondary structure, but less reliable for a quantification of the fractional distribution of specific secondary structural types [67]. On the other hand, the prediction accuracy of the theoretical methods based on multiple alignments depends highly on the availability of sequences with varying degrees of sequence similarity. With regard to its β-sheet content, the secondary structure of phytochrome differs significantly from
that of C-phycocyanin. In this light the structure of C-phycocyanin is not an adequate template for modeling the phytochrome structure.
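The step in Section 3.2 from fitted component bands to fractional secondary structure content can be sketched as follows. This is a minimal illustration under stated assumptions, not the fitting software used for this work: the components are treated as pure Gaussians (the fit in the text used mixed Gaussian/Lorentzian shapes), the side-chain band is left out of the normalization (the text does not say how it was handled), and the names `ASSIGNMENT`, `gaussian_area` and `structure_fractions` are hypothetical. Only the band positions and their assignments are taken from the text.

```python
import math

# Band assignments as given in Section 3.2 (positions in cm^-1).
ASSIGNMENT = {
    1687: "turns",
    1677: "turns",
    1668: "turns",
    1654: "alpha-helix",
    1641: "irregular/unknown",
    1628: "beta-sheet",
    1612: "side chains",
}


def gaussian_area(height, fwhm):
    """Analytic area of a Gaussian band from its peak height and FWHM."""
    return height * fwhm * math.sqrt(math.pi / (4.0 * math.log(2.0)))


def structure_fractions(bands, exclude=("side chains",)):
    """Convert fitted bands into percent content per structure class.

    bands: iterable of (position, height, fwhm) tuples from a curve fit.
    exclude: classes left out of the normalization (an assumption here).
    """
    totals = {}
    for position, height, fwhm in bands:
        label = ASSIGNMENT[position]
        if label in exclude:
            continue
        totals[label] = totals.get(label, 0.0) + gaussian_area(height, fwhm)
    grand_total = sum(totals.values())
    return {label: 100.0 * area / grand_total for label, area in totals.items()}
```

The heights and widths passed to `structure_fractions` would come from the curve fit itself; none are given here because the text reports only the resulting percentage ranges.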
References
[1] W. Rüdiger, F. Thümmler, Angew. Chem. 103 (1991) 1242–1252.
[2] M. Furuya, P.-S. Song, in: R.E. Kendrick, G.H.M. Kronenberg (Eds.), Photomorphogenesis in Plants, 2nd ed., Kluwer, Dordrecht, 1994, pp. 105–140.
[3] P.H. Quail, M.T. Boylan, B.M. Parks, T.W. Short, Y. Xu, D. Wagner, Science 268 (1995) 675–680.
[4] J.L. Gabriel, J.K. Hoober, J. Theor. Biol. 151 (1991) 541–556.
[5] W. Parker, P. Goebel, C.R. Ross, P.-S. Song, J.J. Stezowski, Bioconjugate Chem. 5 (1994) 21–30.
[6] W. Parker, P.-S. Song, J. Biol. Chem. 265 (1990) 17568–17575.
[7] M.D. Partis, R. Grimm, Z. Naturforsch. 45c (1990) 987–998.
[8] Z.A. Luka, N.D. Volotovsky, Biofizika 38 (1993) 1025–1030.
[9] M. Romanowski, P.-S. Song, J. Prot. Chem. 11 (1992) 139–155.
[10] H.A.W. Schneider-Poetsch, B. Braun, S. Marx, A. Schaumburg, FEBS Lett. 281 (1991) 245–249.
[11] H.A.W. Schneider-Poetsch, Photochem. Photobiol. 56 (1992) 839–846.
[12] F. Thümmler, M. Dufner, P. Kreisl, P. Dittrich, Plant Mol. Biol. 20 (1992) 1003–1017.
[13] F. Thümmler, P. Algarra, G.M. Fobo, FEBS Lett. 357 (1995) 149–155.
[14] A.M. Jones, M.D. Edgerton, Semin. Cell Biol. 5 (1994) 295–302.
[15] D.M. Lagarias, S.-H. Wu, J.C. Lagarias, Plant Mol. Biol. 29 (1995) 1127–1142.
[16] D.M. Kehoe, A.R. Grossman, Science 273 (1996) 1409–1412.
[17] T. Kaneko, A. Tanaka, S. Sato, H. Kotani, T. Sazuka, N. Miyajima, M. Sugiura, S. Tabata, DNA Res. 2 (1995) 153–166.
[18] H.U. Kolukisaoglu, S. Marx, A. Wiegmann, S. Hanelt, H.A.W. Schneider-Poetsch, J. Mol. Evol. 41 (1995) 329–337.
[19] S. Mathews, M. Lavin, R.A. Sharrock, Ann. Missouri Bot. Gard. 82 (1995) 296–321.
[20] R. Grimm, W. Rüdiger, Z. Naturforsch. 41c (1986) 988–992.
[21] F. Siebert, R. Grimm, W. Rüdiger, G. Schmidt, H. Scheer, Eur. J. Biochem. 194 (1990) 921–928.
[22] J.K. Kauppinen, D.J. Moffatt, H.H. Mantsch, D.G. Cameron, Appl. Spectrosc. 35 (1981) 271–276.
[23] D.G. Cameron, J.K. Kauppinen, D.J. Moffatt, H.H. Mantsch, Appl. Spectrosc. 36 (1982) 245–250.
[24] J.L.R. Arrondo, J. Castresana, J.M. Valpuesta, F.M. Goñi, Biochemistry 33 (1994) 11650–11655.
[25] A. Bairoch, B. Boeckmann, Nucl. Acids Res. 20 (1992) 2019–2022.
[26] J.C. Wootton, Curr. Opinion Struct. Biol. 4 (1994) 413–421.
[27] S.F. Altschul, W. Gish, W. Miller, E.W. Myers, D.J. Lipman, J. Mol. Biol. 215 (1990) 403–410.
[28] S. Henikoff, J.G. Henikoff, Genomics 19 (1994) 97–107.
[29] G.J. Barton, Methods Mol. Biol. 25 (1994) 327–347.
[30] W.R. Taylor, Methods Enzymol. 183 (1990) 456–474.
[31] B. Rost, C. Sander, R. Schneider, Comput. Appl. Biosci. 10 (1994) 53–60.
[32] B. Rost, C. Sander, Proteins 19 (1995) 55–72.
[33] M.J. Zvelebil, G.J. Barton, W.R. Taylor, M.J.E. Sternberg, J. Mol. Biol. 195 (1987) 957–961.
[34] J. Stockhaus, A. Nagatani, U. Halfter, S. Kay, M. Furuya, N.-H. Chua, Genes Devel. 6 (1992) 2364–2372.
[35] E.T. Jordan, J.R. Cherry, J.M. Walker, R.D. Vierstra, Plant J. 9 (1995) 243–257.
[36] E.T. Jordan, J.R. Cherry, J.M. Walker, R.D. Vierstra, Plant J. 9 (1995) 243–257.
[37] R. Grimm, Ch. Eckerskorn, F. Lottspeich, C. Zenger, W. Rüdiger, Planta 174 (1988) 396–401.
[38] M.D. Edgerton, A.M. Jones, Plant Cell 4 (1992) 161–171.
[39] J.R. Cherry, D. Hondred, J.M. Walker, J.M. Keller, H.P. Hershey, R.D. Vierstra, Plant Cell 5 (1993) 565–575.
[40] T. Kaneko, S. Sato, H. Kotani, A. Tanaka, E. Asamizu, Y. Nakamura, N. Miyajima, M. Hirosawa, M. Sugiura, S. Sasamoto, T. Kimura, T. Hosouchi, A. Matsuno, A. Muraki, N. Nakazaki, K. Naruo, S. Okumura, S. Shimpo, C. Takeuchi, T. Wada, A. Watanabe, M. Yamada, M. Yasuda, S. Tabata, DNA Res. 3 (1996) 109–136.
[41] N. Wang, G. Shaulsky, R. Escalante, W.F. Loomis, EMBO J. 15 (1996) 3890–3898.
[42] Th.D. Gaffney, St.T. Lam, J. Lignon, K. Gates, A. Frazelle, J.D. Maio, St. Hill, N. Torkewiz, A.M. Allshouse, H.-J. Kempf, J.O. Becker, J. Mol. Plant Microbe Interact. 7 (1994) 455–463.
[43] C. Chang, TIBS 21 (1996) 129–133.
[44] E.J. Neer, C.J. Schmidt, R. Nambudripad, T.F. Smith, Nature (London) 371 (1994) 297–300.
[45] J. Chory, Trends Genet. 9 (1993) 167–172.
[46] X.-W. Deng, M. Matsui, N. Wie, D. Wagner, A.M. Chu, K.A. Feldmann, P.H. Quail, Cell 71 (1992) 791–801.
[47] X.-W. Deng, Cell 76 (1994) 423–426.
[48] C.V. Jongeneel, J. Bouvier, A. Bairoch, FEBS Lett. 242 (1989) 211–214.
[49] G.J.P. Murphy, G. Murphy, J.J. Reynolds, FEBS Lett. 289 (1991) 4–7.
[50] K. Sakamoto, A. Nagatani, Plant J. 10 (1996) 859–868.
[51] J.C. Lagarias, F.M. Mercurio, J. Biol. Chem. 260 (1985) 2415–2423.
[52] Y.G. Chai, P.-S. Song, M.-M. Cordonnier, L.H. Pratt, Biochemistry 26 (1987) 4947–4952.
[53] D. Sommer, P.-S. Song, Biochemistry 29 (1990) 1943–1948.
[54] G.J. Barton, Curr. Opinion Struct. Biol. 5 (1995) 372–376.
[55] R.B. Russell, M.J.E. Sternberg, Current Biol. 5 (1995) 488–490.
[56] M. Duerring, G.B. Schmidt, R. Huber, J. Mol. Biol. 217 (1991) 557–592.
[57] U. Dornberger, F. Fandrei, J. Backmann, W. Hübner, K. Rahmelow, K.H. Gührs, M. Hartmann, B. Schlott, H. Fritzsche, Biochim. Biophys. Acta 1294 (1996) 168–176.
[58] J.L.R. Arrondo, A. Muga, J. Castresana, F.M. Goñi, Prog. Biophys. Mol. Biol. 59 (1993) 23–56.
[59] A. Dong, P. Huang, W.S. Caughey, Biochemistry 29 (1990) 3303–3308.
[60] W.K. Surewicz, H.H. Mantsch, D. Chapman, Biochemistry 32 (1993) 389–394.
[61] M. Jackson, H.H. Mantsch, Crit. Rev. Biochem. Mol. Biol. 30 (1995) 95–120.
[62] S. Krimm, J. Bandekar, Adv. Protein Chem. 38 (1986) 181–364.
[63] H. Susi, M. Byler, Methods Enzymol. 130 (1986) 290–311.
[64] S.J. Prestrelski, D.M. Byler, M.P. Thompson, J. Protein Res. 37 (1991) 508–512.
[65] H. Fabian, D. Naumann, R. Misselwitz, O. Ristau, D. Gerlach, H. Welfle, Biochemistry 31 (1992) 6532–6538.
[66] J.L.R. Arrondo, I. Extabe, U. Dornberger, F.M. Goñi, Biochem. Soc. Trans. 22 (1994) 380.
[67] V. Baumruk, P. Pancoska, A. Keiderling, J. Mol. Biol. 259 (1996) 774–791.