The psaC genes of Synechococcus sp. PCC7002 and Cyanophora paradoxa: cloning and sequence analysis


Gene 112 (1992) 123-128
© 1992 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/92/$05.00

GENE 06310

(Photosynthesis; cyanobacteria; cyanelles; photosystem I; chloroplasts; iron-sulfur protein; recombinant DNA)

Erhard Rhiel, Veronica L. Stirewalt, Gail E. Gasparich and Donald A. Bryant

Department of Molecular and Cell Biology, The Pennsylvania State University, University Park, PA 16802 (U.S.A.)

Received by R.E. Yasbin: 6 September 1991
Revised/Accepted: 1 November/15 November 1991
Received at publishers: 11 December 1991

SUMMARY

The psaC genes of the cyanobacterium, Synechococcus sp. PCC7002, and of the cyanelle genome of the phylogenetically ambiguous biflagellate, Cyanophora paradoxa, were cloned, mapped and sequenced. The PsaC proteins of both species exhibit high degrees (approx. 95%) of sequence similarity to the PsaC proteins of other cyanobacteria as well as the chloroplast-encoded proteins of green algae and higher plants. The Synechococcus sp. PCC7002 psaC gene is transcribed as a monocistronic mRNA of approx. 350-400 nt, and transcription is initiated 51 nt upstream from the translational start codon. As found for the chloroplasts of higher plants, the C. paradoxa psaC gene is encoded within the small single-copy region of the cyanelle genome. In contrast to results obtained for chloroplasts and for the cyanobacterium Synechocystis sp. PCC6803, neither psaC gene is flanked by genes encoding components of the NAD(P)H dehydrogenase complex.

Correspondence to: Dr. D.A. Bryant, Department of Molecular and Cell Biology, S-231 Frear Bldg., Pennsylvania State University, University Park, PA 16802 (U.S.A.) Tel. (814)865-1992; Fax (814)863-7024; e-mail: DAB14@PSUVM.

* Dedicated to Dr. Werner Wehrmeyer in honor of his 60th birthday.

Abbreviations: aa, amino acid(s); bp, base pair(s); C., Cyanophora; Fx, FA, and FB, [4Fe-4S] centers involved in electron transport in PS I; kb, kilobase(s) or 1000 bp; ndh, genes encoding subunits of the NAD(P)H dehydrogenase complex; nt, nucleotide(s); ORF, open reading frame; psaC, gene encoding the 9-kDa FA/FB-binding protein; PS I, photosystem I; RBS, ribosome-binding site(s).

INTRODUCTION

The photosystem I complex (PS I) of cyanobacteria and higher plants is a membrane-bound photo-oxidoreductase which catalyzes the light-driven transport of electrons from reduced plastocyanin (or cytochrome c553) to oxidized, soluble ferredoxin or flavodoxin (for reviews, see Scheller and Møller, 1990; Golbeck and Bryant, 1991; Bryant, 1991; Chitnis and Nelson, 1991). The PS I complex is composed of at least twelve polypeptides, about 100 chlorophyll a molecules, 10-15 β-carotene molecules, two molecules of phylloquinone (vitamin K1), and three [4Fe-4S] centers denoted Fx, FA, and FB. It is presently believed that all of these cofactors except for the FA and FB centers are bound to a PsaA-PsaB heterodimer. The FA and FB centers are bound to a membrane-extrinsic, 9-kDa protein denoted PsaC (Oh-oka et al., 1987; 1988; Dunn and Gray, 1988; Wynn and Malkin, 1988). After the absorption of a photon by the antenna chlorophylls and migration of the excitation energy to the special pair in PS I, an electron is transferred from the primary donor P700 to a chlorophyll acceptor (A0), to a phylloquinone acceptor (A1), to Fx, and finally to the terminal acceptors FA and FB. The precise pathway by which electrons move from the A1 acceptor to the terminal FA and FB centers and beyond to the soluble 2Fe-2S ferredoxin is not yet known.

Golbeck and coworkers have shown that it is possible to remove the extrinsic PsaC, PsaD, and PsaE proteins from the cyanobacterial PS I complex by treatment with chaotropic agents such as 6.8 M urea to produce the so-called 'PS I core protein' (Golbeck et al., 1988; Parrett et al., 1990; Li et al., 1991a). Under appropriate conditions, the Fe-S centers can be reinserted into the PsaC apoprotein to regenerate a functional PsaC holoprotein, which becomes stably rebound to the PS I core protein in the presence of the refolded PsaD and PsaE proteins (Parrett et al., 1990; Zhao et al., 1990; Li et al., 1991a).

We are interested in structural and functional properties of the PS I complex in cyanobacteria. We have recently shown that the PsaC protein of Synechococcus sp. PCC7002 can be overproduced in E. coli and, after reinsertion of the two [4Fe-4S] centers, this protein along with the PsaD protein can be rebound to the PS I core protein of Synechococcus sp. PCC6301 to reconstitute electron transport to the FA/FB terminal acceptors (Zhao et al., 1990; Li et al., 1991b). In preparation for site-directed mutagenesis studies of the PsaC protein by this in vitro-reassembly approach, we have cloned and determined the nt sequence of the psaC gene of Synechococcus sp. PCC7002. Additionally, as part of our efforts to determine the complete nt sequence of the cyanelle genome of the phylogenetically ambiguous eukaryote Cyanophora paradoxa, we have mapped, cloned, and determined the sequence of the psaC gene in this organism. A preliminary report concerning this work has appeared (Bryant et al., 1990).

EXPERIMENTAL AND DISCUSSION

(a) The psaC gene of Synechococcus sp. PCC7002

Heterologous hybridization experiments performed with a DNA fragment encoding the psaC gene of Nicotiana tabacum chloroplasts demonstrated that the psaC gene of Synechococcus sp. PCC7002 was encoded on a 3.0-kb EcoRI fragment (see map in Fig. 1). This fragment was cloned from a library of size-fractionated EcoRI restriction fragments in plasmid pUC19 to produce plasmid pER1. A portion (1800 bp) of the nt sequence of this fragment was determined and is presented in Fig. 2. The Synechococcus sp. PCC7002 psaC gene, consisting of 82 codons, extends from nt 585-830 in Fig. 2. The gene predicts a protein of 81 aa with a calculated Mr of 8814 and a pI of 5.58. The 9-kDa polypeptide of the Synechococcus sp. PCC7002 PS I complex was purified by preparative polyacrylamide gel electrophoresis and subjected to N-terminal aa sequence analysis (data not shown). The sequence obtained (SHSVKIYDT...) matched the sequence predicted from the nt sequence of the psaC gene except for the absence of the initiator Met, which must be post-translationally removed. Therefore, the predicted Mr of the mature protein is calculated to be 8682.

Northern hybridization experiments (data not shown) indicated that the psaC gene was transcribed as a monocistronic mRNA of approx. 350-400 nt; primer extension experiments demonstrated that transcription is initiated 51 bp upstream from the translation start (data not shown). Although a sequence (5'-TAAACT-3'; see Fig. 2) resembling the -10 element of the consensus σ70 promoter of E. coli is found upstream from the mapped transcription start site, no sequence resembling the -35 element was observed. Finally, attempts to create a deletion mutation in which the psaC gene was replaced by a DNA fragment encoding aminoglycoside 3' phosphotransferase have not been successful (J. Zhao and D.A.B., unpublished results).

Three additional ORFs were detected within the region sequenced. Upstream from the psaC gene and encoded on the opposite DNA strand, an ORF predicting a protein of 92 aa with Mr of 10410 and a predicted pI of 8.49 was detected from nt 331-53. A portion of a second ORF is found from nt 44-1. Both of these ORFs are preceded by sequences resembling typical prokaryotic RBS, and hence are likely to represent actual coding sequences. Downstream from the psaC gene, an ORF predicting a protein of 184 aa with a calculated Mr of 20872 and a predicted pI of 6.09 was found to be encoded on the opposite DNA strand from the psaC gene from nt 1622-1067. Database searches did not identify known proteins with homology to those predicted by these three ORFs.

Fig. 1. Physical map of the psaC locus of Synechococcus sp. PCC7002. The psaC gene was mapped and cloned with a hybridization probe derived from the chloroplast genome of N. tabacum (0.6-kb HincII-ClaI fragment derived from plasmid pTb2SX; Hayashida et al., 1987) using hybridization conditions previously described (Bryant and Tandeau de Marsac, 1988). A 3.0-kb EcoRI fragment was isolated by cloning size-fractionated DNA fragments into plasmid pUC19 (Yanisch-Perron et al., 1985). Hybridization screening of this library at high density produced a number of hybridizing clones, and one, denoted pER1, was chosen for further restriction mapping and nt sequence analysis. A region of 1800 bp was sequenced on both strands (see arrows at bottom of figure) using a combination of subclones and synthetic oligodeoxynucleotide primers as previously described (Cantrell and Bryant, 1987). The arrow within the box indicates the direction of transcription of the psaC gene. Restriction endonucleases: B, BamHI; C, HincII; E, EcoRI; G, BglII; H, HindIII; N, NcoI; X, XbaI.
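As a rough consistency check on the transcript mapping for psaC, the sketch below adds the mapped 5' leader (51 nt) to the coding region (nt 585-830) and compares the sum with the approx. 350-400 nt mRNA size. The function name, and the reading of the remainder as 3' untranslated sequence, are illustrative assumptions rather than results of this work.

```python
def transcript_budget(cds_start, cds_end, leader_nt, mrna_range):
    """Return (CDS length, leader + CDS, nt left over for a 3' trailer).

    Coordinates are inclusive, as quoted in the text. The 3' trailer
    interpretation is an assumption made for illustration only.
    """
    cds_len = cds_end - cds_start + 1      # inclusive coordinate span
    minimum = leader_nt + cds_len          # shortest possible transcript
    low, high = mrna_range
    return cds_len, minimum, (low - minimum, high - minimum)


cds_len, minimum, trailer = transcript_budget(585, 830, 51, (350, 400))
print(cds_len, minimum, trailer)  # 246 297 (53, 103)
```

Under these assumptions only about 50-100 nt remain beyond the stop codon, which leaves no room for a second gene and is consistent with a monocistronic mRNA.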

[Fig. 2: nt sequence of the 1800-bp sequenced region with deduced aa sequences. Deduced PsaC sequence (nt 585-830): MSHSVKIYDTCIGCTQCVRACPLDVLEMVPWDGCKAGQIASSPRTEDCVGCKRCETACPTDFLSIRVYLGAETTRSMGLAY]

Fig. 2. The nt sequence and deduced aa sequence of the psaC gene of Synechococcus sp. PCC7002 and surrounding region. Deduced aa sequences for three additional ORFs of 184, 92, and >14 codons are shown upstream and downstream from the psaC coding sequence. The putative Shine-Dalgarno-like RBS sequence (underlined and labeled S.-D.) is shown immediately 5' to the psaC gene. The mapped 5' end of the psaC transcript is indicated by the arrow 51 nt upstream from the translational start codon; a sequence motif resembling the -10 element of consensus E. coli σ70 promoters is also underlined. The sequences presented have been deposited in GenBank under the accession No. M86238.

In most chloroplasts the psaC gene has been shown to be flanked by genes exhibiting homology to subunits of the mitochondrial NAD(P)H dehydrogenase complex (Shinozaki et al., 1986; Ohyama et al., 1986; Schantz and Bogorad, 1987; Hiratsuka et al., 1989). The psaC gene is flanked upstream by the ndhE gene, while on the downstream side psaC is flanked by the ndhD gene. In maize, the psaC and ndhD genes have been shown to be cotranscribed (Schantz and Bogorad, 1987). This arrangement has recently been shown to be partially conserved in the cyanobacterium Synechocystis sp. PCC6803 (Anderson and McIntosh, 1991). In this cyanobacterium, an ORF found 5' to the psaC gene exhibits considerable homology to the ndhE gene of tobacco, maize, and liverwort. Additionally, downstream from the psaC gene, an ORF of 273 bp is 48% identical to the 5' portion of the 1527 bp-long ndhD gene of maize. This arrangement is not found in Synechococcus sp. PCC7002, as no significant homology to ndh genes could be detected either upstream or downstream from the psaC gene.
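The resemblance between the 5'-TAAACT-3' motif found upstream of the psaC transcription start and a -10 promoter element can be put in numbers by counting matching positions. The sketch below assumes the commonly cited E. coli σ70 -10 consensus TATAAT, which is not spelled out in this paper; the function and constant names are illustrative.

```python
# Commonly cited sigma-70 -10 consensus hexamer; an assumption, since
# the consensus sequence itself is not given in the text.
SIGMA70_MINUS10 = "TATAAT"


def matches_to_consensus(hexamer, consensus=SIGMA70_MINUS10):
    """Count the positions at which a motif agrees with the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("motif and consensus must have the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus.upper()))


print(matches_to_consensus("TAAACT"))  # 4
```

Four of six positions agree under this assumption, including the first two and the last, which fits the description of the motif as resembling, but not matching, the consensus.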

(b) The psaC gene of the cyanelle genome of Cyanophora paradoxa

Heterologous hybridization experiments to map the location of the psaC gene of C. paradoxa indicated that the psaC gene was encoded on the cyanelle DNA and was located in the small single-copy region of the genome near map position 120 (see Lambert et al., 1985). Additional hybridization experiments indicated that the psaC gene was encoded on a HincII fragment of 2.1 kb (see restriction map shown in Fig. 4). The complete nt sequence of this fragment was determined on both strands and is shown in Fig. 5. The psaC gene consists of 82 codons, occurs from nt 372-617, and predicts a protein with a Mr of 8709 (assuming the N-terminal Met is post-translationally removed as found for all other PsaC proteins) and a predicted pI of 5.58. Downstream from the psaC gene a large inverted repeat, which could play a role in transcription termination or mRNA stabilization, occurs from nt 670-724. The start codon is preceded by the consensus RBS motif 5'-AGGAG, which occurs 9-13 bp upstream from the Met start codon. The predicted sequence of the PsaC protein of C. paradoxa is compared in Fig. 3 to PsaC proteins of several cyanobacteria (prokaryotes), green algae, and higher plants (eukaryotes). As can be seen in Fig. 3, the cyanelle protein is intermediate in similarity between the prokaryotic and eukaryotic sequences.

As found for the psaC gene of Synechococcus sp. PCC7002 described above, the psaC gene of C. paradoxa is not flanked by genes exhibiting similarity to ndhD and ndhE. Downstream from the psaC gene, an ORF predicting a protein of 243 aa with predicted Mr of 26606 and pI of 8.55 was found (nt 774-1505; Figs. 4 and 5). Although this putative gene is not preceded by an obvious RBS, the ORF is followed by a large inverted repeat that could function as a transcription terminator or mRNA stabilizer (see Fig. 5). Database searches failed to identify any known proteins exhibiting high degrees of similarity to this sequence. Still further downstream and encoded on the opposite DNA strand from psaC and ORF 243, an ORF of greater than 162 codons was found (nt 2153-1665; Figs. 4 and 5). This partial ORF predicts a very hydrophobic protein, and hydropathy analyses suggest that sequence motifs are present which could represent transmembrane α-helices (data not shown). Although the identity of this ORF is not known, database searches indicate that the predicted protein has strong similarity to an ORF which occurs downstream from the ruvB gene of E. coli (Shinagawa et al., 1988).

Fig. 4. Physical map of the psaC locus of the cyanelle genome of C. paradoxa. The psaC gene was localized by heterologous hybridization with a probe derived from the chloroplast genome of N. tabacum (Hayashida et al., 1987) by using hybridization conditions previously described (Bryant and Tandeau de Marsac, 1988). Hybridization to a 14.6-kb BamHI fragment and to a 14.4-kb PstI fragment localized the gene to the small unique-copy region of the cyanelle genome (see Lambert et al., 1985). Additional mapping indicated that the psaC gene was localized to a 2.1-kb HincII fragment (see Fig. 5) near map position 122 (Lambert et al., 1985). This fragment was subcloned into pUC19 (Yanisch-Perron et al., 1985) to yield plasmid pCpHc2.0. The complete nt sequence of this fragment (see arrows below the map) was determined on both strands using synthetic oligodeoxynucleotide primers as previously described (Cantrell and Bryant, 1987). Arrows within the boxes indicate the direction of transcription of the respective genes and ORFs. Restriction endonucleases: C, ClaI; D, DraI; Hc, HincII; S, SalI.

Upstream from the psaC gene of C. paradoxa, two genes encoding tRNAs, trnN(GUU) and trnL(UAG), were found at nt 1-40 and 204-123, respectively. The trnL(UAG) gene is approx. 60% identical to the equivalent tRNA genes of liverwort (Ohyama et al., 1986) and tobacco (Shinozaki et al., 1986) and is divergently transcribed relative to the psaC gene (Fig. 4). The partial trnN gene sequence is approx. 94% identical to the equivalent tRNA genes of liverwort (Ohyama et al., 1986) and tobacco (Shinozaki et al., 1986) and is transcribed from the same DNA strand as the psaC gene (convergently transcribed towards the trnL(UAG) gene; Fig. 4).
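The feature coordinates quoted in this section can be cross-checked against the stated ORF sizes. In the sketch below, a start coordinate greater than the end coordinate denotes a feature on the opposite strand, following the way coordinates are written in the text; the `Feature` class and its field names are illustrative assumptions, and complete ORFs are assumed to include their stop codon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    start: int  # first nt of the feature as quoted in the text
    end: int    # last nt; end < start marks the opposite strand

    @property
    def strand(self):
        return "+" if self.start <= self.end else "-"

    @property
    def length(self):
        return abs(self.end - self.start) + 1  # inclusive coordinates

    @property
    def codons(self):
        """Whole codons spanned, or None if length is not a multiple of 3."""
        quotient, remainder = divmod(self.length, 3)
        return quotient if remainder == 0 else None


features = [
    Feature("psaC", 372, 617),
    Feature("ORF243", 774, 1505),
    Feature("partial ORF", 2153, 1665),
    Feature("trnL(UAG)", 204, 123),
]
for f in features:
    print(f.name, f.strand, f.length, f.codons)
```

With these coordinates psaC spans 82 codons (81 aa plus stop) and ORF 243 spans 244 codons (243 aa plus stop), in agreement with the text; the tRNA gene is not protein-coding, so its length need not be a multiple of three.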

[Fig. 3: alignment of PsaC sequences. Reference sequence (Synechococcus sp. PCC7002): MSHSVKIYDTCIGCTQCVRACPLDVLEMVPWDGCKAGQIASSPRTEDCVGCKRCETACPTDFLSIRVYLGAETTRSMGLAY]

Fig. 3. Comparison of the deduced aa sequences of the Synechococcus sp. PCC7002 and C. paradoxa PsaC proteins to those of diverse cyanobacteria and eukaryotes. Only aa which differ from those of the Synechococcus sp. PCC7002 sequence are shown. Sources of sequence data: Synechococcus sp. PCC7002 and C. paradoxa, this work; Nostoc sp. PCC8009, Bryant et al., 1990; Synechocystis sp. PCC6803, Anderson and McIntosh, 1991; Synechococcus vulcanus, Shimizu et al., 1990; Chlamydomonas reinhardtii, Takahashi et al., 1991; tobacco, Shinozaki et al., 1986; liverwort, Ohyama et al., 1986; spinach, Oh-oka et al., 1988; pea and wheat, Dunn and Gray, 1988; maize, Schantz and Bogorad, 1988; barley, Scheller et al., 1989; rice, Hiratsuka et al., 1989. Last digits of numerals are aligned with the corresponding aa.

[Fig. 5: nt sequence of the 2153-bp fragment with deduced aa sequences. Deduced PsaC sequence (nt 372-617): MAHTVKIYDTCIGCTQCVRACPTDVLEMVPWDGCRANQIASAPRTEDCVGCKRCESACPTDFLSIRVYLGAETTRSMGLGY]

Fig. 5. The nt sequence of the 2153-bp HincII fragment of the cyanelle genome of C. paradoxa encoding psaC. The deduced aa sequence for the psaC gene, found between nt 372-617, is shown. The coding sequences for two tRNA genes, trnL(UAG) and trnN(GUU), are also indicated. An ORF (nt 774-1505) potentially encoding a protein of 243 aa occurs downstream from the psaC gene and is located on the same strand. A second incomplete ORF occurs still further downstream on the opposite strand (nt 2153-1665). Inverted repeat sequences, which could play a role in transcription termination or mRNA stabilization, are shown by the convergent arrows 3' from the psaC gene and ORF 243. A putative RBS (nt 359-365) for the psaC gene is underlined. Asterisks mark the stop codons. The sequences presented have been deposited in GenBank under the accession No. M86239.

(c) Conclusions

(1) The psaC gene of the unicellular cyanobacterium Synechococcus sp. PCC7002 has been cloned and sequenced. The psaC gene is not immediately flanked by genes encoding components of the NAD(P)H dehydrogenase complex as found for the cyanobacterium Synechocystis sp. PCC6803 or for higher plant chloroplast DNAs.

(2) The psaC gene of Synechococcus sp. PCC7002 is transcribed as a monocistronic mRNA of approx. 350-400 nt; transcription is initiated 51 nt 5' to the translational start codon of the gene.

(3) The psaC gene of C. paradoxa is found in the small single-copy region of the cyanelle genome. The psaC gene of C. paradoxa likewise is not flanked by genes encoding components of the NAD(P)H dehydrogenase complex but instead is flanked upstream by trnN(GUU) and trnL(UAG) genes and downstream by two ORFs, one of which encodes a very hydrophobic protein which has an E. coli homolog.

(4) The PsaC proteins of Synechococcus sp. PCC7002 and C. paradoxa are highly homologous (approx. 90-95% identical and >95% similar) to the PsaC proteins of diverse cyanobacteria and higher plants.
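The degree of identity between the two PsaC proteins reported here can be illustrated by a position-by-position comparison. The sequences below are transcribed from the deduced aa sequences of Figs. 2 and 5 and should be treated as an assumption, since they were read from the printed figures; the function name is illustrative. Both proteins are 81 aa long, so no gap handling is attempted.

```python
# Deduced PsaC sequences as transcribed from Figs. 2 and 5
# (an assumption: read from the printed figures, not from GenBank).
SYN_7002 = ("MSHSVKIYDTCIGCTQCVRACPLDVLEMVPWDGCKAGQIA"
            "SSPRTEDCVGCKRCETACPTDFLSIRVYLGAETTRSMGLAY")
C_PARADOXA = ("MAHTVKIYDTCIGCTQCVRACPTDVLEMVPWDGCRANQIA"
              "SAPRTEDCVGCKRCESACPTDFLSIRVYLGAETTRSMGLGY")


def percent_identity(a, b):
    """Return (identical positions, percent identity) for equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("ungapped comparison needs equal-length sequences")
    same = sum(x == y for x, y in zip(a, b))
    return same, 100.0 * same / len(a)


same, pct = percent_identity(SYN_7002, C_PARADOXA)
print(f"{same}/{len(SYN_7002)} identical ({pct:.1f}%)")  # 73/81 identical (90.1%)
```

Under this transcription the two proteins differ at eight of 81 positions, or about 90% identity, at the lower end of the range quoted in conclusion (4).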

ACKNOWLEDGEMENTS

This work was supported by NSF grants DMB-8504294 and DMB-8818997 and Pennsylvania State University Experiment Station Project 2874 to D.A.B. E.R. was the recipient of a postdoctoral fellowship from the Deutsche Forschungsgemeinschaft.

REFERENCES

Anderson, S.L. and McIntosh, L.: Partial conservation of the 5' ndhE-psaC-ndhD 3' gene arrangement of chloroplasts in the cyanobacterium Synechocystis sp. PCC 6803: implications for NDH-D function in cyanobacteria and chloroplasts. Plant Mol. Biol. 16 (1991) 487-499.

Bryant, D.A.: Molecular biology of Photosystem I. In: Barber, J. (Ed.), Current Topics in Photosynthesis, Vol. 11. Elsevier, Amsterdam, 1992, pp. 501-549.

Bryant, D.A. and Tandeau de Marsac, N.: Isolation of genes encoding components of photosynthetic apparatus. Methods Enzymol. 167 (1988) 755-765.

Bryant, D.A., Rhiel, E., de Lorimier, R., Zhou, J., Stirewalt, V.L., Gasparich, G.E., Dubbs, J.M. and Snyder, W.: Analysis of phycobilisome and Photosystem I complexes of cyanobacteria. In: Baltscheffsky, M. (Ed.), Current Research in Photosynthesis, Vol. II. Kluwer Academic Publishers, Dordrecht, 1990, pp. 1-9.

Cantrell, A. and Bryant, D.A.: Molecular cloning and nucleotide sequence of the psaA and psaB genes of the cyanobacterium Synechococcus sp. PCC 7002. Plant Mol. Biol. 9 (1987) 453-468.

Chitnis, P.R. and Nelson, N.: Photosystem I. In: Bogorad, L. and Vasil, I.K. (Eds.), The Photosynthetic Apparatus: Molecular Biology and Operation. Academic Press, San Diego, 1991, pp. 178-224.

Dunn, P.P.J. and Gray, J.C.: Localization and nucleotide sequence of the gene for the 8 kDa subunit of photosystem I in pea and wheat chloroplast DNA. Plant Mol. Biol. 11 (1988) 311-319.

Golbeck, J.H. and Bryant, D.A.: Photosystem I. In: Lee, C.P. (Ed.), Current Topics in Bioenergetics, Vol. 16. Academic Press, New York, 1991, pp. 83-177.

Golbeck, J.H., Parrett, K.G., Mehari, T., Jones, J.L. and Brand, J.J.: Isolation of the intact photosystem I reaction center core containing P700 and iron-sulfur center FX. FEBS Lett. 228 (1988) 268-272.

Hayashida, N., Matsubayashi, T., Shinozaki, K., Sugiura, M., Inoue, K. and Hiyama, T.: The gene for the 9 kDa polypeptide, a possible apoprotein for the iron-sulfur centers A and B of the photosystem I complex in tobacco chloroplast DNA. Curr. Genet. 12 (1987) 247-250.

Hiratsuka, J., Shimada, H., Whittier, R., Ishibashi, T., Sakamoto, M., Mori, M., Kondo, C., Honji, Y., Sun, C.-R., Meng, B.-Y., Li, Y.-Q., Kanno, A., Nishizawa, Y., Hirai, A., Shinozaki, K. and Sugiura, M.: The complete sequence of the rice (Oryza sativa) chloroplast genome: intermolecular recombination between distinct tRNA genes accounts for a major plastid DNA inversion during the evolution of cereals. Mol. Gen. Genet. 217 (1989) 185-194.

Lambert, D.H., Bryant, D.A., Stirewalt, V.L., Dubbs, J.M., Stevens Jr., S.E. and Porter, R.D.: Gene map for the Cyanophora paradoxa cyanelle genome. J. Bacteriol. 164 (1985) 659-664.

Li, N., Warren, P.V., Golbeck, J.H., Frank, G., Zuber, H. and Bryant, D.A.: Polypeptide composition of the photosystem I complex and the photosystem I core protein from Synechococcus sp. PCC6301. Biochim. Biophys. Acta 1059 (1991a) 215-225.

Li, N., Zhao, J., Warren, P.V., Warden, J.T., Bryant, D.A. and Golbeck, J.H.: PsaD is required for the stable binding of PsaC to the photosystem I core protein of Synechococcus sp. PCC 6301. Biochemistry 30 (1991b) 7863-7872.

Oh-oka, H., Takahashi, Y., Wada, K., Matsubara, H., Ohyama, K. and Ozeki, H.: The 8 kDa polypeptide in photosystem I is a probable candidate of an iron sulfur centre protein encoded by the chloroplast gene frxA. FEBS Lett. 218 (1987) 52-54.

Oh-oka, H., Takahashi, Y., Kuriyama, K., Saeki, K. and Matsubara, H.: The protein responsible for center A/B in spinach photosystem I: isolation with iron-sulfur cluster(s) and complete sequence analysis. J. Biochem. 103 (1988) 962-968.

Ohyama, K., Fukuzawa, H., Kohchi, T., Shirai, H., Sano, T., Sano, S., Umesono, K., Shiki, Y., Takeuchi, M., Chang, Z., Aota, S., Inokuchi, H. and Ozeki, H.: Chloroplast gene organization deduced from the complete sequence of liverwort Marchantia polymorpha chloroplast DNA. Nature 322 (1986) 572-574.

Parrett, K.G., Mehari, T. and Golbeck, J.H.: Resolution and reconstitution of the cyanobacterial photosystem I complex. Biochim. Biophys. Acta 1015 (1990) 341-352.

Schantz, R. and Bogorad, L.: Maize chloroplast genes ndhD, ndhE and psaC sequences, transcripts and transcript pools. Plant Mol. Biol. 11 (1987) 239-247.

Scheller, H.V. and Møller, B.L.: Photosystem I polypeptides. Physiol. Plant. 78 (1990) 484-494.

Scheller, H.V., Svendsen, I. and Møller, B.L.: Amino acid sequence of the 9-kDa iron-sulfur protein of Photosystem I in barley. Carlsberg Res. Commun. 54 (1989) 11-15.

Shimizu, T., Hiyama, T., Ikeuchi, M., Koike, H. and Inoue, Y.: Nucleotide sequence of the psaC gene of the cyanobacterium Synechococcus vulcanus. Nucleic Acids Res. 18 (1990) 3644.

Shinagawa, H., Makino, K., Amemura, M., Kimura, S., Iwasaki, H. and Nakata, A.: Structure and regulation of the Escherichia coli ruv operon involved in DNA repair and recombination. J. Bacteriol. 170 (1988) 4322-4329.

Shinozaki, K., Ohme, M., Tanaka, M., Wakasugi, T., Hayashida, N., Matsubayashi, T., Zaita, N., Chunwongse, J., Obokata, J., Yamaguchi-Shinozaki, K., Ohto, C., Torazawa, K., Meng, B.Y., Sugita, M., Deno, H., Kamogashira, T., Yamada, K., Kusuda, J., Takaiwa, F., Kato, A., Tohdoh, N., Shimada, H. and Sugiura, M.: The complete nucleotide sequence of the tobacco chloroplast genome: its gene organization and expression. EMBO J. 5 (1986) 2043-2049.

Takahashi, Y., Goldschmidt-Clermont, M., Soen, S.-Y., Franzén, L.G. and Rochaix, J.-D.: Directed chloroplast transformation in Chlamydomonas reinhardtii: insertional inactivation of the psaC gene encoding the iron sulfur protein destabilizes photosystem I. EMBO J. 10 (1991) 2033-2040.

Wynn, R.M. and Malkin, R.: Characterization of an isolated chloroplast membrane Fe-S protein and its identification as the photosystem I Fe-SA/Fe-SB binding protein. FEBS Lett. 229 (1988) 293-297.

Yanisch-Perron, C., Vieira, J. and Messing, J.: Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33 (1985) 103-119.

Zhao, J., Warren, P.V., Li, N., Bryant, D.A. and Golbeck, J.H.: Reconstitution of electron transport in photosystem I with PsaC and PsaD proteins expressed in Escherichia coli. FEBS Lett. 276 (1990) 175-180.