Cloning and analysis of the tomato nitrate reductase-encoding gene: protein domain structure and amino acid homologies in higher plants


Gene, 85 (1989) 371-380 Elsevier


GENE 03302

(Lycopersicon esculentum; nucleotide sequence; introns; catalytic domains; recombinant DNA)

Françoise Daniel-Vedele, Marie-France Dorbe, Michel Caboche and Pierre Rouzé

Laboratoire de Biologie Cellulaire, INRA, 78026 Versailles Cedex (France)

Received by J.-P. Lecocq: 10 April 1989
Revised: 15 June 1989
Accepted: 28 June 1989

SUMMARY

We have cloned and sequenced the nitrate reductase (NR)-encoding gene (nia) from tomato. When compared to the two Nicotiana tabacum nia structural genes, this 5-kb tomato gene shows a highly conserved structure, the coding sequence being interspersed with three introns at the same positions. Nucleotide sequences of the 5’ promoter regions are not homologous, except for a 250-bp fragment. This small region might be involved in the similar regulation of nia expression in the tomato and tobacco plant species. The tomato gene codes for a 911 amino acid (aa) polypeptide chain. This sequence was aligned with and compared to other higher plant NR sequences. This alignment clearly identifies the three catalytic domains of NR, namely, a molybdopterin cofactor-binding domain, a heme domain and a FAD/NADH domain. On the other hand, it suggests that the less conserved 80-aa N-terminal region, containing a striking acidic aa cluster, is an additional domain bearing a regulatory or structural function.

INTRODUCTION

Nitrates are the most important source of nitrogen for higher plants. Taken up by roots, they are translocated into various tissues of the plant, and then reduced to ammonia in two steps. The first step requires the enzyme nitrate reductase (NR, EC 1.6.6.1), which catalyses the reduction of nitrate to nitrite in the cytoplasm. Nitrite is then reduced in the chloroplast (or plastid) by nitrite reductase. Nitrate reduction is considered to be the major controlling step in the assimilation of nitrate and has been studied extensively in higher plants (for review see Wray, 1986). NR is a homodimer carrying three cofactors, namely FAD, heme and MoCo (for review see Campbell, 1988). Regulation of NR in plants appears to be very complex, since many factors such as light, nitrate, ammonium and growth regulators have been shown to influence the level of NR activity (Beevers and Hageman, 1983). The isolation of cDNA clones coding for the NR apoprotein has made possible the study of this regulation at the molecular level (Cheng et al., 1986; Crawford et al., 1986; Galangau et al., 1988).

The two NR structural genes from tobacco have been cloned in our laboratory (Vaucheret et al., 1989). These genes, nia-2 and nia-1, were found to derive from the Nicotiana sylvestris and Nicotiana tomentosiformis ancestors of tobacco, an amphidiploid species. Their nt sequences have been recently established and the regulation of their transcription has been studied under different physiological conditions (Galangau et al., 1988).

It is our aim to study the expression of nia and its relationship with plant development and biomass production. As opposed to tobacco, which has a complex genome, tomato is a true diploid species which has been well characterized genetically (Rick, 1980). Tomato has been extensively used as a model for agronomical studies and the mineral nutrition of this species is well characterized. It is also an interesting model for studying the influence of mineral nutrition on fruit ripening. The cloned tomato nia gene can be introduced into NR⁻ mutants of Nicotiana tabacum (Müller and Grafe, 1978) or Nicotiana plumbaginifolia (Gabard et al., 1987). Tomato NR⁻ mutants have also been isolated (M. Koornneef, pers. commun.). Hence, it is now also possible to study the expression of this gene introduced back into the tomato genome.

In this paper, we describe the isolation of the tomato nia gene and its nt sequence analysis. The aa sequence of the tomato NR protein, deduced from the nt sequence of the gene, is compared to the NR aa sequences established for other plant species. From this comparison, a domain model of NR is proposed.

Correspondence to: Dr. F. Daniel-Vedele, Laboratoire de Biologie Cellulaire, INRA, Route de St. Cyr, 78026 Versailles Cedex (France) Tel. (33.1)30.83.30.67; Fax (33.1)30.83.30.99.

Abbreviations: aa, amino acid(s); bp, base pair(s); cDNA, DNA complementary to mRNA; ExoIII, exonuclease III; FAD, flavin adenine dinucleotide; kb, 1000 bp; MoCo, molybdopterin cofactor; NADH, nicotinamide adenine dinucleotide (reduced); nia, gene encoding NR; NR, nitrate reductase; nt, nucleotide(s); ORF, open reading frame; ss, single strand(ed); SSC, 0.15 M NaCl/0.015 M Na₃·citrate pH 7.6; SSPE, 0.15 M NaCl/8.8 mM NaH₂PO₄/1 mM EDTA pH 7.5; tsp, transcription start point(s).

0378-1119/89/$03.50 © 1989 Elsevier Science Publishers B.V. (Biomedical Division)

RESULTS AND DISCUSSION

(a) Construction of tomato genomic libraries

Tomato (Lycopersicon esculentum cv. Manapal) plants were grown in the greenhouse until they had developed six to seven leaves. At this stage, leaves were excised, mixed, weighed and immediately frozen in liquid nitrogen. Samples were stored at -80°C until further analysis. Genomic DNA was isolated by the method of Dellaporta et al. (1983) and was purified by CsCl-gradient centrifugation.

Tomato DNA, digested to completion by the restriction enzymes EcoRI or HindIII, was size-fractionated through 5-40% sucrose gradients. Southern-blot experiments were performed on fractions containing DNA of the expected size, and the fractions showing the strongest hybridization signal were selected for the construction of libraries. DNA from these fractions was ligated to EcoRI-digested λgtWES (Stratagene, Inc.) or to HindIII-digested λNM1149 (Murray, 1983). In vitro encapsidation of the ligated molecules was achieved at 22°C for 2 h using commercial extracts (Gigapack, Clontech Labs, CA, USA). Recombinant phages were grown on Escherichia coli strains LE392 for λgtWES or POP13 for λNM1149. A yield of 5 × 10⁵ recombinant clones per µg of insert DNA was usually obtained.

In situ hybridizations of recombinant phages, using either the tobacco cDNA probe carried by the plasmid pBMC102010 (Calza et al., 1987) or the first isolated tomato nia genomic fragment, were performed under low- or high-stringency conditions, respectively. DNA probes were radiolabelled with [α-³²P]dCTP (Amersham) by nick translation to 5 × 10⁷ cpm/µg and used at a concentration of 10⁶ cpm/ml of hybridization solution. Hybridizations under low-stringency conditions were carried out at 55°C in 5 × SSPE, 2 × Denhardt’s solution, 0.1% SDS (w/v). Washes were performed 3 × 10 min at room temperature and 2 × 30 min at 55°C in 2 × SSC, 0.1% SDS. High-stringency conditions were obtained by increasing the temperature of hybridizations and washes up to 65°C.

Positive phages were purified through two further cycles of in situ hybridization.


(b) Isolation of the tomato nia gene

Southern-blot experiments performed on tomato DNA digested by EcoRI (not shown), using tobacco cDNA as radiolabelled probe, showed a unique 6.5-kb fragment in the tomato genome. We built a partial genomic library with tomato DNA digested to completion by EcoRI, size-fractionated on sucrose gradients and ligated to EcoRI-digested λgtWES DNA. A total of 250 000 recombinant clones were screened by in situ hybridization with the tobacco probe under low-stringency conditions (see section a) and six independent clones were isolated. Each of these clones contained the same 6.5-kb tomato DNA fragment. Sequencing analysis (see section c, below) showed that this fragment did not contain the entire tomato nia gene. To isolate the 5’ end of the gene, we constructed a new genomic library with tomato DNA totally digested by HindIII, size-fractionated on sucrose gradients and ligated to HindIII-digested λNM1149 DNA. Among 300 000 recombinant clones, screened by in situ hybridization with the tomato 6.5-kb fragment under high-stringency conditions, 16 positive clones were detected. Five of them were further characterized and contained a 7-kb fragment sharing a common area of 1.5 kb with the EcoRI fragment previously isolated.

(c) Analysis of the nucleotide sequence of the tomato nia gene

To determine the nt sequence of the entire tomato nia gene, we subcloned restriction fragments into pBluescript vectors harbouring a polylinker with multiple cloning sites in either orientation. The characteristics of these vectors allowed us to determine, by unidirectional deletions, the nt sequence on both strands of a 5309-bp fragment containing the entire nia gene flanked by 5’- and 3’-noncoding regions. This sequence is presented in Fig. 1. The localization of putative ORFs was obtained by comparison with the tobacco cDNA. The size and positions of introns were further deduced by computer analysis of the sequence.

All three introns begin with the dinucleotide GT and end with the dinucleotide AG, as observed consistently at the intron/exon junctions of other eukaryotic genes. The average A + T content of these regions is 73%, while the A + T content of the ORF is 56%. High levels of A + T in introns seem to be a common feature of plant genes. It is presumed that such sequences play an important role in the specific splicing of plant introns by the plant machinery (Wiebauer et al., 1988). The plot of the G + C content as a function of nt position in the sequence of the nia gene is illustrated in Fig. 2. An almost perfect agreement between intron position and reduced G + C content is observed. Such a shift in G + C content at borders between introns and exons suggests that G + C plots could be used as a prediction method to locate introns in genomic sequences, in addition to the currently used codon frequency plot.

A hexanucleotide, AATAAA, underlined in Fig. 1, is located 44 bp downstream from the stop codon. This sequence fits perfectly with the consensus sequence found 10-30 nt upstream from the polyadenylation site in the majority of mRNAs (Joshi, 1987a). It has been shown to be required for polyadenylation (Wickens and Stephenson, 1984). Recently, Wilusz and Shenk (1988) demonstrated the binding of a 64-kDa nuclear protein to RNA segments that include this motif.

The tsp was determined by comparison to the tobacco nia promoters, on which primer extension experiments were already performed (Vaucheret et al., 1989), and is located 48 bp upstream of the start codon in the tomato gene. The first three nt of the tomato mRNA transcript, CGT, fit the consensus sequence, PyPuPy, known for most animal genes. In eukaryotic cells, initiation of transcription depends upon recognition of a TATA sequence element by RNA polymerase II. Such an element, TATATAAA (Fig. 1), is found at nt position -33 upstream from the tsp, in agreement with the position of most of these elements in plant cells (Joshi, 1987b). An incomplete CCAAT box, CAAT, underlined in Fig. 1, is located 72 bp upstream from the TATA element. This sequence is thought to modulate the levels of transcription in animal genes.
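The two intron properties described in this section, the GT...AG boundary rule and the elevated A + T content, can be checked with a few lines of Python. The following is a minimal sketch, not the computer analysis the authors ran. The function names are hypothetical, and the intron 1 string is transcribed from Fig. 1 as it survives in this copy, so it may carry transcription errors.

```python
def at_content(seq: str) -> float:
    """Return the A + T percentage of a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def follows_gt_ag_rule(intron: str) -> bool:
    """True if the intron begins with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


# Intron 1 as transcribed from Fig. 1 (lower-case, between the exon
# fragments "...GCG GAA G" and "CT TGG TGG..."); treat as illustrative.
intron_1 = (
    "gtacgat"
    "atgaataataattagcatctattttagatcaaagttatttattatactcaagttgtcatttgtttag"
)

print(len(intron_1), "nt")
print("GT...AG rule:", follows_gt_ag_rule(intron_1))
print("A + T content (%):", round(at_content(intron_1), 1))
```

Applied to all three introns and to the ORF, the same two functions would give the quantities compared in the text; only intron 1 is used here because it is the only intron short enough to be read reliably from this copy of the figure.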
Molecular cloning and nt sequence analysis indicate that the gene coding for the tomato NR apoprotein has all the features of a functional eukaryotic gene. These sequence data will appear in the EMBL/GenBank/DDBJ Nucleotide Sequence Databases under the accession number X14060.
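Distances such as "44 bp downstream from the stop codon" depend on a counting convention that the text does not spell out. The sketch below assumes one possible convention: the offset is the number of nucleotides between the end of the anchor feature and the first base of the motif. The function name and the toy sequence are hypothetical and are not taken from Fig. 1.

```python
def downstream_offsets(seq: str, motif: str, anchor_end: int) -> list[int]:
    """Offsets (in nt) from anchor_end to each motif start at or after it.

    anchor_end is the 0-based index just past the anchor feature,
    for example just past the stop codon.
    """
    seq, motif = seq.upper(), motif.upper()
    offsets = []
    pos = seq.find(motif, anchor_end)
    while pos != -1:
        offsets.append(pos - anchor_end)
        pos = seq.find(motif, pos + 1)
    return offsets


# Hypothetical toy 3' region: last codons, a TAA stop codon,
# 44 filler nucleotides, then the AATAAA hexanucleotide.
filler = "GC" * 22
toy = "CTATTGGTGTTC" + "TAA" + filler + "AATAAA" + "GCGC"
stop_end = toy.index("TAA") + 3

print(downstream_offsets(toy, "AATAAA", stop_end))
```

The same function can locate a TATA-like element relative to the tsp if the sequence is reversed in orientation or if offsets are computed upstream instead; that variant is left out to keep the example short.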

tacgatgaaaaatacaccttaaaatgttagtcgaagtttttgtaatttgactctgaaaatagaaaccacgacacttaatagtgaacggagagagtaattgatataatt

108

tctattttagagtcaaactataaaatttaaccaacatcttgtaatttagtagtatttttcatatataatttttgaatatctaaaatttttaactttgaaagttaacgt

216

aacatgacaaataaaatcaataaatgaggaaaataattttcatcaaatttaaaattgtttttaatatttgaacgataaaattgtaaatcatatagacgaatttttatt

324

atttttgctgatgatgtcacaaacttttgtaatcaaaatt.g.aaaffttggtgtcgattttttgggtcctacttatgtacagtaaatatggggttgagatagttcgtaac

432

I 540 catt=staattttattttaaaaaaatca

ATG GCG GCA TCT GTG GAA AAC AGA CAG TAT ACT CAC CTT GM MAASVENRQYTHLEPGLSGV

CCG GGT TTA TCA GGC GTA

GGC CGT ACT TTC AAG CCT AGG CCT GAT TCC CCG GTT CGT GGT TGC AAC TTC CCT CCT TCA TCC MC GRTFKPRPDSPVRGCNFPPSSNHELPF CM Q

CTC CCT TTC

709 47

AAA AAA CAA AAT CCC CCA ATT TAC CTT GAT TAT TCG TCT AGT GAA GAC GAG GAT GAC GAT GAC GAA AAA AAT GAA TAC K K Q N P P I Y L D Y S S S E D E D D D D E K N E Y

790 74

GTT CAA ATG ATC AAA AAA GGT AAA ACT GAA TTA GAA CCA TCA ATT CAT GAT ACT AGA GAT GM VQMIKKGKTELEPSIHDTRDEGTADNW

CAT GM

628 20

GGT ACC GCT GAT AAT TGG

871 101

ATC GAA CGA AAC TTT TCC TTA ATA CGT CTC ACC GGT AAA CAC CCA TTC AAC TCC GAA CCA CCT TTA TCT CGC CTT ATG CAT IERNFSLIRLTGKHPFNSEPPLSRLMH

952 128

CAC GGG TTC ATC ACT CCC GTA CCA CTT CAT TAC GTC CGT AAC CAC GGC CCA GTC CCC AAA GCA TCC TGG TCT GAC TGG ACT HGFITPVPLHYVRNHGPVPKASWSDWT

1033 155

GTG GAA GTT RCA GGG CTG GTA AAA CGA CCA ATG AAA TTC ACA ATG GAT CAA TTA GTT AAC GAA TTC CCT TCA CGT GAA TTC VEVTGLVKRPMKFTMDQLVNEFPSREF

1114 182

CCT GTC ACA CTT GTG TGC GCA GGC AAT CGT CGT AAA GAG CAG AAT ATG GTG AAG CAG ACA ATT GGT TTC AAT TGG GGT GCT PVTLVCAGNRRKEQNMVKQTIGFNWGA

1195 209

GCT GCC GTT TCA ACC ACC GTA TGG CGC GGA GTA CCT CTC CGC GCC CTG TTG AAA CGG TGC GGT GTT CAG AGT AAG AAA AAA AAVSTTVWRGVPLRALLKRCGVQSKKK

1276 236

GGC GCG CTT AAT GTC TGT TTC GAA GGT TCC GAT GTT TTG CCT GGA GGT GGT GGT TCA AAG TAC GGA ACG AGT ATA AAG AAG GALNVCFEGSDVLPGGGGSKYGTSIKK

1357 263

GAA TTC GCC ATG GAT CCA TCT CGT GAT ATT ATT GTA GCT TAC ATG CAA AAC GGA GAA ATG TTG TCA CCG GAT CAT GGT TTT EFAMDPSRDIIVAYMQNGEMLSPDHGF

1438 290

CCG GTA AGG ATG ATT ATC CCC GGA TTC ATC GGT GGA AGA ATG GTG AAA TGG TTA AAG AGG ATT GTG GTC ACT ACA CAA GAA PVRMIIPGFIGGRMVKWLKRIVVTTQE

1519 317

TCG GAA AGC TAT TAT CAT TAC AAG GAC AAT AGA GTC CTC CCT CCA CAC GTT GAC GCG GAA CTT GCC AAC GCG GAA Ggtacgat 1602 SESYYHYKDNRVLPPHVDAELANAE 342 atgaataataattagcatctattttagatcaaagttatttattatactcaagttgtcatttgtttag

CT TGG TGG TAC AAA CCA GAG TAC ATC ATC AWWYKPEYII

1698 352

AAT GAG CTC AAC ATA AAC TCT GTC ATT ACA ACT CCG TGC CAT GAA GAA ATT TTG CCC ATC AAT GCG TGG ACT ACT CAG AGA NELNINSVITTPCHEEILPINAWTTQR

1779 379

CCT TAC ACG TTG AGA GGC TAT GCT TAT TCT Ggttagttatttttctctctgatttgctgaattttcttagtactgtcgaatttcatcaactgtttgtt P Y T L R G Y A Y S

1877 389

ttaattatttcgtacctcttgttttcagctaatttcaacctcaccctcttatctcatattatcatacaaattttaagatttcatgaaaaaagtatattagtacgaata

1985

ttttgtatgttgttcgaactttttagaaatatcaataaatatatatcgtatttgtcgaaaatactgtatttttaaagagtctgcaatttaggatagtaaaggaaaaat

2093

aggggtgagtggaggtggatgcatgaactttttgatggtgtcatcaacttgtttggttatgtaaacagaatagcaagaaaaagacaatatagttggatttatttaatg

2201

gtttacagaaaaacctggggagcttttcctagttctgaagagtcggtctaaaaataatatataggtaacaaaatttaattgcctacaaagacagaatcttttttgact

2309

gctttagttcctgcatcttaggctgcctcaacaacattataataagttaaattattattattatataatacatcattgtgtttgtattcaatttgaaaactttcaagt

2417

ggtcactttatatagatctagtcaatattattttaatttaagcctaggtataaggacttttatctttttctaatcaacgtgcctaacttgttgcaattttgattcaaa

2525

tacaagtttgacccttcaagtgggaaagtgtgttttttaattgaatagtcagttacatgtttaaattaagctattagacacgattacagttgaatttatgaatt

2633

gacgagatggtgattgtgtziatag GT GGA GGT AAA AAG GTA ACT CGA GTG GAA GTG ACT TTG GAT GGA GGA GAG ACA TGG AGT GTG GGGKKVTRVEVTLDGGETWSV

2719 410

TGT ACA CTT GAT CAC CCA GAG AAG CCA ACA AAG TAT GGC AAG TAC TGG TGT TGG TGC TTT TGG TCA CTC GAG GTT GAG GTG C T L D H P E K P T K Y G K Y W C W C F W S L E V E V

2800 437

CTT GAC TTG CTT AGT GCT AAA GAA ATT GCT GTA CGA GCT ACC GAT GAG ACC CTC MC LDLLSAKEIAVRATDETLNTQPEKLIW MC

N

ACT CM

CCC GAG MG

CTT ATT TGG

2881 464

GTC ATG gtaaattcacatttaactttttacaacttctttaaaatttaaaattaatggcgtaaagtcaaatatatacatgtaactagtccttgatgaaaagtaattttca 2986 V M 467

gtccttttcaacttttcttttatttatgattactttaccatatgcagattgatagagcatgtgatttcacataaaaacaaatttttttcattggtttaagtttcagat

3094

tctttttttctgatctatgtggtccatggatggatgattaatttaaaccacagaaagtcatgaaaattcctattatgcgtaagcagcagcgtgatcacatgaatatgtcacg 3202 tgctaggcctcttattataatttaagttgatcgtgttactgatggaggacggttactagtaagtaactggtaatgttgttatacatttgtctaattttatggtgatgg

3310

atgttgtaatgttcag

GGA ATG ATG MC MT TGT TGG TTT CGA GTG MG GMMNNCWFRVKMNVCKPHKGEIG

ATA GTG TTT GAG CAT CCG ACT CM IVFEHPTQPGNQSGGWMAKERHLEISA

CCT GGA MT

CM

ATG MT

GTG TGC AAA CCT CAC AAG GGA GAG ATT GGT 3395 490

TCG GGT GGA TGG ATG GCA AAG GAG AGA CAC TTG GAG ATA TCA GCA

GTG GCT CCT CCA ACA CTA MG AAG AGT ATA TCA ACT CCT TTC ATG MC VAPPTLKKSISTPFMNTASKMYSMSEV

ATG TAT TCC ATG TCC GAG GTG

3557 544

AGG AAA CAC MC TCT TCA GAC TCT GCT TGG ATC ATA GTC CAT GGA CAT ATC TAC GAT GCC TCA CGT TTC TTG AAA GAC CAT RKHNSSDSAWIIVHGHIYDASRFLKDH

3638 571

CCC GGT GGT GTT GAC AGC ATT CTG ATC MT PGGVDSILINAGTDCTEEFDAIHSDKA

GCT

3719 598

AGT TCT GTC S S V

3800 625

ACA CCA ACA AGG AGT GTA GCT CTC ATC

3881 652

TCC ATC TCC CAT GAT GTT AGG AAA TTC AAA TTT GCA TTA CCC

3962 679

TCT GAG GAT CM GTC TTA GGG TTA CCT GTT GGC AAA CAC ATA TTC CTC TGT GCC ACA GTT GAT GAC AAA CTC TGT ATG CGT SEDQVLGLPVGKHIFLCATVDDKLCMR

4043 706

GCC TAC ACG CCT ACT AGC ACA GTC GAT GM AYTPTSTVDEVGFFELVVKIYFKGVHP

GTA GGG TTC TTC GAG TTG GTT GTC MG

ATC TAC TTC AAA GGT GTT CAC CCT

4124 733

AAA TTC CCT MT GGA GGT CM ATG TCA CM KFPNGGQMSQHLDSLPIGAFLDVKGPL

CAT CTT GAT TCT CTC CCA ATA GGT GCA TTC CTT GAC GTT AAA GGT CCA TTA

4205 760

GGT CAC ATT GM TAC CAA GGT MG GHIEYQGKGNFLVHGKQKFAKKLAMIA

TTC TTA GTC CAT GGT AAA CM

4286 787

MG K

MG K

GCT.GGA ACT GAT TGT ACT GAG GM

CTA TTG GAG GAC TTT AGG ATT GGT GM L L E D F R I G E

CCA AGG GM AAA ATC CCT TGC AAA CTC GTC GAC AAG CM PREKIPCKLVDKQSISHDVRKFKFALP

GGT GGA ACA GGT ATA ACT CCA GTA TAT GGTGITPVYQVMQSILKDPEDDTEMYV

CM

GAG CTT GTT CM

GM

MG

TTT GCC AAG AAG TTA GCT ATG ATA GCG

GTA ATG CAA TCA ATA TTG AAA GAT CCT GM

GTG TAT GCA AAC AGA ACG GAG GAT GAT ATT TTG CTC AAA GAC GM VYANRTEDDILLKDELDAWAEQVPNRV AAA GTA TGG TAT GTC GTT CM KVWYVVQESITQGWKYSTGFVTESILR

TTT GAT GCA ATT CAT TCT GAT MG

CTC ATA ACT ACT GGT TAC ACG TCT GAT TCG TCT CCA MC L I T T G Y T S D S S P N

CAT GGA TCC TCT TCG ATC AGT AGC TTC TTA GCA CCT ATT MG HGSSSISSFLAPIKELVQTPTRSVALI

GGT MT

ACA GCT TCG MG

3476 517

TCC ATT ACA CM

GGA

GAC GAT ACA GAG ATG TAT GTG

CTT GAT GCA TGG GCA GAG CM

GTT CCA MT

TGG AAG TAT AGT ACA GGA TTC GTT ACA GM

AGG GTT

4448 841

TCG ATT CTT AGA

4529 868

GM CAT ATA CCT GM CCA TCT CAT RCA RCA TTG GCA TTA GCA TGT GGA CCA CCT CCA ATG ATA CAA TTT GCT ATT MT EHIPEPSHTTLALACGPPPMIQFAINP MC TTG GAG AAA ATG GGA TAT GAC ATT AAG GAG GM NLEKMGYDIKEELLVF

4367 814

CCA

4610 895

CTA TTG GTG TTC taaattggatggtgatgatgatagatgatatatctctttggagg

4702 911

~ttcttttgtattttcagttgtacatattgtgtttttgtttatcatcaaaatgtactacttcttgtagttcttacattttaattttctactcaactttaggta

4810

taaaatgttgtactgtgtactagtatgtgttatgtcaaagttaagattgtattcatcatggaattacagtactttgttgtcacaagtcttgttgtatatatttttacc

4918

aaattaatgtgtatatatatatagtaaattgaaacatttgtgtgtgtgtttatttattttcctctcaactctaaaactacaattttttttgctcttcattcaatatta

5026

atgttgtaattagcactccattacaagcctaaagagtaaaggatagtggatcatctactctaatatggaataagaaaatcgacattggtatatcgttgaaaatataat

5134

taataataaaaatatgtttttttacaaaagtttaagtctttagatcacatggtcacacaattcaatatgatatcagaactgacagaagttttgaaatcgaattatcag

5242

ttagtcgagtaatcttattttgctatattttatatatacctatagtgttaaacttcacgaaaagaca

5309

Fig. 1. Complete nucleotide sequence of the tomato nia gene and deduced amino acid sequence. The pBluescript vectors (Stratagene, Inc.), carrying multiple cloning sites in a polylinker and the intergenic region of the M13 ss phage, were used as an nt sequencing system. DNA fragments of the appropriate size to be sequenced were generated in the vector by deletions using ExoIII/mung-bean nuclease digestions. Sequencing reactions were performed by the Sanger dideoxy nucleotide method (Sanger et al., 1977), using ‘Sequenase’ enzyme for the polymerization reaction. Noncoding and coding sequences are written in lower-case and upper-case letters, respectively. The nt are numbered from 1 to 5309, 5’ to 3’ on the coding strand, and the aa from 1 to 911, from the start codon (aa 1 = Met). Transcription consensus sequences are underlined. The presumed tsp is indicated by a vertical bar above a double-underlined nt.
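The deduced amino acid sequence in Fig. 1 follows from reading the upper-case coding sequence three nucleotides at a time. The sketch below shows that step under the assumption that the standard genetic code applies. The `translate` helper is a hypothetical name, and the input is the first thirteen codons as printed in the figure.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence (spaces allowed) into one-letter aa codes."""
    cds = "".join(cds.split()).upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First codons of the ORF as printed in Fig. 1.
first_codons = "ATG GCG GCA TCT GTG GAA AAC AGA CAG TAT ACT CAC CTT"
print(translate(first_codons))  # MAASVENRQYTHL

# Last codons before the 3' noncoding region; '*' marks the stop codon.
print(translate("CTA TTG GTG TTC taa"))  # LLVF*
```

Both outputs agree with the one-letter residues printed beneath the corresponding codons in Fig. 1.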

(d) Comparison of three nia genes

Extensive homology was found between the tomato and tobacco nia genes. The ORF is interrupted by three introns localized at the same site in the tobacco and tomato coding sequences. Introns from the tomato gene are smaller than those from the tobacco genes. Introns 1, 2 and 3 in tomato are 74, 846 and 437 bp, respectively, as compared to 594, 1298 and 788 bp in the tobacco nia-2 gene (Fig. 3).
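The lengths quoted in the paper can be combined into a short consistency check. This is a sketch of the bookkeeping only: the variable names are illustrative, and the assumption that the coding length is three nucleotides per residue plus one stop codon is ours, not a statement from the authors.

```python
AA_LENGTH = 911                          # deduced tomato NR polypeptide (aa)
TOMATO_INTRONS = (74, 846, 437)          # introns 1, 2, 3 (bp)
TOBACCO_NIA2_INTRONS = (594, 1298, 788)  # introns 1, 2, 3 (bp)

# Coding nucleotides: three per residue, plus one stop codon (assumption).
coding_nt = 3 * (AA_LENGTH + 1)

tomato_intron_nt = sum(TOMATO_INTRONS)
tobacco_intron_nt = sum(TOBACCO_NIA2_INTRONS)

# Genomic span from the first base of ATG to the last base of the stop codon.
tomato_start_to_stop = coding_nt + tomato_intron_nt

print("coding nt:", coding_nt)
print("tomato intron nt:", tomato_intron_nt)
print("tobacco nia-2 intron nt:", tobacco_intron_nt)
print("tomato ATG-to-stop span:", tomato_start_to_stop)
```

Under these assumptions the tomato introns total 1357 bp against 2680 bp for tobacco nia-2, and the tomato gene spans 4093 bp from start codon to stop codon, which leaves about 1.2 kb of the 5309-bp sequenced fragment for the 5’ and 3’ flanking regions.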

Fig. 2. G + C plot of the tomato nia gene. [Plot of % (G+C) against nucleotide position, with axis ticks every 530 nt from 530 to 5300.] G + C content is calculated using a 51-bp window, by steps of 20 bp. The gene structure (see Fig. 3) has been drawn at the same scale.
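The windowed calculation named in the Fig. 2 legend (51-bp window, 20-bp steps) can be sketched as below. This is an illustration, not the authors' program. Reporting each value at the window centre is an assumption, `gc_profile` is a hypothetical name, and the toy sequence is synthetic.

```python
def gc_profile(seq: str, window: int = 51, step: int = 20) -> list[tuple[int, float]]:
    """Return (window centre as a 1-based position, % G + C) pairs."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start + window // 2 + 1, 100.0 * gc / window))
    return profile


# Synthetic toy sequence: GC-rich "exon", AT-rich "intron", GC-rich "exon".
toy = "GCC" * 40 + "AT" * 60 + "GGC" * 40

for centre, gc in gc_profile(toy):
    print(centre, round(gc, 1))
```

On the toy sequence the profile drops to zero inside the AT-rich block and recovers on either side, which is the kind of shift at intron/exon borders that the text proposes as an aid to locating introns.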

The 5’- and 3’-untranslated regions are also smaller in the tomato nia gene than in the tobacco genes. The nt sequences of the exons are highly conserved between the tomato and tobacco genes (81%), whereas the nt sequences of the introns have diverged more during evolution. Comparison of the tobacco and tomato nia promoters shows a striking homology in a 250-bp area, located in the tomato gene between nt positions 233 and 483. This region, containing the TATA and CAAT boxes, is 75% homologous to a fragment located at the same position in the tobacco gene. Since the regulation of nia gene expression seems to be similar in the two plant species (Galangau et al., 1988), one can imagine that the 250-bp homologous region of the tobacco and tomato nia promoters is implicated in light- or nitrate-mediated gene regulation. No obvious areas of homology were found upstream or downstream from these sequences. The secondary structure found in the non-translated leader sequence of the tobacco nia-2 gene (Vaucheret et al., 1989) is not present in the tomato gene and may therefore not be relevant to the regulation of NR mRNA translation.

Fig. 3. Comparison of the structure of the tomato and tobacco nia genes. [Schematic, with a 1-kb scale bar, of the Lycopersicon esculentum nia, Nicotiana tabacum nia-1 and Nicotiana tabacum nia-2 genes; the legend distinguishes promoter homologous sequences, coding sequences, 5’ and 3’ untranslated sequences, and introns.] The 3’-untranslated regions extend up to the first polyadenylation signal after the stop codon. Homology in coding sequences is 97% between the two tobacco genes and 90% between each of them and the tomato gene. Introns vary in length between the three genes; homology between their colinear part is about 70% for the three tobacco introns and about 63% for tomato intron 2 and intron 3 compared to their tobacco counterparts, no homology being found for intron 1. Extensive homology (> 80%) is found between the 5’ non-coding sequences of the two tobacco genes. When these tobacco 5’ regions are compared to the tomato sequence, homology is restricted to a 250-bp long area and falls to 75%.

(e) Analysis of the tomato NR amino acid sequence and comparison with NRs from other higher plants

The aa sequence deduced from the nt sequence of the tomato nia gene is 911 aa long, a length which falls between that of the Arabidopsis NR2 (917) and tobacco NR (904). The tomato sequence shows a high degree of similarity (90.5%) with both tobacco NR sequences (the tobacco sequences are themselves 98% homologous). When the tomato and tobacco NR sequences are compared to other complete or partial higher plant NR sequences (Fig. 4), namely the two Arabidopsis thaliana isoforms, NR1 (Cheng et al., 1988) and NR2 (Crawford et al., 1988), and NR from maize (Gowri and Campbell, 1989) and barley (Cheng et al., 1986), the homology scores range from 76 to 65%, according to the phylogenetic distance, meaning that NR is fairly conserved in higher plants.

Nitrate reductase, a three-redox-center enzyme, appears to have evolved from gene fusion between sequences coding for one-redox-center proteins. Homology of NR sequences with two of those proteins offers the opportunity to link sequence and function for the C-terminal half of the polypeptide. The surprisingly high homology with all members of the cytochrome b5 superfamily and with cytochrome b5 reductase (Calza et al., 1987) clearly defines the boundaries of the heme and FAD/NADH domains, which are separated by a 25/30-aa hinge region. From experimental results of limited proteolysis on Chlorella NR (Solomonson et al., 1986) and, more recently, on spinach NR (Kubo et al., 1988), there is good evidence that these domains are true structural units. Two hinge regions, d and f in Fig. 4, between each domain could be anticipated. They are characterized by less conserved, hydrophilic residues and present no indication of an ordered secondary structure. The observed proteolysis most probably takes place in these hinge regions, where appropriate cleavage sites for each protease are found. The limited proteolysis experiment also shows that a covalent structure is an absolute requirement for NADH : NR activity, the interdomain interaction probably being too weak to maintain an active quaternary structure.

A catalytic thiol is known to play an essential role in NADH binding to NR and associated dehydrogenase activities (Barber and Solomonson, 1986), as well as in cytochrome b5 reductase activity (Hackett et al., 1988). A good candidate for that function is the only conserved cysteine residue in the C-terminal domain, as pointed out in Fig. 4 (Crawford et al., 1988). This cysteine stands on the border of a long conserved sequence, CGPPPMI (Fig. 4). There is, however, a discrepancy for cytochrome b5 reductase between the position of this residue, namely Cys-273, and the position of the Cys-283 residue which was labeled in an N-ethylmaleimide tagging experiment (Hackett et al., 1986). No equivalent residue is found in the NR sequences at this latter position. Another crucial cytochrome b5 reductase residue, known to be involved in NADH binding through an amino group, has been identified as Lys-110 (Hackett et al., 1988). A lysine residue is also found at a position homologous to Lys-110 in every NR (Fig. 4). Other essential residues, either catalytic (a flavin-binding tyrosine) or involved in charge-pairing with cytochrome b5, remain to be localized on the cytochrome b5 reductase sequence (Hackett et al., 1988) and therefore, most probably, in the NR flavodehydrogenase domain. In many flavoenzymes, the model of which is glutathione reductase, a characteristic dinucleotide-binding Rossmann fold is found in the separate NAD(P)H-binding and FAD-binding subdomains. The sequence data give no indication of such a supersecondary structure in the NR/cytochrome b5 reductase family.

The MoCo/NO₃⁻ domain(s) should occupy the N-terminal half of the polypeptide chain. Two short peptides, obtained from the molybdopterin domain of sulfite oxidase, appear to be significantly homologous to NR sequences in the N-terminal moiety (Crawford et al., 1988), which confirms this assumption. Nevertheless, the precise extent of this domain remains unknown, since no obvious homology with other completely sequenced molybdoproteins has been found. The three intron sites are found to be located in this MoCo region, but there is no indication that they separate functional units as they often do.

Homology is not evenly distributed all over the remaining sequence, the FAD/NADH domain appearing to have evolved more freely than the MoCo one(s) (Table I). The real N-terminal sequence (domain a in Fig. 4) contains an 8-aa polyacidic cluster and is less conserved than any other part of the NR sequence (Table I). This suggests that this sequence is not involved in the catalytic process, but would more probably play a structural or regulatory role.

Fig. 4. Amino acid sequence alignment between NRs from different higher plant species and related redox proteins. Tomato NR (this paper) is compared to tobacco nia-2 NR (Vaucheret et al., 1989), Arabidopsis NR2 (Crawford et al., 1988), Arabidopsis NR1 and barley NR (Cheng et al., 1988), and maize NADH : NR (Gowri and Campbell, 1989). Positions where two or more identical aa are found are boxed. These NR are also compared to other redox proteins (sequences in italics): rat sulfite oxidase (Crawford et al., 1988), bovine liver cytochrome b5 catalytic domain (Ozols and Strittmatter, 1969), and human cytochrome b5 reductase catalytic domain (Yubisui et al., 1984). Positions where the same residue is found for every sequence are shown in bold face. Gap positions, inserted to optimize the alignment, are indicated by dashes. Above the alignment are shown the intron positions (numbered 1, 2, 3) and the domain boundaries (lowercase letters, a to h) as used in Table I. Eleven residues, among which are the two heme-liganding histidines, are conserved in the NR heme domain, as in all members of the cytochrome b5 superfamily (Guiard and Lederer, 1979). These residues and the ‘essential’ lysine and cysteine in the FAD/NADH domain are shown by asterisks.

TABLE I
Amino acid sequence homology ᵃ of NR domains ᵇ

NR domains        Tobacco nia-1   Tobacco nia-2   Arabidopsis NR2   Maize NADH : NR
Whole sequence    89.7            90.5            76.0              (73.5) ᶜ
a: N-term.        69.9            72.3            37.2              -
b: MoCo1          92.6            92.6            79.0              -
c: MoCo2          96.1            96.1            88.6              89.8
e: heme           91.7            95.2            85.7              76.7
f: hinge          86.2            86.2            71.4              44.8
g: FAD1           90.2            91.7            75.7              63.2
h: FAD2           88.0            88.7            71.6              65.4

ᵃ Homology is calculated as a percentage of perfect matches to the tomato sequence used as a reference.
ᵇ The domain boundaries are defined by lowercase letters (a to h) above the aa sequence alignment in Fig. 4.
ᶜ The maize NR sequence is a C-terminal partial sequence (617 aa).

(f) Conclusions

(1) The tomato nia gene is 5.3 kb long, the coding sequence being interrupted by three introns. Within the 5’ and 3’ regions, eukaryotic consensus sequences that may be involved in gene expression are found at their usual location.

(2) Comparison of the tomato and tobacco nia genes shows extensive homology in their coding sequence and in a 250-bp area of their promoter region.

(3) Comparison of the aa sequences of various higher plant NRs and other redox proteins allows us to define the boundaries of the three NR catalytic structural units, and to point out some of their functionally essential residues.

(4) Since cytochrome b5 and cytochrome b5 reductase are universal redox proteins, their structure-function relationships are being intensely investigated. One hopes that findings on their catalytic mechanism will apply by homology to nitrate reductase. Conversely, nitrate reductase, being a covalent analog of the cytochrome b5-cytochrome b5 reductase complex, may help to elucidate how these proteins interact.
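The homology values in Table I are described in its footnote as percentages of perfect matches to the tomato sequence used as a reference. A minimal sketch of such a score follows. The choice to count only positions where the reference is not a gap is an assumption, `percent_identity` is a hypothetical name, and the aligned strings are toy examples rather than rows of Fig. 4.

```python
def percent_identity(reference: str, other: str, gap: str = "-") -> float:
    """Percentage of reference residues matched exactly in an aligned sequence.

    Both strings must come from the same alignment (equal length).
    Columns where the reference itself is a gap are ignored (assumption).
    """
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (r, o)
        for r, o in zip(reference.upper(), other.upper())
        if r != gap
    ]
    if not columns:
        raise ValueError("reference contains only gaps")
    matches = sum(r == o for r, o in columns)
    return 100.0 * matches / len(columns)


# Toy aligned fragments (hypothetical, for illustration only).
print(percent_identity("MAASVENRQY", "MAASVENRQF"))
print(percent_identity("MA-SVENRQY", "MAASVE-RQY"))
```

Restricting the two input strings to the columns of one domain (a to h) would give a per-domain score of the kind listed in Table I.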

ACKNOWLEDGEMENTS

Sequence data analyses were performed with the BISANCE package using the CITI2 computer facilities, granted by the French ‘Ministère de la Recherche et de la Technologie’. The technical assistance of Jacques Goujaud, Jean-Marie Pollien and Gerard Vastra is gratefully acknowledged. We thank W.H. Campbell for sharing sequence data before publication. We are indebted to Ian Small and Ann Oaks for critical reading of the manuscript.

REFERENCES Barber, M.J. and Solomonson, L.P.: The role of the essential sulthydryl group in assimilatory NADH : nitrate reductase of Chlorella. J. Biol. Chem. 261 (1986) 4562-4567.

Beevers, L. and Hageman, R.H.: Uptake and reduction of nitrate: bacteria and higher plants. In Läuchli, A. and Bieleski, R.L. (Eds.), Encyclopedia of Plant Physiology, Vol. 15A. Springer, New York, 1983, pp. 351-375.

Calza, R., Huttner, E., Vincentz, M., Rouzé, P., Galangau, F., Vaucheret, H., Chérel, I., Meyer, C., Kronenberger, J. and Caboche, M.: Cloning of DNA fragments complementary to tobacco nitrate reductase mRNA and encoding epitopes common to nitrate reductases from higher plants. Mol. Gen. Genet. 209 (1987) 552-562.

Campbell, W.H.: Higher plant nitrate reductase: arriving at a molecular view. Curr. Top. Plant Biochem. Physiol. 7 (1988) 1-15.

Cheng, C.H., Dewdney, J., Kleinhofs, A. and Goodman, H.M.: Cloning and nitrate induction of nitrate reductase mRNA. Proc. Natl. Acad. Sci. USA 83 (1986) 6825-6828.

Cheng, C.H., Dewdney, J., Nam, H.G., Den Boer, B.G.W. and Goodman, H.M.: A new locus (NIA1) in Arabidopsis thaliana encoding nitrate reductase. EMBO J. 7 (1988) 3309-3314.

Crawford, N.M., Campbell, W.H. and Davis, R.W.: Nitrate reductase from squash: cDNA cloning and nitrate regulation. Proc. Natl. Acad. Sci. USA 83 (1986) 8073-8076.

Crawford, N.M., Smith, M., Bellissimo, D. and Davis, R.W.: Sequence and nitrate regulation of the Arabidopsis thaliana mRNA encoding nitrate reductase, a metalloflavoprotein with three functional domains. Proc. Natl. Acad. Sci. USA 85 (1988) 5006-5010.

Dellaporta, S.L., Wood, J. and Hicks, J.B.: A plant DNA minipreparation: version II. Plant Mol. Biol. Rep. 1 (1983) 19-21.

Gabard, J., Marion-Poll, A., Chérel, I., Meyer, C., Müller, A.J. and Caboche, M.: Isolation and characterization of Nicotiana plumbaginifolia nitrate reductase deficient mutants: genetic and biochemical analysis of the NIA complementation group. Mol. Gen. Genet. 209 (1987) 596-606.

Galangau, F., Daniel-Vedele, F., Moureaux, T., Dorbe, M.F., Leydecker, M.T. and Caboche, M.: Expression of leaf nitrate reductase genes from tomato and tobacco in relation to light-dark regimes and nitrate supply. Plant Physiol. 88 (1988) 383-388.

Gowri, G. and Campbell, W.H.: cDNA clones for corn leaf NADH:nitrate reductase and chloroplast NAD(P)+:glyceraldehyde-3-phosphate dehydrogenase. Plant Physiol. 90 (1989) 792-798.

Guiard, B. and Lederer, F.: The 'cytochrome b5 fold': structure of a novel protein superfamily. J. Mol. Biol. 135 (1979) 639-650.

Hackett, C.S., Novoa, W.B., Ozols, J. and Strittmatter, P.: Identification of the essential cysteine residue of NADH-cytochrome b5 reductase. J. Biol. Chem. 261 (1986) 9854-9857.

Hackett, C.S., Novoa, W.B., Read Kensil, C. and Strittmatter, P.: NADH binding to cytochrome b5 reductase blocks the acetylation of lysine 110. J. Biol. Chem. 263 (1988) 7539-7543.

Joshi, C.P.: Putative polyadenylation signals in nuclear genes of higher plants: a compilation and analysis. Nucleic Acids Res. 15 (1987a) 9627-9640.

Joshi, C.P.: An inspection of the domain between putative TATA box and translation start in 79 plant genes. Nucleic Acids Res. 15 (1987b) 6643-6653.

Kubo, Y., Ogura, N. and Nakagawa, H.: Limited proteolysis of the nitrate reductase from spinach leaves. J. Biol. Chem. 263 (1988) 19684-19689.

Müller, A.J. and Grafe, R.: Isolation and characterization of cell lines of Nicotiana tabacum lacking nitrate reductase. Mol. Gen. Genet. 161 (1978) 67-76.

Murray, N.E.: Phage lambda and molecular cloning. In Hendrix, R.W., Roberts, J.W., Stahl, F.W. and Weisberg, R.A. (Eds.), Lambda II. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1983, pp. 395-432.

Ozols, J. and Strittmatter, P.: Correction of the amino acid sequence of calf liver cytochrome b5. J. Biol. Chem. 244 (1969) 6617-6618.

Rick, C.M.: Linkage map of the tomato (Lycopersicon esculentum). In O'Brien, S. (Ed.), Genetic Maps, Vol. 1, NIH, Bethesda, 1980, pp. 268-281.

Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.

Solomonson, L.P., Barber, M.J., Robbins, A.P. and Oaks, A.: Functional domains of assimilatory NADH:nitrate reductase of Chlorella. J. Biol. Chem. 261 (1986) 11290-11294.

Vaucheret, H., Vincentz, M., Kronenberger, J., Caboche, M. and Rouzé, P.: Molecular cloning and characterisation of the two homeologous genes coding for nitrate reductase in tobacco. Mol. Gen. Genet. 216 (1989) 10-15.

Wickens, M. and Stephenson, B.: The role of the conserved AAUAAA sequence in mRNA maturation: four AAUAAA point mutants prevent 3' end formation. Science 226 (1984) 1045-1051.

Wiebauer, K., Herrero, J.J. and Filipowicz, W.: Nuclear pre-mRNA processing in plants: distinct models of 3'-splice site selection in plants and animals. Mol. Cell. Biol. 8 (1988) 2042-2051.

Wilusz, J. and Shenk, T.: A 64 kDa nuclear protein binds to RNA segments that include the AAUAAA polyadenylation motif. Cell 52 (1988) 221-228.

Wray, J.L.: The molecular genetics of higher plant nitrate assimilation. In Blonstein, A.D. and King, P.J. (Eds.), A Genetic Approach to Plant Biochemistry. Springer, New York, 1986, pp. 101-157.

Yubisui, T., Miyata, T., Iwanaga, S., Tamura, M., Yoshida, S., Takeshita, M. and Nakajima, H.: Amino-acid sequence of NADH-cytochrome b5 reductase of human erythrocytes. J. Biochem. 96 (1984) 579-582.