GENOMICS 2, 240-248 (1988)
Molecular Cloning of the Mouse Angiotensinogen Gene
W. M. CLOUSTON, B. A. EVANS, J. HARALAMBIDIS, AND R. I. RICHARDS
Howard Florey Institute of Experimental Physiology and Medicine, University of Melbourne, Parkville 3052, Victoria, Australia
Received December 22, 1987; revised March 9, 1988
We describe here the cloning, restriction mapping, and sequencing of the mouse angiotensinogen gene. The 5’ flanking region contains consensus sequences for several hormone-responsive elements and viral-like enhancers within 750 bp of the cap site. The deduced amino acid sequence shows 87% identity with rat angiotensinogen, but there is a discrepancy in the number of cysteine residues in the mature protein among rat (n = 3), human (n = 4), and mouse (n = 4). Because angiotensinogen is homologous to other members of the serine protease inhibitor family, we aligned the putative reactive center of angiotensinogens from various species. This alignment shows that the inhibitor site in human angiotensinogen is different from its rodent counterpart, but the role of this sequence divergence in the pathogenesis of human disease remains to be established. © 1988 Academic Press, Inc.
Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession No. J03046.
0888-7543/88 $3.00 Copyright © 1988 by Academic Press, Inc. All rights of reproduction in any form reserved.
INTRODUCTION
The renin-angiotensin-aldosterone system plays an essential role in the regulation of blood pressure, sodium homeostasis, and thirst. Angiotensinogen is a glycoprotein secreted predominantly by the liver which, after cleavage by renin, liberates angiotensin I. Hydrolysis of this decapeptide by angiotensin-converting enzyme produces the biologically active peptide angiotensin II. Enzyme kinetic data from several species suggest that the plasma concentration of angiotensinogen is a rate-limiting factor in the generation of the angiotensin peptides (for review, see Reid et al., 1978). Hepatic production of angiotensinogen can be increased by a variety of physiological stimuli, including angiotensin II (Sernia and Reid, 1980), glucocorticoids, estrogen, thyroxine, and bilateral nephrectomy (Campbell et al., 1984). Angiotensinogen messenger RNA is present in many tissues besides the liver, notably brain and kidney (Ohkubo et al., 1986; Dzau et al., 1987). Expression of the angiotensinogen gene is differentially regulated in various tissues (Campbell and Haebener, 1986). For example, glucocorticoids increase angiotensinogen mRNA in the liver and brain, but not in the kidney (Kalinyak and Perlman, 1987); lipopolysaccharide exerts effects in the liver, but not in the brain (Kageyama et al., 1985); and a decrease in plasma sodium concentration induces angiotensinogen mRNA predominantly in the kidney (Ingelfinger et al., 1986). The molecular basis of this differential expression has not been defined.
An unexpected observation that followed the determination of the rat angiotensinogen protein sequence was its homology to members of the serine protease inhibitor family, especially α1-antitrypsin (Doolittle, 1983). Subsequently, the rat angiotensinogen gene was shown to have structural features in common with the α1-antitrypsin gene (Tanaka et al., 1984). However, the physiological basis for this sequence homology remains to be definitively established. It has been suggested that des-AI-angiotensinogen may function as a renin inhibitor (Poulsen and Jacobsen, 1986), despite the fact that renin is an acid protease and not a serine protease.
We describe here the cloning and molecular anatomy of the mouse angiotensinogen gene. Because of the differentially regulated expression of angiotensinogen in a number of organs, a major reason for undertaking this project was to provide a completely homologous system for characterizing the tissue-specific enhancers flanking the gene using transgenic mice. Our comparative analysis of the two available angiotensinogen promoter sequences, namely, rat (Ohkubo et al., 1986) and mouse, has defined the position of shared consensus sequences for several enhancers and a highly conserved palindromic sequence (5’-dTCTGTACAGA-3’) upstream of the start of transcription.
MATERIALS AND METHODS
Mice
BALB/c mice used in this study were obtained from three sources. For the construction of the cDNA library, BALB/c mice were purchased from the Monash University Animal House, Clayton, Victoria, Australia. DNA for the construction of the λ genomic libraries was prepared (while our group was at the Australian National University) from BALB/c mice obtained at the John Curtin School of Medical Research, Canberra, Australia. Dr. M. Graham constructed his BALB/c mouse cosmid library from animals bred at the Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia.
Hormone Treatment
In order to enrich for the low-abundance angiotensinogen mRNA, mice were given a combination of ethynylestradiol (0.3 mg/kg subcutaneously) for 3 days and dexamethasone (6.7 mg/kg intraperitoneally) 12 and 24 h before death. In addition, the animals were bilaterally nephrectomized 24 h before death (Campbell et al., 1984).
mRNA Preparation
Livers were surgically removed and homogenized in 5 M guanidinium thiocyanate. RNA was extracted by centrifugation through a CsCl cushion and enriched for polyadenylated RNA by affinity chromatography on oligo(dT)-cellulose (Maniatis et al., 1982).
Northern Blots
Polyadenylated RNA (10 μg) from induced and control mouse liver was electrophoresed through a 1.2% agarose gel containing 2.1 M formaldehyde. The gel was stained with ethidium bromide to visualize the size markers, and the RNA was subsequently transferred to nitrocellulose (Maniatis et al., 1982).
Oligodeoxyribonucleotides
The sequences of the two oligodeoxyribonucleotides used to probe the Northern blot and the cDNA library were based on conserved regions of the rat and human angiotensinogen cDNA sequences (Ohkubo et al., 1983; Kageyama et al., 1984). The exon 2 probe (5’-dGAGATGAAAGGGGTGGATGTATACGCGGTCCCCAGC-3’) was from the angiotensin I coding region including 6 bases upstream in the signal peptide region, and the exon 3 probe (5’-dCACTGAGGTGCTGTTGTCCACCCAGAACTC-3’) spans a glycosylation site. Probes were chemically synthesized on an Applied Biosystems Inc. Model 380A DNA synthesizer by the solid-phase phosphoramidite method (Beaucage and Caruthers, 1981) and purified by polyacrylamide gel electrophoresis. Probes were labeled at the 5’ terminus using T4 polynucleotide kinase and [γ-32P]ATP (Maniatis et al., 1982). Filters were prehybridized for 2-4 h at 42°C in the 20% formamide buffer described by Ullrich et al. (1984) and subsequently hybridized to the 32P-labeled oligodeoxyribonucleotide for 16-18 h under the same conditions. The filters were then washed to a stringency of 0.5× SSC (0.075 M NaCl, 0.0075 M sodium citrate) at 42°C, air-dried, and autoradiographed overnight at -80°C with a single intensifying screen.
cDNA Library Construction
A cDNA library was constructed from induced liver mRNA in the bacteriophage vector λgt10 using an RNaseH second-strand synthesis (Watson and Jackson, 1985). Recombinant phage were screened by in situ hybridization on duplicate 0.45-μm nitrocellulose filters which were probed under the same conditions as those used for the Northern blots. All experiments using recombinant DNA were performed under C1 conditions according to the guidelines of the Recombinant DNA Monitoring Committee of Australia.
Genomic Southern Blots
Southern blots of genomic DNA digested with HindIII and BamHI (Pharmacia Molecular Biologicals) were hybridized at high stringency with the 800-bp mouse angiotensinogen cDNA, which had been labeled using random primers, [α-32P]dATP, and [α-32P]dCTP exactly as described by Shine et al. (1983).
Genomic Library Screening
Two genomic libraries were constructed in the EMBL3 vector using the method of Kaiser and Murray (1985). Briefly, BALB/c mouse liver DNA was subjected to partial digestion with Sau3A, dephosphorylated with bacterial alkaline phosphatase to prevent multiple inserts, and size-fractionated on a 5-24% NaCl gradient. DNA in the size range of 14-22 kb was ligated to EMBL3 arms. Library 1 was packaged using BHB2688/BHB2690-derived extracts (Maniatis et al., 1982) and plated on the recA+ Escherichia coli strain LE392. Library 2 was packaged using Gigapak-Gold (Stratagene Inc., La Jolla, CA) and plated on the recombination-deficient E. coli strain DB1255 (Wyman et al., 1985) obtained from Dr. R. Chiu, San Diego, California. An amplified BALB/c mouse cosmid library was obtained from Dr. M. Graham, Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia. The libraries were screened with the mouse angiotensinogen cDNA probe under the high-stringency conditions described for the genomic Southern blots. Positive clones were mapped by digestion of the DNA with a panel of restriction enzymes, followed by transfer to nitrocellulose and hybridization to a series of exon-specific oligodeoxyribonucleotide probes derived from the rat (Ohkubo et al., 1983) and mouse cDNA sequences. To assist with the mapping of fragments at either end of the cosmid clones, we used a strategy analogous to that previously described for bacteriophage λ (Rackwitz et al., 1984; Clouston et al., 1987). Briefly, asymmetrical vector “arms” were created by digestion of the unique SalI site in the vector pJB8. After complete digestion with a second enzyme followed by Southern blotting, restriction fragments containing the vector arms were identified with specific oligodeoxyribonucleotides complementary to either arm.
DNA Sequence Analysis
DNA restriction fragments were cloned into M13mp18 or mp19 (Yanisch-Perron et al., 1985) and sequenced by the chain-termination method (Sanger et al., 1977). Sequencing reactions were routinely incubated at 50°C to minimize effects of secondary structure. Any remaining GC compressions were resolved using deoxy-7-deazaguanosine (Mizusawa et al., 1986).
Primer Extension
An oligodeoxyribonucleotide (5’-dTGCTCTTGTTGTGGTAAAGGAGATGGAAGG-3’) was synthesized complementary to residues 998 to 1027 (according to the numbering of Fig. 5). This is equivalent to residues 153 to 182 from the putative cap site of spliced mRNA. After 5’-end-labeling, the oligodeoxyribonucleotide was used for primer extension with 10 μg of polyadenylated liver RNA according to the method of Bodner and Karin (1987).
RESULTS AND DISCUSSION
Identification of a Mouse Angiotensinogen cDNA Clone
In order to enrich for the low-abundance angiotensinogen mRNA, mice were nephrectomized and given estrogen and dexamethasone. The exon 2 probe hybridizes to an mRNA of the correct size when compared to other species (Ohkubo et al., 1983), and this mRNA is inducible by estrogen and dexamethasone (Fig. 1). A cDNA library constructed in λgt10 from this induced liver mRNA contained 10^4 individual recombinants. This library contained no cDNAs extending to where the exon 2 probe would hybridize. Only one clone, designated λMA1, hybridized to the exon 3 oligodeoxyribonucleotide, which is in agreement with previous estimates (0.01%) (Ohkubo et al., 1983) of angiotensinogen mRNA abundance. This clone is 800 bp in length and spans exons 3, 4, and 5 as well as 158 bp of 3’ untranslated region prior to the poly(A) tail.
FIG. 1. Northern blot of mouse liver poly(A)+ RNA (10 μg/track) probed with the exon 2 oligodeoxyribonucleotide spanning the angiotensin I coding region. Lanes 2 and 4 are from control animals. Lanes 1, 3, and 5 are from nephrectomized mice treated with dexamethasone and estrogen. Exposure was for 16 h with an intensifying screen.
Cloning of the Mouse Angiotensinogen Gene
It was necessary to screen three genomic libraries with the mouse angiotensinogen cDNA probe to isolate the entire angiotensinogen gene. Two unamplified libraries were constructed in EMBL3, consisting of a total of 1.5 × 10^6 recombinants. This is equivalent to six mouse genomes, and yielded one clone, designated λgMA2, which contained exons 3, 4, and 5 (Fig. 2). A 2.1-kb Sr&EcoRI fragment from the 5’ end of λgMA2 was unsuitable for chromosome walking because it contained repetitive elements. Accordingly, 250,000 colonies from an amplified BALB/c mouse cosmid library were screened with the angiotensinogen cDNA probe. Of eight positive clones, the three hybridizing to our cDNA clone as well as exon 1- and exon 2-specific oligodeoxyribonucleotide probes were subjected to restriction mapping (Fig. 2). This map shows that the mouse angiotensinogen gene spans approximately 15 kb of the 70 kb of genomic DNA represented in these overlapping cosmid and λ clones.
Angiotensinogen Is Encoded by a Single-Copy Gene
Genomic Southern blots of BALB/c mouse DNA digested with HindIII and BamHI were probed with the mouse angiotensinogen cDNA, which contains exons 3, 4, and 5 (Fig. 3). The 11-kb HindIII fragment is identical to that containing exons 2-5 in the cosmid clones (Fig. 2). Similarly, the 6.3-kb BamHI fragment contains exons 2 and 3, and the 3.6-kb BamHI fragment contains exon 5 and part of exon 4 (Figs. 2 and 3). A 0.28-kb BamHI fragment containing 40 bp of exon 4 is beyond the resolving power of the 0.9% agarose gel used for electrophoresis of the genomic DNA, but has been confirmed by DNA sequencing (Fig. 4B).
Structural Organization of the Mouse Angiotensinogen Gene
The five exons, as well as 750 bp of 5’ flanking region and 404 bp of 3’ untranslated region, were sequenced (Figs. 4 and 5). The exon-intron boundaries
[Figure 2 graphic: restriction map with rows for exons (5’ to 3’), restriction sites, and clones (cMA2, cMA20, cMA18, λgMA2), drawn on a scale of 0 to 70 kilobases; exons 2 to 5 are labeled.]
FIG. 2. Restriction map of the mouse angiotensinogen genomic clones. cMA2, cMA18, and cMA20 were from the cosmid library; λgMA2 was isolated from an EMBL3 library. The arrow above the map shows the direction of transcription, and vertical bars denote the 5 exons.
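The library sizes given in the text (1.5 × 10^6 EMBL3 recombinants carrying 14-22 kb inserts) can be related to genome coverage with a short calculation. The following is a minimal sketch, not the authors' method: the haploid mouse genome size of 3 × 10^9 bp and the choice of 14 kb as a representative insert size are assumptions made here for illustration, and the function names are hypothetical. It uses the standard Clarke-Carbon relation N = ln(1 - P) / ln(1 - f), where f is the fraction of the genome carried by one insert.

```python
import math

# Assumed haploid mouse genome size in bp; this value is NOT stated in the paper.
MOUSE_GENOME_BP = 3.0e9


def genome_equivalents(n_clones, mean_insert_bp, genome_bp=MOUSE_GENOME_BP):
    """Total cloned DNA expressed as multiples of the genome size."""
    return n_clones * mean_insert_bp / genome_bp


def probability_of_recovery(n_clones, mean_insert_bp, genome_bp=MOUSE_GENOME_BP):
    """Chance that a given single-copy sequence is in the library (Clarke-Carbon)."""
    f = mean_insert_bp / genome_bp
    # 1 - (1 - f)^N, computed via log1p for numerical stability with tiny f.
    return 1.0 - math.exp(n_clones * math.log1p(-f))


def clones_needed(p, mean_insert_bp, genome_bp=MOUSE_GENOME_BP):
    """Number of independent clones needed to reach probability p."""
    f = mean_insert_bp / genome_bp
    return math.ceil(math.log(1.0 - p) / math.log1p(-f))


if __name__ == "__main__":
    # 14 kb is the lower bound of the size-selected insert range in the text.
    print(genome_equivalents(1.5e6, 14_000))
    print(probability_of_recovery(1.5e6, 14_000))
    print(clones_needed(0.99, 14_000))
```

Because the result scales directly with the assumed genome and insert sizes, it need not reproduce the figure of six genome equivalents quoted in the text.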
of the mouse angiotensinogen gene are identical to those from the rat (Tanaka et al., 1984). The position of the transcription initiation site was confirmed by primer extension. Using a 30-mer synthetic oligodeoxyribonucleotide complementary to residues 153 to 182 from the putative cap site of spliced mRNA, a major band of 182 bp was synthesized from liver RNA (Fig. 6). The TATA box is located 30 bp upstream of the cap site, but there is no CAAT box (Fig. 5). Instead, 71 bp from the transcription start site is a long inverted repeat (5’-dTCTGTACAGA-3’)
FIG. 3. Genomic Southern blot of BALB/c mouse DNA (10 μg/track) digested with either HindIII (H) or BamHI (B) and probed with the mouse angiotensinogen cDNA. Size markers are λcI857 digested with HindIII. Exposure was for 4 days with an intensifying screen.
FIG. 4. (A) Detailed restriction map of the 5’ flanking region of the mouse angiotensinogen gene. The 6.8-kb HindIII-BamHI fragment containing exon 1 was derived from the cosmid clone cMA2. The strategy for sequencing is indicated by horizontal arrows. (B) Sequencing strategy for exons 2 to 5. Sequences taken from the mouse angiotensinogen cDNA originate with vertical bars; those from genomic clones begin with either circles (M13 universal primer) or squares (specific primers).
which is also found in the rat angiotensinogen promoter (Ohkubo et al., 1986). The 5’ untranslated region contains two putative glucocorticoid-responsive elements complementary to the sequence 5’-dTGTTCT-3’ (Fig. 5; Schneidereit et al., 1983), both of which are present in the rat promoter (Ohkubo et al., 1986). In addition, elements complementary to a core sequence found in several viral enhancers (Weiher et al., 1983; 5’-dGTGGTTT/AAA-3’) are at positions 127 to 133 and positions 194 to 200 with reference to the numbering of Fig. 5. A long stretch of T residues found in both the rat albumin (Heard et al., 1987) and the rat angiotensinogen promoters (Ohkubo et al., 1986) is present in neither of the mouse counterparts (Heard et al., 1987; Fig. 5). When the rat and mouse angiotensinogen promoters are compared (Fig. 7), a striking degree of sequence homology emerges, which is 83% overall and extends to the 5’ end of the published rat sequence. This sequence conservation suggests that there may be other regulatory elements further upstream. The polyadenylation signal AATAAA at position 2727 is functional in vivo, as it is 17 bp upstream of the poly(A) tail in our mouse liver cDNA clone (Fig. 5). Because multiple polyadenylation sites contribute to the known size heterogeneity of rat angiotensinogen
[Figure 5 graphic: nucleotide sequence of the gene with the deduced amino acid sequence in three-letter code printed above it; nucleotide positions are numbered in the right margin from 120 to 2984 and amino acid positions are numbered above the sequence; intron gaps of 3.9 kb, 1.4 kb (intron C), and 0.6 kb (intron D) are marked.]
FIG. 5. Sequence of the mouse angiotensinogen gene. The nucleotide and deduced amino acid sequences are shown. The numbers on the right refer to the nucleotide residues. The splice donor and acceptor sites around each exon are boxed. The numbers above the amino acid sequence refer to the position relative to the N-terminal aspartic acid of the mature protein. The transcription initiation site is shown by a vertical arrow. The TATA box (Corden et al., 1980) is outlined. The duplicated inverted repeat sequences (5’-dCTGTACAG-3’) in the 5’ untranslated region are indicated by horizontal arrows starting at the center of the palindrome. Sequences complementary to the consensus sequence (5’-dTGTTCT-3’) of glucocorticoid responsive elements (Schneidereit et al., 1983) are overlined. Polyadenylation signals are underlined. The end of the mouse liver cDNA is shown by a triangle.
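The inverted repeat marked in this figure is palindromic in the molecular-biology sense: it reads the same 5’ to 3’ on both strands, so it equals its own reverse complement. The short sketch below checks that property for the two forms of the repeat quoted in the text and derives the sequence complementary to the glucocorticoid-responsive-element consensus. The function names are hypothetical and the code is only an illustration, not part of the authors' analysis.

```python
# Watson-Crick complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True when the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    for site in ("CTGTACAG", "TCTGTACAGA", "TGTTCT"):
        print(site, reverse_complement(site), is_palindromic(site))
```

The consensus 5’-dTGTTCT-3’ is not palindromic, which is why the text and caption specify that the elements found are complementary to it.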
mRNA (Ohkubo et al., 1986), we compared the sequences of the 3’ untranslated regions from both species. While the proximal and distal polyadenylation signals are identical in rat and mouse, the middle site in the rat sequence (see Fig. 5 of Ohkubo et al., 1986) corresponds to ACCAAA in the mouse gene, which is unlikely to be functional (Birnsteil et al., 1985). In contrast, the 3’ untranslated region from human angiotensinogen is 266 bp longer than that in the mouse, and contains two polyadenylation sites (Kageyama et al., 1984). The most distal polyadenylation signal in the rodents (ATTAAA) is the only one in common for all three species and is the predominant site of poly(A) addition in the human.
Amino Acid Sequence Homology of Angiotensinogens
The deduced amino acid sequence of mouse angiotensinogen comprises 477 amino acids, including a 24 amino acid leader sequence (Fig. 5). In agreement with previous physiological evidence from other nonprimate species (Reid et al., 1978), there is a Leu-Leu bond at the site of cleavage of angiotensin I from precursor angiotensinogen by renin. Potential glycosylation sites
[Figure 6 graphic: size-marker labels 217, 201, 190, 180, and 160.]
FIG. 6. Determination of the transcription initiation site. A 30-mer oligodeoxyribonucleotide complementary to residues 153 to 182 from the cap site of spliced RNA was annealed to 10 μg poly(A)+ liver RNA at 60°C for 1 h. The primer extension products (lane 1) were analyzed on a denaturing 8% polyacrylamide gel. Size markers (lane 2) are end-labeled HpaII fragments of pBR322. A single band of 182 bp is present in lane 1.
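The coordinate arithmetic behind this experiment can be checked in a few lines. The sketch below is illustrative only, with hypothetical function names; it assumes residues are numbered from the cap site starting at 1, so that a primer whose 5’ end pairs with mRNA residue 182 gives a full-length extension product of 182 nucleotides, the size of the band reported. It also confirms that both coordinate ranges quoted for the primer describe a 30-mer.

```python
# Primer sequence as printed in Materials and Methods (5' to 3').
PRIMER = "TGCTCTTGTTGTGGTAAAGGAGATGGAAGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def span_length(first, last):
    """Number of residues in an inclusive coordinate range."""
    return last - first + 1


def extension_product_length(primer_5prime_residue):
    """Length of a primer extended to the cap, with the cap site numbered 1.

    The primer's 5' end pairs with the given mRNA residue, and reverse
    transcriptase extends its 3' end up to residue 1.
    """
    return primer_5prime_residue


def mrna_target(primer):
    """mRNA-sense sequence to which the primer anneals."""
    return primer.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(len(PRIMER), span_length(998, 1027), span_length(153, 182))
    print(extension_product_length(182))
    print(mrna_target(PRIMER))
```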
conforming to the canonical Asn-X-Ser/Thr sequence are located at residues 14, 23, and 295.
Several polymorphisms were noted when the mouse angiotensinogen genomic sequence was compared to its cDNA counterpart. Silent changes were at positions 2203 (T to C), 2244 (C to T), and 2475 (G to A). A single amino acid change from Ser to Asn resulted from a G to A substitution at position 2288 (Fig. 5). It seems unusual that polymorphisms have been observed from an inbred mouse strain, but we note in retrospect that the BALB/c mice used for the construction of the libraries came from different breeding colonies. Furthermore, a polymorphism in the deduced amino acid sequence of human angiotensinogen has recently been reported (Kunapuli and Kumar, 1986).
The distribution of protein sequence identity for angiotensinogen from mouse, rat (Ohkubo et al., 1983), and human (Kageyama et al., 1984) was examined (Figs. 8a and 8b). The angiotensin I sequence is identical for all three species. Adjacent to the site of angiotensin I cleavage by renin is a region of sequence divergence, which is especially pronounced when human and mouse are compared (Fig. 8b), and occurs to a lesser extent for mouse and rat (Fig. 8a). This divergent region is likely to play a role in determining the substrate specificity for renin from different mammalian species. Despite the variation in sequence identity throughout the remainder of the molecule (Figs. 8a and 8b), there is appreciable similarity in the hydropathy plots of mouse and human angiotensinogen (Fig. 8c). This suggests that the three-dimensional configuration of the molecule will be conserved.
There is a discrepancy in the number of cysteine residues in the mature protein between the rat (n = 3; Ohkubo et al., 1983), human (n = 4; Kageyama et al., 1984), and mouse (n = 4). The cysteine within the signal peptide and the first two cysteines of the mature protein at residues 18 and 137 are conserved in all three species. The disparity observed between rat and mouse is because of a single base change from CGC to TGC at position 1963 of the nucleic acid sequence. The C-terminal pair of cysteines in human angiotensinogen cannot be aligned with either the rat or the mouse sequences. The pattern of disulfide bonding within angiotensinogen has not been determined for any species, but has important implications for protein folding.
Relationship of Angiotensinogen to the Serine Protease Inhibitors
Because of the known homology between angiotensinogen and the serine protease inhibitors (Doolittle, 1983), we examined the putative protease inhibitor site in mouse, rat, and human angiotensinogens. Computerized alignment of new members of the serine protease inhibitor family is unable to predict exactly the position of the P1 and P1′ residues at the reactive center because of the variable length of this crucial part of inhibitors with different substrate specificity. This limitation is evidenced by the different positioning of gaps in this region by various authors (compare Doolittle, 1983; Bock et al., 1986; Ragg, 1986). However, a serine residue is most often at the P1′ position (Carrell et al., 1982; Hill et al., 1984) and this is the case for the alignment of rat angiotensinogen to the serine protease inhibitors proposed by Tanaka et al. (1984). When this alignment is extended
[Figure 7 graphic: dot-matrix plot; horizontal axis labeled MOUSE PROMOTER.]
FIG. 7. Comparison of the 5’ untranslated regions of the mouse (horizontal axis) and rat (vertical axis) angiotensinogen genes. The graphic display was derived from a modification (by Dr. A. Kyne, Walter and Eliza Hall Institute of Medical Research, Melbourne) of the DIAGON program (Staden, 1982). The parameters used were a span length of 15 and a proportional matching score of 11. Positions of the cap site, TATA box, putative glucocorticoid responsive elements (GRE), and the inverted repeat 5’-dCTGTACAG-3’ (I.R.) are shown. A stretch of T residues within the rat promoter is indicated (T).
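A dot-matrix comparison of this kind can be sketched in a few lines. The version below is a simplification and not the modified DIAGON program used for the figure: it scores each pair of windows by a plain count of identical bases, whereas DIAGON's proportional algorithm is score based. The span of 15 and threshold of 11 are taken from the caption; the function name and the toy inputs are hypothetical.

```python
def dot_matrix(seq_x, seq_y, span=15, min_matches=11):
    """Return (i, j) window start positions (0-based) that pass the threshold.

    A dot is recorded when at least `min_matches` of the `span` aligned
    positions are identical between the window at i in seq_x and the
    window at j in seq_y.
    """
    seq_x, seq_y = seq_x.upper(), seq_y.upper()
    points = []
    for i in range(len(seq_x) - span + 1):
        window_x = seq_x[i:i + span]
        for j in range(len(seq_y) - span + 1):
            window_y = seq_y[j:j + span]
            matches = sum(a == b for a, b in zip(window_x, window_y))
            if matches >= min_matches:
                points.append((i, j))
    return points


if __name__ == "__main__":
    # Hypothetical toy sequence compared with itself: the main diagonal appears.
    toy = "ACGTTGCAAGCTTACGGATC"
    print(dot_matrix(toy, toy))
```

Conserved regions show up as runs of points along a diagonal, which is how the extended similarity between the two promoters appears in the figure.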
[Figure 8 graphic: three panels (a, b, c), each plotted against amino acid number; panel a compares mouse and rat, panels b and c compare mouse and human.]
FIG. 8. (a) Comparison of the percentage protein sequence identity of mouse and rat angiotensinogen, relative to a span length of 9 amino acids. The graphic display is from the SEQMATCH program written for the Melbourne Data Base System by Dr. A. Kyne (Walter and Eliza Hall Institute of Medical Research, Melbourne). The position of angiotensin I (AI) is shown by a solid bar, and the residue numbers are identical to those for the mouse protein sequence shown in Fig. 5. Note that this algorithm does not show the 4 residues at either end of the molecule. (b) The percentage protein sequence identity of mouse and human angiotensinogen was plotted using the same parameters described in (a). This comparison takes into account gaps in the alignment at position 92 in the mouse sequence and positions 218 and 439 in the human sequence. The second potential initiation methionine in the human sequence was used to allow comparison of signal peptides of identical length. (c) Hydropathy plot (Kyte and Doolittle, 1982) of mouse (solid line) and human (dashed line) angiotensinogens using a span length of 9 residues. Note that the scale of the y axis is slightly different for the two species. The position of angiotensin I is again shown by a solid bar.
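The hydropathy profile in panel (c) is a sliding-window average, which also explains the note that 4 residues at either end are not shown: with a span of 9, the first value is centered on residue 5. The sketch below uses the published Kyte-Doolittle scale values; the function name is hypothetical, and the angiotensin I decapeptide (Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu) is used only as a convenient short input, not as a reproduction of the figure.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

ANGIOTENSIN_I = "DRVYIHPFHL"


def hydropathy_profile(sequence, span=9):
    """Return (center_residue, mean_hydropathy) pairs, residues numbered from 1.

    Each value is the mean index over `span` residues and is assigned to the
    central residue, so span // 2 residues at each end receive no value.
    """
    if span % 2 == 0:
        raise ValueError("span must be odd so that each window has a center")
    seq = sequence.upper()
    half = span // 2
    profile = []
    for start in range(len(seq) - span + 1):
        window = seq[start:start + span]
        mean = sum(KYTE_DOOLITTLE[res] for res in window) / span
        profile.append((start + half + 1, mean))
    return profile


if __name__ == "__main__":
    for center, value in hydropathy_profile(ANGIOTENSIN_I):
        print(center, round(value, 3))
```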
to angiotensinogens from other species (Fig. 9), both mouse and rat have glycine and serine at the P1 and P1′ positions. Human angiotensinogen is different, with asparagine and lysine at the P1 and P1′ positions. This divergence within the reactive center is further highlighted by examining the 25 residues surrounding the active site between the highly conserved glutamic acid at position 404 and the invariable phenylalanine at position 428. There are only 3 mismatches between rat and mouse, but 11 mismatches when the mouse sequence is compared to the human (Fig. 9).
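The mismatch counts above translate directly into percentage identities over the 25-residue region. The sketch below shows the arithmetic together with a generic mismatch counter for two pre-aligned sequences of equal length. The function names are hypothetical, and the short peptide strings in the usage lines are invented placeholders, not the sequences of Fig. 9.

```python
def region_length(first, last):
    """Number of residues in an inclusive range of positions."""
    return last - first + 1


def count_mismatches(aligned_a, aligned_b):
    """Count differing positions between two aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(aligned_a, aligned_b))


def percent_identity(length, mismatches):
    """Percentage of positions that are identical."""
    return 100.0 * (length - mismatches) / length


if __name__ == "__main__":
    n = region_length(404, 428)
    print(n, percent_identity(n, 3), percent_identity(n, 11))
    # Placeholder strings for illustration only.
    print(count_mismatches("GSAKLV", "GSTKLV"))
```

On these figures the rat and mouse reactive-center regions are 88% identical, against 56% for mouse and human.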
Divergence within the reactive center region between humans and mice has been described for other serine protease inhibitors (Hill et al., 1984; Hill and Hastie, 1987). In the case of α1-antitrypsin, this difference does not change the inhibitory specificity for elastase. Conversely, mouse contrapsin differs in specificity from its human counterpart, α1-antichymotrypsin (for commentary, see Leigh-Brown, 1987). The role of such sequence diversity within the inhibitor site of the angiotensinogens merits further physiological study.
FIG. 9. Alignment of the putative serine protease inhibitor reactive centers from mouse, rat, and human angiotensinogens. Human α1-antitrypsin (Carrell et al., 1982) and antithrombin III (Chandra et al., 1983) are included for comparison. With reference to the numbering from mouse angiotensinogen (Fig. 5), the serine at position 416 has been designated as the P1′ residue, and the proposed site of cleavage is shown by a vertical arrow. Conserved residues between the angiotensinogens and the two reference serine protease inhibitors are boxed.
Finally, because several serine protease inhibitors in the mouse are genetically linked (Hill et al., 1985), we probed Southern blots of our cosmid clones at low stringency with the mouse angiotensinogen cDNA probe. There are no additional hybridizing bands to suggest the presence of related sequences. The Southern blots also showed no hybridization to synthetic oligodeoxyribonucleotides based on published sequences for mouse α1-antitrypsin (Krauter et al., 1986; Hill et al., 1984), mouse contrapsin (Hill et al., 1984), or human leuserpin (Ragg, 1986). This important question will be answered by chromosomal assignment of the angiotensinogen gene.
ACKNOWLEDGMENTS
We thank Peter Aldred for advice on the construction of the cDNA library, Tiina Oldfield and Lucy Duncan for synthesis of oligodeoxyribonucleotides, Helen Glenn for preparation of the artwork, and Michael Graham for providing the mouse cosmid library. This work was supported by the National Health and Medical Research Council of Australia, the Ian Potter Foundation, and the Myer Family Trusts. W.M.C. is the recipient of a National Health and Medical Research Council of Australia Medical Postgraduate Scholarship.
REFERENCES
1. BEAUCAGE, S. L., AND CARUTHERS, M. H. (1981). Deoxynucleoside phosphoramidites: A new class of key intermediates of deoxypolynucleotide synthesis. Tetrahedron Lett. 22: 1859-1862.
2. BIRNSTEIL, M. L., BUSSLINGER, M., AND STRUB, K. (1985). Transcription termination and 3’ processing: The end is in site! Cell 41: 349-359.
3. BOCK, S. C., SKRIVER, K., NIELSEN, E., THOGERSEN, H.-C., WIMAN, B., DONALDSON, V. H., EDDY, R. L., MARRINAN, J., RADZIEJEWSKA, E., HUBER, R., SHOWS, T. B., AND MAGNUSSON, S. (1986). Human C1 inhibitor: Primary structure, cDNA cloning and chromosomal localization. Biochemistry 25: 4292-4301.
4. BODNER, M., AND KARIN, M. (1987). A pituitary-specific trans-acting factor can stimulate transcription from the growth hormone promoter in extracts of non-expressing cells. Cell 50: 267-275.
5. CAMPBELL, D. J., BOUHNIK, J., COEZY, E., PINET, F., CLAUSER, E., MENARD, J., AND CORVOL, P. (1984). Characterization of precursor and secreted forms of rat angiotensinogen. Endocrinology 114: 776-785.
6. CAMPBELL, D. J., AND HAEBENER, J. (1986). Angiotensinogen gene is expressed and differentially regulated in multiple tissues in the rat. J. Clin. Invest. 78: 31-39.
7. CARRELL, R. W., JEPPSSON, J. O., LAURELL, C. B., BRENNAN, S. O., OWEN, M. C., VAUGHAN, L., AND BOSWELL, D. R. (1982). Structure and variation of human α1-antitrypsin. Nature (London) 298: 329-334.
8. CHANDRA, T., STACKHOUSE, R., KIDD, V. J., AND WOO, S. L. C. (1983). Isolation and sequence characterization of a cDNA clone of human antithrombin III. Proc. Natl. Acad. Sci. USA 80: 1845-1848.
9. CLOUSTON, W. M., EVANS, B. A., AND RICHARDS, R. I. (1987). A strategy for assigning end fragments of genomic DNA cloned into the cosmid vector pJB8. Nucleic Acids Res. 15: 10057.
10. CORDEN, J., WASYLYK, B., BUCHWALDER, A., SASSONE-CORSI, P., KEDINGER, C., AND CHAMBON, P. (1980). Promoter sequences of eukaryotic protein-coding genes. Science 209: 1406-1414.
11. DOOLITTLE, R. (1983). Angiotensinogen is related to the antitrypsin-antithrombin-ovalbumin family. Science 222: 417-419.
12. DZAU, V. J., ELLISON, K. E., BRODY, T., INGELFINGER, J., AND PRATT, R. E. (1987). A comparative study of the distributions of renin and angiotensinogen messenger ribonucleic acids in rat and mouse tissues. Endocrinology 120: 2334-2338.
13. HEARD, J.-M., HERBOMEL, P., OTT, M.-O., MOTTURA-ROLLIER, A., WEISS, M., AND YANIV, M. (1987). Determinants of rat albumin promoter tissue specificity analysed by an improved transient assay system. Mol. Cell. Biol. 7: 2425-2434.
14. HILL, R. E., SHAW, P. H., BOYD, P. A., BAUMANN, H., AND HASTIE, N. D. (1984). Plasma protease inhibitors in mouse and man: Divergence within the reactive center regions. Nature (London) 311: 175-177.
15. HILL, R. E., SHAW, P. H., BARTH, R. K., AND HASTIE, N. D. (1985). A genetic locus closely linked to a protease inhibitor gene complex controls the level of multiple RNA transcripts. Mol. Cell. Biol. 5: 2114-2122.
16. HILL, R. E., AND HASTIE, N. D. (1987). Accelerated evolution in the reactive center regions of serine protease inhibitors. Nature (London) 326: 96-99.
17. INGELFINGER, J. R., PRATT, R. E., ELLISON, K. E., AND DZAU, V. (1986). Angiotensinogen gene expression in the rat kidney: Evidence for sodium regulation of an intrarenal renin-angiotensin system. Clin. Res. 34: 598A.
18. KAGEYAMA, R., OHKUBO, H., AND NAKANISHI, S. (1984). Primary structure of human preangiotensinogen deduced from the cloned DNA sequence. Biochemistry 23: 3603-3609.
19. KAGEYAMA, R., OHKUBO, H., AND NAKANISHI, S. (1985). Induction of rat liver angiotensinogen mRNA following acute inflammation. Biochem. Biophys. Res. Commun. 129: 826-832.
20. KAISER, K., AND MURRAY, N. E. (1985). The use of phage lambda replacement vectors in the construction of representative genomic DNA libraries. In “DNA Cloning: A Practical Approach” (D. M. Glover, Ed.), Vol. 1, pp. 1-47, IRL Press, Oxford.
21. KALINYAK, J. E., AND PERLMAN, A. J. (1987). Tissue-specific regulation of angiotensinogen mRNA accumulation by dexamethasone. J. Biol. Chem. 262: 460-464.
22. KRAUTER, K. S., CITRON, B. A., HSU, M.-T., POWELL, D., AND DARNELL, J. E. (1986). Isolation and characterization of the α1-antitrypsin gene of mice. DNA 5: 29-36.
23. KUNAPULI, S. P., AND KUMAR, A. (1986). Difference in the nucleotide sequence of human angiotensinogen cDNA. Nucleic Acids Res. 14: 7509.
24. KYTE, J., AND DOOLITTLE, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105-132.
25. LEIGH-BROWN, A. (1987). Positively Darwinian molecules? Nature (London) 326: 12-13.
26. MANIATIS, T., FRITSCH, E. F., AND SAMBROOK, J. (1982). “Molecular Cloning: A Laboratory Manual,” Cold Spring Harbor Laboratory Publications, New York.
27. MIZUSAWA, S., NISHIMURA, S., AND SEELA, F. (1986). Improvement of the dideoxy chain termination method of DNA sequencing by the use of deoxy-7-deazaguanosine in place of dGTP. Nucleic Acids Res. 14: 1319-1324.
28. OHKUBO, H., KAGEYAMA, R., UJIHARA, M., HIROSE, T., INAYAMA, S., AND NAKANISHI, S. (1983). Cloning and sequence analysis of cDNA for rat angiotensinogen. Proc. Natl. Acad. Sci. USA 80: 2196-2200.
29. OHKUBO, H., NAKAYAMA, K., TANAKA, T., AND NAKANISHI, S. (1986). Tissue distribution of rat angiotensinogen mRNA and structural analysis of its heterogeneity. J. Biol. Chem. 261: 319-323.
30. POULSEN, K., AND JACOBSEN, J. (1986). Is angiotensinogen a renin inhibitor ...sion 4: 65-69.
and not the substrate for renin? J. Hyperten-
31. RACKWITZ, H-R., ZEHETNER, G., FRISCHAUF, A.-M., AND LEHRACH, H. (1984). Rapid restriction mapping of DNA cloned in lambda phage vectors. Gene 30: 195-200.
32. RAGG, H. (1986). A new member of the plasma protease inhibitor gene family. Nucleic Acids Res. 14: 1073-1088.
33. REID, I. A., MORRIS, B. J., AND GANONG, W. F. (1978). The renin-angiotensin system. Annu. Rev. Physiol. 40: 377-410.
34. SANGER, F., NICKLEN, S., AND COULSON, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.
35. SCHNEIDEREIT, C., GEISSE, S., WESTPHAL, H. M., AND BEATO, M. (1983). The glucocorticoid receptor binds to defined nucleotide sequences near the promoter of mouse mammary tumour virus. Nature (London) 304: 749-752.
36. SERNIA, C., AND REID, I. A. (1980). Stimulation of angiotensinogen production: A dose-related effect of angiotensin II in the conscious dog. Amer. J. Physiol. 239: E442-E446.
37. SHINE, J., MASON, A. J., EVANS, B. A., AND RICHARDS, R. I. (1983). The kallikrein multigene family: Specific processing of biologically active peptides. Cold Spring Harbor Symp. Quant. Biol. 48: 419-426.
38. STADEN, R. (1982). An interactive graphics program for comparing and aligning nucleic acid and amino acid sequences. Nucleic Acids Res. 10: 2951-2961.
39. TANAKA, T., OHKUBO, H., AND NAKANISHI, S. (1984). Common structural organization of the angiotensinogen and the α1-antitrypsin genes. J. Biol. Chem. 259: 8063-8065.
40. ULLRICH, A., BERMAN, C. H., DULL, T. J., GRAY, A., AND LEE, J. M. (1984). Isolation of the human insulin-like growth factor I gene using a single synthetic DNA probe. EMBO J. 3: 361-364.
41. WATSON, C. J., AND JACKSON, J. F. (1985). An alternative protocol for the synthesis of double-stranded cDNA for cloning in phage and plasmid vectors. In “DNA Cloning: A Practical Approach” (D. M. Glover, Ed.), Vol. 1, pp. 79-88, IRL Press, Oxford.
42. WEIHER, H., KONIG, M., AND GRUSS, P. (1983). Multiple point-mutations affecting the simian virus 40 enhancer. Science 219: 626-631.
43. WYMAN, A. R., WOLFE, L. B., AND BOTSTEIN, D. (1985). Propagation of some human DNA sequences in bacteriophage requires mutant Escherichia coli hosts. Proc. Natl. Acad. Sci. USA 82: 2880-2884.
44. YANISCH-PERRON, C., VIEIRA, J., AND MESSING, J. (1985). Improved M13 phage cloning vectors and host strains: Nucleotide sequence of the M13mp18 and pUC19 vectors. Gene 33: 103-119.