cDNA sequence for bovine biglycan (PGI) protein core


Biochimica et Biophysica Acta, 1173(1993) 81-84


© 1993 Elsevier Science Publishers B.V. All rights reserved 0167-4781/93/$06.00

BBAEXP 90493

Short Sequence-Paper

Maureen A. Torok, Suvia A.S. Evans and James A. Marcum

Department of Pathology, Beth Israel Hospital, Harvard Medical School, Boston, MA (USA)

(Received 23 December 1992)

Key words: Biglycan; cDNA; (Aorta); (Smooth muscle cell); (Bovine)

The nucleotide sequence of the protein core for bovine aortic smooth muscle cell biglycan was determined using recombinant DNA technology. Analysis of the deduced amino acid sequence for bovine biglycan revealed a striking homology, 94.6% and 95.7%, to human and rat biglycan, respectively. The bovine biglycan protein core has four potential O-linked and two potential N-linked glycosylation sites and is composed of 11 leucine-rich repeat units.

Correspondence to: J.A. Marcum, Department of Pathology, Beth Israel Hospital, 330 Brookline Ave., Boston, MA 02215, USA. The sequence data presented in this paper have been submitted to the EMBL/GenBank Data Libraries under the accession number L07953.

Sulfated proteoglycans are acidic macromolecules composed of glycosaminoglycans covalently attached, via an O-linkage, to protein cores [1]. The carbohydrate chains consist, in part, of chondroitin sulfate, dermatan sulfate, and heparan sulfate, and the core proteins range in molecular mass from about 10 000 to over 200 000 Da [2]. Neame et al. [3] determined the primary amino acid sequence of a secreted form of a small molecular weight proteochondroitin (biglycan, PG I) isolated from bovine articular cartilage. Fisher et al. [4] cloned the biglycan protein core from a human bone-derived cell cDNA library. The core proteins of bovine and human biglycan contain 331 and 368 amino acid residues, respectively [3,4]. Asundi et al. [5] demonstrated, by Northern analysis with a nonhomologous species cDNA probe, that rat aortic smooth muscle cells maintained in vitro and rat aortic medial tissue obtained in vivo synthesize a 2.9 kb transcript for the core protein of biglycan. Dreher et al. [6] isolated and sequenced clones from a rat aortic smooth muscle cell cDNA library that encode the protein core of biglycan. In the present report, the cDNA sequence for the core protein of bovine aortic smooth muscle cell biglycan was determined utilizing recombinant DNA technology.

Smooth muscle cells were isolated from bovine aorta by explant outgrowth [7] and grown in a humidified atmosphere of 5% CO2, 95% air on 100 mm plastic culture dishes containing Dulbecco's modified minimal essential media, 10% fetal bovine serum (Intergen), penicillin (100 units/ml), and streptomycin sulfate (100 µg/ml). Cells were fed every 2-3 days and passaged, after confluence, at a 1:5 split ratio using a 0.05% trypsin-0.2% EDTA solution. Cell type was confirmed by light and electron microscopy [8] and by muscle-actin immunofluorescence [9].

Total RNA was isolated from bovine aortic smooth muscle cells (passage 4) using guanidinium thiocyanate-phenol-chloroform single-step extraction (Stratagene), and poly(A)+ RNA was purified using oligo(dT)-cellulose spun columns (Pharmacia). Synthesis of cDNA was initiated by the addition of reverse transcriptase and oligo(dT)12-18 (Pharmacia) to the bovine poly(A)+ RNA. A 5' extension cDNA library was also constructed using a primer (5'-TTGGGGATGCCAGTGAGCT-3') consisting of nucleotides no. 660 to no. 678 (Fig. 1). Second strand synthesis of cDNA was performed with Escherichia coli DNA polymerase and RNase H (Pharmacia). Double-stranded cDNA was blunted with T4 DNA polymerase, coupled to EcoRI linkers containing a non-phosphorylated EcoRI overhang (Pharmacia), ligated into EcoRI-digested, alkaline phosphatase-treated λgt11 expression vector (Stratagene), and packaged in vitro.

E. coli strain Y1088 was infected with the bacteriophage and plated onto 150 mm Petri dishes. Plaque DNA was transferred, in duplicate, to NEN-Dupont Colony Plaque screen membranes, and filters were prehybridized at 68°C for 10 min in Quikhyb (Stratagene), employing an Autoblot hybridization oven (Bellco). A 2.1 kb contiguous cDNA fragment from the 3' end of rat biglycan [6] was labeled with [32P]ATP (NEN-Dupont) by random priming (Boehringer-Mannheim). Hybridization was conducted at 68°C for 60 min by adding the denatured double-stranded probe (1.25 × 10^6 cpm/ml) to the Quikhyb solution containing the prehybridized filters. After hybridization, the filters were washed twice with 2 × SSC and 0.1% SDS for 15 min at 22°C and then once with 0.1 × SSC and 0.1% SDS for 30 min at 60°C. Filters were exposed overnight to X-OMAT film (Kodak) by autoradiography. Positive plaques were purified by additional rounds of screening.

Phage DNA was isolated from plaques using LambdaSorb Phage Adsorbent (Promega). Inserts were removed with EcoRI restriction enzyme and subcloned into pBluescript II KS(-) (Stratagene) or pGEM3Zf(-) (Promega). The recombinant plasmids were used to transform E. coli DH5α cells (Gibco-BRL). Positive transformants were grown in Circle Grow (Bio-101) containing ampicillin (50 µg/ml), and plasmid DNA was isolated using Plasmid Quik mini-columns (Stratagene). Nested deletions of intact plasmid clones were generated using Erase-a-Base System (Promega). The DNA sequence of three overlapping clones was determined by the Sanger dideoxy chain-termination reaction using Sequenase Version 2.0 (USB) and [35S]ATP (NEN-Dupont). Reaction products were electrophoresed on 6% polyacrylamide-8 M urea buffer gradient gels. After drying, the gel was exposed to film by autoradiography.

The sequence for the 2043 nucleotides of bovine aortic smooth muscle biglycan core protein is shown in Fig. 1. Computer-assisted analysis of the nucleotide sequence revealed an open reading frame of 369 amino acids, corresponding to a molecular mass of 41 589 Da for the complete protein core of bovine biglycan. The size of bovine biglycan protein core is similar to those reported for human [4] and rat [6], and comparison of the deduced amino acid sequences of human [10] and rat [6] biglycan with bovine biglycan revealed a striking homology of 94.6% and 95.7%, respectively.

The core protein of bovine biglycan contains four possible O-linked and two possible N-linked glycosylation sites (Fig. 1). Based on amino acid sequence analysis data, Fisher et al. [11] and Neame et al. [3] have proposed that the two O-linked glycosylation sites near the amino-terminus are substituted with glycosaminoglycans (Fig. 1). In addition, bovine biglycan contains eleven leucine-rich repeat units with a consensus sequence of LXXLXLLXNXHXXHPXXXXHX (X denotes any amino acid, H represents hydrophobic amino acids, and lower case letters denote the predominant amino acid). Although there are slight variations, the composition and number of leucine-rich repeat units for bovine biglycan are similar to those described for human [4] and rat [6].

[Fig. 1 sequence listing not reproduced: the 2043-nucleotide cDNA sequence, 5' to 3', with the deduced 369-residue amino acid sequence beneath the open reading frame. The sequence is deposited in the EMBL/GenBank Data Libraries under accession number L07953.]

Fig. 1. cDNA sequence and deduced amino acid sequence for the protein core of bovine biglycan. Lower case letters correspond to untranslated nucleotide sequence, while upper case letters represent an open reading frame. Solid circles and triangles indicate potential O-linked and N-linked oligosaccharide attachment sites, respectively. Potential polyadenylation signal is underlined.

The amino acid sequence deduced from the nucleotide sequence of bovine aortic smooth muscle cell biglycan is almost identical to the primary amino acid sequence published for the secreted form of bovine articular cartilage biglycan by Neame et al. [3]. Only one difference between the two amino acid sequences was detected: residue no. 151 was Cys (Fig. 1) instead of Glu [3]. This is consistent with the deduced amino acid sequences for the rat [6,10] and human [4] biglycan core proteins, which also contain Cys at this position.

In addition, the deduced amino acid sequence for the bovine biglycan protein core shown in Fig. 1 contains the amino acid sequence prior to the amino-terminus for the secreted form of bovine biglycan [3]. Fisher et al. [4] proposed that the 37 amino acids prior to the amino-terminus for the secreted form of human biglycan are composed of a prepeptide (residues No. 1 to No. 19) and a propeptide (residues No. 20 to No. 37), based upon distribution of charged amino acids. Analysis of biglycan secreted by bovine and rat aortic smooth muscle cells maintained under in vitro conditions reveals an additional species of biglycan that contains Leu (residue No. 17) as the amino-terminus [12], supporting the proposal that biglycan contains a prepeptide and a propeptide [4]. The above data suggest, however, that the prepeptide and the propeptide for rat and bovine biglycan are residues No. 1 to No. 16 and No. 17 to No. 37, respectively. Based upon these data, the molecular masses of the forms of bovine biglycan secreted by aortic smooth muscle cells with Leu and Asp as amino-termini are 39 825 and 37 326 Da, respectively.
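The residue count, the numbers of potential glycosylation sites and the molecular masses quoted above can be cross-checked directly from the deduced sequence. The Python sketch below is illustrative only and is not the computer-assisted analysis used in this study. It takes the bovine sequence as transcribed from Fig. 2, and it rests on three assumptions that are not stated in the text: potential N-linked sites are taken to be Asn-X-Ser/Thr sequons (X not Pro), potential O-linked sites are taken to be Ser-Gly dipeptides, and masses are estimated from standard average residue masses for an unmodified polypeptide. The names `motif_positions` and `average_mass` are hypothetical helpers.

```python
import re

# Deduced bovine biglycan precursor (369 residues), transcribed from Fig. 2.
BOVINE_BIGLYCAN = (
    "MWPLWPLAALLALSQALPFEQKAFWDFTLDDGLPMLNDEEASGAETTSGIPDLDSLPPTYSAMCPFGC"
    "HCHLRVVQCSDLGLKAVPKEISPDTTLLDLQNNDISELRKDDFKGLQHLYALVLVNNKISKIHEKAFS"
    "PLRKLQKLYISKNHLVEIPPNLPSSLVELRIHDNRIRKVPKGVFSGLKNMNCIEMGGNPLENSGFEPG"
    "AFDGLKLNYLRISEAKLTGIPKDLPETLNELHLDHNKIQAIELEDLLRYSKLYRLGLGHNQIRMIENG"
    "SLSFLPTLRELHLDNNKLSRVPAGLPDLKLLQVVYLHTNNITKVGVNDFCPVGFGVKRAYYNGISLFN"
    "NPVPYWEVQPATFRCVTDRLAIQFGNYKK"
)

# Standard average residue masses in Da (assumption: unmodified polypeptide).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per chain for the free amino- and carboxyl-termini


def motif_positions(seq, pattern):
    """Return 1-based start positions of all matches, including overlapping ones."""
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]


def average_mass(seq):
    """Estimate the average molecular mass (Da) of a linear polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


# Assumed motif definitions (not taken from the paper).
n_linked = motif_positions(BOVINE_BIGLYCAN, "N[^P][ST]")  # Asn-X-Ser/Thr, X not Pro
o_linked = motif_positions(BOVINE_BIGLYCAN, "SG")         # Ser-Gly dipeptides

# Precursor and the two secreted forms discussed in the text.
forms = {
    "precursor (Met 1)": BOVINE_BIGLYCAN,
    "Leu 17 amino-terminus": BOVINE_BIGLYCAN[16:],
    "Asp 38 amino-terminus": BOVINE_BIGLYCAN[37:],
}
masses = {name: average_mass(seq) for name, seq in forms.items()}

print("residues:", len(BOVINE_BIGLYCAN))
print("Asn-X-Ser/Thr sequons start at:", n_linked)
print("Ser-Gly dipeptides start at:", o_linked)
for name, mass in masses.items():
    print(f"{name}: {len(forms[name])} residues, about {mass:.0f} Da")
```

Under these assumptions the scan finds two sequons and four Ser-Gly dipeptides, matching the numbers of potential N-linked and O-linked sites reported above. The estimated masses depend on the residue mass table used, so they should be read as approximate checks on the quoted values rather than as a reproduction of them.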

Dreher et al. [6] identified a hypervariable region (residues No. 44 to No. 60) of the core proteins for rat, bovine and human biglycan that contains an unusually high degree of heterogeneity when compared to the remaining amino acid sequence. Examination of the prepeptide amino acid sequence for bovine, rat, and human biglycan also reveals an additional, but shorter, region with a high degree of heterogeneity when compared to the remaining amino acid sequence (Fig. 2). For amino acid residues No. 2 to No. 9 of the biglycan protein core, there are 3 substitutions for the human species (62.5% homology) and 4 substitutions for the rat species (50% homology), when compared to the bovine species. Two of the amino acid changes are conservative, i.e., variation only in the aliphatic amino acid residue, while the other substitutions are changes in the type of amino acid residue (Fig. 2).

MWPLWPLAALLALSQALPFEQKAFWDFTLDDGLPMLNDEEASGAETTSGIPDLDSLPPTYSAMCPFGC 68
HCHLRVVQCSDLGLKAVPKEISPDTTLLDLQNNDISELRKDDFKGLQHLYALVLVNNKISKIHEKAFS 136
PLRKLQKLYISKNHLVEIPPNLPSSLVELRIHDNRIRKVPKGVFSGLKNMNCIEMGGNPLENSGFEPG 204
AFDGLKLNYLRISEAKLTGIPKDLPETLNELHLDHNKIQAIELEDLLRYSKLYRLGLGHNQIRMIENG 272
SLSFLPTLRELHLDNNKLSRVPAGLPDLKLLQVVYLHTNNITKVGVNDFCPVGFGVKRAYYNGISLFN 340
NPVPYWEVQPATFRCVTDRLAIQFGNYKK 369

[Only the bovine (b) sequence is shown. The human (h) and rat (r) rows of the original figure, which mark the residues that differ from the bovine sequence, are not reproduced here.]

Fig. 2. Comparison of bovine (b), human (h) and rat (r) biglycan deduced amino acid sequences. Human and rat sequences are taken from [10] and [6], respectively. Asterisk denotes missing amino acid in human sequence.

Comparison of the nucleotide sequence for the bovine biglycan core protein with those of human and rat revealed a striking homology of 81.6% and 83.0%, respectively. The greatest level of conservation was within the coding region, although selected regions of the 3' and 5' untranslated sequence did exhibit high homology. For example, the nucleotide sequence surrounding the potential polyadenylation signal (Fig. 1) was highly homologous among the three species. Dreher et al. [6] reported the presence of (CT)22 and (AC)38 dinucleotide repeats within the 3' untranslated region of the nucleotide sequence for rat biglycan. No such dinucleotide repeats were observed in the nucleotide sequence for bovine biglycan or reported for the human biglycan nucleotide sequence [4].

We thank Dr. Kevin Dreher for the generous gift of rat biglycan cDNA, Dr. Tet-Kin Yeo for critical reading of the manuscript, and Dr. Thomas Graf from the Molecular Biology Computer Research Resource, Dana-Farber Cancer Institute, Boston, MA for assistance with computer analysis. This work was supported by an American Heart Association Grant-in-Aid (No. 900735) and by the Beth Israel Hospital Pathology Foundation.

References

1 Hassell, J.R., Kimura, J.H. and Hascall, V.C. (1986) Annu. Rev. Biochem. 55, 539-567.
2 Ruoslahti, E. (1988) Annu. Rev. Cell Biol. 4, 229-255.
3 Neame, P.J., Choi, H.U. and Rosenberg, L.C. (1989) J. Biol. Chem. 264, 8653-8661.
4 Fisher, L.W., Termine, J.D. and Young, M.F. (1989) J. Biol. Chem. 264, 4571-4576.
5 Asundi, V., Cowan, K., Matzura, D., Wagner, W. and Dreher, K.L. (1990) Eur. J. Cell Biol. 52, 98-104.
6 Dreher, K.L., Asundi, V., Matzura, D. and Cowan, K. (1990) Eur. J. Cell Biol. 53, 296-304.
7 Fritze, L.M., Reilly, C.F. and Rosenberg, R.D. (1985) J. Cell Biol. 100, 1041-1049.
8 Chamley-Campbell, J., Campbell, G.R. and Ross, R. (1979) Physiol. Rev. 59, 1-61.
9 Libby, P., Warner, S.J.C., Salomon, R.N. and Birinyi, L.K. (1988) N. Engl. J. Med. 318, 1493-1498.
10 Fisher, L.W., Heegaard, A.-M., Vetter, U., Vogel, W., Just, W., Termine, J.D. and Young, M.F. (1991) J. Biol. Chem. 266, 14371-14377.
11 Fisher, L.W., Hawkins, G.R., Tuross, N. and Termine, J.D. (1987) J. Biol. Chem. 262, 9702-9708.
12 Marcum, J.A. and Thompson, M.A. (1991) Biochem. Biophys. Res. Commun. 175, 706-712.