Baboon apolipoprotein C-I: cDNA and gene structure and evolution

GENOMICS 13, 368-374 (1992)

MARTINE PASTORCIC, SHIFRA BIRNBAUM, AND JAMES E. HIXSON

Department of Genetics, Southwest Foundation for Biomedical Research, San Antonio, Texas 78228

Received August 2, 1991; revised February 14, 1992

We have isolated and characterized cDNA and genomic clones for apolipoprotein C-I (apoC-I) encoded by the APOC1 locus in baboons. Baboon apoC-I cDNA is only 410 bp in size, but the gene spans 4.5 kb including four small exons and three introns containing a large number of Alu repeats. The coding sequences of apoC-I cDNA and genomic clones are identical, indicating that this genomic clone contains the functional gene for apoC-I rather than a pseudogene like human APOC1'. We also detected a second gene in Southern blots of baboon genomic DNA that may correspond to the human APOC1' pseudogene. Two start sites for baboon APOC1 transcription were mapped to nucleotides that are 7 and 9 bp downstream from the predominant start site for human APOC1 transcription. Alignment of Alu repeats showed that the 5' region of the baboon APOC1 gene is more similar to that of the human pseudogene APOC1', and the 3' region and coding sequences are more similar to those of human APOC1. These regions are separated by an Alu repeat that is present only in the baboon gene, perhaps reflecting its role in gene rearrangement or conversion. Sequence comparisons from baboon, human, dog, and rat showed extensive differences in apoC-I amino acid sequences, which are less conserved than nucleotide sequences. However, comparisons of hydrophilicity profiles show significant conservation of protein domains that may be important for apoC-I function. © 1992 Academic Press, Inc.

INTRODUCTION

Apolipoproteins play an essential role in lipid transport and redistribution among different tissues. They are structural components of lipoprotein particles and also contribute to lipoprotein metabolism by interacting specifically with cellular receptors and modulating the activity of enzymes such as lecithin:cholesterol acyltransferase (LCAT) and lipoprotein lipase. Apolipoprotein C-I (apoC-I) is a 57-amino-acid peptide that is produced in the liver (Knott et al., 1984) and is associated with HDL and triglyceride-rich lipoprotein particles (chylomicrons and VLDL) (Schaefer et al., 1978). Like other apolipoproteins, apoC-I contains an amphipathic α-helix for lipid binding in lipoprotein particles (Kaiser and Kezdy, 1983). Although its function in lipid metabolism remains unclear, apoC-I is known to activate LCAT in vitro (Soutar et al., 1975).

The apoC-I gene (APOC1) has been isolated and sequenced from a human genomic library (Lauer et al., 1988). Similar to other apolipoprotein genes, APOC1 consists of four exons separated by three introns. The exons encode several functional domains, including an untranslated leader sequence, a signal peptide, a region of intragenic repeats that encode amphipathic helices for lipid binding, and a 3' untranslated region that contains a polyadenylation signal. Although the coding region is small (249 bp), human APOC1 is 4.6 kb in length due to large introns that contain a total of nine Alu repeats. In humans, APOC1 is part of a small family of genes that is located on chromosome 19 and includes genes for apoC-II, apoE, and apoC-I (Smit et al., 1988). In addition, a nonfunctional gene (pseudogene) for apoC-I (APOC1') has been identified adjacent to the functional APOC1. APOC1' is inactivated by a nucleotide substitution in exon 3 that creates a translational stop codon in the signal peptide (Lauer et al., 1988).

We are using the baboon as a model for human atherosclerosis because this species is phylogenetically and physiologically similar to man (McGill et al., 1981a,b). For genetic studies, we are investigating the role of candidate gene variation, structure, and expression in lipid and lipoprotein metabolism. We previously isolated a clone from a genomic library that contained the entire APOE gene (Hixson et al., 1988b). Using Southern blotting, we identified APOC1 sequences that were located approximately 4 kb downstream from APOE. In this report, we present the sequence of cDNA encoding baboon apoC-I. We have also extended studies of the baboon APOE-APOC1 cluster by isolation and sequencing of a genomic clone that contains the entire APOC1 gene and overlaps with our previous clone containing APOE.
This baboon APOC1 gene appears to be functional (rather than a pseudogene) since it contains coding sequences that are identical to those of apoC-I cDNA.

MATERIALS AND METHODS

Isolation and sequencing of cDNA clones for baboon apoC-I. Total RNA from baboon liver was extracted in the presence of RNase inhibitors as previously described (Hixson and Britten, 1988). After purification of mRNA by oligo(dT) chromatography, random oligonucleotide

FIG. 1. Nucleotide and deduced amino acid sequences of baboon apoC-I cDNA aligned with the human sequence. The nucleotide sequence for baboon apoC-I cDNA is shown, and the inferred amino acid sequence is shown above the nucleotide sequence (numbered in italic type). Nucleotide differences in the human gene are shown below the baboon nucleotide sequence, and aa substitutions are shown above the baboon aa sequence. Codons with substitutions that change aa sequences are boxed, the termination codon is shown (U), and the polyadenylation signal site is underlined.

and oligo(dT) primers were used for cDNA synthesis. EcoRI linkers were added to cDNAs for ligation into λgt10, and libraries were screened with a human apoC-I cDNA probe provided by Dr. Ola Myklebost (Myklebost and Rogne, 1986). Inserts from six different hybridizing plaques were subcloned into M13 bacteriophage vectors for nucleotide sequence analysis.

Isolation and sequencing of APOC1 from a baboon genomic library. High-molecular-weight DNA was extracted from baboon liver (Blin and Stafford, 1976) and subjected to partial digestion with Sau3A. DNA fragments 15-20 kb in length were isolated on NaCl gradients according to Sambrook et al. (1989) and inserted into the BamHI site of λEMBL3 (Frischauf et al., 1983). After packaging and transfection, the library was plated (without amplification) and screened with the human apoC-I cDNA probe (Benton and Davis, 1977). In screening of 1 × 10⁶ plaques, we identified a single hybridizing clone for further

TABLE 1
Homology Comparisons among apoC-I Sequences from Baboon, Human, Dog, and Rat

Comparison                             cDNA(a)   Coding DNA(b)   Protein(c)
Baboon apoC-I vs human apoC-I(d)       92.2%     92.5%           85.5%
Baboon apoC-I vs human apoC-I'(d)      81.0%     88.5%           74.7%
Human apoC-I vs apoC-I'                79.1%     89.7%           76.2%
Baboon apoC-I vs dog apoC-I(e)         70.5%     74.5%           69.3%
Baboon apoC-I vs rat apoC-I(f)         58.4%     69.0%           63.6%

(a) Comparisons of nucleotide sequences in coding and untranslated regions.
(b) Comparisons of nucleotide sequences in coding regions only.
(c) Comparisons of inferred amino acid sequences.
(d) Human apoC-I and C-I' sequences from Lauer et al. (1988).
(e) Dog apoC-I cDNA sequence from Luo et al. (1989).
(f) Rat apoC-I cDNA sequence from Shen and Howlett (1989).
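Homology values of the kind listed in Table 1 are percent identities between aligned sequences. The following is a minimal sketch of that calculation, not the authors' procedure: the function name, the toy sequences, and the choice to skip alignment columns that contain a gap are all assumptions made for illustration.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence carries a gap character are skipped.
    That gap convention is an assumption; the paper does not state one.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-nt aligned fragments that differ at one position.
print(percent_identity("ATGAGGCTCT", "ATGAGACTCT"))
```

The same function applies unchanged to aligned amino acid sequences, which is how the nucleotide and protein columns of a table like this can be compared on an equal footing.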

FIG. 2. Restriction map and sequencing strategy for the baboon APOC1 gene. At the top, a map of the baboon genomic region containing APOE and APOC1 (filled boxes, transcriptional orientation shown by arrows above the map) is shown. Key restriction sites (B, BamHI; E, EcoRI; H, HindIII; P, PstI) that were used to align the recombinant genomic clones containing APOE and APOC1 are shown (brackets below the restriction map). Below the clone maps, an expanded map of the baboon APOC1 gene (exons shown as filled boxes) shows the restriction fragments used for subcloning and sequence analysis. The arrows above the map show sequencing for full-length subclones, and the arrows below the map show the endpoints and direction of sequencing for deleted subclones.


ALU-1

~TGCAGCC~CTCCTGGGCTCAAGCAATCTTCCT~CTCAGCCTCTTGAGTAGCTGGGACT~GCATG~~CACCACACCTGGCTAATTTTTGAA

-138 -638 -538 -438 -338

GGGCGGATGAGATCACAGGGTTATTACTAGAAGGCCCCTGAGGGCAGATGTCCACAGGGACAGGACGAGGCTGTCCTCTGAGTGGGGAAAGGGACTGTAG

-238

TAGTCTGAGGACCCCCCAGAGTCAGGAAGATTGGGAGGTGAGAATGCTGAACCCTAAGGGGCTCTGGGGCTAAGGGAAGTGCCCGGGACCCCACCTGACC

-138

CCAACGCCCACGGGCCAGGGACAGAGGAGAAAAACGAGGGTGGACACAGGGAGGCAGGCTCAGGAGGAGGGAGATCAACATCAACCTGCCCCGCCCCCTC EXON-1 I X~CAGCCCTCCAGG~GGATT~AG

CCCAGC~TGATAAAGGT~~TGCGGG~~GGGA~~T~~~

-38

I GTTGGTG~TGAGTGCTTGGGAGGGA~ACCAGCCTT

61

CACTCTGCAAGAACTCAAAAAGGGAGATGAGGGGATCGTGGGAAGGAGGGAGGGAGGACGGTGCCACTGATCCCCTGCACCCCTCCCTCTGCCTCCAG

159

EXON-2 I AGTGCCCCTCCGGCCCCGCC

19 238

M R L F L S L P V L V V V L S M V L E ATG AGG CTC TTC CTG TCG CTC CCG GTC CTG GTG GTG GTT CTG TCG ATG GTC TTG GAA G I

GTAAAAGTGGGATGGGAGAATGCGGAGTTGGACTTTTGGAAGAGTGAAGGTGGCTACAGGCCGGGGGTCCCGGCTTAGAGGACTTCTGAGAGCTCTGGGG

338 438 538

CTGAGTAAGGCAACTCCCAGCGGCTTGAGCCCCACTAGGTCACTCCAGTATCCTCCCCATTCTAACCACATGATCCCCAAAGATCTCCTTATCCATCCCG

638

ALU-2

GGATCCCATCCTAAGGGGGTTCCAATAAC

LGCGGGC-

CTTGAGGTCAGGAGTXXiUiACCAGCCTGGCCCATATGGTGAAA

738

CTCC GTCTCTACTAAAAATACAAAAAAATTBGCCGGGTATGGATGCACGCACCTGTA

838

938

GCTACTTGGGAGGCTGAGGCAGG~CTGCTTGAACTCAGAAGGTGGTGGTTGCAGTGAGCCAAG~CTCCAGCTTTGGC~ EXON-3

I"GC CCA p GCC A CCA p GTC " CAG Q GGG G

26 1030

A P D V S S A L D K L K E F G N T L E D K A W E V GCC CCG GAT GTT TCC AGC GCC TTG GAT AAG CTG AAG GAG TTT GGA AAC ACC CTG GAG GAC AAG GCT TGG GAA GTG

1105

I N R 1 K Q S E F P A K T ATC AAC CGC ATC AAA CAG AGT GAA TTT CCT GCC AAG ACA CG "I GTTAGAACCCTTCCCAGGGCACAGGAGGGCGGGGGTGTGTTTTTG

1191

ACAGAGCAAGACTCTGTCTCAAAAAAAAAAAAGAAAAGAAAAGAA~A~G~~GAACCCCTGCCCATATTCCTGGCAG

51

65

1291

GGTGGAGCCCTGGAGGATGGTCCAAGATGAACAAATTGAARRMAAARRACAAACAAGTCCCAGAGAGGCTGACAACGTCCCTCTGGTCACACAGCTAGATCTC Am-3 AGGGTGCTTAGACTTTAGGGACAGTTTCCCTGACTCTCATCCAGGCCACATTTTTTTTTTTTTTTTTTTTTTTGAGACGGAGTCTCGCGCTGT~

c

1391 1491 1591

GCCCACCACCGCGCCTGGCTAATTTTTTGTATTTTTAATAGAGACGGGGTTTC~TCCTGACCTCGTGATCCAC~

ALU-4 CGCcTCGGCCTCCCAAAGTGCTGGGATTACAGGCTTGAGCCACCTGTCCCGGCCCAGGCCACATTTTAAAAAGATGGTCTTCX;GCTGT~CGTGG~

1691 1791

c AAAAATACAAAAAAATTAGCCAGGCGTGGTGGCGTGCTGTAATCCCAGCTAGTCAGGAGGCTGAGTCAGGGGAATTGCTCG~AGGCGG

1891 Am-4.5

GGGTTACAGTGAGCTGACATTGCGCCACTGAACTCGGGCCTGGGTGACAGAACGAGACT~~TGGTCTTGCCCAGGTACAGT

1991

GGCTCACACCTGTAATCCTAGCACTATGGGAGGCTGAGATGGGAGGATTGCTTGAGCCCAGAGGTTC~CCAGCC~CCTGT

2091

~TATTAGA~G~~GG~G~~~GAGATAATGRAAATGAAGGATCGGGGGATCCAAGCTTGCcAGAccTGcGTCCCcAGGcTGGcGTAcGAcCc

2191

CTTACTTCCTGTGTGATCTTGACAGAGGGTCATTACTGTCAGCCTCAGTTTCCTCTCCTATAAACTGGTGGTTCTGCGGGGAAGT~AGGAGAGGGCCTA

2291

CAGGGTGTCTGGTACATGTAGATGCTCAGTACATCATCAAAAATATTTATT

2391 ALU-5

GAGCATCTGCTAAGTGTTGGAAACTGTCTCAGCGTGGGGAAT~ACAGT~ACTCACACCTGTAATCCCAGCACTTTGG~GGCTGAG~

2491

TGGGTGGATCACTTGAGGTCAGGAGTGCGAGAACCCCGTCTCTACTBGBAATACAAAAAAAATTGGC~TGGTGGCCCACGCCTGTAGTCCCAGCT

2591

ACTTGGGAGGCTGAGGCAAGAGGAACCCTTGAGCCCAGGAGACTGAGGCTGCAGTGCGCCATGTTTGTGCCACTGCATTCCAGCCTGGGTAA~

2691

GACCCTGTCTCAACAACA~ACAACAAAAAAAAAAAAAAAAAGAAAAGAAAGAAGGAAAGAAAAGAAAAGAAAAAGGAAGGAAACAAAACATCC

2791

FIG. 3. Nucleotide and inferred amino acid sequences of the baboon APOC1 gene. Nucleotide sequences are numbered from the 5' transcriptional start site, and amino acid sequences are numbered in italic type. Introns and exons are indicated along the baboon APOC1 sequence (labeled and separated by vertical bars). Nucleotide sequences for Alu repeats are underlined, and flanking direct repeats are italicized and underlined. Transcriptional start sites are shown by dots above the sequence, and the polyadenylation signal site is underlined with a double line.
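The inferred amino acid sequence in Fig. 3 follows from reading the exon sequence in codons with the standard genetic code. As a minimal sketch, the snippet below translates the exon 4 coding segment as transcribed from Fig. 3; the transcription is taken from this extracted text and may carry extraction errors, and the function and variable names are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Exon 4 coding segment as transcribed from Fig. 3 (spaces removed below).
EXON4_CODING = (
    "GAT TGG TTT TCA GAG ACA TTT CGG AAA GTG "
    "AAG GAG AAA CTC AAG ATT AAC TCA TGA"
).replace(" ", "")

print(translate(EXON4_CODING))
```

The output ends in `*` because the segment includes the TGA termination codon marked in the figure.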


ALU-6

2891

AGCCAGGCCTAAACTTATAAAGATTGTTTGGAGGCCAGGCACAGTGCCTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAGGC~ACGGATCACCTGA

2991 3091 ALU-1 GAGACTCTGTCTCAAAAAAAAAAAAAAAAAAAA

3191

CTCCTGGCACGGTGGCTG4CGCCAGTFaarCCCAGCACTATGGGAGGCCGAGCGGGCAGA~ATGAGG

3291

TCAGGAGTTCAAGACTAGCCTGCTCAACATAATGAAACCCTCTCTGTACTAAAAATACAAAAATTAGCTGGATGTGGTGC~AGG~ACCTATAGTCCTAGC

3391

TACTCGGGAGTCTGAGGCAGGAGAATCGCTTGAACCTGGGAGGCAGAGGTTGCAGTGAGCC~GACAGCGCCATTGCACTCTAGTCTGGGTGACAGAGCG ALU-8

71

BAACTATATCTCAAAAAAAAAAAGAAGGGGGTTGGTAGCAAGAGATGACAGGCCTTGA~GCCAGGTCAGGGTGAAACGTT~TTATTTTAT.TT

-

3491

A ~TG~T~TGT~G~~~AGG~TGGAGTG~AGTGGCGCGATCTTGGCTCA~TGTAAG~TCCG~

3591

CTCCCGGGTTCACGCCATTCTCCTGCCTCAGCCTCCCAAGTAGCTGGGACTACAGGCGTCCGCACCACGCCGGCTAATTTTTGTATTTTACTAGAGATG~

3691

p

3791

GTT GC

A CTCCT A CT

ALU-9

GGTCTTATTTTTCATTTTTTGAAATGGAGTCTTGGAGTCTTGCTCTGTTGcCcAGGCTGGAGTAcAGTGG~GcGATcTGGTCTcACTG~ccACcAccTcCTGGGTT

3891

CAGGCAATTCTCCTACCTCAGCCTCATTAAGTACCTGGAACTACAGGTGTGCAACACTATGCATGGTATTTT~~GTATTT~AAAGACGAGATTT~

3991

CCATGTTGTCCAGGCTGGTCTCGAACTCCTGACCTcAAGTGATCTGccTGCCTTGGccTccCAAAG~GcTGGGA~TAcAGGCG~GAcGAcGA~GcccAAc

4091

TAAGGGTAAAGTGTTTAGACTTCAA~GTGCTCTGGTCCATCTGTGAACCTGAAGCACAGAAGTTGGCCCACCCAGCCCAGCGGA~~T~~~AAT~CCACAG

419;

ACAGTGAGGATGGAGATTCAGGAAGGGGAAGAGGTGGGAGTCAGGTAGCAGGTAGAATCTGGACAGCCTGGGAGGGAGCTGCACACAGTGACCCCTTCC~ EXON-4 ,D W F S E T F R K V K E K L K I N Ster TRTCCCTCCCCACA G GAT TGG TTT TCA GAG ACA TTT CGG AAA GTG AAG GAG AAA CTC AAG ATT AAC TCA TGA GCACGTG

429:

AAGGGTGACATCCCAGGAAGGGGCTCTGAAATTTCCCCGCACCTCAGCACCTGTGCTGAGGATGCCCTCCACGT(;GCCCCAGGTGCCACC~Tcc

4468

TATAG I AAAATTCTCTCCTGAGTGCTTCTTTACCCTGGGAAGGGCTGCGGAGAGGGTAGGGCTTCCAGCTTCCAGAGAGGGAGGGGGTGCGGGAGAGGGC

4568

AGGAGCTGAACC

4580

83 4368

FIG. 3-Continued

analysis. Using Southern blots with cDNA probes, we constructed a restriction map of this clone (insert size of 16 kb) and localized apoE and apoC-I sequences. Alignment of restriction maps showed that this clone overlapped with our previously isolated APOE clone (Hixson et al., 1988b) and extended to contain the entire APOC1 gene. To obtain the nucleotide sequence of the APOC1 clone, restriction fragments containing apoC-I sequences were gel-isolated from the λEMBL3 recombinant and subcloned into Bluescript M13 phagemid vectors (SK and KS for both strands) (Stratagene; San Diego, CA). We constructed overlapping deletion clones of fragments that were too large for complete sequencing by digestion of recombinant plasmids with Escherichia coli exonuclease III and S1 nuclease (Henikoff, 1984). Single-stranded templates were sequenced using Sequenase and [35S]dATP (Biggin et al., 1983).

Primer extension analysis of APOC1 transcriptional start sites. Synthetic oligonucleotides (5'-GCCTTCCAAGACCATCGACAGAACC-3' and 5'-CACTCTGTTTGATGCGGTTGATCAC-3') were end-labeled with T4 polynucleotide kinase and hybridized with total hepatic RNA for 14 h at 37°C (0.4 M NaCl and 0.2% SDS, final volume of 20 µl). After precipitation of hybridization products (15 min at room temperature) using isopropanol (0.4 ml) and ammonium acetate (0.4 M, 0.4 ml), reactions were adjusted to 50 mM Tris (pH 8.3), 40 mM KCl, 6 mM MgCl2, 1 mM dithiothreitol, and 0.4 mM of each dNTP. Extension reactions (10 µl) used AMV reverse transcriptase (5 units) in the presence of RNasin (5 units). After incubation at 42°C for 45 min, extension products were precipitated (as described above) and denatured (95°C, 5 min) in 80% formamide, 10 mM NaOH, and 1 mM EDTA for electrophoresis on denaturing gels (6% polyacrylamide, 7 M urea). Sizes of extension products were obtained by comparisons with a radiolabeled size standard (HpaII-digested pBR322 DNA).

Southern blotting to determine the number of APOC1 genes. DNA from baboon leukocytes (10 µg) was digested with several restriction enzymes for Southern blotting (Southern, 1975) with a baboon apoC-I cDNA probe. After electrophoresis on agarose gels (0.8%), DNA was transferred to nylon filters (GeneScreen Plus) under buffer conditions described by the suppliers (New England Nuclear; Boston, MA). Hybridizations (65°C, 24 h) used radioactive probes after labeling by random priming of gel-isolated DNA fragments (Feinberg and Vogelstein, 1983). After washing to remove unbound probe (final wash was 2X SSC and 1% SDS at 60°C for 30 min), filters were subjected to autoradiography with intensifying screens.

FIG. 4. Primer extension mapping of start sites for baboon APOC1 transcription. This polyacrylamide gel shows the products of a primer extension reaction (lane marked R) using total RNA from baboon liver hybridized with an oligonucleotide constructed from baboon apoC-I sequences (sizes marked to the left). The lane marked M shows molecular size standards (HpaII-digested pBR322 DNA, sizes marked to the right) used to determine sizes of primer extension products.
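Primer extension primers are antisense to the mRNA, so each one should match the reverse complement of a stretch of coding sequence. The sketch below checks the two primers listed in Materials and Methods against short excerpts transcribed from Fig. 3. The excerpts come from this extracted text and may carry extraction errors, and all function and variable names are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate_primer(primer: str, sense_strand: str) -> int:
    """Index at which an antisense primer anneals on the sense strand, or -1."""
    return sense_strand.upper().find(reverse_complement(primer))


# Primers as printed in Materials and Methods (5' to 3').
PRIMER_1 = "GCCTTCCAAGACCATCGACAGAACC"
PRIMER_2 = "CACTCTGTTTGATGCGGTTGATCAC"

# Excerpts transcribed from Fig. 3: the end of exon 2 joined to the first two
# nucleotides of exon 3 (as in spliced mRNA), and a stretch within exon 3.
EXON2_EXON3_JUNCTION = "GTGGTGGTTCTGTCGATGGTCTTGGAAG" + "GC"
EXON3_EXCERPT = "TGGGAAGTGATCAACCGCATCAAACAGAGTGAATTTCCTGCC"

print(reverse_complement(PRIMER_1), locate_primer(PRIMER_1, EXON2_EXON3_JUNCTION))
print(reverse_complement(PRIMER_2), locate_primer(PRIMER_2, EXON3_EXCERPT))
```

A primer that spans a splice junction, as the first one does under this transcription, anneals to spliced mRNA but not to the unspliced genomic sequence, which is one reason to test primers against the cDNA rather than the gene.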

FIG. 5. Southern blotting of baboon leukocyte DNA with a baboon apoC-I cDNA probe. Lane A shows digestion of leukocyte DNA with EcoRI, lane B shows simultaneous digestion with EcoRI and BamHI, lane C shows digestion with PstI, lane D shows digestion with PstI and HindIII, and lane E shows digestion with PstI and BamHI. Sizes of fragments that hybridized with a baboon apoC-I cDNA probe were calculated by comparisons with HindIII-digested bacteriophage λ DNA and are indicated adjacent to each lane. Fragments that were not predicted by the cloned APOC1 sequence are indicated by asterisks.
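Fragments "predicted by the cloned APOC1 sequence" are those expected from the positions of restriction sites in that sequence. The following is a minimal sketch of such an in-silico digest for the four enzymes used here. It treats the input as a linear molecule, uses a hypothetical toy sequence rather than the baboon clone, and all names are illustrative assumptions.

```python
# Recognition site and top-strand cut offset for each enzyme:
# EcoRI G^AATTC, BamHI G^GATCC, HindIII A^AGCTT, PstI CTGCA^G.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
    "PstI": ("CTGCAG", 5),
}


def cut_positions(seq: str, enzyme: str) -> list[int]:
    """Top-strand cut positions of one enzyme in a linear sequence."""
    site, offset = ENZYMES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_sizes(seq: str, enzymes: list[str]) -> list[int]:
    """Fragment lengths, in 5' to 3' order, after digestion with all enzymes."""
    seq = seq.upper()
    cuts = sorted({pos for name in enzymes for pos in cut_positions(seq, name)})
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 24-nt sequence with one EcoRI site and one BamHI site.
toy = "AAAA" + "GAATTC" + "CCCCCC" + "GGATCC" + "TT"
print(fragment_sizes(toy, ["EcoRI"]))
print(fragment_sizes(toy, ["EcoRI", "BamHI"]))
```

On a Southern blot only the fragments that overlap the probe are visible, so a comparison with observed bands would additionally filter the predicted fragments by probe position.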

RESULTS

Comparative Analysis of Baboon, Human, Dog, and Rodent apoC-I cDNAs

We constructed hepatic cDNA libraries and used a human apoC-I cDNA probe (Myklebost and Rogne, 1986) to identify cross-hybridizing clones. Figure 1 shows the nucleotide and inferred amino acid sequences from six different apoC-I cDNA clones that included a 5' untranslated region, a coding region (83 aa with a signal sequence of 26 aa), and a 3' untranslated region. We

FIG. 6. Distributions of Alu repeats in baboon APOC1 compared to those in human APOC1 and APOC1' (pseudogene). The middle gene map shows the 9.5 Alu repeats (repeats with rightward orientation are shown by white arrows; gray arrows show leftward orientation) along the baboon APOC1 gene (exons shown by filled boxes). A similar map for human APOC1 is shown above the baboon map, and the map for human APOC1' is shown below. Dotted lines between the maps show the sites of insertion or deletion of aligned repeats.

FIG. 7. Hydrophilicity profiles from inferred aa sequences for apoC-I from human, baboon, rat, and dog. The y axis shows the hydrophilicity value (-5.0 to 5.0) for apoC-I aa residues shown on the x axis (marked in groups of 10). Values above the 0.00 mark represent hydrophilic regions; those below are hydrophobic regions.
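A profile like those in Fig. 7 can be sketched as a sliding-window average over a per-residue scale. The example below uses the Kyte and Doolittle hydropathy values and negates them so that positive values mean hydrophilic, matching the axis described in the caption. The negation, the window of 7 residues, and all names are assumptions; the paper does not give the plotting parameters.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilicity_profile(protein: str, window: int = 7) -> list[float]:
    """Sliding-window mean of negated hydropathy (positive = hydrophilic).

    The window size is an assumed default, not a value taken from the paper.
    """
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    values = [-KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# First 19 residues of the baboon signal peptide as transcribed from Fig. 3.
SIGNAL_PEPTIDE_START = "MRLFLSLPVLVVVLSMVLE"
profile = hydrophilicity_profile(SIGNAL_PEPTIDE_START)
print([round(value, 2) for value in profile])
```

Every window in this stretch comes out negative, which is the expected signature of a hydrophobic signal peptide under the sign convention assumed here.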

aligned the baboon apoC-I cDNA with the human sequence (Lauer et al., 1988) to show both nucleotide and amino acid substitutions (Fig. 1). We also aligned the baboon sequence with cDNAs from less closely related species, including dog (Luo et al., 1989) and rat (Shen and Howlett, 1989). Table 1 shows the amounts of nucleotide and amino acid divergence between the baboon apoC-I cDNA and apoC-I from each of the other species.

Comparative Analysis of Baboon and Human APOC1 Genomic Sequences

To isolate the baboon APOC1 gene, we constructed a representative genomic library in λEMBL3, which we screened with the human apoC-I cDNA probe. Figure 2 shows the single cross-hybridizing clone (insert size of 16 kb) that included the 3' end of APOE and the entire APOC1 gene. Figure 2 also shows our strategy for subcloning and sequencing of APOC1. The complete nucleotide sequence of baboon APOC1 is shown in Fig. 3. The sequence includes 830 bp of 5' flanking region, four exons separated by three introns, and 106 bp of 3' flanking sequence. Sequence analysis revealed that this clone contains a functional APOC1 gene. Its exons do not contain a stop codon (unlike human APOC1'), and its coding sequences are identical to those of the hepatic apoC-I cDNA. In addition, all introns contain normal splice sites, and the 3' untranslated region contains a signal sequence for polyadenylation. Like the human genes,


baboon APOC1 contains a high density of Alu repeats (total of 9.5), including a baboon-specific Alu repeat (Alu-3 in Fig. 3) that is not found in either human APOC1 or APOC1'. Baboon APOC1 does not contain the Alu repeat located proximal to the APOC1' polyadenylation signal that may interfere with its expression (Lauer et al., 1988).

Identification of Baboon APOC1 Transcriptional Start Sites

To map the transcriptional start site(s) of baboon APOC1, we performed primer extension experiments using an oligonucleotide from exons 2 and 3 and total RNA from baboon liver. Figure 4 shows two sizes of extension products after electrophoresis, corresponding to start sites at nucleotides 1 and 3 (Fig. 3). This result was confirmed using a second primer located at the end of the third exon (data not shown). Transcription of baboon APOC1 begins at A nucleotides 7 and 9 bp downstream from the A nucleotide of the predominant human APOC1 start site (Lauer et al., 1988).

Detection of a Second Baboon APOC1 Gene

To investigate the number of baboon genes coding for apoC-I, we used Southern blots of genomic DNA for hybridization with a baboon cDNA probe containing exons 2 and 3 (nucleotides 50-242, Fig. 1). Figure 5 shows a Southern blot of baboon genomic DNA cut with several different restriction enzymes. If only one gene for apoC-I were present in the genomic DNA, we would expect to detect only fragments of specific sizes corresponding to restriction sites in the APOC1 clone (see restriction map in Fig. 2). For EcoRI (lane A), we expected a single 8.6-kb fragment after cleavage of leukocyte DNA. Instead, we detected an additional fragment (23.0 kb), as well as the predicted 8.6-kb fragment. Simultaneous digestion with EcoRI and BamHI (lane B) produced the predicted 3.8- and 1.5-kb fragments, but we also detected a 4.1-kb fragment that was not predicted by the APOC1 sequence. Digestion with PstI (lane C) produced a predicted 3.5-kb fragment, but we also detected additional 3.6- and 2.5-kb fragments. Further digestion of PstI-treated DNA with HindIII (lane D) or BamHI (lane E) cut the 3.5-kb PstI fragment as predicted from the sequence, but did not cut the additional 3.6-kb PstI fragment. The consistent presence of additional fragments using several restriction enzymes indicates the presence of a second baboon gene that may correspond to the human APOC1' pseudogene.

DISCUSSION

Baboon APOC1 Is a Functional Gene, but Shares Noncoding Structures with Both Human APOC1 and the APOC1' Pseudogene

We have cloned and sequenced a gene for apoC-I (APOC1) in baboons that is located approximately 4 kb downstream from APOE. This gene appears to be functional because it contains coding sequences that are identical to those of the hepatic apoC-I cDNA. We isolated six independent clones from baboon hepatic cDNA libraries, but did not find any clones with nucleotide substitutions that could have been transcribed from a different gene. Although baboon APOC1 coding sequences are more similar to human APOC1 (92.5%) than to APOC1' (88.5%) coding sequences (Table 1), comparisons of noncoding regions show that baboon APOC1 shares distinct structural features with both human APOC1 and APOC1'. A large portion of the baboon 5' flanking region (nucleotides 1-576, Fig. 3) is similar to that of APOC1' (88.5% identical). This sequence is not present at all in human APOC1 (Lauer et al., 1988). We also aligned Alu repeats in baboon APOC1 with those in human APOC1 and APOC1' to compare their numbers and distributions (Fig. 6). Exactly as in human APOC1', the second intron of the baboon gene contains only one Alu repeat compared to 2.5 Alu repeats in human APOC1. In contrast, the number and distribution of Alu repeats in baboon intron 3 and the 3' flanking region are identical to those of APOC1, and different from those of APOC1'. An Alu repeat that is unique to baboon APOC1 (Alu-3, Fig. 3) is found between these two regions, and may be responsible for evolutionary rearrangements or gene conversion.

APOC1 Is a Rapidly Evolving Apolipoprotein Gene

Comparisons between baboon cDNA sequences and homologous sequences from human, dog, and rat show that APOC1 has evolved rapidly with respect to both nucleotide and amino acid substitutions. Unlike those for most genes, APOC1 nucleotide sequences are more conserved than amino acid sequences, reflecting a large proportion of nucleotide substitutions that alter amino acid sequences (nonsynonymous substitutions). For example, only 37% of the nucleotide differences between baboon APOC1 and human APOC1 are synonymous substitutions that conserve protein sequence. To further examine APOC1 evolution, we calculated the number of substitutions per synonymous site (KS) and substitutions per nonsynonymous site (KA) of baboon APOC1 versus human APOC1 and APOC1' (Li et al., 1985). We also calculated the rate of synonymous and nonsynonymous substitutions using a baboon/human divergence time of 25 million years (Gingerich, 1984).

Comparison of baboon APOC1 with human APOC1' further supports ancient inactivation of this pseudogene. Baboon APOC1 and APOC1' have diverged only slightly more (KS = 0.09, KA = 0.14) than human APOC1 and APOC1' (KS = 0.06, KA = 0.13). As in comparisons of the human genes (Luo et al., 1989), the proportion of nonsynonymous to synonymous substitutions between baboon APOC1 and APOC1' is much higher than expected for comparisons with a functional gene.

Comparisons also show that APOC1 is evolving more rapidly than APOE and APOA1 since the divergence of human and baboon lineages. The rate of nonsynonymous substitutions of baboon and human APOC1 is 1.35 × 10⁻⁹ substitutions per site per year compared to 0.57 × 10⁻⁹ for APOE (Hixson et al., 1988b) and 0.46 × 10⁻⁹ for APOA1 (Hixson et al., 1988a). The rate of synonymous substitutions is also higher for APOC1 (2.55 × 10⁻⁹) than for APOE (1.80 × 10⁻⁹) or APOA1 (1.73 × 10⁻⁹).

To investigate the effects of extensive nucleotide and amino acid substitutions during APOC1 evolution, we compared hydrophilicity profiles for inferred apoC-I amino acid sequences from several species. Figure 7 shows alignment of hydrophilicity profiles for baboon, human, rat, and dog that were calculated according to Kyte and Doolittle (1982). Despite extensive amino acid substitutions among these sequences (Table 1), hydrophilic and hydrophobic domains are highly conserved. This result is consistent with the models of amphipathic helices proposed for the lipid-associating domains of apolipoproteins (Kaiser and Kezdy, 1983). In addition, hydrophobic domains containing the signal peptide are conserved among all species.

Two Genes Containing apoC-I Sequences Are Detected in the Baboon Genome

Sequence comparisons of human genes for apoC-I showed a high rate of nonsynonymous substitutions that may reflect inactivation of the APOC1' pseudogene soon after gene duplication and divergence from APOC1 (Luo et al., 1989). Although we have not yet cloned a pseudogene, Southern blotting of baboon genomic DNA detects additional restriction fragments that may correspond to an additional gene for apoC-I (Fig. 5). The presence of this second gene in baboons indicates that the genes for apoC-I diverged before the split of baboon and human lineages (approximately 25 million years; Gingerich, 1984), providing further support for a divergence time of 35 million years as proposed by Luo et al. (1989).

ACKNOWLEDGMENTS

We thank Dr. Ola Myklebost for providing the cloned human apoC-I cDNA probe, and Dr. Wen-Hsiung Li for providing the computer program to determine rates of sequence divergence. We also thank Dr. Donna Driscoll for helpful comments in the preparation of the manuscript. This work was supported by NIH Biomedical Research Support Grant 2 SO7 RR05519-27 to M.P. and NIH Grant HL28972.

REFERENCES

Benton, W. D., and Davis, R. W. (1977). Screening λgt recombinant clones by hybridization to single plaques in situ. Science 196: 180-182.

Biggin, M. D., Gibson, T. J., and Hong, G. F. (1983). Buffer gradient gels and 35S label as an aid to rapid DNA sequence determination. Proc. Natl. Acad. Sci. USA 80: 3963-3965.

Blin, N., and Stafford, D. (1976). A general method for isolation of high molecular weight DNA from eukaryotes. Nucleic Acids Res. 3: 2303-2308.

Feinberg, A. P., and Vogelstein, B. (1983). A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132: 6-13.

Frischauf, A.-M., Lehrach, H., Poustka, A., and Murray, N. (1983). Lambda replacement vectors carrying polylinker sequences. J. Mol. Biol. 170: 821-842.

Gingerich, P. D. (1984). Primate evolution: Evidence from the fossil record, comparative morphology, and molecular biology. Yearb. Phys. Anthropol. 27: 57-72.

Henikoff, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28: 351-359.

Hixson, J. E., Borenstein, S., Cox, L. A., Rainwater, D. L., and VandeBerg, J. L. (1988a). The baboon gene for apo A-I: Characterization of a cDNA clone and identification of DNA polymorphisms for genetic studies of cholesterol metabolism. Gene 74: 483-490.

Hixson, J. E., and Britten, M. L. (1988). The baboon β-myosin heavy-chain gene: Construction and characterization of cDNA clones and gene expression in cardiac tissues. Gene 64: 33-42.

Hixson, J. E., Cox, L. A., and Borenstein, S. (1988b). The baboon apo E gene: Structure, expression, and linkage with the gene for apo C-I. Genomics 2: 315-323.

Kaiser, E. T., and Kezdy, F. J. (1983). Secondary structures of proteins and peptides in amphiphilic environments (a review). Proc. Natl. Acad. Sci. USA 80: 1137-1143.

Knott, T. J., Robertson, M. E., Priestley, L. M., Urdea, M., Wallis, S., and Scott, J. (1984). Characterization of mRNAs encoding the precursor for apolipoprotein C-I. Nucleic Acids Res. 12: 3909-3915.

Kyte, J., and Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105-132.

Lauer, S. J., Walker, D., Elshourbagy, N. A., Reardon, C. A., Levy-Wilson, B., and Taylor, J. M. (1988). Two copies of the human apolipoprotein C-I gene are linked closely to the apolipoprotein E gene. J. Biol. Chem. 263: 7277-7286.

Li, W.-H., Wu, C.-I., and Luo, C.-C. (1985). A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotide and codon changes. Mol. Biol. Evol. 2: 150-174.

Luo, C.-C., Li, W.-H., and Chan, L. (1989). Structure and expression of dog apolipoprotein A-I, E, and C-I mRNAs: Implications for the evolution and functional constraints of apolipoprotein structure. J. Lipid Res. 30: 1735-1746.

McGill, H. C., Jr., McMahan, C. A., Kruski, A. W., Kelley, J. L., and Mott, G. E. (1981a). Responses of serum lipoproteins to dietary cholesterol and type of fat in the baboon. Arteriosclerosis 1: 337-344.

McGill, H. C., Jr., McMahan, C. A., Kruski, A. W., and Mott, G. E. (1981b). Relationship of lipoprotein cholesterol concentrations to experimental atherosclerosis in baboons. Arteriosclerosis 1: 3-12.

Myklebost, O., and Rogne, S. (1986). The gene for the human apolipoprotein C-I is located 4.3 kilobases away from the apolipoprotein E gene on chromosome 19. Hum. Genet. 73: 286-289.

Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989). "Molecular Cloning: A Laboratory Manual," 2nd ed., pp. 2.88-2.89, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

Schaefer, E. J., Eisenberg, S., and Levy, R. I. (1978). Lipoprotein apoprotein metabolism (a review). J. Lipid Res. 19: 667-687.

Shen, P. Y., and Howlett, G. J. (1989). Nucleotide sequence of cDNA for rat apolipoprotein C-I. Nucleic Acids Res. 17: 6405.

Smit, M., Kooij-Meijs, E.v.d., Frants, R. R., Havekes, L., and Klasen, E. C. (1988). Apolipoprotein gene cluster on chromosome 19: Definite localization of the APOC2 gene and the polymorphic HpaI site associated with type III hyperlipoproteinemia. Hum. Genet. 78: 90-93.

Soutar, A. K., Garner, C. W., Baker, H. N., Sparrow, J. I., Jackson, R. L., Gotto, A. M., and Smith, L. C. (1975). Effects of the human plasma apolipoproteins and phosphatidylcholine acyl donor on the activity of lecithin cholesterol acyltransferase. Biochemistry 14: 3057-3064.

Southern, E. M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98: 503-517.