The bovine and ovine genomes contain multiple sequences homologous to the α-lactalbumin-encoding gene


Gene, 83 (1989) 331-338
Elsevier

GENE 03188

(Milk proteins; exon; intron; pseudogene; duplication; recombinant DNA)

S. Soulier a, J.C. Mercier a, J.L. Vilotte a, J. Anderson b, A.J. Clark b and C. Provot a

a Laboratoire de Génétique Biochimique, INRA-CRJ, 78350 Jouy-en-Josas (France) and b AFRC, IAPGR-ERS, King's Buildings, Edinburgh EH9 3JQ (UK) Tel. (031) 667 69 01; Fax 668 33 09

Received by J.-P. Lecocq: 26 April 1989 Revised: 9 June 1989 Accepted: 10 June 1989

SUMMARY

Bovine and ovine (pseudo)genes homologous to the α-lactalbumin-encoding gene are described. In both cases, sequence analysis reveals homology extending downstream from exon 2. Southern analysis indicates the presence of a family of α-lactalbumin-related sequences in the bovine genome.

INTRODUCTION

α-Lactalbumin (ala) is a calcium metalloprotein found in almost all mammalian milks (Jenness, 1982). It interacts with the enzyme UDP-galactosyltransferase (EC 2.4.1.22), modifying its carbohydrate-binding properties, and thereby induces the synthesis of lactose in the lactating mammary gland (Kuhn, 1983). Lysozyme and ala have been shown to be structurally related (Brew et al., 1967). Genes encoding rat (Qasba and Safaya, 1984), human (Hall et al., 1987), bovine (Vilotte et al., 1987) and guinea pig (Laird et al., 1988) ala and hen egg-white lysozyme have a similar organization, with three introns located at comparable sites, suggesting a common ancestral origin. In rat, human and guinea pig, ala has been reported to be present in the genome as a single copy. We have recently reported the sequence of the bovine ala and the preliminary characterization of a related clone (Vilotte et al., 1987). Here, we report sequences of that clone and an ovine counterpart and describe the complex genomic organization of ala-related sequences in the bovine genome.

Correspondence to: Dr. J.L. Vilotte, Laboratoire de Génétique Biochimique, INRA-CNRZ, 78350 Jouy-en-Josas (France) Tel. (1)34652576; Fax 34652273.

Abbreviations: aa, amino acid(s); ala, α-lactalbumin; ala, gene encoding ala; bp, base pair(s); kb, kilobase or 1000 bp; nt, nucleotide(s); SDS, sodium dodecyl sulfate.

0378-1119/89/$03.50 © 1989 Elsevier Science Publishers B.V. (Biomedical Division)

EXPERIMENTAL AND DISCUSSION

(a) Cloning, sequencing and structural analysis of genomic clones

Isolation and characterization of bovine genomic clones from a HindIII library constructed in λNM762 have been reported previously (Vilotte et al., 1987). Ovine clone SS5 was isolated from a Sau3A partial library constructed in λEMBL3. Sequences were obtained by the dideoxy chain-termination procedure (Sanger et al., 1977) after subcloning in M13 vectors mp10/11 (Messing and Vieira, 1982). Sequences of overlapping fragments of both strands were determined for the bovine clone, λ2, but only partial sequence analysis was performed for the ovine clone, SS5 (Fig. 1).

We have previously shown that a 3.2-kb HindIII-TaqI fragment from λ2 hybridizes with bovine ala cDNA sequences (Vilotte et al., 1987; see also Fig. 1). This fragment was completely sequenced and aligned with the published nt sequence for the bovine ala. The two sequences are only partially homologous. This homology starts close to, and extends downstream from, the donor splice site in intron 2 of ala (Fig. 2). Over this region the homology between the two sequences is 81%. On the basis of this alignment, λ2 has consensus acceptor splice sites in introns 2 and 3, a donor splice site in intron 3, as well as the equivalent polyadenylation site (Fig. 3). However, a deletion of 1 nt (position 2167) at the end of the putative exon 3 would cause a frameshift mutation with respect to the ala sequence. The translated C-terminal part of this protein would share no significant homology with bovine ala. Hybridization of the 3.6-kb TaqI-HindIII fragment located at the 3' end of λ2 (Fig. 1) to the equivalent region of ala indicates that the homology between the two clones extends further downstream from the known sequences. Upstream from the exon 2/intron 2 junction, ala and λ2 are completely unrelated. A search through the EMBL DNA data base did not produce any sequences with significant homology to this region of λ2.

Southern blotting and sequence analysis of the ovine clone, SS5, suggested an organization similar to that determined for the bovine clone, λ2. First, (bovine) ala cDNA probes comprising sequences downstream from exon 2 hybridized strongly to 0.93- and 0.5-kb EcoRI fragments (Fig. 1), whereas ala sequences upstream from exon 2 failed to hybridize to SS5 (not shown). Similarly, the 1.1-kb HindIII-EcoRI fragment derived from the 5' end of λ2 (a region which shows no homology to ala) was shown to hybridize to the 3.7-kb EcoRI fragment located immediately upstream from the region of homology between SS5 and bovine ala (Fig. 1).

Two regions of SS5 were sequenced. The alignment of these sequences with both ala and λ2 is shown in Fig. 3. The two regions that were sequenced include putative exons 3 and 4. In this alignment, their overall homology with ala is 73% and with λ2 is 68%. In contrast to λ2, the putative coding sequence of SS5 has no nt sequence deletion with respect to ala, and overall the two sequences (SS5 and ala) share 80% homology at the protein level. However, in this alignment, SS5 does not have a stop codon in the equivalent position to ala, and so would encode an additional 37 aa at the C terminus. We have not yet determined the exact breakpoint of homology between SS5 and ala.

Fig. 1. Restriction maps and sequencing strategy of the inserts of the bovine clone, λ2 (A) and the ovine clone, SS5 (B). (Upper maps in A and B) Genomic DNA inserts of recombinant λ2 and SS5 clones. Regions hybridizing with the bovine ala cDNA are blackened. The star marks the TaqI-HindIII subfragment hybridizing with the 3'-flanking region of ala. (Lower maps in A and B) Strategy used for sequencing fragments subcloned in M13mp10 or M13mp11 vectors, using the dideoxy chain-termination method. The 5'-specific probe from λ2 used against SS5 and the derived EcoRI subclone hybridizing to it are shaded.

(b) Southern analysis of bovine genomic DNA

The isolation of λ2 suggested the presence of ala-related sequences in the bovine genome. Southern-blot analysis of bovine DNA samples (using full-length cDNA and specific 5' and 3' probes) confirmed the relatively complex organization of these sequences (Figs. 4 and 5). For example, in addition to the three HindIII fragments predicted from the maps of ala and λ2, three other HindIII fragments were revealed by Southern blotting (Fig. 4). Furthermore, digestion of bovine DNA with restriction endonucleases whose recognition sites are absent from bovine ala (Fig. 5) generated at least six hybridizing fragments for each enzyme used. The majority of these fragments hybridized only to the 3'-specific cDNA probe.

Fig. 2. Homology matrix analysis and schematic representation of the bovine ala and the 5' end of the insert of clone λ2. (A) Dot analysis was performed using a homology matrix analysis routine ('DNA Inspector II+' program from Textco, West Lebanon, NH) with the following parameters: search element length: 15 nt; maximum number of mismatches: 3 nt. (B) Blackened and hatched segments represent the coding frame and the 5' or 3' untranslated regions of ala and their homologous regions in clone λ2, respectively. Shaded segments represent the other regions homologous between the two clones. Numbers refer to their location in the genomic fragments (see Fig. 3).
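The homology matrix (dot) analysis described in the legend to Fig. 2 can be illustrated with a minimal sketch. The algorithm used by the 'DNA Inspector II+' program is not described here, so the code below is only a generic sliding-window comparison under the stated parameters (search element of 15 nt, at most 3 mismatches). The function name `dot_matrix` and the toy sequences are hypothetical.

```python
def dot_matrix(seq_a, seq_b, window=15, max_mismatches=3):
    """Return (i, j) start positions where a window of seq_a and a window
    of seq_b differ at no more than max_mismatches positions.

    Generic sliding-window sketch; not the routine of any specific program.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            mismatches = sum(1 for x, y in zip(win_a, win_b) if x != y)
            if mismatches <= max_mismatches:
                hits.append((i, j))
    return hits


# Toy example (invented sequences): a shared 20-nt core preceded by
# unrelated sequence in seq_a and followed by unrelated sequence in seq_b.
core = "ACGTACGGATCCTAGCATGC"
seq_a = "G" * 10 + core
seq_b = core + "C" * 10
hits = dot_matrix(seq_a, seq_b)

# Dots on the diagonal i - j = 10 mark the shared core.
diagonal = [(i, j) for (i, j) in hits if i - j == 10]
print(diagonal)
```

In such a plot, a region of homology appears as a run of dots along one diagonal, and the point where the run stops corresponds to the breakpoint of homology. The nested loops are quadratic in sequence length, which is acceptable for fragments of a few kb.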

(c) Conclusions

λ2 and SS5 represent similar bovine and ovine chromosomal segments, respectively. They contain sequences related only to the 3' half of ala and they, themselves, share homologous upstream sequences. They appear very similar and thus may have arisen before the divergence of the two species from the gene duplication of ala or ala-like sequences. The relatively complex pattern of restriction fragments observed when bovine DNA was probed with 3'-specific ala cDNA sequences indicates that the duplication of similar ala regions may have occurred a number of times. Although the nt stretches of SS5 and λ2 homologous to ala exons 3 and 4 could be correctly processed, as judged from the conservation of putative binding sites, the absence of upstream homology, the frameshift mutation (in λ2) and the stop codon mutation (in SS5) would suggest that these segments correspond to ala pseudogenes. However, the possibility that one (or both) related gene(s) does (do) encode protein(s) cannot be ruled out. Already, duplication of an ancestral lysozyme gene gave rise to ala and such events are generally regarded as a first step towards the creation of new genes.
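The two coding-sequence defects invoked above, a 1-nt deletion that shifts the reading frame and the loss of a stop codon that extends the open reading frame, can be illustrated with a short sketch. The sequences are toy examples written for illustration and are not taken from Fig. 3. The helper name `translate` is hypothetical and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate seq in frame 0 up to the first stop codon (one-letter aa)."""
    protein = []
    for k in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[k:k + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy reference reading frame ending with a TGA stop codon.
reference = "GATCAGTGGCTCTGTGAGAAGTTGTGA"

# Deleting a single nt (index 4) shifts every downstream codon.
frameshifted = reference[:4] + reference[5:]

# Replacing the stop codon lets translation continue to the next in-frame stop.
readthrough = reference[:-3] + "AAAATATAA"

print(translate(reference))     # DQWLCEKL
print(translate(frameshifted))  # DRGSVRSC
print(translate(readthrough))   # DQWLCEKLKI
```

In the frameshifted toy sequence every residue after the deletion differs from the reference, which is the sense in which a translated C-terminal part would share no significant homology with the parent protein.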

[Fig. 3 sequence alignment: bovine clone λ2 (upper sequence, nt 1-3177), bovine ala (middle sequence, nt 1228-3090) and partial sequences of ovine clone SS5 (lower sequence, marked OV); see the legend below.]

Fig. 3. Nucleotide sequences of bovine clone λ2 and ovine clone SS5: homology with the region of the bovine ala, downstream from exon 2. (Upper sequence; nt 1-3177) 5' end sequence of bovine clone λ2. (Middle sequence; nt 1228-3090) sequence of bovine ala (Vilotte et al., 1987). The first 1227 nt of the bovine ala are not shown. Italicized numbers refer to the gene. Bold-face italicized nt and aa refer to exons 3 and 4 of the gene and the corresponding aa sequence of ala. The polyadenylation signal, AATAAA, is underlined. (Lower sequence; (OV) denotes partial nt sequences of ovine clone SS5 homologous to regions (nt 1600-2010) and (nt 2204-2785) of bovine ala.) The putative stop codon in SS5 is underlined. Vertical bars (arabic numerals 1) refer to identical nt. Sequence alignments were carried out with the 'Microgenie' program (Beckman). Asterisk denotes frameshift mutation located at the 3' end of the putative exon 3 of λ2.
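As an illustration of how a percentage of identical nt can be derived from an alignment of the kind shown in Fig. 3, the sketch below marks identical positions and computes percent identity for two gapped sequences. How gaps were scored for the percentages quoted in the text is not stated. Excluding gap columns from the denominator is an assumption made here, and the function names and toy sequences are hypothetical.

```python
def match_line(aligned_a, aligned_b, gap="-"):
    """Return a string with '|' under identical nt and ' ' elsewhere."""
    return "".join(
        "|" if x == y and x != gap else " "
        for x, y in zip(aligned_a, aligned_b)
    )


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical positions; columns containing a gap are skipped
    (an assumption, not necessarily the convention used in the paper)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment (invented): one gap column and one mismatch.
a = "ACGT-ACGTA"
b = "ACGTTACCTA"
print(a)
print(match_line(a, b))
print(b)
print(round(percent_identity(a, b), 1))  # 8 of 9 compared positions
```

Counting gap columns as mismatches instead would lower the value (8 of 10 positions in the toy case), so the convention matters when comparing figures such as 81%, 73% and 68%.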


Fig. 4. Southern-blot analysis of bovine DNA. Total genomic DNAs prepared from fresh bovine lymphocytes from nine unrelated cows (lanes 1-9) by a standard proteinase K-SDS technique (kind gift from Dr. P. Brown) were digested to completion with HindIII or BglI as indicated. Digestion, 1% agarose gel electrophoresis and Southern transfers to Hybond-N membranes were done according to Maniatis et al. (1982). Hybridization with [α-32P]dCTP oligo-labeled bovine ala cDNA probe was performed in 500 mM Na phosphate pH 7.2/7% SDS at 65°C overnight. Filters were washed with 40 mM Na phosphate pH 7.2/0.1% SDS at 65°C. The λ-HindIII size markers are shown on the right margin (in kb). Expected HindIII genomic restriction fragments corresponding to ala and the putative pseudogene are indicated by the arrows on the left margin (in kb).
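The sizes of the fragments expected from a complete digest of a known sequence, such as those predicted from the maps of ala and the putative pseudogene, can be computed with a minimal sketch like the one below. It assumes a linear molecule and a simple recognition site without ambiguous positions, as for HindIII (A^AGCTT) or EcoRI (G^AATTC). The function name `digest` and the toy sequence are hypothetical.

```python
import re


def digest(seq, site, cut_offset):
    """Return fragment lengths (5' to 3') after complete digestion of a
    linear sequence. cut_offset is the cut position within the site on
    the top strand, e.g. 1 for HindIII (A^AGCTT)."""
    # The lookahead pattern also finds overlapping occurrences of the site.
    cuts = [m.start() + cut_offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Toy sequence (invented) containing two HindIII sites.
toy = "G" * 10 + "AAGCTT" + "C" * 20 + "AAGCTT" + "T" * 5
print(digest(toy, "AAGCTT", 1))  # [11, 26, 10]
```

Only fragments overlapping the probe are detected on a Southern blot, so a prediction of this kind gives the candidate fragment sizes rather than the banding pattern itself. Hybridizing bands absent from the prediction are what point to additional related sequences in the genome.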

[Fig. 5 panels: Southern blots hybridized with the 5' probe (HinfI fragment) and with the 3' probe (HinfI-PstI fragment); lanes PvuII, PstI, BamHI, EcoRI and BglI in each panel; partial restriction map of the ala cDNA (exons 1-4) below.]

Fig. 5. Southern-blot analysis with 5'- and 3'-specific probes. Partial restriction map of the bovine ala cDNA is indicated at the bottom; P, PstI; H, HinfI. Blackened and open segments represent the coding and untranslated regions, respectively (Vilotte et al., 1987). The 5' HinfI fragment and the 3' HinfI-PstI fragment used as probes are indicated. Restriction endonucleases used in the Southern-blot analysis are indicated on the top of each lane. Experimental procedures are identical to those described in Fig. 4. The λ-HindIII-EcoRI size markers are indicated on the left margins (in kb). Arrows on the right margins indicate the BglI genomic restriction fragment encompassing the entire transcription unit of ala (Vilotte et al., 1987).

ACKNOWLEDGEMENTS

The λ bovine genomic library was a kind gift from Dr. P. Sondermeyer. We are very grateful to Dr. P. Brown for the gift of the bovine genomic DNAs and to Dr. S. Ali for helpful discussions.

REFERENCES

Brew, K., Vanaman, T.C. and Hill, R.L.: Comparison of the amino-acid sequence of bovine α-lactalbumin and hen's egg white lysozyme. J. Biol. Chem. 242 (1967) 3747-3749.

Hall, L., Emery, D.C., Davies, M.S., Parker, D. and Craig, R.K.: Organization and sequence of the human α-lactalbumin gene. Biochem. J. 242 (1987) 735-742.

Jenness, R.: Inter-species comparison of milk proteins. In Fox, P.F. (Ed.), Developments in Dairy Chemistry, Vol. 1, Proteins. Applied Science Publishers, New York, 1982, pp. 87-114.

Kuhn, N.J.: The biosynthesis of lactose. In Mepham, T.B. (Ed.), Biochemistry of Lactation. Elsevier, Amsterdam, 1983, pp. 159-176.

Laird, J.E., Jack, L., Hall, L., Boulton, A.P., Parker, D. and Craig, R.K.: Structure and expression of the guinea-pig α-lactalbumin gene. Biochem. J. 254 (1988) 85-94.

Maniatis, T., Fritsch, E.F. and Sambrook, J.: Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.

Messing, J. and Vieira, J.: A new pair of M13 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19 (1982) 269-276.

Qasba, P.K. and Safaya, S.K.: Similarity of the nucleotide sequences of the rat α-lactalbumin and chicken lysozyme genes. Nature 308 (1984) 377-380.

Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.

Vilotte, J.L., Soulier, S., Mercier, J.C., Gaye, P., Hue-Delahaie, D. and Furet, J.P.: Complete nucleotide sequence of bovine α-lactalbumin gene: comparison with its rat counterpart. Biochimie 69 (1987) 609-620.