Gene. 83 (1989) 331-338 Elsevier GENE 03188
The bovine and ovine genomes contain multiple sequences homologous to the α-lactalbumin-encoding gene
(Milk proteins; exon; intron; pseudogene; duplication; recombinant DNA)
S. Soulier a, J.C. Mercier a, J.L. Vilotte a, J. Anderson b, A.J. Clark b and C. Provot a
a Laboratoire de Génétique Biochimique, INRA-CRJ, 78350 Jouy-en-Josas (France) and b AFRC, IAPGR-ERS, King's Buildings, Edinburgh EH9 3JQ (UK) Tel. (031) 667 69 01; Fax 668 33 09
Received by J.-P. Lecocq: 26 April 1989 Revised: 9 June 1989 Accepted: 10 June 1989
SUMMARY
Bovine and ovine (pseudo)genes homologous to the α-lactalbumin-encoding gene are described. In both cases, sequence analysis reveals homology extending downstream from exon 2. Southern analysis indicates the presence of a family of α-lactalbumin-related sequences in the bovine genome.
INTRODUCTION
α-Lactalbumin (ala) is a calcium metalloprotein found in almost all mammalian milks (Jenness, 1982). It interacts with the enzyme UDP-galactosyltransferase (EC 2.4.1.22), modifying its carbohydrate-binding properties, and thereby induces the synthesis of lactose in the lactating mammary gland (Kuhn, 1983). Lysozyme and ala have been shown to be structurally related (Brew et al., 1967). Genes encoding rat (Qasba and Safaya, 1984), human (Hall et al., 1987), bovine (Vilotte et al., 1987) and guinea pig (Laird et al., 1988) ala and hen egg-white lysozyme have a similar organization, with three introns located at comparable sites, suggesting a common ancestral origin. In rat, human and guinea pig, ala has been reported to be present in the genome as a single copy. We have recently reported the sequence of the bovine ala and the preliminary characterization of a related clone (Vilotte et al., 1987). Here, we report sequences of that clone and an ovine counterpart and describe the complex genomic organization of ala-related sequences in the bovine genome.
Correspondence to: Dr. J.L. Vilotte, Laboratoire de Génétique Biochimique, INRA-CNRZ, 78350 Jouy-en-Josas (France) Tel. (1)34652576; Fax 34652273.
Abbreviations: aa, amino acid(s); ala, α-lactalbumin; ala, gene encoding α-lactalbumin; bp, base pair(s); kb, kilobase or 1000 bp; nt, nucleotide(s); SDS, sodium dodecyl sulfate.
0378-1119/89/$03.50 © 1989 Elsevier Science Publishers B.V. (Biomedical Division)
EXPERIMENTAL AND DISCUSSION
(a) Cloning, sequencing and structural analysis of genomic clones
Isolation and characterization of bovine genomic clones from a HindIII library constructed in λNM762 have been reported previously (Vilotte et al., 1987). Ovine clone SS5 was isolated from a Sau3A partial library constructed in λEMBL3. Sequences were obtained by the dideoxy chain-termination procedure (Sanger et al., 1977) after subcloning in M13 vectors mp10/11 (Messing and Vieira, 1982). Sequences of overlapping fragments of both strands were determined for the bovine clone, λ2, but only partial sequence analysis was performed for the ovine clone, SS5 (Fig. 1).
We have previously shown that a 3.2-kb HindIII-TaqI fragment from λ2 hybridizes with bovine ala cDNA sequences (Vilotte et al., 1987; see also Fig. 1). This fragment was completely sequenced and aligned with the published nt sequence for the bovine ala. The two sequences are only partially homologous. This homology starts close to, and extends downstream from, the donor splice site in intron 2 of ala (Fig. 2). Over this region the homology between the two sequences is 81%. On the basis of this alignment, λ2 has consensus acceptor splice sites in introns 2 and 3, a donor splice site in intron 3, as well as the equivalent polyadenylation site (Fig. 3). However, a deletion of 1 nt (position 2167) at the end of the putative exon 3 would cause a frameshift mutation with respect to the ala sequence. The translated C-terminal part of this protein would share no significant homology with bovine ala. Hybridization of the 3.6-kb TaqI-HindIII fragment located at the 3′ end of λ2 (Fig. 1) to the equivalent region of ala indicates that the homology between the two clones extends further downstream from the known sequences. Upstream from the exon 2/intron 2 junction, ala and λ2 are completely unrelated. A search through the EMBL DNA database did not produce any sequences with significant homology to this region of λ2.
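The effect of a 1-nt deletion on the reading frame, as described for the putative exon 3 of the bovine clone, can be illustrated with a minimal Python sketch. The 15-nt sequence used here is a hypothetical example, not the sequence of the clone or of ala, and the helper names `translate` and `delete_nt` are introduced only for this illustration. The standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(seq):
    """Translate complete codons in frame 1; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def delete_nt(seq, position):
    """Return seq with the nucleotide at 1-based `position` removed."""
    return seq[:position - 1] + seq[position:]


# Hypothetical reading frame (illustration only, not taken from Fig. 3).
reference = "ACTGATGACATTCTG"
shifted = delete_nt(reference, 5)

print(translate(reference))  # TDDIL
print(translate(shifted))    # TVTF
```

Every codon downstream from the deleted nt is read in a different frame, which is why the translated C-terminal part would no longer resemble the reference protein.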
Southern blotting and sequence analysis of the ovine clone, SS5, suggested an organization similar to that determined for the bovine clone, λ2. First, (bovine) ala cDNA probes comprising sequences downstream from exon 2 hybridized strongly to 0.93- and 0.5-kb EcoRI fragments (Fig. 1), whereas ala sequences upstream from exon 2 failed to hybridize to SS5 (not shown). Similarly, the 1.1-kb HindIII-EcoRI fragment derived from the 5′ end of λ2 (a region which shows no homology to ala) was shown to hybridize to the 3.7-kb EcoRI fragment located immediately upstream from the region of homology between SS5 and bovine ala (Fig. 1). Two regions of SS5 were sequenced. The alignment of these sequences with both ala and λ2 is shown in Fig. 3. The two regions that were sequenced include putative exons 3 and 4. In this alignment, their overall homology with ala is 73% and with λ2 is 68%. In contrast to λ2, the putative coding sequence of SS5 has no nt deletion with respect to ala, and overall the two sequences (SS5 and ala) share 80% homology at the protein level. However, in this alignment, SS5 does not have a stop codon in the equivalent position to ala, and so would encode an additional 37 aa at the C terminus. We have not yet determined the exact breakpoint of homology between SS5 and ala.
Fig. 1. Restriction maps and sequencing strategy of the inserts of the bovine clone, λ2 (A) and the ovine clone, SS5 (B). (Upper maps in A and B) Genomic DNA inserts of recombinant λ2 and SS5 clones. Regions hybridizing with the bovine ala cDNA are blackened. The star marks the TaqI-HindIII subfragment hybridizing with the 3′-flanking region of ala. (Lower maps in A and B) Strategy used for sequencing fragments subcloned in M13mp10 or M13mp11 vectors, using the dideoxy chain-termination method. The 5′-specific probe from λ2 used against SS5 and the derived EcoRI subclone hybridizing to it are shaded.
(b) Southern analysis of bovine genomic DNA
The isolation of λ2 suggested the presence of ala-related sequences in the bovine genome. Southern-blot analysis of bovine DNA samples (using full-length cDNA and specific 5′ and 3′ probes) confirmed the relatively complex organization of these sequences (Figs. 4 and 5). For example, in addition to the three HindIII fragments predicted from the maps of ala and λ2, three other HindIII fragments were revealed by Southern blotting (Fig. 4). Furthermore, digestion of bovine DNA with restriction endonucleases whose recognition sites are absent from bovine ala (Fig. 5) generated at least six hybridizing fragments for each enzyme used. The majority of these fragments hybridized only to the 3′-specific cDNA probe.
Fig. 2. Homology matrix analysis and schematic representation of the bovine ala and the 5′ end of the insert of clone λ2. (A) Dot analysis was performed using a homology matrix analysis routine ('DNA Inspector II+' program from Textco, West Lebanon, NH) with the following parameters: search element length: 15 nt; maximum number of mismatches: 3 nt. (B) Blackened and hatched segments represent the coding frame and the 5′ or 3′ untranslated regions of ala and their homologous regions in clone λ2, respectively. Shaded segments represent the other regions homologous between the two clones. Numbers refer to their location in the genomic fragments (see Fig. 3).
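The homology matrix (dot) analysis named in the legend to Fig. 2 can be sketched as a simple window comparison. This is only an illustration of the general technique, not the algorithm of the program used by the authors. The function name `dot_matrix` and the toy sequences are assumptions, and the default parameters follow the values as read from the legend to Fig. 2 (15-nt search element, at most 3 mismatches), which should be checked against the original figure.

```python
def dot_matrix(seq_a, seq_b, window=15, max_mismatches=3):
    """Return 0-based (i, j) start positions at which the windows of length
    `window` in seq_a and seq_b differ at no more than `max_mismatches` sites."""
    hits = []
    for i in range(len(seq_a) - window + 1):
        word_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            word_b = seq_b[j:j + window]
            mismatches = sum(x != y for x, y in zip(word_a, word_b))
            if mismatches <= max_mismatches:
                hits.append((i, j))
    return hits


# Hypothetical toy sequences; a short window keeps the example readable.
a = "GATTACAGATTACA"
b = "CCGATTACACC"
hits = dot_matrix(a, b, window=7, max_mismatches=0)
print(hits)  # [(0, 2), (7, 2)]
```

Plotting each hit as a dot at (i, j) gives the familiar picture in which a homologous region appears as a diagonal line and unrelated regions remain empty.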
(c) Conclusions
λ2 and SS5 represent similar bovine and ovine chromosomal segments, respectively. They contain sequences related only to the 3′ half of ala and they, themselves, share homologous upstream sequences. They appear very similar and thus may have arisen, before the divergence of the two species, from the duplication of ala or ala-like sequences. The relatively complex pattern of restriction fragments observed when bovine DNA was probed with 3′-specific ala cDNA sequences indicates that the duplication of similar ala regions may have occurred a number of times. Although the nt stretches of SS5 and λ2 homologous to ala exons 3 and 4 could be correctly processed, as judged from the conservation of putative binding sites, the absence of upstream homology, the frameshift mutation (in λ2) and the stop codon mutation (in SS5) would suggest that these segments correspond to ala pseudogenes. However, the possibility that one (or both) related gene(s) does (do) encode protein(s) cannot be ruled out. Already, duplication of an ancestral lysozyme gene gave rise to ala, and such events are generally regarded as a first step towards the creation of new genes.
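The consequence of a stop codon mutation of the kind proposed for SS5 can be sketched by counting how many in-frame codons are read before the next stop codon. The two sequences are hypothetical and serve only as an illustration; `codons_to_stop` is a name introduced here, not a routine used in this work.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_to_stop(seq, start=0):
    """Count sense codons read in frame from `start` until the first stop codon.

    Returns None if no in-frame stop codon is found.
    """
    count = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return None


# Hypothetical sequences (illustration only): in the second one the stop
# codon TGA of the first is replaced by the sense codon AAA.
with_stop = "GAGAAGTTGTGAACACCTTAA"
without_stop = "GAGAAGTTGAAAACACCTTAA"

print(codons_to_stop(with_stop))     # 3
print(codons_to_stop(without_stop))  # 6
```

In this toy case the loss of the stop codon lengthens the reading frame by three codons, up to the next in-frame stop.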
1 10 20 30 40 50 60 70 80
AAGCTTTTGAAGAACTGCCATACTGTTTTTCACAGAGACTTTATTCTTTTATTTTACTCCCTCTATGGTGAATAATATGT
80
CTGTTTACAGATGTACACAGTTTTGTTTATTCATTTACCCGCTGATGGACTTTTGCATTGTTTCTGCCTCTTGGCTATTG
160
TGAACAGTGCTGCTGTCAATATATGTGTACACCTGTTATC~TTCTTTGGGGTATATTTCTAGGATGGAGTTGTTG~CA
240
ATATACTAATTATATATTTATATTGTAAATGTGGACTTCCCTGGTGGTAAGCCCATCTTCTCAGATGGTA~GAACTCTG
320
CCCTGCATTGCAGGAGACCTGAGTTTGATCCCTGGGTTGGGAAGATTCCCTGGAGAAGGGATTGGCTACCCACTCCCGTA TTCTTGCCTGGAGATTTCCATGGACAGAAGAACCTGATGAGATACAGCCCATAGGGTCCCAAAGGGTTGGACGTGAATGA
400 480
GAGGCTAACACATCACATTGCAAATGTAGGTCTCACCCTTTTCTATTTTTACAAGTAATGTTTGAAGGTTCCAATTTCTT
560
TACATTGTTGTGAACCGTTGTTATTTTTAGATTTCTTTATTTTAACCATTTTAGAATATGTGAAGCAGTTGCTGCATTGC
640
GGTTTTAATTTGCATCTCCATCATGACTAATGATCTTTGACATATTTTTATGGCTTGGTGACCATTTGTATGACCTGTGT GGAGAACAGTCTATTGAAGTCATTTATCCATTATTATAATTGTTTTTTTGGTCTTTGTGTTGGTGAGTTGTAAACATTCT TGAGATTCTTTGGATAATGATTTATCAGATATATAATGTATGAACTCT ATAAATGTGAACCCTGTGGATATTGGATGTTGACTGTAATTTCACTTCTTACATTTAAGCCTTTTATGTATTTTGGGTTA CATTTTGTATGTGGAAGGAGGAGAGGGTCCAATTTCATTGCTATGCATGTGGATAACC~TTGCCTAGCGCTGTTTGTTG
720 800 880 960 1040
AAAATATCATTTTCCCCCCAGTTTAATGTTGTGGACTGTTGTCTAA~TGAATTCACCATAGATGGGTGGGTTTATTTCT GTGATCTCCCTTCAATTTCATTCATCTCTATGTATATCTTTATGCC~TCTCATAGTGTTTTATTCAGTGTTTTTTGTAT TAACATGTAAAAATGGGAAGTGTGAGTATCTGTAGGGAGAGAGCAAATTAGCCAAGATGGTGGTCAGGCTCTCTCGCCCC ACGTTTGGTAAATACATGGCTATGTAACAGCTGAGATCATTTACAGCGTGCCACGAGGTACTCACTTGGTCACGGGCTAC TTTGTGTGATATGCTTATAGAAGCACCATCTGAGAAGTCCTTGAGGCTGCCTTATGGCTCCGTGTGACTGCCTATGGTTA TGTTGCTGGATCAGCCATGAGAGGAGAGAATACAGT GC GTC TGCT GCTGC 1 111 11 1 1 1 Cc C!Z'CTG T AC C z-p Val Cy P Th r
1120 1200 1280 1360 1440 AG 1492 1 AC 1241
Th
C TGC TGT GCC AGCCA
GGA GAG AAC AAA CGT GTCT GCA GTA C C TAC AGC TCC TTG AGT TTT 1 11 1 1 1 11 1 11111 11 111 TGGTTATGACACACZAGCCATAGTACAAAACAATG4CAOCACAGAA1301
1 1 11 11 GTTTCATACCAG I Phe His Thr So
r Gly Tyr Asp Thr Gin A la Ile Vdl Gin Am
1552
Asn Asp Ser Thr Glu
CTT CCA GCT TCC CAG CTC ACA TCT TGC CTA CCA TGG ATT CAGT GAA CAG TGC GCA TGG TGA 1613 1 11 111 1 1 11 11 1 1 111 1 1 1 1 11 1 TAT GGA CTC TTC CYAG ATA AAT AAT AAA ATT !l'GGTGC MC GA C GAC CAGAAC CCT CAC TCA 1361 Tyr Gly Lw Pho Gln Ilo Am Am Lys Ile Trp Cys Lys As p Asp Gln Am Pro Hia Sor AAT ACA ATG AGA CAG CAG CCA ACT
c 1660 1 1422
TTGATTTTCTT TTTTAAGACTG 1111 11 1 1 111 1 111 1111 1 111 AM! AAC ATC TGT AAC ATC TCC TGT GAC AGTGAGTAACTTCTTTTTACTCTGTTCCTGTGTTTTTCTGAAAC Ser Am Ilo Cyo Am Ile Ssr Cys Asp L(ys) CTACTTCTGGGATAAATTC
TTTTTTCGGTGTCAAGCGCACCTCTGGTTTCATTGTCTAGGACTCTACATCAAGTGTGG
1738
11111 111111111 11 111111 11111 1111 111111111 1111 11 11 111111 1 11 11 11111 CTACTCCTGGGATAACCTCCTTTTTTTTGGTGTGAAGCACACACCTCTGGCTTCACTGCCTTGGACTCC~TTAACTGTGG
1502
GACTTGAACTGATA TTATTAAGAGGCTGTTAGAATTTTCATTATCACCAAATCCCCAGACAGTTCCTT~AGTTCCTG 1111111 1 111 1 111111111 111111 111111111 111 1111111111111111 1111111111111
1817
GACTTGA
1579
TAATACCGAGTAAGAGGCTCTTAGAATTTTTTCATTAACACTAAATCCCCAGACAGTTTCTTAAAGTTCCTG
GATAGATGATCTGAGTTGTTTGGGGATCTTGAAGTCTAATACTCTGCGTTTTCA
GAG
GAAGTC
GGCTGATGAAGT
1 111 111 11111
11
1111
11 111111111
TAAGTT
GGTTGATGAAGT
1111111111111111
11 111111
111
111111
GGTAGGTGACCTGAGCTGTTTGGGGATCTTGATGTATAATACCCTGTATTTTCA GAC 1 11 1 1 11 1 11 1 111 1 11
11 11 11 . ..TATAGACCAGCTTAATTATCTCGCTCTTNTTTGATTACGATACATTCACACAGAACAGCT
TG 11 TG
1
1892
1654
1 (OV)
ATAATTCCTC 11111111~
CAGAGATGCCCTGGAG AAAGGAAGGGAGTCTTTACCGAGGGGGAGGCATTATTGTATTG 11111111 111111 111 11111 111 11 1111111111 11111 1111
1963
ATAATTCCTA
AGGAGCTGCCCCAGAG
AAGAGAAGGGAGTCCTTACCTAGGGATAGGCATTACTGTATTA
1725
11111111111 1 11111 11 CAAGGGA AGATATTACTGTATTG
(OV)
11 11 111 1 1 111 11111 1111 111 111 1111111111 ATGACCATGATTACGATTCTCTGAGATGCCCGAGAGCAAGTGAA GGAGTCCTTA
GATTGCTCACATAAAAGGAAAACAGGCTTAAGCCTCTAATTGAGAG AAGGACCAGGGAAGAGGGAAACTCATTACCTTT 2042 111 11111 1 1111 1111111 1111111111 11 1111 11 11111 11111111111 111111 1 11
AATTTCTCACCCAGAAGG
CAACAGGCATMGCCTCTAGTTCAGAG
AAA ACCAGAGAAGAGGGAAATTCATTATCCTT
111111111 1 1111 111 11111111 111111111 1111 1 GATTTCTCACATAAAAGG CAATAGGCATAAACCTCTAGTTTAGAGCAGG CTGGGTAATACTTAGCTCCTCTCATTTTTTCCACCTATAACTCCTGCCCAG 111111111111111111 1111 11111111111 111111
CTGGGTAATACTTAGCTC
TCTCATTTTTTCCACCAGAGGCTCCTG
11111111 111111 11 11111 CTGGGTAACACTTAGGTC TCTCA
111111111111111 111111 11 ACCAGAGAAGAGGGATGCTCATTACTTTT AG TTC CTG GAT GAT GAC CTT
111 CCAG AC TTC CTG CAT CAT CAT CTT (L)ys Phe Leu Asp Asp Asp Leu
1802 (OV) 2113
1111 11 111 111 111 111 11
111 11 1 11111111111 TTTCCCTTCCATAACTCCTG CCAG
11 111 111 1 1 111 11 111 AG TTC CTG GGT GAT GAC CTT
1871
(OV)
* ACT GAT GAC ATT CTG CAT GTC AAG 111 111 111 111
AAG
ATT CTG GAT AAA GTG GGA ATT AAC TAC
1 111 ill 111 111 ill 111 111 11
11
G
111 111 111 111
2168
1
ACTCATGACATTATeTGTCTCllACAAOATT~C~TAlUl~AGGAATTRllCT~TG Thr
Jkp Jisp
Tie
Met
Cp
F&l
Lye Lys 110 L&u Asp Lyn
1927 Val Gly 110 Ann
Tyr Tr(p)
1 111 11 111 11 111 11 111 1 1 111 111 111 1 1 11 111 111 111 111 111 ACT GAT GAC ATT ATG GAT GTC AAN AAG GTT CTG GAC AAA GCA GGA ATT AAC TCC TG GTGAGTCTCCATTCTATTTTCTATTTTACACTTCCTCTCCCTCTTCTCAGCCCTTTAGTCCCAGCACCATACCCCTTTCT 2248 11111111 111111111 11111 1111111111 1111 1111111 lllll.ll 11 11111111 1 11 1 GTGAGXCACCTCTCTATTTTTCACTTAATCTTTCCTCTCTTTCTTCTCAGTCCTTTCGTCCCAGCACTATACTCCTTTCT 2007
11 GTA......
(OV)
CTCTATTTCCTGG 111111111 111
TTTTAA GCTAGAATATAGTTTGCAAAACAAAACTCATCAAGCGGACTCAGGTTTCCAATTTTCA 11.l111 11111111 11 1 1 111111111 111111111 11111 lllllllllllll
CTCTATTTCTTGGTCTTTTAh
2325
1
GCTAGAATGTAATCTTA?iAAACAAAAATCATCAAGCAGACTCCGGTTTCCAATTTTGA
2086
AGCCCCACTTACTCCACCACTGTTTAG~AACTCCCAGGATTTCCTT 111 lllllllf 111 1 1 fllllllllllllll 1 1 1 111 111111111111111 1111111111111 AGCTTCACTTACTTCACTCCCG TTAGCAATTTTCCTACCTAAGGGTCCCTAATAGAGGGCTGAGATCCAGGATTTCCTT
2405 2165
CACCTGGACTTGAACATCTAATTCGACTTGTTTAGTTCTAAGCGCTAAGACAAACCCTTGTTACCACTGCCCTACAATTT 2485 1111 1111111111111111111 1111111 111 111 11111 11 11111 1 1111111111 111111 CACCAGGACTTGRRCATCT~TTCTACTn;TTG4GTCCTAC~TCCTRRGGCACGCCCTT 1 111111111 I
...C
TGACCACTGCCCCGCAATTT
2244
11 11111 1111 1111111 ~CT~GGCAAGTT~TGTGTACCACGGCCCTA~AATTT fOV)
TCTTGGAGTTTAAGAAAATGGACCTTATTCCACTAGGTGGCTCAGTGTCCCTTGCCACCTGGCTAGGAAAGTCTG 11111111111 1 1111111111111 1lllllL 1111111111111 11 1111 11111111111111
TGT 2563 111
TCTTGGAGTTTTAAAAAA TGGACCTTACTCCACT~GTGGCTCAGTGTCTCTAGCCATGTGGCTAGG~GTC 1111 111 1 111111111 1 11111 111111 11 11 111111 11 111 11
TGT 2320
TCTT GAG T~AAA~TGG
C
GGTGGCATAGTGTC
11 TCTTTTTGG
CTTGCACTGTGGCCAG~AA
(OV)
CTGTAATTTTCACCCACA CTTCCACGTCAGCCCTCCTGGGGATAAAACTGAGTGGGAGTTT GAGC TAA 2631 1111111111 1111111 1111111 111111 1111111111111 11 11 111 111 11 CTGT~TTTTAACCCACAGTCTTCCA~~TCCTGGGGATAAAGCTAGATGTAAATCTAACCAAGATCCTGTCAG 2400 1 1111111 11111 1 111111111111111111 1 11 11111 1 111 11 1 1 1 111111111111 TTTTAATTTTCACCCATACTCTTCCACCTCAGCCTTCTAGA GACAAAGCAAAATGAAAGTTGAGCTGAGATCCTGTCAG (OV) AAT 111
CCTTGTCTCCTTCTTCATGATCAG 11111111111111111111llll
T AATTTGCCTTGTCTCXTTCTTCATGATCAG
1 11 1111111111
G TTG GTC CAT AAA GCA CTC TGT TCT GAG AAG CTG 1 111 1 1 111 111 111 111 111 111 llf. 111 111
G TTC Ccc (Tr)p Lou Ala
1111111 1111111 1 I.11 111
TCAACTTGCCTTGTCCTCTTCTTCTTGATCAG
CAT MA GCA CTC TGT TCT WLG AAG CTG Ly8 Ala leu Cys Ser Glu Lys Leu
2692 2465
His
11
1
1 111
1
111. 111 11
111
G TTG GCC TAT GAT ATA CTC AGC TCT GAG AAA CTG
(OV)
GAT CAG TGG CTC TGT GAG AAG TTG TGA ACAC CTG~TGTCTTTGCTA~TTTTGCTGT~TTTCTGTCCCTGA 2762 111 111 111 111 11% 111 111 111 111 1111 11111111111111 111 11 111111111 1111 GAT C&G TGG CTC z”GT GAG AAG TTG TGA ACAC CTGCTGTCTTTGCTGCTTfZTGTCCTCTTTCTTl’Cl’G!Sl’CCTGG 2535 Asp Gln Trp Lwz Cys Glu Lys Lou End 111 111 111 111 111 111 111 111 1 1 11 11111 11111111111111 111111111 1111111 GAT CAG TGG CTC TGT GAG AAG TTG AAA ATACTTTGCTGCCTTTGCTGCTTCTGCCCTCTTTCTATTCCTGG AATTCCTCTGCCCTTTGGCTACCTCATTTTG~TTCTTTGTACTGC~TTGAAGCAGATTTGTCTCTGAG 11 llllllllll AAC!KCTCTGcccc
CCTGGGCCCTC 2841
111111111 1111111111111111 111111111 1 1 111111111 1111111111 GTCGCTACC!WG!~'TTTGCTTCTTTGTACCCCCTTGAAC%TA?IC!PCGT~ZX-~SAGCC~!~CCGC~!~'C
11 111111 111 1111111111 111111111111111111111
(OW
2615
11111111111111111111llllllllllllll
AATTCCTCTTCCCTGTGGCTACCTTGTTTTGCTTCTTTGTACCCCCATGAA~CTAACTCGTCTC~GCCCTGGGCCCTG
(OV)
TAGTGATGTTGTTGGACATACAAGGACTATTCTCCAGGGATCCGTAAACAGTGCTCTGAGACTTTTCACTCTTGCTCAGT 2921 111111 1111111 llllllll 11111111 1 1 1 ll 1 111111 1111111 11 1111111 1 TAGTGA CRIITCCRCA~T~~~~T~~~~~GACTT~CC~TGCTCY;AT 2691 11 111111 1 111111 111111111 111 111 11 llllll 1111111111111 11 11 11 111 TATTGACACCACTGGACA TTAGGACTGATCTCCAGGGATGCGTGACTGATGCTCTG AACTTTTGACCCTTTTTCAAT cow
GCCCCCAGTGGCACTTTCACTACAACAGT
GTCC
TGTGT
1 111 1111 1111 1 1 1111111 111 CTCCCTCATCOCECTTTTMTCCAIICACTA~TA~C~C~T~T 1
111111
111 1 1
GACCCTGA
11 1 1 1 11
AAATTATTTCGGCATTGCCT
TT
TG
2992 2 7 70
CTTTGGTACTGGAATAAAAA
(OV)
TGTTATTTTCTTCCTTGAGGGAGAGGGAGGAAATGGGGTGA
11 111111111111 1 1111111111111111
CCTAAATAAAGGGCTTGGTTTTGAGTGGCTGGT
1 111 11 111111 11 11 11111111111111 CCC WCTGATTTTGAGTGGCTGGC 1111 11 1 111111111 1111 1 11 1
111111
CTGA
CTGTTTCCAGTCC
AAGGCAGAGATGGCCAAGGGTCACAGCCACCTTCATCT
111 111
3071
111111 111111111 1 111111111
TATTTTCTTCCTGGTGGGACGGGAGGAAATAGGGTGAGTAGGTAGACCTGGCCATGGGTCACAGACCCCTTCATCT
2849
11 1111
11 1
TG TGGCCACTGCCTGAG... CTACCAGAGAGG 1111 1 11111
(OV)
AAATAGGCTGAACTTACAACATCTCAATGATGGAGATTCCTTTCTGTATCAATTCAATTCAACCAA
3149
1 1 111111111111 1111 11111 1111111111 1111111111 1111111111111
1
CTACTAAAGAGGATAGAGAGGCTGAACTTATAACAACTCAAAGATGWlGATTACTTTCTGTATTAATTCAATTCAACAGA
2939
GTTTTATTGATCACCTAGCATAATTCGA... 1111111111111111111111111 1
3177
GTTTTATTGATCACCTAGCATAATTT~GAGCTATGGAGGGGATCTAAAGTTGACTAAAAGCATCTCTTACCTAAACTG
3009
CTGCTAAGTCACTTCAGTTGTGTCCGACTCTGTGTGACCCCATAGACGGTAGCCCACAAGGCTCCCATGTCCCTGGAATTC
3090
Fig. 3. Nucleotide sequences of bovine clone λ2 and ovine clone SS5: homology with the region of the bovine ala downstream from exon 2. (Upper sequence; nt 1-3177) 5′-end sequence of bovine clone λ2. (Middle sequence; nt 1228-3090) Sequence of bovine ala (Vilotte et al., 1987). The first 1227 nt of the bovine ala are not shown. Italicized numbers refer to the gene. Bold-face italicized nt and aa refer to exons 3 and 4 of the gene and the corresponding aa sequence of ala. The polyadenylation signal, AATAAA, is underlined. (Lower sequence; OV) Partial nt sequences of ovine clone SS5 homologous to regions (nt 1600-2010) and (nt 2204-2785) of bovine ala. The putative stop codon in SS5 is underlined. Vertical bars (arabic numerals 1) refer to identical nt. Sequence alignments were carried out with the 'Microgenie' program (Beckman). Asterisk denotes the frameshift mutation located at the 3′ end of the putative exon 3 of λ2.
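The row of bars marking identical nt in an alignment such as Fig. 3, and a percentage of identity derived from it, can be computed as in the following sketch. The aligned fragments are hypothetical, the function names are introduced for illustration, and the denominator (all aligned columns, gaps included) is an assumption: the text does not state how the reported homology percentages were calculated.

```python
def match_line(top, bottom, gap="-"):
    """Return a string with '|' under identical nt and ' ' elsewhere."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        "|" if x == y and x != gap else " " for x, y in zip(top, bottom)
    )


def percent_identity(top, bottom, gap="-"):
    """Percentage of aligned columns (gap columns included) with identical nt."""
    return 100.0 * match_line(top, bottom, gap).count("|") / len(top)


# Hypothetical aligned fragments; '-' marks a gap introduced by the alignment.
top = "GACTTGAACTG-ATA"
bottom = "GACTTGATAATACCG"
print(top)
print(match_line(top, bottom))
print(bottom)
print(round(percent_identity(top, bottom), 1))  # 46.7
```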
Fig. 4. Southern-blot analysis of bovine DNA. Total genomic DNAs prepared from fresh bovine lymphocytes from nine unrelated cows (lanes 1-9) by a standard proteinase K-SDS technique (kind gift from Dr. P. Brown) were digested to completion with HindIII or BglI as indicated. Digestion, 1% agarose gel electrophoresis and Southern transfers to Hybond-N membranes were done according to Maniatis et al. (1982). Hybridization with [α-32P]dCTP oligo-labeled bovine ala cDNA probe was performed in 500 mM Na phosphate pH 7.2/7% SDS at 65°C overnight. Filters were washed with 40 mM Na phosphate pH 7.2/0.1% SDS at 65°C. The λ-HindIII size markers are shown on the right margin (in kb). Expected HindIII genomic restriction fragments corresponding to ala and the putative pseudogene are indicated by the arrows on the left margin (in kb).
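Fragment sizes expected from a known sequence, such as the HindIII fragments predicted from the maps of the cloned regions, can be derived by locating recognition sites and cutting at them. The sketch below assumes a linear molecule and a complete digest. The test sequence is hypothetical, `fragment_sizes` is a name introduced here, and only two enzymes with simple recognition sequences are included (HindIII, A^AGCTT; EcoRI, G^AATTC).

```python
# Recognition sequence and top-strand cut offset for each enzyme.
ENZYMES = {
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "EcoRI": ("GAATTC", 1),    # G^AATTC
}


def fragment_sizes(seq, enzyme):
    """Return fragment lengths (5' to 3') from a complete digest of a linear sequence."""
    site, offset = ENZYMES[enzyme]
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + offset)
        pos = seq.find(site, pos + 1)
    edges = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]


# Hypothetical 40-nt sequence with two HindIII sites and one EcoRI site.
seq = "CCCC" + "AAGCTT" + "GGGGGG" + "GAATTC" + "TTTTTT" + "AAGCTT" + "CCCCCC"
print(fragment_sizes(seq, "HindIII"))  # [5, 24, 11]
print(fragment_sizes(seq, "EcoRI"))    # [17, 23]
```

On a Southern blot only those fragments that overlap the probe are detected, so hybridizing bands beyond the predicted ones point to additional related sequences in the genome.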
Fig. 5 (panel labels). Probes: 5′, HinfI fragment; 3′, HinfI-PstI fragment. Lanes for each probe: PvuII, PstI, BamHI, EcoRI, BglI. Map below the blots: ala cDNA, exons 1 to 4 (scale bar, 100 bp).
Fig. 5. Southern-blot analysis with 5′- and 3′-specific probes. Partial restriction map of the bovine ala cDNA is indicated at the bottom; P, PstI; H, HinfI. Blackened and open segments represent the coding and untranslated regions, respectively (Vilotte et al., 1987). The 5′ HinfI fragment and the 3′ HinfI-PstI fragment used as probes are indicated. Restriction endonucleases used in the Southern-blot analysis are indicated on the top of each lane. Experimental procedures are identical to those described in Fig. 4. The λ-HindIII-EcoRI size markers are indicated on the left margins (in kb). Arrows on the right margins indicate the BglI genomic restriction fragment encompassing the entire transcription unit of ala (Vilotte et al., 1987).
ACKNOWLEDGEMENTS
The λ bovine genomic library was a kind gift from Dr. P. Sondermeyer. We are very grateful to Dr. P. Brown for the gift of the bovine genomic DNAs and to Dr. S. Ali for helpful discussions.
REFERENCES
Brew, K., Vanaman, T.C. and Hill, R.L.: Comparison of the amino-acid sequence of bovine α-lactalbumin and hen's egg white lysozyme. J. Biol. Chem. 242 (1967) 3747-3749.
Hall, L., Emery, D.C., Davies, M.S., Parker, D. and Craig, R.K.: Organization and sequence of the human α-lactalbumin gene. Biochem. J. 242 (1987) 735-742.
Jenness, R.: Inter-species comparison of milk proteins. In Fox, P.F. (Ed.), Developments in Dairy Chemistry, Vol. 1, Proteins. Applied Science Publishers, New York, 1982, pp. 87-114.
Kuhn, N.J.: The biosynthesis of lactose. In Mepham, T.B. (Ed.), Biochemistry of Lactation. Elsevier, Amsterdam, 1983, pp. 159-176.
Laird, J.E., Jack, L., Hall, L., Boulton, A.P., Parker, D. and Craig, R.K.: Structure and expression of the guinea-pig α-lactalbumin gene. Biochem. J. 254 (1988) 85-94.
Maniatis, T., Fritsch, E.F. and Sambrook, J.: Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 1982.
Messing, J. and Vieira, J.: A new pair of M13 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19 (1982) 269-276.
Qasba, P.K. and Safaya, S.K.: Similarity of the nucleotide sequences of the rat α-lactalbumin and chicken lysozyme genes. Nature 308 (1984) 377-380.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Vilotte, J.L., Soulier, S., Mercier, J.C., Gaye, P., Hue-Delahaie, D. and Furet, J.P.: Complete nucleotide sequence of bovine α-lactalbumin gene: comparison with its rat counterpart. Biochimie 69 (1987) 609-620.