Two human γ-crystallin genes are linked and riddled with Alu-repeats


Gene. 38 (1985) 197-204 Elsevier


GENE 1401


(Recombinant DNA; eye lens; gene linkage; repetitive DNA; human cataract; cosmid vector; intron)

Johan T. den Dunnen, Rob J.M. Moormann, Frans P.M. Cremers and John G.G. Schoenmakers*

Department of Molecular Biology, University of Nijmegen, Toernooiveld, 6525 ED Nijmegen (The Netherlands) Tel. (80)558833-2911

(Received March 1st, 1985)
(Revision received June 18th, 1985)
(Accepted June 28th, 1985)

SUMMARY

A human genomic cosmid clone, pHcosγ-1, has been isolated containing two closely linked γ-crystallin genes, oriented in the same direction. The sequence of these genes and their 5’ and 3’ flanking regions has been determined. The coding regions of both genes are interrupted by two introns. The first introns (94 and 100 bp, respectively) are located in the 5’ region of the genes. The second introns (2.82 and 0.95 kb, respectively) divide the genes into two halves, each encoding a structural domain of the γ-crystallin protein. The coding regions of the two genes show 80% homology. Due to a mutation in the splice acceptor site of the second intron of the first gene, the coding region of its third exon is 3 bp longer than that of the second gene. In the flanking regions several conserved sequence elements were found, including those elements that are known to be necessary for the correct expression of eukaryotic genes. The flanking and intronic regions of the genes contain ‘simple sequence’ DNA and Alu repeats. The Alu repeats are usually clustered, contain truncated elements, and are often located near simple sequence DNA.

Approx. 90% of the soluble protein of the vertebrate lens consists of structural proteins, known as the crystallins. In mammals these lens-specific proteins can be divided into three antigenically distinct classes, α-, β- and γ-crystallin, each of which comprises several polypeptides of related primary structure (Bloemendal, 1981; Piatigorsky, 1981).

* To whom correspondence and reprint requests should be addressed.

Abbreviations: aa, amino acid(s); bp, base pair(s); kb, kilobases or 1000 bp; ORF, open reading frame; nt, nucleotide(s).

0378-1119/85/$03.30 © 1985 Elsevier Science Publishers

The γ-crystallins, which account for up to 40% of the soluble protein in the mammalian lens (Ocken et al., 1977), are a homogeneous group of highly symmetrical, monomeric proteins of Mr 20000 (Bloemendal, 1981). Due to the high homology between the various γ-crystallins and the occurrence of posttranslational modifications it has been difficult to establish the exact number of primary γ-crystallins: estimates vary from four to seven γ-crystallins in rat and mouse (Ramaekers et al., 1982; Shinohara et al., 1982). It is nevertheless clear that, in all species examined, the γ-crystallins are encoded by a multigene family. In the rat, six γ-crystallin genes are present, five of which are closely linked (Moormann et al., 1985). The sequence of one of these genes has been completely elucidated (Moormann et al., 1983). This gene contains a small 5’ exon, encoding the first three amino acid residues, and two large exons that each encode one of the two protein domains. Recently the sequence of a murine γ-crystallin gene has been reported (Lok et al., 1984). The mosaic structure of this gene is identical to that of the rat gene.

The three-dimensional structure of the major γ-crystallin of the bovine lens, γII, has been determined by high-resolution X-ray diffraction analysis (Wistow et al., 1983). The polypeptide is organized into two similar globular domains. Each domain consists of two ‘Greek key’ motifs containing predominantly β-pleated sheets. The high homology between γ-crystallins from various species (Moormann et al., 1982; Lok et al., 1984; Tomarev et al., 1984; J.T.d.D., in preparation) and the observation that these polypeptides have conserved the aa residues which are essential for maintaining the key motif structure suggest that all γ-crystallins are structurally very similar to calf γII.

The short-range spatial order of the crystallins is probably responsible for the transparency of the lens (Delaye and Tardieu, 1983). Changes in transparency of the lens during cataractogenesis are accompanied by structural changes in the crystallins (Carper et al., 1982). The causal relationship between the structural changes in the crystallins and the decrease in transparency of the lens remains, however, obscure. One of the problems in interpreting the results of studies of human cataracts is that the primary aa sequences of the human crystallins are not known. We have therefore started to isolate the human crystallin genes and report here the sequence of two of the γ-crystallin genes as well as that of their immediate flanking regions. The deduced aa sequences of the two proteins have recently been used in model-building studies to predict possible differences between the properties of the human proteins and those of the more easily studied calf γ-crystallins. These studies will provide a better insight into the role played by the individual γ-crystallins in maintaining the transparency of the human lens.

MATERIALS AND METHODS

(a) Construction and screening of the human cosmid library

The human cosmid library used here was constructed by Grosveld et al. (1981; 1982a) and consisted of a partial digest of human placental DNA cloned in the BamHI site of the cosmid vector pOPF. The library originally consisted of 150000 independent clones. A replate of this library (300000 clones) was screened using as a probe the two rat γ-crystallin cDNA clones pRLγ-2 and pRLγ-3 (Moormann et al., 1982). Three positive clones were isolated but these appeared to be identical upon a first restriction enzyme analysis (this was probably a consequence of the fact that the clones were isolated from a replate of the original master library). One of these clones, designated pHcosγ-1, was studied further. It contains two complete vector molecules (pOPF) ligated to each other as well as a rearranged one. In the rearranged molecule the 5’ end up to the first BstEII site is replaced by the 3’ end from the second BstEII site on. Apparently the vector arms used in the construction of the cosmid library were not completely dephosphorylated and were contaminated with incomplete BstEII and/or ClaI digestion products.

(b) DNA preparations

Cosmid or plasmid DNA was isolated essentially as described by Ish-Horowitz and Burke (1981). DNA was subsequently purified by equilibrium CsCl centrifugation. Human liver DNA was isolated and manipulated essentially as described before for rat liver DNA (Moormann et al., 1984).

RESULTS AND DISCUSSION

(a) Isolation and characterization of clone pHcosγ-1

When restriction digests of human genomic DNA are hybridized with rat γ-crystallin sequences, multiple bands are always found (Fig. 1A). This indicates the presence of multiple γ-crystallin sequences in the human genome. To study the genomic organization of these sequences we screened a human cosmid library with rat γ-crystallin cDNA clones (see MATERIALS AND METHODS, section b) and isolated the clone pHcosγ-1.

Fig. 1. Restriction and hybridization analysis of human genomic DNA (A) and pHcosγ-1 (B,C). (A) Various restriction enzyme digests of human genomic DNA (15 μg/lane, electrophoresed on 0.7% agarose) were blotted and hybridized with the nick-translated 32P-labelled inserts of rat γ-crystallin clones (Moormann et al., 1982) essentially as described (Moormann et al., 1983). Final wash was in 0.1 × SSPE (SSPE = 0.18 M NaCl/10 mM sodium phosphate/1 mM EDTA, pH 7.7) at 60°C. (B) Restriction enzyme digests of pHcosγ-1 DNA (1 μg/lane, 0.7% agarose) after staining with ethidium bromide. (C) Autoradiograph of a blot of the gel shown in (B) after hybridization as in (A). Restriction enzymes used are E, EcoRI; B, BamHI; H, HindIII and Bg, BglII. Phage λ DNA digests were run on the gel to obtain the fragment length standards shown. Asterisks in (A) and (C) mark DNA fragments found in the cosmid DNA as well as in the genomic DNA. Some hybridizing bands in (C) do not correspond to bands seen in the genomic pattern (A): these are due to fragments that also contain vector sequences or to fragments that give hybridization signals too low to be detected in the genomic DNA. Note that the marked bands in the EcoRI, HindIII and BglII digests derive from one of the two human γ-crystallin genes, whereas the marked band in the BamHI digest derives from the other.

The restriction map of the insert of pHcosγ-1 is shown in Fig. 2A. Although pHcosγ-1 contains 46.8 kb of DNA, the insert is only 23.7 kb long. The remainder of the cosmid molecule consists essentially of a vector trimer: two tandemly linked vector molecules are coupled to a third, rearranged vector molecule. In the latter the 3’ end is replaced by the 5’ end in an inverted orientation.
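The size bookkeeping in this paragraph can be verified in a few lines. This is only an illustrative sketch: the two lengths are the values quoted in the text, while the variable names are ours and not part of the original analysis.

```python
# Sizes quoted in the text, in kb.
COSMID_TOTAL_KB = 46.8   # complete pHcos-gamma-1 molecule
INSERT_KB = 23.7         # human genomic insert

# DNA not accounted for by the insert, i.e. the vector-derived part.
vector_part_kb = round(COSMID_TOTAL_KB - INSERT_KB, 1)

# Share of the cosmid that is human DNA.
insert_fraction = INSERT_KB / COSMID_TOTAL_KB

print(f"vector-derived DNA: {vector_part_kb} kb")
print(f"insert fraction:    {insert_fraction:.1%}")
```

The 23.1 kb left over is the portion that the text attributes to the vector trimer, so the human insert makes up only about half of the cosmid.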

(b) Localization of γ-crystallin sequences

Restriction digests of pHcosγ-1 always contain at least two fragments that hybridize with the γ-crystallin cDNA clones (Fig. 1C), suggesting that pHcosγ-1 contains more than one γ-crystallin gene. Further hybridization experiments with the 5’ and 3’ regions of the cDNA clones showed that pHcosγ-1 contains two closely linked γ-crystallin genes oriented in the same transcriptional direction (Fig. 2). The two genes are designated γ1-2 and γ2-1, since they show the highest sequence homology with those two genes of the rat (J.T.d.D., in preparation).

Fig. 2. Physical map of the insert of pHcosγ-1. The upper line represents the insert of pHcosγ-1. The positions of several restriction enzyme sites are indicated. The location and transcriptional direction of the two human γ-crystallin genes was determined by hybridizing blots of various single and double restriction digests of pHcosγ-1 with 5’, middle or 3’ specific fragments of the two rat γ-crystallin cDNA clones pRLγ-2 and pRLγ-3 (Moormann et al., 1982; 1983). The lower line shows an expanded map of the two genes and their immediate flanking regions. For B, BamHI; Bg, BglII; E, EcoRI; H, HindIII and K, KpnI all sites are given. A, AvaII; P, PstI; S, SmaI; X, XbaI sites are only shown for the γ1-2 and γ2-1 genes. No SstII or SalI sites were found. The location of the genes is shown by bars; black areas represent exons, white areas introns. The location of the bipartite Alu repeats is indicated by heavy arrows; small black squares show the location of direct repeats surrounding the repetitive elements. Upward arrows point to positions where ‘simple-sequence DNAs’ are located, while their composition is indicated. The wavy line shows the location of the probe used in hybridization studies to identify Alu repeats, and the line bounded by two asterisks indicates the location of the fragment used as a probe for the intergenic region in genomic DNA.

(c) γ-Crystallin sequences in the human genome

The hybridization pattern of restriction digests of pHcosγ-1 with the rat γ-crystallin cDNA clones was also compared with the genomic hybridization pattern (Fig. 1). The cosmid fragments correspond exactly to genomic fragments (except of course when parts of the vector molecule are present) but are only a subset thereof. Hence the two γ-crystallin genes present in pHcosγ-1 do not represent the complete human genomic complement of γ-crystallin sequences. From the genomic hybridization pattern we estimate that there are five to six human γ-crystallin genes. In pHcosγ-1 the two γ-crystallin genes are separated by 12.5 kb of DNA (Fig. 2). This organization exactly reflects the genomic organization since a fragment isolated from the cloned intergenic region (Fig. 2) hybridizes to identically sized genomic restriction fragments (results not shown). Other linked human γ-crystallin genes were isolated by Meakin et al. (1985). Linkage of all the γ-crystallin genes in man is further indicated by the fact that all map to chromosome 2 (Den Dunnen et al., 1985). A similar situation exists in the rat, where extensive linkage of the γ-crystallin genes is also found (Moormann et al., 1985).

(d) Sequence analysis of the coding region of the γ-crystallin genes

The nucleotide sequence of the two human γ-crystallin genes and parts of their flanking regions is presented in Fig. 3. For comparison, the sequence of the rat γ3-1 gene (Moormann et al., 1983) is also shown. The organization of the two human genes is identical to that of the rat γ3-1 crystallin gene: all three genes contain two large ORFs preceded by a 9-bp-long ORF which starts with the initiation codon ATG. Hence the genes contain two introns, one close to the 5’ end and a second approximately in the middle of the gene. Exon 2 is 243 bp long in both human genes and encodes 81 aa residues. The third exon has an ORF of 276 bp in the γ1-2 gene, while that of the γ2-1 gene is 3 nt shorter. This length difference is most probably caused by a G → C mutation which altered the original -agcag- splice acceptor sequence into -agCAC- and created a new splice acceptor site 3 nt upstream, where another -ag- dinucleotide is located. This ultimately led to a His insertion at position 84 of the γ1-2 protein, immediately following the intron/exon junction. An additional aa is also present at this site in the calf γII crystallin (Bhat and Spector, 1984), some frog γ-crystallins (Tomarev et al., 1984) and the rat γ1-2 crystallin (J.T.d.D., in preparation).

Fig. 3. Nucleotide sequence of the human γ1-2 and γ2-1 crystallin genes compared with the sequence of the rat γ3-1 gene. The DNA sequence was analysed by the dideoxy chain termination method as described (Moormann et al., 1983) using [α-32P]dATP (410 Ci/mmol, Amersham). The recombinant M13mp phages used in the sequencing studies were constructed from appropriate single or double digests of the subcloned HindIII fragments of pHcosγ-1. The sequences were determined at least once on both DNA strands. The sequence of the human γ1-2 gene is shown completely. For the human γ2-1 and the rat γ3-1 genes only differences with the sequence of the human γ1-2 gene are specified; blanks indicate that the sequence equals that of the human γ1-2 gene. Dashes indicate 1-bp deletions. The sequences are aligned to optimize homology. Numbering is for the human γ1-2 exonic sequences, with position 1 assigned to the putative capsite; 3’ gene-flanking sequences are numbered from +1 at the putative poly(A) addition site on. Boxes enclose the exonic DNA, with the coding sequences shown in capital letters. The three asterisks mark the putative capsites, the poly(A) addition signal is heavily underlined, and the putative poly(A) addition site is marked with an upward arrow. [Aligned sequence not recoverable from the source.]

The γ1-2 gene encodes a protein of 174 aa (Mr 20753) and the γ2-1 gene a protein of 173 aa (Mr 20696) (Fig. 4). Both polypeptides start with a Gly residue, in agreement with the results of Kabasawa et al. (1977) who determined that Gly is found at the N terminus of human γ-crystallins. The two proteins are very similar and have an overall aa homology of 77%. Using the derived aa sequences of these two genes, Summers et al. (1984) have shown in recent model-building studies that the three-dimensional structure of the human polypeptides is very similar to the one established for the bovine γII-crystallin. The residues involved in the stabilization of the folded hairpin structure in each motif are all conserved with the exception of Arg-79 in motif II, which is replaced by a cysteine residue in the two human sequences. The replacement of a charged residue by a hydrophobic side chain exposes other aromatic residues, thereby increasing the hydrophobicity of the surface of the human γ-crystallins. These and other changes tend to favour aggregation of the human γ-crystallins. This may contribute to the conversion of the water-soluble protein to insoluble protein, a process that occurs on aging and is accelerated in cataract.

Fig. 4. Comparison of the deduced amino acid sequences of the human crystallins γ1-2 and γ2-1. The aa sequences are shown in the one-letter code. The gap after position 29 is due to the alignment of the sequences encoding each of the protein domains. Numbering is for the sequence of the human γ1-2 gene. Nonmatching aa are boxed; arrows indicate residues conserved in all γ-crystallins while the asterisk marks the Cys residue discussed in section d of RESULTS AND DISCUSSION. [Aligned sequences not recoverable from the source.]

The sequence divergence at the nt level between the human γ1-2 and γ2-1 genes is only about 20%, while the average sequence divergence between these two human genes and the two orthologous rat γ-crystallin sequences is only slightly higher, about 22%. Sequence variability is virtually restricted to the third exon (i.e., the second domain of the protein) and predominantly found in the first half of this exon, which encodes the third motif of the protein: the nt sequences of the human γ1-2 and γ2-1 genes differ 35% in the coding region for protein motif III, 26% in the coding region for motif IV, 10% in the coding region for motif I and 9% in the coding region for motif II. We noted a similar variability of the second domain in two rat γ-crystallins (Moormann et al., 1982). At present we cannot distinguish between two possible explanations for this phenomenon, namely concerted evolution of the first domain or a more rapid divergent evolution of the second.

(e) Sequence analysis of the noncoding and flanking regions

In the 3’-noncoding region of both genes an AATAAA polyadenylation signal is found at equivalent positions, followed 23 and 22 bp later, respectively, by a GCA triplet. We suggest that this triplet is the site of poly(A) addition by analogy with the rat γ3-1 gene (Moormann et al., 1983). The 3’-noncoding regions would then be 64 and 65 bp long. The genes show a general sequence homology which extends into the 3’ gene-flanking region for about 25 bp. After that, large insertions/deletions are required to align the sequences. Upstream from the initiation codon possible cap sites are found at the positions indicated in Fig. 3. About 25-30 bp upstream from the putative cap site a Goldberg-Hogness box (-TATA-) is found.

During the sequencing of the intronic and flanking regions of the two genes several Alu repetitive elements (Jelinek et al., 1980) were encountered, some of which are flanked by short direct repeats (Fig. 2). Most of the Alu elements end in a poly(A) tract but one ends in an (AAC)n tract. The Alu elements are mostly arranged in small clusters consisting of one complete and one or two truncated elements. This suggests that new Alu repeats frequently integrate in the genome in or near pre-existing ones, thereby giving rise to deletions in the region where they insert. Many more Alu elements, either complete or truncated, are present in the regions surrounding the two γ-crystallin genes since a cloned Alu element was found to hybridize strongly to the flanking and intergenic regions. Alu elements are often found in the neighbourhood of human genes but such a high density of Alu repeats around a gene has thus far only been reported for the human growth hormone gene (Barsh et al., 1983).

Besides Alu elements, the sequenced region also contains ‘simple sequence’ DNA, primarily repeats of the dinucleotide AC/TG (Fig. 2). In general, simple sequence DNA is often found near repetitive DNA elements. In our case two of the three (AC/TG)n elements are found close to Alu repeats. Rogers (1983) observed that (CA)n sequences are often found near or at breakpoints in DNA. Simple sequence DNA near repetitive elements could then somehow have been involved in the breakage and reunion of chromosomes during the transposition of these repetitive elements. Alternatively, simple-sequence DNA could stimulate the insertion of repetitive DNA in its direct neighbourhood. Finally, simple sequence DNA could originate from the 3’ end of repetitive DNA elements. This can also be true for the (AC/TG)n elements, as we find one Alu repeat that ends in a poly(AAC) sequence and not in a poly(A) tract.

The variable presence of (mobile) repetitive elements and simple sequence DNA is at least partially responsible for the variation in length (0.95-2.82 kb) of the second intron. In contrast, the length of the 5’ intron is relatively constant (86-100 bp). The intronic sequences show homology only in some short stretches and directly surrounding the splice sites (a detailed analysis will be presented elsewhere).
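As a consistency check on section d, the exon ORF lengths quoted there can be converted into protein lengths. The sketch below is ours, not part of the original analysis, and rests on two assumptions that the text does not state explicitly: that the quoted third-exon ORF length includes the stop codon, and that the initiator Met is absent from the mature protein (the text reports Gly at the N terminus). `GeneModel` and its method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

STOP_CODONS = 1     # assumption: the exon-3 ORF length includes the stop codon
INITIATOR_MET = 1   # assumption: Met is removed, as both proteins start with Gly


@dataclass(frozen=True)
class GeneModel:
    name: str
    exon_orf_bp: Tuple[int, int, int]  # ORF length per exon, as quoted in section d

    def coding_bp(self) -> int:
        """Total length of the spliced ORF in bp."""
        return sum(self.exon_orf_bp)

    def mature_length_aa(self) -> int:
        """Protein length after discounting the stop codon and the initiator Met."""
        codons, remainder = divmod(self.coding_bp(), 3)
        if remainder:
            raise ValueError(f"{self.name}: coding length is not a multiple of 3")
        return codons - STOP_CODONS - INITIATOR_MET

    def first_residue_of_exon3(self) -> int:
        """Position of the first exon-3 residue, numbering the N-terminal Gly as 1."""
        residues_before = sum(self.exon_orf_bp[:2]) // 3 - INITIATOR_MET
        return residues_before + 1


GAMMA_1_2 = GeneModel("gamma1-2", (9, 243, 276))
GAMMA_2_1 = GeneModel("gamma2-1", (9, 243, 273))

for gene in (GAMMA_1_2, GAMMA_2_1):
    print(f"{gene.name}: {gene.coding_bp()} bp -> {gene.mature_length_aa()} aa")
print("exon 3 starts at residue", GAMMA_1_2.first_residue_of_exon3())
```

Under these assumptions the lengths come out at 174 and 173 aa, and the third exon of the γ1-2 gene begins at residue 84, in line with the protein sizes and the position of the His insertion reported in section d.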

(f) Conserved elements in the 5’ flanking region

Since possible regulatory elements may be revealed by the conservation of their sequence we have compared the 5’-flanking region of the two human genes with those of two other genes, the rat γ3-1 crystallin gene and the human β-globin gene (Fig. 5). Besides the -TATA- box at -30, three other conserved elements are found: a region around -50 to -60, a sequence resembling the -CCAAT- box (Benoist et al., 1980) at position -80 and finally a -CCCT- sequence (Grosveld et al., 1982b) at -100. Two of these sequence elements have been shown to be necessary for efficient and correct transcription of the rabbit β-globin gene in human HeLa cells (Grosveld et al., 1982b). The third element, at -50 to -60, has as yet not been found to be necessary for transcription of the β-globin gene in studies of transient expression. The conservation of this sequence in the human and rat γ-crystallin genes indicates that further study of the functional significance of this sequence is warranted.

We have no direct evidence that the two human genes described here do function in vivo. However, the lack of in-phase stop codons and the presence of a proper poly(A) addition signal at the 3’ ends, of a Goldberg-Hogness box, and of elements involved in the modulation of transcription of eukaryotic genes at the 5’ ends all suggest that these two genes do function.

ACKNOWLEDGEMENTS

We thank Drs. R.A. Flavell and F.G. Grosveld for providing us with the human gene library. We thank Roselie Jongbloed for help with the experiments, Dr. N.H. Lubsen for critical reading of the manuscript and Dr. M.L. Breitman for communicating his results prior to publication. This investigation was carried out under the auspices of the Netherlands Foundation for Chemical Research (SON) with financial aid from the Netherlands Organization for the Advancement of Pure Research (ZWO).

Fig. 5. Comparison of the 5’ flanking sequences. The 5’-flanking sequences of the human γ1-2 and γ2-1 genes, the rat γ3-1 and the human β-globin genes are aligned to optimize homology. Conserved nt (with reference to the human γ-crystallin genes) are shown in capital letters. Heavy overlining indicates nt that are conserved in all three γ-crystallin genes. Numbering is for the human γ1-2 crystallin gene with position 1 assigned to the putative cap site; the arrow points to the cap site of the human β-globin gene. Boxes enclose sequences involved in the regulation of transcription of eukaryotic genes while their conserved core sequence is heavily underlined (see RESULTS AND DISCUSSION, section f).

REFERENCES

Barsh, G.S., Seeburg, P.H. and Gelinas, R.E.: The human growth hormone gene family: structure and evolution of the chromosomal locus. Nucl. Acids Res. 11 (1983) 3939-3958.
Benoist, C., O’Hare, K., Breathnach, R. and Chambon, P.: The ovalbumin gene sequence of putative control regions. Nucl. Acids Res. 8 (1980) 127-142.
Bhat, S.P. and Spector, A.: Complete nucleotide sequence of a cDNA derived from calf lens γ-crystallin mRNA; presence of AluI-like sequences. DNA 3 (1984) 287-295.
Bloemendal, H.: Biosynthesis of lens crystallins, in Bloemendal, H. (Ed.), Molecular and Cellular Biology of the Eye Lens. Wiley, New York, 1981, pp. 189-220.
Carper, D., Shinohara, T., Piatigorsky, J. and Kinoshita, J.H.: Deficiency of functional messenger RNA for a developmentally regulated β-crystallin polypeptide. Science 217 (1982) 463-464.
Delaye, M. and Tardieu, A.: Short-range order of crystallin protein accounts for eye lens transparency. Nature 302 (1983) 415-417.
Den Dunnen, J.T., Jongbloed, R.J.E., Geurts van Kessel, A.H.M. and Schoenmakers, J.G.G.: Human γ-crystallin genes are located in the p12-qter region of chromosome 2. Hum. Genet. 70 (1985) 217-221.
Grosveld, F.G., Dahl, H.H.M., de Boer, E. and Flavell, R.A.: Isolation of β-globin related genes from a human cosmid library. Gene 13 (1981) 227-237.
Grosveld, F.G., Lund, T., Murray, E.J., Mellor, A.L., Dahl, H.H.M. and Flavell, R.A.: The construction of cosmid libraries which can be used to transform eukaryotic cells. Nucl. Acids Res. 10 (1982a) 6715-6732.
Grosveld, G.C., Rosenthal, A. and Flavell, R.A.: Sequence requirements for the transcription of the rabbit β-globin gene in vivo: the -80 region. Nucl. Acids Res. 10 (1982b) 4951-4971.
Ish-Horowitz, D. and Burke, J.F.: Rapid and efficient cosmid cloning. Nucl. Acids Res. 9 (1981) 2989-2998.
Jelinek, W.R., Toomey, T.P., Leinwand, L., Duncan, C.H., Biro, P.A., Choudary, P.V., Weissman, S.M., Rubin, C.M., Houck, C.M., Deininger, P.L. and Schmid, C.W.: Ubiquitous interspersed repeated sequences in mammalian genomes. Proc. Natl. Acad. Sci. USA 77 (1980) 1398-1402.
Kabasawa, I., Barber, G.W. and Kinoshita, J.H.: Aging effects and some properties of the human lens low molecular weight proteins. Jap. J. Ophthalmol. 21 (1977) 87-97.
Lok, S., Tsui, L-C., Shinohara, T., Piatigorsky, J., Gold, R. and Breitman, M.L.: Analysis of the mouse γ-crystallin gene family: assignment of multiple cDNAs to discrete genomic sequences and characterization of a representative gene. Nucl. Acids Res. (1984) 4517-4529.
Meakin, S., Breitman, M.L. and Tsui, L-C.: Characterization of five members of the human γ-crystallin gene family (submitted).
Moormann, R.J.M., Den Dunnen, J.T., Bloemendal, H. and Schoenmakers, J.G.G.: Extensive intragenic sequence homology in two distinct rat lens cDNAs suggests duplications of a primordial gene. Proc. Natl. Acad. Sci. USA 79 (1982) 6876-6880.
Moormann, R.J.M., Den Dunnen, J.T., Mulleners, L., Andreoli, P., Bloemendal, H. and Schoenmakers, J.G.G.: Strict colinearity of genetic and protein folding domains in an intragenically duplicated γ-crystallin gene. J. Mol. Biol. 171 (1983) 353-368.
Moormann, R.J.M., Den Dunnen, J.T., Heuyerjans, J., Jongbloed, R.J.E., Van Leen, R.W., Lubsen, N.H. and Schoenmakers, J.G.G.: Characterization of the rat γ-crystallin gene family and its expression in the eye lens. J. Mol. Biol. 182 (1985) 419-430.
Ocken, P.R., Fu, S-C.J., Hart, R., White, J.H., Wagner, B.J. and Lewis, K.E.: Characterization of lens proteins, I. Identification of additional soluble fractions in rat lenses. Exp. Eye Res. 24 (1977) 355-367.
Piatigorsky, J.: Lens differentiation in vertebrates: a review of cellular and molecular features. Differentiation 19 (1981) 134-153.
Ramaekers, F., Dodemont, H., Vorstenbosch, P. and Bloemendal, H.: Classification of rat lens crystallins and identification of proteins encoded by rat lens mRNA. Eur. J. Biochem. 128 (1982) 503-508.
Rogers, J.: Molecular biology; CACA-sequences - the ends and the means? Nature 305 (1983) 101-102.
Shinohara, T., Robinson, E.A., Appella, E. and Piatigorsky, J.: Multiple γ-crystallins of the mouse lens: fractionation of mRNAs by cDNA cloning. Proc. Natl. Acad. Sci. USA 79 (1982) 2783-2787.
Summers, L., Slingsby, C., White, H., Narebor, M., Moss, D., Miller, L., Mahadevan, D., Lindley, P., Driessen, H., Blundell, T., Den Dunnen, J.T. and Schoenmakers, J.G.G.: The molecular structures and interactions of bovine and human γ-crystallins, in Nugent, J. and Whelan, J. (Eds.), Human Cataract Formation. CIBA Foundation Symposium No. 106. Pitman, London, 1984, pp. 219-236.
Tomarev, S.I., Zinovieva, R.D., Chalovka, P., Krayev, A.S., Skryabin, K.G. and Gausse Jr., G.G.: Multiple genes coding for the frog eye lens γ-crystallins. Gene 27 (1984) 301-308.
Wistow, G., Turnell, B., Summers, L., Slingsby, C., Moss, D., Miller, F., Lindley, P. and Blundell, T.: X-ray analysis of the eye lens protein γII-crystallin at 1.9 Å resolution. J. Mol. Biol. 170 (1983) 175-202.

Communicated by H. van Ormondt.