Gene. 38 (1985) 197-204 Elsevier
GENE 1401
Two human γ-crystallin genes are linked and riddled with Alu-repeats
(Recombinant DNA; eye lens; gene linkage; repetitive DNA; human cataract; cosmid vector; intron)
Johan T. den Dunnen, Rob J.M. Moormann, Frans P.M. Cremers and John G.G. Schoenmakers*
Department of Molecular Biology, University of Nijmegen, Toernooiveld, 6525 ED Nijmegen (The Netherlands) Tel. (SO)558833-2911
(Received March 1st, 1985) (Revision received June 18th, 1985) (Accepted June 28th, 1985)
SUMMARY
A human genomic cosmid clone, pHcosγ-1, has been isolated containing two closely linked γ-crystallin genes, oriented in the same direction. The sequence of these genes and their 5' and 3' flanking regions has been determined. The coding regions of both genes are interrupted by two introns. The first introns (94 and 100 bp, respectively) are located in the 5' region of the genes. The second introns (2.82 and 0.95 kb, respectively) divide the genes into two halves, each encoding a structural domain of the γ-crystallin protein. The coding regions of the two genes show 80% homology. Due to a mutation in the splice acceptor site of the second intron of the first gene, the coding region of its third exon is 3 bp longer than that of the second gene. In the flanking regions several conserved sequence elements were found, including those elements that are known to be necessary for the correct expression of eukaryotic genes. The flanking and intronic regions of the genes contain 'simple sequence' DNA and Alu repeats. The Alu repeats are usually clustered, contain truncated elements, and are often located near simple sequence DNA.
* To whom correspondence and reprint requests should be addressed.

Abbreviations: aa, amino acid(s); bp, base pair(s); kb, kilobases or 1000 bp; ORF, open reading frame; nt, nucleotide(s).

0378-1119/85/$03.30 © 1985 Elsevier Science Publishers

INTRODUCTION

Approx. 90% of the soluble protein of the vertebrate lens consists of structural proteins, known as the crystallins. In mammals these lens-specific proteins can be divided into three antigenically distinct classes, α-, β- and γ-crystallin, each of which comprises several polypeptides of related primary structure (Bloemendal, 1981; Piatigorsky, 1981).
The γ-crystallins, which account for up to 40% of the soluble protein in the mammalian lens (Ocken et al., 1977), are a homogeneous group of highly symmetrical, monomeric proteins of Mr 20000 (Bloemendal, 1981). Due to the high homology between the various γ-crystallins and the occurrence of posttranslational modifications it has been difficult to establish the exact number of primary γ-crystallins: estimates vary from four to seven γ-crystallins in rat and mouse (Ramaekers et al., 1982; Shinohara et al., 1982). It is nevertheless clear that, in all species examined, the γ-crystallins are encoded by a multigene family. In the rat, six γ-crystallin genes are present, five of which are closely linked (Moormann et al., 1985). The sequence of one of these genes has been completely elucidated (Moormann et al., 1983). This gene contains a small 5' exon, encoding the first three amino acid residues, and two large exons that each encode one of the two protein domains. Recently the sequence of a murine γ-crystallin gene has been reported (Lok et al., 1984). The mosaic structure of this gene is identical to that of the rat gene.

The three-dimensional structure of the major γ-crystallin of the bovine lens, γII, has been determined by high-resolution X-ray diffraction analysis (Wistow et al., 1983). The polypeptide is organized into two similar globular domains. Each domain consists of two 'Greek key' motifs containing predominantly β-pleated sheets. The high homology between γ-crystallins from various species (Moormann et al., 1982; Lok et al., 1984; Tomarev et al., 1984; J.T.d.D., in preparation) and the observation that these polypeptides have conserved the aa residues which are essential for maintaining the key motif structure suggest that all γ-crystallins are structurally very similar to calf γII.

The short-range spatial order of the crystallins is probably responsible for the transparency of the lens (Delaye and Tardieu, 1983). Changes in transparency of the lens during cataractogenesis are accompanied by structural changes in the crystallins (Carper et al., 1982). The causal relationship between the structural changes in the crystallins and the decrease in transparency of the lens remains, however, obscure. One of the problems in interpreting the results of studies of human cataracts is that the primary aa sequences of the human crystallins are not known. We have therefore started to isolate the human crystallin genes and report here the sequence of two of the γ-crystallin genes as well as that of their immediate flanking regions.
The deduced aa sequences of the two proteins have recently been used in model building studies to predict possible differences between the properties of the human proteins and those of the more easily studied calf γ-crystallins. These studies will provide a better insight into the role played by the individual γ-crystallins in maintaining the transparency of the human lens.
MATERIALS AND METHODS
(a) Construction and screening of the human cosmid library
The human cosmid library used here was constructed by Grosveld et al. (1981; 1982a) and consisted of a partial digest of human placental DNA cloned in the BamHI site of the cosmid vector pOPF. The library originally consisted of 150000 independent clones. A replate of this library (300000 clones) was screened using as a probe the two rat γ-crystallin cDNA clones pRLγ-2 and pRLγ-3 (Moormann et al., 1982). Three positive clones were isolated but these appeared to be identical upon a first restriction enzyme analysis (probably because the clones were isolated from a replate of the original master library). One of these clones, designated pHcosγ-1, was studied further. It contains two complete vector molecules (pOPF) ligated to each other as well as a rearranged one. In the rearranged molecule the 5' end up to the first BstEII site is replaced by the 3' end from the second BstEII site on. Apparently the vector arms used in the construction of the cosmid library were not completely dephosphorylated and were contaminated with incomplete BstEII and/or ClaI digestion products.

(b) DNA preparations
Cosmid or plasmid DNA was isolated essentially as described by Ish-Horowitz and Burke (1981). DNA was subsequently purified by equilibrium CsCl centrifugation. Human liver DNA was isolated and manipulated essentially as described before for rat liver DNA (Moormann et al., 1984).
RESULTS AND DISCUSSION
(a) Isolation and characterization of clone pHcosγ-1
When restriction digests of human genomic DNA are hybridized with rat γ-crystallin sequences, multiple bands are always found (Fig. 1A). This indicates the presence of multiple γ-crystallin sequences in the human genome. To study the genomic organization of these sequences we screened a human cosmid library with rat γ-crystallin cDNA clones (see MATERIALS AND METHODS, section a) and isolated the clone pHcosγ-1.

Fig. 1. Restriction and hybridization analysis of human genomic DNA (A) and pHcosγ-1 DNA (B,C). (A) Various restriction enzyme digests of human genomic DNA (15 µg/lane, electrophoresed on 0.7% agarose) were blotted and hybridized with nick-translated, 32P-labelled inserts of rat γ-crystallin clones (Moormann et al., 1982) essentially as described (Moormann et al., 1983). Final wash was in 0.1 x SSPE (SSPE = 0.18 M NaCl/10 mM sodium phosphate/1 mM EDTA, pH 7.7) at 60°C. (B) Restriction enzyme digests of pHcosγ-1 DNA (1 µg/lane, 0.7% agarose) after staining with ethidium bromide. (C) Autoradiograph of a blot of the gel shown in (B) after hybridization as in (A). Restriction enzymes used are E, EcoRI; B, BamHI; H, HindIII and Bg, BglII. Phage λ DNA digests were run on the gel to obtain the fragment length standards shown. Asterisks in (A) and (C) mark DNA fragments found in the cosmid DNA as well as in the genomic DNA. Some hybridizing bands in (C) do not correspond to bands seen in the genomic pattern (A): these are due to fragments that also contain vector sequences or to fragments that give hybridization signals too low to be detected in the genomic DNA. Note that the bands marked in the EcoRI, HindIII and BglII digests derive from one of the two human γ-crystallin genes, whereas that marked in the BamHI digest derives from the other.
The restriction map of the insert of pHcosγ-1 is shown in Fig. 2A. Although pHcosγ-1 contains 46.8 kb of DNA, the insert is only 23.7 kb long. The remainder of the cosmid molecule consists essentially of a vector trimer: two tandemly linked vector molecules are coupled to a third, rearranged vector molecule. In the latter the 3' end is replaced by the 5' end in an inverted orientation.
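The size bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the per-copy figure assumes, purely for simplicity, that the three vector copies are of roughly equal length, which the text does not state (one copy is rearranged).

```python
# Sizes in kb as quoted in the text.
COSMID_TOTAL_KB = 46.8
INSERT_KB = 23.7


def vector_portion_kb(total_kb, insert_kb):
    """Length of the non-insert (vector-derived) DNA in the cosmid, in kb."""
    return round(total_kb - insert_kb, 1)


def mean_vector_copy_kb(total_kb, insert_kb, copies=3):
    """Average length per vector copy.

    Assumption (not from the text): the vector-derived DNA is split
    evenly over `copies` vector molecules.
    """
    return round(vector_portion_kb(total_kb, insert_kb) / copies, 1)


if __name__ == "__main__":
    print(vector_portion_kb(COSMID_TOTAL_KB, INSERT_KB))
    print(mean_vector_copy_kb(COSMID_TOTAL_KB, INSERT_KB))
```

Under these assumptions the vector-derived portion comes to 23.1 kb, consistent with a trimer of a vector of several kb.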
(b) Localization of γ-crystallin sequences
Restriction digests of pHcosγ-1 always contain at least two fragments that hybridize with the γ-crystallin cDNA clones (Fig. 1C), suggesting that pHcosγ-1 contains more than one γ-crystallin gene. Further hybridization experiments with the 5' and 3' regions of the cDNA clones showed that pHcosγ-1 contains two closely linked γ-crystallin genes oriented in the same transcriptional direction (Fig. 2). The two genes are designated γ1-2 and γ2-1, since they show the highest sequence homology with those two genes of the rat (J.T.d.D., in preparation).
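Naming a gene after the reference gene with which it shows the highest sequence homology amounts to a best-match search over pairwise identities. The following is a minimal sketch of that idea, assuming the sequences have already been aligned to equal length with '-' for gaps. The function names and the short toy sequences are hypothetical placeholders and are not taken from the human or rat genes.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical positions between two pre-aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def best_match(query, references):
    """Return (name, identity) of the reference most similar to the query."""
    scores = {name: percent_identity(query, ref) for name, ref in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy data, invented for illustration only.
query = "ATGGGGAAGATCACCTTCTA"
references = {
    "candidate_1": "ATGGGGAAGATCACTTTTTA",  # 2 mismatches out of 20
    "candidate_2": "ATGGGAAAAATCACTTTTTA",  # 4 mismatches out of 20
}

if __name__ == "__main__":
    print(best_match(query, references))
```

Sequence divergence, as quoted elsewhere in the text, is simply 100 minus the identity computed this way.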
Fig. 2. Physical map of the insert of pHcosγ-1. The upper line represents the insert of pHcosγ-1. The positions of several restriction enzyme sites are indicated. The location and transcriptional direction of the two human γ-crystallin genes were determined by hybridizing blots of various single and double restriction digests of pHcosγ-1 with 5', middle or 3' specific fragments of the two rat γ-crystallin cDNA clones pRLγ-2 and pRLγ-3 (Moormann et al., 1982; 1983). The lower line shows an expanded map of the two genes and their immediate flanking regions. For B, BamHI; Bg, BglII; E, EcoRI; H, HindIII and K, KpnI all sites are given. A, AvaII; P, PstI; S, SmaI; X, XbaI sites are only shown for the γ1-2 and γ2-1 genes. No SstII or SalI sites were found. The location of the genes is shown by bars; black areas represent exons, white areas introns. The location of the bipartite Alu repeats is indicated by heavy arrows; small black squares show the location of direct repeats surrounding the repetitive elements. Upward arrows point to positions where 'simple-sequence DNAs' are located, while their composition is indicated. The wavy line shows the location of the probe used in hybridization studies to identify Alu repeats, and the line bounded by two asterisks indicates the location of the fragment used as a probe for the intergenic region in genomic DNA.
(c) γ-Crystallin sequences in the human genome
The hybridization pattern of restriction digests of pHcosγ-1 with the rat γ-crystallin cDNA clones was also compared with the genomic hybridization pattern (Fig. 1). The cosmid fragments correspond exactly to genomic fragments (except of course when parts of the vector molecule are present) but are only a subset thereof. Hence the two γ-crystallin genes present in pHcosγ-1 do not represent the complete human genomic complement of γ-crystallin sequences. From the genomic hybridization pattern we estimate that there are five to six human γ-crystallin genes.

In pHcosγ-1 the two γ-crystallin genes are separated by 12.5 kb of DNA (Fig. 2). This organization exactly reflects the genomic organization since a fragment isolated from the cloned intergenic region (Fig. 2) hybridizes to identically sized genomic restriction fragments (results not shown). Other linked human γ-crystallin genes were isolated by Meakin et al. (1985). Linkage of all the γ-crystallin genes in man is further indicated by the fact that all map to chromosome 2 (Den Dunnen et al., 1985). A similar situation exists in the rat, where extensive linkage of the γ-crystallin genes is also found (Moormann et al., 1985).

(d) Sequence analysis of the coding region of the γ-crystallin genes
The nucleotide sequence of the two human γ-crystallin genes and parts of their flanking regions is presented in Fig. 3. For comparison, the sequence of the rat γ3-1 gene (Moormann et al., 1983) is also shown. The organization of the two human genes is identical to that of the rat γ3-1 crystallin gene: all three genes contain two large ORFs preceded by a 9-bp-long ORF which starts with the initiation codon ATG. Hence the genes contain two introns, one close to the 5' end and a second, approximately in the middle of the gene. Exon 2 is 243 bp long in both human genes and encodes 81 aa residues. The third exon has an ORF of 276 bp in the γ1-2 gene, while that of the γ2-1 gene is 3 nt shorter. This length difference is most probably caused by a G → C mutation which altered the original -agcag- splice acceptor sequence into -agCAC- and created a new splice acceptor site 3 nt upstream, where another -ag- dinucleotide is located. This ultimately led to a His insertion at position 84 of the γ1-2 protein, immediately following the intron/exon junction. An additional aa is also present at this site in the calf γII crystallin (Bhat and Spector, 1984), some frog γ-crystallins (Tomarev et al., 1984) and the rat γ1-2 crystallin (J.T.d.D., in preparation).

Fig. 3. Nucleotide sequence of the human γ1-2 and γ2-1 crystallin genes compared with the sequence of the rat γ3-1 gene. The DNA sequences were determined by the dideoxy chain termination method as described (Moormann et al., 1983) using [α-32P]dATP (410 Ci/mmol, Amersham). The recombinant M13mp phages used in the sequencing studies were constructed from appropriate single or double digests of the subcloned HindIII fragments of pHcosγ-1. The sequence was analysed at least once on both DNA strands. The sequence of the human γ1-2 gene is shown completely. For the human γ2-1 and the rat γ3-1 genes only differences with the sequence of the human γ1-2 gene are specified; the blanks indicate that the sequence equals that of the human γ1-2 gene. Dashes indicate 1-bp deletions. The sequences are aligned to optimize homology. Boxes enclose the exonic DNA, with the coding sequences shown in capital letters. Numbering is for the human γ1-2 exonic sequences, with position 1 assigned to the putative capsite; 3' gene-flanking sequences are numbered +1 from the putative poly(A) addition site on. The three asterisks mark the putative capsites, the poly(A) addition signal is heavily underlined, and the putative poly(A) addition site is marked with an upward arrow.

The γ1-2 gene encodes a protein of 174 aa (Mr 20753) and the γ2-1 gene a protein of 173 aa (Mr 20696) (Fig. 4). Both polypeptides start with a Gly residue, in agreement with the results of Kabasawa et al. (1977) who determined that Gly is found at the
N terminus of human γ-crystallins. The two proteins are very similar and have an overall aa homology of 77%. Using the derived aa sequences of these two genes, Summers et al. (1984) have shown in recent model-building studies that the three-dimensional structure of the human polypeptides is very similar to the one established for the bovine γII-crystallin. The residues involved in the stabilization of the folded hairpin structure in each motif are all conserved with the exception of Arg-79 in motif II, which is replaced by a cysteine residue in the two human sequences. The replacement of a charged residue by a hydrophobic side chain exposes other aromatic residues, thereby increasing the hydrophobicity of the surface of the human γ-crystallins. These and other changes tend to favour aggregation of the human γ-crystallins. This may contribute to the conversion of the water-soluble protein to insoluble protein, a process that occurs on aging and is accelerated in cataract.

The sequence divergence at the nt level between the human γ1-2 and γ2-1 genes is only about 20%, while the average sequence divergence between these two human genes and the two orthologous rat γ-crystallin sequences is only slightly higher, about 22%. Sequence variability is virtually restricted to the third exon (i.e., the second domain of the protein) and predominantly found in the first half of this exon, which encodes the third motif of the protein: the nt sequences of the human γ1-2 and γ2-1 genes differ 35% in the coding region for protein motif III, 26% in the coding region for motif IV, 10% in the coding region for motif I and 9% in the coding region for motif II. We noted a similar variability of the second domain in two rat γ-crystallins (Moormann et al., 1982). At present we cannot distinguish between two possible explanations for this phenomenon, namely concerted evolution of the first domain or a more rapid divergent evolution of the second.

Fig. 4. Comparison of the deduced amino acid sequences of the human crystallins γ1-2 and γ2-1. The aa sequences are shown in the one letter code. The gap after position 29 is due to the alignment of the sequences encoding each of the protein domains. Numbering is for the sequence of the human γ1-2 gene. Nonmatching aa are boxed; arrows indicate residues conserved in all γ-crystallins while the asterisk marks the Cys-residue discussed in section d of RESULTS AND DISCUSSION.

(e) Sequence analysis of the noncoding and flanking regions

In the 3'-noncoding region of both genes an AATAAA polyadenylation signal is found at equivalent positions followed 23 and 22 bp, respectively, later by a GCA triplet. We suggest that this triplet is the site of poly(A) addition by analogy with the rat γ3-1 gene (Moormann et al., 1983). The 3'-noncoding regions would then be 64 and 65 bp long. The genes show a general sequence homology which extends into the 3' gene-flanking region for about 25 bp. After that large insertions/deletions are required to align the sequences. Upstream from the initiation codon possible cap sites are found at the positions indicated in Fig. 3. About 25-30 bp upstream from the putative cap site a Goldberg-Hogness box (-TATA-) is found.

During the sequencing of the intronic and flanking regions of the two genes several Alu repetitive elements (Jelinek et al., 1980) were encountered, some of which are flanked by short direct repeats (Fig. 2). Most of the Alu elements end in a poly(A) tract but one ends in an (AAC)n tract. The Alu elements are mostly arranged in small clusters consisting of one complete and one or two truncated elements. This suggests that new Alu repeats frequently integrate in the genome in or near pre-existing ones, thereby giving rise to deletions in the region where they insert. Many more Alu elements, either complete or truncated, are present in the regions surrounding the two γ-crystallin genes since a cloned Alu element was found to hybridize strongly to the flanking and intergenic regions. Alu elements are often found in the neighbourhood of human genes but such a high density of Alu repeats around a gene has thus far only been reported for the human growth hormone gene (Barsh et al., 1983).

Besides Alu elements, the sequenced region also contains 'simple sequence' DNA, primarily repeats of the dinucleotide AC/TG (Fig. 2). In general, simple sequence DNA is often found near repetitive DNA elements. In our case two of the three (AC/TG)n elements are found close to Alu repeats. Rogers (1983) observed that (CA)n sequences are often found near or at breakpoints in DNA. Simple sequence DNA near repetitive elements could then somehow have been involved in the breakage and
reunion of chromosomes during the transposition of these repetitive elements. Alternatively, simple-sequence DNA could stimulate the insertion of repetitive DNA in its direct neighbourhood. Finally, simple sequence DNA could originate from the 3’ end of repetitive DNA elements. This can also be true for the (AC/TG), elements, as we find one Alu repeat that ends in a poly(AAC) sequence and not in a poly(A) tract. The variable presence of (mobile) repetitive elements and simple sequence DNA is at least partially responsible for the variation in length (0.95-2.82 kb) of the second intron. In contrast, the length of the 5’ intron is relatively constant (86-100 bp). The intronic sequences show only homology in some short stretches and directly surrounding the splice sites (a detailed analysis will be presented elsewhere).
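Tracts of simple-sequence DNA such as the (AC/TG)n elements discussed in this section can be located in a sequence string with a regular expression. The sketch below is only an illustration: the function names, the threshold of five consecutive repeat units and the example sequence are assumptions made here, not criteria or data from this study. Because (AC)n on one strand reads as (GT)n on the complementary strand, both units are searched on the given strand.

```python
import re


def find_tracts(seq, unit, min_repeats=5):
    """Return (start, end, n_repeats) for each run of `unit` repeated
    at least `min_repeats` times; coordinates are 0-based, end-exclusive."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_repeats))
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(unit))
        for m in pattern.finditer(seq.upper())
    ]


def find_ac_tg_tracts(seq, min_repeats=5):
    """Find (AC)n tracts on either strand by scanning for AC and for GT,
    the reverse complement of AC, on the strand supplied."""
    hits = []
    for unit in ("AC", "GT"):
        for start, end, n in find_tracts(seq, unit, min_repeats):
            hits.append((start, end, n, unit))
    return sorted(hits)


# Hypothetical example sequence, invented for illustration.
example = "GGATCC" + "AC" * 8 + "TTGCA" + "GT" * 6 + "AAGCTT"

if __name__ == "__main__":
    for hit in find_ac_tg_tracts(example):
        print(hit)
```

The same `find_tracts` helper can be pointed at other repeat units, for example `"AAC"` for a poly(AAC) tail, by changing the `unit` argument.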
(f) Conserved elements in the 5' flanking region

Since possible regulatory elements may be revealed by the conservation of their sequence we have compared the 5'-flanking region of the two human genes with those of two other genes, the rat γ3-1 crystallin gene and the human β-globin gene (Fig. 5). Besides the -TATA- box at -30, three other conserved elements are found: a region around -50 to -60, a sequence resembling the -CCAAT- box (Benoist et al., 1980) at position -80 and finally a -CCCT- sequence (Grosveld et al., 1982b) at -100. Two of these sequence elements have been shown to be necessary for efficient and correct transcription of the rabbit β-globin gene in human HeLa cells (Grosveld et al., 1982b). The third element, at -50 to -60, has as yet not been found to be necessary for transcription of the β-globin gene in studies of transient expression. The conservation of this sequence in the human and rat γ-crystallin genes indicates that further study of the functional significance of this sequence is warranted.

We have no direct evidence that the two human genes described here do function in vivo. However, the lack of in-phase stop codons and the presence of a proper poly(A) addition signal at the 3' ends, of a Goldberg-Hogness box, and of elements involved in the modulation of transcription of eukaryotic genes at the 5' ends all suggest that these two genes do function.

ACKNOWLEDGEMENTS

We thank Drs. R.A. Flavell and F.G. Grosveld for providing us with the human gene library. We thank Roselie Jongbloed for help with the experiments, Dr. N.H. Lubsen for critical reading of the manuscript and Dr. M.L. Breitman for communicating his results prior to publication. This investigation was carried out under the auspices of the Netherlands Foundation for Chemical Research (SON) with financial aid from the Netherlands Organization for the Advancement of Pure Research (ZWO).
Fig. 5. Comparison of the 5' flanking sequences. The 5'-flanking sequences of the human γ1-2 and γ2-1 genes, the rat γ3-1 and the human β-globin genes are aligned to optimize homology. Conserved nt (with reference to the human γ-crystallin genes) are shown in capital letters. Heavy overlining indicates nt that are conserved in all three γ-crystallin genes. Numbering is for the human γ1-2 crystallin gene with position 1 assigned to the putative cap site; the arrow points to the cap site of the human β-globin gene. Boxes enclose sequences involved in the regulation of transcription of eukaryotic genes while their conserved core sequence is heavily underlined (see RESULTS AND DISCUSSION, section f).

REFERENCES

Barsh, G.S., Seeburg, P.H. and Gelinas, R.E.: The human growth hormone gene family: structure and evolution of the chromosomal locus. Nucl. Acids Res. 11 (1983) 3939-3958.

Benoist, C., O'Hare, K., Breathnach, R. and Chambon, P.: The ovalbumin gene sequence of putative control regions. Nucl. Acids Res. 8 (1980) 127-142.

Bhat, S.P. and Spector, A.: Complete nucleotide sequence of a cDNA derived from calf lens γ-crystallin mRNA; presence of AluI-like sequences. DNA 3 (1984) 287-295.

Bloemendal, H.: Biosynthesis of lens crystallins, in Bloemendal, H. (Ed.), Molecular and Cellular Biology of the Eye Lens. Wiley, New York, 1981, pp. 189-220.

Carper, D., Shinohara, T., Piatigorsky, J. and Kinoshita, J.H.: Deficiency of functional messenger RNA for a developmentally regulated β-crystallin polypeptide. Science 217 (1982) 463-464.

Delaye, M. and Tardieu, A.: Short-range order of crystallin protein accounts for eye lens transparency. Nature 302 (1983) 415-417.

Den Dunnen, J.T., Jongbloed, R.J.E., Geurts van Kessel, A.H.M. and Schoenmakers, J.G.G.: Human γ-crystallin genes are located in the p12-qter region of chromosome 2. Hum. Genet. 70 (1985) 217-221.

Grosveld, F.G., Dahl, H.H.M., de Boer, E. and Flavell, R.A.: Isolation of β-globin related genes from a human cosmid library. Gene 13 (1981) 227-237.

Grosveld, F.G., Lund, T., Murray, E.J., Mellor, A.L., Dahl, H.H.M. and Flavell, R.A.: The construction of cosmid libraries which can be used to transform eukaryotic cells. Nucl. Acids Res. 10 (1982a) 6715-6732.

Grosveld, G.C., Rosenthal, A. and Flavell, R.A.: Sequence requirements for the transcription of the rabbit β-globin gene in vivo: the -80 region. Nucl. Acids Res. 10 (1982b) 4951-4971.

Ish-Horowitz, D. and Burke, J.F.: Rapid and efficient cosmid cloning. Nucl. Acids Res. 9 (1981) 2989-2998.

Jelinek, W.R., Toomey, T.P., Leinwand, L., Duncan, C.H., Biro, P.A., Choudary, P.V., Weissman, S.M., Rubin, C.M., Houck, C.M., Deininger, P.L. and Schmid, C.W.: Ubiquitous interspersed repeated sequences in mammalian genomes. Proc. Natl. Acad. Sci. USA 77 (1980) 1398-1402.

Kabasawa, I., Barber, G.W. and Kinoshita, J.H.: Aging effects and some properties of the human lens low molecular weight proteins. Jap. J. Ophthalmol. 21 (1977) 87-97.

Lok, S., Tsui, L-C., Shinohara, T., Piatigorsky, J., Gold, R. and Breitman, M.L.: Analysis of the mouse γ-crystallin gene family: assignment of multiple cDNAs to discrete genomic sequences and characterization of a representative gene. Nucl. Acids Res. (1984) 4517-4529.

Meakin, S., Breitman, M.L. and Tsui, L-C.: Characterization of five members of the human γ-crystallin gene family (submitted).

Moormann, R.J.M., Den Dunnen, J.T., Bloemendal, H. and Schoenmakers, J.G.G.: Extensive intragenic sequence homology in two distinct rat lens cDNAs suggests duplications of a primordial gene. Proc. Natl. Acad. Sci. USA 79 (1982) 6876-6880.

Moormann, R.J.M., Den Dunnen, J.T., Mulleners, L., Andreoli, P., Bloemendal, H. and Schoenmakers, J.G.G.: Strict colinearity of genetic and protein folding domains in an intragenically duplicated γ-crystallin gene. J. Mol. Biol. 171 (1983) 353-368.

Moormann, R.J.M., Den Dunnen, J.T., Heuyerjans, J., Jongbloed, R.J.E., Van Leen, R.W., Lubsen, N.H. and Schoenmakers, J.G.G.: Characterization of the rat γ-crystallin gene family and its expression in the eye lens. J. Mol. Biol. 182 (1985) 419-430.

Ocken, P.R., Fu, S-C.J., Hart, R., White, J.H., Wagner, B.J. and Lewis, K.E.: Characterization of lens proteins, I. Identification of additional soluble fractions in rat lenses. Exp. Eye Res. 24 (1977) 355-367.

Piatigorsky, J.: Lens differentiation in vertebrates: a review of cellular and molecular features. Differentiation 19 (1981) 134-153.

Ramaekers, F., Dodemont, H., Vorstenbosch, P. and Bloemendal, H.: Classification of rat lens crystallins and identification of proteins encoded by rat lens mRNA. Eur. J. Biochem. 128 (1982) 503-508.

Rogers, J.: Molecular biology; CACA-sequences - the ends and the means? Nature 305 (1983) 101-102.

Shinohara, T., Robinson, E.A., Appella, E. and Piatigorsky, J.: Multiple γ-crystallins of the mouse lens: fractionation of mRNAs by cDNA cloning. Proc. Natl. Acad. Sci. USA 79 (1982) 2783-2787.

Summers, L., Slingsby, C., White, H., Narebor, M., Moss, D., Miller, L., Mahadevan, D., Lindley, P., Driessen, H., Blundell, T., Den Dunnen, J.T. and Schoenmakers, J.G.G.: The molecular structures and interactions of bovine and human γ-crystallins, in Nugent, J. and Whelan, J. (Eds.), Human Cataract Formation. CIBA Foundation Symposium No. 106. Pitman, London, 1984, pp. 219-236.

Tomarev, S.I., Zinovieva, R.D., Chalovka, P., Krayev, A.S., Skryabin, K.G. and Gausse Jr., G.G.: Multiple genes coding for the frog eye lens γ-crystallins. Gene 27 (1984) 301-308.

Wistow, G., Turnell, B., Summers, L., Slingsby, C., Moss, D., Miller, F., Lindley, P. and Blundell, T.: X-ray analysis of the eye lens protein γII-crystallin at 1.9 Å resolution. J. Mol. Biol. 170 (1983) 175-202.

Communicated by H. van Ormondt.