Vol. 157, No. 3, 1988
BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS Pages 1033-1038
December 30, 1988
THE HUMAN PROTEIN S LOCUS: IDENTIFICATION OF THE PSα GENE AS A SITE OF LIVER PROTEIN S MESSENGER RNA SYNTHESIS
Hans K. Ploos van Amstel*, Pieter H. Reitsma and Rogier M. Bertina
Haemostasis and Thrombosis Research Unit, Leiden University Hospital, 2333 AA Leiden, The Netherlands
Received October 28, 1988
Summary: The protein S locus, situated on chromosome 3, consists of two protein S genes. Here, we report the cloning and complete nucleotide sequence of the 3'-untranslated region of the two genes, designated PSα and PSβ. Both regions span approximately 1,200 nucleotides. They show a high degree (~97%) of homology, with deviations caused by small deletions, insertions and point mutations. Comparison of PSα and PSβ with the reported protein S liver cDNAs shows that the latter all originate from the PSα gene. The PSα gene therefore is marked as the major site of synthesis of liver protein S mRNA. Sequence comparison with the bovine protein S cDNA reveals that the PSβ gene has accumulated a few more mutations than the PSα gene since duplication of the ancestral protein S gene, which seems to have occurred recently during primate evolution. © 1988 Academic Press, Inc.
Human protein S is a vitamin K-dependent plasma glycoprotein that acts as a cofactor of activated protein C, the key component of the anticoagulant pathway of the blood coagulation system (1). In plasma, protein S circulates both free and bound to C4b-binding protein, a component of the complement system (2). Only free protein S serves as cofactor of activated protein C, thereby accelerating the inactivation of the coagulation factors Va (3) and VIIIa (4). Heterozygotes for a deficiency of either protein C or protein S are reported to be at risk of developing venous thrombo-embolic disease at a relatively young age, probably through a shift in the hemostatic balance that favors the formation of insoluble fibrin (5-8).

The nucleotide sequence of liver cDNA coding for human protein S has been reported by several laboratories (9-12). Southern analysis of the protein S locus, with cDNA probes encompassing the 3'-untranslated region of protein S mRNA, has shown that it consists of two protein S genes, both situated on chromosome 3 (12). The conservation of restriction sites suggested that the two genes are highly homologous. Recently, we demonstrated that in a family with hereditary protein S deficiency the reduced plasma protein S level was associated with a partial deletion in the protein S locus (13). For a proper understanding of the pathogenesis of the thrombotic disorder, information is needed on the relative contribution of the two protein S genes to protein S synthesis. Therefore, we started to isolate the protein S locus and to characterize it in greater detail. Here we report the nucleotide sequence of the complete 3'-untranslated regions of the two protein S genes, designated PSα and PSβ. The sequences of PSα, PSβ and the liver cDNAs coding for protein S will be discussed.

*To whom correspondence should be addressed.
0006-291X/88 $1.50 Copyright © 1988 by Academic Press, Inc. All rights of reproduction in any form reserved.
Materials and Methods

Materials
Phage λEMBL4 DNA, packaging extracts, E. coli strain NM539 and restriction enzymes were purchased from Promega Biotech (Madison, USA). Labelling of the protein S cDNA fragments was performed with a random priming kit from Boehringer Mannheim (Mannheim, FRG) using (α-³²P)dCTP (>3000 Ci/mmol) obtained from New England Nuclear (Boston, USA). The nucleotide sequence determinations were performed with a sequencing kit from Boehringer Mannheim using (α-³⁵S)dATP (>600 Ci/mmol) from Amersham International (Amersham, UK).

Genomic library: construction and screening
High molecular weight DNA, isolated from peripheral blood leukocytes, was partially digested with the restriction enzyme EcoRI and ligated in phage λEMBL4 essentially as described (14). After in vitro packaging, E. coli NM539 cells were infected. The unamplified library, containing approximately 3 × 10⁵ independent recombinant phages, was screened by in situ hybridization using a probe, DS 400.2, representing the 3'-end of the 3'-untranslated region of human protein S cDNA (12). The positive recombinants were plaque purified and DNA was isolated by the plate lysate method (14).

Characterization and sequence determination of the recombinant clones
DNA of the recombinant clones, together with human total genomic DNA, was characterized for the 3'-untranslated region by restriction mapping and Southern analysis. EcoRI restriction fragments were subcloned in phage M13 mp19 and the nucleotide sequence was determined using the dideoxy chain termination method (15). Based on the obtained nucleotide sequence, oligonucleotides were synthesized and used as specific primers in the sequencing reactions. The nucleotide sequences were analyzed using the Microgenie program from Beckman Instruments (Fullerton, USA).
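As a rough plausibility check on the library size quoted above, the chance that a given single-copy sequence is present among N independent clones can be estimated with the Clarke-Carbon relation P = 1 - (1 - f)^N, where f is the insert length divided by the genome length. The sketch below is illustrative only: the 15 kb mean insert and the 3 × 10⁹ bp haploid genome are assumptions that are not stated in the text, the function name is hypothetical, and the relation assumes random fragmentation, which a partial EcoRI digest only approximates.

```python
import math

def library_coverage(n_clones: float, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon estimate: probability that a given single-copy
    sequence is present in a library of n_clones independent inserts."""
    f = insert_bp / genome_bp          # fraction of the genome carried by one clone
    # log1p keeps precision when f is very small
    return 1.0 - math.exp(n_clones * math.log1p(-f))

# Assumed values (not from the paper): 15 kb mean insert, 3e9 bp haploid genome.
p = library_coverage(n_clones=3e5, insert_bp=15_000, genome_bp=3e9)
print(f"P(sequence present) ~ {p:.2f}")
```

Under these assumptions the library holds about 1.5 genome equivalents, so recovering a clone for each of the two genes is plausible but not guaranteed.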
Results and Discussion

Approximately 3 × 10⁵ independent recombinant phages of the λEMBL4 library were screened with a cDNA fragment coding for part of the 3'-untranslated region of the protein S mRNA (probe DS 400.2, Figure 1). Two positive
recombinants were obtained, plaque purified and subjected to Southern blot analysis (Figure 2). Digestion of the phage DNA was performed with the restriction enzymes EcoRI and PstI. Hybridization with DS 400.2 results in the visualization of two EcoRI fragments of 0.77 and 1.1 kb for both λPSα and λPSβ (Figure 2, left panel). However, when digestion was performed with the enzyme PstI, λPSα shows a 2.4 kb fragment hybridizing with DS 400.2, whereas λPSβ shows a 5.6 kb PstI fragment (Figure 2, right panel). All fragments are in accordance with the hybridization pattern of human total DNA (HMW in Figure 2). Earlier we reported that the two PstI fragments present in human total DNA and hybridizing with DS 400.2 find their origin in the presence of two protein S genes per haploid genome on chromosome 3 (12). It was postulated that the two genes differ in their PstI restriction pattern in their 3'-flanking regions. This is now confirmed by the isolation and characterization of the clones λPSα and λPSβ (Figure 1). The two clones have identical EcoRI restriction patterns as far as detected with DS 400.2 but deviate when digested with PstI.

Figure 1: Partial restriction map and sequencing strategy of the 3'-untranslated region of the phage λEMBL4 clones PSα and PSβ. The black box indicates the exon for the 3'-untranslated region. Arrows indicate the direction and extension of the sequencing reactions. Open circles denote reactions primed by specific oligonucleotides. The position of probe DS 400.2 is indicated at the top. E, EcoRI; P, PstI.

Figure 2: Southern blot analysis of human total DNA (HMW) and the phage clones λPSα and λPSβ, digested with the enzymes EcoRI (left panel) and PstI (right panel). Numbers indicate the calculated molecular weights (in kilobases) of the fragments hybridizing with probe DS 400.2 (see Figure 1).

Next, the EcoRI fragments encompassing the 3'-untranslated region were subcloned in phage M13 and the nucleotide sequence determined. The sequencing strategy is outlined in Figure 1. The sequence analysis (Figure 3) of the fragments containing the 3'-untranslated region of PSα
and PSβ shows that the two genes are highly homologous (~97%). Both PSα and PSβ possess the two putative AATAAA polyadenylation signals, and the homology between the two genes extends downstream from the putative site of polyadenylation. When the reported sequences of the 3'-untranslated regions of liver protein S cDNAs (9-12) are compared with the corresponding region of the PSα and PSβ genes, they all show complete similarity with the PSα gene. This observation suggests that the liver mRNAs that served as template for the protein S cDNA synthesis have been transcribed from the PSα gene. The survey of human liver protein S cDNAs was extended to eight additional clones also containing the 3'-untranslated region but of which no nucleotide sequences were determined (11). Analysis was performed by making use of an MspI restriction site distinguishing the PSβ gene from the PSα gene at position 910 (Figure 3), i.e. upstream from the putative polyadenylation signals. None of the eight protein S cDNA clones contained the additional MspI site and therefore they cannot originate from the PSβ gene.

Figure 3: Nucleotide sequences of the 3'-untranslated region of the human PSα and PSβ genes, and of the bovine protein S cDNA (19). The bovine sequence extends to the site of polyadenylation and is inherently shorter than the human sequence. A dash indicates nucleotide similarity, a closed triangle indicates a single base deletion. Dots indicate nucleotide differences between the human and bovine sequences that are unique for either the PSα or PSβ gene. The MspI (CCGG) restriction site present in the PSβ gene at position 910 is indicated by asterisks. Two putative polyadenylation signals are underlined. The arrow indicates the site of polyadenylation.

[Figure 3 alignment. Only the PSα rows that survive intact are listed; the PSβ and bovine comparison rows, and PSα nucleotides 61-300, 901-960 and beyond 1200, are not reproduced. The human reading frame ends -Asn-Ser-End in the first row.]

PSα     1-60    GAATTCTTAAGGCATCTTTTCTCTGCTTATAATACCTTTTCCTTGTGTGTAATTATACTT
PSα   301-360   GTAAAACTAAATTTTAATTTTATCATCATGAACTAGTGTCTAAATACCTATGTTTTTTTC
PSα   361-420   AGAAAGCAAGGAAGTAAACTCAAACAAAAGTGCGTGTAATTAAATACTATTAATCATAGG
PSα   421-480   CAGATACTATAAAATTTGTTTATGTTTTTGTTTTTTTCCTGATGAAGGCAGAAGAGATGG
PSα   481-540   TGGTCTATTAAATATGAATTGAATGGAGGGTCCTAATGCCTTATTTCAAAACAATTCCTC
PSα   541-600   AGGGGGACCAGCTTTGGCTTCATCTTTCTCTTGTGTGGCTTCACATTTAAACCAGTATCT
PSα   601-660   TTATTGAATTAGAAAACAAGTGGGACATATTTTCCTGAGAGGAGCACAGGAATCTTACTT
PSα   661-720   CTTGGCAGCTGCAGTCTGTCAGGATGAGATATCACATTAGGTTGGATAGGTGCGGAAATC
PSα   721-780   TGAAGTCGGTACATTTTTTAAATTTTGCTGTGTGGGTCACACAAGGTGTACATTACAAAA
PSα   781-840   GACAGAATTCAGGGATGGAAAGGAGAATGAACAAATGTGGGAGTTCATAGTTTTCCTTGA
PSα   841-900   ATCCAACTTTTAATTACCAGAGTAAGTTGCCAAAATGTGATTGTTGAAGTACAAAAGGAA
PSα   961-1020  GTATCATTGTAATCAAAGAAGTAAGGAGGTAAGATTGCCACGTGCCTGCTGGTACTGTGA
PSα  1021-1080  TGCATTTCAAGTGGCAGTTTTATCACGTTTGAATCTACCATTCATAGCCAGATGTGTATC
PSα  1081-1140  AGATGTTTGACTGACAGTTTTTAACAATAAATTCTTTTCACTGTATTTTATATCACTTAT
PSα  1141-1200  AATAAATCGGTGTATAATTTTAAAATGCATGTGAATATCTTTATTATATCAACTGTTTGA
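The MspI test described above amounts to asking whether a clone carries a CCGG recognition site within a given stretch of sequence. A minimal Python sketch of that logic follows. The function names, the window bounds and the two short example fragments are hypothetical illustrations, not data from the paper, and in the paper the site was scored by restriction digestion rather than by inspecting sequence.

```python
def find_sites(seq: str, site: str = "CCGG") -> list[int]:
    """Return 1-based start positions of every occurrence of `site`
    (overlapping matches included)."""
    seq, site = seq.upper(), site.upper()
    hits, start = [], seq.find(site)
    while start != -1:
        hits.append(start + 1)
        start = seq.find(site, start + 1)
    return hits

def assign_gene(clone_seq: str, window: tuple[int, int]) -> str:
    """Call a clone 'PSbeta-like' if an MspI site (CCGG) starts inside
    the 1-based inclusive window, otherwise 'PSalpha-like'."""
    lo, hi = window
    has_site = any(lo <= pos <= hi for pos in find_sites(clone_seq))
    return "PSbeta-like" if has_site else "PSalpha-like"

# Hypothetical toy fragments that differ at one base, creating CCGG in one of them.
toy_alpha = "AAGTCTGGATCA"
toy_beta = "AAGTCCGGATCA"
print(assign_gene(toy_alpha, (1, 12)))
print(assign_gene(toy_beta, (1, 12)))
```

Restricting the search to a window matters because a cDNA may carry other CCGG sites elsewhere; only the site that distinguishes the two genes is informative.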
If the PSβ gene is expressed in the liver, its level of transcription must be less than ten percent of that of the PSα gene. No data are available on the identity of protein S mRNA of endothelial cells and megakaryocytes, both of which have been shown to be, like the liver (16), sites of protein S synthesis (17,18).

When part of the 3'-untranslated regions of PSα (Figure 3, nucleotide 1-316) and of PSβ (Figure 3, nucleotide 1-312) are compared with the corresponding complete 3'-untranslated region in bovine protein S cDNA (19), about 80% homology is found for both genes. Most of the mutations that underlie the ~20% divergence of the 3'-untranslated region of PSα and PSβ from their bovine counterpart are the same for the two human genes (Figure 3).
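Homology percentages of this kind can be computed from a pairwise alignment as sketched below. This is a sketch under stated assumptions: the inputs are already aligned and of equal length, gaps are written as '-', and a column with a gap in one sequence counts as a mismatch. The text does not say which convention the Microgenie program applied, and the example strings are invented.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same base.
    Columns where both are gaps are skipped; a single gap counts as a mismatch."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = columns = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap and y == gap:
            continue
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns

# Invented 20-column alignment: one substitution and one single-base gap.
seq1 = "ATGCATGCATGCATGCATGC"
seq2 = "ATGCATGAATGCATGC-TGC"
print(f"{percent_identity(seq1, seq2):.1f}% identity")
```

Counting a multi-base deletion as one event instead of several mismatched columns would give a slightly higher figure, so the gap convention should be reported alongside any percentage.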
However, of the differences, two point mutations are found only in the PSα gene, whereas the PSβ gene contains five unique point mutations and one unique deletion of four nucleotides. Since the moment of duplication of the ancestral protein S gene, apparently a few more mutations have accumulated in the PSβ gene than in the PSα gene. The high degree of homology between the PSα and PSβ 3'-untranslated region, where most mutations are thought to have no effect on the phenotype and can be considered neutral, suggests that the duplication of the ancestral protein S gene has taken place recently during primate evolution (20).

The elucidation of the nucleotide sequence of the exons encompassing the 3'-untranslated region of the PSα and PSβ genes now offers the tools to study tissue-dependent protein S mRNA synthesis using gene-specific oligonucleotide probes. The analysis of the protein S mRNA population in the respective tissues (16-18) will give us further insight into the physiological role of each of the two protein S genes, PSα and PSβ, of the human protein S locus.
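To make the ideas of lineage-specific mutations and gene-specific probes concrete, the sketch below walks a three-way alignment and, at each column where the two human sequences differ, uses an outgroup (standing in here for the bovine cDNA) to guess which gene changed. It is a simple parsimony heuristic with hypothetical names and invented toy sequences, not the authors' procedure. The columns it reports are also the natural places to centre a gene-specific oligonucleotide.

```python
def classify_differences(alpha: str, beta: str, outgroup: str):
    """For each alignment column where alpha != beta, report which
    sequence departs from the outgroup (simple parsimony).
    Returns a list of (1-based column, label) tuples."""
    if not (len(alpha) == len(beta) == len(outgroup)):
        raise ValueError("all three aligned sequences must have equal length")
    calls = []
    for i, (a, b, o) in enumerate(zip(alpha, beta, outgroup), start=1):
        if a == b:
            continue                          # shared state: not gene-specific
        if a == o:
            calls.append((i, "changed in beta"))
        elif b == o:
            calls.append((i, "changed in alpha"))
        else:
            calls.append((i, "unresolved"))   # outgroup matches neither gene
    return calls

# Invented toy alignment ('-' marks a deleted base).
alpha = "ACGTTGCAACCT"
beta = "ACGATGC-AGCT"
outgroup = "ACGTTGCAAGTT"
for col, label in classify_differences(alpha, beta, outgroup):
    print(col, label)
```

Columns where both human sequences agree with each other but differ from the outgroup are skipped, which mirrors the observation that most differences from the bovine sequence are shared by the two genes.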
Acknowledgements

We thank Mr. W. te Lintel Hekkert for excellent technical assistance and Mrs. M.J. Mentink for her help in preparing the manuscript. This work was supported by a grant (no. 86.015) from the Trombose Stichting Nederland.
References

1. Esmon, C.T. (1987) Science 235, 1348-1352.
2. Dahlbäck, B., and Stenflo, J. (1981) Proc. Natl. Acad. Sci. USA 78, 2512-2516.
3. Walker, F.J. (1980) J. Biol. Chem. 255, 5521-5524.
4. Gardiner, J.E., McGarn, M.A., Berridge, C.W., Fulchner, C.A., Zimmerman, T.S., and Griffin, J.H. (1984) Circulation 70, suppl. II, 205.
5. Griffin, J.H., Evatt, B., Zimmerman, T.S., Kleiss, A.J., and Wideman, C. (1981) J. Clin. Invest. 68, 1370-1373.
6. Broekmans, A.W., Veltkamp, J.J., and Bertina, R.M. (1983) N. Engl. J. Med. 309, 340-344.
7. Comp, P.C., Nixon, R.R., Cooper, M.R., and Esmon, C.T. (1984) J. Clin. Invest. 74, 2082-2088.
8. Engesser, L., Broekmans, A.W., Briët, E., Brommer, E.J.P., and Bertina, R.M. (1987) Ann. Intern. Med. 106, 677-682.
9. Lundwall, A., Dackowski, W., Cohen, E., Shaffer, M., Malor, A., Dahlbäck, B., Stenflo, J., and Wydro, R. (1986) Proc. Natl. Acad. Sci. USA 83, 6716-6720.
10. Hoskins, J., Norman, D.K., Beckman, R.J., and Long, G.L. (1987) Proc. Natl. Acad. Sci. USA 84, 349-353.
11. Ploos van Amstel, J.K., Van der Zanden, A.L., Reitsma, P.H., and Bertina, R.M. (1987) FEBS Letters 222, 186-190.
12. Ploos van Amstel, J.K., Van der Zanden, A.L., Bakker, E., Reitsma, P.H., and Bertina, R.M. (1987) Thromb. Haemostas. 58, 982-987.
13. Ploos van Amstel, H.K., Huisman, M.V., Reitsma, P.H., ten Cate, J.W., and Bertina, R.M. Blood (in press).
14. Maniatis, T., Fritsch, E.F., and Sambrook, J. (1982) Molecular cloning. A laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
15. Sanger, F., Nicklen, S., and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
16. Fair, D.S., and Marlar, R.A. (1986) Blood 67, 64-70.
17. Stern, D., Brett, J., Harris, K., and Nawroth, P. (1986) J. Cell. Biol. 102, 1971-1978.
18. Ogura, M., Tanabe, N., Nishioka, J., Suzuki, K., and Saito, H. (1987) Blood 70, 301-306.
19. Dahlbäck, B., Lundwall, A., and Stenflo, J. (1986) Proc. Natl. Acad. Sci. USA 83, 4199-4203.
20. Britten, R.J. (1986) Science 231, 1393-1398.