The primary structure of human H-protein of the glycine cleavage system deduced by cDNA cloning


Vol. 176, No. 2, 1991

BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS Pages 711-716

April 30, 1991

Kazuko Fujiwara, Kazuko Okamura-Ikeda, Kiyoshi Hayasaka*, and Yutaro Motokawa

Institute for Enzyme Research, the University of Tokushima, Tokushima 770, Japan

Received March 14, 1991

SUMMARY: A full-length cDNA encoding the human H-protein of the glycine cleavage system has been isolated from a λgt11 human fetal liver cDNA library. The cDNA insert was 1091 base pairs with an open reading frame of 519 base pairs which encoded a 125-amino acid mature human H-protein with a 48-amino acid presequence. Human H-protein is 97%, 86%, and 46% identical to the bovine, chicken, and pea H-protein, respectively. © 1991 Academic Press, Inc.
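The lengths quoted in the summary can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch; the variable names are chosen here and do not come from the paper.

```python
# Illustrative consistency check of the lengths quoted in the summary.
INSERT_BP = 1091       # length of the cDNA insert
ORF_BP = 519           # length of the open reading frame
MATURE_AA = 125        # residues in the mature H-protein
PRESEQUENCE_AA = 48    # residues in the presequence

# The precursor is the presequence followed by the mature protein.
precursor_aa = PRESEQUENCE_AA + MATURE_AA

# Each amino acid is encoded by one three-base codon.
codons_in_orf = ORF_BP // 3

# Everything in the insert that lies outside the 519-bp reading frame.
outside_orf_bp = INSERT_BP - ORF_BP

print(precursor_aa, codons_in_orf, outside_orf_bp)
```

The reading frame length divides evenly into codons, and the codon count equals the sum of the presequence and mature protein lengths.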

The glycine cleavage system is a multienzyme system located in mitochondria and is responsible for the degradation of glycine. The system consists of three enzymes, designated as P-protein, T-protein, and L-protein, and a carrier protein, H-protein (1-6). H-protein is a heat-stable small protein with a covalently attached lipoic acid prosthetic group which interacts with the three enzymes during catalysis. The chemically determined amino acid sequence of chicken H-protein revealed that it is composed of 125 amino acids with the lipoic acid prosthetic group at lysine 59 (7). Recently, we cloned and sequenced a cDNA encoding bovine H-protein (8).

The activity of the glycine cleavage system is deficient in patients with non-ketotic hyperglycinemia. The etiology of this disease is heterogeneous, as the system is encoded by at least four structural genes. Patients with a defect in P-protein, T-protein, or H-protein have been reported (9, 10), but the molecular mechanisms underlying the deficiency in non-ketotic hyperglycinemia are mostly unknown. To understand the molecular deficiency of the enzyme system, it is desirable to have cDNAs of all the components of the glycine cleavage system. In the present study, we report the cloning of the human liver H-protein and its deduced primary structure.

*Permanent address: Department of Pediatrics, Akita University School of Medicine, Honmichi 1-chome, Akita 010, Japan.

The abbreviations used are: SDS, sodium dodecyl sulfate; bp, base pairs.

0006-291X/91 $1.50 Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.

MATERIALS AND METHODS

Materials

Restriction nucleases and DNA-modifying enzymes were purchased from Toyobo (Tokyo, Japan), Nippongene (Tokyo), and Takara Shuzo (Kyoto, Japan). Radiolabeled nucleotides were obtained from New England Nuclear (Boston, MA). Oligonucleotide primers were synthesized on an Applied Biosystems 381A DNA synthesizer.

Screening of cDNA Library

A human fetal liver cDNA library in λgt11 (Clontech) was screened with a 32P-labeled EcoRI fragment encoding the mature bovine liver H-protein (BH5A (8)). The bovine cDNA probe was hybridized to the human cDNA library at 65 °C in 6 x SSC (1 x SSC = 150 mM NaCl, 15 mM sodium citrate, pH 7.0), 5 x Denhardt's solution, 0.5% SDS, 10% dextran sulfate, 160 µg/ml salmon testis DNA, and 7 x 10^5 cpm/ml of probe. The filters were washed in 1 x SSC and 0.1% SDS at 65 °C for 30 min before autoradiography. Twenty-five putative positive clones were isolated from 6 x 10^5 plaques. They were classified into four groups according to their cDNA sizes and restriction patterns and analyzed.

DNA Sequencing

EcoRI-excised DNA inserts or KpnI-SacI-excised DNAs from λgt11 DNA containing the insert were subcloned into a pGEM-3Z cloning vector (Promega) and sequenced by the dideoxy chain termination procedure (11) using the GemSeq Klenow system (Promega).
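The hybridization and wash buffers in the library screen are given as multiples of SSC. As a small illustration, the sketch below converts those multiples into millimolar salt concentrations using the 1 x SSC definition stated in the text; the function and variable names are hypothetical and are introduced here only for the example.

```python
# 1 x SSC as defined in the text: 150 mM NaCl, 15 mM sodium citrate (pH 7.0).
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_concentrations(fold):
    """Return the millimolar salt concentrations of an n-fold SSC solution."""
    return {salt: fold * mm for salt, mm in SSC_1X_MM.items()}


hybridization = ssc_concentrations(6)  # 6 x SSC, used for hybridization
wash = ssc_concentrations(1)           # 1 x SSC, used for the filter wash

print("hybridization:", hybridization)
print("wash:", wash)
```

The wash is carried out at a lower salt concentration than the hybridization, at the same temperature of 65 °C.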

RESULTS

Screening bovine HHI,

probe

that fetal

the

glycine

liver

liver

strands

shown

bp

of

site

complete was

chosen

is

higher

(12).

The to the

in Fig.

and

2.

sequencing

of

to the

in

the

strategy

outlined

in Fig.

the

deduced

cDNA

at the

3' end,

There

is

25-543)

the

primary

consists

an

1.

H-protein

sequence

bp,

the

internal

a

sequence (13)

and

including

of of

As amino

amino

519 173

acids

determined the

5

EcoRI

frame

protein

sequence.

in

on both

1091

encoding

of

than

The deoxy-

acid

reading

as

activity

determined

amino

with

in the


of

the

H-protein.

liver

open

NH2-terminal

human

the

fetal

was

the first ATG

identical

because

with

designated

for

sequence

(nucleotides

following

library

clone,

primary

The

tail

cDNA

region

system

786.

is

coding

library

nucleotide

residues

liver

of a cDNA

cDNA

sequence

nucleotides

direct

the

fetal

isolation

the poly(A)

at

49-60

human

the

DISCUSSION

cleavage

according

nucleotide are

the

to

harbored

The

adult

of

led

AND

by acid


Fig. 1. Restriction endonuclease map and strategy for nucleotide sequence analysis of human H-protein cDNA. The open box represents the coding region, and the solid lines represent the untranslated regions. The horizontal arrows indicate the direction and extent of sequence analysis. Closed circles indicate the use of synthetic primers. The scale at the bottom is in nucleotide base pairs.

sequence bovine that

including

H-protein

this

the

125 amino

H-protein

acids,

with

pair

of

the

H-protein and

the

561

The

processing of

isolated

a portion

bovine

coding

cDNA

region

Moreover,

alter

This in

of

3'

13

(8)

the

and

48 of

of

the

13,812

contains

the

bp A

present

for

AATAAA

is

normally

was

cDNA

latter

suggests

543

sequences.

AATAAA

signal

Da.

bovine

sequence

apparently (14).

that

we

used

The

short

have

only

region. H-protein

showed

in the

region that

leading

to

sequence homology

encoding

encodes

the

the

silent

39%

mature

untranslated

with

regions

different

that

90%

of

in the

protein.

protein,

at

mutations,

in the

with is

mature

predominantly

agreement

among

the

of the proteins.

is only

in general

cDNA

that

occurs

structures

5'

and

region

region

divergence

of

HHI

of

the

of

is a polypeptide mass

signal case

believe

a presequence

frame,

the

we

the

NH2-terminus

5'-untranslated

between

bp

sequences

codons

is

in

with

(8),

the

protein

polyadenylation

of this

of homology

finding

that

and

95%

the primary

sequence

is

clone

and

in

the degree

as

of the human

within

divergence position

1065

identity

cDNA

molecular

of

distance

its

serine-49,

reading

bp

5'-untranslated

Comparison the

24

90%

Since

polyadenylation

tail

stretch

open

and

and

(8).

poly(A)

for RNA

is

a calculated

consensus

at positions

from

The mature

to the

3'-untranslated

has

H-protein.

is predicted.

In addition of

deduced

is human

human

acids

presequence

sequence

protein

mature

amino

the

the

the

third

which

do

not

On the other hand,

3' untranslated other have

species

region.

eucaryotic higher than

genes

rates

the

of

coding

regions. The distribution H-protein

examined

by

of hydrophobic the

method


of

residues Kyte

and

in the mature Doolittle

human

(15)

and


   1  GGG CGG GCC CGC ACC CCT GCG AAC ATG GCG CTG CGA GTG GTG CGG AGC   48
                                      Met Ala Leu Arg Val Val Arg Ser    8

  49  GTG CGG GCC CTG CTC TGC ACC CTG CGC GCG GTC CCG TTA CCC GCC GCG   96
   9  Val Arg Ala Leu Leu Cys Thr Leu Arg Ala Val Pro Leu Pro Ala Ala   24

  97  CCC TGC CCG CCG AGG CCC TGG CAG CTG GGG GTG GGC GCC GTC CGT ACG  144
  25  Pro Cys Pro Pro Arg Pro Trp Gln Leu Gly Val Gly Ala Val Arg Thr   40

 145  CTG CGC ACT GGA CCC GCT CTG CTC TCG GTG CGT AAA TTC ACA GAG AAA  192
  41  Leu Arg Thr Gly Pro Ala Leu Leu Ser Val Arg Lys Phe Thr Glu Lys   56
                                     ↑

 193  CAC GAA TGG GTA ACA ACA GAA AAT GGC ATT GGA ACA GTG GGA ATC AGC  240
  57  His Glu Trp Val Thr Thr Glu Asn Gly Ile Gly Thr Val Gly Ile Ser   72

 241  AAT TTT GCA CAG GAA GCG TTG GGA GAT GTT GTT TAT TGT AGT CTC CCT  288
  73  Asn Phe Ala Gln Glu Ala Leu Gly Asp Val Val Tyr Cys Ser Leu Pro   88

 289  GAA GTT GGG ACA AAA TTG AAC AAA CAA GAT GAG TTT GGT GCT TTG GAA  336
  89  Glu Val Gly Thr Lys Leu Asn Lys Gln Asp Glu Phe Gly Ala Leu Glu  104

 337  AGT GTG AAA GCT GCT AGT GAA CTC TAT TCT CCT TTA TCA GGA GAA GTA  384
 105  Ser Val Lys Ala Ala Ser Glu Leu Tyr Ser Pro Leu Ser Gly Glu Val  120
               *

 385  ACT GAA ATT AAT GAA GCT CTT GCA GAA AAT CCA GGA CTT GTA AAC AAA  432
 121  Thr Glu Ile Asn Glu Ala Leu Ala Glu Asn Pro Gly Leu Val Asn Lys  136

 433  TCT TGT TAT GAA GAT GGT TGG CTG ATC AAG ATG ACA CTG AGT AAC CCT  480
 137  Ser Cys Tyr Glu Asp Gly Trp Leu Ile Lys Met Thr Leu Ser Asn Pro  152

 481  TCA GAA CTA GAT GAA CTT ATG AGT GAA GAA GCA TAT GAG AAA TAC ATA  528
 153  Ser Glu Leu Asp Glu Leu Met Ser Glu Glu Ala Tyr Glu Lys Tyr Ile  168

 529  AAA TCT ATT GAG GAG TGA AAA TGG AAC TCC TAA ATA AAC TAG TAT GAA  576
 169  Lys Ser Ile Glu Glu                                              173

577 625 673 721 769 817 865 913 961 1009 1057

ATA TTA TTA AGA CTA GTT TAT ATA GTT ATA TGC

ACG CAA GAA TAG ACA CTG TAA ATA GGC TCT ATA AAA ATA ATA CTT AAT GCT TGT CAC TGC AGC AAA

GCC AGC AGA GTT GTC TTA AAT AAA CTT TTA GTA I-[A CCG ATG CTA ATG AAA GAA AAT GCC CTT TAA TAT GCG TCT TTT TCA CAA AGT GTT CAG AAT TCA TGA AAT ATT ACA TAA TTC AAA GAT AAC TTG TAA CTT GCA TGT ATC CAT GAT CTT TCC ATT GGA AAT AAC ACA GTG TCA GAT GAG GAA CAC ATT TGC TGG TGC TAT TTT TAT ATA ATA AAA TAC TTC TTC GTT

TAG TGG TGG ATA GAA GAC 624 GGG AAA AAA AAA CTA CTG 672 TAA CTT TCT AAT GAT TAT 720 TAT CCT ATG ATT TTT AGA 768 TAT CCA TGG TAA AAA CTA 816 ATT GTT ATT CTT AAG CCT 864 ACC TGG ATT TGG GAT GAA 912 TGG AAG TGA AGA GGT TTT 960 CAC TAT CTT AAT TTT GCG 1008 ACA GTG AAG CAA CAG CTT 1056 AAA AA

Fig. 2. Nucleotide sequence of human H-protein cDNA. The complete sequence of clone HHI and its translation into the human H-protein are presented in the 5' to 3' direction. The asterisk indicates the residue involved in lipoic acid attachment. The arrow indicates the site where the presequence is predicted to be cleaved. The polyadenylation signals are underlined.
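The translation shown in Fig. 2 can be reproduced with the standard genetic code. The sketch below is an illustration only: the codons were transcribed by hand from the figure (nucleotides 25 to 546, that is, the open reading frame plus the TGA stop codon), so the transcription should be treated as an assumption, and all names are introduced here for the example.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Codons of the open reading frame and stop codon, copied from Fig. 2.
ORF_CODONS = """
ATG GCG CTG CGA GTG GTG CGG AGC
GTG CGG GCC CTG CTC TGC ACC CTG CGC GCG GTC CCG TTA CCC GCC GCG
CCC TGC CCG CCG AGG CCC TGG CAG CTG GGG GTG GGC GCC GTC CGT ACG
CTG CGC ACT GGA CCC GCT CTG CTC TCG GTG CGT AAA TTC ACA GAG AAA
CAC GAA TGG GTA ACA ACA GAA AAT GGC ATT GGA ACA GTG GGA ATC AGC
AAT TTT GCA CAG GAA GCG TTG GGA GAT GTT GTT TAT TGT AGT CTC CCT
GAA GTT GGG ACA AAA TTG AAC AAA CAA GAT GAG TTT GGT GCT TTG GAA
AGT GTG AAA GCT GCT AGT GAA CTC TAT TCT CCT TTA TCA GGA GAA GTA
ACT GAA ATT AAT GAA GCT CTT GCA GAA AAT CCA GGA CTT GTA AAC AAA
TCT TGT TAT GAA GAT GGT TGG CTG ATC AAG ATG ACA CTG AGT AAC CCT
TCA GAA CTA GAT GAA CTT ATG AGT GAA GAA GCA TAT GAG AAA TAC ATA
AAA TCT ATT GAG GAG TGA
""".split()

# Translate codon by codon; "*" marks the stop codon.
protein = "".join(CODON_TABLE[codon] for codon in ORF_CODONS)
precursor = protein.rstrip("*")

# Split at the predicted cleavage site after residue 48.
presequence, mature = precursor[:48], precursor[48:]

print(len(precursor), len(presequence), len(mature))
print(presequence)
print(mature)
```

Under this transcription the mature protein starts with serine, and the lysine at position 59 of the mature sequence is the residue marked for lipoic acid attachment.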

the

secondary

Fasman

(16)

chicken

structure

are

H-protein

estimated

to

prosthetic

group

predicted

essentially (7).

be

38%

The and

is again

The

predicted

with

the

reported

(17,

18)

mature

the

found

mature sequences

H-protein.

contents

27%,

by

same

the

as of

method

those α-helix

respectively. to reside

human

bovine

The

human


(8),

and

for

the

and

β-sheet

were

lipoic

acid

sequence chicken

sequence

Chou

The

in a helical

H-protein

for

of

reported

has

region.

was (7), 97%

compared and

pea

identity


human

-40 -30 -20 -lO -I MALRVVRSVRALLCTLRAVPLPAAPCPPRPWQLGVGAVRTLRTGPALL

bovine

~AL~A~AAVGGL~AISAbSAS~LS~5GGLRA6A~EL~6PALL

chicken

::::

:

:

:

MALRMWASSTANALKLSSSS. . . . . . . . . .

pea 1

I0

20

RLHLSPTFSISRCFSNVL +

30

40

50

human

SVRKFTEKHEWVTTENGIGTVGISNFAQEALGDVVYCSLPEVGTKLNKQD

bovine

SVRKFTEKHEWVTTENGVGTVGISNFAQEALGDVVYCSLPEVGTKLNLQE : :::: :::: ::::::::::::::::::::::::::::::::::: :

chicken

SARKFTDKHEWISVENGIGTVGISNFAQEALGDVVYCSLPEIGTKLNKDD

pea

DGLKYAPSHEWVKHEGSVATIGITDHAQDHLGEVVFVELPEPGVSVTKGK

human

6O 7O 8O 90 I00 EFGALESVKAASELYSPLSGEVTEINEALAENPGLVNKSCYEDGWLIKMT :::::::::::::::::::::::::: :::::::::::::::::::::::

bovine

EFGALESVKAASELYSPLSGEVTEINKALAENPGLVNKSCYEDGWLIKMT

chicken

EFGALESVKAASELYSPLTGEVTDINAALADNPGLVNKSCYQDGWLIKMT

pea

GFGAVESVKATSDVNSPISGEVIEVNTGLTGKPGLINSSPYEDGWMIKIK

human

llO 120 LSNPSELDELMSEEAYEKYI KSIEE

bovine

FSNPSELDELMSEEAYEKYI KSI EE

chicken

VEKPAELDELMSEDAYEKYIKSI ED

pea

PTSPDELESLLGAKEYTKFCEEEDAAH

::::::::::::::::: :::::::::::::::::::::::::::::::

:::::::::::::::::::::::: :

:::::::::::::::::::

Fig. 3. Comparison of amino acid sequences of human, bovine (Ref. 8), chicken (Ref. 7), and pea (Refs. 17, 18) H-proteins. Amino acid residues are numbered beginning with the NH2-terminal serine of the mature human H-protein. The lysine residue involved in lipoic acid attachment is marked by an asterisk. Amino acid residues identical with human H-protein are marked with double dots. The arrow indicates the NH2-terminus of pea H-protein.
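The identity figures quoted for Fig. 3 amount to counting matching positions in an ungapped alignment. The sketch below illustrates that count for the human and chicken mature sequences only. Both sequences were transcribed by hand from Fig. 3, so the transcription is an assumption, and the function name is hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Count identical positions in two aligned, equal-length sequences.

    Returns the number of matches and the percentage of matching positions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches, 100.0 * matches / len(seq_a)


# Mature H-protein sequences as transcribed from Fig. 3 (residues 1-125).
HUMAN = (
    "SVRKFTEKHEWVTTENGIGTVGISNFAQEALGDVVYCSLPEVGTKLNKQD"
    "EFGALESVKAASELYSPLSGEVTEINEALAENPGLVNKSCYEDGWLIKMT"
    "LSNPSELDELMSEEAYEKYIKSIEE"
)
CHICKEN = (
    "SARKFTDKHEWISVENGIGTVGISNFAQEALGDVVYCSLPEIGTKLNKDD"
    "EFGALESVKAASELYSPLTGEVTDINAALADNPGLVNKSCYQDGWLIKMT"
    "VEKPAELDELMSEDAYEKYIKSIED"
)

matches, pct = percent_identity(HUMAN, CHICKEN)
print(f"{matches} of {len(HUMAN)} residues identical ({pct:.1f}%)")
```

With these transcriptions the human and chicken comparison rounds to the 86% identity stated in the summary.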

with

the

sequence, (Fig.

bovine while

3).

revealed

sequence 46%

attachment

the

site

similarity.

The

acid

to

of

the

protein

intermediate

during

the

of

human

presequence. of

the

both

acid

seems

presequence

the

the

conserved site

Sequences prepeptides

amino

acid

chicken

the

pea

sequence

and

the

is

NH2- and well


pea

H-protein

including

significant sequence for

the

degradation.

H-protein

were

for

the

portion

to be essential and/or

with

with

H-protein

middle exhibits

glycine

of

identity

observed

animal

in

lipoic

86%

was

the

region

of

lipoate-attachment lipoic

identity

Alignment that

and

68.8%

sequence

around

the

binding

transfer Identity with

the

COOH-terminal

conserved.

the

the of

of

the

of

the

bovine portions

Although

the


presequence bovine

of


the

and human

pea

H-protein

H-protein,

shorter

seven out of

half

the NH2-terminal

sequence of human and bovine

result

may

indicate

that

the pea

the

is important

presequence

conserved

than

sixteen

the NH2-terminal

these prepeptides

of

is

is

those amino

of

the

acids

of

homologous

with

presequences.

The

NH2-terminal

for the transport

sequence

of

of H-protein

to

the proper site in the mitochondria.

REFERENCES

1. Fujiwara, K., Okamura, K., and Motokawa, Y. (1979) Arch. Biochem. Biophys. 197, 454-462.
2. Hiraga, K., and Kikuchi, G. (1980) J. Biol. Chem. 255, 11664-11670.
3. Okamura-Ikeda, K., Fujiwara, K., and Motokawa, Y. (1982) J. Biol. Chem. 257, 135-139.
4. Fujiwara, K., and Motokawa, Y. (1983) J. Biol. Chem. 258, 8156-8162.
5. Fujiwara, K., Okamura-Ikeda, K., and Motokawa, Y. (1984) J. Biol. Chem. 259, 10664-10668.
6. Okamura-Ikeda, K., Fujiwara, K., and Motokawa, Y. (1987) J. Biol. Chem. 262, 6746-6749.
7. Fujiwara, K., Okamura-Ikeda, K., and Motokawa, Y. (1986) J. Biol. Chem. 261, 8836-8841.
8. Fujiwara, K., Okamura-Ikeda, K., and Motokawa, Y. (1990) J. Biol. Chem. 265, 17463-17467.
9. Hiraga, K., Kochi, H., Hayasaka, K., Kikuchi, G., and Nyhan, W. L. (1981) J. Clin. Invest. 68, 525-534.
10. Hayasaka, K., Tada, K., Kikuchi, G., Winter, S., and Nyhan, W. L. (1983) Pediatr. Res. 17, 967-970.
11. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U.S.A. 74, 5463-5467.
12. Hayasaka, K., Tada, K., Fueki, N., Takahashi, I., Igarashi, A., Takabayashi, T., and Baumgartner, R. (1987) J. Pediatr. 110, 124-126.
13. Hiraga, K., Kure, S., Yamamoto, M., Ishiguro, Y., and Suzuki, T. (1988) Biochem. Biophys. Res. Commun. 151, 758-762.
14. Proudfoot, N. J., and Brownlee, G. G. (1976) Nature (London) 263, 211-214.
15. Kyte, J., and Doolittle, R. F. (1982) J. Mol. Biol. 157, 105-132.
16. Chou, P. Y., and Fasman, G. D. (1978) Adv. Enzymol. Relat. Areas Mol. Biol. 47, 45-148.
17. Kim, Y., and Oliver, D. J. (1990) J. Biol. Chem. 265, 848-853.
18. Macherel, D., Lebrun, M., Gagnon, J., Neuburger, M., and Douce, R. (1990) Biochem. J. 268, 783-789.
